Pipeline

Dictare processes transcriptions through a pipeline of filters and executors before delivering text to agents. The pipeline enables voice commands like "OK send", "OK mute", and "agent claude" to be intercepted and acted upon.

Architecture

The pipeline has two stages:

Filters: analyze and modify transcriptions and extract commands.
Executors: execute the actions detected by filters.

Filters and executors are loaded via PipelineLoader using dependency injection.
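The two-stage flow can be pictured with a minimal Python sketch. Everything in it is hypothetical and for illustration only: `Result`, `Pipeline`, `send_filter`, and `input_executor` are assumed names, and the real `PipelineLoader` wiring is not shown in this document. The point is only that filters run first and rewrite the transcription, executors run afterwards, and both are handed to the pipeline from outside.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Result:
    """What one stage hands to the next (hypothetical shape)."""
    text: str
    commands: list = field(default_factory=list)


# Hypothetical signatures: a filter rewrites a Result, an executor acts on it.
Filter = Callable[[Result], Result]
Executor = Callable[[Result], None]


class Pipeline:
    """Runs every filter in order, then every executor."""

    def __init__(self, filters, executors):
        # Dependencies are injected by the caller, not constructed here.
        self.filters = list(filters)
        self.executors = list(executors)

    def process(self, transcription: str) -> Result:
        result = Result(text=transcription)
        for apply_filter in self.filters:
            result = apply_filter(result)
        for execute in self.executors:
            execute(result)
        return result


def send_filter(result: Result) -> Result:
    """Toy filter: exact, case-insensitive match on a trailing 'OK send'."""
    suffix = " ok send"
    if result.text.lower().endswith(suffix):
        return Result(result.text[: -len(suffix)], result.commands + ["submit"])
    return result


delivered = []  # stands in for the agent's input


def input_executor(result: Result) -> None:
    """Toy executor: deliver the text only when a submit command was found."""
    if "submit" in result.commands:
        delivered.append(result.text)


pipeline = Pipeline(filters=[send_filter], executors=[input_executor])
out = pipeline.process("Run the tests OK send")
print(out.text, out.commands)
```

Passing the filter and executor lists into the constructor keeps each stage easy to test or swap on its own.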

Filters

Submit Filter

Detects submit triggers at the end of a transcription and sends the input to the agent.

[pipeline.submit_filter]
triggers = ["send", "submit", "enter"]
confidence_threshold = 0.8

When you say "OK send" or "OK submit" at the end of your dictation, the filter strips the trigger phrase and submits the remaining text.

The confidence_threshold (0.0 to 1.0) controls how strictly the trigger must match. Lower values are more lenient; higher values reduce false positives.
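To make the effect of the threshold concrete, here is a rough sketch of end-of-text trigger detection. It is not Dictare's matcher: it assumes the trigger is the last word and is preceded by "OK", and it scores similarity with `difflib.SequenceMatcher` from the Python standard library. The function name `strip_submit_trigger` is hypothetical.

```python
from difflib import SequenceMatcher

TRIGGERS = ["send", "submit", "enter"]
CONFIDENCE_THRESHOLD = 0.8
WAKE_WORD = "ok"  # assumption: a trigger only counts when "OK" precedes it


def strip_submit_trigger(text, triggers=TRIGGERS, threshold=CONFIDENCE_THRESHOLD):
    """Return (remaining_text, trigger); trigger is None when nothing matched."""
    words = text.split()
    if len(words) < 2 or words[-2].lower().strip(".,!?") != WAKE_WORD:
        return text, None
    candidate = words[-1].lower().strip(".,!?")
    # Score the final word against every trigger and keep the best score.
    scored = [(SequenceMatcher(None, candidate, t).ratio(), t) for t in triggers]
    if not scored:
        return text, None
    score, trigger = max(scored)
    if score < threshold:
        return text, None
    # Drop the wake word and the trigger, keep the dictated text.
    return " ".join(words[:-2]), trigger


print(strip_submit_trigger("Add a test OK send"))
print(strip_submit_trigger("Add a test OK sent"))
print(strip_submit_trigger("Add a test OK sent", threshold=0.7))
```

In this sketch a misheard "sent" scores 0.75 against "send", so it is rejected at the 0.8 threshold and accepted at 0.7.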

Mute Filter

Detects mute/unmute commands to control the microphone.

[pipeline.mute_filter]
triggers = ["mute", "listen"]
phrases = ["OK mute", "OK listen"]

Say "OK mute" to silence the microphone and "OK listen" to resume.

Agent Filter

Detects agent switching commands in multi-agent setups.

[pipeline.agent_filter]
triggers = ["agent"]
match_threshold = 0.7

Say "agent claude" or "agent codex" to switch which agent receives input. The match_threshold controls fuzzy matching tolerance for agent names.
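As an illustration of what a fuzzy threshold on agent names could look like, the sketch below uses `difflib.get_close_matches` from the Python standard library, whose `cutoff` argument plays the role of match_threshold. How Dictare actually scores names is not described here; the agent list and the function name `resolve_agent` are assumptions.

```python
from difflib import get_close_matches

AGENTS = ["claude", "codex"]  # hypothetical list of configured agents
MATCH_THRESHOLD = 0.7


def resolve_agent(text, agents=AGENTS, threshold=MATCH_THRESHOLD):
    """Return the agent named in an 'agent <name>' command, or None."""
    words = text.lower().split()
    if len(words) != 2 or words[0] != "agent":
        return None
    # cutoff is the minimum similarity (0.0 to 1.0) a name must reach.
    matches = get_close_matches(words[1], agents, n=1, cutoff=threshold)
    return matches[0] if matches else None


print(resolve_agent("agent claude"))
print(resolve_agent("agent claud"))
print(resolve_agent("agent gemini"))
```

A slightly misrecognized name such as "claud" still resolves to `claude`, while an unrelated name returns `None` and leaves the active agent unchanged.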

Executors

Executors are the action handlers that run after filters extract commands:

| Executor | Triggered by | Action |
| --- | --- | --- |
| `InputExecutor` | Submit filter | Delivers text + submit keystroke to agent |
| `MuteExecutor` | Mute filter | Toggles microphone state |
| `AgentSwitchExecutor` | Agent filter | Switches active agent |
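One way to picture the mapping in the table is a dispatch dictionary keyed by command kind. This is a simplified sketch, not the actual executor classes: the `State` object, the command kinds (`"submit"`, `"mute"`, `"agent"`), and the function signatures are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class State:
    """Hypothetical session state that the executors act on."""
    muted: bool = False
    active_agent: str = "claude"
    sent: list = field(default_factory=list)


def input_executor(state, payload):
    # Stands in for delivering text plus a submit keystroke to the agent.
    state.sent.append((state.active_agent, payload))


def mute_executor(state, payload):
    # Assumption: the payload is the recognized trigger, "mute" or "listen".
    state.muted = payload == "mute"


def agent_switch_executor(state, payload):
    state.active_agent = payload


# Command kind extracted by a filter -> executor that handles it.
EXECUTORS = {
    "submit": input_executor,
    "mute": mute_executor,
    "agent": agent_switch_executor,
}


def dispatch(state, kind, payload):
    EXECUTORS[kind](state, payload)


state = State()
dispatch(state, "agent", "codex")
dispatch(state, "submit", "Add a test for the login function")
dispatch(state, "mute", "mute")
print(state)
```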

Pattern Matching

Filters share a common text matching module (pipeline/filters/_text.py) that handles:

Case-insensitive matching
Fuzzy matching with configurable thresholds
Trigger phrase extraction and stripping
Position-aware matching (triggers at end of text for submit)
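The case-insensitive and position-aware parts of this list might look roughly like the following. This is not the contents of `_text.py`; `normalize` and `find_phrase` are hypothetical helpers, and only exact token matching is shown, since fuzzy scoring depends on the configured thresholds.

```python
import re


def normalize(text):
    """Lowercase the text and split it into word tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def find_phrase(text, phrase, at_end=False):
    """Return the token index where phrase starts, or -1 if it is absent.

    With at_end=True the phrase only counts when it closes the text,
    which is the position a submit trigger is expected in.
    """
    tokens, target = normalize(text), normalize(phrase)
    if not target or len(target) > len(tokens):
        return -1
    if at_end:
        start = len(tokens) - len(target)
        return start if tokens[start:] == target else -1
    for start in range(len(tokens) - len(target) + 1):
        if tokens[start:start + len(target)] == target:
            return start
    return -1


print(find_phrase("Fix the bug. OK, Send!", "ok send", at_end=True))
print(find_phrase("OK send the report", "OK send", at_end=True))
```

Requiring the end position is what keeps a sentence like "OK send the report" from being submitted halfway through dictation.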

How It Works

1. You speak: "Add a test for the login function OK send"
2. STT transcribes the audio
3. Submit filter detects "OK send" at the end
4. Filter strips "OK send", passes "Add a test for the login function" forward
5. InputExecutor delivers the text to the active agent and presses Enter

Customizing Triggers

You can customize the trigger phrases to match your speaking style:

[pipeline.submit_filter]
triggers = ["go", "send", "submit", "do it"]

[pipeline.mute_filter]
triggers = ["mute", "listen", "quiet", "resume"]
phrases = ["OK quiet", "OK resume", "OK mute", "OK listen"]

Disabling Filters

To disable a filter, set its triggers to an empty list:

[pipeline.submit_filter]
triggers = []
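The effect of an empty trigger list can be sketched as follows, assuming the `[pipeline.*]` tables have already been parsed into a plain dictionary. The helper `enabled_filters` is hypothetical; the only behavior taken from this page is that an empty triggers list disables a filter.

```python
def enabled_filters(pipeline_config):
    """Return the names of filters that have at least one trigger."""
    # Assumption: a section with no triggers key is also treated as disabled;
    # this page only states the behavior for an empty list.
    return [
        name
        for name, section in pipeline_config.items()
        if section.get("triggers")
    ]


# Mirrors the TOML above, with the submit filter disabled.
config = {
    "submit_filter": {"triggers": []},
    "mute_filter": {"triggers": ["mute", "listen"], "phrases": ["OK mute", "OK listen"]},
    "agent_filter": {"triggers": ["agent"], "match_threshold": 0.7},
}

print(enabled_filters(config))
```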