Pipeline¶
Dictare processes transcriptions through a pipeline of filters and executors before delivering text to agents. The pipeline enables voice commands like "OK send", "OK mute", and "agent claude" to be intercepted and acted upon.
Architecture¶
The pipeline has two stages:
- Filters — analyze and modify transcriptions, extract commands
- Executors — execute the actions detected by filters
Filters and executors are loaded via PipelineLoader using dependency injection.
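The two-stage flow can be pictured with a small sketch. This is an illustration under assumptions, not Dictare's code: `Command`, `SuffixFilter`, `RecordingExecutor`, and `run_pipeline` are hypothetical names, and the real `PipelineLoader` interface is not documented on this page. The injected `log` list stands in for whatever dependencies the loader would provide.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass
class Command:
    """Hypothetical record a filter emits when it recognizes a voice command."""
    name: str
    argument: str = ""


class Filter(Protocol):
    def apply(self, text: str) -> tuple[str, Optional[Command]]: ...


class Executor(Protocol):
    handles: str

    def execute(self, text: str, command: Command) -> None: ...


class SuffixFilter:
    """Toy filter: emits a command when the text ends with a fixed phrase."""

    def __init__(self, phrase: str, command_name: str) -> None:
        self.phrase = phrase
        self.command_name = command_name

    def apply(self, text):
        if text.lower().endswith(self.phrase):
            stripped = text[: -len(self.phrase)].rstrip()
            return stripped, Command(self.command_name)
        return text, None


class RecordingExecutor:
    """Toy executor: records what it was asked to do instead of acting."""

    def __init__(self, handles: str, log: list) -> None:
        self.handles = handles
        self.log = log  # dependency injected by the caller

    def execute(self, text, command):
        self.log.append((command.name, text))


def run_pipeline(text, filters, executors):
    """Stage 1: run every filter in order. Stage 2: dispatch detected commands."""
    commands = []
    for f in filters:
        text, command = f.apply(text)
        if command is not None:
            commands.append(command)
    for command in commands:
        for ex in executors:
            if ex.handles == command.name:
                ex.execute(text, command)
    return text


log = []
filters = [SuffixFilter("ok send", "submit")]
executors = [RecordingExecutor("submit", log)]
remaining = run_pipeline("Add a test OK send", filters, executors)
print(remaining, log)
```

Keeping detection (filters) separate from side effects (executors) means a filter can be tested on plain strings without a microphone or an agent attached.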
Filters¶
Submit Filter¶
Detects submit triggers at the end of a transcription and sends the input to the agent.
[pipeline.submit_filter]
triggers = ["send", "submit", "enter"]
confidence_threshold = 0.8
When you say "OK send" or "OK submit" at the end of your dictation, the filter strips the trigger phrase and submits the remaining text.
The confidence_threshold (0.0 to 1.0) controls how strictly the trigger must match: lower values are more lenient, while higher values reduce false positives.
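To see how a threshold changes the outcome, here is a minimal sketch that scores a word against the triggers with Python's `difflib.SequenceMatcher`. The scoring method Dictare actually uses is not stated here, so `difflib` is a stand-in, and `trigger_score` and `matches_trigger` are hypothetical helpers.

```python
from difflib import SequenceMatcher


def trigger_score(word: str, trigger: str) -> float:
    """Similarity in [0.0, 1.0] between a spoken word and a trigger (case-insensitive)."""
    return SequenceMatcher(None, word.lower(), trigger.lower()).ratio()


def matches_trigger(word: str, triggers: list[str], confidence_threshold: float) -> bool:
    """True when the word is at least `confidence_threshold` similar to any trigger."""
    return any(trigger_score(word, t) >= confidence_threshold for t in triggers)


triggers = ["send", "submit", "enter"]

# "sent" is a plausible mis-transcription of "send"; its similarity is 0.75.
print(trigger_score("sent", "send"))
print(matches_trigger("sent", triggers, 0.8))  # strict: rejected
print(matches_trigger("sent", triggers, 0.7))  # lenient: accepted
```

With this scoring, a threshold of 0.8 ignores the near miss "sent", while 0.7 treats it as a submit trigger.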
Mute Filter¶
Detects mute/unmute commands to control the microphone.
[pipeline.mute_filter]
triggers = ["mute", "listen"]
phrases = ["OK mute", "OK listen"]
Say "OK mute" to silence the microphone and "OK listen" to resume listening.
Agent Filter¶
Detects agent switching commands in multi-agent setups.
[pipeline.agent_filter]
triggers = ["agent"]
match_threshold = 0.7
Say "agent claude" or "agent codex" to switch which agent receives input. The match_threshold controls fuzzy matching tolerance for agent names.
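Fuzzy matching of agent names can be sketched with `difflib.get_close_matches`, whose `cutoff` argument plays the role of `match_threshold`. This is an assumption-laden illustration: `resolve_agent` is a hypothetical helper, and Dictare's own matcher may score names differently.

```python
from difflib import get_close_matches
from typing import Optional


def resolve_agent(text: str, agents: list[str], match_threshold: float = 0.7) -> Optional[str]:
    """Return the agent named after the "agent" trigger, or None if nothing matches."""
    words = text.lower().split()
    if len(words) < 2 or words[0] != "agent":
        return None
    spoken_name = words[1].strip(".,!?")
    # get_close_matches keeps candidates whose similarity ratio is >= cutoff.
    candidates = get_close_matches(spoken_name, agents, n=1, cutoff=match_threshold)
    return candidates[0] if candidates else None


agents = ["claude", "codex"]
print(resolve_agent("agent claude", agents))  # exact name
print(resolve_agent("Agent cloud", agents))   # STT misheard "claude"
print(resolve_agent("agent gemini", agents))  # unknown agent
```

In this sketch "cloud" still resolves to "claude" at 0.7, but raising the threshold to 0.8 would reject it, which is the trade-off the setting controls.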
Executors¶
Executors are the action handlers that run after filters extract commands:
| Executor | Triggered by | Action |
|---|---|---|
| InputExecutor | Submit filter | Delivers text + submit keystroke to agent |
| MuteExecutor | Mute filter | Toggles microphone state |
| AgentSwitchExecutor | Agent filter | Switches active agent |
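The mapping in the table can be read as a dispatch table. The sketch below uses plain functions as stand-ins for the three executors; `SessionState`, `DISPATCH`, and the command names `"submit"`, `"mute"`, and `"agent"` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SessionState:
    """Hypothetical shared state the executors act on."""
    active_agent: str = "claude"
    muted: bool = False
    delivered: list = field(default_factory=list)


def input_executor(state: SessionState, text: str) -> None:
    # Stand-in for InputExecutor: record the text plus a submit keystroke.
    state.delivered.append((state.active_agent, text + "\n"))


def mute_executor(state: SessionState, text: str) -> None:
    # Stand-in for MuteExecutor: flip the microphone state.
    state.muted = not state.muted


def agent_switch_executor(state: SessionState, text: str) -> None:
    # Stand-in for AgentSwitchExecutor: `text` carries the target agent name.
    state.active_agent = text


# Mirrors the table: which filter's command triggers which executor.
DISPATCH = {
    "submit": input_executor,
    "mute": mute_executor,
    "agent": agent_switch_executor,
}


def dispatch(state: SessionState, command: str, text: str = "") -> None:
    handler = DISPATCH.get(command)
    if handler is not None:  # unknown commands are ignored
        handler(state, text)


state = SessionState()
dispatch(state, "agent", "codex")
dispatch(state, "submit", "Add a test for the login function")
dispatch(state, "mute")
print(state)
```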
Pattern Matching¶
Filters share a common text matching module (pipeline/filters/_text.py) that handles:
- Case-insensitive matching
- Fuzzy matching with configurable thresholds
- Trigger phrase extraction and stripping
- Position-aware matching (triggers at end of text for submit)
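Three of these behaviors (case-insensitive comparison, stripping, and end-of-text position) fit in a few lines. The helper below is a hypothetical sketch and not the contents of `_text.py`; it uses exact word comparison and leaves fuzzy scoring out for brevity.

```python
from typing import Optional


def _norm(word: str) -> str:
    """Lowercase a word and drop punctuation that STT may attach to it."""
    return word.lower().strip(".,!?;:")


def strip_trailing_trigger(text: str, phrases: list[str]) -> Optional[str]:
    """If `text` ends with one of `phrases`, return the text without it.

    Returns None when no phrase sits at the very end, so a trigger word
    spoken mid-sentence is left alone.
    """
    words = text.split()
    for phrase in phrases:
        target = phrase.lower().split()
        n = len(target)
        if n and len(words) >= n and [_norm(w) for w in words[-n:]] == target:
            return " ".join(words[:-n])
    return None


phrases = ["OK send", "OK submit"]
print(strip_trailing_trigger("Add a test for the login function OK send.", phrases))
print(strip_trailing_trigger("Please send the report to Bob", phrases))
```

The second call returns `None` because "send" appears in the middle of the sentence rather than as a trailing trigger phrase.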
How It Works¶
1. You speak: "Add a test for the login function OK send"
2. STT transcribes the audio
3. Submit filter detects "OK send" at the end
4. Filter strips "OK send", passes "Add a test for the login function" forward
5. InputExecutor delivers the text to the active agent and presses Enter
Customizing Triggers¶
You can customize the trigger phrases to match your speaking style:
[pipeline.submit_filter]
triggers = ["go", "send", "submit", "do it"]
[pipeline.mute_filter]
triggers = ["mute", "listen", "quiet", "resume"]
phrases = ["OK quiet", "OK resume", "OK mute", "OK listen"]
Disabling Filters¶
To disable a filter, set its triggers to an empty list:
[pipeline.submit_filter]
triggers = []