feat: add initial Hold Slayer AI telephony gateway implementation

Complete project scaffolding and core implementation of an AI-powered telephony system that calls companies, navigates IVR menus, waits on hold, and transfers to the user when a human answers.

Key components:
- FastAPI server with REST API, WebSocket, and MCP (SSE) interfaces
- SIP/VoIP call management via PJSUA2 with RTP audio streaming
- LLM-powered IVR navigation using OpenAI/Anthropic with tool calling
- Hold detection service combining audio analysis and silence detection
- Real-time STT (Whisper/Deepgram) and TTS (OpenAI/Piper) pipelines
- Call recording with per-channel and mixed audio capture
- Event bus (asyncio pub/sub) for real-time client updates
- Web dashboard with live call monitoring
- SQLite persistence via SQLAlchemy with call history and analytics
- Notification support (email, SMS, webhook, desktop)
- Docker Compose deployment with Opal VoIP and Opal Media containers
- Comprehensive test suite with unit, integration, and E2E tests
- Simplified .gitignore and full project documentation in README
233
docs/call-flows.md
Normal file
@@ -0,0 +1,233 @@

# Call Flows

Call flows are reusable IVR navigation trees that tell Hold Slayer exactly how to navigate a company's phone menu. Once a flow is learned (manually or via exploration), subsequent calls to the same number skip the LLM analysis and follow the stored steps directly.

## Data Model

### CallFlowStep

A single step in the IVR navigation:

```python
class CallFlowStep(BaseModel):
    id: str                          # Unique step identifier
    type: CallFlowStepType           # DTMF, WAIT, LISTEN, HOLD, SPEAK, TRANSFER
    description: str                 # Human-readable description
    dtmf: Optional[str] = None       # Digits to press (for DTMF steps)
    timeout: float = 10.0            # Max seconds to wait
    next_step: Optional[str] = None  # ID of the next step
    conditions: dict = {}            # Conditional branching rules
    metadata: dict = {}              # Extra data (transcript patterns, etc.)
```

### Step Types

| Type | Purpose | Key Fields |
|------|---------|------------|
| `DTMF` | Press touch-tone digits | `dtmf="3"` |
| `WAIT` | Pause for a duration | `timeout=5.0` |
| `LISTEN` | Record + transcribe + decide | `timeout=15.0`, optional `dtmf` for hardcoded response |
| `HOLD` | Wait on hold, monitor for human | `timeout=7200` (max hold time) |
| `SPEAK` | Play audio to the call | `metadata={"audio_file": "greeting.wav"}` |
| `TRANSFER` | Bridge call to user's device | `metadata={"device": "sip_phone"}` |
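
The model references `CallFlowStepType` without showing its definition. A minimal sketch of how such an enum might look, assuming a string-valued `Enum` whose members match the table above; the actual definition in the codebase may differ:

```python
from enum import Enum


class CallFlowStepType(str, Enum):
    """Hypothetical definition; member names are taken from the Step Types table."""

    DTMF = "DTMF"
    WAIT = "WAIT"
    LISTEN = "LISTEN"
    HOLD = "HOLD"
    SPEAK = "SPEAK"
    TRANSFER = "TRANSFER"


# A string-valued enum can be looked up directly from the "type" field of a stored flow.
step_type = CallFlowStepType("HOLD")
print(step_type is CallFlowStepType.HOLD)  # True
```

Mixing in `str` is one common way to keep the values JSON-friendly; whether Hold Slayer does this is an assumption.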

### CallFlow

A complete IVR navigation tree:

```python
class CallFlow(BaseModel):
    id: str                      # "chase_bank_main"
    name: str                    # "Chase Bank — Main Menu"
    company: Optional[str]       # "Chase Bank"
    phone_number: Optional[str]  # "+18005551234"
    description: Optional[str]   # "Navigate to disputes department"
    steps: list[CallFlowStep]    # Ordered list of steps
    created_at: datetime
    updated_at: datetime
    version: int = 1
    tags: list[str] = []         # ["banking", "disputes"]
    success_count: int = 0       # Times this flow succeeded
    fail_count: int = 0          # Times this flow failed
```
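
Because steps refer to each other by ID, a flow can end up with a `next_step` that names a step that does not exist. The sketch below shows one way to check for that before saving a flow. It is not part of Hold Slayer: `find_dangling_pointers` is a hypothetical helper, and plain dicts stand in for the Pydantic models so the example runs on its own.

```python
def find_dangling_pointers(steps: list[dict]) -> list[str]:
    """Return the IDs of steps whose next_step names a step that is not in the flow."""
    known_ids = {step["id"] for step in steps}
    return [
        step["id"]
        for step in steps
        if step.get("next_step") is not None and step["next_step"] not in known_ids
    ]


steps = [
    {"id": "press_3", "type": "DTMF", "dtmf": "3", "next_step": "hold"},
    {"id": "hold", "type": "HOLD", "next_step": "transfer"},  # no "transfer" step defined
]
print(find_dangling_pointers(steps))  # ['hold']
```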

## Example Call Flow

```json
{
  "id": "chase_bank_disputes",
  "name": "Chase Bank — Disputes",
  "company": "Chase Bank",
  "phone_number": "+18005551234",
  "steps": [
    {
      "id": "wait_greeting",
      "type": "WAIT",
      "description": "Wait for greeting to finish",
      "timeout": 5.0,
      "next_step": "main_menu"
    },
    {
      "id": "main_menu",
      "type": "LISTEN",
      "description": "Listen to main menu options",
      "timeout": 15.0,
      "next_step": "press_3"
    },
    {
      "id": "press_3",
      "type": "DTMF",
      "description": "Press 3 for account services",
      "dtmf": "3",
      "next_step": "sub_menu"
    },
    {
      "id": "sub_menu",
      "type": "LISTEN",
      "description": "Listen to account services sub-menu",
      "timeout": 15.0,
      "next_step": "press_1"
    },
    {
      "id": "press_1",
      "type": "DTMF",
      "description": "Press 1 for disputes",
      "dtmf": "1",
      "next_step": "hold"
    },
    {
      "id": "hold",
      "type": "HOLD",
      "description": "Wait on hold for disputes agent",
      "timeout": 7200,
      "next_step": "transfer"
    },
    {
      "id": "transfer",
      "type": "TRANSFER",
      "description": "Transfer to user's phone"
    }
  ]
}
```
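
To make the role of `next_step` concrete, here is a rough sketch of how a stored flow like the one above could be walked in execution order. It assumes the first entry in `steps` is the entry point and that a missing `next_step` ends the flow; neither is confirmed by this document, and `walk_flow` is a hypothetical name.

```python
def walk_flow(flow: dict) -> list[str]:
    """Return step IDs in execution order by following next_step pointers."""
    steps_by_id = {step["id"]: step for step in flow["steps"]}
    order: list[str] = []
    # Assumption: the first listed step is where execution starts.
    current = flow["steps"][0]["id"] if flow["steps"] else None
    while current is not None:
        if current in order:
            raise ValueError(f"cycle detected at step {current!r}")
        if current not in steps_by_id:
            raise KeyError(f"unknown step {current!r}")
        order.append(current)
        current = steps_by_id[current].get("next_step")
    return order


# A shortened version of the flow above.
flow = {
    "id": "chase_bank_disputes",
    "steps": [
        {"id": "wait_greeting", "type": "WAIT", "timeout": 5.0, "next_step": "press_3"},
        {"id": "press_3", "type": "DTMF", "dtmf": "3", "next_step": "hold"},
        {"id": "hold", "type": "HOLD", "timeout": 7200},
    ],
}
print(walk_flow(flow))  # ['wait_greeting', 'press_3', 'hold']
```

The cycle check matters because a hand-edited flow could loop back to an earlier menu and never terminate.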

## Call Flow Learner (`services/call_flow_learner.py`)

Automatically builds call flows from exploration data.

### How It Works

1. **Exploration mode** records "discoveries": what Hold Slayer encountered and did at each step
2. The learner converts discoveries into `CallFlowStep` objects
3. Steps are ordered and linked (`next_step` pointers)
4. The resulting `CallFlow` is saved for future calls

### Discovery Types

| Discovery | Becomes Step |
|-----------|-------------|
| Heard IVR prompt, pressed DTMF | `LISTEN` → `DTMF` |
| Detected hold music | `HOLD` |
| Detected silence (waiting) | `WAIT` |
| Heard speech (human) | `TRANSFER` |
| Sent DTMF digits | `DTMF` |
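
The conversion and linking steps, together with the mapping in the table above, can be sketched roughly as follows. This is an illustration, not the learner's actual code: `discoveries_to_steps` and the `step_N` ID scheme are made up, plain dicts replace the Pydantic models, and the discovery `type` strings (`wait`, `ivr_menu`, `hold`, `human_detected`) are the ones used in this document's `build_flow` example. The sketch emits a separate `LISTEN` and `DTMF` step for each menu; the real learner may instead fold them into one `LISTEN` step with a hardcoded `dtmf`.

```python
def discoveries_to_steps(discoveries: list[dict]) -> list[dict]:
    """Convert exploration discoveries into linked step dicts (illustrative only)."""
    steps: list[dict] = []
    for discovery in discoveries:
        kind = discovery["type"]
        if kind == "ivr_menu":
            # Heard a prompt, then pressed a digit: LISTEN followed by DTMF.
            steps.append({"type": "LISTEN",
                          "metadata": {"transcript": discovery.get("transcript", "")}})
            if discovery.get("dtmf_sent"):
                steps.append({"type": "DTMF", "dtmf": discovery["dtmf_sent"]})
        elif kind == "hold":
            steps.append({"type": "HOLD"})
        elif kind == "wait":
            steps.append({"type": "WAIT", "timeout": discovery.get("duration", 10.0)})
        elif kind == "human_detected":
            steps.append({"type": "TRANSFER"})
        # Unknown discovery types are skipped in this sketch.

    # Assign sequential IDs, then link each step to its successor.
    for index, step in enumerate(steps, start=1):
        step["id"] = f"step_{index}"
        step["next_step"] = None
    for step, following in zip(steps, steps[1:]):
        step["next_step"] = following["id"]
    return steps


example = discoveries_to_steps([
    {"type": "wait", "duration": 3.0},
    {"type": "ivr_menu", "transcript": "Press 1 for billing...", "dtmf_sent": "1"},
    {"type": "hold", "duration": 480.0},
    {"type": "human_detected", "transcript": "Thank you for calling..."},
])
print([step["type"] for step in example])
# ['WAIT', 'LISTEN', 'DTMF', 'HOLD', 'TRANSFER']
```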

### Building a Flow

```python
learner = CallFlowLearner()

# After an exploration call completes:
discoveries = [
    {"type": "wait", "duration": 3.0, "description": "Initial silence"},
    {"type": "ivr_menu", "transcript": "Press 1 for billing...", "dtmf_sent": "1"},
    {"type": "ivr_menu", "transcript": "Press 3 for disputes...", "dtmf_sent": "3"},
    {"type": "hold", "duration": 480.0},
    {"type": "human_detected", "transcript": "Thank you for calling..."},
]

flow = learner.build_flow(
    discoveries=discoveries,
    phone_number="+18005551234",
    company="Chase Bank",
    intent="dispute a charge",
)
# Returns a CallFlow with 5 steps: WAIT → LISTEN/DTMF → LISTEN/DTMF → HOLD → TRANSFER
```

### Merging Discoveries

When the same number is called again with exploration, new discoveries can be merged into the existing flow:

```python
updated_flow = learner.merge_discoveries(
    existing_flow=flow,
    new_discoveries=new_discoveries,
)
```

This handles:
- New menu options discovered
- Changed IVR structure
- Updated timing information
- Success/failure tracking
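
As an illustration of the success/failure tracking item, the sketch below shows how the `success_count` and `fail_count` fields on `CallFlow` could be updated and turned into a success rate. `FlowStats`, `record`, and `success_rate` are hypothetical names and a dataclass stands in for the real model; how the learner actually uses these counters is not described here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FlowStats:
    """Stand-in for the success_count / fail_count fields on CallFlow."""

    success_count: int = 0
    fail_count: int = 0

    def record(self, succeeded: bool) -> None:
        """Count one completed call that used this flow."""
        if succeeded:
            self.success_count += 1
        else:
            self.fail_count += 1

    @property
    def success_rate(self) -> Optional[float]:
        """Fraction of recorded calls that succeeded, or None if none were recorded."""
        total = self.success_count + self.fail_count
        return self.success_count / total if total else None


stats = FlowStats()
for outcome in (True, True, True, False):
    stats.record(outcome)
print(stats.success_rate)  # 0.75
```

A falling success rate could be one signal that the company's IVR has changed and the flow needs another exploration call.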

## REST API

### List Call Flows

```
GET /api/call-flows
GET /api/call-flows?company=Chase+Bank
GET /api/call-flows?tag=banking
```

### Get Call Flow

```
GET /api/call-flows/{flow_id}
```

### Create Call Flow

```
POST /api/call-flows
Content-Type: application/json

{
  "name": "Chase Bank — Disputes",
  "company": "Chase Bank",
  "phone_number": "+18005551234",
  "steps": [ ... ]
}
```

### Update Call Flow

```
PUT /api/call-flows/{flow_id}
Content-Type: application/json

{ ... updated flow ... }
```

### Delete Call Flow

```
DELETE /api/call-flows/{flow_id}
```

### Learn Flow from Exploration

```
POST /api/call-flows/learn
Content-Type: application/json

{
  "call_id": "call_abc123",
  "phone_number": "+18005551234",
  "company": "Chase Bank"
}
```

This triggers the Call Flow Learner to build a flow from the call's exploration data.
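
For readers calling the API from Python, here is a small sketch that builds the learn request using only the standard library. The base URL `http://localhost:8000` and the helper name `build_learn_request` are assumptions; the path and body fields come from the example above. The response format is not documented here, so the sketch only constructs the request.

```python
import json
import urllib.request


def build_learn_request(base_url: str, call_id: str, phone_number: str,
                        company: str) -> urllib.request.Request:
    """Build (but do not send) a POST /api/call-flows/learn request."""
    payload = {"call_id": call_id, "phone_number": phone_number, "company": company}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/call-flows/learn",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_learn_request(
    "http://localhost:8000",  # assumed address of a local Hold Slayer server
    call_id="call_abc123",
    phone_number="+18005551234",
    company="Chase Bank",
)
print(request.get_method(), request.full_url)

# To actually send it against a running server:
# with urllib.request.urlopen(request) as response:
#     body = json.load(response)
```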