feat: add initial Hold Slayer AI telephony gateway implementation

Complete project scaffolding and core implementation of an AI-powered telephony system that calls companies, navigates IVR menus, waits on hold, and transfers to the user when a human answers.

Key components:
- FastAPI server with REST API, WebSocket, and MCP (SSE) interfaces
- SIP/VoIP call management via PJSUA2 with RTP audio streaming
- LLM-powered IVR navigation using OpenAI/Anthropic with tool calling
- Hold detection service combining audio analysis and silence detection
- Real-time STT (Whisper/Deepgram) and TTS (OpenAI/Piper) pipelines
- Call recording with per-channel and mixed audio capture
- Event bus (asyncio pub/sub) for real-time client updates
- Web dashboard with live call monitoring
- SQLite persistence via SQLAlchemy with call history and analytics
- Notification support (email, SMS, webhook, desktop)
- Docker Compose deployment with Opal VoIP and Opal Media containers
- Comprehensive test suite with unit, integration, and E2E tests
- Simplified .gitignore and full project documentation in README
2026-03-21 19:23:26 +00:00
parent c9ff60702b
commit ecf37658ce
56 changed files with 11601 additions and 164 deletions
--- a/docs/call-flows.md
+++ b/docs/call-flows.md
@@ -0,0 +1,233 @@
+# Call Flows
+
+Call flows are reusable IVR navigation trees that tell Hold Slayer how to navigate a company's phone menu. Once a flow exists (written by hand or learned via exploration), subsequent calls to the same number skip the LLM analysis and follow the stored steps directly.
+
+## Data Model
+
+### CallFlowStep
+
+A single step in the IVR navigation:
+
+```python
+from typing import Optional
+
+from pydantic import BaseModel
+
+
+class CallFlowStep(BaseModel):
+    id: str                          # Unique step identifier
+    type: CallFlowStepType           # DTMF, WAIT, LISTEN, HOLD, SPEAK, TRANSFER
+    description: str                 # Human-readable description
+    dtmf: Optional[str] = None       # Digits to press (for DTMF steps)
+    timeout: float = 10.0            # Max seconds to wait
+    next_step: Optional[str] = None  # ID of the next step
+    conditions: dict = {}            # Conditional branching rules
+    metadata: dict = {}              # Extra data (transcript patterns, etc.)
+```
+
+### Step Types
+
+| Type | Purpose | Key Fields |
+|------|---------|------------|
+| `DTMF` | Press touch-tone digits | `dtmf="3"` |
+| `WAIT` | Pause for a duration | `timeout=5.0` |
+| `LISTEN` | Record + transcribe + decide | `timeout=15.0`, optional `dtmf` for hardcoded response |
+| `HOLD` | Wait on hold, monitor for human | `timeout=7200` (max hold time in seconds, i.e. 2 hours) |
+| `SPEAK` | Play audio to the call | `metadata={"audio_file": "greeting.wav"}` |
+| `TRANSFER` | Bridge call to user's device | `metadata={"device": "sip_phone"}` |
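+
+As a rough illustration, the six step types could be modelled as a string-valued enum so that the `"type"` strings in stored flows parse directly. This is a sketch only: the actual `CallFlowStepType` definition is not shown in this document, and the base classes and member values used here are assumptions.
+
+```python
+from enum import Enum
+
+
+class CallFlowStepType(str, Enum):
+    """Assumed shape of the step-type enum; members mirror the table above."""
+
+    DTMF = "DTMF"
+    WAIT = "WAIT"
+    LISTEN = "LISTEN"
+    HOLD = "HOLD"
+    SPEAK = "SPEAK"
+    TRANSFER = "TRANSFER"
+
+
+# Parse the "type" field of a stored step into an enum member.
+step_type = CallFlowStepType("HOLD")
+print(step_type is CallFlowStepType.HOLD)
+```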
+
+### CallFlow
+
+A complete IVR navigation tree:
+
+```python
+from datetime import datetime
+
+
+class CallFlow(BaseModel):
+    id: str                          # "chase_bank_main"
+    name: str                        # "Chase Bank — Main Menu"
+    company: Optional[str]           # "Chase Bank"
+    phone_number: Optional[str]      # "+18005551234"
+    description: Optional[str]       # "Navigate to disputes department"
+    steps: list[CallFlowStep]        # Ordered list of steps
+    created_at: datetime
+    updated_at: datetime
+    version: int = 1
+    tags: list[str] = []             # ["banking", "disputes"]
+    success_count: int = 0           # Times this flow succeeded
+    fail_count: int = 0              # Times this flow failed
+```
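+
+The `success_count` and `fail_count` counters make it possible to estimate how reliable a stored flow is. The helper below is a hypothetical sketch: `success_rate` is not part of the documented model, and how Hold Slayer actually uses these counters is not described here.
+
+```python
+from typing import Optional
+
+
+def success_rate(success_count: int, fail_count: int) -> Optional[float]:
+    """Fraction of recorded attempts that succeeded, or None if there are none."""
+    total = success_count + fail_count
+    if total == 0:
+        return None  # No attempts recorded yet, so no rate can be computed.
+    return success_count / total
+
+
+print(success_rate(3, 1))
+print(success_rate(0, 0))
+```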
+
+## Example Call Flow
+
+```json
+{
+  "id": "chase_bank_disputes",
+  "name": "Chase Bank — Disputes",
+  "company": "Chase Bank",
+  "phone_number": "+18005551234",
+  "steps": [
+    {
+      "id": "wait_greeting",
+      "type": "WAIT",
+      "description": "Wait for greeting to finish",
+      "timeout": 5.0,
+      "next_step": "main_menu"
+    },
+    {
+      "id": "main_menu",
+      "type": "LISTEN",
+      "description": "Listen to main menu options",
+      "timeout": 15.0,
+      "next_step": "press_3"
+    },
+    {
+      "id": "press_3",
+      "type": "DTMF",
+      "description": "Press 3 for account services",
+      "dtmf": "3",
+      "next_step": "sub_menu"
+    },
+    {
+      "id": "sub_menu",
+      "type": "LISTEN",
+      "description": "Listen to account services sub-menu",
+      "timeout": 15.0,
+      "next_step": "press_1"
+    },
+    {
+      "id": "press_1",
+      "type": "DTMF",
+      "description": "Press 1 for disputes",
+      "dtmf": "1",
+      "next_step": "hold"
+    },
+    {
+      "id": "hold",
+      "type": "HOLD",
+      "description": "Wait on hold for disputes agent",
+      "timeout": 7200,
+      "next_step": "transfer"
+    },
+    {
+      "id": "transfer",
+      "type": "TRANSFER",
+      "description": "Transfer to user's phone"
+    }
+  ]
+}
+```
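+
+To see how the `next_step` pointers chain the steps together, the sketch below walks a flow loaded from JSON like the example above. It is illustrative only: `walk_flow` is a hypothetical helper, and it assumes that the first entry in `steps` is the entry point and that revisiting a step is an error.
+
+```python
+import json
+
+
+def walk_flow(flow: dict) -> list[str]:
+    """Return step IDs in execution order by following next_step pointers."""
+    steps = {step["id"]: step for step in flow["steps"]}
+    order: list[str] = []
+    seen: set[str] = set()
+    # Assumption: the first listed step is where the flow starts.
+    current = flow["steps"][0]["id"] if flow["steps"] else None
+    while current is not None:
+        if current not in steps:
+            raise ValueError(f"next_step points to unknown step {current!r}")
+        if current in seen:
+            raise ValueError(f"cycle detected at step {current!r}")
+        seen.add(current)
+        order.append(current)
+        current = steps[current].get("next_step")
+    return order
+
+
+flow = json.loads("""
+{
+  "id": "demo",
+  "steps": [
+    {"id": "wait_greeting", "type": "WAIT", "next_step": "press_3"},
+    {"id": "press_3", "type": "DTMF", "dtmf": "3", "next_step": "hold"},
+    {"id": "hold", "type": "HOLD", "timeout": 7200}
+  ]
+}
+""")
+print(walk_flow(flow))
+```
+
+A check like this can catch a misspelled `next_step` before a flow is used on a live call.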
+
+## Call Flow Learner (`services/call_flow_learner.py`)
+
+Automatically builds call flows from exploration data.
+
+### How It Works
+
+1. **Exploration mode** records "discoveries": what Hold Slayer encountered and did at each step
+2. The learner converts discoveries into `CallFlowStep` objects
+3. Steps are ordered and linked (`next_step` pointers)
+4. The resulting `CallFlow` is saved for future calls
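+
+The linking in step 3 can be sketched as follows. This is not the learner's actual code: `link_steps` is a hypothetical helper that works on plain dicts and assumes the steps are already in call order.
+
+```python
+def link_steps(steps: list[dict]) -> list[dict]:
+    """Return copies of the steps with next_step set to the following step's id."""
+    linked = []
+    for index, step in enumerate(steps):
+        # The last step has no successor, so its next_step is None.
+        following = steps[index + 1]["id"] if index + 1 < len(steps) else None
+        linked.append({**step, "next_step": following})
+    return linked
+
+
+ordered = [
+    {"id": "wait_1", "type": "WAIT"},
+    {"id": "listen_1", "type": "LISTEN"},
+    {"id": "hold_1", "type": "HOLD"},
+]
+for step in link_steps(ordered):
+    print(step["id"], "->", step["next_step"])
+```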
+
+### Discovery Types
+
+| Discovery | Becomes Step |
+|-----------|-------------|
+| Heard IVR prompt, pressed DTMF | `LISTEN` → `DTMF` |
+| Detected hold music | `HOLD` |
+| Detected silence (waiting) | `WAIT` |
+| Heard speech (human) | `TRANSFER` |
+| Sent DTMF digits | `DTMF` |
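+
+The table above can be read as a lookup from discovery type to step types. The sketch below encodes it that way, under stated assumptions: the keys `wait`, `ivr_menu`, `hold`, and `human_detected` are the `type` values used in the example discoveries in this document, while `dtmf` is a made-up key for the "Sent DTMF digits" row. Whether the learner emits `LISTEN` and `DTMF` as two steps or as a single `LISTEN` step with `dtmf` set is not specified here; this sketch expands them into two.
+
+```python
+# Discovery type -> step types it produces (mirrors the table above).
+DISCOVERY_TO_STEP_TYPES = {
+    "ivr_menu": ["LISTEN", "DTMF"],
+    "hold": ["HOLD"],
+    "wait": ["WAIT"],
+    "human_detected": ["TRANSFER"],
+    "dtmf": ["DTMF"],  # hypothetical key, not taken from the source
+}
+
+
+def step_types_for(discoveries: list[dict]) -> list[str]:
+    """Expand a list of discoveries into the step types they map to."""
+    step_types: list[str] = []
+    for discovery in discoveries:
+        step_types.extend(DISCOVERY_TO_STEP_TYPES[discovery["type"]])
+    return step_types
+
+
+print(step_types_for([
+    {"type": "wait", "duration": 3.0},
+    {"type": "ivr_menu", "dtmf_sent": "1"},
+    {"type": "hold", "duration": 480.0},
+    {"type": "human_detected"},
+]))
+```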
+
+### Building a Flow
+
+```python
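+# Import path assumed from the file location in the heading above.
+from services.call_flow_learner import CallFlowLearner
+
+learner = CallFlowLearner()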
+
+# After an exploration call completes:
+discoveries = [
+    {"type": "wait", "duration": 3.0, "description": "Initial silence"},
+    {"type": "ivr_menu", "transcript": "Press 1 for billing...", "dtmf_sent": "1"},
+    {"type": "ivr_menu", "transcript": "Press 3 for disputes...", "dtmf_sent": "3"},
+    {"type": "hold", "duration": 480.0},
+    {"type": "human_detected", "transcript": "Thank you for calling..."},
+]
+
+flow = learner.build_flow(
+    discoveries=discoveries,
+    phone_number="+18005551234",
+    company="Chase Bank",
+    intent="dispute a charge",
+)
+# Returns a CallFlow with 5 steps: WAIT → LISTEN/DTMF → LISTEN/DTMF → HOLD → TRANSFER
+```
+
+### Merging Discoveries
+
+When the same number is called again with exploration, new discoveries can be merged into the existing flow:
+
+```python
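+# `flow` is the CallFlow built earlier; `new_discoveries` is a list of
+# discovery dicts from the latest exploration call.
+updated_flow = learner.merge_discoveries(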
+    existing_flow=flow,
+    new_discoveries=new_discoveries,
+)
+```
+
+This handles:
+- New menu options discovered
+- Changed IVR structure
+- Updated timing information
+- Success/failure tracking
+
+## REST API
+
+### List Call Flows
+
+```
+GET /api/call-flows
+GET /api/call-flows?company=Chase+Bank
+GET /api/call-flows?tag=banking
+```
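+
+A client might build these filtered URLs as sketched below. The base URL and the `list_flows_url` helper are assumptions for illustration; only the path and the `company` and `tag` parameters come from this document.
+
+```python
+from typing import Optional
+from urllib.parse import urlencode
+
+BASE_URL = "http://localhost:8000"  # assumed host and port
+
+
+def list_flows_url(company: Optional[str] = None, tag: Optional[str] = None) -> str:
+    """Build the list endpoint URL, adding only the filters that were given."""
+    filters = {"company": company, "tag": tag}
+    params = {key: value for key, value in filters.items() if value is not None}
+    query = urlencode(params)  # encodes spaces as "+", as in the examples above
+    return f"{BASE_URL}/api/call-flows" + (f"?{query}" if query else "")
+
+
+print(list_flows_url())
+print(list_flows_url(company="Chase Bank"))
+print(list_flows_url(tag="banking"))
+```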
+
+### Get Call Flow
+
+```
+GET /api/call-flows/{flow_id}
+```
+
+### Create Call Flow
+
+```
+POST /api/call-flows
+Content-Type: application/json
+
+{
+  "name": "Chase Bank — Disputes",
+  "company": "Chase Bank",
+  "phone_number": "+18005551234",
+  "steps": [ ... ]
+}
+```
+
+### Update Call Flow
+
+```
+PUT /api/call-flows/{flow_id}
+Content-Type: application/json
+
+{ ... updated flow ... }
+```
+
+### Delete Call Flow
+
+```
+DELETE /api/call-flows/{flow_id}
+```
+
+### Learn Flow from Exploration
+
+```
+POST /api/call-flows/learn
+Content-Type: application/json
+
+{
+  "call_id": "call_abc123",
+  "phone_number": "+18005551234",
+  "company": "Chase Bank"
+}
+```
+
+This triggers the Call Flow Learner to build a flow from the call's exploration data.
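+
+The request above could be assembled with the standard library as sketched below. The base URL and the `build_learn_request` helper are assumptions, and the request is only constructed, not sent.
+
+```python
+import json
+import urllib.request
+
+BASE_URL = "http://localhost:8000"  # assumed host and port
+
+
+def build_learn_request(call_id: str, phone_number: str, company: str) -> urllib.request.Request:
+    """Build (but do not send) the POST request for the learn endpoint."""
+    payload = {"call_id": call_id, "phone_number": phone_number, "company": company}
+    return urllib.request.Request(
+        f"{BASE_URL}/api/call-flows/learn",
+        data=json.dumps(payload).encode("utf-8"),
+        headers={"Content-Type": "application/json"},
+        method="POST",
+    )
+
+
+request = build_learn_request("call_abc123", "+18005551234", "Chase Bank")
+print(request.get_method(), request.full_url)
+```
+
+Passing the request to `urllib.request.urlopen` would send it. The shape of the response is not documented here, so the sketch stops before that point.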