hold-slayer/docs/call-flows.md
Robert Helewka ecf37658ce feat: add initial Hold Slayer AI telephony gateway implementation
2026-03-21 19:23:26 +00:00

Call Flows

Call flows are reusable IVR navigation trees that tell Hold Slayer exactly how to navigate a company's phone menu. Once a flow is learned (manually or via exploration), subsequent calls to the same number skip the LLM analysis and follow the stored steps directly.
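The lookup described above can be pictured with a minimal sketch. It is only an illustration, not Hold Slayer's actual code: `pick_strategy` and the in-memory `stored_flows` dict keyed by phone number are hypothetical names.

```python
from typing import Optional


def pick_strategy(phone_number: str, stored_flows: dict) -> tuple:
    """Return ("follow_flow", flow) if a flow is stored for the number, else ("explore", None)."""
    flow: Optional[dict] = stored_flows.get(phone_number)
    if flow is not None:
        # A learned flow exists: follow its steps and skip the LLM analysis.
        return "follow_flow", flow
    # No flow yet: fall back to exploration so one can be learned.
    return "explore", None


# Hypothetical store holding one learned flow.
stored_flows = {"+18005551234": {"id": "chase_bank_disputes", "steps": []}}
known = pick_strategy("+18005551234", stored_flows)
unknown = pick_strategy("+18005550000", stored_flows)
print(known[0], unknown[0])
```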

Data Model

CallFlowStep

A single step in the IVR navigation:

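from typing import Optional

from pydantic import BaseModel  # assumed: the models are Pydantic models

# CallFlowStepType is the enum of the step types listed below.
class CallFlowStep(BaseModel):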
    id: str                          # Unique step identifier
    type: CallFlowStepType           # DTMF, WAIT, LISTEN, HOLD, SPEAK, TRANSFER
    description: str                 # Human-readable description
    dtmf: Optional[str] = None       # Digits to press (for DTMF steps)
    timeout: float = 10.0            # Max seconds to wait
    next_step: Optional[str] = None  # ID of the next step
    conditions: dict = {}            # Conditional branching rules
    metadata: dict = {}              # Extra data (transcript patterns, etc.)

Step Types

| Type | Purpose | Key Fields |
|------|---------|------------|
| DTMF | Press touch-tone digits | dtmf="3" |
| WAIT | Pause for a duration | timeout=5.0 |
| LISTEN | Record + transcribe + decide | timeout=15.0, optional dtmf for hardcoded response |
| HOLD | Wait on hold, monitor for human | timeout=7200 (max hold time) |
| SPEAK | Play audio to the call | metadata={"audio_file": "greeting.wav"} |
| TRANSFER | Bridge call to user's device | metadata={"device": "sip_phone"} |
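As a rough illustration of how the step types and their key fields fit together, here is a sketch of the enum plus a small lint helper. The enum's string values and the `missing_key_fields` helper are assumptions made for this example; the project's real `CallFlowStepType` may differ. TRANSFER's `device` is treated as optional here because the example flow later in this document omits it.

```python
from enum import Enum


class CallFlowStepType(str, Enum):
    """Step types from the table above (string values are assumed)."""
    DTMF = "DTMF"
    WAIT = "WAIT"
    LISTEN = "LISTEN"
    HOLD = "HOLD"
    SPEAK = "SPEAK"
    TRANSFER = "TRANSFER"


def missing_key_fields(step: dict) -> list:
    """Hypothetical lint: list the key fields a step lacks for its type."""
    step_type = CallFlowStepType(step["type"])  # raises ValueError on unknown types
    missing = []
    if step_type is CallFlowStepType.DTMF and not step.get("dtmf"):
        missing.append("dtmf")
    if step_type is CallFlowStepType.SPEAK and "audio_file" not in step.get("metadata", {}):
        missing.append("metadata.audio_file")
    return missing


print(missing_key_fields({"id": "press_3", "type": "DTMF"}))
print(missing_key_fields({"id": "press_3", "type": "DTMF", "dtmf": "3"}))
```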

CallFlow

A complete IVR navigation tree:

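from datetime import datetime
from typing import Optional

from pydantic import BaseModel  # assumed: the models are Pydantic models

# CallFlowStep is the model defined above.
class CallFlow(BaseModel):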
    id: str                          # "chase_bank_main"
    name: str                        # "Chase Bank — Main Menu"
    company: Optional[str]           # "Chase Bank"
    phone_number: Optional[str]      # "+18005551234"
    description: Optional[str]       # "Navigate to disputes department"
    steps: list[CallFlowStep]        # Ordered list of steps
    created_at: datetime
    updated_at: datetime
    version: int = 1
    tags: list[str] = []             # ["banking", "disputes"]
    success_count: int = 0           # Times this flow succeeded
    fail_count: int = 0              # Times this flow failed

Example Call Flow

{
  "id": "chase_bank_disputes",
  "name": "Chase Bank — Disputes",
  "company": "Chase Bank",
  "phone_number": "+18005551234",
  "steps": [
    {
      "id": "wait_greeting",
      "type": "WAIT",
      "description": "Wait for greeting to finish",
      "timeout": 5.0,
      "next_step": "main_menu"
    },
    {
      "id": "main_menu",
      "type": "LISTEN",
      "description": "Listen to main menu options",
      "timeout": 15.0,
      "next_step": "press_3"
    },
    {
      "id": "press_3",
      "type": "DTMF",
      "description": "Press 3 for account services",
      "dtmf": "3",
      "next_step": "sub_menu"
    },
    {
      "id": "sub_menu",
      "type": "LISTEN",
      "description": "Listen to account services sub-menu",
      "timeout": 15.0,
      "next_step": "press_1"
    },
    {
      "id": "press_1",
      "type": "DTMF",
      "description": "Press 1 for disputes",
      "dtmf": "1",
      "next_step": "hold"
    },
    {
      "id": "hold",
      "type": "HOLD",
      "description": "Wait on hold for disputes agent",
      "timeout": 7200,
      "next_step": "transfer"
    },
    {
      "id": "transfer",
      "type": "TRANSFER",
      "description": "Transfer to user's phone"
    }
  ]
}
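The `next_step` pointers in a flow like this one can be sanity-checked before it is used. The sketch below assumes that execution starts at the first step in the list and ends at the step with no `next_step`, and it ignores `conditions`; `walk_flow` is a hypothetical helper, not part of Hold Slayer.

```python
def walk_flow(flow: dict) -> list:
    """Follow next_step pointers from the first step and return the visited step IDs."""
    steps_by_id = {step["id"]: step for step in flow["steps"]}
    order = []
    current = flow["steps"][0]["id"] if flow["steps"] else None
    while current is not None:
        if current not in steps_by_id:
            raise ValueError(f"next_step points at unknown step {current!r}")
        if current in order:
            raise ValueError(f"cycle detected at step {current!r}")
        order.append(current)
        current = steps_by_id[current].get("next_step")
    return order


# A trimmed version of the example flow above.
flow = {
    "steps": [
        {"id": "wait_greeting", "type": "WAIT", "next_step": "press_3"},
        {"id": "press_3", "type": "DTMF", "dtmf": "3", "next_step": "hold"},
        {"id": "hold", "type": "HOLD", "timeout": 7200, "next_step": "transfer"},
        {"id": "transfer", "type": "TRANSFER"},
    ]
}
print(walk_flow(flow))  # ['wait_greeting', 'press_3', 'hold', 'transfer']
```

A dangling pointer or a loop raises `ValueError`, which is usually easier to debug than a call that stalls mid-menu.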

Call Flow Learner (services/call_flow_learner.py)

Automatically builds call flows from exploration data.

How It Works

  1. Exploration mode records "discoveries": what Hold Slayer encountered and what it did at each step
  2. The learner converts discoveries into CallFlowStep objects
  3. Steps are ordered and linked (next_step pointers)
  4. The resulting CallFlow is saved for future calls

Discovery Types

| Discovery | Becomes Step |
|-----------|--------------|
| Heard IVR prompt, pressed DTMF | LISTEN + DTMF |
| Detected hold music | HOLD |
| Detected silence (waiting) | WAIT |
| Heard speech (human) | TRANSFER |
| Sent DTMF digits | DTMF |

Building a Flow

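# Import path assumed from the module location, services/call_flow_learner.py
from services.call_flow_learner import CallFlowLearner

learner = CallFlowLearner()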

# After an exploration call completes:
discoveries = [
    {"type": "wait", "duration": 3.0, "description": "Initial silence"},
    {"type": "ivr_menu", "transcript": "Press 1 for billing...", "dtmf_sent": "1"},
    {"type": "ivr_menu", "transcript": "Press 3 for disputes...", "dtmf_sent": "3"},
    {"type": "hold", "duration": 480.0},
    {"type": "human_detected", "transcript": "Thank you for calling..."},
]

flow = learner.build_flow(
    discoveries=discoveries,
    phone_number="+18005551234",
    company="Chase Bank",
    intent="dispute a charge",
)
# Returns a CallFlow with 5 steps: WAIT → LISTEN/DTMF → LISTEN/DTMF → HOLD → TRANSFER
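To make the conversion from "How It Works" concrete, here is a minimal sketch of mapping discoveries to linked steps. It is not the real `CallFlowLearner`: `discoveries_to_steps`, the `step_N` IDs, and the choice to represent each IVR menu as a single LISTEN step that carries the pressed digits are all assumptions made for illustration. Plain dicts stand in for `CallFlowStep` objects.

```python
def discoveries_to_steps(discoveries: list) -> list:
    """Convert exploration discoveries into linked step dicts (illustrative only)."""
    steps = []
    for index, discovery in enumerate(discoveries):
        kind = discovery["type"]
        step = {"id": f"step_{index}", "description": discovery.get("description", kind)}
        if kind == "wait":
            step["type"] = "WAIT"
            step["timeout"] = discovery["duration"]
        elif kind == "ivr_menu":
            # Assumption: one LISTEN step that also records the digits that were pressed.
            step["type"] = "LISTEN"
            step["dtmf"] = discovery["dtmf_sent"]
            step["metadata"] = {"transcript": discovery["transcript"]}
        elif kind == "hold":
            step["type"] = "HOLD"
        elif kind == "human_detected":
            step["type"] = "TRANSFER"
        else:
            raise ValueError(f"unknown discovery type: {kind!r}")
        steps.append(step)
    # Link each step to its successor; the last step gets no next_step.
    for step, following in zip(steps, steps[1:]):
        step["next_step"] = following["id"]
    return steps


discoveries = [
    {"type": "wait", "duration": 3.0, "description": "Initial silence"},
    {"type": "ivr_menu", "transcript": "Press 1 for billing...", "dtmf_sent": "1"},
    {"type": "ivr_menu", "transcript": "Press 3 for disputes...", "dtmf_sent": "3"},
    {"type": "hold", "duration": 480.0},
    {"type": "human_detected", "transcript": "Thank you for calling..."},
]
steps = discoveries_to_steps(discoveries)
print(" -> ".join(step["type"] for step in steps))  # WAIT -> LISTEN -> LISTEN -> HOLD -> TRANSFER
```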

Merging Discoveries

When the same number is called again in exploration mode, the new discoveries can be merged into the existing flow:

updated_flow = learner.merge_discoveries(
    existing_flow=flow,
    new_discoveries=new_discoveries,
)

This handles:

  • New menu options discovered
  • Changed IVR structure
  • Updated timing information
  • Success/failure tracking

REST API

List Call Flows

GET /api/call-flows
GET /api/call-flows?company=Chase+Bank
GET /api/call-flows?tag=banking
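The filter URLs above can be built with the standard library. In this sketch `list_flows_url` is a hypothetical helper, and the base URL `http://localhost:8000` is an assumption; only the path and the `company` and `tag` parameters come from the endpoints listed above.

```python
from urllib.parse import urlencode


def list_flows_url(base_url: str, company=None, tag=None) -> str:
    """Build the list endpoint URL, adding only the filters that were given."""
    params = {key: value for key, value in (("company", company), ("tag", tag)) if value is not None}
    path = f"{base_url.rstrip('/')}/api/call-flows"
    query = urlencode(params)  # encodes spaces as "+", matching the example above
    return f"{path}?{query}" if query else path


print(list_flows_url("http://localhost:8000", company="Chase Bank"))
# http://localhost:8000/api/call-flows?company=Chase+Bank
```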

Get Call Flow

GET /api/call-flows/{flow_id}

Create Call Flow

POST /api/call-flows
Content-Type: application/json

{
  "name": "Chase Bank — Disputes",
  "company": "Chase Bank",
  "phone_number": "+18005551234",
  "steps": [ ... ]
}

Update Call Flow

PUT /api/call-flows/{flow_id}
Content-Type: application/json

{ ... updated flow ... }

Delete Call Flow

DELETE /api/call-flows/{flow_id}

Learn Flow from Exploration

POST /api/call-flows/learn
Content-Type: application/json

{
  "call_id": "call_abc123",
  "phone_number": "+18005551234",
  "company": "Chase Bank"
}

This triggers the Call Flow Learner to build a flow from the call's exploration data.
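A client could prepare the learn request as follows. This is a sketch using only the Python standard library; `build_learn_request` is a hypothetical helper and the base URL is an assumption. The request is built but not sent, so the example runs without a server.

```python
import json
from urllib.request import Request


def build_learn_request(base_url: str, call_id: str, phone_number: str, company: str) -> Request:
    """Prepare (but do not send) the POST /api/call-flows/learn request."""
    body = json.dumps(
        {"call_id": call_id, "phone_number": phone_number, "company": company}
    ).encode("utf-8")
    return Request(
        url=f"{base_url.rstrip('/')}/api/call-flows/learn",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_learn_request("http://localhost:8000", "call_abc123", "+18005551234", "Chase Bank")
print(request.get_method(), request.full_url)
# To send it against a running server: urllib.request.urlopen(request)
```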