# Services

The intelligence layer services that power Hold Slayer's decision-making, transcription, recording, analytics, and notifications.

## LLM Client (`services/llm_client.py`)

Async HTTP client for any OpenAI-compatible chat completion API. No SDK dependency, just httpx.

### Supported Backends

| Backend | URL | Notes |
|---------|-----|-------|
| Ollama | `http://localhost:11434/v1` | Local, free, good for dev |
| LM Studio | `http://localhost:1234/v1` | Local, free, GUI |
| vLLM | `http://localhost:8000/v1` | Local, fast, production |
| OpenAI | `https://api.openai.com/v1` | Cloud, paid, best quality |

### Usage

```python
client = LLMClient(
    base_url="http://localhost:11434/v1",
    model="llama3",
    api_key="not-needed",  # Ollama doesn't need a key
    timeout=30.0,
    max_tokens=1024,
    temperature=0.3,
)

# Simple chat
response = await client.chat("What is 2+2?")
# "4"

# Chat with system prompt
response = await client.chat(
    "Parse this menu transcript...",
    system="You are a phone menu parser. Return JSON.",
)

# Structured JSON response (auto-parses)
result = await client.chat_json(
    "Extract menu options from: Press 1 for billing, press 2 for support",
    system="Return JSON with 'options' array.",
)
# {"options": [{"digit": "1", "label": "billing"}, {"digit": "2", "label": "support"}]}
```

### IVR Menu Analysis

The primary use case is analyzing IVR transcripts to pick the right menu option:

```python
decision = await client.analyze_ivr_menu(
    transcript=(
        "Welcome to Chase Bank. Press 1 for account balance, "
        "press 2 for recent transactions, press 3 for disputes, "
        "press 0 for an agent."
    ),
    intent="dispute a charge from Amazon on December 15th",
    previous_selections=["main_menu"],
)
# {"action": "dtmf", "digits": "3", "reasoning": "Disputes is the correct department"}
```

### JSON Extraction

The client tolerates messy LLM output by trying the following in order:

1. Try `json.loads()` on the raw response
2. If that fails, look for `` ```json ... ``` `` markdown blocks
3. If that fails, look for `{...}` patterns in the text
4. If all fail, return an empty dict (the caller handles it gracefully)

### Stats Tracking

```python
stats = client.stats
# {
#     "total_requests": 47,
#     "total_errors": 2,
#     "avg_latency_ms": 234.5,
#     "model": "llama3",
#     "base_url": "http://localhost:11434/v1"
# }
```

### Error Handling

- HTTP errors return an empty string or dict, so an LLM failure never crashes the call
- Timeouts are configurable (default 30s)
- All errors are logged with full context
- Stats track error rates for monitoring

## Transcription Service (`services/transcription.py`)

Real-time speech-to-text using Speaches (a self-hosted Whisper API).

### Architecture

```
Audio frames (from AudioTap)
    │
    └── POST /v1/audio/transcriptions
        ├── model: whisper-large-v3
        ├── audio: WAV bytes
        └── language: en
            │
            └── Response: { "text": "Press 1 for billing..." }
```

### Usage

```python
service = TranscriptionService(
    speaches_url="http://perseus.helu.ca:22070",
    model="whisper-large-v3",
)

# Transcribe audio bytes
text = await service.transcribe(audio_bytes)
# "Welcome to Chase Bank. For English, press 1."

# Transcribe with language hint
text = await service.transcribe(audio_bytes, language="fr")
```

### Integration with Hold Slayer

The transcription service is called when the audio classifier detects speech (`IVR_PROMPT` or `LIVE_HUMAN`). The transcript is then:

1. Published as a `TRANSCRIPT_CHUNK` event (→ WebSocket clients)
2. Fed to the LLM for IVR menu analysis
3. Stored in the call's transcript history
4. Used by the Call Flow Learner to build reusable flows

## Recording Service (`services/recording.py`)

Manages call recordings via the PJSUA2 media pipeline.

### Storage Structure

```
recordings/
├── 2026/
│   ├── 01/
│   │   ├── 15/
│   │   │   ├── call_abc123_outbound.wav
│   │   │   ├── call_abc123_mixed.wav
│   │   │   └── call_def456_outbound.wav
│   │   └── 16/
│   │       └── ...
│   └── 02/
│       └── ...
```

### Recording Types

| Type | Description |
|------|-------------|
| **Outbound** | Audio from the company (IVR, hold music, agent) |
| **Inbound** | Audio from the user's device (after transfer) |
| **Mixed** | Both parties in one file (for review) |

### Usage

```python
service = RecordingService(
    storage_dir="recordings",
    max_recording_seconds=7200,  # 2 hours
    sample_rate=16000,
)

# Start recording
session = await service.start_recording(call_id, stream_id)
# session.path = "recordings/2026/01/15/call_abc123_outbound.wav"

# Stop recording
metadata = await service.stop_recording(call_id)
# metadata = { "duration": 847.3, "file_size": 27113600, "path": "..." }

# List recordings for a call
recordings = service.get_recordings(call_id)
```

## Call Analytics (`services/call_analytics.py`)

Tracks call metrics and provides insights for monitoring and optimization.

### Metrics Tracked

| Metric | Description |
|--------|-------------|
| Hold time | Duration spent on hold per call |
| Total call duration | End-to-end call time |
| Success rate | Percentage of calls that reached a human |
| IVR navigation time | Time spent navigating menus |
| Company patterns | Per-company hold time averages |
| Time-of-day trends | When hold times are shortest |

### Usage

```python
analytics = CallAnalytics(max_history=10000)

# Record a completed call
analytics.record_call(
    call_id="call_abc123",
    number="+18005551234",
    company="Chase Bank",
    hold_time=780,
    total_duration=847,
    success=True,
    ivr_steps=6,
)

# Get summary
summary = analytics.get_summary()
# {
#     "total_calls": 142,
#     "success_rate": 0.89,
#     "avg_hold_time": 623.4,
#     "avg_total_duration": 712.1,
# }

# Per-company stats
stats = analytics.get_company_stats("Chase Bank")
# {
#     "total_calls": 23,
#     "avg_hold_time": 845.2,
#     "best_time": "Tuesday 10:00 AM",
#     "success_rate": 0.91,
# }

# Top numbers by call volume
top = analytics.get_top_numbers(limit=10)

# Hold time trends by hour
trends = analytics.get_hold_time_trend()
# [{"hour": 9, "avg_hold": 320}, {"hour": 10, "avg_hold": 480}, ...]
```

## Notification Service (`services/notification.py`)

Sends alerts when important things happen on calls.

### Notification Channels

| Channel | Status | Use Case |
|---------|--------|----------|
| **WebSocket** | ✅ Active | Real-time UI updates (always on) |
| **SMS** | ✅ Active | Critical alerts (human detected, call failed) |
| **Push** | 🔮 Future | Mobile app notifications |

### Notification Priority

| Priority | Events | Delivery |
|----------|--------|----------|
| `CRITICAL` | Human detected, transfer started | WebSocket + SMS |
| `HIGH` | Call failed, call timeout | WebSocket + SMS |
| `NORMAL` | Hold detected, call ended | WebSocket only |
| `LOW` | IVR step, DTMF sent | WebSocket only |

### Event → Notification Mapping

| Event | Notification |
|-------|-------------|
| `HUMAN_DETECTED` | 🚨 "A live person picked up — transferring you now!" |
| `TRANSFER_STARTED` | 📞 "Your call has been connected. Pick up your phone!" |
| `CALL_FAILED` | ❌ "The call couldn't be completed." |
| `HOLD_DETECTED` | ⏳ "You're on hold. We'll notify you when someone picks up." |
| `IVR_STEP` | 📍 "Navigating phone menu..." |
| `IVR_DTMF_SENT` | 📱 "Pressed 3" |
| `CALL_ENDED` | 📴 "The call has ended." |

### Deduplication

The notification service tracks what has been sent per call to avoid spamming:

```python
# Won't send duplicate "on hold" notifications for the same call
self._notified: dict[str, set[str]]  # call_id → set of event dedup keys
```

Tracking is cleaned up when a call ends.

### SMS Configuration

SMS is sent for `CRITICAL` and `HIGH` priority notifications (the two priorities the table above routes to WebSocket + SMS) when `NOTIFY_SMS_NUMBER` is configured:

```env
NOTIFY_SMS_NUMBER=+15559876543
```

The SMS sender is a placeholder; wire up your preferred provider (Twilio, AWS SNS, etc.).
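
The priority routing, per-call deduplication, and pluggable SMS sender described above could fit together roughly as follows. This is a minimal sketch, not the actual contents of `services/notification.py`: `Priority`, `SmsSender`, `LoggingSmsSender`, and `SmsNotifier` are hypothetical names, and it assumes SMS goes to the priorities that the Notification Priority table marks as WebSocket + SMS.

```python
from enum import Enum
from typing import Optional, Protocol


class Priority(Enum):
    CRITICAL = "critical"
    HIGH = "high"
    NORMAL = "normal"
    LOW = "low"


# Assumption: mirrors the "WebSocket + SMS" rows of the priority table.
SMS_PRIORITIES = {Priority.CRITICAL, Priority.HIGH}


class SmsSender(Protocol):
    """Hypothetical interface a provider adapter (Twilio, AWS SNS, ...) would satisfy."""

    def send(self, to: str, body: str) -> None: ...


class LoggingSmsSender:
    """Stand-in sender that records messages instead of calling a provider."""

    def __init__(self) -> None:
        self.sent: list[tuple[str, str]] = []

    def send(self, to: str, body: str) -> None:
        self.sent.append((to, body))


class SmsNotifier:
    """Hypothetical helper: routes by priority and deduplicates per call."""

    def __init__(self, sender: SmsSender, sms_number: Optional[str] = None) -> None:
        self._sender = sender
        self._sms_number = sms_number  # e.g. the value of NOTIFY_SMS_NUMBER
        self._notified: dict[str, set[str]] = {}  # call_id -> event dedup keys

    def notify(self, call_id: str, event: str, priority: Priority, message: str) -> bool:
        """Return True if an SMS was sent for this event."""
        seen = self._notified.setdefault(call_id, set())
        if event in seen:
            return False  # already notified for this call
        seen.add(event)
        if self._sms_number and priority in SMS_PRIORITIES:
            self._sender.send(self._sms_number, message)
            return True
        return False  # WebSocket-only priority, or no SMS number configured

    def call_ended(self, call_id: str) -> None:
        """Drop dedup tracking once the call is over."""
        self._notified.pop(call_id, None)


sender = LoggingSmsSender()
notifier = SmsNotifier(sender, sms_number="+15559876543")
notifier.notify("call_abc123", "HUMAN_DETECTED", Priority.CRITICAL, "A live person picked up")
notifier.notify("call_abc123", "HUMAN_DETECTED", Priority.CRITICAL, "A live person picked up")  # duplicate, skipped
notifier.notify("call_abc123", "HOLD_DETECTED", Priority.NORMAL, "You're on hold")  # WebSocket only
print(len(sender.sent))  # 1
```

A real provider adapter would only need to expose the same `send(to, body)` method. In a deployment the number would be read from `NOTIFY_SMS_NUMBER` rather than passed as a literal.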