feat: add initial Hold Slayer AI telephony gateway implementation
Complete project scaffolding and core implementation of an AI-powered telephony system that calls companies, navigates IVR menus, waits on hold, and transfers to the user when a human answers. Key components: - FastAPI server with REST API, WebSocket, and MCP (SSE) interfaces - SIP/VoIP call management via PJSUA2 with RTP audio streaming - LLM-powered IVR navigation using OpenAI/Anthropic with tool calling - Hold detection service combining audio analysis and silence detection - Real-time STT (Whisper/Deepgram) and TTS (OpenAI/Piper) pipelines - Call recording with per-channel and mixed audio capture - Event bus (asyncio pub/sub) for real-time client updates - Web dashboard with live call monitoring - SQLite persistence via SQLAlchemy with call history and analytics - Notification support (email, SMS, webhook, desktop) - Docker Compose deployment with Opal VoIP and Opal Media containers - Comprehensive test suite with unit, integration, and E2E tests - Simplified .gitignore and full project documentation in README
This commit is contained in:
165
docs/configuration.md
Normal file
165
docs/configuration.md
Normal file
@@ -0,0 +1,165 @@
|
||||
# Configuration
|
||||
|
||||
All configuration is via environment variables, loaded through Pydantic Settings. Copy `.env.example` to `.env` and edit.
|
||||
|
||||
## Environment Variables
|
||||
|
||||
### SIP Trunk
|
||||
|
||||
| Variable | Description | Default | Required |
|
||||
|----------|-------------|---------|----------|
|
||||
| `SIP_TRUNK_HOST` | Your SIP provider hostname | — | Yes |
|
||||
| `SIP_TRUNK_PORT` | SIP signaling port | `5060` | No |
|
||||
| `SIP_TRUNK_USERNAME` | SIP auth username | — | Yes |
|
||||
| `SIP_TRUNK_PASSWORD` | SIP auth password | — | Yes |
|
||||
| `SIP_TRUNK_DID` | Your phone number (E.164) | — | Yes |
|
||||
| `SIP_TRUNK_TRANSPORT` | Transport protocol (`udp`, `tcp`, `tls`) | `udp` | No |
|
||||
|
||||
### Gateway
|
||||
|
||||
| Variable | Description | Default | Required |
|
||||
|----------|-------------|---------|----------|
|
||||
| `GATEWAY_SIP_PORT` | Port for device SIP registration | `5080` | No |
|
||||
| `GATEWAY_RTP_PORT_MIN` | Minimum RTP port | `10000` | No |
|
||||
| `GATEWAY_RTP_PORT_MAX` | Maximum RTP port | `20000` | No |
|
||||
| `GATEWAY_HOST` | Bind address | `0.0.0.0` | No |
|
||||
|
||||
### LLM
|
||||
|
||||
| Variable | Description | Default | Required |
|
||||
|----------|-------------|---------|----------|
|
||||
| `LLM_BASE_URL` | OpenAI-compatible API endpoint | `http://localhost:11434/v1` | No |
|
||||
| `LLM_MODEL` | Model name for IVR analysis | `llama3` | No |
|
||||
| `LLM_API_KEY` | API key (if required) | `not-needed` | No |
|
||||
| `LLM_TIMEOUT` | Request timeout in seconds | `30.0` | No |
|
||||
| `LLM_MAX_TOKENS` | Max tokens per response | `1024` | No |
|
||||
| `LLM_TEMPERATURE` | Sampling temperature | `0.3` | No |
|
||||
|
||||
### Speech-to-Text
|
||||
|
||||
| Variable | Description | Default | Required |
|
||||
|----------|-------------|---------|----------|
|
||||
| `SPEACHES_URL` | Speaches/Whisper STT endpoint | `http://localhost:22070` | No |
|
||||
| `SPEACHES_MODEL` | Whisper model name | `whisper-large-v3` | No |
|
||||
|
||||
### Database
|
||||
|
||||
| Variable | Description | Default | Required |
|
||||
|----------|-------------|---------|----------|
|
||||
| `DATABASE_URL` | PostgreSQL or SQLite connection string | `sqlite+aiosqlite:///./hold_slayer.db` | No |
|
||||
|
||||
### Notifications
|
||||
|
||||
| Variable | Description | Default | Required |
|
||||
|----------|-------------|---------|----------|
|
||||
| `NOTIFY_SMS_NUMBER` | Phone number for SMS alerts (E.164) | — | No |
|
||||
|
||||
### Audio Classifier
|
||||
|
||||
| Variable | Description | Default | Required |
|
||||
|----------|-------------|---------|----------|
|
||||
| `CLASSIFIER_WINDOW_SECONDS` | Audio window size for classification | `3.0` | No |
|
||||
| `CLASSIFIER_SILENCE_THRESHOLD` | RMS below this = silence | `0.85` | No |
|
||||
| `CLASSIFIER_MUSIC_THRESHOLD` | Spectral flatness below this = music | `0.7` | No |
|
||||
| `CLASSIFIER_SPEECH_THRESHOLD` | Spectral flatness above this = speech | `0.6` | No |
|
||||
|
||||
### Hold Slayer
|
||||
|
||||
| Variable | Description | Default | Required |
|
||||
|----------|-------------|---------|----------|
|
||||
| `MAX_HOLD_TIME` | Maximum seconds to wait on hold | `7200` | No |
|
||||
| `HOLD_CHECK_INTERVAL` | Seconds between audio checks | `2.0` | No |
|
||||
| `DEFAULT_TRANSFER_DEVICE` | Device to transfer to | `sip_phone` | No |
|
||||
|
||||
### Recording
|
||||
|
||||
| Variable | Description | Default | Required |
|
||||
|----------|-------------|---------|----------|
|
||||
| `RECORDING_DIR` | Directory for WAV recordings | `recordings` | No |
|
||||
| `RECORDING_MAX_SECONDS` | Maximum recording duration | `7200` | No |
|
||||
| `RECORDING_SAMPLE_RATE` | Audio sample rate | `16000` | No |
|
||||
|
||||
## Settings Architecture
|
||||
|
||||
Configuration is managed by Pydantic Settings in `config.py`:
|
||||
|
||||
```python
|
||||
from config import get_settings
|
||||
|
||||
settings = get_settings()
|
||||
settings.sip_trunk_host # "sip.provider.com"
|
||||
settings.llm.base_url # "http://localhost:11434/v1"
|
||||
settings.llm.model # "llama3"
|
||||
settings.speaches_url # "http://localhost:22070"
|
||||
settings.database_url # "sqlite+aiosqlite:///./hold_slayer.db"
|
||||
```
|
||||
|
||||
LLM settings are nested under `settings.llm` as a `LLMSettings` sub-model.
|
||||
|
||||
## Deployment
|
||||
|
||||
### Development
|
||||
|
||||
```bash
|
||||
# 1. Clone and install
|
||||
git clone <repo-url>
|
||||
cd hold-slayer
|
||||
python -m venv .venv
|
||||
source .venv/bin/activate
|
||||
pip install -e ".[dev]"
|
||||
|
||||
# 2. Configure
|
||||
cp .env.example .env
|
||||
# Edit .env
|
||||
|
||||
# 3. Start Ollama (for LLM)
|
||||
ollama serve
|
||||
ollama pull llama3
|
||||
|
||||
# 4. Start Speaches (for STT)
|
||||
docker run -p 22070:8000 ghcr.io/speaches-ai/speaches
|
||||
|
||||
# 5. Run
|
||||
uvicorn main:app --host 0.0.0.0 --port 8000 --reload
|
||||
```
|
||||
|
||||
### Production
|
||||
|
||||
```bash
|
||||
# Use PostgreSQL instead of SQLite
|
||||
DATABASE_URL=postgresql+asyncpg://user:pass@localhost/hold_slayer
|
||||
|
||||
# Use vLLM for faster inference
|
||||
LLM_BASE_URL=http://localhost:8000/v1
|
||||
LLM_MODEL=meta-llama/Llama-3-8B-Instruct
|
||||
|
||||
# Run with multiple workers (note: each worker is independent)
|
||||
uvicorn main:app --host 0.0.0.0 --port 8000 --workers 1
|
||||
```
|
||||
|
||||
Note: Hold Slayer is designed as a single-process application. Multiple workers would each have their own SIP engine and call state. For high availability, run behind a load balancer with sticky sessions.
|
||||
|
||||
### Docker
|
||||
|
||||
```dockerfile
|
||||
FROM python:3.13-slim
|
||||
|
||||
# Install system dependencies for PJSUA2 and Sippy
|
||||
RUN apt-get update && apt-get install -y \
|
||||
build-essential \
|
||||
libpjproject-dev \
|
||||
&& rm -rf /var/lib/apt/lists/*
|
||||
|
||||
WORKDIR /app
|
||||
COPY . .
|
||||
RUN pip install -e .
|
||||
|
||||
EXPOSE 8000 5080/udp 10000-20000/udp
|
||||
|
||||
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
|
||||
```
|
||||
|
||||
Port mapping:
|
||||
- `8000` — HTTP API + WebSocket + MCP
|
||||
- `5080/udp` — SIP device registration
|
||||
- `10000-20000/udp` — RTP media ports
|
||||
Reference in New Issue
Block a user