# kottos

Engineering agents for Daedalus — powered by [Pallas](https://git.helu.ca/r/pallas).

Kottos is a pure agent project: Python agent definitions + YAML configuration.
The runtime (serving, registry, health checks, multimodal support) lives in Pallas.

## Architecture

```
Daedalus Backend — FastAPI
    │  MCP over StreamableHTTP
    ▼
Pallas MCP Bridge (pallas.server:main)
    │  reads agents.yaml for topology
    │  reads fastagent.config.yaml for LLM + model capabilities
    │
    ├── Registry      → /.well-known/mcp/server.json  (agent discovery)
    ├── Harper        → kernos_harper, gitea, argos, neo4j_cypher, grafana,
    │                   rommie, angelia, time, research, tech_research
    ├── Scotty        → kernos_scotty, argos, tech_research, neo4j_cypher, grafana, time
    ├── Research      → argos, neo4j_cypher
    └── Tech Research → context7, github, argos
```

## Project Structure

```
.
├── agents.yaml                     # Deployment topology — agents, ports, host, namespace
├── fastagent.config.yaml           # LLM provider, MCP servers, model capabilities (committed)
├── fastagent.secrets.yaml          # API keys and tokens (gitignored — never commit)
├── fastagent.secrets.yaml.example
├── agents/                         # Agent definitions (FastAgent @fast.agent decorators)
│   ├── harper.py
│   ├── scotty.py
│   ├── research.py
│   └── tech_research.py
├── docs/
│   └── pallas_integration.md
├── pyproject.toml
└── LICENSE
```

## Agents

| Agent | Port | MCP URL | Purpose |
|-------|------|---------|---------|
| Harper | 24101 | `http://puck.incus:24101/mcp` | Scrappy engineer — rapid prototyping, hacking, and creative problem-solving |
| Scotty | 24102 | `http://puck.incus:24102/mcp` | Systems administration — infrastructure diagnostics and security hardening |
| Research | 24150 | `http://puck.incus:24150/mcp` | Web search + knowledge graph chain |
| Tech Research | 24151 | `http://puck.incus:24151/mcp` | Technical investigation — library docs, code examples, API comparisons |
| Registry | 24100 | `http://puck.incus:24100/.well-known/mcp/server.json` | Agent discovery |

## Configuration

### `agents.yaml` — Deployment Topology

Single source of truth for agent names, ports, dependencies, host, and namespace.
Read by Pallas at startup.

```yaml
name: kottos
version: "1.0.0"
host: puck.incus
namespace: ca.helu.kottos
registry_port: 24100

agents:
  harper:
    module: agents.harper
    port: 24101
    title: Harper
    description: "Scrappy engineer — rapid prototyping, hacking, and creative problem-solving"
    depends_on: [research, tech_research]
  # ...
```

To deploy a different agent group, swap `agents.yaml` — no code changes needed.
Override the config path with `PALLAS_AGENTS_CONFIG` env var.
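
As a rough illustration of how the topology maps onto endpoints, the sketch below resolves the config path (honouring `PALLAS_AGENTS_CONFIG`) and derives each MCP URL from `host` and `port`. It is not Pallas's actual loader: the helper names `resolve_config_path` and `endpoint_urls` are hypothetical, the fallback to `agents.yaml` in the working directory is an assumption, and the topology is inlined as a dict so the example needs no YAML parser.

```python
import os
from pathlib import Path


def resolve_config_path(env=None):
    """Hypothetical helper: pick the topology file to load.

    Assumes ./agents.yaml is the default when PALLAS_AGENTS_CONFIG
    is unset or empty.
    """
    env = os.environ if env is None else env
    return Path(env.get("PALLAS_AGENTS_CONFIG") or "agents.yaml")


def endpoint_urls(topology):
    """Hypothetical helper: build the URLs listed in the Agents table."""
    host = topology["host"]
    urls = {
        "registry": f"http://{host}:{topology['registry_port']}"
                    "/.well-known/mcp/server.json",
    }
    for name, agent in topology["agents"].items():
        urls[name] = f"http://{host}:{agent['port']}/mcp"
    return urls


# Inlined stand-in for a parsed agents.yaml (values taken from this README).
TOPOLOGY = {
    "host": "puck.incus",
    "registry_port": 24100,
    "agents": {"harper": {"port": 24101}, "scotty": {"port": 24102}},
}

if __name__ == "__main__":
    print(resolve_config_path())
    for name, url in endpoint_urls(TOPOLOGY).items():
        print(f"{name}: {url}")
```

Keeping the URL derivation in one place mirrors the "single source of truth" idea: change `host` or a `port` once and every endpoint follows.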

### `fastagent.config.yaml` — LLM + Model Capabilities

Committed to the repo. Contains LLM provider settings and explicit model capability
declarations.

In Ansible-managed deployments this file is replaced by the
`fastagent.config.yaml.j2` template, which renders environment-specific values
for model, MCP URLs, etc.

```yaml
default_model: generic.Qwen3.5-35B-A3B-UD-Q4_K_XL.gguf

model_capabilities:
  vision: false
  context_window: 192000
  max_output_tokens: 16384
```

The `model_capabilities` section declares capabilities explicitly rather than
inferring them from the model name. They are exposed in the registry so that
Daedalus can use them when routing requests.
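
How Daedalus actually routes is not described here, so the following is only a sketch of the kind of check that explicit capabilities make possible. `ModelCapabilities` and `can_handle` are hypothetical names, and the rule that prompt and output tokens share the context window is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelCapabilities:
    """Mirror of the model_capabilities keys shown above."""
    vision: bool
    context_window: int
    max_output_tokens: int


def can_handle(caps, prompt_tokens, needs_vision=False, wanted_output_tokens=0):
    """Return True if a request fits within the declared capabilities."""
    if needs_vision and not caps.vision:
        return False
    if wanted_output_tokens > caps.max_output_tokens:
        return False
    # Assumption: prompt and output tokens share one context window.
    return prompt_tokens + wanted_output_tokens <= caps.context_window


# Values copied from the config excerpt above.
QWEN = ModelCapabilities(vision=False, context_window=192000, max_output_tokens=16384)

if __name__ == "__main__":
    print(can_handle(QWEN, prompt_tokens=1000))
    print(can_handle(QWEN, prompt_tokens=1000, needs_vision=True))
```

With the declared values, a text-only request of 1000 prompt tokens passes, while any request that needs vision is rejected regardless of size.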

### `fastagent.secrets.yaml` — API Keys and Tokens

Gitignored — never commit. Place in the repo root alongside `fastagent.config.yaml`.

In Ansible-managed deployments this file is replaced by the
`fastagent.secrets.yaml.j2` template which renders secrets from OCI Vault.

```yaml
openai:
  api_key: "your-key-here"

mcp:
  servers:
    angelia:
      headers:
        Authorization: "Bearer your-token"
    github:
      env:
        GITHUB_PERSONAL_ACCESS_TOKEN: "your-token"
    # ...
```

## Quickstart

```bash
# 1. Install dependencies (Python 3.13 required)
source ~/env/kottos/bin/activate
pip install -e .

# 2. Configure secrets
cp fastagent.secrets.yaml.example fastagent.secrets.yaml
# Edit: set api_key and service tokens

# 3. Start all agents
kottos

# 4. Verify
curl http://localhost:24101/mcp

# 5. Start a single agent
kottos --agent harper
```
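
To check every port at once rather than one `curl` at a time, a small probe along these lines may help. It is a sketch, not part of kottos: `is_listening` is a hypothetical helper, the port map is copied from the Agents table, and a successful TCP connect only shows that something is listening, not that it speaks MCP.

```python
import socket

# Ports from the Agents table; "localhost" matches the curl check above.
PORTS = {
    "registry": 24100,
    "harper": 24101,
    "scotty": 24102,
    "research": 24150,
    "tech_research": 24151,
}


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


if __name__ == "__main__":
    for name, port in PORTS.items():
        state = "up" if is_listening("localhost", port) else "down"
        print(f"{name:14} {port}  {state}")
```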

## Daedalus Integration

Daedalus connects to agents via the MCP Python SDK's `streamable_http_client`.

Registry endpoint: `http://puck.incus:24100/.well-known/mcp/server.json`

The registry includes model capabilities on each agent entry:

```json
{
  "capabilities": {
    "model": "Qwen3.5-35B-A3B-UD-Q4_K_XL.gguf",
    "vision": false,
    "context_window": 192000,
    "max_output_tokens": 16384
  }
}
```
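
Because only the `capabilities` fragment of the registry document is shown, a consumer that wants to stay agnostic about the surrounding layout could walk the JSON and collect every `capabilities` object it finds. The sketch below does that with the standard library only. `fetch_registry` and `iter_capabilities` are hypothetical helpers, and the `agents` / `name` wrapper in `SAMPLE` is invented purely for illustration.

```python
import json
import urllib.request

REGISTRY_URL = "http://puck.incus:24100/.well-known/mcp/server.json"


def fetch_registry(url=REGISTRY_URL, timeout=5.0):
    """Fetch and decode the registry document with a plain HTTP GET."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def iter_capabilities(node):
    """Yield every "capabilities" object found anywhere in the document."""
    if isinstance(node, dict):
        caps = node.get("capabilities")
        if isinstance(caps, dict):
            yield caps
        for value in node.values():
            yield from iter_capabilities(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_capabilities(item)


# Invented wrapper around the documented "capabilities" fragment.
SAMPLE = {
    "agents": [
        {
            "name": "harper",
            "capabilities": {
                "model": "Qwen3.5-35B-A3B-UD-Q4_K_XL.gguf",
                "vision": False,
                "context_window": 192000,
                "max_output_tokens": 16384,
            },
        }
    ]
}

if __name__ == "__main__":
    for caps in iter_capabilities(SAMPLE):
        print(caps["model"], caps["context_window"])
```

Against a live deployment, `iter_capabilities(fetch_registry())` would apply the same walk to the real document.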

## Deployment

Kottos runs two ways:

1. **Locally on caliban**, hand-started for iteration (`kottos` from the repo root). This is the flow documented above in *Quickstart*.
2. **In Ouranos / Virgo / Taurus via Ansible**, as a `systemd`-managed `pallas` process on the puck.incus container. This is the pipeline that feeds the Puck Services dashboard in Grafana.

### Ansible role

Lives in `ouranos/ansible/kottos/`:

| File | Purpose |
|---|---|
| `deploy.yml` | Main playbook — user/group, venv, systemd unit, config templating, registry probe. |
| `stage.yml` | Clones `git.helu.ca/r/kottos` at `{{ kottos_rel }}` and creates the release tarball. |
| `kottos.service.j2` | systemd unit. `SyslogIdentifier=kottos`, `StandardOutput=journal`, `PALLAS_LOG_STDOUT=1` via the env file. |
| `.env.j2` | Runtime environment for `pallas` — logging config, `PALLAS_AGENTS_CONFIG`. |
| `agents.yaml.j2` | Deployment topology with host/ports pulled from inventory. |
| `fastagent.config.yaml.j2` | LLM provider + MCP server URLs, parametric per environment. |
| `fastagent.secrets.yaml.j2` | API keys and auth tokens, rendered from Ansible Vault. |

### Inventory

Host variables live in `inventory/host_vars/puck.incus.yml` under **Kottos Configuration**:

```yaml
kottos_user: kottos
kottos_group: kottos
kottos_directory: /srv/kottos
kottos_host: "puck.incus"
kottos_registry_port: 24100
kottos_harper_port: 24101
kottos_scotty_port: 24102
kottos_research_port: 24150
kottos_tech_research_port: 24151
pallas_log_level: INFO
# Local Qwen served via fast-agent's Generic (OpenAI-compatible) provider.
# The openai_base_url slot is reserved for cloud OpenAI endpoints (e.g.
# Bedrock Mantle Chat Completions).
kottos_default_model: "generic.Qwen3.5-35B-A3B-UD-Q4_K_XL.gguf"
kottos_generic_base_url: "http://nyx.helu.ca:22079/v1"
# ...plus one entry per downstream MCP URL so each environment overrides freely
```

Every host variable is parametric — Virgo's `puck.virgo.yml` (or wherever the Pallas host lives) can override any value without touching the templates.

### Vault

Four vault keys are required, all documented in `inventory/group_vars/all/vault.yml.example`:

| Key | Used for |
|---|---|
| `vault_kottos_openai_api_key` | OpenAI-compatible LLM endpoint (nyx Qwen in Ouranos). |
| `vault_kottos_github_pat` | `GITHUB_PERSONAL_ACCESS_TOKEN` for the local GitHub MCP Docker container. |
| `vault_kottos_angelia_bearer` | Bearer token accepted by the Angelia MCP server. |
| `vault_kottos_mnemosyne_jwt` | Long-lived team JWT from the Daedalus admin UI. Mnemosyne validates it on every `search_memory` call and scopes results to this team's workspaces. |

### Deploying

Wired into `site.yml`:

```bash
cd ansible
ansible-playbook kottos/stage.yml    # clone repo + build tarball (local)
ansible-playbook kottos/deploy.yml   # deploy + template + start
```

Or run the full site (`ansible-playbook site.yml`) — kottos's stage + deploy steps are the last block in the sequence.

### Logs

The journal identifier is `kottos`, so on the host:

```bash
sudo journalctl -u kottos -f --output=cat | jq .
```

Alloy on puck's journal source relabels `__journal_syslog_identifier=kottos` to `{service="pallas", project="kottos"}` and forwards the entries to Loki. Everything shows up in Grafana's *Puck Services — Logs & Health* dashboard under the **Pallas** row, with per-agent colouring driven by the `component` JSON field (`harper`, `scotty`, `research`, `tech_research`).

For per-agent follow-along:

```logql
{service="pallas", project="kottos", component="harper"} | json
```
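
When working directly from the journal rather than Loki, a comparable per-agent view can be approximated by filtering the JSON lines on the `component` field. This is a minimal sketch: `filter_component` is a hypothetical helper, and the `message` key in the sample records is illustrative only, since the full log schema is not reproduced here.

```python
import json


def filter_component(lines, component):
    """Yield parsed JSON records whose "component" field matches."""
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON log line
        if isinstance(record, dict) and record.get("component") == component:
            yield record


# Made-up sample lines; only the "component" field comes from this README.
SAMPLE_LINES = [
    '{"component": "harper", "message": "tool call finished"}',
    '{"component": "scotty", "message": "disk check ok"}',
    "not a json line",
]

if __name__ == "__main__":
    for record in filter_component(SAMPLE_LINES, "harper"):
        print(json.dumps(record))
```

The function accepts any iterable of strings, so `sys.stdin` can be passed in place of `SAMPLE_LINES` when piping from `journalctl -u kottos -f --output=cat`.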

For the opaque-MCP-transport-failure trace stream (see Pallas's bearer-forwarding incident history):

```logql
{service="pallas", project="kottos"} |= "pallas.forward.trace" | json
```

See [logging.md](logging.md) for the full label schema, level policy, and add-a-new-service guide.

## Downstream MCP Servers

| Server | Host | URL |
|--------|------|-----|
| argos | miranda.incus | `http://miranda.incus:25534/mcp` |
| neo4j_cypher | circe.helu.ca | `http://circe.helu.ca:22034/mcp` |
| caliban | caliban.incus | `http://caliban.incus:22062/mcp` |
| rommie | caliban.incus | `http://caliban.incus:22061/mcp` |
| gitea | miranda.incus | `http://miranda.incus:25535/mcp` |
| grafana | miranda.incus | `http://miranda.incus:25533/mcp` |
| korax | korax.helu.ca | `http://korax.helu.ca:20261/mcp` |
| angelia | ouranos.helu.ca | `https://ouranos.helu.ca/mcp/` |
| github | local (Docker stdio) | `ghcr.io/github/github-mcp-server` |
| context7 | local (stdio) | `npx -y @upstash/context7-mcp` |
| time | local (stdio) | `mcp-server-time` |

## Notes

- **Python 3.13** required (`fast-agent-mcp` pins `>=3.13`)
- **Runtime:** [Pallas](https://git.helu.ca/r/pallas) — `pallas-mcp @ git+ssh://git@git.helu.ca:22022/r/pallas.git`
- **Transport:** StreamableHTTP (`/mcp`) throughout — not SSE
- **LLM:** Local Qwen via fast-agent's Generic (OpenAI-compatible) provider at
  `http://nyx.helu.ca:22079/v1`
- **Logging:** Console output — stdout → syslog → Alloy → Loki in production
- **Port scheme:** registry at 24100, agents 24101–24149, sub-agents 24150–24199
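
The port scheme in the last note can be expressed as a small lookup, which is handy when picking a port for a new agent. `port_role` is a hypothetical helper written for this README, not something Pallas is known to provide.

```python
def port_role(port):
    """Classify a port according to the port scheme in the Notes list."""
    if port == 24100:
        return "registry"
    if 24101 <= port <= 24149:
        return "agent"
    if 24150 <= port <= 24199:
        return "sub-agent"
    return "outside kottos range"


if __name__ == "__main__":
    for p in (24100, 24101, 24102, 24150, 24151):
        print(p, port_role(p))
```

Under this scheme Harper (24101) and Scotty (24102) classify as agents, while Research (24150) and Tech Research (24151) fall in the sub-agent range, which matches their appearance in Harper's `depends_on` list.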