# kottos

Engineering agents for Daedalus — powered by [Pallas](https://git.helu.ca/r/pallas).

Kottos is a pure agent project: Python agent definitions + YAML configuration.
The runtime (serving, registry, health checks, multimodal support) lives in Pallas.

## Architecture

```
Daedalus Backend — FastAPI
    │  MCP over StreamableHTTP
    ▼
Pallas MCP Bridge (pallas.server:main)
    │  reads agents.yaml for topology
    │  reads fastagent.config.yaml for LLM + model capabilities
    │
    ├── Registry      → /.well-known/mcp/server.json (agent discovery)
    ├── Harper        → kernos_harper, gitea, argos, neo4j_cypher, grafana,
    │                   rommie, angelia, time, research, tech_research
    ├── Scotty        → kernos_scotty, argos, tech_research, neo4j_cypher, grafana, time
    ├── Research      → argos, neo4j_cypher
    └── Tech Research → context7, github, argos
```

## Project Structure

```
.
├── agents.yaml              # Deployment topology — agents, ports, host, namespace
├── fastagent.config.yaml    # LLM provider, MCP servers, model capabilities (committed)
├── fastagent.secrets.yaml   # API keys and tokens (gitignored — never commit)
├── fastagent.secrets.yaml.example
├── agents/                  # Agent definitions (FastAgent @fast.agent decorators)
│   ├── harper.py
│   ├── scotty.py
│   ├── research.py
│   └── tech_research.py
├── docs/
│   └── pallas_integration.md
├── pyproject.toml
└── LICENSE
```

## Agents

| Agent | Port | MCP URL | Purpose |
|-------|------|---------|---------|
| Harper | 24101 | `http://puck.incus:24101/mcp` | Scrappy engineer — rapid prototyping, hacking, and creative problem-solving |
| Scotty | 24102 | `http://puck.incus:24102/mcp` | Systems administration — infrastructure diagnostics and security hardening |
| Research | 24150 | `http://puck.incus:24150/mcp` | Web search + knowledge graph chain |
| Tech Research | 24151 | `http://puck.incus:24151/mcp` | Technical investigation — library docs, code examples, API comparisons |
| Registry | 24100 | `http://puck.incus:24100/.well-known/mcp/server.json` | Agent discovery |

## Configuration

### `agents.yaml` — Deployment Topology

Single source of truth for agent names, ports, dependencies, host, and namespace.
Read by Pallas at startup.

```yaml
name: kottos
version: "1.0.0"
host: puck.incus
namespace: ca.helu.kottos
registry_port: 24100

agents:
  harper:
    module: agents.harper
    port: 24101
    title: Harper
    description: "Scrappy engineer — rapid prototyping, hacking, and creative problem-solving"
    depends_on: [research, tech_research]
  # ...
```

To deploy a different agent group, swap `agents.yaml`; no code changes are needed.
To override the config path, set the `PALLAS_AGENTS_CONFIG` environment variable.
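
As a rough illustration of how the topology maps onto the endpoints listed under *Agents*, the sketch below derives each URL from an already-parsed `agents.yaml`. The function name `mcp_urls` and the inline `TOPOLOGY` dict are hypothetical stand-ins; Pallas's own loader may work differently.

```python
# Hypothetical helper, not part of Pallas: derive endpoint URLs from a
# topology dict shaped like the agents.yaml excerpt above.

def mcp_urls(topology: dict) -> dict[str, str]:
    """Map 'registry' and each agent name to its URL."""
    host = topology["host"]
    urls = {
        "registry": (
            f"http://{host}:{topology['registry_port']}"
            "/.well-known/mcp/server.json"
        )
    }
    for name, agent in topology.get("agents", {}).items():
        urls[name] = f"http://{host}:{agent['port']}/mcp"
    return urls


# Inline stand-in for the parsed YAML (only the fields used here).
TOPOLOGY = {
    "host": "puck.incus",
    "registry_port": 24100,
    "agents": {"harper": {"port": 24101}, "scotty": {"port": 24102}},
}

for name, url in mcp_urls(TOPOLOGY).items():
    print(f"{name}: {url}")
```

Swapping in a different topology dict changes every derived URL at once, which is the practical meaning of "single source of truth" here.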

### `fastagent.config.yaml` — LLM + Model Capabilities

Committed to the repo. Contains LLM provider settings and explicit model capability
declarations.

In Ansible-managed deployments this file is rendered from the
`fastagent.config.yaml.j2` template, which supplies environment-specific values
for the model, MCP URLs, and so on.

```yaml
default_model: generic.Qwen3.5-35B-A3B-UD-Q4_K_XL.gguf

model_capabilities:
  vision: false
  context_window: 192000
  max_output_tokens: 16384
```

The `model_capabilities` section declares capabilities explicitly rather than
inferring them from the model name. These values are exposed in the registry so
that Daedalus can use them when routing requests.

### `fastagent.secrets.yaml` — API Keys and Tokens

Gitignored — never commit. Place in the repo root alongside `fastagent.config.yaml`.

In Ansible-managed deployments this file is rendered from the
`fastagent.secrets.yaml.j2` template, which pulls secrets from Ansible Vault.

```yaml
openai:
  api_key: "your-key-here"

mcp:
  servers:
    angelia:
      headers:
        Authorization: "Bearer your-token"
    github:
      env:
        GITHUB_PERSONAL_ACCESS_TOKEN: "your-token"
    # ...
```

## Quickstart

```bash
# 1. Install dependencies (Python 3.13 required)
source ~/env/kottos/bin/activate
pip install -e .

# 2. Configure secrets
cp fastagent.secrets.yaml.example fastagent.secrets.yaml
# Edit: set api_key and service tokens

# 3. Start all agents
kottos

# 4. Verify
curl http://localhost:24101/mcp

# 5. Start a single agent
kottos --agent harper
```
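
Step 4 checks a single endpoint. As a minimal sketch, the snippet below checks that every port from the *Agents* table accepts TCP connections. It only confirms that something is listening, not that the MCP endpoint behaves correctly, and the names `port_open` and `check_all` are hypothetical.

```python
import socket

# Ports from the Agents table; localhost matches the Quickstart.
PORTS = {
    "registry": 24100,
    "harper": 24101,
    "scotty": 24102,
    "research": 24150,
    "tech_research": 24151,
}


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


def check_all(host: str = "localhost") -> dict[str, bool]:
    """Probe every known port and report which ones are listening."""
    return {name: port_open(host, port) for name, port in PORTS.items()}
```

Calling `check_all()` after `kottos` has started should report `True` for each entry; when only `kottos --agent harper` is running, the other agents are expected to show `False`.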

## Daedalus Integration

Daedalus connects to agents via the MCP Python SDK's `streamable_http_client`.

Registry endpoint: `http://puck.incus:24100/.well-known/mcp/server.json`

The registry includes model capabilities on each agent entry:

```json
{
  "capabilities": {
    "model": "Qwen3.5-35B-A3B-UD-Q4_K_XL.gguf",
    "vision": false,
    "context_window": 192000,
    "max_output_tokens": 16384
  }
}
```
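
How Daedalus uses these fields is not specified here. As one possible reading, a client could check a request against the advertised limits before dispatching it. The sketch below assumes the entry shape shown above; `fits_model` and its parameters are hypothetical names.

```python
import json

# The capabilities object as shown in the registry example above.
ENTRY = json.loads("""
{
  "capabilities": {
    "model": "Qwen3.5-35B-A3B-UD-Q4_K_XL.gguf",
    "vision": false,
    "context_window": 192000,
    "max_output_tokens": 16384
  }
}
""")


def fits_model(entry: dict, prompt_tokens: int, output_tokens: int,
               needs_vision: bool = False) -> bool:
    """Return True if a request stays within the advertised capabilities.

    Assumption: prompt and output tokens share the context window. The
    registry itself does not state this.
    """
    caps = entry["capabilities"]
    if needs_vision and not caps["vision"]:
        return False
    if output_tokens > caps["max_output_tokens"]:
        return False
    return prompt_tokens + output_tokens <= caps["context_window"]
```

With the values above, a text-only request of 100,000 prompt tokens and 8,000 output tokens fits, while any request that needs vision is rejected.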

## Deployment

Kottos runs two ways:

1. **Locally on caliban**, hand-started for iteration (`kottos` from the repo root). This is the flow documented above in *Quickstart*.
2. **In Ouranos / Virgo / Taurus via Ansible**, as a `systemd`-managed `pallas` process on the puck.incus container. This is the pipeline that feeds the Puck Services dashboard in Grafana.

### Ansible role

Lives in `ouranos/ansible/kottos/`:

| File | Purpose |
|---|---|
| `deploy.yml` | Main playbook — user/group, venv, systemd unit, config templating, registry probe. |
| `stage.yml` | Clones `git.helu.ca/r/kottos` at `{{ kottos_rel }}` and creates the release tarball. |
| `kottos.service.j2` | systemd unit. `SyslogIdentifier=kottos`, `StandardOutput=journal`, `PALLAS_LOG_STDOUT=1` via the env file. |
| `.env.j2` | Runtime environment for `pallas` — logging config, `PALLAS_AGENTS_CONFIG`. |
| `agents.yaml.j2` | Deployment topology with host/ports pulled from inventory. |
| `fastagent.config.yaml.j2` | LLM provider + MCP server URLs, parametric per environment. |
| `fastagent.secrets.yaml.j2` | API keys and auth tokens, rendered from Ansible Vault. |

### Inventory

Host variables live in `inventory/host_vars/puck.incus.yml` under **Kottos Configuration**:

```yaml
kottos_user: kottos
kottos_group: kottos
kottos_directory: /srv/kottos
kottos_host: "puck.incus"
kottos_registry_port: 24100
kottos_harper_port: 24101
kottos_scotty_port: 24102
kottos_research_port: 24150
kottos_tech_research_port: 24151
pallas_log_level: INFO
# Local Qwen served via fast-agent's Generic (OpenAI-compatible) provider.
# The openai_base_url slot is reserved for cloud OpenAI endpoints (e.g.
# Bedrock Mantle Chat Completions).
kottos_default_model: "generic.Qwen3.5-35B-A3B-UD-Q4_K_XL.gguf"
kottos_generic_base_url: "http://nyx.helu.ca:22079/v1"
# ...plus one entry per downstream MCP URL so each environment overrides freely
```

Every host variable is parametric — Virgo's `puck.virgo.yml` (or wherever the Pallas host lives) can override any value without touching the templates.

### Vault

Four vault keys required — all documented in `inventory/group_vars/all/vault.yml.example`:

| Key | Used for |
|---|---|
| `vault_kottos_openai_api_key` | OpenAI-compatible LLM endpoint (nyx Qwen in Ouranos). |
| `vault_kottos_github_pat` | `GITHUB_PERSONAL_ACCESS_TOKEN` for the local GitHub MCP Docker container. |
| `vault_kottos_angelia_bearer` | Bearer token accepted by the Angelia MCP server. |
| `vault_kottos_mnemosyne_jwt` | Long-lived team JWT from Daedalus admin UI — Mnemosyne validates it on every `search_memory` call and scopes results to this team's workspaces. |

### Deploying

The role is wired into `site.yml`, and the two playbooks can also be run on their own:

```bash
cd ansible
ansible-playbook kottos/stage.yml     # clone repo + build tarball (local)
ansible-playbook kottos/deploy.yml    # deploy + template + start
```

Or run the full site (`ansible-playbook site.yml`) — kottos's stage + deploy steps are the last block in the sequence.

### Logs

The service logs to the journal under the identifier `kottos`. To follow the logs on the host:

```bash
sudo journalctl -u kottos -f --output=cat | jq .
```

Alloy's journal source on puck relabels `__journal_syslog_identifier=kottos` to `{service="pallas", project="kottos"}` and ships the entries to Loki. Everything shows up in Grafana's *Puck Services — Logs & Health* dashboard under the **Pallas** row, with per-agent colouring driven by the `component` JSON field (`harper`, `scotty`, `research`, `tech_research`).

For per-agent follow-along:

```logql
{service="pallas", project="kottos", component="harper"} | json
```
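
When Loki is not reachable, roughly the same per-agent view can be had from the journal directly. The sketch below filters log lines by their `component` field. It assumes one JSON object per line, as the `jq` pipeline above implies, and the function name `by_component` is hypothetical.

```python
import json
from typing import Iterable, Iterator


def by_component(lines: Iterable[str], component: str) -> Iterator[dict]:
    """Yield parsed log records whose `component` field matches.

    Lines that are not JSON objects are skipped rather than raising,
    since a journal stream can contain non-JSON output.
    """
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(record, dict) and record.get("component") == component:
            yield record
```

To use it, pipe `journalctl -u kottos -f --output=cat` into a script that iterates over `by_component(sys.stdin, "harper")`.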

To follow the trace stream used for diagnosing opaque MCP transport failures (see Pallas's bearer-forwarding incident history):

```logql
{service="pallas", project="kottos"} |= "pallas.forward.trace" | json
```

See [logging.md](logging.md) for the full label schema + level policy + add-a-new-service guide.

## Downstream MCP Servers

| Server | Host | URL |
|--------|------|-----|
| argos | miranda.incus | `http://miranda.incus:25534/mcp` |
| neo4j_cypher | circe.helu.ca | `http://circe.helu.ca:22034/mcp` |
| caliban | caliban.incus | `http://caliban.incus:22062/mcp` |
| rommie | caliban.incus | `http://caliban.incus:22061/mcp` |
| gitea | miranda.incus | `http://miranda.incus:25535/mcp` |
| grafana | miranda.incus | `http://miranda.incus:25533/mcp` |
| korax | korax.helu.ca | `http://korax.helu.ca:20261/mcp` |
| angelia | ouranos.helu.ca | `https://ouranos.helu.ca/mcp/` |
| github | local (Docker stdio) | `ghcr.io/github/github-mcp-server` |
| context7 | local (stdio) | `npx -y @upstash/context7-mcp` |
| time | local (stdio) | `mcp-server-time` |

## Notes

- **Python 3.13** required (`fast-agent-mcp` pins `>=3.13`)
- **Runtime:** [Pallas](https://git.helu.ca/r/pallas) — `pallas-mcp @ git+ssh://git@git.helu.ca:22022/r/pallas.git`
- **Transport:** StreamableHTTP (`/mcp`) throughout — not SSE
- **LLM:** Local Qwen via fast-agent's Generic (OpenAI-compatible) provider at
  `http://nyx.helu.ca:22079/v1`
- **Logging:** Console output; in production, stdout → journald → Alloy → Loki
- **Port scheme:** registry at 24100, agents 24101–24149, sub-agents 24150–24199
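
The port scheme in the last note can be expressed as a small check, for example to sanity-test a new entry before adding it to `agents.yaml`. This is a sketch only; `port_role` and the constant names are hypothetical and not part of Pallas.

```python
# Port ranges as stated in the port scheme above.
REGISTRY_PORT = 24100
AGENT_PORTS = range(24101, 24150)      # 24101 to 24149 inclusive
SUB_AGENT_PORTS = range(24150, 24200)  # 24150 to 24199 inclusive


def port_role(port: int) -> str:
    """Classify a port as 'registry', 'agent', 'sub-agent', or 'outside'."""
    if port == REGISTRY_PORT:
        return "registry"
    if port in AGENT_PORTS:
        return "agent"
    if port in SUB_AGENT_PORTS:
        return "sub-agent"
    return "outside"
```

Under this scheme Harper (24101) and Scotty (24102) classify as agents, while Research (24150) and Tech Research (24151) classify as sub-agents.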