Robert Helewka b2fc398782 Move llama-cpp to generic fastagent slot

2026-05-12 15:07:00 -04:00

kottos

Engineering agents for Daedalus — powered by Pallas.

Kottos is a pure agent project: Python agent definitions + YAML configuration. The runtime (serving, registry, health checks, multimodal support) lives in Pallas.

Architecture

Daedalus Backend — FastAPI
    │  MCP over StreamableHTTP
    ▼
Pallas MCP Bridge (pallas.server:main)
    │  reads agents.yaml for topology
    │  reads fastagent.config.yaml for LLM + model capabilities
    │
    ├── Registry      → /.well-known/mcp/server.json (agent discovery)
    ├── Harper        → kernos_harper, gitea, argos, neo4j_cypher, grafana,
    │                   rommie, angelia, time, research, tech_research
    ├── Scotty        → kernos_scotty, argos, tech_research, neo4j_cypher, grafana, time
    ├── Research      → argos, neo4j_cypher
    └── Tech Research → context7, github, argos

Project Structure

.
├── agents.yaml              # Deployment topology — agents, ports, host, namespace
├── fastagent.config.yaml    # LLM provider, MCP servers, model capabilities (committed)
├── fastagent.secrets.yaml   # API keys and tokens (gitignored — never commit)
├── fastagent.secrets.yaml.example
├── agents/                  # Agent definitions (FastAgent @fast.agent decorators)
│   ├── harper.py
│   ├── scotty.py
│   ├── research.py
│   └── tech_research.py
├── docs/
│   └── pallas_integration.md
├── pyproject.toml
└── LICENSE

Agents

Agent	Port	MCP URL	Purpose
Harper	24101	`http://puck.incus:24101/mcp`	Scrappy engineer — rapid prototyping, hacking, and creative problem-solving
Scotty	24102	`http://puck.incus:24102/mcp`	Systems administration — infrastructure diagnostics and security hardening
Research	24150	`http://puck.incus:24150/mcp`	Web search + knowledge graph chain
Tech Research	24151	`http://puck.incus:24151/mcp`	Technical investigation — library docs, code examples, API comparisons
Registry	24100	`http://puck.incus:24100/.well-known/mcp/server.json`	Agent discovery

Configuration

`agents.yaml` — Deployment Topology

Single source of truth for agent names, ports, dependencies, host, and namespace. Read by Pallas at startup.

name: kottos
version: "1.0.0"
host: puck.incus
namespace: ca.helu.kottos
registry_port: 24100

agents:
  harper:
    module: agents.harper
    port: 24101
    title: Harper
    description: "Scrappy engineer — rapid prototyping, hacking, and creative problem-solving"
    depends_on: [research, tech_research]
  # ...

To deploy a different agent group, swap agents.yaml; no code changes are needed. Override the config path with the PALLAS_AGENTS_CONFIG environment variable.
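As a rough illustration of how a loader could consume this topology, the sketch below works on an already-parsed dict (for example, the result of loading agents.yaml with a YAML parser). It is not Pallas's actual code: the fallback to `agents.yaml`, the function names, and the empty `depends_on` lists for the two research agents are assumptions; the ports come from the Agents table.

```python
import os
from graphlib import TopologicalSorter


def agents_config_path(env=os.environ):
    # Assumption: fall back to ./agents.yaml when the variable is unset.
    return env.get("PALLAS_AGENTS_CONFIG", "agents.yaml")


# Parsed form of the agents.yaml excerpt. The research entries are filled in
# from the Agents table; their empty depends_on lists are an assumption.
TOPOLOGY = {
    "host": "puck.incus",
    "agents": {
        "harper": {"port": 24101, "depends_on": ["research", "tech_research"]},
        "research": {"port": 24150, "depends_on": []},
        "tech_research": {"port": 24151, "depends_on": []},
    },
}


def mcp_url(topology, name):
    # Each agent is served at http://<host>:<port>/mcp.
    port = topology["agents"][name]["port"]
    return f"http://{topology['host']}:{port}/mcp"


def start_order(topology):
    # Map each agent to the agents it depends on, then emit dependencies first.
    graph = {
        name: set(agent.get("depends_on", []))
        for name, agent in topology["agents"].items()
    }
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    print(agents_config_path())
    for name in start_order(TOPOLOGY):
        print(name, mcp_url(TOPOLOGY, name))
```

Ordering by `depends_on` means sub-agents such as research come up before the agents that call them; whether Pallas enforces such an order at startup is not stated here.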

`fastagent.config.yaml` — LLM + Model Capabilities

Committed to the repo. Contains LLM provider settings and explicit model capability declarations.

In Ansible-managed deployments this file is replaced by the fastagent.config.yaml.j2 template, which renders environment-specific values such as the model and the MCP URLs.

default_model: generic.Qwen3.5-35B-A3B-UD-Q4_K_XL.gguf

model_capabilities:
  vision: false
  context_window: 192000
  max_output_tokens: 16384

The model_capabilities section declares capabilities explicitly rather than inferring them from the model name. The values are exposed in the registry so Daedalus can use them when routing requests.
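Because the capabilities are declared by hand, a small sanity check can catch typos before they reach the registry. The sketch below is illustrative only: the `ModelCapabilities` class and its validation rules (positive limits, output limit no larger than the context window) are assumptions, not part of Pallas or fast-agent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelCapabilities:
    """Hypothetical typed view of the model_capabilities section."""

    vision: bool
    context_window: int
    max_output_tokens: int

    def __post_init__(self):
        # Assumed sanity rules; the real runtime may check more or less.
        if self.context_window <= 0 or self.max_output_tokens <= 0:
            raise ValueError("token limits must be positive")
        if self.max_output_tokens > self.context_window:
            raise ValueError("max_output_tokens cannot exceed context_window")


def from_config(section):
    # `section` is the parsed model_capabilities mapping from the config file.
    return ModelCapabilities(
        vision=bool(section["vision"]),
        context_window=int(section["context_window"]),
        max_output_tokens=int(section["max_output_tokens"]),
    )


caps = from_config(
    {"vision": False, "context_window": 192000, "max_output_tokens": 16384}
)
```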

`fastagent.secrets.yaml` — API Keys and Tokens

Gitignored — never commit. Place in the repo root alongside fastagent.config.yaml.

In Ansible-managed deployments this file is replaced by the fastagent.secrets.yaml.j2 template, which renders secrets from Ansible Vault.

openai:
  api_key: "your-key-here"

mcp:
  servers:
    angelia:
      headers:
        Authorization: "Bearer your-token"
    github:
      env:
        GITHUB_PERSONAL_ACCESS_TOKEN: "your-token"
    # ...
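One way to picture how the secrets file relates to the committed config is as a recursive overlay: secrets fill in keys under the same paths without replacing sibling settings. The sketch below assumes that merge behaviour rather than quoting fast-agent's implementation; the `transport` and `url` keys shown for angelia are an assumed config entry (the URL is taken from the Downstream MCP Servers table).

```python
def deep_merge(base, overlay):
    """Return a new dict with overlay applied on top of base, recursively."""
    merged = dict(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


# Assumed excerpt of fastagent.config.yaml after parsing.
config = {
    "mcp": {
        "servers": {
            "angelia": {"transport": "http", "url": "https://ouranos.helu.ca/mcp/"}
        }
    }
}

# Parsed excerpt of fastagent.secrets.yaml with placeholder values.
secrets = {
    "openai": {"api_key": "your-key-here"},
    "mcp": {
        "servers": {"angelia": {"headers": {"Authorization": "Bearer your-token"}}}
    },
}

merged = deep_merge(config, secrets)
```

After the merge, the angelia entry carries both its URL and its Authorization header, while the committed config dict itself stays free of secrets.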

Quickstart

# 1. Install dependencies (Python 3.13 required)
source ~/env/kottos/bin/activate
pip install -e .

# 2. Configure secrets
cp fastagent.secrets.yaml.example fastagent.secrets.yaml
# Edit: set api_key and service tokens

# 3. Start all agents
kottos

# 4. Verify
curl http://localhost:24101/mcp

# 5. Start a single agent
kottos --agent harper
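For a more targeted check than a bare curl, an MCP StreamableHTTP endpoint can be probed with a JSON-RPC `initialize` POST. The sketch below only builds the request; the client name, the version string, and the chosen protocol version are assumptions, and the server may negotiate a different protocol version in its reply.

```python
import json
import urllib.request


def build_initialize_request(url, protocol_version="2025-03-26"):
    # JSON-RPC 2.0 initialize call, as defined by the MCP lifecycle.
    payload = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            "protocolVersion": protocol_version,
            "capabilities": {},
            # Hypothetical client identity, used only for this smoke test.
            "clientInfo": {"name": "kottos-smoke-test", "version": "0.0.0"},
        },
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # StreamableHTTP clients accept either a JSON body or an SSE stream.
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )


request = build_initialize_request("http://localhost:24101/mcp")

# To send it against a running agent:
# with urllib.request.urlopen(request, timeout=5) as response:
#     print(response.status, response.headers.get("Content-Type"))
```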

Daedalus Integration

Daedalus connects to agents via the MCP Python SDK's streamable_http_client.

Registry endpoint: http://puck.incus:24100/.well-known/mcp/server.json

The registry includes model capabilities on each agent entry:

{
  "capabilities": {
    "model": "Qwen3.5-35B-A3B-UD-Q4_K_XL.gguf",
    "vision": false,
    "context_window": 192000,
    "max_output_tokens": 16384
  }
}
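How Daedalus uses these fields is not described here, but a capability-aware router could apply a check along the following lines. The `fits` function and its policy (reserve the full output budget unless a smaller one is requested) are assumptions for illustration.

```python
import json

# The capabilities block from the registry entry shown above.
ENTRY = json.loads(
    """
    {
      "capabilities": {
        "model": "Qwen3.5-35B-A3B-UD-Q4_K_XL.gguf",
        "vision": false,
        "context_window": 192000,
        "max_output_tokens": 16384
      }
    }
    """
)


def fits(capabilities, prompt_tokens, needs_vision=False, output_tokens=None):
    """Return True if a request fits within the declared capabilities."""
    if needs_vision and not capabilities["vision"]:
        return False
    # Assumed policy: reserve the full output budget unless told otherwise.
    reserve = capabilities["max_output_tokens"] if output_tokens is None else output_tokens
    if reserve > capabilities["max_output_tokens"]:
        return False
    return prompt_tokens + reserve <= capabilities["context_window"]


caps = ENTRY["capabilities"]
small_ok = fits(caps, prompt_tokens=100_000)
too_large = fits(caps, prompt_tokens=180_000)
image_request = fits(caps, prompt_tokens=1_000, needs_vision=True)
```

With the values above, a 100,000-token prompt fits, a 180,000-token prompt does not once the 16,384-token output budget is reserved, and any request that needs vision is rejected.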

Deployment

Kottos runs two ways:

Locally on caliban, hand-started for iteration (kottos from the repo root). This is the flow documented above in Quickstart.
In Ouranos / Virgo / Taurus via Ansible, as a systemd-managed pallas process on the puck.incus container. This is the pipeline that feeds the Puck Services dashboard in Grafana.

Ansible role

Lives in ouranos/ansible/kottos/:

File	Purpose
`deploy.yml`	Main playbook — user/group, venv, systemd unit, config templating, registry probe.
`stage.yml`	Clones `git.helu.ca/r/kottos` at `{{ kottos_rel }}` and creates the release tarball.
`kottos.service.j2`	systemd unit. `SyslogIdentifier=kottos`, `StandardOutput=journal`, `PALLAS_LOG_STDOUT=1` via the env file.
`.env.j2`	Runtime environment for `pallas` — logging config, `PALLAS_AGENTS_CONFIG`.
`agents.yaml.j2`	Deployment topology with host/ports pulled from inventory.
`fastagent.config.yaml.j2`	LLM provider + MCP server URLs, parametric per environment.
`fastagent.secrets.yaml.j2`	API keys and auth tokens, rendered from Ansible Vault.

Inventory

Host variables live in inventory/host_vars/puck.incus.yml under Kottos Configuration:

kottos_user: kottos
kottos_group: kottos
kottos_directory: /srv/kottos
kottos_host: "puck.incus"
kottos_registry_port: 24100
kottos_harper_port: 24101
kottos_scotty_port: 24102
kottos_research_port: 24150
kottos_tech_research_port: 24151
pallas_log_level: INFO
# Local Qwen served via fast-agent's Generic (OpenAI-compatible) provider.
# The openai_base_url slot is reserved for cloud OpenAI endpoints (e.g.
# Bedrock Mantle Chat Completions).
kottos_default_model: "generic.Qwen3.5-35B-A3B-UD-Q4_K_XL.gguf"
kottos_generic_base_url: "http://nyx.helu.ca:22079/v1"
# ...plus one entry per downstream MCP URL so each environment overrides freely

Every host variable is parametric — Virgo's puck.virgo.yml (or wherever the Pallas host lives) can override any value without touching the templates.

Vault

Four vault keys required — all documented in inventory/group_vars/all/vault.yml.example:

Key	Used for
`vault_kottos_openai_api_key`	OpenAI-compatible LLM endpoint (nyx Qwen in Ouranos).
`vault_kottos_github_pat`	`GITHUB_PERSONAL_ACCESS_TOKEN` for the local GitHub MCP Docker container.
`vault_kottos_angelia_bearer`	Bearer token accepted by the Angelia MCP server.
`vault_kottos_mnemosyne_jwt`	Long-lived team JWT from Daedalus admin UI — Mnemosyne validates it on every `search_memory` call and scopes results to this team's workspaces.
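A pre-flight check for these keys can save a failed deploy. The sketch below assumes the vault variables have already been decrypted and loaded into a dict; the `missing_vault_keys` helper is hypothetical and not part of the Ansible role.

```python
REQUIRED_VAULT_KEYS = (
    "vault_kottos_openai_api_key",
    "vault_kottos_github_pat",
    "vault_kottos_angelia_bearer",
    "vault_kottos_mnemosyne_jwt",
)


def missing_vault_keys(vault_vars):
    # A key counts as missing when it is absent or set to an empty value.
    return [key for key in REQUIRED_VAULT_KEYS if not vault_vars.get(key)]


# Placeholder values only; real secrets stay in the vault.
example = {
    "vault_kottos_openai_api_key": "your-key-here",
    "vault_kottos_github_pat": "your-token",
    "vault_kottos_angelia_bearer": "",
}
missing = missing_vault_keys(example)
```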

Deploying

Wired into site.yml:

cd ansible
ansible-playbook kottos/stage.yml     # clone repo + build tarball (local)
ansible-playbook kottos/deploy.yml    # deploy + template + start

Or run the full site (ansible-playbook site.yml) — kottos's stage + deploy steps are the last block in the sequence.

Logs

The systemd unit and the journal identifier are both kottos, so on the host:

sudo journalctl -u kottos -f --output=cat | jq .

Alloy on puck's journal source relabels __journal_syslog_identifier=kottos to {service="pallas", project="kottos"}, then into Loki. Everything shows up in Grafana's Puck Services — Logs & Health dashboard under the Pallas row, with per-agent colouring driven by the component JSON field (harper, scotty, research, tech_research).

For per-agent follow-along:

{service="pallas", project="kottos", component="harper"} | json

For the trace stream used to debug opaque MCP transport failures (see Pallas's bearer-forwarding incident history):

{service="pallas", project="kottos"} |= "pallas.forward.trace" | json
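Queries of this shape are easy to generate when scripting against Loki. The helper below is a hypothetical convenience, not part of Pallas or Grafana; it reproduces the two queries above from a label dict and an optional line filter.

```python
def _quote(value):
    # Escape backslashes and double quotes for a LogQL string literal.
    return '"' + str(value).replace("\\", "\\\\").replace('"', '\\"') + '"'


def logql(labels, contains=None, parse_json=True):
    """Build a LogQL query: stream selector, optional |= filter, optional json stage."""
    selector = ", ".join(f"{name}={_quote(value)}" for name, value in labels.items())
    query = "{" + selector + "}"
    if contains is not None:
        query += " |= " + _quote(contains)
    if parse_json:
        query += " | json"
    return query


per_agent = logql({"service": "pallas", "project": "kottos", "component": "harper"})
trace = logql({"service": "pallas", "project": "kottos"}, contains="pallas.forward.trace")
```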

See logging.md for the full label schema + level policy + add-a-new-service guide.

Downstream MCP Servers

Server	Host	URL
argos	miranda.incus	`http://miranda.incus:25534/mcp`
neo4j_cypher	circe.helu.ca	`http://circe.helu.ca:22034/mcp`
caliban	caliban.incus	`http://caliban.incus:22062/mcp`
rommie	caliban.incus	`http://caliban.incus:22061/mcp`
gitea	miranda.incus	`http://miranda.incus:25535/mcp`
grafana	miranda.incus	`http://miranda.incus:25533/mcp`
korax	korax.helu.ca	`http://korax.helu.ca:20261/mcp`
angelia	ouranos.helu.ca	`https://ouranos.helu.ca/mcp/`
github	local (Docker stdio)	`ghcr.io/github/github-mcp-server`
context7	local (stdio)	`npx -y @upstash/context7-mcp`
time	local (stdio)	`mcp-server-time`

Notes

Python 3.13 required (fast-agent-mcp pins >=3.13)
Runtime: Pallas — pallas-mcp @ git+ssh://git@git.helu.ca:22022/r/pallas.git
Transport: StreamableHTTP (/mcp) throughout — not SSE
LLM: Local Qwen via fast-agent's Generic (OpenAI-compatible) provider at http://nyx.helu.ca:22079/v1
Logging: Console output; in production, stdout → journald → Alloy → Loki
Port scheme: registry at 24100, agents 24101–24149, sub-agents 24150–24199
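The port scheme can be expressed as a small lookup, which is handy when validating a new agents.yaml. The `port_role` function name and its error behaviour are assumptions for illustration; the ranges are the ones listed in the notes.

```python
REGISTRY_PORT = 24100
AGENT_PORTS = range(24101, 24150)      # 24101 to 24149 inclusive
SUB_AGENT_PORTS = range(24150, 24200)  # 24150 to 24199 inclusive


def port_role(port):
    """Classify a port according to the documented Kottos port scheme."""
    if port == REGISTRY_PORT:
        return "registry"
    if port in AGENT_PORTS:
        return "agent"
    if port in SUB_AGENT_PORTS:
        return "sub-agent"
    raise ValueError(f"port {port} is outside the Kottos port scheme")


roles = {
    name: port_role(port)
    for name, port in {
        "registry": 24100,
        "harper": 24101,
        "scotty": 24102,
        "research": 24150,
        "tech_research": 24151,
    }.items()
}
```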
