# Pallas MCP Interface Specification

This document defines the contract between **Daedalus** (MCP client / web UI) and **Pallas** (FastAgent MCP servers). It specifies the interfaces Pallas must expose: a **registry endpoint** for agent discovery, a **`get_health` MCP tool** on each agent for health monitoring, and **progress notifications** for real-time feedback during agent execution.

---

## Architecture Overview

```
                   Pallas Instance (puck.incus)
                 ┌────────────────────────────────────────┐
                 │                                        │
                 │  Registry (port 23030)                 │
Daedalus ──GET──▶│  /.well-known/mcp/server.json          │
                 │                                        │
                 │  Agent: Research (port 23031)          │
Daedalus ──MCP──▶│  MultimodalAgentMCPServer              │──MCP──▶ Argos, Neo4j
                 │    └─ get_health tool                  │
                 │                                        │
                 │  Agent: Engineering (port 23032)       │
Daedalus ──MCP──▶│  MultimodalAgentMCPServer              │──MCP──▶ Kernos, Gitea
                 │    └─ get_health tool                  │
                 │                                        │
                 │  Agent: Orchestrator (port 23033)      │
Daedalus ──MCP──▶│  MultimodalAgentMCPServer              │──MCP──▶ Research, Infra
                 │    └─ get_health tool                  │
                 └────────────────────────────────────────┘
```

A single Pallas instance hosts multiple FastAgent agents, each on its own port. The registry runs on a dedicated port (e.g. 23030) and provides a catalogue of all agents. Each agent exposes a `get_health` MCP tool that FastAgent intercepts programmatically, with no LLM invocation.

Daedalus registers the registry URL once in global settings. Agents are then discovered from the registry rather than configured one by one.

---

## 1. Registry Endpoint

### `GET {registry_url}/.well-known/mcp/server.json`

The registry is a plain HTTP endpoint (not MCP) served on a dedicated port. It returns a dynamic list of all agents currently provided by the Pallas instance, following the [MCP Server Schema](https://static.modelcontextprotocol.io/schemas/2025-12-11/server.schema.json).

#### Request

```
GET http://puck.incus:23030/.well-known/mcp/server.json
Accept: application/json
```

No authentication. No query parameters.
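
The request above can be sketched with the Python standard library. This is a minimal illustration, not Daedalus's actual client code: the helper names (`registry_document_url`, `build_registry_request`, `fetch_registry`) and the 5-second timeout are assumptions made for the example.

```python
import json
import urllib.request

# Well-known path taken from this specification.
WELL_KNOWN_PATH = "/.well-known/mcp/server.json"


def registry_document_url(registry_url: str) -> str:
    """Join the registry base URL and the well-known path without doubling slashes."""
    return registry_url.rstrip("/") + WELL_KNOWN_PATH


def build_registry_request(registry_url: str) -> urllib.request.Request:
    """Build the unauthenticated, parameterless GET described above."""
    return urllib.request.Request(
        registry_document_url(registry_url),
        headers={"Accept": "application/json"},
        method="GET",
    )


def fetch_registry(registry_url: str, timeout: float = 5.0) -> dict:
    """Fetch and decode the registry document (the timeout is an illustrative choice)."""
    request = build_registry_request(registry_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_registry("http://puck.incus:23030")` would return the decoded registry document as a `dict`. Building the request is kept separate from sending it so the URL and headers can be checked without network access.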
#### Response

```json
{
  "servers": [
    {
      "server": {
        "$schema": "https://static.modelcontextprotocol.io/schemas/2025-12-11/server.schema.json",
        "name": "ca.helu.ouranos/pallas-research",
        "title": "Research Agent",
        "description": "Web search via Argos and knowledge graph via Neo4j",
        "version": "1.0.0",
        "icons": [
          {
            "src": "https://daedalus.ouranos.helu.ca/icons/research.svg",
            "sizes": "any"
          }
        ],
        "remotes": [
          {
            "type": "streamable-http",
            "url": "http://puck.incus:23031/mcp"
          }
        ],
        "capabilities": {
          "model": "qwen3-8b-q5",
          "vision": false,
          "context_window": 200000,
          "max_output_tokens": 32000
        }
      },
      "_meta": {
        "io.modelcontextprotocol.registry/official": {
          "status": "active",
          "updatedAt": "2026-03-12T10:00:00Z",
          "isLatest": true
        }
      }
    },
    {
      "server": {
        "name": "ca.helu.ouranos/pallas-infra",
        "title": "Engineering Agent",
        "description": "Shell access via Kernos and repository management via Gitea",
        "version": "1.0.0",
        "remotes": [
          {
            "type": "streamable-http",
            "url": "http://puck.incus:23032/mcp"
          }
        ],
        "capabilities": {
          "model": "qwen3-8b-q5",
          "vision": false,
          "context_window": 200000,
          "max_output_tokens": 32000
        }
      },
      "_meta": {
        "io.modelcontextprotocol.registry/official": {
          "status": "active",
          "updatedAt": "2026-03-12T10:00:00Z",
          "isLatest": true
        }
      }
    }
  ]
}
```

#### Schema

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `servers` | array | yes | List of server entries |
| `servers[].server.name` | string | yes | Reverse-domain identifier (e.g. `ca.helu.ouranos/pallas-research`). Daedalus derives `server_id` from the segment after the last `/`. |
| `servers[].server.title` | string | no | Human-readable display name. Falls back to `name` if absent. |
| `servers[].server.description` | string | no | One-line description shown in Daedalus UI. |
| `servers[].server.version` | string | no | Semver version string. |
| `servers[].server.icons` | array | no | Array of `{ src, sizes }`. Daedalus uses the first entry. |
| `servers[].server.remotes` | array | yes | Connection endpoints. Daedalus looks for `type: "streamable-http"` and uses its `url`. |
| `servers[].server.capabilities` | object | no | Model capabilities. Contains `model` (string), `vision` (bool), `context_window` (int), `max_output_tokens` (int). Published when `model_capabilities` is configured in `fastagent.config.yaml`. |
| `servers[]._meta` | object | no | Registry metadata. Informational only; Daedalus does not act on it. |

#### Behaviour

- The response **must** reflect the current set of registered agents. If an agent is added to or removed from Pallas, subsequent requests must reflect the change.
- Content-Type **must** be `application/json`.
- Every entry in `remotes` with `type: "streamable-http"` is treated as an MCP endpoint Daedalus can connect to.
- The `icons[].src` URL may be absolute or relative. Daedalus stores it as-is.

---

## 2. Health Tool

### MCP tool: `get_health`

Each agent's MCP server **must** expose a tool named `get_health`. FastAgent intercepts this tool programmatically; it does not route through the LLM. This keeps health checks fast (~ms) and free of inference cost.

#### Tool Definition

The tool should appear in `session.list_tools()` with:

```json
{
  "name": "get_health",
  "description": "Returns the health status of this agent and its downstream dependencies.",
  "inputSchema": {
    "type": "object",
    "properties": {},
    "additionalProperties": false
  }
}
```

No input arguments.

#### Invocation

Daedalus calls this via the standard MCP SDK:

```python
result = await session.call_tool("get_health")
```

#### Response

The tool returns a single `text` content block containing a JSON object:

```json
{
  "status": "ok",
  "timestamp": "2026-03-12T15:42:00Z"
}
```

##### Status Values

| Status | Meaning | Daedalus Behaviour |
|--------|---------|-------------------|
| `ok` | Agent healthy, all downstream MCP servers reachable | Green badge. Normal operation. |
| `degraded` | Agent responds but with issues (slow responses, partial downstream outage) | Yellow badge + warning banner. Chat allowed. |
| `error` | Agent cannot process requests | Red badge. Chat disabled: the user cannot send messages. |

##### Fields

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `status` | `"ok" \| "degraded" \| "error"` | yes | Current health state |
| `timestamp` | string (ISO 8601) | no | When the health check was performed |
| `message` | string | conditional | Human-readable explanation. Required when `status` is `degraded` or `error`, optional otherwise. Shown in Daedalus UI tooltips and warning banners. |

##### Examples

**Healthy:**

```json
{
  "status": "ok",
  "timestamp": "2026-03-12T15:42:00Z"
}
```

**Degraded:**

```json
{
  "status": "degraded",
  "timestamp": "2026-03-12T15:42:00Z",
  "message": "Avg response 12s — Neo4j connection slow"
}
```

**Error:**

```json
{
  "status": "error",
  "timestamp": "2026-03-12T15:42:00Z",
  "message": "Argos MCP server unreachable"
}
```

#### Implementation Guidance

The `get_health` tool checks connectivity to all downstream MCP servers the agent depends on using the MCP `initialize` handshake, the only MCP method that works without a pre-established session. This avoids burning LLM tokens on health checks.

For each downstream MCP server:

1. `POST` an MCP `initialize` request to the server URL (with auth headers and `Accept: application/json, text/event-stream`)
2. On success, tear down the session by sending `DELETE` with the returned `Mcp-Session-Id` header to avoid leaking server-side state
3. On failure (HTTP error, timeout, connection refused), record the server as unreachable

Result mapping:

- All downstream servers reachable and active LLM provider healthy → `ok`
- Some downstream servers unreachable, or active LLM provider failed preflight → `degraded` with explanation
- Agent failed to start or cannot process requests → `error` with explanation

The tool **must not** invoke the LLM.
It should normally complete in under 1 second; each downstream probe has a 3-second timeout, which bounds the slow path when a dependency is unresponsive.

---

## 3. Daedalus Consumption

### Registration Flow

1. User enters registry URL in Daedalus global settings (e.g. `http://puck.incus:23030`)
2. Daedalus `GET`s `{url}/.well-known/mcp/server.json`
3. Daedalus stores the `PallasInstance` with its registry URL
4. Discovered agents are shown with metadata (title, description, icon)

### Workspace Attachment

1. User selects a registered Pallas instance in workspace settings
2. Daedalus re-fetches the registry and creates `AgentConnection` rows for every agent in the instance
3. All agents from the instance become available in the workspace
4. Detaching removes all agent connections for that instance from the workspace

### Health Polling

- Daedalus polls `get_health` on connected agents at a configurable interval (`DAEDALUS_MCP_HEALTH_INTERVAL`, default 60 seconds)
- Health is cached in memory and exposed via the agent status API
- Prometheus gauge `daedalus_agent_health{instance, agent}` tracks health (1.0=ok, 0.5=degraded, 0.0=error)
- If a health check fails entirely (connection error, timeout), status is treated as `error`

### Chat Blocking

- If the target agent's cached health is `error`, the chat endpoint returns HTTP 503 and the UI disables the message input
- If `degraded`, a warning bar appears but chat is allowed
- Users **can** create a workspace and attach an instance with unhealthy agents; health only blocks sending messages

---

## 4. Agent Progress Notifications

Agent tool calls can take tens of seconds to minutes when the agent enters an agentic loop (calling sub-agents, searching the web, querying knowledge graphs, etc.). During this time, the MCP tool call has not yet returned. Without progress feedback, the user sees a dead spinner.

MCP provides a built-in mechanism for this: `notifications/progress`. Pallas already emits these notifications during agent execution.
Daedalus must opt in by sending a `progressToken` and rendering the notifications it receives.

### How It Works

```
Daedalus                                 Pallas (harper, port 24101)
│                                        │
│── tools/call ─────────────────────────▶│  { message: "...", _meta: { progressToken: "abc123" } }
│                                        │
│                                        │── LLM generates text + tool calls ──▶
│                                        │
│◀── notifications/progress ─────────────│  { progressToken: "abc123", progress: 0, message: "research/research__research: started" }
│                                        │
│◀── notifications/progress ─────────────│  { progressToken: "abc123", progress: 1, message: "harper step 1 (tool)" }
│                                        │
│◀── notifications/progress ─────────────│  { progressToken: "abc123", progress: 2, message: "harper step 2 (llm)" }
│                                        │
│◀── notifications/progress ─────────────│  { progressToken: "abc123", progress: 1, total: 1, message: "research/research__research: completed" }
│                                        │
│◀── notifications/progress ─────────────│  { progressToken: "abc123", progress: 1, total: 1, message: "tech_research/tech_research__tech_research: completed" }
│                                        │
│◀── tools/call result ─────────────────│  { content: [{ type: "text", text: "..." }] }
│                                        │
```

All messages flow over the existing SSE connection established by MCP Streamable HTTP. No additional transport is needed.

### Daedalus Requirements

#### Sending the Progress Token

When calling any agent tool (except `get_health`), Daedalus **must** include a `progressToken` in the request's `_meta`:

```python
from uuid import uuid4

result = await session.call_tool(
    "harper",
    arguments={"message": user_input},
    request_params={"_meta": {"progressToken": str(uuid4())}},
)
```

Without the `progressToken`, Pallas skips all progress notifications and Daedalus receives nothing until the final result.

#### Handling Progress Notifications

Daedalus receives `notifications/progress` messages on the SSE stream during the tool call.
Each notification contains:

| Field | Type | Description |
|-------|------|-------------|
| `progressToken` | string/int | Matches the token sent in the request |
| `progress` | float | Step counter. It increases within one sub-task (a tool invocation or the agent loop), but all sub-tasks share the token, so values can repeat or drop across the stream, as in the sequence above. |
| `total` | float \| null | `null` = indeterminate (loop in progress), `1.0` = task finished |
| `message` | string \| null | Human-readable status text |

#### Message Format

Progress messages follow predictable patterns:

| Pattern | Meaning | Example |
|---------|---------|---------|
| `{server}/{tool}: started` | Tool invocation began | `research/research__research: started` |
| `{server}/{tool}: completed` | Tool invocation finished | `tech_research/tech_research__tech_research: completed` |
| `{server}/{tool}: failed` | Tool invocation failed | `argos/search_web: failed` |
| `{agent} step N (llm)` | Agent loop: LLM turn | `harper step 2 (llm)` |
| `{agent} step N (tool)` | Agent loop: tool execution | `harper step 3 (tool)` |

#### Rendering Guidance

- Display the `message` as a status line beneath the "thinking" indicator
- Replace the previous status on each new notification rather than appending to it
- When `total` is `null`, show an indeterminate progress indicator (spinner)
- When `total` equals `progress` (typically `1.0/1.0`), the specific tool/sub-task has completed, but the overall tool call may still be in progress
- Clear the progress indicator when the final `tools/call` result arrives

### Pallas Guarantees

- Progress notifications are emitted automatically by FastAgent's `MCPToolProgressManager`; no additional server-side configuration is needed
- Notifications are only sent when the client provides a `progressToken`
- At minimum, `on_tool_start` (progress 0) and `on_tool_complete` (progress 1/1) are emitted for every downstream tool invocation
- Loop step notifications are emitted when `emit_loop_progress=True` (the default for all Pallas agents)
- Progress notifications are best-effort: if one fails to send, the agent loop continues
unaffected

### Limitations

- **LLM intermediate text is not streamed as progress.** When the agent says "Let me look into that..." before calling tools, this text is generated server-side during the LLM streaming step but is not forwarded as a progress notification. The text is included in the final tool result. A future enhancement may stream LLM text deltas as progress messages with a distinguishable prefix.
- **Parallel tool calls** emit interleaved progress messages. Each message includes a tool-specific prefix (`{server}/{tool}`), so Daedalus can track them independently if desired, or simply display the most recent message.

---

## 5. Why MCP (Not REST)

Pallas wraps each FastAgent instance in a `MultimodalAgentMCPServer` and serves it over StreamableHTTP. The MCP transport gives Daedalus:

- **Tool discovery**: `session.list_tools()` returns the full capability manifest
- **Streaming**: MCP Streamable HTTP handles streaming natively
- **Health checks**: `get_health` is just another tool call, no separate API surface
- **Protocol alignment**: MCP is the abstraction boundary both above and below Pallas. No MCP→REST→MCP translation layer.

The alternative (REST between Daedalus and Pallas) would require building a custom API layer in Pallas that reimplements what the MCP server already provides, with no simplification on the Daedalus side.
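
To illustrate the "just another tool call" point: because `get_health` returns plain JSON in a text block, the consuming side reduces to a small pure function. The sketch below maps that text to the gauge values and chat gating described in section 3. The function name `interpret_health` and its return shape are hypothetical, and treating a malformed payload as `error` is an assumption that extends the "failed check is `error`" rule; none of this is taken from the Daedalus codebase.

```python
import json

# Gauge values from the Health Polling section: 1.0=ok, 0.5=degraded, 0.0=error.
GAUGE_BY_STATUS = {"ok": 1.0, "degraded": 0.5, "error": 0.0}


def interpret_health(text: str) -> dict:
    """Map the text block returned by get_health to status, gauge value, and chat gating."""
    try:
        payload = json.loads(text)
        status = payload["status"]
        if status not in GAUGE_BY_STATUS:
            raise ValueError(f"unknown status: {status!r}")
        message = payload.get("message")
    except (ValueError, KeyError, TypeError, AttributeError) as exc:
        # Assumption: a payload that cannot be interpreted is handled like a failed check.
        status, message = "error", f"invalid health payload: {exc}"
    return {
        "status": status,
        "message": message,
        "gauge": GAUGE_BY_STATUS[status],
        # Only `error` blocks sending messages; `degraded` shows a warning but allows chat.
        "chat_allowed": status != "error",
    }
```

For example, `interpret_health('{"status": "degraded", "message": "Neo4j connection slow"}')` yields a gauge of `0.5` with chat still allowed, while an `error` payload or unparseable text yields `0.0` and blocks chat.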