Pallas MCP Interface Specification
This document defines the contract between Daedalus (MCP client / web UI) and Pallas (FastAgent MCP servers). It specifies the interfaces Pallas must expose: a registry endpoint for agent discovery, a get_health MCP tool on each agent for health monitoring, and progress notifications for real-time feedback during agent execution.
Architecture Overview
Pallas Instance (puck.incus)
┌────────────────────────────────────────┐
│ │
│ Registry (port 23030) │
Daedalus ──GET──▶│ /.well-known/mcp/server.json │
│ │
│ Agent: Research (port 23031) │
Daedalus ──MCP──▶│ MultimodalAgentMCPServer │──MCP──▶ Argos, Neo4j
│ └─ get_health tool │
│ │
│ Agent: Engineering (port 23032) │
Daedalus ──MCP──▶│ MultimodalAgentMCPServer │──MCP──▶ Kernos, Gitea
│ └─ get_health tool │
│ │
│ Agent: Orchestrator (port 23033) │
Daedalus ──MCP──▶│ MultimodalAgentMCPServer │──MCP──▶ Research, Infra
│ └─ get_health tool │
└────────────────────────────────────────┘
A single Pallas instance hosts multiple FastAgent agents, each on its own port. The registry runs on a dedicated port (e.g. 23030) and provides a catalogue of all agents. Each agent exposes a get_health MCP tool that FastAgent intercepts programmatically — no LLM invocation.
Daedalus registers the registry URL once in global settings. Everything else is automatic.
1. Registry Endpoint
GET {registry_url}/.well-known/mcp/server.json
The registry is a plain HTTP endpoint (not MCP) served on a dedicated port. It returns a dynamic list of all agents currently provided by the Pallas instance, following the MCP Server Schema.
Request
GET http://puck.incus:23030/.well-known/mcp/server.json
Accept: application/json
No authentication. No query parameters.
Response
{
"servers": [
{
"server": {
"$schema": "https://static.modelcontextprotocol.io/schemas/2025-12-11/server.schema.json",
"name": "ca.helu.ouranos/pallas-research",
"title": "Research Agent",
"description": "Web search via Argos and knowledge graph via Neo4j",
"version": "1.0.0",
"icons": [
{ "src": "https://daedalus.ouranos.helu.ca/icons/research.svg", "sizes": "any" }
],
"remotes": [
{ "type": "streamable-http", "url": "http://puck.incus:23031/mcp" }
],
"capabilities": {
"model": "qwen3-8b-q5",
"vision": false,
"context_window": 200000,
"max_output_tokens": 32000
}
},
"_meta": {
"io.modelcontextprotocol.registry/official": {
"status": "active",
"updatedAt": "2026-03-12T10:00:00Z",
"isLatest": true
}
}
},
{
"server": {
"name": "ca.helu.ouranos/pallas-infra",
"title": "Engineering Agent",
"description": "Shell access via Kernos and repository management via Gitea",
"version": "1.0.0",
"remotes": [
{ "type": "streamable-http", "url": "http://puck.incus:23032/mcp" }
],
"capabilities": {
"model": "qwen3-8b-q5",
"vision": false,
"context_window": 200000,
"max_output_tokens": 32000
}
},
"_meta": {
"io.modelcontextprotocol.registry/official": {
"status": "active",
"updatedAt": "2026-03-12T10:00:00Z",
"isLatest": true
}
}
}
]
}
Schema
| Field | Type | Required | Description |
|---|---|---|---|
| servers | array | yes | List of server entries |
| servers[].server.name | string | yes | Reverse-domain identifier (e.g. ca.helu.ouranos/pallas-research). Daedalus derives server_id from the segment after the last /. |
| servers[].server.title | string | no | Human-readable display name. Falls back to name if absent. |
| servers[].server.description | string | no | One-line description shown in Daedalus UI. |
| servers[].server.version | string | no | Semver version string. |
| servers[].server.icons | array | no | Array of { src, sizes }. Daedalus uses the first entry. |
| servers[].server.remotes | array | yes | Connection endpoints. Daedalus looks for type: "streamable-http" and uses its url. |
| servers[].server.capabilities | object | no | Model capabilities. Contains model (string), vision (bool), context_window (int), max_output_tokens (int). Published when model_capabilities is configured in fastagent.config.yaml. |
| servers[]._meta | object | no | Registry metadata. Informational only; Daedalus does not act on it. |
Behaviour
- The response must reflect the current set of registered agents. If an agent is added or removed from Pallas, subsequent requests must reflect the change.
- Content-Type must be application/json.
- Every entry in remotes with type: "streamable-http" is treated as an MCP endpoint Daedalus can connect to.
- The icons[].src URL may be absolute or relative. Daedalus stores it as-is.
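The field rules above can be sketched as a small parser. This is an illustration only, not Daedalus's actual code: AgentEntry and parse_registry are hypothetical names, and skipping entries that have no streamable-http remote is an assumption of this sketch.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class AgentEntry:
    """Hypothetical container for the fields read from one registry entry."""
    server_id: str
    title: str
    url: str
    description: Optional[str] = None
    icon_src: Optional[str] = None


def parse_registry(body: str) -> list:
    """Parse a server.json response body into a list of AgentEntry objects."""
    entries = []
    for item in json.loads(body)["servers"]:
        server = item["server"]
        name = server["name"]
        # Use the first remote of type "streamable-http".
        url = next(
            (r["url"] for r in server.get("remotes", [])
             if r.get("type") == "streamable-http"),
            None,
        )
        if url is None:
            continue  # assumption: entries without a usable remote are skipped
        icons = server.get("icons") or []
        entries.append(
            AgentEntry(
                server_id=name.rsplit("/", 1)[-1],   # segment after the last "/"
                title=server.get("title") or name,   # title falls back to name
                url=url,
                description=server.get("description"),
                icon_src=icons[0]["src"] if icons else None,  # first icon, stored as-is
            )
        )
    return entries
```

Called on the example response above, this would yield two entries with server_id values pallas-research and pallas-infra.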
2. Health Tool
MCP tool: get_health
Each agent's MCP server must expose a tool named get_health. FastAgent intercepts this tool programmatically — it does not route through the LLM. This keeps health checks fast (~ms) and free of inference cost.
Tool Definition
The tool should appear in session.list_tools() with:
{
"name": "get_health",
"description": "Returns the health status of this agent and its downstream dependencies.",
"inputSchema": {
"type": "object",
"properties": {},
"additionalProperties": false
}
}
No input arguments.
Invocation
Daedalus calls this via the standard MCP SDK:
result = await session.call_tool("get_health")
Response
The tool returns a single text content block containing a JSON object:
{
"status": "ok",
"timestamp": "2026-03-12T15:42:00Z"
}
Status Values
| Status | Meaning | Daedalus Behaviour |
|---|---|---|
| ok | Agent healthy, all downstream MCP servers reachable | Green badge. Normal operation. |
| degraded | Agent responds but with issues (slow responses, partial downstream outage) | Yellow badge + warning banner. Chat allowed. |
| error | Agent cannot process requests | Red badge. Chat disabled; user cannot send messages. |
Fields
| Field | Type | Required | Description |
|---|---|---|---|
| status | "ok" \| "degraded" \| "error" | yes | Current health state |
| timestamp | string (ISO 8601) | no | When the health check was performed |
| message | string | no | Human-readable explanation. Required when status is degraded or error. Shown in Daedalus UI tooltips and warning banners. |
Examples
Healthy:
{
"status": "ok",
"timestamp": "2026-03-12T15:42:00Z"
}
Degraded:
{
"status": "degraded",
"timestamp": "2026-03-12T15:42:00Z",
"message": "Avg response 12s — Neo4j connection slow"
}
Error:
{
"status": "error",
"timestamp": "2026-03-12T15:42:00Z",
"message": "Argos MCP server unreachable"
}
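A client-side reading of these payloads might look like the sketch below. Health and parse_health are hypothetical names. Treating a malformed payload as error, and substituting a placeholder when a required message is missing, are assumptions of this sketch rather than documented Daedalus behaviour.

```python
import json
from dataclasses import dataclass
from typing import Optional

VALID_STATUSES = ("ok", "degraded", "error")


@dataclass
class Health:
    status: str
    message: Optional[str] = None
    timestamp: Optional[str] = None


def parse_health(text: str) -> Health:
    """Parse the JSON text block returned by the get_health tool."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return Health("error", "get_health returned invalid JSON")
    if not isinstance(data, dict) or data.get("status") not in VALID_STATUSES:
        return Health("error", "get_health returned an unknown status")
    status = data["status"]
    message = data.get("message")
    if status != "ok" and not message:
        # message is required for degraded/error; use a placeholder if absent.
        message = "no message provided"
    return Health(status, message, data.get("timestamp"))
```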
Implementation Guidance
The get_health tool checks connectivity to all downstream MCP servers the agent depends on using the MCP initialize handshake — the only MCP method that works without a pre-established session. This avoids burning LLM tokens on health checks.
For each downstream MCP server:
- POST an MCP initialize request to the server URL (with auth headers and Accept: application/json, text/event-stream)
- On success, tear down the session by sending DELETE with the returned Mcp-Session-Id header to avoid leaking server-side state
- On failure (HTTP error, timeout, connection refused), record the server as unreachable
Result mapping:
- All downstream servers reachable and active LLM provider healthy → ok
- Some downstream servers unreachable, or active LLM provider failed preflight → degraded with explanation
- Agent failed to start or cannot process requests → error with explanation
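The mapping above can be expressed as a pure function over probe results. The function name, argument names, and message wording below are hypothetical; only the three-way mapping comes from this section.

```python
from datetime import datetime, timezone


def summarize_health(downstream: dict, llm_provider_ok: bool,
                     agent_running: bool = True) -> dict:
    """Map probe results to a get_health payload.

    downstream maps a server name to True (reachable) or False (unreachable).
    """
    if not agent_running:
        status, message = "error", "Agent cannot process requests"
    else:
        problems = [
            f"{name} MCP server unreachable"
            for name, reachable in sorted(downstream.items())
            if not reachable
        ]
        if not llm_provider_ok:
            problems.append("active LLM provider failed preflight")
        status = "degraded" if problems else "ok"
        message = "; ".join(problems) if problems else None

    result = {
        "status": status,
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
    if message:
        result["message"] = message  # always present for degraded and error
    return result
```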
The tool must not invoke the LLM. It should complete in under 1 second when all downstream servers respond promptly; each downstream probe has a 3-second timeout.
3. Daedalus Consumption
Registration Flow
- User enters registry URL in Daedalus global settings (e.g. http://puck.incus:23030)
- Daedalus GETs {url}/.well-known/mcp/server.json
- Daedalus stores the PallasInstance with its registry URL
- Discovered agents are shown with metadata (title, description, icon)
Workspace Attachment
- User selects a registered Pallas instance in workspace settings
- Daedalus re-fetches the registry and creates AgentConnection rows for every agent in the instance
- All agents from the instance become available in the workspace
- Detaching removes all agent connections for that instance from the workspace
Health Polling
- Daedalus polls get_health on connected agents at a configurable interval (DAEDALUS_MCP_HEALTH_INTERVAL, default 60 seconds)
- Health is cached in memory and exposed via the agent status API
- Prometheus gauge daedalus_agent_health{instance, agent} tracks health (1.0=ok, 0.5=degraded, 0.0=error)
- If health check fails entirely (connection error, timeout), status is treated as error
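One polling step, including the rule that a failed check counts as error, might look like the sketch below. poll_once and its check argument are hypothetical, and the 10-second default timeout is an assumption. In Daedalus the check would wrap the get_health call, and the returned value would be written to the Prometheus gauge.

```python
import asyncio
from typing import Awaitable, Callable, Tuple

GAUGE_VALUES = {"ok": 1.0, "degraded": 0.5, "error": 0.0}


async def poll_once(check: Callable[[], Awaitable[str]],
                    timeout: float = 10.0) -> Tuple[str, float]:
    """Run one health check and return (status, gauge value)."""
    try:
        status = await asyncio.wait_for(check(), timeout=timeout)
    except Exception:
        # Connection error, timeout, or any other failure of the check itself.
        status = "error"
    if status not in GAUGE_VALUES:
        status = "error"
    return status, GAUGE_VALUES[status]
```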
Chat Blocking
- If the target agent's cached health is error, the chat endpoint returns HTTP 503 and the UI disables the message input
- If degraded, a warning bar appears but chat is allowed
- Users can create a workspace and attach an instance with unhealthy agents; health only blocks sending messages
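The blocking rules reduce to a small decision function. ChatGate and chat_gate are hypothetical names, and returning 200 for the non-blocked cases is an assumption; this section specifies only the 503 for error.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChatGate:
    http_status: int          # status the chat endpoint would return
    input_enabled: bool       # whether the UI message input is usable
    warning: Optional[str]    # text for the warning bar, if any


def chat_gate(status: str, message: Optional[str] = None) -> ChatGate:
    """Decide how the chat endpoint and UI react to a cached health status."""
    if status == "error":
        return ChatGate(503, False, message)
    if status == "degraded":
        return ChatGate(200, True, message or "Agent is degraded")
    return ChatGate(200, True, None)
```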
4. Agent Progress Notifications
Agent tool calls can take tens of seconds to minutes when the agent enters an agentic loop — calling sub-agents, searching the web, querying knowledge graphs, etc. During this time, the MCP tool call has not yet returned. Without progress feedback, the user sees a dead spinner.
MCP provides a built-in mechanism for this: notifications/progress. Pallas already emits these notifications during agent execution. Daedalus must opt in by sending a progressToken and rendering the notifications it receives.
How It Works
Daedalus Pallas (harper, port 24101)
│ │
│── tools/call ─────────────────────────▶│ { message: "...", _meta: { progressToken: "abc123" } }
│ │
│ │── LLM generates text + tool calls ──▶
│ │
│◀── notifications/progress ─────────────│ { progressToken: "abc123", progress: 0, message: "research/research__research: started" }
│ │
│◀── notifications/progress ─────────────│ { progressToken: "abc123", progress: 1, message: "harper step 1 (tool)" }
│ │
│◀── notifications/progress ─────────────│ { progressToken: "abc123", progress: 2, message: "harper step 2 (llm)" }
│ │
│◀── notifications/progress ─────────────│ { progressToken: "abc123", progress: 1, total: 1, message: "research/research__research: completed" }
│ │
│◀── notifications/progress ─────────────│ { progressToken: "abc123", progress: 1, total: 1, message: "tech_research/tech_research__tech_research: completed" }
│ │
│◀── tools/call result ─────────────────│ { content: [{ type: "text", text: "..." }] }
│ │
All messages flow over the existing SSE connection established by MCP Streamable HTTP. No additional transport is needed.
Daedalus Requirements
Sending the Progress Token
When calling any agent tool (except get_health), Daedalus must include a progressToken in the request's _meta:
result = await session.call_tool(
"harper",
arguments={"message": user_input},
request_params={"_meta": {"progressToken": str(uuid4())}},
)
Without the progressToken, Pallas skips all progress notifications and Daedalus receives nothing until the final result.
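On the wire, the token travels inside the params of the tools/call JSON-RPC request. The helper below is a hypothetical illustration of that message shape; in practice the MCP SDK builds the request.

```python
import json
from uuid import uuid4


def build_tool_call(tool: str, arguments: dict, request_id: int,
                    progress_token=None) -> dict:
    """Build a tools/call JSON-RPC request, optionally opting in to progress."""
    params = {"name": tool, "arguments": arguments}
    if progress_token is not None:
        # Without this field the server sends no notifications/progress.
        params["_meta"] = {"progressToken": progress_token}
    return {"jsonrpc": "2.0", "id": request_id, "method": "tools/call", "params": params}


request = build_tool_call("harper", {"message": "hello"}, request_id=1,
                          progress_token=str(uuid4()))
print(json.dumps(request, indent=2))
```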
Handling Progress Notifications
Daedalus receives notifications/progress messages on the SSE stream during the tool call. Each notification contains:
| Field | Type | Description |
|---|---|---|
| progressToken | string/int | Matches the token sent in the request |
| progress | float | Monotonically increasing step counter |
| total | float \| null | null = indeterminate (loop in progress), 1.0 = task finished |
| message | string \| null | Human-readable status text |
Message Format
Progress messages follow predictable patterns:
| Pattern | Meaning | Example |
|---|---|---|
| {server}/{tool}: started | Tool invocation began | research/research__research: started |
| {server}/{tool}: completed | Tool invocation finished | tech_research/tech_research__tech_research: completed |
| {server}/{tool}: failed | Tool invocation failed | argos/search_web: failed |
| {agent} step N (llm) | Agent loop: LLM turn | harper step 2 (llm) |
| {agent} step N (tool) | Agent loop: tool execution | harper step 3 (tool) |
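Because the patterns are regular, a client could classify messages with two regular expressions. The sketch below is one possible reading of the table; the function name and the shape of the returned dictionaries are hypothetical.

```python
import re
from typing import Optional

# "{server}/{tool}: started|completed|failed"
TOOL_EVENT = re.compile(
    r"^(?P<server>[^/\s]+)/(?P<tool>\S+): (?P<state>started|completed|failed)$"
)
# "{agent} step N (llm|tool)"
LOOP_STEP = re.compile(r"^(?P<agent>\S+) step (?P<step>\d+) \((?P<kind>llm|tool)\)$")


def parse_progress_message(message: str) -> Optional[dict]:
    """Classify a progress message; return None if it matches no known pattern."""
    match = TOOL_EVENT.match(message)
    if match:
        return {"type": "tool", **match.groupdict()}
    match = LOOP_STEP.match(message)
    if match:
        return {
            "type": "loop",
            "agent": match["agent"],
            "step": int(match["step"]),
            "kind": match["kind"],
        }
    return None
```

Unrecognised messages return None, so a client can still display the raw text as the status line.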
Rendering Guidance
- Display the message as a status line beneath the "thinking" indicator
- Replace the previous status on each new notification (not appended)
- When total is null, show an indeterminate progress indicator (spinner)
- When total equals progress (typically 1.0/1.0), the specific tool/sub-task has completed, but the overall tool call may still be in progress
- Clear the progress indicator when the final tools/call result arrives
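These rules amount to a small state reducer. ProgressView, on_progress, and on_result are hypothetical names, and keeping the previous status line when a notification carries no message is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProgressView:
    status_line: Optional[str] = None  # text shown beneath the "thinking" indicator
    indeterminate: bool = False        # True: show a spinner
    visible: bool = False              # False: no progress indicator at all


def on_progress(view: ProgressView, progress: float,
                total: Optional[float] = None,
                message: Optional[str] = None) -> ProgressView:
    """Apply one notifications/progress message to the view."""
    return ProgressView(
        # Replace, never append; keep the old line if this message is null.
        status_line=message if message is not None else view.status_line,
        indeterminate=total is None,
        # A finished sub-task (total == progress) does not end the overall call.
        visible=True,
    )


def on_result(view: ProgressView) -> ProgressView:
    """Clear the indicator when the final tools/call result arrives."""
    return ProgressView()
```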
Pallas Guarantees
- Progress notifications are emitted automatically by FastAgent's MCPToolProgressManager; no additional server-side configuration is needed
- Notifications are only sent when the client provides a progressToken
- At minimum, on_tool_start (progress 0) and on_tool_complete (progress 1/1) are emitted for every downstream tool invocation
- Loop step notifications are emitted when emit_loop_progress=True (the default for all Pallas agents)
- Progress notifications are best-effort; if one fails to send, the agent loop continues unaffected
Limitations
- LLM intermediate text is not streamed as progress. When the agent says "Let me look into that..." before calling tools, this text is generated server-side during the LLM streaming step but is not forwarded as a progress notification. The text is included in the final tool result. A future enhancement may stream LLM text deltas as progress messages with a distinguishable prefix.
- Parallel tool calls emit interleaved progress messages. Each message includes a tool-specific prefix ({server}/{tool}), so Daedalus can track them independently if desired, or simply display the most recent message.
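If Daedalus chose to track parallel tools independently, it could key them by the {server}/{tool} prefix. The class below is a hypothetical sketch. It assumes each prefix has at most one invocation in flight at a time, which this document does not guarantee.

```python
class InFlightTools:
    """Track which {server}/{tool} prefixes have started but not yet finished."""

    def __init__(self):
        self._active = set()
        self.latest = None  # most recent message, for a single status line

    def update(self, message: str) -> None:
        self.latest = message
        prefix, separator, state = message.rpartition(": ")
        if not separator or "/" not in prefix:
            return  # loop-step message such as "harper step 2 (llm)"
        if state == "started":
            self._active.add(prefix)
        elif state in ("completed", "failed"):
            self._active.discard(prefix)

    def active(self) -> list:
        """Return the prefixes of tools still running, sorted for stable display."""
        return sorted(self._active)
```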
5. Why MCP (Not REST)
Pallas wraps each FastAgent instance in a MultimodalAgentMCPServer and serves it over StreamableHTTP. The MCP transport gives Daedalus:
- Tool discovery: session.list_tools() returns the full capability manifest
- Streaming: MCP Streamable HTTP handles streaming natively
- Health checks: get_health is just another tool call, no separate API surface
- Protocol alignment: MCP is the abstraction boundary both above and below Pallas. No MCP→REST→MCP translation layer.
The alternative (REST between Daedalus and Pallas) would require building a custom API layer in Pallas that reimplements what the MCP server already provides, with no simplification on the Daedalus side.