r/pallas


Robert Helewka 679a809f66 Fix bearer forwarding across anyio TaskGroup boundary

The Mnemosyne Authorization: Bearer token was being dropped on outbound MCP
calls because fast-agent runs downstream transports inside a long-lived
anyio TaskGroup whose context is snapshotted at manager startup —
request_bearer_token.get() inside _prepare_headers_and_auth therefore
always resolved to None even when the request handler had just set it.

Fix:
* pallas/_fastagent_patch.py
    - add _pending_bearers registry keyed by id(server_config) with a
      threading.Lock; publish_bearer / revoke_bearer helpers.
    - patched _prepare_headers_and_auth reads the registry first, falls
      back to the ContextVar for non-persistent probe paths.
    - emit INFO log on install() so the journal shows the patch ran;
      verbose flow logs at DEBUG on pallas.forward.

* pallas/multimodal_server.py
    - send_message resolves the agent's opted-in downstreams, publishes
      the inbound bearer for each, and revokes them all in the finally.
    - bearer/header diagnostics go to pallas.auth (DEBUG) instead of
      /tmp/pallas-bearer.log which is invisible under systemd PrivateTmp.

* pallas/log.py
    - honour PALLAS_LOG_LEVEL env var (default INFO) so operators can
      flip the forward/auth diagnostics on without a code change.

* docs/pallas.md, docs/mnemosyne_integration.md
    - document the registry-based forwarding and the task-group
      ContextVar constraint that forced it.

2026-05-05 12:09:51 -04:00

6.6 KiB


Mnemosyne Integration — Pallas Reference

This document describes how Pallas-hosted agents connect to Mnemosyne for workspace-scoped knowledge search. The full integration specification lives in daedalus/docs/mnemosyne_integration.md.

Overview

Mnemosyne is a downstream MCP server like any other from Pallas's perspective. Agents declare "mnemosyne" in their servers list; the server URL and bearer-forward opt-in live in the project's fastagent.config.yaml.

What makes Mnemosyne different from other downstream servers:

* Workspace-scoped search. Daedalus mints a per-turn HS256 JWT carrying the workspace UUID and sends it as Authorization: Bearer on the send_message call to Pallas. Pallas captures it in request_bearer_token, and the fast-agent patch (pallas._fastagent_patch) forwards it on outgoing calls to Mnemosyne when forward_inbound_auth: true is set. Mnemosyne validates the JWT and scopes all Cypher searches to that workspace.
* The LLM never sees workspace_id. The scoping is claim-driven: Mnemosyne reads the JWT claims, overwrites any workspace_id the model may have produced in tool arguments, and enforces containment server-side. Pallas is transparent transport.
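The claim-driven scoping described above can be illustrated with a minimal sketch. Mnemosyne's actual code is not shown in this document, so the helper name apply_workspace_claim and the argument name workspace_id in the tool payload are assumptions made for illustration; only the ws claim comes from the token format documented here.

```python
from typing import Any, Mapping


def apply_workspace_claim(
    tool_args: Mapping[str, Any], claims: Mapping[str, Any]
) -> dict[str, Any]:
    """Return tool arguments with workspace_id forced from the JWT claims.

    Hypothetical helper: it only illustrates the "claims win" rule.
    """
    workspace = claims.get("ws")
    if not workspace:
        # Without a workspace claim the call cannot be scoped, so refuse it.
        raise PermissionError("token carries no 'ws' claim")
    scoped = dict(tool_args)            # copy, never mutate the caller's dict
    scoped["workspace_id"] = workspace  # overwrite whatever the model supplied
    return scoped


# The model tries to reach workspace B, but the token says workspace A.
model_args = {"query": "deployment runbook", "workspace_id": "workspace-B"}
claims = {"iss": "daedalus", "sub": "chat", "ws": "workspace-A"}
print(apply_workspace_claim(model_args, claims))
```

Because the value is taken from the verified claims and never from the arguments, a model-supplied workspace_id has no effect on which workspace is searched.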

Configuration

fastagent.config.yaml

Add the mnemosyne stanza to mcp.servers. The only Mnemosyne-specific flag is forward_inbound_auth: true:

mcp:
  servers:
    mnemosyne:
      transport: http
      url: "https://mnemosyne.ouranos.helu.ca/mcp/"
      forward_inbound_auth: true

This stanza is already deployed in iolaus/fastagent.config.yaml, kottos/fastagent.config.yaml, and mentor/fastagent.config.yaml, and in their Ansible templates under virgo/ansible/.
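As a quick way to reason about which servers opt in, the sketch below walks a parsed configuration and lists the servers whose forward_inbound_auth flag is set. It works on a plain dict, as a YAML loader would return it, so it stays stdlib-only. The helper name servers_forwarding_bearer is hypothetical, and the argos entry (including its transport value) is included only as a contrasting example.

```python
from typing import Any, Mapping

# Parsed form of the stanza above, plus one server that does not opt in.
# The argos "transport" value is an assumption made for this example.
CONFIG: dict[str, Any] = {
    "mcp": {
        "servers": {
            "mnemosyne": {
                "transport": "http",
                "url": "https://mnemosyne.ouranos.helu.ca/mcp/",
                "forward_inbound_auth": True,
            },
            "argos": {
                "transport": "http",
                "url": "http://miranda.incus:25534/mcp",
            },
        }
    }
}


def servers_forwarding_bearer(config: Mapping[str, Any]) -> list[str]:
    """Names of servers that explicitly set forward_inbound_auth: true."""
    servers = config.get("mcp", {}).get("servers", {})
    # A missing flag counts as "do not forward", so the opt-in stays explicit.
    return sorted(
        name
        for name, settings in servers.items()
        if settings.get("forward_inbound_auth") is True
    )


print(servers_forwarding_bearer(CONFIG))
```

Treating an absent flag as false mirrors the per-server opt-in described in this document: a server only receives the bearer when its stanza says so.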

Agent Definitions

Add "mnemosyne" to the servers list of any agent that should be able to search workspace content. Sub-agents (e.g. research, tech_research) that are orchestrated by primary agents do not need it unless they independently issue search calls.

iolaus — all primary agents have Mnemosyne access: shawn, david, hypatia, watson, nate, garth, bourdain, cousteau, marcus, cristiano, mikael.

kottos — harper, scotty.

mentor — alan, ann, jeffrey, jarvis, aws_sa.

Example (from iolaus/agents/shawn.py):

@fast.agent(
    name="shawn",
    instruction=_INSTRUCTION,
    servers=["argos", "mnemosyne", "neo4j_cypher", "kernos", "time"],
    default=True,
)
async def _shawn():
    pass

How Bearer Forwarding Works

1. Daedalus mints a per-turn JWT:

{
  "iss": "daedalus",
  "sub": "chat",
  "ws": "<workspace_uuid>",
  "libs": [],
  "iat": <now>,
  "exp": <now + 600>,
  "jti": "<uuid4>"
}

2. Daedalus calls Pallas's send_message tool with Authorization: Bearer <token> in the HTTP request headers.
3. Pallas's MultimodalAgentMCPServer captures the token by reading the request's Authorization header directly through fastmcp.server.dependencies.get_http_request(). It cannot use get_access_token(), which returns None because Pallas runs without the FastMCP auth middleware. The token is pushed into the request_bearer_token ContextVar (for LLM-provider passthrough) and also published to a per-request bearer registry, keyed by the identity (id()) of each opted-in downstream's MCPServerSettings object.
4. The fast-agent patch in pallas/_fastagent_patch.py (installed at import time in pallas/__init__.py) wraps _prepare_headers_and_auth. When a server config has forward_inbound_auth: true, the patch reads the bearer from the per-request registry first, falls back to the ContextVar, and injects Authorization: Bearer <token> into the outgoing HTTP headers for that MCP call. The registry is required because fast-agent's MCPConnectionManager runs the transport in a long-lived anyio TaskGroup whose context is snapshotted at manager startup, so it never sees values the request handler sets later in its own contextvars.Context.
5. The request handler's finally clause revokes every bearer it published, so per-request tokens never outlive the call and no stale credentials can be reused.
6. Mnemosyne receives the same token, validates the HMAC signature against its MCPSigningKey table, and scopes all search Cypher queries to the ws value from the claims.
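The minting and validation steps above can be sketched with the standard library alone. This is an illustration, not the Daedalus or Mnemosyne implementation: the function names mint_token and verify_token are hypothetical, placing kid in the JWT header is an assumption, and so is using the hex secret as raw bytes via bytes.fromhex. The claim names and the 600-second lifetime follow the payload shown above.

```python
import base64
import hashlib
import hmac
import json
import time
import uuid


def _b64url(data: bytes) -> str:
    """Base64url without padding, as JWTs use."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def mint_token(secret: bytes, workspace_uuid: str, kid: str, ttl: int = 600) -> str:
    """Build an HS256 JWT carrying the claims listed in this document."""
    now = int(time.time())
    header = {"alg": "HS256", "typ": "JWT", "kid": kid}  # kid placement assumed
    claims = {
        "iss": "daedalus",
        "sub": "chat",
        "ws": workspace_uuid,
        "libs": [],
        "iat": now,
        "exp": now + ttl,
        "jti": str(uuid.uuid4()),
    }
    parts = [
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    ]
    signing_input = ".".join(parts)
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + _b64url(signature)


def verify_token(secret: bytes, token: str) -> dict:
    """Check signature and expiry, then return the claims."""
    signing_input, _, signature_part = token.rpartition(".")
    expected = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    # compare_digest avoids leaking how many leading bytes matched.
    if not hmac.compare_digest(expected, _b64url_decode(signature_part)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(signing_input.split(".")[1]))
    if claims["exp"] <= time.time():
        raise ValueError("token expired")
    return claims


secret = bytes.fromhex("00" * 32)  # placeholder, never use a fixed key
token = mint_token(secret, "11111111-2222-3333-4444-555555555555", "daedalus-1")
print(verify_token(secret, token)["ws"])
```

A production verifier would normally also look up the key by kid and check iss. Both are omitted here to keep the sketch short.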

The forward_inbound_auth flag is per-server — other servers in the same agent (argos, neo4j_cypher, time, etc.) never receive the bearer.

Available Mnemosyne MCP Tools

These tools become available to agents with "mnemosyne" in their servers list:

Tool	Purpose
`search_knowledge`	Hybrid vector + full-text + graph search with re-ranking, scoped to the current workspace
`search_by_category`	Search within a specific library type (technical, fiction, business, etc.)
`list_libraries`	List accessible libraries
`list_collections`	List collections within a library
`get_item`	Retrieve item metadata, chunk previews, and concept links
`get_concepts`	Traverse the concept graph

All tools are transparently scoped to the workspace by JWT claims. An agent in workspace A cannot retrieve content from workspace B regardless of what arguments it produces.

Downstream MCP Servers

Server	URL	`forward_inbound_auth`
mnemosyne	`https://mnemosyne.ouranos.helu.ca/mcp/`	`true`
argos	`http://miranda.incus:25534/mcp`	—
neo4j_cypher	`http://circe.helu.ca:22034/mcp`	—
kernos	`http://caliban.incus:22021/mcp`	—
gitea	`http://miranda.incus:25535/mcp`	—
rommie	`http://caliban.incus:22031/mcp`	—
grafana	`http://miranda.incus:25533/mcp`	—

Provisioning (one-time, server-side)

1. On the Mnemosyne host, generate the signing key:

docker compose exec app python manage.py seed_signing_key --kid daedalus-1
# Copy the printed hex secret

2. Set on Daedalus (.env or Ansible vault):

DAEDALUS_MNEMOSYNE_MCP_URL=https://mnemosyne.ouranos.helu.ca/mcp/
DAEDALUS_MNEMOSYNE_SIGNING_KID=daedalus-1
DAEDALUS_MNEMOSYNE_SIGNING_SECRET=<hex from step 1>
DAEDALUS_MNEMOSYNE_TOKEN_TTL_SECONDS=600

3. Restart Daedalus and the three agent deployments (iolaus, kottos, mentor).

The OCI vault secret is virgo-mnemosyne-signing-secret; Ansible injects it via mnemosyne_signing_secret.

Degraded Mode

If Daedalus's MNEMOSYNE_SIGNING_SECRET or MNEMOSYNE_MCP_URL is blank, mint_chat_token returns None. Pallas calls then proceed without a bearer: Mnemosyne search is unavailable, but all other agent tools continue to work normally, and no error is surfaced to the user.
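The degraded path can be pictured as follows. This is a sketch under stated assumptions: mint_chat_token_sketch only mimics the documented "return None when unconfigured" behaviour, with a placeholder string where the real code would sign a token, and outbound_headers is a hypothetical helper showing that a missing token means no Authorization header rather than an error.

```python
from typing import Optional


def mint_chat_token_sketch(
    signing_secret: str, mcp_url: str, workspace_uuid: str
) -> Optional[str]:
    """Return None when the integration is not configured (degraded mode)."""
    if not signing_secret.strip() or not mcp_url.strip():
        return None
    # Placeholder only: real signing is out of scope for this sketch.
    return f"<signed-jwt-for-{workspace_uuid}>"


def outbound_headers(token: Optional[str]) -> dict[str, str]:
    """Headers for the send_message call; the bearer is simply omitted if absent."""
    headers = {"Content-Type": "application/json"}
    if token is not None:
        headers["Authorization"] = f"Bearer {token}"
    return headers


# Blank secret: the call still goes out, just without a bearer.
token = mint_chat_token_sketch("", "https://mnemosyne.ouranos.helu.ca/mcp/", "ws-1")
print(outbound_headers(token))
```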
