The Mnemosyne Authorization: Bearer token was being dropped on outbound MCP
calls because fast-agent runs downstream transports inside a long-lived
anyio TaskGroup whose context is snapshotted at manager startup —
request_bearer_token.get() inside _prepare_headers_and_auth therefore
always resolved to None even when the request handler had just set it.
Fix:

* pallas/_fastagent_patch.py
  - add _pending_bearers registry keyed by id(server_config) with a
    threading.Lock; publish_bearer / revoke_bearer helpers.
  - patched _prepare_headers_and_auth reads the registry first, falls
    back to the ContextVar for non-persistent probe paths.
  - emit INFO log on install() so the journal shows the patch ran;
    verbose flow logs at DEBUG on pallas.forward.
* pallas/multimodal_server.py
  - send_message resolves the agent's opted-in downstreams, publishes
    the inbound bearer for each, and revokes them all in the finally.
  - bearer/header diagnostics go to pallas.auth (DEBUG) instead of
    /tmp/pallas-bearer.log which is invisible under systemd PrivateTmp.
* pallas/log.py
  - honour PALLAS_LOG_LEVEL env var (default INFO) so operators can
    flip the forward/auth diagnostics on without a code change.
* docs/pallas.md, docs/mnemosyne_integration.md
  - document the registry-based forwarding and the task-group
    ContextVar constraint that forced it.
Mnemosyne Integration — Pallas Reference
This document describes how Pallas-hosted agents connect to Mnemosyne for workspace-scoped knowledge search. The full integration specification lives in daedalus/docs/mnemosyne_integration.md.
Overview
Mnemosyne is a downstream MCP server like any other from Pallas's perspective. Agents declare "mnemosyne" in their servers list; the server URL and bearer-forward opt-in live in the project's fastagent.config.yaml.
What makes Mnemosyne different from other downstream servers:
- Workspace-scoped search. Daedalus mints a per-turn HS256 JWT carrying the workspace UUID and sends it as Authorization: Bearer on the send_message call to Pallas. Pallas captures it in request_bearer_token, and the fast-agent patch (pallas._fastagent_patch) forwards it on outgoing calls to Mnemosyne when forward_inbound_auth: true is set. Mnemosyne validates the JWT and scopes all Cypher searches to that workspace.
- The LLM never sees workspace_id. The scoping is claim-driven: Mnemosyne reads the JWT claims, overwrites any workspace_id the model may have produced in tool arguments, and enforces containment server-side. Pallas is transparent transport.
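The per-turn token described above can be sketched with the standard library alone. This is a minimal illustration of HS256 minting and verification, not Daedalus's or Mnemosyne's actual code: the function names `mint_token` and `verify_token`, the placement of the key id in the JWT header, and the use of raw bytes for the secret are all assumptions made for the example.

```python
import base64
import hashlib
import hmac
import json
import time
import uuid


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Decode base64url, restoring the padding that JWTs strip."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def mint_token(secret: bytes, workspace_id: str, kid: str, ttl: int = 600) -> str:
    """Mint an HS256 JWT whose `ws` claim carries the workspace UUID."""
    now = int(time.time())
    # Assumption: the key id travels in the JWT header.
    header = {"alg": "HS256", "typ": "JWT", "kid": kid}
    claims = {
        "iss": "daedalus", "sub": "chat", "ws": workspace_id, "libs": [],
        "iat": now, "exp": now + ttl, "jti": str(uuid.uuid4()),
    }
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, claims)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"


def verify_token(secret: bytes, token: str) -> dict:
    """Check the signature and expiry, then return the claims."""
    signing_input, _, signature = token.rpartition(".")
    expected = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    if not hmac.compare_digest(expected, _b64url_decode(signature)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(signing_input.split(".")[1]))
    if claims["exp"] <= time.time():
        raise ValueError("token expired")
    return claims
```

A token minted with one secret fails verification under any other, which is what lets the receiving side trust the `ws` claim without trusting the caller's arguments.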
Configuration
fastagent.config.yaml
Add the mnemosyne stanza to mcp.servers. The only Mnemosyne-specific flag is forward_inbound_auth: true:
mcp:
  servers:
    mnemosyne:
      transport: http
      url: "https://mnemosyne.ouranos.helu.ca/mcp/"
      forward_inbound_auth: true
This is already deployed in iolaus/fastagent.config.yaml, kottos/fastagent.config.yaml, and mentor/fastagent.config.yaml and their Ansible templates in virgo/ansible/.
Agent Definitions
Add "mnemosyne" to the servers list of any agent that should be able to search workspace content. Sub-agents (e.g. research, tech_research) that are orchestrated by primary agents do not need it unless they independently issue search calls.
iolaus — all primary agents have Mnemosyne access: shawn, david, hypatia, watson, nate, garth, bourdain, cousteau, marcus, cristiano, mikael.
kottos — harper, scotty.
mentor — alan, ann, jeffrey, jarvis, aws_sa.
Example (from iolaus/agents/shawn.py):
@fast.agent(
    name="shawn",
    instruction=_INSTRUCTION,
    servers=["argos", "mnemosyne", "neo4j_cypher", "kernos", "time"],
    default=True,
)
async def _shawn():
    pass
How Bearer Forwarding Works
1. Daedalus mints a per-turn JWT:

       { "iss": "daedalus", "sub": "chat", "ws": "<workspace_uuid>", "libs": [], "iat": <now>, "exp": <now + 600>, "jti": "<uuid4>" }

2. Daedalus calls Pallas's send_message tool with Authorization: Bearer <token> in the HTTP request headers.
3. Pallas's MultimodalAgentMCPServer captures the token by reading the request's Authorization header directly through fastmcp.server.dependencies.get_http_request(); get_access_token() returns None because Pallas runs without the FastMCP auth middleware. The token is pushed into the request_bearer_token ContextVar (for LLM-provider passthrough) and also registered in a per-request bearer registry keyed by each opted-in downstream's MCPServerSettings object.
4. The fast-agent patch in pallas/_fastagent_patch.py (installed at import time in pallas/__init__.py) wraps _prepare_headers_and_auth. When a server config has forward_inbound_auth: true, the patch reads the bearer out of the per-request registry (with the ContextVar as a fallback) and injects Authorization: Bearer <token> into the outgoing HTTP headers for that MCP call. The registry is required because fast-agent's MCPConnectionManager runs the transport in its own anyio TaskGroup, which does not inherit the request handler's contextvars.Context.
5. The request handler's finally clause revokes every bearer it published, so per-request tokens never outlive the call and no stale credentials can be reused.
6. Mnemosyne receives the same token, validates the HMAC signature against its MCPSigningKey table, and scopes all search Cypher queries to ws from the claims.
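The ContextVar constraint described above can be reproduced in a few lines. The sketch below uses plain asyncio rather than anyio or fast-agent, so it only illustrates the general rule (a task keeps the context it was created with); `transport_worker` and `handle_request` are hypothetical names standing in for the long-lived transport task and the request handler.

```python
import asyncio
import contextvars

request_bearer_token = contextvars.ContextVar("request_bearer_token", default=None)


async def transport_worker(jobs: asyncio.Queue, seen: list) -> None:
    """Long-lived task: its context was copied when the task was created."""
    while True:
        reply = await jobs.get()
        # Reads the worker's own snapshot, not the caller's context.
        seen.append(request_bearer_token.get())
        reply.set_result(None)


async def handle_request(jobs: asyncio.Queue, token: str) -> None:
    """Request handler: sets the ContextVar, then hands work to the worker."""
    request_bearer_token.set(token)
    reply = asyncio.get_running_loop().create_future()
    await jobs.put(reply)
    await reply


async def main() -> list:
    jobs: asyncio.Queue = asyncio.Queue()
    seen: list = []
    # Started before any request exists, like a manager starting at boot.
    worker = asyncio.create_task(transport_worker(jobs, seen))
    # Each request runs in its own task with its own context copy.
    await asyncio.create_task(handle_request(jobs, "token-for-this-request"))
    worker.cancel()
    return seen


observed = asyncio.run(main())
print(observed)  # prints [None]
```

The worker never sees the token because the handler's `set()` only changes the handler's copy of the context, which is why an explicit hand-off outside `contextvars` is needed.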
The forward_inbound_auth flag is per-server — other servers in the same agent (argos, neo4j_cypher, time, etc.) never receive the bearer.
Available Mnemosyne MCP Tools
These tools become available to agents with "mnemosyne" in their servers list:
| Tool | Purpose |
|---|---|
| search_knowledge | Hybrid vector + full-text + graph search with re-ranking, scoped to the current workspace |
| search_by_category | Search within a specific library type (technical, fiction, business, etc.) |
| list_libraries | List accessible libraries |
| list_collections | List collections within a library |
| get_item | Retrieve item metadata, chunk previews, and concept links |
| get_concepts | Traverse the concept graph |
All tools are transparently scoped to the workspace by JWT claims. An agent in workspace A cannot retrieve content from workspace B regardless of what arguments it produces.
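The claim-driven scoping can be illustrated as a small pure function. This is a sketch of the idea only, not Mnemosyne's implementation: `scope_arguments` is a hypothetical helper, and the choice to reject a token with no `ws` claim is an assumption.

```python
def scope_arguments(tool_args: dict, claims: dict) -> dict:
    """Return tool arguments with workspace_id forced to the JWT's `ws` claim."""
    workspace = claims.get("ws")
    if not workspace:
        # Assumption: a token without a workspace claim is refused outright.
        raise PermissionError("token carries no workspace claim")
    scoped = dict(tool_args)  # leave the caller's dict untouched
    scoped["workspace_id"] = workspace  # overwrite whatever the model supplied
    return scoped


# The model asks for workspace B, but the token was minted for workspace A.
scoped = scope_arguments(
    {"query": "release notes", "workspace_id": "workspace-B"},
    {"ws": "workspace-A"},
)
```

Here `scoped["workspace_id"]` ends up as `"workspace-A"`, so the value the model produced never reaches the query layer.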
Downstream MCP Servers
| Server | URL | forward_inbound_auth |
|---|---|---|
| mnemosyne | https://mnemosyne.ouranos.helu.ca/mcp/ | true |
| argos | http://miranda.incus:25534/mcp | — |
| neo4j_cypher | http://circe.helu.ca:22034/mcp | — |
| kernos | http://caliban.incus:22021/mcp | — |
| gitea | http://miranda.incus:25535/mcp | — |
| rommie | http://caliban.incus:22031/mcp | — |
| grafana | http://miranda.incus:25533/mcp | — |
Provisioning (one-time, server-side)
1. On the Mnemosyne host, generate the signing key:

       docker compose exec app python manage.py seed_signing_key --kid daedalus-1
       # Copy the printed hex secret

2. Set on Daedalus (.env or Ansible vault):

       DAEDALUS_MNEMOSYNE_MCP_URL=https://mnemosyne.ouranos.helu.ca/mcp/
       DAEDALUS_MNEMOSYNE_SIGNING_KID=daedalus-1
       DAEDALUS_MNEMOSYNE_SIGNING_SECRET=<hex from step 1>
       DAEDALUS_MNEMOSYNE_TOKEN_TTL_SECONDS=600

3. Restart Daedalus and the three agent deployments (iolaus, kottos, mentor).
The OCI vault secret is virgo-mnemosyne-signing-secret; Ansible injects it via mnemosyne_signing_secret.
Degraded Mode
If Daedalus's MNEMOSYNE_SIGNING_SECRET is blank or MNEMOSYNE_MCP_URL is empty, mint_chat_token returns None. Pallas calls proceed without a bearer; Mnemosyne search is unavailable but all other agent tools continue normally. No error is surfaced to the user.
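The degraded path might look roughly like the following. Only the name `mint_chat_token` and its None-on-missing-config behaviour come from the text above; the signature, the injected `mint` callable, and the `outbound_headers` helper are assumptions made so the example is self-contained.

```python
from typing import Callable, Optional


def mint_chat_token(
    signing_secret: str,
    mcp_url: str,
    workspace_id: str,
    mint: Callable[[str, str], str],
) -> Optional[str]:
    """Return a token, or None when the integration is not configured."""
    if not signing_secret or not mcp_url:
        return None  # degraded mode: no bearer will be sent
    return mint(signing_secret, workspace_id)


def outbound_headers(token: Optional[str]) -> dict:
    """Attach the bearer only when one was minted."""
    return {"Authorization": f"Bearer {token}"} if token else {}


def fake_mint(secret: str, workspace_id: str) -> str:
    """Placeholder minter used only for this illustration."""
    return f"token-for-{workspace_id}"


# Blank secret: the call proceeds, just without an Authorization header.
degraded = outbound_headers(mint_chat_token("", "https://example.invalid/mcp/", "ws-1", fake_mint))
# Fully configured: the header is present.
normal = outbound_headers(mint_chat_token("s3cret", "https://example.invalid/mcp/", "ws-1", fake_mint))
```

Returning None instead of raising keeps the caller's code path identical in both modes, which matches the behaviour where other agent tools continue to work.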