pallas/docs/mnemosyne_integration.md
Robert Helewka 679a809f66 Fix bearer forwarding across anyio TaskGroup boundary
The Mnemosyne Authorization: Bearer token was being dropped on outbound MCP
calls because fast-agent runs downstream transports inside a long-lived
anyio TaskGroup whose context is snapshotted at manager startup —
request_bearer_token.get() inside _prepare_headers_and_auth therefore
always resolved to None even when the request handler had just set it.

Fix:
* pallas/_fastagent_patch.py
    - add _pending_bearers registry keyed by id(server_config) with a
      threading.Lock; publish_bearer / revoke_bearer helpers.
    - patched _prepare_headers_and_auth reads the registry first, falls
      back to the ContextVar for non-persistent probe paths.
    - emit INFO log on install() so the journal shows the patch ran;
      verbose flow logs at DEBUG on pallas.forward.

* pallas/multimodal_server.py
    - send_message resolves the agent's opted-in downstreams, publishes
      the inbound bearer for each, and revokes them all in the finally.
    - bearer/header diagnostics go to pallas.auth (DEBUG) instead of
      /tmp/pallas-bearer.log which is invisible under systemd PrivateTmp.

* pallas/log.py
    - honour PALLAS_LOG_LEVEL env var (default INFO) so operators can
      flip the forward/auth diagnostics on without a code change.

* docs/pallas.md, docs/mnemosyne_integration.md
    - document the registry-based forwarding and the task-group
      ContextVar constraint that forced it.
2026-05-05 12:09:51 -04:00


# Mnemosyne Integration — Pallas Reference
This document describes how Pallas-hosted agents connect to Mnemosyne for workspace-scoped knowledge search. The full integration specification lives in [`daedalus/docs/mnemosyne_integration.md`](../../daedalus/docs/mnemosyne_integration.md).
---
## Overview
Mnemosyne is a downstream MCP server like any other from Pallas's perspective. Agents declare `"mnemosyne"` in their `servers` list; the server URL and bearer-forward opt-in live in the project's `fastagent.config.yaml`.
What makes Mnemosyne different from other downstream servers:
- **Workspace-scoped search.** Daedalus mints a per-turn HS256 JWT carrying the workspace UUID and sends it as `Authorization: Bearer` on the `send_message` call to Pallas. Pallas captures it in `request_bearer_token`, and the fast-agent patch (`pallas._fastagent_patch`) forwards it on outgoing calls to Mnemosyne when `forward_inbound_auth: true` is set. Mnemosyne validates the JWT and scopes all Cypher searches to that workspace.
- **The LLM never sees `workspace_id`.** The scoping is claim-driven: Mnemosyne reads the JWT claims, overwrites any `workspace_id` the model may have produced in tool arguments, and enforces containment server-side. Pallas is transparent transport.
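The claim-driven scoping described above can be sketched as follows. This is a minimal illustration, not Mnemosyne's actual code: the function name `scope_tool_arguments` and the use of `PermissionError` are assumptions; only the `ws` claim and the `workspace_id` argument name come from this document.

```python
from typing import Any, Mapping


def scope_tool_arguments(
    arguments: Mapping[str, Any], claims: Mapping[str, Any]
) -> dict[str, Any]:
    """Return tool arguments with workspace_id forced to the JWT's `ws` claim.

    Hypothetical helper: it only illustrates "claims win over model output".
    """
    workspace_id = claims.get("ws")
    if not workspace_id:
        # Without a workspace claim there is nothing to scope to, so refuse.
        raise PermissionError("token carries no workspace claim")

    scoped = dict(arguments)
    # Overwrite whatever the model supplied; the claim is authoritative.
    scoped["workspace_id"] = workspace_id
    return scoped
```

Under this sketch, a model-supplied `workspace_id` has no effect on the outcome, which is why Pallas never needs to expose it to the LLM.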
---
## Configuration
### fastagent.config.yaml
Add the `mnemosyne` stanza to `mcp.servers`. The only Mnemosyne-specific flag is `forward_inbound_auth: true`:
```yaml
mcp:
  servers:
    mnemosyne:
      transport: http
      url: "https://mnemosyne.ouranos.helu.ca/mcp/"
      forward_inbound_auth: true
```
This stanza is already deployed in `iolaus/fastagent.config.yaml`, `kottos/fastagent.config.yaml`, and `mentor/fastagent.config.yaml`, and in their Ansible templates under `virgo/ansible/`.
### Agent Definitions
Add `"mnemosyne"` to the `servers` list of any agent that should be able to search workspace content. Sub-agents (e.g. `research`, `tech_research`) that are orchestrated by primary agents do not need it unless they independently issue search calls.
**iolaus**: all primary agents have Mnemosyne access: `shawn`, `david`, `hypatia`, `watson`, `nate`, `garth`, `bourdain`, `cousteau`, `marcus`, `cristiano`, `mikael`.

**kottos**: `harper`, `scotty`.

**mentor**: `alan`, `ann`, `jeffrey`, `jarvis`, `aws_sa`.

Example (from `iolaus/agents/shawn.py`):
```python
@fast.agent(
    name="shawn",
    instruction=_INSTRUCTION,
    servers=["argos", "mnemosyne", "neo4j_cypher", "kernos", "time"],
    default=True,
)
async def _shawn():
    pass
```
---
## How Bearer Forwarding Works
1. Daedalus mints a per-turn JWT:
```json
{
  "iss": "daedalus",
  "sub": "chat",
  "ws": "<workspace_uuid>",
  "libs": [],
  "iat": <now>,
  "exp": <now + 600>,
  "jti": "<uuid4>"
}
```
2. Daedalus calls Pallas's `send_message` tool with `Authorization: Bearer <token>` in the HTTP request headers.
3. Pallas's `MultimodalAgentMCPServer` captures the token by reading the request's `Authorization` header directly through `fastmcp.server.dependencies.get_http_request()` — `get_access_token()` returns `None` because Pallas runs without the FastMCP auth middleware. The token is pushed into the `request_bearer_token` ContextVar (for LLM-provider passthrough) and **also** registered in a per-request bearer registry keyed by each opted-in downstream's `MCPServerSettings` object.
4. The fast-agent patch in `pallas/_fastagent_patch.py` (installed at import time in `pallas/__init__.py`) wraps `_prepare_headers_and_auth`. When a server config has `forward_inbound_auth: true`, the patch reads the bearer out of the per-request registry (with the ContextVar as a fallback) and injects `Authorization: Bearer <token>` into the outgoing HTTP headers for that MCP call. The registry is required because fast-agent's `MCPConnectionManager` runs the transport in its own anyio `TaskGroup`, which does not inherit the request handler's `contextvars.Context`.
5. The request handler's `finally` clause revokes every bearer it published, so per-request tokens never outlive the call and no stale credentials can be reused.
6. Mnemosyne receives the same token, validates the HMAC signature against its `MCPSigningKey` table, and scopes all search Cypher queries to `ws` from the claims.
The `forward_inbound_auth` flag is **per-server** — other servers in the same agent (`argos`, `neo4j_cypher`, `time`, etc.) never receive the bearer.
---
## Available Mnemosyne MCP Tools
These tools become available to agents with `"mnemosyne"` in their `servers` list:
| Tool | Purpose |
|------|---------|
| `search_knowledge` | Hybrid vector + full-text + graph search with re-ranking, scoped to the current workspace |
| `search_by_category` | Search within a specific library type (technical, fiction, business, etc.) |
| `list_libraries` | List accessible libraries |
| `list_collections` | List collections within a library |
| `get_item` | Retrieve item metadata, chunk previews, and concept links |
| `get_concepts` | Traverse the concept graph |
All tools are transparently scoped to the workspace by JWT claims. An agent in workspace A cannot retrieve content from workspace B regardless of what arguments it produces.
---
## Downstream MCP Servers
| Server | URL | `forward_inbound_auth` |
|--------|-----|----------------------|
| mnemosyne | `https://mnemosyne.ouranos.helu.ca/mcp/` | `true` |
| argos | `http://miranda.incus:25534/mcp` | — |
| neo4j_cypher | `http://circe.helu.ca:22034/mcp` | — |
| kernos | `http://caliban.incus:22021/mcp` | — |
| gitea | `http://miranda.incus:25535/mcp` | — |
| rommie | `http://caliban.incus:22031/mcp` | — |
| grafana | `http://miranda.incus:25533/mcp` | — |
---
## Provisioning (one-time, server-side)
1. On the Mnemosyne host, generate the signing key:
```bash
docker compose exec app python manage.py seed_signing_key --kid daedalus-1
# Copy the printed hex secret
```
2. Set on Daedalus (`.env` or Ansible vault):
```
DAEDALUS_MNEMOSYNE_MCP_URL=https://mnemosyne.ouranos.helu.ca/mcp/
DAEDALUS_MNEMOSYNE_SIGNING_KID=daedalus-1
DAEDALUS_MNEMOSYNE_SIGNING_SECRET=<hex from step 1>
DAEDALUS_MNEMOSYNE_TOKEN_TTL_SECONDS=600
```
3. Restart Daedalus and the three agent deployments (iolaus, kottos, mentor).
The OCI vault secret is `virgo-mnemosyne-signing-secret`; Ansible injects it via `mnemosyne_signing_secret`.
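To show how the provisioned values relate to the per-turn token, here is a minimal HS256 mint-and-verify sketch using only the standard library. Several details are assumptions rather than facts from this document: that the hex secret is decoded to raw bytes before use as the HMAC key, that the `kid` travels in the JWT header, and the function names `mint_token` and `verify_token`. The claim set mirrors the one shown under "How Bearer Forwarding Works".

```python
import base64
import hashlib
import hmac
import json
import time
import uuid
from typing import Optional


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(secret_hex: str, signing_input: str) -> bytes:
    # Assumption: the hex secret is decoded to raw bytes for the HMAC key.
    key = bytes.fromhex(secret_hex)
    return hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()


def mint_token(secret_hex: str, kid: str, workspace_id: str,
               ttl_seconds: int = 600, now: Optional[int] = None) -> str:
    now = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT", "kid": kid}  # kid placement assumed
    claims = {
        "iss": "daedalus", "sub": "chat", "ws": workspace_id, "libs": [],
        "iat": now, "exp": now + ttl_seconds, "jti": str(uuid.uuid4()),
    }
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    return signing_input + "." + _b64url(_sign(secret_hex, signing_input))


def verify_token(token: str, secret_hex: str, now: Optional[int] = None) -> dict:
    now = int(time.time()) if now is None else now
    header_b64, claims_b64, signature_b64 = token.split(".")
    expected = _sign(secret_hex, header_b64 + "." + claims_b64)
    # Constant-time comparison avoids leaking signature bytes through timing.
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(claims_b64))
    if claims["exp"] <= now:
        raise ValueError("token expired")
    return claims
```

In a real deployment a maintained JWT library would normally replace the hand-rolled encoding; the sketch only makes the roles of the secret, the `kid`, and the TTL concrete.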
---
## Degraded Mode
If Daedalus's `MNEMOSYNE_SIGNING_SECRET` is blank or `MNEMOSYNE_MCP_URL` is empty, `mint_chat_token` returns `None`. Pallas calls proceed without a bearer; Mnemosyne search is unavailable but all other agent tools continue normally. No error is surfaced to the user.
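The degraded path can be sketched as below. Both functions are hypothetical: `mint_chat_token_sketch` only mirrors the documented "return `None` when unconfigured" behaviour of `mint_chat_token` (its real signature is not shown in this document), and `send_message_headers` is an illustrative helper for the caller side.

```python
from typing import Optional


def mint_chat_token_sketch(
    signing_secret: str, mcp_url: str, workspace_id: str
) -> Optional[str]:
    """Return None when Mnemosyne is not configured (hypothetical signature)."""
    if not signing_secret.strip() or not mcp_url.strip():
        return None
    # Placeholder: a real implementation would mint a signed JWT here.
    return f"placeholder-token-for-{workspace_id}"


def send_message_headers(token: Optional[str]) -> dict[str, str]:
    """Build request headers; omit Authorization entirely when there is no token."""
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return headers
```

With no token, the request simply carries no `Authorization` header, so Pallas has nothing to forward and the remaining tools are unaffected.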