Team JWTs include `aud=mnemosyne` while per-turn JWTs omit `aud`
entirely. Since `iss` + `typ` already partition the two token
populations, explicitly skip audience verification to avoid rejecting
valid tokens.
Also expand test coverage for the MCP auth surface to exercise all
three credential types (opaque MCPToken, per-turn JWT, team JWT),
including replay cache behavior and Neo4j-backed library resolution
via mocked cypher queries.
Refine the phase-2 integration spec to reflect implementation details:
- Change `resolved_libraries` from `set[str]` to ordered `list[str]`
- Document `MCPToken.allowed_libraries` as JSONField (not M2M) since
Library lives in Neo4j, not Django's ORM
- Clarify that `Library.workspace_id` is a content-routing attribute,
not an authorization axis
- Describe retirement of the three-branch `_WORKSPACE_SCOPE_CLAUSE` in
favor of a single `lib.uid IN $resolved_libraries` check
- Specify team JWT resolution via `TeamWorkspaceAssignment` DB join
- Note admin UI materializes full Library UID list explicitly
In FastMCP's on_call_tool hook the middleware context is already
MiddlewareContext[CallToolRequestParams] (per fastmcp's own
middleware.py:158), so tool name lives at context.message.name, not
at context.message.params.name — the latter always returned None,
silently breaking the PUBLIC_TOOLS bypass for get_health and making
the per-tool ACL short-circuit.
Also wrap call_next in a traced helper that logs any exception with
a full traceback and logs the success-path result type. During the
Pallas↔Mnemosyne shakedown the tool results were coming back to
fast-agent as the literal string "object NoneType can't be used in
'await' expression" with no trace in either process — that's Python's
TypeError for 'await X' where X is None. If that TypeError is raised
inside FastMCP dispatch we want the frame in Mnemosyne's own log
rather than having Pallas's aggregator turn it into a terse
CallToolResult(isError=True) with no stack.
Daedalus mints one JWT per chat turn; a turn routinely drives several
Mnemosyne tool calls (list_libraries -> search -> get_document ...)
re-using that same bearer. The old _remember_jti flagged every repeat
as replay, so the 2nd+Nth tool call in each turn failed with
'Token replay detected.'.
Change the cache to store jti -> exp. A repeat within the token's own
validity window is legitimate and allowed. A repeat *past* exp (+ the
symmetric _JWT_LEEWAY_SECONDS PyJWT uses on the signature check) is
a genuine replay and still rejected -- this is belt-and-braces since
PyJWT's own exp check would have already caught an expired token.
Also validate exp is numeric at the call site for defence in depth
against future PyJWT changes to claim shapes.
Temporarily instrument MCPAuthMiddleware to emit one log line per
on_call_tool and one per _extract_token. Needed to diagnose why
workspace-scoped JWTs forwarded by Pallas land on tool calls with
'Authentication required. Provide a Bearer token.'
Logs include header names, auth-header length+prefix, and the request
URL so we can tell in one turn whether the header is missing, present
but rejected, or get_http_request() raised. Also adds lowercase-bearer
tolerance for clients that normalize to lowercase.
Demote to DEBUG once the end-to-end path is green.
Health probes (Pallas health pollers, agent startup checks) call get_health
without a bearer token. Auth should only be required for data-access tools.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Extend library list endpoint with `include_workspace` and
`with_item_count` query params to support Daedalus registry mirroring
- Expand search scope clause to three modes: workspace-only, workspace
plus allowed user libraries, and global
- Add `allowed_libraries` field to SearchRequest for Phase-2 JWT claims
- Introduce JWT-based actor resolution using a synthetic service user
(`MCP_JWT_SERVICE_USERNAME`) for Daedalus-originated requests
Replace plaintext token storage with SHA-256 hashes so leaked database
contents cannot be used to authenticate. Plaintext is generated, shown
once at creation time, and never persisted.
- Add `hash_token()` helper and `MCPTokenManager.create_token()` that
returns `(instance, plaintext)`.
- Replace `token` field with indexed `token_hash`; look up bearers by
hashing the incoming value.
- Update dashboard, management command, and admin to surface plaintext
only at creation. Disable admin "add" since it cannot reveal plaintext.
- Migration drops the old `token` column and adds `token_hash`;
pre-existing tokens are invalidated and must be reissued.
- Remove Phase 4 RAG pipeline in favor of retrieval-only architecture
- Add FastMCP server exposing search, get_chunk, list_libraries tools
- Mount MCP endpoints (streamable HTTP + SSE) via Starlette in ASGI config
- Update README to clarify Mnemosyne is a retrieval engine, not RAG
- Let calling LLMs drive synthesis and iterative retrieval themselves