docs: clarify Daedalus-Pallas integration auth model

Refine the phase-2 integration spec to reflect implementation details:

- Change `resolved_libraries` from `set[str]` to ordered `list[str]`
- Document `MCPToken.allowed_libraries` as JSONField (not M2M) since
  Library lives in Neo4j, not Django's ORM
- Clarify that `Library.workspace_id` is a content-routing attribute,
  not an authorization axis
- Describe retirement of the three-branch `_WORKSPACE_SCOPE_CLAUSE` in
  favor of a single `lib.uid IN $resolved_libraries` check
- Specify team JWT resolution via `TeamWorkspaceAssignment` DB join
- Note admin UI materializes full Library UID list explicitly
2026-05-10 11:59:44 -04:00
parent e9f6eeb1a3
commit 16fb7ff4dc
35 changed files with 1839 additions and 2035 deletions

@@ -23,9 +23,12 @@ model connecting three services:
The model replaces the per-turn JWT *forwarding* scheme with a unified
**bearer → resolved library set** abstraction. Every authenticated
-Mnemosyne request resolves to a set of Library UIDs the caller may
-read; the principal type (opaque `MCPToken`, Daedalus per-turn JWT,
-team JWT) only determines how that set is derived.
+Mnemosyne request resolves to a single ordered `resolved_libraries`
+list of Library UIDs the caller may read; the principal type (opaque
+`MCPToken`, Daedalus per-turn JWT, team JWT) only determines how that
+list is derived. `Library.workspace_id` is a Daedalus content-routing
+attribute used by the ingest and workspace-lifecycle APIs; it is **not**
+consulted by the auth layer.
It also records the UX shift in Daedalus: **workspaces attach Teams
(Pallas instances), not individual agents**; the agent picker in chat
@@ -86,24 +89,48 @@ and the design collapses to two credential types.
### 3.3 Resolved-library abstraction
Mnemosyne's auth middleware populates a single
-`resolved_libraries: set[str]` per request. Downstream code (search,
-get_document, list_libraries, etc.) only reads that set; it does not
-care where the set came from.
+`resolved_libraries: list[str]` per request. Downstream code (search,
+get_chunk, list_libraries, list_collections, list_items, …) only
+reads that list; it does not care where it came from.
```
Bearer → classify → dispatch
-├─ Opaque MCPToken → allowed_libraries M2M
+├─ Opaque MCPToken → token.allowed_libraries (JSON list of UIDs)
├─ per-turn JWT → claims["libs"]
-└─ team JWT (typ=team) → live DB: team.workspaces → libraries
-(filtered by Library.workspace_id)
+└─ team JWT (typ=team) → live DB join:
+TeamWorkspaceAssignment.workspace_id
+→ Library.workspace_id → Library.uid
-resolved_libraries: set[str]
+resolved_libraries: list[str]
downstream tools
```
-Fail-closed: if the resolution produces an empty set, the request sees
-no Libraries. There is no "empty means everything" path.
+Fail-closed: if the resolution produces an empty list, the request
+sees no Libraries. There is no "empty means everything" fallback.
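
A minimal sketch of the dispatch described above, assuming a hypothetical `resolve_libraries` helper and string labels for the three principal types; none of these names come from the Mnemosyne codebase. The order-preserving de-duplication is an illustrative choice, not something the spec requires.

```python
from typing import Any, Callable, Iterable, Mapping, Optional


def _ordered_unique(uids: Optional[Iterable[str]]) -> list[str]:
    """Keep first-seen order and drop repeats; None becomes []."""
    return list(dict.fromkeys(uids or []))


def resolve_libraries(
    kind: str,
    *,
    token_libraries: Optional[Iterable[str]] = None,
    claims: Optional[Mapping[str, Any]] = None,
    team_lookup: Optional[Callable[[str], Iterable[str]]] = None,
) -> list[str]:
    """Map a classified bearer to the Library UIDs it may read."""
    claims = claims or {}
    if kind == "opaque":  # opaque MCPToken: stored JSON list of UIDs
        return _ordered_unique(token_libraries)
    if kind == "per_turn_jwt":  # Daedalus per-turn JWT: the "libs" claim
        return _ordered_unique(claims.get("libs"))
    if kind == "team_jwt" and team_lookup is not None:
        # Team JWT: live lookup keyed on the "sub" claim (hypothetical callable).
        return _ordered_unique(team_lookup(str(claims.get("sub", ""))))
    # Fail-closed: anything unrecognized resolves to no Libraries.
    return []
```

Every branch returns a plain list, so downstream code never needs to know which credential produced it, and an unrecognized principal type yields an empty list rather than an error-prone default.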
+
+#### 3.3.1 Retirement of the old three-branch scope clause
+
+The pre-phase-2 search pipeline ran every Cypher query against a
+`_WORKSPACE_SCOPE_CLAUSE` with three branches keyed on whether
+`workspace_id` and/or `allowed_libraries` were set. Phase 2 removes
+that clause entirely. Every authorization check collapses to:
+
+```cypher
+WHERE lib.uid IN $resolved_libraries
+```
+
+`Library.workspace_id` stays on the node as a Daedalus content-routing
+attribute (used by the ingest API to find-or-create the per-workspace
+Library, and by the workspace-lifecycle API to cascade-delete that
+Library's contents). It is **not** an authorization axis and is not
+consulted anywhere in the auth middleware, the MCP tool surface, or
+the search service.
+
+Admin-UI-initiated searches (Django staff logged into the Mnemosyne
+admin / search page) materialize `resolved_libraries` explicitly as
+"every Library UID the database contains"; this is the same mechanism
+used today as a workaround, now the only code path.
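
As an illustration of the single-clause check, the sketch below assembles a scoped query and its parameters. `build_scoped_query`, `SCOPE_CLAUSE`, and the relationship name in the usage example are hypothetical names for this example, not identifiers from the search service.

```python
from typing import Iterable

# The one authorization clause every query shares after phase 2.
SCOPE_CLAUSE = "WHERE lib.uid IN $resolved_libraries"


def build_scoped_query(
    match_clause: str,
    return_clause: str,
    resolved_libraries: Iterable[str],
) -> tuple[str, dict]:
    """Join MATCH, the scope clause, and RETURN into one query."""
    query = "\n".join([match_clause, SCOPE_CLAUSE, return_clause])
    # UIDs travel as a query parameter, never via string interpolation.
    params = {"resolved_libraries": list(resolved_libraries)}
    return query, params


# Usage with a made-up graph shape:
query, params = build_scoped_query(
    "MATCH (lib:Library)<-[:BELONGS_TO]-(doc:Document)",
    "RETURN doc.uid",
    ["lib-1", "lib-2"],
)
```

Because Cypher evaluates `x IN []` to false, passing an empty list matches no rows, which lines up with the fail-closed rule without a separate branch.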
---
@@ -133,9 +160,14 @@ class LibraryMembership(models.Model):
User can scope a Library into `MCPToken.allowed_libraries` iff they
have `owner` or `manager` role on it.
-#### `MCPToken.allowed_libraries` (new M2M on existing model)
+#### `MCPToken.allowed_libraries` (new field on existing model)
```python
-allowed_libraries = models.ManyToManyField(Library, blank=True)
+# JSON list of Library.uid strings. A real M2M isn't possible because
+# Library lives in Neo4j (neomodel StructuredNode), not Django's ORM.
+# The admin/dashboard form materializes the picker by querying
+# Library.nodes and filtering to libraries where the token's user has
+# an ``owner`` or ``manager`` LibraryMembership.
+allowed_libraries = models.JSONField(default=list, blank=True)
```
Fail-closed: empty → token grants access to zero libraries.
Admin form filters the picker by the current user's owned/managed
@@ -254,6 +286,7 @@ def resolve_mcp_jwt(token_string: str) -> dict:
     typ = claims.get("typ")
     if typ == "team":
         # No replay cache: team tokens are reused on every request.
+        # Validate sub=="team:<uuid>" shape; stash the uuid on claims.
         pass
     else:
         if _remember_jti(jti, float(exp)):
@@ -262,19 +295,31 @@ def resolve_mcp_jwt(token_string: str) -> dict:
     return claims
```
-Downstream, the middleware branches:
+Middleware populates `STATE_KEY_RESOLVED_LIBRARIES` per request:
```python
-if claims.get("typ") == "team":
-    team = Team.objects.get(id=uuid_from_sub(claims["sub"]),
-                            active=True,
-                            active_jti=claims["jti"])
-    resolved_libraries = _libraries_for_team(team)
-else:
-    resolved_libraries = claims["libs"]
+# Opaque MCPToken
+resolved_libraries = list(token.allowed_libraries or [])
+
+# Per-turn JWT (legacy; retires phase 4)
+resolved_libraries = list(claims.get("libs") or [])
+
+# Team JWT
+team = Team.objects.get(id=uuid_from_sub(claims["sub"]),
+                        active=True,
+                        active_jti=claims["jti"])
+resolved_libraries = _libraries_for_team(team)  # see below
```
-`_libraries_for_team(team)` = all `Library` UIDs whose `workspace_id`
-is in the team's `TeamWorkspaceAssignment` set.
+`_libraries_for_team(team)` runs a single Cypher query against Neo4j:
+
+```cypher
+MATCH (l:Library)
+WHERE l.workspace_id IN $workspace_ids
+RETURN l.uid
+```
+
+where `$workspace_ids` is `list(team.workspace_assignments.values_list("workspace_id", flat=True))`.
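
The lookup above could be wrapped roughly as follows. This is a sketch under stated assumptions: `libraries_for_workspaces` and the injected `run_cypher` callable are hypothetical, and the callable is assumed to return rows whose first column is the Library UID. In a neomodel project it might wrap `db.cypher_query`, which returns a `(rows, meta)` tuple.

```python
from typing import Any, Callable, Iterable, Sequence

TEAM_LIBRARIES_QUERY = (
    "MATCH (l:Library) "
    "WHERE l.workspace_id IN $workspace_ids "
    "RETURN l.uid"
)

# Hypothetical signature: (query, params) -> rows, each row a sequence.
RunCypher = Callable[[str, dict], Iterable[Sequence[Any]]]


def libraries_for_workspaces(
    workspace_ids: Iterable[Any],
    run_cypher: RunCypher,
) -> list[str]:
    """Return Library UIDs for the given workspaces, in row order."""
    ids = [str(w) for w in workspace_ids]
    if not ids:
        # No assignments: skip the round trip and fail closed.
        return []
    rows = run_cypher(TEAM_LIBRARIES_QUERY, {"workspace_ids": ids})
    # Drop repeated UIDs while keeping first-seen order.
    return list(dict.fromkeys(str(row[0]) for row in rows))
```

Injecting the query runner keeps the function testable without a live Neo4j instance. The query has no `ORDER BY`, so the resulting order is whatever the database returns.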
---
@@ -283,13 +328,16 @@ is in the team's `TeamWorkspaceAssignment` set.
### 6.1 Third-party MCP client with opaque `MCPToken`
1. Client sends `Authorization: Bearer <plaintext>`.
2. Middleware hashes → looks up `MCPToken` → validates active/expired.
-3. `resolved_libraries = token.allowed_libraries.values_list("uid")`.
+3. `resolved_libraries = list(token.allowed_libraries or [])`, the
+   JSON list of Library UIDs the admin / dashboard granted at mint.
4. Fails closed if empty.
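
The four steps can be sketched in plain Python as below. The spec does not name the hash algorithm or the storage layout, so SHA-256, the `TokenRecord` dataclass, and the dict-backed `store` are assumptions made for illustration only.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Mapping, Optional


@dataclass
class TokenRecord:
    """Hypothetical stand-in for a stored MCPToken row."""
    active: bool
    expires_at: Optional[datetime] = None
    allowed_libraries: list = field(default_factory=list)


def resolve_opaque_token(
    plaintext: str,
    store: Mapping[str, TokenRecord],
    now: Optional[datetime] = None,
) -> list:
    """Hash the bearer, look it up, validate it, return its UID list."""
    now = now or datetime.now(timezone.utc)
    # Assumption: tokens are stored under their SHA-256 hex digest.
    digest = hashlib.sha256(plaintext.encode("utf-8")).hexdigest()
    record = store.get(digest)
    if record is None or not record.active:
        raise PermissionError("unknown or inactive token")
    if record.expires_at is not None and record.expires_at <= now:
        raise PermissionError("expired token")
    # A valid token with an empty list still grants access to nothing.
    return list(record.allowed_libraries or [])
```

An invalid token is rejected outright, while a valid token with no granted libraries resolves to an empty list; both outcomes leave the caller with access to zero Libraries.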
### 6.2 Daedalus chat per-turn JWT (legacy, retires Phase 4)
-Unchanged from today. `iss=daedalus`, `typ` absent, `libs` carries the
-workspace's user-managed libraries, `ws` carries the workspace id.
-Mnemosyne validates against `MCPSigningKey` keyed by `kid`.
+`iss=daedalus`, `typ` absent, `libs` carries the full library set
+Daedalus pre-computed for that turn (the workspace's auto-Library
+plus any user-managed extras), `ws` is present but no longer consulted
+server-side. Middleware assigns `resolved_libraries = claims["libs"]`.
+Mnemosyne validates the JWT against `MCPSigningKey` keyed by `kid`.
### 6.3 Agent team (Kottos / Mentor / Iolaus / post-migration Daedalus-chat)
1. Pallas sends `Authorization: Bearer <team-jwt>` (static, read from