mnemosyne

Author	SHA1	Message	Date
Robert Helewka	16fb7ff4dc	docs: clarify Daedalus-Pallas integration auth model Refine the phase-2 integration spec to reflect implementation details: - Change `resolved_libraries` from `set[str]` to ordered `list[str]` - Document `MCPToken.allowed_libraries` as JSONField (not M2M) since Library lives in Neo4j, not Django's ORM - Clarify that `Library.workspace_id` is a content-routing attribute, not an authorization axis - Describe retirement of the three-branch `_WORKSPACE_SCOPE_CLAUSE` in favor of a single `lib.uid IN $resolved_libraries` check - Specify team JWT resolution via `TeamWorkspaceAssignment` DB join - Note admin UI materializes full Library UID list explicitly	2026-05-10 11:59:44 -04:00
Robert Helewka	e9f6eeb1a3	docs: add Daedalus/Pallas/Mnemosyne integration design v1 Document the end-state auth/authz model unifying the three services around a bearer → resolved library set abstraction. Replaces the per-turn JWT forwarding scheme with static team JWTs held by Pallas deployments, eliminating custom transport code and the monkey-patch chain that caused opaque failures in agent teams. Also records the UX shift where Daedalus workspaces attach Teams (Pallas instances) rather than individual agents.	2026-05-10 11:11:29 -04:00
Robert Helewka	55523adbf7	fix(library): use {% comment %} for multi-line template comments Django's `{# #}` syntax only supports single-line comments; multi-line blocks were rendering as literal text in the search and library detail templates. Replace them with `{% comment %}...{% endcomment %}` blocks and add a note explaining the distinction.	2026-05-10 08:19:15 -04:00
Robert Helewka	a945b382e6	feat: add init sidecar for migrations and setup on compose up Introduces a one-shot `init` service in docker-compose that runs Postgres migrations, Neo4j index setup, and library-type seeding on every `up`. Long-running services (`app`, `mcp`, `worker`) now depend on its successful completion via `service_completed_successfully`, blocking the stack on configuration errors (missing embedding model, dimension mismatch, unreachable DB) rather than serving silent zero-result searches. Also standardizes reranker test fixtures to use the `/v1` OpenAI-style base URL convention used across other service clients.	2026-05-10 08:01:58 -04:00
Robert Helewka	9ceb01f829	fix(library): admin UI search now sees workspace-scoped libraries Root cause: SearchService unconditionally appends _WORKSPACE_SCOPE_CLAUSE to every Cypher query. With both workspace_id and allowed_libraries NULL, the clause only matches libraries whose workspace_id is also NULL: `AND ( ($workspace_id IS NOT NULL AND lib.workspace_id = $workspace_id) OR ($allowed_libraries IS NOT NULL AND lib.uid IN $allowed_libraries) OR ($workspace_id IS NULL AND $allowed_libraries IS NULL AND lib.workspace_id IS NULL) )` search_page and library_search both built their SearchRequest without setting either parameter, so the third branch was always the only one that matched. Every Daedalus-ingested library carries a non-null workspace_id, so documents ingested via Daedalus were invisible to the /library/search/ admin UI; the symptom was zero results for terms that demonstrably exist in indexed chunks. Fix: Both admin-UI views are `@login_required` debug/admin tools for Django-authenticated operators, not MCP endpoints, so they have no workspace-scoping contract to honour. Added `_all_library_uids()` helper that returns every Library UID (or [] when Neo4j is down / a neomodel error bubbles up) and wired it into both views as `allowed_libraries=`. This flips the scope clause into its second branch ('lib.uid IN $allowed_libraries'), which matches every library regardless of workspace_id, reusing the exact mechanism Phase-2 chat turns use for user-managed libraries. SearchRequest.__post_init__ collapses an empty list to None, so an unreachable Neo4j gracefully reverts to the legacy global-only behaviour rather than 500-ing the page. Tests: library/tests/test_search_views_admin_scope.py: * AllLibraryUidsHelperTests: Neo4j unavailable, normal listing, empty/None-uid filtering, unexpected-exception degradation. * SearchPageAllowedLibrariesTests: admin POST to /library/search/ reaches SearchService with the captured list; empty list collapses to None. Stubs SearchService.search to keep the test hermetic. 6 new tests; all 16 tests in library.tests.test_search* are green: TEST_NEO4J_ENABLED=0 python manage.py test library.tests.test_search_views_admin_scope library.tests.test_search_scoping --testrunner=test_db_manager.django_integration.PostgreSQLTestRunner	2026-05-09 21:54:30 -04:00
Robert Helewka	642268cec1	Add unit tests for MCPAuthMiddleware._extract_tool_name / _extract_token Both helpers were load-bearing during the Pallas<->Mnemosyne shakedown: * _extract_tool_name: covers the current FastMCP shape (context.message.name directly), the legacy .params.name fallback, prefer-direct behaviour, and every None-producing path. Includes a contract test against the real mcp.types.CallToolRequestParams which skips if the mcp package isn't importable. * _extract_token: covers Bearer/bearer schemes, Authorization/authorization header casing, whitespace stripping, missing/empty/non-Bearer headers, RuntimeError degrading to None (outside an HTTP dispatch), and non-RuntimeError propagating loudly. Uses SimpleTestCase (no DB) with unittest.mock.patch on mcp_server.auth.get_http_request to avoid pulling in FastMCP internals. Run as part of mnemosyne's mcp_server suite: TEST_NEO4J_ENABLED=0 python manage.py test mcp_server --testrunner=test_db_manager.django_integration.PostgreSQLTestRunner 17 new tests, all green; total mcp_server suite 59 tests passing.	2026-05-08 20:32:42 -04:00
Robert Helewka	d11ee72527	feat(library): protect Daedalus workspace-scoped libraries from manual deletion - Add guard in `library_delete` view to block deletion of libraries owned by a Daedalus workspace, redirecting with an error message - Disable the Delete button in `library_detail.html` for workspace-scoped libraries and show a warning alert explaining managed ownership - Add a "Daedalus workspace" badge in both `library_detail.html` and `library_list.html` to visually identify workspace-owned libraries Prevents state desync between Mnemosyne and Daedalus by ensuring workspace-scoped libraries can only be removed via the Daedalus workspace DELETE API endpoint.	2026-05-08 06:55:07 -04:00
Robert Helewka	3c7f85cba0	fix(urls): move static-prefix routes before dynamic `<str:uid>/` pattern Relocate `search/`, `concepts/`, and `concepts/<str:uid>/` URL patterns to appear before the `<str:uid>/` catch-all route, preventing Django from incorrectly matching those static prefixes as library UIDs.	2026-05-08 06:28:04 -04:00
Robert Helewka	027de096bc	refactor: move nav items to base navbar template Consolidate navigation items into the base navbar template instead of requiring each app to override nav blocks. Nav links are now conditionally rendered based on authentication status, removing the need for duplicate nav block definitions in dashboard.html.	2026-05-08 06:01:59 -04:00
Robert Helewka	4cf022e615	feat: add image query support to search service and library UI - Add `query_image_ext` field to `SearchRequest` (defaults to "png") - Embed query from image when supplied and model supports multimodal, with fallback to text embedding on failure or unsupported model - Add search form to library detail page with optional image upload, shown only when multimodal embeddings are available - Display side-by-side baseline vs re-ranked results with query mode indicator, timing stats, and score/rank change highlighting	2026-05-08 05:58:36 -04:00
Robert Helewka	e0fa825189	auth: read tool name off context.message directly; trace call_next failures In FastMCP's on_call_tool hook the middleware context is already MiddlewareContext[CallToolRequestParams] (per fastmcp's own middleware.py:158), so tool name lives at context.message.name, not at context.message.params.name; the latter always returned None, silently breaking the PUBLIC_TOOLS bypass for get_health and making the per-tool ACL short-circuit. Also wrap call_next in a traced helper that logs any exception with a full traceback and logs the success-path result type. During the Pallas↔Mnemosyne shakedown the tool results were coming back to fast-agent as the literal string "object NoneType can't be used in 'await' expression" with no trace in either process; that's Python's TypeError for 'await X' where X is None. If that TypeError is raised inside FastMCP dispatch we want the frame in Mnemosyne's own log rather than having Pallas's aggregator turn it into a terse CallToolResult(isError=True) with no stack.	2026-05-06 19:47:52 -04:00
Robert Helewka	15d70c2cf9	mcp_auth: allow jti re-use within its exp window Daedalus mints one JWT per chat turn; a turn routinely drives several Mnemosyne tool calls (list_libraries -> search -> get_document ...) re-using that same bearer. The old _remember_jti flagged every repeat as replay, so the 2nd+Nth tool call in each turn failed with 'Token replay detected.'. Change the cache to store jti -> exp. A repeat within the token's own validity window is legitimate and allowed. A repeat past exp (+ the symmetric _JWT_LEEWAY_SECONDS PyJWT uses on the signature check) is a genuine replay and still rejected -- this is belt-and-braces since PyJWT's own exp check would have already caught an expired token. Also validate exp is numeric at the call site for defence in depth against future PyJWT changes to claim shapes.	2026-05-05 22:03:36 -04:00
Robert Helewka	8b2e2068e0	mcp_auth: INFO-level bearer extraction diagnostics Temporarily instrument MCPAuthMiddleware to emit one log line per on_call_tool and one per _extract_token. Needed to diagnose why workspace-scoped JWTs forwarded by Pallas land on tool calls with 'Authentication required. Provide a Bearer token.' Logs include header names, auth-header length+prefix, and the request URL so we can tell in one turn whether the header is missing, present but rejected, or get_http_request() raised. Also adds lowercase-bearer tolerance for clients that normalize to lowercase. Demote to DEBUG once the end-to-end path is green.	2026-05-05 21:48:39 -04:00
Robert Helewka	f8536b5474	fix(mcp): exempt get_health from bearer token auth requirement Health probes (Pallas health pollers, agent startup checks) call get_health without a bearer token. Auth should only be required for data-access tools. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-04 18:18:44 -04:00
Robert Helewka	8d650c0570	docs(mnemosyne): update Phase 3 status to implemented Mark per-turn JWT access control as implemented in the Mnemosyne integration docs. Update Phase 2/3 status tables, replace deferred language with concrete implementation details, and document the `MCPSigningKey` model, `resolve_mcp_jwt`, and `_scope_from_claims` components now live in the MCP server.	2026-05-04 15:06:34 -04:00
Robert Helewka	56e977ffb5	fix(library): normalize MIME types to file extensions in Daedalus ingest Daedalus may send `file_type` as a MIME type (e.g. `text/markdown`) rather than a bare extension. Add a `_normalize_file_type` helper with a MIME→ext lookup table and sensible fallbacks so ingested items are stored with proper extensions like `md` instead of `text/markdown`.	2026-05-04 12:39:54 -04:00
Robert Helewka	37bb38ee43	fix(mnemosyne): use STORAGES config for S3 health check Update `_check_s3` to read S3 settings from the `STORAGES` dict instead of deprecated top-level `AWS_*` settings. Skip the check when local storage is enabled and return an error early if no bucket is configured.	2026-05-04 12:26:50 -04:00
Robert Helewka	cbe7921938	fix(deploy): use /ready/ healthcheck and /srv/mnemosyne path - Change app healthcheck from /live/ to /ready/ to verify full readiness including dependencies (DB, Neo4j, S3) - Increase healthcheck timeout from 5s to 10s to accommodate dependency checks - Add S3 bucket connectivity check to readiness probe - Update deployment documentation to use /srv/mnemosyne instead of /opt/mnemosyne as the compose project directory	2026-05-04 09:23:36 -04:00
Robert Helewka	de0d7a4317	docs(mnemosyne): update integration doc for container deployment	2026-05-04 08:56:49 -04:00
Robert Helewka	e34b7f46a5	feat(mcp_server): add --password option to ensure_service_user command	2026-05-04 08:43:55 -04:00
Robert Helewka	df2e495660	docs: add Red Panda Django Standards V1-02 Introduces the Red Panda Approval standards document for Django projects, covering environment setup, directory structure, dependency pinning, Docker Compose per-service environment scoping, nginx reverse-proxy configuration (Docker DNS, X-Forwarded-Proto preservation, access-log filtering, internal allowlists), and Memcached deployment notes.	2026-05-04 07:47:08 -04:00
Robert Helewka	c9328c58fc	refactor(nginx): overhaul config with dynamic resolution and media serving - Add Docker DNS resolver to prevent stale upstream IPs after container restarts - Preserve X-Forwarded-Proto from HAProxy for correct HTTPS detection - Mount mnemosyne-media volume for direct /media/ serving - Add IP allowlisting for probe/metrics endpoints (RFC1918 + loopback) - Fix access_log inheritance so probe paths are properly suppressed - Expand inline documentation covering routing model and conventions	2026-05-04 07:41:15 -04:00
Robert Helewka	003f958f7b	docs(env): expand .env.example into full compose interpolation template Replace the minimal placeholder .env.example with a comprehensive template documenting every variable consumed by docker-compose.yaml, organized by service (Django core, HTTP, Postgres, Neo4j, Memcached, S3/MinIO, Daedalus, Celery/RabbitMQ, etc.). Clarifies that this file is rendered from an Ansible Jinja2 template with vaulted secrets in production, and distinguishes it from the in-tree mnemosyne/.env used for bare-Python development.	2026-05-04 07:04:28 -04:00
Robert Helewka	d84f0e548b	Docker Compose: Set pull policy to always	2026-05-03 20:06:38 -04:00
Robert Helewka	72bd4b381d	Port number adjustments	2026-05-03 19:56:01 -04:00
Robert Helewka	7185d326eb	feat(docker): rename web service to app, add nginx as web Reorganize Docker Compose services: the Django/gunicorn container is now `app` and nginx is `web`, better reflecting their roles. Add a dedicated gunicorn configuration and install curl in the runtime image for health checks. Update documentation to reflect: - Neo4j migration from ariel.incus to a dedicated umbriel.incus instance - Rationale for requiring a dedicated Neo4j instance (single-tenancy assumptions, label/index isolation, schema ownership) - New service naming in compose commands and log tailing examples	2026-05-03 19:35:27 -04:00
Robert Helewka	a2c885cf34	feat(library): add workspace-scoped search and JWT auth for Daedalus - Extend library list endpoint with `include_workspace` and `with_item_count` query params to support Daedalus registry mirroring - Expand search scope clause to three modes: workspace-only, workspace plus allowed user libraries, and global - Add `allowed_libraries` field to SearchRequest for Phase-2 JWT claims - Introduce JWT-based actor resolution using a synthetic service user (`MCP_JWT_SERVICE_USERNAME`) for Daedalus-originated requests	2026-05-03 17:36:06 -04:00
Robert Helewka	e5618973fc	docs(integration): mark Phases 1+2 as implemented; add Phase 3 stub The integration doc was forward-looking spec but most of it now ships: Phase 1 (REST workspace + ingest API for Daedalus) ✅ implemented Phase 2 (MCP server: search/get_chunk/list_*/get_health) ✅ implemented Phase 3 (per-turn signed-token access control) 📋 deferred Updated: - Tool table reflects actual implementation (search, get_chunk, list_libraries, list_collections, list_items, get_health) instead of the speculative names (search_knowledge, search_by_category, etc.) - Project structure matches the as-built layout (tools/discovery.py exists; no separate browse.py). - REST API table covers both workspace lifecycle endpoints and ingest endpoints, with correct routes (/library/api/...). - Ingest request schema includes content_hash and workspace_id (the actual idempotency key on the Mnemosyne side). - Celery task description matches library.tasks.ingest_from_daedalus rather than the placeholder embed_item. - Phase 6 checklist marks Phases 1+2 done; adds Phase 3 (per-turn token access control) with a per-Mnemosyne-side TODO list pointing at the matching Daedalus-side §9 design. Internal MCP port stays 22091; public access via nginx on 23090. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-02 21:54:05 -04:00
Robert Helewka	236d9e2e74	feat(deploy): production docker compose stack + Gitea CI image build Adds a complete deployment surface for production: Dockerfile multi-stage 3.12-slim build, collectstatic baked into the image, runs as non-root mnemosyne uid/gid 1000. docker/entrypoint.sh dispatches `web | mcp | worker | beat | migrate | setup | shell` from a single image, so every service in compose runs the same artifact. docker-compose.yaml five services: static-init (one-shot copies statics into the shared volume on every up), web (gunicorn), mcp (uvicorn), worker (celery), nginx. External services (Postgres, Neo4j, RabbitMQ, S3, Memcached, embedder, reranker) reached over the 10.10.0.0/24 internal network and configured via mnemosyne/.env. nginx/mnemosyne.conf reverse proxy: /library/* and /admin/* → web, /mcp/* → mcp, /static/* → volume, /metrics internal-network-only (127/8 + RFC1918), /healthz proxies to /mcp/health for liveness probes. .gitea/workflows/ CVE scan + image build, image pushed to git.helu.ca/r/mnemosyne. Trivy scans pyproject extras (dev/test/lint/docs) and the built image. pyproject.toml adds [test], [lint], [docs] extras so the CI pip-compile step has something to resolve. README documents the bring-up flow (`docker compose run --rm web migrate`, then `setup`, then `up -d`), day-to-day commands, and the env-var values that need adjusting for production (DEBUG=False, KVDB_LOCATION pointing at the external memcached, AWS keys filled in, etc.). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-29 12:05:23 -04:00
Robert Helewka	1cd556c3f6	fix(asgi): redirect /mcp → /mcp/ for clients that omit the trailing slash Starlette's Mount("/mcp", ...) only matches /mcp/* paths. A POST to bare /mcp falls through to the catch-all Django mount and returns 404. The fast-agent MCP client and the README example both used the no-slash URL, so the validator was never able to initialize a session — every call landed in django.request. Adds a 307 redirect at /mcp so any client URL works, and points the validator config at /mcp/ directly to skip the redirect round-trip. Also gitignores fastagent.jsonl (a runtime log file fast-agent writes into the working directory). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-29 12:04:42 -04:00
Robert Helewka	e2a6d45b77	chore(validator): drop .env, keep all config in FastAgent YAMLs OPENAI_BASE_URL was duplicated between .env and fastagent.config.yaml; the YAML is authoritative, so .env is dead weight. Removing the .env template and gitignore entry, updating README to reflect. The real fastagent.secrets.yaml stays gitignored; fastagent.secrets.yaml.example remains as the documented schema. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-29 07:01:52 -04:00
Robert Helewka	97a14fb03a	feat(validator): add bare FastAgent + Pallas validator for Mnemosyne MCP A self-contained sub-project under validator/ that wraps Mnemosyne's MCP server in a single FastAgent. Use it to confirm — outside of Daedalus — that Mnemosyne's MCP transport works, every tool registers, args/responses round-trip, and an LLM can actually drive the tools. The validator is its own Pallas-consuming project with its own pyproject (pallas-mcp + fast-agent-mcp), agents.yaml, and fastagent.config.yaml — matching the pattern used by Iolaus and other Pallas consumers. It does not import Mnemosyne Python code; it only speaks MCP over HTTP. The agent never sets workspace_id, so all calls run against the global scope (libraries with workspace_id IS NULL). Workspace-scoped validation will come once Daedalus's chat path is wired (Daedalus injects workspace_id server-side, force-overwriting whatever the LLM produces). Default model is openai.Qwen3.5-35B-A3B-UD-Q4_K_XL.gguf served by llama.cpp at nyx.helu.ca:22079/v1. Token provisioning via `python manage.py create_mcp_token --user <u> --name validator`. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-29 06:53:48 -04:00
Robert Helewka	2a8a3d75b4	docs(readme): document operations + Daedalus integration endpoints Adds a "Running Mnemosyne" section with the three commands needed to operate the system: Django web app (gunicorn), MCP server (uvicorn on :22091), and Celery worker — with notes on the embedding queue that the Daedalus ingest task depends on. Adds the Ouranos host map (Portia / Ariel / Oberon / Nyx / Memcached), one-time setup commands (migrate, setup_neo4j_indexes, load_library_types), the Daedalus integration endpoints table, and the two new library types (business, finance) in the existing Library Types table. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-29 06:27:46 -04:00
Robert Helewka	5527cf6bdb	feat(search,mcp): workspace-scope search and add get_health MCP tool Workspace scoping is the integration's security-critical property: an agent in workspace A must never see content from workspace B or from any global library, regardless of what the calling LLM tries. Adds `workspace_id` to SearchRequest with __post_init__ normalization that converts empty strings to None — so "" cannot slip through as a truthy filter at the Cypher boundary. Extracts the workspace scope clause to a single string and appends it to all five search queries (vector, fulltext-chunk, fulltext-concept, graph, image): ($workspace_id IS NULL AND lib.workspace_id IS NULL OR lib.workspace_id = $workspace_id) Either workspace-only or global-only — never both — and the operator precedence is bracketed so a refactor can't accidentally widen it. A test verifies the literal clause string for that exact reason. Adds `workspace_id` as a parameter to every MCP tool (`search`, `get_chunk`, `list_libraries`, `list_collections`, `list_items`). Deliberately undocumented in tool docstrings so the calling LLM is never told the parameter exists — it is system-injected by Daedalus's chat path and force-overwritten before reaching Mnemosyne. Mnemosyne also validates the value but the security guarantee is enforced upstream. Adds the `get_health` MCP tool per the Pallas health spec: returns ok / degraded / error after probing Neo4j, S3, and the embedding model registration. Used by Daedalus's existing health poller. Updates the server INSTRUCTIONS string to advertise the new tool and the two new library types (business, finance). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-29 06:27:32 -04:00
Robert Helewka	f2af28d96d	feat(api): add workspace + ingest REST endpoints for Daedalus Adds the REST API surface that Daedalus calls to manage workspace lifecycle and dispatch file ingestion. All endpoints under /library/api/: POST /workspaces/ create workspace (idempotent on workspace_id; library_type frozen) GET /workspaces/{workspace_id}/ workspace status with item/chunk counts DELETE /workspaces/{workspace_id}/ delete workspace + reachable content; concept-safe (orphan-only Concept GC; concepts referenced elsewhere are preserved) POST /ingest/ queue a file for ingest. Idempotent on (library, source_ref, hash): same triple → return existing job; new hash → supersede. GET /jobs/{job_id}/ poll job status POST /jobs/{job_id}/retry/ re-dispatch a failed job GET /jobs/?status=&library_uid= list recent jobs Workspace-Library lookup uses the unique workspace_id index added in the schema commit. Concept GC runs as a separate transaction after item/chunk delete so partial failures don't leave the global graph corrupted. Tests cover serializer validation, IngestJob ORM behavior, the (library, source_ref, hash) idempotency query pattern, and auth boundaries on every new endpoint. Cypher correctness is validated by manual end-to-end testing — no live Neo4j in unit tests. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-29 06:27:08 -04:00
Robert Helewka	c485a8560c	feat(ingest): add Daedalus cross-bucket S3 fetch + ingest_from_daedalus task Adds DAEDALUS_S3_* settings (read-only credentials for the Daedalus bucket) and a small `daedalus_s3.py` helper that fetches a file from Daedalus's bucket and writes it into Mnemosyne's bucket via default_storage. Adds the Celery task `library.tasks.ingest_from_daedalus`. Given an IngestJob row, it: 1. Resolves the target Library (by library_uid). 2. Supersedes a prior Item with the same source_ref but different content_hash by deleting the old Item + chunks first. 3. Fetches from Daedalus S3, copies into items/{item_uid}/original.{ext}. 4. Creates the Item node, links it to a default Collection. 5. Runs the existing EmbeddingPipeline.process_item. 6. Marks the job completed with chunks/concepts counts. Failures retry up to 3× with exponential backoff; final failure marks the job failed with the exception text. Routed to the embedding queue so single-worker setups must consume it. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-29 06:26:48 -04:00
Robert Helewka	33658fbc8d	feat(library): add business + finance types, workspace_id, IngestJob Adds two new content-type-aware library types — `business` for proposals/marketing/strategy (used by the work-team agents) and `finance` for statements/tax/market commentary (used by Garth). Each ships with chunking config, embedding/reranker instructions, an LLM-context prompt that forbids fabricating financial figures, and a vision prompt. Adds a unique-indexed `workspace_id` property to `Library` so a node can be scoped to a Daedalus workspace. Null means a global library; non-null means workspace-scoped. Search Cypher (added in a later commit) enforces the boundary. Adds an `IngestJob` Django ORM model — separate from neomodel — that tracks asynchronous ingestion lifecycle (Daedalus → S3 → Celery → embedding pipeline) with idempotency on (library, source_ref, hash). Migration 0001_initial creates the table. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-29 06:26:26 -04:00
Robert Helewka	81426327bf	feat(mcp): store MCP tokens as SHA-256 hashes instead of plaintext Replace plaintext token storage with SHA-256 hashes so leaked database contents cannot be used to authenticate. Plaintext is generated, shown once at creation time, and never persisted. - Add `hash_token()` helper and `MCPTokenManager.create_token()` that returns `(instance, plaintext)`. - Replace `token` field with indexed `token_hash`; look up bearers by hashing the incoming value. - Update dashboard, management command, and admin to surface plaintext only at creation. Disable admin "add" since it cannot reveal plaintext. - Migration drops the old `token` column and adds `token_hash`; pre-existing tokens are invalidated and must be reissued.	2026-04-27 09:01:36 -04:00
Robert Helewka	2df22941d2	feat: replace server-side RAG with MCP retrieval primitives - Remove Phase 4 RAG pipeline in favor of retrieval-only architecture - Add FastMCP server exposing search, get_chunk, list_libraries tools - Mount MCP endpoints (streamable HTTP + SSE) via Starlette in ASGI config - Update README to clarify Mnemosyne is a retrieval engine, not RAG - Let calling LLMs drive synthesis and iterative retrieval themselves	2026-04-26 15:34:26 -04:00
Robert Helewka	388b37e471	fix(search): require library match and preserve raw scores for RRF Replace OPTIONAL MATCH with MATCH for Library-Collection-Item paths to ensure results are properly scoped to libraries, and remove per-query score normalization since RRF fuses results by rank rather than score magnitude.	2026-04-26 06:35:11 -04:00
Robert Helewka	4a35aa126f	refactor(settings): replace DATABASE_URL with explicit DB env vars Replace the single `DATABASE_URL` connection string with individual environment variables (`APP_DB_NAME`, `APP_DB_USER`, `APP_DB_PASSWORD`, `DB_HOST`, `DB_PORT`) for more granular database configuration control.	2026-04-13 10:23:03 +00:00
Robert Helewka	634845fee0	feat: add Phase 3 hybrid search with Synesis reranking Implement hybrid search pipeline combining vector, fulltext, and graph search across Neo4j, with cross-attention reranking via Synesis (Qwen3-VL-Reranker-2B) `/v1/rerank` endpoint. - Add SearchService with vector, fulltext, and graph search strategies - Add SynesisRerankerClient for multimodal reranking via HTTP API - Add search API endpoint (POST /search/) with filtering by library, collection, and library_type - Add SearchRequest/Response serializers and image search results - Add "nonfiction" to library_type choices - Consolidate reranker stack from two models to single Synesis service - Handle image analysis_status as "skipped" when analysis is unavailable - Add comprehensive tests for search pipeline and reranker client	2026-03-29 18:09:50 +00:00
Robert Helewka	fb38a881d9	Add vision model support to LLM Manager admin and rename index for clarity	2026-03-29 17:03:59 +00:00
Robert Helewka	90db904959	Add vision analysis capabilities to the embedding pipeline - Introduced a new vision analysis service to classify, describe, and extract text from images. - Enhanced the Image model with fields for OCR text, vision model name, and analysis status. - Added a new "nonfiction" library type with specific chunking and embedding configurations. - Updated content types to include vision prompts for various library types. - Integrated vision analysis into the embedding pipeline, allowing for image analysis during document processing. - Implemented metrics to track vision analysis performance and usage. - Updated UI components to display vision analysis results and statuses in item details and the embedding dashboard. - Added migration for new vision model fields and usage tracking.	2026-03-22 15:14:34 +00:00
Robert Helewka	6585beed20	Add download functionality for items and images with presigned URLs	2026-03-22 12:08:44 +00:00
Robert Helewka	1379e0d425	Add logging configuration to prevent Celery from overriding Django's logging setup	2026-03-21 13:23:56 +00:00
Robert Helewka	99bdb4ac92	Add Themis application with custom widgets, views, and utilities - Implemented custom form widgets for date, time, and datetime fields with DaisyUI styling. - Created utility functions for formatting dates, times, and numbers according to user preferences. - Developed views for profile settings, API key management, and notifications, including health check endpoints. - Added URL configurations for Themis tests and main application routes. - Established test cases for custom widgets to ensure proper functionality and integration. - Defined project metadata and dependencies in pyproject.toml for package management.	2026-03-21 02:00:18 +00:00
Robert	e99346d014	Initial commit	2026-03-18 23:01:09 +00:00

48 Commits
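
Note on commit 81426327bf (MCP tokens stored as SHA-256 hashes): the flow described there can be sketched roughly as follows. This is a minimal illustration, not the repository's code. Only the names `hash_token()` and `create_token()` and the `(instance, plaintext)` return shape come from the commit message; `InMemoryTokenStore`, `authenticate()`, and the use of `secrets.token_urlsafe` are assumptions standing in for the Django model, its manager, and the real token generator.

```python
import hashlib
import secrets
from typing import Dict, Optional, Tuple


def hash_token(plaintext: str) -> str:
    """Return the hex SHA-256 digest of a bearer token."""
    return hashlib.sha256(plaintext.encode("utf-8")).hexdigest()


class InMemoryTokenStore:
    """Hypothetical stand-in for the MCPToken table: only hashes are kept."""

    def __init__(self) -> None:
        # Maps token_hash -> token name; plaintext is never stored.
        self._by_hash: Dict[str, str] = {}

    def create_token(self, name: str) -> Tuple[str, str]:
        """Create a token and return (token_hash, plaintext).

        The plaintext is returned once so the caller can display it;
        after that it cannot be recovered from the store.
        """
        plaintext = secrets.token_urlsafe(32)  # assumed generator
        token_hash = hash_token(plaintext)
        self._by_hash[token_hash] = name
        return token_hash, plaintext

    def authenticate(self, bearer: str) -> Optional[str]:
        """Look up a bearer by hashing the incoming value."""
        return self._by_hash.get(hash_token(bearer))


if __name__ == "__main__":
    store = InMemoryTokenStore()
    token_hash, plaintext = store.create_token("ci-agent")
    print("show once:", plaintext)
    print("authenticated as:", store.authenticate(plaintext))
    print("wrong bearer:", store.authenticate("not-a-token"))
```

Because lookup is by hash, a leaked copy of the stored values cannot be replayed as a bearer, which is the property the commit message states. It also explains why pre-existing plaintext tokens had to be reissued after the migration.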