Robert Helewka e0fa825189

auth: read tool name off context.message directly; trace call_next failures

In FastMCP's on_call_tool hook the middleware context is already
MiddlewareContext[CallToolRequestParams] (per fastmcp's own
middleware.py:158), so tool name lives at context.message.name, not
at context.message.params.name — the latter always returned None,
silently breaking the PUBLIC_TOOLS bypass for get_health and making
the per-tool ACL short-circuit.

Also wrap call_next in a traced helper that logs any exception with
a full traceback and logs the success-path result type.  During the
Pallas↔Mnemosyne shakedown the tool results were coming back to
fast-agent as the literal string "object NoneType can't be used in
'await' expression" with no trace in either process — that's Python's
TypeError for 'await X' where X is None.  If that TypeError is raised
inside FastMCP dispatch we want the frame in Mnemosyne's own log
rather than having Pallas's aggregator turn it into a terse
CallToolResult(isError=True) with no stack.

2026-05-06 19:47:52 -04:00

Mnemosyne

"The electric light did not come from the continuous improvement of candles." — Oren Harari

The memory of everything you know.

Mnemosyne is a content-type-aware, multimodal personal knowledge management system built on Neo4j knowledge graphs and Qwen3-VL multimodal AI models. Named after the Titan goddess of memory and mother of the nine Muses, Mnemosyne doesn't just store your knowledge — it understands what kind of knowledge it is, connects it through relationships, and makes it all searchable through text, images, and natural language.

What Makes This Different

Every existing knowledge base tool treats all documents identically: text in, chunks out, vectors stored. A novel and a PostgreSQL manual get the same treatment.

Mnemosyne knows the difference:

A textbook has chapters, an index, technical terminology, and pedagogical structure. It's chunked accordingly, and when an LLM retrieves results, it knows this is instructional content.
A novel has narrative flow, characters, plot arcs, dialogue. The LLM knows to interpret results as creative fiction.
Album artwork is a visual asset tied to an artist, genre, and era. It's embedded multimodally — searchable by both image similarity and text description.
A journal entry is personal, temporal, reflective. The LLM treats it differently than a reference manual.

This content-type awareness flows through every layer: chunking strategy, embedding instructions, re-ranking, and the final LLM prompt.

Core Architecture

Component	Technology	Purpose
Knowledge Graph	Neo4j 5.x	Relationships + vector storage (no dimension limits)
Multimodal Embeddings	Qwen3-VL-Embedding-8B	Text + image + video in unified vector space (4096d)
Multimodal Re-ranking	Synesis (Qwen3-VL-Reranker-2B)	Cross-attention precision scoring via `/v1/rerank`
Web Framework	Django 5.x + DRF	Auth, admin, API, content management
Object Storage	S3/MinIO	Original content + chunk text storage
Async Processing	Celery + RabbitMQ	Document embedding, graph construction
LLM Interface	MCP Server	Primary interface for Claude, Copilot, etc.
GPU Serving	vLLM + llama.cpp	Local model inference

Library Types

Library	Example Content	Multimodal?	Graph Relationships
Fiction	Novels, short stories	Cover art	Author → Book → Character → Theme
Nonfiction	History, biography, science writing	Photos, charts	Author → Work → Topic → Person/Place
Technical	Textbooks, manuals, docs	Diagrams, screenshots	Product → Manual → Section → Procedure
Music	Lyrics, liner notes	Album artwork	Artist → Album → Track → Genre
Film	Scripts, synopses	Stills, posters	Director → Film → Scene → Actor
Art	Descriptions, catalogs	The artwork itself	Artist → Piece → Style → Movement
Journal	Personal entries, plans, observations	Photos	Date → Entry → Topic → Person/Place
Business	Proposals, marketing, strategy	Logos, charts	Client → Engagement → Deliverable
Finance	Statements, tax, market commentary	Charts, statement scans	Account → Instrument → Period

Search Pipeline

Query → Vector Search (Neo4j) + Graph Traversal (Cypher) + Full-Text Search
  → Candidate Fusion → Qwen3-VL Re-ranking → Ranked Chunks + Metadata
    → MCP tool result (the calling LLM does its own synthesis)
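
The README does not say how the Candidate Fusion step combines the three candidate lists. As one plausible illustration, the sketch below uses reciprocal rank fusion (RRF), a common technique chosen here as an assumption rather than taken from Mnemosyne's code. The function name, the chunk IDs, and the constant `k=60` are all hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse several ranked lists of chunk IDs into one ranking.

    Each list contributes 1 / (k + rank) per chunk, so a chunk that
    appears near the top of several lists outranks one that appears
    in only a single list.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by chunk ID for determinism.
    return sorted(scores, key=lambda chunk_id: (-scores[chunk_id], chunk_id))


# Hypothetical candidates from the three retrieval paths.
vector_hits = ["c1", "c2", "c3"]
graph_hits = ["c2", "c4"]
fulltext_hits = ["c2", "c1"]

fused = reciprocal_rank_fusion([vector_hits, graph_hits, fulltext_hits])
print(fused)  # ['c2', 'c1', 'c4', 'c3']
```

In a pipeline shaped like the one above, the fused list would then be handed to the re-ranker, which scores each candidate against the query.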

Heritage

Mnemosyne's retrieval architecture is inspired by Spelunker, an enterprise RFP response platform. Its proven patterns (hybrid search, citation-based retrieval, and async document processing) are carried forward and enhanced with multimodal capabilities and knowledge graph relationships. Spelunker's two-stage RAG (responder + reviewer) is not part of Mnemosyne itself, which returns ranked chunks and leaves synthesis to the calling LLM.

Running Mnemosyne

Mnemosyne runs as three cooperating processes: the Django web app (REST API + admin), the MCP server (LLM-facing tools), and one or more Celery workers (async embedding + ingest). All three read configuration from mnemosyne/.env (copy from mnemosyne/.env example and fill in secrets).

Hosts in the Ouranos lab:

Postgres: portia.incus:5432 (Django ORM: users, IngestJob)
Neo4j: umbriel.incus:7687 (Bolt; knowledge graph + vectors; dedicated instance, see note below; HTTP Browser on umbriel.incus:25555)
RabbitMQ: oberon.incus:5672 (Celery broker)
MinIO: nyx.helu.ca:8555 (S3-compatible; mnemosyne-content and daedalus buckets)
Memcached: 127.0.0.1:11211 (task progress)

Neo4j must be dedicated to Mnemosyne. Don't share the instance with Spelunker or any other graph workload. Mnemosyne owns the Library, Collection, Item, Chunk, and Concept labels and runs its own indexes (chunk_embedding_index, full-text indexes per library_type) and schema migrations (setup_neo4j_indexes, load_library_types). The Phase-1 workspace-delete path runs label-scoped DETACH DELETE over those labels, and a workspace_id-scoped subgraph is the unit of isolation; both assume single-tenancy.

A shared instance risks:

(1) label/property collisions corrupting the other tenant's graph,
(2) vector-index memory contention degrading search latency for both apps,
(3) management commands mutating schema another tenant depends on, and
(4) backup/restore that can't be reasoned about per-app.

Neo4j Community Edition is sufficient: the multi-database feature is Enterprise-only, so isolation has to come from running a separate server process. Run a dedicated instance per environment (one for staging, one for production); point each via NEOMODEL_NEO4J_BOLT_URL in that environment's mnemosyne/.env.

One-time setup

cd mnemosyne/
python manage.py migrate                       # Apply Django ORM migrations
python manage.py setup_neo4j_indexes           # Create Neo4j vector + full-text indexes
python manage.py load_library_types            # Load LIBRARY_TYPE_DEFAULTS into Neo4j

Start the web app

The Django REST API serves /library/api/* (libraries, collections, items, search, workspaces, ingest) and Django admin. Use Gunicorn in production; runserver for dev.

cd mnemosyne/

# Development
python manage.py runserver 0.0.0.0:8000

# Production
gunicorn --bind 0.0.0.0:8000 --workers 3 mnemosyne.wsgi:application

Start the MCP server

The MCP server exposes the LLM-facing tools (search, get_chunk, list_libraries, list_collections, list_items, get_health) over Streamable HTTP at /mcp and SSE at /mcp/sse. Run it as a separate Uvicorn process on its own port, so it can be reverse-proxied or scaled independently of the Django app.

cd mnemosyne/

# Single command: ASGI server hosting the FastMCP app
uvicorn mnemosyne.asgi:app --host 0.0.0.0 --port 22091 --workers 1

The mcp_server/asgi.py mounts FastMCP at /mcp (Streamable HTTP) and /mcp/sse (SSE), with a /mcp/health JSON probe for HAProxy/Pallas.

Start a Celery worker

The commands below start either a single worker that consumes every queue (development) or one focused worker per queue (production). Daedalus depends on the embedding queue, where its ingest task lives.

cd mnemosyne/

# Development — one worker, all queues
celery -A mnemosyne worker -l info -Q celery,embedding,batch

# Production — embedding queue (handles Daedalus ingest + embed_item)
celery -A mnemosyne worker -l info -Q embedding -c 1 -n embedding@%h

# Production — batch queue (collection/library bulk operations)
celery -A mnemosyne worker -l info -Q batch -c 2 -n batch@%h

# Production — default queue (LLM validation, misc)
celery -A mnemosyne worker -l info -Q celery -c 2 -n default@%h

Daedalus's POST /library/api/ingest/ dispatches library.tasks.ingest_from_daedalus to the embedding queue. If you only run one worker, make sure it consumes embedding or that task will sit in the broker.

To bypass workers in dev/test, set CELERY_TASK_ALWAYS_EAGER=True in .env.

Scheduler & monitoring (optional):

celery -A mnemosyne beat -l info            # Periodic task scheduler
celery -A mnemosyne flower --port=5555      # Web monitoring UI

See Phase 2: Celery Workers & Scheduler for queue tuning, reliability settings, and task progress tracking.

Daedalus integration endpoints

These endpoints are used by the Daedalus FastAPI backend (HTTP Basic auth). All under /library/api/:

Method	Route	Purpose
POST	`/workspaces/`	Create a workspace (idempotent on `workspace_id`); body: `{workspace_id, name, library_type, description?}`
GET	`/workspaces/{workspace_id}/`	Workspace status (item/chunk counts)
DELETE	`/workspaces/{workspace_id}/`	Delete workspace + reachable content; preserves shared concepts
POST	`/ingest/`	Queue a file for ingestion + embedding
GET	`/jobs/{job_id}/`	Poll ingest job status
POST	`/jobs/{job_id}/retry/`	Re-dispatch a failed job
GET	`/jobs/?status=&library_uid=`	List recent jobs

See docs/mnemosyne_integration.md for the full Daedalus contract.
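
As a rough illustration of the first row in the table, the sketch below builds (but does not send) the `POST /workspaces/` request using only the standard library. The base URL, the credentials, the `"technical"` value for `library_type`, and the JSON content type are assumptions made for the example; the authoritative contract is in docs/mnemosyne_integration.md.

```python
import base64
import json
import urllib.request


def build_create_workspace_request(
    api_root, username, password, workspace_id, name, library_type, description=None
):
    """Build the POST /workspaces/ request with HTTP Basic auth."""
    body = {
        "workspace_id": workspace_id,
        "name": name,
        "library_type": library_type,
    }
    if description is not None:
        body["description"] = description  # optional field per the table

    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")

    return urllib.request.Request(
        url=f"{api_root.rstrip('/')}/workspaces/",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",  # assumed content type
        },
        method="POST",
    )


# Hypothetical host and credentials, for illustration only.
request = build_create_workspace_request(
    "http://mnemosyne.example.internal/library/api",
    "daedalus",
    "change-me",
    workspace_id="ws-demo",
    name="Demo workspace",
    library_type="technical",
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it. Because the endpoint is described as idempotent on `workspace_id`, repeating the call with the same ID should be safe.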

Production Deployment

Production runs as four containers from a single image (built and pushed by .gitea/workflows/cve-scan-docker-build.yml on every push to main):

Service	Role	Port
`app`	Django REST API + admin (gunicorn)	internal :8000
`mcp`	FastMCP server (uvicorn)	internal :22091
`worker`	Celery worker — embedding/ingest/batch	—
`web`	Reverse proxy + static files (nginx)	host :23090

Plus a one-shot static-init service that copies /app/staticfiles (baked into the image at build time via collectstatic) into the shared volume nginx reads from. It runs to completion on every up, so static-file changes propagate on each deploy without manual intervention.

External services (NOT spun up by compose): Postgres on Portia, Neo4j on Umbriel (dedicated Mnemosyne instance), RabbitMQ on Oberon, S3/MinIO on Nyx, Memcached, embedder + reranker. All reached over the internal 10.10.0.0/24 network.

Environment scoping

Each compose service declares only the environment variables it actually needs — there is no shared env_file:. The rationale:

The MCP server (the most exposed surface, because it talks to outside LLMs) should never see the Celery broker URL or the LLM API encryption key. It only needs Postgres, Neo4j, Memcached, S3, and the MCP-specific auth toggle.
The Celery worker has no business knowing ALLOWED_HOSTS, CSRF_TRUSTED_ORIGINS, MCP_REQUIRE_AUTH, or the email backend — it doesn't serve HTTP.
The Django app doesn't need the Daedalus S3 credentials — only the ingest Celery task reads that bucket.
When a shared secret (like the broker password) is mis-configured, the blast radius is limited to the services that actually need that secret, so you can still observe the rest of the stack while debugging.

Values are interpolated from a .env file at the repo root (not mnemosyne/.env, which is the dev config for bare-Python runs). Copy .env.example to .env and fill in the blanks, or — in production — have your Ansible role render .env from a Jinja2 template with secrets from the vault.

cp .env.example .env
$EDITOR .env       # fill in SECRET_KEY, DB/RabbitMQ/S3 creds, LLM_API_SECRETS_ENCRYPTION_KEY

The per-service surface is defined by the environment: blocks in docker-compose.yaml; .env.example documents every variable with which service(s) consume it.

Broker URL gotcha. If the RabbitMQ password contains any of @ : / # % + ? & = or a space, it must be percent-encoded in CELERY_BROKER_URL. Kombu's URL parser is strict, and this is the most common cause of a PLAIN 403 ACCESS_REFUSED at worker startup when the same credentials work fine under bare-Python celery invocations (because you were probably passing them as kwargs, not a URL).
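
One way to avoid hand-encoding the password is to build the URL in Python. The sketch below is illustrative: the helper name, user, host, and sample password are made up, and the virtual host path is left out for brevity.

```python
from urllib.parse import quote, unquote, urlsplit


def build_broker_url(user, password, host, port=5672):
    """Return an amqp:// URL with the user and password percent-encoded."""
    # safe="" also encodes "/", which quote() leaves alone by default.
    user_q = quote(user, safe="")
    password_q = quote(password, safe="")
    return f"amqp://{user_q}:{password_q}@{host}:{port}"


# Hypothetical credentials containing several URL-special characters.
url = build_broker_url("mnemosyne", "p@ss:w/rd#1", "oberon.incus")
print(url)  # amqp://mnemosyne:p%40ss%3Aw%2Frd%231@oberon.incus:5672

# Round-trip check: a URL parser recovers the host and original password.
parts = urlsplit(url)
print(parts.hostname, unquote(parts.password))
```

The printed value is what would go into CELERY_BROKER_URL in the root .env.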

First-time bring-up

# Generate the root .env from the template (or let Ansible do it)
cp .env.example .env && $EDITOR .env

# Pull the image (or build locally with `docker compose build`)
docker compose pull

# DB migrations (one-shot)
docker compose run --rm app migrate

# Neo4j indexes + library_type defaults (one-shot)
docker compose run --rm app setup

# Bring the stack up
docker compose up -d

Day-to-day

docker compose ps                  # service status + health
docker compose logs -f app         # tail Django app logs
docker compose logs -f web         # tail nginx logs
docker compose logs -f worker      # tail Celery worker logs
docker compose restart mcp         # restart just the MCP server

# After a new image is published:
docker compose pull && docker compose up -d

Things to verify in `.env` before bringing up

The root .env (the one compose interpolates from — not mnemosyne/.env) needs the following set for a working production deploy:

DEBUG=False
USE_LOCAL_STORAGE=False
KVDB_LOCATION=<external-memcached-host>:11211 (inside a container, 127.0.0.1 is the container's own loopback, so the dev default does not reach Memcached)
AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY filled in (Mnemosyne's own MinIO bucket)
DAEDALUS_S3_ACCESS_KEY_ID / DAEDALUS_S3_SECRET_ACCESS_KEY filled in for cross-bucket ingest reads
CELERY_BROKER_URL with the RabbitMQ password percent-encoded if it contains URL-special characters
ALLOWED_HOSTS includes the public hostname HAProxy routes to (e.g. mnemosyne.ouranos.helu.ca)
CSRF_TRUSTED_ORIGINS includes https://<same-hostname>
LLM_API_SECRETS_ENCRYPTION_KEY set to a real Fernet key (generated once per environment)
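
For the last item, a Fernet key is 32 random bytes encoded as URL-safe base64. The sketch below generates one with the standard library only. The function name is hypothetical, and if the `cryptography` package is installed, `Fernet.generate_key()` should produce a key of the same form.

```python
import base64
import os


def generate_fernet_key():
    """Return a new Fernet-format key: 32 random bytes, URL-safe base64."""
    return base64.urlsafe_b64encode(os.urandom(32)).decode("ascii")


key = generate_fernet_key()
print(key)  # 44 characters, ending in "="
```

Generate the key once per environment and store it with the other secrets; data encrypted under a key cannot be decrypted after that key is replaced.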

Verifying the environment reached a container

If a service misbehaves on startup — typically the worker with an AccessRefused from RabbitMQ, or the app with a DB auth error — the fastest diagnostic is to print what Django actually parsed, since that removes every layer of env-file / interpolation / URL-encoding ambiguity:

# What broker URL did the worker actually receive?
docker compose run --rm --no-deps worker \
    python -c "from django.conf import settings; print(repr(settings.CELERY_BROKER_URL))"

# What DB host/user?
docker compose run --rm --no-deps app \
    python -c "from django.conf import settings; print(settings.DATABASES['default'])"

The repr(...) form surfaces CRLF, trailing whitespace, stray quotes, or characters that should have been percent-encoded.
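
To show what `repr` makes visible, the sketch below flags the same kinds of problems programmatically. The function name and the sample values are hypothetical and are not part of Mnemosyne.

```python
def describe_env_value(value):
    """Return a list of likely problems in an environment value."""
    problems = []
    if "\r" in value or "\n" in value:
        problems.append("contains CR/LF")
    if value != value.strip():
        problems.append("leading or trailing whitespace")
    if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
        problems.append("wrapped in literal quotes")
    return problems


# A value read from a .env file saved with Windows line endings.
crlf_value = "amqp://user:secret@oberon.incus:5672\r"
print(crlf_value)        # the trailing carriage return is invisible
print(repr(crlf_value))  # the \r is visible in the repr output
print(describe_env_value(crlf_value))
```

A plain `print` hides the trailing carriage return, which is why the diagnostic commands above wrap the setting in `repr(...)`.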

Health probes

Endpoint	Probes	Auth
`GET /live/`	Django process alive (always 200 if gunicorn is up)	None
`GET /ready/`	PostgreSQL + Memcached reachable (503 if either is down)	None
`GET /healthz`	MCP server `/mcp/health` — used as the HAProxy `health_path`	None
`GET /metrics`	Prometheus scrape	Internal networks only

Trailing slashes matter. Always use /live/ and /ready/ (with the trailing slash). The un-slashed forms (/live, /ready) trigger Django's APPEND_SLASH 301 redirect — health check clients that don't follow redirects will report a failure even when the service is healthy.
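
The effect can be reproduced without Django. The sketch below starts a small stand-in server that answers `/ready` with a 301 and `/ready/` with a 200, then probes both with `http.client`, which does not follow redirects. The handler and function names are hypothetical, and the stand-in only imitates the redirect behavior described above.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class AppendSlashStandIn(BaseHTTPRequestHandler):
    """Stand-in server: 301 for /ready, 200 for /ready/, 404 otherwise."""

    def do_GET(self):
        if self.path == "/ready/":
            self.send_response(200)
        elif self.path == "/ready":
            self.send_response(301)
            self.send_header("Location", "/ready/")
        else:
            self.send_response(404)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def probe_status(host, port, path, timeout=2.0):
    """Issue one GET without following redirects; return the status code."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        response.read()
        return response.status
    finally:
        conn.close()


def run_demo():
    """Probe both path forms against the stand-in server."""
    server = HTTPServer(("127.0.0.1", 0), AppendSlashStandIn)  # port 0: any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        return {
            path: probe_status("127.0.0.1", port, path)
            for path in ("/ready", "/ready/")
        }
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(run_demo())  # {'/ready': 301, '/ready/': 200}
```

A health checker that treats only 200 as healthy would mark the un-slashed probe as failing even though the service is up.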

Architecture Note: Retrieval, Not Synthesis

Mnemosyne is a retrieval engine, not a RAG pipeline. It stores, embeds, and ranks — it does not synthesize answers.

The earlier roadmap had a server-side RAG layer that took a query and returned a written answer with citations. That layer has been removed. Calling LLMs (Claude via MCP, principally) are perfectly capable of driving iterative retrieval themselves when given the right primitives, and a server-side synthesis hop adds latency, cost, and a place where errors are harder to debug. Letting the calling LLM see chunks directly — and follow citations, pivot mid-search, or call get_chunk for full text — beats pre-digesting them.

If a "knowledge subagent" is ever wanted (a wrapper that takes a question and returns a written answer), it lives outside Mnemosyne as a thin client over the MCP tools, with its own system prompt. No coupling, no extra inference hop inside the server, and the subagent's behavior can iterate independently.

Documentation

Architecture Documentation — Full system architecture with diagrams
Phase 1: Foundation — Project skeleton, Neo4j data model, content-type system
Phase 2: Embedding Pipeline — Qwen3-VL multimodal embedding
Phase 3: Search & Re-ranking — Hybrid search + re-ranker
Phase 5: MCP Server — Retrieval primitives for LLMs (search, get_chunk, list_libraries, …)
Phase 6: Backport to Spelunker — Proven patterns flowing back
