# Mnemosyne — Ansible Deployment Reference

This document gives the Ansible author everything needed to write and maintain the Mnemosyne deployment role. All implementation decisions are already locked in `docker-compose.yaml` and `nginx/mnemosyne.conf`; this document explains the *why* behind each decision and provides the authoritative list of variables, one-time steps, and verification checks.

---

## 1. Host & Stack Overview

| Item | Value |
|------|-------|
| Deploy target | `puck.incus` (Incus container, 10.10.0.0/24) |
| Compose project directory | `/srv/mnemosyne` |
| Image registry | `git.helu.ca/r/mnemosyne:latest` |
| Public host port | **23181** (nginx → HAProxy on Titania → `https://mnemosyne.ouranos.helu.ca`) |
| Internal app port | `app:8000` (Django/gunicorn) |
| Internal MCP port | `mcp:8001` (FastMCP/uvicorn) |

The four compose services (`app`, `mcp`, `worker`, `web`) all run from the same image. A one-shot `static-init` service seeds the nginx static-file volume on every `up` so static-file changes propagate automatically on deploy without manual intervention.

---

## 2. External Dependencies (NOT managed by this role)

These services must exist before Mnemosyne can start. The role only consumes credentials; it does not provision these hosts.

| Service | Host | Notes |
|---------|------|-------|
| PostgreSQL | `portia.incus:5432` | Database `mnemosyne`, user `mnemosyne` |
| Neo4j | `umbriel.incus:7687` | Bolt protocol. **Must be dedicated to Mnemosyne** — do not share with Spelunker or any other graph workload (see README §Note on Neo4j). HTTP browser on `umbriel.incus:25555`. |
| RabbitMQ | `oberon.incus:5672` | vhost `mnemosyne`, user `mnemosyne` |
| MinIO (Mnemosyne bucket) | `nyx.helu.ca:8555` | Bucket `mnemosyne-content`. Credentials scoped read+write. |
| MinIO (Daedalus bucket) | `nyx.helu.ca:8555` | Bucket `daedalus`. **Read-only** cross-bucket credentials for the ingest worker. |
| Memcached | `oberon.incus:11211` | Shared; prefix `mnemosyne` avoids collisions. |
| Embedder (Qwen3-VL-Embedding) | Configured via `EMBEDDING_*` vars in settings | GPU host on Nyx; not managed here. |
| Reranker (Synesis) | Configured via `RERANKER_*` vars in settings | GPU host on Nyx; not managed here. |

---

## 3. Role Tasks

### 3.1 Directory & file layout

```
/srv/mnemosyne/
├── docker-compose.yaml     ← copied from repo (or symlinked via git pull)
├── nginx/
│   └── mnemosyne.conf      ← copied from repo nginx/mnemosyne.conf
└── .env                    ← rendered from Jinja2 template + vault secrets
```

The role should:

1. Create `/srv/mnemosyne/` and `nginx/` (owner: `root`, mode `0750`).
2. Render `.env` from the vault-sourced Jinja2 template (mode `0600`, owner `root`).
3. Copy (or `git pull`) `docker-compose.yaml` and `nginx/mnemosyne.conf` from the repo.

### 3.2 Pull & start

```yaml
- name: Pull latest image
  community.docker.docker_compose_v2:
    project_src: /srv/mnemosyne
    pull: always

- name: Bring stack up
  community.docker.docker_compose_v2:
    project_src: /srv/mnemosyne
    state: present
```

This triggers `static-init` automatically on every `up` — no separate handler needed.

### 3.3 One-time setup (run once on first deploy, idempotent thereafter)

These management commands are safe to re-run; they do nothing if the target state already exists. Run them as a post-start task gated on a `creates:` sentinel or an explicit `when: mnemosyne_first_deploy` flag.

```bash
# Apply Django ORM migrations (PostgreSQL schema)
docker compose -f /srv/mnemosyne/docker-compose.yaml \
  run --rm app migrate

# Create Neo4j vector + full-text indexes and load library-type defaults
docker compose -f /srv/mnemosyne/docker-compose.yaml \
  run --rm app setup

# Create the daedalus-service user (HTTP Basic auth for ingest API)
# Pass --password from vault; idempotent if user already exists.
docker compose -f /srv/mnemosyne/docker-compose.yaml \
  run --rm app \
  python manage.py ensure_service_user \
    --username daedalus-service \
    --password "{{ vault_mnemosyne_daedalus_service_password }}"

# Seed the MCPSigningKey used to sign long-lived Pallas team JWTs.
# --retire-other deactivates any previously-active key. The hex
# emitted to stdout is persisted in Mnemosyne's database and is
# not re-injected from the vault — no operator action required
# beyond running this command once per fresh deployment.
docker compose -f /srv/mnemosyne/docker-compose.yaml \
  run --rm app \
  python manage.py seed_signing_key --kid daedalus-1 --retire-other
```

The `seed_signing_key` command prints the generated secret once to stdout — it is safe to discard that output after the command succeeds. Mnemosyne persists the active key inside `MCPSigningKey` and reads it directly when minting each team JWT; Daedalus never sees this value. To rotate, re-run the command with `--retire-other` and then rotate every Pallas team JWT via the Daedalus admin UI so consumers pick up bearers signed with the new key.

---

## 4. Environment Variables (`.env` template)

All variables are consumed by `docker-compose.yaml` for interpolation into the relevant service `environment:` blocks. The per-service scoping is defined in `docker-compose.yaml`; the `.env` file just provides values.

### Django core — `app`, `mcp`, `worker`

| Variable | Example / default | Notes |
|----------|-------------------|-------|
| `SECRET_KEY` | `{{ vault_mnemosyne_secret_key }}` | Fernet-safe; never rotate without re-encrypting stored API keys first |
| `DEBUG` | `False` | |
| `TIME_ZONE` | `UTC` | |
| `LANGUAGE_CODE` | `en-us` | |

### HTTP surface — `app` (CSRF), `app` + `mcp` (ALLOWED_HOSTS)

| Variable | Example |
|----------|---------|
| `ALLOWED_HOSTS` | `localhost,127.0.0.1,mnemosyne.ouranos.helu.ca` |
| `CSRF_TRUSTED_ORIGINS` | `https://mnemosyne.ouranos.helu.ca` |

### PostgreSQL — `app`, `mcp`, `worker`

| Variable | Example |
|----------|---------|
| `APP_DB_NAME` | `mnemosyne` |
| `APP_DB_USER` | `mnemosyne` |
| `APP_DB_PASSWORD` | `{{ vault_mnemosyne_db_password }}` |
| `DB_HOST` | `portia.incus` |
| `DB_PORT` | `5432` |

### Neo4j — `app`, `mcp`, `worker`

| Variable | Example |
|----------|---------|
| `NEOMODEL_NEO4J_BOLT_URL` | `bolt://neo4j:{{ vault_neo4j_password }}@umbriel.incus:7687` |

> **URL-encode the password** if it contains `@ : / # % + ? & =` or a space.
> The Bolt URL parser is strict.

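
The URL-encoding note above can be checked offline before a value goes into the vault. The following is a minimal sketch, assuming the default `neo4j` user and the host and port from the table; the function name `bolt_url` and the sample password are hypothetical and exist only for illustration.

```python
from urllib.parse import quote


def bolt_url(password: str, user: str = "neo4j",
             host: str = "umbriel.incus", port: int = 7687) -> str:
    """Build a Bolt URL with the password percent-encoded.

    safe="" matters here: quote() leaves "/" unescaped by default,
    and "/" is one of the characters the note above lists.
    """
    return f"bolt://{user}:{quote(password, safe='')}@{host}:{port}"


# Hypothetical password, for illustration only.
print(bolt_url("p@ss/w:rd#1"))
```

If a different helper or template filter does the encoding in practice, it is worth confirming that it also escapes `/`, since Python's own default does not.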

### Memcached — `app`, `mcp`, `worker`

| Variable | Example |
|----------|---------|
| `KVDB_LOCATION` | `oberon.incus:11211` |
| `KVDB_PREFIX` | `mnemosyne` |

### S3 / MinIO (Mnemosyne bucket) — `app`, `mcp`, `worker`

| Variable | Example |
|----------|---------|
| `USE_LOCAL_STORAGE` | `False` |
| `AWS_ACCESS_KEY_ID` | `{{ vault_mnemosyne_s3_key }}` |
| `AWS_SECRET_ACCESS_KEY` | `{{ vault_mnemosyne_s3_secret }}` |
| `AWS_STORAGE_BUCKET_NAME` | `mnemosyne-content` |
| `AWS_S3_ENDPOINT_URL` | `https://nyx.helu.ca:8555` |
| `AWS_S3_USE_SSL` | `True` |
| `AWS_S3_VERIFY` | `False` (self-signed cert on Nyx) |
| `AWS_S3_REGION_NAME` | `us-east-1` |

### Daedalus S3 (cross-bucket reads) — `worker` only

| Variable | Example |
|----------|---------|
| `DAEDALUS_S3_ENDPOINT_URL` | `https://nyx.helu.ca:8555` |
| `DAEDALUS_S3_ACCESS_KEY_ID` | `{{ vault_daedalus_s3_read_key }}` |
| `DAEDALUS_S3_SECRET_ACCESS_KEY` | `{{ vault_daedalus_s3_read_secret }}` |
| `DAEDALUS_S3_BUCKET_NAME` | `daedalus` |
| `DAEDALUS_S3_REGION_NAME` | `us-east-1` |
| `DAEDALUS_S3_USE_SSL` | `True` |
| `DAEDALUS_S3_VERIFY` | `True` |

### Celery / RabbitMQ — `app` (producer), `worker` (consumer)

| Variable | Example |
|----------|---------|
| `CELERY_BROKER_URL` | `amqp://mnemosyne:{{ vault_rabbitmq_password \| urlencode }}@oberon.incus:5672/mnemosyne` |
| `CELERY_RESULT_BACKEND` | `rpc://` |
| `CELERY_TASK_ALWAYS_EAGER` | `False` |

> **Percent-encode** the RabbitMQ password in the broker URL if it contains any
> URL-special characters. Use Ansible's `urlencode` filter or pre-encode in the
> vault variable. An unencoded password is the most common cause of
> `PLAIN 403 ACCESS_REFUSED` at worker startup.

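
To see why an unencoded password breaks the broker URL, it can help to parse the rendered value back into its parts. This is a rough sketch using only the standard library; `broker_url`, `check_broker_url`, and the sample password are hypothetical names and values, and the check says nothing about how Celery or RabbitMQ parse the URL internally.

```python
from urllib.parse import quote, unquote, urlsplit


def broker_url(password: str) -> str:
    """Assemble the AMQP URL shape from the table with an encoded password."""
    return f"amqp://mnemosyne:{quote(password, safe='')}@oberon.incus:5672/mnemosyne"


def check_broker_url(url: str, expected_password: str) -> None:
    """Raise AssertionError if the URL does not parse back to the expected parts."""
    parts = urlsplit(url)
    assert parts.scheme == "amqp"
    assert parts.username == "mnemosyne"
    # urlsplit returns the password still percent-encoded, so decode it first.
    assert unquote(parts.password or "") == expected_password
    assert parts.hostname == "oberon.incus"
    assert parts.port == 5672
    assert parts.path == "/mnemosyne"  # the vhost


# Hypothetical password containing "@" and "/".
check_broker_url(broker_url("p@ss/word"), "p@ss/word")

# The same password left raw shifts the host boundary to the wrong place.
raw = urlsplit("amqp://mnemosyne:p@ss/word@oberon.incus:5672/mnemosyne")
print(raw.hostname)
```

With the raw password, the parser ends the authority section at the first `/`, so the broker host is no longer `oberon.incus` and the credentials are truncated.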

### Worker tuning — `worker` only

| Variable | Default | Notes |
|----------|---------|-------|
| `CELERY_QUEUES` | `celery,embedding,batch` | Override per host for dedicated queue workers |
| `CELERY_CONCURRENCY` | `2` | Number of worker processes |

### MCP server — `mcp` only

| Variable | Production value |
|----------|-----------------|
| `MCP_REQUIRE_AUTH` | `True` |

### LLM API encryption — `app`, `worker`

| Variable | Notes |
|----------|-------|
| `LLM_API_SECRETS_ENCRYPTION_KEY` | Fernet key. Generate once: `python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())"`. Never rotate without re-encrypting all stored provider keys first. |

### Email — `app` only

| Variable | Example |
|----------|---------|
| `EMAIL_HOST` | `oberon.incus` |
| `EMAIL_PORT` | `22025` |
| `EMAIL_USE_TLS` | `False` |

### Embedding pipeline — `worker` only

| Variable | Default |
|----------|---------|
| `EMBEDDING_BATCH_SIZE` | `8` |
| `EMBEDDING_TIMEOUT` | `120` |

### Search & re-ranker — `app`, `mcp`

| Variable | Default |
|----------|---------|
| `SEARCH_VECTOR_TOP_K` | `50` |
| `SEARCH_FULLTEXT_TOP_K` | `30` |
| `SEARCH_GRAPH_MAX_DEPTH` | `2` |
| `SEARCH_RRF_K` | `60` |
| `SEARCH_DEFAULT_LIMIT` | `20` |
| `RERANKER_MAX_CANDIDATES` | `32` |
| `RERANKER_TIMEOUT` | `30` |

### Logging — `app`, `mcp`, `worker`

| Variable | Default |
|----------|---------|
| `LOGGING_LEVEL` | `INFO` |
| `DJANGO_LOGGING_LEVEL` | `WARNING` |
| `CELERY_LOGGING_LEVEL` | `INFO` |

---

## 5. Health Probes & Verification

After `docker compose up -d`, wait for all services to report healthy:

```bash
docker compose -f /srv/mnemosyne/docker-compose.yaml ps --all
```

Expected: `app`, `mcp`, `worker`, `web` all `healthy`; `static-init` `exited (0)`. The `--all` flag is needed because `ps` lists only running containers by default, and `static-init` has already exited by the time the stack is up.
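
The expected state above can also be evaluated mechanically from `docker compose ps --all --format json`. The sketch below is an assumption-heavy illustration: the field names `Service`, `State`, `Health`, and `ExitCode` are taken to match Compose v2 JSON output and should be verified against the installed version, and `parse_ps` and `stack_problems` are hypothetical helper names. The parser accepts both a JSON array and one object per line, since the output shape has differed between Compose releases.

```python
import json

# Services that must report "healthy" per the expectation above.
EXPECTED_HEALTHY = {"app", "mcp", "worker", "web"}


def parse_ps(output: str) -> list[dict]:
    """Parse `ps --format json` output given as a JSON array or as JSON lines."""
    text = output.strip()
    if not text:
        return []
    if text.startswith("["):
        return json.loads(text)
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def stack_problems(containers: list[dict]) -> list[str]:
    """Return a list of problems; an empty list means the stack looks ready."""
    by_service = {c.get("Service"): c for c in containers}
    problems = []
    for name in sorted(EXPECTED_HEALTHY):
        entry = by_service.get(name)
        if entry is None:
            problems.append(f"{name}: not found")
        elif entry.get("Health") != "healthy":
            problems.append(f"{name}: health={entry.get('Health')!r}")
    init = by_service.get("static-init")
    if init is None:
        problems.append("static-init: not found")
    elif init.get("State") != "exited" or init.get("ExitCode") != 0:
        problems.append(
            f"static-init: state={init.get('State')!r} exit={init.get('ExitCode')!r}"
        )
    return problems
```

One way to use this is a small wrapper that reads the command output from stdin, calls `stack_problems(parse_ps(...))`, and exits non-zero when the list is not empty, so an Ansible `until:` loop can retry on it.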

### Per-service probes

| Service | Healthcheck command | Expected |
|---------|---------------------|----------|
| `app` | `curl -f http://localhost:8000/live/` | 200 |
| `mcp` | `curl -f http://localhost:8001/mcp/health` | 200 JSON |
| `web` | `curl -f http://localhost/live/` | 200 (proxied to app) |
| `worker` | `celery -A mnemosyne inspect ping -d celery@$HOSTNAME` | `pong` |

### External checks (from inside the 10.10.0.0/24 network)

```bash
# Django liveness (via nginx)
curl -f http://puck.incus:23181/live/

# Django readiness (Postgres + Memcached)
curl -f http://puck.incus:23181/ready/

# MCP health (proxied from /healthz → mcp:8001/mcp/health)
curl -f http://puck.incus:23181/healthz

# Prometheus metrics (internal only)
curl http://puck.incus:23181/metrics | head -5
```

### Verify the daedalus-service account

```bash
# Replace <password> with the service account password from the vault
curl -u "daedalus-service:<password>" \
  https://mnemosyne.ouranos.helu.ca/library/api/workspaces/ \
  -o /dev/null -w "%{http_code}"
# Expect: 200
```

### Verify MCP connectivity (from a client with a valid MCPToken)

```bash
# Replace <token> with a valid MCPToken
curl -H "Authorization: Bearer <token>" \
  https://mnemosyne.ouranos.helu.ca/mcp/health
# Expect: {"status": "ok", ...}
```

---

## 6. Upgrade Procedure

A standard upgrade (new image pushed to `git.helu.ca/r/mnemosyne:latest`):

```bash
cd /srv/mnemosyne
docker compose pull
docker compose up -d                  # static-init re-seeds; running containers replaced
docker compose run --rm app migrate   # no-op if no new migrations
```

The `static-init` service runs to completion on every `up`, propagating static file changes without manual volume reset.

---

## 7. Rollback

```bash
# Pin to a specific digest (docker pull takes an image reference;
# docker compose pull only accepts service names)
docker pull git.helu.ca/r/mnemosyne@sha256:<digest>
# Edit docker-compose.yaml image: line to use the digest, then:
docker compose up -d
```

Alternatively, tag good images in the registry before each deploy and reference the tag.

---

## 8. HAProxy / Titania Configuration Notes

Titania terminates TLS and forwards to `puck.incus:23181`.
The nginx config preserves `X-Forwarded-Proto: https` so Django's `request.is_secure()`, secure cookies, and `build_absolute_uri()` work correctly.

The HAProxy `health_path` for this backend should be `/healthz` (not `/live/` or `/ready/`). `/healthz` short-circuits directly to the FastMCP health endpoint without touching Django, so it can confirm the MCP server is up even if Django is momentarily unhealthy.

If a check targets the Django endpoints instead and the checker does not follow redirects, use `/live/` and `/ready/` **with** the trailing slash. The un-slashed forms (`/live`, `/ready`) trigger Django's `APPEND_SLASH` 301 redirect, which such checkers report as a failure.

---

## 9. Vault Variables Summary

| Vault variable | Used in `.env` as |
|----------------|-------------------|
| `vault_mnemosyne_secret_key` | `SECRET_KEY` |
| `vault_mnemosyne_db_password` | `APP_DB_PASSWORD` |
| `vault_neo4j_password` | embedded in `NEOMODEL_NEO4J_BOLT_URL` |
| `vault_mnemosyne_s3_key` | `AWS_ACCESS_KEY_ID` |
| `vault_mnemosyne_s3_secret` | `AWS_SECRET_ACCESS_KEY` |
| `vault_daedalus_s3_read_key` | `DAEDALUS_S3_ACCESS_KEY_ID` |
| `vault_daedalus_s3_read_secret` | `DAEDALUS_S3_SECRET_ACCESS_KEY` |
| `vault_rabbitmq_password` | embedded in `CELERY_BROKER_URL` |
| `vault_mnemosyne_llm_encryption_key` | `LLM_API_SECRETS_ENCRYPTION_KEY` |
| `vault_mnemosyne_daedalus_service_password` | passed to `ensure_service_user --password` |
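
The summary table above lends itself to a quick sanity check on the rendered `.env` before the stack is started. The following sketch assumes a plain `KEY=value` file without multi-line values; `parse_env` and `env_problems` are hypothetical helpers, and the key list simply mirrors the `.env` column of the table (the service-user password is omitted because it is passed on the command line, not through `.env`).

```python
# Keys that receive a vault value, directly or embedded in a URL.
VAULT_BACKED_KEYS = [
    "SECRET_KEY",
    "APP_DB_PASSWORD",
    "NEOMODEL_NEO4J_BOLT_URL",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "DAEDALUS_S3_ACCESS_KEY_ID",
    "DAEDALUS_S3_SECRET_ACCESS_KEY",
    "CELERY_BROKER_URL",
    "LLM_API_SECRETS_ENCRYPTION_KEY",
]


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def env_problems(text: str) -> list[str]:
    """Report vault-backed keys that are missing, empty, or left unrendered."""
    values = parse_env(text)
    problems = []
    for key in VAULT_BACKED_KEYS:
        value = values.get(key, "")
        if not value:
            problems.append(f"{key}: missing or empty")
        elif "{{" in value or "}}" in value:
            problems.append(f"{key}: unrendered template expression")
    return problems
```

The check only reports key names, never values, so its output is safe to show in deployment logs.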