mnemosyne/docs/deploy.md
Robert Helewka de0d7a4317
docs(mnemosyne): update integration doc for container deployment
2026-05-04 08:56:49 -04:00

# Mnemosyne — Ansible Deployment Reference
This document gives the Ansible author everything needed to write and maintain the
Mnemosyne deployment role. All implementation decisions are already locked in
`docker-compose.yaml` and `nginx/mnemosyne.conf`; this document explains the
*why* behind each decision and provides the authoritative list of variables,
one-time steps, and verification checks.

---
## 1. Host & Stack Overview
| Item | Value |
|------|-------|
| Deploy target | `puck.incus` (Incus container, 10.10.0.0/24) |
| Compose project directory | `/opt/mnemosyne` |
| Image registry | `git.helu.ca/r/mnemosyne:latest` |
| Public host port | **23181** (nginx → HAProxy on Titania → `https://mnemosyne.ouranos.helu.ca`) |
| Internal app port | `app:8000` (Django/gunicorn) |
| Internal MCP port | `mcp:8001` (FastMCP/uvicorn) |
The four compose services (`app`, `mcp`, `worker`, `web`) all run from the same
image. A one-shot `static-init` service seeds the nginx static-file volume on
every `up`, so static-file changes propagate on each deploy without manual
intervention.

---
## 2. External Dependencies (NOT managed by this role)
These services must exist before Mnemosyne can start. The role only consumes
credentials; it does not provision these hosts.
| Service | Host | Notes |
|---------|------|-------|
| PostgreSQL | `portia.incus:5432` | Database `mnemosyne`, user `mnemosyne` |
| Neo4j | `umbriel.incus:7687` | Bolt protocol. **Must be dedicated to Mnemosyne** — do not share with Spelunker or any other graph workload (see README §Note on Neo4j). HTTP browser on `umbriel.incus:25555`. |
| RabbitMQ | `oberon.incus:5672` | vhost `mnemosyne`, user `mnemosyne` |
| MinIO (Mnemosyne bucket) | `nyx.helu.ca:8555` | Bucket `mnemosyne-content`. Credentials scoped read+write. |
| MinIO (Daedalus bucket) | `nyx.helu.ca:8555` | Bucket `daedalus`. **Read-only** cross-bucket credentials for the ingest worker. |
| Memcached | `oberon.incus:11211` | Shared; prefix `mnemosyne` avoids collisions. |
| Embedder (Qwen3-VL-Embedding) | Configured via `EMBEDDING_*` vars in settings | GPU host on Nyx; not managed here. |
| Reranker (Synesis) | Configured via `RERANKER_*` vars in settings | GPU host on Nyx; not managed here. |
---
## 3. Role Tasks
### 3.1 Directory & file layout
```
/opt/mnemosyne/
├── docker-compose.yaml     ← copied from repo (or symlinked via git pull)
├── nginx/
│   └── mnemosyne.conf      ← copied from repo nginx/mnemosyne.conf
└── .env                    ← rendered from Jinja2 template + vault secrets
```
The role should:
1. Create `/opt/mnemosyne/` and `nginx/` (owner: `root`, mode `0750`).
2. Render `.env` from the vault-sourced Jinja2 template (mode `0600`, owner `root`).
3. Copy (or `git pull`) `docker-compose.yaml` and `nginx/mnemosyne.conf` from the repo.
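
Outside Ansible, the layout steps above can be rehearsed with a few shell commands. This is a minimal sketch under stated assumptions: the function name `prepare_layout` and its two arguments are hypothetical, the `.env` source is assumed to be rendered already, and ownership is simply inherited from the invoking user (so it would be run as root on the target).

```bash
# Hypothetical helper: prepare_layout ROOT ENV_SOURCE
# Creates ROOT and ROOT/nginx with mode 0750, then installs ENV_SOURCE
# as ROOT/.env with mode 0600. Ownership follows the invoking user.
prepare_layout() {
  local root="$1" env_source="$2"
  install -d -m 0750 "$root" "$root/nginx" || return 1
  install -m 0600 "$env_source" "$root/.env"
}
```

A call such as `prepare_layout /opt/mnemosyne ./rendered.env` (the source file name is illustrative) would leave the compose file and nginx config still to be copied in.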
### 3.2 Pull & start
```yaml
- name: Pull latest image
  community.docker.docker_compose_v2:
    project_src: /opt/mnemosyne
    pull: always

- name: Bring stack up
  community.docker.docker_compose_v2:
    project_src: /opt/mnemosyne
    state: present
```
This triggers `static-init` automatically on every `up` — no separate handler needed.
### 3.3 One-time setup (run once on first deploy, idempotent thereafter)
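These management commands are intended to be safe to re-run: they do nothing if
the target state already exists. Run them as a post-start task gated on a
`creates:` sentinel or an explicit `when: mnemosyne_first_deploy` flag. Gating
matters most for `seed_signing_key`: `--retire-other` deactivates any previously
active key, so an unintended re-run could invalidate the secret already stored
in the vault.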
```bash
# Apply Django ORM migrations (PostgreSQL schema)
docker compose -f /opt/mnemosyne/docker-compose.yaml \
  run --rm app migrate

# Create Neo4j vector + full-text indexes and load library-type defaults
docker compose -f /opt/mnemosyne/docker-compose.yaml \
  run --rm app setup

# Create the daedalus-service user (HTTP Basic auth for ingest API)
# Pass --password from vault; idempotent if user already exists.
docker compose -f /opt/mnemosyne/docker-compose.yaml \
  run --rm app \
  python manage.py ensure_service_user \
  --username daedalus-service \
  --password "{{ vault_mnemosyne_daedalus_service_password }}"

# Seed the MCP signing key (for Phase 2 per-turn JWT auth)
# --retire-other deactivates any previously-active key.
# Print the secret_hex and store in vault as vault_mnemosyne_signing_secret.
docker compose -f /opt/mnemosyne/docker-compose.yaml \
  run --rm app \
  python manage.py seed_signing_key --kid daedalus-1 --retire-other
```
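
The `creates:` gating mentioned above can be pictured in plain shell. The sketch below is an illustration rather than part of the role: `run_once` and the sentinel path in the usage comment are hypothetical names. Unlike Ansible's `creates:`, which only checks whether the file exists, this helper also writes the sentinel after a successful run.

```bash
# Hypothetical helper: run_once SENTINEL COMMAND [ARGS...]
# Skips COMMAND when SENTINEL already exists; otherwise runs it and
# writes SENTINEL only if COMMAND exits successfully.
run_once() {
  local sentinel="$1"
  shift
  if [ -e "$sentinel" ]; then
    return 0
  fi
  "$@" || return $?
  : > "$sentinel"
}

# Usage (sentinel path is illustrative):
#   run_once /opt/mnemosyne/.setup-done \
#     docker compose -f /opt/mnemosyne/docker-compose.yaml run --rm app setup
```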
The `seed_signing_key` command prints the generated secret to stdout once only.
Capture it and store it in the vault. The Daedalus role reads this secret from the
same vault variable to mint per-turn tokens (Phase 2).

---
## 4. Environment Variables (`.env` template)
All variables are consumed by `docker-compose.yaml` for interpolation into the
relevant service `environment:` blocks. The per-service scoping is defined in
`docker-compose.yaml`; the `.env` file just provides values.
### Django core — `app`, `mcp`, `worker`
| Variable | Example / default | Notes |
|----------|-------------------|-------|
| `SECRET_KEY` | `{{ vault_mnemosyne_secret_key }}` | Fernet-safe; never rotate without re-encrypting stored API keys first |
| `DEBUG` | `False` | |
| `TIME_ZONE` | `UTC` | |
| `LANGUAGE_CODE` | `en-us` | |
### HTTP surface — `app` (CSRF), `app` + `mcp` (ALLOWED_HOSTS)
| Variable | Example |
|----------|---------|
| `ALLOWED_HOSTS` | `localhost,127.0.0.1,mnemosyne.ouranos.helu.ca` |
| `CSRF_TRUSTED_ORIGINS` | `https://mnemosyne.ouranos.helu.ca` |
### PostgreSQL — `app`, `mcp`, `worker`
| Variable | Example |
|----------|---------|
| `APP_DB_NAME` | `mnemosyne` |
| `APP_DB_USER` | `mnemosyne` |
| `APP_DB_PASSWORD` | `{{ vault_mnemosyne_db_password }}` |
| `DB_HOST` | `portia.incus` |
| `DB_PORT` | `5432` |
### Neo4j — `app`, `mcp`, `worker`
| Variable | Example |
|----------|---------|
| `NEOMODEL_NEO4J_BOLT_URL` | `bolt://neo4j:{{ vault_neo4j_password }}@umbriel.incus:7687` |
> **URL-encode the password** if it contains `@ : / # % + ? & =` or a space.
> The Bolt URL parser is strict.
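
When the password is assembled into a URL by hand rather than through a template filter, a small encoder avoids guessing which characters are safe. The following is a sketch, not the project's tooling: `urlencode` is a hypothetical helper that percent-encodes everything outside the RFC 3986 unreserved set, and it is meant for ASCII secrets only.

```bash
# Hypothetical helper: percent-encode every character outside the
# RFC 3986 "unreserved" set (letters, digits, '-', '.', '_', '~').
# Intended for ASCII secrets; non-ASCII input is not handled here.
urlencode() {
  local LC_ALL=C
  local rest="$1" out="" char
  while [ -n "$rest" ]; do
    char="${rest%"${rest#?}"}"   # first character of the remainder
    rest="${rest#?}"             # drop that character
    case "$char" in
      [A-Za-z0-9._~-]) out="$out$char" ;;
      *) out="$out$(printf '%%%02X' "'$char")" ;;
    esac
  done
  printf '%s\n' "$out"
}
```

Assuming the raw password is held in a shell variable (here the illustrative `NEO4J_PASSWORD`), the Bolt URL could then be built as `bolt://neo4j:$(urlencode "$NEO4J_PASSWORD")@umbriel.incus:7687`.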
### Memcached — `app`, `mcp`, `worker`
| Variable | Example |
|----------|---------|
| `KVDB_LOCATION` | `oberon.incus:11211` |
| `KVDB_PREFIX` | `mnemosyne` |
### S3 / MinIO (Mnemosyne bucket) — `app`, `mcp`, `worker`
| Variable | Example |
|----------|---------|
| `USE_LOCAL_STORAGE` | `False` |
| `AWS_ACCESS_KEY_ID` | `{{ vault_mnemosyne_s3_key }}` |
| `AWS_SECRET_ACCESS_KEY` | `{{ vault_mnemosyne_s3_secret }}` |
| `AWS_STORAGE_BUCKET_NAME` | `mnemosyne-content` |
| `AWS_S3_ENDPOINT_URL` | `https://nyx.helu.ca:8555` |
| `AWS_S3_USE_SSL` | `True` |
| `AWS_S3_VERIFY` | `False` (self-signed cert on Nyx) |
| `AWS_S3_REGION_NAME` | `us-east-1` |
### Daedalus S3 (cross-bucket reads) — `worker` only
| Variable | Example |
|----------|---------|
| `DAEDALUS_S3_ENDPOINT_URL` | `https://nyx.helu.ca:8555` |
| `DAEDALUS_S3_ACCESS_KEY_ID` | `{{ vault_daedalus_s3_read_key }}` |
| `DAEDALUS_S3_SECRET_ACCESS_KEY` | `{{ vault_daedalus_s3_read_secret }}` |
| `DAEDALUS_S3_BUCKET_NAME` | `daedalus` |
| `DAEDALUS_S3_REGION_NAME` | `us-east-1` |
| `DAEDALUS_S3_USE_SSL` | `True` |
| `DAEDALUS_S3_VERIFY` | `True` |
### Celery / RabbitMQ — `app` (producer), `worker` (consumer)
| Variable | Example |
|----------|---------|
| `CELERY_BROKER_URL` | `amqp://mnemosyne:{{ vault_rabbitmq_password \| urlencode }}@oberon.incus:5672/mnemosyne` |
| `CELERY_RESULT_BACKEND` | `rpc://` |
| `CELERY_TASK_ALWAYS_EAGER` | `False` |
> **Percent-encode** the RabbitMQ password in the broker URL if it contains any
> URL-special characters. Use Ansible's `urlencode` filter or pre-encode the
> value in the vault variable. The filter leaves `/` unescaped, so a password
> containing `/` needs pre-encoding (or an extra `replace('/', '%2F')`). An
> unencoded password is the most common cause of
> `PLAIN 403 ACCESS_REFUSED` at worker startup.
### Worker tuning — `worker` only
| Variable | Default | Notes |
|----------|---------|-------|
| `CELERY_QUEUES` | `celery,embedding,batch` | Override per host for dedicated queue workers |
| `CELERY_CONCURRENCY` | `2` | Number of worker processes |
### MCP server — `mcp` only
| Variable | Production value |
|----------|-----------------|
| `MCP_REQUIRE_AUTH` | `True` |
### LLM API encryption — `app`, `worker`
| Variable | Notes |
|----------|-------|
| `LLM_API_SECRETS_ENCRYPTION_KEY` | Fernet key. Generate once: `python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())"`. Never rotate without re-encrypting all stored provider keys first. |
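
A malformed key is easier to catch before `.env` is rendered than after the containers start. The following is a rough shape check only, assuming the standard Fernet key encoding (32 bytes in URL-safe base64, which comes to 44 characters ending in `=`). `is_fernet_key` is a hypothetical name, and passing the check does not prove the key is the one that encrypted the stored provider keys.

```bash
# Hypothetical pre-flight check: a Fernet key is 32 bytes in URL-safe
# base64, i.e. 43 characters from [A-Za-z0-9_-] followed by one '='.
is_fernet_key() {
  printf '%s\n' "$1" | grep -Eq '^[A-Za-z0-9_-]{43}=$'
}
```

A deploy script might call `is_fernet_key "$LLM_API_SECRETS_ENCRYPTION_KEY" || exit 1` before writing the file.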
### Email — `app` only
| Variable | Example |
|----------|---------|
| `EMAIL_HOST` | `oberon.incus` |
| `EMAIL_PORT` | `22025` |
| `EMAIL_USE_TLS` | `False` |
### Embedding pipeline — `worker` only
| Variable | Default |
|----------|---------|
| `EMBEDDING_BATCH_SIZE` | `8` |
| `EMBEDDING_TIMEOUT` | `120` |
### Search & re-ranker — `app`, `mcp`
| Variable | Default |
|----------|---------|
| `SEARCH_VECTOR_TOP_K` | `50` |
| `SEARCH_FULLTEXT_TOP_K` | `30` |
| `SEARCH_GRAPH_MAX_DEPTH` | `2` |
| `SEARCH_RRF_K` | `60` |
| `SEARCH_DEFAULT_LIMIT` | `20` |
| `RERANKER_MAX_CANDIDATES` | `32` |
| `RERANKER_TIMEOUT` | `30` |
### Logging — `app`, `mcp`, `worker`
| Variable | Default |
|----------|---------|
| `LOGGING_LEVEL` | `INFO` |
| `DJANGO_LOGGING_LEVEL` | `WARNING` |
| `CELERY_LOGGING_LEVEL` | `INFO` |
---
## 5. Health Probes & Verification
After `docker compose up -d`, wait for all services to report healthy:
```bash
# -a also lists exited containers such as static-init
docker compose -f /opt/mnemosyne/docker-compose.yaml ps -a
```
Expected: `app`, `mcp`, `worker`, `web` all `healthy`; `static-init` `exited (0)`.
### Per-service probes
| Service | Healthcheck command | Expected |
|---------|---------------------|----------|
| `app` | `curl -f http://localhost:8000/live/` | 200 |
| `mcp` | `curl -f http://localhost:8001/mcp/health` | 200 JSON |
| `web` | `curl -f http://localhost/live/` | 200 (proxied to app) |
| `worker` | `celery -A mnemosyne inspect ping -d celery@$HOSTNAME` | `pong` |
### External checks (from inside the 10.10.0.0/24 network)
```bash
# Django liveness (via nginx)
curl -f http://puck.incus:23181/live/
# Django readiness (Postgres + Memcached)
curl -f http://puck.incus:23181/ready/
# MCP health (proxied from /healthz → mcp:8001/mcp/health)
curl -f http://puck.incus:23181/healthz
# Prometheus metrics (internal only)
curl http://puck.incus:23181/metrics | head -5
```
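
Right after `up`, these endpoints may fail for a short while as the services start, so a single `curl` is a weak signal. A bounded retry loop is one way to handle that. This is a sketch under assumptions: `retry` is a hypothetical helper, and the attempt count and delay in the usage line are arbitrary.

```bash
# Hypothetical helper: retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND until it succeeds or ATTEMPTS runs have failed.
retry() {
  local attempts="$1" delay="$2" n=1
  shift 2
  until "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}
```

For example, `retry 30 2 curl -fsS -o /dev/null http://puck.incus:23181/ready/` would poll the readiness endpoint for up to about a minute before giving up.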
### Verify the daedalus-service account
```bash
# -s suppresses the progress meter so only the status code is printed
curl -s -u daedalus-service:<password> \
  https://mnemosyne.ouranos.helu.ca/library/api/workspaces/ \
  -o /dev/null -w "%{http_code}"
# Expect: 200
```
### Verify MCP connectivity (from a client with a valid MCPToken)
```bash
curl -H "Authorization: Bearer <token>" \
https://mnemosyne.ouranos.helu.ca/mcp/health
# Expect: {"status": "ok", ...}
```
---
## 6. Upgrade Procedure
A standard upgrade (new image pushed to `git.helu.ca/r/mnemosyne:latest`):
```bash
cd /opt/mnemosyne
docker compose pull
docker compose up -d # static-init re-seeds; running containers replaced
docker compose run --rm app migrate # no-op if no new migrations
```
The `static-init` service runs to completion on every `up`, propagating
static-file changes without a manual volume reset.

---
## 7. Rollback
```bash
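cd /opt/mnemosyne
# Pull the known-good image by digest (docker compose pull takes service
# names, not image references, so use plain docker pull here)
docker pull git.helu.ca/r/mnemosyne@sha256:<digest>
# Edit the image: line in docker-compose.yaml to use the digest, then:
docker compose up -d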
```
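
The manual edit of the `image:` line can be scripted. The sketch below is one possible approach, not an established part of the workflow: `pin_image` is a hypothetical helper, and it assumes each image reference sits unquoted on its own `image:` line in the compose file. It keeps a `.bak` copy of the original.

```bash
# Hypothetical helper: pin_image COMPOSE_FILE DIGEST
# Rewrites every `image: git.helu.ca/r/mnemosyne...` reference (tag or
# digest) to the given sha256 digest, leaving other images untouched.
pin_image() {
  local file="$1" digest="$2" repo="git.helu.ca/r/mnemosyne"
  cp "$file" "$file.bak" || return 1
  sed "s|\(image:[[:space:]]*\)${repo}[:@][^[:space:]]*|\1${repo}@sha256:${digest}|" \
    "$file.bak" > "$file"
}

# Usage (digest is a placeholder):
#   pin_image /opt/mnemosyne/docker-compose.yaml <digest>
#   docker compose -f /opt/mnemosyne/docker-compose.yaml up -d
```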
Alternatively, tag known-good images in the registry before each deploy and
reference that tag in the `image:` line.

---
## 8. HAProxy / Titania Configuration Notes
Titania terminates TLS and forwards to `puck.incus:23181`. The nginx config
preserves `X-Forwarded-Proto: https` so Django's `request.is_secure()`, secure
cookies, and `build_absolute_uri()` work correctly.
The HAProxy `health_path` for this backend should be `/healthz` (not `/live/` or
`/ready/`) — `/healthz` short-circuits directly to the FastMCP health endpoint
without touching Django, so it can confirm the MCP server is up even if Django
is momentarily unhealthy.
If a check does target Django and the checker doesn't follow redirects, use
`/live/` and `/ready/` **with** the trailing slash. The un-slashed forms
(`/live`, `/ready`) trigger Django's `APPEND_SLASH` 301 redirect, which health
checkers that don't follow redirects will report as a failure.

---
## 9. Vault Variables Summary
| Vault variable | Used in `.env` as |
|----------------|-------------------|
| `vault_mnemosyne_secret_key` | `SECRET_KEY` |
| `vault_mnemosyne_db_password` | `APP_DB_PASSWORD` |
| `vault_neo4j_password` | embedded in `NEOMODEL_NEO4J_BOLT_URL` |
| `vault_mnemosyne_s3_key` | `AWS_ACCESS_KEY_ID` |
| `vault_mnemosyne_s3_secret` | `AWS_SECRET_ACCESS_KEY` |
| `vault_daedalus_s3_read_key` | `DAEDALUS_S3_ACCESS_KEY_ID` |
| `vault_daedalus_s3_read_secret` | `DAEDALUS_S3_SECRET_ACCESS_KEY` |
| `vault_rabbitmq_password` | embedded in `CELERY_BROKER_URL` |
| `vault_mnemosyne_llm_encryption_key` | `LLM_API_SECRETS_ENCRYPTION_KEY` |
| `vault_mnemosyne_daedalus_service_password` | passed to `ensure_service_user --password` |
| `vault_mnemosyne_signing_secret` | (Phase 2) printed by `seed_signing_key`, stored here, consumed by Daedalus role |