docs(mnemosyne): update integration doc for container deployment
docs/deploy.md
@@ -0,0 +1,375 @@
# Mnemosyne: Ansible Deployment Reference

This document gives the Ansible author everything needed to write and maintain the
Mnemosyne deployment role. All implementation decisions are already locked in
`docker-compose.yaml` and `nginx/mnemosyne.conf`; this document explains the
*why* behind each decision and provides the authoritative list of variables,
one-time steps, and verification checks.

---

## 1. Host & Stack Overview

| Item | Value |
|------|-------|
| Deploy target | `puck.incus` (Incus container, 10.10.0.0/24) |
| Compose project directory | `/opt/mnemosyne` |
| Image registry | `git.helu.ca/r/mnemosyne:latest` |
| Public host port | **23181** (nginx → HAProxy on Titania → `https://mnemosyne.ouranos.helu.ca`) |
| Internal app port | `app:8000` (Django/gunicorn) |
| Internal MCP port | `mcp:8001` (FastMCP/uvicorn) |

The four compose services (`app`, `mcp`, `worker`, `web`) all run from the same
image. A one-shot `static-init` service seeds the nginx static-file volume on
every `up`, so static-file changes propagate on deploy without manual
intervention.

---

## 2. External Dependencies (NOT managed by this role)

These services must exist before Mnemosyne can start. The role only consumes
credentials; it does not provision these hosts.

| Service | Host | Notes |
|---------|------|-------|
| PostgreSQL | `portia.incus:5432` | Database `mnemosyne`, user `mnemosyne` |
| Neo4j | `umbriel.incus:7687` | Bolt protocol. **Must be dedicated to Mnemosyne**; do not share with Spelunker or any other graph workload (see README §Note on Neo4j). HTTP browser on `umbriel.incus:25555`. |
| RabbitMQ | `oberon.incus:5672` | vhost `mnemosyne`, user `mnemosyne` |
| MinIO (Mnemosyne bucket) | `nyx.helu.ca:8555` | Bucket `mnemosyne-content`. Credentials scoped read+write. |
| MinIO (Daedalus bucket) | `nyx.helu.ca:8555` | Bucket `daedalus`. **Read-only** cross-bucket credentials for the ingest worker. |
| Memcached | `oberon.incus:11211` | Shared; prefix `mnemosyne` avoids collisions. |
| Embedder (Qwen3-VL-Embedding) | Configured via `EMBEDDING_*` vars in settings | GPU host on Nyx; not managed here. |
| Reranker (Synesis) | Configured via `RERANKER_*` vars in settings | GPU host on Nyx; not managed here. |

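
Because the role depends on these hosts but does not provision them, a pre-flight task can fail fast when one is unreachable. The following is a minimal sketch, not part of the existing role: it only checks that each TCP port from the table accepts connections from the deploy target, and the task name and 10-second timeout are assumptions. It does not validate credentials, vhosts, or buckets.

```yaml
# Hypothetical pre-flight check; hosts and ports are copied from the table above.
- name: Check that external dependencies accept TCP connections
  ansible.builtin.wait_for:
    host: "{{ item.host }}"
    port: "{{ item.port }}"
    timeout: 10            # assumed value; tune for your network
  loop:
    - { host: portia.incus, port: 5432 }    # PostgreSQL
    - { host: umbriel.incus, port: 7687 }   # Neo4j Bolt
    - { host: oberon.incus, port: 5672 }    # RabbitMQ
    - { host: oberon.incus, port: 11211 }   # Memcached
    - { host: nyx.helu.ca, port: 8555 }     # MinIO
  loop_control:
    label: "{{ item.host }}:{{ item.port }}"
```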
---

## 3. Role Tasks

### 3.1 Directory & file layout

```
/opt/mnemosyne/
├── docker-compose.yaml     ← copied from repo (or symlinked via git pull)
├── nginx/
│   └── mnemosyne.conf      ← copied from repo nginx/mnemosyne.conf
└── .env                    ← rendered from Jinja2 template + vault secrets
```

The role should:

1. Create `/opt/mnemosyne/` and `nginx/` (owner: `root`, mode `0750`).
2. Render `.env` from the vault-sourced Jinja2 template (mode `0600`, owner `root`).
3. Copy (or `git pull`) `docker-compose.yaml` and `nginx/mnemosyne.conf` from the repo.

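
The three steps above could be expressed roughly as follows. This is a sketch under stated assumptions rather than the role's actual tasks: the template name `env.j2`, the `src` paths, and the `0644` mode on the copied files are all hypothetical, and the copy variant is shown instead of `git pull`.

```yaml
- name: Create project directories
  ansible.builtin.file:
    path: "{{ item }}"
    state: directory
    owner: root
    mode: "0750"
  loop:
    - /opt/mnemosyne
    - /opt/mnemosyne/nginx

- name: Render .env from the vault-backed template
  ansible.builtin.template:
    src: env.j2                    # hypothetical template name
    dest: /opt/mnemosyne/.env
    owner: root
    mode: "0600"
  no_log: true                     # keep rendered secrets out of task output

- name: Copy compose file and nginx config
  ansible.builtin.copy:
    src: "{{ item }}"              # hypothetical paths under the role's files/
    dest: "/opt/mnemosyne/{{ item }}"
    owner: root
    mode: "0644"                   # assumed; the document does not specify a mode
  loop:
    - docker-compose.yaml
    - nginx/mnemosyne.conf
```

Setting `no_log: true` on the template task also suppresses its diff output, which would otherwise print secret values when the play runs with `--diff`.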

### 3.2 Pull & start

```yaml
# Note: state defaults to "present", so this task also runs `up` after pulling.
- name: Pull latest image
  community.docker.docker_compose_v2:
    project_src: /opt/mnemosyne
    pull: always

# Ensures the stack is running with the images pulled above.
- name: Bring stack up
  community.docker.docker_compose_v2:
    project_src: /opt/mnemosyne
    state: present
```

This triggers `static-init` automatically on every `up`; no separate handler is needed.

### 3.3 One-time setup (run once on first deploy, idempotent thereafter)

These management commands are safe to re-run; they do nothing if the target state
already exists. Run them as a post-start task gated on a `creates:` sentinel or
an explicit `when: mnemosyne_first_deploy` flag.

```bash
# Apply Django ORM migrations (PostgreSQL schema)
docker compose -f /opt/mnemosyne/docker-compose.yaml \
  run --rm app migrate

# Create Neo4j vector + full-text indexes and load library-type defaults
docker compose -f /opt/mnemosyne/docker-compose.yaml \
  run --rm app setup

# Create the daedalus-service user (HTTP Basic auth for ingest API)
# Pass --password from vault; idempotent if user already exists.
docker compose -f /opt/mnemosyne/docker-compose.yaml \
  run --rm app \
  python manage.py ensure_service_user \
    --username daedalus-service \
    --password "{{ vault_mnemosyne_daedalus_service_password }}"

# Seed the MCP signing key (for Phase 2 per-turn JWT auth)
# --retire-other deactivates any previously-active key.
# Print the secret_hex and store in vault as vault_mnemosyne_signing_secret.
docker compose -f /opt/mnemosyne/docker-compose.yaml \
  run --rm app \
  python manage.py seed_signing_key --kid daedalus-1 --retire-other
```

The `seed_signing_key` command prints the generated secret once to stdout;
capture it and store it in the vault. The Daedalus role reads this secret from the
same vault variable to mint per-turn tokens (Phase 2).

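
One way to gate these commands on a `creates:` sentinel is sketched below. The sentinel path `/opt/mnemosyne/.first-deploy-done` and the registered variable name are hypothetical, and the exact stdout format of `seed_signing_key` is not documented here, so extracting the secret from `mnemosyne_signing_key.stdout` is left to the role author. The `ensure_service_user` call would follow the same pattern, with `no_log: true` because it takes a password argument.

```yaml
- name: Run first-deploy management commands
  ansible.builtin.command:
    argv:
      - docker
      - compose
      - "-f"
      - /opt/mnemosyne/docker-compose.yaml
      - run
      - "--rm"
      - app
      - "{{ item }}"
    creates: /opt/mnemosyne/.first-deploy-done   # hypothetical sentinel; task is skipped once it exists
  loop:
    - migrate
    - setup

- name: Seed the MCP signing key
  ansible.builtin.command:
    argv:
      - docker
      - compose
      - "-f"
      - /opt/mnemosyne/docker-compose.yaml
      - run
      - "--rm"
      - app
      - python
      - manage.py
      - seed_signing_key
      - "--kid"
      - daedalus-1
      - "--retire-other"
    creates: /opt/mnemosyne/.first-deploy-done
  register: mnemosyne_signing_key               # stdout holds the printed secret
  no_log: true                                  # never echo the secret in play output

- name: Record that first-deploy setup has run
  ansible.builtin.file:
    path: /opt/mnemosyne/.first-deploy-done
    state: touch
    mode: "0600"
    modification_time: preserve                 # keeps the task from reporting "changed" on re-runs
    access_time: preserve
```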
---

## 4. Environment Variables (`.env` template)

All variables are consumed by `docker-compose.yaml` for interpolation into the
relevant service `environment:` blocks. The per-service scoping is defined in
`docker-compose.yaml`; the `.env` file just provides values.

### Django core: `app`, `mcp`, `worker`

| Variable | Example / default | Notes |
|----------|-------------------|-------|
| `SECRET_KEY` | `{{ vault_mnemosyne_secret_key }}` | Fernet-safe; never rotate without re-encrypting stored API keys first |
| `DEBUG` | `False` | |
| `TIME_ZONE` | `UTC` | |
| `LANGUAGE_CODE` | `en-us` | |

### HTTP surface: `app` (CSRF), `app` + `mcp` (ALLOWED_HOSTS)

| Variable | Example |
|----------|---------|
| `ALLOWED_HOSTS` | `localhost,127.0.0.1,mnemosyne.ouranos.helu.ca` |
| `CSRF_TRUSTED_ORIGINS` | `https://mnemosyne.ouranos.helu.ca` |

### PostgreSQL: `app`, `mcp`, `worker`

| Variable | Example |
|----------|---------|
| `APP_DB_NAME` | `mnemosyne` |
| `APP_DB_USER` | `mnemosyne` |
| `APP_DB_PASSWORD` | `{{ vault_mnemosyne_db_password }}` |
| `DB_HOST` | `portia.incus` |
| `DB_PORT` | `5432` |

### Neo4j: `app`, `mcp`, `worker`

| Variable | Example |
|----------|---------|
| `NEOMODEL_NEO4J_BOLT_URL` | `bolt://neo4j:{{ vault_neo4j_password }}@umbriel.incus:7687` |

> **URL-encode the password** if it contains `@ : / # % + ? & =` or a space.
> The Bolt URL parser is strict.

### Memcached: `app`, `mcp`, `worker`

| Variable | Example |
|----------|---------|
| `KVDB_LOCATION` | `oberon.incus:11211` |
| `KVDB_PREFIX` | `mnemosyne` |

### S3 / MinIO (Mnemosyne bucket): `app`, `mcp`, `worker`

| Variable | Example |
|----------|---------|
| `USE_LOCAL_STORAGE` | `False` |
| `AWS_ACCESS_KEY_ID` | `{{ vault_mnemosyne_s3_key }}` |
| `AWS_SECRET_ACCESS_KEY` | `{{ vault_mnemosyne_s3_secret }}` |
| `AWS_STORAGE_BUCKET_NAME` | `mnemosyne-content` |
| `AWS_S3_ENDPOINT_URL` | `https://nyx.helu.ca:8555` |
| `AWS_S3_USE_SSL` | `True` |
| `AWS_S3_VERIFY` | `False` (self-signed cert on Nyx) |
| `AWS_S3_REGION_NAME` | `us-east-1` |

### Daedalus S3 (cross-bucket reads): `worker` only

| Variable | Example |
|----------|---------|
| `DAEDALUS_S3_ENDPOINT_URL` | `https://nyx.helu.ca:8555` |
| `DAEDALUS_S3_ACCESS_KEY_ID` | `{{ vault_daedalus_s3_read_key }}` |
| `DAEDALUS_S3_SECRET_ACCESS_KEY` | `{{ vault_daedalus_s3_read_secret }}` |
| `DAEDALUS_S3_BUCKET_NAME` | `daedalus` |
| `DAEDALUS_S3_REGION_NAME` | `us-east-1` |
| `DAEDALUS_S3_USE_SSL` | `True` |
| `DAEDALUS_S3_VERIFY` | `True` |

### Celery / RabbitMQ: `app` (producer), `worker` (consumer)

| Variable | Example |
|----------|---------|
| `CELERY_BROKER_URL` | `amqp://mnemosyne:{{ vault_rabbitmq_password \| urlencode }}@oberon.incus:5672/mnemosyne` |
| `CELERY_RESULT_BACKEND` | `rpc://` |
| `CELERY_TASK_ALWAYS_EAGER` | `False` |

> **Percent-encode** the RabbitMQ password in the broker URL if it contains any
> URL-special characters. Use Ansible's `urlencode` filter or pre-encode in the
> vault variable. An unencoded password is the most common cause of
> `PLAIN 403 ACCESS_REFUSED` at worker startup.

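
The two URL-encoding notes above (Neo4j and RabbitMQ) can be handled in one place by building both URLs in role variables and letting the `.env` template reference them. The sketch below assumes hypothetical variable names (`mnemosyne_neo4j_bolt_url`, `mnemosyne_celery_broker_url`). One caveat, as far as I am aware: Jinja2's `urlencode` filter leaves `/` unescaped for string input, so the extra `replace` covers passwords that contain a slash.

```yaml
# Hypothetical role vars (for example vars/main.yml); the variable names are illustrative.
mnemosyne_neo4j_bolt_url: >-
  bolt://neo4j:{{ vault_neo4j_password | urlencode | replace('/', '%2F') }}@umbriel.incus:7687

mnemosyne_celery_broker_url: >-
  amqp://mnemosyne:{{ vault_rabbitmq_password | urlencode | replace('/', '%2F') }}@oberon.incus:5672/mnemosyne
```

With these in place, the `.env` template would contain lines such as `CELERY_BROKER_URL={{ mnemosyne_celery_broker_url }}`. If the vault values are already percent-encoded, drop the filters so they are not encoded twice.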
### Worker tuning: `worker` only

| Variable | Default | Notes |
|----------|---------|-------|
| `CELERY_QUEUES` | `celery,embedding,batch` | Override per host for dedicated queue workers |
| `CELERY_CONCURRENCY` | `2` | Number of worker processes |

### MCP server: `mcp` only

| Variable | Production value |
|----------|-----------------|
| `MCP_REQUIRE_AUTH` | `True` |

### LLM API encryption: `app`, `worker`

| Variable | Notes |
|----------|-------|
| `LLM_API_SECRETS_ENCRYPTION_KEY` | Fernet key. Generate once: `python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())"`. Never rotate without re-encrypting all stored provider keys first. |

### Email: `app` only

| Variable | Example |
|----------|---------|
| `EMAIL_HOST` | `oberon.incus` |
| `EMAIL_PORT` | `22025` |
| `EMAIL_USE_TLS` | `False` |

### Embedding pipeline: `worker` only

| Variable | Default |
|----------|---------|
| `EMBEDDING_BATCH_SIZE` | `8` |
| `EMBEDDING_TIMEOUT` | `120` |

### Search & re-ranker: `app`, `mcp`

| Variable | Default |
|----------|---------|
| `SEARCH_VECTOR_TOP_K` | `50` |
| `SEARCH_FULLTEXT_TOP_K` | `30` |
| `SEARCH_GRAPH_MAX_DEPTH` | `2` |
| `SEARCH_RRF_K` | `60` |
| `SEARCH_DEFAULT_LIMIT` | `20` |
| `RERANKER_MAX_CANDIDATES` | `32` |
| `RERANKER_TIMEOUT` | `30` |

### Logging: `app`, `mcp`, `worker`

| Variable | Default |
|----------|---------|
| `LOGGING_LEVEL` | `INFO` |
| `DJANGO_LOGGING_LEVEL` | `WARNING` |
| `CELERY_LOGGING_LEVEL` | `INFO` |

---

## 5. Health Probes & Verification

After `docker compose up -d`, wait for all services to report healthy:

```bash
docker compose -f /opt/mnemosyne/docker-compose.yaml ps
```

Expected: `app`, `mcp`, `worker`, `web` all `healthy`; `static-init` `exited (0)`.

### Per-service probes

| Service | Healthcheck command | Expected |
|---------|---------------------|----------|
| `app` | `curl -f http://localhost:8000/live/` | 200 |
| `mcp` | `curl -f http://localhost:8001/mcp/health` | 200 JSON |
| `web` | `curl -f http://localhost/live/` | 200 (proxied to app) |
| `worker` | `celery -A mnemosyne inspect ping -d celery@$HOSTNAME` | `pong` |

### External checks (from inside the 10.10.0.0/24 network)

```bash
# Django liveness (via nginx)
curl -f http://puck.incus:23181/live/

# Django readiness (Postgres + Memcached)
curl -f http://puck.incus:23181/ready/

# MCP health (proxied from /healthz → mcp:8001/mcp/health)
curl -f http://puck.incus:23181/healthz

# Prometheus metrics (internal only)
curl http://puck.incus:23181/metrics | head -5
```

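
Inside the role, the external checks above could be turned into a post-start verification task. The following is a sketch only: the retry count, delay, and registered variable name are assumptions, and it polls the three probe paths through nginx until each returns 200.

```yaml
- name: Wait for Mnemosyne probes to return 200
  ansible.builtin.uri:
    url: "http://puck.incus:23181{{ item }}"
    status_code: 200
  register: mnemosyne_probe
  until: mnemosyne_probe is succeeded
  retries: 12        # assumed: up to about one minute in total
  delay: 5
  loop:
    - /live/
    - /ready/
    - /healthz
```

The trailing slashes on `/live/` and `/ready/` matter here for the same reason they do for HAProxy: the un-slashed forms answer with a redirect rather than a direct 200.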

### Verify the daedalus-service account

```bash
curl -u daedalus-service:<password> \
  https://mnemosyne.ouranos.helu.ca/library/api/workspaces/ \
  -o /dev/null -w "%{http_code}"
# Expect: 200
```

### Verify MCP connectivity (from a client with a valid MCPToken)

```bash
curl -H "Authorization: Bearer <token>" \
  https://mnemosyne.ouranos.helu.ca/mcp/health
# Expect: {"status": "ok", ...}
```

---

## 6. Upgrade Procedure

A standard upgrade (new image pushed to `git.helu.ca/r/mnemosyne:latest`):

```bash
cd /opt/mnemosyne
docker compose pull
docker compose up -d                  # static-init re-seeds; running containers replaced
docker compose run --rm app migrate   # no-op if no new migrations
```

The `static-init` service runs to completion on every `up`, propagating
static-file changes without a manual volume reset.

---

## 7. Rollback

```bash
# Pull the known-good image by digest.
# (`docker compose pull` takes service names, not image references,
# so use plain `docker pull` here.)
docker pull git.helu.ca/r/mnemosyne@sha256:<digest>
# Edit the image: line in docker-compose.yaml to use the digest, then:
docker compose up -d
```

Alternatively, tag good images in the registry before each deploy and reference
the tag.

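
If editing `docker-compose.yaml` by hand during a rollback is undesirable, one option is to make the image reference a compose variable with the current value as its default. This is a sketch of a possible change, not how the existing compose file is written: `MNEMOSYNE_IMAGE` is a hypothetical variable, and the same `image:` line would need to be repeated for every service that shares the image.

```yaml
# docker-compose.yaml fragment (sketch). MNEMOSYNE_IMAGE is a hypothetical variable.
# To roll back, set it in .env, for example:
#   MNEMOSYNE_IMAGE=git.helu.ca/r/mnemosyne@sha256:<digest>
# and run `docker compose up -d`. Unset, the default after ":-" applies.
services:
  app:
    image: ${MNEMOSYNE_IMAGE:-git.helu.ca/r/mnemosyne:latest}
```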
---

## 8. HAProxy / Titania Configuration Notes

Titania terminates TLS and forwards to `puck.incus:23181`. The nginx config
preserves `X-Forwarded-Proto: https` so Django's `request.is_secure()`, secure
cookies, and `build_absolute_uri()` work correctly.

The HAProxy `health_path` for this backend should be `/healthz` (not `/live/` or
`/ready/`). That path short-circuits directly to the FastMCP health endpoint
without touching Django, so it can confirm the MCP server is up even if Django
is momentarily unhealthy.

If the backend check is pointed at the Django probes instead, use `/live/` and
`/ready/` **with** the trailing slash. The un-slashed forms (`/live`, `/ready`)
trigger Django's `APPEND_SLASH` 301 redirect, which a health checker that
neither follows redirects nor accepts 3xx responses will report as a failure.

---

## 9. Vault Variables Summary

| Vault variable | Used in `.env` as |
|----------------|-------------------|
| `vault_mnemosyne_secret_key` | `SECRET_KEY` |
| `vault_mnemosyne_db_password` | `APP_DB_PASSWORD` |
| `vault_neo4j_password` | embedded in `NEOMODEL_NEO4J_BOLT_URL` |
| `vault_mnemosyne_s3_key` | `AWS_ACCESS_KEY_ID` |
| `vault_mnemosyne_s3_secret` | `AWS_SECRET_ACCESS_KEY` |
| `vault_daedalus_s3_read_key` | `DAEDALUS_S3_ACCESS_KEY_ID` |
| `vault_daedalus_s3_read_secret` | `DAEDALUS_S3_SECRET_ACCESS_KEY` |
| `vault_rabbitmq_password` | embedded in `CELERY_BROKER_URL` |
| `vault_mnemosyne_llm_encryption_key` | `LLM_API_SECRETS_ENCRYPTION_KEY` |
| `vault_mnemosyne_daedalus_service_password` | passed to `ensure_service_user --password` |
| `vault_mnemosyne_signing_secret` | (Phase 2) printed by `seed_signing_key`, stored here, consumed by Daedalus role |