feat: add init sidecar for migrations and setup on compose up

Introduces a one-shot `init` service in docker-compose that runs Postgres
migrations, Neo4j index setup, and library-type seeding on every `up`.
Long-running services (`app`, `mcp`, `worker`) now depend on its
successful completion via `service_completed_successfully`, blocking the
stack on configuration errors (missing embedding model, dimension
mismatch, unreachable DB) rather than serving silent zero-result
searches.
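
As a rough illustration of the fail-fast sequencing described above, the `init` command could be structured along these lines. This is a sketch, not the repository's actual entrypoint: `run_init` and `INIT_STEPS` are hypothetical names, the `manage.py` invocation is an assumption, and the management command names are taken from the compose-file comments.

```python
import subprocess
import sys


def run_init(steps):
    """Run each (name, argv) step in order and stop at the first failure.

    Returns 0 if every step succeeds, otherwise the exit code of the
    first step that failed. Later steps are not run after a failure.
    """
    for name, argv in steps:
        print(f"init: running {name}", flush=True)
        result = subprocess.run(argv)
        if result.returncode != 0:
            print(
                f"init: {name} failed with exit code {result.returncode}",
                file=sys.stderr,
            )
            return result.returncode
    return 0


# Hypothetical step list. The command names come from the compose comments;
# how the real entrypoint invokes them is assumed here.
INIT_STEPS = [
    ("migrate", [sys.executable, "manage.py", "migrate", "--noinput"]),
    ("setup_neo4j_indexes", [sys.executable, "manage.py", "setup_neo4j_indexes"]),
    ("load_library_types", [sys.executable, "manage.py", "load_library_types"]),
]

# An entrypoint would finish with something like:
#     sys.exit(run_init(INIT_STEPS))
```

Propagating the failing step's exit code is what makes the gate work: Compose treats `service_completed_successfully` as satisfied only when the container exits with code 0, so any non-zero step keeps `app`, `mcp`, and `worker` from starting.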

Also standardizes reranker test fixtures to use the `/v1` OpenAI-style
base URL convention used across other service clients.
2026-05-10 08:01:58 -04:00
parent 9ceb01f829
commit a945b382e6
15 changed files with 821 additions and 65 deletions


@@ -24,10 +24,17 @@
#
# Run:
# docker compose up -d
# docker compose run --rm app migrate # one-shot DB migrate
# docker compose run --rm app setup # Neo4j indexes + library types
#
# The `init` sidecar (below) runs Postgres migrations, Neo4j index setup,
# and library-type seeding on every `up`. Long-running services wait for
# it via `depends_on: init: service_completed_successfully` — so a failure
# there (missing embedding model, dimension mismatch, unreachable DB)
# blocks the stack rather than letting it serve silent zero-result
# searches. The standalone `migrate` / `setup` entrypoint commands remain
# available for ad-hoc ops work.
# =============================================================================
services:
# ── Static-file seeder: copies /app/staticfiles into the shared volume on
# every `up`. Runs once and exits. Without this, the named volume is only
@@ -41,6 +48,41 @@ services:
- mnemosyne-static:/shared-static
restart: "no"
# ── Init sidecar: one-shot Postgres migrate + Neo4j index setup + library
# type seed. Runs on every `up` and exits. Long-running services below
# depend on `service_completed_successfully`, so a failure here (no system
# embedding model configured, dimension mismatch, unreachable DB) blocks
# `app`/`mcp`/`worker` from starting — which is the whole point. All three
# commands are idempotent: re-running is a no-op unless state actually
# needs to change.
#
# This sidecar only needs Postgres, Neo4j, and logging env — no S3, no
# Celery, no LLM encryption key. Keep it that way.
init:
image: git.helu.ca/r/mnemosyne:latest
pull_policy: always
command: ["init"]
environment:
# Django core (settings import)
- DJANGO_SETTINGS_MODULE=mnemosyne.settings
- SECRET_KEY=${SECRET_KEY}
- DEBUG=${DEBUG}
- TIME_ZONE=${TIME_ZONE}
- LANGUAGE_CODE=${LANGUAGE_CODE}
# Postgres (migrate)
- APP_DB_NAME=${APP_DB_NAME}
- APP_DB_USER=${APP_DB_USER}
- APP_DB_PASSWORD=${APP_DB_PASSWORD}
- DB_HOST=${DB_HOST}
- DB_PORT=${DB_PORT}
# Neo4j (setup_neo4j_indexes + load_library_types)
- NEOMODEL_NEO4J_BOLT_URL=${NEOMODEL_NEO4J_BOLT_URL}
# Logging
- LOGGING_LEVEL=${LOGGING_LEVEL}
- DJANGO_LOGGING_LEVEL=${DJANGO_LOGGING_LEVEL}
restart: "no"
# ── App: Django REST API + admin ──────────────────────────────────────────
# Serves /library/api/*, /admin/, /live/, /ready/, /metrics. Enqueues
# Celery tasks (hence CELERY_BROKER_URL is required here too — Django is
@@ -103,6 +145,8 @@ services:
depends_on:
static-init:
condition: service_completed_successfully
init:
condition: service_completed_successfully
volumes:
- mnemosyne-media:/app/media
healthcheck:
@@ -112,6 +156,7 @@ services:
retries: 3
start_period: 30s
# ── MCP server: FastMCP Streamable HTTP at /mcp/ ───────────────────────────
# Read-only LLM-facing surface. Intentionally excluded:
# CELERY_BROKER_URL — MCP must not enqueue tasks
@@ -171,6 +216,9 @@ services:
- LOGGING_LEVEL=${LOGGING_LEVEL}
- DJANGO_LOGGING_LEVEL=${DJANGO_LOGGING_LEVEL}
restart: unless-stopped
depends_on:
init:
condition: service_completed_successfully
volumes:
- mnemosyne-media:/app/media
healthcheck:
@@ -180,6 +228,7 @@ services:
retries: 3
start_period: 30s
# ── Celery worker: embedding + ingest + batch queues ───────────────────────
# Consumer side of the queue. Needs the full S3 block (reads Daedalus's
# bucket, writes to Mnemosyne's), the LLM API encryption key (ingest calls