docs(bootstrap): clarify three-step Docker first-boot flow

Rework the README and docker-compose comments to document the deliberate
way out of a chicken-and-egg problem: the `init` sidecar now runs only
`migrate` and `load_library_types`, leaving `setup_neo4j_indexes` as a
manual step to run once the system embedding model has been configured in
`/admin/`. This keeps `app` reachable on first boot, when no embedding
model row exists yet, while preserving loud failure on dimension mismatch.
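
For orientation, the gating described above might look roughly like the
fragment below. This is a sketch under assumptions, not the repository's
actual docker-compose.yml: the image tag and the exact `command` line are
hypothetical, and only the service names (`init`, `app`), the two
management commands, and the `service_completed_successfully` condition
come from this commit.

```yaml
# Sketch only. The image tag and the init command line are assumptions.
services:
  init:
    image: mnemosyne:latest   # hypothetical image tag
    # One-shot: migrate + load_library_types, then exit.
    command: ["sh", "-c", "python manage.py migrate && python manage.py load_library_types"]
    restart: "no"

  app:
    image: mnemosyne:latest   # hypothetical image tag
    depends_on:
      init:
        # app starts only after init exits with status 0.
        condition: service_completed_successfully
```

With `setup_neo4j_indexes` out of the `init` command, a missing embedding
model can no longer make `init` exit non-zero, so `app` comes up and the
operator can reach `/admin/` before running
`docker compose exec app python manage.py setup_neo4j_indexes`.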
2026-05-10 16:15:28 -04:00
parent 19e2aee91c
commit afcbee8819
6 changed files with 102 additions and 155 deletions


@@ -25,13 +25,26 @@
 # Run:
 # docker compose up -d
 #
-# The `init` sidecar (below) runs Postgres migrations, Neo4j index setup,
-# and library-type seeding on every `up`. Long-running services wait for
-# it via `depends_on: init: service_completed_successfully` — so a failure
-# there (missing embedding model, dimension mismatch, unreachable DB)
-# blocks the stack rather than letting it serve silent zero-result
-# searches. The standalone `migrate` / `setup` entrypoint commands remain
-# available for ad-hoc ops work.
+# The `init` sidecar (below) runs Postgres migrations and library-type
+# seeding on every `up`. Long-running services wait for it via
+# `depends_on: init: service_completed_successfully` — so a failure there
+# (unreachable DB, broken migration) blocks the stack.
+#
+# Neo4j vector-index creation is deliberately NOT bundled into `init`.
+# `setup_neo4j_indexes` requires a system embedding model configured in
+# the admin, which only exists after first boot — an operator has to land
+# in /admin/, pick an embedding API + model, and set its vector_dimensions
+# value. Bootstrap order is therefore:
+#
+# 1. docker compose up # init sidecar: migrate + load_library_types
+# 2. browse to /admin/ → llm_manager → configure system embedding model
+# 3. docker compose exec app python manage.py setup_neo4j_indexes
+#
+# Until step 3, vector search returns empty results. library/apps.py logs
+# a readiness warning when indexes are missing, so this is visible.
+# The standalone `migrate` / `setup` entrypoint commands remain available
+# for ad-hoc ops work (`setup` runs setup_neo4j_indexes + load_library_types
+# and is the typical re-run target after embedding-model changes).
 # =============================================================================
@@ -48,13 +61,15 @@ services:
 - mnemosyne-static:/shared-static
 restart: "no"
-# ── Init sidecar: one-shot Postgres migrate + Neo4j index setup + library
-# type seed. Runs on every `up` and exits. Long-running services below
-# depend on `service_completed_successfully`, so a failure here (no system
-# embedding model configured, dimension mismatch, unreachable DB) blocks
-# `app`/`mcp`/`worker` from starting — which is the whole point. All three
-# commands are idempotent: re-running is a no-op unless state actually
-# needs to change.
+# ── Init sidecar: one-shot Postgres migrate + library-type seed. Runs on
+# every `up` and exits. Long-running services below depend on
+# `service_completed_successfully`, so a failure here (unreachable DB,
+# broken migration) blocks `app`/`mcp`/`worker` from starting. Both
+# commands are idempotent.
+#
+# Neo4j vector-index setup is NOT run here — see the header comment for
+# the operator bootstrap flow. Only library_type seeding touches Neo4j
+# from this sidecar, and it does not depend on any embedding model.
 #
 # This sidecar only needs Postgres, Neo4j, and logging env — no S3, no
 # Celery, no LLM encryption key. Keep it that way.
@@ -75,7 +90,7 @@ services:
 - APP_DB_PASSWORD=${APP_DB_PASSWORD}
 - DB_HOST=${DB_HOST}
 - DB_PORT=${DB_PORT}
-# Neo4j (setup_neo4j_indexes + load_library_types)
+# Neo4j (load_library_types writes Library defaults into the graph)
 - NEOMODEL_NEO4J_BOLT_URL=${NEOMODEL_NEO4J_BOLT_URL}
 # Logging
 - LOGGING_LEVEL=${LOGGING_LEVEL}