docs(bootstrap): clarify three-step Docker first-boot flow

Rework README and docker-compose comments to document the deliberate chicken-and-egg escape: the `init` sidecar now only runs `migrate` and `load_library_types`, leaving `setup_neo4j_indexes` as a manual step after the system embedding model is configured in `/admin/`. This avoids making `app` unreachable on first boot when no embedding model row exists yet, while preserving loud failure on dimension mismatch.
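
The first-boot order described above can be sketched as a small decision helper. This is an illustrative sketch only, not code from this commit: `next_first_boot_step` and its parameters are hypothetical names, and it assumes the three steps are (1) `docker compose up`, (2) configuring the system embedding model in `/admin/`, and (3) running `setup_neo4j_indexes` by hand.

```python
from typing import Optional


def next_first_boot_step(
    stack_up: bool,
    embedding_dims: Optional[int],
    vector_indexes_present: bool,
) -> str:
    """Return the next operator action for a fresh deployment.

    Hypothetical helper for illustration; not part of the Mnemosyne codebase.
    """
    if not stack_up:
        # Step 1: the init sidecar runs migrate + load_library_types only.
        return "docker compose up"
    if embedding_dims is None:
        # Step 2: no system embedding model yet, so vector indexes cannot be built.
        return "configure the system embedding model in /admin/"
    if not vector_indexes_present:
        # Step 3: manual, because it depends on the model's vector dimensions.
        return "docker compose exec app python manage.py setup_neo4j_indexes"
    return "nothing to do"


if __name__ == "__main__":
    # Fresh stack that is already up, but no embedding model row exists yet.
    print(next_first_boot_step(True, None, False))
```

The ordering is the point of the sketch: step 3 is only offered once a dimension value exists, which is why the `init` sidecar no longer attempts it.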
2026-05-10 16:15:28 -04:00
parent 19e2aee91c
commit afcbee8819
6 changed files with 102 additions and 155 deletions
--- a/mnemosyne/docker/entrypoint.sh
+++ b/mnemosyne/docker/entrypoint.sh
@@ -1,111 +0,0 @@
-#!/bin/sh
-# Mnemosyne container entrypoint.
-#
-# The same image runs all three processes — the compose service supplies
-# `web`, `mcp`, `worker`, or `migrate` as CMD.
-
-set -e
-
-case "$1" in
-  web)
-    # Django REST API + admin (gunicorn → wsgi).
-    exec gunicorn \
-      --config /app/docker/gunicorn.conf.py \
-      --bind 0.0.0.0:8000 \
-      --workers "${GUNICORN_WORKERS:-3}" \
-      --access-logfile - \
-      --error-logfile - \
-      mnemosyne.wsgi:application
-    ;;
-
-  mcp)
-    # FastMCP over Streamable HTTP at /mcp/, mounted by mnemosyne.asgi.
-    exec uvicorn \
-      --host 0.0.0.0 \
-      --port 8001 \
-      --workers "${UVICORN_WORKERS:-1}" \
-      mnemosyne.asgi:app
-    ;;
-
-  worker)
-    # Celery worker covering embedding + ingest + batch + default queues.
-    # In production you may want to split these onto separate worker
-    # services for queue-level isolation; one process is fine to start.
-    exec celery -A mnemosyne worker \
-      --loglevel="${CELERY_LOG_LEVEL:-info}" \
-      --queues="${CELERY_QUEUES:-celery,embedding,batch}" \
-      --concurrency="${CELERY_CONCURRENCY:-2}"
-    ;;
-
-  beat)
-    # Celery scheduled tasks (only needed if/when periodic jobs are wired).
-    exec celery -A mnemosyne beat \
-      --loglevel="${CELERY_LOG_LEVEL:-info}"
-    ;;
-
-  migrate)
-    # One-shot DB migration runner — invoke before bringing services up
-    # for the first time or after a deploy.
-    exec python manage.py migrate --noinput
-    ;;
-
-  setup)
-    # One-shot init — Neo4j indexes + library_type seed data.
-    python manage.py setup_neo4j_indexes
-    python manage.py load_library_types
-    ;;
-
-  init)
-    # Bundled one-shot init run by the `init` sidecar on every
-    # `docker compose up`. Idempotent: re-runs are no-ops unless
-    # migrations or library-type seed data need to change.
-    #
-    # Vector-index creation intentionally runs in *best-effort* mode:
-    # ``setup_neo4j_indexes`` requires a system embedding model with a
-    # configured ``vector_dimensions`` value, and that model is data an
-    # operator seeds via the admin UI after the stack comes up for the
-    # first time. Blocking the whole stack on first boot would force
-    # every new deployer through a manual dance with the init sidecar's
-    # entrypoint; instead we log loudly and carry on, and the operator
-    # runs the command once post-boot:
-    #
-    #     docker compose exec app python manage.py setup_neo4j_indexes
-    #
-    # Full-text and neomodel constraint indexes are created by the same
-    # command and are *not* dimension-sensitive, but they also only land
-    # after the operator re-runs it — acceptable because search against
-    # an empty graph is itself a no-op.
-    set -e
-    python manage.py migrate --noinput
-    python manage.py load_library_types
-    if ! python manage.py setup_neo4j_indexes; then
-      echo ""
-      echo "============================================================"
-      echo "NOTICE: Neo4j index creation was skipped."
-      echo ""
-      echo "This is expected on a fresh deployment — vector indexes"
-      echo "require a system embedding model with vector_dimensions set."
-      echo ""
-      echo "Seed the embedding model in the Django admin"
-      echo "  (/admin/llm_manager/llmmodel/, mark one row as"
-      echo "   is_system_embedding_model=True with vector_dimensions set),"
-      echo "then run:"
-      echo ""
-      echo "  docker compose exec app python manage.py setup_neo4j_indexes"
-      echo ""
-      echo "Search endpoints will return empty results until this is done."
-      echo "============================================================"
-      echo ""
-    fi
-    ;;
-
-  shell)
-    # Drop into the management shell for ad-hoc work.
-    exec python manage.py shell
-    ;;
-
-  *)
-    # Fall through: run whatever was passed (e.g. `manage.py <cmd>`).
-    exec "$@"
-    ;;
-esac
--- a/mnemosyne/library/apps.py
+++ b/mnemosyne/library/apps.py
@@ -11,8 +11,13 @@ the stderr of a different container.

 The probe is deliberately best-effort: it cannot crash the process even if
 Neo4j is unreachable, because a transient DB blip on startup should not
-take down the whole app. The `init` sidecar is the hard gate; this is the
-second line of defence for long-running containers.
+take down the whole app. Nothing hard-gates on the vector indexes — the
+``init`` sidecar only runs ``migrate`` + ``load_library_types`` (vector
+indexes cannot be created before the system embedding model is configured
+in the admin, which is a manual step after first boot). This probe is the
+only way an operator learns that the manual
+``setup_neo4j_indexes`` step was skipped or fell out of sync with the
+current system model.
 """

 import logging
@@ -149,9 +154,10 @@ def _run_startup_probe():
    for name in _EXPECTED_VECTOR_INDEXES:
        if name not in present:
            logger.error(
-                "Neo4j vector index '%s' is missing. Run "
-                "'docker compose run --rm init' (or 'python manage.py "
-                "setup_neo4j_indexes') to rebuild.",
+                "Neo4j vector index '%s' is missing. Configure the system "
+                "embedding model in /admin/llm_manager/llmmodel/, then run "
+                "'docker compose exec app python manage.py "
+                "setup_neo4j_indexes' to create it.",
                name,
            )
            continue