docs(bootstrap): clarify three-step Docker first-boot flow

Rework README and docker-compose comments to document the deliberate
chicken-and-egg escape: the `init` sidecar now only runs `migrate` and
`load_library_types`, leaving `setup_neo4j_indexes` as a manual step
after the system embedding model is configured in `/admin/`. This
avoids making `app` unreachable on first boot when no embedding model
row exists yet, while preserving loud failure on dimension mismatch.
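
The three-step first boot described above could be sketched as the shell helpers below. This is a minimal sketch and not part of the commit: the `run` wrapper, the `DRY_RUN` variable, the `first_boot_step*` function names, and the `-d` flag are assumptions made for illustration. Only the `docker compose` commands and the `/admin/` step come from this change.

```shell
#!/bin/sh
# Hypothetical helper: with DRY_RUN=1, print the command instead of running it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Step 1: bring the stack up. Per the commit message, the `init` sidecar
# runs only `migrate` and `load_library_types`.
first_boot_step1() {
    run docker compose up -d
}

# Step 2 is manual: configure the system embedding model in /admin/.

# Step 3: create the Neo4j indexes once the embedding model row exists.
first_boot_step3() {
    run docker compose exec app python manage.py setup_neo4j_indexes
}
```

Running the helpers with `DRY_RUN=1` prints each command, which is a cheap way to review the sequence before touching a real deployment.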
2026-05-10 16:15:28 -04:00
parent 19e2aee91c
commit afcbee8819
6 changed files with 102 additions and 155 deletions


@@ -1,111 +0,0 @@
#!/bin/sh
# Mnemosyne container entrypoint.
#
# The same image runs all three processes — the compose service supplies
# `web`, `mcp`, `worker`, or `migrate` as CMD.
set -e
case "$1" in
web)
# Django REST API + admin (gunicorn → wsgi).
exec gunicorn \
--config /app/docker/gunicorn.conf.py \
--bind 0.0.0.0:8000 \
--workers "${GUNICORN_WORKERS:-3}" \
--access-logfile - \
--error-logfile - \
mnemosyne.wsgi:application
;;
mcp)
# FastMCP over Streamable HTTP at /mcp/, mounted by mnemosyne.asgi.
exec uvicorn \
--host 0.0.0.0 \
--port 8001 \
--workers "${UVICORN_WORKERS:-1}" \
mnemosyne.asgi:app
;;
worker)
# Celery worker covering embedding + ingest + batch + default queues.
# In production you may want to split these onto separate worker
# services for queue-level isolation; one process is fine to start.
exec celery -A mnemosyne worker \
--loglevel="${CELERY_LOG_LEVEL:-info}" \
--queues="${CELERY_QUEUES:-celery,embedding,batch}" \
--concurrency="${CELERY_CONCURRENCY:-2}"
;;
beat)
# Celery scheduled tasks (only needed if/when periodic jobs are wired).
exec celery -A mnemosyne beat \
--loglevel="${CELERY_LOG_LEVEL:-info}"
;;
migrate)
# One-shot DB migration runner — invoke before bringing services up
# for the first time or after a deploy.
exec python manage.py migrate --noinput
;;
setup)
# One-shot init — Neo4j indexes + library_type seed data.
python manage.py setup_neo4j_indexes
python manage.py load_library_types
;;
init)
# Bundled one-shot init run by the `init` sidecar on every
# `docker compose up`. Idempotent: re-runs are no-ops unless
# migrations or library-type seed data need to change.
#
# Vector-index creation intentionally runs in *best-effort* mode:
# ``setup_neo4j_indexes`` requires a system embedding model with a
# configured ``vector_dimensions`` value, and that model is data an
# operator seeds via the admin UI after the stack comes up for the
# first time. Blocking the whole stack on first boot would force
# every new deployer through a manual dance with the init sidecar's
# entrypoint; instead we log loudly and carry on, and the operator
# runs the command once post-boot:
#
# docker compose exec app python manage.py setup_neo4j_indexes
#
# Full-text and neomodel constraint indexes are created by the same
# command and are *not* dimension-sensitive, but they also only land
# after the operator re-runs it — acceptable because search against
# an empty graph is itself a no-op.
set -e
python manage.py migrate --noinput
python manage.py load_library_types
if ! python manage.py setup_neo4j_indexes; then
echo ""
echo "============================================================"
echo "NOTICE: Neo4j index creation was skipped."
echo ""
echo "This is expected on a fresh deployment — vector indexes"
echo "require a system embedding model with vector_dimensions set."
echo ""
echo "Seed the embedding model in the Django admin"
echo " (/admin/llm_manager/llmmodel/, mark one row as"
echo " is_system_embedding_model=True with vector_dimensions set),"
echo "then run:"
echo ""
echo " docker compose exec app python manage.py setup_neo4j_indexes"
echo ""
echo "Search endpoints will return empty results until this is done."
echo "============================================================"
echo ""
fi
;;
shell)
# Drop into the management shell for ad-hoc work.
exec python manage.py shell
;;
*)
# Fall through: run whatever was passed (e.g. `manage.py <cmd>`).
exec "$@"
;;
esac


@@ -11,8 +11,13 @@ the stderr of a different container.
The probe is deliberately best-effort: it cannot crash the process even if
Neo4j is unreachable, because a transient DB blip on startup should not
-take down the whole app. The `init` sidecar is the hard gate; this is the
-second line of defence for long-running containers.
+take down the whole app. Nothing hard-gates on the vector indexes — the
+``init`` sidecar only runs ``migrate`` + ``load_library_types`` (vector
+indexes cannot be created before the system embedding model is configured
+in the admin, which is a manual step after first boot). This probe is the
+only way an operator learns that the manual
+``setup_neo4j_indexes`` step was skipped or fell out of sync with the
+current system model.
"""
import logging
@@ -149,9 +154,10 @@ def _run_startup_probe():
for name in _EXPECTED_VECTOR_INDEXES:
if name not in present:
logger.error(
-"Neo4j vector index '%s' is missing. Run "
-"'docker compose run --rm init' (or 'python manage.py "
-"setup_neo4j_indexes') to rebuild.",
+"Neo4j vector index '%s' is missing. Configure the system "
+"embedding model in /admin/llm_manager/llmmodel/, then run "
+"'docker compose exec app python manage.py "
+"setup_neo4j_indexes' to create it.",
name,
)
continue