docs(bootstrap): clarify three-step Docker first-boot flow

Rework README and docker-compose comments to document the deliberate chicken-and-egg escape: the `init` sidecar now runs only `migrate` and `load_library_types`, leaving `setup_neo4j_indexes` as a manual step to run once the system embedding model is configured in `/admin/`. This keeps `app` reachable on first boot when no embedding model row exists yet, while still failing loudly instead of building indexes at the wrong dimension.
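
For reviewers, a minimal sketch of the "no hardcoded fallback" rule that motivates the manual step. This is not the project's code: `EmbeddingModelRow`, `MissingSystemEmbeddingModel`, and `resolve_vector_dimensions` are hypothetical names, and only the two field names (`is_system_embedding_model`, `vector_dimensions`) come from the README.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class EmbeddingModelRow:
    """Hypothetical stand-in for an embedding model row configured in /admin/."""
    name: str
    is_system_embedding_model: bool
    vector_dimensions: Optional[int]


class MissingSystemEmbeddingModel(RuntimeError):
    """Raised when no usable system embedding model row exists."""


def resolve_vector_dimensions(rows: Iterable[EmbeddingModelRow]) -> int:
    """Return the dimension of the system embedding model, or fail loudly.

    There is intentionally no default value: guessing a dimension would
    build an index that cannot match the stored embeddings.
    """
    for row in rows:
        if row.is_system_embedding_model and row.vector_dimensions is not None:
            return row.vector_dimensions
    raise MissingSystemEmbeddingModel(
        "no row has is_system_embedding_model=True with vector_dimensions set; "
        "configure one in /admin/ and re-run setup_neo4j_indexes"
    )
```

If a check like this ran inside the `init` sidecar, a fresh database (no rows yet) would always raise, which is the first-boot failure this change avoids by deferring the command.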
2026-05-10 16:15:28 -04:00
parent 19e2aee91c
commit afcbee8819
6 changed files with 102 additions and 155 deletions
--- a/README.md
+++ b/README.md
@@ -86,12 +86,15 @@ python manage.py setup_neo4j_indexes           # Create Neo4j vector + full-text
 > has ``is_system_embedding_model=True`` and a non-null ``vector_dimensions``.
 > There is deliberately no hardcoded fallback: an index built at the wrong
 > dimension silently breaks every search. The command will exit non-zero
-> with a clear error if no such row exists, which is also why the
-> ``docker compose`` ``init`` sidecar treats vector-index creation as
-> best-effort on first boot — the stack starts healthy, migrations and
-> library-type seed data land, and you run
-> ``docker compose exec app python manage.py setup_neo4j_indexes`` once
-> the embedding-model row is in place.
+> with a clear error if no such row exists, which is why the
+> ``docker compose`` ``init`` sidecar does **not** run
+> ``setup_neo4j_indexes``. On first boot the sidecar runs only ``migrate``
+> and ``load_library_types``; you then configure the system embedding
+> model in ``/admin/`` and run
+> ``docker compose exec app python manage.py setup_neo4j_indexes`` manually
+> once. Until that last step runs, vector search returns empty results and
+> ``library/apps.py`` logs a readiness warning. See
+> [Docker bootstrap order](#docker-bootstrap-order) below for the full flow.

 ### Start the web app

@@ -203,27 +206,45 @@ The per-service surface is defined by the `environment:` blocks in `docker-compo

 > **Broker URL gotcha.** If the RabbitMQ password contains any of `@ : / # % + ? & =` or a space, it must be percent-encoded in `CELERY_BROKER_URL`. Kombu's URL parser is strict, and this is the most common cause of a `PLAIN 403 ACCESS_REFUSED` at worker startup when the same credentials work fine under bare-Python `celery` invocations (because you were probably passing them as kwargs, not a URL).

-### First-time bring-up
+### Docker bootstrap order
+
+Three steps: the first and third are shell commands, and the middle step
+is a manual session in `/admin/` to configure the system embedding model.
+`setup_neo4j_indexes` is **not** run automatically: it reads vector
+dimensions from that admin row and hard-fails if the row is missing, so
+bundling it into the `init` sidecar would make `app` unreachable on
+first boot. Running it manually after admin configuration is the
+chicken-and-egg escape.

 ```bash
-# Generate the root .env from the template (or let Ansible do it)
+# 1. Generate the root .env from the template (or let Ansible do it),
+#    pull the image, and bring the stack up. The `init` sidecar runs
+#    `migrate` + `load_library_types` and exits; `app`, `mcp`, and
+#    `worker` come up healthy.
 cp .env.example .env && $EDITOR .env
-
-# Pull the image (or build locally with `docker compose build`)
 docker compose pull
-
-# Bring the stack up — the `init` sidecar runs migrations + library_type
-# defaults automatically. Vector indexes are deferred until you seed the
-# system embedding model (see below) — the sidecar logs a clear notice
-# and exits 0 either way, so the stack comes up healthy on first boot.
 docker compose up -d

-# Seed the system embedding model at /admin/llm_manager/llmmodel/
-# (mark one row `is_system_embedding_model=True` with `vector_dimensions`
-# set to whatever your embedding provider returns), then:
+# 2. Browse to /admin/llm_manager/llmapi/ and add the embedding provider
+#    (e.g. Pan Synesis, with the right base URL and API key). Then go to
+#    /admin/llm_manager/llmmodel/ and add one row for the embedding model:
+#       - api                       = the API you just created
+#       - name                      = the provider's model name
+#       - vector_dimensions         = whatever your embedding provider returns
+#       - is_system_embedding_model = True
+#    Save, then come back to the shell.
+
+# 3. Create Neo4j vector + full-text indexes at the right dimensions.
+#    Idempotent. After an embedding-model swap, re-run with `--drop` to
+#    rebuild the indexes; that also requires re-embedding all content.
 docker compose exec app python manage.py setup_neo4j_indexes
 ```

+Until step 3 runs, vector search returns empty results and
+`library/apps.py` logs a readiness warning each time the app boots. This
+is deliberate: an index built at the wrong dimension silently breaks
+every search, so loud failure beats quiet misconfiguration.
+
 ### Day-to-day

 ```bash