docs(bootstrap): clarify three-step Docker first-boot flow
Rework README and docker-compose comments to document the deliberate chicken-and-egg escape: the `init` sidecar now only runs `migrate` and `load_library_types`, leaving `setup_neo4j_indexes` as a manual step after the system embedding model is configured in `/admin/`. This avoids making `app` unreachable on first boot when no embedding model row exists yet, while preserving loud failure on dimension mismatch.
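
The flow this commit documents can be sketched as a small shell wrapper. This is an illustrative sketch only: the helper names (`run`, `step1_bring_up`, `step3_create_indexes`, `first_boot`) and the `DRY_RUN` switch are assumptions made for the example and do not exist in the repository; the `docker compose` commands are the ones quoted in the README changes in this diff.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the three-step first boot. Nothing here is
# part of the repository; it only strings together the documented commands.

# Print the command instead of running it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Step 1: pull the image and start the stack. Creating .env is left to the
# operator because it involves an interactive editor.
step1_bring_up() {
  run docker compose pull && run docker compose up -d
}

# Step 3: create the Neo4j indexes. A non-zero exit from the command is
# returned to the caller unchanged, so a failure stays loud.
step3_create_indexes() {
  run docker compose exec app python manage.py setup_neo4j_indexes
}

first_boot() {
  step1_bring_up || return 1
  # Step 2 is manual: configure the system embedding model in /admin/.
  printf 'Configure the system embedding model in /admin/, then press Enter: '
  read -r reply
  step3_create_indexes
}
```

Running `DRY_RUN=1 first_boot` prints the commands without touching Docker, which is a cheap way to check the order before a real first boot.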
README.md | 57
@@ -86,12 +86,15 @@ python manage.py setup_neo4j_indexes # Create Neo4j vector + full-text
 > has ``is_system_embedding_model=True`` and a non-null ``vector_dimensions``.
 > There is deliberately no hardcoded fallback: an index built at the wrong
 > dimension silently breaks every search. The command will exit non-zero
-> with a clear error if no such row exists, which is also why the
-> ``docker compose`` ``init`` sidecar treats vector-index creation as
-> best-effort on first boot — the stack starts healthy, migrations and
-> library-type seed data land, and you run
-> ``docker compose exec app python manage.py setup_neo4j_indexes`` once
-> the embedding-model row is in place.
+> with a clear error if no such row exists, which is why the
+> ``docker compose`` ``init`` sidecar does **not** run
+> ``setup_neo4j_indexes`` — the stack brings up `migrate` +
+> `load_library_types` only, you land in `/admin/` to configure the system
+> embedding model, and then you run
+> ``docker compose exec app python manage.py setup_neo4j_indexes`` manually
+> once. Until that last step runs, vector search returns empty results and
+> `library/apps.py` logs a readiness warning. See
+> [Docker bootstrap order](#docker-bootstrap-order) below for the full flow.

 ### Start the web app

@@ -203,27 +206,45 @@ The per-service surface is defined by the `environment:` blocks in `docker-compo

 > **Broker URL gotcha.** If the RabbitMQ password contains any of `@ : / # % + ? & =` or a space, it must be percent-encoded in `CELERY_BROKER_URL`. Kombu's URL parser is strict, and this is the most common cause of a `PLAIN 403 ACCESS_REFUSED` at worker startup when the same credentials work fine under bare-Python `celery` invocations (because you were probably passing them as kwargs, not a URL).

-### First-time bring-up
+### Docker bootstrap order

+Three steps — the first and third are one-liners, the middle step is a
+manual sit-down in `/admin/` to configure the system embedding model.
+`setup_neo4j_indexes` is **not** run automatically: it reads vector
+dimensions from that admin row and hard-fails if the row is missing, so
+bundling it into the `init` sidecar would make `app` unreachable on
+first boot. Running it manually after admin configuration is the
+chicken-and-egg escape.
+
 ```bash
-# Generate the root .env from the template (or let Ansible do it)
+# 1. Generate the root .env from the template (or let Ansible do it),
+# pull the image, and bring the stack up. The `init` sidecar runs
+# `migrate` + `load_library_types` and exits; `app`, `mcp`, and
+# `worker` come up healthy.
 cp .env.example .env && $EDITOR .env
-
-# Pull the image (or build locally with `docker compose build`)
 docker compose pull
-
-# Bring the stack up — the `init` sidecar runs migrations + library_type
-# defaults automatically. Vector indexes are deferred until you seed the
-# system embedding model (see below) — the sidecar logs a clear notice
-# and exits 0 either way, so the stack comes up healthy on first boot.
 docker compose up -d

-# Seed the system embedding model at /admin/llm_manager/llmmodel/
-# (mark one row `is_system_embedding_model=True` with `vector_dimensions`
-# set to whatever your embedding provider returns), then:
+# 2. Browse to /admin/llm_manager/llmapi/ and add the embedding provider
+# (e.g. Pan Synesis, with the right base URL and API key). Then
+# /admin/llm_manager/llmmodel/ and add one row for the embedding model:
+# - api = the api you just created
+# - name = the provider's model name
+# - vector_dimensions = whatever your embedding provider returns
+# - is_system_embedding_model = True
+# Save, then come back to the shell.
+
+# 3. Create Neo4j vector + full-text indexes at the right dimensions.
+# Idempotent — re-run after an embedding-model swap with `--drop` to
+# rebuild, which requires re-embedding all content.
 docker compose exec app python manage.py setup_neo4j_indexes
 ```

+Until step 3 runs, vector search returns empty results and
+`library/apps.py` logs a readiness warning each time the app boots. This
+is deliberate: an index built at the wrong dimension silently breaks
+every search, so loud failure beats quiet misconfiguration.
+
 ### Day-to-day

 ```bash