docs(readme): clarify embedding model seed order for Neo4j indexes

Document that the system embedding model must be seeded before running
`setup_neo4j_indexes`, since vector index dimensions are read from the
`llm_manager_llmmodel` row. Update Docker instructions to reflect the
`init` sidecar behavior, which now runs migrations and library_type
defaults automatically while deferring vector index creation.
2026-05-10 14:02:41 -04:00
parent bbd65b1300
commit 19e2aee91c
2 changed files with 134 additions and 8 deletions

@@ -76,10 +76,23 @@ Hosts in the Ouranos lab:
```bash
cd mnemosyne/
python manage.py migrate # Apply Django ORM migrations
-python manage.py setup_neo4j_indexes # Create Neo4j vector + full-text indexes
python manage.py load_library_types # Load LIBRARY_TYPE_DEFAULTS into Neo4j
+# --- seed the system embedding model in /admin/llm_manager/llmmodel/ here ---
+python manage.py setup_neo4j_indexes # Create Neo4j vector + full-text indexes
```
+> **Seed the embedding model before running `setup_neo4j_indexes`.** Vector
+> index dimensions are read from the row in ``llm_manager_llmmodel`` that
+> has ``is_system_embedding_model=True`` and a non-null ``vector_dimensions``.
+> There is deliberately no hardcoded fallback: an index built at the wrong
+> dimension silently breaks every search. The command will exit non-zero
+> with a clear error if no such row exists, which is also why the
+> ``docker compose`` ``init`` sidecar treats vector-index creation as
+> best-effort on first boot — the stack starts healthy, migrations and
+> library-type seed data land, and you run
+> ``docker compose exec app python manage.py setup_neo4j_indexes`` once
+> the embedding-model row is in place.
### Start the web app
The Django REST API serves `/library/api/*` (libraries, collections, items, search, workspaces, ingest) and Django admin. Use Gunicorn in production; `runserver` for dev.
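
Reviewer note: the rule this hunk documents (dimensions come only from the seeded row, with no fallback, and the command exits non-zero otherwise) can be sketched in plain Python. This is an illustration under stated assumptions, not the project's actual command: `ModelRow`, `EmbeddingModelNotSeeded`, `resolve_vector_dimensions`, and `main` are hypothetical names, and a list of dataclass instances stands in for the `llm_manager_llmmodel` table.

```python
import sys
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class ModelRow:
    """Hypothetical stand-in for one row of llm_manager_llmmodel."""
    name: str
    is_system_embedding_model: bool = False
    vector_dimensions: Optional[int] = None


class EmbeddingModelNotSeeded(Exception):
    """Raised when no usable system embedding model row exists."""


def resolve_vector_dimensions(rows: Iterable[ModelRow]) -> int:
    """Return the seeded model's dimension; deliberately no default value."""
    for row in rows:
        if row.is_system_embedding_model and row.vector_dimensions is not None:
            return row.vector_dimensions
    raise EmbeddingModelNotSeeded(
        "no row with is_system_embedding_model=True and vector_dimensions set; "
        "seed one in /admin/llm_manager/llmmodel/ and re-run setup_neo4j_indexes"
    )


def main(rows: Iterable[ModelRow]) -> int:
    """Mimic the described exit status: 0 on success, non-zero if unseeded."""
    try:
        dims = resolve_vector_dimensions(rows)
    except EmbeddingModelNotSeeded as exc:
        print(f"error: {exc}", file=sys.stderr)
        return 1
    # Assumption: the real command would create the Neo4j indexes here.
    print(f"would create vector indexes with {dims} dimensions")
    return 0
```

Raising instead of returning a default mirrors the README's reasoning: a guessed dimension would produce an index that exists but cannot match any stored embedding.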
@@ -199,14 +212,16 @@ cp .env.example .env && $EDITOR .env
# Pull the image (or build locally with `docker compose build`)
docker compose pull
-# DB migrations (one-shot)
-docker compose run --rm app migrate
-# Neo4j indexes + library_type defaults (one-shot)
-docker compose run --rm app setup
-# Bring the stack up
+# Bring the stack up — the `init` sidecar runs migrations + library_type
+# defaults automatically. Vector indexes are deferred until you seed the
+# system embedding model (see below) — the sidecar logs a clear notice
+# and exits 0 either way, so the stack comes up healthy on first boot.
docker compose up -d
+# Seed the system embedding model at /admin/llm_manager/llmmodel/
+# (mark one row `is_system_embedding_model=True` with `vector_dimensions`
+# set to whatever your embedding provider returns), then:
+docker compose exec app python manage.py setup_neo4j_indexes
```
### Day-to-day
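
Reviewer note: the `init` sidecar behavior described in this hunk (required steps first, then a best-effort attempt at vector indexes that never fails the container) might look roughly like the sketch below. The diff does not show the sidecar's entrypoint, so everything here is an assumption: the exact commands, the `run_init` function, the injectable `run` callable, and the choice to propagate a failure from the required steps.

```python
from typing import Callable, List, Sequence

# A runner takes a command (argv list) and returns its exit code.
Runner = Callable[[Sequence[str]], int]

# Assumption: these mirror the manual commands shown earlier in the README.
REQUIRED_STEPS: List[List[str]] = [
    ["python", "manage.py", "migrate"],
    ["python", "manage.py", "load_library_types"],
]
BEST_EFFORT_STEP: List[str] = ["python", "manage.py", "setup_neo4j_indexes"]


def run_init(run: Runner, log: Callable[[str], None] = print) -> int:
    """Run required steps, then try vector indexes without failing on them."""
    for cmd in REQUIRED_STEPS:
        code = run(cmd)
        if code != 0:
            # Assumption: a failed migration or seed step should stop the stack.
            log(f"error: {' '.join(cmd)} exited with {code}")
            return code
    if run(BEST_EFFORT_STEP) != 0:
        # Deferred, not fatal: the embedding model row is probably not seeded yet.
        log(
            "notice: vector indexes deferred; seed the system embedding model, "
            "then run: docker compose exec app python manage.py setup_neo4j_indexes"
        )
    return 0
```

In a real entrypoint the runner could be `lambda cmd: subprocess.run(cmd).returncode`; passing it in as a parameter keeps the control flow checkable without a database.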