feat: add init sidecar for migrations and setup on compose up

Introduces a one-shot `init` service in docker-compose that runs Postgres migrations, Neo4j index setup, and library-type seeding on every `up`. Long-running services (`app`, `mcp`, `worker`) now depend on its successful completion via `service_completed_successfully`, so configuration errors (missing embedding model, dimension mismatch, unreachable DB) block the stack at startup instead of letting it serve silent zero-result searches.

Also standardizes reranker test fixtures to use the `/v1` OpenAI-style base URL convention used across other service clients.
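
As a rough illustration of the fail-fast behaviour described above, the following Python sketch builds the three vector-index statements from a single dimension value and refuses to proceed when none is configured. The function name, the `ConfigurationError` class, and the `IF NOT EXISTS` clause are assumptions made for illustration; the actual `setup_neo4j_indexes` implementation is not shown in this excerpt.

```python
from typing import Optional

# Index name -> (Cypher variable, node label), taken from the names in the diff.
VECTOR_INDEXES = {
    "chunk_embedding_index": ("c", "Chunk"),
    "concept_embedding_index": ("con", "Concept"),
    "image_embedding_index": ("ie", "ImageEmbedding"),
}


class ConfigurationError(RuntimeError):
    """Hypothetical error type: raised when no usable embedding model is configured."""


def build_vector_index_statements(vector_dimensions: Optional[int]) -> list[str]:
    """Return one CREATE VECTOR INDEX statement per index, or fail fast."""
    if not vector_dimensions or vector_dimensions <= 0:
        # Failing here (rather than guessing a default) is what lets the
        # init step stop the stack instead of creating mismatched indexes.
        raise ConfigurationError("no system embedding model with vector_dimensions set")

    statements = []
    for name, (var, label) in VECTOR_INDEXES.items():
        # IF NOT EXISTS is an assumption: it keeps the step idempotent
        # when it is re-run on every `docker compose up`.
        statements.append(
            f"CREATE VECTOR INDEX {name} IF NOT EXISTS FOR ({var}:{label}) "
            f"ON ({var}.embedding) OPTIONS {{indexConfig: {{"
            f"`vector.dimensions`: {vector_dimensions}, "
            f"`vector.similarity_function`: 'cosine'}}}}"
        )
    return statements
```

If a check like this raises, the one-shot container exits non-zero, and services that depend on it with `service_completed_successfully` are not started.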
@@ -294,31 +294,37 @@ graph LR
 </div>

 <div class="alert alert-warning border-start border-4 border-warning">
-<h4><i class="bi bi-lightning"></i> Neo4j Vector Indexes</h4>
-<pre class="bg-light p-3 rounded mb-0"><code>// Chunk text+image embeddings (4096 dimensions, no pgvector limits!)
-CREATE VECTOR INDEX chunk_embedding FOR (c:Chunk)
+<h4><i class="bi bi-lightning"></i> Neo4j Indexes (managed by <code>setup_neo4j_indexes</code>)</h4>
+<p>Created by the <code>init</code> sidecar on every <code>docker compose up</code>. Vector dimensions come from the system embedding model's <code>vector_dimensions</code> field — the command fails if no model is configured. Current production model: <strong>Pan Synesis · qwen3-vl-embedding-2b · 2048d</strong>.</p>
+<pre class="bg-light p-3 rounded mb-0"><code>// Chunk text+image embeddings (dimensions read from system embedding model)
+CREATE VECTOR INDEX chunk_embedding_index FOR (c:Chunk)
 ON (c.embedding) OPTIONS {indexConfig: {
-`vector.dimensions`: 4096,
+`vector.dimensions`: 2048,
 `vector.similarity_function`: 'cosine'
 }}

 // Concept embeddings for semantic concept search
-CREATE VECTOR INDEX concept_embedding FOR (con:Concept)
+CREATE VECTOR INDEX concept_embedding_index FOR (con:Concept)
 ON (con.embedding) OPTIONS {indexConfig: {
-`vector.dimensions`: 4096,
+`vector.dimensions`: 2048,
 `vector.similarity_function`: 'cosine'
 }}

 // Image multimodal embeddings
-CREATE VECTOR INDEX image_embedding FOR (ie:ImageEmbedding)
+CREATE VECTOR INDEX image_embedding_index FOR (ie:ImageEmbedding)
 ON (ie.embedding) OPTIONS {indexConfig: {
-`vector.dimensions`: 4096,
+`vector.dimensions`: 2048,
 `vector.similarity_function`: 'cosine'
 }}

-// Full-text index for keyword/BM25-style search
-CREATE FULLTEXT INDEX chunk_fulltext FOR (c:Chunk) ON EACH [c.text_preview]</code></pre>
+// Full-text indexes (BM25-style keyword search)
+CREATE FULLTEXT INDEX chunk_text_fulltext FOR (c:Chunk) ON EACH [c.text_preview]
+CREATE FULLTEXT INDEX concept_name_fulltext FOR (c:Concept) ON EACH [c.name]
+CREATE FULLTEXT INDEX item_title_fulltext FOR (i:Item) ON EACH [i.title]
+CREATE FULLTEXT INDEX library_name_fulltext FOR (l:Library) ON EACH [l.name]</code></pre>
+<p class="mb-0 mt-3"><strong>Changing the embedding model or dimensions is a re-embedding event.</strong> Drop + recreate the vector indexes (<code>setup_neo4j_indexes --drop</code>) and re-queue all content for embedding. Old vectors at the previous dimension remain on the nodes until overwritten but are no longer indexed.</p>
 </div>

 </section>

 <!-- SECTION: CONTENT TYPES -->
@@ -521,10 +527,11 @@ flowchart TD
 <div class="card-body">
 <p>Cosine similarity via Neo4j vector index on Chunk and ImageEmbedding nodes.</p>
 <pre class="bg-light p-2 rounded"><code>CALL db.index.vector.queryNodes(
-'chunk_embedding', 30,
+'chunk_embedding_index', 30,
 $query_vector
 ) YIELD node, score
 WHERE score > $threshold</code></pre>
+
 </div>
 </div>
 </div>
@@ -548,9 +555,10 @@ RETURN c2, i2</code></pre>
 <div class="card-body">
 <p>Neo4j native full-text index for keyword matching (BM25-equivalent).</p>
 <pre class="bg-light p-2 rounded"><code>CALL db.index.fulltext.queryNodes(
-'chunk_fulltext',
+'chunk_text_fulltext',
 $query_text
 ) YIELD node, score</code></pre>
+
 </div>
 </div>
 </div>