feat: add Phase 3 hybrid search with Synesis reranking

Implement hybrid search pipeline combining vector, fulltext, and graph search across Neo4j, with cross-attention reranking via Synesis (Qwen3-VL-Reranker-2B) `/v1/rerank` endpoint.

- Add SearchService with vector, fulltext, and graph search strategies
- Add SynesisRerankerClient for multimodal reranking via HTTP API
- Add search API endpoint (POST /search/) with filtering by library, collection, and library_type
- Add SearchRequest/Response serializers and image search results
- Add "nonfiction" to library_type choices
- Consolidate reranker stack from two models to single Synesis service
- Handle image analysis_status as "skipped" when analysis is unavailable
- Add comprehensive tests for search pipeline and reranker client
2026-03-29 18:09:50 +00:00
parent fb38a881d9
commit 634845fee0
27 changed files with 5680 additions and 4 deletions
--- a/docs/PHASE_3_SEARCH_AND_RERANKING.md
+++ b/docs/PHASE_3_SEARCH_AND_RERANKING.md
@@ -0,0 +1,384 @@
+# Phase 3: Search & Re-ranking
+
+## Objective
+
+Build the complete hybrid search pipeline: accept a query → embed it → search Neo4j (vector + full-text + graph traversal) → fuse candidates → re-rank via Synesis → return ranked results with content-type context. At the end of this phase, content is discoverable through multiple search modalities, ranked by cross-attention relevance, and ready for Phase 4's RAG generation.
+
+## Heritage
+
+The hybrid search architecture adapts patterns from [Spelunker](https://git.helu.ca/r/spelunker)'s two-stage retrieval pipeline — vector recall + cross-attention re-ranking — enhanced with knowledge graph traversal, multimodal search, and content-type-aware re-ranking instructions.
+
+## Architecture Overview
+
+```
+User Query (text, optional image, optional filters)
+  │
+  ├─→ Vector Search (Neo4j vector index — Chunk.embedding)
+  │     → Top-K nearest neighbors by cosine similarity
+  │
+  ├─→ Full-Text Search (Neo4j fulltext index — Chunk.text_preview, Concept.name)
+  │     → BM25-scored matches
+  │
+  ├─→ Graph Search (Cypher traversal)
+  │     → Concept-linked chunks via MENTIONS/REFERENCES/DEPICTS edges
+  │
+  └─→ Image Search (Neo4j vector index — ImageEmbedding.embedding)
+        → Multimodal similarity (text-to-image in unified vector space)
+          │
+          └─→ Candidate Fusion (Reciprocal Rank Fusion)
+                → Deduplicated, scored candidate list
+                  │
+                  └─→ Re-ranking (Synesis /v1/rerank)
+                        → Content-type-aware instruction injection
+                        → Cross-attention precision scoring
+                          │
+                          └─→ Final ranked results with metadata
+```
+
+## Synesis Integration
+
+[Synesis](docs/synesis_api_usage_guide.html) is a custom FastAPI service built around Qwen3-VL-2B, providing both embedding and re-ranking over a clean REST API. It runs on `pan.helu.ca:8400`.
+
+**Embedding** (Phase 2, already working): Synesis's `/v1/embeddings` endpoint is OpenAI-compatible — the existing `EmbeddingClient` handles it with `api_type="openai"`.
+
+**Re-ranking** (Phase 3, new): Synesis's `/v1/rerank` endpoint provides:
+- Native `instruction` parameter — maps directly to `reranker_instruction` from content types
+- `top_n` for server-side truncation
+- Multimodal support — both query and documents can include images
+- Relevance scores for each candidate
+
+```python
+# Synesis rerank request (sent with `requests`; response handling omitted)
+import requests
+
+response = requests.post(
+    "http://pan.helu.ca:8400/v1/rerank",
+    json={
+        "query": {"text": "How do I configure a 3-phase motor?"},
+        "documents": [
+            {"text": "The motor controller requires..."},
+            {"text": "3-phase power is distributed..."},
+        ],
+        "instruction": "Re-rank passages from technical documentation based on procedural relevance.",
+        "top_n": 10,
+    },
+    timeout=30,  # same value as the RERANKER_TIMEOUT default
+)
+```
+
+## Deliverables
+
+### 1. Search Service (`library/services/search.py`)
+
+The core search orchestrator. Accepts a `SearchRequest`, dispatches to individual search backends, fuses results, and optionally re-ranks.
+
+#### SearchRequest
+
+```python
+from dataclasses import dataclass, field
+
+@dataclass
+class SearchRequest:
+    query: str                           # Natural language query text
+    query_image: bytes | None = None     # Optional image for multimodal search
+    library_uid: str | None = None       # Scope to specific library
+    library_type: str | None = None      # Scope to library type
+    collection_uid: str | None = None    # Scope to specific collection
+    # A default is required here because the fields above already have defaults
+    search_types: list[str] = field(
+        default_factory=lambda: ["vector", "fulltext", "graph"]
+    )
+    limit: int = 20                      # Max results after fusion
+    vector_top_k: int = 50               # Candidates from vector search
+    fulltext_top_k: int = 30             # Candidates from fulltext search
+    graph_max_depth: int = 2             # Graph traversal depth
+    rerank: bool = True                  # Apply re-ranking
+    include_images: bool = True          # Include image results
+```
+
+#### SearchResponse
+
+```python
+@dataclass
+class SearchCandidate:
+    chunk_uid: str
+    item_uid: str
+    item_title: str
+    library_type: str
+    text_preview: str
+    chunk_s3_key: str
+    chunk_index: int
+    score: float                         # Final score (post-fusion or post-rerank)
+    source: str                          # "vector", "fulltext", "graph"
+    metadata: dict                       # Page, section, nearby images, etc.
+
+@dataclass
+class ImageSearchResult:
+    image_uid: str
+    item_uid: str
+    item_title: str
+    image_type: str
+    description: str
+    s3_key: str
+    score: float
+    source: str                          # "vector", "graph"
+
+@dataclass
+class SearchResponse:
+    query: str
+    candidates: list[SearchCandidate]    # Ranked text results
+    images: list[ImageSearchResult]      # Ranked image results
+    total_candidates: int                # Pre-fusion candidate count
+    search_time_ms: float
+    reranker_used: bool
+    reranker_model: str | None
+    search_types_used: list[str]
+```
+
+### 2. Vector Search
+
+Uses Neo4j's `db.index.vector.queryNodes()` against `chunk_embedding_index`.
+
+- Embed query text using system embedding model (via existing `EmbeddingClient`)
+- Prepend library's `embedding_instruction` when scoped to a specific library
+- Query Neo4j vector index for top-K Chunk nodes by cosine similarity
+- Filter by library/collection via graph pattern matching
+
+```cypher
+CALL db.index.vector.queryNodes('chunk_embedding_index', $top_k, $query_vector)
+YIELD node AS chunk, score
+MATCH (item:Item)-[:HAS_CHUNK]->(chunk)
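+OPTIONAL MATCH (lib:Library)-[:CONTAINS]->(col:Collection)-[:CONTAINS]->(item)
+// WITH makes the WHERE filter rows; a WHERE attached directly to OPTIONAL MATCH
+// would only null out lib/col and leave out-of-scope chunks in the result
+WITH chunk, score, item, lib, col
+WHERE ($library_uid IS NULL OR lib.uid = $library_uid)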
+  AND ($library_type IS NULL OR lib.library_type = $library_type)
+  AND ($collection_uid IS NULL OR col.uid = $collection_uid)
+RETURN chunk.uid AS chunk_uid, chunk.text_preview AS text_preview,
+       chunk.chunk_s3_key AS chunk_s3_key, chunk.chunk_index AS chunk_index,
+       item.uid AS item_uid, item.title AS item_title,
+       lib.library_type AS library_type, score
+ORDER BY score DESC
+LIMIT $top_k
+```
+
+### 3. Full-Text Search
+
+Uses Neo4j fulltext indexes created by `setup_neo4j_indexes`.
+
+- Query `chunk_text_fulltext` for Chunk matches (BM25)
+- Query `concept_name_fulltext` for Concept matches → traverse to connected Chunks
+- Query `item_title_fulltext` for Item title matches → get their Chunks
+- Normalize BM25 scores to 0-1 range for fusion compatibility
+
+```cypher
+// Chunk full-text search
+CALL db.index.fulltext.queryNodes('chunk_text_fulltext', $query)
+YIELD node AS chunk, score
+MATCH (item:Item)-[:HAS_CHUNK]->(chunk)
+OPTIONAL MATCH (lib:Library)-[:CONTAINS]->(col:Collection)-[:CONTAINS]->(item)
+// WITH makes the WHERE filter rows instead of only constraining the optional pattern
+WITH chunk, score, item, lib
+WHERE ($library_uid IS NULL OR lib.uid = $library_uid)
+RETURN chunk.uid AS chunk_uid, chunk.text_preview AS text_preview,
+       item.uid AS item_uid, item.title AS item_title,
+       lib.library_type AS library_type, score
+ORDER BY score DESC
+LIMIT $top_k
+
+// Concept-to-Chunk traversal
+CALL db.index.fulltext.queryNodes('concept_name_fulltext', $query)
+YIELD node AS concept, score AS concept_score
+MATCH (chunk:Chunk)-[:MENTIONS]->(concept)
+MATCH (item:Item)-[:HAS_CHUNK]->(chunk)
+RETURN chunk.uid AS chunk_uid, chunk.text_preview AS text_preview,
+       item.uid AS item_uid, item.title AS item_title,
+       concept_score * 0.8 AS score
+```
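The score normalization listed above is not spelled out in this document. One minimal sketch, assuming a simple divide-by-maximum scaling (the helper name `normalize_scores` is hypothetical, and the actual commit may normalize differently):

```python
def normalize_scores(scores):
    """Scale non-negative scores into [0, 1] by dividing by the maximum.

    Divide-by-max is an assumed method; the design doc only says scores
    are normalized to a 0-1 range.
    """
    if not scores:
        return []
    top = max(scores)
    if top <= 0:
        # Avoid division by zero when every score is zero
        return [0.0 for _ in scores]
    return [s / top for s in scores]


raw_bm25 = [7.5, 3.0, 1.5]
print(normalize_scores(raw_bm25))  # [1.0, 0.4, 0.2]
```

Dividing by the maximum keeps the relative ordering intact and always maps the best hit to 1.0, which makes the result dependent on the candidate set of each individual query.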
+
+### 4. Graph Search
+
+Knowledge-graph-powered discovery — the differentiator from standard RAG.
+
+- Match query terms against Concept names via fulltext index
+- Traverse `Concept ←[MENTIONS]- Chunk ←[HAS_CHUNK]- Item`
+- Expand via `Concept -[RELATED_TO]- Concept` for secondary connections
+- Score based on relationship weight and traversal depth
+
+```cypher
+// Concept graph traversal
+CALL db.index.fulltext.queryNodes('concept_name_fulltext', $query)
+YIELD node AS concept, score
+MATCH path = (concept)<-[:MENTIONS|REFERENCES*1..2]-(connected)
+WHERE connected:Chunk OR connected:Item
+WITH concept, connected, score, length(path) AS depth
+MATCH (item:Item)-[:HAS_CHUNK]->(chunk)
+WHERE chunk = connected OR item = connected
+RETURN DISTINCT chunk.uid AS chunk_uid, chunk.text_preview AS text_preview,
+       item.uid AS item_uid, item.title AS item_title,
+       score / (depth * 0.5 + 1) AS score
+```
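To make the depth discount in the `RETURN` clause concrete, the same arithmetic can be reproduced in Python. The function name is illustrative only; the formula `score / (depth * 0.5 + 1)` is taken from the query above:

```python
def depth_discounted_score(score, depth):
    """Mirror of the Cypher expression score / (depth * 0.5 + 1)."""
    return score / (depth * 0.5 + 1)


# A concept match with fulltext score 3.0, reached at depth 1 and depth 2
for depth in (1, 2):
    print(depth, depth_discounted_score(3.0, depth))
# 1 2.0
# 2 1.5
```

A direct hit (depth 1) keeps two thirds of the concept score, while a two-hop hit keeps half.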
+
+### 5. Image Search
+
+Multimodal vector search against `image_embedding_index`.
+
+- Embed query text (or image) using system embedding model
+- Search `ImageEmbedding` vectors in unified multimodal space
+- Return with Image descriptions, OCR text, and Item associations from Phase 2B
+- Also include images found via concept graph DEPICTS relationships
+
+### 6. Candidate Fusion (`library/services/fusion.py`)
+
+Reciprocal Rank Fusion (RRF) — parameter-light, proven in Spelunker.
+
+```python
+def reciprocal_rank_fusion(
+    result_lists: list[list[SearchCandidate]],
+    k: int = 60,
+) -> list[SearchCandidate]:
+    """
+    RRF score = Σ 1 / (k + rank_i) for each list containing the candidate.
+    Candidates in multiple lists get boosted.
+    """
+```
+
+- Deduplicates candidates by `chunk_uid`
+- Candidates appearing in multiple search types get naturally boosted
+- Sort by fused score descending, trim to `limit`
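The three behaviours listed above can be sketched in a few lines. This is an illustration under stated assumptions, not the code in `fusion.py`: `Candidate` is a pared-down, hypothetical stand-in for `SearchCandidate`, ranks are assumed to be 1-based, and the `limit` parameter is added here for the trimming step.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Candidate:
    """Hypothetical stand-in for SearchCandidate with only the fields fusion needs."""
    chunk_uid: str
    score: float
    source: str


def reciprocal_rank_fusion(result_lists, k=60, limit=20):
    """Fuse ranked lists; each candidate scores sum(1 / (k + rank)) over the lists containing it."""
    fused_scores = {}
    first_seen = {}
    for results in result_lists:
        for rank, candidate in enumerate(results, start=1):  # 1-based ranks (assumption)
            uid = candidate.chunk_uid
            fused_scores[uid] = fused_scores.get(uid, 0.0) + 1.0 / (k + rank)
            first_seen.setdefault(uid, candidate)  # deduplicate by chunk_uid
    ranked = sorted(fused_scores, key=fused_scores.get, reverse=True)
    # Replace each candidate's original score with its fused score, then trim
    return [replace(first_seen[uid], score=fused_scores[uid]) for uid in ranked[:limit]]


vector_hits = [Candidate("a", 0.92, "vector"), Candidate("b", 0.85, "vector")]
fulltext_hits = [Candidate("b", 0.70, "fulltext"), Candidate("c", 0.40, "fulltext")]
fused = reciprocal_rank_fusion([vector_hits, fulltext_hits])
print([c.chunk_uid for c in fused])  # ['b', 'a', 'c']
```

Chunk `b` is ranked second by vector search and first by fulltext search, so its fused score (1/62 + 1/61) beats chunk `a`, which only appears once (1/61). Only ranks enter the formula, so the raw scores of the individual backends never need to be on a common scale for this step.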
+
+### 7. Re-ranking Client (`library/services/reranker.py`)
+
+Targets Synesis's `POST /v1/rerank` endpoint. Wraps the system reranker model's API configuration.
+
+#### Synesis Backend
+
+```python
+class RerankerClient:
+    def rerank(
+        self,
+        query: str,
+        candidates: list[SearchCandidate],
+        instruction: str = "",
+        top_n: int | None = None,
+        query_image: bytes | None = None,
+    ) -> list[SearchCandidate]:
+        """
+        Re-rank candidates via Synesis /v1/rerank.
+        
+        Injects content-type reranker_instruction as the instruction parameter.
+        """
+```
+
+Features:
+- Uses `text_preview` (500 chars) for document text — avoids S3 round-trips
+- Passes the library's `reranker_instruction` as the `instruction` parameter
+- Supports multimodal queries (text + image)
+- Falls back gracefully when no reranker model configured
+- Tracks usage via `LLMUsage` with `purpose="reranking"`
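As a rough illustration of the first two features, the request body could be assembled as below. This is a text-only sketch, not the client in `reranker.py`: the helper name `build_rerank_payload` is hypothetical, the field names follow the request example in the Synesis Integration section, and the caps of 500 characters and 32 candidates come from the feature list and the `RERANKER_MAX_CANDIDATES` default. Image encoding and response parsing are left out because their formats are not shown in this document.

```python
def build_rerank_payload(query, previews, instruction="", top_n=None, max_candidates=32):
    """Build a /v1/rerank request body from candidate text previews (text-only sketch)."""
    # Cap the candidate count and the preview length before sending
    documents = [{"text": preview[:500]} for preview in previews[:max_candidates]]
    payload = {"query": {"text": query}, "documents": documents}
    if instruction:
        # Omitting an empty instruction is an assumption of this sketch
        payload["instruction"] = instruction
    if top_n is not None:
        payload["top_n"] = top_n
    return payload


payload = build_rerank_payload(
    "How do I configure a 3-phase motor?",
    ["The motor controller requires...", "3-phase power is distributed..."],
    instruction="Re-rank passages from technical documentation based on procedural relevance.",
    top_n=10,
)
print(sorted(payload))  # ['documents', 'instruction', 'query', 'top_n']
```

Keeping payload construction separate from the HTTP call makes instruction injection and truncation testable without mocking the network.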
+
+### 8. Search API Endpoints
+
+New endpoints in `library/api/`:
+
+| Method | Route | Purpose |
+|--------|-------|---------|
+| `POST` | `/api/v1/library/search/` | Full hybrid search + re-rank |
+| `POST` | `/api/v1/library/search/vector/` | Vector-only search (debugging) |
+| `POST` | `/api/v1/library/search/fulltext/` | Full-text-only search (debugging) |
+| `GET` | `/api/v1/library/concepts/` | List/search concepts |
+| `GET` | `/api/v1/library/concepts/<uid>/graph/` | Concept neighborhood graph |
+
+### 9. Search UI Views
+
+| URL | View | Purpose |
+|-----|------|---------|
+| `/library/search/` | `search` | Search page with query input + filters |
+| `/library/concepts/` | `concept_list` | Browse concepts with search |
+| `/library/concepts/<uid>/` | `concept_detail` | Single concept with connections |
+
+### 10. Prometheus Metrics
+
+| Metric | Type | Labels | Purpose |
+|--------|------|--------|---------|
+| `mnemosyne_search_requests_total` | Counter | search_type, library_type | Search throughput |
+| `mnemosyne_search_duration_seconds` | Histogram | search_type | Per-search-type latency |
+| `mnemosyne_search_candidates_total` | Histogram | search_type | Candidates per search type |
+| `mnemosyne_fusion_duration_seconds` | Histogram | — | Fusion latency |
+| `mnemosyne_rerank_requests_total` | Counter | model_name, status | Re-rank throughput |
+| `mnemosyne_rerank_duration_seconds` | Histogram | model_name | Re-rank latency |
+| `mnemosyne_rerank_candidates` | Histogram | — | Candidates sent to reranker |
+| `mnemosyne_search_total_duration_seconds` | Histogram | — | End-to-end search latency |
+
+### 11. Management Commands
+
+| Command | Purpose |
+|---------|---------|
+| `search <query> [--library-uid] [--limit] [--no-rerank]` | CLI search for testing |
+| `search_stats` | Search index statistics |
+
+### 12. Settings
+
+```python
+# Search configuration
+SEARCH_VECTOR_TOP_K = env.int("SEARCH_VECTOR_TOP_K", default=50)
+SEARCH_FULLTEXT_TOP_K = env.int("SEARCH_FULLTEXT_TOP_K", default=30)
+SEARCH_GRAPH_MAX_DEPTH = env.int("SEARCH_GRAPH_MAX_DEPTH", default=2)
+SEARCH_RRF_K = env.int("SEARCH_RRF_K", default=60)
+SEARCH_DEFAULT_LIMIT = env.int("SEARCH_DEFAULT_LIMIT", default=20)
+RERANKER_MAX_CANDIDATES = env.int("RERANKER_MAX_CANDIDATES", default=32)
+RERANKER_TIMEOUT = env.int("RERANKER_TIMEOUT", default=30)
+```
+
+## File Structure
+
+```
+mnemosyne/library/
+├── services/
+│   ├── search.py              # NEW — SearchService orchestrator
+│   ├── fusion.py              # NEW — Reciprocal Rank Fusion
+│   ├── reranker.py            # NEW — Synesis re-ranking client
+│   └── ...                    # Existing services unchanged
+├── metrics.py                 # Modified — add search/rerank metrics
+├── views.py                   # Modified — add search UI views
+├── urls.py                    # Modified — add search routes
+├── api/
+│   ├── views.py               # Modified — add search API endpoints
+│   ├── serializers.py         # Modified — add search serializers
+│   └── urls.py                # Modified — add search API routes
+├── management/commands/
+│   ├── search.py              # NEW — CLI search command
+│   └── search_stats.py        # NEW — Index statistics
+├── templates/library/
+│   ├── search.html            # NEW — Search page
+│   ├── concept_list.html      # NEW — Concept browser
+│   └── concept_detail.html    # NEW — Concept detail
+└── tests/
+    ├── test_search.py         # NEW — Search service tests
+    ├── test_fusion.py         # NEW — RRF fusion tests
+    ├── test_reranker.py       # NEW — Re-ranking client tests
+    └── test_search_api.py     # NEW — Search API endpoint tests
+```
+
+## Dependencies
+
+No new Python dependencies required. Phase 3 uses:
+- `neomodel` + raw Cypher (Neo4j search)
+- `requests` (Synesis reranker HTTP)
+- `EmbeddingClient` from Phase 2 (query embedding)
+- `prometheus_client` (metrics)
+
+## Testing Strategy
+
+All tests use Django `TestCase`. External services mocked.
+
+| Test File | Scope |
+|-----------|-------|
+| `test_search.py` | SearchService orchestration, individual search methods, library/collection scoping |
+| `test_fusion.py` | RRF correctness, deduplication, score calculation, edge cases |
+| `test_reranker.py` | Synesis backend (mocked HTTP), instruction injection, graceful fallback |
+| `test_search_api.py` | API endpoints, request validation, response format |
+
+## Success Criteria
+
+- [ ] Vector search returns Chunk nodes ranked by cosine similarity from Neo4j
+- [ ] Full-text search returns matches from Neo4j fulltext indexes
+- [ ] Graph search traverses Concept relationships to discover related content
+- [ ] Image search returns images via multimodal vector similarity
+- [ ] Reciprocal Rank Fusion correctly merges and deduplicates across search types
+- [ ] Re-ranking via Synesis `/v1/rerank` re-scores candidates with cross-attention
+- [ ] Content-type `reranker_instruction` injected per library type
+- [ ] Search scoping works (by library, library type, collection)
+- [ ] Search gracefully degrades: no reranker → skip; no embedding model → clear error
+- [ ] Search API endpoints return structured results with scores and metadata
+- [ ] Search UI allows querying with filters and displays ranked results
+- [ ] Concept explorer allows browsing the knowledge graph
+- [ ] Prometheus metrics track search throughput, latency, and candidate counts
+- [ ] CLI search command works for testing
+- [ ] All tests pass with mocked external services