# Phase 3: Search & Re-ranking

## Objective

Build the complete hybrid search pipeline: accept a query → embed it → search Neo4j (vector + full-text + graph traversal) → fuse candidates → re-rank via Synesis → return ranked results with content-type context.

At the end of this phase, content is discoverable through multiple search modalities, ranked by cross-attention relevance, and ready for Phase 4's RAG generation.

## Heritage

The hybrid search architecture adapts patterns from [Spelunker](https://git.helu.ca/r/spelunker)'s two-stage retrieval pipeline — vector recall + cross-attention re-ranking — enhanced with knowledge graph traversal, multimodal search, and content-type-aware re-ranking instructions.

## Architecture Overview

```
User Query (text, optional image, optional filters)
    │
    ├─→ Vector Search (Neo4j vector index — Chunk.embedding)
    │     → Top-K nearest neighbors by cosine similarity
    │
    ├─→ Full-Text Search (Neo4j fulltext index — Chunk.text_preview, Concept.name)
    │     → BM25-scored matches
    │
    ├─→ Graph Search (Cypher traversal)
    │     → Concept-linked chunks via MENTIONS/REFERENCES/DEPICTS edges
    │
    └─→ Image Search (Neo4j vector index — ImageEmbedding.embedding)
          → Multimodal similarity (text-to-image in unified vector space)
    │
    └─→ Candidate Fusion (Reciprocal Rank Fusion)
          → Deduplicated, scored candidate list
    │
    └─→ Re-ranking (Synesis /v1/rerank)
          → Content-type-aware instruction injection
          → Cross-attention precision scoring
    │
    └─→ Final ranked results with metadata
```

## Synesis Integration

[Synesis](docs/synesis_api_usage_guide.html) is a custom FastAPI service built around Qwen3-VL-2B, providing both embedding and re-ranking over a clean REST API. It runs on `pan.helu.ca:8400`.

**Embedding** (Phase 2, already working): Synesis's `/v1/embeddings` endpoint is OpenAI-compatible — the existing `EmbeddingClient` handles it with `api_type="openai"`.

**Re-ranking** (Phase 3, new): Synesis's `/v1/rerank` endpoint provides:

- Native `instruction` parameter — maps directly to `reranker_instruction` from content types
- `top_n` for server-side truncation
- Multimodal support — both query and documents can include images
- Relevance scores for each candidate

```http
# Synesis rerank request
POST http://pan.helu.ca:8400/v1/rerank

{
  "query": {"text": "How do I configure a 3-phase motor?"},
  "documents": [
    {"text": "The motor controller requires..."},
    {"text": "3-phase power is distributed..."}
  ],
  "instruction": "Re-rank passages from technical documentation based on procedural relevance.",
  "top_n": 10
}
```

## Deliverables

### 1. Search Service (`library/services/search.py`)

The core search orchestrator. Accepts a `SearchRequest`, dispatches to individual search backends, fuses results, and optionally re-ranks.

#### SearchRequest

```python
from dataclasses import dataclass


@dataclass
class SearchRequest:
    query: str                          # Natural language query text
    # Required field, so it must precede the fields that have defaults.
    search_types: list[str]             # ["vector", "fulltext", "graph"]
    query_image: bytes | None = None    # Optional image for multimodal search
    library_uid: str | None = None      # Scope to specific library
    library_type: str | None = None     # Scope to library type
    collection_uid: str | None = None   # Scope to specific collection
    limit: int = 20                     # Max results after fusion
    vector_top_k: int = 50              # Candidates from vector search
    fulltext_top_k: int = 30            # Candidates from fulltext search
    graph_max_depth: int = 2            # Graph traversal depth
    rerank: bool = True                 # Apply re-ranking
    include_images: bool = True         # Include image results
```

#### SearchResponse

```python
from dataclasses import dataclass


@dataclass
class SearchCandidate:
    chunk_uid: str
    item_uid: str
    item_title: str
    library_type: str
    text_preview: str
    chunk_s3_key: str
    chunk_index: int
    score: float      # Final score (post-fusion or post-rerank)
    source: str       # "vector", "fulltext", "graph"
    metadata: dict    # Page, section, nearby images, etc.


@dataclass
class ImageSearchResult:
    image_uid: str
    item_uid: str
    item_title: str
    image_type: str
    description: str
    s3_key: str
    score: float
    source: str       # "vector", "graph"


@dataclass
class SearchResponse:
    query: str
    candidates: list[SearchCandidate]   # Ranked text results
    images: list[ImageSearchResult]     # Ranked image results
    total_candidates: int               # Pre-fusion candidate count
    search_time_ms: float
    reranker_used: bool
    reranker_model: str | None
    search_types_used: list[str]
```

### 2. Vector Search

Uses Neo4j's `db.index.vector.queryNodes()` against `chunk_embedding_index`.

- Embed query text using system embedding model (via existing `EmbeddingClient`)
- Prepend library's `embedding_instruction` when scoped to a specific library
- Query Neo4j vector index for top-K Chunk nodes by cosine similarity
- Filter by library/collection via graph pattern matching

```cypher
CALL db.index.vector.queryNodes('chunk_embedding_index', $top_k, $query_vector)
YIELD node AS chunk, score
MATCH (item:Item)-[:HAS_CHUNK]->(chunk)
OPTIONAL MATCH (lib:Library)-[:CONTAINS]->(col:Collection)-[:CONTAINS]->(item)
// Filter after WITH: a WHERE attached directly to OPTIONAL MATCH only nulls
// out lib/col and would not drop out-of-scope rows.
WITH chunk, item, lib, col, score
WHERE ($library_uid IS NULL OR lib.uid = $library_uid)
  AND ($library_type IS NULL OR lib.library_type = $library_type)
  AND ($collection_uid IS NULL OR col.uid = $collection_uid)
RETURN chunk.uid AS chunk_uid, chunk.text_preview AS text_preview,
       chunk.chunk_s3_key AS chunk_s3_key, chunk.chunk_index AS chunk_index,
       item.uid AS item_uid, item.title AS item_title,
       lib.library_type AS library_type, score
ORDER BY score DESC
LIMIT $top_k
```

### 3. Full-Text Search

Uses Neo4j fulltext indexes created by `setup_neo4j_indexes`.

- Query `chunk_text_fulltext` for Chunk matches (BM25)
- Query `concept_name_fulltext` for Concept matches → traverse to connected Chunks
- Query `item_title_fulltext` for Item title matches → get their Chunks
- Normalize BM25 scores to 0-1 range for fusion compatibility

```cypher
// Chunk full-text search
CALL db.index.fulltext.queryNodes('chunk_text_fulltext', $query)
YIELD node AS chunk, score
MATCH (item:Item)-[:HAS_CHUNK]->(chunk)
OPTIONAL MATCH (lib:Library)-[:CONTAINS]->(col:Collection)-[:CONTAINS]->(item)
// Filter after WITH so that out-of-scope rows are dropped, not just nulled.
WITH chunk, item, lib, score
WHERE ($library_uid IS NULL OR lib.uid = $library_uid)
RETURN chunk.uid AS chunk_uid, chunk.text_preview AS text_preview,
       item.uid AS item_uid, item.title AS item_title,
       lib.library_type AS library_type, score
ORDER BY score DESC
LIMIT $top_k

// Concept-to-Chunk traversal (separate query)
CALL db.index.fulltext.queryNodes('concept_name_fulltext', $query)
YIELD node AS concept, score AS concept_score
MATCH (chunk:Chunk)-[:MENTIONS]->(concept)
MATCH (item:Item)-[:HAS_CHUNK]->(chunk)
RETURN chunk.uid AS chunk_uid, chunk.text_preview AS text_preview,
       item.uid AS item_uid, item.title AS item_title,
       concept_score * 0.8 AS score
```

### 4. Graph Search

Knowledge-graph-powered discovery — the differentiator from standard RAG.

- Match query terms against Concept names via fulltext index
- Traverse `Concept ←[MENTIONS]- Chunk ←[HAS_CHUNK]- Item`
- Expand via `Concept -[RELATED_TO]- Concept` for secondary connections
- Score based on relationship weight and traversal depth

```cypher
// Concept graph traversal
CALL db.index.fulltext.queryNodes('concept_name_fulltext', $query)
YIELD node AS concept, score
MATCH path = (concept)<-[:MENTIONS|REFERENCES*1..2]-(connected)
WHERE connected:Chunk OR connected:Item
WITH concept, connected, score, length(path) AS depth
MATCH (item:Item)-[:HAS_CHUNK]->(chunk)
WHERE chunk = connected OR item = connected
RETURN DISTINCT chunk.uid AS chunk_uid, chunk.text_preview AS text_preview,
       item.uid AS item_uid, item.title AS item_title,
       score / (depth * 0.5 + 1) AS score
```

### 5. Image Search

Multimodal vector search against `image_embedding_index`.

- Embed query text (or image) using system embedding model
- Search `ImageEmbedding` vectors in unified multimodal space
- Return with Image descriptions, OCR text, and Item associations from Phase 2B
- Also include images found via concept graph DEPICTS relationships

### 6. Candidate Fusion (`library/services/fusion.py`)

Reciprocal Rank Fusion (RRF) — parameter-light, proven in Spelunker.

```python
def reciprocal_rank_fusion(
    result_lists: list[list[SearchCandidate]],
    k: int = 60,
) -> list[SearchCandidate]:
    """
    RRF score = Σ 1 / (k + rank_i) for each list containing the candidate.
    Candidates in multiple lists get boosted.
    """
```

- Deduplicates candidates by `chunk_uid`
- Candidates appearing in multiple search types get naturally boosted
- Sort by fused score descending, trim to `limit`

### 7. Re-ranking Client (`library/services/reranker.py`)

Targets Synesis's `POST /v1/rerank` endpoint. Wraps the system reranker model's API configuration.

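
The request body for that endpoint could be assembled from fused candidates roughly as follows. This is a minimal sketch that assumes the payload shape shown in the Synesis Integration section. `Candidate`, `build_rerank_payload`, and `max_chars` are hypothetical names used only for illustration, and leaving out an empty `instruction` or unset `top_n` is an assumption about what the service accepts.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical stand-in for SearchCandidate (only the fields used here)."""
    chunk_uid: str
    text_preview: str


def build_rerank_payload(query, candidates, instruction="", top_n=None, max_chars=500):
    """Build a JSON-serializable body shaped like the rerank request example."""
    payload = {
        "query": {"text": query},
        # Send only the stored preview text, truncated to max_chars.
        "documents": [{"text": c.text_preview[:max_chars]} for c in candidates],
    }
    # Assumption: optional fields are omitted rather than sent empty.
    if instruction:
        payload["instruction"] = instruction
    if top_n is not None:
        payload["top_n"] = top_n
    return payload
```

Because documents are sent in candidate order, the caller can map returned scores back to `chunk_uid` by position, provided the service reports the index of each scored document.
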
#### Synesis Backend

```python
class RerankerClient:
    def rerank(
        self,
        query: str,
        candidates: list[SearchCandidate],
        instruction: str = "",
        top_n: int | None = None,
        query_image: bytes | None = None,
    ) -> list[SearchCandidate]:
        """
        Re-rank candidates via Synesis /v1/rerank.
        Injects content-type reranker_instruction as the instruction parameter.
        """
```

Features:

- Uses `text_preview` (500 chars) for document text — avoids S3 round-trips
- Passes the library's `reranker_instruction` as the `instruction` parameter
- Supports multimodal queries (text + image)
- Falls back gracefully when no reranker model is configured
- Tracks usage via `LLMUsage` with `purpose="reranking"`

### 8. Search API Endpoints

New endpoints in `library/api/`:

| Method | Route | Purpose |
|--------|-------|---------|
| `POST` | `/api/v1/library/search/` | Full hybrid search + re-rank |
| `POST` | `/api/v1/library/search/vector/` | Vector-only search (debugging) |
| `POST` | `/api/v1/library/search/fulltext/` | Full-text-only search (debugging) |
| `GET` | `/api/v1/library/concepts/` | List/search concepts |
| `GET` | `/api/v1/library/concepts//graph/` | Concept neighborhood graph |

### 9. Search UI Views

| URL | View | Purpose |
|-----|------|---------|
| `/library/search/` | `search` | Search page with query input + filters |
| `/library/concepts/` | `concept_list` | Browse concepts with search |
| `/library/concepts//` | `concept_detail` | Single concept with connections |

### 10. Prometheus Metrics

| Metric | Type | Labels | Purpose |
|--------|------|--------|---------|
| `mnemosyne_search_requests_total` | Counter | search_type, library_type | Search throughput |
| `mnemosyne_search_duration_seconds` | Histogram | search_type | Per-search-type latency |
| `mnemosyne_search_candidates_total` | Histogram | search_type | Candidates per search type |
| `mnemosyne_fusion_duration_seconds` | Histogram | — | Fusion latency |
| `mnemosyne_rerank_requests_total` | Counter | model_name, status | Re-rank throughput |
| `mnemosyne_rerank_duration_seconds` | Histogram | model_name | Re-rank latency |
| `mnemosyne_rerank_candidates` | Histogram | — | Candidates sent to reranker |
| `mnemosyne_search_total_duration_seconds` | Histogram | — | End-to-end search latency |

### 11. Management Commands

| Command | Purpose |
|---------|---------|
| `search [--library-uid] [--limit] [--no-rerank]` | CLI search for testing |
| `search_stats` | Search index statistics |

### 12. Settings

```python
# Search configuration
SEARCH_VECTOR_TOP_K = env.int("SEARCH_VECTOR_TOP_K", default=50)
SEARCH_FULLTEXT_TOP_K = env.int("SEARCH_FULLTEXT_TOP_K", default=30)
SEARCH_GRAPH_MAX_DEPTH = env.int("SEARCH_GRAPH_MAX_DEPTH", default=2)
SEARCH_RRF_K = env.int("SEARCH_RRF_K", default=60)
SEARCH_DEFAULT_LIMIT = env.int("SEARCH_DEFAULT_LIMIT", default=20)
RERANKER_MAX_CANDIDATES = env.int("RERANKER_MAX_CANDIDATES", default=32)
RERANKER_TIMEOUT = env.int("RERANKER_TIMEOUT", default=30)
```

## File Structure

```
mnemosyne/library/
├── services/
│   ├── search.py             # NEW — SearchService orchestrator
│   ├── fusion.py             # NEW — Reciprocal Rank Fusion
│   ├── reranker.py           # NEW — Synesis re-ranking client
│   └── ...                   # Existing services unchanged
├── metrics.py                # Modified — add search/rerank metrics
├── views.py                  # Modified — add search UI views
├── urls.py                   # Modified — add search routes
├── api/
│   ├── views.py              # Modified — add search API endpoints
│   ├── serializers.py        # Modified — add search serializers
│   └── urls.py               # Modified — add search API routes
├── management/commands/
│   ├── search.py             # NEW — CLI search command
│   └── search_stats.py       # NEW — Index statistics
├── templates/library/
│   ├── search.html           # NEW — Search page
│   ├── concept_list.html     # NEW — Concept browser
│   └── concept_detail.html   # NEW — Concept detail
└── tests/
    ├── test_search.py        # NEW — Search service tests
    ├── test_fusion.py        # NEW — RRF fusion tests
    ├── test_reranker.py      # NEW — Re-ranking client tests
    └── test_search_api.py    # NEW — Search API endpoint tests
```

## Dependencies

No new Python dependencies required. Phase 3 uses:

- `neomodel` + raw Cypher (Neo4j search)
- `requests` (Synesis reranker HTTP)
- `EmbeddingClient` from Phase 2 (query embedding)
- `prometheus_client` (metrics)

## Testing Strategy

All tests use Django `TestCase`. External services are mocked.

| Test File | Scope |
|-----------|-------|
| `test_search.py` | SearchService orchestration, individual search methods, library/collection scoping |
| `test_fusion.py` | RRF correctness, deduplication, score calculation, edge cases |
| `test_reranker.py` | Synesis backend (mocked HTTP), instruction injection, graceful fallback |
| `test_search_api.py` | API endpoints, request validation, response format |

## Success Criteria

- [ ] Vector search returns Chunk nodes ranked by cosine similarity from Neo4j
- [ ] Full-text search returns matches from Neo4j fulltext indexes
- [ ] Graph search traverses Concept relationships to discover related content
- [ ] Image search returns images via multimodal vector similarity
- [ ] Reciprocal Rank Fusion correctly merges and deduplicates across search types
- [ ] Re-ranking via Synesis `/v1/rerank` re-scores candidates with cross-attention
- [ ] Content-type `reranker_instruction` injected per library type
- [ ] Search scoping works (by library, library type, collection)
- [ ] Search gracefully degrades: no reranker → skip; no embedding model → clear error
- [ ] Search API endpoints return structured results with scores and metadata
- [ ] Search UI allows querying with filters and displays ranked results
- [ ] Concept explorer allows browsing the knowledge graph
- [ ] Prometheus metrics track search throughput, latency, and candidate counts
- [ ] CLI search command works for testing
- [ ] All tests pass with mocked external services
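
As a reference point for the fusion criterion above, the Reciprocal Rank Fusion formula from Deliverable 6 can be sketched over plain lists of `chunk_uid` strings. This is an illustrative sketch, not the planned `fusion.py`: `rrf_scores` and `fuse` are hypothetical names, ranks are assumed to start at 1, and ties are broken alphabetically by uid purely to keep the output deterministic.

```python
from collections import defaultdict


def rrf_scores(ranked_lists, k=60):
    """Return {uid: score} with score = sum over lists of 1 / (k + rank)."""
    scores = defaultdict(float)
    for ranked in ranked_lists:
        seen = set()
        for rank, uid in enumerate(ranked, start=1):  # assumed 1-based ranks
            if uid in seen:
                continue  # count each uid at most once per list
            seen.add(uid)
            scores[uid] += 1.0 / (k + rank)
    return dict(scores)


def fuse(ranked_lists, k=60, limit=20):
    """Deduplicate across lists and return uids by descending fused score."""
    scores = rrf_scores(ranked_lists, k)
    return sorted(scores, key=lambda uid: (-scores[uid], uid))[:limit]


vector_hits = ["a", "b", "c"]
fulltext_hits = ["b", "d"]
graph_hits = ["b", "a"]
print(fuse([vector_hits, fulltext_hits, graph_hits]))
```

In this example `b` appears in all three lists, so it outranks `a` even though `a` is first in the vector results, which is the boosting behavior a `test_fusion.py` case could assert on.
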