Compare commits

..

2 Commits

Author SHA1 Message Date
634845fee0 feat: add Phase 3 hybrid search with Synesis reranking
Implement hybrid search pipeline combining vector, fulltext, and graph
search across Neo4j, with cross-attention reranking via Synesis
(Qwen3-VL-Reranker-2B) `/v1/rerank` endpoint.

- Add SearchService with vector, fulltext, and graph search strategies
- Add SynesisRerankerClient for multimodal reranking via HTTP API
- Add search API endpoint (POST /search/) with filtering by library,
  collection, and library_type
- Add SearchRequest/Response serializers and image search results
- Add "nonfiction" to library_type choices
- Consolidate reranker stack from two models to single Synesis service
- Handle image analysis_status as "skipped" when analysis is unavailable
- Add comprehensive tests for search pipeline and reranker client
2026-03-29 18:09:50 +00:00
fb38a881d9 Add vision model support to LLM Manager admin and rename index for clarity 2026-03-29 17:03:59 +00:00
30 changed files with 5723 additions and 4 deletions

View File

@@ -25,8 +25,7 @@ This **content-type awareness** flows through every layer: chunking strategy, em
 |-----------|-----------|---------|
 | **Knowledge Graph** | Neo4j 5.x | Relationships + vector storage (no dimension limits) |
 | **Multimodal Embeddings** | Qwen3-VL-Embedding-8B | Text + image + video in unified vector space (4096d) |
-| **Multimodal Re-ranking** | Qwen3-VL-Reranker-8B | Cross-attention precision scoring |
-| **Text Fallback** | Qwen3-Reranker (llama.cpp) | Text-only re-ranking via GGUF |
+| **Multimodal Re-ranking** | Synesis (Qwen3-VL-Reranker-2B) | Cross-attention precision scoring via `/v1/rerank` |
 | **Web Framework** | Django 5.x + DRF | Auth, admin, API, content management |
 | **Object Storage** | S3/MinIO | Original content + chunk text storage |
 | **Async Processing** | Celery + RabbitMQ | Document embedding, graph construction |

View File

@@ -0,0 +1,384 @@
# Phase 3: Search & Re-ranking
## Objective
Build the complete hybrid search pipeline: accept a query → embed it → search Neo4j (vector + full-text + graph traversal) → fuse candidates → re-rank via Synesis → return ranked results with content-type context. At the end of this phase, content is discoverable through multiple search modalities, ranked by cross-attention relevance, and ready for Phase 4's RAG generation.
## Heritage
The hybrid search architecture adapts patterns from [Spelunker](https://git.helu.ca/r/spelunker)'s two-stage retrieval pipeline — vector recall + cross-attention re-ranking — enhanced with knowledge graph traversal, multimodal search, and content-type-aware re-ranking instructions.
## Architecture Overview
```
User Query (text, optional image, optional filters)
├─→ Vector Search (Neo4j vector index — Chunk.embedding)
│ → Top-K nearest neighbors by cosine similarity
├─→ Full-Text Search (Neo4j fulltext index — Chunk.text_preview, Concept.name)
│ → BM25-scored matches
├─→ Graph Search (Cypher traversal)
│ → Concept-linked chunks via MENTIONS/REFERENCES/DEPICTS edges
└─→ Image Search (Neo4j vector index — ImageEmbedding.embedding)
→ Multimodal similarity (text-to-image in unified vector space)
└─→ Candidate Fusion (Reciprocal Rank Fusion)
→ Deduplicated, scored candidate list
└─→ Re-ranking (Synesis /v1/rerank)
→ Content-type-aware instruction injection
→ Cross-attention precision scoring
└─→ Final ranked results with metadata
```
## Synesis Integration
[Synesis](docs/synesis_api_usage_guide.html) is a custom FastAPI service built around Qwen3-VL-2B, providing both embedding and re-ranking over a clean REST API. It runs on `pan.helu.ca:8400`.
**Embedding** (Phase 2, already working): Synesis's `/v1/embeddings` endpoint is OpenAI-compatible — the existing `EmbeddingClient` handles it with `api_type="openai"`.
**Re-ranking** (Phase 3, new): Synesis's `/v1/rerank` endpoint provides:
- Native `instruction` parameter — maps directly to `reranker_instruction` from content types
- `top_n` for server-side truncation
- Multimodal support — both query and documents can include images
- Relevance scores for each candidate
```python
# Synesis rerank request
POST http://pan.helu.ca:8400/v1/rerank
{
    "query": {"text": "How do I configure a 3-phase motor?"},
    "documents": [
        {"text": "The motor controller requires..."},
        {"text": "3-phase power is distributed..."}
    ],
    "instruction": "Re-rank passages from technical documentation based on procedural relevance.",
    "top_n": 10
}
```
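For reference, the response follows the schema documented in the Synesis guide: `results` sorted by descending score, each carrying its original input `index`, plus a `usage` block. The scores below are illustrative values, not real model output:

```python
# Synesis rerank response (shape per the API guide; scores illustrative)
rerank_response = {
    "results": [
        {"index": 1, "score": 0.91,
         "document": {"text": "3-phase power is distributed..."}},
        {"index": 0, "score": 0.34,
         "document": {"text": "The motor controller requires..."}},
    ],
    "usage": {"query_count": 1, "document_count": 2,
              "returned_count": 2, "elapsed_ms": 120.0},
}
```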
## Deliverables
### 1. Search Service (`library/services/search.py`)
The core search orchestrator. Accepts a `SearchRequest`, dispatches to individual search backends, fuses results, and optionally re-ranks.
#### SearchRequest
```python
from dataclasses import dataclass, field


@dataclass
class SearchRequest:
    query: str                              # Natural language query text
    query_image: bytes | None = None        # Optional image for multimodal search
    library_uid: str | None = None          # Scope to specific library
    library_type: str | None = None         # Scope to library type
    collection_uid: str | None = None       # Scope to specific collection
    search_types: list[str] = field(        # Any of "vector", "fulltext", "graph"
        default_factory=lambda: ["vector", "fulltext", "graph"]
    )
    limit: int = 20                         # Max results after fusion
    vector_top_k: int = 50                  # Candidates from vector search
    fulltext_top_k: int = 30                # Candidates from fulltext search
    graph_max_depth: int = 2                # Graph traversal depth
    rerank: bool = True                     # Apply re-ranking
    include_images: bool = True             # Include image results
```
#### SearchResponse
```python
from dataclasses import dataclass


@dataclass
class SearchCandidate:
    chunk_uid: str
    item_uid: str
    item_title: str
    library_type: str
    text_preview: str
    chunk_s3_key: str
    chunk_index: int
    score: float                  # Final score (post-fusion or post-rerank)
    source: str                   # "vector", "fulltext", "graph"
    metadata: dict                # Page, section, nearby images, etc.


@dataclass
class ImageSearchResult:
    image_uid: str
    item_uid: str
    item_title: str
    image_type: str
    description: str
    s3_key: str
    score: float
    source: str                   # "vector", "graph"


@dataclass
class SearchResponse:
    query: str
    candidates: list[SearchCandidate]      # Ranked text results
    images: list[ImageSearchResult]        # Ranked image results
    total_candidates: int                  # Pre-fusion candidate count
    search_time_ms: float
    reranker_used: bool
    reranker_model: str | None
    search_types_used: list[str]
```
### 2. Vector Search
Uses Neo4j's `db.index.vector.queryNodes()` against `chunk_embedding_index`.
- Embed query text using system embedding model (via existing `EmbeddingClient`)
- Prepend library's `embedding_instruction` when scoped to a specific library
- Query Neo4j vector index for top-K Chunk nodes by cosine similarity
- Filter by library/collection via graph pattern matching
```cypher
CALL db.index.vector.queryNodes('chunk_embedding_index', $top_k, $query_vector)
YIELD node AS chunk, score
MATCH (item:Item)-[:HAS_CHUNK]->(chunk)
OPTIONAL MATCH (lib:Library)-[:CONTAINS]->(col:Collection)-[:CONTAINS]->(item)
WHERE ($library_uid IS NULL OR lib.uid = $library_uid)
AND ($library_type IS NULL OR lib.library_type = $library_type)
AND ($collection_uid IS NULL OR col.uid = $collection_uid)
RETURN chunk.uid AS chunk_uid, chunk.text_preview AS text_preview,
chunk.chunk_s3_key AS chunk_s3_key, chunk.chunk_index AS chunk_index,
item.uid AS item_uid, item.title AS item_title,
lib.library_type AS library_type, score
ORDER BY score DESC
LIMIT $top_k
```
### 3. Full-Text Search
Uses Neo4j fulltext indexes created by `setup_neo4j_indexes`.
- Query `chunk_text_fulltext` for Chunk matches (BM25)
- Query `concept_name_fulltext` for Concept matches → traverse to connected Chunks
- Query `item_title_fulltext` for Item title matches → get their Chunks
- Normalize BM25 scores to 0-1 range for fusion compatibility
```cypher
// Chunk full-text search
CALL db.index.fulltext.queryNodes('chunk_text_fulltext', $query)
YIELD node AS chunk, score
MATCH (item:Item)-[:HAS_CHUNK]->(chunk)
OPTIONAL MATCH (lib:Library)-[:CONTAINS]->(col:Collection)-[:CONTAINS]->(item)
WHERE ($library_uid IS NULL OR lib.uid = $library_uid)
RETURN chunk.uid AS chunk_uid, chunk.text_preview AS text_preview,
item.uid AS item_uid, item.title AS item_title,
lib.library_type AS library_type, score
ORDER BY score DESC
LIMIT $top_k
// Concept-to-Chunk traversal
CALL db.index.fulltext.queryNodes('concept_name_fulltext', $query)
YIELD node AS concept, score AS concept_score
MATCH (chunk:Chunk)-[:MENTIONS]->(concept)
MATCH (item:Item)-[:HAS_CHUNK]->(chunk)
RETURN chunk.uid AS chunk_uid, chunk.text_preview AS text_preview,
item.uid AS item_uid, item.title AS item_title,
concept_score * 0.8 AS score
```
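The score-normalization bullet can be sketched with a min-max pass. This is one reasonable choice, not a mandated formula; the spec only requires scores land in the 0-1 range:

```python
def normalize_bm25(scores: list[float]) -> list[float]:
    """Min-max normalize raw BM25 scores into [0, 1] for fusion compatibility."""
    if not scores:
        return []
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All candidates tied: give every candidate the maximum normalized score.
        return [1.0] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]
```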
### 4. Graph Search
Knowledge-graph-powered discovery — the differentiator from standard RAG.
- Match query terms against Concept names via fulltext index
- Traverse `Concept ←[MENTIONS]- Chunk ←[HAS_CHUNK]- Item`
- Expand via `Concept -[RELATED_TO]- Concept` for secondary connections
- Score based on relationship weight and traversal depth
```cypher
// Concept graph traversal
CALL db.index.fulltext.queryNodes('concept_name_fulltext', $query)
YIELD node AS concept, score
MATCH path = (concept)<-[:MENTIONS|REFERENCES*1..2]-(connected)
WHERE connected:Chunk OR connected:Item
WITH concept, connected, score, length(path) AS depth
MATCH (item:Item)-[:HAS_CHUNK]->(chunk)
WHERE chunk = connected OR item = connected
RETURN DISTINCT chunk.uid AS chunk_uid, chunk.text_preview AS text_preview,
item.uid AS item_uid, item.title AS item_title,
score / (depth * 0.5 + 1) AS score
```
### 5. Image Search
Multimodal vector search against `image_embedding_index`.
- Embed query text (or image) using system embedding model
- Search `ImageEmbedding` vectors in unified multimodal space
- Return with Image descriptions, OCR text, and Item associations from Phase 2B
- Also include images found via concept graph DEPICTS relationships
### 6. Candidate Fusion (`library/services/fusion.py`)
Reciprocal Rank Fusion (RRF) — parameter-light, proven in Spelunker.
```python
def reciprocal_rank_fusion(
    result_lists: list[list[SearchCandidate]],
    k: int = 60,
) -> list[SearchCandidate]:
    """
    RRF score = Σ 1 / (k + rank_i) for each list containing the candidate.
    Candidates in multiple lists get boosted.
    """
```
- Deduplicates candidates by `chunk_uid`
- Candidates appearing in multiple search types get naturally boosted
- Sort by fused score descending, trim to `limit`
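The fusion step above can be sketched over plain chunk uids; the real implementation would carry full `SearchCandidate` objects:

```python
def rrf_fuse(result_lists: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Reciprocal Rank Fusion over ranked lists of chunk uids.

    RRF score = sum of 1 / (k + rank) across every list containing the uid,
    so candidates surfaced by multiple search types are naturally boosted.
    """
    scores: dict[str, float] = {}
    for results in result_lists:
        for rank, uid in enumerate(results, start=1):
            scores[uid] = scores.get(uid, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

With `[["a", "b", "c"], ["b", "d"]]`, `b` wins because it appears in both lists, even though it only ranked second in the first one.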
### 7. Re-ranking Client (`library/services/reranker.py`)
Targets Synesis's `POST /v1/rerank` endpoint. Wraps the system reranker model's API configuration.
#### Synesis Backend
```python
class RerankerClient:
    def rerank(
        self,
        query: str,
        candidates: list[SearchCandidate],
        instruction: str = "",
        top_n: int | None = None,
        query_image: bytes | None = None,
    ) -> list[SearchCandidate]:
        """
        Re-rank candidates via Synesis /v1/rerank.
        Injects content-type reranker_instruction as the instruction parameter.
        """
```
Features:
- Uses `text_preview` (500 chars) for document text — avoids S3 round-trips
- Prepends library's `reranker_instruction` as the `instruction` parameter
- Supports multimodal queries (text + image)
- Falls back gracefully when no reranker model configured
- Tracks usage via `LLMUsage` with `purpose="reranking"`
### 8. Search API Endpoints
New endpoints in `library/api/`:
| Method | Route | Purpose |
|--------|-------|---------|
| `POST` | `/api/v1/library/search/` | Full hybrid search + re-rank |
| `POST` | `/api/v1/library/search/vector/` | Vector-only search (debugging) |
| `POST` | `/api/v1/library/search/fulltext/` | Full-text-only search (debugging) |
| `GET` | `/api/v1/library/concepts/` | List/search concepts |
| `GET` | `/api/v1/library/concepts/<uid>/graph/` | Concept neighborhood graph |
### 9. Search UI Views
| URL | View | Purpose |
|-----|------|---------|
| `/library/search/` | `search` | Search page with query input + filters |
| `/library/concepts/` | `concept_list` | Browse concepts with search |
| `/library/concepts/<uid>/` | `concept_detail` | Single concept with connections |
### 10. Prometheus Metrics
| Metric | Type | Labels | Purpose |
|--------|------|--------|---------|
| `mnemosyne_search_requests_total` | Counter | search_type, library_type | Search throughput |
| `mnemosyne_search_duration_seconds` | Histogram | search_type | Per-search-type latency |
| `mnemosyne_search_candidates_total` | Histogram | search_type | Candidates per search type |
| `mnemosyne_fusion_duration_seconds` | Histogram | — | Fusion latency |
| `mnemosyne_rerank_requests_total` | Counter | model_name, status | Re-rank throughput |
| `mnemosyne_rerank_duration_seconds` | Histogram | model_name | Re-rank latency |
| `mnemosyne_rerank_candidates` | Histogram | — | Candidates sent to reranker |
| `mnemosyne_search_total_duration_seconds` | Histogram | — | End-to-end search latency |
### 11. Management Commands
| Command | Purpose |
|---------|---------|
| `search <query> [--library-uid] [--limit] [--no-rerank]` | CLI search for testing |
| `search_stats` | Search index statistics |
### 12. Settings
```python
# Search configuration
SEARCH_VECTOR_TOP_K = env.int("SEARCH_VECTOR_TOP_K", default=50)
SEARCH_FULLTEXT_TOP_K = env.int("SEARCH_FULLTEXT_TOP_K", default=30)
SEARCH_GRAPH_MAX_DEPTH = env.int("SEARCH_GRAPH_MAX_DEPTH", default=2)
SEARCH_RRF_K = env.int("SEARCH_RRF_K", default=60)
SEARCH_DEFAULT_LIMIT = env.int("SEARCH_DEFAULT_LIMIT", default=20)
RERANKER_MAX_CANDIDATES = env.int("RERANKER_MAX_CANDIDATES", default=32)
RERANKER_TIMEOUT = env.int("RERANKER_TIMEOUT", default=30)
```
## File Structure
```
mnemosyne/library/
├── services/
│ ├── search.py # NEW — SearchService orchestrator
│ ├── fusion.py # NEW — Reciprocal Rank Fusion
│ ├── reranker.py # NEW — Synesis re-ranking client
│ └── ... # Existing services unchanged
├── metrics.py # Modified — add search/rerank metrics
├── views.py # Modified — add search UI views
├── urls.py # Modified — add search routes
├── api/
│ ├── views.py # Modified — add search API endpoints
│ ├── serializers.py # Modified — add search serializers
│ └── urls.py # Modified — add search API routes
├── management/commands/
│ ├── search.py # NEW — CLI search command
│ └── search_stats.py # NEW — Index statistics
├── templates/library/
│ ├── search.html # NEW — Search page
│ ├── concept_list.html # NEW — Concept browser
│ └── concept_detail.html # NEW — Concept detail
└── tests/
├── test_search.py # NEW — Search service tests
├── test_fusion.py # NEW — RRF fusion tests
├── test_reranker.py # NEW — Re-ranking client tests
└── test_search_api.py # NEW — Search API endpoint tests
```
## Dependencies
No new Python dependencies required. Phase 3 uses:
- `neomodel` + raw Cypher (Neo4j search)
- `requests` (Synesis reranker HTTP)
- `EmbeddingClient` from Phase 2 (query embedding)
- `prometheus_client` (metrics)
## Testing Strategy
All tests use Django `TestCase`. External services mocked.
| Test File | Scope |
|-----------|-------|
| `test_search.py` | SearchService orchestration, individual search methods, library/collection scoping |
| `test_fusion.py` | RRF correctness, deduplication, score calculation, edge cases |
| `test_reranker.py` | Synesis backend (mocked HTTP), instruction injection, graceful fallback |
| `test_search_api.py` | API endpoints, request validation, response format |
## Success Criteria
- [ ] Vector search returns Chunk nodes ranked by cosine similarity from Neo4j
- [ ] Full-text search returns matches from Neo4j fulltext indexes
- [ ] Graph search traverses Concept relationships to discover related content
- [ ] Image search returns images via multimodal vector similarity
- [ ] Reciprocal Rank Fusion correctly merges and deduplicates across search types
- [ ] Re-ranking via Synesis `/v1/rerank` re-scores candidates with cross-attention
- [ ] Content-type `reranker_instruction` injected per library type
- [ ] Search scoping works (by library, library type, collection)
- [ ] Search gracefully degrades: no reranker → skip; no embedding model → clear error
- [ ] Search API endpoints return structured results with scores and metadata
- [ ] Search UI allows querying with filters and displays ranked results
- [ ] Concept explorer allows browsing the knowledge graph
- [ ] Prometheus metrics track search throughput, latency, and candidate counts
- [ ] CLI search command works for testing
- [ ] All tests pass with mocked external services

View File

@@ -166,6 +166,8 @@ Titania provides TLS termination and reverse proxy for all services.
 | `peitho.ouranos.helu.ca` | puck.incus:22981 | Peitho (Django) |
 | `pgadmin.ouranos.helu.ca` | prospero.incus:443 (SSL) | PgAdmin 4 |
 | `prometheus.ouranos.helu.ca` | prospero.incus:443 (SSL) | Prometheus |
+| `freecad-mcp.ouranos.helu.ca` | caliban.incus:22032 | FreeCAD Robust MCP Server |
+| `rommie.ouranos.helu.ca` | caliban.incus:22031 | Rommie MCP Server (Agent S GUI automation) |
 | `searxng.ouranos.helu.ca` | oberon.incus:22073 | SearXNG (OAuth2-Proxy) |
 | `smtp4dev.ouranos.helu.ca` | oberon.incus:22085 | smtp4dev |
 | `spelunker.ouranos.helu.ca` | puck.incus:22881 | Spelunker (Django) |

View File

@@ -0,0 +1,908 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Synesis — API Usage Guide</title>
<!-- Bootstrap CSS -->
<link href="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0/dist/css/bootstrap.min.css" rel="stylesheet">
<!-- Mermaid -->
<script src="https://cdn.jsdelivr.net/npm/mermaid@10/dist/mermaid.min.js"></script>
</head>
<body>
<div class="container-fluid">
<!-- Navigation -->
<nav class="navbar navbar-dark bg-dark rounded mb-4">
<div class="container-fluid">
<a class="navbar-brand" href="api_usage_guide.html">Synesis API Guide</a>
<div class="navbar-nav d-flex flex-row">
<a class="nav-link me-3" href="#overview">Overview</a>
<a class="nav-link me-3" href="#architecture">Architecture</a>
<a class="nav-link me-3" href="#embeddings">Embeddings</a>
<a class="nav-link me-3" href="#reranking">Reranking</a>
<a class="nav-link me-3" href="#integration">Integration</a>
<a class="nav-link" href="#operations">Operations</a>
</div>
</div>
</nav>
<nav aria-label="breadcrumb">
<ol class="breadcrumb">
<li class="breadcrumb-item"><a href="api_usage_guide.html">Synesis</a></li>
<li class="breadcrumb-item active">API Usage Guide</li>
</ol>
</nav>
<!-- Title -->
<div class="row mb-4">
<div class="col-12">
<h1 class="display-4 mb-2">Synesis — API Usage Guide</h1>
<p class="lead">Multimodal embedding and reranking service powered by Qwen3-VL-2B. Supports text, image, and mixed-modal inputs over a simple REST API.</p>
</div>
</div>
<!-- ============================================================ -->
<!-- OVERVIEW -->
<!-- ============================================================ -->
<section id="overview" class="mb-5">
<h2 class="h2 mb-4">Overview</h2>
<div class="row g-4 mb-4">
<div class="col-lg-4">
<div class="card h-100">
<div class="card-body">
<h3 class="card-title text-primary">Embeddings</h3>
<p>Generate dense vector representations for text, images, or both. Vectors are suitable for semantic search, retrieval, clustering, and classification.</p>
<code>POST /v1/embeddings</code>
</div>
</div>
</div>
<div class="col-lg-4">
<div class="card h-100">
<div class="card-body">
<h3 class="card-title text-primary">Reranking</h3>
<p>Given a query and a list of candidate documents, score and sort them by relevance. Use after an initial retrieval step to improve precision.</p>
<code>POST /v1/rerank</code>
</div>
</div>
</div>
<div class="col-lg-4">
<div class="card h-100">
<div class="card-body">
<h3 class="card-title text-primary">Similarity</h3>
<p>Convenience endpoint to compute cosine similarity between two inputs without managing vectors yourself.</p>
<code>POST /v1/similarity</code>
</div>
</div>
</div>
</div>
<div class="alert alert-info border-start border-4 border-info">
<h3>Interactive API Explorer</h3>
<p class="mb-0">Full request/response schemas, try-it-out functionality, and auto-generated curl examples are available at <strong><code>http://&lt;host&gt;:8400/docs</code></strong> (Swagger UI). Use it to experiment with every endpoint interactively.</p>
</div>
<div class="alert alert-secondary border-start border-4 border-secondary">
<h3>Base URL</h3>
<p>All endpoints are served from a single base URL. Configure this in your consuming application:</p>
<pre class="mb-0">http://&lt;synesis-host&gt;:8400</pre>
<p class="mt-2 mb-0">Default port is <code>8400</code>. No authentication is required (secure via network policy / firewall).</p>
</div>
</section>
<!-- ============================================================ -->
<!-- ARCHITECTURE -->
<!-- ============================================================ -->
<section id="architecture" class="mb-5">
<h2 class="h2 mb-4">Architecture</h2>
<div class="alert alert-info border-start border-4 border-info">
<h3>Service Architecture</h3>
<p>Synesis loads two Qwen3-VL-2B models into GPU memory at startup: one for embeddings and one for reranking. Both share the same NVIDIA 3090 (24 GB VRAM).</p>
</div>
<div class="card my-4">
<div class="card-body">
<h3 class="card-title text-primary">Request Flow</h3>
<div class="mermaid">
graph LR
Client["Client Application"] -->|HTTP POST| FastAPI["FastAPI<br/>:8400"]
FastAPI -->|/v1/embeddings| Embedder["Qwen3-VL<br/>Embedder 2B"]
FastAPI -->|/v1/rerank| Reranker["Qwen3-VL<br/>Reranker 2B"]
FastAPI -->|/v1/similarity| Embedder
Embedder --> GPU["NVIDIA 3090<br/>24 GB VRAM"]
Reranker --> GPU
FastAPI -->|/metrics| Prometheus["Prometheus"]
</div>
</div>
</div>
<div class="card my-4">
<div class="card-body">
<h3 class="card-title text-primary">Typical RAG Integration</h3>
<div class="mermaid">
sequenceDiagram
participant App as Your Application
participant Synesis as Synesis API
participant VDB as Vector Database
Note over App: Indexing Phase
App->>Synesis: POST /v1/embeddings (documents)
Synesis-->>App: embedding vectors
App->>VDB: Store vectors + metadata
Note over App: Query Phase
App->>Synesis: POST /v1/embeddings (query)
Synesis-->>App: query vector
App->>VDB: ANN search (top 50)
VDB-->>App: candidate documents
App->>Synesis: POST /v1/rerank (query + candidates)
Synesis-->>App: ranked results with scores
App->>App: Use top 5-10 results
</div>
</div>
</div>
</section>
<!-- ============================================================ -->
<!-- EMBEDDINGS -->
<!-- ============================================================ -->
<section id="embeddings" class="mb-5">
<h2 class="h2 mb-4">Embeddings API</h2>
<div class="alert alert-primary border-start border-4 border-primary">
<h3>POST /v1/embeddings</h3>
<p class="mb-0">Generate dense vector embeddings for one or more inputs. Each input can be text, an image, or both (multimodal).</p>
</div>
<!-- Request Schema -->
<h3 class="mt-4">Request Body</h3>
<table class="table table-bordered">
<thead class="table-dark">
<tr>
<th>Field</th>
<th>Type</th>
<th>Required</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>inputs</code></td>
<td>array</td>
<td>Yes</td>
<td>List of items to embed (1 to <code>max_batch_size</code>).</td>
</tr>
<tr>
<td><code>inputs[].text</code></td>
<td>string</td>
<td>*</td>
<td>Text content. At least one of <code>text</code> or <code>image</code> is required.</td>
</tr>
<tr>
<td><code>inputs[].image</code></td>
<td>string</td>
<td>*</td>
<td>Image file path or URL. At least one of <code>text</code> or <code>image</code> is required.</td>
</tr>
<tr>
<td><code>inputs[].instruction</code></td>
<td>string</td>
<td>No</td>
<td>Optional task instruction to guide embedding (e.g. "Represent this document for retrieval").</td>
</tr>
<tr>
<td><code>dimension</code></td>
<td>int</td>
<td>No</td>
<td>Output vector dimension (64–2048). Default: 2048. See <a href="#dimensions">Dimensions</a>.</td>
</tr>
<tr>
<td><code>normalize</code></td>
<td>bool</td>
<td>No</td>
<td>L2-normalize output vectors. Default: <code>true</code>.</td>
</tr>
</tbody>
</table>
<!-- Response Schema -->
<h3 class="mt-4">Response Body</h3>
<table class="table table-bordered">
<thead class="table-dark">
<tr>
<th>Field</th>
<th>Type</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>embeddings[]</code></td>
<td>array</td>
<td>One embedding per input, in order.</td>
</tr>
<tr>
<td><code>embeddings[].index</code></td>
<td>int</td>
<td>Position in the input array.</td>
</tr>
<tr>
<td><code>embeddings[].embedding</code></td>
<td>float[]</td>
<td>The dense vector (length = <code>dimension</code>).</td>
</tr>
<tr>
<td><code>usage.input_count</code></td>
<td>int</td>
<td>Number of inputs processed.</td>
</tr>
<tr>
<td><code>usage.dimension</code></td>
<td>int</td>
<td>Dimension of returned vectors.</td>
</tr>
<tr>
<td><code>usage.elapsed_ms</code></td>
<td>float</td>
<td>Server-side processing time in milliseconds.</td>
</tr>
</tbody>
</table>
<!-- Input Types -->
<h3 class="mt-4">Input Modalities</h3>
<div class="row g-4">
<div class="col-lg-4">
<div class="card h-100">
<div class="card-body">
<h4 class="card-title text-primary">Text Only</h4>
<pre class="mb-0">{
"inputs": [
{"text": "quantum computing basics"},
{"text": "machine learning tutorial"}
]
}</pre>
</div>
</div>
</div>
<div class="col-lg-4">
<div class="card h-100">
<div class="card-body">
<h4 class="card-title text-primary">Image Only</h4>
<pre class="mb-0">{
"inputs": [
{"image": "/data/photos/cat.jpg"},
{"image": "https://example.com/dog.png"}
]
}</pre>
</div>
</div>
</div>
<div class="col-lg-4">
<div class="card h-100">
<div class="card-body">
<h4 class="card-title text-primary">Multimodal</h4>
<pre class="mb-0">{
"inputs": [
{
"text": "product photo",
"image": "/data/products/shoe.jpg"
}
]
}</pre>
</div>
</div>
</div>
</div>
</section>
<!-- ============================================================ -->
<!-- RERANKING -->
<!-- ============================================================ -->
<section id="reranking" class="mb-5">
<h2 class="h2 mb-4">Reranking API</h2>
<div class="alert alert-primary border-start border-4 border-primary">
<h3>POST /v1/rerank</h3>
<p class="mb-0">Score and rank a list of candidate documents against a query. Returns documents sorted by relevance (highest score first).</p>
</div>
<!-- Request Schema -->
<h3 class="mt-4">Request Body</h3>
<table class="table table-bordered">
<thead class="table-dark">
<tr>
<th>Field</th>
<th>Type</th>
<th>Required</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>query</code></td>
<td>object</td>
<td>Yes</td>
<td>The query to rank against. Must contain <code>text</code>, <code>image</code>, or both.</td>
</tr>
<tr>
<td><code>query.text</code></td>
<td>string</td>
<td>*</td>
<td>Query text. At least one of <code>text</code> or <code>image</code> required.</td>
</tr>
<tr>
<td><code>query.image</code></td>
<td>string</td>
<td>*</td>
<td>Query image path or URL.</td>
</tr>
<tr>
<td><code>documents</code></td>
<td>array</td>
<td>Yes</td>
<td>Candidate documents to rerank (1 to <code>max_batch_size</code>).</td>
</tr>
<tr>
<td><code>documents[].text</code></td>
<td>string</td>
<td>*</td>
<td>Document text. At least one of <code>text</code> or <code>image</code> required per document.</td>
</tr>
<tr>
<td><code>documents[].image</code></td>
<td>string</td>
<td>*</td>
<td>Document image path or URL.</td>
</tr>
<tr>
<td><code>instruction</code></td>
<td>string</td>
<td>No</td>
<td>Task instruction (e.g. "Retrieve images relevant to the query.").</td>
</tr>
<tr>
<td><code>top_n</code></td>
<td>int</td>
<td>No</td>
<td>Return only the top N results. Default: return all.</td>
</tr>
</tbody>
</table>
<!-- Response Schema -->
<h3 class="mt-4">Response Body</h3>
<table class="table table-bordered">
<thead class="table-dark">
<tr>
<th>Field</th>
<th>Type</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>results[]</code></td>
<td>array</td>
<td>Documents sorted by relevance score (descending).</td>
</tr>
<tr>
<td><code>results[].index</code></td>
<td>int</td>
<td>Original position of this document in the input array.</td>
</tr>
<tr>
<td><code>results[].score</code></td>
<td>float</td>
<td>Relevance score (higher = more relevant).</td>
</tr>
<tr>
<td><code>results[].document</code></td>
<td>object</td>
<td>The document that was ranked (echoed back).</td>
</tr>
<tr>
<td><code>usage.query_count</code></td>
<td>int</td>
<td>Always 1.</td>
</tr>
<tr>
<td><code>usage.document_count</code></td>
<td>int</td>
<td>Total documents scored.</td>
</tr>
<tr>
<td><code>usage.returned_count</code></td>
<td>int</td>
<td>Number of results returned (respects <code>top_n</code>).</td>
</tr>
<tr>
<td><code>usage.elapsed_ms</code></td>
<td>float</td>
<td>Server-side processing time in milliseconds.</td>
</tr>
</tbody>
</table>
<!-- Rerank Examples -->
<h3 class="mt-4">Example: Text Query → Text Documents</h3>
<div class="card my-3">
<div class="card-body">
<pre class="mb-0">{
"query": {"text": "How do neural networks learn?"},
"documents": [
{"text": "Neural networks adjust weights through backpropagation..."},
{"text": "The stock market experienced a downturn in Q3..."},
{"text": "Deep learning uses gradient descent to minimize loss..."},
{"text": "Photosynthesis converts sunlight into chemical energy..."}
],
"top_n": 2
}</pre>
</div>
</div>
<h3 class="mt-4">Example: Text Query → Image Documents</h3>
<div class="card my-3">
<div class="card-body">
<pre class="mb-0">{
"query": {"text": "melancholy album artwork"},
"documents": [
{"image": "/data/covers/cover1.jpg"},
{"image": "/data/covers/cover2.jpg"},
{"text": "dark moody painting", "image": "/data/covers/cover3.jpg"}
],
"instruction": "Retrieve images relevant to the query.",
"top_n": 2
}</pre>
</div>
</div>
</section>
<!-- ============================================================ -->
<!-- DIMENSIONS, BATCHES, PERFORMANCE -->
<!-- ============================================================ -->
<section id="dimensions" class="mb-5">
<h2 class="h2 mb-4">Dimensions, Batches &amp; Performance</h2>
<div class="alert alert-danger border-start border-4 border-danger">
<h3>Matryoshka Dimension Truncation</h3>
<p>Synesis uses <strong>Matryoshka Representation Learning (MRL)</strong>. The model always computes full 2048-dimensional vectors internally, then truncates to your requested dimension. This means you can choose a dimension that balances <strong>quality vs. storage/speed</strong>.</p>
<table class="table table-bordered mt-3 mb-0">
<thead class="table-dark">
<tr>
<th>Dimension</th>
<th>Vector Size</th>
<th>Quality</th>
<th>Use Case</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>2048</code> (default)</td>
<td>8 KB / vector (float32)</td>
<td>Maximum</td>
<td>Highest accuracy retrieval, small collections</td>
</tr>
<tr>
<td><code>1024</code></td>
<td>4 KB / vector</td>
<td>Very high</td>
<td>Good balance for most production systems</td>
</tr>
<tr>
<td><code>512</code></td>
<td>2 KB / vector</td>
<td>High</td>
<td>Large-scale search with reasonable quality</td>
</tr>
<tr>
<td><code>256</code></td>
<td>1 KB / vector</td>
<td>Good</td>
<td>Very large collections, cost-sensitive</td>
</tr>
<tr>
<td><code>128</code></td>
<td>512 B / vector</td>
<td>Moderate</td>
<td>Rough filtering, pre-screening</td>
</tr>
<tr>
<td><code>64</code></td>
<td>256 B / vector</td>
<td>Basic</td>
<td>Coarse clustering, topic grouping</td>
</tr>
</tbody>
</table>
</div>
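For intuition, MRL truncation is simply taking a prefix of the full vector and re-normalizing. A minimal client-side sketch (the service performs the equivalent step server-side when a smaller dimension is requested; this assumes embeddings are unit-normalized):

```python
def truncate_embedding(vec, dim):
    """MRL truncation: keep the first `dim` components, then re-normalize
    so cosine similarity still behaves as expected."""
    head = list(vec[:dim])
    norm = sum(x * x for x in head) ** 0.5
    return [x / norm for x in head] if norm else head
```

Vectors truncated this way remain comparable to each other, but not to vectors of a different dimension, which is why an index must commit to one dimension up front.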
<div class="alert alert-warning border-start border-4 border-warning">
<h3>Important: Consistency</h3>
<p class="mb-0">All vectors in the same index/collection <strong>must use the same dimension</strong>. Choose a dimension at index creation time and use it consistently for both indexing and querying. You cannot mix 512-d and 1024-d vectors in the same vector database index.</p>
</div>
<div class="alert alert-info border-start border-4 border-info">
<h3>Batch Size &amp; Microbatching</h3>
<p>The <code>max_batch_size</code> setting (default: <strong>32</strong>) controls the maximum number of inputs per API call. This is tuned for the 3090's 24 GB VRAM.</p>
<ul>
<li><strong>Text-only inputs:</strong> Batch sizes up to 32 are safe.</li>
<li><strong>Image inputs:</strong> Images consume significantly more VRAM. Reduce batch sizes to 8-16 when embedding images, depending on resolution.</li>
<li><strong>Mixed-modal inputs:</strong> Treat as image batches for sizing purposes.</li>
</ul>
<h4>Microbatching Strategy</h4>
<p>When processing large datasets (thousands of documents), <strong>do not send all items in a single request</strong>. Instead, implement client-side microbatching:</p>
<ol class="mb-0">
<li>Split your dataset into chunks of 16-32 items.</li>
<li>Send each chunk as a separate <code>/v1/embeddings</code> request.</li>
<li>Collect and concatenate the resulting vectors.</li>
<li>For images, use smaller chunk sizes (8-16) to avoid OOM errors.</li>
<li>Add a small delay between requests if processing thousands of items to avoid GPU thermal throttling.</li>
</ol>
</div>
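The microbatching steps above can be sketched as a small client-side helper. The function names are illustrative, and the 32/8 defaults simply mirror the batch guidance in this section:

```python
def batch_budget(items, text_batch=32, image_batch=8):
    """Use the smaller budget whenever a window contains any image input."""
    return image_batch if any("image" in it for it in items) else text_batch

def microbatches(items, text_batch=32, image_batch=8):
    """Yield chunks of embedding inputs sized for one /v1/embeddings call."""
    i = 0
    while i < len(items):
        size = batch_budget(items[i:i + text_batch], text_batch, image_batch)
        yield items[i:i + size]
        i += size
```

Each yielded chunk becomes one POST to /v1/embeddings; concatenate the returned vectors in order.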
<div class="alert alert-secondary border-start border-4 border-secondary">
<h3>Reranking Batch Limits</h3>
<p class="mb-0">The reranker also respects <code>max_batch_size</code> for the number of candidate documents. If you have more than 32 candidates, either pre-filter with embeddings first (recommended) or split into multiple rerank calls and merge results.</p>
</div>
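Splitting an oversized candidate set across several rerank calls and merging the results can be sketched as below. `rerank_fn` stands in for one /v1/rerank round trip; the merge assumes scores are computed per query-document pair (as with a cross-attention reranker) and are therefore comparable across calls:

```python
def rerank_in_batches(query, documents, rerank_fn, batch=32, top_n=10):
    """Rerank `documents` in chunks of `batch`, then merge by score.

    rerank_fn(query, docs) -> [{"index": int, "score": float}, ...]
    with indices local to the chunk it received.
    """
    merged = []
    for start in range(0, len(documents), batch):
        chunk = documents[start:start + batch]
        for r in rerank_fn(query, chunk):
            # Re-map chunk-local indices back to global positions.
            merged.append({"index": start + r["index"], "score": r["score"]})
    merged.sort(key=lambda r: r["score"], reverse=True)
    return merged[:top_n]
```

Pre-filtering with embeddings first remains preferable: fewer rerank calls, and the reranker only sees plausible candidates.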
</section>
<!-- ============================================================ -->
<!-- INTEGRATION GUIDE -->
<!-- ============================================================ -->
<section id="integration" class="mb-5">
<h2 class="h2 mb-4">Integration Guide</h2>
<div class="alert alert-primary border-start border-4 border-primary">
<h3>Configuring a Consuming Application</h3>
<p>To integrate Synesis into another system, configure these settings:</p>
<table class="table table-bordered mt-3 mb-0">
<thead class="table-dark">
<tr>
<th>Setting</th>
<th>Value</th>
<th>Notes</th>
</tr>
</thead>
<tbody>
<tr>
<td>Embedding API URL</td>
<td><code>http://&lt;host&gt;:8400/v1/embeddings</code></td>
<td>POST, JSON body</td>
</tr>
<tr>
<td>Rerank API URL</td>
<td><code>http://&lt;host&gt;:8400/v1/rerank</code></td>
<td>POST, JSON body</td>
</tr>
<tr>
<td>Health check URL</td>
<td><code>http://&lt;host&gt;:8400/ready/</code></td>
<td>GET, 200 = ready</td>
</tr>
<tr>
<td>Embedding dimension</td>
<td><code>2048</code> (or your chosen value)</td>
<td>Must match vector DB index config</td>
</tr>
<tr>
<td>Authentication</td>
<td>None</td>
<td>Secure via network policy</td>
</tr>
<tr>
<td>Content-Type</td>
<td><code>application/json</code></td>
<td>All endpoints</td>
</tr>
<tr>
<td>Timeout</td>
<td>30-60 seconds</td>
<td>Image inputs take longer; adjust for batch size</td>
</tr>
</tbody>
</table>
</div>
<h3 class="mt-4">Python Integration Example</h3>
<div class="card my-3">
<div class="card-body">
<pre class="mb-0">import requests

SYNESIS_URL = "http://synesis-host:8400"

# --- Generate embeddings ---
resp = requests.post(f"{SYNESIS_URL}/v1/embeddings", json={
    "inputs": [
        {"text": "How to train a neural network"},
        {"text": "Best practices for deep learning"},
    ],
    "dimension": 1024,
})
data = resp.json()
vectors = [e["embedding"] for e in data["embeddings"]]
# vectors[0] is a list of 1024 floats

# --- Rerank candidates ---
resp = requests.post(f"{SYNESIS_URL}/v1/rerank", json={
    "query": {"text": "neural network training"},
    "documents": [
        {"text": "Backpropagation adjusts weights using gradients..."},
        {"text": "The weather forecast for tomorrow is sunny..."},
        {"text": "Stochastic gradient descent is an optimization method..."},
    ],
    "top_n": 2,
})
ranked = resp.json()
for result in ranked["results"]:
    print(f"#{result['index']} score={result['score']:.4f}")
    print(f"  {result['document']['text'][:80]}")</pre>
</div>
</div>
<h3 class="mt-4">Typical Two-Stage Retrieval Pipeline</h3>
<div class="alert alert-info border-start border-4 border-info">
<ol class="mb-0">
<li><strong>Index time:</strong> Embed all documents via <code>/v1/embeddings</code> and store vectors in your vector database (e.g. pgvector, Qdrant, Milvus, Weaviate).</li>
<li><strong>Query time — Stage 1 (Recall):</strong> Embed the query via <code>/v1/embeddings</code>, perform approximate nearest neighbour (ANN) search in the vector DB to retrieve top 20-50 candidates.</li>
<li><strong>Query time — Stage 2 (Precision):</strong> Pass the query and candidates to <code>/v1/rerank</code> to get precise relevance scores. Return the top 5-10 to the user or LLM context.</li>
</ol>
</div>
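Stage 1 is whatever ANN query your vector database exposes; a brute-force cosine top-k makes the contract explicit (a sketch for intuition only, not a substitute for a real index):

```python
def cosine_top_k(query_vec, doc_vecs, k=50):
    """Stage 1 (recall): score every stored vector against the query
    by cosine similarity and return the k best (index, score) pairs."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    def norm(a):
        return sum(x * x for x in a) ** 0.5 or 1.0  # guard zero vectors
    q_norm = norm(query_vec)
    scored = [
        (i, dot(query_vec, v) / (q_norm * norm(v)))
        for i, v in enumerate(doc_vecs)
    ]
    scored.sort(key=lambda t: t[1], reverse=True)
    return scored[:k]
```

The winning indices (and their documents) are then handed to /v1/rerank for Stage 2.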
</section>
<!-- ============================================================ -->
<!-- SIMILARITY -->
<!-- ============================================================ -->
<section id="similarity" class="mb-5">
<h2 class="h2 mb-4">Similarity API</h2>
<div class="alert alert-primary border-start border-4 border-primary">
<h3>POST /v1/similarity</h3>
<p class="mb-0">Compute cosine similarity between exactly two inputs. A convenience wrapper — embeds both, normalizes, and returns the dot product.</p>
</div>
<h3 class="mt-4">Request Body</h3>
<table class="table table-bordered">
<thead class="table-dark">
<tr>
<th>Field</th>
<th>Type</th>
<th>Required</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>a</code></td>
<td>object</td>
<td>Yes</td>
<td>First input (<code>text</code>, <code>image</code>, or both).</td>
</tr>
<tr>
<td><code>b</code></td>
<td>object</td>
<td>Yes</td>
<td>Second input (<code>text</code>, <code>image</code>, or both).</td>
</tr>
<tr>
<td><code>dimension</code></td>
<td>int</td>
<td>No</td>
<td>Embedding dimension for comparison (64-2048). Default: 2048.</td>
</tr>
</tbody>
</table>
<h3 class="mt-4">Response Body</h3>
<table class="table table-bordered">
<thead class="table-dark">
<tr>
<th>Field</th>
<th>Type</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>score</code></td>
<td>float</td>
<td>Cosine similarity (-1.0 to 1.0). Higher = more similar.</td>
</tr>
<tr>
<td><code>dimension</code></td>
<td>int</td>
<td>Dimension used for the comparison.</td>
</tr>
</tbody>
</table>
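A request body for this endpoint can be validated and assembled with a small helper (the function name is illustrative; only the field shapes come from the tables above):

```python
def similarity_payload(a, b, dimension=2048):
    """Build a JSON body for POST /v1/similarity.

    `a` and `b` are input objects carrying "text", "image", or both.
    """
    for name, inp in (("a", a), ("b", b)):
        if not ("text" in inp or "image" in inp):
            raise ValueError(f"input {name!r} needs 'text' and/or 'image'")
    return {"a": a, "b": b, "dimension": dimension}
```

POST the returned dict as JSON; the response's score field is the cosine similarity.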
</section>
<!-- ============================================================ -->
<!-- OPERATIONS -->
<!-- ============================================================ -->
<section id="operations" class="mb-5">
<h2 class="h2 mb-4">Operations &amp; Monitoring</h2>
<div class="alert alert-info border-start border-4 border-info">
<h3>Health &amp; Readiness Endpoints</h3>
<table class="table table-bordered mt-3 mb-0">
<thead class="table-dark">
<tr>
<th>Endpoint</th>
<th>Method</th>
<th>Purpose</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>/ready/</code></td>
<td>GET</td>
<td>Readiness probe. Returns 200 when both models are loaded and GPU is available. 503 otherwise. Use for load balancer health checks.</td>
</tr>
<tr>
<td><code>/live/</code></td>
<td>GET</td>
<td>Liveness probe. Returns 200 if the process is alive. Use for container restart decisions.</td>
</tr>
<tr>
<td><code>/health</code></td>
<td>GET</td>
<td>Detailed status: model paths, loaded state, GPU device name, VRAM usage.</td>
</tr>
<tr>
<td><code>/models</code><br><code>/v1/models</code></td>
<td>GET</td>
<td>List available models (OpenAI-compatible). Returns model IDs, capabilities, and metadata. Used by OpenAI SDK clients for model discovery.</td>
</tr>
<tr>
<td><code>/metrics</code></td>
<td>GET</td>
<td>Prometheus metrics (request counts, latency histograms, GPU memory, model status).</td>
</tr>
</tbody>
</table>
</div>
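A deployment script can gate on the readiness probe with nothing but the standard library; a sketch (timeout values are illustrative):

```python
import time
import urllib.error
import urllib.request

def wait_until_ready(base_url, timeout_s=300, interval_s=5):
    """Poll GET /ready/ until it returns 200 or the deadline passes."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(f"{base_url}/ready/", timeout=10) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # service not up yet; keep polling
        time.sleep(interval_s)
    return False
```

Point it at http://&lt;host&gt;:8400 before sending the first embedding batch.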
<div class="alert alert-warning border-start border-4 border-warning">
<h3>Prometheus Metrics</h3>
<p>Key custom metrics exposed:</p>
<ul class="mb-0">
<li><code>embedding_model_loaded</code> — Gauge (1 = loaded)</li>
<li><code>reranker_model_loaded</code> — Gauge (1 = loaded)</li>
<li><code>embedding_gpu_memory_bytes</code> — Gauge (current GPU allocation)</li>
<li><code>embedding_inference_requests_total{endpoint}</code> — Counter per endpoint (embeddings, similarity, rerank)</li>
<li><code>embedding_inference_duration_seconds{endpoint}</code> — Histogram of inference latency</li>
<li>Plus standard HTTP metrics from <code>prometheus-fastapi-instrumentator</code></li>
</ul>
</div>
<div class="alert alert-secondary border-start border-4 border-secondary">
<h3>Environment Configuration</h3>
<p>All settings use the <code>EMBEDDING_</code> prefix and can be overridden via environment variables or <code>/etc/default/synesis</code>:</p>
<table class="table table-bordered mt-3 mb-0">
<thead class="table-dark">
<tr>
<th>Variable</th>
<th>Default</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>EMBEDDING_MODEL_PATH</code></td>
<td><code>./models/Qwen3-VL-Embedding-2B</code></td>
<td>Path to embedding model weights</td>
</tr>
<tr>
<td><code>EMBEDDING_RERANKER_MODEL_PATH</code></td>
<td><code>./models/Qwen3-VL-Reranker-2B</code></td>
<td>Path to reranker model weights</td>
</tr>
<tr>
<td><code>EMBEDDING_TORCH_DTYPE</code></td>
<td><code>float16</code></td>
<td>Model precision (<code>float16</code> or <code>bfloat16</code>)</td>
</tr>
<tr>
<td><code>EMBEDDING_USE_FLASH_ATTENTION</code></td>
<td><code>true</code></td>
<td>Enable Flash Attention 2</td>
</tr>
<tr>
<td><code>EMBEDDING_DEFAULT_DIMENSION</code></td>
<td><code>2048</code></td>
<td>Default embedding dimension when not specified per request</td>
</tr>
<tr>
<td><code>EMBEDDING_MAX_BATCH_SIZE</code></td>
<td><code>32</code></td>
<td>Maximum inputs per request (both embeddings and rerank)</td>
</tr>
<tr>
<td><code>EMBEDDING_HOST</code></td>
<td><code>0.0.0.0</code></td>
<td>Bind address</td>
</tr>
<tr>
<td><code>EMBEDDING_PORT</code></td>
<td><code>8400</code></td>
<td>Listen port</td>
</tr>
</tbody>
</table>
</div>
</section>
<!-- ============================================================ -->
<!-- ERROR HANDLING -->
<!-- ============================================================ -->
<section id="errors" class="mb-5">
<h2 class="h2 mb-4">Error Handling</h2>
<div class="alert alert-danger border-start border-4 border-danger">
<h3>HTTP Status Codes</h3>
<table class="table table-bordered mt-3 mb-0">
<thead class="table-dark">
<tr>
<th>Code</th>
<th>Meaning</th>
<th>Action</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>200</code></td>
<td>Success</td>
<td>Process the response.</td>
</tr>
<tr>
<td><code>422</code></td>
<td>Validation error</td>
<td>Check your request body. Batch size may exceed <code>max_batch_size</code>, or required fields are missing.</td>
</tr>
<tr>
<td><code>500</code></td>
<td>Inference error</td>
<td>Model failed during processing. Check server logs. May indicate OOM with large image batches.</td>
</tr>
<tr>
<td><code>503</code></td>
<td>Model not loaded</td>
<td>Service is starting up or a model failed to load. Retry after checking <code>/ready/</code>.</td>
</tr>
</tbody>
</table>
</div>
</section>
<!-- Footer -->
<div class="alert alert-secondary border-start border-4 border-secondary">
<p class="mb-0"><strong>Synesis v0.2.0</strong> — Qwen3-VL Embedding &amp; Reranking Service. For interactive API exploration, visit <code>/docs</code> on the running service.</p>
</div>
</div>
<!-- Bootstrap JS -->
<script src="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0/dist/js/bootstrap.bundle.min.js"></script>
<!-- Mermaid init -->
<script>
mermaid.initialize({
startOnLoad: true,
theme: window.matchMedia('(prefers-color-scheme: dark)').matches ? 'dark' : 'default'
});
</script>
<!-- Dark mode support -->
<script>
if (window.matchMedia('(prefers-color-scheme: dark)').matches) {
document.documentElement.setAttribute('data-bs-theme', 'dark');
}
window.matchMedia('(prefers-color-scheme: dark)').addEventListener('change', function(e) {
document.documentElement.setAttribute('data-bs-theme', e.matches ? 'dark' : 'light');
});
</script>
</body>
</html>

View File

@@ -11,7 +11,7 @@ class LibrarySerializer(serializers.Serializer):
uid = serializers.CharField(read_only=True)
name = serializers.CharField(max_length=200)
library_type = serializers.ChoiceField(
choices=["fiction", "nonfiction", "technical", "music", "film", "art", "journal"]
)
description = serializers.CharField(required=False, allow_blank=True, default="")
chunking_config = serializers.JSONField(required=False, default=dict)
@@ -81,3 +81,61 @@ class ImageSerializer(serializers.Serializer):
description = serializers.CharField(required=False, allow_blank=True, default="")
metadata = serializers.JSONField(required=False, default=dict)
created_at = serializers.DateTimeField(read_only=True)
# --- Phase 3: Search ---
class SearchRequestSerializer(serializers.Serializer):
query = serializers.CharField(max_length=2000)
library_uid = serializers.CharField(required=False, allow_blank=True)
library_type = serializers.ChoiceField(
choices=[
"fiction", "nonfiction", "technical", "music", "film", "art", "journal",
],
required=False,
)
collection_uid = serializers.CharField(required=False, allow_blank=True)
search_types = serializers.ListField(
child=serializers.ChoiceField(choices=["vector", "fulltext", "graph"]),
required=False,
default=["vector", "fulltext", "graph"],
)
limit = serializers.IntegerField(default=20, min_value=1, max_value=100)
rerank = serializers.BooleanField(default=True)
include_images = serializers.BooleanField(default=True)
class SearchCandidateSerializer(serializers.Serializer):
chunk_uid = serializers.CharField()
item_uid = serializers.CharField()
item_title = serializers.CharField()
library_type = serializers.CharField()
text_preview = serializers.CharField()
chunk_s3_key = serializers.CharField()
chunk_index = serializers.IntegerField()
score = serializers.FloatField()
source = serializers.CharField()
metadata = serializers.DictField(required=False, default=dict)
class ImageSearchResultSerializer(serializers.Serializer):
image_uid = serializers.CharField()
item_uid = serializers.CharField()
item_title = serializers.CharField()
image_type = serializers.CharField()
description = serializers.CharField()
s3_key = serializers.CharField()
score = serializers.FloatField()
source = serializers.CharField()
class SearchResponseSerializer(serializers.Serializer):
query = serializers.CharField()
candidates = SearchCandidateSerializer(many=True)
images = ImageSearchResultSerializer(many=True)
total_candidates = serializers.IntegerField()
search_time_ms = serializers.FloatField()
reranker_used = serializers.BooleanField()
reranker_model = serializers.CharField(allow_null=True)
search_types_used = serializers.ListField(child=serializers.CharField())

View File

@@ -21,4 +21,11 @@ urlpatterns = [
path("items/<str:uid>/", views.item_detail, name="item-detail"),
path("items/<str:uid>/reembed/", views.item_reembed, name="item-reembed"),
path("items/<str:uid>/status/", views.item_status, name="item-status"),
# Search (Phase 3)
path("search/", views.search, name="search"),
path("search/vector/", views.search_vector, name="search-vector"),
path("search/fulltext/", views.search_fulltext, name="search-fulltext"),
# Concepts (Phase 3)
path("concepts/", views.concept_list, name="concept-list"),
path("concepts/<str:uid>/graph/", views.concept_graph, name="concept-graph"),
]

View File

@@ -20,8 +20,11 @@ from library.content_types import get_library_type_config
from .serializers import (
CollectionSerializer,
ConceptSerializer,
ItemSerializer,
LibrarySerializer,
SearchRequestSerializer,
SearchResponseSerializer,
)
logger = logging.getLogger(__name__)
@@ -424,3 +427,211 @@ def item_status(request, uid):
"error_message": item.error_message,
}
)
# ---------------------------------------------------------------------------
# Search API (Phase 3)
# ---------------------------------------------------------------------------
@api_view(["POST"])
@permission_classes([IsAuthenticated])
def search(request):
"""
Full hybrid search: vector + fulltext + graph → fusion → re-ranking.
Accepts JSON body with query, optional filters, and search parameters.
Returns ranked candidates with scores and metadata.
"""
from django.conf import settings as django_settings
from library.services.search import SearchRequest, SearchService
serializer = SearchRequestSerializer(data=request.data)
serializer.is_valid(raise_exception=True)
data = serializer.validated_data
search_request = SearchRequest(
query=data["query"],
library_uid=data.get("library_uid") or None,
library_type=data.get("library_type") or None,
collection_uid=data.get("collection_uid") or None,
search_types=data.get("search_types", ["vector", "fulltext", "graph"]),
limit=data.get("limit", getattr(django_settings, "SEARCH_DEFAULT_LIMIT", 20)),
vector_top_k=getattr(django_settings, "SEARCH_VECTOR_TOP_K", 50),
fulltext_top_k=getattr(django_settings, "SEARCH_FULLTEXT_TOP_K", 30),
rerank=data.get("rerank", True),
include_images=data.get("include_images", True),
)
service = SearchService(user=request.user)
response = service.search(search_request)
return Response(SearchResponseSerializer(response).data)
@api_view(["POST"])
@permission_classes([IsAuthenticated])
def search_vector(request):
"""Vector-only search (debugging endpoint)."""
from django.conf import settings as django_settings
from library.services.search import SearchRequest, SearchService
serializer = SearchRequestSerializer(data=request.data)
serializer.is_valid(raise_exception=True)
data = serializer.validated_data
search_request = SearchRequest(
query=data["query"],
library_uid=data.get("library_uid") or None,
library_type=data.get("library_type") or None,
collection_uid=data.get("collection_uid") or None,
search_types=["vector"],
limit=data.get("limit", 20),
vector_top_k=getattr(django_settings, "SEARCH_VECTOR_TOP_K", 50),
rerank=False,
include_images=False,
)
service = SearchService(user=request.user)
response = service.search(search_request)
return Response(SearchResponseSerializer(response).data)
@api_view(["POST"])
@permission_classes([IsAuthenticated])
def search_fulltext(request):
"""Full-text-only search (debugging endpoint)."""
from django.conf import settings as django_settings
from library.services.search import SearchRequest, SearchService
serializer = SearchRequestSerializer(data=request.data)
serializer.is_valid(raise_exception=True)
data = serializer.validated_data
search_request = SearchRequest(
query=data["query"],
library_uid=data.get("library_uid") or None,
library_type=data.get("library_type") or None,
collection_uid=data.get("collection_uid") or None,
search_types=["fulltext"],
limit=data.get("limit", 20),
fulltext_top_k=getattr(django_settings, "SEARCH_FULLTEXT_TOP_K", 30),
rerank=False,
include_images=False,
)
service = SearchService(user=request.user)
response = service.search(search_request)
return Response(SearchResponseSerializer(response).data)
# ---------------------------------------------------------------------------
# Concept API (Phase 3)
# ---------------------------------------------------------------------------
@api_view(["GET"])
@permission_classes([IsAuthenticated])
def concept_list(request):
"""List or search concepts."""
from library.models import Concept
query = request.query_params.get("q", "")
limit = min(int(request.query_params.get("limit", 50)), 100)
if query:
# Search via fulltext index
try:
from neomodel import db
results, _ = db.cypher_query(
"CALL db.index.fulltext.queryNodes('concept_name_fulltext', $query) "
"YIELD node, score "
"RETURN node.uid AS uid, node.name AS name, "
" node.concept_type AS concept_type, score "
"ORDER BY score DESC LIMIT $limit",
{"query": query, "limit": limit},
)
concepts = [
{"uid": r[0], "name": r[1], "concept_type": r[2] or "", "score": r[3]}
for r in results
]
return Response({"concepts": concepts, "count": len(concepts)})
except Exception as exc:
logger.error("Concept search failed: %s", exc)
return Response(
{"detail": f"Search failed: {exc}"},
status=status.HTTP_500_INTERNAL_SERVER_ERROR,
)
else:
try:
concepts = Concept.nodes.order_by("name")[:limit]
return Response(
{
"concepts": ConceptSerializer(concepts, many=True).data,
"count": len(concepts),
}
)
except Exception as exc:
logger.error("Concept list failed: %s", exc)
return Response(
{"detail": f"Failed: {exc}"},
status=status.HTTP_500_INTERNAL_SERVER_ERROR,
)
@api_view(["GET"])
@permission_classes([IsAuthenticated])
def concept_graph(request, uid):
"""Get a concept's neighborhood graph (connected concepts, chunks, items)."""
try:
from neomodel import db
# Get the concept and its connections
results, _ = db.cypher_query(
"MATCH (c:Concept {uid: $uid}) "
"OPTIONAL MATCH (c)<-[:MENTIONS]-(chunk:Chunk)<-[:HAS_CHUNK]-(item:Item) "
"OPTIONAL MATCH (c)<-[:DEPICTS]-(img:Image)<-[:HAS_IMAGE]-(img_item:Item) "
"OPTIONAL MATCH (c)-[:RELATED_TO]-(related:Concept) "
"RETURN c.uid AS uid, c.name AS name, c.concept_type AS concept_type, "
" collect(DISTINCT {uid: item.uid, title: item.title})[..20] AS items, "
" collect(DISTINCT {uid: related.uid, name: related.name, "
" concept_type: related.concept_type}) AS related_concepts, "
" count(DISTINCT chunk) AS chunk_count, "
" count(DISTINCT img) AS image_count",
{"uid": uid},
)
if not results or not results[0][0]:
return Response(
{"detail": "Concept not found."}, status=status.HTTP_404_NOT_FOUND
)
row = results[0]
# Filter out null entries from collected lists
items = [i for i in (row[3] or []) if i.get("uid")]
related = [r for r in (row[4] or []) if r.get("uid")]
return Response(
{
"uid": row[0],
"name": row[1],
"concept_type": row[2] or "",
"items": items,
"related_concepts": related,
"chunk_count": row[5] or 0,
"image_count": row[6] or 0,
}
)
except Exception as exc:
logger.error("Concept graph query failed: %s", exc)
return Response(
{"detail": f"Failed: {exc}"},
status=status.HTTP_500_INTERNAL_SERVER_ERROR,
)

View File

@@ -0,0 +1,161 @@
"""
Management command to run a search query from the command line.
Usage:
python manage.py search "how do neural networks work"
python manage.py search "motor wiring" --library-uid abc123
python manage.py search "quantum physics" --limit 5 --no-rerank
"""
import json
import logging
from django.conf import settings
from django.core.management.base import BaseCommand
logger = logging.getLogger(__name__)
class Command(BaseCommand):
help = "Run a hybrid search query against the Mnemosyne knowledge graph."
def add_arguments(self, parser):
parser.add_argument("query", type=str, help="Search query text")
parser.add_argument(
"--library-uid",
type=str,
default="",
help="Scope search to a specific library UID",
)
parser.add_argument(
"--library-type",
type=str,
default="",
help="Scope search to a library type (fiction, technical, etc.)",
)
parser.add_argument(
"--limit",
type=int,
default=0,
help="Maximum results (default: from settings)",
)
parser.add_argument(
"--no-rerank",
action="store_true",
help="Skip re-ranking (return fusion results only)",
)
parser.add_argument(
"--types",
type=str,
default="vector,fulltext,graph",
help="Comma-separated search types (default: vector,fulltext,graph)",
)
parser.add_argument(
"--json",
action="store_true",
help="Output results as JSON",
)
def handle(self, *args, **options):
from library.services.search import SearchRequest, SearchService
query = options["query"]
limit = options["limit"] or getattr(settings, "SEARCH_DEFAULT_LIMIT", 20)
search_types = [t.strip() for t in options["types"].split(",") if t.strip()]
rerank = not options["no_rerank"]
self.stdout.write(
self.style.HTTP_INFO(
f'Searching: "{query}" (types={search_types}, limit={limit}, rerank={rerank})'
)
)
request = SearchRequest(
query=query,
library_uid=options["library_uid"] or None,
library_type=options["library_type"] or None,
search_types=search_types,
limit=limit,
vector_top_k=getattr(settings, "SEARCH_VECTOR_TOP_K", 50),
fulltext_top_k=getattr(settings, "SEARCH_FULLTEXT_TOP_K", 30),
rerank=rerank,
include_images=True,
)
service = SearchService()
response = service.search(request)
if options["json"]:
self._output_json(response)
else:
self._output_text(response)
def _output_text(self, response):
"""Format results as human-readable text."""
self.stdout.write("")
self.stdout.write(
self.style.SUCCESS(
f"Found {len(response.candidates)} results "
f"({response.total_candidates} candidates, "
f"{response.search_time_ms:.0f}ms, "
f"types={response.search_types_used}, "
f"reranked={response.reranker_used})"
)
)
self.stdout.write("")
for i, c in enumerate(response.candidates, 1):
self.stdout.write(
self.style.WARNING(f" [{i}] {c.item_title} (chunk #{c.chunk_index})")
)
self.stdout.write(f" Score: {c.score:.6f} Source: {c.source}")
if c.library_type:
self.stdout.write(f" Type: {c.library_type}")
preview = c.text_preview[:200].replace("\n", " ")
self.stdout.write(f" {preview}")
self.stdout.write("")
if response.images:
self.stdout.write(
self.style.SUCCESS(f"\nImage results: {len(response.images)}")
)
for img in response.images:
self.stdout.write(
f" [{img.image_type}] {img.item_title} - {img.description[:80]} "
f"(score: {img.score:.4f})"
)
def _output_json(self, response):
"""Format results as JSON."""
data = {
"query": response.query,
"total_candidates": response.total_candidates,
"search_time_ms": response.search_time_ms,
"reranker_used": response.reranker_used,
"reranker_model": response.reranker_model,
"search_types_used": response.search_types_used,
"candidates": [
{
"chunk_uid": c.chunk_uid,
"item_uid": c.item_uid,
"item_title": c.item_title,
"library_type": c.library_type,
"text_preview": c.text_preview[:200],
"chunk_index": c.chunk_index,
"score": c.score,
"source": c.source,
}
for c in response.candidates
],
"images": [
{
"image_uid": img.image_uid,
"item_title": img.item_title,
"image_type": img.image_type,
"description": img.description[:200],
"score": img.score,
}
for img in response.images
],
}
self.stdout.write(json.dumps(data, indent=2))

View File

@@ -0,0 +1,127 @@
"""
Management command to show search index statistics.
Usage:
python manage.py search_stats
"""
import logging
from django.core.management.base import BaseCommand
logger = logging.getLogger(__name__)
class Command(BaseCommand):
help = "Show search index statistics from Neo4j."
def handle(self, *args, **options):
try:
from neomodel import db
except Exception as exc:
self.stderr.write(self.style.ERROR(f"Cannot import neomodel: {exc}"))
return
self.stdout.write(self.style.HTTP_INFO("Mnemosyne Search Index Statistics"))
self.stdout.write("=" * 50)
# Node counts
self.stdout.write(self.style.MIGRATE_HEADING("\nNode Counts:"))
for label in [
"Library", "Collection", "Item", "Chunk",
"Concept", "Image", "ImageEmbedding",
]:
try:
results, _ = db.cypher_query(f"MATCH (n:{label}) RETURN count(n)")
count = results[0][0] if results else 0
self.stdout.write(f" {label:20s} {count:>8,d}")
except Exception as exc:
self.stdout.write(f" {label:20s} ERROR: {exc}")
# Embedded chunks
self.stdout.write(self.style.MIGRATE_HEADING("\nEmbedding Coverage:"))
try:
results, _ = db.cypher_query(
"MATCH (c:Chunk) "
"RETURN count(c) AS total, "
" count(CASE WHEN c.embedding IS NOT NULL THEN 1 END) AS embedded"
)
total = results[0][0] if results else 0
embedded = results[0][1] if results else 0
pct = (embedded / total * 100) if total > 0 else 0
self.stdout.write(f" Chunks total: {total:>8,d}")
self.stdout.write(f" Chunks embedded: {embedded:>8,d} ({pct:.1f}%)")
except Exception as exc:
self.stdout.write(f" Error: {exc}")
try:
results, _ = db.cypher_query(
"MATCH (ie:ImageEmbedding) RETURN count(ie)"
)
count = results[0][0] if results else 0
self.stdout.write(f" Image embeddings: {count:>8,d}")
except Exception:
pass
# Indexes
self.stdout.write(self.style.MIGRATE_HEADING("\nIndexes:"))
try:
results, _ = db.cypher_query(
"SHOW INDEXES YIELD name, type, labelsOrTypes, properties, state "
"RETURN name, type, labelsOrTypes, properties, state "
"ORDER BY type, name"
)
for row in results:
name, idx_type, labels, props, state = row
labels_str = ", ".join(labels) if labels else "?"
props_str = ", ".join(props) if props else "?"
state_style = self.style.SUCCESS if state == "ONLINE" else self.style.WARNING
self.stdout.write(
f" {name:35s} {idx_type:12s} {labels_str}({props_str}) "
f"[{state_style(state)}]"
)
except Exception as exc:
self.stdout.write(f" Error listing indexes: {exc}")
# Relationship counts
self.stdout.write(self.style.MIGRATE_HEADING("\nRelationship Counts:"))
for rel_type in [
"CONTAINS", "BELONGS_TO", "HAS_CHUNK", "HAS_IMAGE",
"HAS_EMBEDDING", "MENTIONS", "REFERENCES", "DEPICTS",
"RELATED_TO", "HAS_NEARBY_IMAGE",
]:
try:
results, _ = db.cypher_query(
f"MATCH ()-[r:{rel_type}]->() RETURN count(r)"
)
count = results[0][0] if results else 0
if count > 0:
self.stdout.write(f" {rel_type:20s} {count:>8,d}")
except Exception:
pass
# System models
self.stdout.write(self.style.MIGRATE_HEADING("\nSystem Models:"))
try:
from llm_manager.models import LLMModel
for name, getter in [
("Embedding", LLMModel.get_system_embedding_model),
("Chat", LLMModel.get_system_chat_model),
("Reranker", LLMModel.get_system_reranker_model),
("Vision", LLMModel.get_system_vision_model),
]:
model = getter()
if model:
self.stdout.write(
f" {name:12s} {self.style.SUCCESS(model.name)} "
f"({model.api.name})"
)
else:
self.stdout.write(
f" {name:12s} {self.style.WARNING('Not configured')}"
)
except Exception as exc:
self.stdout.write(f" Error loading models: {exc}")
self.stdout.write("")

View File

@@ -106,6 +106,58 @@ VISION_CONCEPTS_EXTRACTED_TOTAL = Counter(
["concept_type"],
)
# --- Search ---
SEARCH_REQUESTS_TOTAL = Counter(
"mnemosyne_search_requests_total",
"Total search requests",
["search_type", "library_type"],
)
SEARCH_DURATION = Histogram(
"mnemosyne_search_duration_seconds",
"Time per search type execution",
["search_type"],
buckets=[0.01, 0.05, 0.1, 0.25, 0.5, 1, 2, 5, 10],
)
SEARCH_CANDIDATES_TOTAL = Histogram(
"mnemosyne_search_candidates_total",
"Number of candidates returned per search type",
["search_type"],
buckets=[0, 1, 5, 10, 20, 50, 100],
)
SEARCH_TOTAL_DURATION = Histogram(
"mnemosyne_search_total_duration_seconds",
"End-to-end search latency including fusion and re-ranking",
buckets=[0.1, 0.25, 0.5, 1, 2, 5, 10, 30],
)
# --- Fusion ---
FUSION_DURATION = Histogram(
"mnemosyne_fusion_duration_seconds",
"Time to perform Reciprocal Rank Fusion",
buckets=[0.001, 0.005, 0.01, 0.05, 0.1, 0.5],
)
# --- Re-ranking ---
RERANK_REQUESTS_TOTAL = Counter(
"mnemosyne_rerank_requests_total",
"Total re-ranking requests",
["model_name", "status"],
)
RERANK_DURATION = Histogram(
"mnemosyne_rerank_duration_seconds",
"Time per re-ranking request",
["model_name"],
buckets=[0.1, 0.25, 0.5, 1, 2, 5, 10, 30],
)
RERANK_CANDIDATES = Histogram(
"mnemosyne_rerank_candidates",
"Number of candidates sent to re-ranker",
buckets=[1, 5, 10, 20, 32, 50, 100],
)
# --- System State ---
EMBEDDING_QUEUE_SIZE = Gauge(

View File

@@ -0,0 +1,129 @@
"""
Reciprocal Rank Fusion (RRF) for merging multiple ranked result lists.
Combines results from vector search, full-text search, and graph traversal
into a single ranked list. Candidates appearing in multiple lists receive
a natural score boost.
"""
import logging
import time
from dataclasses import dataclass, field
from library.metrics import FUSION_DURATION
logger = logging.getLogger(__name__)
@dataclass
class SearchCandidate:
"""
A single search result candidate (text chunk).
Carries enough context for re-ranking, display, and citation.
"""
chunk_uid: str
item_uid: str
item_title: str
library_type: str
text_preview: str
chunk_s3_key: str
chunk_index: int
score: float
source: str # "vector", "fulltext", "graph"
metadata: dict = field(default_factory=dict)
@dataclass
class ImageSearchResult:
"""A search result for an image."""
image_uid: str
item_uid: str
item_title: str
image_type: str
description: str
s3_key: str
score: float
source: str # "vector", "graph"
def reciprocal_rank_fusion(
result_lists: list[list[SearchCandidate]],
k: int = 60,
limit: int = 50,
) -> list[SearchCandidate]:
"""
Merge multiple ranked result lists using Reciprocal Rank Fusion.
RRF score for candidate c = Σ 1 / (k + rank_i)
where rank_i is the 1-based position in list i.
Candidates appearing in multiple lists receive a natural boost
because they accumulate scores from each list.
:param result_lists: List of ranked candidate lists (one per search type).
:param k: RRF constant (higher = less emphasis on top ranks). Default 60.
:param limit: Maximum number of results to return.
:returns: Merged, deduplicated, and re-scored candidates sorted by RRF score.
"""
start = time.time()
# Accumulate RRF scores by chunk_uid
scores: dict[str, float] = {}
candidates: dict[str, SearchCandidate] = {}
sources: dict[str, list[str]] = {}
for result_list in result_lists:
for rank, candidate in enumerate(result_list, start=1):
uid = candidate.chunk_uid
rrf_score = 1.0 / (k + rank)
scores[uid] = scores.get(uid, 0.0) + rrf_score
# Keep the candidate with the highest original score
if uid not in candidates or candidate.score > candidates[uid].score:
candidates[uid] = candidate
# Track which search types found this candidate
sources.setdefault(uid, [])
if candidate.source not in sources[uid]:
sources[uid].append(candidate.source)
# Build fused results
fused = []
for uid, rrf_score in scores.items():
candidate = candidates[uid]
# Update score to RRF score
fused_candidate = SearchCandidate(
chunk_uid=candidate.chunk_uid,
item_uid=candidate.item_uid,
item_title=candidate.item_title,
library_type=candidate.library_type,
text_preview=candidate.text_preview,
chunk_s3_key=candidate.chunk_s3_key,
chunk_index=candidate.chunk_index,
score=rrf_score,
source="+".join(sources[uid]),
metadata={**candidate.metadata, "sources": sources[uid]},
)
fused.append(fused_candidate)
# Sort by RRF score descending
fused.sort(key=lambda c: c.score, reverse=True)
# Trim to limit
fused = fused[:limit]
elapsed = time.time() - start
FUSION_DURATION.observe(elapsed)
logger.debug(
"RRF fusion input_lists=%d total_candidates=%d fused=%d elapsed=%.3fs",
len(result_lists),
sum(len(rl) for rl in result_lists),
len(fused),
elapsed,
)
return fused
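The fusion math above can be sanity-checked in isolation. A minimal sketch of the same formula with toy ids (not real chunk UIDs), showing the cross-list boost:

```python
# Standalone sketch of the RRF formula implemented above (k = 60):
# a candidate at 1-based rank r in a list contributes 1 / (k + r),
# so a candidate found by several search types accumulates score.

def rrf_scores(result_lists, k=60):
    """Merge ranked lists of ids into a {id: rrf_score} mapping."""
    scores = {}
    for ranked_ids in result_lists:
        for rank, uid in enumerate(ranked_ids, start=1):
            scores[uid] = scores.get(uid, 0.0) + 1.0 / (k + rank)
    return scores

vector_hits = ["c1", "c2", "c3"]   # ranked by cosine similarity
fulltext_hits = ["c2", "c4"]       # ranked by BM25
merged = rrf_scores([vector_hits, fulltext_hits])
ranked = sorted(merged, key=merged.get, reverse=True)
# "c2" (ranks 2 and 1) accumulates 1/62 + 1/61 and outranks
# "c1", which was rank 1 in a single list (1/61).
```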

View File

@@ -0,0 +1,246 @@
"""
Re-ranking client for Synesis.
Sends query-document pairs to Synesis's /v1/rerank endpoint for
cross-attention relevance scoring. Supports content-type-aware
instruction injection and multimodal queries.
"""
import base64
import logging
import time
from typing import Optional
import requests
from library.metrics import (
RERANK_CANDIDATES,
RERANK_DURATION,
RERANK_REQUESTS_TOTAL,
)
logger = logging.getLogger(__name__)
class RerankerClient:
"""
Client for re-ranking search candidates via Synesis.
Uses the system reranker model's API configuration to send
query-document pairs to Synesis's ``POST /v1/rerank`` endpoint.
"""
def __init__(self, reranker_model, user=None):
"""
:param reranker_model: ``LLMModel`` instance for re-ranking.
:param user: Optional Django user for usage tracking.
"""
self.model = reranker_model
self.api = reranker_model.api
self.user = user
self.base_url = self.api.base_url.rstrip("/")
self.model_name = self.model.name
self.timeout = self.api.timeout_seconds or 30
logger.info(
"RerankerClient initialized model=%s base_url=%s",
self.model_name,
self.base_url,
)
def rerank(
self,
query: str,
candidates: list,
instruction: str = "",
top_n: Optional[int] = None,
query_image: Optional[bytes] = None,
) -> list:
"""
Re-rank search candidates via Synesis /v1/rerank.
:param query: Query text.
:param candidates: List of SearchCandidate instances.
:param instruction: Content-type-aware re-ranking instruction.
:param top_n: Return only top N results (server-side truncation).
:param query_image: Optional image bytes for multimodal query.
:returns: Re-ranked list of SearchCandidate instances with updated scores.
"""
if not candidates:
return []
RERANK_CANDIDATES.observe(len(candidates))
start = time.time()
try:
# Build Synesis rerank request
payload = self._build_payload(
query, candidates, instruction, top_n, query_image
)
url = f"{self.base_url}/v1/rerank"
headers = {"Content-Type": "application/json"}
if self.api.api_key:
headers["Authorization"] = f"Bearer {self.api.api_key}"
logger.debug(
"Rerank request candidates=%d instruction_len=%d top_n=%s",
len(candidates),
len(instruction),
top_n,
)
resp = requests.post(
url, json=payload, headers=headers, timeout=self.timeout
)
if resp.status_code != 200:
logger.error(
"Rerank failed status=%d body=%s",
resp.status_code,
resp.text[:500],
)
resp.raise_for_status()
data = resp.json()
# Parse results and update candidate scores
reranked = self._apply_scores(candidates, data)
elapsed = time.time() - start
RERANK_DURATION.labels(model_name=self.model_name).observe(elapsed)
RERANK_REQUESTS_TOTAL.labels(
model_name=self.model_name, status="success"
).inc()
logger.info(
"Rerank completed candidates=%d returned=%d elapsed=%.3fs",
len(candidates),
len(reranked),
elapsed,
)
self._log_usage(len(candidates), elapsed)
return reranked
except Exception as exc:
elapsed = time.time() - start
RERANK_REQUESTS_TOTAL.labels(
model_name=self.model_name, status="error"
).inc()
logger.error("Rerank failed: %s", exc, exc_info=True)
raise
def _build_payload(
self,
query: str,
candidates: list,
instruction: str,
top_n: Optional[int],
query_image: Optional[bytes],
) -> dict:
"""
Build the Synesis /v1/rerank request payload.
:param query: Query text.
:param candidates: List of SearchCandidate instances.
:param instruction: Re-ranking instruction.
:param top_n: Optional top-N limit.
:param query_image: Optional query image bytes.
:returns: Request payload dict.
"""
# Build query object
query_obj: dict = {"text": query}
if query_image:
b64 = base64.b64encode(query_image).decode("utf-8")
query_obj["image"] = f"data:image/png;base64,{b64}"
# Build document objects from candidate text previews
documents = []
for candidate in candidates:
doc: dict = {"text": candidate.text_preview}
documents.append(doc)
payload: dict = {
"query": query_obj,
"documents": documents,
}
if instruction:
payload["instruction"] = instruction
if top_n is not None and top_n > 0:
payload["top_n"] = top_n
return payload
def _apply_scores(self, candidates: list, data: dict) -> list:
"""
Apply Synesis rerank scores back to SearchCandidate instances.
Synesis returns results sorted by score descending with original indices.
:param candidates: Original candidate list.
:param data: Synesis response.
:returns: Re-ranked candidates with updated scores.
"""
from .fusion import SearchCandidate
results = data.get("results", [])
if not results:
logger.warning("Rerank response contained no results")
return candidates
# Map original index → score
reranked = []
for result in results:
orig_idx = result.get("index", 0)
score = result.get("score", 0.0)
if orig_idx < len(candidates):
original = candidates[orig_idx]
reranked.append(
SearchCandidate(
chunk_uid=original.chunk_uid,
item_uid=original.item_uid,
item_title=original.item_title,
library_type=original.library_type,
text_preview=original.text_preview,
chunk_s3_key=original.chunk_s3_key,
chunk_index=original.chunk_index,
score=score,
source=original.source,
metadata={**original.metadata, "reranked": True},
)
)
# Already sorted by Synesis, but ensure descending order
reranked.sort(key=lambda c: c.score, reverse=True)
return reranked
def _log_usage(self, document_count: int, elapsed: float):
"""
Log re-ranking usage to LLMUsage model.
:param document_count: Number of documents re-ranked.
:param elapsed: Time taken in seconds.
"""
try:
from llm_manager.models import LLMUsage
LLMUsage.objects.create(
model=self.model,
user=self.user,
input_tokens=document_count, # Approximate — one "token" per doc
output_tokens=0,
cached_tokens=0,
total_cost=0,
purpose="reranking",
request_metadata={
"document_count": document_count,
"elapsed_ms": round(elapsed * 1000, 2),
},
)
except Exception as exc:
logger.warning("Failed to log rerank usage: %s", exc)
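For reference, a minimal round-trip sketch of the payload `_build_payload` produces and the index-to-score mapping `_apply_scores` performs. Field names mirror the client code above, not any official Synesis schema, so treat them as assumptions:

```python
# Hypothetical request/response for POST /v1/rerank. The real
# Synesis schema may differ; field names are assumptions taken
# from the client above.
payload = {
    "query": {"text": "neo4j vector index"},
    "documents": [
        {"text": "Neo4j 5.x ships native vector indexes."},
        {"text": "Reciprocal Rank Fusion merges ranked lists."},
    ],
    "instruction": "Rank passages by technical relevance.",
    "top_n": 2,
}

# Simulated response: sorted by score descending, each result
# carrying the index of the document in the original request.
response = {"results": [{"index": 0, "score": 0.91}, {"index": 1, "score": 0.12}]}

# The same mapping _apply_scores applies to SearchCandidate objects:
reranked = [
    (payload["documents"][r["index"]]["text"], r["score"])
    for r in response["results"]
    if r["index"] < len(payload["documents"])
]
```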

View File

@@ -0,0 +1,734 @@
"""
Hybrid search service for the Mnemosyne knowledge graph.
Orchestrates vector search, full-text search, graph traversal, and
image search against Neo4j — then fuses results via Reciprocal Rank
Fusion and optionally re-ranks via Synesis.
"""
import logging
import time
from dataclasses import dataclass, field
from typing import Optional
from django.conf import settings
from neomodel import db
from library.metrics import (
SEARCH_CANDIDATES_TOTAL,
SEARCH_DURATION,
SEARCH_REQUESTS_TOTAL,
SEARCH_TOTAL_DURATION,
)
from .fusion import ImageSearchResult, SearchCandidate, reciprocal_rank_fusion
logger = logging.getLogger(__name__)
@dataclass
class SearchRequest:
"""Parameters for a search query."""
query: str
query_image: Optional[bytes] = None
library_uid: Optional[str] = None
library_type: Optional[str] = None
collection_uid: Optional[str] = None
search_types: list[str] = field(
default_factory=lambda: ["vector", "fulltext", "graph"]
)
limit: int = 20
vector_top_k: int = 50
fulltext_top_k: int = 30
graph_max_depth: int = 2
rerank: bool = True
include_images: bool = True
@dataclass
class SearchResponse:
"""Results from a search query."""
query: str
candidates: list[SearchCandidate]
images: list[ImageSearchResult]
total_candidates: int
search_time_ms: float
reranker_used: bool
reranker_model: Optional[str]
search_types_used: list[str]
class SearchService:
"""
Orchestrates hybrid search across the Mnemosyne knowledge graph.
Search pipeline:
1. Embed query text via system embedding model
2. Run enabled search types in parallel (vector, fulltext, graph)
3. Fuse results via Reciprocal Rank Fusion
4. Optionally re-rank via Synesis
5. Optionally search images
"""
def __init__(self, user=None):
"""
:param user: Optional Django user for usage tracking.
"""
self.user = user
def search(self, request: SearchRequest) -> SearchResponse:
"""
Execute a hybrid search query.
:param request: SearchRequest with query and parameters.
:returns: SearchResponse with ranked results.
:raises ValueError: If no embedding model configured.
"""
start_time = time.time()
library_type_label = request.library_type or "all"
# --- Embed the query ---
query_vector = self._embed_query(request)
# --- Run search types ---
result_lists = []
search_types_used = []
if "vector" in request.search_types and query_vector:
vector_results = self._vector_search(request, query_vector)
if vector_results:
result_lists.append(vector_results)
search_types_used.append("vector")
SEARCH_REQUESTS_TOTAL.labels(
search_type="vector", library_type=library_type_label
).inc()
if "fulltext" in request.search_types:
fulltext_results = self._fulltext_search(request)
if fulltext_results:
result_lists.append(fulltext_results)
search_types_used.append("fulltext")
SEARCH_REQUESTS_TOTAL.labels(
search_type="fulltext", library_type=library_type_label
).inc()
if "graph" in request.search_types:
graph_results = self._graph_search(request)
if graph_results:
result_lists.append(graph_results)
search_types_used.append("graph")
SEARCH_REQUESTS_TOTAL.labels(
search_type="graph", library_type=library_type_label
).inc()
total_candidates = sum(len(rl) for rl in result_lists)
# --- Fuse results ---
rrf_k = getattr(settings, "SEARCH_RRF_K", 60)
fused = reciprocal_rank_fusion(
result_lists, k=rrf_k, limit=request.limit * 2
)
# --- Re-rank ---
reranker_used = False
reranker_model_name = None
if request.rerank and fused:
reranked, model_name = self._rerank(request, fused)
if reranked is not None:
fused = reranked
reranker_used = True
reranker_model_name = model_name
# Trim to limit
fused = fused[: request.limit]
# --- Image search ---
images = []
if request.include_images and query_vector:
images = self._image_search(request, query_vector)
elapsed_ms = (time.time() - start_time) * 1000
SEARCH_TOTAL_DURATION.observe(elapsed_ms / 1000)
logger.info(
"Search completed query='%s' types=%s total_candidates=%d "
"fused=%d reranked=%s elapsed=%.1fms",
request.query[:80],
search_types_used,
total_candidates,
len(fused),
reranker_used,
elapsed_ms,
)
return SearchResponse(
query=request.query,
candidates=fused,
images=images,
total_candidates=total_candidates,
search_time_ms=round(elapsed_ms, 2),
reranker_used=reranker_used,
reranker_model=reranker_model_name,
search_types_used=search_types_used,
)
# ------------------------------------------------------------------
# Query embedding
# ------------------------------------------------------------------
def _embed_query(self, request: SearchRequest) -> Optional[list[float]]:
"""
Embed the query text using the system embedding model.
Prepends the library's embedding_instruction when scoped to
a specific library for vector space alignment.
:param request: SearchRequest.
:returns: Query embedding vector, or None if not available.
"""
from llm_manager.models import LLMModel
from .embedding_client import EmbeddingClient
embedding_model = LLMModel.get_system_embedding_model()
if not embedding_model:
logger.warning("No system embedding model configured — skipping vector search")
return None
# Get embedding instruction for library-scoped search
instruction = ""
if request.library_uid:
instruction = self._get_embedding_instruction(request.library_uid)
elif request.library_type:
instruction = self._get_type_embedding_instruction(request.library_type)
# Build query text with instruction prefix
query_text = request.query
if instruction:
query_text = f"{instruction}\n\n{query_text}"
try:
client = EmbeddingClient(embedding_model, user=self.user)
vector = client.embed_text(query_text)
logger.debug(
"Query embedded dimensions=%d instruction_len=%d",
len(vector),
len(instruction),
)
return vector
except Exception as exc:
logger.error("Query embedding failed: %s", exc)
return None
# ------------------------------------------------------------------
# Vector search
# ------------------------------------------------------------------
def _vector_search(
self, request: SearchRequest, query_vector: list[float]
) -> list[SearchCandidate]:
"""
Search Chunk embeddings via Neo4j vector index.
:param request: SearchRequest with scope parameters.
:param query_vector: Embedded query vector.
:returns: List of SearchCandidate sorted by cosine similarity.
"""
start = time.time()
top_k = request.vector_top_k
# Build Cypher with optional filtering
cypher = """
CALL db.index.vector.queryNodes('chunk_embedding_index', $top_k, $query_vector)
YIELD node AS chunk, score
MATCH (item:Item)-[:HAS_CHUNK]->(chunk)
OPTIONAL MATCH (lib:Library)-[:CONTAINS]->(col:Collection)-[:CONTAINS]->(item)
WITH chunk, item, lib, col, score
WHERE ($library_uid IS NULL OR lib.uid = $library_uid)
  AND ($library_type IS NULL OR lib.library_type = $library_type)
  AND ($collection_uid IS NULL OR col.uid = $collection_uid)
RETURN chunk.uid AS chunk_uid, chunk.text_preview AS text_preview,
chunk.chunk_s3_key AS chunk_s3_key, chunk.chunk_index AS chunk_index,
item.uid AS item_uid, item.title AS item_title,
lib.library_type AS library_type, score
ORDER BY score DESC
LIMIT $top_k
"""
params = {
"top_k": top_k,
"query_vector": query_vector,
"library_uid": request.library_uid,
"library_type": request.library_type,
"collection_uid": request.collection_uid,
}
try:
results, _ = db.cypher_query(cypher, params)
except Exception as exc:
logger.error("Vector search failed: %s", exc)
return []
candidates = [
SearchCandidate(
chunk_uid=row[0] or "",
text_preview=row[1] or "",
chunk_s3_key=row[2] or "",
chunk_index=row[3] or 0,
item_uid=row[4] or "",
item_title=row[5] or "",
library_type=row[6] or "",
score=float(row[7]) if row[7] else 0.0,
source="vector",
)
for row in results
if row[0] # Skip if chunk_uid is None
]
elapsed = time.time() - start
SEARCH_DURATION.labels(search_type="vector").observe(elapsed)
SEARCH_CANDIDATES_TOTAL.labels(search_type="vector").observe(len(candidates))
logger.debug(
"Vector search results=%d top_k=%d elapsed=%.3fs",
len(candidates),
top_k,
elapsed,
)
return candidates
# ------------------------------------------------------------------
# Full-text search
# ------------------------------------------------------------------
def _fulltext_search(self, request: SearchRequest) -> list[SearchCandidate]:
"""
Search via Neo4j full-text indexes (BM25).
Queries both chunk_text_fulltext and concept_name_fulltext indexes,
then merges results.
:param request: SearchRequest with query and scope parameters.
:returns: List of SearchCandidate sorted by BM25 score.
"""
start = time.time()
top_k = request.fulltext_top_k
candidates: dict[str, SearchCandidate] = {}
# --- Chunk text search ---
self._fulltext_chunk_search(request, top_k, candidates)
# --- Concept-to-chunk traversal ---
self._fulltext_concept_search(request, top_k, candidates)
result = sorted(candidates.values(), key=lambda c: c.score, reverse=True)[
:top_k
]
elapsed = time.time() - start
SEARCH_DURATION.labels(search_type="fulltext").observe(elapsed)
SEARCH_CANDIDATES_TOTAL.labels(search_type="fulltext").observe(len(result))
logger.debug(
"Fulltext search results=%d elapsed=%.3fs", len(result), elapsed
)
return result
def _fulltext_chunk_search(
self,
request: SearchRequest,
top_k: int,
candidates: dict[str, SearchCandidate],
):
"""Search chunk_text_fulltext index and add to candidates dict."""
cypher = """
CALL db.index.fulltext.queryNodes('chunk_text_fulltext', $query)
YIELD node AS chunk, score
MATCH (item:Item)-[:HAS_CHUNK]->(chunk)
OPTIONAL MATCH (lib:Library)-[:CONTAINS]->(col:Collection)-[:CONTAINS]->(item)
WITH chunk, item, lib, col, score
WHERE ($library_uid IS NULL OR lib.uid = $library_uid)
  AND ($library_type IS NULL OR lib.library_type = $library_type)
  AND ($collection_uid IS NULL OR col.uid = $collection_uid)
RETURN chunk.uid AS chunk_uid, chunk.text_preview AS text_preview,
chunk.chunk_s3_key AS chunk_s3_key, chunk.chunk_index AS chunk_index,
item.uid AS item_uid, item.title AS item_title,
lib.library_type AS library_type, score
ORDER BY score DESC
LIMIT $top_k
"""
params = {
"query": request.query,
"top_k": top_k,
"library_uid": request.library_uid,
"library_type": request.library_type,
"collection_uid": request.collection_uid,
}
try:
results, _ = db.cypher_query(cypher, params)
# Normalize BM25 scores to 0-1 range
max_score = max((float(r[7]) for r in results if r[7]), default=1.0)
for row in results:
uid = row[0]
if not uid:
continue
raw_score = float(row[7]) if row[7] else 0.0
normalized = raw_score / max_score if max_score > 0 else 0.0
if uid not in candidates or normalized > candidates[uid].score:
candidates[uid] = SearchCandidate(
chunk_uid=uid,
text_preview=row[1] or "",
chunk_s3_key=row[2] or "",
chunk_index=row[3] or 0,
item_uid=row[4] or "",
item_title=row[5] or "",
library_type=row[6] or "",
score=normalized,
source="fulltext",
)
except Exception as exc:
logger.error("Fulltext chunk search failed: %s", exc)
def _fulltext_concept_search(
self,
request: SearchRequest,
top_k: int,
candidates: dict[str, SearchCandidate],
):
"""Search concept_name_fulltext and traverse to chunks."""
cypher = """
CALL db.index.fulltext.queryNodes('concept_name_fulltext', $query)
YIELD node AS concept, score AS concept_score
MATCH (chunk:Chunk)-[:MENTIONS]->(concept)
MATCH (item:Item)-[:HAS_CHUNK]->(chunk)
OPTIONAL MATCH (lib:Library)-[:CONTAINS]->(col:Collection)-[:CONTAINS]->(item)
WITH chunk, item, lib, concept_score
WHERE ($library_uid IS NULL OR lib.uid = $library_uid)
  AND ($library_type IS NULL OR lib.library_type = $library_type)
RETURN chunk.uid AS chunk_uid, chunk.text_preview AS text_preview,
chunk.chunk_s3_key AS chunk_s3_key, chunk.chunk_index AS chunk_index,
item.uid AS item_uid, item.title AS item_title,
lib.library_type AS library_type,
concept_score * 0.8 AS score
ORDER BY score DESC
LIMIT $top_k
"""
params = {
"query": request.query,
"top_k": top_k,
"library_uid": request.library_uid,
"library_type": request.library_type,
}
try:
results, _ = db.cypher_query(cypher, params)
max_score = max((float(r[7]) for r in results if r[7]), default=1.0)
for row in results:
uid = row[0]
if not uid:
continue
raw_score = float(row[7]) if row[7] else 0.0
normalized = raw_score / max_score if max_score > 0 else 0.0
if uid not in candidates or normalized > candidates[uid].score:
candidates[uid] = SearchCandidate(
chunk_uid=uid,
text_preview=row[1] or "",
chunk_s3_key=row[2] or "",
chunk_index=row[3] or 0,
item_uid=row[4] or "",
item_title=row[5] or "",
library_type=row[6] or "",
score=normalized,
source="fulltext",
)
except Exception as exc:
logger.error("Fulltext concept search failed: %s", exc)
# ------------------------------------------------------------------
# Graph search
# ------------------------------------------------------------------
def _graph_search(self, request: SearchRequest) -> list[SearchCandidate]:
"""
Knowledge graph traversal search.
Matches query terms against Concept names, then traverses
MENTIONS/REFERENCES relationships to discover related chunks.
:param request: SearchRequest with query and scope parameters.
:returns: List of SearchCandidate from graph traversal.
"""
start = time.time()
cypher = """
CALL db.index.fulltext.queryNodes('concept_name_fulltext', $query)
YIELD node AS concept, score AS concept_score
WITH concept, concept_score
ORDER BY concept_score DESC
LIMIT 10
MATCH (chunk:Chunk)-[:MENTIONS]->(concept)
MATCH (item:Item)-[:HAS_CHUNK]->(chunk)
OPTIONAL MATCH (lib:Library)-[:CONTAINS]->(col:Collection)-[:CONTAINS]->(item)
WITH chunk, item, lib, concept, concept_score
WHERE ($library_uid IS NULL OR lib.uid = $library_uid)
  AND ($library_type IS NULL OR lib.library_type = $library_type)
WITH chunk, item, lib,
     max(concept_score) AS score,
     collect(DISTINCT concept.name)[..5] AS concept_names
RETURN chunk.uid AS chunk_uid, chunk.text_preview AS text_preview,
       chunk.chunk_s3_key AS chunk_s3_key, chunk.chunk_index AS chunk_index,
       item.uid AS item_uid, item.title AS item_title,
       lib.library_type AS library_type,
       score, concept_names
ORDER BY score DESC
LIMIT $limit
"""
params = {
"query": request.query,
"limit": request.fulltext_top_k,
"library_uid": request.library_uid,
"library_type": request.library_type,
}
try:
results, _ = db.cypher_query(cypher, params)
except Exception as exc:
logger.error("Graph search failed: %s", exc)
return []
# Normalize scores
max_score = max((float(r[7]) for r in results if r[7]), default=1.0)
candidates = []
for row in results:
uid = row[0]
if not uid:
continue
raw_score = float(row[7]) if row[7] else 0.0
normalized = raw_score / max_score if max_score > 0 else 0.0
concept_names = row[8] if len(row) > 8 else []
candidates.append(
SearchCandidate(
chunk_uid=uid,
text_preview=row[1] or "",
chunk_s3_key=row[2] or "",
chunk_index=row[3] or 0,
item_uid=row[4] or "",
item_title=row[5] or "",
library_type=row[6] or "",
score=normalized,
source="graph",
metadata={"concepts": concept_names},
)
)
elapsed = time.time() - start
SEARCH_DURATION.labels(search_type="graph").observe(elapsed)
SEARCH_CANDIDATES_TOTAL.labels(search_type="graph").observe(len(candidates))
logger.debug(
"Graph search results=%d elapsed=%.3fs", len(candidates), elapsed
)
return candidates
# ------------------------------------------------------------------
# Image search
# ------------------------------------------------------------------
def _image_search(
self, request: SearchRequest, query_vector: list[float]
) -> list[ImageSearchResult]:
"""
Search images via multimodal vector index.
:param request: SearchRequest.
:param query_vector: Embedded query vector.
:returns: List of ImageSearchResult.
"""
start = time.time()
cypher = """
CALL db.index.vector.queryNodes('image_embedding_index', $top_k, $query_vector)
YIELD node AS emb_node, score
MATCH (img:Image)-[:HAS_EMBEDDING]->(emb_node)
MATCH (item:Item)-[:HAS_IMAGE]->(img)
OPTIONAL MATCH (lib:Library)-[:CONTAINS]->(col:Collection)-[:CONTAINS]->(item)
WITH img, item, lib, score
WHERE ($library_uid IS NULL OR lib.uid = $library_uid)
  AND ($library_type IS NULL OR lib.library_type = $library_type)
RETURN img.uid AS image_uid, img.image_type AS image_type,
img.description AS description, img.s3_key AS s3_key,
item.uid AS item_uid, item.title AS item_title,
score
ORDER BY score DESC
LIMIT 10
"""
params = {
"top_k": 10,
"query_vector": query_vector,
"library_uid": request.library_uid,
"library_type": request.library_type,
}
try:
results, _ = db.cypher_query(cypher, params)
except Exception as exc:
logger.error("Image search failed: %s", exc)
return []
images = [
ImageSearchResult(
image_uid=row[0] or "",
image_type=row[1] or "",
description=row[2] or "",
s3_key=row[3] or "",
item_uid=row[4] or "",
item_title=row[5] or "",
score=float(row[6]) if row[6] else 0.0,
source="vector",
)
for row in results
if row[0]
]
elapsed = time.time() - start
SEARCH_DURATION.labels(search_type="image").observe(elapsed)
logger.debug(
"Image search results=%d elapsed=%.3fs", len(images), elapsed
)
return images
# ------------------------------------------------------------------
# Re-ranking
# ------------------------------------------------------------------
def _rerank(
self, request: SearchRequest, candidates: list[SearchCandidate]
) -> tuple[Optional[list[SearchCandidate]], Optional[str]]:
"""
Re-rank candidates via Synesis.
:param request: SearchRequest.
:param candidates: Fused candidates to re-rank.
:returns: Tuple of (reranked_candidates, model_name) or (None, None).
"""
from llm_manager.models import LLMModel
from .reranker import RerankerClient
reranker_model = LLMModel.get_system_reranker_model()
if not reranker_model:
logger.debug("No system reranker model — skipping re-ranking")
return None, None
# Get content-type reranker instruction
instruction = self._get_reranker_instruction(request, candidates)
# Cap candidates at configured maximum
max_candidates = getattr(settings, "RERANKER_MAX_CANDIDATES", 32)
candidates_to_rerank = candidates[:max_candidates]
try:
client = RerankerClient(reranker_model, user=self.user)
reranked = client.rerank(
query=request.query,
candidates=candidates_to_rerank,
instruction=instruction,
top_n=request.limit,
query_image=request.query_image,
)
return reranked, reranker_model.name
except Exception as exc:
logger.warning(
"Re-ranking failed, returning fusion results: %s", exc
)
return None, None
# ------------------------------------------------------------------
# Helpers
# ------------------------------------------------------------------
def _get_reranker_instruction(
self, request: SearchRequest, candidates: list[SearchCandidate]
) -> str:
"""
Get the content-type-aware reranker instruction.
If scoped to a library or library type, use that type's instruction.
If mixed types, use a generic instruction.
:param request: SearchRequest.
:param candidates: Candidates (used to detect dominant library type).
:returns: Reranker instruction string.
"""
from library.content_types import get_library_type_config
# Use explicit library type from request
if request.library_type:
try:
config = get_library_type_config(request.library_type)
return config.get("reranker_instruction", "")
except ValueError:
pass
# Use library UID to look up type
if request.library_uid:
return self._get_library_reranker_instruction(request.library_uid)
# Detect dominant type from candidates
type_counts: dict[str, int] = {}
for c in candidates:
if c.library_type:
type_counts[c.library_type] = type_counts.get(c.library_type, 0) + 1
if type_counts:
dominant_type = max(type_counts, key=type_counts.get)
try:
config = get_library_type_config(dominant_type)
return config.get("reranker_instruction", "")
except ValueError:
pass
return ""
def _get_library_reranker_instruction(self, library_uid: str) -> str:
"""Get reranker_instruction from a Library node."""
try:
from library.models import Library
lib = Library.nodes.get(uid=library_uid)
return lib.reranker_instruction or ""
except Exception:
return ""
def _get_embedding_instruction(self, library_uid: str) -> str:
"""Get embedding_instruction from a Library node."""
try:
from library.models import Library
lib = Library.nodes.get(uid=library_uid)
return lib.embedding_instruction or ""
except Exception:
return ""
def _get_type_embedding_instruction(self, library_type: str) -> str:
"""Get embedding_instruction for a library type."""
try:
from library.content_types import get_library_type_config
config = get_library_type_config(library_type)
return config.get("embedding_instruction", "")
except ValueError:
return ""

View File

@@ -0,0 +1,99 @@
{% extends "themis/base.html" %}
{% block title %}{{ concept.name }} — Concepts — Mnemosyne{% endblock %}
{% block content %}
<div class="container mx-auto px-4 py-6 max-w-4xl">
{% if error %}
<div class="alert alert-error mb-6">
<span>{{ error }}</span>
</div>
{% endif %}
{% if concept %}
<div class="flex items-center gap-4 mb-6">
<h1 class="text-3xl font-bold">{{ concept.name }}</h1>
{% if concept.concept_type %}
<span class="badge badge-primary badge-lg">{{ concept.concept_type }}</span>
{% endif %}
</div>
<!-- Stats -->
<div class="stats shadow mb-6 w-full">
<div class="stat">
<div class="stat-title">Chunks Mentioning</div>
<div class="stat-value text-lg">{{ chunk_count }}</div>
</div>
<div class="stat">
<div class="stat-title">Images Depicting</div>
<div class="stat-value text-lg">{{ image_count }}</div>
</div>
<div class="stat">
<div class="stat-title">Connected Items</div>
<div class="stat-value text-lg">{{ items|length }}</div>
</div>
<div class="stat">
<div class="stat-title">Related Concepts</div>
<div class="stat-value text-lg">{{ related_concepts|length }}</div>
</div>
</div>
<!-- Related Concepts -->
{% if related_concepts %}
<h2 class="text-xl font-semibold mb-3">Related Concepts</h2>
<div class="flex flex-wrap gap-2 mb-6">
{% for rc in related_concepts %}
<a href="{% url 'library:concept-detail' uid=rc.uid %}"
class="badge badge-outline badge-lg gap-1 hover:badge-primary transition-colors">
{{ rc.name }}
{% if rc.concept_type %}
<span class="opacity-60 text-xs">({{ rc.concept_type }})</span>
{% endif %}
</a>
{% endfor %}
</div>
{% endif %}
<!-- Connected Items -->
{% if items %}
<h2 class="text-xl font-semibold mb-3">Connected Items</h2>
<div class="overflow-x-auto mb-6">
<table class="table table-zebra w-full">
<thead>
<tr>
<th>Title</th>
</tr>
</thead>
<tbody>
{% for item in items %}
<tr class="hover">
<td>
<a href="{% url 'library:item-detail' uid=item.uid %}"
class="link link-primary">
{{ item.title }}
</a>
</td>
</tr>
{% endfor %}
</tbody>
</table>
</div>
{% endif %}
<div class="mt-6">
<a href="{% url 'library:concept-list' %}" class="btn btn-ghost btn-sm">
← Back to Concepts
</a>
<a href="{% url 'library:search' %}?q={{ concept.name }}" class="btn btn-outline btn-sm ml-2">
Search for "{{ concept.name }}"
</a>
</div>
{% else %}
<div class="alert alert-warning">
<span>Concept not found.</span>
</div>
{% endif %}
</div>
{% endblock %}


@@ -0,0 +1,69 @@
{% extends "themis/base.html" %}
{% block title %}Concepts — Mnemosyne{% endblock %}
{% block content %}
<div class="container mx-auto px-4 py-6 max-w-4xl">
<h1 class="text-3xl font-bold mb-6">Knowledge Graph — Concepts</h1>
<!-- Search -->
<form method="get" action="{% url 'library:concept-list' %}" class="mb-6">
<div class="join w-full">
<input type="text" name="q" value="{{ query }}"
placeholder="Search concepts..."
class="input input-bordered join-item w-full"
autofocus>
<button type="submit" class="btn btn-primary join-item">Search</button>
</div>
</form>
{% if error %}
<div class="alert alert-error mb-6">
<span>{{ error }}</span>
</div>
{% endif %}
{% if concepts %}
<div class="overflow-x-auto">
<table class="table table-zebra w-full">
<thead>
<tr>
<th>Name</th>
<th>Type</th>
{% if query %}<th>Score</th>{% endif %}
</tr>
</thead>
<tbody>
{% for concept in concepts %}
<tr class="hover">
<td>
<a href="{% url 'library:concept-detail' uid=concept.uid %}"
class="link link-primary font-medium">
{{ concept.name }}
</a>
</td>
<td>
{% if concept.concept_type %}
<span class="badge badge-outline badge-sm">{{ concept.concept_type }}</span>
{% else %}
<span class="opacity-40">–</span>
{% endif %}
</td>
{% if query %}
<td>
<span class="badge badge-ghost badge-sm">{{ concept.score|floatformat:3 }}</span>
</td>
{% endif %}
</tr>
{% endfor %}
</tbody>
</table>
</div>
<p class="text-sm opacity-60 mt-4">Showing {{ concepts|length }} concepts.</p>
{% else %}
<div class="alert alert-info">
<span>{% if query %}No concepts found for "{{ query }}".{% else %}No concepts in the knowledge graph yet.{% endif %}</span>
</div>
{% endif %}
</div>
{% endblock %}


@@ -0,0 +1,161 @@
{% extends "themis/base.html" %}
{% load humanize %}
{% block title %}Search — Mnemosyne{% endblock %}
{% block content %}
<div class="container mx-auto px-4 py-6 max-w-6xl">
<h1 class="text-3xl font-bold mb-6">Search Knowledge</h1>
<!-- Search Form -->
<form method="post" action="{% url 'library:search' %}" class="mb-8">
{% csrf_token %}
<div class="flex flex-col gap-4">
<div class="join w-full">
<input type="text" name="query" value="{{ query }}"
placeholder="Search your knowledge base..."
class="input input-bordered join-item w-full text-lg"
autofocus>
<button type="submit" class="btn btn-primary join-item text-lg">
Search
</button>
</div>
<!-- Filters -->
<div class="flex flex-wrap gap-4 items-end">
<div class="form-control">
<label class="label"><span class="label-text">Library</span></label>
<select name="library_uid" class="select select-bordered select-sm">
<option value="">All libraries</option>
{% for lib in libraries %}
<option value="{{ lib.uid }}">{{ lib.name }} ({{ lib.library_type }})</option>
{% endfor %}
</select>
</div>
<div class="form-control">
<label class="label"><span class="label-text">Type</span></label>
<select name="library_type" class="select select-bordered select-sm">
<option value="">All types</option>
<option value="fiction">Fiction</option>
<option value="nonfiction">Non-Fiction</option>
<option value="technical">Technical</option>
<option value="music">Music</option>
<option value="film">Film</option>
<option value="art">Art</option>
<option value="journal">Journal</option>
</select>
</div>
<div class="form-control">
<label class="label cursor-pointer gap-2">
<span class="label-text">Re-rank</span>
<input type="checkbox" name="rerank" checked class="checkbox checkbox-sm checkbox-primary">
</label>
</div>
</div>
</div>
</form>
{% if error %}
<div class="alert alert-error mb-6">
<span>{{ error }}</span>
</div>
{% endif %}
{% if results %}
<!-- Search Metadata -->
<div class="stats shadow mb-6 w-full">
<div class="stat">
<div class="stat-title">Results</div>
<div class="stat-value text-lg">{{ results.candidates|length }}</div>
</div>
<div class="stat">
<div class="stat-title">Total Candidates</div>
<div class="stat-value text-lg">{{ results.total_candidates }}</div>
</div>
<div class="stat">
<div class="stat-title">Search Time</div>
<div class="stat-value text-lg">{{ results.search_time_ms|floatformat:0 }}ms</div>
</div>
<div class="stat">
<div class="stat-title">Re-ranked</div>
<div class="stat-value text-lg">
{% if results.reranker_used %}
<span class="badge badge-success">Yes</span>
{% else %}
<span class="badge badge-ghost">No</span>
{% endif %}
</div>
</div>
<div class="stat">
<div class="stat-title">Search Types</div>
<div class="stat-value text-lg">
{% for st in results.search_types_used %}
<span class="badge badge-outline badge-sm">{{ st }}</span>
{% endfor %}
</div>
</div>
</div>
<!-- Text Results -->
{% if results.candidates %}
<h2 class="text-xl font-semibold mb-4">Text Results</h2>
<div class="flex flex-col gap-4 mb-8">
{% for candidate in results.candidates %}
<div class="card bg-base-200 shadow-sm">
<div class="card-body py-4">
<div class="flex justify-between items-start">
<div class="flex-1">
<h3 class="card-title text-base">
<a href="{% url 'library:item-detail' uid=candidate.item_uid %}"
class="link link-primary">
{{ candidate.item_title }}
</a>
<span class="badge badge-sm badge-outline ml-2">chunk #{{ candidate.chunk_index }}</span>
</h3>
<p class="text-sm opacity-80 mt-1 line-clamp-3">{{ candidate.text_preview }}</p>
</div>
<div class="flex flex-col items-end gap-1 ml-4 min-w-fit">
<span class="badge badge-primary badge-sm">{{ candidate.score|floatformat:4 }}</span>
{% for src in candidate.source.split %}
<span class="badge badge-ghost badge-xs">{{ src }}</span>
{% endfor %}
{% if candidate.library_type %}
<span class="badge badge-outline badge-xs">{{ candidate.library_type }}</span>
{% endif %}
</div>
</div>
</div>
</div>
{% endfor %}
</div>
{% endif %}
<!-- Image Results -->
{% if results.images %}
<h2 class="text-xl font-semibold mb-4">Image Results</h2>
<div class="grid grid-cols-2 md:grid-cols-3 lg:grid-cols-4 gap-4 mb-8">
{% for image in results.images %}
<div class="card bg-base-200 shadow-sm">
<div class="card-body p-3">
<div class="badge badge-sm badge-outline mb-1">{{ image.image_type }}</div>
<p class="text-xs opacity-80 line-clamp-2">{{ image.description }}</p>
<div class="flex justify-between items-center mt-1">
<a href="{% url 'library:item-detail' uid=image.item_uid %}"
class="text-xs link link-primary">{{ image.item_title|truncatechars:30 }}</a>
<span class="badge badge-primary badge-xs">{{ image.score|floatformat:3 }}</span>
</div>
</div>
</div>
{% endfor %}
</div>
{% endif %}
{% elif query %}
<div class="alert alert-info">
<span>No results found for "{{ query }}".</span>
</div>
{% endif %}
</div>
{% endblock %}


@@ -0,0 +1,151 @@
"""
Tests for the content-type system configuration.
Validates library type defaults, vision prompts, and the
get_library_type_config helper.
"""
from django.test import TestCase
from library.content_types import LIBRARY_TYPE_DEFAULTS, get_library_type_config
class LibraryTypeDefaultsTests(TestCase):
"""Tests for the LIBRARY_TYPE_DEFAULTS registry."""
EXPECTED_TYPES = {"fiction", "nonfiction", "technical", "music", "film", "art", "journal"}
def test_all_expected_types_present(self):
for lib_type in self.EXPECTED_TYPES:
self.assertIn(lib_type, LIBRARY_TYPE_DEFAULTS, f"Missing library type: {lib_type}")
def test_no_unexpected_types(self):
for lib_type in LIBRARY_TYPE_DEFAULTS:
self.assertIn(lib_type, self.EXPECTED_TYPES, f"Unexpected library type: {lib_type}")
def test_each_type_has_required_keys(self):
required_keys = {
"chunking_config",
"embedding_instruction",
"reranker_instruction",
"llm_context_prompt",
"vision_prompt",
}
for lib_type, config in LIBRARY_TYPE_DEFAULTS.items():
for key in required_keys:
self.assertIn(
key, config,
f"Library type '{lib_type}' missing key '{key}'",
)
def test_chunking_configs_have_strategy(self):
for lib_type, config in LIBRARY_TYPE_DEFAULTS.items():
cc = config["chunking_config"]
self.assertIn("strategy", cc, f"'{lib_type}' chunking_config missing 'strategy'")
self.assertIn("chunk_size", cc, f"'{lib_type}' chunking_config missing 'chunk_size'")
self.assertIn("chunk_overlap", cc, f"'{lib_type}' chunking_config missing 'chunk_overlap'")
def test_chunk_sizes_are_positive(self):
for lib_type, config in LIBRARY_TYPE_DEFAULTS.items():
cc = config["chunking_config"]
self.assertGreater(cc["chunk_size"], 0, f"'{lib_type}' chunk_size must be positive")
self.assertGreaterEqual(cc["chunk_overlap"], 0, f"'{lib_type}' chunk_overlap must be >= 0")
def test_chunk_overlap_less_than_size(self):
for lib_type, config in LIBRARY_TYPE_DEFAULTS.items():
cc = config["chunking_config"]
self.assertLess(
cc["chunk_overlap"], cc["chunk_size"],
f"'{lib_type}' chunk_overlap must be less than chunk_size",
)
def test_instructions_are_nonempty_strings(self):
for lib_type, config in LIBRARY_TYPE_DEFAULTS.items():
for key in ["embedding_instruction", "reranker_instruction", "llm_context_prompt", "vision_prompt"]:
val = config[key]
self.assertIsInstance(val, str, f"'{lib_type}'.{key} should be str")
self.assertTrue(len(val) > 10, f"'{lib_type}'.{key} seems too short")
class VisionPromptTests(TestCase):
"""Tests for vision prompts across library types."""
def test_fiction_vision_prompt_mentions_characters(self):
config = get_library_type_config("fiction")
prompt = config["vision_prompt"].lower()
self.assertIn("character", prompt)
def test_technical_vision_prompt_mentions_diagram(self):
config = get_library_type_config("technical")
prompt = config["vision_prompt"].lower()
self.assertIn("diagram", prompt)
def test_music_vision_prompt_mentions_album(self):
config = get_library_type_config("music")
prompt = config["vision_prompt"].lower()
self.assertIn("album", prompt)
def test_film_vision_prompt_mentions_still(self):
config = get_library_type_config("film")
prompt = config["vision_prompt"].lower()
self.assertIn("still", prompt)
def test_art_vision_prompt_mentions_style(self):
config = get_library_type_config("art")
prompt = config["vision_prompt"].lower()
self.assertIn("style", prompt)
def test_journal_vision_prompt_mentions_date(self):
config = get_library_type_config("journal")
prompt = config["vision_prompt"].lower()
self.assertIn("date", prompt)
def test_nonfiction_vision_prompt_mentions_historical(self):
config = get_library_type_config("nonfiction")
prompt = config["vision_prompt"].lower()
self.assertIn("historical", prompt)
class GetLibraryTypeConfigTests(TestCase):
"""Tests for the get_library_type_config helper."""
def test_returns_dict_for_valid_type(self):
config = get_library_type_config("fiction")
self.assertIsInstance(config, dict)
def test_raises_for_unknown_type(self):
with self.assertRaises(ValueError) as ctx:
get_library_type_config("nonexistent")
self.assertIn("Unknown library type", str(ctx.exception))
self.assertIn("nonexistent", str(ctx.exception))
def test_all_types_retrievable(self):
for lib_type in LIBRARY_TYPE_DEFAULTS:
config = get_library_type_config(lib_type)
self.assertIsNotNone(config)
class NonfictionTypeTests(TestCase):
"""Tests specific to the nonfiction library type (Phase 2B addition)."""
def test_nonfiction_strategy(self):
config = get_library_type_config("nonfiction")
self.assertEqual(config["chunking_config"]["strategy"], "section_aware")
def test_nonfiction_chunk_size(self):
config = get_library_type_config("nonfiction")
self.assertEqual(config["chunking_config"]["chunk_size"], 768)
def test_nonfiction_chunk_overlap(self):
config = get_library_type_config("nonfiction")
self.assertEqual(config["chunking_config"]["chunk_overlap"], 96)
def test_nonfiction_has_vision_prompt(self):
config = get_library_type_config("nonfiction")
self.assertTrue(len(config["vision_prompt"]) > 0)
def test_nonfiction_boundaries(self):
config = get_library_type_config("nonfiction")
boundaries = config["chunking_config"]["respect_boundaries"]
self.assertIn("chapter", boundaries)
self.assertIn("section", boundaries)
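The registry shape and failure mode these tests pin down can be sketched as follows. This is a hypothetical reconstruction of `library.content_types`, not the actual module; the prompt strings and the elided entries are placeholders.

```python
# Hypothetical sketch of the content-type registry the tests above exercise.
# Only the "nonfiction" entry is filled in; real prompts are much longer.
LIBRARY_TYPE_DEFAULTS = {
    "nonfiction": {
        "chunking_config": {
            "strategy": "section_aware",
            "chunk_size": 768,
            "chunk_overlap": 96,
            "respect_boundaries": ["chapter", "section"],
        },
        "embedding_instruction": "Represent this nonfiction passage for retrieval.",
        "reranker_instruction": "Rank passages by factual relevance to the query.",
        "llm_context_prompt": "Answer using this nonfiction library as context.",
        "vision_prompt": "Describe this image, noting any historical context.",
    },
    # ... fiction, technical, music, film, art, journal
}

def get_library_type_config(library_type: str) -> dict:
    """Return the defaults for a library type, or raise for unknown types."""
    try:
        return LIBRARY_TYPE_DEFAULTS[library_type]
    except KeyError:
        # Error message must name both the problem and the offending type,
        # as GetLibraryTypeConfigTests asserts.
        raise ValueError(f"Unknown library type: {library_type!r}") from None
```

The lookup fails loudly rather than falling back to a default, so a typo in a library's `library_type` surfaces at configuration time instead of silently producing generic chunking.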


@@ -0,0 +1,152 @@
"""
Tests for Reciprocal Rank Fusion.
Tests RRF algorithm correctness, deduplication, score calculation,
and edge cases — no external services required.
"""
from django.test import TestCase
from library.services.fusion import SearchCandidate, reciprocal_rank_fusion
def _make_candidate(chunk_uid: str, score: float = 0.5, source: str = "vector", **kwargs):
"""Helper to create a SearchCandidate with defaults."""
return SearchCandidate(
chunk_uid=chunk_uid,
item_uid=kwargs.get("item_uid", f"item_{chunk_uid}"),
item_title=kwargs.get("item_title", f"Title {chunk_uid}"),
library_type=kwargs.get("library_type", "technical"),
text_preview=kwargs.get("text_preview", f"Preview for {chunk_uid}"),
chunk_s3_key=kwargs.get("chunk_s3_key", f"chunks/{chunk_uid}/chunk_0.txt"),
chunk_index=kwargs.get("chunk_index", 0),
score=score,
source=source,
)
class ReciprocalRankFusionTest(TestCase):
"""Tests for reciprocal_rank_fusion()."""
def test_single_list_preserves_order(self):
"""Single result list — RRF should preserve rank ordering."""
candidates = [
_make_candidate("a", score=0.9),
_make_candidate("b", score=0.8),
_make_candidate("c", score=0.7),
]
fused = reciprocal_rank_fusion([candidates], k=60)
self.assertEqual(len(fused), 3)
self.assertEqual(fused[0].chunk_uid, "a")
self.assertEqual(fused[1].chunk_uid, "b")
self.assertEqual(fused[2].chunk_uid, "c")
def test_duplicate_candidate_gets_boosted(self):
"""Candidate appearing in multiple lists gets higher score."""
list1 = [
_make_candidate("a", score=0.9, source="vector"),
_make_candidate("b", score=0.7, source="vector"),
]
list2 = [
_make_candidate("a", score=0.8, source="fulltext"),
_make_candidate("c", score=0.6, source="fulltext"),
]
fused = reciprocal_rank_fusion([list1, list2], k=60)
# 'a' appears in both lists → should have highest RRF score
self.assertEqual(fused[0].chunk_uid, "a")
# RRF score for 'a': 1/(60+1) + 1/(60+1) ≈ 0.0328
# RRF score for 'b': 1/(60+2) ≈ 0.0161
# RRF score for 'c': 1/(60+2) ≈ 0.0161
self.assertGreater(fused[0].score, fused[1].score)
def test_deduplication_by_chunk_uid(self):
"""Same chunk_uid in multiple lists results in one entry."""
list1 = [_make_candidate("a", source="vector")]
list2 = [_make_candidate("a", source="fulltext")]
list3 = [_make_candidate("a", source="graph")]
fused = reciprocal_rank_fusion([list1, list2, list3], k=60)
self.assertEqual(len(fused), 1)
self.assertEqual(fused[0].chunk_uid, "a")
def test_source_tracking(self):
"""Fused candidate tracks which sources found it."""
list1 = [_make_candidate("a", source="vector")]
list2 = [_make_candidate("a", source="fulltext")]
fused = reciprocal_rank_fusion([list1, list2], k=60)
self.assertIn("vector", fused[0].source)
self.assertIn("fulltext", fused[0].source)
self.assertIn("sources", fused[0].metadata)
self.assertIn("vector", fused[0].metadata["sources"])
self.assertIn("fulltext", fused[0].metadata["sources"])
def test_empty_lists(self):
"""Empty input lists produce empty output."""
fused = reciprocal_rank_fusion([], k=60)
self.assertEqual(fused, [])
fused = reciprocal_rank_fusion([[], []], k=60)
self.assertEqual(fused, [])
def test_limit_trims_results(self):
"""Limit parameter trims output size."""
candidates = [_make_candidate(f"c{i}") for i in range(20)]
fused = reciprocal_rank_fusion([candidates], k=60, limit=5)
self.assertEqual(len(fused), 5)
def test_rrf_score_calculation(self):
"""Verify exact RRF score for a known case."""
k = 60
list1 = [_make_candidate("a", source="vector")] # rank 1
list2 = [
_make_candidate("b", source="fulltext"), # rank 1
_make_candidate("a", source="fulltext"), # rank 2
]
fused = reciprocal_rank_fusion([list1, list2], k=k)
# 'a': 1/(60+1) + 1/(60+2) = 0.01639... + 0.01612... = 0.03252...
a = next(c for c in fused if c.chunk_uid == "a")
expected = 1 / (k + 1) + 1 / (k + 2)
self.assertAlmostEqual(a.score, expected, places=6)
# 'b': 1/(60+1) = 0.01639...
b = next(c for c in fused if c.chunk_uid == "b")
expected_b = 1 / (k + 1)
self.assertAlmostEqual(b.score, expected_b, places=6)
def test_higher_k_reduces_rank_emphasis(self):
"""Higher k makes rank differences less significant."""
list1 = [
_make_candidate("top", source="vector"),
_make_candidate("bottom", source="vector"),
]
fused_low_k = reciprocal_rank_fusion([list1], k=1)
fused_high_k = reciprocal_rank_fusion([list1], k=100)
# Ratio of rank-1 to rank-2 scores
ratio_low = fused_low_k[0].score / fused_low_k[1].score
ratio_high = fused_high_k[0].score / fused_high_k[1].score
# Higher k → scores closer together → lower ratio
self.assertGreater(ratio_low, ratio_high)
def test_keeps_highest_original_score_for_metadata(self):
"""When deduplicating, keeps candidate with highest original score."""
list1 = [_make_candidate("a", score=0.5, source="vector", item_title="Low")]
list2 = [_make_candidate("a", score=0.9, source="fulltext", item_title="High")]
fused = reciprocal_rank_fusion([list1, list2], k=60)
# Should keep the candidate from list2 (higher original score)
self.assertEqual(fused[0].item_title, "High")
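The behavior these tests assert can be sketched as a minimal fusion function. This is a hypothetical reconstruction consistent with the tests, not the actual `library.services.fusion` implementation; the `SearchCandidate` dataclass here mirrors the fields the test helper supplies.

```python
# Minimal RRF sketch: score(c) = sum over result lists of 1/(k + rank),
# deduplicating by chunk_uid and tracking which sources found each chunk.
from dataclasses import dataclass, field

@dataclass
class SearchCandidate:
    chunk_uid: str
    item_uid: str
    item_title: str
    library_type: str
    text_preview: str
    chunk_s3_key: str
    chunk_index: int
    score: float
    source: str
    metadata: dict = field(default_factory=dict)

def reciprocal_rank_fusion(result_lists, k=60, limit=None):
    fused = {}  # chunk_uid -> (best-original candidate, rrf score, sources)
    for candidates in result_lists:
        for rank, cand in enumerate(candidates, start=1):
            best, rrf, sources = fused.get(cand.chunk_uid, (cand, 0.0, []))
            rrf += 1.0 / (k + rank)
            if cand.score >= best.score:  # keep the highest-original-score copy
                best = cand
            if cand.source not in sources:
                sources.append(cand.source)
            fused[cand.chunk_uid] = (best, rrf, sources)
    out = []
    for best, rrf, sources in fused.values():
        best.score = rrf
        best.source = " ".join(sources)
        best.metadata = {**best.metadata, "sources": sources}
        out.append(best)
    out.sort(key=lambda c: c.score, reverse=True)
    return out[:limit] if limit else out
```

A chunk ranked first in two lists scores 2/(k+1), comfortably beating any single-list hit, which is exactly the boost `test_duplicate_candidate_gets_boosted` checks for.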


@@ -101,3 +101,228 @@ class PipelineNoEmbeddingModelTests(TestCase):
pipeline.process_item("test-uid")
self.assertIn("No system embedding model", str(ctx.exception))
class PipelineVisionPromptTests(TestCase):
"""Tests for the _get_vision_prompt helper."""
def test_returns_empty_for_no_library(self):
pipeline = EmbeddingPipeline()
result = pipeline._get_vision_prompt(None)
self.assertEqual(result, "")
@patch("library.content_types.get_library_type_config")
def test_returns_vision_prompt_from_config(self, mock_config):
mock_config.return_value = {
"vision_prompt": "Analyze this technical diagram.",
}
mock_library = MagicMock()
mock_library.library_type = "technical"
pipeline = EmbeddingPipeline()
result = pipeline._get_vision_prompt(mock_library)
self.assertEqual(result, "Analyze this technical diagram.")
mock_config.assert_called_once_with("technical")
@patch("library.content_types.get_library_type_config")
def test_returns_empty_when_no_vision_prompt_key(self, mock_config):
mock_config.return_value = {"embedding_instruction": "something"}
mock_library = MagicMock()
mock_library.library_type = "fiction"
pipeline = EmbeddingPipeline()
result = pipeline._get_vision_prompt(mock_library)
self.assertEqual(result, "")
@patch("library.content_types.get_library_type_config")
def test_returns_empty_on_exception(self, mock_config):
mock_config.side_effect = ValueError("Unknown type")
mock_library = MagicMock()
mock_library.library_type = "bogus"
pipeline = EmbeddingPipeline()
result = pipeline._get_vision_prompt(mock_library)
self.assertEqual(result, "")
class PipelineVisionStageTests(TestCase):
"""Tests for Stage 5.5 — vision analysis integration in _run_pipeline."""
def _make_mock_item(self):
"""Create a common mock Item for pipeline tests."""
item = MagicMock()
item.uid = "test-uid"
item.title = "Test Doc"
item.file_type = "pdf"
item.s3_key = "items/test-uid/original.pdf"
item.embedding_status = "pending"
item.content_hash = ""
item.chunks = MagicMock()
item.chunks.all.return_value = []
item.images = MagicMock()
item.images.all.return_value = []
return item
@patch("library.services.pipeline.ConceptExtractor")
@patch("library.services.pipeline.EmbeddingClient")
@patch("library.services.pipeline.ContentTypeChunker")
@patch("library.services.pipeline.DocumentParser")
@patch("library.services.pipeline.LLMModel")
@patch("library.services.pipeline.default_storage")
def test_no_vision_model_marks_images_skipped(
self, mock_storage, mock_llm, mock_parser_cls,
mock_chunker_cls, mock_embed_cls, mock_concept_cls,
):
"""When no vision model is configured, images get analysis_status='skipped'."""
# Setup embedding model
mock_embed_model = MagicMock()
mock_embed_model.name = "test-embed"
mock_embed_model.vector_dimensions = None
mock_embed_model.supports_multimodal = False
mock_llm.get_system_embedding_model.return_value = mock_embed_model
mock_llm.get_system_vision_model.return_value = None
mock_llm.get_system_chat_model.return_value = None
# Setup parser — returns text + images
mock_parse_result = MagicMock()
mock_parse_result.images = [MagicMock(source_index=0, ext="png", data=b"img", width=100, height=100, source_page=0)]
mock_parse_result.text_blocks = []
mock_parser = MagicMock()
mock_parser.parse_bytes.return_value = mock_parse_result
mock_parser_cls.return_value = mock_parser
# Setup chunker — empty chunks
mock_chunk_result = MagicMock()
mock_chunk_result.chunks = []
mock_chunk_result.chunk_page_map = {}
mock_chunker = MagicMock()
mock_chunker.chunk.return_value = mock_chunk_result
mock_chunker_cls.return_value = mock_chunker
# Setup S3
mock_file = MagicMock()
mock_file.read.return_value = b"file data"
mock_storage.open.return_value.__enter__ = MagicMock(return_value=mock_file)
mock_storage.open.return_value.__exit__ = MagicMock(return_value=False)
item = self._make_mock_item()
pipeline = EmbeddingPipeline()
# Mock _store_images to return a mock image node
img_node = MagicMock()
img_node.s3_key = "images/test-uid/0.png"
with patch.object(pipeline, "_get_item_library", return_value=None), \
patch.object(pipeline, "_read_item_from_s3", return_value=b"data"), \
patch.object(pipeline, "_store_chunks", return_value=[]), \
patch.object(pipeline, "_store_images", return_value=[img_node]), \
patch.object(pipeline, "_associate_images_with_chunks"):
result = pipeline._run_pipeline(item, None)
# Image should be marked as skipped
self.assertEqual(img_node.analysis_status, "skipped")
img_node.save.assert_called()
self.assertEqual(result["images_analyzed"], 0)
@patch("library.services.pipeline.VisionAnalyzer")
@patch("library.services.pipeline.ConceptExtractor")
@patch("library.services.pipeline.EmbeddingClient")
@patch("library.services.pipeline.ContentTypeChunker")
@patch("library.services.pipeline.DocumentParser")
@patch("library.services.pipeline.LLMModel")
@patch("library.services.pipeline.default_storage")
def test_vision_model_triggers_analysis(
self, mock_storage, mock_llm, mock_parser_cls,
mock_chunker_cls, mock_embed_cls, mock_concept_cls, mock_vision_cls,
):
"""When vision model is configured and images exist, analysis runs."""
# Setup models
mock_embed_model = MagicMock()
mock_embed_model.name = "test-embed"
mock_embed_model.vector_dimensions = None
mock_embed_model.supports_multimodal = False
mock_vision_model = MagicMock()
mock_llm.get_system_embedding_model.return_value = mock_embed_model
mock_llm.get_system_vision_model.return_value = mock_vision_model
mock_llm.get_system_chat_model.return_value = None
# Setup parser
mock_parse_result = MagicMock()
mock_parse_result.images = []
mock_parse_result.text_blocks = []
mock_parser = MagicMock()
mock_parser.parse_bytes.return_value = mock_parse_result
mock_parser_cls.return_value = mock_parser
# Setup chunker
mock_chunk_result = MagicMock()
mock_chunk_result.chunks = []
mock_chunk_result.chunk_page_map = {}
mock_chunker = MagicMock()
mock_chunker.chunk.return_value = mock_chunk_result
mock_chunker_cls.return_value = mock_chunker
# Setup vision analyzer
mock_analyzer = MagicMock()
mock_analyzer.analyze_images.return_value = 3
mock_vision_cls.return_value = mock_analyzer
item = self._make_mock_item()
img_nodes = [MagicMock(), MagicMock(), MagicMock()]
pipeline = EmbeddingPipeline()
with patch.object(pipeline, "_get_item_library", return_value=None), \
patch.object(pipeline, "_read_item_from_s3", return_value=b"data"), \
patch.object(pipeline, "_store_chunks", return_value=[]), \
patch.object(pipeline, "_store_images", return_value=img_nodes), \
patch.object(pipeline, "_associate_images_with_chunks"), \
patch.object(pipeline, "_get_vision_prompt", return_value="Analyze"):
result = pipeline._run_pipeline(item, None)
self.assertEqual(result["images_analyzed"], 3)
mock_vision_cls.assert_called_once_with(mock_vision_model, user=None)
mock_analyzer.analyze_images.assert_called_once()
@patch("library.services.pipeline.LLMModel")
def test_no_images_skips_vision_entirely(self, mock_llm):
"""When there are no images, vision stage is a no-op regardless of model."""
mock_vision_model = MagicMock()
mock_llm.get_system_vision_model.return_value = mock_vision_model
mock_embed_model = MagicMock()
mock_embed_model.name = "embed"
mock_embed_model.vector_dimensions = None
mock_embed_model.supports_multimodal = False
mock_llm.get_system_embedding_model.return_value = mock_embed_model
mock_llm.get_system_chat_model.return_value = None
item = self._make_mock_item()
pipeline = EmbeddingPipeline()
mock_chunk_result = MagicMock()
mock_chunk_result.chunks = []
mock_chunk_result.chunk_page_map = {}
with patch.object(pipeline, "_get_item_library", return_value=None), \
patch.object(pipeline, "_read_item_from_s3", return_value=b"data"), \
patch.object(pipeline, "_store_chunks", return_value=[]), \
patch.object(pipeline, "_store_images", return_value=[]), \
patch.object(pipeline, "_associate_images_with_chunks"), \
patch("library.services.pipeline.DocumentParser") as mock_parser_cls, \
patch("library.services.pipeline.ContentTypeChunker") as mock_chunker_cls, \
patch("library.services.pipeline.EmbeddingClient"), \
patch("library.services.pipeline.VisionAnalyzer") as mock_vision_cls:
mock_parser = MagicMock()
mock_parser.parse_bytes.return_value = MagicMock(images=[], text_blocks=[])
mock_parser_cls.return_value = mock_parser
mock_chunker = MagicMock()
mock_chunker.chunk.return_value = mock_chunk_result
mock_chunker_cls.return_value = mock_chunker
result = pipeline._run_pipeline(item, None)
# VisionAnalyzer should never be instantiated
mock_vision_cls.assert_not_called()
self.assertEqual(result["images_analyzed"], 0)
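The fail-soft behavior `PipelineVisionPromptTests` asserts can be sketched as below. This is a hypothetical reconstruction of just the helper; the real `EmbeddingPipeline` carries the full parse/chunk/embed pipeline around it.

```python
# Sketch of _get_vision_prompt: resolve the library-type-specific vision
# prompt, returning "" for a missing library, a missing key, or any error.
class EmbeddingPipeline:
    def _get_vision_prompt(self, library):
        if library is None:
            return ""
        try:
            # Imported lazily so tests can patch library.content_types directly,
            # matching the @patch("library.content_types.get_library_type_config")
            # decorators in the tests above.
            from library.content_types import get_library_type_config
            config = get_library_type_config(library.library_type)
            return config.get("vision_prompt", "")
        except Exception:
            # Unknown type (or unavailable config module) must not abort
            # the pipeline; vision analysis simply runs without a prompt.
            return ""
```

Swallowing the exception here is deliberate: a misconfigured library type downgrades the vision stage rather than failing the whole ingest, which is why `test_returns_empty_on_exception` pins the empty-string contract.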


@@ -0,0 +1,218 @@
"""
Tests for the RerankerClient (Synesis backend).
All Synesis HTTP calls are mocked — no external service needed.
"""
from decimal import Decimal
from unittest.mock import MagicMock, patch
from django.test import TestCase
from library.services.fusion import SearchCandidate
from library.services.reranker import RerankerClient
def _make_candidate(chunk_uid: str, text_preview: str = "Some text", **kwargs):
"""Helper to create a SearchCandidate."""
return SearchCandidate(
chunk_uid=chunk_uid,
item_uid=kwargs.get("item_uid", f"item_{chunk_uid}"),
item_title=kwargs.get("item_title", f"Title {chunk_uid}"),
library_type=kwargs.get("library_type", "technical"),
text_preview=text_preview,
chunk_s3_key=f"chunks/{chunk_uid}/chunk_0.txt",
chunk_index=0,
score=kwargs.get("score", 0.5),
source=kwargs.get("source", "vector"),
)
def _mock_reranker_model():
"""Create a mock LLMModel for reranking."""
model = MagicMock()
model.name = "qwen3-vl-reranker-2b"
model.api.base_url = "http://pan.helu.ca:8400"
model.api.api_key = ""
model.api.timeout_seconds = 30
model.input_cost_per_1k = Decimal("0")
model.output_cost_per_1k = Decimal("0")
return model
class RerankerClientInitTest(TestCase):
"""Tests for RerankerClient initialization."""
def test_initializes_with_model(self):
"""Client initializes correctly from LLMModel."""
model = _mock_reranker_model()
client = RerankerClient(model)
self.assertEqual(client.model_name, "qwen3-vl-reranker-2b")
self.assertEqual(client.base_url, "http://pan.helu.ca:8400")
class RerankerClientRerankTest(TestCase):
"""Tests for RerankerClient.rerank()."""
@patch("library.services.reranker.requests.post")
def test_basic_rerank(self, mock_post):
"""Rerank returns candidates with updated scores."""
mock_response = MagicMock()
mock_response.status_code = 200
mock_response.json.return_value = {
"results": [
{"index": 1, "score": 0.95, "document": {"text": "relevant"}},
{"index": 0, "score": 0.30, "document": {"text": "less relevant"}},
],
"usage": {"document_count": 2, "elapsed_ms": 42.0},
}
mock_post.return_value = mock_response
model = _mock_reranker_model()
client = RerankerClient(model)
candidates = [
_make_candidate("a", text_preview="less relevant text"),
_make_candidate("b", text_preview="very relevant text"),
]
reranked = client.rerank(query="test query", candidates=candidates)
self.assertEqual(len(reranked), 2)
# 'b' (index 1, score 0.95) should be first
self.assertEqual(reranked[0].chunk_uid, "b")
self.assertAlmostEqual(reranked[0].score, 0.95)
# 'a' (index 0, score 0.30) should be second
self.assertEqual(reranked[1].chunk_uid, "a")
self.assertAlmostEqual(reranked[1].score, 0.30)
@patch("library.services.reranker.requests.post")
def test_instruction_included_in_payload(self, mock_post):
"""Reranker instruction is sent in the API payload."""
mock_response = MagicMock()
mock_response.status_code = 200
mock_response.json.return_value = {
"results": [{"index": 0, "score": 0.8}],
"usage": {"document_count": 1, "elapsed_ms": 10},
}
mock_post.return_value = mock_response
model = _mock_reranker_model()
client = RerankerClient(model)
candidates = [_make_candidate("a")]
client.rerank(
query="how to wire a motor",
candidates=candidates,
instruction="Re-rank based on procedural relevance.",
)
# Check the payload sent to Synesis
call_args = mock_post.call_args
payload = call_args.kwargs.get("json") or call_args[1].get("json")
self.assertEqual(payload["instruction"], "Re-rank based on procedural relevance.")
self.assertEqual(payload["query"]["text"], "how to wire a motor")
@patch("library.services.reranker.requests.post")
def test_top_n_included(self, mock_post):
"""top_n parameter is sent to Synesis."""
mock_response = MagicMock()
mock_response.status_code = 200
mock_response.json.return_value = {
"results": [{"index": 0, "score": 0.9}],
"usage": {"document_count": 1, "elapsed_ms": 10},
}
mock_post.return_value = mock_response
model = _mock_reranker_model()
client = RerankerClient(model)
candidates = [_make_candidate("a"), _make_candidate("b")]
client.rerank(query="test", candidates=candidates, top_n=1)
payload = mock_post.call_args.kwargs.get("json") or mock_post.call_args[1].get("json")
self.assertEqual(payload["top_n"], 1)
@patch("library.services.reranker.requests.post")
def test_documents_use_text_preview(self, mock_post):
"""Document text comes from candidate.text_preview."""
mock_response = MagicMock()
mock_response.status_code = 200
mock_response.json.return_value = {
"results": [{"index": 0, "score": 0.5}],
"usage": {"document_count": 1, "elapsed_ms": 10},
}
mock_post.return_value = mock_response
model = _mock_reranker_model()
client = RerankerClient(model)
candidates = [_make_candidate("a", text_preview="Motor wiring procedures")]
client.rerank(query="test", candidates=candidates)
payload = mock_post.call_args.kwargs.get("json") or mock_post.call_args[1].get("json")
self.assertEqual(payload["documents"][0]["text"], "Motor wiring procedures")
def test_empty_candidates_returns_empty(self):
"""Empty candidate list returns empty without API call."""
model = _mock_reranker_model()
client = RerankerClient(model)
result = client.rerank(query="test", candidates=[])
self.assertEqual(result, [])
@patch("library.services.reranker.requests.post")
def test_reranked_flag_in_metadata(self, mock_post):
"""Reranked candidates have 'reranked: True' in metadata."""
mock_response = MagicMock()
mock_response.status_code = 200
mock_response.json.return_value = {
"results": [{"index": 0, "score": 0.9}],
"usage": {"document_count": 1, "elapsed_ms": 10},
}
mock_post.return_value = mock_response
model = _mock_reranker_model()
client = RerankerClient(model)
candidates = [_make_candidate("a")]
reranked = client.rerank(query="test", candidates=candidates)
self.assertTrue(reranked[0].metadata.get("reranked"))
@patch("library.services.reranker.requests.post")
def test_api_error_raises(self, mock_post):
"""HTTP errors propagate as exceptions."""
mock_response = MagicMock()
mock_response.status_code = 500
mock_response.text = "Internal Server Error"
mock_response.raise_for_status.side_effect = Exception("Server error")
mock_post.return_value = mock_response
model = _mock_reranker_model()
client = RerankerClient(model)
candidates = [_make_candidate("a")]
with self.assertRaises(Exception):
client.rerank(query="test", candidates=candidates)
@patch("library.services.reranker.requests.post")
def test_no_instruction_omits_field(self, mock_post):
"""Empty instruction is not sent in payload."""
mock_response = MagicMock()
mock_response.status_code = 200
mock_response.json.return_value = {
"results": [{"index": 0, "score": 0.5}],
}
mock_post.return_value = mock_response
model = _mock_reranker_model()
client = RerankerClient(model)
candidates = [_make_candidate("a")]
client.rerank(query="test", candidates=candidates, instruction="")
payload = mock_post.call_args.kwargs.get("json") or mock_post.call_args[1].get("json")
self.assertNotIn("instruction", payload)

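Taken together, the payload rules these tests pin down (each candidate's `text_preview` becomes `documents[].text`, and an empty instruction is omitted from the body entirely) can be sketched as a small helper. This is an illustrative sketch inferred from the tests, not the actual `RerankerClient` code; `build_rerank_payload` is a hypothetical name.

```python
# Hypothetical helper illustrating the /v1/rerank payload shape the tests
# above assert. Not the real RerankerClient implementation.
def build_rerank_payload(query, candidates, instruction=""):
    payload = {
        "query": query,
        "documents": [{"text": c["text_preview"]} for c in candidates],
    }
    if instruction:  # an empty instruction is omitted, not sent as ""
        payload["instruction"] = instruction
    return payload
```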

@@ -0,0 +1,259 @@
"""
Tests for the SearchService.
Neo4j queries and embedding calls are mocked — no external services needed.
"""
from unittest.mock import MagicMock, patch
from django.test import TestCase, override_settings
from library.services.fusion import SearchCandidate
from library.services.search import SearchRequest, SearchResponse, SearchService
class SearchServiceInitTest(TestCase):
"""Tests for SearchService initialization."""
def test_creates_without_user(self):
"""Service can be created without a user."""
service = SearchService()
self.assertIsNone(service.user)
def test_creates_with_user(self):
"""Service stores user for usage tracking."""
user = MagicMock()
service = SearchService(user=user)
self.assertEqual(service.user, user)
class SearchServiceSearchTest(TestCase):
"""Tests for SearchService.search() orchestration."""
@patch("library.services.search.SearchService._image_search")
@patch("library.services.search.SearchService._rerank")
@patch("library.services.search.SearchService._graph_search")
@patch("library.services.search.SearchService._fulltext_search")
@patch("library.services.search.SearchService._vector_search")
@patch("library.services.search.SearchService._embed_query")
def test_search_calls_all_types(
self, mock_embed, mock_vector, mock_fulltext, mock_graph,
mock_rerank, mock_image
):
"""Search dispatches to all enabled search types."""
mock_embed.return_value = [0.1] * 2048
mock_vector.return_value = [
SearchCandidate(
chunk_uid="c1", item_uid="i1", item_title="Test",
library_type="technical", text_preview="preview",
chunk_s3_key="s3/key", chunk_index=0, score=0.9,
source="vector",
)
]
mock_fulltext.return_value = []
mock_graph.return_value = []
mock_rerank.return_value = (None, None)
mock_image.return_value = []
request = SearchRequest(
query="test query",
search_types=["vector", "fulltext", "graph"],
)
service = SearchService()
response = service.search(request)
mock_embed.assert_called_once()
mock_vector.assert_called_once()
mock_fulltext.assert_called_once()
mock_graph.assert_called_once()
self.assertIsInstance(response, SearchResponse)
self.assertEqual(response.query, "test query")
self.assertGreater(len(response.candidates), 0)
@patch("library.services.search.SearchService._embed_query")
def test_search_without_embedding_model(self, mock_embed):
"""Search continues without vector search if no embedding model."""
mock_embed.return_value = None
request = SearchRequest(
query="test",
search_types=["vector"],
rerank=False,
include_images=False,
)
service = SearchService()
response = service.search(request)
# No candidates since only vector was requested and no embedding
self.assertEqual(len(response.candidates), 0)
@patch("library.services.search.SearchService._rerank")
@patch("library.services.search.SearchService._fulltext_search")
@patch("library.services.search.SearchService._embed_query")
def test_search_with_reranking(self, mock_embed, mock_fulltext, mock_rerank):
"""Search applies reranking when enabled."""
mock_embed.return_value = None
mock_fulltext.return_value = [
SearchCandidate(
chunk_uid="c1", item_uid="i1", item_title="Test",
library_type="technical", text_preview="preview",
chunk_s3_key="s3/key", chunk_index=0, score=0.5,
source="fulltext",
)
]
reranked_candidate = SearchCandidate(
chunk_uid="c1", item_uid="i1", item_title="Test",
library_type="technical", text_preview="preview",
chunk_s3_key="s3/key", chunk_index=0, score=0.95,
source="fulltext",
)
mock_rerank.return_value = ([reranked_candidate], "qwen3-vl-reranker-2b")
request = SearchRequest(
query="test",
search_types=["fulltext"],
rerank=True,
include_images=False,
)
service = SearchService()
response = service.search(request)
self.assertTrue(response.reranker_used)
self.assertEqual(response.reranker_model, "qwen3-vl-reranker-2b")
self.assertAlmostEqual(response.candidates[0].score, 0.95)
@patch("library.services.search.SearchService._fulltext_search")
@patch("library.services.search.SearchService._embed_query")
def test_search_without_reranking(self, mock_embed, mock_fulltext):
"""Search skips reranking when disabled."""
mock_embed.return_value = None
mock_fulltext.return_value = [
SearchCandidate(
chunk_uid="c1", item_uid="i1", item_title="Test",
library_type="technical", text_preview="preview",
chunk_s3_key="s3/key", chunk_index=0, score=0.5,
source="fulltext",
)
]
request = SearchRequest(
query="test",
search_types=["fulltext"],
rerank=False,
include_images=False,
)
service = SearchService()
response = service.search(request)
self.assertFalse(response.reranker_used)
self.assertIsNone(response.reranker_model)
@patch("library.services.search.SearchService._fulltext_search")
@patch("library.services.search.SearchService._embed_query")
def test_search_respects_limit(self, mock_embed, mock_fulltext):
"""Search trims results to requested limit."""
mock_embed.return_value = None
mock_fulltext.return_value = [
SearchCandidate(
chunk_uid=f"c{i}", item_uid=f"i{i}", item_title=f"Title {i}",
library_type="technical", text_preview=f"preview {i}",
chunk_s3_key=f"s3/{i}", chunk_index=i, score=0.5 - i * 0.01,
source="fulltext",
)
for i in range(20)
]
request = SearchRequest(
query="test",
search_types=["fulltext"],
limit=5,
rerank=False,
include_images=False,
)
service = SearchService()
response = service.search(request)
self.assertLessEqual(len(response.candidates), 5)
@patch("library.services.search.SearchService._embed_query")
def test_search_tracks_types_used(self, mock_embed):
"""Response lists which search types actually ran."""
mock_embed.return_value = None
request = SearchRequest(
query="test",
search_types=["fulltext"],
rerank=False,
include_images=False,
)
service = SearchService()
# Mock fulltext to return empty — type not added to used list
with patch.object(service, "_fulltext_search", return_value=[]):
response = service.search(request)
# Fulltext was called but returned empty, so not in used types
self.assertEqual(response.search_types_used, [])
class SearchServiceHelperTest(TestCase):
"""Tests for SearchService helper methods."""
def test_get_type_embedding_instruction(self):
"""Returns embedding instruction for known library type."""
service = SearchService()
instruction = service._get_type_embedding_instruction("technical")
self.assertIn("technical", instruction.lower())
def test_get_type_embedding_instruction_unknown(self):
"""Returns empty string for unknown library type."""
service = SearchService()
instruction = service._get_type_embedding_instruction("nonexistent")
self.assertEqual(instruction, "")
def test_get_reranker_instruction_from_type(self):
"""Resolves reranker instruction from library_type in request."""
service = SearchService()
request = SearchRequest(query="test", library_type="fiction")
instruction = service._get_reranker_instruction(request, [])
self.assertIn("fiction", instruction.lower())
def test_get_reranker_instruction_from_candidates(self):
"""Detects dominant library type from candidate list."""
service = SearchService()
request = SearchRequest(query="test")
candidates = [
SearchCandidate(
chunk_uid=f"c{i}", item_uid="i1", item_title="T",
library_type="technical", text_preview="p",
chunk_s3_key="s3", chunk_index=0, score=0.5,
source="vector",
)
for i in range(5)
]
instruction = service._get_reranker_instruction(request, candidates)
self.assertIn("technical", instruction.lower())
def test_get_reranker_instruction_empty_when_no_context(self):
"""Returns empty when no library type context available."""
service = SearchService()
request = SearchRequest(query="test")
candidates = [
SearchCandidate(
chunk_uid="c1", item_uid="i1", item_title="T",
library_type="", text_preview="p",
chunk_s3_key="s3", chunk_index=0, score=0.5,
source="vector",
)
]
instruction = service._get_reranker_instruction(request, candidates)
self.assertEqual(instruction, "")

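The helper tests imply a "dominant library type" rule: the reranker instruction is resolved from the most common non-empty `library_type` among the candidates, falling back to an empty string when none is present. A minimal sketch of that rule, assuming candidates carry a `library_type` field (this is inferred from the tests, not the actual `SearchService` code):

```python
from collections import Counter

# Illustrative sketch only: pick the most frequent non-empty library_type
# among candidates, or "" when no candidate carries one.
def dominant_library_type(candidates):
    counts = Counter(c["library_type"] for c in candidates if c["library_type"])
    if not counts:
        return ""
    return counts.most_common(1)[0][0]
```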

@@ -0,0 +1,226 @@
"""
Tests for the search API endpoints.
All search service calls are mocked — tests focus on request validation,
response format, and authentication.
"""
from unittest.mock import MagicMock, patch
from django.contrib.auth import get_user_model
from django.test import TestCase
from rest_framework.test import APIClient
from library.services.fusion import ImageSearchResult, SearchCandidate
from library.services.search import SearchResponse
User = get_user_model()
class SearchAPIAuthTest(TestCase):
"""Tests for search API authentication."""
def setUp(self):
self.client = APIClient()
def test_search_requires_auth(self):
"""POST /api/v1/library/search/ requires authentication."""
response = self.client.post(
"/library/api/search/",
{"query": "test"},
format="json",
)
# Should be 403 or 401 (not authenticated)
self.assertIn(response.status_code, [401, 403])
class SearchAPIValidationTest(TestCase):
"""Tests for search API request validation."""
def setUp(self):
self.user = User.objects.create_user(
username="testuser", password="testpass123"
)
self.client = APIClient()
self.client.force_authenticate(user=self.user)
def test_missing_query_returns_400(self):
"""Request without query field returns 400."""
response = self.client.post(
"/library/api/search/",
{},
format="json",
)
self.assertEqual(response.status_code, 400)
def test_empty_query_returns_400(self):
"""Empty query string returns 400."""
response = self.client.post(
"/library/api/search/",
{"query": ""},
format="json",
)
self.assertEqual(response.status_code, 400)
def test_invalid_library_type_returns_400(self):
"""Invalid library_type returns 400."""
response = self.client.post(
"/library/api/search/",
{"query": "test", "library_type": "invalid_type"},
format="json",
)
self.assertEqual(response.status_code, 400)
def test_invalid_search_type_returns_400(self):
"""Invalid search type returns 400."""
response = self.client.post(
"/library/api/search/",
{"query": "test", "search_types": ["invalid"]},
format="json",
)
self.assertEqual(response.status_code, 400)
def test_limit_above_max_returns_400(self):
"""Limit > 100 returns 400."""
response = self.client.post(
"/library/api/search/",
{"query": "test", "limit": 200},
format="json",
)
self.assertEqual(response.status_code, 400)
class SearchAPIResponseTest(TestCase):
"""Tests for search API response format."""
def setUp(self):
self.user = User.objects.create_user(
username="testuser", password="testpass123"
)
self.client = APIClient()
self.client.force_authenticate(user=self.user)
@patch("library.api.views.SearchService")
def test_successful_search_response_format(self, MockService):
"""Successful search returns expected JSON structure."""
mock_response = SearchResponse(
query="neural networks",
candidates=[
SearchCandidate(
chunk_uid="c1", item_uid="i1", item_title="Deep Learning",
library_type="technical", text_preview="Neural networks are...",
chunk_s3_key="s3/chunk.txt", chunk_index=0,
score=0.95, source="vector+fulltext",
)
],
images=[
ImageSearchResult(
image_uid="img1", item_uid="i1", item_title="Deep Learning",
image_type="diagram", description="Neural network architecture",
s3_key="s3/img.png", score=0.8, source="vector",
)
],
total_candidates=42,
search_time_ms=156.7,
reranker_used=True,
reranker_model="qwen3-vl-reranker-2b",
search_types_used=["vector", "fulltext"],
)
mock_instance = MockService.return_value
mock_instance.search.return_value = mock_response
response = self.client.post(
"/library/api/search/",
{"query": "neural networks"},
format="json",
)
self.assertEqual(response.status_code, 200)
data = response.json()
# Verify top-level fields
self.assertEqual(data["query"], "neural networks")
self.assertEqual(data["total_candidates"], 42)
self.assertTrue(data["reranker_used"])
self.assertEqual(data["reranker_model"], "qwen3-vl-reranker-2b")
self.assertIn("vector", data["search_types_used"])
# Verify candidate structure
self.assertEqual(len(data["candidates"]), 1)
candidate = data["candidates"][0]
self.assertEqual(candidate["chunk_uid"], "c1")
self.assertEqual(candidate["item_title"], "Deep Learning")
self.assertAlmostEqual(candidate["score"], 0.95)
# Verify image structure
self.assertEqual(len(data["images"]), 1)
image = data["images"][0]
self.assertEqual(image["image_uid"], "img1")
self.assertEqual(image["image_type"], "diagram")
@patch("library.api.views.SearchService")
def test_vector_only_endpoint(self, MockService):
"""Vector-only endpoint sets correct search types."""
mock_response = SearchResponse(
query="test", candidates=[], images=[],
total_candidates=0, search_time_ms=10,
reranker_used=False, reranker_model=None,
search_types_used=[],
)
mock_instance = MockService.return_value
mock_instance.search.return_value = mock_response
response = self.client.post(
"/library/api/search/vector/",
{"query": "test"},
format="json",
)
self.assertEqual(response.status_code, 200)
# Verify search was called with vector only
call_args = mock_instance.search.call_args[0][0]
self.assertEqual(call_args.search_types, ["vector"])
self.assertFalse(call_args.rerank)
@patch("library.api.views.SearchService")
def test_fulltext_only_endpoint(self, MockService):
"""Fulltext-only endpoint sets correct search types."""
mock_response = SearchResponse(
query="test", candidates=[], images=[],
total_candidates=0, search_time_ms=10,
reranker_used=False, reranker_model=None,
search_types_used=[],
)
mock_instance = MockService.return_value
mock_instance.search.return_value = mock_response
response = self.client.post(
"/library/api/search/fulltext/",
{"query": "test"},
format="json",
)
self.assertEqual(response.status_code, 200)
call_args = mock_instance.search.call_args[0][0]
self.assertEqual(call_args.search_types, ["fulltext"])
self.assertFalse(call_args.rerank)
class ConceptAPITest(TestCase):
"""Tests for concept API endpoints."""
def setUp(self):
self.user = User.objects.create_user(
username="testuser", password="testpass123"
)
self.client = APIClient()
self.client.force_authenticate(user=self.user)
def test_concept_list_requires_auth(self):
"""GET /api/v1/library/concepts/ requires authentication."""
client = APIClient() # Unauthenticated
response = client.get("/library/api/concepts/")
self.assertIn(response.status_code, [401, 403])

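The validation rules these API tests exercise (non-empty query, limit capped at 100, search types restricted to a known set) can be sketched as a plain function. Field names mirror the tests; the function itself and its error messages are hypothetical, and the real checks live in the DRF serializer:

```python
# Illustrative sketch of the request validation the tests above assert.
# ALLOWED_SEARCH_TYPES lists only the types the tests mention.
ALLOWED_SEARCH_TYPES = {"vector", "fulltext", "graph"}

def validate_search_request(data):
    errors = {}
    if not (data.get("query") or "").strip():
        errors["query"] = "A non-empty query is required."
    limit = data.get("limit", 10)
    if not isinstance(limit, int) or not 1 <= limit <= 100:
        errors["limit"] = "limit must be between 1 and 100."
    unknown = set(data.get("search_types", [])) - ALLOWED_SEARCH_TYPES
    if unknown:
        errors["search_types"] = f"Unknown search types: {sorted(unknown)}"
    return errors
```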

@@ -0,0 +1,491 @@
"""
Tests for the vision analysis service (Phase 2B).
VisionAnalyzer tests mock external dependencies (S3, vision model API, Neo4j).
"""
import json
from unittest.mock import MagicMock, patch
from django.test import TestCase
from library.services.vision import VALID_IMAGE_TYPES, VisionAnalyzer
class MockVisionModel:
"""Mock LLMModel for vision testing."""
def __init__(self):
self.name = "qwen3-vl-72b"
self.api = MockVisionApi()
class MockVisionApi:
"""Mock LLMApi for vision testing."""
def __init__(self):
self.name = "Local vLLM"
self.base_url = "http://localhost:8000/v1"
self.api_key = ""
self.timeout_seconds = 120
class VisionAnalyzerInitTests(TestCase):
"""Tests for VisionAnalyzer initialization."""
def test_init_basic(self):
model = MockVisionModel()
analyzer = VisionAnalyzer(model)
self.assertEqual(analyzer.model_name, "qwen3-vl-72b")
self.assertEqual(analyzer.base_url, "http://localhost:8000/v1")
self.assertIsNone(analyzer.user)
def test_init_with_user(self):
model = MockVisionModel()
user = MagicMock()
analyzer = VisionAnalyzer(model, user=user)
self.assertEqual(analyzer.user, user)
def test_strips_trailing_slash_from_base_url(self):
model = MockVisionModel()
model.api.base_url = "http://localhost:8000/v1/"
analyzer = VisionAnalyzer(model)
self.assertEqual(analyzer.base_url, "http://localhost:8000/v1")
class VisionResponseParsingTests(TestCase):
"""Tests for _parse_vision_response."""
def setUp(self):
self.analyzer = VisionAnalyzer(MockVisionModel())
def test_parse_valid_json(self):
response = json.dumps({
"image_type": "diagram",
"description": "A wiring diagram.",
"ocr_text": "L1 L2 L3",
"concepts": [{"name": "wiring", "type": "topic"}],
})
result = self.analyzer._parse_vision_response(response)
self.assertIsNotNone(result)
self.assertEqual(result["image_type"], "diagram")
self.assertEqual(result["description"], "A wiring diagram.")
self.assertEqual(result["ocr_text"], "L1 L2 L3")
self.assertEqual(len(result["concepts"]), 1)
def test_parse_json_in_markdown_code_block(self):
response = '```json\n{"image_type": "chart", "description": "A pie chart.", "ocr_text": "", "concepts": []}\n```'
result = self.analyzer._parse_vision_response(response)
self.assertIsNotNone(result)
self.assertEqual(result["image_type"], "chart")
def test_parse_json_embedded_in_text(self):
response = 'Here is my analysis: {"image_type": "photo", "description": "A landscape.", "ocr_text": "", "concepts": []} I hope that helps.'
result = self.analyzer._parse_vision_response(response)
self.assertIsNotNone(result)
self.assertEqual(result["image_type"], "photo")
def test_parse_invalid_json_returns_none(self):
response = "This is not JSON at all, just a description."
result = self.analyzer._parse_vision_response(response)
self.assertIsNone(result)
def test_parse_empty_string_returns_none(self):
result = self.analyzer._parse_vision_response("")
self.assertIsNone(result)
def test_parse_json_array_extracts_object_via_regex(self):
"""JSON arrays containing an object get extracted via regex fallback."""
response = '[{"image_type": "photo", "description": "Test", "ocr_text": "", "concepts": []}]'
result = self.analyzer._parse_vision_response(response)
# The regex fallback finds the embedded JSON object
self.assertIsNotNone(result)
self.assertEqual(result["image_type"], "photo")
class VisionResultValidationTests(TestCase):
"""Tests for _validate_result."""
def setUp(self):
self.analyzer = VisionAnalyzer(MockVisionModel())
def test_valid_image_type_preserved(self):
result = self.analyzer._validate_result({
"image_type": "diagram",
"description": "A diagram.",
"ocr_text": "",
"concepts": [],
})
self.assertEqual(result["image_type"], "diagram")
def test_invalid_image_type_defaults_to_photo(self):
result = self.analyzer._validate_result({
"image_type": "unknown_type",
"description": "Something.",
"ocr_text": "",
"concepts": [],
})
self.assertEqual(result["image_type"], "photo")
def test_empty_image_type_defaults_to_photo(self):
result = self.analyzer._validate_result({
"image_type": "",
"description": "Something.",
"ocr_text": "",
"concepts": [],
})
self.assertEqual(result["image_type"], "photo")
def test_image_type_case_insensitive(self):
result = self.analyzer._validate_result({
"image_type": "DIAGRAM",
"description": "Something.",
"ocr_text": "",
"concepts": [],
})
self.assertEqual(result["image_type"], "diagram")
def test_all_valid_image_types(self):
for image_type in VALID_IMAGE_TYPES:
result = self.analyzer._validate_result({
"image_type": image_type,
"description": "Test.",
"ocr_text": "",
"concepts": [],
})
self.assertEqual(result["image_type"], image_type)
def test_description_truncated_at_2000(self):
long_desc = "x" * 3000
result = self.analyzer._validate_result({
"image_type": "photo",
"description": long_desc,
"ocr_text": "",
"concepts": [],
})
self.assertEqual(len(result["description"]), 2000)
def test_ocr_text_truncated_at_5000(self):
long_ocr = "y" * 6000
result = self.analyzer._validate_result({
"image_type": "photo",
"description": "Test.",
"ocr_text": long_ocr,
"concepts": [],
})
self.assertEqual(len(result["ocr_text"]), 5000)
def test_concepts_capped_at_20(self):
concepts = [{"name": f"concept-{i}", "type": "topic"} for i in range(30)]
result = self.analyzer._validate_result({
"image_type": "photo",
"description": "Test.",
"ocr_text": "",
"concepts": concepts,
})
self.assertEqual(len(result["concepts"]), 20)
def test_concept_names_lowercased(self):
result = self.analyzer._validate_result({
"image_type": "photo",
"description": "Test.",
"ocr_text": "",
"concepts": [{"name": "Machine Learning", "type": "topic"}],
})
self.assertEqual(result["concepts"][0]["name"], "machine learning")
def test_concept_names_stripped(self):
result = self.analyzer._validate_result({
"image_type": "photo",
"description": "Test.",
"ocr_text": "",
"concepts": [{"name": " padded ", "type": "topic"}],
})
self.assertEqual(result["concepts"][0]["name"], "padded")
def test_short_concept_names_filtered(self):
"""Concept names shorter than 2 characters are filtered out."""
result = self.analyzer._validate_result({
"image_type": "photo",
"description": "Test.",
"ocr_text": "",
"concepts": [
{"name": "a", "type": "topic"},
{"name": "ab", "type": "topic"},
],
})
self.assertEqual(len(result["concepts"]), 1)
self.assertEqual(result["concepts"][0]["name"], "ab")
def test_invalid_concept_entries_filtered(self):
"""Non-dict entries and entries without name are filtered."""
result = self.analyzer._validate_result({
"image_type": "photo",
"description": "Test.",
"ocr_text": "",
"concepts": [
"just a string",
{"type": "topic"}, # missing name
{"name": "valid", "type": "topic"},
],
})
self.assertEqual(len(result["concepts"]), 1)
self.assertEqual(result["concepts"][0]["name"], "valid")
def test_missing_concept_type_defaults_to_topic(self):
result = self.analyzer._validate_result({
"image_type": "photo",
"description": "Test.",
"ocr_text": "",
"concepts": [{"name": "untyped"}],
})
self.assertEqual(result["concepts"][0]["type"], "topic")
def test_missing_fields_default_to_empty(self):
result = self.analyzer._validate_result({})
self.assertEqual(result["image_type"], "photo")
self.assertEqual(result["description"], "")
self.assertEqual(result["ocr_text"], "")
self.assertEqual(result["concepts"], [])
class VisionModelCallTests(TestCase):
"""Tests for _call_vision_model HTTP request construction."""
def setUp(self):
self.analyzer = VisionAnalyzer(MockVisionModel())
@patch("library.services.vision.requests.post")
def test_call_sends_correct_payload(self, mock_post):
mock_response = MagicMock()
mock_response.status_code = 200
mock_response.json.return_value = {
"choices": [
{"message": {"content": '{"image_type": "photo", "description": "Test", "ocr_text": "", "concepts": []}'}}
]
}
mock_post.return_value = mock_response
result = self.analyzer._call_vision_model(
b64_image="dGVzdA==",
mime_type="image/png",
user_prompt="Analyze this image.",
)
mock_post.assert_called_once()
call_args = mock_post.call_args
url = call_args[0][0] if call_args[0] else call_args[1].get("url")
self.assertIn("/chat/completions", url)
body = call_args[1]["json"]
self.assertEqual(body["model"], "qwen3-vl-72b")
self.assertEqual(body["temperature"], 0.1)
self.assertEqual(body["max_tokens"], 800)
self.assertEqual(len(body["messages"]), 2)
self.assertEqual(body["messages"][0]["role"], "system")
self.assertEqual(body["messages"][1]["role"], "user")
# User message should contain text and image_url parts
user_content = body["messages"][1]["content"]
self.assertEqual(len(user_content), 2)
self.assertEqual(user_content[0]["type"], "text")
self.assertEqual(user_content[1]["type"], "image_url")
self.assertIn("data:image/png;base64,", user_content[1]["image_url"]["url"])
@patch("library.services.vision.requests.post")
def test_call_includes_auth_header_when_api_key_set(self, mock_post):
self.analyzer.api.api_key = "test-key-123"
mock_response = MagicMock()
mock_response.status_code = 200
mock_response.json.return_value = {
"choices": [{"message": {"content": "{}"}}]
}
mock_post.return_value = mock_response
self.analyzer._call_vision_model("dGVzdA==", "image/png", "Test")
call_headers = mock_post.call_args[1]["headers"]
self.assertIn("Authorization", call_headers)
self.assertEqual(call_headers["Authorization"], "Bearer test-key-123")
@patch("library.services.vision.requests.post")
def test_call_no_auth_header_when_no_api_key(self, mock_post):
self.analyzer.api.api_key = ""
mock_response = MagicMock()
mock_response.status_code = 200
mock_response.json.return_value = {
"choices": [{"message": {"content": "{}"}}]
}
mock_post.return_value = mock_response
self.analyzer._call_vision_model("dGVzdA==", "image/png", "Test")
call_headers = mock_post.call_args[1]["headers"]
self.assertNotIn("Authorization", call_headers)
@patch("library.services.vision.requests.post")
def test_call_parses_openai_format(self, mock_post):
mock_response = MagicMock()
mock_response.status_code = 200
mock_response.json.return_value = {
"choices": [{"message": {"content": "response text"}}]
}
mock_post.return_value = mock_response
result = self.analyzer._call_vision_model("dGVzdA==", "image/png", "Test")
self.assertEqual(result, "response text")
@patch("library.services.vision.requests.post")
def test_call_parses_bedrock_format(self, mock_post):
mock_response = MagicMock()
mock_response.status_code = 200
mock_response.json.return_value = {
"output": {"message": {"content": [{"text": "bedrock response"}]}}
}
mock_post.return_value = mock_response
result = self.analyzer._call_vision_model("dGVzdA==", "image/png", "Test")
self.assertEqual(result, "bedrock response")
@patch("library.services.vision.requests.post")
def test_call_raises_on_unexpected_format(self, mock_post):
mock_response = MagicMock()
mock_response.status_code = 200
mock_response.json.return_value = {"unexpected": "format"}
mock_post.return_value = mock_response
with self.assertRaises(ValueError):
self.analyzer._call_vision_model("dGVzdA==", "image/png", "Test")
@patch("library.services.vision.requests.post")
def test_call_raises_on_http_error(self, mock_post):
mock_response = MagicMock()
mock_response.status_code = 500
mock_response.text = "Internal Server Error"
mock_response.raise_for_status.side_effect = Exception("500 Server Error")
mock_post.return_value = mock_response
with self.assertRaises(Exception):
self.analyzer._call_vision_model("dGVzdA==", "image/png", "Test")
class AnalyzeImagesTests(TestCase):
"""Tests for the top-level analyze_images method."""
def setUp(self):
self.analyzer = VisionAnalyzer(MockVisionModel())
@patch.object(VisionAnalyzer, "_log_usage")
@patch.object(VisionAnalyzer, "_apply_result")
@patch.object(VisionAnalyzer, "_analyze_single_image")
@patch("django.core.files.storage.default_storage")
def test_successful_analysis_returns_count(
self, mock_storage, mock_analyze, mock_apply, mock_log
):
mock_file = MagicMock()
mock_file.read.return_value = b"fake image data"
mock_storage.open.return_value = mock_file
mock_analyze.return_value = {
"image_type": "diagram",
"description": "Test",
"ocr_text": "",
"concepts": [],
}
img_node = MagicMock()
img_node.s3_key = "images/test/0.png"
count = self.analyzer.analyze_images([img_node], vision_prompt="Analyze")
self.assertEqual(count, 1)
mock_analyze.assert_called_once()
mock_apply.assert_called_once()
@patch.object(VisionAnalyzer, "_analyze_single_image")
@patch("django.core.files.storage.default_storage")
def test_failed_analysis_marks_failed(self, mock_storage, mock_analyze):
mock_file = MagicMock()
mock_file.read.return_value = b"fake image data"
mock_storage.open.return_value = mock_file
mock_analyze.return_value = None # Analysis failed
img_node = MagicMock()
img_node.s3_key = "images/test/0.png"
count = self.analyzer.analyze_images([img_node])
self.assertEqual(count, 0)
self.assertEqual(img_node.analysis_status, "failed")
img_node.save.assert_called()
@patch("django.core.files.storage.default_storage")
def test_s3_read_failure_marks_failed(self, mock_storage):
mock_storage.open.side_effect = Exception("S3 error")
img_node = MagicMock()
img_node.s3_key = "images/test/0.png"
count = self.analyzer.analyze_images([img_node])
self.assertEqual(count, 0)
self.assertEqual(img_node.analysis_status, "failed")
@patch.object(VisionAnalyzer, "_log_usage")
@patch.object(VisionAnalyzer, "_apply_result")
@patch.object(VisionAnalyzer, "_analyze_single_image")
@patch("django.core.files.storage.default_storage")
def test_partial_failure_counts_successes(
self, mock_storage, mock_analyze, mock_apply, mock_log
):
"""One success + one failure = count of 1."""
mock_file = MagicMock()
mock_file.read.return_value = b"fake data"
mock_storage.open.return_value = mock_file
# First succeeds, second fails
mock_analyze.side_effect = [
{"image_type": "photo", "description": "Test", "ocr_text": "", "concepts": []},
None,
]
img1 = MagicMock()
img1.s3_key = "images/test/0.png"
img2 = MagicMock()
img2.s3_key = "images/test/1.png"
count = self.analyzer.analyze_images([img1, img2])
self.assertEqual(count, 1)
def test_empty_image_list(self):
count = self.analyzer.analyze_images([])
self.assertEqual(count, 0)
@patch.object(VisionAnalyzer, "_log_usage")
@patch.object(VisionAnalyzer, "_apply_result")
@patch.object(VisionAnalyzer, "_analyze_single_image")
@patch("django.core.files.storage.default_storage")
def test_extracts_extension_from_s3_key(
self, mock_storage, mock_analyze, mock_apply, mock_log
):
mock_file = MagicMock()
mock_file.read.return_value = b"data"
mock_storage.open.return_value = mock_file
mock_analyze.return_value = {
"image_type": "photo",
"description": "Test",
"ocr_text": "",
"concepts": [],
}
img_node = MagicMock()
img_node.s3_key = "images/test/0.jpg"
self.analyzer.analyze_images([img_node])
# Check the extension passed to _analyze_single_image
call_args = mock_analyze.call_args
self.assertEqual(call_args[0][1], "jpg") # ext argument

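The `_validate_result` rules asserted above (unknown or empty image types fall back to "photo", description truncated at 2000 and OCR at 5000 characters, concepts lowercased, stripped, minimum 2 characters, capped at 20, type defaulting to "topic") can be restated as a standalone sketch. This is a reimplementation inferred from the tests, not the real `library.services.vision` code; `VALID_IMAGE_TYPES` here lists only the types the tests mention.

```python
# Illustrative sketch of the validation rules the tests above assert.
VALID_IMAGE_TYPES = {"photo", "diagram", "chart"}  # subset; real set may be larger

def validate_result(raw):
    image_type = str(raw.get("image_type", "")).strip().lower()
    if image_type not in VALID_IMAGE_TYPES:
        image_type = "photo"  # unknown/empty types fall back to photo
    concepts = []
    for entry in raw.get("concepts") or []:
        if not isinstance(entry, dict):
            continue  # non-dict entries are dropped
        name = str(entry.get("name", "")).strip().lower()
        if len(name) < 2:
            continue  # names shorter than 2 characters are filtered
        concepts.append({"name": name, "type": entry.get("type") or "topic"})
    return {
        "image_type": image_type,
        "description": str(raw.get("description", ""))[:2000],
        "ocr_text": str(raw.get("ocr_text", ""))[:5000],
        "concepts": concepts[:20],
    }
```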

@@ -54,6 +54,11 @@ urlpatterns = [
path("items/<str:uid>/delete/", views.item_delete, name="item-delete"), path("items/<str:uid>/delete/", views.item_delete, name="item-delete"),
# Image views # Image views
path("images/<str:uid>/view/", views.image_serve, name="image-serve"), path("images/<str:uid>/view/", views.image_serve, name="image-serve"),
# Search (Phase 3)
path("search/", views.search_page, name="search"),
# Concepts (Phase 3)
path("concepts/", views.concept_list_page, name="concept-list"),
path("concepts/<str:uid>/", views.concept_detail_page, name="concept-detail"),
# DRF API
path("api/", include("library.api.urls")),
]


@@ -603,6 +603,165 @@ def embedding_dashboard(request):
return render(request, "library/embedding_dashboard.html", context) return render(request, "library/embedding_dashboard.html", context)
# ---------------------------------------------------------------------------
# Search views (Phase 3)
# ---------------------------------------------------------------------------
@login_required
def search_page(request):
"""Search page — query input with filters and results display."""
from .utils import neo4j_available
context = {
"query": "",
"results": None,
"libraries": [],
"error": None,
}
# Load libraries for filter dropdown
if neo4j_available():
try:
from .models import Library
context["libraries"] = Library.nodes.order_by("name")
except Exception:
pass
if request.method == "POST" or request.GET.get("q"):
query = request.POST.get("query", "") or request.GET.get("q", "")
library_uid = request.POST.get("library_uid", "") or request.GET.get("library_uid", "")
library_type = request.POST.get("library_type", "") or request.GET.get("library_type", "")
rerank = request.POST.get("rerank", "on") == "on"
context["query"] = query
if query.strip():
try:
from django.conf import settings as django_settings
from .services.search import SearchRequest, SearchService
search_request = SearchRequest(
query=query,
library_uid=library_uid or None,
library_type=library_type or None,
limit=getattr(django_settings, "SEARCH_DEFAULT_LIMIT", 20),
vector_top_k=getattr(django_settings, "SEARCH_VECTOR_TOP_K", 50),
fulltext_top_k=getattr(django_settings, "SEARCH_FULLTEXT_TOP_K", 30),
rerank=rerank,
include_images=True,
)
service = SearchService(user=request.user)
context["results"] = service.search(search_request)
except Exception as exc:
logger.error("Search failed: %s", exc, exc_info=True)
context["error"] = str(exc)
return render(request, "library/search.html", context)
@login_required
def concept_list_page(request):
"""Browse concepts with optional search."""
context = {
"concepts": [],
"query": "",
"error": None,
}
query = request.GET.get("q", "")
context["query"] = query
try:
if query:
from neomodel import db
results, _ = db.cypher_query(
"CALL db.index.fulltext.queryNodes('concept_name_fulltext', $query) "
"YIELD node, score "
"RETURN node.uid AS uid, node.name AS name, "
" node.concept_type AS concept_type, score "
"ORDER BY score DESC LIMIT 50",
{"query": query},
)
context["concepts"] = [
{"uid": r[0], "name": r[1], "concept_type": r[2] or "", "score": r[3]}
for r in results
]
else:
from .models import Concept
concepts = Concept.nodes.order_by("name")[:100]
context["concepts"] = [
{"uid": c.uid, "name": c.name, "concept_type": c.concept_type or ""}
for c in concepts
]
except Exception as exc:
logger.error("Concept list failed: %s", exc)
context["error"] = str(exc)
return render(request, "library/concept_list.html", context)
@login_required
def concept_detail_page(request, uid):
"""View a concept and its graph connections."""
context = {
"concept": None,
"items": [],
"related_concepts": [],
"chunk_count": 0,
"image_count": 0,
"error": None,
}
try:
from neomodel import db
results, _ = db.cypher_query(
"MATCH (c:Concept {uid: $uid}) "
"OPTIONAL MATCH (c)<-[:MENTIONS]-(chunk:Chunk)<-[:HAS_CHUNK]-(item:Item) "
"OPTIONAL MATCH (c)<-[:DEPICTS]-(img:Image)<-[:HAS_IMAGE]-(img_item:Item) "
"OPTIONAL MATCH (c)-[:RELATED_TO]-(related:Concept) "
"RETURN c.uid AS uid, c.name AS name, c.concept_type AS concept_type, "
" collect(DISTINCT {uid: item.uid, title: item.title})[..20] AS items, "
" collect(DISTINCT {uid: related.uid, name: related.name, "
" concept_type: related.concept_type}) AS related_concepts, "
" count(DISTINCT chunk) AS chunk_count, "
" count(DISTINCT img) AS image_count",
{"uid": uid},
)
if not results or not results[0][0]:
messages.error(request, "Concept not found.")
return redirect("library:concept-list")
row = results[0]
context["concept"] = {
"uid": row[0],
"name": row[1],
"concept_type": row[2] or "",
}
context["items"] = [i for i in (row[3] or []) if i.get("uid")]
context["related_concepts"] = [r for r in (row[4] or []) if r.get("uid")]
context["chunk_count"] = row[5] or 0
context["image_count"] = row[6] or 0
except Exception as exc:
logger.error("Concept detail failed: %s", exc)
context["error"] = str(exc)
return render(request, "library/concept_detail.html", context)
# ---------------------------------------------------------------------------
# Batch Embedding
# ---------------------------------------------------------------------------
@login_required @login_required
def embed_all_pending(request): def embed_all_pending(request):
""" """

View File

@@ -100,6 +100,7 @@ class LLMModelAdmin(admin.ModelAdmin):
"system_embedding_badge", "system_embedding_badge",
"system_chat_badge", "system_chat_badge",
"system_reranker_badge", "system_reranker_badge",
"system_vision_badge",
"is_active", "is_active",
"created_at", "created_at",
) )
@@ -113,6 +114,7 @@ class LLMModelAdmin(admin.ModelAdmin):
"is_system_embedding_model", "is_system_embedding_model",
"is_system_chat_model", "is_system_chat_model",
"is_system_reranker_model", "is_system_reranker_model",
"is_system_vision_model",
) )
search_fields = ("name", "display_name", "api__name") search_fields = ("name", "display_name", "api__name")
readonly_fields = ( readonly_fields = (
@@ -121,11 +123,13 @@ class LLMModelAdmin(admin.ModelAdmin):
"is_system_embedding_model", "is_system_embedding_model",
"is_system_chat_model", "is_system_chat_model",
"is_system_reranker_model", "is_system_reranker_model",
"is_system_vision_model",
) )
actions = [ actions = [
"set_as_system_embedding_model", "set_as_system_embedding_model",
"set_as_system_chat_model", "set_as_system_chat_model",
"set_as_system_reranker_model", "set_as_system_reranker_model",
"set_as_system_vision_model",
] ]
fieldsets = ( fieldsets = (
("Model Info", {"fields": ("api", "name", "display_name", "model_type", "is_active")}), ("Model Info", {"fields": ("api", "name", "display_name", "model_type", "is_active")}),
@@ -136,6 +140,7 @@ class LLMModelAdmin(admin.ModelAdmin):
"is_system_embedding_model", "is_system_embedding_model",
"is_system_chat_model", "is_system_chat_model",
"is_system_reranker_model", "is_system_reranker_model",
"is_system_vision_model",
), ),
"classes": ("collapse",), "classes": ("collapse",),
"description": ( "description": (
@@ -208,6 +213,16 @@ class LLMModelAdmin(admin.ModelAdmin):
system_reranker_badge.short_description = "Reranker Default" system_reranker_badge.short_description = "Reranker Default"
def system_vision_badge(self, obj):
if obj.is_system_vision_model and obj.model_type in ("vision", "chat"):
return format_html(
'<span style="background:#6f42c1;color:white;padding:3px 8px;'
'border-radius:3px;font-weight:bold;">SYSTEM DEFAULT</span>'
)
return ""
system_vision_badge.short_description = "Vision Default"
# --- System model actions ----------------------------------------------- # --- System model actions -----------------------------------------------
def _set_system_model(self, request, queryset, model_type, field_name, label): def _set_system_model(self, request, queryset, model_type, field_name, label):
@@ -225,6 +240,8 @@ class LLMModelAdmin(admin.ModelAdmin):
valid_types = [model_type] valid_types = [model_type]
if model_type == "embedding": if model_type == "embedding":
valid_types = ["embedding", "multimodal_embed"] valid_types = ["embedding", "multimodal_embed"]
elif model_type == "vision":
valid_types = ["vision", "chat"]
if new_model.model_type not in valid_types: if new_model.model_type not in valid_types:
self.message_user( self.message_user(
@@ -269,6 +286,11 @@ class LLMModelAdmin(admin.ModelAdmin):
set_as_system_reranker_model.short_description = "Set as System Reranker Model" set_as_system_reranker_model.short_description = "Set as System Reranker Model"
def set_as_system_vision_model(self, request, queryset):
self._set_system_model(request, queryset, "vision", "is_system_vision_model", "vision model")
set_as_system_vision_model.short_description = "Set as System Vision Model"
def save_model(self, request, obj, form, change): def save_model(self, request, obj, form, change):
"""Ensure only ONE model per type is marked as system default.""" """Ensure only ONE model per type is marked as system default."""
type_field_map = { type_field_map = {
@@ -276,6 +298,7 @@ class LLMModelAdmin(admin.ModelAdmin):
"multimodal_embed": "is_system_embedding_model", "multimodal_embed": "is_system_embedding_model",
"chat": "is_system_chat_model", "chat": "is_system_chat_model",
"reranker": "is_system_reranker_model", "reranker": "is_system_reranker_model",
"vision": "is_system_vision_model",
} }
for mtype, field in type_field_map.items(): for mtype, field in type_field_map.items():
if getattr(obj, field, False) and obj.model_type == mtype: if getattr(obj, field, False) and obj.model_type == mtype:

View File

@@ -0,0 +1,18 @@
# Generated by Django 5.2.11 on 2026-03-22 15:15
from django.db import migrations
class Migration(migrations.Migration):
dependencies = [
("llm_manager", "0003_add_vision_model_and_usage"),
]
operations = [
migrations.RenameIndex(
model_name="llmmodel",
new_name="llm_manager_is_syst_d190bb_idx",
old_name="llm_manager__is_syst_b2f4e7_idx",
),
]

View File

@@ -0,0 +1,381 @@
"""
Tests for LLM Manager admin configuration.
Covers system model admin actions, badges, and save_model validation.
"""
from decimal import Decimal
from django.contrib.admin.sites import AdminSite
from django.contrib.auth import get_user_model
from django.test import RequestFactory, TestCase
from llm_manager.admin import LLMModelAdmin
from llm_manager.models import LLMApi, LLMModel
User = get_user_model()
class AdminTestBase(TestCase):
"""Base class with common admin test setup."""
def setUp(self):
self.factory = RequestFactory()
self.site = AdminSite()
self.admin = LLMModelAdmin(LLMModel, self.site)
self.user = User.objects.create_superuser(
username="admin", password="admin123", email="admin@test.com"
)
self.api = LLMApi.objects.create(
name="Test API",
api_type="vllm",
base_url="http://localhost:8000/v1",
)
def _make_request(self):
request = self.factory.post("/admin/")
request.user = self.user
# Django admin uses _messages attribute
from django.contrib.messages.storage.fallback import FallbackStorage
setattr(request, "session", "session")
setattr(request, "_messages", FallbackStorage(request))
return request
class SystemVisionModelActionTests(AdminTestBase):
"""Tests for the set_as_system_vision_model admin action."""
def test_set_vision_type_as_system_vision_model(self):
"""A 'vision' type model can be set as system vision model."""
model = LLMModel.objects.create(
api=self.api,
name="qwen3-vl-72b",
model_type="vision",
context_window=8192,
)
request = self._make_request()
queryset = LLMModel.objects.filter(pk=model.pk)
self.admin.set_as_system_vision_model(request, queryset)
model.refresh_from_db()
self.assertTrue(model.is_system_vision_model)
def test_set_chat_type_as_system_vision_model(self):
"""A 'chat' type model can be set as system vision model (vision-capable chat)."""
model = LLMModel.objects.create(
api=self.api,
name="gpt-4o",
model_type="chat",
context_window=128000,
)
request = self._make_request()
queryset = LLMModel.objects.filter(pk=model.pk)
self.admin.set_as_system_vision_model(request, queryset)
model.refresh_from_db()
self.assertTrue(model.is_system_vision_model)
def test_embedding_type_rejected_as_vision_model(self):
"""An 'embedding' type model cannot be set as system vision model."""
model = LLMModel.objects.create(
api=self.api,
name="embed-model",
model_type="embedding",
context_window=8192,
)
request = self._make_request()
queryset = LLMModel.objects.filter(pk=model.pk)
self.admin.set_as_system_vision_model(request, queryset)
model.refresh_from_db()
self.assertFalse(model.is_system_vision_model)
def test_reranker_type_rejected_as_vision_model(self):
"""A 'reranker' type model cannot be set as system vision model."""
model = LLMModel.objects.create(
api=self.api,
name="reranker-model",
model_type="reranker",
context_window=8192,
)
request = self._make_request()
queryset = LLMModel.objects.filter(pk=model.pk)
self.admin.set_as_system_vision_model(request, queryset)
model.refresh_from_db()
self.assertFalse(model.is_system_vision_model)
def test_inactive_model_rejected(self):
"""An inactive model cannot be set as system vision model."""
model = LLMModel.objects.create(
api=self.api,
name="inactive-vision",
model_type="vision",
context_window=8192,
is_active=False,
)
request = self._make_request()
queryset = LLMModel.objects.filter(pk=model.pk)
self.admin.set_as_system_vision_model(request, queryset)
model.refresh_from_db()
self.assertFalse(model.is_system_vision_model)
def test_multiple_selection_rejected(self):
"""Selecting more than one model is rejected."""
m1 = LLMModel.objects.create(
api=self.api, name="v1", model_type="vision", context_window=8192
)
m2 = LLMModel.objects.create(
api=self.api, name="v2", model_type="vision", context_window=8192
)
request = self._make_request()
queryset = LLMModel.objects.filter(pk__in=[m1.pk, m2.pk])
self.admin.set_as_system_vision_model(request, queryset)
m1.refresh_from_db()
m2.refresh_from_db()
self.assertFalse(m1.is_system_vision_model)
self.assertFalse(m2.is_system_vision_model)
def test_replaces_previous_system_vision_model(self):
"""Setting a new system vision model clears the previous one."""
old = LLMModel.objects.create(
api=self.api,
name="old-vision",
model_type="vision",
context_window=8192,
is_system_vision_model=True,
)
new = LLMModel.objects.create(
api=self.api,
name="new-vision",
model_type="vision",
context_window=8192,
)
request = self._make_request()
queryset = LLMModel.objects.filter(pk=new.pk)
self.admin.set_as_system_vision_model(request, queryset)
old.refresh_from_db()
new.refresh_from_db()
self.assertFalse(old.is_system_vision_model)
self.assertTrue(new.is_system_vision_model)
class SystemEmbeddingModelActionTests(AdminTestBase):
"""Tests for the set_as_system_embedding_model admin action."""
def test_set_embedding_model(self):
model = LLMModel.objects.create(
api=self.api,
name="embed",
model_type="embedding",
context_window=8192,
)
request = self._make_request()
self.admin.set_as_system_embedding_model(request, LLMModel.objects.filter(pk=model.pk))
model.refresh_from_db()
self.assertTrue(model.is_system_embedding_model)
def test_multimodal_embed_accepted(self):
model = LLMModel.objects.create(
api=self.api,
name="multimodal",
model_type="multimodal_embed",
context_window=8192,
)
request = self._make_request()
self.admin.set_as_system_embedding_model(request, LLMModel.objects.filter(pk=model.pk))
model.refresh_from_db()
self.assertTrue(model.is_system_embedding_model)
def test_chat_rejected_as_embedding(self):
model = LLMModel.objects.create(
api=self.api,
name="chat",
model_type="chat",
context_window=128000,
)
request = self._make_request()
self.admin.set_as_system_embedding_model(request, LLMModel.objects.filter(pk=model.pk))
model.refresh_from_db()
self.assertFalse(model.is_system_embedding_model)
class SystemChatModelActionTests(AdminTestBase):
"""Tests for the set_as_system_chat_model admin action."""
def test_set_chat_model(self):
model = LLMModel.objects.create(
api=self.api,
name="chat",
model_type="chat",
context_window=128000,
)
request = self._make_request()
self.admin.set_as_system_chat_model(request, LLMModel.objects.filter(pk=model.pk))
model.refresh_from_db()
self.assertTrue(model.is_system_chat_model)
def test_embedding_rejected_as_chat(self):
model = LLMModel.objects.create(
api=self.api,
name="embed",
model_type="embedding",
context_window=8192,
)
request = self._make_request()
self.admin.set_as_system_chat_model(request, LLMModel.objects.filter(pk=model.pk))
model.refresh_from_db()
self.assertFalse(model.is_system_chat_model)
class SystemRerankerModelActionTests(AdminTestBase):
"""Tests for the set_as_system_reranker_model admin action."""
def test_set_reranker_model(self):
model = LLMModel.objects.create(
api=self.api,
name="reranker",
model_type="reranker",
context_window=8192,
)
request = self._make_request()
self.admin.set_as_system_reranker_model(request, LLMModel.objects.filter(pk=model.pk))
model.refresh_from_db()
self.assertTrue(model.is_system_reranker_model)
class BadgeDisplayTests(AdminTestBase):
"""Tests for system model badge display methods."""
def test_vision_badge_for_vision_default(self):
model = LLMModel.objects.create(
api=self.api,
name="vision",
model_type="vision",
context_window=8192,
is_system_vision_model=True,
)
badge = self.admin.system_vision_badge(model)
self.assertIn("SYSTEM DEFAULT", badge)
self.assertIn("6f42c1", badge) # Purple color
def test_vision_badge_for_chat_vision_default(self):
model = LLMModel.objects.create(
api=self.api,
name="chat-vision",
model_type="chat",
context_window=128000,
is_system_vision_model=True,
)
badge = self.admin.system_vision_badge(model)
self.assertIn("SYSTEM DEFAULT", badge)
def test_vision_badge_empty_when_not_default(self):
model = LLMModel.objects.create(
api=self.api,
name="not-default",
model_type="vision",
context_window=8192,
is_system_vision_model=False,
)
badge = self.admin.system_vision_badge(model)
self.assertEqual(badge, "")
def test_vision_badge_empty_for_wrong_type(self):
"""Even if is_system_vision_model is True, wrong model_type shows no badge."""
model = LLMModel.objects.create(
api=self.api,
name="embed-mislabeled",
model_type="embedding",
context_window=8192,
is_system_vision_model=True,
)
badge = self.admin.system_vision_badge(model)
self.assertEqual(badge, "")
def test_embedding_badge_shows(self):
model = LLMModel.objects.create(
api=self.api,
name="embed",
model_type="embedding",
context_window=8192,
is_system_embedding_model=True,
)
badge = self.admin.system_embedding_badge(model)
self.assertIn("SYSTEM DEFAULT", badge)
def test_chat_badge_shows(self):
model = LLMModel.objects.create(
api=self.api,
name="chat",
model_type="chat",
context_window=128000,
is_system_chat_model=True,
)
badge = self.admin.system_chat_badge(model)
self.assertIn("SYSTEM DEFAULT", badge)
def test_reranker_badge_shows(self):
model = LLMModel.objects.create(
api=self.api,
name="reranker",
model_type="reranker",
context_window=8192,
is_system_reranker_model=True,
)
badge = self.admin.system_reranker_badge(model)
self.assertIn("SYSTEM DEFAULT", badge)
class AdminActionDescriptionTests(TestCase):
"""Tests that admin actions have proper short_description."""
def setUp(self):
self.admin = LLMModelAdmin(LLMModel, AdminSite())
def test_vision_action_description(self):
self.assertEqual(
self.admin.set_as_system_vision_model.short_description,
"Set as System Vision Model",
)
def test_embedding_action_description(self):
self.assertEqual(
self.admin.set_as_system_embedding_model.short_description,
"Set as System Embedding Model",
)
def test_chat_action_description(self):
self.assertEqual(
self.admin.set_as_system_chat_model.short_description,
"Set as System Chat Model",
)
def test_reranker_action_description(self):
self.assertEqual(
self.admin.set_as_system_reranker_model.short_description,
"Set as System Reranker Model",
)
def test_vision_badge_description(self):
self.assertEqual(
self.admin.system_vision_badge.short_description,
"Vision Default",
)

View File

@@ -159,11 +159,54 @@ class LLMModelModelTest(TestCase):
result = LLMModel.get_system_reranker_model() result = LLMModel.get_system_reranker_model()
self.assertEqual(result.pk, reranker.pk) self.assertEqual(result.pk, reranker.pk)
def test_get_system_vision_model_with_vision_type(self):
vision = LLMModel.objects.create(
api=self.api,
name="vision-model",
model_type="vision",
context_window=8192,
is_system_vision_model=True,
)
result = LLMModel.get_system_vision_model()
self.assertEqual(result.pk, vision.pk)
def test_get_system_vision_model_with_chat_type(self):
"""Vision-capable chat models can serve as system vision model."""
self.model.is_system_vision_model = True
self.model.save()
result = LLMModel.get_system_vision_model()
self.assertEqual(result.pk, self.model.pk)
def test_get_system_vision_model_excludes_embedding_type(self):
"""Embedding models should not be returned as vision model."""
embed = LLMModel.objects.create(
api=self.api,
name="embed-only",
model_type="embedding",
context_window=8191,
is_system_vision_model=True,
)
result = LLMModel.get_system_vision_model()
self.assertIsNone(result)
def test_get_system_vision_model_excludes_inactive(self):
LLMModel.objects.create(
api=self.api,
name="inactive-vision",
model_type="vision",
context_window=8192,
is_system_vision_model=True,
is_active=False,
)
result = LLMModel.get_system_vision_model()
self.assertIsNone(result)
def test_get_system_model_returns_none(self): def test_get_system_model_returns_none(self):
"""Returns None when no system model is configured.""" """Returns None when no system model is configured."""
self.assertIsNone(LLMModel.get_system_embedding_model()) self.assertIsNone(LLMModel.get_system_embedding_model())
self.assertIsNone(LLMModel.get_system_chat_model()) self.assertIsNone(LLMModel.get_system_chat_model())
self.assertIsNone(LLMModel.get_system_reranker_model()) self.assertIsNone(LLMModel.get_system_reranker_model())
self.assertIsNone(LLMModel.get_system_vision_model())
class LLMUsageModelTest(TestCase): class LLMUsageModelTest(TestCase):
@@ -212,7 +255,7 @@ class LLMUsageModelTest(TestCase):
self.assertAlmostEqual(float(usage.total_cost), 0.01, places=4) self.assertAlmostEqual(float(usage.total_cost), 0.01, places=4)
def test_purpose_choices(self): def test_purpose_choices(self):
for purpose in ["responder", "reviewer", "embeddings", "search", "reranking", "multimodal_embed", "other"]: for purpose in ["responder", "reviewer", "embeddings", "search", "reranking", "multimodal_embed", "vision_analysis", "other"]:
usage = LLMUsage.objects.create( usage = LLMUsage.objects.create(
user=self.user, user=self.user,
model=self.model, model=self.model,
@@ -222,6 +265,18 @@ class LLMUsageModelTest(TestCase):
) )
self.assertEqual(usage.purpose, purpose) self.assertEqual(usage.purpose, purpose)
def test_vision_analysis_purpose(self):
"""Vision analysis usage can be tracked."""
usage = LLMUsage.objects.create(
user=self.user,
model=self.model,
input_tokens=500,
output_tokens=200,
purpose="vision_analysis",
)
self.assertEqual(usage.purpose, "vision_analysis")
self.assertGreater(usage.total_cost, 0)
def test_protect_model_delete(self): def test_protect_model_delete(self):
"""Deleting a model with usage records should raise ProtectedError.""" """Deleting a model with usage records should raise ProtectedError."""
LLMUsage.objects.create( LLMUsage.objects.create(

View File

@@ -242,6 +242,15 @@ LOGOUT_REDIRECT_URL = "/"
EMBEDDING_BATCH_SIZE = env.int("EMBEDDING_BATCH_SIZE", default=8) EMBEDDING_BATCH_SIZE = env.int("EMBEDDING_BATCH_SIZE", default=8)
EMBEDDING_TIMEOUT = env.int("EMBEDDING_TIMEOUT", default=120) EMBEDDING_TIMEOUT = env.int("EMBEDDING_TIMEOUT", default=120)
# --- Search & Re-ranking (Phase 3) ---
SEARCH_VECTOR_TOP_K = env.int("SEARCH_VECTOR_TOP_K", default=50)
SEARCH_FULLTEXT_TOP_K = env.int("SEARCH_FULLTEXT_TOP_K", default=30)
SEARCH_GRAPH_MAX_DEPTH = env.int("SEARCH_GRAPH_MAX_DEPTH", default=2)
SEARCH_RRF_K = env.int("SEARCH_RRF_K", default=60)
SEARCH_DEFAULT_LIMIT = env.int("SEARCH_DEFAULT_LIMIT", default=20)
RERANKER_MAX_CANDIDATES = env.int("RERANKER_MAX_CANDIDATES", default=32)
RERANKER_TIMEOUT = env.int("RERANKER_TIMEOUT", default=30)
# --- Themis app settings --- # --- Themis app settings ---
THEMIS_APP_NAME = "Mnemosyne" THEMIS_APP_NAME = "Mnemosyne"
THEMIS_NOTIFICATION_POLL_INTERVAL = 60 THEMIS_NOTIFICATION_POLL_INTERVAL = 60