Mnemosyne
"The electric light did not come from the continuous improvement of candles." — Oren Harari
The memory of everything you know.
Mnemosyne is a content-type-aware, multimodal personal knowledge management system built on Neo4j knowledge graphs and Qwen3-VL multimodal AI models. Named after the Titan goddess of memory and mother of the nine Muses, Mnemosyne doesn't just store your knowledge — it understands what kind of knowledge it is, connects it through relationships, and makes it all searchable through text, images, and natural language.
What Makes This Different
Most knowledge-base tools treat all documents identically: text in, chunks out, vectors stored. A novel and a PostgreSQL manual get the same treatment.
Mnemosyne knows the difference:
- A textbook has chapters, an index, technical terminology, and pedagogical structure. It's chunked accordingly, and when an LLM retrieves results, it knows this is instructional content.
- A novel has narrative flow, characters, plot arcs, dialogue. The LLM knows to interpret results as creative fiction.
- Album artwork is a visual asset tied to an artist, genre, and era. It's embedded multimodally — searchable by both image similarity and text description.
- A journal entry is personal, temporal, reflective. The LLM treats it differently than a reference manual.
This content-type awareness flows through every layer: chunking strategy, embedding instructions, re-ranking, and the final LLM prompt.
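One way to picture the chunking and embedding layer of that flow is a per-library-type strategy table. This is an illustrative sketch only: the names (`ChunkingStrategy`, `LIBRARY_CHUNKING`, `strategy_for`) and the specific sizes and instructions are assumptions, not Mnemosyne's actual API.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ChunkingStrategy:
    splitter: str           # how chunk boundaries are chosen
    chunk_size: int         # target tokens per chunk
    overlap: int            # token overlap between adjacent chunks
    embed_instruction: str  # instruction prefixed to text before embedding

# Hypothetical mapping from library_type to strategy.
LIBRARY_CHUNKING = {
    "technical": ChunkingStrategy("by_section", 512, 64,
        "Represent this technical documentation passage for retrieval:"),
    "fiction": ChunkingStrategy("by_scene", 1024, 128,
        "Represent this passage of narrative fiction for retrieval:"),
    "journals": ChunkingStrategy("by_entry", 768, 0,
        "Represent this personal journal entry for retrieval:"),
}

def strategy_for(library_type: str) -> ChunkingStrategy:
    # Unknown types fall back to a generic fixed-size split.
    return LIBRARY_CHUNKING.get(
        library_type,
        ChunkingStrategy("fixed", 512, 64, "Represent this passage:"))
```

The same table could carry re-ranking hints and LLM prompt fragments, so every downstream stage reads its behavior from one place.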
Core Architecture
| Component | Technology | Purpose |
|---|---|---|
| Knowledge Graph | Neo4j 5.x | Relationships + vector storage (no dimension limits) |
| Multimodal Embeddings | Qwen3-VL-Embedding-8B | Text + image + video in unified vector space (4096d) |
| Multimodal Re-ranking | Synesis (Qwen3-VL-Reranker-2B) | Cross-attention precision scoring via /v1/rerank |
| Web Framework | Django 5.x + DRF | Auth, admin, API, content management |
| Object Storage | S3/MinIO | Original content + chunk text storage |
| Async Processing | Celery + RabbitMQ | Document embedding, graph construction |
| LLM Interface | MCP Server | Primary interface for Claude, Copilot, etc. |
| GPU Serving | vLLM + llama.cpp | Local model inference |
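The Synesis `/v1/rerank` endpoint in the table above can be reached with a thin HTTP client along these lines. The payload and response shapes here follow the common rerank-API convention (query plus candidate documents in, `index`/`relevance_score` pairs out) and are assumptions, as is the localhost URL; check the actual Synesis schema before relying on them.

```python
import json
from urllib import request as urlrequest

SYNESIS_URL = "http://localhost:8000/v1/rerank"  # assumed default

def build_rerank_payload(query, documents, top_n=10):
    # Assumed request shape for the Synesis rerank API.
    return {
        "model": "Qwen3-VL-Reranker-2B",
        "query": query,
        "documents": documents,
        "top_n": top_n,
    }

def parse_rerank_response(body, documents):
    # Map scored indices back to the original candidate documents.
    return [(documents[r["index"]], r["relevance_score"])
            for r in body["results"]]

def rerank(query, documents, top_n=10):
    data = json.dumps(build_rerank_payload(query, documents, top_n)).encode()
    req = urlrequest.Request(
        SYNESIS_URL, data=data,
        headers={"Content-Type": "application/json"})
    with urlrequest.urlopen(req) as resp:
        return parse_rerank_response(json.load(resp), documents)
```

Keeping payload construction and response parsing as pure functions makes the client testable without a running Synesis instance.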
Library Types
| Library | Example Content | Multimodal? | Graph Relationships |
|---|---|---|---|
| Fiction | Novels, short stories | Cover art | Author → Book → Character → Theme |
| Technical | Textbooks, manuals, docs | Diagrams, screenshots | Product → Manual → Section → Procedure |
| Music | Lyrics, liner notes | Album artwork | Artist → Album → Track → Genre |
| Film | Scripts, synopses | Stills, posters | Director → Film → Scene → Actor |
| Art | Descriptions, catalogs | The artwork itself | Artist → Piece → Style → Movement |
| Journals | Personal entries | Photos | Date → Entry → Topic → Person/Place |
Search Pipeline
```
Query → Vector Search (Neo4j) + Graph Traversal (Cypher) + Full-Text Search
      → Candidate Fusion → Qwen3-VL Re-ranking → Content-Type Context Injection
      → LLM Response with Citations
```
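The "Candidate Fusion" step can be sketched as reciprocal rank fusion (RRF) over the three ranked result lists; whether Mnemosyne uses RRF or a different fusion scheme is an assumption here.

```python
from collections import defaultdict

def rrf_fuse(ranked_lists, k=60):
    """Merge several ranked lists of document ids into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears
    in, so documents ranked highly by multiple strategies rise to the
    top; k=60 is the constant from the original RRF paper.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

The fused list then goes to the re-ranker, which re-scores only the top candidates with cross-attention rather than the whole corpus.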
Heritage
Mnemosyne's RAG pipeline architecture is inspired by Spelunker, an enterprise RFP response platform. The proven patterns — hybrid search, two-stage RAG (responder + reviewer), citation-based retrieval, and async document processing — are carried forward and enhanced with multimodal capabilities and knowledge graph relationships.
Running Celery Workers
Mnemosyne uses Celery with RabbitMQ for async document embedding. From the mnemosyne/ directory:
```bash
# Development — single worker, all queues
celery -A mnemosyne worker -l info -Q celery,embedding,batch
```

Or skip workers entirely with eager mode (in `.env`):

```bash
CELERY_TASK_ALWAYS_EAGER=True
```
Production — separate workers:

```bash
celery -A mnemosyne worker -l info -Q embedding -c 1 -n embedding@%h  # GPU-bound embedding
celery -A mnemosyne worker -l info -Q batch -c 2 -n batch@%h          # Batch orchestration
celery -A mnemosyne worker -l info -Q celery -c 2 -n default@%h       # LLM API validation
```
Scheduler & Monitoring:

```bash
celery -A mnemosyne beat -l info        # Periodic task scheduler
celery -A mnemosyne flower --port=5555  # Web monitoring UI
```
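Routing tasks to those queues is normally done in the Django settings. This fragment is illustrative: the task paths are assumptions, not Mnemosyne's actual task names.

```python
# Hypothetical Celery routing/reliability settings (Django settings.py).
CELERY_TASK_ROUTES = {
    "documents.tasks.embed_document": {"queue": "embedding"},
    "documents.tasks.embed_batch": {"queue": "batch"},
    # Everything else falls through to the default "celery" queue.
}
CELERY_WORKER_PREFETCH_MULTIPLIER = 1  # fair dispatch for long GPU tasks
CELERY_TASK_ACKS_LATE = True           # re-deliver if a worker dies mid-task
```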
See Phase 2: Celery Workers & Scheduler for full details on queues, reliability settings, and task progress tracking.
Documentation
- Architecture Documentation — Full system architecture with diagrams
- Phase 1: Foundation — Project skeleton, Neo4j data model, content-type system
- Phase 2: Embedding Pipeline — Qwen3-VL multimodal embedding
- Phase 3: Search & Re-ranking — Hybrid search + re-ranker
- Phase 4: RAG Pipeline — Content-type-aware generation
- Phase 5: MCP Server — LLM integration interface
- Phase 6: Backport to Spelunker — Proven patterns flowing back