Add Themis application with custom widgets, views, and utilities
- Implemented custom form widgets for date, time, and datetime fields with DaisyUI styling.
- Created utility functions for formatting dates, times, and numbers according to user preferences.
- Developed views for profile settings, API key management, and notifications, including health check endpoints.
- Added URL configurations for Themis tests and main application routes.
- Established test cases for custom widgets to ensure proper functionality and integration.
- Defined project metadata and dependencies in pyproject.toml for package management.
docs/mnemosyne.html — 732 lines (new file)
@@ -0,0 +1,732 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Mnemosyne — Architecture Documentation</title>
<link href="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0/dist/css/bootstrap.min.css" rel="stylesheet">
<link href="https://cdn.jsdelivr.net/npm/bootstrap-icons@1.11.0/font/bootstrap-icons.css" rel="stylesheet">
<script src="https://cdn.jsdelivr.net/npm/mermaid@10/dist/mermaid.min.js"></script>
<script>mermaid.initialize({ startOnLoad: true, theme: 'default' });</script>
</head>
<body>
<div class="container-fluid">
<nav class="navbar navbar-dark bg-dark rounded mb-4">
<div class="container-fluid">
<a class="navbar-brand" href="#"><i class="bi bi-book"></i> Mnemosyne — Architecture Documentation</a>
<div class="navbar-nav d-flex flex-row">
<a class="nav-link me-3" href="#overview">Overview</a>
<a class="nav-link me-3" href="#architecture">Architecture</a>
<a class="nav-link me-3" href="#data-model">Data Model</a>
<a class="nav-link me-3" href="#content-types">Content Types</a>
<a class="nav-link me-3" href="#multimodal-pipeline">Multimodal</a>
<a class="nav-link me-3" href="#search-pipeline">Search</a>
<a class="nav-link me-3" href="#mcp-interface">MCP</a>
<a class="nav-link me-3" href="#gpu-services">GPU</a>
<a class="nav-link" href="#deployment">Deployment</a>
</div>
</div>
</nav>
<div class="row">
<div class="col-12">
<h1 class="display-4 mb-2"><i class="bi bi-book-fill"></i> Mnemosyne <span class="badge bg-primary">Architecture</span></h1>
<p class="lead text-muted fst-italic">"The electric light did not come from the continuous improvement of candles." — Oren Harari</p>
<p class="lead">Mnemosyne is a content-type-aware, multimodal personal knowledge management system built on Neo4j knowledge graphs and Qwen3-VL multimodal AI. Named after the Titan goddess of memory, it understands <em>what kind</em> of knowledge it holds and makes it searchable through text, images, and natural language.</p>
</div>
</div>

<!-- SECTION: OVERVIEW -->
<section id="overview" class="mb-5">
<h2 class="h2 mb-4"><i class="bi bi-info-circle"></i> Overview</h2>

<div class="alert alert-primary border-start border-4 border-primary">
<h3>Purpose</h3>
<p><strong>Mnemosyne</strong> is a personal knowledge management system that treats content type as a first-class concept. Unlike generic knowledge bases that treat all documents identically, Mnemosyne understands the difference between a novel, a technical manual, album artwork, and a journal entry — and adjusts its chunking, embedding, search, and LLM prompting accordingly.</p>
</div>
<div class="row g-4 mb-4">
<div class="col-lg-4">
<div class="card h-100">
<div class="card-body">
<h3 class="card-title text-primary"><i class="bi bi-diagram-3"></i> Knowledge Graph</h3>
<ul class="mb-0">
<li>Neo4j stores relationships between content, not just vectors</li>
<li>Author → Book → Character → Theme traversals</li>
<li>Artist → Album → Track → Genre connections</li>
<li>No vector dimension limits (full 4096d Qwen3-VL)</li>
<li>Graph + vector + full-text search in one database</li>
</ul>
</div>
</div>
</div>
<div class="col-lg-4">
<div class="card h-100">
<div class="card-body">
<h3 class="card-title text-primary"><i class="bi bi-eye"></i> Multimodal AI</h3>
<ul class="mb-0">
<li>Qwen3-VL-Embedding: text + images + video in one vector space</li>
<li>Qwen3-VL-Reranker: cross-attention scoring across modalities</li>
<li>Album art, diagrams, screenshots become searchable</li>
<li>Local GPU inference (5090 + 3090) — zero API costs</li>
<li>llama.cpp text fallback via existing Ansible/systemd infra</li>
</ul>
</div>
</div>
</div>
<div class="col-lg-4">
<div class="card h-100">
<div class="card-body">
<h3 class="card-title text-primary"><i class="bi bi-tags"></i> Content-Type Awareness</h3>
<ul class="mb-0">
<li>Library types define chunking, embedding, and prompt behavior</li>
<li>Fiction: narrative-aware chunking, character extraction</li>
<li>Technical: section-aware, code block preservation</li>
<li>Music: lyrics as primary, metadata-heavy (genre, mood)</li>
<li>Each type injects context into the LLM prompt</li>
</ul>
</div>
</div>
</div>
</div>

<div class="alert alert-info border-start border-4 border-info">
<h3>Key Differentiators</h3>
<ul class="mb-0">
<li><strong>Content-type-aware pipeline</strong> — chunking, embedding instructions, re-ranking instructions, and LLM context all adapt per library type</li>
<li><strong>Neo4j knowledge graph</strong> — traversable relationships, not just flat vector similarity</li>
<li><strong>Full multimodal</strong> — Qwen3-VL processes images, diagrams, album art alongside text in a unified vector space</li>
<li><strong>No dimension limits</strong> — Neo4j handles 4096d vectors natively (pgvector caps at 2000)</li>
<li><strong>MCP-first interface</strong> — designed for LLM integration from day one</li>
<li><strong>Proven RAG architecture</strong> — two-stage responder/reviewer pattern inherited from Spelunker</li>
<li><strong>Local GPU inference</strong> — zero ongoing API costs via vLLM + llama.cpp on RTX 5090/3090</li>
</ul>
</div>
<div class="alert alert-secondary border-start border-4 border-secondary">
<h3>Heritage</h3>
<p class="mb-0">Mnemosyne's RAG pipeline architecture is inspired by <strong>Spelunker</strong>, an enterprise RFP response platform built on Django, PostgreSQL/pgvector, and LangChain. Its proven patterns — hybrid search, two-stage RAG, citation-based retrieval, async document processing, and SME-approved knowledge bases — are carried forward and enhanced with multimodal capabilities and knowledge graph relationships. Patterns that prove themselves in Mnemosyne will, in turn, be backported to Spelunker.</p>
</div>
</section>

<!-- SECTION: ARCHITECTURE -->
<section id="architecture" class="mb-5">
<h2 class="h2 mb-4"><i class="bi bi-diagram-3"></i> System Architecture</h2>

<div class="card mb-4">
<div class="card-header bg-primary text-white"><h3 class="mb-0"><i class="bi bi-diagram-3"></i> High-Level Architecture</h3></div>
<div class="card-body">
<div class="mermaid">
graph TB
subgraph Clients["Client Layer"]
MCP["MCP Clients<br/>(Claude, Copilot, etc.)"]
UI["Django Web UI"]
API["REST API (DRF)"]
end
subgraph App["Application Layer — Django"]
Core["core/<br/>Users, Auth"]
Library["library/<br/>Libraries, Collections, Items"]
Engine["engine/<br/>Embedding, Search, Reranker, RAG"]
MCPServer["mcp_server/<br/>MCP Tool Interface"]
Importers["importers/<br/>File, Calibre, Web"]
end

subgraph Data["Data Layer"]
Neo4j["Neo4j 5.x<br/>Knowledge Graph + Vectors"]
PG["PostgreSQL<br/>Auth, Config, Analytics"]
S3["S3/MinIO<br/>Content + Chunks"]
RMQ["RabbitMQ<br/>Task Queue"]
end

subgraph GPU["GPU Services"]
vLLM_E["vLLM<br/>Qwen3-VL-Embedding-8B<br/>(Multimodal Embed)"]
vLLM_R["vLLM<br/>Qwen3-VL-Reranker-8B<br/>(Multimodal Rerank)"]
LCPP["llama.cpp<br/>Qwen3-Reranker-0.6B<br/>(Text Fallback)"]
LCPP_C["llama.cpp<br/>Qwen3 Chat<br/>(RAG Responder)"]
end

MCP --> MCPServer
UI --> Core
API --> Library
API --> Engine
MCPServer --> Engine
MCPServer --> Library

Library --> Neo4j
Engine --> Neo4j
Engine --> S3
Core --> PG
Engine --> vLLM_E
Engine --> vLLM_R
Engine --> LCPP
Engine --> LCPP_C
Library --> RMQ
</div>
</div>
</div>
<div class="row g-4 mb-4">
<div class="col-md-6">
<div class="card">
<div class="card-header bg-primary text-white"><h4 class="mb-0"><i class="bi bi-folder"></i> Django Apps</h4></div>
<div class="card-body">
<ul class="list-group list-group-flush">
<li class="list-group-item"><strong>core/</strong> — Users, authentication, profiles, permissions</li>
<li class="list-group-item"><strong>library/</strong> — Libraries, Collections, Items, Chunks, Concepts (Neo4j models)</li>
<li class="list-group-item"><strong>engine/</strong> — Embedding, search, reranker, RAG pipeline services</li>
<li class="list-group-item"><strong>mcp_server/</strong> — MCP tool definitions and server interface</li>
<li class="list-group-item"><strong>importers/</strong> — Content acquisition (file upload, Calibre, web scrape)</li>
<li class="list-group-item"><strong>llm_manager/</strong> — LLM API/model config, usage tracking (from Spelunker)</li>
</ul>
</div>
</div>
</div>
<div class="col-md-6">
<div class="card">
<div class="card-header bg-success text-white"><h4 class="mb-0"><i class="bi bi-stack"></i> Technology Stack</h4></div>
<div class="card-body">
<ul>
<li><strong>Django 5.x</strong>, Python ≥3.12, Django REST Framework</li>
<li><strong>Neo4j 5.x</strong> + django-neomodel — knowledge graph + vector index</li>
<li><strong>PostgreSQL</strong> — Django auth, config, analytics only</li>
<li><strong>S3/MinIO</strong> — all content and chunk storage</li>
<li><strong>Celery + RabbitMQ</strong> — async embedding and graph construction</li>
<li><strong>vLLM ≥0.14</strong> — Qwen3-VL multimodal serving</li>
<li><strong>llama.cpp</strong> — text model serving (existing Ansible infra)</li>
<li><strong>MCP SDK</strong> — Model Context Protocol server</li>
</ul>
</div>
</div>
</div>
</div>
<h3 class="mt-4">Project Structure</h3>
<pre class="bg-light p-3 rounded"><code>mnemosyne/
├── mnemosyne/           # Django settings, URLs, WSGI/ASGI
├── core/                # Users, auth, profiles
├── library/             # Neo4j models (Library, Collection, Item, Chunk, Concept)
├── engine/              # RAG pipeline services
│   ├── embeddings.py    # Qwen3-VL embedding client
│   ├── reranker.py      # Qwen3-VL reranker client
│   ├── search.py        # Hybrid search (vector + graph + full-text)
│   ├── pipeline.py      # Two-stage RAG (responder + reviewer)
│   ├── llm_client.py    # OpenAI-compatible LLM client
│   └── content_types.py # Library type definitions
├── mcp_server/          # MCP tool definitions
├── importers/           # Content import tools
├── llm_manager/         # LLM API/model config (ported from Spelunker)
├── static/
├── templates/
├── docker-compose.yml
├── pyproject.toml
└── manage.py</code></pre>
</section>

<!-- SECTION: DATA MODEL -->
<section id="data-model" class="mb-5">
<h2 class="h2 mb-4"><i class="bi bi-database"></i> Data Model — Neo4j Knowledge Graph</h2>
<div class="alert alert-info border-start border-4 border-info">
<h3>Dual Database Strategy</h3>
<p class="mb-0"><strong>Neo4j</strong> stores all content knowledge: libraries, collections, items, chunks, concepts, and their relationships + vector embeddings. <strong>PostgreSQL</strong> stores only Django operational data: users, auth, LLM configurations, analytics, and Celery results. Content never lives in PostgreSQL.</p>
</div>

<div class="card mb-4">
<div class="card-header bg-primary text-white"><h3 class="mb-0"><i class="bi bi-diagram-2"></i> Graph Schema</h3></div>
<div class="card-body">
<div class="mermaid">
graph LR
L["Library<br/>(fiction, technical,<br/>music, art, journal)"] -->|CONTAINS| Col["Collection<br/>(genre, author,<br/>artist, project)"]
Col -->|CONTAINS| I["Item<br/>(book, manual,<br/>album, film, entry)"]
I -->|HAS_CHUNK| Ch["Chunk<br/>(text + optional image<br/>+ 4096d vector)"]
I -->|REFERENCES| Con["Concept<br/>(person, topic,<br/>technique, theme)"]
I -->|RELATED_TO| I
Con -->|RELATED_TO| Con
Ch -->|MENTIONS| Con
I -->|HAS_IMAGE| Img["Image<br/>(cover, diagram,<br/>artwork, still)"]
Img -->|HAS_EMBEDDING| ImgE["ImageEmbedding<br/>(4096d multimodal<br/>vector)"]
</div>
</div>
</div>
<div class="row g-4 mb-4">
<div class="col-md-6">
<div class="card h-100">
<div class="card-header bg-primary text-white"><h4 class="mb-0">Core Nodes</h4></div>
<div class="card-body">
<table class="table table-sm">
<thead><tr><th>Node</th><th>Key Properties</th><th>Vector?</th></tr></thead>
<tbody>
<tr><td><strong>Library</strong></td><td>name, library_type, chunking_config, embedding_instruction, llm_context_prompt</td><td>No</td></tr>
<tr><td><strong>Collection</strong></td><td>name, description, metadata</td><td>No</td></tr>
<tr><td><strong>Item</strong></td><td>title, item_type, s3_key, content_hash, metadata, created_at</td><td>No</td></tr>
<tr><td><strong>Chunk</strong></td><td>chunk_index, chunk_s3_key, chunk_size, text_preview, embedding (4096d)</td><td><strong>Yes</strong></td></tr>
<tr><td><strong>Concept</strong></td><td>name, concept_type, embedding (4096d)</td><td><strong>Yes</strong></td></tr>
<tr><td><strong>Image</strong></td><td>s3_key, image_type, description, metadata</td><td>No</td></tr>
<tr><td><strong>ImageEmbedding</strong></td><td>embedding (4096d multimodal)</td><td><strong>Yes</strong></td></tr>
</tbody>
</table>
</div>
</div>
</div>
<div class="col-md-6">
<div class="card h-100">
<div class="card-header bg-success text-white"><h4 class="mb-0">Relationships</h4></div>
<div class="card-body">
<table class="table table-sm">
<thead><tr><th>Relationship</th><th>From → To</th><th>Properties</th></tr></thead>
<tbody>
<tr><td><strong>CONTAINS</strong></td><td>Library → Collection</td><td>—</td></tr>
<tr><td><strong>CONTAINS</strong></td><td>Collection → Item</td><td>position</td></tr>
<tr><td><strong>HAS_CHUNK</strong></td><td>Item → Chunk</td><td>—</td></tr>
<tr><td><strong>HAS_IMAGE</strong></td><td>Item → Image</td><td>image_role</td></tr>
<tr><td><strong>HAS_EMBEDDING</strong></td><td>Image → ImageEmbedding</td><td>—</td></tr>
<tr><td><strong>REFERENCES</strong></td><td>Item → Concept</td><td>relevance</td></tr>
<tr><td><strong>MENTIONS</strong></td><td>Chunk → Concept</td><td>—</td></tr>
<tr><td><strong>RELATED_TO</strong></td><td>Item → Item</td><td>relationship_type, weight</td></tr>
<tr><td><strong>RELATED_TO</strong></td><td>Concept → Concept</td><td>relationship_type</td></tr>
</tbody>
</table>
</div>
</div>
</div>
</div>
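As an illustration of how this schema might be populated, here is a small Python sketch that renders the Item → Chunk and Item → Concept edges above as parameterized Cypher. The helper names and exact property choices are assumptions for illustration, not Mnemosyne's actual code.

```python
# Hypothetical helpers that realize the schema above as parameterized Cypher.
# Function names and the exact property set are illustrative assumptions.

def merge_item_chunk(title: str, item_type: str, idx: int, s3_key: str):
    """Cypher + params linking an Item to one of its Chunks via HAS_CHUNK."""
    query = (
        "MERGE (i:Item {title: $title, item_type: $item_type}) "
        "MERGE (c:Chunk {chunk_index: $idx, chunk_s3_key: $s3_key}) "
        "MERGE (i)-[:HAS_CHUNK]->(c)"
    )
    return query, {"title": title, "item_type": item_type, "idx": idx, "s3_key": s3_key}

def merge_reference(title: str, concept: str, relevance: float):
    """Cypher + params for an Item -[:REFERENCES {relevance}]-> Concept edge."""
    query = (
        "MATCH (i:Item {title: $title}) "
        "MERGE (con:Concept {name: $concept}) "
        "MERGE (i)-[r:REFERENCES]->(con) SET r.relevance = $relevance"
    )
    return query, {"title": title, "concept": concept, "relevance": relevance}

q, params = merge_item_chunk("Dune", "book", 0, "chunks/dune/0000.txt")
```

Executed through the Neo4j Python driver, each pair would be passed as `session.run(query, params)`.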
<div class="alert alert-warning border-start border-4 border-warning">
<h4><i class="bi bi-lightning"></i> Neo4j Vector Indexes</h4>
<pre class="bg-light p-3 rounded mb-0"><code>// Chunk text+image embeddings (4096 dimensions, no pgvector limits!)
CREATE VECTOR INDEX chunk_embedding FOR (c:Chunk)
ON (c.embedding) OPTIONS {indexConfig: {
`vector.dimensions`: 4096,
`vector.similarity_function`: 'cosine'
}}

// Concept embeddings for semantic concept search
CREATE VECTOR INDEX concept_embedding FOR (con:Concept)
ON (con.embedding) OPTIONS {indexConfig: {
`vector.dimensions`: 4096,
`vector.similarity_function`: 'cosine'
}}

// Image multimodal embeddings
CREATE VECTOR INDEX image_embedding FOR (ie:ImageEmbedding)
ON (ie.embedding) OPTIONS {indexConfig: {
`vector.dimensions`: 4096,
`vector.similarity_function`: 'cosine'
}}

// Full-text index for keyword/BM25-style search
CREATE FULLTEXT INDEX chunk_fulltext FOR (c:Chunk) ON EACH [c.text_preview]</code></pre>
</div>
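For reference, the cosine scoring these indexes perform can be sketched in plain Python. The normalization of the score into [0, 1] matches my understanding of how Neo4j reports cosine scores; treat it as an assumption.

```python
import math

def cosine_similarity(a, b):
    """Raw cosine similarity in [-1, 1], as selected by the
    vector.similarity_function 'cosine' option above."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def index_score(a, b):
    """Assumed Neo4j-style normalization of cosine into [0, 1]."""
    return (1 + cosine_similarity(a, b)) / 2
```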
</section>

<!-- SECTION: CONTENT TYPES -->
<section id="content-types" class="mb-5">
<h2 class="h2 mb-4"><i class="bi bi-tags"></i> Content Type System</h2>

<div class="alert alert-primary border-start border-4 border-primary">
<h3>The Core Innovation</h3>
<p class="mb-0">Each Library has a <strong>library_type</strong> that defines how content is chunked, what embedding instructions are sent to Qwen3-VL, what re-ranking instructions are used, and what context prompt is injected when the LLM generates answers. This is configured per library in the database — not hardcoded.</p>
</div>
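A minimal sketch of how such per-type configuration might be modeled in Python. The class and field names are assumptions for illustration; in Mnemosyne the values live in the database per library. The instruction strings are taken from the cards below.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LibraryTypeConfig:
    """Per-library-type behavior bundle (illustrative field names)."""
    chunking: str
    embedding_instruction: str
    reranker_instruction: str
    llm_context_prompt: str

LIBRARY_TYPES = {
    "fiction": LibraryTypeConfig(
        chunking="chapter_aware",
        embedding_instruction="Represent the narrative passage for literary retrieval, capturing themes, characters, and plot elements",
        reranker_instruction="Score relevance of this fiction excerpt to the query, considering narrative themes and character arcs",
        llm_context_prompt="The following excerpts are from fiction. Interpret as narrative — consider themes, symbolism, character development.",
    ),
    "technical": LibraryTypeConfig(
        chunking="section_aware",
        embedding_instruction="Represent the technical documentation for precise procedural retrieval",
        reranker_instruction="Score relevance of this technical documentation to the query, prioritizing procedural accuracy",
        llm_context_prompt="The following excerpts are from technical documentation. Provide precise, actionable instructions.",
    ),
}

def config_for(library_type: str) -> LibraryTypeConfig:
    """Look up the behavior bundle for a library's declared type."""
    return LIBRARY_TYPES[library_type]
```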
<div class="row g-4 mb-4">
<div class="col-md-4">
<div class="card h-100 border-primary">
<div class="card-header bg-primary text-white"><h5 class="mb-0"><i class="bi bi-book"></i> Fiction</h5></div>
<div class="card-body">
<p><strong>Chunking:</strong> Chapter-aware, preserve dialogue blocks, narrative flow</p>
<p><strong>Embedding Instruction:</strong> <em>"Represent the narrative passage for literary retrieval, capturing themes, characters, and plot elements"</em></p>
<p><strong>Reranker Instruction:</strong> <em>"Score relevance of this fiction excerpt to the query, considering narrative themes and character arcs"</em></p>
<p><strong>LLM Context:</strong> <em>"The following excerpts are from fiction. Interpret as narrative — consider themes, symbolism, character development."</em></p>
<p><strong>Multimodal:</strong> Cover art, illustrations</p>
<p><strong>Graph:</strong> Author → Book → Character → Theme</p>
</div>
</div>
</div>
<div class="col-md-4">
<div class="card h-100 border-success">
<div class="card-header bg-success text-white"><h5 class="mb-0"><i class="bi bi-gear"></i> Technical</h5></div>
<div class="card-body">
<p><strong>Chunking:</strong> Section/heading-aware, preserve code blocks and tables as atomic units</p>
<p><strong>Embedding Instruction:</strong> <em>"Represent the technical documentation for precise procedural retrieval"</em></p>
<p><strong>Reranker Instruction:</strong> <em>"Score relevance of this technical documentation to the query, prioritizing procedural accuracy"</em></p>
<p><strong>LLM Context:</strong> <em>"The following excerpts are from technical documentation. Provide precise, actionable instructions."</em></p>
<p><strong>Multimodal:</strong> Diagrams, screenshots, wiring diagrams</p>
<p><strong>Graph:</strong> Product → Manual → Section → Procedure → Tool</p>
</div>
</div>
</div>
<div class="col-md-4">
<div class="card h-100 border-info">
<div class="card-header bg-info text-white"><h5 class="mb-0"><i class="bi bi-music-note-beamed"></i> Music</h5></div>
<div class="card-body">
<p><strong>Chunking:</strong> Song-level (lyrics as one chunk), verse/chorus segmentation</p>
<p><strong>Embedding Instruction:</strong> <em>"Represent the song lyrics and album context for music discovery and thematic analysis"</em></p>
<p><strong>Reranker Instruction:</strong> <em>"Score relevance considering lyrical themes, musical context, and artist style"</em></p>
<p><strong>LLM Context:</strong> <em>"The following excerpts are song lyrics and music metadata. Interpret in musical and cultural context."</em></p>
<p><strong>Multimodal:</strong> Album artwork, liner note images</p>
<p><strong>Graph:</strong> Artist → Album → Track → Genre; Track → SAMPLES → Track</p>
</div>
</div>
</div>
</div>
<div class="row g-4 mb-4">
<div class="col-md-4">
<div class="card h-100 border-warning">
<div class="card-header bg-warning text-dark"><h5 class="mb-0"><i class="bi bi-film"></i> Film</h5></div>
<div class="card-body">
<p><strong>Chunking:</strong> Scene-level for scripts, paragraph-level for synopses</p>
<p><strong>Embedding Instruction:</strong> <em>"Represent the film content for cinematic retrieval, capturing visual and narrative elements"</em></p>
<p><strong>Multimodal:</strong> Movie stills, posters, screenshots</p>
<p><strong>Graph:</strong> Director → Film → Scene → Actor; Film → BASED_ON → Book</p>
</div>
</div>
</div>
<div class="col-md-4">
<div class="card h-100 border-danger">
<div class="card-header bg-danger text-white"><h5 class="mb-0"><i class="bi bi-palette"></i> Art</h5></div>
<div class="card-body">
<p><strong>Chunking:</strong> Description-level, catalog entry as unit</p>
<p><strong>Embedding Instruction:</strong> <em>"Represent the artwork and its description for visual and stylistic retrieval"</em></p>
<p><strong>Multimodal:</strong> <strong>The artwork itself</strong> — primary content is visual</p>
<p><strong>Graph:</strong> Artist → Piece → Style → Movement; Piece → INSPIRED_BY → Piece</p>
</div>
</div>
</div>
<div class="col-md-4">
<div class="card h-100 border-secondary">
<div class="card-header bg-secondary text-white"><h5 class="mb-0"><i class="bi bi-journal-text"></i> Journals</h5></div>
<div class="card-body">
<p><strong>Chunking:</strong> Entry-level (one entry = one chunk), paragraph split for long entries</p>
<p><strong>Embedding Instruction:</strong> <em>"Represent the personal journal entry for temporal and reflective retrieval"</em></p>
<p><strong>Multimodal:</strong> Photos, sketches attached to entries</p>
<p><strong>Graph:</strong> Date → Entry → Topic; Entry → MENTIONS → Person/Place</p>
</div>
</div>
</div>
</div>
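To make the "section-aware, code blocks atomic" behavior for technical libraries concrete, here is a deliberately simplified chunker sketch: it splits Markdown on heading lines but never inside a fenced code block. It is an assumption-laden illustration, not Mnemosyne's actual chunker, which would also enforce size limits and keep tables atomic.

```python
def chunk_technical(text: str) -> list[str]:
    """Split Markdown into heading-delimited chunks, keeping fenced
    code blocks atomic (simplified sketch)."""
    chunks: list[str] = []
    current: list[str] = []
    in_code = False
    for line in text.splitlines():
        if line.startswith("```"):
            in_code = not in_code              # track fenced code blocks
        if line.startswith("#") and not in_code and current:
            chunks.append("\n".join(current))  # flush at each new heading
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current))
    return chunks

doc = "# Install\nrun this\n```\n# a comment inside code, not a heading\n```\n# Configure\nedit the file"
parts = chunk_technical(doc)
```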
</section>

<!-- SECTION: MULTIMODAL PIPELINE -->
<section id="multimodal-pipeline" class="mb-5">
<h2 class="h2 mb-4"><i class="bi bi-eye-fill"></i> Multimodal Embedding & Re-ranking Pipeline</h2>

<div class="alert alert-primary border-start border-4 border-primary">
<h3>Two-Stage Multimodal Pipeline</h3>
<p><strong>Stage 1 — Embedding (Qwen3-VL-Embedding-8B):</strong> Generates 4096-dimensional vectors from text, images, screenshots, and video in a unified semantic space. Accepts content-type-specific instructions for optimized representations.</p>
<p class="mb-0"><strong>Stage 2 — Re-ranking (Qwen3-VL-Reranker-8B):</strong> Takes (query, document) pairs — where both can be multimodal — and outputs precise relevance scores via cross-attention. Dramatically sharpens retrieval accuracy.</p>
</div>

<div class="card mb-4">
<div class="card-header bg-success text-white"><h3 class="mb-0"><i class="bi bi-flow-chart"></i> Embedding & Ingestion Flow</h3></div>
<div class="card-body">
<div class="mermaid">
flowchart TD
A["New Content<br/>(file upload, import)"] --> B{"Content Type?"}
B -->|"Text (PDF, DOCX, MD)"| C["Parse Text<br/>+ Extract Images"]
B -->|"Image (art, photo)"| D["Image Only"]
B -->|"Mixed (manual + diagrams)"| E["Parse Text<br/>+ Keep Page Images"]
C --> F["Chunk Text<br/>(content-type-aware)"]
D --> G["Image to S3"]
E --> F
E --> G
F --> H["Store Chunks in S3"]
H --> I["Qwen3-VL-Embedding<br/>(text + instruction)"]
G --> J["Qwen3-VL-Embedding<br/>(image + instruction)"]
I --> K["4096d Vector"]
J --> K
K --> L["Store in Neo4j<br/>Chunk/ImageEmbedding Node"]
L --> M["Extract Concepts<br/>(LLM entity extraction)"]
M --> N["Create Concept Nodes<br/>+ REFERENCES/MENTIONS edges"]
</div>
</div>
</div>
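The text branch of the flow above, reduced to a dependency-injected Python sketch. In the real system chunking, embedding, and storage run as Celery tasks against Qwen3-VL, S3, and Neo4j; the lambdas below are stand-in assumptions so the control flow is runnable.

```python
from typing import Callable

def ingest_item(raw_text: str,
                chunker: Callable[[str], list[str]],
                embed: Callable[[str], list[float]],
                store: Callable[[int, str, list[float]], dict]) -> list[dict]:
    """Chunk -> embed -> store, mirroring the ingestion flow above."""
    records = []
    for idx, chunk in enumerate(chunker(raw_text)):
        vector = embed(chunk)                      # Qwen3-VL-Embedding in production
        records.append(store(idx, chunk, vector))  # S3 chunk + Neo4j node in production
    return records

# Stub wiring so the flow runs without GPUs or a database:
records = ingest_item(
    "alpha. beta.",
    chunker=lambda t: [s.strip() + "." for s in t.split(".") if s.strip()],
    embed=lambda c: [float(len(c))],               # fake 1-d "embedding"
    store=lambda i, c, v: {"index": i, "text": c, "vector": v},
)
```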
<div class="row g-4 mb-4">
<div class="col-md-6">
<div class="card h-100">
<div class="card-header bg-info text-white"><h4 class="mb-0">Qwen3-VL-Embedding-8B</h4></div>
<div class="card-body">
<ul>
<li><strong>Dimensions:</strong> 4096 (full), or MRL truncation to 3072/2048/1536/1024</li>
<li><strong>Input:</strong> Text, images, screenshots, video, or any mix</li>
<li><strong>Instruction-aware:</strong> Content-type instruction improves quality 1–5%</li>
<li><strong>Quantization:</strong> Int8 (~8GB VRAM), Int4 (~4GB VRAM)</li>
<li><strong>Serving:</strong> vLLM with <code>--runner pooling</code></li>
<li><strong>Languages:</strong> 30+ languages supported</li>
</ul>
</div>
</div>
</div>
<div class="col-md-6">
<div class="card h-100">
<div class="card-header bg-warning text-dark"><h4 class="mb-0">Qwen3-VL-Reranker-8B</h4></div>
<div class="card-body">
<ul>
<li><strong>Architecture:</strong> Single-tower cross-attention (deep query↔document interaction)</li>
<li><strong>Input:</strong> (query, document) pairs — both can be multimodal</li>
<li><strong>Output:</strong> Relevance score (sigmoid of yes/no token probabilities)</li>
<li><strong>Instruction-aware:</strong> Custom re-ranking instructions per content type</li>
<li><strong>Serving:</strong> vLLM with <code>--runner pooling</code> + score endpoint</li>
<li><strong>Fallback:</strong> Qwen3-Reranker-0.6B via llama.cpp (text-only)</li>
</ul>
</div>
</div>
</div>
</div>
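A sketch of how the Django side might prepare requests for these services over vLLM's OpenAI-compatible API. The payload shapes and the `Instruct:`/`Query:` prefix convention are assumptions drawn from the Qwen embedding family's usual usage, not verified against Mnemosyne's code; only request construction is shown, no network call.

```python
def embedding_request(model: str, text: str, instruction: str) -> dict:
    """Body for a POST to vLLM's /v1/embeddings endpoint (assumed shape).
    The instruction prefix implements the instruction-aware behavior above."""
    return {"model": model, "input": f"Instruct: {instruction}\nQuery: {text}"}

def rerank_request(model: str, query: str, documents: list[str]) -> dict:
    """Body for a POST to a score/rerank endpoint (assumed shape)."""
    return {"model": model, "query": query, "documents": documents}

req = embedding_request(
    "Qwen3-VL-Embedding-8B",
    "relay wiring diagram",
    "Represent the technical documentation for precise procedural retrieval",
)
```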
<div class="alert alert-info border-start border-4 border-info">
<h4><i class="bi bi-image"></i> Why Multimodal Matters</h4>
<p>Traditional RAG systems OCR images and diagrams, producing garbled text. Multimodal embedding understands the <em>visual content</em> directly:</p>
<ul class="mb-0">
<li><strong>Technical diagrams:</strong> Wiring diagrams, network topologies, architecture diagrams — searchable by visual content, not OCR garbage</li>
<li><strong>Album artwork:</strong> "psychedelic album covers from the 70s" finds matching art via visual similarity</li>
<li><strong>Art:</strong> The actual painting/sculpture becomes the searchable content, not just its text description</li>
<li><strong>PDF pages:</strong> Image-only PDF pages with charts and tables are embedded as images, not skipped</li>
</ul>
</div>
</section>

<!-- SECTION: SEARCH PIPELINE -->
<section id="search-pipeline" class="mb-5">
<h2 class="h2 mb-4"><i class="bi bi-search"></i> Search Pipeline — GraphRAG + Vector + Re-rank</h2>
<div class="card mb-4">
<div class="card-header bg-primary text-white"><h3 class="mb-0"><i class="bi bi-flow-chart"></i> Search Flow</h3></div>
<div class="card-body">
<div class="mermaid">
flowchart TD
Q["User Query"] --> E["Embed Query<br/>(Qwen3-VL-Embedding)"]
E --> VS["1. Vector Search<br/>(Neo4j vector index)<br/>Top-K × 3 oversample"]
E --> GT["2. Graph Traversal<br/>(Cypher queries)<br/>Concept + relationship walks"]
Q --> FT["3. Full-Text Search<br/>(Neo4j fulltext index)<br/>Keyword matching"]
VS --> F["Candidate Fusion<br/>+ Deduplication"]
GT --> F
FT --> F
F --> RR["4. Re-Rank<br/>(Qwen3-VL-Reranker)<br/>Cross-attention scoring"]
RR --> TK["Top-K Results"]
TK --> CTX["Inject Content-Type<br/>Context Prompt"]
CTX --> LLM["5. LLM Responder<br/>(Two-stage RAG)"]
LLM --> REV["6. LLM Reviewer<br/>(Quality + citation check)"]
REV --> ANS["Final Answer<br/>with Citations"]
</div>
</div>
</div>
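The "Candidate Fusion + Deduplication" step is not specified in detail here; one common choice for merging ranked lists from heterogeneous retrievers is reciprocal rank fusion (RRF), sketched below under that assumption.

```python
def rrf_fuse(result_lists: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal rank fusion: merge several ranked lists of chunk IDs.
    Scores from duplicate IDs accumulate, which also deduplicates."""
    scores: dict[str, float] = {}
    for results in result_lists:
        for rank, chunk_id in enumerate(results):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=lambda cid: scores[cid], reverse=True)

fused = rrf_fuse([
    ["c1", "c2", "c3"],  # vector search
    ["c2", "c4"],        # graph traversal
    ["c3", "c2"],        # full-text
])
```

Chunks surfaced by multiple retrievers (here `c2`) rise to the top before re-ranking.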
<div class="row g-4 mb-4">
<div class="col-md-4">
<div class="card h-100">
<div class="card-header bg-primary text-white"><h5 class="mb-0">1. Vector Search</h5></div>
<div class="card-body">
<p>Cosine similarity via Neo4j vector index on Chunk and ImageEmbedding nodes.</p>
<pre class="bg-light p-2 rounded"><code>CALL db.index.vector.queryNodes(
'chunk_embedding', 30,
$query_vector
) YIELD node, score
WHERE score > $threshold</code></pre>
</div>
</div>
</div>
<div class="col-md-4">
<div class="card h-100">
<div class="card-header bg-success text-white"><h5 class="mb-0">2. Graph Traversal</h5></div>
<div class="card-body">
<p>Walk relationships to find contextually related content that vector search alone would miss.</p>
<pre class="bg-light p-2 rounded"><code>MATCH (c:Chunk)-[:HAS_CHUNK]-(i:Item)
-[:REFERENCES]->(con:Concept)
-[:RELATED_TO]-(con2:Concept)
<-[:REFERENCES]-(i2:Item)
-[:HAS_CHUNK]->(c2:Chunk)
RETURN c2, i2</code></pre>
</div>
</div>
</div>
<div class="col-md-4">
<div class="card h-100">
<div class="card-header bg-info text-white"><h5 class="mb-0">3. Full-Text Search</h5></div>
<div class="card-body">
<p>Neo4j native full-text index for keyword matching (BM25-equivalent).</p>
<pre class="bg-light p-2 rounded"><code>CALL db.index.fulltext.queryNodes(
'chunk_fulltext',
$query_text
) YIELD node, score</code></pre>
</div>
</div>
</div>
</div>
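Steps 5–6, the two-stage responder/reviewer pattern, reduced to a sketch with the LLM calls injected as plain callables. The prompt wording and the APPROVED convention are illustrative assumptions, not Spelunker's or Mnemosyne's actual prompts.

```python
from typing import Callable

def answer_with_review(question: str,
                       context_chunks: list[str],
                       responder: Callable[[str], str],
                       reviewer: Callable[[str], str]) -> str:
    """Stage 1 drafts an answer from numbered context; stage 2 either
    approves the draft or returns a corrected answer."""
    context = "\n\n".join(f"[{i}] {c}" for i, c in enumerate(context_chunks, 1))
    draft = responder(f"Context:\n{context}\n\nQuestion: {question}\nCite sources as [n].")
    verdict = reviewer(f"Question: {question}\n\nDraft:\n{draft}\n\nContext:\n{context}\n"
                       "Reply APPROVED or a corrected answer.")
    return draft if verdict.strip() == "APPROVED" else verdict

# Fake LLMs so the control flow is runnable:
answer = answer_with_review(
    "Who wrote Dune?",
    ["Dune is a 1965 novel by Frank Herbert."],
    responder=lambda prompt: "Frank Herbert [1]",
    reviewer=lambda prompt: "APPROVED",
)
```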
</section>

<!-- SECTION: MCP INTERFACE -->
<section id="mcp-interface" class="mb-5">
<h2 class="h2 mb-4"><i class="bi bi-plug"></i> MCP Server Interface</h2>

<div class="alert alert-primary border-start border-4 border-primary">
<h3>MCP-First Design</h3>
<p class="mb-0">Mnemosyne exposes its capabilities as MCP tools, making the entire knowledge base accessible to Claude, Copilot, and any MCP-compatible LLM client. The MCP server is a primary interface, not an afterthought.</p>
</div>
<div class="row g-4 mb-4">
<div class="col-md-6">
<div class="card h-100">
<div class="card-header bg-primary text-white"><h4 class="mb-0">Search & Retrieval Tools</h4></div>
<div class="card-body">
<table class="table table-sm">
<thead><tr><th>Tool</th><th>Description</th></tr></thead>
<tbody>
<tr><td><code>search_library</code></td><td>Semantic + graph + full-text search with re-ranking. Filters by library, collection, content type.</td></tr>
<tr><td><code>ask_about</code></td><td>Full RAG pipeline — search, re-rank, content-type context injection, LLM response with citations.</td></tr>
<tr><td><code>find_similar</code></td><td>Find items similar to a given item using vector similarity. Optionally search across libraries.</td></tr>
<tr><td><code>search_by_image</code></td><td>Multimodal search — find content matching an uploaded image.</td></tr>
<tr><td><code>explore_connections</code></td><td>Traverse knowledge graph from an item — find related concepts, authors, themes.</td></tr>
</tbody>
</table>
</div>
</div>
</div>
<div class="col-md-6">
<div class="card h-100">
<div class="card-header bg-success text-white"><h4 class="mb-0">Management & Navigation Tools</h4></div>
<div class="card-body">
<table class="table table-sm">
<thead><tr><th>Tool</th><th>Description</th></tr></thead>
<tbody>
<tr><td><code>browse_libraries</code></td><td>List all libraries with their content types and item counts.</td></tr>
<tr><td><code>browse_collections</code></td><td>List collections within a library.</td></tr>
<tr><td><code>get_item</code></td><td>Get detailed info about a specific item, including metadata and graph connections.</td></tr>
<tr><td><code>add_content</code></td><td>Add new content to a library — triggers async embedding + graph construction.</td></tr>
<tr><td><code>get_concepts</code></td><td>List extracted concepts for an item or across a library.</td></tr>
</tbody>
</table>
</div>
</div>
</div>
</div>
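The tables above map tool names to behavior. A minimal sketch of what such a tool registry might look like — using a stand-in decorator rather than a real MCP SDK, with parameter names that are illustrative, not the project's actual signatures:

```python
# Hypothetical tool registry mirroring the MCP tool tables above.
TOOLS = {}

def mcp_tool(name, description):
    """Register a handler under a tool name (stand-in for an MCP SDK decorator)."""
    def register(fn):
        TOOLS[name] = {"description": description, "handler": fn}
        return fn
    return register

@mcp_tool("search_library", "Semantic + graph + full-text search with re-ranking.")
def search_library(query, library=None, collection=None, content_type=None):
    # A real implementation would run the hybrid retrieval pipeline;
    # this sketch just echoes the parsed arguments.
    return {
        "query": query,
        "filters": {"library": library, "collection": collection,
                    "content_type": content_type},
    }
```

An MCP client would then discover `search_library` from the registry and invoke its handler with the filter arguments shown in the table.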
</section>

<!-- SECTION: GPU SERVICES -->
<section id="gpu-services" class="mb-5">
<h2 class="h2 mb-4"><i class="bi bi-gpu-card"></i> GPU Services</h2>

<div class="row g-4 mb-4">
<div class="col-md-6">
<div class="card h-100">
<div class="card-header bg-primary text-white"><h4 class="mb-0">RTX 5090 (32GB VRAM)</h4></div>
<div class="card-body">
<table class="table table-sm">
<tbody>
<tr><td><strong>Model</strong></td><td>Qwen3-VL-Reranker-8B</td></tr>
<tr><td><strong>VRAM (bf16)</strong></td><td>~18GB</td></tr>
<tr><td><strong>Serving</strong></td><td>vLLM <code>--runner pooling</code></td></tr>
<tr><td><strong>Port</strong></td><td>:8001</td></tr>
<tr><td><strong>Role</strong></td><td>Multimodal re-ranking</td></tr>
<tr><td><strong>Headroom</strong></td><td>~14GB for chat model</td></tr>
</tbody>
</table>
</div>
</div>
</div>
<div class="col-md-6">
<div class="card h-100">
<div class="card-header bg-success text-white"><h4 class="mb-0">RTX 3090 (24GB VRAM)</h4></div>
<div class="card-body">
<table class="table table-sm">
<tbody>
<tr><td><strong>Model</strong></td><td>Qwen3-VL-Embedding-8B</td></tr>
<tr><td><strong>VRAM (bf16)</strong></td><td>~18GB</td></tr>
<tr><td><strong>Serving</strong></td><td>vLLM <code>--runner pooling</code></td></tr>
<tr><td><strong>Port</strong></td><td>:8002</td></tr>
<tr><td><strong>Role</strong></td><td>Multimodal embedding</td></tr>
<tr><td><strong>Headroom</strong></td><td>~6GB</td></tr>
</tbody>
</table>
</div>
</div>
</div>
</div>
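Assuming the embedding service exposes vLLM's OpenAI-compatible <code>/v1/embeddings</code> route (the port comes from the card above; the Hugging Face-style model path is a guess), a stdlib-only client might build requests like this:

```python
import json
from urllib import request

# Port :8002 is from the RTX 3090 card above; the model path is assumed.
EMBED_URL = "http://localhost:8002/v1/embeddings"

def build_embed_request(texts, model="Qwen/Qwen3-VL-Embedding-8B"):
    """Build an OpenAI-compatible embeddings request for the vLLM pooling service."""
    payload = {"model": model, "input": texts}
    return request.Request(
        EMBED_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )

# Sending the request requires the service to be up:
# req = build_embed_request(["What is a knowledge graph?"])
# with request.urlopen(req) as resp:
#     vectors = [d["embedding"] for d in json.load(resp)["data"]]
```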

<div class="alert alert-info border-start border-4 border-info">
<h4><i class="bi bi-arrow-repeat"></i> Fallback: llama.cpp (Existing Ansible Infra)</h4>
<p class="mb-0">Text-only Qwen3-Reranker-0.6B GGUF served via <code>llama-server</code> on existing systemd/Ansible infrastructure. Managed by the same playbooks, monitored by the same Grafana dashboards. Used when vLLM services are down or for text-only workloads.</p>
</div>
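The fallback policy above can be sketched as a small wrapper. The callables and exception types here are illustrative stand-ins, not the project's actual service clients:

```python
def rerank_with_fallback(query, chunks, vllm_rerank, llama_rerank):
    """Prefer the multimodal vLLM reranker (:8001); fall back to the
    text-only llama.cpp reranker when the vLLM service is unreachable."""
    try:
        return vllm_rerank(query, chunks)
    except (ConnectionError, TimeoutError):
        # vLLM service down: degrade gracefully to the text-only path.
        return llama_rerank(query, chunks)
```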
</section>

<!-- SECTION: DEPLOYMENT -->
<section id="deployment" class="mb-5">
<h2 class="h2 mb-4"><i class="bi bi-box-seam"></i> Deployment</h2>

<div class="row g-4 mb-4">
<div class="col-md-4">
<div class="card h-100">
<div class="card-header bg-primary text-white"><h4 class="mb-0">Core Services</h4></div>
<div class="card-body">
<ul class="mb-0">
<li><strong>web:</strong> Django app (Gunicorn)</li>
<li><strong>postgres:</strong> PostgreSQL (auth/config only)</li>
<li><strong>neo4j:</strong> Neo4j 5.x (knowledge graph + vectors)</li>
<li><strong>rabbitmq:</strong> Celery broker</li>
</ul>
</div>
</div>
</div>
<div class="col-md-4">
<div class="card h-100">
<div class="card-header bg-success text-white"><h4 class="mb-0">Async Processing</h4></div>
<div class="card-body">
<ul class="mb-0">
<li><strong>celery-worker:</strong> Embedding, graph construction</li>
<li><strong>celery-beat:</strong> Scheduled re-sync tasks</li>
</ul>
</div>
</div>
</div>
<div class="col-md-4">
<div class="card h-100">
<div class="card-header bg-info text-white"><h4 class="mb-0">Storage & Proxy</h4></div>
<div class="card-body">
<ul class="mb-0">
<li><strong>minio:</strong> S3-compatible content storage</li>
<li><strong>nginx:</strong> Static/proxy</li>
<li><strong>mcp-server:</strong> MCP interface process</li>
</ul>
</div>
</div>
</div>
</div>
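The celery-worker's embedding and graph-construction flow might look like the plain-Python sketch below. In the real app these would be Celery <code>@shared_task</code> functions chained together; the chunk size and helper names are invented for illustration:

```python
def chunk_item(item_text, max_chars=1200):
    """Naive fixed-size chunking, standing in for the real chunker."""
    return [item_text[i:i + max_chars] for i in range(0, len(item_text), max_chars)]

def ingest_item(item_text, embed, write_graph):
    """chunk -> embed -> write Chunk nodes to the graph.

    `embed` and `write_graph` are injected callables standing in for the
    vLLM embedding client and the Neo4j writer; returns the chunk count."""
    chunks = chunk_item(item_text)
    vectors = [embed(c) for c in chunks]
    write_graph(chunks, vectors)
    return len(chunks)
```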

<div class="alert alert-secondary border-start border-4 border-secondary">
<h4>Shared Infrastructure with Spelunker</h4>
<p class="mb-0">Mnemosyne and Spelunker share: GPU model services (llama.cpp + vLLM), MinIO/S3 (separate buckets), Neo4j (separate databases), RabbitMQ (separate vhosts), and Grafana monitoring. Each is its own Docker Compose stack but points to shared infra.</p>
</div>
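The isolation scheme above (separate buckets, databases, and vhosts against shared hosts) is typically expressed as per-stack settings. The variable and setting names below are illustrative only, not the project's actual configuration:

```python
import os

# Each stack points at shared infrastructure but isolates its own namespace.
SHARED_INFRA = {
    # Same MinIO host as Spelunker, different bucket.
    "MINIO_BUCKET": os.getenv("MNEMOSYNE_BUCKET", "mnemosyne"),
    # Same Neo4j server, separate database.
    "NEO4J_DATABASE": os.getenv("MNEMOSYNE_NEO4J_DB", "mnemosyne"),
    # Same RabbitMQ broker, separate vhost for Celery.
    "CELERY_BROKER_URL": os.getenv(
        "MNEMOSYNE_BROKER", "amqp://guest:guest@rabbitmq:5672/mnemosyne"
    ),
}
```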
</section>

<!-- SECTION: BACKPORT -->
<section id="backport" class="mb-5">
<h2 class="h2 mb-4"><i class="bi bi-arrow-left-right"></i> Backport Strategy to Spelunker</h2>

<div class="alert alert-warning border-start border-4 border-warning">
<h3>Build Forward, Backport Back</h3>
<p class="mb-0">Mnemosyne proves the architecture with no legacy constraints. Once validated, proven components flow back to Spelunker to enhance its RFP workflow with multimodal understanding and re-ranking precision.</p>
</div>

<table class="table table-bordered">
<thead class="table-dark"><tr><th>Component</th><th>Mnemosyne (Prove)</th><th>Spelunker (Backport)</th></tr></thead>
<tbody>
<tr><td><strong>RerankerService</strong></td><td>Qwen3-VL multimodal + llama.cpp text</td><td>Drop into <code>rag/services/reranker.py</code></td></tr>
<tr><td><strong>Multimodal Embedding</strong></td><td>Qwen3-VL-Embedding via vLLM</td><td>Add alongside OpenAI embeddings, MRL@1536d for pgvector compat</td></tr>
<tr><td><strong>Diagram Understanding</strong></td><td>Image pages embedded multimodally</td><td>PDF diagrams in RFP docs become searchable</td></tr>
<tr><td><strong>MCP Server</strong></td><td>Primary interface from day one</td><td>Add as secondary interface to Spelunker</td></tr>
<tr><td><strong>Neo4j (optional)</strong></td><td>Primary vector + graph store</td><td>Could replace pgvector, or run alongside</td></tr>
<tr><td><strong>Content-Type Config</strong></td><td>Library type definitions</td><td>Adapt as document classification in Spelunker</td></tr>
</tbody>
</table>
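The "MRL@1536d" note in the table refers to Matryoshka-style truncation: keep the first 1536 dimensions of the larger Qwen3-VL embedding and re-normalize, so the vector fits Spelunker's existing 1536-dimension pgvector column. A minimal sketch:

```python
import math

def mrl_truncate(vec, dim=1536):
    """Matryoshka-style truncation: keep the first `dim` components and
    re-normalize to unit length for cosine similarity in pgvector."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head)) or 1.0
    return [x / norm for x in head]
```

This only works well for embedding models trained with Matryoshka Representation Learning, where the leading dimensions carry most of the signal.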
</section>

<div class="alert alert-success border-start border-4 border-success mt-5">
<h3><i class="bi bi-check-circle"></i> Documentation Complete</h3>
<p class="mb-0">This document describes the target architecture for Mnemosyne. Phase implementation documents provide detailed build plans.</p>
</div>
</div>

<script src="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0/dist/js/bootstrap.bundle.min.js"></script>
</body>
</html>