Add Themis application with custom widgets, views, and utilities

- Implemented custom form widgets for date, time, and datetime fields with DaisyUI styling.
- Created utility functions for formatting dates, times, and numbers according to user preferences.
- Developed views for profile settings, API key management, and notifications, including health check endpoints.
- Added URL configurations for Themis tests and main application routes.
- Established test cases for custom widgets to ensure proper functionality and integration.
- Defined project metadata and dependencies in pyproject.toml for package management.
commit 99bdb4ac92 (parent e99346d014)
2026-03-21 02:00:18 +00:00
351 changed files with 65123 additions and 2 deletions

.gitignore

@@ -174,3 +174,7 @@ cython_debug/
# PyPI configuration file
.pypirc
# Mnemosyne-specific
.env.local
/staticfiles/
/media/


@@ -1,6 +1,6 @@
MIT License
Copyright (c) 2026 r
Copyright (c) 2026 Helu.ca
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and
associated documentation files (the "Software"), to deal in the Software without restriction, including


@@ -1,2 +1,96 @@
# mnemosyne
# Mnemosyne
*"The electric light did not come from the continuous improvement of candles."* — Oren Harari
**The memory of everything you know.**
Mnemosyne is a content-type-aware, multimodal personal knowledge management system built on Neo4j knowledge graphs and Qwen3-VL multimodal AI models. Named after the Titan goddess of memory and mother of the nine Muses, Mnemosyne doesn't just store your knowledge — it understands what kind of knowledge it is, connects it through relationships, and makes it all searchable through text, images, and natural language.
## What Makes This Different
Every existing knowledge base tool treats all documents identically: text in, chunks out, vectors stored. A novel and a PostgreSQL manual get the same treatment.
Mnemosyne knows the difference:
- **A textbook** has chapters, an index, technical terminology, and pedagogical structure. It's chunked accordingly, and when an LLM retrieves results, it knows this is instructional content.
- **A novel** has narrative flow, characters, plot arcs, dialogue. The LLM knows to interpret results as creative fiction.
- **Album artwork** is a visual asset tied to an artist, genre, and era. It's embedded multimodally — searchable by both image similarity and text description.
- **A journal entry** is personal, temporal, reflective. The LLM treats it differently than a reference manual.
This **content-type awareness** flows through every layer: chunking strategy, embedding instructions, re-ranking, and the final LLM prompt.
## Core Architecture
| Component | Technology | Purpose |
|-----------|-----------|---------|
| **Knowledge Graph** | Neo4j 5.x | Relationships + vector storage (no dimension limits) |
| **Multimodal Embeddings** | Qwen3-VL-Embedding-8B | Text + image + video in unified vector space (4096d) |
| **Multimodal Re-ranking** | Qwen3-VL-Reranker-8B | Cross-attention precision scoring |
| **Text Fallback** | Qwen3-Reranker (llama.cpp) | Text-only re-ranking via GGUF |
| **Web Framework** | Django 5.x + DRF | Auth, admin, API, content management |
| **Object Storage** | S3/MinIO | Original content + chunk text storage |
| **Async Processing** | Celery + RabbitMQ | Document embedding, graph construction |
| **LLM Interface** | MCP Server | Primary interface for Claude, Copilot, etc. |
| **GPU Serving** | vLLM + llama.cpp | Local model inference |
## Library Types
| Library | Example Content | Multimodal? | Graph Relationships |
|---------|----------------|-------------|-------------------|
| **Fiction** | Novels, short stories | Cover art | Author → Book → Character → Theme |
| **Technical** | Textbooks, manuals, docs | Diagrams, screenshots | Product → Manual → Section → Procedure |
| **Music** | Lyrics, liner notes | Album artwork | Artist → Album → Track → Genre |
| **Film** | Scripts, synopses | Stills, posters | Director → Film → Scene → Actor |
| **Art** | Descriptions, catalogs | The artwork itself | Artist → Piece → Style → Movement |
| **Journals** | Personal entries | Photos | Date → Entry → Topic → Person/Place |
## Search Pipeline
```
Query → Vector Search (Neo4j) + Graph Traversal (Cypher) + Full-Text Search
→ Candidate Fusion → Qwen3-VL Re-ranking → Content-Type Context Injection
→ LLM Response with Citations
```
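The "Candidate Fusion" step is not specified in detail above; reciprocal rank fusion (RRF) is one common way to merge ranked candidate lists from the three retrieval paths. A minimal sketch (function name and UIDs are illustrative, not Mnemosyne's actual API):

```python
from collections import defaultdict

def fuse_candidates(ranked_lists: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked candidate UID lists with reciprocal rank fusion (RRF).

    Each inner list comes from one retrieval path (vector, graph, full-text).
    RRF score = sum over lists of 1 / (k + rank); k=60 is the usual default.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranked in ranked_lists:
        for rank, uid in enumerate(ranked, start=1):
            scores[uid] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Candidates appearing high in several lists rise to the top:
fused = fuse_candidates([
    ["c1", "c2", "c3"],   # vector search
    ["c2", "c4"],         # graph traversal
    ["c2", "c1"],         # full-text
])
# "c2" ranks first — it is present in all three lists
```

The fused head of the list then goes to the Qwen3-VL re-ranker for precision scoring.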
## Heritage
Mnemosyne's RAG pipeline architecture is inspired by [Spelunker](https://git.helu.ca/r/spelunker), an enterprise RFP response platform. The proven patterns — hybrid search, two-stage RAG (responder + reviewer), citation-based retrieval, and async document processing — are carried forward and enhanced with multimodal capabilities and knowledge graph relationships.
## Running Celery Workers
Mnemosyne uses Celery with RabbitMQ for async document embedding. From the `mnemosyne/` directory:
```bash
# Development — single worker, all queues
celery -A mnemosyne worker -l info -Q celery,embedding,batch
# Or skip workers entirely with eager mode (.env):
CELERY_TASK_ALWAYS_EAGER=True
```
**Production — separate workers:**
```bash
celery -A mnemosyne worker -l info -Q embedding -c 1 -n embedding@%h # GPU-bound embedding
celery -A mnemosyne worker -l info -Q batch -c 2 -n batch@%h # Batch orchestration
celery -A mnemosyne worker -l info -Q celery -c 2 -n default@%h # LLM API validation
```
**Scheduler & Monitoring:**
```bash
celery -A mnemosyne beat -l info # Periodic task scheduler
celery -A mnemosyne flower --port=5555 # Web monitoring UI
```
See [Phase 2: Celery Workers & Scheduler](docs/PHASE_2_EMBEDDING_PIPELINE.md#celery-workers--scheduler) for full details on queues, reliability settings, and task progress tracking.
## Documentation
- **[Architecture Documentation](docs/mnemosyne.html)** — Full system architecture with diagrams
- **[Phase 1: Foundation](docs/PHASE_1_FOUNDATION.md)** — Project skeleton, Neo4j data model, content-type system
- **[Phase 2: Embedding Pipeline](docs/PHASE_2_EMBEDDING_PIPELINE.md)** — Qwen3-VL multimodal embedding
- **[Phase 3: Search & Re-ranking](docs/PHASE_3_SEARCH_AND_RERANKING.md)** — Hybrid search + re-ranker
- **[Phase 4: RAG Pipeline](docs/PHASE_4_RAG_PIPELINE.md)** — Content-type-aware generation
- **[Phase 5: MCP Server](docs/PHASE_5_MCP_SERVER.md)** — LLM integration interface
- **[Phase 6: Backport to Spelunker](docs/PHASE_6_BACKPORT_TO_SPELUNKER.md)** — Proven patterns flowing back

254
docs/PHASE_1_FOUNDATION.md Normal file
View File

@@ -0,0 +1,254 @@
# Phase 1: Foundation
## Objective
Establish the project skeleton, Neo4j data model, Django integration, and content-type system. At the end of this phase, you can create libraries, collections, and items via Django admin, and the Neo4j graph is populated with the correct node/relationship structure.
## Deliverables
### 1. Django Project Skeleton
- Rename configuration module from `mnemosyne/mnemosyne/` to `mnemosyne/config/` per Red Panda Standards
- Create `pyproject.toml` at repo root with floor-pinned dependencies
- Create `.env` / `.env.example` for environment variables (never commit `.env`)
- Use a single `settings.py`, loading configuration from `.env` via `django-environ`
- Configure dual-database: PostgreSQL (Django auth/config) + Neo4j (content graph)
- Install and configure `django-neomodel` for Neo4j OGM integration
- Configure `djangorestframework` for API
- Configure Celery + RabbitMQ (Async Task pattern)
- Configure S3 storage backend via Incus buckets (MinIO-backed, Terraform-provisioned)
- Configure structured logging for Loki integration via Alloy
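Loki ingests single-line JSON log records well. A stdlib-only sketch of such a formatter (the class name and field set are illustrative, not a fixed Mnemosyne convention):

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON line for Loki/Alloy scraping."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "ts": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            payload["exc_info"] = self.formatException(record.exc_info)
        return json.dumps(payload)

# In Django LOGGING, reference it via: "()": "config.logging.JsonFormatter"
```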
### 2. Django Apps
| App | Purpose | Database |
|-----|---------|----------|
| `themis` (installed) | User profiles, preferences, API key management, navigation, notifications | PostgreSQL |
| `library/` | Libraries, Collections, Items, Chunks, Concepts | Neo4j (neomodel) |
| `llm_manager/` | LLM API/model config, usage tracking | PostgreSQL (ported from Spelunker) |
> **Note:** Themis replaces `core/`. User profiles, timezone preferences, theme management, API key storage (encrypted, Fernet), and standard navigation are all provided by Themis. No separate `core/` app is needed. If SSO (Casdoor) or Organization models are required in future, they will be added as separate apps following the SSO and Organization patterns.
### 3. Neo4j Graph Model (neomodel)
```python
# library/models.py
from neomodel import (
    ArrayProperty, DateTimeProperty, FloatProperty, IntegerProperty,
    JSONProperty, RelationshipTo, StringProperty, StructuredNode,
    StructuredRel, UniqueIdProperty,
)

class ReferencesRel(StructuredRel):
    """Relationship properties for Item -[REFERENCES]-> Concept (properties TBD)."""

class RelatedToRel(StructuredRel):
    """Relationship properties for Item -[RELATED_TO]-> Item (properties TBD)."""

class Library(StructuredNode):
    uid = UniqueIdProperty()
    name = StringProperty(unique_index=True, required=True)
    library_type = StringProperty(required=True)  # fiction, technical, music, film, art, journal
    description = StringProperty(default='')
    # Content-type configuration (stored as JSON strings)
    chunking_config = JSONProperty(default={})
    embedding_instruction = StringProperty(default='')
    reranker_instruction = StringProperty(default='')
    llm_context_prompt = StringProperty(default='')
    created_at = DateTimeProperty(default_now=True)
    collections = RelationshipTo('Collection', 'CONTAINS')

class Collection(StructuredNode):
    uid = UniqueIdProperty()
    name = StringProperty(required=True)
    description = StringProperty(default='')
    metadata = JSONProperty(default={})
    created_at = DateTimeProperty(default_now=True)
    items = RelationshipTo('Item', 'CONTAINS')
    library = RelationshipTo('Library', 'BELONGS_TO')

class Item(StructuredNode):
    uid = UniqueIdProperty()
    title = StringProperty(required=True)
    item_type = StringProperty(default='')
    s3_key = StringProperty(default='')
    content_hash = StringProperty(index=True)
    file_type = StringProperty(default='')
    file_size = IntegerProperty(default=0)
    metadata = JSONProperty(default={})
    created_at = DateTimeProperty(default_now=True)
    updated_at = DateTimeProperty(default_now=True)
    chunks = RelationshipTo('Chunk', 'HAS_CHUNK')
    images = RelationshipTo('Image', 'HAS_IMAGE')
    concepts = RelationshipTo('Concept', 'REFERENCES', model=ReferencesRel)
    related_items = RelationshipTo('Item', 'RELATED_TO', model=RelatedToRel)

class Chunk(StructuredNode):
    uid = UniqueIdProperty()
    chunk_index = IntegerProperty(required=True)
    chunk_s3_key = StringProperty(required=True)
    chunk_size = IntegerProperty(default=0)
    text_preview = StringProperty(default='')  # First 500 chars for full-text index
    embedding = ArrayProperty(FloatProperty())  # 4096d vector
    created_at = DateTimeProperty(default_now=True)
    mentions = RelationshipTo('Concept', 'MENTIONS')

class Concept(StructuredNode):
    uid = UniqueIdProperty()
    name = StringProperty(unique_index=True, required=True)
    concept_type = StringProperty(default='')  # person, place, topic, technique, theme
    embedding = ArrayProperty(FloatProperty())  # 4096d vector
    related_concepts = RelationshipTo('Concept', 'RELATED_TO')

class Image(StructuredNode):
    uid = UniqueIdProperty()
    s3_key = StringProperty(required=True)
    image_type = StringProperty(default='')  # cover, diagram, artwork, still, photo
    description = StringProperty(default='')
    metadata = JSONProperty(default={})
    created_at = DateTimeProperty(default_now=True)
    embeddings = RelationshipTo('ImageEmbedding', 'HAS_EMBEDDING')

class ImageEmbedding(StructuredNode):
    uid = UniqueIdProperty()
    embedding = ArrayProperty(FloatProperty())  # 4096d multimodal vector
    created_at = DateTimeProperty(default_now=True)
```
### 4. Neo4j Index Setup
Management command: `python manage.py setup_neo4j_indexes`
Creates vector indexes (4096d cosine), full-text indexes, and constraint indexes.
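The command body is not shown here; Neo4j 5's `CREATE VECTOR INDEX` DDL can be generated and executed through `neomodel.db.cypher_query`. A sketch of statement construction (index and property names are illustrative):

```python
def vector_index_cypher(index: str, label: str, prop: str, dims: int) -> str:
    """Build a Neo4j 5 vector index DDL statement (cosine similarity)."""
    return (
        f"CREATE VECTOR INDEX {index} IF NOT EXISTS "
        f"FOR (n:{label}) ON (n.{prop}) "
        "OPTIONS {indexConfig: {"
        f"`vector.dimensions`: {dims}, "
        "`vector.similarity_function`: 'cosine'}}"
    )

stmt = vector_index_cypher("chunk_embedding_idx", "Chunk", "embedding", 4096)
# Execute with: from neomodel import db; db.cypher_query(stmt)
```

`IF NOT EXISTS` makes re-running the command safe against an already-indexed database.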
### 5. Content-Type System
Default library type configurations loaded via management command (`python manage.py load_library_types`). A management command is preferred over fixtures because these configurations will evolve across releases, and the command can be re-run idempotently to update defaults without overwriting per-library customizations.
Default configurations:
| Library Type | Chunking Strategy | Embedding Instruction | LLM Context |
|-------------|-------------------|----------------------|-------------|
| fiction | chapter_aware | narrative retrieval | "Excerpts from fiction..." |
| technical | section_aware | procedural retrieval | "Excerpts from technical docs..." |
| music | song_level | music discovery | "Song lyrics and metadata..." |
| film | scene_level | cinematic retrieval | "Film content..." |
| art | description_level | visual/stylistic retrieval | "Artwork descriptions..." |
| journal | entry_level | temporal/reflective retrieval | "Personal journal entries..." |
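The idempotent re-run behavior can be sketched as a merge that fills in missing defaults without touching values an operator has customized (pure-dict sketch with abbreviated defaults; the real command writes to `Library` nodes):

```python
# Shipped defaults per library type (abbreviated — illustrative keys only)
DEFAULTS = {
    "fiction":   {"chunking_strategy": "chapter_aware", "chunk_size": 1024},
    "technical": {"chunking_strategy": "section_aware", "chunk_size": 512},
}

def apply_defaults(current: dict, library_type: str) -> dict:
    """Fill missing keys from the shipped defaults; keep customizations."""
    merged = dict(DEFAULTS[library_type])
    merged.update(current)  # existing (customized) values win
    return merged

# A library that overrode chunk_size keeps its value on re-run:
cfg = apply_defaults({"chunk_size": 2048}, "fiction")
# → {"chunking_strategy": "chapter_aware", "chunk_size": 2048}
```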
### 6. Admin & Management UI
`django-neomodel`'s admin support is limited — `StructuredNode` models don't participate in Django's ORM, so standard `ModelAdmin`, filters, search, and inlines don't work. Instead:
- **Custom admin views** for Library, Collection, and Item CRUD using Cypher/neomodel queries, rendered in Django admin's template structure
- **DRF management API** (`/api/v1/library/`, `/api/v1/collection/`, `/api/v1/item/`) for programmatic access and future frontend consumption
- Library CRUD includes content-type configuration editing
- Collection/Item views support filtering by library, type, and date
- All admin views extend `themis/base.html` for consistent navigation
### 7. LLM Manager (Port from Spelunker)
Copy and adapt `llm_manager/` app from Spelunker:
- `LLMApi` model (OpenAI-compatible API endpoints)
- `LLMModel` model (with new `reranker` and `multimodal_embed` model types)
- `LLMUsage` tracking
- **API key storage uses Themis `UserAPIKey`** — LLM Manager does not implement its own encrypted key storage. API credentials for LLM providers are stored via Themis's Fernet-encrypted `UserAPIKey` model with `key_type='api'` and appropriate `service_name` (e.g., "OpenAI", "Arke"). `LLMApi` references credentials by service name lookup against the requesting user's Themis keys.
Schema additions to Spelunker's `LLMModel`:
| Field | Change | Purpose |
|-------|--------|---------|
| `model_type` | Add choices: `reranker`, `multimodal_embed` | Support Qwen3-VL reranker and embedding models |
| `supports_multimodal` | New `BooleanField` | Flag models that accept image+text input |
| `vector_dimensions` | New `IntegerField` | Embedding output dimensions (e.g., 4096) |
### 8. Infrastructure Wiring (Ouranos)
All connections follow Ouranos DNS conventions — use `.incus` hostnames, never hardcode IPs.
| Service | Host | Connection | Settings Variable |
|---------|------|------------|-------------------|
| PostgreSQL | `portia.incus:5432` | Database `mnemosyne` (must be provisioned) | `DATABASE_URL` |
| Neo4j (Bolt) | `ariel.incus:25554` | Neo4j 5.26.0 | `NEOMODEL_NEO4J_BOLT_URL` |
| Neo4j (HTTP) | `ariel.incus:25584` | Browser/API access | — |
| RabbitMQ | `oberon.incus:5672` | Message broker | `CELERY_BROKER_URL` |
| S3 (Incus) | Terraform-provisioned Incus bucket | MinIO-backed object storage | `AWS_S3_ENDPOINT_URL`, `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_STORAGE_BUCKET_NAME` |
| Arke LLM Proxy | `sycorax.incus:25540` | LLM API routing | Configured per `LLMApi` record |
| SMTP (dev) | `oberon.incus:22025` | smtp4dev test server | `EMAIL_HOST` |
| Loki (logs) | `prospero.incus:3100` | Via Alloy agent (host-level, not app-level) | — |
| Casdoor SSO | `titania.incus:22081` | Future: SSO pattern | — |
**Terraform provisioning required before Phase 1 deployment:**
- PostgreSQL database `mnemosyne` on Portia
- Incus S3 bucket for Mnemosyne content storage
- HAProxy route: `mnemosyne.ouranos.helu.ca` → `puck.incus:<port>` (port TBD, assign next available in 22xxx range)
**Development environment (local):**
- PostgreSQL for Django ORM on `portia.incus`
- Local Neo4j instance or `ariel.incus` via SSH tunnel
- `django.core.files.storage.FileSystemStorage` in place of S3 (tests/dev)
- `CELERY_TASK_ALWAYS_EAGER=True` for synchronous task execution
### 9. Testing Strategy
Follows Red Panda Standards: Django `TestCase`, separate test files per module.
| Test File | Scope |
|-----------|-------|
| `library/tests/test_models.py` | Neo4j node creation, relationships, property validation |
| `library/tests/test_content_types.py` | `load_library_types` command, configuration retrieval per library |
| `library/tests/test_indexes.py` | `setup_neo4j_indexes` command execution |
| `library/tests/test_api.py` | DRF endpoints for Library/Collection/Item CRUD |
| `library/tests/test_admin_views.py` | Custom admin views render and submit correctly |
| `llm_manager/tests/test_models.py` | LLMApi, LLMModel creation, new model types |
| `llm_manager/tests/test_api.py` | LLM Manager API endpoints |
**Neo4j test strategy:**
- Tests use a dedicated Neo4j test database (separate from development/production)
- `NEOMODEL_NEO4J_BOLT_URL` overridden in test settings to point to test database
- Each test class clears its nodes in `setUp` / `tearDown` using `neomodel.clear_neo4j_database()`
- CI/CD (Gitea Runner on Puck) uses a Docker Neo4j instance for isolated test runs
- For local development without Neo4j, tests that require Neo4j are skipped via `@unittest.skipUnless(neo4j_available(), "Neo4j not available")`
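The skip guard can be a cheap TCP probe of the Bolt port (a sketch; in practice host and port come from settings, and the default port shown is Neo4j's standard Bolt port, not Ariel's):

```python
import socket

def neo4j_available(host: str = "localhost", port: int = 7687,
                    timeout: float = 0.5) -> bool:
    """Return True if something is listening on the Bolt port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Usage on a test class:
# @unittest.skipUnless(neo4j_available(), "Neo4j not available")
```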
## Dependencies
```toml
# pyproject.toml — floor-pinned with ceiling per Red Panda Standards
dependencies = [
"Django>=5.2,<6.0",
"djangorestframework>=3.14,<4.0",
"django-neomodel>=0.1,<1.0",
"neomodel>=5.3,<6.0",
"neo4j>=5.0,<6.0",
"celery>=5.3,<6.0",
"django-storages[boto3]>=1.14,<2.0",
"django-environ>=0.11,<1.0",
"psycopg[binary]>=3.1,<4.0",
"dj-database-url>=2.1,<3.0",
"shortuuid>=1.0,<2.0",
"gunicorn>=21.0,<24.0",
"cryptography>=41.0,<45.0",
"flower>=2.0,<3.0",
"pymemcache>=4.0,<5.0",
"django-heluca-themis",
]
```
## Success Criteria
- [ ] Config module renamed to `config/`, `pyproject.toml` at repo root with floor-pinned deps
- [ ] Settings load from environment variables via `django-environ` (`.env.example` provided)
- [ ] Django project runs with dual PostgreSQL + Neo4j databases
- [ ] Can create Library → Collection → Item through custom admin views
- [ ] DRF API endpoints return Library/Collection/Item data
- [ ] Neo4j graph shows correct node types and relationships
- [ ] Content-type configurations loaded via `load_library_types` and retrievable per library
- [ ] LLM Manager ported from Spelunker; uses Themis `UserAPIKey` for credential storage
- [ ] S3 storage configured against Incus bucket (Terraform-provisioned) and tested
- [ ] Celery worker connects to RabbitMQ on Oberon
- [ ] Structured logging configured (JSON format, compatible with Loki/Alloy)
- [ ] Tests pass for all Phase 1 apps (library, llm_manager)
- [ ] HAProxy route provisioned: `mnemosyne.ouranos.helu.ca`

docs/PHASE_2_EMBEDDING_PIPELINE.md

@@ -0,0 +1,498 @@
# Phase 2: Embedding Pipeline
## Objective
Build the complete document ingestion and embedding pipeline: upload content → parse (text + images) → chunk (content-type-aware) → embed via configurable model → store vectors in Neo4j → extract concepts for knowledge graph.
## Heritage
The embedding pipeline adapts proven patterns from [Spelunker](https://git.helu.ca/r/spelunker)'s `rag/services/embeddings.py` — semantic chunking, batch embedding, S3 chunk storage, and progress tracking — enhanced with multimodal capabilities, knowledge graph relationships, and content-type awareness.
## Architecture Overview
```
Upload (API/Admin)
→ S3 Storage (original file)
→ Document Parsing (PyMuPDF — text + images)
→ Content-Type-Aware Chunking (semantic-text-splitter)
→ Text Embedding (system embedding model via LLM Manager)
→ Image Embedding (multimodal model, if available)
→ Neo4j Graph Storage (Chunk nodes, Image nodes, vectors)
→ Concept Extraction (system chat model)
→ Knowledge Graph (Concept nodes, MENTIONS/REFERENCES edges)
```
## Deliverables
### 1. Document Parsing Service (`library/services/parsers.py`)
**Primary parser: PyMuPDF** — a single library handling all document formats with unified text + image extraction.
#### Supported Formats
| Format | Extensions | Text Extraction | Image Extraction |
|--------|-----------|----------------|-----------------|
| PDF | `.pdf` | Layout-preserving text | Embedded images, diagrams |
| EPUB | `.epub` | Chapter-structured HTML | Cover art, illustrations |
| DOCX | `.docx` | Via HTML conversion | Inline images, diagrams |
| PPTX | `.pptx` | Via HTML conversion | Slide images, charts |
| XLSX | `.xlsx` | Via HTML conversion | Embedded charts |
| XPS | `.xps` | Native | Native |
| MOBI | `.mobi` | Native | Native |
| FB2 | `.fb2` | Native | Native |
| CBZ | `.cbz` | Native | Native (comic pages) |
| Plain text | `.txt`, `.md` | Direct read | N/A |
| HTML | `.html`, `.htm` | PyMuPDF or direct | Inline images |
| Images | `.jpg`, `.png`, etc. | N/A (OCR future) | The image itself |
#### Text Sanitization
Ported from Spelunker's `text_utils.py`:
- Remove null bytes and control characters
- Remove zero-width characters
- Normalize Unicode to NFC
- Replace invalid UTF-8 sequences
- Clean PDF ligatures and artifacts
- Normalize whitespace
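The steps above can be approximated with stdlib tools (a sketch, not Spelunker's exact implementation; the ligature map is deliberately partial):

```python
import re
import unicodedata

# Common PDF ligature codepoints and their plain-text expansions (partial list)
LIGATURES = {"\ufb00": "ff", "\ufb01": "fi", "\ufb02": "fl",
             "\ufb03": "ffi", "\ufb04": "ffl"}

def sanitize(text: str) -> str:
    """Clean extracted text: control/zero-width chars, NFC, ligatures, whitespace."""
    # Drop null bytes and invalid-UTF-8 replacement chars
    text = text.replace("\x00", "").replace("\ufffd", "")
    # Expand PDF ligatures before normalization
    for lig, plain in LIGATURES.items():
        text = text.replace(lig, plain)
    text = unicodedata.normalize("NFC", text)
    # Remove zero-width characters
    text = re.sub(r"[\u200b\u200c\u200d\ufeff]", "", text)
    # Remove remaining control characters (keep \n and \t)
    text = "".join(c for c in text if c in "\n\t" or unicodedata.category(c)[0] != "C")
    # Collapse runs of spaces/tabs and trim line ends
    text = re.sub(r"[ \t]+", " ", text)
    return "\n".join(line.strip() for line in text.split("\n"))
```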
#### Image Extraction
For each document page/section, extract embedded images via `page.get_images()` and `doc.extract_image(xref)`:
- Raw image bytes (PNG/JPEG)
- Dimensions (width × height)
- Source page/position for chunk-image association
- Store in S3: `images/{item_uid}/{image_index}.{ext}`
#### Parse Result Structure
```python
from dataclasses import dataclass

@dataclass
class TextBlock:
    text: str
    page: int
    metadata: dict  # {heading_level, section_name, etc.}

@dataclass
class ExtractedImage:
    data: bytes
    ext: str  # png, jpg, etc.
    width: int
    height: int
    source_page: int
    source_index: int

@dataclass
class ParseResult:
    text_blocks: list[TextBlock]
    images: list[ExtractedImage]
    metadata: dict  # {page_count, title, author, etc.}
    file_type: str
```
### 2. Content-Type-Aware Chunking Service (`library/services/chunker.py`)
Uses `semantic-text-splitter` with HuggingFace tokenizer (proven in Spelunker).
#### Strategy Dispatch
Based on `Library.chunking_config`:
| Strategy | Library Type | Boundary Markers | Chunk Size | Overlap |
|----------|-------------|-----------------|-----------|---------|
| `chapter_aware` | Fiction | chapter, scene, paragraph | 1024 | 128 |
| `section_aware` | Technical | section, subsection, code_block, list | 512 | 64 |
| `song_level` | Music | song, verse, chorus | 512 | 32 |
| `scene_level` | Film | scene, act, sequence | 768 | 64 |
| `description_level` | Art | artwork, description, analysis | 512 | 32 |
| `entry_level` | Journal | entry, date, paragraph | 512 | 32 |
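The shared mechanic beneath these strategies is a window that advances by (size − overlap); only the boundary-marker handling differs per type. A minimal word-based sketch (real chunking uses `semantic-text-splitter` with a HuggingFace tokenizer, not word counts):

```python
def window_chunks(words: list[str], size: int, overlap: int) -> list[list[str]]:
    """Split a token/word sequence into overlapping windows."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + size])
        if start + size >= len(words):
            break
    return chunks

chunks = window_chunks(list("abcdefghij"), size=4, overlap=1)
# windows of 4 advancing by 3: abcd, defg, ghij — each reuses the last
# element of the previous window, preserving context across boundaries
```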
#### Chunk-Image Association
Track which images appeared near which text chunks:
- PDF: image bounding boxes on specific pages
- DOCX/PPTX: images associated with slides/sections
- EPUB: images referenced from specific chapters
Creates `Chunk -[HAS_NEARBY_IMAGE]-> Image` relationships with proximity metadata.
#### Chunk Storage
- Chunk text stored in S3: `chunks/{item_uid}/chunk_{index}.txt`
- `text_preview` (first 500 chars) stored on Chunk node for full-text indexing
### 3. Embedding Client (`library/services/embedding_client.py`)
Multi-backend embedding client dispatching by `LLMApi.api_type`.
#### Backend Support
| API Type | Protocol | Auth | Batch Support |
|----------|---------|------|---------------|
| `openai` | HTTP POST `/embeddings` | API key header | Native batch |
| `vllm` | HTTP POST `/embeddings` | API key header | Native batch |
| `llama-cpp` | HTTP POST `/embeddings` | API key header | Native batch |
| `ollama` | HTTP POST `/embeddings` | None | Native batch |
| `bedrock` | HTTP POST `/model/{id}/invoke` | Bearer token | Client-side loop |
#### Bedrock Integration
Uses Amazon Bedrock API keys (Bearer token auth) — no boto3 SDK required:
```
POST https://bedrock-runtime.{region}.amazonaws.com/model/{model_id}/invoke
Authorization: Bearer {bedrock_api_key}
Content-Type: application/json
{"inputText": "text to embed", "dimensions": 1024, "normalize": true}
→ {"embedding": [float, ...], "inputTextTokenCount": 42}
```
**LLMApi setup for Bedrock embeddings:**
- `api_type`: `"bedrock"`
- `base_url`: `https://bedrock-runtime.us-east-1.amazonaws.com`
- `api_key`: Bedrock API key (encrypted)
**LLMApi setup for Bedrock chat (Claude, etc.):**
- `api_type`: `"openai"` (Mantle endpoint is OpenAI-compatible)
- `base_url`: `https://bedrock-mantle.us-east-1.api.aws/v1`
- `api_key`: Same Bedrock API key
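The invoke call above can be assembled without boto3. A sketch of request construction for the Titan-style body shown (the model ID is an illustrative example, and no network call is made here):

```python
import json

def build_bedrock_embed_request(region: str, model_id: str, text: str,
                                api_key: str, dimensions: int = 1024):
    """Build (url, headers, body) for a Bedrock InvokeModel embedding call."""
    url = f"https://bedrock-runtime.{region}.amazonaws.com/model/{model_id}/invoke"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({"inputText": text, "dimensions": dimensions,
                       "normalize": True}).encode()
    return url, headers, body

url, headers, body = build_bedrock_embed_request(
    "us-east-1", "amazon.titan-embed-text-v2:0", "text to embed", "BEDROCK_KEY")
# POST with requests/urllib; the response JSON carries "embedding"
# and "inputTextTokenCount" as in the example above
```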
#### Embedding Instruction Prefix
Before embedding, prepend the library's `embedding_instruction` to each chunk:
```
"{embedding_instruction}\n\n{chunk_text}"
```
#### Image Embedding
For multimodal models (`model.supports_multimodal`):
- Send base64-encoded image to the embedding endpoint
- Create `ImageEmbedding` node with the resulting vector
- If no multimodal model available, skip (images stored but not embedded)
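Sending an image means base64-encoding the bytes into the request payload. A sketch (the exact payload shape depends on the serving backend, so treat the field names as illustrative):

```python
import base64
import json

def image_embed_payload(image_bytes: bytes, ext: str) -> bytes:
    """Wrap raw image bytes as a base64 data-URL payload for an embedding API."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({
        "input": [{"type": "image_url",
                   "image_url": {"url": f"data:image/{ext};base64,{b64}"}}],
    }).encode()

payload = image_embed_payload(b"\x89PNG...", "png")
```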
#### Model Matching
Track embedded model by **name** (not UUID). Multiple APIs can serve the same model — matching by name allows provider switching without re-embedding.
### 4. Pipeline Orchestrator (`library/services/pipeline.py`)
Coordinates the full flow: parse → chunk → embed → store → graph.
#### Pipeline Stages
1. **Parse**: Extract text blocks + images from document
2. **Chunk**: Split text using content-type-aware strategy
3. **Store chunks**: S3 + Chunk nodes in Neo4j
4. **Embed text**: Generate vectors for all chunks
5. **Store images**: S3 + Image nodes in Neo4j
6. **Embed images**: Multimodal vectors (if available)
7. **Extract concepts**: Named entities from chunk text (via system chat model)
8. **Build graph**: Create Concept nodes, MENTIONS/REFERENCES edges
#### Idempotency
- Check `Item.content_hash` — skip if already processed with same hash
- Re-embedding deletes existing Chunk/Image nodes before re-processing
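The hash check can be a straight SHA-256 of the file bytes (sketch; helper names are illustrative):

```python
import hashlib

def content_hash(data: bytes) -> str:
    """Stable fingerprint of the original file, stored on Item.content_hash."""
    return hashlib.sha256(data).hexdigest()

def needs_processing(stored_hash, data: bytes) -> bool:
    """Skip the pipeline when the stored hash matches the incoming bytes."""
    return stored_hash != content_hash(data)

assert needs_processing(None, b"doc") is True        # never processed
h = content_hash(b"doc")
assert needs_processing(h, b"doc") is False          # unchanged → skip
assert needs_processing(h, b"doc v2") is True        # changed → re-embed
```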
#### Dimension Compatibility
- Validate that the system embedding model's `vector_dimensions` matches the Neo4j vector index dimensions
- Warn at embed time if mismatch detected
### 5. Concept Extraction (`library/services/concepts.py`)
Uses the system chat model for LLM-based named entity recognition.
- Extract: people, places, topics, techniques, themes
- Create/update `Concept` nodes (deduplicated by name via unique_index)
- Connect: `Chunk -[MENTIONS]-> Concept`, `Item -[REFERENCES]-> Concept`
- Embed concept names for vector search
- If no system chat model configured, concept extraction is skipped
### 6. Celery Tasks (`library/tasks.py`)
All tasks pass IDs (not model instances) per Red Panda Standards.
| Task | Queue | Purpose |
|------|-------|---------|
| `embed_item(item_uid)` | `embedding` | Full pipeline for single item |
| `embed_collection(collection_uid)` | `batch` | All items in a collection |
| `embed_library(library_uid)` | `batch` | All items in a library |
| `batch_embed_items(item_uids)` | `batch` | Specific items |
| `reembed_item(item_uid)` | `embedding` | Delete + re-embed |
Tasks are idempotent, include retry logic, and track progress via Memcached: `library:task:{task_id}:progress`.
### 7. Prometheus Metrics (`library/metrics.py`)
Custom metrics for pipeline observability:
| Metric | Type | Labels | Purpose |
|--------|------|--------|---------|
| `mnemosyne_documents_parsed_total` | Counter | file_type, status | Parse throughput |
| `mnemosyne_document_parse_duration_seconds` | Histogram | file_type | Parse latency |
| `mnemosyne_images_extracted_total` | Counter | file_type | Image extraction volume |
| `mnemosyne_chunks_created_total` | Counter | library_type, strategy | Chunk throughput |
| `mnemosyne_chunk_size_tokens` | Histogram | — | Chunk size distribution |
| `mnemosyne_embeddings_generated_total` | Counter | model_name, api_type, content_type | Embedding throughput |
| `mnemosyne_embedding_batch_duration_seconds` | Histogram | model_name, api_type | API latency |
| `mnemosyne_embedding_api_errors_total` | Counter | model_name, api_type, error_type | API failures |
| `mnemosyne_embedding_tokens_total` | Counter | model_name | Token consumption |
| `mnemosyne_pipeline_items_total` | Counter | status | Pipeline throughput |
| `mnemosyne_pipeline_item_duration_seconds` | Histogram | — | End-to-end latency |
| `mnemosyne_pipeline_items_in_progress` | Gauge | — | Concurrent processing |
| `mnemosyne_concepts_extracted_total` | Counter | concept_type | Concept extraction volume |
### 8. Model Changes
#### Item Node — New Fields
| Field | Type | Purpose |
|-------|------|---------|
| `embedding_status` | StringProperty | pending / processing / completed / failed |
| `embedding_model_name` | StringProperty | Name of model that generated embeddings |
| `chunk_count` | IntegerProperty | Number of chunks created |
| `image_count` | IntegerProperty | Number of images extracted |
| `error_message` | StringProperty | Last error message (if failed) |
#### New Relationship Model
```python
from neomodel import StringProperty, StructuredRel

class NearbyImageRel(StructuredRel):
    proximity = StringProperty(default="same_page")  # same_page, inline, same_slide, same_chapter
```
#### Chunk Node — New Relationship
```python
nearby_images = RelationshipTo('Image', 'HAS_NEARBY_IMAGE', model=NearbyImageRel)
```
#### LLMApi Model — New API Type
Add `("bedrock", "Amazon Bedrock")` to `api_type` choices.
### 9. API Enhancements
- `POST /api/v1/library/items/` — File upload with auto-trigger of `embed_item` task
- `POST /api/v1/library/items/<uid>/reembed/` — Re-embed endpoint
- `GET /api/v1/library/items/<uid>/status/` — Embedding status check
- Admin views: File upload field on item create, embedding status display
### 10. Management Commands
| Command | Purpose |
|---------|---------|
| `embed_item <uid>` | CLI embedding for testing |
| `embed_collection <uid>` | CLI batch embedding |
| `embedding_status` | Show embedding progress/statistics |
### 11. Dynamic Vector Index Dimensions
Update `setup_neo4j_indexes` to read dimensions from `LLMModel.get_system_embedding_model().vector_dimensions` instead of hardcoding 4096.
## Celery Workers & Scheduler
### Prerequisites
- RabbitMQ running on `oberon.incus:5672` with `mnemosyne` vhost and user
- `.env` configured with `CELERY_BROKER_URL=amqp://mnemosyne:password@oberon.incus:5672/mnemosyne`
- Virtual environment activated: `source ~/env/mnemosyne/bin/activate`
### Queues
Mnemosyne uses three Celery queues with task routing configured in `settings.py`:
| Queue | Tasks | Purpose | Recommended Concurrency |
|-------|-------|---------|------------------------|
| `celery` (default) | `llm_manager.validate_all_llm_apis`, `llm_manager.validate_single_api` | LLM API validation & model discovery | 2 |
| `embedding` | `library.tasks.embed_item`, `library.tasks.reembed_item` | Single-item embedding pipeline (GPU-bound) | 1 |
| `batch` | `library.tasks.embed_collection`, `library.tasks.embed_library`, `library.tasks.batch_embed_items` | Batch orchestration (dispatches to embedding queue) | 2 |
Task routing (`settings.py`):
```python
CELERY_TASK_ROUTES = {
    "library.tasks.embed_*": {"queue": "embedding"},
    "library.tasks.batch_*": {"queue": "batch"},
}
```
### Starting Workers
All commands run from the Django project root (`mnemosyne/`):
**Development — single worker, all queues:**
```bash
cd mnemosyne
celery -A mnemosyne worker -l info -Q celery,embedding,batch
```
**Development — eager mode (no worker needed):**
Set `CELERY_TASK_ALWAYS_EAGER=True` in `.env`. All tasks execute synchronously in the web process. Useful for debugging but does not test async behavior.
**Production — separate workers per queue:**
```bash
# Embedding worker (single concurrency — GPU is sequential)
celery -A mnemosyne worker \
-l info \
-Q embedding \
-c 1 \
-n embedding@%h \
--max-tasks-per-child=100
# Batch orchestration worker
celery -A mnemosyne worker \
-l info \
-Q batch \
-c 2 \
-n batch@%h
# Default queue worker (LLM API validation, etc.)
celery -A mnemosyne worker \
-l info \
-Q celery \
-c 2 \
-n default@%h
```
### Celery Beat (Periodic Scheduler)
Celery Beat runs scheduled tasks (e.g., periodic LLM API validation):
```bash
# File-based scheduler (simple, stores schedule in celerybeat-schedule file)
celery -A mnemosyne beat -l info
# Or with Django database scheduler (if django-celery-beat is installed)
celery -A mnemosyne beat -l info --scheduler django_celery_beat.schedulers:DatabaseScheduler
```
Example periodic task schedule (add to `settings.py` if needed):
```python
from celery.schedules import crontab
CELERY_BEAT_SCHEDULE = {
"validate-llm-apis-daily": {
"task": "llm_manager.validate_all_llm_apis",
"schedule": crontab(hour=6, minute=0), # Daily at 6 AM
},
}
```
### Flower (Task Monitoring)
[Flower](https://flower.readthedocs.io/) provides a real-time web UI for monitoring Celery workers and tasks:
```bash
celery -A mnemosyne flower --port=5555
```
Access at `http://localhost:5555`. Shows:
- Active/completed/failed tasks
- Worker status and resource usage
- Task execution times and retry counts
- Queue depths
### Reliability Configuration
The following settings are already configured in `settings.py`:
| Setting | Value | Purpose |
|---------|-------|---------|
| `CELERY_TASK_ACKS_LATE` | `True` | Acknowledge tasks after execution (not on receipt) — prevents task loss on worker crash |
| `CELERY_WORKER_PREFETCH_MULTIPLIER` | `1` | Workers fetch one task at a time — ensures fair distribution across workers |
| `CELERY_ACCEPT_CONTENT` | `["json"]` | Only accept JSON-serialized tasks |
| `CELERY_TASK_SERIALIZER` | `"json"` | Serialize task arguments as JSON |
### Task Progress Tracking
Embedding tasks report progress via Memcached using the key pattern:
```
library:task:{task_id}:progress → {"percent": 45, "message": "Embedded 12/27 chunks"}
```
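A pure-Python sketch of building that key and payload (helper names are illustrative, not from the codebase):

```python
def progress_key(task_id: str) -> str:
    """Cache key under which a task reports its progress."""
    return f"library:task:{task_id}:progress"

def progress_payload(done: int, total: int) -> dict:
    """Progress dict in the shape shown above."""
    percent = int(done / total * 100) if total else 0
    return {"percent": percent, "message": f"Embedded {done}/{total} chunks"}
```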
Tasks also update Celery's native state:
```python
# Query task progress from Python
from celery.result import AsyncResult
result = AsyncResult(task_id)
result.state # "PROGRESS", "SUCCESS", "FAILURE"
result.info # {"percent": 45, "message": "..."}
```
## Dependencies
```toml
# New additions to pyproject.toml
"PyMuPDF>=1.24,<2.0",
"pymupdf4llm>=0.0.17,<1.0",
"semantic-text-splitter>=0.20,<1.0",
"tokenizers>=0.20,<1.0",
"Pillow>=10.0,<12.0",
"django-prometheus>=2.3,<3.0",
```
### License Note
PyMuPDF is AGPL-3.0 licensed. Acceptable for self-hosted personal use. Commercial distribution would require Artifex's commercial license.
## File Structure
```
mnemosyne/library/
├── services/
│ ├── __init__.py
│ ├── parsers.py # PyMuPDF universal document parsing
│ ├── text_utils.py # Text sanitization (from Spelunker)
│ ├── chunker.py # Content-type-aware chunking
│ ├── embedding_client.py # Multi-backend embedding API client
│ ├── pipeline.py # Orchestration: parse → chunk → embed → graph
│ └── concepts.py # LLM-based concept extraction
├── metrics.py # Prometheus metrics definitions
├── tasks.py # Celery tasks for async embedding
├── management/commands/
│ ├── embed_item.py
│ ├── embed_collection.py
│ └── embedding_status.py
└── tests/
├── test_parsers.py
├── test_text_utils.py
├── test_chunker.py
├── test_embedding_client.py
├── test_pipeline.py
├── test_concepts.py
└── test_tasks.py
```
## Testing Strategy
All tests use Django `TestCase`. External services (LLM APIs, Neo4j) are mocked.
| Test File | Scope |
|-----------|-------|
| `test_parsers.py` | PyMuPDF parsing for each file type, image extraction, text sanitization |
| `test_text_utils.py` | Sanitization functions, PDF artifact cleaning, Unicode normalization |
| `test_chunker.py` | Content-type strategies, boundary detection, chunk-image association |
| `test_embedding_client.py` | OpenAI-compat + Bedrock backends (mocked HTTP), batch processing, usage tracking |
| `test_pipeline.py` | Full pipeline integration (mocked), S3 storage, idempotency |
| `test_concepts.py` | Concept extraction, deduplication, graph relationships |
| `test_tasks.py` | Celery tasks (eager mode), retry logic, error handling |
## Success Criteria
- [ ] Upload a document (PDF, EPUB, DOCX, PPTX, TXT) via API or admin → file stored in S3
- [ ] Images extracted from documents and stored as Image nodes in Neo4j
- [ ] Document automatically chunked using content-type-aware strategy
- [ ] Chunks embedded via system embedding model and vectors stored in Neo4j Chunk nodes
- [ ] Images embedded multimodally into ImageEmbedding nodes (when multimodal model available)
- [ ] Chunk-image proximity relationships established in graph
- [ ] Concepts extracted and graph populated with MENTIONS/REFERENCES relationships
- [ ] Neo4j vector indexes usable for similarity queries on stored embeddings
- [ ] Celery tasks handle async embedding with progress tracking
- [ ] Re-embedding works (delete old chunks, re-process)
- [ ] Content hash prevents redundant re-embedding
- [ ] Prometheus metrics exposed at `/metrics` for pipeline monitoring
- [ ] All tests pass with mocked LLM/embedding APIs
- [ ] Bedrock embedding works via Bearer token HTTP (no boto3)

---
# Async Task Pattern v1.0.0
Defines how Spelunker Django apps implement background task processing using Celery, RabbitMQ, Memcached, and Flower — covering fire-and-forget tasks, long-running batch jobs, signal-triggered tasks, and periodic scheduled tasks.
## 🐾 Red Panda Approval™
This pattern follows Red Panda Approval standards.
---
## Why a Pattern, Not a Shared Implementation
Long-running work in Spelunker spans multiple domains, each with distinct progress-tracking and state requirements:
- A `solution_library` document embedding task needs to update `review_status` on a `Document` and count vector chunks created.
- An `rfp_manager` batch job tracks per-question progress, per-question errors, and the Celery task ID on an `RFPBatchJob` record.
- An `llm_manager` API-validation task iterates over all active APIs and accumulates model sync statistics.
- A `solution_library` documentation-source sync task fires from a View, stores `celery_task_id` on a `SyncJob`, and reports incremental progress via a callback.
Instead of a single shared implementation, this pattern defines:
- **Required task interface** — every task must have a namespaced name, a structured return dict, and structured logging.
- **Recommended job-tracking fields** — most tasks that represent a significant unit of work should have a corresponding DB job record.
- **Error handling conventions** — how to catch, log, and reflect failures back to the record.
- **Dispatch variants** — signal-triggered, admin action, view-triggered, and periodic (Beat).
- **Infrastructure conventions** — broker, result backend, serialization, and cache settings.
---
## Required Task Interface
Every Celery task in Spelunker **must**:
```python
from celery import shared_task
import logging
logger = logging.getLogger(__name__)
@shared_task(name='<app_label>.<action_name>')
def my_task(primary_id: int, user_id: int = None) -> dict:
"""One-line description of what this task does."""
try:
# ... do work ...
logger.info(f"Task succeeded for {primary_id}")
return {'success': True, 'id': primary_id}
except Exception as e:
logger.error(
f"Task failed for {primary_id}: {type(e).__name__}: {e}",
extra={'id': primary_id, 'error': str(e)},
exc_info=True,
)
return {'success': False, 'id': primary_id, 'error': str(e)}
```
| Requirement | Rule |
|---|---|
| `name` | Must be `'<app_label>.<action>'`, e.g., `'solution_library.embed_document'` |
| Return value | Always a dict with at minimum `{'success': bool}` |
| Logging | Use structured `extra={}` kwargs; never swallow exceptions silently |
| Import style | Use `@shared_task`, not direct `app.task` references |
| Idempotency | Tasks **must** be safe to re-execute with the same arguments (broker redelivery, worker crash). Use `update_or_create`, check-before-write, or guard with the job record's status before re-processing. |
| Arguments | Pass only JSON-serialisable primitives (PKs, strings, numbers). Never pass ORM instances. |
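The idempotency requirement can be illustrated with a minimal in-memory sketch (a real task would guard on the job record's status in the database rather than a set):

```python
_completed_ids: set[int] = set()  # stand-in for a DB-backed job record

def embed_item_idempotent(item_id: int) -> dict:
    """Safe to re-execute: a redelivered message becomes a no-op."""
    if item_id in _completed_ids:
        return {'success': True, 'id': item_id, 'skipped': True}
    # ... do the actual work exactly once ...
    _completed_ids.add(item_id)
    return {'success': True, 'id': item_id, 'skipped': False}
```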
---
## Retry & Time-Limit Policy
Tasks that call external services (LLM APIs, S3, remote URLs) should declare automatic retries for transient failures. Tasks must also set time limits to prevent hung workers.
### Recommended Retry Decorator
```python
@shared_task(
name='<app_label>.<action>',
bind=True,
autoretry_for=(ConnectionError, TimeoutError),
retry_backoff=60, # first retry after 60 s, then 120 s, 240 s …
retry_backoff_max=600, # cap at 10 minutes
retry_jitter=True, # add randomness to avoid thundering herd
max_retries=3,
soft_time_limit=1800, # raise SoftTimeLimitExceeded after 30 min
time_limit=2100, # hard-kill after 35 min
)
def my_task(self, primary_id: int, ...):
...
```
| Setting | Purpose | Guideline |
|---|---|---|
| `autoretry_for` | Exception classes that trigger an automatic retry | Use for **transient** errors only (network, timeout). Never for `ValueError` or business-logic errors. |
| `retry_backoff` | Seconds before first retry (doubles each attempt) | 60 s is a reasonable default for external API calls. |
| `max_retries` | Maximum retry attempts | 3 for API calls; 0 (no retry) for user-triggered batch jobs that track their own progress. |
| `soft_time_limit` | Raises `SoftTimeLimitExceeded` — allows graceful cleanup | Set on every task. Catch it to mark the job record as failed. |
| `time_limit` | Hard `SIGKILL` — last resort | Set 5-10 min above `soft_time_limit`. |
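The backoff schedule these settings produce can be computed directly; this mirrors Celery's exponential backoff formula, without the jitter term:

```python
def retry_delay(retries: int, backoff: int = 60, backoff_max: int = 600) -> int:
    """Seconds to wait before retry number `retries` (0-based):
    backoff * 2**retries, capped at backoff_max."""
    return min(backoff * (2 ** retries), backoff_max)

# With the decorator above: 60 s, 120 s, 240 s before retries 1-3.
```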
### Handling `SoftTimeLimitExceeded`
```python
from celery.exceptions import SoftTimeLimitExceeded
from django.utils import timezone

@shared_task(bind=True, soft_time_limit=1800, time_limit=2100, ...)
def long_running_task(self, job_id: int):
    job = MyJob.objects.get(id=job_id)
    try:
        for item in items:
            process(item)
    except SoftTimeLimitExceeded:
        logger.warning(f"Job {job_id} hit soft time limit — marking as failed")
        job.status = 'failed'
        job.completed_at = timezone.now()
        job.save()
        return {'success': False, 'job_id': job_id, 'error': 'Time limit exceeded'}
    job.status = 'completed'
    job.completed_at = timezone.now()
    job.save()
    return {'success': True, 'job_id': job_id}
```
> **Note:** Batch jobs in `rfp_manager` do **not** use `autoretry_for` because they track per-question progress and should not re-run the entire batch. Instead, individual question failures are logged and the batch continues.
---
## Standard Values / Conventions
### Task Name Registry
| App | Task name | Trigger |
|---|---|---|
| `solution_library` | `solution_library.embed_document` | Signal / admin action |
| `solution_library` | `solution_library.embed_documents_batch` | Admin action |
| `solution_library` | `solution_library.sync_documentation_source` | View / admin action |
| `solution_library` | `solution_library.sync_all_documentation_sources` | Celery Beat (periodic) |
| `rfp_manager` | `rfp_manager.summarize_information_document` | Admin action |
| `rfp_manager` | `rfp_manager.batch_generate_responder_answers` | View |
| `rfp_manager` | `rfp_manager.batch_generate_reviewer_answers` | View |
| `llm_manager` | `llm_manager.validate_all_llm_apis` | Celery Beat (periodic) |
| `llm_manager` | `llm_manager.validate_single_api` | Admin action |
### Job Status Choices (DB Job Records)
```python
STATUS_PENDING = 'pending'
STATUS_PROCESSING = 'processing'
STATUS_COMPLETED = 'completed'
STATUS_FAILED = 'failed'
STATUS_CANCELLED = 'cancelled' # optional — used by rfp_manager
```
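The lifecycle these statuses imply can be made explicit with a transition map; the allowed edges here are an assumption based on how the variants below move jobs through states:

```python
ALLOWED_TRANSITIONS = {
    'pending': {'processing', 'cancelled'},
    'processing': {'completed', 'failed', 'cancelled'},
    'completed': set(),   # terminal
    'failed': set(),      # terminal
    'cancelled': set(),   # terminal
}

def can_transition(current: str, new: str) -> bool:
    """True if a job may move from `current` to `new`."""
    return new in ALLOWED_TRANSITIONS.get(current, set())
```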
---
## Recommended Job-Tracking Fields
Tasks that represent a significant unit of work should write their state to a DB model. These are the recommended fields:
```python
class MyJobModel(models.Model):
# Celery linkage
celery_task_id = models.CharField(
max_length=255, blank=True,
help_text="Celery task ID for Flower monitoring"
)
# Status lifecycle
status = models.CharField(
max_length=20, choices=STATUS_CHOICES, default=STATUS_PENDING
)
started_at = models.DateTimeField(null=True, blank=True)
completed_at = models.DateTimeField(null=True, blank=True)
# Audit
started_by = models.ForeignKey(
User, on_delete=models.PROTECT, related_name='+'
)
created_at = models.DateTimeField(auto_now_add=True)
updated_at = models.DateTimeField(auto_now=True)
# Error accumulation
errors = models.JSONField(default=list)
class Meta:
indexes = [
models.Index(fields=['celery_task_id']),
models.Index(fields=['-created_at']),
]
```
For batch jobs that process many items, add counter fields:
```python
total_items = models.IntegerField(default=0)
processed_items = models.IntegerField(default=0)
successful_items = models.IntegerField(default=0)
failed_items = models.IntegerField(default=0)
def get_progress_percentage(self) -> int:
if self.total_items == 0:
return 0
return int((self.processed_items / self.total_items) * 100)
def is_stale(self, timeout_minutes: int = 30) -> bool:
"""True if stuck in pending/processing without recent updates."""
if self.status not in (self.STATUS_PENDING, self.STATUS_PROCESSING):
return False
return (timezone.now() - self.updated_at).total_seconds() > (timeout_minutes * 60)
```
---
## Variant 1 — Fire-and-Forget (Signal-Triggered)
Automatically dispatch a task whenever a model record is saved. Used by `solution_library` to kick off embedding whenever a `Document` is created.
```python
# solution_library/signals.py
import logging

from django.conf import settings
from django.db.models.signals import post_save
from django.dispatch import receiver

logger = logging.getLogger(__name__)
@receiver(post_save, sender=Document)
def trigger_document_embedding(sender, instance, created, **kwargs):
if not created:
return
if not getattr(settings, 'AUTO_EMBED_DOCUMENTS', True):
return
from solution_library.tasks import embed_document_task # avoid circular import
from django.db import transaction
def _dispatch():
try:
task = embed_document_task.delay(
document_id=instance.id,
embedding_model_id=instance.embedding_model_id or None,
user_id=None,
)
logger.info(f"Queued embedding task {task.id} for document {instance.id}")
except Exception as e:
logger.error(f"Failed to queue embedding task for document {instance.id}: {e}")
# Dispatch AFTER the transaction commits so the worker can read the row
transaction.on_commit(_dispatch)
```
The corresponding task updates the record's status field at start and completion:
```python
@shared_task(name='solution_library.embed_document')
def embed_document_task(document_id: int, embedding_model_id: int = None, user_id: int = None):
document = Document.objects.get(id=document_id)
    document.review_status = 'processing'
    if embedding_model_id:
        document.embedding_model_id = embedding_model_id
    document.save(update_fields=['review_status', 'embedding_model'])
# ... perform work ...
document.review_status = 'pending'
document.save(update_fields=['review_status'])
return {'success': True, 'document_id': document_id, 'chunks_created': count}
```
---
## Variant 2 — Long-Running Batch Job (View or Admin Triggered)
Used by `rfp_manager` for multi-hour batch RAG processing. The outer transaction creates the DB job record first, then dispatches the Celery task, passing the job's PK.
```python
# rfp_manager/views.py (dispatch)
from django.db import transaction
job = RFPBatchJob.objects.create(
rfp=rfp,
started_by=request.user,
job_type=RFPBatchJob.JOB_TYPE_RESPONDER,
status=RFPBatchJob.STATUS_PENDING,
)
def _dispatch():
task = batch_generate_responder_answers.delay(rfp.pk, request.user.pk, job.pk)
# Save the Celery task ID for Flower cross-reference
job.celery_task_id = task.id
job.save(update_fields=['celery_task_id'])
# IMPORTANT: dispatch after the transaction commits so the worker
# can read the job row. Without this, the worker may receive the
# message before the row is visible, causing DoesNotExist.
transaction.on_commit(_dispatch)
```
Inside the task, use `bind=True` to get the Celery task ID:
```python
@shared_task(bind=True, name='rfp_manager.batch_generate_responder_answers')
def batch_generate_responder_answers(self, rfp_id: int, user_id: int, job_id: int):
job = RFPBatchJob.objects.get(id=job_id)
job.status = RFPBatchJob.STATUS_PROCESSING
job.started_at = timezone.now()
job.celery_task_id = self.request.id # authoritative Celery ID
job.save()
for item in items_to_process:
try:
# ... process item ...
job.processed_questions += 1
job.successful_questions += 1
job.save(update_fields=['processed_questions', 'successful_questions', 'updated_at'])
except Exception as e:
job.add_error(item, str(e))
job.status = RFPBatchJob.STATUS_COMPLETED
job.completed_at = timezone.now()
job.save()
return {'success': True, 'job_id': job_id}
```
---
## Variant 3 — Progress-Callback Task (View or Admin Triggered)
Used by `solution_library`'s `sync_documentation_source_task` when an underlying synchronous service needs to stream incremental progress updates back to the DB.
```python
@shared_task(bind=True, name='solution_library.sync_documentation_source')
def sync_documentation_source_task(self, source_id: int, user_id: int, job_id: int):
job = SyncJob.objects.get(id=job_id)
job.status = SyncJob.STATUS_PROCESSING
job.started_at = timezone.now()
job.celery_task_id = self.request.id
job.save(update_fields=['status', 'started_at', 'celery_task_id', 'updated_at'])
def update_progress(created, updated, skipped, processed, total):
job.documents_created = created
job.documents_updated = updated
job.documents_skipped = skipped
job.save(update_fields=['documents_created', 'documents_updated',
'documents_skipped', 'updated_at'])
result = sync_documentation_source(source_id, user_id, progress_callback=update_progress)
job.status = SyncJob.STATUS_COMPLETED if result.status == 'completed' else SyncJob.STATUS_FAILED
job.completed_at = timezone.now()
job.save()
return {'success': True, 'job_id': job_id}
```
---
## Variant 4 — Periodic Task (Celery Beat)
Used by `llm_manager` for hourly/daily API validation and by `solution_library` for nightly source syncs. Schedule via django-celery-beat in Django admin (no hardcoded schedules in code).
```python
@shared_task(name='llm_manager.validate_all_llm_apis')
def validate_all_llm_apis():
"""Periodic task: validate all active LLM APIs and refresh model lists."""
active_apis = LLMApi.objects.filter(is_active=True)
results = {'tested': 0, 'successful': 0, 'failed': 0, 'details': []}
for api in active_apis:
results['tested'] += 1
try:
result = test_llm_api(api)
if result['success']:
results['successful'] += 1
else:
results['failed'] += 1
except Exception as e:
results['failed'] += 1
logger.error(f"Error validating {api.name}: {e}", exc_info=True)
return results
@shared_task(name='solution_library.sync_all_documentation_sources')
def sync_all_sources_task():
"""Periodic task: queue a sync for every active documentation source."""
    sources = DocumentationSource.objects.all()
    system_user = User.objects.filter(is_superuser=True).first()
    queued, skipped = 0, 0
    for source in sources:
        # Skip if an active sync job already exists
        if SyncJob.objects.filter(source=source,
                                  status__in=[SyncJob.STATUS_PENDING,
                                              SyncJob.STATUS_PROCESSING]).exists():
            skipped += 1
            continue
        job = SyncJob.objects.create(source=source, started_by=system_user,
                                     status=SyncJob.STATUS_PENDING)
        sync_documentation_source_task.delay(source.id, system_user.id, job.id)
        queued += 1
    return {'queued': queued, 'skipped': skipped}
```
---
## Infrastructure Configuration
### `spelunker/celery.py` — App Entry Point
```python
import os
from celery import Celery
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "spelunker.settings")
app = Celery("spelunker")
app.config_from_object("django.conf:settings", namespace="CELERY")
app.autodiscover_tasks() # auto-discovers tasks.py in every INSTALLED_APP
```
### `settings.py` — Celery Settings
```python
# Broker and result backend — supplied via environment variables
CELERY_BROKER_URL = env('CELERY_BROKER_URL') # amqp://spelunker:<pw>@rabbitmq:5672/spelunker
CELERY_RESULT_BACKEND = env('CELERY_RESULT_BACKEND') # rpc://
# Serialization — JSON only (no pickle)
CELERY_ACCEPT_CONTENT = ['json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_TIMEZONE = env('TIME_ZONE')
# Result expiry — critical when using rpc:// backend.
# Uncollected results accumulate in worker memory without this.
CELERY_RESULT_EXPIRES = 3600 # 1 hour; safe because we store state in DB job records
# Global time limits (can be overridden per-task with decorator args)
CELERY_TASK_SOFT_TIME_LIMIT = 1800 # 30 min soft limit → SoftTimeLimitExceeded
CELERY_TASK_TIME_LIMIT = 2100 # 35 min hard kill
# Late ack: acknowledge messages AFTER task completes, not before.
# If a worker crashes mid-task, the broker redelivers the message.
CELERY_TASK_ACKS_LATE = True
CELERY_WORKER_PREFETCH_MULTIPLIER = 1 # fetch one task at a time per worker slot
# Separate logging level for Celery vs. application code
CELERY_LOGGING_LEVEL = env('CELERY_LOGGING_LEVEL', default='INFO')
```
> **`CELERY_TASK_ACKS_LATE`**: Combined with idempotent tasks, this provides at-least-once delivery. If a worker process is killed (OOM, deployment), the message returns to the queue and another worker picks it up. This is why idempotency is a hard requirement.
### `settings.py` — Memcached (Django Cache)
Memcached is the Django HTTP-layer cache (sessions, view caching). It is **not** used as a Celery result backend.
```python
CACHES = {
"default": {
"BACKEND": "django.core.cache.backends.memcached.PyMemcacheCache",
"LOCATION": env('KVDB_LOCATION'), # memcached:11211
"KEY_PREFIX": env('KVDB_PREFIX'), # spelunker
"TIMEOUT": 300,
}
}
```
### `INSTALLED_APPS` — Required
```python
INSTALLED_APPS = [
...
'django_celery_beat', # DB-backed periodic task scheduler (Beat)
...
]
```
### `docker-compose.yml` — Service Topology
| Service | Image | Purpose |
|---|---|---|
| `rabbitmq` | `rabbitmq:3-management-alpine` | AMQP message broker |
| `memcached` | `memcached:1.6-alpine` | Django HTTP cache |
| `worker` | `spelunker:latest` | Celery worker (`--concurrency=4`) |
| `scheduler` | `spelunker:latest` | Celery Beat with `DatabaseScheduler` |
| `flower` | `mher/flower:latest` | Task monitoring UI (port 5555) |
### Task Routing / Queues (Recommended)
By default all tasks run in the `celery` default queue. For production deployments, separate CPU-heavy work from I/O-bound work:
```python
# settings.py
CELERY_TASK_ROUTES = {
'solution_library.embed_document': {'queue': 'embedding'},
'solution_library.embed_documents_batch': {'queue': 'embedding'},
'rfp_manager.batch_generate_*': {'queue': 'batch'},
'llm_manager.validate_*': {'queue': 'default'},
}
```
```yaml
# docker-compose.yml — separate workers per queue
worker-default:
command: celery -A spelunker worker -Q default --concurrency=4
worker-embedding:
command: celery -A spelunker worker -Q embedding --concurrency=2
worker-batch:
command: celery -A spelunker worker -Q batch --concurrency=2
```
This prevents a burst of embedding tasks from starving time-sensitive API validation, and lets you scale each queue independently.
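Celery resolves glob patterns in route keys in declaration order, first match wins. A rough pure-Python sketch of that resolution (routes copied from the example above; the real lookup lives inside Celery's router):

```python
from fnmatch import fnmatch

TASK_ROUTES = {
    'solution_library.embed_document': {'queue': 'embedding'},
    'solution_library.embed_documents_batch': {'queue': 'embedding'},
    'rfp_manager.batch_generate_*': {'queue': 'batch'},
    'llm_manager.validate_*': {'queue': 'default'},
}

def resolve_queue(task_name: str, routes: dict = TASK_ROUTES,
                  default: str = 'celery') -> str:
    """Return the queue for a task name; unrouted tasks fall back to the default queue."""
    for pattern, options in routes.items():
        if fnmatch(task_name, pattern):
            return options['queue']
    return default
```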
### Database Connection Management
Celery workers are long-lived processes, so Django DB connections can go stale between tasks. Keep `CONN_MAX_AGE` at `0` (the Django default) so connections are not held open indefinitely, or put a connection pooler such as PgBouncer in front of the database. Celery's Django fixup calls Django's `close_old_connections()` between tasks, which cleans up expired or broken connections automatically.
---
## Domain Extension Examples
### `solution_library` App
Three task types: single-document embed, batch embed, and documentation-source sync. The single-document task is also triggered by a `post_save` signal for automatic processing on upload.
```python
# Auto-embed on create (signal)
embed_document_task.delay(document_id=instance.id, ...)
# Manual batch from admin action
embed_documents_batch_task.delay(document_ids=[1, 2, 3], ...)
# Source sync from view (with progress callback)
sync_documentation_source_task.delay(source_id=..., user_id=..., job_id=...)
```
### `rfp_manager` App
Two-stage pipeline: responder answers first, reviewer answers second. Each stage is a separate Celery batch job. Both check for an existing active job before dispatching to prevent duplicate runs.
```python
# Guard against duplicate jobs before dispatch
if RFPBatchJob.objects.filter(
rfp=rfp,
job_type=RFPBatchJob.JOB_TYPE_RESPONDER,
status__in=[RFPBatchJob.STATUS_PENDING, RFPBatchJob.STATUS_PROCESSING]
).exists():
# surface error to user
...
# Stage 1
batch_generate_responder_answers.delay(rfp.pk, user.pk, job.pk)
# Stage 2 (after Stage 1 is complete)
batch_generate_reviewer_answers.delay(rfp.pk, user.pk, job.pk)
```
### `llm_manager` App
Stateless periodic task — no DB job record needed because results are written directly to the `LLMApi` and `LLMModel` objects.
```python
# Triggered by Celery Beat; schedule managed via django-celery-beat admin
validate_all_llm_apis.delay()
# Triggered from admin action for a single API
validate_single_api.delay(api_id=api.pk)
```
---
## Anti-Patterns
- ❌ Don't use `rpc://` result backend for tasks where the caller never retrieves the result — the result accumulates in memory. Spelunker mitigates this by storing state in DB job records rather than reading Celery results. Always set `CELERY_RESULT_EXPIRES`.
- ❌ Don't pass full model instances as task arguments — pass PKs only. Celery serialises arguments as JSON; ORM objects are not JSON serialisable.
- ❌ Don't treat the dispatch-side `AsyncResult.id` as the authoritative `celery_task_id`. It is the same value as the in-task `self.request.id`, but the dispatch-side save can race with the worker's first update; write it from **inside** the task using `bind=True` as the authoritative source.
- ❌ Don't silence exceptions with bare `except: pass` — always log errors and reflect failure status onto the DB record.
- ❌ Don't skip the duplicate-job guard when the task is triggered from a view or admin action. Without it, double-clicking a submit button can queue two identical jobs.
- ❌ Don't use `CELERY_TASK_SERIALIZER = 'pickle'` — JSON only, to prevent arbitrary code execution via crafted task payloads.
- ❌ Don't hardcode periodic task schedules in code via `app.conf.beat_schedule` — use `django_celery_beat` and manage schedules in Django admin so they survive deployments.
- ❌ Don't call `.delay()` inside a database transaction — use `transaction.on_commit()`. The worker may receive the message before the row is committed, causing `DoesNotExist`.
- ❌ Don't write non-idempotent tasks — workers may crash and brokers may redeliver. A re-executed task must produce the same result (or safely no-op).
- ❌ Don't omit time limits — a hung external API call (LLM, S3) will block a worker slot forever. Always set `soft_time_limit` and `time_limit`.
- ❌ Don't retry business-logic errors with `autoretry_for` — only retry **transient** failures (network errors, timeouts). A `ValueError` or `DoesNotExist` will never succeed on retry.
---
## Migration / Adoption
When adding a new Celery task to an existing app:
1. Create `<app>/tasks.py` using `@shared_task`, not `@app.task`.
2. Name the task `'<app_label>.<action>'`.
3. If the task is long-running, create a DB job model with the recommended fields above.
4. Register the app in `INSTALLED_APPS` (required for `autodiscover_tasks`).
5. For periodic tasks, add a schedule record via Django admin → Periodic Tasks (django-celery-beat) rather than in code.
6. Add a test that confirms the task can be called synchronously with `CELERY_TASK_ALWAYS_EAGER = True`.
---
## Settings
```python
# settings.py
# Required — broker and result backend
CELERY_BROKER_URL = env('CELERY_BROKER_URL') # amqp://user:pw@host:5672/vhost
CELERY_RESULT_BACKEND = env('CELERY_RESULT_BACKEND') # rpc://
# Serialization (do not change)
CELERY_ACCEPT_CONTENT = ['json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_TIMEZONE = env('TIME_ZONE') # must match Django TIME_ZONE
# Result expiry — prevents unbounded memory growth with rpc:// backend
CELERY_RESULT_EXPIRES = 3600 # seconds (1 hour)
# Time limits — global defaults, overridable per-task
CELERY_TASK_SOFT_TIME_LIMIT = 1800 # SoftTimeLimitExceeded after 30 min
CELERY_TASK_TIME_LIMIT = 2100 # hard SIGKILL after 35 min
# Reliability — late ack + single prefetch for at-least-once delivery
CELERY_TASK_ACKS_LATE = True
CELERY_WORKER_PREFETCH_MULTIPLIER = 1
# Logging
CELERY_LOGGING_LEVEL = env('CELERY_LOGGING_LEVEL', default='INFO') # separate from app/Django level
# Optional — disable for production
# AUTO_EMBED_DOCUMENTS = True # set False to suppress signal-triggered embedding
# Optional — task routing (see Infrastructure Configuration for queue examples)
# CELERY_TASK_ROUTES = { ... }
```
---
## Testing
```python
from django.test import TestCase, override_settings
@override_settings(CELERY_TASK_ALWAYS_EAGER=True, CELERY_TASK_EAGER_PROPAGATES=True)
class EmbedDocumentTaskTest(TestCase):
def test_happy_path(self):
"""Task embeds a document and returns success."""
# arrange: create Document, LLMModel fixtures
result = embed_document_task(document_id=doc.id)
self.assertTrue(result['success'])
self.assertGreater(result['chunks_created'], 0)
doc.refresh_from_db()
self.assertEqual(doc.review_status, 'pending')
def test_document_not_found(self):
"""Task returns success=False for a missing document ID."""
result = embed_document_task(document_id=999999)
self.assertFalse(result['success'])
self.assertIn('not found', result['error'])
def test_no_embedding_model(self):
"""Task returns success=False when no embedding model is available."""
# arrange: no LLMModel with is_system_default=True
result = embed_document_task(document_id=doc.id)
self.assertFalse(result['success'])
@override_settings(CELERY_TASK_ALWAYS_EAGER=True, CELERY_TASK_EAGER_PROPAGATES=True)
class BatchJobTest(TestCase):
def test_job_reaches_completed_status(self):
"""Batch job transitions from pending → processing → completed."""
job = RFPBatchJob.objects.create(...)
batch_generate_responder_answers(rfp_id=rfp.pk, user_id=user.pk, job_id=job.pk)
job.refresh_from_db()
self.assertEqual(job.status, RFPBatchJob.STATUS_COMPLETED)
def test_duplicate_job_guard(self):
"""A second dispatch when a job is already active is rejected by the view."""
# arrange: one active job
response = self.client.post(dispatch_url)
self.assertContains(response, 'already running', status_code=400)
```

---
# Notification Trigger Pattern v1.0.0
Standard pattern for triggering notifications from domain-specific events in Django applications that use Themis for notification infrastructure.
## 🐾 Red Panda Approval™
This pattern follows Red Panda Approval standards.
---
## Overview
Themis provides the notification *mailbox* — the model, UI (bell + dropdown + list page), polling, browser notifications, user preferences, and cleanup. What Themis does **not** provide is the *trigger logic* — the rules that decide when a notification should be created.
Trigger logic is inherently domain-specific:
- A task tracker sends "Task overdue" notifications
- A calendar sends "Event starting in 15 minutes" reminders
- A finance app sends "Invoice payment received" alerts
- A monitoring system sends "Server CPU above 90%" warnings
This pattern documents how consuming apps should create notifications using Themis infrastructure.
---
## The Standard Interface
All notification creation goes through one function:
```python
from themis.notifications import notify_user
notify_user(
user=user, # Django User instance
title="Task overdue", # Short headline (max 200 chars)
message="Task 'Deploy v2' was due yesterday.", # Optional body
level="warning", # info | success | warning | danger
url="/tasks/42/", # Optional: where to navigate on click
source_app="tasks", # Your app label (for tracking/cleanup)
source_model="Task", # Model that triggered this
source_id="42", # PK of the source object (as string)
deduplicate=True, # Skip if unread duplicate exists
expires_at=None, # Optional: auto-expire datetime
)
```
**Never create `UserNotification` objects directly.** The `notify_user()` function handles:
- Checking if the user has notifications enabled
- Filtering by the user's minimum notification level
- Deduplication (when `deduplicate=True`)
- Returning `None` when skipped (so callers can check)
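A plausible sketch of the enabled/level filtering `notify_user()` performs (the actual logic lives inside `themis.notifications`; the preference field names here are assumptions):

```python
LEVELS = ['info', 'success', 'warning', 'danger']  # ascending severity

def should_deliver(prefs: dict, level: str) -> bool:
    """Skip users who disabled notifications or whose minimum level exceeds this one."""
    if not prefs.get('notifications_enabled', True):
        return False
    return LEVELS.index(level) >= LEVELS.index(prefs.get('min_level', 'info'))
```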
---
## Trigger Patterns
### 1. Signal-Based Triggers
The most common pattern — listen to Django signals and create notifications:
```python
# myapp/signals.py
from django.db.models.signals import post_save
from django.dispatch import receiver
from themis.notifications import notify_user
from .models import Task
@receiver(post_save, sender=Task)
def notify_task_assigned(sender, instance, created, **kwargs):
"""Notify user when a task is assigned to them."""
    # Assumes a django-model-utils FieldTracker named `tracker` on Task,
    # so assignee changes can be detected on updates.
    if not created and instance.assignee and instance.tracker.has_changed("assignee"):
notify_user(
user=instance.assignee,
title=f"Task assigned: {instance.title}",
message=f"You've been assigned to '{instance.title}'",
level="info",
url=instance.get_absolute_url(),
source_app="tasks",
source_model="Task",
source_id=str(instance.pk),
deduplicate=True,
)
```
### 2. View-Based Triggers
Create notifications during request processing:
```python
# myapp/views.py
from django.contrib import messages
from django.contrib.auth.decorators import login_required
from django.shortcuts import get_object_or_404, redirect

from themis.notifications import notify_user
@login_required
def approve_request(request, pk):
req = get_object_or_404(Request, pk=pk)
req.status = "approved"
req.save()
# Notify the requester
notify_user(
user=req.requester,
title="Request approved",
message=f"Your request '{req.title}' has been approved.",
level="success",
url=req.get_absolute_url(),
source_app="requests",
source_model="Request",
source_id=str(req.pk),
)
messages.success(request, "Request approved.")
return redirect("request-list")
```
### 3. Management Command Triggers
For scheduled checks (e.g., daily overdue detection):
```python
# myapp/management/commands/check_overdue.py
from django.core.management.base import BaseCommand
from django.utils import timezone
from themis.notifications import notify_user
from myapp.models import Task
class Command(BaseCommand):
help = "Send notifications for overdue tasks"
def handle(self, *args, **options):
        overdue = Task.objects.filter(
            due_date__lt=timezone.now().date(),
            status__in=["open", "in_progress"],
            assignee__isnull=False,  # notify_user() needs a user to notify
        )
count = 0
for task in overdue:
result = notify_user(
user=task.assignee,
title=f"Overdue: {task.title}",
message=f"Task was due {task.due_date}",
level="danger",
url=task.get_absolute_url(),
source_app="tasks",
source_model="Task",
source_id=str(task.pk),
deduplicate=True, # Don't send again if unread
)
if result:
count += 1
self.stdout.write(f"Sent {count} overdue notification(s)")
```
Schedule with cron or Kubernetes CronJob:
```yaml
# Kubernetes CronJob
apiVersion: batch/v1
kind: CronJob
metadata:
name: check-overdue-tasks
spec:
schedule: "0 8 * * *" # Daily at 8 AM
jobTemplate:
spec:
template:
spec:
containers:
- name: check-overdue
command: ["python", "manage.py", "check_overdue"]
```
### 4. Celery Task Triggers
For apps with background workers:
```python
# myapp/tasks.py
from celery import shared_task
from django.contrib.auth import get_user_model
from themis.notifications import notify_user
User = get_user_model()
@shared_task
def notify_report_ready(user_id, report_id):
"""Notify user when their report has been generated."""
from myapp.models import Report
user = User.objects.get(pk=user_id)
report = Report.objects.get(pk=report_id)
notify_user(
user=user,
title="Report ready",
message=f"Your {report.report_type} report is ready to download.",
level="success",
url=report.get_absolute_url(),
source_app="reports",
source_model="Report",
source_id=str(report.pk),
)
```
---
## Notification Levels
Choose the appropriate level for each notification type:
| Level | Weight | Use For |
|---|---|---|
| `info` | 0 | Informational updates (assigned, comment added) |
| `success` | 0 | Positive outcomes (approved, completed, payment received) |
| `warning` | 1 | Needs attention (approaching deadline, low balance) |
| `danger` | 2 | Urgent/error (overdue, failed, system error) |
Users can set a minimum notification level in their preferences:
- **info** (default) — receive all notifications
- **warning** — only warnings and errors
- **danger** — only errors
Note that `info` and `success` have the same weight (0), so setting minimum to "warning" filters out both.
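A minimal, illustrative sketch of this filtering rule (not the Themis source; the helper name is hypothetical):

```python
# Weights mirror the table above; the user-preference filter is a threshold check.
LEVEL_WEIGHTS = {"info": 0, "success": 0, "warning": 1, "danger": 2}

def passes_minimum_level(level: str, minimum: str) -> bool:
    """Return True if a notification at `level` clears a user's `minimum` setting."""
    return LEVEL_WEIGHTS[level] >= LEVEL_WEIGHTS[minimum]
```

For example, `passes_minimum_level("success", "warning")` is `False`, which is why a "warning" minimum silences both `info` and `success`.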
---
## Source Tracking
The three source tracking fields enable two important features:
### Deduplication
When `deduplicate=True`, `notify_user()` checks for existing unread notifications with the same `source_app`, `source_model`, and `source_id`. This prevents notification spam when the same event is checked multiple times (e.g., a daily cron job for overdue tasks).
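The check can be pictured as follows — an illustrative stand-in for the ORM lookup, operating on plain dicts rather than model instances:

```python
def has_unread_duplicate(notifications, source_app, source_model, source_id):
    """True if any unread notification already points at the same source object.

    Stand-in for the query notify_user() performs when deduplicate=True.
    """
    return any(
        not n["is_read"]
        and n["source_app"] == source_app
        and n["source_model"] == source_model
        and n["source_id"] == source_id
        for n in notifications
    )
```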
### Bulk Cleanup
When a source object is deleted, clean up its notifications:
```python
# In your model's delete signal or post_delete:
from themis.models import UserNotification
@receiver(post_delete, sender=Task)
def cleanup_task_notifications(sender, instance, **kwargs):
UserNotification.objects.filter(
source_app="tasks",
source_model="Task",
source_id=str(instance.pk),
).delete()
```
---
## Expiring Notifications
For time-sensitive notifications, use `expires_at`:
```python
from datetime import timedelta
from django.utils import timezone
# Event reminder that expires when the event starts
notify_user(
user=attendee,
title=f"Starting soon: {event.title}",
level="info",
url=event.get_absolute_url(),
expires_at=event.start_time,
source_app="events",
source_model="Event",
source_id=str(event.pk),
deduplicate=True,
)
```
Expired notifications are automatically excluded from counts and lists. The `cleanup_notifications` management command deletes them permanently.
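The exclusion rule amounts to a timestamp comparison; a sketch (helper name is hypothetical):

```python
from datetime import datetime, timezone

def is_expired(expires_at, now=None):
    """A notification is hidden once `expires_at` is in the past; None never expires."""
    if expires_at is None:
        return False
    now = now or datetime.now(timezone.utc)
    return expires_at <= now
```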
---
## Multi-User Notifications
For events that affect multiple users, call `notify_user()` in a loop:
```python
def notify_team(team, title, message, **kwargs):
"""Send a notification to all members of a team."""
for member in team.members.all():
notify_user(user=member, title=title, message=message, **kwargs)
```
For large recipient lists, consider using a Celery task to avoid blocking the request.
---
## Notification Cleanup
Themis provides automatic cleanup via the management command:
```bash
# Uses THEMIS_NOTIFICATION_MAX_AGE_DAYS (default: 90)
python manage.py cleanup_notifications
# Override max age
python manage.py cleanup_notifications --max-age-days=60
```
**What gets deleted:**
- Read notifications older than the max age
- Dismissed notifications older than the max age
- Expired notifications (past their `expires_at`)
**What is preserved:**
- Unread notifications (regardless of age)
Schedule this as a daily cron job or Kubernetes CronJob.
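For plain cron, a daily entry might look like this (paths are placeholders for your deployment):

```bash
# Hypothetical crontab entry: clean up notifications daily at 03:00
0 3 * * * cd /srv/myapp && /srv/myapp/.venv/bin/python manage.py cleanup_notifications
```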
---
## Settings
Themis recognizes these settings for notification behavior:
```python
# Polling interval for the notification bell (seconds, 0 = disabled)
THEMIS_NOTIFICATION_POLL_INTERVAL = 60
# Hard ceiling for notification cleanup (days)
THEMIS_NOTIFICATION_MAX_AGE_DAYS = 90
```
Users control their own preferences in Settings:
- **Enable notifications** — master on/off switch
- **Minimum level** — filter low-priority notifications
- **Browser desktop notifications** — opt-in for OS-level alerts
- **Retention days** — how long to keep read notifications
---
## Anti-Patterns
- ❌ Don't create `UserNotification` objects directly — use `notify_user()`
- ❌ Don't send notifications in tight loops without `deduplicate=True`
- ❌ Don't use notifications for real-time chat — use WebSocket channels
- ❌ Don't store sensitive data in notification messages (they're visible in admin)
- ❌ Don't rely on notifications as the sole delivery mechanism — they may be disabled by the user
- ❌ Don't forget `source_app`/`source_model`/`source_id` — they enable cleanup and dedup
---
## Testing Notifications
```python
from datetime import date, timedelta

from django.contrib.auth import get_user_model
from django.test import TestCase

from themis.notifications import notify_user
from themis.models import UserNotification

from myapp.models import Task
from myapp.services import check_overdue_tasks  # wherever your trigger logic lives

User = get_user_model()
class MyAppNotificationTest(TestCase):
def test_task_overdue_notification(self):
"""Overdue task creates a danger notification."""
user = User.objects.create_user(username="test", password="pass")
task = Task.objects.create(
title="Deploy v2",
assignee=user,
due_date=date.today() - timedelta(days=1),
)
# Trigger your notification logic
check_overdue_tasks()
# Verify notification was created
notif = UserNotification.objects.get(
user=user,
source_app="tasks",
source_model="Task",
source_id=str(task.pk),
)
self.assertEqual(notif.level, "danger")
self.assertIn("Deploy v2", notif.title)
def test_disabled_user_gets_no_notification(self):
"""Users with notifications disabled get nothing."""
user = User.objects.create_user(username="quiet", password="pass")
user.profile.notifications_enabled = False
user.profile.save()
result = notify_user(user, "Should be skipped")
self.assertIsNone(result)
self.assertEqual(UserNotification.objects.count(), 0)
```

# Organization Model Pattern v1.0.0
Standard pattern for Organization models across Django applications. Each app implements its own Organization model following this pattern to ensure interoperability and consistent field names.
## 🐾 Red Panda Approval™
This pattern follows Red Panda Approval standards.
---
## Why a Pattern, Not a Shared Model
Organization requirements vary by domain. A financial app needs stock symbols and ISIN codes. A healthcare app needs provider IDs. An education app needs accreditation fields. Shipping a monolithic Organization model with 40+ fields forces every app to carry fields it does not need.
Instead, this pattern defines:
- **Required fields** every Organization model must have
- **Recommended fields** most apps should include
- **Extension guidelines** for domain-specific needs
- **Standard choice values** for interoperability
---
## Required Fields
Every Organization model must include these fields:
```python
import uuid
from django.conf import settings
from django.db import models
class Organization(models.Model):
# Primary key
id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
# Core identity
name = models.CharField(max_length=255, db_index=True,
help_text="Organization display name")
slug = models.SlugField(max_length=255, unique=True,
help_text="URL-friendly identifier")
# Classification
type = models.CharField(max_length=20, choices=TYPE_CHOICES,
help_text="Organization type")
status = models.CharField(max_length=20, choices=STATUS_CHOICES,
default="active", help_text="Current status")
# Location
country = models.CharField(max_length=2, help_text="ISO 3166-1 alpha-2 country code")
# Audit
created_at = models.DateTimeField(auto_now_add=True)
updated_at = models.DateTimeField(auto_now=True)
created_by = models.ForeignKey(settings.AUTH_USER_MODEL, on_delete=models.SET_NULL,
null=True, blank=True, related_name="created_organizations")
updated_by = models.ForeignKey(settings.AUTH_USER_MODEL, on_delete=models.SET_NULL,
null=True, blank=True, related_name="updated_organizations")
class Meta:
verbose_name = "Organization"
verbose_name_plural = "Organizations"
def __str__(self):
return self.name
def get_absolute_url(self):
from django.urls import reverse
return reverse("organization-detail", kwargs={"slug": self.slug})
```
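The `slug` field is typically derived from `name` — Django's `django.utils.text.slugify` does this. A simplified, ASCII-only equivalent for illustration:

```python
import re
import unicodedata

def make_slug(name: str) -> str:
    """Simplified slug generation (ASCII-only approximation of Django's slugify)."""
    # Decompose accents, then drop anything that won't survive ASCII encoding
    value = unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode()
    # Keep word characters, spaces, and hyphens; collapse runs into single hyphens
    value = re.sub(r"[^\w\s-]", "", value).strip().lower()
    return re.sub(r"[-\s]+", "-", value)
```

In practice, populate `slug` in `save()` or a form, and append a suffix on collision since the field is `unique=True`.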
---
## Standard Choice Values
Use these exact values for interoperability between apps:
### TYPE_CHOICES
```python
TYPE_CHOICES = [
("for-profit", "For-Profit"),
("non-profit", "Non-Profit"),
("government", "Government"),
("ngo", "NGO"),
("educational", "Educational"),
("healthcare", "Healthcare"),
("cooperative", "Cooperative"),
]
```
### STATUS_CHOICES
```python
STATUS_CHOICES = [
("active", "Active"),
("inactive", "Inactive"),
("pending", "Pending"),
("suspended", "Suspended"),
("dissolved", "Dissolved"),
("merged", "Merged"),
]
```
### SIZE_CHOICES (recommended)
```python
SIZE_CHOICES = [
("micro", "Micro (1-9)"),
("small", "Small (10-49)"),
("medium", "Medium (50-249)"),
("large", "Large (250-999)"),
("enterprise", "Enterprise (1000+)"),
]
```
### PARENT_RELATIONSHIP_CHOICES (if using hierarchy)
```python
PARENT_RELATIONSHIP_CHOICES = [
("subsidiary", "Subsidiary"),
("division", "Division"),
("branch", "Branch"),
("franchise", "Franchise"),
("joint-venture", "Joint Venture"),
("department", "Department"),
]
```
---
## Recommended Fields
Most apps should include these fields:
```python
# Extended identity
legal_name = models.CharField(max_length=255, blank=True, default="",
help_text="Full legal entity name")
abbreviated_name = models.CharField(max_length=50, blank=True, default="",
db_index=True, help_text="Short name/acronym")
# Classification
size = models.CharField(max_length=20, choices=SIZE_CHOICES, blank=True, default="",
help_text="Organization size")
# Contact
primary_email = models.EmailField(blank=True, default="", help_text="Primary contact email")
primary_phone = models.CharField(max_length=20, blank=True, default="", help_text="Primary phone")
website = models.URLField(blank=True, default="", help_text="Organization website")
# Address
address_line1 = models.CharField(max_length=255, blank=True, default="")
address_line2 = models.CharField(max_length=255, blank=True, default="")
city = models.CharField(max_length=100, blank=True, default="")
state_province = models.CharField(max_length=100, blank=True, default="")
postal_code = models.CharField(max_length=20, blank=True, default="")
# Content
overview = models.TextField(blank=True, default="", help_text="Organization description")
# Metadata
is_active = models.BooleanField(default=True, help_text="Soft delete flag")
tags = models.JSONField(default=list, blank=True, help_text="Flexible tags")
```
---
## Hierarchy Pattern
For apps that need parent-child organization relationships:
```python
# Hierarchical relationships
parent_organization = models.ForeignKey(
"self",
on_delete=models.SET_NULL,
null=True,
blank=True,
related_name="subsidiaries",
help_text="Parent organization",
)
parent_relationship_type = models.CharField(
max_length=20,
choices=PARENT_RELATIONSHIP_CHOICES,
blank=True,
default="",
help_text="Type of relationship with parent",
)
```
### Hierarchy Utility Functions
```python
def get_ancestors(org):
    """Walk up the parent chain. Returns list of Organization instances."""
    ancestors = []
    seen = {org.pk}  # guard against accidental cycles in the hierarchy
    current = org.parent_organization
    while current and current.pk not in seen:
        seen.add(current.pk)
        ancestors.append(current)
        current = current.parent_organization
    return ancestors
def get_descendants(org):
"""Recursively collect all child organizations."""
descendants = []
for child in org.subsidiaries.all():
descendants.append(child)
descendants.extend(get_descendants(child))
return descendants
```
⚠️ **Warning:** Recursive queries can be expensive. For deep hierarchies, consider using `django-mptt` or `django-treebeard`, or store a materialized path.
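One such alternative is a materialized path: store each organization's ancestor chain as a delimited string, so descendant lookups become a single prefix query instead of recursive Python. A hypothetical sketch:

```python
def materialized_path(org_id, parent_path=""):
    """Build a path like '/1/7/42/' for org 42 under org 7 under root org 1."""
    return f"{parent_path}{org_id}/" if parent_path else f"/{org_id}/"

def is_descendant(path, ancestor_path):
    """With materialized paths, 'is descendant of' is a prefix test."""
    return path != ancestor_path and path.startswith(ancestor_path)
```

The ORM equivalent would be a filter like `path__startswith=ancestor.path` on a `path` CharField kept up to date on save (field name is an assumption, not part of this pattern's required fields).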
---
## Domain Extension Examples
The examples below assume the required and recommended fields have been factored into an abstract `BaseOrganization` base class.
### Financial App
```python
class Organization(BaseOrganization):
revenue = models.DecimalField(max_digits=15, decimal_places=2, null=True, blank=True)
revenue_year = models.PositiveIntegerField(null=True, blank=True)
employee_count = models.PositiveIntegerField(null=True, blank=True)
stock_symbol = models.CharField(max_length=10, blank=True, default="", db_index=True)
fiscal_year_end_month = models.PositiveSmallIntegerField(null=True, blank=True)
```
### Healthcare App
```python
class Organization(BaseOrganization):
npi_number = models.CharField(max_length=10, blank=True, default="")
facility_type = models.CharField(max_length=30, choices=FACILITY_CHOICES)
bed_count = models.PositiveIntegerField(null=True, blank=True)
accreditation = models.JSONField(default=list, blank=True)
```
### Education App
```python
class Organization(BaseOrganization):
institution_type = models.CharField(max_length=30, choices=INSTITUTION_CHOICES)
student_count = models.PositiveIntegerField(null=True, blank=True)
accreditation_body = models.CharField(max_length=100, blank=True, default="")
```
---
## Anti-Patterns
- ❌ Don't use `null=True` on CharField/TextField — use `blank=True, default=""`
- ❌ Don't put all possible fields in a single model — extend per domain
- ❌ Don't use `Meta.ordering` on Organization — specify in queries
- ❌ Don't override `save()` for hierarchy calculation — use signals or service functions
- ❌ Don't expose sequential IDs in URLs — use slug or short UUID
---
## Indexing Recommendations
```python
class Meta:
indexes = [
models.Index(fields=["name"], name="org_name_idx"),
models.Index(fields=["status"], name="org_status_idx"),
models.Index(fields=["type"], name="org_type_idx"),
models.Index(fields=["country"], name="org_country_idx"),
]
```
Add domain-specific indexes as needed (e.g., `stock_symbol` for financial apps).

# S3/MinIO File Storage Pattern v1.0.0
Standardizes how Django apps in Spelunker store, read, and reference files in S3/MinIO, covering upload paths, model metadata fields, storage-agnostic I/O, and test isolation.
## 🐾 Red Panda Approval™
This pattern follows Red Panda Approval standards.
---
## Why a Pattern, Not a Shared Implementation
Each Django app stores files for a different domain purpose with different path conventions, processing workflows, and downstream consumers, making a single shared model impractical.
- The **rfp_manager** app needs files scoped under an RFP ID (info docs, question spreadsheets, generated exports), with no embedding — only LLM summarization
- The **solution_library** app needs files tied to vendor/solution hierarchies, with full text embedding and chunk storage, plus scraped documents that have no Django `FileField` at all
- The **rag** app needs to programmatically write chunk texts to S3 during embedding and read them back for search context
- The **core** app needs a simple image upload for organization logos without any processing pipeline
Instead, this pattern defines:
- **Required fields** — the minimum every file-backed model must have
- **Recommended fields** — metadata most implementations should track
- **Standard path conventions** — bucket key prefixes each domain owns
- **Storage-agnostic I/O** — how to read and write files so tests work without a real S3 bucket
---
## Required Fields
Every model that stores a file in S3/MinIO must have at minimum:
```python
from django.core.validators import FileExtensionValidator
from django.db import models
def my_domain_upload_path(instance, filename):
"""Return a scoped S3 key for this domain."""
    return f'my_domain/{instance.parent_id}/{filename}'
class MyDocument(models.Model):
file = models.FileField(
upload_to=my_domain_upload_path, # or a string prefix
validators=[FileExtensionValidator(allowed_extensions=[...])],
)
file_type = models.CharField(max_length=100, blank=True) # extension without dot
file_size = models.PositiveIntegerField(null=True, blank=True) # bytes
```
---
## Standard Path Conventions
Use these exact key prefixes so buckets stay organized and IAM policies can target prefixes.
| App / Purpose | S3 Key Prefix |
|--------------------------------|--------------------------------------------|
| Solution library documents | `documents/` |
| Scraped documentation sources  | `scraped/{source_id}/{filename}`            |
| Embedding chunk texts          | `chunks/{document_id}/chunk_{index}.txt`    |
| RFP information documents      | `rfp_info_documents/{rfp_id}/{filename}`    |
| RFP question spreadsheets      | `rfp_question_documents/{rfp_id}/{filename}`|
| RFP generated exports          | `rfp_exports/{rfp_id}/{filename}`           |
| Organization logos | `orgs/logos/` |
---
## Recommended Fields and Behaviors
Most file-backed models should also include these and populate them automatically.
```python
class MyDocument(models.Model):
# ... required fields above ...
# Recommended: explicit S3 key for programmatic access and admin visibility
s3_key = models.CharField(max_length=500, blank=True)
def save(self, *args, **kwargs):
"""Auto-populate file metadata on every save."""
if self.file:
self.s3_key = self.file.name
if hasattr(self.file, 'size'):
self.file_size = self.file.size
if self.file.name and '.' in self.file.name:
self.file_type = self.file.name.rsplit('.', 1)[-1].lower()
super().save(*args, **kwargs)
```
---
## Pattern Variant 1: FileField Upload (User-Initiated Upload)
Used by `rfp_manager.RFPInformationDocument`, `rfp_manager.RFPQuestionDocument`, `rfp_manager.RFPExport`, `solution_library.Document`, and `core.Organization`.
The user (or Celery task generating an export) provides a file. Django's `FileField` handles the upload to S3 automatically via the configured storage backend.
```python
import os
from django.core.validators import FileExtensionValidator
from django.db import models
def rfp_info_document_path(instance, filename):
"""Scope uploads under the parent RFP's ID to keep the bucket organized."""
    return f'rfp_info_documents/{instance.rfp.id}/{filename}'
class RFPInformationDocument(models.Model):
file = models.FileField(
upload_to=rfp_info_document_path,
validators=[FileExtensionValidator(
allowed_extensions=['pdf', 'doc', 'docx', 'txt', 'md']
)],
)
title = models.CharField(max_length=500)
file_type = models.CharField(max_length=100, blank=True)
file_size = models.PositiveIntegerField(null=True, blank=True)
def save(self, *args, **kwargs):
if self.file:
if hasattr(self.file, 'size'):
self.file_size = self.file.size
if self.file.name:
self.file_type = os.path.splitext(self.file.name)[1].lstrip('.')
super().save(*args, **kwargs)
```
---
## Pattern Variant 2: Programmatic Write (Code-Generated Content)
Used by `rag.services.embeddings` (chunk texts) and `solution_library.services.sync` (scraped documents).
Content is generated or fetched in code and written directly to S3 using `default_storage.save()` with a `ContentFile`. The model records the resulting S3 key for later retrieval.
```python
from django.core.files.base import ContentFile
from django.core.files.storage import default_storage
def store_chunk(document_id: int, chunk_index: int, text: str) -> str:
"""
Store an embedding chunk in S3 and return the saved key.
Returns:
The actual S3 key (may differ from requested if file_overwrite=False)
"""
s3_key = f'chunks/{document_id}/chunk_{chunk_index}.txt'
saved_key = default_storage.save(s3_key, ContentFile(text.encode('utf-8')))
return saved_key
def store_scraped_document(source_id: int, filename: str, content: str) -> str:
"""Store scraped document content in S3 and return the saved key."""
    s3_key = f'scraped/{source_id}/{filename}'
return default_storage.save(s3_key, ContentFile(content.encode('utf-8')))
```
When creating the model record after a programmatic write, use `s3_key` rather than a `FileField`:
```python
Document.objects.create(
title=filename,
s3_key=saved_key,
file_size=len(content),
file_type='md',
# Note: `file` field is intentionally empty — this is a scraped document
)
```
---
## Pattern Variant 3: Storage-Agnostic Read
Used by `rfp_manager.services.excel_processor`, `rag.services.embeddings._read_document_content`, and `solution_library.models.DocumentEmbedding.get_chunk_text`.
Always read via `default_storage.open()` so the same code works against S3 in production and `FileSystemStorage` in tests. Never construct a filesystem path from `settings.MEDIA_ROOT`.
```python
from django.core.files.storage import default_storage
from io import BytesIO
def load_binary_from_storage(file_path: str) -> BytesIO:
"""
Read a binary file from storage into a BytesIO buffer.
Works against S3/MinIO in production and FileSystemStorage in tests.
"""
with default_storage.open(file_path, 'rb') as f:
return BytesIO(f.read())
def read_text_from_storage(s3_key: str) -> str:
"""Read a text file from storage."""
with default_storage.open(s3_key, 'r') as f:
return f.read()
```
When a model has both a `file` field (user upload) and a bare `s3_key` (scraped/programmatic), check which path applies:
```python
def _read_document_content(self, document) -> str:
if document.s3_key and not document.file:
# Scraped document: no FileField, read by key
with default_storage.open(document.s3_key, 'r') as f:
return f.read()
# Uploaded document: use the FileField
with document.file.open('r') as f:
return f.read()
```
---
## Pattern Variant 4: S3 Connectivity Validation
Used by `solution_library.models.Document.clean()` and `solution_library.services.sync.sync_documentation_source`.
Validate that the bucket is reachable before attempting an upload or sync. This surfaces credential errors with a user-friendly message rather than a cryptic 500.
```python
from botocore.exceptions import ClientError, NoCredentialsError
from django.core.exceptions import ValidationError
from django.core.files.storage import default_storage
def validate_s3_connectivity():
"""
Raise ValidationError if S3/MinIO bucket is not accessible.
Only call on new uploads or at the start of a background sync.
"""
if not hasattr(default_storage, 'bucket'):
return # Not an S3 backend (e.g., tests), skip validation
try:
default_storage.bucket.meta.client.head_bucket(
Bucket=default_storage.bucket_name
)
except ClientError as e:
code = e.response.get('Error', {}).get('Code', '')
if code == '403':
raise ValidationError(
"S3/MinIO credentials are invalid or permissions are insufficient."
)
elif code == '404':
raise ValidationError(
f"Bucket '{default_storage.bucket_name}' does not exist."
)
raise ValidationError(f"S3/MinIO error ({code}): {e}")
except NoCredentialsError:
raise ValidationError("S3/MinIO credentials are not configured.")
```
In a model's `clean()`, guard with `not self.pk` to avoid checking on every update:
```python
def clean(self):
super().clean()
if self.file and not self.pk: # New uploads only
validate_s3_connectivity()
```
---
## Domain Extension Examples
### rfp_manager App
RFP documents are scoped under the RFP ID for isolation and easy cleanup. The app uses three document types (info, question, export), each with its own callable path function to keep the bucket navigation clear.
```python
def rfp_export_path(instance, filename):
    return f'rfp_exports/{instance.rfp.id}/{filename}'
class RFPExport(models.Model):
export_file = models.FileField(upload_to=rfp_export_path)
version = models.CharField(max_length=50)
file_size = models.PositiveIntegerField(null=True, blank=True)
question_count = models.IntegerField()
answered_count = models.IntegerField()
# No s3_key field - export files are always accessed via FileField
```
### solution_library App
Solution library documents track an explicit `s3_key` because the app supports two document origins: user uploads (with `FileField`) and scraped documents (programmatic write only, no `FileField`). For embedding, chunk texts are stored separately in S3 and referenced from `DocumentEmbedding` via `chunk_s3_key`.
```python
class Document(models.Model):
file = models.FileField(upload_to='documents/', blank=True) # blank=True: scraped docs
s3_key = models.CharField(max_length=500, blank=True) # always populated
content_hash = models.CharField(max_length=64, blank=True, db_index=True)
class DocumentEmbedding(models.Model):
document = models.ForeignKey(Document, on_delete=models.CASCADE, related_name='embeddings')
chunk_s3_key = models.CharField(max_length=500) # e.g. chunks/42/chunk_7.txt
chunk_index = models.IntegerField()
chunk_size = models.PositiveIntegerField()
embedding = VectorField(null=True, blank=True) # pgvector column
def get_chunk_text(self) -> str:
from django.core.files.storage import default_storage
with default_storage.open(self.chunk_s3_key, 'r') as f:
return f.read()
```
---
## Anti-Patterns
- ❌ Don't build filesystem paths with `os.path.join(settings.MEDIA_ROOT, ...)` — always read through `default_storage.open()`
- ❌ Don't store file content as a `TextField` or `BinaryField` in the database
- ❌ Don't use `default_acl='public-read'` — all Spelunker buckets use `private` ACL with `querystring_auth=True` (pre-signed URLs)
- ❌ Don't skip `FileExtensionValidator` on upload fields — it is the first line of defence against unexpected file types
- ❌ Don't call `document.file.storage.size()` or `.exists()` in hot paths — these make network round-trips; use the `s3_key` and metadata fields for display purposes
- ❌ Don't make S3 API calls in tests without first overriding `STORAGES` in `test_settings.py`
- ❌ Don't use `file_overwrite=True` — the global setting `file_overwrite=False` ensures Django auto-appends a unique suffix rather than silently overwriting existing objects
---
## Settings
```python
# spelunker/settings.py
STORAGES = {
"default": {
"BACKEND": "storages.backends.s3boto3.S3Boto3Storage",
"OPTIONS": {
"access_key": env('S3_ACCESS_KEY'),
"secret_key": env('S3_SECRET_KEY'),
"bucket_name": env('S3_BUCKET_NAME'),
"endpoint_url": env('S3_ENDPOINT_URL'), # Use for MinIO or non-AWS S3
            "use_ssl": env.bool('S3_USE_SSL', default=True),
"default_acl": env('S3_DEFAULT_ACL'), # Must be 'private'
"region_name": env('S3_REGION_NAME'),
"file_overwrite": False, # Prevent silent overwrites
"querystring_auth": True, # Pre-signed URLs for all access
"verify": env.bool('S3_VERIFY_SSL', default=True),
}
},
"staticfiles": {
# Static files are served locally (nginx), never from S3
"BACKEND": "django.contrib.staticfiles.storage.StaticFilesStorage",
},
}
```
Environment variables (see `.env.example`):
```bash
S3_ACCESS_KEY=
S3_SECRET_KEY=
S3_BUCKET_NAME=spelunker-documents
S3_ENDPOINT_URL=http://localhost:9000 # MinIO local dev
S3_USE_SSL=False
S3_VERIFY_SSL=False
S3_DEFAULT_ACL=private
S3_REGION_NAME=us-east-1
```
Test override (disables all S3 calls):
```python
# spelunker/test_settings.py
STORAGES = {
"default": {
"BACKEND": "django.core.files.storage.FileSystemStorage",
"OPTIONS": {"location": "/tmp/test_media/"},
},
"staticfiles": {
"BACKEND": "django.contrib.staticfiles.storage.StaticFilesStorage",
},
}
```
---
## Testing
Standard test cases every file-backed implementation should cover.
```python
import os
import tempfile
from django.core.files.uploadedfile import SimpleUploadedFile
from django.test import TestCase, override_settings
@override_settings(
STORAGES={
"default": {
"BACKEND": "django.core.files.storage.FileSystemStorage",
"OPTIONS": {"location": tempfile.mkdtemp()},
},
"staticfiles": {
"BACKEND": "django.contrib.staticfiles.storage.StaticFilesStorage",
},
}
)
class MyDocumentStorageTest(TestCase):
def test_file_metadata_populated_on_save(self):
"""file_type and file_size are auto-populated from the uploaded file."""
uploaded = SimpleUploadedFile("report.pdf", b"%PDF-1.4 content", content_type="application/pdf")
doc = MyDocument.objects.create(file=uploaded, title="Test")
self.assertEqual(doc.file_type, "pdf")
self.assertGreater(doc.file_size, 0)
def test_upload_path_includes_parent_id(self):
"""upload_to callable scopes the key under the parent ID."""
uploaded = SimpleUploadedFile("q.xlsx", b"PK content")
doc = MyDocument.objects.create(file=uploaded, title="Questions", rfp=self.rfp)
self.assertIn(str(self.rfp.id), doc.file.name)
def test_rejected_extension(self):
"""FileExtensionValidator rejects disallowed file types."""
from django.core.exceptions import ValidationError
uploaded = SimpleUploadedFile("hack.exe", b"MZ")
doc = MyDocument(file=uploaded, title="Bad")
with self.assertRaises(ValidationError):
doc.full_clean()
def test_storage_agnostic_read(self):
"""Reading via default_storage.open() works against FileSystemStorage."""
from django.core.files.base import ContentFile
from django.core.files.storage import default_storage
key = default_storage.save("test/hello.txt", ContentFile(b"hello world"))
with default_storage.open(key, 'r') as f:
content = f.read()
self.assertEqual(content, "hello world")
default_storage.delete(key)
```

# SSO with Allauth & Casdoor Pattern v1.0.0
Standardizes OIDC-based Single Sign-On using Django Allauth and Casdoor, covering adapter customization, user provisioning, group mapping, superuser protection, and configurable local-login fallback. Used by the `core` Django application.
## 🐾 Red Panda Approval™
This pattern follows Red Panda Approval standards.
---
## Why a Pattern, Not a Shared Implementation
Every Django project that adopts SSO has different identity-provider configurations, claim schemas, permission models, and organizational structures:
- A **project management** app needs role claims mapped to project-scoped permissions
- An **e-commerce** app needs tenant/store claims with purchase-limit groups
- An **RFP tool** (Spelunker) needs organization + group claims mapped to View Only / Staff / SME / Admin groups
Instead, this pattern defines:
- **Required components** — every implementation must have
- **Required settings** — Django & Allauth configuration values
- **Standard conventions** — group names, claim mappings, redirect URL format
- **Extension guidelines** — for domain-specific provisioning logic
---
## Required Components
Every SSO implementation following this pattern must provide these files:
| Component | Location | Purpose |
|-----------|----------|---------|
| Social account adapter | `<app>/adapters.py` | User provisioning, group mapping, superuser protection |
| Local account adapter | `<app>/adapters.py` | Disable local signup, authentication logging |
| Management command | `<app>/management/commands/create_sso_groups.py` | Idempotent group + permission creation |
| Login template | `templates/account/login.html` | SSO button + conditional local login form |
| Context processor | `<app>/context_processors.py` | Expose `CASDOOR_ENABLED` / `ALLOW_LOCAL_LOGIN` to templates |
| SSL patch (optional) | `<app>/ssl_patch.py` | Development-only SSL bypass |
### Minimum settings.py configuration
```python
# INSTALLED_APPS — required entries
INSTALLED_APPS = [
# ... standard Django apps ...
'allauth',
'allauth.account',
'allauth.socialaccount',
'allauth.socialaccount.providers.openid_connect',
'<your_app>',
]
# MIDDLEWARE — Allauth middleware is required
MIDDLEWARE = [
# ... standard Django middleware ...
'allauth.account.middleware.AccountMiddleware',
]
# AUTHENTICATION_BACKENDS — both local and SSO
AUTHENTICATION_BACKENDS = [
'django.contrib.auth.backends.ModelBackend',
'allauth.account.auth_backends.AuthenticationBackend',
]
```
---
## Standard Values / Conventions
### Environment Variables
Every deployment must set these environment variables (or `.env` entries):
| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| `CASDOOR_ENABLED` | Yes | — | Enable/disable SSO (`true`/`false`) |
| `CASDOOR_ORIGIN` | Yes | — | Casdoor backend URL for OIDC discovery |
| `CASDOOR_ORIGIN_FRONTEND` | Yes | — | Casdoor frontend URL (may differ behind reverse proxy) |
| `CASDOOR_CLIENT_ID` | Yes | — | OAuth client ID from Casdoor application |
| `CASDOOR_CLIENT_SECRET` | Yes | — | OAuth client secret from Casdoor application |
| `CASDOOR_ORG_NAME` | Yes | — | Default organization slug in Casdoor |
| `ALLOW_LOCAL_LOGIN` | No | `false` | Show local login form for non-superusers |
| `CASDOOR_SSL_VERIFY` | No | `true` | SSL verification (`true`, `false`, or CA-bundle path) |
### Redirect URL Convention
The Allauth OIDC callback URL follows a fixed format. Register this URL in Casdoor:
```
/accounts/oidc/<provider_id>/login/callback/
```
For Spelunker with `provider_id = casdoor`:
```
/accounts/oidc/casdoor/login/callback/
```
> **Important:** The path segment is `oidc`, not `openid_connect`.
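Assuming allauth's URLs are mounted at the conventional `/accounts/` prefix, this callback route is provided automatically; a minimal `config/urls.py` sketch:

```python
# config/urls.py — minimal sketch; mounting allauth at /accounts/
# automatically exposes /accounts/oidc/<provider_id>/login/callback/
from django.contrib import admin
from django.urls import include, path

urlpatterns = [
    path('admin/', admin.site.urls),
    path('accounts/', include('allauth.urls')),  # SSO login + OIDC callback
]
```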
### Standard Group Mapping
Casdoor group names map to Django groups with consistent naming:
| Casdoor Group | Django Group | `is_staff` | Permissions |
|---------------|-------------|------------|-------------|
| `view_only` | `View Only` | `False` | `view_*` |
| `staff` | `Staff` | `True` | `view_*`, `add_*`, `change_*` |
| `sme` | `SME` | `True` | `view_*`, `add_*`, `change_*` |
| `admin` | `Admin` | `True` | `view_*`, `add_*`, `change_*`, `delete_*` |
### Standard OIDC Claim Mapping
| Casdoor Claim | Django Field | Notes |
|---------------|-------------|-------|
| `email` | `User.username`, `User.email` | Full email used as username |
| `given_name` | `User.first_name` | — |
| `family_name` | `User.last_name` | — |
| `name` | Parsed into first/last | Fallback when given/family absent |
| `organization` | Organization lookup/create | Via adapter |
| `groups` | Django Group membership | Via adapter mapping |
---
## Recommended Settings
Most implementations should include these Allauth settings:
```python
# Authentication mode
ACCOUNT_LOGIN_METHODS = {'email'}
ACCOUNT_SIGNUP_FIELDS = ['email*', 'password1*', 'password2*']
ACCOUNT_EMAIL_VERIFICATION = 'optional'
ACCOUNT_SESSION_REMEMBER = True
ACCOUNT_LOGIN_ON_PASSWORD_RESET = True
ACCOUNT_UNIQUE_EMAIL = True
# Redirects
LOGIN_REDIRECT_URL = '/dashboard/'
ACCOUNT_LOGOUT_REDIRECT_URL = '/'
LOGIN_URL = '/accounts/login/'
# Social account behavior
SOCIALACCOUNT_AUTO_SIGNUP = True
SOCIALACCOUNT_EMAIL_VERIFICATION = 'none'
SOCIALACCOUNT_QUERY_EMAIL = True
SOCIALACCOUNT_STORE_TOKENS = True
SOCIALACCOUNT_ADAPTER = '<app>.adapters.CasdoorAccountAdapter'
ACCOUNT_ADAPTER = '<app>.adapters.LocalAccountAdapter'
# Session management
SESSION_COOKIE_AGE = 28800 # 8 hours
SESSION_SAVE_EVERY_REQUEST = True
# Account linking — auto-connect SSO to an existing local account with
# the same verified email instead of raising a conflict error
SOCIALACCOUNT_EMAIL_AUTHENTICATION_AUTO_CONNECT = True
```
### Multi-Factor Authentication (Recommended)
Add `allauth.mfa` for TOTP/WebAuthn second-factor support:
```python
INSTALLED_APPS += ['allauth.mfa']
MFA_ADAPTER = 'allauth.mfa.adapter.DefaultMFAAdapter'
```
MFA is enforced per-user inside Django; Casdoor may also enforce its own MFA upstream.
### Rate Limiting on Local Login (Recommended)
Protect the local login form from brute-force attacks with `django-axes` or similar:
```python
# pip install django-axes
INSTALLED_APPS += ['axes']
AUTHENTICATION_BACKENDS = [
'axes.backends.AxesStandaloneBackend',
'django.contrib.auth.backends.ModelBackend',
'allauth.account.auth_backends.AuthenticationBackend',
]
AXES_FAILURE_LIMIT = 5 # Lock after 5 failures
AXES_COOLOFF_TIME = 1 # 1-hour cooloff
AXES_LOCKOUT_PARAMETERS = ['ip_address', 'username']
```
---
## Social Account Adapter
The social account adapter is the core of the pattern. It handles user provisioning on SSO login, maps claims to Django fields, enforces superuser protection, and assigns groups.
```python
from allauth.socialaccount.adapter import DefaultSocialAccountAdapter
from allauth.core.exceptions import ImmediateHttpResponse
from django.contrib.auth.models import User, Group
from django.contrib import messages
from django.shortcuts import redirect
import logging
logger = logging.getLogger(__name__)
class CasdoorAccountAdapter(DefaultSocialAccountAdapter):
def is_open_for_signup(self, request, sociallogin):
"""Always allow SSO-initiated signup."""
return True
def pre_social_login(self, request, sociallogin):
"""
Runs on every SSO login (new and returning users).
1. Blocks superusers — they must use local auth.
2. Re-syncs organization and group claims for returning users
so that IdP changes are reflected immediately.
"""
if sociallogin.user.id:
user = sociallogin.user
# --- Superuser gate ---
if user.is_superuser:
logger.warning(
f"SSO login blocked for superuser {user.username}. "
"Superusers must use local authentication."
)
messages.error(
request,
"Superuser accounts must use local authentication."
)
raise ImmediateHttpResponse(redirect('account_login'))
# --- Re-sync claims for returning users ---
extra_data = sociallogin.account.extra_data
org_identifier = extra_data.get('organization', '')
if org_identifier:
self._assign_organization(user, org_identifier)
groups = extra_data.get('groups', [])
self._assign_groups(user, groups)
user.is_staff = any(
g in ['staff', 'sme', 'admin'] for g in groups
)
user.save(update_fields=['is_staff'])
def populate_user(self, request, sociallogin, data):
"""Map Casdoor claims to Django User fields."""
user = super().populate_user(request, sociallogin, data)
email = data.get('email', '')
user.username = email
user.email = email
user.first_name = data.get('given_name', '')
user.last_name = data.get('family_name', '')
# Fallback: parse full 'name' claim
if not user.first_name and not user.last_name:
full_name = data.get('name', '')
if full_name:
parts = full_name.split(' ', 1)
user.first_name = parts[0]
user.last_name = parts[1] if len(parts) > 1 else ''
# Security: SSO users are never superusers
user.is_superuser = False
# Set is_staff from group membership
groups = data.get('groups', [])
user.is_staff = any(g in ['staff', 'sme', 'admin'] for g in groups)
return user
def save_user(self, request, sociallogin, form=None):
"""Save user and handle organization + group mapping."""
user = super().save_user(request, sociallogin, form)
extra_data = sociallogin.account.extra_data
org_identifier = extra_data.get('organization', '')
if org_identifier:
self._assign_organization(user, org_identifier)
groups = extra_data.get('groups', [])
self._assign_groups(user, groups)
return user
def _assign_organization(self, user, org_identifier):
"""Assign (or create) organization from the OIDC claim."""
# Domain-specific — see Extension Examples below
raise NotImplementedError("Override per project")
def _assign_groups(self, user, group_names):
"""Map Casdoor groups to Django groups."""
group_mapping = {
'view_only': 'View Only',
'staff': 'Staff',
'sme': 'SME',
'admin': 'Admin',
}
user.groups.clear()
for casdoor_group in group_names:
django_group_name = group_mapping.get(casdoor_group.lower())
if django_group_name:
group, _ = Group.objects.get_or_create(name=django_group_name)
user.groups.add(group)
logger.info(f"Added {user.username} to group {django_group_name}")
```
---
## Local Account Adapter
Prevents local registration and logs authentication failures:
```python
from allauth.account.adapter import DefaultAccountAdapter
import logging
logger = logging.getLogger(__name__)
class LocalAccountAdapter(DefaultAccountAdapter):
def is_open_for_signup(self, request):
"""Disable local signup — all users come via SSO or admin."""
return False
def authentication_failed(self, request, **kwargs):
"""Log failures for security monitoring."""
logger.warning(
f"Local authentication failed from {request.META.get('REMOTE_ADDR')}"
)
super().authentication_failed(request, **kwargs)
```
---
## OIDC Provider Configuration
Register Casdoor as an OpenID Connect provider in `settings.py`:
```python
SOCIALACCOUNT_PROVIDERS = {
'openid_connect': {
'APPS': [
{
'provider_id': 'casdoor',
'name': 'Casdoor SSO',
'client_id': CASDOOR_CLIENT_ID,
'secret': CASDOOR_CLIENT_SECRET,
'settings': {
'server_url': f'{CASDOOR_ORIGIN}/.well-known/openid-configuration',
},
}
],
'OAUTH_PKCE_ENABLED': True,
}
}
```
---
## Management Command — Group Creation
An idempotent management command ensures groups and permissions exist:
```python
from django.core.management.base import BaseCommand
from django.contrib.auth.models import Group, Permission
class Command(BaseCommand):
help = 'Create Django groups for Casdoor SSO integration'
def handle(self, *args, **options):
groups_config = {
'View Only': {'permissions': ['view']},
'Staff': {'permissions': ['view', 'add', 'change']},
'SME': {'permissions': ['view', 'add', 'change']},
'Admin': {'permissions': ['view', 'add', 'change', 'delete']},
}
# Add your domain-specific model names here
models_to_permission = [
'vendor', 'document', 'rfp', 'rfpquestion',
]
for group_name, config in groups_config.items():
group, created = Group.objects.get_or_create(name=group_name)
status = 'Created' if created else 'Exists'
self.stdout.write(f'{status}: {group_name}')
for perm_prefix in config['permissions']:
for model in models_to_permission:
try:
perm = Permission.objects.get(
codename=f'{perm_prefix}_{model}'
)
group.permissions.add(perm)
except Permission.DoesNotExist:
pass
self.stdout.write(self.style.SUCCESS('SSO groups created successfully'))
```
---
## Login Template
The login template shows an SSO button when Casdoor is enabled and conditionally reveals the local login form:
```html
{% load socialaccount %}
<!-- SSO Login Button (POST form for CSRF protection) -->
{% if CASDOOR_ENABLED %}
<form method="post" action="{% provider_login_url 'casdoor' %}">
{% csrf_token %}
<button type="submit">Sign in with SSO</button>
</form>
{% endif %}
<!-- Local Login Form (conditional) -->
{% if ALLOW_LOCAL_LOGIN or user.is_superuser %}
<form method="post" action="{% url 'account_login' %}">
{% csrf_token %}
{{ form.as_p }}
<button type="submit">Sign In Locally</button>
</form>
{% endif %}
```
> **Why POST?** Using an `<a href>` GET link to initiate the OAuth flow skips CSRF
> validation. Allauth's `{% provider_login_url %}` is designed for use inside a
> `<form method="post">` so the CSRF token is verified before the redirect.
---
## Context Processor
Exposes SSO settings to every template:
```python
from django.conf import settings
def user_preferences(request):
context = {}
# Always expose SSO flags for the login page
context['CASDOOR_ENABLED'] = getattr(settings, 'CASDOOR_ENABLED', False)
context['ALLOW_LOCAL_LOGIN'] = getattr(settings, 'ALLOW_LOCAL_LOGIN', False)
return context
```
Register in `settings.py`:
```python
TEMPLATES = [{
'OPTIONS': {
'context_processors': [
# ... standard processors ...
'<app>.context_processors.user_preferences',
],
},
}]
```
---
## SSL Bypass (Development Only)
For sandbox environments with self-signed certificates, an optional SSL patch disables verification at the `requests` library level:
```python
import os, logging
logger = logging.getLogger(__name__)
def apply_ssl_bypass():
ssl_verify = os.environ.get('CASDOOR_SSL_VERIFY', 'true').lower()
if ssl_verify != 'false':
return
logger.warning("SSL verification DISABLED — sandbox only")
import urllib3
from requests.adapters import HTTPAdapter
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
_original_send = HTTPAdapter.send
def _patched_send(self, request, stream=False, timeout=None,
verify=True, cert=None, proxies=None):
return _original_send(self, request, stream=stream,
timeout=timeout, verify=False,
cert=cert, proxies=proxies)
HTTPAdapter.send = _patched_send
apply_ssl_bypass()
```
Load it at the top of `settings.py` **before** any library imports that make HTTP calls:
```python
_ssl_verify = os.environ.get('CASDOOR_SSL_VERIFY', 'true').lower()
if _ssl_verify == 'false':
import <app>.ssl_patch # noqa: F401
```
---
## Logout Flow
By default, Django's `account_logout` destroys the local session but does **not** terminate the upstream Casdoor session. The user remains logged in at the IdP and will be silently re-authenticated on next visit.
### Options
| Strategy | Behaviour | Implementation |
|----------|-----------|----------------|
| **Local-only logout** (default) | Destroys Django session; IdP session survives | No extra work |
| **IdP redirect logout** | Redirects to Casdoor's `/api/logout` after local logout | Override `ACCOUNT_LOGOUT_REDIRECT_URL` to point at Casdoor |
| **OIDC back-channel logout** | Casdoor notifies Django to invalidate sessions | Requires Casdoor back-channel support + a Django webhook endpoint |
### Recommended: IdP redirect logout
```python
# settings.py
ACCOUNT_LOGOUT_REDIRECT_URL = (
f'{CASDOOR_ORIGIN}/api/logout'
f'?post_logout_redirect_uri=https://your-app.example.com/'
)
```
This ensures the Casdoor session cookie is cleared before the user returns to your app.
---
## Domain Extension Examples
### Spelunker (RFP Tool)
Spelunker's adapter creates organizations on first encounter and links them to user profiles:
```python
def _assign_organization(self, user, org_identifier):
from django.db import models
from django.utils.text import slugify
from core.models import Organization
try:
org = Organization.objects.filter(
models.Q(slug=org_identifier) | models.Q(name=org_identifier)
).first()
if not org:
org = Organization.objects.create(
name=org_identifier,
slug=slugify(org_identifier),
type='for-profit',
legal_country='CA',
status='active',
)
logger.info(f"Created organization: {org.name}")
        if hasattr(user, 'profile'):
            # Assumes the profile model exposes an `organization` foreign key
            user.profile.organization = org
            user.profile.save(update_fields=['organization'])
            logger.info(f"Assigned {user.username} to {org.name}")
except Exception as e:
logger.error(f"Organization assignment error: {e}")
```
### Multi-Tenant SaaS App
A multi-tenant app might restrict users to a single tenant and enforce tenant isolation:
```python
def _assign_organization(self, user, org_identifier):
from tenants.models import Tenant
tenant = Tenant.objects.filter(external_id=org_identifier).first()
if not tenant:
raise ValueError(f"Unknown tenant: {org_identifier}")
user.tenant = tenant
user.save(update_fields=['tenant'])
```
---
## Anti-Patterns
- ❌ Don't allow SSO to grant `is_superuser` — always force `is_superuser = False` in `populate_user`
- ❌ Don't *log-and-continue* for superuser SSO attempts — raise `ImmediateHttpResponse` to actually block the login
- ❌ Don't disable local login for superusers — they need emergency access when SSO is unavailable
- ❌ Don't rely on SSO username claims — use email as the canonical identifier
- ❌ Don't hard-code the OIDC provider URL — always read from environment variables
- ❌ Don't skip the management command — groups and permissions must be idempotent and repeatable
- ❌ Don't use `CASDOOR_SSL_VERIFY=false` in production — only for sandbox environments with self-signed certificates
- ❌ Don't forget PKCE — always set `OAUTH_PKCE_ENABLED: True` for Authorization Code flow
- ❌ Don't sync groups only on first login — re-sync in `pre_social_login` so IdP changes take effect immediately
- ❌ Don't use a GET link (`<a href>`) to start the OAuth flow — use a POST form so CSRF protection applies
- ❌ Don't assume Django logout kills the IdP session — configure an IdP redirect or back-channel logout
- ❌ Don't leave the local login endpoint unprotected — add rate limiting (e.g. `django-axes`) to prevent brute-force attacks
---
## Settings
All Django settings this pattern recognizes:
```python
# settings.py
# --- SSO Provider ---
CASDOOR_ENABLED = env.bool('CASDOOR_ENABLED') # Master SSO toggle
CASDOOR_ORIGIN = env('CASDOOR_ORIGIN') # OIDC discovery base URL
CASDOOR_ORIGIN_FRONTEND = env('CASDOOR_ORIGIN_FRONTEND') # Frontend URL (may differ)
CASDOOR_CLIENT_ID = env('CASDOOR_CLIENT_ID') # OAuth client ID
CASDOOR_CLIENT_SECRET = env('CASDOOR_CLIENT_SECRET') # OAuth client secret
CASDOOR_ORG_NAME = env('CASDOOR_ORG_NAME') # Default organization
CASDOOR_SSL_VERIFY = env('CASDOOR_SSL_VERIFY') # true | false | /path/to/ca.pem
# --- Login Behavior ---
ALLOW_LOCAL_LOGIN = env.bool('ALLOW_LOCAL_LOGIN', default=False) # Show local form
# --- Allauth ---
SOCIALACCOUNT_ADAPTER = '<app>.adapters.CasdoorAccountAdapter'
ACCOUNT_ADAPTER = '<app>.adapters.LocalAccountAdapter'
```
---
## Testing
Standard test cases every implementation should cover:
```python
from django.test import TestCase, override_settings
from unittest.mock import MagicMock
from django.contrib.auth.models import User, Group
from <app>.adapters import CasdoorAccountAdapter, LocalAccountAdapter
class CasdoorAdapterTest(TestCase):
def setUp(self):
self.adapter = CasdoorAccountAdapter()
def test_signup_always_open(self):
"""SSO signup must always be permitted."""
self.assertTrue(self.adapter.is_open_for_signup(MagicMock(), MagicMock()))
def test_superuser_never_set_via_sso(self):
"""populate_user must force is_superuser=False."""
sociallogin = MagicMock()
data = {'email': 'admin@example.com', 'groups': ['admin']}
user = self.adapter.populate_user(MagicMock(), sociallogin, data)
self.assertFalse(user.is_superuser)
def test_email_used_as_username(self):
"""Username must be the full email address."""
sociallogin = MagicMock()
data = {'email': 'jane@example.com'}
user = self.adapter.populate_user(MagicMock(), sociallogin, data)
self.assertEqual(user.username, 'jane@example.com')
def test_staff_flag_from_groups(self):
"""is_staff must be True when user belongs to staff/sme/admin."""
sociallogin = MagicMock()
for group in ['staff', 'sme', 'admin']:
data = {'email': 'user@example.com', 'groups': [group]}
user = self.adapter.populate_user(MagicMock(), sociallogin, data)
self.assertTrue(user.is_staff, f"is_staff should be True for group '{group}'")
def test_name_fallback_parsing(self):
"""When given_name/family_name absent, parse 'name' claim."""
sociallogin = MagicMock()
data = {'email': 'user@example.com', 'name': 'Jane Doe'}
user = self.adapter.populate_user(MagicMock(), sociallogin, data)
self.assertEqual(user.first_name, 'Jane')
self.assertEqual(user.last_name, 'Doe')
def test_group_mapping(self):
"""Casdoor groups must map to correctly named Django groups."""
Group.objects.create(name='View Only')
Group.objects.create(name='Staff')
user = User.objects.create_user('test@example.com', 'test@example.com')
self.adapter._assign_groups(user, ['view_only', 'staff'])
group_names = set(user.groups.values_list('name', flat=True))
self.assertEqual(group_names, {'View Only', 'Staff'})
def test_superuser_sso_login_blocked(self):
"""pre_social_login must raise ImmediateHttpResponse for superusers."""
from allauth.core.exceptions import ImmediateHttpResponse
user = User.objects.create_superuser(
'admin@example.com', 'admin@example.com', 'pass'
)
sociallogin = MagicMock()
sociallogin.user = user
sociallogin.user.id = user.id
with self.assertRaises(ImmediateHttpResponse):
self.adapter.pre_social_login(MagicMock(), sociallogin)
def test_groups_resync_on_returning_login(self):
"""pre_social_login must re-sync groups for existing users."""
Group.objects.create(name='Admin')
Group.objects.create(name='Staff')
user = User.objects.create_user('user@example.com', 'user@example.com')
user.groups.add(Group.objects.get(name='Staff'))
sociallogin = MagicMock()
sociallogin.user = user
sociallogin.user.id = user.id
sociallogin.account.extra_data = {
'groups': ['admin'],
'organization': '',
}
self.adapter.pre_social_login(MagicMock(), sociallogin)
group_names = set(user.groups.values_list('name', flat=True))
self.assertEqual(group_names, {'Admin'})
class LocalAdapterTest(TestCase):
def test_local_signup_disabled(self):
"""Local signup must always be disabled."""
adapter = LocalAccountAdapter()
self.assertFalse(adapter.is_open_for_signup(MagicMock()))
```

# docs/Red Panda Django.md

## Red Panda Approval™
This project follows Red Panda Approval standards - our gold standard for Django application quality. Code must be elegant, reliable, and maintainable to earn the approval of our adorable red panda judges.
### The 5 Sacred Django Criteria
1. **Fresh Migration Test** - Clean migrations from empty database
2. **Elegant Simplicity** - No unnecessary complexity
3. **Observable & Debuggable** - Proper logging and error handling
4. **Consistent Patterns** - Follow Django conventions
5. **Actually Works** - Passes all checks and serves real user needs
### Standards
#### Environment
- Virtual environment: ~/env/PROJECT/bin/activate
- Python version: 3.12
#### Code Organization
- Maximum file length: 1000 lines
- CSS: External .css files only (no inline/embedded)
- JS: External .js files only (no inline/embedded)
#### Required Packages
- Bootstrap 5.x (no custom CSS unless absolutely necessary)
- Bootstrap Icons (no emojis)
- django-crispy-forms + crispy-bootstrap5
- django-allauth
#### Testing
- Framework: Django TestCase (not pytest)
- Minimum coverage: XX%? (optional)
### Database Conventions
#### Development vs Production
- Development: SQLite
- Production: PostgreSQL
- Use dj-database-url for configuration
#### Model Naming
- Model names: singular PascalCase (User, BlogPost, OrderItem)
- Related names: plural snake_case with proper English pluralization
  - user.blog_posts, order.items
  - category.industries (not industrys)
  - person.children (not childs)
  - analysis.analyses (not analysiss)
- Through tables: describe relationship (ProjectMembership, CourseEnrollment)
#### Field Naming
- Foreign keys: singular without _id suffix (author, category, parent)
- Boolean fields: use prefixes (is_active, has_permission, can_edit)
- Date fields: use suffixes (created_at, updated_at, published_on)
- Avoid abbreviations (use description, not desc)
#### Required Model Fields
All models should include:
- created_at = models.DateTimeField(auto_now_add=True)
- updated_at = models.DateTimeField(auto_now=True)
Consider adding:
- id = models.UUIDField(primary_key=True) for public-facing models
- is_active = models.BooleanField(default=True) for soft deletes
#### Indexing
- Add db_index=True to frequently queried fields
- Use Meta.indexes for composite indexes
- Document why each index exists
#### Migrations
- Never edit migrations that have been deployed
- Use meaningful migration names: --name add_email_to_profile
- One logical change per migration when possible
- Test migrations both forward and backward
#### Queries
- Use select_related() for foreign keys
- Use prefetch_related() for reverse relations and M2M
- Avoid queries in loops (N+1 problem)
- Use .only() and .defer() for large models
- Add comments explaining complex querysets
## Monitoring & Health Check Endpoints
Follow standard Kubernetes health check endpoints for container orchestration:
### /ready/ - Readiness probe
Checks whether the application is ready to serve traffic:
- Validates database connectivity
- Validates cache connectivity
- Returns 200 if ready, 503 if dependencies are unavailable
- Used by load balancers to determine if the pod should receive traffic
### /live/ - Liveness probe
Checks whether the application process is alive:
- Simple health check with minimal logic
- Returns 200 if Django is responding to requests
- Used by Kubernetes to determine if the pod should be restarted

Note: For detailed metrics and monitoring, use Prometheus and Alloy integration rather than custom health endpoints.
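The two probes can be sketched as function-based views (module path and view names are illustrative):

```python
# health/views.py — illustrative sketch of the /ready/ and /live/ probes
from django.core.cache import cache
from django.db import connections
from django.http import HttpResponse, JsonResponse

def live(request):
    # Liveness: if Django can answer at all, the process is alive
    return HttpResponse("OK")

def ready(request):
    # Readiness: verify the dependencies this pod needs to serve traffic
    checks = {}
    try:
        connections['default'].cursor()
        checks['database'] = 'ok'
    except Exception as exc:
        checks['database'] = f'error: {exc}'
    try:
        cache.set('readiness_probe', '1', 5)
        checks['cache'] = 'ok' if cache.get('readiness_probe') == '1' else 'error'
    except Exception as exc:
        checks['cache'] = f'error: {exc}'
    status = 200 if all(v == 'ok' for v in checks.values()) else 503
    return JsonResponse(checks, status=status)
```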


---
## 🐾 Red Panda Approval™
This project follows Red Panda Approval standards — our gold standard for Django application quality. Code must be elegant, reliable, and maintainable to earn the approval of our adorable red panda judges.
### The 5 Sacred Django Criteria
1. **Fresh Migration Test** — Clean migrations from empty database
2. **Elegant Simplicity** — No unnecessary complexity
3. **Observable & Debuggable** — Proper logging and error handling
4. **Consistent Patterns** — Follow Django conventions
5. **Actually Works** — Passes all checks and serves real user needs
## Environment Standards
- Virtual environment: ~/env/PROJECT/bin/activate
- Use pyproject.toml for project configuration (no setup.py, no requirements.txt)
- Python version: specified in pyproject.toml
- Dependencies: floor-pinned with ceiling (e.g. `Django>=5.2,<6.0`)
### Dependency Pinning
```toml
# Correct — floor pin with ceiling
dependencies = [
"Django>=5.2,<6.0",
"djangorestframework>=3.14,<4.0",
"cryptography>=41.0,<45.0",
]
# Wrong — exact pins in library packages
dependencies = [
"Django==5.2.7", # too strict, breaks downstream
]
```
Exact pins (`==`) are only appropriate in application-level lock files, not in reusable library packages.
## Directory Structure
myproject/ # Git repository root
├── .gitignore
├── README.md
├── pyproject.toml # Project configuration (moved to repo root)
├── docker-compose.yml
├── .env # Docker Compose environment (DATABASE_URL=postgres://...)
├── .env.example
├── project/ # Django project root (manage.py lives here)
│ ├── manage.py
│ ├── Dockerfile
│ ├── .env # Local development environment (DATABASE_URL=sqlite:///...)
│ ├── .env.example
│ │
│ ├── config/ # Django configuration module
│ │ ├── __init__.py
│ │ ├── settings.py
│ │ ├── urls.py
│ │ ├── wsgi.py
│ │ └── asgi.py
│ │
│ ├── accounts/ # Django app
│ │ ├── __init__.py
│ │ ├── models.py
│ │ ├── views.py
│ │ └── urls.py
│ │
│ ├── blog/ # Django app
│ │ ├── __init__.py
│ │ ├── models.py
│ │ ├── views.py
│ │ └── urls.py
│ │
│ ├── static/
│ │ ├── css/
│ │ └── js/
│ │
│ └── templates/
│ └── base.html
├── web/ # Nginx configuration
│ └── nginx.conf
├── db/ # PostgreSQL configuration
│ └── postgresql.conf
└── docs/ # Project documentation
└── index.md
## Settings Structure
- Use a single settings.py file
- Use django-environ or python-dotenv for environment variables
- Never commit .env files to version control
- Provide .env.example with all required variables documented
- Create .gitignore file
- Create a .dockerignore file
## Code Organization
- Imports: PEP 8 ordering (stdlib, third-party, local)
- Type hints on function parameters
- CSS: External .css files only (no inline styles, no embedded `<style>` tags)
- JS: External .js files only (no inline handlers, no embedded `<script>` blocks)
- Maximum file length: 1000 lines
- If a file exceeds 500 lines, consider splitting by domain concept
## Database Conventions
- Migrations run cleanly from empty database
- Never edit deployed migrations
- Use meaningful migration names: --name add_email_to_profile
- One logical change per migration when possible
- Test migrations both forward and backward
### Development vs Production
- Development: SQLite
- Production: PostgreSQL
## Caching
- Expensive queries are cached
- Cache keys follow naming convention
- TTLs are appropriate (not infinite)
- Invalidation is documented
- Key Naming Pattern: {app}:{model}:{identifier}:{field}
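The convention is small enough to capture in a helper so keys never drift between apps (function name is illustrative):

```python
def make_cache_key(app: str, model: str, identifier, field: str) -> str:
    """Build a cache key following the {app}:{model}:{identifier}:{field} convention."""
    return f"{app}:{model}:{identifier}:{field}"

# e.g. cache.set(make_cache_key("blog", "post", post.pk, "rendered_html"), html, 3600)
```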
## Model Naming
- Model names: singular PascalCase (User, BlogPost, OrderItem)
- Correct English pluralization on related names
- All models have created_at and updated_at
- All models define __str__ and get_absolute_url
- TextChoices used for status fields
- related_name defined on ForeignKey fields
- Related names: plural snake_case with proper English pluralization
## Forms
- Use ModelForm with explicit fields list (never __all__)
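A minimal sketch of the rule, with a hypothetical `BlogPost` model:

```python
from django import forms
from blog.models import BlogPost  # hypothetical model

class BlogPostForm(forms.ModelForm):
    class Meta:
        model = BlogPost
        # Explicit allow-list: new model fields are never exposed by accident
        fields = ['title', 'body', 'published_on']
```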
## Field Naming
- Foreign keys: singular without _id suffix (author, category, parent)
- Boolean fields: use prefixes (is_active, has_permission, can_edit)
- Date fields: use suffixes (created_at, updated_at, published_on)
- Avoid abbreviations (use description, not desc)
## Required Model Fields
- All models should include:
- created_at = models.DateTimeField(auto_now_add=True)
- updated_at = models.DateTimeField(auto_now=True)
- Consider adding:
- id = models.UUIDField(primary_key=True) for public-facing models
- is_active = models.BooleanField(default=True) for soft deletes
## Indexing
- Add db_index=True to frequently queried fields
- Use Meta.indexes for composite indexes
- Document why each index exists
## Queries
- Use select_related() for foreign keys
- Use prefetch_related() for reverse relations and M2M
- Avoid queries in loops (N+1 problem)
- Use .only() and .defer() for large models
- Add comments explaining complex querysets
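A sketch of these rules applied together, assuming a hypothetical `BlogPost` with an `author` foreign key and a `tags` many-to-many:

```python
# One query for posts + authors (join), one extra query for all tags,
# instead of two extra queries per post inside the loop (N+1)
posts = (
    BlogPost.objects
    .select_related('author')       # FK: joined in the same query
    .prefetch_related('tags')       # M2M: fetched in one extra query
    .only('title', 'published_on', 'author__username')
)
for post in posts:
    print(post.author.username, [tag.name for tag in post.tags.all()])
```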
## Docstrings
- Use Sphinx style docstrings
- Document all public functions, classes, and modules
- Skip docstrings for obvious one-liners and standard Django overrides
## Views
- Use Function-Based Views (FBVs) exclusively
- Explicit logic is preferred over implicit inheritance
- Extract shared logic into utility functions
## URLs & Identifiers
- Public URLs use short UUIDs (12 characters) via `shortuuid`
- Never expose sequential IDs in URLs (security/enumeration risk)
- Internal references may use standard UUIDs or PKs
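`shortuuid` can generate these directly (e.g. `shortuuid.ShortUUID().random(length=12)`); the same idea sketched with only the stdlib:

```python
import secrets

# 57-character alphabet with no ambiguous characters (0/O, 1/I/l excluded),
# matching the idea behind shortuuid's default alphabet
ALPHABET = "23456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def short_id(length: int = 12) -> str:
    """Random short identifier suitable for public URLs."""
    return ''.join(secrets.choice(ALPHABET) for _ in range(length))
```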
## URL Patterns
- Resource-based URLs (RESTful style)
- Namespaced URL names per app
- Trailing slashes (Django default)
- Flat structure preferred over deep nesting
## Background Tasks
- Run tasks synchronously unless the design specifies that background processing is needed for long operations
- Long operations use Celery tasks
- Use Memcached for task progress; key pattern: {app}:task:{task_id}:progress
- Tasks are idempotent
- Tasks include retry logic
- Tasks live in app/tasks.py
- RabbitMQ is the Message Broker
- Flower Monitoring: Use for debugging failed tasks
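A sketch combining these rules into one task (the helper functions and the `spelunker` key prefix are hypothetical):

```python
# app/tasks.py — illustrative sketch
from celery import shared_task
from django.core.cache import cache

def fetch_pending_documents(rfp_id):
    """Hypothetical helper: return the documents still to import."""
    ...

def import_single_document(doc):
    """Hypothetical helper: idempotent (e.g. uses get_or_create internally)."""
    ...

@shared_task(bind=True, max_retries=3, retry_backoff=True)
def import_documents(self, rfp_id):
    """Idempotent import task: safe to re-run for the same rfp_id."""
    progress_key = f"spelunker:task:{self.request.id}:progress"
    documents = fetch_pending_documents(rfp_id)
    total = len(documents) or 1
    for i, doc in enumerate(documents, start=1):
        try:
            import_single_document(doc)
        except Exception as exc:
            # Retry with exponential backoff on transient failures
            raise self.retry(exc=exc)
        cache.set(progress_key, int(i * 100 / total), timeout=3600)
```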
## Testing
- Framework: Django TestCase (not pytest)
- Separate test files per module: test_models.py, test_views.py, test_forms.py
## Frontend Standards
### New Projects (DaisyUI + Tailwind)
- DaisyUI 4 via CDN for component classes
- Tailwind CSS via CDN for utility classes
- Theme management via Themis (DaisyUI `data-theme` attribute)
- All apps extend `themis/base.html` for consistent navigation
- No inline styles or scripts
### Existing Projects (Bootstrap 5)
- Bootstrap 5 via CDN
- Bootstrap Icons via CDN
- Bootswatch for theme variants (if applicable)
- django-bootstrap5 and crispy-bootstrap5 for form rendering
## Preferred Packages
### Core Django
- django>=5.2,<6.0
- django-environ — Environment variables
### Authentication & Security
- django-allauth — User management
- django-allauth-2fa — Two-factor authentication
### API Development
- djangorestframework>=3.14,<4.0 — REST APIs
- drf-spectacular — OpenAPI/Swagger documentation
### Encryption
- cryptography — Fernet encryption for secrets/API keys
### Background Tasks
- celery — Async task queue
- django-celery-progress — Progress bars
- flower — Celery monitoring
### Caching
- pymemcache — Memcached backend
### Database
- dj-database-url — Database URL configuration
- psycopg[binary] — PostgreSQL adapter
- shortuuid — Short UUIDs for public URLs
### Production
- gunicorn — WSGI server
### Shared Apps
- django-heluca-themis — User preferences, themes, key management, navigation
### Deprecated / Removed
- ~~pytz~~ — Use stdlib `zoneinfo` (Python 3.9+, Django 4+)
- ~~Pillow~~ — Only add if your app needs ImageField
- ~~django-heluca-core~~ — Replaced by Themis
## Anti-Patterns to Avoid
### Models
- Don't use `Model.objects.get()` without handling `DoesNotExist`
- Don't use `null=True` on `CharField` or `TextField` (use `blank=True, default=""`)
- Don't use `related_name='+'` unless you have a specific reason
- Don't override `save()` for business logic (use signals or service functions)
- Don't use `auto_now=True` on fields you might need to manually set
- Don't use `ForeignKey` without specifying `on_delete` explicitly
- Don't use `Meta.ordering` on large tables (specify ordering in queries)
### Queries
- Don't query inside loops (N+1 problem)
- Don't use `.all()` when you need a subset
- Don't use raw SQL unless absolutely necessary
- Don't forget `select_related()` and `prefetch_related()`
### Views
- Don't put business logic in views
- Don't use `request.POST.get()` without validation (use forms)
- Don't return sensitive data in error messages
- Don't forget `login_required` decorator on protected views
### Forms
- Don't use `fields = '__all__'` in ModelForm
- Don't trust client-side validation alone
- Don't use `exclude` in ModelForm (use explicit `fields`)
### Templates
- Don't use `{{ variable }}` for URLs (use `{% url %}` tag)
- Don't put logic in templates
- Don't use inline CSS or JavaScript (external files only)
- Don't forget `{% csrf_token %}` in forms
### Security
- Don't store secrets in `settings.py` (use environment variables)
- Don't commit `.env` files to version control
- Don't use `DEBUG=True` in production
- Don't expose sequential IDs in public URLs
- Don't use `mark_safe()` on user-supplied content
- Don't disable CSRF protection
### Imports & Code Style
- Don't use `from module import *`
- Don't use mutable default arguments
- Don't use bare `except:` clauses
- Don't ignore linter warnings without documented reason
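The mutable-default pitfall from the list above fits in a few lines; `collect_bad` and `collect_good` are toy functions, not project code.

```python
def collect_bad(item, bucket=[]):      # the default list is shared across every call
    bucket.append(item)
    return bucket

def collect_good(item, bucket=None):   # fresh list per call unless one is provided
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket
```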
### Migrations
- Don't edit migrations that have been deployed
- Don't use `RunPython` without a reverse function
- Don't add non-nullable fields without a default value
### Celery Tasks
- Don't pass model instances to tasks (pass IDs and re-fetch)
- Don't assume tasks run immediately
- Don't forget retry logic for external service calls
---
**docs/Themis_V1-00.md** (new file)
# Themis v1.0.0
Reusable Django app providing user preferences, DaisyUI theme management, API key management, and standard navigation templates.
**Package:** django-heluca-themis
**Django:** >=5.2, <6.0
**Python:** >=3.10
**License:** MIT
## 🐾 Red Panda Approval™
This project follows Red Panda Approval standards.
---
## Overview
Themis provides the foundational elements every Django application needs:
- **UserProfile** — timezone, date/time/number formatting, DaisyUI theme selection
- **Notifications** — in-app notification bell, polling, browser desktop notifications, user preferences
- **API Key Management** — encrypted storage with per-key instructions
- **Standard Navigation** — navbar, user menu, notification bell, theme toggle, bottom nav
- **Middleware** — automatic timezone activation and theme context
- **Formatting Utilities** — date, time, number formatting respecting user preferences
- **Health Checks** — Kubernetes-ready `/ready/` and `/live/` endpoints
Themis does not provide domain models (Organization, etc.) or notification triggers. Those are documented as patterns for consuming apps to implement.
---
## Installation
### From Git Repository
```bash
pip install git+ssh://git@git.helu.ca:22022/r/themis.git
```
### For Local Development
```bash
pip install -e /path/to/themis
```
### Configuration
**settings.py:**
```python
INSTALLED_APPS = [
...
"rest_framework",
"themis",
...
]
MIDDLEWARE = [
...
"themis.middleware.TimezoneMiddleware",
"themis.middleware.ThemeMiddleware",
...
]
TEMPLATES = [{
"OPTIONS": {
"context_processors": [
...
"themis.context_processors.themis_settings",
"themis.context_processors.user_preferences",
"themis.context_processors.notifications",
],
},
}]
# Themis app settings
THEMIS_APP_NAME = "My Application"
THEMIS_NOTIFICATION_POLL_INTERVAL = 60 # seconds (0 to disable polling)
THEMIS_NOTIFICATION_MAX_AGE_DAYS = 90 # cleanup ceiling for read notifications
```
**urls.py:**
```python
from django.urls import include, path
urlpatterns = [
...
path("", include("themis.urls")),
path("api/v1/", include("themis.api.urls")),
...
]
```
**Run migrations:**
```bash
python manage.py migrate
```
---
## Models
### UserProfile
Extends Django's User model with display preferences. Automatically created via `post_save` signal when a User is created.
| Field | Type | Default | Description |
|---|---|---|---|
| user | OneToOneField | required | Link to Django User |
| home_timezone | CharField(50) | UTC | Permanent timezone |
| current_timezone | CharField(50) | (blank) | Current timezone when traveling |
| date_format | CharField(20) | YYYY-MM-DD | Date display format |
| time_format | CharField(10) | 24-hour | 12-hour or 24-hour |
| thousand_separator | CharField(10) | comma | Number formatting |
| week_start | CharField(10) | monday | First day of week |
| theme_mode | CharField(10) | auto | light / dark / auto |
| theme_name | CharField(30) | corporate | DaisyUI light theme |
| dark_theme_name | CharField(30) | business | DaisyUI dark theme |
| created_at | DateTimeField | auto | Record creation |
| updated_at | DateTimeField | auto | Last update |
**Properties:**
- `effective_timezone` — returns current_timezone if set, otherwise home_timezone
- `is_traveling` — True if current_timezone differs from home_timezone
**Why two timezone fields?**
Users who travel frequently need to see times in their current location while still knowing what time it is "at home." Setting `current_timezone` enables this without losing the home timezone setting.
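A minimal sketch of how these two properties could be implemented — field names match the table above, but the dataclass is a stand-in for the actual Django model:

```python
from dataclasses import dataclass

@dataclass
class ProfileSketch:
    home_timezone: str = "UTC"
    current_timezone: str = ""  # blank unless traveling

    @property
    def effective_timezone(self) -> str:
        # current_timezone wins when set; home_timezone is the fallback
        return self.current_timezone or self.home_timezone

    @property
    def is_traveling(self) -> bool:
        return bool(self.current_timezone) and self.current_timezone != self.home_timezone
```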
### UserAPIKey
Stores encrypted API keys, MCP credentials, DAV passwords, and other service credentials.
| Field | Type | Default | Description |
|---|---|---|---|
| id | UUIDField | auto | Primary key |
| user | ForeignKey | required | Owner |
| service_name | CharField(100) | required | Service name (e.g. "OpenAI") |
| key_type | CharField(30) | api | api / mcp / dav / token / secret / other |
| label | CharField(100) | (blank) | User nickname for this key |
| encrypted_value | TextField | required | Fernet-encrypted credential |
| instructions | TextField | (blank) | How to obtain and use this key |
| help_url | URLField | (blank) | Link to service documentation |
| is_active | BooleanField | True | Whether key is in use |
| last_used_at | DateTimeField | null | Last usage timestamp |
| expires_at | DateTimeField | null | Expiration date |
| created_at | DateTimeField | auto | Record creation |
| updated_at | DateTimeField | auto | Last update |
**Properties:**
- `masked_value` — shows only last 4 characters (e.g. `****7xQ2`)
- `display_name` — returns label if set, otherwise service_name
**Encryption:**
Keys are encrypted at rest using Fernet symmetric encryption derived from Django's `SECRET_KEY`. The plaintext value is never stored and is only shown at creation time.
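One plausible way to derive a Fernet key from `SECRET_KEY`, plus the masking rule, using only the standard library. The real package may derive the key differently; `derive_fernet_key` and `mask` are illustrative names.

```python
import base64
import hashlib

def derive_fernet_key(secret_key: str) -> bytes:
    """Hash SECRET_KEY to 32 bytes, then urlsafe-base64 it (the Fernet key format)."""
    digest = hashlib.sha256(secret_key.encode("utf-8")).digest()
    return base64.urlsafe_b64encode(digest)

def mask(value: str) -> str:
    """Show only the last 4 characters, e.g. '****7xQ2'."""
    return "****" + value[-4:]
```

The derived key would be passed to `cryptography.fernet.Fernet(key)`; the encryption itself requires the `cryptography` package.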
### UserNotification
In-app notification for a user. Created by consuming apps via `notify_user()`.
| Field | Type | Default | Description |
|---|---|---|---|
| id | UUIDField | auto | Primary key |
| user | ForeignKey | required | Recipient |
| title | CharField(200) | required | Short headline |
| message | TextField | (blank) | Body text |
| level | CharField(10) | info | info / success / warning / danger |
| url | CharField(500) | (blank) | Link to navigate on click |
| source_app | CharField(100) | (blank) | App label of sender |
| source_model | CharField(100) | (blank) | Model that triggered this |
| source_id | CharField(100) | (blank) | PK of source object |
| is_read | BooleanField | False | Whether user has read this |
| read_at | DateTimeField | null | When it was read |
| is_dismissed | BooleanField | False | Whether user dismissed this |
| dismissed_at | DateTimeField | null | When it was dismissed |
| expires_at | DateTimeField | null | Auto-expire datetime |
| created_at | DateTimeField | auto | Record creation |
| updated_at | DateTimeField | auto | Last update |
**Properties:**
- `level_weight` — numeric weight for level comparison (info=0, success=0, warning=1, danger=2)
- `is_expired` — True if expires_at has passed
- `level_css_class` — DaisyUI alert class (e.g. `alert-warning`)
- `level_badge_class` — DaisyUI badge class (e.g. `badge-warning`)
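The level-to-weight and level-to-CSS mappings implied by these properties can be sketched as plain lookups. The weights match the text above; the one-to-one CSS suffix mapping is an assumption (the real package may map `danger` to DaisyUI's `error` class, for example).

```python
LEVEL_WEIGHTS = {"info": 0, "success": 0, "warning": 1, "danger": 2}

def level_css_class(level: str) -> str:
    # e.g. "warning" -> "alert-warning"
    return f"alert-{level}"

def level_badge_class(level: str) -> str:
    # e.g. "warning" -> "badge-warning"
    return f"badge-{level}"
```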
### UserProfile Notification Preferences
The UserProfile model includes four notification preference fields:
| Field | Type | Default | Description |
|---|---|---|---|
| notifications_enabled | BooleanField | True | Master on/off switch |
| notifications_min_level | CharField(10) | info | Minimum level to display |
| browser_notifications_enabled | BooleanField | False | Browser desktop notifications |
| notification_retention_days | PositiveIntegerField | 30 | Days to keep read notifications |
---
## Notifications
### Creating Notifications
All notification creation goes through the `notify_user()` utility:
```python
from themis.notifications import notify_user
notify_user(
user=user,
title="Task overdue",
message="Task 'Deploy v2' was due yesterday.",
level="warning",
url="/tasks/42/",
source_app="tasks",
source_model="Task",
source_id="42",
deduplicate=True,
)
```
This function respects user preferences (enabled flag, minimum level) and supports deduplication via source tracking fields.
### Notification Bell
The notification bell appears in the navbar for authenticated users with notifications enabled. It shows an unread count badge and a dropdown with a link to the full notification list.
### Polling
The `notifications.js` script polls the `/notifications/count/` endpoint at a configurable interval (default: 60 seconds) and updates the badge. Set `THEMIS_NOTIFICATION_POLL_INTERVAL = 0` to disable polling.
### Browser Desktop Notifications
When a user enables browser notifications in their preferences, Themis will request permission from the browser and show OS-level desktop notifications when new notifications arrive via polling.
### Cleanup
Old read/dismissed/expired notifications can be cleaned up with:
```bash
python manage.py cleanup_notifications
python manage.py cleanup_notifications --max-age-days=60
```
For details on trigger patterns, see **[Notification Trigger Pattern](Pattern_Notification_V1-00.md)**.
---
## Templates
### Base Template
All consuming apps extend `themis/base.html`:
```html
{% extends "themis/base.html" %}
{% block title %}Dashboard — My App{% endblock %}
{% block nav_items %}
<li><a href="{% url 'dashboard' %}">Dashboard</a></li>
<li><a href="{% url 'reports' %}">Reports</a></li>
{% endblock %}
{% block content %}
<h1 class="text-2xl font-bold">Dashboard</h1>
<!-- app content -->
{% endblock %}
```
### Available Blocks
| Block | Location | Purpose |
|---|---|---|
| `title` | `<title>` | Page title |
| `extra_head` | `<head>` | Additional CSS/meta |
| `navbar` | Top of `<body>` | Entire navbar (override to customize) |
| `nav_items` | Navbar (mobile) | Navigation links |
| `nav_items_desktop` | Navbar (desktop) | Desktop-only nav links |
| `nav_items_mobile` | Navbar (mobile) | Mobile-only nav links |
| `body_attrs` | `<body>` | Extra body attributes |
| `content` | `<main>` | Page content |
| `footer` | Bottom of `<body>` | Entire footer (override to customize) |
| `extra_scripts` | Before `</body>` | Additional JavaScript |
### Navigation Structure
**Navbar (fixed):**
```
[App Logo/Name] [Nav Items] [Theme ☀/🌙] [🔔 3] [User ▾]
├─ Settings
├─ API Keys
└─ Logout
```
Collapses to hamburger menu on mobile.
**Bottom Nav (fixed):**
```
© 2026 App Name
```
### What Apps Cannot Change
- Navbar is always a horizontal bar at the top
- User menu is always on the right
- Theme toggle is always in the navbar
- Bottom nav is always present
- Messages display below the navbar
- Content is in a centered container
---
## Middleware
### TimezoneMiddleware
Activates the user's effective timezone for each request using `zoneinfo`. All datetime operations within the request use the user's timezone. Falls back to UTC for anonymous users.
```python
MIDDLEWARE = [
...
"themis.middleware.TimezoneMiddleware",
...
]
```
### ThemeMiddleware
Attaches DaisyUI theme information to the request. The context processor reads these values for template rendering.
```python
MIDDLEWARE = [
...
"themis.middleware.ThemeMiddleware",
...
]
```
---
## Context Processors
### themis_settings
Provides app configuration from `THEMIS_*` settings:
- `themis_app_name`
- `themis_notification_poll_interval`
### user_preferences
Provides user preferences:
- `user_timezone`
- `user_date_format`
- `user_time_format`
- `user_is_traveling`
- `user_theme_mode`
- `user_theme_name`
- `user_dark_theme_name`
- `user_profile`
### notifications
Provides notification state:
- `themis_unread_notification_count`
- `themis_notifications_enabled`
- `themis_browser_notifications_enabled`
---
## Utilities
### Formatting
```python
from themis.utils import format_date_for_user, format_time_for_user, format_number_for_user
formatted_date = format_date_for_user(date_obj, request.user)
formatted_time = format_time_for_user(time_obj, request.user)
formatted_num = format_number_for_user(1000000, request.user)
```
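A sketch of what `format_number_for_user` might do for the `thousand_separator` preference. The real implementation lives in `themis.utils` and reads the separator from the user's profile; this stand-alone version takes it directly.

```python
def format_number(value: float, separator: str = ",") -> str:
    """Group thousands with the user's chosen separator."""
    grouped = f"{value:,}"          # Python groups with commas by default
    if separator != ",":
        grouped = grouped.replace(",", separator)
    return grouped
```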
### Timezone
```python
from themis.utils import convert_to_user_timezone, get_timezone_display
user_time = convert_to_user_timezone(utc_datetime, request.user)
tz_name = get_timezone_display(request.user)
```
### Template Tags
```html
{% load themis_tags %}
{{ event.date|user_date:request.user }}
{{ event.time|user_time:request.user }}
{{ revenue|user_number:request.user }}
{% user_timezone_name request.user %}
```
---
## URL Patterns
| URL | View | Purpose |
|---|---|---|
| `/ready/` | `ready` | Kubernetes readiness probe |
| `/live/` | `live` | Kubernetes liveness probe |
| `/profile/settings/` | `profile_settings` | User preferences page |
| `/profile/keys/` | `key_list` | API key list |
| `/profile/keys/add/` | `key_create` | Add new key |
| `/profile/keys/<uuid>/` | `key_detail` | Key detail + instructions |
| `/profile/keys/<uuid>/edit/` | `key_edit` | Edit key metadata |
| `/profile/keys/<uuid>/delete/` | `key_delete` | Delete key (POST only) |
| `/notifications/` | `notification_list` | Notification list page |
| `/notifications/<uuid>/read/` | `notification_mark_read` | Mark as read (POST) |
| `/notifications/read-all/` | `notification_mark_all_read` | Mark all read (POST) |
| `/notifications/<uuid>/dismiss/` | `notification_dismiss` | Dismiss (POST) |
| `/notifications/count/` | `notification_count` | Unread count JSON |
---
## REST API
| Endpoint | Method | Description |
|---|---|---|
| `/api/v1/profiles/` | GET | List profiles (own only, admin sees all) |
| `/api/v1/profiles/{id}/` | GET/PATCH | View/update profile |
| `/api/v1/keys/` | GET | List own API keys |
| `/api/v1/keys/` | POST | Create new key |
| `/api/v1/keys/{uuid}/` | GET/PATCH/DELETE | View/update/delete key |
| `/api/v1/notifications/` | GET | List own notifications (filterable) |
| `/api/v1/notifications/{uuid}/` | GET/PATCH/DELETE | View/update/delete notification |
| `/api/v1/notifications/{uuid}/mark_read/` | PATCH | Mark as read |
| `/api/v1/notifications/mark-all-read/` | PATCH | Mark all as read |
| `/api/v1/notifications/{uuid}/dismiss/` | PATCH | Dismiss notification |
| `/api/v1/notifications/count/` | GET | Unread count |
---
## DaisyUI Themes
Themis supports all 32 built-in DaisyUI themes. Users select separate themes for light and dark modes. The theme toggle cycles: light → dark → auto (system).
No database table needed — themes are a simple CharField storing the DaisyUI theme name.
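The toggle cycle described above is a simple rotation; a sketch, with an illustrative function name:

```python
_CYCLE = ["light", "dark", "auto"]

def next_theme_mode(current: str) -> str:
    """light -> dark -> auto -> light."""
    try:
        return _CYCLE[(_CYCLE.index(current) + 1) % len(_CYCLE)]
    except ValueError:
        return "auto"  # unknown value falls back to system preference
```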
### Available Themes
light, dark, cupcake, bumblebee, emerald, corporate, synthwave, retro, cyberpunk, valentine, halloween, garden, forest, aqua, lofi, pastel, fantasy, wireframe, black, luxury, dracula, cmyk, autumn, business, acid, lemonade, night, coffee, winter, dim, nord, sunset
---
## Dependencies
```toml
dependencies = [
"Django>=5.2,<6.0",
"djangorestframework>=3.14,<4.0",
"cryptography>=41.0,<45.0",
]
```
No `pytz` (uses stdlib `zoneinfo`). No `Pillow`. No database-stored themes.
---
**docs/mnemosyne.html** (new file)
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Mnemosyne — Architecture Documentation</title>
<link href="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0/dist/css/bootstrap.min.css" rel="stylesheet">
<link href="https://cdn.jsdelivr.net/npm/bootstrap-icons@1.11.0/font/bootstrap-icons.css" rel="stylesheet">
<script src="https://cdn.jsdelivr.net/npm/mermaid@10/dist/mermaid.min.js"></script>
<script>mermaid.initialize({ startOnLoad: true, theme: 'default' });</script>
</head>
<body>
<div class="container-fluid">
<nav class="navbar navbar-dark bg-dark rounded mb-4">
<div class="container-fluid">
<a class="navbar-brand" href="#"><i class="bi bi-book"></i> Mnemosyne — Architecture Documentation</a>
<div class="navbar-nav d-flex flex-row">
<a class="nav-link me-3" href="#overview">Overview</a>
<a class="nav-link me-3" href="#architecture">Architecture</a>
<a class="nav-link me-3" href="#data-model">Data Model</a>
<a class="nav-link me-3" href="#content-types">Content Types</a>
<a class="nav-link me-3" href="#multimodal-pipeline">Multimodal</a>
<a class="nav-link me-3" href="#search-pipeline">Search</a>
<a class="nav-link me-3" href="#mcp-interface">MCP</a>
<a class="nav-link me-3" href="#gpu-services">GPU</a>
<a class="nav-link" href="#deployment">Deployment</a>
</div>
</div>
</nav>
<div class="row">
<div class="col-12">
<h1 class="display-4 mb-2"><i class="bi bi-book-fill"></i> Mnemosyne <span class="badge bg-primary">Architecture</span></h1>
<p class="lead text-muted fst-italic">"The electric light did not come from the continuous improvement of candles." — Oren Harari</p>
<p class="lead">Mnemosyne is a content-type-aware, multimodal personal knowledge management system built on Neo4j knowledge graphs and Qwen3-VL multimodal AI. Named after the Titan goddess of memory, it understands <em>what kind</em> of knowledge it holds and makes it searchable through text, images, and natural language.</p>
</div>
</div>
<!-- SECTION: OVERVIEW -->
<section id="overview" class="mb-5">
<h2 class="h2 mb-4"><i class="bi bi-info-circle"></i> Overview</h2>
<div class="alert alert-primary border-start border-4 border-primary">
<h3>Purpose</h3>
<p><strong>Mnemosyne</strong> is a personal knowledge management system that treats content type as a first-class concept. Unlike generic knowledge bases that treat all documents identically, Mnemosyne understands the difference between a novel, a technical manual, album artwork, and a journal entry — and adjusts its chunking, embedding, search, and LLM prompting accordingly.</p>
</div>
<div class="row g-4 mb-4">
<div class="col-lg-4">
<div class="card h-100">
<div class="card-body">
<h3 class="card-title text-primary"><i class="bi bi-diagram-3"></i> Knowledge Graph</h3>
<ul class="mb-0">
<li>Neo4j stores relationships between content, not just vectors</li>
<li>Author → Book → Character → Theme traversals</li>
<li>Artist → Album → Track → Genre connections</li>
<li>No vector dimension limits (full 4096d Qwen3-VL)</li>
<li>Graph + vector + full-text search in one database</li>
</ul>
</div>
</div>
</div>
<div class="col-lg-4">
<div class="card h-100">
<div class="card-body">
<h3 class="card-title text-primary"><i class="bi bi-eye"></i> Multimodal AI</h3>
<ul class="mb-0">
<li>Qwen3-VL-Embedding: text + images + video in one vector space</li>
<li>Qwen3-VL-Reranker: cross-attention scoring across modalities</li>
<li>Album art, diagrams, screenshots become searchable</li>
<li>Local GPU inference (5090 + 3090) — zero API costs</li>
<li>llama.cpp text fallback via existing Ansible/systemd infra</li>
</ul>
</div>
</div>
</div>
<div class="col-lg-4">
<div class="card h-100">
<div class="card-body">
<h3 class="card-title text-primary"><i class="bi bi-tags"></i> Content-Type Awareness</h3>
<ul class="mb-0">
<li>Library types define chunking, embedding, and prompt behavior</li>
<li>Fiction: narrative-aware chunking, character extraction</li>
<li>Technical: section-aware, code block preservation</li>
<li>Music: lyrics as primary, metadata-heavy (genre, mood)</li>
<li>Each type injects context into the LLM prompt</li>
</ul>
</div>
</div>
</div>
</div>
<div class="alert alert-info border-start border-4 border-info">
<h3>Key Differentiators</h3>
<ul class="mb-0">
<li><strong>Content-type-aware pipeline</strong> — chunking, embedding instructions, re-ranking instructions, and LLM context all adapt per library type</li>
<li><strong>Neo4j knowledge graph</strong> — traversable relationships, not just flat vector similarity</li>
<li><strong>Full multimodal</strong> — Qwen3-VL processes images, diagrams, album art alongside text in a unified vector space</li>
<li><strong>No dimension limits</strong> — Neo4j handles 4096d vectors natively (pgvector caps at 2000)</li>
<li><strong>MCP-first interface</strong> — designed for LLM integration from day one</li>
<li><strong>Proven RAG architecture</strong> — two-stage responder/reviewer pattern inherited from Spelunker</li>
<li><strong>Local GPU inference</strong> — zero ongoing API costs via vLLM + llama.cpp on RTX 5090/3090</li>
</ul>
</div>
<div class="alert alert-secondary border-start border-4 border-secondary">
<h3>Heritage</h3>
<p class="mb-0">Mnemosyne's RAG pipeline architecture is inspired by <strong>Spelunker</strong>, an enterprise RFP response platform built on Django, PostgreSQL/pgvector, and LangChain. The proven patterns — hybrid search, two-stage RAG, citation-based retrieval, async document processing, and SME-approved knowledge bases — are carried forward and enhanced with multimodal capabilities and knowledge graph relationships. Proven patterns from Mnemosyne will be backported to Spelunker.</p>
</div>
</section>
<!-- SECTION: ARCHITECTURE -->
<section id="architecture" class="mb-5">
<h2 class="h2 mb-4"><i class="bi bi-diagram-3"></i> System Architecture</h2>
<div class="card mb-4">
<div class="card-header bg-primary text-white"><h3 class="mb-0"><i class="bi bi-diagram-3"></i> High-Level Architecture</h3></div>
<div class="card-body">
<div class="mermaid">
graph TB
subgraph Clients["Client Layer"]
MCP["MCP Clients<br/>(Claude, Copilot, etc.)"]
UI["Django Web UI"]
API["REST API (DRF)"]
end
subgraph App["Application Layer — Django"]
Core["core/<br/>Users, Auth"]
Library["library/<br/>Libraries, Collections, Items"]
Engine["engine/<br/>Embedding, Search, Reranker, RAG"]
MCPServer["mcp_server/<br/>MCP Tool Interface"]
Importers["importers/<br/>File, Calibre, Web"]
end
subgraph Data["Data Layer"]
Neo4j["Neo4j 5.x<br/>Knowledge Graph + Vectors"]
PG["PostgreSQL<br/>Auth, Config, Analytics"]
S3["S3/MinIO<br/>Content + Chunks"]
RMQ["RabbitMQ<br/>Task Queue"]
end
subgraph GPU["GPU Services"]
vLLM_E["vLLM<br/>Qwen3-VL-Embedding-8B<br/>(Multimodal Embed)"]
vLLM_R["vLLM<br/>Qwen3-VL-Reranker-8B<br/>(Multimodal Rerank)"]
LCPP["llama.cpp<br/>Qwen3-Reranker-0.6B<br/>(Text Fallback)"]
LCPP_C["llama.cpp<br/>Qwen3 Chat<br/>(RAG Responder)"]
end
MCP --> MCPServer
UI --> Core
API --> Library
API --> Engine
MCPServer --> Engine
MCPServer --> Library
Library --> Neo4j
Engine --> Neo4j
Engine --> S3
Core --> PG
Engine --> vLLM_E
Engine --> vLLM_R
Engine --> LCPP
Engine --> LCPP_C
Library --> RMQ
</div>
</div>
</div>
<div class="row g-4 mb-4">
<div class="col-md-6">
<div class="card">
<div class="card-header bg-primary text-white"><h4 class="mb-0"><i class="bi bi-folder"></i> Django Apps</h4></div>
<div class="card-body">
<ul class="list-group list-group-flush">
<li class="list-group-item"><strong>core/</strong> — Users, authentication, profiles, permissions</li>
<li class="list-group-item"><strong>library/</strong> — Libraries, Collections, Items, Chunks, Concepts (Neo4j models)</li>
<li class="list-group-item"><strong>engine/</strong> — Embedding, search, reranker, RAG pipeline services</li>
<li class="list-group-item"><strong>mcp_server/</strong> — MCP tool definitions and server interface</li>
<li class="list-group-item"><strong>importers/</strong> — Content acquisition (file upload, Calibre, web scrape)</li>
<li class="list-group-item"><strong>llm_manager/</strong> — LLM API/model config, usage tracking (from Spelunker)</li>
</ul>
</div>
</div>
</div>
<div class="col-md-6">
<div class="card">
<div class="card-header bg-success text-white"><h4 class="mb-0"><i class="bi bi-stack"></i> Technology Stack</h4></div>
<div class="card-body">
<ul>
<li><strong>Django 5.x</strong>, Python ≥3.12, Django REST Framework</li>
<li><strong>Neo4j 5.x</strong> + django-neomodel — knowledge graph + vector index</li>
<li><strong>PostgreSQL</strong> — Django auth, config, analytics only</li>
<li><strong>S3/MinIO</strong> — all content and chunk storage</li>
<li><strong>Celery + RabbitMQ</strong> — async embedding and graph construction</li>
<li><strong>vLLM ≥0.14</strong> — Qwen3-VL multimodal serving</li>
<li><strong>llama.cpp</strong> — text model serving (existing Ansible infra)</li>
<li><strong>MCP SDK</strong> — Model Context Protocol server</li>
</ul>
</div>
</div>
</div>
</div>
<h3 class="mt-4">Project Structure</h3>
<pre class="bg-light p-3 rounded"><code>mnemosyne/
├── mnemosyne/ # Django settings, URLs, WSGI/ASGI
├── core/ # Users, auth, profiles
├── library/ # Neo4j models (Library, Collection, Item, Chunk, Concept)
├── engine/ # RAG pipeline services
│ ├── embeddings.py # Qwen3-VL embedding client
│ ├── reranker.py # Qwen3-VL reranker client
│ ├── search.py # Hybrid search (vector + graph + full-text)
│ ├── pipeline.py # Two-stage RAG (responder + reviewer)
│ ├── llm_client.py # OpenAI-compatible LLM client
│ └── content_types.py # Library type definitions
├── mcp_server/ # MCP tool definitions
├── importers/ # Content import tools
├── llm_manager/ # LLM API/model config (ported from Spelunker)
├── static/
├── templates/
├── docker-compose.yml
├── pyproject.toml
└── manage.py</code></pre>
</section>
<!-- SECTION: DATA MODEL -->
<section id="data-model" class="mb-5">
<h2 class="h2 mb-4"><i class="bi bi-database"></i> Data Model — Neo4j Knowledge Graph</h2>
<div class="alert alert-info border-start border-4 border-info">
<h3>Dual Database Strategy</h3>
<p class="mb-0"><strong>Neo4j</strong> stores all content knowledge: libraries, collections, items, chunks, concepts, and their relationships + vector embeddings. <strong>PostgreSQL</strong> stores only Django operational data: users, auth, LLM configurations, analytics, and Celery results. Content never lives in PostgreSQL.</p>
</div>
<div class="card mb-4">
<div class="card-header bg-primary text-white"><h3 class="mb-0"><i class="bi bi-diagram-2"></i> Graph Schema</h3></div>
<div class="card-body">
<div class="mermaid">
graph LR
L["Library<br/>(fiction, technical,<br/>music, art, journal)"] -->|CONTAINS| Col["Collection<br/>(genre, author,<br/>artist, project)"]
Col -->|CONTAINS| I["Item<br/>(book, manual,<br/>album, film, entry)"]
I -->|HAS_CHUNK| Ch["Chunk<br/>(text + optional image<br/>+ 4096d vector)"]
I -->|REFERENCES| Con["Concept<br/>(person, topic,<br/>technique, theme)"]
I -->|RELATED_TO| I
Con -->|RELATED_TO| Con
Ch -->|MENTIONS| Con
I -->|HAS_IMAGE| Img["Image<br/>(cover, diagram,<br/>artwork, still)"]
Img -->|HAS_EMBEDDING| ImgE["ImageEmbedding<br/>(4096d multimodal<br/>vector)"]
</div>
</div>
</div>
<div class="row g-4 mb-4">
<div class="col-md-6">
<div class="card h-100">
<div class="card-header bg-primary text-white"><h4 class="mb-0">Core Nodes</h4></div>
<div class="card-body">
<table class="table table-sm">
<thead><tr><th>Node</th><th>Key Properties</th><th>Vector?</th></tr></thead>
<tbody>
<tr><td><strong>Library</strong></td><td>name, library_type, chunking_config, embedding_instruction, llm_context_prompt</td><td>No</td></tr>
<tr><td><strong>Collection</strong></td><td>name, description, metadata</td><td>No</td></tr>
<tr><td><strong>Item</strong></td><td>title, item_type, s3_key, content_hash, metadata, created_at</td><td>No</td></tr>
<tr><td><strong>Chunk</strong></td><td>chunk_index, chunk_s3_key, chunk_size, embedding (4096d)</td><td><strong>Yes</strong></td></tr>
<tr><td><strong>Concept</strong></td><td>name, concept_type, embedding (4096d)</td><td><strong>Yes</strong></td></tr>
<tr><td><strong>Image</strong></td><td>s3_key, image_type, description, metadata</td><td>No</td></tr>
<tr><td><strong>ImageEmbedding</strong></td><td>embedding (4096d multimodal)</td><td><strong>Yes</strong></td></tr>
</tbody>
</table>
</div>
</div>
</div>
<div class="col-md-6">
<div class="card h-100">
<div class="card-header bg-success text-white"><h4 class="mb-0">Relationships</h4></div>
<div class="card-body">
<table class="table table-sm">
<thead><tr><th>Relationship</th><th>From → To</th><th>Properties</th></tr></thead>
<tbody>
<tr><td><strong>CONTAINS</strong></td><td>Library → Collection</td><td></td></tr>
<tr><td><strong>CONTAINS</strong></td><td>Collection → Item</td><td>position</td></tr>
<tr><td><strong>HAS_CHUNK</strong></td><td>Item → Chunk</td><td></td></tr>
<tr><td><strong>HAS_IMAGE</strong></td><td>Item → Image</td><td>image_role</td></tr>
<tr><td><strong>HAS_EMBEDDING</strong></td><td>Image → ImageEmbedding</td><td></td></tr>
<tr><td><strong>REFERENCES</strong></td><td>Item → Concept</td><td>relevance</td></tr>
<tr><td><strong>MENTIONS</strong></td><td>Chunk → Concept</td><td></td></tr>
<tr><td><strong>RELATED_TO</strong></td><td>Item → Item</td><td>relationship_type, weight</td></tr>
<tr><td><strong>RELATED_TO</strong></td><td>Concept → Concept</td><td>relationship_type</td></tr>
</tbody>
</table>
</div>
</div>
</div>
</div>
<div class="alert alert-warning border-start border-4 border-warning">
<h4><i class="bi bi-lightning"></i> Neo4j Vector Indexes</h4>
<pre class="bg-light p-3 rounded mb-0"><code>// Chunk text+image embeddings (4096 dimensions, no pgvector limits!)
CREATE VECTOR INDEX chunk_embedding FOR (c:Chunk)
ON (c.embedding) OPTIONS {indexConfig: {
`vector.dimensions`: 4096,
`vector.similarity_function`: 'cosine'
}}
// Concept embeddings for semantic concept search
CREATE VECTOR INDEX concept_embedding FOR (con:Concept)
ON (con.embedding) OPTIONS {indexConfig: {
`vector.dimensions`: 4096,
`vector.similarity_function`: 'cosine'
}}
// Image multimodal embeddings
CREATE VECTOR INDEX image_embedding FOR (ie:ImageEmbedding)
ON (ie.embedding) OPTIONS {indexConfig: {
`vector.dimensions`: 4096,
`vector.similarity_function`: 'cosine'
}}
// Full-text index for keyword/BM25-style search
CREATE FULLTEXT INDEX chunk_fulltext FOR (c:Chunk) ON EACH [c.text_preview]</code></pre>
</div>
</section>
<!-- SECTION: CONTENT TYPES -->
<section id="content-types" class="mb-5">
<h2 class="h2 mb-4"><i class="bi bi-tags"></i> Content Type System</h2>
<div class="alert alert-primary border-start border-4 border-primary">
<h3>The Core Innovation</h3>
<p class="mb-0">Each Library has a <strong>library_type</strong> that defines how content is chunked, what embedding instructions are sent to Qwen3-VL, what re-ranking instructions are used, and what context prompt is injected when the LLM generates answers. This is configured per library in the database — not hardcoded.</p>
</div>
<div class="row g-4 mb-4">
<div class="col-md-4">
<div class="card h-100 border-primary">
<div class="card-header bg-primary text-white"><h5 class="mb-0"><i class="bi bi-book"></i> Fiction</h5></div>
<div class="card-body">
<p><strong>Chunking:</strong> Chapter-aware, preserve dialogue blocks, narrative flow</p>
<p><strong>Embedding Instruction:</strong> <em>"Represent the narrative passage for literary retrieval, capturing themes, characters, and plot elements"</em></p>
<p><strong>Reranker Instruction:</strong> <em>"Score relevance of this fiction excerpt to the query, considering narrative themes and character arcs"</em></p>
<p><strong>LLM Context:</strong> <em>"The following excerpts are from fiction. Interpret as narrative — consider themes, symbolism, character development."</em></p>
<p><strong>Multimodal:</strong> Cover art, illustrations</p>
<p><strong>Graph:</strong> Author → Book → Character → Theme</p>
</div>
</div>
</div>
<div class="col-md-4">
<div class="card h-100 border-success">
<div class="card-header bg-success text-white"><h5 class="mb-0"><i class="bi bi-gear"></i> Technical</h5></div>
<div class="card-body">
<p><strong>Chunking:</strong> Section/heading-aware, preserve code blocks and tables as atomic units</p>
<p><strong>Embedding Instruction:</strong> <em>"Represent the technical documentation for precise procedural retrieval"</em></p>
<p><strong>Reranker Instruction:</strong> <em>"Score relevance of this technical documentation to the query, prioritizing procedural accuracy"</em></p>
<p><strong>LLM Context:</strong> <em>"The following excerpts are from technical documentation. Provide precise, actionable instructions."</em></p>
<p><strong>Multimodal:</strong> Diagrams, screenshots, wiring diagrams</p>
<p><strong>Graph:</strong> Product → Manual → Section → Procedure → Tool</p>
</div>
</div>
</div>
<div class="col-md-4">
<div class="card h-100 border-info">
<div class="card-header bg-info text-white"><h5 class="mb-0"><i class="bi bi-music-note-beamed"></i> Music</h5></div>
<div class="card-body">
<p><strong>Chunking:</strong> Song-level (lyrics as one chunk), verse/chorus segmentation</p>
<p><strong>Embedding Instruction:</strong> <em>"Represent the song lyrics and album context for music discovery and thematic analysis"</em></p>
<p><strong>Reranker Instruction:</strong> <em>"Score relevance considering lyrical themes, musical context, and artist style"</em></p>
<p><strong>LLM Context:</strong> <em>"The following excerpts are song lyrics and music metadata. Interpret in musical and cultural context."</em></p>
<p><strong>Multimodal:</strong> Album artwork, liner note images</p>
<p><strong>Graph:</strong> Artist → Album → Track → Genre; Track → SAMPLES → Track</p>
</div>
</div>
</div>
</div>
<div class="row g-4 mb-4">
<div class="col-md-4">
<div class="card h-100 border-warning">
<div class="card-header bg-warning text-dark"><h5 class="mb-0"><i class="bi bi-film"></i> Film</h5></div>
<div class="card-body">
<p><strong>Chunking:</strong> Scene-level for scripts, paragraph-level for synopses</p>
<p><strong>Embedding Instruction:</strong> <em>"Represent the film content for cinematic retrieval, capturing visual and narrative elements"</em></p>
<p><strong>Multimodal:</strong> Movie stills, posters, screenshots</p>
<p><strong>Graph:</strong> Director → Film → Scene → Actor; Film → BASED_ON → Book</p>
</div>
</div>
</div>
<div class="col-md-4">
<div class="card h-100 border-danger">
<div class="card-header bg-danger text-white"><h5 class="mb-0"><i class="bi bi-palette"></i> Art</h5></div>
<div class="card-body">
<p><strong>Chunking:</strong> Description-level, catalog entry as unit</p>
<p><strong>Embedding Instruction:</strong> <em>"Represent the artwork and its description for visual and stylistic retrieval"</em></p>
<p><strong>Multimodal:</strong> <strong>The artwork itself</strong> — primary content is visual</p>
<p><strong>Graph:</strong> Artist → Piece → Style → Movement; Piece → INSPIRED_BY → Piece</p>
</div>
</div>
</div>
<div class="col-md-4">
<div class="card h-100 border-secondary">
<div class="card-header bg-secondary text-white"><h5 class="mb-0"><i class="bi bi-journal-text"></i> Journals</h5></div>
<div class="card-body">
<p><strong>Chunking:</strong> Entry-level (one entry = one chunk), paragraph split for long entries</p>
<p><strong>Embedding Instruction:</strong> <em>"Represent the personal journal entry for temporal and reflective retrieval"</em></p>
<p><strong>Multimodal:</strong> Photos, sketches attached to entries</p>
<p><strong>Graph:</strong> Date → Entry → Topic; Entry → MENTIONS → Person/Place</p>
</div>
</div>
</div>
</div>
</section>
<!-- SECTION: MULTIMODAL PIPELINE -->
<section id="multimodal-pipeline" class="mb-5">
<h2 class="h2 mb-4"><i class="bi bi-eye-fill"></i> Multimodal Embedding &amp; Re-ranking Pipeline</h2>
<div class="alert alert-primary border-start border-4 border-primary">
<h3>Two-Stage Multimodal Pipeline</h3>
<p><strong>Stage 1 — Embedding (Qwen3-VL-Embedding-8B):</strong> Generates 4096-dimensional vectors from text, images, screenshots, and video in a unified semantic space. Accepts content-type-specific instructions for optimized representations.</p>
<p class="mb-0"><strong>Stage 2 — Re-ranking (Qwen3-VL-Reranker-8B):</strong> Takes (query, document) pairs — where both can be multimodal — and outputs precise relevance scores via cross-attention. Dramatically sharpens retrieval accuracy.</p>
</div>
<div class="card mb-4">
<div class="card-header bg-success text-white"><h3 class="mb-0"><i class="bi bi-flow-chart"></i> Embedding &amp; Ingestion Flow</h3></div>
<div class="card-body">
<div class="mermaid">
flowchart TD
A["New Content<br/>(file upload, import)"] --> B{"Content Type?"}
B -->|"Text (PDF, DOCX, MD)"| C["Parse Text<br/>+ Extract Images"]
B -->|"Image (art, photo)"| D["Image Only"]
B -->|"Mixed (manual + diagrams)"| E["Parse Text<br/>+ Keep Page Images"]
C --> F["Chunk Text<br/>(content-type-aware)"]
D --> G["Image to S3"]
E --> F
E --> G
F --> H["Store Chunks in S3"]
H --> I["Qwen3-VL-Embedding<br/>(text + instruction)"]
G --> J["Qwen3-VL-Embedding<br/>(image + instruction)"]
I --> K["4096d Vector"]
J --> K
K --> L["Store in Neo4j<br/>Chunk/ImageEmbedding Node"]
L --> M["Extract Concepts<br/>(LLM entity extraction)"]
M --> N["Create Concept Nodes<br/>+ REFERENCES/MENTIONS edges"]
</div>
</div>
</div>
<div class="row g-4 mb-4">
<div class="col-md-6">
<div class="card h-100">
<div class="card-header bg-info text-white"><h4 class="mb-0">Qwen3-VL-Embedding-8B</h4></div>
<div class="card-body">
<ul>
<li><strong>Dimensions:</strong> 4096 (full), or MRL truncation to 3072/2048/1536/1024</li>
<li><strong>Input:</strong> Text, images, screenshots, video, or any mix</li>
<li><strong>Instruction-aware:</strong> Content-type instructions measurably improve retrieval quality (a few percent in Qwen's embedding benchmarks)</li>
<li><strong>Quantization:</strong> Int8 (~8GB VRAM), Int4 (~4GB VRAM)</li>
<li><strong>Serving:</strong> vLLM with <code>--runner pooling</code></li>
<li><strong>Languages:</strong> 30+ languages supported</li>
</ul>
</div>
</div>
</div>
<div class="col-md-6">
<div class="card h-100">
<div class="card-header bg-warning text-dark"><h4 class="mb-0">Qwen3-VL-Reranker-8B</h4></div>
<div class="card-body">
<ul>
<li><strong>Architecture:</strong> Single-tower cross-attention (deep query↔document interaction)</li>
<li><strong>Input:</strong> (query, document) pairs — both can be multimodal</li>
<li><strong>Output:</strong> Relevance score (sigmoid of yes/no token probabilities)</li>
<li><strong>Instruction-aware:</strong> Custom re-ranking instructions per content type</li>
<li><strong>Serving:</strong> vLLM with <code>--runner pooling</code> + score endpoint</li>
<li><strong>Fallback:</strong> Qwen3-Reranker-0.6B via llama.cpp (text-only)</li>
</ul>
</div>
</div>
</div>
</div>
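<div class="alert alert-secondary border-start border-4 border-secondary">
<h4><i class="bi bi-terminal"></i> Example: Calling the Pooling Endpoints (sketch)</h4>
<p>A hedged sketch of what calls to the two services might look like. vLLM's pooling runner exposes an OpenAI-compatible <code>/v1/embeddings</code> endpoint; the score endpoint path, the instruction prefix format, and the payload fields shown here are assumptions to be verified against the deployed vLLM and model versions:</p>
<pre class="bg-light p-3 rounded mb-0"><code># Embedding service on :8002 (instruction prefix format is an assumption)
curl http://localhost:8002/v1/embeddings \
  -H 'Content-Type: application/json' \
  -d '{"model": "Qwen3-VL-Embedding-8B",
       "input": "Instruct: Represent the technical documentation for precise procedural retrieval\nQuery: router reset procedure"}'

# Re-ranking service on :8001 (score endpoint path and fields are assumptions)
curl http://localhost:8001/score \
  -H 'Content-Type: application/json' \
  -d '{"model": "Qwen3-VL-Reranker-8B",
       "text_1": "router reset procedure",
       "text_2": ["chunk A text", "chunk B text"]}'</code></pre>
</div>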
<div class="alert alert-info border-start border-4 border-info">
<h4><i class="bi bi-image"></i> Why Multimodal Matters</h4>
<p>Traditional RAG systems OCR images and diagrams, producing garbled text. Multimodal embedding understands the <em>visual content</em> directly:</p>
<ul class="mb-0">
<li><strong>Technical diagrams:</strong> Wiring diagrams, network topologies, architecture diagrams — searchable by visual content, not OCR garbage</li>
<li><strong>Album artwork:</strong> "psychedelic album covers from the 70s" finds matching art via visual similarity</li>
<li><strong>Art:</strong> The actual painting/sculpture becomes the searchable content, not just its text description</li>
<li><strong>PDF pages:</strong> Image-only PDF pages with charts and tables are embedded as images, not skipped</li>
</ul>
</div>
</section>
<!-- SECTION: SEARCH PIPELINE -->
<section id="search-pipeline" class="mb-5">
<h2 class="h2 mb-4"><i class="bi bi-search"></i> Search Pipeline — GraphRAG + Vector + Re-rank</h2>
<div class="card mb-4">
<div class="card-header bg-primary text-white"><h3 class="mb-0"><i class="bi bi-flow-chart"></i> Search Flow</h3></div>
<div class="card-body">
<div class="mermaid">
flowchart TD
Q["User Query"] --> E["Embed Query<br/>(Qwen3-VL-Embedding)"]
E --> VS["1. Vector Search<br/>(Neo4j vector index)<br/>Top-K × 3 oversample"]
E --> GT["2. Graph Traversal<br/>(Cypher queries)<br/>Concept + relationship walks"]
Q --> FT["3. Full-Text Search<br/>(Neo4j fulltext index)<br/>Keyword matching"]
VS --> F["Candidate Fusion<br/>+ Deduplication"]
GT --> F
FT --> F
F --> RR["4. Re-Rank<br/>(Qwen3-VL-Reranker)<br/>Cross-attention scoring"]
RR --> TK["Top-K Results"]
TK --> CTX["Inject Content-Type<br/>Context Prompt"]
CTX --> LLM["5. LLM Responder<br/>(Two-stage RAG)"]
LLM --> REV["6. LLM Reviewer<br/>(Quality + citation check)"]
REV --> ANS["Final Answer<br/>with Citations"]
</div>
</div>
</div>
<div class="row g-4 mb-4">
<div class="col-md-4">
<div class="card h-100">
<div class="card-header bg-primary text-white"><h5 class="mb-0">1. Vector Search</h5></div>
<div class="card-body">
<p>Cosine similarity via Neo4j vector index on Chunk and ImageEmbedding nodes.</p>
<pre class="bg-light p-2 rounded"><code>CALL db.index.vector.queryNodes(
'chunk_embedding', 30,
$query_vector
) YIELD node, score
WHERE score > $threshold</code></pre>
</div>
</div>
</div>
<div class="col-md-4">
<div class="card h-100">
<div class="card-header bg-success text-white"><h5 class="mb-0">2. Graph Traversal</h5></div>
<div class="card-body">
<p>Walk relationships to find contextually related content that vector search alone would miss.</p>
<pre class="bg-light p-2 rounded"><code>MATCH (c:Chunk)-[:HAS_CHUNK]-(i:Item)
-[:REFERENCES]->(con:Concept)
-[:RELATED_TO]-(con2:Concept)
<-[:REFERENCES]-(i2:Item)
-[:HAS_CHUNK]->(c2:Chunk)
RETURN c2, i2</code></pre>
</div>
</div>
</div>
<div class="col-md-4">
<div class="card h-100">
<div class="card-header bg-info text-white"><h5 class="mb-0">3. Full-Text Search</h5></div>
<div class="card-body">
<p>Neo4j native full-text index for keyword matching (BM25-equivalent).</p>
<pre class="bg-light p-2 rounded"><code>CALL db.index.fulltext.queryNodes(
'chunk_fulltext',
$query_text
) YIELD node, score</code></pre>
</div>
</div>
</div>
</div>
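<div class="alert alert-secondary border-start border-4 border-secondary">
<h4><i class="bi bi-diagram-2"></i> Example: Fusing Two Legs in One Query (sketch)</h4>
<p>The vector and full-text legs can be fused server-side with a <code>CALL</code> subquery and <code>UNION ALL</code>, deduplicating by node and counting how many legs surfaced each chunk. Note the two scores live on different scales, so this ordering is only a coarse pre-ranking; the re-ranker produces the final order. Illustrative Cypher, not the final implementation:</p>
<pre class="bg-light p-3 rounded mb-0"><code>CALL {
  CALL db.index.vector.queryNodes('chunk_embedding', 30, $query_vector)
  YIELD node, score
  RETURN node, score
  UNION ALL
  CALL db.index.fulltext.queryNodes('chunk_fulltext', $query_text)
  YIELD node, score
  RETURN node, score
}
WITH node, count(*) AS hits, max(score) AS best
RETURN node
ORDER BY hits DESC, best DESC
LIMIT 10</code></pre>
</div>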
</section>
<!-- SECTION: MCP INTERFACE -->
<section id="mcp-interface" class="mb-5">
<h2 class="h2 mb-4"><i class="bi bi-plug"></i> MCP Server Interface</h2>
<div class="alert alert-primary border-start border-4 border-primary">
<h3>MCP-First Design</h3>
<p class="mb-0">Mnemosyne exposes its capabilities as MCP tools, making the entire knowledge base accessible to Claude, Copilot, and any MCP-compatible LLM client. The MCP server is a primary interface, not an afterthought.</p>
</div>
<div class="row g-4 mb-4">
<div class="col-md-6">
<div class="card h-100">
<div class="card-header bg-primary text-white"><h4 class="mb-0">Search &amp; Retrieval Tools</h4></div>
<div class="card-body">
<table class="table table-sm">
<thead><tr><th>Tool</th><th>Description</th></tr></thead>
<tbody>
<tr><td><code>search_library</code></td><td>Semantic + graph + full-text search with re-ranking. Filters by library, collection, content type.</td></tr>
<tr><td><code>ask_about</code></td><td>Full RAG pipeline — search, re-rank, content-type context injection, LLM response with citations.</td></tr>
<tr><td><code>find_similar</code></td><td>Find items similar to a given item using vector similarity. Optionally search across libraries.</td></tr>
<tr><td><code>search_by_image</code></td><td>Multimodal search — find content matching an uploaded image.</td></tr>
<tr><td><code>explore_connections</code></td><td>Traverse knowledge graph from an item — find related concepts, authors, themes.</td></tr>
</tbody>
</table>
</div>
</div>
</div>
<div class="col-md-6">
<div class="card h-100">
<div class="card-header bg-success text-white"><h4 class="mb-0">Management &amp; Navigation Tools</h4></div>
<div class="card-body">
<table class="table table-sm">
<thead><tr><th>Tool</th><th>Description</th></tr></thead>
<tbody>
<tr><td><code>browse_libraries</code></td><td>List all libraries with their content types and item counts.</td></tr>
<tr><td><code>browse_collections</code></td><td>List collections within a library.</td></tr>
<tr><td><code>get_item</code></td><td>Get detailed info about a specific item, including metadata and graph connections.</td></tr>
<tr><td><code>add_content</code></td><td>Add new content to a library — triggers async embedding + graph construction.</td></tr>
<tr><td><code>get_concepts</code></td><td>List extracted concepts for an item or across a library.</td></tr>
</tbody>
</table>
</div>
</div>
</div>
</div>
</section>
<!-- SECTION: GPU SERVICES -->
<section id="gpu-services" class="mb-5">
<h2 class="h2 mb-4"><i class="bi bi-gpu-card"></i> GPU Services</h2>
<div class="row g-4 mb-4">
<div class="col-md-6">
<div class="card h-100">
<div class="card-header bg-primary text-white"><h4 class="mb-0">RTX 5090 (32GB VRAM)</h4></div>
<div class="card-body">
<table class="table table-sm">
<tbody>
<tr><td><strong>Model</strong></td><td>Qwen3-VL-Reranker-8B</td></tr>
<tr><td><strong>VRAM (bf16)</strong></td><td>~18GB</td></tr>
<tr><td><strong>Serving</strong></td><td>vLLM <code>--runner pooling</code></td></tr>
<tr><td><strong>Port</strong></td><td>:8001</td></tr>
<tr><td><strong>Role</strong></td><td>Multimodal re-ranking</td></tr>
<tr><td><strong>Headroom</strong></td><td>~14GB for chat model</td></tr>
</tbody>
</table>
</div>
</div>
</div>
<div class="col-md-6">
<div class="card h-100">
<div class="card-header bg-success text-white"><h4 class="mb-0">RTX 3090 (24GB VRAM)</h4></div>
<div class="card-body">
<table class="table table-sm">
<tbody>
<tr><td><strong>Model</strong></td><td>Qwen3-VL-Embedding-8B</td></tr>
<tr><td><strong>VRAM (bf16)</strong></td><td>~18GB</td></tr>
<tr><td><strong>Serving</strong></td><td>vLLM <code>--runner pooling</code></td></tr>
<tr><td><strong>Port</strong></td><td>:8002</td></tr>
<tr><td><strong>Role</strong></td><td>Multimodal embedding</td></tr>
<tr><td><strong>Headroom</strong></td><td>~6GB</td></tr>
</tbody>
</table>
</div>
</div>
</div>
</div>
<div class="alert alert-info border-start border-4 border-info">
<h4><i class="bi bi-arrow-repeat"></i> Fallback: llama.cpp (Existing Ansible Infra)</h4>
<p class="mb-0">Text-only Qwen3-Reranker-0.6B GGUF served via <code>llama-server</code> on existing systemd/Ansible infrastructure. Managed by the same playbooks, monitored by the same Grafana dashboards. Used when vLLM services are down or for text-only workloads.</p>
</div>
</section>
<!-- SECTION: DEPLOYMENT -->
<section id="deployment" class="mb-5">
<h2 class="h2 mb-4"><i class="bi bi-box-seam"></i> Deployment</h2>
<div class="row g-4 mb-4">
<div class="col-md-4">
<div class="card h-100">
<div class="card-header bg-primary text-white"><h4 class="mb-0">Core Services</h4></div>
<div class="card-body">
<ul class="mb-0">
<li><strong>web:</strong> Django app (Gunicorn)</li>
<li><strong>postgres:</strong> PostgreSQL (auth/config only)</li>
<li><strong>neo4j:</strong> Neo4j 5.x (knowledge graph + vectors)</li>
<li><strong>rabbitmq:</strong> Celery broker</li>
</ul>
</div>
</div>
</div>
<div class="col-md-4">
<div class="card h-100">
<div class="card-header bg-success text-white"><h4 class="mb-0">Async Processing</h4></div>
<div class="card-body">
<ul class="mb-0">
<li><strong>celery-worker:</strong> Embedding, graph construction</li>
<li><strong>celery-beat:</strong> Scheduled re-sync tasks</li>
</ul>
</div>
</div>
</div>
<div class="col-md-4">
<div class="card h-100">
<div class="card-header bg-info text-white"><h4 class="mb-0">Storage &amp; Proxy</h4></div>
<div class="card-body">
<ul class="mb-0">
<li><strong>minio:</strong> S3-compatible content storage</li>
<li><strong>nginx:</strong> Static/proxy</li>
<li><strong>mcp-server:</strong> MCP interface process</li>
</ul>
</div>
</div>
</div>
</div>
<div class="alert alert-secondary border-start border-4 border-secondary">
<h4>Shared Infrastructure with Spelunker</h4>
<p class="mb-0">Mnemosyne and Spelunker share: GPU model services (llama.cpp + vLLM), MinIO/S3 (separate buckets), Neo4j (separate databases), RabbitMQ (separate vhosts), and Grafana monitoring. Each is its own Docker Compose stack but points to shared infra.</p>
</div>
</section>
<!-- SECTION: BACKPORT -->
<section id="backport" class="mb-5">
<h2 class="h2 mb-4"><i class="bi bi-arrow-left-right"></i> Backport Strategy to Spelunker</h2>
<div class="alert alert-warning border-start border-4 border-warning">
<h3>Build Forward, Backport Back</h3>
<p class="mb-0">Mnemosyne proves the architecture with no legacy constraints. Once validated, proven components flow back to Spelunker to enhance its RFP workflow with multimodal understanding and re-ranking precision.</p>
</div>
<table class="table table-bordered">
<thead class="table-dark"><tr><th>Component</th><th>Mnemosyne (Prove)</th><th>Spelunker (Backport)</th></tr></thead>
<tbody>
<tr><td><strong>RerankerService</strong></td><td>Qwen3-VL multimodal + llama.cpp text</td><td>Drop into <code>rag/services/reranker.py</code></td></tr>
<tr><td><strong>Multimodal Embedding</strong></td><td>Qwen3-VL-Embedding via vLLM</td><td>Add alongside OpenAI embeddings, MRL@1536d for pgvector compat</td></tr>
<tr><td><strong>Diagram Understanding</strong></td><td>Image pages embedded multimodally</td><td>PDF diagrams in RFP docs become searchable</td></tr>
<tr><td><strong>MCP Server</strong></td><td>Primary interface from day one</td><td>Add as secondary interface to Spelunker</td></tr>
<tr><td><strong>Neo4j (optional)</strong></td><td>Primary vector + graph store</td><td>Could replace pgvector, or run alongside</td></tr>
<tr><td><strong>Content-Type Config</strong></td><td>Library type definitions</td><td>Adapt as document classification in Spelunker</td></tr>
</tbody>
</table>
</section>
<div class="alert alert-success border-start border-4 border-success mt-5">
<h3><i class="bi bi-check-circle"></i> Documentation Complete</h3>
<p class="mb-0">This document describes the target architecture for Mnemosyne. Phase implementation documents provide detailed build plans.</p>
</div>
</div>
<script src="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0/dist/js/bootstrap.bundle.min.js"></script>
</body>
</html>

# Mnemosyne Integration — Daedalus & Pallas Reference
This document summarises the Mnemosyne-specific implementation required for integration with the Daedalus & Pallas architecture. The full specification lives in [`daedalus/docs/mnemosyne_integration.md`](../../daedalus/docs/mnemosyne_integration.md).
---
## Overview
Mnemosyne exposes two interfaces for the wider Ouranos ecosystem:
1. **MCP Server** (port 22091) — consumed by Pallas agents for synchronous search, browse, and retrieval operations
2. **REST Ingest API** — consumed by the Daedalus backend for asynchronous file ingestion and embedding job lifecycle management
---
## 1. MCP Server (Phase 5)
### Port & URL
| Service | Port | URL |
|---------|------|-----|
| Mnemosyne MCP | 22091 | `http://puck.incus:22091/mcp` |
| Health check | 22091 | `http://puck.incus:22091/mcp/health` |
### Project Structure
Following the [Django MCP Pattern](Pattern_Django-MCP_V1-00.md):
```
mnemosyne/mnemosyne/mcp_server/
├── __init__.py
├── server.py # FastMCP instance + tool registration
├── asgi.py # Starlette ASGI mount at /mcp
├── middleware.py # MCPAuthMiddleware (disabled for internal use)
├── context.py # get_mcp_user(), get_mcp_token()
└── tools/
├── __init__.py
├── search.py # register_search_tools(mcp) → search_knowledge, search_by_category
├── browse.py # register_browse_tools(mcp) → list_libraries, list_collections, get_item, get_concepts
└── health.py # register_health_tools(mcp) → get_health
```
### Tools to Implement
| Tool | Module | Description |
|------|--------|-------------|
| `search_knowledge` | `search.py` | Hybrid vector + full-text + graph search → re-rank → return chunks with citations |
| `search_by_category` | `search.py` | Same as above, scoped to a specific `library_type` |
| `list_libraries` | `browse.py` | List all libraries with type, description, counts |
| `list_collections` | `browse.py` | List collections within a library |
| `get_item` | `browse.py` | Retrieve item detail with chunk previews and concept links |
| `get_concepts` | `browse.py` | Traverse concept graph from a starting concept or item |
| `get_health` | `health.py` | Check Neo4j, S3, embedding model reachability |
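The `search_knowledge` flow (fuse the three retrieval legs, deduplicate, then re-rank) can be sketched as a pure function, with the Neo4j queries and the reranker model call stubbed out as inputs. Names and shapes here are illustrative, not the final implementation:

```python
from typing import Callable, Iterable


def fuse_and_rerank(
    vector_hits: Iterable[dict],
    fulltext_hits: Iterable[dict],
    graph_hits: Iterable[dict],
    rerank: Callable[[list[dict]], list[dict]],
    top_k: int = 5,
) -> list[dict]:
    """Merge candidates from all three legs, dedupe by chunk uid, re-rank."""
    seen: dict[str, dict] = {}
    for hit in (*vector_hits, *fulltext_hits, *graph_hits):
        seen.setdefault(hit["uid"], hit)  # first leg to surface a chunk wins
    return rerank(list(seen.values()))[:top_k]
```

In the real tool, `rerank` would batch (query, chunk) pairs to the Qwen3-VL-Reranker score endpoint and sort by the returned relevance scores.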
### MCP Resources
| Resource URI | Source |
|---|---|
| `mnemosyne://library-types` | `library/content_types.py``LIBRARY_TYPE_DEFAULTS` |
| `mnemosyne://libraries` | `Library.nodes.order_by("name")` serialized to JSON |
### Deployment
Separate Uvicorn process alongside Django's Gunicorn:
```bash
# Django WSGI (existing)
gunicorn --bind :22090 --workers 3 mnemosyne.wsgi
# MCP ASGI (new)
uvicorn mcp_server.asgi:app --host 0.0.0.0 --port 22091 --workers 1
```
Auth is disabled (`MCP_REQUIRE_AUTH=False`) since all traffic is internal (10.10.0.0/24).
### ⚠️ DEBUG LOG Points — MCP Server
| Location | Log Event | Level | What to Log |
|----------|-----------|-------|-------------|
| Tool dispatch | `mcp_tool_called` | DEBUG | Tool name, all input parameters |
| Vector search | `mcp_search_vector_query` | DEBUG | Query text, embedding dims, library filter, limit |
| Vector search result | `mcp_search_vector_results` | DEBUG | Candidate count, top/lowest scores |
| Full-text search | `mcp_search_fulltext_query` | DEBUG | Query terms, index used |
| Re-ranking | `mcp_search_rerank` | DEBUG | Candidates in/out, reranker model, duration_ms |
| Graph traversal | `mcp_graph_traverse` | DEBUG | Starting node UID, relationships, depth, nodes visited |
| Neo4j query | `mcp_neo4j_query` | DEBUG | Cypher query (parameterized), execution time_ms |
| Tool response | `mcp_tool_response` | DEBUG | Tool name, result size (bytes/items), duration_ms |
| Health check | `mcp_health_check` | DEBUG | Each dependency status, overall result |
**Important:** All neomodel ORM calls inside async tool functions **must** be wrapped with `sync_to_async(thread_sensitive=True)`.
---
## 2. REST Ingest API
### New Endpoints
| Method | Route | Purpose |
|--------|-------|---------|
| `POST` | `/api/v1/library/ingest` | Accept a file for ingestion + embedding |
| `GET` | `/api/v1/library/jobs/{job_id}` | Poll job status |
| `POST` | `/api/v1/library/jobs/{job_id}/retry` | Retry a failed job |
| `GET` | `/api/v1/library/jobs` | List recent jobs (optional `?status=` filter) |
These endpoints are consumed by the **Daedalus FastAPI backend** only, not by the frontend.
### New Model: `IngestJob`
Add to `library/` app (Django ORM on PostgreSQL, not Neo4j):
```python
class IngestJob(models.Model):
"""Tracks the lifecycle of a content ingestion + embedding job."""
id = models.CharField(max_length=64, primary_key=True)
item_uid = models.CharField(max_length=64, db_index=True)
celery_task_id = models.CharField(max_length=255, blank=True)
status = models.CharField(
max_length=20,
choices=[
("pending", "Pending"),
("processing", "Processing"),
("completed", "Completed"),
("failed", "Failed"),
],
default="pending",
db_index=True,
)
progress = models.CharField(max_length=50, default="queued")
error = models.TextField(blank=True, null=True)
retry_count = models.PositiveIntegerField(default=0)
chunks_created = models.PositiveIntegerField(default=0)
concepts_extracted = models.PositiveIntegerField(default=0)
embedding_model = models.CharField(max_length=100, blank=True)
source = models.CharField(max_length=50, default="")
source_ref = models.CharField(max_length=200, blank=True)
s3_key = models.CharField(max_length=500)
created_at = models.DateTimeField(auto_now_add=True)
started_at = models.DateTimeField(null=True, blank=True)
completed_at = models.DateTimeField(null=True, blank=True)
class Meta:
ordering = ["-created_at"]
indexes = [
models.Index(fields=["status", "-created_at"]),
models.Index(fields=["source", "source_ref"]),
]
```
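The retry endpoint's eligibility check can be sketched as a small predicate over the job state. `MAX_RETRIES` mirrors the `max_retries=3` on the Celery task below; treating only `failed` jobs as retryable is an assumption of this sketch:

```python
MAX_RETRIES = 3  # mirrors max_retries=3 on the embed_item Celery task


def can_retry(job: dict) -> bool:
    """A job is retryable only if it failed and still has retries left."""
    return job["status"] == "failed" and job["retry_count"] < MAX_RETRIES
```

The `POST /api/v1/library/jobs/{job_id}/retry` view would return a 409 when this predicate is false, then re-dispatch the Celery task and bump `retry_count`.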
### Ingest Request Schema
```json
{
"s3_key": "workspaces/ws_abc/files/f_def/report.pdf",
"title": "Q4 Technical Report",
"library_uid": "lib_technical_001",
"collection_uid": "col_reports_2026",
"file_type": "application/pdf",
"file_size": 245000,
"source": "daedalus",
"source_ref": "ws_abc/f_def"
}
```
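Request validation for the ingest endpoint can be sketched as a pure function over the payload above. Which fields are strictly required is an assumption here (`collection_uid` is treated as optional); the real implementation would live in a DRF serializer:

```python
# Assumed-required fields; collection_uid and source_ref treated as optional.
REQUIRED_FIELDS = {"s3_key", "title", "library_uid", "file_type", "source"}


def validate_ingest_request(payload: dict) -> list[str]:
    """Return a list of validation errors; an empty list means acceptable."""
    errors = [f"missing field: {f}" for f in sorted(REQUIRED_FIELDS - payload.keys())]
    if "file_size" in payload and not isinstance(payload["file_size"], int):
        errors.append("file_size must be an integer (bytes)")
    return errors
```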
### Job Status Response Schema
```json
{
"job_id": "job_789xyz",
"item_uid": "item_abc123",
"status": "processing",
"progress": "embedding",
"chunks_created": 0,
"concepts_extracted": 0,
"embedding_model": "qwen3-vl-embedding-8b",
"started_at": "2026-03-12T15:42:01Z",
"completed_at": null,
"error": null
}
```
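On the Daedalus side, polling this endpoint wants backoff and a clear stop condition. A minimal sketch, assuming exponential backoff with a cap (the actual poll cadence is a Daedalus implementation choice):

```python
from typing import Optional

TERMINAL = {"completed", "failed"}


def next_poll_delay(
    status: str, attempt: int, base: float = 1.0, cap: float = 30.0
) -> Optional[float]:
    """Seconds to wait before the next poll; None means stop polling."""
    if status in TERMINAL:
        return None
    return min(cap, base * (2 ** attempt))
```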
### ⚠️ DEBUG LOG Points — Ingest Endpoint
| Location | Log Event | Level | What to Log |
|----------|-----------|-------|-------------|
| Request received | `ingest_request_received` | INFO | s3_key, title, library_uid, file_type, source, source_ref |
| S3 key validation | `ingest_s3_key_check` | DEBUG | s3_key, exists (bool), bucket name |
| Library lookup | `ingest_library_lookup` | DEBUG | library_uid, found (bool), library_type |
| Item node creation | `ingest_item_created` | INFO | item_uid, title, library_uid, collection_uid |
| Celery task dispatch | `ingest_task_dispatched` | INFO | job_id, item_uid, celery_task_id, queue name |
| Celery task dispatch failure | `ingest_task_dispatch_failed` | ERROR | job_id, item_uid, exception details |
---
## 3. Celery Embedding Pipeline
### New Task: `embed_item`
```python
@shared_task(
name="library.embed_item",
bind=True,
max_retries=3,
default_retry_delay=60,
autoretry_for=(S3ConnectionError, EmbeddingModelError),
retry_backoff=True,
retry_backoff_max=600,
acks_late=True,
queue="embedding",
)
def embed_item(self, job_id, item_uid):
...
```
### Task Flow
1. Update job → `processing` / `fetching`
2. Fetch file from Daedalus S3 bucket (cross-bucket read)
3. Copy to Mnemosyne's own S3 bucket
4. Load library type → chunking config
5. Chunk content per strategy
6. Store chunk text in S3
7. Generate embeddings (Arke/vLLM batch call)
8. Write Chunk nodes + vectors to Neo4j
9. Extract concepts (LLM call)
10. Build graph relationships
11. Update job → `completed`
On failure at any step: update job → `failed` with error message.
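The step sequence and its failure semantics can be sketched as a driver that records progress per step and marks the job failed on any exception. In the real task, each transition persists to the `IngestJob` row rather than mutating a dict:

```python
def run_embed_job(job: dict, steps: list) -> dict:
    """Run (name, fn) pipeline steps in order; any exception fails the job.

    The surviving `progress` value names the step that was running when the
    failure occurred, which is exactly what the job-status endpoint exposes.
    """
    job["status"] = "processing"
    try:
        for name, step in steps:
            job["progress"] = name
            step(job)
        job["status"] = "completed"
    except Exception as exc:  # record any failure on the job record
        job["status"] = "failed"
        job["error"] = str(exc)
    return job
```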
### ⚠️ DEBUG LOG Points — Celery Worker (Critical)
These are the most important log points in the entire integration. Without them, debugging async embedding failures is nearly impossible.
| Location | Log Event | Level | What to Log |
|----------|-----------|-------|-------------|
| Task pickup | `embed_task_started` | INFO | job_id, item_uid, worker hostname, retry count |
| S3 fetch start | `embed_s3_fetch_start` | DEBUG | s3_key, source bucket |
| S3 fetch complete | `embed_s3_fetch_complete` | DEBUG | s3_key, file_size, duration_ms |
| S3 fetch failed | `embed_s3_fetch_failed` | ERROR | s3_key, error, retry_count |
| S3 cross-bucket copy start | `s3_cross_bucket_copy_start` | DEBUG | source_bucket, source_key, dest_bucket, dest_key |
| S3 cross-bucket copy complete | `s3_cross_bucket_copy_complete` | DEBUG | source_key, dest_key, file_size, duration_ms |
| S3 cross-bucket copy failed | `s3_cross_bucket_copy_failed` | ERROR | source_bucket, source_key, error |
| Chunking start | `embed_chunking_start` | DEBUG | library_type, strategy, chunk_size, chunk_overlap |
| Chunking complete | `embed_chunking_complete` | INFO | chunks_created, avg_chunk_size |
| Chunking failed | `embed_chunking_failed` | ERROR | file_type, error |
| Embedding start | `embed_vectors_start` | DEBUG | model_name, dimensions, batch_size, total_chunks |
| Embedding complete | `embed_vectors_complete` | INFO | model_name, duration_ms, tokens_processed |
| Embedding failed | `embed_vectors_failed` | ERROR | model_name, chunk_index, error |
| Neo4j write start | `embed_neo4j_write_start` | DEBUG | chunks_to_write count |
| Neo4j write complete | `embed_neo4j_write_complete` | INFO | chunks_written, duration_ms |
| Neo4j write failed | `embed_neo4j_write_failed` | ERROR | chunk_index, neo4j_error |
| Concept extraction start | `embed_concepts_start` | DEBUG | model_name |
| Concept extraction complete | `embed_concepts_complete` | INFO | concepts_extracted, concept_names, duration_ms |
| Graph build start | `embed_graph_build_start` | DEBUG | — |
| Graph build complete | `embed_graph_build_complete` | INFO | relationships_created, duration_ms |
| Job completed | `embed_job_completed` | INFO | job_id, item_uid, total_duration_ms, chunks, concepts |
| Job failed | `embed_job_failed` | ERROR | job_id, item_uid, exception_type, error, full traceback |
---
## 4. S3 Bucket Strategy
Mnemosyne uses its own bucket (`mnemosyne-content`, Terraform-provisioned per Phase 1). On ingest, the Celery worker copies the file from the Daedalus bucket to Mnemosyne's bucket.
```
mnemosyne-content bucket
├── items/
│   └── {item_uid}/
│       ├── original/(unknown)        ← copied from Daedalus bucket
│       └── chunks/
│           ├── chunk_000.txt
│           └── chunk_001.txt
└── images/
    └── {image_uid}/(unknown)
```
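A small helper can build the keys for one item following the layout above, keeping the key scheme in one place for the copy and chunk-upload steps. The filename handling is illustrative:

```python
def item_keys(item_uid: str, filename: str, n_chunks: int) -> dict:
    """Build the S3 keys for one ingested item, per the bucket layout above."""
    base = f"items/{item_uid}"
    return {
        "original": f"{base}/original/{filename}",
        "chunks": [f"{base}/chunks/chunk_{i:03d}.txt" for i in range(n_chunks)],
    }
```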
### Configuration
```bash
# .env additions
# Mnemosyne's own bucket (existing)
AWS_STORAGE_BUCKET_NAME=mnemosyne-content
# Cross-bucket read access to Daedalus bucket
DAEDALUS_S3_BUCKET_NAME=daedalus
DAEDALUS_S3_ENDPOINT_URL=http://incus-s3.incus:9000
DAEDALUS_S3_ACCESS_KEY_ID=${VAULT_DAEDALUS_S3_READ_KEY}
DAEDALUS_S3_SECRET_ACCESS_KEY=${VAULT_DAEDALUS_S3_READ_SECRET}
# MCP server
MCP_SERVER_PORT=22091
MCP_REQUIRE_AUTH=False
```
---
## 5. Prometheus Metrics
```
# MCP tool calls
mnemosyne_mcp_tool_invocations_total{tool,status} counter
mnemosyne_mcp_tool_duration_seconds{tool} histogram
# Ingest pipeline
mnemosyne_ingest_jobs_total{status} counter
mnemosyne_ingest_duration_seconds{library_type} histogram
mnemosyne_chunks_created_total{library_type} counter
mnemosyne_concepts_extracted_total counter
mnemosyne_embeddings_generated_total{model} counter
mnemosyne_embedding_duration_seconds{model} histogram
# Search performance
mnemosyne_search_duration_seconds{search_type} histogram
mnemosyne_search_results_total{search_type} counter
mnemosyne_rerank_duration_seconds{model} histogram
# Infrastructure
mnemosyne_neo4j_query_duration_seconds{query_type} histogram
mnemosyne_s3_operations_total{operation,status} counter
```
---
## 6. Implementation Phases (Mnemosyne-specific)
### Phase 1 — REST Ingest API
- [ ] Create `IngestJob` model + Django migration
- [ ] Implement `POST /api/v1/library/ingest` endpoint
- [ ] Implement `GET /api/v1/library/jobs/{job_id}` endpoint
- [ ] Implement `POST /api/v1/library/jobs/{job_id}/retry` endpoint
- [ ] Implement `GET /api/v1/library/jobs` list endpoint
- [ ] Implement `embed_item` Celery task with full debug logging
- [ ] Add S3 cross-bucket copy logic
- [ ] Add ingest API serializers and URL routing
### Phase 2 — MCP Server (Phase 5 of Mnemosyne roadmap)
- [ ] Create `mcp_server/` module following Django MCP Pattern
- [ ] Implement `search_knowledge` tool (hybrid search + re-rank)
- [ ] Implement `search_by_category` tool
- [ ] Implement `list_libraries`, `list_collections`, `get_item`, `get_concepts` tools
- [ ] Implement `get_health` tool per Pallas health spec
- [ ] Register MCP resources (`mnemosyne://library-types`, `mnemosyne://libraries`)
- [ ] ASGI mount + Uvicorn deployment on port 22091
- [ ] Systemd service for MCP Uvicorn process
- [ ] Add Prometheus metrics
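How the tools listed above might hang together, sketched with a plain dict registry rather than the actual MCP SDK — the tool names come from the checklist, but the registration pattern and health payload shape are assumptions:

```python
# Illustrative registry: maps tool names to handlers, as an MCP server would.
TOOLS = {}

def tool(func):
    """Register a handler under its function name."""
    TOOLS[func.__name__] = func
    return func

@tool
def get_health() -> dict:
    # Payload shape per the Pallas health spec is assumed here.
    return {"status": "ok", "service": "mnemosyne-mcp"}

@tool
def list_libraries() -> list:
    return []  # placeholder body

def call_tool(name: str, **kwargs):
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**kwargs)

assert call_tool("get_health")["status"] == "ok"
```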

docs/ouranos.md Normal file

@@ -0,0 +1,333 @@
# Ouranos Lab
Infrastructure-as-Code project managing the **Ouranos Lab** — a development sandbox at [ouranos.helu.ca](https://ouranos.helu.ca). Uses **Terraform** for container provisioning and **Ansible** for configuration management, themed around the moons of Uranus.
---
## Project Overview
| Component | Purpose |
|-----------|---------|
| **Terraform** | Provisions 10 specialised Incus containers (LXC) with DNS-resolved networking, security policies, and resource dependencies |
| **Ansible** | Deploys Docker, databases (PostgreSQL, Neo4j), observability stack (Prometheus, Grafana, Loki), and application runtimes across all hosts |
> **DNS Domain**: Incus resolves containers via the `.incus` domain suffix (e.g., `oberon.incus`, `portia.incus`). IPv4 addresses are dynamically assigned — always use DNS names, never hardcode IPs.
---
## Uranian Host Architecture
All containers are named after moons of Uranus and resolved via the `.incus` DNS suffix.
| Name | Role | Description | Nesting |
|------|------|-------------|---------|
| **ariel** | graph_database | Neo4j — Ethereal graph connections | ✔ |
| **caliban** | agent_automation | Agent S MCP Server with MATE Desktop | ✔ |
| **miranda** | mcp_docker_host | Dedicated Docker Host for MCP Servers | ✔ |
| **oberon** | container_orchestration | Docker Host — MCP Switchboard, RabbitMQ, Open WebUI | ✔ |
| **portia** | database | PostgreSQL — Relational database host | ❌ |
| **prospero** | observability | PPLG stack — Prometheus, Grafana, Loki, PgAdmin | ❌ |
| **puck** | application_runtime | Python App Host — JupyterLab, Django apps, Gitea Runner | ✔ |
| **rosalind** | collaboration | Gitea, LobeChat, Nextcloud, AnythingLLM | ✔ |
| **sycorax** | language_models | Arke LLM Proxy | ✔ |
| **titania** | proxy_sso | HAProxy TLS termination + Casdoor SSO | ✔ |
### oberon — Container Orchestration
King of the Fairies orchestrating containers and managing MCP infrastructure.
- Docker engine
- MCP Switchboard (port 22785) — Django app routing MCP tool calls
- RabbitMQ message queue
- Open WebUI LLM interface (port 22088, PostgreSQL backend on Portia)
- SearXNG privacy search (port 22083, behind OAuth2-Proxy)
- smtp4dev SMTP test server (port 22025)
- Home Assistant (port 8123)
### portia — Relational Database
Intelligent and resourceful — the reliability of relational databases.
- PostgreSQL 17 (port 5432)
- Databases: `arke`, `anythingllm`, `gitea`, `hass`, `lobechat`, `mcp_switchboard`, `nextcloud`, `openwebui`, `periplus`, `spelunker`
### ariel — Graph Database
Air spirit — ethereal, interconnected nature mirroring graph relationships.
- Neo4j 5.26.0 (Docker)
- HTTP API: port 25584
- Bolt: port 25554
### puck — Application Runtime
Shape-shifting trickster embodying Python's versatility.
- Docker engine
- JupyterLab (port 22071 via OAuth2-Proxy)
- Gitea Runner (CI/CD agent)
- Django applications: Angelia (22281), Athena (22481), Kairos (22581), Icarlos (22681), Spelunker (22881), Peitho (22981)
### prospero — Observability Stack
Master magician observing all events.
- PPLG stack via Docker Compose: Prometheus, Loki, Grafana, PgAdmin
- Internal HAProxy with OAuth2-Proxy for all dashboards
- AlertManager with Pushover notifications
- Prometheus metrics collection (`node-exporter`, HAProxy, Loki)
- Loki log aggregation via Alloy (all hosts)
- Grafana dashboard suite with Casdoor SSO integration
### miranda — MCP Docker Host
Curious bridge between worlds — hosting MCP server containers.
- Docker engine (API exposed on port 2375 for MCP Switchboard)
- MCPO OpenAI-compatible MCP proxy
- Grafana MCP Server (port 25533)
- Gitea MCP Server (port 25535)
- Neo4j MCP Server
- Argos MCP Server — web search via SearXNG (port 25534)
### sycorax — Language Models
Original magical power wielding language magic.
- Arke LLM API Proxy (port 25540)
- Multi-provider support (OpenAI, Anthropic, etc.)
- Session management with Memcached
- Database backend on Portia
### caliban — Agent Automation
Autonomous computer agent learning through environmental interaction.
- Docker engine
- Agent S MCP Server (MATE desktop, AT-SPI automation)
- Kernos MCP Shell Server (port 22021)
- GPU passthrough for vision tasks
- RDP access (port 25521)
### rosalind — Collaboration Services
Witty and resourceful — home of the PHP, Go, and Node.js runtimes.
- Gitea self-hosted Git (port 22082, SSH on 22022)
- LobeChat AI chat interface (port 22081)
- Nextcloud file sharing and collaboration (port 22083)
- AnythingLLM document AI workspace (port 22084)
- Nextcloud data on dedicated Incus storage volume
### titania — Proxy & SSO Services
Queen of the Fairies managing access control and authentication.
- HAProxy 3.x with TLS termination (port 443)
- Let's Encrypt wildcard certificate via certbot DNS-01 (Namecheap)
- HTTP to HTTPS redirect (port 80)
- Gitea SSH proxy (port 22022)
- Casdoor SSO (port 22081, local PostgreSQL)
- Prometheus metrics at `:8404/metrics`
---
## External Access via HAProxy
Titania provides TLS termination and reverse proxy for all services.
- **Base domain**: `ouranos.helu.ca`
- **HTTPS**: port 443 (standard)
- **HTTP**: port 80 (redirects to HTTPS)
- **Certificate**: Let's Encrypt wildcard via certbot DNS-01
### Route Table
| Subdomain | Backend | Service |
|-----------|---------|---------|
| `ouranos.helu.ca` (root) | puck.incus:22281 | Angelia (Django) |
| `alertmanager.ouranos.helu.ca` | prospero.incus:443 (SSL) | AlertManager |
| `angelia.ouranos.helu.ca` | puck.incus:22281 | Angelia (Django) |
| `anythingllm.ouranos.helu.ca` | rosalind.incus:22084 | AnythingLLM |
| `arke.ouranos.helu.ca` | sycorax.incus:25540 | Arke LLM Proxy |
| `athena.ouranos.helu.ca` | puck.incus:22481 | Athena (Django) |
| `gitea.ouranos.helu.ca` | rosalind.incus:22082 | Gitea |
| `grafana.ouranos.helu.ca` | prospero.incus:443 (SSL) | Grafana |
| `hass.ouranos.helu.ca` | oberon.incus:8123 | Home Assistant |
| `id.ouranos.helu.ca` | titania.incus:22081 | Casdoor SSO |
| `icarlos.ouranos.helu.ca` | puck.incus:22681 | Icarlos (Django) |
| `jupyterlab.ouranos.helu.ca` | puck.incus:22071 | JupyterLab (OAuth2-Proxy) |
| `kairos.ouranos.helu.ca` | puck.incus:22581 | Kairos (Django) |
| `lobechat.ouranos.helu.ca` | rosalind.incus:22081 | LobeChat |
| `loki.ouranos.helu.ca` | prospero.incus:443 (SSL) | Loki |
| `mcp-switchboard.ouranos.helu.ca` | oberon.incus:22785 | MCP Switchboard |
| `nextcloud.ouranos.helu.ca` | rosalind.incus:22083 | Nextcloud |
| `openwebui.ouranos.helu.ca` | oberon.incus:22088 | Open WebUI |
| `peitho.ouranos.helu.ca` | puck.incus:22981 | Peitho (Django) |
| `pgadmin.ouranos.helu.ca` | prospero.incus:443 (SSL) | PgAdmin 4 |
| `prometheus.ouranos.helu.ca` | prospero.incus:443 (SSL) | Prometheus |
| `searxng.ouranos.helu.ca` | oberon.incus:22073 | SearXNG (OAuth2-Proxy) |
| `smtp4dev.ouranos.helu.ca` | oberon.incus:22085 | smtp4dev |
| `spelunker.ouranos.helu.ca` | puck.incus:22881 | Spelunker (Django) |
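One route from the table, sketched as an HAProxy host-header ACL; the frontend/backend names and certificate path are assumptions, not the real titania config:

```
frontend https_in
    bind :443 ssl crt /etc/haproxy/certs/ouranos.helu.ca.pem
    acl host_gitea hdr(host) -i gitea.ouranos.helu.ca
    use_backend be_gitea if host_gitea

backend be_gitea
    server gitea1 rosalind.incus:22082 check
```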
---
## Infrastructure Management
### Quick Start
```bash
# Provision containers
cd terraform
terraform init
terraform plan
terraform apply
# Start all containers
cd ../ansible
source ~/env/agathos/bin/activate
ansible-playbook sandbox_up.yml
# Deploy all services
ansible-playbook site.yml
# Stop all containers
ansible-playbook sandbox_down.yml
```
### Terraform Workflow
1. **Define** — Containers, networks, and resources in `*.tf` files
2. **Plan** — Review changes with `terraform plan`
3. **Apply** — Provision with `terraform apply`
4. **Verify** — Check outputs and container status
### Ansible Workflow
1. **Bootstrap** — Update packages, install essentials (`apt_update.yml`)
2. **Agents** — Deploy Alloy (log/metrics) and Node Exporter on all hosts
3. **Services** — Configure databases, Docker, applications, observability
4. **Verify** — Check service health and connectivity
### Vault Management
```bash
# Edit secrets
ansible-vault edit inventory/group_vars/all/vault.yml
# View secrets
ansible-vault view inventory/group_vars/all/vault.yml
# Encrypt a new file
ansible-vault encrypt new_secrets.yml
```
---
## S3 Storage Provisioning
Terraform provisions Incus S3 buckets for services requiring object storage:
| Service | Host | Purpose |
|---------|------|---------|
| **Casdoor** | Titania | User avatars and SSO resource storage |
| **LobeChat** | Rosalind | File uploads and attachments |
> S3 credentials (access key, secret key, endpoint) are stored as sensitive Terraform outputs and managed in Ansible Vault with the `vault_*_s3_*` prefix.
---
## Ansible Automation
### Full Deployment (`site.yml`)
Playbooks run in dependency order:
| Playbook | Hosts | Purpose |
|----------|-------|---------|
| `apt_update.yml` | All | Update packages and install essentials |
| `alloy/deploy.yml` | All | Grafana Alloy log/metrics collection |
| `prometheus/node_deploy.yml` | All | Node Exporter metrics |
| `docker/deploy.yml` | Oberon, Ariel, Miranda, Puck, Rosalind, Sycorax, Caliban, Titania | Docker engine |
| `smtp4dev/deploy.yml` | Oberon | SMTP test server |
| `pplg/deploy.yml` | Prospero | Full observability stack + HAProxy + OAuth2-Proxy |
| `postgresql/deploy.yml` | Portia | PostgreSQL with all databases |
| `postgresql_ssl/deploy.yml` | Titania | Dedicated PostgreSQL for Casdoor |
| `neo4j/deploy.yml` | Ariel | Neo4j graph database |
| `searxng/deploy.yml` | Oberon | SearXNG privacy search |
| `haproxy/deploy.yml` | Titania | HAProxy TLS termination and routing |
| `casdoor/deploy.yml` | Titania | Casdoor SSO |
| `mcpo/deploy.yml` | Miranda | MCPO MCP proxy |
| `openwebui/deploy.yml` | Oberon | Open WebUI LLM interface |
| `hass/deploy.yml` | Oberon | Home Assistant |
| `gitea/deploy.yml` | Rosalind | Gitea self-hosted Git |
| `nextcloud/deploy.yml` | Rosalind | Nextcloud collaboration |
### Individual Service Deployments
Services with standalone deploy playbooks (not in `site.yml`):
| Playbook | Host | Service |
|----------|------|---------|
| `anythingllm/deploy.yml` | Rosalind | AnythingLLM document AI |
| `arke/deploy.yml` | Sycorax | Arke LLM proxy |
| `argos/deploy.yml` | Miranda | Argos MCP web search server |
| `caliban/deploy.yml` | Caliban | Agent S MCP Server |
| `certbot/deploy.yml` | Titania | Let's Encrypt certificate renewal |
| `gitea_mcp/deploy.yml` | Miranda | Gitea MCP Server |
| `gitea_runner/deploy.yml` | Puck | Gitea CI/CD runner |
| `grafana_mcp/deploy.yml` | Miranda | Grafana MCP Server |
| `jupyterlab/deploy.yml` | Puck | JupyterLab + OAuth2-Proxy |
| `kernos/deploy.yml` | Caliban | Kernos MCP shell server |
| `lobechat/deploy.yml` | Rosalind | LobeChat AI chat |
| `neo4j_mcp/deploy.yml` | Miranda | Neo4j MCP Server |
| `rabbitmq/deploy.yml` | Oberon | RabbitMQ message queue |
### Lifecycle Playbooks
| Playbook | Purpose |
|----------|---------|
| `sandbox_up.yml` | Start all Uranian host containers |
| `sandbox_down.yml` | Gracefully stop all containers |
| `apt_update.yml` | Update packages on all hosts |
| `site.yml` | Full deployment orchestration |
---
## Data Flow Architecture
### Observability Pipeline
```
All Hosts                     Prospero                        Alerts
Alloy + Node Exporter   →   Prometheus + Loki + Grafana   →   AlertManager + Pushover
(collect metrics & logs)    (storage & visualisation)         (notifications)
```
### Integration Points
| Consumer | Provider | Connection |
|----------|----------|-----------|
| All LLM apps | Arke (Sycorax) | `http://sycorax.incus:25540` |
| Open WebUI, Arke, Gitea, Nextcloud, LobeChat | PostgreSQL (Portia) | `portia.incus:5432` |
| Neo4j MCP | Neo4j (Ariel) | `ariel.incus:25554` (Bolt) |
| MCP Switchboard | Docker API (Miranda) | `tcp://miranda.incus:2375` |
| MCP Switchboard | RabbitMQ (Oberon) | `oberon.incus:5672` |
| Kairos, Spelunker | RabbitMQ (Oberon) | `oberon.incus:5672` |
| SMTP (all apps) | smtp4dev (Oberon) | `oberon.incus:22025` |
| All hosts | Loki (Prospero) | `http://prospero.incus:3100` |
| All hosts | Prometheus (Prospero) | `http://prospero.incus:9090` |
---
## Important Notes
⚠️ **Alloy Host Variables Required** — Every host with `alloy` in its `services` list must define `alloy_log_level` in `inventory/host_vars/<host>.incus.yml`. The playbook will fail with an undefined variable error if this is missing.
⚠️ **Alloy Syslog Listeners Required for Docker Services** — Any Docker Compose service using the syslog logging driver must have a corresponding `loki.source.syslog` listener in the host's Alloy config template (`ansible/alloy/<hostname>/config.alloy.j2`). Missing listeners cause Docker containers to fail on start.
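A sketch of the listener that warning calls for, in Alloy configuration syntax; the port and label values are assumptions:

```
// One listener per Docker service using the syslog logging driver.
loki.source.syslog "docker_services" {
  listener {
    address  = "0.0.0.0:51514"
    protocol = "tcp"
    labels   = { job = "docker-syslog" }
  }
  forward_to = [loki.write.default.receiver]
}
```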
⚠️ **Local Terraform State** — This project uses local Terraform state (no remote backend). Do not run `terraform apply` from multiple machines simultaneously.
⚠️ **Nested Docker** — Docker runs inside Incus containers (nested), requiring `security.nesting = true` and `lxc.apparmor.profile=unconfined` AppArmor override on all Docker-enabled hosts.
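The nesting settings translate to instance config along these lines — a sketch assuming the `incus_instance` resource of the Terraform Incus provider:

```
resource "incus_instance" "oberon" {
  name  = "oberon"
  image = "images:ubuntu/24.04"

  config = {
    "security.nesting" = "true"
    "raw.lxc"          = "lxc.apparmor.profile=unconfined"
  }
}
```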
⚠️ **Deployment Order** — Prospero (observability) must be fully deployed before other hosts, as Alloy on every host pushes logs and metrics to `prospero.incus`. Run `pplg/deploy.yml` before `site.yml` on a fresh environment.


@@ -0,0 +1,63 @@
# =============================================================================
# Mnemosyne Django Environment Variables
# =============================================================================
# Copy this file to .env and configure for your environment
# This file contains all variables read by Django settings.py
# --- Security ---
SECRET_KEY=change-me-to-a-real-secret-key
DEBUG=True
ALLOWED_HOSTS=localhost,127.0.0.1,mnemosyne.ouranos.helu.ca
CSRF_TRUSTED_ORIGINS=http://localhost:8000,https://mnemosyne.ouranos.helu.ca
# --- PostgreSQL Database ---
DATABASE_URL=postgres://mnemosyne:password@portia.incus:5432/mnemosyne
# --- Neo4j Graph Database ---
NEOMODEL_NEO4J_BOLT_URL=bolt://neo4j:password@ariel.incus:25554
# --- Memcached ---
KVDB_LOCATION=127.0.0.1:11211
KVDB_PREFIX=mnemosyne
# --- Celery / RabbitMQ ---
CELERY_BROKER_URL=amqp://mnemosyne:password@oberon.incus:5672/mnemosyne
CELERY_RESULT_BACKEND=rpc://
CELERY_TASK_ALWAYS_EAGER=False
# --- S3 Storage (Incus bucket, MinIO-backed) ---
AWS_ACCESS_KEY_ID=
AWS_SECRET_ACCESS_KEY=
AWS_STORAGE_BUCKET_NAME=mnemosyne-content
AWS_S3_ENDPOINT_URL=
AWS_S3_USE_SSL=False
AWS_S3_VERIFY=False
AWS_S3_REGION_NAME=us-east-1
# Set to True to use local FileSystemStorage instead of S3 (dev/test)
USE_LOCAL_STORAGE=True
# --- Email (smtp4dev on Oberon) ---
EMAIL_HOST=oberon.incus
EMAIL_PORT=22025
EMAIL_USE_TLS=False
# --- LLM API Encryption ---
# Encryption key for LLM API keys stored in the database
# Generate with: python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())"
LLM_API_SECRETS_ENCRYPTION_KEY=
# --- Embedding Pipeline (Phase 2) ---
# Batch size for embedding API calls (smaller for local GPU, larger for cloud)
EMBEDDING_BATCH_SIZE=8
# Timeout in seconds for embedding API requests
EMBEDDING_TIMEOUT=120
# --- Logging ---
# Valid levels: DEBUG, INFO, WARNING, ERROR, CRITICAL
LOGGING_LEVEL=INFO
CELERY_LOGGING_LEVEL=INFO
DJANGO_LOGGING_LEVEL=WARNING
# --- Localization ---
TIME_ZONE=UTC
LANGUAGE_CODE=en-us


@@ -0,0 +1 @@
default_app_config = "library.apps.LibraryConfig"


@@ -0,0 +1,5 @@
# Library app does not use standard Django admin (neomodel StructuredNodes
# are not Django ORM models). Custom admin views are provided as regular
# app views in library/views.py, rendered within Themis's template structure.
#
# The embedding pipeline dashboard is at /library/embedding/


@@ -0,0 +1,83 @@
"""
DRF serializers for the library app.
Serialize Neo4j neomodel nodes into JSON for the REST API.
"""
from rest_framework import serializers
class LibrarySerializer(serializers.Serializer):
uid = serializers.CharField(read_only=True)
name = serializers.CharField(max_length=200)
library_type = serializers.ChoiceField(
choices=["fiction", "technical", "music", "film", "art", "journal"]
)
description = serializers.CharField(required=False, allow_blank=True, default="")
chunking_config = serializers.JSONField(required=False, default=dict)
embedding_instruction = serializers.CharField(
required=False, allow_blank=True, default=""
)
reranker_instruction = serializers.CharField(
required=False, allow_blank=True, default=""
)
llm_context_prompt = serializers.CharField(
required=False, allow_blank=True, default=""
)
created_at = serializers.DateTimeField(read_only=True)
class CollectionSerializer(serializers.Serializer):
uid = serializers.CharField(read_only=True)
name = serializers.CharField(max_length=200)
description = serializers.CharField(required=False, allow_blank=True, default="")
metadata = serializers.JSONField(required=False, default=dict)
created_at = serializers.DateTimeField(read_only=True)
library_uid = serializers.CharField(
required=False, write_only=True, help_text="UID of the parent library"
)
class ItemSerializer(serializers.Serializer):
uid = serializers.CharField(read_only=True)
title = serializers.CharField(max_length=500)
item_type = serializers.CharField(required=False, allow_blank=True, default="")
s3_key = serializers.CharField(read_only=True)
content_hash = serializers.CharField(read_only=True)
file_type = serializers.CharField(required=False, allow_blank=True, default="")
file_size = serializers.IntegerField(read_only=True)
metadata = serializers.JSONField(required=False, default=dict)
created_at = serializers.DateTimeField(read_only=True)
updated_at = serializers.DateTimeField(read_only=True)
collection_uid = serializers.CharField(
required=False, write_only=True, help_text="UID of the parent collection"
)
# Phase 2: Embedding pipeline fields
embedding_status = serializers.CharField(read_only=True)
embedding_model_name = serializers.CharField(read_only=True)
chunk_count = serializers.IntegerField(read_only=True)
image_count = serializers.IntegerField(read_only=True)
class ChunkSerializer(serializers.Serializer):
uid = serializers.CharField(read_only=True)
chunk_index = serializers.IntegerField()
chunk_s3_key = serializers.CharField()
chunk_size = serializers.IntegerField(required=False, default=0)
text_preview = serializers.CharField(required=False, allow_blank=True, default="")
created_at = serializers.DateTimeField(read_only=True)
class ConceptSerializer(serializers.Serializer):
uid = serializers.CharField(read_only=True)
name = serializers.CharField(max_length=200)
concept_type = serializers.CharField(required=False, allow_blank=True, default="")
class ImageSerializer(serializers.Serializer):
uid = serializers.CharField(read_only=True)
s3_key = serializers.CharField()
image_type = serializers.CharField(required=False, allow_blank=True, default="")
description = serializers.CharField(required=False, allow_blank=True, default="")
metadata = serializers.JSONField(required=False, default=dict)
created_at = serializers.DateTimeField(read_only=True)


@@ -0,0 +1,24 @@
"""
URL patterns for the library DRF API.
"""
from django.urls import path
from . import views
app_name = "library-api"
urlpatterns = [
# Libraries
path("libraries/", views.library_list_create, name="library-list"),
path("libraries/<str:uid>/", views.library_detail, name="library-detail"),
# Collections
path("collections/", views.collection_list_create, name="collection-list"),
path("collections/<str:uid>/", views.collection_detail, name="collection-detail"),
# Items
path("items/", views.item_list_create, name="item-list"),
path("items/upload/", views.item_upload, name="item-upload"),
path("items/<str:uid>/", views.item_detail, name="item-detail"),
path("items/<str:uid>/reembed/", views.item_reembed, name="item-reembed"),
path("items/<str:uid>/status/", views.item_status, name="item-status"),
]


@@ -0,0 +1,426 @@
"""
DRF API views for the library app.
All views are function-based per Red Panda Standards.
"""
import hashlib
import logging
import os
from django.core.files.base import ContentFile
from django.core.files.storage import default_storage
from rest_framework import status
from rest_framework.decorators import api_view, parser_classes, permission_classes
from rest_framework.parsers import FormParser, JSONParser, MultiPartParser
from rest_framework.permissions import IsAuthenticated
from rest_framework.response import Response
from library.content_types import get_library_type_config
from .serializers import (
CollectionSerializer,
ItemSerializer,
LibrarySerializer,
)
logger = logging.getLogger(__name__)
# ---------------------------------------------------------------------------
# Library API
# ---------------------------------------------------------------------------
@api_view(["GET", "POST"])
@permission_classes([IsAuthenticated])
def library_list_create(request):
"""List all libraries or create a new one."""
from library.models import Library
if request.method == "GET":
libraries = Library.nodes.order_by("name")
serializer = LibrarySerializer(libraries, many=True)
return Response(serializer.data)
# POST — create
serializer = LibrarySerializer(data=request.data)
serializer.is_valid(raise_exception=True)
data = serializer.validated_data
# Populate defaults from content-type config if not provided
library_type = data["library_type"]
defaults = get_library_type_config(library_type)
lib = Library(
name=data["name"],
library_type=library_type,
description=data.get("description", ""),
chunking_config=data.get("chunking_config") or defaults["chunking_config"],
embedding_instruction=(
data.get("embedding_instruction") or defaults["embedding_instruction"]
),
reranker_instruction=(
data.get("reranker_instruction") or defaults["reranker_instruction"]
),
llm_context_prompt=(
data.get("llm_context_prompt") or defaults["llm_context_prompt"]
),
)
lib.save()
return Response(LibrarySerializer(lib).data, status=status.HTTP_201_CREATED)
@api_view(["GET", "PUT", "DELETE"])
@permission_classes([IsAuthenticated])
def library_detail(request, uid):
"""Retrieve, update, or delete a library."""
from library.models import Library
try:
lib = Library.nodes.get(uid=uid)
except Library.DoesNotExist:
return Response(
{"detail": "Library not found."}, status=status.HTTP_404_NOT_FOUND
)
if request.method == "GET":
return Response(LibrarySerializer(lib).data)
if request.method == "PUT":
serializer = LibrarySerializer(data=request.data, partial=True)
serializer.is_valid(raise_exception=True)
data = serializer.validated_data
for field in [
"name",
"library_type",
"description",
"chunking_config",
"embedding_instruction",
"reranker_instruction",
"llm_context_prompt",
]:
if field in data:
setattr(lib, field, data[field])
lib.save()
return Response(LibrarySerializer(lib).data)
# DELETE
lib.delete()
return Response(status=status.HTTP_204_NO_CONTENT)
# ---------------------------------------------------------------------------
# Collection API
# ---------------------------------------------------------------------------
@api_view(["GET", "POST"])
@permission_classes([IsAuthenticated])
def collection_list_create(request):
"""List all collections or create a new one."""
from library.models import Collection, Library
if request.method == "GET":
# Optionally filter by library_uid query param
library_uid = request.query_params.get("library_uid")
if library_uid:
try:
lib = Library.nodes.get(uid=library_uid)
collections = lib.collections.all()
except Library.DoesNotExist:
return Response(
{"detail": "Library not found."}, status=status.HTTP_404_NOT_FOUND
)
else:
collections = Collection.nodes.all()
serializer = CollectionSerializer(collections, many=True)
return Response(serializer.data)
# POST
serializer = CollectionSerializer(data=request.data)
serializer.is_valid(raise_exception=True)
data = serializer.validated_data
col = Collection(
name=data["name"],
description=data.get("description", ""),
metadata=data.get("metadata", {}),
)
col.save()
# Connect to library if library_uid provided
library_uid = data.get("library_uid")
if library_uid:
try:
lib = Library.nodes.get(uid=library_uid)
lib.collections.connect(col)
col.library.connect(lib)
except Library.DoesNotExist:
logger.warning("Library not found: %s", library_uid)
return Response(CollectionSerializer(col).data, status=status.HTTP_201_CREATED)
@api_view(["GET", "PUT", "DELETE"])
@permission_classes([IsAuthenticated])
def collection_detail(request, uid):
"""Retrieve, update, or delete a collection."""
from library.models import Collection
try:
col = Collection.nodes.get(uid=uid)
except Collection.DoesNotExist:
return Response(
{"detail": "Collection not found."}, status=status.HTTP_404_NOT_FOUND
)
if request.method == "GET":
return Response(CollectionSerializer(col).data)
if request.method == "PUT":
serializer = CollectionSerializer(data=request.data, partial=True)
serializer.is_valid(raise_exception=True)
data = serializer.validated_data
for field in ["name", "description", "metadata"]:
if field in data:
setattr(col, field, data[field])
col.save()
return Response(CollectionSerializer(col).data)
col.delete()
return Response(status=status.HTTP_204_NO_CONTENT)
# ---------------------------------------------------------------------------
# Item API
# ---------------------------------------------------------------------------
@api_view(["GET", "POST"])
@permission_classes([IsAuthenticated])
def item_list_create(request):
"""List all items or create a new one."""
from library.models import Collection, Item
if request.method == "GET":
collection_uid = request.query_params.get("collection_uid")
if collection_uid:
try:
col = Collection.nodes.get(uid=collection_uid)
items = col.items.all()
except Collection.DoesNotExist:
return Response(
{"detail": "Collection not found."},
status=status.HTTP_404_NOT_FOUND,
)
else:
items = Item.nodes.all()
serializer = ItemSerializer(items, many=True)
return Response(serializer.data)
# POST
serializer = ItemSerializer(data=request.data)
serializer.is_valid(raise_exception=True)
data = serializer.validated_data
item = Item(
title=data["title"],
item_type=data.get("item_type", ""),
file_type=data.get("file_type", ""),
metadata=data.get("metadata", {}),
)
item.save()
collection_uid = data.get("collection_uid")
if collection_uid:
try:
col = Collection.nodes.get(uid=collection_uid)
col.items.connect(item)
except Collection.DoesNotExist:
logger.warning("Collection not found: %s", collection_uid)
return Response(ItemSerializer(item).data, status=status.HTTP_201_CREATED)
@api_view(["GET", "PUT", "DELETE"])
@permission_classes([IsAuthenticated])
def item_detail(request, uid):
"""Retrieve, update, or delete an item."""
from library.models import Item
try:
item = Item.nodes.get(uid=uid)
except Item.DoesNotExist:
return Response(
{"detail": "Item not found."}, status=status.HTTP_404_NOT_FOUND
)
if request.method == "GET":
return Response(ItemSerializer(item).data)
if request.method == "PUT":
serializer = ItemSerializer(data=request.data, partial=True)
serializer.is_valid(raise_exception=True)
data = serializer.validated_data
for field in ["title", "item_type", "file_type", "metadata"]:
if field in data:
setattr(item, field, data[field])
item.save()
return Response(ItemSerializer(item).data)
item.delete()
return Response(status=status.HTTP_204_NO_CONTENT)
# ---------------------------------------------------------------------------
# Item Upload (Phase 2)
# ---------------------------------------------------------------------------
@api_view(["POST"])
@permission_classes([IsAuthenticated])
@parser_classes([MultiPartParser, FormParser])
def item_upload(request):
"""
Upload a file to create a new Item and trigger embedding.
Expects multipart form data with:
- file: The document file
- title: Item title
- collection_uid: (optional) UID of parent collection
- auto_embed: (optional) Whether to auto-trigger embedding (default: true)
"""
from library.models import Collection, Item
uploaded_file = request.FILES.get("file")
if not uploaded_file:
return Response(
{"detail": "No file provided."}, status=status.HTTP_400_BAD_REQUEST
)
title = request.data.get("title", uploaded_file.name)
collection_uid = request.data.get("collection_uid", "")
auto_embed = str(request.data.get("auto_embed", "true")).lower() in ("true", "1", "yes")
# Determine file type from extension
_, ext = os.path.splitext(uploaded_file.name)
file_type = ext.lstrip(".").lower()
# Read file data
file_data = uploaded_file.read()
content_hash = hashlib.sha256(file_data).hexdigest()
# Create Item node
item = Item(
title=title,
file_type=file_type,
file_size=len(file_data),
content_hash=content_hash,
embedding_status="pending",
)
item.save()
# Store file in S3
s3_key = f"items/{item.uid}/original.{file_type}"
try:
default_storage.save(s3_key, ContentFile(file_data))
item.s3_key = s3_key
item.save()
except Exception as exc:
logger.error("Failed to store file to S3: %s", exc)
item.delete()
return Response(
{"detail": f"File storage failed: {exc}"},
status=status.HTTP_500_INTERNAL_SERVER_ERROR,
)
# Connect to collection if specified
if collection_uid:
try:
col = Collection.nodes.get(uid=collection_uid)
col.items.connect(item)
except Exception:
logger.warning("Collection not found: %s", collection_uid)
# Auto-trigger embedding
task_id = None
if auto_embed:
try:
from library.tasks import embed_item
task = embed_item.delay(item.uid, request.user.id)
task_id = task.id
logger.info(
"Auto-triggered embedding item_uid=%s task_id=%s",
item.uid,
task_id,
)
except Exception as exc:
logger.warning("Failed to queue embedding task: %s", exc)
return Response(
{
**ItemSerializer(item).data,
"task_id": task_id,
},
status=status.HTTP_201_CREATED,
)
@api_view(["POST"])
@permission_classes([IsAuthenticated])
def item_reembed(request, uid):
"""Trigger re-embedding for an existing Item."""
from library.models import Item
try:
item = Item.nodes.get(uid=uid)
except Item.DoesNotExist:
return Response(
{"detail": "Item not found."}, status=status.HTTP_404_NOT_FOUND
)
try:
from library.tasks import reembed_item
task = reembed_item.delay(uid, request.user.id)
return Response(
{
"detail": "Re-embedding queued.",
"item_uid": uid,
"task_id": task.id,
}
)
except Exception as exc:
logger.error("Failed to queue reembed task: %s", exc)
return Response(
{"detail": f"Failed to queue task: {exc}"},
status=status.HTTP_500_INTERNAL_SERVER_ERROR,
)
@api_view(["GET"])
@permission_classes([IsAuthenticated])
def item_status(request, uid):
"""Get embedding status for an Item."""
from library.models import Item
try:
item = Item.nodes.get(uid=uid)
except Item.DoesNotExist:
return Response(
{"detail": "Item not found."}, status=status.HTTP_404_NOT_FOUND
)
return Response(
{
"uid": item.uid,
"title": item.title,
"embedding_status": item.embedding_status,
"embedding_model_name": item.embedding_model_name,
"chunk_count": item.chunk_count,
"image_count": item.image_count,
"error_message": item.error_message,
}
)


@@ -0,0 +1,7 @@
from django.apps import AppConfig
class LibraryConfig(AppConfig):
default_auto_field = "django.db.models.BigAutoField"
name = "library"
verbose_name = "Library"


@@ -0,0 +1,163 @@
"""
Content-type system configuration for Mnemosyne library types.
Each library type has a default configuration that governs chunking,
embedding, re-ranking, and LLM context injection.
"""
# Default configurations per library type.
# These are loaded into Library nodes via the load_library_types management command.
LIBRARY_TYPE_DEFAULTS = {
"fiction": {
"chunking_config": {
"strategy": "chapter_aware",
"chunk_size": 1024,
"chunk_overlap": 128,
"respect_boundaries": ["chapter", "scene", "paragraph"],
},
"embedding_instruction": (
"Represent this passage from a work of fiction for retrieval. "
"Focus on narrative elements: characters, plot events, themes, "
"setting, and emotional tone."
),
"reranker_instruction": (
"Re-rank passages from fiction based on narrative relevance to the query. "
"Prioritize character actions, dialogue, plot developments, and thematic elements."
),
"llm_context_prompt": (
"The following excerpts are from fiction (novels, short stories, etc.). "
"Treat this as creative/narrative content. Respect the literary context — "
"characters, settings, and events are fictional. Cite specific passages "
"when answering."
),
},
"technical": {
"chunking_config": {
"strategy": "section_aware",
"chunk_size": 512,
"chunk_overlap": 64,
"respect_boundaries": ["section", "subsection", "code_block", "list"],
},
"embedding_instruction": (
"Represent this passage from technical documentation for retrieval. "
"Focus on procedures, configurations, API references, code examples, "
"and technical concepts."
),
"reranker_instruction": (
"Re-rank passages from technical documentation based on procedural relevance. "
"Prioritize step-by-step instructions, code examples, and specific configurations."
),
"llm_context_prompt": (
"The following excerpts are from technical documentation (manuals, guides, "
"reference material). Provide precise, actionable answers. Include code "
"examples and exact configurations when available. Cite source sections."
),
},
"music": {
"chunking_config": {
"strategy": "song_level",
"chunk_size": 512,
"chunk_overlap": 32,
"respect_boundaries": ["song", "verse", "chorus"],
},
"embedding_instruction": (
"Represent this music content (lyrics, liner notes, metadata) for retrieval. "
"Focus on artist, album, genre, lyrical themes, and musical elements."
),
"reranker_instruction": (
"Re-rank music content based on relevance to the query. "
"Consider artist, genre, lyrical themes, and musical characteristics."
),
"llm_context_prompt": (
"The following excerpts are song lyrics and music metadata. "
"Consider the artistic and cultural context. Reference specific "
"songs, albums, and artists when answering."
),
},
"film": {
"chunking_config": {
"strategy": "scene_level",
"chunk_size": 768,
"chunk_overlap": 64,
"respect_boundaries": ["scene", "act", "sequence"],
},
"embedding_instruction": (
"Represent this film content (scripts, synopses, reviews) for retrieval. "
"Focus on scenes, characters, visual elements, dialogue, and narrative structure."
),
"reranker_instruction": (
"Re-rank film content based on cinematic relevance. "
"Prioritize scene descriptions, character interactions, and visual elements."
),
"llm_context_prompt": (
"The following excerpts are from film-related content (scripts, synopses, "
"reviews). Consider the cinematic context — visual storytelling, "
"direction, and performance. Cite specific scenes and films."
),
},
"art": {
"chunking_config": {
"strategy": "description_level",
"chunk_size": 512,
"chunk_overlap": 32,
"respect_boundaries": ["artwork", "description", "analysis"],
},
"embedding_instruction": (
"Represent this art content (descriptions, catalogs, analysis) for retrieval. "
"Focus on visual elements, style, medium, artist, period, and artistic movements."
),
"reranker_instruction": (
"Re-rank art content based on visual and stylistic relevance. "
"Prioritize descriptions of artwork, technique, composition, and artistic context."
),
"llm_context_prompt": (
"The following excerpts describe artworks and artistic content. "
"Consider visual elements, artistic technique, historical context, "
"and the artist's intent. Reference specific works and movements."
),
},
"journal": {
"chunking_config": {
"strategy": "entry_level",
"chunk_size": 512,
"chunk_overlap": 32,
"respect_boundaries": ["entry", "date", "paragraph"],
},
"embedding_instruction": (
"Represent this personal journal entry for retrieval. "
"Focus on temporal context, personal reflections, mentioned people "
"and places, and emotional content."
),
"reranker_instruction": (
"Re-rank journal entries based on temporal and thematic relevance. "
"Prioritize entries matching the time period, people, places, or topics in the query."
),
"llm_context_prompt": (
"The following excerpts are from personal journal entries. "
"This is private, reflective content. Respect the personal nature — "
"answer with sensitivity. Note dates and temporal context when relevant."
),
},
}
def get_library_type_config(library_type):
"""
Get the default configuration for a library type.
Args:
library_type: One of 'fiction', 'technical', 'music', 'film', 'art', 'journal'
Returns:
dict with keys: chunking_config, embedding_instruction,
reranker_instruction, llm_context_prompt
Raises:
ValueError: If library_type is not recognized
"""
if library_type not in LIBRARY_TYPE_DEFAULTS:
raise ValueError(
f"Unknown library type '{library_type}'. "
f"Valid types: {', '.join(LIBRARY_TYPE_DEFAULTS.keys())}"
)
return LIBRARY_TYPE_DEFAULTS[library_type]
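The lookup contract above is easy to pin down without the project installed. In this sketch, `DEFAULTS` and `get_config` are trimmed stand-ins for `LIBRARY_TYPE_DEFAULTS` and `get_library_type_config`, kept to two types for brevity:

```python
# Stand-ins for LIBRARY_TYPE_DEFAULTS / get_library_type_config, trimmed to two types.
DEFAULTS = {
    "fiction": {
        "chunking_config": {"strategy": "chapter_aware", "chunk_size": 1024},
        "embedding_instruction": "Represent this passage from a work of fiction...",
        "reranker_instruction": "Re-rank passages from fiction...",
        "llm_context_prompt": "The following excerpts are from fiction...",
    },
    "technical": {
        "chunking_config": {"strategy": "section_aware", "chunk_size": 512},
        "embedding_instruction": "Represent this passage from technical documentation...",
        "reranker_instruction": "Re-rank passages from technical documentation...",
        "llm_context_prompt": "The following excerpts are from technical documentation...",
    },
}

def get_config(library_type: str) -> dict:
    """Mirror of get_library_type_config: raise ValueError on unknown types."""
    if library_type not in DEFAULTS:
        raise ValueError(
            f"Unknown library type '{library_type}'. "
            f"Valid types: {', '.join(DEFAULTS)}"
        )
    return DEFAULTS[library_type]

# Every type must carry all four configuration keys.
REQUIRED_KEYS = {
    "chunking_config", "embedding_instruction",
    "reranker_instruction", "llm_context_prompt",
}
for cfg in DEFAULTS.values():
    assert REQUIRED_KEYS <= cfg.keys()
```

The same four-key invariant is what the rest of the pipeline (chunker, embedder, retriever) relies on per library.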

100
mnemosyne/library/forms.py Normal file
View File

@@ -0,0 +1,100 @@
"""
Django forms for Library admin views.
These forms are used by the custom admin views for Library, Collection,
and Item CRUD. They are plain Django forms (not ModelForms) because
neomodel StructuredNodes are not Django ORM models.
"""
from django import forms
from .content_types import LIBRARY_TYPE_DEFAULTS
LIBRARY_TYPE_CHOICES = [
(key, key.capitalize()) for key in LIBRARY_TYPE_DEFAULTS.keys()
]
class LibraryForm(forms.Form):
"""Form for creating/editing a Library node."""
name = forms.CharField(
max_length=200,
widget=forms.TextInput(attrs={"class": "input input-bordered w-full"}),
)
library_type = forms.ChoiceField(
choices=LIBRARY_TYPE_CHOICES,
widget=forms.Select(attrs={"class": "select select-bordered w-full"}),
)
description = forms.CharField(
required=False,
widget=forms.Textarea(
attrs={"class": "textarea textarea-bordered w-full", "rows": 3}
),
)
embedding_instruction = forms.CharField(
required=False,
widget=forms.Textarea(
attrs={"class": "textarea textarea-bordered w-full", "rows": 3}
),
)
reranker_instruction = forms.CharField(
required=False,
widget=forms.Textarea(
attrs={"class": "textarea textarea-bordered w-full", "rows": 3}
),
)
llm_context_prompt = forms.CharField(
required=False,
widget=forms.Textarea(
attrs={"class": "textarea textarea-bordered w-full", "rows": 3}
),
)
class CollectionForm(forms.Form):
"""Form for creating/editing a Collection node."""
name = forms.CharField(
max_length=200,
widget=forms.TextInput(attrs={"class": "input input-bordered w-full"}),
)
description = forms.CharField(
required=False,
widget=forms.Textarea(
attrs={"class": "textarea textarea-bordered w-full", "rows": 3}
),
)
class ItemForm(forms.Form):
"""Form for creating/editing an Item node."""
title = forms.CharField(
max_length=500,
widget=forms.TextInput(attrs={"class": "input input-bordered w-full"}),
)
item_type = forms.CharField(
required=False,
max_length=100,
widget=forms.TextInput(attrs={"class": "input input-bordered w-full"}),
)
file_type = forms.CharField(
required=False,
max_length=50,
widget=forms.TextInput(attrs={"class": "input input-bordered w-full"}),
)
file = forms.FileField(
required=False,
widget=forms.ClearableFileInput(
attrs={"class": "file-input file-input-bordered w-full"}
),
help_text="Upload a document (PDF, EPUB, DOCX, PPTX, TXT, etc.)",
)
auto_embed = forms.BooleanField(
required=False,
initial=True,
widget=forms.CheckboxInput(attrs={"class": "checkbox checkbox-primary"}),
help_text="Automatically start embedding after upload",
)

View File

View File

@@ -0,0 +1,43 @@
"""
Management command to embed all items in a Collection.
Usage:
python manage.py embed_collection <collection_uid>
"""
import logging
from django.core.management.base import BaseCommand, CommandError
logger = logging.getLogger(__name__)
class Command(BaseCommand):
help = "Queue embedding tasks for all items in a Collection."
def add_arguments(self, parser):
parser.add_argument(
"collection_uid", type=str, help="UID of the Collection to embed"
)
def handle(self, *args, **options):
collection_uid = options["collection_uid"]
try:
from library.models import Collection
col = Collection.nodes.get(uid=collection_uid)
except Exception as exc:
raise CommandError(f"Collection not found: {collection_uid} ({exc})") from exc
items = col.items.all()
self.stdout.write(f"Collection: {col.name} ({len(items)} items)")
from library.tasks import embed_collection
task = embed_collection.delay(collection_uid)
self.stdout.write(
self.style.SUCCESS(
f"Batch task queued: {task.id} ({len(items)} items)"
)
)

View File

@@ -0,0 +1,68 @@
"""
Management command to embed a single Item via the CLI.
Usage:
python manage.py embed_item <item_uid>
python manage.py embed_item <item_uid> --sync
"""
import logging
from django.core.management.base import BaseCommand, CommandError
logger = logging.getLogger(__name__)
class Command(BaseCommand):
help = "Run the embedding pipeline for a single Item."
def add_arguments(self, parser):
parser.add_argument("item_uid", type=str, help="UID of the Item to embed")
parser.add_argument(
"--sync",
action="store_true",
help="Run synchronously instead of queueing a Celery task",
)
def handle(self, *args, **options):
item_uid = options["item_uid"]
sync = options["sync"]
# Verify item exists
try:
from library.models import Item
item = Item.nodes.get(uid=item_uid)
except Exception as exc:
raise CommandError(f"Item not found: {item_uid} ({exc})") from exc
self.stdout.write(f"Item: {item.title} (type={item.file_type}, status={item.embedding_status})")
if sync:
self.stdout.write("Running embedding pipeline synchronously...")
from library.services.pipeline import EmbeddingPipeline
pipeline = EmbeddingPipeline()
def progress_cb(percent, message):
self.stdout.write(f" [{percent:3d}%] {message}")
try:
result = pipeline.process_item(item_uid, progress_callback=progress_cb)
self.stdout.write(
self.style.SUCCESS(
f"\nCompleted: {result.get('chunks_created', 0)} chunks, "
f"{result.get('images_stored', 0)} images, "
f"{result.get('concepts_extracted', 0)} concepts"
)
)
except Exception as exc:
raise CommandError(f"Embedding failed: {exc}") from exc
else:
self.stdout.write("Queueing embedding task...")
from library.tasks import embed_item
task = embed_item.delay(item_uid)
self.stdout.write(
self.style.SUCCESS(f"Task queued: {task.id}")
)
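The `--sync` path wires `progress_cb` straight into the pipeline as a `(percent, message)` callback. A minimal sketch of that protocol (the stage names here are illustrative, not the pipeline's real phases):

```python
from typing import Callable, Optional

def process_item(
    uid: str,
    progress_callback: Optional[Callable[[int, str], None]] = None,
) -> dict:
    """Toy pipeline: report (percent, message) after each stage."""
    stages = ["parse", "chunk", "embed", "link"]  # illustrative stage names
    for i, stage in enumerate(stages, start=1):
        if progress_callback:
            progress_callback(int(100 * i / len(stages)), stage)
    return {"chunks_created": 0, "images_stored": 0, "concepts_extracted": 0}

updates: list = []
result = process_item("abc123", progress_callback=lambda p, m: updates.append((p, m)))
```

The command above simply formats each update as `[{percent:3d}%] {message}` on stdout.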

View File

@@ -0,0 +1,132 @@
"""
Management command to display embedding pipeline status and statistics.
Usage:
python manage.py embedding_status
python manage.py embedding_status --library <uid>
"""
import logging
from django.core.management.base import BaseCommand
logger = logging.getLogger(__name__)
class Command(BaseCommand):
help = "Display embedding pipeline status and statistics."
def add_arguments(self, parser):
parser.add_argument(
"--library",
type=str,
default="",
help="Filter by library UID",
)
def handle(self, *args, **options):
library_uid = options["library"]
try:
from neomodel import db
except ImportError:
self.stderr.write(self.style.ERROR("neomodel not available"))
return
self.stdout.write(self.style.HTTP_INFO("\n=== Mnemosyne Embedding Pipeline Status ===\n"))
# System embedding model
try:
from llm_manager.models import LLMModel
embed_model = LLMModel.get_system_embedding_model()
if embed_model:
self.stdout.write(
f"System Embedding Model: {embed_model.api.name}: {embed_model.name} "
f"(dimensions={embed_model.vector_dimensions or '?'})"
)
else:
self.stdout.write(
self.style.WARNING("System Embedding Model: NOT CONFIGURED")
)
chat_model = LLMModel.get_system_chat_model()
if chat_model:
self.stdout.write(f"System Chat Model: {chat_model.api.name}: {chat_model.name}")
else:
self.stdout.write(
self.style.WARNING("System Chat Model: NOT CONFIGURED (concept extraction disabled)")
)
except Exception as exc:
self.stdout.write(self.style.ERROR(f"Could not query LLM models: {exc}"))
self.stdout.write("")
# Item status counts
try:
statuses = ["pending", "processing", "completed", "failed"]
self.stdout.write("Item Embedding Status:")
for status in statuses:
if library_uid:
query = (
"MATCH (l:Library {uid: $lib_uid})-[:CONTAINS]->(c:Collection)"
"-[:CONTAINS]->(i:Item {embedding_status: $status}) "
"RETURN count(i)"
)
results, _ = db.cypher_query(
query, {"lib_uid": library_uid, "status": status}
)
else:
query = (
"MATCH (i:Item {embedding_status: $status}) RETURN count(i)"
)
results, _ = db.cypher_query(query, {"status": status})
count = results[0][0] if results else 0
style = {
"completed": self.style.SUCCESS,
"failed": self.style.ERROR,
"processing": self.style.WARNING,
"pending": self.style.NOTICE,
}.get(status, str)
self.stdout.write(f" {status:12s}: {style(str(count))}")
except Exception as exc:
self.stdout.write(self.style.ERROR(f"Could not query items: {exc}"))
self.stdout.write("")
# Node counts
try:
node_types = [
("Library", "Library"),
("Collection", "Collection"),
("Item", "Item"),
("Chunk", "Chunk"),
("Concept", "Concept"),
("Image", "Image"),
("ImageEmbedding", "ImageEmbedding"),
]
self.stdout.write("Graph Node Counts:")
for label, display in node_types:
results, _ = db.cypher_query(f"MATCH (n:{label}) RETURN count(n)")
count = results[0][0] if results else 0
self.stdout.write(f" {display:20s}: {count}")
except Exception as exc:
self.stdout.write(self.style.ERROR(f"Could not query nodes: {exc}"))
# Chunks with embeddings
try:
results, _ = db.cypher_query(
"MATCH (c:Chunk) WHERE c.embedding IS NOT NULL RETURN count(c)"
)
embedded = results[0][0] if results else 0
results, _ = db.cypher_query("MATCH (c:Chunk) RETURN count(c)")
total = results[0][0] if results else 0
self.stdout.write(
f"\nChunks with embeddings: {embedded}/{total}"
)
except Exception:
pass
self.stdout.write("")
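The per-status counting branches on whether a library filter is present. That query construction can be factored into a pure function and tested without Neo4j (same Cypher text as the command):

```python
def item_count_query(status: str, library_uid: str = "") -> tuple:
    """Return (cypher, params) counting Items with a given embedding_status."""
    if library_uid:
        query = (
            "MATCH (l:Library {uid: $lib_uid})-[:CONTAINS]->(c:Collection)"
            "-[:CONTAINS]->(i:Item {embedding_status: $status}) "
            "RETURN count(i)"
        )
        return query, {"lib_uid": library_uid, "status": status}
    return "MATCH (i:Item {embedding_status: $status}) RETURN count(i)", {"status": status}
```

Parameterizing `$status` and `$lib_uid` (rather than interpolating them) keeps the Cypher plan cacheable and injection-safe.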

View File

@@ -0,0 +1,94 @@
"""
Management command to load default library type configurations.
Idempotent — safe to re-run. Creates Library nodes with default content-type
configurations if they don't already exist. Does NOT overwrite existing
libraries that have been customized.
"""
import logging
from django.core.management.base import BaseCommand
from library.content_types import LIBRARY_TYPE_DEFAULTS
logger = logging.getLogger(__name__)
class Command(BaseCommand):
help = (
"Load default library type configurations into Neo4j. "
"Creates one Library node per type with default chunking, embedding, "
"reranker, and LLM context settings. Safe to re-run."
)
def add_arguments(self, parser):
parser.add_argument(
"--force",
action="store_true",
help="Update existing libraries with default configurations (overwrites customizations)",
)
def handle(self, *args, **options):
force = options["force"]
try:
from library.models import Library
except Exception as e:
self.stderr.write(
self.style.ERROR(f"Cannot import library models: {e}")
)
return
created_count = 0
updated_count = 0
skipped_count = 0
for library_type, config in LIBRARY_TYPE_DEFAULTS.items():
display_name = library_type.capitalize()
default_name = f"Default {display_name} Library"
# Check if a library of this type already exists
existing = Library.nodes.filter(library_type=library_type)
if existing:
if force:
lib = existing[0]
lib.chunking_config = config["chunking_config"]
lib.embedding_instruction = config["embedding_instruction"]
lib.reranker_instruction = config["reranker_instruction"]
lib.llm_context_prompt = config["llm_context_prompt"]
lib.save()
updated_count += 1
self.stdout.write(
self.style.WARNING(f"Updated: {lib.name} ({library_type})")
)
else:
skipped_count += 1
self.stdout.write(
self.style.NOTICE(
f"Skipped: {existing[0].name} ({library_type}) — already exists"
)
)
else:
lib = Library(
name=default_name,
library_type=library_type,
description=f"Default {display_name.lower()} library",
chunking_config=config["chunking_config"],
embedding_instruction=config["embedding_instruction"],
reranker_instruction=config["reranker_instruction"],
llm_context_prompt=config["llm_context_prompt"],
)
lib.save()
created_count += 1
self.stdout.write(
self.style.SUCCESS(f"Created: {default_name} ({library_type})")
)
self.stdout.write(
self.style.SUCCESS(
f"\nDone. Created: {created_count}, "
f"Updated: {updated_count}, Skipped: {skipped_count}"
)
)
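The create/skip/force decision is the whole idempotency story of this command. Sketched against an in-memory dict instead of Neo4j nodes:

```python
def load_defaults(store: dict, defaults: dict, force: bool = False) -> tuple:
    """Create missing library configs; skip existing ones unless force=True."""
    created = updated = skipped = 0
    for library_type, config in defaults.items():
        if library_type in store:
            if force:
                store[library_type] = dict(config)  # overwrite customizations
                updated += 1
            else:
                skipped += 1  # preserve user edits
        else:
            store[library_type] = dict(config)
            created += 1
    return created, updated, skipped

store: dict = {}
defaults = {"fiction": {"chunk_size": 1024}, "technical": {"chunk_size": 512}}
first = load_defaults(store, defaults)               # creates both
second = load_defaults(store, defaults)              # skips both
third = load_defaults(store, defaults, force=True)   # overwrites both
```

Re-running without `--force` is a no-op, which is what makes the command safe to include in deployment scripts.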

View File

@@ -0,0 +1,189 @@
"""
Management command to create Neo4j indexes for Mnemosyne content graph.
Creates:
- Vector indexes (dynamic dimensions from system embedding model) for Chunk, Concept, and ImageEmbedding
- Full-text indexes for text search on Chunk.text_preview and Concept.name
- Constraint indexes enforced by neomodel (unique properties)
"""
import logging
from django.core.management.base import BaseCommand
from neomodel import db
logger = logging.getLogger(__name__)
# Default vector dimensions (used when no system embedding model is configured)
DEFAULT_VECTOR_DIMENSIONS = 4096
# Full-text index definitions: (index_name, label, properties)
FULLTEXT_INDEXES = [
("chunk_text_fulltext", "Chunk", ["text_preview"]),
("concept_name_fulltext", "Concept", ["name"]),
("item_title_fulltext", "Item", ["title"]),
("library_name_fulltext", "Library", ["name"]),
]
def _get_vector_dimensions():
"""
Get vector dimensions from the system embedding model.
Falls back to DEFAULT_VECTOR_DIMENSIONS if no model is configured
or the model has no vector_dimensions set.
:returns: Tuple of (dimensions, source_description).
"""
try:
from llm_manager.models import LLMModel
model = LLMModel.get_system_embedding_model()
if model and model.vector_dimensions:
return model.vector_dimensions, f"{model.api.name}: {model.name}"
except Exception:
pass
return DEFAULT_VECTOR_DIMENSIONS, "default (no system embedding model)"
class Command(BaseCommand):
help = (
"Create Neo4j vector, full-text, and constraint indexes "
"for the Mnemosyne content graph. Vector dimensions are read "
"from the system embedding model."
)
def add_arguments(self, parser):
parser.add_argument(
"--drop",
action="store_true",
help="Drop existing indexes before recreating them",
)
parser.add_argument(
"--dimensions",
type=int,
default=0,
help="Override vector dimensions (default: read from system embedding model)",
)
def handle(self, *args, **options):
drop = options["drop"]
override_dims = options["dimensions"]
# Resolve vector dimensions
if override_dims > 0:
dimensions = override_dims
source = f"CLI override ({override_dims})"
else:
dimensions, source = _get_vector_dimensions()
self.stdout.write(
self.style.HTTP_INFO(
f"Vector dimensions: {dimensions} (source: {source})"
)
)
# Vector index definitions (dynamic dimensions)
vector_indexes = [
("chunk_embedding_index", "Chunk", "embedding", dimensions, "cosine"),
("concept_embedding_index", "Concept", "embedding", dimensions, "cosine"),
("image_embedding_index", "ImageEmbedding", "embedding", dimensions, "cosine"),
]
# Get existing indexes
existing_indexes = self._get_existing_indexes()
if drop:
self._drop_indexes(existing_indexes, vector_indexes)
existing_indexes = self._get_existing_indexes()
# Create vector indexes
for name, label, prop, dims, similarity in vector_indexes:
if name in existing_indexes:
self.stdout.write(
self.style.NOTICE(f"Vector index '{name}' already exists, skipping")
)
continue
try:
cypher = (
f"CREATE VECTOR INDEX {name} IF NOT EXISTS "
f"FOR (n:{label}) ON (n.{prop}) "
f"OPTIONS {{indexConfig: {{"
f"`vector.dimensions`: {dims}, "
f"`vector.similarity_function`: '{similarity}'"
f"}}}}"
)
db.cypher_query(cypher)
self.stdout.write(
self.style.SUCCESS(
f"Created vector index: {name} ({label}.{prop}, {dims}d {similarity})"
)
)
except Exception as e:
self.stderr.write(
self.style.ERROR(f"Failed to create vector index '{name}': {e}")
)
# Create full-text indexes
for name, label, properties in FULLTEXT_INDEXES:
if name in existing_indexes:
self.stdout.write(
self.style.NOTICE(
f"Full-text index '{name}' already exists, skipping"
)
)
continue
try:
props_str = ", ".join(f"n.{p}" for p in properties)
cypher = (
f"CREATE FULLTEXT INDEX {name} IF NOT EXISTS "
f"FOR (n:{label}) ON EACH [{props_str}]"
)
db.cypher_query(cypher)
self.stdout.write(
self.style.SUCCESS(
f"Created full-text index: {name} ({label}: {', '.join(properties)})"
)
)
except Exception as e:
self.stderr.write(
self.style.ERROR(f"Failed to create full-text index '{name}': {e}")
)
# Install neomodel constraints (unique indexes from model definitions)
try:
from neomodel import install_all_labels
install_all_labels()
self.stdout.write(
self.style.SUCCESS("Installed neomodel constraint indexes")
)
except Exception as e:
self.stderr.write(
self.style.ERROR(f"Failed to install neomodel labels: {e}")
)
self.stdout.write(self.style.SUCCESS("\nNeo4j index setup complete."))
def _get_existing_indexes(self):
"""Return set of existing index names."""
try:
results, _ = db.cypher_query("SHOW INDEXES YIELD name RETURN name")
return {row[0] for row in results}
except Exception:
return set()
def _drop_indexes(self, existing_indexes, vector_indexes):
"""Drop all Mnemosyne-managed indexes."""
managed_names = {name for name, *_ in vector_indexes} | {
name for name, *_ in FULLTEXT_INDEXES
}
for name in managed_names & existing_indexes:
try:
db.cypher_query(f"DROP INDEX {name} IF EXISTS")
self.stdout.write(self.style.WARNING(f"Dropped index: {name}"))
except Exception as e:
self.stderr.write(
self.style.ERROR(f"Failed to drop index '{name}': {e}")
)
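The generated DDL can be unit-tested without a database by isolating the string construction (same shape as the command's f-string, including the escaped `{{`/`}}` and backticked option keys Neo4j requires):

```python
def vector_index_cypher(
    name: str, label: str, prop: str, dims: int, similarity: str = "cosine"
) -> str:
    """Build the CREATE VECTOR INDEX statement used by the command."""
    return (
        f"CREATE VECTOR INDEX {name} IF NOT EXISTS "
        f"FOR (n:{label}) ON (n.{prop}) "
        f"OPTIONS {{indexConfig: {{"
        f"`vector.dimensions`: {dims}, "
        f"`vector.similarity_function`: '{similarity}'"
        f"}}}}"
    )

stmt = vector_index_cypher("chunk_embedding_index", "Chunk", "embedding", 4096)
```

`IF NOT EXISTS` makes the statement itself idempotent, so the existence pre-check is only there to produce friendlier output.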

View File

@@ -0,0 +1,96 @@
"""
Prometheus metrics for the Mnemosyne embedding pipeline.
Exposes counters, histograms, and gauges for monitoring document parsing,
chunking, embedding, and pipeline orchestration.
"""
from prometheus_client import Counter, Gauge, Histogram
# --- Document Parsing ---
DOCUMENTS_PARSED_TOTAL = Counter(
"mnemosyne_documents_parsed_total",
"Total documents parsed",
["file_type", "status"],
)
DOCUMENT_PARSE_DURATION = Histogram(
"mnemosyne_document_parse_duration_seconds",
"Time to parse a document",
["file_type"],
buckets=[0.1, 0.5, 1, 2, 5, 10, 30, 60, 120],
)
IMAGES_EXTRACTED_TOTAL = Counter(
"mnemosyne_images_extracted_total",
"Total images extracted from documents",
["file_type"],
)
# --- Chunking ---
CHUNKS_CREATED_TOTAL = Counter(
"mnemosyne_chunks_created_total",
"Total chunks created",
["library_type", "strategy"],
)
CHUNK_SIZE_TOKENS = Histogram(
"mnemosyne_chunk_size_tokens",
"Distribution of chunk sizes in tokens",
buckets=[32, 64, 128, 256, 512, 768, 1024, 2048],
)
# --- Embedding ---
EMBEDDINGS_GENERATED_TOTAL = Counter(
"mnemosyne_embeddings_generated_total",
"Total embeddings generated",
["model_name", "api_type", "content_type"],
)
EMBEDDING_BATCH_DURATION = Histogram(
"mnemosyne_embedding_batch_duration_seconds",
"Time per embedding batch request",
["model_name", "api_type"],
buckets=[0.1, 0.5, 1, 2, 5, 10, 30, 60],
)
EMBEDDING_API_ERRORS_TOTAL = Counter(
"mnemosyne_embedding_api_errors_total",
"Embedding API errors",
["model_name", "api_type", "error_type"],
)
EMBEDDING_TOKENS_TOTAL = Counter(
"mnemosyne_embedding_tokens_total",
"Total tokens sent to embedding APIs",
["model_name"],
)
# --- Pipeline ---
PIPELINE_ITEMS_TOTAL = Counter(
"mnemosyne_pipeline_items_total",
"Total items processed by embedding pipeline",
["status"],
)
PIPELINE_DURATION = Histogram(
"mnemosyne_pipeline_item_duration_seconds",
"Total time to process one item through the full pipeline",
buckets=[1, 5, 10, 30, 60, 120, 300, 600],
)
PIPELINE_ITEMS_IN_PROGRESS = Gauge(
"mnemosyne_pipeline_items_in_progress",
"Items currently being processed",
)
# --- Concept Extraction ---
CONCEPTS_EXTRACTED_TOTAL = Counter(
"mnemosyne_concepts_extracted_total",
"Total concepts extracted",
["concept_type"],
)
# --- System State ---
EMBEDDING_QUEUE_SIZE = Gauge(
"mnemosyne_embedding_queue_size",
"Items waiting in the embedding queue",
)
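In pipeline code these histograms are typically fed via `.observe(seconds)` (or `prometheus_client`'s `.time()` context manager). A dependency-free sketch of the same pattern, with `parse_document` as a hypothetical instrumented function:

```python
import time
from typing import Callable

def timed(observe: Callable[[float], None]):
    """Decorator: report wall-clock duration of each call to `observe`."""
    def decorator(fn):
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                observe(time.perf_counter() - start)
        return wrapper
    return decorator

samples: list = []

# With prometheus_client this would be:
#   DOCUMENT_PARSE_DURATION.labels(file_type="pdf").observe
@timed(samples.append)
def parse_document(path: str) -> str:
    return f"parsed {path}"

out = parse_document("book.pdf")
```

Keeping label values bounded (file types, statuses, model names) matters here: each distinct label combination creates a new time series in Prometheus.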

View File

251
mnemosyne/library/models.py Normal file
View File

@@ -0,0 +1,251 @@
"""
Neo4j graph models for the Mnemosyne content library.
All content data (libraries, collections, items, chunks, concepts, images)
lives in Neo4j as a knowledge graph. These models use neomodel's StructuredNode
OGM — they do NOT participate in Django's ORM or migrations.
"""
from neomodel import (
ArrayProperty,
DateTimeProperty,
FloatProperty,
IntegerProperty,
JSONProperty,
RelationshipTo,
StringProperty,
StructuredNode,
StructuredRel,
UniqueIdProperty,
)
# --- Relationship models ---
class ReferencesRel(StructuredRel):
"""Relationship properties for Item -> Concept REFERENCES edges."""
weight = FloatProperty(default=1.0)
context = StringProperty(default="")
class RelatedToRel(StructuredRel):
"""Relationship properties for Item -> Item RELATED_TO edges."""
relationship_type = StringProperty(default="")
weight = FloatProperty(default=1.0)
class NearbyImageRel(StructuredRel):
"""Relationship properties for Chunk -> Image HAS_NEARBY_IMAGE edges."""
proximity = StringProperty(default="same_page") # same_page, inline, same_slide, same_chapter
# --- Node models ---
class Library(StructuredNode):
"""
Top-level container representing a content library.
Each library has a type (fiction, technical, music, film, art, journal)
that drives chunking strategy, embedding instructions, and LLM prompts.
"""
uid = UniqueIdProperty()
name = StringProperty(unique_index=True, required=True)
library_type = StringProperty(
required=True,
choices={
"fiction": "Fiction",
"technical": "Technical",
"music": "Music",
"film": "Film",
"art": "Art",
"journal": "Journal",
},
)
description = StringProperty(default="")
# Content-type configuration
chunking_config = JSONProperty(default=dict)  # callable default: avoids sharing one dict across nodes
embedding_instruction = StringProperty(default="")
reranker_instruction = StringProperty(default="")
llm_context_prompt = StringProperty(default="")
created_at = DateTimeProperty(default_now=True)
# Relationships
collections = RelationshipTo("Collection", "CONTAINS")
def __str__(self):
return f"{self.name} ({self.library_type})"
class Collection(StructuredNode):
"""
A grouping of items within a library.
Examples: a book series, an album discography, a project folder.
"""
uid = UniqueIdProperty()
name = StringProperty(required=True)
description = StringProperty(default="")
metadata = JSONProperty(default=dict)
created_at = DateTimeProperty(default_now=True)
# Relationships
items = RelationshipTo("Item", "CONTAINS")
library = RelationshipTo("Library", "BELONGS_TO")
def __str__(self):
return self.name
class Item(StructuredNode):
"""
An individual piece of content: a document, song, image set, journal entry, etc.
Items store their original file in S3 (via s3_key) and are chunked
for embedding and retrieval.
"""
uid = UniqueIdProperty()
title = StringProperty(required=True)
item_type = StringProperty(default="")
s3_key = StringProperty(default="")
content_hash = StringProperty(index=True)
file_type = StringProperty(default="")
file_size = IntegerProperty(default=0)
metadata = JSONProperty(default=dict)
created_at = DateTimeProperty(default_now=True)
updated_at = DateTimeProperty(default_now=True)
# Embedding pipeline fields (Phase 2)
embedding_status = StringProperty(
default="pending",
choices={
"pending": "Pending",
"processing": "Processing",
"completed": "Completed",
"failed": "Failed",
},
)
embedding_model_name = StringProperty(default="")
chunk_count = IntegerProperty(default=0)
image_count = IntegerProperty(default=0)
error_message = StringProperty(default="")
# Relationships
chunks = RelationshipTo("Chunk", "HAS_CHUNK")
images = RelationshipTo("Image", "HAS_IMAGE")
concepts = RelationshipTo("Concept", "REFERENCES", model=ReferencesRel)
related_items = RelationshipTo("Item", "RELATED_TO", model=RelatedToRel)
def __str__(self):
return self.title
class Chunk(StructuredNode):
"""
A text chunk extracted from an Item for embedding and retrieval.
Chunk text is stored in S3; text_preview holds the first 500 chars
for Neo4j full-text indexing.
"""
uid = UniqueIdProperty()
chunk_index = IntegerProperty(required=True)
chunk_s3_key = StringProperty(required=True)
chunk_size = IntegerProperty(default=0)
text_preview = StringProperty(default="") # First 500 chars for full-text index
embedding = ArrayProperty(FloatProperty())  # Vector; dimensions follow the system embedding model (default 4096)
created_at = DateTimeProperty(default_now=True)
# Relationships
mentions = RelationshipTo("Concept", "MENTIONS")
nearby_images = RelationshipTo("Image", "HAS_NEARBY_IMAGE", model=NearbyImageRel)
def __str__(self):
return f"Chunk {self.chunk_index} ({self.uid})"
class Concept(StructuredNode):
"""
A named entity or topic extracted from content.
Concepts form the backbone of the knowledge graph, linking items
and chunks through shared references.
"""
uid = UniqueIdProperty()
name = StringProperty(unique_index=True, required=True)
concept_type = StringProperty(
default="",
choices={
"person": "Person",
"place": "Place",
"topic": "Topic",
"technique": "Technique",
"theme": "Theme",
},
)
embedding = ArrayProperty(FloatProperty())  # Vector; dimensions follow the system embedding model (default 4096)
# Relationships
related_concepts = RelationshipTo("Concept", "RELATED_TO")
def __str__(self):
return self.name
class Image(StructuredNode):
"""
An image associated with an Item (cover art, diagram, photo, etc.).
The image file is stored in S3; embeddings enable multimodal search.
"""
uid = UniqueIdProperty()
s3_key = StringProperty(required=True)
image_type = StringProperty(
default="",
choices={
"cover": "Cover",
"diagram": "Diagram",
"artwork": "Artwork",
"still": "Still",
"photo": "Photo",
},
)
description = StringProperty(default="")
metadata = JSONProperty(default=dict)
created_at = DateTimeProperty(default_now=True)
# Relationships
embeddings = RelationshipTo("ImageEmbedding", "HAS_EMBEDDING")
def __str__(self):
return f"Image {self.image_type} ({self.uid})"
class ImageEmbedding(StructuredNode):
"""
A multimodal embedding vector for an Image node.
Generated by Qwen3-VL for unified text+image vector space.
"""
uid = UniqueIdProperty()
embedding = ArrayProperty(FloatProperty())  # Multimodal vector; dimensions follow the system embedding model (default 4096)
created_at = DateTimeProperty(default_now=True)
def __str__(self):
return f"ImageEmbedding ({self.uid})"
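`Item.content_hash` is indexed for duplicate detection, but the model does not fix the algorithm. Assuming SHA-256 over the raw file bytes (an assumption, not stated in the model), the hash would be computed like this:

```python
import hashlib

def content_hash(data: bytes) -> str:
    """Hex digest for duplicate detection (SHA-256 is an assumed choice)."""
    return hashlib.sha256(data).hexdigest()

h = content_hash(b"the same bytes")
```

Identical uploads then map to the same `content_hash`, so a pre-insert lookup on the indexed property can short-circuit re-parsing and re-embedding.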

View File

@@ -0,0 +1,11 @@
"""
Library services for the Mnemosyne embedding pipeline.
Services:
- parsers: Universal document parsing via PyMuPDF
- text_utils: Text sanitization for embedding APIs
- chunker: Content-type-aware chunking
- embedding_client: Multi-backend embedding API client
- pipeline: Orchestration of parse → chunk → embed → graph
- concepts: LLM-based concept extraction
"""

View File

@@ -0,0 +1,250 @@
"""
Content-type-aware chunking service.
Uses semantic-text-splitter with HuggingFace tokenizers to produce
chunks that respect document structure boundaries per library type.
"""
import logging
from typing import Optional
from library.metrics import CHUNKS_CREATED_TOTAL, CHUNK_SIZE_TOKENS
from .parsers import ParseResult, TextBlock
logger = logging.getLogger(__name__)
# Default tokenizer when no model-specific tokenizer is available
DEFAULT_TOKENIZER = "bert-base-uncased"
# Boundary markers used to detect structural elements in text
_BOUNDARY_PATTERNS = {
"chapter": [
r"(?m)^chapter\s+\d+",
r"(?m)^CHAPTER\s+\d+",
r"(?m)^Chapter\s+\w+",
],
"scene": [r"(?m)^\*\s*\*\s*\*", r"(?m)^---+$", r"(?m)^###"],
"section": [
r"(?m)^#{1,3}\s+",
r"(?m)^\d+\.\d*\s+\w",
r"(?m)^Section\s+\d+",
],
"subsection": [r"(?m)^#{4,6}\s+", r"(?m)^\d+\.\d+\.\d+"],
"entry": [
r"(?m)^\d{4}-\d{2}-\d{2}",
r"(?m)^(January|February|March|April|May|June|July|August|September|October|November|December)\s+\d",
],
"song": [r"(?m)^Track\s+\d+", r"(?m)^\[.*?\]$"],
"verse": [r"(?m)^\[Verse", r"(?m)^\[Chorus"],
}
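These patterns are plain `re` alternations with the multiline flag inlined, so a quick way to see where a strategy's boundaries land is to collect match offsets (using the chapter patterns verbatim):

```python
import re

# The "chapter" patterns from _BOUNDARY_PATTERNS, copied verbatim.
CHAPTER_PATTERNS = [
    r"(?m)^chapter\s+\d+",
    r"(?m)^CHAPTER\s+\d+",
    r"(?m)^Chapter\s+\w+",
]

def boundary_offsets(text: str, patterns: list) -> list:
    """Character offsets where any boundary pattern matches at a line start."""
    return sorted({m.start() for p in patterns for m in re.finditer(p, text)})

text = "Prologue\nChapter One\nwords...\nCHAPTER 2\nmore words\n"
offsets = boundary_offsets(text, CHAPTER_PATTERNS)
```

Note the patterns are case-sensitive by design: "Chapter One" and "CHAPTER 2" each match exactly one alternation, while lowercase prose containing the word "chapter" mid-sentence matches none.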
class ChunkResult:
"""Result of chunking a document."""
def __init__(
self,
chunks: list[str],
chunk_page_map: dict[int, int],
strategy: str,
):
"""
:param chunks: List of chunk text strings.
:param chunk_page_map: Mapping of chunk_index -> source page number.
:param strategy: The chunking strategy used.
"""
self.chunks = chunks
self.chunk_page_map = chunk_page_map
self.strategy = strategy
def __len__(self) -> int:
return len(self.chunks)
class ContentTypeChunker:
"""
Content-type-aware document chunker.
Dispatches to different chunking strategies based on the library's
chunking configuration, using semantic-text-splitter for
token-aware splitting.
"""
def __init__(self, tokenizer_name: Optional[str] = None):
"""
:param tokenizer_name: HuggingFace tokenizer name for token counting.
"""
self._tokenizer_name = tokenizer_name or DEFAULT_TOKENIZER
self._splitter_cache: dict[tuple[int, int], object] = {}
def chunk(
self,
parse_result: ParseResult,
chunking_config: dict,
library_type: str = "",
) -> ChunkResult:
"""
Chunk parsed document text using the library's chunking config.
:param parse_result: ParseResult from the document parser.
:param chunking_config: Library chunking configuration dict.
:param library_type: Library type for metrics labeling.
:returns: ChunkResult with chunk texts and page mapping.
"""
strategy = chunking_config.get("strategy", "section_aware")
chunk_size = chunking_config.get("chunk_size", 512)
chunk_overlap = chunking_config.get("chunk_overlap", 64)
# Combine all text blocks into a single document text,
# tracking page boundaries for chunk-page mapping
full_text, page_offsets = self._combine_text_blocks(parse_result.text_blocks)
if not full_text.strip():
logger.warning("No text to chunk strategy=%s", strategy)
return ChunkResult(chunks=[], chunk_page_map={}, strategy=strategy)
logger.info(
"Chunking text strategy=%s chunk_size=%d overlap=%d total_chars=%d",
strategy,
chunk_size,
chunk_overlap,
len(full_text),
)
# Get or create the text splitter for this size/overlap
splitter = self._get_splitter(chunk_size, chunk_overlap)
# Split into chunks
try:
chunks = splitter.chunks(full_text)
except Exception as exc:
logger.error("Chunking failed strategy=%s: %s", strategy, exc)
raise
# Build chunk -> page mapping
chunk_page_map = self._map_chunks_to_pages(chunks, full_text, page_offsets)
# Record metrics
CHUNKS_CREATED_TOTAL.labels(
library_type=library_type,
strategy=strategy,
).inc(len(chunks))
for chunk_text in chunks:
CHUNK_SIZE_TOKENS.observe(len(chunk_text.split()))
logger.info(
"Chunked document strategy=%s chunks=%d avg_size=%d",
strategy,
len(chunks),
sum(len(c) for c in chunks) // max(len(chunks), 1),
)
return ChunkResult(
chunks=chunks,
chunk_page_map=chunk_page_map,
strategy=strategy,
)
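The capacity/overlap semantics delegated to semantic-text-splitter above can be illustrated without that dependency. This simplified sketch splits on whitespace tokens instead of a real tokenizer; the helper name is invented:

```python
def split_with_overlap(text: str, capacity: int, overlap: int) -> list[str]:
    # Naive stand-in for TextSplitter: fixed-size windows of whitespace
    # tokens, each sharing `overlap` tokens with the previous window.
    tokens = text.split()
    if not tokens:
        return []
    step = max(capacity - overlap, 1)
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + capacity]))
        if start + capacity >= len(tokens):
            break
    return chunks
```

Each chunk shares its trailing `overlap` tokens with the head of the next chunk, which is what preserves context across chunk boundaries during retrieval.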
def _get_splitter(self, chunk_size: int, chunk_overlap: int):
"""
Get or create a semantic text splitter for the given parameters.
:param chunk_size: Maximum chunk size in tokens.
:param chunk_overlap: Overlap between chunks in tokens.
:returns: TextSplitter instance.
"""
cache_key = (chunk_size, chunk_overlap)
if cache_key in self._splitter_cache:
return self._splitter_cache[cache_key]
from semantic_text_splitter import TextSplitter
from tokenizers import Tokenizer
try:
tokenizer = Tokenizer.from_pretrained(self._tokenizer_name)
splitter = TextSplitter.from_huggingface_tokenizer(
tokenizer,
capacity=chunk_size,
overlap=chunk_overlap,
)
logger.debug(
"Created text splitter tokenizer=%s capacity=%d overlap=%d",
self._tokenizer_name,
chunk_size,
chunk_overlap,
)
except Exception as exc:
logger.warning(
"Failed to load tokenizer %s: %s, falling back to %s",
self._tokenizer_name,
exc,
DEFAULT_TOKENIZER,
)
tokenizer = Tokenizer.from_pretrained(DEFAULT_TOKENIZER)
splitter = TextSplitter.from_huggingface_tokenizer(
tokenizer,
capacity=chunk_size,
overlap=chunk_overlap,
)
self._splitter_cache[cache_key] = splitter
return splitter
def _combine_text_blocks(
self, text_blocks: list[TextBlock]
) -> tuple[str, list[tuple[int, int]]]:
"""
Combine text blocks into a single string, tracking page offsets.
:param text_blocks: List of TextBlock from parser.
:returns: Tuple of (combined_text, page_offsets) where page_offsets
is a list of (char_offset, page_number).
"""
parts: list[str] = []
page_offsets: list[tuple[int, int]] = []
current_offset = 0
for block in text_blocks:
page_offsets.append((current_offset, block.page))
parts.append(block.text)
current_offset += len(block.text) + 2 # +2 for paragraph separator
return "\n\n".join(parts), page_offsets
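The offset bookkeeping in `_combine_text_blocks` can be checked in isolation; this sketch uses plain `(text, page)` tuples in place of `TextBlock`:

```python
def combine_blocks(blocks: list[tuple[str, int]]) -> tuple[str, list[tuple[int, int]]]:
    # Mirrors _combine_text_blocks: join blocks with "\n\n" while
    # recording the character offset at which each page begins.
    parts: list[str] = []
    page_offsets: list[tuple[int, int]] = []
    offset = 0
    for text, page in blocks:
        page_offsets.append((offset, page))
        parts.append(text)
        offset += len(text) + 2  # +2 for the "\n\n" separator
    return "\n\n".join(parts), page_offsets
```

The recorded offsets stay in sync with the joined string because the `+2` matches the length of the `"\n\n"` joiner exactly.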
def _map_chunks_to_pages(
self,
chunks: list[str],
full_text: str,
page_offsets: list[tuple[int, int]],
) -> dict[int, int]:
"""
Map each chunk index to its source page number.
:param chunks: List of chunk strings.
:param full_text: Combined document text.
:param page_offsets: List of (char_offset, page_number).
:returns: Dict mapping chunk_index -> page_number.
"""
chunk_page_map: dict[int, int] = {}
search_start = 0
for chunk_idx, chunk_text in enumerate(chunks):
# Find where this chunk starts in the full text
pos = full_text.find(chunk_text[:100], search_start)
if pos == -1:
pos = search_start
# Find which page this position belongs to
page = 0
for offset, page_num in page_offsets:
if offset <= pos:
page = page_num
else:
break
chunk_page_map[chunk_idx] = page
search_start = max(search_start, pos)
return chunk_page_map
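Because `page_offsets` is sorted ascending by offset, the linear scan in `_map_chunks_to_pages` is equivalent to a binary search for the last entry at or before a position. A sketch of that lookup step:

```python
import bisect

def page_for_position(page_offsets: list[tuple[int, int]], pos: int) -> int:
    # page_offsets is ascending by char offset; return the page of the
    # last entry whose offset is <= pos.
    starts = [offset for offset, _ in page_offsets]
    idx = bisect.bisect_right(starts, pos) - 1
    return page_offsets[max(idx, 0)][1]
```

For documents with many pages this turns each chunk lookup from O(pages) into O(log pages).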


@@ -0,0 +1,267 @@
"""
LLM-based concept extraction for the knowledge graph.
Uses the system chat model to extract named entities (people, places,
topics, techniques, themes) from document chunks, then creates Concept
nodes and MENTIONS/REFERENCES relationships in Neo4j.
"""
import json
import logging
from library.metrics import CONCEPTS_EXTRACTED_TOTAL
logger = logging.getLogger(__name__)
# Prompt for concept extraction
CONCEPT_EXTRACTION_PROMPT = """Extract named entities and key concepts from the following text.
Return a JSON array of objects, each with:
- "name": the entity/concept name (lowercase, canonical form)
- "type": one of "person", "place", "topic", "technique", "theme"
Only extract significant, specific concepts — not generic words.
Return at most 20 concepts. Return ONLY the JSON array, no other text.
Text:
{text}"""
class ConceptExtractor:
"""
Extracts concepts from text using the system chat model.
Creates or updates Concept nodes in Neo4j and connects them
to Chunk and Item nodes via MENTIONS and REFERENCES relationships.
"""
def __init__(self, chat_model, user=None):
"""
:param chat_model: LLMModel instance for chat/completion.
:param user: Optional Django user for usage tracking.
"""
self.chat_model = chat_model
self.user = user
def extract_for_item(
self,
item,
chunk_nodes: list,
chunk_texts: list[str],
) -> int:
"""
Extract concepts from all chunks of an item.
:param item: Item node.
:param chunk_nodes: List of Chunk nodes.
:param chunk_texts: List of chunk text strings.
:returns: Total number of unique concepts extracted.
"""
all_concepts: dict[str, str] = {} # name -> type
# Sample chunks for extraction (don't process every chunk for large docs)
sample_indices = self._select_sample_indices(len(chunk_texts), max_samples=10)
for idx in sample_indices:
chunk_text = chunk_texts[idx]
chunk_node = chunk_nodes[idx]
concepts = self._extract_from_text(chunk_text)
if not concepts:
continue
for concept_data in concepts:
name = concept_data.get("name", "").strip().lower()
concept_type = concept_data.get("type", "topic")
if not name or len(name) < 2:
continue
all_concepts[name] = concept_type
# Connect chunk -> concept via MENTIONS
concept_node = self._get_or_create_concept(name, concept_type)
if concept_node:
try:
chunk_node.mentions.connect(concept_node)
except Exception:
pass # Already connected
# Connect item -> all concepts via REFERENCES
for name, concept_type in all_concepts.items():
concept_node = self._get_or_create_concept(name, concept_type)
if concept_node:
try:
item.concepts.connect(concept_node, {"weight": 1.0})
except Exception:
pass # Already connected
CONCEPTS_EXTRACTED_TOTAL.labels(concept_type=concept_type).inc()
logger.info(
"Extracted %d concepts for item_uid=%s",
len(all_concepts),
item.uid,
)
return len(all_concepts)
def _extract_from_text(self, text: str) -> list[dict]:
"""
Call the chat model to extract concepts from text.
:param text: Text to analyze.
:returns: List of concept dicts with 'name' and 'type' keys.
"""
# Truncate very long text to avoid token limits
if len(text) > 3000:
text = text[:3000]
prompt = CONCEPT_EXTRACTION_PROMPT.format(text=text)
try:
response_text = self._call_chat_model(prompt)
concepts = self._parse_concept_response(response_text)
logger.debug(
"Extracted %d concepts from text chunk (len=%d)",
len(concepts),
len(text),
)
return concepts
except Exception as exc:
logger.warning("Concept extraction failed: %s", exc)
return []
def _call_chat_model(self, prompt: str) -> str:
"""
Make a chat completion request to the system chat model.
:param prompt: User prompt text.
:returns: Response text from the model.
"""
import requests
api = self.chat_model.api
base_url = api.base_url.rstrip("/")
if api.api_type == "bedrock":
# Bedrock Converse endpoint
url = f"{base_url}/model/{self.chat_model.name}/converse"
headers = {
"Authorization": f"Bearer {api.api_key}",
"Content-Type": "application/json",
}
body = {
"messages": [{"role": "user", "content": [{"text": prompt}]}],
}
else:
# OpenAI-compatible
url = f"{base_url}/chat/completions"
headers = {"Content-Type": "application/json"}
if api.api_key:
headers["Authorization"] = f"Bearer {api.api_key}"
body = {
"model": self.chat_model.name,
"messages": [{"role": "user", "content": prompt}],
"temperature": 0.1,
"max_tokens": 1000,
}
resp = requests.post(
url, json=body, headers=headers, timeout=api.timeout_seconds or 60
)
resp.raise_for_status()
data = resp.json()
# Parse response based on format
if "output" in data:
# Bedrock Converse format
return data["output"]["message"]["content"][0]["text"]
if "choices" in data:
# OpenAI format
return data["choices"][0]["message"]["content"]
raise ValueError(f"Unexpected chat response format: {list(data.keys())}")
def _parse_concept_response(self, response_text: str) -> list[dict]:
"""
Parse the LLM's concept extraction response into structured data.
:param response_text: Raw response text (expected JSON array).
:returns: List of concept dicts.
"""
# Try to extract JSON from the response
text = response_text.strip()
# Handle markdown code blocks
if text.startswith("```"):
lines = text.split("\n")
text = "\n".join(lines[1:-1]) if len(lines) > 2 else text
try:
concepts = json.loads(text)
if isinstance(concepts, list):
return [
c for c in concepts
if isinstance(c, dict) and "name" in c
]
except json.JSONDecodeError:
pass
# Try to find JSON array in the response
import re
match = re.search(r"\[.*\]", text, re.DOTALL)
if match:
try:
concepts = json.loads(match.group())
if isinstance(concepts, list):
return [
c for c in concepts
if isinstance(c, dict) and "name" in c
]
except json.JSONDecodeError:
pass
logger.debug("Could not parse concept response: %s", text[:200])
return []
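The fallback chain in `_parse_concept_response` (strict parse, markdown fence stripping, then a bracketed-span regex) can be exercised standalone:

```python
import json
import re

def parse_json_array(response_text: str) -> list:
    # Mirrors _parse_concept_response without the logging.
    text = response_text.strip()
    if text.startswith("```"):
        lines = text.split("\n")
        if len(lines) > 2:
            text = "\n".join(lines[1:-1])
    try:
        data = json.loads(text)
        if isinstance(data, list):
            return data
    except json.JSONDecodeError:
        pass
    # Fall back to the first [...] span anywhere in the response.
    match = re.search(r"\[.*\]", text, re.DOTALL)
    if match:
        try:
            data = json.loads(match.group())
            if isinstance(data, list):
                return data
        except json.JSONDecodeError:
            pass
    return []
```

Returning an empty list on every failure path keeps a single malformed LLM response from aborting the whole extraction pass.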
def _get_or_create_concept(self, name: str, concept_type: str):
"""
Get or create a Concept node by name.
:param name: Concept name (lowercase).
:param concept_type: Concept type (person, place, topic, etc.).
:returns: Concept node, or None on failure.
"""
from library.models import Concept
try:
# Try to get existing
existing = Concept.nodes.filter(name=name)
if existing:
return existing[0]
# Create new
concept = Concept(name=name, concept_type=concept_type)
concept.save()
return concept
except Exception as exc:
logger.debug("Failed to get/create concept '%s': %s", name, exc)
return None
def _select_sample_indices(
self, total: int, max_samples: int = 10
) -> list[int]:
"""
Select evenly-spaced sample indices for concept extraction.
:param total: Total number of chunks.
:param max_samples: Maximum samples to take.
:returns: List of chunk indices to process.
"""
if total <= max_samples:
return list(range(total))
step = total / max_samples
return [int(i * step) for i in range(max_samples)]
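The even-spacing behaviour of `_select_sample_indices` is easy to verify in isolation:

```python
def select_sample_indices(total: int, max_samples: int = 10) -> list[int]:
    # Same arithmetic as the method above: take every chunk when the
    # document is small, otherwise stride evenly through it.
    if total <= max_samples:
        return list(range(total))
    step = total / max_samples
    return [int(i * step) for i in range(max_samples)]
```

Sampling evenly rather than taking the first N chunks means concepts from the end of a long document still have a chance of being extracted.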


@@ -0,0 +1,396 @@
"""
Multi-backend embedding client.
Dispatches embedding requests to OpenAI-compatible APIs (OpenAI, vLLM,
llama-cpp, Ollama) or Amazon Bedrock via direct HTTP with Bearer token auth.
"""
import base64
import json
import logging
import time
from typing import Optional
import requests
from library.metrics import (
EMBEDDING_API_ERRORS_TOTAL,
EMBEDDING_BATCH_DURATION,
EMBEDDING_TOKENS_TOTAL,
EMBEDDINGS_GENERATED_TOTAL,
)
logger = logging.getLogger(__name__)
class EmbeddingClient:
"""
Client for generating text and image embeddings via multiple backends.
Dispatches based on ``LLMApi.api_type``:
* ``openai``, ``vllm``, ``llama-cpp``, ``ollama`` — OpenAI-compatible
``POST /embeddings``
* ``bedrock`` — Amazon Bedrock Runtime ``POST /model/{id}/invoke``
with Bearer token auth
"""
def __init__(self, embedding_model, user=None):
"""
:param embedding_model: ``LLMModel`` instance for embeddings.
:param user: Optional Django user for usage tracking.
"""
self.model = embedding_model
self.api = embedding_model.api
self.user = user
self.base_url = self.api.base_url.rstrip("/")
self.model_name = self.model.name
self.api_type = self.api.api_type
self.timeout = self.api.timeout_seconds or 120
logger.info(
"EmbeddingClient initialized api=%s model=%s api_type=%s base_url=%s",
self.api.name,
self.model_name,
self.api_type,
self.base_url,
)
# ------------------------------------------------------------------
# Public API
# ------------------------------------------------------------------
def embed_text(self, text: str) -> list[float]:
"""
Generate an embedding vector for a single text string.
:param text: Text to embed.
:returns: Embedding vector as list of floats.
"""
if self.api_type == "bedrock":
return self._embed_bedrock_single(text)
return self._embed_openai_single(text)
def embed_texts(self, texts: list[str]) -> list[list[float]]:
"""
Generate embeddings for multiple texts.
:param texts: List of text strings.
:returns: List of embedding vectors.
"""
if self.api_type == "bedrock":
return self._embed_bedrock_batch(texts)
return self._embed_openai_batch(texts)
def embed_image(self, image_data: bytes, image_ext: str = "png") -> Optional[list[float]]:
"""
Generate a multimodal embedding for an image.
Requires a model with ``supports_multimodal=True``.
:param image_data: Raw image bytes.
:param image_ext: Image format extension.
:returns: Embedding vector, or None if not supported.
"""
if not self.model.supports_multimodal:
logger.debug(
"Model %s does not support multimodal, skipping image embedding",
self.model_name,
)
return None
b64 = base64.b64encode(image_data).decode("utf-8")
mime_type = f"image/{image_ext}" if image_ext != "jpg" else "image/jpeg"
if self.api_type == "bedrock":
return self._embed_bedrock_image(b64, mime_type)
return self._embed_openai_image(b64, mime_type)
# ------------------------------------------------------------------
# OpenAI-compatible backend
# ------------------------------------------------------------------
def _embed_openai_single(self, text: str) -> list[float]:
"""Embed a single text via OpenAI-compatible /embeddings endpoint."""
result = self._embed_openai_batch([text])
return result[0]
def _embed_openai_batch(self, texts: list[str]) -> list[list[float]]:
"""Embed a batch of texts via OpenAI-compatible /embeddings endpoint."""
url = f"{self.base_url}/embeddings"
payload = {"input": texts, "model": self.model_name}
headers = self._openai_headers()
logger.debug(
"OpenAI embedding request texts=%d model=%s",
len(texts),
self.model_name,
)
with EMBEDDING_BATCH_DURATION.labels(
model_name=self.model_name, api_type=self.api_type
).time():
try:
resp = requests.post(
url,
json=payload,
headers=headers,
timeout=self.timeout * max(1, len(texts) // 10),
)
if resp.status_code != 200:
logger.error(
"OpenAI embedding failed status=%d body=%s",
resp.status_code,
resp.text[:500],
)
resp.raise_for_status()
data = resp.json()
except requests.RequestException as exc:
EMBEDDING_API_ERRORS_TOTAL.labels(
model_name=self.model_name,
api_type=self.api_type,
error_type=type(exc).__name__,
).inc()
logger.error("OpenAI embedding request failed: %s", exc)
raise
embeddings = self._parse_openai_response(data)
# Metrics
EMBEDDINGS_GENERATED_TOTAL.labels(
model_name=self.model_name,
api_type=self.api_type,
content_type="text",
).inc(len(embeddings))
EMBEDDING_TOKENS_TOTAL.labels(model_name=self.model_name).inc(
sum(len(t.split()) for t in texts)
)
self._log_usage(len(texts), sum(len(t.split()) for t in texts))
logger.debug(
"OpenAI embedding response texts=%d dimensions=%d",
len(embeddings),
len(embeddings[0]) if embeddings else 0,
)
return embeddings
def _embed_openai_image(self, b64_image: str, mime_type: str) -> Optional[list[float]]:
"""Embed an image via OpenAI-compatible multimodal endpoint."""
url = f"{self.base_url}/embeddings"
payload = {
"input": [{"type": "image_url", "image_url": {"url": f"data:{mime_type};base64,{b64_image}"}}],
"model": self.model_name,
}
headers = self._openai_headers()
try:
resp = requests.post(url, json=payload, headers=headers, timeout=self.timeout)
resp.raise_for_status()
data = resp.json()
embeddings = self._parse_openai_response(data)
if embeddings:
EMBEDDINGS_GENERATED_TOTAL.labels(
model_name=self.model_name,
api_type=self.api_type,
content_type="image",
).inc()
return embeddings[0]
except Exception as exc:
EMBEDDING_API_ERRORS_TOTAL.labels(
model_name=self.model_name,
api_type=self.api_type,
error_type=type(exc).__name__,
).inc()
logger.warning("Image embedding failed: %s", exc)
return None
def _openai_headers(self) -> dict[str, str]:
"""Build headers for OpenAI-compatible requests."""
headers = {"Content-Type": "application/json"}
if self.api.api_key:
headers["Authorization"] = f"Bearer {self.api.api_key}"
return headers
def _parse_openai_response(self, data) -> list[list[float]]:
"""
Parse embedding response from various OpenAI-compatible formats.
Handles:
- OpenAI standard: ``{"data": [{"embedding": [...], "index": 0}]}``
- Direct list of dicts: ``[{"embedding": [...]}]``
- Direct list of vectors: ``[[0.1, 0.2, ...]]``
- Dict with embeddings key: ``{"embeddings": [[...]]}``
:param data: Parsed JSON response.
:returns: List of embedding vectors.
"""
if isinstance(data, list):
if data and isinstance(data[0], dict) and "embedding" in data[0]:
return [item["embedding"] for item in data]
return data
if isinstance(data, dict):
if "data" in data:
return [
item["embedding"]
for item in sorted(data["data"], key=lambda x: x.get("index", 0))
]
if "embedding" in data:
return [data["embedding"]]
if "embeddings" in data:
return data["embeddings"]
raise ValueError(f"Unexpected embedding response format: {type(data)}")
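A standalone copy of `_parse_openai_response` shows the four accepted shapes normalizing to the same output; the sample vectors below are invented:

```python
def parse_embeddings(data) -> list[list[float]]:
    # Mirrors _parse_openai_response: normalize the response shapes of
    # OpenAI, vLLM, llama-cpp, and Ollama into a list of vectors.
    if isinstance(data, list):
        if data and isinstance(data[0], dict) and "embedding" in data[0]:
            return [item["embedding"] for item in data]
        return data
    if isinstance(data, dict):
        if "data" in data:
            return [
                item["embedding"]
                for item in sorted(data["data"], key=lambda x: x.get("index", 0))
            ]
        if "embedding" in data:
            return [data["embedding"]]
        if "embeddings" in data:
            return data["embeddings"]
    raise ValueError(f"Unexpected embedding response format: {type(data)}")
```

Sorting by `index` matters for the standard OpenAI shape: the API does not guarantee that `data` entries arrive in request order.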
# ------------------------------------------------------------------
# Amazon Bedrock backend
# ------------------------------------------------------------------
def _embed_bedrock_single(self, text: str) -> list[float]:
"""Embed a single text via Bedrock Runtime InvokeModel."""
url = f"{self.base_url}/model/{self.model_name}/invoke"
headers = {
"Authorization": f"Bearer {self.api.api_key}",
"Content-Type": "application/json",
"Accept": "application/json",
}
body = {"inputText": text, "normalize": True}
# Include dimensions if the model supports configurable output
if self.model.vector_dimensions:
body["dimensions"] = self.model.vector_dimensions
logger.debug(
"Bedrock embedding request model=%s text_len=%d",
self.model_name,
len(text),
)
with EMBEDDING_BATCH_DURATION.labels(
model_name=self.model_name, api_type=self.api_type
).time():
try:
resp = requests.post(
url, json=body, headers=headers, timeout=self.timeout
)
if resp.status_code != 200:
logger.error(
"Bedrock embedding failed status=%d body=%s",
resp.status_code,
resp.text[:500],
)
resp.raise_for_status()
data = resp.json()
except requests.RequestException as exc:
EMBEDDING_API_ERRORS_TOTAL.labels(
model_name=self.model_name,
api_type=self.api_type,
error_type=type(exc).__name__,
).inc()
logger.error("Bedrock embedding request failed: %s", exc)
raise
embedding = data.get("embedding")
if not embedding:
raise ValueError(f"Bedrock response missing 'embedding' key: {list(data.keys())}")
token_count = data.get("inputTextTokenCount", len(text.split()))
EMBEDDINGS_GENERATED_TOTAL.labels(
model_name=self.model_name,
api_type=self.api_type,
content_type="text",
).inc()
EMBEDDING_TOKENS_TOTAL.labels(model_name=self.model_name).inc(token_count)
self._log_usage(1, token_count)
logger.debug(
"Bedrock embedding response dimensions=%d tokens=%d",
len(embedding),
token_count,
)
return embedding
def _embed_bedrock_batch(self, texts: list[str]) -> list[list[float]]:
"""
Embed multiple texts via Bedrock (client-side loop).
Bedrock InvokeModel accepts one input at a time, so we loop.
:param texts: List of text strings.
:returns: List of embedding vectors.
"""
embeddings = []
for i, text in enumerate(texts):
embedding = self._embed_bedrock_single(text)
embeddings.append(embedding)
if (i + 1) % 10 == 0:
logger.debug(
"Bedrock batch progress %d/%d", i + 1, len(texts)
)
return embeddings
def _embed_bedrock_image(self, b64_image: str, mime_type: str) -> Optional[list[float]]:
"""Embed an image via Bedrock multimodal endpoint."""
url = f"{self.base_url}/model/{self.model_name}/invoke"
headers = {
"Authorization": f"Bearer {self.api.api_key}",
"Content-Type": "application/json",
"Accept": "application/json",
}
body = {
"inputImage": b64_image,
"normalize": True,
}
if self.model.vector_dimensions:
body["dimensions"] = self.model.vector_dimensions
try:
resp = requests.post(url, json=body, headers=headers, timeout=self.timeout)
resp.raise_for_status()
data = resp.json()
embedding = data.get("embedding")
if embedding:
EMBEDDINGS_GENERATED_TOTAL.labels(
model_name=self.model_name,
api_type=self.api_type,
content_type="image",
).inc()
return embedding
except Exception as exc:
EMBEDDING_API_ERRORS_TOTAL.labels(
model_name=self.model_name,
api_type=self.api_type,
error_type=type(exc).__name__,
).inc()
logger.warning("Bedrock image embedding failed: %s", exc)
return None
# ------------------------------------------------------------------
# Usage tracking
# ------------------------------------------------------------------
def _log_usage(self, text_count: int, token_count: int):
"""
Log embedding usage to LLMUsage model.
:param text_count: Number of texts embedded.
:param token_count: Approximate token count.
"""
try:
from llm_manager.models import LLMUsage
LLMUsage.objects.create(
model=self.model,
user=self.user,
input_tokens=token_count,
output_tokens=0,
cached_tokens=0,
total_cost=(token_count / 1000) * float(self.model.input_cost_per_1k),
purpose="embeddings",
)
except Exception as exc:
logger.warning("Failed to log embedding usage: %s", exc)


@@ -0,0 +1,360 @@
"""
Universal document parsing service using PyMuPDF.
Handles text extraction and image extraction for all supported formats:
PDF, EPUB, DOCX, PPTX, XLSX, XPS, MOBI, FB2, CBZ, TXT, HTML, and images.
"""
import logging
import os
import tempfile
from dataclasses import dataclass, field
import fitz # PyMuPDF
from library.metrics import (
DOCUMENT_PARSE_DURATION,
DOCUMENTS_PARSED_TOTAL,
IMAGES_EXTRACTED_TOTAL,
)
from .text_utils import remove_excessive_whitespace, sanitize_text
logger = logging.getLogger(__name__)
# File extensions supported by PyMuPDF
PYMUPDF_EXTENSIONS = {
"pdf", "epub", "xps", "mobi", "fb2", "cbz", "svg",
"docx", "pptx", "xlsx", "hwpx",
}
# Plain text extensions — read directly, no PyMuPDF needed
PLAINTEXT_EXTENSIONS = {"txt", "md", "csv", "tsv", "log", "json", "yaml", "yml", "xml"}
# Image extensions — store as Image nodes directly. SVG is deliberately
# excluded here: Pillow cannot rasterize SVG, so it is routed to PyMuPDF
# via PYMUPDF_EXTENSIONS above.
IMAGE_EXTENSIONS = {"jpg", "jpeg", "png", "gif", "bmp", "tiff", "tif", "webp"}
# Minimum image dimensions to extract (skip tiny icons/bullets)
MIN_IMAGE_WIDTH = 50
MIN_IMAGE_HEIGHT = 50
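The dispatch in `parse()` below checks plaintext first, then images, then PyMuPDF, then HTML. A toy version of that routing (with shortened, stand-in extension sets) makes the precedence explicit:

```python
# Shortened stand-ins for the extension sets above.
PLAINTEXT = {"txt", "md"}
IMAGES = {"png", "jpg"}
PYMUPDF = {"pdf", "epub"}

def route(file_type: str) -> str:
    # Same normalization and precedence as DocumentParser.parse().
    ext = file_type.lower().lstrip(".")
    if ext in PLAINTEXT:
        return "plaintext"
    if ext in IMAGES:
        return "image"
    if ext in PYMUPDF:
        return "pymupdf"
    if ext in ("html", "htm"):
        return "pymupdf"
    raise ValueError(f"Unsupported file type '{ext}'")
```

Checking the plaintext set before PyMuPDF is what keeps `.txt` and `.md` on the cheap direct-read path even though PyMuPDF could also open them.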
@dataclass
class TextBlock:
"""A block of extracted text with page/section context."""
text: str
page: int
metadata: dict = field(default_factory=dict)
@dataclass
class ExtractedImage:
"""An image extracted from a document."""
data: bytes
ext: str
width: int
height: int
source_page: int
source_index: int
@dataclass
class ParseResult:
"""Result of parsing a document: text blocks + images + metadata."""
text_blocks: list[TextBlock] = field(default_factory=list)
images: list[ExtractedImage] = field(default_factory=list)
metadata: dict = field(default_factory=dict)
file_type: str = ""
class DocumentParser:
"""
Universal document parser using PyMuPDF.
Extracts text and images from all supported document formats through
a single unified interface.
"""
def parse(self, file_path: str, file_type: str) -> ParseResult:
"""
Parse a document and extract text blocks and images.
:param file_path: Path to the document file.
:param file_type: File extension (without dot), e.g. 'pdf', 'epub'.
:returns: ParseResult with text blocks, images, and metadata.
:raises ValueError: If the file type is not supported.
"""
file_type = file_type.lower().lstrip(".")
logger.info(
"Parsing document file_type=%s path=%s",
file_type,
os.path.basename(file_path),
)
if file_type in PLAINTEXT_EXTENSIONS:
return self._parse_plaintext(file_path, file_type)
if file_type in IMAGE_EXTENSIONS:
return self._parse_image_file(file_path, file_type)
if file_type in PYMUPDF_EXTENSIONS:
return self._parse_with_pymupdf(file_path, file_type)
# HTML can be handled by PyMuPDF or direct read
if file_type in ("html", "htm"):
return self._parse_with_pymupdf(file_path, file_type)
raise ValueError(
f"Unsupported file type '{file_type}'. "
f"Supported: {sorted(PYMUPDF_EXTENSIONS | PLAINTEXT_EXTENSIONS | IMAGE_EXTENSIONS)}"
)
def parse_bytes(self, data: bytes, file_type: str, filename: str = "") -> ParseResult:
"""
Parse document from bytes (e.g. from S3 download).
:param data: Raw file bytes.
:param file_type: File extension (without dot).
:param filename: Optional original filename for logging.
:returns: ParseResult.
"""
suffix = f".{file_type.lower().lstrip('.')}"
with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as tmp:
tmp.write(data)
tmp_path = tmp.name
try:
logger.debug(
"Parsing from bytes file_type=%s size=%d filename=%s",
file_type,
len(data),
filename,
)
return self.parse(tmp_path, file_type)
finally:
os.unlink(tmp_path)
def _parse_with_pymupdf(self, file_path: str, file_type: str) -> ParseResult:
"""
Parse a document using PyMuPDF for text and image extraction.
:param file_path: Path to the document.
:param file_type: Normalized file extension.
:returns: ParseResult.
"""
with DOCUMENT_PARSE_DURATION.labels(file_type=file_type).time():
try:
doc = fitz.open(file_path)
except Exception as exc:
DOCUMENTS_PARSED_TOTAL.labels(file_type=file_type, status="error").inc()
logger.error("Failed to open document file_type=%s: %s", file_type, exc)
raise
text_blocks: list[TextBlock] = []
images: list[ExtractedImage] = []
image_global_index = 0
for page_num in range(len(doc)):
page = doc[page_num]
# --- Text extraction ---
try:
text = page.get_text("text")
if text and text.strip():
cleaned = sanitize_text(text, log_changes=False)
cleaned = remove_excessive_whitespace(cleaned)
if cleaned.strip():
text_blocks.append(
TextBlock(text=cleaned, page=page_num)
)
logger.debug(
"Extracted text page=%d chars=%d",
page_num,
len(cleaned),
)
except Exception as exc:
logger.warning(
"Text extraction failed page=%d: %s, continuing",
page_num,
exc,
)
# --- Image extraction ---
try:
for img_info in page.get_images(full=True):
xref = img_info[0]
try:
img_data = doc.extract_image(xref)
if not img_data or not img_data.get("image"):
continue
width = img_data.get("width", 0)
height = img_data.get("height", 0)
# Skip tiny images (icons, bullets, etc.)
if width < MIN_IMAGE_WIDTH or height < MIN_IMAGE_HEIGHT:
logger.debug(
"Skipping small image page=%d xref=%d size=%dx%d",
page_num,
xref,
width,
height,
)
continue
images.append(
ExtractedImage(
data=img_data["image"],
ext=img_data.get("ext", "png"),
width=width,
height=height,
source_page=page_num,
source_index=image_global_index,
)
)
image_global_index += 1
logger.debug(
"Extracted image page=%d format=%s size=%dx%d bytes=%d",
page_num,
img_data.get("ext", "?"),
width,
height,
len(img_data["image"]),
)
except Exception as exc:
logger.warning(
"Image extraction failed page=%d xref=%d: %s",
page_num,
xref,
exc,
)
except Exception as exc:
logger.warning(
"Image listing failed page=%d: %s, continuing",
page_num,
exc,
)
# Collect document metadata
meta = doc.metadata or {}
result_meta = {
"page_count": len(doc),
"title": meta.get("title", ""),
"author": meta.get("author", ""),
"subject": meta.get("subject", ""),
"creator": meta.get("creator", ""),
}
doc.close()
DOCUMENTS_PARSED_TOTAL.labels(file_type=file_type, status="success").inc()
IMAGES_EXTRACTED_TOTAL.labels(file_type=file_type).inc(len(images))
logger.info(
"Parsed document file_type=%s pages=%d text_blocks=%d images=%d",
file_type,
result_meta["page_count"],
len(text_blocks),
len(images),
)
return ParseResult(
text_blocks=text_blocks,
images=images,
metadata=result_meta,
file_type=file_type,
)
def _parse_plaintext(self, file_path: str, file_type: str) -> ParseResult:
"""
Parse a plain text file by direct read.
:param file_path: Path to the text file.
:param file_type: Normalized file extension.
:returns: ParseResult.
"""
with DOCUMENT_PARSE_DURATION.labels(file_type=file_type).time():
try:
with open(file_path, "r", encoding="utf-8", errors="replace") as f:
content = f.read()
except Exception as exc:
DOCUMENTS_PARSED_TOTAL.labels(file_type=file_type, status="error").inc()
logger.error("Failed to read text file file_type=%s: %s", file_type, exc)
raise
cleaned = sanitize_text(content, log_changes=True)
cleaned = remove_excessive_whitespace(cleaned)
text_blocks = []
if cleaned.strip():
text_blocks.append(TextBlock(text=cleaned, page=0))
DOCUMENTS_PARSED_TOTAL.labels(file_type=file_type, status="success").inc()
logger.info(
"Parsed plaintext file_type=%s chars=%d",
file_type,
len(cleaned),
)
return ParseResult(
text_blocks=text_blocks,
images=[],
metadata={"page_count": 1},
file_type=file_type,
)
def _parse_image_file(self, file_path: str, file_type: str) -> ParseResult:
"""
Handle a standalone image file — store as a single ExtractedImage.
:param file_path: Path to the image file.
:param file_type: Normalized file extension.
:returns: ParseResult with one image and no text.
"""
with DOCUMENT_PARSE_DURATION.labels(file_type=file_type).time():
try:
from PIL import Image as PILImage
with open(file_path, "rb") as f:
data = f.read()
img = PILImage.open(file_path)
width, height = img.size
img.close()
except Exception as exc:
DOCUMENTS_PARSED_TOTAL.labels(file_type=file_type, status="error").inc()
logger.error("Failed to read image file_type=%s: %s", file_type, exc)
raise
DOCUMENTS_PARSED_TOTAL.labels(file_type=file_type, status="success").inc()
IMAGES_EXTRACTED_TOTAL.labels(file_type=file_type).inc(1)
logger.info(
"Parsed image file file_type=%s size=%dx%d bytes=%d",
file_type,
width,
height,
len(data),
)
return ParseResult(
text_blocks=[],
images=[
ExtractedImage(
data=data,
ext=file_type,
width=width,
height=height,
source_page=0,
source_index=0,
)
],
metadata={"page_count": 0, "width": width, "height": height},
file_type=file_type,
)


@@ -0,0 +1,581 @@
"""
Embedding pipeline orchestrator.
Coordinates the full ingestion flow:
parse → chunk → embed → store → graph construction.
"""
import hashlib
import logging
import time
from typing import Optional
from django.conf import settings
from django.core.cache import cache
from django.core.files.base import ContentFile
from django.core.files.storage import default_storage
from library.metrics import (
PIPELINE_DURATION,
PIPELINE_ITEMS_IN_PROGRESS,
PIPELINE_ITEMS_TOTAL,
)
logger = logging.getLogger(__name__)
# S3 key patterns
ORIGINAL_S3_KEY = "items/{item_uid}/original.{ext}"
CHUNK_S3_KEY = "chunks/{item_uid}/chunk_{index}.txt"
IMAGE_S3_KEY = "images/{item_uid}/{index}.{ext}"
# Batch sizes
EMBEDDING_BATCH_SIZE = getattr(settings, "EMBEDDING_BATCH_SIZE", 8)
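The key templates above resolve with plain `str.format`; for example (the item UID is invented):

```python
# S3 key templates as defined above.
ORIGINAL_S3_KEY = "items/{item_uid}/original.{ext}"
CHUNK_S3_KEY = "chunks/{item_uid}/chunk_{index}.txt"
IMAGE_S3_KEY = "images/{item_uid}/{index}.{ext}"

# Resolve keys for a hypothetical item.
original = ORIGINAL_S3_KEY.format(item_uid="b1f2c3", ext="pdf")
chunk = CHUNK_S3_KEY.format(item_uid="b1f2c3", index=0)
image = IMAGE_S3_KEY.format(item_uid="b1f2c3", index=3, ext="png")
```

Prefixing every key with the item UID keeps all artifacts of one item under a single S3 "directory", which makes per-item cleanup a single prefix delete.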
class EmbeddingPipeline:
"""
Orchestrates the complete embedding pipeline for a single Item.
Stages:
1. Parse document (text + images)
2. Chunk text (content-type-aware)
3. Store chunks in S3 + Neo4j
4. Embed text chunks
5. Store images in S3 + Neo4j
6. Embed images (multimodal, if available)
7. Extract concepts (if system chat model available)
"""
def __init__(self, user=None):
"""
:param user: Optional Django user for usage tracking.
"""
self.user = user
def process_item(
self,
item_uid: str,
progress_callback: Optional[callable] = None,
) -> dict:
"""
Run the full embedding pipeline for an Item.
:param item_uid: UID of the Item node to process.
:param progress_callback: Optional callback(percent, message).
:returns: Dict with processing results.
:raises ValueError: If item not found or no embedding model configured.
"""
from library.models import Item
start_time = time.time()
PIPELINE_ITEMS_IN_PROGRESS.inc()
try:
    item = Item.nodes.get(uid=item_uid)
except Item.DoesNotExist:
    PIPELINE_ITEMS_TOTAL.labels(status="failed").inc()
    PIPELINE_ITEMS_IN_PROGRESS.dec()
    raise ValueError(f"Item not found: {item_uid}") from None
logger.info(
"Pipeline starting item_uid=%s title='%s' file_type=%s",
item_uid,
item.title,
item.file_type,
)
# Mark as processing
item.embedding_status = "processing"
item.error_message = ""
item.save()
try:
result = self._run_pipeline(item, progress_callback)
# Mark as completed
item.embedding_status = "completed"
item.chunk_count = result.get("chunks_created", 0)
item.image_count = result.get("images_stored", 0)
item.embedding_model_name = result.get("model_name", "")
item.error_message = ""
item.save()
elapsed = time.time() - start_time
PIPELINE_ITEMS_TOTAL.labels(status="completed").inc()
PIPELINE_DURATION.observe(elapsed)
logger.info(
"Pipeline completed item_uid=%s chunks=%d images=%d concepts=%d elapsed=%.2fs",
item_uid,
result.get("chunks_created", 0),
result.get("images_stored", 0),
result.get("concepts_extracted", 0),
elapsed,
)
if progress_callback:
progress_callback(100, "Completed")
return result
except Exception as exc:
item.embedding_status = "failed"
item.error_message = str(exc)[:500]
item.save()
PIPELINE_ITEMS_TOTAL.labels(status="failed").inc()
logger.error(
"Pipeline failed item_uid=%s: %s",
item_uid,
exc,
exc_info=True,
)
raise
finally:
PIPELINE_ITEMS_IN_PROGRESS.dec()
def _run_pipeline(self, item, progress_callback) -> dict:
"""
Execute pipeline stages sequentially.
:param item: Item node instance.
:param progress_callback: Optional progress callback.
:returns: Results dict.
"""
from llm_manager.models import LLMModel
from .chunker import ContentTypeChunker
from .concepts import ConceptExtractor
from .embedding_client import EmbeddingClient
from .parsers import DocumentParser
result = {
"chunks_created": 0,
"chunks_embedded": 0,
"images_stored": 0,
"images_embedded": 0,
"concepts_extracted": 0,
"model_name": "",
}
# --- Resolve library context ---
library = self._get_item_library(item)
chunking_config = library.chunking_config if library else {}
embedding_instruction = library.embedding_instruction if library else ""
library_type = library.library_type if library else ""
# --- Get system embedding model ---
embedding_model = LLMModel.get_system_embedding_model()
if not embedding_model:
raise ValueError(
"No system embedding model configured. "
"Set one via Django admin > LLM Models > Set as System Embedding Model."
)
result["model_name"] = embedding_model.name
embed_client = EmbeddingClient(embedding_model, user=self.user)
# --- Check dimension compatibility ---
if embedding_model.vector_dimensions:
self._check_dimension_compatibility(embedding_model.vector_dimensions)
if progress_callback:
progress_callback(5, "Parsing document")
# --- Stage 1: Parse ---
parser = DocumentParser()
file_data = self._read_item_from_s3(item)
if not file_data:
logger.warning("No file data for item_uid=%s, skipping", item.uid)
return result
parse_result = parser.parse_bytes(
file_data,
item.file_type,
filename=item.title,
)
if progress_callback:
progress_callback(20, "Chunking text")
# --- Stage 2: Chunk ---
chunker = ContentTypeChunker()
chunk_result = chunker.chunk(parse_result, chunking_config, library_type)
if progress_callback:
progress_callback(30, "Storing chunks")
# --- Stage 3: Store chunks in S3 + Neo4j ---
chunk_nodes = self._store_chunks(item, chunk_result)
result["chunks_created"] = len(chunk_nodes)
if progress_callback:
progress_callback(40, "Embedding text chunks")
# --- Stage 4: Embed text chunks ---
if chunk_result.chunks:
self._embed_chunks(
item,
chunk_nodes,
chunk_result.chunks,
embed_client,
embedding_instruction,
progress_callback,
)
result["chunks_embedded"] = len(chunk_nodes)
if progress_callback:
progress_callback(70, "Storing images")
# --- Stage 5: Store images ---
image_nodes = self._store_images(item, parse_result.images)
result["images_stored"] = len(image_nodes)
# Associate images with nearby chunks
self._associate_images_with_chunks(
chunk_nodes, image_nodes, chunk_result, parse_result
)
if progress_callback:
progress_callback(80, "Embedding images")
# --- Stage 6: Embed images (multimodal) ---
if image_nodes and embedding_model.supports_multimodal:
embedded_count = self._embed_images(image_nodes, embed_client)
result["images_embedded"] = embedded_count
if progress_callback:
progress_callback(90, "Extracting concepts")
# --- Stage 7: Concept extraction ---
chat_model = LLMModel.get_system_chat_model()
if chat_model and chunk_result.chunks:
extractor = ConceptExtractor(chat_model, user=self.user)
concepts_count = extractor.extract_for_item(
item, chunk_nodes, chunk_result.chunks
)
result["concepts_extracted"] = concepts_count
# Update content hash to prevent redundant re-processing
if file_data:
item.content_hash = hashlib.sha256(file_data).hexdigest()
item.save()
return result
def _get_item_library(self, item):
"""
Walk the graph to find the Library containing this Item.
:param item: Item node.
:returns: Library node, or None.
"""
from library.models import Collection
try:
# Item <- Collection <- Library
from neomodel import db
results, _ = db.cypher_query(
"MATCH (l:Library)-[:CONTAINS]->(c:Collection)-[:CONTAINS]->(i:Item {uid: $uid}) "
"RETURN l.uid, l.library_type, l.chunking_config, l.embedding_instruction",
{"uid": item.uid},
)
if results:
from library.models import Library
return Library.nodes.get(uid=results[0][0])
except Exception as exc:
logger.warning("Could not resolve library for item_uid=%s: %s", item.uid, exc)
return None
def _read_item_from_s3(self, item) -> Optional[bytes]:
"""
Read the original file from S3 storage.
:param item: Item node with s3_key.
:returns: File bytes, or None.
"""
if not item.s3_key:
logger.warning("Item has no s3_key item_uid=%s", item.uid)
return None
try:
with default_storage.open(item.s3_key, "rb") as f:
data = f.read()
logger.debug(
"Read item from S3 key=%s size=%d", item.s3_key, len(data)
)
return data
except Exception as exc:
logger.error(
"Failed to read from S3 key=%s: %s", item.s3_key, exc
)
raise
def _store_chunks(self, item, chunk_result) -> list:
"""
Store chunk text in S3 and create Chunk nodes in Neo4j.
:param item: Item node.
:param chunk_result: ChunkResult from chunker.
:returns: List of Chunk node instances.
"""
from library.models import Chunk
# Delete existing chunks for this item
for old_chunk in item.chunks.all():
# Clean up S3
try:
default_storage.delete(old_chunk.chunk_s3_key)
except Exception:
pass
old_chunk.delete()
chunk_nodes = []
for idx, chunk_text in enumerate(chunk_result.chunks):
s3_key = CHUNK_S3_KEY.format(item_uid=item.uid, index=idx)
# Store chunk text in S3
try:
default_storage.save(s3_key, ContentFile(chunk_text.encode("utf-8")))
except Exception as exc:
logger.error("Failed to store chunk %d to S3: %s", idx, exc)
raise
# Create Chunk node
chunk_node = Chunk(
chunk_index=idx,
chunk_s3_key=s3_key,
chunk_size=len(chunk_text),
text_preview=chunk_text[:500],
)
chunk_node.save()
item.chunks.connect(chunk_node)
chunk_nodes.append(chunk_node)
logger.info(
"Stored %d chunks for item_uid=%s", len(chunk_nodes), item.uid
)
return chunk_nodes
def _embed_chunks(
self,
item,
chunk_nodes: list,
chunk_texts: list[str],
embed_client,
embedding_instruction: str,
progress_callback: Optional[callable],
):
"""
Generate embeddings for chunks and update Chunk nodes.
:param item: Item node.
:param chunk_nodes: List of Chunk nodes.
:param chunk_texts: List of chunk text strings.
:param embed_client: EmbeddingClient instance.
:param embedding_instruction: Instruction prefix for embedding.
:param progress_callback: Optional progress callback.
"""
# Prepend embedding instruction if configured
if embedding_instruction:
texts_to_embed = [
f"{embedding_instruction}\n\n{text}" for text in chunk_texts
]
else:
texts_to_embed = chunk_texts
batch_size = EMBEDDING_BATCH_SIZE
total_batches = (len(texts_to_embed) + batch_size - 1) // batch_size
for batch_idx in range(0, len(texts_to_embed), batch_size):
batch_texts = texts_to_embed[batch_idx : batch_idx + batch_size]
batch_nodes = chunk_nodes[batch_idx : batch_idx + batch_size]
batch_num = batch_idx // batch_size + 1
logger.debug(
"Embedding batch %d/%d size=%d item_uid=%s",
batch_num,
total_batches,
len(batch_texts),
item.uid,
)
embeddings = embed_client.embed_texts(batch_texts)
for node, embedding in zip(batch_nodes, embeddings):
node.embedding = embedding
node.save()
if progress_callback:
pct = 40 + (30 * (batch_idx + len(batch_texts)) / len(texts_to_embed))
progress_callback(
int(pct),
f"Embedded {batch_idx + len(batch_texts)}/{len(texts_to_embed)} chunks",
)
logger.info(
"Embedded %d chunks for item_uid=%s", len(chunk_nodes), item.uid
)
def _store_images(self, item, extracted_images) -> list:
"""
Store extracted images in S3 and create Image nodes in Neo4j.
:param item: Item node.
:param extracted_images: List of ExtractedImage from parser.
:returns: List of Image node instances.
"""
from library.models import Image
# Delete existing images for this item
for old_image in item.images.all():
try:
default_storage.delete(old_image.s3_key)
except Exception:
pass
old_image.delete()
image_nodes = []
for img in extracted_images:
s3_key = IMAGE_S3_KEY.format(
item_uid=item.uid,
index=img.source_index,
ext=img.ext,
)
try:
default_storage.save(s3_key, ContentFile(img.data))
except Exception as exc:
logger.warning(
"Failed to store image %d to S3: %s", img.source_index, exc
)
continue
image_node = Image(
s3_key=s3_key,
image_type="diagram", # Default; could be refined by content analysis
metadata={
"width": img.width,
"height": img.height,
"source_page": img.source_page,
"content_type": f"image/{img.ext}",
},
)
image_node.save()
item.images.connect(image_node)
image_nodes.append(image_node)
if image_nodes:
logger.info(
"Stored %d images for item_uid=%s", len(image_nodes), item.uid
)
return image_nodes
def _associate_images_with_chunks(
self, chunk_nodes, image_nodes, chunk_result, parse_result
):
"""
Create HAS_NEARBY_IMAGE relationships between chunks and images.
Associates images with chunks from the same page/section.
:param chunk_nodes: List of Chunk nodes.
:param image_nodes: List of Image nodes.
:param chunk_result: ChunkResult with page mapping.
:param parse_result: ParseResult with image source pages.
"""
if not chunk_nodes or not image_nodes:
return
# Build page -> images mapping from node metadata; image_nodes can be
# shorter than parse_result.images when an S3 store failed, so zipping
# the two lists would misalign pages.
page_images: dict[int, list] = {}
for img_node in image_nodes:
    page = (img_node.metadata or {}).get("source_page", -1)
    page_images.setdefault(page, []).append(img_node)
# Connect chunks to images on the same page
connected = 0
for chunk_idx, chunk_node in enumerate(chunk_nodes):
page = chunk_result.chunk_page_map.get(chunk_idx, -1)
nearby = page_images.get(page, [])
for img_node in nearby:
chunk_node.nearby_images.connect(
img_node, {"proximity": "same_page"}
)
connected += 1
if connected:
logger.debug(
"Created %d chunk-image associations", connected
)
def _embed_images(self, image_nodes: list, embed_client) -> int:
"""
Generate multimodal embeddings for Image nodes.
:param image_nodes: List of Image nodes.
:param embed_client: EmbeddingClient with multimodal support.
:returns: Number of images successfully embedded.
"""
from library.models import ImageEmbedding
embedded_count = 0
for img_node in image_nodes:
try:
with default_storage.open(img_node.s3_key, "rb") as f:
    img_data = f.read()
ext = img_node.s3_key.rsplit(".", 1)[-1] if "." in img_node.s3_key else "png"
embedding = embed_client.embed_image(img_data, ext)
if embedding:
emb_node = ImageEmbedding(embedding=embedding)
emb_node.save()
img_node.embeddings.connect(emb_node)
embedded_count += 1
except Exception as exc:
logger.warning(
"Image embedding failed s3_key=%s: %s",
img_node.s3_key,
exc,
)
if embedded_count:
logger.info("Embedded %d images", embedded_count)
return embedded_count
def _check_dimension_compatibility(self, model_dimensions: int):
"""
Check if the model's vector dimensions match the Neo4j index.
:param model_dimensions: Expected embedding dimensions.
"""
# Log a warning — actual enforcement is in setup_neo4j_indexes
logger.debug(
"System embedding model dimensions=%d", model_dimensions
)
def reprocess_item(self, item_uid: str, progress_callback=None) -> dict:
"""
Re-embed an item: delete existing chunks/images, then re-process.
:param item_uid: UID of the Item to re-embed.
:param progress_callback: Optional progress callback.
:returns: Processing results dict.
"""
from library.models import Item
try:
    item = Item.nodes.get(uid=item_uid)
except Item.DoesNotExist:
    raise ValueError(f"Item not found: {item_uid}") from None
# Clear content hash to force re-processing
item.content_hash = ""
item.save()
logger.info("Re-processing item_uid=%s title='%s'", item_uid, item.title)
return self.process_item(item_uid, progress_callback)
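The batch handling in `_embed_chunks` above can be sketched in isolation. This is a minimal illustration of the arithmetic only; the chunk texts, batch size, and "16 chunks done" checkpoint are made-up stand-ins, not values from the pipeline:

```python
# Standalone sketch of the batch arithmetic in _embed_chunks: ceiling
# division for the batch count, fixed-size slices for the batches, and
# the 40-70% progress band used while embedding.
texts = [f"chunk {i}" for i in range(19)]  # pretend 19 chunk texts
batch_size = 8                             # default EMBEDDING_BATCH_SIZE

total_batches = (len(texts) + batch_size - 1) // batch_size
batches = [texts[i : i + batch_size] for i in range(0, len(texts), batch_size)]

pct_after_two_batches = int(40 + (30 * 16 / len(texts)))  # 16 of 19 chunks done

print(total_batches)              # 3
print([len(b) for b in batches])  # [8, 8, 3]
print(pct_after_two_batches)      # 65
```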


@@ -0,0 +1,165 @@
"""
Text sanitization utilities for the embedding pipeline.
Ported from Spelunker's text_utils.py — ensures text can be safely
processed by embedding APIs and LLMs.
"""
import logging
import re
import unicodedata
logger = logging.getLogger(__name__)
# Common PDF ligatures
_LIGATURE_MAP = {
"\ufb01": "fi",
"\ufb02": "fl",
"\ufb00": "ff",
"\ufb03": "ffi",
"\ufb04": "ffl",
"\ufb05": "ft",
"\ufb06": "st",
}
# Common special characters from PDF extraction
_SPECIAL_CHAR_MAP = {
"\u2018": "'", # left single quotation
"\u2019": "'", # right single quotation
"\u201c": '"', # left double quotation
"\u201d": '"', # right double quotation
"\u2013": "-", # en dash
"\u2014": "-", # em dash
"\u2026": "...", # horizontal ellipsis
"\u00a0": " ", # non-breaking space
}
# Zero-width characters
_ZERO_WIDTH_CHARS = [
"\u200b", # zero-width space
"\u200c", # zero-width non-joiner
"\u200d", # zero-width joiner
"\ufeff", # zero-width no-break space (BOM)
]
# Control characters pattern (exclude newline, tab, carriage return)
_CONTROL_CHAR_RE = re.compile(r"[\x00-\x08\x0b-\x0c\x0e-\x1f\x7f-\x9f]")
def sanitize_text(text: str, log_changes: bool = True) -> str:
"""
Sanitize text for embedding APIs by removing problematic characters.
Addresses common issues that cause "invalid tokens" errors:
null bytes, control characters, zero-width characters, invalid UTF-8,
and non-normalized Unicode.
:param text: Text to sanitize.
:param log_changes: Whether to log sanitization actions.
:returns: Sanitized text safe for tokenization.
"""
if not text:
return text
original_length = len(text)
changes: list[str] = []
# 1. Remove null bytes
if "\x00" in text:
text = text.replace("\x00", "")
changes.append("removed null bytes")
# 2. Remove control characters
if _CONTROL_CHAR_RE.search(text):
text = _CONTROL_CHAR_RE.sub("", text)
changes.append("removed control characters")
# 3. Remove zero-width characters (check every entry; the old early
# `break` left later zero-width character types in the text)
if any(char in text for char in _ZERO_WIDTH_CHARS):
    for char in _ZERO_WIDTH_CHARS:
        text = text.replace(char, "")
    changes.append("removed zero-width characters")
# 4. Normalize Unicode to NFC form
normalized = unicodedata.normalize("NFC", text)
if normalized != text:
text = normalized
changes.append("normalized Unicode to NFC")
# 5. Replace unencodable sequences (a str cannot hold invalid UTF-8,
# but lone surrogates fail to encode and are replaced here; comparing
# before/after avoids flagging pre-existing U+FFFD characters)
try:
    reencoded = text.encode("utf-8", errors="replace").decode("utf-8")
    if reencoded != text:
        changes.append("replaced invalid UTF-8 sequences")
    text = reencoded
except Exception as exc:
    logger.warning("Error during UTF-8 validation: %s", exc)
# 6. Clean PDF artifacts (ligatures, special chars)
cleaned = clean_pdf_artifacts(text)
if cleaned != text:
text = cleaned
changes.append("cleaned PDF artifacts")
if log_changes and changes:
chars_removed = original_length - len(text)
logger.info(
"Text sanitization: %s original_length=%d final_length=%d chars_removed=%d",
", ".join(changes),
original_length,
len(text),
chars_removed,
)
return text
def clean_pdf_artifacts(text: str) -> str:
"""
Clean common PDF extraction artifacts.
Replaces ligatures and special characters with standard equivalents.
:param text: Text to clean.
:returns: Cleaned text.
"""
for ligature, replacement in _LIGATURE_MAP.items():
text = text.replace(ligature, replacement)
for special, replacement in _SPECIAL_CHAR_MAP.items():
text = text.replace(special, replacement)
return text
def remove_excessive_whitespace(text: str) -> str:
"""
Remove excessive whitespace while preserving paragraph structure.
:param text: Text to clean.
:returns: Text with normalized whitespace.
"""
text = re.sub(r" +", " ", text)
text = re.sub(r"\n\n+", "\n\n", text)
text = "\n".join(line.strip() for line in text.split("\n"))
return text.strip()
def truncate_text(text: str, max_chars: int, suffix: str = "...") -> str:
"""
Truncate text to a maximum length preserving word boundaries.
:param text: Text to truncate.
:param max_chars: Maximum number of characters.
:param suffix: Suffix to add if truncated.
:returns: Truncated text.
"""
if len(text) <= max_chars:
return text
truncate_at = text.rfind(" ", 0, max_chars - len(suffix))
if truncate_at == -1:
truncate_at = max_chars - len(suffix)
return text[:truncate_at] + suffix
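A worked example of the sanitization steps above. This is a self-contained mini-replica for illustration, using small hand-picked subsets of the module's mapping tables rather than importing the module itself:

```python
import unicodedata

# Hedged mini-replica of sanitize_text's core steps (illustrative only).
raw = "e\ufb03cient\u200b search\x00 \u201cquoted\u201d \u2014 text"

text = raw.replace("\x00", "")                       # 1. null bytes
for zw in ("\u200b", "\u200c", "\u200d", "\ufeff"):  # 3. zero-width chars
    text = text.replace(zw, "")
text = unicodedata.normalize("NFC", text)            # 4. NFC (ligatures survive NFC)
for lig, rep in {"\ufb03": "ffi"}.items():           # 6. PDF ligatures
    text = text.replace(lig, rep)
for sp, rep in {"\u201c": '"', "\u201d": '"', "\u2014": "-"}.items():
    text = text.replace(sp, rep)

print(text)  # efficient search "quoted" - text
```

Note that NFC normalization alone does not expand the ﬃ ligature (that would take NFKC), which is why the explicit ligature map exists.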

mnemosyne/library/tasks.py

@@ -0,0 +1,282 @@
"""
Celery tasks for the embedding pipeline.
All tasks pass UIDs (not model instances) per Red Panda Standards.
Tasks are idempotent, include retry logic, and track progress
via Memcached: library:task:{task_id}:progress.
"""
import logging
from celery import shared_task
from django.core.cache import cache
logger = logging.getLogger(__name__)
# Cache key pattern for task progress
PROGRESS_KEY = "library:task:{task_id}:progress"
def _update_progress(task, percent: int, message: str):
"""
Update task progress in Memcached and Celery state.
:param task: Celery task instance (self).
:param percent: Progress percentage (0-100).
:param message: Human-readable status message.
"""
try:
task.update_state(state="PROGRESS", meta={"percent": percent, "message": message})
cache.set(
PROGRESS_KEY.format(task_id=task.request.id),
{"percent": percent, "message": message},
timeout=3600,
)
except Exception as exc:
    logger.debug("Progress update failed: %s", exc)
@shared_task(
name="library.tasks.embed_item",
bind=True,
queue="embedding",
max_retries=3,
default_retry_delay=60,
acks_late=True,
)
def embed_item(self, item_uid: str, user_id: int | None = None):
"""
Run the full embedding pipeline for a single Item.
:param item_uid: UID of the Item node to process.
:param user_id: Optional user ID for usage tracking.
:returns: Dict with processing results.
"""
logger.info("Task embed_item starting item_uid=%s task_id=%s", item_uid, self.request.id)
try:
from library.services.pipeline import EmbeddingPipeline
user = _resolve_user(user_id)
pipeline = EmbeddingPipeline(user=user)
def progress_cb(percent, message):
_update_progress(self, percent, message)
result = pipeline.process_item(item_uid, progress_callback=progress_cb)
logger.info(
"Task embed_item completed item_uid=%s chunks=%d images=%d",
item_uid,
result.get("chunks_created", 0),
result.get("images_stored", 0),
)
return {"success": True, "item_uid": item_uid, **result}
except Exception as exc:
logger.error(
"Task embed_item failed item_uid=%s: %s",
item_uid,
exc,
exc_info=True,
)
# Retry on transient errors
if self.request.retries < self.max_retries:
raise self.retry(exc=exc)
return {"success": False, "item_uid": item_uid, "error": str(exc)}
@shared_task(
name="library.tasks.reembed_item",
bind=True,
queue="embedding",
max_retries=3,
default_retry_delay=60,
acks_late=True,
)
def reembed_item(self, item_uid: str, user_id: int | None = None):
"""
Delete existing embeddings and re-process an Item.
:param item_uid: UID of the Item node to re-embed.
:param user_id: Optional user ID for usage tracking.
:returns: Dict with processing results.
"""
logger.info("Task reembed_item starting item_uid=%s", item_uid)
try:
from library.services.pipeline import EmbeddingPipeline
user = _resolve_user(user_id)
pipeline = EmbeddingPipeline(user=user)
def progress_cb(percent, message):
_update_progress(self, percent, message)
result = pipeline.reprocess_item(item_uid, progress_callback=progress_cb)
logger.info("Task reembed_item completed item_uid=%s", item_uid)
return {"success": True, "item_uid": item_uid, **result}
except Exception as exc:
logger.error("Task reembed_item failed item_uid=%s: %s", item_uid, exc, exc_info=True)
if self.request.retries < self.max_retries:
raise self.retry(exc=exc)
return {"success": False, "item_uid": item_uid, "error": str(exc)}
@shared_task(
name="library.tasks.embed_collection",
bind=True,
queue="batch",
acks_late=True,
)
def embed_collection(self, collection_uid: str, user_id: int | None = None):
"""
Embed all items in a collection.
:param collection_uid: UID of the Collection node.
:param user_id: Optional user ID for usage tracking.
:returns: Dict with summary results.
"""
logger.info("Task embed_collection starting collection_uid=%s", collection_uid)
try:
from library.models import Collection
col = Collection.nodes.get(uid=collection_uid)
items = col.items.all()
results = {"total": len(items), "successful": 0, "failed": 0, "skipped": 0}
for i, item in enumerate(items):
# Skip already-completed items with unchanged content
if item.embedding_status == "completed" and item.content_hash:
results["skipped"] += 1
logger.debug("Skipping already-embedded item_uid=%s", item.uid)
continue
try:
embed_item.delay(item.uid, user_id)
results["successful"] += 1
except Exception as exc:
results["failed"] += 1
logger.error(
"Failed to queue embed for item_uid=%s: %s", item.uid, exc
)
_update_progress(
self,
int((i + 1) / len(items) * 100),
f"Queued {i + 1}/{len(items)} items",
)
logger.info(
"Task embed_collection completed collection_uid=%s queued=%d skipped=%d failed=%d",
collection_uid,
results["successful"],
results["skipped"],
results["failed"],
)
return {"success": True, "collection_uid": collection_uid, **results}
except Exception as exc:
logger.error(
"Task embed_collection failed collection_uid=%s: %s",
collection_uid,
exc,
exc_info=True,
)
return {"success": False, "collection_uid": collection_uid, "error": str(exc)}
@shared_task(
name="library.tasks.embed_library",
bind=True,
queue="batch",
acks_late=True,
)
def embed_library(self, library_uid: str, user_id: int | None = None):
"""
Embed all items across all collections in a library.
:param library_uid: UID of the Library node.
:param user_id: Optional user ID for usage tracking.
:returns: Dict with summary results.
"""
logger.info("Task embed_library starting library_uid=%s", library_uid)
try:
from library.models import Library
lib = Library.nodes.get(uid=library_uid)
collections = lib.collections.all()
results = {"total_collections": len(collections), "items_queued": 0}
for col in collections:
embed_collection.delay(col.uid, user_id)
results["items_queued"] += len(col.items.all())
logger.info(
"Task embed_library completed library_uid=%s collections=%d items=%d",
library_uid,
results["total_collections"],
results["items_queued"],
)
return {"success": True, "library_uid": library_uid, **results}
except Exception as exc:
logger.error(
"Task embed_library failed library_uid=%s: %s",
library_uid,
exc,
exc_info=True,
)
return {"success": False, "library_uid": library_uid, "error": str(exc)}
@shared_task(
name="library.tasks.batch_embed_items",
bind=True,
queue="batch",
acks_late=True,
)
def batch_embed_items(self, item_uids: list[str], user_id: int | None = None):
"""
Queue embedding tasks for a specific list of items.
:param item_uids: List of Item UIDs.
:param user_id: Optional user ID for usage tracking.
:returns: Dict with queuing results.
"""
logger.info("Task batch_embed_items starting count=%d", len(item_uids))
queued = 0
for uid in item_uids:
try:
embed_item.delay(uid, user_id)
queued += 1
except Exception as exc:
logger.error("Failed to queue item_uid=%s: %s", uid, exc)
logger.info("Task batch_embed_items completed queued=%d/%d", queued, len(item_uids))
return {"success": True, "queued": queued, "total": len(item_uids)}
def _resolve_user(user_id: int | None = None):
"""
Resolve a user ID to a User instance.
:param user_id: Optional user ID.
:returns: User instance, or None.
"""
if not user_id:
return None
try:
from django.contrib.auth import get_user_model
User = get_user_model()
return User.objects.get(pk=user_id)
except Exception:
return None
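The progress bookkeeping used by the tasks above can be sketched standalone: the same Memcached key pattern, and the percent arithmetic from `embed_collection`. The seven-item collection and the `task_id` value are made-up examples:

```python
# Sketch of the tasks' progress bookkeeping: cache key formatting plus
# the integer percent computed per queued item in embed_collection.
PROGRESS_KEY = "library:task:{task_id}:progress"

items = [f"item-{i}" for i in range(7)]  # pretend collection of 7 items
percents = [int((i + 1) / len(items) * 100) for i in range(len(items))]

key = PROGRESS_KEY.format(task_id="abc123")  # hypothetical Celery task id
print(key)       # library:task:abc123:progress
print(percents)  # [14, 28, 42, 57, 71, 85, 100]
```

The percentages truncate toward zero, so progress only reaches exactly 100 on the final item.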


@@ -0,0 +1,19 @@
{% extends "themis/base.html" %}
{% block title %}Delete {{ collection.name }} — {{ themis_app_name }}{% endblock %}
{% block content %}
<div class="max-w-lg">
<h1 class="text-3xl font-bold mb-4 text-error">Delete Collection</h1>
<div class="alert alert-warning mb-6">
<span>Are you sure you want to delete <strong>{{ collection.name }}</strong>? This action cannot be undone.</span>
</div>
<form method="post">
{% csrf_token %}
<div class="flex gap-2">
<button type="submit" class="btn btn-error">Delete</button>
<a href="{% url 'library:collection-detail' uid=collection.uid %}" class="btn btn-ghost">Cancel</a>
</div>
</form>
</div>
{% endblock %}


@@ -0,0 +1,71 @@
{% extends "themis/base.html" %}
{% block title %}{{ collection.name }} — {{ themis_app_name }}{% endblock %}
{% block content %}
<div class="mb-4">
{% if library %}
<a href="{% url 'library:library-detail' uid=library.uid %}" class="btn btn-ghost btn-sm">← {{ library.name }}</a>
{% else %}
<a href="{% url 'library:library-list' %}" class="btn btn-ghost btn-sm">← Libraries</a>
{% endif %}
</div>
<div class="flex justify-between items-start mb-6">
<div>
<h1 class="text-3xl font-bold">{{ collection.name }}</h1>
{% if library %}<p class="opacity-60 mt-1">In: {{ library.name }}</p>{% endif %}
{% if collection.description %}
<p class="mt-3 opacity-80">{{ collection.description }}</p>
{% endif %}
</div>
<div class="flex gap-2">
<a href="{% url 'library:collection-edit' uid=collection.uid %}" class="btn btn-sm btn-outline">Edit</a>
<a href="{% url 'library:collection-delete' uid=collection.uid %}" class="btn btn-sm btn-error btn-outline">Delete</a>
</div>
</div>
<!-- Items -->
<div class="flex justify-between items-center mb-4">
<h2 class="text-xl font-bold">Items</h2>
<a href="{% url 'library:item-create' collection_uid=collection.uid %}" class="btn btn-sm btn-primary">
+ New Item
</a>
</div>
{% if items %}
<div class="overflow-x-auto">
<table class="table table-zebra w-full">
<thead>
<tr>
<th>Title</th>
<th>Type</th>
<th>File Type</th>
<th>Actions</th>
</tr>
</thead>
<tbody>
{% for item in items %}
<tr>
<td>
<a href="{% url 'library:item-detail' uid=item.uid %}" class="link link-hover font-medium">
{{ item.title }}
</a>
</td>
<td>{{ item.item_type|default:"-" }}</td>
<td>{{ item.file_type|default:"-" }}</td>
<td>
<a href="{% url 'library:item-detail' uid=item.uid %}" class="btn btn-xs btn-ghost">View</a>
<a href="{% url 'library:item-edit' uid=item.uid %}" class="btn btn-xs btn-ghost">Edit</a>
</td>
</tr>
{% endfor %}
</tbody>
</table>
</div>
{% else %}
<div class="text-center py-8 opacity-60">
<p>No items in this collection yet.</p>
</div>
{% endif %}
{% endblock %}


@@ -0,0 +1,43 @@
{% extends "themis/base.html" %}
{% block title %}{% if editing %}Edit Collection{% else %}New Collection{% endif %} — {{ themis_app_name }}{% endblock %}
{% block content %}
<div class="mb-4">
{% if library %}
<a href="{% url 'library:library-detail' uid=library.uid %}" class="btn btn-ghost btn-sm">← {{ library.name }}</a>
{% else %}
<a href="{% url 'library:library-list' %}" class="btn btn-ghost btn-sm">← Libraries</a>
{% endif %}
</div>
<h1 class="text-3xl font-bold mb-6">
{% if editing %}Edit Collection: {{ collection.name }}{% else %}New Collection{% endif %}
</h1>
{% if library %}<p class="opacity-60 mb-4">In library: {{ library.name }}</p>{% endif %}
<form method="post" class="max-w-2xl">
{% csrf_token %}
<div class="space-y-4">
<div class="form-control">
<label class="label"><span class="label-text font-medium">Name</span></label>
{{ form.name }}
{% if form.name.errors %}<p class="text-error text-sm mt-1">{{ form.name.errors.0 }}</p>{% endif %}
</div>
<div class="form-control">
<label class="label"><span class="label-text font-medium">Description</span></label>
{{ form.description }}
</div>
</div>
<div class="flex gap-2 mt-6">
<button type="submit" class="btn btn-primary">
{% if editing %}Save Changes{% else %}Create Collection{% endif %}
</button>
{% if library %}
<a href="{% url 'library:library-detail' uid=library.uid %}" class="btn btn-ghost">Cancel</a>
{% else %}
<a href="{% url 'library:library-list' %}" class="btn btn-ghost">Cancel</a>
{% endif %}
</div>
</form>
{% endblock %}


@@ -0,0 +1,163 @@
{% extends "themis/base.html" %}
{% load humanize %}
{% block title %}Embedding Pipeline — {{ themis_app_name }}{% endblock %}
{% block content %}
<div class="mb-4">
<a href="{% url 'library:library-list' %}" class="btn btn-ghost btn-sm">← Libraries</a>
</div>
<h1 class="text-3xl font-bold mb-6">Embedding Pipeline Dashboard</h1>
<!-- System Models -->
<div class="card bg-base-200 mb-6">
<div class="card-body">
<h2 class="card-title">System Models</h2>
<div class="overflow-x-auto">
<table class="table">
<tbody>
<tr>
<th class="w-48">Embedding Model</th>
<td>
{% if system_embedding_model %}
<span class="font-semibold">{{ system_embedding_model.api.name }}: {{ system_embedding_model.name }}</span>
{% if system_embedding_model.vector_dimensions %}
<span class="badge badge-info badge-sm ml-2">{{ system_embedding_model.vector_dimensions }}d</span>
{% endif %}
{% if system_embedding_model.supports_multimodal %}
<span class="badge badge-accent badge-sm ml-1">Multimodal</span>
{% endif %}
{% else %}
<div class="flex items-center gap-2">
<span class="badge badge-error">NOT CONFIGURED</span>
<span class="text-sm opacity-60">Set via <a href="/admin/llm_manager/llmmodel/" class="link link-primary">Admin → LLM Models</a> → Action: "Set as System Embedding Model"</span>
</div>
{% endif %}
</td>
</tr>
<tr>
<th>Chat Model</th>
<td>
{% if system_chat_model %}
<span class="font-semibold">{{ system_chat_model.api.name }}: {{ system_chat_model.name }}</span>
{% else %}
<span class="text-sm opacity-60">Not configured — concept extraction disabled</span>
{% endif %}
</td>
</tr>
<tr>
<th>Reranker Model</th>
<td>
{% if system_reranker_model %}
<span class="font-semibold">{{ system_reranker_model.api.name }}: {{ system_reranker_model.name }}</span>
{% else %}
<span class="text-sm opacity-60">Not configured — Phase 3</span>
{% endif %}
</td>
</tr>
</tbody>
</table>
</div>
</div>
</div>
{% if not neo4j_available %}
<div class="alert alert-warning mb-6">
<svg xmlns="http://www.w3.org/2000/svg" class="stroke-current shrink-0 h-6 w-6" fill="none" viewBox="0 0 24 24"><path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M12 9v2m0 4h.01m-6.938 4h13.856c1.54 0 2.502-1.667 1.732-2.5L13.732 4c-.77-.833-1.962-.833-2.732 0L4.082 16.5c-.77.833.192 2.5 1.732 2.5z" /></svg>
<span>Neo4j is not available. Item counts and graph statistics cannot be loaded.</span>
</div>
{% endif %}
<!-- Embedding Status -->
{% if neo4j_available %}
<div class="grid grid-cols-2 md:grid-cols-5 gap-4 mb-6">
<div class="stat bg-base-200 rounded-lg">
<div class="stat-title">Total Items</div>
<div class="stat-value text-lg">{{ total_items }}</div>
</div>
{% for status, count in status_counts.items %}
<div class="stat bg-base-200 rounded-lg">
<div class="stat-title">
{% if status == "completed" %}✓ Completed
{% elif status == "processing" %}⟳ Processing
{% elif status == "failed" %}✗ Failed
{% elif status == "pending" %}◦ Pending
{% else %}{{ status }}{% endif %}
</div>
<div class="stat-value text-lg
{% if status == 'completed' %} text-success
{% elif status == 'processing' %} text-warning
{% elif status == 'failed' %} text-error
{% endif %}">
{{ count }}
</div>
{% if total_items > 0 %}
<div class="stat-desc">{% widthratio count total_items 100 %}% of items</div>
{% endif %}
</div>
{% endfor %}
</div>
{% endif %}
<!-- Actions -->
{% if status_counts.pending and status_counts.pending > 0 %}
<div class="card bg-base-200 mb-6">
<div class="card-body">
<h2 class="card-title">Actions</h2>
<form method="post" action="{% url 'library:embed-all-pending' %}">
{% csrf_token %}
<button type="submit" class="btn btn-primary"
onclick="return confirm('Queue embedding for {{ status_counts.pending }} pending items?')">
Embed All Pending Items ({{ status_counts.pending }})
</button>
<p class="text-sm opacity-60 mt-2">
This will queue Celery tasks for all pending items that have uploaded files.
</p>
</form>
</div>
</div>
{% endif %}
<!-- Knowledge Graph Nodes -->
{% if neo4j_available %}
<div class="card bg-base-200 mb-6">
<div class="card-body">
<h2 class="card-title">Knowledge Graph</h2>
<div class="grid grid-cols-2 md:grid-cols-4 gap-3">
{% for label, count in node_counts.items %}
<div class="stat bg-base-100 rounded-lg p-3">
<div class="stat-title text-xs">{{ label }}</div>
<div class="stat-value text-base">{{ count|intcomma }}</div>
</div>
{% endfor %}
</div>
{% if total_chunks > 0 %}
<div class="mt-4">
<div class="flex items-center gap-2">
<span class="font-medium">Chunks with embeddings:</span>
<span>{{ embedded_chunks|intcomma }} / {{ total_chunks|intcomma }}</span>
<progress class="progress progress-primary w-48"
value="{{ embedded_chunks }}"
max="{{ total_chunks }}"></progress>
<span class="text-sm opacity-60">{% widthratio embedded_chunks total_chunks 100 %}%</span>
</div>
</div>
{% endif %}
</div>
</div>
{% endif %}
<!-- Quick Links -->
<div class="card bg-base-200">
<div class="card-body">
<h2 class="card-title">Quick Links</h2>
<div class="flex flex-wrap gap-2">
<a href="{% url 'library:library-list' %}" class="btn btn-outline btn-sm">Libraries</a>
<a href="/llm/" class="btn btn-outline btn-sm">LLM Manager</a>
<a href="/admin/llm_manager/llmmodel/" class="btn btn-outline btn-sm">Admin: LLM Models</a>
<a href="/admin/llm_manager/llmusage/" class="btn btn-outline btn-sm">Admin: Usage</a>
</div>
</div>
</div>
{% endblock %}

View File

@@ -0,0 +1,19 @@
{% extends "themis/base.html" %}
{% block title %}Delete {{ item.title }} — {{ themis_app_name }}{% endblock %}
{% block content %}
<div class="max-w-lg">
<h1 class="text-3xl font-bold mb-4 text-error">Delete Item</h1>
<div class="alert alert-warning mb-6">
<span>Are you sure you want to delete <strong>{{ item.title }}</strong>? This action cannot be undone.</span>
</div>
<form method="post">
{% csrf_token %}
<div class="flex gap-2">
<button type="submit" class="btn btn-error">Delete</button>
<a href="{% url 'library:item-detail' uid=item.uid %}" class="btn btn-ghost">Cancel</a>
</div>
</form>
</div>
{% endblock %}

View File

@@ -0,0 +1,138 @@
{% extends "themis/base.html" %}
{% load humanize %}
{% block title %}{{ item.title }} — {{ themis_app_name }}{% endblock %}
{% block content %}
<div class="mb-4">
<a href="{% url 'library:library-list' %}" class="btn btn-ghost btn-sm">← Libraries</a>
</div>
<div class="flex justify-between items-start mb-6">
<div>
<h1 class="text-3xl font-bold">{{ item.title }}</h1>
{% if item.item_type %}<div class="badge badge-outline mt-2">{{ item.item_type }}</div>{% endif %}
{% if item.file_type %}<div class="badge badge-ghost mt-2 ml-1">{{ item.file_type }}</div>{% endif %}
</div>
<div class="flex gap-2">
<a href="{% url 'library:item-edit' uid=item.uid %}" class="btn btn-sm btn-outline">Edit</a>
<form method="post" action="{% url 'library:item-reembed' uid=item.uid %}" class="inline">
{% csrf_token %}
<button type="submit" class="btn btn-sm btn-outline btn-secondary" title="Re-embed this item">
↻ Re-embed
</button>
</form>
<a href="{% url 'library:item-delete' uid=item.uid %}" class="btn btn-sm btn-error btn-outline">Delete</a>
</div>
</div>
<!-- Embedding Status -->
<div class="mb-6">
<div class="flex items-center gap-3">
<span class="font-medium">Embedding Status:</span>
{% if item.embedding_status == "completed" %}
<span class="badge badge-success">Completed</span>
{% elif item.embedding_status == "processing" %}
<span class="badge badge-warning">Processing</span>
{% elif item.embedding_status == "failed" %}
<span class="badge badge-error">Failed</span>
{% else %}
<span class="badge badge-ghost">Pending</span>
{% endif %}
{% if item.embedding_model_name %}
<span class="text-sm opacity-60">Model: {{ item.embedding_model_name }}</span>
{% endif %}
</div>
{% if item.error_message %}
<div class="alert alert-error mt-2">
<span>{{ item.error_message }}</span>
</div>
{% endif %}
</div>
<!-- Item Metadata -->
<div class="grid grid-cols-1 md:grid-cols-4 gap-4 mb-6">
<div class="stat bg-base-200 rounded-lg">
<div class="stat-title">File Size</div>
<div class="stat-value text-lg">{{ item.file_size|default:0|intcomma }} bytes</div>
</div>
<div class="stat bg-base-200 rounded-lg">
<div class="stat-title">Chunks</div>
<div class="stat-value text-lg">{{ item.chunk_count|default:0 }}</div>
</div>
<div class="stat bg-base-200 rounded-lg">
<div class="stat-title">Images</div>
<div class="stat-value text-lg">{{ item.image_count|default:0 }}</div>
</div>
<div class="stat bg-base-200 rounded-lg">
<div class="stat-title">Concepts</div>
<div class="stat-value text-lg">{{ concepts|length }}</div>
</div>
</div>
<!-- Concepts -->
{% if concepts %}
<div class="mb-6">
<h2 class="text-xl font-bold mb-3">Referenced Concepts</h2>
<div class="flex flex-wrap gap-2">
{% for concept in concepts %}
<div class="badge badge-lg badge-primary badge-outline">
{{ concept.name }}
{% if concept.concept_type %}
<span class="ml-1 opacity-60 text-xs">({{ concept.concept_type }})</span>
{% endif %}
</div>
{% endfor %}
</div>
</div>
{% endif %}
<!-- Images -->
{% if images %}
<div class="mb-6">
<h2 class="text-xl font-bold mb-3">Images ({{ images|length }})</h2>
<div class="grid grid-cols-2 md:grid-cols-4 gap-3">
{% for img in images %}
<div class="card bg-base-200">
<div class="card-body p-3">
<span class="badge badge-sm">{{ img.image_type|default:"image" }}</span>
{% if img.description %}
<p class="text-xs opacity-60 mt-1">{{ img.description|truncatewords:10 }}</p>
{% endif %}
<p class="text-xs opacity-40 mt-1">{{ img.s3_key }}</p>
</div>
</div>
{% endfor %}
</div>
</div>
{% endif %}
<!-- Chunks Preview -->
{% if chunks %}
<div class="mb-6">
<h2 class="text-xl font-bold mb-3">Chunks ({{ chunks|length }})</h2>
<div class="space-y-2">
{% for chunk in chunks|slice:":10" %}
<div class="collapse collapse-arrow bg-base-200">
<input type="checkbox" />
<div class="collapse-title font-medium">
Chunk {{ chunk.chunk_index }} <span class="text-sm opacity-60">({{ chunk.chunk_size }} chars)</span>
{% if chunk.embedding %}
<span class="badge badge-success badge-xs ml-2">embedded</span>
{% else %}
<span class="badge badge-ghost badge-xs ml-2">no vector</span>
{% endif %}
</div>
<div class="collapse-content">
<p class="text-sm whitespace-pre-wrap">{{ chunk.text_preview }}</p>
</div>
</div>
{% endfor %}
{% if chunks|length > 10 %}
<p class="text-sm opacity-60">… and {{ chunks|length|add:"-10" }} more chunks</p>
{% endif %}
</div>
</div>
{% endif %}
{% endblock %}

View File

@@ -0,0 +1,67 @@
{% extends "themis/base.html" %}
{% block title %}{% if editing %}Edit Item{% else %}New Item{% endif %} — {{ themis_app_name }}{% endblock %}
{% block content %}
<div class="mb-4">
{% if collection %}
<a href="{% url 'library:collection-detail' uid=collection.uid %}" class="btn btn-ghost btn-sm">← {{ collection.name }}</a>
{% else %}
<a href="{% url 'library:library-list' %}" class="btn btn-ghost btn-sm">← Libraries</a>
{% endif %}
</div>
<h1 class="text-3xl font-bold mb-6">
{% if editing %}Edit Item: {{ item.title }}{% else %}New Item{% endif %}
</h1>
<form method="post" enctype="multipart/form-data" class="max-w-2xl">
{% csrf_token %}
<div class="space-y-4">
<div class="form-control">
<label class="label"><span class="label-text font-medium">Title</span></label>
{{ form.title }}
{% if form.title.errors %}<p class="text-error text-sm mt-1">{{ form.title.errors.0 }}</p>{% endif %}
</div>
<div class="form-control">
<label class="label"><span class="label-text font-medium">Item Type</span></label>
{{ form.item_type }}
</div>
<div class="form-control">
<label class="label"><span class="label-text font-medium">File Type</span></label>
{{ form.file_type }}
<label class="label"><span class="label-text-alt">Auto-detected from uploaded file if left blank</span></label>
</div>
{% if not editing %}
<!-- File upload (only on create) -->
<div class="form-control">
<label class="label"><span class="label-text font-medium">Document File</span></label>
{{ form.file }}
<label class="label">
<span class="label-text-alt">{{ form.file.help_text }}</span>
</label>
{% if form.file.errors %}<p class="text-error text-sm mt-1">{{ form.file.errors.0 }}</p>{% endif %}
</div>
<div class="form-control">
<label class="label cursor-pointer justify-start gap-3">
{{ form.auto_embed }}
<span class="label-text">Auto-embed after upload</span>
</label>
<label class="label">
<span class="label-text-alt">{{ form.auto_embed.help_text }}</span>
</label>
</div>
{% endif %}
</div>
<div class="flex gap-2 mt-6">
<button type="submit" class="btn btn-primary">
{% if editing %}Save Changes{% else %}Create Item{% endif %}
</button>
{% if collection %}
<a href="{% url 'library:collection-detail' uid=collection.uid %}" class="btn btn-ghost">Cancel</a>
{% endif %}
</div>
</form>
{% endblock %}

View File

@@ -0,0 +1,23 @@
{% extends "themis/base.html" %}
{% block title %}Delete {{ library.name }} — {{ themis_app_name }}{% endblock %}
{% block content %}
<div class="mb-4">
<a href="{% url 'library:library-detail' uid=library.uid %}" class="btn btn-ghost btn-sm">← {{ library.name }}</a>
</div>
<div class="max-w-lg">
<h1 class="text-3xl font-bold mb-4 text-error">Delete Library</h1>
<div class="alert alert-warning mb-6">
<span>Are you sure you want to delete <strong>{{ library.name }}</strong>? This action cannot be undone.</span>
</div>
<form method="post">
{% csrf_token %}
<div class="flex gap-2">
<button type="submit" class="btn btn-error">Delete</button>
<a href="{% url 'library:library-detail' uid=library.uid %}" class="btn btn-ghost">Cancel</a>
</div>
</form>
</div>
{% endblock %}

View File

@@ -0,0 +1,93 @@
{% extends "themis/base.html" %}
{% block title %}{{ library.name }} — {{ themis_app_name }}{% endblock %}
{% block content %}
<div class="mb-4">
<a href="{% url 'library:library-list' %}" class="btn btn-ghost btn-sm">← Libraries</a>
</div>
<div class="flex justify-between items-start mb-6">
<div>
<h1 class="text-3xl font-bold">{{ library.name }}</h1>
<div class="badge badge-primary mt-2">{{ library.library_type }}</div>
{% if library.description %}
<p class="mt-3 opacity-80">{{ library.description }}</p>
{% endif %}
</div>
<div class="flex gap-2">
<a href="{% url 'library:library-edit' uid=library.uid %}" class="btn btn-sm btn-outline">Edit</a>
<a href="{% url 'library:library-delete' uid=library.uid %}" class="btn btn-sm btn-error btn-outline">Delete</a>
</div>
</div>
<!-- Content-Type Configuration -->
<div class="collapse collapse-arrow bg-base-200 mb-6">
<input type="checkbox" />
<div class="collapse-title font-medium">Content-Type Configuration</div>
<div class="collapse-content">
<div class="grid grid-cols-1 gap-4">
{% if library.embedding_instruction %}
<div>
<h4 class="font-semibold text-sm opacity-60">Embedding Instruction</h4>
<p class="text-sm mt-1">{{ library.embedding_instruction }}</p>
</div>
{% endif %}
{% if library.reranker_instruction %}
<div>
<h4 class="font-semibold text-sm opacity-60">Reranker Instruction</h4>
<p class="text-sm mt-1">{{ library.reranker_instruction }}</p>
</div>
{% endif %}
{% if library.llm_context_prompt %}
<div>
<h4 class="font-semibold text-sm opacity-60">LLM Context Prompt</h4>
<p class="text-sm mt-1">{{ library.llm_context_prompt }}</p>
</div>
{% endif %}
</div>
</div>
</div>
<!-- Collections -->
<div class="flex justify-between items-center mb-4">
<h2 class="text-xl font-bold">Collections</h2>
<a href="{% url 'library:collection-create' library_uid=library.uid %}" class="btn btn-sm btn-primary">
+ New Collection
</a>
</div>
{% if collections %}
<div class="overflow-x-auto">
<table class="table table-zebra w-full">
<thead>
<tr>
<th>Name</th>
<th>Description</th>
<th>Actions</th>
</tr>
</thead>
<tbody>
{% for col in collections %}
<tr>
<td>
<a href="{% url 'library:collection-detail' uid=col.uid %}" class="link link-hover font-medium">
{{ col.name }}
</a>
</td>
<td class="opacity-70">{{ col.description|truncatewords:15 }}</td>
<td>
<a href="{% url 'library:collection-detail' uid=col.uid %}" class="btn btn-xs btn-ghost">View</a>
<a href="{% url 'library:collection-edit' uid=col.uid %}" class="btn btn-xs btn-ghost">Edit</a>
</td>
</tr>
{% endfor %}
</tbody>
</table>
</div>
{% else %}
<div class="text-center py-8 opacity-60">
<p>No collections in this library yet.</p>
</div>
{% endif %}
{% endblock %}

View File

@@ -0,0 +1,59 @@
{% extends "themis/base.html" %}
{% block title %}{% if editing %}Edit Library{% else %}New Library{% endif %} — {{ themis_app_name }}{% endblock %}
{% block content %}
<div class="mb-4">
<a href="{% url 'library:library-list' %}" class="btn btn-ghost btn-sm">← Libraries</a>
</div>
<h1 class="text-3xl font-bold mb-6">
{% if editing %}Edit Library: {{ library.name }}{% else %}New Library{% endif %}
</h1>
<form method="post" class="max-w-2xl">
{% csrf_token %}
<div class="space-y-4">
<div class="form-control">
<label class="label"><span class="label-text font-medium">Name</span></label>
{{ form.name }}
{% if form.name.errors %}<p class="text-error text-sm mt-1">{{ form.name.errors.0 }}</p>{% endif %}
</div>
<div class="form-control">
<label class="label"><span class="label-text font-medium">Library Type</span></label>
{{ form.library_type }}
</div>
<div class="form-control">
<label class="label"><span class="label-text font-medium">Description</span></label>
{{ form.description }}
</div>
<div class="divider">Content-Type Configuration</div>
<div class="form-control">
<label class="label"><span class="label-text font-medium">Embedding Instruction</span></label>
{{ form.embedding_instruction }}
<label class="label"><span class="label-text-alt opacity-60">Leave blank to use default for the selected library type</span></label>
</div>
<div class="form-control">
<label class="label"><span class="label-text font-medium">Reranker Instruction</span></label>
{{ form.reranker_instruction }}
</div>
<div class="form-control">
<label class="label"><span class="label-text font-medium">LLM Context Prompt</span></label>
{{ form.llm_context_prompt }}
</div>
</div>
<div class="flex gap-2 mt-6">
<button type="submit" class="btn btn-primary">
{% if editing %}Save Changes{% else %}Create Library{% endif %}
</button>
<a href="{% url 'library:library-list' %}" class="btn btn-ghost">Cancel</a>
</div>
</form>
{% endblock %}

View File

@@ -0,0 +1,54 @@
{% extends "themis/base.html" %}
{% block title %}Libraries — {{ themis_app_name }}{% endblock %}
{% block content %}
<div class="flex justify-between items-center mb-6">
<h1 class="text-3xl font-bold">Libraries</h1>
<div class="flex gap-2">
<a href="{% url 'library:embedding-dashboard' %}" class="btn btn-outline btn-secondary">
Embedding Pipeline
</a>
<a href="{% url 'library:library-create' %}" class="btn btn-primary">
+ New Library
</a>
</div>
</div>
{% if error %}
<div class="alert alert-warning mb-4">
<span>{{ error }}</span>
</div>
{% endif %}
{% if libraries %}
<div class="grid grid-cols-1 md:grid-cols-2 lg:grid-cols-3 gap-4">
{% for lib in libraries %}
<div class="card bg-base-200 shadow-md">
<div class="card-body">
<h2 class="card-title">
<a href="{% url 'library:library-detail' uid=lib.uid %}" class="link link-hover">
{{ lib.name }}
</a>
</h2>
<div class="badge badge-outline">{{ lib.library_type }}</div>
{% if lib.description %}
<p class="text-sm opacity-70 mt-2">{{ lib.description|truncatewords:20 }}</p>
{% endif %}
<div class="card-actions justify-end mt-3">
<a href="{% url 'library:library-detail' uid=lib.uid %}" class="btn btn-sm btn-ghost">View</a>
<a href="{% url 'library:library-edit' uid=lib.uid %}" class="btn btn-sm btn-ghost">Edit</a>
</div>
</div>
</div>
{% endfor %}
</div>
{% else %}
{% if not error %}
<div class="text-center py-12 opacity-60">
<p class="text-lg">No libraries yet.</p>
<p class="mt-2">Create your first library to get started.</p>
</div>
{% endif %}
{% endif %}
{% endblock %}

View File

View File

@@ -0,0 +1,108 @@
"""
Tests for the content-type-aware chunking service.
"""
from unittest.mock import MagicMock, patch
from django.test import TestCase
from library.services.chunker import ChunkResult, ContentTypeChunker
from library.services.parsers import ParseResult, TextBlock
class ChunkResultTests(TestCase):
"""Tests for ChunkResult dataclass."""
def test_len(self):
result = ChunkResult(chunks=["a", "b", "c"], chunk_page_map={}, strategy="test")
self.assertEqual(len(result), 3)
def test_empty(self):
result = ChunkResult(chunks=[], chunk_page_map={}, strategy="test")
self.assertEqual(len(result), 0)
class ContentTypeChunkerTests(TestCase):
"""Tests for ContentTypeChunker."""
def _make_parse_result(self, text: str, pages: int = 1) -> ParseResult:
"""Helper to create a ParseResult with text blocks."""
blocks = []
if pages == 1:
blocks = [TextBlock(text=text, page=0)]
else:
chunk_size = len(text) // pages
for i in range(pages):
start = i * chunk_size
end = start + chunk_size if i < pages - 1 else len(text)
blocks.append(TextBlock(text=text[start:end], page=i))
return ParseResult(text_blocks=blocks, images=[], metadata={}, file_type="txt")
@patch("library.services.chunker.ContentTypeChunker._get_splitter")
def test_chunk_dispatches_strategy(self, mock_splitter):
"""Chunker uses the strategy from config."""
mock_instance = MagicMock()
mock_instance.chunks.return_value = ["chunk1", "chunk2"]
mock_splitter.return_value = mock_instance
chunker = ContentTypeChunker()
parse_result = self._make_parse_result("Some text to chunk into pieces")
config = {"strategy": "chapter_aware", "chunk_size": 512, "chunk_overlap": 64}
result = chunker.chunk(parse_result, config, library_type="fiction")
self.assertIsInstance(result, ChunkResult)
self.assertEqual(result.strategy, "chapter_aware")
self.assertEqual(len(result.chunks), 2)
mock_splitter.assert_called_once_with(512, 64)
@patch("library.services.chunker.ContentTypeChunker._get_splitter")
def test_empty_text_returns_empty(self, mock_splitter):
"""Empty text produces no chunks."""
chunker = ContentTypeChunker()
parse_result = ParseResult(text_blocks=[], images=[], metadata={}, file_type="txt")
config = {"strategy": "section_aware", "chunk_size": 512, "chunk_overlap": 64}
result = chunker.chunk(parse_result, config)
self.assertEqual(len(result), 0)
mock_splitter.assert_not_called()
@patch("library.services.chunker.ContentTypeChunker._get_splitter")
def test_default_config_values(self, mock_splitter):
"""Missing config keys use defaults."""
mock_instance = MagicMock()
mock_instance.chunks.return_value = ["chunk"]
mock_splitter.return_value = mock_instance
chunker = ContentTypeChunker()
parse_result = self._make_parse_result("Text")
result = chunker.chunk(parse_result, {})
# Default: strategy=section_aware, chunk_size=512, overlap=64
self.assertEqual(result.strategy, "section_aware")
mock_splitter.assert_called_once_with(512, 64)
@patch("library.services.chunker.ContentTypeChunker._get_splitter")
def test_page_mapping(self, mock_splitter):
"""Chunks are mapped to their source pages."""
mock_instance = MagicMock()
mock_instance.chunks.return_value = ["Page 0 text", "Page 1 text"]
mock_splitter.return_value = mock_instance
chunker = ContentTypeChunker()
parse_result = ParseResult(
text_blocks=[
TextBlock(text="Page 0 text content", page=0),
TextBlock(text="Page 1 text content", page=1),
],
images=[],
metadata={},
file_type="pdf",
)
config = {"strategy": "section_aware", "chunk_size": 512, "chunk_overlap": 64}
result = chunker.chunk(parse_result, config)
self.assertIn(0, result.chunk_page_map)

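The page-mapping behavior verified above can be sketched as follows. This is a hypothetical illustration of how chunks might be traced back to source pages, not the actual `ContentTypeChunker` internals; the helper name `build_chunk_page_map` is invented for this sketch.

```python
# Hypothetical sketch: map each chunk back to the page of the text block
# that overlaps it. Not the real ContentTypeChunker implementation.
from dataclasses import dataclass


@dataclass
class TextBlock:
    text: str
    page: int


def build_chunk_page_map(chunks: list[str], blocks: list[TextBlock]) -> dict[int, int]:
    """Return {chunk_index: page} by locating each chunk in a source block."""
    mapping = {}
    for i, chunk in enumerate(chunks):
        for block in blocks:
            # Attribute the chunk to the first page whose block shares its text.
            if chunk in block.text or block.text in chunk:
                mapping[i] = block.page
                break
    return mapping


blocks = [TextBlock("Page 0 text content", 0), TextBlock("Page 1 text content", 1)]
print(build_chunk_page_map(["Page 0 text", "Page 1 text"], blocks))
```

A substring check like this is the simplest attribution strategy; a production chunker would more likely track character offsets through the splitter.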
View File

@@ -0,0 +1,77 @@
"""
Tests for the concept extraction service.
"""
from unittest.mock import MagicMock, patch
from django.test import TestCase
from library.services.concepts import ConceptExtractor
class ConceptExtractionParsingTests(TestCase):
"""Tests for concept response parsing."""
def setUp(self):
self.mock_model = MagicMock()
self.mock_model.api.api_type = "openai"
self.mock_model.api.base_url = "http://localhost:8080/v1"
self.mock_model.api.api_key = "test"
self.mock_model.api.timeout_seconds = 30
self.mock_model.name = "test-chat"
self.extractor = ConceptExtractor(self.mock_model)
def test_parse_valid_json_array(self):
response = '[{"name": "python", "type": "topic"}, {"name": "django", "type": "technique"}]'
result = self.extractor._parse_concept_response(response)
self.assertEqual(len(result), 2)
self.assertEqual(result[0]["name"], "python")
self.assertEqual(result[1]["type"], "technique")
def test_parse_json_in_markdown_code_block(self):
response = '```json\n[{"name": "python", "type": "topic"}]\n```'
result = self.extractor._parse_concept_response(response)
self.assertEqual(len(result), 1)
def test_parse_json_embedded_in_text(self):
response = 'Here are the concepts: [{"name": "neo4j", "type": "technique"}] found in the text.'
result = self.extractor._parse_concept_response(response)
self.assertEqual(len(result), 1)
def test_parse_invalid_json_returns_empty(self):
response = "This is not JSON at all."
result = self.extractor._parse_concept_response(response)
self.assertEqual(result, [])
def test_parse_filters_invalid_entries(self):
response = '[{"name": "valid", "type": "topic"}, {"invalid": "entry"}, "string"]'
result = self.extractor._parse_concept_response(response)
self.assertEqual(len(result), 1)
self.assertEqual(result[0]["name"], "valid")
class SampleIndexSelectionTests(TestCase):
"""Tests for sample index selection."""
def setUp(self):
self.extractor = ConceptExtractor(MagicMock())
def test_small_total_returns_all(self):
indices = self.extractor._select_sample_indices(5, max_samples=10)
self.assertEqual(indices, [0, 1, 2, 3, 4])
def test_equal_total_returns_all(self):
indices = self.extractor._select_sample_indices(10, max_samples=10)
self.assertEqual(indices, list(range(10)))
def test_large_total_returns_max_samples(self):
indices = self.extractor._select_sample_indices(100, max_samples=10)
self.assertEqual(len(indices), 10)
# Should be evenly spaced
self.assertEqual(indices[0], 0)
self.assertEqual(indices[-1], 90)
def test_returns_integers(self):
indices = self.extractor._select_sample_indices(50, max_samples=7)
for idx in indices:
self.assertIsInstance(idx, int)

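The sample-index tests above pin down a specific contract: small totals return every index, and large totals return evenly spaced integers starting at 0. A minimal sketch satisfying that contract (an assumption, not the actual `ConceptExtractor._select_sample_indices` code):

```python
# Hypothetical sketch of evenly spaced sample selection, matching the
# behavior the tests above assert; not the real ConceptExtractor code.
def select_sample_indices(total: int, max_samples: int = 10) -> list[int]:
    """Pick up to max_samples chunk indices, evenly spaced from 0."""
    if total <= max_samples:
        return list(range(total))
    step = total // max_samples
    return [i * step for i in range(max_samples)]


print(select_sample_indices(5))    # small total: every index
print(select_sample_indices(100))  # 10 evenly spaced indices, 0 through 90
```

Sampling a fixed number of evenly spaced chunks keeps LLM cost bounded for large documents while still covering the whole text.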
View File

@@ -0,0 +1,165 @@
"""
Tests for the multi-backend embedding client.
"""
from unittest.mock import MagicMock, patch
from django.test import TestCase
from library.services.embedding_client import EmbeddingClient
class MockLLMModel:
"""Mock LLMModel for testing."""
def __init__(self, api_type="openai", supports_multimodal=False, vector_dimensions=None):
self.name = "test-embedding-model"
self.supports_multimodal = supports_multimodal
self.vector_dimensions = vector_dimensions
self.input_cost_per_1k = "0.0001"
self.api = MockLLMApi(api_type=api_type)
class MockLLMApi:
"""Mock LLMApi for testing."""
def __init__(self, api_type="openai"):
self.name = "Test API"
self.api_type = api_type
self.base_url = "http://localhost:8080/v1"
self.api_key = "test-key"
self.timeout_seconds = 60
class EmbeddingClientInitTests(TestCase):
"""Tests for EmbeddingClient initialization."""
def test_init_openai(self):
model = MockLLMModel(api_type="openai")
client = EmbeddingClient(model)
self.assertEqual(client.api_type, "openai")
self.assertEqual(client.model_name, "test-embedding-model")
def test_init_bedrock(self):
model = MockLLMModel(api_type="bedrock")
model.api.base_url = "https://bedrock-runtime.us-east-1.amazonaws.com"
client = EmbeddingClient(model)
self.assertEqual(client.api_type, "bedrock")
def test_init_with_user(self):
model = MockLLMModel()
user = MagicMock()
client = EmbeddingClient(model, user=user)
self.assertEqual(client.user, user)
class OpenAIResponseParsingTests(TestCase):
"""Tests for OpenAI-compatible response parsing."""
def setUp(self):
self.model = MockLLMModel()
self.client = EmbeddingClient(self.model)
def test_parse_standard_openai_format(self):
data = {"data": [{"embedding": [0.1, 0.2, 0.3], "index": 0}]}
result = self.client._parse_openai_response(data)
self.assertEqual(len(result), 1)
self.assertEqual(result[0], [0.1, 0.2, 0.3])
def test_parse_multi_embedding_openai(self):
data = {
"data": [
{"embedding": [0.1, 0.2], "index": 1},
{"embedding": [0.3, 0.4], "index": 0},
]
}
result = self.client._parse_openai_response(data)
self.assertEqual(len(result), 2)
# Should be sorted by index
self.assertEqual(result[0], [0.3, 0.4])
self.assertEqual(result[1], [0.1, 0.2])
def test_parse_list_of_dicts(self):
data = [{"embedding": [0.1, 0.2]}, {"embedding": [0.3, 0.4]}]
result = self.client._parse_openai_response(data)
self.assertEqual(len(result), 2)
def test_parse_dict_with_embedding_key(self):
data = {"embedding": [0.1, 0.2, 0.3]}
result = self.client._parse_openai_response(data)
self.assertEqual(len(result), 1)
def test_parse_dict_with_embeddings_key(self):
data = {"embeddings": [[0.1, 0.2], [0.3, 0.4]]}
result = self.client._parse_openai_response(data)
self.assertEqual(len(result), 2)
def test_unexpected_format_raises(self):
with self.assertRaises(ValueError):
self.client._parse_openai_response({"unexpected": "data"})
class EmbeddingClientDispatchTests(TestCase):
"""Tests for API type dispatch."""
@patch("library.services.embedding_client.requests.post")
def test_embed_text_openai_dispatch(self, mock_post):
mock_response = MagicMock()
mock_response.status_code = 200
mock_response.json.return_value = {
"data": [{"embedding": [0.1, 0.2, 0.3], "index": 0}]
}
mock_post.return_value = mock_response
model = MockLLMModel(api_type="openai")
client = EmbeddingClient(model)
result = client.embed_text("test text")
self.assertEqual(result, [0.1, 0.2, 0.3])
mock_post.assert_called_once()
call_url = mock_post.call_args[0][0]
self.assertIn("/embeddings", call_url)
@patch("library.services.embedding_client.requests.post")
def test_embed_text_bedrock_dispatch(self, mock_post):
mock_response = MagicMock()
mock_response.status_code = 200
mock_response.json.return_value = {
"embedding": [0.4, 0.5, 0.6],
"inputTextTokenCount": 5,
}
mock_post.return_value = mock_response
model = MockLLMModel(api_type="bedrock", vector_dimensions=1024)
model.api.base_url = "https://bedrock-runtime.us-east-1.amazonaws.com"
client = EmbeddingClient(model)
result = client.embed_text("test text")
self.assertEqual(result, [0.4, 0.5, 0.6])
call_url = mock_post.call_args[0][0]
self.assertIn("/model/", call_url)
self.assertIn("/invoke", call_url)
def test_embed_image_not_multimodal_returns_none(self):
model = MockLLMModel(supports_multimodal=False)
client = EmbeddingClient(model)
result = client.embed_image(b"fake image data", "png")
self.assertIsNone(result)
@patch("library.services.embedding_client.requests.post")
def test_embed_texts_batch(self, mock_post):
mock_response = MagicMock()
mock_response.status_code = 200
mock_response.json.return_value = {
"data": [
{"embedding": [0.1, 0.2], "index": 0},
{"embedding": [0.3, 0.4], "index": 1},
]
}
mock_post.return_value = mock_response
model = MockLLMModel()
client = EmbeddingClient(model)
results = client.embed_texts(["text1", "text2"])
self.assertEqual(len(results), 2)

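The parsing tests above enumerate the response shapes the client tolerates. A standalone sketch of that normalization logic (hedged: the function name and exact error text are assumptions based on what the tests check, not the real `EmbeddingClient` source):

```python
# Hypothetical sketch of tolerant embedding-response parsing; shapes taken
# from the tests above, not from the actual EmbeddingClient implementation.
def parse_openai_response(data) -> list[list[float]]:
    """Normalize several embedding-response shapes to a list of vectors."""
    if isinstance(data, dict) and "data" in data:
        # Standard OpenAI shape: order entries by their reported index.
        entries = sorted(data["data"], key=lambda e: e.get("index", 0))
        return [e["embedding"] for e in entries]
    if isinstance(data, list):
        return [e["embedding"] for e in data]
    if isinstance(data, dict) and "embedding" in data:
        return [data["embedding"]]
    if isinstance(data, dict) and "embeddings" in data:
        return data["embeddings"]
    raise ValueError(f"Unexpected embedding response format: {type(data)!r}")


out = parse_openai_response(
    {"data": [{"embedding": [0.1, 0.2], "index": 1},
              {"embedding": [0.3, 0.4], "index": 0}]}
)
print(out)  # vectors sorted by index
```

Sorting by the `index` field matters because OpenAI-compatible servers are not required to return batch entries in request order.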
View File

@@ -0,0 +1,129 @@
"""
Tests for the document parser service.
"""
import os
import tempfile
from django.test import TestCase
from library.services.parsers import (
IMAGE_EXTENSIONS,
PLAINTEXT_EXTENSIONS,
PYMUPDF_EXTENSIONS,
DocumentParser,
ParseResult,
)
class DocumentParserPlaintextTests(TestCase):
"""Tests for plain text parsing."""
def setUp(self):
self.parser = DocumentParser()
def test_parse_txt_file(self):
with tempfile.NamedTemporaryFile(mode="w", suffix=".txt", delete=False) as f:
f.write("Hello World\n\nThis is a test document.")
f.flush()
path = f.name
try:
result = self.parser.parse(path, "txt")
self.assertIsInstance(result, ParseResult)
self.assertEqual(result.file_type, "txt")
self.assertEqual(len(result.text_blocks), 1)
self.assertIn("Hello World", result.text_blocks[0].text)
self.assertEqual(len(result.images), 0)
finally:
os.unlink(path)
def test_parse_md_file(self):
with tempfile.NamedTemporaryFile(mode="w", suffix=".md", delete=False) as f:
f.write("# Heading\n\nSome markdown content.")
f.flush()
path = f.name
try:
result = self.parser.parse(path, "md")
self.assertEqual(result.file_type, "md")
self.assertIn("Heading", result.text_blocks[0].text)
finally:
os.unlink(path)
def test_parse_empty_file(self):
with tempfile.NamedTemporaryFile(mode="w", suffix=".txt", delete=False) as f:
f.write("")
f.flush()
path = f.name
try:
result = self.parser.parse(path, "txt")
self.assertEqual(len(result.text_blocks), 0)
finally:
os.unlink(path)
def test_parse_bytes(self):
data = b"Hello from bytes"
result = self.parser.parse_bytes(data, "txt", filename="test.txt")
self.assertEqual(len(result.text_blocks), 1)
self.assertIn("Hello from bytes", result.text_blocks[0].text)
class DocumentParserValidationTests(TestCase):
"""Tests for parser input validation."""
def setUp(self):
self.parser = DocumentParser()
def test_unsupported_format_raises(self):
with tempfile.NamedTemporaryFile(suffix=".xyz", delete=False) as f:
f.write(b"data")
path = f.name
try:
with self.assertRaises(ValueError) as ctx:
self.parser.parse(path, "xyz")
self.assertIn("Unsupported file type", str(ctx.exception))
finally:
os.unlink(path)
def test_file_type_normalization(self):
"""File type should be normalized (lowercase, no dot)."""
with tempfile.NamedTemporaryFile(mode="w", suffix=".txt", delete=False) as f:
f.write("test")
path = f.name
try:
result = self.parser.parse(path, ".TXT")
self.assertEqual(result.file_type, "txt")
finally:
os.unlink(path)
class SupportedExtensionsTests(TestCase):
"""Tests for supported extension sets."""
def test_pymupdf_includes_pdf(self):
self.assertIn("pdf", PYMUPDF_EXTENSIONS)
def test_pymupdf_includes_epub(self):
self.assertIn("epub", PYMUPDF_EXTENSIONS)
def test_pymupdf_includes_docx(self):
self.assertIn("docx", PYMUPDF_EXTENSIONS)
def test_pymupdf_includes_pptx(self):
self.assertIn("pptx", PYMUPDF_EXTENSIONS)
def test_plaintext_includes_txt(self):
self.assertIn("txt", PLAINTEXT_EXTENSIONS)
def test_plaintext_includes_md(self):
self.assertIn("md", PLAINTEXT_EXTENSIONS)
def test_image_includes_png(self):
self.assertIn("png", IMAGE_EXTENSIONS)
def test_image_includes_jpg(self):
self.assertIn("jpg", IMAGE_EXTENSIONS)

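The validation tests above imply a normalization step before dispatching to a parser backend. A minimal sketch of that step, assuming the extension sets shown in the tests (the function name is hypothetical; the real `DocumentParser` may differ):

```python
# Hypothetical sketch of file-type normalization (".TXT" -> "txt") and
# support checking; not the actual DocumentParser code.
PLAINTEXT_EXTENSIONS = {"txt", "md"}
PYMUPDF_EXTENSIONS = {"pdf", "epub", "docx", "pptx"}
IMAGE_EXTENSIONS = {"png", "jpg"}


def normalize_file_type(file_type: str) -> str:
    """Lowercase and strip a leading dot so 'PDF' and '.pdf' both match."""
    ft = file_type.lower().lstrip(".")
    supported = PLAINTEXT_EXTENSIONS | PYMUPDF_EXTENSIONS | IMAGE_EXTENSIONS
    if ft not in supported:
        raise ValueError(f"Unsupported file type: {file_type}")
    return ft


print(normalize_file_type(".TXT"))
```

Normalizing once at the boundary means every downstream branch (PyMuPDF, plain text, image) can compare against lowercase, dot-free extensions.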
View File

@@ -0,0 +1,103 @@
"""
Tests for the embedding pipeline orchestrator.
Pipeline tests mock external dependencies (Neo4j, S3, LLM APIs).
"""
from unittest.mock import MagicMock, patch
from django.test import TestCase
from library.services.pipeline import (
CHUNK_S3_KEY,
IMAGE_S3_KEY,
ORIGINAL_S3_KEY,
EmbeddingPipeline,
)
class S3KeyPatternTests(TestCase):
"""Tests for S3 key pattern formatting."""
def test_original_key_format(self):
key = ORIGINAL_S3_KEY.format(item_uid="abc123", ext="pdf")
self.assertEqual(key, "items/abc123/original.pdf")
def test_chunk_key_format(self):
key = CHUNK_S3_KEY.format(item_uid="abc123", index=5)
self.assertEqual(key, "chunks/abc123/chunk_5.txt")
def test_image_key_format(self):
key = IMAGE_S3_KEY.format(item_uid="abc123", index=2, ext="png")
self.assertEqual(key, "images/abc123/2.png")
class EmbeddingPipelineInitTests(TestCase):
"""Tests for pipeline initialization."""
def test_init_without_user(self):
pipeline = EmbeddingPipeline()
self.assertIsNone(pipeline.user)
def test_init_with_user(self):
user = MagicMock()
pipeline = EmbeddingPipeline(user=user)
self.assertEqual(pipeline.user, user)
class PipelineItemNotFoundTests(TestCase):
"""Tests for handling missing items."""
@patch("library.services.pipeline.Item")
def test_process_nonexistent_item_raises(self, mock_item_cls):
mock_item_cls.nodes.get.side_effect = Exception("Not found")
pipeline = EmbeddingPipeline()
with self.assertRaises(ValueError) as ctx:
pipeline.process_item("nonexistent-uid")
self.assertIn("Item not found", str(ctx.exception))
@patch("library.services.pipeline.Item")
def test_reprocess_nonexistent_item_raises(self, mock_item_cls):
mock_item_cls.nodes.get.side_effect = Exception("Not found")
pipeline = EmbeddingPipeline()
with self.assertRaises(ValueError):
pipeline.reprocess_item("nonexistent-uid")
class PipelineNoEmbeddingModelTests(TestCase):
"""Tests for handling missing system embedding model."""
@patch("library.services.pipeline.LLMModel")
@patch("library.services.pipeline.default_storage")
@patch("library.services.pipeline.DocumentParser")
def test_no_embedding_model_raises(self, mock_parser, mock_storage, mock_llm):
"""Pipeline raises ValueError if no system embedding model is configured."""
mock_llm.get_system_embedding_model.return_value = None
# Mock item
mock_item = MagicMock()
mock_item.uid = "test-uid"
mock_item.title = "Test"
mock_item.file_type = "txt"
mock_item.s3_key = "items/test-uid/original.txt"
mock_item.embedding_status = "pending"
mock_item.chunks.all.return_value = []
mock_item.images.all.return_value = []
with patch("library.services.pipeline.Item") as mock_item_cls:
mock_item_cls.nodes.get.return_value = mock_item
# Mock S3 read
mock_storage.open.return_value.__enter__ = MagicMock(
return_value=MagicMock(read=MagicMock(return_value=b"test content"))
)
mock_storage.open.return_value.__exit__ = MagicMock(return_value=False)
pipeline = EmbeddingPipeline()
with self.assertRaises(ValueError) as ctx:
pipeline.process_item("test-uid")
self.assertIn("No system embedding model", str(ctx.exception))


@@ -0,0 +1,86 @@
"""
Tests for Celery embedding tasks.
Tasks are tested with CELERY_TASK_ALWAYS_EAGER=True for synchronous execution.
"""
from unittest.mock import MagicMock, patch
from django.test import TestCase, override_settings
@override_settings(CELERY_TASK_ALWAYS_EAGER=True)
class EmbedItemTaskTests(TestCase):
"""Tests for the embed_item task."""
@patch("library.tasks.EmbeddingPipeline")
def test_embed_item_success(self, mock_pipeline_cls):
from library.tasks import embed_item
mock_pipeline = MagicMock()
mock_pipeline.process_item.return_value = {
"chunks_created": 10,
"images_stored": 2,
"model_name": "test-model",
}
mock_pipeline_cls.return_value = mock_pipeline
result = embed_item("test-uid-123")
self.assertTrue(result["success"])
self.assertEqual(result["item_uid"], "test-uid-123")
mock_pipeline.process_item.assert_called_once()
@patch("library.tasks.EmbeddingPipeline")
def test_embed_item_failure(self, mock_pipeline_cls):
from library.tasks import embed_item
mock_pipeline = MagicMock()
mock_pipeline.process_item.side_effect = ValueError("Item not found")
mock_pipeline_cls.return_value = mock_pipeline
result = embed_item("nonexistent-uid")
self.assertFalse(result["success"])
self.assertIn("error", result)
@override_settings(CELERY_TASK_ALWAYS_EAGER=True)
class ReembedItemTaskTests(TestCase):
"""Tests for the reembed_item task."""
@patch("library.tasks.EmbeddingPipeline")
def test_reembed_item_success(self, mock_pipeline_cls):
from library.tasks import reembed_item
mock_pipeline = MagicMock()
mock_pipeline.reprocess_item.return_value = {
"chunks_created": 5,
"images_stored": 1,
"model_name": "test-model",
}
mock_pipeline_cls.return_value = mock_pipeline
result = reembed_item("test-uid-123")
self.assertTrue(result["success"])
mock_pipeline.reprocess_item.assert_called_once()
class ResolveUserTests(TestCase):
"""Tests for the _resolve_user helper."""
def test_none_user_id(self):
from library.tasks import _resolve_user
self.assertIsNone(_resolve_user(None))
def test_zero_user_id(self):
from library.tasks import _resolve_user
self.assertIsNone(_resolve_user(0))
def test_invalid_user_id(self):
from library.tasks import _resolve_user
self.assertIsNone(_resolve_user(999999))
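The result dicts asserted in these tests imply a task that catches pipeline errors and reports success or failure in its return value rather than raising. A minimal sketch of that contract, with a hypothetical `StubPipeline` standing in for `EmbeddingPipeline` (the real task is a Celery task in `library/tasks.py`):

```python
class StubPipeline:
    """Hypothetical stand-in for EmbeddingPipeline; raises like the real one."""

    def process_item(self, uid):
        if uid == "nonexistent-uid":
            raise ValueError("Item not found")
        return {"chunks_created": 10, "images_stored": 2, "model_name": "test-model"}


def embed_item(item_uid, pipeline=None):
    # Mirrors the success/error envelope the tests above assert on.
    pipeline = pipeline or StubPipeline()
    try:
        stats = pipeline.process_item(item_uid)
        return {"success": True, "item_uid": item_uid, **stats}
    except Exception as exc:
        return {"success": False, "item_uid": item_uid, "error": str(exc)}


print(embed_item("test-uid-123")["success"])   # True
print(embed_item("nonexistent-uid")["error"])  # Item not found
```

Returning an envelope instead of raising keeps Celery result backends free of tracebacks and lets batch callers tally failures without try/except.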


@@ -0,0 +1,121 @@
"""
Tests for text sanitization utilities.
"""
from django.test import TestCase
from library.services.text_utils import (
clean_pdf_artifacts,
remove_excessive_whitespace,
sanitize_text,
truncate_text,
)
class SanitizeTextTests(TestCase):
"""Tests for the sanitize_text function."""
def test_empty_string(self):
self.assertEqual(sanitize_text("", log_changes=False), "")
def test_none_input(self):
self.assertIsNone(sanitize_text(None, log_changes=False))
def test_clean_text_unchanged(self):
text = "Hello, this is clean text."
self.assertEqual(sanitize_text(text, log_changes=False), text)
def test_removes_null_bytes(self):
text = "Hello\x00World"
result = sanitize_text(text, log_changes=False)
self.assertNotIn("\x00", result)
self.assertEqual(result, "HelloWorld")
def test_removes_control_characters(self):
text = "Hello\x07World\x0eTest"
result = sanitize_text(text, log_changes=False)
self.assertNotIn("\x07", result)
self.assertNotIn("\x0e", result)
def test_preserves_newlines_and_tabs(self):
text = "Hello\nWorld\tTest\r\n"
result = sanitize_text(text, log_changes=False)
self.assertIn("\n", result)
self.assertIn("\t", result)
def test_removes_zero_width_characters(self):
text = "Hello\u200bWorld"
result = sanitize_text(text, log_changes=False)
self.assertNotIn("\u200b", result)
def test_normalizes_unicode(self):
# é as combining characters vs. precomposed
combining = "e\u0301" # e + combining acute
result = sanitize_text(combining, log_changes=False)
self.assertEqual(result, "\u00e9") # precomposed é
def test_cleans_pdf_ligatures(self):
text = "\ufb01nding the \ufb02ow of e\ufb00ort"  # ﬁ, ﬂ, ﬀ ligature characters
result = sanitize_text(text, log_changes=False)
self.assertIn("fi", result)
self.assertIn("fl", result)
self.assertIn("ff", result)
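The unicode-normalization assertion above matches the behavior of the stdlib's NFC normalization, which `sanitize_text` presumably applies internally:

```python
import unicodedata

combining = "e\u0301"  # 'e' plus combining acute accent: two code points
precomposed = unicodedata.normalize("NFC", combining)
print(precomposed == "\u00e9", len(precomposed))  # True 1
```

NFC collapses combining sequences into precomposed characters where one exists, which is why byte-for-byte comparisons and hash-based dedup stay stable across PDF extractors.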
class CleanPdfArtifactsTests(TestCase):
"""Tests for clean_pdf_artifacts."""
def test_replaces_smart_quotes(self):
text = "\u201cHello\u201d \u2018World\u2019"
result = clean_pdf_artifacts(text)
self.assertEqual(result, '"Hello" \'World\'')
def test_replaces_dashes(self):
text = "word\u2013word\u2014end"
result = clean_pdf_artifacts(text)
self.assertEqual(result, "word-word-end")
def test_replaces_ellipsis(self):
text = "wait\u2026"
result = clean_pdf_artifacts(text)
self.assertEqual(result, "wait...")
def test_replaces_nbsp(self):
text = "non\u00a0breaking"
result = clean_pdf_artifacts(text)
self.assertEqual(result, "non breaking")
class RemoveExcessiveWhitespaceTests(TestCase):
"""Tests for remove_excessive_whitespace."""
def test_collapses_spaces(self):
self.assertEqual(remove_excessive_whitespace("a    b"), "a b")
def test_collapses_newlines(self):
self.assertEqual(
remove_excessive_whitespace("a\n\n\n\nb"), "a\n\nb"
)
def test_strips_line_whitespace(self):
self.assertEqual(
remove_excessive_whitespace(" hello \n world "),
"hello\nworld",
)
class TruncateTextTests(TestCase):
"""Tests for truncate_text."""
def test_short_text_unchanged(self):
self.assertEqual(truncate_text("hello", 100), "hello")
def test_truncates_at_word_boundary(self):
text = "hello beautiful world"
result = truncate_text(text, 15)
self.assertTrue(result.endswith("..."))
self.assertLessEqual(len(result), 15)
def test_custom_suffix(self):
result = truncate_text("hello beautiful world", 15, suffix="…")
self.assertTrue(result.endswith("…"))
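A minimal implementation consistent with the `truncate_text` assertions above — a sketch only, since the real function lives in `library/services/text_utils.py`:

```python
def truncate_text(text, max_length, suffix="..."):
    """Truncate at a word boundary so that len(result) <= max_length."""
    if len(text) <= max_length:
        return text
    # Reserve room for the suffix, then back up to the last full word.
    cut = text[: max_length - len(suffix)]
    if " " in cut:
        cut = cut.rsplit(" ", 1)[0]
    return cut + suffix


print(truncate_text("hello beautiful world", 15))  # hello...
```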

mnemosyne/library/urls.py (new file, 56 lines)

@@ -0,0 +1,56 @@
"""
URL patterns for the library app.
Provides both custom admin views (HTML CRUD) and DRF API endpoints.
"""
from django.urls import include, path
from . import views
app_name = "library"
urlpatterns = [
# Embedding Pipeline Dashboard
path("embedding/", views.embedding_dashboard, name="embedding-dashboard"),
path("embedding/embed-all/", views.embed_all_pending, name="embed-all-pending"),
# Library CRUD
path("", views.library_list, name="library-list"),
path("create/", views.library_create, name="library-create"),
path("<str:uid>/", views.library_detail, name="library-detail"),
path("<str:uid>/edit/", views.library_edit, name="library-edit"),
path("<str:uid>/delete/", views.library_delete, name="library-delete"),
# Collection CRUD
path(
"<str:library_uid>/collections/create/",
views.collection_create,
name="collection-create",
),
path(
"collections/<str:uid>/",
views.collection_detail,
name="collection-detail",
),
path(
"collections/<str:uid>/edit/",
views.collection_edit,
name="collection-edit",
),
path(
"collections/<str:uid>/delete/",
views.collection_delete,
name="collection-delete",
),
# Item CRUD
path(
"collections/<str:collection_uid>/items/create/",
views.item_create,
name="item-create",
),
path("items/<str:uid>/", views.item_detail, name="item-detail"),
path("items/<str:uid>/edit/", views.item_edit, name="item-edit"),
path("items/<str:uid>/reembed/", views.item_reembed, name="item-reembed"),
path("items/<str:uid>/delete/", views.item_delete, name="item-delete"),
# DRF API
path("api/", include("library.api.urls")),
]


@@ -0,0 +1,23 @@
"""
Utility helpers for the library app.
"""
import logging
logger = logging.getLogger(__name__)
def neo4j_available():
"""
Check whether Neo4j is reachable.
Returns True if a simple Cypher query succeeds, False otherwise.
Used to guard views/tests that require Neo4j.
"""
try:
from neomodel import db
db.cypher_query("RETURN 1")
return True
except Exception:
return False
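A typical use of this guard is skipping graph-backed test classes when Neo4j is down. A self-contained sketch of the pattern — `GraphBackedTests` is hypothetical, and a stub replaces the real helper so nothing here touches a database:

```python
import unittest


def neo4j_available():
    # Stub standing in for library.utils.neo4j_available; the real helper
    # runs "RETURN 1" against Neo4j. No database runs in this sketch.
    return False


@unittest.skipUnless(neo4j_available(), "Neo4j is not reachable")
class GraphBackedTests(unittest.TestCase):
    def test_query(self):
        pass  # would issue Cypher queries against the live graph
```

`skipUnless` evaluates the condition once at class-decoration time, so the whole class is marked skipped before any fixture setup runs.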

mnemosyne/library/views.py (new file, 587 lines)

@@ -0,0 +1,587 @@
"""
Custom admin views for Library, Collection, and Item CRUD.
Since neomodel StructuredNodes cannot use Django's standard ModelAdmin,
these FBVs provide CRUD operations rendered within Themis's template structure.
All views require login.
"""
import hashlib
import logging
import os
from django.contrib import messages
from django.contrib.auth.decorators import login_required
from django.core.files.base import ContentFile
from django.core.files.storage import default_storage
from django.shortcuts import redirect, render
from .content_types import get_library_type_config
from .forms import CollectionForm, ItemForm, LibraryForm
from .utils import neo4j_available
logger = logging.getLogger(__name__)
# ---------------------------------------------------------------------------
# Library views
# ---------------------------------------------------------------------------
@login_required
def library_list(request):
"""List all libraries."""
libraries = []
error = None
if neo4j_available():
try:
from .models import Library
libraries = Library.nodes.order_by("name")
except Exception as e:
error = f"Could not connect to Neo4j: {e}"
logger.error(error)
else:
error = "Neo4j is not available."
return render(
request,
"library/library_list.html",
{"libraries": libraries, "error": error},
)
@login_required
def library_create(request):
"""Create a new library."""
if request.method == "POST":
form = LibraryForm(request.POST)
if form.is_valid():
try:
from .models import Library
# If content-type fields are empty, populate from defaults
library_type = form.cleaned_data["library_type"]
defaults = get_library_type_config(library_type)
lib = Library(
name=form.cleaned_data["name"],
library_type=library_type,
description=form.cleaned_data.get("description", ""),
chunking_config=defaults["chunking_config"],
embedding_instruction=(
form.cleaned_data.get("embedding_instruction")
or defaults["embedding_instruction"]
),
reranker_instruction=(
form.cleaned_data.get("reranker_instruction")
or defaults["reranker_instruction"]
),
llm_context_prompt=(
form.cleaned_data.get("llm_context_prompt")
or defaults["llm_context_prompt"]
),
)
lib.save()
messages.success(request, f'Library "{lib.name}" created.')
return redirect("library:library-detail", uid=lib.uid)
except Exception as e:
messages.error(request, f"Error creating library: {e}")
else:
form = LibraryForm()
return render(request, "library/library_form.html", {"form": form, "editing": False})
@login_required
def library_detail(request, uid):
"""View library details and its collections."""
try:
from .models import Library
lib = Library.nodes.get(uid=uid)
collections = lib.collections.all()
except Exception as e:
messages.error(request, f"Library not found: {e}")
return redirect("library:library-list")
return render(
request,
"library/library_detail.html",
{"library": lib, "collections": collections},
)
@login_required
def library_edit(request, uid):
"""Edit an existing library."""
try:
from .models import Library
lib = Library.nodes.get(uid=uid)
except Exception as e:
messages.error(request, f"Library not found: {e}")
return redirect("library:library-list")
if request.method == "POST":
form = LibraryForm(request.POST)
if form.is_valid():
try:
lib.name = form.cleaned_data["name"]
lib.library_type = form.cleaned_data["library_type"]
lib.description = form.cleaned_data.get("description", "")
lib.embedding_instruction = form.cleaned_data.get(
"embedding_instruction", ""
)
lib.reranker_instruction = form.cleaned_data.get(
"reranker_instruction", ""
)
lib.llm_context_prompt = form.cleaned_data.get(
"llm_context_prompt", ""
)
lib.save()
messages.success(request, f'Library "{lib.name}" updated.')
return redirect("library:library-detail", uid=lib.uid)
except Exception as e:
messages.error(request, f"Error updating library: {e}")
else:
form = LibraryForm(
initial={
"name": lib.name,
"library_type": lib.library_type,
"description": lib.description,
"embedding_instruction": lib.embedding_instruction,
"reranker_instruction": lib.reranker_instruction,
"llm_context_prompt": lib.llm_context_prompt,
}
)
return render(
request,
"library/library_form.html",
{"form": form, "editing": True, "library": lib},
)
@login_required
def library_delete(request, uid):
"""Delete a library (and confirm)."""
try:
from .models import Library
lib = Library.nodes.get(uid=uid)
except Exception as e:
messages.error(request, f"Library not found: {e}")
return redirect("library:library-list")
if request.method == "POST":
name = lib.name
lib.delete()
messages.success(request, f'Library "{name}" deleted.')
return redirect("library:library-list")
return render(request, "library/library_confirm_delete.html", {"library": lib})
# ---------------------------------------------------------------------------
# Collection views
# ---------------------------------------------------------------------------
@login_required
def collection_create(request, library_uid):
"""Create a new collection within a library."""
try:
from .models import Library
lib = Library.nodes.get(uid=library_uid)
except Exception as e:
messages.error(request, f"Library not found: {e}")
return redirect("library:library-list")
if request.method == "POST":
form = CollectionForm(request.POST)
if form.is_valid():
try:
from .models import Collection
col = Collection(
name=form.cleaned_data["name"],
description=form.cleaned_data.get("description", ""),
)
col.save()
lib.collections.connect(col)
col.library.connect(lib)
messages.success(request, f'Collection "{col.name}" created.')
return redirect("library:collection-detail", uid=col.uid)
except Exception as e:
messages.error(request, f"Error creating collection: {e}")
else:
form = CollectionForm()
return render(
request,
"library/collection_form.html",
{"form": form, "library": lib, "editing": False},
)
@login_required
def collection_detail(request, uid):
"""View collection details and its items."""
try:
from .models import Collection
col = Collection.nodes.get(uid=uid)
items = col.items.all()
libraries = col.library.all()
library = libraries[0] if libraries else None
except Exception as e:
messages.error(request, f"Collection not found: {e}")
return redirect("library:library-list")
return render(
request,
"library/collection_detail.html",
{"collection": col, "items": items, "library": library},
)
@login_required
def collection_edit(request, uid):
"""Edit an existing collection."""
try:
from .models import Collection
col = Collection.nodes.get(uid=uid)
libraries = col.library.all()
library = libraries[0] if libraries else None
except Exception as e:
messages.error(request, f"Collection not found: {e}")
return redirect("library:library-list")
if request.method == "POST":
form = CollectionForm(request.POST)
if form.is_valid():
try:
col.name = form.cleaned_data["name"]
col.description = form.cleaned_data.get("description", "")
col.save()
messages.success(request, f'Collection "{col.name}" updated.')
return redirect("library:collection-detail", uid=col.uid)
except Exception as e:
messages.error(request, f"Error updating collection: {e}")
else:
form = CollectionForm(
initial={
"name": col.name,
"description": col.description,
}
)
return render(
request,
"library/collection_form.html",
{"form": form, "collection": col, "library": library, "editing": True},
)
@login_required
def collection_delete(request, uid):
"""Delete a collection."""
try:
from .models import Collection
col = Collection.nodes.get(uid=uid)
except Exception as e:
messages.error(request, f"Collection not found: {e}")
return redirect("library:library-list")
if request.method == "POST":
name = col.name
col.delete()
messages.success(request, f'Collection "{name}" deleted.')
return redirect("library:library-list")
return render(
request,
"library/collection_confirm_delete.html",
{"collection": col},
)
# ---------------------------------------------------------------------------
# Item views
# ---------------------------------------------------------------------------
@login_required
def item_create(request, collection_uid):
"""Create a new item within a collection, with optional file upload."""
try:
from .models import Collection
col = Collection.nodes.get(uid=collection_uid)
libraries = col.library.all()
library = libraries[0] if libraries else None
except Exception as e:
messages.error(request, f"Collection not found: {e}")
return redirect("library:library-list")
if request.method == "POST":
form = ItemForm(request.POST, request.FILES)
if form.is_valid():
try:
from .models import Item
uploaded_file = request.FILES.get("file")
file_type = form.cleaned_data.get("file_type", "")
# Infer file_type from upload if not explicitly set
if uploaded_file and not file_type:
_, ext = os.path.splitext(uploaded_file.name)
file_type = ext.lstrip(".").lower()
item = Item(
title=form.cleaned_data["title"],
item_type=form.cleaned_data.get("item_type", ""),
file_type=file_type,
embedding_status="pending",
)
# Handle file upload
if uploaded_file:
file_data = uploaded_file.read()
item.file_size = len(file_data)
item.content_hash = hashlib.sha256(file_data).hexdigest()
item.save()
# Store in S3
s3_key = f"items/{item.uid}/original.{file_type}"
default_storage.save(s3_key, ContentFile(file_data))
item.s3_key = s3_key
item.save()
else:
item.save()
col.items.connect(item)
# Auto-trigger embedding if file uploaded and checkbox set
auto_embed = form.cleaned_data.get("auto_embed", True)
if uploaded_file and auto_embed:
try:
from .tasks import embed_item
task = embed_item.delay(item.uid, request.user.id)
messages.info(
request,
f"Embedding queued (task: {task.id})",
)
except Exception as exc:
logger.warning("Failed to queue embedding: %s", exc)
messages.success(request, f'Item "{item.title}" created.')
return redirect("library:item-detail", uid=item.uid)
except Exception as e:
messages.error(request, f"Error creating item: {e}")
else:
form = ItemForm(initial={"auto_embed": True})
return render(
request,
"library/item_form.html",
{"form": form, "collection": col, "library": library, "editing": False},
)
@login_required
def item_detail(request, uid):
"""View item details."""
try:
from .models import Item
item = Item.nodes.get(uid=uid)
chunks = item.chunks.all()
images = item.images.all()
concepts = item.concepts.all()
except Exception as e:
messages.error(request, f"Item not found: {e}")
return redirect("library:library-list")
return render(
request,
"library/item_detail.html",
{"item": item, "chunks": chunks, "images": images, "concepts": concepts},
)
@login_required
def item_edit(request, uid):
"""Edit an existing item."""
try:
from .models import Item
item = Item.nodes.get(uid=uid)
except Exception as e:
messages.error(request, f"Item not found: {e}")
return redirect("library:library-list")
if request.method == "POST":
form = ItemForm(request.POST)
if form.is_valid():
try:
item.title = form.cleaned_data["title"]
item.item_type = form.cleaned_data.get("item_type", "")
item.file_type = form.cleaned_data.get("file_type", "")
item.save()
messages.success(request, f'Item "{item.title}" updated.')
return redirect("library:item-detail", uid=item.uid)
except Exception as e:
messages.error(request, f"Error updating item: {e}")
else:
form = ItemForm(
initial={
"title": item.title,
"item_type": item.item_type,
"file_type": item.file_type,
}
)
return render(
request,
"library/item_form.html",
{"form": form, "item": item, "editing": True},
)
@login_required
def item_reembed(request, uid):
"""Trigger re-embedding for an item."""
try:
from .models import Item
item = Item.nodes.get(uid=uid)
except Exception as e:
messages.error(request, f"Item not found: {e}")
return redirect("library:library-list")
if request.method == "POST":
try:
from .tasks import reembed_item
task = reembed_item.delay(uid, request.user.id)
messages.info(request, f"Re-embedding queued for \"{item.title}\" (task: {task.id})")
except Exception as exc:
messages.error(request, f"Failed to queue re-embedding: {exc}")
return redirect("library:item-detail", uid=uid)
return redirect("library:item-detail", uid=uid)
@login_required
def item_delete(request, uid):
"""Delete an item."""
try:
from .models import Item
item = Item.nodes.get(uid=uid)
except Exception as e:
messages.error(request, f"Item not found: {e}")
return redirect("library:library-list")
if request.method == "POST":
title = item.title
item.delete()
messages.success(request, f'Item "{title}" deleted.')
return redirect("library:library-list")
return render(request, "library/item_confirm_delete.html", {"item": item})
# ---------------------------------------------------------------------------
# Embedding Pipeline Dashboard
# ---------------------------------------------------------------------------
@login_required
def embedding_dashboard(request):
"""
Embedding pipeline dashboard — system model status, item embedding
progress, knowledge graph node counts, and batch actions.
"""
context = {
"system_embedding_model": None,
"system_chat_model": None,
"system_reranker_model": None,
"status_counts": {},
"node_counts": {},
"total_items": 0,
"embedded_chunks": 0,
"total_chunks": 0,
"neo4j_available": False,
}
# Get system models from LLM Manager
try:
from llm_manager.models import LLMModel
context["system_embedding_model"] = LLMModel.get_system_embedding_model()
context["system_chat_model"] = LLMModel.get_system_chat_model()
context["system_reranker_model"] = LLMModel.get_system_reranker_model()
except Exception as exc:
logger.warning("Could not load system models: %s", exc)
# Get item status counts and node counts from Neo4j
if neo4j_available():
context["neo4j_available"] = True
try:
from neomodel import db
for status in ["pending", "processing", "completed", "failed"]:
results, _ = db.cypher_query(
"MATCH (i:Item {embedding_status: $status}) RETURN count(i)",
{"status": status},
)
context["status_counts"][status] = results[0][0] if results else 0
results, _ = db.cypher_query("MATCH (i:Item) RETURN count(i)")
context["total_items"] = results[0][0] if results else 0
for label in ["Library", "Collection", "Item", "Chunk", "Concept", "Image", "ImageEmbedding"]:
results, _ = db.cypher_query(f"MATCH (n:{label}) RETURN count(n)")
context["node_counts"][label] = results[0][0] if results else 0
results, _ = db.cypher_query(
"MATCH (c:Chunk) WHERE c.embedding IS NOT NULL RETURN count(c)"
)
context["embedded_chunks"] = results[0][0] if results else 0
context["total_chunks"] = context["node_counts"].get("Chunk", 0)
except Exception as exc:
logger.warning("Could not query Neo4j for dashboard: %s", exc)
messages.warning(request, f"Neo4j query error: {exc}")
return render(request, "library/embedding_dashboard.html", context)
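The per-status loop above issues one Cypher round trip per status; a single grouped query could return every count at once. A sketch of the aggregation, with a literal `rows` list standing in for the `db.cypher_query` result (the query text is an assumption, not taken from the codebase):

```python
# rows stands in for the result of a hypothetical grouped query such as:
#   db.cypher_query("MATCH (i:Item) RETURN i.embedding_status AS s, count(i) AS n")
rows = [["pending", 3], ["processing", 1], ["completed", 12]]
status_counts = {status: n for status, n in rows}
# Statuses with no items are simply absent from the result; default to 0.
print(status_counts.get("failed", 0), status_counts["completed"])  # 0 12
```

One round trip instead of four matters little on localhost but adds up when Neo4j sits behind a network hop.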
@login_required
def embed_all_pending(request):
"""
Trigger embedding for all pending items with uploaded files.
POST-only action, redirects back to dashboard.
"""
if request.method != "POST":
return redirect("library:embedding-dashboard")
try:
from neomodel import db
results, _ = db.cypher_query(
"MATCH (i:Item {embedding_status: 'pending'}) "
"WHERE i.s3_key IS NOT NULL AND i.s3_key <> '' "
"RETURN i.uid"
)
item_uids = [row[0] for row in results]
if not item_uids:
messages.info(request, "No pending items with files to embed.")
else:
from .tasks import batch_embed_items
task = batch_embed_items.delay(item_uids, request.user.id)
messages.success(
request,
f"Queued embedding for {len(item_uids)} items (task: {task.id})",
)
except Exception as exc:
logger.error("Failed to trigger batch embedding: %s", exc, exc_info=True)
messages.error(request, f"Failed to trigger embedding: {exc}")
return redirect("library:embedding-dashboard")


@@ -0,0 +1 @@
default_app_config = "llm_manager.apps.LLMManagerConfig"


@@ -0,0 +1,326 @@
"""
Admin configuration for LLM Manager — ported from Spelunker.
Adds system model actions for embedding, chat, and reranker models.
"""
from django.contrib import admin, messages
from django.db import transaction
from django.utils.html import format_html
from .models import LLMApi, LLMModel, LLMUsage
from .services import test_llm_api
@admin.register(LLMApi)
class LLMApiAdmin(admin.ModelAdmin):
list_display = (
"name",
"api_type",
"base_url",
"is_active",
"last_test_status",
"last_tested_at",
"supports_streaming",
"timeout_seconds",
"created_at",
)
list_filter = ("api_type", "is_active", "last_test_status", "supports_streaming")
search_fields = ("name", "base_url")
readonly_fields = (
"created_at",
"updated_at",
"last_tested_at",
"last_test_status",
"last_test_message",
)
actions = ["test_api_connection"]
fieldsets = (
("API Info", {"fields": ("name", "api_type", "base_url", "is_active")}),
("Security", {"fields": ("api_key",)}),
(
"Advanced",
{"fields": ("supports_streaming", "timeout_seconds", "max_retries", "created_by")},
),
(
"Test Status",
{"fields": ("last_tested_at", "last_test_status", "last_test_message")},
),
("Timestamps", {"fields": ("created_at", "updated_at")}),
)
def test_api_connection(self, request, queryset):
"""Test selected LLM API(s) and discover models."""
success_count = 0
failed_count = 0
total_added = 0
total_updated = 0
total_deactivated = 0
for api in queryset:
result = test_llm_api(api)
if result["success"]:
success_count += 1
total_added += result["models_added"]
total_updated += result["models_updated"]
total_deactivated += result["models_deactivated"]
self.message_user(request, f"{api.name}: {result['message']}", messages.SUCCESS)
else:
failed_count += 1
self.message_user(request, f"{api.name}: {result['error']}", messages.ERROR)
if success_count > 0:
summary = (
f"Tested {success_count} API(s). "
f"Total: {total_added} added, {total_updated} updated, "
f"{total_deactivated} deactivated."
)
self.message_user(request, summary, messages.SUCCESS)
if failed_count > 0:
self.message_user(
request,
f"Failed to test {failed_count} API(s). Check logs.",
messages.WARNING,
)
test_api_connection.short_description = "Test API Connection and Discover Models"
@admin.register(LLMModel)
class LLMModelAdmin(admin.ModelAdmin):
list_display = (
"name",
"api",
"model_type",
"vector_dimensions_display",
"context_window",
"input_cost_per_1k",
"system_embedding_badge",
"system_chat_badge",
"system_reranker_badge",
"is_active",
"created_at",
)
list_filter = (
"api",
"model_type",
"supports_cache",
"supports_vision",
"supports_multimodal",
"is_active",
"is_system_embedding_model",
"is_system_chat_model",
"is_system_reranker_model",
)
search_fields = ("name", "display_name", "api__name")
readonly_fields = (
"created_at",
"updated_at",
"is_system_embedding_model",
"is_system_chat_model",
"is_system_reranker_model",
)
actions = [
"set_as_system_embedding_model",
"set_as_system_chat_model",
"set_as_system_reranker_model",
]
fieldsets = (
("Model Info", {"fields": ("api", "name", "display_name", "model_type", "is_active")}),
(
"System Defaults",
{
"fields": (
"is_system_embedding_model",
"is_system_chat_model",
"is_system_reranker_model",
),
"classes": ("collapse",),
"description": (
"System default models are set via admin actions. "
"Only one model per type can be system default."
),
},
),
(
"Capabilities",
{
"fields": (
"context_window",
"max_output_tokens",
"vector_dimensions",
"supports_cache",
"supports_vision",
"supports_multimodal",
"supports_function_calling",
"supports_json_mode",
),
},
),
(
"Pricing",
{"fields": ("input_cost_per_1k", "output_cost_per_1k", "cached_cost_per_1k")},
),
("Timestamps", {"fields": ("created_at", "updated_at")}),
)
def vector_dimensions_display(self, obj):
if obj.model_type in ("embedding", "multimodal_embed") and obj.vector_dimensions:
return format_html(
'<span style="color: #0066cc; font-weight: bold;">{}</span>',
obj.vector_dimensions,
)
elif obj.model_type in ("embedding", "multimodal_embed"):
return format_html('<span style="color: #999;">Not set</span>')
return "-"
vector_dimensions_display.short_description = "Dimensions"
def system_embedding_badge(self, obj):
if obj.is_system_embedding_model and obj.model_type in ("embedding", "multimodal_embed"):
return format_html(
'<span style="background:#28a745;color:white;padding:3px 8px;'
'border-radius:3px;font-weight:bold;">SYSTEM DEFAULT</span>'
)
return ""
system_embedding_badge.short_description = "Embed Default"
def system_chat_badge(self, obj):
if obj.is_system_chat_model and obj.model_type == "chat":
return format_html(
'<span style="background:#007bff;color:white;padding:3px 8px;'
'border-radius:3px;font-weight:bold;">SYSTEM DEFAULT</span>'
)
return ""
system_chat_badge.short_description = "Chat Default"
def system_reranker_badge(self, obj):
if obj.is_system_reranker_model and obj.model_type == "reranker":
return format_html(
'<span style="background:#fd7e14;color:white;padding:3px 8px;'
'border-radius:3px;font-weight:bold;">SYSTEM DEFAULT</span>'
)
return ""
system_reranker_badge.short_description = "Reranker Default"
# --- System model actions -----------------------------------------------
def _set_system_model(self, request, queryset, model_type, field_name, label):
"""Generic helper for set-as-system-model admin actions."""
if queryset.count() != 1:
self.message_user(
request,
f"Please select exactly ONE model to set as system {label}.",
messages.ERROR,
)
return
new_model = queryset.first()
valid_types = [model_type]
if model_type == "embedding":
valid_types = ["embedding", "multimodal_embed"]
if new_model.model_type not in valid_types:
self.message_user(
request,
f'Only {label} models can be set as system {label}. '
f'"{new_model.name}" is type: {new_model.model_type}',
messages.ERROR,
)
return
if not new_model.is_active:
self.message_user(
request,
f'Cannot set inactive model "{new_model.name}" as system {label}.',
messages.ERROR,
)
return
with transaction.atomic():
LLMModel.objects.filter(**{field_name: True}).update(**{field_name: False})
setattr(new_model, field_name, True)
new_model.save(update_fields=[field_name])
self.message_user(
request,
f"{new_model.api.name}: {new_model.name} is now the system {label}.",
messages.SUCCESS,
)
def set_as_system_embedding_model(self, request, queryset):
self._set_system_model(request, queryset, "embedding", "is_system_embedding_model", "embedding model")
set_as_system_embedding_model.short_description = "Set as System Embedding Model"
def set_as_system_chat_model(self, request, queryset):
self._set_system_model(request, queryset, "chat", "is_system_chat_model", "chat model")
set_as_system_chat_model.short_description = "Set as System Chat Model"
def set_as_system_reranker_model(self, request, queryset):
self._set_system_model(request, queryset, "reranker", "is_system_reranker_model", "reranker model")
set_as_system_reranker_model.short_description = "Set as System Reranker Model"
    def save_model(self, request, obj, form, change):
        """Ensure only ONE model per type is marked as system default."""
        field_valid_types = {
            "is_system_embedding_model": ("embedding", "multimodal_embed"),
            "is_system_chat_model": ("chat",),
            "is_system_reranker_model": ("reranker",),
        }
        for field, valid_types in field_valid_types.items():
            if not getattr(obj, field, False):
                continue
            if obj.model_type in valid_types:
                LLMModel.objects.filter(**{field: True}).exclude(pk=obj.pk).update(**{field: False})
                self.message_user(
                    request,
                    f"{obj.name} is now the system-wide {obj.model_type} model.",
                    messages.SUCCESS,
                )
            else:
                # Flag does not apply to this model type; clear it silently.
                setattr(obj, field, False)
        super().save_model(request, obj, form, change)
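The clear-then-set pattern that `_set_system_model` runs inside a transaction can be sketched without Django, using plain dicts as stand-in model rows (the model names here are only illustrative):

```python
def set_system_chat_model(rows, name):
    # Clear the flag on every row, then set it on the chosen one; this is
    # the invariant _set_system_model enforces with
    # filter(is_system_chat_model=True).update(...) followed by save().
    for row in rows:
        row["is_system_chat_model"] = (row["name"] == name)

rows = [
    {"name": "gpt-4o", "is_system_chat_model": True},
    {"name": "gpt-4o-mini", "is_system_chat_model": False},
]
set_system_chat_model(rows, "gpt-4o-mini")
assert sum(r["is_system_chat_model"] for r in rows) == 1
```

Doing the clear and the set in one atomic block is what guarantees at most one flagged row survives, even if the action is triggered concurrently.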
@admin.register(LLMUsage)
class LLMUsageAdmin(admin.ModelAdmin):
list_display = (
"timestamp",
"user",
"model",
"input_tokens",
"output_tokens",
"cached_tokens",
"total_cost",
"session_id",
"purpose",
)
list_filter = ("model", "purpose", "timestamp")
search_fields = ("user__username", "session_id", "model__name")
readonly_fields = (
"user",
"model",
"timestamp",
"input_tokens",
"output_tokens",
"cached_tokens",
"total_cost",
"session_id",
"purpose",
"request_metadata",
)
date_hierarchy = "timestamp"
def has_add_permission(self, request):
return False
def has_change_permission(self, request, obj=None):
return False


@@ -0,0 +1,105 @@
"""
DRF serializers for LLM Manager.
"""
from rest_framework import serializers
from ..models import LLMApi, LLMModel, LLMUsage
class LLMApiSerializer(serializers.ModelSerializer):
model_count = serializers.SerializerMethodField()
class Meta:
model = LLMApi
fields = [
"id",
"name",
"api_type",
"base_url",
"is_active",
"supports_streaming",
"timeout_seconds",
"max_retries",
"last_tested_at",
"last_test_status",
"model_count",
"created_at",
"updated_at",
]
read_only_fields = [
"id",
"last_tested_at",
"last_test_status",
"created_at",
"updated_at",
]
def get_model_count(self, obj):
return obj.models.filter(is_active=True).count()
class LLMModelSerializer(serializers.ModelSerializer):
api_name = serializers.CharField(source="api.name", read_only=True)
class Meta:
model = LLMModel
fields = [
"id",
"api",
"api_name",
"name",
"display_name",
"model_type",
"context_window",
"max_output_tokens",
"vector_dimensions",
"supports_cache",
"supports_vision",
"supports_multimodal",
"supports_function_calling",
"supports_json_mode",
"input_cost_per_1k",
"output_cost_per_1k",
"cached_cost_per_1k",
"is_active",
"is_system_embedding_model",
"is_system_chat_model",
"is_system_reranker_model",
"created_at",
"updated_at",
]
read_only_fields = [
"id",
"is_system_embedding_model",
"is_system_chat_model",
"is_system_reranker_model",
"created_at",
"updated_at",
]
class LLMUsageSerializer(serializers.ModelSerializer):
model_name = serializers.CharField(source="model.name", read_only=True)
api_name = serializers.CharField(source="model.api.name", read_only=True)
username = serializers.CharField(source="user.username", read_only=True)
class Meta:
model = LLMUsage
fields = [
"id",
"user",
"username",
"model",
"model_name",
"api_name",
"timestamp",
"input_tokens",
"output_tokens",
"cached_tokens",
"total_cost",
"session_id",
"purpose",
"request_metadata",
]
read_only_fields = ["id", "timestamp", "total_cost"]


@@ -0,0 +1,18 @@
"""
DRF API URL patterns for LLM Manager.
"""
from django.urls import path
from . import views
app_name = "llm-manager-api"
urlpatterns = [
path("apis/", views.api_list, name="api_list"),
path("apis/<uuid:pk>/", views.api_detail, name="api_detail"),
path("models/", views.model_list, name="model_list"),
path("models/<uuid:pk>/", views.model_detail, name="model_detail"),
path("models/system/", views.system_models, name="system_models"),
path("usage/", views.usage_list, name="usage_list"),
]


@@ -0,0 +1,100 @@
"""
DRF API views for LLM Manager — FBVs per Red Panda Standards.
"""
from rest_framework import status
from rest_framework.decorators import api_view, permission_classes
from rest_framework.permissions import IsAuthenticated
from rest_framework.response import Response
from ..models import LLMApi, LLMModel, LLMUsage
from .serializers import LLMApiSerializer, LLMModelSerializer, LLMUsageSerializer
@api_view(["GET"])
@permission_classes([IsAuthenticated])
def api_list(request):
"""List all LLM APIs."""
apis = LLMApi.objects.all().order_by("name")
serializer = LLMApiSerializer(apis, many=True)
return Response(serializer.data)
@api_view(["GET"])
@permission_classes([IsAuthenticated])
def api_detail(request, pk):
"""Get a specific LLM API."""
try:
api = LLMApi.objects.get(pk=pk)
except LLMApi.DoesNotExist:
return Response({"error": "Not found"}, status=status.HTTP_404_NOT_FOUND)
serializer = LLMApiSerializer(api)
return Response(serializer.data)
@api_view(["GET"])
@permission_classes([IsAuthenticated])
def model_list(request):
"""List all LLM Models, optionally filtered by API or type."""
qs = LLMModel.objects.select_related("api").order_by("api__name", "name")
api_id = request.query_params.get("api")
model_type = request.query_params.get("type")
active_only = request.query_params.get("active", "").lower() in ("1", "true")
if api_id:
qs = qs.filter(api_id=api_id)
if model_type:
qs = qs.filter(model_type=model_type)
if active_only:
qs = qs.filter(is_active=True)
serializer = LLMModelSerializer(qs, many=True)
return Response(serializer.data)
@api_view(["GET"])
@permission_classes([IsAuthenticated])
def model_detail(request, pk):
"""Get a specific LLM Model."""
try:
model = LLMModel.objects.select_related("api").get(pk=pk)
except LLMModel.DoesNotExist:
return Response({"error": "Not found"}, status=status.HTTP_404_NOT_FOUND)
serializer = LLMModelSerializer(model)
return Response(serializer.data)
@api_view(["GET"])
@permission_classes([IsAuthenticated])
def system_models(request):
"""Get the current system default models."""
data = {}
embed = LLMModel.get_system_embedding_model()
chat = LLMModel.get_system_chat_model()
reranker = LLMModel.get_system_reranker_model()
if embed:
data["embedding"] = LLMModelSerializer(embed).data
if chat:
data["chat"] = LLMModelSerializer(chat).data
if reranker:
data["reranker"] = LLMModelSerializer(reranker).data
return Response(data)
@api_view(["GET", "POST"])
@permission_classes([IsAuthenticated])
def usage_list(request):
"""List usage records for current user, or create a new usage record."""
if request.method == "GET":
qs = (
LLMUsage.objects.filter(user=request.user)
.select_related("model", "model__api")
.order_by("-timestamp")[:100]
)
serializer = LLMUsageSerializer(qs, many=True)
return Response(serializer.data)
# POST — create a usage record
serializer = LLMUsageSerializer(data=request.data)
if serializer.is_valid():
serializer.save(user=request.user)
return Response(serializer.data, status=status.HTTP_201_CREATED)
return Response(serializer.errors, status=status.HTTP_400_BAD_REQUEST)
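The `?active=` filter in `model_list` treats only `1` and `true` (case-insensitive) as truthy; everything else, including a missing parameter, leaves the filter off. A standalone sketch of that rule:

```python
def parse_active_flag(value):
    # Mirrors model_list: request.query_params.get("active", "").lower()
    # compared against ("1", "true"); None is coerced to "" first.
    return (value or "").lower() in ("1", "true")

assert parse_active_flag("TRUE") is True
assert parse_active_flag("1") is True
assert parse_active_flag("0") is False
assert parse_active_flag(None) is False
```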


@@ -0,0 +1,7 @@
from django.apps import AppConfig
class LLMManagerConfig(AppConfig):
default_auto_field = "django.db.models.BigAutoField"
name = "llm_manager"
verbose_name = "LLM Manager"


@@ -0,0 +1,65 @@
"""
Fernet encryption field for LLM API keys.
Uses LLM_API_SECRETS_ENCRYPTION_KEY from settings if available,
otherwise derives a key from Django's SECRET_KEY (Themis pattern).
"""
import base64
import hashlib
from cryptography.fernet import Fernet, InvalidToken
from django.conf import settings
from django.db import models
def _get_fernet():
"""
Get a Fernet cipher using the configured encryption key.
Checks for LLM_API_SECRETS_ENCRYPTION_KEY first, then falls
back to deriving a key from SECRET_KEY (Themis pattern).
"""
key = getattr(settings, "LLM_API_SECRETS_ENCRYPTION_KEY", None)
if key:
return Fernet(key.encode() if isinstance(key, str) else key)
# Fallback: derive from SECRET_KEY like Themis
secret = settings.SECRET_KEY.encode("utf-8")
digest = hashlib.sha256(secret).digest()
derived_key = base64.urlsafe_b64encode(digest)
return Fernet(derived_key)
class EncryptedCharField(models.CharField):
"""
CharField that transparently encrypts/decrypts values using Fernet.
Values are encrypted before saving to the database and decrypted
when read. Supports blank/null values gracefully.
"""
description = "Encrypted CharField for storing API secrets"
def get_prep_value(self, value):
"""Encrypt before saving to DB."""
if value is None or value == "":
return value
cipher = _get_fernet()
encrypted = cipher.encrypt(value.encode("utf-8"))
return base64.b64encode(encrypted).decode("utf-8")
def from_db_value(self, value, expression, connection):
"""Decrypt when loading from DB."""
if value is None or value == "":
return value
        try:
            cipher = _get_fernet()
            encrypted = base64.b64decode(value)
            return cipher.decrypt(encrypted).decode("utf-8")
        except (InvalidToken, ValueError, TypeError):
            # Not valid Fernet ciphertext; likely a legacy plaintext value.
            return value
def to_python(self, value):
if isinstance(value, str) or value is None:
return value
return str(value)
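The SECRET_KEY fallback in `_get_fernet` relies on Fernet accepting exactly a 32-byte key encoded as 44 urlsafe-base64 characters. The derivation itself can be checked with the standard library alone (the secret below is a made-up example, not a real key):

```python
import base64
import hashlib

def derive_fernet_key(secret: str) -> bytes:
    # Same derivation as _get_fernet's fallback: SHA-256 digest of the
    # secret, urlsafe-base64 encoded to the key format Fernet expects.
    digest = hashlib.sha256(secret.encode("utf-8")).digest()
    return base64.urlsafe_b64encode(digest)

key = derive_fernet_key("django-insecure-example")
assert len(key) == 44  # 32 raw bytes encode to 44 base64 characters
```

Because the derivation is deterministic, rotating SECRET_KEY silently changes the Fernet key, which is why `from_db_value` falls back to returning the raw value when decryption fails.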


@@ -0,0 +1,82 @@
"""
Forms for LLM Manager — DaisyUI-styled widgets.
"""
from django import forms
from .models import LLMApi, LLMModel
class LLMApiForm(forms.ModelForm):
class Meta:
model = LLMApi
fields = [
"name",
"api_type",
"base_url",
"api_key",
"is_active",
"supports_streaming",
"timeout_seconds",
"max_retries",
]
widgets = {
"name": forms.TextInput(attrs={"class": "input input-bordered w-full"}),
"api_type": forms.Select(attrs={"class": "select select-bordered w-full"}),
"base_url": forms.URLInput(attrs={"class": "input input-bordered w-full"}),
"api_key": forms.PasswordInput(
attrs={"class": "input input-bordered w-full", "autocomplete": "off"},
render_value=True,
),
"is_active": forms.CheckboxInput(attrs={"class": "toggle toggle-primary"}),
"supports_streaming": forms.CheckboxInput(attrs={"class": "toggle toggle-primary"}),
"timeout_seconds": forms.NumberInput(attrs={"class": "input input-bordered w-full"}),
"max_retries": forms.NumberInput(attrs={"class": "input input-bordered w-full"}),
}
class LLMModelForm(forms.ModelForm):
class Meta:
model = LLMModel
fields = [
"api",
"name",
"display_name",
"model_type",
"context_window",
"max_output_tokens",
"vector_dimensions",
"supports_cache",
"supports_vision",
"supports_multimodal",
"supports_function_calling",
"supports_json_mode",
"input_cost_per_1k",
"output_cost_per_1k",
"cached_cost_per_1k",
"is_active",
]
widgets = {
"api": forms.Select(attrs={"class": "select select-bordered w-full"}),
"name": forms.TextInput(attrs={"class": "input input-bordered w-full"}),
"display_name": forms.TextInput(attrs={"class": "input input-bordered w-full"}),
"model_type": forms.Select(attrs={"class": "select select-bordered w-full"}),
"context_window": forms.NumberInput(attrs={"class": "input input-bordered w-full"}),
"max_output_tokens": forms.NumberInput(attrs={"class": "input input-bordered w-full"}),
"vector_dimensions": forms.NumberInput(attrs={"class": "input input-bordered w-full"}),
"supports_cache": forms.CheckboxInput(attrs={"class": "toggle toggle-primary"}),
"supports_vision": forms.CheckboxInput(attrs={"class": "toggle toggle-primary"}),
"supports_multimodal": forms.CheckboxInput(attrs={"class": "toggle toggle-primary"}),
"supports_function_calling": forms.CheckboxInput(attrs={"class": "toggle toggle-primary"}),
"supports_json_mode": forms.CheckboxInput(attrs={"class": "toggle toggle-primary"}),
"input_cost_per_1k": forms.NumberInput(
attrs={"class": "input input-bordered w-full", "step": "0.000001"}
),
"output_cost_per_1k": forms.NumberInput(
attrs={"class": "input input-bordered w-full", "step": "0.000001"}
),
"cached_cost_per_1k": forms.NumberInput(
attrs={"class": "input input-bordered w-full", "step": "0.000001"}
),
"is_active": forms.CheckboxInput(attrs={"class": "toggle toggle-primary"}),
}


@@ -0,0 +1,138 @@
"""
Management command to load default LLM models for common providers.
Usage:
python manage.py load_default_llm_models
python manage.py load_default_llm_models --force # update existing models
"""
from decimal import Decimal
from django.core.management.base import BaseCommand
from llm_manager.models import LLMApi, LLMModel
DEFAULT_APIS = [
{
"name": "OpenAI",
"api_type": "openai",
"base_url": "https://api.openai.com/v1",
},
]
DEFAULT_MODELS = [
# ── Chat models ────────────────────────────────────────────────────
{
"api_name": "OpenAI",
"name": "gpt-4o",
"display_name": "GPT-4o",
"model_type": "chat",
"context_window": 128000,
"max_output_tokens": 16384,
"input_cost_per_1k": Decimal("0.0025"),
"output_cost_per_1k": Decimal("0.01"),
"supports_cache": True,
"cached_cost_per_1k": Decimal("0.00125"),
"supports_vision": True,
"supports_function_calling": True,
"supports_json_mode": True,
},
{
"api_name": "OpenAI",
"name": "gpt-4o-mini",
"display_name": "GPT-4o Mini",
"model_type": "chat",
"context_window": 128000,
"max_output_tokens": 16384,
"input_cost_per_1k": Decimal("0.00015"),
"output_cost_per_1k": Decimal("0.0006"),
"supports_cache": True,
"cached_cost_per_1k": Decimal("0.000075"),
"supports_vision": True,
"supports_function_calling": True,
"supports_json_mode": True,
},
# ── Embedding models ───────────────────────────────────────────────
{
"api_name": "OpenAI",
"name": "text-embedding-3-large",
"display_name": "Text Embedding 3 Large",
"model_type": "embedding",
"context_window": 8191,
"vector_dimensions": 3072,
"input_cost_per_1k": Decimal("0.00013"),
"output_cost_per_1k": Decimal("0"),
},
{
"api_name": "OpenAI",
"name": "text-embedding-3-small",
"display_name": "Text Embedding 3 Small",
"model_type": "embedding",
"context_window": 8191,
"vector_dimensions": 1536,
"input_cost_per_1k": Decimal("0.00002"),
"output_cost_per_1k": Decimal("0"),
},
]
class Command(BaseCommand):
help = "Load default LLM APIs and models."
def add_arguments(self, parser):
parser.add_argument(
"--force",
action="store_true",
help="Update existing model records with defaults.",
)
def handle(self, *args, **options):
force = options["force"]
# Create default APIs
api_map = {}
for api_data in DEFAULT_APIS:
api, created = LLMApi.objects.get_or_create(
name=api_data["name"],
defaults={
"api_type": api_data["api_type"],
"base_url": api_data["base_url"],
"is_active": True,
},
)
api_map[api_data["name"]] = api
if created:
self.stdout.write(self.style.SUCCESS(f"Created API: {api.name}"))
else:
self.stdout.write(self.style.WARNING(f"API already exists: {api.name}"))
        # Create default models
        for model_data in DEFAULT_MODELS:
            data = dict(model_data)  # copy so DEFAULT_MODELS is not mutated
            api_name = data.pop("api_name")
            api = api_map.get(api_name)
            if not api:
                self.stdout.write(self.style.ERROR(f"API '{api_name}' not found, skipping model."))
                continue
            name = data.pop("name")
            model, created = LLMModel.objects.get_or_create(
                api=api,
                name=name,
                defaults=data,
            )
            if created:
                self.stdout.write(self.style.SUCCESS(f"  Created model: {model.name}"))
            elif force:
                for key, val in data.items():
                    setattr(model, key, val)
                model.save()
                self.stdout.write(self.style.SUCCESS(f"  Updated model: {model.name}"))
            else:
                self.stdout.write(self.style.WARNING(f"  Model exists: {model.name}"))
        self.stdout.write(self.style.SUCCESS("Default LLM models loaded."))


@@ -0,0 +1,130 @@
# Generated by Django 5.2.12 on 2026-03-10 16:59
import django.db.models.deletion
import llm_manager.encryption
import uuid
from decimal import Decimal
from django.conf import settings
from django.db import migrations, models
class Migration(migrations.Migration):
initial = True
dependencies = [
migrations.swappable_dependency(settings.AUTH_USER_MODEL),
]
operations = [
migrations.CreateModel(
name='LLMApi',
fields=[
('id', models.UUIDField(default=uuid.uuid4, editable=False, primary_key=True, serialize=False)),
('name', models.CharField(max_length=100, unique=True)),
('api_type', models.CharField(choices=[('openai', 'OpenAI Compatible'), ('azure', 'Azure OpenAI'), ('ollama', 'Ollama'), ('anthropic', 'Anthropic'), ('llama-cpp', 'Llama.cpp'), ('vllm', 'vLLM')], max_length=20)),
('base_url', models.URLField()),
('api_key', llm_manager.encryption.EncryptedCharField(blank=True, default='', max_length=500)),
('is_active', models.BooleanField(default=True)),
('supports_streaming', models.BooleanField(default=True)),
('timeout_seconds', models.PositiveIntegerField(default=60)),
('max_retries', models.PositiveIntegerField(default=3)),
('last_tested_at', models.DateTimeField(blank=True, help_text='Last time this API was tested', null=True)),
('last_test_status', models.CharField(choices=[('success', 'Success'), ('failed', 'Failed'), ('pending', 'Pending')], default='pending', help_text='Result of the last API test', max_length=20)),
('last_test_message', models.TextField(blank=True, help_text='Details from the last test (success message or error)')),
('created_at', models.DateTimeField(auto_now_add=True)),
('updated_at', models.DateTimeField(auto_now=True)),
('created_by', models.ForeignKey(blank=True, null=True, on_delete=django.db.models.deletion.SET_NULL, related_name='llm_apis_created', to=settings.AUTH_USER_MODEL)),
],
options={
'verbose_name': 'LLM API',
'verbose_name_plural': 'LLM APIs',
'ordering': ['name'],
},
),
migrations.CreateModel(
name='LLMModel',
fields=[
('id', models.UUIDField(default=uuid.uuid4, editable=False, primary_key=True, serialize=False)),
('name', models.CharField(max_length=100)),
('display_name', models.CharField(blank=True, max_length=200)),
('model_type', models.CharField(choices=[('chat', 'Chat/Completion'), ('embedding', 'Embedding'), ('vision', 'Vision'), ('audio', 'Audio'), ('reranker', 'Reranker'), ('multimodal_embed', 'Multimodal Embedding')], max_length=20)),
('context_window', models.PositiveIntegerField(help_text='Maximum context in tokens')),
('max_output_tokens', models.PositiveIntegerField(blank=True, null=True)),
('supports_cache', models.BooleanField(default=False)),
('supports_vision', models.BooleanField(default=False)),
('supports_function_calling', models.BooleanField(default=False)),
('supports_json_mode', models.BooleanField(default=False)),
('supports_multimodal', models.BooleanField(default=False, help_text='Flag models that accept image+text input')),
('vector_dimensions', models.PositiveIntegerField(blank=True, help_text='Embedding output dimensions (e.g., 4096)', null=True)),
('input_cost_per_1k', models.DecimalField(decimal_places=6, default=Decimal('0'), help_text='Cost per 1K input tokens in USD', max_digits=10)),
('output_cost_per_1k', models.DecimalField(decimal_places=6, default=Decimal('0'), help_text='Cost per 1K output tokens in USD', max_digits=10)),
('cached_cost_per_1k', models.DecimalField(blank=True, decimal_places=6, help_text='Cost per 1K cached tokens (if supported)', max_digits=10, null=True)),
('is_active', models.BooleanField(default=True)),
('is_system_embedding_model', models.BooleanField(default=False, help_text='Mark this as the system-wide embedding model. Only ONE embedding model should have this set to True.')),
('is_system_chat_model', models.BooleanField(default=False, help_text='Mark this as the system-wide chat model. Only ONE chat model should have this set to True.')),
('is_system_reranker_model', models.BooleanField(default=False, help_text='Mark this as the system-wide reranker model. Only ONE reranker model should have this set to True.')),
('created_at', models.DateTimeField(auto_now_add=True)),
('updated_at', models.DateTimeField(auto_now=True)),
('api', models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, related_name='models', to='llm_manager.llmapi')),
],
options={
'ordering': ['api', 'name'],
},
),
migrations.CreateModel(
name='LLMUsage',
fields=[
('id', models.UUIDField(default=uuid.uuid4, editable=False, primary_key=True, serialize=False)),
('timestamp', models.DateTimeField(auto_now_add=True, db_index=True)),
('input_tokens', models.PositiveIntegerField(default=0)),
('output_tokens', models.PositiveIntegerField(default=0)),
('cached_tokens', models.PositiveIntegerField(default=0)),
('total_cost', models.DecimalField(decimal_places=6, default=Decimal('0'), help_text='Total cost in USD', max_digits=12)),
('session_id', models.CharField(blank=True, db_index=True, max_length=100)),
('purpose', models.CharField(choices=[('responder', 'RAG Responder'), ('reviewer', 'RAG Reviewer'), ('embeddings', 'Document Embeddings'), ('search', 'Vector Search'), ('reranking', 'Re-ranking'), ('multimodal_embed', 'Multimodal Embedding'), ('other', 'Other')], db_index=True, default='other', max_length=50)),
('request_metadata', models.JSONField(blank=True, help_text='Additional context (prompt, temperature, etc.)', null=True)),
('model', models.ForeignKey(on_delete=django.db.models.deletion.PROTECT, related_name='usage_records', to='llm_manager.llmmodel')),
('user', models.ForeignKey(null=True, on_delete=django.db.models.deletion.SET_NULL, related_name='llm_usage', to=settings.AUTH_USER_MODEL)),
],
options={
'ordering': ['-timestamp'],
},
),
migrations.AddIndex(
model_name='llmmodel',
index=models.Index(fields=['api', 'model_type', 'is_active'], name='llm_manager_api_id_140af0_idx'),
),
migrations.AddIndex(
model_name='llmmodel',
index=models.Index(fields=['is_system_embedding_model', 'model_type'], name='llm_manager_is_syst_39386f_idx'),
),
migrations.AddIndex(
model_name='llmmodel',
index=models.Index(fields=['is_system_chat_model', 'model_type'], name='llm_manager_is_syst_346eb3_idx'),
),
migrations.AddIndex(
model_name='llmmodel',
index=models.Index(fields=['is_system_reranker_model', 'model_type'], name='llm_manager_is_syst_cc73c6_idx'),
),
migrations.AlterUniqueTogether(
name='llmmodel',
unique_together={('api', 'name')},
),
migrations.AddIndex(
model_name='llmusage',
index=models.Index(fields=['-timestamp', 'user'], name='llm_manager_timesta_aa66fc_idx'),
),
migrations.AddIndex(
model_name='llmusage',
index=models.Index(fields=['-timestamp', 'model'], name='llm_manager_timesta_0b5c38_idx'),
),
migrations.AddIndex(
model_name='llmusage',
index=models.Index(fields=['purpose', '-timestamp'], name='llm_manager_purpose_37c32c_idx'),
),
migrations.AddIndex(
model_name='llmusage',
index=models.Index(fields=['session_id'], name='llm_manager_session_1da37d_idx'),
),
]


@@ -0,0 +1,34 @@
"""
Add 'bedrock' to LLMApi.api_type choices.
Django migrations track field changes including choices — this migration
updates the api_type field to include the new Amazon Bedrock option.
"""
from django.db import migrations, models
class Migration(migrations.Migration):
dependencies = [
("llm_manager", "0001_initial"),
]
operations = [
migrations.AlterField(
model_name="llmapi",
name="api_type",
field=models.CharField(
choices=[
("openai", "OpenAI Compatible"),
("azure", "Azure OpenAI"),
("ollama", "Ollama"),
("anthropic", "Anthropic"),
("llama-cpp", "Llama.cpp"),
("vllm", "vLLM"),
("bedrock", "Amazon Bedrock"),
],
max_length=20,
),
),
]


@@ -0,0 +1,301 @@
"""
LLM Manager models — ported from Spelunker with Mnemosyne adaptations.
Changes from Spelunker:
- api_key uses EncryptedCharField with key derived from SECRET_KEY (Themis-style)
- LLMModel.model_type adds 'reranker' and 'multimodal_embed' choices
- LLMModel adds 'supports_multimodal' and 'vector_dimensions' fields
- LLMUsage.purpose adds Mnemosyne-specific choices
"""
import uuid
from decimal import Decimal
from django.conf import settings
from django.contrib.auth import get_user_model
from django.db import models
from .encryption import EncryptedCharField
User = get_user_model()
class LLMApi(models.Model):
"""
Represents an LLM API provider (OpenAI-compatible, Arke proxy, etc.).
API keys are stored encrypted using Fernet symmetric encryption
derived from Django's SECRET_KEY.
"""
id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
name = models.CharField(max_length=100, unique=True)
api_type = models.CharField(
max_length=20,
choices=[
("openai", "OpenAI Compatible"),
("azure", "Azure OpenAI"),
("ollama", "Ollama"),
("anthropic", "Anthropic"),
("llama-cpp", "Llama.cpp"),
("vllm", "vLLM"),
("bedrock", "Amazon Bedrock"),
],
)
base_url = models.URLField()
api_key = EncryptedCharField(max_length=500, blank=True, default="")
is_active = models.BooleanField(default=True)
supports_streaming = models.BooleanField(default=True)
timeout_seconds = models.PositiveIntegerField(default=60)
max_retries = models.PositiveIntegerField(default=3)
# Testing and validation fields
last_tested_at = models.DateTimeField(
null=True,
blank=True,
help_text="Last time this API was tested",
)
last_test_status = models.CharField(
max_length=20,
choices=[
("success", "Success"),
("failed", "Failed"),
("pending", "Pending"),
],
default="pending",
help_text="Result of the last API test",
)
last_test_message = models.TextField(
blank=True,
help_text="Details from the last test (success message or error)",
)
created_at = models.DateTimeField(auto_now_add=True)
updated_at = models.DateTimeField(auto_now=True)
created_by = models.ForeignKey(
User,
null=True,
blank=True,
on_delete=models.SET_NULL,
related_name="llm_apis_created",
)
class Meta:
ordering = ["name"]
verbose_name = "LLM API"
verbose_name_plural = "LLM APIs"
def __str__(self):
return f"{self.name} ({self.api_type})"
class LLMModel(models.Model):
"""
Represents a specific LLM model provided by an API.
Mnemosyne additions over Spelunker:
- model_type adds 'reranker' and 'multimodal_embed'
- supports_multimodal flag for image+text capable models
- vector_dimensions for embedding output size
"""
id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
api = models.ForeignKey(LLMApi, on_delete=models.CASCADE, related_name="models")
name = models.CharField(max_length=100)
display_name = models.CharField(max_length=200, blank=True)
model_type = models.CharField(
max_length=20,
choices=[
("chat", "Chat/Completion"),
("embedding", "Embedding"),
("vision", "Vision"),
("audio", "Audio"),
("reranker", "Reranker"),
("multimodal_embed", "Multimodal Embedding"),
],
)
context_window = models.PositiveIntegerField(
help_text="Maximum context in tokens"
)
max_output_tokens = models.PositiveIntegerField(null=True, blank=True)
supports_cache = models.BooleanField(default=False)
supports_vision = models.BooleanField(default=False)
supports_function_calling = models.BooleanField(default=False)
supports_json_mode = models.BooleanField(default=False)
# Mnemosyne additions
supports_multimodal = models.BooleanField(
default=False,
help_text="Flag models that accept image+text input",
)
vector_dimensions = models.PositiveIntegerField(
null=True,
blank=True,
help_text="Embedding output dimensions (e.g., 4096)",
)
# Pricing
input_cost_per_1k = models.DecimalField(
max_digits=10,
decimal_places=6,
default=Decimal("0"),
help_text="Cost per 1K input tokens in USD",
)
output_cost_per_1k = models.DecimalField(
max_digits=10,
decimal_places=6,
default=Decimal("0"),
help_text="Cost per 1K output tokens in USD",
)
cached_cost_per_1k = models.DecimalField(
max_digits=10,
decimal_places=6,
null=True,
blank=True,
help_text="Cost per 1K cached tokens (if supported)",
)
is_active = models.BooleanField(default=True)
is_system_embedding_model = models.BooleanField(
default=False,
help_text=(
"Mark this as the system-wide embedding model. "
"Only ONE embedding model should have this set to True."
),
)
is_system_chat_model = models.BooleanField(
default=False,
help_text=(
"Mark this as the system-wide chat model. "
"Only ONE chat model should have this set to True."
),
)
is_system_reranker_model = models.BooleanField(
default=False,
help_text=(
"Mark this as the system-wide reranker model. "
"Only ONE reranker model should have this set to True."
),
)
created_at = models.DateTimeField(auto_now_add=True)
updated_at = models.DateTimeField(auto_now=True)
class Meta:
ordering = ["api", "name"]
unique_together = [("api", "name")]
indexes = [
models.Index(fields=["api", "model_type", "is_active"]),
models.Index(fields=["is_system_embedding_model", "model_type"]),
models.Index(fields=["is_system_chat_model", "model_type"]),
models.Index(fields=["is_system_reranker_model", "model_type"]),
]
def __str__(self):
return f"{self.api.name}: {self.name}"
@classmethod
def get_system_embedding_model(cls):
"""Get the system-wide embedding model."""
return cls.objects.filter(
is_system_embedding_model=True,
is_active=True,
model_type__in=["embedding", "multimodal_embed"],
).first()
@classmethod
def get_system_chat_model(cls):
"""Get the system-wide chat model."""
return cls.objects.filter(
is_system_chat_model=True,
is_active=True,
model_type="chat",
).first()
@classmethod
def get_system_reranker_model(cls):
"""Get the system-wide reranker model."""
return cls.objects.filter(
is_system_reranker_model=True,
is_active=True,
model_type="reranker",
).first()
class LLMUsage(models.Model):
"""
Tracks token usage and cost for all LLM API calls.
"""
id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
user = models.ForeignKey(
User, on_delete=models.SET_NULL, null=True, related_name="llm_usage"
)
model = models.ForeignKey(
LLMModel, on_delete=models.PROTECT, related_name="usage_records"
)
timestamp = models.DateTimeField(auto_now_add=True, db_index=True)
input_tokens = models.PositiveIntegerField(default=0)
output_tokens = models.PositiveIntegerField(default=0)
cached_tokens = models.PositiveIntegerField(default=0)
total_cost = models.DecimalField(
max_digits=12,
decimal_places=6,
default=Decimal("0"),
help_text="Total cost in USD",
)
session_id = models.CharField(max_length=100, blank=True, db_index=True)
purpose = models.CharField(
max_length=50,
choices=[
("responder", "RAG Responder"),
("reviewer", "RAG Reviewer"),
("embeddings", "Document Embeddings"),
("search", "Vector Search"),
("reranking", "Re-ranking"),
("multimodal_embed", "Multimodal Embedding"),
("other", "Other"),
],
default="other",
db_index=True,
)
request_metadata = models.JSONField(
null=True,
blank=True,
help_text="Additional context (prompt, temperature, etc.)",
)
class Meta:
ordering = ["-timestamp"]
indexes = [
models.Index(fields=["-timestamp", "user"]),
models.Index(fields=["-timestamp", "model"]),
models.Index(fields=["purpose", "-timestamp"]),
models.Index(fields=["session_id"]),
]
    def save(self, *args, **kwargs):
        if not self.total_cost:
            self.total_cost = self.calculate_cost()
        super().save(*args, **kwargs)
    def calculate_cost(self):
        """Calculate cost from token usage and model pricing, in Decimal throughout."""
        thousand = Decimal("1000")
        cost = (Decimal(self.input_tokens) / thousand) * self.model.input_cost_per_1k
        cost += (Decimal(self.output_tokens) / thousand) * self.model.output_cost_per_1k
        if self.cached_tokens and self.model.cached_cost_per_1k:
            cost += (Decimal(self.cached_tokens) / thousand) * self.model.cached_cost_per_1k
        return cost
def __str__(self):
return f"{self.model.name} - {self.timestamp} - ${self.total_cost}"
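The per-1K pricing formula behind `calculate_cost` is easy to verify by hand. Using the gpt-4o-mini rates loaded by the `load_default_llm_models` command:

```python
from decimal import Decimal

def usage_cost(input_tokens, output_tokens, input_per_1k, output_per_1k):
    # Same arithmetic as LLMUsage.calculate_cost, minus the cached-token term.
    return (
        (Decimal(input_tokens) / 1000) * input_per_1k
        + (Decimal(output_tokens) / 1000) * output_per_1k
    )

# 10K input + 2K output tokens on gpt-4o-mini ($0.00015 / $0.0006 per 1K tokens):
total = usage_cost(10_000, 2_000, Decimal("0.00015"), Decimal("0.0006"))
assert total == Decimal("0.0027")
```

Keeping the whole computation in `Decimal` (rather than round-tripping through `float`) avoids binary-fraction rounding in what is ultimately money math.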


@@ -0,0 +1,275 @@
"""
Services for LLM API testing and model discovery.
Ported from Spelunker with Mnemosyne adaptations.
"""
import logging
from django.db import transaction
from django.utils import timezone
logger = logging.getLogger(__name__)
def test_llm_api(api):
"""
Test an LLM API connection and discover available models.
:param api: LLMApi instance to test.
:returns: dict with success, models_added/updated/deactivated, message/error.
"""
from .models import LLMModel
result = {
"success": False,
"models_added": 0,
"models_updated": 0,
"models_deactivated": 0,
"message": "",
"error": "",
}
logger.info("Testing LLM API: %s (%s) at %s", api.name, api.api_type, api.base_url)
try:
if api.api_type in ("openai", "vllm"):
discovered_models = _discover_openai_models(api)
elif api.api_type == "ollama":
discovered_models = _discover_ollama_models(api)
elif api.api_type == "bedrock":
discovered_models = _discover_bedrock_models(api)
else:
result["error"] = f"API type '{api.api_type}' is not yet supported for auto-discovery"
logger.warning(result["error"])
return result
if not discovered_models:
result["error"] = "No models discovered from API"
logger.warning("No models found for API %s", api.name)
return result
logger.info("Discovered %d models from %s", len(discovered_models), api.name)
discovered_model_names = {m["name"] for m in discovered_models}
with transaction.atomic():
for model_data in discovered_models:
model_name = model_data["name"]
try:
existing = LLMModel.objects.get(api=api, name=model_name)
existing.is_active = True
existing.display_name = model_data.get("display_name", model_name)
existing.model_type = model_data.get("model_type", "chat")
existing.context_window = model_data.get("context_window", 8192)
existing.max_output_tokens = model_data.get("max_output_tokens")
existing.supports_cache = model_data.get("supports_cache", False)
existing.supports_vision = model_data.get("supports_vision", False)
existing.supports_function_calling = model_data.get("supports_function_calling", False)
existing.supports_json_mode = model_data.get("supports_json_mode", False)
existing.save()
result["models_updated"] += 1
except LLMModel.DoesNotExist:
from decimal import Decimal
LLMModel.objects.create(
api=api,
name=model_name,
display_name=model_data.get("display_name", model_name),
model_type=model_data.get("model_type", "chat"),
context_window=model_data.get("context_window", 8192),
max_output_tokens=model_data.get("max_output_tokens"),
supports_cache=model_data.get("supports_cache", False),
supports_vision=model_data.get("supports_vision", False),
supports_function_calling=model_data.get("supports_function_calling", False),
supports_json_mode=model_data.get("supports_json_mode", False),
input_cost_per_1k=Decimal("0"),
output_cost_per_1k=Decimal("0"),
is_active=True,
)
result["models_added"] += 1
logger.info("Added new model: %s::%s", api.name, model_name)
# Deactivate models no longer available
for model in LLMModel.objects.filter(api=api, is_active=True):
if model.name not in discovered_model_names:
model.is_active = False
model.save(update_fields=["is_active"])
result["models_deactivated"] += 1
logger.warning("Deactivated missing model: %s::%s", api.name, model.name)
api.last_tested_at = timezone.now()
api.last_test_status = "success"
api.last_test_message = (
f"Added: {result['models_added']}, "
f"Updated: {result['models_updated']}, "
f"Deactivated: {result['models_deactivated']}"
)
api.save(update_fields=["last_tested_at", "last_test_status", "last_test_message"])
result["success"] = True
result["message"] = api.last_test_message
logger.info("API test successful: %s: %s", api.name, result["message"])
except Exception as e:
result["error"] = f"API test failed: {e}"
api.last_tested_at = timezone.now()
api.last_test_status = "failed"
api.last_test_message = result["error"]
api.save(update_fields=["last_tested_at", "last_test_status", "last_test_message"])
logger.error("API test failed for %s: %s", api.name, e, exc_info=True)
return result
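The discovery loop above is an upsert-then-deactivate reconciliation. Stripped of the ORM, it reduces to a set comparison between what the database knows and what the API just reported; a minimal sketch with plain dicts:

```python
def reconcile(existing, discovered):
    """existing: {model_name: is_active}; discovered: set of model names.
    Returns (added, updated, deactivated) name sets."""
    added = discovered - existing.keys()          # new models to create
    updated = discovered & existing.keys()        # known models to refresh
    deactivated = {n for n, active in existing.items()
                   if active and n not in discovered}  # vanished models to disable
    return added, updated, deactivated

added, updated, gone = reconcile({"gpt-4": True, "old-model": True},
                                 {"gpt-4", "gpt-4o"})
print(sorted(added), sorted(updated), sorted(gone))
```

Wrapping the real version in `transaction.atomic()` (as above) keeps a partial failure from leaving the model table half-reconciled.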
def _discover_openai_models(api):
"""Discover models from an OpenAI-compatible API."""
try:
from openai import OpenAI
except ImportError:
raise ImportError("openai package required for model discovery — pip install openai")
client = OpenAI(
api_key=api.api_key or "dummy",
base_url=api.base_url,
timeout=api.timeout_seconds,
max_retries=api.max_retries,
)
discovered = []
models_response = client.models.list()
for model in models_response.data:
model_id = model.id
discovered.append(
{
"name": model_id,
"display_name": model_id,
"model_type": _infer_model_type(model_id),
"context_window": _infer_context_window(model_id),
"max_output_tokens": None,
"supports_cache": False,
"supports_vision": any(
kw in model_id.lower() for kw in ("vision", "gpt-4-turbo", "gpt-4o")
),
"supports_function_calling": any(
kw in model_id.lower() for kw in ("gpt-4", "gpt-3.5-turbo")
),
"supports_json_mode": any(
kw in model_id.lower() for kw in ("gpt-4", "gpt-3.5-turbo")
),
}
)
return discovered
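The capability flags above are substring heuristics over the model id. A standalone sketch of the same keyword checks (the keyword lists mirror the ones in the loop and are heuristics, not an authoritative capability registry):

```python
VISION_KWS = ("vision", "gpt-4-turbo", "gpt-4o")
TOOL_KWS = ("gpt-4", "gpt-3.5-turbo")

def capabilities(model_id):
    """Infer coarse capability flags from a model identifier."""
    m = model_id.lower()
    return {
        "supports_vision": any(kw in m for kw in VISION_KWS),
        "supports_function_calling": any(kw in m for kw in TOOL_KWS),
    }

print(capabilities("gpt-4o-mini"))
```

Because these are substring matches, a name like `gpt-4o-mini` inherits both flags from its `gpt-4o`/`gpt-4` prefixes; anything the heuristic misses can be corrected manually on the model record.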
def _discover_ollama_models(api):
"""Discover models from an Ollama API."""
import requests
url = f"{api.base_url.rstrip('/')}/api/tags"
discovered = []
resp = requests.get(url, timeout=10)
resp.raise_for_status()
for model in resp.json().get("models", []):
name = model["name"]
discovered.append(
{
"name": name,
"display_name": name,
"model_type": "chat",
"context_window": 4096,
"max_output_tokens": None,
"supports_cache": False,
"supports_vision": False,
"supports_function_calling": False,
"supports_json_mode": False,
}
)
return discovered
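Ollama's `/api/tags` endpoint returns a JSON object with a `models` array; the function above only needs each entry's `name`. A sketch of the parsing step against a sample payload (the model names are illustrative):

```python
sample = {"models": [{"name": "llama3:8b"}, {"name": "nomic-embed-text"}]}

def parse_tags(payload):
    """Extract model names from an Ollama /api/tags response body."""
    return [m["name"] for m in payload.get("models", [])]

print(parse_tags(sample))  # ['llama3:8b', 'nomic-embed-text']
```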
def _discover_bedrock_models(api):
"""
Discover models from Amazon Bedrock via the Mantle OpenAI-compatible endpoint.
For Bedrock APIs, the base_url is the bedrock-runtime endpoint. We derive
the Mantle endpoint from the region to list models.
"""
import requests
# Extract region from base_url (e.g. https://bedrock-runtime.us-east-1.amazonaws.com)
base = api.base_url.rstrip("/")
region = "us-east-1"
if "bedrock-runtime." in base:
# Parse region from URL
parts = base.split("bedrock-runtime.")[1].split(".")
if parts:
region = parts[0]
# Use the Mantle endpoint for model listing (OpenAI-compatible)
mantle_url = f"https://bedrock-mantle.{region}.api.aws/v1/models"
headers = {}
if api.api_key:
headers["Authorization"] = f"Bearer {api.api_key}"
discovered = []
try:
resp = requests.get(mantle_url, headers=headers, timeout=api.timeout_seconds or 30)
resp.raise_for_status()
data = resp.json()
for model in data.get("data", []):
model_id = model.get("id", "")
discovered.append(
{
"name": model_id,
"display_name": model_id,
"model_type": _infer_model_type(model_id),
"context_window": _infer_context_window(model_id),
"max_output_tokens": None,
"supports_cache": False,
"supports_vision": any(
kw in model_id.lower() for kw in ("claude-3", "nova", "vision")
),
"supports_function_calling": False,
"supports_json_mode": False,
}
)
except Exception as exc:
logger.warning("Bedrock Mantle model discovery failed: %s", exc)
# Fallback: return empty list (user can manually add models)
return discovered
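The region extraction above can be isolated as a pure function, which makes the fallback behaviour easy to verify; this sketch mirrors the split-on-`bedrock-runtime.` logic:

```python
def region_from_runtime_url(base_url, default="us-east-1"):
    """Pull the AWS region out of a bedrock-runtime endpoint URL,
    falling back to a default when the URL has no region segment."""
    base = base_url.rstrip("/")
    if "bedrock-runtime." in base:
        parts = base.split("bedrock-runtime.")[1].split(".")
        if parts and parts[0]:
            return parts[0]
    return default

print(region_from_runtime_url("https://bedrock-runtime.eu-west-2.amazonaws.com"))
```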
def _infer_model_type(model_id):
"""Infer model type from its identifier."""
lower = model_id.lower()
if any(kw in lower for kw in ("embed", "embedding")):
return "embedding"
if "rerank" in lower:
return "reranker"
return "chat"
def _infer_context_window(model_id):
"""Infer context window size from model identifier."""
m = model_id.lower()
if any(kw in m for kw in ("gpt-4-turbo", "gpt-4-1106", "gpt-4-0125", "gpt-4o")):
return 128000
if "gpt-4-32k" in m:
return 32768
if "gpt-4" in m:
return 8192
if "gpt-3.5-turbo-16k" in m:
return 16384
if "gpt-3.5-turbo" in m:
return 4096
if "claude-3" in m:
return 200000
if "claude-2" in m:
return 100000
if "32k" in m:
return 32768
if "16k" in m:
return 16384
return 8192
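Since the heuristics above are pure functions, they are easy to spot-check; note that the order of the branches matters (`gpt-4-32k` must be tested before the bare `gpt-4` match, or every 32k variant would fall through to 8192). A trimmed standalone copy:

```python
def infer_context_window(model_id):
    """Trimmed copy of the context-window heuristic; branch order matters."""
    m = model_id.lower()
    if any(kw in m for kw in ("gpt-4-turbo", "gpt-4-1106", "gpt-4-0125", "gpt-4o")):
        return 128000
    if "gpt-4-32k" in m:   # checked before the bare "gpt-4" match
        return 32768
    if "gpt-4" in m:
        return 8192
    if "claude-3" in m:
        return 200000
    return 8192

print(infer_context_window("gpt-4-32k-0613"))  # 32768, not 8192
```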

View File

@@ -0,0 +1,86 @@
"""
Celery tasks for LLM Manager — ported from Spelunker.
"""
import logging
from celery import shared_task
from .models import LLMApi
from .services import test_llm_api
logger = logging.getLogger(__name__)
@shared_task(name="llm_manager.validate_all_llm_apis")
def validate_all_llm_apis():
"""
Periodic task to validate all active LLM APIs and discover models.
Schedule via Celery Beat (e.g. hourly or daily).
"""
logger.info("Starting periodic LLM API validation")
active_apis = LLMApi.objects.filter(is_active=True)
if not active_apis.exists():
logger.info("No active APIs to validate")
return {"status": "completed", "tested": 0, "successful": 0, "failed": 0}
results = {
"status": "completed",
"tested": 0,
"successful": 0,
"failed": 0,
"models_added": 0,
"models_updated": 0,
"models_deactivated": 0,
"details": [],
}
for api in active_apis:
results["tested"] += 1
try:
result = test_llm_api(api)
if result["success"]:
results["successful"] += 1
results["models_added"] += result["models_added"]
results["models_updated"] += result["models_updated"]
results["models_deactivated"] += result["models_deactivated"]
else:
results["failed"] += 1
results["details"].append(
{
"api_name": api.name,
"success": result["success"],
"message": result.get("message") or result.get("error", ""),
}
)
except Exception as e:
results["failed"] += 1
results["details"].append(
{"api_name": api.name, "success": False, "message": str(e)}
)
logger.error("Unexpected error validating %s: %s", api.name, e, exc_info=True)
logger.info(
"Completed LLM API validation: %d/%d successful, %d failed",
results["successful"],
results["tested"],
results["failed"],
)
return results
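The docstring above suggests scheduling this task via Celery Beat. A minimal settings sketch, assuming the default beat scheduler; the schedule name is an assumption, and the task name matches the `@shared_task(name=...)` declared above:

```python
# settings.py (sketch): schedule name is an assumption
CELERY_BEAT_SCHEDULE = {
    "validate-llm-apis-hourly": {
        "task": "llm_manager.validate_all_llm_apis",
        "schedule": 60 * 60,  # every hour, in seconds; a crontab() works too
    },
}
```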
@shared_task(name="llm_manager.validate_single_api")
def validate_single_api(api_id):
"""Validate a single LLM API by ID."""
try:
api = LLMApi.objects.get(id=api_id)
return test_llm_api(api)
except LLMApi.DoesNotExist:
msg = f"LLM API with id {api_id} not found"
logger.error(msg)
return {"success": False, "error": msg}
except Exception as e:
logger.error("Error validating API %s: %s", api_id, e, exc_info=True)
return {"success": False, "error": str(e)}

View File

@@ -0,0 +1,22 @@
{% extends "themis/base.html" %}
{% block title %}Delete {{ api.name }}{% endblock %}
{% block content %}
<div class="max-w-lg mx-auto mt-8">
<div class="card bg-base-200 shadow">
<div class="card-body">
<h2 class="card-title text-error">Delete LLM API</h2>
<p>Are you sure you want to delete <strong>{{ api.name }}</strong>?</p>
<p class="text-sm text-base-content/70">This will also remove all associated models. Usage records will be preserved.</p>
<div class="card-actions justify-end mt-4">
<a href="{% url 'llm_manager:api_detail' api.pk %}" class="btn btn-ghost">Cancel</a>
<form method="post">
{% csrf_token %}
<button type="submit" class="btn btn-error">Delete</button>
</form>
</div>
</div>
</div>
</div>
{% endblock %}

View File

@@ -0,0 +1,93 @@
{% extends "themis/base.html" %}
{% block title %}{{ api.name }} — LLM API{% endblock %}
{% block content %}
<div class="mb-6">
<div class="text-sm breadcrumbs">
<ul>
<li><a href="{% url 'llm_manager:dashboard' %}">LLM Manager</a></li>
<li><a href="{% url 'llm_manager:api_list' %}">APIs</a></li>
<li>{{ api.name }}</li>
</ul>
</div>
<div class="flex justify-between items-center mt-2">
<h1 class="text-2xl font-bold">{{ api.name }}</h1>
<div class="flex gap-2">
<form method="post" action="{% url 'llm_manager:api_test' api.pk %}">
{% csrf_token %}
<button type="submit" class="btn btn-sm btn-accent">Test Connection</button>
</form>
<a href="{% url 'llm_manager:api_edit' api.pk %}" class="btn btn-sm btn-primary">Edit</a>
<a href="{% url 'llm_manager:api_delete' api.pk %}" class="btn btn-sm btn-error btn-outline">Delete</a>
</div>
</div>
</div>
<!-- API details -->
<div class="grid grid-cols-1 md:grid-cols-2 gap-6 mb-8">
<div class="card bg-base-200 shadow">
<div class="card-body">
<h2 class="card-title text-lg">Configuration</h2>
<div class="grid grid-cols-2 gap-2 text-sm">
<span class="font-semibold">Type:</span><span class="badge badge-ghost">{{ api.get_api_type_display }}</span>
<span class="font-semibold">Base URL:</span><span class="font-mono text-xs break-all">{{ api.base_url }}</span>
<span class="font-semibold">Active:</span><span>{% if api.is_active %}Yes{% else %}No{% endif %}</span>
<span class="font-semibold">Streaming:</span><span>{% if api.supports_streaming %}Yes{% else %}No{% endif %}</span>
<span class="font-semibold">Timeout:</span><span>{{ api.timeout_seconds }}s</span>
<span class="font-semibold">Max Retries:</span><span>{{ api.max_retries }}</span>
</div>
</div>
</div>
<div class="card bg-base-200 shadow">
<div class="card-body">
<h2 class="card-title text-lg">Test Status</h2>
<div class="grid grid-cols-2 gap-2 text-sm">
<span class="font-semibold">Status:</span>
<span>
{% if api.last_test_status == "success" %}<span class="badge badge-success">Success</span>
{% elif api.last_test_status == "failed" %}<span class="badge badge-error">Failed</span>
{% else %}<span class="badge badge-warning">Pending</span>{% endif %}
</span>
<span class="font-semibold">Last Tested:</span><span>{{ api.last_tested_at|default:"Never" }}</span>
</div>
{% if api.last_test_message %}
<div class="mt-2 text-xs bg-base-300 p-2 rounded">{{ api.last_test_message }}</div>
{% endif %}
</div>
</div>
</div>
<!-- Models -->
<div>
<div class="flex justify-between items-center mb-3">
<h2 class="text-xl font-semibold">Models ({{ models.count }})</h2>
<a href="{% url 'llm_manager:model_create' %}?api={{ api.pk }}" class="btn btn-sm btn-primary">Add Model</a>
</div>
{% if models %}
<div class="overflow-x-auto">
<table class="table table-zebra table-sm w-full">
<thead><tr><th>Name</th><th>Type</th><th>Context</th><th>Dims</th><th>Active</th><th>System</th></tr></thead>
<tbody>
{% for m in models %}
<tr>
<td><a href="{% url 'llm_manager:model_detail' m.pk %}" class="link link-primary">{{ m.name }}</a></td>
<td><span class="badge badge-ghost badge-sm">{{ m.get_model_type_display }}</span></td>
<td>{{ m.context_window|default:"—" }}</td>
<td>{{ m.vector_dimensions|default:"—" }}</td>
<td>{% if m.is_active %}<span class="badge badge-success badge-xs"></span>{% else %}<span class="badge badge-error badge-xs"></span>{% endif %}</td>
<td>
{% if m.is_system_embedding_model %}<span class="badge badge-sm badge-success">Embed</span>{% endif %}
{% if m.is_system_chat_model %}<span class="badge badge-sm badge-info">Chat</span>{% endif %}
{% if m.is_system_reranker_model %}<span class="badge badge-sm badge-warning">Rerank</span>{% endif %}
</td>
</tr>
{% endfor %}
</tbody>
</table>
</div>
{% else %}
<div class="alert alert-info">No models for this API yet. Use "Test Connection" to auto-discover or add manually.</div>
{% endif %}
</div>
{% endblock %}

View File

@@ -0,0 +1,39 @@
{% extends "themis/base.html" %}
{% block title %}{% if is_edit %}Edit {{ api.name }}{% else %}Add LLM API{% endif %}{% endblock %}
{% block content %}
<div class="text-sm breadcrumbs mb-4">
<ul>
<li><a href="{% url 'llm_manager:dashboard' %}">LLM Manager</a></li>
<li><a href="{% url 'llm_manager:api_list' %}">APIs</a></li>
<li>{% if is_edit %}Edit {{ api.name }}{% else %}Add API{% endif %}</li>
</ul>
</div>
<div class="max-w-2xl">
<h1 class="text-2xl font-bold mb-4">{% if is_edit %}Edit {{ api.name }}{% else %}Add LLM API{% endif %}</h1>
<form method="post" class="space-y-4">
{% csrf_token %}
{% for field in form %}
<div class="form-control w-full">
<label class="label"><span class="label-text font-semibold">{{ field.label }}</span></label>
{{ field }}
{% if field.errors %}
<label class="label"><span class="label-text-alt text-error">{{ field.errors.0 }}</span></label>
{% endif %}
{% if field.help_text %}
<label class="label"><span class="label-text-alt">{{ field.help_text }}</span></label>
{% endif %}
</div>
{% endfor %}
<div class="flex gap-2 pt-4">
<button type="submit" class="btn btn-primary">{% if is_edit %}Save Changes{% else %}Create API{% endif %}</button>
<a href="{% if is_edit %}{% url 'llm_manager:api_detail' api.pk %}{% else %}{% url 'llm_manager:api_list' %}{% endif %}" class="btn btn-ghost">Cancel</a>
</div>
</form>
</div>
{% endblock %}

View File

@@ -0,0 +1,56 @@
{% extends "themis/base.html" %}
{% block title %}LLM APIs{% endblock %}
{% block content %}
<div class="flex justify-between items-center mb-6">
<h1 class="text-2xl font-bold">LLM APIs</h1>
<a href="{% url 'llm_manager:api_create' %}" class="btn btn-primary btn-sm">Add API</a>
</div>
{% if apis %}
<div class="overflow-x-auto">
<table class="table table-zebra w-full">
<thead>
<tr>
<th>Name</th>
<th>Type</th>
<th>Base URL</th>
<th>Active</th>
<th>Streaming</th>
<th>Status</th>
<th>Models</th>
<th>Actions</th>
</tr>
</thead>
<tbody>
{% for api in apis %}
<tr>
<td><a href="{% url 'llm_manager:api_detail' api.pk %}" class="link link-primary font-semibold">{{ api.name }}</a></td>
<td><span class="badge badge-ghost">{{ api.get_api_type_display }}</span></td>
<td class="font-mono text-xs max-w-xs truncate">{{ api.base_url }}</td>
<td>{% if api.is_active %}<span class="badge badge-success badge-sm">Yes</span>{% else %}<span class="badge badge-error badge-sm">No</span>{% endif %}</td>
<td>{% if api.supports_streaming %}<span class="badge badge-info badge-sm">Yes</span>{% else %}—{% endif %}</td>
<td>
{% if api.last_test_status == "success" %}<span class="badge badge-success badge-sm">OK</span>
{% elif api.last_test_status == "failed" %}<span class="badge badge-error badge-sm">Failed</span>
{% else %}<span class="badge badge-warning badge-sm">Pending</span>{% endif %}
</td>
<td>{{ api.models.count }}</td>
<td>
<a href="{% url 'llm_manager:api_edit' api.pk %}" class="btn btn-xs btn-ghost">Edit</a>
<a href="{% url 'llm_manager:api_delete' api.pk %}" class="btn btn-xs btn-ghost text-error">Delete</a>
</td>
</tr>
{% endfor %}
</tbody>
</table>
</div>
{% else %}
<div class="alert alert-info">No APIs configured. <a href="{% url 'llm_manager:api_create' %}" class="link">Add one now</a>.</div>
{% endif %}
<div class="mt-4">
<a href="{% url 'llm_manager:dashboard' %}" class="btn btn-ghost btn-sm">&larr; Dashboard</a>
</div>
{% endblock %}

View File

@@ -0,0 +1,132 @@
{% extends "themis/base.html" %}
{% block title %}LLM Manager — Dashboard{% endblock %}
{% block content %}
<div class="mb-6">
<h1 class="text-3xl font-bold">LLM Manager</h1>
<p class="text-base-content/70 mt-1">Manage LLM APIs, models, and usage tracking.</p>
</div>
<!-- Stats cards -->
<div class="grid grid-cols-1 md:grid-cols-3 gap-4 mb-8">
<div class="stat bg-base-200 rounded-box shadow">
<div class="stat-title">Active APIs</div>
<div class="stat-value text-primary">{{ api_count }}</div>
<div class="stat-actions"><a href="{% url 'llm_manager:api_list' %}" class="btn btn-sm btn-primary">Manage</a></div>
</div>
<div class="stat bg-base-200 rounded-box shadow">
<div class="stat-title">Active Models</div>
<div class="stat-value text-secondary">{{ model_count }}</div>
<div class="stat-actions"><a href="{% url 'llm_manager:model_list' %}" class="btn btn-sm btn-secondary">Browse</a></div>
</div>
<div class="stat bg-base-200 rounded-box shadow">
<div class="stat-title">Your API Calls</div>
<div class="stat-value text-accent">{{ usage_count }}</div>
<div class="stat-desc">
{{ total_input_tokens|default:"0" }} in / {{ total_output_tokens|default:"0" }} out tokens
</div>
<div class="stat-actions"><a href="{% url 'llm_manager:usage_list' %}" class="btn btn-sm btn-accent">History</a></div>
</div>
</div>
<!-- System models -->
<div class="mb-8">
<h2 class="text-xl font-semibold mb-3">System Default Models</h2>
<div class="grid grid-cols-1 md:grid-cols-3 gap-4">
<div class="card bg-base-200 shadow">
<div class="card-body">
<h3 class="card-title text-sm">Embedding</h3>
{% if system_embedding_model %}
<p class="font-mono text-sm">{{ system_embedding_model.name }}</p>
<p class="text-xs text-base-content/60">{{ system_embedding_model.api.name }}{% if system_embedding_model.vector_dimensions %} — {{ system_embedding_model.vector_dimensions }}d{% endif %}</p>
{% else %}
<p class="text-warning text-sm">Not configured</p>
{% endif %}
</div>
</div>
<div class="card bg-base-200 shadow">
<div class="card-body">
<h3 class="card-title text-sm">Chat</h3>
{% if system_chat_model %}
<p class="font-mono text-sm">{{ system_chat_model.name }}</p>
<p class="text-xs text-base-content/60">{{ system_chat_model.api.name }}</p>
{% else %}
<p class="text-warning text-sm">Not configured</p>
{% endif %}
</div>
</div>
<div class="card bg-base-200 shadow">
<div class="card-body">
<h3 class="card-title text-sm">Reranker</h3>
{% if system_reranker_model %}
<p class="font-mono text-sm">{{ system_reranker_model.name }}</p>
<p class="text-xs text-base-content/60">{{ system_reranker_model.api.name }}</p>
{% else %}
<p class="text-warning text-sm">Not configured</p>
{% endif %}
</div>
</div>
</div>
</div>
<!-- Active APIs -->
<div class="mb-8">
<div class="flex justify-between items-center mb-3">
<h2 class="text-xl font-semibold">Active APIs</h2>
<a href="{% url 'llm_manager:api_create' %}" class="btn btn-sm btn-primary">Add API</a>
</div>
{% if active_apis %}
<div class="overflow-x-auto">
<table class="table table-zebra w-full">
<thead><tr><th>Name</th><th>Type</th><th>URL</th><th>Status</th><th>Last Tested</th></tr></thead>
<tbody>
{% for api in active_apis %}
<tr>
<td><a href="{% url 'llm_manager:api_detail' api.pk %}" class="link link-primary">{{ api.name }}</a></td>
<td><span class="badge badge-ghost">{{ api.get_api_type_display }}</span></td>
<td class="font-mono text-xs">{{ api.base_url }}</td>
<td>
{% if api.last_test_status == "success" %}
<span class="badge badge-success">OK</span>
{% elif api.last_test_status == "failed" %}
<span class="badge badge-error">Failed</span>
{% else %}
<span class="badge badge-warning">Pending</span>
{% endif %}
</td>
<td class="text-xs">{{ api.last_tested_at|default:"Never" }}</td>
</tr>
{% endfor %}
</tbody>
</table>
</div>
{% else %}
<div class="alert alert-info">No active APIs configured yet.</div>
{% endif %}
</div>
<!-- Recent usage -->
{% if recent_usage %}
<div>
<h2 class="text-xl font-semibold mb-3">Recent Usage</h2>
<div class="overflow-x-auto">
<table class="table table-zebra table-sm w-full">
<thead><tr><th>Time</th><th>Model</th><th>In</th><th>Out</th><th>Cost</th><th>Purpose</th></tr></thead>
<tbody>
{% for u in recent_usage %}
<tr>
<td class="text-xs">{{ u.timestamp|date:"M d H:i" }}</td>
<td class="font-mono text-xs">{{ u.model.name }}</td>
<td>{{ u.input_tokens }}</td>
<td>{{ u.output_tokens }}</td>
<td>${{ u.total_cost|floatformat:4 }}</td>
<td><span class="badge badge-ghost badge-sm">{{ u.get_purpose_display }}</span></td>
</tr>
{% endfor %}
</tbody>
</table>
</div>
</div>
{% endif %}
{% endblock %}

View File

@@ -0,0 +1,22 @@
{% extends "themis/base.html" %}
{% block title %}Delete {{ model.name }}{% endblock %}
{% block content %}
<div class="max-w-lg mx-auto mt-8">
<div class="card bg-base-200 shadow">
<div class="card-body">
<h2 class="card-title text-error">Delete LLM Model</h2>
<p>Are you sure you want to delete <strong class="font-mono">{{ model.name }}</strong> from <strong>{{ model.api.name }}</strong>?</p>
<p class="text-sm text-base-content/70">Usage records referencing this model will be preserved.</p>
<div class="card-actions justify-end mt-4">
<a href="{% url 'llm_manager:model_detail' model.pk %}" class="btn btn-ghost">Cancel</a>
<form method="post">
{% csrf_token %}
<button type="submit" class="btn btn-error">Delete</button>
</form>
</div>
</div>
</div>
</div>
{% endblock %}

View File

@@ -0,0 +1,66 @@
{% extends "themis/base.html" %}
{% block title %}{{ model.name }} — LLM Model{% endblock %}
{% block content %}
<div class="text-sm breadcrumbs mb-4">
<ul>
<li><a href="{% url 'llm_manager:dashboard' %}">LLM Manager</a></li>
<li><a href="{% url 'llm_manager:model_list' %}">Models</a></li>
<li>{{ model.name }}</li>
</ul>
</div>
<div class="flex justify-between items-center mb-6">
<h1 class="text-2xl font-bold font-mono">{{ model.name }}</h1>
<div class="flex gap-2">
<a href="{% url 'llm_manager:model_edit' model.pk %}" class="btn btn-sm btn-primary">Edit</a>
<a href="{% url 'llm_manager:model_delete' model.pk %}" class="btn btn-sm btn-error btn-outline">Delete</a>
</div>
</div>
<div class="grid grid-cols-1 md:grid-cols-2 gap-6">
<div class="card bg-base-200 shadow">
<div class="card-body">
<h2 class="card-title text-lg">Details</h2>
<div class="grid grid-cols-2 gap-2 text-sm">
<span class="font-semibold">Display Name:</span><span>{{ model.display_name|default:"—" }}</span>
<span class="font-semibold">API:</span><span><a href="{% url 'llm_manager:api_detail' model.api.pk %}" class="link link-primary">{{ model.api.name }}</a></span>
<span class="font-semibold">Type:</span><span class="badge badge-ghost">{{ model.get_model_type_display }}</span>
<span class="font-semibold">Active:</span><span>{% if model.is_active %}Yes{% else %}No{% endif %}</span>
<span class="font-semibold">Context Window:</span><span>{{ model.context_window|default:"—" }} tokens</span>
<span class="font-semibold">Max Output:</span><span>{{ model.max_output_tokens|default:"—" }} tokens</span>
<span class="font-semibold">Dimensions:</span><span>{{ model.vector_dimensions|default:"—" }}</span>
</div>
</div>
</div>
<div class="card bg-base-200 shadow">
<div class="card-body">
<h2 class="card-title text-lg">Capabilities</h2>
<div class="flex flex-wrap gap-2">
{% if model.supports_cache %}<span class="badge badge-success">Cache</span>{% endif %}
{% if model.supports_vision %}<span class="badge badge-info">Vision</span>{% endif %}
{% if model.supports_multimodal %}<span class="badge badge-accent">Multimodal</span>{% endif %}
{% if model.supports_function_calling %}<span class="badge badge-secondary">Functions</span>{% endif %}
{% if model.supports_json_mode %}<span class="badge badge-warning">JSON Mode</span>{% endif %}
</div>
<h3 class="font-semibold mt-4 text-sm">Pricing (per 1K tokens)</h3>
<div class="grid grid-cols-2 gap-2 text-sm">
<span>Input:</span><span>${{ model.input_cost_per_1k }}</span>
<span>Output:</span><span>${{ model.output_cost_per_1k }}</span>
{% if model.cached_cost_per_1k %}<span>Cached:</span><span>${{ model.cached_cost_per_1k }}</span>{% endif %}
</div>
<h3 class="font-semibold mt-4 text-sm">System Defaults</h3>
<div class="flex flex-wrap gap-2">
{% if model.is_system_embedding_model %}<span class="badge badge-success">System Embedding</span>{% endif %}
{% if model.is_system_chat_model %}<span class="badge badge-info">System Chat</span>{% endif %}
{% if model.is_system_reranker_model %}<span class="badge badge-warning">System Reranker</span>{% endif %}
{% if not model.is_system_embedding_model and not model.is_system_chat_model and not model.is_system_reranker_model %}
<span class="text-base-content/50 text-sm">Not a system default</span>
{% endif %}
</div>
</div>
</div>
</div>
{% endblock %}

View File

@@ -0,0 +1,39 @@
{% extends "themis/base.html" %}
{% block title %}{% if is_edit %}Edit {{ model.name }}{% else %}Add LLM Model{% endif %}{% endblock %}
{% block content %}
<div class="text-sm breadcrumbs mb-4">
<ul>
<li><a href="{% url 'llm_manager:dashboard' %}">LLM Manager</a></li>
<li><a href="{% url 'llm_manager:model_list' %}">Models</a></li>
<li>{% if is_edit %}Edit {{ model.name }}{% else %}Add Model{% endif %}</li>
</ul>
</div>
<div class="max-w-2xl">
<h1 class="text-2xl font-bold mb-4">{% if is_edit %}Edit {{ model.name }}{% else %}Add LLM Model{% endif %}</h1>
<form method="post" class="space-y-4">
{% csrf_token %}
{% for field in form %}
<div class="form-control w-full">
<label class="label"><span class="label-text font-semibold">{{ field.label }}</span></label>
{{ field }}
{% if field.errors %}
<label class="label"><span class="label-text-alt text-error">{{ field.errors.0 }}</span></label>
{% endif %}
{% if field.help_text %}
<label class="label"><span class="label-text-alt">{{ field.help_text }}</span></label>
{% endif %}
</div>
{% endfor %}
<div class="flex gap-2 pt-4">
<button type="submit" class="btn btn-primary">{% if is_edit %}Save Changes{% else %}Create Model{% endif %}</button>
<a href="{% if is_edit %}{% url 'llm_manager:model_detail' model.pk %}{% else %}{% url 'llm_manager:model_list' %}{% endif %}" class="btn btn-ghost">Cancel</a>
</div>
</form>
</div>
{% endblock %}

View File

@@ -0,0 +1,68 @@
{% extends "themis/base.html" %}
{% block title %}LLM Models{% endblock %}
{% block content %}
<div class="flex justify-between items-center mb-6">
<h1 class="text-2xl font-bold">LLM Models</h1>
<a href="{% url 'llm_manager:model_create' %}" class="btn btn-primary btn-sm">Add Model</a>
</div>
<!-- Filter by API -->
<div class="mb-4 flex gap-2 items-center">
<span class="font-semibold text-sm">Filter API:</span>
<a href="{% url 'llm_manager:model_list' %}" class="btn btn-xs {% if not selected_api %}btn-primary{% else %}btn-ghost{% endif %}">All</a>
{% for api in apis %}
<a href="{% url 'llm_manager:model_list' %}?api={{ api.pk }}" class="btn btn-xs {% if selected_api == api.pk|slugify %}btn-primary{% else %}btn-ghost{% endif %}">{{ api.name }}</a>
{% endfor %}
</div>
{% if models %}
<div class="overflow-x-auto">
<table class="table table-zebra w-full">
<thead>
<tr>
<th>Name</th>
<th>API</th>
<th>Type</th>
<th>Context</th>
<th>Dims</th>
<th>$/1K In</th>
<th>$/1K Out</th>
<th>Active</th>
<th>System</th>
<th>Actions</th>
</tr>
</thead>
<tbody>
{% for m in models %}
<tr>
<td><a href="{% url 'llm_manager:model_detail' m.pk %}" class="link link-primary font-mono text-sm">{{ m.name }}</a></td>
<td class="text-sm">{{ m.api.name }}</td>
<td><span class="badge badge-ghost badge-sm">{{ m.get_model_type_display }}</span></td>
<td>{{ m.context_window|default:"—" }}</td>
<td>{{ m.vector_dimensions|default:"—" }}</td>
<td class="text-xs">${{ m.input_cost_per_1k }}</td>
<td class="text-xs">${{ m.output_cost_per_1k }}</td>
<td>{% if m.is_active %}<span class="badge badge-success badge-xs"></span>{% else %}<span class="badge badge-error badge-xs"></span>{% endif %}</td>
<td>
{% if m.is_system_embedding_model %}<span class="badge badge-sm badge-success">Embed</span>{% endif %}
{% if m.is_system_chat_model %}<span class="badge badge-sm badge-info">Chat</span>{% endif %}
{% if m.is_system_reranker_model %}<span class="badge badge-sm badge-warning">Rerank</span>{% endif %}
</td>
<td>
<a href="{% url 'llm_manager:model_edit' m.pk %}" class="btn btn-xs btn-ghost">Edit</a>
</td>
</tr>
{% endfor %}
</tbody>
</table>
</div>
{% else %}
<div class="alert alert-info">No models found.</div>
{% endif %}
<div class="mt-4">
<a href="{% url 'llm_manager:dashboard' %}" class="btn btn-ghost btn-sm">&larr; Dashboard</a>
</div>
{% endblock %}

View File

@@ -0,0 +1,72 @@
{% extends "themis/base.html" %}
{% block title %}Usage History{% endblock %}
{% block content %}
<div class="flex justify-between items-center mb-6">
<h1 class="text-2xl font-bold">Usage History</h1>
</div>
<!-- Totals -->
{% if totals %}
<div class="grid grid-cols-2 md:grid-cols-4 gap-4 mb-6">
<div class="stat bg-base-200 rounded-box shadow py-3">
<div class="stat-title text-xs">Input Tokens</div>
<div class="stat-value text-lg">{{ totals.total_input|default:"0" }}</div>
</div>
<div class="stat bg-base-200 rounded-box shadow py-3">
<div class="stat-title text-xs">Output Tokens</div>
<div class="stat-value text-lg">{{ totals.total_output|default:"0" }}</div>
</div>
<div class="stat bg-base-200 rounded-box shadow py-3">
<div class="stat-title text-xs">Cached Tokens</div>
<div class="stat-value text-lg">{{ totals.total_cached|default:"0" }}</div>
</div>
<div class="stat bg-base-200 rounded-box shadow py-3">
<div class="stat-title text-xs">Total Cost</div>
<div class="stat-value text-lg">${{ totals.total_cost|default:"0"|floatformat:4 }}</div>
</div>
</div>
{% endif %}
{% if usage_records %}
<div class="overflow-x-auto">
<table class="table table-zebra table-sm w-full">
<thead>
<tr>
<th>Timestamp</th>
<th>Model</th>
<th>API</th>
<th>Input</th>
<th>Output</th>
<th>Cached</th>
<th>Cost</th>
<th>Purpose</th>
<th>Session</th>
</tr>
</thead>
<tbody>
{% for u in usage_records %}
<tr>
<td class="text-xs">{{ u.timestamp|date:"Y-m-d H:i:s" }}</td>
<td class="font-mono text-xs">{{ u.model.name }}</td>
<td class="text-xs">{{ u.model.api.name }}</td>
<td>{{ u.input_tokens }}</td>
<td>{{ u.output_tokens }}</td>
<td>{{ u.cached_tokens }}</td>
<td class="text-xs">${{ u.total_cost|floatformat:4 }}</td>
<td><span class="badge badge-ghost badge-xs">{{ u.get_purpose_display }}</span></td>
<td class="font-mono text-xs max-w-[8rem] truncate">{{ u.session_id|default:"—" }}</td>
</tr>
{% endfor %}
</tbody>
</table>
</div>
{% else %}
<div class="alert alert-info">No usage records yet.</div>
{% endif %}
<div class="mt-4">
<a href="{% url 'llm_manager:dashboard' %}" class="btn btn-ghost btn-sm">&larr; Dashboard</a>
</div>
{% endblock %}

View File

@@ -0,0 +1,154 @@
"""
Tests for LLM Manager DRF API endpoints.
"""
from decimal import Decimal
from django.contrib.auth import get_user_model
from django.test import TestCase
from django.urls import reverse
from rest_framework.test import APIClient
from llm_manager.models import LLMApi, LLMModel, LLMUsage
User = get_user_model()
class LLMApiEndpointTest(TestCase):
"""Tests for the LLM API endpoints."""
def setUp(self):
self.user = User.objects.create_user(username="testuser", password="testpass123")
self.client = APIClient()
self.client.force_authenticate(user=self.user)
self.api = LLMApi.objects.create(
name="Test API",
api_type="openai",
base_url="https://api.example.com/v1",
)
def test_api_list(self):
resp = self.client.get(reverse("llm-manager-api:api_list"))
self.assertEqual(resp.status_code, 200)
self.assertEqual(len(resp.data), 1)
self.assertEqual(resp.data[0]["name"], "Test API")
def test_api_detail(self):
resp = self.client.get(reverse("llm-manager-api:api_detail", kwargs={"pk": self.api.pk}))
self.assertEqual(resp.status_code, 200)
self.assertEqual(resp.data["name"], "Test API")
def test_api_not_found(self):
import uuid
resp = self.client.get(reverse("llm-manager-api:api_detail", kwargs={"pk": uuid.uuid4()}))
self.assertEqual(resp.status_code, 404)
def test_requires_auth(self):
self.client.force_authenticate(user=None)
resp = self.client.get(reverse("llm-manager-api:api_list"))
self.assertIn(resp.status_code, [401, 403])
class LLMModelEndpointTest(TestCase):
"""Tests for the LLM Model API endpoints."""
def setUp(self):
self.user = User.objects.create_user(username="testuser", password="testpass123")
self.client = APIClient()
self.client.force_authenticate(user=self.user)
self.api = LLMApi.objects.create(
name="Test API",
api_type="openai",
base_url="https://api.example.com/v1",
)
self.model = LLMModel.objects.create(
api=self.api,
name="gpt-4o",
model_type="chat",
context_window=128000,
)
def test_model_list(self):
resp = self.client.get(reverse("llm-manager-api:model_list"))
self.assertEqual(resp.status_code, 200)
self.assertEqual(len(resp.data), 1)
def test_model_list_filter_type(self):
LLMModel.objects.create(
api=self.api,
name="embed-model",
model_type="embedding",
context_window=8191,
)
resp = self.client.get(reverse("llm-manager-api:model_list") + "?type=embedding")
self.assertEqual(resp.status_code, 200)
self.assertEqual(len(resp.data), 1)
self.assertEqual(resp.data[0]["model_type"], "embedding")
def test_model_detail(self):
resp = self.client.get(reverse("llm-manager-api:model_detail", kwargs={"pk": self.model.pk}))
self.assertEqual(resp.status_code, 200)
self.assertEqual(resp.data["name"], "gpt-4o")
def test_system_models_empty(self):
resp = self.client.get(reverse("llm-manager-api:system_models"))
self.assertEqual(resp.status_code, 200)
self.assertEqual(resp.data, {})
def test_system_models_configured(self):
self.model.is_system_chat_model = True
self.model.save()
resp = self.client.get(reverse("llm-manager-api:system_models"))
self.assertEqual(resp.status_code, 200)
self.assertIn("chat", resp.data)
self.assertEqual(resp.data["chat"]["name"], "gpt-4o")
class LLMUsageEndpointTest(TestCase):
"""Tests for the LLM Usage API endpoints."""
def setUp(self):
self.user = User.objects.create_user(username="testuser", password="testpass123")
self.client = APIClient()
self.client.force_authenticate(user=self.user)
self.api = LLMApi.objects.create(
name="Test API",
api_type="openai",
base_url="https://api.example.com/v1",
)
self.model = LLMModel.objects.create(
api=self.api,
name="gpt-4o",
model_type="chat",
context_window=128000,
input_cost_per_1k=Decimal("0.0025"),
output_cost_per_1k=Decimal("0.01"),
)
def test_usage_list_empty(self):
resp = self.client.get(reverse("llm-manager-api:usage_list"))
self.assertEqual(resp.status_code, 200)
self.assertEqual(resp.data, [])
def test_usage_create(self):
resp = self.client.post(
reverse("llm-manager-api:usage_list"),
{
"model": str(self.model.pk),
"input_tokens": 1000,
"output_tokens": 500,
"purpose": "other",
},
format="json",
)
self.assertEqual(resp.status_code, 201)
self.assertEqual(LLMUsage.objects.count(), 1)
def test_usage_list_returns_own_records(self):
other_user = User.objects.create_user(username="other", password="testpass123")
LLMUsage.objects.create(user=self.user, model=self.model, input_tokens=100, output_tokens=50)
LLMUsage.objects.create(user=other_user, model=self.model, input_tokens=200, output_tokens=100)
resp = self.client.get(reverse("llm-manager-api:usage_list"))
self.assertEqual(resp.status_code, 200)
self.assertEqual(len(resp.data), 1)
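The last test above pins down per-user scoping: the usage list endpoint must return only the caller's own records. The filtering it implies can be sketched in plain Python (hypothetical names, not the actual view code):

```python
from dataclasses import dataclass

@dataclass
class UsageRecord:
    user: str
    input_tokens: int
    output_tokens: int

def scope_to_owner(records, user):
    """Return only the records owned by `user` — the behaviour
    test_usage_list_returns_own_records asserts on the real endpoint."""
    return [r for r in records if r.user == user]

rows = [
    UsageRecord("testuser", 100, 50),
    UsageRecord("other", 200, 100),
]
print(len(scope_to_owner(rows, "testuser")))  # 1
```

In the real view this would be a queryset filter on `request.user` rather than a list comprehension, but the invariant under test is the same.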

View File

@@ -0,0 +1,236 @@
"""
Tests for LLM Manager models: LLMApi, LLMModel, LLMUsage.
"""
from decimal import Decimal
from django.contrib.auth import get_user_model
from django.test import TestCase
from llm_manager.models import LLMApi, LLMModel, LLMUsage
User = get_user_model()
class LLMApiModelTest(TestCase):
"""Tests for the LLMApi model."""
def setUp(self):
self.user = User.objects.create_user(username="testuser", password="testpass123")
self.api = LLMApi.objects.create(
name="Test API",
api_type="openai",
base_url="https://api.example.com/v1",
is_active=True,
created_by=self.user,
)
def test_str(self):
self.assertEqual(str(self.api), "Test API (openai)")
def test_default_values(self):
self.assertTrue(self.api.is_active)
self.assertTrue(self.api.supports_streaming)
self.assertEqual(self.api.timeout_seconds, 60)
self.assertEqual(self.api.max_retries, 3)
self.assertEqual(self.api.last_test_status, "pending")
def test_uuid_primary_key(self):
self.assertIsNotNone(self.api.pk)
self.assertEqual(len(str(self.api.pk)), 36) # UUID format
def test_unique_name(self):
with self.assertRaises(Exception):
LLMApi.objects.create(
name="Test API",
api_type="ollama",
base_url="http://localhost:11434",
)
class LLMApiEncryptionTest(TestCase):
"""Tests for API key encryption."""
def test_api_key_encrypted_at_rest(self):
"""A saved API key should come back intact (transparently decrypted) on re-fetch; the at-rest ciphertext itself is not inspected here."""
api = LLMApi.objects.create(
name="Encrypted Test",
api_type="openai",
base_url="https://api.example.com/v1",
api_key="sk-test-secret-key-12345",
)
# Re-fetch from database
api_fresh = LLMApi.objects.get(pk=api.pk)
self.assertEqual(api_fresh.api_key, "sk-test-secret-key-12345")
def test_blank_api_key(self):
api = LLMApi.objects.create(
name="No Key",
api_type="ollama",
base_url="http://localhost:11434",
api_key="",
)
api_fresh = LLMApi.objects.get(pk=api.pk)
self.assertEqual(api_fresh.api_key, "")
class LLMModelModelTest(TestCase):
"""Tests for the LLMModel model."""
def setUp(self):
self.api = LLMApi.objects.create(
name="Test API",
api_type="openai",
base_url="https://api.example.com/v1",
)
self.model = LLMModel.objects.create(
api=self.api,
name="gpt-4o",
display_name="GPT-4o",
model_type="chat",
context_window=128000,
max_output_tokens=16384,
input_cost_per_1k=Decimal("0.0025"),
output_cost_per_1k=Decimal("0.01"),
)
def test_str(self):
self.assertEqual(str(self.model), "Test API: gpt-4o")
def test_unique_together(self):
"""Model name must be unique per API."""
with self.assertRaises(Exception):
LLMModel.objects.create(
api=self.api,
name="gpt-4o",
model_type="chat",
context_window=8192,
)
def test_model_types(self):
"""All model types should be creatable."""
for mtype in ["embedding", "vision", "audio", "reranker", "multimodal_embed"]:
m = LLMModel.objects.create(
api=self.api,
name=f"test-{mtype}",
model_type=mtype,
context_window=8192,
)
self.assertEqual(m.model_type, mtype)
def test_mnemosyne_fields(self):
"""Mnemosyne-specific fields: supports_multimodal, vector_dimensions."""
embed = LLMModel.objects.create(
api=self.api,
name="text-embedding-3-large",
model_type="embedding",
context_window=8191,
vector_dimensions=3072,
supports_multimodal=False,
)
self.assertEqual(embed.vector_dimensions, 3072)
self.assertFalse(embed.supports_multimodal)
def test_get_system_embedding_model(self):
embed = LLMModel.objects.create(
api=self.api,
name="embed-model",
model_type="embedding",
context_window=8191,
is_system_embedding_model=True,
)
result = LLMModel.get_system_embedding_model()
self.assertEqual(result.pk, embed.pk)
def test_get_system_chat_model(self):
self.model.is_system_chat_model = True
self.model.save()
result = LLMModel.get_system_chat_model()
self.assertEqual(result.pk, self.model.pk)
def test_get_system_reranker_model(self):
reranker = LLMModel.objects.create(
api=self.api,
name="reranker-model",
model_type="reranker",
context_window=8192,
is_system_reranker_model=True,
)
result = LLMModel.get_system_reranker_model()
self.assertEqual(result.pk, reranker.pk)
def test_get_system_model_returns_none(self):
"""Returns None when no system model is configured."""
self.assertIsNone(LLMModel.get_system_embedding_model())
self.assertIsNone(LLMModel.get_system_chat_model())
self.assertIsNone(LLMModel.get_system_reranker_model())
class LLMUsageModelTest(TestCase):
"""Tests for the LLMUsage model."""
def setUp(self):
self.user = User.objects.create_user(username="testuser", password="testpass123")
self.api = LLMApi.objects.create(
name="Test API",
api_type="openai",
base_url="https://api.example.com/v1",
)
self.model = LLMModel.objects.create(
api=self.api,
name="gpt-4o",
model_type="chat",
context_window=128000,
input_cost_per_1k=Decimal("0.0025"),
output_cost_per_1k=Decimal("0.01"),
)
def test_cost_calculation(self):
"""Total cost is auto-calculated on save."""
usage = LLMUsage.objects.create(
user=self.user,
model=self.model,
input_tokens=1000,
output_tokens=500,
purpose="other",
)
# 1000/1000 * 0.0025 + 500/1000 * 0.01 = 0.0025 + 0.005 = 0.0075
self.assertAlmostEqual(float(usage.total_cost), 0.0075, places=4)
def test_cost_with_cached_tokens(self):
self.model.cached_cost_per_1k = Decimal("0.00125")
self.model.save()
usage = LLMUsage.objects.create(
user=self.user,
model=self.model,
input_tokens=1000,
output_tokens=500,
cached_tokens=2000,
purpose="responder",
)
# 0.0025 + 0.005 + 2000/1000 * 0.00125 = 0.0025 + 0.005 + 0.0025 = 0.01
self.assertAlmostEqual(float(usage.total_cost), 0.01, places=4)
def test_purpose_choices(self):
for purpose in ["responder", "reviewer", "embeddings", "search", "reranking", "multimodal_embed", "other"]:
usage = LLMUsage.objects.create(
user=self.user,
model=self.model,
input_tokens=100,
output_tokens=50,
purpose=purpose,
)
self.assertEqual(usage.purpose, purpose)
def test_protect_model_delete(self):
"""Deleting a model with usage records should raise ProtectedError."""
LLMUsage.objects.create(
user=self.user,
model=self.model,
input_tokens=100,
output_tokens=50,
)
from django.db.models import ProtectedError
with self.assertRaises(ProtectedError):
self.model.delete()
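The two cost tests above encode the pricing arithmetic that `LLMUsage.save()` is expected to perform. A standalone sketch of that arithmetic, inferred from the tests' inline comments rather than copied from the model:

```python
from decimal import Decimal

def usage_cost(input_tokens, output_tokens, cached_tokens,
               input_cost_per_1k, output_cost_per_1k,
               cached_cost_per_1k=Decimal("0")):
    """Per-1k-token pricing, as exercised by test_cost_calculation
    and test_cost_with_cached_tokens above."""
    return (Decimal(input_tokens) / 1000 * input_cost_per_1k
            + Decimal(output_tokens) / 1000 * output_cost_per_1k
            + Decimal(cached_tokens) / 1000 * cached_cost_per_1k)

# 1000/1000 * 0.0025 + 500/1000 * 0.01 = 0.0075
print(usage_cost(1000, 500, 0, Decimal("0.0025"), Decimal("0.01")))
# ...plus 2000/1000 * 0.00125 cached = 0.01 total
print(usage_cost(1000, 500, 2000, Decimal("0.0025"), Decimal("0.01"),
                 Decimal("0.00125")))
```

Using `Decimal` throughout avoids the binary-float rounding that the tests' `assertAlmostEqual(..., places=4)` tolerance is guarding against.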

View File

@@ -0,0 +1,162 @@
"""
Tests for LLM Manager views — FBV-based.
"""
from django.contrib.auth import get_user_model
from django.test import TestCase
from django.urls import reverse
from llm_manager.models import LLMApi, LLMModel
User = get_user_model()
class LLMDashboardViewTest(TestCase):
"""Tests for the LLM Manager dashboard."""
def setUp(self):
self.user = User.objects.create_user(username="testuser", password="testpass123")
self.client.login(username="testuser", password="testpass123")
def test_dashboard_requires_login(self):
self.client.logout()
resp = self.client.get(reverse("llm_manager:dashboard"))
self.assertEqual(resp.status_code, 302)
def test_dashboard_renders(self):
resp = self.client.get(reverse("llm_manager:dashboard"))
self.assertEqual(resp.status_code, 200)
self.assertContains(resp, "LLM Manager")
class LLMApiViewTest(TestCase):
"""Tests for LLMApi CRUD views."""
def setUp(self):
self.user = User.objects.create_user(username="testuser", password="testpass123")
self.client.login(username="testuser", password="testpass123")
self.api = LLMApi.objects.create(
name="Test API",
api_type="openai",
base_url="https://api.example.com/v1",
created_by=self.user,
)
def test_api_list(self):
resp = self.client.get(reverse("llm_manager:api_list"))
self.assertEqual(resp.status_code, 200)
self.assertContains(resp, "Test API")
def test_api_detail(self):
resp = self.client.get(reverse("llm_manager:api_detail", kwargs={"pk": self.api.pk}))
self.assertEqual(resp.status_code, 200)
self.assertContains(resp, "Test API")
def test_api_create_get(self):
resp = self.client.get(reverse("llm_manager:api_create"))
self.assertEqual(resp.status_code, 200)
def test_api_create_post(self):
resp = self.client.post(
reverse("llm_manager:api_create"),
{
"name": "New API",
"api_type": "ollama",
"base_url": "http://localhost:11434",
"is_active": True,
"supports_streaming": True,
"timeout_seconds": 60,
"max_retries": 3,
},
)
self.assertEqual(resp.status_code, 302)
self.assertTrue(LLMApi.objects.filter(name="New API").exists())
def test_api_edit(self):
resp = self.client.post(
reverse("llm_manager:api_edit", kwargs={"pk": self.api.pk}),
{
"name": "Updated API",
"api_type": "openai",
"base_url": "https://api.example.com/v2",
"is_active": True,
"supports_streaming": True,
"timeout_seconds": 30,
"max_retries": 5,
},
)
self.assertEqual(resp.status_code, 302)
self.api.refresh_from_db()
self.assertEqual(self.api.name, "Updated API")
def test_api_delete(self):
resp = self.client.post(reverse("llm_manager:api_delete", kwargs={"pk": self.api.pk}))
self.assertEqual(resp.status_code, 302)
self.assertFalse(LLMApi.objects.filter(pk=self.api.pk).exists())
class LLMModelViewTest(TestCase):
"""Tests for LLMModel CRUD views."""
def setUp(self):
self.user = User.objects.create_user(username="testuser", password="testpass123")
self.client.login(username="testuser", password="testpass123")
self.api = LLMApi.objects.create(
name="Test API",
api_type="openai",
base_url="https://api.example.com/v1",
)
self.model = LLMModel.objects.create(
api=self.api,
name="gpt-4o",
model_type="chat",
context_window=128000,
)
def test_model_list(self):
resp = self.client.get(reverse("llm_manager:model_list"))
self.assertEqual(resp.status_code, 200)
self.assertContains(resp, "gpt-4o")
def test_model_list_filter_by_api(self):
resp = self.client.get(reverse("llm_manager:model_list") + f"?api={self.api.pk}")
self.assertEqual(resp.status_code, 200)
self.assertContains(resp, "gpt-4o")
def test_model_detail(self):
resp = self.client.get(reverse("llm_manager:model_detail", kwargs={"pk": self.model.pk}))
self.assertEqual(resp.status_code, 200)
self.assertContains(resp, "gpt-4o")
def test_model_create(self):
resp = self.client.post(
reverse("llm_manager:model_create"),
{
"api": str(self.api.pk),
"name": "gpt-4o-mini",
"model_type": "chat",
"context_window": 128000,
"input_cost_per_1k": "0.000150",
"output_cost_per_1k": "0.000600",
"is_active": True,
},
)
self.assertEqual(resp.status_code, 302)
self.assertTrue(LLMModel.objects.filter(name="gpt-4o-mini").exists())
def test_model_delete(self):
resp = self.client.post(reverse("llm_manager:model_delete", kwargs={"pk": self.model.pk}))
self.assertEqual(resp.status_code, 302)
self.assertFalse(LLMModel.objects.filter(pk=self.model.pk).exists())
class UsageListViewTest(TestCase):
"""Tests for the usage list view."""
def setUp(self):
self.user = User.objects.create_user(username="testuser", password="testpass123")
self.client.login(username="testuser", password="testpass123")
def test_usage_list(self):
resp = self.client.get(reverse("llm_manager:usage_list"))
self.assertEqual(resp.status_code, 200)

View File

@@ -0,0 +1,28 @@
"""
URL patterns for LLM Manager — FBVs per Red Panda Standards.
"""
from django.urls import path
from . import views
app_name = "llm_manager"
urlpatterns = [
path("", views.dashboard, name="dashboard"),
# APIs
path("apis/", views.api_list, name="api_list"),
path("apis/create/", views.api_create, name="api_create"),
path("apis/<uuid:pk>/", views.api_detail, name="api_detail"),
path("apis/<uuid:pk>/edit/", views.api_edit, name="api_edit"),
path("apis/<uuid:pk>/delete/", views.api_delete, name="api_delete"),
path("apis/<uuid:pk>/test/", views.api_test, name="api_test"),
# Models
path("models/", views.model_list, name="model_list"),
path("models/create/", views.model_create, name="model_create"),
path("models/<uuid:pk>/", views.model_detail, name="model_detail"),
path("models/<uuid:pk>/edit/", views.model_edit, name="model_edit"),
path("models/<uuid:pk>/delete/", views.model_delete, name="model_delete"),
# Usage
path("usage/", views.usage_list, name="usage_list"),
]
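The `<uuid:pk>` converters above match the canonical 36-character UUID string form — the same form the earlier `len(str(self.api.pk)) == 36` assertion checks. A quick standalone illustration (plain Python, no Django required):

```python
import uuid

pk = uuid.uuid4()
canonical = str(pk)

# 32 hex digits + 4 hyphens = 36 characters; this is the shape
# Django's <uuid:pk> path converter matches in incoming URLs.
assert len(canonical) == 36
# The string form round-trips losslessly back to the UUID object.
assert uuid.UUID(canonical) == pk
print(canonical)
```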

Some files were not shown because too many files have changed in this diff.