Add Themis application with custom widgets, views, and utilities

- Implemented custom form widgets for date, time, and datetime fields with DaisyUI styling.
- Created utility functions for formatting dates, times, and numbers according to user preferences.
- Developed views for profile settings, API key management, and notifications, including health check endpoints.
- Added URL configurations for Themis tests and main application routes.
- Established test cases for custom widgets to ensure proper functionality and integration.
- Defined project metadata and dependencies in pyproject.toml for package management.
2026-03-21 02:00:18 +00:00
parent e99346d014
commit 99bdb4ac92
351 changed files with 65123 additions and 2 deletions

docs/PHASE_1_FOUNDATION.md Normal file

@@ -0,0 +1,254 @@
# Phase 1: Foundation
## Objective
Establish the project skeleton, Neo4j data model, Django integration, and content-type system. At the end of this phase, you can create libraries, collections, and items via Django admin and the Neo4j graph is populated with the correct node/relationship structure.
## Deliverables
### 1. Django Project Skeleton
- Rename configuration module from `mnemosyne/mnemosyne/` to `mnemosyne/config/` per Red Panda Standards
- Create `pyproject.toml` at repo root with floor-pinned dependencies
- Create `.env` / `.env.example` for environment variables (never commit `.env`)
- Use a single `settings.py`, configured from `.env` via `django-environ`
- Configure dual-database: PostgreSQL (Django auth/config) + Neo4j (content graph)
- Install and configure `django-neomodel` for Neo4j OGM integration
- Configure `djangorestframework` for API
- Configure Celery + RabbitMQ (Async Task pattern)
- Configure S3 storage backend via Incus buckets (MinIO-backed, Terraform-provisioned)
- Configure structured logging for Loki integration via Alloy
### 2. Django Apps
| App | Purpose | Database |
|-----|---------|----------|
| `themis` (installed) | User profiles, preferences, API key management, navigation, notifications | PostgreSQL |
| `library/` | Libraries, Collections, Items, Chunks, Concepts | Neo4j (neomodel) |
| `llm_manager/` | LLM API/model config, usage tracking | PostgreSQL (ported from Spelunker) |
> **Note:** Themis replaces `core/`. User profiles, timezone preferences, theme management, API key storage (encrypted, Fernet), and standard navigation are all provided by Themis. No separate `core/` app is needed. If SSO (Casdoor) or Organization models are required in future, they will be added as separate apps following the SSO and Organization patterns.
### 3. Neo4j Graph Model (neomodel)
```python
# library/models.py
from neomodel import (
    ArrayProperty, DateTimeProperty, FloatProperty, IntegerProperty,
    JSONProperty, RelationshipTo, StringProperty, StructuredNode,
    StructuredRel, UniqueIdProperty,
)


# Placeholder relationship models: their properties are not specified in this document.
class ReferencesRel(StructuredRel):
    """Item -[REFERENCES]-> Concept."""


class RelatedToRel(StructuredRel):
    """Item -[RELATED_TO]-> Item."""


class Library(StructuredNode):
    uid = UniqueIdProperty()
    name = StringProperty(unique_index=True, required=True)
    library_type = StringProperty(required=True)  # fiction, technical, music, film, art, journal
    description = StringProperty(default='')
    # Content-type configuration (stored as JSON strings)
    chunking_config = JSONProperty(default={})
    embedding_instruction = StringProperty(default='')
    reranker_instruction = StringProperty(default='')
    llm_context_prompt = StringProperty(default='')
    created_at = DateTimeProperty(default_now=True)
    collections = RelationshipTo('Collection', 'CONTAINS')


class Collection(StructuredNode):
    uid = UniqueIdProperty()
    name = StringProperty(required=True)
    description = StringProperty(default='')
    metadata = JSONProperty(default={})
    created_at = DateTimeProperty(default_now=True)
    items = RelationshipTo('Item', 'CONTAINS')
    library = RelationshipTo('Library', 'BELONGS_TO')


class Item(StructuredNode):
    uid = UniqueIdProperty()
    title = StringProperty(required=True)
    item_type = StringProperty(default='')
    s3_key = StringProperty(default='')
    content_hash = StringProperty(index=True)
    file_type = StringProperty(default='')
    file_size = IntegerProperty(default=0)
    metadata = JSONProperty(default={})
    created_at = DateTimeProperty(default_now=True)
    updated_at = DateTimeProperty(default_now=True)
    chunks = RelationshipTo('Chunk', 'HAS_CHUNK')
    images = RelationshipTo('Image', 'HAS_IMAGE')
    concepts = RelationshipTo('Concept', 'REFERENCES', model=ReferencesRel)
    related_items = RelationshipTo('Item', 'RELATED_TO', model=RelatedToRel)


class Chunk(StructuredNode):
    uid = UniqueIdProperty()
    chunk_index = IntegerProperty(required=True)
    chunk_s3_key = StringProperty(required=True)
    chunk_size = IntegerProperty(default=0)
    text_preview = StringProperty(default='')  # First 500 chars for full-text index
    embedding = ArrayProperty(FloatProperty())  # 4096d vector
    created_at = DateTimeProperty(default_now=True)
    mentions = RelationshipTo('Concept', 'MENTIONS')


class Concept(StructuredNode):
    uid = UniqueIdProperty()
    name = StringProperty(unique_index=True, required=True)
    concept_type = StringProperty(default='')  # person, place, topic, technique, theme
    embedding = ArrayProperty(FloatProperty())  # 4096d vector
    related_concepts = RelationshipTo('Concept', 'RELATED_TO')


class Image(StructuredNode):
    uid = UniqueIdProperty()
    s3_key = StringProperty(required=True)
    image_type = StringProperty(default='')  # cover, diagram, artwork, still, photo
    description = StringProperty(default='')
    metadata = JSONProperty(default={})
    created_at = DateTimeProperty(default_now=True)
    embeddings = RelationshipTo('ImageEmbedding', 'HAS_EMBEDDING')


class ImageEmbedding(StructuredNode):
    uid = UniqueIdProperty()
    embedding = ArrayProperty(FloatProperty())  # 4096d multimodal vector
    created_at = DateTimeProperty(default_now=True)
```
### 4. Neo4j Index Setup
Management command: `python manage.py setup_neo4j_indexes`
Creates vector indexes (4096d cosine), full-text indexes, and constraint indexes.
### 5. Content-Type System
Default library type configurations loaded via management command (`python manage.py load_library_types`). A management command is preferred over fixtures because these configurations will evolve across releases, and the command can be re-run idempotently to update defaults without overwriting per-library customizations.
Default configurations:
| Library Type | Chunking Strategy | Embedding Instruction | LLM Context |
|-------------|-------------------|----------------------|-------------|
| fiction | chapter_aware | narrative retrieval | "Excerpts from fiction..." |
| technical | section_aware | procedural retrieval | "Excerpts from technical docs..." |
| music | song_level | music discovery | "Song lyrics and metadata..." |
| film | scene_level | cinematic retrieval | "Film content..." |
| art | description_level | visual/stylistic retrieval | "Artwork descriptions..." |
| journal | entry_level | temporal/reflective retrieval | "Personal journal entries..." |
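The idempotent re-run described above can be sketched as a small merge step. This is only an illustration under stated assumptions: `DEFAULT_LIBRARY_TYPES`, `apply_defaults`, the `strategy` key, and the rule "an empty or missing field counts as not customized" are all hypothetical, and the values are copied from the summary table rather than being the real instructions.

```python
# Hypothetical defaults; values are the table's summaries, not the real prompts.
DEFAULT_LIBRARY_TYPES = {
    "fiction": {
        "chunking_config": {"strategy": "chapter_aware"},
        "embedding_instruction": "narrative retrieval",
        "llm_context_prompt": "Excerpts from fiction...",
    },
    "technical": {
        "chunking_config": {"strategy": "section_aware"},
        "embedding_instruction": "procedural retrieval",
        "llm_context_prompt": "Excerpts from technical docs...",
    },
}


def apply_defaults(existing: dict, defaults: dict) -> dict:
    """Return a copy of `existing` with empty or missing fields filled from `defaults`.

    Assumption: a non-empty value means the library was customized and must be kept.
    """
    merged = dict(existing)
    for field, value in defaults.items():
        if not merged.get(field):  # missing, '' or {} is treated as "not customized"
            merged[field] = value
    return merged


library = {"embedding_instruction": "my custom instruction", "llm_context_prompt": ""}
updated = apply_defaults(library, DEFAULT_LIBRARY_TYPES["fiction"])
```

Running `apply_defaults` a second time on its own output changes nothing, which is the property the management command relies on.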
### 6. Admin & Management UI
`django-neomodel`'s admin support is limited — `StructuredNode` models don't participate in Django's ORM, so standard `ModelAdmin`, filters, search, and inlines don't work. Instead:
- **Custom admin views** for Library, Collection, and Item CRUD using Cypher/neomodel queries, rendered in Django admin's template structure
- **DRF management API** (`/api/v1/library/`, `/api/v1/collection/`, `/api/v1/item/`) for programmatic access and future frontend consumption
- Library CRUD includes content-type configuration editing
- Collection/Item views support filtering by library, type, and date
- All admin views extend `themis/base.html` for consistent navigation
### 7. LLM Manager (Port from Spelunker)
Copy and adapt `llm_manager/` app from Spelunker:
- `LLMApi` model (OpenAI-compatible API endpoints)
- `LLMModel` model (with new `reranker` and `multimodal_embed` model types)
- `LLMUsage` tracking
- **API key storage uses Themis `UserAPIKey`** — LLM Manager does not implement its own encrypted key storage. API credentials for LLM providers are stored via Themis's Fernet-encrypted `UserAPIKey` model with `key_type='api'` and appropriate `service_name` (e.g., "OpenAI", "Arke"). `LLMApi` references credentials by service name lookup against the requesting user's Themis keys.
Schema additions to Spelunker's `LLMModel`:
| Field | Change | Purpose |
|-------|--------|---------|
| `model_type` | Add choices: `reranker`, `multimodal_embed` | Support Qwen3-VL reranker and embedding models |
| `supports_multimodal` | New `BooleanField` | Flag models that accept image+text input |
| `vector_dimensions` | New `IntegerField` | Embedding output dimensions (e.g., 4096) |
### 8. Infrastructure Wiring (Ouranos)
All connections follow Ouranos DNS conventions — use `.incus` hostnames, never hardcode IPs.
| Service | Host | Connection | Settings Variable |
|---------|------|------------|-------------------|
| PostgreSQL | `portia.incus:5432` | Database `mnemosyne` (must be provisioned) | `DATABASE_URL` |
| Neo4j (Bolt) | `ariel.incus:25554` | Neo4j 5.26.0 | `NEOMODEL_NEO4J_BOLT_URL` |
| Neo4j (HTTP) | `ariel.incus:25584` | Browser/API access | — |
| RabbitMQ | `oberon.incus:5672` | Message broker | `CELERY_BROKER_URL` |
| S3 (Incus) | Terraform-provisioned Incus bucket | MinIO-backed object storage | `AWS_S3_ENDPOINT_URL`, `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_STORAGE_BUCKET_NAME` |
| Arke LLM Proxy | `sycorax.incus:25540` | LLM API routing | Configured per `LLMApi` record |
| SMTP (dev) | `oberon.incus:22025` | smtp4dev test server | `EMAIL_HOST` |
| Loki (logs) | `prospero.incus:3100` | Via Alloy agent (host-level, not app-level) | — |
| Casdoor SSO | `titania.incus:22081` | Future: SSO pattern | — |
**Terraform provisioning required before Phase 1 deployment:**
- PostgreSQL database `mnemosyne` on Portia
- Incus S3 bucket for Mnemosyne content storage
- HAProxy route: `mnemosyne.ouranos.helu.ca` → `puck.incus:<port>` (port TBD, assign next available in 22xxx range)
**Development environment (local):**
- PostgreSQL for Django ORM on `portia.incus`
- Local Neo4j instance or `ariel.incus` via SSH tunnel
- `django.core.files.storage.FileSystemStorage` in place of S3 (tests/dev)
- `CELERY_TASK_ALWAYS_EAGER=True` for synchronous task execution
### 9. Testing Strategy
Follows Red Panda Standards: Django `TestCase`, separate test files per module.
| Test File | Scope |
|-----------|-------|
| `library/tests/test_models.py` | Neo4j node creation, relationships, property validation |
| `library/tests/test_content_types.py` | `load_library_types` command, configuration retrieval per library |
| `library/tests/test_indexes.py` | `setup_neo4j_indexes` command execution |
| `library/tests/test_api.py` | DRF endpoints for Library/Collection/Item CRUD |
| `library/tests/test_admin_views.py` | Custom admin views render and submit correctly |
| `llm_manager/tests/test_models.py` | LLMApi, LLMModel creation, new model types |
| `llm_manager/tests/test_api.py` | LLM Manager API endpoints |
**Neo4j test strategy:**
- Tests use a dedicated Neo4j test database (separate from development/production)
- `NEOMODEL_NEO4J_BOLT_URL` overridden in test settings to point to test database
- Each test class clears its nodes in `setUp` / `tearDown` using `neomodel.clear_neo4j_database()`
- CI/CD (Gitea Runner on Puck) uses a Docker Neo4j instance for isolated test runs
- For local development without Neo4j, tests that require Neo4j are skipped via `@unittest.skipUnless(neo4j_available(), "Neo4j not available")`
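The `neo4j_available()` helper referenced in the skip decorator is not defined in this document. One possible sketch, assuming a plain TCP reachability check against the Bolt port is sufficient (it does not verify credentials or database state), with a hypothetical default URL:

```python
import socket
from urllib.parse import urlparse


def neo4j_available(bolt_url: str = "bolt://localhost:7687", timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to the Bolt host and port succeeds.

    Assumption: reachability is a good enough signal for skipping tests.
    """
    parsed = urlparse(bolt_url)
    host = parsed.hostname or "localhost"
    port = parsed.port or 7687
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False
```

In test settings the URL would presumably come from `NEOMODEL_NEO4J_BOLT_URL` rather than the default shown here.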
## Dependencies
```toml
# pyproject.toml — floor-pinned with ceiling per Red Panda Standards
dependencies = [
"Django>=5.2,<6.0",
"djangorestframework>=3.14,<4.0",
"django-neomodel>=0.1,<1.0",
"neomodel>=5.3,<6.0",
"neo4j>=5.0,<6.0",
"celery>=5.3,<6.0",
"django-storages[boto3]>=1.14,<2.0",
"django-environ>=0.11,<1.0",
"psycopg[binary]>=3.1,<4.0",
"dj-database-url>=2.1,<3.0",
"shortuuid>=1.0,<2.0",
"gunicorn>=21.0,<24.0",
"cryptography>=41.0,<45.0",
"flower>=2.0,<3.0",
"pymemcache>=4.0,<5.0",
"django-heluca-themis",
]
```
## Success Criteria
- [ ] Config module renamed to `config/`, `pyproject.toml` at repo root with floor-pinned deps
- [ ] Settings load from environment variables via `django-environ` (`.env.example` provided)
- [ ] Django project runs with dual PostgreSQL + Neo4j databases
- [ ] Can create Library → Collection → Item through custom admin views
- [ ] DRF API endpoints return Library/Collection/Item data
- [ ] Neo4j graph shows correct node types and relationships
- [ ] Content-type configurations loaded via `load_library_types` and retrievable per library
- [ ] LLM Manager ported from Spelunker; uses Themis `UserAPIKey` for credential storage
- [ ] S3 storage configured against Incus bucket (Terraform-provisioned) and tested
- [ ] Celery worker connects to RabbitMQ on Oberon
- [ ] Structured logging configured (JSON format, compatible with Loki/Alloy)
- [ ] Tests pass for all Phase 1 apps (library, llm_manager)
- [ ] HAProxy route provisioned: `mnemosyne.ouranos.helu.ca`


@@ -0,0 +1,498 @@
# Phase 2: Embedding Pipeline
## Objective
Build the complete document ingestion and embedding pipeline: upload content → parse (text + images) → chunk (content-type-aware) → embed via configurable model → store vectors in Neo4j → extract concepts for knowledge graph.
## Heritage
The embedding pipeline adapts proven patterns from [Spelunker](https://git.helu.ca/r/spelunker)'s `rag/services/embeddings.py` — semantic chunking, batch embedding, S3 chunk storage, and progress tracking — enhanced with multimodal capabilities, knowledge graph relationships, and content-type awareness.
## Architecture Overview
```
Upload (API/Admin)
→ S3 Storage (original file)
→ Document Parsing (PyMuPDF — text + images)
→ Content-Type-Aware Chunking (semantic-text-splitter)
→ Text Embedding (system embedding model via LLM Manager)
→ Image Embedding (multimodal model, if available)
→ Neo4j Graph Storage (Chunk nodes, Image nodes, vectors)
→ Concept Extraction (system chat model)
→ Knowledge Graph (Concept nodes, MENTIONS/REFERENCES edges)
```
## Deliverables
### 1. Document Parsing Service (`library/services/parsers.py`)
**Primary parser: PyMuPDF** — a single library handling all document formats with unified text + image extraction.
#### Supported Formats
| Format | Extensions | Text Extraction | Image Extraction |
|--------|-----------|----------------|-----------------|
| PDF | `.pdf` | Layout-preserving text | Embedded images, diagrams |
| EPUB | `.epub` | Chapter-structured HTML | Cover art, illustrations |
| DOCX | `.docx` | Via HTML conversion | Inline images, diagrams |
| PPTX | `.pptx` | Via HTML conversion | Slide images, charts |
| XLSX | `.xlsx` | Via HTML conversion | Embedded charts |
| XPS | `.xps` | Native | Native |
| MOBI | `.mobi` | Native | Native |
| FB2 | `.fb2` | Native | Native |
| CBZ | `.cbz` | Native | Native (comic pages) |
| Plain text | `.txt`, `.md` | Direct read | N/A |
| HTML | `.html`, `.htm` | PyMuPDF or direct | Inline images |
| Images | `.jpg`, `.png`, etc. | N/A (OCR future) | The image itself |
#### Text Sanitization
Ported from Spelunker's `text_utils.py`:
- Remove null bytes and control characters
- Remove zero-width characters
- Normalize Unicode to NFC
- Replace invalid UTF-8 sequences
- Clean PDF ligatures and artifacts
- Normalize whitespace
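The sanitization steps listed above can be sketched with the standard library alone. This is not Spelunker's `text_utils.py`; the function name `sanitize_text`, the ligature table, and the exact character sets are assumptions for illustration.

```python
import re
import unicodedata
from typing import Union

# Assumed character sets; the real text_utils.py may cover more cases.
_CONTROL = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")  # keeps \t, \n, \r
_ZERO_WIDTH = dict.fromkeys(map(ord, "\u200b\u200c\u200d\u2060\ufeff"))
_LIGATURES = {"\ufb00": "ff", "\ufb01": "fi", "\ufb02": "fl", "\ufb03": "ffi", "\ufb04": "ffl"}


def sanitize_text(raw: Union[bytes, str]) -> str:
    """Apply the sanitization steps in order and return clean text."""
    if isinstance(raw, bytes):
        # Invalid UTF-8 sequences become U+FFFD instead of raising.
        raw = raw.decode("utf-8", errors="replace")
    text = _CONTROL.sub("", raw)          # null bytes and control characters
    text = text.translate(_ZERO_WIDTH)    # zero-width characters
    for ligature, plain in _LIGATURES.items():
        text = text.replace(ligature, plain)  # common PDF ligatures
    text = unicodedata.normalize("NFC", text)
    text = re.sub(r"[ \t]+", " ", text)   # collapse runs of spaces and tabs
    text = re.sub(r"\n{3,}", "\n\n", text)  # at most one blank line
    return text.strip()
```

Ligatures are replaced explicitly because NFC normalization leaves them intact; NFKC would expand them but also alters other compatibility characters.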
#### Image Extraction
For each document page/section, extract embedded images via `page.get_images()` → `doc.extract_image(xref)`:
- Raw image bytes (PNG/JPEG)
- Dimensions (width × height)
- Source page/position for chunk-image association
- Store in S3: `images/{item_uid}/{image_index}.{ext}`
#### Parse Result Structure
```python
from dataclasses import dataclass


@dataclass
class TextBlock:
    text: str
    page: int
    metadata: dict  # {heading_level, section_name, etc.}


@dataclass
class ExtractedImage:
    data: bytes
    ext: str  # png, jpg, etc.
    width: int
    height: int
    source_page: int
    source_index: int


@dataclass
class ParseResult:
    text_blocks: list[TextBlock]
    images: list[ExtractedImage]
    metadata: dict  # {page_count, title, author, etc.}
    file_type: str
```
### 2. Content-Type-Aware Chunking Service (`library/services/chunker.py`)
Uses `semantic-text-splitter` with a HuggingFace tokenizer (proven in Spelunker).
#### Strategy Dispatch
Based on `Library.chunking_config`:
| Strategy | Library Type | Boundary Markers | Chunk Size | Overlap |
|----------|-------------|-----------------|-----------|---------|
| `chapter_aware` | Fiction | chapter, scene, paragraph | 1024 | 128 |
| `section_aware` | Technical | section, subsection, code_block, list | 512 | 64 |
| `song_level` | Music | song, verse, chorus | 512 | 32 |
| `scene_level` | Film | scene, act, sequence | 768 | 64 |
| `description_level` | Art | artwork, description, analysis | 512 | 32 |
| `entry_level` | Journal | entry, date, paragraph | 512 | 32 |
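A minimal sketch of how the dispatch in the table might be resolved from `Library.chunking_config`. The keys `strategy`, `chunk_size`, and `chunk_overlap`, the fallback strategy, and the names `ChunkingParams` and `resolve_chunking` are assumptions; only the sizes and overlaps come from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChunkingParams:
    strategy: str
    chunk_size: int     # presumably tokens, given the tokenizer-based splitter
    chunk_overlap: int


# (chunk size, overlap) per strategy, copied from the table above.
STRATEGY_DEFAULTS = {
    "chapter_aware": (1024, 128),
    "section_aware": (512, 64),
    "song_level": (512, 32),
    "scene_level": (768, 64),
    "description_level": (512, 32),
    "entry_level": (512, 32),
}


def resolve_chunking(chunking_config: dict) -> ChunkingParams:
    """Combine table defaults with any per-library overrides."""
    strategy = chunking_config.get("strategy", "section_aware")  # assumed fallback
    if strategy not in STRATEGY_DEFAULTS:
        raise ValueError(f"Unknown chunking strategy: {strategy!r}")
    size, overlap = STRATEGY_DEFAULTS[strategy]
    return ChunkingParams(
        strategy=strategy,
        chunk_size=int(chunking_config.get("chunk_size", size)),
        chunk_overlap=int(chunking_config.get("chunk_overlap", overlap)),
    )


params = resolve_chunking({"strategy": "scene_level", "chunk_overlap": 96})
```

Keeping the defaults in one mapping means a per-library override only needs to store the values that differ.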
#### Chunk-Image Association
Track which images appeared near which text chunks:
- PDF: image bounding boxes on specific pages
- DOCX/PPTX: images associated with slides/sections
- EPUB: images referenced from specific chapters
Creates `Chunk -[HAS_NEARBY_IMAGE]-> Image` relationships with proximity metadata.
#### Chunk Storage
- Chunk text stored in S3: `chunks/{item_uid}/chunk_{index}.txt`
- `text_preview` (first 500 chars) stored on Chunk node for full-text indexing
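The storage conventions above reduce to two small helpers. The function names are hypothetical; the key layout and the 500-character limit are taken from the list.

```python
def chunk_s3_key(item_uid: str, index: int) -> str:
    """S3 key for a chunk's full text: chunks/{item_uid}/chunk_{index}.txt."""
    return f"chunks/{item_uid}/chunk_{index}.txt"


def make_text_preview(chunk_text: str, limit: int = 500) -> str:
    """First `limit` characters, stored on the Chunk node for full-text indexing."""
    return chunk_text[:limit]


key = chunk_s3_key("abc123", 7)
preview = make_text_preview("x" * 1200)
```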
### 3. Embedding Client (`library/services/embedding_client.py`)
Multi-backend embedding client dispatching by `LLMApi.api_type`.
#### Backend Support
| API Type | Protocol | Auth | Batch Support |
|----------|---------|------|---------------|
| `openai` | HTTP POST `/embeddings` | API key header | Native batch |
| `vllm` | HTTP POST `/embeddings` | API key header | Native batch |
| `llama-cpp` | HTTP POST `/embeddings` | API key header | Native batch |
| `ollama` | HTTP POST `/embeddings` | None | Native batch |
| `bedrock` | HTTP POST `/model/{id}/invoke` | Bearer token | Client-side loop |
#### Bedrock Integration
Uses Amazon Bedrock API keys (Bearer token auth) — no boto3 SDK required:
```
POST https://bedrock-runtime.{region}.amazonaws.com/model/{model_id}/invoke
Authorization: Bearer {bedrock_api_key}
Content-Type: application/json
{"inputText": "text to embed", "dimensions": 1024, "normalize": true}
→ {"embedding": [float, ...], "inputTextTokenCount": 42}
```
**LLMApi setup for Bedrock embeddings:**
- `api_type`: `"bedrock"`
- `base_url`: `https://bedrock-runtime.us-east-1.amazonaws.com`
- `api_key`: Bedrock API key (encrypted)
**LLMApi setup for Bedrock chat (Claude, etc.):**
- `api_type`: `"openai"` (Mantle endpoint is OpenAI-compatible)
- `base_url`: `https://bedrock-mantle.us-east-1.api.aws/v1`
- `api_key`: Same Bedrock API key
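As a rough sketch of the Bearer-token call shown above, the request can be built with the standard library only. The helper names and parameters are hypothetical, the payload fields mirror the example request, and nothing here is sent over the network.

```python
import json
import urllib.parse
import urllib.request


def build_bedrock_embed_request(base_url: str, model_id: str, api_key: str,
                                text: str, dimensions: int = 1024) -> urllib.request.Request:
    """Build (but do not send) the POST /model/{model_id}/invoke request."""
    # Model IDs may contain characters such as ':' that need percent-encoding.
    encoded_id = urllib.parse.quote(model_id, safe="")
    url = f"{base_url.rstrip('/')}/model/{encoded_id}/invoke"
    body = json.dumps(
        {"inputText": text, "dimensions": dimensions, "normalize": True}
    ).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def parse_embedding(response_body: bytes) -> list:
    """Extract the vector from a response shaped like the example above."""
    return json.loads(response_body)["embedding"]
```

Because Bedrock takes one input per call in this shape, a client-side loop over chunks would wrap `build_bedrock_embed_request`, which matches the "Client-side loop" entry in the backend table.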
#### Embedding Instruction Prefix
Before embedding, prepend the library's `embedding_instruction` to each chunk:
```
"{embedding_instruction}\n\n{chunk_text}"
```
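The prefix format above could be applied per batch as follows. The function names are hypothetical, and returning the bare chunk when a library has no instruction is an assumption, since the document does not say what happens in that case.

```python
def with_instruction(embedding_instruction: str, chunk_text: str) -> str:
    """Prepend the library's instruction, separated by a blank line."""
    instruction = embedding_instruction.strip()
    if not instruction:
        return chunk_text  # assumed behavior for libraries without an instruction
    return f"{instruction}\n\n{chunk_text}"


def prepare_batch(embedding_instruction: str, chunks: list) -> list:
    """Build the list of strings sent to the embedding endpoint."""
    return [with_instruction(embedding_instruction, chunk) for chunk in chunks]


batch = prepare_batch("narrative retrieval", ["Chapter one.", "Chapter two."])
```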
#### Image Embedding
For multimodal models (`model.supports_multimodal`):
- Send base64-encoded image to the embedding endpoint
- Create `ImageEmbedding` node with the resulting vector
- If no multimodal model available, skip (images stored but not embedded)
#### Model Matching
Track embedded model by **name** (not UUID). Multiple APIs can serve the same model — matching by name allows provider switching without re-embedding.
### 4. Pipeline Orchestrator (`library/services/pipeline.py`)
Coordinates the full flow: parse → chunk → embed → store → graph.
#### Pipeline Stages
1. **Parse**: Extract text blocks + images from document
2. **Chunk**: Split text using content-type-aware strategy
3. **Store chunks**: S3 + Chunk nodes in Neo4j
4. **Embed text**: Generate vectors for all chunks
5. **Store images**: S3 + Image nodes in Neo4j
6. **Embed images**: Multimodal vectors (if available)
7. **Extract concepts**: Named entities from chunk text (via system chat model)
8. **Build graph**: Create Concept nodes, MENTIONS/REFERENCES edges
#### Idempotency
- Check `Item.content_hash` — skip if already processed with same hash
- Re-embedding deletes existing Chunk/Image nodes before re-processing
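The hash check above can be sketched as a pure function. SHA-256, the `force` flag, and the function names are assumptions; the status values follow the `embedding_status` field described under Model Changes.

```python
import hashlib


def compute_content_hash(data: bytes) -> str:
    """Hex digest of the file contents (SHA-256 is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def should_process(data: bytes, stored_hash: str, embedding_status: str,
                   force: bool = False) -> bool:
    """Skip only when the content is unchanged AND the last run completed."""
    if force:  # e.g. an explicit re-embed request
        return True
    unchanged = stored_hash == compute_content_hash(data)
    return not (unchanged and embedding_status == "completed")


payload = b"example document bytes"
stored = compute_content_hash(payload)
```

Requiring a `completed` status as well as a matching hash lets a failed or interrupted run be retried without changing the file.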
#### Dimension Compatibility
- Validate that the system embedding model's `vector_dimensions` matches the Neo4j vector index dimensions
- Warn at embed time if mismatch detected
### 5. Concept Extraction (`library/services/concepts.py`)
Uses the system chat model for LLM-based named entity recognition.
- Extract: people, places, topics, techniques, themes
- Create/update `Concept` nodes (deduplicated by name via unique_index)
- Connect: `Chunk -[MENTIONS]-> Concept`, `Item -[REFERENCES]-> Concept`
- Embed concept names for vector search
- If no system chat model configured, concept extraction is skipped
### 6. Celery Tasks (`library/tasks.py`)
All tasks pass IDs (not model instances) per Red Panda Standards.
| Task | Queue | Purpose |
|------|-------|---------|
| `embed_item(item_uid)` | `embedding` | Full pipeline for single item |
| `embed_collection(collection_uid)` | `batch` | All items in a collection |
| `embed_library(library_uid)` | `batch` | All items in a library |
| `batch_embed_items(item_uids)` | `batch` | Specific items |
| `reembed_item(item_uid)` | `embedding` | Delete + re-embed |
Tasks are idempotent, include retry logic, and track progress via Memcached: `library:task:{task_id}:progress`.
### 7. Prometheus Metrics (`library/metrics.py`)
Custom metrics for pipeline observability:
| Metric | Type | Labels | Purpose |
|--------|------|--------|---------|
| `mnemosyne_documents_parsed_total` | Counter | file_type, status | Parse throughput |
| `mnemosyne_document_parse_duration_seconds` | Histogram | file_type | Parse latency |
| `mnemosyne_images_extracted_total` | Counter | file_type | Image extraction volume |
| `mnemosyne_chunks_created_total` | Counter | library_type, strategy | Chunk throughput |
| `mnemosyne_chunk_size_tokens` | Histogram | — | Chunk size distribution |
| `mnemosyne_embeddings_generated_total` | Counter | model_name, api_type, content_type | Embedding throughput |
| `mnemosyne_embedding_batch_duration_seconds` | Histogram | model_name, api_type | API latency |
| `mnemosyne_embedding_api_errors_total` | Counter | model_name, api_type, error_type | API failures |
| `mnemosyne_embedding_tokens_total` | Counter | model_name | Token consumption |
| `mnemosyne_pipeline_items_total` | Counter | status | Pipeline throughput |
| `mnemosyne_pipeline_item_duration_seconds` | Histogram | — | End-to-end latency |
| `mnemosyne_pipeline_items_in_progress` | Gauge | — | Concurrent processing |
| `mnemosyne_concepts_extracted_total` | Counter | concept_type | Concept extraction volume |
### 8. Model Changes
#### Item Node — New Fields
| Field | Type | Purpose |
|-------|------|---------|
| `embedding_status` | StringProperty | pending / processing / completed / failed |
| `embedding_model_name` | StringProperty | Name of model that generated embeddings |
| `chunk_count` | IntegerProperty | Number of chunks created |
| `image_count` | IntegerProperty | Number of images extracted |
| `error_message` | StringProperty | Last error message (if failed) |
#### New Relationship Model
```python
class NearbyImageRel(StructuredRel):
    proximity = StringProperty(default="same_page")  # same_page, inline, same_slide, same_chapter
```
#### Chunk Node — New Relationship
```python
nearby_images = RelationshipTo('Image', 'HAS_NEARBY_IMAGE', model=NearbyImageRel)
```
#### LLMApi Model — New API Type
Add `("bedrock", "Amazon Bedrock")` to `api_type` choices.
### 9. API Enhancements
- `POST /api/v1/library/items/` — File upload with auto-trigger of `embed_item` task
- `POST /api/v1/library/items/<uid>/reembed/` — Re-embed endpoint
- `GET /api/v1/library/items/<uid>/status/` — Embedding status check
- Admin views: File upload field on item create, embedding status display
### 10. Management Commands
| Command | Purpose |
|---------|---------|
| `embed_item <uid>` | CLI embedding for testing |
| `embed_collection <uid>` | CLI batch embedding |
| `embedding_status` | Show embedding progress/statistics |
### 11. Dynamic Vector Index Dimensions
Update `setup_neo4j_indexes` to read dimensions from `LLMModel.get_system_embedding_model().vector_dimensions` instead of hardcoding 4096.
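A sketch of how the command might build the index statement once the dimension is read from the model record. The index name, label, and property below are illustrative; the statement follows Neo4j 5 `CREATE VECTOR INDEX` syntax with cosine similarity, as stated for the Phase 1 indexes.

```python
def vector_index_cypher(index_name: str, label: str, prop: str, dimensions: int) -> str:
    """Build a CREATE VECTOR INDEX statement for the given dimension.

    Identifiers are interpolated directly, so they must be trusted constants.
    """
    if not isinstance(dimensions, int) or dimensions <= 0:
        raise ValueError(f"vector dimensions must be a positive integer, got {dimensions!r}")
    return (
        f"CREATE VECTOR INDEX {index_name} IF NOT EXISTS "
        f"FOR (n:{label}) ON (n.{prop}) "
        "OPTIONS {indexConfig: {"
        f"`vector.dimensions`: {dimensions}, "
        "`vector.similarity_function`: 'cosine'}}"
    )


# Hypothetical index name; the dimension would come from the system embedding model.
statement = vector_index_cypher("chunk_embedding", "Chunk", "embedding", 1024)
```

Note that `IF NOT EXISTS` leaves an existing index untouched, so changing the model's dimensions would presumably require dropping and recreating the index.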
## Celery Workers & Scheduler
### Prerequisites
- RabbitMQ running on `oberon.incus:5672` with `mnemosyne` vhost and user
- `.env` configured with `CELERY_BROKER_URL=amqp://mnemosyne:password@oberon.incus:5672/mnemosyne`
- Virtual environment activated: `source ~/env/mnemosyne/bin/activate`
### Queues
Mnemosyne uses three Celery queues with task routing configured in `settings.py`:
| Queue | Tasks | Purpose | Recommended Concurrency |
|-------|-------|---------|------------------------|
| `celery` (default) | `llm_manager.validate_all_llm_apis`, `llm_manager.validate_single_api` | LLM API validation & model discovery | 2 |
| `embedding` | `library.tasks.embed_item`, `library.tasks.reembed_item` | Single-item embedding pipeline (GPU-bound) | 1 |
| `batch` | `library.tasks.embed_collection`, `library.tasks.embed_library`, `library.tasks.batch_embed_items` | Batch orchestration (dispatches to embedding queue) | 2 |
Task routing (`settings.py`):
```python
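CELERY_TASK_ROUTES = {
    # Explicit names: "embed_*" would also match embed_collection and miss reembed_item.
    "library.tasks.embed_item": {"queue": "embedding"},
    "library.tasks.reembed_item": {"queue": "embedding"},
    "library.tasks.embed_collection": {"queue": "batch"},
    "library.tasks.embed_library": {"queue": "batch"},
    "library.tasks.batch_embed_items": {"queue": "batch"},
}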
```
### Starting Workers
All commands run from the Django project root (`mnemosyne/`):
**Development — single worker, all queues:**
```bash
cd mnemosyne
celery -A mnemosyne worker -l info -Q celery,embedding,batch
```
**Development — eager mode (no worker needed):**
Set `CELERY_TASK_ALWAYS_EAGER=True` in `.env`. All tasks execute synchronously in the web process. Useful for debugging but does not test async behavior.
**Production — separate workers per queue:**
```bash
# Embedding worker (single concurrency — GPU is sequential)
celery -A mnemosyne worker \
-l info \
-Q embedding \
-c 1 \
-n embedding@%h \
--max-tasks-per-child=100
# Batch orchestration worker
celery -A mnemosyne worker \
-l info \
-Q batch \
-c 2 \
-n batch@%h
# Default queue worker (LLM API validation, etc.)
celery -A mnemosyne worker \
-l info \
-Q celery \
-c 2 \
-n default@%h
```
### Celery Beat (Periodic Scheduler)
Celery Beat runs scheduled tasks (e.g., periodic LLM API validation):
```bash
# File-based scheduler (simple, stores schedule in celerybeat-schedule file)
celery -A mnemosyne beat -l info
# Or with Django database scheduler (if django-celery-beat is installed)
celery -A mnemosyne beat -l info --scheduler django_celery_beat.schedulers:DatabaseScheduler
```
Example periodic task schedule (add to `settings.py` if needed):
```python
from celery.schedules import crontab
CELERY_BEAT_SCHEDULE = {
"validate-llm-apis-daily": {
"task": "llm_manager.validate_all_llm_apis",
"schedule": crontab(hour=6, minute=0), # Daily at 6 AM
},
}
```
### Flower (Task Monitoring)
[Flower](https://flower.readthedocs.io/) provides a real-time web UI for monitoring Celery workers and tasks:
```bash
celery -A mnemosyne flower --port=5555
```
Access at `http://localhost:5555`. Shows:
- Active/completed/failed tasks
- Worker status and resource usage
- Task execution times and retry counts
- Queue depths
### Reliability Configuration
The following settings are already configured in `settings.py`:
| Setting | Value | Purpose |
|---------|-------|---------|
| `CELERY_TASK_ACKS_LATE` | `True` | Acknowledge tasks after execution (not on receipt) — prevents task loss on worker crash |
| `CELERY_WORKER_PREFETCH_MULTIPLIER` | `1` | Workers fetch one task at a time — ensures fair distribution across workers |
| `CELERY_ACCEPT_CONTENT` | `["json"]` | Only accept JSON-serialized tasks |
| `CELERY_TASK_SERIALIZER` | `"json"` | Serialize task arguments as JSON |
### Task Progress Tracking
Embedding tasks report progress via Memcached using the key pattern:
```
library:task:{task_id}:progress → {"percent": 45, "message": "Embedded 12/27 chunks"}
```
Tasks also update Celery's native state:
```python
# Query task progress from Python
from celery.result import AsyncResult
result = AsyncResult(task_id)
result.state # "PROGRESS", "SUCCESS", "FAILURE"
result.info # {"percent": 45, "message": "..."}
```
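The Memcached side of progress reporting might look like the sketch below. It assumes `cache` is any object with a Django-style `set(key, value, timeout)` method; the helper names, the one-hour timeout, and truncating the percentage are illustrative choices.

```python
def progress_key(task_id: str) -> str:
    """Cache key pattern from the document: library:task:{task_id}:progress."""
    return f"library:task:{task_id}:progress"


def report_progress(cache, task_id: str, done: int, total: int,
                    noun: str = "chunks", timeout: int = 3600) -> dict:
    """Write a progress payload to the cache and return it."""
    percent = int(done * 100 / total) if total else 0
    payload = {"percent": percent, "message": f"Embedded {done}/{total} {noun}"}
    cache.set(progress_key(task_id), payload, timeout)
    return payload


class InMemoryCache:
    """Stand-in for django.core.cache.cache, for demonstration only."""

    def __init__(self):
        self.store = {}

    def set(self, key, value, timeout=None):
        self.store[key] = value


demo_cache = InMemoryCache()
demo_payload = report_progress(demo_cache, "abc-123", done=5, total=20)
```

A status endpoint can then read the same key without touching the Celery result backend.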
## Dependencies
```toml
# New additions to pyproject.toml
"PyMuPDF>=1.24,<2.0",
"pymupdf4llm>=0.0.17,<1.0",
"semantic-text-splitter>=0.20,<1.0",
"tokenizers>=0.20,<1.0",
"Pillow>=10.0,<12.0",
"django-prometheus>=2.3,<3.0",
```
### License Note
PyMuPDF is AGPL-3.0 licensed. Acceptable for self-hosted personal use. Commercial distribution would require Artifex's commercial license.
## File Structure
```
mnemosyne/library/
├── services/
│ ├── __init__.py
│ ├── parsers.py # PyMuPDF universal document parsing
│ ├── text_utils.py # Text sanitization (from Spelunker)
│ ├── chunker.py # Content-type-aware chunking
│ ├── embedding_client.py # Multi-backend embedding API client
│ ├── pipeline.py # Orchestration: parse → chunk → embed → graph
│ └── concepts.py # LLM-based concept extraction
├── metrics.py # Prometheus metrics definitions
├── tasks.py # Celery tasks for async embedding
├── management/commands/
│ ├── embed_item.py
│ ├── embed_collection.py
│ └── embedding_status.py
└── tests/
├── test_parsers.py
├── test_text_utils.py
├── test_chunker.py
├── test_embedding_client.py
├── test_pipeline.py
├── test_concepts.py
└── test_tasks.py
```
## Testing Strategy
All tests use Django `TestCase`. External services (LLM APIs, Neo4j) are mocked.
| Test File | Scope |
|-----------|-------|
| `test_parsers.py` | PyMuPDF parsing for each file type, image extraction, text sanitization |
| `test_text_utils.py` | Sanitization functions, PDF artifact cleaning, Unicode normalization |
| `test_chunker.py` | Content-type strategies, boundary detection, chunk-image association |
| `test_embedding_client.py` | OpenAI-compat + Bedrock backends (mocked HTTP), batch processing, usage tracking |
| `test_pipeline.py` | Full pipeline integration (mocked), S3 storage, idempotency |
| `test_concepts.py` | Concept extraction, deduplication, graph relationships |
| `test_tasks.py` | Celery tasks (eager mode), retry logic, error handling |
## Success Criteria
- [ ] Upload a document (PDF, EPUB, DOCX, PPTX, TXT) via API or admin → file stored in S3
- [ ] Images extracted from documents and stored as Image nodes in Neo4j
- [ ] Document automatically chunked using content-type-aware strategy
- [ ] Chunks embedded via system embedding model and vectors stored in Neo4j Chunk nodes
- [ ] Images embedded multimodally into ImageEmbedding nodes (when multimodal model available)
- [ ] Chunk-image proximity relationships established in graph
- [ ] Concepts extracted and graph populated with MENTIONS/REFERENCES relationships
- [ ] Neo4j vector indexes usable for similarity queries on stored embeddings
- [ ] Celery tasks handle async embedding with progress tracking
- [ ] Re-embedding works (delete old chunks, re-process)
- [ ] Content hash prevents redundant re-embedding
- [ ] Prometheus metrics exposed at `/metrics` for pipeline monitoring
- [ ] All tests pass with mocked LLM/embedding APIs
- [ ] Bedrock embedding works via Bearer token HTTP (no boto3)


@@ -0,0 +1,673 @@
# Async Task Pattern v1.0.0
Defines how Spelunker Django apps implement background task processing using Celery, RabbitMQ, Memcached, and Flower — covering fire-and-forget tasks, long-running batch jobs, signal-triggered tasks, and periodic scheduled tasks.
## 🐾 Red Panda Approval™
This pattern follows Red Panda Approval standards.
---
## Why a Pattern, Not a Shared Implementation
Long-running work in Spelunker spans multiple domains, each with distinct progress-tracking and state requirements:
- A `solution_library` document embedding task needs to update `review_status` on a `Document` and count vector chunks created.
- An `rfp_manager` batch job tracks per-question progress, per-question errors, and the Celery task ID on an `RFPBatchJob` record.
- An `llm_manager` API-validation task iterates over all active APIs and accumulates model sync statistics.
- A `solution_library` documentation-source sync task fires from a View, stores `celery_task_id` on a `SyncJob`, and reports incremental progress via a callback.
A single shared implementation would not fit all of these cases. Instead, this pattern defines:
- **Required task interface** — every task must have a namespaced name, a structured return dict, and structured logging.
- **Recommended job-tracking fields** — most tasks that represent a significant unit of work should have a corresponding DB job record.
- **Error handling conventions** — how to catch, log, and reflect failures back to the record.
- **Dispatch variants** — signal-triggered, admin action, view-triggered, and periodic (Beat).
- **Infrastructure conventions** — broker, result backend, serialization, and cache settings.
---
## Required Task Interface
Every Celery task in Spelunker **must** follow this shape and the rules in the table below:
```python
from celery import shared_task
import logging

logger = logging.getLogger(__name__)


@shared_task(name='<app_label>.<action_name>')
def my_task(primary_id: int, user_id: int = None) -> dict:
    """One-line description of what this task does."""
    try:
        # ... do work ...
        logger.info(f"Task succeeded for {primary_id}")
        return {'success': True, 'id': primary_id}
    except Exception as e:
        logger.error(
            f"Task failed for {primary_id}: {type(e).__name__}: {e}",
            extra={'id': primary_id, 'error': str(e)},
            exc_info=True,
        )
        return {'success': False, 'id': primary_id, 'error': str(e)}
```
| Requirement | Rule |
|---|---|
| `name` | Must be `'<app_label>.<action>'`, e.g., `'solution_library.embed_document'` |
| Return value | Always a dict with at minimum `{'success': bool}` |
| Logging | Use structured `extra={}` kwargs; never swallow exceptions silently |
| Import style | Use `@shared_task`, not direct `app.task` references |
| Idempotency | Tasks **must** be safe to re-execute with the same arguments (broker redelivery, worker crash). Use `update_or_create`, check-before-write, or guard with the job record's status before re-processing. |
| Arguments | Pass only JSON-serialisable primitives (PKs, strings, numbers). Never pass ORM instances. |
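
The idempotency rule can be illustrated with a status guard. This is a minimal sketch in plain Python, with a dict standing in for the job table; `JOBS` and `process_job` are hypothetical names used only for illustration, not part of Spelunker.

```python
# Hypothetical in-memory stand-in for a DB job table: job_id -> record.
JOBS = {1: {"status": "pending", "runs": 0}}


def process_job(job_id: int) -> dict:
    """Do the work once; a redelivered message becomes a safe no-op."""
    job = JOBS[job_id]
    # Guard: only a pending job may start. A redelivery after completion
    # finds a different status and returns without repeating the work.
    if job["status"] != "pending":
        return {"success": True, "id": job_id, "skipped": True}
    job["status"] = "processing"
    job["runs"] += 1  # the side effect that must not happen twice
    job["status"] = "completed"
    return {"success": True, "id": job_id, "skipped": False}


first = process_job(1)
second = process_job(1)  # simulates broker redelivery of the same message
```

A real task would also need to decide what to do with a job left in `processing` by a crashed worker, for example by consulting a staleness check such as `is_stale()`.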
---
## Retry & Time-Limit Policy
Tasks that call external services (LLM APIs, S3, remote URLs) should declare automatic retries for transient failures. Tasks must also set time limits to prevent hung workers.
### Recommended Retry Decorator
```python
@shared_task(
    name='<app_label>.<action>',
    bind=True,
    autoretry_for=(ConnectionError, TimeoutError),
    retry_backoff=60,        # first retry after 60 s, then 120 s, 240 s …
    retry_backoff_max=600,   # cap at 10 minutes
    retry_jitter=True,       # add randomness to avoid thundering herd
    max_retries=3,
    soft_time_limit=1800,    # raise SoftTimeLimitExceeded after 30 min
    time_limit=2100,         # hard-kill after 35 min
)
def my_task(self, primary_id: int, ...):
    ...
```
| Setting | Purpose | Guideline |
|---|---|---|
| `autoretry_for` | Exception classes that trigger an automatic retry | Use for **transient** errors only (network, timeout). Never for `ValueError` or business-logic errors. |
| `retry_backoff` | Seconds before first retry (doubles each attempt) | 60 s is a reasonable default for external API calls. |
| `max_retries` | Maximum retry attempts | 3 for API calls; 0 (no retry) for user-triggered batch jobs that track their own progress. |
| `soft_time_limit` | Raises `SoftTimeLimitExceeded` — allows graceful cleanup | Set on every task. Catch it to mark the job record as failed. |
| `time_limit` | Hard `SIGKILL`, the last resort | Set 5 to 10 min above `soft_time_limit`. |
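
To make the backoff settings concrete, the sketch below computes the retry delays implied by `retry_backoff=60` and `retry_backoff_max=600` with jitter switched off. It assumes the doubling rule described in the table; `backoff_schedule` is an illustrative helper, not a Celery API.

```python
def backoff_schedule(factor: int, maximum: int, attempts: int) -> list:
    """Delay in seconds before each retry: factor * 2**n, capped at maximum."""
    return [min(maximum, factor * (2 ** n)) for n in range(attempts)]


delays = backoff_schedule(factor=60, maximum=600, attempts=5)
print(delays)  # [60, 120, 240, 480, 600]
```

With `max_retries=3` only the first three delays apply. With `retry_jitter=True`, Celery picks a random delay between zero and each computed value, so actual waits will usually be shorter than these upper bounds.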
### Handling `SoftTimeLimitExceeded`
```python
from celery.exceptions import SoftTimeLimitExceeded


@shared_task(bind=True, soft_time_limit=1800, time_limit=2100, ...)
def long_running_task(self, job_id: int):
    job = MyJob.objects.get(id=job_id)
    try:
        for item in items:
            process(item)
    except SoftTimeLimitExceeded:
        logger.warning(f"Job {job_id} hit soft time limit; marking as failed")
        job.status = 'failed'
        job.completed_at = timezone.now()
        job.save()
        return {'success': False, 'job_id': job_id, 'error': 'Time limit exceeded'}
```
> **Note:** Batch jobs in `rfp_manager` do **not** use `autoretry_for` because they track per-question progress and should not re-run the entire batch. Instead, individual question failures are logged and the batch continues.
---
## Standard Values / Conventions
### Task Name Registry
| App | Task name | Trigger |
|---|---|---|
| `solution_library` | `solution_library.embed_document` | Signal / admin action |
| `solution_library` | `solution_library.embed_documents_batch` | Admin action |
| `solution_library` | `solution_library.sync_documentation_source` | View / admin action |
| `solution_library` | `solution_library.sync_all_documentation_sources` | Celery Beat (periodic) |
| `rfp_manager` | `rfp_manager.summarize_information_document` | Admin action |
| `rfp_manager` | `rfp_manager.batch_generate_responder_answers` | View |
| `rfp_manager` | `rfp_manager.batch_generate_reviewer_answers` | View |
| `llm_manager` | `llm_manager.validate_all_llm_apis` | Celery Beat (periodic) |
| `llm_manager` | `llm_manager.validate_single_api` | Admin action |
### Job Status Choices (DB Job Records)
```python
STATUS_PENDING = 'pending'
STATUS_PROCESSING = 'processing'
STATUS_COMPLETED = 'completed'
STATUS_FAILED = 'failed'
STATUS_CANCELLED = 'cancelled' # optional — used by rfp_manager
```
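The job model in this pattern passes `choices=STATUS_CHOICES`, which the source does not define. One plausible definition, assuming Django's usual `(value, label)` pairs, is sketched here; the labels and the `TERMINAL_STATUSES` set are illustrative assumptions.

```python
STATUS_PENDING = 'pending'
STATUS_PROCESSING = 'processing'
STATUS_COMPLETED = 'completed'
STATUS_FAILED = 'failed'
STATUS_CANCELLED = 'cancelled'

# (stored value, human-readable label) pairs, the shape Django expects
# for a field's `choices` argument. The labels are assumptions.
STATUS_CHOICES = [
    (STATUS_PENDING, 'Pending'),
    (STATUS_PROCESSING, 'Processing'),
    (STATUS_COMPLETED, 'Completed'),
    (STATUS_FAILED, 'Failed'),
    (STATUS_CANCELLED, 'Cancelled'),
]

# Hypothetical helper set: statuses after which a job should not change again.
TERMINAL_STATUSES = {STATUS_COMPLETED, STATUS_FAILED, STATUS_CANCELLED}
```

Every stored value fits within the `max_length=20` used for the `status` field.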
---
## Recommended Job-Tracking Fields
Tasks that represent a significant unit of work should write their state to a DB model. These are the recommended fields:
```python
class MyJobModel(models.Model):
    # Celery linkage
    celery_task_id = models.CharField(
        max_length=255, blank=True,
        help_text="Celery task ID for Flower monitoring"
    )

    # Status lifecycle
    status = models.CharField(
        max_length=20, choices=STATUS_CHOICES, default=STATUS_PENDING
    )
    started_at = models.DateTimeField(null=True, blank=True)
    completed_at = models.DateTimeField(null=True, blank=True)

    # Audit
    started_by = models.ForeignKey(
        User, on_delete=models.PROTECT, related_name='+'
    )
    created_at = models.DateTimeField(auto_now_add=True)
    updated_at = models.DateTimeField(auto_now=True)

    # Error accumulation
    errors = models.JSONField(default=list)

    class Meta:
        indexes = [
            models.Index(fields=['celery_task_id']),
            models.Index(fields=['-created_at']),
        ]
```
For batch jobs that process many items, add counter fields:
```python
    # Inside the job model class:
    total_items = models.IntegerField(default=0)
    processed_items = models.IntegerField(default=0)
    successful_items = models.IntegerField(default=0)
    failed_items = models.IntegerField(default=0)

    def get_progress_percentage(self) -> int:
        if self.total_items == 0:
            return 0
        return int((self.processed_items / self.total_items) * 100)

    def is_stale(self, timeout_minutes: int = 30) -> bool:
        """True if stuck in pending/processing without recent updates."""
        if self.status not in (self.STATUS_PENDING, self.STATUS_PROCESSING):
            return False
        return (timezone.now() - self.updated_at).total_seconds() > (timeout_minutes * 60)
```
---
## Variant 1 — Fire-and-Forget (Signal-Triggered)
Automatically dispatch a task whenever a model record is saved. Used by `solution_library` to kick off embedding whenever a `Document` is created.
```python
# solution_library/signals.py
import logging

from django.conf import settings
from django.db import transaction
from django.db.models.signals import post_save
from django.dispatch import receiver

logger = logging.getLogger(__name__)


@receiver(post_save, sender=Document)
def trigger_document_embedding(sender, instance, created, **kwargs):
    if not created:
        return
    if not getattr(settings, 'AUTO_EMBED_DOCUMENTS', True):
        return

    from solution_library.tasks import embed_document_task  # avoid circular import

    def _dispatch():
        try:
            task = embed_document_task.delay(
                document_id=instance.id,
                embedding_model_id=instance.embedding_model_id or None,
                user_id=None,
            )
            logger.info(f"Queued embedding task {task.id} for document {instance.id}")
        except Exception as e:
            logger.error(f"Failed to queue embedding task for document {instance.id}: {e}")

    # Dispatch AFTER the transaction commits so the worker can read the row
    transaction.on_commit(_dispatch)
```
The corresponding task updates the record's status field at start and completion:
```python
@shared_task(name='solution_library.embed_document')
def embed_document_task(document_id: int, embedding_model_id: int = None, user_id: int = None):
    document = Document.objects.get(id=document_id)
    document.review_status = 'processing'
    document.save(update_fields=['review_status', 'embedding_model'])

    # ... perform work ...

    document.review_status = 'pending'
    document.save(update_fields=['review_status'])
    return {'success': True, 'document_id': document_id, 'chunks_created': count}
```
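The reason for dispatching through `transaction.on_commit()` can be shown with a toy model of the ordering. This sketch is plain Python and deliberately not Django's implementation; `ToyTransaction` and `worker_reads` are hypothetical names standing in for a database transaction and a Celery worker lookup.

```python
class ToyTransaction:
    """Toy model of a DB transaction with on-commit hooks (not Django's API)."""

    def __init__(self, committed_rows: set):
        self.committed_rows = committed_rows  # rows visible to other processes
        self._pending_rows = set()
        self._callbacks = []

    def insert(self, row_id: int) -> None:
        self._pending_rows.add(row_id)  # invisible to workers until commit

    def on_commit(self, callback) -> None:
        self._callbacks.append(callback)

    def commit(self) -> None:
        self.committed_rows.update(self._pending_rows)
        for callback in self._callbacks:
            callback()


visible = set()
seen_by_worker = []


def worker_reads(row_id: int) -> None:
    """Stand-in for a worker looking up the row it was told about."""
    seen_by_worker.append(row_id in visible)


tx = ToyTransaction(visible)
tx.insert(42)
worker_reads(42)                        # dispatched too early: row not visible
tx.on_commit(lambda: worker_reads(42))  # deferred dispatch
tx.commit()
```

The first lookup fails because the row is still uncommitted, which corresponds to the `DoesNotExist` failure mode; the deferred lookup runs only after the row is visible.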
---
## Variant 2 — Long-Running Batch Job (View or Admin Triggered)
Used by `rfp_manager` for multi-hour batch RAG processing. The outer transaction creates the DB job record first, then dispatches the Celery task, passing the job's PK.
```python
# rfp_manager/views.py (dispatch)
from django.db import transaction

job = RFPBatchJob.objects.create(
    rfp=rfp,
    started_by=request.user,
    job_type=RFPBatchJob.JOB_TYPE_RESPONDER,
    status=RFPBatchJob.STATUS_PENDING,
)


def _dispatch():
    task = batch_generate_responder_answers.delay(rfp.pk, request.user.pk, job.pk)
    # Save the Celery task ID for Flower cross-reference
    job.celery_task_id = task.id
    job.save(update_fields=['celery_task_id'])


# IMPORTANT: dispatch after the transaction commits so the worker
# can read the job row. Without this, the worker may receive the
# message before the row is visible, causing DoesNotExist.
transaction.on_commit(_dispatch)
```
Inside the task, use `bind=True` to get the Celery task ID:
```python
@shared_task(bind=True, name='rfp_manager.batch_generate_responder_answers')
def batch_generate_responder_answers(self, rfp_id: int, user_id: int, job_id: int):
    job = RFPBatchJob.objects.get(id=job_id)
    job.status = RFPBatchJob.STATUS_PROCESSING
    job.started_at = timezone.now()
    job.celery_task_id = self.request.id  # authoritative Celery ID
    job.save()

    for item in items_to_process:
        try:
            # ... process item ...
            job.processed_questions += 1
            job.successful_questions += 1
            job.save(update_fields=['processed_questions', 'successful_questions', 'updated_at'])
        except Exception as e:
            job.add_error(item, str(e))

    job.status = RFPBatchJob.STATUS_COMPLETED
    job.completed_at = timezone.now()
    job.save()
    return {'success': True, 'job_id': job_id}
```
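The batch task calls `job.add_error(...)`, which the source does not show. A minimal sketch of what such a helper might do follows, using a plain dataclass in place of the Django model. `BatchJobSketch`, `add_success`, and the shape of each error entry are assumptions; the counter names follow the recommended job-tracking fields.

```python
from dataclasses import dataclass, field


@dataclass
class BatchJobSketch:
    """Plain-Python stand-in for a batch job record (not the real model)."""
    total_items: int = 0
    processed_items: int = 0
    successful_items: int = 0
    failed_items: int = 0
    errors: list = field(default_factory=list)

    def add_success(self) -> None:
        self.processed_items += 1
        self.successful_items += 1

    def add_error(self, item_id, message: str) -> None:
        """Record one failed item and keep the counters consistent."""
        self.processed_items += 1
        self.failed_items += 1
        self.errors.append({'item': item_id, 'error': message})


job = BatchJobSketch(total_items=3)
for item_id in (1, 2, 3):
    try:
        if item_id == 2:
            raise ValueError('empty question text')  # simulated failure
        job.add_success()
    except Exception as exc:
        job.add_error(item_id, str(exc))  # log and continue with the batch
```

In a real model each helper would also persist the changed fields, for example with `save(update_fields=[...])`.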
---
## Variant 3 — Progress-Callback Task (View or Admin Triggered)
Used by `solution_library`'s `sync_documentation_source_task` when an underlying synchronous service needs to stream incremental progress updates back to the DB.
```python
@shared_task(bind=True, name='solution_library.sync_documentation_source')
def sync_documentation_source_task(self, source_id: int, user_id: int, job_id: int):
    job = SyncJob.objects.get(id=job_id)
    job.status = SyncJob.STATUS_PROCESSING
    job.started_at = timezone.now()
    job.celery_task_id = self.request.id
    job.save(update_fields=['status', 'started_at', 'celery_task_id', 'updated_at'])

    def update_progress(created, updated, skipped, processed, total):
        job.documents_created = created
        job.documents_updated = updated
        job.documents_skipped = skipped
        job.save(update_fields=['documents_created', 'documents_updated',
                                'documents_skipped', 'updated_at'])

    result = sync_documentation_source(source_id, user_id, progress_callback=update_progress)

    # Reflect the service outcome in both the job record and the return value
    succeeded = result.status == 'completed'
    job.status = SyncJob.STATUS_COMPLETED if succeeded else SyncJob.STATUS_FAILED
    job.completed_at = timezone.now()
    job.save()
    return {'success': succeeded, 'job_id': job_id}
```
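The callback contract between the task and the underlying service can be sketched without Django. Here `sync_items` is a hypothetical synchronous service that accepts the same `progress_callback(created, updated, skipped, processed, total)` signature used above; the item classification is invented purely for the example.

```python
def sync_items(items, progress_callback=None):
    """Hypothetical synchronous service that reports progress per item."""
    created = updated = skipped = 0
    total = len(items)
    for processed, item in enumerate(items, start=1):
        if item == 'new':
            created += 1
        elif item == 'changed':
            updated += 1
        else:
            skipped += 1
        if progress_callback is not None:
            progress_callback(created, updated, skipped, processed, total)
    return {'status': 'completed', 'created': created,
            'updated': updated, 'skipped': skipped}


snapshots = []


def update_progress(created, updated, skipped, processed, total):
    # In the real task this would write the counters to the SyncJob row.
    snapshots.append((processed, total))


result = sync_items(['new', 'changed', 'same'], progress_callback=update_progress)
```

Keeping the service unaware of Celery and of the job model means it can be called from a shell or a test with no callback at all.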
---
## Variant 4 — Periodic Task (Celery Beat)
Used by `llm_manager` for hourly/daily API validation and by `solution_library` for nightly source syncs. Schedule via django-celery-beat in Django admin (no hardcoded schedules in code).
```python
@shared_task(name='llm_manager.validate_all_llm_apis')
def validate_all_llm_apis():
    """Periodic task: validate all active LLM APIs and refresh model lists."""
    active_apis = LLMApi.objects.filter(is_active=True)
    results = {'tested': 0, 'successful': 0, 'failed': 0, 'details': []}
    for api in active_apis:
        results['tested'] += 1
        try:
            result = test_llm_api(api)
            if result['success']:
                results['successful'] += 1
            else:
                results['failed'] += 1
        except Exception as e:
            results['failed'] += 1
            logger.error(f"Error validating {api.name}: {e}", exc_info=True)
    return results


@shared_task(name='solution_library.sync_all_documentation_sources')
def sync_all_sources_task():
    """Periodic task: queue a sync for every active documentation source."""
    sources = DocumentationSource.objects.all()
    system_user = User.objects.filter(is_superuser=True).first()
    queued = skipped = 0  # counters for the summary returned below
    for source in sources:
        # Skip if an active sync job already exists
        if SyncJob.objects.filter(source=source,
                                  status__in=[SyncJob.STATUS_PENDING,
                                              SyncJob.STATUS_PROCESSING]).exists():
            skipped += 1
            continue
        job = SyncJob.objects.create(source=source, started_by=system_user,
                                     status=SyncJob.STATUS_PENDING)
        sync_documentation_source_task.delay(source.id, system_user.id, job.id)
        queued += 1
    return {'queued': queued, 'skipped': skipped}
```
---
## Infrastructure Configuration
### `spelunker/celery.py` — App Entry Point
```python
import os
from celery import Celery
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "spelunker.settings")
app = Celery("spelunker")
app.config_from_object("django.conf:settings", namespace="CELERY")
app.autodiscover_tasks() # auto-discovers tasks.py in every INSTALLED_APP
```
### `settings.py` — Celery Settings
```python
# Broker and result backend — supplied via environment variables
CELERY_BROKER_URL = env('CELERY_BROKER_URL') # amqp://spelunker:<pw>@rabbitmq:5672/spelunker
CELERY_RESULT_BACKEND = env('CELERY_RESULT_BACKEND') # rpc://
# Serialization — JSON only (no pickle)
CELERY_ACCEPT_CONTENT = ['json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_TIMEZONE = env('TIME_ZONE')
# Result expiry — critical when using rpc:// backend.
# Uncollected results accumulate in worker memory without this.
CELERY_RESULT_EXPIRES = 3600 # 1 hour; safe because we store state in DB job records
# Global time limits (can be overridden per-task with decorator args)
CELERY_TASK_SOFT_TIME_LIMIT = 1800 # 30 min soft limit → SoftTimeLimitExceeded
CELERY_TASK_TIME_LIMIT = 2100 # 35 min hard kill
# Late ack: acknowledge messages AFTER task completes, not before.
# If a worker crashes mid-task, the broker redelivers the message.
CELERY_TASK_ACKS_LATE = True
CELERY_WORKER_PREFETCH_MULTIPLIER = 1 # fetch one task at a time per worker slot
# Separate logging level for Celery vs. application code
CELERY_LOGGING_LEVEL = env('CELERY_LOGGING_LEVEL', default='INFO')
```
> **`CELERY_TASK_ACKS_LATE`**: Combined with idempotent tasks, this provides at-least-once delivery. If a worker process is killed (OOM, deployment), the message returns to the queue and another worker picks it up. This is why idempotency is a hard requirement.
### `settings.py` — Memcached (Django Cache)
Memcached is the Django HTTP-layer cache (sessions, view caching). It is **not** used as a Celery result backend.
```python
CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.memcached.PyMemcacheCache",
        "LOCATION": env('KVDB_LOCATION'),  # memcached:11211
        "KEY_PREFIX": env('KVDB_PREFIX'),  # spelunker
        "TIMEOUT": 300,
    }
}
```
### `INSTALLED_APPS` — Required
```python
INSTALLED_APPS = [
    ...
    'django_celery_beat',  # DB-backed periodic task scheduler (Beat)
    ...
]
```
### `docker-compose.yml` — Service Topology
| Service | Image | Purpose |
|---|---|---|
| `rabbitmq` | `rabbitmq:3-management-alpine` | AMQP message broker |
| `memcached` | `memcached:1.6-alpine` | Django HTTP cache |
| `worker` | `spelunker:latest` | Celery worker (`--concurrency=4`) |
| `scheduler` | `spelunker:latest` | Celery Beat with `DatabaseScheduler` |
| `flower` | `mher/flower:latest` | Task monitoring UI (port 5555) |
### Task Routing / Queues (Recommended)
By default all tasks run in the `celery` default queue. For production deployments, separate CPU-heavy work from I/O-bound work:
```python
# settings.py
CELERY_TASK_ROUTES = {
    'solution_library.embed_document': {'queue': 'embedding'},
    'solution_library.embed_documents_batch': {'queue': 'embedding'},
    'rfp_manager.batch_generate_*': {'queue': 'batch'},
    'llm_manager.validate_*': {'queue': 'default'},
}
```
```yaml
# docker-compose.yml: separate workers per queue
worker-default:
  command: celery -A spelunker worker -Q default --concurrency=4
worker-embedding:
  command: celery -A spelunker worker -Q embedding --concurrency=2
worker-batch:
  command: celery -A spelunker worker -Q batch --concurrency=2
```
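This prevents a burst of embedding tasks from starving time-sensitive API validation, and lets you scale each queue independently. Note that tasks without a matching route still go to Celery's built-in default queue, which is named `celery`; either set `CELERY_TASK_DEFAULT_QUEUE = 'default'` or have a worker consume `celery` as well, otherwise those tasks will sit unconsumed.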
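To see which queue a given task name would land in under the routing table above, the following sketch approximates glob matching with the standard library's `fnmatch`. It is an illustration of the idea, not Celery's router; `ROUTES` and `resolve_queue` are hypothetical names, and the `celery` fallback mirrors Celery's default queue name.

```python
from fnmatch import fnmatchcase

# Pattern -> queue, mirroring the CELERY_TASK_ROUTES example.
ROUTES = {
    'solution_library.embed_document': 'embedding',
    'solution_library.embed_documents_batch': 'embedding',
    'rfp_manager.batch_generate_*': 'batch',
    'llm_manager.validate_*': 'default',
}


def resolve_queue(task_name: str, fallback: str = 'celery') -> str:
    """Return the queue of the first route pattern that matches task_name."""
    for pattern, queue in ROUTES.items():
        if fnmatchcase(task_name, pattern):
            return queue
    return fallback


batch_queue = resolve_queue('rfp_manager.batch_generate_reviewer_answers')
unrouted_queue = resolve_queue('solution_library.sync_documentation_source')
```

The second lookup shows that the documentation-source sync task is not covered by any route in the example and would fall through to the default queue.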
### Database Connection Management
Celery workers are long-lived processes, so Django DB connections can become stale between tasks. Set `CONN_MAX_AGE` to `0` (the Django default) so connections are not reused across tasks, or use a connection pooler such as PgBouncer. Celery's Django fixup closes old or unusable connections around each task, much as Django's `close_old_connections()` does per request, so no manual cleanup is normally needed.
---
## Domain Extension Examples
### `solution_library` App
Three task types: single-document embed, batch embed, and documentation-source sync. The single-document task is also triggered by a `post_save` signal for automatic processing on upload.
```python
# Auto-embed on create (signal)
embed_document_task.delay(document_id=instance.id, ...)
# Manual batch from admin action
embed_documents_batch_task.delay(document_ids=[1, 2, 3], ...)
# Source sync from view (with progress callback)
sync_documentation_source_task.delay(source_id=..., user_id=..., job_id=...)
```
### `rfp_manager` App
Two-stage pipeline: responder answers first, reviewer answers second. Each stage is a separate Celery batch job. Both check for an existing active job before dispatching to prevent duplicate runs.
```python
# Guard against duplicate jobs before dispatch
if RFPBatchJob.objects.filter(
    rfp=rfp,
    job_type=RFPBatchJob.JOB_TYPE_RESPONDER,
    status__in=[RFPBatchJob.STATUS_PENDING, RFPBatchJob.STATUS_PROCESSING]
).exists():
    # surface error to user
    ...

# Stage 1
batch_generate_responder_answers.delay(rfp.pk, user.pk, job.pk)

# Stage 2 (after Stage 1 is complete)
batch_generate_reviewer_answers.delay(rfp.pk, user.pk, job.pk)
```
### `llm_manager` App
Stateless periodic task — no DB job record needed because results are written directly to the `LLMApi` and `LLMModel` objects.
```python
# Triggered by Celery Beat; schedule managed via django-celery-beat admin
validate_all_llm_apis.delay()
# Triggered from admin action for a single API
validate_single_api.delay(api_id=api.pk)
```
---
## Anti-Patterns
- ❌ Don't use `rpc://` result backend for tasks where the caller never retrieves the result — the result accumulates in memory. Spelunker mitigates this by storing state in DB job records rather than reading Celery results. Always set `CELERY_RESULT_EXPIRES`.
- ❌ Don't pass full model instances as task arguments — pass PKs only. Celery serialises arguments as JSON; ORM objects are not JSON serialisable.
- ❌ Don't treat the dispatch-side save as the only record of `celery_task_id`. The `AsyncResult.id` returned by `.delay()` and the in-task `self.request.id` are the same value, so also write it from **inside** the task using `bind=True`; that write is the authoritative source.
- ❌ Don't silence exceptions with bare `except: pass` — always log errors and reflect failure status onto the DB record.
- ❌ Don't skip the duplicate-job guard when the task is triggered from a view or admin action. Without it, double-clicking a submit button can queue two identical jobs.
- ❌ Don't use `CELERY_TASK_SERIALIZER = 'pickle'` — JSON only, to prevent arbitrary code execution via crafted task payloads.
- ❌ Don't hardcode periodic task schedules in code via `app.conf.beat_schedule` — use `django_celery_beat` and manage schedules in Django admin so they survive deployments.
- ❌ Don't call `.delay()` inside a database transaction — use `transaction.on_commit()`. The worker may receive the message before the row is committed, causing `DoesNotExist`.
- ❌ Don't write non-idempotent tasks — workers may crash and brokers may redeliver. A re-executed task must produce the same result (or safely no-op).
- ❌ Don't omit time limits — a hung external API call (LLM, S3) will block a worker slot forever. Always set `soft_time_limit` and `time_limit`.
- ❌ Don't retry business-logic errors with `autoretry_for` — only retry **transient** failures (network errors, timeouts). A `ValueError` or `DoesNotExist` will never succeed on retry.
---
## Migration / Adoption
When adding a new Celery task to an existing app:
1. Create `<app>/tasks.py` using `@shared_task`, not `@app.task`.
2. Name the task `'<app_label>.<action>'`.
3. If the task is long-running, create a DB job model with the recommended fields above.
4. Register the app in `INSTALLED_APPS` (required for `autodiscover_tasks`).
5. For periodic tasks, add a schedule record via Django admin → Periodic Tasks (django-celery-beat) rather than in code.
6. Add a test that confirms the task can be called synchronously with `CELERY_TASK_ALWAYS_EAGER = True`.
---
## Settings
```python
# settings.py
# Required — broker and result backend
CELERY_BROKER_URL = env('CELERY_BROKER_URL') # amqp://user:pw@host:5672/vhost
CELERY_RESULT_BACKEND = env('CELERY_RESULT_BACKEND') # rpc://
# Serialization (do not change)
CELERY_ACCEPT_CONTENT = ['json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_TIMEZONE = env('TIME_ZONE') # must match Django TIME_ZONE
# Result expiry — prevents unbounded memory growth with rpc:// backend
CELERY_RESULT_EXPIRES = 3600 # seconds (1 hour)
# Time limits — global defaults, overridable per-task
CELERY_TASK_SOFT_TIME_LIMIT = 1800 # SoftTimeLimitExceeded after 30 min
CELERY_TASK_TIME_LIMIT = 2100 # hard SIGKILL after 35 min
# Reliability — late ack + single prefetch for at-least-once delivery
CELERY_TASK_ACKS_LATE = True
CELERY_WORKER_PREFETCH_MULTIPLIER = 1
# Logging
CELERY_LOGGING_LEVEL = env('CELERY_LOGGING_LEVEL', default='INFO') # separate from app/Django level
# Optional — disable for production
# AUTO_EMBED_DOCUMENTS = True # set False to suppress signal-triggered embedding
# Optional — task routing (see Infrastructure Configuration for queue examples)
# CELERY_TASK_ROUTES = { ... }
```
---
## Testing
```python
from django.test import TestCase, override_settings


@override_settings(CELERY_TASK_ALWAYS_EAGER=True, CELERY_TASK_EAGER_PROPAGATES=True)
class EmbedDocumentTaskTest(TestCase):

    def test_happy_path(self):
        """Task embeds a document and returns success."""
        # arrange: create Document, LLMModel fixtures
        result = embed_document_task(document_id=doc.id)
        self.assertTrue(result['success'])
        self.assertGreater(result['chunks_created'], 0)
        doc.refresh_from_db()
        self.assertEqual(doc.review_status, 'pending')

    def test_document_not_found(self):
        """Task returns success=False for a missing document ID."""
        result = embed_document_task(document_id=999999)
        self.assertFalse(result['success'])
        self.assertIn('not found', result['error'])

    def test_no_embedding_model(self):
        """Task returns success=False when no embedding model is available."""
        # arrange: no LLMModel with is_system_default=True
        result = embed_document_task(document_id=doc.id)
        self.assertFalse(result['success'])


@override_settings(CELERY_TASK_ALWAYS_EAGER=True, CELERY_TASK_EAGER_PROPAGATES=True)
class BatchJobTest(TestCase):

    def test_job_reaches_completed_status(self):
        """Batch job transitions from pending → processing → completed."""
        job = RFPBatchJob.objects.create(...)
        batch_generate_responder_answers(rfp_id=rfp.pk, user_id=user.pk, job_id=job.pk)
        job.refresh_from_db()
        self.assertEqual(job.status, RFPBatchJob.STATUS_COMPLETED)

    def test_duplicate_job_guard(self):
        """A second dispatch when a job is already active is rejected by the view."""
        # arrange: one active job
        response = self.client.post(dispatch_url)
        self.assertContains(response, 'already running', status_code=400)
```


@@ -0,0 +1,401 @@
# Notification Trigger Pattern v1.0.0
Standard pattern for triggering notifications from domain-specific events in Django applications that use Themis for notification infrastructure.
## 🐾 Red Panda Approval™
This pattern follows Red Panda Approval standards.
---
## Overview
Themis provides the notification *mailbox* — the model, UI (bell + dropdown + list page), polling, browser notifications, user preferences, and cleanup. What Themis does **not** provide is the *trigger logic* — the rules that decide when a notification should be created.
Trigger logic is inherently domain-specific:
- A task tracker sends "Task overdue" notifications
- A calendar sends "Event starting in 15 minutes" reminders
- A finance app sends "Invoice payment received" alerts
- A monitoring system sends "Server CPU above 90%" warnings
This pattern documents how consuming apps should create notifications using Themis infrastructure.
---
## The Standard Interface
All notification creation goes through one function:
```python
from themis.notifications import notify_user
notify_user(
    user=user,                    # Django User instance
    title="Task overdue",         # Short headline (max 200 chars)
    message="Task 'Deploy v2' was due yesterday.",  # Optional body
    level="warning",              # info | success | warning | danger
    url="/tasks/42/",             # Optional: where to navigate on click
    source_app="tasks",           # Your app label (for tracking/cleanup)
    source_model="Task",          # Model that triggered this
    source_id="42",               # PK of the source object (as string)
    deduplicate=True,             # Skip if unread duplicate exists
    expires_at=None,              # Optional: auto-expire datetime
)
```
**Never create `UserNotification` objects directly.** The `notify_user()` function handles:
- Checking if the user has notifications enabled
- Filtering by the user's minimum notification level
- Deduplication (when `deduplicate=True`)
- Returning `None` when skipped (so callers can check)
---
## Trigger Patterns
### 1. Signal-Based Triggers
The most common pattern — listen to Django signals and create notifications:
```python
# myapp/signals.py
from django.db.models.signals import post_save
from django.dispatch import receiver

from themis.notifications import notify_user

from .models import Task


@receiver(post_save, sender=Task)
def notify_task_assigned(sender, instance, created, **kwargs):
    """Notify user when a task is assigned to them."""
    if not created and instance.assignee and instance.tracker.has_changed("assignee"):
        notify_user(
            user=instance.assignee,
            title=f"Task assigned: {instance.title}",
            message=f"You've been assigned to '{instance.title}'",
            level="info",
            url=instance.get_absolute_url(),
            source_app="tasks",
            source_model="Task",
            source_id=str(instance.pk),
            deduplicate=True,
        )
```
### 2. View-Based Triggers
Create notifications during request processing:
```python
# myapp/views.py
from django.contrib import messages
from django.contrib.auth.decorators import login_required
from django.shortcuts import get_object_or_404, redirect

from themis.notifications import notify_user

from .models import Request  # assumed location of the Request model


@login_required
def approve_request(request, pk):
    req = get_object_or_404(Request, pk=pk)
    req.status = "approved"
    req.save()

    # Notify the requester
    notify_user(
        user=req.requester,
        title="Request approved",
        message=f"Your request '{req.title}' has been approved.",
        level="success",
        url=req.get_absolute_url(),
        source_app="requests",
        source_model="Request",
        source_id=str(req.pk),
    )

    messages.success(request, "Request approved.")
    return redirect("request-list")
```
### 3. Management Command Triggers
For scheduled checks (e.g., daily overdue detection):
```python
# myapp/management/commands/check_overdue.py
from django.core.management.base import BaseCommand
from django.utils import timezone

from themis.notifications import notify_user
from myapp.models import Task


class Command(BaseCommand):
    help = "Send notifications for overdue tasks"

    def handle(self, *args, **options):
        overdue = Task.objects.filter(
            due_date__lt=timezone.now().date(),
            status__in=["open", "in_progress"],
            assignee__isnull=False,  # unassigned tasks have nobody to notify
        )
        count = 0
        for task in overdue:
            result = notify_user(
                user=task.assignee,
                title=f"Overdue: {task.title}",
                message=f"Task was due {task.due_date}",
                level="danger",
                url=task.get_absolute_url(),
                source_app="tasks",
                source_model="Task",
                source_id=str(task.pk),
                deduplicate=True,  # Don't send again if unread
            )
            if result:
                count += 1
        self.stdout.write(f"Sent {count} overdue notification(s)")
```
Schedule with cron or Kubernetes CronJob:
```yaml
# Kubernetes CronJob
apiVersion: batch/v1
kind: CronJob
metadata:
  name: check-overdue-tasks
spec:
  schedule: "0 8 * * *"  # Daily at 8 AM
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure  # Job pods require OnFailure or Never
          containers:
            - name: check-overdue
              # image: set to your application image (not specified here)
              command: ["python", "manage.py", "check_overdue"]
```
### 4. Celery Task Triggers
For apps with background workers:
```python
# myapp/tasks.py
from celery import shared_task
from django.contrib.auth import get_user_model

from themis.notifications import notify_user

User = get_user_model()


@shared_task
def notify_report_ready(user_id, report_id):
    """Notify user when their report has been generated."""
    from myapp.models import Report

    user = User.objects.get(pk=user_id)
    report = Report.objects.get(pk=report_id)
    notify_user(
        user=user,
        title="Report ready",
        message=f"Your {report.report_type} report is ready to download.",
        level="success",
        url=report.get_absolute_url(),
        source_app="reports",
        source_model="Report",
        source_id=str(report.pk),
    )
```
---
## Notification Levels
Choose the appropriate level for each notification type:
| Level | Weight | Use For |
|---|---|---|
| `info` | 0 | Informational updates (assigned, comment added) |
| `success` | 0 | Positive outcomes (approved, completed, payment received) |
| `warning` | 1 | Needs attention (approaching deadline, low balance) |
| `danger` | 2 | Urgent/error (overdue, failed, system error) |
Users can set a minimum notification level in their preferences:
- **info** (default) — receive all notifications
- **warning** — only warnings and errors
- **danger** — only errors
Note that `info` and `success` have the same weight (0), so setting minimum to "warning" filters out both.
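The weight comparison can be written out in a few lines. This is a sketch of the rule as described in the table, not Themis source code; `LEVEL_WEIGHTS` and `passes_minimum_level` are illustrative names.

```python
# Weights copied from the table above.
LEVEL_WEIGHTS = {'info': 0, 'success': 0, 'warning': 1, 'danger': 2}


def passes_minimum_level(level: str, minimum: str) -> bool:
    """True if a notification at `level` meets the user's minimum level."""
    return LEVEL_WEIGHTS[level] >= LEVEL_WEIGHTS[minimum]


success_with_warning_minimum = passes_minimum_level('success', 'warning')
danger_with_warning_minimum = passes_minimum_level('danger', 'warning')
```

Because `info` and `success` share weight 0, a user whose minimum is `info` still receives `success` notifications, while a minimum of `warning` drops both.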
---
## Source Tracking
The three source tracking fields enable two important features:
### Deduplication
When `deduplicate=True`, `notify_user()` checks for existing unread notifications with the same `source_app`, `source_model`, and `source_id`. This prevents notification spam when the same event is checked multiple times (e.g., a daily cron job for overdue tasks).
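A sketch of that check, using plain dicts in place of one user's `UserNotification` rows, might look like this. The function name and the `is_read` key are assumptions made for the example; the real field names in Themis may differ.

```python
def has_unread_duplicate(notifications, source_app, source_model, source_id):
    """True if an unread notification with the same source triple exists."""
    key = (source_app, source_model, source_id)
    return any(
        not n['is_read']
        and (n['source_app'], n['source_model'], n['source_id']) == key
        for n in notifications
    )


# Hypothetical inbox for a single user.
inbox = [
    {'source_app': 'tasks', 'source_model': 'Task', 'source_id': '42', 'is_read': False},
    {'source_app': 'tasks', 'source_model': 'Task', 'source_id': '7', 'is_read': True},
]

unread_match = has_unread_duplicate(inbox, 'tasks', 'Task', '42')
read_match = has_unread_duplicate(inbox, 'tasks', 'Task', '7')
```

Once the user has read the earlier notification, the same source can notify again, which is why the daily overdue check resurfaces tasks that remain overdue.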
### Bulk Cleanup
When a source object is deleted, clean up its notifications:
```python
# In your model's post_delete signal handler:
from django.db.models.signals import post_delete
from django.dispatch import receiver

from themis.models import UserNotification


@receiver(post_delete, sender=Task)
def cleanup_task_notifications(sender, instance, **kwargs):
    UserNotification.objects.filter(
        source_app="tasks",
        source_model="Task",
        source_id=str(instance.pk),
    ).delete()
```
---
## Expiring Notifications
For time-sensitive notifications, use `expires_at`:
```python
from datetime import timedelta
from django.utils import timezone

# Event reminder that expires when the event starts
notify_user(
    user=attendee,
    title=f"Starting soon: {event.title}",
    level="info",
    url=event.get_absolute_url(),
    expires_at=event.start_time,
    source_app="events",
    source_model="Event",
    source_id=str(event.pk),
    deduplicate=True,
)
```
Expired notifications are automatically excluded from counts and lists. The `cleanup_notifications` management command deletes them permanently.
---
## Multi-User Notifications
For events that affect multiple users, call `notify_user()` in a loop:
```python
def notify_team(team, title, message, **kwargs):
    """Send a notification to all members of a team."""
    for member in team.members.all():
        notify_user(user=member, title=title, message=message, **kwargs)
```
For large recipient lists, consider using a Celery task to avoid blocking the request.
---
## Notification Cleanup
Themis provides automatic cleanup via the management command:
```bash
# Uses THEMIS_NOTIFICATION_MAX_AGE_DAYS (default: 90)
python manage.py cleanup_notifications
# Override max age
python manage.py cleanup_notifications --max-age-days=60
```
**What gets deleted:**
- Read notifications older than the max age
- Dismissed notifications older than the max age
- Expired notifications (past their `expires_at`)
**What is preserved:**
- Unread notifications (regardless of age)
Schedule this as a daily cron job or Kubernetes CronJob.
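The deletion rules above can be written as a single predicate. This is a sketch, not the command's actual code: the `Notification` dataclass and its field names (`created_at`, `is_read`, `is_dismissed`) are assumptions, age is assumed to be measured from creation, and expiry is assumed to take precedence over the unread rule.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class Notification:
    """Hypothetical stand-in for UserNotification."""
    created_at: datetime
    is_read: bool = False
    is_dismissed: bool = False
    expires_at: Optional[datetime] = None


def should_delete(n: Notification, now: datetime, max_age_days: int = 90) -> bool:
    """Apply the documented cleanup rules to a single notification."""
    if n.expires_at is not None and n.expires_at <= now:
        return True  # Expired notifications are removed (assumed to win over "unread")
    too_old = n.created_at < now - timedelta(days=max_age_days)
    return too_old and (n.is_read or n.is_dismissed)  # Unread ones are preserved


now = datetime(2026, 3, 21, tzinfo=timezone.utc)
old = now - timedelta(days=120)
```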
---
## Settings
Themis recognizes these settings for notification behavior:
```python
# Polling interval for the notification bell (seconds, 0 = disabled)
THEMIS_NOTIFICATION_POLL_INTERVAL = 60
# Hard ceiling for notification cleanup (days)
THEMIS_NOTIFICATION_MAX_AGE_DAYS = 90
```
Users control their own preferences in Settings:
- **Enable notifications** — master on/off switch
- **Minimum level** — filter low-priority notifications
- **Browser desktop notifications** — opt-in for OS-level alerts
- **Retention days** — how long to keep read notifications
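How the per-user preferences interact with the site-wide settings might look like the sketch below. Both helpers are hypothetical, and the level ordering in `LEVEL_ORDER` is an assumption (this document only shows `info` and `danger`).

```python
LEVEL_ORDER = ["info", "success", "warning", "danger"]  # Assumed ordering, lowest first


def effective_retention_days(user_retention_days: int, max_age_days: int = 90) -> int:
    """User preference capped by the THEMIS_NOTIFICATION_MAX_AGE_DAYS ceiling."""
    return min(user_retention_days, max_age_days)


def passes_minimum_level(level: str, minimum_level: str) -> bool:
    """True when `level` is at or above the user's minimum level."""
    return LEVEL_ORDER.index(level) >= LEVEL_ORDER.index(minimum_level)
```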
---
## Anti-Patterns
- ❌ Don't create `UserNotification` objects directly — use `notify_user()`
- ❌ Don't send notifications in tight loops without `deduplicate=True`
- ❌ Don't use notifications for real-time chat — use WebSocket channels
- ❌ Don't store sensitive data in notification messages (they're visible in admin)
- ❌ Don't rely on notifications as the sole delivery mechanism — they may be disabled by the user
- ❌ Don't forget `source_app`/`source_model`/`source_id` — they enable cleanup and dedup
---
## Testing Notifications
```python
from datetime import date, timedelta

from django.contrib.auth import get_user_model
from django.test import TestCase

from themis.models import UserNotification
from themis.notifications import notify_user

# Task and check_overdue_tasks come from your own app

User = get_user_model()


class MyAppNotificationTest(TestCase):
    def test_task_overdue_notification(self):
        """Overdue task creates a danger notification."""
        user = User.objects.create_user(username="test", password="pass")
        task = Task.objects.create(
            title="Deploy v2",
            assignee=user,
            due_date=date.today() - timedelta(days=1),
        )
        # Trigger your notification logic
        check_overdue_tasks()
        # Verify notification was created
        notif = UserNotification.objects.get(
            user=user,
            source_app="tasks",
            source_model="Task",
            source_id=str(task.pk),
        )
        self.assertEqual(notif.level, "danger")
        self.assertIn("Deploy v2", notif.title)

    def test_disabled_user_gets_no_notification(self):
        """Users with notifications disabled get nothing."""
        user = User.objects.create_user(username="quiet", password="pass")
        user.profile.notifications_enabled = False
        user.profile.save()
        result = notify_user(user, "Should be skipped")
        self.assertIsNone(result)
        self.assertEqual(UserNotification.objects.count(), 0)
```


@@ -0,0 +1,275 @@
# Organization Model Pattern v1.0.0
Standard pattern for Organization models across Django applications. Each app implements its own Organization model following this pattern to ensure interoperability and consistent field names.
## 🐾 Red Panda Approval™
This pattern follows Red Panda Approval standards.
---
## Why a Pattern, Not a Shared Model
Organization requirements vary by domain. A financial app needs stock symbols and ISIN codes. A healthcare app needs provider IDs. An education app needs accreditation fields. Shipping a monolithic Organization model with 40+ fields forces every app to carry fields it does not need.
Instead, this pattern defines:
- **Required fields** every Organization model must have
- **Recommended fields** most apps should include
- **Extension guidelines** for domain-specific needs
- **Standard choice values** for interoperability
---
## Required Fields
Every Organization model must include these fields:
```python
import uuid

from django.conf import settings
from django.db import models


# TYPE_CHOICES and STATUS_CHOICES are listed under "Standard Choice Values"
class Organization(models.Model):
    # Primary key
    id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)

    # Core identity
    name = models.CharField(max_length=255, db_index=True,
                            help_text="Organization display name")
    slug = models.SlugField(max_length=255, unique=True,
                            help_text="URL-friendly identifier")

    # Classification
    type = models.CharField(max_length=20, choices=TYPE_CHOICES,
                            help_text="Organization type")
    status = models.CharField(max_length=20, choices=STATUS_CHOICES,
                              default="active", help_text="Current status")

    # Location
    country = models.CharField(max_length=2, help_text="ISO 3166-1 alpha-2 country code")

    # Audit
    created_at = models.DateTimeField(auto_now_add=True)
    updated_at = models.DateTimeField(auto_now=True)
    created_by = models.ForeignKey(settings.AUTH_USER_MODEL, on_delete=models.SET_NULL,
                                   null=True, blank=True, related_name="created_organizations")
    updated_by = models.ForeignKey(settings.AUTH_USER_MODEL, on_delete=models.SET_NULL,
                                   null=True, blank=True, related_name="updated_organizations")

    class Meta:
        verbose_name = "Organization"
        verbose_name_plural = "Organizations"

    def __str__(self):
        return self.name

    def get_absolute_url(self):
        from django.urls import reverse
        return reverse("organization-detail", kwargs={"slug": self.slug})
```
---
## Standard Choice Values
Use these exact values for interoperability between apps:
### TYPE_CHOICES
```python
TYPE_CHOICES = [
    ("for-profit", "For-Profit"),
    ("non-profit", "Non-Profit"),
    ("government", "Government"),
    ("ngo", "NGO"),
    ("educational", "Educational"),
    ("healthcare", "Healthcare"),
    ("cooperative", "Cooperative"),
]
```
### STATUS_CHOICES
```python
STATUS_CHOICES = [
    ("active", "Active"),
    ("inactive", "Inactive"),
    ("pending", "Pending"),
    ("suspended", "Suspended"),
    ("dissolved", "Dissolved"),
    ("merged", "Merged"),
]
```
### SIZE_CHOICES (recommended)
```python
SIZE_CHOICES = [
    ("micro", "Micro (1-9)"),
    ("small", "Small (10-49)"),
    ("medium", "Medium (50-249)"),
    ("large", "Large (250-999)"),
    ("enterprise", "Enterprise (1000+)"),
]
```
### PARENT_RELATIONSHIP_CHOICES (if using hierarchy)
```python
PARENT_RELATIONSHIP_CHOICES = [
    ("subsidiary", "Subsidiary"),
    ("division", "Division"),
    ("branch", "Branch"),
    ("franchise", "Franchise"),
    ("joint-venture", "Joint Venture"),
    ("department", "Department"),
]
```
---
## Recommended Fields
Most apps should include these fields:
```python
# Extended identity
legal_name = models.CharField(max_length=255, blank=True, default="",
    help_text="Full legal entity name")
abbreviated_name = models.CharField(max_length=50, blank=True, default="",
    db_index=True, help_text="Short name/acronym")

# Classification
size = models.CharField(max_length=20, choices=SIZE_CHOICES, blank=True, default="",
    help_text="Organization size")

# Contact
primary_email = models.EmailField(blank=True, default="", help_text="Primary contact email")
primary_phone = models.CharField(max_length=20, blank=True, default="", help_text="Primary phone")
website = models.URLField(blank=True, default="", help_text="Organization website")

# Address
address_line1 = models.CharField(max_length=255, blank=True, default="")
address_line2 = models.CharField(max_length=255, blank=True, default="")
city = models.CharField(max_length=100, blank=True, default="")
state_province = models.CharField(max_length=100, blank=True, default="")
postal_code = models.CharField(max_length=20, blank=True, default="")

# Content
overview = models.TextField(blank=True, default="", help_text="Organization description")

# Metadata
is_active = models.BooleanField(default=True, help_text="Soft delete flag")
tags = models.JSONField(default=list, blank=True, help_text="Flexible tags")
```
---
## Hierarchy Pattern
For apps that need parent-child organization relationships:
```python
# Hierarchical relationships
parent_organization = models.ForeignKey(
    "self",
    on_delete=models.SET_NULL,
    null=True,
    blank=True,
    related_name="subsidiaries",
    help_text="Parent organization",
)
parent_relationship_type = models.CharField(
    max_length=20,
    choices=PARENT_RELATIONSHIP_CHOICES,
    blank=True,
    default="",
    help_text="Type of relationship with parent",
)
```
### Hierarchy Utility Functions
```python
def get_ancestors(org):
    """Walk up the parent chain. Returns list of Organization instances."""
    ancestors = []
    current = org.parent_organization
    while current:
        ancestors.append(current)
        current = current.parent_organization
    return ancestors


def get_descendants(org):
    """Recursively collect all child organizations."""
    descendants = []
    for child in org.subsidiaries.all():
        descendants.append(child)
        descendants.extend(get_descendants(child))
    return descendants
```
⚠️ **Warning:** Recursive queries can be expensive. For deep hierarchies, consider using `django-mptt` or `django-treebeard`, or store a materialized path.
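As a lighter alternative to the recursive helper, the whole hierarchy can be loaded once and walked in memory. The sketch below is an assumption-laden illustration rather than part of the pattern: `get_descendant_ids` is a hypothetical helper that works on a plain parent-to-children mapping (which could be built from a single query such as `Organization.objects.values_list("id", "parent_organization_id")`), and it also guards against accidental cycles, which the recursive version does not.

```python
from collections import deque
from typing import Dict, List


def get_descendant_ids(children_by_parent: Dict[str, List[str]], root_id: str) -> List[str]:
    """Breadth-first walk over a parent -> children mapping, skipping repeats."""
    seen = {root_id}
    queue = deque([root_id])
    descendants: List[str] = []
    while queue:
        current = queue.popleft()
        for child_id in children_by_parent.get(current, []):
            if child_id in seen:  # Guards against cycles in bad data
                continue
            seen.add(child_id)
            descendants.append(child_id)
            queue.append(child_id)
    return descendants


# "uk" -> "hq" is a deliberate cycle to show that the walk still terminates
tree = {"hq": ["emea", "apac"], "emea": ["uk"], "uk": ["hq"]}
result = get_descendant_ids(tree, "hq")
```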
---
## Domain Extension Examples
The examples below assume the required and recommended fields live in an abstract base model, here called `BaseOrganization`, that each app subclasses.
### Financial App
```python
class Organization(BaseOrganization):
    revenue = models.DecimalField(max_digits=15, decimal_places=2, null=True, blank=True)
    revenue_year = models.PositiveIntegerField(null=True, blank=True)
    employee_count = models.PositiveIntegerField(null=True, blank=True)
    stock_symbol = models.CharField(max_length=10, blank=True, default="", db_index=True)
    fiscal_year_end_month = models.PositiveSmallIntegerField(null=True, blank=True)
```
### Healthcare App
```python
class Organization(BaseOrganization):
    npi_number = models.CharField(max_length=10, blank=True, default="")
    facility_type = models.CharField(max_length=30, choices=FACILITY_CHOICES)
    bed_count = models.PositiveIntegerField(null=True, blank=True)
    accreditation = models.JSONField(default=list, blank=True)
```
### Education App
```python
class Organization(BaseOrganization):
    institution_type = models.CharField(max_length=30, choices=INSTITUTION_CHOICES)
    student_count = models.PositiveIntegerField(null=True, blank=True)
    accreditation_body = models.CharField(max_length=100, blank=True, default="")
```
---
## Anti-Patterns
- ❌ Don't use `null=True` on CharField/TextField — use `blank=True, default=""`
- ❌ Don't put all possible fields in a single model — extend per domain
- ❌ Don't use `Meta.ordering` on Organization — specify in queries
- ❌ Don't override `save()` for hierarchy calculation — use signals or service functions
- ❌ Don't expose sequential IDs in URLs — use slug or short UUID
---
## Indexing Recommendations
```python
class Meta:
    indexes = [
        models.Index(fields=["name"], name="org_name_idx"),
        models.Index(fields=["status"], name="org_status_idx"),
        models.Index(fields=["type"], name="org_type_idx"),
        models.Index(fields=["country"], name="org_country_idx"),
    ]
```
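Add domain-specific indexes as needed (e.g., `stock_symbol` for financial apps). A field that already declares `db_index=True`, such as `name`, does not need a second entry in `Meta.indexes`; keep one or the other to avoid a duplicate index.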


@@ -0,0 +1,434 @@
# S3/MinIO File Storage Pattern v1.0.0
Standardizes how Django apps in Spelunker store, read, and reference files in S3/MinIO, covering upload paths, model metadata fields, storage-agnostic I/O, and test isolation.
## 🐾 Red Panda Approval™
This pattern follows Red Panda Approval standards.
---
## Why a Pattern, Not a Shared Implementation
Each Django app stores files for a different domain purpose with different path conventions, processing workflows, and downstream consumers, making a single shared model impractical.
- The **rfp_manager** app needs files scoped under an RFP ID (info docs, question spreadsheets, generated exports), with no embedding — only LLM summarization
- The **solution_library** app needs files tied to vendor/solution hierarchies, with full text embedding and chunk storage, plus scraped documents that have no Django `FileField` at all
- The **rag** app needs to programmatically write chunk texts to S3 during embedding and read them back for search context
- The **core** app needs a simple image upload for organization logos without any processing pipeline
Instead, this pattern defines:
- **Required fields** — the minimum every file-backed model must have
- **Recommended fields** — metadata most implementations should track
- **Standard path conventions** — bucket key prefixes each domain owns
- **Storage-agnostic I/O** — how to read and write files so tests work without a real S3 bucket
---
## Required Fields
Every model that stores a file in S3/MinIO must have at minimum:
```python
from django.core.validators import FileExtensionValidator
from django.db import models


def my_domain_upload_path(instance, filename):
    """Return a scoped S3 key for this domain."""
    return f'my_domain/{instance.parent_id}/{filename}'


class MyDocument(models.Model):
    file = models.FileField(
        upload_to=my_domain_upload_path,  # or a string prefix
        validators=[FileExtensionValidator(allowed_extensions=[...])],
    )
    file_type = models.CharField(max_length=100, blank=True)  # extension without dot
    file_size = models.PositiveIntegerField(null=True, blank=True)  # bytes
```
---
## Standard Path Conventions
Use these exact key prefixes so buckets stay organized and IAM policies can target prefixes.
| App / Purpose | S3 Key Prefix |
|--------------------------------|--------------------------------------------|
| Solution library documents | `documents/` |
| Scraped documentation sources | `scraped/{source_id}/{filename}` |
| Embedding chunk texts | `chunks/{document_id}/chunk_{index}.txt` |
| RFP information documents | `rfp_info_documents/{rfp_id}/{filename}` |
| RFP question spreadsheets | `rfp_question_documents/{rfp_id}/{filename}` |
| RFP generated exports | `rfp_exports/{rfp_id}/{filename}` |
| Organization logos | `orgs/logos/` |
---
## Recommended Fields and Behaviors
Most file-backed models should also include these and populate them automatically.
```python
class MyDocument(models.Model):
    # ... required fields above ...

    # Recommended: explicit S3 key for programmatic access and admin visibility
    s3_key = models.CharField(max_length=500, blank=True)

    def save(self, *args, **kwargs):
        """Auto-populate file metadata on every save."""
        if self.file:
            self.s3_key = self.file.name
            if hasattr(self.file, 'size'):
                self.file_size = self.file.size
            if self.file.name and '.' in self.file.name:
                self.file_type = self.file.name.rsplit('.', 1)[-1].lower()
        super().save(*args, **kwargs)
```
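The extension logic is easy to get subtly wrong when the key contains directories. The stand-alone helper below is a hypothetical sketch (it is not part of any Spelunker app) that looks only at the final path segment, so a dot in a directory name is not mistaken for an extension.

```python
def file_type_from_name(name: str) -> str:
    """Return the lowercased extension without the dot, or '' if there is none."""
    basename = name.rsplit("/", 1)[-1]  # Ignore dots in directory names
    if "." not in basename:
        return ""
    return basename.rsplit(".", 1)[-1].lower()
```

For a key such as `v1.2/README`, splitting the full name on the last dot would yield `2/readme`, whereas this helper returns an empty string.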
---
## Pattern Variant 1: FileField Upload (User-Initiated Upload)
Used by `rfp_manager.RFPInformationDocument`, `rfp_manager.RFPQuestionDocument`, `rfp_manager.RFPExport`, `solution_library.Document`, and `core.Organization`.
The user (or Celery task generating an export) provides a file. Django's `FileField` handles the upload to S3 automatically via the configured storage backend.
```python
import os

from django.core.validators import FileExtensionValidator
from django.db import models


def rfp_info_document_path(instance, filename):
    """Scope uploads under the parent RFP's ID to keep the bucket organized."""
    return f'rfp_info_documents/{instance.rfp.id}/{filename}'


class RFPInformationDocument(models.Model):
    file = models.FileField(
        upload_to=rfp_info_document_path,
        validators=[FileExtensionValidator(
            allowed_extensions=['pdf', 'doc', 'docx', 'txt', 'md']
        )],
    )
    title = models.CharField(max_length=500)
    file_type = models.CharField(max_length=100, blank=True)
    file_size = models.PositiveIntegerField(null=True, blank=True)

    def save(self, *args, **kwargs):
        if self.file:
            if hasattr(self.file, 'size'):
                self.file_size = self.file.size
            if self.file.name:
                # Lowercase so "Report.PDF" and "report.pdf" store the same file_type
                self.file_type = os.path.splitext(self.file.name)[1].lstrip('.').lower()
        super().save(*args, **kwargs)
```
---
## Pattern Variant 2: Programmatic Write (Code-Generated Content)
Used by `rag.services.embeddings` (chunk texts) and `solution_library.services.sync` (scraped documents).
Content is generated or fetched in code and written directly to S3 using `default_storage.save()` with a `ContentFile`. The model records the resulting S3 key for later retrieval.
```python
from django.core.files.base import ContentFile
from django.core.files.storage import default_storage


def store_chunk(document_id: int, chunk_index: int, text: str) -> str:
    """
    Store an embedding chunk in S3 and return the saved key.

    Returns:
        The actual S3 key (may differ from requested if file_overwrite=False)
    """
    s3_key = f'chunks/{document_id}/chunk_{chunk_index}.txt'
    saved_key = default_storage.save(s3_key, ContentFile(text.encode('utf-8')))
    return saved_key


def store_scraped_document(source_id: int, filename: str, content: str) -> str:
    """Store scraped document content in S3 and return the saved key."""
    s3_key = f'scraped/{source_id}/{filename}'
    return default_storage.save(s3_key, ContentFile(content.encode('utf-8')))
```
When creating the model record after a programmatic write, use `s3_key` rather than a `FileField`:
```python
Document.objects.create(
    title=filename,
    s3_key=saved_key,
    file_size=len(content.encode('utf-8')),  # bytes, not characters
    file_type='md',
    # Note: `file` field is intentionally empty; this is a scraped document
)
```
---
## Pattern Variant 3: Storage-Agnostic Read
Used by `rfp_manager.services.excel_processor`, `rag.services.embeddings._read_document_content`, and `solution_library.models.DocumentEmbedding.get_chunk_text`.
Always read via `default_storage.open()` so the same code works against S3 in production and `FileSystemStorage` in tests. Never construct a filesystem path from `settings.MEDIA_ROOT`.
```python
from io import BytesIO

from django.core.files.storage import default_storage


def load_binary_from_storage(file_path: str) -> BytesIO:
    """
    Read a binary file from storage into a BytesIO buffer.
    Works against S3/MinIO in production and FileSystemStorage in tests.
    """
    with default_storage.open(file_path, 'rb') as f:
        return BytesIO(f.read())


def read_text_from_storage(s3_key: str) -> str:
    """Read a text file from storage."""
    with default_storage.open(s3_key, 'r') as f:
        return f.read()
```
When a model has both a `file` field (user upload) and a bare `s3_key` (scraped/programmatic), check which path applies:
```python
def _read_document_content(self, document) -> str:
    if document.s3_key and not document.file:
        # Scraped document: no FileField, read by key
        with default_storage.open(document.s3_key, 'r') as f:
            return f.read()
    # Uploaded document: use the FileField
    with document.file.open('r') as f:
        return f.read()
```
---
## Pattern Variant 4: S3 Connectivity Validation
Used by `solution_library.models.Document.clean()` and `solution_library.services.sync.sync_documentation_source`.
Validate that the bucket is reachable before attempting an upload or sync. This surfaces credential errors with a user-friendly message rather than a cryptic 500.
```python
from botocore.exceptions import ClientError, NoCredentialsError
from django.core.exceptions import ValidationError
from django.core.files.storage import default_storage


def validate_s3_connectivity():
    """
    Raise ValidationError if S3/MinIO bucket is not accessible.
    Only call on new uploads or at the start of a background sync.
    """
    if not hasattr(default_storage, 'bucket'):
        return  # Not an S3 backend (e.g., tests), skip validation
    try:
        default_storage.bucket.meta.client.head_bucket(
            Bucket=default_storage.bucket_name
        )
    except ClientError as e:
        code = e.response.get('Error', {}).get('Code', '')
        if code == '403':
            raise ValidationError(
                "S3/MinIO credentials are invalid or permissions are insufficient."
            )
        elif code == '404':
            raise ValidationError(
                f"Bucket '{default_storage.bucket_name}' does not exist."
            )
        raise ValidationError(f"S3/MinIO error ({code}): {e}")
    except NoCredentialsError:
        raise ValidationError("S3/MinIO credentials are not configured.")
```
In a model's `clean()`, guard with `not self.pk` to avoid checking on every update:
```python
def clean(self):
    super().clean()
    if self.file and not self.pk:  # New uploads only
        validate_s3_connectivity()
```
---
## Domain Extension Examples
### rfp_manager App
RFP documents are scoped under the RFP ID for isolation and easy cleanup. The app uses three document types (info, question, export), each with its own callable path function to keep the bucket navigation clear.
```python
def rfp_export_path(instance, filename):
    return f'rfp_exports/{instance.rfp.id}/{filename}'


class RFPExport(models.Model):
    export_file = models.FileField(upload_to=rfp_export_path)
    version = models.CharField(max_length=50)
    file_size = models.PositiveIntegerField(null=True, blank=True)
    question_count = models.IntegerField()
    answered_count = models.IntegerField()
    # No s3_key field - export files are always accessed via FileField
```
### solution_library App
Solution library documents track an explicit `s3_key` because the app supports two document origins: user uploads (with `FileField`) and scraped documents (programmatic write only, no `FileField`). For embedding, chunk texts are stored separately in S3 and referenced from `DocumentEmbedding` via `chunk_s3_key`.
```python
from pgvector.django import VectorField


class Document(models.Model):
    file = models.FileField(upload_to='documents/', blank=True)  # blank=True: scraped docs
    s3_key = models.CharField(max_length=500, blank=True)  # always populated
    content_hash = models.CharField(max_length=64, blank=True, db_index=True)


class DocumentEmbedding(models.Model):
    document = models.ForeignKey(Document, on_delete=models.CASCADE, related_name='embeddings')
    chunk_s3_key = models.CharField(max_length=500)  # e.g. chunks/42/chunk_7.txt
    chunk_index = models.IntegerField()
    chunk_size = models.PositiveIntegerField()
    embedding = VectorField(null=True, blank=True)  # pgvector column

    def get_chunk_text(self) -> str:
        from django.core.files.storage import default_storage
        with default_storage.open(self.chunk_s3_key, 'r') as f:
            return f.read()
```
---
## Anti-Patterns
- ❌ Don't build filesystem paths with `os.path.join(settings.MEDIA_ROOT, ...)` — always read through `default_storage.open()`
- ❌ Don't store file content as a `TextField` or `BinaryField` in the database
- ❌ Don't use `default_acl='public-read'` — all Spelunker buckets use `private` ACL with `querystring_auth=True` (pre-signed URLs)
- ❌ Don't skip `FileExtensionValidator` on upload fields — it is the first line of defence against unexpected file types
- ❌ Don't call `document.file.storage.size()` or `.exists()` in hot paths — these make network round-trips; use the `s3_key` and metadata fields for display purposes
- ❌ Don't make S3 API calls in tests without first overriding `STORAGES` in `test_settings.py`
- ❌ Don't use `file_overwrite=True` — the global setting `file_overwrite=False` ensures Django auto-appends a unique suffix rather than silently overwriting existing objects
---
## Settings
```python
# spelunker/settings.py
STORAGES = {
    "default": {
        "BACKEND": "storages.backends.s3boto3.S3Boto3Storage",
        "OPTIONS": {
            "access_key": env('S3_ACCESS_KEY'),
            "secret_key": env('S3_SECRET_KEY'),
            "bucket_name": env('S3_BUCKET_NAME'),
            "endpoint_url": env('S3_ENDPOINT_URL'),  # Use for MinIO or non-AWS S3
            # Cast to bool: the raw string "False" would otherwise be truthy
            "use_ssl": env.bool('S3_USE_SSL', default=True),
            "default_acl": env('S3_DEFAULT_ACL'),  # Must be 'private'
            "region_name": env('S3_REGION_NAME'),
            "file_overwrite": False,  # Prevent silent overwrites
            "querystring_auth": True,  # Pre-signed URLs for all access
            "verify": env.bool('S3_VERIFY_SSL', default=True),
        }
    },
    "staticfiles": {
        # Static files are served locally (nginx), never from S3
        "BACKEND": "django.contrib.staticfiles.storage.StaticFilesStorage",
    },
}
```
Environment variables (see `.env.example`):
```bash
S3_ACCESS_KEY=
S3_SECRET_KEY=
S3_BUCKET_NAME=spelunker-documents
S3_ENDPOINT_URL=http://localhost:9000 # MinIO local dev
S3_USE_SSL=False
S3_VERIFY_SSL=False
S3_DEFAULT_ACL=private
S3_REGION_NAME=us-east-1
```
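Environment values always arrive as strings, so flags such as `S3_USE_SSL=False` need an explicit cast before they reach the storage options. The helper below is a stand-alone illustration of that cast, not the parser used by any particular settings library, and the accepted spellings in `TRUE_VALUES` and `FALSE_VALUES` are assumptions.

```python
TRUE_VALUES = {"1", "true", "yes", "on"}
FALSE_VALUES = {"0", "false", "no", "off", ""}


def parse_bool(raw: str) -> bool:
    """Convert an environment-style string into a real boolean."""
    value = raw.strip().lower()
    if value in TRUE_VALUES:
        return True
    if value in FALSE_VALUES:
        return False
    raise ValueError(f"Cannot interpret {raw!r} as a boolean")


# The raw string is truthy even though it spells "False"
naive = bool("False")
parsed = parse_bool("False")
```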
Test override (disables all S3 calls):
```python
# spelunker/test_settings.py
STORAGES = {
    "default": {
        "BACKEND": "django.core.files.storage.FileSystemStorage",
        "OPTIONS": {"location": "/tmp/test_media/"},
    },
    "staticfiles": {
        "BACKEND": "django.contrib.staticfiles.storage.StaticFilesStorage",
    },
}
```
---
## Testing
Standard test cases every file-backed implementation should cover.
```python
import os
import tempfile

from django.core.files.uploadedfile import SimpleUploadedFile
from django.test import TestCase, override_settings


@override_settings(
    STORAGES={
        "default": {
            "BACKEND": "django.core.files.storage.FileSystemStorage",
            "OPTIONS": {"location": tempfile.mkdtemp()},
        },
        "staticfiles": {
            "BACKEND": "django.contrib.staticfiles.storage.StaticFilesStorage",
        },
    }
)
class MyDocumentStorageTest(TestCase):
    # MyDocument is your file-backed model; self.rfp is assumed to be created in setUp()

    def test_file_metadata_populated_on_save(self):
        """file_type and file_size are auto-populated from the uploaded file."""
        uploaded = SimpleUploadedFile("report.pdf", b"%PDF-1.4 content", content_type="application/pdf")
        doc = MyDocument.objects.create(file=uploaded, title="Test")
        self.assertEqual(doc.file_type, "pdf")
        self.assertGreater(doc.file_size, 0)

    def test_upload_path_includes_parent_id(self):
        """upload_to callable scopes the key under the parent ID."""
        uploaded = SimpleUploadedFile("q.xlsx", b"PK content")
        doc = MyDocument.objects.create(file=uploaded, title="Questions", rfp=self.rfp)
        self.assertIn(str(self.rfp.id), doc.file.name)

    def test_rejected_extension(self):
        """FileExtensionValidator rejects disallowed file types."""
        from django.core.exceptions import ValidationError
        uploaded = SimpleUploadedFile("hack.exe", b"MZ")
        doc = MyDocument(file=uploaded, title="Bad")
        with self.assertRaises(ValidationError):
            doc.full_clean()

    def test_storage_agnostic_read(self):
        """Reading via default_storage.open() works against FileSystemStorage."""
        from django.core.files.base import ContentFile
        from django.core.files.storage import default_storage
        key = default_storage.save("test/hello.txt", ContentFile(b"hello world"))
        with default_storage.open(key, 'r') as f:
            content = f.read()
        self.assertEqual(content, "hello world")
        default_storage.delete(key)
```


@@ -0,0 +1,736 @@
# SSO with Allauth & Casdoor Pattern v1.0.0
Standardizes OIDC-based Single Sign-On using Django Allauth and Casdoor, covering adapter customization, user provisioning, group mapping, superuser protection, and configurable local-login fallback. Used by the `core` Django application.
## 🐾 Red Panda Approval™
This pattern follows Red Panda Approval standards.
---
## Why a Pattern, Not a Shared Implementation
Every Django project that adopts SSO has different identity-provider configurations, claim schemas, permission models, and organizational structures:
- A **project management** app needs role claims mapped to project-scoped permissions
- An **e-commerce** app needs tenant/store claims with purchase-limit groups
- An **RFP tool** (Spelunker) needs organization + group claims mapped to View Only / Staff / SME / Admin groups
Instead, this pattern defines:
- **Required components** — every implementation must have
- **Required settings** — Django & Allauth configuration values
- **Standard conventions** — group names, claim mappings, redirect URL format
- **Extension guidelines** — for domain-specific provisioning logic
---
## Required Components
Every SSO implementation following this pattern must provide these files:
| Component | Location | Purpose |
|-----------|----------|---------|
| Social account adapter | `<app>/adapters.py` | User provisioning, group mapping, superuser protection |
| Local account adapter | `<app>/adapters.py` | Disable local signup, authentication logging |
| Management command | `<app>/management/commands/create_sso_groups.py` | Idempotent group + permission creation |
| Login template | `templates/account/login.html` | SSO button + conditional local login form |
| Context processor | `<app>/context_processors.py` | Expose `CASDOOR_ENABLED` / `ALLOW_LOCAL_LOGIN` to templates |
| SSL patch (optional) | `<app>/ssl_patch.py` | Development-only SSL bypass |
### Minimum settings.py configuration
```python
# INSTALLED_APPS — required entries
INSTALLED_APPS = [
    # ... standard Django apps ...
    'allauth',
    'allauth.account',
    'allauth.socialaccount',
    'allauth.socialaccount.providers.openid_connect',
    '<your_app>',
]

# MIDDLEWARE — Allauth middleware is required
MIDDLEWARE = [
    # ... standard Django middleware ...
    'allauth.account.middleware.AccountMiddleware',
]

# AUTHENTICATION_BACKENDS — both local and SSO
AUTHENTICATION_BACKENDS = [
    'django.contrib.auth.backends.ModelBackend',
    'allauth.account.auth_backends.AuthenticationBackend',
]
```
---
## Standard Values / Conventions
### Environment Variables
Every deployment must set these environment variables (or `.env` entries):
| Variable | Required | Default | Description |
|----------|----------|---------|-------------|
| `CASDOOR_ENABLED` | Yes | — | Enable/disable SSO (`true`/`false`) |
| `CASDOOR_ORIGIN` | Yes | — | Casdoor backend URL for OIDC discovery |
| `CASDOOR_ORIGIN_FRONTEND` | Yes | — | Casdoor frontend URL (may differ behind reverse proxy) |
| `CASDOOR_CLIENT_ID` | Yes | — | OAuth client ID from Casdoor application |
| `CASDOOR_CLIENT_SECRET` | Yes | — | OAuth client secret from Casdoor application |
| `CASDOOR_ORG_NAME` | Yes | — | Default organization slug in Casdoor |
| `ALLOW_LOCAL_LOGIN` | No | `false` | Show local login form for non-superusers |
| `CASDOOR_SSL_VERIFY` | No | `true` | SSL verification (`true`, `false`, or CA-bundle path) |
### Redirect URL Convention
The Allauth OIDC callback URL follows a fixed format. Register this URL in Casdoor:
```
/accounts/oidc/<provider_id>/login/callback/
```
For Spelunker with `provider_id = casdoor`:
```
/accounts/oidc/casdoor/login/callback/
```
> **Important:** The path segment is `oidc`, not `openid_connect`.
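
The full URL to register in Casdoor can be derived from the site origin and the provider ID. The helper below is a hypothetical sketch: `oidc_callback_url` is not an Allauth function, `https://spelunker.example.com` is a placeholder origin, and the sketch assumes the Allauth URLs are mounted under `/accounts/`.

```python
from urllib.parse import urljoin


def oidc_callback_url(site_origin: str, provider_id: str) -> str:
    """Join the site origin with the fixed Allauth OIDC callback path."""
    path = f"/accounts/oidc/{provider_id}/login/callback/"
    return urljoin(site_origin, path)


callback = oidc_callback_url("https://spelunker.example.com", "casdoor")
```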
### Standard Group Mapping
Casdoor group names map to Django groups with consistent naming:
| Casdoor Group | Django Group | `is_staff` | Permissions |
|---------------|-------------|------------|-------------|
| `view_only` | `View Only` | `False` | `view_*` |
| `staff` | `Staff` | `True` | `view_*`, `add_*`, `change_*` |
| `sme` | `SME` | `True` | `view_*`, `add_*`, `change_*` |
| `admin` | `Admin` | `True` | `view_*`, `add_*`, `change_*`, `delete_*` |
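The table above translates directly into a lookup plus a staff check. The sketch below is illustrative only: `GROUP_MAP`, `STAFF_GROUPS`, and `map_groups` are hypothetical names, and ignoring unknown Casdoor groups is an assumption about the desired behavior.

```python
from typing import Iterable, List, Tuple

GROUP_MAP = {
    "view_only": "View Only",
    "staff": "Staff",
    "sme": "SME",
    "admin": "Admin",
}
STAFF_GROUPS = {"staff", "sme", "admin"}


def map_groups(casdoor_groups: Iterable[str]) -> Tuple[List[str], bool]:
    """Return the Django group names and the resulting is_staff flag."""
    groups = list(casdoor_groups)
    django_groups = [GROUP_MAP[g] for g in groups if g in GROUP_MAP]  # Unknown groups are skipped
    is_staff = any(g in STAFF_GROUPS for g in groups)
    return django_groups, is_staff
```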
### Standard OIDC Claim Mapping
| Casdoor Claim | Django Field | Notes |
|---------------|-------------|-------|
| `email` | `User.username`, `User.email` | Full email used as username |
| `given_name` | `User.first_name` | — |
| `family_name` | `User.last_name` | — |
| `name` | Parsed into first/last | Fallback when given/family absent |
| `organization` | Organization lookup/create | Via adapter |
| `groups` | Django Group membership | Via adapter mapping |
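The name fallback in the table can be sketched as a pure function. `split_name` is a hypothetical helper written for illustration; it assumes that the first space separates the given name from the rest, which will not suit every naming convention.

```python
from typing import Dict, Tuple


def split_name(claims: Dict[str, str]) -> Tuple[str, str]:
    """Prefer given_name/family_name; fall back to splitting the 'name' claim."""
    first = claims.get("given_name", "")
    last = claims.get("family_name", "")
    if not first and not last:
        full_name = claims.get("name", "").strip()
        if full_name:
            parts = full_name.split(" ", 1)  # Split on the first space only
            first = parts[0]
            last = parts[1] if len(parts) > 1 else ""
    return first, last
```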
---
## Recommended Settings
Most implementations should include these Allauth settings:
```python
# Authentication mode
ACCOUNT_LOGIN_METHODS = {'email'}
ACCOUNT_SIGNUP_FIELDS = ['email*', 'password1*', 'password2*']
ACCOUNT_EMAIL_VERIFICATION = 'optional'
ACCOUNT_SESSION_REMEMBER = True
ACCOUNT_LOGIN_ON_PASSWORD_RESET = True
ACCOUNT_UNIQUE_EMAIL = True
# Redirects
LOGIN_REDIRECT_URL = '/dashboard/'
ACCOUNT_LOGOUT_REDIRECT_URL = '/'
LOGIN_URL = '/accounts/login/'
# Social account behavior
SOCIALACCOUNT_AUTO_SIGNUP = True
SOCIALACCOUNT_EMAIL_VERIFICATION = 'none'
SOCIALACCOUNT_QUERY_EMAIL = True
SOCIALACCOUNT_STORE_TOKENS = True
SOCIALACCOUNT_ADAPTER = '<app>.adapters.CasdoorAccountAdapter'
ACCOUNT_ADAPTER = '<app>.adapters.LocalAccountAdapter'
# Session management
SESSION_COOKIE_AGE = 28800 # 8 hours
SESSION_SAVE_EVERY_REQUEST = True
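# Account linking: auto-connect SSO to an existing local account with
# the same verified email instead of raising a conflict error.
# AUTO_CONNECT only takes effect when email authentication is enabled.
SOCIALACCOUNT_EMAIL_AUTHENTICATION = True
SOCIALACCOUNT_EMAIL_AUTHENTICATION_AUTO_CONNECT = True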
```
### Multi-Factor Authentication (Recommended)
Add `allauth.mfa` for TOTP/WebAuthn second-factor support:
```python
INSTALLED_APPS += ['allauth.mfa']
MFA_ADAPTER = 'allauth.mfa.adapter.DefaultMFAAdapter'
```
MFA is enforced per-user inside Django; Casdoor may also enforce its own MFA upstream.
### Rate Limiting on Local Login (Recommended)
Protect the local login form from brute-force attacks with `django-axes` or similar:
```python
# pip install django-axes
INSTALLED_APPS += ['axes']
# django-axes also needs its middleware; keep it last in MIDDLEWARE
MIDDLEWARE += ['axes.middleware.AxesMiddleware']
AUTHENTICATION_BACKENDS = [
    'axes.backends.AxesStandaloneBackend',
    'django.contrib.auth.backends.ModelBackend',
    'allauth.account.auth_backends.AuthenticationBackend',
]
AXES_FAILURE_LIMIT = 5  # Lock after 5 failures
AXES_COOLOFF_TIME = 1  # 1-hour cooloff
AXES_LOCKOUT_PARAMETERS = ['ip_address', 'username']
```
---
## Social Account Adapter
The social account adapter is the core of the pattern. It handles user provisioning on SSO login, maps claims to Django fields, enforces superuser protection, and assigns groups.
```python
from allauth.socialaccount.adapter import DefaultSocialAccountAdapter
from allauth.exceptions import ImmediateHttpResponse
from django.contrib.auth.models import User, Group
from django.contrib import messages
from django.shortcuts import redirect
import logging

logger = logging.getLogger(__name__)


class CasdoorAccountAdapter(DefaultSocialAccountAdapter):

    def is_open_for_signup(self, request, sociallogin):
        """Always allow SSO-initiated signup."""
        return True

    def pre_social_login(self, request, sociallogin):
        """
        Runs on every SSO login (new and returning users).

        1. Blocks superusers; they must use local auth.
        2. Re-syncs organization and group claims for returning users
           so that IdP changes are reflected immediately.
        """
        if sociallogin.user.id:
            user = sociallogin.user

            # --- Superuser gate ---
            if user.is_superuser:
                logger.warning(
                    f"SSO login blocked for superuser {user.username}. "
                    "Superusers must use local authentication."
                )
                messages.error(
                    request,
                    "Superuser accounts must use local authentication."
                )
                raise ImmediateHttpResponse(redirect('account_login'))

            # --- Re-sync claims for returning users ---
            extra_data = sociallogin.account.extra_data
            org_identifier = extra_data.get('organization', '')
            if org_identifier:
                self._assign_organization(user, org_identifier)

            groups = extra_data.get('groups', [])
            self._assign_groups(user, groups)
            user.is_staff = any(
                g in ['staff', 'sme', 'admin'] for g in groups
            )
            user.save(update_fields=['is_staff'])

    def populate_user(self, request, sociallogin, data):
        """Map Casdoor claims to Django User fields."""
        user = super().populate_user(request, sociallogin, data)
        email = data.get('email', '')
        user.username = email
        user.email = email
        user.first_name = data.get('given_name', '')
        user.last_name = data.get('family_name', '')

        # Fallback: parse full 'name' claim
        if not user.first_name and not user.last_name:
            full_name = data.get('name', '')
            if full_name:
                parts = full_name.split(' ', 1)
                user.first_name = parts[0]
                user.last_name = parts[1] if len(parts) > 1 else ''

        # Security: SSO users are never superusers
        user.is_superuser = False

        # Set is_staff from group membership
        groups = data.get('groups', [])
        user.is_staff = any(g in ['staff', 'sme', 'admin'] for g in groups)
        return user

    def save_user(self, request, sociallogin, form=None):
        """Save user and handle organization + group mapping."""
        user = super().save_user(request, sociallogin, form)
        extra_data = sociallogin.account.extra_data
        org_identifier = extra_data.get('organization', '')
        if org_identifier:
            self._assign_organization(user, org_identifier)
        groups = extra_data.get('groups', [])
        self._assign_groups(user, groups)
        return user

    def _assign_organization(self, user, org_identifier):
        """Assign (or create) organization from the OIDC claim."""
        # Domain-specific: see Extension Examples below
        raise NotImplementedError("Override per project")

    def _assign_groups(self, user, group_names):
        """Map Casdoor groups to Django groups."""
        group_mapping = {
            'view_only': 'View Only',
            'staff': 'Staff',
            'sme': 'SME',
            'admin': 'Admin',
        }
        user.groups.clear()
        for casdoor_group in group_names:
            django_group_name = group_mapping.get(casdoor_group.lower())
            if django_group_name:
                group, _ = Group.objects.get_or_create(name=django_group_name)
                user.groups.add(group)
                logger.info(f"Added {user.username} to group {django_group_name}")
```
---
## Local Account Adapter
Prevents local registration and logs authentication failures:
```python
from allauth.account.adapter import DefaultAccountAdapter
import logging

logger = logging.getLogger(__name__)


class LocalAccountAdapter(DefaultAccountAdapter):

    def is_open_for_signup(self, request):
        """Disable local signup; all users come via SSO or admin."""
        return False

    def authentication_failed(self, request, **kwargs):
        """Log failures for security monitoring."""
        logger.warning(
            f"Local authentication failed from {request.META.get('REMOTE_ADDR')}"
        )
        super().authentication_failed(request, **kwargs)
```
---
## OIDC Provider Configuration
Register Casdoor as an OpenID Connect provider in `settings.py`:
```python
SOCIALACCOUNT_PROVIDERS = {
    'openid_connect': {
        'APPS': [
            {
                'provider_id': 'casdoor',
                'name': 'Casdoor SSO',
                'client_id': CASDOOR_CLIENT_ID,
                'secret': CASDOOR_CLIENT_SECRET,
                'settings': {
                    'server_url': f'{CASDOOR_ORIGIN}/.well-known/openid-configuration',
                },
            }
        ],
        'OAUTH_PKCE_ENABLED': True,
    }
}
```
---
## Management Command — Group Creation
An idempotent management command ensures groups and permissions exist:
```python
from django.core.management.base import BaseCommand
from django.contrib.auth.models import Group, Permission


class Command(BaseCommand):
    help = 'Create Django groups for Casdoor SSO integration'

    def handle(self, *args, **options):
        groups_config = {
            'View Only': {'permissions': ['view']},
            'Staff': {'permissions': ['view', 'add', 'change']},
            'SME': {'permissions': ['view', 'add', 'change']},
            'Admin': {'permissions': ['view', 'add', 'change', 'delete']},
        }
        # Add your domain-specific model names here
        models_to_permission = [
            'vendor', 'document', 'rfp', 'rfpquestion',
        ]
        for group_name, config in groups_config.items():
            group, created = Group.objects.get_or_create(name=group_name)
            status = 'Created' if created else 'Exists'
            self.stdout.write(f'{status}: {group_name}')
            for perm_prefix in config['permissions']:
                for model in models_to_permission:
                    try:
                        perm = Permission.objects.get(
                            codename=f'{perm_prefix}_{model}'
                        )
                        group.permissions.add(perm)
                    except Permission.DoesNotExist:
                        # Model not installed in this project; skip it
                        pass
        self.stdout.write(self.style.SUCCESS('SSO groups created successfully'))
```
---
## Login Template
The login template shows an SSO button when Casdoor is enabled and conditionally reveals the local login form:
```html
{% load socialaccount %}

<!-- SSO Login Button (POST form for CSRF protection) -->
{% if CASDOOR_ENABLED %}
<form method="post" action="{% provider_login_url 'casdoor' %}">
    {% csrf_token %}
    <button type="submit">Sign in with SSO</button>
</form>
{% endif %}

<!-- Local Login Form (conditional) -->
{% if ALLOW_LOCAL_LOGIN or user.is_superuser %}
<form method="post" action="{% url 'account_login' %}">
    {% csrf_token %}
    {{ form.as_p }}
    <button type="submit">Sign In Locally</button>
</form>
{% endif %}
```
> **Why POST?** Using an `<a href>` GET link to initiate the OAuth flow skips CSRF
> validation. Allauth's `{% provider_login_url %}` is designed for use inside a
> `<form method="post">` so the CSRF token is verified before the redirect.
---
## Context Processor
Exposes SSO settings to every template:
```python
from django.conf import settings


def user_preferences(request):
    context = {}
    # Always expose SSO flags for the login page
    context['CASDOOR_ENABLED'] = getattr(settings, 'CASDOOR_ENABLED', False)
    context['ALLOW_LOCAL_LOGIN'] = getattr(settings, 'ALLOW_LOCAL_LOGIN', False)
    return context
```
Register in `settings.py`:
```python
TEMPLATES = [{
    'OPTIONS': {
        'context_processors': [
            # ... standard processors ...
            '<app>.context_processors.user_preferences',
        ],
    },
}]
```
---
## SSL Bypass (Development Only)
For sandbox environments with self-signed certificates, an optional SSL patch disables verification at the `requests` library level:
```python
import logging
import os

logger = logging.getLogger(__name__)


def apply_ssl_bypass():
    ssl_verify = os.environ.get('CASDOOR_SSL_VERIFY', 'true').lower()
    if ssl_verify != 'false':
        return

    logger.warning("SSL verification DISABLED (sandbox only)")

    import urllib3
    from requests.adapters import HTTPAdapter

    urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

    _original_send = HTTPAdapter.send

    def _patched_send(self, request, stream=False, timeout=None,
                      verify=True, cert=None, proxies=None):
        # Ignore the caller's verify value and always pass verify=False
        return _original_send(self, request, stream=stream,
                              timeout=timeout, verify=False,
                              cert=cert, proxies=proxies)

    HTTPAdapter.send = _patched_send


apply_ssl_bypass()
```
Load it at the top of `settings.py` **before** any library imports that make HTTP calls:
```python
import os

_ssl_verify = os.environ.get('CASDOOR_SSL_VERIFY', 'true').lower()
if _ssl_verify == 'false':
    import <app>.ssl_patch  # noqa: F401
```
---
## Logout Flow
By default, allauth's `account_logout` view destroys the local Django session but does **not** terminate the upstream Casdoor session. The user remains logged in at the IdP and will be silently re-authenticated on the next visit.
### Options
| Strategy | Behaviour | Implementation |
|----------|-----------|----------------|
| **Local-only logout** (default) | Destroys Django session; IdP session survives | No extra work |
| **IdP redirect logout** | Redirects to Casdoor's `/api/logout` after local logout | Override `ACCOUNT_LOGOUT_REDIRECT_URL` to point at Casdoor |
| **OIDC back-channel logout** | Casdoor notifies Django to invalidate sessions | Requires Casdoor back-channel support + a Django webhook endpoint |
### Recommended: IdP redirect logout
```python
# settings.py
ACCOUNT_LOGOUT_REDIRECT_URL = (
f'{CASDOOR_ORIGIN}/api/logout'
f'?post_logout_redirect_uri=https://your-app.example.com/'
)
```
This ensures the Casdoor session cookie is cleared before the user returns to your app.
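
The setting above hard-codes the return address. As a minimal sketch, the same URL can be assembled from configuration values so the return address is percent-encoded correctly. The helper name `build_logout_url` is hypothetical; the `/api/logout` path and the `post_logout_redirect_uri` parameter are taken from the snippet above and are not verified against Casdoor here.

```python
from urllib.parse import urlencode


def build_logout_url(casdoor_origin: str, return_url: str) -> str:
    """Return the IdP logout URL that sends the user back to return_url."""
    # urlencode percent-encodes the return address so it survives as a
    # single query-string value.
    query = urlencode({"post_logout_redirect_uri": return_url})
    return f"{casdoor_origin.rstrip('/')}/api/logout?{query}"


if __name__ == "__main__":
    print(build_logout_url("https://sso.example.com", "https://your-app.example.com/"))
```

In `settings.py` the result would be assigned to `ACCOUNT_LOGOUT_REDIRECT_URL`, with both arguments read from environment variables.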
---
## Domain Extension Examples
### Spelunker (RFP Tool)
Spelunker's adapter creates organizations on first encounter and links them to user profiles:
```python
def _assign_organization(self, user, org_identifier):
    from django.db import models
    from django.utils.text import slugify
    from core.models import Organization

    try:
        # Match on slug or name so either claim format resolves
        org = Organization.objects.filter(
            models.Q(slug=org_identifier) | models.Q(name=org_identifier)
        ).first()
        if not org:
            org = Organization.objects.create(
                name=org_identifier,
                slug=slugify(org_identifier),
                type='for-profit',
                legal_country='CA',
                status='active',
            )
            logger.info(f"Created organization: {org.name}")
        if hasattr(user, 'profile'):
            # Assumption: the profile model has an `organization` field;
            # adjust the field name to match your project.
            user.profile.organization = org
            user.profile.save(update_fields=['organization'])
            logger.info(f"Assigned {user.username} to {org.name}")
    except Exception as e:
        logger.error(f"Organization assignment error: {e}")
```
### Multi-Tenant SaaS App
A multi-tenant app might restrict users to a single tenant and enforce tenant isolation:
```python
def _assign_organization(self, user, org_identifier):
    from tenants.models import Tenant

    tenant = Tenant.objects.filter(external_id=org_identifier).first()
    if not tenant:
        raise ValueError(f"Unknown tenant: {org_identifier}")
    user.tenant = tenant
    user.save(update_fields=['tenant'])
```
---
## Anti-Patterns
- ❌ Don't allow SSO to grant `is_superuser` — always force `is_superuser = False` in `populate_user`
- ❌ Don't *log-and-continue* for superuser SSO attempts — raise `ImmediateHttpResponse` to actually block the login
- ❌ Don't disable local login for superusers — they need emergency access when SSO is unavailable
- ❌ Don't rely on SSO username claims — use email as the canonical identifier
- ❌ Don't hard-code the OIDC provider URL — always read from environment variables
- ❌ Don't skip the management command — groups and permissions must be idempotent and repeatable
- ❌ Don't use `CASDOOR_SSL_VERIFY=false` in production — only for sandbox environments with self-signed certificates
- ❌ Don't forget PKCE — always set `OAUTH_PKCE_ENABLED: True` for Authorization Code flow
- ❌ Don't sync groups only on first login — re-sync in `pre_social_login` so IdP changes take effect immediately
- ❌ Don't use a GET link (`<a href>`) to start the OAuth flow — use a POST form so CSRF protection applies
- ❌ Don't assume Django logout kills the IdP session — configure an IdP redirect or back-channel logout
- ❌ Don't leave the local login endpoint unprotected — add rate limiting (e.g. `django-axes`) to prevent brute-force attacks
---
## Settings
All Django settings this pattern recognizes:
```python
# settings.py
import environ

env = environ.Env()  # django-environ; provides env() and env.bool()
# --- SSO Provider ---
CASDOOR_ENABLED = env.bool('CASDOOR_ENABLED') # Master SSO toggle
CASDOOR_ORIGIN = env('CASDOOR_ORIGIN') # OIDC discovery base URL
CASDOOR_ORIGIN_FRONTEND = env('CASDOOR_ORIGIN_FRONTEND') # Frontend URL (may differ)
CASDOOR_CLIENT_ID = env('CASDOOR_CLIENT_ID') # OAuth client ID
CASDOOR_CLIENT_SECRET = env('CASDOOR_CLIENT_SECRET') # OAuth client secret
CASDOOR_ORG_NAME = env('CASDOOR_ORG_NAME') # Default organization
CASDOOR_SSL_VERIFY = env('CASDOOR_SSL_VERIFY') # true | false | /path/to/ca.pem
# --- Login Behavior ---
ALLOW_LOCAL_LOGIN = env.bool('ALLOW_LOCAL_LOGIN', default=False) # Show local form
# --- Allauth ---
SOCIALACCOUNT_ADAPTER = '<app>.adapters.CasdoorAccountAdapter'
ACCOUNT_ADAPTER = '<app>.adapters.LocalAccountAdapter'
```
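
The comment on `CASDOOR_SSL_VERIFY` lists three accepted forms (`true`, `false`, or a CA bundle path), while the SSL bypass module only checks for `false`. The sketch below shows one way to turn the raw string into a value that can be passed as the `verify` argument of a `requests` call, which accepts either a boolean or a path. The function name `parse_ssl_verify` and the choice to treat an empty value as `True` are assumptions for illustration.

```python
from typing import Union


def parse_ssl_verify(raw: str) -> Union[bool, str]:
    """Translate CASDOOR_SSL_VERIFY into a bool or a CA bundle path."""
    value = raw.strip()
    if value == "" or value.lower() == "true":
        # Assumption: an unset or empty value falls back to the secure default.
        return True
    if value.lower() == "false":
        return False
    # Anything else is treated as a filesystem path to a CA bundle.
    return value


if __name__ == "__main__":
    for sample in ("true", "false", "/etc/ssl/certs/internal-ca.pem"):
        print(sample, "->", parse_ssl_verify(sample))
```

Passing a CA bundle path keeps certificate verification on for self-signed sandboxes, which avoids the monkey patch entirely where that option is available.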
---
## Testing
Standard test cases every implementation should cover:
```python
from django.test import TestCase, override_settings
from unittest.mock import MagicMock
from django.contrib.auth.models import User, Group
from <app>.adapters import CasdoorAccountAdapter, LocalAccountAdapter


class CasdoorAdapterTest(TestCase):

    def setUp(self):
        self.adapter = CasdoorAccountAdapter()

    def test_signup_always_open(self):
        """SSO signup must always be permitted."""
        self.assertTrue(self.adapter.is_open_for_signup(MagicMock(), MagicMock()))

    def test_superuser_never_set_via_sso(self):
        """populate_user must force is_superuser=False."""
        sociallogin = MagicMock()
        data = {'email': 'admin@example.com', 'groups': ['admin']}
        user = self.adapter.populate_user(MagicMock(), sociallogin, data)
        self.assertFalse(user.is_superuser)

    def test_email_used_as_username(self):
        """Username must be the full email address."""
        sociallogin = MagicMock()
        data = {'email': 'jane@example.com'}
        user = self.adapter.populate_user(MagicMock(), sociallogin, data)
        self.assertEqual(user.username, 'jane@example.com')

    def test_staff_flag_from_groups(self):
        """is_staff must be True when user belongs to staff/sme/admin."""
        sociallogin = MagicMock()
        for group in ['staff', 'sme', 'admin']:
            data = {'email': 'user@example.com', 'groups': [group]}
            user = self.adapter.populate_user(MagicMock(), sociallogin, data)
            self.assertTrue(user.is_staff, f"is_staff should be True for group '{group}'")

    def test_name_fallback_parsing(self):
        """When given_name/family_name absent, parse 'name' claim."""
        sociallogin = MagicMock()
        data = {'email': 'user@example.com', 'name': 'Jane Doe'}
        user = self.adapter.populate_user(MagicMock(), sociallogin, data)
        self.assertEqual(user.first_name, 'Jane')
        self.assertEqual(user.last_name, 'Doe')

    def test_group_mapping(self):
        """Casdoor groups must map to correctly named Django groups."""
        Group.objects.create(name='View Only')
        Group.objects.create(name='Staff')
        user = User.objects.create_user('test@example.com', 'test@example.com')
        self.adapter._assign_groups(user, ['view_only', 'staff'])
        group_names = set(user.groups.values_list('name', flat=True))
        self.assertEqual(group_names, {'View Only', 'Staff'})

    def test_superuser_sso_login_blocked(self):
        """pre_social_login must raise ImmediateHttpResponse for superusers."""
        from allauth.exceptions import ImmediateHttpResponse
        user = User.objects.create_superuser(
            'admin@example.com', 'admin@example.com', 'pass'
        )
        sociallogin = MagicMock()
        sociallogin.user = user
        sociallogin.user.id = user.id
        with self.assertRaises(ImmediateHttpResponse):
            self.adapter.pre_social_login(MagicMock(), sociallogin)

    def test_groups_resync_on_returning_login(self):
        """pre_social_login must re-sync groups for existing users."""
        Group.objects.create(name='Admin')
        Group.objects.create(name='Staff')
        user = User.objects.create_user('user@example.com', 'user@example.com')
        user.groups.add(Group.objects.get(name='Staff'))
        sociallogin = MagicMock()
        sociallogin.user = user
        sociallogin.user.id = user.id
        sociallogin.account.extra_data = {
            'groups': ['admin'],
            'organization': '',
        }
        self.adapter.pre_social_login(MagicMock(), sociallogin)
        group_names = set(user.groups.values_list('name', flat=True))
        self.assertEqual(group_names, {'Admin'})


class LocalAdapterTest(TestCase):

    def test_local_signup_disabled(self):
        """Local signup must always be disabled."""
        adapter = LocalAccountAdapter()
        self.assertFalse(adapter.is_open_for_signup(MagicMock()))
```

96
docs/Red Panda Django.md Normal file

@@ -0,0 +1,96 @@
## Red Panda Approval™
This project follows Red Panda Approval standards - our gold standard for Django application quality. Code must be elegant, reliable, and maintainable to earn the approval of our adorable red panda judges.
### The 5 Sacred Django Criteria
1. **Fresh Migration Test** - Clean migrations from empty database
2. **Elegant Simplicity** - No unnecessary complexity
3. **Observable & Debuggable** - Proper logging and error handling
4. **Consistent Patterns** - Follow Django conventions
5. **Actually Works** - Passes all checks and serves real user needs
### Standards
# Environment
Virtual environment: ~/env/PROJECT/bin/activate
Python version: 3.12
# Code Organization
Maximum file length: 1000 lines
CSS: External .css files only (no inline/embedded)
JS: External .js files only (no inline/embedded)
# Required Packages
- Bootstrap 5.x (no custom CSS unless absolutely necessary)
- Bootstrap Icons (no emojis)
- django-crispy-forms + crispy-bootstrap5
- django-allauth
# Testing
Framework: Django TestCase (not pytest)
Minimum coverage: XX%? (optional)
### Database Conventions
# Development vs Production
- Development: SQLite
- Production: PostgreSQL
- Use dj-database-url for configuration
# Model Naming
- Model names: singular PascalCase (User, BlogPost, OrderItem)
- Related names: plural snake_case with proper English pluralization
- user.blog_posts, order.items
- category.industries (not industrys)
- person.children (not childs)
- analysis.analyses (not analysiss)
- Through tables: describe relationship (ProjectMembership, CourseEnrollment)
# Field Naming
- Foreign keys: singular without _id suffix (author, category, parent)
- Boolean fields: use prefixes (is_active, has_permission, can_edit)
- Date fields: use suffixes (created_at, updated_at, published_on)
- Avoid abbreviations (use description, not desc)
# Required Model Fields
All models should include:
- created_at = models.DateTimeField(auto_now_add=True)
- updated_at = models.DateTimeField(auto_now=True)
Consider adding:
- id = models.UUIDField(primary_key=True) for public-facing models
- is_active = models.BooleanField(default=True) for soft deletes
# Indexing
- Add db_index=True to frequently queried fields
- Use Meta.indexes for composite indexes
- Document why each index exists
# Migrations
- Never edit migrations that have been deployed
- Use meaningful migration names: --name add_email_to_profile
- One logical change per migration when possible
- Test migrations both forward and backward
# Queries
- Use select_related() for foreign keys
- Use prefetch_related() for reverse relations and M2M
- Avoid queries in loops (N+1 problem)
- Use .only() and .defer() for large models
- Add comments explaining complex querysets
## Monitoring & Health Check Endpoints
Follow standard Kubernetes health check endpoints for container orchestration:
### /ready/ - Readiness probe checks if the application is ready to serve traffic
- Validates database connectivity
- Validates cache connectivity
- Returns 200 if ready, 503 if dependencies are unavailable
- Used by load balancers to determine if pod should receive traffic
### /live/ - Liveness probe checks if the application process is alive
- Simple health check with minimal logic
- Returns 200 if Django is responding to requests
- Used by Kubernetes to determine if pod should be restarted
Note: For detailed metrics and monitoring, use Prometheus and Alloy integration rather than custom health endpoints.
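
The probe behaviour described above can be sketched without any framework code. This is an illustration under stated assumptions, not the Themis implementation: `readiness` and `liveness` are hypothetical names, and each dependency check is a plain callable that raises on failure. In a Django view the returned status and payload would typically be wrapped in a `JsonResponse(payload, status=status)`.

```python
from typing import Callable, Dict, Tuple


def readiness(checks: Dict[str, Callable[[], None]]) -> Tuple[int, Dict[str, str]]:
    """Run each dependency check and return (HTTP status, per-check results)."""
    results: Dict[str, str] = {}
    for name, check in checks.items():
        try:
            check()  # a check signals failure by raising
            results[name] = "ok"
        except Exception as exc:
            results[name] = f"error: {exc}"
    # 200 only when every dependency answered; otherwise 503.
    status = 200 if all(value == "ok" for value in results.values()) else 503
    return status, results


def liveness() -> Tuple[int, Dict[str, str]]:
    """Minimal logic: if this function runs, the process is alive."""
    return 200, {"status": "alive"}


if __name__ == "__main__":
    def database_ok() -> None:
        pass

    def cache_down() -> None:
        raise ConnectionError("cache unreachable")

    print(readiness({"database": database_ok, "cache": cache_down}))
    print(liveness())
```

Keeping the liveness probe free of dependency checks matters: if it touched the database, a database outage would cause the orchestrator to restart healthy application pods.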


@@ -0,0 +1,306 @@
## 🐾 Red Panda Approval™
This project follows Red Panda Approval standards — our gold standard for Django application quality. Code must be elegant, reliable, and maintainable to earn the approval of our adorable red panda judges.
### The 5 Sacred Django Criteria
1. **Fresh Migration Test** — Clean migrations from empty database
2. **Elegant Simplicity** — No unnecessary complexity
3. **Observable & Debuggable** — Proper logging and error handling
4. **Consistent Patterns** — Follow Django conventions
5. **Actually Works** — Passes all checks and serves real user needs
## Environment Standards
- Virtual environment: ~/env/PROJECT/bin/activate
- Use pyproject.toml for project configuration (no setup.py, no requirements.txt)
- Python version: specified in pyproject.toml
- Dependencies: floor-pinned with ceiling (e.g. `Django>=5.2,<6.0`)
### Dependency Pinning
```toml
# Correct — floor pin with ceiling
dependencies = [
"Django>=5.2,<6.0",
"djangorestframework>=3.14,<4.0",
"cryptography>=41.0,<45.0",
]
# Wrong — exact pins in library packages
dependencies = [
"Django==5.2.7", # too strict, breaks downstream
]
```
Exact pins (`==`) are only appropriate in application-level lock files, not in reusable library packages.
## Directory Structure
    myproject/                      # Git repository root
    ├── .gitignore
    ├── README.md
    ├── pyproject.toml              # Project configuration (moved to repo root)
    ├── docker-compose.yml
    ├── .env                        # Docker Compose environment (DATABASE_URL=postgres://...)
    ├── .env.example
    ├── project/                    # Django project root (manage.py lives here)
    │   ├── manage.py
    │   ├── Dockerfile
    │   ├── .env                    # Local development environment (DATABASE_URL=sqlite:///...)
    │   ├── .env.example
    │   │
    │   ├── config/                 # Django configuration module
    │   │   ├── __init__.py
    │   │   ├── settings.py
    │   │   ├── urls.py
    │   │   ├── wsgi.py
    │   │   └── asgi.py
    │   │
    │   ├── accounts/               # Django app
    │   │   ├── __init__.py
    │   │   ├── models.py
    │   │   ├── views.py
    │   │   └── urls.py
    │   │
    │   ├── blog/                   # Django app
    │   │   ├── __init__.py
    │   │   ├── models.py
    │   │   ├── views.py
    │   │   └── urls.py
    │   │
    │   ├── static/
    │   │   ├── css/
    │   │   └── js/
    │   │
    │   └── templates/
    │       └── base.html
    ├── web/                        # Nginx configuration
    │   └── nginx.conf
    ├── db/                         # PostgreSQL configuration
    │   └── postgresql.conf
    └── docs/                       # Project documentation
        └── index.md
## Settings Structure
- Use a single settings.py file
- Use django-environ or python-dotenv for environment variables
- Never commit .env files to version control
- Provide .env.example with all required variables documented
- Create .gitignore file
- Create a .dockerignore file
## Code Organization
- Imports: PEP 8 ordering (stdlib, third-party, local)
- Type hints on function parameters
- CSS: External .css files only (no inline styles, no embedded `<style>` tags)
- JS: External .js files only (no inline handlers, no embedded `<script>` blocks)
- Maximum file length: 1000 lines
- If a file exceeds 500 lines, consider splitting by domain concept
## Database Conventions
- Migrations run cleanly from empty database
- Never edit deployed migrations
- Use meaningful migration names: --name add_email_to_profile
- One logical change per migration when possible
- Test migrations both forward and backward
### Development vs Production
- Development: SQLite
- Production: PostgreSQL
## Caching
- Expensive queries are cached
- Cache keys follow naming convention
- TTLs are appropriate (not infinite)
- Invalidation is documented
- Key Naming Pattern: {app}:{model}:{identifier}:{field}
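
The key naming pattern is easier to keep consistent when a single helper builds every key. A minimal sketch, assuming a hypothetical `cache_key` function; rejecting empty segments and segments containing `:` is an added safeguard, not something the standard above requires.

```python
def cache_key(app: str, model: str, identifier: object, field: str) -> str:
    """Build a key following the {app}:{model}:{identifier}:{field} pattern."""
    parts = [str(part) for part in (app, model, identifier, field)]
    for part in parts:
        # A colon inside a segment would make the key ambiguous to parse.
        if not part or ":" in part:
            raise ValueError(f"invalid cache key segment: {part!r}")
    return ":".join(parts)


if __name__ == "__main__":
    print(cache_key("blog", "post", 42, "comment_count"))
```

Centralizing key construction also gives invalidation code one place to look when a key format changes.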
## Model Naming
- Model names: singular PascalCase (User, BlogPost, OrderItem)
- Correct English pluralization on related names
- All models have created_at and updated_at
- All models define __str__ and get_absolute_url
- TextChoices used for status fields
- related_name defined on ForeignKey fields
- Related names: plural snake_case with proper English pluralization
## Forms
- Use ModelForm with explicit fields list (never __all__)
## Field Naming
- Foreign keys: singular without _id suffix (author, category, parent)
- Boolean fields: use prefixes (is_active, has_permission, can_edit)
- Date fields: use suffixes (created_at, updated_at, published_on)
- Avoid abbreviations (use description, not desc)
## Required Model Fields
- All models should include:
- created_at = models.DateTimeField(auto_now_add=True)
- updated_at = models.DateTimeField(auto_now=True)
- Consider adding:
- id = models.UUIDField(primary_key=True) for public-facing models
- is_active = models.BooleanField(default=True) for soft deletes
## Indexing
- Add db_index=True to frequently queried fields
- Use Meta.indexes for composite indexes
- Document why each index exists
## Queries
- Use select_related() for foreign keys
- Use prefetch_related() for reverse relations and M2M
- Avoid queries in loops (N+1 problem)
- Use .only() and .defer() for large models
- Add comments explaining complex querysets
## Docstrings
- Use Sphinx style docstrings
- Document all public functions, classes, and modules
- Skip docstrings for obvious one-liners and standard Django overrides
## Views
- Use Function-Based Views (FBVs) exclusively
- Explicit logic is preferred over implicit inheritance
- Extract shared logic into utility functions
## URLs & Identifiers
- Public URLs use short UUIDs (12 characters) via `shortuuid`
- Never expose sequential IDs in URLs (security/enumeration risk)
- Internal references may use standard UUIDs or PKs
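
To illustrate what a 12-character, non-sequential public identifier looks like, here is a stdlib-only sketch. It is not the `shortuuid` implementation and does not use its API; the alphabet (letters and digits with look-alike characters removed) and the name `new_public_id` are assumptions made for the example. Projects following this standard would use `shortuuid` itself.

```python
import secrets
import string

# Assumed alphabet: letters and digits minus the look-alikes 0, O, 1, l, I.
ALPHABET = "".join(
    ch for ch in string.ascii_letters + string.digits if ch not in "0O1lI"
)


def new_public_id(length: int = 12) -> str:
    """Return a random identifier that reveals nothing about row order."""
    # secrets.choice draws from a cryptographically strong random source.
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


if __name__ == "__main__":
    print(new_public_id())
```

Because the value is random rather than derived from the primary key, a visitor cannot guess neighbouring records by incrementing the URL.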
## URL Patterns
- Resource-based URLs (RESTful style)
- Namespaced URL names per app
- Trailing slashes (Django default)
- Flat structure preferred over deep nesting
## Background Tasks
- All tasks are run synchronously unless the design specifies background tasks are needed for long operations
- Long operations use Celery tasks
- Track task progress in Memcached using the key pattern {app}:task:{task_id}:progress
- Tasks are idempotent
- Tasks include retry logic
- Tasks live in app/tasks.py
- RabbitMQ is the Message Broker
- Flower Monitoring: Use for debugging failed tasks
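
The task progress key pattern from the list above can be sketched as follows. A plain dictionary stands in for Memcached so the example runs anywhere; `progress_key` and `InMemoryProgress` are hypothetical names, and clamping to the 0-100 range is an assumption about how progress is expressed.

```python
from typing import Dict, Optional


def progress_key(app: str, task_id: str) -> str:
    """Build the cache key for a task's progress value."""
    return f"{app}:task:{task_id}:progress"


class InMemoryProgress:
    """Dict-backed stand-in for a Memcached client (illustration only)."""

    def __init__(self) -> None:
        self._store: Dict[str, int] = {}

    def report(self, app: str, task_id: str, percent: int) -> None:
        # Clamp so a miscounted loop never reports below 0 or above 100.
        self._store[progress_key(app, task_id)] = max(0, min(100, percent))

    def read(self, app: str, task_id: str) -> Optional[int]:
        # None means the task has not reported yet (or the key expired).
        return self._store.get(progress_key(app, task_id))


if __name__ == "__main__":
    tracker = InMemoryProgress()
    tracker.report("blog", "abc123", 40)
    print(progress_key("blog", "abc123"), tracker.read("blog", "abc123"))
```

A task would call `report` as it works through its items, and a polling view would call `read` with the same app name and task ID.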
## Testing
- Framework: Django TestCase (not pytest)
- Separate test files per module: test_models.py, test_views.py, test_forms.py
## Frontend Standards
### New Projects (DaisyUI + Tailwind)
- DaisyUI 4 via CDN for component classes
- Tailwind CSS via CDN for utility classes
- Theme management via Themis (DaisyUI `data-theme` attribute)
- All apps extend `themis/base.html` for consistent navigation
- No inline styles or scripts
### Existing Projects (Bootstrap 5)
- Bootstrap 5 via CDN
- Bootstrap Icons via CDN
- Bootswatch for theme variants (if applicable)
- django-bootstrap5 and crispy-bootstrap5 for form rendering
## Preferred Packages
### Core Django
- django>=5.2,<6.0
- django-environ — Environment variables
### Authentication & Security
- django-allauth — User management
- django-allauth-2fa — Two-factor authentication
### API Development
- djangorestframework>=3.14,<4.0 — REST APIs
- drf-spectacular — OpenAPI/Swagger documentation
### Encryption
- cryptography — Fernet encryption for secrets/API keys
### Background Tasks
- celery — Async task queue
- django-celery-progress — Progress bars
- flower — Celery monitoring
### Caching
- pymemcache — Memcached backend
### Database
- dj-database-url — Database URL configuration
- psycopg[binary] — PostgreSQL adapter
- shortuuid — Short UUIDs for public URLs
### Production
- gunicorn — WSGI server
### Shared Apps
- django-heluca-themis — User preferences, themes, key management, navigation
### Deprecated / Removed
- ~~pytz~~ — Use stdlib `zoneinfo` (Python 3.9+, Django 4+)
- ~~Pillow~~ — Only add if your app needs ImageField
- ~~django-heluca-core~~ — Replaced by Themis
## Anti-Patterns to Avoid
### Models
- Don't use `Model.objects.get()` without handling `DoesNotExist`
- Don't use `null=True` on `CharField` or `TextField` (use `blank=True, default=""`)
- Don't use `related_name='+'` unless you have a specific reason
- Don't override `save()` for business logic (use signals or service functions)
- Don't use `auto_now=True` on fields you might need to manually set
- Don't use `ForeignKey` without specifying `on_delete` explicitly
- Don't use `Meta.ordering` on large tables (specify ordering in queries)
### Queries
- Don't query inside loops (N+1 problem)
- Don't use `.all()` when you need a subset
- Don't use raw SQL unless absolutely necessary
- Don't forget `select_related()` and `prefetch_related()`
### Views
- Don't put business logic in views
- Don't use `request.POST.get()` without validation (use forms)
- Don't return sensitive data in error messages
- Don't forget `login_required` decorator on protected views
### Forms
- Don't use `fields = '__all__'` in ModelForm
- Don't trust client-side validation alone
- Don't use `exclude` in ModelForm (use explicit `fields`)
### Templates
- Don't use `{{ variable }}` for URLs (use `{% url %}` tag)
- Don't put logic in templates
- Don't use inline CSS or JavaScript (external files only)
- Don't forget `{% csrf_token %}` in forms
### Security
- Don't store secrets in `settings.py` (use environment variables)
- Don't commit `.env` files to version control
- Don't use `DEBUG=True` in production
- Don't expose sequential IDs in public URLs
- Don't use `mark_safe()` on user-supplied content
- Don't disable CSRF protection
### Imports & Code Style
- Don't use `from module import *`
- Don't use mutable default arguments
- Don't use bare `except:` clauses
- Don't ignore linter warnings without documented reason
### Migrations
- Don't edit migrations that have been deployed
- Don't use `RunPython` without a reverse function
- Don't add non-nullable fields without a default value
### Celery Tasks
- Don't pass model instances to tasks (pass IDs and re-fetch)
- Don't assume tasks run immediately
- Don't forget retry logic for external service calls

475
docs/Themis_V1-00.md Normal file

@@ -0,0 +1,475 @@
# Themis v1.0.0
Reusable Django app providing user preferences, DaisyUI theme management, API key management, and standard navigation templates.
**Package:** django-heluca-themis
**Django:** >=5.2, <6.0
**Python:** >=3.10
**License:** MIT
## 🐾 Red Panda Approval™
This project follows Red Panda Approval standards.
---
## Overview
Themis provides the foundational elements every Django application needs:
- **UserProfile** — timezone, date/time/number formatting, DaisyUI theme selection
- **Notifications** — in-app notification bell, polling, browser desktop notifications, user preferences
- **API Key Management** — encrypted storage with per-key instructions
- **Standard Navigation** — navbar, user menu, notification bell, theme toggle, bottom nav
- **Middleware** — automatic timezone activation and theme context
- **Formatting Utilities** — date, time, number formatting respecting user preferences
- **Health Checks** — Kubernetes-ready `/ready/` and `/live/` endpoints
Themis does not provide domain models (Organization, etc.) or notification triggers. Those are documented as patterns for consuming apps to implement.
---
## Installation
### From Git Repository
```bash
pip install git+ssh://git@git.helu.ca:22022/r/themis.git
```
### For Local Development
```bash
pip install -e /path/to/themis
```
### Configuration
**settings.py:**
```python
INSTALLED_APPS = [
    ...
    "rest_framework",
    "themis",
    ...
]

MIDDLEWARE = [
    ...
    "themis.middleware.TimezoneMiddleware",
    "themis.middleware.ThemeMiddleware",
    ...
]

TEMPLATES = [{
    "OPTIONS": {
        "context_processors": [
            ...
            "themis.context_processors.themis_settings",
            "themis.context_processors.user_preferences",
            "themis.context_processors.notifications",
        ],
    },
}]

# Themis app settings
THEMIS_APP_NAME = "My Application"
THEMIS_NOTIFICATION_POLL_INTERVAL = 60 # seconds (0 to disable polling)
THEMIS_NOTIFICATION_MAX_AGE_DAYS = 90 # cleanup ceiling for read notifications
```
**urls.py:**
```python
from django.urls import include, path

urlpatterns = [
    ...
    path("", include("themis.urls")),
    path("api/v1/", include("themis.api.urls")),
    ...
]
```
**Run migrations:**
```bash
python manage.py migrate
```
---
## Models
### UserProfile
Extends Django's User model with display preferences. Automatically created via `post_save` signal when a User is created.
| Field | Type | Default | Description |
|---|---|---|---|
| user | OneToOneField | required | Link to Django User |
| home_timezone | CharField(50) | UTC | Permanent timezone |
| current_timezone | CharField(50) | (blank) | Current timezone when traveling |
| date_format | CharField(20) | YYYY-MM-DD | Date display format |
| time_format | CharField(10) | 24-hour | 12-hour or 24-hour |
| thousand_separator | CharField(10) | comma | Number formatting |
| week_start | CharField(10) | monday | First day of week |
| theme_mode | CharField(10) | auto | light / dark / auto |
| theme_name | CharField(30) | corporate | DaisyUI light theme |
| dark_theme_name | CharField(30) | business | DaisyUI dark theme |
| created_at | DateTimeField | auto | Record creation |
| updated_at | DateTimeField | auto | Last update |
**Properties:**
- `effective_timezone` — returns current_timezone if set, otherwise home_timezone
- `is_traveling` — True if current_timezone differs from home_timezone
**Why two timezone fields?**
Users who travel frequently need to see times in their current location while still knowing what time it is "at home." Setting `current_timezone` enables this without losing the home timezone setting.
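The fallback between the two fields can be sketched in plain Python. This is a minimal illustration, not the actual model code: `ProfileTimezones` is a hypothetical stand-in for the two `UserProfile` fields, and treating a blank `current_timezone` as "not traveling" is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ProfileTimezones:
    """Hypothetical stand-in for the two timezone fields on UserProfile."""

    home_timezone: str = "UTC"
    current_timezone: str = ""  # blank means no travel timezone is set

    @property
    def effective_timezone(self) -> str:
        # Prefer the travel timezone when one is set, otherwise use home.
        return self.current_timezone or self.home_timezone

    @property
    def is_traveling(self) -> bool:
        # Assumption: a blank current_timezone never counts as traveling.
        return bool(self.current_timezone) and self.current_timezone != self.home_timezone


at_home = ProfileTimezones(home_timezone="America/Toronto")
on_trip = ProfileTimezones(home_timezone="America/Toronto", current_timezone="Europe/Lisbon")
```

Clearing `current_timezone` on return restores the home timezone without the user having to re-enter it.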
### UserAPIKey
Stores encrypted API keys, MCP credentials, DAV passwords, and other service credentials.
| Field | Type | Default | Description |
|---|---|---|---|
| id | UUIDField | auto | Primary key |
| user | ForeignKey | required | Owner |
| service_name | CharField(100) | required | Service name (e.g. "OpenAI") |
| key_type | CharField(30) | api | api / mcp / dav / token / secret / other |
| label | CharField(100) | (blank) | User nickname for this key |
| encrypted_value | TextField | required | Fernet-encrypted credential |
| instructions | TextField | (blank) | How to obtain and use this key |
| help_url | URLField | (blank) | Link to service documentation |
| is_active | BooleanField | True | Whether key is in use |
| last_used_at | DateTimeField | null | Last usage timestamp |
| expires_at | DateTimeField | null | Expiration date |
| created_at | DateTimeField | auto | Record creation |
| updated_at | DateTimeField | auto | Last update |
**Properties:**
- `masked_value` — shows only last 4 characters (e.g. `****7xQ2`)
- `display_name` — returns label if set, otherwise service_name
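A rough sketch of what these two properties compute, written as standalone functions. The function names and the fully hidden result for very short values are assumptions; the real `masked_value` presumably decrypts the stored credential before masking it.

```python
def mask_secret(value: str, visible: int = 4) -> str:
    """Return a fixed mask followed by the last `visible` characters."""
    if len(value) <= visible:
        # Assumption: values this short are hidden entirely.
        return "****"
    return "****" + value[-visible:]


def display_name(label: str, service_name: str) -> str:
    """Prefer the user's label, falling back to the service name."""
    return label or service_name
```

Using a fixed-width mask avoids leaking the length of the credential.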
**Encryption:**
Keys are encrypted at rest using Fernet symmetric encryption, with the Fernet key derived from Django's `SECRET_KEY`. The plaintext value is never stored and is shown only once, at creation time.
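How the key is derived from `SECRET_KEY` is not spelled out here. One common approach, shown purely as an assumption, is to hash the secret with SHA-256 and base64-encode the digest, which yields the 32-byte URL-safe key format that Fernet expects. The function name `derive_fernet_key` is hypothetical.

```python
import base64
import hashlib


def derive_fernet_key(secret_key: str) -> bytes:
    """Derive a 32-byte, URL-safe base64 key (the format Fernet expects)."""
    digest = hashlib.sha256(secret_key.encode("utf-8")).digest()  # 32 raw bytes
    return base64.urlsafe_b64encode(digest)


key = derive_fernet_key("django-insecure-example-secret")

# With the `cryptography` package installed, the key could be used like this:
#   from cryptography.fernet import Fernet
#   token = Fernet(key).encrypt(b"my-api-key")
#   Fernet(key).decrypt(token)
```

Under any scheme of this kind, rotating `SECRET_KEY` makes previously stored values undecryptable unless they are re-encrypted first.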
### UserNotification
In-app notification for a user. Created by consuming apps via `notify_user()`.
| Field | Type | Default | Description |
|---|---|---|---|
| id | UUIDField | auto | Primary key |
| user | ForeignKey | required | Recipient |
| title | CharField(200) | required | Short headline |
| message | TextField | (blank) | Body text |
| level | CharField(10) | info | info / success / warning / danger |
| url | CharField(500) | (blank) | Link to navigate on click |
| source_app | CharField(100) | (blank) | App label of sender |
| source_model | CharField(100) | (blank) | Model that triggered this |
| source_id | CharField(100) | (blank) | PK of source object |
| is_read | BooleanField | False | Whether user has read this |
| read_at | DateTimeField | null | When it was read |
| is_dismissed | BooleanField | False | Whether user dismissed this |
| dismissed_at | DateTimeField | null | When it was dismissed |
| expires_at | DateTimeField | null | Auto-expire datetime |
| created_at | DateTimeField | auto | Record creation |
| updated_at | DateTimeField | auto | Last update |
**Properties:**
- `level_weight` — numeric weight for level comparison (info=0, success=0, warning=1, danger=2)
- `is_expired` — True if expires_at has passed
- `level_css_class` — DaisyUI alert class (e.g. `alert-warning`)
- `level_badge_class` — DaisyUI badge class (e.g. `badge-warning`)
### UserProfile Notification Preferences
The UserProfile model includes four notification preference fields:
| Field | Type | Default | Description |
|---|---|---|---|
| notifications_enabled | BooleanField | True | Master on/off switch |
| notifications_min_level | CharField(10) | info | Minimum level to display |
| browser_notifications_enabled | BooleanField | False | Browser desktop notifications |
| notification_retention_days | PositiveIntegerField | 30 | Days to keep read notifications |
---
## Notifications
### Creating Notifications
All notification creation goes through the `notify_user()` utility:
```python
from themis.notifications import notify_user
notify_user(
user=user,
title="Task overdue",
message="Task 'Deploy v2' was due yesterday.",
level="warning",
url="/tasks/42/",
source_app="tasks",
source_model="Task",
source_id="42",
deduplicate=True,
)
```
This function respects user preferences (enabled flag, minimum level) and supports deduplication via source tracking fields.
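The preference check and deduplication described above could look roughly like the following. This is a sketch under assumptions: the helper names are hypothetical, the weights are taken from the `level_weight` property, and deduplication is modeled as a lookup on the three source tracking fields.

```python
# Weights as documented for UserNotification.level_weight.
LEVEL_WEIGHTS = {"info": 0, "success": 0, "warning": 1, "danger": 2}


def should_deliver(level: str, enabled: bool, min_level: str) -> bool:
    """Apply the master switch, then the user's minimum-level threshold."""
    if not enabled:
        return False
    return LEVEL_WEIGHTS[level] >= LEVEL_WEIGHTS[min_level]


def is_duplicate(existing_sources: set, source_app: str, source_model: str, source_id: str) -> bool:
    """Treat a notification as a duplicate when its source triple is already present."""
    return (source_app, source_model, source_id) in existing_sources
```

Because `info` and `success` share a weight, a minimum level of `info` lets both through, while a minimum of `warning` suppresses both.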
### Notification Bell
The notification bell appears in the navbar for authenticated users with notifications enabled. It shows an unread count badge and a dropdown with a link to the full notification list.
### Polling
The `notifications.js` script polls the `/notifications/count/` endpoint at a configurable interval (default: 60 seconds) and updates the badge. Set `THEMIS_NOTIFICATION_POLL_INTERVAL = 0` to disable polling.
### Browser Desktop Notifications
When a user enables browser notifications in their preferences, Themis will request permission from the browser and show OS-level desktop notifications when new notifications arrive via polling.
### Cleanup
Old read/dismissed/expired notifications can be cleaned up with:
```bash
python manage.py cleanup_notifications
python manage.py cleanup_notifications --max-age-days=60
```
For details on trigger patterns, see **[Notification Trigger Pattern](Pattern_Notification_V1-00.md)**.
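The cutoff calculation behind the cleanup command might be sketched as below. How the per-user `notification_retention_days` interacts with `THEMIS_NOTIFICATION_MAX_AGE_DAYS` is an assumption here (the smaller of the two wins), and both function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def cleanup_cutoff(now: datetime, retention_days: int, max_age_days: int = 90) -> datetime:
    """Return the timestamp before which read notifications are removed."""
    # Assumption: the project-wide ceiling caps the per-user retention.
    days = min(retention_days, max_age_days)
    return now - timedelta(days=days)


def is_stale(read_at: Optional[datetime], cutoff: datetime) -> bool:
    """Only notifications that were read before the cutoff qualify."""
    return read_at is not None and read_at < cutoff


now = datetime(2026, 3, 21, tzinfo=timezone.utc)
cutoff = cleanup_cutoff(now, retention_days=30)
```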
---
## Templates
### Base Template
All consuming apps extend `themis/base.html`:
```html
{% extends "themis/base.html" %}
{% block title %}Dashboard — My App{% endblock %}
{% block nav_items %}
<li><a href="{% url 'dashboard' %}">Dashboard</a></li>
<li><a href="{% url 'reports' %}">Reports</a></li>
{% endblock %}
{% block content %}
<h1 class="text-2xl font-bold">Dashboard</h1>
<!-- app content -->
{% endblock %}
```
### Available Blocks
| Block | Location | Purpose |
|---|---|---|
| `title` | `<title>` | Page title |
| `extra_head` | `<head>` | Additional CSS/meta |
| `navbar` | Top of `<body>` | Entire navbar (override to customize) |
| `nav_items` | Navbar (mobile) | Navigation links |
| `nav_items_desktop` | Navbar (desktop) | Desktop-only nav links |
| `nav_items_mobile` | Navbar (mobile) | Mobile-only nav links |
| `body_attrs` | `<body>` | Extra body attributes |
| `content` | `<main>` | Page content |
| `footer` | Bottom of `<body>` | Entire footer (override to customize) |
| `extra_scripts` | Before `</body>` | Additional JavaScript |
### Navigation Structure
**Navbar (fixed):**
```
[App Logo/Name] [Nav Items] [Theme ☀/🌙] [🔔 3] [User ▾]
├─ Settings
├─ API Keys
└─ Logout
```
Collapses to hamburger menu on mobile.
**Bottom Nav (fixed):**
```
© 2026 App Name
```
### What Apps Cannot Change
- Navbar is always a horizontal bar at the top
- User menu is always on the right
- Theme toggle is always in the navbar
- Bottom nav is always present
- Messages display below the navbar
- Content is in a centered container
---
## Middleware
### TimezoneMiddleware
Activates the user's effective timezone for each request using `zoneinfo`. All datetime operations within the request use the user's timezone. Falls back to UTC for anonymous users.
```python
MIDDLEWARE = [
...
"themis.middleware.TimezoneMiddleware",
...
]
```
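The UTC fallback can be illustrated with a small resolver built on the standard library. This is a sketch, not the middleware's actual code: `resolve_timezone` is a hypothetical helper, and falling back to UTC for unknown zone names (not just anonymous users) is an assumption.

```python
from datetime import timezone, tzinfo
from typing import Optional
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError


def resolve_timezone(name: Optional[str]) -> tzinfo:
    """Return the tzinfo to activate for a request, defaulting to UTC."""
    if not name:
        # Anonymous user, or no timezone configured on the profile.
        return timezone.utc
    try:
        return ZoneInfo(name)
    except (ZoneInfoNotFoundError, ValueError):
        # Unknown or malformed zone name: fall back rather than fail the request.
        return timezone.utc
```

In a Django middleware, the result would typically be passed to `django.utils.timezone.activate()` before the view runs.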
### ThemeMiddleware
Attaches DaisyUI theme information to the request. The context processor reads these values for template rendering.
```python
MIDDLEWARE = [
...
"themis.middleware.ThemeMiddleware",
...
]
```
---
## Context Processors
### themis_settings
Provides app configuration from `THEMIS_*` settings:
- `themis_app_name`
- `themis_notification_poll_interval`
### user_preferences
Provides user preferences:
- `user_timezone`
- `user_date_format`
- `user_time_format`
- `user_is_traveling`
- `user_theme_mode`
- `user_theme_name`
- `user_dark_theme_name`
- `user_profile`
### notifications
Provides notification state:
- `themis_unread_notification_count`
- `themis_notifications_enabled`
- `themis_browser_notifications_enabled`
---
## Utilities
### Formatting
```python
from themis.utils import format_date_for_user, format_time_for_user, format_number_for_user
formatted_date = format_date_for_user(date_obj, request.user)
formatted_time = format_time_for_user(time_obj, request.user)
formatted_num = format_number_for_user(1000000, request.user)
```
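To show the kind of mapping these helpers perform, here is a simplified, user-independent sketch. The option names other than `YYYY-MM-DD` and `comma` are assumptions, as are the function names; the real utilities take a user object and read these values from the profile.

```python
from datetime import date

# Assumed option names; only "YYYY-MM-DD" and "comma" appear in the field defaults.
DATE_PATTERNS = {
    "YYYY-MM-DD": "%Y-%m-%d",
    "DD/MM/YYYY": "%d/%m/%Y",
    "MM/DD/YYYY": "%m/%d/%Y",
}
SEPARATORS = {"comma": ",", "period": ".", "space": " ", "none": ""}


def format_date(value: date, date_format: str = "YYYY-MM-DD") -> str:
    """Translate the stored preference into a strftime pattern."""
    return value.strftime(DATE_PATTERNS.get(date_format, "%Y-%m-%d"))


def format_number(value: int, thousand_separator: str = "comma") -> str:
    """Group an integer by thousands using the preferred separator."""
    separator = SEPARATORS.get(thousand_separator, ",")
    return f"{value:,}".replace(",", separator)
```

The number helper is limited to integers here; decimals would also need a decimal-mark preference to avoid clashing with a `period` separator.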
### Timezone
```python
from themis.utils import convert_to_user_timezone, get_timezone_display
user_time = convert_to_user_timezone(utc_datetime, request.user)
tz_name = get_timezone_display(request.user)
```
### Template Tags
```html
{% load themis_tags %}
{{ event.date|user_date:request.user }}
{{ event.time|user_time:request.user }}
{{ revenue|user_number:request.user }}
{% user_timezone_name request.user %}
```
---
## URL Patterns
| URL | View | Purpose |
|---|---|---|
| `/ready/` | `ready` | Kubernetes readiness probe |
| `/live/` | `live` | Kubernetes liveness probe |
| `/profile/settings/` | `profile_settings` | User preferences page |
| `/profile/keys/` | `key_list` | API key list |
| `/profile/keys/add/` | `key_create` | Add new key |
| `/profile/keys/<uuid>/` | `key_detail` | Key detail + instructions |
| `/profile/keys/<uuid>/edit/` | `key_edit` | Edit key metadata |
| `/profile/keys/<uuid>/delete/` | `key_delete` | Delete key (POST only) |
| `/notifications/` | `notification_list` | Notification list page |
| `/notifications/<uuid>/read/` | `notification_mark_read` | Mark as read (POST) |
| `/notifications/read-all/` | `notification_mark_all_read` | Mark all read (POST) |
| `/notifications/<uuid>/dismiss/` | `notification_dismiss` | Dismiss (POST) |
| `/notifications/count/` | `notification_count` | Unread count JSON |
---
## REST API
| Endpoint | Method | Description |
|---|---|---|
| `/api/v1/profiles/` | GET | List profiles (own only, admin sees all) |
| `/api/v1/profiles/{id}/` | GET/PATCH | View/update profile |
| `/api/v1/keys/` | GET | List own API keys |
| `/api/v1/keys/` | POST | Create new key |
| `/api/v1/keys/{uuid}/` | GET/PATCH/DELETE | View/update/delete key |
| `/api/v1/notifications/` | GET | List own notifications (filterable) |
| `/api/v1/notifications/{uuid}/` | GET/PATCH/DELETE | View/update/delete notification |
| `/api/v1/notifications/{uuid}/mark_read/` | PATCH | Mark as read |
| `/api/v1/notifications/mark-all-read/` | PATCH | Mark all as read |
| `/api/v1/notifications/{uuid}/dismiss/` | PATCH | Dismiss notification |
| `/api/v1/notifications/count/` | GET | Unread count |
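As an illustration of calling one of these endpoints from Python, the sketch below builds (but does not send) a `PATCH` request with the standard library. The host name, the helper name, and the use of DRF-style token authentication are all assumptions; the authentication scheme Themis actually uses is not stated here.

```python
import json
import urllib.request

BASE_URL = "https://app.example.com"  # hypothetical host


def build_mark_read_request(notification_id: str, token: str) -> urllib.request.Request:
    """Build a PATCH request for the mark_read endpoint without sending it."""
    url = f"{BASE_URL}/api/v1/notifications/{notification_id}/mark_read/"
    return urllib.request.Request(
        url,
        data=json.dumps({}).encode("utf-8"),  # empty JSON body
        method="PATCH",
        headers={
            "Authorization": f"Token {token}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
    )


request = build_mark_read_request("1b4e28ba-2fa1-11d2-883f-0016d3cca427", "secret-token")
```

Passing the request to `urllib.request.urlopen()` would send it.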
---
## DaisyUI Themes
Themis supports all 32 built-in DaisyUI themes. Users select separate themes for light and dark modes. The theme toggle cycles: light → dark → auto (system).
No database table needed — themes are a simple CharField storing the DaisyUI theme name.
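The toggle cycle and the choice between the two stored theme names can be sketched as follows. The function names are hypothetical, and in practice the `auto` case is likely resolved in the browser, since the system preference is only known client-side.

```python
MODE_CYCLE = ["light", "dark", "auto"]


def next_theme_mode(current: str) -> str:
    """Advance light -> dark -> auto -> light; unknown values restart the cycle."""
    try:
        index = MODE_CYCLE.index(current)
    except ValueError:
        return MODE_CYCLE[0]
    return MODE_CYCLE[(index + 1) % len(MODE_CYCLE)]


def resolve_theme(
    mode: str,
    light_theme: str = "corporate",
    dark_theme: str = "business",
    system_prefers_dark: bool = False,
) -> str:
    """Pick the DaisyUI theme name for the given mode."""
    if mode == "dark" or (mode == "auto" and system_prefers_dark):
        return dark_theme
    return light_theme
```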
### Available Themes
light, dark, cupcake, bumblebee, emerald, corporate, synthwave, retro, cyberpunk, valentine, halloween, garden, forest, aqua, lofi, pastel, fantasy, wireframe, black, luxury, dracula, cmyk, autumn, business, acid, lemonade, night, coffee, winter, dim, nord, sunset
---
## Dependencies
```toml
dependencies = [
"Django>=5.2,<6.0",
"djangorestframework>=3.14,<4.0",
"cryptography>=41.0,<45.0",
]
```
No `pytz` (uses stdlib `zoneinfo`). No `Pillow`. No database-stored themes.

732
docs/mnemosyne.html Normal file

@@ -0,0 +1,732 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Mnemosyne — Architecture Documentation</title>
<link href="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0/dist/css/bootstrap.min.css" rel="stylesheet">
<link href="https://cdn.jsdelivr.net/npm/bootstrap-icons@1.11.0/font/bootstrap-icons.css" rel="stylesheet">
<script src="https://cdn.jsdelivr.net/npm/mermaid@10/dist/mermaid.min.js"></script>
<script>mermaid.initialize({ startOnLoad: true, theme: 'default' });</script>
</head>
<body>
<div class="container-fluid">
<nav class="navbar navbar-dark bg-dark rounded mb-4">
<div class="container-fluid">
<a class="navbar-brand" href="#"><i class="bi bi-book"></i> Mnemosyne — Architecture Documentation</a>
<div class="navbar-nav d-flex flex-row">
<a class="nav-link me-3" href="#overview">Overview</a>
<a class="nav-link me-3" href="#architecture">Architecture</a>
<a class="nav-link me-3" href="#data-model">Data Model</a>
<a class="nav-link me-3" href="#content-types">Content Types</a>
<a class="nav-link me-3" href="#multimodal-pipeline">Multimodal</a>
<a class="nav-link me-3" href="#search-pipeline">Search</a>
<a class="nav-link me-3" href="#mcp-interface">MCP</a>
<a class="nav-link me-3" href="#gpu-services">GPU</a>
<a class="nav-link" href="#deployment">Deployment</a>
</div>
</div>
</nav>
<div class="row">
<div class="col-12">
<h1 class="display-4 mb-2"><i class="bi bi-book-fill"></i> Mnemosyne <span class="badge bg-primary">Architecture</span></h1>
<p class="lead text-muted fst-italic">"The electric light did not come from the continuous improvement of candles." — Oren Harari</p>
<p class="lead">Mnemosyne is a content-type-aware, multimodal personal knowledge management system built on Neo4j knowledge graphs and Qwen3-VL multimodal AI. Named after the Titan goddess of memory, it understands <em>what kind</em> of knowledge it holds and makes it searchable through text, images, and natural language.</p>
</div>
</div>
<!-- SECTION: OVERVIEW -->
<section id="overview" class="mb-5">
<h2 class="h2 mb-4"><i class="bi bi-info-circle"></i> Overview</h2>
<div class="alert alert-primary border-start border-4 border-primary">
<h3>Purpose</h3>
<p><strong>Mnemosyne</strong> is a personal knowledge management system that treats content type as a first-class concept. Unlike generic knowledge bases that treat all documents identically, Mnemosyne understands the difference between a novel, a technical manual, album artwork, and a journal entry — and adjusts its chunking, embedding, search, and LLM prompting accordingly.</p>
</div>
<div class="row g-4 mb-4">
<div class="col-lg-4">
<div class="card h-100">
<div class="card-body">
<h3 class="card-title text-primary"><i class="bi bi-diagram-3"></i> Knowledge Graph</h3>
<ul class="mb-0">
<li>Neo4j stores relationships between content, not just vectors</li>
<li>Author → Book → Character → Theme traversals</li>
<li>Artist → Album → Track → Genre connections</li>
<li>No vector dimension limits (full 4096d Qwen3-VL)</li>
<li>Graph + vector + full-text search in one database</li>
</ul>
</div>
</div>
</div>
<div class="col-lg-4">
<div class="card h-100">
<div class="card-body">
<h3 class="card-title text-primary"><i class="bi bi-eye"></i> Multimodal AI</h3>
<ul class="mb-0">
<li>Qwen3-VL-Embedding: text + images + video in one vector space</li>
<li>Qwen3-VL-Reranker: cross-attention scoring across modalities</li>
<li>Album art, diagrams, screenshots become searchable</li>
<li>Local GPU inference (5090 + 3090) — zero API costs</li>
<li>llama.cpp text fallback via existing Ansible/systemd infra</li>
</ul>
</div>
</div>
</div>
<div class="col-lg-4">
<div class="card h-100">
<div class="card-body">
<h3 class="card-title text-primary"><i class="bi bi-tags"></i> Content-Type Awareness</h3>
<ul class="mb-0">
<li>Library types define chunking, embedding, and prompt behavior</li>
<li>Fiction: narrative-aware chunking, character extraction</li>
<li>Technical: section-aware, code block preservation</li>
<li>Music: lyrics as primary, metadata-heavy (genre, mood)</li>
<li>Each type injects context into the LLM prompt</li>
</ul>
</div>
</div>
</div>
</div>
<div class="alert alert-info border-start border-4 border-info">
<h3>Key Differentiators</h3>
<ul class="mb-0">
<li><strong>Content-type-aware pipeline</strong> — chunking, embedding instructions, re-ranking instructions, and LLM context all adapt per library type</li>
<li><strong>Neo4j knowledge graph</strong> — traversable relationships, not just flat vector similarity</li>
<li><strong>Full multimodal</strong> — Qwen3-VL processes images, diagrams, album art alongside text in a unified vector space</li>
<li><strong>No dimension limits</strong> — Neo4j vector indexes handle 4096d vectors natively (pgvector's HNSW and IVFFlat indexes cap the <code>vector</code> type at 2,000 dimensions)</li>
<li><strong>MCP-first interface</strong> — designed for LLM integration from day one</li>
<li><strong>Proven RAG architecture</strong> — two-stage responder/reviewer pattern inherited from Spelunker</li>
<li><strong>Local GPU inference</strong> — zero ongoing API costs via vLLM + llama.cpp on RTX 5090/3090</li>
</ul>
</div>
<div class="alert alert-secondary border-start border-4 border-secondary">
<h3>Heritage</h3>
<p class="mb-0">Mnemosyne's RAG pipeline architecture is inspired by <strong>Spelunker</strong>, an enterprise RFP response platform built on Django, PostgreSQL/pgvector, and LangChain. The proven patterns — hybrid search, two-stage RAG, citation-based retrieval, async document processing, and SME-approved knowledge bases — are carried forward and enhanced with multimodal capabilities and knowledge graph relationships. Proven patterns from Mnemosyne will be backported to Spelunker.</p>
</div>
</section>
<!-- SECTION: ARCHITECTURE -->
<section id="architecture" class="mb-5">
<h2 class="h2 mb-4"><i class="bi bi-diagram-3"></i> System Architecture</h2>
<div class="card mb-4">
<div class="card-header bg-primary text-white"><h3 class="mb-0"><i class="bi bi-diagram-3"></i> High-Level Architecture</h3></div>
<div class="card-body">
<div class="mermaid">
graph TB
subgraph Clients["Client Layer"]
MCP["MCP Clients<br/>(Claude, Copilot, etc.)"]
UI["Django Web UI"]
API["REST API (DRF)"]
end
subgraph App["Application Layer — Django"]
Core["core/<br/>Users, Auth"]
Library["library/<br/>Libraries, Collections, Items"]
Engine["engine/<br/>Embedding, Search, Reranker, RAG"]
MCPServer["mcp_server/<br/>MCP Tool Interface"]
Importers["importers/<br/>File, Calibre, Web"]
end
subgraph Data["Data Layer"]
Neo4j["Neo4j 5.x<br/>Knowledge Graph + Vectors"]
PG["PostgreSQL<br/>Auth, Config, Analytics"]
S3["S3/MinIO<br/>Content + Chunks"]
RMQ["RabbitMQ<br/>Task Queue"]
end
subgraph GPU["GPU Services"]
vLLM_E["vLLM<br/>Qwen3-VL-Embedding-8B<br/>(Multimodal Embed)"]
vLLM_R["vLLM<br/>Qwen3-VL-Reranker-8B<br/>(Multimodal Rerank)"]
LCPP["llama.cpp<br/>Qwen3-Reranker-0.6B<br/>(Text Fallback)"]
LCPP_C["llama.cpp<br/>Qwen3 Chat<br/>(RAG Responder)"]
end
MCP --> MCPServer
UI --> Core
API --> Library
API --> Engine
MCPServer --> Engine
MCPServer --> Library
Library --> Neo4j
Engine --> Neo4j
Engine --> S3
Core --> PG
Engine --> vLLM_E
Engine --> vLLM_R
Engine --> LCPP
Engine --> LCPP_C
Library --> RMQ
</div>
</div>
</div>
<div class="row g-4 mb-4">
<div class="col-md-6">
<div class="card">
<div class="card-header bg-primary text-white"><h4 class="mb-0"><i class="bi bi-folder"></i> Django Apps</h4></div>
<div class="card-body">
<ul class="list-group list-group-flush">
<li class="list-group-item"><strong>core/</strong> — Users, authentication, profiles, permissions</li>
<li class="list-group-item"><strong>library/</strong> — Libraries, Collections, Items, Chunks, Concepts (Neo4j models)</li>
<li class="list-group-item"><strong>engine/</strong> — Embedding, search, reranker, RAG pipeline services</li>
<li class="list-group-item"><strong>mcp_server/</strong> — MCP tool definitions and server interface</li>
<li class="list-group-item"><strong>importers/</strong> — Content acquisition (file upload, Calibre, web scrape)</li>
<li class="list-group-item"><strong>llm_manager/</strong> — LLM API/model config, usage tracking (from Spelunker)</li>
</ul>
</div>
</div>
</div>
<div class="col-md-6">
<div class="card">
<div class="card-header bg-success text-white"><h4 class="mb-0"><i class="bi bi-stack"></i> Technology Stack</h4></div>
<div class="card-body">
<ul>
<li><strong>Django 5.x</strong>, Python ≥3.12, Django REST Framework</li>
<li><strong>Neo4j 5.x</strong> + django-neomodel — knowledge graph + vector index</li>
<li><strong>PostgreSQL</strong> — Django auth, config, analytics only</li>
<li><strong>S3/MinIO</strong> — all content and chunk storage</li>
<li><strong>Celery + RabbitMQ</strong> — async embedding and graph construction</li>
<li><strong>vLLM ≥0.14</strong> — Qwen3-VL multimodal serving</li>
<li><strong>llama.cpp</strong> — text model serving (existing Ansible infra)</li>
<li><strong>MCP SDK</strong> — Model Context Protocol server</li>
</ul>
</div>
</div>
</div>
</div>
<h3 class="mt-4">Project Structure</h3>
<pre class="bg-light p-3 rounded"><code>mnemosyne/
├── mnemosyne/ # Django settings, URLs, WSGI/ASGI
├── core/ # Users, auth, profiles
├── library/ # Neo4j models (Library, Collection, Item, Chunk, Concept)
├── engine/ # RAG pipeline services
│ ├── embeddings.py # Qwen3-VL embedding client
│ ├── reranker.py # Qwen3-VL reranker client
│ ├── search.py # Hybrid search (vector + graph + full-text)
│ ├── pipeline.py # Two-stage RAG (responder + reviewer)
│ ├── llm_client.py # OpenAI-compatible LLM client
│ └── content_types.py # Library type definitions
├── mcp_server/ # MCP tool definitions
├── importers/ # Content import tools
├── llm_manager/ # LLM API/model config (ported from Spelunker)
├── static/
├── templates/
├── docker-compose.yml
├── pyproject.toml
└── manage.py</code></pre>
</section>
<!-- SECTION: DATA MODEL -->
<section id="data-model" class="mb-5">
<h2 class="h2 mb-4"><i class="bi bi-database"></i> Data Model — Neo4j Knowledge Graph</h2>
<div class="alert alert-info border-start border-4 border-info">
<h3>Dual Database Strategy</h3>
<p class="mb-0"><strong>Neo4j</strong> stores all content knowledge: libraries, collections, items, chunks, concepts, and their relationships + vector embeddings. <strong>PostgreSQL</strong> stores only Django operational data: users, auth, LLM configurations, analytics, and Celery results. Content never lives in PostgreSQL.</p>
</div>
<div class="card mb-4">
<div class="card-header bg-primary text-white"><h3 class="mb-0"><i class="bi bi-diagram-2"></i> Graph Schema</h3></div>
<div class="card-body">
<div class="mermaid">
graph LR
L["Library<br/>(fiction, technical,<br/>music, art, journal)"] -->|CONTAINS| Col["Collection<br/>(genre, author,<br/>artist, project)"]
Col -->|CONTAINS| I["Item<br/>(book, manual,<br/>album, film, entry)"]
I -->|HAS_CHUNK| Ch["Chunk<br/>(text + optional image<br/>+ 4096d vector)"]
I -->|REFERENCES| Con["Concept<br/>(person, topic,<br/>technique, theme)"]
I -->|RELATED_TO| I
Con -->|RELATED_TO| Con
Ch -->|MENTIONS| Con
I -->|HAS_IMAGE| Img["Image<br/>(cover, diagram,<br/>artwork, still)"]
Img -->|HAS_EMBEDDING| ImgE["ImageEmbedding<br/>(4096d multimodal<br/>vector)"]
</div>
</div>
</div>
<div class="row g-4 mb-4">
<div class="col-md-6">
<div class="card h-100">
<div class="card-header bg-primary text-white"><h4 class="mb-0">Core Nodes</h4></div>
<div class="card-body">
<table class="table table-sm">
<thead><tr><th>Node</th><th>Key Properties</th><th>Vector?</th></tr></thead>
<tbody>
<tr><td><strong>Library</strong></td><td>name, library_type, chunking_config, embedding_instruction, llm_context_prompt</td><td>No</td></tr>
<tr><td><strong>Collection</strong></td><td>name, description, metadata</td><td>No</td></tr>
<tr><td><strong>Item</strong></td><td>title, item_type, s3_key, content_hash, metadata, created_at</td><td>No</td></tr>
<tr><td><strong>Chunk</strong></td><td>chunk_index, chunk_s3_key, chunk_size, embedding (4096d)</td><td><strong>Yes</strong></td></tr>
<tr><td><strong>Concept</strong></td><td>name, concept_type, embedding (4096d)</td><td><strong>Yes</strong></td></tr>
<tr><td><strong>Image</strong></td><td>s3_key, image_type, description, metadata</td><td>No</td></tr>
<tr><td><strong>ImageEmbedding</strong></td><td>embedding (4096d multimodal)</td><td><strong>Yes</strong></td></tr>
</tbody>
</table>
</div>
</div>
</div>
<div class="col-md-6">
<div class="card h-100">
<div class="card-header bg-success text-white"><h4 class="mb-0">Relationships</h4></div>
<div class="card-body">
<table class="table table-sm">
<thead><tr><th>Relationship</th><th>From → To</th><th>Properties</th></tr></thead>
<tbody>
<tr><td><strong>CONTAINS</strong></td><td>Library → Collection</td><td></td></tr>
<tr><td><strong>CONTAINS</strong></td><td>Collection → Item</td><td>position</td></tr>
<tr><td><strong>HAS_CHUNK</strong></td><td>Item → Chunk</td><td></td></tr>
<tr><td><strong>HAS_IMAGE</strong></td><td>Item → Image</td><td>image_role</td></tr>
<tr><td><strong>HAS_EMBEDDING</strong></td><td>Image → ImageEmbedding</td><td></td></tr>
<tr><td><strong>REFERENCES</strong></td><td>Item → Concept</td><td>relevance</td></tr>
<tr><td><strong>MENTIONS</strong></td><td>Chunk → Concept</td><td></td></tr>
<tr><td><strong>RELATED_TO</strong></td><td>Item → Item</td><td>relationship_type, weight</td></tr>
<tr><td><strong>RELATED_TO</strong></td><td>Concept → Concept</td><td>relationship_type</td></tr>
</tbody>
</table>
</div>
</div>
</div>
</div>
<div class="alert alert-warning border-start border-4 border-warning">
<h4><i class="bi bi-lightning"></i> Neo4j Vector Indexes</h4>
<pre class="bg-light p-3 rounded mb-0"><code>// Chunk text+image embeddings (4096 dimensions, no pgvector limits!)
CREATE VECTOR INDEX chunk_embedding FOR (c:Chunk)
ON (c.embedding) OPTIONS {indexConfig: {
`vector.dimensions`: 4096,
`vector.similarity_function`: 'cosine'
}};
// Concept embeddings for semantic concept search
CREATE VECTOR INDEX concept_embedding FOR (con:Concept)
ON (con.embedding) OPTIONS {indexConfig: {
`vector.dimensions`: 4096,
`vector.similarity_function`: 'cosine'
}};
// Image multimodal embeddings
CREATE VECTOR INDEX image_embedding FOR (ie:ImageEmbedding)
ON (ie.embedding) OPTIONS {indexConfig: {
`vector.dimensions`: 4096,
`vector.similarity_function`: 'cosine'
}};
// Full-text index for keyword/BM25-style search
CREATE FULLTEXT INDEX chunk_fulltext FOR (c:Chunk) ON EACH [c.text_preview];</code></pre>
</div>
</section>
<!-- SECTION: CONTENT TYPES -->
<section id="content-types" class="mb-5">
<h2 class="h2 mb-4"><i class="bi bi-tags"></i> Content Type System</h2>
<div class="alert alert-primary border-start border-4 border-primary">
<h3>The Core Innovation</h3>
<p class="mb-0">Each Library has a <strong>library_type</strong> that defines how content is chunked, what embedding instructions are sent to Qwen3-VL, what re-ranking instructions are used, and what context prompt is injected when the LLM generates answers. This is configured per library in the database — not hardcoded.</p>
</div>
<div class="row g-4 mb-4">
<div class="col-md-4">
<div class="card h-100 border-primary">
<div class="card-header bg-primary text-white"><h5 class="mb-0"><i class="bi bi-book"></i> Fiction</h5></div>
<div class="card-body">
<p><strong>Chunking:</strong> Chapter-aware, preserve dialogue blocks, narrative flow</p>
<p><strong>Embedding Instruction:</strong> <em>"Represent the narrative passage for literary retrieval, capturing themes, characters, and plot elements"</em></p>
<p><strong>Reranker Instruction:</strong> <em>"Score relevance of this fiction excerpt to the query, considering narrative themes and character arcs"</em></p>
<p><strong>LLM Context:</strong> <em>"The following excerpts are from fiction. Interpret as narrative — consider themes, symbolism, character development."</em></p>
<p><strong>Multimodal:</strong> Cover art, illustrations</p>
<p><strong>Graph:</strong> Author → Book → Character → Theme</p>
</div>
</div>
</div>
<div class="col-md-4">
<div class="card h-100 border-success">
<div class="card-header bg-success text-white"><h5 class="mb-0"><i class="bi bi-gear"></i> Technical</h5></div>
<div class="card-body">
<p><strong>Chunking:</strong> Section/heading-aware, preserve code blocks and tables as atomic units</p>
<p><strong>Embedding Instruction:</strong> <em>"Represent the technical documentation for precise procedural retrieval"</em></p>
<p><strong>Reranker Instruction:</strong> <em>"Score relevance of this technical documentation to the query, prioritizing procedural accuracy"</em></p>
<p><strong>LLM Context:</strong> <em>"The following excerpts are from technical documentation. Provide precise, actionable instructions."</em></p>
<p><strong>Multimodal:</strong> Diagrams, screenshots, wiring diagrams</p>
<p><strong>Graph:</strong> Product → Manual → Section → Procedure → Tool</p>
</div>
</div>
</div>
<div class="col-md-4">
<div class="card h-100 border-info">
<div class="card-header bg-info text-white"><h5 class="mb-0"><i class="bi bi-music-note-beamed"></i> Music</h5></div>
<div class="card-body">
<p><strong>Chunking:</strong> Song-level (lyrics as one chunk), verse/chorus segmentation</p>
<p><strong>Embedding Instruction:</strong> <em>"Represent the song lyrics and album context for music discovery and thematic analysis"</em></p>
<p><strong>Reranker Instruction:</strong> <em>"Score relevance considering lyrical themes, musical context, and artist style"</em></p>
<p><strong>LLM Context:</strong> <em>"The following excerpts are song lyrics and music metadata. Interpret in musical and cultural context."</em></p>
<p><strong>Multimodal:</strong> Album artwork, liner note images</p>
<p><strong>Graph:</strong> Artist → Album → Track → Genre; Track → SAMPLES → Track</p>
</div>
</div>
</div>
</div>
<div class="row g-4 mb-4">
<div class="col-md-4">
<div class="card h-100 border-warning">
<div class="card-header bg-warning text-dark"><h5 class="mb-0"><i class="bi bi-film"></i> Film</h5></div>
<div class="card-body">
<p><strong>Chunking:</strong> Scene-level for scripts, paragraph-level for synopses</p>
<p><strong>Embedding Instruction:</strong> <em>"Represent the film content for cinematic retrieval, capturing visual and narrative elements"</em></p>
<p><strong>Multimodal:</strong> Movie stills, posters, screenshots</p>
<p><strong>Graph:</strong> Director → Film → Scene → Actor; Film → BASED_ON → Book</p>
</div>
</div>
</div>
<div class="col-md-4">
<div class="card h-100 border-danger">
<div class="card-header bg-danger text-white"><h5 class="mb-0"><i class="bi bi-palette"></i> Art</h5></div>
<div class="card-body">
<p><strong>Chunking:</strong> Description-level, catalog entry as unit</p>
<p><strong>Embedding Instruction:</strong> <em>"Represent the artwork and its description for visual and stylistic retrieval"</em></p>
<p><strong>Multimodal:</strong> <strong>The artwork itself</strong> — primary content is visual</p>
<p><strong>Graph:</strong> Artist → Piece → Style → Movement; Piece → INSPIRED_BY → Piece</p>
</div>
</div>
</div>
<div class="col-md-4">
<div class="card h-100 border-secondary">
<div class="card-header bg-secondary text-white"><h5 class="mb-0"><i class="bi bi-journal-text"></i> Journals</h5></div>
<div class="card-body">
<p><strong>Chunking:</strong> Entry-level (one entry = one chunk), paragraph split for long entries</p>
<p><strong>Embedding Instruction:</strong> <em>"Represent the personal journal entry for temporal and reflective retrieval"</em></p>
<p><strong>Multimodal:</strong> Photos, sketches attached to entries</p>
<p><strong>Graph:</strong> Date → Entry → Topic; Entry → MENTIONS → Person/Place</p>
</div>
</div>
</div>
</div>
</section>
<!-- SECTION: MULTIMODAL PIPELINE -->
<section id="multimodal-pipeline" class="mb-5">
<h2 class="h2 mb-4"><i class="bi bi-eye-fill"></i> Multimodal Embedding &amp; Re-ranking Pipeline</h2>
<div class="alert alert-primary border-start border-4 border-primary">
<h3>Two-Stage Multimodal Pipeline</h3>
<p><strong>Stage 1 — Embedding (Qwen3-VL-Embedding-8B):</strong> Generates 4096-dimensional vectors from text, images, screenshots, and video in a unified semantic space. Accepts content-type-specific instructions for optimized representations.</p>
<p class="mb-0"><strong>Stage 2 — Re-ranking (Qwen3-VL-Reranker-8B):</strong> Takes (query, document) pairs — where both can be multimodal — and outputs precise relevance scores via cross-attention. Dramatically sharpens retrieval accuracy.</p>
</div>
<div class="card mb-4">
<div class="card-header bg-success text-white"><h3 class="mb-0"><i class="bi bi-flow-chart"></i> Embedding &amp; Ingestion Flow</h3></div>
<div class="card-body">
<div class="mermaid">
flowchart TD
A["New Content<br/>(file upload, import)"] --> B{"Content Type?"}
B -->|"Text (PDF, DOCX, MD)"| C["Parse Text<br/>+ Extract Images"]
B -->|"Image (art, photo)"| D["Image Only"]
B -->|"Mixed (manual + diagrams)"| E["Parse Text<br/>+ Keep Page Images"]
C --> F["Chunk Text<br/>(content-type-aware)"]
D --> G["Image to S3"]
E --> F
E --> G
F --> H["Store Chunks in S3"]
H --> I["Qwen3-VL-Embedding<br/>(text + instruction)"]
G --> J["Qwen3-VL-Embedding<br/>(image + instruction)"]
I --> K["4096d Vector"]
J --> K
K --> L["Store in Neo4j<br/>Chunk/ImageEmbedding Node"]
L --> M["Extract Concepts<br/>(LLM entity extraction)"]
M --> N["Create Concept Nodes<br/>+ REFERENCES/MENTIONS edges"]
</div>
</div>
</div>
<div class="row g-4 mb-4">
<div class="col-md-6">
<div class="card h-100">
<div class="card-header bg-info text-white"><h4 class="mb-0">Qwen3-VL-Embedding-8B</h4></div>
<div class="card-body">
<ul>
<li><strong>Dimensions:</strong> 4096 (full), or MRL truncation to 3072/2048/1536/1024</li>
<li><strong>Input:</strong> Text, images, screenshots, video, or any mix</li>
<li><strong>Instruction-aware:</strong> Content-type instructions improve retrieval quality by roughly 15%</li>
<li><strong>Quantization:</strong> Int8 (~8GB VRAM), Int4 (~4GB VRAM)</li>
<li><strong>Serving:</strong> vLLM with <code>--runner pooling</code></li>
<li><strong>Languages:</strong> 30+ languages supported</li>
</ul>
</div>
</div>
</div>
<div class="col-md-6">
<div class="card h-100">
<div class="card-header bg-warning text-dark"><h4 class="mb-0">Qwen3-VL-Reranker-8B</h4></div>
<div class="card-body">
<ul>
<li><strong>Architecture:</strong> Single-tower cross-attention (deep query↔document interaction)</li>
<li><strong>Input:</strong> (query, document) pairs — both can be multimodal</li>
<li><strong>Output:</strong> Relevance score (sigmoid of yes/no token probabilities)</li>
<li><strong>Instruction-aware:</strong> Custom re-ranking instructions per content type</li>
<li><strong>Serving:</strong> vLLM with <code>--runner pooling</code> + score endpoint</li>
<li><strong>Fallback:</strong> Qwen3-Reranker-0.6B via llama.cpp (text-only)</li>
</ul>
</div>
</div>
</div>
</div>
<div class="alert alert-info border-start border-4 border-info">
<h4><i class="bi bi-image"></i> Why Multimodal Matters</h4>
<p>Traditional RAG systems OCR images and diagrams, producing garbled text. Multimodal embedding understands the <em>visual content</em> directly:</p>
<ul class="mb-0">
<li><strong>Technical diagrams:</strong> Wiring diagrams, network topologies, architecture diagrams — searchable by visual content, not OCR garbage</li>
<li><strong>Album artwork:</strong> "psychedelic album covers from the 70s" finds matching art via visual similarity</li>
<li><strong>Art:</strong> The actual painting/sculpture becomes the searchable content, not just its text description</li>
<li><strong>PDF pages:</strong> Image-only PDF pages with charts and tables are embedded as images, not skipped</li>
</ul>
</div>
</section>
<!-- SECTION: SEARCH PIPELINE -->
<section id="search-pipeline" class="mb-5">
<h2 class="h2 mb-4"><i class="bi bi-search"></i> Search Pipeline — GraphRAG + Vector + Re-rank</h2>
<div class="card mb-4">
<div class="card-header bg-primary text-white"><h3 class="mb-0"><i class="bi bi-flow-chart"></i> Search Flow</h3></div>
<div class="card-body">
<div class="mermaid">
flowchart TD
Q["User Query"] --> E["Embed Query<br/>(Qwen3-VL-Embedding)"]
E --> VS["1. Vector Search<br/>(Neo4j vector index)<br/>Top-K × 3 oversample"]
E --> GT["2. Graph Traversal<br/>(Cypher queries)<br/>Concept + relationship walks"]
Q --> FT["3. Full-Text Search<br/>(Neo4j fulltext index)<br/>Keyword matching"]
VS --> F["Candidate Fusion<br/>+ Deduplication"]
GT --> F
FT --> F
F --> RR["4. Re-Rank<br/>(Qwen3-VL-Reranker)<br/>Cross-attention scoring"]
RR --> TK["Top-K Results"]
TK --> CTX["Inject Content-Type<br/>Context Prompt"]
CTX --> LLM["5. LLM Responder<br/>(Two-stage RAG)"]
LLM --> REV["6. LLM Reviewer<br/>(Quality + citation check)"]
REV --> ANS["Final Answer<br/>with Citations"]
</div>
</div>
</div>
<div class="row g-4 mb-4">
<div class="col-md-4">
<div class="card h-100">
<div class="card-header bg-primary text-white"><h5 class="mb-0">1. Vector Search</h5></div>
<div class="card-body">
<p>Cosine similarity via Neo4j vector index on Chunk and ImageEmbedding nodes.</p>
<pre class="bg-light p-2 rounded"><code>CALL db.index.vector.queryNodes(
  'chunk_embedding', 30,
  $query_vector
) YIELD node, score
WHERE score &gt; $threshold
RETURN node, score</code></pre>
</div>
</div>
</div>
<div class="col-md-4">
<div class="card h-100">
<div class="card-header bg-success text-white"><h5 class="mb-0">2. Graph Traversal</h5></div>
<div class="card-body">
<p>Walk relationships to find contextually related content that vector search alone would miss.</p>
<pre class="bg-light p-2 rounded"><code>MATCH (c:Chunk)-[:HAS_CHUNK]-(i:Item)
-[:REFERENCES]->(con:Concept)
-[:RELATED_TO]-(con2:Concept)
<-[:REFERENCES]-(i2:Item)
-[:HAS_CHUNK]->(c2:Chunk)
RETURN c2, i2</code></pre>
</div>
</div>
</div>
<div class="col-md-4">
<div class="card h-100">
<div class="card-header bg-info text-white"><h5 class="mb-0">3. Full-Text Search</h5></div>
<div class="card-body">
<p>Neo4j native full-text index for keyword matching (BM25-equivalent).</p>
<pre class="bg-light p-2 rounded"><code>CALL db.index.fulltext.queryNodes(
  'chunk_fulltext',
  $query_text
) YIELD node, score
RETURN node, score</code></pre>
</div>
</div>
</div>
</div>
</section>
<!-- SECTION: MCP INTERFACE -->
<section id="mcp-interface" class="mb-5">
<h2 class="h2 mb-4"><i class="bi bi-plug"></i> MCP Server Interface</h2>
<div class="alert alert-primary border-start border-4 border-primary">
<h3>MCP-First Design</h3>
<p class="mb-0">Mnemosyne exposes its capabilities as MCP tools, making the entire knowledge base accessible to Claude, Copilot, and any MCP-compatible LLM client. The MCP server is a primary interface, not an afterthought.</p>
</div>
<div class="row g-4 mb-4">
<div class="col-md-6">
<div class="card h-100">
<div class="card-header bg-primary text-white"><h4 class="mb-0">Search &amp; Retrieval Tools</h4></div>
<div class="card-body">
<table class="table table-sm">
<thead><tr><th>Tool</th><th>Description</th></tr></thead>
<tbody>
<tr><td><code>search_library</code></td><td>Semantic + graph + full-text search with re-ranking. Filters by library, collection, content type.</td></tr>
<tr><td><code>ask_about</code></td><td>Full RAG pipeline — search, re-rank, content-type context injection, LLM response with citations.</td></tr>
<tr><td><code>find_similar</code></td><td>Find items similar to a given item using vector similarity. Optionally search across libraries.</td></tr>
<tr><td><code>search_by_image</code></td><td>Multimodal search — find content matching an uploaded image.</td></tr>
<tr><td><code>explore_connections</code></td><td>Traverse knowledge graph from an item — find related concepts, authors, themes.</td></tr>
</tbody>
</table>
</div>
</div>
</div>
<div class="col-md-6">
<div class="card h-100">
<div class="card-header bg-success text-white"><h4 class="mb-0">Management &amp; Navigation Tools</h4></div>
<div class="card-body">
<table class="table table-sm">
<thead><tr><th>Tool</th><th>Description</th></tr></thead>
<tbody>
<tr><td><code>browse_libraries</code></td><td>List all libraries with their content types and item counts.</td></tr>
<tr><td><code>browse_collections</code></td><td>List collections within a library.</td></tr>
<tr><td><code>get_item</code></td><td>Get detailed info about a specific item, including metadata and graph connections.</td></tr>
<tr><td><code>add_content</code></td><td>Add new content to a library — triggers async embedding + graph construction.</td></tr>
<tr><td><code>get_concepts</code></td><td>List extracted concepts for an item or across a library.</td></tr>
</tbody>
</table>
</div>
</div>
</div>
</div>
</section>
<!-- SECTION: GPU SERVICES -->
<section id="gpu-services" class="mb-5">
<h2 class="h2 mb-4"><i class="bi bi-gpu-card"></i> GPU Services</h2>
<div class="row g-4 mb-4">
<div class="col-md-6">
<div class="card h-100">
<div class="card-header bg-primary text-white"><h4 class="mb-0">RTX 5090 (32GB VRAM)</h4></div>
<div class="card-body">
<table class="table table-sm">
<tbody>
<tr><td><strong>Model</strong></td><td>Qwen3-VL-Reranker-8B</td></tr>
<tr><td><strong>VRAM (bf16)</strong></td><td>~18GB</td></tr>
<tr><td><strong>Serving</strong></td><td>vLLM <code>--runner pooling</code></td></tr>
<tr><td><strong>Port</strong></td><td>:8001</td></tr>
<tr><td><strong>Role</strong></td><td>Multimodal re-ranking</td></tr>
<tr><td><strong>Headroom</strong></td><td>~14GB for chat model</td></tr>
</tbody>
</table>
</div>
</div>
</div>
<div class="col-md-6">
<div class="card h-100">
<div class="card-header bg-success text-white"><h4 class="mb-0">RTX 3090 (24GB VRAM)</h4></div>
<div class="card-body">
<table class="table table-sm">
<tbody>
<tr><td><strong>Model</strong></td><td>Qwen3-VL-Embedding-8B</td></tr>
<tr><td><strong>VRAM (bf16)</strong></td><td>~18GB</td></tr>
<tr><td><strong>Serving</strong></td><td>vLLM <code>--runner pooling</code></td></tr>
<tr><td><strong>Port</strong></td><td>:8002</td></tr>
<tr><td><strong>Role</strong></td><td>Multimodal embedding</td></tr>
<tr><td><strong>Headroom</strong></td><td>~6GB</td></tr>
</tbody>
</table>
</div>
</div>
</div>
</div>
<div class="alert alert-info border-start border-4 border-info">
<h4><i class="bi bi-arrow-repeat"></i> Fallback: llama.cpp (Existing Ansible Infra)</h4>
<p class="mb-0">Text-only Qwen3-Reranker-0.6B GGUF served via <code>llama-server</code> on existing systemd/Ansible infrastructure. Managed by the same playbooks, monitored by the same Grafana dashboards. Used when vLLM services are down or for text-only workloads.</p>
</div>
</section>
<!-- SECTION: DEPLOYMENT -->
<section id="deployment" class="mb-5">
<h2 class="h2 mb-4"><i class="bi bi-box-seam"></i> Deployment</h2>
<div class="row g-4 mb-4">
<div class="col-md-4">
<div class="card h-100">
<div class="card-header bg-primary text-white"><h4 class="mb-0">Core Services</h4></div>
<div class="card-body">
<ul class="mb-0">
<li><strong>web:</strong> Django app (Gunicorn)</li>
<li><strong>postgres:</strong> PostgreSQL (auth/config only)</li>
<li><strong>neo4j:</strong> Neo4j 5.x (knowledge graph + vectors)</li>
<li><strong>rabbitmq:</strong> Celery broker</li>
</ul>
</div>
</div>
</div>
<div class="col-md-4">
<div class="card h-100">
<div class="card-header bg-success text-white"><h4 class="mb-0">Async Processing</h4></div>
<div class="card-body">
<ul class="mb-0">
<li><strong>celery-worker:</strong> Embedding, graph construction</li>
<li><strong>celery-beat:</strong> Scheduled re-sync tasks</li>
</ul>
</div>
</div>
</div>
<div class="col-md-4">
<div class="card h-100">
<div class="card-header bg-info text-white"><h4 class="mb-0">Storage &amp; Proxy</h4></div>
<div class="card-body">
<ul class="mb-0">
<li><strong>minio:</strong> S3-compatible content storage</li>
<li><strong>nginx:</strong> Static/proxy</li>
<li><strong>mcp-server:</strong> MCP interface process</li>
</ul>
</div>
</div>
</div>
</div>
<div class="alert alert-secondary border-start border-4 border-secondary">
<h4>Shared Infrastructure with Spelunker</h4>
<p class="mb-0">Mnemosyne and Spelunker share: GPU model services (llama.cpp + vLLM), MinIO/S3 (separate buckets), Neo4j (separate databases), RabbitMQ (separate vhosts), and Grafana monitoring. Each is its own Docker Compose stack but points to shared infra.</p>
</div>
</section>
<!-- SECTION: BACKPORT -->
<section id="backport" class="mb-5">
<h2 class="h2 mb-4"><i class="bi bi-arrow-left-right"></i> Backport Strategy to Spelunker</h2>
<div class="alert alert-warning border-start border-4 border-warning">
<h3>Build Forward, Backport Back</h3>
<p class="mb-0">Mnemosyne proves the architecture with no legacy constraints. Once validated, proven components flow back to Spelunker to enhance its RFP workflow with multimodal understanding and re-ranking precision.</p>
</div>
<table class="table table-bordered">
<thead class="table-dark"><tr><th>Component</th><th>Mnemosyne (Prove)</th><th>Spelunker (Backport)</th></tr></thead>
<tbody>
<tr><td><strong>RerankerService</strong></td><td>Qwen3-VL multimodal + llama.cpp text</td><td>Drop into <code>rag/services/reranker.py</code></td></tr>
<tr><td><strong>Multimodal Embedding</strong></td><td>Qwen3-VL-Embedding via vLLM</td><td>Add alongside OpenAI embeddings, MRL@1536d for pgvector compat</td></tr>
<tr><td><strong>Diagram Understanding</strong></td><td>Image pages embedded multimodally</td><td>PDF diagrams in RFP docs become searchable</td></tr>
<tr><td><strong>MCP Server</strong></td><td>Primary interface from day one</td><td>Add as secondary interface to Spelunker</td></tr>
<tr><td><strong>Neo4j (optional)</strong></td><td>Primary vector + graph store</td><td>Could replace pgvector, or run alongside</td></tr>
<tr><td><strong>Content-Type Config</strong></td><td>Library type definitions</td><td>Adapt as document classification in Spelunker</td></tr>
</tbody>
</table>
</section>
<div class="alert alert-success border-start border-4 border-success mt-5">
<h3><i class="bi bi-check-circle"></i> Documentation Complete</h3>
<p class="mb-0">This document describes the target architecture for Mnemosyne. Phase implementation documents provide detailed build plans.</p>
</div>
</div>
<script src="https://cdn.jsdelivr.net/npm/bootstrap@5.3.0/dist/js/bootstrap.bundle.min.js"></script>
</body>
</html>


@@ -0,0 +1,351 @@
# Mnemosyne Integration — Daedalus & Pallas Reference
This document summarises the Mnemosyne-specific implementation required for integration with the Daedalus & Pallas architecture. The full specification lives in [`daedalus/docs/mnemosyne_integration.md`](../../daedalus/docs/mnemosyne_integration.md).
---
## Overview
Mnemosyne exposes two interfaces for the wider Ouranos ecosystem:
1. **MCP Server** (port 22091) — consumed by Pallas agents for synchronous search, browse, and retrieval operations
2. **REST Ingest API** — consumed by the Daedalus backend for asynchronous file ingestion and embedding job lifecycle management
---
## 1. MCP Server (Phase 5)
### Port & URL
| Service | Port | URL |
|---------|------|-----|
| Mnemosyne MCP | 22091 | `http://puck.incus:22091/mcp` |
| Health check | 22091 | `http://puck.incus:22091/mcp/health` |
### Project Structure
Following the [Django MCP Pattern](Pattern_Django-MCP_V1-00.md):
```
mnemosyne/mnemosyne/mcp_server/
├── __init__.py
├── server.py          # FastMCP instance + tool registration
├── asgi.py            # Starlette ASGI mount at /mcp
├── middleware.py      # MCPAuthMiddleware (disabled for internal use)
├── context.py         # get_mcp_user(), get_mcp_token()
└── tools/
    ├── __init__.py
    ├── search.py      # register_search_tools(mcp) → search_knowledge, search_by_category
    ├── browse.py      # register_browse_tools(mcp) → list_libraries, list_collections, get_item, get_concepts
    └── health.py      # register_health_tools(mcp) → get_health
```
### Tools to Implement
| Tool | Module | Description |
|------|--------|-------------|
| `search_knowledge` | `search.py` | Hybrid vector + full-text + graph search → re-rank → return chunks with citations |
| `search_by_category` | `search.py` | Same as above, scoped to a specific `library_type` |
| `list_libraries` | `browse.py` | List all libraries with type, description, counts |
| `list_collections` | `browse.py` | List collections within a library |
| `get_item` | `browse.py` | Retrieve item detail with chunk previews and concept links |
| `get_concepts` | `browse.py` | Traverse concept graph from a starting concept or item |
| `get_health` | `health.py` | Check Neo4j, S3, embedding model reachability |
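
The table says `search_knowledge` combines vector, full-text, and graph results before re-ranking, but it does not say how the candidate lists are merged. One common option is reciprocal rank fusion (RRF). The sketch below assumes each search returns an ordered list of chunk UIDs; the function name `fuse_candidates`, the constant `k=60`, and the sample UIDs are illustrative and not part of Mnemosyne.

```python
from collections import defaultdict


def fuse_candidates(ranked_lists, k=60, limit=None):
    """Merge ranked lists of chunk UIDs with reciprocal rank fusion.

    Each UID scores sum(1 / (k + rank)) over the lists it appears in, so a
    chunk found by several searches collapses into one candidate with a
    combined score.
    """
    scores = defaultdict(float)
    for ranked in ranked_lists:
        for rank, uid in enumerate(ranked, start=1):
            scores[uid] += 1.0 / (k + rank)
    # Highest score first; ties fall back to the UID for a stable order.
    fused = sorted(scores, key=lambda uid: (-scores[uid], uid))
    return fused[:limit] if limit is not None else fused


# Hypothetical results from the three searches.
vector_hits = ["chunk_a", "chunk_b", "chunk_c"]
fulltext_hits = ["chunk_b", "chunk_d"]
graph_hits = ["chunk_c", "chunk_b"]

candidates = fuse_candidates([vector_hits, fulltext_hits, graph_hits])
```

RRF only needs ranks, which sidesteps the fact that cosine similarity and full-text scores are not on the same scale. The fused list would then be passed to the re-ranker.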
### MCP Resources
| Resource URI | Source |
|---|---|
| `mnemosyne://library-types` | `library/content_types.py` → `LIBRARY_TYPE_DEFAULTS` |
| `mnemosyne://libraries` | `Library.nodes.order_by("name")` serialized to JSON |
### Deployment
The MCP server runs as a separate Uvicorn process alongside Django's Gunicorn:
```bash
# Django WSGI (existing)
gunicorn --bind :22090 --workers 3 mnemosyne.wsgi
# MCP ASGI (new)
uvicorn mcp_server.asgi:app --host 0.0.0.0 --port 22091 --workers 1
```
Auth is disabled (`MCP_REQUIRE_AUTH=False`) since all traffic is internal (10.10.0.0/24).
### ⚠️ DEBUG LOG Points — MCP Server
| Location | Log Event | Level | What to Log |
|----------|-----------|-------|-------------|
| Tool dispatch | `mcp_tool_called` | DEBUG | Tool name, all input parameters |
| Vector search | `mcp_search_vector_query` | DEBUG | Query text, embedding dims, library filter, limit |
| Vector search result | `mcp_search_vector_results` | DEBUG | Candidate count, top/lowest scores |
| Full-text search | `mcp_search_fulltext_query` | DEBUG | Query terms, index used |
| Re-ranking | `mcp_search_rerank` | DEBUG | Candidates in/out, reranker model, duration_ms |
| Graph traversal | `mcp_graph_traverse` | DEBUG | Starting node UID, relationships, depth, nodes visited |
| Neo4j query | `mcp_neo4j_query` | DEBUG | Cypher query (parameterized), execution time_ms |
| Tool response | `mcp_tool_response` | DEBUG | Tool name, result size (bytes/items), duration_ms |
| Health check | `mcp_health_check` | DEBUG | Each dependency status, overall result |
**Important:** All neomodel ORM calls inside async tool functions **must** be wrapped with `sync_to_async(thread_sensitive=True)`.
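
The `mcp_tool_called` and `mcp_tool_response` log points from the table could be emitted by a small decorator around each tool. This is a minimal stdlib-only sketch: the logger name `mnemosyne.mcp`, the JSON payload shape, and the placeholder `list_libraries` body are assumptions, and a real tool would query Neo4j through `sync_to_async` as noted above.

```python
import asyncio
import functools
import json
import logging
import time

logger = logging.getLogger("mnemosyne.mcp")  # assumed logger name


def log_tool_call(func):
    """Emit mcp_tool_called / mcp_tool_response DEBUG events around a tool."""

    @functools.wraps(func)
    async def wrapper(**params):
        logger.debug(json.dumps(
            {"event": "mcp_tool_called", "tool": func.__name__, "params": params},
            default=str,
        ))
        started = time.perf_counter()
        result = await func(**params)
        duration_ms = round((time.perf_counter() - started) * 1000, 2)
        # len(result) assumes the tool returns a sized collection.
        logger.debug(json.dumps(
            {"event": "mcp_tool_response", "tool": func.__name__,
             "items": len(result), "duration_ms": duration_ms},
        ))
        return result

    return wrapper


@log_tool_call
async def list_libraries(limit=10):
    # Placeholder body standing in for the real Neo4j query.
    return [{"name": "Technical Manuals"}][:limit]


libraries = asyncio.run(list_libraries(limit=5))
```

Emitting one JSON object per event keeps the fields queryable once the logs reach Loki.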
---
## 2. REST Ingest API
### New Endpoints
| Method | Route | Purpose |
|--------|-------|---------|
| `POST` | `/api/v1/library/ingest` | Accept a file for ingestion + embedding |
| `GET` | `/api/v1/library/jobs/{job_id}` | Poll job status |
| `POST` | `/api/v1/library/jobs/{job_id}/retry` | Retry a failed job |
| `GET` | `/api/v1/library/jobs` | List recent jobs (optional `?status=` filter) |
These endpoints are consumed only by the **Daedalus FastAPI backend**, never directly by the frontend.
### New Model: `IngestJob`
Add to `library/` app (Django ORM on PostgreSQL, not Neo4j):
```python
from django.db import models


class IngestJob(models.Model):
    """Tracks the lifecycle of a content ingestion + embedding job."""

    id = models.CharField(max_length=64, primary_key=True)
    item_uid = models.CharField(max_length=64, db_index=True)
    celery_task_id = models.CharField(max_length=255, blank=True)
    status = models.CharField(
        max_length=20,
        choices=[
            ("pending", "Pending"),
            ("processing", "Processing"),
            ("completed", "Completed"),
            ("failed", "Failed"),
        ],
        default="pending",
        db_index=True,
    )
    # Fine-grained stage within the current status (e.g. "queued", "embedding").
    progress = models.CharField(max_length=50, default="queued")
    error = models.TextField(blank=True, null=True)
    retry_count = models.PositiveIntegerField(default=0)
    chunks_created = models.PositiveIntegerField(default=0)
    concepts_extracted = models.PositiveIntegerField(default=0)
    embedding_model = models.CharField(max_length=100, blank=True)
    source = models.CharField(max_length=50, default="")
    source_ref = models.CharField(max_length=200, blank=True)
    s3_key = models.CharField(max_length=500)
    created_at = models.DateTimeField(auto_now_add=True)
    started_at = models.DateTimeField(null=True, blank=True)
    completed_at = models.DateTimeField(null=True, blank=True)

    class Meta:
        ordering = ["-created_at"]
        indexes = [
            models.Index(fields=["status", "-created_at"]),
            models.Index(fields=["source", "source_ref"]),
        ]
```
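
The model defines four status values but not which moves between them are legal. A small transition table makes that explicit and gives the retry endpoint something to check against. The table below is an assumption inferred from the endpoint descriptions (retry applies to failed jobs, dispatch can fail before processing starts), not a rule stated in the specification.

```python
# Assumed lifecycle: which status may follow which.
ALLOWED_TRANSITIONS = {
    "pending": {"processing", "failed"},   # "failed" covers a dispatch failure
    "processing": {"completed", "failed"},
    "failed": {"pending"},                 # only via the retry endpoint
    "completed": set(),                    # terminal
}


def can_transition(current, new):
    """Return True if a job may move from `current` to `new`."""
    return new in ALLOWED_TRANSITIONS.get(current, set())


retry_allowed = can_transition("failed", "pending")
reopen_allowed = can_transition("completed", "processing")
```

With a guard like this, `POST /api/v1/library/jobs/{job_id}/retry` can reject jobs that are not in the `failed` state.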
### Ingest Request Schema
```json
{
"s3_key": "workspaces/ws_abc/files/f_def/report.pdf",
"title": "Q4 Technical Report",
"library_uid": "lib_technical_001",
"collection_uid": "col_reports_2026",
"file_type": "application/pdf",
"file_size": 245000,
"source": "daedalus",
"source_ref": "ws_abc/f_def"
}
```
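
Before creating an `IngestJob`, the endpoint needs to reject malformed payloads. In the Django project this would normally be a DRF serializer; the stdlib-only sketch below shows the same checks in plain Python. Which fields are required is an assumption, while the length limits mirror the `IngestJob` columns (`s3_key` 500, `source` 50, `source_ref` 200).

```python
REQUIRED_FIELDS = ("s3_key", "title", "library_uid", "file_type", "source")  # assumed
MAX_LENGTHS = {"s3_key": 500, "source": 50, "source_ref": 200}  # from IngestJob


def validate_ingest_request(payload):
    """Return a dict of field -> error message; an empty dict means valid."""
    errors = {}
    for field in REQUIRED_FIELDS:
        value = payload.get(field)
        if not isinstance(value, str) or not value.strip():
            errors[field] = "required non-empty string"
    for field, limit in MAX_LENGTHS.items():
        value = payload.get(field)
        if isinstance(value, str) and len(value) > limit:
            errors[field] = f"longer than {limit} characters"
    size = payload.get("file_size")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if size is not None and (isinstance(size, bool) or not isinstance(size, int) or size < 0):
        errors["file_size"] = "must be a non-negative integer"
    return errors


sample = {
    "s3_key": "workspaces/ws_abc/files/f_def/report.pdf",
    "title": "Q4 Technical Report",
    "library_uid": "lib_technical_001",
    "collection_uid": "col_reports_2026",
    "file_type": "application/pdf",
    "file_size": 245000,
    "source": "daedalus",
    "source_ref": "ws_abc/f_def",
}
sample_errors = validate_ingest_request(sample)
missing_title_errors = validate_ingest_request({**sample, "title": ""})
```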
### Job Status Response Schema
```json
{
"job_id": "job_789xyz",
"item_uid": "item_abc123",
"status": "processing",
"progress": "embedding",
"chunks_created": 0,
"concepts_extracted": 0,
"embedding_model": "qwen3-vl-embedding-8b",
"started_at": "2026-03-12T15:42:01Z",
"completed_at": null,
"error": null
}
```
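
A caller such as the Daedalus backend would poll this response until `status` reaches a terminal value. The following sketch shows one way to structure that loop; `wait_for_job`, its parameters, and the injected `fetch_status` callable are hypothetical, and the HTTP call itself is deliberately left out so the logic can be exercised without a server.

```python
import time

TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_job(fetch_status, job_id, interval=2.0, timeout=600.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_status(job_id) until the job completes, fails, or times out.

    fetch_status is any callable returning the job status dict, for example a
    thin wrapper around GET /api/v1/library/jobs/{job_id}.
    """
    deadline = clock() + timeout
    while True:
        job = fetch_status(job_id)
        if job["status"] in TERMINAL_STATUSES:
            return job
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} still {job['status']} after {timeout}s")
        sleep(interval)


# Fake status source standing in for the REST call.
_responses = iter([
    {"job_id": "job_789xyz", "status": "pending"},
    {"job_id": "job_789xyz", "status": "processing"},
    {"job_id": "job_789xyz", "status": "completed"},
])
final = wait_for_job(lambda job_id: next(_responses), "job_789xyz",
                     sleep=lambda seconds: None)
```

Injecting `sleep` and `clock` keeps the loop testable and lets the caller swap in a different wait strategy later.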
### ⚠️ DEBUG LOG Points — Ingest Endpoint
| Location | Log Event | Level | What to Log |
|----------|-----------|-------|-------------|
| Request received | `ingest_request_received` | INFO | s3_key, title, library_uid, file_type, source, source_ref |
| S3 key validation | `ingest_s3_key_check` | DEBUG | s3_key, exists (bool), bucket name |
| Library lookup | `ingest_library_lookup` | DEBUG | library_uid, found (bool), library_type |
| Item node creation | `ingest_item_created` | INFO | item_uid, title, library_uid, collection_uid |
| Celery task dispatch | `ingest_task_dispatched` | INFO | job_id, item_uid, celery_task_id, queue name |
| Celery task dispatch failure | `ingest_task_dispatch_failed` | ERROR | job_id, item_uid, exception details |
---
## 3. Celery Embedding Pipeline
### New Task: `embed_item`
```python
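from celery import shared_task

# S3ConnectionError and EmbeddingModelError are project-defined exceptions;
# their import path is not shown here.


@shared_task(
    name="library.embed_item",
    bind=True,
    max_retries=3,
    default_retry_delay=60,
    autoretry_for=(S3ConnectionError, EmbeddingModelError),
    retry_backoff=True,
    retry_backoff_max=600,
    acks_late=True,
    queue="embedding",
)
def embed_item(self, job_id, item_uid):
    ...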
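
It helps to work out what delays these retry settings produce. Based on Celery's documented behaviour, `retry_backoff=True` uses a factor of 1 second that doubles on each retry and is capped at `retry_backoff_max`, and jitter (on by default) then picks a random delay up to that bound. The helper below is an illustrative calculation, not Celery code, and the behaviour is worth confirming against the installed Celery version.

```python
def backoff_schedule(max_retries=3, factor=1, maximum=600):
    """Upper bound in seconds on the delay before each automatic retry."""
    return [min(maximum, factor * (2 ** attempt)) for attempt in range(max_retries)]


# Settings as written in the task decorator: retry_backoff=True -> factor 1.
schedule = backoff_schedule(max_retries=3, factor=1, maximum=600)

# For comparison: retry_backoff=60 would start at one minute.
minute_schedule = backoff_schedule(max_retries=3, factor=60, maximum=600)
```

If this reading is right, the three automatic retries happen within a few seconds of each other and `default_retry_delay=60` does not influence them. Setting `retry_backoff=60` may be closer to the intent if minute-scale spacing is wanted.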
### Task Flow
1. Update job → `processing` / `fetching`
2. Fetch file from Daedalus S3 bucket (cross-bucket read)
3. Copy to Mnemosyne's own S3 bucket
4. Load library type → chunking config
5. Chunk content per strategy
6. Store chunk text in S3
7. Generate embeddings (Arke/vLLM batch call)
8. Write Chunk nodes + vectors to Neo4j
9. Extract concepts (LLM call)
10. Build graph relationships
11. Update job → `completed`
On failure at any step: update job → `failed` with error message.
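
The step list and the failure rule above can be sketched as a small runner that records the current stage on the job and stops at the first error. Here a plain dict stands in for the `IngestJob` row, and the step functions and progress labels other than `fetching` and `embedding` are hypothetical.

```python
def run_pipeline(job, steps):
    """Run (progress_label, callable) steps in order, tracking state on the job."""
    job["status"] = "processing"
    for label, step in steps:
        job["progress"] = label
        try:
            step(job)
        except Exception as exc:  # sketch only: a real task would let retryable errors propagate
            job["status"] = "failed"
            job["error"] = f"{label}: {exc}"
            return job
    job["status"] = "completed"
    return job


def fetch(job):
    job["fetched"] = True


def chunk(job):
    job["chunks_created"] = 3


def embed(job):
    raise RuntimeError("embedding service unreachable")


ok_job = run_pipeline({"id": "job_1"}, [("fetching", fetch), ("chunking", chunk)])
bad_job = run_pipeline({"id": "job_2"}, [("fetching", fetch), ("embedding", embed)])
```

Because `progress` is updated before each step runs, a failed job shows the stage it stopped at. That matches the `error` and `progress` fields in the job status response.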
### ⚠️ DEBUG LOG Points — Celery Worker (Critical)
These are the most important log points in the entire integration. Without them, debugging async embedding failures is nearly impossible.
| Location | Log Event | Level | What to Log |
|----------|-----------|-------|-------------|
| Task pickup | `embed_task_started` | INFO | job_id, item_uid, worker hostname, retry count |
| S3 fetch start | `embed_s3_fetch_start` | DEBUG | s3_key, source bucket |
| S3 fetch complete | `embed_s3_fetch_complete` | DEBUG | s3_key, file_size, duration_ms |
| S3 fetch failed | `embed_s3_fetch_failed` | ERROR | s3_key, error, retry_count |
| S3 cross-bucket copy start | `s3_cross_bucket_copy_start` | DEBUG | source_bucket, source_key, dest_bucket, dest_key |
| S3 cross-bucket copy complete | `s3_cross_bucket_copy_complete` | DEBUG | source_key, dest_key, file_size, duration_ms |
| S3 cross-bucket copy failed | `s3_cross_bucket_copy_failed` | ERROR | source_bucket, source_key, error |
| Chunking start | `embed_chunking_start` | DEBUG | library_type, strategy, chunk_size, chunk_overlap |
| Chunking complete | `embed_chunking_complete` | INFO | chunks_created, avg_chunk_size |
| Chunking failed | `embed_chunking_failed` | ERROR | file_type, error |
| Embedding start | `embed_vectors_start` | DEBUG | model_name, dimensions, batch_size, total_chunks |
| Embedding complete | `embed_vectors_complete` | INFO | model_name, duration_ms, tokens_processed |
| Embedding failed | `embed_vectors_failed` | ERROR | model_name, chunk_index, error |
| Neo4j write start | `embed_neo4j_write_start` | DEBUG | chunks_to_write count |
| Neo4j write complete | `embed_neo4j_write_complete` | INFO | chunks_written, duration_ms |
| Neo4j write failed | `embed_neo4j_write_failed` | ERROR | chunk_index, neo4j_error |
| Concept extraction start | `embed_concepts_start` | DEBUG | model_name |
| Concept extraction complete | `embed_concepts_complete` | INFO | concepts_extracted, concept_names, duration_ms |
| Graph build start | `embed_graph_build_start` | DEBUG | — |
| Graph build complete | `embed_graph_build_complete` | INFO | relationships_created, duration_ms |
| Job completed | `embed_job_completed` | INFO | job_id, item_uid, total_duration_ms, chunks, concepts |
| Job failed | `embed_job_failed` | ERROR | job_id, item_uid, exception_type, error, full traceback |
---
## 4. S3 Bucket Strategy
Mnemosyne uses its own bucket (`mnemosyne-content`, Terraform-provisioned per Phase 1). On ingest, the Celery worker copies the file from the Daedalus bucket to Mnemosyne's bucket.
```
mnemosyne-content bucket
├── items/
│   └── {item_uid}/
│       ├── original/{filename}    ← copied from Daedalus bucket
│       └── chunks/
│           ├── chunk_000.txt
│           └── chunk_001.txt
└── images/
    └── {image_uid}/{filename}
```
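
Centralising the key layout in a few helpers keeps the worker and any later readers in agreement about where objects live. The function names below are hypothetical. The three-digit chunk index follows the `chunk_000.txt` naming in the tree, and the original filename is taken as the last path segment of the Daedalus key, which is an assumption.

```python
import posixpath


def original_key(item_uid, source_key):
    """Destination key for the file copied from the Daedalus bucket."""
    filename = posixpath.basename(source_key)
    return f"items/{item_uid}/original/{filename}"


def chunk_key(item_uid, index):
    """Key for the text of the chunk at a zero-based index."""
    return f"items/{item_uid}/chunks/chunk_{index:03d}.txt"


def image_key(image_uid, filename):
    """Key for an extracted or uploaded image."""
    return f"images/{image_uid}/{filename}"


copied_to = original_key("item_abc123", "workspaces/ws_abc/files/f_def/report.pdf")
second_chunk = chunk_key("item_abc123", 1)
```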
### Configuration
```bash
# .env additions
# Mnemosyne's own bucket (existing)
AWS_STORAGE_BUCKET_NAME=mnemosyne-content
# Cross-bucket read access to Daedalus bucket
DAEDALUS_S3_BUCKET_NAME=daedalus
DAEDALUS_S3_ENDPOINT_URL=http://incus-s3.incus:9000
DAEDALUS_S3_ACCESS_KEY_ID=${VAULT_DAEDALUS_S3_READ_KEY}
DAEDALUS_S3_SECRET_ACCESS_KEY=${VAULT_DAEDALUS_S3_READ_SECRET}
# MCP server
MCP_SERVER_PORT=22091
MCP_REQUIRE_AUTH=False
```
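
One detail to watch when reading these values in `settings.py`: environment variables are strings, so `MCP_REQUIRE_AUTH=False` is truthy unless it is parsed explicitly. The sketch below shows one way to handle that. The helper names are hypothetical, and defaulting to auth required when the variable is unset is a design choice of this example, not something the specification states.

```python
import os

_TRUE_VALUES = {"1", "true", "yes", "on"}


def env_bool(name, default, environ=os.environ):
    """Parse a boolean environment variable; anything not in _TRUE_VALUES is False."""
    raw = environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in _TRUE_VALUES


def load_mcp_settings(environ=os.environ):
    """Collect the MCP server settings from the environment."""
    return {
        "port": int(environ.get("MCP_SERVER_PORT", "22091")),
        # Fail closed: require auth unless it is explicitly switched off.
        "require_auth": env_bool("MCP_REQUIRE_AUTH", default=True, environ=environ),
    }


settings = load_mcp_settings({"MCP_SERVER_PORT": "22091", "MCP_REQUIRE_AUTH": "False"})
```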
---
## 5. Prometheus Metrics
```
# MCP tool calls
mnemosyne_mcp_tool_invocations_total{tool,status} counter
mnemosyne_mcp_tool_duration_seconds{tool} histogram
# Ingest pipeline
mnemosyne_ingest_jobs_total{status} counter
mnemosyne_ingest_duration_seconds{library_type} histogram
mnemosyne_chunks_created_total{library_type} counter
mnemosyne_concepts_extracted_total counter
mnemosyne_embeddings_generated_total{model} counter
mnemosyne_embedding_duration_seconds{model} histogram
# Search performance
mnemosyne_search_duration_seconds{search_type} histogram
mnemosyne_search_results_total{search_type} counter
mnemosyne_rerank_duration_seconds{model} histogram
# Infrastructure
mnemosyne_neo4j_query_duration_seconds{query_type} histogram
mnemosyne_s3_operations_total{operation,status} counter
```
---
## 6. Implementation Phases (Mnemosyne-specific)
### Phase 1 — REST Ingest API
- [ ] Create `IngestJob` model + Django migration
- [ ] Implement `POST /api/v1/library/ingest` endpoint
- [ ] Implement `GET /api/v1/library/jobs/{job_id}` endpoint
- [ ] Implement `POST /api/v1/library/jobs/{job_id}/retry` endpoint
- [ ] Implement `GET /api/v1/library/jobs` list endpoint
- [ ] Implement `embed_item` Celery task with full debug logging
- [ ] Add S3 cross-bucket copy logic
- [ ] Add ingest API serializers and URL routing
### Phase 2 — MCP Server (Phase 5 of Mnemosyne roadmap)
- [ ] Create `mcp_server/` module following Django MCP Pattern
- [ ] Implement `search_knowledge` tool (hybrid search + re-rank)
- [ ] Implement `search_by_category` tool
- [ ] Implement `list_libraries`, `list_collections`, `get_item`, `get_concepts` tools
- [ ] Implement `get_health` tool per Pallas health spec
- [ ] Register MCP resources (`mnemosyne://library-types`, `mnemosyne://libraries`)
- [ ] ASGI mount + Uvicorn deployment on port 22091
- [ ] Systemd service for MCP Uvicorn process
- [ ] Add Prometheus metrics

docs/ouranos.md Normal file

@@ -0,0 +1,333 @@
# Ouranos Lab
Infrastructure-as-Code project managing the **Ouranos Lab** — a development sandbox at [ouranos.helu.ca](https://ouranos.helu.ca). Uses **Terraform** for container provisioning and **Ansible** for configuration management, themed around the moons of Uranus.
---
## Project Overview
| Component | Purpose |
|-----------|---------|
| **Terraform** | Provisions 10 specialised Incus containers (LXC) with DNS-resolved networking, security policies, and resource dependencies |
| **Ansible** | Deploys Docker, databases (PostgreSQL, Neo4j), observability stack (Prometheus, Grafana, Loki), and application runtimes across all hosts |
> **DNS Domain**: Incus resolves containers via the `.incus` domain suffix (e.g., `oberon.incus`, `portia.incus`). IPv4 addresses are dynamically assigned — always use DNS names, never hardcode IPs.
---
## Uranian Host Architecture
All containers are named after moons of Uranus and resolved via the `.incus` DNS suffix.
| Name | Role | Description | Nesting |
|------|------|-------------|---------|
| **ariel** | graph_database | Neo4j — Ethereal graph connections | ✔ |
| **caliban** | agent_automation | Agent S MCP Server with MATE Desktop | ✔ |
| **miranda** | mcp_docker_host | Dedicated Docker Host for MCP Servers | ✔ |
| **oberon** | container_orchestration | Docker Host — MCP Switchboard, RabbitMQ, Open WebUI | ✔ |
| **portia** | database | PostgreSQL — Relational database host | ❌ |
| **prospero** | observability | PPLG stack — Prometheus, Grafana, Loki, PgAdmin | ❌ |
| **puck** | application_runtime | Python App Host — JupyterLab, Django apps, Gitea Runner | ✔ |
| **rosalind** | collaboration | Gitea, LobeChat, Nextcloud, AnythingLLM | ✔ |
| **sycorax** | language_models | Arke LLM Proxy | ✔ |
| **titania** | proxy_sso | HAProxy TLS termination + Casdoor SSO | ✔ |
### oberon — Container Orchestration
King of the Fairies orchestrating containers and managing MCP infrastructure.
- Docker engine
- MCP Switchboard (port 22785) — Django app routing MCP tool calls
- RabbitMQ message queue
- Open WebUI LLM interface (port 22088, PostgreSQL backend on Portia)
- SearXNG privacy search (port 22083, behind OAuth2-Proxy)
- smtp4dev SMTP test server (port 22025)
### portia — Relational Database
Intelligent and resourceful — the reliability of relational databases.
- PostgreSQL 17 (port 5432)
- Databases: `arke`, `anythingllm`, `gitea`, `hass`, `lobechat`, `mcp_switchboard`, `nextcloud`, `openwebui`, `periplus`, `spelunker`
### ariel — Graph Database
Air spirit — ethereal, interconnected nature mirroring graph relationships.
- Neo4j 5.26.0 (Docker)
- HTTP API: port 25584
- Bolt: port 25554
### puck — Application Runtime
Shape-shifting trickster embodying Python's versatility.
- Docker engine
- JupyterLab (port 22071 via OAuth2-Proxy)
- Gitea Runner (CI/CD agent)
- Home Assistant (port 8123)
- Django applications: Angelia (22281), Athena (22481), Kairos (22581), Icarlos (22681), Spelunker (22881), Peitho (22981)
### prospero — Observability Stack
Master magician observing all events.
- PPLG stack via Docker Compose: Prometheus, Loki, Grafana, PgAdmin
- Internal HAProxy with OAuth2-Proxy for all dashboards
- AlertManager with Pushover notifications
- Prometheus metrics collection (`node-exporter`, HAProxy, Loki)
- Loki log aggregation via Alloy (all hosts)
- Grafana dashboard suite with Casdoor SSO integration
### miranda — MCP Docker Host
Curious bridge between worlds — hosting MCP server containers.
- Docker engine (API exposed on port 2375 for MCP Switchboard)
- MCPO OpenAI-compatible MCP proxy
- Grafana MCP Server (port 25533)
- Gitea MCP Server (port 25535)
- Neo4j MCP Server
- Argos MCP Server — web search via SearXNG (port 25534)
### sycorax — Language Models
Original magical power wielding language magic.
- Arke LLM API Proxy (port 25540)
- Multi-provider support (OpenAI, Anthropic, etc.)
- Session management with Memcached
- Database backend on Portia
### caliban — Agent Automation
Autonomous computer agent learning through environmental interaction.
- Docker engine
- Agent S MCP Server (MATE desktop, AT-SPI automation)
- Kernos MCP Shell Server (port 22021)
- GPU passthrough for vision tasks
- RDP access (port 25521)
### rosalind — Collaboration Services
Witty and resourceful moon for PHP, Go, and Node.js runtimes.
- Gitea self-hosted Git (port 22082, SSH on 22022)
- LobeChat AI chat interface (port 22081)
- Nextcloud file sharing and collaboration (port 22083)
- AnythingLLM document AI workspace (port 22084)
- Nextcloud data on dedicated Incus storage volume
### titania — Proxy & SSO Services
Queen of the Fairies managing access control and authentication.
- HAProxy 3.x with TLS termination (port 443)
- Let's Encrypt wildcard certificate via certbot DNS-01 (Namecheap)
- HTTP to HTTPS redirect (port 80)
- Gitea SSH proxy (port 22022)
- Casdoor SSO (port 22081, local PostgreSQL)
- Prometheus metrics at `:8404/metrics`
---
## External Access via HAProxy
Titania provides TLS termination and reverse proxy for all services.
- **Base domain**: `ouranos.helu.ca`
- **HTTPS**: port 443 (standard)
- **HTTP**: port 80 (redirects to HTTPS)
- **Certificate**: Let's Encrypt wildcard via certbot DNS-01
### Route Table
| Subdomain | Backend | Service |
|-----------|---------|---------|
| `ouranos.helu.ca` (root) | puck.incus:22281 | Angelia (Django) |
| `alertmanager.ouranos.helu.ca` | prospero.incus:443 (SSL) | AlertManager |
| `angelia.ouranos.helu.ca` | puck.incus:22281 | Angelia (Django) |
| `anythingllm.ouranos.helu.ca` | rosalind.incus:22084 | AnythingLLM |
| `arke.ouranos.helu.ca` | sycorax.incus:25540 | Arke LLM Proxy |
| `athena.ouranos.helu.ca` | puck.incus:22481 | Athena (Django) |
| `gitea.ouranos.helu.ca` | rosalind.incus:22082 | Gitea |
| `grafana.ouranos.helu.ca` | prospero.incus:443 (SSL) | Grafana |
| `hass.ouranos.helu.ca` | oberon.incus:8123 | Home Assistant |
| `id.ouranos.helu.ca` | titania.incus:22081 | Casdoor SSO |
| `icarlos.ouranos.helu.ca` | puck.incus:22681 | Icarlos (Django) |
| `jupyterlab.ouranos.helu.ca` | puck.incus:22071 | JupyterLab (OAuth2-Proxy) |
| `kairos.ouranos.helu.ca` | puck.incus:22581 | Kairos (Django) |
| `lobechat.ouranos.helu.ca` | rosalind.incus:22081 | LobeChat |
| `loki.ouranos.helu.ca` | prospero.incus:443 (SSL) | Loki |
| `mcp-switchboard.ouranos.helu.ca` | oberon.incus:22785 | MCP Switchboard |
| `nextcloud.ouranos.helu.ca` | rosalind.incus:22083 | Nextcloud |
| `openwebui.ouranos.helu.ca` | oberon.incus:22088 | Open WebUI |
| `peitho.ouranos.helu.ca` | puck.incus:22981 | Peitho (Django) |
| `pgadmin.ouranos.helu.ca` | prospero.incus:443 (SSL) | PgAdmin 4 |
| `prometheus.ouranos.helu.ca` | prospero.incus:443 (SSL) | Prometheus |
| `searxng.ouranos.helu.ca` | oberon.incus:22073 | SearXNG (OAuth2-Proxy) |
| `smtp4dev.ouranos.helu.ca` | oberon.incus:22085 | smtp4dev |
| `spelunker.ouranos.helu.ca` | puck.incus:22881 | Spelunker (Django) |
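
One way to confirm that a route in the table is live is to request it through Titania over HTTPS. The sketch below is illustrative only: the helper names `route_url` and `check_route` are hypothetical and not part of the repository. It assumes `curl` is installed and treats any HTTP status below 500 as "reachable", since several services answer with a redirect to SSO or a login page.

```bash
# Hypothetical helpers, not part of the repository.
BASE_DOMAIN="ouranos.helu.ca"

# Build the public URL for a subdomain from the route table.
# An empty argument yields the root domain.
route_url() {
    local sub="$1"
    if [ -z "$sub" ]; then
        printf 'https://%s/\n' "$BASE_DOMAIN"
    else
        printf 'https://%s.%s/\n' "$sub" "$BASE_DOMAIN"
    fi
}

# Request a route and report its HTTP status code.
# Assumption: a status below 500 means HAProxy reached the backend.
check_route() {
    local url code
    url="$(route_url "$1")"
    # curl prints 000 when no HTTP response was received at all.
    code="$(curl --silent --output /dev/null --max-time 10 \
        --write-out '%{http_code}' "$url")" || code="000"
    if [ "$code" != "000" ] && [ "$code" -lt 500 ]; then
        echo "OK   $code $url"
    else
        echo "FAIL $code $url"
        return 1
    fi
}
```

For example, `check_route gitea` probes `gitea.ouranos.helu.ca`, and `for s in gitea grafana nextcloud; do check_route "$s"; done` walks a few rows of the table.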
---
## Infrastructure Management
### Quick Start
```bash
# Provision containers
cd terraform
terraform init
terraform plan
terraform apply
# Start all containers
cd ../ansible
source ~/env/agathos/bin/activate
ansible-playbook sandbox_up.yml
# Deploy all services
ansible-playbook site.yml
# Stop all containers
ansible-playbook sandbox_down.yml
```
### Terraform Workflow
1. **Define** — Containers, networks, and resources in `*.tf` files
2. **Plan** — Review changes with `terraform plan`
3. **Apply** — Provision with `terraform apply`
4. **Verify** — Check outputs and container status
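
The four steps above can be sketched as a single function. This is a variation on the Quick Start, not the project's documented procedure: it saves the reviewed plan to a file (the name `tfplan` is an arbitrary choice) so that `apply` executes exactly what was reviewed. The `TERRAFORM` variable is a hypothetical override hook, useful for a dry run with `TERRAFORM=echo`.

```bash
# Assumption: run from the terraform/ directory.
# TERRAFORM is a hypothetical override hook, not a project convention.
TERRAFORM="${TERRAFORM:-terraform}"

provision() {
    # Define: the *.tf files are already in place; init fetches providers.
    "$TERRAFORM" init || return 1
    # Plan: write the reviewed plan to a file.
    "$TERRAFORM" plan -out=tfplan || return 1
    # Apply: execute exactly the saved plan.
    "$TERRAFORM" apply tfplan || return 1
    # Verify: print the outputs for inspection.
    "$TERRAFORM" output
}
```

Because the project keeps Terraform state locally, a saved plan is only valid on the machine that produced it.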
### Ansible Workflow
1. **Bootstrap** — Update packages, install essentials (`apt_update.yml`)
2. **Agents** — Deploy Alloy (log/metrics) and Node Exporter on all hosts
3. **Services** — Configure databases, Docker, applications, observability
4. **Verify** — Check service health and connectivity
### Vault Management
```bash
# Edit secrets
ansible-vault edit inventory/group_vars/all/vault.yml
# View secrets
ansible-vault view inventory/group_vars/all/vault.yml
# Encrypt a new file
ansible-vault encrypt new_secrets.yml
```
---
## S3 Storage Provisioning
Terraform provisions Incus S3 buckets for services requiring object storage:
| Service | Host | Purpose |
|---------|------|---------|
| **Casdoor** | Titania | User avatars and SSO resource storage |
| **LobeChat** | Rosalind | File uploads and attachments |
> S3 credentials (access key, secret key, endpoint) are exposed as sensitive Terraform outputs and stored in Ansible Vault under variables that follow the `vault_*_s3_*` naming pattern.
---
## Ansible Automation
### Full Deployment (`site.yml`)
Playbooks run in dependency order:
| Playbook | Hosts | Purpose |
|----------|-------|---------|
| `apt_update.yml` | All | Update packages and install essentials |
| `alloy/deploy.yml` | All | Grafana Alloy log/metrics collection |
| `prometheus/node_deploy.yml` | All | Node Exporter metrics |
| `docker/deploy.yml` | Oberon, Ariel, Miranda, Puck, Rosalind, Sycorax, Caliban, Titania | Docker engine |
| `smtp4dev/deploy.yml` | Oberon | SMTP test server |
| `pplg/deploy.yml` | Prospero | Full observability stack + HAProxy + OAuth2-Proxy |
| `postgresql/deploy.yml` | Portia | PostgreSQL with all databases |
| `postgresql_ssl/deploy.yml` | Titania | Dedicated PostgreSQL for Casdoor |
| `neo4j/deploy.yml` | Ariel | Neo4j graph database |
| `searxng/deploy.yml` | Oberon | SearXNG privacy search |
| `haproxy/deploy.yml` | Titania | HAProxy TLS termination and routing |
| `casdoor/deploy.yml` | Titania | Casdoor SSO |
| `mcpo/deploy.yml` | Miranda | MCPO MCP proxy |
| `openwebui/deploy.yml` | Oberon | Open WebUI LLM interface |
| `hass/deploy.yml` | Oberon | Home Assistant |
| `gitea/deploy.yml` | Rosalind | Gitea self-hosted Git |
| `nextcloud/deploy.yml` | Rosalind | Nextcloud collaboration |
### Individual Service Deployments
Services with standalone deploy playbooks (not in `site.yml`):
| Playbook | Host | Service |
|----------|------|---------|
| `anythingllm/deploy.yml` | Rosalind | AnythingLLM document AI |
| `arke/deploy.yml` | Sycorax | Arke LLM proxy |
| `argos/deploy.yml` | Miranda | Argos MCP web search server |
| `caliban/deploy.yml` | Caliban | Agent S MCP Server |
| `certbot/deploy.yml` | Titania | Let's Encrypt certificate renewal |
| `gitea_mcp/deploy.yml` | Miranda | Gitea MCP Server |
| `gitea_runner/deploy.yml` | Puck | Gitea CI/CD runner |
| `grafana_mcp/deploy.yml` | Miranda | Grafana MCP Server |
| `jupyterlab/deploy.yml` | Puck | JupyterLab + OAuth2-Proxy |
| `kernos/deploy.yml` | Caliban | Kernos MCP shell server |
| `lobechat/deploy.yml` | Rosalind | LobeChat AI chat |
| `neo4j_mcp/deploy.yml` | Miranda | Neo4j MCP Server |
| `rabbitmq/deploy.yml` | Oberon | RabbitMQ message queue |
### Lifecycle Playbooks
| Playbook | Purpose |
|----------|---------|
| `sandbox_up.yml` | Start all Uranian host containers |
| `sandbox_down.yml` | Gracefully stop all containers |
| `apt_update.yml` | Update packages on all hosts |
| `site.yml` | Full deployment orchestration |
---
## Data Flow Architecture
### Observability Pipeline
```
All Hosts                    Prospero                           Alerts
Alloy + Node Exporter    →   Prometheus + Loki + Grafana    →   AlertManager + Pushover
collect metrics & logs       storage & visualisation            notifications
```
### Integration Points
| Consumer | Provider | Connection |
|----------|----------|-----------|
| All LLM apps | Arke (Sycorax) | `http://sycorax.incus:25540` |
| Open WebUI, Arke, Gitea, Nextcloud, LobeChat | PostgreSQL (Portia) | `portia.incus:5432` |
| Neo4j MCP | Neo4j (Ariel) | `ariel.incus:7687` (Bolt) |
| MCP Switchboard | Docker API (Miranda) | `tcp://miranda.incus:2375` |
| MCP Switchboard | RabbitMQ (Oberon) | `oberon.incus:5672` |
| Kairos, Spelunker | RabbitMQ (Oberon) | `oberon.incus:5672` |
| All apps (SMTP) | smtp4dev (Oberon) | `oberon.incus:22025` |
| All hosts | Loki (Prospero) | `http://prospero.incus:3100` |
| All hosts | Prometheus (Prospero) | `http://prospero.incus:9090` |
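
The connection strings in this table can be probed with a plain TCP check. The following is a minimal sketch, assuming it runs in bash on a host where the `.incus` names resolve and that the coreutils `timeout` command is available. The function names are hypothetical. An open port only shows that something is listening, not that the service is healthy.

```bash
# Hypothetical helpers for probing entries from the Connection column.

# Strip an optional scheme (http://, tcp://) and return the host part.
endpoint_host() {
    local ep="${1#*://}"
    printf '%s\n' "${ep%%:*}"
}

# Strip an optional scheme and return the port part.
endpoint_port() {
    local ep="${1#*://}"
    printf '%s\n' "${ep##*:}"
}

# Try to open a TCP connection using bash's /dev/tcp pseudo-device.
check_tcp() {
    local host port
    host="$(endpoint_host "$1")"
    port="$(endpoint_port "$1")"
    if timeout 3 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
        echo "open   $host:$port"
    else
        echo "closed $host:$port"
        return 1
    fi
}
```

For example, `check_tcp portia.incus:5432` tests the PostgreSQL listener and `check_tcp http://sycorax.incus:25540` tests Arke.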
---
## Important Notes
⚠️ **Alloy Host Variables Required** — Every host with `alloy` in its `services` list must define `alloy_log_level` in `inventory/host_vars/<host>.incus.yml`. The playbook will fail with an undefined variable error if this is missing.
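
A pre-flight check can catch the missing variable before the playbook does. The sketch below is a rough text match rather than a YAML parse, and the function name is hypothetical. It assumes the `services` entries are written as block-style list items (`- alloy`) and that `alloy_log_level` is a top-level key in each host file.

```bash
# List host_vars files that enable Alloy but never set alloy_log_level.
# Assumptions: block-style YAML list items ("- alloy") and a top-level
# alloy_log_level key. This is a text match, not a YAML parse.
missing_alloy_log_level() {
    local dir="${1:-inventory/host_vars}" file status=0
    for file in "$dir"/*.incus.yml; do
        [ -e "$file" ] || continue
        if grep -Eq '^[[:space:]]*-[[:space:]]+alloy[[:space:]]*$' "$file" \
            && ! grep -Eq '^alloy_log_level:' "$file"; then
            echo "$file"
            status=1
        fi
    done
    return "$status"
}
```

Run from the `ansible` directory, `missing_alloy_log_level` prints each offending file and returns non-zero if any were found.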
⚠️ **Alloy Syslog Listeners Required for Docker Services** — Any Docker Compose service using the syslog logging driver must have a corresponding `loki.source.syslog` listener in the host's Alloy config template (`ansible/alloy/<hostname>/config.alloy.j2`). Missing listeners cause Docker containers to fail on start.
⚠️ **Local Terraform State** — This project uses local Terraform state (no remote backend). Do not run `terraform apply` from multiple machines simultaneously.
⚠️ **Nested Docker** — Docker runs inside Incus containers (nested), requiring `security.nesting = true` and `lxc.apparmor.profile=unconfined` AppArmor override on all Docker-enabled hosts.
⚠️ **Deployment Order** — Prospero (observability) must be fully deployed before other hosts, as Alloy on every host pushes logs and metrics to `prospero.incus`. Run `pplg/deploy.yml` before `site.yml` on a fresh environment.
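
The ordering for a fresh environment can be sketched as follows. It assumes the containers are already running (`sandbox_up.yml`), that the commands run from the `ansible` directory with the virtualenv active, and that a failed observability deploy should stop the run. `ANSIBLE_PLAYBOOK` is a hypothetical override hook, not a project convention.

```bash
# Assumption: run from the ansible/ directory with the virtualenv active.
# ANSIBLE_PLAYBOOK is a hypothetical hook (set it to "echo" for a dry run).
ANSIBLE_PLAYBOOK="${ANSIBLE_PLAYBOOK:-ansible-playbook}"

fresh_deploy() {
    # Observability first, so Alloy on every other host has a target
    # on prospero.incus to push logs and metrics to.
    "$ANSIBLE_PLAYBOOK" pplg/deploy.yml || return 1
    # Then the full orchestration.
    "$ANSIBLE_PLAYBOOK" site.yml
}
```

Calling `fresh_deploy` runs both playbooks in order; `ANSIBLE_PLAYBOOK=echo fresh_deploy` prints the sequence without executing it.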