Phase 1: Foundation
Objective
Establish the project skeleton, Neo4j data model, Django integration, and content-type system. At the end of this phase, you can create libraries, collections, and items via Django admin and the Neo4j graph is populated with the correct node/relationship structure.
Deliverables
1. Django Project Skeleton
- Rename configuration module from mnemosyne/mnemosyne/ to mnemosyne/config/ per Red Panda Standards
- Create pyproject.toml at repo root with floor-pinned dependencies
- Create .env / .env.example for environment variables (never commit .env)
- Use a single settings.py, configured from '.env' via django-environ
- Configure dual-database: PostgreSQL (Django auth/config) + Neo4j (content graph)
- Install and configure django-neomodel for Neo4j OGM integration
- Configure djangorestframework for API
- Configure Celery + RabbitMQ (Async Task pattern)
- Configure S3 storage backend via Incus buckets (MinIO-backed, Terraform-provisioned)
- Configure structured logging for Loki integration via Alloy
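The structured-logging item can be illustrated with a minimal sketch using only the standard library. It assumes one JSON object per line on the console is what the host-level Alloy agent collects; the `JsonFormatter` class, the `json` formatter name, and the chosen field names are illustrative assumptions, not part of the plan.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            # Field names are assumptions; adjust to whatever Loki queries expect.
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            payload["exc_info"] = self.formatException(record.exc_info)
        return json.dumps(payload)


# Django-style LOGGING dict. In a real settings.py the formatter would
# normally be referenced by dotted path rather than by class object.
LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {"json": {"()": JsonFormatter}},
    "handlers": {
        "console": {"class": "logging.StreamHandler", "formatter": "json"},
    },
    "root": {"handlers": ["console"], "level": "INFO"},
}
```

Because the agent runs at host level, the application only needs to emit parseable lines; no Loki client library is involved in this sketch.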
2. Django Apps
| App | Purpose | Database |
|---|---|---|
| themis (installed) | User profiles, preferences, API key management, navigation, notifications | PostgreSQL |
| library/ | Libraries, Collections, Items, Chunks, Concepts | Neo4j (neomodel) |
| llm_manager/ | LLM API/model config, usage tracking | PostgreSQL (ported from Spelunker) |
Note: Themis replaces core/. User profiles, timezone preferences, theme management, API key storage (encrypted, Fernet), and standard navigation are all provided by Themis. No separate core/ app is needed. If SSO (Casdoor) or Organization models are required in the future, they will be added as separate apps following the SSO and Organization patterns.
3. Neo4j Graph Model (neomodel)
# library/models.py
from neomodel import (
    ArrayProperty,
    DateTimeProperty,
    FloatProperty,
    IntegerProperty,
    JSONProperty,
    RelationshipTo,
    StringProperty,
    StructuredNode,
    StructuredRel,
    UniqueIdProperty,
)


# ReferencesRel and RelatedToRel are not defined in this section. These
# minimal placeholders only let the module import; their properties are TBD.
class ReferencesRel(StructuredRel):
    pass


class RelatedToRel(StructuredRel):
    pass


class Library(StructuredNode):
    uid = UniqueIdProperty()
    name = StringProperty(unique_index=True, required=True)
    library_type = StringProperty(required=True)  # fiction, technical, music, film, art, journal
    description = StringProperty(default='')
    # Content-type configuration (stored as JSON strings).
    # default=dict (a callable) avoids sharing one mutable dict between nodes.
    chunking_config = JSONProperty(default=dict)
    embedding_instruction = StringProperty(default='')
    reranker_instruction = StringProperty(default='')
    llm_context_prompt = StringProperty(default='')
    created_at = DateTimeProperty(default_now=True)
    collections = RelationshipTo('Collection', 'CONTAINS')


class Collection(StructuredNode):
    uid = UniqueIdProperty()
    name = StringProperty(required=True)
    description = StringProperty(default='')
    metadata = JSONProperty(default=dict)
    created_at = DateTimeProperty(default_now=True)
    items = RelationshipTo('Item', 'CONTAINS')
    library = RelationshipTo('Library', 'BELONGS_TO')


class Item(StructuredNode):
    uid = UniqueIdProperty()
    title = StringProperty(required=True)
    item_type = StringProperty(default='')
    s3_key = StringProperty(default='')
    content_hash = StringProperty(index=True)
    file_type = StringProperty(default='')
    file_size = IntegerProperty(default=0)
    metadata = JSONProperty(default=dict)
    created_at = DateTimeProperty(default_now=True)
    # default_now only sets the value at creation; refresh it explicitly on save.
    updated_at = DateTimeProperty(default_now=True)
    chunks = RelationshipTo('Chunk', 'HAS_CHUNK')
    images = RelationshipTo('Image', 'HAS_IMAGE')
    concepts = RelationshipTo('Concept', 'REFERENCES', model=ReferencesRel)
    related_items = RelationshipTo('Item', 'RELATED_TO', model=RelatedToRel)


class Chunk(StructuredNode):
    uid = UniqueIdProperty()
    chunk_index = IntegerProperty(required=True)
    chunk_s3_key = StringProperty(required=True)
    chunk_size = IntegerProperty(default=0)
    text_preview = StringProperty(default='')  # First 500 chars for full-text index
    embedding = ArrayProperty(FloatProperty())  # 4096d vector
    created_at = DateTimeProperty(default_now=True)
    mentions = RelationshipTo('Concept', 'MENTIONS')


class Concept(StructuredNode):
    uid = UniqueIdProperty()
    name = StringProperty(unique_index=True, required=True)
    concept_type = StringProperty(default='')  # person, place, topic, technique, theme
    embedding = ArrayProperty(FloatProperty())  # 4096d vector
    related_concepts = RelationshipTo('Concept', 'RELATED_TO')


class Image(StructuredNode):
    uid = UniqueIdProperty()
    s3_key = StringProperty(required=True)
    image_type = StringProperty(default='')  # cover, diagram, artwork, still, photo
    description = StringProperty(default='')
    metadata = JSONProperty(default=dict)
    created_at = DateTimeProperty(default_now=True)
    embeddings = RelationshipTo('ImageEmbedding', 'HAS_EMBEDDING')


class ImageEmbedding(StructuredNode):
    uid = UniqueIdProperty()
    embedding = ArrayProperty(FloatProperty())  # 4096d multimodal vector
    created_at = DateTimeProperty(default_now=True)
4. Neo4j Index Setup
Management command: python manage.py setup_neo4j_indexes
Creates vector indexes (4096d cosine), full-text indexes, and constraint indexes.
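As a rough sketch of what the command might execute, the helper below builds the Cypher statements for the vector and full-text indexes. The index names and the choice of which labels and properties get an index are assumptions inferred from the graph model above; the statement syntax is the Neo4j 5.x `CREATE VECTOR INDEX` / `CREATE FULLTEXT INDEX` form.

```python
# Index name (assumed) -> (node label, property) taken from the graph model.
VECTOR_INDEXES = {
    "chunk_embedding": ("Chunk", "embedding"),
    "concept_embedding": ("Concept", "embedding"),
    "image_embedding": ("ImageEmbedding", "embedding"),
}

# Index name (assumed) -> (node label, list of text properties).
FULLTEXT_INDEXES = {
    "chunk_text_preview": ("Chunk", ["text_preview"]),
}


def vector_index_cypher(name, label, prop, dimensions=4096, similarity="cosine"):
    """Build an idempotent CREATE VECTOR INDEX statement."""
    return (
        f"CREATE VECTOR INDEX {name} IF NOT EXISTS "
        f"FOR (n:{label}) ON (n.{prop}) "
        "OPTIONS {indexConfig: {"
        f"`vector.dimensions`: {dimensions}, "
        f"`vector.similarity_function`: '{similarity}'"
        "}}"
    )


def fulltext_index_cypher(name, label, props):
    """Build an idempotent CREATE FULLTEXT INDEX statement."""
    columns = ", ".join(f"n.{p}" for p in props)
    return (
        f"CREATE FULLTEXT INDEX {name} IF NOT EXISTS "
        f"FOR (n:{label}) ON EACH [{columns}]"
    )


def index_statements():
    """Return every statement the command would run, vector indexes first."""
    statements = [
        vector_index_cypher(name, label, prop)
        for name, (label, prop) in VECTOR_INDEXES.items()
    ]
    statements += [
        fulltext_index_cypher(name, label, props)
        for name, (label, props) in FULLTEXT_INDEXES.items()
    ]
    return statements
```

Inside the management command, each statement could be passed to neomodel's `db.cypher_query(statement)`. The `IF NOT EXISTS` clause keeps the command safe to re-run.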
5. Content-Type System
Default library type configurations loaded via management command (python manage.py load_library_types). A management command is preferred over fixtures because these configurations will evolve across releases, and the command can be re-run idempotently to update defaults without overwriting per-library customizations.
Default configurations:
| Library Type | Chunking Strategy | Embedding Instruction | LLM Context |
|---|---|---|---|
| fiction | chapter_aware | narrative retrieval | "Excerpts from fiction..." |
| technical | section_aware | procedural retrieval | "Excerpts from technical docs..." |
| music | song_level | music discovery | "Song lyrics and metadata..." |
| film | scene_level | cinematic retrieval | "Film content..." |
| art | description_level | visual/stylistic retrieval | "Artwork descriptions..." |
| journal | entry_level | temporal/reflective retrieval | "Personal journal entries..." |
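One way to make re-runs idempotent without clobbering customizations is sketched below. It assumes a field counts as "customized" whenever it is non-empty, so defaults only fill empty or missing fields; that rule, the `apply_defaults` helper, and the `strategy` key inside `chunking_config` are assumptions for illustration. Only two of the six types are shown, with the table's abbreviated prompt text kept as-is.

```python
from copy import deepcopy

# Defaults keyed by library type; values mirror the table above.
LIBRARY_TYPE_DEFAULTS = {
    "fiction": {
        "chunking_config": {"strategy": "chapter_aware"},
        "embedding_instruction": "narrative retrieval",
        "llm_context_prompt": "Excerpts from fiction...",
    },
    "technical": {
        "chunking_config": {"strategy": "section_aware"},
        "embedding_instruction": "procedural retrieval",
        "llm_context_prompt": "Excerpts from technical docs...",
    },
}


def apply_defaults(current: dict, defaults: dict) -> dict:
    """Fill empty or missing fields from defaults; keep non-empty values."""
    merged = deepcopy(current)
    for field, default_value in defaults.items():
        if not merged.get(field):  # '', {}, None, or absent
            merged[field] = deepcopy(default_value)
    return merged


# A library that customized only its embedding instruction.
library_fields = {
    "chunking_config": {},
    "embedding_instruction": "custom instruction",
    "llm_context_prompt": "",
}
updated = apply_defaults(library_fields, LIBRARY_TYPE_DEFAULTS["fiction"])
```

Under this rule a second run changes nothing, but a library that still holds an older default would not pick up a revised one. Tracking which values are untouched defaults would need extra state that this sketch does not model.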
6. Admin & Management UI
django-neomodel's admin support is limited — StructuredNode models don't participate in Django's ORM, so standard ModelAdmin, filters, search, and inlines don't work. Instead:
- Custom admin views for Library, Collection, and Item CRUD using Cypher/neomodel queries, rendered in Django admin's template structure
- DRF management API (/api/v1/library/, /api/v1/collection/, /api/v1/item/) for programmatic access and future frontend consumption
- Library CRUD includes content-type configuration editing
- Collection/Item views support filtering by library, type, and date
- All admin views extend themis/base.html for consistent navigation
7. LLM Manager (Port from Spelunker)
Copy and adapt llm_manager/ app from Spelunker:
- LLMApi model (OpenAI-compatible API endpoints)
- LLMModel model (with new reranker and multimodal_embed model types)
- LLMUsage tracking
- API key storage uses Themis UserAPIKey: LLM Manager does not implement its own encrypted key storage. API credentials for LLM providers are stored via Themis's Fernet-encrypted UserAPIKey model with key_type='api' and appropriate service_name (e.g., "OpenAI", "Arke"). LLMApi references credentials by service name lookup against the requesting user's Themis keys.
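The service-name lookup described in the last item can be sketched without Django as a plain filter over a user's stored keys. `StoredKey` is a hypothetical stand-in for a Themis UserAPIKey row: only `key_type` and `service_name` come from the plan, while `user_id`, `encrypted_value`, and the `find_credential` helper are assumptions.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class StoredKey:
    """Hypothetical stand-in for a Themis UserAPIKey row."""
    user_id: int
    service_name: str
    key_type: str
    encrypted_value: bytes  # Fernet token in the real model; opaque here


def find_credential(
    keys: Iterable[StoredKey], user_id: int, service_name: str
) -> Optional[StoredKey]:
    """Return the requesting user's API key for a service, or None."""
    for key in keys:
        if (
            key.user_id == user_id
            and key.key_type == "api"
            and key.service_name == service_name
        ):
            return key
    return None


keys = [
    StoredKey(1, "OpenAI", "api", b"token-a"),
    StoredKey(1, "Arke", "api", b"token-b"),
    StoredKey(2, "Arke", "api", b"token-c"),
]
match = find_credential(keys, user_id=1, service_name="Arke")
```

In the real app this would presumably be an ORM filter on the Themis model, with decryption handled by Themis rather than by LLM Manager.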
Schema additions to Spelunker's LLMModel:
| Field | Change | Purpose |
|---|---|---|
| model_type | Add choices: reranker, multimodal_embed | Support Qwen3-VL reranker and embedding models |
| supports_multimodal | New BooleanField | Flag models that accept image+text input |
| vector_dimensions | New IntegerField | Embedding output dimensions (e.g., 4096) |
8. Infrastructure Wiring (Ouranos)
All connections follow Ouranos DNS conventions — use .incus hostnames, never hardcode IPs.
| Service | Host | Connection | Settings Variable |
|---|---|---|---|
| PostgreSQL | portia.incus:5432 | Database mnemosyne (must be provisioned) | DATABASE_URL |
| Neo4j (Bolt) | ariel.incus:25554 | Neo4j 5.26.0 | NEOMODEL_NEO4J_BOLT_URL |
| Neo4j (HTTP) | ariel.incus:25584 | Browser/API access | n/a |
| RabbitMQ | oberon.incus:5672 | Message broker | CELERY_BROKER_URL |
| S3 (Incus) | Terraform-provisioned Incus bucket | MinIO-backed object storage | AWS_S3_ENDPOINT_URL, AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_STORAGE_BUCKET_NAME |
| Arke LLM Proxy | sycorax.incus:25540 | LLM API routing | Configured per LLMApi record |
| SMTP (dev) | oberon.incus:22025 | smtp4dev test server | EMAIL_HOST |
| Loki (logs) | prospero.incus:3100 | Via Alloy agent (host-level, not app-level) | n/a |
| Casdoor SSO | titania.incus:22081 | Future: SSO pattern | n/a |
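Since every connection is driven by a settings variable, a small startup check can catch an incomplete .env early. The sketch below lists the variables from the table; the `missing_settings` helper and the decision to treat all of them as required are assumptions, and the example URL is illustrative only.

```python
import os
from typing import List, Mapping

# Variable names taken from the "Settings Variable" column above.
REQUIRED_SETTINGS = (
    "DATABASE_URL",
    "NEOMODEL_NEO4J_BOLT_URL",
    "CELERY_BROKER_URL",
    "AWS_S3_ENDPOINT_URL",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_STORAGE_BUCKET_NAME",
    "EMAIL_HOST",
)


def missing_settings(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return required variable names that are unset or empty."""
    return [name for name in REQUIRED_SETTINGS if not env.get(name)]


# Example with a partial environment (hostname from the table, no credentials).
partial_env = {"DATABASE_URL": "postgres://portia.incus:5432/mnemosyne"}
still_missing = missing_settings(partial_env)
```

Failing fast on a non-empty result, for example by raising Django's `ImproperlyConfigured` in settings.py, would surface configuration gaps before any service connection is attempted.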
Terraform provisioning required before Phase 1 deployment:
- PostgreSQL database mnemosyne on Portia
- Incus S3 bucket for Mnemosyne content storage
- HAProxy route: mnemosyne.ouranos.helu.ca → puck.incus:<port> (port TBD, assign next available in 22xxx range)
Development environment (local):
- PostgreSQL for Django ORM on 'portia.incus'
- Local Neo4j instance or ariel.incus via SSH tunnel
- django.core.files.storage.FileSystemStorage in place of S3 (tests/dev)
- CELERY_TASK_ALWAYS_EAGER=True for synchronous task execution
9. Testing Strategy
Follows Red Panda Standards: Django TestCase, separate test files per module.
| Test File | Scope |
|---|---|
| library/tests/test_models.py | Neo4j node creation, relationships, property validation |
| library/tests/test_content_types.py | load_library_types command, configuration retrieval per library |
| library/tests/test_indexes.py | setup_neo4j_indexes command execution |
| library/tests/test_api.py | DRF endpoints for Library/Collection/Item CRUD |
| library/tests/test_admin_views.py | Custom admin views render and submit correctly |
| llm_manager/tests/test_models.py | LLMApi, LLMModel creation, new model types |
| llm_manager/tests/test_api.py | LLM Manager API endpoints |
Neo4j test strategy:
- Tests use a dedicated Neo4j test database (separate from development/production)
- NEOMODEL_NEO4J_BOLT_URL overridden in test settings to point to test database
- Each test class clears its nodes in setUp/tearDown using neomodel.clear_neo4j_database(db)
- CI/CD (Gitea Runner on Puck) uses a Docker Neo4j instance for isolated test runs
- For local development without Neo4j, tests that require Neo4j are skipped via @unittest.skipUnless(neo4j_available(), "Neo4j not available")
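The plan does not define `neo4j_available()`, so the following is one possible sketch. It assumes a plain TCP reachability check against the Bolt URL is good enough for deciding whether to skip; it does not perform a Bolt handshake or authenticate. The fallback URL `bolt://localhost:7687` is Neo4j's conventional default and is also an assumption here.

```python
import os
import socket
from typing import Optional
from urllib.parse import urlparse


def neo4j_available(bolt_url: Optional[str] = None, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to the Bolt host:port succeeds."""
    url = bolt_url or os.environ.get(
        "NEOMODEL_NEO4J_BOLT_URL", "bolt://localhost:7687"
    )
    parsed = urlparse(url)
    host = parsed.hostname or "localhost"
    port = parsed.port or 7687
    try:
        # Only checks that something is listening; no Bolt handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A test class would then be decorated with `@unittest.skipUnless(neo4j_available(), "Neo4j not available")`. Because the decorator is evaluated at import time, the probe runs once per test module load.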
Dependencies
# pyproject.toml — floor-pinned with ceiling per Red Panda Standards
dependencies = [
    "Django>=5.2,<6.0",
    "djangorestframework>=3.14,<4.0",
    "django-neomodel>=0.1,<1.0",
    "neomodel>=5.3,<6.0",
    "neo4j>=5.0,<6.0",
    "celery>=5.3,<6.0",
    "django-storages[boto3]>=1.14,<2.0",
    "django-environ>=0.11,<1.0",
    "psycopg[binary]>=3.1,<4.0",
    "dj-database-url>=2.1,<3.0",
    "shortuuid>=1.0,<2.0",
    "gunicorn>=21.0,<24.0",
    "cryptography>=41.0,<45.0",
    "flower>=2.0,<3.0",
    "pymemcache>=4.0,<5.0",
    "django-heluca-themis",
]
Success Criteria
- Config module renamed to config/, pyproject.toml at repo root with floor-pinned deps
- Settings load from environment variables via django-environ (.env.example provided)
- Django project runs with dual PostgreSQL + Neo4j databases
- Can create Library → Collection → Item through custom admin views
- DRF API endpoints return Library/Collection/Item data
- Neo4j graph shows correct node types and relationships
- Content-type configurations loaded via load_library_types and retrievable per library
- LLM Manager ported from Spelunker; uses Themis UserAPIKey for credential storage
- S3 storage configured against Incus bucket (Terraform-provisioned) and tested
- Celery worker connects to RabbitMQ on Oberon
- Structured logging configured (JSON format, compatible with Loki/Alloy)
- Tests pass for all Phase 1 apps (library, llm_manager)
- HAProxy route provisioned: mnemosyne.ouranos.helu.ca