Add Themis application with custom widgets, views, and utilities

- Implemented custom form widgets for date, time, and datetime fields with DaisyUI styling.
- Created utility functions for formatting dates, times, and numbers according to user preferences.
- Developed views for profile settings, API key management, and notifications, including health check endpoints.
- Added URL configurations for Themis tests and main application routes.
- Established test cases for custom widgets to ensure proper functionality and integration.
- Defined project metadata and dependencies in pyproject.toml for package management.
2026-03-21 02:00:18 +00:00
parent e99346d014
commit 99bdb4ac92
351 changed files with 65123 additions and 2 deletions

docs/PHASE_1_FOUNDATION.md Normal file

@@ -0,0 +1,254 @@
# Phase 1: Foundation
## Objective
Establish the project skeleton, Neo4j data model, Django integration, and content-type system. At the end of this phase, you can create libraries, collections, and items via Django admin and the Neo4j graph is populated with the correct node/relationship structure.
## Deliverables
### 1. Django Project Skeleton
- Rename configuration module from `mnemosyne/mnemosyne/` to `mnemosyne/config/` per Red Panda Standards
- Create `pyproject.toml` at repo root with floor-pinned dependencies
- Create `.env` / `.env.example` for environment variables (never commit `.env`)
- Use a single `settings.py`, configured from `.env` via `django-environ`
- Configure dual-database: PostgreSQL (Django auth/config) + Neo4j (content graph)
- Install and configure `django-neomodel` for Neo4j OGM integration
- Configure `djangorestframework` for API
- Configure Celery + RabbitMQ (Async Task pattern)
- Configure S3 storage backend via Incus buckets (MinIO-backed, Terraform-provisioned)
- Configure structured logging for Loki integration via Alloy
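For the structured-logging item, one option is a JSON-lines formatter built on the standard library, so that each record is one self-describing line for the host-level Alloy agent to forward. This is a minimal sketch, not the project's actual configuration: the `JsonFormatter` class, the field names, and the handler/logger names are all assumptions.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line (hypothetical field names)."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            # Keep tracebacks inside the same JSON line so log lines stay parseable.
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


# Illustrative LOGGING dict for settings.py; names here are assumptions.
LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {"json": {"()": JsonFormatter}},
    "handlers": {"console": {"class": "logging.StreamHandler", "formatter": "json"}},
    "root": {"handlers": ["console"], "level": "INFO"},
}
```

Writing to the console keeps the application unaware of Loki itself, which matches the host-level collection described in the infrastructure table.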
### 2. Django Apps
| App | Purpose | Database |
|-----|---------|----------|
| `themis` (installed) | User profiles, preferences, API key management, navigation, notifications | PostgreSQL |
| `library/` | Libraries, Collections, Items, Chunks, Concepts | Neo4j (neomodel) |
| `llm_manager/` | LLM API/model config, usage tracking | PostgreSQL (ported from Spelunker) |
> **Note:** Themis replaces `core/`. User profiles, timezone preferences, theme management, API key storage (encrypted, Fernet), and standard navigation are all provided by Themis. No separate `core/` app is needed. If SSO (Casdoor) or Organization models are required in future, they will be added as separate apps following the SSO and Organization patterns.
### 3. Neo4j Graph Model (neomodel)
```python
# library/models.py
from neomodel import (
    ArrayProperty,
    DateTimeProperty,
    FloatProperty,
    IntegerProperty,
    JSONProperty,
    RelationshipTo,
    StringProperty,
    StructuredNode,
    StructuredRel,
    UniqueIdProperty,
)


class ReferencesRel(StructuredRel):
    """Item -> Concept edge. Placeholder: its properties are not specified in this document."""


class RelatedToRel(StructuredRel):
    """Item -> Item edge. Placeholder: its properties are not specified in this document."""


class Library(StructuredNode):
    uid = UniqueIdProperty()
    name = StringProperty(unique_index=True, required=True)
    library_type = StringProperty(required=True)  # fiction, technical, music, film, art, journal
    description = StringProperty(default='')
    # Content-type configuration (chunking_config is JSON; the rest are plain strings)
    chunking_config = JSONProperty(default=dict)  # callable default avoids a shared mutable dict
    embedding_instruction = StringProperty(default='')
    reranker_instruction = StringProperty(default='')
    llm_context_prompt = StringProperty(default='')
    created_at = DateTimeProperty(default_now=True)
    collections = RelationshipTo('Collection', 'CONTAINS')


class Collection(StructuredNode):
    uid = UniqueIdProperty()
    name = StringProperty(required=True)
    description = StringProperty(default='')
    metadata = JSONProperty(default=dict)
    created_at = DateTimeProperty(default_now=True)
    items = RelationshipTo('Item', 'CONTAINS')
    library = RelationshipTo('Library', 'BELONGS_TO')


class Item(StructuredNode):
    uid = UniqueIdProperty()
    title = StringProperty(required=True)
    item_type = StringProperty(default='')
    s3_key = StringProperty(default='')
    content_hash = StringProperty(index=True)
    file_type = StringProperty(default='')
    file_size = IntegerProperty(default=0)
    metadata = JSONProperty(default=dict)
    created_at = DateTimeProperty(default_now=True)
    updated_at = DateTimeProperty(default_now=True)
    chunks = RelationshipTo('Chunk', 'HAS_CHUNK')
    images = RelationshipTo('Image', 'HAS_IMAGE')
    concepts = RelationshipTo('Concept', 'REFERENCES', model=ReferencesRel)
    related_items = RelationshipTo('Item', 'RELATED_TO', model=RelatedToRel)


class Chunk(StructuredNode):
    uid = UniqueIdProperty()
    chunk_index = IntegerProperty(required=True)
    chunk_s3_key = StringProperty(required=True)
    chunk_size = IntegerProperty(default=0)
    text_preview = StringProperty(default='')  # First 500 chars for full-text index
    embedding = ArrayProperty(FloatProperty())  # 4096d vector
    created_at = DateTimeProperty(default_now=True)
    mentions = RelationshipTo('Concept', 'MENTIONS')


class Concept(StructuredNode):
    uid = UniqueIdProperty()
    name = StringProperty(unique_index=True, required=True)
    concept_type = StringProperty(default='')  # person, place, topic, technique, theme
    embedding = ArrayProperty(FloatProperty())  # 4096d vector
    related_concepts = RelationshipTo('Concept', 'RELATED_TO')


class Image(StructuredNode):
    uid = UniqueIdProperty()
    s3_key = StringProperty(required=True)
    image_type = StringProperty(default='')  # cover, diagram, artwork, still, photo
    description = StringProperty(default='')
    metadata = JSONProperty(default=dict)
    created_at = DateTimeProperty(default_now=True)
    embeddings = RelationshipTo('ImageEmbedding', 'HAS_EMBEDDING')


class ImageEmbedding(StructuredNode):
    uid = UniqueIdProperty()
    embedding = ArrayProperty(FloatProperty())  # 4096d multimodal vector
    created_at = DateTimeProperty(default_now=True)
```
### 4. Neo4j Index Setup
Management command: `python manage.py setup_neo4j_indexes`
Creates vector indexes (4096 dimensions, cosine similarity), full-text indexes, and uniqueness constraints.
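As a rough illustration of what the command might issue, the sketch below builds Neo4j 5 Cypher statements as plain strings. The index and constraint names, and the choice of which labels get which index, are assumptions inferred from the model comments, not a confirmed list. Every statement uses `IF NOT EXISTS`, so the command can be re-run safely.

```python
VECTOR_DIMENSIONS = 4096  # taken from the "4096d" comments in the graph model


def vector_index(name: str, label: str, prop: str = "embedding") -> str:
    """Cypher for a cosine vector index on label.prop."""
    return (
        f"CREATE VECTOR INDEX {name} IF NOT EXISTS "
        f"FOR (n:{label}) ON (n.{prop}) "
        "OPTIONS {indexConfig: {"
        f"`vector.dimensions`: {VECTOR_DIMENSIONS}, "
        "`vector.similarity_function`: 'cosine'}}"
    )


def fulltext_index(name: str, label: str, props: list[str]) -> str:
    """Cypher for a full-text index over one or more string properties."""
    fields = ", ".join(f"n.{p}" for p in props)
    return f"CREATE FULLTEXT INDEX {name} IF NOT EXISTS FOR (n:{label}) ON EACH [{fields}]"


def unique_constraint(name: str, label: str, prop: str) -> str:
    """Cypher for a uniqueness constraint on label.prop."""
    return f"CREATE CONSTRAINT {name} IF NOT EXISTS FOR (n:{label}) REQUIRE n.{prop} IS UNIQUE"


def index_statements() -> list[str]:
    """Hypothetical statement list; names and coverage are illustrative."""
    return [
        vector_index("chunk_embedding", "Chunk"),
        vector_index("concept_embedding", "Concept"),
        vector_index("image_embedding", "ImageEmbedding"),
        fulltext_index("chunk_text_preview", "Chunk", ["text_preview"]),
        unique_constraint("library_name_unique", "Library", "name"),
        unique_constraint("concept_name_unique", "Concept", "name"),
    ]
```

Inside the management command, each statement could then be executed with neomodel's `db.cypher_query(statement)`.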
### 5. Content-Type System
Default library type configurations loaded via management command (`python manage.py load_library_types`). A management command is preferred over fixtures because these configurations will evolve across releases, and the command can be re-run idempotently to update defaults without overwriting per-library customizations.
Default configurations:
| Library Type | Chunking Strategy | Embedding Instruction | LLM Context |
|-------------|-------------------|----------------------|-------------|
| fiction | chapter_aware | narrative retrieval | "Excerpts from fiction..." |
| technical | section_aware | procedural retrieval | "Excerpts from technical docs..." |
| music | song_level | music discovery | "Song lyrics and metadata..." |
| film | scene_level | cinematic retrieval | "Film content..." |
| art | description_level | visual/stylistic retrieval | "Artwork descriptions..." |
| journal | entry_level | temporal/reflective retrieval | "Personal journal entries..." |
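The idempotent behaviour described above can be sketched in plain Python. This assumes that an empty field means "not customized" and that the chunking strategy lives under a `strategy` key. Both are illustrative choices, and the `apply_defaults` helper is hypothetical. The truncated LLM context strings are copied from the table as-is.

```python
# Defaults keyed by library type; values mirror the table above.
LIBRARY_TYPE_DEFAULTS = {
    "fiction": ("chapter_aware", "narrative retrieval", "Excerpts from fiction..."),
    "technical": ("section_aware", "procedural retrieval", "Excerpts from technical docs..."),
    "music": ("song_level", "music discovery", "Song lyrics and metadata..."),
    "film": ("scene_level", "cinematic retrieval", "Film content..."),
    "art": ("description_level", "visual/stylistic retrieval", "Artwork descriptions..."),
    "journal": ("entry_level", "temporal/reflective retrieval", "Personal journal entries..."),
}


def apply_defaults(library: dict) -> dict:
    """Return a copy of `library` with empty config fields filled from its type defaults.

    Fields that already hold a value are treated as customizations and left alone,
    so calling this repeatedly yields the same result.
    """
    strategy, embedding_instruction, llm_context = LIBRARY_TYPE_DEFAULTS[library["library_type"]]
    defaults = {
        "chunking_config": {"strategy": strategy},  # "strategy" key is an assumption
        "embedding_instruction": embedding_instruction,
        "llm_context_prompt": llm_context,
    }
    updated = dict(library)
    for field, value in defaults.items():
        if not updated.get(field):
            updated[field] = value
    return updated
```

For example, a fiction library with a hand-written `embedding_instruction` keeps that instruction and only gains the missing `chunking_config` and `llm_context_prompt`. This simple rule cannot tell an old default from a customization, so a real command would need some way to record which fields a user has edited.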
### 6. Admin & Management UI
`django-neomodel`'s admin support is limited — `StructuredNode` models don't participate in Django's ORM, so standard `ModelAdmin`, filters, search, and inlines don't work. Instead:
- **Custom admin views** for Library, Collection, and Item CRUD using Cypher/neomodel queries, rendered in Django admin's template structure
- **DRF management API** (`/api/v1/library/`, `/api/v1/collection/`, `/api/v1/item/`) for programmatic access and future frontend consumption
- Library CRUD includes content-type configuration editing
- Collection/Item views support filtering by library, type, and date
- All admin views extend `themis/base.html` for consistent navigation
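Because the standard `ModelAdmin` filters are unavailable, the filtering by library, type, and date has to be expressed as a query. The sketch below builds a parameterized Cypher query for Items. The function name and return shape are hypothetical, and it assumes `created_at` is stored as a UTC epoch float, which is how neomodel's `DateTimeProperty` serializes values.

```python
from datetime import datetime, timezone
from typing import Any, Optional


def build_item_query(
    library_uid: Optional[str] = None,
    item_type: Optional[str] = None,
    created_after: Optional[datetime] = None,
) -> tuple[str, dict[str, Any]]:
    """Return (cypher, params) listing Items, with optional filters applied."""
    clauses: list[str] = []
    params: dict[str, Any] = {}
    if library_uid is not None:
        clauses.append("l.uid = $library_uid")
        params["library_uid"] = library_uid
    if item_type is not None:
        clauses.append("i.item_type = $item_type")
        params["item_type"] = item_type
    if created_after is not None:
        clauses.append("i.created_at >= $created_after")
        # Pass timezone-aware datetimes; naive ones are read as local time.
        params["created_after"] = created_after.timestamp()

    # Follows the Library -> Collection -> Item CONTAINS path from the graph model.
    query = "MATCH (l:Library)-[:CONTAINS]->(:Collection)-[:CONTAINS]->(i:Item)"
    if clauses:
        query += " WHERE " + " AND ".join(clauses)
    query += " RETURN i ORDER BY i.created_at DESC"
    return query, params
```

A view could pass the result to neomodel's `db.cypher_query(query, params)`. Keeping user input in `params` rather than in the query string avoids Cypher injection.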
### 7. LLM Manager (Port from Spelunker)
Copy and adapt `llm_manager/` app from Spelunker:
- `LLMApi` model (OpenAI-compatible API endpoints)
- `LLMModel` model (with new `reranker` and `multimodal_embed` model types)
- `LLMUsage` tracking
- **API key storage uses Themis `UserAPIKey`** — LLM Manager does not implement its own encrypted key storage. API credentials for LLM providers are stored via Themis's Fernet-encrypted `UserAPIKey` model with `key_type='api'` and appropriate `service_name` (e.g., "OpenAI", "Arke"). `LLMApi` references credentials by service name lookup against the requesting user's Themis keys.
Schema additions to Spelunker's `LLMModel`:
| Field | Change | Purpose |
|-------|--------|---------|
| `model_type` | Add choices: `reranker`, `multimodal_embed` | Support Qwen3-VL reranker and embedding models |
| `supports_multimodal` | New `BooleanField` | Flag models that accept image+text input |
| `vector_dimensions` | New `IntegerField` | Embedding output dimensions (e.g., 4096) |
### 8. Infrastructure Wiring (Ouranos)
All connections follow Ouranos DNS conventions — use `.incus` hostnames, never hardcode IPs.
| Service | Host | Connection | Settings Variable |
|---------|------|------------|-------------------|
| PostgreSQL | `portia.incus:5432` | Database `mnemosyne` (must be provisioned) | `DATABASE_URL` |
| Neo4j (Bolt) | `ariel.incus:25554` | Neo4j 5.26.0 | `NEOMODEL_NEO4J_BOLT_URL` |
| Neo4j (HTTP) | `ariel.incus:25584` | Browser/API access | — |
| RabbitMQ | `oberon.incus:5672` | Message broker | `CELERY_BROKER_URL` |
| S3 (Incus) | Terraform-provisioned Incus bucket | MinIO-backed object storage | `AWS_S3_ENDPOINT_URL`, `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_STORAGE_BUCKET_NAME` |
| Arke LLM Proxy | `sycorax.incus:25540` | LLM API routing | Configured per `LLMApi` record |
| SMTP (dev) | `oberon.incus:22025` | smtp4dev test server | `EMAIL_HOST` |
| Loki (logs) | `prospero.incus:3100` | Via Alloy agent (host-level, not app-level) | — |
| Casdoor SSO | `titania.incus:22081` | Future: SSO pattern | — |
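To catch a missing variable at startup rather than at first use, the settings variables in the table can be validated in one place. The sketch below deliberately uses only a plain mapping so it stays dependency-free, whereas the project itself lists `django-environ` for this job. The `load_service_settings` helper and the `USER`, `PASSWORD`, and S3 endpoint placeholders are assumptions.

```python
from typing import Mapping

# Settings variables named in the table above.
REQUIRED_VARIABLES = (
    "DATABASE_URL",
    "NEOMODEL_NEO4J_BOLT_URL",
    "CELERY_BROKER_URL",
    "AWS_S3_ENDPOINT_URL",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_STORAGE_BUCKET_NAME",
    "EMAIL_HOST",
)


def load_service_settings(env: Mapping[str, str]) -> dict[str, str]:
    """Return the required variables from `env`, failing fast if any are missing or empty."""
    missing = [name for name in REQUIRED_VARIABLES if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARIABLES}


# Example values in the style of .env.example; credentials and the S3 endpoint
# are placeholders, hostnames and ports come from the table.
EXAMPLE_ENV = {
    "DATABASE_URL": "postgres://USER:PASSWORD@portia.incus:5432/mnemosyne",
    "NEOMODEL_NEO4J_BOLT_URL": "bolt://USER:PASSWORD@ariel.incus:25554",
    "CELERY_BROKER_URL": "amqp://USER:PASSWORD@oberon.incus:5672//",
    "AWS_S3_ENDPOINT_URL": "<incus-bucket-endpoint>",
    "AWS_ACCESS_KEY_ID": "<access-key>",
    "AWS_SECRET_ACCESS_KEY": "<secret-key>",
    "AWS_STORAGE_BUCKET_NAME": "<bucket-name>",
    "EMAIL_HOST": "oberon.incus",
}
```

In `settings.py` this would be called with `os.environ` after the `.env` file has been read.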
**Terraform provisioning required before Phase 1 deployment:**
- PostgreSQL database `mnemosyne` on Portia
- Incus S3 bucket for Mnemosyne content storage
- HAProxy route: `mnemosyne.ouranos.helu.ca` → `puck.incus:<port>` (port TBD, assign next available in 22xxx range)
**Development environment (local):**
- PostgreSQL for Django ORM on `portia.incus`
- Local Neo4j instance or `ariel.incus` via SSH tunnel
- `django.core.files.storage.FileSystemStorage` in place of S3 (tests/dev)
- `CELERY_TASK_ALWAYS_EAGER=True` for synchronous task execution
### 9. Testing Strategy
Follows Red Panda Standards: Django `TestCase`, separate test files per module.
| Test File | Scope |
|-----------|-------|
| `library/tests/test_models.py` | Neo4j node creation, relationships, property validation |
| `library/tests/test_content_types.py` | `load_library_types` command, configuration retrieval per library |
| `library/tests/test_indexes.py` | `setup_neo4j_indexes` command execution |
| `library/tests/test_api.py` | DRF endpoints for Library/Collection/Item CRUD |
| `library/tests/test_admin_views.py` | Custom admin views render and submit correctly |
| `llm_manager/tests/test_models.py` | LLMApi, LLMModel creation, new model types |
| `llm_manager/tests/test_api.py` | LLM Manager API endpoints |
**Neo4j test strategy:**
- Tests use a dedicated Neo4j test database (separate from development/production)
- `NEOMODEL_NEO4J_BOLT_URL` overridden in test settings to point to test database
- Each test class clears its nodes in `setUp` / `tearDown` using `neomodel.clear_neo4j_database(db)`
- CI/CD (Gitea Runner on Puck) uses a Docker Neo4j instance for isolated test runs
- For local development without Neo4j, tests that require Neo4j are skipped via `@unittest.skipUnless(neo4j_available(), "Neo4j not available")`
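The `neo4j_available()` helper used in the skip decorator is not defined in this document. One possible sketch is a short TCP probe of the Bolt port, using only the standard library. The signature, the one-second timeout, and the fallback to port 7687 are assumptions.

```python
import os
import socket
from typing import Optional
from urllib.parse import urlparse


def neo4j_available(bolt_url: Optional[str] = None, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to the Bolt host and port succeeds.

    Only reachability is checked; credentials in the URL are not verified.
    """
    url = bolt_url or os.environ.get("NEOMODEL_NEO4J_BOLT_URL", "")
    parsed = urlparse(url)
    if not parsed.hostname:
        return False
    try:
        # 7687 is the conventional Bolt port, used when the URL omits one.
        with socket.create_connection((parsed.hostname, parsed.port or 7687), timeout=timeout):
            return True
    except OSError:
        return False
```

Because the decorator argument is evaluated at import time, the short timeout keeps test collection fast when no Neo4j instance is reachable.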
## Dependencies
```toml
# pyproject.toml — floor-pinned with ceiling per Red Panda Standards
dependencies = [
"Django>=5.2,<6.0",
"djangorestframework>=3.14,<4.0",
"django-neomodel>=0.1,<1.0",
"neomodel>=5.3,<6.0",
"neo4j>=5.0,<6.0",
"celery>=5.3,<6.0",
"django-storages[boto3]>=1.14,<2.0",
"django-environ>=0.11,<1.0",
"psycopg[binary]>=3.1,<4.0",
"dj-database-url>=2.1,<3.0",
"shortuuid>=1.0,<2.0",
"gunicorn>=21.0,<24.0",
"cryptography>=41.0,<45.0",
"flower>=2.0,<3.0",
"pymemcache>=4.0,<5.0",
"django-heluca-themis",
]
```
## Success Criteria
- [ ] Config module renamed to `config/`, `pyproject.toml` at repo root with floor-pinned deps
- [ ] Settings load from environment variables via `django-environ` (`.env.example` provided)
- [ ] Django project runs with dual PostgreSQL + Neo4j databases
- [ ] Can create Library → Collection → Item through custom admin views
- [ ] DRF API endpoints return Library/Collection/Item data
- [ ] Neo4j graph shows correct node types and relationships
- [ ] Content-type configurations loaded via `load_library_types` and retrievable per library
- [ ] LLM Manager ported from Spelunker; uses Themis `UserAPIKey` for credential storage
- [ ] S3 storage configured against Incus bucket (Terraform-provisioned) and tested
- [ ] Celery worker connects to RabbitMQ on Oberon
- [ ] Structured logging configured (JSON format, compatible with Loki/Alloy)
- [ ] Tests pass for all Phase 1 apps (library, llm_manager)
- [ ] HAProxy route provisioned: `mnemosyne.ouranos.helu.ca`