62 Commits

Author SHA1 Message Date
33658fbc8d feat(library): add business + finance types, workspace_id, IngestJob
Adds two new content-type-aware library types — `business` for
proposals/marketing/strategy (used by the work-team agents) and `finance`
for statements/tax/market commentary (used by Garth). Each ships with
chunking config, embedding/reranker instructions, an LLM-context prompt
that forbids fabricating financial figures, and a vision prompt.

Adds a unique-indexed `workspace_id` property to `Library` so a node
can be scoped to a Daedalus workspace. Null means a global library;
non-null means workspace-scoped. Search Cypher (added in a later
commit) enforces the boundary.

Adds an `IngestJob` Django ORM model — separate from neomodel — that
tracks asynchronous ingestion lifecycle (Daedalus → S3 → Celery →
embedding pipeline) with idempotency on (library, source_ref, hash).
Migration 0001_initial creates the table.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-29 06:26:26 -04:00
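
A minimal sketch of the idempotency rule described for `IngestJob`, assuming the key is exactly the triple (library, source_ref, hash). The in-memory `IngestJobRegistry`, the `status` field, and the choice of SHA-256 as the content hash are hypothetical stand-ins for illustration; the commit's actual Django model is not shown here.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class IngestJob:
    """Illustrative job record; field names other than the key are assumptions."""
    library: str
    source_ref: str
    content_hash: str
    status: str = "pending"


class IngestJobRegistry:
    """In-memory stand-in for the ORM table (hypothetical)."""

    def __init__(self) -> None:
        self._jobs: Dict[Tuple[str, str, str], IngestJob] = {}

    def get_or_create(
        self, library: str, source_ref: str, payload: bytes
    ) -> Tuple[IngestJob, bool]:
        # Assumption: the "hash" in the key is a SHA-256 digest of the payload.
        content_hash = hashlib.sha256(payload).hexdigest()
        key = (library, source_ref, content_hash)
        if key in self._jobs:
            # Same library, same source, same bytes: reuse the existing job.
            return self._jobs[key], False
        job = IngestJob(library, source_ref, content_hash)
        self._jobs[key] = job
        return job, True
```

Submitting the same payload twice for the same library and source returns the original job with `created=False`; changing the payload bytes produces a new job because the hash component of the key differs.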
81426327bf feat(mcp): store MCP tokens as SHA-256 hashes instead of plaintext
Replace plaintext token storage with SHA-256 hashes so leaked database
contents cannot be used to authenticate. Plaintext is generated, shown
once at creation time, and never persisted.

- Add `hash_token()` helper and `MCPTokenManager.create_token()` that
  returns `(instance, plaintext)`.
- Replace `token` field with indexed `token_hash`; look up bearers by
  hashing the incoming value.
- Update dashboard, management command, and admin to surface plaintext
  only at creation. Disable admin "add" since it cannot reveal plaintext.
- Migration drops the old `token` column and adds `token_hash`;
  pre-existing tokens are invalidated and must be reissued.
2026-04-27 09:01:36 -04:00
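
The hash-only storage scheme above can be sketched with the standard library alone. `hash_token()` and `create_token()` mirror the names in the commit message, but their bodies here are assumptions; `TokenStore`, `authenticate()`, and the dict-based record are hypothetical replacements for the Django model and manager.

```python
import hashlib
import secrets
from typing import Dict, Optional, Tuple


def hash_token(plaintext: str) -> str:
    """Return the hex SHA-256 digest of a token."""
    return hashlib.sha256(plaintext.encode("utf-8")).hexdigest()


class TokenStore:
    """In-memory stand-in for the token table (hypothetical)."""

    def __init__(self) -> None:
        self._by_hash: Dict[str, dict] = {}

    def create_token(self, label: str) -> Tuple[dict, str]:
        # The plaintext is generated here, returned once, and never stored.
        plaintext = secrets.token_urlsafe(32)
        record = {"label": label, "token_hash": hash_token(plaintext)}
        self._by_hash[record["token_hash"]] = record
        return record, plaintext

    def authenticate(self, bearer: str) -> Optional[dict]:
        # Look up by hashing the incoming value; the store holds no plaintext.
        return self._by_hash.get(hash_token(bearer))
```

Because only the digest is persisted, a copy of the stored records is not enough to authenticate: the caller must present the original plaintext, which exists only in the return value of `create_token()`.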
2df22941d2 feat: replace server-side RAG with MCP retrieval primitives
- Remove Phase 4 RAG pipeline in favor of retrieval-only architecture
- Add FastMCP server exposing search, get_chunk, list_libraries tools
- Mount MCP endpoints (streamable HTTP + SSE) via Starlette in ASGI config
- Update README to clarify Mnemosyne is a retrieval engine, not RAG
- Let calling LLMs drive synthesis and iterative retrieval themselves
2026-04-26 15:34:26 -04:00
388b37e471 fix(search): require library match and preserve raw scores for RRF
Replace OPTIONAL MATCH with MATCH for Library-Collection-Item paths to
ensure results are properly scoped to libraries, and remove per-query
score normalization since RRF fuses results by rank rather than score
magnitude.
2026-04-26 06:35:11 -04:00
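
To illustrate why per-query score normalization is unnecessary, here is a small sketch of Reciprocal Rank Fusion. It uses only each result's rank position, never its raw score. The function name `rrf_fuse` and the constant `k = 60` (a commonly used default) are assumptions, not values taken from this repository.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def rrf_fuse(
    rankings: Sequence[Sequence[str]], k: int = 60
) -> List[Tuple[str, float]]:
    """Fuse ranked lists of document ids with Reciprocal Rank Fusion.

    Each document receives 1 / (k + rank) from every list it appears in,
    where rank starts at 1. Raw scores from the individual searches are
    not consulted at all.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


vector_hits = ["a", "b", "c"]    # e.g. ordered by vector similarity
fulltext_hits = ["b", "c", "a"]  # e.g. ordered by fulltext relevance
fused = rrf_fuse([vector_hits, fulltext_hits])
```

In this example `b` ranks first in the fused list because it holds positions 2 and 1, ahead of `a` (positions 1 and 3) and `c` (positions 3 and 2).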
4a35aa126f refactor(settings): replace DATABASE_URL with explicit DB env vars
Replace the single `DATABASE_URL` connection string with individual
environment variables (`APP_DB_NAME`, `APP_DB_USER`, `APP_DB_PASSWORD`,
`DB_HOST`, `DB_PORT`) for more granular database configuration control.
2026-04-13 10:23:03 +00:00
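
A hedged sketch of how a Django settings module might assemble `DATABASES` from the individual variables named above. The helper `build_databases`, the PostgreSQL engine, and the fallback host and port are assumptions for illustration; the commit does not state which backend or defaults are used.

```python
import os
from typing import Mapping


def build_databases(env: Mapping[str, str] = os.environ) -> dict:
    """Build a Django-style DATABASES dict from explicit env vars."""
    return {
        "default": {
            # Assumption: PostgreSQL backend; the source does not say.
            "ENGINE": "django.db.backends.postgresql",
            # Required variables raise KeyError early if missing.
            "NAME": env["APP_DB_NAME"],
            "USER": env["APP_DB_USER"],
            "PASSWORD": env["APP_DB_PASSWORD"],
            # Assumed fallbacks for local development.
            "HOST": env.get("DB_HOST", "localhost"),
            "PORT": env.get("DB_PORT", "5432"),
        }
    }
```

In a settings file this would be used as `DATABASES = build_databases()`. Passing a plain dict instead of `os.environ` makes the function easy to exercise in tests.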
634845fee0 feat: add Phase 3 hybrid search with Synesis reranking
Implement hybrid search pipeline combining vector, fulltext, and graph
search across Neo4j, with cross-attention reranking via Synesis
(Qwen3-VL-Reranker-2B) `/v1/rerank` endpoint.

- Add SearchService with vector, fulltext, and graph search strategies
- Add SynesisRerankerClient for multimodal reranking via HTTP API
- Add search API endpoint (POST /search/) with filtering by library,
  collection, and library_type
- Add SearchRequest/Response serializers and image search results
- Add "nonfiction" to library_type choices
- Consolidate reranker stack from two models to single Synesis service
- Handle image analysis_status as "skipped" when analysis is unavailable
- Add comprehensive tests for search pipeline and reranker client
2026-03-29 18:09:50 +00:00
fb38a881d9 Add vision model support to LLM Manager admin and rename index for clarity
2026-03-29 17:03:59 +00:00
90db904959 Add vision analysis capabilities to the embedding pipeline
- Introduced a new vision analysis service to classify, describe, and extract text from images.
- Enhanced the Image model with fields for OCR text, vision model name, and analysis status.
- Added a new "nonfiction" library type with specific chunking and embedding configurations.
- Updated content types to include vision prompts for various library types.
- Integrated vision analysis into the embedding pipeline, allowing for image analysis during document processing.
- Implemented metrics to track vision analysis performance and usage.
- Updated UI components to display vision analysis results and statuses in item details and the embedding dashboard.
- Added migration for new vision model fields and usage tracking.
2026-03-22 15:14:34 +00:00
6585beed20 Add download functionality for items and images with presigned URLs
2026-03-22 12:08:44 +00:00
1379e0d425 Add logging configuration to prevent Celery from overriding Django's logging setup
2026-03-21 13:23:56 +00:00
99bdb4ac92 Add Themis application with custom widgets, views, and utilities
- Implemented custom form widgets for date, time, and datetime fields with DaisyUI styling.
- Created utility functions for formatting dates, times, and numbers according to user preferences.
- Developed views for profile settings, API key management, and notifications, including health check endpoints.
- Added URL configurations for Themis tests and main application routes.
- Established test cases for custom widgets to ensure proper functionality and integration.
- Defined project metadata and dependencies in pyproject.toml for package management.
2026-03-21 02:00:18 +00:00
e99346d014 Initial commit
2026-03-18 23:01:09 +00:00