fix(search): require library match and preserve raw scores for RRF

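Replace OPTIONAL MATCH with MATCH on the Library-Collection-Item path so that every result is scoped to a library. In Cypher, a WHERE clause that follows OPTIONAL MATCH constrains only the optional pattern, so rows that failed the library filters were kept with a null lib instead of being dropped.

Remove per-query score normalization: RRF fuses results by rank, not by score magnitude, so the raw scores are sufficient.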
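
The rank-versus-magnitude point can be illustrated with a minimal sketch of Reciprocal Rank Fusion. This is not the fusion code from this repository: `rrf_fuse`, `rank_by_score`, the sample scores, and the constant `k=60` (a commonly used default) are assumptions made for illustration only.

```python
from collections import defaultdict


def rank_by_score(scores):
    """Return IDs ordered by descending score (hypothetical helper)."""
    return [uid for uid, _ in sorted(scores.items(), key=lambda kv: kv[1], reverse=True)]


def rrf_fuse(ranked_lists, k=60):
    """Fuse ranked ID lists: each list contributes 1 / (k + rank) per ID."""
    fused = defaultdict(float)
    for ranking in ranked_lists:
        for rank, uid in enumerate(ranking, start=1):
            fused[uid] += 1.0 / (k + rank)
    # Sort by fused score, breaking ties by ID for a stable order.
    return sorted(fused, key=lambda uid: (-fused[uid], uid))


# Made-up scores: raw BM25 values are unbounded, vector similarities are not.
raw_bm25 = {"a": 12.4, "b": 7.1, "c": 3.3}
vector = {"b": 0.91, "c": 0.88, "d": 0.80}

# Dividing by the maximum rescales the values but leaves the order unchanged.
max_score = max(raw_bm25.values())
normalized_bm25 = {uid: score / max_score for uid, score in raw_bm25.items()}

fused_raw = rrf_fuse([rank_by_score(raw_bm25), rank_by_score(vector)])
fused_norm = rrf_fuse([rank_by_score(normalized_bm25), rank_by_score(vector)])

print(fused_raw)                # ['b', 'c', 'a', 'd']
print(fused_raw == fused_norm)  # True
```

Because only the position of each ID within its list enters the sum, any order-preserving rescaling of the per-query scores yields the same fused ranking.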
2026-04-26 06:35:11 -04:00
parent 4a35aa126f
commit 388b37e471
3 changed files with 55 additions and 360 deletions
--- a/Standards_Django_V1-00.md
+++ b/Standards_Django_V1-00.md
@@ -1,306 +0,0 @@
-## 🐾 Red Panda Approval™
-
-This project follows Red Panda Approval standards — our gold standard for Django application quality. Code must be elegant, reliable, and maintainable to earn the approval of our adorable red panda judges.
-
-### The 5 Sacred Django Criteria
-1. **Fresh Migration Test** — Clean migrations from empty database
-2. **Elegant Simplicity** — No unnecessary complexity
-3. **Observable & Debuggable** — Proper logging and error handling
-4. **Consistent Patterns** — Follow Django conventions
-5. **Actually Works** — Passes all checks and serves real user needs
-
-## Environment Standards
- Virtual environment: ~/env/PROJECT/bin/activate
- Use pyproject.toml for project configuration (no setup.py, no requirements.txt)
- Python version: specified in pyproject.toml
- Dependencies: floor-pinned with ceiling (e.g. `Django>=5.2,<6.0`)
-
-### Dependency Pinning
-
-```toml
-# Correct — floor pin with ceiling
-dependencies = [
-    "Django>=5.2,<6.0",
-    "djangorestframework>=3.14,<4.0",
-    "cryptography>=41.0,<45.0",
-]
-
-# Wrong — exact pins in library packages
-dependencies = [
-    "Django==5.2.7",  # too strict, breaks downstream
-]
-```
-
-Exact pins (`==`) are only appropriate in application-level lock files, not in reusable library packages.
-
-## Directory Structure
-myproject/                     # Git repository root
-├── .gitignore
-├── README.md
-├── pyproject.toml             # Project configuration (moved to repo root)
-├── docker-compose.yml
-├── .env                       # Docker Compose environment (DATABASE_URL=postgres://...)
-├── .env.example
-│
-├── project/                   # Django project root (manage.py lives here)
-│   ├── manage.py
-│   ├── Dockerfile
-│   ├── .env                   # Local development environment (DATABASE_URL=sqlite:///...)
-│   ├── .env.example
-│   │
-│   ├── config/                # Django configuration module
-│   │   ├── __init__.py
-│   │   ├── settings.py
-│   │   ├── urls.py
-│   │   ├── wsgi.py
-│   │   └── asgi.py
-│   │
-│   ├── accounts/              # Django app
-│   │   ├── __init__.py
-│   │   ├── models.py
-│   │   ├── views.py
-│   │   └── urls.py
-│   │
-│   ├── blog/                  # Django app
-│   │   ├── __init__.py
-│   │   ├── models.py
-│   │   ├── views.py
-│   │   └── urls.py
-│   │
-│   ├── static/
-│   │   ├── css/
-│   │   └── js/
-│   │
-│   └── templates/
-│       └── base.html
-│
-├── web/                       # Nginx configuration
-│   └── nginx.conf
-│
-├── db/                        # PostgreSQL configuration
-│   └── postgresql.conf
-│
-└── docs/                      # Project documentation
-    └── index.md
-
-## Settings Structure
- Use a single settings.py file
- Use django-environ or python-dotenv for environment variables
- Never commit .env files to version control
- Provide .env.example with all required variables documented
- Create .gitignore file
- Create a .dockerignore file
-
-## Code Organization
- Imports: PEP 8 ordering (stdlib, third-party, local)
- Type hints on function parameters
- CSS: External .css files only (no inline styles, no embedded `<style>` tags)
- JS: External .js files only (no inline handlers, no embedded `<script>` blocks)
- Maximum file length: 1000 lines
- If a file exceeds 500 lines, consider splitting by domain concept
-
-## Database Conventions
- Migrations run cleanly from empty database
- Never edit deployed migrations
- Use meaningful migration names: --name add_email_to_profile
- One logical change per migration when possible
- Test migrations both forward and backward
-
-### Development vs Production
- Development: SQLite
- Production: PostgreSQL
-
-## Caching
- Expensive queries are cached
- Cache keys follow naming convention
- TTLs are appropriate (not infinite)
- Invalidation is documented
- Key Naming Pattern: {app}:{model}:{identifier}:{field}
-
-## Model Naming
- Model names: singular PascalCase (User, BlogPost, OrderItem)
- Correct English pluralization on related names
- All models have created_at and updated_at
- All models define __str__ and get_absolute_url
- TextChoices used for status fields
- related_name defined on ForeignKey fields
- Related names: plural snake_case with proper English pluralization
-
-## Forms
- Use ModelForm with explicit fields list (never __all__)
-
-## Field Naming
- Foreign keys: singular without _id suffix (author, category, parent)
- Boolean fields: use prefixes (is_active, has_permission, can_edit)
- Date fields: use suffixes (created_at, updated_at, published_on)
- Avoid abbreviations (use description, not desc)
-
-## Required Model Fields
- All models should include:
-  - created_at = models.DateTimeField(auto_now_add=True)
-  - updated_at = models.DateTimeField(auto_now=True)
- Consider adding:
-  - id = models.UUIDField(primary_key=True) for public-facing models
-  - is_active = models.BooleanField(default=True) for soft deletes
-
-## Indexing
- Add db_index=True to frequently queried fields
- Use Meta.indexes for composite indexes
- Document why each index exists
-
-## Queries
- Use select_related() for foreign keys
- Use prefetch_related() for reverse relations and M2M
- Avoid queries in loops (N+1 problem)
- Use .only() and .defer() for large models
- Add comments explaining complex querysets
-
-## Docstrings
- Use Sphinx style docstrings
- Document all public functions, classes, and modules
- Skip docstrings for obvious one-liners and standard Django overrides
-
-## Views
- Use Function-Based Views (FBVs) exclusively
- Explicit logic is preferred over implicit inheritance
- Extract shared logic into utility functions
-
-## URLs & Identifiers
-
- Public URLs use short UUIDs (12 characters) via `shortuuid`
- Never expose sequential IDs in URLs (security/enumeration risk)
- Internal references may use standard UUIDs or PKs
-
-## URL Patterns
- Resource-based URLs (RESTful style)
- Namespaced URL names per app
- Trailing slashes (Django default)
- Flat structure preferred over deep nesting
-
-## Background Tasks
- All tasks are run synchronously unless the design specifies background tasks are needed for long operations
- Long operations use Celery tasks
- Use Memcached, task progress pattern: {app}:task:{task_id}:progress
- Tasks are idempotent
- Tasks include retry logic
- Tasks live in app/tasks.py
- RabbitMQ is the Message Broker
- Flower Monitoring: Use for debugging failed tasks
-
-## Testing
- Framework: Django TestCase (not pytest)
- Separate test files per module: test_models.py, test_views.py, test_forms.py
-
-## Frontend Standards
-
-### New Projects (DaisyUI + Tailwind)
- DaisyUI 4 via CDN for component classes
- Tailwind CSS via CDN for utility classes
- Theme management via Themis (DaisyUI `data-theme` attribute)
- All apps extend `themis/base.html` for consistent navigation
- No inline styles or scripts
-
-### Existing Projects (Bootstrap 5)
- Bootstrap 5 via CDN
- Bootstrap Icons via CDN
- Bootswatch for theme variants (if applicable)
- django-bootstrap5 and crispy-bootstrap5 for form rendering
-
-## Preferred Packages
-
-### Core Django
- django>=5.2,<6.0
- django-environ — Environment variables
-
-### Authentication & Security
- django-allauth — User management
- django-allauth-2fa — Two-factor authentication
-
-### API Development
- djangorestframework>=3.14,<4.0 — REST APIs
- drf-spectacular — OpenAPI/Swagger documentation
-
-### Encryption
- cryptography — Fernet encryption for secrets/API keys
-
-### Background Tasks
- celery — Async task queue
- django-celery-progress — Progress bars
- flower — Celery monitoring
-
-### Caching
- pymemcache — Memcached backend
-
-### Database
- dj-database-url — Database URL configuration
- psycopg[binary] — PostgreSQL adapter
- shortuuid — Short UUIDs for public URLs
-
-### Production
- gunicorn — WSGI server
-
-### Shared Apps
- django-heluca-themis — User preferences, themes, key management, navigation
-
-### Deprecated / Removed
- ~~pytz~~ — Use stdlib `zoneinfo` (Python 3.9+, Django 4+)
- ~~Pillow~~ — Only add if your app needs ImageField
- ~~django-heluca-core~~ — Replaced by Themis
-
-## Anti-Patterns to Avoid
-
-### Models
- Don't use `Model.objects.get()` without handling `DoesNotExist`
- Don't use `null=True` on `CharField` or `TextField` (use `blank=True, default=""`)
- Don't use `related_name='+'` unless you have a specific reason
- Don't override `save()` for business logic (use signals or service functions)
- Don't use `auto_now=True` on fields you might need to manually set
- Don't use `ForeignKey` without specifying `on_delete` explicitly
- Don't use `Meta.ordering` on large tables (specify ordering in queries)
-
-### Queries
- Don't query inside loops (N+1 problem)
- Don't use `.all()` when you need a subset
- Don't use raw SQL unless absolutely necessary
- Don't forget `select_related()` and `prefetch_related()`
-
-### Views
- Don't put business logic in views
- Don't use `request.POST.get()` without validation (use forms)
- Don't return sensitive data in error messages
- Don't forget `login_required` decorator on protected views
-
-### Forms
- Don't use `fields = '__all__'` in ModelForm
- Don't trust client-side validation alone
- Don't use `exclude` in ModelForm (use explicit `fields`)
-
-### Templates
- Don't use `{{ variable }}` for URLs (use `{% url %}` tag)
- Don't put logic in templates
- Don't use inline CSS or JavaScript (external files only)
- Don't forget `{% csrf_token %}` in forms
-
-### Security
- Don't store secrets in `settings.py` (use environment variables)
- Don't commit `.env` files to version control
- Don't use `DEBUG=True` in production
- Don't expose sequential IDs in public URLs
- Don't use `mark_safe()` on user-supplied content
- Don't disable CSRF protection
-
-### Imports & Code Style
- Don't use `from module import *`
- Don't use mutable default arguments
- Don't use bare `except:` clauses
- Don't ignore linter warnings without documented reason
-
-### Migrations
- Don't edit migrations that have been deployed
- Don't use `RunPython` without a reverse function
- Don't add non-nullable fields without a default value
-
-### Celery Tasks
- Don't pass model instances to tasks (pass IDs and re-fetch)
- Don't assume tasks run immediately
- Don't forget retry logic for external service calls
--- a/mnemosyne/library/services/search.py
+++ b/mnemosyne/library/services/search.py
@@ -247,7 +247,7 @@ class SearchService:
            CALL db.index.vector.queryNodes('chunk_embedding_index', $top_k, $query_vector)
            YIELD node AS chunk, score
            MATCH (item:Item)-[:HAS_CHUNK]->(chunk)
-            OPTIONAL MATCH (lib:Library)-[:CONTAINS]->(col:Collection)-[:CONTAINS]->(item)
+            MATCH (lib:Library)-[:CONTAINS]->(col:Collection)-[:CONTAINS]->(item)
            WHERE ($library_uid IS NULL OR lib.uid = $library_uid)
              AND ($library_type IS NULL OR lib.library_type = $library_type)
              AND ($collection_uid IS NULL OR col.uid = $collection_uid)
@@ -352,7 +352,7 @@ class SearchService:
            CALL db.index.fulltext.queryNodes('chunk_text_fulltext', $query)
            YIELD node AS chunk, score
            MATCH (item:Item)-[:HAS_CHUNK]->(chunk)
-            OPTIONAL MATCH (lib:Library)-[:CONTAINS]->(col:Collection)-[:CONTAINS]->(item)
+            MATCH (lib:Library)-[:CONTAINS]->(col:Collection)-[:CONTAINS]->(item)
            WHERE ($library_uid IS NULL OR lib.uid = $library_uid)
              AND ($library_type IS NULL OR lib.library_type = $library_type)
              AND ($collection_uid IS NULL OR col.uid = $collection_uid)
@@ -374,15 +374,13 @@ class SearchService:

        try:
            results, _ = db.cypher_query(cypher, params)
-            # Normalize BM25 scores to 0-1 range
-            max_score = max((float(r[7]) for r in results if r[7]), default=1.0)
+            # Keep raw BM25 scores — RRF fuses by rank, not by score magnitude.
            for row in results:
                uid = row[0]
                if not uid:
                    continue
                raw_score = float(row[7]) if row[7] else 0.0
-                normalized = raw_score / max_score if max_score > 0 else 0.0
-                if uid not in candidates or normalized > candidates[uid].score:
+                if uid not in candidates or raw_score > candidates[uid].score:
                    candidates[uid] = SearchCandidate(
                        chunk_uid=uid,
                        text_preview=row[1] or "",
@@ -391,7 +389,7 @@ class SearchService:
                        item_uid=row[4] or "",
                        item_title=row[5] or "",
                        library_type=row[6] or "",
-                        score=normalized,
+                        score=raw_score,
                        source="fulltext",
                    )
        except Exception as exc:
@@ -409,7 +407,7 @@ class SearchService:
            YIELD node AS concept, score AS concept_score
            MATCH (chunk:Chunk)-[:MENTIONS]->(concept)
            MATCH (item:Item)-[:HAS_CHUNK]->(chunk)
-            OPTIONAL MATCH (lib:Library)-[:CONTAINS]->(col:Collection)-[:CONTAINS]->(item)
+            MATCH (lib:Library)-[:CONTAINS]->(:Collection)-[:CONTAINS]->(item)
            WHERE ($library_uid IS NULL OR lib.uid = $library_uid)
              AND ($library_type IS NULL OR lib.library_type = $library_type)
            RETURN chunk.uid AS chunk_uid, chunk.text_preview AS text_preview,
@@ -430,14 +428,13 @@ class SearchService:

        try:
            results, _ = db.cypher_query(cypher, params)
-            max_score = max((float(r[7]) for r in results if r[7]), default=1.0)
+            # Raw scores already include the 0.8 concept downweight from Cypher.
            for row in results:
                uid = row[0]
                if not uid:
                    continue
                raw_score = float(row[7]) if row[7] else 0.0
-                normalized = raw_score / max_score if max_score > 0 else 0.0
-                if uid not in candidates or normalized > candidates[uid].score:
+                if uid not in candidates or raw_score > candidates[uid].score:
                    candidates[uid] = SearchCandidate(
                        chunk_uid=uid,
                        text_preview=row[1] or "",
@@ -446,7 +443,7 @@ class SearchService:
                        item_uid=row[4] or "",
                        item_title=row[5] or "",
                        library_type=row[6] or "",
-                        score=normalized,
+                        score=raw_score,
                        source="fulltext",
                    )
        except Exception as exc:
@@ -476,17 +473,17 @@ class SearchService:
            LIMIT 10
            MATCH (chunk:Chunk)-[:MENTIONS]->(concept)
            MATCH (item:Item)-[:HAS_CHUNK]->(chunk)
-            OPTIONAL MATCH (lib:Library)-[:CONTAINS]->(col:Collection)-[:CONTAINS]->(item)
+            MATCH (lib:Library)-[:CONTAINS]->(:Collection)-[:CONTAINS]->(item)
            WHERE ($library_uid IS NULL OR lib.uid = $library_uid)
              AND ($library_type IS NULL OR lib.library_type = $library_type)
-            WITH chunk, item, lib, concept, concept_score,
-                 count(DISTINCT concept) AS concept_count
-            RETURN DISTINCT chunk.uid AS chunk_uid, chunk.text_preview AS text_preview,
+            WITH chunk, item, lib,
+                 max(concept_score) AS score,
+                 collect(DISTINCT concept.name)[..5] AS concept_names
+            RETURN chunk.uid AS chunk_uid, chunk.text_preview AS text_preview,
                   chunk.chunk_s3_key AS chunk_s3_key, chunk.chunk_index AS chunk_index,
                   item.uid AS item_uid, item.title AS item_title,
                   lib.library_type AS library_type,
-                   concept_score AS score,
-                   collect(concept.name)[..5] AS concept_names
+                   score, concept_names
            ORDER BY score DESC
            LIMIT $limit
        """
@@ -504,16 +501,12 @@ class SearchService:
            logger.error("Graph search failed: %s", exc)
            return []

-        # Normalize scores
-        max_score = max((float(r[7]) for r in results if r[7]), default=1.0)
-
        candidates = []
        for row in results:
            uid = row[0]
            if not uid:
                continue
            raw_score = float(row[7]) if row[7] else 0.0
-            normalized = raw_score / max_score if max_score > 0 else 0.0
            concept_names = row[8] if len(row) > 8 else []

            candidates.append(
@@ -525,7 +518,7 @@ class SearchService:
                    item_uid=row[4] or "",
                    item_title=row[5] or "",
                    library_type=row[6] or "",
-                    score=normalized,
+                    score=raw_score,
                    source="graph",
                    metadata={"concepts": concept_names},
                )
@@ -562,7 +555,7 @@ class SearchService:
            YIELD node AS emb_node, score
            MATCH (img:Image)-[:HAS_EMBEDDING]->(emb_node)
            MATCH (item:Item)-[:HAS_IMAGE]->(img)
-            OPTIONAL MATCH (lib:Library)-[:CONTAINS]->(col:Collection)-[:CONTAINS]->(item)
+            MATCH (lib:Library)-[:CONTAINS]->(:Collection)-[:CONTAINS]->(item)
            WHERE ($library_uid IS NULL OR lib.uid = $library_uid)
              AND ($library_type IS NULL OR lib.library_type = $library_type)
            RETURN img.uid AS image_uid, img.image_type AS image_type,
@@ -642,11 +635,13 @@ class SearchService:

        try:
            client = RerankerClient(reranker_model, user=self.user)
+            # Don't pass top_n — let the reranker score every candidate so
+            # cross-attention can promote items the RRF stage ranked low.
+            # Final trimming to request.limit happens in search().
            reranked = client.rerank(
                query=request.query,
                candidates=candidates_to_rerank,
                instruction=instruction,
-                top_n=request.limit,
                query_image=request.query_image,
            )
            return reranked, reranker_model.name
@@ -660,22 +655,27 @@ class SearchService:
    # Helpers
    # ------------------------------------------------------------------

+    GENERIC_RERANKER_INSTRUCTION = (
+        "Re-rank these passages by relevance to the query."
+    )
+
    def _get_reranker_instruction(
        self, request: SearchRequest, candidates: list[SearchCandidate]
    ) -> str:
        """
        Get the content-type-aware reranker instruction.

-        If scoped to a library or library type, use that type's instruction.
-        If mixed types, use a generic instruction.
+        Scoped queries (by library or library type) use that type's
+        instruction. Unscoped queries — even when results happen to
+        come mostly from one type — use a generic instruction so the
+        reranker is not biased toward the majority type.

        :param request: SearchRequest.
-        :param candidates: Candidates (used to detect dominant library type).
+        :param candidates: Candidates (unused; kept for API stability).
        :returns: Reranker instruction string.
        """
        from library.content_types import get_library_type_config

-        # Use explicit library type from request
        if request.library_type:
            try:
                config = get_library_type_config(request.library_type)
@@ -683,25 +683,12 @@ class SearchService:
            except ValueError:
                pass

-        # Use library UID to look up type
        if request.library_uid:
-            return self._get_library_reranker_instruction(request.library_uid)
+            instruction = self._get_library_reranker_instruction(request.library_uid)
+            if instruction:
+                return instruction

-        # Detect dominant type from candidates
-        type_counts: dict[str, int] = {}
-        for c in candidates:
-            if c.library_type:
-                type_counts[c.library_type] = type_counts.get(c.library_type, 0) + 1
-
-        if type_counts:
-            dominant_type = max(type_counts, key=type_counts.get)
-            try:
-                config = get_library_type_config(dominant_type)
-                return config.get("reranker_instruction", "")
-            except ValueError:
-                pass
-
-        return ""
+        return self.GENERIC_RERANKER_INSTRUCTION

    def _get_library_reranker_instruction(self, library_uid: str) -> str:
        """Get reranker_instruction from a Library node."""
@@ -710,7 +697,12 @@ class SearchService:

            lib = Library.nodes.get(uid=library_uid)
            return lib.reranker_instruction or ""
-        except Exception:
+        except Exception as exc:
+            logger.warning(
+                "Failed to load reranker_instruction for library_uid=%s: %s",
+                library_uid,
+                exc,
+            )
            return ""

    def _get_embedding_instruction(self, library_uid: str) -> str:
@@ -720,7 +712,12 @@ class SearchService:

            lib = Library.nodes.get(uid=library_uid)
            return lib.embedding_instruction or ""
-        except Exception:
+        except Exception as exc:
+            logger.warning(
+                "Failed to load embedding_instruction for library_uid=%s: %s",
+                library_uid,
+                exc,
+            )
            return ""

    def _get_type_embedding_instruction(self, library_type: str) -> str:
--- a/mnemosyne/library/tests/test_search.py
+++ b/mnemosyne/library/tests/test_search.py
@@ -225,8 +225,12 @@ class SearchServiceHelperTest(TestCase):
        instruction = service._get_reranker_instruction(request, [])
        self.assertIn("fiction", instruction.lower())

-    def test_get_reranker_instruction_from_candidates(self):
-        """Detects dominant library type from candidate list."""
+    def test_get_reranker_instruction_generic_for_unscoped(self):
+        """
+        Unscoped queries get the generic instruction even when candidates
+        all share a library_type — type-specific instructions could bias
+        the reranker against minority-type results.
+        """
        service = SearchService()
        request = SearchRequest(query="test")
        candidates = [
@@ -240,10 +244,10 @@ class SearchServiceHelperTest(TestCase):
        ]

        instruction = service._get_reranker_instruction(request, candidates)
-        self.assertIn("technical", instruction.lower())
+        self.assertEqual(instruction, SearchService.GENERIC_RERANKER_INSTRUCTION)

-    def test_get_reranker_instruction_empty_when_no_context(self):
-        """Returns empty when no library type context available."""
+    def test_get_reranker_instruction_generic_when_no_context(self):
+        """Returns the generic instruction when no library scope is set."""
        service = SearchService()
        request = SearchRequest(query="test")
        candidates = [
@@ -256,4 +260,4 @@ class SearchServiceHelperTest(TestCase):
        ]

        instruction = service._get_reranker_instruction(request, candidates)
-        self.assertEqual(instruction, "")
+        self.assertEqual(instruction, SearchService.GENERIC_RERANKER_INSTRUCTION)