fix(search): require library match and preserve raw scores for RRF

Replace OPTIONAL MATCH with MATCH for Library-Collection-Item paths to
ensure results are properly scoped to libraries, and remove per-query
score normalization since RRF fuses results by rank rather than score
magnitude.
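
A minimal sketch of why per-query normalization is redundant under Reciprocal Rank Fusion. It is not the project's fusion code: `rrf_fuse`, `rank_by_score`, the sample scores, and the constant `k=60` (the value commonly used for RRF) are all assumptions for illustration. Dividing every score in a list by the same positive maximum leaves the ordering unchanged, so the fused ranking is identical for raw and normalized inputs.

```python
from collections import defaultdict


def rank_by_score(scores):
    """Return ids ordered best-first; ties are broken by id for determinism."""
    return sorted(scores, key=lambda uid: (-scores[uid], uid))


def rrf_fuse(ranked_lists, k=60):
    """Fuse best-first id lists: each id earns 1 / (k + rank) per list."""
    fused = defaultdict(float)
    for ranking in ranked_lists:
        for rank, uid in enumerate(ranking, start=1):
            fused[uid] += 1.0 / (k + rank)
    return sorted(fused, key=lambda uid: (-fused[uid], uid))


# Hypothetical scores: raw BM25 is unbounded, vector similarity is in 0..1.
bm25_raw = {"a": 12.4, "b": 7.1, "c": 3.3}
top = max(bm25_raw.values())
bm25_norm = {uid: score / top for uid, score in bm25_raw.items()}
vector = {"b": 0.91, "c": 0.88, "d": 0.52}

fused_raw = rrf_fuse([rank_by_score(bm25_raw), rank_by_score(vector)])
fused_norm = rrf_fuse([rank_by_score(bm25_norm), rank_by_score(vector)])
print(fused_raw)                # ['b', 'c', 'a', 'd']
print(fused_raw == fused_norm)  # True
```

Only the position of each id within its own list reaches the fusion step, which is why the raw BM25 magnitude can be kept on the candidate without affecting the result.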
2026-04-26 06:35:11 -04:00
parent 4a35aa126f
commit 388b37e471
3 changed files with 55 additions and 360 deletions


@@ -1,306 +0,0 @@
## 🐾 Red Panda Approval™
This project follows Red Panda Approval standards — our gold standard for Django application quality. Code must be elegant, reliable, and maintainable to earn the approval of our adorable red panda judges.
### The 5 Sacred Django Criteria
1. **Fresh Migration Test** — Clean migrations from empty database
2. **Elegant Simplicity** — No unnecessary complexity
3. **Observable & Debuggable** — Proper logging and error handling
4. **Consistent Patterns** — Follow Django conventions
5. **Actually Works** — Passes all checks and serves real user needs
## Environment Standards
- Virtual environment: ~/env/PROJECT/bin/activate
- Use pyproject.toml for project configuration (no setup.py, no requirements.txt)
- Python version: specified in pyproject.toml
- Dependencies: floor-pinned with ceiling (e.g. `Django>=5.2,<6.0`)
### Dependency Pinning
```toml
# Correct — floor pin with ceiling
dependencies = [
"Django>=5.2,<6.0",
"djangorestframework>=3.14,<4.0",
"cryptography>=41.0,<45.0",
]
# Wrong — exact pins in library packages
dependencies = [
"Django==5.2.7", # too strict, breaks downstream
]
```
Exact pins (`==`) are only appropriate in application-level lock files, not in reusable library packages.
## Directory Structure
myproject/ # Git repository root
├── .gitignore
├── README.md
├── pyproject.toml # Project configuration (moved to repo root)
├── docker-compose.yml
├── .env # Docker Compose environment (DATABASE_URL=postgres://...)
├── .env.example
├── project/ # Django project root (manage.py lives here)
│ ├── manage.py
│ ├── Dockerfile
│ ├── .env # Local development environment (DATABASE_URL=sqlite:///...)
│ ├── .env.example
│ │
│ ├── config/ # Django configuration module
│ │ ├── __init__.py
│ │ ├── settings.py
│ │ ├── urls.py
│ │ ├── wsgi.py
│ │ └── asgi.py
│ │
│ ├── accounts/ # Django app
│ │ ├── __init__.py
│ │ ├── models.py
│ │ ├── views.py
│ │ └── urls.py
│ │
│ ├── blog/ # Django app
│ │ ├── __init__.py
│ │ ├── models.py
│ │ ├── views.py
│ │ └── urls.py
│ │
│ ├── static/
│ │ ├── css/
│ │ └── js/
│ │
│ └── templates/
│ └── base.html
├── web/ # Nginx configuration
│ └── nginx.conf
├── db/ # PostgreSQL configuration
│ └── postgresql.conf
└── docs/ # Project documentation
└── index.md
## Settings Structure
- Use a single settings.py file
- Use django-environ or python-dotenv for environment variables
- Never commit .env files to version control
- Provide .env.example with all required variables documented
- Create .gitignore file
- Create a .dockerignore file
## Code Organization
- Imports: PEP 8 ordering (stdlib, third-party, local)
- Type hints on function parameters
- CSS: External .css files only (no inline styles, no embedded `<style>` tags)
- JS: External .js files only (no inline handlers, no embedded `<script>` blocks)
- Maximum file length: 1000 lines
- If a file exceeds 500 lines, consider splitting by domain concept
## Database Conventions
- Migrations run cleanly from empty database
- Never edit deployed migrations
- Use meaningful migration names: --name add_email_to_profile
- One logical change per migration when possible
- Test migrations both forward and backward
### Development vs Production
- Development: SQLite
- Production: PostgreSQL
## Caching
- Expensive queries are cached
- Cache keys follow naming convention
- TTLs are appropriate (not infinite)
- Invalidation is documented
- Key Naming Pattern: {app}:{model}:{identifier}:{field}
## Model Naming
- Model names: singular PascalCase (User, BlogPost, OrderItem)
- Correct English pluralization on related names
- All models have created_at and updated_at
- All models define __str__ and get_absolute_url
- TextChoices used for status fields
- related_name defined on ForeignKey fields
- Related names: plural snake_case with proper English pluralization
## Forms
- Use ModelForm with explicit fields list (never __all__)
## Field Naming
- Foreign keys: singular without _id suffix (author, category, parent)
- Boolean fields: use prefixes (is_active, has_permission, can_edit)
- Date fields: use suffixes (created_at, updated_at, published_on)
- Avoid abbreviations (use description, not desc)
## Required Model Fields
- All models should include:
- created_at = models.DateTimeField(auto_now_add=True)
- updated_at = models.DateTimeField(auto_now=True)
- Consider adding:
- id = models.UUIDField(primary_key=True) for public-facing models
- is_active = models.BooleanField(default=True) for soft deletes
## Indexing
- Add db_index=True to frequently queried fields
- Use Meta.indexes for composite indexes
- Document why each index exists
## Queries
- Use select_related() for foreign keys
- Use prefetch_related() for reverse relations and M2M
- Avoid queries in loops (N+1 problem)
- Use .only() and .defer() for large models
- Add comments explaining complex querysets
## Docstrings
- Use Sphinx style docstrings
- Document all public functions, classes, and modules
- Skip docstrings for obvious one-liners and standard Django overrides
## Views
- Use Function-Based Views (FBVs) exclusively
- Explicit logic is preferred over implicit inheritance
- Extract shared logic into utility functions
## URLs & Identifiers
- Public URLs use short UUIDs (12 characters) via `shortuuid`
- Never expose sequential IDs in URLs (security/enumeration risk)
- Internal references may use standard UUIDs or PKs
## URL Patterns
- Resource-based URLs (RESTful style)
- Namespaced URL names per app
- Trailing slashes (Django default)
- Flat structure preferred over deep nesting
## Background Tasks
- All tasks are run synchronously unless the design specifies background tasks are needed for long operations
- Long operations use Celery tasks
- Use Memcached, task progress pattern: {app}:task:{task_id}:progress
- Tasks are idempotent
- Tasks include retry logic
- Tasks live in app/tasks.py
- RabbitMQ is the Message Broker
- Flower Monitoring: Use for debugging failed tasks
## Testing
- Framework: Django TestCase (not pytest)
- Separate test files per module: test_models.py, test_views.py, test_forms.py
## Frontend Standards
### New Projects (DaisyUI + Tailwind)
- DaisyUI 4 via CDN for component classes
- Tailwind CSS via CDN for utility classes
- Theme management via Themis (DaisyUI `data-theme` attribute)
- All apps extend `themis/base.html` for consistent navigation
- No inline styles or scripts
### Existing Projects (Bootstrap 5)
- Bootstrap 5 via CDN
- Bootstrap Icons via CDN
- Bootswatch for theme variants (if applicable)
- django-bootstrap5 and crispy-bootstrap5 for form rendering
## Preferred Packages
### Core Django
- django>=5.2,<6.0
- django-environ — Environment variables
### Authentication & Security
- django-allauth — User management
- django-allauth-2fa — Two-factor authentication
### API Development
- djangorestframework>=3.14,<4.0 — REST APIs
- drf-spectacular — OpenAPI/Swagger documentation
### Encryption
- cryptography — Fernet encryption for secrets/API keys
### Background Tasks
- celery — Async task queue
- django-celery-progress — Progress bars
- flower — Celery monitoring
### Caching
- pymemcache — Memcached backend
### Database
- dj-database-url — Database URL configuration
- psycopg[binary] — PostgreSQL adapter
- shortuuid — Short UUIDs for public URLs
### Production
- gunicorn — WSGI server
### Shared Apps
- django-heluca-themis — User preferences, themes, key management, navigation
### Deprecated / Removed
- ~~pytz~~ — Use stdlib `zoneinfo` (Python 3.9+, Django 4+)
- ~~Pillow~~ — Only add if your app needs ImageField
- ~~django-heluca-core~~ — Replaced by Themis
## Anti-Patterns to Avoid
### Models
- Don't use `Model.objects.get()` without handling `DoesNotExist`
- Don't use `null=True` on `CharField` or `TextField` (use `blank=True, default=""`)
- Don't use `related_name='+'` unless you have a specific reason
- Don't override `save()` for business logic (use signals or service functions)
- Don't use `auto_now=True` on fields you might need to manually set
- Don't use `ForeignKey` without specifying `on_delete` explicitly
- Don't use `Meta.ordering` on large tables (specify ordering in queries)
### Queries
- Don't query inside loops (N+1 problem)
- Don't use `.all()` when you need a subset
- Don't use raw SQL unless absolutely necessary
- Don't forget `select_related()` and `prefetch_related()`
### Views
- Don't put business logic in views
- Don't use `request.POST.get()` without validation (use forms)
- Don't return sensitive data in error messages
- Don't forget `login_required` decorator on protected views
### Forms
- Don't use `fields = '__all__'` in ModelForm
- Don't trust client-side validation alone
- Don't use `exclude` in ModelForm (use explicit `fields`)
### Templates
- Don't use `{{ variable }}` for URLs (use `{% url %}` tag)
- Don't put logic in templates
- Don't use inline CSS or JavaScript (external files only)
- Don't forget `{% csrf_token %}` in forms
### Security
- Don't store secrets in `settings.py` (use environment variables)
- Don't commit `.env` files to version control
- Don't use `DEBUG=True` in production
- Don't expose sequential IDs in public URLs
- Don't use `mark_safe()` on user-supplied content
- Don't disable CSRF protection
### Imports & Code Style
- Don't use `from module import *`
- Don't use mutable default arguments
- Don't use bare `except:` clauses
- Don't ignore linter warnings without documented reason
### Migrations
- Don't edit migrations that have been deployed
- Don't use `RunPython` without a reverse function
- Don't add non-nullable fields without a default value
### Celery Tasks
- Don't pass model instances to tasks (pass IDs and re-fetch)
- Don't assume tasks run immediately
- Don't forget retry logic for external service calls


@@ -247,7 +247,7 @@ class SearchService:
             CALL db.index.vector.queryNodes('chunk_embedding_index', $top_k, $query_vector)
             YIELD node AS chunk, score
             MATCH (item:Item)-[:HAS_CHUNK]->(chunk)
-            OPTIONAL MATCH (lib:Library)-[:CONTAINS]->(col:Collection)-[:CONTAINS]->(item)
+            MATCH (lib:Library)-[:CONTAINS]->(col:Collection)-[:CONTAINS]->(item)
             WHERE ($library_uid IS NULL OR lib.uid = $library_uid)
               AND ($library_type IS NULL OR lib.library_type = $library_type)
               AND ($collection_uid IS NULL OR col.uid = $collection_uid)
@@ -352,7 +352,7 @@ class SearchService:
             CALL db.index.fulltext.queryNodes('chunk_text_fulltext', $query)
             YIELD node AS chunk, score
             MATCH (item:Item)-[:HAS_CHUNK]->(chunk)
-            OPTIONAL MATCH (lib:Library)-[:CONTAINS]->(col:Collection)-[:CONTAINS]->(item)
+            MATCH (lib:Library)-[:CONTAINS]->(col:Collection)-[:CONTAINS]->(item)
             WHERE ($library_uid IS NULL OR lib.uid = $library_uid)
               AND ($library_type IS NULL OR lib.library_type = $library_type)
               AND ($collection_uid IS NULL OR col.uid = $collection_uid)
@@ -374,15 +374,13 @@ class SearchService:
         try:
             results, _ = db.cypher_query(cypher, params)
-            # Normalize BM25 scores to 0-1 range
-            max_score = max((float(r[7]) for r in results if r[7]), default=1.0)
+            # Keep raw BM25 scores — RRF fuses by rank, not by score magnitude.
             for row in results:
                 uid = row[0]
                 if not uid:
                     continue
                 raw_score = float(row[7]) if row[7] else 0.0
-                normalized = raw_score / max_score if max_score > 0 else 0.0
-                if uid not in candidates or normalized > candidates[uid].score:
+                if uid not in candidates or raw_score > candidates[uid].score:
                     candidates[uid] = SearchCandidate(
                         chunk_uid=uid,
                         text_preview=row[1] or "",
@@ -391,7 +389,7 @@ class SearchService:
                         item_uid=row[4] or "",
                         item_title=row[5] or "",
                         library_type=row[6] or "",
-                        score=normalized,
+                        score=raw_score,
                         source="fulltext",
                     )
         except Exception as exc:
@@ -409,7 +407,7 @@ class SearchService:
             YIELD node AS concept, score AS concept_score
             MATCH (chunk:Chunk)-[:MENTIONS]->(concept)
             MATCH (item:Item)-[:HAS_CHUNK]->(chunk)
-            OPTIONAL MATCH (lib:Library)-[:CONTAINS]->(col:Collection)-[:CONTAINS]->(item)
+            MATCH (lib:Library)-[:CONTAINS]->(:Collection)-[:CONTAINS]->(item)
             WHERE ($library_uid IS NULL OR lib.uid = $library_uid)
               AND ($library_type IS NULL OR lib.library_type = $library_type)
             RETURN chunk.uid AS chunk_uid, chunk.text_preview AS text_preview,
@@ -430,14 +428,13 @@ class SearchService:
         try:
             results, _ = db.cypher_query(cypher, params)
-            max_score = max((float(r[7]) for r in results if r[7]), default=1.0)
+            # Raw scores already include the 0.8 concept downweight from Cypher.
             for row in results:
                 uid = row[0]
                 if not uid:
                     continue
                 raw_score = float(row[7]) if row[7] else 0.0
-                normalized = raw_score / max_score if max_score > 0 else 0.0
-                if uid not in candidates or normalized > candidates[uid].score:
+                if uid not in candidates or raw_score > candidates[uid].score:
                     candidates[uid] = SearchCandidate(
                         chunk_uid=uid,
                         text_preview=row[1] or "",
@@ -446,7 +443,7 @@ class SearchService:
                         item_uid=row[4] or "",
                         item_title=row[5] or "",
                         library_type=row[6] or "",
-                        score=normalized,
+                        score=raw_score,
                         source="fulltext",
                     )
         except Exception as exc:
@@ -476,17 +473,17 @@ class SearchService:
             LIMIT 10
             MATCH (chunk:Chunk)-[:MENTIONS]->(concept)
             MATCH (item:Item)-[:HAS_CHUNK]->(chunk)
-            OPTIONAL MATCH (lib:Library)-[:CONTAINS]->(col:Collection)-[:CONTAINS]->(item)
+            MATCH (lib:Library)-[:CONTAINS]->(:Collection)-[:CONTAINS]->(item)
             WHERE ($library_uid IS NULL OR lib.uid = $library_uid)
               AND ($library_type IS NULL OR lib.library_type = $library_type)
-            WITH chunk, item, lib, concept, concept_score,
-                 count(DISTINCT concept) AS concept_count
-            RETURN DISTINCT chunk.uid AS chunk_uid, chunk.text_preview AS text_preview,
+            WITH chunk, item, lib,
+                 max(concept_score) AS score,
+                 collect(DISTINCT concept.name)[..5] AS concept_names
+            RETURN chunk.uid AS chunk_uid, chunk.text_preview AS text_preview,
                    chunk.chunk_s3_key AS chunk_s3_key, chunk.chunk_index AS chunk_index,
                    item.uid AS item_uid, item.title AS item_title,
                    lib.library_type AS library_type,
-                   concept_score AS score,
-                   collect(concept.name)[..5] AS concept_names
+                   score, concept_names
             ORDER BY score DESC
             LIMIT $limit
         """
@@ -504,16 +501,12 @@ class SearchService:
             logger.error("Graph search failed: %s", exc)
             return []
-        # Normalize scores
-        max_score = max((float(r[7]) for r in results if r[7]), default=1.0)
         candidates = []
         for row in results:
             uid = row[0]
             if not uid:
                 continue
             raw_score = float(row[7]) if row[7] else 0.0
-            normalized = raw_score / max_score if max_score > 0 else 0.0
             concept_names = row[8] if len(row) > 8 else []
             candidates.append(
@@ -525,7 +518,7 @@ class SearchService:
                     item_uid=row[4] or "",
                     item_title=row[5] or "",
                     library_type=row[6] or "",
-                    score=normalized,
+                    score=raw_score,
                     source="graph",
                     metadata={"concepts": concept_names},
                 )
@@ -562,7 +555,7 @@ class SearchService:
             YIELD node AS emb_node, score
             MATCH (img:Image)-[:HAS_EMBEDDING]->(emb_node)
             MATCH (item:Item)-[:HAS_IMAGE]->(img)
-            OPTIONAL MATCH (lib:Library)-[:CONTAINS]->(col:Collection)-[:CONTAINS]->(item)
+            MATCH (lib:Library)-[:CONTAINS]->(:Collection)-[:CONTAINS]->(item)
             WHERE ($library_uid IS NULL OR lib.uid = $library_uid)
               AND ($library_type IS NULL OR lib.library_type = $library_type)
             RETURN img.uid AS image_uid, img.image_type AS image_type,
@@ -642,11 +635,13 @@ class SearchService:
         try:
             client = RerankerClient(reranker_model, user=self.user)
+            # Don't pass top_n — let the reranker score every candidate so
+            # cross-attention can promote items the RRF stage ranked low.
+            # Final trimming to request.limit happens in search().
             reranked = client.rerank(
                 query=request.query,
                 candidates=candidates_to_rerank,
                 instruction=instruction,
-                top_n=request.limit,
                 query_image=request.query_image,
             )
             return reranked, reranker_model.name
@@ -660,22 +655,27 @@ class SearchService:
     # Helpers
     # ------------------------------------------------------------------
+    GENERIC_RERANKER_INSTRUCTION = (
+        "Re-rank these passages by relevance to the query."
+    )
     def _get_reranker_instruction(
         self, request: SearchRequest, candidates: list[SearchCandidate]
     ) -> str:
         """
         Get the content-type-aware reranker instruction.
-        If scoped to a library or library type, use that type's instruction.
-        If mixed types, use a generic instruction.
+        Scoped queries (by library or library type) use that type's
+        instruction. Unscoped queries — even when results happen to
+        come mostly from one type — use a generic instruction so the
+        reranker is not biased toward the majority type.
         :param request: SearchRequest.
-        :param candidates: Candidates (used to detect dominant library type).
+        :param candidates: Candidates (unused; kept for API stability).
         :returns: Reranker instruction string.
         """
         from library.content_types import get_library_type_config
-        # Use explicit library type from request
         if request.library_type:
             try:
                 config = get_library_type_config(request.library_type)
@@ -683,25 +683,12 @@ class SearchService:
             except ValueError:
                 pass
-        # Use library UID to look up type
         if request.library_uid:
-            return self._get_library_reranker_instruction(request.library_uid)
+            instruction = self._get_library_reranker_instruction(request.library_uid)
+            if instruction:
+                return instruction
-        # Detect dominant type from candidates
-        type_counts: dict[str, int] = {}
-        for c in candidates:
-            if c.library_type:
-                type_counts[c.library_type] = type_counts.get(c.library_type, 0) + 1
-        if type_counts:
-            dominant_type = max(type_counts, key=type_counts.get)
-            try:
-                config = get_library_type_config(dominant_type)
-                return config.get("reranker_instruction", "")
-            except ValueError:
-                pass
-        return ""
+        return self.GENERIC_RERANKER_INSTRUCTION
     def _get_library_reranker_instruction(self, library_uid: str) -> str:
         """Get reranker_instruction from a Library node."""
@@ -710,7 +697,12 @@ class SearchService:
             lib = Library.nodes.get(uid=library_uid)
             return lib.reranker_instruction or ""
-        except Exception:
+        except Exception as exc:
+            logger.warning(
+                "Failed to load reranker_instruction for library_uid=%s: %s",
+                library_uid,
+                exc,
+            )
             return ""
     def _get_embedding_instruction(self, library_uid: str) -> str:
@@ -720,7 +712,12 @@ class SearchService:
             lib = Library.nodes.get(uid=library_uid)
             return lib.embedding_instruction or ""
-        except Exception:
+        except Exception as exc:
+            logger.warning(
+                "Failed to load embedding_instruction for library_uid=%s: %s",
+                library_uid,
+                exc,
+            )
             return ""
     def _get_type_embedding_instruction(self, library_type: str) -> str:


@@ -225,8 +225,12 @@ class SearchServiceHelperTest(TestCase):
         instruction = service._get_reranker_instruction(request, [])
         self.assertIn("fiction", instruction.lower())
-    def test_get_reranker_instruction_from_candidates(self):
-        """Detects dominant library type from candidate list."""
+    def test_get_reranker_instruction_generic_for_unscoped(self):
+        """
+        Unscoped queries get the generic instruction even when candidates
+        all share a library_type — type-specific instructions could bias
+        the reranker against minority-type results.
+        """
         service = SearchService()
         request = SearchRequest(query="test")
         candidates = [
@@ -240,10 +244,10 @@ class SearchServiceHelperTest(TestCase):
         ]
         instruction = service._get_reranker_instruction(request, candidates)
-        self.assertIn("technical", instruction.lower())
+        self.assertEqual(instruction, SearchService.GENERIC_RERANKER_INSTRUCTION)
-    def test_get_reranker_instruction_empty_when_no_context(self):
-        """Returns empty when no library type context available."""
+    def test_get_reranker_instruction_generic_when_no_context(self):
+        """Returns the generic instruction when no library scope is set."""
         service = SearchService()
         request = SearchRequest(query="test")
         candidates = [
@@ -256,4 +260,4 @@ class SearchServiceHelperTest(TestCase):
         ]
         instruction = service._get_reranker_instruction(request, candidates)
-        self.assertEqual(instruction, "")
+        self.assertEqual(instruction, SearchService.GENERIC_RERANKER_INSTRUCTION)