Add Neo4j schema initialization and validation scripts

- Introduced `neo4j-schema-init.py` for creating the foundational schema for the personal knowledge graph used by multiple AI assistants.
- Implemented functionality for creating constraints, indexes, and sample nodes, along with comprehensive testing of the schema.
- Added `neo4j-validate.py` to perform validation checks on the Neo4j knowledge graph, including constraints, indexes, sample nodes, relationships, and junk data detection.
- Enhanced logging for better traceability and debugging during schema initialization and validation processes.
2026-03-06 14:11:52 +00:00
parent b654a04185
commit 7859264359
46 changed files with 11679 additions and 2 deletions


@@ -0,0 +1,75 @@
# Neo4j Knowledge Graph — Engineering Team
You have access to a unified Neo4j knowledge graph shared across fifteen AI assistants (9 personal, 4 work, 2 engineering).
## Principles
1. **Read broadly, write to your domain** — You can read any node; write primarily to your own node types
2. **Always MERGE on `id`** — Check before creating to avoid duplicates
3. **Use consistent IDs** — Format: `{type}_{identifier}_{qualifier}` (e.g., `infra_neo4j_prod`, `proto_mcp_dashboard`)
4. **Always set timestamps** — `created_at` on CREATE, `updated_at` on every SET
5. **Link to existing nodes** — Connect across domains; that's the graph's power
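The ID format in principle 3 can be enforced before a write ever reaches the graph. The following is a minimal Python sketch, not part of the committed scripts: the helper names `make_node_id` and `is_valid_node_id` are hypothetical, and the rule that each segment is lowercase alphanumeric is an assumption drawn from the two example IDs above.

```python
import re

# Assumption: at least three lowercase alphanumeric segments joined by "_",
# inferred from the examples `infra_neo4j_prod` and `proto_mcp_dashboard`.
ID_PATTERN = re.compile(r"[a-z0-9]+(_[a-z0-9]+){2,}")


def make_node_id(node_type: str, identifier: str, qualifier: str) -> str:
    """Build an ID in the {type}_{identifier}_{qualifier} format (hypothetical helper)."""
    parts = []
    for raw in (node_type, identifier, qualifier):
        # Lowercase the segment and collapse any run of other characters to "_".
        slug = re.sub(r"[^a-z0-9]+", "_", raw.strip().lower()).strip("_")
        if not slug:
            raise ValueError(f"empty ID segment from {raw!r}")
        parts.append(slug)
    return "_".join(parts)


def is_valid_node_id(node_id: str) -> bool:
    """Return True if the whole string matches the assumed ID shape."""
    return ID_PATTERN.fullmatch(node_id) is not None
```

Building IDs through one helper keeps casing and separators consistent across assistants, which is what makes MERGE on `id` reliable.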
## Standard Patterns
```cypher
// Check before creating
MATCH (n:NodeType {id: 'your_id'}) RETURN n

// Create with MERGE (idempotent)
MERGE (n:NodeType {id: 'your_id'})
ON CREATE SET n.created_at = datetime()
SET n.name = 'Name', n.updated_at = datetime()

// Link to existing nodes
MATCH (a:TypeA {id: 'a_id'}), (b:TypeB {id: 'b_id'})
MERGE (a)-[:RELATIONSHIP]->(b)
```
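When these patterns are issued from Python, the MERGE statement can be generated with parameters instead of string-interpolated values. This is a sketch under assumptions: `build_merge_query` is a hypothetical helper, not a function from `neo4j-schema-init.py`. The label is validated and interpolated because labels are not ordinarily passed as query parameters in Cypher; all property values travel as parameters.

```python
import re

# Labels are interpolated into the query text, so restrict them to plain identifiers.
LABEL_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

# Properties managed by the pattern itself; callers may not override them.
RESERVED = {"id", "created_at", "updated_at"}


def build_merge_query(label: str, node_id: str, props: dict) -> tuple[str, dict]:
    """Return (cypher, parameters) for an idempotent upsert keyed on `id`."""
    if not LABEL_RE.fullmatch(label):
        raise ValueError(f"unsafe label: {label!r}")
    # Drop reserved keys so `id` and the timestamps stay under the pattern's control.
    clean = {k: v for k, v in props.items() if k not in RESERVED}
    query = (
        f"MERGE (n:{label} {{id: $id}}) "
        "ON CREATE SET n.created_at = datetime() "
        "SET n += $props, n.updated_at = datetime()"
    )
    return query, {"id": node_id, "props": clean}
```

With the official Neo4j Python driver, the pair would typically be passed on as `session.run(query, params)`; the session setup is omitted here because connection details are not given in this document.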
## Engineering Node Ownership
| Assistant | Domain | Owns |
|-----------|--------|------|
| **Scotty** | Infrastructure & Ops | Infrastructure, Incident |
| **Harper** | Prototyping & Hacking | Prototype, Experiment |
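The ownership table can double as a lightweight pre-write check for principle 1. A minimal sketch, assuming the table above is the full list of engineering-owned types; `classify_write` is a hypothetical name, and the result is advisory since the principle says "primarily", not "exclusively".

```python
# Ownership as listed in the table above.
NODE_OWNERS = {
    "Infrastructure": "Scotty",
    "Incident": "Scotty",
    "Prototype": "Harper",
    "Experiment": "Harper",
}


def classify_write(assistant: str, label: str) -> str:
    """Classify a planned write as 'own', 'foreign', or 'unknown' (advisory only)."""
    owner = NODE_OWNERS.get(label)
    if owner is None:
        # Not an engineering-owned type, e.g. a universal or other-team node.
        return "unknown"
    return "own" if owner == assistant else "foreign"
```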
### Scotty's Nodes
| Node | Required | Optional |
|------|----------|----------|
| Infrastructure | id, name, type | status, environment, host, version, notes |
| Incident | id, title, severity | status, date, root_cause, resolution, duration |
### Harper's Nodes
| Node | Required | Optional |
|------|----------|----------|
| Prototype | id, name | status, tech_stack, purpose, outcome, notes |
| Experiment | id, title | hypothesis, result, date, learnings, notes |
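The Required columns above translate directly into a check that can run before any MERGE. The sketch below is illustrative: `missing_required` is a hypothetical helper, and treating an empty string as missing is an assumption rather than a rule stated in the schema.

```python
# Required fields copied from the node tables above.
REQUIRED_FIELDS = {
    "Infrastructure": {"id", "name", "type"},
    "Incident": {"id", "title", "severity"},
    "Prototype": {"id", "name"},
    "Experiment": {"id", "title"},
}


def missing_required(label: str, props: dict) -> list[str]:
    """Return the required fields that are absent or empty, sorted by name."""
    if label not in REQUIRED_FIELDS:
        raise KeyError(f"no required-field rules for label {label!r}")
    # Assumption: None and "" both count as missing.
    return sorted(
        field for field in REQUIRED_FIELDS[label] if props.get(field) in (None, "")
    )
```

An empty list means the node can be written; anything else names exactly what to fill in first.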
## Key Relationships
- Infrastructure -[DEPENDS_ON]-> Infrastructure
- Infrastructure -[HOSTS]-> Project | Prototype
- Incident -[AFFECTED]-> Infrastructure
- Incident -[CAUSED_BY]-> Infrastructure
- Prototype -[DEPLOYED_ON]-> Infrastructure
- Prototype -[SUPPORTS]-> Opportunity
- Prototype -[DEMONSTRATES]-> Technology
- Experiment -[LED_TO]-> Prototype
- Experiment -[VALIDATES]-> MarketTrend
- Prototype -[AUTOMATES]-> Habit | Task
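The relationship list above can be encoded as a lookup table to catch mistyped relationship names or reversed directions before linking. This is a sketch with a hypothetical helper name, `is_listed_relationship`. It covers only the key relationships listed here, so a `False` result means "not in this list", not necessarily "invalid in the full schema".

```python
# (source label, relationship type) -> allowed target labels, from the list above.
KEY_RELATIONSHIPS = {
    ("Infrastructure", "DEPENDS_ON"): {"Infrastructure"},
    ("Infrastructure", "HOSTS"): {"Project", "Prototype"},
    ("Incident", "AFFECTED"): {"Infrastructure"},
    ("Incident", "CAUSED_BY"): {"Infrastructure"},
    ("Prototype", "DEPLOYED_ON"): {"Infrastructure"},
    ("Prototype", "SUPPORTS"): {"Opportunity"},
    ("Prototype", "DEMONSTRATES"): {"Technology"},
    ("Prototype", "AUTOMATES"): {"Habit", "Task"},
    ("Experiment", "LED_TO"): {"Prototype"},
    ("Experiment", "VALIDATES"): {"MarketTrend"},
}


def is_listed_relationship(source: str, rel_type: str, target: str) -> bool:
    """Return True if source -[rel_type]-> target appears in the key relationships."""
    return target in KEY_RELATIONSHIPS.get((source, rel_type), set())
```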
## Cross-Team Reads
- **Work team:** Projects (infrastructure requirements), Opportunities (demo needs), Client SLAs
- **Personal team:** Habits (automation candidates), Goals (tooling support)
- **Universal nodes:** Person, Location, Event, Topic, Goal (shared by all)
## Scotty ↔ Harper Handoff
Harper builds prototypes; Scotty makes them production-grade. Use the messaging system to coordinate handoffs.
## Full Schema Reference
See `docs/neo4j-unified-schema.md` for complete node definitions, all fields, and relationship types.