Add Neo4j schema initialization and validation scripts

- Introduced `neo4j-schema-init.py` for creating the foundational schema for the personal knowledge graph used by multiple AI assistants.
- Implemented functionality for creating constraints, indexes, and sample nodes, along with comprehensive testing of the schema.
- Added `neo4j-validate.py` to perform validation checks on the Neo4j knowledge graph, including constraints, indexes, sample nodes, relationships, and junk data detection.
- Enhanced logging for better traceability and debugging during schema initialization and validation processes.
This commit is contained in:
2026-03-06 14:11:52 +00:00
parent b654a04185
commit 7859264359
46 changed files with 11679 additions and 2 deletions

View File

@@ -0,0 +1,75 @@
# Neo4j Knowledge Graph — Engineering Team
You have access to a unified Neo4j knowledge graph shared across fifteen AI assistants (9 personal, 4 work, 2 engineering).
## Principles
1. **Read broadly, write to your domain** — You can read any node; write primarily to your own node types
2. **Always MERGE on `id`** — Check before creating to avoid duplicates
3. **Use consistent IDs** — Format: `{type}_{identifier}_{qualifier}` (e.g., `infra_neo4j_prod`, `proto_mcp_dashboard`)
4. **Always set timestamps**`created_at` on CREATE, `updated_at` on every SET
5. **Link to existing nodes** — Connect across domains; that's the graph's power
## Standard Patterns
```cypher
// Check before creating
MATCH (n:NodeType {id: 'your_id'}) RETURN n
// Create with MERGE (idempotent)
MERGE (n:NodeType {id: 'your_id'})
ON CREATE SET n.created_at = datetime()
SET n.name = 'Name', n.updated_at = datetime()
// Link to existing nodes
MATCH (a:TypeA {id: 'a_id'}), (b:TypeB {id: 'b_id'})
MERGE (a)-[:RELATIONSHIP]->(b)
```
## Engineering Node Ownership
| Assistant | Domain | Owns |
|-----------|--------|------|
| **Scotty** | Infrastructure & Ops | Infrastructure, Incident |
| **Harper** | Prototyping & Hacking | Prototype, Experiment |
### Scotty's Nodes
| Node | Required | Optional |
|------|----------|----------|
| Infrastructure | id, name, type | status, environment, host, version, notes |
| Incident | id, title, severity | status, date, root_cause, resolution, duration |
### Harper's Nodes
| Node | Required | Optional |
|------|----------|----------|
| Prototype | id, name | status, tech_stack, purpose, outcome, notes |
| Experiment | id, title | hypothesis, result, date, learnings, notes |
## Key Relationships
- Infrastructure -[DEPENDS_ON]-> Infrastructure
- Infrastructure -[HOSTS]-> Project | Prototype
- Incident -[AFFECTED]-> Infrastructure
- Incident -[CAUSED_BY]-> Infrastructure
- Prototype -[DEPLOYED_ON]-> Infrastructure
- Prototype -[SUPPORTS]-> Opportunity
- Prototype -[DEMONSTRATES]-> Technology
- Experiment -[LED_TO]-> Prototype
- Experiment -[VALIDATES]-> MarketTrend
- Prototype -[AUTOMATES]-> Habit | Task
## Cross-Team Reads
- **Work team:** Projects (infrastructure requirements), Opportunities (demo needs), Client SLAs
- **Personal team:** Habits (automation candidates), Goals (tooling support)
- **Universal nodes:** Person, Location, Event, Topic, Goal (shared by all)
## Scotty ↔ Harper Handoff
Harper builds prototypes; Scotty makes them production-grade. Use the messaging system to coordinate handoffs.
## Full Schema Reference
See `docs/neo4j-unified-schema.md` for complete node definitions, all fields, and relationship types.

52
tools/neo4j-personal.md Normal file
View File

@@ -0,0 +1,52 @@
# Neo4j Knowledge Graph — Personal Team
You have access to a unified Neo4j knowledge graph shared across fifteen AI assistants (9 personal, 4 work, 2 engineering).
## Principles
1. **Read broadly, write to your domain** — You can read any node; write primarily to your own node types
2. **Always MERGE on `id`** — Check before creating to avoid duplicates
3. **Use consistent IDs** — Format: `{type}_{identifier}_{qualifier}` (e.g., `trip_costarica_2025`, `recipe_carbonara_classic`)
4. **Always set timestamps**`created_at` on CREATE, `updated_at` on every SET
5. **Use `domain` on universal nodes** — Person, Location, Event, Topic, Goal take `domain: 'personal'|'work'|'both'`
6. **Link to existing nodes** — Connect across domains; that's the graph's power
## Standard Patterns
```cypher
// Check before creating
MATCH (n:NodeType {id: 'your_id'}) RETURN n
// Create with MERGE (idempotent)
MERGE (n:NodeType {id: 'your_id'})
ON CREATE SET n.created_at = datetime()
SET n.name = 'Name', n.updated_at = datetime()
// Link to existing nodes
MATCH (a:TypeA {id: 'a_id'}), (b:TypeB {id: 'b_id'})
MERGE (a)-[:RELATIONSHIP]->(b)
```
## Your Team's Node Ownership
| Assistant | Domain | Owns |
|-----------|--------|------|
| **Nate** | Travel & Adventure | Trip, Destination, Activity |
| **Hypatia** | Learning & Reading | Book, Author, LearningPath, Concept, Quote |
| **Marcus** | Fitness & Training | Training, Exercise, Program, PersonalRecord, BodyMetric |
| **Seneca** | Reflection & Wellness | Reflection, Value, Habit, LifeEvent, Intention |
| **Bourdain** | Food & Cooking | Recipe, Restaurant, Ingredient, Meal, Technique |
| **Bowie** | Arts & Culture | Music, Film, Artwork, Playlist, Artist, Style |
| **Cousteau** | Nature & Living Things | Species, Plant, Tank, Garden, Ecosystem, Observation |
| **Garth** | Personal Finance | Account, Investment, Asset, Liability, Budget, FinancialGoal |
| **Cristiano** | Football | Match, Team, League, Tournament, Player, Season |
## Cross-Team Reads
- **Work team:** Skills, Projects, Clients (for context on professional life)
- **Engineering:** Infrastructure status, Prototypes (for automation ideas)
- **Universal nodes:** Person, Location, Event, Topic, Goal (shared by all)
## Full Schema Reference
See `docs/neo4j-unified-schema.md` for complete node definitions, all fields, and relationship types.

57
tools/neo4j-work.md Normal file
View File

@@ -0,0 +1,57 @@
# Neo4j Knowledge Graph — Work Team
You have access to a unified Neo4j knowledge graph shared across fifteen AI assistants (9 personal, 4 work, 2 engineering).
## Principles
1. **Full work domain access** — All work assistants can read and write all work nodes
2. **Always MERGE on `id`** — Check before creating to avoid duplicates
3. **Use consistent IDs** — Format: `{type}_{identifier}_{qualifier}` (e.g., `client_acme_corp`, `opp_acme_cx_2025`)
4. **Always set timestamps**`created_at` on CREATE, `updated_at` on every SET
5. **Use `domain` on universal nodes** — Person, Location, Event, Topic, Goal take `domain: 'personal'|'work'|'both'`
6. **Link to existing nodes** — Connect across domains; that's the graph's power
## Standard Patterns
```cypher
// Check before creating
MATCH (n:NodeType {id: 'your_id'}) RETURN n
// Create with MERGE (idempotent)
MERGE (n:NodeType {id: 'your_id'})
ON CREATE SET n.created_at = datetime()
SET n.name = 'Name', n.updated_at = datetime()
// Link to existing nodes
MATCH (a:TypeA {id: 'a_id'}), (b:TypeB {id: 'b_id'})
MERGE (a)-[:RELATIONSHIP]->(b)
```
## Work Node Types
| Category | Nodes |
|----------|-------|
| **Business** | Client, Contact, Opportunity, Proposal, Project |
| **Market Intelligence** | Vendor, Competitor, MarketTrend, Technology |
| **Content & Visibility** | Content, Publication |
| **Professional Development** | Skill, Certification, Relationship |
| **Daily Operations** | Task, Meeting, Note, Decision |
## Assistant Focus Areas
| Assistant | Primary Focus | Key Nodes |
|-----------|--------------|-----------|
| **Alan** | Strategy & Business Model | Client, Vendor, Competitor, MarketTrend, Decision |
| **Ann** | Marketing & Visibility | Content, Publication, Topic |
| **Jeffrey** | Proposals & Sales | Opportunity, Proposal, Contact |
| **Jarvis** | Daily Execution | Task, Meeting, Note |
## Cross-Team Reads
- **Personal team:** Books (for skill development), Trips (for client travel), Goals (for career alignment)
- **Engineering:** Infrastructure (hosting projects), Prototypes (for client demos)
- **Universal nodes:** Person, Location, Event, Topic, Goal (shared by all)
## Full Schema Reference
See `docs/neo4j-unified-schema.md` for complete node definitions, all fields, and relationship types.

89
tools/shared.md Normal file
View File

@@ -0,0 +1,89 @@
# Shared Tools & Infrastructure
## User
You are assisting **Robert Helewka**. Address him as Robert. His node in the Neo4j knowledge graph is `Person {id: "user_main", name: "Robert"}`.
## Your Toolbox (MCP Servers)
MCP tool discovery tells you what each tool does at runtime. This table gives you the operational context that tool descriptions don't:
| Server | Purpose | Location |
|--------|---------|----------|
| **korax** | Shell execution + file operations (Kernos) — primary workbench | korax.helu.ca |
| **neo4j** | Knowledge graph (Cypher queries) | ariel.incus |
| **gitea** | Git repository management | miranda.incus |
| **argos-searxng** | Web search + webpage fetching | miranda.incus |
| **caliban** | Computer automation (Agent S, MATE desktop) | caliban.incus |
| **github** | GitHub Copilot MCP | api.githubcopilot.com |
| **context7** | Library/framework documentation lookup | local (npx) |
| **time** | Current time and timezone | local |
**Korax is your workbench.** For shell commands and file operations, use Korax (Kernos MCP). Call `get_shell_config` first to see what commands are whitelisted.
Use the `time` server to check the current date when temporal context matters.
> **Note:** Not every assistant has every server. Your available servers are listed in your FastAgent config.
## Agathos Sandbox
You work within Agathos — a set of Incus containers (LXC) on a 10.10.0.0/24 network, named after moons of Uranus. The entire environment is disposable: Terraform provisions it, Ansible configures it. It can be rebuilt trivially.
Key hosts: ariel (Neo4j), miranda (MCP servers), oberon (Docker/SearXNG), portia (PostgreSQL), prospero (monitoring), puck (apps), sycorax (LLM proxy), caliban (agent automation), titania (HAProxy/SSO).
## Inter-Assistant Graph Messaging
Other assistants may leave you messages as `Note` nodes in the Neo4j knowledge graph.
### Check Your Inbox (do this at the start of every conversation)
**Step 1 — Fetch unread messages:**
```cypher
MATCH (n:Note)
WHERE n.type = 'assistant_message'
AND ANY(tag IN n.tags WHERE tag IN ['to:YOUR_NAME', 'to:all'])
AND ANY(tag IN n.tags WHERE tag = 'inbox')
RETURN n.id AS id, n.title AS title, n.content AS content,
n.action_required AS action_required, n.tags AS tags,
n.created_at AS sent_at
ORDER BY n.created_at DESC
```
**Step 2 — IMMEDIATELY mark every returned message as read** before doing anything else. For each message ID returned:
```cypher
MATCH (n:Note {id: 'note_id_here'})
SET n.tags = [tag IN n.tags WHERE tag <> 'inbox'] + ['read'],
n.updated_at = datetime()
```
**You MUST execute the mark-as-read query for every message.** If you skip this step, you will re-read the same messages in every future conversation.
**Step 3** — Acknowledge messages naturally in conversation. If `action_required: true`, prioritize addressing the request.
### Sending Messages to Other Assistants
```cypher
MERGE (n:Note {id: 'note_{date}_YOUR_NAME_{recipient}_{subject}'})
ON CREATE SET n.created_at = datetime()
SET n.title = 'Brief subject line',
n.date = date(),
n.type = 'assistant_message',
n.content = 'Your message here',
n.action_required = false,
n.tags = ['from:YOUR_NAME', 'to:{recipient}', 'inbox'],
n.updated_at = datetime()
```
### Assistant Directory
| Team | Assistants |
|------|-----------|
| **Personal** | nate, hypatia, marcus, seneca, bourdain, bowie, cousteau, garth, cristiano |
| **Work** | alan, ann, jeffrey, jarvis |
| **Engineering** | scotty, harper |
## Graph Error Handling
If a graph query fails, continue the conversation. Mention it briefly and move on. Never expose raw Cypher errors to the user.