# Neo4j - Graph Database Platform

## Overview

Neo4j is a high-performance graph database providing native graph storage and processing. It enables efficient traversal of complex relationships and is used for knowledge graphs, recommendation engines, and connected data analysis.

Deployed with the **APOC plugin** enabled for extended stored procedures and functions.

Two dedicated Neo4j instances run in the Ouranos lab, one per tenant, because Neo4j Community Edition is single-database and tenants cannot safely share label space, vector indexes, or schema migrations:

| Host | Tenant | HTTP Browser | Bolt |
|------|--------|--------------|------|
| `ariel.incus` | Shared / general graph work (Neo4j MCP, exploration) | port 25554 | port 7687 |
| `umbriel.incus` | Mnemosyne (dedicated: `Library`/`Collection`/`Item`/`Chunk`/`Concept`) | port 25555 | port 7687 |

Both hosts run the same Ansible playbook (`neo4j/deploy.yml`) from the same `docker-compose.yml.j2` template, differing only by port and vault password. They run independent Docker Compose stacks with their own named volumes (`neo4j_data`, `neo4j_logs`, `neo4j_plugins`), with no shared state.

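
The host table above can be captured in a small lookup helper for scripts. This is a minimal sketch, not part of the deployment: the function name `neo4j_endpoints` is hypothetical, and it assumes only the hostnames and ports listed in this document.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the Browser URL and Bolt URI for a tenant host.
# Hostnames and ports come from the host table; the function name is illustrative.
neo4j_endpoints() {
    local host="$1" http_port
    case "$host" in
        ariel)   http_port=25554 ;;
        umbriel) http_port=25555 ;;
        *) echo "unknown Neo4j host: $host" >&2; return 1 ;;
    esac
    # Bolt listens on 7687 on both hosts; only the HTTP Browser port differs.
    printf 'browser=http://%s.incus:%s\n' "$host" "$http_port"
    printf 'bolt=bolt://%s.incus:7687\n' "$host"
}
```

Calling `neo4j_endpoints umbriel` prints the two endpoints for the Mnemosyne instance; any other host name returns a non-zero status, so a script cannot point at the wrong tenant by accident.
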
## Architecture

```
┌──────────────┐      ┌──────────────┐      ┌──────────────┐
│   Client     │─────▶│    Neo4j     │◀─────│  Neo4j MCP   │
│  (Browser)   │      │   (Ariel)    │      │  (Miranda)   │
└──────────────┘      └──────────────┘      └──────────────┘
        │                    │
        │                    ▼
        │             ┌──────────────┐
        └────────────▶│ Neo4j Browser│
                      │ HTTP :25554  │
                      └──────────────┘

┌──────────────┐      ┌──────────────┐
│  Mnemosyne   │─────▶│    Neo4j     │
│   (puck)     │ Bolt │  (Umbriel)   │
└──────────────┘      └──────────────┘
                             │
                             ▼
                      ┌──────────────┐
                      │ Neo4j Browser│
                      │ HTTP :25555  │
                      └──────────────┘
```

- **Neo4j Browser (Ariel)**: Web-based query interface on port 25554
- **Neo4j Browser (Umbriel)**: Web-based query interface on port 25555
- **Bolt Protocol**: Binary protocol on port 7687 for high-performance connections (same port on both hosts, since each container has its own network namespace)
- **APOC Plugin**: Extended procedures for import/export, graph algorithms, and utilities
- **Neo4j MCP Servers**: Connect via Bolt from Miranda for AI agent access (Ariel only)
- **Mnemosyne**: Connects via Bolt to Umbriel; does not touch Ariel

## Terraform Resources

### Host Definitions

Both hosts are defined in `terraform/containers.tf`:

| Attribute | ariel | umbriel |
|-----------|-------|---------|
| Image | noble | noble |
| Role | graph_database | graph_database |
| Security Nesting | true | true |
| AppArmor | unconfined | unconfined |
| Description | Neo4j Host - Ethereal graph connections | Neo4j Host (Mnemosyne) - Dusky sprite keeping the memory graph |

### Proxy Devices

| Host | Device Name | Listen | Connect |
|------|-------------|--------|---------|
| ariel | neo4j_ports | tcp:0.0.0.0:25554 | tcp:127.0.0.1:25554 |
| umbriel | neo4j_ports | tcp:0.0.0.0:25555 | tcp:127.0.0.1:25555 |

> Bolt (7687) is not in the Incus proxy device list for either host. It is
> reached directly over the internal `10.10.0.0/24` network by DNS name
> (`ariel.incus:7687`, `umbriel.incus:7687`).

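
Because Bolt bypasses the proxy devices, a quick TCP probe from another container on the internal network can confirm that port 7687 is reachable by DNS name. The following is a rough sketch under stated assumptions: `port_open` and `check_bolt` are hypothetical names, and it relies on bash's `/dev/tcp` redirection and the coreutils `timeout` command being available on the probing host.

```bash
#!/usr/bin/env bash
# Hypothetical probe: succeed if a TCP connection to host:port opens within 3 seconds.
# Uses bash's /dev/tcp redirection and coreutils `timeout`; no Neo4j client is needed.
port_open() {
    local host="$1" port="$2"
    # Inside `bash -c`, $0 is the host and $1 is the port.
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Report Bolt reachability for both instances (hostnames taken from this document).
check_bolt() {
    local host
    for host in ariel.incus umbriel.incus; do
        if port_open "$host" 7687; then
            echo "$host:7687 reachable"
        else
            echo "$host:7687 unreachable"
        fi
    done
}
```

Running `check_bolt` prints one line per host. An "unreachable" result only shows that the TCP connection did not open in time; it does not distinguish a stopped container from a DNS or routing problem.
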
### Dependencies

| Resource | Relationship |
|----------|--------------|
| Prospero | Monitoring stack must exist for Alloy log shipping |
| Miranda | Neo4j MCP servers connect to Neo4j via Bolt |

## Ansible Deployment

### Playbook

```bash
cd ansible
ansible-playbook neo4j/deploy.yml
```

### Files

| File | Purpose |
|------|---------|
| `neo4j/deploy.yml` | Main deployment playbook (runs on both hosts via service detection) |
| `neo4j/docker-compose.yml.j2` | Docker Compose template |
| `alloy/ariel/config.alloy.j2` | Alloy log collection config (Ariel) |
| `alloy/umbriel/config.alloy.j2` | Alloy log collection config (Umbriel) |

### Deployment Steps

1. **Create System User**: `neo4j:neo4j` system group and user
2. **Configure ponos Access**: Add ponos user to neo4j group
3. **Create Directory**: `/srv/neo4j` with proper ownership
4. **Template Compose File**: Apply `docker-compose.yml.j2`
5. **Start Service**: Launch via `docker_compose_v2` module

## Configuration

### Host Variables

Both hosts define the same variable set, differing only in port, syslog port, and vault reference.

`host_vars/ariel.incus.yml`:

| Variable | Value |
|----------|-------|
| `neo4j_auth_password` | `{{ vault_neo4j_auth_password }}` |
| `neo4j_http_port` | `25554` |
| `neo4j_syslog_port` | `22011` |

`host_vars/umbriel.incus.yml`:

| Variable | Value |
|----------|-------|
| `neo4j_auth_password` | `{{ vault_mnemosyne_neo4j_auth_password }}` |
| `neo4j_http_port` | `25555` |
| `neo4j_syslog_port` | `22012` |

Shared variables on both hosts:

| Variable | Description | Default |
|----------|-------------|---------|
| `neo4j_version` | Neo4j Docker image version | `5.26.0` |
| `neo4j_user` | System user | `neo4j` |
| `neo4j_group` | System group | `neo4j` |
| `neo4j_directory` | Installation directory | `/srv/neo4j` |
| `neo4j_auth_user` | Database admin username | `neo4j` |
| `neo4j_bolt_port` | Bolt protocol port | `7687` |
| `neo4j_apoc_unrestricted` | APOC procedures allowed | `apoc.*` |

### Vault Variables (`group_vars/all/vault.yml`)

| Variable | Description |
|----------|-------------|
| `vault_neo4j_auth_password` | Neo4j admin password (Ariel) |
| `vault_mnemosyne_neo4j_auth_password` | Neo4j admin password (Umbriel, dedicated Mnemosyne instance) |

### APOC Plugin Configuration

The APOC (Awesome Procedures on Cypher) plugin is enabled with the following settings:

| Environment Variable | Value | Purpose |
|---------------------|-------|---------|
| `NEO4J_PLUGINS` | `["apoc"]` | Install APOC plugin |
| `NEO4J_apoc_export_file_enabled` | `true` | Allow file exports |
| `NEO4J_apoc_import_file_enabled` | `true` | Allow file imports |
| `NEO4J_apoc_import_file_use__neo4j__config` | `true` | Use Neo4j config for imports |
| `NEO4J_dbms_security_procedures_unrestricted` | `apoc.*` | Allow all APOC procedures |

### Docker Volumes

| Volume | Mount Point | Purpose |
|--------|-------------|---------|
| `neo4j_data` | `/data` | Database files |
| `neo4j_logs` | `/logs` | Application logs |
| `neo4j_plugins` | `/plugins` | APOC and other plugins |

## Monitoring

### Alloy Configuration

**Files:** `ansible/alloy/ariel/config.alloy.j2`, `ansible/alloy/umbriel/config.alloy.j2`

Alloy on each host collects:

- System logs (`/var/log/syslog`, `/var/log/auth.log`)
- Systemd journal
- Neo4j Docker container logs via syslog (Ariel: tcp:127.0.0.1:22011; Umbriel: tcp:127.0.0.1:22012)

### Loki Logs

| Log Source | Labels |
|------------|--------|
| Neo4j container (Ariel) | `{job="neo4j", hostname="ariel.incus"}` |
| Neo4j container (Umbriel) | `{job="neo4j", hostname="umbriel.incus"}` |
| System logs | `{job="syslog", hostname="ariel.incus"}` / `{job="syslog", hostname="umbriel.incus"}` |

### Prometheus Metrics

Host-level metrics collected via Alloy's Unix exporter:

| Metric | Description |
|--------|-------------|
| `node_*` | Standard node exporter metrics |

### Log Collection Flow

```
Neo4j Container (Ariel)   → Syslog (tcp:127.0.0.1:22011) → Alloy → Loki (Prospero)
Neo4j Container (Umbriel) → Syslog (tcp:127.0.0.1:22012) → Alloy → Loki (Prospero)
```

## Operations

### Start/Stop

```bash
# Via Docker Compose
cd /srv/neo4j
docker compose up -d
docker compose down

# Via Ansible
ansible-playbook neo4j/deploy.yml
```

### Health Check

```bash
# HTTP Browser
curl http://ariel.incus:25554

# Bolt connection test: -p takes the password (here from $NEO4J_PASSWORD),
# and the query is passed as a separate positional argument
cypher-shell -a bolt://ariel.incus:7687 -u neo4j -p "$NEO4J_PASSWORD" "RETURN 1"
```

### Logs

```bash
# Docker container logs
docker logs -f neo4j

# Via Loki (Grafana Explore)
{job="neo4j", hostname="ariel.incus"}
```

### Cypher Shell Access

```bash
# SSH to Ariel and exec into container
ssh ariel.incus
# Omit -p so cypher-shell prompts for the password interactively
docker exec -it neo4j cypher-shell -u neo4j
```

### Backup

Neo4j data persists in Docker volumes. Backup procedures:

```bash
# Stop container for consistent backup
docker compose -f /srv/neo4j/docker-compose.yml stop

# Backup volumes
docker run --rm -v neo4j_data:/data -v /backup:/backup alpine \
    tar czf /backup/neo4j_data_$(date +%Y%m%d).tar.gz -C /data .

# Start container
docker compose -f /srv/neo4j/docker-compose.yml up -d
```

### Restore

```bash
# Stop container
docker compose -f /srv/neo4j/docker-compose.yml down

# Remove existing volume
docker volume rm neo4j_data

# Create new volume and restore
docker volume create neo4j_data
docker run --rm -v neo4j_data:/data -v /backup:/backup alpine \
    tar xzf /backup/neo4j_data_YYYYMMDD.tar.gz -C /data

# Start container
docker compose -f /srv/neo4j/docker-compose.yml up -d
```

## Troubleshooting

### Common Issues

| Symptom | Cause | Resolution |
|---------|-------|------------|
| Container won't start | Auth format issue | Check `NEO4J_AUTH` format is `user/password` |
| APOC procedures fail | Security restrictions | Verify `neo4j_apoc_unrestricted` includes procedure |
| Connection refused | Port not exposed | Check Incus proxy device configuration |
| Bolt connection fails | Wrong port | Use port 7687, not the HTTP Browser port (25554 or 25555) |

### Debug Mode

```bash
# View container startup logs
docker logs neo4j

# Check Neo4j internal logs
docker exec neo4j cat /logs/debug.log
```

### Verify APOC Installation

```cypher
CALL apoc.help("apoc") YIELD name, text
RETURN name, text LIMIT 10;
```

## Related Services

### Neo4j MCP Servers (Miranda)

Two MCP servers run on Miranda to provide AI agent access to Neo4j:

| Server | Port | Purpose |
|--------|------|---------|
| neo4j-cypher | 25531 | Direct Cypher query execution |
| neo4j-memory | 25532 | Knowledge graph memory operations |

See [Neo4j MCP documentation](#neo4j-mcp-servers) for deployment details.

## References

- [Neo4j Documentation](https://neo4j.com/docs/)
- [APOC Library Documentation](https://neo4j.com/labs/apoc/)
- [Terraform Practices](../terraform.md)
- [Ansible Practices](../ansible.md)
- [Sandbox Overview](../ouranos.html)