# Ouranos Lab
Infrastructure-as-Code project managing the Ouranos Lab — a development sandbox at ouranos.helu.ca. Uses Terraform for container provisioning and Ansible for configuration management, themed around the moons of Uranus.
## Project Overview
| Component | Purpose |
|---|---|
| Terraform | Provisions 10 specialised Incus containers (LXC) with DNS-resolved networking, security policies, and resource dependencies |
| Ansible | Deploys Docker, databases (PostgreSQL, Neo4j), observability stack (Prometheus, Grafana, Loki), and application runtimes across all hosts |
**DNS Domain**: Incus resolves containers via the `.incus` domain suffix (e.g., `oberon.incus`, `portia.incus`). IPv4 addresses are dynamically assigned — always use DNS names, never hardcode IPs.
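Because addresses change, anything that references a container should derive its `.incus` name rather than store an IP. A minimal sketch of the convention; the actual resolution check is left as a comment since it requires access to the Incus network:

```shell
# Build the .incus FQDN for a container name (naming convention only).
fqdn() { printf '%s.incus\n' "$1"; }

fqdn oberon   # prints: oberon.incus
fqdn portia   # prints: portia.incus

# On a machine attached to the Incus network, verify resolution with:
#   getent hosts "$(fqdn oberon)"
```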
## Uranian Host Architecture
All containers are named after moons of Uranus and resolved via the `.incus` DNS suffix.
| Name | Role | Description | Nesting |
|---|---|---|---|
| ariel | graph_database | Neo4j — Ethereal graph connections | ✔ |
| caliban | agent_automation | Agent S MCP Server with MATE Desktop | ✔ |
| miranda | mcp_docker_host | Dedicated Docker Host for MCP Servers | ✔ |
| oberon | container_orchestration | Docker Host — MCP Switchboard, RabbitMQ, Open WebUI | ✔ |
| portia | database | PostgreSQL — Relational database host | ❌ |
| prospero | observability | PPLG stack — Prometheus, Grafana, Loki, PgAdmin | ❌ |
| puck | application_runtime | Python App Host — JupyterLab, Django apps, Gitea Runner | ✔ |
| rosalind | collaboration | Gitea, LobeChat, Nextcloud, AnythingLLM | ✔ |
| sycorax | language_models | Arke LLM Proxy | ✔ |
| titania | proxy_sso | HAProxy TLS termination + Casdoor SSO | ✔ |
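The Nesting column corresponds to the Incus `security.nesting` flag set at provisioning time (see Important Notes). As a hypothetical illustration only — resource and attribute names depend on the Terraform Incus provider version, and the real definitions live in this project's `*.tf` files — a nested container might be declared like this:

```hcl
resource "incus_instance" "oberon" {
  name  = "oberon"
  image = "ubuntu/24.04"          # hypothetical image; see the actual *.tf files

  config = {
    "security.nesting" = "true"   # required for nested Docker (see Important Notes)
  }
}
```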
### oberon — Container Orchestration
King of the Fairies orchestrating containers and managing MCP infrastructure.
- Docker engine
- MCP Switchboard (port 22785) — Django app routing MCP tool calls
- RabbitMQ message queue
- Open WebUI LLM interface (port 22088, PostgreSQL backend on Portia)
- SearXNG privacy search (port 22083, behind OAuth2-Proxy)
- smtp4dev SMTP test server (port 22025)
### portia — Relational Database
Intelligent and resourceful — the reliability of relational databases.
- PostgreSQL 17 (port 5432)
- Databases: `arke`, `anythingllm`, `gitea`, `hass`, `lobechat`, `mcp_switchboard`, `nextcloud`, `openwebui`, `spelunker`
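All of these databases live on one instance, so connection strings differ only by database name. A sketch that prints the resulting URLs (credentials omitted; the actual `psql` check is commented out because it needs network access and Vault credentials):

```shell
# Build the connection URL for a database hosted on portia (illustration only;
# credentials omitted).
pg_url() { printf 'postgresql://portia.incus:5432/%s\n' "$1"; }

for db in arke anythingllm gitea hass lobechat mcp_switchboard nextcloud openwebui spelunker; do
  pg_url "$db"
done

# From a host on the Incus network, with credentials from Ansible Vault:
#   psql "$(pg_url gitea)" -c 'SELECT 1'
```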
### ariel — Graph Database
Air spirit — ethereal, interconnected nature mirroring graph relationships.
- Neo4j 5.26.0 (Docker)
- HTTP API: port 25584
- Bolt: port 25554
### puck — Application Runtime
Shape-shifting trickster embodying Python's versatility.
- Docker engine
- JupyterLab (port 22071 via OAuth2-Proxy)
- Gitea Runner (CI/CD agent)
- Home Assistant (port 8123)
- Django applications: Angelia (22281), Athena (22481), Kairos (22581), Icarlos (22681), Spelunker (22881), Peitho (22981)
### prospero — Observability Stack
Master magician observing all events.
- PPLG stack via Docker Compose: Prometheus, Loki, Grafana, PgAdmin
- Internal HAProxy with OAuth2-Proxy for all dashboards
- AlertManager with Pushover notifications
- Prometheus metrics collection (`node-exporter`, HAProxy, Loki)
- Loki log aggregation via Alloy (all hosts)
- Grafana dashboard suite with Casdoor SSO integration
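The HAProxy metrics endpoint on Titania (`:8404/metrics`, see below) is one of the scrape targets. As a hypothetical sketch only — the actual configuration is templated by the `pplg` role — a Prometheus scrape job for it could look like:

```yaml
scrape_configs:
  - job_name: haproxy
    static_configs:
      - targets: ["titania.incus:8404"]   # HAProxy exposes /metrics here
```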
### miranda — MCP Docker Host
Curious bridge between worlds — hosting MCP server containers.
- Docker engine (API exposed on port 2375 for MCP Switchboard)
- MCPO OpenAI-compatible MCP proxy
- Grafana MCP Server (port 25533)
- Gitea MCP Server (port 25535)
- Neo4j MCP Server
- Argos MCP Server — web search via SearXNG (port 25534)
### sycorax — Language Models
Original magical power wielding language magic.
- Arke LLM API Proxy (port 25540)
- Multi-provider support (OpenAI, Anthropic, etc.)
- Session management with Memcached
- Database backend on Portia
### caliban — Agent Automation
Autonomous computer agent learning through environmental interaction.
- Docker engine
- Agent S MCP Server (MATE desktop, AT-SPI automation)
- Kernos MCP Shell Server (port 22021)
- GPU passthrough for vision tasks
- RDP access (port 25521)
### rosalind — Collaboration Services
Witty and resourceful moon for PHP, Go, and Node.js runtimes.
- Gitea self-hosted Git (port 22082, SSH on 22022)
- LobeChat AI chat interface (port 22081)
- Nextcloud file sharing and collaboration (port 22083)
- AnythingLLM document AI workspace (port 22084)
- Nextcloud data on dedicated Incus storage volume
### titania — Proxy & SSO Services
Queen of the Fairies managing access control and authentication.
- HAProxy 3.x with TLS termination (port 443)
- Let's Encrypt wildcard certificate via certbot DNS-01 (Namecheap)
- HTTP to HTTPS redirect (port 80)
- Gitea SSH proxy (port 22022)
- Casdoor SSO (port 22081, local PostgreSQL)
- Prometheus metrics at `:8404/metrics`
## External Access via HAProxy
Titania provides TLS termination and reverse proxy for all services.
- Base domain: `ouranos.helu.ca`
- HTTPS: port 443 (standard)
- HTTP: port 80 (redirects to HTTPS)
- Certificate: Let's Encrypt wildcard via certbot DNS-01
### Route Table
| Subdomain | Backend | Service |
|---|---|---|
| `ouranos.helu.ca` (root) | `puck.incus:22281` | Angelia (Django) |
| `alertmanager.ouranos.helu.ca` | `prospero.incus:443` (SSL) | AlertManager |
| `angelia.ouranos.helu.ca` | `puck.incus:22281` | Angelia (Django) |
| `anythingllm.ouranos.helu.ca` | `rosalind.incus:22084` | AnythingLLM |
| `arke.ouranos.helu.ca` | `sycorax.incus:25540` | Arke LLM Proxy |
| `athena.ouranos.helu.ca` | `puck.incus:22481` | Athena (Django) |
| `gitea.ouranos.helu.ca` | `rosalind.incus:22082` | Gitea |
| `grafana.ouranos.helu.ca` | `prospero.incus:443` (SSL) | Grafana |
| `hass.ouranos.helu.ca` | `oberon.incus:8123` | Home Assistant |
| `id.ouranos.helu.ca` | `titania.incus:22081` | Casdoor SSO |
| `icarlos.ouranos.helu.ca` | `puck.incus:22681` | Icarlos (Django) |
| `jupyterlab.ouranos.helu.ca` | `puck.incus:22071` | JupyterLab (OAuth2-Proxy) |
| `kairos.ouranos.helu.ca` | `puck.incus:22581` | Kairos (Django) |
| `lobechat.ouranos.helu.ca` | `rosalind.incus:22081` | LobeChat |
| `loki.ouranos.helu.ca` | `prospero.incus:443` (SSL) | Loki |
| `mcp-switchboard.ouranos.helu.ca` | `oberon.incus:22785` | MCP Switchboard |
| `nextcloud.ouranos.helu.ca` | `rosalind.incus:22083` | Nextcloud |
| `openwebui.ouranos.helu.ca` | `oberon.incus:22088` | Open WebUI |
| `peitho.ouranos.helu.ca` | `puck.incus:22981` | Peitho (Django) |
| `pgadmin.ouranos.helu.ca` | `prospero.incus:443` (SSL) | PgAdmin 4 |
| `prometheus.ouranos.helu.ca` | `prospero.incus:443` (SSL) | Prometheus |
| `searxng.ouranos.helu.ca` | `oberon.incus:22073` | SearXNG (OAuth2-Proxy) |
| `smtp4dev.ouranos.helu.ca` | `oberon.incus:22085` | smtp4dev |
| `spelunker.ouranos.helu.ca` | `puck.incus:22881` | Spelunker (Django) |
## Infrastructure Management
### Quick Start
```bash
# Provision containers
cd terraform
terraform init
terraform plan
terraform apply

# Start all containers
cd ../ansible
source ~/env/agathos/bin/activate
ansible-playbook sandbox_up.yml

# Deploy all services
ansible-playbook site.yml

# Stop all containers
ansible-playbook sandbox_down.yml
```
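Before a full run, it can help to confirm the lifecycle playbooks are present in the working directory. A small sketch; the commented commands use standard `ansible-playbook` flags (`--check` for a dry run, `--limit` to target one host):

```shell
# check_playbooks: report which of the given playbooks exist in the current directory.
check_playbooks() {
  for pb in "$@"; do
    if [ -f "$pb" ]; then echo "found: $pb"; else echo "missing: $pb"; fi
  done
}

check_playbooks sandbox_up.yml site.yml sandbox_down.yml

# Dry-run the deployment, or target a single host (standard ansible-playbook flags):
#   ansible-playbook site.yml --check
#   ansible-playbook site.yml --limit oberon.incus
```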
### Terraform Workflow
- **Define** — Containers, networks, and resources in `*.tf` files
- **Plan** — Review changes with `terraform plan`
- **Apply** — Provision with `terraform apply`
- **Verify** — Check outputs and container status
### Ansible Workflow
- **Bootstrap** — Update packages, install essentials (`apt_update.yml`)
- **Agents** — Deploy Alloy (log/metrics) and Node Exporter on all hosts
- **Services** — Configure databases, Docker, applications, observability
- **Verify** — Check service health and connectivity
### Vault Management

```bash
# Edit secrets
ansible-vault edit inventory/group_vars/all/vault.yml

# View secrets
ansible-vault view inventory/group_vars/all/vault.yml

# Encrypt a new file
ansible-vault encrypt new_secrets.yml
```
### S3 Storage Provisioning
Terraform provisions Incus S3 buckets for services requiring object storage:
| Service | Host | Purpose |
|---|---|---|
| Casdoor | Titania | User avatars and SSO resource storage |
| LobeChat | Rosalind | File uploads and attachments |
S3 credentials (access key, secret key, endpoint) are stored as sensitive Terraform outputs and managed in Ansible Vault with the `vault_*_s3_*` prefix.
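Following the `vault_*_s3_*` convention, the Vault entries for a service might look like the sketch below. The variable names and values here are hypothetical illustrations, not the project's actual keys:

```yaml
# Hypothetical vault.yml entries for LobeChat's S3 bucket (illustrative only).
vault_lobechat_s3_access_key: "REDACTED"
vault_lobechat_s3_secret_key: "REDACTED"
vault_lobechat_s3_endpoint: "https://example.invalid:8443"
```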
## Ansible Automation
### Full Deployment (`site.yml`)
Playbooks run in dependency order:
| Playbook | Hosts | Purpose |
|---|---|---|
| `apt_update.yml` | All | Update packages and install essentials |
| `alloy/deploy.yml` | All | Grafana Alloy log/metrics collection |
| `prometheus/node_deploy.yml` | All | Node Exporter metrics |
| `docker/deploy.yml` | Oberon, Ariel, Miranda, Puck, Rosalind, Sycorax, Caliban, Titania | Docker engine |
| `smtp4dev/deploy.yml` | Oberon | SMTP test server |
| `pplg/deploy.yml` | Prospero | Full observability stack + HAProxy + OAuth2-Proxy |
| `postgresql/deploy.yml` | Portia | PostgreSQL with all databases |
| `postgresql_ssl/deploy.yml` | Titania | Dedicated PostgreSQL for Casdoor |
| `neo4j/deploy.yml` | Ariel | Neo4j graph database |
| `searxng/deploy.yml` | Oberon | SearXNG privacy search |
| `haproxy/deploy.yml` | Titania | HAProxy TLS termination and routing |
| `casdoor/deploy.yml` | Titania | Casdoor SSO |
| `mcpo/deploy.yml` | Miranda | MCPO MCP proxy |
| `openwebui/deploy.yml` | Oberon | Open WebUI LLM interface |
| `hass/deploy.yml` | Oberon | Home Assistant |
| `gitea/deploy.yml` | Rosalind | Gitea self-hosted Git |
| `nextcloud/deploy.yml` | Rosalind | Nextcloud collaboration |
### Individual Service Deployments
Services with standalone deploy playbooks (not in `site.yml`):
| Playbook | Host | Service |
|---|---|---|
| `anythingllm/deploy.yml` | Rosalind | AnythingLLM document AI |
| `arke/deploy.yml` | Sycorax | Arke LLM proxy |
| `argos/deploy.yml` | Miranda | Argos MCP web search server |
| `caliban/deploy.yml` | Caliban | Agent S MCP Server |
| `certbot/deploy.yml` | Titania | Let's Encrypt certificate renewal |
| `gitea_mcp/deploy.yml` | Miranda | Gitea MCP Server |
| `gitea_runner/deploy.yml` | Puck | Gitea CI/CD runner |
| `grafana_mcp/deploy.yml` | Miranda | Grafana MCP Server |
| `jupyterlab/deploy.yml` | Puck | JupyterLab + OAuth2-Proxy |
| `kernos/deploy.yml` | Caliban | Kernos MCP shell server |
| `lobechat/deploy.yml` | Rosalind | LobeChat AI chat |
| `neo4j_mcp/deploy.yml` | Miranda | Neo4j MCP Server |
| `rabbitmq/deploy.yml` | Oberon | RabbitMQ message queue |
### Lifecycle Playbooks
| Playbook | Purpose |
|---|---|
| `sandbox_up.yml` | Start all Uranian host containers |
| `sandbox_down.yml` | Gracefully stop all containers |
| `apt_update.yml` | Update packages on all hosts |
| `site.yml` | Full deployment orchestration |
## Data Flow Architecture
### Observability Pipeline
```
All Hosts                     Prospero                        Alerts

Alloy + Node Exporter  →  Prometheus + Loki + Grafana  →  AlertManager + Pushover
collect metrics & logs    storage & visualisation         notifications
```
### Integration Points
| Consumer | Provider | Connection |
|---|---|---|
| All LLM apps | Arke (Sycorax) | `http://sycorax.incus:25540` |
| Open WebUI, Arke, Gitea, Nextcloud, LobeChat | PostgreSQL (Portia) | `portia.incus:5432` |
| Neo4j MCP | Neo4j (Ariel) | `ariel.incus:7687` (Bolt) |
| MCP Switchboard | Docker API (Miranda) | `tcp://miranda.incus:2375` |
| MCP Switchboard | RabbitMQ (Oberon) | `oberon.incus:5672` |
| Kairos, Spelunker | RabbitMQ (Oberon) | `oberon.incus:5672` |
| SMTP (all apps) | smtp4dev (Oberon) | `oberon.incus:22025` |
| All hosts | Loki (Prospero) | `http://prospero.incus:3100` |
| All hosts | Prometheus (Prospero) | `http://prospero.incus:9090` |
## Important Notes
⚠️ **Alloy Host Variables Required** — Every host with `alloy` in its services list must define `alloy_log_level` in `inventory/host_vars/<host>.incus.yml`. The playbook will fail with an undefined variable error if this is missing.
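A minimal sketch of the required host_vars entry. Only the variable name `alloy_log_level` comes from the playbooks; the value shown is an assumed example:

```yaml
# inventory/host_vars/oberon.incus.yml
alloy_log_level: info   # assumed value for illustration; use any level Alloy accepts
```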
⚠️ **Alloy Syslog Listeners Required for Docker Services** — Any Docker Compose service using the `syslog` logging driver must have a corresponding `loki.source.syslog` listener in the host's Alloy config template (`ansible/alloy/<hostname>/config.alloy.j2`). Missing listeners cause Docker containers to fail on start.
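As a hedged sketch of what such a listener block looks like in Grafana Alloy's configuration language — the component label, listen address, and downstream component name are assumptions; the real blocks live in `ansible/alloy/<hostname>/config.alloy.j2`:

```alloy
// Hypothetical syslog listener for a Docker service using the syslog driver.
loki.source.syslog "docker_myservice" {
  listener {
    address  = "0.0.0.0:51400"   // assumed port; must match the service's syslog-address
    protocol = "tcp"
  }
  forward_to = [loki.write.default.receiver]   // assumed loki.write component label
}
```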
⚠️ **Local Terraform State** — This project uses local Terraform state (no remote backend). Do not run `terraform apply` from multiple machines simultaneously.
⚠️ **Nested Docker** — Docker runs inside Incus containers (nested), requiring `security.nesting = true` and the `lxc.apparmor.profile=unconfined` AppArmor override on all Docker-enabled hosts.
⚠️ **Deployment Order** — Prospero (observability) must be fully deployed before other hosts, as Alloy on every host pushes logs and metrics to `prospero.incus`. Run `pplg/deploy.yml` before `site.yml` on a fresh environment.