docs: rewrite README with structured overview and quick start guide

Replaces the minimal project description with a comprehensive README
including a component overview table, quick start instructions, common
Ansible operations, and links to detailed documentation. Aligns with
Red Panda Approval™ standards.
2026-03-03 12:49:06 +00:00
parent c7be03a743
commit b4d60f2f38
219 changed files with 34586 additions and 2 deletions

docs/rabbitmq.md Normal file

@@ -0,0 +1,546 @@
# RabbitMQ - Message Broker Infrastructure
## Overview
RabbitMQ 3, deployed from the `rabbitmq:3-management-alpine` image, is the central message broker for the Agathos sandbox. It provides AMQP message queuing for asynchronous communication between services, and the image bundles the management web interface for monitoring and administration.
**Host:** Oberon (container_orchestration)
**Role:** Message broker for event-driven architectures
**AMQP Port:** 5672
**Management Port:** 25582
**Syslog Port:** 51402 (Alloy)
## Architecture
```
┌─────────────────────────────────────────────────────────┐
│                       Oberon Host                       │
│                                                         │
│  ┌──────────────────────────────────────────────────┐   │
│  │           RabbitMQ Container (Docker)            │   │
│  │                                                  │   │
│  │         ┌──────────────┬──────────────┐          │   │
│  │         │  VHost       │  VHost       │          │   │
│  │         │  "kairos"    │  "spelunker" │          │   │
│  │         │              │              │          │   │
│  │         │  User:       │  User:       │          │   │
│  │         │  kairos      │  spelunker   │          │   │
│  │         │  (full perm) │  (full perm) │          │   │
│  │         └──────────────┴──────────────┘          │   │
│  │                                                  │   │
│  │         Default Admin: rabbitmq                  │   │
│  │         (all vhosts, admin privileges)           │   │
│  │                                                  │   │
│  └──────────────────────────────────────────────────┘   │
│                                                         │
│  Ports: 5672 (AMQP), 25582 (Management)                 │
│  Logs: syslog → Alloy:51402 → Loki                      │
└─────────────────────────────────────────────────────────┘

┌──────────────┐            ┌──────────────┐
│    Kairos    │───AMQP────▶│   kairos/    │
│   (future)   │            │   (vhost)    │
└──────────────┘            └──────────────┘

┌──────────────┐            ┌──────────────┐
│  Spelunker   │───AMQP────▶│  spelunker/  │
│   (future)   │            │   (vhost)    │
└──────────────┘            └──────────────┘
```
**Note**: Kairos and Spelunker are future services. The RabbitMQ infrastructure is pre-provisioned with dedicated virtual hosts and users ready for when these services are deployed.
## Terraform Resources
### Oberon Host Definition
RabbitMQ runs on Oberon, defined in `terraform/containers.tf`:
| Attribute | Value |
|-----------|-------|
| Description | Docker Host + MCP Switchboard - King of Fairies orchestrating containers |
| Image | noble |
| Role | container_orchestration |
| Security Nesting | `true` (required for Docker) |
| AppArmor Profile | unconfined |
| Proxy Devices | `25580-25599 → 25580-25599` (application port range) |
### Container Dependencies
| Resource | Relationship |
|----------|--------------|
| Docker | RabbitMQ runs as a Docker container on Oberon |
| Alloy | Collects syslog logs from RabbitMQ on port 51402 |
| Prospero | Receives logs via Loki for observability |
## Ansible Deployment
### Playbook
```bash
cd ansible
ansible-playbook rabbitmq/deploy.yml
```
### Files
| File | Purpose |
|------|---------|
| `rabbitmq/deploy.yml` | Main deployment playbook |
| `rabbitmq/docker-compose.yml.j2` | Docker Compose template |
### Deployment Steps
The playbook performs the following operations:
1. **User and Group Management**
   - Creates `rabbitmq` system user and group
   - Adds `ponos` user to `rabbitmq` group for operational access
2. **Directory Setup**
   - Creates service directory at `/srv/rabbitmq`
   - Sets ownership to `rabbitmq:rabbitmq`
   - Configures permissions (mode 750)
3. **Docker Compose Deployment**
   - Templates `docker-compose.yml` from Jinja2 template
   - Deploys RabbitMQ container with `docker compose up`
4. **rabbitmqadmin CLI Setup**
   - Extracts `rabbitmqadmin` from container to `/usr/local/bin/`
   - Makes it executable for host-level management
5. **Automatic Provisioning** (idempotent)
   - Creates virtual hosts: `kairos`, `spelunker`
   - Creates users with passwords from vault
   - Sets user tags (currently none, expandable for admin/monitoring roles)
   - Configures full permissions for each user on their respective vhost
### Variables
#### Host Variables (`host_vars/oberon.incus.yml`)
| Variable | Description | Default |
|----------|-------------|---------|
| `rabbitmq_user` | Service user | `rabbitmq` |
| `rabbitmq_group` | Service group | `rabbitmq` |
| `rabbitmq_directory` | Installation directory | `/srv/rabbitmq` |
| `rabbitmq_amqp_port` | AMQP protocol port | `5672` |
| `rabbitmq_management_port` | Management web interface | `25582` |
| `rabbitmq_password` | Default admin password | `{{ vault_rabbitmq_password }}` |
#### Group Variables (`group_vars/all/vars.yml`)
Defines the provisioning configuration for vhosts, users, and permissions:
```yaml
rabbitmq_vhosts:
  - name: kairos
  - name: spelunker
rabbitmq_users:
  - name: kairos
    password: "{{ kairos_rabbitmq_password }}"
    tags: []
  - name: spelunker
    password: "{{ spelunker_rabbitmq_password }}"
    tags: []
rabbitmq_permissions:
  - vhost: kairos
    user: kairos
    configure_priv: .*
    read_priv: .*
    write_priv: .*
  - vhost: spelunker
    user: spelunker
    configure_priv: .*
    read_priv: .*
    write_priv: .*
```
**Vault Variable Mappings**:
```yaml
kairos_rabbitmq_password: "{{ vault_kairos_rabbitmq_password }}"
spelunker_rabbitmq_password: "{{ vault_spelunker_rabbitmq_password }}"
```
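Each entry in `rabbitmq_permissions` has to name a vhost and a user declared in the other two lists, so a quick consistency check before running the playbook can catch typos. The following is a minimal Python sketch and not part of the repository: `check_provisioning` is a hypothetical helper, and the data simply mirrors the variables shown above.

```python
def check_provisioning(vhosts, users, permissions):
    """Return a list of problems found in the provisioning variables."""
    vhost_names = {v["name"] for v in vhosts}
    user_names = {u["name"] for u in users}
    problems = []
    for perm in permissions:
        # Every permission entry must point at a declared vhost and user.
        if perm["vhost"] not in vhost_names:
            problems.append(f"unknown vhost {perm['vhost']!r}")
        if perm["user"] not in user_names:
            problems.append(f"unknown user {perm['user']!r}")
    return problems


# Data copied from group_vars/all/vars.yml (passwords omitted).
vhosts = [{"name": "kairos"}, {"name": "spelunker"}]
users = [{"name": "kairos", "tags": []}, {"name": "spelunker", "tags": []}]
permissions = [
    {"vhost": "kairos", "user": "kairos",
     "configure_priv": ".*", "read_priv": ".*", "write_priv": ".*"},
    {"vhost": "spelunker", "user": "spelunker",
     "configure_priv": ".*", "read_priv": ".*", "write_priv": ".*"},
]

print(check_provisioning(vhosts, users, permissions))  # prints []
```

An empty list means every permission entry refers to a declared vhost and user.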
#### Vault Variables (`group_vars/all/vault.yml`)
All sensitive credentials are encrypted in the vault:
| Variable | Description |
|----------|-------------|
| `vault_rabbitmq_password` | Default admin account password |
| `vault_kairos_rabbitmq_password` | Kairos service user password |
| `vault_spelunker_rabbitmq_password` | Spelunker service user password |
## Configuration
### Docker Compose Template
The deployment uses a minimal Docker Compose configuration:
```yaml
services:
  rabbitmq:
    image: rabbitmq:3-management-alpine
    container_name: rabbitmq
    restart: unless-stopped
    ports:
      - "{{rabbitmq_amqp_port}}:5672"         # AMQP protocol
      - "{{rabbitmq_management_port}}:15672"  # Management UI
    volumes:
      - rabbitmq_data:/var/lib/rabbitmq       # Persistent data
    environment:
      RABBITMQ_DEFAULT_USER: "{{rabbitmq_user}}"
      RABBITMQ_DEFAULT_PASS: "{{rabbitmq_password}}"
    logging:
      driver: syslog
      options:
        syslog-address: "tcp://127.0.0.1:{{rabbitmq_syslog_port}}"
        syslog-format: "{{syslog_format}}"
        tag: "rabbitmq"
```
### Data Persistence
- **Volume**: `rabbitmq_data` (Docker-managed volume)
- **Location**: `/var/lib/rabbitmq` inside container
- **Contents**:
  - Message queues and persistent messages
  - Virtual host metadata
  - User credentials and permissions
  - Configuration overrides
## Virtual Hosts and Users
### Default Admin Account
**Username**: `rabbitmq`
**Password**: `{{ vault_rabbitmq_password }}` (from vault)
**Privileges**: Full administrative access to all virtual hosts
The default admin account is created automatically when the container starts and can access:
- All virtual hosts (including `/`, `kairos`, `spelunker`)
- Management web interface
- All RabbitMQ management commands
### Kairos Virtual Host
**VHost**: `kairos`
**User**: `kairos`
**Password**: `{{ vault_kairos_rabbitmq_password }}`
**Permissions**: Full (configure, read, write) on all resources matching `.*`
Intended for the **Kairos** service (event-driven time-series processing system, planned future deployment).
### Spelunker Virtual Host
**VHost**: `spelunker`
**User**: `spelunker`
**Password**: `{{ vault_spelunker_rabbitmq_password }}`
**Permissions**: Full (configure, read, write) on all resources matching `.*`
Intended for the **Spelunker** service (log exploration and analytics platform, planned future deployment).
### Permission Model
Both service users have full access within their respective virtual hosts:
| Permission | Pattern | Description |
|------------|---------|-------------|
| Configure | `.*` | Declare and delete queues and exchanges |
| Write | `.*` | Publish messages to exchanges; bind queues (as the binding destination) |
| Read | `.*` | Consume messages from queues; bind from exchanges (as the binding source) |
This isolation ensures:
- ✔ Each service operates in its own namespace
- ✔ Messages cannot cross between services
- ✔ Resource limits can be applied per-vhost
- ✔ Service credentials can be rotated independently
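Each permission is a regular expression that RabbitMQ matches against the name of the queue or exchange being accessed. The sketch below uses Python's `re` module as a stand-in for the broker's own matching, assuming unanchored matching (which is why anchored patterns are common). `is_allowed` is a hypothetical helper, and the narrower `^kairos\.` pattern is only an example of how access could be tightened, not the current configuration.

```python
import re


def is_allowed(pattern: str, resource_name: str) -> bool:
    """Approximate a permission check: does the pattern match the resource name?"""
    return re.search(pattern, resource_name) is not None


# The deployed pattern `.*` matches every resource name in the vhost.
print(is_allowed(".*", "orders"))                    # prints True

# A hypothetical tighter pattern that only allows names starting with "kairos."
print(is_allowed(r"^kairos\.", "kairos.events"))     # prints True
print(is_allowed(r"^kairos\.", "spelunker.logs"))    # prints False
```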
## Access and Administration
### Management Web Interface
**URL**: `http://oberon.incus:25582`
**External**: `http://{oberon-ip}:25582`
**Login**: `rabbitmq` / `{{ vault_rabbitmq_password }}`
Features:
- Queue inspection and message browsing
- Exchange and binding management
- Connection and channel monitoring
- User and permission administration
- Virtual host management
- Performance metrics and charts
### CLI Administration
#### On Host Machine (using rabbitmqadmin)
```bash
# List vhosts
rabbitmqadmin -H oberon.incus -P 25582 -u rabbitmq -p PASSWORD list vhosts
# List queues in a vhost
rabbitmqadmin -H oberon.incus -P 25582 -u rabbitmq -p PASSWORD -V kairos list queues
# Publish a test message
rabbitmqadmin -H oberon.incus -P 25582 -u rabbitmq -p PASSWORD -V kairos publish \
exchange=amq.default routing_key=test payload="test message"
```
#### Inside Container
```bash
# Enter the container
docker exec -it rabbitmq /bin/sh
# List vhosts
rabbitmqctl list_vhosts
# List users
rabbitmqctl list_users
# List permissions for a user
rabbitmqctl list_user_permissions kairos
# List queues in a vhost
rabbitmqctl list_queues -p kairos
# Check node status
rabbitmqctl status
```
### Connection Strings
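#### AMQP Connection (from processes on Oberon)
```
amqp://kairos:PASSWORD@localhost:5672/kairos
amqp://spelunker:PASSWORD@localhost:5672/spelunker
```
`localhost` reaches the published port only from Oberon's own network namespace (for example, a container started with `network_mode: host`). A container on a Docker bridge network sees its own loopback interface instead and needs an address that reaches Oberon.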
#### AMQP Connection (from other hosts)
```
amqp://kairos:PASSWORD@oberon.incus:5672/kairos
amqp://spelunker:PASSWORD@oberon.incus:5672/spelunker
```
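Passwords that contain characters such as `@`, `/`, or `:` must be percent-encoded before they are placed in an AMQP URI. The following Python sketch builds a connection string in the form shown above; `amqp_url` is a hypothetical helper and the password is a made-up example.

```python
from urllib.parse import quote


def amqp_url(user: str, password: str, host: str, vhost: str, port: int = 5672) -> str:
    """Build an AMQP URI, percent-encoding the user, password, and vhost."""
    return (
        f"amqp://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(vhost, safe='')}"
    )


print(amqp_url("kairos", "p@ss/word", "oberon.incus", "kairos"))
# prints amqp://kairos:p%40ss%2Fword@oberon.incus:5672/kairos
```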
#### Management API
```
http://rabbitmq:PASSWORD@oberon.incus:25582/api/
```
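The management API accepts HTTP basic authentication, so it can be queried with the Python standard library alone. This is a minimal sketch rather than a supported client: `management_request` and `list_vhosts` are hypothetical helpers, and the base URL assumes the host name and port documented above.

```python
import base64
import json
import urllib.request


def management_request(path, user, password, base="http://oberon.incus:25582/api"):
    """Build an authenticated request for a management API path such as 'vhosts'."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    url = f"{base}/{path.lstrip('/')}"
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


def list_vhosts(user, password):
    """Return the vhost names reported by GET /api/vhosts (requires network access)."""
    request = management_request("vhosts", user, password)
    with urllib.request.urlopen(request, timeout=5) as response:
        return [vhost["name"] for vhost in json.load(response)]
```

Calling `list_vhosts("rabbitmq", PASSWORD)` from a machine that can reach Oberon should return the same names as `rabbitmqctl list_vhosts`.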
## Monitoring and Observability
### Logging
- **Driver**: syslog (Docker logging driver)
- **Destination**: `tcp://127.0.0.1:51402` (Alloy on Oberon)
- **Tag**: `rabbitmq`
- **Format**: `{{ syslog_format }}` (from Alloy configuration)
Logs are collected by Alloy and forwarded to Loki on Prospero for centralized log aggregation.
### Key Metrics (via Management UI)
| Metric | Description |
|--------|-------------|
| Connections | Active AMQP client connections |
| Channels | Active channels within connections |
| Queues | Total queues across all vhosts |
| Messages | Ready, unacknowledged, and total message counts |
| Message Rate | Publish/deliver rates (msg/s) |
| Memory Usage | Container memory consumption |
| Disk Usage | Persistent storage utilization |
### Health Check
```bash
# Check if RabbitMQ is running
docker ps | grep rabbitmq
# Check container logs
docker logs rabbitmq
# Check RabbitMQ node status
docker exec rabbitmq rabbitmqctl status
# Check cluster health (single-node, should show 1 node)
docker exec rabbitmq rabbitmqctl cluster_status
```
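From another machine, where `docker` commands are not available, a plain TCP connection test gives a first indication of whether the broker ports are reachable. This is a minimal Python sketch; `port_open` is a hypothetical helper, and it only confirms that something is listening, not that RabbitMQ is healthy.

```python
import socket


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For this deployment, `port_open("oberon.incus", 5672)` checks the AMQP port and `port_open("oberon.incus", 25582)` checks the management port.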
## Operational Tasks
### Restart RabbitMQ
```bash
# Via Docker Compose
cd /srv/rabbitmq
sudo -u rabbitmq docker compose restart
# Via Docker directly
docker restart rabbitmq
```
### Recreate Container (preserves data)
```bash
cd /srv/rabbitmq
sudo -u rabbitmq docker compose down
sudo -u rabbitmq docker compose up -d
```
### Add New Virtual Host and User
1. Update `group_vars/all/vars.yml`:
```yaml
# Append these entries to the existing lists
rabbitmq_vhosts:
  - name: newservice
rabbitmq_users:
  - name: newservice
    password: "{{ newservice_rabbitmq_password }}"
    tags: []
rabbitmq_permissions:
  - vhost: newservice
    user: newservice
    configure_priv: .*
    read_priv: .*
    write_priv: .*
# Add mapping
newservice_rabbitmq_password: "{{ vault_newservice_rabbitmq_password }}"
```
2. Add password to `group_vars/all/vault.yml`:
```bash
ansible-vault edit inventory/group_vars/all/vault.yml
# Add: vault_newservice_rabbitmq_password: "secure_password"
```
3. Run the playbook:
```bash
ansible-playbook rabbitmq/deploy.yml
```
The provisioning tasks are idempotent: existing vhosts and users are skipped, and only new ones are created.
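The idempotent behavior can be pictured as a set difference between what is declared and what already exists on the broker. The sketch below illustrates that idea only; the playbook's actual tasks are not shown in this document, and `plan_changes` is a hypothetical helper.

```python
def plan_changes(desired, existing):
    """Return the declared names that do not exist yet, preserving declared order."""
    existing_names = set(existing)
    return [name for name in desired if name not in existing_names]


declared = ["kairos", "spelunker", "newservice"]

# First run: only the new vhost needs to be created.
print(plan_changes(declared, ["/", "kairos", "spelunker"]))                # prints ['newservice']

# Second run: nothing is left to do, so the run makes no changes.
print(plan_changes(declared, ["/", "kairos", "spelunker", "newservice"]))  # prints []
```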
### Rotate User Password
```bash
# Change the password in the running broker (run on Oberon)
docker exec rabbitmq rabbitmqctl change_password kairos "new_password"
# Update the vault so it matches the broker
ansible-vault edit inventory/group_vars/all/vault.yml
# Set vault_kairos_rabbitmq_password to the new value
```
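A replacement password can be generated with Python's `secrets` module. This is a minimal sketch under one assumption: the alphabet is restricted to letters and digits so the value needs no shell quoting or URL encoding. `generate_password` is a hypothetical helper, and the length of 32 is an arbitrary choice.

```python
import secrets
import string


def generate_password(length: int = 32) -> str:
    """Return a random alphanumeric password from a cryptographically secure source."""
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))


print(generate_password())
```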
### Clear All Messages in a Queue
```bash
docker exec rabbitmq rabbitmqctl purge_queue queue_name -p kairos
```
## Troubleshooting
### Container Won't Start
Check Docker logs for errors:
```bash
docker logs rabbitmq
```
Common issues:
- Port conflict on 5672 or 25582
- Permission issues on `/srv/rabbitmq` directory
- Corrupted data volume
### Cannot Connect to Management UI
1. Verify port mapping: `docker port rabbitmq`
2. Check firewall rules on Oberon
3. Verify container is running: `docker ps | grep rabbitmq`
4. Check if management plugin is enabled (should be in `-management-alpine` image)
### User Authentication Failing
```bash
# List users and verify they exist
docker exec rabbitmq rabbitmqctl list_users
# Check user permissions
docker exec rabbitmq rabbitmqctl list_user_permissions kairos
# Verify vhost exists
docker exec rabbitmq rabbitmqctl list_vhosts
```
### High Memory Usage
RabbitMQ may consume significant memory with many messages. Check:
```bash
# Memory usage
docker exec rabbitmq rabbitmqctl status | grep memory
# Queue depths
docker exec rabbitmq rabbitmqctl list_queues -p kairos name messages
# Consider setting memory limits in docker-compose.yml
```
## Security Considerations
### Network Isolation
- RabbitMQ AMQP port (5672) is **only** exposed on the Incus network (`10.10.0.0/16`)
- Management UI (25582) is exposed externally for administration
- For production: Place HAProxy in front of management UI with authentication
- Consider enabling SSL/TLS for AMQP connections in production
### Credential Management
- ✔ All passwords stored in Ansible Vault
- ✔ Service accounts have isolated virtual hosts
- ✔ Default admin account uses strong password from vault
- ⚠️ Credentials passed as environment variables (visible in `docker inspect`)
- Consider using Docker secrets or Vault integration for enhanced security
### Virtual Host Isolation
Each service operates in its own virtual host:
- Messages cannot cross between vhosts
- Resource quotas can be applied per-vhost
- Credentials can be rotated without affecting other services
## Future Enhancements
- [ ] **SSL/TLS Support**: Enable encrypted AMQP connections
- [ ] **Cluster Mode**: Add additional RabbitMQ nodes for high availability
- [ ] **Federation**: Connect to external RabbitMQ clusters
- [ ] **Prometheus Exporter**: Add metrics export for Grafana monitoring
- [ ] **Shovel Plugin**: Configure message forwarding between brokers
- [ ] **HAProxy Integration**: Reverse proxy for management UI with authentication
- [ ] **Docker Secrets**: Replace environment variables with Docker secrets
## References
- [RabbitMQ Official Documentation](https://www.rabbitmq.com/documentation.html)
- [RabbitMQ Management Plugin](https://www.rabbitmq.com/management.html)
- [AMQP 0-9-1 Protocol Reference](https://www.rabbitmq.com/amqp-0-9-1-reference.html)
- [Virtual Hosts](https://www.rabbitmq.com/vhosts.html)
- [Access Control (Authentication, Authorisation)](https://www.rabbitmq.com/access-control.html)
- [Monitoring RabbitMQ](https://www.rabbitmq.com/monitoring.html)
---
**Last Updated**: February 12, 2026
**Project**: Agathos Infrastructure
**Approval**: Red Panda Approved™