docs: rewrite README with structured overview and quick start guide
Replaces the minimal project description with a comprehensive README including a component overview table, quick start instructions, common Ansible operations, and links to detailed documentation. Aligns with Red Panda Approval™ standards.
546
docs/rabbitmq.md
Normal file
@@ -0,0 +1,546 @@
# RabbitMQ - Message Broker Infrastructure

## Overview

RabbitMQ 3 (management-alpine) serves as the central message broker for the Agathos sandbox, providing AMQP-compliant message queuing for asynchronous communication between services. The deployment includes the management web interface for monitoring and administration.

**Host:** Oberon (container_orchestration)
**Role:** Message broker for event-driven architectures
**AMQP Port:** 5672
**Management Port:** 25582
**Syslog Port:** 51402 (Alloy)

## Architecture

```
┌─────────────────────────────────────────────────────────┐
│                       Oberon Host                       │
│                                                         │
│  ┌───────────────────────────────────────────────────┐  │
│  │            RabbitMQ Container (Docker)            │  │
│  │                                                   │  │
│  │          ┌──────────────┬──────────────┐          │  │
│  │          │ VHost        │ VHost        │          │  │
│  │          │ "kairos"     │ "spelunker"  │          │  │
│  │          │              │              │          │  │
│  │          │ User:        │ User:        │          │  │
│  │          │ kairos       │ spelunker    │          │  │
│  │          │ (full perm)  │ (full perm)  │          │  │
│  │          └──────────────┴──────────────┘          │  │
│  │                                                   │  │
│  │  Default Admin: rabbitmq                          │  │
│  │  (all vhosts, admin privileges)                   │  │
│  │                                                   │  │
│  └───────────────────────────────────────────────────┘  │
│                                                         │
│  Ports: 5672 (AMQP), 25582 (Management)                 │
│  Logs: syslog → Alloy:51402 → Loki                      │
└─────────────────────────────────────────────────────────┘

┌──────────────┐            ┌──────────────┐
│   Kairos     │───AMQP────▶│   kairos/    │
│  (future)    │            │   (vhost)    │
└──────────────┘            └──────────────┘

┌──────────────┐            ┌──────────────┐
│  Spelunker   │───AMQP────▶│  spelunker/  │
│  (future)    │            │   (vhost)    │
└──────────────┘            └──────────────┘
```

**Note**: Kairos and Spelunker are future services. The RabbitMQ infrastructure is pre-provisioned with dedicated virtual hosts and users ready for when these services are deployed.

## Terraform Resources

### Oberon Host Definition

RabbitMQ runs on Oberon, defined in `terraform/containers.tf`:

| Attribute | Value |
|-----------|-------|
| Description | Docker Host + MCP Switchboard - King of Fairies orchestrating containers |
| Image | noble |
| Role | container_orchestration |
| Security Nesting | `true` (required for Docker) |
| AppArmor Profile | unconfined |
| Proxy Devices | `25580-25599 → 25580-25599` (application port range) |

### Container Dependencies

| Resource | Relationship |
|----------|--------------|
| Docker | RabbitMQ runs as a Docker container on Oberon |
| Alloy | Collects syslog logs from RabbitMQ on port 51402 |
| Prospero | Receives logs via Loki for observability |

## Ansible Deployment

### Playbook

```bash
cd ansible
ansible-playbook rabbitmq/deploy.yml
```

### Files

| File | Purpose |
|------|---------|
| `rabbitmq/deploy.yml` | Main deployment playbook |
| `rabbitmq/docker-compose.yml.j2` | Docker Compose template |

### Deployment Steps

The playbook performs the following operations:

1. **User and Group Management**
   - Creates `rabbitmq` system user and group
   - Adds `ponos` user to `rabbitmq` group for operational access

2. **Directory Setup**
   - Creates service directory at `/srv/rabbitmq`
   - Sets ownership to `rabbitmq:rabbitmq`
   - Configures permissions (mode 750)

3. **Docker Compose Deployment**
   - Templates `docker-compose.yml` from Jinja2 template
   - Deploys RabbitMQ container with `docker compose up`

4. **rabbitmqadmin CLI Setup**
   - Extracts `rabbitmqadmin` from container to `/usr/local/bin/`
   - Makes it executable for host-level management

5. **Automatic Provisioning** (idempotent)
   - Creates virtual hosts: `kairos`, `spelunker`
   - Creates users with passwords from vault
   - Sets user tags (currently none, expandable for admin/monitoring roles)
   - Configures full permissions for each user on their respective vhost
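
The provisioning step can be approximated by hand with `rabbitmqctl`. The following is a minimal sketch under stated assumptions, not the playbook's actual tasks: the helper names `rmq` and `provision_service` and the `DRY_RUN` switch are hypothetical, and the container is assumed to be named `rabbitmq` as in the Compose template.

```bash
# rmq ARGS...: run rabbitmqctl inside the container, or only print the command
# when DRY_RUN=1 (DRY_RUN is a hypothetical switch for previewing changes).
rmq() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "rabbitmqctl $*"
  else
    docker exec rabbitmq rabbitmqctl "$@"
  fi
}

# provision_service NAME PASSWORD: create the vhost and user only if missing,
# then (re)apply full permissions. Safe to run repeatedly.
provision_service() {
  local name="$1" password="$2"

  if ! rmq -q list_vhosts | grep -qx -- "$name"; then
    rmq add_vhost "$name"
  fi

  # list_users prints "<user><TAB><tags>"; compare the first column only.
  if ! rmq -q list_users | awk -v u="$name" '$1 == u { found = 1 } END { exit !found }'; then
    rmq add_user "$name" "$password"
  fi

  # Pattern order for set_permissions is configure, write, read.
  rmq set_permissions -p "$name" "$name" '.*' '.*' '.*'
}
```

Running `DRY_RUN=1 provision_service kairos secret` prints the three commands that would be issued for a fresh broker without touching it. Passing the password as a command-line argument exposes it to the process list, so this is better suited to a sandbox than to production.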

### Variables

#### Host Variables (`host_vars/oberon.incus.yml`)

| Variable | Description | Default |
|----------|-------------|---------|
| `rabbitmq_user` | Service user | `rabbitmq` |
| `rabbitmq_group` | Service group | `rabbitmq` |
| `rabbitmq_directory` | Installation directory | `/srv/rabbitmq` |
| `rabbitmq_amqp_port` | AMQP protocol port | `5672` |
| `rabbitmq_management_port` | Management web interface | `25582` |
| `rabbitmq_password` | Default admin password | `{{ vault_rabbitmq_password }}` |

#### Group Variables (`group_vars/all/vars.yml`)

Defines the provisioning configuration for vhosts, users, and permissions:

```yaml
rabbitmq_vhosts:
  - name: kairos
  - name: spelunker

rabbitmq_users:
  - name: kairos
    password: "{{ kairos_rabbitmq_password }}"
    tags: []
  - name: spelunker
    password: "{{ spelunker_rabbitmq_password }}"
    tags: []

rabbitmq_permissions:
  - vhost: kairos
    user: kairos
    configure_priv: .*
    read_priv: .*
    write_priv: .*
  - vhost: spelunker
    user: spelunker
    configure_priv: .*
    read_priv: .*
    write_priv: .*
```

**Vault Variable Mappings**:
```yaml
kairos_rabbitmq_password: "{{ vault_kairos_rabbitmq_password }}"
spelunker_rabbitmq_password: "{{ vault_spelunker_rabbitmq_password }}"
```

#### Vault Variables (`group_vars/all/vault.yml`)

All sensitive credentials are encrypted in the vault:

| Variable | Description |
|----------|-------------|
| `vault_rabbitmq_password` | Default admin account password |
| `vault_kairos_rabbitmq_password` | Kairos service user password |
| `vault_spelunker_rabbitmq_password` | Spelunker service user password |

## Configuration

### Docker Compose Template

The deployment uses a minimal Docker Compose configuration:

```yaml
services:
  rabbitmq:
    image: rabbitmq:3-management-alpine
    container_name: rabbitmq
    restart: unless-stopped
    ports:
      - "{{rabbitmq_amqp_port}}:5672"          # AMQP protocol
      - "{{rabbitmq_management_port}}:15672"   # Management UI
    volumes:
      - rabbitmq_data:/var/lib/rabbitmq        # Persistent data
    environment:
      RABBITMQ_DEFAULT_USER: "{{rabbitmq_user}}"
      RABBITMQ_DEFAULT_PASS: "{{rabbitmq_password}}"
    logging:
      driver: syslog
      options:
        syslog-address: "tcp://127.0.0.1:{{rabbitmq_syslog_port}}"
        syslog-format: "{{syslog_format}}"
        tag: "rabbitmq"

# Named volumes referenced by a service must also be declared at the top level
volumes:
  rabbitmq_data:
```

### Data Persistence

- **Volume**: `rabbitmq_data` (Docker-managed volume)
- **Location**: `/var/lib/rabbitmq` inside container
- **Contents**:
  - Message queues and persistent messages
  - Virtual host metadata
  - User credentials and permissions
  - Configuration overrides

## Virtual Hosts and Users

### Default Admin Account

**Username**: `rabbitmq`
**Password**: `{{ vault_rabbitmq_password }}` (from vault)
**Privileges**: Full administrative access to all virtual hosts

The default admin account is created automatically when the container starts and can access:
- All virtual hosts (including `/`, `kairos`, `spelunker`)
- Management web interface
- All RabbitMQ management commands

### Kairos Virtual Host

**VHost**: `kairos`
**User**: `kairos`
**Password**: `{{ vault_kairos_rabbitmq_password }}`
**Permissions**: Full (configure, read, write) on all resources matching `.*`

Intended for the **Kairos** service (event-driven time-series processing system, planned future deployment).

### Spelunker Virtual Host

**VHost**: `spelunker`
**User**: `spelunker`
**Password**: `{{ vault_spelunker_rabbitmq_password }}`
**Permissions**: Full (configure, read, write) on all resources matching `.*`

Intended for the **Spelunker** service (log exploration and analytics platform, planned future deployment).

### Permission Model

Both service users have full access within their respective virtual hosts:

| Permission | Pattern | Description |
|------------|---------|-------------|
| Configure | `.*` | Create/delete queues, exchanges, bindings |
| Write | `.*` | Publish messages to exchanges |
| Read | `.*` | Consume messages from queues |

This isolation ensures:
- ✔ Each service operates in its own namespace
- ✔ Messages cannot cross between services
- ✔ Resource limits can be applied per-vhost
- ✔ Service credentials can be rotated independently
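
Full `.*` patterns are the simplest choice, but the same three-pattern model also supports narrower grants. As a rough sketch, a candidate pattern can be previewed against resource names before it is applied. The `kairos.` naming prefix and the `pattern_matches` helper are hypothetical, and `grep -E` only approximates the regular-expression dialect the broker uses, so treat the result as a sanity check rather than a guarantee.

```bash
# pattern_matches PATTERN NAME: succeed if NAME matches PATTERN.
# grep -E stands in for the broker's regex engine; simple patterns behave alike.
pattern_matches() {
  printf '%s\n' "$2" | grep -Eq -- "$1"
}

# Hypothetical convention: the kairos user may only touch resources named "kairos.*".
pattern='^kairos\.'
for name in kairos.events kairos.retry spelunker.index; do
  if pattern_matches "$pattern" "$name"; then
    echo "allow $name"
  else
    echo "deny  $name"
  fi
done
```

A pattern that looks right could then be applied with `rabbitmqctl set_permissions -p kairos kairos '^kairos\.' '^kairos\.' '^kairos\.'`, where the three patterns are given in configure, write, read order.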

## Access and Administration

### Management Web Interface

**URL**: `http://oberon.incus:25582`
**External**: `http://{oberon-ip}:25582`
**Login**: `rabbitmq` / `{{ vault_rabbitmq_password }}`

Features:
- Queue inspection and message browsing
- Exchange and binding management
- Connection and channel monitoring
- User and permission administration
- Virtual host management
- Performance metrics and charts

### CLI Administration

#### On Host Machine (using rabbitmqadmin)

```bash
# List vhosts
rabbitmqadmin -H oberon.incus -P 25582 -u rabbitmq -p PASSWORD list vhosts

# List queues in a vhost
rabbitmqadmin -H oberon.incus -P 25582 -u rabbitmq -p PASSWORD -V kairos list queues

# Publish a test message
rabbitmqadmin -H oberon.incus -P 25582 -u rabbitmq -p PASSWORD -V kairos publish \
  exchange=amq.default routing_key=test payload="test message"
```

#### Inside Container

```bash
# Enter the container
docker exec -it rabbitmq /bin/sh

# List vhosts
rabbitmqctl list_vhosts

# List users
rabbitmqctl list_users

# List permissions for a user
rabbitmqctl list_user_permissions kairos

# List queues in a vhost
rabbitmqctl list_queues -p kairos

# Check node status
rabbitmqctl status
```

### Connection Strings

#### AMQP Connection (from other containers on Oberon)

```
amqp://kairos:PASSWORD@localhost:5672/kairos
amqp://spelunker:PASSWORD@localhost:5672/spelunker
```

`localhost` reaches the broker only from the Oberon host itself or from containers that use host networking. Containers on a Docker bridge network should use the `oberon.incus` form below.

#### AMQP Connection (from other hosts)

```
amqp://kairos:PASSWORD@oberon.incus:5672/kairos
amqp://spelunker:PASSWORD@oberon.incus:5672/spelunker
```
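
Passwords that contain characters such as `@`, `/`, or `:` must be percent-encoded before they are placed in an AMQP URI, otherwise clients will misparse the string. The following is a minimal sketch of building the connection strings above safely. The helper names `urlencode` and `amqp_uri` are hypothetical, the sample password is made up, and ASCII input is assumed.

```bash
# urlencode STRING: percent-encode everything except RFC 3986 unreserved characters.
# ASCII input is assumed.
urlencode() {
  local s="$1" out="" c rest
  while [ -n "$s" ]; do
    rest="${s#?}"        # everything after the first character
    c="${s%"$rest"}"     # the first character
    s="$rest"
    case "$c" in
      [A-Za-z0-9._~-]) out="$out$c" ;;
      *) out="$out$(printf '%%%02X' "'$c")" ;;
    esac
  done
  printf '%s' "$out"
}

# amqp_uri USER PASSWORD HOST PORT VHOST: assemble an AMQP connection string.
amqp_uri() {
  printf 'amqp://%s:%s@%s:%s/%s\n' \
    "$(urlencode "$1")" "$(urlencode "$2")" "$3" "$4" "$(urlencode "$5")"
}

amqp_uri kairos 'p@ss/w:rd' oberon.incus 5672 kairos
# prints: amqp://kairos:p%40ss%2Fw%3Ard@oberon.incus:5672/kairos
```

Generating alphanumeric-only passwords avoids the need for encoding altogether, which may be the simpler option for service accounts.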

#### Management API

```
http://rabbitmq:PASSWORD@oberon.incus:25582/api/
```
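
The management API can be queried with `curl`. The sketch below assumes the admin password is available in an environment variable named `RABBITMQ_PASSWORD` (a hypothetical name) and uses small helper functions that are not part of the deployment. Vhost names appear as URL path segments, so a `/` in the name (as in the default vhost) has to be written as `%2F`.

```bash
# Base URL as documented above; override RABBITMQ_API to point elsewhere.
RABBITMQ_API="${RABBITMQ_API:-http://oberon.incus:25582/api}"

# vhost_segment VHOST: percent-encode "/" so a vhost name is safe as a path segment.
vhost_segment() {
  printf '%s' "$1" | sed 's|/|%2F|g'
}

# queues_path VHOST: API path that lists the queues in one vhost.
queues_path() {
  printf 'queues/%s' "$(vhost_segment "$1")"
}

# api_get PATH: authenticated GET against the management API.
# -s silences progress output; -f turns HTTP errors into a non-zero exit status.
api_get() {
  curl -sf -u "rabbitmq:${RABBITMQ_PASSWORD}" "${RABBITMQ_API}/$1"
}
```

Typical calls would be `api_get overview`, `api_get vhosts`, and `api_get "$(queues_path kairos)"`, each of which returns JSON.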

## Monitoring and Observability

### Logging

- **Driver**: syslog (Docker logging driver)
- **Destination**: `tcp://127.0.0.1:51402` (Alloy on Oberon)
- **Tag**: `rabbitmq`
- **Format**: `{{ syslog_format }}` (from Alloy configuration)

Logs are collected by Alloy and forwarded to Loki on Prospero for centralized log aggregation.

### Key Metrics (via Management UI)

| Metric | Description |
|--------|-------------|
| Connections | Active AMQP client connections |
| Channels | Active channels within connections |
| Queues | Total queues across all vhosts |
| Messages | Ready, unacknowledged, and total message counts |
| Message Rate | Publish/deliver rates (msg/s) |
| Memory Usage | Container memory consumption |
| Disk Usage | Persistent storage utilization |

### Health Check

```bash
# Check if RabbitMQ is running
docker ps | grep rabbitmq

# Check container logs
docker logs rabbitmq

# Check RabbitMQ node status
docker exec rabbitmq rabbitmqctl status

# Check cluster health (single-node, should show 1 node)
docker exec rabbitmq rabbitmqctl cluster_status
```
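
For scripted checks, the `rabbitmq-diagnostics` tool shipped in the image offers commands that exit non-zero on failure, which makes them easier to automate than reading `status` output. The following is a sketch only: the `rabbitmq_health` function and the overridable `DOCKER` variable are hypothetical conveniences, and the selection of checks is an assumption about what matters for this single-node setup.

```bash
# Docker binary to use; overridable so the function can be exercised without Docker.
DOCKER="${DOCKER:-docker}"

# rabbitmq_health: run a series of rabbitmq-diagnostics checks in the container.
# Prints one line per check and returns non-zero if any check failed.
rabbitmq_health() {
  local failed=0 check
  for check in ping check_running check_local_alarms check_port_connectivity; do
    if "$DOCKER" exec rabbitmq rabbitmq-diagnostics -q "$check" >/dev/null 2>&1; then
      echo "ok   $check"
    else
      echo "FAIL $check"
      failed=1
    fi
  done
  return "$failed"
}
```

Calling `rabbitmq_health || echo "RabbitMQ needs attention"` on Oberon gives a quick pass/fail summary that could also be wired into a cron job or monitoring probe.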

## Operational Tasks

### Restart RabbitMQ

```bash
# Via Docker Compose
cd /srv/rabbitmq
sudo -u rabbitmq docker compose restart

# Via Docker directly
docker restart rabbitmq
```

### Recreate Container (preserves data)

```bash
cd /srv/rabbitmq
sudo -u rabbitmq docker compose down
sudo -u rabbitmq docker compose up -d
```

### Add New Virtual Host and User

1. Update `group_vars/all/vars.yml`:
   ```yaml
   rabbitmq_vhosts:
     - name: newservice

   rabbitmq_users:
     - name: newservice
       password: "{{ newservice_rabbitmq_password }}"
       tags: []

   rabbitmq_permissions:
     - vhost: newservice
       user: newservice
       configure_priv: .*
       read_priv: .*
       write_priv: .*

   # Add mapping
   newservice_rabbitmq_password: "{{ vault_newservice_rabbitmq_password }}"
   ```

2. Add password to `group_vars/all/vault.yml`:
   ```bash
   ansible-vault edit inventory/group_vars/all/vault.yml
   # Add: vault_newservice_rabbitmq_password: "secure_password"
   ```

3. Run the playbook:
   ```bash
   ansible-playbook rabbitmq/deploy.yml
   ```

The provisioning tasks are idempotent: existing vhosts and users are skipped, and only new ones are created.

### Rotate User Password

```bash
# Change the password in the running broker
docker exec rabbitmq rabbitmqctl change_password kairos "new_password"

# Update the vault so the next playbook run uses the same value
ansible-vault edit inventory/group_vars/all/vault.yml
# Update vault_kairos_rabbitmq_password
```
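
A strong replacement password can be generated on the host before rotation. The snippet below is one possible approach rather than the project's prescribed method: `generate_password` is a hypothetical helper, and restricting the output to alphanumeric characters is a design choice that keeps the password safe to embed in AMQP URIs without percent-encoding.

```bash
# generate_password [LENGTH]: print LENGTH (default 32) random alphanumeric
# characters read from the kernel's random source.
generate_password() {
  LC_ALL=C tr -dc 'A-Za-z0-9' </dev/urandom | head -c "${1:-32}"
}

new_password="$(generate_password)"
echo "generated a ${#new_password}-character password"
```

The value in `new_password` can then be passed to `rabbitmqctl change_password` and stored in the vault as shown above.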

### Clear All Messages in a Queue

```bash
docker exec rabbitmq rabbitmqctl purge_queue queue_name -p kairos
```

## Troubleshooting

### Container Won't Start

Check Docker logs for errors:
```bash
docker logs rabbitmq
```

Common issues:
- Port conflict on 5672 or 25582
- Permission issues on `/srv/rabbitmq` directory
- Corrupted data volume
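
To rule out a port conflict, the listening sockets on Oberon can be inspected while the container is stopped. This is a small sketch that filters `ss -ltn` output; the `port_listed` helper is hypothetical and assumes the usual column layout in which the local address is the fourth field.

```bash
# port_listed PORT: read `ss -ltn` output on stdin and succeed if some socket
# is listening on PORT (matches IPv4, IPv6, and wildcard addresses).
port_listed() {
  awk -v port="$1" '
    $1 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
    END { exit !found }
  '
}
```

For example, `ss -ltn | port_listed 5672 && echo "5672 is already in use"` reports a conflict on the AMQP port, and the same check with `25582` covers the management port.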

### Cannot Connect to Management UI

1. Verify port mapping: `docker port rabbitmq`
2. Check firewall rules on Oberon
3. Verify container is running: `docker ps | grep rabbitmq`
4. Check if management plugin is enabled (should be in `-management-alpine` image)

### User Authentication Failing

```bash
# List users and verify they exist
docker exec rabbitmq rabbitmqctl list_users

# Check user permissions
docker exec rabbitmq rabbitmqctl list_user_permissions kairos

# Verify vhost exists
docker exec rabbitmq rabbitmqctl list_vhosts
```

### High Memory Usage

RabbitMQ may consume significant memory with many messages. Check:
```bash
# Memory usage
docker exec rabbitmq rabbitmqctl status | grep memory

# Queue depths
docker exec rabbitmq rabbitmqctl list_queues -p kairos messages

# Consider setting memory limits in docker-compose.yml
```
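
When many queues exist, it can help to surface only the ones whose backlog exceeds a threshold. The filter below is a sketch: `deep_queues` is a hypothetical helper, the threshold of 1000 is arbitrary, and the input is assumed to be the tab-separated `name` and `messages` columns that `rabbitmqctl list_queues` prints.

```bash
# deep_queues THRESHOLD: read "<name><TAB><messages>" lines on stdin and print
# the queues holding more than THRESHOLD messages. Header or informational
# lines are skipped because their second column is not a number.
deep_queues() {
  awk -F '\t' -v limit="$1" '
    $2 ~ /^[0-9]+$/ && $2 + 0 > limit + 0 { print $1 ": " $2 }
  '
}
```

On Oberon this could be used as `docker exec rabbitmq rabbitmqctl -q list_queues -p kairos name messages | deep_queues 1000`.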

## Security Considerations

### Network Isolation

- RabbitMQ AMQP port (5672) is **only** exposed on the Incus network (`10.10.0.0/16`)
- Management UI (25582) is exposed externally for administration
- For production: Place HAProxy in front of management UI with authentication
- Consider enabling SSL/TLS for AMQP connections in production

### Credential Management

- ✔ All passwords stored in Ansible Vault
- ✔ Service accounts have isolated virtual hosts
- ✔ Default admin account uses strong password from vault
- ⚠️ Credentials passed as environment variables (visible in `docker inspect`)
- Consider using Docker secrets or Vault integration for enhanced security

### Virtual Host Isolation

Each service operates in its own virtual host:
- Messages cannot cross between vhosts
- Resource quotas can be applied per-vhost
- Credentials can be rotated without affecting other services

## Future Enhancements

- [ ] **SSL/TLS Support**: Enable encrypted AMQP connections
- [ ] **Cluster Mode**: Add additional RabbitMQ nodes for high availability
- [ ] **Federation**: Connect to external RabbitMQ clusters
- [ ] **Prometheus Exporter**: Add metrics export for Grafana monitoring
- [ ] **Shovel Plugin**: Configure message forwarding between brokers
- [ ] **HAProxy Integration**: Reverse proxy for management UI with authentication
- [ ] **Docker Secrets**: Replace environment variables with Docker secrets

## References

- [RabbitMQ Official Documentation](https://www.rabbitmq.com/documentation.html)
- [RabbitMQ Management Plugin](https://www.rabbitmq.com/management.html)
- [AMQP 0-9-1 Protocol Reference](https://www.rabbitmq.com/amqp-0-9-1-reference.html)
- [Virtual Hosts](https://www.rabbitmq.com/vhosts.html)
- [Access Control (Authentication, Authorisation)](https://www.rabbitmq.com/access-control.html)
- [Monitoring RabbitMQ](https://www.rabbitmq.com/monitoring.html)

---

**Last Updated**: February 12, 2026
**Project**: Agathos Infrastructure
**Approval**: Red Panda Approved™