# RabbitMQ - Message Broker Infrastructure

## Overview

RabbitMQ 3 (`rabbitmq:3-management-alpine`) is the central message broker for the Ouranos sandbox. It provides AMQP 0-9-1 message queuing for asynchronous communication between services, and the image includes the management web interface for monitoring and administration.

**Host:** Oberon (container_orchestration)
**Role:** Message broker for event-driven architectures
**AMQP Port:** 5672
**Management Port:** 25582
**Syslog Port:** 51402 (Alloy)

## Architecture

```
┌─────────────────────────────────────────────────────────┐
│                       Oberon Host                       │
│                                                         │
│  ┌───────────────────────────────────────────────────┐  │
│  │            RabbitMQ Container (Docker)            │  │
│  │                                                   │  │
│  │  ┌──────────────┬──────────────┐                  │  │
│  │  │ VHost        │ VHost        │                  │  │
│  │  │ "kairos"     │ "spelunker"  │                  │  │
│  │  │              │              │                  │  │
│  │  │ User:        │ User:        │                  │  │
│  │  │ kairos       │ spelunker    │                  │  │
│  │  │ (full perm)  │ (full perm)  │                  │  │
│  │  └──────────────┴──────────────┘                  │  │
│  │                                                   │  │
│  │  Default Admin: rabbitmq                          │  │
│  │  (all vhosts, admin privileges)                   │  │
│  │                                                   │  │
│  └───────────────────────────────────────────────────┘  │
│                                                         │
│  Ports: 5672 (AMQP), 25582 (Management)                 │
│  Logs: syslog → Alloy:51402 → Loki                      │
└─────────────────────────────────────────────────────────┘

┌──────────────┐            ┌──────────────┐
│   Kairos     │───AMQP────▶│   kairos/    │
│   (future)   │            │   (vhost)    │
└──────────────┘            └──────────────┘

┌──────────────┐            ┌──────────────┐
│  Spelunker   │───AMQP────▶│  spelunker/  │
│   (future)   │            │   (vhost)    │
└──────────────┘            └──────────────┘
```

**Note**: Kairos and Spelunker are future services. The RabbitMQ infrastructure is pre-provisioned with dedicated virtual hosts and users, ready for when these services are deployed.

## Terraform Resources

### Oberon Host Definition

RabbitMQ runs on Oberon, defined in `terraform/containers.tf`:

| Attribute | Value |
|-----------|-------|
| Description | Docker Host + MCP Switchboard - King of Fairies orchestrating containers |
| Image | noble |
| Role | container_orchestration |
| Security Nesting | `true` (required for Docker) |
| AppArmor Profile | unconfined |
| Proxy Devices | `25580-25599 → 25580-25599` (application port range) |

### Container Dependencies

| Resource | Relationship |
|----------|--------------|
| Docker | RabbitMQ runs as a Docker container on Oberon |
| Alloy | Collects syslog logs from RabbitMQ on port 51402 |
| Prospero | Receives logs via Loki for observability |

## Ansible Deployment

### Playbook

```bash
cd ansible
ansible-playbook rabbitmq/deploy.yml
```

### Files

| File | Purpose |
|------|---------|
| `rabbitmq/deploy.yml` | Main deployment playbook |
| `rabbitmq/docker-compose.yml.j2` | Docker Compose template |

### Deployment Steps

The playbook performs the following operations:

1. **User and Group Management**
   - Creates `rabbitmq` system user and group
   - Adds `ponos` user to `rabbitmq` group for operational access

2. **Directory Setup**
   - Creates service directory at `/srv/rabbitmq`
   - Sets ownership to `rabbitmq:rabbitmq`
   - Configures permissions (mode 750)

3. **Docker Compose Deployment**
   - Templates `docker-compose.yml` from Jinja2 template
   - Deploys RabbitMQ container with `docker compose up`

4. **rabbitmqadmin CLI Setup**
   - Extracts `rabbitmqadmin` from container to `/usr/local/bin/`
   - Makes it executable for host-level management

5. **Automatic Provisioning** (idempotent)
   - Creates virtual hosts: `kairos`, `spelunker`
   - Creates users with passwords from vault
   - Sets user tags (currently none, expandable for admin/monitoring roles)
   - Configures full permissions for each user on their respective vhost

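The playbook's provisioning tasks are not reproduced in this document. As a rough manual equivalent of step 5, the sketch below uses `rabbitmqctl` inside the container. The function name `provision_service` is hypothetical, and the existence checks are one assumed way of keeping the operation idempotent, not necessarily how the playbook does it.

```bash
# Hypothetical helper: create a vhost, a user, and full permissions for one service.
# Assumes the container is named "rabbitmq", as in the compose template.
provision_service() {
    name="$1"
    password="$2"

    # Create the vhost only if it is not already listed.
    if ! docker exec rabbitmq rabbitmqctl -q list_vhosts name | grep -qx "$name"; then
        docker exec rabbitmq rabbitmqctl add_vhost "$name"
    fi

    # Create the user only if it is not already listed (first column is the name).
    if ! docker exec rabbitmq rabbitmqctl -q list_users | awk '{print $1}' | grep -qx "$name"; then
        docker exec rabbitmq rabbitmqctl add_user "$name" "$password"
    fi

    # set_permissions overwrites any existing grant, so repeating it is harmless.
    # Argument order is: configure, write, read.
    docker exec rabbitmq rabbitmqctl set_permissions -p "$name" "$name" ".*" ".*" ".*"
}

# Example: provision_service kairos "$KAIROS_PASSWORD"
```

Running the function twice for the same service should leave the broker unchanged the second time, which mirrors the idempotent behaviour described for the playbook.
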
### Variables

#### Host Variables (`host_vars/oberon.incus.yml`)

| Variable | Description | Default |
|----------|-------------|---------|
| `rabbitmq_user` | Service user | `rabbitmq` |
| `rabbitmq_group` | Service group | `rabbitmq` |
| `rabbitmq_directory` | Installation directory | `/srv/rabbitmq` |
| `rabbitmq_amqp_port` | AMQP protocol port | `5672` |
| `rabbitmq_management_port` | Management web interface | `25582` |
| `rabbitmq_password` | Default admin password | `{{ vault_rabbitmq_password }}` |

#### Group Variables (`group_vars/all/vars.yml`)

Defines the provisioning configuration for vhosts, users, and permissions:

```yaml
rabbitmq_vhosts:
  - name: kairos
  - name: spelunker

rabbitmq_users:
  - name: kairos
    password: "{{ kairos_rabbitmq_password }}"
    tags: []
  - name: spelunker
    password: "{{ spelunker_rabbitmq_password }}"
    tags: []

rabbitmq_permissions:
  - vhost: kairos
    user: kairos
    configure_priv: .*
    read_priv: .*
    write_priv: .*
  - vhost: spelunker
    user: spelunker
    configure_priv: .*
    read_priv: .*
    write_priv: .*
```

**Vault Variable Mappings**:

```yaml
kairos_rabbitmq_password: "{{ vault_kairos_rabbitmq_password }}"
spelunker_rabbitmq_password: "{{ vault_spelunker_rabbitmq_password }}"
```

#### Vault Variables (`group_vars/all/vault.yml`)

All sensitive credentials are encrypted in the vault:

| Variable | Description |
|----------|-------------|
| `vault_rabbitmq_password` | Default admin account password |
| `vault_kairos_rabbitmq_password` | Kairos service user password |
| `vault_spelunker_rabbitmq_password` | Spelunker service user password |

## Configuration

### Docker Compose Template

The deployment uses a minimal Docker Compose configuration:

```yaml
services:
  rabbitmq:
    image: rabbitmq:3-management-alpine
    container_name: rabbitmq
    restart: unless-stopped
    ports:
      - "{{rabbitmq_amqp_port}}:5672"         # AMQP protocol
      - "{{rabbitmq_management_port}}:15672"  # Management UI
    volumes:
      - rabbitmq_data:/var/lib/rabbitmq       # Persistent data
    environment:
      RABBITMQ_DEFAULT_USER: "{{rabbitmq_user}}"
      RABBITMQ_DEFAULT_PASS: "{{rabbitmq_password}}"
    logging:
      driver: syslog
      options:
        syslog-address: "tcp://127.0.0.1:{{rabbitmq_syslog_port}}"
        syslog-format: "{{syslog_format}}"
        tag: "rabbitmq"

# Named volumes must be declared at the top level
volumes:
  rabbitmq_data:
```

`RABBITMQ_DEFAULT_USER` and `RABBITMQ_DEFAULT_PASS` only take effect when the broker initializes an empty data directory; changing them later does not alter an existing account.

### Data Persistence

- **Volume**: `rabbitmq_data` (Docker-managed volume)
- **Location**: `/var/lib/rabbitmq` inside container
- **Contents**:
  - Message queues and persistent messages
  - Virtual host metadata
  - User credentials and permissions
  - Policies and runtime parameters

## Virtual Hosts and Users

### Default Admin Account

**Username**: `rabbitmq`
**Password**: `{{ vault_rabbitmq_password }}` (from vault)
**Privileges**: Full administrative access to all virtual hosts

The default admin account is created automatically when the container starts and can access:

- All virtual hosts (including `/`, `kairos`, `spelunker`)
- Management web interface
- All RabbitMQ management commands

### Kairos Virtual Host

**VHost**: `kairos`
**User**: `kairos`
**Password**: `{{ vault_kairos_rabbitmq_password }}`
**Permissions**: Full (configure, read, write) on all resources matching `.*`

Intended for the **Kairos** service (event-driven time-series processing system, planned future deployment).

### Spelunker Virtual Host

**VHost**: `spelunker`
**User**: `spelunker`
**Password**: `{{ vault_spelunker_rabbitmq_password }}`
**Permissions**: Full (configure, read, write) on all resources matching `.*`

Intended for the **Spelunker** service (log exploration and analytics platform, planned future deployment).

### Permission Model

Both service users have full access within their respective virtual hosts:

| Permission | Pattern | Description |
|------------|---------|-------------|
| Configure | `.*` | Declare and delete queues and exchanges |
| Write | `.*` | Publish messages to exchanges; bind a queue (write on the queue) |
| Read | `.*` | Consume messages from queues; bind a queue (read on the exchange) |

This isolation ensures:

- ✔ Each service operates in its own namespace
- ✔ Messages cannot cross between services
- ✔ Resource limits can be applied per-vhost
- ✔ Service credentials can be rotated independently

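The deployment described here does not configure per-vhost resource limits. If they are wanted, a minimal sketch using `rabbitmqctl set_vhost_limits` could look like this; the helper name `set_vhost_caps` and the example numbers are illustrative assumptions, not values used by the playbook.

```bash
# Hypothetical helper: cap connections and queues for one vhost.
# The numbers passed in are examples, not values used by this deployment.
set_vhost_caps() {
    vhost="$1"
    max_connections="$2"
    max_queues="$3"

    # set_vhost_limits takes a JSON document; either key may be omitted.
    limits=$(printf '{"max-connections": %d, "max-queues": %d}' "$max_connections" "$max_queues")
    docker exec rabbitmq rabbitmqctl set_vhost_limits -p "$vhost" "$limits"
}

# Example: set_vhost_caps kairos 64 256
# Inspect with: docker exec rabbitmq rabbitmqctl list_vhost_limits -p kairos
```
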
## Access and Administration

### Management Web Interface

**URL**: `http://oberon.incus:25582`
**External**: `http://{oberon-ip}:25582`
**Login**: `rabbitmq` / `{{ vault_rabbitmq_password }}`

Features:

- Queue inspection and message browsing
- Exchange and binding management
- Connection and channel monitoring
- User and permission administration
- Virtual host management
- Performance metrics and charts

### CLI Administration

#### On Host Machine (using rabbitmqadmin)

```bash
# List vhosts
rabbitmqadmin -H oberon.incus -P 25582 -u rabbitmq -p PASSWORD list vhosts

# List queues in a vhost
rabbitmqadmin -H oberon.incus -P 25582 -u rabbitmq -p PASSWORD -V kairos list queues

# Publish a test message
rabbitmqadmin -H oberon.incus -P 25582 -u rabbitmq -p PASSWORD -V kairos publish \
  exchange=amq.default routing_key=test payload="test message"
```

#### Inside Container

```bash
# Enter the container
docker exec -it rabbitmq /bin/sh

# List vhosts
rabbitmqctl list_vhosts

# List users
rabbitmqctl list_users

# List permissions for a user
rabbitmqctl list_user_permissions kairos

# List queues in a vhost
rabbitmqctl list_queues -p kairos

# Check node status
rabbitmqctl status
```

### Connection Strings

#### AMQP Connection (from the Oberon host)

```
amqp://kairos:PASSWORD@localhost:5672/kairos
amqp://spelunker:PASSWORD@localhost:5672/spelunker
```

`localhost` reaches the broker only from processes running directly on Oberon. Inside another Docker container, `localhost` is that container's own loopback interface, so such a client needs a shared Docker network or the host's address instead.

#### AMQP Connection (from other hosts)

```
amqp://kairos:PASSWORD@oberon.incus:5672/kairos
amqp://spelunker:PASSWORD@oberon.incus:5672/spelunker
```

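Generated passwords may contain characters such as `@`, `/`, or `:` that are not allowed unescaped in the credentials part of a URI. The sketch below percent-encodes the credentials before assembling a connection string in the format shown above. The function names are hypothetical, and the encoder assumes ASCII input.

```bash
# Percent-encode every character outside the RFC 3986 "unreserved" set.
# Assumes ASCII input; multi-byte characters are not handled.
urlencode() {
    s="$1"
    out=""
    while [ -n "$s" ]; do
        c="${s%"${s#?}"}"   # first character of the remaining string
        s="${s#?}"          # drop that character
        case "$c" in
            [A-Za-z0-9._~-]) out="$out$c" ;;
            *) out="$out$(printf '%%%02X' "'$c")" ;;
        esac
    done
    printf '%s' "$out"
}

# Build an AMQP URI: amqp_uri USER PASSWORD HOST PORT VHOST
amqp_uri() {
    printf 'amqp://%s:%s@%s:%s/%s\n' \
        "$(urlencode "$1")" "$(urlencode "$2")" "$3" "$4" "$(urlencode "$5")"
}

# Example:
#   amqp_uri kairos 'p@ss/w:rd' oberon.incus 5672 kairos
#   → amqp://kairos:p%40ss%2Fw%3Ard@oberon.incus:5672/kairos
```
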
#### Management API

```
http://rabbitmq:PASSWORD@oberon.incus:25582/api/
```

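The management API returns JSON over HTTP with basic authentication. A minimal sketch of querying it with `curl` follows, assuming the admin password is supplied through an environment variable (called `RABBITMQ_PASSWORD` here purely for this example) rather than embedded in the URL. The wrapper name `mgmt_api` is hypothetical.

```bash
# Hypothetical wrapper around the management API on Oberon.
# RABBITMQ_PASSWORD is an assumed environment variable, not part of the deployment.
mgmt_api() {
    path="$1"
    curl --silent --show-error --fail \
        --user "rabbitmq:${RABBITMQ_PASSWORD}" \
        "http://oberon.incus:25582/api/${path}"
}

# Examples:
#   mgmt_api overview         # broker version, message rates, object totals
#   mgmt_api vhosts           # all virtual hosts
#   mgmt_api queues/kairos    # queues in the kairos vhost
```

With `--fail`, `curl` exits non-zero on HTTP errors such as 401, so the wrapper can be used directly in conditionals.
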
## Monitoring and Observability

### Logging

- **Driver**: syslog (Docker logging driver)
- **Destination**: `tcp://127.0.0.1:51402` (Alloy on Oberon)
- **Tag**: `rabbitmq`
- **Format**: `{{ syslog_format }}` (from Alloy configuration)

Logs are collected by Alloy and forwarded to Loki on Prospero for centralized log aggregation.

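To confirm that the running container actually uses the syslog driver, the logging configuration Docker recorded for it can be inspected. This is a small sketch with a hypothetical function name; it only checks the driver type, not whether Alloy is receiving the messages.

```bash
# Hypothetical check: verify that the rabbitmq container logs via syslog.
check_log_driver() {
    driver=$(docker inspect --format '{{.HostConfig.LogConfig.Type}}' rabbitmq)
    if [ "$driver" = "syslog" ]; then
        echo "ok: rabbitmq logs through the syslog driver"
    else
        echo "unexpected logging driver: ${driver:-none}" >&2
        return 1
    fi
}
```
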
### Key Metrics (via Management UI)

| Metric | Description |
|--------|-------------|
| Connections | Active AMQP client connections |
| Channels | Active channels within connections |
| Queues | Total queues across all vhosts |
| Messages | Ready, unacknowledged, and total message counts |
| Message Rate | Publish/deliver rates (msg/s) |
| Memory Usage | Container memory consumption |
| Disk Usage | Persistent storage utilization |

### Health Check

```bash
# Check if RabbitMQ is running
docker ps | grep rabbitmq

# Check container logs
docker logs rabbitmq

# Check RabbitMQ node status
docker exec rabbitmq rabbitmqctl status

# Check cluster health (single-node, should show 1 node)
docker exec rabbitmq rabbitmqctl cluster_status
```

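For scripted checks, `rabbitmq-diagnostics` offers commands that exit non-zero on failure, which is easier to automate than reading the output of `rabbitmqctl status`. A minimal sketch, with a hypothetical function name:

```bash
# Hypothetical health probe: succeeds only if all three checks pass.
rabbitmq_healthy() {
    # ping: the Erlang runtime is up and reachable
    docker exec rabbitmq rabbitmq-diagnostics -q ping || return 1
    # check_running: the RabbitMQ application itself is started
    docker exec rabbitmq rabbitmq-diagnostics -q check_running || return 1
    # check_local_alarms: no memory or disk alarms on this node
    docker exec rabbitmq rabbitmq-diagnostics -q check_local_alarms || return 1
    echo "rabbitmq: healthy"
}

# Example: rabbitmq_healthy || echo "rabbitmq needs attention" >&2
```
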
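## Operational Tasks

### Restart RabbitMQ

```bash
# Via Docker Compose
cd /srv/rabbitmq
sudo -u rabbitmq docker compose restart

# Via Docker directly
docker restart rabbitmq
```

### Recreate Container (preserves data)

```bash
cd /srv/rabbitmq
sudo -u rabbitmq docker compose down
sudo -u rabbitmq docker compose up -d
```

`docker compose down` without `-v` keeps the `rabbitmq_data` volume. Note that RabbitMQ names its data directory after the node name, which defaults to the container hostname. The template does not set `hostname:`, so a recreated container may come up under a new hostname and start with an empty database even though the old data is still in the volume. Setting a fixed `hostname` in the compose template avoids this.
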
### Add New Virtual Host and User

1. Append the new entries to the existing lists in `group_vars/all/vars.yml` (only the additions are shown):

   ```yaml
   rabbitmq_vhosts:
     - name: newservice

   rabbitmq_users:
     - name: newservice
       password: "{{ newservice_rabbitmq_password }}"
       tags: []

   rabbitmq_permissions:
     - vhost: newservice
       user: newservice
       configure_priv: .*
       read_priv: .*
       write_priv: .*

   # Add the vault mapping
   newservice_rabbitmq_password: "{{ vault_newservice_rabbitmq_password }}"
   ```

2. Add password to `group_vars/all/vault.yml`:

   ```bash
   ansible-vault edit inventory/group_vars/all/vault.yml
   # Add: vault_newservice_rabbitmq_password: "secure_password"
   ```

3. Run the playbook:

   ```bash
   ansible-playbook rabbitmq/deploy.yml
   ```

The provisioning tasks are idempotent: existing vhosts and users are skipped, and only new ones are created.

### Rotate User Password

```bash
# Change the password in the running broker
docker exec rabbitmq rabbitmqctl change_password kairos "new_password"

# Update vault_kairos_rabbitmq_password to match
ansible-vault edit inventory/group_vars/all/vault.yml
```

Clients of the affected user must then be reconfigured with the new password.

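The broker-side half of a rotation can be wrapped in a small script. This sketch generates a random alphanumeric password and applies it with `rabbitmqctl change_password`. The function names are hypothetical, and the vault still has to be updated by hand afterwards so that it remains the source of truth.

```bash
# Generate a 32-character alphanumeric password from the kernel's random source.
# 96 random bytes give 128 base64 characters, far more than needed after filtering.
generate_password() {
    head -c 96 /dev/urandom | base64 | tr -dc 'A-Za-z0-9' | head -c 32
}

# Hypothetical helper: set a fresh password for one RabbitMQ user and print it once.
rotate_password() {
    user="$1"
    new_password=$(generate_password)
    docker exec rabbitmq rabbitmqctl change_password "$user" "$new_password" || return 1
    printf 'New password for %s: %s\n' "$user" "$new_password"
}

# Example: rotate_password kairos
```

Restricting the password to letters and digits avoids quoting and URL-encoding problems when it is later pasted into a connection string.
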
### Clear All Messages in a Queue

```bash
docker exec rabbitmq rabbitmqctl purge_queue queue_name -p kairos
```

## Troubleshooting

### Container Won't Start

Check Docker logs for errors:

```bash
docker logs rabbitmq
```

Common issues:

- Port conflict on 5672 or 25582
- Permission issues on `/srv/rabbitmq` directory
- Corrupted data volume

### Cannot Connect to Management UI

1. Verify port mapping: `docker port rabbitmq`
2. Check firewall rules on Oberon
3. Verify container is running: `docker ps | grep rabbitmq`
4. Check if management plugin is enabled (should be in `-management-alpine` image)

### User Authentication Failing

```bash
# List users and verify they exist
docker exec rabbitmq rabbitmqctl list_users

# Check user permissions
docker exec rabbitmq rabbitmqctl list_user_permissions kairos

# Verify vhost exists
docker exec rabbitmq rabbitmqctl list_vhosts
```

### High Memory Usage

RabbitMQ may consume significant memory with many messages. Check:

```bash
# Memory usage
docker exec rabbitmq rabbitmqctl status | grep memory

# Queue depths
docker exec rabbitmq rabbitmqctl list_queues -p kairos messages

# Consider setting memory limits in docker-compose.yml
```

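When a vhost holds many queues, filtering the `list_queues` output by depth makes the large ones easier to spot. A sketch, assuming an operator-chosen threshold; the function name and the default of 10000 are illustrative.

```bash
# Read tab-separated "name<TAB>messages" rows on stdin and print queues
# whose depth is at or above the threshold. Non-numeric rows (headers) are skipped.
deep_queues() {
    threshold="${1:-10000}"
    awk -F '\t' -v limit="$threshold" \
        '$2 ~ /^[0-9]+$/ && $2 + 0 >= limit { print $1 ": " $2 }'
}

# Usage on Oberon:
#   docker exec rabbitmq rabbitmqctl -q list_queues -p kairos name messages | deep_queues 5000
```
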
## Security Considerations

### Network Isolation

- The AMQP port (5672) lies outside Oberon's proxied port range (25580-25599), so it is reachable **only** from the Incus network (`10.10.0.0/16`)
- Management UI (25582) is exposed externally for administration
- For production: place HAProxy in front of the management UI with authentication
- Consider enabling TLS for AMQP connections in production

### Credential Management

- ✔ All passwords stored in Ansible Vault
- ✔ Service accounts have isolated virtual hosts
- ✔ Default admin account uses strong password from vault
- ⚠️ Credentials passed as environment variables (visible in `docker inspect`)
  - Consider using Docker secrets or Vault integration for enhanced security

### Virtual Host Isolation

Each service operates in its own virtual host:

- Messages cannot cross between vhosts
- Resource quotas can be applied per-vhost
- Credentials can be rotated without affecting other services

## Future Enhancements

- [ ] **SSL/TLS Support**: Enable encrypted AMQP connections
- [ ] **Cluster Mode**: Add additional RabbitMQ nodes for high availability
- [ ] **Federation**: Connect to external RabbitMQ clusters
- [ ] **Prometheus Exporter**: Add metrics export for Grafana monitoring
- [ ] **Shovel Plugin**: Configure message forwarding between brokers
- [ ] **HAProxy Integration**: Reverse proxy for management UI with authentication
- [ ] **Docker Secrets**: Replace environment variables with Docker secrets

## References

- [RabbitMQ Official Documentation](https://www.rabbitmq.com/documentation.html)
- [RabbitMQ Management Plugin](https://www.rabbitmq.com/management.html)
- [AMQP 0-9-1 Protocol Reference](https://www.rabbitmq.com/amqp-0-9-1-reference.html)
- [Virtual Hosts](https://www.rabbitmq.com/vhosts.html)
- [Access Control (Authentication, Authorisation)](https://www.rabbitmq.com/access-control.html)
- [Monitoring RabbitMQ](https://www.rabbitmq.com/monitoring.html)

---

**Last Updated**: February 12, 2026
**Project**: Ouranos Infrastructure
**Approval**: Red Panda Approved™