docs: rewrite README with structured overview and quick start guide

Replaces the minimal project description with a comprehensive README
including a component overview table, quick start instructions, common
Ansible operations, and links to detailed documentation. Aligns with
Red Panda Approval™ standards.
2026-03-03 12:49:06 +00:00
parent c7be03a743
commit b4d60f2f38
219 changed files with 34586 additions and 2 deletions

docs/cerbot.md Normal file

@@ -0,0 +1,191 @@
# Certbot DNS-01 with Namecheap
This playbook deploys certbot with the Namecheap DNS plugin for DNS-01 validation, enabling wildcard SSL certificates.
## Overview
| Component | Value |
|-----------|-------|
| Installation | Python virtualenv in `/srv/certbot/.venv` |
| DNS Plugin | `certbot-dns-namecheap` |
| Validation | DNS-01 (supports wildcards) |
| Renewal | Systemd timer (twice daily) |
| Certificate Output | `/etc/haproxy/certs/{domain}.pem` |
| Metrics | Prometheus textfile collector |
## Deployments
### Titania (ouranos.helu.ca)
Production deployment providing Let's Encrypt certificates for the Agathos sandbox HAProxy reverse proxy.
| Setting | Value |
|---------|-------|
| **Host** | titania.incus |
| **Domain** | ouranos.helu.ca |
| **Wildcard** | *.ouranos.helu.ca |
| **Email** | webmaster@helu.ca |
| **HAProxy** | Port 443 (HTTPS), Port 80 (HTTP redirect) |
| **Renewal** | Twice daily, automatic HAProxy reload |
### Other Deployments
The playbook can be deployed to any host with HAProxy. See the example configuration for hippocamp.helu.ca (d.helu.ca domain) below.
## Prerequisites
1. **Namecheap API Access** enabled on your account
2. **Namecheap API key** generated
3. **IP whitelisted** in Namecheap API settings
4. **Ansible Vault** configured with Namecheap credentials
## Setup
### 1. Add Secrets to Ansible Vault
Add the Namecheap credentials to `inventory/group_vars/all/vault.yml` (paths are relative to the `ansible/` directory):
```bash
ansible-vault edit inventory/group_vars/all/vault.yml
```
Add the following variables:
```yaml
vault_namecheap_username: "your_namecheap_username"
vault_namecheap_api_key: "your_namecheap_api_key"
```
Map these in `inventory/group_vars/all/vars.yml`:
```yaml
namecheap_username: "{{ vault_namecheap_username }}"
namecheap_api_key: "{{ vault_namecheap_api_key }}"
```
### 2. Configure Host Variables
For Titania, the configuration is in `inventory/host_vars/titania.incus.yml`:
```yaml
services:
- certbot
- haproxy
# ...
certbot_email: webmaster@helu.ca
certbot_cert_name: ouranos.helu.ca
certbot_domains:
- "*.ouranos.helu.ca"
- "ouranos.helu.ca"
```
### 3. Deploy
```bash
cd ansible
ansible-playbook certbot/deploy.yml --limit titania.incus
```
## Files Created
| Path | Purpose |
|------|---------|
| `/srv/certbot/.venv/` | Python virtualenv with certbot |
| `/srv/certbot/config/` | Certbot configuration and certificates |
| `/srv/certbot/credentials/namecheap.ini` | Namecheap API credentials (600 perms) |
| `/srv/certbot/hooks/renewal-hook.sh` | Post-renewal script |
| `/srv/certbot/hooks/cert-metrics.sh` | Prometheus metrics script |
| `/etc/haproxy/certs/ouranos.helu.ca.pem` | Combined cert for HAProxy (Titania) |
| `/etc/systemd/system/certbot-renew.service` | Renewal service unit |
| `/etc/systemd/system/certbot-renew.timer` | Twice-daily renewal timer |
## Renewal Process
1. Systemd timer triggers at 00:00 and 12:00 (with random delay up to 1 hour)
2. Certbot checks if certificate needs renewal (within 30 days of expiry)
3. If renewal needed:
   - Creates DNS TXT record via Namecheap API
   - Waits 120 seconds for propagation
   - Validates and downloads new certificate
   - Runs `renewal-hook.sh`
4. Renewal hook:
   - Combines fullchain + privkey into HAProxy format
   - Reloads HAProxy via `docker compose kill -s HUP haproxy`
   - Updates Prometheus metrics
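
The "combine" step of the renewal hook can be sketched as follows. This is a minimal illustration, not the contents of the deployed `renewal-hook.sh`: the function name `combine_for_haproxy`, the temporary-file rename, and the `chmod 600` are assumptions made for the example.

```shell
# Hypothetical helper: combine_for_haproxy LIVE_DIR OUTPUT_PEM
# LIVE_DIR is a certbot lineage directory containing fullchain.pem and privkey.pem.
combine_for_haproxy() {
  local live_dir="$1"
  local out="$2"
  local tmp

  # Build the file next to its destination so the final rename stays on one filesystem.
  tmp="$(mktemp "${out}.XXXXXX")" || return 1

  # HAProxy reads the certificate chain and the private key from a single PEM file.
  if cat "${live_dir}/fullchain.pem" "${live_dir}/privkey.pem" > "$tmp"; then
    chmod 600 "$tmp"      # assumption: restrict access because the file holds the key
    mv "$tmp" "$out"      # rename into place so a partial file is never visible
  else
    rm -f "$tmp"
    return 1
  fi
}
```

For Titania the call would look something like `combine_for_haproxy /srv/certbot/config/live/ouranos.helu.ca /etc/haproxy/certs/ouranos.helu.ca.pem`, assuming certbot's usual `live/<cert-name>` layout under the config directory.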
## Prometheus Metrics
Metrics are written to `/var/lib/prometheus/node-exporter/ssl_cert.prom`:
| Metric | Description |
|--------|-------------|
| `ssl_certificate_expiry_timestamp` | Unix timestamp when cert expires |
| `ssl_certificate_expiry_seconds` | Seconds until cert expires |
| `ssl_certificate_valid` | 1 if valid, 0 if expired/missing |
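
To make the three metrics in the table concrete, here is a rough sketch of how such lines could be produced in the textfile format. It is not the deployed `cert-metrics.sh`: the function name `emit_cert_metrics` is hypothetical, and the choice to emit only the validity flag for a missing certificate is an assumption. The `domain` label is taken from the alert rule in this section.

```shell
# Hypothetical helper: emit_cert_metrics DOMAIN EXPIRY_EPOCH NOW_EPOCH
# Pass an empty EXPIRY_EPOCH when the certificate file is missing.
emit_cert_metrics() {
  local domain="$1"
  local expiry="$2"
  local now="$3"
  local labels="{domain=\"${domain}\"}"

  if [ -z "$expiry" ]; then
    # Assumption: with no certificate, only the validity flag is reported.
    printf 'ssl_certificate_valid%s 0\n' "$labels"
    return 0
  fi

  local remaining=$(( expiry - now ))
  local valid=1
  if [ "$remaining" -le 0 ]; then
    valid=0   # already expired
  fi

  printf 'ssl_certificate_expiry_timestamp%s %d\n' "$labels" "$expiry"
  printf 'ssl_certificate_expiry_seconds%s %d\n' "$labels" "$remaining"
  printf 'ssl_certificate_valid%s %d\n' "$labels" "$valid"
}
```

A script using this would redirect the output to a temporary file and rename it to `ssl_cert.prom`, so that node_exporter does not read a half-written file.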
Example alert rule:
```yaml
- alert: SSLCertificateExpiringSoon
  expr: ssl_certificate_expiry_seconds < 604800  # 7 days
  for: 1h
  labels:
    severity: warning
  annotations:
    summary: "SSL certificate expiring soon"
    description: "Certificate for {{ $labels.domain }} expires in {{ $value | humanizeDuration }}"
```
## Troubleshooting
### View Certificate Status
```bash
# Check certificate expiry (Titania example)
openssl x509 -enddate -noout -in /etc/haproxy/certs/ouranos.helu.ca.pem

# Check certbot certificates
sudo -u certbot /srv/certbot/.venv/bin/certbot certificates \
  --config-dir /srv/certbot/config
```
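
When reading the expiry date, it can help to know whether the certificate has already entered the 30-day window in which certbot will renew it. The sketch below only illustrates that arithmetic; `in_renewal_window` is a hypothetical helper, not part of the playbook, and the commented usage assumes GNU `date`.

```shell
# Hypothetical helper: in_renewal_window EXPIRY_EPOCH NOW_EPOCH [WINDOW_DAYS]
# Exits 0 when the certificate expires within WINDOW_DAYS (default 30), else 1.
in_renewal_window() {
  local expiry="$1"
  local now="$2"
  local days="${3:-30}"
  local window=$(( days * 86400 ))   # days converted to seconds

  [ $(( expiry - now )) -le "$window" ]
}

# Example usage (assumes GNU date can parse the openssl notAfter string):
#   end="$(openssl x509 -enddate -noout -in /etc/haproxy/certs/ouranos.helu.ca.pem | cut -d= -f2)"
#   in_renewal_window "$(date -d "$end" +%s)" "$(date +%s)" && echo "due for renewal"
```

Passing `7` as the third argument gives the same threshold as the 604800-second alert rule.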
### Manual Renewal Test
```bash
# Dry run renewal
sudo -u certbot /srv/certbot/.venv/bin/certbot renew \
  --config-dir /srv/certbot/config \
  --work-dir /srv/certbot/work \
  --logs-dir /srv/certbot/logs \
  --dry-run

# Force renewal (if needed)
sudo -u certbot /srv/certbot/.venv/bin/certbot renew \
  --config-dir /srv/certbot/config \
  --work-dir /srv/certbot/work \
  --logs-dir /srv/certbot/logs \
  --force-renewal
```
### Check Systemd Timer
```bash
# Timer status
systemctl status certbot-renew.timer
# Last run
journalctl -u certbot-renew.service --since "1 day ago"
# List timers
systemctl list-timers certbot-renew.timer
```
### DNS Propagation Issues
If certificate requests fail due to DNS propagation:
1. Check Namecheap API is accessible
2. Verify IP is whitelisted
3. Increase propagation wait time (default 120s)
4. Check certbot logs: `/srv/certbot/logs/letsencrypt.log`
## Related Playbooks
- `haproxy/deploy.yml` - Depends on certificate from certbot
- `prometheus/node_deploy.yml` - Deploys node_exporter for metrics collection