# Certbot DNS-01 with Namecheap

This playbook deploys certbot with the Namecheap DNS plugin for DNS-01 validation, enabling wildcard SSL certificates.

## Overview

| Component | Value |
|-----------|-------|
| Installation | Python virtualenv in `/srv/certbot/.venv` |
| DNS Plugin | `certbot-dns-namecheap` |
| Validation | DNS-01 (supports wildcards) |
| Renewal | Systemd timer (twice daily), runs as the `certbot` user |
| Certificate Output | Combined PEM at `haproxy_cert_path` (Titania: `/etc/haproxy/certs/ouranos.pem`) |
| HAProxy Reload | `systemctl reload haproxy` (native systemd, not Docker) |
| Metrics | Prometheus textfile collector |

## Deployments

### Titania (ouranos.helu.ca)

Production deployment providing Let's Encrypt certificates for the Ouranos sandbox HAProxy reverse proxy.

| Setting | Value |
|---------|-------|
| **Host** | titania.incus |
| **Domain** | ouranos.helu.ca |
| **Wildcard** | `*.ouranos.helu.ca` |
| **Email** | webmaster@helu.ca |
| **HAProxy** | Port 443 (HTTPS), Port 80 (HTTP redirect) |
| **Renewal** | Twice daily, automatic HAProxy reload |

### Other Deployments

The playbook can be deployed to any host with HAProxy. See the example configuration for hippocamp.helu.ca (d.helu.ca domain) below.

## Prerequisites

1. **Namecheap API Access** enabled on your account
2. **Namecheap API key** generated
3. **IP whitelisted** in Namecheap API settings
4. **Ansible Vault** configured with Namecheap credentials

## Setup

### 1. Add Secrets to Ansible Vault

Add Namecheap credentials to `ansible/inventory/group_vars/all/vault.yml`:

```bash
ansible-vault edit inventory/group_vars/all/vault.yml
```

Add the following variables:

```yaml
vault_namecheap_username: "your_namecheap_username"
vault_namecheap_api_key: "your_namecheap_api_key"
```

Map these in `inventory/group_vars/all/vars.yml`:

```yaml
namecheap_username: "{{ vault_namecheap_username }}"
namecheap_api_key: "{{ vault_namecheap_api_key }}"
```

### 2. Configure Host Variables

For Titania, the configuration is in `inventory/host_vars/titania.incus.yml`:

```yaml
services:
  - certbot
  - haproxy

# ...
certbot_email: webmaster@helu.ca
certbot_certificates:
  - cert_name: wildcard.ouranos.helu.ca
    domains: ["*.ouranos.helu.ca", "ouranos.helu.ca"]

# Where the renewal hook writes the combined fullchain+privkey PEM for HAProxy
haproxy_cert_path: /etc/haproxy/certs/ouranos.pem
```

> The certbot lineage name is **`wildcard.ouranos.helu.ca`**, so the certbot
> config lives under `/srv/certbot/config/live/wildcard.ouranos.helu.ca/`. The
> combined PEM that HAProxy actually serves is a separate file at
> `haproxy_cert_path` (`ouranos.pem`) written by the renewal hook. Do not
> confuse the two.
>
> The playbook also supports the single-cert form (`certbot_cert_name` +
> `certbot_domains`) for hosts with one certificate.

### 3. Deploy

```bash
cd ansible
ansible-playbook certbot/deploy.yml --limit titania.incus
```

## Files Created

| Path | Purpose |
|------|---------|
| `/srv/certbot/.venv/` | Python virtualenv with certbot |
| `/srv/certbot/config/` | Certbot configuration and certificates |
| `/srv/certbot/credentials/namecheap.ini` | Namecheap API credentials (600 perms) |
| `/srv/certbot/hooks/renewal-hook.sh` | Post-renewal script |
| `/srv/certbot/hooks/cert-metrics.sh` | Prometheus metrics script |
| `/etc/haproxy/certs/ouranos.pem` | Combined cert for HAProxy (Titania), written by the renewal hook |
| `/etc/sudoers.d/certbot-haproxy-reload` | Scoped sudo rule letting certbot run `systemctl reload haproxy` |
| `/etc/systemd/system/certbot-renew.service` | Renewal service unit (runs as the `certbot` user) |
| `/etc/systemd/system/certbot-renew.timer` | Twice-daily renewal timer |

## Renewal Process

1. Systemd timer triggers at 00:00 and 12:00 (with random delay up to 1 hour)
2. Certbot checks if certificate needs renewal (within 30 days of expiry)
3. If renewal needed:
   - Creates DNS TXT record via Namecheap API
   - Waits 120 seconds for propagation
   - Validates and downloads new certificate
   - Runs `renewal-hook.sh`
4. Renewal hook (`renewal-hook.sh`, run via certbot's `--deploy-hook`):
   - Combines fullchain + privkey into the HAProxy PEM at `haproxy_cert_path`
   - Reloads native HAProxy via `sudo -n systemctl reload haproxy`
   - Always refreshes Prometheus metrics (even on failure; see below)

> **HAProxy on Titania runs natively under systemd, not in Docker.** The hook
> reloads it with `systemctl reload haproxy`. (Only Casdoor runs in Docker on
> Titania.)

### Permission model (why renewals can silently fail)

The renewal timer runs the hook as the unprivileged **`certbot`** user, so three permissions must line up or the renewed cert never reaches HAProxy:

| Resource | Required state | Provided by |
|----------|----------------|-------------|
| `/etc/haproxy/certs` | `0770`, group `haproxy`; `certbot` is a member of `haproxy` | `haproxy/deploy.yml` (mode) + `certbot/deploy.yml` (group membership) |
| `systemctl reload haproxy` | allowed for `certbot` via sudo | `/etc/sudoers.d/certbot-haproxy-reload` |
| Prometheus textfile dir | group-writable by `certbot` | `certbot/deploy.yml` |

If any of these is wrong, the hook fails. **Certbot treats a deploy-hook failure as a non-fatal WARNING and still reports "renewals succeeded"**, so a broken hook will let the live cert renew while HAProxy keeps serving the *old* file until it expires. To make this visible, the hook now:

- checks each step and exits non-zero with an explicit `serving a STALE certificate` error (surfaced in the certbot/journal output), and
- refreshes the Prometheus cert metrics on *every* exit, so the `SSLCertificateExpiringSoon` / `SSLCertificateExpired` alerts keep reflecting reality even when installation fails.

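
The fail-loud pattern described above can be sketched roughly as follows. This is an illustration only, not the playbook's actual `renewal-hook.sh`: the function names (`die`, `install_cert`, `run_hook`) and the `HAPROXY_PEM`, `METRICS_CMD` and `RELOAD_CMD` variables are assumptions made for the example, with defaults taken from the Titania paths in this README. `RENEWED_LINEAGE` is the variable certbot exports to deploy hooks.

```bash
#!/usr/bin/env bash
# Sketch of a fail-loud deploy hook. Names and defaults are assumptions;
# every setting can be overridden through the environment.

LINEAGE="${RENEWED_LINEAGE:-/srv/certbot/config/live/wildcard.ouranos.helu.ca}"
HAPROXY_PEM="${HAPROXY_PEM:-/etc/haproxy/certs/ouranos.pem}"
METRICS_CMD="${METRICS_CMD:-/srv/certbot/hooks/cert-metrics.sh}"
RELOAD_CMD="${RELOAD_CMD:-sudo -n systemctl reload haproxy}"

# Report the failed step on stderr and abort with a non-zero status.
die() {
  echo "ERROR: $1; HAProxy is serving a STALE certificate" >&2
  exit 1
}

install_cert() {
  local tmp
  # Build the combined PEM next to the target, then rename it into place so
  # HAProxy never reads a half-written file.
  tmp="$(mktemp "${HAPROXY_PEM}.XXXXXX")" \
    || die "cannot write to $(dirname "$HAPROXY_PEM")"
  cat "$LINEAGE/fullchain.pem" "$LINEAGE/privkey.pem" > "$tmp" \
    || { rm -f "$tmp"; die "cannot read the renewed lineage"; }
  mv -f "$tmp" "$HAPROXY_PEM" \
    || { rm -f "$tmp"; die "cannot install $HAPROXY_PEM"; }
  $RELOAD_CMD || die "reload failed"
}

# The body runs in a subshell, so the EXIT trap fires on success and on every
# die(), and the metrics are refreshed either way.
run_hook() (
  trap '$METRICS_CMD || true' EXIT
  install_cert
)
```

A hook script built this way would end with a bare `run_hook` call. Because the metrics refresh sits in an `EXIT` trap rather than at the end of the happy path, a permission error in any earlier step cannot skip it.
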
## Prometheus Metrics

Metrics written to `/var/lib/prometheus/node-exporter/ssl_cert.prom`:

| Metric | Description |
|--------|-------------|
| `ssl_certificate_expiry_timestamp` | Unix timestamp when cert expires |
| `ssl_certificate_expiry_seconds` | Seconds until cert expires |
| `ssl_certificate_valid` | 1 if valid, 0 if expired/missing |

Example alert rule:

```yaml
- alert: SSLCertificateExpiringSoon
  expr: ssl_certificate_expiry_seconds < 604800  # 7 days
  for: 1h
  labels:
    severity: warning
  annotations:
    summary: "SSL certificate expiring soon"
    description: "Certificate for {{ $labels.domain }} expires in {{ $value | humanizeDuration }}"
```

## Troubleshooting

### View Certificate Status

```bash
# Check expiry of the cert HAProxy actually serves (Titania)
sudo openssl x509 -enddate -noout -in /etc/haproxy/certs/ouranos.pem

# Confirm HAProxy is serving it on the wire
echo | openssl s_client -connect titania.incus:8443 \
  -servername grafana.ouranos.helu.ca 2>/dev/null \
  | openssl x509 -noout -enddate -issuer

# Check the underlying certbot lineage (may be newer than the served file
# if the deploy hook failed to install it)
sudo openssl x509 -enddate -noout \
  -in /srv/certbot/config/live/wildcard.ouranos.helu.ca/fullchain.pem

# Check certbot certificates
sudo -u certbot /srv/certbot/.venv/bin/certbot certificates \
  --config-dir /srv/certbot/config
```

> If the served file is older than the certbot lineage, the deploy hook is
> failing to install renewals. Check the hook output with
> `sudo grep -i hook /srv/certbot/logs/letsencrypt.log*` and look for
> `Permission denied`, `reload failed`, or `serving a STALE certificate`.

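
The served-versus-lineage comparison above can also be scripted. The following is a minimal sketch, assuming GNU `date` for parsing the `notAfter=` value that `openssl x509 -enddate` prints; the function names are hypothetical and not part of the playbook.

```bash
# Sketch: flag a served certificate that lags behind the certbot lineage.

# Print a certificate's notAfter date as a Unix timestamp (GNU date assumed).
cert_expiry_epoch() {
  local end
  end="$(openssl x509 -enddate -noout -in "$1")" || return 1
  date -d "${end#notAfter=}" +%s
}

# Compare two expiry timestamps: served file first, lineage second.
classify_expiry() {
  if [ "$1" -lt "$2" ]; then
    echo "STALE"
  else
    echo "OK"
  fi
}

# Print STALE when the served PEM expires before the lineage's fullchain,
# OK otherwise. Returns 2 if either file cannot be read or parsed.
check_served_cert() {
  local served lineage
  served="$(cert_expiry_epoch "$1")" || return 2
  lineage="$(cert_expiry_epoch "$2")" || return 2
  classify_expiry "$served" "$lineage"
}
```

From a shell that can read both files (for example a root shell), it could be called as `check_served_cert /etc/haproxy/certs/ouranos.pem /srv/certbot/config/live/wildcard.ouranos.helu.ca/fullchain.pem`.
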
### Manual Renewal Test

```bash
# Dry run renewal
sudo -u certbot /srv/certbot/.venv/bin/certbot renew \
  --config-dir /srv/certbot/config \
  --work-dir /srv/certbot/work \
  --logs-dir /srv/certbot/logs \
  --dry-run

# Force renewal (if needed)
sudo -u certbot /srv/certbot/.venv/bin/certbot renew \
  --config-dir /srv/certbot/config \
  --work-dir /srv/certbot/work \
  --logs-dir /srv/certbot/logs \
  --force-renewal
```

### Check Systemd Timer

```bash
# Timer status
systemctl status certbot-renew.timer

# Last run
journalctl -u certbot-renew.service --since "1 day ago"

# List timers
systemctl list-timers certbot-renew.timer
```

### DNS Propagation Issues

If certificate requests fail due to DNS propagation:

1. Check Namecheap API is accessible
2. Verify IP is whitelisted
3. Increase propagation wait time (default 120s)
4. Check certbot logs: `/srv/certbot/logs/letsencrypt.log`

## Related Playbooks

- `haproxy/deploy.yml` - Depends on certificate from certbot
- `prometheus/node_deploy.yml` - Deploys node_exporter for metrics collection