fix(certbot): harden renewal hook and fix permission errors

The renewal deploy-hook ran as the certbot user but lacked permissions to
write the combined PEM to /etc/haproxy/certs and to reload HAProxy,
causing silent failures that left a stale certificate in production until
expiry.

- Add certbot user to the haproxy group so it can write the combined PEM
- Grant certbot NOPASSWD sudo for `systemctl reload haproxy` only
- Make the Prometheus textfile directory group-owned by certbot (0775)
  so cert-metrics.sh can atomically update ssl_cert.prom
- Refactor renewal-hook.sh to always refresh cert metrics on exit via a
  trap, so expiry alerts still fire even when the hook itself fails
- Replace `set -e` with explicit error handling and structured logging
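
The trap-based refresh and explicit error handling described above could look roughly like the sketch below. It is not the actual renewal-hook.sh: the function names, the `METRICS_DIR` default path, and the `renewal_hook_last_exit_status` metric are hypothetical stand-ins, and the real cert-metrics.sh presumably exports expiry data rather than an exit status.

```shell
#!/usr/bin/env bash
# Sketch only: all names and paths here are assumptions for illustration.

# Hypothetical textfile-collector directory; override via the environment.
METRICS_DIR="${METRICS_DIR:-/var/lib/node_exporter/textfile}"

log() {
    # Structured key=value log line on stderr.
    local level=$1; shift
    printf 'level=%s msg="%s"\n' "$level" "$*" >&2
}

refresh_metrics() {
    # Stand-in for cert-metrics.sh. Write to a temp file in the same
    # directory, then rename it so readers never see a partial file.
    local status=$1 tmp
    tmp=$(mktemp "$METRICS_DIR/ssl_cert.prom.XXXXXX") || return 1
    printf 'renewal_hook_last_exit_status %d\n' "$status" > "$tmp"
    chmod 0644 "$tmp"
    mv -f "$tmp" "$METRICS_DIR/ssl_cert.prom"
}

run_hook() (
    # The body runs in a subshell, so the EXIT trap fires on every exit
    # path, including the explicit `exit 1` below.
    trap 'refresh_metrics "$?"' EXIT
    "$@" || { log error "step failed: $*"; exit 1; }
    log info "hook completed"
)
```

With these assumptions, `run_hook false` logs the failure, returns 1, and still rewrites ssl_cert.prom, which is the property that `set -e` alone does not give.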
2026-06-17 09:58:46 -04:00
parent 2f5a15eef5
commit 343b0e13d6
10 changed files with 665 additions and 46 deletions

@@ -29,7 +29,11 @@ ROMMIE_GROUNDING_HEIGHT={{ rommie_grounding_height | default(1024) }}
# ============================================================================
ROMMIE_HOST={{ rommie_host | default('0.0.0.0') }}
ROMMIE_PORT={{ rommie_port }}
ROMMIE_ALLOWED_HOSTS={{ rommie_allowed_hosts }}
# Idle MCP sessions are reaped after this many seconds (<=0 disables).
# Prevents unbounded StreamableHTTP transport accumulation from clients
# that drop their connection without sending an explicit DELETE.
ROMMIE_SESSION_IDLE_TIMEOUT={{ rommie_session_idle_timeout | default(1800) }}
# ============================================================================
# get_screenshot (parent-agent) output