Docker Compose doesn't pull newer images for existing tags
-----------------------------------------------------------
# Issue
Running `docker compose up` on a service tagged `:latest` does not check the registry for a newer image. The container keeps running the old image even though a newer one has been pushed upstream.
## Symptoms
- `docker compose up` starts the container immediately using the locally cached image
- `docker compose pull` or `docker pull <image>:latest` successfully downloads a newer image
- After pulling manually, `docker compose up` recreates the container with the new image
- The `community.docker.docker_compose_v2` Ansible module with `state: present` behaves identically — no pull check
# Explanation
Docker's default behaviour is: **if an image with the requested tag exists locally, use it without checking the registry.** The `:latest` tag is not special — it's just a regular mutable tag. Docker does not treat it as "always fetch the newest." It is simply the default tag applied when no tag is specified.
When you run `docker compose up`:
1. Docker checks if `image:latest` exists in the local image store
2. If yes → use it, no registry check
3. If no → pull from registry
This means a stale `:latest` can sit on your host indefinitely while the upstream registry has a completely different image behind the same tag. The only way Docker knows to pull is if:
- The image doesn't exist locally at all
- You explicitly tell it to pull
The same applies to the Ansible `community.docker.docker_compose_v2` module — `state: present` maps to `docker compose up` behaviour, so no pull check occurs unless you tell it to.
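The decision above can be condensed into a tiny shell function — an illustrative sketch of the documented behaviour, not Docker's actual code:

```shell
# Hypothetical sketch of Docker's default image resolution.
# local_images: space-separated list of tags present in the local store.
resolve_image() {
  wanted="$1"
  local_images="$2"
  for img in $local_images; do
    if [ "$img" = "$wanted" ]; then
      echo "use-local"   # tag exists locally: no registry check at all
      return
    fi
  done
  echo "pull"            # tag missing locally: fetch from registry
}
```

For example, `resolve_image "app:latest" "app:latest db:16"` prints `use-local` even if the registry holds a newer `app:latest` behind the same tag.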
# Solution
Two complementary fixes ensure images are always checked against the registry.
## 1. Docker Compose — `pull_policy: always`
Add `pull_policy: always` to the service definition in `docker-compose.yml`:
```yaml
services:
my-service:
image: registry.example.com/my-image:latest
pull_policy: always # Check registry on every `up`
container_name: my-service
...
```
With this set, `docker compose up` will always contact the registry and compare the local image digest with the remote one. If they match, no download occurs — it's a lightweight check. If they differ, the new image layers are pulled.
Valid values for `pull_policy`:
| Value | Behaviour |
|-------|-----------|
| `always` | Always check the registry before starting |
| `missing` | Only pull if the image doesn't exist locally (default) |
| `never` | Never pull, fail if image doesn't exist locally |
| `build` | Always build the image (for services with `build:`) |
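The table maps to a small decision function — again a sketch of the documented behaviour, not Compose's real implementation:

```shell
# pull_decision <policy> <image-exists-locally: yes|no>
# Echoes what `docker compose up` does for the image under each pull_policy.
pull_decision() {
  case "$1" in
    always)  echo "check-registry" ;;   # digest compare on every up
    missing) [ "$2" = "yes" ] && echo "use-local" || echo "pull" ;;
    never)   [ "$2" = "yes" ] && echo "use-local" || echo "fail" ;;
    build)   echo "build" ;;            # requires a build: section
  esac
}
```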
## 2. Ansible — `pull: always` on `docker_compose_v2`
Add `pull: always` to the `community.docker.docker_compose_v2` task:
```yaml
- name: Start service
community.docker.docker_compose_v2:
project_src: "{{ service_directory }}"
state: present
pull: always # Check registry during deploy
```
Valid values for `pull`:
| Value | Behaviour |
|-------|-----------|
| `always` | Always pull before starting (like `docker compose pull && up`) |
| `missing` | Only pull if image doesn't exist locally |
| `never` | Never pull |
| `policy` | Defer to `pull_policy` defined in the compose file |
## Why use both?
- **`pull_policy` in compose file** — Protects against manual `docker compose up` on the host
- **`pull: always` in Ansible** — Ensures automated deployments always get the freshest image
They are independent mechanisms. The Ansible `pull` parameter runs a pull step before compose up, regardless of what the compose file says. Belt and suspenders.
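Roughly, `pull: always` makes the deploy behave like the two-step sequence below — sketched here with a stubbed `docker` function so the trace is visible without a daemon:

```shell
# Stub: print each docker invocation instead of running it (illustration only)
docker() { echo "would run: docker $*"; }

# What the Ansible task with `pull: always` effectively does:
docker compose pull     # explicit registry check/pull first
docker compose up -d    # then the normal up
```

Running this prints `would run: docker compose pull` followed by `would run: docker compose up -d`.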
# Agathos Fix
Applied to `ansible/gitea_mcp/` as the first instance. The same pattern should be applied to any service using mutable tags (`:latest`, `:stable`, etc.).
**docker-compose.yml.j2:**
```yaml
services:
gitea-mcp:
image: docker.gitea.com/gitea-mcp-server:latest
pull_policy: always
...
```
**deploy.yml:**
```yaml
- name: Start Gitea MCP service
community.docker.docker_compose_v2:
project_src: "{{ gitea_mcp_directory }}"
state: present
pull: always
```
# When you DON'T need this
- **Pinned image tags** (e.g., `postgres:16.2`, `grafana/grafana:11.1.0`) — The tag is immutable, so there's nothing newer to pull. Using `pull: always` here just adds a redundant registry check on every deploy.
- **Locally built images** — If the image is built by `docker compose build`, use `pull_policy: build` instead.
- **Air-gapped / offline hosts** — `pull: always` will fail if the registry is unreachable. Use `missing` or `never`.
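For instance, a compose file can mix both cases — pinned tags simply omit `pull_policy` and keep the default, while mutable tags opt in (service names here are hypothetical):

```yaml
services:
  db:
    image: postgres:16.2                    # pinned: default `missing` policy is fine
  app:
    image: registry.example.com/app:latest
    pull_policy: always                     # mutable tag: check registry on every up
```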
# Verification
```bash
# Check what image a running container is using
docker inspect --format='{{.Image}}' gitea-mcp
# Compare local digest with remote
docker images --digests docker.gitea.com/gitea-mcp-server
# Force pull and check if image ID changes
docker compose pull
docker compose up -d
```

Docker won't start inside Incus container
------------------------------------------
# Issue
Running Docker inside Incus has worked for years, but a recent Ubuntu package update caused it to fail.
## Symptoms
Docker containers won't start with the following error:
```
docker compose up
Attaching to neo4j
Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: open sysctl net.ipv4.ip_unprivileged_port_start file: reopen fd 8: permission denied
```
The culprit is AppArmor. The host runs AppArmor, and Incus applies an AppArmor profile to containers with `security.nesting=true` that blocks Docker from writing to `/proc/sys/net/ipv4/ip_unprivileged_port_start`.
# Solution (Automated)
The fix requires **both** host-side and container-side changes. These are now automated in our infrastructure:
## 1. Terraform - Host-side fix
In `terraform/containers.tf`, all containers with `security.nesting=true` now include:
```terraform
config = {
"security.nesting" = true
"raw.lxc" = "lxc.apparmor.profile=unconfined"
}
```
This tells Incus not to load any AppArmor profile for the container.
## 2. Ansible - Container-side fix
In `ansible/docker/deploy.yml`, Docker deployment now creates a systemd override:
```yaml
- name: Create AppArmor workaround for Incus nested Docker
ansible.builtin.copy:
content: |
[Service]
Environment=container="setmeandforgetme"
dest: /etc/systemd/system/docker.service.d/apparmor-workaround.conf
```
This tells Docker to skip loading its own AppArmor profile.
# Manual Workaround
If you need to fix this manually (e.g., before running Terraform/Ansible):
## Step 1: Force unconfined mode from the Incus host
```bash
# On the HOST (pan.helu.ca), not in the container
incus config set <container-name> raw.lxc "lxc.apparmor.profile=unconfined" --project agathos
incus restart <container-name> --project agathos
```
## Step 2: Disable AppArmor for Docker inside the container
```bash
# Inside the container
sudo mkdir -p /etc/systemd/system/docker.service.d
sudo tee /etc/systemd/system/docker.service.d/apparmor-workaround.conf <<EOF
[Service]
Environment=container="setmeandforgetme"
EOF
sudo systemctl daemon-reload
sudo systemctl restart docker
```
Reference: [ktz.blog](https://blog.ktz.me/proxmox-9-broke-my-docker-containers/)
# Verification
Tested on Miranda (2025-12-28):
```bash
# Before fix - fails with permission denied
$ ssh miranda.incus "docker run hello-world"
docker: Error response from daemon: failed to create task for container: ... permission denied
# After applying both fixes
$ ssh miranda.incus "docker run hello-world"
Hello from Docker!
# Port binding also works
$ ssh miranda.incus "docker run -d -p 8080:80 nginx"
# Container starts successfully
```
# Security Considerations
Setting `lxc.apparmor.profile=unconfined` only disables the AppArmor profile that Incus applies **to** the container. The host's AppArmor daemon continues running and protecting the host itself.
Security layers with this fix:
- Host AppArmor ✅ (still active)
- Incus container isolation ✅ (namespaces, cgroups)
- Container AppArmor ❌ (disabled with unconfined)
- Docker container isolation ✅ (namespaces, cgroups)
For sandbox/dev environments, this tradeoff is acceptable since:
- The Incus container is already isolated from the host
- We're not running untrusted workloads
- Production uses VMs + Docker without Incus nesting
# Explanation
What happened is that a recent update on the host (probably the incus and/or apparmor packages that landed in Ubuntu 24.04) started feeding the container a new AppArmor profile that contains this rule (or one very much like it):
```
deny @{PROC}/sys/net/ipv4/ip_unprivileged_port_start rw,
```
That rule is not present in the profile that ships with plain Docker, but it is present in the profile that Incus now attaches to every container that has `security.nesting=true` (the flag you need to run Docker inside Incus).
Because the rule is a `deny`, it overrides any later `allow`, so Docker's own profile (which allows the write) is ignored and the kernel returns `permission denied` the first time Docker/runc tries to write the value that tells the kernel which ports an unprivileged user may bind to.
So the container itself starts fine, but as soon as Docker tries to start any of its own containers, the AppArmor policy that Incus attached to the nested container blocks the write and the whole Docker container creation aborts.
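In AppArmor policy language, `deny` rules take precedence over allow rules regardless of ordering, so a profile fragment like the following (illustrative, not the exact Incus profile) blocks the write even when a broader allow rule is present:

```
# Illustrative AppArmor fragment: the deny wins no matter where it appears
deny @{PROC}/sys/net/ipv4/ip_unprivileged_port_start rw,
@{PROC}/sys/net/ipv4/** rw,   # this allow cannot override the deny above
```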
The two workarounds remove the enforcing profile:
1. **`raw.lxc = lxc.apparmor.profile=unconfined`** — Tells Incus "don't load any AppArmor profile for this container at all", so the offending rule is never applied.
2. **`Environment=container="setmeandforgetme"`** — sets the environment variable Docker's systemd unit looks for. When Docker sees a `container` variable it skips loading its default AppArmor profile. The value literally does not matter; the variable only has to exist.
Either way you end up with no AppArmor policy on the nested Docker container, so the write to `ip_unprivileged_port_start` succeeds and your containers start again.
**In short:** Recent Incus added a deny rule that clashes with Docker's need to tweak that sysctl; disabling the profile (host-side or container-side) is the quickest fix until the profiles are updated to allow the operation.