# Ansible Deployment for Rommie

Rommie is an MCP server that wraps [Agent S](https://github.com/simular-ai/Agent-S), enabling agent-to-agent collaboration for GUI automation. It exposes three MCP tools (`execute_gui_task`, `get_screenshot`, and `get_agent_status`) over Streamable HTTP, allowing remote AI agents to delegate GUI tasks to the MATE desktop running on `caliban.incus`.

Named after the Andromeda Ascendant's AI avatar.

## Host

| Host | Group | Type |
|------|-------|------|
| `caliban.incus` | `rommie` | Incus container |

## Prerequisites

### Control node

- Staged release tarball in `~/rel/` (produced by `agent_s/stage.yml`):
  - `~/rel/rommie_.tar`

### Target host

- Agent S fully deployed (`agent_s/deploy.yml`); Rommie's `deploy.yml` imports it as a dependency
- MATE desktop and XRDP running (Agent S deployment provides this)
- Python 3.13 (Ubuntu 25.04)
- X11 display available at the configured `DISPLAY` value

> **Note**: `gui-agents` 0.3.x declares `Requires-Python <=3.12` in its PyPI metadata despite working on Python 3.13. The deploy playbook pre-installs it with `--ignore-requires-python` before installing Rommie.

## Staging

Rommie is staged from a local git checkout using `agent_s/stage.yml` (which creates the rommie tarball as part of the Agent S staging run). The release branch is controlled by `rommie_rel` in `group_vars/all/vars.yml` (default: `main`).

## Deployment

```bash
ansible-playbook ansible/rommie/deploy.yml
```

The playbook imports `agent_s/deploy.yml` first to ensure the MATE desktop and Agent S dependencies are in place, then:

1. Creates `~/rommie/` and extracts the staged tarball
2. Creates a Python venv at `~/env/rommie` with `--system-site-packages`
3. Pre-installs `gui-agents>=0.3.1` with `--ignore-requires-python`
4. Installs Rommie into the venv in editable mode (`pip install -e`)
5. Deploys `~/rommie/.env` from the template
6. Deploys and enables the `rommie.service` systemd unit
7. Health-checks `http://localhost:/mcp` (retries 5×, 3 s apart)

## MCP Tools

| Tool | Concurrency | Description |
|------|-------------|-------------|
| `execute_gui_task` | Serialized (one at a time) | Execute a GUI automation task via Agent S |
| `get_screenshot` | Always available | Capture the current screen state |
| `get_agent_status` | Always available | Query task progress and agent state |

Read-only tools (`get_screenshot`, `get_agent_status`) remain available while a GUI task is running. A second `execute_gui_task` call while one is in-flight returns a "busy" error.

## Architecture

```
External Agent (e.g., Claude / MCP Switchboard)
        │  MCP Protocol (Streamable HTTP)
        │  http://caliban.incus:22031/mcp
        ▼
Rommie MCP Server (serialized task execution, multi-client reads)
        │
        ▼
Agent S (gui-agents package)
        │
        ▼
MATE Desktop ← X11 display :10 ← XRDP session
```

## Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `rommie_port` | `22031` | HTTP listen port |
| `rommie_host` | `0.0.0.0` | Bind address |
| `rommie_display` | `:10` | X11 display for Agent S (XRDP assigns `:10` by default) |
| `rommie_allowed_hosts` | `caliban.incus` | Allowed Host header values |
| `rommie_model` | `Qwen3-VL-30B-A3B-Instruct-UD-Q5_K_XL.gguf` | Primary vision-language model |
| `rommie_model_url` | `http://nyx.helu.ca:22078` | Inference endpoint for the primary model |
| `rommie_provider` | `openai` | API provider for the primary model |
| `rommie_ground_provider` | `huggingface` | API provider for the grounding model |
| `rommie_ground_url` | `http://pan.helu.ca:22078` | Inference endpoint for the grounding model |
| `rommie_ground_model` | `UI-TARS-7B-DPO-Q6_K_L.gguf` | Grounding model (UI element localisation) |
| `rommie_grounding_width` | `1024` | Screenshot width passed to the grounding model |
| `rommie_grounding_height` | `1024` | Screenshot height passed to the grounding model |
| `rommie_rel` | `main` | Git branch/tag to stage from `~/git/rommie` |

All host-specific variables are set in `ansible/inventory/host_vars/caliban.incus.yml`. The `rommie_rel` default is in `ansible/inventory/group_vars/all/vars.yml`.

## Integration

The MCP URL for Rommie is registered in `group_vars/all/vars.yml`:

```yaml
rommie_mcp_url: http://caliban.incus:22031/mcp
```

Consumers (e.g., MCP Switchboard, Open WebUI) reference `{{ rommie_mcp_url }}`.

## Service Management

```bash
# Check status
systemctl status rommie

# Restart
systemctl restart rommie

# View logs
journalctl -u rommie -f
```

The unit runs as `principal_user` (`robert`) and loads environment from `~/rommie/.env`. It restarts automatically on failure with a 10 s back-off.

## Troubleshooting

### `gui-agents` version conflict

`gui-agents` 0.3.x requires Python <=3.12 in its PyPI metadata but works on 3.13. The deploy playbook installs it with `--ignore-requires-python`. If the install step fails with a version conflict, confirm the pre-install task ran and check the venv Python version:

```bash
/home/robert/env/rommie/bin/python --version
/home/robert/env/rommie/bin/pip show gui-agents
```

### Health check fails

The playbook probes `http://localhost:22031/mcp` after starting the service. If it times out:

1. Check the service started: `systemctl status rommie`
2. Confirm the `DISPLAY` variable resolves: XRDP must have created the `:10` display before Rommie starts
3. Check logs: `journalctl -u rommie --since "5 min ago"`

### No X display

Rommie inherits `DISPLAY` from `.env`. If Agent S cannot connect to the display:

```bash
# Verify XRDP created the display
ls /tmp/.X11-unix/
```

An active RDP session must exist or XRDP's `Xorg` daemon must be running for display `:10` to be present.
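
As a follow-up to the `ls` check, the sketch below maps a `DISPLAY` value to the socket path it implies and tests whether that socket exists. The function names `display_socket_path` and `check_display` are hypothetical helpers for illustration, not part of Rommie or the playbook. The sketch assumes the standard X11 convention that a local display `:N` is backed by the Unix socket `/tmp/.X11-unix/XN`.

```bash
# Hypothetical helper: map a display string (":10" or ":10.0") to its socket path.
# Falls back to $DISPLAY, then to ":10" (the documented rommie_display default).
display_socket_path() {
  local display="${1:-${DISPLAY:-:10}}"
  local num="${display##*:}"   # keep what follows the last colon
  num="${num%%.*}"             # drop an optional ".screen" suffix
  printf '/tmp/.X11-unix/X%s\n' "$num"
}

# Hypothetical helper: report whether the socket for the given display exists.
check_display() {
  local sock
  sock="$(display_socket_path "${1:-}")"
  if [ -S "$sock" ]; then
    echo "display socket present: $sock"
  else
    echo "display socket missing: $sock" >&2
    return 1
  fi
}
```

For example, `check_display :10` returns non-zero and names the missing socket path if XRDP has not yet created the display, which can be easier to script than reading the `ls` output by eye.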