Ansible Deployment for Rommie
Rommie is an MCP server that wraps Agent S, enabling agent-to-agent collaboration for GUI automation. It exposes three MCP tools — execute_gui_task, get_screenshot, and get_agent_status — over Streamable HTTP, allowing remote AI agents to delegate GUI tasks to the MATE desktop running on caliban.incus.
Named after the Andromeda Ascendant's AI avatar.
Host
| Host | Group | Type |
|---|---|---|
| caliban.incus | rommie | Incus container |
Prerequisites
Control node
- Staged release tarball in ~/rel/ (produced by agent_s/stage.yml): ~/rel/rommie_<rommie_rel>.tar
Target host
- Agent S fully deployed (agent_s/deploy.yml); Rommie's deploy.yml imports it as a dependency
- MATE desktop and XRDP running (Agent S deployment provides this)
- Python 3.13 (Ubuntu 25.04)
- X11 display available at the configured DISPLAY value
Note: gui-agents 0.3.x declares Requires-Python <=3.12 in its PyPI metadata despite working on Python 3.13. The deploy playbook pre-installs it with --ignore-requires-python before installing Rommie.
Staging
Rommie is staged from a local git checkout using agent_s/stage.yml (which creates the rommie tarball as part of the Agent S staging run). The release branch is controlled by rommie_rel in group_vars/all/vars.yml (default: main).
Deployment
ansible-playbook ansible/rommie/deploy.yml
The playbook imports agent_s/deploy.yml first to ensure the MATE desktop and Agent S dependencies are in place, then:
- Creates ~/rommie/ and extracts the staged tarball
- Creates a Python venv at ~/env/rommie with --system-site-packages
- Pre-installs gui-agents>=0.3.1 with --ignore-requires-python
- Installs Rommie into the venv in editable mode (pip install -e)
- Deploys ~/rommie/.env from the template
- Deploys and enables the rommie.service systemd unit
- Health-checks http://localhost:<rommie_port>/mcp (retries 5×, 3 s apart)
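The final health-check step can be approximated outside Ansible. The following is a minimal sketch, not the playbook's actual task: the function names are hypothetical, and it assumes that any HTTP response from the endpoint (even an error status) counts as the service being up, which may differ from the status the playbook expects.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def probe_http(url: str, timeout: float = 5.0) -> bool:
    """Return True if something answers HTTP at `url` (assumption: any status counts)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # An HTTP error status still proves the service is listening.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, or timeout.
        return False


def wait_until_healthy(
    probe: Callable[[], bool],
    retries: int = 5,
    delay: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `probe` up to `retries` times, sleeping `delay` seconds between attempts."""
    for attempt in range(1, retries + 1):
        if probe():
            return True
        if attempt < retries:
            sleep(delay)
    return False
```

With the documented default port, `wait_until_healthy(lambda: probe_http("http://localhost:22031/mcp"))` would make at most five attempts spaced 3 s apart. The `sleep` parameter exists only so the loop can be exercised without real delays.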
MCP Tools
| Tool | Concurrency | Description |
|---|---|---|
| execute_gui_task | Serialized (one at a time) | Execute a GUI automation task via Agent S |
| get_screenshot | Always available | Capture the current screen state |
| get_agent_status | Always available | Query task progress and agent state |
Read-only tools (get_screenshot, get_agent_status) remain available while a GUI task is running. A second execute_gui_task call while one is in-flight returns a "busy" error.
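The concurrency rule can be illustrated with a non-blocking lock. This is a sketch of the general pattern only, not Rommie's implementation: the class name, method names, and return shapes are all hypothetical.

```python
import threading
from typing import Callable, Optional


class SerializedTaskRunner:
    """Allow one task at a time; reject overlapping calls instead of queueing them."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._current: Optional[str] = None  # description of the in-flight task

    def execute(self, description: str, run: Callable[[], str]) -> dict:
        # Non-blocking acquire: a second caller fails fast with "busy".
        if not self._lock.acquire(blocking=False):
            return {"ok": False, "error": "busy"}
        try:
            self._current = description
            return {"ok": True, "result": run()}
        finally:
            self._current = None
            self._lock.release()

    def status(self) -> dict:
        # Read-only: never takes the lock, so it answers while a task runs.
        current = self._current
        return {"busy": current is not None, "current_task": current}
```

Failing fast rather than queueing keeps the caller in control: an external agent that receives "busy" can poll the status tool and retry when the desktop is free.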
Architecture
External Agent (e.g., Claude / MCP Switchboard)
│ MCP Protocol (Streamable HTTP)
│ http://caliban.incus:22031/mcp
▼
Rommie MCP Server
(serialized task execution, multi-client reads)
│
▼
Agent S (gui-agents package)
│
▼
MATE Desktop ← X11 display :10 ← XRDP session
Variables
| Variable | Default | Description |
|---|---|---|
| rommie_port | 22031 | HTTP listen port |
| rommie_host | 0.0.0.0 | Bind address |
| rommie_display | :10 | X11 display for Agent S (XRDP assigns :10 by default) |
| rommie_allowed_hosts | caliban.incus | Allowed Host header values |
| rommie_model | Qwen3-VL-30B-A3B-Instruct-UD-Q5_K_XL.gguf | Primary vision-language model |
| rommie_model_url | http://nyx.helu.ca:22078 | Inference endpoint for the primary model |
| rommie_provider | openai | API provider for the primary model |
| rommie_ground_provider | huggingface | API provider for the grounding model |
| rommie_ground_url | http://pan.helu.ca:22078 | Inference endpoint for the grounding model |
| rommie_ground_model | UI-TARS-7B-DPO-Q6_K_L.gguf | Grounding model (UI element localisation) |
| rommie_grounding_width | 1024 | Screenshot width passed to the grounding model |
| rommie_grounding_height | 1024 | Screenshot height passed to the grounding model |
| rommie_rel | main | Git branch/tag to stage from ~/git/rommie |
All host-specific variables are set in ansible/inventory/host_vars/caliban.incus.yml. The rommie_rel default is in ansible/inventory/group_vars/all/vars.yml.
Integration
The MCP URL for Rommie is registered in group_vars/all/vars.yml:
rommie_mcp_url: http://caliban.incus:22031/mcp
Consumers (e.g., MCP Switchboard, Open WebUI) reference {{ rommie_mcp_url }}.
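For orientation, a consumer's tool invocation is a JSON-RPC 2.0 tools/call message POSTed to that URL. The sketch below only builds the request and does not send it. The helper name is hypothetical, and the "task" argument is an assumed parameter name; the real input schema should be read from the server's tool listing. A real client must also complete the MCP initialize handshake first, which an MCP client library would normally handle.

```python
import json
import urllib.request

ROMMIE_MCP_URL = "http://caliban.incus:22031/mcp"  # value of rommie_mcp_url


def build_tool_call(tool: str, arguments: dict, request_id: int = 1) -> urllib.request.Request:
    """Build (but do not send) a JSON-RPC 2.0 tools/call request."""
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return urllib.request.Request(
        ROMMIE_MCP_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Streamable HTTP clients accept both plain JSON and SSE responses.
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )


# "task" is a hypothetical argument name, used here for illustration only.
request = build_tool_call("execute_gui_task", {"task": "Open the file manager"})
```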
Service Management
# Check status
systemctl status rommie
# Restart
systemctl restart rommie
# View logs
journalctl -u rommie -f
The unit runs as principal_user (robert) and loads environment from ~/rommie/.env. It restarts automatically on failure with a 10 s back-off.
Troubleshooting
gui-agents version conflict
gui-agents 0.3.x requires Python <=3.12 in its PyPI metadata but works on 3.13. The deploy playbook installs it with --ignore-requires-python. If the install step fails with a version conflict, confirm the pre-install task ran and check the venv Python version:
/home/robert/env/rommie/bin/python --version
/home/robert/env/rommie/bin/pip show gui-agents
Health check fails
The playbook probes http://localhost:22031/mcp after starting the service. If it times out:
- Check the service started: systemctl status rommie
- Confirm the DISPLAY variable resolves: XRDP must have created the :10 display before Rommie starts
- Check logs: journalctl -u rommie --since "5 min ago"
No X display
Rommie inherits DISPLAY from .env. If Agent S cannot connect to the display:
# Verify XRDP created the display
ls /tmp/.X11-unix/
An active RDP session must exist or XRDP's Xorg daemon must be running for display :10 to be present.
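The same check can be scripted. This is a small sketch with hypothetical helper names; it assumes a local display whose Unix socket lives under /tmp/.X11-unix/, as in the listing above.

```python
import os
import re


def x11_socket_path(display: str) -> str:
    """Map a local DISPLAY value such as ':10' or ':10.0' to its Unix socket path."""
    match = re.fullmatch(r":(\d+)(?:\.\d+)?", display)
    if match is None:
        raise ValueError(f"not a local X11 display: {display!r}")
    return f"/tmp/.X11-unix/X{match.group(1)}"


def display_is_present(display: str) -> bool:
    """Return True if the socket for `display` exists on this host."""
    return os.path.exists(x11_socket_path(display))
```

For the default rommie_display of :10, the socket to look for is /tmp/.X11-unix/X10.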