Ansible Deployment for Rommie

Rommie is an MCP server that wraps Agent S, enabling agent-to-agent collaboration for GUI automation. It exposes three MCP tools — execute_gui_task, get_screenshot, and get_agent_status — over Streamable HTTP, allowing remote AI agents to delegate GUI tasks to the MATE desktop running on caliban.incus.

Named after the Andromeda Ascendant's AI avatar.

Host

| Host | Group | Type |
| --- | --- | --- |
| caliban.incus | rommie | Incus container |

Prerequisites

Control node

  • Staged release tarball in ~/rel/ (produced by agent_s/stage.yml):
    • ~/rel/rommie_<rommie_rel>.tar

Target host

  • Agent S fully deployed (agent_s/deploy.yml). Rommie's deploy.yml imports that playbook, so running the Rommie deployment satisfies this prerequisite.
  • MATE desktop and XRDP running (Agent S deployment provides this)
  • Python 3.13 (Ubuntu 25.04)
  • X11 display available at the configured DISPLAY value

Note: gui-agents 0.3.x declares Requires-Python <=3.12 in its PyPI metadata despite working on Python 3.13. The deploy playbook pre-installs it with --ignore-requires-python before installing Rommie.

Staging

Rommie is staged from a local git checkout (~/git/rommie) by agent_s/stage.yml, which builds the Rommie tarball as part of the Agent S staging run. The branch or tag to stage is set by rommie_rel in ansible/inventory/group_vars/all/vars.yml (default: main).

Deployment

ansible-playbook ansible/rommie/deploy.yml

The playbook imports agent_s/deploy.yml first to ensure the MATE desktop and Agent S dependencies are in place, then:

  1. Creates ~/rommie/ and extracts the staged tarball
  2. Creates a Python venv at ~/env/rommie with --system-site-packages
  3. Pre-installs gui-agents>=0.3.1 with --ignore-requires-python
  4. Installs Rommie into the venv in editable mode (pip install -e)
  5. Deploys ~/rommie/.env from the template
  6. Deploys and enables the rommie.service systemd unit
  7. Health-checks http://localhost:<rommie_port>/mcp (retries 5×, 3 s apart)
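The retry behaviour of the final health-check step can be sketched in Python as follows. This is a minimal illustration, not the playbook's actual task: the function names are hypothetical, and treating any HTTP response (including an error status) as "alive" is an assumption, since the status codes the playbook accepts are not stated here.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def probe_http(url: str, timeout: float = 5.0) -> bool:
    """Return True if the endpoint answers at all (assumption: any HTTP reply counts)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded, just not with a 2xx status to a bare GET.
        return True
    except OSError:
        # Connection refused, DNS failure, timeout, and similar.
        return False


def wait_until_healthy(
    probe: Callable[[], bool],
    retries: int = 5,
    delay: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `probe` up to `retries` times, pausing `delay` seconds between attempts."""
    for attempt in range(1, retries + 1):
        if probe():
            return True
        if attempt < retries:
            sleep(delay)
    return False
```

To probe a running instance by hand, something like `wait_until_healthy(lambda: probe_http("http://localhost:22031/mcp"))` mirrors the documented 5 attempts at 3 s intervals.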

MCP Tools

| Tool | Concurrency | Description |
| --- | --- | --- |
| execute_gui_task | Serialized (one at a time) | Execute a GUI automation task via Agent S |
| get_screenshot | Always available | Capture the current screen state |
| get_agent_status | Always available | Query task progress and agent state |

Read-only tools (get_screenshot, get_agent_status) remain available while a GUI task is running. A second execute_gui_task call while one is in-flight returns a "busy" error.
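One way to get this behaviour is a non-blocking lock around task execution. The sketch below is illustrative only: the `TaskGate` class and the response dictionaries are hypothetical and are not taken from Rommie's source.

```python
import threading


class TaskGate:
    """Allow one GUI task at a time while status reads stay unblocked."""

    def __init__(self):
        self._lock = threading.Lock()
        self._current = None  # description of the in-flight task, if any

    def execute(self, description, run):
        # Non-blocking acquire: a second caller fails fast instead of queueing.
        if not self._lock.acquire(blocking=False):
            return {"status": "busy", "error": "a GUI task is already running"}
        try:
            self._current = description
            return {"status": "done", "result": run(description)}
        finally:
            self._current = None
            self._lock.release()

    def status(self):
        # Read-only: never touches the lock, so it works during a task.
        current = self._current
        return {"busy": current is not None, "current_task": current}
```

Failing fast rather than queueing keeps the caller in control: a remote agent that receives "busy" can poll get_agent_status and retry when the desktop is free.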

Architecture

External Agent (e.g., Claude Desktop / MCP Switchboard)
        │  MCP Protocol (Streamable HTTP, TLS)
        │  https://rommie.ouranos.helu.ca/mcp
        ▼
  Titania HAProxy (TLS termination, wildcard cert)
        │  http://caliban.incus:22031/mcp
        ▼
  Rommie MCP Server
  (serialized task execution, multi-client reads)
        │
        ▼
  Agent S (gui-agents package)
        │
        ▼
  MATE Desktop  ←  X11 display :10  ←  XRDP session

Variables

| Variable | Default | Description |
| --- | --- | --- |
| rommie_port | 22031 | HTTP listen port |
| rommie_host | 0.0.0.0 | Bind address |
| rommie_display | :10 | X11 display for Agent S (XRDP assigns :10 by default) |
| rommie_allowed_hosts | caliban.incus | Allowed Host header values |
| rommie_model | Qwen3-VL-30B-A3B-Instruct-UD-Q5_K_XL.gguf | Primary vision-language model |
| rommie_model_url | http://nyx.helu.ca:22078 | Inference endpoint for the primary model |
| rommie_provider | openai | API provider for the primary model |
| rommie_ground_provider | huggingface | API provider for the grounding model |
| rommie_ground_url | http://pan.helu.ca:22078 | Inference endpoint for the grounding model |
| rommie_ground_model | UI-TARS-7B-DPO-Q6_K_L.gguf | Grounding model (UI element localisation) |
| rommie_grounding_width | 1024 | Screenshot width passed to the grounding model |
| rommie_grounding_height | 1024 | Screenshot height passed to the grounding model |
| rommie_rel | main | Git branch/tag to stage from ~/git/rommie |

All host-specific variables are set in ansible/inventory/host_vars/caliban.incus.yml. The rommie_rel default is in ansible/inventory/group_vars/all/vars.yml.

Integration

The MCP URL for Rommie is registered in group_vars/all/vars.yml:

rommie_mcp_url: https://rommie.ouranos.helu.ca/mcp

Consumers (e.g., MCP Switchboard, Open WebUI, Claude Desktop) reference {{ rommie_mcp_url }}.

The route is served via Titania's HAProxy using the existing *.ouranos.helu.ca Let's Encrypt wildcard certificate. No additional certificate provisioning is required.
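For consumers that talk to the endpoint directly, MCP tool invocations are JSON-RPC 2.0 requests with the method tools/call. The sketch below only builds the request body; the helper name is hypothetical, and the argument schema of each Rommie tool is not documented here, so any arguments passed are assumptions. A real client would also need to complete the MCP initialize handshake before calling tools.

```python
import json
from itertools import count
from typing import Optional

# Monotonic request ids, as JSON-RPC expects a unique id per request.
_request_ids = count(1)


def tool_call_request(tool: str, arguments: Optional[dict] = None) -> str:
    """Serialize an MCP tools/call request for the named tool."""
    body = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments or {}},
    }
    return json.dumps(body)
```

For example, `tool_call_request("get_agent_status")` produces a body that could be POSTed to the URL in rommie_mcp_url once a session is established.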

Service Management

# Check status
systemctl status rommie

# Restart
systemctl restart rommie

# View logs
journalctl -u rommie -f

The unit runs as principal_user (robert) and loads environment from ~/rommie/.env. It restarts automatically on failure with a 10 s back-off.
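When debugging, it can help to see which values the service would pick up from that file. The following is a minimal sketch of a KEY=VALUE parser; it is not the loader Rommie or systemd uses, and it ignores edge cases such as escapes and multi-line values. Apart from DISPLAY, any key names shown in examples are hypothetical.

```python
from pathlib import Path


def parse_env_file(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks, comments, and malformed lines."""
    env = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and one layer of matching-style quotes.
        env[key.strip()] = value.strip().strip('"').strip("'")
    return env


def load_env(path: str) -> dict:
    """Read and parse an env file from disk, expanding a leading ~."""
    return parse_env_file(Path(path).expanduser().read_text())
```

Calling `load_env("~/rommie/.env").get("DISPLAY")` as the service user shows the display value the unit will hand to Agent S.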

Troubleshooting

gui-agents version conflict

gui-agents 0.3.x requires Python <=3.12 in its PyPI metadata but works on 3.13. The deploy playbook installs it with --ignore-requires-python. If the install step fails with a version conflict, confirm the pre-install task ran and check the venv Python version:

/home/robert/env/rommie/bin/python --version
/home/robert/env/rommie/bin/pip show gui-agents
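The same check can be done from inside the venv's interpreter with the standard library. This is a small sketch with a hypothetical helper name; it reports the installed version alongside the declared Requires-Python constraint so the mismatch is visible in one place.

```python
import sys
from importlib import metadata


def describe_package(name: str):
    """Return version info for an installed distribution, or None if it is absent."""
    try:
        dist = metadata.distribution(name)
    except metadata.PackageNotFoundError:
        return None
    return {
        "version": dist.version,
        # Constraint declared by the package, e.g. "<=3.12"; None if undeclared.
        "requires_python": dist.metadata.get("Requires-Python"),
        "interpreter": "%d.%d.%d" % sys.version_info[:3],
    }
```

Run it with /home/robert/env/rommie/bin/python and call `describe_package("gui-agents")`; a result of None means the pre-install task did not place the package in the venv.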

Health check fails

The playbook probes http://localhost:22031/mcp after starting the service. If it times out:

  1. Check the service started: systemctl status rommie
  2. Confirm the DISPLAY variable resolves — XRDP must have created the :10 display before Rommie starts
  3. Check logs: journalctl -u rommie --since "5 min ago"

No X display

Rommie inherits DISPLAY from .env. If Agent S cannot connect to the display:

# Verify XRDP created the display
ls /tmp/.X11-unix/

An active RDP session must exist or XRDP's Xorg daemon must be running for display :10 to be present.
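The socket check can also be scripted. For a local display, X11 display number N corresponds to the Unix socket /tmp/.X11-unix/XN, so :10 maps to X10. The helper names below are hypothetical, and the sketch only covers local displays; it checks that the socket exists, not that the X server accepts connections.

```python
import re
from pathlib import Path


def x11_socket_path(display: str, socket_dir: str = "/tmp/.X11-unix") -> Path:
    """Map a DISPLAY value such as ':10' or ':10.0' to its local socket path."""
    match = re.fullmatch(r"(?:[^:]*):(\d+)(?:\.\d+)?", display)
    if match is None:
        raise ValueError(f"unrecognised DISPLAY value: {display!r}")
    return Path(socket_dir) / f"X{match.group(1)}"


def display_available(display: str, socket_dir: str = "/tmp/.X11-unix") -> bool:
    """Return True if the socket for the given display exists."""
    return x11_socket_path(display, socket_dir).exists()
```

If `display_available(":10")` returns False on caliban.incus, XRDP has not created the display yet and Agent S will not be able to connect.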