# Ansible Deployment for Rommie

Rommie is an MCP server that wraps [Agent S](https://github.com/simular-ai/Agent-S), enabling agent-to-agent collaboration for GUI automation. It exposes three MCP tools (`execute_gui_task`, `get_screenshot`, and `get_agent_status`) over Streamable HTTP, so remote AI agents can delegate GUI tasks to the MATE desktop running on `caliban.incus`.

The name comes from the AI avatar of the Andromeda Ascendant.

## Host

| Host | Group | Type |
|------|-------|------|
| `caliban.incus` | `rommie` | Incus container |

## Prerequisites

### Control node

- Staged release tarball in `~/rel/` (produced by `agent_s/stage.yml`):
  - `~/rel/rommie_<rommie_rel>.tar`

### Target host

- Agent S fully deployed (`agent_s/deploy.yml`); Rommie's `deploy.yml` imports it as a dependency
- MATE desktop and XRDP running (the Agent S deployment provides both)
- Python 3.13 (Ubuntu 25.04)
- X11 display available at the configured `DISPLAY` value

> **Note**: `gui-agents` 0.3.x declares `Requires-Python <=3.12` in its PyPI metadata even though it works on Python 3.13. The deploy playbook pre-installs it with `--ignore-requires-python` before installing Rommie.

## Staging

Rommie is staged from a local git checkout by `agent_s/stage.yml`, which builds the Rommie tarball as part of the Agent S staging run. The branch or tag to stage is set by `rommie_rel` in `ansible/inventory/group_vars/all/vars.yml` (default: `main`).

## Deployment

```bash
ansible-playbook ansible/rommie/deploy.yml
```

The playbook imports `agent_s/deploy.yml` first to ensure the MATE desktop and Agent S dependencies are in place, then:

1. Creates `~/rommie/` and extracts the staged tarball
2. Creates a Python venv at `~/env/rommie` with `--system-site-packages`
3. Pre-installs `gui-agents>=0.3.1` with `--ignore-requires-python`
4. Installs Rommie into the venv in editable mode (`pip install -e`)
5. Deploys `~/rommie/.env` from the template
6. Deploys and enables the `rommie.service` systemd unit
7. Health-checks `http://localhost:<rommie_port>/mcp` (retries 5×, 3 s apart)
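
The health check in step 7 amounts to a bounded retry loop. The sketch below is not the playbook's task. It is a minimal Python illustration that assumes any HTTP response from the endpoint (even a 4xx) means the service is listening; the playbook's real success criteria may differ. The names `wait_for_rommie` and `http_probe` are hypothetical.

```python
import time
import urllib.error
import urllib.request


def http_probe(url, timeout=5.0):
    """Assumption: any HTTP response, even 4xx/5xx, means the server is up."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True   # the server answered, just not with a 2xx status
    except (urllib.error.URLError, OSError):
        return False  # connection refused, timeout, DNS failure, ...


def wait_for_rommie(url, retries=5, delay=3.0, probe=http_probe, sleep=time.sleep):
    """Return True once probe(url) succeeds, trying at most `retries` times."""
    for attempt in range(1, retries + 1):
        if probe(url):
            return True
        if attempt < retries:
            sleep(delay)  # pause between attempts, not after the last one
    return False
```

For example, `wait_for_rommie("http://localhost:22031/mcp")` returns `True` as soon as one probe gets a response and `False` after five failed attempts.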

## MCP Tools

| Tool | Concurrency | Description |
|------|-------------|-------------|
| `execute_gui_task` | Serialized (one at a time) | Execute a GUI automation task via Agent S |
| `get_screenshot` | Always available | Capture the current screen state |
| `get_agent_status` | Always available | Query task progress and agent state |

The read-only tools (`get_screenshot`, `get_agent_status`) remain available while a GUI task is running. A second `execute_gui_task` call made while one is in flight returns a "busy" error.
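
The concurrency rules in the table can be pictured as a non-blocking lock around task execution. This is a sketch of the idea only, not Rommie's actual implementation: the `TaskGate` class, the `run` callback, and the response dictionaries are all hypothetical.

```python
import threading


class TaskGate:
    """Allow one GUI task at a time; reject overlapping calls instead of queueing."""

    def __init__(self):
        self._lock = threading.Lock()
        self._current = None  # description of the in-flight task, if any

    def execute_gui_task(self, task, run):
        # Non-blocking acquire: a second caller fails fast with "busy".
        if not self._lock.acquire(blocking=False):
            return {"status": "busy", "error": "another GUI task is in flight"}
        try:
            self._current = task
            return {"status": "ok", "result": run(task)}
        finally:
            self._current = None
            self._lock.release()

    def get_agent_status(self):
        # Read-only: never waits on the lock, so it answers mid-task.
        return {"busy": self._current is not None, "task": self._current}
```

Rejecting the second call instead of queueing it keeps the caller in control: an external agent can poll `get_agent_status` and retry when the desktop is free.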

## Architecture

```
External Agent (e.g., Claude / MCP Switchboard)
        │  MCP Protocol (Streamable HTTP)
        │  http://caliban.incus:22031/mcp
        ▼
Rommie MCP Server
  (serialized task execution, multi-client reads)
        │
        ▼
Agent S (gui-agents package)
        │
        ▼
MATE Desktop ← X11 display :10 ← XRDP session
```

## Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `rommie_port` | `22031` | HTTP listen port |
| `rommie_host` | `0.0.0.0` | Bind address |
| `rommie_display` | `:10` | X11 display for Agent S (XRDP assigns `:10` by default) |
| `rommie_allowed_hosts` | `caliban.incus` | Allowed Host header values |
| `rommie_model` | `Qwen3-VL-30B-A3B-Instruct-UD-Q5_K_XL.gguf` | Primary vision-language model |
| `rommie_model_url` | `http://nyx.helu.ca:22078` | Inference endpoint for the primary model |
| `rommie_provider` | `openai` | API provider for the primary model |
| `rommie_ground_provider` | `huggingface` | API provider for the grounding model |
| `rommie_ground_url` | `http://pan.helu.ca:22078` | Inference endpoint for the grounding model |
| `rommie_ground_model` | `UI-TARS-7B-DPO-Q6_K_L.gguf` | Grounding model (UI element localisation) |
| `rommie_grounding_width` | `1024` | Screenshot width passed to the grounding model |
| `rommie_grounding_height` | `1024` | Screenshot height passed to the grounding model |
| `rommie_rel` | `main` | Git branch/tag to stage from `~/git/rommie` |

All host-specific variables are set in `ansible/inventory/host_vars/caliban.incus.yml`. The `rommie_rel` default is in `ansible/inventory/group_vars/all/vars.yml`.

## Integration

The MCP URL for Rommie is registered in `ansible/inventory/group_vars/all/vars.yml`:

```yaml
rommie_mcp_url: http://caliban.incus:22031/mcp
```

Consumers (e.g., MCP Switchboard, Open WebUI) reference `{{ rommie_mcp_url }}`.
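
A consumer's first request to that URL is an MCP `initialize` call. The sketch below only builds the request so its shape is visible; it does not send anything. It assumes the server accepts protocol revision `2025-03-26` (the revision that introduced Streamable HTTP). `build_initialize_request` and the client name are hypothetical, and a real consumer would normally use an MCP client library.

```python
import json
import urllib.request

ROMMIE_MCP_URL = "http://caliban.incus:22031/mcp"  # value of rommie_mcp_url


def build_initialize_request(url=ROMMIE_MCP_URL, client_name="example-client"):
    """Build (but do not send) a JSON-RPC `initialize` POST for Streamable HTTP."""
    payload = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            "protocolVersion": "2025-03-26",  # assumption: revision the server accepts
            "capabilities": {},
            "clientInfo": {"name": client_name, "version": "0.0.1"},
        },
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Streamable HTTP clients must accept both plain JSON and SSE replies.
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )
```

Because the hostname in the URL becomes the `Host` header, a client should connect using a name listed in `rommie_allowed_hosts`.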

## Service Management

```bash
# Check status
systemctl status rommie

# Restart
systemctl restart rommie

# View logs
journalctl -u rommie -f
```

The unit runs as `principal_user` (`robert`) and loads environment from `~/rommie/.env`. It restarts automatically on failure with a 10 s back-off.

## Troubleshooting

### `gui-agents` version conflict

`gui-agents` 0.3.x declares `Requires-Python <=3.12` in its PyPI metadata but works on 3.13, so the deploy playbook installs it with `--ignore-requires-python`. If the Rommie install step fails with a version conflict, confirm that the pre-install task ran, then check the venv's Python version and the installed `gui-agents` version:

```bash
/home/robert/env/rommie/bin/python --version
/home/robert/env/rommie/bin/pip show gui-agents
```

### Health check fails

The playbook probes `http://localhost:<rommie_port>/mcp` (port `22031` by default) after starting the service. If the probe times out:

1. Check the service started: `systemctl status rommie`
2. Confirm the `DISPLAY` variable resolves: XRDP must have created the `:10` display before Rommie starts
3. Check logs: `journalctl -u rommie --since "5 min ago"`

### No X display

Rommie reads `DISPLAY` from `~/rommie/.env`. If Agent S cannot connect to the display:

```bash
# Verify XRDP created the display (expect an X10 socket for display :10)
ls /tmp/.X11-unix/
```

Display `:10` is present only while an RDP session is active or XRDP's `Xorg` daemon is running.
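
The same check can be scripted. A local display `:N` is served through the Unix socket `/tmp/.X11-unix/XN`, so the sketch below maps a `DISPLAY` value to that path and tests whether it exists. The helper names are hypothetical, and an existing socket does not by itself prove that the X server accepts connections.

```python
import os
import re


def x11_socket_path(display, socket_dir="/tmp/.X11-unix"):
    """Map a local DISPLAY value such as ':10' or ':10.0' to its socket path."""
    match = re.fullmatch(r":(\d+)(?:\.\d+)?", display)
    if match is None:
        raise ValueError(f"not a local X11 display: {display!r}")
    return os.path.join(socket_dir, f"X{match.group(1)}")


def display_available(display, socket_dir="/tmp/.X11-unix"):
    """Return True if the socket for `display` exists in `socket_dir`."""
    return os.path.exists(x11_socket_path(display, socket_dir))
```

With the default `rommie_display`, `display_available(":10")` looks for `/tmp/.X11-unix/X10`.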