feat: add RabbitMQ vhost and user configuration for mnemosyne
149
docs/rommie.md
Normal file
@@ -0,0 +1,149 @@
# Ansible Deployment for Rommie

Rommie is an MCP server that wraps [Agent S](https://github.com/simular-ai/Agent-S), enabling agent-to-agent collaboration for GUI automation. It exposes three MCP tools (`execute_gui_task`, `get_screenshot`, and `get_agent_status`) over Streamable HTTP, so remote AI agents can delegate GUI tasks to the MATE desktop running on `caliban.incus`.

Named after the Andromeda Ascendant's AI avatar.

## Host

| Host | Group | Type |
|------|-------|------|
| `caliban.incus` | `rommie` | Incus container |

## Prerequisites

### Control node

- Staged release tarball in `~/rel/` (produced by `agent_s/stage.yml`):
  - `~/rel/rommie_<rommie_rel>.tar`

### Target host

- Agent S fully deployed (`agent_s/deploy.yml`); Rommie's `deploy.yml` imports it as a dependency
- MATE desktop and XRDP running (the Agent S deployment provides both)
- Python 3.13 (Ubuntu 25.04)
- X11 display available at the configured `DISPLAY` value

> **Note**: `gui-agents` 0.3.x declares `Requires-Python <=3.12` in its PyPI metadata despite working on Python 3.13. The deploy playbook pre-installs it with `--ignore-requires-python` before installing Rommie.

## Staging

Rommie is staged from a local git checkout by `agent_s/stage.yml`, which creates the Rommie tarball as part of the Agent S staging run. The branch or tag to stage is controlled by `rommie_rel` in `group_vars/all/vars.yml` (default: `main`).

## Deployment

```bash
ansible-playbook ansible/rommie/deploy.yml
```

The playbook imports `agent_s/deploy.yml` first to ensure the MATE desktop and Agent S dependencies are in place, then:

1. Creates `~/rommie/` and extracts the staged tarball
2. Creates a Python venv at `~/env/rommie` with `--system-site-packages`
3. Pre-installs `gui-agents>=0.3.1` with `--ignore-requires-python`
4. Installs Rommie into the venv in editable mode (`pip install -e`)
5. Deploys `~/rommie/.env` from the template
6. Deploys and enables the `rommie.service` systemd unit
7. Health-checks `http://localhost:<rommie_port>/mcp` (retries 5×, 3 s apart)
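
The health check in the last step can be reproduced by hand when debugging a deployment. The following is a minimal sketch, not the playbook's actual task: `http_probe` and `wait_for_health` are hypothetical names, and treating any HTTP response (including 4xx) as "the server is listening" is an assumption, since the status codes the playbook accepts are not stated here.

```python
import time
import urllib.error
import urllib.request


def http_probe(url, timeout=5.0):
    """Return True if an HTTP server answers at url.

    Assumption: any HTTP response, even an error status, means the
    service is up. Connection failures and timeouts return False.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True  # the server answered, just not with 2xx
    except (urllib.error.URLError, OSError):
        return False


def wait_for_health(probe, retries=5, delay=3.0, sleep=time.sleep):
    """Call probe() up to `retries` times, sleeping `delay` seconds between tries.

    Returns the 1-based attempt number that succeeded, or None if all failed.
    """
    for attempt in range(1, retries + 1):
        if probe():
            return attempt
        if attempt < retries:
            sleep(delay)
    return None
```

Calling `wait_for_health(lambda: http_probe("http://localhost:22031/mcp"))` mirrors the documented 5 retries at 3 s intervals; passing the probe as a callable keeps the retry logic testable without a network.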

## MCP Tools

| Tool | Concurrency | Description |
|------|-------------|-------------|
| `execute_gui_task` | Serialized (one at a time) | Execute a GUI automation task via Agent S |
| `get_screenshot` | Always available | Capture the current screen state |
| `get_agent_status` | Always available | Query task progress and agent state |

Read-only tools (`get_screenshot`, `get_agent_status`) remain available while a GUI task is running. A second `execute_gui_task` call while one is in-flight returns a "busy" error.
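
The concurrency rule above (one task at a time, reads never blocked, a second task rejected rather than queued) can be illustrated with a small asyncio sketch. This is not Rommie's implementation; `TaskGate` and `BusyError` are hypothetical names, and the use of `asyncio.Lock` is an assumption chosen for illustration.

```python
import asyncio


class BusyError(RuntimeError):
    """Raised when a task is submitted while another is still running."""


class TaskGate:
    """Admit one task at a time; status reads never wait on the lock."""

    def __init__(self):
        self._lock = asyncio.Lock()
        self.current = None  # description of the running task, if any

    async def run(self, description, task_fn):
        # Reject immediately instead of queueing behind the running task.
        # No await happens between this check and acquiring the lock,
        # so two callers on one event loop cannot both pass it.
        if self._lock.locked():
            raise BusyError("a GUI task is already in flight")
        async with self._lock:
            self.current = description
            try:
                return await task_fn()
            finally:
                self.current = None

    def status(self):
        # Read-only: safe to call while a task is running.
        return {"busy": self._lock.locked(), "task": self.current}
```

Rejecting rather than queueing keeps the caller informed: a remote agent that receives the busy error can poll status and retry, instead of waiting on a request of unknown duration.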

## Architecture

```
External Agent (e.g., Claude / MCP Switchboard)
        │  MCP Protocol (Streamable HTTP)
        │  http://caliban.incus:22031/mcp
        ▼
Rommie MCP Server
  (serialized task execution, multi-client reads)
        │
        ▼
Agent S (gui-agents package)
        │
        ▼
MATE Desktop  ←  X11 display :10  ←  XRDP session
```
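
To make the top hop of the diagram concrete, the sketch below builds the first message an MCP client sends over Streamable HTTP: a JSON-RPC `initialize` request, POSTed with an `Accept` header that allows both a JSON and an event-stream response. It is an illustration under assumptions: `build_initialize_request` and the client name `example-client` are hypothetical, and the protocol version string is assumed rather than taken from Rommie's configuration.

```python
import json
import urllib.request


def build_initialize_request(url, client_name="example-client"):
    """Build (but do not send) an MCP `initialize` request for url."""
    payload = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            # Assumed version; the server may negotiate a different one.
            "protocolVersion": "2025-03-26",
            "capabilities": {},
            "clientInfo": {"name": client_name, "version": "0.0.1"},
        },
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Streamable HTTP servers may answer with JSON or an SSE stream.
            "Accept": "application/json, text/event-stream",
        },
    )
```

Sending it is a matter of `urllib.request.urlopen(build_initialize_request("http://caliban.incus:22031/mcp"), timeout=10)`. Real consumers would normally use an MCP client library, which also handles the session and the follow-up tool calls.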

## Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `rommie_port` | `22031` | HTTP listen port |
| `rommie_host` | `0.0.0.0` | Bind address |
| `rommie_display` | `:10` | X11 display for Agent S (XRDP assigns `:10` by default) |
| `rommie_allowed_hosts` | `caliban.incus` | Allowed Host header values |
| `rommie_model` | `Qwen3-VL-30B-A3B-Instruct-UD-Q5_K_XL.gguf` | Primary vision-language model |
| `rommie_model_url` | `http://nyx.helu.ca:22078` | Inference endpoint for the primary model |
| `rommie_provider` | `openai` | API provider for the primary model |
| `rommie_ground_provider` | `huggingface` | API provider for the grounding model |
| `rommie_ground_url` | `http://pan.helu.ca:22078` | Inference endpoint for the grounding model |
| `rommie_ground_model` | `UI-TARS-7B-DPO-Q6_K_L.gguf` | Grounding model (UI element localisation) |
| `rommie_grounding_width` | `1024` | Screenshot width passed to the grounding model |
| `rommie_grounding_height` | `1024` | Screenshot height passed to the grounding model |
| `rommie_rel` | `main` | Git branch/tag to stage from `~/git/rommie` |

All host-specific variables are set in `ansible/inventory/host_vars/caliban.incus.yml`. The `rommie_rel` default is in `ansible/inventory/group_vars/all/vars.yml`.

## Integration

The MCP URL for Rommie is registered in `group_vars/all/vars.yml`:

```yaml
rommie_mcp_url: http://caliban.incus:22031/mcp
```

Consumers (e.g., MCP Switchboard, Open WebUI) reference `{{ rommie_mcp_url }}`.

## Service Management

```bash
# Check status
systemctl status rommie

# Restart
systemctl restart rommie

# View logs
journalctl -u rommie -f
```

The unit runs as `principal_user` (`robert`) and loads environment from `~/rommie/.env`. It restarts automatically on failure with a 10 s back-off.

## Troubleshooting

### `gui-agents` version conflict

`gui-agents` 0.3.x declares `Requires-Python <=3.12` in its PyPI metadata but works on Python 3.13, so the deploy playbook installs it with `--ignore-requires-python`. If the install step fails with a version conflict, confirm that the pre-install task ran, then check the venv's Python version and whether the package is present:

```bash
/home/robert/env/rommie/bin/python --version
/home/robert/env/rommie/bin/pip show gui-agents
```
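
The same two facts can be gathered from inside the interpreter, which is useful when it is unclear which Python is actually running. This is a small sketch using only the standard library; `installed_version` and `describe_environment` are hypothetical helper names, not part of Rommie.

```python
import sys
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def describe_environment(dist_name="gui-agents"):
    """Summarise the running Python version and one package's status."""
    py = "{}.{}.{}".format(*sys.version_info[:3])
    found = installed_version(dist_name)
    return f"python {py}; {dist_name} " + (found if found else "NOT INSTALLED")
```

Run it with the venv's interpreter (`/home/robert/env/rommie/bin/python`) and print `describe_environment()`. Because the venv is created with `--system-site-packages`, a copy installed system-wide would also be reported as present.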

### Health check fails

The playbook probes `http://localhost:22031/mcp` after starting the service. If it times out:

1. Check the service started: `systemctl status rommie`
2. Confirm the `DISPLAY` variable resolves: XRDP must have created the `:10` display before Rommie starts
3. Check logs: `journalctl -u rommie --since "5 min ago"`

### No X display

Rommie inherits `DISPLAY` from `.env`. If Agent S cannot connect to the display:

```bash
# Verify XRDP created the display
ls /tmp/.X11-unix/
```

Display `:10` is present only while an RDP session is active or XRDP's `Xorg` daemon is running.
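
The directory listing can be turned into a direct yes/no check. Local X servers expose a Unix socket named `X<n>` under `/tmp/.X11-unix/`, so display `:10` corresponds to `/tmp/.X11-unix/X10`. The sketch below maps a `DISPLAY` value to that path; `x11_socket_path` and `display_is_present` are hypothetical helpers, and only local displays (no hostname part) are handled.

```python
import re
from pathlib import Path

X11_SOCKET_DIR = Path("/tmp/.X11-unix")


def x11_socket_path(display):
    """Map a local DISPLAY value such as ':10' or ':10.0' to its socket path."""
    match = re.fullmatch(r"(?:unix)?:(\d+)(?:\.\d+)?", display)
    if match is None:
        raise ValueError(f"not a local X11 display: {display!r}")
    return X11_SOCKET_DIR / f"X{match.group(1)}"


def display_is_present(display):
    """Return True if the socket for the given local display exists."""
    return x11_socket_path(display).exists()
```

If `display_is_present(":10")` is false on `caliban.incus`, the display has not been created yet, which matches the condition described above.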