chore(ansible): update model endpoints and enable Rommie deployment

- Bump Qwen model from 3.5 to 3.6 and update inference endpoints
  (nyx:22079→22072, pan:22078→22076) for caliban and puck hosts
- Add Rommie MCP server deployment to site.yml
- Update Rommie docs to reflect new port (20361), model versions, and
  health check accepting 200/406 status codes

@@ -24,11 +24,11 @@ rommie_port: 20361
 rommie_host: "0.0.0.0"
 rommie_display: ":10"
 rommie_allowed_hosts: "caliban.incus,rommie.ouranos.helu.ca"
-rommie_model: Qwen3.5-35B-A3B-UD-Q4_K_XL.gguf
-rommie_model_url: "http://nyx.helu.ca:22079"
+rommie_model: Qwen3.6-35B-A3B-UD-Q4_K_XL.gguf
+rommie_model_url: "http://nyx.helu.ca:22072"
 rommie_provider: "openai"
 rommie_ground_provider: "huggingface"
-rommie_ground_url: "http://pan.helu.ca:22078"
+rommie_ground_url: "http://pan.helu.ca:22076"
 rommie_ground_model: "UI-TARS-7B-DPO-Q6_K_L.gguf"
 rommie_grounding_width: 1024
 rommie_grounding_height: 1024
@@ -79,8 +79,8 @@ pallas_log_level: INFO
 kottos_fastagent_log_level: info

 # LLM provider — the same OpenAI-compatible Qwen endpoint Kottos uses today.
-kottos_default_model: "openai.Qwen3.5-35B-A3B-UD-Q4_K_XL.gguf"
-kottos_openai_base_url: "http://nyx.helu.ca:22079/v1"
+kottos_default_model: "openai.Qwen3.6-35B-A3B-UD-Q4_K_XL.gguf"
+kottos_openai_base_url: "http://nyx.helu.ca:22072/v1"
 kottos_model_vision: true
 kottos_model_context_window: 192000
 kottos_model_max_output_tokens: 16384
@@ -48,6 +48,9 @@
 - name: Deploy Agent S
   import_playbook: agent_s/deploy.yml

+- name: Deploy Rommie MCP Server
+  import_playbook: rommie/deploy.yml
+
 - name: Stage Kottos (Pallas FastAgent runtime)
   import_playbook: kottos/stage.yml

@@ -44,7 +44,7 @@ The playbook imports `agent_s/deploy.yml` first to ensure the MATE desktop and A
 4. Installs Rommie into the venv in editable mode (`pip install -e`)
 5. Deploys `~/rommie/.env` from the template
 6. Deploys and enables the `rommie.service` systemd unit
-7. Health-checks `http://localhost:<rommie_port>/mcp` (retries 5×, 3 s apart)
+7. Health-checks `http://localhost:<rommie_port>/mcp` (retries 5×, 3 s apart, accepts 200/406)

 ## MCP Tools

@@ -64,7 +64,7 @@ External Agent (e.g., Claude Desktop / MCP Switchboard)
 │ https://rommie.ouranos.helu.ca/mcp
 ▼
 Titania HAProxy (TLS termination, wildcard cert)
-│ http://caliban.incus:22031/mcp
+│ http://caliban.incus:20361/mcp
 ▼
 Rommie MCP Server
 (serialized task execution, multi-client reads)
@@ -80,15 +80,15 @@ External Agent (e.g., Claude Desktop / MCP Switchboard)

 | Variable | Default | Description |
 |----------|---------|-------------|
-| `rommie_port` | `22031` | HTTP listen port |
+| `rommie_port` | `20361` | HTTP listen port |
 | `rommie_host` | `0.0.0.0` | Bind address |
 | `rommie_display` | `:10` | X11 display for Agent S (XRDP assigns `:10` by default) |
 | `rommie_allowed_hosts` | `caliban.incus` | Allowed Host header values |
-| `rommie_model` | `Qwen3-VL-30B-A3B-Instruct-UD-Q5_K_XL.gguf` | Primary vision-language model |
-| `rommie_model_url` | `http://nyx.helu.ca:22078` | Inference endpoint for the primary model |
+| `rommie_model` | `Qwen3.6-35B-A3B-UD-Q4_K_XL.gguf` | Primary vision-language model |
+| `rommie_model_url` | `http://nyx.helu.ca:22072` | Inference endpoint for the primary model |
 | `rommie_provider` | `openai` | API provider for the primary model |
 | `rommie_ground_provider` | `huggingface` | API provider for the grounding model |
-| `rommie_ground_url` | `http://pan.helu.ca:22078` | Inference endpoint for the grounding model |
+| `rommie_ground_url` | `http://pan.helu.ca:22076` | Inference endpoint for the grounding model |
 | `rommie_ground_model` | `UI-TARS-7B-DPO-Q6_K_L.gguf` | Grounding model (UI element localisation) |
 | `rommie_grounding_width` | `1024` | Screenshot width passed to the grounding model |
 | `rommie_grounding_height` | `1024` | Screenshot height passed to the grounding model |
@@ -136,7 +136,7 @@ The unit runs as `principal_user` (`robert`) and loads environment from `~/rommi

 ### Health check fails

-The playbook probes `http://localhost:22031/mcp` after starting the service. If it times out:
+The playbook probes `http://localhost:20361/mcp` after starting the service. If it times out:

 1. Check the service started: `systemctl status rommie`
 2. Confirm the `DISPLAY` variable resolves — XRDP must have created the `:10` display before Rommie starts
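
The updated docs describe a health check that probes `http://localhost:<rommie_port>/mcp`, retries 5 times at 3 s intervals, and accepts either 200 or 406. The diff does not show `rommie/deploy.yml`, so the following is only a minimal sketch of how such a probe could be written with `ansible.builtin.uri`; the task name and the registered variable `rommie_health` are hypothetical.

```yaml
# Sketch only: assumes the probe is an ansible.builtin.uri task.
# The task name and `rommie_health` are illustrative, not taken from the commit.
- name: Health-check the Rommie MCP endpoint
  ansible.builtin.uri:
    url: "http://localhost:{{ rommie_port }}/mcp"
    method: GET
    # Treat both 200 and 406 as a sign that the server is up and answering.
    status_code: [200, 406]
  register: rommie_health
  # Re-run the probe until it succeeds, up to 5 retries, 3 seconds apart.
  until: rommie_health is succeeded
  retries: 5
  delay: 3
```

Accepting 406 presumably covers the case where the MCP endpoint answers a plain GET that lacks the `Accept` headers an MCP client would send; that reading is an assumption, since the commit itself does not state the reason.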