# Ansible Deployment for Agent S

Agent S is a computer-use automation agent that controls a desktop environment via RDP. The deployment configures a full graphical stack on `larissa.helu.ca`: MATE desktop, XRDP, audio redirection via PulseAudio, and an AT-SPI accessibility bridge so agents can introspect the UI tree.

## Host

| Host | Group | Type |
|------|-------|------|
| `larissa.helu.ca` | `agent_s` | Incus container |

## Overview

The deployment installs and configures:

- **Principal user** (`robert`, UID 1000) — the human account the agent operates on behalf of
- **MATE desktop** — chosen for strong AT-SPI accessibility support
- **Firefox** from the Mozilla APT repo (bypasses the snap dependency introduced by Ubuntu)
- **Google Chrome** for browser automation
- **XRDP** with a custom Xorg config pinned to 1024×768 for UI-TARS / Agent-S model compatibility
- **PulseAudio + pulseaudio-module-xrdp** — audio redirection over RDP
- **AT-SPI accessibility stack** — allows agents to query the widget tree
- **Python virtual environment** with the Agent-S package and dependencies
- **Agent-S repository** extracted from a staged release tarball
- **Environment activation script** at `~/.agent_s_env`

## Prerequisites

### Control node

- Ansible 2.12+
- SSH access to `larissa.helu.ca`
- Staged release tarballs in `~/rel/` (produced by `stage.yml`):
  - `~/rel/agent_s_.tar`
  - `~/rel/pulseaudio_module_xrdp_.tar`

### Target host

- Ubuntu 25.04
- Network access to Mozilla APT, Google Chrome DL, and the Ubuntu package mirrors
- deb-src repositories available (the playbook enables them if absent)

## Staging

Before deploying, stage release tarballs from local git checkouts:

```bash
ansible-playbook ansible/agent_s/stage.yml
```

`stage.yml` runs on localhost and creates two tarballs from the configured git branches:

| Tarball | Source repo | Branch variable |
|---------|-------------|-----------------|
| `agent_s_.tar` | `~/gh/Agent-S` | `agent_s_rel` |
| `pulseaudio_module_xrdp_.tar` | `~/gh/pulseaudio-module-xrdp` | `pulseaudio_module_xrdp_rel` |

Both variables take their defaults from the `all` group vars (`agent_s_rel: main`, `pulseaudio_module_xrdp_rel: devel`).

## Deployment

```bash
ansible-playbook ansible/agent_s/deploy.yml
```

### Phase 1 — Principal user and snap suppression

Creates the `robert` account (UID 1000) and pins `snapd` at priority -10 via `/etc/apt/preferences.d/nosnap.pref`. This must run before the desktop install so `ubuntu-mate-desktop` cannot pull in snap-packaged Firefox.

### Phase 2 — Firefox

Adds the Mozilla APT repo and pins all `packages.mozilla.org` packages at priority 1000, ensuring the system installs the .deb Firefox rather than the snap.

### Phase 3 — MATE desktop

Installs `ubuntu-mate-desktop`. MATE is preferred over GNOME because its AT-SPI bridge is more reliable in a headless XRDP session.

### Phase 4 — XRDP

Installs `xrdp` and `xorgxrdp`, adds the `xrdp` user to the `ssl-cert` group, and enables the service. The Xorg configuration is deployed separately (Phase 8).

### Phase 5 — AT-SPI accessibility

Installs the AT-SPI core libraries and adds `/etc/profile.d/atspi.sh` to set:

```bash
GTK_MODULES=gail:atk-bridge
NO_AT_BRIDGE=0
ACCESSIBILITY_ENABLED=1
```

MATE applications pick up these environment variables when launched from `.xsession`, which makes the widget tree available to AT-SPI consumers such as `pyatspi`.

### Phase 6 — PulseAudio and RDP audio

See [Sound device configuration](#sound-device-configuration) below.

### Phase 7 — Packages, Chrome, Python environment, Agent-S

- Installs OCR support (`tesseract-ocr`), Python 3, and assistive tech libraries
- Downloads and installs Google Chrome from the official .deb
- Creates a Python venv at `~/env/agents` with `--system-site-packages` (so `pyatspi` and `python3-gi`, installed system-wide, are available)
- Extracts Agent-S into `~/gh/Agent-S`

### Phase 8 — XRDP Xorg configuration and session

Deploys `xorg.conf.j2` to `/etc/X11/xrdp/xorg.conf` and writes `.xsession` to start MATE:

```
exec mate-session
```

Any change to the Xorg config triggers the `restart xrdp` handler.

---

## X Server / RDP configuration

The Xorg config is templated from `ansible/agent_s/xorg.conf.j2` and deployed to `/etc/X11/xrdp/xorg.conf`.

### Resolution

The display is pinned to **1024×768**. UI-TARS and similar vision–language models are trained on screenshots at this resolution; higher resolutions degrade accuracy. Fallback modelines for 800×600 and 640×480 satisfy the `xrdpdev` driver's internal requirements but are never selected in normal operation.

### Module loading order

```
Load "glamoregl"   ← must precede xorgxrdp
Load "xorgxrdp"
```

`xorgxrdp 0.10.2` (shipped in Ubuntu 25.04) depends on the symbol `glamor_xv_init`, which lives in `libglamoregl.so`. If `glamoregl` is not loaded first, `libxorgxrdp.so` fails with:

```
undefined symbol: glamor_xv_init
```

This cascades — `xrdpdev`, `xrdpmouse`, and `xrdpkeyb` all fail because they depend on symbols exported by `libxorgxrdp.so`.

### Device section

The Device section uses only the `xrdpdev` virtual framebuffer driver with no DRM/GPU options:

```
Section "Device"
    Identifier "Video Card (xrdpdev)"
    Driver     "xrdpdev"
EndSection
```

DRM options (`DRMDevice`, `DRI3`, `DRMAllowList`) are not applicable to the virtual framebuffer and were the source of a previous misconfiguration on `larissa`.

### Display variable

The agent environment sets `DISPLAY=:10.0` (via `~/.agent_s_env`).
XRDP assigns display numbers starting at `:10` by default.

---

## Sound device configuration

RDP audio redirection requires two components:

1. **PulseAudio** — the userspace sound server
2. **pulseaudio-module-xrdp** — a PulseAudio sink/source module that forwards audio to the RDP client

### Build process

`pulseaudio-module-xrdp` must be compiled against the PulseAudio headers matching the exact version running on the target host. The playbook automates this:

1. Enables `deb-src` entries in `/etc/apt/sources.list.d/ubuntu.sources`
2. Runs `apt-get source pulseaudio` into `/usr/local/src/`
3. Generates `config.h` by running `meson setup build` in the PulseAudio source tree
4. Extracts `pulseaudio-module-xrdp` from the staged tarball into `/usr/local/src/pulseaudio-module-xrdp/`
5. Runs `./bootstrap && ./configure PULSE_DIR= && make && make install`

Steps 4–5 are skipped if `module-xrdp-sink.so` is already present under `/usr/lib/pulse-*/modules/`. Re-running the playbook after a PulseAudio upgrade will trigger a rebuild because the old `.so` won't be found at the new versioned path.

### How audio flows

```
MATE application
  → PulseAudio (userspace daemon, per-user session)
  → module-xrdp-sink (installed by pulseaudio-module-xrdp)
  → XRDP audio channel
  → RDP client
```

PulseAudio is started automatically as part of the MATE session (`mate-session` launches `pulseaudio --start`). No additional service unit is required.

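To confirm at runtime that the chain above reaches the RDP sink, the sink list printed by `pactl list short sinks` can be inspected from inside the RDP session. The following is a minimal sketch and not part of the playbook: `xrdp_sink_state` is a hypothetical helper name, and it assumes the module registers its sink as `xrdp-sink` and that the listing is tab-separated with the sink state in the fifth column.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not shipped by the playbook).
# Reads `pactl list short sinks` output on stdin, prints the state of the
# sink named "xrdp-sink" (an assumed name), and returns non-zero if no such
# sink is listed.
xrdp_sink_state() {
    awk -F'\t' '$2 == "xrdp-sink" { print $5; found = 1 } END { exit !found }'
}

# Usage, from a shell inside the RDP session:
#   pactl list short sinks | xrdp_sink_state || echo "xrdp sink not loaded"
```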

---

## Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `principal_user` | `robert` | Username for the human account the agent runs as |
| `agent_s_rel` | `main` | Git branch/tag to stage from `~/gh/Agent-S` |
| `pulseaudio_module_xrdp_rel` | `devel` | Git branch/tag to stage from `~/gh/pulseaudio-module-xrdp` |
| `agent_s_venv` | `/home/{{ principal_user }}/env/agents` | Path to the Python virtual environment |
| `agent_s_repo` | `/home/{{ principal_user }}/gh/Agent-S` | Extraction path for the Agent-S source |

All variables are set in `ansible/inventory/host_vars/larissa.helu.ca.yml` (`principal_user`) and `ansible/inventory/group_vars/all/vars.yml` (release branches).

---

## Environment activation

After deployment, activate the Agent-S environment on the target host:

```bash
source ~/.agent_s_env
```

This activates the virtual environment, sets `AGENT_S_HOME` and `DISPLAY`, and exports placeholder values for `HF_TOKEN` and `OPENAI_API_KEY` that must be filled in before model inference will work.

---

## Troubleshooting

### X server fails to start — `undefined symbol: glamor_xv_init`

`glamoregl` is missing or ordered after `xorgxrdp` in the Module section. Check `/etc/X11/xrdp/xorg.conf` and ensure `Load "glamoregl"` appears before `Load "xorgxrdp"`.

### Black screen on RDP connect

Confirm `.xsession` contains `exec mate-session` and is executable (`chmod 0755`). Check `/var/log/xrdp-sesman.log` for session startup errors.

### No audio in RDP session

Verify `module-xrdp-sink.so` is installed:

```bash
find /usr/lib/pulse-*/modules/ -name 'module-xrdp-sink.so'
```

If absent, re-run the playbook. If PulseAudio was upgraded, the old `.so` path will not exist and the build steps will execute automatically.

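The same presence check can be wrapped in a function that returns a status instead of printing paths, which is convenient in scripts. This is only a sketch of the condition described above, not the playbook's own task: the function name `xrdp_sink_module_installed` and its optional root-prefix argument are hypothetical, the latter added so the check can be pointed at a scratch directory.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds if module-xrdp-sink.so exists under
# <root>/usr/lib/pulse-*/modules/. With no argument, <root> is empty and the
# real /usr/lib tree is checked.
xrdp_sink_module_installed() {
    local root="${1:-}"
    local candidate
    for candidate in "$root"/usr/lib/pulse-*/modules/module-xrdp-sink.so; do
        # An unmatched glob is left as a literal string, so confirm the
        # candidate actually exists before reporting success.
        [ -e "$candidate" ] && return 0
    done
    return 1
}
```

For example, `xrdp_sink_module_installed || echo "module missing, re-run the playbook"` reports the case in which the build steps would run again.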
### Accessibility / AT-SPI not working

Confirm the profile script is loaded in the session:

```bash
echo $GTK_MODULES
# should include atk-bridge
```

If empty, verify `/etc/profile.d/atspi.sh` exists and the session was started via `.xsession` (not a display manager session that bypasses `/etc/profile.d/`).
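
Because `GTK_MODULES` is a colon-separated list, a substring match can give a false positive when another module name happens to contain `atk-bridge`. A minimal sketch of an exact-entry check follows; `has_atk_bridge` is a hypothetical helper name and is not installed by the playbook.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds only if the colon-separated list passed as $1
# contains an entry that is exactly "atk-bridge".
has_atk_bridge() {
    # Wrapping the value in colons lets one pattern match the entry at the
    # start, middle, or end of the list.
    case ":${1:-}:" in
        *:atk-bridge:*) return 0 ;;
        *) return 1 ;;
    esac
}
```

Inside the RDP session, `has_atk_bridge "$GTK_MODULES" || echo "atk-bridge not in GTK_MODULES"` flags a session that did not source the profile script.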