ouranos/docs/agent_s.md
Robert Helewka 042df52bca Refactor user management in Ansible playbooks to standardize on keeper_user
- Updated user addition tasks across multiple playbooks (mcp_switchboard, mcpo, neo4j, neo4j_mcp, openwebui, postgresql, rabbitmq, searxng, smtp4dev) to replace references to ansible_user and remote_user with keeper_user.
- Modified PostgreSQL deployment to create directories and manage files under keeper_user's home.
- Enhanced documentation to clarify account taxonomy and usage of keeper_user in playbooks.
- Introduced new deployment for Agent S, including environment setup, desktop environment installation, XRDP configuration, and accessibility support.
- Added staging playbook for preparing release tarballs from local repositories.
- Created templates for XRDP configuration and environment activation scripts.
- Removed obsolete sunwait documentation.
2026-03-05 10:37:41 +00:00


Ansible Deployment for Agent S

Agent S is a computer-use automation agent that controls a desktop environment via RDP. The deployment configures a full graphical stack on larissa.helu.ca: MATE desktop, XRDP, audio redirection via PulseAudio, and an AT-SPI accessibility bridge so agents can introspect the UI tree.

Host

Host             Group    Type
larissa.helu.ca  agent_s  Incus container

Overview

The deployment installs and configures:

  • Principal user (robert, UID 1000) — the human account the agent operates on behalf of
  • MATE desktop — chosen for strong AT-SPI accessibility support
  • Firefox from the Mozilla APT repo (bypasses the snap dependency introduced by Ubuntu)
  • Google Chrome for browser automation
  • XRDP with a custom Xorg config pinned to 1024×768 for UI-TARS / Agent-S model compatibility
  • PulseAudio + pulseaudio-module-xrdp — audio redirection over RDP
  • AT-SPI accessibility stack — allows agents to query the widget tree
  • Python virtual environment with the Agent-S package and dependencies
  • Agent-S repository extracted from a staged release tarball
  • Environment activation script at ~/.agent_s_env

Prerequisites

Control node

  • Ansible 2.12+
  • SSH access to larissa.helu.ca
  • Staged release tarballs in ~/rel/ (produced by stage.yml):
    • ~/rel/agent_s_<agent_s_rel>.tar
    • ~/rel/pulseaudio_module_xrdp_<pulseaudio_module_xrdp_rel>.tar

Target host

  • Ubuntu 25.04
  • Network access to Mozilla APT, Google Chrome DL, and the Ubuntu package mirrors
  • deb-src repositories available (the playbook enables them if absent)

Staging

Before deploying, stage release tarballs from local git checkouts:

ansible-playbook ansible/agent_s/stage.yml

stage.yml runs on localhost and creates two tarballs from the configured git branches:

Tarball                           Source repo                  Branch variable
agent_s_<rel>.tar                 ~/gh/Agent-S                 agent_s_rel
pulseaudio_module_xrdp_<rel>.tar  ~/gh/pulseaudio-module-xrdp  pulseaudio_module_xrdp_rel

Both variables take their defaults from the all group vars (agent_s_rel: main, pulseaudio_module_xrdp_rel: devel).
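As a pre-flight check before deploying, the expected tarball names can be derived from the two release variables. The sketch below is a hypothetical helper, not part of stage.yml; it assumes the ~/rel/ layout and the default branch names given above.

```python
from pathlib import Path

# Defaults mirror the "all" group vars described above.
DEFAULT_RELEASES = {
    "agent_s": "main",
    "pulseaudio_module_xrdp": "devel",
}


def staged_tarballs(rel_dir, releases=DEFAULT_RELEASES):
    """Return the tarball paths the deployment is expected to look for."""
    rel_dir = Path(rel_dir).expanduser()
    return [rel_dir / f"{name}_{rel}.tar" for name, rel in releases.items()]


def missing_tarballs(rel_dir, releases=DEFAULT_RELEASES):
    """Return the expected tarballs that have not been staged yet."""
    return [p for p in staged_tarballs(rel_dir, releases) if not p.is_file()]
```

Calling missing_tarballs("~/rel") on the control node returns an empty list once stage.yml has produced both files.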

Deployment

ansible-playbook ansible/agent_s/deploy.yml

Phase 1 — Principal user and snap suppression

Creates the robert account (UID 1000) and pins snapd at priority -10 via /etc/apt/preferences.d/nosnap.pref. This must run before the desktop install so ubuntu-mate-desktop cannot pull in snap-packaged Firefox.

Phase 2 — Firefox

Adds the Mozilla APT repo and pins all packages.mozilla.org packages at priority 1000, ensuring the system installs the .deb Firefox rather than the snap.
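The snapd pin from Phase 1 and the Mozilla pin from this phase are both ordinary APT preferences stanzas. The sketch below renders what the two files probably contain. The Package and Pin lines are assumptions based on standard apt_preferences syntax, not copied from the playbook; only the priorities (-10 and 1000) come from this document.

```python
def apt_pin(package, pin, priority):
    """Render one APT preferences stanza (see apt_preferences(5))."""
    return (
        f"Package: {package}\n"
        f"Pin: {pin}\n"
        f"Pin-Priority: {priority}\n"
    )


# Assumed content of /etc/apt/preferences.d/nosnap.pref: a negative
# priority prevents the package from being installed at all.
NOSNAP = apt_pin("snapd", "release a=*", -10)

# Assumed content of the Mozilla pin: a priority of 1000 makes packages
# from packages.mozilla.org win over the Ubuntu archive's version.
MOZILLA = apt_pin("*", "origin packages.mozilla.org", 1000)

print(NOSNAP)
print(MOZILLA)
```

On the target host, apt-cache policy firefox shows which origin and priority APT actually selected.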

Phase 3 — MATE desktop

Installs ubuntu-mate-desktop. MATE is preferred over GNOME because its AT-SPI bridge is more reliable in a headless XRDP session.

Phase 4 — XRDP

Installs xrdp and xorgxrdp, adds the xrdp user to ssl-cert, and enables the service. The Xorg configuration is deployed separately (Phase 8).

Phase 5 — AT-SPI accessibility

Installs the AT-SPI core libraries and adds /etc/profile.d/atspi.sh to set:

GTK_MODULES=gail:atk-bridge
NO_AT_BRIDGE=0
ACCESSIBILITY_ENABLED=1

These environment variables are picked up by MATE applications when launched from .xsession, making the widget tree available to AT-SPI consumers such as pyatspi.
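A small check, run inside the RDP session, can confirm that the three variables reached the process environment. This is an illustrative sketch; the function name is hypothetical and the expected values are taken from the list above.

```python
import os

# Values written by /etc/profile.d/atspi.sh according to this document.
EXPECTED = {
    "NO_AT_BRIDGE": "0",
    "ACCESSIBILITY_ENABLED": "1",
}
REQUIRED_GTK_MODULES = {"gail", "atk-bridge"}


def atspi_env_problems(env=None):
    """Return a list of problems; an empty list means the env looks right."""
    env = os.environ if env is None else env
    problems = []
    # GTK_MODULES is a colon-separated list; ignore empty entries.
    loaded = set(filter(None, env.get("GTK_MODULES", "").split(":")))
    for module in sorted(REQUIRED_GTK_MODULES - loaded):
        problems.append(f"GTK_MODULES is missing {module}")
    for name, value in EXPECTED.items():
        if env.get(name) != value:
            problems.append(f"{name} should be {value!r}, got {env.get(name)!r}")
    return problems
```

Passing an explicit mapping instead of relying on os.environ makes the check usable against an environment captured from another process.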

Phase 6 — PulseAudio and RDP audio

See Sound device configuration below.

Phase 7 — Packages, Chrome, Python environment, Agent-S

  • Installs OCR support (tesseract-ocr), Python 3, and assistive tech libraries
  • Downloads and installs Google Chrome from the official .deb
  • Creates a Python venv at ~/env/agents with --system-site-packages (so pyatspi and python3-gi, installed system-wide, are available)
  • Extracts Agent-S into ~/gh/Agent-S
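The effect of --system-site-packages can be reproduced with the standard venv module. The sketch below creates a throwaway environment in a temporary directory, not at ~/env/agents, and reads back pyvenv.cfg to confirm the setting; the helper names are hypothetical.

```python
import tempfile
import venv
from pathlib import Path


def create_agent_venv(path):
    """Create a venv that can also import system-wide packages."""
    # system_site_packages=True is the API equivalent of --system-site-packages.
    # with_pip=False keeps this sketch fast; a real environment needs pip.
    venv.create(path, system_site_packages=True, with_pip=False)
    return Path(path) / "pyvenv.cfg"


def sees_system_packages(cfg_path):
    """Read pyvenv.cfg and report whether system site-packages are enabled."""
    for line in Path(cfg_path).read_text().splitlines():
        key, _, value = line.partition("=")
        if key.strip() == "include-system-site-packages":
            return value.strip().lower() == "true"
    return False


with tempfile.TemporaryDirectory() as tmp:
    cfg = create_agent_venv(Path(tmp) / "agents")
    print(sees_system_packages(cfg))
```

Running sees_system_packages against the deployed environment's pyvenv.cfg is a quick way to tell whether a failing import of pyatspi is a venv problem or a missing system package.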

Phase 8 — XRDP Xorg configuration and session

Deploys xorg.conf.j2 to /etc/X11/xrdp/xorg.conf and writes .xsession to start MATE:

exec mate-session

Any change to the Xorg config triggers the restart xrdp handler.


X Server / RDP configuration

The Xorg config is templated from ansible/agent_s/xorg.conf.j2 and deployed to /etc/X11/xrdp/xorg.conf.

Resolution

The display is pinned to 1024×768. UI-TARS and similar vision-language models are trained on screenshots at this resolution; higher resolutions degrade accuracy. Fallback modelines for 800×600 and 640×480 satisfy the xrdpdev driver's internal requirements but are never selected in normal operation.

Module loading order

Load "glamoregl"    # must precede xorgxrdp
Load "xorgxrdp"

xorgxrdp 0.10.2 (shipped in Ubuntu 25.04) depends on the symbol glamor_xv_init, which lives in libglamoregl.so. If glamoregl is not loaded first, libxorgxrdp.so fails with:

undefined symbol: glamor_xv_init

This cascades — xrdpdev, xrdpmouse, and xrdpkeyb all fail because they depend on symbols exported by libxorgxrdp.so.
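The ordering rule is easy to verify mechanically. The following sketch scans the text of an xorg.conf for Load lines and checks that glamoregl comes first. It is a simplified check under stated assumptions: it looks at the whole file, not only the Module section, and the function names are hypothetical.

```python
import re

# Matches lines such as:  Load "glamoregl"
LOAD_RE = re.compile(r'^\s*Load\s+"([^"]+)"', re.MULTILINE | re.IGNORECASE)


def loaded_modules(xorg_conf_text):
    """Return module names in the order their Load lines appear."""
    # Drop '#' comments first so commented-out Load lines are ignored.
    stripped = "\n".join(
        line.split("#", 1)[0] for line in xorg_conf_text.splitlines()
    )
    return LOAD_RE.findall(stripped)


def glamor_before_xorgxrdp(xorg_conf_text):
    """True when glamoregl is loaded and precedes xorgxrdp."""
    modules = loaded_modules(xorg_conf_text)
    if "glamoregl" not in modules or "xorgxrdp" not in modules:
        return False
    return modules.index("glamoregl") < modules.index("xorgxrdp")
```

On the target host the input would be the contents of /etc/X11/xrdp/xorg.conf.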

Device section

The Device section uses only the xrdpdev virtual framebuffer driver with no DRM/GPU options:

Section "Device"
    Identifier "Video Card (xrdpdev)"
    Driver "xrdpdev"
EndSection

DRM options (DRMDevice, DRI3, DRMAllowList) are not applicable to the virtual framebuffer and were the source of a previous misconfiguration on larissa.

Display variable

The agent environment sets DISPLAY=:10.0 (via ~/.agent_s_env). XRDP assigns display numbers starting at :10 by default.
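If the first session is not the only one on the host, the display number may differ from :10. A sketch for confirming that the display named in DISPLAY actually has a running X server, using the conventional /tmp/.X11-unix/X<n> socket path; the helper names are hypothetical.

```python
import os
import re

DISPLAY_RE = re.compile(
    r"^(?P<host>[^:]*):(?P<number>\d+)(?:\.(?P<screen>\d+))?$"
)


def x_socket_for(display):
    """Map a local DISPLAY value such as ':10.0' to its Unix socket path."""
    match = DISPLAY_RE.match(display)
    if match is None or match.group("host") not in ("", "unix"):
        raise ValueError(f"not a local X display: {display!r}")
    return f"/tmp/.X11-unix/X{match.group('number')}"


def display_is_up(display):
    """True when the X server socket for this display exists."""
    return os.path.exists(x_socket_for(display))
```

For the default configuration, x_socket_for(":10.0") resolves to /tmp/.X11-unix/X10, which exists only while an XRDP session is active.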


Sound device configuration

RDP audio redirection requires two components:

  1. PulseAudio — the userspace sound server
  2. pulseaudio-module-xrdp — a PulseAudio sink/source module that forwards audio to the RDP client

Build process

pulseaudio-module-xrdp must be compiled against the PulseAudio headers matching the exact version running on the target host. The playbook automates this:

  1. Enables deb-src entries in /etc/apt/sources.list.d/ubuntu.sources
  2. Runs apt-get source pulseaudio into /usr/local/src/
  3. Generates config.h by running meson setup build in the PulseAudio source tree
  4. Extracts pulseaudio-module-xrdp from the staged tarball into /usr/local/src/pulseaudio-module-xrdp/
  5. Runs ./bootstrap && ./configure PULSE_DIR=<pulse-src> && make && make install

Steps 4-5 are skipped if module-xrdp-sink.so is already present under /usr/lib/pulse-*/modules/. Re-running the playbook after a PulseAudio upgrade will trigger a rebuild because the old .so won't be found at the new versioned path.
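The skip condition amounts to a glob over the versioned module directories. The sketch below mirrors that test literally. How the playbook itself resolves the wildcard is not shown in this document, so treat the matching behaviour as an assumption: this version matches any pulse-* directory, including one left behind by an older PulseAudio.

```python
from pathlib import Path


def installed_xrdp_sinks(lib_root="/usr/lib"):
    """Find module-xrdp-sink.so under any versioned PulseAudio module dir."""
    return sorted(Path(lib_root).glob("pulse-*/modules/module-xrdp-sink.so"))


def needs_rebuild(lib_root="/usr/lib"):
    """Mirror the skip condition: rebuild only when no module is found."""
    return not installed_xrdp_sinks(lib_root)
```

The lib_root parameter exists only so the check can be exercised against a scratch directory.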

How audio flows

MATE application
   → PulseAudio (userspace daemon, per-user session)
      → module-xrdp-sink  (installed by pulseaudio-module-xrdp)
         → XRDP audio channel
            → RDP client

PulseAudio is started automatically as part of the MATE session (mate-session launches pulseaudio --start). No additional service unit is required.
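To confirm the chain end to end, check that the running PulseAudio daemon has registered the XRDP sink. The sketch below parses the tab-separated output of pactl list short sinks. The sink name xrdp-sink is an assumption about what the module registers, so adjust it if your build reports a different name.

```python
import subprocess


def sink_names(pactl_output):
    """Extract sink names from `pactl list short sinks` output."""
    names = []
    for line in pactl_output.splitlines():
        fields = line.split("\t")  # index, name, driver, sample spec, state
        if len(fields) >= 2:
            names.append(fields[1])
    return names


def has_xrdp_sink(pactl_output, sink="xrdp-sink"):
    """True when the (assumed) XRDP sink name is among the listed sinks."""
    return sink in sink_names(pactl_output)


def query_sinks():
    """Run pactl inside the user's session and return its raw output."""
    result = subprocess.run(
        ["pactl", "list", "short", "sinks"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

has_xrdp_sink(query_sinks()) must run as the session user inside the RDP session, because PulseAudio runs per user.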


Variables

Variable                    Default                                Description
principal_user              robert                                 Username for the human account the agent runs as
agent_s_rel                 main                                   Git branch/tag to stage from ~/gh/Agent-S
pulseaudio_module_xrdp_rel  devel                                  Git branch/tag to stage from ~/gh/pulseaudio-module-xrdp
agent_s_venv                /home/{{ principal_user }}/env/agents  Path to the Python virtual environment
agent_s_repo                /home/{{ principal_user }}/gh/Agent-S  Extraction path for the Agent-S source

principal_user is set in ansible/inventory/host_vars/larissa.helu.ca.yml; the release branches (agent_s_rel and pulseaudio_module_xrdp_rel) are set in ansible/inventory/group_vars/all/vars.yml.


Environment activation

After deployment, activate the Agent-S environment on the target host:

source ~/.agent_s_env

This activates the virtual environment, sets AGENT_S_HOME and DISPLAY, and exports placeholder values for HF_TOKEN and OPENAI_API_KEY, which must be replaced with real credentials before model inference will work.


Troubleshooting

X server fails to start — undefined symbol: glamor_xv_init

glamoregl is missing or ordered after xorgxrdp in the Module section. Check /etc/X11/xrdp/xorg.conf and ensure Load "glamoregl" appears before Load "xorgxrdp".

Black screen on RDP connect

Confirm .xsession contains exec mate-session and is executable (chmod 0755). Check /var/log/xrdp-sesman.log for session startup errors.
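Both conditions can be checked in one pass. A minimal sketch, assuming the file is the principal user's ~/.xsession; the function name is hypothetical.

```python
import os
from pathlib import Path


def xsession_problems(path):
    """Check that .xsession exists, is executable, and starts MATE."""
    path = Path(path).expanduser()
    if not path.is_file():
        return [f"{path} does not exist"]
    problems = []
    if not os.access(path, os.X_OK):
        problems.append(f"{path} is not executable")
    lines = [line.strip() for line in path.read_text().splitlines()]
    if "exec mate-session" not in lines:
        problems.append(f"{path} does not contain 'exec mate-session'")
    return problems
```

An empty list from xsession_problems("~/.xsession") rules out the session script, leaving /var/log/xrdp-sesman.log as the next place to look.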

No audio in RDP session

Verify module-xrdp-sink.so is installed:

find /usr/lib/pulse-*/modules/ -name 'module-xrdp-sink.so'

If absent, re-run the playbook. If PulseAudio was upgraded, the old .so path will not exist and the build steps will execute automatically.

Accessibility / AT-SPI not working

Confirm the profile script is loaded in the session:

echo $GTK_MODULES   # should include atk-bridge

If empty, verify /etc/profile.d/atspi.sh exists and the session was started via .xsession (not a display manager session that bypasses /etc/profile.d/).