Refactor user management in Ansible playbooks to standardize on keeper_user
- Updated user addition tasks across multiple playbooks (mcp_switchboard, mcpo, neo4j, neo4j_mcp, openwebui, postgresql, rabbitmq, searxng, smtp4dev) to replace references to ansible_user and remote_user with keeper_user.
- Modified PostgreSQL deployment to create directories and manage files under keeper_user's home.
- Enhanced documentation to clarify account taxonomy and usage of keeper_user in playbooks.
- Introduced new deployment for Agent S, including environment setup, desktop environment installation, XRDP configuration, and accessibility support.
- Added staging playbook for preparing release tarballs from local repositories.
- Created templates for XRDP configuration and environment activation scripts.
- Removed obsolete sunwait documentation.
245
docs/agent_s.md
Normal file
@@ -0,0 +1,245 @@
# Ansible Deployment for Agent S

Agent S is a computer-use automation agent that controls a desktop environment via RDP. The deployment configures a full graphical stack on `larissa.helu.ca`: MATE desktop, XRDP, audio redirection via PulseAudio, and an AT-SPI accessibility bridge so agents can introspect the UI tree.

## Host

| Host | Group | Type |
|------|-------|------|
| `larissa.helu.ca` | `agent_s` | Incus container |

## Overview

The deployment installs and configures:

- **Principal user** (`robert`, UID 1000) — the human account the agent operates on behalf of
- **MATE desktop** — chosen for strong AT-SPI accessibility support
- **Firefox** from the Mozilla APT repo (bypasses the snap dependency introduced by Ubuntu)
- **Google Chrome** for browser automation
- **XRDP** with a custom Xorg config pinned to 1024×768 for UI-TARS / Agent-S model compatibility
- **PulseAudio + pulseaudio-module-xrdp** — audio redirection over RDP
- **AT-SPI accessibility stack** — allows agents to query the widget tree
- **Python virtual environment** with the Agent-S package and dependencies
- **Agent-S repository** extracted from a staged release tarball
- **Environment activation script** at `~/.agent_s_env`

## Prerequisites

### Control node

- Ansible 2.12+
- SSH access to `larissa.helu.ca`
- Staged release tarballs in `~/rel/` (produced by `stage.yml`):
  - `~/rel/agent_s_<agent_s_rel>.tar`
  - `~/rel/pulseaudio_module_xrdp_<pulseaudio_module_xrdp_rel>.tar`

### Target host

- Ubuntu 25.04
- Network access to the Mozilla APT repository, the Google Chrome download server, and the Ubuntu package mirrors
- deb-src repositories available (the playbook enables them if absent)

## Staging

Before deploying, stage release tarballs from local git checkouts:

```bash
ansible-playbook ansible/agent_s/stage.yml
```

`stage.yml` runs on localhost and creates two tarballs from the configured git branches:

| Tarball | Source repo | Branch variable |
|---------|-------------|-----------------|
| `agent_s_<rel>.tar` | `~/gh/Agent-S` | `agent_s_rel` |
| `pulseaudio_module_xrdp_<rel>.tar` | `~/gh/pulseaudio-module-xrdp` | `pulseaudio_module_xrdp_rel` |

Both variables take their defaults from the `all` group vars (`agent_s_rel: main`, `pulseaudio_module_xrdp_rel: devel`).

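
The staging playbook itself is not reproduced in this document. As a rough shell equivalent, and assuming that staging amounts to something like `git archive` on the configured branch (an assumption, not confirmed by `stage.yml`), the operation could be sketched as follows. The `stage_tarball` helper and its argument order are hypothetical.

```bash
# Hypothetical helper, not part of the playbooks: archive one branch or tag of
# a local checkout into <dest>/<name>_<rel>.tar, following the naming above.
# Assumes staging is equivalent to `git archive`; stage.yml may differ.
stage_tarball() {
    name="$1"               # tarball prefix, e.g. agent_s
    repo="$2"               # local checkout, e.g. ~/gh/Agent-S
    rel="$3"                # branch or tag, e.g. main
    dest="${4:-$HOME/rel}"  # absolute output directory
    mkdir -p "$dest" || return 1
    git -C "$repo" archive --format=tar --output="$dest/${name}_${rel}.tar" "$rel"
}
```

Under these assumptions, `stage_tarball agent_s ~/gh/Agent-S main` would write `~/rel/agent_s_main.tar`.
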
## Deployment

```bash
ansible-playbook ansible/agent_s/deploy.yml
```

### Phase 1 — Principal user and snap suppression

Creates the `robert` account (UID 1000) and pins `snapd` at priority -10 via `/etc/apt/preferences.d/nosnap.pref`. This must run before the desktop install so `ubuntu-mate-desktop` cannot pull in snap-packaged Firefox.

### Phase 2 — Firefox

Adds the Mozilla APT repo and pins all `packages.mozilla.org` packages at priority 1000, ensuring the system installs the .deb Firefox rather than the snap.

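
The two pins from Phases 1 and 2 are ordinary APT preferences stanzas. The sketch below prints what such stanzas typically look like; the exact file contents written by the playbook are not shown in this document, so the `Pin:` expressions here are assumptions, and `apt_pin` is a hypothetical helper.

```bash
# Hypothetical helper: print one APT preferences stanza.
# Arguments: package pattern, pin expression, priority.
apt_pin() {
    printf 'Package: %s\nPin: %s\nPin-Priority: %s\n' "$1" "$2" "$3"
}

# Phase 1 (assumed content of nosnap.pref): a negative priority keeps
# snapd from being installed, even as a dependency.
apt_pin 'snapd' 'release a=*' '-10'
echo
# Phase 2 (assumed content): priority 1000 prefers packages whose origin
# is packages.mozilla.org over the distribution's own firefox package.
apt_pin '*' 'origin packages.mozilla.org' '1000'
```

On the target host, `apt-cache policy snapd firefox` reports the priorities APT has actually applied.
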
### Phase 3 — MATE desktop

Installs `ubuntu-mate-desktop`. MATE is preferred over GNOME because its AT-SPI bridge is more reliable in a headless XRDP session.

### Phase 4 — XRDP

Installs `xrdp` and `xorgxrdp`, adds the `xrdp` user to `ssl-cert`, and enables the service. The Xorg configuration is deployed separately (Phase 8).

### Phase 5 — AT-SPI accessibility

Installs the AT-SPI core libraries and adds `/etc/profile.d/atspi.sh` to set:

```bash
GTK_MODULES=gail:atk-bridge
NO_AT_BRIDGE=0
ACCESSIBILITY_ENABLED=1
```

These environment variables are picked up by MATE applications when launched from `.xsession`, making the widget tree available to AT-SPI consumers such as `pyatspi`.

### Phase 6 — PulseAudio and RDP audio

See [Sound device configuration](#sound-device-configuration) below.

### Phase 7 — Packages, Chrome, Python environment, Agent-S

- Installs OCR support (`tesseract-ocr`), Python 3, and assistive tech libraries
- Downloads and installs Google Chrome from the official .deb
- Creates a Python venv at `~/env/agents` with `--system-site-packages` (so `pyatspi` and `python3-gi`, installed system-wide, are available)
- Extracts Agent-S into `~/gh/Agent-S`

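
Whether a venv was created with `--system-site-packages` is recorded in its `pyvenv.cfg`. A small check along these lines can confirm that the Phase 7 venv will see the system-wide `pyatspi` and `python3-gi`; the function name is hypothetical and the default path is taken from the list above.

```bash
# Hypothetical check: succeed if the venv at $1 (default ~/env/agents) was
# created with --system-site-packages, as recorded in its pyvenv.cfg.
venv_sees_system_packages() {
    venv="${1:-$HOME/env/agents}"
    grep -Eq '^include-system-site-packages[ ]*=[ ]*true' "$venv/pyvenv.cfg"
}
```

For example, `venv_sees_system_packages && echo "system packages visible"` on the target host.
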
### Phase 8 — XRDP Xorg configuration and session

Deploys `xorg.conf.j2` to `/etc/X11/xrdp/xorg.conf` and writes `.xsession` to start MATE:

```
exec mate-session
```

Any change to the Xorg config triggers the `restart xrdp` handler.

---

## X Server / RDP configuration

The Xorg config is templated from `ansible/agent_s/xorg.conf.j2` and deployed to `/etc/X11/xrdp/xorg.conf`.

### Resolution

The display is pinned to **1024×768**. UI-TARS and similar vision–language models are trained on screenshots at this resolution; higher resolutions degrade accuracy. Fallback modelines for 800×600 and 640×480 satisfy the `xrdpdev` driver's internal requirements but are never selected in normal operation.

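
To confirm that a running session really uses the pinned resolution, the output of `xdpyinfo` can be checked. The sketch below assumes `xdpyinfo` is installed on the target host (not stated in this document) and that the session is on the display the agent uses; `is_1024x768` is a hypothetical helper that reads `xdpyinfo` output on standard input.

```bash
# Hypothetical check: read `xdpyinfo` output on stdin and succeed only if the
# reported screen dimensions are 1024x768.
is_1024x768() {
    awk '$1 == "dimensions:" { ok = ($2 == "1024x768") } END { exit !ok }'
}
```

A typical call would be `DISPLAY=:10.0 xdpyinfo | is_1024x768 && echo "resolution ok"`.
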
### Module loading order

```
Load "glamoregl"   # must precede xorgxrdp
Load "xorgxrdp"
```

`xorgxrdp 0.10.2` (shipped in Ubuntu 25.04) depends on the symbol `glamor_xv_init`, which lives in `libglamoregl.so`. If `glamoregl` is not loaded first, `libxorgxrdp.so` fails with:

```
undefined symbol: glamor_xv_init
```

This cascades — `xrdpdev`, `xrdpmouse`, and `xrdpkeyb` all fail because they depend on symbols exported by `libxorgxrdp.so`.

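
The ordering requirement is easy to verify mechanically. As a minimal sketch, the hypothetical `check_module_order` function below scans an Xorg config for the two `Load` lines and succeeds only if both are present and `glamoregl` comes first.

```bash
# Hypothetical check: succeed if `Load "glamoregl"` appears before
# `Load "xorgxrdp"` in the given config (default: the deployed xorg.conf).
check_module_order() {
    conf="${1:-/etc/X11/xrdp/xorg.conf}"
    awk '
        /^[ \t]*Load[ \t]+"glamoregl"/ { if (!g) g = NR }  # first glamoregl line
        /^[ \t]*Load[ \t]+"xorgxrdp"/  { if (!x) x = NR }  # first xorgxrdp line
        END { exit !(g && x && g < x) }
    ' "$conf"
}
```

For example, `check_module_order || echo "glamoregl missing or loaded too late"`.
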
### Device section

The Device section uses only the `xrdpdev` virtual framebuffer driver with no DRM/GPU options:

```
Section "Device"
    Identifier "Video Card (xrdpdev)"
    Driver "xrdpdev"
EndSection
```

DRM options (`DRMDevice`, `DRI3`, `DRMAllowList`) are not applicable to the virtual framebuffer and were the source of a previous misconfiguration on `larissa`.

### Display variable

The agent environment sets `DISPLAY=:10.0` (via `~/.agent_s_env`). XRDP assigns display numbers starting at `:10` by default.

---

## Sound device configuration

RDP audio redirection requires two components:

1. **PulseAudio** — the userspace sound server
2. **pulseaudio-module-xrdp** — a PulseAudio sink/source module that forwards audio to the RDP client

### Build process

`pulseaudio-module-xrdp` must be compiled against the PulseAudio headers matching the exact version running on the target host. The playbook automates this:

1. Enables `deb-src` entries in `/etc/apt/sources.list.d/ubuntu.sources`
2. Runs `apt-get source pulseaudio` into `/usr/local/src/`
3. Generates `config.h` by running `meson setup build` in the PulseAudio source tree
4. Extracts `pulseaudio-module-xrdp` from the staged tarball into `/usr/local/src/pulseaudio-module-xrdp/`
5. Runs `./bootstrap && ./configure PULSE_DIR=<pulse-src> && make && make install`

Steps 4–5 are skipped if `module-xrdp-sink.so` is already present under `/usr/lib/pulse-*/modules/`. Re-running the playbook after a PulseAudio upgrade will trigger a rebuild because the old `.so` won't be found at the new versioned path.

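
The skip condition for steps 4–5 can be expressed as a small shell predicate. This is only a sketch of the idea; how the playbook implements the check is not shown here, and `xrdp_sink_installed` is a hypothetical name.

```bash
# Hypothetical predicate: succeed if module-xrdp-sink.so exists in any
# versioned pulse-*/modules directory below $1 (default /usr/lib).
xrdp_sink_installed() {
    root="${1:-/usr/lib}"
    found="$(find "$root" -path '*/pulse-*/modules/module-xrdp-sink.so' 2>/dev/null | head -n 1)"
    [ -n "$found" ]
}
```

A build wrapper could then run `xrdp_sink_installed || echo "rebuild needed"` before deciding whether to compile the module.
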
### How audio flows

```
MATE application
→ PulseAudio (userspace daemon, per-user session)
→ module-xrdp-sink (installed by pulseaudio-module-xrdp)
→ XRDP audio channel
→ RDP client
```

PulseAudio is started automatically as part of the MATE session (`mate-session` launches `pulseaudio --start`). No additional service unit is required.

---

## Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `principal_user` | `robert` | Username for the human account the agent runs as |
| `agent_s_rel` | `main` | Git branch/tag to stage from `~/gh/Agent-S` |
| `pulseaudio_module_xrdp_rel` | `devel` | Git branch/tag to stage from `~/gh/pulseaudio-module-xrdp` |
| `agent_s_venv` | `/home/{{ principal_user }}/env/agents` | Path to the Python virtual environment |
| `agent_s_repo` | `/home/{{ principal_user }}/gh/Agent-S` | Extraction path for the Agent-S source |

`principal_user` is set in `ansible/inventory/host_vars/larissa.helu.ca.yml`; the release branches are set in `ansible/inventory/group_vars/all/vars.yml`.

---

## Environment activation

After deployment, activate the Agent-S environment on the target host:

```bash
source ~/.agent_s_env
```

This activates the virtual environment, sets `AGENT_S_HOME` and `DISPLAY`, and exports placeholder values for `HF_TOKEN` and `OPENAI_API_KEY`, which must be replaced with real credentials before model inference.

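
After sourcing the script, it can be useful to confirm that the expected variables are present. The hypothetical `missing_agent_vars` helper below prints the name of each variable that is unset or empty. It cannot tell a placeholder from a real credential, because the placeholder text is not specified in this document.

```bash
# Hypothetical helper: print the name of every expected variable that is unset
# or empty in the current shell. Prints nothing when all four are set.
missing_agent_vars() {
    for name in AGENT_S_HOME DISPLAY HF_TOKEN OPENAI_API_KEY; do
        eval "value=\${$name:-}"
        [ -n "$value" ] || printf '%s\n' "$name"
    done
}
```

For example, `source ~/.agent_s_env && missing_agent_vars` should print nothing once the environment is complete.
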

---

## Troubleshooting

### X server fails to start — `undefined symbol: glamor_xv_init`

`glamoregl` is missing or ordered after `xorgxrdp` in the Module section. Check `/etc/X11/xrdp/xorg.conf` and ensure `Load "glamoregl"` appears before `Load "xorgxrdp"`.

### Black screen on RDP connect

Confirm `.xsession` contains `exec mate-session` and is executable (`chmod 0755`). Check `/var/log/xrdp-sesman.log` for session startup errors.

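
Both `.xsession` conditions can be checked in one step. The function below is a minimal sketch with a hypothetical name; it assumes the `exec mate-session` line starts at the beginning of a line, as in the Phase 8 snippet.

```bash
# Hypothetical check: succeed if the session file (default ~/.xsession) is
# executable and contains a line starting with `exec mate-session`.
check_xsession() {
    file="${1:-$HOME/.xsession}"
    [ -x "$file" ] && grep -q '^exec mate-session' "$file"
}
```

For example, `check_xsession || echo ".xsession is missing, not executable, or does not start MATE"`.
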
### No audio in RDP session

Verify `module-xrdp-sink.so` is installed:

```bash
find /usr/lib/pulse-*/modules/ -name 'module-xrdp-sink.so'
```

If absent, re-run the playbook. If PulseAudio was upgraded, the old `.so` path will not exist and the build steps will execute automatically.

### Accessibility / AT-SPI not working

Confirm the profile script is loaded in the session:

```bash
echo $GTK_MODULES   # should include atk-bridge
```

If empty, verify `/etc/profile.d/atspi.sh` exists and the session was started via `.xsession` (not a display manager session that bypasses `/etc/profile.d/`).

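
For scripted checks, the same test can be written as a predicate instead of reading the `echo` output by eye. This is a sketch only; `has_atk_bridge` is a hypothetical name, and it treats `GTK_MODULES` as a colon-separated list, as in the Phase 5 value.

```bash
# Hypothetical predicate: succeed if the colon-separated module list in $1
# contains an entry that is exactly `atk-bridge`.
has_atk_bridge() {
    case ":${1:-}:" in
        *:atk-bridge:*) return 0 ;;
        *)              return 1 ;;
    esac
}
```

Inside the RDP session, `has_atk_bridge "$GTK_MODULES" || echo "atk-bridge not loaded"` reports the failure case.
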