@abrkn
Created February 2, 2026 10:42
OpenClaw X11 tools

xdotool / X11 Access

Real mouse and keyboard automation via xdotool, which sends input through the X server rather than through CDP (Chrome DevTools Protocol). Useful for getting past bot detection that blocks CDP clicks.

Setup:

  • Browser container runs an X server on display :1, with socat forwarding it to TCP port 6001 (X11 display :N maps to TCP port 6000+N)
  • Agent and browser containers share desk-net Docker network
  • Browser hostname: desk-browser (IP fallback: 172.18.0.2)
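
The display-to-port relationship in the setup above can be sketched as a small helper. This is a minimal illustration, assuming the standard X11 convention that display :N listens on TCP port 6000+N; the function name x11_tcp_port is hypothetical and not part of the original setup.

```shell
# Hypothetical helper: print the TCP port that an X11 DISPLAY string maps to.
# Assumes the standard convention: display :N listens on TCP port 6000+N.
x11_tcp_port() {
  num="${1##*:}"      # text after the last colon, e.g. "1" or "1.0"
  num="${num%%.*}"    # drop an optional ".screen" suffix
  echo $((6000 + num))
}

x11_tcp_port "desk-browser:1"   # prints 6001
```

This can help when checking that the socat forward is listening on the port the agent expects.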

Usage:

export DISPLAY=desk-browser:1
xdotool getmouselocation
xdotool mousemove 500 300 click 1
xdotool key Return
xdotool type "hello world"

If DNS fails: The container may have been disconnected from desk-net. A timer reconnects it every 5 minutes; if you can't wait, ask Brektimus to run /data/openclaw/agents/desk/setup-x11.sh

IP fallback: DISPLAY=172.18.0.2:1
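
The hostname-then-IP fallback can be automated with a small helper. This is a sketch under assumptions: pick_display is a hypothetical name, and it relies on getent being available in the agent container to test whether the hostname resolves.

```shell
# Hypothetical helper: prefer the hostname, fall back to the IP if it does not resolve.
# Assumes getent exists in the container; if it is missing, the IP fallback is used.
pick_display() {
  host="${1:-desk-browser}"
  fallback_ip="${2:-172.18.0.2}"
  if getent hosts "$host" >/dev/null 2>&1; then
    echo "$host:1"
  else
    echo "$fallback_ip:1"
  fi
}

# Usage:
#   export DISPLAY="$(pick_display)"
```

The fallback only helps while the container is still attached to desk-net; if the network itself is disconnected, the IP is unreachable too.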


X11 Screenshots

Take screenshots of the browser using ImageMagick's import command over X11.

Take a screenshot:

DISPLAY=desk-browser:1 import -window root /tmp/screen.png
cp /tmp/screen.png ./screen.png  # Copy to workspace for reading
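
The two commands above can be wrapped so that a failed capture is noticed instead of leaving a stale or empty file behind. A minimal sketch; take_screenshot is a hypothetical name, and the defaults mirror the paths and display used in this document.

```shell
# Hypothetical wrapper: capture the root window and fail if nothing was written.
take_screenshot() {
  out="${1:-./screen.png}"
  display="${2:-desk-browser:1}"
  # import exits non-zero if it cannot open the X display
  DISPLAY="$display" import -window root "$out" || return 1
  # -s is true only if the file exists and is non-empty
  [ -s "$out" ]
}

# Usage:
#   take_screenshot ./screen.png || echo "screenshot failed" >&2
```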

Read the screenshot:

read("screen.png")  # Returns the image

Why X11 screenshots?

  • CDP screenshots miss native browser popups (file dialogs, permission prompts)
  • X11 captures the actual screen as seen in noVNC

Window Management (xdotool)

Find browser window:

DISPLAY=desk-browser:1 xdotool search --class "chromium"
# Prints one window ID per line; use the one with the larger geometry (usually the second)
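
Picking the window with the larger geometry can be scripted rather than done by eye. This is a sketch, not the author's method: the function names are hypothetical, and it assumes xdotool's getwindowgeometry --shell output, which prints WIDTH= and HEIGHT= assignments.

```shell
# Reads "ID WIDTH HEIGHT" lines on stdin and prints the ID with the largest area.
largest_window() {
  awk '$2 * $3 > best { best = $2 * $3; id = $1 } END { if (id != "") print id }'
}

# Emits "ID WIDTH HEIGHT" for every window of the given class.
list_window_sizes() {
  for id in $(xdotool search --class "$1"); do
    # --shell prints WINDOW=, X=, Y=, WIDTH=, HEIGHT=, SCREEN= assignments;
    # the subshell keeps those variables from leaking between iterations
    ( eval "$(xdotool getwindowgeometry --shell "$id")"; echo "$id $WIDTH $HEIGHT" )
  done
}

# Usage (needs the X display to be reachable):
#   DISPLAY=desk-browser:1 list_window_sizes chromium | largest_window
```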

Get window geometry:

DISPLAY=desk-browser:1 xdotool getwindowgeometry <window_id>

Resize window:

DISPLAY=desk-browser:1 xdotool windowsize <window_id> 1050 780

Note: getactivewindow doesn't work here because the window manager is minimal. Use search --class instead.
