> ## Documentation Index
> Fetch the complete documentation index at: https://docs.openclaw.ai/llms.txt
> Use this file to discover all available pages before exploring further.

# Browser control API

For setup, configuration, and troubleshooting, see [Browser](/tools/browser).
This page is the reference for the local control HTTP API, the `openclaw browser`
CLI, and scripting patterns (snapshots, refs, waits, debug flows).

## Control API (optional)

For local integrations only, the Gateway exposes a small loopback HTTP API:

* Status/start/stop: `GET /`, `POST /start`, `POST /stop`
* Tabs: `GET /tabs`, `POST /tabs/open`, `POST /tabs/focus`, `DELETE /tabs/:targetId`
* Snapshot/screenshot: `GET /snapshot`, `POST /screenshot`
* Actions: `POST /navigate`, `POST /act`
* Hooks: `POST /hooks/file-chooser`, `POST /hooks/dialog`
* Downloads: `POST /download`, `POST /wait/download`
* Permissions: `POST /permissions/grant`
* Debugging: `GET /console`, `POST /pdf`
* Debugging: `GET /errors`, `GET /requests`, `POST /trace/start`, `POST /trace/stop`, `POST /highlight`
* Network: `POST /response/body`
* State: `GET /cookies`, `POST /cookies/set`, `POST /cookies/clear`
* State: `GET /storage/:kind`, `POST /storage/:kind/set`, `POST /storage/:kind/clear`
* Settings: `POST /set/offline`, `POST /set/headers`, `POST /set/credentials`, `POST /set/geolocation`, `POST /set/media`, `POST /set/timezone`, `POST /set/locale`, `POST /set/device`

All endpoints accept `?profile=<name>`. `POST /start?headless=true` requests a
one-shot headless launch for local managed profiles without changing persisted
browser config; attach-only, remote CDP, and existing-session profiles reject
that override because OpenClaw does not launch those browser processes.

If shared-secret gateway auth is configured, browser HTTP routes require auth too:

* `Authorization: Bearer <gateway token>`
* `x-openclaw-password: <gateway password>` or HTTP Basic auth with that password

Notes:

* This standalone loopback browser API does **not** consume trusted-proxy or
  Tailscale Serve identity headers.
* If `gateway.auth.mode` is `none` or `trusted-proxy`, these loopback browser
  routes do not inherit those identity-bearing modes; keep them loopback-only.

### `/act` error contract

`POST /act` uses a structured error response for route-level validation and
policy failures:

```json theme={"theme":{"light":"min-light","dark":"min-dark"}}
{ "error": "<message>", "code": "ACT_*" }
```

Current `code` values:

* `ACT_KIND_REQUIRED` (HTTP 400): `kind` is missing or unrecognized.
* `ACT_INVALID_REQUEST` (HTTP 400): action payload failed normalization or validation.
* `ACT_SELECTOR_UNSUPPORTED` (HTTP 400): `selector` was used with an unsupported action kind.
* `ACT_EVALUATE_DISABLED` (HTTP 403): `evaluate` (or `wait --fn`) is disabled by config.
* `ACT_TARGET_ID_MISMATCH` (HTTP 403): top-level or batched `targetId` conflicts with request target.
* `ACT_EXISTING_SESSION_UNSUPPORTED` (HTTP 501): action is not supported for existing-session profiles.

Other runtime failures may still return `{ "error": "<message>" }` without a
`code` field.

### Playwright requirement

Some features (navigate/act/AI snapshot/role snapshot, element screenshots,
PDF) require Playwright. If Playwright isn't installed, those endpoints return
a clear 501 error.

What still works without Playwright:

* ARIA snapshots
* Role-style accessibility snapshots (`--interactive`, `--compact`,
  `--depth`, `--efficient`) when a per-tab CDP WebSocket is available. This is
  a fallback for inspection and ref discovery; Playwright remains the primary
  action engine.
* Page screenshots for the managed `openclaw` browser when a per-tab CDP
  WebSocket is available
* Page screenshots for `existing-session` / Chrome MCP profiles
* `existing-session` ref-based screenshots (`--ref`) from snapshot output

What still needs Playwright:

* `navigate`
* `act`
* AI snapshots that depend on Playwright's native AI snapshot format
* CSS-selector element screenshots (`--element`)
* full browser PDF export

Element screenshots also reject `--full-page`; the route returns `fullPage is
not supported for element screenshots`.

If you see `Playwright is not available in this gateway build`, the packaged
Gateway is missing the core browser runtime dependency. Reinstall or update
OpenClaw, then restart the gateway. For Docker, also install the Chromium
browser binaries as shown below.

#### Docker Playwright install

If your Gateway runs in Docker, avoid `npx playwright` (npm override conflicts).
For custom images, bake Chromium into the image:

```bash theme={"theme":{"light":"min-light","dark":"min-dark"}}
OPENCLAW_INSTALL_BROWSER=1 ./scripts/docker/setup.sh
```

For an existing image, install through the bundled CLI instead:

```bash theme={"theme":{"light":"min-light","dark":"min-dark"}}
docker compose run --rm openclaw-cli \
  node /app/node_modules/playwright-core/cli.js install chromium
```

To persist browser downloads, set `PLAYWRIGHT_BROWSERS_PATH` (for example,
`/home/node/.cache/ms-playwright`) and make sure `/home/node` is persisted via
`OPENCLAW_HOME_VOLUME` or a bind mount. OpenClaw auto-detects the persisted
Chromium on Linux. See [Docker](/install/docker).

## How it works (internal)

A small loopback control server accepts HTTP requests and connects to Chromium-based browsers via CDP. Advanced actions (click/type/snapshot/PDF) go through Playwright on top of CDP; when Playwright is missing, only non-Playwright operations are available. The agent sees one stable interface while local/remote browsers and profiles swap freely underneath.

## CLI quick reference

All commands accept `--browser-profile <name>` to target a specific profile, and `--json` for machine-readable output.

<AccordionGroup>
  <Accordion title="Basics: status, tabs, open/focus/close">
    ```bash theme={"theme":{"light":"min-light","dark":"min-dark"}}
    openclaw browser status
    openclaw browser start
    openclaw browser start --headless # one-shot local managed headless launch
    openclaw browser stop            # also clears emulation on attach-only/remote CDP
    openclaw browser tabs
    openclaw browser tab             # shortcut for current tab
    openclaw browser tab new
    openclaw browser tab select 2
    openclaw browser tab close 2
    openclaw browser open https://example.com
    openclaw browser focus abcd1234
    openclaw browser close abcd1234
    ```
  </Accordion>

  <Accordion title="Inspection: screenshot, snapshot, console, errors, requests">
    ```bash theme={"theme":{"light":"min-light","dark":"min-dark"}}
    openclaw browser screenshot
    openclaw browser screenshot --full-page
    openclaw browser screenshot --ref 12        # or --ref e12
    openclaw browser screenshot --labels
    openclaw browser snapshot
    openclaw browser snapshot --format aria --limit 200
    openclaw browser snapshot --interactive --compact --depth 6
    openclaw browser snapshot --efficient
    openclaw browser snapshot --labels
    openclaw browser snapshot --urls
    openclaw browser snapshot --selector "#main" --interactive
    openclaw browser snapshot --frame "iframe#main" --interactive
    openclaw browser console --level error
    openclaw browser errors --clear
    openclaw browser requests --filter api --clear
    openclaw browser pdf
    openclaw browser responsebody "**/api" --max-chars 5000
    ```
  </Accordion>

  <Accordion title="Actions: navigate, click, type, drag, wait, evaluate">
    ```bash theme={"theme":{"light":"min-light","dark":"min-dark"}}
    openclaw browser navigate https://example.com
    openclaw browser resize 1280 720
    openclaw browser click 12 --double           # or e12 for role refs
    openclaw browser click-coords 120 340        # viewport coordinates
    openclaw browser type 23 "hello" --submit
    openclaw browser press Enter
    openclaw browser hover 44
    openclaw browser scrollintoview e12
    openclaw browser drag 10 11
    openclaw browser select 9 OptionA OptionB
    openclaw browser download e12 report.pdf
    openclaw browser waitfordownload report.pdf
    openclaw browser upload /tmp/openclaw/uploads/file.pdf
    openclaw browser fill --fields '[{"ref":"1","type":"text","value":"Ada"}]'
    openclaw browser dialog --accept
    openclaw browser wait --text "Done"
    openclaw browser wait "#main" --url "**/dash" --load networkidle --fn "window.ready===true"
    openclaw browser evaluate --fn '(el) => el.textContent' --ref 7
    openclaw browser highlight e12
    openclaw browser trace start
    openclaw browser trace stop
    ```
  </Accordion>

  <Accordion title="State: cookies, storage, offline, headers, geo, device">
    ```bash theme={"theme":{"light":"min-light","dark":"min-dark"}}
    openclaw browser cookies
    openclaw browser cookies set session abc123 --url "https://example.com"
    openclaw browser cookies clear
    openclaw browser storage local get
    openclaw browser storage local set theme dark
    openclaw browser storage session clear
    openclaw browser set offline on
    openclaw browser set headers --headers-json '{"X-Debug":"1"}'
    openclaw browser set credentials user pass            # --clear to remove
    openclaw browser set geo 37.7749 -122.4194 --origin "https://example.com"
    openclaw browser set media dark
    openclaw browser set timezone America/New_York
    openclaw browser set locale en-US
    openclaw browser set device "iPhone 14"
    ```
  </Accordion>
</AccordionGroup>

Notes:

* `upload` and `dialog` are **arming** calls; run them before the click/press that triggers the chooser/dialog.
* `click`/`type`/etc require a `ref` from `snapshot` (numeric `12`, role ref `e12`, or actionable ARIA ref `ax12`). CSS selectors are intentionally not supported for actions. Use `click-coords` when the visible viewport position is the only reliable target.
* Download, trace, and upload paths are constrained to OpenClaw temp roots: `/tmp/openclaw{,/downloads,/uploads}` (fallback: `${os.tmpdir()}/openclaw/...`).
* `upload` can also set file inputs directly via `--input-ref` or `--element`.

Stable tab ids and labels survive Chromium raw-target replacement when OpenClaw
can prove the replacement tab, such as same URL or a single old tab becoming a
single new tab after form submission. Raw target ids are still volatile; prefer
`suggestedTargetId` from `tabs` in scripts.

Snapshot flags at a glance:

* `--format ai` (default with Playwright): AI snapshot with numeric refs (`aria-ref="<n>"`).
* `--format aria`: accessibility tree with `axN` refs. When Playwright is available, OpenClaw binds refs with backend DOM ids to the live page so follow-up actions can use them; otherwise treat the output as inspection-only.
* `--efficient` (or `--mode efficient`): compact role snapshot preset. Set `browser.snapshotDefaults.mode: "efficient"` to make this the default (see [Gateway configuration](/gateway/configuration-reference#browser)).
* `--interactive`, `--compact`, `--depth`, `--selector` force a role snapshot with `ref=e12` refs. `--frame "<iframe>"` scopes role snapshots to an iframe.
* `--labels` adds a viewport-only screenshot with overlayed ref labels (prints `MEDIA:<path>`).
* `--urls` appends discovered link destinations to AI snapshots.

## Snapshots and refs

OpenClaw supports two "snapshot" styles:

* **AI snapshot (numeric refs)**: `openclaw browser snapshot` (default; `--format ai`)
  * Output: a text snapshot that includes numeric refs.
  * Actions: `openclaw browser click 12`, `openclaw browser type 23 "hello"`.
  * Internally, the ref is resolved via Playwright's `aria-ref`.

* **Role snapshot (role refs like `e12`)**: `openclaw browser snapshot --interactive` (or `--compact`, `--depth`, `--selector`, `--frame`)
  * Output: a role-based list/tree with `[ref=e12]` (and optional `[nth=1]`).
  * Actions: `openclaw browser click e12`, `openclaw browser highlight e12`.
  * Internally, the ref is resolved via `getByRole(...)` (plus `nth()` for duplicates).
  * Add `--labels` to include a viewport screenshot with overlayed `e12` labels.
  * Add `--urls` when link text is ambiguous and the agent needs concrete
    navigation targets.

* **ARIA snapshot (ARIA refs like `ax12`)**: `openclaw browser snapshot --format aria`
  * Output: the accessibility tree as structured nodes.
  * Actions: `openclaw browser click ax12` works when the snapshot path can bind
    the ref through Playwright and Chrome backend DOM ids.

* If Playwright is unavailable, ARIA snapshots can still be useful for
  inspection, but refs may not be actionable. Re-snapshot with `--format ai`
  or `--interactive` when you need action refs.

* Docker proof for the raw-CDP fallback path: `pnpm test:docker:browser-cdp-snapshot`
  starts Chromium with CDP, runs `browser doctor --deep`, and verifies role
  snapshots include link URLs, cursor-promoted clickables, and iframe metadata.

Ref behavior:

* Refs are **not stable across navigations**; if something fails, re-run `snapshot` and use a fresh ref.
* `/act` returns the current raw `targetId` after action-triggered replacement
  when it can prove the replacement tab. Keep using stable tab ids/labels for
  follow-up commands.
* If the role snapshot was taken with `--frame`, role refs are scoped to that iframe until the next role snapshot.
* Unknown or stale `axN` refs fail fast instead of falling through to
  Playwright's `aria-ref` selector. Run a fresh snapshot on the same tab when
  that happens.

## Wait power-ups

You can wait on more than just time/text:

* Wait for URL (globs supported by Playwright):
  * `openclaw browser wait --url "**/dash"`
* Wait for load state:
  * `openclaw browser wait --load networkidle`
* Wait for a JS predicate:
  * `openclaw browser wait --fn "window.ready===true"`
* Wait for a selector to become visible:
  * `openclaw browser wait "#main"`

These can be combined:

```bash theme={"theme":{"light":"min-light","dark":"min-dark"}}
openclaw browser wait "#main" \
  --url "**/dash" \
  --load networkidle \
  --fn "window.ready===true" \
  --timeout-ms 15000
```

## Debug workflows

When an action fails (e.g. "not visible", "strict mode violation", "covered"):

1. `openclaw browser snapshot --interactive`
2. Use `click <ref>` / `type <ref>` (prefer role refs in interactive mode)
3. If it still fails: `openclaw browser highlight <ref>` to see what Playwright is targeting
4. If the page behaves oddly:
   * `openclaw browser errors --clear`
   * `openclaw browser requests --filter api --clear`
5. For deep debugging: record a trace:
   * `openclaw browser trace start`
   * reproduce the issue
   * `openclaw browser trace stop` (prints `TRACE:<path>`)

## JSON output

`--json` is for scripting and structured tooling.

Examples:

```bash theme={"theme":{"light":"min-light","dark":"min-dark"}}
openclaw browser status --json
openclaw browser snapshot --interactive --json
openclaw browser requests --filter api --json
openclaw browser cookies --json
```

Role snapshots in JSON include `refs` plus a small `stats` block (lines/chars/refs/interactive) so tools can reason about payload size and density.

## State and environment knobs

These are useful for "make the site behave like X" workflows:

* Cookies: `cookies`, `cookies set`, `cookies clear`
* Storage: `storage local|session get|set|clear`
* Offline: `set offline on|off`
* Headers: `set headers --headers-json '{"X-Debug":"1"}'` (legacy `set headers --json '{"X-Debug":"1"}'` remains supported)
* HTTP basic auth: `set credentials user pass` (or `--clear`)
* Geolocation: `set geo <lat> <lon> --origin "https://example.com"` (or `--clear`)
* Media: `set media dark|light|no-preference|none`
* Timezone / locale: `set timezone ...`, `set locale ...`
* Device / viewport:
  * `set device "iPhone 14"` (Playwright device presets)
  * `set viewport 1280 720`

## Security and privacy

* The openclaw browser profile may contain logged-in sessions; treat it as sensitive.
* `browser act kind=evaluate` / `openclaw browser evaluate` and `wait --fn`
  execute arbitrary JavaScript in the page context. Prompt injection can steer
  this. Disable it with `browser.evaluateEnabled=false` if you do not need it.
* For logins and anti-bot notes (X/Twitter, etc.), see [Browser login + X/Twitter posting](/tools/browser-login).
* Keep the Gateway/node host private (loopback or tailnet-only).
* Remote CDP endpoints are powerful; tunnel and protect them.

Strict-mode example (block private/internal destinations by default):

```json5 theme={"theme":{"light":"min-light","dark":"min-dark"}}
{
  browser: {
    ssrfPolicy: {
      dangerouslyAllowPrivateNetwork: false,
      hostnameAllowlist: ["*.example.com", "example.com"],
      allowedHostnames: ["localhost"], // optional exact allow
    },
  },
}
```

## Related

* [Browser](/tools/browser) - overview, configuration, profiles, security
* [Browser login](/tools/browser-login) - signing in to sites
* [Browser Linux troubleshooting](/tools/browser-linux-troubleshooting)
* [Browser WSL2 troubleshooting](/tools/browser-wsl2-windows-remote-cdp-troubleshooting)
