The web_fetch tool does a plain HTTP GET and extracts readable content
(HTML to markdown or text). It does not execute JavaScript.
For JS-heavy sites or login-protected pages, use the
Web Browser instead.
Quick start
web_fetch is enabled by default — no configuration needed. The agent can
call it immediately:
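As an illustration, the arguments of such a call might look like the following. This is only a sketch: the WebFetchArgs type, the isHttpUrl helper, and the field names url and maxChars are assumptions inferred from the parameter descriptions on this page, not a confirmed schema.

```typescript
// Hypothetical argument shape for a web_fetch call. The field names are
// assumptions based on the parameter descriptions in this page.
interface WebFetchArgs {
  url: string;       // URL to fetch, http(s) only
  maxChars?: number; // truncate output to this many characters
}

// Returns true when the URL uses the http or https scheme.
function isHttpUrl(url: string): boolean {
  return /^https?:\/\//i.test(url);
}

// Example arguments an agent might pass (placeholder values).
const exampleArgs: WebFetchArgs = {
  url: "https://example.com/article",
  maxChars: 20000,
};
```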
Tool parameters
URL to fetch. http(s) only.
Output format after main-content extraction.
Truncate output to this many characters.
How it works
Fetch
Sends an HTTP GET with a Chrome-like User-Agent and Accept-Language
header. Blocks private/internal hostnames and re-checks redirects.
Fallback (optional)
If Readability fails and Firecrawl is configured, retries through the
Firecrawl API with bot-circumvention mode.
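The "blocks private/internal hostnames and re-checks redirects" behaviour can be pictured as a loop that validates every hop before requesting it. The sketch below is not OpenClaw's implementation: isPrivateHost, hostOf, resolveRedirects, and the Hop type are hypothetical names, the blocklist is deliberately simplified, and redirect targets are assumed to be absolute URLs.

```typescript
// Illustrative blocklist only; the real rules are not described on this page.
function isPrivateHost(host: string): boolean {
  const h = host.toLowerCase();
  if (h === "localhost" || h.endsWith(".localhost") ||
      h.endsWith(".internal") || h.endsWith(".local")) {
    return true;
  }
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(h);
  if (!m) return false;
  const a = Number(m[1]);
  const b = Number(m[2]);
  return (
    a === 10 || a === 127 ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168) ||
    (a === 169 && b === 254)
  );
}

// Extracts the hostname from an http(s) URL, skipping any userinfo.
// Returns null for other input; bracketed IPv6 literals are rejected
// outright to keep the sketch short.
function hostOf(url: string): string | null {
  const m = /^https?:\/\/(?:[^\/?#@]*@)?([^\/:?#\[\]]+)/i.exec(url);
  return m ? m[1] : null;
}

// One response in a redirect chain (hypothetical shape).
type Hop = { status: number; location?: string };

// Follows redirects by hand, re-checking the hostname before every request
// and giving up after maxRedirects redirects. Returns the final URL.
function resolveRedirects(
  startUrl: string,
  fetchHop: (url: string) => Hop,
  maxRedirects: number,
): string {
  let url = startUrl;
  for (let i = 0; i <= maxRedirects; i++) {
    const host = hostOf(url);
    if (host === null || isPrivateHost(host)) {
      throw new Error(`blocked host in URL: ${url}`);
    }
    const hop = fetchHop(url);
    const isRedirect = hop.status >= 300 && hop.status < 400;
    if (!isRedirect || !hop.location) return url;
    url = hop.location; // assumed absolute; checked on the next iteration
  }
  throw new Error(`more than ${maxRedirects} redirects`);
}
```

Checking each hop matters because a public URL can redirect to an internal address; validating only the first URL would miss that case.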
Config
Firecrawl fallback
If Readability extraction fails, web_fetch can fall back to
Firecrawl for bot-circumvention and better extraction:
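A minimal sketch of the relevant settings, built only from key paths named in this section. It is written as a TypeScript object literal purely for illustration; the on-disk config format is not shown here, the provider value "firecrawl" is an assumption, and the API key is a placeholder.

```typescript
// Hypothetical config fragment. Key paths come from this section; the
// values are placeholders and the provider id is an assumption.
const webFetchFirecrawlConfig = {
  tools: {
    web: {
      fetch: {
        provider: "firecrawl", // assumed id for the bundled Firecrawl provider
      },
    },
  },
  plugins: {
    entries: {
      firecrawl: {
        config: {
          webFetch: {
            apiKey: "YOUR_FIRECRAWL_API_KEY", // placeholder; SecretRef objects are also supported
          },
        },
      },
    },
  },
};
```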
plugins.entries.firecrawl.config.webFetch.apiKey supports SecretRef objects.
Legacy tools.web.fetch.firecrawl.* config is auto-migrated by openclaw doctor --fix.
If Firecrawl is enabled and its SecretRef is unresolved with no
FIRECRAWL_API_KEY env fallback, gateway startup fails fast.
Firecrawl baseUrl overrides are locked down: hosted traffic uses
https://api.firecrawl.dev; self-hosted overrides must target private or
internal endpoints, and http:// is accepted only for those private targets.
- tools.web.fetch.provider selects the fetch fallback provider explicitly.
- If provider is omitted, OpenClaw auto-detects the first ready web-fetch provider from available credentials. Non-sandboxed web_fetch can use installed plugins that declare contracts.webFetchProviders and register a matching provider at runtime. Today the bundled provider is Firecrawl.
- Sandboxed web_fetch calls stay limited to bundled providers.
- If Readability is disabled, web_fetch skips straight to the selected provider fallback. If no provider is available, it fails closed.
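The selection rules above can be sketched as two small functions. Everything here is hypothetical naming (Provider, selectProvider, planExtraction); treating "first ready" as array order, and returning no provider when an explicitly selected one is not ready, are assumptions rather than documented behaviour.

```typescript
// Hypothetical provider record.
interface Provider {
  id: string;
  bundled: boolean; // ships with OpenClaw rather than coming from an installed plugin
  ready: boolean;   // credentials are available
}

// Picks the fallback provider: sandboxed calls only see bundled providers;
// an explicit id wins; otherwise the first ready provider is used.
function selectProvider(
  providers: Provider[],
  opts: { explicit?: string; sandboxed: boolean },
): Provider | null {
  const eligible = providers.filter((p) => !opts.sandboxed || p.bundled);
  if (opts.explicit !== undefined) {
    // Assumption: an explicit provider that is not ready yields no provider.
    return eligible.find((p) => p.id === opts.explicit && p.ready) ?? null;
  }
  return eligible.find((p) => p.ready) ?? null;
}

// Returns the ordered extraction steps, failing closed when Readability is
// disabled and no provider is available.
function planExtraction(readabilityEnabled: boolean, provider: Provider | null): string[] {
  if (readabilityEnabled) {
    return provider ? ["readability", provider.id] : ["readability"];
  }
  if (provider === null) {
    throw new Error("no web-fetch provider available");
  }
  return [provider.id];
}
```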
Trusted env proxy
If your deployment requires web_fetch to go through a trusted outbound
HTTP(S) proxy, set tools.web.fetch.useTrustedEnvProxy: true.
In this mode, OpenClaw still applies hostname-based SSRF checks before sending
the request, but it lets the proxy resolve DNS instead of doing local DNS
pinning. Enable this only when the proxy is operator-controlled and enforces
outbound policy after DNS resolution.
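The choice between the proxy path and the strict path can be sketched as a pure function of the setting, the environment, and the target host. This is an assumption-laden illustration: choosePath and matchesNoProxy are hypothetical names, the exact proxy variable names consulted (HTTPS_PROXY and HTTP_PROXY here) are guesses, and the NO_PROXY matching is a simplified suffix match. The hostname-based SSRF check that runs in both modes is not shown.

```typescript
type Env = Record<string, string | undefined>;

// Simplified NO_PROXY matching: comma-separated entries, "*" matches all,
// otherwise exact host or domain-suffix match.
function matchesNoProxy(host: string, noProxy: string | undefined): boolean {
  if (!noProxy) return false;
  const h = host.toLowerCase();
  return noProxy
    .split(",")
    .map((entry) => entry.trim().toLowerCase().replace(/^\./, ""))
    .filter((entry) => entry.length > 0)
    .some((entry) => entry === "*" || h === entry || h.endsWith("." + entry));
}

// "proxy": let the trusted proxy resolve DNS.
// "strict": normal path with local DNS pinning.
function choosePath(host: string, useTrustedEnvProxy: boolean, env: Env): "proxy" | "strict" {
  if (!useTrustedEnvProxy) return "strict";
  const proxy = env["HTTPS_PROXY"] || env["HTTP_PROXY"];
  if (!proxy) return "strict"; // no proxy configured
  if (matchesNoProxy(host, env["NO_PROXY"])) return "strict"; // host excluded
  return "proxy";
}
```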
If no HTTP(S) proxy env var is configured, or the target host is excluded by
NO_PROXY, web_fetch falls back to the normal strict path with local DNS
pinning.
Limits and safety
- maxChars is clamped to tools.web.fetch.maxCharsCap
- Response body is capped at maxResponseBytes before parsing; oversized responses are truncated with a warning
- Private/internal hostnames are blocked
- tools.web.fetch.ssrfPolicy.allowRfc2544BenchmarkRange and tools.web.fetch.ssrfPolicy.allowIpv6UniqueLocalRange are narrow opt-ins for trusted fake-IP proxy stacks; leave them unset unless your proxy owns those synthetic ranges and enforces its own destination policy
- Redirects are checked and limited by maxRedirects
- useTrustedEnvProxy is an explicit opt-in and should only be enabled for operator-controlled proxies that still enforce outbound policy after DNS resolution
- web_fetch is best-effort; some sites need the Web Browser
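The two size limits in the list can be sketched as follows. The function names, the fallback to a default when maxChars is missing or invalid, the lower bound of 1, and the warning text are all assumptions made for illustration; only the clamping to a cap and the truncation with a warning come from this page.

```typescript
// Clamps a requested maxChars to the configured cap (maxCharsCap).
// Assumption: a missing or non-finite request falls back to defaultMax.
function clampMaxChars(requested: number | undefined, defaultMax: number, cap: number): number {
  const wanted =
    requested !== undefined && Number.isFinite(requested) ? requested : defaultMax;
  return Math.max(1, Math.min(Math.floor(wanted), cap));
}

// Truncates extracted text to maxChars characters.
function truncateText(text: string, maxChars: number): { text: string; truncated: boolean } {
  if (text.length <= maxChars) return { text, truncated: false };
  return { text: text.slice(0, maxChars), truncated: true };
}

// Caps the raw response body before parsing and reports a warning when
// bytes were dropped.
function capBody(
  body: Uint8Array,
  maxResponseBytes: number,
): { body: Uint8Array; warning?: string } {
  if (body.length <= maxResponseBytes) return { body };
  return {
    body: body.subarray(0, maxResponseBytes),
    warning: `response truncated to ${maxResponseBytes} bytes`,
  };
}
```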
Tool profiles
If you use tool profiles or allowlists, add web_fetch or group:web.
Related
- Web Search — search the web with multiple providers
- Web Browser — full browser automation for JS-heavy sites
- Firecrawl — Firecrawl search and scrape tools