推理 CLI

openclaw infer 是 OpenClaw 中由提供商支持的推理工作流的规范无头接口。它有意暴露能力族，而不是原始 Gateway 网关 RPC 名称，也不是原始智能体工具 ID。

将 infer 转成一个 Skill

将下面内容复制并粘贴给智能体：

Read https://docs.openclaw.ai/cli/infer, then create a skill that routes my common workflows to `openclaw infer`.
Focus on model runs, image generation, video generation, audio transcription, TTS, web search, and embeddings.

一个优秀的基于 infer 的 Skill 应该：

将常见用户意图映射到正确的 infer 子命令
为它覆盖的工作流包含几个规范的 infer 示例
在示例和建议中优先使用 openclaw infer ...
避免在 Skill 正文中重新记录整个 infer 接口

典型的面向 infer 的 Skill 覆盖范围：

openclaw infer model run
openclaw infer image generate
openclaw infer audio transcribe
openclaw infer tts convert
openclaw infer web search
openclaw infer embedding create

为什么使用 infer

openclaw infer 为 OpenClaw 内由提供商支持的推理任务提供一个一致的 CLI。优点：

使用 OpenClaw 中已经配置好的提供商和模型，而不是为每个后端接入一次性的包装器。
将模型、图像、音频转录、TTS、视频、网页和嵌入工作流放在同一个命令树下。
为脚本、自动化和智能体驱动的工作流使用稳定的 --json 输出结构。
当任务本质上是“运行推理”时，优先使用 OpenClaw 的第一方接口。
对大多数 infer 命令使用普通本地路径，无需 Gateway 网关。

对于端到端提供商检查，较低层级的提供商测试通过后，优先使用 openclaw infer ...。它会在发起提供商请求之前，执行已发布的 CLI、配置加载、默认智能体解析、内置插件激活以及共享能力运行时。

命令树

 openclaw infer
  list
  inspect

  model
    run
    list
    inspect
    providers
    auth login
    auth logout
    auth status

  image
    generate
    edit
    describe
    describe-many
    providers

  audio
    transcribe
    providers

  tts
    convert
    voices
    providers
    status
    enable
    disable
    set-provider

  video
    generate
    describe
    providers

  web
    search
    fetch
    providers

  embedding
    create
    providers

常见任务

此表将常见推理任务映射到对应的 infer 命令。

任务	命令	备注
运行文本/模型提示词	`openclaw infer model run --prompt "..." --json`	默认使用普通本地路径
在图像上运行模型提示词	`openclaw infer model run --prompt "Describe this" --file ./image.png --model provider/model`	对多个图像输入重复使用 `--file`
生成图像	`openclaw infer image generate --prompt "..." --json`	从现有文件开始时使用 `image edit`
描述图像文件	`openclaw infer image describe --file ./image.png --prompt "..." --json`	`--model` 必须是支持图像的 `<provider/model>`
转录音频	`openclaw infer audio transcribe --file ./memo.m4a --json`	`--model` 必须是 `<provider/model>`
合成语音	`openclaw infer tts convert --text "..." --output ./speech.mp3 --json`	`tts status` 面向 Gateway 网关
生成视频	`openclaw infer video generate --prompt "..." --json`	支持 `--resolution` 等提供商提示
描述视频文件	`openclaw infer video describe --file ./clip.mp4 --json`	`--model` 必须是 `<provider/model>`
搜索网页	`openclaw infer web search --query "..." --json`
获取网页	`openclaw infer web fetch --url https://example.com --json`
创建嵌入	`openclaw infer embedding create --text "..." --json`

行为

openclaw infer ... 是这些工作流的主要 CLI 接口。
当输出会被另一个命令或脚本消费时，使用 --json。
当需要特定后端时，使用 --provider 或 --model provider/model。
对于 image describe、audio transcribe 和 video describe，--model 必须使用 <provider/model> 形式。
对于 image describe，显式 --model 会直接运行该提供商/模型。该模型必须在模型目录或提供商配置中支持图像。codex/<model> 会运行有边界的 Codex app-server 图像理解轮次；openai-codex/<model> 使用 OpenAI Codex OAuth 提供商路径。
无状态执行命令默认使用本地。
Gateway 网关管理的状态命令默认使用 Gateway 网关。
普通本地路径不要求 Gateway 网关正在运行。
本地 model run 是精简的一次性提供商补全。它会解析已配置的智能体模型和凭证，但不会启动聊天智能体轮次、加载工具或打开内置 MCP 服务器。
model run --file 接受图像文件，检测它们的 MIME 类型，并将它们与提供的提示词一起发送到所选模型。对多个图像重复使用 --file。
model run --file 会拒绝非图像输入。对音频文件使用 infer audio transcribe，对视频文件使用 infer video describe。
model run --gateway 会执行 Gateway 网关路由、已保存凭证、提供商选择和嵌入式运行时，但仍作为原始模型探测运行：它会发送提供的提示词和任何图像附件，不包含先前的会话转录、bootstrap/AGENTS 上下文、上下文引擎装配、工具或内置 MCP 服务器。
model run --gateway --model <provider/model> 需要可信操作员 Gateway 网关凭据，因为该请求要求 Gateway 网关运行一次性提供商/模型覆盖。

模型

使用 model 进行由提供商支持的文本推理以及模型/提供商检查。

openclaw infer model run --prompt "Reply with exactly: smoke-ok" --json
openclaw infer model run --prompt "Summarize this changelog entry" --model openai/gpt-5.4 --json
openclaw infer model run --prompt "Describe this image in one sentence" --file ./photo.jpg --model google/gemini-2.5-flash --json
openclaw infer model providers --json
openclaw infer model inspect --name gpt-5.5 --json

使用完整的 <provider/model> 引用来冒烟测试特定提供商，而无需启动 Gateway 网关或加载完整智能体工具接口：

openclaw infer model run --local --model anthropic/claude-sonnet-4-6 --prompt "Reply with exactly: pong" --json
openclaw infer model run --local --model cerebras/zai-glm-4.7 --prompt "Reply with exactly: pong" --json
openclaw infer model run --local --model google/gemini-2.5-flash --prompt "Reply with exactly: pong" --json
openclaw infer model run --local --model groq/llama-3.1-8b-instant --prompt "Reply with exactly: pong" --json
openclaw infer model run --local --model mistral/mistral-small-latest --prompt "Reply with exactly: pong" --json
openclaw infer model run --local --model openai/gpt-4.1 --prompt "Reply with exactly: pong" --json
openclaw infer model run --local --model ollama/qwen2.5vl:7b --prompt "Describe this image." --file ./photo.jpg --json

备注：

本地 model run 是检查提供商/模型/凭证健康状态的最窄 CLI 冒烟测试，因为它只向所选模型发送提供的提示词。
本地 model run --file 保持该精简路径，并将图像内容直接附加到单条用户消息。当 MIME 类型被检测为 image/* 时，PNG、JPEG 和 WebP 等常见图像文件可用；不受支持或无法识别的文件会在调用提供商之前失败。
当你想直接测试所选多模态文本模型时，model run --file 最合适。当你想使用 OpenClaw 的图像理解提供商选择和默认图像模型路由时，使用 infer image describe。
所选模型必须支持图像输入；纯文本模型可能会在提供商层拒绝请求。
model run --prompt 必须包含非空白文本；空提示词会在调用本地提供商或 Gateway 网关之前被拒绝。
当提供商没有返回文本输出时，本地 model run 会以非零状态退出，因此不可达的本地提供商和空补全不会看起来像成功探测。
当你需要测试 Gateway 网关路由、智能体运行时设置或 Gateway 网关管理的提供商状态，同时保持模型输入为原始内容时，使用 model run --gateway。当你需要完整智能体上下文、工具、记忆和会话转录时，使用 openclaw agent 或聊天接口。
model auth login、model auth logout 和 model auth status 管理已保存的提供商凭证状态。

图像

使用 image 进行生成、编辑和描述。

openclaw infer image generate --prompt "friendly lobster illustration" --json
openclaw infer image generate --prompt "cinematic product photo of headphones" --json
openclaw infer image generate --model openai/gpt-image-1.5 --output-format png --background transparent --prompt "simple red circle sticker on a transparent background" --json
openclaw infer image generate --prompt "slow image backend" --timeout-ms 180000 --json
openclaw infer image edit --file ./logo.png --model openai/gpt-image-1.5 --output-format png --background transparent --prompt "keep the logo, remove the background" --json
openclaw infer image edit --file ./poster.png --prompt "make this a vertical story ad" --size 2160x3840 --aspect-ratio 9:16 --resolution 4K --json
openclaw infer image describe --file ./photo.jpg --json
openclaw infer image describe --file ./receipt.jpg --prompt "Extract the merchant, date, and total" --json
openclaw infer image describe-many --file ./before.png --file ./after.png --prompt "Compare the screenshots and list visible UI changes" --json
openclaw infer image describe --file ./ui-screenshot.png --model openai/gpt-4.1-mini --json
openclaw infer image describe --file ./photo.jpg --model ollama/qwen2.5vl:7b --prompt "Describe the image in one sentence" --timeout-ms 300000 --json

备注：

从现有输入文件开始时使用 image edit。
对支持参考图像编辑几何提示的提供商/模型，将 --size、--aspect-ratio 或 --resolution 与 image edit 一起使用。
将 --output-format png --background transparent 与 --model openai/gpt-image-1.5 一起使用，以获得透明背景的 OpenAI PNG 输出；--openai-background 仍可作为 OpenAI 专用别名使用。不声明背景支持的提供商会将该提示报告为被忽略的覆盖项。
使用 image providers --json 验证哪些内置图像提供商可被发现、已配置、被选中，以及每个提供商暴露了哪些生成/编辑能力。

使用 image generate --model <provider/model> --json 作为图像生成更改的最窄实时 CLI 冒烟测试。示例：

openclaw infer image providers --json
openclaw infer image generate \
  --model google/gemini-3.1-flash-image-preview \
  --prompt "Minimal flat test image: one blue square on a white background, no text." \
  --output ./openclaw-infer-image-smoke.png \
  --json

JSON 响应会报告 ok、provider、model、attempts 和已写入的输出路径。设置 --output 时，最终扩展名可能会遵循提供商返回的 MIME 类型。

对于 image describe 和 image describe-many，使用 --prompt 给视觉模型提供特定于任务的指令，例如 OCR、比较、UI 检查或简洁图注。
对于较慢的本地视觉模型或冷启动的 Ollama，使用 --timeout-ms。
对于 image describe，--model 必须是支持图像的 <provider/model>。
对于本地 Ollama 视觉模型，请先拉取模型，并将 OLLAMA_API_KEY 设置为任意占位值，例如 ollama-local。请参阅 Ollama。

音频

使用 audio 进行文件转录。

openclaw infer audio transcribe --file ./memo.m4a --json
openclaw infer audio transcribe --file ./team-sync.m4a --language en --prompt "Focus on names and action items" --json
openclaw infer audio transcribe --file ./memo.m4a --model openai/whisper-1 --json

注意：

audio transcribe 用于文件转录，而不是实时会话管理。
--model 必须是 <provider/model>。

TTS

使用 tts 进行语音合成和 TTS 提供商状态管理。

openclaw infer tts convert --text "hello from openclaw" --output ./hello.mp3 --json
openclaw infer tts convert --text "Your build is complete" --output ./build-complete.mp3 --json
openclaw infer tts providers --json
openclaw infer tts status --json

注意：

tts status 默认使用 Gateway 网关，因为它反映 Gateway 网关管理的 TTS 状态。
使用 tts providers、tts voices 和 tts set-provider 检查并配置 TTS 行为。

视频

使用 video 进行生成和描述。

openclaw infer video generate --prompt "cinematic sunset over the ocean" --json
openclaw infer video generate --prompt "slow drone shot over a forest lake" --resolution 768P --duration 6 --json
openclaw infer video describe --file ./clip.mp4 --json
openclaw infer video describe --file ./clip.mp4 --model openai/gpt-4.1-mini --json

注意：

video generate 接受 --size、--aspect-ratio、--resolution、--duration、--audio、--watermark 和 --timeout-ms，并将它们转发给视频生成运行时。
对于 video describe，--model 必须是 <provider/model>。

网页

使用 web 执行搜索和获取工作流。

openclaw infer web search --query "OpenClaw docs" --json
openclaw infer web search --query "OpenClaw infer web providers" --json
openclaw infer web fetch --url https://docs.openclaw.ai/cli/infer --json
openclaw infer web providers --json

注意事项：

使用 web providers 检查可用、已配置和已选择的提供商。

嵌入

使用 embedding 创建向量并检查嵌入提供商。

openclaw infer embedding create --text "friendly lobster" --json
openclaw infer embedding create --text "customer support ticket: delayed shipment" --model openai/text-embedding-3-large --json
openclaw infer embedding providers --json

JSON 输出

infer 命令会将 JSON 输出规范化到共享封套下：

{
  "ok": true,
  "capability": "image.generate",
  "transport": "local",
  "provider": "openai",
  "model": "gpt-image-2",
  "attempts": [],
  "outputs": []
}

顶层字段是稳定的：

ok
capability
transport
provider
model
attempts
outputs
error

对于生成媒体的命令，outputs 包含 OpenClaw 写入的文件。自动化时应使用该数组中的 path、mimeType、size 和任何媒体特定尺寸，而不是解析面向人类可读的 stdout。

常见陷阱

# Bad
openclaw infer media image generate --prompt "friendly lobster"

# Good
openclaw infer image generate --prompt "friendly lobster"

# Bad
openclaw infer audio transcribe --file ./memo.m4a --model whisper-1 --json

# Good
openclaw infer audio transcribe --file ./memo.m4a --model openai/whisper-1 --json

注意事项

openclaw capability ... 是 openclaw infer ... 的别名。

Overview

First steps

Guides

将 infer 转成一个 Skill

为什么使用 infer

命令树

常见任务

行为

模型

图像

音频

TTS

视频

网页

嵌入

JSON 输出

常见陷阱

注意事项

相关内容

Overview

First steps

Guides

Documentation Index

​将 infer 转成一个 Skill

​为什么使用 infer

​命令树

​常见任务

​行为

​模型

​图像

​音频

​TTS

​视频

​网页

​嵌入

​JSON 输出

​常见陷阱

​注意事项

​相关内容

将 infer 转成一个 Skill

为什么使用 infer

命令树

常见任务

行为

模型

图像

音频

TTS

视频

网页

嵌入

JSON 输出

常见陷阱

注意事项

相关内容