Chat Completions | LLMoxy Docs

POST /v1/chat/completions is the primary OpenAI-compatible chat endpoint.

curl https://llmoxy.com/v1/chat/completions \
  -H "Authorization: Bearer <LLMOXY_API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      { "role": "system", "content": "Be concise." },
      { "role": "user", "content": "What is LLMoxy?" }
    ]
  }'
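The same request can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the helper names `build_chat_request` and `send` are hypothetical, and the response is assumed to be a JSON body.

```python
import json
import urllib.request

API_URL = "https://llmoxy.com/v1/chat/completions"


def build_chat_request(api_key: str, model: str, messages: list[dict]) -> urllib.request.Request:
    """Build (but do not send) a POST request mirroring the curl example."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the response body (assumed to be JSON)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


request = build_chat_request(
    "<LLMOXY_API_KEY>",
    "gpt-4o-mini",
    [
        {"role": "system", "content": "Be concise."},
        {"role": "user", "content": "What is LLMoxy?"},
    ],
)
# Call send(request) with a real key to perform the network request.
```

Building the request separately from sending it keeps the payload easy to inspect or log before any network call is made.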

Request body

Common fields:

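Field	Description
`model`	Required. Model ID.
`messages`	Required. Array of conversation messages.
`temperature`, `top_p`, `n`, `stop`	Sampling controls, passed through to compatible upstreams.
`max_tokens`, `max_completion_tokens`	Output token limits.
`stream`	Set to `true` to enable SSE streaming.
`stream_options.include_usage`	Returns usage in the streaming response when the upstream supports it.
`tools`, `tool_choice`	OpenAI-compatible tool calling.
`response_format`	Requests structured output from models that support it.
`reasoning_effort`	Accepts `low`, `medium`, or `high` on models that support reasoning.
`modalities`, `audio`	Available on models that support multimodal audio.
`user`	Optional. Your own end-user identifier.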

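Supported parameters vary by model. If a response reports that a parameter is not supported, remove that parameter or switch to a different model.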
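One way to act on this advice is to strip the rejected field and retry. The sketch below assumes you have already identified the unsupported parameter by reading the error yourself; the helper name `drop_params` is hypothetical, and no particular error format is assumed.

```python
def drop_params(payload: dict, unsupported: set[str]) -> dict:
    """Return a copy of the payload without the named top-level parameters."""
    return {key: value for key, value in payload.items() if key not in unsupported}


payload = {
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "What is LLMoxy?"}],
    "reasoning_effort": "low",
}

# Suppose the response said reasoning_effort is not supported by this model.
retry_payload = drop_params(payload, {"reasoning_effort"})
```

The original payload is left untouched, so the same dictionary can be reused against a different model that does accept the parameter.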

Streaming

{
  "model": "gpt-4o-mini",
  "stream": true,
  "messages": [
    { "role": "user", "content": "Stream one sentence." }
  ]
}

The response is delivered as OpenAI-style SSE chunks and ends with [DONE].
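A minimal sketch of consuming such a stream follows. It assumes the usual OpenAI chunk layout, where each event is a `data: <json>` line and the incremental text lives in `choices[].delta.content`; the function names and the sample lines are illustrative, not captured output.

```python
import json
from typing import Iterable, Iterator


def iter_chunks(lines: Iterable[str]) -> Iterator[dict]:
    """Yield decoded JSON chunks from SSE lines until the [DONE] sentinel."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return
        yield json.loads(data)


def collect_text(lines: Iterable[str]) -> str:
    """Concatenate the text deltas from every chunk in the stream."""
    parts = []
    for chunk in iter_chunks(lines):
        # A usage-only chunk may carry an empty choices list.
        for choice in chunk.get("choices") or []:
            content = (choice.get("delta") or {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts)


# Hand-written sample lines in the assumed chunk shape.
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Hello"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": " world."}}]}',
    "",
    'data: {"choices": [], "usage": {"total_tokens": 5}}',
    "",
    "data: [DONE]",
]
text = collect_text(sample)
```

In a real client, `lines` would be the decoded lines of the HTTP response body read incrementally rather than a list.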

Tool calling

{
  "model": "gpt-4o-mini",
  "messages": [
    { "role": "user", "content": "What is the weather in Singapore?" }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
          "type": "object",
          "properties": {
            "city": { "type": "string" }
          },
          "required": ["city"]
        }
      }
    }
  ]
}

Models that support tool calling receive the tool definitions and return an OpenAI-compatible tool call response.
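The client is then responsible for running the requested tool and sending the result back. The sketch below assumes the standard OpenAI response shape (`tool_calls` on the assistant message, with `arguments` encoded as a JSON string); the local `get_weather` implementation, the `TOOLS` registry, and the sample message are all hypothetical.

```python
import json


def get_weather(city: str) -> str:
    """Hypothetical local tool implementation; returns canned data."""
    return json.dumps({"city": city, "forecast": "placeholder"})


# Maps tool names from the request's "tools" list to local callables.
TOOLS = {"get_weather": get_weather}


def run_tool_calls(assistant_message: dict) -> list[dict]:
    """Run each requested tool and build the "tool" role messages to send back."""
    results = []
    for call in assistant_message.get("tool_calls") or []:
        function = call["function"]
        arguments = json.loads(function["arguments"])  # arguments arrive as a JSON string
        output = TOOLS[function["name"]](**arguments)
        results.append({"role": "tool", "tool_call_id": call["id"], "content": output})
    return results


# Hand-written assistant message in the assumed tool-call shape.
assistant_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [
        {
            "id": "call_1",
            "type": "function",
            "function": {"name": "get_weather", "arguments": '{"city": "Singapore"}'},
        }
    ],
}
tool_messages = run_tool_calls(assistant_message)
```

To continue the conversation, append the assistant message and then `tool_messages` to the original `messages` array and post the request again.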

Legacy Completions

Legacy text-completion clients can use POST /v1/completions.