Responses Web Search

The Responses API has a built-in web search tool that enables models to search the internet for real-time information.
POST /v1/responses

Basic Usage

curl https://crazyrouter.com/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "gpt-4o",
    "input": "What are the important tech news stories today?",
    "tools": [
      {
        "type": "web_search_preview"
      }
    ]
  }'

Search Configuration

You can configure the search context size:
{
  "model": "gpt-4o",
  "input": "Latest AI model releases in 2026",
  "tools": [
    {
      "type": "web_search_preview",
      "search_context_size": "medium"
    }
  ]
}
Parameter            Value    Description
search_context_size  low      Less search context, faster responses
search_context_size  medium   Default; balances speed and information
search_context_size  high     More search context, more comprehensive information
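
The configuration above can be assembled programmatically. The following is a minimal sketch, not part of the API itself: the helper name `build_web_search_request` and the client-side validation are assumptions made for illustration, and only the three values listed in the table are accepted.

```python
import json

# Values documented in the table above.
ALLOWED_CONTEXT_SIZES = {"low", "medium", "high"}


def build_web_search_request(prompt, model="gpt-4o", search_context_size="medium"):
    """Return a /v1/responses request body with the web search tool enabled.

    Hypothetical helper: it only builds the JSON body and sends nothing.
    """
    if search_context_size not in ALLOWED_CONTEXT_SIZES:
        raise ValueError(
            f"search_context_size must be one of {sorted(ALLOWED_CONTEXT_SIZES)}, "
            f"got {search_context_size!r}"
        )
    return {
        "model": model,
        "input": prompt,
        "tools": [
            {
                "type": "web_search_preview",
                "search_context_size": search_context_size,
            }
        ],
    }


if __name__ == "__main__":
    body = build_web_search_request(
        "Latest AI model releases in 2026", search_context_size="high"
    )
    # Print the body that would be sent as the POST payload.
    print(json.dumps(body, indent=2))
```

Rejecting unknown values before the request is sent is a design choice of this sketch; the server may apply its own validation.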

Python
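from openai import OpenAI

# Point the OpenAI SDK at the same host used in the curl example.
client = OpenAI(api_key="YOUR_API_KEY", base_url="https://crazyrouter.com/v1")

# stream=True returns an iterator of events instead of a single response.
stream = client.responses.create(
    model="gpt-4o",
    input="Compare the latest version features of React and Vue",
    tools=[{"type": "web_search_preview"}],
    stream=True
)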

for event in stream:
    if event.type == "response.output_text.delta":
        print(event.delta, end="")
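
If the full answer is needed after streaming finishes, the same loop can collect the deltas instead of printing them. This is a rough sketch: `collect_output_text` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins for real stream events so that the example runs without a network call.

```python
from types import SimpleNamespace


def collect_output_text(events):
    """Concatenate the text deltas from an iterable of streaming events."""
    parts = []
    for event in events:
        # Events of other types (search progress, completion, ...) are skipped.
        if getattr(event, "type", None) == "response.output_text.delta":
            parts.append(event.delta)
    return "".join(parts)


if __name__ == "__main__":
    # Stand-in events; a real stream comes from client.responses.create(..., stream=True).
    fake_stream = [
        SimpleNamespace(type="response.created"),
        SimpleNamespace(type="response.output_text.delta", delta="React "),
        SimpleNamespace(type="response.output_text.delta", delta="and Vue"),
        SimpleNamespace(type="response.completed"),
    ]
    print(collect_output_text(fake_stream))  # prints "React and Vue"
```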

Response Format

Search results are included in the response output:
{
  "id": "resp_abc123",
  "object": "response",
  "output": [
    {
      "type": "web_search_call",
      "id": "ws_abc123",
      "status": "completed"
    },
    {
      "type": "message",
      "role": "assistant",
      "content": [
        {
          "type": "output_text",
          "text": "Based on the latest search results, today's major tech news includes...",
          "annotations": [
            {
              "type": "url_citation",
              "start_index": 10,
              "end_index": 20,
              "url": "https://example.com/news"
            }
          ]
        }
      ]
    }
  ]
}
Each output_text item carries an annotations array. Every url_citation entry gives the source URL together with the start_index and end_index of the cited span in text, which you can use to display reference links.
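
As an illustration of how the annotations might be turned into reference links, the sketch below walks a response that has already been parsed into a Python dict shaped like the example above. The helper name `extract_url_citations` is hypothetical, and treating `end_index` as exclusive (as in a Python slice) is an assumption; adjust it if your responses show otherwise.

```python
def extract_url_citations(response):
    """Return a list of {"url", "cited_text"} dicts found in a response dict."""
    citations = []
    for item in response.get("output", []):
        # web_search_call items carry no text; only message items do.
        if item.get("type") != "message":
            continue
        for part in item.get("content", []):
            if part.get("type") != "output_text":
                continue
            text = part.get("text", "")
            for annotation in part.get("annotations", []):
                if annotation.get("type") != "url_citation":
                    continue
                start = annotation["start_index"]
                end = annotation["end_index"]  # assumed exclusive
                citations.append(
                    {"url": annotation["url"], "cited_text": text[start:end]}
                )
    return citations


if __name__ == "__main__":
    sample = {
        "output": [
            {"type": "web_search_call", "id": "ws_abc123", "status": "completed"},
            {
                "type": "message",
                "role": "assistant",
                "content": [
                    {
                        "type": "output_text",
                        "text": "See the report for details.",
                        "annotations": [
                            {
                                "type": "url_citation",
                                "start_index": 8,
                                "end_index": 14,
                                "url": "https://example.com/news",
                            }
                        ],
                    }
                ],
            },
        ]
    }
    for citation in extract_url_citations(sample):
        print(f'{citation["cited_text"]} -> {citation["url"]}')
```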
Web search increases response latency and token consumption. The model automatically determines whether a search is needed and only triggers it when real-time information is required.