
GPT-5 Thinking Mode

GPT-5 supports a thinking mode via the Responses API's reasoning parameter, which lets the model reason internally before producing an answer.
POST /v1/responses

Basic Usage

curl https://crazyrouter.com/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "gpt-5",
    "input": "Analyze the time complexity of the following code and propose an optimization:\ndef two_sum(nums, target):\n    for i in range(len(nums)):\n        for j in range(i+1, len(nums)):\n            if nums[i] + nums[j] == target:\n                return [i, j]",
    "reasoning": {
      "effort": "high"
    }
  }'

reasoning Parameter

| Field | Type | Description |
|---|---|---|
| `effort` | string | Reasoning depth: `low` (quick), `medium` (balanced), `high` (deep) |
| `summary` | string | Thinking summary: `auto`, `concise`, `detailed` |
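As a sketch, the reasoning object can be built and validated client-side before a request is sent. The allowed values mirror the table above; the helper name itself is hypothetical, not part of any SDK:

```python
# Hypothetical client-side helper: validates the two documented fields
# of the reasoning object before a request is sent.
ALLOWED_EFFORT = {"low", "medium", "high"}
ALLOWED_SUMMARY = {"auto", "concise", "detailed"}

def build_reasoning(effort="medium", summary=None):
    """Return a reasoning dict, raising early on unsupported values."""
    if effort not in ALLOWED_EFFORT:
        raise ValueError(f"effort must be one of {sorted(ALLOWED_EFFORT)}")
    reasoning = {"effort": effort}
    if summary is not None:
        if summary not in ALLOWED_SUMMARY:
            raise ValueError(f"summary must be one of {sorted(ALLOWED_SUMMARY)}")
        reasoning["summary"] = summary
    return reasoning

print(build_reasoning("high", "detailed"))
# {'effort': 'high', 'summary': 'detailed'}
```

Failing fast on a typo like `"hgih"` is cheaper than discovering it from an API error after the request round-trip.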

effort Level Comparison

| Level | Use Cases | Token Consumption |
|---|---|---|
| `low` | Simple questions, factual queries | Low |
| `medium` | General reasoning, code analysis | Medium |
| `high` | Complex math, deep analysis | High |
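One way to apply this table in practice is a small dispatch that picks an effort level from a rough task category, keeping token spend proportional to task difficulty. The category names here are illustrative assumptions:

```python
# Hypothetical heuristic mirroring the comparison table: map a rough
# task category to an effort level to control token consumption.
EFFORT_BY_TASK = {
    "factual_query": "low",
    "code_analysis": "medium",
    "complex_math": "high",
}

def effort_for(task_category):
    # Default to the balanced setting for unknown task types.
    return EFFORT_BY_TASK.get(task_category, "medium")

print(effort_for("complex_math"))  # high
print(effort_for("translation"))   # medium (fallback)
```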

Get Thinking Summary

Set the summary parameter to get a summary of the model’s thinking process:
Python
response = client.responses.create(
    model="gpt-5",
    input="Design a high-concurrency message queue system architecture",
    reasoning={
        "effort": "high",
        "summary": "detailed"
    }
)

# Output may contain thinking summary
for item in response.output:
    if item.type == "reasoning":
        print("Thinking process:", item.summary)
    elif item.type == "message":
        for content in item.content:
            if content.type == "output_text":
                print("Answer:", content.text)

Streaming Thinking

Python
stream = client.responses.create(
    model="gpt-5",
    input="Explain why the P=NP problem is important",
    reasoning={"effort": "high", "summary": "concise"},
    stream=True
)

for event in stream:
    if event.type == "response.reasoning_summary_text.delta":
        print(f"[Thinking] {event.delta}", end="")
    elif event.type == "response.output_text.delta":
        print(event.delta, end="")

Combined with Tools

Thinking mode can be combined with Function Calling and Web Search:
Python
response = client.responses.create(
    model="gpt-5",
    input="Analyze the current global AI chip market landscape and provide investment recommendations",
    reasoning={"effort": "high"},
    tools=[
        {"type": "web_search_preview"}
    ]
)

print(response.output_text)

Combined with System Instructions

Python
response = client.responses.create(
    model="gpt-5",
    instructions="You are a senior software architect who specializes in analyzing system design problems.",
    input="Design a real-time chat system that supports millions of users",
    reasoning={"effort": "high"}
)
In thinking mode, the model consumes additional tokens for internal reasoning. Higher effort levels consume more tokens but typically produce better answers.
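Reasoning-token spend can be inspected from the response's usage block. A minimal sketch, assuming the nested field names of the OpenAI Responses API usage object (`output_tokens_details.reasoning_tokens`); your provider's response shape may differ:

```python
# Sketch: read reasoning-token spend from a Responses API usage payload.
# The nested field names are an assumption based on the OpenAI Responses
# API; verify them against your provider's actual response.
def reasoning_tokens(usage: dict) -> int:
    details = usage.get("output_tokens_details") or {}
    return details.get("reasoning_tokens", 0)

usage = {"output_tokens": 900, "output_tokens_details": {"reasoning_tokens": 640}}
print(reasoning_tokens(usage))  # 640
```

Tracking this number per request makes it easy to compare the cost of `low`, `medium`, and `high` effort on your own workload.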
Not all models support the reasoning parameter. Currently it is mainly supported by GPT-5 and o-series models. For unsupported models, this parameter will be ignored.
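Although unsupported models simply ignore the parameter, a client can also strip it explicitly before sending. The sketch below does this with a prefix allowlist; the list of prefixes is an illustrative assumption, not an authoritative support matrix:

```python
# Illustrative guard: only attach the reasoning parameter for models
# assumed to support it. The prefix list is an assumption here.
REASONING_MODEL_PREFIXES = ("gpt-5", "o1", "o3", "o4")

def request_kwargs(model, input_text, reasoning=None):
    kwargs = {"model": model, "input": input_text}
    if reasoning and model.startswith(REASONING_MODEL_PREFIXES):
        kwargs["reasoning"] = reasoning
    return kwargs

print("reasoning" in request_kwargs("gpt-5", "Hi", {"effort": "low"}))   # True
print("reasoning" in request_kwargs("gpt-4o", "Hi", {"effort": "low"}))  # False
```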