Reasoning Models

Reasoning models perform deep thinking before answering, making them well suited to complex tasks such as math, coding, and logical reasoning. Crazyrouter supports multiple reasoning models with control over reasoning depth.

Supported Reasoning Models

| Model | Description |
|---|---|
| o4-mini | OpenAI reasoning model, balancing speed and capability |
| o3 | OpenAI advanced reasoning model |
| o3-mini | OpenAI lightweight reasoning model |
| deepseek-r1 | DeepSeek reasoning model |
| deepseek-v3-1 | DeepSeek V3.1 |
| claude-sonnet-4-20250514 | Claude with extended thinking support |
| gemini-2.5-flash-thinking | Gemini thinking model |

reasoning_effort Parameter

Control the model’s reasoning depth with reasoning_effort:
| Value | Description |
|---|---|
| low | Quick answers, suitable for simple questions |
| medium | Moderate reasoning, balancing speed and quality |
| high | Deep reasoning, suitable for complex problems |
```bash
curl https://crazyrouter.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "o4-mini",
    "messages": [
      {"role": "user", "content": "Prove that the square root of 2 is irrational"}
    ],
    "reasoning_effort": "high"
  }'
```
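The same request can be built in Python. The helper below is an illustrative sketch, not part of any SDK: it constructs the request payload shown above and validates the `reasoning_effort` value before sending.

```python
import json

VALID_EFFORTS = {"low", "medium", "high"}

def build_reasoning_request(model: str, prompt: str, effort: str = "medium") -> dict:
    """Build a chat-completions payload with reasoning_effort (illustrative helper)."""
    if effort not in VALID_EFFORTS:
        raise ValueError(f"reasoning_effort must be one of {sorted(VALID_EFFORTS)}")
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "reasoning_effort": effort,
    }

payload = build_reasoning_request(
    "o4-mini", "Prove that the square root of 2 is irrational", "high"
)
print(json.dumps(payload, indent=2))
# POST this payload to https://crazyrouter.com/v1/chat/completions
# with your API key in the Authorization header.
```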

thinking Parameter (Thinking Budget)

Some models support precise control of the thinking token budget via the thinking parameter:
```python
response = client.chat.completions.create(
    model="claude-sonnet-4-20250514",
    messages=[
        {"role": "user", "content": "Analyze the time complexity of this algorithm and optimize it"}
    ],
    thinking={
        "type": "enabled",
        "budget_tokens": 10000
    },
    max_tokens=16000
)
```
When using the thinking parameter, max_tokens must be greater than budget_tokens, because max_tokens includes both thinking tokens and output tokens.
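The constraint above can be enforced with a small validation helper. This is an illustrative sketch (the function name is ours, not an SDK API) that returns the `thinking` and `max_tokens` arguments only when they are consistent.

```python
def make_thinking_params(budget_tokens: int, max_tokens: int) -> dict:
    """Return thinking/max_tokens kwargs, enforcing max_tokens > budget_tokens."""
    if max_tokens <= budget_tokens:
        raise ValueError(
            f"max_tokens ({max_tokens}) must exceed budget_tokens ({budget_tokens}), "
            "because max_tokens covers both thinking and output tokens"
        )
    return {
        "thinking": {"type": "enabled", "budget_tokens": budget_tokens},
        "max_tokens": max_tokens,
    }

params = make_thinking_params(10000, 16000)
# client.chat.completions.create(model="claude-sonnet-4-20250514",
#                                messages=[...], **params)
```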

Streaming Reasoning

Reasoning models also support streaming output. The thinking process and final answer are returned as separate chunks:
```python
stream = client.chat.completions.create(
    model="o4-mini",
    messages=[
        {"role": "user", "content": "Solve a Sudoku puzzle"}
    ],
    reasoning_effort="medium",
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="")
```
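Depending on the model, streamed deltas may carry the thinking process in a separate field alongside `content` (commonly `reasoning_content`; the exact field name varies by provider and is an assumption here). A sketch that accumulates both streams, demonstrated on dict-shaped deltas:

```python
def accumulate_stream(deltas):
    """Collect thinking text and answer text from streamed delta dicts.

    Using `reasoning_content` as the thinking field is an assumption;
    check your model's documentation for the actual field name.
    """
    thinking, answer = [], []
    for delta in deltas:
        if delta.get("reasoning_content"):
            thinking.append(delta["reasoning_content"])
        if delta.get("content"):
            answer.append(delta["content"])
    return "".join(thinking), "".join(answer)

# Simulated deltas for illustration:
fake_deltas = [
    {"reasoning_content": "Scan each row for conflicts... "},
    {"content": "The puzzle "},
    {"content": "is solved."},
]
thinking, answer = accumulate_stream(fake_deltas)
```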

DeepSeek Reasoning Models

DeepSeek reasoning models return the thinking process in the response:
```python
response = client.chat.completions.create(
    model="deepseek-r1",
    messages=[
        {"role": "user", "content": "Which is larger, 9.11 or 9.8?"}
    ]
)

message = response.choices[0].message
# The thinking process may be in the reasoning_content field;
# use getattr since not all models return it
print("Thinking:", getattr(message, "reasoning_content", None))
print("Answer:", message.content)
```
Reasoning models typically consume far more tokens than standard models, because the thinking process itself generates tokens. Be mindful of reasoning_effort and budget_tokens to manage costs.

Reasoning models typically do not support sampling parameters such as temperature and top_p. If these parameters are passed, they may be ignored or cause an error.
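Since unsupported sampling parameters may cause errors, one defensive pattern is to strip them before calling a reasoning model. This sketch is illustrative: the model-prefix check is a simplified assumption, not an official capability list.

```python
UNSUPPORTED_SAMPLING_PARAMS = ("temperature", "top_p")

def sanitize_kwargs(model: str, kwargs: dict) -> dict:
    """Drop sampling parameters for reasoning models (heuristic prefix check)."""
    reasoning_prefixes = ("o3", "o4", "deepseek-r1")  # illustrative, not exhaustive
    if model.startswith(reasoning_prefixes):
        return {k: v for k, v in kwargs.items() if k not in UNSUPPORTED_SAMPLING_PARAMS}
    return dict(kwargs)

clean = sanitize_kwargs("o4-mini", {"temperature": 0.7, "max_tokens": 2048})
# client.chat.completions.create(model="o4-mini", messages=[...], **clean)
```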