Speech-to-Text (STT)

POST /v1/audio/transcriptions

Transcribes audio files to text. The endpoint is compatible with the OpenAI Whisper API format.

Request Parameters

Parameter	Type	Required	Description
`file`	file	Yes	Audio file to transcribe, sent as multipart/form-data
`model`	string	Yes	Model name: `whisper-1` or `gpt-4o-transcribe`
`language`	string	No	Language of the audio as an ISO 639-1 code, e.g. `zh`, `en`, `ja`
`response_format`	string	No	Output format: `json` (default), `text`, `srt`, `verbose_json`, or `vtt`
`temperature`	number	No	Sampling temperature, between 0 and 1
`prompt`	string	No	Text that gives the model context for the audio
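
The parameter table above can be checked on the client side before a request is sent. The following is a minimal sketch, not part of the API: `build_form_fields` and `ALLOWED_FORMATS` are hypothetical names, and the checks simply mirror the table (required `model`, the listed `response_format` values, `temperature` between 0 and 1).

```python
# Hypothetical helper: the names below are illustrative and not part of the API.
ALLOWED_FORMATS = {"json", "text", "srt", "verbose_json", "vtt"}


def build_form_fields(model, language=None, response_format="json",
                      temperature=None, prompt=None):
    """Return the non-file form fields for a transcription request.

    Values are validated against the parameter table; optional fields
    that were not supplied are left out of the result.
    """
    if not model:
        raise ValueError("model is required")
    if response_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format!r}")
    if temperature is not None and not 0 <= temperature <= 1:
        raise ValueError("temperature must be between 0 and 1")

    fields = {"model": model, "response_format": response_format}
    if language is not None:
        fields["language"] = language
    if temperature is not None:
        fields["temperature"] = str(temperature)
    if prompt is not None:
        fields["prompt"] = prompt
    return fields
```

The `file` field is left out on purpose, since it is sent as a separate multipart part rather than as a plain text value.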

Supported Audio Formats

mp3, mp4, mpeg, mpga, m4a, wav, webm
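
A client may want to reject unsuitable files before uploading them. The sketch below checks the file extension against the list above and the file size against the 25MB limit stated on this page. `check_audio_file` is a hypothetical helper, and reading "25MB" as 25 × 1024 × 1024 bytes is an assumption; the server may count differently.

```python
import os

SUPPORTED_EXTENSIONS = {"mp3", "mp4", "mpeg", "mpga", "m4a", "wav", "webm"}
MAX_BYTES = 25 * 1024 * 1024  # assumption: "25MB" is interpreted as 25 MiB


def check_audio_file(path):
    """Return (extension, size_in_bytes), or raise ValueError if unsuitable."""
    ext = os.path.splitext(path)[1].lstrip(".").lower()
    if ext not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"unsupported audio format: {ext or '(none)'}")
    size = os.path.getsize(path)
    if size > MAX_BYTES:
        raise ValueError(f"file is {size} bytes; limit is {MAX_BYTES}")
    return ext, size
```

The check looks only at the extension, not at the actual container format, so it is a convenience rather than a guarantee that the server will accept the file.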

Request Examples

curl -X POST https://crazyrouter.com/v1/audio/transcriptions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F file=@audio.mp3 \
  -F model=whisper-1 \
  -F language=en \
  -F response_format=json
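
For readers who prefer Python, the curl request above could be reproduced with the standard library alone. This is a sketch under assumptions: `encode_multipart` and `transcribe` are hypothetical helper names, the file part is labelled `application/octet-stream`, and error handling is omitted.

```python
import os
import urllib.request
import uuid

API_URL = "https://crazyrouter.com/v1/audio/transcriptions"


def encode_multipart(fields, filename, file_bytes):
    """Encode text fields plus one `file` part as multipart/form-data.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
        )
        parts.append(header.encode("utf-8") + str(value).encode("utf-8") + b"\r\n")
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    )
    parts.append(file_header.encode("utf-8") + file_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(path, api_key, **fields):
    """POST the audio file and return the raw response body as a string."""
    with open(path, "rb") as audio:
        body, content_type = encode_multipart(
            fields, os.path.basename(path), audio.read()
        )
    request = urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": content_type,
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Calling `transcribe("audio.mp3", "YOUR_API_KEY", model="whisper-1", language="en", response_format="json")` would mirror the curl command. The function returns the body as text because formats such as `srt` are not JSON; for `json` and `verbose_json`, pass the result to `json.loads`.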

Response Examples

JSON Format

{
  "text": "Hello, welcome to the Crazyrouter API. Today we'll introduce the speech-to-text feature."
}

verbose_json Format

{
  "task": "transcribe",
  "language": "english",
  "duration": 5.2,
  "text": "Hello, welcome to the Crazyrouter API.",
  "segments": [
    {
      "id": 0,
      "start": 0.0,
      "end": 2.5,
      "text": "Hello, welcome to the Crazyrouter API."
    }
  ]
}
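
The `segments` array is what distinguishes `verbose_json` from the plain `json` format. As a small illustration, the sketch below parses the sample response above and lists each segment with its time span. `segment_spans` is a hypothetical helper, and only the fields shown in the sample are assumed to exist.

```python
import json

# The verbose_json sample from this page, used here as test input.
SAMPLE = """
{
  "task": "transcribe",
  "language": "english",
  "duration": 5.2,
  "text": "Hello, welcome to the Crazyrouter API.",
  "segments": [
    {"id": 0, "start": 0.0, "end": 2.5,
     "text": "Hello, welcome to the Crazyrouter API."}
  ]
}
"""


def segment_spans(payload):
    """Return a list of (start, end, text) tuples from a verbose_json body."""
    data = json.loads(payload)
    return [(seg["start"], seg["end"], seg["text"])
            for seg in data.get("segments", [])]


for start, end, text in segment_spans(SAMPLE):
    print(f"[{start:.1f}s - {end:.1f}s] {text}")
# prints: [0.0s - 2.5s] Hello, welcome to the Crazyrouter API.
```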

SRT Format

1
00:00:00,000 --> 00:00:02,500
Hello, welcome to the Crazyrouter API.
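
An SRT response is plain text rather than JSON, so it needs its own parsing if the cues are to be processed programmatically. The following sketch assumes well-formed cues like the one above (index line, timing line, then one or more text lines); `parse_srt` and `to_seconds` are hypothetical helpers, and malformed blocks are skipped rather than reported.

```python
import re

# Matches a timing line such as "00:00:00,000 --> 00:00:02,500".
CUE_TIME = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3}) --> (\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


def to_seconds(hours, minutes, seconds, millis):
    """Convert the four captured timestamp parts to seconds as a float."""
    return int(hours) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000


def parse_srt(srt_text):
    """Parse SRT text into a list of cue dicts; malformed blocks are skipped."""
    cues = []
    normalized = srt_text.replace("\r\n", "\n").strip()
    for block in re.split(r"\n\s*\n", normalized):
        lines = block.splitlines()
        if len(lines) < 3 or not lines[0].strip().isdigit():
            continue
        match = CUE_TIME.match(lines[1].strip())
        if not match:
            continue
        parts = match.groups()
        cues.append({
            "index": int(lines[0].strip()),
            "start": to_seconds(*parts[:4]),
            "end": to_seconds(*parts[4:]),
            "text": "\n".join(lines[2:]),
        })
    return cues
```

Applied to the sample above, `parse_srt` yields one cue with `index` 1, `start` 0.0, and `end` 2.5.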

Audio Translation

POST /v1/audio/translations

Translate non-English audio to English text. Parameters are the same as the transcription endpoint.

Python

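from openai import OpenAI

# Assumes the official OpenAI Python SDK, pointed at the Crazyrouter base URL.
client = OpenAI(api_key="YOUR_API_KEY", base_url="https://crazyrouter.com/v1")

with open("chinese_audio.mp3", "rb") as audio_file:
    translation = client.audio.translations.create(
        model="whisper-1",
        file=audio_file,
    )

print(translation.text)  # English translation of the audio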

Specifying the `language` parameter can improve transcription accuracy. The audio file size limit is 25MB.
