Which providers actually speak which language?
The English leaderboard hides a coverage cliff. Some providers transcribe Thai, Indonesian, and Vietnamese about as well as they do English; others don't support them at all. We check support first and only benchmark a provider on a language it can actually transcribe — the rest read "does not support."
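As a rough illustration of a support-first gate, the sketch below decides whether a provider should be scored on a language before any number is published. It is not the gateway's actual check: the function names, the probe-clip idea, and both thresholds (`min_script_share`, `max_mean_error`) are assumptions made for the example. It simply encodes the two failure signs this page names for "does not support": output in the wrong script, or error near 100%.

```python
def script_share(text, lo, hi):
    """Fraction of non-space characters whose code point lies in [lo, hi]."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    return sum(lo <= ord(c) <= hi for c in chars) / len(chars)


def supports_language(probe_outputs, probe_error_rates,
                      script_range=None,
                      min_script_share=0.8,   # hypothetical threshold
                      max_mean_error=0.9):    # hypothetical threshold
    """Return True if the provider looks usable on this language.

    probe_outputs: transcripts returned for a few probe clips.
    probe_error_rates: WER or CER per probe clip, as fractions (1.0 = 100%).
    script_range: (lo, hi) code point range of the expected script,
                  or None when the language is written in Latin script.
    """
    if not probe_outputs or len(probe_outputs) != len(probe_error_rates):
        raise ValueError("need one error rate per probe transcript")

    # Failure sign 1: the transcript comes back in the wrong script.
    if script_range is not None:
        lo, hi = script_range
        shares = [script_share(t, lo, hi) for t in probe_outputs]
        if sum(shares) / len(shares) < min_script_share:
            return False

    # Failure sign 2: error close to 100% even in the right script.
    mean_error = sum(probe_error_rates) / len(probe_error_rates)
    return mean_error <= max_mean_error


THAI = (0x0E00, 0x0E7F)  # Unicode Thai block

print(supports_language(["สวัสดีครับ"], [0.05], script_range=THAI))
print(supports_language(["sawasdee krab"], [1.0], script_range=THAI))
```

The script test only helps for languages with their own script, such as Thai. Spanish, Indonesian, and Vietnamese are written in Latin script, so for those the gate in this sketch falls back to the error threshold alone.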
Per-language accuracy
| Provider | English WER | Spanish WER | Thai CER | Indonesian WER | Vietnamese WER |
|---|---|---|---|---|---|
| OpenAI GPT-4o Transcribe | 2.4% | 1.4% | 8.1% | 2.4% | 2.5% |
| Alibaba Qwen3-ASR | 2.6% | 2.0% | 4.8% | 4.6% | 2.4% |
| ElevenLabs Scribe v2 | 2.9% | 1.8% | 4.1% | 3.0% | 1.9% |
| xAI Grok STT | 4.8% | 3.1% | 6.6% | 2.9% | 4.7% |
| Cartesia Ink-2 | 6.1% | does not support | does not support | does not support | does not support |
| Gradium | 13.2% | does not support | does not support | does not support | does not support |
FLEURS · all via the Speko gateway · English/Thai/Indonesian/Vietnamese loudness-normalized to −16 LUFS, measured 2026-06-03 (English 50-clip board, the rest 30 clips each) · Spanish FLEURS es_419, 250 clips, raw audio (not −16 LUFS), measured 2026-06-10 — so the Spanish column is a real measurement but cross-language comparison is approximate · Thai scored by CER (no word boundaries), the rest by WER
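For readers unfamiliar with the two metrics, here is a minimal sketch of how WER and CER are conventionally computed from a reference and a hypothesis transcript. It assumes plain whitespace tokenization for WER and whitespace-stripped Unicode code points for CER; the text normalization actually used for the numbers above is not specified here, so treat this as the textbook definition rather than the benchmark's scoring code.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference has no words")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate, ignoring whitespace.

    Suited to scripts such as Thai that do not mark word boundaries.
    """
    ref_chars = [c for c in reference if not c.isspace()]
    hyp_chars = [c for c in hypothesis if not c.isspace()]
    if not ref_chars:
        raise ValueError("reference has no characters")
    return edit_distance(ref_chars, hyp_chars) / len(ref_chars)


# One substitution plus one deletion against six reference words.
print(wer("the cat sat on the mat", "the cat sit on mat"))
# One substituted character out of four.
print(cer("abcd", "abxd"))
```

Because CER counts characters and WER counts words, the Thai column is not directly comparable with the other columns even when the percentages look similar.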
"Does not support" means the provider can't transcribe that language in its native script — it returns the wrong script or ~100% error — so we don't benchmark it there and don't publish a misleading number. Four of the six gateway providers cover the wedge: OpenAI, ElevenLabs, xAI, and Alibaba transcribe Spanish/Thai/Indonesian/Vietnamese; Cartesia and Gradium are English-only.
Wedge coverage is in progress: Malay and Filipino are not yet measured (corpus pending). Accent equality and code-switching are separate territories, below.