STT · Reliability

Fast on average isn't fast enough.

Production SLAs live in the tail, not the median. End-to-end latency at p50/p95/p99 shows which providers stay tight and which spike — the difference between "feels instant" and "occasionally hangs."

End-to-end latency

Provider                     p50      p95      p99
OpenAI GPT-4o Transcribe     974 ms   1.65 s   10.08 s
ElevenLabs Scribe v2         1.01 s   1.57 s   1.94 s
Deepgram Nova-3              2.22 s   3.25 s   4.62 s
AssemblyAI Universal-3 Pro   3.33 s   4.79 s   5.21 s

n=150/provider · single-location, includes network RTT · measured 2026-06-01
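
For readers who want to reproduce percentile figures like these from their own measurements, here is a minimal sketch. It assumes the nearest-rank definition of a percentile; the source does not say which method the benchmark uses, and other methods (for example linear interpolation) give slightly different numbers. The function names `percentile` and `summarize` are illustrative, not part of any provider SDK.

```python
def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it.
    (Assumed definition; the benchmark's actual method is not stated.)"""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    # Integer ceiling of p * n / 100, avoiding floating-point rounding.
    rank = (p * len(ordered) + 99) // 100
    return ordered[rank - 1]


def summarize(samples):
    """Return the three percentiles reported in the table."""
    return {f"p{p}": percentile(samples, p) for p in (50, 95, 99)}


if __name__ == "__main__":
    # Hypothetical data: 148 requests at 1.0 s and two slow ones at 10.0 s.
    latencies_s = [1.0] * 148 + [10.0, 10.0]
    print(summarize(latencies_s))
```

Under this definition, with n=150 the p99 is the 149th of the 150 sorted samples, that is, the second-slowest request. In the hypothetical data above, two slow requests out of 150 are enough to move p99 to 10 s while p50 and p95 stay at 1 s.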

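OpenAI GPT-4o Transcribe has the fastest median (974 ms) but a p99 of about 10 s, roughly ten times its median, and the median alone hides that tail. ElevenLabs Scribe v2 is both fast and tight (p99 under 2 s). Deepgram Nova-3 and AssemblyAI Universal-3 Pro are slower but at least consistent, with p99 at roughly 1.6x to 2.1x the median.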
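
One way to put a number on "tight" versus "spiky" is the ratio of p99 to p50. The sketch below computes it from the table values, converted to seconds. The ratio is an illustrative metric chosen here, not one the benchmark itself reports, and the names `LATENCY_S`, `tail_ratio`, and `rank_by_tail` are hypothetical.

```python
# (p50, p95, p99) in seconds, copied from the table above.
LATENCY_S = {
    "OpenAI GPT-4o Transcribe": (0.974, 1.65, 10.08),
    "ElevenLabs Scribe v2": (1.01, 1.57, 1.94),
    "Deepgram Nova-3": (2.22, 3.25, 4.62),
    "AssemblyAI Universal-3 Pro": (3.33, 4.79, 5.21),
}


def tail_ratio(p50, p99):
    """How many times slower the p99 request is than the median one."""
    return p99 / p50


def rank_by_tail(latency):
    """Return (provider, p99/p50 ratio) pairs, tightest tail first."""
    ratios = [
        (name, round(tail_ratio(p50, p99), 2))
        for name, (p50, _p95, p99) in latency.items()
    ]
    return sorted(ratios, key=lambda pair: pair[1])


if __name__ == "__main__":
    for name, ratio in rank_by_tail(LATENCY_S):
        print(f"{name}: p99 is {ratio}x p50")
```

A ratio near 1 means the slowest requests look much like the typical one. A large ratio flags a provider whose median understates what users occasionally experience. The ratio says nothing about absolute speed, so it is best read alongside the p50 column.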