STT · Reliability

Fast on average isn't fast enough.

Production SLAs are set on the tail, not the median. End-to-end latency at p50, p95, and p99 shows which providers stay tight and which spike: the difference between "feels instant" and "occasionally hangs."

End-to-end latency


Provider	p50	p95	p99
OpenAI GPT-4o Transcribe	974 ms	1.65 s	10.08 s
ElevenLabs Scribe v2	1.01 s	1.57 s	1.94 s
Deepgram Nova-3	2.22 s	3.25 s	4.62 s
AssemblyAI Universal-3 Pro	3.33 s	4.79 s	5.21 s

n=150/provider · single-location, includes network RTT · measured 2026-06-01
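For readers who want to reproduce numbers like these, here is a minimal sketch of how p50/p95/p99 can be derived from raw timings. It assumes the nearest-rank percentile method and sequential requests; the page does not say which method or harness was actually used, and `time_calls`, `nearest_rank`, and `latency_percentiles` are hypothetical helper names. The `call` argument stands in for whatever function sends audio to a provider and waits for the transcript.

```python
import time
from typing import Callable, Dict, Iterable, List, Sequence


def time_calls(call: Callable[[], object], n: int = 150) -> List[float]:
    """Time n sequential calls end to end and return latencies in milliseconds.

    `call` is a placeholder for a real transcription request (assumption).
    """
    samples = []
    for _ in range(n):
        start = time.perf_counter()
        call()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def nearest_rank(sorted_ms: Sequence[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    pct% of all samples are less than or equal to it."""
    if not sorted_ms:
        raise ValueError("need at least one sample")
    # Integer form of ceil(pct * n / 100); ranks are 1-indexed.
    rank = max(1, (pct * len(sorted_ms) + 99) // 100)
    return sorted_ms[rank - 1]


def latency_percentiles(samples_ms: Iterable[float]) -> Dict[str, float]:
    """Return the p50, p95, and p99 of a collection of latency samples."""
    ordered = sorted(samples_ms)
    return {f"p{p}": nearest_rank(ordered, p) for p in (50, 95, 99)}
```

Under this method, with n=150 the p99 is the 149th of 150 sorted samples, so one or two slow requests are enough to set it.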

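OpenAI GPT-4o Transcribe has the fastest median (974 ms) but a ~10 s p99, about six times its p95; the median hides that tail. ElevenLabs Scribe v2 is both fast and tight, with p99 under 2 s. Deepgram Nova-3 and AssemblyAI Universal-3 Pro are slower, but their tails are consistent: p99 sits within 1.4 s of p95 for both.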
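One way to put a number on "tight" is the ratio of p99 to p50. The sketch below computes it from the table values, converted to seconds; the ratio is an illustrative metric chosen here, not one the page itself reports, and `tail_ratios` is a hypothetical helper name.

```python
from typing import Dict, Tuple

# (p50, p95, p99) in seconds, copied from the table above.
LATENCY_S: Dict[str, Tuple[float, float, float]] = {
    "OpenAI GPT-4o Transcribe": (0.974, 1.65, 10.08),
    "ElevenLabs Scribe v2": (1.01, 1.57, 1.94),
    "Deepgram Nova-3": (2.22, 3.25, 4.62),
    "AssemblyAI Universal-3 Pro": (3.33, 4.79, 5.21),
}


def tail_ratios(table: Dict[str, Tuple[float, float, float]]) -> Dict[str, float]:
    """Return p99 / p50 per provider; values near 1 mean a tight distribution."""
    return {name: p99 / p50 for name, (p50, _p95, p99) in table.items()}


if __name__ == "__main__":
    ratios = tail_ratios(LATENCY_S)
    # Print the tightest provider first.
    for name, ratio in sorted(ratios.items(), key=lambda item: item[1]):
        print(f"{name}: p99 is {ratio:.1f}x p50")
```

On these figures the ratio is about 10.3x for OpenAI GPT-4o Transcribe and between roughly 1.6x and 2.1x for the other three.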