Blog
Write-ups on how the benchmarks are built and what the data shows.
-
Speech-to-Speech Got Smart. It Still Can't Replace the Cascade.
Speech-to-speech models closed the reasoning gap with text LLMs in 2026. The gap that's left (observability, cost predictability, component swapping) is the one that actually decides production architecture.
-
We Tried to Break Four Voice Agents with a Cough. We Failed.
We fed the same 400 ms cough into four production voice-agent stacks: OpenAI Realtime default, OpenAI Realtime tuned, cascaded Deepgram Nova-3, and cascaded ElevenLabs Scribe v2. Four for four absorbed it without yielding. Here are the clips, and the engineering that explains why.