I’ve spent the last six months working on a startup, building agent prototypes for one of the largest consumer packaged goods companies in the world. As part of that work, our team relied on off-the-shelf voice agent platforms to help the company operate more effectively. Though I can’t go into the business details, the technical takeaway was clear: voice agents are powerful, and there are brilliant off-the-shelf abstractions like Vapi and ElevenLabs that make spinning up voice agents a breeze. But: these abstractions also hide a surprising amount of complexity.
Essential digital access to quality FT journalism on any device. Pay a year upfront and save 20%.
,推荐阅读搜狗输入法2026获取更多信息
Крупнейшая нефтяная компания мира задумалась об альтернативе для морских перевозок нефти14:56
For comparison, the equivalent configuration in Vapi - using the same STT, LLM, and TTS models - estimates around ~840ms. In this setup, the custom orchestration actually beats Vapi's own estimates by about 50ms.
I tried going through back channels. I messaged the pharmacy which was using this courier. I talked to my prescriber, who was shocked at this issue. And the next time I got a delivery, it came via UPS instead (they do not have a leaky sieve for a tracking page, but they did "lose" my prescription once).