Can Voice Agents Handle Bilingual Customers? Benchmarking Frontier ASR on Code-Switched Speech
ServiceNow AI published a blog post on Hugging Face titled 'Can Voice Agents Handle Bilingual Customers? Benchmarking Frontier ASR on Code-Switched Speech.' The post benchmarks frontier automatic speech recognition (ASR) systems on code-switched speech, in which speakers alternate between languages within a single utterance or conversation, as bilingual customers often do when talking to voice agents. It evaluates several ASR models on code-switched datasets and reports significant variation in accuracy across models, with some frontier systems struggling to match their performance in monolingual settings. The provided excerpt does not name the models or give benchmark scores, but it underscores the need for ASR that handles multilingual and code-switched input effectively. The consequence is that current voice agents may underperform for bilingual users, which could hurt user experience and adoption in multilingual markets.
Voice agents may fail bilingual users, limiting adoption in multilingual markets.
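
The excerpt does not say which accuracy metric the benchmark uses. ASR accuracy is commonly reported as word error rate (WER), so the sketch below assumes WER purely for illustration. The function name `word_error_rate` and both transcripts are invented for this example and are not taken from the post.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    # Simplistic normalization (an assumption): lowercase and split on whitespace.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Invented English-Spanish utterance: the recognizer output replaces the
# two Spanish words with similar-sounding English ones.
reference = "i want to check my saldo por favor"
hypothesis = "i want to check my salad for favor"
code_switched_wer = word_error_rate(reference, hypothesis)  # 2 errors / 8 words = 0.25
```

Comparing a score like this against the same model's WER on monolingual audio is one way to quantify the gap the post describes. A real evaluation would likely need more careful text normalization, especially for language pairs written in different scripts.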