The industry has moved from punch card to keyboard and from mouse to touchscreen, all in pursuit of more direct system manipulation and a better user experience (UX) in today’s increasingly mobile and interconnected world. Each of these still places a physical device between the user and the system, though, and voice control has been heralded as the next step toward a more natural UX. Unfortunately, today’s solutions cannot give machines what they need to understand people. The result is poor performance and no convenient way to control a new generation of voice-only products and services.
One of the biggest impediments to satisfactory voice control performance has been ambient noise, including nearby conversations, outdoor sounds, and reverberation in certain indoor environments. Using multiple acoustic microphones and microphone arrays to improve directional acquisition has proven both expensive and unable to isolate the speaker well enough for reliable voice control. Now, a new approach is available that uses optical lasers and interferometry techniques to gather additional critical information exclusively about the user communicating with a device. Combining this optical information with the output from an acoustic microphone gives automatic speech recognition (ASR) engines something they have never had before: a near-perfect reference audio signal taken directly from the speaker’s facial vibrations, regardless of noise levels.