AI-Based Hands-Free App Gets Even Smarter

Sensory, developer of wake words for personal assistants, says it has made significant upgrades to the embedded artificial intelligence (AI) in its sixth generation of TrulyHandsfree, boosting the technology’s wake word performance and accuracy by more than 65%. The company has also improved the application’s deep neural network training, which allows for better near- and far-field speech recognition performance in all room conditions.

 

Version 6.0 improves performance and word recognition accuracy relative to the last two generations of TrulyHandsfree by reducing wake word false positives by more than 65%. This reduction is due to a high-resolution speech feature front-end that works from a higher-resolution digitized representation of the speech audio, combined with the introduction of on-device wake word post-qualification. Post-qualification uses intelligence about wake word events to discriminate against false positives.
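The post-qualification step described above can be pictured as a second-stage filter that re-examines each candidate wake word event before the device reacts. Sensory has not published how its post-qualifier works, so the sketch below is only an illustration under stated assumptions: the event fields (score, duration, signal-to-noise ratio), the thresholds and the function name are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class WakeEvent:
    """A candidate wake word event from a first-stage detector (hypothetical fields)."""
    score: float       # first-stage confidence, 0.0 to 1.0
    duration_s: float  # length of the detected phrase in seconds
    snr_db: float      # estimated signal-to-noise ratio of the segment


def post_qualify(event, min_score=0.6, min_dur=0.3, max_dur=1.5, min_snr_db=3.0):
    """Accept the event only if every property looks plausible.

    All thresholds are illustrative assumptions, not values from Sensory.
    """
    if event.score < min_score:
        return False
    if not (min_dur <= event.duration_s <= max_dur):
        return False
    return event.snr_db >= min_snr_db
```

In this sketch, a high-scoring event that is far too short to be a spoken wake word would be rejected, which is one way a second stage can trim false positives without changing the first-stage detector.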

 

In addition to enhanced wake word performance, Sensory’s latest high-resolution speech feature front-end has improved upon what was already deemed the benchmark for embedded speech recognizers and contributed a significant boost in TrulyHandsfree 6.0’s speech recognition performance and accuracy over previous generations. The accuracy gains in the speech feature front-end also enable support for multiple wake words, like “Okay Google,” “Alexa™,” “Hey Cortana,” “Hey Siri” and “Xiaodu, Xiaodu.” This allows device makers to create products with a user-friendly voice interface that works with more than one digital assistant technology.

 

Accents, Dialects and Varying Room Conditions

 

Sensory upgraded the machine learning within TrulyHandsfree to take advantage of high-resolution audio information for deep neural network training. This improved training allows the algorithms to anticipate a variety of factors associated with wake word performance, including how one person, or a population of people, may pronounce a wake word. It also takes into consideration acoustic challenges like various room configurations, device placement, room size, reverb and echo. The aim is reliable always-listening speech recognition performance regardless of where the device is placed in a room.

 

Enhanced Security

 

TrulyHandsfree 6.0 does all processing on device, keeping voice data completely safe by never storing it or sending it to the cloud. Additionally, TrulyHandsfree 6.0 includes a layer of voice biometrics recognition in the voice interface for user authentication and security. TrulyHandsfree’s embedded high-resolution voice enrollment and speaker verification (SV) technology is flexible, allowing users to enroll their voice and their own custom wake word or passphrase, restricting unauthorized users from accessing the voice user interface. Even if an unauthorized person learns the custom wake word or passphrase, Sensory’s voice biometrics technology will recognize that it’s not the enrolled user speaking and not authenticate them.
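The two-factor idea in the paragraph above, where the right phrase is not enough without the right voice, can be sketched with a generic speaker-verification pattern: compare a voiceprint vector captured at enrollment with one computed from the incoming utterance. This is a minimal sketch of a common technique, not Sensory's method. The embedding vectors, the 0.8 threshold and the function names are assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def authenticate(phrase_matched, enrolled_voiceprint, candidate_voiceprint, threshold=0.8):
    """Grant access only if the phrase matched AND the voice resembles the enrolled user.

    The voiceprints stand in for speaker embeddings; the threshold is hypothetical.
    """
    if not phrase_matched:
        return False
    return cosine_similarity(enrolled_voiceprint, candidate_voiceprint) >= threshold
```

Under these assumptions, an impostor who speaks the correct passphrase still fails because the voiceprint comparison falls below the threshold.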

 

Barge-in and Far-Field Performance

 

Sensory offers an upgraded acoustic echo cancellation (AEC) solution, specifically tuned to provide an ideal voice barge-in experience with TrulyHandsfree, that supports single-microphone input systems with mono or stereo sound sources. This technology allows users to interrupt their devices by saying the wake word while the device is in the middle of playing voice prompts, music or other sounds, at any volume level. The high-resolution speech feature front-end and AEC also play major roles in making the app better at hearing users in a variety of room sizes and configurations, making TrulyHandsfree 6.0’s far-field wake word performance second to none.
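Barge-in depends on removing the device's own playback from the microphone signal so the wake word detector hears mostly the user. A textbook way to do this is a normalized least-mean-squares (NLMS) adaptive filter, shown below for a single microphone and a mono playback reference. The article does not say which algorithm Sensory's AEC uses, so NLMS is swapped in here purely as an illustration, and the tap count and step size are assumed values.

```python
def nlms_echo_cancel(mic, ref, taps=8, mu=0.5, eps=1e-8):
    """Subtract an adaptive estimate of the playback echo from the mic signal.

    mic: microphone samples (echo plus any near-end speech)
    ref: playback samples sent to the loudspeaker
    Returns the residual signal after echo removal.
    """
    weights = [0.0] * taps   # current estimate of the echo path
    history = [0.0] * taps   # most recent reference samples, newest first
    residual = []
    for m, r in zip(mic, ref):
        history = [r] + history[:-1]
        echo_estimate = sum(w * x for w, x in zip(weights, history))
        error = m - echo_estimate
        power = sum(x * x for x in history) + eps
        # Normalized update: step size scaled by the reference signal power
        weights = [w + mu * error * x / power for w, x in zip(weights, history)]
        residual.append(error)
    return residual
```

Once the filter has adapted, the residual contains little of the playback, so a wake word spoken over music stands out more clearly to the detector.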

 

TrulyHandsfree supports US English, UK English, Arabic, Dutch, French, German, Italian, Japanese, Korean, Mandarin, Portuguese, Russian, Spanish, Swedish and Turkish. The TrulyHandsfree SDK is available for Android, iOS, Linux, QNX and Windows. Sensory provides developer support for cloud service interfaces on Linux, Android, iOS and Windows as well as support for dozens of proprietary DSPs, microcontrollers, smart microphones and other low-power embedded devices.

 

Additionally, ultra-low-power deeply embedded ports of TrulyHandsfree are available for leading DSP/MCU IP cores from ARM, Cadence, CEVA, NXP, Synopsys and Verisilicon, as well as for integrated circuits from Ambiq Micro, Analog Devices, Cirrus Logic, DSP Group, Fortemedia, Intel, Knowles, Microchip (Microsemi), NXP, Qualcomm, QuickLogic, Realtek, Synaptics, STMicroelectronics, TI, Yamaha, and XMOS. For more details, visit Sensory.