Speech Recognition in the Car: Are You Siri-ous?
By Peter Mui | Wednesday, January 27, 2016
Speech recognition has been around for a long time. Early products that ran on your PC required voice training and targeted professionals in vertical industries with highly specialized vocabularies (e.g., doctors). As speech recognition algorithms improved, computer processing got better, and more memory became available to store ever-larger dictionaries of words and phrases, products like Nuance’s Dragon Dictation eventually got to the point where they’re largely general purpose and work very well.
Yet speech recognition did not come into wide use until the advent of the smartphone. With small screens, fiddly typing and users constantly on the go, the ability to talk to your phone for commands and dictation became increasingly useful. Now, you no longer have to buy a separate application. Speech recognition is built into the three major smartphone operating systems: Apple has Siri, Google has Voice Search and Microsoft has Cortana. All three use cloud-based “deep learning” algorithms and background noise filtering. Under ideal conditions, e.g., a strong cell phone data signal, a relatively quiet environment and minimal wind noise, they all work pretty well. Outside of those conditions the experience degrades and gets frustrating really fast.
It seems logical that we would want to have speech recognition in our cars. The ability to manipulate not just our phones but the climate control, sound system and navigation system using voice seems compelling. But it needs to work really well. Bad speech recognition in the car means processing delays, misinterpreted commands or missed requests, all of which distract the driver and ultimately compromise safety.
Speech recognition in a car via a smartphone-derived OS such as Apple’s CarPlay (Siri), Google’s Android Auto (Voice Search) and Microsoft Windows Embedded Automotive (Cortana) is going to be difficult because these systems are inherently cloud-based. Depending on a reliable cell phone signal and dealing with the hand-off between cell towers is going to lead to an unsatisfying in-car experience. And, to their chagrin, the auto manufacturers who adopt the speech recognition in these smartphone-based systems are ultimately going to be blamed for the lousy customer experience.
For the experience to be satisfying, the speech recognition processing is going to need to be done locally. That’s hard to do on a smartphone because it has limited memory and a low-power processor; it’s easier to do in a car because it can have a server-grade processor and lots of on-board memory. Even with the massive cloud-based processing and deep learning touted by the smartphone OS vendors, I suspect that local, on-board speech recognition in the car is going to be the only acceptable solution, if for no other reason than the latency of communicating through cell phone towers.
Something may be lost by not accessing the extensive vocabulary and context databases in the cloud, but the fixed, quiet environment of the auto cabin means that background noise filtering should be considerably easier. If the auto manufacturers build multiple microphones into the cabin and combine their signals with some reasonable signal processing, they could feed a clean, high-quality voice signal into the local speech recognition engine, improving its accuracy.
(Bosch, a major supplier to the auto industry, acquired Akustica, a company that develops microphones manufactured in a new, low-cost way, in 2009.)
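To make the multi-microphone idea concrete, here is a minimal sketch of delay-and-sum beamforming, one common way to combine several microphone signals. It is an illustration under stated assumptions, not a description of what any auto manufacturer or supplier actually ships: the four-microphone layout, the sample delays, the noise level and the function names (`delay_and_sum`, `simulate_mic`, `rms_error`) are all hypothetical, and the “voice” is just a simulated tone.

```python
import math
import random


def delay_and_sum(channels, delays):
    """Align each microphone channel by its sample delay, then average.

    channels: list of equal-length sample lists, one per microphone.
    delays:   delays[i] is how many samples later the voice reaches mic i.
    """
    usable = len(channels[0]) - max(delays)
    return [
        sum(ch[t + d] for ch, d in zip(channels, delays)) / len(channels)
        for t in range(usable)
    ]


def rms_error(signal, reference):
    """Root-mean-square difference over the overlapping samples."""
    pairs = list(zip(signal, reference))
    return math.sqrt(sum((s - r) ** 2 for s, r in pairs) / len(pairs))


random.seed(0)  # fixed seed so the simulation is repeatable
SAMPLE_RATE = 16_000  # assumed sampling rate in Hz

# Stand-in for the driver's voice: a 200 Hz tone.
voice = [math.sin(2 * math.pi * 200 * t / SAMPLE_RATE) for t in range(2000)]

# Hypothetical arrival delays (in samples) at four cabin microphones.
delays = [0, 3, 5, 8]


def simulate_mic(delay, noise_std=0.5):
    """The voice arrives `delay` samples late, plus independent noise."""
    delayed = [0.0] * delay + voice[: len(voice) - delay]
    return [s + random.gauss(0.0, noise_std) for s in delayed]


channels = [simulate_mic(d) for d in delays]
enhanced = delay_and_sum(channels, delays)

single_mic_error = rms_error(channels[0], voice)
beamformed_error = rms_error(enhanced, voice)
print(f"single mic error: {single_mic_error:.3f}")
print(f"beamformed error: {beamformed_error:.3f}")
```

Because the voice lines up across the aligned channels while each microphone’s noise is independent, averaging keeps the voice and partially cancels the noise. A real cabin would also have to cope with echoes, correlated road noise and delays that are not known in advance, so treat this only as a picture of the principle.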
Speech recognition can differentiate auto manufacturers both from each other and from smartphone manufacturers. However, the auto manufacturers are being challenged on so many fronts these days: are they up to this challenge as well?
ICS is working with industry giants in the In-Vehicle Infotainment (IVI) sector and has been a part of some of the innovative growth within that industry. Our participation in GENIVI, a non-profit industry alliance committed to driving the broad adoption of specified, open source IVI software, has led to ICS’s Qt-based HMI Solution being supported by GENIVI. You can read more about it using the link below.
The efforts of many are ‘Siri-ously’ helping to create powerful and adaptable platforms that auto manufacturers can use to build sophisticated IVIs for their customers.
GENIVI, GENIVI/HMI Press Release, website link last accessed January 27, 2016, http://www.ics.com/company/news/icss-qt-based-hmi-solution-now-supported-genivi-gdp