Sensory at the Intel Developer Forum September 17th, 2011
I decided to pop up to San Francisco this week to hit the Intel Developer Forum. It’s open to the public, but it’s really more of a show-and-tell for Intel employees than one given by them.
One of the sessions was entitled “Enhanced Experiences with Low Power Speech Recognition,” and this was my main reason for being there. Intel’s Devon Worrell gave a very nice presentation, focusing on the importance of a closed computer being not just a brick, but still having functionality in a low-power state. He put up a lot of compelling slides about using speech recognition in this mode, and emphasized the need for low-power command and control with an always-on, always-listening device that responds to commands…hmmmm…sounds like a page right out of the Sensory bible!
Realtek appears to have been selected by Intel as a chip provider for the low-power speech recognition, and they presented at the session and even gave a demo of their in-house speech recognition technology. I wasn’t very impressed; the idea was for it to work over music with the user not speaking directly into the microphone. For the demo, however, the music was so quiet the audience could barely tell it was on, and the speaker spoke only a few inches from the mic. I had a hard time telling whether it was working or not (well, that’s giving it the benefit of the doubt).
Jean-Marc Jot from DTS also spoke and gave an impressive presentation and demo. Of course, I’m very biased… The DTS speech recognition demo used Sensory’s TrulyHandsfree™ Voice Control. I was a bit nervous because of Jean-Marc’s French accent and the fact that DTS had created their own TrulyHandsfree trigger phrase, “Hello Jennifer,” without any assistance from Sensory. (As a side note, Sensory’s TrulyHandsfree 2.0 SUBSTANTIALLY improves performance, but there are a number of complex variables in our algorithm that are not accessible through our SDKs, and therefore our customers cannot yet use the latest technology to its fullest extent unless Sensory fine-tunes the vocabularies in-house.) So…Jean-Marc was demoing our earliest incarnation of TrulyHandsfree Voice Control, with a French accent, in a noisy room, and with a command set that Sensory had never reviewed.
The demo was AWESOME. Jean-Marc spoke about 3 feet from the mic, and said commands like “Hey Jennifer…play Lady Gaga.” The music was cranked up really loud, and Jean-Marc spoke commands like “fast forward” and other music controls as well as calling up songs by name. I have a habit of counting speech recognition errors… On the trigger there were no false positives (accidental firing), and only 2 false negatives (where Jean-Marc needed to repeat the trigger phrase). That was 2 out of about 30 or 40 uses, which works out to roughly 93% to 95% acceptance accuracy in high noise, and the phrases following the trigger had about the same high accuracy.
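For anyone who wants to check my back-of-the-envelope math, here is a minimal sketch of the calculation. The function name `acceptance_rate` is just something I made up for illustration, and the attempt counts are my rough estimate from the audience (somewhere between 30 and 40), not a logged measurement.

```python
def acceptance_rate(false_negatives: int, attempts: int) -> float:
    """Fraction of trigger attempts that were accepted on the first try."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    if not 0 <= false_negatives <= attempts:
        raise ValueError("false_negatives must be between 0 and attempts")
    return (attempts - false_negatives) / attempts


# I counted 2 missed triggers; the total number of attempts is an estimate,
# so compute the rate at both ends of the estimated range.
for attempts in (30, 40):
    rate = acceptance_rate(false_negatives=2, attempts=attempts)
    print(f"{attempts} attempts: {rate:.1%} accepted")
```

Since I only had a rough count of attempts, the honest answer is a range rather than a single number.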
Sweet Demo of how speech recognition can work in a low-power mode and be always on and listening for commands even in high noise situations!
Voice Search and Other Videos September 16th, 2010
Google seems to be putting a bit of promotion behind the Android Voice Search capabilities with a campaign called “What You Say is What You Search.” A few months back they announced that 25% of all Android-based search functions are done by voice, and now they are blogging and creating videos to promote this WONDERFUL capability. My favorite Google voice search video is the informative one Mike LeBeau did for Voice Actions. I like it because Mike is a real person who really works for Google and knows his stuff…more charisma than Justin Long (you know, Apple’s old Mac guy), and he’s not a paid actor.
Seems that a big part of the Google message is “IT WORKS!”…unfortunately there are a lot more videos showing that speech recognition doesn’t work. Searching for “speech recognition” or “voice recognition” on YouTube and sorting by most-watched reveals that the most popular speech videos are the mistakes or “fails,” with some of these being real demos by Microsoft among others. Many are pretty humorous…
Here are my favorite funny speech recognition videos:
- Jimmy Kimmel’s Cousin Sal:
- The Voice-Activated Elevator: I get a special kick out of this, knowing that Sensory has been approached half a dozen or more times by elevator companies wanting to do this (and it’s always a highly confidential amazing idea that they think nobody else has ever thought of!)
- And of course there are the movie clips:
Sensory has produced a variety of low budget in-house videos, and although they are not very funny, they showcase our unique technologies. I’ll have my VP of Sales post a blog about these soon.