Random Blogger Thoughts June 30th, 2014
- TrulySecure™ is now announced!!!! This is the first on-device fusion of voice and vision for authentication, and it really works AMAZINGLY well. I’m so proud of our new computer vision team and of Sensory’s expansion from speech recognition to speech and vision technologies. Now we are much more than “The Leader in Speech Technologies for Consumer Electronics”: we are “The Leader in Speech and Vision Technology for Consumer Products!” Hey, check out the new TrulySecure video on our home page, and our new TrulySecure Product Brief. We hope and expect that TrulySecure will have the same HUGE impact on the market as Sensory had with TrulyHandsfree, the technology that pioneered always-on touchless control!
- Google I/O. Android wants to be everywhere: in our cars, in our homes, and in our phones. They are willing to spend billions of dollars to do it. Why? To observe our behaviors, which in turn will help provide us more of what we want…and they will also assist in those purchases. Of course this is what Microsoft and Apple and others want as well, but right now Google has the best cloud-based voice experience, and if you ask me, it’s the best user experience that will win the game. Seems like they should try to move ahead on the client, but lucky for Sensory, we are staying ahead!
- Rumors about Samsung acquiring Nuance…Why would they spend $7B for Nuance when they can pick up a more unique solution from Sensory for only $1B? Yeah, that’s a joke, and is definitely not intended as an offer or solicitation to sell Sensory!
- OH! Sensory has a new logo! We made it to celebrate our 20-year anniversary!
Touch-less Control Wins! June 9th, 2014
I still subscribe to the San Jose Mercury News, as they do a good job of tech business reporting. One of my favorite Mercury News writers is a true critic in the literary sense of the term, Troy Wolverton. Troy rarely raves and is typically critical, but in a smart, logical, and unemotional way.
A few days back he started writing about Microsoft’s Cortana and said “Watch out Siri, someone wants your job.”
I was eager to read his review of Cortana this morning, and in particular his comparison with Siri. He ended up giving it a 7/10 and concluding that Siri was still ahead. What I thought was most interesting, though, was that in his final summary he compared three products and three assistants based on the ease of calling up each of those assistants:
- Cortana – required two touch steps to activate the personal voice assistant
- Siri – required one touch step to activate the personal voice assistant
- Moto X – The best, because you can just start talking with the keyword phrase “OK Google Now”, making for a TrulyHandsfree experience!!
Motorola is Sensory’s customer, and I am happy to read that Troy gets it and considers this front-end activation an important metric in comparing personal assistants!
Hey Siri what’s really in iOS8? June 4th, 2014
It was about 4 years ago that Sensory partnered with Vlingo to create a voice assistant with a special “in car” mode that would allow the user to just say “Hey Vlingo” then ask any question. This was one of the first “TrulyHandsfree” voice experiences on a mobile phone, and it was this feature that was often cited for giving Vlingo the lead in the mobile assistant wars (and helped lead to their acquisition by Nuance).
About 2 years ago Sensory introduced a few new concepts including “trigger to search” and our “deeply embedded” ultra-low power always listening (now down to under 2mW, including audio subsystem!). Motorola took advantage of these excellent approaches from Sensory and created what I most biasedly think is the best voice experience on a mobile phone. Samsung too has taken the Sensory technology and used it in a number of very innovative ways, going beyond mere triggers and using the same noise-robust technology for what I call “sometimes always listening”. For example, when the camera is open it is always listening for “shoot”, “photo”, “cheese”, and a few other words.
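The “sometimes always listening” idea can be pictured as a vocabulary that switches on only in certain contexts. Here is a minimal sketch under loud assumptions: the function names and the context table are hypothetical, the only phrases used are the camera words mentioned above, and a real recognizer matches audio rather than text. This is an illustration, not how any shipping phone implements it.

```python
# Hypothetical table: which phrases the recognizer listens for in each context.
# Only the camera words come from the post; the structure is an assumption.
CONTEXT_VOCAB = {
    "camera": {"shoot", "photo", "cheese"},
}


def active_phrases(foreground_app):
    """Return the phrases that are live while `foreground_app` is open."""
    return CONTEXT_VOCAB.get(foreground_app, set())


def handle_utterance(foreground_app, utterance):
    """Return the matched phrase, or None if nothing is being listened for.

    Text matching stands in for real keyword spotting on audio.
    """
    phrase = utterance.strip().lower()
    if phrase in active_phrases(foreground_app):
        return phrase
    return None


camera_result = handle_utterance("camera", "Cheese")
home_result = handle_utterance("home_screen", "cheese")
print(camera_result, home_result)
```

With the camera open, “Cheese” matches; on the home screen the same word is ignored because no vocabulary is active there.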
So I’m curious about what Google, Microsoft, and Apple will do to push the boundaries of voice control further. Clearly all 3 like this “sometimes always on” approach, as they don’t appear to be offering the low power options that Motorola has enabled. At Apple’s WWDC there wasn’t much talk about Siri, but what they did say seemed quite similar to what Sensory and Vlingo did together 4 years ago…enable an in car mode that can be triggered by “Hey Siri” when the phone is plugged in and charging.
I don’t think that will be all…I’m looking forward to seeing what’s really in store for Siri. They have hired a lot of smart people, and I know something good is coming that will make me go back to the iPhone, but for now it’s Moto and Samsung for me!
Nick Bilton, in a New York Times article, cites Forrester Research analysts who point out the importance of software in differentiating and creating value in the wearables market while avoiding commoditization.
While the new hardware is fun and exciting for consumers, the ultimate value will come from creating a connection and engaging the consumers with effective and useful analysis of all the data collected. And in the small wearable form factor, the user interface is always going to be critical. With little or no room for buttons and displays, and not always having a smartphone handy to run an app, voice will increasingly become the user interface of choice for these devices.
Sensory is very well positioned to support voice user interfaces for wearables with ultra-low power implementations that can be woken by a gesture, and quickly respond to commands or shut down to minimize impact on battery life. Watch this space (pun intended) for product announcements of wearables with great voice user interfaces!
Biometrics – The Studies Don’t Reveal the Truth May 7th, 2014
If you read through the biometrics literature you will see a general security-based ranking of biometric techniques starting with retinal scans as the most secure, followed by iris, hand geometry and fingerprint, voice, face recognition, and then a variety of behavioral characteristics.
The problem is that these studies have more to do with “in theory” than “in practice” on a mobile phone, but they nevertheless mislead many companies into thinking that a single biometric can provide the results required. This is really not the case in practice. Most companies will require that False Accepts (errors caused by the wrong person or thing getting in) and False Rejects (errors caused by the right person not getting in) be so low that the rate where these two are equal (equal error rate or EER) would be well under 1% across all conditions. Here’s why the studies don’t reflect the real world of a mobile phone user:
- Cost is key. Mobile phone manufacturers will not be willing to invest in the highest end approaches for capturing and measuring biometrics that are used by academic studies. This means fewer MIPS, less memory, and poorer-quality readers.
- Size matters. Mobile phone manufacturers have extremely limited real estate, so larger systems cannot be properly deployed. Further complicating things, enrollment and usage must be extremely fast and cannot require a form factor change.
- Conditions are uncontrollable. Noisy environments, lighting, dirty hands, and oily screens/cameras/readers are all uncontrollable and will affect performance.
- User compliance cannot be assumed. The careful placement of an eye, finger or face does not always happen.
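To make the False Accept, False Reject, and EER terms above concrete, here is a minimal sketch of how the two error rates and the point where they cross could be computed from match scores. The scores are made up for illustration and the function names are my own; real evaluations use far more samples, across all the conditions listed above.

```python
def error_rates(genuine, impostor, threshold):
    """Return (false_accept_rate, false_reject_rate) at a threshold.

    A score >= threshold counts as "accepted".
    """
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


def equal_error_rate(genuine, impostor):
    """Scan candidate thresholds; return (threshold, far, frr) at the
    threshold where the two error rates are closest to each other."""
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = error_rates(genuine, impostor, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, t, far, frr)
    _, t, far, frr = best
    return t, far, frr


# Made-up match scores in [0, 1], for illustration only.
genuine = [0.9, 0.8, 0.75, 0.6, 0.4]    # the right person
impostor = [0.1, 0.2, 0.3, 0.5, 0.65]   # the wrong person

threshold, far, frr = equal_error_rate(genuine, impostor)
print(f"threshold={threshold} FAR={far:.0%} FRR={frr:.0%}")
```

On this toy data both error rates meet at 20%, which is nowhere near the “well under 1%” target; that gap is the point of the argument here.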
A great case in point is the fingerprint readers now deployed by Apple and Samsung. These are extremely expensive devices, and the literature would make one think that they are highly accurate, but Apple doesn’t have the confidence to allow them to be used in the iTunes store for ID, and San Jose Mercury News columnist Troy Wolverton says:
“I’ve not been terribly happy with the fingerprint reader on my iPhone, but it puts the one on the S5 to shame. Samsung’s fingerprint sensor failed repeatedly. At best, I would get it to recognize my print on the second try. But quite often, it would fail so many times in a row that I’d be prompted to enter my password instead. I ended up turning it off because it was so unreliable.” (full article)
There is a solution to this problem: use sensors already on the phone to minimize cost, and deploy a biometric chain combining face verification, voice verification, or other techniques that are easy to implement in a user-friendly manner. Used in combination, they can reach a very low equal error rate, and a series of biometric and other secure backup systems makes the chain “immune” to conditions and compliance issues.
Sensory has an approach we call SMART (Sensory Methodology for Adaptive Recognition Thresholding). It looks at environmental and usage conditions and intelligently deploys thresholds across a multitude of biometric technologies to yield a highly accurate solution that is easy to use and fast to respond, yet robust to environmental and usage models, AND uses existing hardware to keep costs low.
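As a rough illustration of the general idea of combining biometrics and adapting to conditions, here is a generic weighted-sum fusion sketch. To be clear, this is NOT the SMART algorithm, whose details are not described here; every weight, cutoff, and function name below is a hypothetical choice made for the example.

```python
def fuse_scores(face_score, voice_score, lux, noise_db):
    """Weighted-sum fusion of two match scores in [0, 1].

    The weight shifts away from a modality whose capture conditions
    look poor. All constants are hypothetical.
    """
    face_weight = 0.2 if lux < 50 else 0.5        # dim light: trust face less
    voice_weight = 0.2 if noise_db > 70 else 0.5  # loud room: trust voice less
    total = face_weight + voice_weight
    return (face_weight * face_score + voice_weight * voice_score) / total


def authenticate(face_score, voice_score, lux, noise_db, threshold=0.7):
    """Accept on a strong fused score, otherwise fall back to a passcode."""
    fused = fuse_scores(face_score, voice_score, lux, noise_db)
    if fused >= threshold:
        return "accept"
    return "fallback_to_passcode"


# Bright, quiet room: both modalities count equally.
bright_quiet = authenticate(0.9, 0.8, lux=300, noise_db=40)

# Dark, quiet room: a weak face score is down-weighted and a strong
# voice score carries the decision.
dark_quiet = authenticate(0.3, 0.9, lux=10, noise_db=40)

# Dark, loud room: neither modality is convincing, so use the backup.
dark_loud = authenticate(0.3, 0.4, lux=10, noise_db=80)

print(bright_quiet, dark_quiet, dark_loud)
```

The dark-room case shows why adapting to conditions matters: with fixed equal weights the same two scores would average 0.6 and the right user would be rejected.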
Mobile phones – It doesn’t have to be Cost OR Quality! April 25th, 2014
It’s not often that I rave about articles I read, but Ian Mansfield of Cellular News hit the nail on the head with this article.
Not only is it a well-written and concise article, but it’s chock-full of recent data (primarily from JD Power research), and most importantly it’s data that tells a very interesting story that nicely aligns with Sensory’s strategy in mobile. So, thanks Ian, for getting me off my butt to start blogging again!
A few key points from the article:
- Price is becoming increasingly important in the choice of mobile phones, and simultaneously the prices of mobile phones are increasing.
- Although price might be the most important factor in choice, the overall customer satisfaction is driven by features.
- The features customers want are seamless voice control (36%); built-in sensors that can gauge temperature, lighting, noise and moods to customize settings to the environment (35%); and facial recognition and biometric security (28%).
- As everyone knows, Samsung and Apple have the overwhelming market share in mobile phones, but interesting to me was that they also both lead in customer satisfaction.
Now, let me dive one step deeper into the problem, and explore whether customer satisfaction can be achieved with minimal impact on cost:
Seamless voice control is here and soon every phone will have it, and it doesn’t add any hardware cost. Sensory introduced the technology with our TrulyHandsfree technology that allows users to just start talking, and our “trigger to search” technology has been nicely deployed by companies like Motorola that pioneered this “seamless voice control” in many of their recent releases. The seamless voice control really doesn’t add much cost, and with excellent engines from Google and Apple and Microsoft sitting in the clouds, it can and will be nicely implemented without affecting handset pricing.
Sensors are a different story. By their nature they will be embedded into the phones and will increase cost. Some “sensors” in the broadest sense of the term are no-brainers and necessities: microphones and cameras are must-haves, and GPS and the six-axis sensors combining accelerometers and gyroscopes are arguably must-haves as well. Magnetometers and barometers are becoming increasingly common, and to differentiate further, leading manufacturers are embedding things like heartbeat monitors; stereo 3D cameras are just around the corner. To address the desire for biometric security, Samsung and Apple have embedded fingerprint sensors in the 2 bestselling phones in the world!
The problem is that all these sensors add cost, and in particular those fingerprint sensors are the most expensive and can add $5-$15 to the cost of goods. It’s kind of ironic that after spending all that money on biometric security, Apple doesn’t even allow them as a security measure for iTunes purchases. And both Samsung and Apple have been chastised for fingerprint sensors that can be cracked with gummy bears or glue!
A much more accurate and cost-effective solution can be achieved for biometrics by using the EXISTING sensors on the phones and not adding special-purpose biometric sensors. In particular, the “must have sensors” like microphones, cameras, and 6-axis sensors can create a more secure environment that is just as seamless but much more difficult to crack. I’ll talk more about that in my next blog.
Mobile Voice 2014 and The Year of Wearables February 5th, 2014
Everyone seems to be talking about this as the year of the wearable. I don’t think so. Even if Apple does introduce a watch, and Google widely releases Glass, will they really go mainstream and sell hundreds of millions of units? I don’t think so. At least not for a few years. IMHO there need to be a few major breakthroughs:
- Apps. Yeah we always need a killer app. I don’t think sending little messages and alerts is enough. The killer app could be a great music player…maybe one that’s completely voice controlled? Glass has the potential to augment my knowledge without my asking and that could be really cool, basically look and learn!
- Power. Why hasn’t battery power advanced beyond lithium? I’m hoping for energy harvesting breakthroughs that will allow devices to last and be tiny…to fulfill the next point, invisibility.
- Invisibility. I stopped wearing a watch when I began carrying a smart phone. I never wear my wedding ring. I need something pretty comfortable and compelling to dangle electronics off of my body. What I really want is something invisible or near invisible. Moto has a tattoo patent for electronics, right? Then there’s the micro-electronic pills…when will we have seamless attachments to augment our abilities?
- Untethered. It would be really cool if I could travel around town and use a wearable without having to carry my phone. Needing the phone nearby kind of defeats the purpose, since it isn’t that hard to pull my phone out. If I could go a few miles that would be nice…20 would be even better. A completely untethered self-contained unit would be nice, but unlikely to be invisible!
I’ll be leading a Wearables panel at the Mobile Voice show with an AWESOME group of people representing thought leaders from Google, Pebble, Intel, Xowi, and reQall. Here’s the press release.
CES 2014 – Sensory and Wearables Everywhere! January 15th, 2014
I spent last week at CES in Las Vegas. What a show!
The big keynote speech was the night before the show started and was given by Brian Krzanich, Intel’s new CEO. His talk was focused on Wearables, and he demonstrated 3 wearable devices (charger, in-ear, and platform architecture). The platform demo included a live, on-stage use of speech recognition with the low-power wake-up provided by Sensory. The demo was a smashing success! Several bloggers called it a “canned” demo, assuming it couldn’t be live speech recognition if it worked so flawlessly!
I had a chance to walk through the Wearables area. Holy smoke, there must have been 20 or 30 smart watches, a similar number of health bands, and even a handful of glasses vendors. In fact, seeing attendees wearing Google’s Glass was quite commonplace. The smart watches mostly communicate with Bluetooth, and some of the smaller, lighter devices use Zigbee, ultra-low power Bluetooth, or ANT+ for wireless communications.
Sensory was all over CES. Here are some of the products and demos where Sensory’s salespeople were able to catch us:
- LG new Flex phone – Cool curved phone
- LG G2 phone – latest greatest phone from LG
- Samsung Note 3 – new Note product
- Samsung Android camera – command and control by Sensory!
- Samsung new 12.4 tablet
- Plantronics – miscellaneous headsets
- Intel – great keynote from Intel CEO, and behind closed doors platform demos
- Conexant – showing TV controlled by Sensory
- ivee – clock that controls home appliances
- Ubi – IoT product
- Motorola – Awesome Touchless Control feature on several phones
- Telenav – Scout navigation now hands-free
- Cadence – showing our music control demo
- Realtek – showing deeply embedded PC
- DSPG – great glasses (wearable) demo on low power chips
- Wolfson – trigger to search demo on low power chips
- Sensory voice command demo on CEVA TeakLite-4
Overall a great show for Sensory. Jeff Rogers, Sensory’s VP Sales told me, “A few people said they had searched out speech recognition products on the show floor to find the various speech vendors, and found that they all were using Sensory.”
Interview with the Scobleizer December 13th, 2013
Sensory’s PR firm set up an interview for me with Blogger Robert Scoble a couple of weeks ago. I showed up in their SFO office a few minutes early, and like clockwork Robert came through the front door, grabbed me and took me to a videotaping room (surprise to me…He’s a Video Blogger). They hooked up a mic and Robert said “my first question will be ‘who are you’…tell me something personal about you, and then we’ll move on to questions about your company”. I said OK, they ran a quick audio test and BAM we were off and done in 15 minutes!
KitKat’s Listening! November 15th, 2013
Android introduced the new KitKat OS for the Nexus 5, and Sensory has gotten lots of questions about the new “always listening” feature that allows a user to say “OK Google” followed by a Google Now search. Here are some of the common questions:
- Is it Sensory’s? Did it come from LG (like the hardware)? Is it Google’s in-house technology? I believe it was developed within the speech team at Android. LG does use Sensory’s technology in the G2, but this does not appear to be an implementation of Sensory. Google has one of the smartest, most capable, and largest speech recognition groups in the industry, and they certainly have the chops to build a keyword spotting technology. Actually, developing a voice-activated trigger is not very hard. There are several dozen companies that can do this today (including Qualcomm!). However, making it usable in an “always on” mode, where accuracy is really important, is very difficult.
- The KitKat trigger is just like the one on Moto X, right? Ugh, definitely not. Moto X really has “always on” capabilities. This requires low power operation. The Android approach consumes too much power to be left “always on”. Also, the Moto X approach combines speaker verification so the “wrong” users can’t just take over the phone with their voice. Motorola is a Sensory licensee; Android isn’t.
- How is Sensory’s trigger word technology different than others?
- First of all, Sensory’s approach is ultra low power. We have IC partners like Cirrus Logic, DSPG, Realtek, and Wolfson that are measuring current consumption in the 1.5-2mA range. My guess is that the KitKat implementation consumes 10-100 times more power than this. This is for 2 reasons: 1) we have implemented a “deeply embedded” approach on these tiny DSPs, and 2) Sensory’s approach requires as little as 5 MIPS, whereas most other recognizers need 10 to 100 times more processing power and must run on the power-hungry Android processor!
- Second…Sensory’s approach requires minimal memory. These small DSPs that run at ultra low power have less RAM and more limited memory access. The traditional approach to speech recognition is to collect tons of data and build huge models that take a lot of memory…very difficult to move this approach onto low power silicon.
- Thirdly, being left always on really pushes accuracy, and Sensory is VERY unique in the accuracy of its triggers. Accuracy is usually measured by looking at the two types of errors: “false accepts”, when it fires unintentionally, and “false rejects”, when it doesn’t let a person in when they say the right phrase. When there’s a short listening window, “false accepts” aren’t too much of an issue, and the KitKat implementation has very intentionally allowed a “loose” setting which I suspect would produce too many false accepts if it were left “always on”. For example, I found this YouTube video that shows “OK Google” works great, but so do “OK Barry” and “OK Jarvis”.
- Finally, Sensory has layered other technologies on top of the trigger, like speaker verification and speaker identification. Also, Sensory has implemented a “user defined trigger” capability that allows the end customer to define their own trigger, so the phone can accurately and at ultra low power respond to the user’s personalized commands!
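Two pieces of back-of-the-envelope arithmetic sit behind the power and accuracy points above, sketched here under stated assumptions. The 2000 mAh battery and the 2-per-hour false accept rate are hypothetical numbers picked for round math, not measurements. The 10x and 100x current figures simply scale the 2 mA number by the guessed power ratio, assuming the same supply voltage.

```python
def listening_hours_on_battery(capacity_mah, listener_current_ma):
    """Hours a battery would last if the listener were the only load."""
    return capacity_mah / listener_current_ma


def expected_false_accepts(false_accepts_per_hour, listening_hours):
    """Expected unintended triggers over a given amount of listening."""
    return false_accepts_per_hour * listening_hours


BATTERY_MAH = 2000  # hypothetical phone battery

# 2 mA is the top of the quoted range; 20 and 200 mA are the 10x and
# 100x guesses applied to it (assumes the same supply voltage).
for label, current_ma in [("deeply embedded", 2),
                          ("10x more", 20),
                          ("100x more", 200)]:
    hours = listening_hours_on_battery(BATTERY_MAH, current_ma)
    print(f"{label}: about {hours:.0f} hours of listening")

LOOSE_RATE = 2  # hypothetical false accepts per hour for a "loose" trigger
short_window = expected_false_accepts(LOOSE_RATE, 0.1)  # ~6 minutes a day
always_on = expected_false_accepts(LOOSE_RATE, 24)      # listening all day
print(f"short window: {short_window:.1f} per day, always on: {always_on:.0f} per day")
```

The same loose trigger that misfires almost never in a six-minute window would misfire dozens of times a day if left on, which is why always-on operation demands a much tighter false accept rate.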