Follow the Leader in Mobile October 2nd, 2012
I really enjoyed reading this article interviewing Vlad Sejnoha, Nuance’s CTO. Most people would consider Nuance the leader in speech recognition today, and Vlad is certainly a very smart, thoughtful, and articulate man.
I enjoyed it for a few different reasons. The first and main reason I liked the article is it helps to push the idea Sensory has been championing for the past several years that devices don’t have to be touched to enable voice commands, and that you should be able to just start talking to things like we talk to each other. That’s what Sensory calls TrulyHandsfree, and it’s the technology that showed up in the first Bluetooth carkit that requires no touching (by BlueAnt) AND the first mobile phones that responded to voice without touch (Samsung’s Galaxy SII and SIII and Note – check out this video from Samsung and this one, also from Samsung). Even hit toys like Mattel’s award-winning Fijit Friends and Hallmark’s Interactive Books use this unique technology that just works when you talk to it. In fact, it really was the TrulyHandsfree feature that made Vlingo so popular, as this Vlingo video nicely states in its comparison between Vlingo and Siri. (Nuance bought Vlingo earlier this year, but the Sensory TrulyHandsfree didn’t come with it!).
The article says “Sejnoha believes that within a year or two you’ll be able to talk to your smartphone even as it lies idle on a desk, asking it questions such as, “When’s my next appointment?” The phone will be able to detect that you are speaking, wake itself up, and accomplish the task at hand.” Check out this Sensory video…this is definitely what Vlad is talking about! Yeah, we can do it today, and it’s REALLY FAST and really accurate.
But is it low power? Well that’s ABSOLUTELY KEY. That’s why Sensory partnered with Tensilica. Tensilica is a leader in low power audio DSPs for Mobile Phones. Sensory already has its TrulyHandsfree running on chips that run under 5 mW for a COMPLETE audio system. And that’s without having to wake up to understand the task at hand. We can drop by another 1-2mW by not being always on, but turning the recognizer off doesn’t do much. That’s because even if the full recognizer is shut down, you still need to run a mic and preamp, which drives a lot of the current consumption when you have a low power recognizer like TrulyHandsfree (it can run on as little as 7 MIPS!). This means it’s REALLY critical to have a low power recognizer as well, and that’s Sensory’s forte. We are expecting that by next year we will have systems running at 1-3mW!
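For the numbers people, here is a rough sketch of that power arithmetic. It only uses the figures quoted above (under 5 mW for the complete system, 1-2 mW saved by switching the recognizer off). Taking exactly 5.0 mW as the total is an assumption for illustration, and the function name is made up; none of this is measured data.

```python
# Rough power-budget arithmetic using the figures quoted in the post.
# ASSUMPTION: total system power is taken as exactly 5.0 mW
# (the post only says "under 5 mW"), purely for illustration.

def front_end_power_mw(total_mw: float, recognizer_mw: float) -> float:
    """Power left for the mic, preamp and the rest of the audio path."""
    return total_mw - recognizer_mw

TOTAL_MW = 5.0                    # upper bound quoted for the complete audio system
RECOGNIZER_RANGE_MW = (1.0, 2.0)  # savings quoted for turning the recognizer off

for recognizer_mw in RECOGNIZER_RANGE_MW:
    front_end = front_end_power_mw(TOTAL_MW, recognizer_mw)
    share = front_end / TOTAL_MW
    print(f"recognizer {recognizer_mw:.1f} mW -> front end {front_end:.1f} mW "
          f"({share:.0%} of total)")
```

Under these assumptions the mic and preamp side takes the larger share of the budget, which is why switching the recognizer off saves relatively little.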
The article mentions “persistent” listening, but even though I’ve always preached this “always on” concept, I think what will really explode is “intelligent automatic listening”. That is, the device figures out when it needs to listen for what and turns on to listen for it. So it doesn’t always have to be on…it will just seem that way because the devices are so intelligent. For example, a certain traveling speed could make a phone listen for car commands or car wake up words. An incoming call could cause the recognizer to wake up and listen for Answer/Ignore. For these to work, the device needs to run not only at very low power but also with VERY high accuracy. You don’t want to have a background conversation triggering the phone call to hang up! Accuracy is another Sensory forte! The combination of accuracy with low power consumption is a difficult mix to conquer! Sensory’s accuracy holds up not only in noise but also from a distance…that is, when a recognizer works well with a poor S/N ratio, the signal can be lower (like from a distance) and/or the noise can be higher.
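To make the “intelligent automatic listening” idea concrete, here is a minimal sketch of a device choosing what to listen for based on context. Everything in it is hypothetical: the `DeviceContext` fields, the speed threshold, and the phrase lists are assumptions for illustration, not how any shipping product works.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class DeviceContext:
    """Hypothetical snapshot of what the phone knows about its situation."""
    speed_kmh: float = 0.0
    incoming_call: bool = False

# ASSUMPTION: an arbitrary speed above which we guess the user is driving.
DRIVING_SPEED_KMH = 20.0

def phrases_to_listen_for(ctx: DeviceContext) -> List[str]:
    """Return the phrases to arm the recognizer with.

    An empty list means the recognizer can stay off to save power.
    """
    if ctx.incoming_call:
        return ["answer", "ignore"]
    if ctx.speed_kmh >= DRIVING_SPEED_KMH:
        return ["hey car"]  # hypothetical car wake-up phrase
    return []

print(phrases_to_listen_for(DeviceContext(incoming_call=True)))
print(phrases_to_listen_for(DeviceContext(speed_kmh=60.0)))
print(phrases_to_listen_for(DeviceContext()))
```

The point of the design is that the recognizer only runs with a small, context-specific vocabulary, which helps both power and accuracy.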
So it’s really cool that Nuance is getting on the bandwagon behind Sensory’s innovations like TrulyHandsfree at low power. In fact after Samsung’s release on the Galaxy SII with Sensory, Nuance did come out with an “always on and listening” mobile device; for fun we quickly ported our technology onto the same phone to compare…check out this video.
Something interesting we noticed was that after Sensory announced its speaker verification and speaker ID for mobile devices at CTIA this year, Nuance shortly thereafter came out with their own announcement, but there were no demos available so we couldn’t do a comparison video.
Random Thoughts and Miscellaneous Videos August 29th, 2012
- Android JellyBean Speech Recognition. It’s REALLY REALLY awesome. I thought all those video comparisons with Siri must be staged, but I’ve been using it and it’s very fast and very accurate and reasonably intelligent. My only criticism is in their marketing. First of all where’s the Mike LeBeau video? And what’s it called? Google Now? Google Voice? Google Voice Actions? JellyBean Speech Recognition? None of this marketing stuff really matters…it’s a big step forward in the handset based speech wars, and by my count puts Android in the lead on speech technology. Can’t wait to see Apple’s next release!! I bet it will be great…and Microsoft? You spent a billion dollars on Tellme, you have had the biggest speech team for the longest time, what are you doing???
- One of Sensory’s technology apps guys did a really nice demo placing the Sensory trigger to call up the Android JellyBean speech engine. Look how nicely the Sensory technology interacts to make the whole experience not only handsfree but ripping fast!
- ChinaMobile invested over $200M in iFlytek…WHOA!!! Really? Over $1.2B valuation. Holy Smokes.
- OK, I’m a speech geek…there’s something I really like about attractive women using speech recognition on QVC (yeah this is a Sensory chip based product, that works AMAZINGLY well in a live shoot)
- I’m a huge fan of Hallmark’s Interactive Storybuddies…There’s a ton of other fans who have posted videos showing how nice these products are. Sensory’s TrulyHandsfree technology on an NLP chip is embedded in a plush character that responds while you read a book. Now everyone in the speech industry knows that speech recognition works better with men than women, and that accents destroy recognition accuracy, and that you need to speak loudly into the mic or else the S/N will be too poor for recognition to perform. Well watch this video of a soft speaking British accented female using a Hallmark Storybuddy to see how AMAZINGLY well the Sensory engine does.
Thank you SIRI! January 27th, 2012
Lots of thoughts…no time to share them…So I’ll be brief in a few different areas:
- Thank you SIRI! Now every CE Company must have speech technology. How the world has changed, and after 18 years of Sensory being one of the only speech companies focused on consumer electronics, now everyone is doing it!
- What’s really weird is the number of chip companies and investment bankers that have been popping up on our doorsteps since SIRI shipped. Companies do move in herds!
- Nuance buys Vlingo. Full disclosure…Vlingo is Sensory’s partner (we’ll see what happens after the deal closes.) How much was paid? (Rumor I keep hearing is the highway that runs near my house…) Why did they pay so much? (because they can, to end the personal lawsuit, to end the other lawsuits, to prevent market share from eroding, NOT to grow their technology base!)
- Speaking of Vlingo, I really like that their newsletter and videos imply they are better than SIRI because they have “more hands-free functionality”…that’s TrulyHandsfree by Sensory!
- And what about the Justice Department’s investigation of Nuance (Don’t they have better things to do with our taxes these days?)…The Nuance/Vlingo position seems to be all about fighting Microsoft, Google, etc…which has some merit, but if you don’t have Android or Windows Phone, who ya gonna call? Nuance will always be on the list.
- Sensory news…
- Yeah! Our TrulyHandsfree is in Samsung’s Galaxy Note, introduced at CES!
- Monster Cable showed a cool product at CES with TrulyHandsfree™ inside…they were kind enough to invite the Sensory crew to see Chicago. GREAT CONCERT! I think there were another 20-30 or so products on the CES floor with Sensory inside!
- We also just got nominated for a Global Mobile Award at the Mobile World Congress.
- And who says there’s a recession still going on? Our chip-based product sales are going through the roof! The success of our IC product line is also based on TrulyHandsfree because it enables a quasi-natural language interface.
- Where in the world is Majel???? Sensory did a voice-controlled light switch a few years back with a company called VOS Systems. They licensed the Star Trek brand, used “Computer” as the voice trigger to control the lights, and even licensed Majel Roddenberry’s voice…pretty cool!
I Love Watson! July 12th, 2011
I have a new favorite toy. It’s Watson the Raccoon, one of Hallmark’s new Interactive Storybook and Story Buddy™ characters.
I used to have the time to buy every product that featured one of Sensory’s technologies. I have to admit I don’t do that much anymore, and as my kids have gotten older, I buy fewer and fewer toys in general, much less ones with speech recognition. Luckily I was visiting my dad, and he had purchased one out of curiosity.
For many years, my favorite Sensory-based toy was from the very early days of Sensory called Radar the Robot, from Fisher Price. I have fond memories of my kids imitating Radar, and also remember biking around Bali on my honeymoon, going for a night ride through the jungle by myself just to find a fax machine so I could send an agreement draft back to Fisher Price (Design Win for Radar!)
Sorry Radar…Watson has removed you from your pedestal. Watson now reigns supreme!
Hallmark’s Interactive Storybooks use Sensory’s NLP based processors with TrulyHandsfree™ Voice Control. As you read the book, the Story Buddy™ listens and interacts appropriately when it hears you say different phrases from the book.
I knew the concept was great when we first did Jingle the Husky Pup with Hallmark. I also knew that these products were selling really well, and I even knew that TrulyHandsfree™ Voice Control is the MOST AMAZING technology to ever come out of Sensory (Hey – I just got an email from the manager of one of the larger speech organizations in the world and he said “we keep trying to break your TrulyHandsfree™ 2.0 beta technology, but we just can’t seem to make it fail!”)
What I didn’t know is what an EXCELLENT job Hallmark does in story writing, character creation, and putting the whole thing together to make a really fun experience that really works! The book starts off with Watson wondering why grass grows up and rain falls down… I love that line!
Kudos to you Hallmark…now I gotta go buy a Watson for my lobby!
Truly Handsfree™ Trigger Technology Taking Over Sensory! February 24th, 2011
I haven’t had much time to blog lately, and you may have noticed that when I do, I often write about our revolutionary new Truly Handsfree™ Trigger speech technology. Technically it’s a phrase-spotting technology, but Sensory is using a revolutionary new multi-patent pending approach that’s changing the way we do speech recognition. The Truly Handsfree™ Trigger doesn’t use typical techniques like background noise modeling or speech detection (i.e. finding where speech starts and ends). In operation, it ends up being MUCH more noise robust, yet still very efficient as it consumes less current than it would if we also included all the traditional approaches. The basic idea is that it’s on and listening all the time, and able to reject all of the wrong words and correctly identify the right words! This eliminates the need for activation via button pressing.
A lot of companies are using our technology now as a voice trigger for other speech recognition applications. At the recent Mobile World Congress, Samsung introduced the first Truly Handsfree Smartphone, the Galaxy SII, which uses a Truly Handsfree™ Trigger followed by the Vlingo experience. You say “Hey Galaxy” and it wakes up, no touching necessary! I tried this on the noisy showroom floor at Mobile World Congress, and it nailed my “Hey Galaxy” every time, even from a distance of 5 feet away!
Chris Schreiner over at Strategy Analytics recently tried out an early beta demo for Android, and in a blog late last year he said, “In a demo experience on my Android phone, the hands-free trigger worked remarkably well with varying types of background noise.”
With Truly Handsfree™ Trigger’s noise-robust nature and the ability to always be on listening, we are able to do more natural language-like schemes. A couple of great examples are in the toy space (and we do love toys at Sensory!)
- I mentioned Hallmark in my last blog…now they are rolling out a whole new product line built with Sensory chips because of the huge success of Jingle, the Husky Pup.
- Mattel has pushed us to deploy this phrase spotting technology even in our lowest cost, entry level processor. They have a new product line coming out this year that’s sure to be a BIG HIT called Fijit. The Fijits are these cute wiggly characters with amazing skin, and they do the TOUGHEST speech recognition feats ever. They listen for a bunch (30??) of short key words like “hungry” so you can say a variety of things to it (Like…Hungry?…I’m Hungry…Are you Hungry?) and it can intelligently respond and interact. (Actually I don’t know if “Hungry” is one of its actual words, that’s for example only.) SpeechTech just did a nice summary on Fijit Friends in their blog, and Mattel has some nice YouTube videos and websites where you can learn all about Fijits.
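The key word idea in the Fijit example can be illustrated with a toy sketch. Big caveat: this works on text, while the real thing spots words directly in audio, so treat it as an analogy only. The keyword table and responses are invented for illustration and are not Mattel’s or Sensory’s actual vocabulary.

```python
import re
from typing import Optional

# ASSUMPTION: invented keyword -> response table, for illustration only.
RESPONSES = {
    "hungry": "Let's eat!",
    "dance": "Watch me wiggle!",
}

def respond(utterance: str) -> Optional[str]:
    """Return a response if any known key word appears anywhere in the utterance."""
    words = re.findall(r"[a-z']+", utterance.lower())
    for word in words:
        if word in RESPONSES:
            return RESPONSES[word]
    return None  # nothing spotted: stay quiet rather than guess

for text in ["Hungry?", "I'm hungry", "Are you hungry?", "Nice weather today"]:
    print(repr(text), "->", respond(text))
```

Because only the key word matters, all three “hungry” variations land on the same response without the toy needing to understand the full sentence.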
So what’s happening here at Sensory is that this technology initially invented as a trigger is migrating into being an amazingly noise-robust speech solution for any command and control application! It’s nominated for awards by MobileTrax in both the Speech Processing and Software Technology innovation categories!
Sensory has developed a whole product roadmap around our new approach, and this includes speaker adaptive recognition, larger vocabulary solutions, improvements in accuracy, and consumer created triggers. A funny thing about consumer created triggers…Our initial release was NOT INTENDED for this, but one of our customers, Adelavoice, did a few tricks and allowed end users to create their own triggers. Know what’s the most common trigger phrase?? “Yo Bitch”…I guess that says something about the demographic of the user base!
OK…I could go on and on about this new phrase spotting technology, but I gotta get some real work done!
Lots of Great Stuff at Sensory! December 7th, 2010
I’ve been so busy, I haven’t had much of a chance to blog, but here are some of the exciting new products I can talk about:
Last CTIA seems like yesterday, but it was 2 months ago. At the show Motorola introduced 3 new Bluetooth accessories, all of them using Sensory TTS and speech recognition. Moto has a very clever design using cloud based TTS for email reading, cloud based VR for dictation, then Sensory on the client for the “light lifting” tasks of command and control (like answering phones) and reading caller ID.
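The split described above (cloud for the heavy lifting, client for the light lifting) can be sketched as a simple routing table. The task names and the `route` helper are hypothetical labels for illustration; this is not Motorola’s actual software design.

```python
from enum import Enum

class Engine(Enum):
    CLOUD_TTS = "cloud TTS"
    CLOUD_RECOGNITION = "cloud recognition"
    ON_DEVICE = "on-device (client)"

# ASSUMPTION: task names are made-up labels for the jobs named in the post.
ROUTES = {
    "read_email": Engine.CLOUD_TTS,            # long, open-ended text
    "dictation": Engine.CLOUD_RECOGNITION,     # large vocabulary
    "command_and_control": Engine.ON_DEVICE,   # e.g. answering phones
    "read_caller_id": Engine.ON_DEVICE,        # short, latency-sensitive
}

def route(task: str) -> Engine:
    """Pick which engine handles a task; unknown tasks are an error."""
    try:
        return ROUTES[task]
    except KeyError:
        raise ValueError(f"unknown task: {task}") from None

for task in ROUTES:
    print(f"{task} -> {route(task).value}")
```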
Sensory’s Truly Handsfree Trigger continues to get rave reviews and fans. Vlingo’s WONDERFUL In Car solution is using the Sensory Trigger “Hey Vlingo”; Enustech’s “Drive N Talk” solution just added a Truly Handsfree Trigger, and AdelaVoice started shipping their “StartTalking” WITHOUT Sensory, but quickly switched over to us when they tested out our Truly Handsfree Trigger. We consider it a real KUDO to win over companies like Vlingo that have some of the best speech technologists in the universe!
We know EVERY speech company on the planet is working on a Trigger WordSpot solution to compete with Sensory’s Truly Handsfree Triggers, so we challenge them all to a shootout! We’re happy to send you our stuff if you send us yours!!!!
OH HERE’S A GREAT TOY PRODUCT….Hallmark has released Jingle the Husky Pup Interactive Storybook and Story Buddy. The basic idea is a book that comes with a plush dog that interacts while you read the story. It’s an interesting product for several reasons:
- It’s a big speech recognition hit and just in time for the holidays. It has already won several awards and is selling out in many retail outlets.
- It’s from Hallmark, which is an interesting move. Hallmark is a multi-billion dollar privately held giant, of course best known for greeting cards. This successful move into high tech speech recognition toys brings them into a new market that given the success here, will experience rapid growth.
- The speech recognition is Sensory’s new phrase spotting technology (yep, our Truly Handsfree Triggers applied in a new way.) The Jingle product marks a new use of Sensory’s technology to do MULTI-WORD phrase spotting rather than single trigger words. As the person reads the book, Jingle listens for a half dozen or more key phrases, and when those phrases are spoken, Jingle chimes in with various barks and songs.
- It’s only $24.95 at retail…pretty breakthrough pricing for an advanced speech technology product!
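For the curious, here is a toy sketch of multi-word phrase spotting over a stream of words, in the spirit of Jingle listening while the book is read. It operates on text rather than audio, and the phrases and reactions are invented for illustration; the actual product’s phrases and Sensory’s algorithm are not described here.

```python
from collections import deque
from typing import Deque, Dict, Iterable, Iterator, Tuple

# ASSUMPTION: invented key phrases and reactions, for illustration only.
PHRASES: Dict[Tuple[str, ...], str] = {
    ("good", "boy"): "bark",
    ("time", "for", "bed"): "lullaby",
}

def spot_phrases(words: Iterable[str]) -> Iterator[str]:
    """Yield a reaction each time the most recent words end with a key phrase."""
    longest = max(len(phrase) for phrase in PHRASES)
    window: Deque[str] = deque(maxlen=longest)  # only the last few words are kept
    for word in words:
        window.append(word.lower())
        recent = tuple(window)
        for phrase, reaction in PHRASES.items():
            if recent[-len(phrase):] == phrase:
                yield reaction

story = "Jingle was a good boy and it was time for bed"
print(list(spot_phrases(story.split())))
```

Everything outside the key phrases is simply ignored, which mirrors the idea of a spotter that stays quiet through the rest of the story.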