See Jane Drive August 19th, 2009
Since Sensory has gotten very actively involved in providing speech recognition for Bluetooth® based products, I have been asking friends and family about their experiences with various “hands-free” wireless devices.
I recently had an interesting conversation that I’ll share. A woman I know (I’ll call her Jane) uses a Jabra SP-200 Bluetooth® car kit. She says she had tried a wireless headset, but found the car kit much more comfortable and convenient since she really only uses it while driving. Jane found the initial pairing process clumsy and uncomfortable, but after much reading and experimentation is now very happy with her Jabra car kit.
When I pressed Jane about what she likes and doesn’t like, here’s what I found:
- Doesn’t have to wear it on her head
- Call quality is good
- Simple and easy to use
- Every once in a while it makes a call accidentally
- There is no easy way to call people back when she gets disconnected
- Doesn’t always understand the different flashing lights
I found this particularly interesting, since on the one hand she said it was simple and easy to use, but also said the lights were confusing, there were control issues, and it was too difficult to easily call someone back.
Of course, if you know Sensory’s BlueGenie™ Car Kit product then you understand that ALL these issues are solved with a BlueGenie™ Voice Interface! (By the way, have you seen the BlueGenie™ car kit video on the Sensory website front page with my daughter Samantha? Smart kid.)
I decided to go a little more in-depth on the SP-200 and looked it up on the web. Interestingly, Jabra markets it as “hands-free” (of course it’s not) and calls it part of the EASY series (it could be a lot easier with BlueGenie™ …). Jabra must understand it’s not Truly Hands-Free, because in some places they call it “hands-free talking.”
Here’s what I learned from the manual:
- It has three LEDs (blue, green, and red) that each mean something different. Sometimes they’re solid, sometimes they blink, and SOMETIMES THEY BLINK AT DIFFERENT SPEEDS. No wonder Jane found this confusing. Even the same color doing the same thing can mean something different in a different mode (e.g., solid blue can mean the device is on, or it can mean it paired successfully).
- There’s a single big button to tap. This is part of what makes it EASY, I guess. However, Jabra differentiates between a TAP and a PRESS: a tap is short and a press is long. And there can be DOUBLE TAPS, and PRESS AND HOLD, and the HOLD can last 1 second or 5 seconds, etc. For example, you “tap” to answer a call, “press” to reject an incoming call, and double press to redial. Maybe this has something to do with the “accidental” calls Jane mentioned?
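To see why this is so easy to get wrong, here’s a hypothetical sketch of how a single-button interface might sort gestures by hold time and tap spacing. The thresholds and names are purely illustrative - they aren’t from the Jabra manual - but they show how a 0.1-second difference in timing can turn an intended “tap” into a “press” (or an accidental redial).

```python
# Hypothetical sketch: classifying single-button gestures by hold
# duration and tap spacing. Thresholds are illustrative only, not
# taken from the Jabra SP-200 manual.

TAP_MAX = 0.5      # a hold shorter than this counts as a tap (seconds)
DOUBLE_GAP = 0.4   # two taps this close together form a double tap

def classify(events):
    """events: list of (press_time, release_time) tuples, sorted by time.
    Returns a list of gesture names."""
    gestures = []
    i = 0
    while i < len(events):
        down, up = events[i]
        if up - down >= TAP_MAX:
            gestures.append("press")           # long hold -> press
            i += 1
        elif (i + 1 < len(events)
              and events[i + 1][0] - up < DOUBLE_GAP
              and events[i + 1][1] - events[i + 1][0] < TAP_MAX):
            gestures.append("double tap")      # two quick taps
            i += 2
        else:
            gestures.append("tap")
            i += 1
    return gestures

print(classify([(0.0, 0.1)]))              # -> ['tap']
print(classify([(0.0, 0.8)]))              # -> ['press']
print(classify([(0.0, 0.1), (0.3, 0.4)]))  # -> ['double tap']
```

The point of the sketch: the user has to carry those timing thresholds around in their head, which is exactly the kind of memorized protocol a voice interface eliminates.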
I think you absolutely must read and memorize the manual to know how to use this product… and once you do know how to use it, you need to touch it, touch your handset, and look at the car kit while driving. That’s not a Truly Hands-Free, Eyes-Free product.
On the other hand, BlueGenie™ car kits will hit the market in 2010, and they will change the world! People will understand what “Truly Hands-Free” really means!
The SCIDs are Coming!!!! August 4th, 2009
No, we’re not under attack from missiles, and I’m not referring to results of the current financial crisis. I’m talking about Speech Controlled Internet Devices. These are home consumer electronics devices that use a VUI (voice user interface) to let the user interact with the product. The products themselves can access data and information from the internet, and they use a client/server speech recognition system to achieve higher recognition accuracy than is possible with a client-only or server-only approach.
So what is Sensory’s role in this? Well, we originated the terminology, and we’re evangelizing the concept in advance of the release of our new chip in September. The new chip is designed to act as the main controller for SCIDs, although Sensory is looking for other partners on the chip side (like Intel or Philips) for higher-end/higher-cost SCIDs. By the way, we’re also looking for server-based speech recognition partners (like Microsoft, Google, Vlingo, Novauris, etc.), and even hardware partners like Cisco that know the Wi-Fi and consumer electronics space.
Some of the press and analysts out there are starting to think about the potential for SCIDs. Troy Wolverton (my favorite Mercury News columnist) had a bit of a change of heart after seeing some of my demos. When I first contacted him, he thought speech recognition never worked, so I was quite happy that his column was titled “Speech Recognition Technology is Rapidly Improving.”
I’m not going to say a whole lot about SCIDs here because Dan Miller from Opus Research has already done an EXCELLENT job of summarizing our conversation. Dan highlights the HUGE volume opportunity that SCIDs will enable over the coming few years.
A really interesting angle on SCIDs is the Voice Search opportunity they enable. Most people think of Voice Search as something for telephone handsets (the quick idea of “voice search” is that a multi-billion-dollar ad/transaction business will emerge for voice search just as it has for conventional Google-like search, so all the major search players - Microsoft, Google, Yahoo, etc. - are interested). The thing is, there will be billions of consumer electronics products hooked up to home internet, potentially with VoIP connections, so handsets won’t be the only devices enabling search opportunities - SCIDs could become a MAJOR driver of search revenues. Michael over at the Kelsey Group keyed off of the interesting opportunities that SCIDs bring to Voice Search and blogged a bit about that.
About the technology - it’s worth noting two very special things about SCIDs:
- Sensory’s new Truly Hands-Free phrase spotting allows SCIDs to be always on and always listening, so your voice becomes the remote control for accessing internet data through your SCID - no need to walk up and press buttons.
- Sensory will do really simple and accurate speech recognition on the client, providing standalone value when not connected to the internet, but ALSO ASSISTING THE SERVER RECOGNIZER by feeding categorized data along with the query.
For example, if “Local News” (or time, weather, etc.) is requested from a news-oriented SCID, the client Sensory recognizer can recognize that and stream a local news report; if “Other News” is requested, we can prompt, “Please say the location where you would like news reports.” Then Sensory can send a very targeted query to a server-based recognizer, identifying the recording as a location where recent news is requested. This simplifies the server’s task and improves the accuracy of the “say anything” approach to speech queries.
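The flow described above can be sketched in a few lines. This is a hypothetical illustration, not Sensory’s actual implementation: a small on-device recognizer handles a fixed command set locally, and for open-ended requests it tags the recording with a category hint before handing it off, so the server can use a narrow location grammar instead of a “say anything” model. All names here (LOCAL_COMMANDS, handle_utterance, the category labels) are made up for the example.

```python
# Hypothetical sketch of a client-assisted SCID recognizer: handle a
# small fixed vocabulary on-device, and send open-ended requests to a
# server recognizer tagged with a category hint. Illustrative only.

LOCAL_COMMANDS = {"local news", "time", "weather"}

def handle_utterance(phrase, audio=None):
    """phrase: output of the small client recognizer (or None if the
    utterance wasn't in its vocabulary). Returns a routing decision."""
    if phrase in LOCAL_COMMANDS:
        # Recognized entirely on the client - no server round trip.
        return {"action": "play_local", "topic": phrase}
    if phrase == "other news":
        # The device prompts "Please say the location...", then sends
        # only that follow-up recording to the server, tagged so the
        # server can constrain itself to a location-only grammar.
        return {"action": "server_query",
                "category": "news_location",
                "audio": audio}
    # Anything else: fall back to the open-ended server recognizer.
    return {"action": "server_query", "category": "open", "audio": audio}

print(handle_utterance("weather"))      # handled locally
print(handle_utterance("other news"))   # targeted server query
```

The design point is that the category tag shrinks the server’s search space - recognizing “a place name” is a far easier task than recognizing “anything at all,” which is where the accuracy gain of the client/server split comes from.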