We caught up with the visionary inventor at SpeechTek 2008, where he talked with InformationWeek about speech technology, his new cellphone reading machine, and two new movie projects.
By Michael Greene
August 26, 2008 04:00 AM
Greene: You've been a pioneer in a number of areas, but I've read that you've said it is the applications of the technology benefiting disabled people that have brought you the greatest gratification. Can you talk about your ideas regarding speech and assistive technology?
Kurzweil: One vision we had dating back to 1980, when we started speech recognition, was to apply this to the deaf. The idea is that a deaf person would have a little display, which could be built into their eyeglasses, where they would basically get subtitles on the world.
On the one hand it's a demanding technology because it has to be speaker-independent, have a large or unrestricted vocabulary, and support continuous speech. On the other hand, it doesn't require perfect accuracy. The early reading machines were highly inaccurate, but a blind person would be able to make up for it from context. A deaf person could similarly understand from context what was being said. Indeed human speech recognition, particularly in compromised acoustic environments, is not perfect either, and we can pick up from context in chat at cocktail parties, or at least pretend to understand what's being said.
I think we're pretty close to being able to do that at least in good acoustic environments, maybe not at a cocktail party, but if the person is being picked up, with pretty good accuracy. Take something like Dragon NaturallySpeaking, which combines the original Dragon technology and the Kurzweil speech recognition, and which is now in Nuance, which actually used to be Kurzweil Computer Products, my first company. That accuracy is quite high. A New York Times reporter recently wrote that, using his own voice with no training and with the software never having heard him before, it was over 98.3% accurate on a 100,000-word vocabulary. And that's certainly accurate enough for the deaf application. I think prototypes of that could be put together that would work, at least in good acoustic environments.
Greene: You're currently demonstrating the KNFB Reader Mobile, which is a cell phone that is also a reading machine. What is the background for this device?
Kurzweil: In 2002 the President of the National Federation of the Blind said, 'Ray, for years you've been saying that one day a blind person will be able to use a pocket-size device to read signs, meeting handouts, menus, and other displays. When do you think this will be feasible?' I said, 'According to our models we'll have the requisite hardware in six years.' He said, 'OK, how long will it take to develop the software?' I said, 'Well, about six years,' so he said, 'Let's get started.' We began in 2002. Right on schedule this spring, hardware that could run this application, a 4-megapixel camera, sufficiently powered computers of about 300 megahertz, and enough memory, became available for the first time.
A little to our surprise we also got the software done on time and we introduced it this summer. Now about 1,000 blind guys and gals and dyslexic individuals are going around taking pictures of signs on the wall and handouts at meetings like this one and using this device.
The device is also a GPS navigation system, MP3 player, e-mail reader, web browser, phone and camera as well as being a reading machine in seven languages that also does the translation. This is all done with voice-directed output and voice prompts to help guide a blind or dyslexic person through use. We're working on additional features in our laboratory such as face recognition, object recognition, and the ability to recognize indoor scenes, like a hallway and office to help tell a blind person where they are.
Greene: Speech technology and intuitive design could also make a big difference in quality of life of seniors, but they are not usually mentioned in discussions on technology. What are your thoughts regarding this group?
Kurzweil: Seniors are a big part of our market for our reading machines. We have a version of a reading machine that's very easy to operate and is used by seniors so that they can read print. Even if they are not blind, they may not be able to read very well.
Speech recognition can also be used to provide intuitive interfaces into devices. All of these technologies that help us interact with machines can help senior citizens if they are easy to use. Good design and user interface is the key. A lot of devices are not intuitive to use but speech technology can be interactive. It should be like an intelligent assistant that can guide the user, regardless of how old they are, through how to perform certain functions. You don't ever want to look at a user manual.
Greene: What do you consider good design principles regarding seniors?
Kurzweil: Well, I am amazed at how poor interfaces are in general. They are very often not intuitive. You can't read what is happening; it's not clear how to accomplish common tasks. That's just an issue of good design principles. These interfaces should be tested with people who have not read the manuals to see if they can actually use them and figure out what to do.
Some companies have gotten it. Apple has built its whole company on intuitive interfaces that are easy and fun to use. It's not easy to do but the companies that will do that will succeed. It's not really a speech technology issue.
Greene: In your talk you state the importance of timing, that it's not just about the invention but what we might call the whole ecosystem of related technologies coming together at the right time in order for an invention to work. Is it one technology or a range of technologies that we are requiring here?
Kurzweil: Well, input technologies such as speech technology and character recognition are definitely part of the puzzle. Good user interface design is important. Seniors may have decreased visual acuity or hearing problems, so these issues become more important, but it's really the same issue for any user. We also develop technologies for very young users like children, where again it has to be intuitive and easy to use.
But the technology is getting more intelligent. The market is rewarding products like the iPhone that are intuitive and easy to use. Hardware technologies like flexible touch screens are coming into play. We're getting good speech recognition and character recognition. So over time these will come together and we'll have intuitive products that are easy to use with interfaces that are like an intelligent human assistant that is sort of guiding you through a process.
Greene: Do you see video communication as an area that can also expand seniors' accessibility?
Kurzweil: I think in the next decade, the teen years, we'll be routinely visiting with each other like you and I are doing now in real reality, even if we're hundreds of miles apart, and not just as a grainy, postage-stamp-sized video conferencing image on your screen but as a full immersion experience where we really seem to be with the other person. It will be a full immersion visual and auditory virtual reality. We'll have images beamed into our retinas from our eyeglasses. We'll be online all the time. The electronics will be woven into our clothing and in our belt buckles and we'll routinely be visiting with other people in these full immersion environments.
We'll have augmented reality, so we'll see real reality but there will be an overlay of virtual reality on top of that helping guide us through the real world. It will direct us inside and outside, not with a navigation system on a small screen that we carry in our palm, but one that's actually built into our field of view. We'll look at a person and it will remind us who they are and give us background about them. That will be very helpful. I mean, how many times are you at a cocktail party where you see someone and think you know who they are but you're not quite sure? It would be great to get that confirmed, and we will have technology like that. It'll help the elderly but it will help all of us. You don't have to be 80 years old to have a senior moment.
Greene: On to another topic: I know you're now involved with a couple of film projects. I believe there are two movies coming out that you're involved with, one based on your book and the other a documentary about you.
Kurzweil: I'm making a movie based on my book, The Singularity is Near. It's called The Singularity is Near: A True Story about the Future. It's an intertwined A line documentary with a B line narrative story. It goes back and forth between the documentary and the dramatization.
In the documentary, I interview 19 big thinkers. People like Marvin Minsky, the father of AI, Richard Clarke, who was the head of counter-terrorism under Clinton and Bush, Alan Dershowitz on the legal rights of machines and 15 or 16 others.
The B line narrative is the story of an AI, Ramona. It actually starts with true documentary footage, where I created Ramona at the 2001 TED conference. Then Ramona goes into the future and becomes more and more realistic, more and more humanlike. She hires Alan Dershowitz, who plays himself, to press for her legal rights to be recognized as a person. The judge says, 'I will grant you legal rights as a person if you can pass the Turing Test.' She goes and gets coaching from Tony Robbins, who plays himself, to learn the secret of what it is to be human, and the story goes on from there.
It stars Pauley Perrette as Ramona; she's the star of the TV show NCIS. It's due to come out in early 2009.
There's also a full-length motion picture being made about my life, career, and ideas called Transcendent Man, and that's on the same schedule. They're going to be released as a double feature. It's another venue to communicate my ideas about the future.