Computers That Talk

Where can we use speech recognition?

In the future, computers will keep shrinking, and we will have small but powerful devices.
In the kitchen, the refrigerator will tell you what you have to buy. In the car, you will ask the travel assistant how to get to a particular destination, and it will tell you which way to drive. A small alarm clock will do all of your time planning for you. You can't use a keyboard to enter your appointments into such a clock, and it won't have a big display to show the next one, so speech is the natural way to use it.

Who developed a speech recognition system with his partners?

Victor Zue was born in China and went to Florida in the late 1960s to study there, near his older sisters. He had to learn English and American pronunciation, which was difficult for him.
In 1968, he saw the science-fiction film "2001: A Space Odyssey", in which the computer "HAL" talked. This gave him the idea of developing real speech systems for computers. He went to MIT and began to analyse speech.

How can a computer hear and talk?

First, the computer has to record the spoken sentence. Then it splits the sound into the frequencies it contains. From these, it derives the phonemes, the basic phonetic units of words. By connecting these phonemes, it builds possible words. Using grammar rules and stored word meanings, combined with probability statistics, it can work out what you said.
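The frequency-splitting step can be illustrated with a small sketch. It is not the method of any particular recognizer: it assumes a plain discrete Fourier transform, and the function name dominant_frequency, the sample rate and the test tone are all invented for this example.

```python
import cmath
import math

def dominant_frequency(samples, sample_rate):
    """Return the strongest frequency (in Hz) found by a naive DFT."""
    n = len(samples)
    best_bin, best_magnitude = 0, 0.0
    # Only bins below n/2 are needed for a real-valued signal.
    for k in range(1, n // 2):
        coeff = sum(
            samples[t] * cmath.exp(-2j * math.pi * k * t / n)
            for t in range(n)
        )
        if abs(coeff) > best_magnitude:
            best_bin, best_magnitude = k, abs(coeff)
    # Convert the bin index back to a frequency in Hz.
    return best_bin * sample_rate / n

# Hypothetical input: 0.1 s of a pure 100 Hz tone sampled at 800 Hz.
sample_rate = 800
tone = [math.sin(2 * math.pi * 100 * t / sample_rate) for t in range(80)]
print(dominant_frequency(tone, sample_rate))  # prints 100.0
```

Real speech is typically analysed in many short slices like this one, and the whole spectrum of each slice is used, not just its strongest frequency.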
The computer can answer with information from its own databases or from the Internet. It builds sentences, converts the words into their phonemes and sends them to the connected loudspeaker. This is nearly the same process in reverse.
But today's systems make one error per sentence on average. For example, "recognize speech" can be misheard as "wreck a nice beach". Words that sound alike, such as "there" and "their", are difficult. The same letter can also be pronounced differently in different words, like the "t" in "try" and "button". And depending on the context, the same word can mean many different things.
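How probability statistics can help to choose between similar-sounding word sequences can be sketched as follows. This is only a toy model, assuming a table of word-pair (bigram) probabilities; every number in it is made up for the illustration, and the names BIGRAM_PROB, UNSEEN, log_score and best_candidate are hypothetical.

```python
import math

# Made-up probabilities of one word following another ("<s>" = sentence start).
BIGRAM_PROB = {
    ("<s>", "recognize"): 0.02,
    ("recognize", "speech"): 0.30,
    ("<s>", "wreck"): 0.001,
    ("wreck", "a"): 0.20,
    ("a", "nice"): 0.01,
    ("nice", "beach"): 0.02,
}
UNSEEN = 1e-6  # assumed fallback probability for word pairs not in the table

def log_score(words):
    """Sum the log-probabilities of each word given the word before it."""
    score = 0.0
    previous = "<s>"
    for word in words:
        score += math.log(BIGRAM_PROB.get((previous, word), UNSEEN))
        previous = word
    return score

def best_candidate(candidates):
    """Pick the candidate word sequence with the highest score."""
    return max(candidates, key=log_score)

candidates = [
    ["recognize", "speech"],
    ["wreck", "a", "nice", "beach"],
]
print(" ".join(best_candidate(candidates)))  # prints "recognize speech"
```

With these invented numbers the shorter, more common phrase wins. A real system would also weigh how well each candidate matches the recorded sound.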

What type of computer will we need for it?

The first computer systems were huge, and each was used by many people. Today we have one computer per person. But this is changing. Soon we won't have a laptop or a home computer, but small devices that we can use for nearly everything. You will download the required software from the Internet. The chip in the device will consist of identical tiles; the software connects and tunes them to provide the right functions. Only such chips are scalable enough to understand all speech commands and make these versatile devices possible.

Speech recognition today and in the future

Today we can control cell phones by speech, and dictation software transforms our spoken words into text. But you have to train this software to your voice. Victor Zue's MIT lab built a system that you can query by telephone: the Mercury Travel Service for flights gives the right answer to the question "When does the next flight leave from Boston for San Francisco?" In the future, several systems will be connected (to reduce context errors), so you can ask about nearly everything.

Stefan Ziegler, 02.04.2001