Title: Voice Recognition
1Voice Recognition
- Dennis Gehris
- Professor
- Bloomsburg University
2Voice Recognition
- Voice Recognition Definitions
- Timeline of Voice Recognition Development
- Types of Voice Recognition Systems
- How Voice Recognition Works
- Voice Recognition Systems and Requirements
- Voice Recognition Demonstration
3Voice Recognition Definitions
- The field of computer science that deals with
designing computer systems that can recognize
spoken words.
- Speech-to-Text
- The field of computerized voice recognition /
speech recognition. (These terms are usually used
interchangeably.)
4Voice Recognition Definitions (continued)
- Text-to-Speech
- Computerized enunciation of the written word. In
a sense, this is the opposite of speech-to-text.
- Voice Identification, Voice Verification
- The use of voice recognition technology to
identify a person by their voice, rather than to
identify the meaning of the spoken word.
5Voice Recognition Timeline
- Late 1950s
- Voice recognition research begins.
- 1964
- IBM demonstrates Shoebox for spoken digits at New
York World's Fair.
6Voice Recognition Timeline (continued)
- 1968
- The HAL-9000 computer in 2001: A Space Odyssey
introduces the world to voice recognition.
- 1978
- Texas Instruments introduces the first
single-chip speech synthesizer and the Speak and
Spell toy.
7Voice Recognition Timeline (continued)
- 1993
- IBM launches the first packaged voice recognition
product, the IBM Personal Dictation System for
OS/2.
- 1993
- Apple ships PlainTalk, a series of speech
recognition and speech synthesis extensions for
the Macintosh.
8Voice Recognition Timeline (continued)
- 1994
- Dragon Systems' DragonDictate for Windows 1.0 is
the first software-only PC-based dictation
product.
- 1996
- IBM introduces MedSpeak/Radiology, the first
real-time continuous-speech recognition product.
9Voice Recognition Timeline (continued)
- 1996
- OS/2 Warp 4 becomes the first operating system to
include built-in voice navigation and
recognition.
- June 1997
- Dragon ships NaturallySpeaking, the first
general-purpose continuous-voice recognition
product.
10Voice Recognition Timeline (continued)
- August 1997
- IBM ships ViaVoice.
- Fall 1997
- Microsoft CEO Bill Gates identifies voice
recognition as a key technological advance.
11Types of Voice Recognition Systems
- Discrete Speech Systems
- require that the speaker speak slowly and
distinctly and separate each word with a short
pause
- the required pause is usually between 1/10 and
2/10 of a second between words
- Continuous Speech Systems
- allow you to speak naturally without pauses
between the words
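The pause requirement of discrete speech systems can be illustrated with a minimal sketch. It assumes the audio has already been reduced to one loudness value per 10 ms frame; the function name, the frame length, and the loudness threshold are all hypothetical choices for illustration, not taken from any particular product.

```python
def split_on_pauses(energies, frame_ms=10, min_pause_ms=100, threshold=0.1):
    """Return (start, end) frame index pairs, one per detected word.

    energies: per-frame loudness values (hypothetical units).
    A word ends once at least min_pause_ms of consecutive frames
    fall below threshold; shorter dips stay inside the word.
    """
    min_pause_frames = min_pause_ms // frame_ms
    words = []
    start = None       # first frame of the current word, if any
    last_loud = None   # most recent loud frame in the current word
    quiet_run = 0      # consecutive quiet frames since last_loud
    for i, energy in enumerate(energies):
        if energy >= threshold:
            if start is None:
                start = i
            last_loud = i
            quiet_run = 0
        elif start is not None:
            quiet_run += 1
            if quiet_run >= min_pause_frames:
                words.append((start, last_loud + 1))
                start = None
                quiet_run = 0
    if start is not None:
        # Audio ended before a full pause was observed.
        words.append((start, last_loud + 1))
    return words
```

With the default 100 ms minimum pause, a 30 ms dip inside a word does not split it, which is why a speaker who runs words together defeats this kind of segmentation and a continuous speech system is needed instead.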
12Types of Voice Recognition Systems (continued)
- Most systems are Speaker Dependent
- require an enrollment (training) period
(usually 15 to 60 minutes) in order to understand
a new speaker
- Some systems are Speaker Independent
- do not need an enrollment period to start
recognition. Such a system will understand any
person who speaks into it as soon as they
start speaking.
13How Voice Recognition Works
- Speech Engine
- The key to speech recognition
- The mechanism that translates sounds into words
and sentences.
14How Voice Recognition Works (continued)
- 1. Audio Input
- The speech engine receives the audio input, or
speech, from a microphone via a PC with a
standard sound card such as the Creative Labs
Sound Blaster.
15How Voice Recognition Works (continued)
- 2. Acoustic Processor
- The acoustic processor filters out background
noise and converts the captured audio into a
series of sounds that correspond to the
phonemes (units of speech) making up the selected
language, such as American English.
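A toy sketch of this step, under heavy simplifying assumptions: each audio frame is represented as an energy value plus a two-number feature vector, quiet frames are treated as background noise, and each remaining frame is labeled with the nearest phoneme template. The templates, the feature values, and the function name are invented for illustration; real acoustic processors use far richer spectral features and statistical models.

```python
import math

# Hypothetical 2-D feature templates for three phonemes.
PHONEME_TEMPLATES = {
    "k": (1.0, 0.2),
    "ae": (0.3, 0.9),
    "t": (0.9, 0.8),
}

def frames_to_phonemes(frames, noise_floor=0.2):
    """Map (energy, feature_vector) frames to a phoneme sequence."""
    phonemes = []
    for energy, features in frames:
        if energy < noise_floor:
            continue  # discard frames that look like background noise
        # Pick the template closest to this frame's features.
        best = min(
            PHONEME_TEMPLATES,
            key=lambda p: math.dist(features, PHONEME_TEMPLATES[p]),
        )
        # Consecutive frames of the same sound collapse into one phoneme.
        if not phonemes or phonemes[-1] != best:
            phonemes.append(best)
    return phonemes
```

The output is a sequence of phoneme labels rather than words; turning those labels into words is the job of the word-matching step.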
16How Voice Recognition Works (continued)
- 3. Word Matching
- The speech engine attempts to match the sounds to
the most likely words in two basic ways.
17How Voice Recognition Works (continued)
- 3. Word Matching (continued)
- a. Acoustical analysis is used to build a list
of possible words that contain similar sounds.
- b. Language modeling (the likelihood
that a given word would appear between those
coming before and after it) is used to narrow the list
and come up with the best candidates.
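The two-part matching described above can be sketched as follows. The acoustic scores and the word-pair (bigram) probabilities are made-up numbers, and `rank_candidates` is a hypothetical helper; the sketch only shows how a language model can reorder a list of similar-sounding candidates.

```python
# Hypothetical acoustic scores: how well each candidate matches the sounds.
ACOUSTIC = {"two": 0.40, "to": 0.35, "too": 0.25}

# Hypothetical bigram probabilities P(next word | previous word).
BIGRAM = {
    ("want", "to"): 0.60, ("want", "two"): 0.05, ("want", "too"): 0.01,
    ("to", "go"): 0.30, ("two", "go"): 0.01, ("too", "go"): 0.01,
}

def rank_candidates(prev_word, candidates, next_word, floor=1e-4):
    """Order candidates by acoustic score times language-model likelihood."""
    scored = []
    for word, acoustic in candidates.items():
        # Likelihood of the word appearing between its two neighbors;
        # unseen word pairs fall back to a small floor probability.
        lm = (BIGRAM.get((prev_word, word), floor)
              * BIGRAM.get((word, next_word), floor))
        scored.append((acoustic * lm, word))
    scored.sort(reverse=True)
    return [word for _, word in scored]
```

Calling `rank_candidates("want", ACOUSTIC, "go")` places "to" first even though "two" has the higher acoustic score, because "want to go" is far more likely than "want two go" under these assumed probabilities.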
18How Voice Recognition Works (continued)
- 4. Decoder
- The decoder selects the most likely word based on
the rankings assigned during word matching and
assembles the words in the most likely sentence
combinations.
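One common way to assemble the most likely sentence from per-word candidate lists is a Viterbi-style search. The sketch below assumes that approach; the slides do not say which search the commercial engines use, and the `decode` function, its inputs, and the floor probability are hypothetical.

```python
import math

def decode(lattice, bigram, floor=1e-4):
    """Return the highest-scoring word sequence.

    lattice: list of {word: acoustic_score} dicts, one per word position.
    bigram:  {(previous, next): probability} word-pair likelihoods.
    """
    # best[word] = (log score of the best path ending in word, that path)
    best = {w: (math.log(s), [w]) for w, s in lattice[0].items()}
    for candidates in lattice[1:]:
        new_best = {}
        for word, acoustic in candidates.items():
            # Choose the best predecessor for this word.
            score, path = max(
                (prev_score + math.log(bigram.get((prev, word), floor)),
                 prev_path)
                for prev, (prev_score, prev_path) in best.items()
            )
            new_best[word] = (score + math.log(acoustic), path + [word])
        best = new_best
    # The overall winner is the best path ending in any final candidate.
    return max(best.values())[1]
```

Working in log scores avoids numeric underflow on long sentences, and keeping only the best path into each candidate keeps the search linear in sentence length.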
19How Voice Recognition Works (continued)
- 5. The End Result
- Results sometimes appear in the application and
are saved as an application file.
- Dragon Systems' Dragon NaturallySpeaking and
IBM's ViaVoice include their own word processors.
Results are saved in rich-text format or pasted
into another application.
20Voice Recognition Demonstration