Speech Recognition Application - PowerPoint PPT Presentation

About This Presentation
Title:

Speech Recognition Application

Description:

Speech Recognition Application. Voice Enabled Phone Directory - Yousef Rabah ... Automatic speech interacting phone directory assistance without human interaction. ... – PowerPoint PPT presentation

Slides: 12
Provided by: Tru3

Transcript and Presenter's Notes


1
Speech Recognition Application
  • Voice Enabled Phone Directory
  • - Yousef Rabah

2
Process of Speech Recognition
  • Speaker dependent vs. Speaker Independent
  • Vocabulary - Isolated vs. Continuous
  • Frequency changes
  • Pronunciation
  • Speech Processing
  • HMM: Probabilities, Parameters, Training
  • Phonemes to words

3
Problem
  • Automatic speech interacting phone directory
    assistance without human interaction.

4
Automatic Speech Recognition - Sphinx
  • Acoustic modeling
  • Language Model
  • Unigrams
  • Bigrams P(word2 | word1)
  • Trigrams P(word3 | word2, word1)
  • Lexicon Structure
  • ZERO Z IH R OW
  • ONE W AH N
  • TWO T UW
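
To make the language model and lexicon bullets concrete, the sketch below estimates a bigram probability P(word2 | word1) by counting word pairs, and expands words into phonemes using the three lexicon entries from the slide. It is an illustration under stated assumptions: the toy corpus and the function names are hypothetical, and a real Sphinx language model is built with its own tools and includes smoothing that is omitted here.

```python
from collections import Counter

# Lexicon entries copied from the slide: each word maps to its phoneme sequence.
LEXICON = {
    "ZERO": ["Z", "IH", "R", "OW"],
    "ONE": ["W", "AH", "N"],
    "TWO": ["T", "UW"],
}

def bigram_probability(tokens, word1, word2):
    """Maximum-likelihood estimate: count(word1 word2) / count(word1 used as a history)."""
    pair_counts = Counter(zip(tokens, tokens[1:]))
    history_counts = Counter(tokens[:-1])
    if history_counts[word1] == 0:
        return 0.0
    return pair_counts[(word1, word2)] / history_counts[word1]

def pronounce(words):
    """Expand a word sequence into one flat phoneme sequence via the lexicon."""
    return [phone for word in words for phone in LEXICON[word]]

# Hypothetical toy corpus of spoken digits.
corpus = "ONE TWO ONE ZERO ONE TWO".split()
p = bigram_probability(corpus, "ONE", "TWO")
phones = pronounce(["ONE", "TWO"])
print(round(p, 3), phones)  # 0.667 ['W', 'AH', 'N', 'T', 'UW']
```

ONE appears three times as a history and is followed by TWO twice, so the estimate is 2/3. A trigram estimate follows the same pattern with two-word histories.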

5
Input / Output
  • 24003 samples in file /usr/local/share/sphinx3/model/lm/an4/hell.raw
  • INFO live.c(239) live_nfeatvec 13
  • INFO main_live_pretend.c(92) PARTIAL HYP
  • INFO live.c(239) live_nfeatvec 12
  • INFO main_live_pretend.c(92) PARTIAL HYP
    A(2)
  • INFO live.c(239) live_nfeatvec 13
  • INFO main_live_pretend.c(92) PARTIAL HYP
    EIGHTH
  • INFO live.c(239) live_nfeatvec 12
  • INFO main_live_pretend.c(92) PARTIAL HYP
    H
  • INFO live.c(239) live_nfeatvec 13
  • INFO main_live_pretend.c(92) PARTIAL HYP
    H E
  • INFO live.c(239) live_nfeatvec 12
  • INFO main_live_pretend.c(92) PARTIAL HYP
    H E
  • INFO live.c(239) live_nfeatvec 13
  • INFO main_live_pretend.c(92) PARTIAL HYP
    H E L
  • INFO live.c(239) live_nfeatvec 12
  • INFO main_live_pretend.c(92) PARTIAL HYP
    H E L
  • INFO live.c(239) live_nfeatvec 13
  • INFO main_live_pretend.c(92) PARTIAL HYP
    H E L OH
  • Backtrace(null)
  • LatID  SFrm  EFrm      AScr     LScr  Type
  •   254     0    45   -391470   -74100    -1
  •   594    46    81   -472155  -148846     0  H
  •  1291    82   102   -288621  -148846     0  E
  •  1850   103   126   -235274  -148846     0  L
  •  2599   127   147   -430694  -148846     0  L
  •  2650   148   148         0  -148846     0
  •           0   148  -1818214  -818330        (Total)
  • FWDVIT H E L L (null)
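
The backtrace above can be checked mechanically. The sketch below parses the segment rows exactly as they appear on the slide, sums the two score columns, and collects the recognized words. Reading SFrm and EFrm as start and end frames, and AScr and LScr as acoustic and language model scores, is an assumption about the column abbreviations; the parser itself is hypothetical and not part of Sphinx.

```python
# Segment rows copied from the slide: LatID SFrm EFrm AScr LScr Type [word]
BACKTRACE_ROWS = """\
254 0 45 -391470 -74100 -1
594 46 81 -472155 -148846 0 H
1291 82 102 -288621 -148846 0 E
1850 103 126 -235274 -148846 0 L
2599 127 147 -430694 -148846 0 L
2650 148 148 0 -148846 0
"""

def parse_backtrace(text):
    """Turn whitespace-separated backtrace rows into a list of dictionaries."""
    segments = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 6:
            continue  # skip anything that is not a full segment row
        lat_id, start, end, ascr, lscr, seg_type = (int(f) for f in fields[:6])
        word = fields[6] if len(fields) > 6 else None
        segments.append({
            "lat_id": lat_id, "start": start, "end": end,
            "ascr": ascr, "lscr": lscr, "type": seg_type, "word": word,
        })
    return segments

segments = parse_backtrace(BACKTRACE_ROWS)
total_ascr = sum(s["ascr"] for s in segments)
total_lscr = sum(s["lscr"] for s in segments)
words = [s["word"] for s in segments if s["word"] is not None]
print(total_ascr, total_lscr, " ".join(words))  # -1818214 -818330 H E L L
```

The sums match the (Total) row, and the segments cover frames 0 through 148 without gaps, which is consistent with the final FWDVIT hypothesis H E L L.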

6
Difficulties
  • Hardware issues
  • ASR software issues
  • Letter phonemes - e-set
  • Time

7
Solution
  • Database (PostgreSQL)
  • Names
  • Numbers
  • Phone number
  • Fast access
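
A minimal sketch of the directory lookup follows. The presentation uses PostgreSQL; this example swaps in Python's built-in sqlite3 module so it runs without a server. The table name, column names, index, and sample entries are all hypothetical placeholders, since the slides do not show the actual schema.

```python
import sqlite3

def build_directory(entries):
    """Create an in-memory directory table with an index on the name column."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE directory (name TEXT NOT NULL, phone_number TEXT NOT NULL)"
    )
    # The index is one way to get the fast name lookups the slide asks for.
    conn.execute("CREATE INDEX idx_directory_name ON directory (name)")
    conn.executemany(
        "INSERT INTO directory (name, phone_number) VALUES (?, ?)", entries
    )
    conn.commit()
    return conn

def lookup_numbers(conn, name):
    """Return every phone number stored for a name (stored in upper case)."""
    rows = conn.execute(
        "SELECT phone_number FROM directory WHERE name = ? ORDER BY phone_number",
        (name.upper(),),
    )
    return [row[0] for row in rows]

# Made-up sample entries, for illustration only.
conn = build_directory([("SAM", "555-0100"), ("SARA", "555-0101")])
numbers = lookup_numbers(conn, "sam")
print(numbers)  # ['555-0100']
```

The parameterized query keeps recognized text out of the SQL string itself, which matters when the input comes from a speech recognizer rather than a trusted source.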

8
Solution
  • Example (general idea)
  • PC: Say the letters of the first name; press the space
    bar before and after you speak
  • User: S AA EM
  • PC: Did you say SAM?
  • Architecture of application
  • User Interaction
  • Connects to Database
  • Communicates with Sphinx
  • Uses C, Perl, and shell scripts
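
The confirmation step in the example dialog can be sketched as follows. The function joins the recognized letters into a candidate spelling, picks the closest known name, and builds the confirmation prompt. Fuzzy matching with Python's difflib is a technique swapped in for illustration; the slides do not say how the application matches letters to names, and the function name and name list are hypothetical.

```python
import difflib

def confirm_prompt(letters, known_names, cutoff=0.6):
    """Build a confirmation question for the closest known name, or None if nothing is close."""
    spelled = "".join(letters).upper()
    matches = difflib.get_close_matches(spelled, known_names, n=1, cutoff=cutoff)
    if not matches:
        return None
    return f"Did you say {matches[0]}?"

# Hypothetical directory names.
names = ["SAM", "SARA", "TOM"]

print(confirm_prompt(["S", "A", "M"], names))  # Did you say SAM?
print(confirm_prompt(["S", "A", "N"], names))  # Did you say SAM?
print(confirm_prompt(["X", "Y", "Z"], names))  # None
```

Tolerating one wrong letter, as in the second call, is one possible way to soften letter confusions such as the e-set problem listed under Difficulties; asking the user to confirm keeps a wrong guess from reaching the database lookup.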

9
Solution

10
Check List
  • Reading
  • ASR system
  • Database - PSQL
  • Applications in C, Perl, PHP, vxml, shell

11
Timeline