Integrating Multiple Knowledge Sources For Improved Speech Understanding


1
Integrating Multiple Knowledge Sources For
Improved Speech Understanding
  • Sherif Abdou, Michael Scordilis
    Department of Electrical and Computer Engineering, University of Miami
    Coral Gables, Florida 33124, U.S.A.

2
Abstract
  • The sentence produced by the decoder with the
    highest recognition probability may not be the
    best choice for extracting the intended concepts.
  • The more knowledge sources that take part in the
    selection process, the better the result that can
    be achieved.
  • In the late-disambiguation approach, many hypotheses
    are allowed to propagate through the system
    until there is enough knowledge to select the best
    one.
  • In this work, the recognition score, parsing score,
    dialog expectations, and prosody are used to
    select the best hypothesis.
  • The scaling weights of the combined scores are
    determined automatically by an optimization
    procedure.
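
The selection step described in the abstract can be sketched as a weighted sum over knowledge-source scores. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Hypothesis` class, the weight values, and every score below are hypothetical, and all scores are assumed to be normalized so that higher is better.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    text: str
    scores: dict  # knowledge source name -> score (hypothetical values)


def combined_score(hyp, weights):
    """Weighted sum of the knowledge-source scores of one hypothesis."""
    return sum(weights[name] * hyp.scores[name] for name in weights)


def select_best(hypotheses, weights):
    """Late disambiguation: score every surviving hypothesis, then pick one."""
    return max(hypotheses, key=lambda h: combined_score(h, weights))


# Illustrative weights and scores only; none of these come from the slides.
weights = {"recognition": 0.4, "parsing": 0.3, "dialog": 0.2, "prosody": 0.1}
n_best = [
    Hypothesis("I_need flight from Miami to Boston two days other Christmas",
               {"recognition": 0.9, "parsing": 0.3, "dialog": 0.5, "prosody": 0.5}),
    Hypothesis("I_need flight from Miami to Boston two days after Christmas",
               {"recognition": 0.8, "parsing": 0.6, "dialog": 0.5, "prosody": 0.5}),
]
best = select_best(n_best, weights)
```

With these made-up numbers the hypothesis with the highest recognition score is not the one selected, which is the situation the first bullet describes.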

3
System Architecture
I/O Interface
Prosodic Utterance Classifier
Speech Recognizer
Dialog Manager
Synthesizer
Parser
Acoustic Model
Language Model
Grammar
Goal Trees
Dialog History
Prerecorded Speech Units
Flights Database
4
Domain Plans
Travel
Departure Route
Return Route
User
Return Time
Depart Loc
Arrive Loc
Depart Date
Depart Time
Return Date
Hierarchical goal tree
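
A hierarchical goal tree of this kind might be represented as follows. The node names are taken from the slide, but the diagram's edges are lost in this transcript, so the grouping of leaves under Departure Route and Return Route is an assumption, as are the `GoalNode` class and its methods.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class GoalNode:
    name: str
    value: Optional[str] = None
    children: list = field(default_factory=list)

    def find(self, name):
        """Depth-first search for the node with the given name."""
        if self.name == name:
            return self
        for child in self.children:
            hit = child.find(name)
            if hit is not None:
                return hit
        return None

    def unfilled(self):
        """Names of leaf goals that have no value yet, in tree order."""
        if not self.children:
            return [] if self.value is not None else [self.name]
        missing = []
        for child in self.children:
            missing.extend(child.unfilled())
        return missing


# Assumed grouping; only the node names appear on the slide.
travel = GoalNode("Travel", children=[
    GoalNode("User"),
    GoalNode("Departure Route", children=[
        GoalNode("Depart Loc"), GoalNode("Arrive Loc"),
        GoalNode("Depart Date"), GoalNode("Depart Time"),
    ]),
    GoalNode("Return Route", children=[
        GoalNode("Return Date"), GoalNode("Return Time"),
    ]),
])

travel.find("Depart Loc").value = "MIAMI"
travel.find("Arrive Loc").value = "BOSTON"
remaining = travel.unfilled()
```

A dialog manager could use a list like `remaining` to decide which goal to ask about next.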
5
Discourse Plans
  • Clarifications: User-initiated subdialogs,
    usually questions, that ask about some
    feature of a concept related to one of the
    current plans on the focus stack.
  • Corrections: User-initiated subdialogs
    intended to correct part of an already
    constructed plan. They usually appear after
    explicit or implicit system confirmations.
  • Meta_communications: User-initiated subdialogs
    that refer to the dialogue itself, such as
    asking for repetitions or signaling
    non-understanding.

6
Parser Score
User utterance: "I need a flight from Miami to Boston two days after Christmas"
Recognizer output (i): I_need flight from Miami to Boston two days after Christmas
Parser output (i):
  Flight_Constraintsdepartloc ( FROM Location ( city ( City_Name ( MIAMI ) ) ) )
  Flight_Constraintsarriveloc ( TO Location ( city ( City_Name ( BOSTON ) ) ) )
  Flight_ConstraintsDate_Time ( Date (Date_Relative ( date_offset ( day_offset ( Number ( TWO ) ) DAYS _days_after ( AFTER ) ) ) ) holiday ( holiday_name ( CHRISTMAS ) ) ) )
Non-fragmented parse
7
Recognizer output (j): I_need flight from Miami to Boston two days other Christmas
Parser output (j):
  Flight_Constraintsdepartloc ( FROM Location ( city ( City_Name ( MIAMI ) ) ) )
  Flight_Constraintsarriveloc ( TO Location ( city ( City_Name ( BOSTON ) ) ) ) )
  Flight_ConstraintsTime_Range ( Time ( Hour ( TWO ) ))
  Flight_ConstraintsDate_Time ( Date ( holiday (holiday_name ( CHRISTMAS ) ) ) )
Fragmented parse
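
One way to turn the contrast between the two parses into a ranking is sketched below. The word and slot counts are read off recognizer and parser outputs (i) and (j); the ranking rule itself (word coverage first, then fewer slots as a crude proxy for fragmentation) is an assumption, not the scoring formula used by the authors.

```python
def rank_parses(parses):
    """Sort parses best-first: higher word coverage, then fewer slots."""
    return sorted(parses, key=lambda p: (-p["covered"] / p["words"], p["slots"]))


# Counts taken from the two examples: each recognizer output has 10 tokens
# (I_need counts as one); "covered" is the number of words that appear
# inside the parser output, "slots" the number of top-level frames.
parses = [
    {"id": "j", "words": 10, "covered": 6, "slots": 4},  # "... two days other Christmas"
    {"id": "i", "words": 10, "covered": 8, "slots": 3},  # "... two days after Christmas"
]
ranked = rank_parses(parses)
```

Under this rule parse (i) ranks above parse (j), because the misrecognized word "other" leaves "days" and "other" outside any slot and splits the date expression in two.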
8
Utterance Type Classification Tree
[Figure: decision tree that classifies an utterance as a question (Q) or a statement (S) from prosodic features. The root node has prior Q/S 0.5/0.5, and the first split listed is on F0_dif (> 15 vs. < 15). Further splits use End_slope (thresholds 4.07, 2.56, 3.51), Reg_shape (1 vs. -1), F0_range (thresholds 9 and 7), Pen_slope (threshold 1.59), and F0_pen_dif (threshold 5). Nodes are labeled Q or S with probabilities between 0.52 and 0.89. The branch layout is not recoverable from this transcript.]
9
How Prosody Can Help
Utterance transcription: and what's the first flight in the morning
Recognizer output (1): and with the first flight in the morning
Parser output (1):
  Flight_ReservationFlight_Reference( WITH THE Earliest(FIRST FLIGHT IN THE Time_Range( Time_spec( Period_Of_Day( MORNING ) ) ) ) )
Recognizer output (2): I'd what's the first flight in the morning
Parser output (2):
  Flight_ReservationRequest(Wh_form ( WHAT'S Flight_Reference(THE Earliest ( FIRST FLIGHT IN THE Time_Range( Time_Spec( Period_Of_Day( MORNING ) ) ) ) ) ) )
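
The example above suggests how a prosodic question/statement decision can separate two hypotheses that differ in utterance type. The sketch below is illustrative only: the probability value is made up, and treating a parse with a Wh_form as a question ("Q") and one without as a statement ("S") is an assumption.

```python
def prosody_score(parse_type, p_question):
    """Probability the prosodic classifier gives to the hypothesis's type."""
    return p_question if parse_type == "Q" else 1.0 - p_question


# Hypothetical classifier output for this utterance.
p_question = 0.85

hypotheses = [
    ("and with the first flight in the morning", "S"),    # parse (1): no Wh_form
    ("I'd what's the first flight in the morning", "Q"),  # parse (2): Request(Wh_form ...)
]
best_text, best_type = max(hypotheses, key=lambda h: prosody_score(h[1], p_question))
```

In the full system this value would be one term among several knowledge-source scores rather than the sole criterion.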
10
Weights Computation: Least Squares Minimization / Hill Climbing
The error function:
$E = \sum_i \left( G_i - \sum_j W_j S_{ij} \right)^2$
  i: training sample index
  j: knowledge source index
  $G_i$: training score, selected manually, for training sample i
  $W_j$: score weight for knowledge source j
  $S_{ij}$: score of knowledge source j for sample i
Get the minimum error by solving the system of k linear equations:
$\frac{\partial E}{\partial W_k} = -2 \sum_i S_{ik} \left( G_i - \sum_j W_j S_{ij} \right) = 0$
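
The least-squares route can be sketched in plain Python by forming and solving the normal equations that follow from setting the derivative of E to zero. The function name `fit_weights` and the toy data are hypothetical, and the hill-climbing alternative named in the slide title is not shown.

```python
def fit_weights(S, G):
    """Solve sum_i S[i][k] * (G[i] - sum_j W[j] * S[i][j]) = 0 for W."""
    n, k = len(S), len(S[0])
    # Normal equations: A[a][b] = sum_i S_ia * S_ib, rhs[a] = sum_i S_ia * G_i
    A = [[sum(S[i][a] * S[i][b] for i in range(n)) for b in range(k)]
         for a in range(k)]
    rhs = [sum(S[i][a] * G[i] for i in range(n)) for a in range(k)]

    # Gaussian elimination with partial pivoting on the augmented matrix.
    M = [row + [r] for row, r in zip(A, rhs)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular system: scores are linearly dependent")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, k):
            factor = M[r][col] / M[col][col]
            for c in range(col, k + 1):
                M[r][c] -= factor * M[col][c]

    # Back substitution.
    W = [0.0] * k
    for r in range(k - 1, -1, -1):
        acc = M[r][k] - sum(M[r][c] * W[c] for c in range(r + 1, k))
        W[r] = acc / M[r][r]
    return W


# Toy data: 4 training samples, 2 knowledge sources. G was generated from
# weights (0.7, 0.3), so the fit should recover those values.
S = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
G = [0.7, 0.3, 1.0, 1.7]
W = fit_weights(S, G)
```

The system is solvable only when the knowledge-source scores are not linearly dependent across the training samples, which the sketch checks through the pivot test.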
11
Experimental Results

 
Table 1. Comparison of system performance
Table 2. Measure of knowledge source contribution
12
CONCLUSION AND FUTURE WORK
  • Maximize the amount of information passed
    between system modules, and use all the
    higher-level knowledge to evaluate the
    different hypotheses. Decisions are made
    whenever possible and delayed when necessary.
  • Rank parse results according to word
    coverage and information content.
  • Use the expectation list generated from the
    current dialog state to select the most
    appropriate hypothesis.
  • Future work: use confidence measures from the
    recognizer output to confirm the selection.