Title: Integrating Multiple Knowledge Sources For Improved Speech Understanding
1 Integrating Multiple Knowledge Sources For Improved Speech Understanding
- Sherif Abdou, Michael Scordilis
  Department of Electrical and Computer Engineering, University of Miami
  Coral Gables, Florida 33124, U.S.A.
2 Abstract
- The sentence produced by the decoder with the
highest recognition probability may not be the
best choice for extracting the intended concepts.
- The more knowledge sources that take part in the
selection process, the better the result that can
be achieved.
- In the late disambiguation approach, many
hypotheses are permitted to propagate through the
system until there is enough knowledge to select
the best one.
- In this work, recognition score, parsing score,
dialog expectations, and prosody are used to
select the best hypothesis.
- The scaling weights of the combined scores are
determined automatically by an optimization
procedure.
3 System Architecture
I/O Interface
Prosodic Utterance Classifier
Speech Recognizer
Dialog Manager
Synthesizer
Parser
Acoustic Model
Language Model
Grammar
Goal Trees
Dialog History
Prerecorded Speech Units
Flights Database
4 Domain Plans
Travel
Departure Route
Return Route
User
Return Time
Depart Loc
Arrive Loc
Depart Date
Depart Time
Return Date
Hierarchical goal tree
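A minimal sketch of how a dialog manager might walk such a goal tree to decide what to ask next. The slide only lists the node names, so the parent/child layout below is an assumption, and `GOAL_TREE`, `leaves`, and `next_open_goal` are hypothetical names used for illustration only.

```python
# Assumed layout: the slide gives the node names but not the edges.
GOAL_TREE = {
    "Travel": {
        "User": {},
        "Departure Route": {
            "Depart Loc": {},
            "Arrive Loc": {},
            "Depart Date": {},
            "Depart Time": {},
        },
        "Return Route": {
            "Return Date": {},
            "Return Time": {},
        },
    }
}


def leaves(tree):
    """Yield leaf goal names in depth-first order."""
    for name, children in tree.items():
        if children:
            yield from leaves(children)
        else:
            yield name


def next_open_goal(tree, filled):
    """Return the first leaf goal that has no value yet, or None if all are filled."""
    for name in leaves(tree):
        if name not in filled:
            return name
    return None


# Illustrative dialog state after the user has given both cities.
filled = {"User": "guest", "Depart Loc": "MIAMI", "Arrive Loc": "BOSTON"}
print(next_open_goal(GOAL_TREE, filled))
```

Under this assumed layout, the first unfilled leaf is what the system would prompt for next.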
5 Discourse Plans
- Clarifications: User-initiated subdialogs,
usually questions, that ask about some feature of
a concept related to one of the current plans on
the focus stack.
- Corrections: User-initiated subdialogs intended
to correct part of an already constructed plan.
They usually appear after explicit or implicit
system confirmations.
- Meta_communications: User-initiated subdialogs
that refer to the dialogue itself, such as
asking for repetitions or signaling
non-understanding.
6 Parser Score
User utterance: "I need a flight from Miami to Boston two days after Christmas"
Recognizer output (i): I_need flight from Miami to Boston two days after Christmas
Parser output (i):
Flight_Constraintsdepartloc ( FROM Location ( city ( City_Name ( MIAMI ) ) ) )
Flight_Constraintsarriveloc ( TO Location ( city ( City_Name ( BOSTON ) ) ) )
Flight_ConstraintsDate_Time ( Date (Date_Relative ( date_offset ( day_offset ( Number ( TWO ) ) DAYS _days_after ( AFTER ) ) ) ) holiday ( holiday_name ( CHRISTMAS ) ) ) )
Non-fragmented parse
7 Recognizer output (j): I_need flight from Miami to Boston two days other Christmas
Parser output (j):
Flight_Constraintsdepartloc ( FROM Location ( city ( City_Name ( MIAMI ) ) ) )
Flight_Constraintsarriveloc ( TO Location ( city ( City_Name ( BOSTON ) ) ) ) )
Flight_ConstraintsTime_Range ( Time ( Hour ( TWO ) ))
Flight_ConstraintsDate_Time ( Date ( holiday (holiday_name ( CHRISTMAS ) ) ) )
Fragmented parse
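The contrast between the two parses can be made concrete with a small scoring sketch. The slides do not give the parser score formula, so the measure below (fraction of hypothesis words covered by the parse, with the number of top-level fragments as a tie-breaker) is an assumption, and `parse_score` and `rank_key` are hypothetical names.

```python
def parse_score(hypothesis_words, fragments):
    """Return (coverage, fragment_count) for one parsed hypothesis.

    fragments: one list of covered words per top-level slot the parser filled.
    """
    covered = sum(len(words) for words in fragments)
    coverage = covered / len(hypothesis_words)
    return coverage, len(fragments)


def rank_key(score):
    """Sort key: higher coverage first, fewer fragments breaks ties."""
    coverage, n_fragments = score
    return (coverage, -n_fragments)


# Hypothesis (i): the date phrase is covered by a single slot.
hyp_i = "I_need flight from Miami to Boston two days after Christmas".split()
frags_i = [["FROM", "MIAMI"], ["TO", "BOSTON"],
           ["TWO", "DAYS", "AFTER", "CHRISTMAS"]]

# Hypothesis (j): "days other" is left uncovered and the date splits in two.
hyp_j = "I_need flight from Miami to Boston two days other Christmas".split()
frags_j = [["FROM", "MIAMI"], ["TO", "BOSTON"], ["TWO"], ["CHRISTMAS"]]

scores = {"i": parse_score(hyp_i, frags_i), "j": parse_score(hyp_j, frags_j)}
best = max(scores, key=lambda name: rank_key(scores[name]))
print(scores, best)
```

Under this assumed measure, hypothesis (i) covers 8 of 10 words in 3 slots, while (j) covers 6 of 10 words in 4 slots, so (i) ranks first.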
8 Utterance Type Classification Tree
Q/S 0.5/0.5
F0_dif > 15
F0_dif < 15
S 0.69
Q 0.67
End_slope < 4.07
End_slope > 4.07
End_slope < 2.56
End_slope > 2.56
Q 0.77
S 0.56
S 0.8
Q 0.54
Reg_shape = -1
F0_range < 9
Pen_slope < 1.59
F0_range > 9
Reg_shape = 1
Pen_slope > 1.59
S 0.84
Q 0.6
S 0.63
Q 0.78
S 0.89
Q 0.85
F0_range > 7
Reg_shape = 1
F0_range < 7
Reg_shape = -1
Q 0.52
S 0.75
S 0.56
Q 0.77
End_slope > 3.51
F0_pen_dif > 5
F0_pen_dif < 5
End_slope < 3.51
Q 0.74
S 0.75
Q 0.85
S 0.85
9 How Prosody Can Help
Utterance transcription: and what's the first flight in the morning
Recognizer output (1): and with the first flight in the morning
Parser output (1): Flight_ReservationFlight_Reference( WITH THE Earliest( FIRST FLIGHT IN THE Time_Range( Time_spec( Period_Of_Day( MORNING ) ) ) ) )
Recognizer output (2): I'd what's the first flight in the morning
Parser output (2): Flight_ReservationRequest( Wh_form( WHAT'S Flight_Reference( THE Earliest( FIRST FLIGHT IN THE Time_Range( Time_Spec( Period_Of_Day( MORNING ) ) ) ) ) ) )
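One way to read this example: if the prosodic classifier labels the utterance as a question, the hypothesis whose parse is a question form should be preferred. The sketch below assumes that Q and S denote question and statement, that a `Wh_form` node marks a question parse, and that the classifier reports a question probability; the value 0.77 is illustrative only, and `parse_utterance_type` and `prosody_agreement` are hypothetical names.

```python
def parse_utterance_type(parse):
    """Guess the utterance type a parse implies (hypothetical rule)."""
    return "Q" if "Wh_form" in parse else "S"


def prosody_agreement(parse, p_question):
    """Probability the prosodic classifier gives to the type the parse implies."""
    if parse_utterance_type(parse) == "Q":
        return p_question
    return 1.0 - p_question


parse_1 = ("Flight_ReservationFlight_Reference( WITH THE Earliest( FIRST FLIGHT "
           "IN THE Time_Range( Time_spec( Period_Of_Day( MORNING ) ) ) ) )")
parse_2 = ("Flight_ReservationRequest( Wh_form( WHAT'S Flight_Reference( THE "
           "Earliest( FIRST FLIGHT IN THE Time_Range( Time_Spec( Period_Of_Day( "
           "MORNING ) ) ) ) ) ) )")

p_question = 0.77  # illustrative classifier output, not a reported result
for name, parse in (("1", parse_1), ("2", parse_2)):
    print(name, parse_utterance_type(parse), prosody_agreement(parse, p_question))
```

With a question-like prosodic contour, hypothesis (2) receives the higher agreement score, which matches the actual transcription.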
10 Weights Computation: Least Squares Minimization/Hill Climbing
The error function:
$$E = \sum_i \Big( G_i - \sum_j W_j S_{ij} \Big)^2$$
- $i$: training sample index
- $j$: knowledge source index
- $G_i$: training score, selected manually, for training sample $i$
- $W_j$: score weight for knowledge source $j$
- $S_{ij}$: score of knowledge source $j$ for sample $i$
Get the minimum error by solving a system of linear equations, one for each knowledge source $k$:
$$-2 \sum_i S_{ik} \Big( G_i - \sum_j W_j S_{ij} \Big) = 0$$
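The least-squares step can be sketched in plain Python by building the normal equations, which state that for each knowledge source k the sum over i of S_ik times (G_i minus the weighted score sum) is zero, and solving them by Gaussian elimination. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the training values are made up, and the hill-climbing alternative named in the slide title is not shown.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: scores are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_weights(scores, targets):
    """Weights W minimizing sum_i (G_i - sum_j W_j S_ij)^2."""
    n_sources = len(scores[0])
    # Normal equations: sum_j W_j (sum_i S_ik S_ij) = sum_i S_ik G_i for each k.
    a = [[sum(row[k] * row[j] for row in scores) for j in range(n_sources)]
         for k in range(n_sources)]
    b = [sum(row[k] * g for row, g in zip(scores, targets))
         for k in range(n_sources)]
    return solve_linear_system(a, b)


def combined_score(weights, source_scores):
    """Weighted sum of one hypothesis's knowledge-source scores."""
    return sum(w * s for w, s in zip(weights, source_scores))


# Made-up training data. Columns: recognition, parsing, dialog expectation, prosody.
training_scores = [
    [0.9, 0.8, 1.0, 1.0],
    [0.8, 0.3, 0.0, 0.0],
    [0.6, 0.9, 1.0, 0.0],
    [0.7, 0.5, 0.0, 1.0],
    [0.4, 0.2, 1.0, 1.0],
]
manual_targets = [1.0, 0.2, 0.7, 0.6, 0.5]  # G_i, assigned by hand

weights = fit_weights(training_scores, manual_targets)
print(weights)
print(combined_score(weights, [0.85, 0.7, 1.0, 0.0]))
```

At run time, each hypothesis would be ranked by `combined_score` with the fitted weights, and the highest-scoring one selected.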
11 Experimental Results
Table 1. Comparison of Systems Performance
Table 2. Measure for knowledge sources contribution
12 CONCLUSION AND FUTURE WORK
- Maximize the amount of information passed
between system modules, and use all the
higher-level knowledge to evaluate the different
hypotheses. Decisions are made whenever possible
and delayed when necessary.
- Rank parse results according to word coverage
and information content.
- Use the expectation list generated from the
current dialog state to select the most
appropriate hypothesis.
- Future work: use confidence measures from the
recognizer output to confirm the selection.