Integrating Multiple Knowledge Sources For Improved Speech Understanding

1
Integrating Multiple Knowledge Sources For
Improved Speech Understanding

Sherif Abdou, Michael Scordilis
Department of Electrical and Computer Engineering, University of Miami
Coral Gables, Florida 33124, U.S.A.

2
Abstract

The sentence that the decoder produces with the highest recognition probability may not be the best choice for extracting the intended concepts. The more knowledge sources that take part in the selection process, the better the result that can be achieved.
In the late disambiguation approach, many hypotheses are allowed to propagate through the system until there is enough knowledge to select the best one.
In this work, recognition score, parsing score, dialog expectations, and prosody are used to select the best hypothesis.
The scaling weights of the combined scores are determined automatically by an optimization procedure.
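The selection step described in the abstract can be sketched as a weighted sum over the four knowledge sources. This is a minimal illustration, not the authors' implementation: the `Hypothesis` class, the function names, the score values, and the weights are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    """One recognizer hypothesis with a score from each knowledge source."""
    text: str
    recognition: float  # recognizer score
    parsing: float      # parser score
    dialog: float       # agreement with dialog expectations
    prosody: float      # agreement with the prosodic utterance class


def combined_score(h, weights):
    """Weighted sum of the four knowledge-source scores."""
    scores = (h.recognition, h.parsing, h.dialog, h.prosody)
    return sum(w * s for w, s in zip(weights, scores))


def select_best(hypotheses, weights):
    """Late disambiguation: choose only after every source has scored."""
    return max(hypotheses, key=lambda h: combined_score(h, weights))


# Illustrative numbers only (assumed, not taken from the slides).
n_best = [
    Hypothesis("two days other Christmas", 0.92, 0.40, 0.50, 0.50),
    Hypothesis("two days after Christmas", 0.88, 0.90, 0.50, 0.50),
]
weights = (0.4, 0.3, 0.2, 0.1)
best = select_best(n_best, weights)
top_by_recognizer = max(n_best, key=lambda h: h.recognition)
```

With these assumed numbers, the hypothesis that the recognizer ranks first is not the one with the best combined score, which is the situation the abstract describes.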

3
System Architecture
I/O Interface
Prosodic Utterance Classifier
Speech Recognizer
Dialog Manager
Synthesizer
Parser
Acoustic Model
Language Model
Grammar
Goal Trees
Dialog History
Prerecorded Speech Units
Flights Database
4
Domain Plans
Hierarchical goal tree (figure). Node labels: Travel, User, Departure Route, Return Route, Depart Loc, Arrive Loc, Depart Date, Depart Time, Return Date, Return Time.
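A hierarchical goal tree of this kind could be represented as nested goal nodes whose leaves hold slot values. The sketch below is only an illustration: the `Goal` class is hypothetical, and the parent/child arrangement of the node labels is an assumption, since the transcript preserves the labels but not the edges of the figure.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Goal:
    """A node in a goal tree; leaves carry a filled-in value."""
    name: str
    children: list = field(default_factory=list)
    value: Optional[str] = None

    def find(self, name):
        """Depth-first search for a node by name."""
        if self.name == name:
            return self
        for child in self.children:
            hit = child.find(name)
            if hit is not None:
                return hit
        return None

    def open_leaves(self):
        """Names of leaf goals that still have no value."""
        if not self.children:
            return [] if self.value is not None else [self.name]
        return [n for c in self.children for n in c.open_leaves()]

    def is_satisfied(self):
        return not self.open_leaves()


# Assumed hierarchy built from the node labels on the slide.
travel = Goal("Travel", [
    Goal("User"),
    Goal("Departure Route", [
        Goal("Depart Loc"), Goal("Arrive Loc"),
        Goal("Depart Date"), Goal("Depart Time"),
    ]),
    Goal("Return Route", [Goal("Return Date"), Goal("Return Time")]),
])

travel.find("Depart Loc").value = "MIAMI"
travel.find("Arrive Loc").value = "BOSTON"
remaining = travel.open_leaves()
```

A dialog manager could use a list like `remaining` to decide which question to ask next.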
5
Discourse Plans

Clarifications: User-initiated subdialogs, usually questions, that ask about some feature of a concept related to one of the current plans on the focus stack.
Corrections: User-initiated subdialogs intended to correct part of an already constructed plan. They usually appear after explicit or implicit system confirmations.
Meta-communications: User-initiated subdialogs that refer to the dialogue itself, such as asking for repetitions or signaling non-understanding.

6
Parser Score
User utterance: "I need a flight from Miami to Boston two days after Christmas"
Recognizer output (i): I_need flight from Miami to Boston two days after Christmas
Parser output (i):
Flight_Constraints: departloc ( FROM Location ( city ( City_Name ( MIAMI ) ) ) )
Flight_Constraints: arriveloc ( TO Location ( city ( City_Name ( BOSTON ) ) ) )
Flight_Constraints: Date_Time ( Date ( Date_Relative ( date_offset ( day_offset ( Number ( TWO ) ) DAYS _days_after ( AFTER ) ) ) ) holiday ( holiday_name ( CHRISTMAS ) ) ) )
Non-fragmented parse
7
Recognizer output (j): I_need flight from Miami to Boston two days other Christmas
Parser output (j):
Flight_Constraints: departloc ( FROM Location ( city ( City_Name ( MIAMI ) ) ) )
Flight_Constraints: arriveloc ( TO Location ( city ( City_Name ( BOSTON ) ) ) )
Flight_Constraints: Time_Range ( Time ( Hour ( TWO ) ) )
Flight_Constraints: Date_Time ( Date ( holiday ( holiday_name ( CHRISTMAS ) ) ) )
Fragmented parse
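One way to turn the difference between the two parses into a number is to reward word coverage and penalize the number of separate slots. The sketch below is an assumption for illustration; the slides do not give the scoring formula, and `terminals`, `parse_score`, and the `slot_penalty` value are hypothetical. Terminal words are taken to be the all-uppercase tokens in each slot string.

```python
import re


def terminals(slot_parse):
    """Words covered by a slot: the all-uppercase tokens in its parse string."""
    tokens = re.findall(r"[A-Za-z_']+", slot_parse)
    return [t for t in tokens if t.isupper()]


def parse_score(slots, n_words, slot_penalty=0.05):
    """Assumed score: word coverage minus a small penalty per slot."""
    covered = sum(len(terminals(s)) for s in slots)
    return covered / n_words - slot_penalty * len(slots)


# Slot strings transcribed from parser outputs (i) and (j).
parse_i = [
    "departloc ( FROM Location ( city ( City_Name ( MIAMI ) ) ) )",
    "arriveloc ( TO Location ( city ( City_Name ( BOSTON ) ) ) )",
    "Date_Time ( Date ( Date_Relative ( date_offset ( day_offset ( "
    "Number ( TWO ) ) DAYS _days_after ( AFTER ) ) ) ) "
    "holiday ( holiday_name ( CHRISTMAS ) ) ) )",
]
parse_j = [
    "departloc ( FROM Location ( city ( City_Name ( MIAMI ) ) ) )",
    "arriveloc ( TO Location ( city ( City_Name ( BOSTON ) ) ) )",
    "Time_Range ( Time ( Hour ( TWO ) ) )",
    "Date_Time ( Date ( holiday ( holiday_name ( CHRISTMAS ) ) ) )",
]

N_WORDS = 10  # tokens in each recognizer output, counting I_need as one
score_i = parse_score(parse_i, N_WORDS)
score_j = parse_score(parse_j, N_WORDS)
```

Under this assumed formula, parse (i) covers eight words with three slots and parse (j) covers six words with four slots, so the non-fragmented parse scores higher.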
8
Utterance Type Classification Tree
[Figure: decision tree that classifies an utterance as question (Q) or statement (S), starting from a Q/S prior of 0.5/0.5. Split tests: F0_dif > 15 / < 15; End_slope at 4.07, 2.56, and 3.51; Reg_shape = 1 / -1; F0_range at 9 and 7; Pen_slope at 1.59; F0_pen_dif at 5. Node confidences range from 0.52 to 0.89.]
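A classification tree over prosodic features can be walked with a few lines of code. The sketch below is a toy example under stated assumptions: the feature names and thresholds appear on the slide, but the arrangement of the splits and the assignment of labels to leaves are hypothetical, because the figure's structure is not recoverable from the transcript. `Leaf`, `Split`, and `classify` are illustrative names.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    label: str   # "Q" (question) or "S" (statement)
    prob: float  # confidence attached to the leaf


@dataclass
class Split:
    feature: str
    threshold: float
    below: Union["Split", Leaf]  # taken when feature <= threshold
    above: Union["Split", Leaf]  # taken when feature > threshold


def classify(node, features):
    """Walk from the root to a leaf and return (label, confidence)."""
    while isinstance(node, Split):
        if features[node.feature] > node.threshold:
            node = node.above
        else:
            node = node.below
    return node.label, node.prob


# Toy tree: thresholds from the slide, structure assumed.
TOY_TREE = Split(
    "F0_dif", 15,
    below=Split("End_slope", 2.56, below=Leaf("S", 0.8), above=Leaf("Q", 0.54)),
    above=Split("End_slope", 4.07, below=Leaf("S", 0.56), above=Leaf("Q", 0.77)),
)

rising = classify(TOY_TREE, {"F0_dif": 20.0, "End_slope": 5.0})
flat = classify(TOY_TREE, {"F0_dif": 10.0, "End_slope": 1.0})
```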
9
How Prosody Can Help
Utterance transcription: and what's the first flight in the morning
Recognizer output (1): and with the first flight in the morning
Parser output (1): Flight_Reservation: Flight_Reference( WITH THE Earliest(FIRST FLIGHT IN THE Time_Range( Time_spec( Period_Of_Day( MORNING ) ) ) ) )
Recognizer output (2): I'd what's the first flight in the morning
Parser output (2): Flight_Reservation: Request(Wh_form ( WHAT'S Flight_Reference(THE Earliest ( FIRST FLIGHT IN THE Time_Range( Time_Spec( Period_Of_Day( MORNING ) ) ) ) ) ) )
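The two hypotheses above differ in utterance type: only the second parse contains a `Wh_form`, so a prosodic classifier that labels the utterance a question can favor it. The sketch below assumes a simple agreement rule; the `Wh_form` test, the function names, and the 0.85 confidence are illustrative assumptions, not the authors' method.

```python
def parse_utterance_type(parse):
    """Assumed heuristic: a parse containing Wh_form is a question."""
    return "Q" if "Wh_form" in parse else "S"


def prosody_score(parse, prosodic_label, prosodic_prob):
    """Confidence of the prosodic class if the parse agrees, else its complement."""
    if parse_utterance_type(parse) == prosodic_label:
        return prosodic_prob
    return 1.0 - prosodic_prob


parse_1 = ("Flight_Reference( WITH THE Earliest(FIRST FLIGHT IN THE "
           "Time_Range( Time_spec( Period_Of_Day( MORNING ) ) ) ) )")
parse_2 = ("Request(Wh_form ( WHAT'S Flight_Reference(THE Earliest ( "
           "FIRST FLIGHT IN THE Time_Range( Time_Spec( Period_Of_Day( "
           "MORNING ) ) ) ) ) ) )")

# Suppose the prosodic classifier says "question" with confidence 0.85.
s1 = prosody_score(parse_1, "Q", 0.85)
s2 = prosody_score(parse_2, "Q", 0.85)
```

Here `s2` exceeds `s1`, so the prosody term pushes the combined score toward the hypothesis that matches the spoken question.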
10
Weights Computation: Least Squares Minimization/Hill Climbing
The error function:
$$E = \sum_i \Big( G_i - \sum_j W_j S_{ij} \Big)^2$$
$i$: training sample index
$j$: knowledge source index
$G_i$: training score, selected manually, for training sample $i$
$W_j$: score weight for knowledge source $j$
$S_{ij}$: score of knowledge source $j$ for sample $i$
Get the minimum error by solving the system of linear equations, one for each knowledge source $k$:
$$-2 \sum_i S_{ik} \Big( G_i - \sum_j W_j S_{ij} \Big) = 0$$
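Rearranging the condition above gives the normal equations $\sum_j W_j \sum_i S_{ik} S_{ij} = \sum_i S_{ik} G_i$, one per knowledge source $k$. The sketch below solves them with plain Gaussian elimination; it covers only the least-squares route (not hill climbing), and the function names and the training data are assumptions made for the example.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[r]] for r, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("normal equations are singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def fit_weights(S, G):
    """S[i][j]: score of source j on sample i; G[i]: target score of sample i."""
    k = len(S[0])
    A = [[sum(row[p] * row[q] for row in S) for q in range(k)] for p in range(k)]
    rhs = [sum(row[p] * g for row, g in zip(S, G)) for p in range(k)]
    return solve_linear(A, rhs)


# Made-up training data generated from weights (0.5, 0.3, 0.2).
S = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 1.0, 1.0]]
G = [0.5, 0.3, 0.2, 1.0]
W = fit_weights(S, G)
```

Because the made-up targets are an exact weighted sum of the scores, the fit recovers the generating weights up to rounding; real training scores would leave a nonzero residual.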
11
Experimental Results

Table 1. Comparison of system performance
Table 2. Measure of each knowledge source's contribution
12
CONCLUSION AND FUTURE WORK

Maximize the amount of information passed between system modules, and use all the higher-level knowledge to evaluate the different hypotheses. Decisions are made whenever possible and delayed when necessary.
Rank parse results according to word coverage and information content.
Use the expectation list generated from the current dialog state to select the most appropriate hypothesis.
Future work: Use confidence measures from the recognizer output to confirm the selection.
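The expectation list mentioned in the conclusion can be compared against the slots filled by each parse. The sketch below is one assumed way to do this: `expectation_score` is a hypothetical helper, and the expected slot names are chosen for illustration as if the system had just asked for a departure date.

```python
def expectation_score(parse_slots, expected_slots):
    """Fraction of the parse's slots that the dialog state expects."""
    if not parse_slots:
        return 0.0
    hits = sum(1 for slot in parse_slots if slot in expected_slots)
    return hits / len(parse_slots)


# Assumed dialog state: the system has just asked when the user wants to leave.
expected = {"Date_Time", "Time_Range"}

on_topic = expectation_score(["Date_Time"], expected)
off_topic = expectation_score(["departloc", "arriveloc"], expected)
mixed = expectation_score(["departloc", "Date_Time"], expected)
```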