Title: Tight Coupling between ASR and MT in Speech-to-Speech Translation
1. Tight Coupling between ASR and MT in Speech-to-Speech Translation
- Arthur Chan
- Prepared for the Advanced Machine Translation Seminar
2. This Seminar
3. A Conceptual Model of Speech-to-Speech Translation
waveforms -> Speech Recognizer -> Decoding Result(s) -> Machine Translator -> Translation -> Speech Synthesizer -> waveforms
4. Motivation of Tight Coupling between ASR and MT
- The 1-best ASR output could be wrong
- MT could benefit from a wide range of supplementary information provided by ASR
- N-best list
- Lattice
- Sentence/word-based confidence scores
- E.g. word posterior probability
- Confusion network
- Or consensus decoding (Mangu 1999)
- Some observed that
- MT quality depends on the word error rate (WER).
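Since the WER is the quantity MT quality is said to depend on, a minimal sketch of how it is commonly computed may help: word-level edit distance between a reference transcript and the ASR hypothesis, divided by the reference length. The function name and the example sentences are illustrative assumptions, not taken from the talk.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference prefix seen so far
    # (before the current word) and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)


# One deleted word out of four reference words.
print(word_error_rate("i want to go", "i want go"))  # 0.25
```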
5. Scope of this talk
waveforms -> Speech Recognizer -> (1-best? N-best? Lattice? Confusion network?) -> Machine Translator -> Translation -> Speech Synthesizer -> waveforms
1. Should we combine the two?
2. How tight should the coupling be?
6. Topics Covered Today
- The concept of Coupling
- The tightness of coupling between ASR and X
- (Ringger 95)
- Interfaces between ASR and MT in loose coupling
- What could ASR provide?
- What could MT use?
- Very tight coupling
- Ney's formulae
- AT&T approach
- Combination of features of ASR and MT
- Direct Modeling
7. The Concept of Coupling
8. Classification of Coupling of ASR and Natural Language Understanding (NLU)
- Proposed in Ringger 95, Harper 94
- 3 Dimensions of ASR/NLU
- Complexity of the search algorithm
- Simple N-gram?
- Incrementality of the coupling
- On-line? Left-to-right?
- Tightness of the coupling
- Tight? Loose? Semi-tight?
9. Tightness of Coupling
Tight
Semi-Tight
Loose
10. Summary of Coupling between ASR and NLU
11. Implications for ASR/MT coupling
- The classification generalizes to many systems
- Loose coupling
- Any system which uses 1-best, N-best, or lattice output for 1-way module communication
- Tight coupling
- AT&T FST-based system
- Semi-tight coupling
- Filled in a quote here
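To make the loose-coupling case concrete, here is a minimal sketch of 1-way communication over an N-best interface: ASR hands its scored hypotheses to MT, and nothing flows back. Everything in it is an assumption for illustration: the function name, the weights, the log scores, and the toy lookup table standing in for a real translator.

```python
from typing import Callable, List, Tuple

Hypothesis = Tuple[str, float]  # (source sentence, ASR log score)


def loosely_coupled_translate(
    nbest: List[Hypothesis],
    translate: Callable[[str], Tuple[str, float]],
    asr_weight: float = 1.0,
    mt_weight: float = 1.0,
) -> Tuple[str, float]:
    """Translate every ASR hypothesis and keep the best combined score."""
    if not nbest:
        raise ValueError("empty N-best list")
    best = None
    for source, asr_score in nbest:
        target, mt_score = translate(source)  # MT never feeds back into ASR
        combined = asr_weight * asr_score + mt_weight * mt_score
        if best is None or combined > best[1]:
            best = (target, combined)
    return best


# Hypothetical MT component: a lookup table returning (translation, MT log score).
toy_mt = {
    "ice cream": ("glace", -0.2),
    "i scream": ("je crie", -2.0),
}

nbest = [("i scream", -0.5), ("ice cream", -1.0)]  # ASR prefers "i scream"
one_best = nbest[:1]

print(loosely_coupled_translate(one_best, toy_mt.__getitem__)[0])  # je crie
print(loosely_coupled_translate(nbest, toy_mt.__getitem__)[0])     # glace
```

With only the 1-best hypothesis, MT is stuck with the ASR choice; with the N-best list, the MT score can overturn it, while the modules still communicate in one direction only.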
12. Interfaces in Loose Coupling
13. Perspectives
- What output could an ASR generate?
- Not all of it is used, but it could mean opportunities in the future.
- What algorithms could MT use given a certain input?
- On-line algorithms are a focus
14. Decoding of HMM-based ASR
- Decoding of HMM-based ASR
- Searching for the best path in a huge HMM-state lattice.
- 1-best ASR result
- The best path one could find from backtracking.
- State Lattice (Next page)
15. (Figure: state lattice; not transcribed)
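The best-path search with backtracking can be sketched as a Viterbi pass over a tiny discrete HMM. This is a toy under stated assumptions: the two states, three observation symbols, and all probabilities are made up, and a real ASR decoder searches a far larger state lattice with pruning.

```python
import math


def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return (best state path, its log score) for a discrete HMM."""
    # trellis[t][s]: best log score of any path that ends in state s at time t.
    trellis = [{s: log_start[s] + log_emit[s][observations[0]] for s in states}]
    backptr = [{}]
    for t in range(1, len(observations)):
        scores, pointers = {}, {}
        for s in states:
            best_prev = max(states, key=lambda p: trellis[t - 1][p] + log_trans[p][s])
            scores[s] = (trellis[t - 1][best_prev] + log_trans[best_prev][s]
                         + log_emit[s][observations[t]])
            pointers[s] = best_prev
        trellis.append(scores)
        backptr.append(pointers)
    # Backtracking: start from the best final state and follow the pointers.
    last = max(states, key=lambda s: trellis[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(backptr[t][path[-1]])
    path.reverse()
    return path, trellis[-1][last]


def to_log(table):
    """Convert a (possibly nested) table of probabilities to log probabilities."""
    return {k: to_log(v) if isinstance(v, dict) else math.log(v)
            for k, v in table.items()}


# Made-up toy model.
states = ["A", "B"]
log_start = to_log({"A": 0.6, "B": 0.4})
log_trans = to_log({"A": {"A": 0.7, "B": 0.3}, "B": {"A": 0.4, "B": 0.6}})
log_emit = to_log({"A": {"x": 0.5, "y": 0.4, "z": 0.1},
                   "B": {"x": 0.1, "y": 0.3, "z": 0.6}})

best_path, best_log_score = viterbi(["x", "y", "z"], states,
                                    log_start, log_trans, log_emit)
print(best_path)  # ['A', 'A', 'B']
```

The `trellis` here plays the role of the state lattice, and the 1-best result is what backtracking through `backptr` returns.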
16. Things one could extract from the state lattice
- From the backtracking information
- N-best list
- The N best decoding results from the state lattice
- Lattice
- A lattice of the decoding, but at the word level
- From the lattice
- N-best list
- Confusion network.
- Or consensus decoding (Mangu 99)
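Extracting an N-best list from a word lattice can be sketched as a best-first search over a small directed acyclic graph. The lattice layout (node -> list of `(next_node, word, cost)` edges), the costs (read as negative log probabilities, so lower is better), and the words are all illustrative assumptions.

```python
import heapq


def n_best(lattice, start, end, n):
    """Return up to n (words, cost) paths from start to end, cheapest first.

    Assumes an acyclic lattice with non-negative edge costs, so complete
    paths are popped from the heap in order of increasing total cost.
    """
    heap = [(0.0, 0, start, ())]  # (cost so far, tie-breaker, node, words)
    counter = 1
    results = []
    while heap and len(results) < n:
        cost, _, node, words = heapq.heappop(heap)
        if node == end:
            results.append((list(words), cost))
            continue
        for next_node, word, edge_cost in lattice.get(node, []):
            heapq.heappush(
                heap, (cost + edge_cost, counter, next_node, words + (word,)))
            counter += 1
    return results


# Toy word lattice with made-up costs.
toy_lattice = {
    0: [(1, "I", 0.25), (2, "ice", 0.5), (1, "eye", 1.0)],
    1: [(3, "scream", 0.75)],
    2: [(3, "cream", 0.25)],
}

for words, cost in n_best(toy_lattice, start=0, end=3, n=2):
    print(" ".join(words), cost)
# ice cream 0.75
# I scream 1.0
```

This enumerates partial paths without merging them, which is fine for a toy lattice but grows quickly on real ones.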
17. Other things one could extract from the decoder
- Begin time and end time
- Useful in time-sensitive applications
- E.g. multi-modal applications
- Sentence/word-based confidence scores
- Found to be pretty useful on many other occasions
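One simple way to get a word-based confidence score is to approximate the word posterior probability from an N-best list: normalise the hypothesis scores into a distribution and sum the mass of the hypotheses that contain each word. This is a deliberately simplified sketch; it ignores word position and timing, whereas lattice-based posteriors would take them into account. The hypotheses and scores are made up.

```python
import math
from collections import defaultdict


def word_posteriors(nbest):
    """nbest: list of (words, log_score). Returns {word: posterior mass}."""
    if not nbest:
        raise ValueError("empty N-best list")
    # Softmax over hypothesis log scores (shifted by the max for stability).
    max_log = max(score for _, score in nbest)
    weights = [math.exp(score - max_log) for _, score in nbest]
    total = sum(weights)
    posterior = defaultdict(float)
    for (words, _), weight in zip(nbest, weights):
        for word in set(words):  # count each word once per hypothesis
            posterior[word] += weight / total
    return dict(posterior)


# Made-up N-best list whose scores are log probabilities summing to one.
toy_nbest = [
    (["ice", "cream"], math.log(0.5)),
    (["I", "scream"], math.log(0.3)),
    (["ice", "scream"], math.log(0.2)),
]

confidences = word_posteriors(toy_nbest)
print(round(confidences["ice"], 3))  # 0.7
```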
18. Experimental Results
19. How does MT use the output?
- Which decoding algorithms are used?
20. Tight Coupling
21. Literature
- Eric K. Ringger, "A Robust Loose Coupling for Speech Recognition and Natural Language Understanding," Technical Report 592, Computer Science Department, University of Rochester, 1995.
- The AT&T paper