Title: Speech-to-Speech MT: JANUS C-STAR/Nespole!
1 Speech-to-Speech MT: JANUS C-STAR/Nespole!
- Lori Levin, Alon Lavie, Bob Frederking
- LTI Immigration Course
- September 11, 2000
2 Outline
- Problems in Speech-to-Speech MT
- The JANUS Approach
- The C-STAR/NESPOLE! Interlingua (IF)
- System Design and Engineering
- Evaluation and User Studies
- Open Problems, Current and Future Research
3 JANUS Speech Translation
- Translation via an interlingua representation
- Main translation engine is rule-based
- Semantic grammars
- Modular grammar design
- System engineered for multiple domains
- Incorporate alternative translation engines
4 The C-STAR Travel Planning Domain
- General Scenario
- Dialogue between one traveler and one or more travel agents
- Focus on making travel arrangements for a personal leisure trip (not business)
- Free spontaneous speech
5 The C-STAR Travel Planning Domain
- Natural breakdown into several sub-domains
- Hotel Information and Reservation
- Transportation Information and Reservation
- Information about Sights and Events
- General Travel Information
- Cross Domain
6 Semantic Grammars
- Describe the structure of semantic concepts instead of the syntactic constituency of phrases
- Well suited for task-oriented dialogue containing many fixed expressions
- Appropriate for spoken language, which is often disfluent and syntactically ill-formed
- Faster to develop reasonable coverage for limited domains
7 Semantic Grammars
- Hotel Reservation Example
- Input: "we have two hotels available"
- Parse Tree:
  give-information+availability+hotel
    (we have
     hotel-type
       (quantity (two)
        hotel (hotels))
     available)
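As an illustration only, the following Python sketch mimics the example above with a toy concept grammar and a naive recursive matcher. The concept names follow the example, but the grammar format and the matcher are invented here and are not the SOUP formalism.

GRAMMAR = {
    "give-information+availability+hotel":
        ["we", "have", "[hotel-type]", "available"],
    "[hotel-type]": ["[quantity]", "[hotel]"],
    "[quantity]": ["two"],
    "[hotel]": ["hotels"],
}

def parse(concept, tokens, pos=0):
    """Match `concept` starting at tokens[pos]; return (tree, next_pos) or None."""
    children = []
    for item in GRAMMAR[concept]:
        if item.startswith("["):          # sub-concept: recurse
            result = parse(item, tokens, pos)
            if result is None:
                return None
            subtree, pos = result
            children.append(subtree)
        else:                             # terminal: match the word
            if pos >= len(tokens) or tokens[pos] != item:
                return None
            children.append(item)
            pos += 1
    return (concept, children), pos

tree, _ = parse("give-information+availability+hotel",
                "we have two hotels available".split())
print(tree)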
8 The JANUS-III Translation System
9 The JANUS-III Translation System
10 The SOUP Parser
- Specifically designed to parse spoken language using domain-specific semantic grammars
- Robust: can skip over disfluencies in the input
- Stochastic: probabilistic CFG encoded as a collection of RTNs with arc probabilities
- Top-Down: parses from the top-level concepts of the grammar down to matching of terminals
- Chart-based: dynamic matrix of parse DAGs indexed by start and end positions and head category
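As a rough illustration of a "collection of RTNs with arc probabilities", here is a hypothetical sketch of one such network for a hotel-type concept; the states, arcs and probabilities are invented, not taken from the SOUP grammars.

RTN_HOTEL_TYPE = {            # state -> [(arc label, arc probability, next state), ...]
    "S0": [("[quantity]", 0.7, "S1"), ("[hotel]", 0.3, "FINAL")],
    "S1": [("[hotel]", 1.0, "FINAL")],
}

def path_probability(labels):
    """Multiply arc probabilities along a path of labels from S0 to FINAL."""
    prob, state = 1.0, "S0"
    for label in labels:
        for arc_label, p, next_state in RTN_HOTEL_TYPE[state]:
            if arc_label == label:
                prob, state = prob * p, next_state
                break
        else:
            return 0.0            # no matching arc from this state
    return prob if state == "FINAL" else 0.0

print(path_probability(["[quantity]", "[hotel]"]))   # 0.7  ("two hotels")
print(path_probability(["[hotel]"]))                 # 0.3  ("hotels")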
11 The SOUP Parser
- Supports parsing with large multiple-domain grammars
- Produces a lattice of parse analyses headed by top-level concepts
- Disambiguation heuristics rank the analyses in the parse lattice and select a single best path through the lattice
- Graphical grammar editor
12 SOUP Disambiguation Heuristics
- Maximize coverage (of input)
- Minimize number of parse trees (fragmentation)
- Minimize number of parse tree nodes
- Minimize the number of wild-card matches
- Maximize the probability of parse trees
- Find the sequence of domain tags with maximal probability given the input words, i.e. maximize P(T | W), where T = t1, t2, ..., tn is a sequence of domain tags
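One simple way to combine such heuristics is to apply them lexicographically in the order listed above. The sketch below is an invented illustration of ranking candidate lattice paths that way; it is not the actual SOUP implementation, and the candidate numbers are made up.

from dataclasses import dataclass

@dataclass
class Path:
    words_covered: int    # input words covered by the parse trees on this path
    num_trees: int        # fragmentation: number of top-level trees
    num_nodes: int        # total parse-tree nodes
    num_wildcards: int    # wild-card matches
    log_prob: float       # summed log probability of the parse trees

def ranking_key(p):
    # Higher coverage and probability are better, so they are negated;
    # fewer trees, nodes and wild-cards are better.
    return (-p.words_covered, p.num_trees, p.num_nodes, p.num_wildcards, -p.log_prob)

candidates = [
    Path(5, 1, 7, 0, -4.2),   # one tree covering all five words
    Path(5, 2, 6, 0, -3.1),   # two fragments covering all five words
    Path(4, 1, 5, 1, -2.0),   # one tree, one word skipped, one wild-card
]
print(min(candidates, key=ranking_key))   # the single full-coverage tree wins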
13 JANUS Generation Modules
- Two alternative generation modules
- Top-Down context-free generator: fast; used for English and Japanese
- GenKit: unification-based generator augmented with the Morphe morphology module; used for German
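As a toy illustration of top-down generation from an IF dialogue act: the sketch below fills a string template per dialogue act. The real generators work from grammar rules rather than templates; the template and argument names here are invented.

TEMPLATES = {
    "give-information+availability+hotel": "we have {quantity} {object} available",
}

def generate(dialogue_act, args):
    return TEMPLATES[dialogue_act].format(**args)

print(generate("give-information+availability+hotel",
               {"quantity": "two", "object": "hotels"}))
# -> we have two hotels available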
14 Modular Grammar Design
- Grammar development separated into modules corresponding to sub-domains (Hotel, Transportation, Sights, General Travel, Cross Domain)
- Shared core grammar for lower-level concepts that are common to the various sub-domains (e.g. times, prices)
- Grammars can be developed independently (using the shared core grammar)
- Shared and Cross-Domain grammars significantly reduce the effort of expanding to new domains
- Separate grammar modules facilitate associating parses with domain tags, which is useful for multi-domain integration within the parser
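A hypothetical sketch of the modular idea: each sub-domain grammar and the shared core are kept as separate rule sets and merged before loading into the parser. The module contents below are invented for illustration.

CORE = {                      # shared core grammar: low-level concepts
    "[quantity]": ["two"],
    "[time]": ["three", "o'clock"],
}
HOTEL = {                     # hotel sub-domain module
    "[hotel]": ["hotels"],
    "[hotel-type]": ["[quantity]", "[hotel]"],
}
TRANSPORTATION = {            # transportation sub-domain module
    "[departure-time]": ["leaving", "at", "[time]"],
}

def load_grammar(*modules):
    """Merge grammar modules (concept -> expansion) into one rule set for the parser."""
    rules = {}
    for module in modules:
        rules.update(module)
    return rules

print(sorted(load_grammar(CORE, HOTEL, TRANSPORTATION)))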
15 Translation with Multiple Domain Grammars
- Parser is loaded with all domain grammars
- A domain tag is attached to the grammar rules of each domain
- Previously developed grammars for other domains can also be incorporated
- Parser creates a parse lattice consisting of multiple analyses of the input into sequences of top-level domain concepts
- Parser disambiguation heuristics rank the analyses in the parse lattice and select a single best sequence of concepts
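A hypothetical sketch of domain tagging: each top-level concept carries the tag of the grammar module it came from, so the selected lattice path also yields a sequence of domain tags. The concept names and tags below are invented.

RULE_DOMAIN = {               # domain tag attached to each top-level concept
    "greeting":                            "cross-domain",
    "give-information+availability+hotel": "hotel",
    "request-information+price+room":      "hotel",
    "give-information+temporal+departure": "transportation",
}

best_path = ["greeting", "give-information+availability+hotel"]
print([RULE_DOMAIN[c] for c in best_path])   # -> ['cross-domain', 'hotel']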
16 Translation with Multiple Domain Grammars
17 A SOUP Parse Lattice
18 Alternative Approaches: SALT
- SALT: Statistical Analyzer for Language Translation
- Combines ML-trainable and rule-based analysis methods for robustness and portability
- Rule-based parsing restricted to a well-defined set of argument-level phrases and fragments
- Trainable classifiers (NN, decision trees, etc.) used to derive the DA (speech act and concepts) from the sequence of argument concepts
- Phrase-level grammars are more robust and portable to new domains
19 SALT Approach
- Example
- Input: "we have two hotels available"
- Arg-SOUP: exist, hotel-type, available
- SA-Predictor: give-information
- Concept-Predictor: availability+hotel
- Predictors use SOUP argument concepts and input words (sketched below)
- Preliminary results are encouraging
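A hypothetical sketch of the SALT idea using scikit-learn: a decision tree predicts the speech act from the bag of argument-level concepts. The training data is invented, and the actual SALT classifiers and features may differ.

from sklearn.feature_extraction import DictVectorizer
from sklearn.tree import DecisionTreeClassifier

# Toy training data: bags of argument concepts paired with the SDU's speech act.
train_args = [
    {"exist": 1, "hotel-type": 1, "available": 1},
    {"price": 1, "room-type": 1, "question": 1},
    {"greeting": 1},
]
train_speech_acts = ["give-information", "request-information", "greeting"]

vec = DictVectorizer()
X = vec.fit_transform(train_args).toarray()
clf = DecisionTreeClassifier(random_state=0).fit(X, train_speech_acts)

# Classify the argument concepts of a new utterance.
test = vec.transform([{"exist": 1, "hotel-type": 1, "available": 1}]).toarray()
print(clf.predict(test)[0])   # -> give-information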
20 Alternative Approaches: MEMT
- Glossary-based translation
- Translates directly into the target language (no IF)
- Based on the Pangloss translation system developed at CMU
- Uses a combination of EBMT, phrase glossaries and a bilingual dictionary
- English/German system operational
- Good fall-back for uncovered utterances
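A hypothetical sketch of the fall-back behavior: use the interlingua-based engine when it covers the utterance, otherwise return the glossary/EBMT output. The functions and the tiny coverage table below are invented for illustration.

def translate_via_if(utterance):
    """Interlingua-based engine; returns None when the utterance is not covered."""
    covered = {"we have two hotels available": "wir haben zwei Hotels frei"}
    return covered.get(utterance)

def translate_via_glossary(utterance):
    """Glossary/EBMT fall-back: always produces some (possibly rough) output."""
    return "(glossary translation of: %s)" % utterance

def translate(utterance):
    return translate_via_if(utterance) or translate_via_glossary(utterance)

print(translate("we have two hotels available"))
print(translate("is breakfast included in the rate"))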
21 User Studies
- We conducted three sets of user tests
- Travel agent played by an experienced system user
- Traveler played by a novice given five minutes of instruction
- Traveler given a general scenario, e.g., plan a trip to Heidelberg
- Communication only via the speech translation system, a multi-modal interface and a muted video connection
- Data collected used for system evaluation, error analysis and then grammar development
22 System Evaluation Methodology
- End-to-end evaluations conducted at the SDU (sentence) level
- Multiple bilingual graders compare the input with the translated output and assign a grade of Perfect, OK or Bad
- OK: the meaning of the SDU comes across
- Perfect: OK plus fluent output
- Bad: translation incomplete or incorrect
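A toy sketch of summarizing SDU-level grades under this scheme, where "acceptable" counts Perfect plus OK. The grade counts below are invented, not the reported results.

from collections import Counter

grades = ["perfect"] * 53 + ["ok"] * 22 + ["bad"] * 25   # invented grade counts
counts = Counter(grades)
total = len(grades)
print("Perfect: %.1f%%  OK: %.1f%%  Acceptable (Perfect+OK): %.1f%%" % (
    100 * counts["perfect"] / total,
    100 * counts["ok"] / total,
    100 * (counts["perfect"] + counts["ok"]) / total))
# -> Perfect: 53.0%  OK: 22.0%  Acceptable (Perfect+OK): 75.0%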
23 August-99 Evaluation
- Data from the latest user study: a traveler planning a trip to Japan
- 132 utterances containing one or more SDUs, from six different users
- SR word error rate: 14.7%
- 40.2% of utterances contain recognition error(s)
24 Evaluation Results
25 Evaluation: Progress Over Time
26 NESPOLE!
- Speech-to-speech translation for eCommerce
- CMU, Karlsruhe, IRST, CLIPS, 2 commercial partners
- Improved limited-domain speech translation
- Experiment with multimodality and with MEMT
- EU side has strict scheduling and deliverables
- First test domain: Italian travel agency
- Second showcase: international help desk
- Tied in to C-STAR-III
27 C-STAR-III
- Partners: ATR, CMU, CLIPS, ETRI, IRST, UKA
- Main Research Goals
- Expandability: towards unlimited domains
- Accessibility: speech translation over wireless phone
- Usability: real service for real users
28 LingWear for the Information Warrior
- New Ideas
- The pre-development of appropriate interlingua representations for domains of interest facilitates generation into a new language within two weeks.
- The development of new MT engines (e.g. learnable transfer rules) and improved multi-engine integration supports rapid deployment of MT for a new language with scarce resources.
- Gisting and summarization in the source language followed by MT is better than vice versa.
- Impact
- Allow military and relief organizations to converse in limited domains of interest with the local population in an area of conflict and/or disaster
- Allow military and other operatives in the field to assimilate foreign-language information they encounter on the move
- Rapidly port and deploy the technology into new languages with scarce resources
- Schedule
- Port to second language
- Baseline summarizer ready
- Baseline MT systems ready
- Port to third language
Carnegie Mellon University School of Computer Science
A. Waibel, L. Levin, A. Lavie, R. Frederking
29 Current and Future Work
- Expanding the travel domain: covering descriptive as well as task-oriented sentences
- Development of the SALT statistical approach and expanding it to other domains
- Full integration of multiple MT approaches: SOUP, SALT, Pangloss
- Task-based evaluation
- Disambiguation: improved sentence-level disambiguation applying discourse and contextual information
30 Students Working on the Project
- Chad Langley: improved SALT approach
- Dorcas Wallace: DA disambiguation using decision trees, English grammars
- Taro Watanabe: DA correction and disambiguation using Transformation-Based Learning, Japanese grammars
- Ariadna Font-Llitjos: Spanish generation
31 The JANUS/C-STAR/Nespole! Team
- Project Leaders: Lori Levin, Alon Lavie, Alex Waibel, Bob Frederking
- Grammar and Component Developers: Donna Gates, Dorcas Wallace, Kay Peterson, Chad Langley, Taro Watanabe, Celine Morel, Susie Burger, Vicky Maclaren, Dan Schneider