Title: Recent Advances in Speech Translation Systems
1Recent Advances in Speech Translation Systems
- ESSLLI-2002 Tutorial Course
- August 12-16, 2002
- Course Organizers
- Alon Lavie, Carnegie Mellon University
- Lori Levin, Carnegie Mellon University
- Fabio Pianesi, ITC-irst
2Course Objectives and Format
- Present the NESPOLE! System and Research Project
as a Case Study for state-of-the-art speech
translation systems
- Survey the underlying language technology
involved, design considerations, components and
architecture
- Challenges, capabilities and limitations
- Tasks and methodologies involved
- Lessons learned
- Format
- Each day devoted to a different theme/aspect(s)
- Presentations by topic experts among senior
researchers working on the NESPOLE! Project
3Course Outline and Schedule
- Monday, 12 August
- Introduction and System Overview (Lavie and
Pianesi)
- Nespole! System Architecture (Lavie)
- Data Collection in Nespole! (Costantini)
- Tuesday, 13 August
- Interchange Format (Levin)
- Wednesday, 14 August
- Speech Recognition Challenges and Solutions
- ASR and Scalability (Vaufreydaz)
- ASR and Robustness (Metze)
- Multilinguality in Automatic Speech Recognition
Systems (Schultz)
4Course Outline and Schedule
- Thursday, 15 August
- Analysis and Generation Approaches
- Trainable Analysis Approach (Lavie)
- French Analysis and Generation Approaches
(Blanchon)
- Italian Generation (Pianta)
- Friday, 16 August
- Experimenting with Direct Approaches
- Statistical Machine Translation (Vogel)
- Evaluation in Nespole! (Lavie, Levin,
Costantini)
- Conclusion and Future Directions (Pianesi and
Lavie)
5Introduction
- Evolution of Speech Translation Systems
6NESPOLE! System Overview
- Human-to-human spoken language translation for
e-commerce applications (e.g. travel and tourism)
(Lavie et al., 2002)
- English, German, Italian, and French
- Translation via interlingua
- Translation servers for each language exchange
interlingua to perform translation
- Speech recognition (Speech → Text)
- Analysis (Text → Interlingua)
- Generation (Interlingua → Text)
- Synthesis (Text → Speech)
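The four stages above can be sketched as a small pipeline. This is only an illustrative sketch, not the NESPOLE! implementation: the `LanguageServer` class, the `translate` function, and the toy lookup tables are all hypothetical names, and the stub "recognizer" and "synthesizer" just decode and encode UTF-8 bytes. The point it tries to show is that only the interlingua crosses between the two language servers, so each language needs its own analysis and generation but no pairwise translation component.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class LanguageServer:
    """Hypothetical per-language server bundling the four stages."""
    language: str
    recognize: Callable[[bytes], str]    # Speech -> Text
    analyze: Callable[[str], str]        # Text -> Interlingua
    generate: Callable[[str], str]       # Interlingua -> Text
    synthesize: Callable[[str], bytes]   # Text -> Speech


def translate(audio: bytes, source: LanguageServer, target: LanguageServer) -> bytes:
    """Run source-side recognition and analysis, then target-side generation and synthesis."""
    text = source.recognize(audio)
    interlingua = source.analyze(text)
    # Only the interlingua is exchanged between the two servers.
    target_text = target.generate(interlingua)
    return target.synthesize(target_text)


def make_stub_server(language: str,
                     analysis_table: dict[str, str],
                     generation_table: dict[str, str]) -> LanguageServer:
    """Build a toy server backed by lookup tables (illustration only)."""
    return LanguageServer(
        language=language,
        recognize=lambda audio: audio.decode("utf-8"),
        analyze=lambda text: analysis_table[text],
        generate=lambda interlingua: generation_table[interlingua],
        synthesize=lambda text: text.encode("utf-8"),
    )


# Assumed interlingua string for a greeting; the notation is a guess at the IF syntax.
GREETING_IF = "c:greeting (greeting=hello)"

english = make_stub_server("English", {"hello": GREETING_IF}, {GREETING_IF: "Hello!"})
italian = make_stub_server("Italian", {"salve": GREETING_IF}, {GREETING_IF: "Salve."})

print(translate(b"hello", english, italian))
```

Adding a third language in this sketch means adding one more server, with no change to `translate` or to the existing servers.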
7Interchange Format
- Interchange Format (IF) is a shallow semantic
interlingua for task-oriented domains
- Utterances represented as sequences of semantic
dialog units (SDUs)
- IF representation consists of four parts
- Speaker
- Speech Act
- Concepts
- Arguments
- speaker : speech act + concepts (arguments)
- Domain Action: the speech act together with the concepts
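One way to picture the four-part representation is as a small record type. The sketch below is hypothetical: the class name `SemanticDialogUnit`, the use of `+` to join the speech act and concepts into a domain action, and the nested-dictionary shape for arguments are all assumptions made for illustration, not the project's actual data model. The speaker code `c` is taken from the course's own example; its meaning is not stated in the slides.

```python
from dataclasses import dataclass, field


@dataclass
class SemanticDialogUnit:
    """Hypothetical container for one SDU in Interchange Format."""
    speaker: str
    speech_act: str
    concepts: list[str] = field(default_factory=list)
    arguments: dict[str, object] = field(default_factory=dict)

    @property
    def domain_action(self) -> str:
        # Assumption: the domain action is the speech act joined to its concepts with "+".
        return "+".join([self.speech_act, *self.concepts])


greeting = SemanticDialogUnit("c", "greeting", arguments={"greeting": "hello"})
request = SemanticDialogUnit(
    speaker="c",
    speech_act="give-information",
    concepts=["disposition", "trip"],
    arguments={"location": {"place-name": "val_di_fiemme"}},
)

# An utterance is represented as a sequence of SDUs.
utterance = [greeting, request]
print([sdu.domain_action for sdu in utterance])
```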
8Example
- Hello. I would like to take a vacation in Val di
Fiemme.
hello i would like to take a vacation in val di
fiemme
c:greeting (greeting=hello)
c:give-information+disposition+trip
(disposition=(who=i, desire),
visit-spec=(identifiability=no, vacation),
location=(place-name=val_di_fiemme))
ENG: Hello! I want to travel for a vacation at
Val di Fiemme.
ITA: Salve. Io vorrei una vacanza in Val di Fiemme.
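To make the structure of the IF lines in this example concrete, here is a minimal parsing sketch. It assumes the separators are `:` after the speaker, `+` between the speech act and the concepts, and `=` between an argument name and its value; the function names `split_top_level` and `parse_if` are hypothetical. It splits only the top-level arguments and leaves nested values as raw strings, so it is a reading aid rather than a full IF parser.

```python
def split_top_level(text: str) -> list[str]:
    """Split on commas that are not nested inside parentheses."""
    parts, current, depth = [], [], 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        parts.append(tail)
    return parts


def parse_if(sdu: str) -> dict:
    """Parse one SDU string into speaker, speech act, concepts, and top-level arguments."""
    head, _, rest = sdu.partition("(")
    speaker, _, domain_action = head.strip().partition(":")
    speech_act, *concepts = domain_action.split("+")
    arg_text = rest.rstrip()
    if arg_text.endswith(")"):
        arg_text = arg_text[:-1]  # drop the parenthesis closing the argument list
    arguments = {}
    for item in split_top_level(arg_text):
        name, _, value = item.partition("=")
        arguments[name.strip()] = value.strip()  # nested values stay as raw text
    return {
        "speaker": speaker,
        "speech_act": speech_act,
        "concepts": concepts,
        "arguments": arguments,
    }


example = (
    "c:give-information+disposition+trip "
    "(disposition=(who=i, desire), "
    "visit-spec=(identifiability=no, vacation), "
    "location=(place-name=val_di_fiemme))"
)
parsed = parse_if(example)
print(parsed["speech_act"], parsed["concepts"])
print(parsed["arguments"]["location"])
```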