Tanja Schultz - PowerPoint PPT Presentation

1
Multilinguality in Automatic Speech Recognition Systems

Tanja Schultz
Carnegie Mellon University, LTI
ESSLLI workshop, Trento, August 14, 2002

2
Outline
Tanja Schultz

Multilinguality in Automatic Speech Recognition (ASR)
  Motivation, Topics
Monolingual ASR in many languages
  Challenges, Language peculiarities
Language Independent ASR
Language Portability
  Rapid deployment of LVCSR systems
  What is done, what needs to be solved
Applications of multilingual systems

3
Motivation: Multilinguality in ASR

Myth: Everyone speaks English, so why bother?
No: There are about 4500 different languages in the world, and an increasing number of languages on the Internet (an indicator of growing diversity)
Computerization and globalization feed the need for (civilian and military) applications in many languages
Language diversity and digital discrimination: rapid deployment of systems in many languages
Myth: It's just retraining on foreign data, no science
No: Other languages bring unseen challenges, e.g. scripts, vocabulary and morphology, tonality, less-than-ideal resources, little if any speech/text data
Automatic speech recognition in many languages: human-to-human communication, human-computer interfaces

4
Topics in Multilinguality

Monolingual ASR systems in many languages
Language Portability
  Combine data and knowledge of many languages
  Faster cycles: years -> days
  Less data: low-density languages
  ASR systems: acoustic model, dictionary, LM
Multilingual applications
  Multilingual assistance / user interfaces
  Foreign accents, non-native speech
  Language, accent, and speaker identification

5
Projects addressing Multilinguality
6
Available Multilingual Resources

What do we need for ASR? Audio, text, dictionary
Data distribution, web pages and catalogues online:
http://www.icp.inpg.fr/ELRA/home.html
http://www.ldc.upenn.edu/
LDC: Broadcast News (En, Ch, Sp, Cz, Ja, Ar), CallFriend, CallHome (13 languages, only a few transcripts)
ELRA: SpeechDat, SpeechDat-car, Aurora, Verbmobil, Polyphone, Accor, SpeeCon
? 20 languages extensively studied so far
Interactive Systems Labs: GlobalPhone

7
GlobalPhone

Multilingual Database
  Widespread languages
  Native speakers
  Uniformity
  Broad domain
  Huge text resources
  Internet newspapers
Total sum of resources
  15 languages so far
  ? 300 hours speech data
  ? 1400 native speakers

Languages: Arabic, Chinese-Mandarin, Chinese-Shanghai, Czech, Croatian, French, German, Japanese, Korean, Portuguese, Russian, Spanish, Swedish, Tamil, Turkish
Soon available from ELRA!
8
Language Peculiarities

Prosody: tonal languages like Mandarin
Sound system: simple vs. very complex systems
Phonotactics: simple syllable structure vs. complex clusters
Scripts, letter-to-sound (l-2-s): simple 1:1 mapping vs. pictographs
Morphology, Segmentation
  Natural segmentation into units suitable for LVCSR (English)
  Compounds (German)
    Donau-dampf-schiffahrts-gesellschafts-kapitän
    "The captain of the company that operates the steamboats on the Donau River"
  Word phrases due to morphological structure (Turkish, Korean)
    Osman-lı-laş-tır-ama-yabil-ecek-ler-imiz-den-miş-siniz
    "behaving as if you were of those whom we might consider not converting into Ottoman"
  No segmentation at all (Chinese, Japanese)

9
Monolingual Recognizers in 10 Languages
10
Language Independent ASR

Can we build a language-independent ASR system?
Universal (language-independent) acoustic modeling
  Sound production is human-specific, not language-specific
  International Phonetic Alphabet (IPA): simple to implement, easy to port to new languages
  Fully data-driven procedure: considers spectral properties and similarities; applies to context-independent and context-dependent models
Universal language modeling
  Combine LMs of several languages to allow code switching
Experiments
  Train language-dependent and language-independent AM + LM
  Evaluate in monolingual and multilingual mode
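
The IPA-based route to a universal acoustic model can be illustrated with a small sketch: phonemes from different languages that carry the same IPA symbol are pooled into one shared unit, while the rest stay language-specific. The inventories, the phoneme names (`DE_a`, `FR_ch`, ...) and the function name below are hypothetical, chosen only for illustration; they are not the inventories used in the talk.

```python
from collections import defaultdict

# Hypothetical inventories: language-specific phoneme name -> IPA symbol.
# Tiny illustrative sets, not the real phoneme sets of these languages.
INVENTORIES = {
    "German": {"DE_a": "a", "DE_sch": "ʃ", "DE_ue": "y"},
    "French": {"FR_a": "a", "FR_ch": "ʃ", "FR_u": "y", "FR_gn": "ɲ"},
    "Spanish": {"SP_a": "a", "SP_ny": "ɲ", "SP_rr": "r"},
}


def build_global_inventory(inventories):
    """Group language-specific phonemes that share an IPA symbol."""
    shared = defaultdict(list)
    for language, phonemes in inventories.items():
        for name, ipa in phonemes.items():
            shared[ipa].append((language, name))
    return dict(shared)


global_units = build_global_inventory(INVENTORIES)

# Units observed in more than one language would be trained on pooled data;
# the remaining units are seen in a single language only.
poly_units = {
    ipa
    for ipa, members in global_units.items()
    if len({language for language, _ in members}) > 1
}
mono_units = set(global_units) - poly_units
```

In this toy setup the eleven language-specific phonemes collapse into five shared units, which is the source of the "easy to port" property: a new language reuses every unit whose IPA symbol is already covered.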

11
Language Independent AM
12
Language Portability
13
Language Portability: AM
Model mapping to the target language
1) Map the multilingual phonemes to Portuguese ones based on the IPA scheme
2) Copy the corresponding acoustic models in order to initialize Portuguese models
Problem: Contexts are language-specific; how can context-dependent models be applied to a new target language?
Solution: Adaptation of multilingual contexts to the target language based on limited training data
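
The two mapping steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the model pool, the Portuguese phoneme names, and the fallback table for IPA symbols missing from the pool are all hypothetical, and a plain dictionary of numbers stands in for real acoustic model parameters. The subsequent adaptation on limited target-language data is not shown.

```python
import copy

# Hypothetical multilingual pool: IPA symbol -> model parameters.
# A dict of floats stands in for real HMM/GMM parameters.
MULTILINGUAL_MODELS = {
    "a": {"mean": [0.1, 0.2], "var": [1.0, 1.0]},
    "s": {"mean": [0.5, 0.4], "var": [0.8, 0.9]},
    "l": {"mean": [0.3, 0.7], "var": [1.1, 0.7]},
}

# Hypothetical Portuguese phonemes with their IPA labels.
PORTUGUESE_PHONEMES = {"PT_a": "a", "PT_s": "s", "PT_lh": "ʎ", "PT_ax": "ɐ"}

# Assumption: a hand-chosen substitute for IPA symbols absent from the pool.
FALLBACK = {"ʎ": "l"}


def initialize_target_models(target_phonemes, pool, fallback):
    """Step 1: map each target phoneme to a pool model via its IPA symbol.
    Step 2: copy that model as the initial target-language model."""
    models, unmapped = {}, []
    for name, ipa in target_phonemes.items():
        source = ipa if ipa in pool else fallback.get(ipa)
        if source is None or source not in pool:
            unmapped.append(name)  # needs a manual decision or training from scratch
            continue
        models[name] = copy.deepcopy(pool[source])  # independent copy, safe to adapt later
    return models, unmapped


portuguese_models, unmapped = initialize_target_models(
    PORTUGUESE_PHONEMES, MULTILINGUAL_MODELS, FALLBACK
)
```

The deep copy matters because the initialized Portuguese models are meant to be adapted afterwards; sharing objects with the pool would silently change the multilingual models as well.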
14
Language Portability: Experiments

15
Language Portability: Dictionary
16
Language Portability: Dictionary
Task: In each language a pronunciation has to be generated for each word
Problem: Hand-crafting is very expensive; a rule-based approach requires a letter-to-sound relationship and linguistic knowledge
Letter-to-Sound relationship
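
To make the rule-based approach concrete, here is a minimal sketch of a letter-to-sound converter that applies the longest matching spelling rule at each position. The rule table is a hypothetical, loosely Spanish-like toy set written for this example; a usable rule set for a real language would be far larger and would need exactly the linguistic knowledge the slide mentions.

```python
# Hypothetical letter-to-sound rules: grapheme (one or more letters) -> phones.
RULES = {
    "ch": ["tʃ"],
    "ll": ["ʎ"],
    "qu": ["k"],
    "a": ["a"], "e": ["e"], "i": ["i"], "o": ["o"], "u": ["u"],
    "c": ["k"], "s": ["s"], "l": ["l"], "m": ["m"], "n": ["n"],
    "h": [],  # silent letter in this toy rule set
}


def letter_to_sound(word, rules):
    """Convert a word to phones by greedy longest-match over the rule table."""
    max_len = max(len(grapheme) for grapheme in rules)
    word = word.lower()
    phones, i = [], 0
    while i < len(word):
        # Try the longest grapheme first so that "ch" wins over "c" + "h".
        for length in range(min(max_len, len(word) - i), 0, -1):
            chunk = word[i:i + length]
            if chunk in rules:
                phones.extend(rules[chunk])
                i += length
                break
        else:
            raise ValueError(f"no rule for {word[i]!r} in {word!r}")
    return phones


def build_dictionary(words, rules):
    """Generate one pronunciation per word for a pronunciation dictionary."""
    return {word: letter_to_sound(word, rules) for word in words}


lexicon = build_dictionary(["chico", "hola", "llama"], RULES)
```

Such a table works only when spelling and pronunciation are closely related; for languages without that relationship the rules either explode in number or fail, which is the cost the slide points to.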