Tanja Schultz - PowerPoint PPT Presentation


1
Multilinguality in Automatic Speech Recognition
Systems
  • Tanja Schultz
  • Carnegie Mellon University, LTI
  • ESSLLI workshop, Trento, August 14, 2002

2
Outline
  • Multilinguality in Automatic Speech Recognition
    (ASR)
  • Motivation, Topics
  • Monolingual ASR in many languages
  • Challenges, Language peculiarities
  • Language Independent ASR
  • Language Portability
  • Rapid deployment of LVCSR systems
  • What is done, what needs to be solved
  • Applications of multilingual systems

3
Motivation: Multilinguality in ASR
  • Myth: Everyone speaks English, why bother?
  • No: About 4500 different languages in the world,
    increasing number of languages on the Internet
    (as indicator for growing diversity)
  • Computerization and globalization feed the need
    for (civilian and military) applications in many
    languages
  • Language diversity and digital discrimination:
    rapid deployment of systems in many languages
  • Myth: It's just retraining on foreign data, no
    science
  • No: Other languages bring unseen challenges,
    e.g. scripts, vocabulary and morphology,
    tonality, less than ideal resources, few if any
    speech/text data
  • Automatic speech recognition in many languages:
    human-to-human communication, human-computer
    interfaces

4
Topics in Multilinguality
  • Monolingual ASR systems in many languages
  • Language Portability
  • Combine Data and Knowledge of many languages
  • Faster cycles: years -> days
  • Less data: low-density languages
  • ASR systems: acoustic model, dictionary, LM
  • Multilingual applications
  • Multilingual assistance / user interfaces
  • Foreign Accents - non-native speech
  • Language, Accent, and Speaker Identification

5
Projects addressing Multilinguality
6
Available Multilingual Resources
  • What do we need for ASR? Audio, text, dictionary
  • Data distribution, webpages and catalogues online
  • http://www.icp.inpg.fr/ELRA/home.html
  • http://www.ldc.upenn.edu/
  • LDC: Broadcast News (En, Ch, Sp, Cz, Ja, Ar),
    CallFriend, CallHome (13 languages, only few
    transcripts)
  • ELRA: SpeechDat, SpeechDat-car, Aurora,
    Verbmobil, Polyphone, Accor, SpeeCon
  • ? 20 languages extensively studied so far
  • Interactive Systems Labs GlobalPhone

7
GlobalPhone
  • Multilingual Database
  • Widespread languages
  • Native speakers
  • Uniformity
  • Broad domain
  • Huge text resources
  • Internet newspapers
  • Total sum of resources
  • 15 languages so far
  • ? 300 hours speech data
  • ? 1400 native speakers

Languages: Arabic, Ch-Mandarin, Ch-Shanghai, Czech, Croatian,
French, German, Japanese, Korean, Portuguese,
Russian, Spanish, Swedish, Tamil, Turkish
Soon available from ELRA!
8
Language Peculiarities
  • Prosody: tonal languages like Mandarin
  • Sound system: simple vs. very complex systems
  • Phonotactics: simple syllable structure vs.
    complex clusters
  • Scripts, letter-to-sound: simple 1:1 mapping vs.
    pictographs
  • Morphology, segmentation:
  • Natural segmentation into units suitable for
    LVCSR (English)
  • Compounds (German):
  • Donau-dampf-schiffahrts-gesellschafts-kapitän
  • "The captain of the company that operates the
    steamboats on the Donau river"
  • Word phrases due to morphological structure
    (Turkish, Korean):
  • Osman-lı-laş-tır-ama-yabil-ecek-ler-imiz-den-miş-siniz
  • "Behaving as if you were of those whom we might
    consider not converting into Ottoman"
  • No segmentation at all (Chinese, Japanese)
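
The compound example above can be broken into smaller recognition units by dictionary lookup. The following is only a minimal sketch of one possible approach (recursive longest-prefix matching with an optional linking "s"); the slides do not say how segmentation was done, and the lexicon, the linker list and the function name are assumptions made for illustration.

```python
from typing import List, Optional

# Toy lexicon of German word parts (assumption); a real system would
# derive its unit inventory from large text resources.
LEXICON = {"donau", "dampf", "schiffahrt", "gesellschaft", "kapitän"}
# Optional linking element between parts: none, or the German linking "s".
LINKERS = ("", "s")


def split_compound(word: str) -> Optional[List[str]]:
    """Return the lexicon parts that make up `word`, or None if no full split exists."""
    word = word.lower()
    if not word:
        return []
    for end in range(len(word), 0, -1):  # try the longest prefix first
        part = word[:end]
        if part not in LEXICON:
            continue
        rest = word[end:]
        for linker in LINKERS:
            if not rest.startswith(linker):
                continue
            if linker and rest == linker:
                continue  # a linker must be followed by another part
            tail = split_compound(rest[len(linker):])
            if tail is not None:
                return [part] + tail
    return None


parts = split_compound("Donaudampfschiffahrtsgesellschaftskapitän")
print(parts)
```

With this toy lexicon the long compound splits into its five parts, while a word containing an unknown part returns None and would need another strategy.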

9
Monolingual Recognizers in 10 Languages
10
Language Independent ASR
  • Can we build a language independent ASR system?
  • Universal (Language Independent) Acoustic
    Modeling
  • Sound production is human, not language, specific
  • International Phonetic Alphabet (IPA): simple
    to implement, easy to port to new languages
  • Fully data-driven procedure: considers
    spectral properties and similarities; applies to
    context-independent and context-dependent models
  • Universal Language Modeling
  • Combine LMs of several languages to allow code switching
  • Experiments
  • Train language-dependent and language-independent AM and LM
  • Evaluate in monolingual and multilingual mode
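
One simple way to combine the LMs of several languages is linear interpolation over the merged vocabulary, so that words of every language get nonzero probability in a single model. The slides do not state which combination method was used; the sketch below assumes unigram models and equal weights, and all function names and the example sentences are made up for illustration.

```python
from collections import Counter
from typing import Dict, List


def train_unigram(tokens: List[str]) -> Dict[str, float]:
    """Maximum-likelihood unigram LM: relative frequency of each word."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def combine_lms(lms: Dict[str, Dict[str, float]],
                weights: Dict[str, float]) -> Dict[str, float]:
    """Linear interpolation of per-language LMs over the merged vocabulary.

    The weights are assumed to sum to 1 so the result is again a distribution.
    """
    combined: Dict[str, float] = {}
    for lang, lm in lms.items():
        for word, prob in lm.items():
            combined[word] = combined.get(word, 0.0) + weights[lang] * prob
    return combined


english = train_unigram("the train leaves at nine".split())
german = train_unigram("der zug fährt um neun".split())
universal = combine_lms({"en": english, "de": german}, {"en": 0.5, "de": 0.5})
print(sorted(universal))
```

A real system would use higher-order n-grams and tuned weights; the point here is only that a single combined model can score an utterance that switches between languages.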

11
Language Independent AM
12
Language Portability
13
Language Portability: AM
  • Model mapping to the target language
  • 1) Map the multilingual phonemes to Portuguese ones
    based on the IPA scheme
  • 2) Copy the corresponding acoustic models in order to
    initialize Portuguese models
  • Problem: Contexts are language specific; how to apply
    context-dependent models to a new target language?
  • Solution: Adaptation of multilingual contexts to the
    target language based on limited training data
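
The two mapping steps can be sketched as follows. This is an illustration under stated assumptions, not the system described on the slide: the model pool, the Portuguese phoneme names, the parameter vectors and the function name are all hypothetical, and real acoustic models are far richer than a short list of numbers.

```python
from typing import Dict, List, Tuple

# Hypothetical multilingual pool: IPA symbol -> acoustic model parameters.
multilingual_models: Dict[str, List[float]] = {
    "a": [0.1, 0.2],
    "e": [0.3, 0.1],
    "s": [0.7, 0.9],
    "ʃ": [0.6, 0.8],
}

# Hypothetical target inventory: Portuguese phoneme name -> IPA symbol.
portuguese_to_ipa: Dict[str, str] = {
    "PT_a": "a",
    "PT_e": "e",
    "PT_s": "s",
    "PT_x": "ʃ",
    "PT_lh": "ʎ",  # has no counterpart in the toy pool above
}


def initialize_target_models(
    target_to_ipa: Dict[str, str], pool: Dict[str, List[float]]
) -> Tuple[Dict[str, List[float]], List[str]]:
    """Step 1: look up each target phoneme by its IPA symbol.
    Step 2: copy the matching multilingual model as the initial target model.
    Phonemes without a match are returned separately."""
    models: Dict[str, List[float]] = {}
    unmapped: List[str] = []
    for phoneme, ipa in target_to_ipa.items():
        if ipa in pool:
            # Copy, so that later adaptation does not change the pool.
            models[phoneme] = list(pool[ipa])
        else:
            unmapped.append(phoneme)
    return models, unmapped


models, unmapped = initialize_target_models(portuguese_to_ipa, multilingual_models)
print(sorted(models), unmapped)
```

Phonemes left unmapped would need a fallback, for example the closest available sound, before adaptation on the limited target-language data.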
14
Language Portability: Experiments

15
Language Portability: Dictionary
16
Language Portability: Dictionary
  • Task: In each language a pronunciation has to be
    generated for each word
  • Problem: Hand-crafting is very expensive; a rule-based
    approach requires a letter-to-sound relationship and
    linguistic knowledge
  • Letter-to-sound relationship
  • Solutions: Fully automatic dictionary
    generation
  • Apply letter-to-sound rules where possible
  • Use phoneme recognizers in different languages
  • Learn sound units and dictionary units from scratch
17
Language Portability: LM
18
Conclusions, To-Dos
  • Applications are needed in many languages
  • Many applications like S-2-S require multilingual
    systems
  • Monolingual ASR
  • Language peculiarities: morphology, segmentation,
    scripts, ...
  • Language Independent ASR
  • AM: very useful for independent and adaptive
    applications
  • Potential for non-native / accented speech
  • LM: universal language model allows code
    switching
  • Language Portability
  • Language-adaptive AMs reduce the need for new data
  • What needs to be solved, improved, ...
  • Dictionary, language modeling, task-specific vs. large vocabulary