Current Status and Future Challenges

Description:

Benchmarking HLT: a Danish Perspective, Hotel Scandic, Copenhagen, April 30, 2003

Provided by: cst


1
Current Status and Future Challenges
- a Speech Technology View
Assoc. Prof. Børge Lindberg
Speech and Multimedia Communication Division
(Center for PersonKommunikation (CPK) is now integrated into
the Department of Communication Technology)
Aalborg University, Denmark
E-mail: lindberg_at_kom.auc.dk
2
Current status: speech recognition
  • A corpus-based technology
  • Statistical models derived from acoustic and
    linguistic databases!
  • Assisted by significant media development
    (storage capacity) and hardware development
    (computational capacity)
  • Appearance of data consortia:
    - Linguistic Data Consortium (LDC), www.ldc.upenn.edu
    - European Language Resources Association (ELRA),
      www.icp.grenet.fr/ELRA/home.html
  • Recently, powerful commercial speech recognition
    engines have become available
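
To make "statistical models derived from ... linguistic databases" concrete, here is a minimal sketch of the simplest such model, a bigram language model estimated by relative frequency. The three-sentence corpus and the function name train_bigram_model are invented for illustration; they are assumptions, not material from the presentation, and real recognisers are trained on far larger databases with smoothing.

```python
from collections import Counter


def train_bigram_model(sentences):
    """Estimate P(word | previous word) by relative frequency."""
    context_counts = Counter()
    bigram_counts = Counter()
    for sentence in sentences:
        # Add sentence-boundary markers so that the first and last
        # words are also modelled.
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        for prev, word in zip(tokens, tokens[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, word)] += 1
    return {
        (prev, word): count / context_counts[prev]
        for (prev, word), count in bigram_counts.items()
    }


# Toy corpus, invented for illustration only (hypothetical data).
corpus = [
    "show traffic information",
    "show calling rates",
    "show traffic information for copenhagen",
]
model = train_bigram_model(corpus)

# "show" occurs three times and is followed by "traffic" twice.
print(model[("show", "traffic")])
```

Word pairs that never occur in the corpus get no probability at all in this sketch, which is one reason the size and coverage of the underlying database matter so much.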

3
Current status: speech recognition
  • Applications in Danish
  • Mainly in the telephony domain (traffic
    information, calling rates). Providers: Nuance,
    Philips (now Scansoft)/PDC
  • Penetration of call centre automation is lower
    than in e.g. the US
  • Recently, in the office environment, specialised
    domains such as pathology and radiology have
    also become available, with more planned.
    Provider: Philips/MaxManus
  • No computer command-and-control recogniser is
    available for Danish, and no general-domain
    speech recognition. A database will soon be
    available from the SPEECON project
  • The latter two are required for the physically
    handicapped and for people with reading,
    writing or spelling disabilities
  • Other players are IBM/NST (ViaVoice), Empathy
    Systems and Scansoft (Dragon)
  • Lack of robustness

4
Current status: speech synthesis (TTS)
  • Like ASR, a corpus-based technology
  • With lower computational demands on the core
    engine
  • Applications are running in Danish:
    SMS-to-speech, phone browsing, adgangforalle.dk
    (access for all), automatic generation of audio
    books
  • Dedicated recorded database for the target voice,
    or a larger database for trainable synthesis
  • Three products available (Infovox, RealSpeak and
    DanTTS)
  • Inflexible voice
  • Limited prosodic control and limited NLP capacity

5
Major results achieved in Denmark
  • As a result of public funding
  • Text-to-speech synthesis is there, to a large
    extent because of public funding
  • Language resources have been developed for speech
    recognition (SpeechDat family); these are
    available for research, but at a fairly high
    price
  • Automatic Speech Recognition (ASR) is coming up
    (though the main drive is from industry)
  • Denmark is lagging behind with respect to ASR

6
What went wrong?
  • Sequential development of TTS and ASR: TTS
    first. This strategy postponed the development
    of ASR in Denmark, assisted by commercial players
    claiming they were able to do so.
  • No follow-up public investment in ASR
  • Databases are there but hardly accessible
  • No Open-Source-like modules available for core
    technologies

7
Challenges and what is needed?
  • Development of applications with no business
    case, mainly for the disabled: public
    involvement, i.e. investment, is needed
  • Speech recognisers and text-to-speech
    synthesisers lack robustness and a number of
    other qualities. Public involvement is needed to
    ensure continued research in new methods applied
    to the Danish language, or at least to ensure an
    Open-Source-like situation for core technologies
  • Public involvement in the distribution,
    collection, validation, standardisation,
    improvement and production of language
    resources, ideally freely accessible
  • Without support for the smaller languages, EU
    speech technology will develop along two lines:
    languages that allow full use, and languages
    that don't
  • If only to benefit from the free flow of
    researchers within the EU, we need to strive to
    create and maintain attractive research
    environments that facilitate high-level research
  • Research into multi-modality (speech is one of
    several modalities), multi-linguality and, in
    particular, non-native language