Title: Speech-to-Speech Translation
1 - Speech-to-Speech Translation
- A New Direction for the Speech Industry
Mark Seligman, CEO
SpeechTEK West, February 21-23, 2007
2 -
- Converser for Healthcare is the world's first commercially available speech-to-speech translation system for wide-ranging conversations. (Input via handwriting, touchscreen, and keyboard is also enabled.)
- Converser for Healthcare is an affordable, reliable, portable translation system that can improve communication 24/7 between healthcare workers and patients with limited English proficiency.
3 - Overview
- Automatic Spoken Language Translation (SLT)
- an age-old dream
- Practical SLT systems are now coming into use
- but users must cooperate and compromise
- History: three classes of SLT systems
- categorized by degree of user cooperation and linguistic or topical coverage
- Demo
- Market
- Commercial and research activity
4 - Star Trek? Not!
- The goal: speak as usual
- freely shift topics
- full range of vocabulary, idioms, structures
- spontaneous language: fragments, false starts, hesitations
- mumble
- converse in noisy environments
- ignore the translation program
- For now: some cooperation, compromise
5 - The scientific problem: component integration
- Component technologies (SR, MT, TTS)
- imperfect, hard to integrate
- Each is usable, but combination may fall below usefulness threshold
- error rates combine, compound
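The compounding of error rates can be illustrated with simple arithmetic. The sketch below assumes, for illustration only, that the components fail independently, so that end-to-end accuracy is the product of the per-component accuracies. The accuracy figures and the function name `pipeline_accuracy` are hypothetical, not measurements or code from any actual system.

```python
from math import prod

def pipeline_accuracy(component_accuracies):
    """End-to-end accuracy of a chain, assuming independent component errors."""
    return prod(component_accuracies)

# Hypothetical per-utterance accuracies, chosen only for illustration.
components = {"SR": 0.90, "MT": 0.85, "TTS": 0.95}

overall = pipeline_accuracy(components.values())
print(f"End-to-end accuracy: {overall:.3f}")
```

Under these assumptions the chain performs worse than its weakest component, which is how individually usable parts can combine into a system below the usefulness threshold.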
6 - Class One
- Class One: voice-driven phrase book
- linguistic coverage: narrow
- topical coverage: narrow
- cooperation required: low
- Fixed expressions or templates only
- I'd like a bottle of beer, wine, soda, please.
- I'd like a bottle of BEVERAGE, please.
- Advantages for user
- no need to carry a book -- use telephone
- selection of phrase by voice rather than finger
- translation output pronounced by a native speaker
- Technology
- Speech recognition: IVR
- MT: flat lookup, template or example-based
- Engineering exercise: low risk
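A template-based phrase book of the kind described on this slide can be sketched in a few lines. This is a minimal illustration, not the method of any product named in this talk: the French target phrase, the slot lexicon `BEVERAGES`, and the function `translate` are all hypothetical, and the input is assumed to be text already produced by speech recognition.

```python
import re

# Hypothetical phrase table: one English template paired with a French target.
TEMPLATES = [
    (
        re.compile(r"^i'd like a bottle of (?P<beverage>\w+), please\.?$", re.IGNORECASE),
        "Je voudrais une bouteille de {beverage}, s'il vous plaît.",
    ),
]

# Hypothetical lexicon for the BEVERAGE slot.
BEVERAGES = {"beer": "bière", "wine": "vin", "soda": "soda"}

def translate(utterance):
    """Return the target phrase for a recognized utterance, or None if out of coverage."""
    text = utterance.strip()
    for pattern, target in TEMPLATES:
        match = pattern.match(text)
        if match:
            filler = BEVERAGES.get(match.group("beverage").lower())
            if filler is not None:
                return target.format(beverage=filler)
    return None

print(translate("I'd like a bottle of wine, please."))
```

Anything outside the fixed templates or the slot lexicon returns `None`, which reflects the narrow linguistic and topical coverage of this class.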
7 - Other Class One
- Sony TalkMan
- Pending entries
- Sharp
- NEC
- Future IVR systems?
8 - Class Two
- Class Two: robust speech translation in narrow domains
- linguistic coverage: broad
- topical coverage: narrow
- cooperation required: medium
- Examples
- Uh, could I reserve a double room for next Tuesday, please?
- I need to, um, I need a double room please. That's for next Tuesday.
- Hello, I'm calling about reserving a room. I'd be arriving next week on Tuesday.
- Advantages
- Lots of experience
- Can optimize SR, MT: special grammars (patterns)
- Interlingua possible for MT
- Challenges
- Robust parsing still imperfect, so MT input is dirty
- Some user frustration inevitable, but balanced by freedom
- Risk: medium
9 - Class Two: Worldwide Research
- CMU/Univ Karlsruhe (USA/Germany)
- ATR (Japan)
- IRST (Italy)
- ETRI (Korea)
- GETA-CLIPS (France)
- CAS-NLPR (China)
- IBM (USA)
10 - Class Two: Research
11 - Class Three
- Class Three: highly interactive speech translation with broad linguistic and topical coverage
- linguistic coverage: broad
- topical coverage: broad
- cooperation required: extensive
- User achieves broad coverage by supervising
- SR: need dictation for broad coverage
- MT: need broad coverage, good quality
- Must be modifiable to enable interactive correction
12 - In the beginning
French: Qu'est-ce que vous étudiez? (What do you study?)
English: Computer science. (L'informatique.)
French: Qu'est-ce que vous faites plus tard? (What are you doing later?)
English: I'm going skiing. (Je vais faire du ski.)
French: Vous n'avez pas besoin de travailler? (You don't need to work?)
English: I'll take my computer with me. (Je prendrai mon ordinateur avec moi.)
French: Où est-ce que vous mettrez l'ordinateur pendant que vous skiez? (Where will you put the computer while you ski?)
English: In my pocket. (Dans ma poche.)
13 - Converser Features
15 - Market: U.S. Healthcare
- 200,000 potential customers
- Healthcare venues
- 6,003 hospitals (2003, www.USNews.com)
- 836,156 physicians (2001, www.ama.com)
- 15-20 minutes/meeting
- $45-150/hour for human interpreter
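The meeting length and interpreter rate above imply a rough per-meeting cost for a human interpreter. The sketch below is back-of-the-envelope arithmetic only. It assumes the hourly rate is in U.S. dollars and that interpreters bill strictly by the minute with no minimum charge, neither of which is stated on the slide; the function name `cost_per_meeting` is hypothetical.

```python
def cost_per_meeting(rate_per_hour, minutes):
    """Interpreter cost for one meeting, assuming pro-rated per-minute billing."""
    return rate_per_hour * minutes / 60

# Range endpoints taken from the slide: 15-20 minutes, 45-150 per hour.
low = cost_per_meeting(45, 15)    # shortest meeting at the lowest rate
high = cost_per_meeting(150, 20)  # longest meeting at the highest rate

print(f"Estimated cost per meeting: {low:.2f} to {high:.2f}")
```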
16 - Value Proposition
- Operational
- significant ROI
- 24/7 access to interpreting
- reduced patient waiting time
- more efficient use of employees (keep staff in their positions)
- patient SAFETY (real and perceived)
- reduced liability: bilingual transcripts of interaction with patients
- compliance
- Communication benefits
- privacy
- more verifiability, consistency than with human interpreter
- Informed consent
17 - Worldwide Market
- IDC
- Cross-language software
- $67 billion (2000) to $237 billion (2005)
- Worldwide e-business globalization support
- > $540 billion
- Multilingual communications, collaboration tools
- $5 billion (by 2008)
- Allied Business Intelligence, Inc.
- Worldwide human translation
- $5.7 billion (in 2006)
- Global Reach
- 70% of online population not native English speakers
18 - Markets
- Defense and Security
- services, intelligence, allies
- law enforcement
- Travel and Tourism
- Language Instruction/Education
- Government Service
- immigration
- welfare, food stamps, etc.
- Business
- B2C: customer service
- B2B: multinational firms, global partners/operations
- Consumer
- online affinity/personal portals (e.g. online dating)
19 - Some Current Research/Commercial Activity
- Spoken Translation, Inc. (Converser)
- IBM (Mastor)
- Sehda (S-Minds)
- SpeechGear (Compadre Interpreter)
- VoxTec (Phraselator)
- Sony/Sharp/NEC (tourist)
- Ectaco (Dictionary)
- MIT (flight domain)
- CMU (Arabic for military)
- BBN (Arabic for military)
20 - Thank you!
- To view the demo, visit www.ConverserforHealthcare.com