1
NESPOLE! System Architecture
  • Alon Lavie
  • ESSLLI-2002 Tutorial Course
  • August 12, 2002

2
Outline
  • Main design considerations
  • Global system architecture
  • NESPOLE! Mediator
  • NESPOLE! HLT Servers
  • Multimodal User Interfaces
  • Strengths and Weaknesses

3
  • Speech-to-speech translation for E-commerce
    applications
  • Partners: CMU, Univ. of Karlsruhe, ITC-irst,
    UJF-CLIPS, AETHRA, APT-Trentino
  • Builds on successful collaboration within C-STAR
  • Improved limited-domain speech translation
  • Experiment with multimodality and with MEMT
  • Showcase-1: Travel and Tourism in Trentino,
    completed in Nov-2001, demonstrated
  • Showcase-2: expanded travel and medical service

4
Speech-to-Speech Translation in E-commerce
Applications
  • Replace current passive web E-commerce with live
    interaction capabilities
  • Client starts via web, can easily connect to
    agent for specific information
  • Thin client: very little special hardware and
    software on the client PC (browser, MS NetMeeting,
    shared whiteboard)

5
Main Design Considerations
  • Real E-commerce scenario: enable common users to
    actually experiment with the developed system →
    minimal hardware and software requirements on the
    client PCs
  • Real-time translation over the internet →
    flexible distributed configurations
  • Support real-time sharing of web pages, maps and
    visual information → gestures and multimodality
  • Clean separation of the communication channel
    between the two parties from the translation
    servers
  • Build upon our considerable experience from
    C-STAR II: reuse components and technology to
    whatever extent possible, maintain an
    interlingua-based approach
  • Modularity: allow each site to independently
    develop its own language-specific modules, easy
    integration of showcase prototypes

6
NESPOLE! User Interfaces

7
NESPOLE! Architecture

8
The NESPOLE! Mediator
  • Central role: mediates between the communication
    channels connecting the two parties and the
    speech translation server(s)
  • Establishes all communication channels during a
    session, controls data flow and processing,
    integrates data streams
  • Controls MS NetMeeting, Whiteboard, shared
    Web-Browser, and Translation Monitor on the
    end-user PCs
  • Uses standard H.323 data formats over UDP
    protocol for audio, video and binary data
    transfer with the end-user applications
  • Uses linear PCM packets and text control messages
    over TCP to communicate with language-specific
    HLT servers
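
The slide does not give the wire format used between the Mediator and the HLT servers, only that linear PCM packets and text control messages share a TCP connection. As a minimal sketch of how two such message kinds could be multiplexed on one byte stream, the following assumes a purely hypothetical frame layout (a one-byte type tag, a four-byte length, then the payload); the names `TYPE_PCM`, `TYPE_TEXT`, `encode_frame` and `decode_frames` are illustrative and not part of NESPOLE!.

```python
import struct

# Hypothetical frame layout (an assumption, not the NESPOLE! protocol):
# 1-byte type tag, 4-byte big-endian payload length, then the payload.
TYPE_PCM = 0x01   # linear PCM audio packet
TYPE_TEXT = 0x02  # text control message
HEADER = struct.Struct(">BI")


def encode_frame(kind: int, payload: bytes) -> bytes:
    """Prefix a payload with its type tag and length."""
    return HEADER.pack(kind, len(payload)) + payload


def decode_frames(buffer: bytes):
    """Parse complete frames from a TCP byte stream.

    Returns (frames, remainder); the remainder holds any trailing
    partial frame, to be prepended to the next chunk read from the socket.
    """
    frames = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        kind, length = HEADER.unpack_from(buffer, offset)
        end = offset + HEADER.size + length
        if end > len(buffer):
            break  # payload not fully received yet
        frames.append((kind, buffer[offset + HEADER.size:end]))
        offset = end
    return frames, buffer[offset:]
```

Length-prefixed framing is one common way to recover message boundaries on TCP, which delivers an undifferentiated byte stream; the actual system may have used a different scheme.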

9
Communication Flow
  • Session initiated by client via a button on the
    provider web page → request sent to the Mediator
  • Mediator establishes a three-way NetMeeting
    connection (client-Mediator-agent)
  • Mediator starts shared whiteboard and translation
    monitors on both end-user PCs
  • Mediator opens socket connections with the two
    appropriate HLT servers
  • Data flow:
      • Original audio/video and whiteboard data passed
        between the end users via the Mediator
  • Translation chain:
      • G.711 audio from NetMeeting captured by Mediator
      • Mediator sends audio as PCM packets to L1 HLT
        server
      • L1 HLT server performs ASR and analysis to IF
      • L1 HLT server sends IF via CS to L2 HLT server
      • L2 HLT server performs generation and synthesis
        in L2
      • L2 HLT server sends PCM packets back to Mediator
      • Mediator integrates synthesis back into audio
        stream
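
The first two steps of the translation chain imply a conversion from G.711 companded audio to linear PCM. The slides do not say which G.711 variant NetMeeting delivered, so the sketch below assumes μ-law (A-law would need a different expansion) and assumes 16-bit little-endian output; the function names are illustrative.

```python
import struct

ULAW_BIAS = 0x84  # bias added during mu-law compression (132)


def ulaw_to_linear(byte: int) -> int:
    """Expand one G.711 mu-law byte to a signed linear PCM sample."""
    u = ~byte & 0xFF              # mu-law bytes are stored bit-inverted
    sign = u & 0x80
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    magnitude = (((mantissa << 3) + ULAW_BIAS) << exponent) - ULAW_BIAS
    return -magnitude if sign else magnitude


def g711_to_pcm(ulaw_bytes: bytes) -> bytes:
    """Convert a mu-law buffer to 16-bit little-endian linear PCM
    (the output sample layout is an assumption)."""
    samples = [ulaw_to_linear(b) for b in ulaw_bytes]
    return struct.pack(f"<{len(samples)}h", *samples)
```

Each 8-bit companded sample expands to one 16-bit sample, so the PCM buffer sent on to the L1 HLT server is twice the size of the captured G.711 buffer.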

10
Language-specific HLT Servers

11
Distributed Speech-to-Speech Translation over the
Internet

12
Network Traffic Impact

13
Aethra Whiteboard

14
Aethra Whiteboard
  • Whiteboard supports:
      • Sending of bitmaps of images
      • Gesture annotations on the shared image: area
        selection, free-hand drawing
      • Scrolling and zooming of image
      • URL opening on remote browser
  • Mediator can support synchronization of
    whiteboard functions with translation → delay
    display of whiteboard function until translation
    of simultaneous audio packet is ready
  • Multimodal communication significantly aids
    communication between the two parties
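
The synchronization idea above (hold a whiteboard action until the translation of the speech that accompanied it is ready) can be sketched as a small buffer keyed by utterance. How the Mediator actually associates gestures with audio is not described in the slides; the `utterance_id` tag and the `WhiteboardSync` class are hypothetical.

```python
from collections import defaultdict


class WhiteboardSync:
    """Hold whiteboard events until the matching translation is ready.

    Assumes each event is tagged with the id of the utterance spoken
    at the same time (None if there was no simultaneous speech).
    """

    def __init__(self):
        self._pending = defaultdict(list)  # utterance id -> held events
        self._ready = set()                # utterances already translated

    def on_whiteboard_event(self, utterance_id, event):
        """Return the events to display now (empty if the event is held)."""
        if utterance_id is None or utterance_id in self._ready:
            return [event]
        self._pending[utterance_id].append(event)
        return []

    def on_translation_ready(self, utterance_id):
        """Release every event that was waiting on this utterance."""
        self._ready.add(utterance_id)
        return self._pending.pop(utterance_id, [])
```

With this arrangement a gesture made while speaking appears on the partner's screen together with the translated speech rather than several seconds before it.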

15
NESPOLE! Translation Monitor

16
NESPOLE! Translation Monitor
  • Supported via text messages between the Mediator
    and the appropriate HLT servers
  • HLT server sends "Hypo" message with ASR output
    to Mediator → displayed to original user as
    "System Hears"
  • HLT server sends "LGen" message with paraphrase
    generation to Mediator → displayed to original
    user as "System Understands"
  • HLT server sends "RGen" message with text
    translation to Mediator → displayed to partner
    user as "Remote Speech Translation"
  • "Cancel Translation" button sends a message via
    the Mediator to the other party that the last
    displayed translation should be ignored (flashing
    red)
  • "System Hears" field is editable: sends text via
    the Mediator back to the HLT server as a revised
    "Hypo" message for translation
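
The three message types and their display targets amount to a small routing table in the Mediator. The sketch below assumes a hypothetical text syntax in which the message type is the first whitespace-separated token; the slides name the message types and the monitor fields but not the message syntax.

```python
# Message type -> (which user sees it, monitor field it is shown in),
# as listed on the slide.
ROUTES = {
    "Hypo": ("speaker", "System Hears"),
    "LGen": ("speaker", "System Understands"),
    "RGen": ("partner", "Remote Speech Translation"),
}


def route_monitor_message(line: str):
    """Split a text message into (recipient, field, text).

    Assumes the hypothetical wire form "<type> <text>".
    """
    kind, _, text = line.partition(" ")
    if kind not in ROUTES:
        raise ValueError(f"unknown monitor message type: {kind!r}")
    recipient, field = ROUTES[kind]
    return recipient, field, text
```

For example, `route_monitor_message("RGen Vorrei prenotare una camera")` would direct the text to the partner's "Remote Speech Translation" field.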

17
Strengths and Weaknesses
  • Clean, simple, well-defined APIs and
    communication protocols
  • Very flexible distributed configurations
  • HLT servers can be developed independently,
    tested outside of full system
  • Easy to plug in new languages into the system
  • Can support Mediators with other types of
    communication devices and applications (e.g.,
    translation over mobile phones)
  • Performance very dependent on network bandwidth
    over the internet: transmission of video is
    problematic, standard 56K modem connections are
    insufficient
  • Dependence on NetMeeting technology; need for
    push-to-talk in noisy environments
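
A rough calculation illustrates the bandwidth point. It assumes the audio leg to the end user is carried as G.711 (the codec named in the translation chain) and ignores packet overhead, video and whiteboard traffic, all of which only add to the load.

```python
def pcm_bitrate_kbps(sample_rate_hz: int, bits_per_sample: int) -> float:
    """Raw bitrate of a single uncompressed audio channel, in kbit/s."""
    return sample_rate_hz * bits_per_sample / 1000


G711_KBPS = pcm_bitrate_kbps(8000, 8)  # G.711: 8 kHz sampling, 8 bits per sample
MODEM_KBPS = 56                        # nominal downstream rate of a 56K modem

# False: the audio payload alone (64 kbit/s) exceeds the modem's nominal rate.
audio_fits = G711_KBPS <= MODEM_KBPS
```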