Media Manager Mail Access Unified Messaging

1
Media Manager Mail Access: Unified Messaging
  • Barbara Hohlt
  • UC Berkeley
  • Ericsson Presentation
  • August 22, 2000

2
Messages from many sources
???
3
Project Overview
  • Make messages more accessible
  • Get all types of messages
  • Access from different devices with different
    capabilities
  • Enable faster browsing of many voicemails
  • Media Mail services
  • A unified messaging infrastructure
  • Voicemail is email encoded in MIME
  • Transcoding services
  • Enhance voicemail interaction
  • Includes skimmed audio, transcript, text/audio
    summary, and outline

4
Related Work
  • Universal Inboxes/Unified Messaging
  • onebox.com
  • CoolMail.net
  • Lucent/Octel Unified Messenger
  • Stanford Mobile People Architecture
  • Audio Content Extraction Techniques
  • SpeechSkimmer, MIT Media Lab [Arons95]
  • Auto-Summarization, Microsoft Research
  • CueVideo, IBM

5
Architecture
6
Applications
  • Conventional GUIs
  • Context-Aware Applications
  • Iceberg Universal Inbox Component

A conventional desktop GUI can contact the Media
Manager directly and request messages as text.
The Media Manager will return emails and
voicemails as text.
7
Context-Aware Application
Redirection Proxy
8
Iceberg Universal Inbox
9
Architecture
Client
Client
Folder Store
Client
  • Transcoder Service
  • Voicemail -> Text Transcript
  • Voicemail -> Text Summary
  • Voicemail -> Text Outline
  • Email -> Plain Audio
  • Email -> GSM Audio
  • Voicemail -> GSM Summary
  • Voicemail -> Audio Summary
  • Voicemail -> Skimmed Audio

Media Manager Interface
Media Manager Service
10
MediaManagerServiceIF
  • getFolders( ) and getFoldersAs( )
  • Given a username, returns a list of folder names
  • getFoldersAs( ) returns the list as audio or GSM
  • getList( ) and getListAs( )
  • Given a username, foldername, and count
  • Returns a list of messages (sendername, title,
    date)
  • getListAs( ) returns the list as audio or GSM
  • getMessage( )
  • Given a Message Ref, returns the entire message
  • getMessageContent( )
  • Given a Content ID and return type
  • Returns one part of the message as the return type

11
Messages and Content Objects
  • Media Message
  • Media Reference id
  • Array of Content Objects
  • Content Object
  • Content ID
  • Data
  • Content ID
  • Media Reference id
  • Content Part index
  • Content Type
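
The object layout above can be sketched as plain Java classes. This is only an illustration under assumptions: the class and field names, and the use of MIME-style strings for the content type, are hypothetical, and only the field lists come from the slide.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical names throughout; only the field lists come from the slide.
final class ContentId {
    final String mediaRefId;   // id of the owning message
    final int partIndex;       // position of the part within the message
    final String contentType;  // assumed to be a MIME-style string

    ContentId(String mediaRefId, int partIndex, String contentType) {
        this.mediaRefId = mediaRefId;
        this.partIndex = partIndex;
        this.contentType = contentType;
    }
}

final class ContentObject {
    final ContentId id;
    final byte[] data;

    ContentObject(ContentId id, byte[] data) {
        this.id = id;
        this.data = data.clone();  // defensive copy of the payload
    }
}

final class MediaMessage {
    final String mediaRefId;
    private final List<ContentObject> parts = new ArrayList<>();

    MediaMessage(String mediaRefId) {
        this.mediaRefId = mediaRefId;
    }

    // Appends a part and derives its Content ID from the message id and part index.
    ContentObject addPart(String contentType, byte[] data) {
        ContentObject part = new ContentObject(
                new ContentId(mediaRefId, parts.size(), contentType), data);
        parts.add(part);
        return part;
    }

    List<ContentObject> parts() {
        return Collections.unmodifiableList(parts);
    }
}
```

Because a Content ID carries both the message reference and the part index, a client that holds only the ID has enough to ask for one specific part later.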

12
Interface Example
  • User asks for list of messages as GSM
  • Media Manager returns a list of message headers
  • Cell Phone sends a Content ID back
  • Media Manager sends a voicemail Content Object

13
Audio Tools
  • Speech Recognition/Synthesis
  • Transcribe voicemail to text
  • IBM ViaVoice SDK and custom audio libs
  • Natural Language Processing
  • Directed word spotting by understanding content
  • ViaVoice SRCL
  • Pitch
  • Detecting important words by emphasized pitch
  • Pause
  • Compression through pause removal
  • Spurts
  • Retrieve sentence structure of voicemail

14
Transcoding Techniques
Transcoding                       Technique
Voicemail -> Text Transcript      Speech recognition
Voicemail -> Text Summary         NLP, pitch detection and recognition
Voicemail -> Text Outline         Pause detection and speech recognition
Email -> Plain Audio              Speech synthesis
Email -> GSM Audio                Speech synthesis and toast
Voicemail -> Skimmed Audio        Pause detection
Voicemail -> Audio Summary        Text summary and speech synthesis
Voicemail -> GSM Summary          Audio summary and toast
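
The last two transcodings are built from earlier ones, which suggests the stages can be chained. A minimal sketch of that idea follows, assuming each stage can be modelled as a function. The class and constant names are hypothetical, and the stages are stubs that only tag their input rather than process audio.

```java
import java.util.function.Function;

// Stub stages: each one only tags its input so the chaining is visible.
// Real stages would call a summarizer, a synthesizer, or the toast encoder.
final class TranscoderChain {
    static final Function<String, String> TEXT_SUMMARY =
            in -> "textSummary(" + in + ")";
    static final Function<String, String> SPEECH_SYNTHESIS =
            in -> "synthesize(" + in + ")";
    static final Function<String, String> TOAST =
            in -> "toast(" + in + ")";

    // Voicemail -> Audio Summary: text summary, then speech synthesis.
    static final Function<String, String> AUDIO_SUMMARY =
            TEXT_SUMMARY.andThen(SPEECH_SYNTHESIS);

    // Voicemail -> GSM Summary: audio summary, then toast.
    static final Function<String, String> GSM_SUMMARY =
            AUDIO_SUMMARY.andThen(TOAST);
}
```

Calling `TranscoderChain.GSM_SUMMARY.apply("voicemail")` applies the three stages in order, so a new output format only needs one extra stage appended to an existing chain.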
15
Examples
Original Voicemail
Hello, This is Barbara. How are you and the
cats doing? I was wondering if you would feed
them a little more the first time in case they
eat too much. My number is (713) 465-5155. You
can call me anytime. Have a very good holiday.
Bye bye
Processed Voicemail
(Skimmed)
(Just pitch)
(Pitch emphasized words in green)
16
Examples continued...
Original Voicemail
Faced with a seemingly inevitable engineering
task authors tend to adopt one of two strategies
for adding new services to the Internet
landscape: inflexible, highly tuned,
hand-constructed services.
Processed Voicemail
(Skimmed)
(Just pitch)
  • Faced with a seemingly inevitable engineering
    task authors tend to adopt what it to strategies
    for adding new services to the internet
    landscape.
  • Inflexible, highly Tate, had constructed
    services.

(Pitch emphasized words in green)
17
Results
  • Pause detection
  • Worked well for given applications
  • Playback speedup by 50-70%
  • Pitch detection
  • Problems due to high pitch sounds and transitions
  • Speech recognition
  • Performance decrease in conversational settings
  • Natural Language Processing
  • Performed well with small grammar

18
Example: Adding GSM Access
  • Define specific types, e.g. GSMAudio, GSMSummary
  • Optionally create new Content Objects
  • Add Content Object definition to MediaManager
  • Add GSM transcoder to TranscoderService

19
Detail: Adding GSM Access
  • Add Content Object definition to MediaManager
  • Define GSMAUDIO and GSMSUMMARY
  • Add cases to createObject() in Content Object
  • Add cases to Media Manager
  • Add GSM to Transcoder
  • Add method toGSM() to Transcoder
  • Edit .config file
  • External.transcoder.gsm rungsm
  • Edit related transcoders
  • speechSynthesizer and audioSummary()
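
The step of adding cases to createObject() might look roughly like the sketch below. The actual Content Object code is not shown in the slides, so the enum, the TypedContent class, and the gsmEncoded flag are all hypothetical. Only the names GSMAUDIO, GSMSUMMARY, and createObject() come from the slide.

```java
// Hypothetical content types; GSMAUDIO and GSMSUMMARY are the two added for GSM access.
enum ContentType { TEXT, AUDIO, GSMAUDIO, GSMSUMMARY }

final class TypedContent {
    final ContentType type;
    final byte[] data;
    final boolean gsmEncoded;  // assumed flag, used here only to show the new cases

    private TypedContent(ContentType type, byte[] data, boolean gsmEncoded) {
        this.type = type;
        this.data = data;
        this.gsmEncoded = gsmEncoded;
    }

    // Factory with one case per content type; supporting a new format means adding cases here.
    static TypedContent createObject(ContentType type, byte[] data) {
        switch (type) {
            case TEXT:
            case AUDIO:
                return new TypedContent(type, data, false);
            case GSMAUDIO:    // new case for GSM access
            case GSMSUMMARY:  // new case for GSM access
                return new TypedContent(type, data, true);
            default:
                throw new IllegalArgumentException("Unsupported content type: " + type);
        }
    }
}
```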

20
Implementing Other Mail Stores
  • Examples: IMAP, POP, Microsoft Exchange Server
  • Implement MailAccessIF
  • String getMAFolders( userName )
  • MediaMessage getMAList( userName, folderName,
    count )
  • MediaMessage getMAMessage( MediaRef )
  • ContentObject getMAMessageContent( ContentID )
  • Add new protocol to Media Manager protocol table
  • Optionally add protocol for users to
    FolderStore
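
A sketch of how a new mail store could plug in through a protocol table is given below. It is reduced to the getMAFolders() method, and the String[] return type is an assumption, since the transcript may have dropped array brackets. InMemoryMailAccess and ProtocolTable are hypothetical stand-ins, not the project's classes.

```java
import java.util.HashMap;
import java.util.Map;

// Reduced to one method for brevity; the slide lists four.
interface MailAccess {
    String[] getMAFolders(String userName);
}

// Hypothetical in-memory store standing in for an IMAP or POP backend.
final class InMemoryMailAccess implements MailAccess {
    private final Map<String, String[]> foldersByUser = new HashMap<>();

    void putFolders(String userName, String... folders) {
        foldersByUser.put(userName, folders);
    }

    @Override
    public String[] getMAFolders(String userName) {
        return foldersByUser.getOrDefault(userName, new String[0]);
    }
}

final class ProtocolTable {
    // Protocol name -> mail store implementation.
    private final Map<String, MailAccess> byProtocol = new HashMap<>();
    // User -> protocol name; stands in for the per-user entry in the FolderStore.
    private final Map<String, String> protocolByUser = new HashMap<>();

    void register(String protocol, MailAccess access) {
        byProtocol.put(protocol, access);
    }

    void setUserProtocol(String userName, String protocol) {
        protocolByUser.put(userName, protocol);
    }

    // Looks up the user's protocol, then delegates to the matching mail store.
    String[] getFolders(String userName) {
        MailAccess access = byProtocol.get(protocolByUser.get(userName));
        if (access == null) {
            throw new IllegalStateException("No mail store registered for user: " + userName);
        }
        return access.getMAFolders(userName);
    }
}
```

With this shape, supporting another store means writing one more MailAccess implementation and registering it under a new protocol name.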

21
Conclusion
  • Overall
  • System output is useful as navigational hints
  • To achieve total comprehension, need better voice
    recognition
  • What works well
  • Skimming using pause removal
  • Detecting spurts for structure
  • What needs work
  • Speech detection in conversational settings
  • Pitch emphasis needs refining
  • Future Directions
  • Implementing more mail stores
  • Enhancing interfaces
  • Pause detection/word boundaries using speech
    detection
  • Developing voicemail grammars
  • Using NLP feedback with pitch emphasis detection
  • Improved speech detection in noisy environments

23
MediaManagerServiceIF
  • String getFolders( userName )
  • byte getFoldersAs( userName, returnType )
  • MediaMessage getList( userName, folderName,
    count )
  • byte getListAs( userName, folderName, count,
    returnType )
  • MediaMessage getMessage( MediaRef )
  • ContentObject getMessageContent( ContentID,
    returnType )
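
The relationship between a plain call and its "As" variant can be sketched for the folder-listing pair. Several things here are assumptions: the array return types (the transcript may have lost the brackets), the return type strings "audio" and "gsm", and the stub class with its injected text-to-audio function.

```java
import java.util.function.Function;

// Covers only the folder-listing pair; getList() and getListAs() would follow the same pattern.
interface FolderListing {
    String[] getFolders(String userName);

    byte[] getFoldersAs(String userName, String returnType);
}

final class StubFolderListing implements FolderListing {
    // Stand-in for the transcoder service that turns text into audio bytes.
    private final Function<String, byte[]> textToAudio;

    StubFolderListing(Function<String, byte[]> textToAudio) {
        this.textToAudio = textToAudio;
    }

    @Override
    public String[] getFolders(String userName) {
        return new String[] {"Inbox", "Voicemail"};  // fixed data for illustration
    }

    @Override
    public byte[] getFoldersAs(String userName, String returnType) {
        if (!returnType.equals("audio") && !returnType.equals("gsm")) {
            throw new IllegalArgumentException("Unsupported return type: " + returnType);
        }
        // Build the text listing first, then hand it to the transcoder.
        String spoken = String.join(", ", getFolders(userName));
        return textToAudio.apply(returnType + ":" + spoken);
    }
}
```

The "As" variant is just the text result passed through a transcoder, which is why a device such as a cell phone can use the same listing as a desktop client.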

24
Pitch Detection
  • The Idea
  • A speaker's pitch naturally changes when
    introducing topics or emphasizing words
    [Hirshberg92]
  • Use pitch increases as hints for important
    words
  • Algorithm [Arons95]
  • Determine pitch for each 20 ms frame (FFT with
    SHS)
  • Set emphasis threshold to be top 1% of pitch
    values (by histogram)
  • Mark 1 sec interval as emphasized if it contains >3
    emphasized frames
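
The thresholding and interval-marking steps can be sketched as follows, assuming the per-frame pitch values have already been computed (the FFT and SHS stage is left out). The class and method names are hypothetical, and the threshold is found by sorting rather than by the histogram the slide mentions.

```java
import java.util.Arrays;

final class PitchEmphasis {
    static final int FRAMES_PER_SECOND = 50;  // 20 ms frames

    // Returns the pitch value at or above which a frame is in the top `fraction` of all frames.
    static double threshold(double[] pitchHz, double fraction) {
        double[] sorted = pitchHz.clone();
        Arrays.sort(sorted);
        int count = Math.max(1, (int) Math.round(sorted.length * fraction));
        return sorted[sorted.length - count];
    }

    // One flag per one-second interval: true when more than minFrames frames reach the threshold.
    static boolean[] emphasizedSeconds(double[] pitchHz, double fraction, int minFrames) {
        if (pitchHz.length == 0) {
            return new boolean[0];
        }
        double limit = threshold(pitchHz, fraction);
        int seconds = (pitchHz.length + FRAMES_PER_SECOND - 1) / FRAMES_PER_SECOND;
        boolean[] result = new boolean[seconds];
        for (int s = 0; s < seconds; s++) {
            int start = s * FRAMES_PER_SECOND;
            int end = Math.min(start + FRAMES_PER_SECOND, pitchHz.length);
            int hits = 0;
            for (int i = start; i < end; i++) {
                if (pitchHz[i] >= limit) {
                    hits++;
                }
            }
            result[s] = hits > minFrames;
        }
        return result;
    }
}
```

A call such as `PitchEmphasis.emphasizedSeconds(pitch, 0.01, 3)` matches the parameters on the slide. A recording with completely flat pitch would mark every interval, so a real implementation would need a guard for that case.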

25
Pause Detection
  • Why is pause detection useful?
  • Removing pauses speeds up playback
  • Typically, 50-70% of original time [Foulke71]
  • Long pauses signify groups (talk spurts)
  • Noise and soft sounds create difficulties
  • Algorithm: Smoothed Histogram [Lamel81]
  • Calculate energy per 10 ms frame
  • Threshold based on smoothed histogram (5 dB after
    first peak)
  • Use heuristics to remove artifacts
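
One possible reading of the energy and histogram steps is sketched below. The 1 dB bin width, the 3-bin moving average, and the interpretation of "5 dB after first peak" as the first histogram peak plus a 5 dB margin are all assumptions, and the artifact-removal heuristics are omitted. The names are hypothetical.

```java
final class PauseDetector {
    // Energy in dB for each frame of frameLen samples (10 ms at the caller's sample rate).
    static double[] frameEnergyDb(double[] samples, int frameLen) {
        int frames = samples.length / frameLen;
        double[] db = new double[frames];
        for (int f = 0; f < frames; f++) {
            double sum = 0.0;
            for (int i = f * frameLen; i < (f + 1) * frameLen; i++) {
                sum += samples[i] * samples[i];
            }
            db[f] = 10.0 * Math.log10(sum / frameLen + 1e-12);  // small floor avoids log(0)
        }
        return db;
    }

    // Threshold = lower edge of the first peak in the smoothed 1 dB histogram, plus marginDb.
    static double threshold(double[] db, double marginDb) {
        if (db.length == 0) {
            throw new IllegalArgumentException("No frames to analyze");
        }
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double v : db) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        int bins = (int) Math.floor(max - min) + 1;
        double[] hist = new double[bins];
        for (double v : db) {
            hist[(int) Math.floor(v - min)]++;
        }
        double[] smooth = new double[bins];
        for (int b = 0; b < bins; b++) {  // 3-bin moving average
            double left = b > 0 ? hist[b - 1] : 0.0;
            double right = b < bins - 1 ? hist[b + 1] : 0.0;
            smooth[b] = (left + hist[b] + right) / 3.0;
        }
        int peak = 0;  // climb from the quietest bin until the counts stop rising
        while (peak < bins - 1 && smooth[peak + 1] > smooth[peak]) {
            peak++;
        }
        return min + peak + marginDb;
    }

    // A frame counts as a pause when its energy is below the threshold.
    static boolean[] isPause(double[] db, double threshold) {
        boolean[] pause = new boolean[db.length];
        for (int i = 0; i < db.length; i++) {
            pause[i] = db[i] < threshold;
        }
        return pause;
    }
}
```

The first peak is taken as the background noise level, so frames within the margin above it are treated as silence. This is also why noise and soft sounds are hard: they blur the gap between that peak and quiet speech.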

28
Works Cited
  • [Arons95] B. Arons. Interactively Skimming
    Recorded Speech. Ph.D. dissertation, MIT, 1994.
  • [Foulke71] E. Foulke. The Perception of Time
    Compressed Speech. Ch. 4 in Perception of
    Language, edited by P.M. Kjeldergaard, D.L. Horton,
    and J.J. Jenkins. Charles E. Merrill Publishing
    Company, 1971, pp. 79-107.
  • [Hirshberg92] J. Hirschberg and B. Grosz.
    Intonational Features of Local and Global
    Discourse. In Proceedings of the Speech and
    Natural Language Workshop (Harriman, NY, Feb.
    23-26). Morgan Kaufmann Publishers, 1992, pp.
    441-446.
  • [Lamel81] L.F. Lamel, L.R. Rabiner, A.E.
    Rosenberg, and J.G. Wilpon. An Improved
    Endpoint Detector for Isolated Word Recognition.
    IEEE Transactions on Acoustics, Speech, and
    Signal Processing ASSP-29, 4 (Aug. 1981),
    771-785.
