Final Year Project 2003/2004 LYU0302 PVCAIS



1
Final Year Project 2003/2004 LYU0302 PVCAIS
Personal VideoConference Archives Indexing System
  • Supervisor: Prof. Michael R. Lyu
  • Presented by: Lewis Ng, Philip Chan
  • 20 April 2004

2
Outline
  • Introduction
  • Motivation
  • Architecture of PVCAIS
  • - Media Acquisition Module
  • - Archive Indexing Module
  • - Videoconference Accessing Module
  • Implementation
  • Conclusion

3
Introduction
  • PVCAIS stands for Personal VideoConference
    Archives Indexing System
  • A system that provides convenient searching and
    browsing support for videoconferencing users on
    past videoconference archives

4
Introduction
  • What is a videoconference?
  • A real-time communication technology that
    combines different media:
  • audio, video, text chat, file transfer,
    whiteboard and shared applications
  • - More precisely, it is a multimedia conference
  • - Standard for videoconferencing: ITU-T H.323

5
Motivation
  • Videoconferencing is becoming popular in
    education, business and personal communication
  • Participants wish to keep videoconference
    archives for later reference
  • Plain video and audio files are neither
    searchable nor helpful for recalling their contents
  • Indexing of videoconference archives has not been
    investigated until now

6
Architecture of PVCAIS
  • Consists of 3 modules
  • - Media Acquisition Module
  • - Archive Indexing Module
  • - Videoconference Accessing Module

7
Architecture of PVCAIS
(Diagram: Media Acquisition, Archive Indexing and Videoconference Accessing modules)

8
Architecture: Media Acquisition Module
  • Extracts channel data and forms media files
  • A videoconference physically contains 4 types of
    channels: Audio, Video, Data and Control
  • Audio and Video channels transmit incoming/
    outgoing audio and video information
  • Data channel carries information for user
    applications such as Text Chat, Whiteboard and
    File Transfer
  • Control channel transmits system control
    information such as Member Information

9
Architecture: Media Acquisition Module
  • Video-in and Video-out channels
  • Reduce redundancy: store only key frames
  • Detect scene change in real time
  • Each key frame picture is stored with a timestamp
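
The key-frame idea above can be sketched as follows. This is a minimal illustration, not the project's actual detector: it assumes frames arrive as flattened grayscale pixel lists, and it uses a mean absolute pixel difference with an arbitrary threshold as the scene-change test. The names `select_key_frames`, `mean_abs_diff` and `threshold` are hypothetical.

```python
from typing import Iterable, List, Tuple

# Assumed representation: a frame is a flat list of grayscale pixels (0-255).
Frame = List[int]


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Average absolute pixel difference between two equally sized frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def select_key_frames(
    frames: Iterable[Tuple[float, Frame]], threshold: float = 30.0
) -> List[Tuple[float, Frame]]:
    """Keep a frame only when it differs enough from the last kept key frame.

    `frames` yields (timestamp, frame) pairs in time order, so the function
    can run on a live stream. `threshold` is an illustrative value.
    """
    key_frames: List[Tuple[float, Frame]] = []
    last_key = None
    for timestamp, frame in frames:
        if last_key is None or mean_abs_diff(last_key, frame) > threshold:
            key_frames.append((timestamp, frame))  # stored with its timestamp
            last_key = frame
    return key_frames
```

Comparing against the last stored key frame, rather than the previous frame, avoids missing slow scene drifts at the cost of occasionally storing an extra frame.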

10
Architecture: Media Acquisition Module
  • Audio-in and Audio-out channels
  • Mixed into one stream after the videoconference
  • Will be used for Speech Recognition

11
Architecture: Media Acquisition Module
  • Text Chat channel
  • sender / receiver
  • message
  • stored with a timestamp

12
Architecture: Media Acquisition Module
  • Whiteboard channel
  • Consists of a text-based index file and a number
    of snapshot pictures
  • Index file records the timestamp of each whiteboard
    update event and the path of the corresponding
    snapshot picture
  • An update on this channel takes place over a period of
    time -> need to detect when an update begins and
    ends by monitoring data transfer in this channel
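
One way to find where a whiteboard update begins and ends is to group data-transfer activity by idle gaps. The sketch below assumes the monitor records a timestamp for every packet seen on the channel and treats a silence longer than `idle_gap` seconds as the end of an update. Both `group_updates` and `idle_gap` are hypothetical names, and the 2-second default is arbitrary.

```python
from typing import Iterable, List, Tuple


def group_updates(
    transfer_times: Iterable[float], idle_gap: float = 2.0
) -> List[Tuple[float, float]]:
    """Group packet timestamps into (begin, end) whiteboard update events.

    A new event starts whenever the channel has been silent for more than
    `idle_gap` seconds (an assumed heuristic, not the project's rule).
    """
    events: List[Tuple[float, float]] = []
    start = prev = None
    for t in sorted(transfer_times):
        if start is None:
            start = prev = t
        elif t - prev > idle_gap:
            events.append((start, prev))  # previous update has ended
            start = prev = t
        else:
            prev = t
    if start is not None:
        events.append((start, prev))
    return events
```

A snapshot would then be taken at the end time of each event and its path written to the index file together with that timestamp.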

13
Architecture: Media Acquisition Module
  • File Transfer channel
  • Make a copy of the sent/received files in the
    archive directory
  • Index file includes the senders' / recipients' user
    names and the paths of the files

14
Architecture: Media Acquisition Module
  • Control channel
  • Contains the timestamp and information of each event,
    such as member joined and member left

15
Architecture: Media Acquisition Module

16
Architecture: Archive Indexing Module
  • Raw files are extracted in the Media Acquisition
    Module
  • Need to implement some multimedia indexing
    functions to retrieve more information
  • These include:
  • Face Detection, Face Recognition, Speech
    Recognition, Time-based Text Merging

17
Architecture: Archive Indexing Module
  • Face Detection
  • - If a face is detected, find the face region

18
Architecture: Archive Indexing Module
  • Face Recognition
  • - Associate human faces in Video-in with names
  • - Need to keep a face base
  • - If there is no match in the face base, ask the remote
    user to enter the name

19
Architecture: Archive Indexing Module
  • Speech Recognition
  • - Generate a speech script from the audio archive
  • - The speech of a videoconference contains important
    information
  • - Can use commercial libraries: Microsoft SAPI,
    IBM ViaVoice
  • Time-based Text Merging
  • - Merge the speech script, chat messages and
    whiteboard text into the Text Source
    according to their timestamps
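
Time-based text merging can be illustrated with a small sketch. It assumes each source (speech script, chat messages, whiteboard text) is already sorted by timestamp, and it uses Python's `heapq.merge` to interleave them. The `Entry` record and `merge_text_sources` function are hypothetical names chosen for this example.

```python
import heapq
from typing import Iterable, List, NamedTuple


class Entry(NamedTuple):
    """One timestamped piece of text (hypothetical record layout)."""
    timestamp: float  # seconds from the start of the conference
    source: str       # "speech", "chat" or "whiteboard"
    text: str


def merge_text_sources(*streams: Iterable[Entry]) -> List[Entry]:
    """Merge several time-sorted streams into one time-ordered Text Source."""
    return list(heapq.merge(*streams, key=lambda e: e.timestamp))


speech = [Entry(1.0, "speech", "hello"), Entry(5.0, "speech", "bye")]
chat = [Entry(3.0, "chat", "hi")]
whiteboard = [Entry(4.0, "whiteboard", "agenda")]
text_source = merge_text_sources(speech, chat, whiteboard)
```

Because every input is already ordered, the merge is a single linear pass and never needs to re-sort the combined text.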

20
Architecture: Videoconference Accessing Module
  • Provides an interface for the user to manage, search
    and review all indexed conference archives.
  • Allows the user to search for a conference by
    different criteria, such as meeting date, member
    name and title.
  • Allows the user to review a conference by playing
    back different media in a synchronized way.
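
Searching by several criteria can be sketched as a filter over archive records. The example below is illustrative only: the `Conference` record and the `search` function are hypothetical, and the real system may store and query its index differently. A criterion left as `None` is ignored.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional


@dataclass
class Conference:
    """Hypothetical index record for one archived conference."""
    title: str
    meeting_date: date
    members: List[str]


def search(
    archives: Iterable[Conference],
    title: Optional[str] = None,
    meeting_date: Optional[date] = None,
    member: Optional[str] = None,
) -> List[Conference]:
    """Return the conferences that satisfy every criterion that was given."""
    results = []
    for conf in archives:
        if title is not None and title.lower() not in conf.title.lower():
            continue
        if meeting_date is not None and conf.meeting_date != meeting_date:
            continue
        if member is not None and member.lower() not in (
            m.lower() for m in conf.members
        ):
            continue
        results.append(conf)
    return results
```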

21
Implementation
  • NetMeeting 3.0
  • A Windows feature that provides Internet
    conferencing functions.
  • Supports video, audio and data conferencing
    including application sharing, chat, whiteboard
    and file transfer.
  • NetMeeting 3.0 SDK
  • An extension of NetMeeting that provides an interface
    for programmers and Web developers to integrate
    conferencing capabilities into their
    applications.
  • The API is in the form of COM interfaces and
    functions.

22
Implementation: Videoconferencing Client
  • A videoconferencing program built on top of the
    NetMeeting 3.0 SDK.
  • Supports:
  • Video Streaming
  • Audio Streaming
  • Text Chat
  • File Transfer
  • Whiteboard

23
(No Transcript)
24
Implementation: Videoconferencing Client
  • Face Verification Login Feature
  • Face Training: 8 different face images of each
    user are needed for training, and these face
    images are saved in the face base
  • Face Verification: the EigenFace algorithm is
    implemented to check the face of a user against
    his/her user ID. If the verification is
    successful, the user can then join a videoconference
25
(No Transcript)
26
Implementation: Media Acquisition Module
  • By directly using the functions of the API, the
    following raw data can be obtained:
  • member information
  • file transfer record
  • text message record
  • Video, audio and whiteboard data cannot be
    directly obtained.

27
Implementation: Media Acquisition Module
  • Video
  • create a thread to check the display of the video
    windows
  • if a scene change is detected, the current video
    frame is captured and stored as a still image.
  • the stored images are the key frames of the
    conference.

28
Implementation: Media Acquisition Module
  • Audio
  • create a thread to record the local audio from
    the microphone.
  • members of the conference continuously
    exchange their audio data.
  • all the received audio files and locally recorded
    audio files are combined to generate a single
    audio file.
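
Combining several recordings into a single audio track can be sketched as sample-wise mixing. The example assumes every track is a list of signed 16-bit PCM samples at the same sample rate and aligned to the same start time. The function name `mix_pcm16` is hypothetical, and the slides do not say how the project actually combines its files.

```python
from typing import List, Sequence


def mix_pcm16(*tracks: Sequence[int]) -> List[int]:
    """Mix aligned 16-bit PCM tracks by summing samples and clipping.

    Shorter tracks are treated as silent once they end.
    """
    length = max((len(t) for t in tracks), default=0)
    mixed = []
    for i in range(length):
        total = sum(t[i] for t in tracks if i < len(t))
        mixed.append(max(-32768, min(32767, total)))  # clip to int16 range
    return mixed
```

In practice the samples could be read from and written back to WAV files with Python's standard `wave` module. Summing with clipping is the simplest choice, and averaging the tracks instead trades loudness for headroom.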

29
Implementation: Media Acquisition Module
  • Whiteboard
  • cannot capture the NetMeeting whiteboard
    information because the format of the data is not
    stated in the API.
  • Solution: create our own whiteboard function and
    data format.

30
(No Transcript)
31
Implementation: Media Acquisition Module
  • - The whiteboard update flow

32
Implementation: Archive Indexing Module
  • The stored key frames are used for face
    detection and recognition after the conference.
  • The final audio file is used for speech
    recognition; the speech engine used is Microsoft
    SAPI.

33
Implementation: Videoconference Accessing Module
  • Consists of
  • - Searching Interface: search for a conference by
    title, date, participants, text,
    files transferred or whiteboard content.
  • - Playback Interface: review a conference by
    playing back its content using SMIL

34
(No Transcript)
35
Implementation: Videoconference Accessing Module
  • SMIL
  • stands for Synchronized Multimedia Integration
    Language
  • an HTML-like language
  • can integrate streaming audio and video with
    images, text, or any other media type into one
    presentation

36
Implementation: Videoconference Accessing Module
  • SMIL document generation process
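
As a rough illustration of SMIL document generation, the sketch below builds a small presentation in which the mixed audio plays in parallel with the stored key frames, each shown from its own timestamp until the next one. The `build_smil` function, the region name `video` and the 320x240 layout are assumptions made for this example. The element and attribute names (`par`, `audio`, `img`, `region`, `begin`, `dur`) are standard SMIL.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple


def build_smil(
    audio_src: str, key_frames: List[Tuple[float, str]], total_duration: float
) -> str:
    """Return a SMIL document as a string.

    `key_frames` is a time-sorted list of (timestamp_seconds, image_path).
    """
    smil = ET.Element("smil")
    layout = ET.SubElement(ET.SubElement(smil, "head"), "layout")
    ET.SubElement(layout, "root-layout", width="320", height="240")
    ET.SubElement(layout, "region", id="video", width="320", height="240")

    # Children of <par> play in parallel, each timed from the start of <par>.
    par = ET.SubElement(ET.SubElement(smil, "body"), "par")
    ET.SubElement(par, "audio", src=audio_src)
    for i, (start, path) in enumerate(key_frames):
        has_next = i + 1 < len(key_frames)
        end = key_frames[i + 1][0] if has_next else total_duration
        ET.SubElement(
            par, "img", src=path, region="video",
            begin=f"{start:g}s", dur=f"{end - start:g}s",
        )
    return ET.tostring(smil, encoding="unicode")


doc = build_smil("audio.wav", [(0.0, "k0.jpg"), (12.5, "k1.jpg")], 30.0)
```

Chat messages and whiteboard snapshots could be added to the same `par` block in the same way, each with its own region and `begin` time.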

37
(No Transcript)
38
Conclusion
  • We have developed PVCAIS, which supports:
  • Videoconferencing functions
  • Acquisition of all videoconference content
  • Archive indexing
  • Searching and synchronized playback of
    videoconference archives

39
  • Q & A Session