Title: Final Year Project 2003/2004 LYU0302 PVCAIS
1 Final Year Project 2003/2004 LYU0302 PVCAIS
Personal VideoConference Archives Indexing System
- Supervisor: Prof. Michael R. Lyu
- Presented by: Lewis Ng, Philip Chan
- 20 April 2004
2 Outline
- Introduction
- Motivation
- Architecture of PVCAIS
- - Media Acquisition Module
- - Archive Indexing Module
- - Videoconference Accessing Module
- Implementation
- Conclusion
3 Introduction
- PVCAIS stands for
- Personal VideoConference Archives Indexing System
- A system that provides convenient searching and browsing support for videoconferencing users on past videoconference archives
4 Introduction
- What is a videoconference?
- A real-time communication technology which combines different media: audio, video, text chat, file transfer, whiteboard and shared applications
- - More precisely, it is a multimedia conference
- - Videoconferencing standard: ITU-T H.323
5 Motivation
- Videoconferencing is becoming popular in education, business and personal communication
- Participants wish to keep videoconference archives for later reference
- Normal video and audio files are neither searchable nor helpful for recalling their contents
- Indexing of videoconference archives has not been investigated until now
6 Architecture of PVCAIS
- Consists of 3 modules
- - Media Acquisition Module
- - Archive Indexing Module
- - Videoconference Accessing Module
7 Architecture of PVCAIS
[Diagram: the three modules, labelled Media Acquisition, Archive Indexing and Videoconference Accessing]
8 Architecture: Media Acquisition Module
- Extracts channel data and forms media files
- Videoconferencing physically contains 4 types of channels: Audio, Video, Data and Control
- Audio and Video channels transmit incoming/outgoing audio and video information
- Data channel carries information for user applications such as Text Chat, Whiteboard and File Transfer
- Control channel transmits system control information such as Member Information
9 Architecture: Media Acquisition Module
- Video-in and Video-out channels
- Reduce redundancy: store only key frames
- Detect scene changes in real time
- Each key-frame picture is stored with a timestamp
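The key-frame idea above can be sketched as follows. This is only an illustration under stated assumptions, not the project's actual detector: frames are assumed to be flat lists of grayscale values, and the mean-absolute-difference measure, the `threshold` value and the names `frame_difference` and `select_key_frames` are hypothetical choices made for the example.

```python
from typing import Iterable, List, Sequence, Tuple

Frame = Sequence[int]  # assumption: flat list of grayscale pixels, 0-255


def frame_difference(a: Frame, b: Frame) -> float:
    """Mean absolute pixel difference between two equally sized frames."""
    if len(a) != len(b):
        raise ValueError("frames must have the same size")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def select_key_frames(
    frames: Iterable[Tuple[float, Frame]], threshold: float = 30.0
) -> List[Tuple[float, Frame]]:
    """Keep (timestamp, frame) pairs at which a scene change is detected.

    The first frame is always kept. A later frame is kept only when it
    differs from the most recent key frame by more than `threshold`
    (a hypothetical tuning value).
    """
    key_frames: List[Tuple[float, Frame]] = []
    last_key = None
    for timestamp, frame in frames:
        if last_key is None or frame_difference(last_key, frame) > threshold:
            key_frames.append((timestamp, frame))
            last_key = frame
    return key_frames
```

Comparing each frame against the last stored key frame, rather than against the immediately preceding frame, means a slow drift eventually triggers a new key frame as well.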
10 Architecture: Media Acquisition Module
- Audio-in and Audio-out channels
- Mixed into one stream after the videoconference
- Will be used for Speech Recognition
11 Architecture: Media Acquisition Module
- Text Chat channel
- sender / receiver
- message
- stored with a timestamp
12 Architecture: Media Acquisition Module
- Whiteboard channel
- Consists of a text-based index file and a number of snapshot pictures
- Index file records the timestamp of each whiteboard update event and the path of the corresponding snapshot picture
- An update of this channel happens over a period of time -> need to detect when an update begins and ends by monitoring data transfer in this channel
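One way to detect where an update begins and ends is to group data-transfer activity by quiet gaps. The sketch below assumes that packet arrival times on the whiteboard channel are available as a list of seconds; the `quiet_gap` value, the tab-separated index format and the snapshot file names are hypothetical, since the slides do not specify them.

```python
from typing import List, Tuple


def group_updates(
    packet_times: List[float], quiet_gap: float = 2.0
) -> List[Tuple[float, float]]:
    """Group packet arrival times into (begin, end) update events.

    A new update starts whenever the channel has been silent for more
    than `quiet_gap` seconds (an assumed tuning value).
    """
    events: List[Tuple[float, float]] = []
    for t in sorted(packet_times):
        if events and t - events[-1][1] <= quiet_gap:
            # Still inside the current update: extend its end time.
            events[-1] = (events[-1][0], t)
        else:
            events.append((t, t))
    return events


def index_lines(
    events: List[Tuple[float, float]], snapshot_dir: str = "snapshots"
) -> List[str]:
    """Format one index line per update: end timestamp and snapshot path."""
    return [
        f"{end:.1f}\t{snapshot_dir}/wb_{i:04d}.png"
        for i, (_, end) in enumerate(events, start=1)
    ]
```

Taking the snapshot at the end of each update, rather than at its start, captures the whiteboard once the change is complete.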
13 Architecture: Media Acquisition Module
- File Transfer channel
- Make a copy of the sent/received files in the archive directory
- Index file includes the senders' / recipients' user names and the paths of the files
14 Architecture: Media Acquisition Module
- Control channel
- Contains the timestamp and information of each event, such as member joined and member left
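As an illustration of how such an event log could be used later, for example to tell who was in the conference at a given moment, here is a minimal sketch. The `(timestamp, kind, member)` tuple layout and the `"joined"` / `"left"` strings are assumptions; the slides do not give the stored format.

```python
from typing import Iterable, Set, Tuple

Event = Tuple[float, str, str]  # assumed layout: (timestamp, kind, member)


def members_present(events: Iterable[Event], at_time: float) -> Set[str]:
    """Replay join/leave events up to `at_time` and return who is present."""
    present: Set[str] = set()
    for timestamp, kind, member in sorted(events, key=lambda e: e[0]):
        if timestamp > at_time:
            break
        if kind == "joined":
            present.add(member)
        elif kind == "left":
            present.discard(member)
    return present
```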
15 Architecture: Media Acquisition Module
16 Architecture: Archive Indexing Module
- Raw files are extracted in the Media Acquisition Module
- Need to implement some multimedia indexing functions to retrieve more information
- These include
- Face Detection, Face Recognition, Speech Recognition, Time-based Text Merging
17 Architecture: Archive Indexing Module
- Face Detection
- - If a face is detected, find the face region
[Figure: Face Detection; label: Face]
18 Architecture: Archive Indexing Module
- Face Recognition
- - Associate human faces in Video-in with names
- - Need to keep a face base
- - If there is no match in the face base, ask the remote user to enter the name
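The face-base lookup with a fallback to asking for a name can be sketched as below. This is a simplified stand-in, not the project's recogniser: faces are assumed to be already reduced to small numeric feature vectors, matching is plain nearest-neighbour by Euclidean distance, and `max_distance`, `recognise`, `recognise_or_enrol` and the `ask_name` callback are hypothetical names and values.

```python
import math
from typing import Callable, Dict, List, Optional, Sequence

FaceBase = Dict[str, List[Sequence[float]]]  # name -> stored feature vectors


def recognise(
    face_base: FaceBase, features: Sequence[float], max_distance: float = 0.5
) -> Optional[str]:
    """Return the closest name in the face base, or None if nothing is close."""
    best_name, best_dist = None, float("inf")
    for name, samples in face_base.items():
        for sample in samples:
            d = math.dist(sample, features)
            if d < best_dist:
                best_name, best_dist = name, d
    return best_name if best_dist <= max_distance else None


def recognise_or_enrol(
    face_base: FaceBase,
    features: Sequence[float],
    ask_name: Callable[[], str],
    max_distance: float = 0.5,
) -> str:
    """Recognise a face; if unknown, ask for a name and add it to the face base."""
    name = recognise(face_base, features, max_distance)
    if name is None:
        name = ask_name()  # e.g. prompt the remote user
        face_base.setdefault(name, []).append(features)
    return name
```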
19 Architecture: Archive Indexing Module
- Speech Recognition
- - Generate a speech script from the audio archive
- - Speech of a videoconference contains important information
- - Can use commercial libraries: Microsoft SAPI, IBM ViaVoice
- Time-based Text Merging
- - Merge the Speech script, Chat messages and Whiteboard text into the Text Source according to their timestamps
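Time-based text merging amounts to interleaving several timestamped streams into one. A minimal sketch follows, assuming each source is already sorted by time and is given as `(timestamp, text)` pairs; the `Entry` layout, the source labels and the function name `merge_text_sources` are illustrative assumptions.

```python
import heapq
from typing import Iterable, List, Tuple

Entry = Tuple[float, str, str]  # (timestamp, source, text)


def merge_text_sources(
    speech: Iterable[Tuple[float, str]],
    chat: Iterable[Tuple[float, str]],
    whiteboard: Iterable[Tuple[float, str]],
) -> List[Entry]:
    """Merge three time-sorted text streams into one time-ordered Text Source."""
    tagged = [
        [(ts, "speech", text) for ts, text in speech],
        [(ts, "chat", text) for ts, text in chat],
        [(ts, "whiteboard", text) for ts, text in whiteboard],
    ]
    # heapq.merge requires every input to be sorted by the key already.
    return list(heapq.merge(*tagged, key=lambda entry: entry[0]))
```

Keeping the source label on every entry lets a later search report whether a hit came from speech, chat or the whiteboard.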
20 Architecture: Videoconference Accessing Module
- Provides an interface for the user to manage, search and review all indexed conference archives.
- Allows the user to search for a conference by different criteria, such as meeting date, member name and title.
- Allows the user to review a conference by playing back different media in a synchronized way.
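Searching by several optional criteria can be sketched as a simple filter over archive records. The `ConferenceRecord` fields and the matching rules (case-insensitive substring for the title, exact match for date and member name) are assumptions made for the example, not the system's actual schema.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Iterable, List, Optional


@dataclass
class ConferenceRecord:
    """Hypothetical summary of one indexed conference archive."""
    title: str
    meeting_date: date
    members: List[str] = field(default_factory=list)


def search(
    archives: Iterable[ConferenceRecord],
    title: Optional[str] = None,
    meeting_date: Optional[date] = None,
    member: Optional[str] = None,
) -> List[ConferenceRecord]:
    """Return the records that satisfy every criterion that was supplied."""
    results = []
    for record in archives:
        if title is not None and title.lower() not in record.title.lower():
            continue
        if meeting_date is not None and record.meeting_date != meeting_date:
            continue
        if member is not None and member not in record.members:
            continue
        results.append(record)
    return results
```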
21 Implementation
- NetMeeting 3.0
- A Windows feature that provides Internet conferencing functions.
- Supports video, audio and data conferencing, including application sharing, chat, whiteboard and file transfer.
- NetMeeting 3.0 SDK
- An extension of NetMeeting that provides an interface for programmers and Web developers to integrate conferencing capabilities into their applications.
- The API is in the form of COM interfaces and functions.
22 Implementation: Videoconferencing Client
- A videoconferencing program built on top of the NetMeeting 3.0 SDK.
- Supports
- Video Streaming
- Audio Streaming
- Text Chat
- File Transfer
- Whiteboard
24 Implementation: Videoconferencing Client
- Face Verification Login Feature
- Face Training: 8 different face images of each user are needed for training, and the face images are saved in the face base
- Face Verification: the EigenFace algorithm is implemented to check the face of a user against his/her user ID. If the verification is successful, the user can then join a videoconference
26 Implementation: Media Acquisition Module
- By directly using the functions of the API, the following raw data can be obtained
- member information
- file transfer record
- text message record
- Video, audio and whiteboard data cannot be directly obtained.
27 Implementation: Media Acquisition Module
- Video
- create a thread to check the display of the video windows
- if a scene change is detected, the video frame is captured and stored as a still image.
- the stored images are the key frames of the conference.
28 Implementation: Media Acquisition Module
- Audio
- create a thread to record the local audio from the microphone.
- members of the conference continuously exchange audio data.
- all the received audio files and locally recorded audio files are combined to generate a single audio file.
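Combining several recordings into a single audio stream can be sketched as sample-wise mixing. The example assumes every stream is a list of signed 16-bit PCM samples at the same sample rate and aligned to the same start time; how the project actually aligned and combined its files is not stated, and `mix_streams` is a hypothetical name.

```python
from itertools import zip_longest
from typing import List, Sequence

PCM16_MIN, PCM16_MAX = -32768, 32767


def mix_streams(streams: Sequence[Sequence[int]]) -> List[int]:
    """Mix time-aligned 16-bit PCM streams by summing samples with clipping.

    Shorter streams are padded with silence (zeros) so that the result is
    as long as the longest input.
    """
    mixed = []
    for samples in zip_longest(*streams, fillvalue=0):
        total = sum(samples)
        # Clip to the 16-bit range instead of letting the sum overflow.
        mixed.append(max(PCM16_MIN, min(PCM16_MAX, total)))
    return mixed
```

Hard clipping is the simplest choice; scaling each stream down before summing would avoid distortion at the cost of lower volume.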
29 Implementation: Media Acquisition Module
- Whiteboard
- cannot capture the NetMeeting whiteboard information because the format of the data is not stated in the API.
- solution: create our own whiteboard function and data format.
31 Implementation: Media Acquisition Module
- - The whiteboard update flow
32 Implementation: Archive Indexing Module
- The stored key frames are used for face detection and recognition after the conference.
- The final audio file is used for speech recognition; the speech engine used is Microsoft SAPI.
33 Implementation: Videoconference Accessing Module
- Consists of
- - Searching Interface: search conferences by title, date, participants, text, file transferred, whiteboard content.
- - Playback Interface: review a conference by playing back its content using SMIL
35 Implementation: Videoconference Accessing Module
- SMIL
- stands for Synchronized Multimedia Integration Language
- HTML-like language
- can integrate streaming audio and video with images, text, or any other media type into one presentation
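To make the idea concrete, the sketch below generates a small SMIL document that plays one audio file in parallel with timestamped key-frame images. The element and attribute names (`smil`, `body`, `par`, `audio`, `img`, `src`, `begin`, `dur`) are standard SMIL; the function name `build_smil`, the file names and the overall document layout are assumptions and are not taken from the project's generator.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple


def build_smil(
    audio_src: str,
    key_frames: List[Tuple[float, str]],
    total_duration: float,
) -> str:
    """Build a SMIL document as a string.

    key_frames: (timestamp_seconds, image_path) pairs sorted by time.
    Each image is shown from its own timestamp until the next key frame
    (or until total_duration for the last one).
    """
    smil = ET.Element("smil")
    body = ET.SubElement(smil, "body")
    par = ET.SubElement(body, "par")  # children of <par> play in parallel
    ET.SubElement(par, "audio", src=audio_src)
    for i, (start, path) in enumerate(key_frames):
        if i + 1 < len(key_frames):
            end = key_frames[i + 1][0]
        else:
            end = total_duration
        ET.SubElement(
            par, "img", src=path, begin=f"{start:g}s", dur=f"{end - start:g}s"
        )
    return ET.tostring(smil, encoding="unicode")
```

Because every image carries an explicit `begin` offset inside the `<par>` block, the pictures stay aligned with the audio even when the first key frame does not start at time zero.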
36 Implementation: Videoconference Accessing Module
- SMIL document generation process
38 Conclusion
- We have developed PVCAIS, which supports
- Videoconferencing functions
- Acquisition of all videoconference content
- Archive indexing
- Searching and synchronized playback of videoconference archives