1
Department of Computer Science & Engineering
The Chinese University of Hong Kong
LYU0102: XML for Interoperable Digital Video Library
  • Supervised by Prof. LYU, Rung Tsong Michael

Prepared by Chan Pik Wah, Pat and Ngai Cheuk Han, Table
2
Outline
  • Introduction to XVIP
  • Overview of Project
  • Extraction Techniques
  • Face Detection
  • Speech Recognition
  • Multimedia Transformation & Presentation
  • XSL
  • SMIL
  • Transformation
  • Problems & Solutions
  • Conclusion

3
Motivations
  • Rapid increase in the usage of multimedia
    information
  • New approach: digital video library

Project Outline
4
Motivations
  • Little attention paid to video information
    extraction and storage
  • Scalability of the system in terms of adding new
    extraction components
  • Lack of a generic framework for presentation and
    visualization of video information

5
Overview of XVIP
6
Achievements in the Last Semester
  • 2 Extraction Techniques
  • Scene Change
  • VOCR
  • Integrate data into XML
  • XML Editor
  • Knowledge Enrichment

7
Achievements in this Semester
  • 2 more extraction techniques
  • Face Detection
  • Speech Recognition
  • New data integrated to XML
  • XML to SMIL Transformer

8
Extraction Techniques
[Diagram: Video → Scene Change, VOCD, Face Detection, Speech Recognition → XML]
9
Face Detection
  • Object-presence detection is also an important
    technique.
  • It identifies and indexes features to support image
    similarity matching. Face detection is a good
    example.

10
Face Detection
  • Names of people appearing in the video
  • How they are interacting with the environment
  • Makes the video more searchable

11
Face Detection
  • Neural Network-Based Algorithm
  • The basic algorithm used for face detection

12
Face Detection
  • Face Recognition
  • Facial Expression Analysis
  • Enrich the XML
  • Easier for users to search the video content

13
Speech Recognition
  • Speech recognition technology can make any spoken
    data useful for library indexing and retrieval

14
Speech Recognition Engine
15
Speech Recognition
  • ViaVoice
  • Error rate > 50%

16
Usage of XML
  • Indexing & searching
  • Combining with other XML for knowledge enrichment
  • Exchanging data with different applications
  • Presentation
17
Presentation of the video data
  • XML is not presentable without processing
  • HTML can present images, but it is static
  • SMIL is good for multimedia presentation
  • No existing tools for integrating different XML
    data into a SMIL presentation
  • Current transformation languages have many
    limitations in transforming XML to SMIL

SMIL
18
SMIL
  • SMIL stands for Synchronized Multimedia
    Integration Language and is currently a W3C
    Recommendation.
  • It is a markup language that can synchronize and
    integrate multimedia.
  • It enables authors to specify what should be
    presented and when.
  • Supported by RealPlayer, QuickTime, and IE

19
Advantages
  • SMIL is text-based
  • Easy to develop with a text editor
  • Generate customized presentations
  • Generate customized SMIL file based on
    preferences recorded in the visitor's browser
  • SMIL effort is led by the W3C
  • W3C tries to shape a specification that is
    beneficial to all parties involved.
  • Avoid using container formats.
  • SMIL can stream many media formats, no need to
    merge clips into a single streaming file.

20
Timing and Synchronization
  • Parallel element
      <par>
        <text src="text/transcript.rt" region="transcript"/>
        <text src="text/mapdetail.rt" region="mapdetail"/>
        <video src="news.mpg" region="video" fill="freeze"/>
      </par>
  • Sequence element
      <seq>
        <img src="pix/0.jpg" dur="15" region="scene"/>
        <img src="pix/15.jpg" dur="5" region="scene"/>
        <img src="pix/20.jpg" dur="7" region="scene"/>
        <img src="pix/27.jpg" dur="4" region="scene"/>
      </seq>
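
The durations in the sequence example follow from consecutive scene-change times. A minimal Python sketch of that step is given here for illustration. It assumes scene changes at 0, 15, 20 and 27 seconds and a clip length of 31 seconds. The clip length and the function name build_scene_seq are assumptions, not part of the project. The pix/<start>.jpg naming is taken from the example above.

```python
import xml.etree.ElementTree as ET


def build_scene_seq(scene_starts, clip_end, region="scene"):
    """Build a SMIL <seq> that shows one key frame per scene.

    scene_starts: sorted scene-change times in seconds.
    clip_end: end of the clip in seconds (assumed to be known).
    """
    seq = ET.Element("seq")
    # Pair each start time with the next boundary to get a duration.
    boundaries = list(scene_starts) + [clip_end]
    for start, end in zip(boundaries, boundaries[1:]):
        ET.SubElement(seq, "img", {
            "src": f"pix/{start}.jpg",
            "dur": str(end - start),
            "region": region,
        })
    return seq


seq = build_scene_seq([0, 15, 20, 27], clip_end=31)
print(ET.tostring(seq, encoding="unicode"))
```

Because the images sit inside a seq element, each one is shown only after the previous one has finished, so only dur needs to be computed.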

21
XSL
  • Stands for Extensible Stylesheet Language
  • XSL is the language defined by the W3C to add
    formatting information to XML data.
  • XSLT -- most commonly used XSL standard
  • Transforms one XML document into another.
  • Used in our FYP.

22
Working Principle
[Diagram: Source Tree and XSL Stylesheet → Output]
23
Transformation Process
  • Input files
  • XML file generated by XVIP
  • XML files of additional information
  • Output files
  • A SMIL file
  • Some RealText files

Transformation
24
Design 1
  • Built solely with VC
  • Read all the input files and get the information
  • Create the output files for the SMIL
    presentation.
  • Disadvantages
  • The layout of the SMIL presentation needs to be
    hard-coded in the VC program.
  • The layout becomes hard to change and the
    transformer becomes hard to extend.

25
Design 1 with modification
  • Modification
  • Provide an additional file or interface as a
    template for users to define the layout of the SMIL
    presentation.
  • Disadvantage
  • The flexibility provided is still limited.
  • Not a standard way to define a template.

26
Design 2
  • Use XSLT to assist the transformation. Users can
    define their own templates with XSL.
  • Advantages
  • Program-independent
  • Extensible
  • Standard templates
  • Limitations of XSLT
  • It can only read one input data file and one XSL
    file, then generate one output.
  • It cannot combine data from multiple files.

27
Design 2
  • Solutions
  • Knowledge Enrichment
  • Combine additional information with the XML file
    from XVIP before converting to SMIL
  • Creating output files
  • Use separate XSL files to generate RealText files
  • Use separate XSL files to generate layout of the
    presentation and displaying order of objects in
    different regions, then combine them to a SMIL
    file

28
Knowledge Enrichment
[Diagram: Information of major cities + XML file from XVIP → Combined XML file]
29
Combined XML file
  • XML file contains information of major cities
    that are related to the video.
      <COMBINE>
        <TIME begin="10" dur="11">
          <NAME>??</NAME>
          <DETAIL>??????????</DETAIL>
          <AREA>China</AREA>
        </TIME>
        <TIME begin="21" dur="20">
          <NAME>??</NAME>
          <DETAIL>??????????</DETAIL>
          <AREA>America</AREA>
        </TIME>
      </COMBINE>
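
A combined file of this shape could be produced roughly as follows. This is a sketch in Python, not the project's actual knowledge enrichment component. The input structures (mentions, cities), the sample names and details, and the function name combine are all hypothetical. Only the COMBINE, TIME, NAME, DETAIL and AREA element names and the begin and dur attributes come from the slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical inputs: time-stamped place mentions (as if read from the XVIP
# XML file) and a lookup table (as if read from the major-cities XML file).
mentions = [
    {"begin": 10, "dur": 11, "name": "City A"},
    {"begin": 21, "dur": 20, "name": "City B"},
]
cities = {
    "City A": {"detail": "Placeholder detail for City A", "area": "China"},
    "City B": {"detail": "Placeholder detail for City B", "area": "America"},
}


def combine(mentions, cities):
    """Attach city information to each timed mention under one COMBINE root."""
    root = ET.Element("COMBINE")
    for mention in mentions:
        info = cities.get(mention["name"])
        if info is None:
            continue  # no additional knowledge for this mention
        time_el = ET.SubElement(
            root, "TIME", begin=str(mention["begin"]), dur=str(mention["dur"])
        )
        ET.SubElement(time_el, "NAME").text = mention["name"]
        ET.SubElement(time_el, "DETAIL").text = info["detail"]
        ET.SubElement(time_el, "AREA").text = info["area"]
    return root


combined = combine(mentions, cities)
print(ET.tostring(combined, encoding="unicode"))
```

Merging before the XSLT step means the stylesheet sees a single source document, which is the constraint stated under the limitations of XSLT.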

30
Create RealText files
  • Geographical Information
  • Biographical Information
  • Video Transcript
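
As a rough illustration of what one generated RealText file might contain, the sketch below renders timed captions in Python. The slides generate these files with separate XSL stylesheets, so Python is a stand-in here. The function name to_realtext, the window size, and the sample captions are assumptions. The window, time and clear tags are RealText markup as commonly documented and should be checked against the player's documentation.

```python
from xml.sax.saxutils import escape


def to_realtext(entries, duration, width=300, height=100):
    """Render (begin_seconds, text) pairs as a RealText document string."""
    lines = [
        f'<window type="generic" duration="{duration}" '
        f'width="{width}" height="{height}">'
    ]
    for begin, text in sorted(entries):
        # <clear/> wipes the previous caption before the new one is shown.
        lines.append(f'<time begin="{begin}"/><clear/>{escape(text)}')
    lines.append("</window>")
    return "\n".join(lines)


rt = to_realtext([(21, "America"), (10, "China")], duration=41)
print(rt)
```

The begin values would come from the TIME elements of the combined XML file, so that each caption appears while the matching part of the video plays.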

31
Create SMIL file
[Diagram: Layout, Displaying order]
32
Create SMIL file
[Diagram: Combining the temporary files → SMIL Presentation]
33
Problems & Solutions
  • Problem 1
  • The output of the XSLT processor is UTF-8
    encoded, but the SMIL presentation needs ANSI encoding.
  • Solution
  • Write a function UTF8toANSI for conversion.
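
The slides do not show UTF8toANSI itself. A minimal Python sketch of the same idea follows, assuming "ANSI" means a single Windows code page. The default code page cp950 (Traditional Chinese) and the function name utf8_to_ansi are assumptions chosen for illustration.

```python
def utf8_to_ansi(data: bytes, codepage: str = "cp950") -> bytes:
    """Re-encode UTF-8 bytes into one Windows code page.

    Characters the code page cannot represent are replaced with '?'.
    """
    text = data.decode("utf-8")
    return text.encode(codepage, errors="replace")


utf8_bytes = "中文 SMIL".encode("utf-8")
ansi_bytes = utf8_to_ansi(utf8_bytes)
print(ansi_bytes)
```

If the XSLT output carries an XML declaration that names UTF-8, that declaration would also need to be updated to match the new encoding.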

34
Problems & Solutions
  • Problem 2
  • XSLT has a limitation: it can only read one XML file
    and one XSL file, and generates one output file.
  • Our transformation process has more than one
    input file.
  • Solution
  • Do knowledge enrichment and produce a combined
    XML result file before creating the output files.

35
Conclusion
  • XVIP contains
  • Four video information modalities
  • Scene change detection
  • VOCD
  • Speech recognition
  • Face detection
  • Information integration module with XML
  • For storing the extracted video data in XML format

36
Conclusion
  • XML editor
  • For editing the XML file generated
  • Knowledge enrichment component
  • For adding additional information to the
    XML-based video data
  • XML to SMIL transformer
  • For converting the XML-based video data into SMIL
    presentation

37
Conclusion
  • XVIP
  • provides multiple functions for extracting
    video information
  • stores video information in a flexible and
    scalable way
  • comprises a transformer to generate
    presentations of the information
  • Paper: "XVIP: An XML-Based Video Information
    Processing System", Michael Lyu, Edward Yau,
    C.H. Ngai, P.W. Chan, was accepted by COMPSAC 2002.

38
Q & A