Cross-media Intelligent Searching in Digital Library

Cross-media Intelligent Searching in Digital Library. Yueting Zhuang, Zhejiang University, China. Nov. 18, 2006, Egypt.

Provided by: bibalexOrg2
Learn more at: http://www.bibalex.org
Transcript and Presenter's Notes



1
Cross-media Intelligent Searching in Digital
Library
  • Yueting Zhuang
  • Zhejiang University, China
  • Nov. 18, 2006, Egypt

2
Outline
  • 1. CADAL: China digital library
  • 2. Our vision for the next generation of digital
    libraries
  • 3. From multimedia retrieval to cross-media
    retrieval
  • 4. Retrieval of Chinese calligraphy characters: a
    cross-media practice
  • 5. Building a personalized portal
  • 6. Conclusion

4
3rd Workshop 2004, CMU, USA
5
ICUDL 2005, Zhejiang University, China
7
1. CADAL: China Digital Library
  • China-US One Million Book Digital Library Project
  • a unique library resource for scholars, students,
    and citizens
  • contains over one million scanned books
  • a big step towards the goal of creating a universal,
    free-to-read digital library
  • makes knowledge available on the web to anyone,
    anytime, anywhere

http://www.cadal.zju.edu.cn
9
  • As of today, CADAL has achieved:
    • 1.023 million books have been digitized, including
      • degree dissertations
      • modern Chinese books
      • traditional cultural resources
      • English books
    • support for multimedia resources
      • image
      • audio
      • video
      • 3D model
      • Chinese calligraphy
    • about 200,000 clicks a day (http://www.cadal.zju.edu.cn)
    • users spread over 70 countries and regions
    • 16 scanning centers in China, occupying more than
      2000 square meters

10
Scanning books
Processing digitized books
12
Users spread over 70 countries and regions
13
  • Service structure of CADAL

14
  • Current services provided by CADAL

(1) Metadata searching
  • Digital resources are classified into 8 classes
    according to publication time and type.
  • Both unified and advanced search are provided for
    all resources.

15
(2) Unified search
16
China Ancient
Choose the types of resources to search
17
Search results contain every type of resource.
18
(3) Advanced search
Users can choose the search scope, how results are
combined, and the result style.
A second search within the results, full texts, and
detailed information are available on the results page.
19
(4) Full-text search
  • Full-text search uses the text produced by OCR.
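
Full-text search over OCR output is commonly built on an inverted index. The sketch below illustrates that general idea only; it is not CADAL's implementation. The function names (`tokenize`, `build_index`, `search`) and the sample pages are assumptions made for illustration, and the tokenizer handles only ASCII words (Chinese text would need a word segmenter).

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(pages):
    """Map each token to the set of page ids whose OCR text contains it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for token in tokenize(text):
            index[token].add(page_id)
    return index

def search(index, query):
    """Return the sorted page ids that contain every token of the query."""
    tokens = tokenize(query)
    if not tokens:
        return []
    hits = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        hits &= index.get(token, set())
    return sorted(hits)

# Hypothetical OCR output keyed by made-up page identifiers.
ocr_pages = {
    "book1/p1": "Digital library of one million books",
    "book1/p2": "Scanning centers digitize books",
    "book2/p1": "Calligraphy in the digital library",
}
index = build_index(ocr_pages)
print(search(index, "digital library"))  # ['book1/p1', 'book2/p1']
```

Queries are treated as a conjunction of words; ranking and OCR error tolerance are left out to keep the sketch short.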

21
2. Our Vision for the Next Generation of Digital
Libraries
  • Typical features of existing DLs
    • books are indexed by title, author, and keywords
    • users query books by keyword input
    • mostly only text information is returned
    • multimodal data is not fully supported
  • What will the next generation of DL look like?
    • supports multimodal sources
    • enables cross-media retrieval

22
Extending the concept of a book
  • The key to our vision for the next generation of
    digital libraries is the extension of the book
    concept.
  • A book is regarded not only as written
    symbols on paper, but also as any type of
    multimedia item, such as
    • a video clip
    • an audio clip
    • a painting
    • ...

23
So in the next generation of DL, a book can be
multimodal.
  • We can find a general data structure to represent
    multimodal books.

24
Supporting multimodal data is an important trend
in multimedia retrieval
We get multimodal information from the real world;
can we likewise get multimodal data from the digital
world, especially from a digital library?
25
Cross-media retrieval
  • After extending the concept of a book,
    retrieval must be extended as well.
  • We call it cross-media retrieval.

26
Scenario: a simple example of cross-media retrieval
[Figure: a giant panda image, a giant panda text ("Textual description of the giant panda: the panda is a kind of cat which"), and a giant panda audio clip, each serving as a possible starting query]
Users can start a query from any type of media,
and relevant multimedia data will be returned.
27
Cross-media retrieval is a useful way to access
multimodal data
  • Cross-media retrieval can be regarded as the
    simulation of the real world, and it helps us get
    multimodal data in a more flexible and more
    informative way!

28
What does cross-media retrieval need to do?
The query can be an image, an audio clip, or keywords.
30
3. From Multimedia Retrieval to Cross-media
Retrieval
(1) Image retrieval: content-based
31
[Figure: searching images from a query example, with relevance feedback given through positive and negative examples]
32
  • multimedia retrieval

(2) Image retrieval: text-based
Query text
33
(3) Motion retrieval
Given a query example of motion data, we can find
similar motion data in the database.
34
(4) Audio retrieval: content-based
System Framework
35
Audio retrieval: key techniques
  • extract auditory features in the compressed domain
    from audio clips
  • cluster the auditory features using fuzzy clustering
  • represent audio clips by their cluster centers
  • retrieve similar audio clips by cluster center
    matching
  • introduce relevance feedback techniques
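
The clustering and matching steps can be sketched as follows. This is a minimal illustration, assuming that per-frame auditory features have already been extracted as small numeric vectors. The slides do not specify the clustering algorithm's details or the matching rule, so fuzzy c-means and the nearest-center distance used here are assumptions, as are all function names and the sample data.

```python
import math

def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def fuzzy_c_means(points, c, m=2.0, iters=50):
    """Return c cluster centers of the points using fuzzy c-means."""
    # Deterministic start: pick c points spread evenly through the list.
    step = max(1, len(points) // c)
    centers = [list(points[(i * step) % len(points)]) for i in range(c)]
    dim = len(points[0])
    for _ in range(iters):
        # Membership of every point in every cluster (each row sums to 1).
        memberships = []
        for p in points:
            d = [max(dist(p, ctr), 1e-12) for ctr in centers]
            inv = [(1.0 / dk) ** (2.0 / (m - 1.0)) for dk in d]
            total = sum(inv)
            memberships.append([v / total for v in inv])
        # Move each center to the membership-weighted mean of the points.
        for k in range(c):
            weights = [u[k] ** m for u in memberships]
            total = sum(weights)
            centers[k] = [
                sum(w * p[j] for w, p in zip(weights, points)) / total
                for j in range(dim)
            ]
    return sorted(centers)

def clip_distance(centers_a, centers_b):
    """Average distance from each center of A to its nearest center of B."""
    nearest = [min(dist(a, b) for b in centers_b) for a in centers_a]
    return sum(nearest) / len(nearest)

def retrieve(query_frames, database, c=2):
    """Rank database clip names by cluster-center distance to the query."""
    q = fuzzy_c_means(query_frames, c)
    scored = [(clip_distance(q, fuzzy_c_means(frames, c)), name)
              for name, frames in database.items()]
    return [name for _, name in sorted(scored)]

# Hypothetical 2-D frame features for two stored clips and one query clip.
database = {
    "speech": [(0.1, 0.2), (0.2, 0.1), (0.15, 0.15),
               (0.9, 0.8), (0.8, 0.9), (0.85, 0.85)],
    "music": [(5.0, 5.1), (5.1, 5.0), (5.05, 5.05),
              (7.0, 7.1), (7.1, 7.0), (7.05, 7.05)],
}
query = [(0.12, 0.18), (0.18, 0.12), (0.14, 0.16),
         (0.88, 0.82), (0.82, 0.88), (0.86, 0.84)]
print(retrieve(query, database))
```

Representing each clip by a handful of centers keeps matching cheap, because only centers are compared at query time rather than every frame.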

36
Audio retrieval: an example
[Figure: a query example with feature weights; relevance feedback adjusts the weights]
37
(5) Video retrieval: overview
  • Unlike text resources, video is unstructured:
    • rich in visual content
    • poor in semantic understanding
  • The challenging issues:
    • summarization and structuring
    • video mining

38
(5) Video retrieval: key techniques
  • video structuring
    • construct a video table of contents (VTOC)
    • make the video physically structured
  • video summarization
    • help the user quickly grasp the content of video
      clips
    • support video browsing
  • video encoding/compression
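
The lowest level of video structuring, splitting a stream into shots and picking a key frame per shot, can be sketched with a gray-level histogram difference. This is a common textbook approach used here as an assumption; the slides do not say which detector CADAL uses. Frames are modeled as flat lists of gray values, and the threshold, bin count, and middle-frame key frame rule are illustrative choices.

```python
def histogram(frame, bins=4, max_value=256):
    """Normalized gray-level histogram of a frame given as a flat pixel list."""
    counts = [0] * bins
    for pixel in frame:
        counts[pixel * bins // max_value] += 1
    return [n / len(frame) for n in counts]

def shot_boundaries(frames, threshold=0.5):
    """Indices where a new shot starts: the histogram jumps past the threshold."""
    boundaries = [0]
    prev = histogram(frames[0])
    for i in range(1, len(frames)):
        cur = histogram(frames[i])
        diff = sum(abs(a - b) for a, b in zip(prev, cur))
        if diff > threshold:
            boundaries.append(i)
        prev = cur
    return boundaries

def key_frames(frames, boundaries):
    """Pick the middle frame index of each shot as its key frame."""
    ends = boundaries[1:] + [len(frames)]
    return [(start + end - 1) // 2 for start, end in zip(boundaries, ends)]

# Hypothetical 4-pixel frames: three dark frames, then two bright ones.
dark = [10, 20, 30, 40]
bright = [200, 210, 220, 230]
frames = [dark, dark, dark, bright, bright]

starts = shot_boundaries(frames)
print(starts)                      # [0, 3]
print(key_frames(frames, starts))  # [1, 3]
```

Grouping shots into scenes and building the table of contents would sit on top of this step and are not shown.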

39
  • video structuring

[Figure: video structuring hierarchy. Levels: video stream, video, scene, group, shot, key frame. Steps: shot boundary detection, key frame extraction (spatial and temporal features), grouping, scene construction, concept clustering, table of contents]
40
  • Video summary: video content mining

[Figure: original video (redundant), processed by video content mining, becomes a summarized video (concise and informative)]
Find meaningful patterns to support efficient
video browsing.
41
  • Video summary: an example

Two news videos are separated into 6 video shots
(the following are the key frames). Their
total length is 3 minutes.
42
After video summarization, the video is 3
seconds long and consists of the 3 key frames
below.
43
video shot clustering result
44
video browse
45
video browse
summary
key frames
46
(6) 3D model retrieval: overview
Measure the similarity of 3D models by their shape.
47
(6) 3D model retrieval: an example
query example
48
  • As shown above, multimedia retrieval is
    generally content-based X retrieval (CBXR).

49
  • Towards cross-media retrieval
  • Motivation

[Figure: image retrieval, audio retrieval, video retrieval, motion retrieval, and 3D model retrieval (CBXR) leading to cross-media retrieval]
We can provide a more flexible and efficient way
to access multimodal data. We call it
cross-media retrieval.
50
  • Support multimodal sources
    • smooth integration of multimodal data
    • query media objects by examples of different
      modalities
  • Challenging issues
    • texts, images, audio clips, etc. are represented with
      different features
    • different features are heterogeneous
    • cross-media similarity cannot be measured by
      content features
    • there is a semantic gap between low-level
      features and semantics

51
  • Our solution to cross-media retrieval
    • build cross-indexing from multimodal data
    • organize multimedia documents
    • explore cross-media correlations

52
Cross-indexing-based retrieval: general idea
Retrieval interface

53
(1) Cross-index retrieval interface
The system now supports images, audio clips, and videos.
Users can submit any of these media objects, and
the system returns relevant images, audio clips, and
videos.
54
Building multimedia documents: general idea
  • definition of a multimedia document
    • a logical representation of multimodal data
    • consists of semantically related media objects
  • formal structure

Document   = <ID, Title, URI, KeywordList, ElementSet, LinkSet>
ElementSet = (Audio | Image | Text | Video)^i, i ∈ N
Audio      = <ID, ParentID, URI, Size, KeywordList, AudioFeature>
Image      = <ID, ParentID, URI, Size, KeywordList, ImageFeature>
Text       = <ID, ParentID, URI, KeywordList>
Video      = <ID, ParentID, URI, Frames, KeywordList, VideoFeature>
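
The formal structure maps naturally onto record types. The sketch below is one possible rendering, not the authors' implementation. The field names follow the tuples in the slide, while the concrete types (string ids, feature vectors as float lists, `LinkSet` as pairs of element ids) and the helper methods `add` and `of_type` are assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class MediaElement:
    """Fields shared by every media object: ID, ParentID, URI, KeywordList."""
    id: str
    parent_id: str
    uri: str
    keywords: List[str] = field(default_factory=list)

@dataclass
class Text(MediaElement):
    pass

@dataclass
class Image(MediaElement):
    size: int = 0
    feature: List[float] = field(default_factory=list)  # ImageFeature (assumed type)

@dataclass
class Audio(MediaElement):
    size: int = 0
    feature: List[float] = field(default_factory=list)  # AudioFeature (assumed type)

@dataclass
class Video(MediaElement):
    frames: int = 0
    feature: List[float] = field(default_factory=list)  # VideoFeature (assumed type)

@dataclass
class Document:
    id: str
    title: str
    uri: str
    keywords: List[str] = field(default_factory=list)
    elements: List[MediaElement] = field(default_factory=list)  # ElementSet
    links: List[Tuple[str, str]] = field(default_factory=list)  # LinkSet (assumed: id pairs)

    def add(self, element):
        """Attach an element, checking that it names this document as parent."""
        if element.parent_id != self.id:
            raise ValueError("element belongs to another document")
        self.elements.append(element)

    def of_type(self, kind):
        """Return the elements of one media type, e.g. Image."""
        return [e for e in self.elements if isinstance(e, kind)]

# Hypothetical document grouping semantically related media objects.
doc = Document(id="d1", title="Giant panda", uri="doc://d1", keywords=["panda"])
doc.add(Image(id="i1", parent_id="d1", uri="img://panda.jpg", size=2048))
doc.add(Audio(id="a1", parent_id="d1", uri="audio://panda.wav", size=4096))
doc.links.append(("i1", "a1"))
print([e.id for e in doc.of_type(Image)])  # ['i1']
```

Keeping `ParentID` on every element lets a hit on a single media object lead back to the whole document and its other modalities.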
55
Building multimedia documents: framework
[Figure: framework components. Storage Subsystem; Multimedia document (keyword, text, image, audio, video, graphics); Learning and Relevance Feedback Subsystem; Query Processor (multimedia documents, media objects); Preprocessing; Semantic skeleton base]
56
Building multimedia documents: retrieval interface
  • The figure on the left shows the relevant media data
    retrieved by the query "water".
  • A multimedia document is visualized as its
    sketch, i.e. text, images, and key-frame lists for
    videos.
  • Besides keyword-based search, the user can
    perform a content-based search with a specific
    media object as the query example.

57
Exploring cross-media correlations: challenges
Gap 1: content gap
Challenges
  1. multimodal data reside in heterogeneous feature
    spaces
  2. the semantic gap

58
Exploring cross-media correlations: solutions
Images and audio clips represent high-level semantics
from different perspectives. If we can find the
correlation between these perspectives, we
can enable cross-media retrieval with correlations
as the bridge.
[Figure: correlations linking images and audio clips of concepts such as bird, tiger, explosion, dog, and car]
59
Exploring cross-media correlations: mathematical
realization
Basic idea
X and Y are of different dimensions!
Map X and Y to X' and Y' such that the correlation
between X' and Y' maximally coincides with the
correlation between X and Y.
X' and Y' are of the same dimension!
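
The slide does not name the method. One standard technique that matches this description is canonical correlation analysis (CCA), sketched here as an assumption rather than as the authors' formulation. The symbols $w_x$, $w_y$, $C_{xx}$, $C_{yy}$, $C_{xy}$, $W_x$, $W_y$, and $d$ are introduced for illustration; $X'$ and $Y'$ denote the projected features.

```latex
% Paired, mean-centered feature matrices of two media types:
%   X \in \mathbb{R}^{n \times p}, \quad Y \in \mathbb{R}^{n \times q}, \quad p \neq q.
\[
  C_{xx} = \tfrac{1}{n} X^{\top} X, \qquad
  C_{yy} = \tfrac{1}{n} Y^{\top} Y, \qquad
  C_{xy} = \tfrac{1}{n} X^{\top} Y
\]
% CCA seeks directions w_x, w_y whose projections are maximally correlated:
\[
  \rho = \max_{w_x,\, w_y}
  \frac{w_x^{\top} C_{xy}\, w_y}
       {\sqrt{w_x^{\top} C_{xx}\, w_x}\; \sqrt{w_y^{\top} C_{yy}\, w_y}}
\]
% Stacking the top d direction pairs as columns of W_x and W_y gives
% projections of equal dimension d:
\[
  X' = X W_x \in \mathbb{R}^{n \times d}, \qquad
  Y' = Y W_y \in \mathbb{R}^{n \times d}
\]
```

Once both media types live in the same $d$-dimensional subspace, an ordinary distance between rows of $X'$ and $Y'$ can serve as a cross-media similarity.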
60
Exploring cross-media correlations: subsequent
challenges
1. How to measure both intra- and inter-media
correlations?
[Figure: intra-media and cross-media correlation links among media objects]
2. How to introduce new media objects into the
system?
[Figure: testing data are located in the correlation network in the subspace]
62
4. Retrieval of Chinese Calligraphy Characters
  • Motivation
    • Original calligraphy works are unique.
    • They exist on paper and bamboo slips, and are
      easily destroyed.

63
How to search?
  • In our digital library, we digitize Chinese
    calligraphy works.
  • We design retrieval systems to make them shareable
    with everyone on the Internet.

64
  • The objectives

1. To query similar characters
Similar characters can be found and returned to
users. This is like traditional content-based
image retrieval.
65
2. To find out where a character comes from
The queried character comes from this work
We aim to provide an intelligent way to find the
surrounding characters and present them to
users.
66
System Overview
67
(1) Segmentation
  • noise elimination
  • page-image analysis
  • smoothing

(2) Retrieval
  • feature extraction
  • shape matching
  • speed-up

68
(1) Segmentation
minimum bounding box
We segment each page into columns, then cut the columns
into individual characters, each enclosed in its
minimum bounding box.
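
A minimal sketch of this two-stage cut, assuming a binarized page (1 = ink, 0 = background) and using projection profiles to find the gaps. The slides do not state how columns and characters are separated, so the projection-profile method, the function names, and the toy page are assumptions. Boxes are `(top, left, bottom, right)` with exclusive bottom and right.

```python
def runs(profile):
    """Return (start, end) spans of consecutive non-zero entries; end is exclusive."""
    spans, start = [], None
    for i, value in enumerate(profile):
        if value and start is None:
            start = i
        elif not value and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(profile)))
    return spans

def segment(page):
    """Return the minimum bounding box of every character on a binary page."""
    height, width = len(page), len(page[0])
    boxes = []
    # Stage 1: the vertical projection separates the text columns.
    col_profile = [sum(page[r][c] for r in range(height)) for c in range(width)]
    for left, right in runs(col_profile):
        # Stage 2: the horizontal projection inside a column separates characters.
        row_profile = [sum(page[r][left:right]) for r in range(height)]
        for top, bottom in runs(row_profile):
            # Shrink the column-wide box to the ink of this character.
            ink = [sum(page[r][c] for r in range(top, bottom))
                   for c in range(left, right)]
            tight = runs(ink)
            boxes.append((top, left + tight[0][0], bottom, left + tight[-1][1]))
    return boxes

# Hypothetical 4x5 page with two columns of two characters each.
page = [
    [1, 1, 0, 0, 1],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [0, 1, 0, 1, 1],
]
print(segment(page))
# [(0, 0, 2, 2), (3, 1, 4, 2), (0, 4, 2, 5), (3, 3, 4, 5)]
```

Real calligraphy pages have touching strokes and noise, which is why the system overview lists noise elimination and smoothing before this step.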
69
(2) Retrieval of Chinese Calligraphy Characters
  • feature extraction

Calligraphy characters are written with a brush
instead of a hard pen. The brush makes strokes
vary in shape and thickness.
Ancient calligraphy has also degraded considerably
through natural aging.
We use contour points to represent each
calligraphy character, and keep the features of
each individual calligraphy character in the
database.
70
  • shape matching
  • use polar coordinates to represent the characters

Divide the directions equally into 8 bins, and
divide each bin into 4 areas. Then count the
points in every bin, as shown in the picture.
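
The 8 x 4 binning can be sketched as a polar histogram of contour points. The slides do not say where the polar origin sits or how the 4 areas are spaced, so this sketch assumes the centroid as origin and equal-width rings out to the farthest point. `polar_histogram` and `histogram_distance` are illustrative names.

```python
import math

def polar_histogram(points, n_angles=8, n_rings=4):
    """Count contour points in n_angles x n_rings polar bins around the centroid."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    radii = [math.hypot(x - cx, y - cy) for x, y in points]
    max_r = max(radii) or 1.0  # avoid dividing by zero for a single point
    hist = [[0] * n_rings for _ in range(n_angles)]
    for (x, y), r in zip(points, radii):
        angle = math.atan2(y - cy, x - cx) % (2 * math.pi)
        a = min(int(angle / (2 * math.pi) * n_angles), n_angles - 1)
        ring = min(int(r / max_r * n_rings), n_rings - 1)
        hist[a][ring] += 1
    return hist

def histogram_distance(h1, h2):
    """L1 distance between two polar histograms of the same shape."""
    return sum(abs(a - b) for row1, row2 in zip(h1, h2)
               for a, b in zip(row1, row2))

# Hypothetical contour: four points at equal distance from the centroid.
contour = [(2, 1), (-1, 2), (-2, -1), (1, -2)]
hist = polar_histogram(contour)
print(sum(map(sum, hist)))  # 4: every contour point falls in exactly one bin
```

Two characters can then be compared by the distance between their histograms; normalizing radii by the farthest point makes the descriptor insensitive to character size.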
71
  • Speed-up strategy
    • coarse-to-fine strategy
    • improve the shape matching algorithm
      • dynamic time warping (DTW) of projection histograms
      • extended DTW for 2D calligraphy contour warping
    • high-dimensional indexing
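
A minimal sketch of DTW applied to projection histograms, which could act as the cheap coarse stage before full contour matching. It shows textbook DTW on row projections of a binary character image; the authors' extended 2D variant and their indexing scheme are not shown, and the function names and toy images are assumptions.

```python
def projection(image):
    """Row-wise projection histogram: ink pixels per row of a binary image."""
    return [sum(row) for row in image]

def dtw(a, b):
    """Dynamic time warping cost between two 1-D sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

# Hypothetical characters: the second is the first stretched vertically.
cross = [[0, 1, 0],
         [1, 1, 1],
         [0, 1, 0]]
tall_cross = [[0, 1, 0],
              [1, 1, 1],
              [1, 1, 1],
              [0, 1, 0]]
print(dtw(projection(cross), projection(tall_cross)))  # 0.0
print(dtw([1, 2, 3], [2, 3, 4]))                       # 2.0
```

Because warping absorbs the vertical stretch, the two crosses match at zero cost even though their images differ in height, which is the kind of tolerance brush-written strokes call for.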

72
Visualization of Chinese Calligraphy
Retrieval result
Shape-based character retrieval
Submit Example
74
5. Building a Personalized Portal
  • Personalized portal
    • Web personalization is a technique that helps
      users quickly locate information of interest,
      including multimedia and cross-media content.
    • service integration around the content
    • information filtering based recommendation

Show me the information that I really need !
75
  • Personalized portal
  • Personalization services provided by the portal
    • my bookshelf
    • my bookmark
    • my rules
    • personal profile
    • settings

My bookshelf
My bookmark
Books recommended by rules
76
  • Service integration around the content
    • detailed information about the book
    • metadata translation
    • full-text search
    • my bookshelf management
    • my bookmark management
    • ranking
    • CALIS union catalog and interlibrary loan
    • bilingual translation

77
  • Information filtering based recommendation
    • the classification of Web data
      • content data: texts, images
      • structure data: XML/HTML tags
      • usage data: Web access logs
      • user profile: preferences, demographic information
    • implementing information filtering techniques
      • content-based filtering method
      • collaborative filtering method
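
Collaborative filtering can be sketched with user-to-user similarity over bookshelves. This is a generic illustration, not the portal's recommender. It assumes usage data has already been reduced to one set of book ids per user, and the names `cosine`, `recommend`, and the sample shelves are made up.

```python
import math

def cosine(a, b):
    """Cosine similarity between two sets of book ids."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))

def recommend(user, shelves, top_n=3):
    """Score unseen books by the similarity of the users who shelved them."""
    own = shelves[user]
    scores = {}
    for other, books in shelves.items():
        if other == user:
            continue
        sim = cosine(own, books)
        for book in books - own:
            scores[book] = scores.get(book, 0.0) + sim
    # Highest score first; ties are broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [book for book, score in ranked[:top_n] if score > 0]

# Hypothetical bookshelves.
shelves = {
    "ann": {"A", "B", "C"},
    "bob": {"A", "B", "D"},
    "cat": {"E"},
}
print(recommend("ann", shelves))  # ['D']
```

Content-based filtering would instead compare a book's own features (keywords, images) with the user's profile, so the two methods can cover each other's blind spots.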

78
6. Conclusion
  • The next generation of digital libraries shall focus
    more on multimedia and, ultimately, cross-media
    retrieval.
  • But more research issues remain to be faced:
    • cross-media representation framework
    • cross-media knowledge-based reasoning
    • analysis and recognition
    • complex retrieval

79
Thanks !