Title: SCSIT Talk, Nottingham University
1. Indexing of Graphic Document Images: a Perceptive Approach
- Mathieu Delalandre¹,²
- Thursday 16th June 2005
- ¹ PSI Laboratory, Rouen University, France
- ² SCSIT, Nottingham University, UK
2. Who am I?
- Mathieu Delalandre
- Thesis: fourth year of PhD (defence in September)
- Lab: PSI Laboratory, Rouen, France
- Supervisors: E. Trupin, J.M. Ogier, J. Labiche
- Team: S. Adam, H. Locteau, P. Héroux, E. Barbu, Y. Lecourtier
- Field: Document Image Analysis (Graphics Recognition)
- Postdoc: IPI, SCSIT, from April to September (4-5 months) with Tony Pridmore
3. Indexing of Graphic Document Images: a Perceptive Approach
- Introduction
- Systems Overview
- The Knowledge Level
- Conclusion
4. Introduction: Indexing and Retrieval (I&R)
- Section outline
  - Indexing and Retrieval (I&R)
  - Categorization of Images
  - I&R of Document Images
  - My Topic
- Indexing and Retrieval [Greengrass00]
  - Indexing: identification and recording of attributes of data that will aid retrieval.
  - Retrieval: ability of a database management system to get back data that were stored there previously.
- Applications
  - videos (MPEG, AVI, ...)
  - Web pages (XML, XHTML, ...)
  - structured documents (PDF, PS, Word, ...)
  - images (JPG, GIF, ...)
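The two definitions above can be illustrated with a minimal sketch, assuming a toy in-memory collection. Indexing records attributes for each item, and retrieval looks items up through those attributes. The names `build_index` and `retrieve`, the keyword attributes and the sample file names are hypothetical and are not taken from the talk.

```python
from collections import defaultdict


def build_index(items):
    """Indexing: record, for each attribute, the ids of the items carrying it."""
    index = defaultdict(set)
    for item_id, attributes in items.items():
        for attribute in attributes:
            index[attribute].add(item_id)
    return index


def retrieve(index, *attributes):
    """Retrieval: return the ids of items that carry all requested attributes."""
    if not attributes:
        return set()
    result = set(index.get(attributes[0], set()))
    for attribute in attributes[1:]:
        result &= index.get(attribute, set())
    return result


# Hypothetical collection: item id -> attributes identified at indexing time.
items = {
    "plan_01.gif": {"image", "graphic", "symbol"},
    "report.pdf": {"structured", "text"},
    "map_07.jpg": {"image", "graphic", "linear"},
}
index = build_index(items)
print(sorted(retrieve(index, "image", "graphic")))
```

The index is built once, before any query, which matches the idea that indexing is the step that makes later retrieval possible.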
5. Introduction: Categorization of Images
foreground/background images
6. Introduction: I&R of Document Images (1/3)
- Today, document images are not indexed by search engines because of the complexity of the Document Image Analysis (DIA) task [Doerman98] [Walker00] [Baird03]
- Is indexing of document images really needed? → two questions
- Question: how many document images are there, and where? [Spring95] [Cleveland98] [Steve99] [Ouf01] [Baird03] [Hu04]
7. Introduction: I&R of Document Images (2/3)
Question: new or just old document images?
8. Introduction: I&R of Document Images (3/3)
- To conclude
  - (1) DIA is needed (and will be needed) in the future of I&R of documents [Baird03] [Breul04]
  - (2) DIA must come back today by way of I&R [Baird03]
9. Introduction: My Topic
- Indexing of graphic document images
- Indexing and Retrieval → Indexing
  - Identification and recording of attributes of data that will aid retrieval
  - First step before retrieval
- document images → graphic document images
10. Indexing of Graphic Document Images: a Perceptive Approach
- Introduction
- Systems Overview
- The Knowledge Level
- Conclusion
11. Systems Overview: Introduction
- Section outline
  - Introduction
  - Graphics Recognition Systems
  - Graphics Indexing Systems
  - Open Problems
- Overview of systems to index graphic document images
  - these are referred to as Graphics Indexing Systems
- Graphics Indexing Systems are specialized from DIA systems applied to recognition and understanding of graphic document images [Tombre03]
  - these are referred to as Graphics Recognition Systems
12. Systems Overview: Graphics Recognition Systems (1/3)
- Graphics Recognition Systems
  - graphic document images → structured documents
  - (figure labels: symbol, linear, text)
- Applications deal with the graphics parts (symbol and linear)
  - text/graphics segmentation [Tombre02], vectorisation [Mejbri02], symbol recognition [Llados02], document interpretation (or understanding) [Ablameko00], ...
13. Systems Overview: Graphics Recognition Systems (2/3)
- Graphics are structured and connected
- Graphics Recognition Systems are based on structural methods
  - relational organization of low-level features (graphic primitives) into higher-level structures (graph) [Tombre96] [Shi89]
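As a rough illustration of this relational organization, the sketch below stores graphic primitives as attributed nodes and their relations as labelled edges. The class name `PrimitiveGraph`, the attribute names and the "junction" relation label are assumptions made for the example. They do not reproduce a structure from the cited systems.

```python
from dataclasses import dataclass, field


@dataclass
class PrimitiveGraph:
    """Graph whose nodes are graphic primitives and whose edges are relations."""
    nodes: dict = field(default_factory=dict)   # node id -> attribute dict
    edges: dict = field(default_factory=dict)   # (id_a, id_b) -> relation label

    def add_primitive(self, node_id, **attributes):
        self.nodes[node_id] = attributes

    def relate(self, id_a, id_b, relation):
        if id_a not in self.nodes or id_b not in self.nodes:
            raise KeyError("both primitives must be added before relating them")
        # Store the pair in sorted order so the relation is undirected.
        self.edges[tuple(sorted((id_a, id_b)))] = relation

    def neighbours(self, node_id):
        """Ids of the primitives directly related to node_id."""
        return sorted(b if a == node_id else a
                      for (a, b) in self.edges if node_id in (a, b))


# Hypothetical example: three vectors joined end to end into a triangle.
graph = PrimitiveGraph()
graph.add_primitive("v1", kind="vector", length=30)
graph.add_primitive("v2", kind="vector", length=40)
graph.add_primitive("v3", kind="vector", length=50)
graph.relate("v1", "v2", "junction")
graph.relate("v2", "v3", "junction")
graph.relate("v3", "v1", "junction")
print(graph.neighbours("v1"))
```

The low-level features live in the node attributes, while the higher-level structure (here, a closed triangle) only appears through the edges.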
14. Systems Overview: Graphics Recognition Systems (3/3)
- Architecture of Graphics Recognition Systems
- Graphic Primitive Extraction, some methods [Wenyin98] [Delalandre04]
  - skeletonization [Hilaire04], contouring [Ramel00], tracking [Song00], labelling [Badawy02], transform [Couasnon01], meshes [Vaxiviere95], region segmentation [Cao00], run-length [Burge98], ...
- Recognition
  - Graph Matching [Bunke00], Graph Transform [Blostein05], Primitive Matching [Foggia99], ...
15. Systems Overview: Graphics Indexing Systems (1/3)
- Graphics Indexing Systems [Doerman98] [Tombre03], 3 classes
  - Title block recognition: [Arias98], [Najman01], [Lamiroy02], ...
  - Statistical framework: [Samet96], [Worring99], [Tabbone03], [Terrades03], ...
  - Graphics indexing: [Kasturi88], [Lorenz95], [Huang97], [Hu97], [Barbu04], [Valasoulis04], ...
16. Systems Overview: Graphics Indexing Systems (2/3)
17. Systems Overview: Graphics Indexing Systems (3/3)
18. Systems Overview: Open Problems (1/2)
- All these systems use a Lexical/Syntactic (or Bottom/Up) approach [Tombre96]
  - Lexical (Bottom): extraction of graphical primitives from images in a fixed way
  - Syntactic (Up): analysis of graphical primitives without returning to the image
- So, all these systems use a Document Understanding approach, but I&R is not an understanding problem
→ content adaptation is the most important feature of I&R systems
19. Systems Overview: Open Problems (2/2)
- Examples of content adaptation
  - A broad class of documents
- To conclude
  - I&R must deal with content adaptation
  - Content adaptation can't be solved without a knowledge-based approach
20. Indexing of Graphic Document Images: a Perceptive Approach
- Introduction
- Systems Overview
- The Knowledge Level
- Conclusion
21. The Knowledge Level: Introduction
- Section outline
  - Introduction
  - Graphical Knowledge
  - Graphics Model
  - a Perceptive Approach
- Some (general) definitions [Tuthill90] [Holsapple04]
  - Knowledge: human mental grasp of reality
  - Representation: placement (and meaning) of knowledge into (from) computer memory
  - Formalism: a set of symbols corresponding to knowledge inside computers
- Different types of knowledge
  - on strategies
  - on case-based reasoning
  - on ontologies
  - ...
22. The Knowledge Level: Graphical Knowledge (1/2)
- Graphical Knowledge [Delalandre05]: a type of knowledge corresponding to the human mental grasp of graphics
It is a gate!
23. The Knowledge Level: Graphical Knowledge (2/2)
- Two formalism levels [Tombre96]
  - Graphic primitives [Murray96]
    - Pixel-based formalism: pixel, raster, run, connected component, ...
    - Vector-based formalism: vector, arc, curve, ellipse, square, ...
  - Graph-based formalisms [Sowa99]: Relational Attributed Graphs (RAG), Frames, Object-Oriented Languages, ...
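To make one of the pixel-based primitives concrete, here is a small sketch that converts a binary raster into horizontal runs. It assumes a run is a maximal horizontal sequence of foreground pixels, stored as (row, start column, length). The function names and this storage convention are illustrative choices, not definitions from the cited references.

```python
def row_to_runs(row):
    """Return the foreground runs of a binary row as (start, length) pairs."""
    runs = []
    start = None
    for x, pixel in enumerate(row):
        if pixel and start is None:
            start = x                        # a run begins
        elif not pixel and start is not None:
            runs.append((start, x - start))  # the run ended at x - 1
            start = None
    if start is not None:                    # run touching the right border
        runs.append((start, len(row) - start))
    return runs


def image_to_runs(image):
    """Pixel raster (list of rows) -> list of (y, start, length) runs."""
    return [(y, start, length)
            for y, row in enumerate(image)
            for start, length in row_to_runs(row)]


raster = [
    [0, 1, 1, 0, 1],
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
]
print(image_to_runs(raster))
```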
24. The Knowledge Level: Graphics Model (1/2)
- Model [Seguela01]: a knowledge representation using given formalisms and for given system purposes
- Graphics Model [Delalandre05]: model used to represent the graphical knowledge
25. The Knowledge Level: Graphics Model (2/2)
- One system, one model → a considerable number of models
  - [Joseph92] [Pasternak93] [Han94] [Burgue95] [Yu97] [Lee98] [Ramel00] [Couasnon01] [Badawy02] [Yan04]
- Models depend on the extracted graphic primitives; we can define a graphics model taxonomy with 3 classes [Delalandre05]
26. The Knowledge Level: a Perceptive Approach (1/6)
27. The Knowledge Level: a Perceptive Approach (2/6)
Region Level
Contour Level
Skeleton Level
28. The Knowledge Level: a Perceptive Approach (3/6)
- First step, the region level: connected component analysis [Alnuweiri92]
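A minimal sketch of connected component analysis on a binary raster follows, using a plain breadth-first flood fill. This is a generic sequential labelling, not the method of the cited reference, and the function name, the `connectivity` parameter and the sample raster are assumptions made for the example.

```python
from collections import deque


def connected_components(image, connectivity=8):
    """Label foreground (non-zero) pixels; return (labels, number of components)."""
    height = len(image)
    width = len(image[0]) if height else 0
    if connectivity == 4:
        offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    else:
        offsets = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                   if (dy, dx) != (0, 0)]
    labels = [[0] * width for _ in range(height)]
    count = 0
    for y in range(height):
        for x in range(width):
            if not image[y][x] or labels[y][x]:
                continue                    # background or already labelled
            count += 1                      # new component found
            labels[y][x] = count
            queue = deque([(y, x)])
            while queue:                    # breadth-first flood fill
                cy, cx = queue.popleft()
                for dy, dx in offsets:
                    ny, nx = cy + dy, cx + dx
                    if (0 <= ny < height and 0 <= nx < width
                            and image[ny][nx] and not labels[ny][nx]):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


raster = [
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]
labels, count = connected_components(raster)
print(count)
```

With 8-connectivity the two diagonal pixels on the right form a single region, while 4-connectivity would split them, so the choice of neighbourhood directly changes the regions obtained at this level.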
29. The Knowledge Level: a Perceptive Approach (4/6)
- Six Features
- (F) Foreground
- (B) Background
- (R) Resolution (i.e. distance)
- (N) Neighboring
- (S) Size
- (I) Inclusion
30. The Knowledge Level: a Perceptive Approach (5/6)
31. The Knowledge Level: a Perceptive Approach (6/6)
FS1
BR2
Ngt2
32. Indexing of Graphic Document Images: a Perceptive Approach
- Introduction
- Systems Overview
- The Knowledge Level
- Conclusion
33. Conclusion
- Conclusion
  - This is just a bibliographic study and some ideas
  - Start working on these ideas?
- Perspectives
  - Contour and skeleton levels?
  - System to control the building of the representation?