1 Two-dimensional Context-Free Grammars: Mathematical Formulae Recognition
Daniel Průša, Václav Hlaváč
Center for Machine Perception, Faculty of Electrical Engineering, Czech Technical University, Prague
2 Presentation Overview
- Formulae recognition, problem formulation
- Known methods
- General idea of structural recognition
- Two-dimensional context-free grammars
- Extension of the grammars
- Recognition tool, pilot implementation
- Results, future plans
3 Motivation for this work
- To test a theoretical construct on a practical pilot problem with explicit structure: mathematical formulae.
- The group of Schlesinger and Savchynskyy from Kiev works on music score recognition. We cooperate in a joint research project.
4 Math. formulae, off-line or on-line
- Formulae recognition can be divided into two groups by the type of input:
  - Off-line recognition: a formula is depicted in a raster image.
  - On-line recognition: a formula is represented by a sequence of pen strokes (growing importance due to tablet PCs).
5 Math. formulae recognition, usage
- Off-line recognition: conversion of scanned printed mathematical texts into an electronic form.
- On-line recognition: connected to pen-based computing technologies (electronic tablets).
- There are many papers on formulae recognition, but only a few commercial products (e.g., xMathJournal by xThink).
6 Usual architecture
- Two independent layers:
  - Symbol detection and recognition.
  - Structural analysis.
Diagram (processing pipeline): image or sequence of strokes → symbol recognition → symbols (coordinates and font size) → error corrections (optional) → structural analysis → derivation tree
7 Symbol recognition methods
- Image segmentation followed by an OCR tool.
- Image segmentation and character recognition performed simultaneously (e.g., by Hidden Markov Models).
- It is very difficult to recover from errors made in the segmentation phase.
- Semantics are not taken into account.
8 Structural analysis methods
- Grammar based:
  - geometric grammars
  - graph grammars
- Non-grammar based:
  - minimum spanning tree
  - hard-coded rules
9 Our approach to structural recognition
- Based on general structural constructions by M.I. Schlesinger and V. Hlaváč in Ten Lectures on Statistical and Syntactic Pattern Recognition (Kluwer Academic Publishers, 2002).
- Do not separate segmentation and parsing; perform them simultaneously.
- Suitable for recognition of objects with rich structure.
- Already successfully applied to music scores and electric circuit diagrams.
10 Structural Recognition: General Idea
Assumptions: input image, set of derivation rules.
Recognition:
- The algorithm starts with regions labeled by terminals:
  - squares corresponding to one symbol,
  - regions detected by an external tool.
- Bigger regions labeled by non-terminals are derived by applying the rules; each derivation is assigned a penalty.
- Result: the region matching the whole picture with the smallest penalty.
Figure: region N is derived by a rule from regions A, B, C, D.
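The bottom-up derivation described on this slide can be sketched roughly as follows. This is a minimal illustration, not the authors' algorithm: the `Region` class, the `RULES` table, the labels `Digit` and `Expr`, and the restriction to side-by-side regions of equal height are all assumptions made for the example.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Region:
    label: str
    x0: int  # half-open box [x0, x1) x [y0, y1)
    y0: int
    x1: int
    y1: int


# Hypothetical rules: (result label, left label, right label, rule penalty).
RULES = [("Expr", "Digit", "Digit", 1),
         ("Expr", "Expr", "Digit", 1)]


def derive(terminals, rules, width, height, goal="Expr"):
    """Return the smallest penalty of a `goal` region covering the picture,
    or None if no such region can be derived."""
    best = dict(terminals)  # Region -> smallest penalty found so far
    changed = True
    while changed:  # repeat until no rule application improves anything
        changed = False
        for (a, pa), (b, pb) in product(list(best.items()), repeat=2):
            # This sketch only joins touching regions of equal height.
            if a.x1 != b.x0 or (a.y0, a.y1) != (b.y0, b.y1):
                continue
            for out, left, right, cost in rules:
                if (a.label, b.label) == (left, right):
                    region = Region(out, a.x0, a.y0, b.x1, a.y1)
                    penalty = pa + pb + cost
                    if penalty < best.get(region, float("inf")):
                        best[region] = penalty
                        changed = True
    return best.get(Region(goal, 0, 0, width, height))
```

Usage: with three `Digit` terminals of penalties 2, 3 and 0 placed next to each other in a 3x1 picture, `derive` returns 7 (the three terminal penalties plus two rule applications).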
11 Structural Recognition Applied to Formulae Using 2D Context-free Grammars
- Uniform shapes of regions are considered: rectangles.
- A 2D grammar for mathematical formulae was designed.
- Terminals detection: detect all possible occurrences of elementary symbols using an OCR tool and evaluate each occurrence by a penalty (computed by the OCR tool).
Figure labels: fraction line, minus sign; symbol 5.
12 Structural Recognition Applied to Formulae Using 2D Context-free Grammars
Parsing: let the structural analysis decide what is the best segmentation and interpretation of the elementary symbols, i.e. find the derivation tree that covers the whole image and has the smallest penalty.
Figure: example formula with the symbols -, 5, 2.
13 Two-dimensional Context-free Grammars
A grammar G is given by:
- a set of terminals
- a set of non-terminals
- an initial non-terminal S
- a set of productions P
Productions in P: three basic types, and a generalized form.
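One common way to write such a grammar is sketched below. Only G, S and P are named in the slides; the tuple notation and the symbols V_N, V_T, N, A, B, a and A_ij are assumptions introduced for illustration. The three basic types are assumed here to be a terminal production, a horizontal concatenation and a vertical concatenation of two non-terminals, and the generalized form is assumed to have a matrix of m rows and n columns on its right-hand side.

```latex
% Assumed notation, not taken from the slides
G = (V_N, V_T, S, P), \qquad S \in V_N

% Assumed basic forms, with N, A, B \in V_N and a \in V_T
N \to a, \qquad
N \to A\,B, \qquad
N \to \begin{matrix} A \\ B \end{matrix}

% Assumed generalized form, with A_{ij} \in V_N \cup V_T
N \to \begin{matrix}
  A_{11} & \cdots & A_{1n} \\
  \vdots &        & \vdots \\
  A_{m1} & \cdots & A_{mn}
\end{matrix}
```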
14 Interpretation of Productions
G generates pictures that can be named by the
initial non-terminal S
15 Theoretical Results on 2D CF Languages
L(2CFG): the class of languages that can be generated by a 2D CF grammar
- L(2CFG) includes the 1D context-free languages
- L(2CFG) and L(2FSA) are incomparable
- There is no analogue of the Chomsky normal form of productions
- The basic form of productions is weaker than the general one
- The emptiness problem is undecidable
- Languages in L(2CFG) can be recognized in polynomial time
Observation: a natural generalization, but the properties of L(2CFG) differ from those of the class of 1D context-free languages.
16 Recognition in Polynomial Time
2D CF grammars with productions in the basic form: the generated languages can be recognized in time polynomial in the picture size (M.I. Schlesinger).
The algorithm can be generalized to all languages in L(2CFG):
- the degree of the polynomial depends on the size of the productions, i.e. on the maximal number of rows and the maximal number of columns on the right-hand side of a production.
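A CYK-style recognizer illustrates why basic-form productions allow polynomial-time recognition. This is a sketch under assumptions, not the algorithm referenced on the slide: the three basic production types are assumed to be a terminal rule, a left-right concatenation and a top-bottom concatenation, and the function name and rule encodings are hypothetical. For an m x n picture the sketch visits O(m^2 n^2) sub-rectangles and tries O(m + n) cuts for each.

```python
from functools import lru_cache


def recognizes(picture, start, term_rules, horiz_rules, vert_rules):
    """Decide whether `start` generates `picture`.

    picture:     non-empty list of equal-length strings (rows of symbols)
    term_rules:  set of (N, a)     meaning N -> a
    horiz_rules: set of (N, A, B)  meaning N -> A left of B
    vert_rules:  set of (N, A, B)  meaning N -> A on top of B
    """
    rows, cols = len(picture), len(picture[0])

    @lru_cache(maxsize=None)
    def labels(r0, c0, r1, c1):
        """Non-terminals generating the sub-picture [r0, r1) x [c0, c1)."""
        found = set()
        if r1 - r0 == 1 and c1 - c0 == 1:
            found |= {n for n, a in term_rules if a == picture[r0][c0]}
        for c in range(c0 + 1, c1):  # cut into a left and a right part
            left, right = labels(r0, c0, r1, c), labels(r0, c, r1, c1)
            found |= {n for n, a, b in horiz_rules
                      if a in left and b in right}
        for r in range(r0 + 1, r1):  # cut into a top and a bottom part
            top, bottom = labels(r0, c0, r, c1), labels(r, c0, r1, c1)
            found |= {n for n, a, b in vert_rules
                      if a in top and b in bottom}
        return frozenset(found)

    return start in labels(0, 0, rows, cols)
```

Usage: with `term_rules = {("S", "a")}` and `horiz_rules = vert_rules = {("S", "S", "S")}`, the grammar generates every rectangle filled with `a`, so `["aaa", "aaa"]` is accepted and `["aa", "ab"]` is rejected.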
17 Extension of 2D CF Grammars
2D context-free grammars are not powerful enough to express the complex structure of mathematical formulae.
We need a formalism that makes it easy to work with relative positions and sizes of symbols, e.g. to express relationships such as "a symbol is a superscript of another symbol".
18 Extension of 2D CF Grammars
- Regions are still rectangles.
- Each derived region is assigned a feature point (logical center). The feature point of a derived region is determined by the applied production.
19 Extension of 2D CF Grammars
- Usage of productions is not limited to directly neighboring (touching) rectangles.
- Productions can specify a rectangular area in which some specific point of a rectangle has to be contained.
- Positions and sizes can be given relative to one of the rectangles.
- Restrictions on relative sizes of rectangles are also possible.
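A production constraint of this kind might be checked as in the sketch below. Everything concrete in it is an assumption for illustration: the `Rect` class, the choice of the candidate's bottom-left corner as the "specific point", the shape of the admissible area, and the size factor 0.8 are hypothetical, not the parameters of the actual grammar. Image coordinates are assumed to have y growing downwards.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x0: float
    y0: float  # top edge (y grows downwards)
    x1: float
    y1: float  # bottom edge

    @property
    def width(self):
        return self.x1 - self.x0

    @property
    def height(self):
        return self.y1 - self.y0


def is_superscript(base, cand):
    """Hypothetical 'cand is a superscript of base' constraint."""
    # Admissible area, given relative to the base rectangle: to the right
    # of the base (at most one base width away) and vertically between
    # half a base height above the top edge and the middle of the base.
    area = Rect(base.x1, base.y0 - 0.5 * base.height,
                base.x1 + base.width, base.y0 + 0.5 * base.height)
    # Specific point of the candidate: its bottom-left corner.
    px, py = cand.x0, cand.y1
    inside = area.x0 <= px <= area.x1 and area.y0 <= py <= area.y1
    # Restriction on relative size: the superscript must be smaller.
    smaller = cand.height <= 0.8 * base.height
    return inside and smaller
```

Usage: for `base = Rect(0, 10, 10, 30)`, a small raised rectangle such as `Rect(11, 4, 17, 16)` satisfies the constraint, while `Rect(11, 12, 21, 30)`, which sits on the same baseline, does not.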
20 Penalty Computation
Based on summing partial penalties determined by the following criteria:
- Used production.
- Relative sizes and positions of the regions the production is applied to (original regions).
- Number of black pixels in the new region that are not in the original regions.
- Penalty of the original regions.
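The summation of partial penalties could look like the following sketch. The function names, the box encoding `(x0, y0, x1, y1)` with half-open ranges, the encoding of black pixels as 1, and the `pixel_weight` parameter are assumptions; how the production and geometry penalties themselves are obtained is left open here.

```python
def extra_black_pixels(image, new_box, child_boxes):
    """Count black pixels (value 1) inside new_box that lie in none of
    the child boxes. Boxes are (x0, y0, x1, y1), half-open."""
    x0, y0, x1, y1 = new_box

    def covered(x, y):
        return any(cx0 <= x < cx1 and cy0 <= y < cy1
                   for cx0, cy0, cx1, cy1 in child_boxes)

    return sum(image[y][x] == 1 and not covered(x, y)
               for y in range(y0, y1) for x in range(x0, x1))


def region_penalty(rule_cost, geometry_cost, image, new_box, children,
                   pixel_weight=1):
    """Sum the four partial penalties listed above.

    children: list of (box, penalty) pairs for the original regions.
    """
    boxes = [box for box, _ in children]
    return (rule_cost
            + geometry_cost
            + pixel_weight * extra_black_pixels(image, new_box, boxes)
            + sum(penalty for _, penalty in children))
```

Usage: in a 4x2 image with one stray black pixel between two child regions of penalties 2 and 3, a rule cost of 1 and a geometry cost of 2 give a total of 1 + 2 + 1 + 5 = 9.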
21 Implementation of the Recognition Tool
- Off-line recognition.
- Implemented in Java.
- Trained and tuned for hand-written formulae.
- Black and white images (but can be extended to gray-scale images).
- The following constructs are supported:
  - variables, numbers, parentheses,
  - common unary and binary operators, the power operator,
  - fractions, square root, subscripts, superscripts,
  - sum, integral.
- Can deal with noise, ambiguities, touching or split symbols, etc., and also with misplaced symbols.
22 Tool Architecture
Diagram: terminals detection (using the OCR tool), followed by parsing (using the 2D grammar).
23 Terminals Detection
Ideally, all regions should be scanned for the presence of an elementary symbol, but this is too time-consuming. Two smarter strategies are implemented:
- Scanning rectangular windows of some predefined sizes (not all sizes).
- Detection based on connected components.
Limitations of the method: symbols with overlapping bounding boxes, symbols that intersect.
OCR tool used: a simple method was implemented. A feature vector is extracted from the image and classified by a k-nearest neighbor classifier, trained for all supported elementary symbols.
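The classification step can be illustrated with a plain k-nearest neighbor vote. This is only a sketch: feature extraction is assumed to have happened already, the Euclidean distance and `k=3` are arbitrary choices, and the returned penalty (the share of dissenting neighbors) is a hypothetical stand-in for whatever penalty the actual OCR tool computes.

```python
import math
from collections import Counter


def knn_classify(vector, training, k=3):
    """Classify a feature vector by majority vote of its k nearest
    training samples.

    training: list of (feature_vector, label) pairs.
    Returns (label, penalty); the penalty is a hypothetical measure.
    """
    nearest = sorted(training,
                     key=lambda item: math.dist(item[0], vector))[:k]
    votes = Counter(label for _, label in nearest)
    label, count = votes.most_common(1)[0]
    penalty = 1 - count / len(nearest)  # share of dissenting neighbors
    return label, penalty
```

Usage: with training samples for "minus" clustered near (0, 0) and samples for "five" near (5, 5), a query vector such as (0.2, 0.1) is labeled "minus" with penalty 0.0, because all three nearest neighbors agree.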
24 Remarks on Terminals Detection
- Symbols whose size is not limited by a constant are not treated as terminal symbols (e.g., fraction line, square root).
- In addition, a square root cannot be separated from an image by a rectangle (it surrounds its argument).
- Solution: treat these cases as symbols composed of several terminal symbols and extend the grammar by related productions.
25 Parsing Algorithm
- Bottom-up approach, as described in the general structural recognition.
- Complexity depends on the number of terminals detected during the first phase. In general it can be exponential, but it is substantially reduced by production restrictions and the use of suitable data structures.
- Data structures for orthogonal range queries (searching for points located in a rectangle) are used to speed up the algorithm.
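An orthogonal range query can be illustrated with a deliberately simple index: points are kept sorted by x, a binary search narrows the x-range, and the y-range is checked by a linear filter. The class name `PointIndex` is hypothetical, and this is a swapped-in simplification; the slides do not say which structure the tool uses, and a range tree or k-d tree would answer such queries faster.

```python
from bisect import bisect_left, bisect_right


class PointIndex:
    """Simple index for 'all points inside a rectangle' queries."""

    def __init__(self, points):
        self.points = sorted(points)  # sorted by x, then by y
        self.xs = [x for x, _ in self.points]

    def query(self, x0, y0, x1, y1):
        """Return points with x0 <= x <= x1 and y0 <= y <= y1."""
        lo = bisect_left(self.xs, x0)   # first point with x >= x0
        hi = bisect_right(self.xs, x1)  # one past the last point with x <= x1
        return [(x, y) for x, y in self.points[lo:hi] if y0 <= y <= y1]
```

Usage: `PointIndex([(1, 1), (4, 2), (5, 9), (7, 3), (9, 9)]).query(3, 0, 8, 5)` returns `[(4, 2), (7, 3)]`. In a parser, the indexed points could be the feature points of already derived regions, so that candidate partners for a production are found without scanning every region.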
26 Future Plans
- Focus on printed formulae.
- Collect a sufficiently large set of annotated printed formulae.
- Apply learning methods: learn etalons (prototypes) of elementary symbols and production parameters.