Distinguishing Mathematics Notation from English Text using Computational Geometry

About This Presentation


Slides: 16

Provided by: Joh6531

Learn more at: https://www.cse.lehigh.edu


Transcript and Presenter's Notes


1
Distinguishing Mathematics Notation from English
Text using Computational Geometry

D. Drake, H.S. Baird
Department of Computer Science and Engineering
Lehigh University

2
The Task

Differentiate isolated math and English textlines

English text or math?
How can Optical Character Recognition (OCR) systems make this distinction?
(a) math symbols
(b) spatial arrangement
3
Applications of Textline Classification

Commercial OCR systems perform far better on text than on math
(e.g. OCR systems still garble math)
Textline classification allows:
Processing text and math differently
Handing off math to special-purpose recognizers
Showing math textlines to users as images (no OCR errors)

4
Prior Work

Past approaches: symbol recognition plus spatial analysis
Requires a classifier for special math symbols
Often sensitive to font, font size, text orientation, language

Our approach: purely spatial analysis
Independent of font, font size, text orientation
Easily extendable to other languages
But may not handle as many cases (let's see)
5
Voronoi Diagrams
Given a set of point sites in the plane, partition the plane
into regions such that the points in each region are closer
to one site than to any other
A computational geometry data structure whose structure is
invariant under arbitrary nonsingular similarity
transformations (translation, rotation, scaling),
and which is efficiently computable
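
The invariance claim can be checked numerically with a small sketch. This is not from the slides: it is a brute-force nearest-site labeling (a discrete stand-in for a Voronoi diagram), and the site coordinates, query grid, and transformation parameters are arbitrary illustrative assumptions.

```python
import math

def nearest_site(point, sites):
    """Return the index of the site closest to `point` (Euclidean distance)."""
    return min(range(len(sites)), key=lambda i: math.dist(point, sites[i]))

def similarity(point, scale, angle, shift):
    """Rotate by `angle`, scale uniformly by `scale`, then translate by `shift`."""
    x, y = point
    c, s = math.cos(angle), math.sin(angle)
    return (scale * (c * x - s * y) + shift[0],
            scale * (s * x + c * y) + shift[1])

# Hypothetical sites and query points (offset so no query lies on a region boundary).
sites = [(1.0, 1.0), (4.0, 2.0), (2.5, 5.0)]
queries = [(x + 0.25, y + 0.4) for x in range(6) for y in range(6)]

# Voronoi region membership before the transformation.
labels = [nearest_site(q, sites) for q in queries]

# Apply the same similarity transformation to sites and queries, then relabel.
params = dict(scale=2.5, angle=0.7, shift=(10.0, -3.0))
moved_sites = [similarity(s, **params) for s in sites]
moved_labels = [nearest_site(similarity(q, **params), moved_sites) for q in queries]

# Every query point stays in the region of the same site.
print(labels == moved_labels)
```

Because a similarity transformation multiplies all distances by the same factor, the closest site of every point is unchanged, which is why features derived from the diagram need not depend on font size or text orientation.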
6
We use Kise's Area Voronoi diagrams
Input image
Sample points on the boundaries of black connected
components
Compute the Voronoi diagram
Compute the Area Voronoi diagram
Compute the neighbor graph
The neighbor graph is the input to our classifier, which
decides whether a textline is math or text
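
The pipeline above can be approximated on a toy raster as follows. This is a hedged sketch, not Kise's algorithm: instead of sampling boundary points and pruning a point Voronoi diagram, it assigns every pixel to the nearest black connected component by brute force and treats components with touching regions as neighbors. The image and all function names are hypothetical.

```python
from collections import deque

# Hypothetical 4x10 binary image: '#' is black, '.' is white.
IMAGE = [
    "..........",
    ".##..#..#.",
    ".##..#..#.",
    "..........",
]

def connected_components(image):
    """Label 4-connected black components; return {label: [(row, col), ...]}."""
    h, w = len(image), len(image[0])
    seen, comps = set(), {}
    for r in range(h):
        for c in range(w):
            if image[r][c] != "#" or (r, c) in seen:
                continue
            label, queue = len(comps), deque([(r, c)])
            seen.add((r, c))
            comps[label] = []
            while queue:
                y, x = queue.popleft()
                comps[label].append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and image[ny][nx] == "#" and (ny, nx) not in seen):
                        seen.add((ny, nx))
                        queue.append((ny, nx))
    return comps

def area_voronoi_labels(image, comps):
    """Assign each pixel to the component owning the closest black pixel."""
    def dist2(p, q):
        return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
    return [[min(comps, key=lambda k: min(dist2((r, c), p) for p in comps[k]))
             for c in range(len(image[0]))] for r in range(len(image))]

def neighbor_graph(labels):
    """Two components are neighbors if their regions share a pixel border."""
    h, w = len(labels), len(labels[0])
    edges = set()
    for r in range(h):
        for c in range(w):
            for nr, nc in ((r + 1, c), (r, c + 1)):
                if nr < h and nc < w and labels[r][c] != labels[nr][nc]:
                    edges.add(tuple(sorted((labels[r][c], labels[nr][nc]))))
    return edges

comps = connected_components(IMAGE)
regions = area_voronoi_labels(IMAGE, comps)
edges = neighbor_graph(regions)
print(len(comps), sorted(edges))
```

On this toy image the three components form a chain: the left and right components are not neighbors because the middle component's region separates them.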
7
Kise's algorithm run on math notation
8
Features of the Neighbor Graph we use for
Classification

Crafted to detect spatial arrangements among
characters that distinguish math from text

Edge Features
angle (wrt horizontal)
ratio of areas
ratio of diameters
shadowing

Node Features
aspect ratio
diameter/area ratio
fanout

Coarsely quantized; binary-valued: presence (1)
or absence (0)
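
A minimal sketch of how two of these features might be computed and coarsely quantized into binary indicators. The slides do not give the bin boundaries or the exact feature definitions, so the thresholds, the centroid-based angle, and the bounding-box aspect ratio below are all assumptions for illustration.

```python
import math
from bisect import bisect_right

def one_hot_bin(value, thresholds):
    """Quantize `value` into len(thresholds)+1 bins; return binary indicators."""
    idx = bisect_right(thresholds, value)
    return [1 if i == idx else 0 for i in range(len(thresholds) + 1)]

def edge_angle(c1, c2):
    """Angle in degrees (0 to 90) between the line joining two centroids and the horizontal."""
    dx, dy = c2[0] - c1[0], c2[1] - c1[1]
    return math.degrees(math.atan2(abs(dy), abs(dx)))

def aspect_ratio(box):
    """Width divided by height of a bounding box given as (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = box
    return (x1 - x0) / (y1 - y0)

# Hypothetical bin boundaries; the original quantization is not stated.
ANGLE_BINS = [15.0, 45.0, 75.0]
ASPECT_BINS = [0.5, 1.0, 2.0]

# Edge feature: a slightly rising edge between two neighboring components.
angle_bits = one_hot_bin(edge_angle((0.0, 0.0), (10.0, 3.0)), ANGLE_BINS)

# Node feature: a wide, flat component such as a fraction bar.
aspect_bits = one_hot_bin(aspect_ratio((0, 0, 30, 10)), ASPECT_BINS)

print(angle_bits, aspect_bits)
```

Each continuous measurement becomes a short vector with exactly one bit set, which matches the presence (1) or absence (0) encoding described on the slide.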
9
Classifier design

Node classifier
77 node binary features
2926 quadratic binary features (ANDing pairs of
features)
Assume class-conditional independence among
quadratic features
Trained a Bayesian node classifier
Edge classifier
29 edge binary features
406 quadratic binary features (ANDing pairs of features)
Assume class-conditional independence among
quadratic features
Trained a Bayesian edge classifier
Combined the results into a textline classifier
Runs fast: 0.072 CPU sec per textline on average
(on a 650 MHz SunBlade); not optimized for
speed
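
The feature counts follow from taking all unordered pairs: 77·76/2 = 2926 and 29·28/2 = 406. The sketch below builds such quadratic features and trains a Bayesian classifier under the class-conditional independence assumption. The add-one smoothing, the toy training rows, and the function names are assumptions, and the slides do not say how the node and edge results were combined, so that step is omitted.

```python
import math
from itertools import combinations

def quadratic_features(bits):
    """AND every unordered pair of binary features."""
    return [a & b for a, b in combinations(bits, 2)]

def train_naive_bayes(samples, labels):
    """Estimate P(class) and P(feature = 1 | class), with add-one smoothing (an assumption)."""
    model = {}
    for cls in set(labels):
        rows = [s for s, y in zip(samples, labels) if y == cls]
        n = len(rows)
        probs = [(sum(col) + 1) / (n + 2) for col in zip(*rows)]
        model[cls] = (n / len(samples), probs)
    return model

def classify(model, x):
    """Return the class with the highest log-posterior, treating features as independent."""
    def score(cls):
        prior, probs = model[cls]
        return math.log(prior) + sum(
            math.log(p if bit else 1 - p) for p, bit in zip(probs, x))
    return max(model, key=score)

# The pair counts match the slide: C(77, 2) and C(29, 2).
print(len(quadratic_features([0] * 77)), len(quadratic_features([0] * 29)))

# Toy training data with three raw binary features per sample (hypothetical).
raw = [[1, 1, 0], [1, 1, 1], [0, 0, 1], [1, 0, 0]]
labels = ["math", "math", "text", "text"]
model = train_naive_bayes([quadratic_features(r) for r in raw], labels)

print(classify(model, quadratic_features([1, 1, 0])))
print(classify(model, quadratic_features([0, 1, 1])))
```

ANDing pairs lets the classifier respond to co-occurring properties, which a classifier over the single features alone cannot express once independence is assumed.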

10
Training & Test data

Collected 264 images of textlines
from scanned math books
also synthesized using LaTeX
Training set
132 textlines: 68 math, 64 text
7273 nodes total: 2273 math, 5000 text
9358 edges total: 3827 math, 5531 text
Test set
132 textlines: 68 math, 64 text
7072 nodes total: 2269 math, 4803 text
9322 edges total: 4005 math, 5317 text
(A small, preliminary trial.)

11
Examples of Correctly Classified Textlines
12
Results

Experiment performed on synthetically generated
images and scanned books

Confusion Matrix

              Classified as
True class    Math    Text
Math          67      1
Text          0       64

Error Rates

              Data Set
True class    Training    Testing
Math          0.029       0.015
Text          0.000       0.000
Overall       0.015       0.008

Examples of misclassified textlines
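
The error rates can be recomputed from the confusion matrix. The sketch below assumes the matrix reports the test set, which is consistent with the Testing column (1 of 68 math textlines misclassified); the dictionary layout and function name are illustrative.

```python
# Confusion matrix from the slide: confusion[true_class][classified_as].
confusion = {
    "math": {"math": 67, "text": 1},
    "text": {"math": 0, "text": 64},
}

def error_rates(confusion):
    """Per-class and overall error rates from a confusion matrix of counts."""
    rates, wrong_total, total = {}, 0, 0
    for true_cls, row in confusion.items():
        n = sum(row.values())
        wrong = n - row[true_cls]      # everything not on the diagonal
        rates[true_cls] = wrong / n
        wrong_total += wrong
        total += n
    rates["overall"] = wrong_total / total
    return rates

rates = error_rates(confusion)
print({k: round(v, 3) for k, v in rates.items()})
# prints {'math': 0.015, 'text': 0.0, 'overall': 0.008}
```

The Training column is likewise consistent with 2 errors among 68 math textlines (2/68 ≈ 0.029) and 2 errors among 132 textlines overall (2/132 ≈ 0.015).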
13
Summary

Analysis of spatial arrangements (without symbol
recognition) handles many cases
Automatically trainable
Needs no prior knowledge of font, font size, or
spacing
Far less effort to train spatial classifiers than
to build a recognizer for math symbols in all
typefaces, sizes, etc.
Possibly easily extendable to (trainable on)
languages other than English

14
Future Work