Title: Dr. István Marosi: Character Recognition Internals
1SSIP 2005, Szeged
Character Recognition Internals
- Dr. István Marosi
- Scansoft-Recognita, Inc., Hungary
2OCR Internals
- Main tasks of an OCR system
- Image acquisition
- Layout recognition
- Text recognition
- User assisted correction
- Result exportation
3OCR Internals
- Main tasks of an OCR system
- Image acquisition
- Get image
- B/W Scanning
- Gray Scanning
- Color Scanning
- Load from image file
- Preprocess image
- Layout recognition
- Text recognition
- User assisted correction
- Result exportation
4OCR Internals
- Main tasks of an OCR system
- Image acquisition
- Get image
- Preprocess image
- Color separation
- Thresholding
- Despeckling
- Rotation
- Deskewing
- Layout recognition
- Text recognition
- User assisted correction
- Result exportation
Color Separation
De-speckle, de-skew
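A minimal sketch of two of the preprocessing steps listed above, thresholding and despeckling. It assumes a gray image stored as a list of rows with values 0..255. The fixed threshold of 128 and the "no black neighbour" speckle rule are illustrative assumptions, not the method of any particular OCR product.

```python
def threshold(gray, t=128):
    """Return a binary image: 1 (black) where the gray value is below t."""
    return [[1 if v < t else 0 for v in row] for row in gray]


def despeckle(binary):
    """Clear black pixels that have no black 8-neighbour (isolated specks)."""
    h, w = len(binary), len(binary[0])
    out = [row[:] for row in binary]
    for y in range(h):
        for x in range(w):
            if not binary[y][x]:
                continue
            # Count black pixels in the 3x3 window, excluding the centre.
            neighbours = sum(
                binary[ny][nx]
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
                if (ny, nx) != (y, x)
            )
            if neighbours == 0:
                out[y][x] = 0
    return out
```

Real systems usually pick the threshold adaptively and remove specks larger than one pixel; the sketch only shows the order of the two steps.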
5The Preprocessed Image
Joined chars
8The Preprocessed Image
Broken chars
11OCR Internals
- Main tasks of an OCR system
- Image acquisition
- Layout recognition
- Text zones
- Columns of flowed text
- Tables
- Inverse text
- Graphic zones
- Text recognition
- User assisted correction
- Result exportation
12OCR Internals
- Main tasks of an OCR system
- Image acquisition
- Layout recognition
- Text zones
- Graphic zones
- Line Art
- Photo
- Text recognition
- User assisted correction
- Result exportation
13OCR Internals
- Main tasks of an OCR system
- Image acquisition
- Layout recognition
- Text recognition
- Segmentation
- Calculation of Feature Vector Elements
- Classification
- Language Analysis
- Voting
- User assisted correction
- Result exportation
14Segmentation
- Which pixel groups belong to a single letter?
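The simplest answer to this question treats every connected group of black pixels as one letter. The sketch below labels 8-connected components and returns their bounding boxes. The binary image format (rows of 0/1, 1 = black) and the function name are assumptions made for illustration. Joined characters end up in one box and broken characters in several, which is why this first guess needs later correction.

```python
from collections import deque


def connected_components(binary):
    """Return bounding boxes (x0, y0, x1, y1) of 8-connected black regions."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not binary[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill from the first unseen black pixel.
            queue = deque([(x, y)])
            seen[y][x] = True
            x0 = x1 = x
            y0 = y1 = y
            while queue:
                cx, cy = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for ny in range(max(0, cy - 1), min(h, cy + 2)):
                    for nx in range(max(0, cx - 1), min(w, cx + 2)):
                        if binary[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            boxes.append((x0, y0, x1, y1))
    return boxes
```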
21OCR Internals
- Main tasks of an OCR system
- Image acquisition
- Layout recognition
- Text recognition
- Segmentation
- Calculation of Feature Vector Elements
- Classification
- Language Analysis
- Voting
- User assisted correction
- Result exportation
22Calculation of FV Elements: Contour Tracing
- Find a (new) white-black transition
- Follow the edge of the pixels using the MIN or MAX rule
- Keep track of the white-black transitions already traced
- Collect information while going around
- And repeat the process on new shapes ...
24Contour Tracing
- Find a (new) white-black transition
- Follow the edge of the pixels using the MIN or MAX rule
if black(a) then turn(ccw)
else if black(b) then forward
else turn(cw)
(Figure labels: a, b)
25Contour Tracing
- Find a (new) white-black transition
- Follow the edge of the pixels using the MIN or MAX rule
if black(a) then turn(ccw)
else if black(b) then forward
else turn(cw)

if white(b) then turn(cw)
else if white(a) then forward
else turn(ccw)
(Figure labels: a, b)
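The two rules above can be sketched in Python. The figure that defines pixels a and b is not part of this transcript, so the geometry here is an assumption: the tracer walks along pixel edges with black on its right, a is the pixel ahead on the left and b is the pixel ahead on the right. Under that assumption the first rule joins diagonally touching black pixels and the second separates them. Which of the two the slides call MIN and which MAX is not stated, so the code uses a neutral flag name.

```python
# Headings in image coordinates (x to the right, y downward):
# 0 = east, 1 = south, 2 = west, 3 = north. Adding 1 turns clockwise.
STEP = [(1, 0), (0, 1), (-1, 0), (0, -1)]
# Offsets from the current lattice vertex to the two pixels ahead (assumed).
AHEAD_LEFT = [(0, -1), (0, 0), (-1, 0), (-1, -1)]    # pixel "a"
AHEAD_RIGHT = [(0, 0), (-1, 0), (-1, -1), (0, -1)]   # pixel "b"


def trace_contour(img, px, py, join_diagonals=True):
    """Trace the outer contour of the shape whose first black pixel in
    row-major scan order is (px, py); return the vertices visited."""
    h, w = len(img), len(img[0])

    def black(x, y):
        return 0 <= x < w and 0 <= y < h and img[y][x] == 1

    start = (px, py, 0)      # top-left corner of the pixel, heading east
    vx, vy, d = start
    vertices = []
    while True:
        # Walk one pixel edge, then decide the next heading at the new vertex.
        vx, vy = vx + STEP[d][0], vy + STEP[d][1]
        vertices.append((vx, vy))
        a = black(vx + AHEAD_LEFT[d][0], vy + AHEAD_LEFT[d][1])
        b = black(vx + AHEAD_RIGHT[d][0], vy + AHEAD_RIGHT[d][1])
        if join_diagonals:
            # if black(a) turn ccw, else if black(b) forward, else turn cw
            turn = -1 if a else (0 if b else 1)
        else:
            # if white(b) turn cw, else if white(a) forward, else turn ccw
            turn = 1 if not b else (0 if not a else -1)
        d = (d + turn) % 4
        if (vx, vy, d) == start:
            return vertices
```

On a 2x2 image with black pixels only on the main diagonal, the first rule walks around both pixels in one contour, while the second rule closes the contour around the first pixel alone.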
27Some Easily Calculable Data
- Turning CW: $I_n = I_{n-1} + 1$
- Turning CCW: $I_n = I_{n-1} - 1$
- Going forward: $I_n = I_{n-1}$
29Some Easily Calculable Data
- Going up: $I_n = I_{n-1} - X_n$
- Going down: $I_n = I_{n-1} + X_n$
- Going right: $I_n = I_{n-1}$
- Going left: $I_n = I_{n-1}$
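A sketch of how such running sums might be accumulated while walking a closed contour. It reads the two recurrences as a turn counter and an area accumulator, which is an interpretation of the slides rather than something they state. The contour is given as a hypothetical list of unit steps (0 = right, 1 = down, 2 = left, 3 = up, in image coordinates) and `start_x` is the x coordinate of the starting vertex.

```python
def contour_integrals(start_x, headings):
    """Return (turn_balance, area) for a closed contour of unit steps."""
    turn_balance = 0
    area = 0
    x = start_x
    prev = headings[-1]          # closed contour: last step precedes the first
    for d in headings:
        delta = (d - prev) % 4
        if delta == 1:           # turning CW:  I_n = I_{n-1} + 1
            turn_balance += 1
        elif delta == 3:         # turning CCW: I_n = I_{n-1} - 1
            turn_balance -= 1
        if d == 1:               # going down:  I_n = I_{n-1} + X_n
            area += x
        elif d == 3:             # going up:    I_n = I_{n-1} - X_n
            area -= x
        elif d == 0:             # going right: sum unchanged, x advances
            x += 1
        else:                    # going left:  sum unchanged, x retreats
            x -= 1
        prev = d
    return turn_balance, area
```

For a clockwise walk around a single pixel, `contour_integrals(0, [0, 1, 2, 3])` gives a turn balance of 4 and an area of 1. Walking the same square in the opposite direction flips both signs, which is one way to tell an outer contour from the contour of a hole.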
31OCR Internals
- Main tasks of an OCR system
- Image acquisition
- Layout recognition
- Text recognition
- Segmentation
- Calculation of Feature Vector Elements
- Classification
- Language Analysis
- Voting
- User assisted correction
- Result exportation
32Classification Training models
- Restricted Coulomb Energy (RCE) Network (Dr. Leon Cooper, Dr. Charles Elbaum)
(Figure labels: A, B)
34Classification Training models
- Nestor Learning System (NLS)
35Classification Training models
- Nestor Learning System (NLS)
Default radius Rmax
42Classification Training models
- Nestor Learning System (NLS)
Decreased radius
44Classification Training models
- Nestor Learning System (NLS)
Decreased radius Rmin
45Classification Training models
- Nestor Learning System (NLS)
Pass 2
Decreased radius
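The slide sequence above (default radius Rmax, decreased radius, Rmin, pass 2) matches the usual textbook description of RCE training. The sketch below follows that description and is not the Nestor implementation: a new prototype starts with radius Rmax, a prototype that fires for a sample of another class is shrunk (never below Rmin), and passes repeat until nothing changes. Function names and default values are assumptions.

```python
import math


def train_rce(samples, r_max=3.0, r_min=0.1, max_passes=10):
    """samples: list of (point, label). Returns [center, radius, label] items."""
    protos = []
    for _ in range(max_passes):
        changed = False
        for point, label in samples:
            covered = False
            for proto in protos:
                dist = math.dist(point, proto[0])
                if dist < proto[1]:
                    if proto[2] == label:
                        covered = True
                    else:
                        # A wrong-class region fires: shrink it.
                        new_radius = max(r_min, dist)
                        if new_radius != proto[1]:
                            proto[1] = new_radius
                            changed = True
            if not covered:
                # New region, limited by the nearest other-class prototype.
                limit = min(
                    (math.dist(point, p[0]) for p in protos if p[2] != label),
                    default=r_max,
                )
                protos.append([point, max(r_min, min(r_max, limit)), label])
                changed = True
        if not changed:
            break
    return protos


def classify(protos, point):
    """Return the set of labels whose regions contain the point."""
    return {p[2] for p in protos if math.dist(point, p[0]) < p[1]}
```

An empty result means "unknown" and a result with more than one label means "ambiguous"; both cases would be candidates for rejection or for a second opinion.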
46OCR Internals
- Main tasks of an OCR system
- Image acquisition
- Layout recognition
- Text recognition
- Segmentation
- Calculation of Feature Vector Elements
- Classification
- Language Analysis
- Voting
- User assisted correction
- Result exportation
47Voting
- Text recognition in OmniPage Pro
- OCR Engines available
- Caere's engine (codename Salt & Pepper)
- Recognita's engine (codename Paprika)
- ScanSoft's engine (codename Fireworx)
48Voting
- Text recognition in OmniPage Pro
- OCR Engines available
- Caere's engine (Salt & Pepper)
- Uses a Matrix Matching based algorithm
- feature set: 40 cells of an 8x5 grid
- good overall description of a shape
- weaker at detailed structure
- Recognita's engine (Paprika)
- Uses a Contour Tracing based algorithm
- feature set: convex and concave arcs on the contour
- good detailed description of a shape
- weaker at overall structure
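To make the grid feature set concrete, here is a minimal sketch that turns a binary glyph into 40 numbers, one per grid cell, each being the fraction of black pixels in that cell. Reading "8x5" as 8 rows by 5 columns and using the black-pixel fraction as the cell value are both assumptions; the slides do not give these details.

```python
def grid_features(glyph, rows=8, cols=5):
    """Return rows*cols values: the black-pixel fraction of each grid cell."""
    h, w = len(glyph), len(glyph[0])
    black = [0] * (rows * cols)
    total = [0] * (rows * cols)
    for y in range(h):
        for x in range(w):
            # Map the pixel to its grid cell by integer scaling.
            cell = (y * rows // h) * cols + (x * cols // w)
            total[cell] += 1
            black[cell] += glyph[y][x]
    # Cells that receive no pixel (glyph smaller than the grid) stay at 0.
    return [b / t if t else 0.0 for b, t in zip(black, total)]
```

A vector like this summarizes where the ink is overall but blurs small details such as a thin serif, which is consistent with the strengths and weaknesses listed above.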
49Voting
- Text recognition in OmniPage Pro
- OCR Engines available
- Caere's engine (Salt & Pepper)
- Recognita's engine (Paprika)
- ScanSoft's engine (Fireworx)
- Segmentation algorithms
50Voting
- Text recognition in OmniPage Pro
- OCR Engines available
- Caere's engine (Salt & Pepper)
- Recognita's engine (Paprika)
- ScanSoft's engine (Fireworx)
- Segmentation algorithms
- Developed by independent groups
- Have different strengths and weaknesses
51Voting
- Text recognition in OmniPage Pro
- OCR Engines available
- Segmentation algorithms
- Conclusion
- They are complementary
- Let's create a voting system
52Voting
- Voting strategies
- External "black box" voting: 20% gain
(Diagram: Image; engines Paprika, Salt & Pepper, Fireworx; outputs Txt 1, Txt 2, Txt 3; Dict; Vote; Final Txt)
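A minimal sketch of what external, black-box voting could look like: each engine is treated as an opaque source of text, and the vote only sees the outputs. It assumes the three outputs are already aligned word by word, which is a strong simplification, and it uses a plain majority with the dictionary as a tie-breaker. The real voting logic is not described in the slides.

```python
from collections import Counter


def vote_word(candidates, dictionary):
    """Pick one word from the engines' candidates for the same position."""
    counts = Counter(candidates)
    best = max(counts.values())
    # Candidates with the highest count, in engine order.
    tied = [word for word in counts if counts[word] == best]
    if len(tied) == 1:
        return tied[0]
    # No majority: prefer a candidate that is a dictionary word.
    for word in tied:
        if word.lower() in dictionary:
            return word
    return tied[0]               # fall back to the first engine's answer


def vote_text(outputs, dictionary):
    """outputs: one list of words per engine, aligned by position."""
    return [vote_word(words, dictionary) for words in zip(*outputs)]
```

With outputs `["sorne", "text"]`, `["some", "text"]` and `["sume", "tcxt"]` and a dictionary containing "some" and "text", the result is `["some", "text"]`: the first word is settled by the dictionary, the second by majority.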
53Voting
- Voting strategies
- External "black box" voting
- Internal "shape" voting
(Diagram: Image; engines Fireworx, Salt & Pepper, Paprika; outputs Txt 2, Txt 1, Txt 3; Dict; Bronze; Final Txt)
54Voting
Image
Recognize original segmentation
- Paprika
- Original segmentation
- Every independent connected component is a character
- Good segmentation: recognize
- Bad segmentation: reject
K.B.
55Voting
Image
Recognize original segmentation
K.B.
Txt 1
Train adaptive classifier from original shapes
Txt 2
Adaptive K.B.
56Voting
Image
Recognize original segmentation
- Paprika
- Try several segmentations
- Loop if unrecognizable
K.B.
Txt 1
Train adaptive classifier from original shapes
Txt 2
Recognize broken and joined shapes
Adaptive K.B.
Dict
57Voting
Image
Recognize original segmentation
K.B.
Txt 1
Train adaptive classifier from original shapes
Txt 2
Recognize broken and joined shapes
Adaptive K.B.
Train adaptive classifier from ugly shapes
Dict
58Voting
Image
Recognize original segmentation
K.B.
Txt 1
Train adaptive classifier from original shapes
Txt 2
Recognize broken and joined shapes
Adaptive K.B.
Train adaptive classifier from ugly shapes
Dict
Recognize more broken and joined shapes
- Try several segmentations
- Loop if unrecognizable
Txt 3
59Voting
- Voting strategies
- 60% gain
(Diagram: Image; engines Fireworx, Salt & Pepper, Paprika; outputs Txt 1, Txt 1, Txt 3; Dict; Bronze; Final Txt)
60OCR Internals
- Main tasks of an OCR system
- Image acquisition
- Layout recognition
- Text recognition
- User assisted correction
- By the user's random editing...
- Pop-up verifier
- Manual Training
- By proofreading of doubtful words
- Result exportation
61OCR Internals
- Main tasks of an OCR system
- Image acquisition
- Layout recognition
- Text recognition
- User assisted correction
- By the user's random editing...
- By proofreading of doubtful words
- Correct User dictionary
- Changed IntelliTrain
- Remember trained characters
- Apply them on following pages
- Result exportation
62IntelliTrain
- Recognized word: sorneUüng
63IntelliTrain
- Recognized word: sorneUüng
- Fixed word: something
65IntelliTrain
- Recognized word: sorneUüng
- Fixed word: something
- Substitutions found: m → rn
- thi → Uü
66IntelliTrain
- Recognized word: sorneUüng
- Fixed word: something
- Substitutions found: m → rn
- thi → Uü
- Perform automatically
- Learn image pattern and substitution info
- Find similar substituted (blue) text on the current page
- Match against pattern of substitution and correct
- Find such errors on following pages, too
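The text side of this idea can be sketched with Python's standard `difflib`: compare the recognized word with the user's fix to extract the substitutions, then try them on other words. This is only an illustration under stated assumptions. The learning of image patterns described above is not modelled, and the function names and the dictionary check are hypothetical.

```python
import difflib


def find_substitutions(recognized, fixed):
    """Return (wrong, right) pairs for the spans where the two words differ."""
    matcher = difflib.SequenceMatcher(None, recognized, fixed, autojunk=False)
    return [
        (recognized[i1:i2], fixed[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def suggest(word, substitutions, dictionary):
    """Apply one learned substitution at a time; keep dictionary words."""
    found = []
    for wrong, right in substitutions:
        start = word.find(wrong)
        # An empty "wrong" span (pure insertion) is skipped.
        while wrong and start != -1:
            candidate = word[:start] + right + word[start + len(wrong):]
            if candidate in dictionary and candidate not in found:
                found.append(candidate)
            start = word.find(wrong, start + 1)
    return found
```

For the example on the slide, `find_substitutions("sorneUüng", "something")` yields the pairs `("rn", "m")` and `("Uü", "thi")`, and `suggest("tirne", ..., {"time"})` then proposes "time" for a later misrecognized word.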
67OCR Internals
- Main tasks of an OCR system
- Image acquisition
- Layout recognition
- Text recognition
- User assisted correction
- Result exportation
- Combine pages into a Document
- Header / Footer recognition
- Page numbers
- Hyperlinks (e.g. See Table 20)
- Save results
68OCR Internals
- Main tasks of an OCR system
- Image acquisition
- Layout recognition
- Text recognition
- User assisted correction
- Result exportation
- Combine pages into a Document
- Save results
- doc file
- e-mail
- Speech synthesizer