Title: Optical Character Recognition OCR
GIST of OCR Testing

Each entry lists: Sl No, Name of Language, Font Name, Font Size, Test Data Size, Efficiency, Method of Improvement, and Sub modules (DA, DS, CR).

1. Hindi
- Font Name: Major fonts, including digital ones: DVB-TT Yogesh, DVB-TT Surekh, Metal Type
- Font Size: 12 to 36, 11 to 32
- Test Data Size: Sample size of 1000 pages
- Efficiency: 96 to 98%
- Method of Improvement: Spell Checker, Algorithm Improvement, Exhaustive Testing
- Sub modules: DA v, DS v, CR v
2. Marathi
- Font Name: DVB-TT Yogesh, DVB-TT Surekh, Metal Type
- Font Size: 11 to 32
- Test Data Size: Not available
- Efficiency: 90% at word level
- Method of Improvement: Spell Checker, Algorithm Improvement, Exhaustive Testing
- Sub modules: DA v, DS v, CR v

3. Punjabi
- Font Name: Gurumukhi-IIYS, CDAC (PN-TT Amar), Punjabi, Primaja, Anandpur Sahib
- Font Size: 12-20, 14-24, 12-20, 12-20, 12-20
- Test Data Size: Tested on 16 pages for each character font, with 4 font sizes each
- Efficiency: 97%
- Method of Improvement: Spell Checker, Algorithm Improvement, Exhaustive Testing
- Sub modules: DA --, DS --, CR --
4. Telugu
- Font Name: Hemalatha, Harshapriya, SreeLipi, Ann font family
- Font Size: 12-24, 16-20, 14-20, 14
- Test Data Size: Tested on 80-90 pages
- Efficiency: 95 to 97%
- Method of Improvement: Spell Checker, Algorithm Improvement, Exhaustive Testing
- Sub modules: DA --, DS v, CR --

5. Tamil
- Font Name: Any font
- Font Size: 12-36
- Test Data Size: Tested on 600 pages
- Efficiency: 98%
- Method of Improvement: Spell Checker, Algorithm Improvement, Exhaustive Testing
- Sub modules: DA v, DS --, CR v
6. Kannada
- Font Name: Multi Font
- Font Size: Multi Size
- Test Data Size: Not mentioned
- Efficiency: 95 to 96%
- Method of Improvement: Spell Checker, Algorithm Improvement, Exhaustive Testing
- Sub modules: DA --, DS --, CR --

7. Malayalam
- Font Name: ML-TT Kartika, MalBrubhmi, Manorma Fonts, Current Books
- Font Size: 12, 14, 16, 18; 12 to 16; 12 to 16; 12, 14
- Test Data Size: Not mentioned
- Efficiency: 90 to 92%
- Method of Improvement: Spell Checker, Algorithm Improvement, Exhaustive Testing
- Sub modules: DA --, DS --, CR --
8. Oriya
- Font Name: 1. Modular Shree, 2. CDAC Font
- Font Size: 14, 16
- Test Data Size: Sample size of about 100 pages
- Efficiency: 93%
- Method of Improvement: Spell Checker, Algorithm Improvement, Exhaustive Testing
- Sub modules: DA --, DS v, CR --

9. Assamese
- Font Name: Ratnagiri
- Font Size: 12-36
- Test Data Size: Sample size of about 100 pages
- Efficiency: 95%
- Method of Improvement: Spell Checker, Algorithm Improvement, Exhaustive Testing
- Sub modules: DA --, DS --, CR v
10. Bangla
- Font Name: Covers the major fonts used for publishing
- Font Size: 12-36
- Test Data Size: Sample size of about 100 pages
- Efficiency: 97%
- Method of Improvement: Spell Checker, Algorithm Improvement, Exhaustive Testing
- Sub modules: DA v, DS v, CR v

Legend:
- DA = Document Analysis
- DS = Document Synthesis
- CR = Character Recognition Engine
- v = Facility exists
- -- = Facility not available
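
The per-language results above can also be held as structured records, which makes questions such as "which languages offer all three sub modules?" easy to answer. The following is only a sketch: the class name `OcrRow`, its field names, and the helper `with_all_submodules` are hypothetical, and it assumes the v / -- marks for each language appear in the order DA, DS, CR. A single reported efficiency figure is stored as a range with equal ends.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class OcrRow:
    """One language entry from the GIST of OCR Testing table (hypothetical layout)."""
    sl_no: int
    language: str
    efficiency: Tuple[int, int]  # (low, high) in percent, as reported
    da: bool  # Document Analysis facility exists
    ds: bool  # Document Synthesis facility exists
    cr: bool  # Character Recognition Engine facility exists


# Values transcribed from the table; mark order DA, DS, CR is an assumption.
ROWS: List[OcrRow] = [
    OcrRow(1, "Hindi", (96, 98), True, True, True),
    OcrRow(2, "Marathi", (90, 90), True, True, True),  # reported at word level
    OcrRow(3, "Punjabi", (97, 97), False, False, False),
    OcrRow(4, "Telugu", (95, 97), False, True, False),
    OcrRow(5, "Tamil", (98, 98), True, False, True),
    OcrRow(6, "Kannada", (95, 96), False, False, False),
    OcrRow(7, "Malayalam", (90, 92), False, False, False),
    OcrRow(8, "Oriya", (93, 93), False, True, False),
    OcrRow(9, "Assamese", (95, 95), False, False, True),
    OcrRow(10, "Bangla", (97, 97), True, True, True),
]


def with_all_submodules(rows: List[OcrRow]) -> List[str]:
    """Return the languages for which DA, DS and CR are all marked as existing."""
    return [r.language for r in rows if r.da and r.ds and r.cr]


if __name__ == "__main__":
    print(with_all_submodules(ROWS))
```

Font names, font sizes and test data sizes are left out of the record to keep the sketch short; they could be added as plain string fields.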
Summary of the Bangla-OCR Test Result performed by STQC

1. Accuracy, Character Level: 96.43%
2. Accuracy, Word Level: 90.80%
3. Speed (avg. time for converting to .tif or .pc format): 45 characters per second
4. Noise Reduction: 3.00 (AUT is rated on a scale of 1-5, 5 being the best)
5. Skew Detection and Correction: +/- 5 degrees
6. Character Recognition: Text characters varying from 14 points to 36 points
7. Additional Features:
- Reduces speckles and smudges
- Recognizes different papers such as bond, glossy, photocopier, etc.
- Supports offset prints, laser prints, and photocopies of offset and laser prints
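
The summary reports accuracy at both character level and word level, with the word-level figure lower. The source does not say how STQC computed these figures. One common way to measure OCR accuracy, shown here purely as an illustrative assumption, is to compare the OCR output with a reference transcription using edit distance, once over characters and once over words. All function names below are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def accuracy(ref, hyp):
    """Percentage accuracy: 100 * (1 - errors / reference length), floored at 0."""
    if len(ref) == 0:
        raise ValueError("reference must not be empty")
    return max(0.0, 100.0 * (1 - edit_distance(ref, hyp) / len(ref)))


def char_accuracy(ref_text, ocr_text):
    """Accuracy measured over individual characters."""
    return accuracy(ref_text, ocr_text)


def word_accuracy(ref_text, ocr_text):
    """Accuracy measured over whitespace-separated words."""
    return accuracy(ref_text.split(), ocr_text.split())


if __name__ == "__main__":
    reference = "the quick brown fox"
    ocr_output = "the quick brovn fox"  # one wrong character
    print(round(char_accuracy(reference, ocr_output), 2))
    print(word_accuracy(reference, ocr_output))
```

In this example one wrong character costs 1 of 19 characters but 1 of 4 words, which illustrates why a word-level figure is typically lower than the character-level figure for the same output.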