Optical Character Recognition OCR - PowerPoint PPT Presentation

Slides: 10
Provided by: somn
Transcript and Presenter's Notes

1
Optical Character Recognition (OCR)
2
GIST of OCR Testing
Sl No: 1
Name of Language: Hindi
Font Name: Major fonts, including digital ones: DVB-TT Yogesh, DVB-TT Surekh, Metal Type
Font Size: 12 to 36, 11 to 32
Test Data Size: Sample size of 1000 pages
Efficiency: 96 to 98
Method of Improvement: Spell Checker, Algorithm Improvement, Exhaustive Testing
Sub modules (DA, DS, CR): v, v, v
3
GIST of OCR Testing
Sl No: 2
Name of Language: Marathi
Font Name: DVB-TT Yogesh, DVB-TT Surekh, Metal Type
Font Size: 11 to 32
Test Data Size: Not available
Efficiency: 90 at word level
Method of Improvement: Spell Checker, Algorithm Improvement, Exhaustive Testing
Sub modules (DA, DS, CR): v, v, v

Sl No: 3
Name of Language: Punjabi
Font Name: Gurumukhi-IIYS, CDAC (PN-TT Amar), Punjabi, Primaja, Anandpur Sahib
Font Size: 12-20, 14-24, 12-20, 12-20, 12-20
Test Data Size: Tested on 16 pages for each character font, with 4 font sizes each
Efficiency: 97
Method of Improvement: Spell Checker, Algorithm Improvement, Exhaustive Testing
Sub modules (DA, DS, CR): --, --, --
5
GIST of OCR Testing
Sl No: 4
Name of Language: Telugu
Font Name: Hemalatha, Harshapriya, SreeLipi, Ann font family
Font Size: 12-24, 16-20, 14-20, 14
Test Data Size: Tested on 80-90 pages
Efficiency: 95 to 97
Method of Improvement: Spell Checker, Algorithm Improvement, Exhaustive Testing
Sub modules (DA, DS, CR): --, v, --

Sl No: 5
Name of Language: Tamil
Font Name: Any font
Font Size: 12-36
Test Data Size: Tested on 600 pages
Efficiency: 98
Method of Improvement: Spell Checker, Algorithm Improvement, Exhaustive Testing
Sub modules (DA, DS, CR): v, --, v
6
GIST of OCR Testing
Sl No: 6
Name of Language: Kannada
Font Name: Multi Font
Font Size: Multi Size
Test Data Size: Not mentioned
Efficiency: 95 to 96
Method of Improvement: Spell Checker, Algorithm Improvement, Exhaustive Testing
Sub modules (DA, DS, CR): --, --, --

Sl No: 7
Name of Language: Malayalam
Font Name: ML-TT Kartika, MalBrubhmi, Manorma Fonts, Current Books
Font Size: 12, 14, 16, 18; 12 to 16; 12 to 16; 12, 14
Test Data Size: Not mentioned
Efficiency: 90 to 92
Method of Improvement: Spell Checker, Algorithm Improvement, Exhaustive Testing
Sub modules (DA, DS, CR): --, --, --
7
GIST of OCR Testing
Sl No: 8
Name of Language: Oriya
Font Name: 1. Modular Shree; 2. CDAC Font
Font Size: 14, 16
Test Data Size: Sample size of about 100 pages
Efficiency: 93
Method of Improvement: Spell Checker, Algorithm Improvement, Exhaustive Testing
Sub modules (DA, DS, CR): --, v, --

Sl No: 9
Name of Language: Assamese
Font Name: Ratnagiri
Font Size: 12-36
Test Data Size: Sample size of about 100 pages
Efficiency: 95
Method of Improvement: Spell Checker, Algorithm Improvement, Exhaustive Testing
Sub modules (DA, DS, CR): --, --, v
8
GIST of OCR Testing
Sl No: 10
Name of Language: Bangla
Font Name: Covers the major fonts used for publishing
Font Size: 12-36
Test Data Size: Sample size of about 100 pages
Efficiency: 97
Method of Improvement: Spell Checker, Algorithm Improvement, Exhaustive Testing
Sub modules (DA, DS, CR): v, v, v

Legend:
DA: Document Analysis
DS: Document Synthesis
CR: Character Recognition Engine
v: Facility exists
--: Facility not available
9
Summary of the Bangla-OCR Test Result performed by STQC
  • 1. Accuracy, character level: 96.43
  • 2. Accuracy, word level: 90.80
  • 3. Speed (avg. time for converting to .tif or .pc format): 45 characters per second
  • 4. Noise reduction: 3.00 (AUT is rated on a scale of 1-5, 5 being the best)
  • 5. Skew detection and correction: +/- 5 degrees
  • 6. Character recognition: text characters varying from 14 points to 36 points
  • 7. Additional features:
    reduces speckles and smudge;
    recognizes different papers such as bond, glossy, photocopier etc.;
    supports offset prints, laser prints, and photocopies of offset and laser prints
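
The slides report character-level and word-level accuracy but do not say how STQC computed them. A minimal sketch of one common convention follows, assuming accuracy is defined as 100 * (1 - edit distance / reference length), measured over characters or over whitespace-separated words. The function names (edit_distance, accuracy, char_accuracy, word_accuracy) are illustrative and do not come from the presentation.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # drop an item from ref
                            curr[j - 1] + 1,      # insert an item from hyp
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def accuracy(ref, hyp):
    """Assumed definition: 100 * (1 - edit_distance / len(ref)), floored at 0."""
    if len(ref) == 0:
        raise ValueError("reference must not be empty")
    return max(0.0, 100.0 * (1 - edit_distance(ref, hyp) / len(ref)))


def char_accuracy(ref_text, ocr_text):
    """Accuracy measured over individual characters."""
    return accuracy(ref_text, ocr_text)


def word_accuracy(ref_text, ocr_text):
    """Accuracy measured over whitespace-separated words."""
    return accuracy(ref_text.split(), ocr_text.split())


if __name__ == "__main__":
    reference = "the cat sat"
    ocr_output = "the cot sat"
    print(char_accuracy(reference, ocr_output))
    print(word_accuracy(reference, ocr_output))
```

Under this definition a single wrong character costs one character but a whole word, which is consistent with the word-level figure (90.80) being lower than the character-level figure (96.43).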