1
Detection and Extraction of Artificial Text from Videos
Christian Wolf and Jean-Michel Jolion
10th July 2001
Project: France Télécom Research & Development, 001B575
Laboratoire de Reconnaissance de Formes et Vision, Bât. Jules Verne, INSA, 69621 Villeurbanne CEDEX
http://rfv.insa-lyon.fr/wolf,jolion
2
Plan of the presentation
  • Introduction (6 slides)
  • Detection (8 slides)
  • Image enhancement - multiple frame integration (3 slides)
  • Binarisation of the text boxes (10 slides)
  • Setup of the experiments (11 slides)
  • Results: detection, binarisation, OCR (6 slides)
  • Conclusion and outlook (2 slides)

Total: 46 slides
3
Content based image retrieval
Result
Example image
Similarity Function
Indexing phase
4
Similarity measures

similar
similar
Not similar
5
Indexing using Text
Result
Key word
Keyword based Search
Patrick Mayhew
Indexing phase
Patrick Mayhew Min. chargé de l'Irlande de Nord ISRAEL Jerusalem montage T.Nouel ...
6
Video properties
7
Text extraction: general scheme
Image enhancement - Multiple frame integration
Detection of the text in single frames
Tracking
Video
"EVENEMENT" "ACTU" "SPELEOS" "Gouffre Berger
(Isére)" "aujourd'hui" "France 3 Alpes" "un
spéléologue sauveteur"
Segmentation/Binarisation
OCR
8
Detection in single frames
Video
Connected components Analysis
Verification of geometric constraints
Calculation of the gradient
Accumulation
Verification of special cases
Binarisation
Combination of the rectangles
Mathematical Morphology
List of rectangles
9
Detection in single frames: examples
10
A filter for text detection
Accumulation of horizontal gradients.
Justification: text forms a regular texture containing vertical edges which are aligned horizontally.
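The idea of accumulating horizontal gradients can be sketched as follows. This is only an illustration under assumptions that are not in the slides: the image is a list of rows of gray values, the gradient is a simple absolute forward difference, and the window size `window` is a hypothetical parameter.

```python
def accumulated_horizontal_gradient(image, window=9):
    """Sum the horizontal gradient magnitude in a horizontal window.

    image: list of rows, each a list of gray values (assumed layout).
    Returns a map of the same size; high values suggest text-like texture.
    """
    height = len(image)
    width = len(image[0])
    half = window // 2

    # Horizontal derivative: responds to vertical edges such as character strokes.
    grad = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width - 1):
            grad[y][x] = float(abs(image[y][x + 1] - image[y][x]))

    # Accumulate along the row: many vertical edges side by side give a high sum.
    acc = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            lo = max(0, x - half)
            hi = min(width, x + half + 1)
            acc[y][x] = sum(grad[y][lo:hi])
    return acc
```

A row of alternating dark and bright pixels (many vertical edges) yields a high response, while a flat row yields zero.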
11
Mathematical morphology
12
Detection in video sequences
Detection per single frame
List of rectangles per frame
13
Integration of the rectangles into text occurrences
At every new frame, the detected rectangles must be matched with the stored text occurrences.
Suppression of false alarms: examples
All detections
After suppression of false alarms
15
Image enhancement
16
Interpolation: examples
Bi-linear interpolation
Robust bi-linear interpolation
Robust bi-cubic interpolation
17
Interpolation: thresholded examples
Bi-linear interpolation
Robust bi-linear interpolation
Robust bi-cubic interpolation
18
Binarisation
  • Different binarisation algorithms have been implemented and evaluated:
  • Fisher/Otsu and windowed Fisher/Otsu algorithm
  • Yanowitz-Bruckstein
  • Niblack, Sauvola
  • Our adaptive version of the Niblack/Sauvola method

19
Binarisation methods
Yanowitz-Bruckstein: the threshold surface is calculated from the edge information.
(Figure: threshold surface)
Windowed Fisher, Niblack, Sauvola: the threshold surface is calculated from the statistics collected in a window which is shifted across the image.
(Figure: threshold surface)
20
Binarisation by Niblack
Niblack proposed a method which calculates a threshold surface by gliding a rectangular window over the image and calculating statistics on this window:
$T = m + k \cdot s$
m: mean, s: standard deviation, k: parameter (k = -0.2)
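As an illustration, Niblack's threshold T = m + k·s can be computed per pixel as in the sketch below. The window half-size `half`, the list-of-rows image layout and the convention that dark pixels are text are assumptions for this example; the brute-force window statistics are chosen for clarity, not speed.

```python
import math


def niblack_threshold(mean, std, k=-0.2):
    """Niblack's local threshold T = m + k * s."""
    return mean + k * std


def window_values(image, x, y, half):
    """Gray values in the window around (x, y), clipped at the image border."""
    height, width = len(image), len(image[0])
    return [
        image[j][i]
        for j in range(max(0, y - half), min(height, y + half + 1))
        for i in range(max(0, x - half), min(width, x + half + 1))
    ]


def niblack_binarise(image, half=7, k=-0.2):
    """Return a map that is True where a pixel is darker than its threshold."""
    result = []
    for y in range(len(image)):
        row = []
        for x in range(len(image[0])):
            values = window_values(image, x, y, half)
            mean = sum(values) / len(values)
            std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
            row.append(image[y][x] < niblack_threshold(mean, std, k))
        result.append(row)
    return result
```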
21
Binarisation by Niblack: problems
Light textures in the background cause problems: they are classified as text with low contrast.
22
Binarisation: improvement by Sauvola
To overcome these problems, Sauvola et al. proposed an improved formula to calculate the threshold:
$T = m \cdot \left(1 + k \cdot \left(\frac{s}{R} - 1\right)\right)$
m: mean, s: standard deviation, k: parameter (k = 0.5), R: parameter (dynamic range of the standard deviation, R = 128)
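A small sketch of Sauvola's threshold, T = m·(1 + k·(s/R − 1)), is given below, with Niblack's T = m + k·s alongside for comparison. The function names are illustrative only.

```python
def sauvola_threshold(mean, std, k=0.5, dynamic_range=128.0):
    """Sauvola's local threshold T = m * (1 + k * (s / R - 1))."""
    return mean * (1.0 + k * (std / dynamic_range - 1.0))


def niblack_threshold(mean, std, k=-0.2):
    """Niblack's local threshold T = m + k * s, for comparison."""
    return mean + k * std


# A bright background window with a faint texture: m = 200, s = 10.
background_niblack = niblack_threshold(200, 10)
background_sauvola = sauvola_threshold(200, 10)
```

For this window Niblack's threshold is 198, so slightly darker texture pixels are labelled as text, whereas Sauvola's threshold drops to about 107.8 because the low standard deviation pulls the threshold well below the mean.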
23
Binarisation by Sauvola, examples
Original image
24
Improvement: adaptive dynamic range
Fixing the dynamic range at R = 128 may be adequate for document images, but not for text boxes taken from videos: the binarisation will not be correct if the contrast of the image is lower. We therefore set the parameter R to the maximum standard deviation over all windows.
To avoid two passes of the windowing algorithm, the mean and standard deviation can be stored in a table during the first pass and the threshold surface calculated from this data.
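The two-step procedure described here (store the window statistics, then derive R and the threshold surface from the tables) could look like the following sketch. It assumes Sauvola's formula T = m·(1 + k·(s/R − 1)), a list-of-rows image, dark text, and a hypothetical window half-size `half`.

```python
import math


def local_stats(image, half):
    """Single windowing pass: tables of the window mean and standard deviation."""
    height, width = len(image), len(image[0])
    means = [[0.0] * width for _ in range(height)]
    stds = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            values = [
                image[j][i]
                for j in range(max(0, y - half), min(height, y + half + 1))
                for i in range(max(0, x - half), min(width, x + half + 1))
            ]
            m = sum(values) / len(values)
            means[y][x] = m
            stds[y][x] = math.sqrt(sum((v - m) ** 2 for v in values) / len(values))
    return means, stds


def sauvola_surface(means, stds, k=0.5, dynamic_range=None):
    """Threshold surface from the stored tables.

    With dynamic_range=None, R is set to the maximum standard deviation
    found in the table (the adaptive variant described above).
    """
    if dynamic_range is None:
        dynamic_range = max(max(row) for row in stds)
    surface = []
    for mean_row, std_row in zip(means, stds):
        row = []
        for m, s in zip(mean_row, std_row):
            ratio = s / dynamic_range if dynamic_range > 0 else 0.0
            row.append(m * (1.0 + k * (ratio - 1.0)))
        surface.append(row)
    return surface


def binarise(image, surface):
    """True where a pixel is darker than the threshold surface."""
    return [
        [pixel < t for pixel, t in zip(img_row, thr_row)]
        for img_row, thr_row in zip(image, surface)
    ]
```

On a low-contrast box (gray background of 100 with a single text pixel of 70), the fixed R = 128 misses the text pixel, while the adaptive R recovers it.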
25
Improvement: shift of the image range
The strong hypothesis on the gray values (text pixels must be near zero) is not justified for some video text boxes.
Gray value histogram
26
Improvement: shift of the image range
A correction of the image's histogram resolves this problem.
Original image
27
Fast incremental calculation
Mean and variance can be calculated in one pass
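One common way to obtain both statistics in a single pass is to accumulate the sum and the sum of squares, then use var = E[x²] − m². The sketch below also shows an incremental update for a window sliding along a row; the function names and the one-dimensional window are simplifications for illustration, not the presenters' implementation.

```python
def one_pass_stats(values):
    """Mean and (population) variance from a single pass over the data."""
    total = 0.0
    total_sq = 0.0
    count = 0
    for v in values:
        total += v
        total_sq += v * v
        count += 1
    mean = total / count
    # Clamp at zero to guard against small negative rounding errors.
    variance = max(0.0, total_sq / count - mean * mean)
    return mean, variance


def sliding_stats(values, width):
    """Mean and variance of every window of the given width.

    The sums are updated incrementally: add the entering value,
    subtract the leaving one.
    """
    total = float(sum(values[:width]))
    total_sq = float(sum(v * v for v in values[:width]))
    results = []
    for start in range(len(values) - width + 1):
        if start > 0:
            entering = values[start + width - 1]
            leaving = values[start - 1]
            total += entering - leaving
            total_sq += entering * entering - leaving * leaving
        mean = total / width
        results.append((mean, max(0.0, total_sq / width - mean * mean)))
    return results
```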
28
The experiments
  • Description of the experiments
  • The videos used in the experiments
  • Description of the evaluation process (OCR evaluation)
  • Results for text detection, binarisation and OCR

29
Test videos
We performed experiments on five different MPEG-1 videos with a resolution of 384x288.
30
AIM2: commercials
AIM3: news
AIM4: cartoon, news
AIM5: news
31
Video example - France Télécom
22 minutes of video, 33,000 frames
32
The interface to the OCR software
Ideal situation: pass individual (binarised) text boxes to OCR software which recognises the contents box after box.
In reality: we used standard commercial OCR software for our tests. This software has been designed to recognise scanned A4 or US letter pages and cannot directly process text boxes.
33
OCR Page - Manual
An input image, ready for the OCR
34
OCR Output
051Q07Ô7 NVerf 05JQ0707 PUBLICITE IPUBIIÏITE
IPUBLICITE prenez prenez prenez boyard
boyard boyard française française
française FRANCE FRANCE FRANCE
FRANCE FRANCE c'est plus musclé c'est plus
musclé iï 'J fort fort fort fort fort
.fort .fort .fort cotHfUet blé
cotHfUet blé cQtfUet blé uutàfruuk On va
beaucoup loin avec Itineris. Partout Partout
Partout Partout Partout Partout Partout Partout
Partout Partout I22h35 I22h35 I22h35 I22h35
I22h35 PUBLICITE \PUBLICITE \PUBLICITE gt3h55l23h55
l23h55l23h55l23h55l23h55 20h.50120h50
20h50120h50 20h50120h50 ,f ort boyard ,f ort
boyard ,f ort boyard ,f ort boyard 2,4 Kg J 2,4
Kg g 2,4 Kg J 2,4 Kg g 2,4 Kg J 2,4 Kg g 2,4 Kg J
2,4 Kg g 2,4 Kg J II II II II II II
II II II gà dents gà dents gà
dents IIH r Lessive classique lljir Lessive
classique IHT Lessive classique le temps le
temps le temps le temps le temps PUBLICITE
PUBLICITE PUBLICITE I Par Amour du Goût. Il Par
Amour du Goût. I en en en en en en en en
en révolution révolution révolution
35
Post-processing of OCR output
Ground truth:
dimanche 23h55 N Vert 05100707 Berlingo PUBLICITE prenez diffusion simultanée en stéréo sur boyard française FRANCE c'est plus musclé PUBLICITE fort Coral blé complet fruits On va beaucoup Plus loin avec Itineris. Bohême Partout 22h35 PUBLICITE 23h55 20h50 fort fort boyard
Post-processed OCR output:
23h55 051Q07Ô7 PUBLICITE prenez boyard française FRANCE c'est plus musclé fort blé cotHfUet uutàfruuk On va beaucoup loin avec Itineris. Partout I22h35 PUBLICITE \ gt3h55l 20h.50 ,f ort boyard
36
Automatic evaluation using markers
The manual processing of the OCR output (separating the output strings and finding the corresponding input box) is time-consuming and error-prone, especially when the quality of the OCR output is very poor.
Automatic processing of the OCR output can be achieved by placing marker images between the text boxes. The marker boxes contain text which is easily recognised by the OCR software.
In the results section we present results for both types of evaluation.
37
An input image with markers, ready for the OCR
38
OCR evaluation
OCR output:
Tkenchar 037 Tkenchar 037 'gfrançaise 'gfrançaise 'gfrançaise 'gfrançaise Tkenchar 038 Tkenchar 038 Mpe pire de fje pire de fje pire de Tkenchar 039 Tkenchar 039
Raw ground truth:
@S Par Amour du Goût. @S en @S révolution @S la @S française @S le pire de @S 20H45
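Splitting OCR output at the markers could be automated along the following lines. The marker pattern (the word "Tkenchar" followed by a three-digit box number) is inferred from the sample output on this slide and is an assumption, as are the function name and the returned structure.

```python
import re
from collections import defaultdict

# Assumed marker format, inferred from the sample: "Tkenchar 037".
MARKER = re.compile(r"Tkenchar\s+(\d{3})")


def split_by_markers(ocr_text):
    """Group the OCR output by the marker that precedes it.

    Returns a dict mapping the box number to the text chunks found
    after that marker. Repeated markers with nothing between them
    are collapsed.
    """
    boxes = defaultdict(list)
    current = None
    pos = 0
    for match in MARKER.finditer(ocr_text):
        chunk = ocr_text[pos:match.start()].strip()
        if current is not None and chunk:
            boxes[current].append(chunk)
        current = int(match.group(1))
        pos = match.end()
    tail = ocr_text[pos:].strip()
    if current is not None and tail:
        boxes[current].append(tail)
    return dict(boxes)
```

Each recovered chunk can then be compared with the ground-truth string of the box with the same number.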
39
OCR evaluation: Wagner-Fischer
A measure of the resemblance of two character strings: the cost of transforming string A into string B is calculated. Basic transformation operations are used, each associated with a certain cost, and the total cost is minimised.
Operations: substitution (cost), insertion (cost), deletion (cost)
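The Wagner-Fischer algorithm fills a dynamic-programming table in which each cell holds the minimum cost of transforming a prefix of A into a prefix of B. The sketch below uses unit costs by default; the cost values used in the experiments are not given in the slides, so these defaults are an assumption.

```python
def edit_cost(a, b, sub_cost=1, ins_cost=1, del_cost=1):
    """Minimum cost to transform string a into string b (Wagner-Fischer)."""
    rows, cols = len(a) + 1, len(b) + 1
    table = [[0] * cols for _ in range(rows)]

    # Transforming a prefix of a into the empty string: deletions only.
    for i in range(1, rows):
        table[i][0] = i * del_cost
    # Transforming the empty string into a prefix of b: insertions only.
    for j in range(1, cols):
        table[0][j] = j * ins_cost

    for i in range(1, rows):
        for j in range(1, cols):
            same = a[i - 1] == b[j - 1]
            table[i][j] = min(
                table[i - 1][j - 1] + (0 if same else sub_cost),  # substitute or keep
                table[i][j - 1] + ins_cost,                       # insert b[j-1]
                table[i - 1][j] + del_cost,                       # delete a[i-1]
            )
    return table[-1][-1]
```

For example, with unit costs the ground-truth string "20h50" and the OCR string "20h.50" differ by one insertion, so the cost is 1.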
40
Detection results - INA Videos
No suppression of false alarms
41
Binarisation methods: examples
Panels: original image; Fisher; Fisher (windowed); Yanowitz B.; Yanowitz B. PP; Niblack; Sauvola et al.; our method
42
Binarisation methods: examples
Panels: original image; Fisher; Fisher (windowed); Yanowitz B.; Yanowitz B. PP; Niblack; Sauvola et al.; our method
43
OCR results - classification by binarisation method
Robust bi-cubic interpolation
Results obtained using the manual evaluation method (no markers in the input page).
44 pages
44
OCR results - interpolation methods
Results obtained using the automatic evaluation method (including markers in the input page).
Robust bi-cubic interpolation
97 pages
45
Conclusion
  • We developed a system for detection, tracking,
    enhancement and binarisation of text.
  • A detection performance of 93.5% is obtained.
  • We derived a new binarisation method adapted to
    the type of text found in videos.
  • The total recognition rate is surprisingly high,
    given the quality of the text, but not yet good
    enough for indexing purposes.
  • OCR integration problem: no software development
    kits for direct access to the recognition
    functions are available. A collaboration with an OCR
    company seems to be inevitable.

46
Outlook
The perspectives of our work lie in the extension of the existing algorithms to text with more difficult properties, and in the enhancement and deeper study of the existing techniques.
Scene text: the binarisation techniques developed in the last 30 years are aimed either at document images or at images from computer vision. The method we introduced in the framework of this project is an improvement on the work already presented, but the quality of the text is not yet satisfactory. The binarisation of scene text in particular will demand the development of new methods.
Detection recall: we are convinced that the recall of the detection system can still be increased by further research, e.g. on the binarisation technique applied to the map of accumulated gradients.