Title: Visual CAPTCHA with Handwritten Image Analysis
1 Visual CAPTCHA with Handwritten Image Analysis
- Amalia Rusu and Venu Govindaraju
- CEDAR
- University at Buffalo
2 Background on CAPTCHA
- Completely Automated Public Turing test to tell Computers and Humans Apart (CAPTCHA)
- A CAPTCHA should be automatically generated and graded
- Tests should be taken quickly and easily by human users
- Tests should accept virtually all human users and reject software agents
- Tests should resist automatic attack for many years despite advances in technology and prior knowledge of the algorithms
- Exploits the difference in abilities between humans and machines (e.g., text, speech, or facial feature recognition)
- A new formulation of Alan Turing's test: "Can machines think?"
3 Securing Cyberspace Using CAPTCHA
Automatic Authentication Session for Web Services.
- Initialization
- Handwritten CAPTCHA Challenge
- User Response
- Verification
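The four steps above can be sketched as a small challenge-response session. This is only an illustration: `CaptchaSession`, `issue_challenge`, `verify`, and `render_placeholder` are hypothetical names, the image generator is a stub, and the single-use token and case-insensitive comparison are assumptions rather than details from the slides.

```python
import secrets


def render_placeholder(word):
    # Stand-in for the handwritten image generator (not part of the slides).
    return f"[distorted handwritten image, {len(word)} letters]"


class CaptchaSession:
    """Hypothetical server-side session state: one token per challenge."""

    def __init__(self, lexicon):
        # Initialization: the truth words stay on the server.
        self.lexicon = list(lexicon)
        self.pending = {}  # token -> truth word

    def issue_challenge(self):
        # Handwritten CAPTCHA challenge: random word, opaque token.
        word = secrets.choice(self.lexicon)
        token = secrets.token_hex(8)
        self.pending[token] = word
        return token, render_placeholder(word)

    def verify(self, token, response):
        # Verification: a token is consumed on first use, so replays fail.
        truth = self.pending.pop(token, None)
        if truth is None:
            return False
        return response.strip().lower() == truth.lower()


session = CaptchaSession(["Buffalo", "Albany", "Rochester"])
token, image = session.issue_challenge()
answer = session.pending[token]        # peeking at the truth, demo only
print(session.verify(token, answer))   # first attempt with the right word
print(session.verify(token, answer))   # same token again
```

Consuming the token on the first attempt means a software agent cannot retry the same image many times.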
4 Objective
- Develop CAPTCHAs based on the ability gap between humans and machines in handwriting recognition, using Gestalt laws of perception
State of the art in handwriting recognition (HR)

              Lexicon Driven                       Grapheme Model
Lexicon size  time (secs)  Top 1 (%)  Top 2 (%)    time (secs)  Top 1 (%)  Top 2 (%)
10            0.027        96.53      98.73        0.021        96.56      98.77
100           0.044        89.22      94.13        0.031        89.12      94.06
1000          0.144        75.38      86.29        0.089        75.38      86.29
20000         1.827        58.14      66.56        0.994        58.14      66.49

Speed and accuracy of a handwriting recognizer. Feature extraction time is excluded. The testing platform is an Ultra-SPARC. (Xue, Govindaraju 2002)
5 H-CAPTCHA Motivation
- Machine recognition of handwriting is more difficult than recognition of printed text
- Handwriting recognition is a task that humans perform easily and reliably
- Several CAPTCHAs based on machine-printed text have already been broken
  - Greg Mori and Jitendra Malik of UC Berkeley have written a program that can solve EZ-Gimpy with 83% accuracy
  - Thayananthan, Stenger, Torr, and Cipolla of the Cambridge vision group have written a program that can achieve a 93% correct recognition rate against EZ-Gimpy
  - Gabriel Moy, Nathan Jones, Curt Harkless, and Randy Potter of Areté Associates have written a program that can achieve 78% accuracy against Gimpy-R
- CAPTCHAs based on speech or visual features are impractical
- H-CAPTCHAs are thus far unexplored by the research community
6 H-CAPTCHA Challenges
- Generation of random and infinitely many distinct handwritten CAPTCHAs
- Quantifying and exploiting the weaknesses of state-of-the-art handwriting recognizers and OCR systems
- Controlling distortion so that the images are human readable (conform to Gestalt laws) but not machine readable
7 Generation of random and infinitely many distinct handwritten text images
- Use handwritten word images that current recognizers cannot read
  - Handwritten US city name images available from postal applications
  - Collect new handwritten word samples
- Create real (or nonsense) handwritten words and sentences by gluing isolated upper and lower case handwritten characters or word images
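The gluing idea can be sketched with binary images stored as lists of pixel rows (1 = ink, 0 = background). Everything here is an assumption made for illustration: the `glue` and `random_word_image` helpers, the fixed one-pixel gap, and the toy 3x2 glyphs that stand in for scanned handwritten characters.

```python
import random


def glue(glyphs, gap=1):
    """Concatenate equally tall binary glyph images left to right."""
    height = len(glyphs[0])
    if any(len(g) != height for g in glyphs):
        raise ValueError("all glyphs must have the same height")
    rows = []
    for y in range(height):
        row = []
        for i, glyph in enumerate(glyphs):
            if i > 0:
                row.extend([0] * gap)  # background pixels between letters
            row.extend(glyph[y])
        rows.append(row)
    return rows


def random_word_image(samples, length, rng=random):
    """Pick random letters, then a random handwritten sample of each one."""
    letters = [rng.choice(sorted(samples)) for _ in range(length)]
    glyphs = [rng.choice(samples[ch]) for ch in letters]
    return "".join(letters), glue(glyphs)


# Toy glyphs: each letter maps to one or more sample images.
SAMPLES = {
    "a": [[[1, 1], [1, 0], [1, 1]]],
    "b": [[[1, 0], [1, 1], [1, 1]], [[0, 1], [1, 1], [1, 1]]],
}
word, image = random_word_image(SAMPLES, 5, random.Random(0))
```

The function returns the truth word together with the image, since the word is needed later to grade the user's response.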
8 Generation of random and infinitely many distinct handwritten text images
- Use a handwriting distorter for generating human-like samples
- Models that change the trajectory/shape of the letter in a controlled fashion (e.g., Hollerbach's oscillation model)
Original handwritten image (a). Synthetic images (b, c, d, e, f).
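As a rough illustration of a trajectory model, the sketch below uses the usual textbook form of Hollerbach's oscillation model: pen velocity is a horizontal and a vertical sinusoid, plus a constant rightward drift. The function name and parameter names are assumptions, and the slides do not say how the actual distorter is parameterized.

```python
import math


def oscillation_trajectory(a, b, c, omega_x, omega_y, phi_x, phi_y,
                           duration, steps):
    """Pen positions for the velocities
         vx(t) = a * sin(omega_x * t + phi_x) + c
         vy(t) = b * sin(omega_y * t + phi_y)
    integrated in closed form, starting at the origin."""
    points = []
    for i in range(steps + 1):
        t = duration * i / steps
        x = (a / omega_x) * (math.cos(phi_x) - math.cos(omega_x * t + phi_x)) + c * t
        y = (b / omega_y) * (math.cos(phi_y) - math.cos(omega_y * t + phi_y))
        points.append((x, y))
    return points


base = oscillation_trajectory(1.0, 1.0, 0.3, 2 * math.pi, 2 * math.pi,
                              0.0, math.pi / 2, duration=3.0, steps=300)
# A distorted variant: same model, slightly perturbed parameters.
variant = oscillation_trajectory(1.2, 0.8, 0.35, 2 * math.pi, 2 * math.pi,
                                 0.1, math.pi / 2, duration=3.0, steps=300)
```

Small changes to the amplitudes, drift, or phases change slant, letter width, and letter height while keeping the overall shape, which is the kind of controlled variation a distorter needs.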
9 Exploit the Source of Errors for State-of-the-art Handwriting Recognizers
- Word Model Recognizer (WMR) (Kim, Govindaraju 1997)
- Accuscript (Xue, Govindaraju 2002)
10 Source of Errors for State-of-the-art Handwriting Recognizers
- Image quality
  - Background noise, printing surface, writing styles
- Image features
  - Variable stroke width, slope, rotations, stretching, compressing
- Segmentation errors
  - Over-segmentation, merging, fragmentation, ligatures, scrawls
- Recognition errors
  - Confusion with similar lexicon entries, large lexicons
11 Gestalt Laws
- Gestalt psychology is based on the observation that we often experience things that are not a part of our simple sensations
- What we are seeing is an effect of the whole event, not contained in the sum of the parts (holistic approach)
- Organizing principles: Gestalt Laws
- By no means restricted to perception only (e.g., memory)
12 Gestalt Laws
1. Law of closure
2. Law of similarity
OXXXXXX
XOXXXXX
XXOXXXX
XXXOXXX
XXXXOXX
XXXXXOX
XXXXXXO
3. Law of proximity
4. Law of symmetry
13 Gestalt Laws
5. Law of continuity
- Ambiguous segmentation
- Segmentation based on good continuity follows the path of minimal curvature change
- Perceptually implausible segmentation
6. Law of familiarity
- Ambiguous segmentation
- Perceptual segmentation
- Segmentation based on good continuity proves to be erroneous
14 Gestalt Laws
7. Figure and ground
8. Memory
15 Control Overlaps
Gestalt laws: proximity, symmetry, familiarity, continuity, figure and ground
Create horizontal or vertical overlaps
For the same word, use smaller-distance overlaps
For different words, use bigger-distance overlaps
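One plausible reading of a horizontal overlap is to slide two image parts together so that their edge columns share space. The sketch below does this for binary images (lists of rows, 1 = ink); the function name and the use of a pixelwise OR are assumptions, and the slides do not specify the overlap distances.

```python
def overlap_horizontally(left, right, overlap):
    """Join two equally tall binary images so that the last `overlap`
    columns of `left` share space with the first columns of `right`."""
    if len(left) != len(right):
        raise ValueError("images must have the same height")
    if not 0 <= overlap <= min(len(left[0]), len(right[0])):
        raise ValueError("overlap must fit inside both images")
    joined = []
    for lrow, rrow in zip(left, right):
        keep = len(lrow) - overlap
        # Ink from either part survives in the shared columns.
        shared = [a | b for a, b in zip(lrow[keep:], rrow[:overlap])]
        joined.append(lrow[:keep] + shared + rrow[overlap:])
    return joined


left = [[1, 0, 1]]
right = [[0, 1, 1]]
small = overlap_horizontally(left, right, 1)   # e.g. parts of the same word
large = overlap_horizontally(left, right, 2)   # e.g. two different words
```

A vertical overlap follows the same pattern with rows in place of columns.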
16 Control Occlusions
Gestalt laws: closure, proximity, familiarity
Add occlusions by circles, rectangles, or lines with random angles.
Ensure the occlusions are small enough that they do not hide letters completely.
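A minimal sketch of occlusion by circles on a binary image (lists of rows) follows. The function name, the `max_radius` cap used to keep occlusions small, and the `value` parameter (0 to paint background, 1 to paint ink) are assumptions for illustration.

```python
import random


def occlude_with_circles(image, count, max_radius, value=0, rng=random):
    """Return a copy of `image` with `count` filled circles painted on it.
    Keeping `max_radius` well below the letter height is one way to make
    sure no letter is hidden completely."""
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    for _ in range(count):
        r = rng.randint(1, max_radius)
        cx, cy = rng.randrange(width), rng.randrange(height)
        for y in range(max(0, cy - r), min(height, cy + r + 1)):
            for x in range(max(0, cx - r), min(width, cx + r + 1)):
                if (x - cx) ** 2 + (y - cy) ** 2 <= r * r:
                    out[y][x] = value
    return out


word_image = [[1] * 12 for _ in range(6)]   # toy image, all ink
occluded = occlude_with_circles(word_image, count=3, max_radius=2,
                                rng=random.Random(7))
```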
17 Control Occlusions
Gestalt laws: closure, proximity, familiarity
Add occlusions by waves running from left to right across the entire image, with various amplitudes and wavelengths, or rotate them by an angle.
Choose areas with more foreground pixels, in the bottom part of the text image (not too low, not too high).
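The wave occlusion can be sketched as a sine curve painted across a binary image (lists of rows, 1 = ink). Choosing the densest row of the lower half as the wave's center line is only one interpretation of "areas with more foreground pixels"; the function names and the `value` parameter (0 for a background-colored wave, 1 for a black wave) are likewise assumptions.

```python
import math


def densest_row(image, top, bottom):
    """Row index in [top, bottom) with the most foreground (1) pixels."""
    return max(range(top, bottom), key=lambda y: sum(image[y]))


def add_wave(image, amplitude, wavelength, thickness=1, value=0):
    """Return a copy of `image` with a sine wave painted from left to right."""
    height, width = len(image), len(image[0])
    # Assumption: search the lower half, skipping the very last row.
    base = densest_row(image, height // 2, max(height // 2 + 1, height - 1))
    out = [row[:] for row in image]
    for x in range(width):
        center = base + amplitude * math.sin(2 * math.pi * x / wavelength)
        for dy in range(thickness):
            y = round(center) + dy
            if 0 <= y < height:
                out[y][x] = value
    return out


toy = [[0] * 8 for _ in range(6)]
toy[4] = [1] * 8                      # a dense row of ink near the bottom
waved = add_wave(toy, amplitude=1.5, wavelength=4.0)
```

Rotating the wave by an angle, as the slide suggests, would require rotating the sampled curve before painting and is left out of this sketch.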
18 Control Extra Strokes
Gestalt laws: continuity, figure and ground, familiarity
- Add occlusions using the same pixel value as the foreground (black pixels): arcs or lines of various thickness
- Curved strokes could be confused with part of a character
- Use asymmetric strokes so that the pattern cannot be learned
19 Control Letter/Word Orientation
Gestalt laws: memory, internal metrics, familiarity of letters
vertical mirror
horizontal mirror
flip-flop
Change the word orientation entirely, or the orientation of a few letters only.
Use variable rotation, stretching, compressing.
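The mirroring operations are easy to sketch on a binary image stored as a list of rows. The slides do not define the three labels precisely, so the function names below describe the operation itself, and treating flip-flop as both mirrors combined is an assumption. `transform_span` is a hypothetical helper for changing only a few letters.

```python
def mirror_left_right(image):
    """Reverse each row (reflection about a vertical axis)."""
    return [row[::-1] for row in image]


def mirror_top_bottom(image):
    """Reverse the row order (reflection about a horizontal axis)."""
    return [row[:] for row in image[::-1]]


def flip_flop(image):
    """Both mirrors combined, i.e. a 180 degree rotation (assumed meaning)."""
    return mirror_top_bottom(mirror_left_right(image))


def transform_span(image, start, stop, transform):
    """Apply `transform` only to columns start..stop-1 (e.g. a few letters)."""
    patch = transform([row[start:stop] for row in image])
    return [row[:start] + p + row[stop:] for row, p in zip(image, patch)]


word_image = [[1, 0, 0, 1],
              [0, 0, 1, 1]]
partly_mirrored = transform_span(word_image, 0, 2, mirror_left_right)
```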
20 General H-CAPTCHA Generation Algorithm
- Input.
  - Original (randomly selected) handwritten image (existing US city name image, synthetic word image of length 5 to 8 characters, or meaningful sentence)
  - Lexicon containing the image's truth word
- Output.
  - H-CAPTCHA image
- Method.
  - Randomly choose a number of transformations
  - Randomly establish the transformations corresponding to the given number
  - If more than one transformation is chosen, then
    - An a priori order is assigned to each transformation based on experimental results
    - Sort the list of chosen transformations by their a priori order and apply them in sequence, so that the effect is cumulative
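The method above can be sketched in a few lines. The priorities and the toy string "transformations" below are placeholders: the slides say the a priori order comes from experimental results but do not list it, and `generate_h_captcha` is a hypothetical name.

```python
import random


def generate_h_captcha(image, transformations, rng=random):
    """`transformations` is a list of (priority, name, function) tuples.
    Returns the transformed image and the names applied, in order."""
    count = rng.randint(1, len(transformations))   # how many to apply
    chosen = rng.sample(transformations, count)    # which ones
    chosen.sort(key=lambda t: t[0])                # a priori order
    applied = []
    for _, name, transform in chosen:
        image = transform(image)                   # effects accumulate
        applied.append(name)
    return image, applied


# Toy stand-ins: each "transformation" just tags a string.
TRANSFORMS = [
    (2, "overlap", lambda img: img + "+overlap"),
    (1, "mirror", lambda img: img + "+mirror"),
    (3, "occlude", lambda img: img + "+occlude"),
]
result, names = generate_h_captcha("image", TRANSFORMS, random.Random(4))
```

Sorting also covers the single-transformation case, so no separate branch is needed for it.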
21 Testing Results on Machines

HW Recognizer                   WMR     WMR      Accuscript  Accuscript
Lexicon Size                    4,000   40,000   4,000       40,000
Occlusion by circles            35.93   20.28    32.34       17.37
Vertical Overlap                27.88   14.36    12.64       3.94
Horizontal Overlap (Small)      24.35   10.70    2.93        0.60
Black Waves                     16.36   5.33     1.57        0.38
Occlusion by waves              15.43   7.00     10.56       4.28
Horizontal Overlap (Large)      12.93   3.56     2.42        0.36
Overlap Different Words         3.80    0.48     4.43        0.92
Flip-Flop                       0.46    0.14     0.70        0.19
General Image Transformations   9.28    N/A      4.41        N/A

The accuracy (%) of HR on images deformed using the Gestalt laws approach. The number of tested images is 4,127 for each type of transformation. HR running time increases from a few seconds per image for lexicon size 4,000 to several minutes per image for lexicon size 40,000.
22 Testing Results on Humans

Human Tests                  Nr. of Tested Images   Accuracy (%)
All Transforms               1069                   76.08
Occlusion by circles         90                     67.78
Vertical Overlap             88                     87.50
Horizontal Overlap (Small)   90                     76.67
Black Waves                  90                     80.00
Occlusion by waves           87                     80.46
Horizontal Overlap (Large)   89                     65.17

The accuracy of human readers on images deformed using the Gestalt laws approach. A word image is recognized correctly when all characters are recognized.
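The all-characters criterion amounts to exact-match scoring at the word level. A minimal sketch, with made-up responses, might look like this; the function name is hypothetical, and the slides do not say whether letter case was considered, so case sensitivity here is an assumption.

```python
def word_accuracy(responses, truths):
    """Percentage of responses that match the truth word character for character."""
    if len(responses) != len(truths):
        raise ValueError("one response is needed per truth word")
    correct = sum(r == t for r, t in zip(responses, truths))
    return 100.0 * correct / len(truths)


# Hypothetical data: the third response misses one character, so it counts as wrong.
truths = ["buffalo", "albany", "rochester"]
responses = ["buffalo", "albany", "rochestr"]
score = word_accuracy(responses, truths)
```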
23 H-CAPTCHA Evaluation
- No risk of image repetition
  - Image generation is completely automated: words, images, and distortions are chosen at random
- The transformed images cannot be easily normalized or rendered noise free by present computer programs, although the original images must be public knowledge
- Deformed images do not pose problems to humans
  - Human subjects succeeded on our test images
- Tested against the state-of-the-art Word Model Recognizer and Accuscript
  - CAPTCHAs unbroken by state-of-the-art recognizers
24 Future Work
- Develop general methods to attack H-CAPTCHA (e.g., pre- and post-processing techniques)
- Research lexicon-free approaches for handwriting recognition
- Quantify the gap between humans and machines in reading handwriting by category (of distortions, Gestalt laws)
- Parameterize the difficulty levels of Gestalt-based H-CAPTCHAs
25 Questions?