CAPTCHA as a Communications Problem

Provided by: kunja

Transcript and Presenter's Notes

1
CAPTCHA as a Communications Problem

2
What Is a CAPTCHA?

3
Purpose of CAPTCHAs

An automated process for determining whether an
internet form submission was completed by a
legitimate human user or a malicious bot
The most common form is a visual CAPTCHA (an image)

4
Motivation

Almost every major web application uses a visual
CAPTCHA
Many visual CAPTCHAs are easily crackable, since
there is currently no analytical method for
studying the limits of detection for various
CAPTCHA schemes

Easy to crack! ?
5
Through the Lens of IT
6
Immediate Consequences of the New Methodology

7
Evolution of CAPTCHAs
8
Some CAPTCHAs have features that reduce
randomization

Examples
Limiting the letter sequences to English words
decreases the entropy of the source, since the
letters are no longer uniformly distributed
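
The entropy drop can be illustrated with a minimal sketch. The skewed distribution below is a made-up example chosen for easy arithmetic, not measured English letter statistics, and it models single letters rather than whole words.

```python
import math

def entropy_bits(probs):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Uniform source: each of the 26 letters is equally likely.
uniform = [1 / 26] * 26

# Hypothetical non-uniform source (illustrative numbers only):
# one letter takes half the probability mass, the other 25 share the rest.
skewed = [0.5] + [0.5 / 25] * 25

h_uniform = entropy_bits(uniform)
h_skewed = entropy_bits(skewed)

print(f"uniform: {h_uniform:.3f} bits, skewed: {h_skewed:.3f} bits")
```

Any departure from the uniform distribution lowers the entropy, which means an attacker has fewer bits to guess per letter.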

9
Some CAPTCHAs are very difficult for humans to
detect
10
Some CAPTCHAs are impossible for humans to detect

11
What can the capacity of this channel tell us?

A capacity equal to log |X| for this channel
would show that all output from the channel is
potentially decodable by a human observer
The programmer can design distortions based on
this measure
However, humans are sub-optimal decoders, so this
does NOT tell us whether humans can decode the image
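
The significance of this bound, the logarithm of the number of possible inputs, follows from the standard definition of channel capacity. The notation is an assumption here, not taken from the slides: X is the undistorted input image, Y is the distorted output, and the calligraphic X is the set of possible inputs.

```latex
C \;=\; \max_{p(x)} I(X;Y)
  \;=\; \max_{p(x)} \bigl[\, H(X) - H(X \mid Y) \,\bigr]
  \;\le\; \log |\mathcal{X}|
```

Equality requires H(X) = log |X| (a uniform input) and H(X | Y) = 0, which holds exactly when every output image can have come from only one input. That is a statement about an ideal decoder, which is why it says nothing about human performance.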

12
How can we find it?

13
Geometric Distortions

14
Method

Take a fixed alphabet W as input and generate a
single 8x8 image corresponding to each letter
(generating X)

(Figure: example images for the letters A, B, C, D)
15
Find all possible transformations

16
Dealing with non-integer (x2,y2)

17
Result

The system achieved capacity for up to 2 pixels
moved in each direction (although this does not
mean that the other cases were distinguishable
for a human observer)
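
One way to check whether capacity is achieved is to enumerate every distorted output of every input and test whether any output can come from two different inputs. The sketch below makes several assumptions not stated in the slides: the distortion is a whole-image translation by integer offsets, pixels shifted outside the frame are dropped, and the two 4x4 glyphs are hypothetical stand-ins for the 8x8 letter images. It does not reproduce the reported result.

```python
from itertools import product

def shift(img, dx, dy):
    """Translate a binary image by (dx, dy); pixels leaving the frame are dropped."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y, x in product(range(h), range(w)):
        ny, nx = y + dy, x + dx
        if img[y][x] and 0 <= ny < h and 0 <= nx < w:
            out[ny][nx] = 1
    return tuple(map(tuple, out))

def outputs(img, bound):
    """All images reachable by shifting at most `bound` pixels along each axis."""
    offsets = range(-bound, bound + 1)
    return {shift(img, dx, dy) for dx, dy in product(offsets, offsets)}

def achieves_full_capacity(images, bound):
    """True if no output image can be produced by two different inputs."""
    seen = {}
    for label, img in images.items():
        for out in outputs(img, bound):
            if seen.setdefault(out, label) != label:
                return False
    return True

# Hypothetical toy glyphs (not the letter images from the presentation).
glyphs = {
    "dot":  [[0, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]],
    "pair": [[0, 0, 0, 0], [0, 1, 1, 0], [0, 0, 0, 0], [0, 0, 0, 0]],
}

for bound in (1, 2):
    print(bound, achieves_full_capacity(glyphs, bound))
```

In this toy case a bound of 1 keeps the output sets disjoint, while a bound of 2 lets both glyphs be pushed partly or wholly out of the frame, so some outputs become ambiguous.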

18
Too much work for a simple answer!

The number of computations becomes exponentially
large as the image size and the bounds on
distortion grow.
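
A rough count shows how quickly the search space grows. This sketch assumes each pixel can be displaced independently by up to `bound` pixels along each axis, which is only one possible reading of the setup; the function name is hypothetical.

```python
def num_distortions(side, bound):
    """Count distortions of a side x side image when every pixel moves
    independently by an integer offset in [-bound, bound] on each axis."""
    per_pixel = (2 * bound + 1) ** 2      # choices of (dx, dy) for one pixel
    return per_pixel ** (side * side)     # independent choice for every pixel

for side in (2, 4, 8):
    count = num_distortions(side, bound=2)
    print(f"{side}x{side}: about 10^{len(str(count)) - 1} distortions")
```

Under this assumption the count is exponential in the number of pixels, so exhaustive enumeration is only feasible for very small images.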

19
Other limitations of model

This system treats the input and output as a
one-channel-use problem. In reality, many OCR
systems segment the image into individual letters,
so W can be modeled as a random process with a
smaller alphabet.
Also, some distortion systems (channels) have
knowledge of their input and will bound the
amount of distortion based on this information

20
Future Directions

Treatment of geometric distortions as time-warped
data, or correlations between inputs
Quantization of values and power limitation of
the pixel intensity
