User Interfaces and Algorithms for Fighting Phishing

About This Presentation

Title:

User Interfaces and Algorithms for Fighting Phishing

Description:

Bad guys may try to subvert search engines. Only works if legitimate page is indexed ... Mandy Holbrook. Norman Sadeh. Anthony Tomasic. Umut Topkara ...

Slides: 89

Provided by: jason203

Learn more at: http://www.cs.cmu.edu

Transcript and Presenter's Notes

Title: User Interfaces and Algorithms for Fighting Phishing

1
User Interfaces and Algorithms for Fighting
Phishing
Jason I. Hong, Carnegie Mellon University
2
Everyday Privacy and Security Problem
3
This entire process is known as phishing
4
Phishing is a Plague on the Internet

An estimated 3.5 million people have fallen for
phishing
Estimated $350m-$2b in direct losses a year
31,000 unique phishing sites reported in June 2007
Easier (and safer) to phish than rob a bank

5
Project Supporting Trust Decisions

Goal: help people make better online trust
decisions
Currently focusing on anti-phishing
Large multi-disciplinary team project at CMU
Computer science, human-computer interaction,
public policy, social and decision sciences, CERT

6
Our Multi-Pronged Approach

Human side
Interviews to understand decision-making
PhishGuru embedded training
Anti-Phishing Phil game
Understanding effectiveness of browser warnings
Computer side
PILFER email anti-phishing filter
CANTINA web anti-phishing algorithm

Automate where possible, support where necessary
7
Our Multi-Pronged Approach

Human side
Interviews to understand decision-making
PhishGuru embedded training
Anti-Phishing Phil game
Understanding effectiveness of browser warnings
Computer side
PILFER email anti-phishing filter
CANTINA web anti-phishing algorithm

What do users know about phishing?
8
Interview Study

Interviewed 40 Internet users (35 non-experts)
Mental-model interviews included an email role
play and open-ended questions
Brief overview of results (see paper for details)
J. Downs, M. Holbrook, and L. Cranor. Decision
Strategies and Susceptibility to Phishing. In
Proceedings of the 2006 Symposium On Usable
Privacy and Security, 12-14 July 2006,
Pittsburgh, PA.

9
Little Knowledge of Phishing

Only about half knew the meaning of the term
"phishing"
"Something to do with the band Phish, I take it."

10
Little Attention Paid to URLs

Only 55% of participants said they had ever
noticed an unexpected or strange-looking URL
Most did not consider them to be suspicious

11
Some Knowledge of Scams

55% of participants reported being cautious when
email asks for sensitive financial info
But very few reported being suspicious of email
asking for passwords
Knowledge of financial phish reduced likelihood
of falling for these scams
But did not transfer to other scams, such as an
amazon.com password phish

12
Naive Evaluation Strategies

The most frequent strategies don't help much in
identifying phish
This email appears to be for me
It's normal to hear from companies you do
business with
Reputable companies will send emails
"I will probably give them the information that
they asked for. And I would assume that I had
already given them that information at some point
so I will feel comfortable giving it to them
again."

13
Summary of Findings

People are generally not good at identifying scams
they haven't specifically seen before
People don't use good strategies to protect
themselves
Currently running large-scale survey across
multiple cities in the US to gather more data
Amazon also active in looking for fake domain
names

14
Outline

Human side
Interviews to understand decision-making
PhishGuru embedded training
Anti-Phishing Phil game
Understanding effectiveness of browser warnings
Computer side
PILFER email anti-phishing filter
CANTINA web anti-phishing algorithm

Can we train people not to fall for phish?
15
Web Site Training Study

Laboratory study of 28 non-expert computer users
Asked participants to evaluate 20 web sites
Control group evaluated 10 web sites, took 15 min
break to read email or play solitaire, evaluated
10 more web sites
Experimental group same as above, but spent 15
min break reading web-based training materials
Experimental group performed significantly better
identifying phish after training
Less reliance on professional-looking designs
Looking at and understanding URLs
Web site asks for too much information

People can learn from web-based training
materials, if only we could get them to read
them!
16
How Do We Get People Trained?

Most people don't proactively look for training
materials on the web
Companies send security notice emails to
employees and/or customers
We hypothesized these tend to be ignored
Too much to read
People don't consider them relevant
People think they already know how to protect
themselves
Led us to idea of embedded training

17
Embedded Training

Can we train people during their normal use of
email to avoid phishing attacks?
Periodically, people get sent a training email
Training email looks like a phishing attack
If person falls for it, intervention warns and
highlights what cues to look for in succinct and
engaging format
P. Kumaraguru, Y. Rhee, A. Acquisti, L. Cranor,
J. Hong, and E. Nunge. Protecting People from
Phishing: The Design and Evaluation of an
Embedded Training Email System. CHI 2007.

18
Embedded Training Example
Subject: Revision to Your Amazon.com Information
Please login and enter your information
http://www.amazon.com/exec/obidos/sign-in.html
19
Intervention 1: Diagram
20
Intervention 1: Diagram
Explains why they are seeing this message
21
Intervention 1: Diagram
Explains what a phishing scam is
22
Intervention 1: Diagram
Explains how to identify a phishing scam
23
Intervention 1: Diagram
Explains simple things you can do to protect yourself
24
Intervention 2: Comic Strip
25
Intervention 2: Comic Strip
26
Intervention 2: Comic Strip
27
Embedded Training Evaluation 1

Lab study comparing our prototypes to standard
security notices
Group A: eBay, PayPal notices
Group B: Diagram that explains phishing
Group C: Comic strip that tells a story
10 participants in each condition (30 total)
Screened so we only have novices
Go through 19 emails, 4 phishing attacks
scattered throughout, 2 training emails too
Role play as Bobby Smith at Cognix Inc

28
Embedded Training Results
29
Embedded Training Results

Existing practice of security notices is
ineffective
Diagram intervention somewhat better
Though people still fell for final phish
Comic strip intervention worked best
Statistically significant
Combination of less text, graphics, story?

30
Evaluation 2

New questions
Have to fall for phishing email to be effective?
How well do people retain knowledge?
Roughly same experiment as before
Role play as Bobby Smith at Cognix Inc, go through
16 emails
Embedded condition means have to fall for our
email
Non-embedded means we just send the comic strip
Also had people come back after 1 week
To appear in APWG eCrime Researchers Summit (Oct
4-5 at CMU)

32
Results of Evaluation 2

Have to fall for phishing email to be effective?
How well do people retain knowledge after a week?

38
Anti-Phishing Phil

A game to teach people not to fall for phish
Embedded training focuses on email
Our game focuses on web browser
Goals
How to parse URLs
Where to look for URLs
Use search engines for help
Try the game!
http://cups.cs.cmu.edu/antiphishing_phil
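
The URL-parsing skill the game teaches can be illustrated with a short Python sketch using only the standard library. The helper name `apparent_site` and the `example.net` URL are hypothetical, and treating the last two host labels as the "site" is a simplifying assumption that fails for suffixes such as `.co.uk`.

```python
from urllib.parse import urlparse

def apparent_site(url):
    """Return the last two labels of the host name, or the whole host if it is an IP address."""
    host = urlparse(url).hostname or ""
    labels = host.split(".")
    if all(label.isdigit() for label in labels):
        return host  # raw IP address: there is no domain name to inspect
    return ".".join(labels[-2:])

# The second URL is a made-up spoof: it starts with a familiar name,
# but the part that matters is at the right-hand end of the host.
for url in [
    "http://www.amazon.com/exec/obidos/sign-in.html",
    "http://www.amazon.com.example.net/sign-in.html",
    "http://128.23.34.45/blah",
]:
    print(url, "->", apparent_site(url))
```

The point of the sketch is only that the site a link leads to is determined by the end of the host name, not by the familiar-looking text at its start.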

39
Anti-Phishing Phil
45
Evaluation of Anti-Phishing Phil

Test participants' ability to identify phishing
web sites before and after training (up to 15 min)
10 web sites before training, 10 after,
randomized order
Three conditions
Web-based phishing education
Printed tutorial of our materials
Anti-phishing Phil
14 participants in each condition
Screened out security experts
Younger, college students

46
Results

No statistically significant difference in false
negatives among the three groups
Actually a phish, but participant thinks it's not
Unsure why, though game group had fewest false
positives
Press release last month, 50k new users
Still analyzing results
High knowledge retention 1 week later by
participants
Faster at identifying phish (12 seconds to 6
seconds)
Banks, non-profits, consulting firms, Air Force,
ISPs

49
Outline

Human side
Interviews to understand decision-making
PhishGuru embedded training
Anti-Phishing Phil game
Understanding effectiveness of browser warnings
Computer side
PILFER email anti-phishing filter
CANTINA web anti-phishing algorithm

Do people see, understand, and believe web
browser warnings?
50
Screenshots
Internet Explorer Passive Warning
51
Screenshots
Internet Explorer Active Block
52
Screenshots
Mozilla Firefox Active Block
53
How Effective are these Warnings?

Tested four conditions
Firefox Active Block
IE Active Block
IE Passive Warning
Control (no warnings or blocks)
Shopping Study
Set up some fake phishing pages and added them to
blacklists
Users were phished after purchases
Real email accounts and personal information
Spoofing eBay and Amazon (2 phish/user)
We observed them interact with the warnings

54
How Effective are these Warnings?
55
How Effective are these Warnings?
56
Discussion of Phish Warnings

Nearly everyone will fall for highly contextual
phish
Passive IE warning failed for many reasons
Didn't interrupt the main task
Slow to appear (up to 5 seconds)
Not clear what the right action was
Looked too much like other ignorable warnings
(habituation)
Bug in implementation, any keystroke dismisses

57
Screenshots
Internet Explorer Passive Warning
58
Discussion of Phish Warnings

Active IE warnings
Most saw it but did not believe it
"Since it gave me the option of still proceeding
to the website, I figured it couldn't be that
bad"
Some element of habituation (looks like other
warnings)
Saw two pathological cases

59
Screenshots
Internet Explorer Active Block
60
A Science of Warnings

See the warning?
Understand?
Believe it?
Motivated?
Planning on refining this model for computer
warnings

61
Outline

Human side
Interviews to understand decision-making
PhishGuru embedded training
Anti-Phishing Phil game
Understanding effectiveness of browser warnings
Computer side
PILFER email anti-phishing filter
CANTINA web anti-phishing algorithm

Can we automatically detect phish emails?
62
PILFER Email Anti-Phishing Filter

Philosophy: automate where possible, support
where necessary
Goal: create email filter that detects phishing
emails
Spam filters well-explored, but how good for
phishing?
Can we create a custom filter for phishing?
I. Fette, N. Sadeh, A. Tomasic. Learning to
Detect Phishing Emails. In WWW 2007.

63
PILFER Email Anti-Phishing Filter

Heuristics combined in SVM
IP addresses in link (http://128.23.34.45/blah)
Age of linked-to domains (younger domains likely
phishing)
Non-matching URLs (ex. most links point to
PayPal)
"Click here to restore your account"
HTML email
Number of links
Number of domain names in links
Number of dots in URLs (http://www.paypal.update.example.com/update.cgi)
JavaScript
SpamAssassin rating
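
A few of the heuristics listed above can be sketched as a feature extractor in Python. This is an illustrative sketch, not PILFER's actual code: the names `LinkCollector` and `extract_features`, the keyword pattern for "click here" links, and the sample email are all assumptions. Host names stand in for domain names, and domain age and the SpamAssassin rating are left out because they need external services.

```python
import re
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urlparse

class LinkCollector(HTMLParser):
    """Collect (href, link text) pairs and note whether a <script> tag appears."""
    def __init__(self):
        super().__init__()
        self.links = []
        self.has_script = False
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href") or ""
            self._text = []
        elif tag == "script":
            self.has_script = True

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None

IP_HOST = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")
HERE_TEXT = re.compile(r"\b(click|here)\b", re.I)  # assumed keyword list

def extract_features(html_body):
    """Return a dict of numeric features for one HTML email body."""
    parser = LinkCollector()
    parser.feed(html_body)
    parser.close()
    hosts = [urlparse(href).hostname or "" for href, _ in parser.links]
    counts = Counter(h for h in hosts if h)
    # "Modal" host: the one most links point to (e.g. the real PayPal site).
    modal = counts.most_common(1)[0][0] if counts else ""
    here_elsewhere = any(
        HERE_TEXT.search(text) and host and host != modal
        for (_, text), host in zip(parser.links, hosts)
    )
    return {
        "ip_link": int(any(IP_HOST.match(h) for h in hosts)),
        "num_links": len(parser.links),
        "num_domains": len(counts),  # distinct host names, as a stand-in
        "max_dots": max((href.count(".") for href, _ in parser.links), default=0),
        "javascript": int(parser.has_script or "javascript:" in html_body.lower()),
        "here_link_to_other_domain": int(here_elsewhere),
    }

# Made-up email body for illustration only.
sample = (
    '<html><body><a href="http://www.paypal.com/">PayPal</a> '
    '<a href="http://www.paypal.com/help">Help</a> '
    '<a href="http://128.23.34.45/blah">Click here to restore your account</a>'
    "</body></html>"
)
print(extract_features(sample))
```

Each email becomes a small numeric vector, which is the form a classifier such as the SVM mentioned on the slide would take as input.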

64
PILFER Evaluation

Ham corpora from SpamAssassin (2002 and 2003)
6950 good emails
Phishingcorpus
860 phishing emails

65
PILFER Evaluation
66
PILFER Evaluation

PILFER now implemented as SpamAssassin filter
Alas, Ian has left for Google

67
Outline

Human side
Interviews to understand decision-making
PhishGuru embedded training
Anti-Phishing Phil game
Understanding effectiveness of browser warnings
Computer side
PILFER email anti-phishing filter
CANTINA web anti-phishing algorithm

How good is phish detection for web sites? Can
we do better?
68
Lots of Phish Detection Algorithms

Dozens of anti-phishing toolbars offered
Built into security software suites
Offered by ISPs
Free downloads (132 on download.com)
Built into latest version of popular web browsers

69
Lots of Phish Detection Algorithms

Dozens of anti-phishing toolbars offered
Built into security software suites
Offered by ISPs
Free downloads (132 on download.com)
Built into latest version of popular web browsers
But how well do they detect phish?
Short answer: still room for improvement

70
Testing the Toolbars

November 2006: automated evaluation of 10
toolbars
Used phishtank.com and APWG as source of phishing
URLs
Evaluated 100 phish and 510 legitimate sites
Y. Zhang, S. Egelman, L. Cranor, J. Hong.
Phinding Phish: An Evaluation of Anti-Phishing
Toolbars. NDSS 2007.
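
The two quantities such an evaluation reports, the share of phish caught and the share of legitimate sites wrongly flagged, can be computed as in the sketch below. The function name `score_toolbar` is hypothetical and the verdicts are made up purely to exercise the arithmetic; they are not results from the study.

```python
def score_toolbar(verdicts, labels):
    """Score one toolbar. Both arguments are parallel lists of booleans; True means 'phish'."""
    phish_total = sum(labels)
    legit_total = len(labels) - phish_total
    caught = sum(1 for v, y in zip(verdicts, labels) if v and y)
    false_pos = sum(1 for v, y in zip(verdicts, labels) if v and not y)
    return {
        "catch_rate": caught / phish_total,
        "false_positive_rate": false_pos / legit_total,
    }

# Test-set sizes follow the slide (100 phish, 510 legitimate sites).
labels = [True] * 100 + [False] * 510
# MADE-UP verdicts for a hypothetical toolbar: flags 80 phish and 51 legitimate sites.
verdicts = [True] * 80 + [False] * 20 + [True] * 51 + [False] * 459
print(score_toolbar(verdicts, labels))
```

Reporting both numbers matters because a toolbar can reach a high catch rate simply by flagging many sites, which shows up as a high false positive rate.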

71
Testbed System Architecture
72
Results
(Chart: toolbar results on PhishTank phish; callouts "38% false positives" and "1% false positives")
73
(Chart: toolbar results on APWG phish)
74
Results

Only one toolbar >90% accuracy (but high false
positives)
Several catch 70-85% of phish with few false
positives

75
Results

Only one toolbar >90% accuracy (but high false
positives)
Several catch 70-85% of phish with few false
positives
Can we do better?
Can we use search engines to help find phish?
Y. Zhang, J. Hong, L. Cranor. CANTINA: A
Content-Based Approach to Detecting Phishing Web
Sites. In WWW 2007.

76
Robust Hyperlinks

Developed by Phelps and Wilensky to solve the "404
not found" problem
Key idea was to add a lexical signature to URLs
that could be fed to a search engine if URL
failed
Ex. http://abc.com/page.html?sig=word1+word2+...+word5
How to generate signature?
Found that TF-IDF was fairly effective
Informal evaluation found five words was
sufficient for most web pages
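
A minimal sketch of generating such a lexical signature with TF-IDF follows. The function names, the toy corpus, the smoothed IDF formula, and the `sig` parameter name are illustrative assumptions, not details confirmed by the robust hyperlinks work.

```python
import math
import re
from collections import Counter
from urllib.parse import urlencode

def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())

def lexical_signature(page, corpus, k=5):
    """Return the k words of `page` with the highest TF-IDF weight relative to `corpus`."""
    tf = Counter(tokenize(page))
    doc_sets = [set(tokenize(doc)) for doc in corpus]
    n_docs = len(doc_sets)

    def weight(word):
        df = sum(1 for words in doc_sets if word in words)
        # Smoothed IDF (an assumption): words found in every document get weight 0.
        return tf[word] * math.log((1 + n_docs) / (1 + df))

    # Highest weight first; ties are broken alphabetically for determinism.
    return sorted(tf, key=lambda w: (-weight(w), w))[:k]

def signed_url(url, words):
    """Append the signature words to the URL as a query parameter."""
    return url + "?" + urlencode({"sig": " ".join(words)})

# Toy corpus, invented for the example.
corpus = ["the cat sat", "the dog sat", "the bird flew"]
words = lexical_signature("the cat cat sat", corpus, k=2)
print(signed_url("http://abc.com/page.html", words))
```

Words that are frequent on the page but rare elsewhere rank highest, which is why a handful of them is often enough for a search engine to find the page again.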

77
Adapting TF-IDF for Anti-Phishing

Can same basic approach be used for
anti-phishing?
Scammers often directly copy web pages
With Google search engine, fake should have low
page rank

Fake
Real
78
How CANTINA Works

Given a web page, calculate TF-IDF score for
each word in that page
Take five words with highest TF-IDF weights
Feed these five words into a search engine
(Google)
If domain name of current web page is in top N
search results, we consider it legitimate
N=30 worked well
No improvement by increasing N
Later, added some heuristics to reduce false
positives
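
The decision step described above can be sketched as follows, assuming the five highest-weighted words have already been computed. The `search` argument is a hypothetical callable that returns result URLs for a query; `fake_search` and its result URLs are invented stand-ins for a real search engine, and comparing host names with a leading "www." removed is a simplification of domain matching.

```python
from urllib.parse import urlparse

def domain_of(url):
    """Lowercased host name with a leading 'www.' removed (a simplification)."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host

def looks_legitimate(page_url, signature_words, search, top_n=30):
    """Return True if the page's domain is among the top N search results for its signature."""
    results = search(" ".join(signature_words))[:top_n]
    page_domain = domain_of(page_url)
    return bool(page_domain) and any(domain_of(r) == page_domain for r in results)

def fake_search(query):
    # Stand-in for a real search engine; these result URLs are invented.
    return ["http://www.ebay.com/", "http://signin.ebay.com/help"]

words = ["eBay", "user", "sign", "help", "forgot"]
print(looks_legitimate("http://www.ebay.com/signin.html", words, fake_search))
print(looks_legitimate("http://www.ebay.com.example.net/signin.html", words, fake_search))
```

Passing the search function in as a parameter keeps the sketch independent of any particular search API and makes it easy to test with canned results.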

79
Fake
eBay, user, sign, help, forgot
80
Real
eBay, user, sign, help, forgot
83
Evaluating CANTINA
PhishTank
84
Weaknesses in CANTINA