Title: Accurately Detect Parked Domain Typosquatting Attacks
1. Accurately Detect Parked Domain Typo-squatting Attacks
- Mishari Almishari and Xiaowei Yang
- University of California, Irvine
- Donald Bren School of Information and Computer Sciences
- Computer Science Department
- malmisha, xwy_at_ics.uci.edu
2. Introduction
- Typo-squatting refers to the act of registering domain names that are typographical errors of other popular domain names (target domains) in order to hijack the traffic intended for those popular domain names
- Hijacking for malicious purposes
- Hijacking for financial purposes
3. Goals and Contributions
- Accurately identify typo-squatting domains
- Measure the amount of traffic hijacked by squatters
- Build a system that would reduce the amount of traffic to such domains
4. Methodology
- Identifying typos
  - Use an edit distance of 1 as our typo definition
  - Less controversial as a typo definition
  - Users are more prone to make a single error than two or more
  - A study shows that 90-95% of spelling errors consist of a single mistake
  - Nevertheless, extending the typo definition is worth exploring
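The edit-distance-1 typo definition above can be illustrated with a minimal sketch. It assumes plain Levenshtein distance (one insertion, deletion, or substitution); the slides do not say which edit-distance variant was used, and the function name `is_edit_distance_one` is hypothetical.

```python
def is_edit_distance_one(a: str, b: str) -> bool:
    """Return True if a and b differ by exactly one insertion,
    deletion, or substitution (assumed Levenshtein distance of 1)."""
    if a == b:
        return False
    if abs(len(a) - len(b)) > 1:
        return False
    # Make `a` the shorter (or equal-length) string.
    if len(a) > len(b):
        a, b = b, a
    # Skip the common prefix.
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1
    if len(a) == len(b):
        # Substitution: everything after the mismatch must agree.
        return a[i + 1:] == b[i + 1:]
    # Insertion/deletion: dropping b[i] must make the rest agree.
    return a[i:] == b[i + 1:]


if __name__ == "__main__":
    for candidate in ["www.facebok.com", "wwwfacebook.com", "www.faecbook.com"]:
        print(candidate, is_edit_distance_one(candidate, "www.facebook.com"))
```

Under this assumption a transposition such as `faecbook` counts as two edits and is therefore not treated as a typo; a Damerau-style distance would treat it differently.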
5. Methodology
- Identifying hijacking attempts
  - Is being a typo domain enough?
    - No, 55% are not squatting
  - What are the common hijacking indicators?
    - Parked Domain / Ads Listing (88.5%)
    - Offensive Adult Content (3.1%)
    - Domain For Sale (2.1%)
    - Forwarding To Another Domain (8.3%)
  - How to identify a parked domain?
    - Use Machine Learning Classifier (96) (100)
6. Experiment
- Measure the amount of hijacked traffic
  - UCI DNS traces covering 8 months
  - 500 popular domains from the Alexa website
- Steps
  - Pre-processing of DNS queries
  - Finding typo domains
  - Finding typo-squatting domains
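The first two steps above can be sketched as follows. This is only an illustration under stated assumptions: the normalization rules (lowercasing, stripping a trailing dot), the character set in `ALPHABET`, and the function names are hypothetical and not taken from the slides. The third step, deciding whether a typo domain is actually squatting, needs the parked-domain classifier and is not covered here.

```python
from collections import Counter
import string

# Assumed set of characters that may appear in a typed domain name.
ALPHABET = string.ascii_lowercase + string.digits + "-."


def distance_one_variants(domain: str) -> set:
    """Generate every string one insertion, deletion, or substitution
    away from `domain` (excluding `domain` itself)."""
    variants = set()
    for i in range(len(domain) + 1):
        head, tail = domain[:i], domain[i:]
        if tail:
            variants.add(head + tail[1:])              # deletion
            for c in ALPHABET:
                variants.add(head + c + tail[1:])      # substitution
        for c in ALPHABET:
            variants.add(head + c + tail)              # insertion
    variants.discard(domain)
    return variants


def find_typo_hits(queries, targets) -> Counter:
    """Count queries whose name is a distance-1 variant of a target.
    Keys are (typo_domain, target_domain) pairs."""
    target_set = set(targets)
    typo_to_target = {}
    for target in targets:
        for variant in distance_one_variants(target):
            typo_to_target.setdefault(variant, target)
    hits = Counter()
    for query in queries:
        # Pre-processing (assumed): normalize case and trailing dot.
        name = query.strip().lower().rstrip(".")
        if name in typo_to_target and name not in target_set:
            hits[(name, typo_to_target[name])] += 1
    return hits


if __name__ == "__main__":
    sample_queries = ["www.facebook.com", "WWW.FACEBOK.COM.",
                      "wwwfacebook.com", "www.facebok.com", "example.org"]
    print(find_typo_hits(sample_queries, ["www.facebook.com"]))
```

Generating variants once per target and looking queries up in a dictionary avoids comparing every query against every target, which matters when the trace is large and the target list is small.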
7. Measurement Results
- Typo-squatting hits
  - Total of 23,989
  - Ranges from 1,675 to 3,621
- Typo-squatting domains
  - Total of 1,786 domains
  - Ranges from 347 to 530 domains
8. Measurement Results
- Maximum hits to typo-squatting domains
  - Could reach up to 649 hits for one domain in one month
- Average hijack ratio
  - Low
  - 0.33% to 1%
9. Measurement Results
- Maximum hijack ratio
  - From 82% to 100%
- Most squatted domains
  - Most hijacked is www.facebook.com
  - Second most hijacked is www.youtube.com
10. Measurement Results
- Typo characterization
  - 14% of Cat 1 is missing dot
  - 66% of Cat 2 is from neighbor keys
  - 26% of Cat 2 is the same as one before or after
  - 42% is from neighbor keys
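A characterization like the one above could be computed roughly as follows. This is a sketch under assumptions: the slides do not define the categories, so the labels, the QWERTY layout model, and the reading of "same as one before or after" as an inserted character that repeats an adjacent character are all hypothetical.

```python
# Approximate QWERTY geometry: each row is shifted right by a stagger.
QWERTY_ROWS = ["1234567890", "qwertyuiop", "asdfghjkl", "zxcvbnm"]
ROW_OFFSETS = [0.0, 0.5, 0.75, 1.25]


def key_position(ch):
    """Return (row, x) for a key, or None if the key is not modeled."""
    for row, keys in enumerate(QWERTY_ROWS):
        col = keys.find(ch)
        if col != -1:
            return row, col + ROW_OFFSETS[row]
    return None


def is_neighbor(a, b):
    """True if keys a and b are adjacent in the assumed layout."""
    pa, pb = key_position(a), key_position(b)
    if pa is None or pb is None or a == b:
        return False
    return abs(pa[0] - pb[0]) <= 1 and abs(pa[1] - pb[1]) <= 1.0


def characterize(typo, target):
    """Label a distance-1 typo of `target` (labels are illustrative)."""
    if typo == target or abs(len(typo) - len(target)) > 1:
        raise ValueError("not an edit-distance-1 pair")
    i = 0
    while i < min(len(typo), len(target)) and typo[i] == target[i]:
        i += 1
    if len(typo) + 1 == len(target):                 # deletion
        if typo[i:] != target[i + 1:]:
            raise ValueError("not an edit-distance-1 pair")
        return "missing dot" if target[i] == "." else "deletion (other)"
    if len(typo) == len(target):                     # substitution
        if typo[i + 1:] != target[i + 1:]:
            raise ValueError("not an edit-distance-1 pair")
        if is_neighbor(typo[i], target[i]):
            return "substitution (neighbor key)"
        return "substitution (other)"
    if typo[i + 1:] != target[i:]:                   # insertion
        raise ValueError("not an edit-distance-1 pair")
    around = [typo[j] for j in (i - 1, i + 1) if 0 <= j < len(typo)]
    if typo[i] in around:
        return "insertion (repeat of adjacent character)"
    if any(is_neighbor(typo[i], c) for c in around):
        return "insertion (neighbor key)"
    return "insertion (other)"


if __name__ == "__main__":
    for t in ["wwwfacebook.com", "www.gacebook.com", "www.faceboook.com"]:
        print(t, "->", characterize(t, "www.facebook.com"))
```

Counting the labels over all detected typo domains would give percentages comparable in spirit to those on the slide, though the exact figures depend on how the categories were originally defined.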
11. Comparison With Other Typo-correctors
- Google and Yahoo typo-correction web services
- 15% (12%) missed by Google (Yahoo)
- 99.6% (98%) of what is missed are real parked domains
- 23% (31%) forward to the same target domain
12. System Implementation
- Successfully integrated our methodology with the Mozilla Firefox browser
- Second set, 94
- Non-typo domains: 10 ms on average, 25 ms maximum
13. Classifier
- Data set of 2,800 samples
- 700 are parked domains and 2,100 are general-purpose domains from the Yahoo Directory
- Identify distinguishing features
- Compute distributions for verification
- Use the WEKA library to try different classification algorithms; Random Forest was the best
14. Conclusion
- Defined and implemented an accurate identification methodology
- Performed measurements that show typo-squatters are moderately successful
- Integrated the methodology with a Firefox browser to detect typo-squatting domains on the fly