Title: Andrew Williams
1. Ranking Attackers Through Network Traffic Analysis
- Andrew Williams, Nikunj Kela
2. Agenda
- Background
- Tools We've Developed
- Our Approach
- Results
- Future Work
3. Background: The Problem
- Setting 1: Corporate Environment
  - Large number of attackers
  - How do you prioritize which attacks to investigate?
RSA
4. Background: The Problem
- Setting 2: Hacking Competitions
  - How do you know who should win?
5. Background: Information Available
- Network Traffic Captures
- Alerts from Intrusion Detection Systems (IDS)
- Application and Operating System Logs
6. Background: Traffic Captures
- HUGE volumes of data
- A complete history of interactions between clients and servers
- Information available:
  - Traffic Statistics
  - Info on interactions across multiple servers
  - How traffic varies with time
  - Everything up to and including application layer info
7. Background: IDS Alerts
- Messages indicating that a packet matches the signature of a known malicious one
- Still a fairly large amount of data
- Same downsides as anti-virus programs, but most IDS signatures are open source!
- If the IDS is compromised, these might not be available
- Information available:
  - Indication that known attacks are being launched
  - Alert Statistics
  - How alerts vary with time
8. Background: Application/OS Logs
- Ex.: MySQL logs, Apache logs, Windows 7 Security logs, ...
- Detailed, application-specific error messages and warnings
- Large amount of data
- If a server is compromised, logs may not be available
- Information available:
  - Very detailed information with more context
  - Access to errors/issues even if traffic was encrypted
9. Background: iCTF 2010 Contest
- 72 teams attempting to compromise 10 servers
- Vulnerabilities include SQL injection, exploitable off-by-one errors, format string exploits, and several others
- Pretty complex set of rules
- Dataset from competition:
  - 27 GB of Network Traffic Captures
  - 46 MB of Snort Alerts (from competition)
  - 175 MB of Snort Alerts (generated with updated rulesets)
  - No Application or OS Logs
- More information on the contest can be found here: http://www.cs.ucsb.edu/gianluca/papers/ctf-acsac2011.pdf
10. Tools We've Developed
- We wrote scripts to...
- Parse the large amount of data
  - Extract network traffic between multiple parties
  - Filter out less important Snort Alerts
  - Track connection state to generate statistics and stream data
- Visualize the data
  - Show all of the alerts and flag submissions with respect to time
- Analyze the data
  - Pull out the transaction distances and find statistics on them
- Generate Application and OS Logs
  - Replay network traffic to live virtual machine images
11. Our Approach: Intuition
- Vulnerability Discovery Phase
  - Identify the type of vulnerability
- Vulnerability Exploitation Phase
  - Refine the attack string
- Intuitively, a skilled attacker will come up with the attack string in less time than an unskilled attacker
- How do we know if the attacker has broken into the system?
  - We only have logs to work with!
- The time taken to break into the system reflects the learning capability of an attacker
  - Fast learner implies good attacker
12. Our Approach: Identify the attack string
- Once the attacker breaks into the system, he/she would use the same attack string almost every time to gather information
- We observed from the traffic logs that in most cases the attacker used one TCP stream to break into the system
  - One TCP connection for each attempt!
- We chose Levenshtein distance (edit distance) as our metric to compare two consecutive TCP communications from attacker to server
- Consecutive zeros as the distance between TCP data mean the attacker has successfully broken into the system
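The comparison described above can be sketched in a few lines of Python. This is only an illustration, not the scripts used in the study: the function names, the use of plain strings for the attacker-to-server payloads, and the threshold of two consecutive zero distances (`min_zero_run`) are all assumptions made for the example.

```python
from typing import List, Optional


def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of single-character insertions,
    deletions, and substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the row we iterate over as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (ch_a != ch_b)
            current.append(min(previous[j] + 1, current[j - 1] + 1, substitution))
        previous = current
    return previous[-1]


def consecutive_distances(payloads: List[str]) -> List[int]:
    """Distance between each TCP stream payload and the next one."""
    return [levenshtein(x, y) for x, y in zip(payloads, payloads[1:])]


def find_attack_string_index(distances: List[int],
                             min_zero_run: int = 2) -> Optional[int]:
    """Index of the first stream that starts a run of at least
    min_zero_run zero distances (assumed threshold), or None."""
    run = 0
    for i, d in enumerate(distances):
        run = run + 1 if d == 0 else 0
        if run == min_zero_run:
            return i - min_zero_run + 1
    return None
```

Given the payloads of one attacker's streams against one service in time order, `find_attack_string_index(consecutive_distances(payloads))` returns the position of the first stream whose payload is then repeated unchanged, which is the candidate attack string under the assumption above.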
13. Example: Identify the attack string
Stream1: "2720or2027273D270Alist0A"
Stream2: "2720OR2027273D270ALIST0A"
Stream3: "asdfasd202720UNION20SELECT202827secret.txt2729 3B20--20200AMUGSHOT0ASADF0A"
Stream4: "asdfasd202720UNION20SELECT202827secret.txt2729 3B20--20200AMUGSHOT0A393930A"
Stream5: "asdfasd202720UNION20SELECT202827secret.txt2729 3B20--20200AMUGSHOT0A16060A"
Stream6: "asdfasd202720UNION20SELECT202827secret.txt2729 3B20--20200AMUGSHOT0A16060A"
14. Our Approach: Feature Selection
- Time taken to successfully break into the system
- Mean and standard deviation of the distances between consecutive TCP streams
- Number of attempts before successfully breaching the service
- Length of the longest sequence of consecutive zeros
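The four features listed above could be computed from a list of stream start times and the consecutive edit distances roughly as follows. This is a hedged sketch rather than the original analysis code: the `AttackerFeatures` container, the function names, the choice of population standard deviation, and treating the first run of two zero distances as the breach are all assumptions for illustration.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import List, Optional


@dataclass
class AttackerFeatures:  # hypothetical container, not from the slides
    time_to_breach: Optional[float]        # seconds from first stream to breach
    mean_distance: float
    std_distance: float
    attempts_before_breach: Optional[int]  # streams sent before the breaching one
    longest_zero_run: int


def longest_zero_run(distances: List[int]) -> int:
    """Length of the longest sequence of consecutive zero distances."""
    best = run = 0
    for d in distances:
        run = run + 1 if d == 0 else 0
        best = max(best, run)
    return best


def breach_index(distances: List[int], min_zero_run: int = 2) -> Optional[int]:
    """First stream starting a run of min_zero_run zeros (assumed breach rule)."""
    run = 0
    for i, d in enumerate(distances):
        run = run + 1 if d == 0 else 0
        if run == min_zero_run:
            return i - min_zero_run + 1
    return None


def extract_features(timestamps: List[float], distances: List[int]) -> AttackerFeatures:
    """timestamps[i] is the start time of stream i; distances[i] compares
    stream i with stream i + 1, so len(timestamps) == len(distances) + 1."""
    idx = breach_index(distances)
    return AttackerFeatures(
        time_to_breach=None if idx is None else timestamps[idx] - timestamps[0],
        mean_distance=mean(distances) if distances else 0.0,
        std_distance=pstdev(distances) if distances else 0.0,  # population std (assumption)
        attempts_before_breach=idx,
        longest_zero_run=longest_zero_run(distances),
    )
```

One feature record per attacker and service could then feed whatever ranking or scoring step is applied afterwards.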
15. Result: Distance-Time Plot
16. Interesting Findings from the contest
- Although the contest involved only attacking the vulnerable services, teams still tried to break into each other's systems
- We noticed that teams shared the flag value with each other through the chat server
- The active status of the service was maintained through a complex Petri net system, and most of the teams struggled to understand it
- Hints about different vulnerabilities in the services were released from time to time throughout the contest by the administrators
17. Future Work
- Use data mining tools (e.g. SAS miner) to analyze the relationships among the features
- Use data mining tools to develop a scoring system that assigns a score to each team based on the feature set
- Continue improving the replay script to handle the large number of connections
18. Thank You!
19. Image Sources
WooThemes, free for commercial use
Icons-Land, free for non-commercial use
Fast Icon Studio, used with permission