Detection of Interactive Stepping Stones



Provided by: shobhaven

Learn more at: http://www.cs.cmu.edu



1
Detection of Interactive Stepping Stones

Shobha Venkataraman
shobha_at_cs.cmu.edu
Joint work with Avrim Blum and Dawn Song
Carnegie Mellon University
ICML Workshop 2006
June 29, 2006

2
Stepping Stone

Stepping-stone attack: attacker uses a chain of compromised machines to reach the victim
Difficult to find the attacker by looking only at the victim
Victim sees only the last host in the chain

3
Why stepping-stones?

Stepping-stones attractive to attackers
Ease of compromising hosts on Internet
Difficulty of detection
Don't know when host is compromised
Only know when there is an attack
Don't know who compromised it
Chaos and volume of Internet traffic
Not always logged
True attacker almost untraceable: near-perfect way to achieve anonymity!
Large-scale stepping-stones: botnets

4
Botnets: For sale, stepping-stones

Botnets: set of compromised hosts controlled by a single command-center
How this works:
Individual hosts compromised
Control privileges sold to other attackers, who use them to launch attacks.
Nearly impossible to discover true attackers
Extremely prevalent on Internet
Logs at CMU dept discovered Gaobot infection
Across 6-7 months of traffic (everything we
examined!)
Across 100 hosts (1/10th network) at peak
infection

5
Botnets (II)

(Figure: hosts labeled CMU, Pittsburgh Verizon DSL, and Stanford)

6
General Stepping-Stone Detection

Extremely difficult
Indefinite delay between stepping-stone legs
Traffic too voluminous/insufficiently logged for
traceback
Packets encrypted or padded between legs
Stepping-stone legs additionally masked by
adding superfluous traffic (chaff)

7
Restricting the Problem

Restrictions
Traffic monitoring done at routers/gateways
Interactive stepping-stone streams
Bounded delay Δ between stepping-stones

(Figure: stream passing through the Internet, with timestamps T = 0 and T ≤ Δ)

8
Restricting the Problem (II)

Restrictions put together: observe 2 time-delayed streams at monitor; are they a stepping-stone pair?
If attacker uses no chaff
If attacker uses chaff

(Figure: stream passing through the Internet, with timestamps T = 0 and T ≤ Δ)

9
Prior work

Donoho, Flesia, Shankar, Paxson, Coit & Staniford, RAID '02
Assumptions needed:
Attack stream from Poisson or Pareto distribution
Normal users perfectly uncorrelated
No guarantees on monitoring time or false positives
Wang & Reeves, CCS '03
Assumptions needed:
Timing perturbation of packets iid: strong assumption
No chaff
Scheme breaks without assumptions
Other related work: SH95, YE00, ZP00, WRWY01, WRW02, W04

10
Our work

Want to allow correlations among normal users
Don't flag just any correlated pair
Time-correlated pair ≠ stepping-stone pair
Use milder assumptions
Model non-attack streams as sequences of Poisson processes
No additional assumption on attacker
Allow chaff
Present algorithms and analysis for these models

11
Inspiration from learning theory

Learning Theory Question
How many examples do we need to see before we
can identify hypotheses with guaranteed
confidence?
Our Question
How many packets do we need to see before we can
identify normal/attack streams with guaranteed
confidence?
Rest of talk: answer this question

12
Outline

Problem definition
Without chaff
Simple Poisson model
Generalized Poisson model
With chaff
Algorithms
Hardness of detection results
Conclusions

13
Problem Definition (I)

Set-up: stepping-stone monitor tracks no. of packets in streams S1, S2 at a given time t: N1(t), N2(t)
Assumptions
Packets correspond 1-1 on stepping-stone streams (without chaff)
Max tolerable delay bound Δ exists
Max no. of packets attacker can send in time Δ exists: p_Δ
Our bounds will be in terms of p_Δ.

14
Problem Definition (II)

For stepping-stone streams S1, S2:
1. Every packet on S2 comes from S1:
N1(t) ≥ N2(t)
2. Every packet on S1 appears on S2 within Δ time:
N1(t) ≤ N2(t + Δ)
Assumptions on normal streams next
Detect stepping-stone pairs with guarantees on:
Monitoring time M: total packets observed on both streams before detection
False-positive probability ε
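
The two counting conditions on this slide can be checked directly on recorded arrival times. The following is a minimal sketch, not the authors' code: the function names, the sorted timestamp lists, and the `t_end` observation cutoff are all assumptions introduced for illustration. It relies on the fact that the counts only change at packet arrivals, so testing at arrival times is enough.

```python
from bisect import bisect_right

def count_upto(times, t):
    """N(t): number of packets with arrival time <= t (times must be sorted)."""
    return bisect_right(times, t)

def violates_stepping_stone(times1, times2, delta, t_end):
    """Return True if the pair breaks either stepping-stone condition.

    times1, times2: sorted arrival times on S1 and S2 (hypothetical inputs).
    delta: assumed maximum tolerable delay.
    t_end: end of the observation window; condition 2 is only tested
           where t + delta still lies inside the window.
    """
    for t in sorted(set(times1) | set(times2)):
        n1 = count_upto(times1, t)
        n2 = count_upto(times2, t)
        # Condition 1: every packet on S2 comes from S1, so N1(t) >= N2(t).
        if n1 < n2:
            return True
        # Condition 2: every S1 packet shows up on S2 within delta,
        # so N1(t) <= N2(t + delta).
        if t + delta <= t_end and n1 > count_upto(times2, t + delta):
            return True
    return False
```

For example, `violates_stepping_stone([0.0, 1.0, 2.0], [0.3, 1.2, 2.4], 0.5, 3.0)` finds no violation, while delaying the second S2 packet to 1.8 breaks condition 2.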

15
Simple Poisson Model

Assumptions
Normal stream: Poisson process with fixed rate (generalize this later)
p_Δ is known (relax this later).
No chaff (generalize this later).
Outline
Algorithm
Analysis sketch
Relax knowledge of p_Δ

16
Algorithm

Algorithm
Observe y packets on union of streams S1 and S2
Compute difference in no. of packets d = N1 - N2.
If d is not in [-p_Δ, p_Δ], return NORMAL
Repeat over x iterations the above procedure
Return ATTACK if d lies in [-p_Δ, p_Δ] throughout
Thm: with x = log 1/ε, y = 2(p_Δ + 1)^2
Monitoring time M = xy = O(p_Δ^2 log 1/ε) packets
False positives < ε
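
The procedure on this slide can be sketched as follows. This is an illustrative reading, not the authors' implementation: it assumes the merged stream is given as a list of labels (1 for a packet on S1, 2 for a packet on S2) in arrival order, that the counters restart in each iteration, that the logarithm is base 2, and it adds a hypothetical `UNDECIDED` result for inputs shorter than the monitoring time.

```python
import math

def detect(labels, p_delta, eps):
    """Classify a stream pair from the merged label sequence.

    labels: sequence of 1s and 2s, one per packet on S1 union S2 (assumed input).
    p_delta: max number of packets the attacker can send within the delay bound.
    eps: target false-positive probability.
    """
    y = 2 * (p_delta + 1) ** 2            # packets observed per iteration
    x = math.ceil(math.log2(1 / eps))     # number of iterations (base 2 assumed)
    if len(labels) < x * y:
        return "UNDECIDED"                # not part of the slide: too few packets
    for i in range(x):
        window = labels[i * y:(i + 1) * y]
        d = window.count(1) - window.count(2)   # d = N1 - N2 for this window
        if abs(d) > p_delta:
            return "NORMAL"               # d left [-p_delta, p_delta]
    return "ATTACK"                       # d stayed inside throughout
```

With `p_delta = 1` and `eps = 0.25` this observes 2 windows of 8 packets each, so the monitoring time is 16 packets. Only the per-stream counts inside each window are needed, which matches the low-overhead claim made in the analysis.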

17
Analysis (I)

Overhead
Only per-stream packet counters running all the
time!
Compute sums and differences for pairs once in a while
Algorithm needs NO knowledge of Poisson rates
Any stepping-stone pair sending M packets is reported
For stepping-stone pair, d stays within [-p_Δ, p_Δ]
If d > p_Δ, some packet violates max delay bound
Ensure that false positive probability is less than ε
i.e. for a normal pair, d leaves [-p_Δ, p_Δ] with probability more than 1 - ε
When d leaves [-p_Δ, p_Δ], algorithm returns NORMAL

18
Analysis (II)

Streams S1 and S2: Poisson processes with rates λ1, λ2
(normalized so that λ1 + λ2 = 1)
On union of streams, each packet has
λ1 chance of coming from S1,
λ2 chance of coming from S2

(Figure: packet arrivals on a time axis)

19
Analysis (III)

(Figure: running value of Z as packets arrive)

Let Z be the difference in no. of packets on S1 and S2

Every time a packet appears on S1 ∪ S2:
Z = Z + 1 with probability λ1
Z = Z - 1 with probability λ2
Thus, Z is equivalent to a 1-d random walk
Need Z to exit [-p_Δ, p_Δ] after some steps
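
The random-walk view above is easy to simulate. The sketch below is only an illustration under stated assumptions: the function names, the step budget of 2(p_Δ + 1)^2, and the use of a seeded `random.Random` generator are choices made here, not taken from the talk. It records the first step at which Z leaves [-p_Δ, p_Δ] and estimates how often that happens within the budget.

```python
import random

def first_exit_step(p_delta, lam1, max_steps, rng):
    """Step at which Z first leaves [-p_delta, p_delta], or None if it never does.

    lam1: probability that a packet on the union comes from S1 (lam2 = 1 - lam1).
    """
    z = 0
    for step in range(1, max_steps + 1):
        z += 1 if rng.random() < lam1 else -1   # +1 for S1, -1 for S2
        if abs(z) > p_delta:
            return step
    return None

def exit_fraction(p_delta, lam1, trials, seed):
    """Fraction of simulated walks that exit within 2 * (p_delta + 1)**2 steps."""
    rng = random.Random(seed)
    budget = 2 * (p_delta + 1) ** 2
    exits = sum(
        first_exit_step(p_delta, lam1, budget, rng) is not None
        for _ in range(trials)
    )
    return exits / trials
```

Running `exit_fraction` for a few values of `lam1` gives a quick empirical feel for how fast a normal pair is cleared; a strongly unbalanced pair exits almost immediately, while a balanced pair (`lam1 = 0.5`) is the slowest case.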

20
Analysis (IV)

Fact: 1-d random walk exits bounded region of length t in expected O(t^2) time!
Therefore,
When n = O(p_Δ^2),
Pr[Z will stay in bounded region] < 1/2
Repeat for m = log 1/ε iterations:
Pr[Z will stay in bounded region] < ε
When Z exits bounded region, normal pair does not get falsely accused.
Done!
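
The amplification step from 1/2 to ε can be written out in one line. This sketch assumes the m iterations behave independently and that the logarithm is base 2; neither is stated explicitly on the slide.

```latex
\Pr\bigl[\, Z \text{ stays in } [-p_\Delta,\, p_\Delta] \text{ in every iteration} \,\bigr]
  \;<\; \left(\tfrac{1}{2}\right)^{m}
  \;=\; \left(\tfrac{1}{2}\right)^{\log_2 (1/\varepsilon)}
  \;=\; \varepsilon .
```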

21
What if p_Δ is unknown?

What if we do not know p_Δ?
Use guess and double strategy.
Set p_j = 2^j.
Run algorithm over sequence of p_j: p_1, p_2, ...
When a pair is cleared for p_j, examine it with respect to p_{j+1}.

22
What if p_Δ is unknown?

For stepping-stone pair, increases monitoring time by O(log log p_Δ).
Guarantee depends only on true value of p_Δ!
In practice, set upper bound for p_Δ
Normal streams monitored until upper bound reached
As j increases, test differences exponentially less often
Fundamental problem: cannot distinguish between normal pair and attack pair with longer delay bound
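
The guess-and-double loop can be sketched as below. This is a hedged illustration only: `is_cleared` is a hypothetical callback standing in for one run of the detection algorithm at a given guess (True means the pair was cleared as normal for that guess), and `p_max` is the practical upper bound mentioned above.

```python
def guess_and_double(is_cleared, p_max):
    """Try guesses p_j = 2**j until the pair is flagged or p_max is passed.

    is_cleared: hypothetical function p -> bool; True if the pair is cleared
                (judged NORMAL) when tested against the bound p.
    p_max: assumed practical upper bound on the unknown packet bound.
    Returns ("ATTACK", p_j) for the first guess that fails to clear the pair,
    or ("NORMAL", None) if every guess up to p_max clears it.
    """
    j = 1
    while 2 ** j <= p_max:
        p_j = 2 ** j
        if not is_cleared(p_j):
            return ("ATTACK", p_j)
        j += 1            # cleared for p_j, so examine with respect to p_{j+1}
    return ("NORMAL", None)
```

The stopping rule reflects the fundamental limitation noted on the slide: a pair cleared at every guess up to `p_max` might still be an attack pair with a larger delay bound.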

23
Summary: Simple Poisson

Normal streams: Poisson process with single fixed rate.
Algorithm with guarantees on false positives and monitoring time
Algorithm needs no knowledge of Poisson rates
Analysis extended (in paper):
When p_Δ is unknown
When false positive probability is distributed over all pairs of streams

24
Outline

Problem definition
Without chaff
Simple Poisson model
Generalized Poisson model
With chaff
Algorithms
Hardness of detection results
Conclusions

25
Generalized Poisson model

Model normal process as SEQUENCE of Poisson processes: varying rates for varying time periods
i.e. stream given by (λ1, t1), (λ2, t2), ...
General model: can coarsely approximate almost any usage pattern, for example:
Coarsely simulate Pareto distributions: good model of typing patterns
Correlated users: same sequence of Poisson rates and time intervals

26
Analysis Sketch

Formally, a stream S is given by (λ1, t1), (λ2, t2), ...
Key observation:
At time T, packet distribution is equivalent to a Poisson process with single fixed rate Σ_j (λ_j · t_j)/T (weighted mean)
More details in paper.
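
The weighted-mean rate in the key observation is a one-line computation. In this small sketch the function name and the list-of-pairs input are assumptions, and T is taken to be the sum of the listed durations.

```python
def effective_rate(segments):
    """Time-weighted mean rate of a sequence of Poisson segments.

    segments: list of (rate, duration) pairs, i.e. (lambda_j, t_j).
    Assumes T is the total duration of the listed segments.
    """
    total_time = sum(t for _, t in segments)
    return sum(lam * t for lam, t in segments) / total_time

# A stream at rate 2 for 1 time unit, then rate 4 for 3 time units.
print(effective_rate([(2.0, 1.0), (4.0, 3.0)]))  # prints 3.5
```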

27
Summary: General Poisson

Normal streams modelled as sequences of Poisson processes (λ1, t1), (λ2, t2), ...
Very general model
Algorithm with guarantees on monitoring time and false positive rate
Once again, algorithm needs no knowledge of Poisson rates
Results in this model extended similarly:
When p_Δ is unknown
When false positive probability is distributed over all pairs of streams

28
Outline

Problem definition
Without chaff
Simple Poisson model
Generalized Poisson model
With chaff
Algorithms
Hardness of detection results
Conclusions

29
Chaff

(Figure: Attacker, Stepping Stone, Victim)

Chaff: dummy packets inserted in traffic streams to avoid detection

Algorithms (as presented) are broken by a single packet of chaff
Next: modify algorithms to handle limited chaff

30
Chaff: Algorithms

Fix chaff rate, but chaff arbitrarily distributed
Simple Poisson model
Algorithm
Let y be number of packets needed before we exit bounded region in random walk.
Allow chaff rate of p_Δ/4y, monitor for difference to leave [-2p_Δ, 2p_Δ]
Regular streams get difference (wait longer)
Can tweak algorithm to handle slightly higher chaff rate, but that's all. Hardness results next.
Extends similarly to general Poisson model.
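
The widened test can be sketched as follows. This is a loose illustration, not the authors' algorithm: it assumes the chaff rate p_Δ/4y means chaff packets per observed packet, and both function names are hypothetical. Under that reading, a window of 4y packets carries at most p_Δ chaff packets, which is why the allowed region grows from [-p_Δ, p_Δ] to [-2p_Δ, 2p_Δ].

```python
def chaff_budget(p_delta, y, packets):
    """Max chaff packets among `packets` observed packets.

    Assumes the rate p_delta / (4 * y) is chaff per observed packet.
    """
    return p_delta * packets / (4 * y)

def window_consistent_with_attack(labels, p_delta):
    """True if the window's difference d = N1 - N2 stays in [-2*p_delta, 2*p_delta].

    labels: sequence of 1s and 2s, one per packet on the union of the streams.
    """
    d = labels.count(1) - labels.count(2)
    return abs(d) <= 2 * p_delta
```

For instance, with `p_delta = 2` and `y = 18`, a window of 72 packets has a chaff budget of 2 packets under this interpretation.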

31
Hardness of Detection

No algorithm based on timing delays alone can detect stepping-stones with smart use of chaff
Can give bounds on chaff needed so attacker can:
pre-generate two independent processes
send packets to mimic independent processes exactly
Details and strategies in paper
If attacker can actively send such chaff, detection requires use of other information

32
Summary

Algorithms to detect stepping stones
Guarantees on monitoring time and false positives
Simple and generalized Poisson models
With and without (arbitrarily distributed) chaff
When p_Δ is known/unknown
Compared to previous work:
Milder assumptions, allow for substantial
correlation among normal users
No additional assumptions on attacker (besides
delay bound)
With sufficient chaff, attacker can mask stepping
stones, so that no algorithm that uses
inter-packet delays can detect them.