1
ICSI Work on Detection/Defense

Vern Paxson, Nicholas Weaver, et al
September 20, 2005

2
Overview

Forensic analysis of Witty
Internet Situational Awareness
Scan detection
Detecting Triggers
Preliminary signature white-listing
Students: Abhishek Kumar (Georgia Tech), Vinod
Yegneswaran (UWisc), Jaeyeon Jung (MIT), Juan
Caballero (CMU), Jayanthkumar Kannan (UCB),
Christian Kreibich (Cambridge)

3
Forensic Analysis of Witty

March 2004 (flaw announced previous day)
Single UDP packet - stateless spreading
Exploited flaw in the passive analysis of
Internet Security Systems products
Payload slowly corrupts random disk blocks
Telescope data from UCSD/CAIDA /8
Also UWisc /8, sampled 1-in-10

4
Witty Abstract Pseudo-code

1. Seed the PRNG using system time.
2. Send 20,000 copies of self to randomly selected
   destinations.
3. Open physical disk chosen randomly between 0 and 7.
4. If success:
5.     Overwrite a randomly chosen block on this disk.
6.     Goto line 1.
7. Else:
8.     Goto line 2.

5
More Detailed Pseudo-code

srand(seed) { X ← seed }
rand() { X ← X*214013 + 2531011; return X }
main()
 1. srand(get_tick_count())
 2. for (i = 0; i < 20,000; i++)
 3.     dest_ip ← rand()[0..15] || rand()[0..15]
 4.     dest_port ← rand()[0..15]
 5.     packetsize ← 768 + rand()[0..8]
 6.     packetcontents ← top-of-stack
 7.     sendto()
 8. if (open_physical_disk(rand()[13..15]))
 9.     write(rand()[0..14] || 0x4e20)
10.     goto 1
11. else goto 2
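
The generator and the per-packet field selection above can be sketched in Python. This is a minimal illustration, not the worm's code: the class and helper names are made up for this example, and it assumes that bit 0 in the `[0..15]` notation is the most significant bit, which is consistent with the later slide's reference to the "top 16 bits".

```python
MASK32 = 0xFFFFFFFF


class WittyPrng:
    """Linear congruential generator from the slide: X <- X*214013 + 2531011 (mod 2^32)."""

    def __init__(self, seed):
        self.x = seed & MASK32

    def rand(self):
        self.x = (self.x * 214013 + 2531011) & MASK32
        return self.x


def bits_msb(value, first, last):
    """Bits first..last of a 32-bit value, ASSUMING bit 0 is the most significant bit."""
    width = last - first + 1
    return (value >> (31 - last)) & ((1 << width) - 1)


def next_packet(prng):
    """One loop iteration: four rand() calls (two for dest_ip, then port, then size)."""
    ip_hi = bits_msb(prng.rand(), 0, 15)
    ip_lo = bits_msb(prng.rand(), 0, 15)
    dest_port = bits_msb(prng.rand(), 0, 15)
    size = 768 + bits_msb(prng.rand(), 0, 8)
    return (ip_hi << 16) | ip_lo, dest_port, size


prng = WittyPrng(seed=0)  # seed 0 is an arbitrary illustrative value
dest_ip, dest_port, size = next_packet(prng)
print(f"dest_ip={dest_ip:#010x} port={dest_port} size={size}")
```

Because every field comes from the high bits of consecutive generator outputs, each packet exposes a large part of the generator state to anyone who captures it.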

6
Witty Becomes Deterministic

Given the top 16 bits of a linear congruential
pseudo-random number generator's output, we can
brute-force the possible bottom bits to recover the
full pseudo-random state
Keys to the kingdom: infectee operation
effectively becomes deterministic (except for
pesky reseeding), with packets carrying an
implicit sequence number
So, for example, we can compute each infectee's
local access bandwidth even in the presence of
heavy packet loss (since the Windows sendto() call
is blocking)
Just based on the sequence numbers of packets seen at
the telescope and the amount of data sent between them
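
The brute-force step can be sketched as follows. This is an illustration under assumptions, not the authors' tooling: it assumes the three 16-bit fields of one packet (the two halves of the destination address and the destination port) are the top 16 bits of three consecutive generator states, and the function names and the hidden state value are invented for the example.

```python
MASK32 = 0xFFFFFFFF


def lcg_next(x):
    """Advance the generator one step: X <- X*214013 + 2531011 (mod 2^32)."""
    return (x * 214013 + 2531011) & MASK32


def recover_states(ip_hi, ip_lo, dest_port):
    """Return every 32-bit state whose top 16 bits are ip_hi and whose next
    two successors have top 16 bits ip_lo and dest_port."""
    matches = []
    for low in range(1 << 16):          # try all 65,536 possible bottom halves
        x0 = (ip_hi << 16) | low
        x1 = lcg_next(x0)
        if x1 >> 16 != ip_lo:
            continue
        x2 = lcg_next(x1)
        if x2 >> 16 == dest_port:
            matches.append(x0)
    return matches


# Simulate one observed packet from a hypothetical hidden state.
hidden = 0x1234ABCD
x1 = lcg_next(hidden)
x2 = lcg_next(x1)
candidates = recover_states(hidden >> 16, x1 >> 16, x2 >> 16)
print([hex(c) for c in candidates])
```

Once a state is recovered, stepping the generator forward until it reproduces a later packet from the same source gives the number of intervening `rand()` calls, which is the implicit sequence number the slide refers to.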

7
Inferred Access Bandwidth of Individual Witty
Infectees
8
Precise Bandwidth Estimation vs. Rates Measured
by Telescope
9

srand(seed) { X ← seed }
rand() { X ← X*214013 + 2531011; return X }
main()
 1. srand(get_tick_count())
 2. for (i = 0; i < 20,000; i++)
 3.     dest_ip ← rand()[0..15] || rand()[0..15]
 4.     dest_port ← rand()[0..15]
 5.     packetsize ← 768 + rand()[0..8]
 6.     packetcontents ← top-of-stack
 7.     sendto()
 8. if (open_physical_disk(rand()[13..15]))
 9.     write(rand()[0..14] || 0x4e20)
10.     goto 1
11. else goto 2

4 calls to rand() per loop

Or complete reseeding if not
10
Witty Infectee Reseeding Events

Recall: every 20,000 packets, Witty burns a random
number picking a disk to open and trash. For
packets with states X_i and X_j:
If from the same batch of 20,000, then
j - i ≡ 0 (mod 4)
If from separate but adjacent batches, for which
Witty did not reseed, then
j - i ≡ 1 (mod 4)
(but which of the 100s/1000s of intervening
packets marked the phase shift?)
If from batches across which Witty reseeded, then
no apparent relationship.
This lets us find the phase of Witty reseeding events
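
The mod-4 relationship follows directly from counting `rand()` calls. The sketch below is a toy model, assuming an infectee whose disk open always fails (so it never reseeds) and using a batch size of 5 instead of 20,000 to keep the output short; the function name is made up for illustration.

```python
def packet_start_indices(batches, batch_size):
    """For each packet, record (batch number, rand() calls made before the packet).

    Models an infectee whose disk open always fails, so it never reseeds.
    """
    calls = 0
    starts = []
    for batch in range(batches):
        for _ in range(batch_size):
            starts.append((batch, calls))
            calls += 4      # dest_ip (2 calls), dest_port, packetsize
        calls += 1          # one rand() burned picking a disk; open fails, no write
    return starts


starts = packet_start_indices(batches=3, batch_size=5)
for batch, index in starts:
    print(f"batch {batch}: state index {index}, index mod 4 = {index % 4}")
```

In this model every packet in the same batch shares a phase, and each batch boundary crossed without a reseed shifts the phase by one.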

11
Finding Each Infectee's Random Seed

Given the phase of reseeding events
plus the fact that Witty uses uptime (in msec)
for its entropy
thus its seeds increase linearly with time
plus some computational geometry
We can extract each infectee's random seed
I.e., we know its uptime
And, by observing the times it didn't reseed, how
many disks it has attached

12
Uptime of 750 Witty Infectees
13
Disk Drives Per Witty Infectee
14
Given Exact Values of Seeds Used for Reseeding

More generally, we know every packet each
infectee sent
Can compare this to when new infectees show up
i.e. Who-Infected-Whom

15
Infection Attempts That Were Too Early, Too Late,
or Just Right
16
Witty is Incomplete

Recall that a linear congruential PRNG generates a complete orbit
over a permutation of 0..2^32-1.
But the Witty author didn't use all 32 bits of a
single PRNG value:
dest_ip ← (X_i)[0..15] || (X_{i+1})[0..15]
This does not generate a complete orbit!
Misses 10% of the address space
Visits 10% of the addresses (exactly) twice
So, were 10% of the potential infectees protected?
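
The effect of building an address from the high halves of two consecutive states can be explored on a scaled-down generator. The sketch below is a toy: it assumes a 16-bit state with the same multiplier and increment, and builds 16-bit "addresses" from the top 8 bits of consecutive states. The fractions it prints describe this toy only and should not be expected to match the figures for the real 32-bit generator.

```python
from collections import Counter

STATE_BITS = 16                 # toy size (assumption); Witty's state is 32 bits
HALF = STATE_BITS // 2
MASK = (1 << STATE_BITS) - 1


def lcg_next(x):
    """Same multiplier and increment as the slide, reduced modulo 2^STATE_BITS."""
    return (x * 214013 + 2531011) & MASK


def address_visit_counts():
    """Count how often each address is produced over one full orbit."""
    counts = Counter()
    # The generator has full period, so each state occurs exactly once per orbit;
    # iterating over all states is equivalent to walking the orbit.
    for x in range(1 << STATE_BITS):
        hi = x >> HALF
        lo = lcg_next(x) >> HALF
        counts[(hi << HALF) | lo] += 1
    return counts


counts = address_visit_counts()
total = 1 << STATE_BITS
missed = total - len(counts)
twice = sum(1 for n in counts.values() if n == 2)
print(f"missed {missed / total:.1%}, visited exactly twice {twice / total:.1%}")
```

Since the orbit yields exactly as many addresses as there are states, every address visited more than once implies some other address that is never visited.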

17
Time When Infectees Seen At Telescope
18
How Do Unscanned Infectees Become Infected?

Multihomed host infected via another address
DHCP or NAT aliasing
But what about the extra-quick ones?
Either they were passively infected and had
large cross-sections
Or they were known in advance to the attacker

19
Uptime of 750 Witty Infectees
Part of a group of 135 infectees from same /16
20
Time When Infectees Seen At Telescope
Most also belong to that /16
21
Witty Started With A Hit List

Initial infectees exhibit super-exponential
growth ⇒ they weren't found by random scanning
(And we can in fact show that large-scale passive
infection is unlikely)
Prevalent /16: U.S. military base
Attacker knew of ISS security software
installation at military site ⇒ ISS insider (or
ex-insider)
Fits with very rapid development of worm after
public vulnerability disclosure

22
Are All The Worms In Fact Executing Witty?

Answer: No.
One infectee probes addresses not on the orbit,
each of the form A.B.A.B rather than A.B.C.D.
Each probe contains Witty contagion, but lacks
randomized payload size.
Shows up very near beginning of trace.
Patient Zero - machine attacker used to
launch Witty. (Really, Patient Negative One.)
European retail ISP.
Communicated to law enforcement.

23
Implications of Witty Forensics

Provided a degree of worm attribution
(truth be told, doesn't require the full
analysis)
Powerful demonstration of opportunistic
measurement and exploiting structure
Very labor intensive
A one-trick pony?

24
Internet Situational Awareness

Separate from the ICSI honeyfarm, at LBL we operate a
2,560-address honeynet w/ honeyd responders
Basic question: how do we tell when it sees
something new
and interesting?
Idea:
Characterize background radiation in abstract
terms
Remove any matches, consider the remainder new
except first run for a few months to converge
on the full set of abstractions

25
Internet Situational Awareness, cont

It doesn't work.
There is constant churn in what arrives that's
new
Though often with very minor variations
In principle removable, but need better
meta-abstractions for doing so
Basic question #2: What can we say about an
event seen by the honeynet?
Is it a worm, a botnet, a misconfiguration?
If a botnet, could it be more than one? Is the
scanning coordinated? How large a region is the
scan targeting?

31
Internet Situational Awareness, cont

It doesn't work ... yet.
Significant noise problems
Significant modalities variations
Calibration difficulties

32
Scan Detection

TRW (Threshold Random Walk) is very effective at
detecting random scanners
at least, at a site's border
(we now have some enterprise traces to evaluate)
What about non-random scanning worms?
Topological, meta-server
Idea: detect anomalously high fan-out rate
But with what detection threshold? Too low and
busy hosts trigger false positives. Too high and
a worm can fly under the radar.
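
For readers unfamiliar with TRW, the core of the test is a sequential likelihood ratio over the success or failure of a host's first-contact connection attempts. The sketch below is a minimal illustration: the function name is invented, and the success probabilities and error targets are illustrative assumptions rather than values stated in this talk.

```python
def trw(outcomes, theta0=0.8, theta1=0.2, alpha=0.01, beta=0.99):
    """Sequential test over first-contact connection outcomes.

    outcomes: iterable of booleans, True if the connection attempt succeeded.
    theta0 / theta1: assumed success probability for benign hosts / scanners.
    alpha / beta: assumed false-positive target / detection-probability target.
    Returns (verdict, number of outcomes consumed).
    """
    upper = beta / alpha                  # declare "scanner" at or above this
    lower = (1 - beta) / (1 - alpha)      # declare "benign" at or below this
    ratio = 1.0
    count = 0
    for success in outcomes:
        count += 1
        if success:
            ratio *= theta1 / theta0
        else:
            ratio *= (1 - theta1) / (1 - theta0)
        if ratio >= upper:
            return "scanner", count
        if ratio <= lower:
            return "benign", count
    return "pending", count


print(trw([False, False, False, False]))
print(trw([True, True, True, True]))
print(trw([True, False]))
```

The test depends on scanners failing far more often than benign hosts, which is why it says little about topological or meta-server worms that mostly contact live targets.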

33
Applying Sequential Hypothesis Testing to
Rate-based Detection

Idea: per host, learn its past rate of contacting
new hosts
This becomes its Bayesian prior for non-infection
Hypothesize a higher rate for infected hosts
As new contacts are made, apply SHT to decide
between infection/non-infection
Benefits:
No single fixed detection threshold
Host's behavior somewhat integrated over multiple
time scales by updates to SHT

34
RBS (Rate-Based Seq. Hyp. Testing)

Math based on Poisson arrivals for hosts
contacting new destinations (not too bad an
assumption)
Evaluated on partial enterprise traces
Proxies for topological scanners: internal
security scanner, web crawlers, printer manager,
service monitor
Prior for benign fan-out rate: 3.8 Hz
Preliminary: works fairly well, 1 FP/hr
Also assess hybrid, RBS+TRW
But:
FP rate high enough to make automatic response
problematic
Topological worm can still spread very fast at
about 3.8 Hz if it avoids TRW's failure detection
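
A rate-based sequential test under the Poisson assumption can be sketched as a textbook sequential probability ratio test on the gaps between a host's contacts to new destinations. This is not necessarily the exact formulation used in RBS: the function name, the rates in the usage lines, and the error targets are all assumptions chosen for illustration.

```python
import math


def rate_sprt(interarrivals, lam0, lam1, alpha=0.01, beta=0.99):
    """Sequential test on gaps (seconds) between contacts to new destinations.

    lam0: assumed benign rate (Hz), e.g. learned from the host's history.
    lam1: hypothesized rate (Hz) for an infected host, with lam1 > lam0.
    alpha / beta: assumed false-positive and detection-probability targets.
    Returns (verdict, number of gaps consumed).
    """
    upper = math.log(beta / alpha)
    lower = math.log((1 - beta) / (1 - alpha))
    llr = 0.0
    count = 0
    for gap in interarrivals:
        count += 1
        # Log-likelihood ratio of an exponential gap under lam1 versus lam0.
        llr += math.log(lam1 / lam0) - (lam1 - lam0) * gap
        if llr >= upper:
            return "infected", count
        if llr <= lower:
            return "benign", count
    return "pending", count


print(rate_sprt([0.01, 0.01, 0.01], lam0=1.0, lam1=10.0))
print(rate_sprt([2.0], lam0=1.0, lam1=10.0))
```

Short gaps push the ratio toward the infected hypothesis and long gaps pull it back, so the decision adapts to each host's own baseline rate instead of one global threshold.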

35
DNS-Based Scan Detection

Previous work: watch DNS traffic to detect
random-address scanners, because their connections are not preceded by
a name lookup
Idea (preliminary): for non-random scanning
worms, use a site's DNS server to gain insight
into what can't otherwise be seen
The hope: even if scanning activity occurs within
an unmonitored subnet, for topological worms it will
still often be preceded by a DNS lookup that is
seen at the DNS server
Assessed on traces from LBL's name servers
Problem: there are a lot of hosts with
significant DNS fan-out (also, surprisingly, a
lot of failure to cache previous answers)
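
The "previous work" heuristic mentioned at the top of this slide (connections not preceded by a name lookup) can be sketched as below. The event tuple format, function name, and sample addresses are hypothetical; a real monitor would also need to expire old answers and handle hosts that share caches.

```python
from collections import defaultdict


def unresolved_contacts(events):
    """Count, per host, outbound connections to addresses it never looked up.

    events: time-ordered tuples in a hypothetical format,
            ("dns", host, answered_ip) or ("conn", host, dest_ip).
    """
    resolved = defaultdict(set)
    counts = defaultdict(int)
    for kind, host, ip in events:
        if kind == "dns":
            resolved[host].add(ip)
        elif kind == "conn" and ip not in resolved[host]:
            counts[host] += 1
    return dict(counts)


sample = [
    ("dns", "hostA", "192.0.2.10"),
    ("conn", "hostA", "192.0.2.10"),    # preceded by a lookup
    ("conn", "hostB", "192.0.2.77"),    # no lookup seen
    ("conn", "hostB", "192.0.2.78"),    # no lookup seen
]
print(unresolved_contacts(sample))
```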

36
DNS-Based Scan Detection, cont

Another idea: analyze DNS lookups to spot
potential contact graphs
I.e., A looks up B, which then looks up C, which
looks up D
Somewhat more promising, but:
Needs to work on short chains, since trouble
likely grows exponentially with chain length
Trace evaluation finds clusters of hosts that
frequently look each other up. Need to
distinguish these from true contact graphs (by
training? by a tell?)

37
Detecting Triggers

Observation: many forms of successful
attack/abuse manifest as incoming traffic to a
host H that triggers H to initiate/receive connections
it otherwise wouldn't
Phone-home signal on successful exploits
Also done by opening up a new port that's probed
by attackware to determine success
Incoming worm traffic triggers outgoing scanning
Incoming email/IRC triggers outgoing email/IRC
Idea: such triggers manifest as apparently
unrelated connections occurring closer in time
than should happen just due to chance

38
Detecting Triggers, cont

Mathematical framework assumes that application
sessions are well modeled as a Poisson process.
Compute the probability that two independent Poisson
processes would occur as close together as
observed. If low, flag as anomalous.
Requires recognizing known session structure,
e.g., FTP user connection + FTP data connections +
optional ident connection. Or SMTP in to a
known server (again w/ optional ident) that leads
to SMTP out as the server forwards the message.
We codified 39 of these
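
The closeness test can be illustrated with the simplest one-sided case: given an incoming session at some instant, how likely is it that an unrelated outgoing session from a Poisson process with rate λ starts within δ seconds afterwards? That probability is 1 - exp(-λδ). The sketch below uses this simplified form; the function names, rates, and threshold are assumptions, and the framework in the talk may use a different statistic.

```python
import math


def chance_probability(rate_per_sec, gap_sec):
    """P(an independent Poisson process has an arrival within gap_sec)."""
    # -expm1(-x) equals 1 - exp(-x) but is more accurate for small x.
    return -math.expm1(-rate_per_sec * gap_sec)


def is_anomalous(rate_per_sec, gap_sec, threshold=1e-3):
    """Flag a pairing that would rarely happen by chance (threshold is an assumption)."""
    return chance_probability(rate_per_sec, gap_sec) < threshold


hourly = 1 / 3600   # hypothetical host that starts such sessions about once an hour
print(chance_probability(hourly, 0.5), is_anomalous(hourly, 0.5))
print(chance_probability(hourly, 600), is_anomalous(hourly, 600))
```

A connection following half a second after an incoming one is very unlikely for a once-an-hour process, so it is flagged; one following ten minutes later is not.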

39
Detecting Triggers, cont

This works! ... in terms of finding hidden
causality, i.e., connections that are related
even though not part of one of the recognized
sessions.
This doesn't work! ... in terms of assuming that
such hidden causality reflects abuse.
Instead, it nearly always means we've found a new
type of (benign) application session.
Prevalence could be skewed by the degree to which
LBL's traffic includes a very diverse set of
applications.
We got the FP rate down to a few dozen per day:
not good enough. Serves as a good anomaly signal
but is not actionable.
We're now thinking about recasting in terms of
automatically discovering session structure.

40
Signature White-listing

Problem: when automatically distilling signatures
(e.g., from honeypot traffic), how do we ensure
that the signature doesn't reflect benign/common
protocol elements?
E.g., USER-AGENT: Mozilla/4.0 (compatible; MSIE
6.0b; Windows 98)
Idea: run signature distillation over a large
corpus of mostly benign traffic, identify
frequently occurring protocol elements for
white-listing
Status: basic algorithms developed, preliminary
test on HTTP traces promising
with key questions being: how will it scale to
sufficiently large datasets,
and will this suffice to construct a complete
enough list?
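
The white-listing idea can be sketched as a frequency count over a benign corpus. This is a deliberately simplified stand-in for real signature distillation: it assumes each flow has already been split into protocol elements (here, whole header lines), and the function names, the 50% cutoff, and the sample strings are all hypothetical.

```python
from collections import Counter


def build_whitelist(benign_flows, min_fraction=0.5):
    """Return elements that appear in at least min_fraction of the benign flows."""
    flow_freq = Counter()
    for flow in benign_flows:
        flow_freq.update(set(flow))      # count each element once per flow
    cutoff = min_fraction * len(benign_flows)
    return {elem for elem, n in flow_freq.items() if n >= cutoff}


def filter_signatures(candidates, whitelist):
    """Drop candidate signatures that are just common benign elements."""
    return [sig for sig in candidates if sig not in whitelist]


UA = "USER-AGENT: Mozilla/4.0 (compatible; MSIE 6.0b; Windows 98)"
benign = [
    [UA, "Host: a.example"],
    [UA, "Host: b.example"],
    [UA, "Host: c.example"],
]
whitelist = build_whitelist(benign)
print(filter_signatures([UA, "hypothetical-exploit-bytes"], whitelist))
```

The two open questions on the slide map onto this sketch directly: the counter must scale to very large corpora, and the corpus must be broad enough that the resulting list is reasonably complete.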

41
(Additional Slides Re Witty Analysis)