Traceback - PowerPoint PPT Presentation

About This Presentation

Title:

Traceback

Description:

IEEE/ACM Transactions on Networking, December 2002. Xiang, Y., Zhou, W. A Defense System Against DDoS Attacks by Large-Scale IP Traceback , IEEE 2005. – PowerPoint PPT presentation

Slides: 38

Provided by: Yano9

Learn more at: https://engineering.unt.edu

Transcript and Presenter's Notes

1
Traceback

Pat Burke
Yanos Saravanos

2
Agenda

Introduction
Problem Definition
Traceback Methods
Packet Marking
Hash-based
Conclusion
References

3
Why Use Traceback?

SPAM
DoS
Insider attacks
Worms / Viruses
Code Red (2001) spreading at 8 hosts/sec
Slammer Worm (2003) spreading at 125 hosts/sec

4
Why Use Traceback?

Currently very difficult to find spammers, virus
authors
Easy to spoof IPs
No inherent tracing mechanism in IP
Blaster virus author left clues in code, was
eventually caught
What if we could trace packets back to point of
origin?

5
Packet Tracing

6
Benchmarks

Effect on throughput
Amount of overhead added to the packets
False positive rate
Computational intensity
Time required to trace an attack
Amount of data required to trace an attack
CPU/memory usage on router
Collisions
For hash-based traceback methods

7
FDPM: Flexible Deterministic Packet Marking

8
Packet Marking

Add information to the packets so that paths can
be retraced to original source
Methods for marking packets
Probabilistic
Node Marking
Edge Marking
Deterministic

9
Probabilistic Packet Marking (PPM)

Using probability, router marks a packet
With router IP address (node marking)
With edge of paths (edge marking)
95% accuracy requires 300,000 packets
PPM needs a large number of packets and creates
very high computational load
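
To illustrate why node marking converges slowly, the sketch below simulates routers that each overwrite a single mark slot with probability p. A mark written d hops from the victim survives only with probability p(1-p)^(d-1), so distant routers are seen less often. The function name, the router labels, and the values p = 0.2 and 20,000 packets are assumptions chosen for illustration, not figures from the slides.

```python
import random
from collections import Counter

def simulate_node_marking(path, p, num_packets, seed=0):
    """Simulate node-marking PPM along `path` (attacker side first, victim side last).

    Each router overwrites the packet's single mark slot with probability p,
    so marks written far from the victim are often overwritten on the way.
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(num_packets):
        mark = None  # the packet leaves the attacker unmarked
        for router in path:
            if rng.random() < p:
                mark = router  # later routers may overwrite earlier marks
        if mark is not None:
            counts[mark] += 1
    return counts

path = ["R1", "R2", "R3", "R4", "R5"]  # hypothetical path, R1 farthest from the victim
counts = simulate_node_marking(path, p=0.2, num_packets=20_000)

# Sorting routers by mark frequency approximates the path, nearest router first.
reconstructed = [router for router, _ in counts.most_common()]
```

The victim can only order routers once the frequency gaps are statistically clear, which is why long paths and small p drive the packet count up.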

10
PPM Nodes - Cons

Large number of false positives
DDoS with 25 hosts requires several days and has
thousands of false positives
Slow convergence rate
For 95% success, we need 300,000 packets
Attacker can still inject modified packets into
PPM network (mark spoofing)
This is only for a single attacker

11
Deterministic Packet Marking (DPM)

Every packet is marked
Spoofed marks are overwritten with correct marks

12
DPM

Incoming packets are marked
Outgoing packets are unaltered
Requires more overhead than PPM
Less computation required
Probability of generating ingress IP address:
$(1-p)^{d-1}$

13
DPM Mark

16-bit mark field
1-bit flag field

14
DPM Mark Encoding

Ingress IP address is split into two 16-bit halves
Each packet carries one half in the mark field,
chosen with probability 0.5
Flag is set to 1 if the mark holds the
higher-order bits
Can be made more secure by using non-uniform
probability distributions
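
The encoding above can be sketched as follows. This is a minimal illustration assuming a 32-bit IPv4 ingress address and a uniform 0.5 choice between halves; the function names `dpm_mark` and `dpm_reconstruct` and the sample address are hypothetical.

```python
import ipaddress
import random

def dpm_mark(ingress_ip, rng=random):
    """Return (mark16, flag) for one packet: flag 1 = upper 16 bits, 0 = lower 16 bits."""
    addr = int(ipaddress.IPv4Address(ingress_ip))
    if rng.random() < 0.5:
        return (addr >> 16) & 0xFFFF, 1
    return addr & 0xFFFF, 0

def dpm_reconstruct(marks):
    """Rebuild the ingress address once both halves have been seen, else return None."""
    halves = {}
    for mark16, flag in marks:
        halves[flag] = mark16
    if 0 in halves and 1 in halves:
        return str(ipaddress.IPv4Address((halves[1] << 16) | halves[0]))
    return None

rng = random.Random(1)
marks = [dpm_mark("192.0.2.77", rng) for _ in range(10)]
recovered = dpm_reconstruct(marks)  # None only if all ten marks carried the same half
```

The victim needs at least one packet carrying each half, which is why a handful of packets is usually enough for a single source.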

15
DPM

Claimed to have 0 false positives
Claimed to converge very quickly
99% probability of success with 7 packets
99.9% probability of success with only 10 packets
Has not been tested on large networks
Cannot deal with NAT

16
Flexible DPM

DPM uses 17 bits in the IP header to store
marking information
FDPM adds a variable-length portion of the TOS
field to the mark
TOS portion is between 0 and 8 bits, so the mark
is 17-25 bits
Split ingress IP into k segments, send in
separate packets
Segment numbers keep address order consistent
Reconstruction process recognizes packets from
same source
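
A minimal sketch of the segmentation idea, assuming k divides 32 evenly and that each mark carries only a segment number and the segment bits. The real FDPM mark layout has more fields, and the helper names here are hypothetical.

```python
import ipaddress

def split_address(ingress_ip, k):
    """Split a 32-bit address into k equal segments, each tagged with a segment number."""
    if 32 % k != 0:
        raise ValueError("this sketch assumes k divides 32")
    addr = int(ipaddress.IPv4Address(ingress_ip))
    width = 32 // k
    mask = (1 << width) - 1
    # Segment 0 holds the most significant bits.
    return [(i, (addr >> (32 - width * (i + 1))) & mask) for i in range(k)]

def join_segments(segments, k):
    """Reassemble the address from (segment_number, bits) pairs received in any order."""
    width = 32 // k
    addr = 0
    for index, bits in segments:
        addr |= bits << (32 - width * (index + 1))
    return str(ipaddress.IPv4Address(addr))

segments = split_address("198.51.100.9", k=4)
shuffled = list(reversed(segments))  # marks may arrive out of order
restored = join_segments(shuffled, k=4)
```

Because every segment carries its own number, arrival order does not matter during reassembly.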

17
FDPM Reconstruction

Mark Recognition
Store reconstruction packets in cache
Split IP header into fields to find mark length
Address Recovery
Analyze and store mark in recovery table
Different source IPs may have the same digest
(hash value) and collisions may occur
More than one entry is created
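
One plausible way to picture the recovery table is sketched below. It assumes each mark carries (digest, segment number, segment bits), uses a 4-bit CRC-based digest as a stand-in for the unspecified hash, and keeps every candidate value per segment so that a digest collision produces more than one candidate entry. All names and sizes are assumptions for illustration.

```python
import ipaddress
import zlib
from itertools import product

K = 4  # assumed number of 8-bit segments per address

def digest(addr):
    """Hypothetical 4-bit address digest; the hash used by the real scheme may differ."""
    return zlib.crc32(addr.to_bytes(4, "big")) & 0xF

def marks_for(ip):
    """Produce the K marks an ingress router would emit for one source address."""
    addr = int(ipaddress.IPv4Address(ip))
    d = digest(addr)
    return [(d, i, (addr >> (24 - 8 * i)) & 0xFF) for i in range(K)]

def recover(marks):
    """Group marks by digest, then keep segment combinations whose digest matches."""
    table = {}  # digest -> list of K sets of candidate segment values
    for d, seg_no, bits in marks:
        table.setdefault(d, [set() for _ in range(K)])[seg_no].add(bits)
    sources = set()
    for d, slots in table.items():
        if not all(slots):
            continue  # some segment has not arrived yet
        for combo in product(*slots):
            addr = int.from_bytes(bytes(combo), "big")
            if digest(addr) == d:
                sources.add(str(ipaddress.IPv4Address(addr)))
    return sources

found = recover(marks_for("203.0.113.5") + marks_for("198.51.100.9"))
```

When two sources share a digest, mixed segment combinations can also pass the digest check, which is one way collisions turn into false positives.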

18
Flow-Based Marking

Mark packets selectively according to flow
properties when router has heavy traffic
Reduce load on router while still marking
Packets are classified according to destination
IP address
Uses flow thresholds
Lmax is the threshold beyond which the router's
load is exceeded (called the overload problem)
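
The decision logic can be sketched as below, assuming the flow property is the packet count per destination IP and that a router marks everything while its load stays at or below Lmax. The class name, the `flow_threshold` parameter, and the numbers are hypothetical.

```python
from collections import Counter

class FlowBasedMarker:
    """Mark every packet under normal load; mark only heavy flows when overloaded."""

    def __init__(self, l_max, flow_threshold):
        self.l_max = l_max                    # load above this counts as overload
        self.flow_threshold = flow_threshold  # packets per destination before marking
        self.flow_counts = Counter()          # flows are keyed by destination IP

    def should_mark(self, dst_ip, current_load):
        self.flow_counts[dst_ip] += 1
        if current_load <= self.l_max:
            return True  # normal load: deterministic marking of every packet
        # Overloaded: spend marking effort only on flows past the threshold.
        return self.flow_counts[dst_ip] > self.flow_threshold

marker = FlowBasedMarker(l_max=1000, flow_threshold=3)
decisions = [marker.should_mark("203.0.113.10", current_load=1500) for _ in range(5)]
```

Under overload the first three packets of the flow pass unmarked, and marking starts once the flow exceeds its threshold.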

19
Traceable Sources

20
Overload Problem

Under heavy traffic, router can randomly mark
packets
Both attack and normal packets may be marked
Flow-based marking as mentioned before is much
more efficient

21
FDPM Performance
                        PPM     DPM     FDPM
Packets for Traceback   10^3    10^2    10^2
Traceable Sources       10^2    10^3    10^5
Computational Load      High    Medium  Low
Overload Prevention     None    None    Good

FDPM can trace many more sources with less
computational load since it uses variable mark
lengths
Incorporates overload prevention to keep router
from failure
Requires more overhead (up to 25 bits instead of
only 17)

22
HASH-BASED TRACEBACK: Source Path Isolation
Engine (SPIE)

23
SPIE - Overview

Each router along a packet's transmission path
computes a set of hash codes (digests) associated
with each packet
The time-tagged digests are stored in
router-memory for some time period
Limited by available router resources
Traceback is initiated only by authenticated
agent requests to the SPIE Traceback Manager
(STM)
Executed by means of a broadcast message
Results in the construction of a complete attack
graph within the STM

24
SPIE - Assumptions

Packets may be addressed to multiple destinations
Attackers are aware they are being traced
Routers may be subverted, but not often
Routing within the network may be unstable
Traceback must deal with divergent paths
Packet size should not grow as a result of
traceback
1-byte increase in size = 1% increase in resource
use
Very controversial self-enabling assumption
End hosts may be resource constrained
Traceback is an infrequent operation
Broadcast messages can have a significant impact
on internet performance
Traceback should return entire path, not just
source

25
SPIE - Architecture
DGA (Data Generation Agent): Resident in
SPIE-enhanced routers to produce digests and
store them in time-stamped digest tables.
Implemented as software agents, interface
cards, or dedicated aux boxes.
STM (SPIE Traceback Manager): Controls the SPIE
system. Verifies authenticity of a traceback
request, dispatches the request to the
appropriate SCARs, gathers regional attack
graphs, and assembles the complete attack graph.
SCAR (SPIE Collection and Reduction Agent):
Data concentration point for a regional area.
When traceback is requested, SCARs initiate a
broadcast request for traceback and produce
regional attack graphs based upon data from
constituent DGAs.

26
SPIE - Hashing

Multiple hash codes (digests, using different
groupings of fields) are calculated for each
packet based on 24 bytes of relatively invariant
fields in the header plus the first 8 bytes of the
payload.

For each packet received, SPIE computes k
independent n-bit digests and sets the
corresponding bits in the 2^n-bit digest table, a
Bloom filter (nominally 256 Mb per filter).
Each filter contains the digests of multiple
packets (approximately 50M packets per filter):
as many as possible while avoiding collisions.

Masked (gray) areas are NOT used in hash-code
calculation
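
The digest table can be pictured as a small Bloom filter. The sketch below is a toy version: it uses salted SHA-256 as a stand-in for SPIE's hash functions (which are not specified here), a 2^16-bit table instead of 256 Mb, and k = 3 salts. The class name and the sample prefix are hypothetical.

```python
import hashlib

class DigestTable:
    """Toy Bloom filter: k salted hashes, each selecting one of 2**n bits."""

    def __init__(self, n=16, salts=(b"s0", b"s1", b"s2")):
        self.n = n
        self.salts = salts                       # one salt per digest (k = len(salts))
        self.bits = bytearray((1 << n) // 8)     # 2**n bits, all initially zero

    def _positions(self, packet_prefix):
        for salt in self.salts:
            h = hashlib.sha256(salt + packet_prefix).digest()
            yield int.from_bytes(h[:8], "big") % (1 << self.n)  # n-bit digest

    def add(self, packet_prefix):
        for pos in self._positions(packet_prefix):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, packet_prefix):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(packet_prefix))

table = DigestTable()
prefix = bytes(range(32))  # stand-in for masked header fields plus 8 payload bytes
table.add(prefix)
```

Only bit positions are stored, so the table never holds packet contents, and a membership test can return a false positive but never a false negative.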
27
SPIE Hashing Collisions

The figure to the right shows the fraction of
packets that collide as a function of prefix
length.
The WAN represents 985,150 packets between 6,031
host pairs collected at the University of Florida
OC-3 gateway.
The LAN trace consists of one million packets
between 2,879 pairs at the MIT Lab for Computer
Science.

LAN .139
WAN .00092
28
SPIE Hashing

When one Bloom Filter is full, the next one is
initialized and time tagged to record the receipt
of the next packet. Can be implemented as a
circular buffer of filters.
For security purposes, each SPIE Agent generates
a new set of k input vectors to the Bloom Filters
with each filter change
Based on a pseudo-random number generator
independently seeded at each router
These vectors are stored with the associated
filter.
SPIE never needs to record any packet payload
information
Digests can be recomputed from a packet's
invariant header fields and first 8 payload
bytes, given the stored input vectors
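
A minimal sketch of the rotating filters, assuming a fixed number of digests per filter and using a Python set of bit positions in place of a real bit array. The class name, the SHA-256 stand-in hash, and the tiny capacity are assumptions for illustration.

```python
import hashlib
import random
from collections import deque

class RotatingDigestStore:
    """Circular buffer of time-tagged filters, each with its own random salts."""

    def __init__(self, router_seed, capacity, num_filters, k=3, n=16):
        self.rng = random.Random(router_seed)     # seeded independently per router
        self.capacity = capacity                  # digests per filter before rotating
        self.k, self.n = k, n
        self.filters = deque(maxlen=num_filters)  # the oldest filter is overwritten
        self._new_filter(start_time=0.0)

    def _new_filter(self, start_time):
        salts = [self.rng.getrandbits(64).to_bytes(8, "big") for _ in range(self.k)]
        self.filters.append({"start": start_time, "salts": salts,
                             "bits": set(), "count": 0})

    @staticmethod
    def _positions(salts, prefix, n):
        return [int.from_bytes(hashlib.sha256(s + prefix).digest()[:8], "big") % (1 << n)
                for s in salts]

    def record(self, prefix, now):
        current = self.filters[-1]
        if current["count"] >= self.capacity:     # filter is full: start the next one
            self._new_filter(start_time=now)
            current = self.filters[-1]
        current["bits"].update(self._positions(current["salts"], prefix, self.n))
        current["count"] += 1

    def seen(self, prefix):
        """Return the start times of all filters that may contain the packet."""
        return [f["start"] for f in self.filters
                if all(p in f["bits"]
                       for p in self._positions(f["salts"], prefix, self.n))]

store = RotatingDigestStore(router_seed=42, capacity=2, num_filters=2)
for t, pkt in enumerate([b"pkt-a", b"pkt-b", b"pkt-c", b"pkt-d", b"pkt-e"]):
    store.record(pkt, now=float(t))
```

Storing the salts with each filter is what lets a later query recompute the right bit positions for that time window.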

29
SPIE Traceback Processing

The SPIE Traceback Manager controls the process
Cryptographically verifies the authenticity
and integrity of the traceback request message
Authorized requester
Packet ID
Victim
Approximate time of attack
Dispatches the request to the appropriate SPIE
Collection and Reduction Agents (SCARs)
SCARs poll their assigned Data Generation Agents
(DGAs)
DGAs poll their assigned routers
If the response from the targeted SCARs indicates
that other regional SCARs are involved in the
Trace, the STM sends another direct request
This loop continues until all branches terminate
Gathers the resulting attack graphs from the
(SCARs)
Assembles them into a Complete Attack Graph

30
SPIE Traceback Processing (Cont)

SPIE-enhanced routers hash the data received in
the Traceback Request to determine whether or not
the target message passes through the router
Computes k digests using the appropriate input
vectors
Checks for a 1 in each of the corresponding k
locations of the digest table near the target
time
If ALL associated bits are set, it is highly
likely that the packet's digests were stored.
The filter may, however, be saturated with too
many packets, producing a false positive.
This is controlled by limiting the number of
digests in each filter, depending upon Digest
Table size and the mean volume of packet traffic.
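
The trade-off between filter size and packet count can be estimated with the standard Bloom filter approximation (1 - e^(-kn/m))^k. The sketch below applies it to the nominal 256 Mb filter, with k = 3 digests per packet as an assumed value.

```python
import math

def bloom_false_positive_rate(m_bits, n_items, k_hashes):
    """Standard approximation of a Bloom filter's false positive rate."""
    return (1.0 - math.exp(-k_hashes * n_items / m_bits)) ** k_hashes

m = 256 * 2**20   # 256 Mb digest table, in bits
k = 3             # assumed number of digests per packet

for n in (10_000_000, 50_000_000, 100_000_000):
    rate = bloom_false_positive_rate(m, n, k)
    print(f"{n:>11,d} packets -> false positive rate {rate:.4f}")
```

The rate climbs quickly as more packets are packed into one filter, which is why the number of digests per filter is capped.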

31
SPIE Traceback Processing (Cont)
Reverse path flooding starts at the victim's
router (V) and proceeds backwards toward the
attacker (A). Solid arrows represent the attack
path. Dashed arrows are SPIE queries. Queries
are dropped by routers that did not forward the
packet in question.
Attack path: A → R2 → R5 → R7 → R9 → V
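
Reverse path flooding can be sketched as a breadth-first search that only continues through routers whose digest tables report the packet. The topology below is hypothetical, since the original figure is not reproduced here. Only the routers on the attack path come from the slide.

```python
from collections import deque

def reverse_path_flood(upstream, forwarded, victim_router):
    """Query upstream neighbours starting from the victim's router.

    upstream:  dict mapping each router to its upstream neighbours
    forwarded: set of routers whose digest tables report the packet
    Returns the attack graph as a list of (upstream, downstream) edges.
    """
    edges = []
    visited = {victim_router}
    queue = deque([victim_router])
    while queue:
        router = queue.popleft()
        for neighbour in upstream.get(router, []):
            if neighbour in visited:
                continue
            visited.add(neighbour)
            if neighbour not in forwarded:
                continue  # query is dropped: this router never forwarded the packet
            edges.append((neighbour, router))
            queue.append(neighbour)
    return edges

# Hypothetical topology around the attack path R2, R5, R7, R9, V.
upstream = {"V": ["R8", "R9"], "R9": ["R6", "R7"], "R7": ["R4", "R5"],
            "R5": ["R2", "R3"], "R2": ["R1"]}
forwarded = {"R2", "R5", "R7", "R9"}

edges = reverse_path_flood(upstream, forwarded, "V")
# For a single linear path, the hops can be read off the edges in reverse.
path = [edges[-1][0]] + [downstream for _, downstream in reversed(edges)]
```

The search stops at R2 because its only upstream neighbour did not record the packet, which places the attacker behind R2.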
32
SPIE Metrics

33
SPIE Implementation Issues

PRO
Single packet tracing is feasible
Automated processing by SPIE-enhanced routers
makes spoofing difficult, at best
Relatively low storage required
Only digests and time are stored
Does not aid in eavesdropping of payload data
Payload is not stored

CON
Requires specially configured (SPIE-enhanced)
routers.
Probability of detection is directly related to
the number of available SPIE-enhanced routers in
the network in question
Storage in routers is a limiting factor in the
window of time in which a packet may be
successfully traced
May consider some sort of filtering of packets to
be digested
May have the appearance of a loss of anonymity
across the Internet

34
Conclusions

DoS, worms, viruses continuously becoming more
dangerous
Attacks must be shut down quickly and be
traceable
Integrating traceback into next generation
Internet is critical

35
Conclusions

Flexible Deterministic Packet Marking
As fast as regular DPM, faster than PPM
Requires more overhead than DPM, but traces more
sources with less computational load
Hash-based Traceback
No packet overhead
New, more capable routers

36
Conclusions

Cooperation is required
Routers must be built to handle new tracing
protocols
ISPs must provide compliance with protocols
Internet is no longer anonymous
Some issues must still be solved
NATs
Collisions

37
References

Belenky, A., Ansari, N. IP Traceback with
Deterministic Packet Marking. IEEE
Communications Letters, April 2003.
Savage, S., et al. Practical Network Support
for IP Traceback. Department of Computer
Science, University of Washington.
Snoeren, A., Partridge, C., et al.
Single-Packet IP Traceback. IEEE/ACM
Transactions on Networking, December 2002.
Xiang, Y., Zhou, W. A Defense System Against
DDoS Attacks by Large-Scale IP Traceback. IEEE,
2005.