Traceback - PowerPoint PPT Presentation

1
Traceback
  • Pat Burke
  • Yanos Saravanos

2
Agenda
  • Introduction
  • Problem Definition
  • Traceback Methods
  • Packet Marking
  • Hash-based
  • Conclusion
  • References

3
Why Use Traceback?
  • SPAM
  • DoS
  • Insider attacks
  • Worms / Viruses
  • Code Red (2001) spreading at 8 hosts/sec
  • Slammer Worm (2003) spreading at 125 hosts/sec

4
Why Use Traceback?
  • It is currently very difficult to find spammers and
    virus authors
  • Easy to spoof IPs
  • No inherent tracing mechanism in IP
  • Blaster virus author left clues in code, was
    eventually caught
  • What if we could trace packets back to point of
    origin?

5
Packet Tracing
6
Benchmarks
  • Effect on throughput
  • Amount of overhead added to the packets
  • False positive rate
  • Computational intensity
  • Time required to trace an attack
  • Amount of data required to trace an attack
  • CPU/memory usage on router
  • Collisions (for hash-based traceback methods)

7
FDPM: Flexible Deterministic Packet Marking
8
Packet Marking
  • Add information to the packets so that paths can
    be retraced to original source
  • Methods for marking packets
  • Probabilistic (node marking or edge marking)
  • Deterministic

9
Probabilistic Packet Marking (PPM)
  • Using probability, router marks a packet
  • With router IP address (node marking)
  • With edge of paths (edge marking)
  • 95% accuracy requires 300,000 packets
  • PPM needs a large number of packets and creates
    very high computational load
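
The node-marking variant above can be sketched as a small simulation. This is only an illustration under stated assumptions: the router names, the marking probability `p`, and the single mark slot per packet are hypothetical choices, not values from the slides. It shows why routers far from the victim are seen rarely: a mark written d hops away survives only if no later router overwrites it.

```python
import random

def forward_packet(path, p, rng):
    """Send one packet along `path` (attacker side first).

    Each router overwrites the single mark slot with probability p,
    so the victim only sees the last router that chose to mark.
    """
    mark = None
    for router in path:
        if rng.random() < p:
            mark = router
    return mark

def packets_until_all_seen(path, p, rng, limit=1_000_000):
    """Count packets received before every router on the path has shown up
    in at least one mark. Returns None if `limit` is reached first."""
    seen = set()
    for count in range(1, limit + 1):
        mark = forward_packet(path, p, rng)
        if mark is not None:
            seen.add(mark)
        if len(seen) == len(path):
            return count
    return None

def survival_probability(p, d):
    """Chance that a packet arrives carrying the mark of the router d hops
    from the victim: that router marks (p) and the d-1 routers after it
    do not (1-p each)."""
    return p * (1 - p) ** (d - 1)
```

Calling `packets_until_all_seen(["R1", "R2", "R3"], 0.5, random.Random())` a few times gives a feel for how the packet count grows as the path gets longer or `p` moves away from its best value.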

10
PPM Nodes - Cons
  • Large number of false positives
  • Tracing a DDoS attack from 25 hosts takes several days
    and yields thousands of false positives
  • Slow convergence rate
  • 95% success requires 300,000 packets
  • Attacker can still inject modified packets into
    PPM network (mark spoofing)
  • This is only for a single attacker

11
Deterministic Packet Marking (DPM)
  • Every packet is marked
  • Spoofed marks are overwritten with correct marks

12
DPM
  • Incoming packets are marked
  • Outgoing packets are unaltered
  • Requires more overhead than PPM
  • Less computation required
  • Probability of generating ingress IP address:
    $(1-p)^{d-1}$

13
DPM Mark
  • 16-bit mark field
  • 1-bit flag field

14
DPM Mark Encoding
  • The ingress IP address is split into two 16-bit halves,
    told apart by a 1-bit flag
  • Each packet carries one of the two halves, chosen with
    probability of 0.5
  • Flag is set to 1 if the mark carries the higher-order
    bits
  • Can be made more secure by using non-uniform
    probability distributions
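
A minimal sketch of this encoding, assuming a single ingress address and a 17-bit mark made of a 1-bit flag plus a 16-bit value. The function names are hypothetical and the uniform 0.5 choice follows the slide; nothing here is taken from the authors' implementation.

```python
import ipaddress
import random

def make_mark(ingress_ip, rng):
    """Return (flag, value) for one packet entering the network.

    flag == 1 means `value` holds the higher-order 16 bits of the
    ingress address, flag == 0 means it holds the lower-order 16 bits.
    """
    addr = int(ipaddress.IPv4Address(ingress_ip))
    high, low = addr >> 16, addr & 0xFFFF
    if rng.random() < 0.5:
        return 1, high
    return 0, low

def recover_address(marks):
    """Rebuild the ingress address once both halves have been seen.

    `marks` is an iterable of (flag, value) pairs from one source.
    Returns None while a half is still missing.
    """
    halves = {}
    for flag, value in marks:
        halves[flag] = value
    if 0 in halves and 1 in halves:
        return str(ipaddress.IPv4Address((halves[1] << 16) | halves[0]))
    return None
```

For example, `recover_address([(1, 0xC0A8), (0, 0x0101)])` joins the two halves back into `192.168.1.1`. The sketch does not handle marks from several ingress points arriving at the same victim.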

15
DPM
  • Claimed to have 0 false positives
  • Claimed to converge very quickly
  • 99% probability of success with 7 packets
  • 99.9% probability of success with only 10 packets
  • Has not been tested on large networks
  • Cannot deal with NAT

16
Flexible DPM
  • DPM uses 17 bits in the IP header to store
    marking information
  • FDPM extends the mark with a variable-length part of
    the TOS (Type of Service) field
  • The TOS part is between 0 and 8 bits, so the mark is
    17-25 bits
  • Split ingress IP into k segments, send in
    separate packets
  • Segment numbers keep address order consistent
  • Reconstruction process recognizes packets from
    same source

17
FDPM Reconstruction
  • Mark Recognition
  • Store reconstruction packets in cache
  • Split IP header into fields to find mark length
  • Address Recovery
  • Analyze and store mark in recovery table
  • Different source IPs may have the same digest
    (hash value), so collisions may occur
  • On a collision, more than one entry is created
18
Flow-Based Marking
  • Mark packets selectively according to flow
    properties when router has heavy traffic
  • Reduce load on router while still marking
  • Packets are classified according to destination
    IP address
  • Uses flow thresholds
  • Lmax is the load threshold beyond which the router is
    overloaded (called the overload problem)
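
A rough sketch of threshold-driven selective marking. The slides do not say how flows are chosen once Lmax is exceeded, so the rule used here (mark only destinations whose packet count has passed a per-flow threshold) is an assumption, as are the names `FlowBasedMarker`, `l_max`, and `flow_threshold`.

```python
from collections import Counter

class FlowBasedMarker:
    def __init__(self, l_max, flow_threshold):
        self.l_max = l_max                    # router load limit (Lmax)
        self.flow_threshold = flow_threshold  # per-flow packet count limit
        self.flow_counts = Counter()          # packets seen per destination IP

    def should_mark(self, dst_ip, current_load):
        """Decide whether to mark one packet.

        Below Lmax every packet is marked. Above Lmax only packets in
        heavy flows (classified by destination IP) are marked, which
        reduces the marking work while still covering likely attack flows.
        """
        self.flow_counts[dst_ip] += 1
        if current_load <= self.l_max:
            return True
        return self.flow_counts[dst_ip] > self.flow_threshold
```

The point of the design is that marking effort is spent on the flows most likely to matter when the router cannot afford to mark everything.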

19
Traceable Sources
20
Overload Problem
  • Under heavy traffic, router can randomly mark
    packets
  • Both attack and normal packets may be marked
  • Flow-based marking as mentioned before is much
    more efficient

21
FDPM Performance
                        PPM     DPM      FDPM
Packets for Traceback   10^3    10^2     10^2
Traceable Sources       10^2    10^3     10^5
Computational Load      High    Medium   Low
Overload Prevention     None    None     Good
  • FDPM can trace many more sources with less
    computational load since it uses variable mark
    lengths
  • Incorporates overload prevention to keep router
    from failure
  • Requires more overhead (up to 25 bits instead of
    only 17)

22
HASH-BASED TRACEBACK: Source Path Isolation Engine (SPIE)
23
SPIE - Overview
  • Each router along a packet's transmission path
    computes a set of hash codes (digests) for each
    packet
  • The time-tagged digests are stored in
    router-memory for some time period
  • Limited by available router resources
  • Traceback is initiated only by authenticated
    agent requests to the SPIE Traceback Manager
    (STM)
  • Executed by means of a broadcast message
  • Results in the construction of a complete attack
    graph within the STM

24
SPIE - Assumptions
  • Packets may be addressed to multiple destinations
  • Attackers are aware they are being traced
  • Routers may be subverted, but not often
  • Routing within the network may be unstable
  • Traceback must deal with divergent paths
  • Packet size should not grow as a result of
    traceback
  • A 1-byte increase in size means a 1% increase in
    resource use
  • Very controversial self-enabling assumption
  • End hosts may be resource constrained
  • Traceback is an infrequent operation
  • Broadcast messages can have a significant impact
    on internet performance
  • Traceback should return entire path, not just
    source

25
SPIE - Architecture
  • DGA (Data Generation Agent): Resident in
    SPIE-enhanced routers to produce digests and
    store them in time-stamped digest tables.
    Implemented as software agents, interface
    cards, or dedicated auxiliary boxes.
  • STM (SPIE Traceback Manager): Controls the SPIE
    system. Verifies authenticity of a traceback
    request, dispatches the request to the
    appropriate SCARs, gathers regional attack
    graphs, and assembles the complete attack graph.
  • SCAR (SPIE Collection and Reduction Agent):
    Data concentration point for a regional area.
    When traceback is requested, SCARs initiate a
    broadcast request for traceback and produce
    regional attack graphs based upon data from
    constituent DGAs.
26
SPIE - Hashing
  • Multiple hash codes (digests over different
    groupings of fields) are calculated for each
    packet, based on 24 bytes of relatively invariant
    fields in the header plus the first 8 bytes of the
    payload.
  • For each packet received, SPIE computes k
    independent n-bit digests and sets the
    corresponding bits in the 2^n-bit digest table,
    a Bloom filter (nominally 256 Mb per filter).
  • Each filter holds the digests of many packets
    (approximately 50M packets per filter): as many
    as possible while still avoiding collisions.

Masked (gray) areas are NOT used in hash-code
calculation
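
The digest table described on this slide behaves like a standard Bloom filter, which can be sketched as below. The slides do not name the hash functions, so deriving k digests from SHA-256 with a per-digest index byte is purely an assumption, as are the class name `DigestTable` and the `seed` parameter. The caller is assumed to pass in the masked header bytes plus the first 8 payload bytes as `packet_prefix`.

```python
import hashlib

class DigestTable:
    def __init__(self, n_bits, k, seed=b""):
        self.n_bits = n_bits  # each digest is n bits wide
        self.k = k            # number of digests per packet (k < 256 here)
        self.seed = seed
        # 2^n one-bit slots, packed eight to a byte
        self.bits = bytearray(max(1, (1 << n_bits) // 8))

    def _digests(self, packet_prefix):
        """Yield k n-bit digests of the invariant packet bytes."""
        for i in range(self.k):
            h = hashlib.sha256(self.seed + bytes([i]) + packet_prefix).digest()
            yield int.from_bytes(h[:8], "big") % (1 << self.n_bits)

    def insert(self, packet_prefix):
        """Record a forwarded packet by setting its k bits."""
        for d in self._digests(packet_prefix):
            self.bits[d >> 3] |= 1 << (d & 7)

    def __contains__(self, packet_prefix):
        """True if all k bits are set: the packet was probably seen."""
        return all(self.bits[d >> 3] & (1 << (d & 7))
                   for d in self._digests(packet_prefix))
```

Only bit positions are kept, never packet bytes, which is why the table stays small. A membership test can return a false positive but never a false negative.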
27
SPIE Hashing Collisions
  • The figure to the right shows the fraction of
    packets that collide as a function of prefix
    length.
  • The WAN represents 985,150 packets between 6,031
    host pairs collected at the University of Florida
    OC-3 gateway.
  • The LAN trace consists of one million packets
    between 2,879 pairs at the MIT Lab for Computer
    Science.

Values labeled in the figure: LAN 0.139, WAN 0.00092
28
SPIE Hashing
  • When one Bloom Filter is full, the next one is
    initialized and time tagged to record the receipt
    of the next packet. Can be implemented as a
    circular buffer of filters.
  • For security purposes, each SPIE Agent generates
    a new set of k input vectors to the Bloom Filters
    with each filter change
  • Based on a pseudo-random number generator
    independently seeded at each router
  • These vectors are stored with the associated
    filter.
  • SPIE never needs to record any packet payload
    information
  • The first 8 bytes of the payload can be
    regenerated from the hashing, given the stored
    input vectors
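
The rotation scheme can be sketched as a bounded queue of time-tagged filters. Assumptions to note: the capacity rule (a filter is "full" after a fixed number of packets), the use of `os.urandom` for the per-filter input vectors, and SHA-256 as the hash are all illustrative choices, and the class names are hypothetical.

```python
import hashlib
import os
from collections import deque

class TimedFilter:
    def __init__(self, n_bits, k, start_time):
        self.n_bits = n_bits
        self.start_time = start_time  # time tag of the first packet
        self.count = 0
        # Fresh random input vectors, stored with the filter they belong to
        self.vectors = [os.urandom(8) for _ in range(k)]
        self.bits = bytearray(max(1, (1 << n_bits) // 8))

    def _positions(self, prefix):
        for v in self.vectors:
            h = hashlib.sha256(v + prefix).digest()
            yield int.from_bytes(h[:8], "big") % (1 << self.n_bits)

    def insert(self, prefix):
        for pos in self._positions(prefix):
            self.bits[pos >> 3] |= 1 << (pos & 7)
        self.count += 1

    def __contains__(self, prefix):
        return all(self.bits[p >> 3] & (1 << (p & 7))
                   for p in self._positions(prefix))

class FilterRing:
    def __init__(self, n_bits, k, capacity, max_filters):
        self.n_bits, self.k, self.capacity = n_bits, k, capacity
        # A bounded deque acts as the circular buffer: the oldest
        # filter is discarded when a new one is appended.
        self.filters = deque(maxlen=max_filters)

    def record(self, prefix, now):
        """Digest one packet, starting a new filter if the current one is full."""
        if not self.filters or self.filters[-1].count >= self.capacity:
            self.filters.append(TimedFilter(self.n_bits, self.k, now))
        self.filters[-1].insert(prefix)

    def lookup(self, prefix):
        """Return the time tags of the filters that probably hold the packet."""
        return [f.start_time for f in self.filters if prefix in f]
```

The `max_filters` bound is what limits how far back in time a trace can reach: once a filter falls off the ring, packets recorded in it can no longer be traced.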

29
SPIE Traceback Processing
  • The SPIE Traceback Manager controls the process
  • Cryptographically verifies the authenticity
    and integrity of the traceback request message
  • Authorized requester
  • Packet ID
  • Victim
  • Approximate time of attack
  • Dispatches the request to the appropriate SPIE
    Collection and Reduction Agents (SCARs)
  • SCARs poll their assigned Data Generation Agents
    (DGAs)
  • DGAs poll their assigned routers
  • If the response from the targeted SCARs indicates
    that other regional SCARs are involved in the
    Trace, the STM sends another direct request
  • This loop continues until all branches terminate
  • Gathers the resulting attack graphs from the
    (SCARs)
  • Assembles them into a Complete Attack Graph

30
SPIE Traceback Processing (Cont)
  • SPIE-enhanced routers hash the data received in
    the Traceback Request to determine whether or not
    the target message passes through the router
  • Computes k digests using the appropriate input
    vectors
  • Checks for a 1 in each of the corresponding k
    locations of the digest table near the target
    time
  • If ALL associated bits are set, it is highly
    likely that the packet was stored.
  • A filter saturated with too many packets can
    produce a false positive.
  • This is controlled by limiting the number of
    digests in each filter, depending upon Digest
    Table size and the mean volume of packet traffic.
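
The trade-off between table size, packet count, and false positives can be estimated with the usual Bloom filter approximation, (1 - e^(-kn/m))^k for m bits, n packets, and k digests. That formula is the textbook estimate, not something stated in the slides, and the function names below are hypothetical.

```python
import math

def false_positive_rate(m_bits, n_packets, k):
    """Estimated chance that a packet never stored still matches all k bits."""
    return (1.0 - math.exp(-k * n_packets / m_bits)) ** k

def max_packets(m_bits, k, target_rate):
    """Largest packet count that keeps the estimated rate at or below target.

    Inverts the estimate: n = -(m / k) * ln(1 - target^(1/k)).
    """
    return int(-(m_bits / k) * math.log(1.0 - target_rate ** (1.0 / k)))
```

A router could use `max_packets` to decide when to retire the current filter and start the next one, given its digest table size and an acceptable false-positive rate.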

31
SPIE Traceback Processing (Cont)
Reverse Path Flooding, starting at the victim's
router (V) and proceeding backwards toward the
attacker (A). Solid arrows represent the attack
path. Dashed arrows are SPIE queries. Queries
are dropped by routers that did not forward the
packet in question.
Attack path: A, R2, R5, R7, R9, V
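
Reverse path flooding can be sketched as a breadth-first search that only continues through routers whose digest tables match the packet. The topology format (`upstream` map), the `saw_packet` callback standing in for a digest-table lookup, and any router names beyond those on the slide's attack path are assumptions for illustration.

```python
from collections import deque

def reverse_path_flood(upstream, saw_packet, victim_router):
    """Walk backwards from the victim's router toward the attacker.

    upstream:   maps a router to the neighbors that can forward to it
    saw_packet: function(router) -> bool, the digest-table check
    Returns the attack graph as a list of (sender, receiver) edges.
    """
    edges = []
    visited = {victim_router}
    queue = deque([victim_router])
    while queue:
        router = queue.popleft()
        for neighbor in upstream.get(router, []):
            if neighbor in visited:
                continue
            # A neighbor that did not forward the packet drops the query.
            if saw_packet(neighbor):
                visited.add(neighbor)
                edges.append((neighbor, router))
                queue.append(neighbor)
    return edges

# Hypothetical topology around the slide's path R2, R5, R7, R9, V;
# R1, R3, R6, and R8 are invented off-path routers.
upstream = {"V": ["R9", "R8"], "R9": ["R7"], "R7": ["R5", "R6"],
            "R5": ["R2", "R3"], "R2": ["R1"]}
on_path = {"R2", "R5", "R7", "R9"}
attack_graph = reverse_path_flood(upstream, lambda r: r in on_path, "V")
```

Because queries stop at routers that never saw the packet, the search touches only the attack path and its immediate neighbors rather than the whole network.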
32
SPIE Metrics
33
SPIE Implementation Issues
  • PRO
  • Single packet tracing is feasible
  • Automated processing by SPIE-enhanced routers
    makes spoofing difficult, at best
  • Relatively low storage required
  • Only digests and time are stored
  • Does not aid in eavesdropping of payload data
  • Payload is not stored
  • CON
  • Requires specially configured (SPIE-enhanced)
    routers.
  • Probability of detection is directly related to
    the number of available SPIE-enhanced routers in
    the network in question
  • Storage in routers is a limiting factor in the
    window of time in which a packet may be
    successfully traced
  • May consider some sort of filtering of packets to
    be digested
  • May have the appearance of a loss of anonymity
    across the Internet

34
Conclusions
  • DoS attacks, worms, and viruses are continuously
    becoming more dangerous
  • Attacks must be shut down quickly and be
    traceable
  • Integrating traceback into next generation
    Internet is critical

35
Conclusions
  • Flexible Deterministic Packet Marking
  • As fast as regular DPM, faster than PPM
  • Requires more overhead than DPM, but traces more
    sources with less computational load
  • Hash-based Traceback
  • No packet overhead
  • Requires new, more capable routers

36
Conclusions
  • Cooperation is required
  • Routers must be built to handle new tracing
    protocols
  • ISPs must provide compliance with protocols
  • Internet is no longer anonymous
  • Some issues must still be solved
  • NATs
  • Collisions

37
References
  • Belenky, A., Ansari, N. IP Traceback with
    Deterministic Packet Marking. IEEE
    Communications Letters, April 2003.
  • Savage, S., et al. Practical Network Support
    for IP Traceback. Department of Computer
    Science, University of Washington.
  • Snoeren, A., Partridge, C., et al.
    Single-Packet IP Traceback. IEEE/ACM
    Transactions on Networking, December 2002.
  • Xiang, Y., Zhou, W. A Defense System Against
    DDoS Attacks by Large-Scale IP Traceback. IEEE,
    2005.