Modeling the Spread of Worms - PowerPoint PPT Presentation

About This Presentation
Title:

Modeling the Spread of Worms

Slides: 18
Provided by: wadet

Transcript and Presenter's Notes

Title: Modeling the Spread of Worms


1
Modeling the Spread of Worms
  • Wade Trappe

2
Overview
  • Quick discussion of how the Internet is
    organized.
  • Random Constant Spread (RCS) Model and Code-Red I
  • The Differential Equation
  • Solving it!
  • Observations
  • Improvements in worm design
  • Scanning Strategies

3
Internet Overview, pg.1
  • The Internet started as a research project
    connecting 4 computers in 1969, and has grown to
    connect over 100 million machines.
  • The Internet is
  • A loose collection of networks organized into a
    hierarchy through interconnection technologies.
  • At the local level machines are connected to each
    other (local area network), and to a router.
  • A router is a special-purpose device that
    transfers data to and from the next layer of the
    hierarchy.
  • Loose collection of networks organized into a
    multilevel hierarchy
  • 10-100 machines connected to a hub or a router
  • service providers also provide direct dialup
    access
  • or over a wireless link
  • 10s of routers on a department backbone
  • 10s of department backbones connected to campus
    backbone
  • 10s of campus backbones connected to regional
    service providers
  • 100s of regional service providers connected by
    national backbone
  • 10s of national backbones connected by
    international trunks

4
Internet Overview, Conceptual Picture
5
Internet Overview, pg. 2
  • Question: So, I want to send an email, how does
    it happen?
  • Answer: We use addresses, and route between
    addresses using the Internet Protocol (IP).
  • Your data is sent via packets, and the Internet
    employs a store-and-forward strategy when
    delivering them between nodes.
  • Packets consist of Meta-data (header) and the
    data (payload)
  • Metadata allows us to forward packets when we
    want
  • E.g. letters at a post office headed for main
    post office
  • address labels allow us to forward them in
    batches

6
Internet Overview, pg. 3
  • Internet addresses are called IP addresses.
  • Each one refers to a host interface (the device
    connecting the computer to the network); a host
    needs one IP address per interface.
  • Addresses are structured as a two-part hierarchy
  • network number
  • host number
  • Question: How many bits to assign to host number
    and how many to network number?
  • If many networks, each with a few hosts, then
    more bits to network number
  • And vice versa
  • In the end, IP addresses use one of three
    partitions of the 32 address bits:
  • class A: 8 bits network, 24 bits host
  • class B: 16 bits each
  • class C: 24 bits network, 8 bits host
  • Routing uses these addresses to deliver from a
    source to a destination.
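The class partitions above can be illustrated with a short sketch. It assumes the standard classful convention that the class is determined by the first octet (below 128 for A, below 192 for B, below 224 for C); the function names are hypothetical and chosen only for this example.

```python
def address_class(ip):
    """Return 'A', 'B', or 'C' for a dotted-quad IPv4 address, else 'other'."""
    first_octet = int(ip.split(".")[0])
    if first_octet < 128:
        return "A"
    if first_octet < 192:
        return "B"
    if first_octet < 224:
        return "C"
    return "other"  # outside the three classes listed on the slide


def split_address(ip):
    """Split an address into (network number, host number) by its class."""
    network_bits = {"A": 8, "B": 16, "C": 24}.get(address_class(ip))
    if network_bits is None:
        raise ValueError("address is not class A, B, or C: " + ip)
    octets = [int(part) for part in ip.split(".")]
    value = (octets[0] << 24) | (octets[1] << 16) | (octets[2] << 8) | octets[3]
    host_bits = 32 - network_bits
    return value >> host_bits, value & ((1 << host_bits) - 1)


if __name__ == "__main__":
    for addr in ("128.84.154.1", "202.141.64.30"):
        print(addr, address_class(addr), split_address(addr))
```

The two sample addresses are taken from the traceroute example in this deck: the first falls in class B and the second in class C.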

7
Internet Overview, pg. 4
  • An example of a message route:
  • traceroute henna.iitd.ernet.in
  • traceroute to henna.iitd.ernet.in (202.141.64.30), 30 hops max, 40 byte packets
  • 1 UPSON2-NP.CIT.CORNELL.EDU (128.84.154.1) 1 ms 1 ms 1 ms
  • 2 HOL1-MSS.CIT.CORNELL.EDU (132.236.230.189) 2 ms 3 ms 2 ms
  • 3 CORE1-MSS.CIT.CORNELL.EDU (128.253.222.1) 2 ms 2 ms 2 ms
  • 4 CORNELLNET1.CIT.CORNELL.EDU (132.236.100.10) 4 ms 3 ms 4 ms
  • 5 ny-ith-1-H1/0-T3.nysernet.net (169.130.61.9) 5 ms 5 ms 4 ms
  • 6 ny-ith-2-F0/0.nysernet.net (169.130.60.2) 4 ms 4 ms 3 ms
  • 7 ny-pen-1-H3/0-T3.nysernet.net (169.130.1.121) 21 ms 19 ms 16 ms
  • 8 sl-pen-21-F6/0/0.sprintlink.net (144.228.60.21) 16 ms 40 ms 36 ms
  • 9 core4-hssi5-0.WestOrange.mci.net (206.157.77.105) 20 ms 20 ms 24 ms
  • 10 core2.WestOrange.mci.net (204.70.4.185) 21 ms 34 ms 26 ms
  • 11 border7-fddi-0.WestOrange.mci.net (204.70.64.51) 21 ms 21 ms 21 ms
  • 12 vsnl-poone-512k.WestOrange.mci.net (204.70.71.90) 623 ms 639 ms 621 ms
  • 13 202.54.13.170 (202.54.13.170) 628 ms 629 ms 628 ms
  • 14 144.16.60.2 (144.16.60.2) 1375 ms 1349 ms 1343 ms
  • 15 henna.iitd.ernet.in (202.141.64.30) 1380 ms 1405 ms 1368 ms

8
Now Back to Worms
  • Someone who controls many nodes on the Internet
    can cause serious damage to the Internet.
  • It is feasible to gain control of millions of
    Internet hosts through worms.
  • Worms differ from viruses in that worms do not
    require human intervention to propagate. Viruses
    require user action (e.g. clicking an email
    attachment).
  • Pandurang gave an overview of worms, along with
    their history, in the previous lecture.
  • We will start with Code Red

9
Code Red
  • The Code Red Worm was initially released in July
    2001.
  • The worm spread by compromising Microsoft web
    servers using a vulnerability that had been
    discovered just a few weeks earlier.
  • Once a host was infected, Code Red would spread
    itself by launching 99 threads, each of which
    generated a random IP address and tried to infect
    that address using the same vulnerability.
  • The initial version of Code Red, CRv1, had a bug
    in the random number generator.
  • In the second version, CRv2, the bug was fixed.
    CRv2 contained a piece of code to perform a
    distributed denial of service attack on
    www.whitehouse.gov.

10
Random Constant Spread, pg. 1
  • Code Red spread very rapidly at first, until
    almost all vulnerable machines were compromised,
    then it seemed to slow down its spread.
  • The Random Constant Spread (RCS) is one model to
    describe this phenomenon.
  • Let N = the total number of vulnerable servers
    that can be corrupted/infected (assumed constant
    over time).
  • Let K = the initial compromise rate,
  • i.e. the number of vulnerable machines that an
    infected host can find and compromise per hour at
    the start (when few other hosts have been
    compromised).
  • K is a constant for a particular worm.
  • Assume that a compromised machine picks other
    machines at random, and that once a machine is
    infected it cannot be compromised again.
  • Let T be the time at which half the machines are
    infected.
  • Variables:
  • a = the proportion of vulnerable machines that
    have been infected (e.g. a = 1 means all N have
    been infected). The variable a changes with
    time t.
  • t = time in hours

11
Random Constant Spread, pg. 2
  • RCS is based upon the idea of logistic growth
  • The actual growth rate at a time t depends on the
    population
  • Suppose a(t) is the proportion of the N machines
    infected at time t, then there are a total of
    Na(t) machines that have been infected.
  • If we go from time t to time (t + dt), then a(t)
    will become a(t + dt) = a(t) + da.
  • da represents the change in the proportion a, and
    is an infinitesimal quantity (i.e. everything is
    in the limit).
  • So N da represents the total number of additional
    machines that will be infected in dt more time.
  • That's one way to calculate the number of
    additional machines that can be infected in dt
    time; we need one more way.

12
Random Constant Spread, pg. 3
  • Key Idea: Suppose I have 100 machines and I can
    infect K of those machines in one hour. Now,
    instead, suppose I have 80 machines, then how
    many can I infect in one hour?
  • Answer: 0.8 K
  • Now, suppose Na machines have been infected, then
    that leaves (1-a)N machines uninfected.
  • Question: When I had N infectible machines I
    could infect K machines. So, now I have (1-a)N
    infectible machines, how many can one machine
    infect?
  • Answer: (1-a)K
  • Next Issue: I can infect (1-a)K machines in 1
    hour, but what about in dt time? Answer:
    (1-a)K dt.
  • Final Issue: At time t I have a(t)N machines that
    can do the infecting, so how many will be
    infected in time dt?
  • Simple, but not completely accurate, answer:
    (Na)K(1-a) dt

13
Random Constant Spread, pg. 4
  • Let's put the two sides together:
  • N da = (Na)K(1-a) dt
  • So, how do we solve this? Answer: It's an easy
    first-order differential equation.
  • One way
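One standard route is separation of variables. The sketch below assumes that the constant of integration is written as -KT, which makes a(T) = 1/2 and so matches the definition of T as the time when half the machines are infected.

```latex
\begin{align*}
N\,da &= (Na)\,K\,(1-a)\,dt
  &&\Rightarrow\quad \frac{da}{dt} = K\,a\,(1-a) \\
\int \frac{da}{a(1-a)} &= \int K\,dt,
  &&\text{where}\quad \frac{1}{a(1-a)} = \frac{1}{a} + \frac{1}{1-a} \\
\ln\frac{a}{1-a} &= K\,(t - T) \\
a(t) &= \frac{e^{K(t-T)}}{1 + e^{K(t-T)}}
\end{align*}
```

Setting t = T gives a = 1/2, and a(t) tends to 1 as t grows, which is the logistic curve the model is named for.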

14
Random Constant Spread, pg. 5
  • Observations
  • For small t (before the first infection) there is
    no growth, but once infection begins, growth is
    initially exponential.
  • However, once significantly past T, growth slows
    again because we are running out of machines to
    infect.
  • See plot for an example.
  • These observations were confirmed in the real
    worm data.
  • Several hours before Code Red was due to
    terminate itself, it had slowed down due to the
    fact it had found the majority of infectible
    machines.
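These observations can be reproduced numerically. The sketch below integrates da/dt = K a (1 - a) with a forward Euler step and compares the result with the logistic closed form a(t) = e^{K(t-T)} / (1 + e^{K(t-T)}). The values K = 1.8 per hour and T = 12 hours are illustrative assumptions, not figures taken from these slides, and the function names are hypothetical.

```python
import math


def rcs_closed_form(t, k, t_half):
    """Logistic solution a(t) = e^{K(t-T)} / (1 + e^{K(t-T)})."""
    return 1.0 / (1.0 + math.exp(-k * (t - t_half)))


def rcs_euler(a0, k, hours, dt=0.0005):
    """Integrate da/dt = K a (1 - a) from a(0) = a0 using forward Euler."""
    a = a0
    for _ in range(int(round(hours / dt))):
        a += k * a * (1.0 - a) * dt
    return a


if __name__ == "__main__":
    K, T = 1.8, 12.0  # assumed illustrative values
    a0 = rcs_closed_form(0.0, K, T)
    for t in (0, 6, 10, 12, 14, 18, 24):
        print(t, round(rcs_closed_form(t, K, T), 4), round(rcs_euler(a0, K, t), 4))
```

Printing the two columns side by side shows the slow start, the rapid rise around T, and the flattening as a approaches 1.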

15
Random Constant Spread, pg. 6
  • What was wrong with the RCS model?
  • The problem lies in the simplified treatment of
    the probabilities involved.
  • The assumption that if aN machines are infected
    then (aN)(1-a)K machines will be infected in the
    next hour is wrong.
  • Randomly choosing an address might mean that you
    actually try to reinfect an already infected
    machine.
  • Or, by randomly choosing an address, two infected
    machines might try to infect the same machine.
  • Overall, the value of RCS is not its rigor, but
    the fact it reveals underlying principles and
    dynamics.
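The size of this error can be estimated with a small sketch. It assumes, as a simplification not stated in the slides, that each infected machine makes K uniformly random probes per hour among the N vulnerable machines, so that probes landing on already infected machines or on the same new victim are wasted. All function names are hypothetical.

```python
import random


def rcs_estimate(n, a, k):
    """New infections in the next hour according to RCS: (aN)(1-a)K."""
    return a * n * k * (1.0 - a)


def expected_new_infections(n, a, k):
    """Expected number of distinct uninfected machines hit by aNK random probes."""
    probes = a * n * k
    return (1.0 - a) * n * (1.0 - (1.0 - 1.0 / n) ** probes)


def simulate_hour(n, infected_count, k, seed=0):
    """One simulated hour: count distinct uninfected machines that get probed."""
    rng = random.Random(seed)
    hits = set()
    for _ in range(infected_count * k):
        target = rng.randrange(n)
        if target >= infected_count:  # machines 0..infected_count-1 are infected
            hits.add(target)
    return len(hits)


if __name__ == "__main__":
    n, k = 1000, 2
    for a in (0.01, 0.1, 0.5, 0.9):
        print(a, rcs_estimate(n, a, k),
              round(expected_new_infections(n, a, k), 1),
              simulate_hour(n, int(a * n), k))
```

Under these assumptions the RCS figure is close when a is small and overestimates more as probes start to collide, which is consistent with the point that the model's value lies in the dynamics it reveals rather than in its rigor.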

16
Better Worm Strategies
  • Localized Scanning
  • It takes more time to infect a node further away
    than one nearby.
  • Localized scanning seeks to balance the number of
    attempts a worm spends on infecting a nearby
    machine against the number spent on a random
    machine elsewhere on the Internet.
  • Strategy employed in Code Red II.
  • Hit List Scanning
  • We saw that worms take a while to get started,
    but once started they grow exponentially. How do
    we speed up the start?
  • Idea
  • Give the initial worm a list of high-potential
    targets.
  • Once it infects a machine on the hit-list, it
    splits the hit-list in half and gives half to
    child worm to use.
  • Child worms continue replicating and splitting
    hit list.
  • Advantages: the hit-list shrinks quickly, and the
    initial spread is very quick.

17
Better Worm Strategies, pg. 2
  • Permutation Scanning
  • One limitation of random scanning is that
    different nodes may try to infect the same
    machine, or infect an already infected machine.
  • Idea
  • Each worm gets a starting point of permutation
    space to work with.
  • Permutation Space is mapped to IP Address Space
    via a 32 bit cipher (with fixed key).
  • The worm goes along attempting to infect each
    machine in its region of permutation space. If it
    ever encounters a machine that has already been
    infected, it knows that its region will start
    overlapping another worm's region, so it chooses
    a new, random place to start from in permutation
    space.
  • Result: Worms end up trying to work on separate
    sections of permutation space.
  • Improvements: Enforced partitions of permutation
    space.
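The mapping from permutation space to address space can be sketched with a toy keyed permutation. The example below uses a small Feistel network as a stand-in for the 32-bit cipher; the round function, the key values, and all names are assumptions made for illustration and are not taken from any actual worm. A Feistel construction is convenient here because it is invertible whatever round function is used, so every index maps to exactly one address.

```python
def _round(half, key, mask):
    # Any deterministic mixing function works; invertibility comes from the
    # Feistel structure, not from this function.
    return ((half * 0x9E3B + key) ^ (half >> 3)) & mask


def permute(index, keys, half_bits=16):
    """Map an index in permutation space to a value in address space."""
    mask = (1 << half_bits) - 1
    left, right = index >> half_bits, index & mask
    for key in keys:
        left, right = right, left ^ _round(right, key, mask)
    return (left << half_bits) | right


def unpermute(value, keys, half_bits=16):
    """Inverse of permute: recover the index from the address value."""
    mask = (1 << half_bits) - 1
    left, right = value >> half_bits, value & mask
    for key in reversed(keys):
        left, right = right ^ _round(left, key, mask), left
    return (left << half_bits) | right


def scan_order(start, count, keys, half_bits=16):
    """Address values visited when walking count steps from a starting index."""
    size = 1 << (2 * half_bits)
    return [permute((start + i) % size, keys, half_bits) for i in range(count)]


def to_dotted(value):
    """Format a 32-bit value as a dotted-quad address."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


if __name__ == "__main__":
    KEYS = (0x1234, 0xBEEF, 0x0F0F, 0x7777)  # fixed key shared by all instances
    for value in scan_order(1000, 5, KEYS):
        print(to_dotted(value))
```

Because every instance shares the same key, two walkers that meet at the same index would visit identical addresses from that point on, which is why the scheme jumps to a fresh random starting point on encountering an infected machine.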