1
Internet Outbreaks: Epidemiology and Defenses
  • Geoffrey M. Voelker
  • Collaborative Center for
  • Internet Epidemiology and Defenses
  • Dept. of Computer Science & Engineering
  • University of California, San Diego
  • October 11, 2006
  • In collaboration with David Anderson, Jay Chen,
    Cristian Estan, Ranjit Jhala, Flavio Junqueira,
    Erin Kenneally, Justin Ma, John McCullough, David
    Moore, Vern Paxson (ICSI), Stefan Savage, Colleen
    Shannon, Sumeet Singh, Alex Snoeren, Stuart
    Staniford (Nevis), Amin Vahdat, Erik Vandekieft,
    George Varghese, Michael Vrable, Nick Weaver
    (ICSI), Qing Zhang

2
Paradise Lost
Our Goal: Develop the understanding and
technology to address large-scale subversion of
Internet hosts
3
Threat Transformation
  • Traditional threats
  • Attacker manually targets high-value
    system/resource
  • Defender increases cost to compromise high-value
    systems
  • Biggest threat: insider attacker
  • Modern threats
  • Attacker uses automation to target all systems at
    once (can filter later)
  • Defender must defend all systems at once
  • Biggest threats: software vulnerabilities & naïve
    users

4
Large-Scale Enablers
  • Unrestricted high-performance connectivity
  • Large-scale adoption of IP model for networks &
    apps
  • Internet is high-bandwidth, low-latency
  • The Internet succeeded!
  • Software homogeneity & user naiveté
  • Single bug → mass vulnerability in millions of
    hosts
  • Trusting users (ok) → mass vulnerability in
    millions of hosts
  • Lack of meaningful deterrence
  • Little forensic attribution/audit capability
  • Effective anonymity
  • No deterrence, minimal risk

5
Driving Economic Forces
  • Emergence of profit-making payloads
  • Spam forwarding (MyDoom.A backdoor, SoBig),
    Credit Card theft (Korgo), DDoS extortion, (many)
    etc
  • Virtuous economic cycle transforms nature of
    threat
  • Commoditization of compromised hosts
  • Fluid third-party exchange market (millions)
  • Going rate for spam proxying: 3-10
    cents/host/week
  • Seems small, but 25k botnet gets you $40k-130k/yr
  • Raw bots, $.01/host; special orders ($50)
  • Hosts effectively becoming a criminal platform
  • Innovation in both host substrate and its uses
  • Sophisticated infection and command/control
    networks
  • DDoS, SPAM, piracy, phishing, identity theft are
    all applications
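
The revenue figure on this slide can be checked with a few lines of arithmetic. This is a minimal sketch, assuming a flat weekly rate, every bot rented for all 52 weeks, and amounts in US dollars; the function name is illustrative only.

```python
WEEKS_PER_YEAR = 52

def annual_revenue_usd(bots: int, cents_per_host_week: float) -> float:
    """Yearly income from renting every bot at a flat weekly rate."""
    return bots * cents_per_host_week / 100 * WEEKS_PER_YEAR

# 25k botnet at the quoted 3-10 cents/host/week range.
low = annual_revenue_usd(25_000, 3)
high = annual_revenue_usd(25_000, 10)
print(f"${low:,.0f} to ${high:,.0f} per year")
```

The low end comes out slightly under the slide's rounded 40k figure.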

6
Botnet Spammer Rental Rates
> 20-30k always online SOCKs4, url is de-duped and
> updated every 10 minutes. $900/weekly, Samples
> will be sent on request.
> Monthly payments arranged at discount prices.
  • 3.6 cents per bot week

> $350.00/weekly - $1,000/monthly (USD)
> Always Online: 5,000 - 6,000
> Updated every: 10 minutes
  • 6 cents per bot week

> $220.00/weekly - $800.00/monthly (USD)
> Always Online: 9,000 - 10,000
> Updated every: 5 minutes
  • 2.5 cents per bot week

September 2004 postings to SpecialHam.com,
Spamforum.biz
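
The per-bot rates can be roughly reproduced from the three postings. The sketch below assumes the weekly prices are in US dollars and uses the midpoint of each advertised "always online" range as the botnet size; both are assumptions, and the slide's own figures appear to be rounded differently.

```python
def cents_per_bot_week(weekly_usd: float, bots_low: int, bots_high: int) -> float:
    """Weekly rental price divided by the midpoint of the advertised bot count."""
    midpoint = (bots_low + bots_high) / 2
    return weekly_usd / midpoint * 100

# (weekly price, low bot count, high bot count) for the three postings.
postings = [(900, 20_000, 30_000), (350, 5_000, 6_000), (220, 9_000, 10_000)]
rates = [round(cents_per_bot_week(*p), 1) for p in postings]
print(rates)
```

Under these assumptions the results land close to the 3.6, 6, and 2.5 cents quoted on the slide.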
7
Why Worms?
  • All of these applications depend on automated
    mechanisms for subverting large numbers of hosts
  • Self-propagating programs continue to be the most
    effective mechanism for host subversion
  • Prevent automated subversion → severely undermine
    phishing, DDoS, extortion, etc.
  • Our Goal: Develop the understanding and
    technology to address large-scale subversion of
    Internet hosts

8
Today
  • Worm outbreaks
  • What are we up against?
  • Framing the worm problem and solutions
  • What can we do?
  • Two worm detection and monitoring techniques
  • Fundamental basis for understanding of and
    defense against large-scale Internet attacks
  • EarlyBird: High-speed network-based content
    sifting
  • Potemkin: Large-scale high-fidelity honeyfarm
  • Summarize

9
Network Telescopes
  • Idea: Unsolicited packets are evidence of global
    phenomena
  • Backscatter: response packets sent by victims
    provide insight into global prevalence of DoS
    attacks (and who is getting attacked)
  • Scans: request packets can indicate an infection
    attempt from a worm (and who is currently infected,
    growth rate, etc.)
  • Very scalable: CCIED Telescope monitors 17M IP
    addrs (> 1% of all routable addresses of the
    Internet)
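
A telescope turns local observations into global estimates by scaling with the fraction of address space it covers. The sketch below assumes a worm that scans the full 32-bit IPv4 space uniformly at random, which is a simplifying assumption and not something the slide states; the function names are illustrative.

```python
IPV4_SPACE = 2 ** 32

def telescope_fraction(monitored_addrs: int, space: int = IPV4_SPACE) -> float:
    """Fraction of the scanned address space that the telescope observes."""
    return monitored_addrs / space

def estimate_global_scan_rate(observed_per_sec: float, fraction: float) -> float:
    """Scale the probe rate seen at the telescope up to the whole address space."""
    return observed_per_sec / fraction

def prob_seen(fraction: float, probes: int) -> float:
    """Chance that an infected host hits the telescope within `probes` random scans."""
    return 1.0 - (1.0 - fraction) ** probes

f = telescope_fraction(17_000_000)  # 17M monitored addresses, as on the slide
print(f"telescope covers {f:.4%} of IPv4 space")
print(f"P(host seen within 1000 probes) = {prob_seen(f, 1000):.3f}")
```

Because even a small fraction is hit quickly by a fast random scanner, a telescope of this size sees most infected hosts soon after they start scanning.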

10
Worm Outbreaks
  • CodeRed worm released in July 2001
  • Exploited buffer overflow in Microsoft IIS
  • Infects 360,000 hosts in 14 hours (CRv2)
  • Propagation is limited by latency of TCP handshake

Moore et al., CodeRed: a Case Study on the Spread
of an Internet Worm, IMW 2002, and Staniford et
al., How to 0wn the Internet in Your Spare Time,
USENIX Security 2002
11
Fast Worms
  • Slammer/Sapphire released in January 2003
  • First 1 min behaves like classic scanning worm
  • Doubling time of 8.5 seconds
  • >1 min: worm saturates access bandwidth
  • Some hosts issue > 20,000 scans/sec
  • Self-interfering
  • Peaks at 3 min
  • >55 million IP scans/sec
  • 90% of Internet scanned in <10 mins

Moore et al., The Spread of the Sapphire/Slammer
Worm, IEEE Security & Privacy, 1(4), 2003
12
Was Slammer really fast?
  • Yes, it was orders of magnitude faster than
    CodeRed
  • No, it was poorly written and unsophisticated
  • Who cares? It is literally an academic point
  • The current debate is whether one can get < 500ms
  • Bottom line: way faster than people!

Staniford et al, ACM WORM, 2004
13
Understanding Worms
  • Worms are well modeled as infectious epidemics
  • Homogeneous random contacts
  • Classic SI model
  • N: population size
  • S(t): susceptible hosts at time t
  • I(t): infected hosts at time t
  • β: contact rate
  • i(t) = I(t)/N, s(t) = S(t)/N

Staniford, Paxson, Weaver, How to 0wn the
Internet in Your Spare Time, USENIX Security 2002
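
With the definitions above, the classic SI model is usually written as di/dt = β·i·(1 − i), whose solution is a logistic curve. The sketch below evaluates the closed form and cross-checks it with a simple Euler integration; the parameter values (β = 1.8 per hour, one initial infection among 100,000 hosts) are hypothetical and chosen only for illustration.

```python
import math

def si_closed_form(beta: float, i0: float, t: float) -> float:
    """Infected fraction i(t) for di/dt = beta * i * (1 - i) with i(0) = i0."""
    growth = i0 * math.exp(beta * t)
    return growth / (1.0 - i0 + growth)

def si_euler(beta: float, i0: float, t_end: float, dt: float = 0.001) -> float:
    """Numerically integrate the same equation with fixed-step Euler updates."""
    i = i0
    for _ in range(int(round(t_end / dt))):
        i += beta * i * (1.0 - i) * dt
    return i

# Hypothetical worm: beta = 1.8 contacts/hour, 1 infected host out of 100,000.
for hours in (2, 4, 6, 8, 10):
    print(hours, round(si_closed_form(1.8, 1e-5, hours), 4))
```

The curve shows the characteristic shape: a long, quiet exponential phase followed by rapid saturation, which is why reaction time matters so much later in the talk.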
14
What Can We Do?
  • 1) Reduce number of susceptible hosts S(t)
  • Prevention
  • 2) Reduce number of infected hosts I(t)
  • Treatment
  • 3) Reduce the contact rate β
  • Containment
  • 4) Prepare for the inevitable
  • Survival

15
Prevention
  • Reduce number of susceptible hosts S(t)
  • Software quality: eliminate vulnerability
  • Static/dynamic testing, e.g., [Cowan, Wagner,
    Engler]
  • Active research community, taken seriously in
    industry
  • Security code review alone for Windows Server
    2003: $200M
  • Traditional problems: soundness, completeness,
    usability
  • Software updating: reduce window of vulnerability
  • Most worms exploit known vulnerability (10 days →
    6 months)
  • Sapphire: vulnerability and patch July 2002, worm
    January 2003
  • Some activity (Shield [Wang04]), yet critical
    problem
  • Is finding security holes a good idea?
    [Rescorla04]
  • Software heterogeneity: reduce impact of
    vulnerability
  • Artificial heterogeneity [Forrest02]
  • Exploit existing heterogeneity [Junqueira05]

16
Treatment
  • Reduce number of infected hosts I(t)
  • Disinfection: Remove worm from infected hosts
  • Develop specialized vaccine in real-time
  • Distribute at competitive rate
  • Counter-worm, anti-worm
  • Code Green, CRclean, Worm vs. Worm [Castaneda04]
  • Exploit vulnerability, patch host, propagate
  • Seems tough
  • Legal issues of using exploits, even if
    well-intentioned
  • Propagation race problem
  • Automatically patch vulnerability [Keromytis03,
    Sidiroglou05]
  • Auto-generate and test patches in sandbox
  • Apply within administration domain
  • Requires source, targets known exploits (e.g.,
    overflows)

17
Reactive Containment
  • Reduce contact rate β
  • Slow worm down
  • Throttle connection rate to slow spread
    [Twycross03]
  • Important capability, but worm still spreads
  • Quarantine
  • Detect and block worm
  • How feasible is it?
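
One way to picture connection throttling is a small working set of recently contacted destinations plus a delay queue that releases one new destination per time unit. The sketch below is a simplified illustration loosely in the spirit of such rate limiters; it is not the mechanism of the cited work, and the class name and the working-set size of 5 are assumptions.

```python
from collections import deque

class ConnectionThrottle:
    """Lets traffic to recent destinations through; delays new destinations."""

    def __init__(self, working_set_size: int = 5):
        # Recently released destinations; the oldest is evicted when full.
        self.working_set = deque(maxlen=working_set_size)
        self.delay_queue = deque()

    def request(self, dest: str) -> bool:
        """Return True if the connection may proceed now, False if it is queued."""
        if dest in self.working_set:
            return True
        self.delay_queue.append(dest)
        return False

    def tick(self):
        """Call once per time unit: release at most one queued destination."""
        if not self.delay_queue:
            return None
        dest = self.delay_queue.popleft()
        self.working_set.append(dest)
        # Drop other queued requests to the destination that was just released.
        self.delay_queue = deque(d for d in self.delay_queue if d != dest)
        return dest

throttle = ConnectionThrottle()
burst = [throttle.request(f"10.0.0.{i}") for i in range(100)]  # worm-like scan
released = [throttle.tick() for _ in range(10)]
print(len(released), "destinations released after 10 ticks")
```

A host that talks to a handful of servers is barely affected, while a scanner trying 100 new destinations is held to one per tick. As the slide notes, this slows the worm but does not stop it.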

18
Defense Requirements
  • Any reactive defense is defined by
  • Reaction time: how long to detect worm,
    propagate information, and activate response
  • Containment strategy: how malicious behavior is
    identified
  • Deployment scenario: who participates in the
    system
  • Given these, what are the engineering
    requirements for any effective defense?
  • Moore et al., Internet Quarantine:
    Requirements for Containing Self-Propagating
    Code, Infocom 2003

19
Containment Requirements
  • Universal deployment for Code Red
  • Address filtering (blacklists), must respond < 25
    mins
  • Content filtering (signatures), must respond < 3
    hours
  • For faster worms (Slammer): seconds
  • Worse for non-universal deployment
  • Bottom line: very challenging

[Figure: reaction time vs. propagation rate (probes/sec)]
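
The relationship between propagation rate and the available reaction time can be approximated by inverting the SI model's logistic solution. This is a rough sketch, not the simulation methodology of the cited paper: it assumes the simple SI model, takes the 8.5-second doubling time from the Slammer slide, and uses a hypothetical vulnerable population of 75,000 hosts and a hypothetical 10% infection threshold.

```python
import math

def beta_from_doubling_time(doubling_time: float) -> float:
    """Early-phase growth is ~exp(beta * t), so beta = ln(2) / doubling time."""
    return math.log(2) / doubling_time

def time_to_fraction(beta: float, i0: float, target: float) -> float:
    """Time for the infected fraction to grow from i0 to target in the SI model."""
    return math.log(target * (1.0 - i0) / (i0 * (1.0 - target))) / beta

slammer_beta = beta_from_doubling_time(8.5)  # per second
n_hosts = 75_000                             # hypothetical population size
budget = time_to_fraction(slammer_beta, 1 / n_hosts, 0.10)
print(f"about {budget:.0f} seconds until 10% of hosts are infected")
```

Under these assumptions a defense has on the order of a couple of minutes from the first infection, which is consistent with the slide's point that fast worms require reaction times measured in seconds.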
20
Survival
  • Prepare for inevitable
  • For the ones that surprise us
  • Approach: Informed replication
  • Worms represent large-scale dependent failures
  • Model software configurations → model dependent
    failures
  • Replicate data on hosts with disjoint
    configurations
  • Exploit existing software heterogeneity
  • Even with software skew, only need 3 replicas
  • Phoenix
  • Cooperative backup system using informed
    replication
  • Junqueira et al., Surviving Internet
    Catastrophes, USENIX 2005
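
The idea of replicating on hosts with disjoint configurations can be illustrated with a small greedy selection. This is a hypothetical sketch, not Phoenix's actual algorithm: each host is described by a set of software attributes, and replicas are picked until, for every attribute of the data owner, at least one replica does not share it. Host names and attribute strings are made up.

```python
def pick_replicas(owner, configs):
    """Greedily choose replica hosts so no single owner attribute is shared by all.

    Returns (chosen hosts, owner attributes that could not be covered).
    """
    uncovered = set(configs[owner])
    candidates = {h: set(c) for h, c in configs.items() if h != owner}
    chosen = []
    while uncovered and candidates:
        # Prefer the host that lacks the most still-uncovered owner attributes.
        best = max(sorted(candidates), key=lambda h: len(uncovered - candidates[h]))
        gain = uncovered - candidates[best]
        if not gain:
            break
        chosen.append(best)
        uncovered -= gain
        del candidates[best]
    return chosen, uncovered

configs = {
    "a": {"os:windows", "web:iis"},
    "b": {"os:windows", "web:apache"},
    "c": {"os:linux", "web:apache"},
    "d": {"os:linux", "web:iis"},
}
print(pick_replicas("a", configs))
```

With a fully disjoint host available, a single replica suffices; when only partially different hosts exist, a few replicas together cover every attribute, in line with the slide's observation that a small number of replicas is enough.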

21
Scalable Detection and Monitoring
  • Detection and monitoring are fundamental for
    understanding of and defense against worms
  • Lessons from containment
  • Need to detect worms in less than a second
  • How can we do this?
  • Know thy enemy
  • What does the worm/virus/bot do?
  • Who is controlling it?

22
Signature Inference
  • Challenge: In less than a second
  • Detect worm probes
  • Characterize worm packets with a byte signature
  • Approach
  • Monitor network, identify packets with common
    strings spreading like a worm
  • Use signature for content filtering

Packet Header
SRC: 11.12.13.14.3920  DST: 132.239.13.24.5000  PROT: TCP

Packet Payload (Content)
0110  90 90 90 90 FF 63 64 90 90 90 90 90 90 90 90 90  .....cd.........
0120  90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90  ................
0130  90 90 90 90 90 90 90 90 EB 10 5A 4A 33 C9 66 B9  ..........ZJ3.f.
0140  66 01 80 34 0A 99 E2 FA EB 05 E8 EB FF FF FF 70  f..4...........p
23
Content Sifting
  • Assume unique, invariant string W for all worm
    probes
  • Works today, but not forever
  • Consequences
  • Content prevalence: W more common in worm traffic
  • Address dispersion: traffic with W has many
    distinct src/dests
  • Content Sifting
  • Identify W with high prevalence and high
    dispersion
  • Use W as filter signature in network
  • Singh et al., Automated Worm Fingerprinting,
    OSDI 2004
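
The two tests above (prevalence and dispersion) can be sketched in a few lines. This is a naive illustration that keeps exact per-substring state, which is precisely what a high-speed implementation cannot afford; the substring length and thresholds are hypothetical, and the packet tuples are made-up examples.

```python
from collections import defaultdict

def content_sift(packets, substr_len=8, prevalence_min=3, dispersion_min=3):
    """Return substrings that are both prevalent and widely dispersed.

    packets: iterable of (src, dst, payload_bytes).
    """
    count = defaultdict(int)
    sources = defaultdict(set)
    dests = defaultdict(set)
    for src, dst, payload in packets:
        seen = set()
        for i in range(len(payload) - substr_len + 1):
            w = payload[i:i + substr_len]
            if w in seen:          # count each substring once per packet
                continue
            seen.add(w)
            count[w] += 1
            sources[w].add(src)
            dests[w].add(dst)
    return {
        w for w in count
        if count[w] >= prevalence_min
        and len(sources[w]) >= dispersion_min
        and len(dests[w]) >= dispersion_min
    }

worm = [(f"10.0.0.{i}", f"192.168.1.{i}", bytes([i]) * 4 + b"EVILCODE")
        for i in range(1, 4)]
benign = [("1.1.1.1", "2.2.2.2", b"GET /index.html")] * 5
suspects = content_sift(worm + benign)
print(suspects)
```

The repeated benign request is prevalent but travels between a single pair of hosts, so only the string shared by many distinct sources and destinations is reported.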

24
Content Sifting in EarlyBird
  • Challenges: Time and space
  • Must touch every byte in all packets (1 Gbps → 12
    us/packet)
  • Simple algorithm consumes 100 MB/s of memory
  • Approach: Careful algorithms and data structures
  • Incremental hash functions
  • Value-based sampling
  • Multi-stage filters and multi-resolution and
    counting bitmaps
  • Combined: 60 us/packet in software
  • Works well in practice
  • Deployed at UCSD CSE for 8 months
  • Detected every worm outbreak reported on security
    lists
  • Identified unknown worms (Kibvu, Sasser)
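
Two of the techniques listed above can be illustrated together: an incremental hash that updates in constant time as the window slides by one byte, and value-based sampling that keeps only windows whose hash has a fixed bit pattern. The sketch uses a generic polynomial rolling hash as a stand-in for whatever hash function the real system uses; the base, modulus, window length, and sampling rate are all assumptions.

```python
BASE = 257
MOD = (1 << 61) - 1

def direct_hash(window: bytes) -> int:
    """Polynomial hash of a whole window, computed from scratch."""
    h = 0
    for b in window:
        h = (h * BASE + b) % MOD
    return h

def rolling_hashes(payload: bytes, k: int):
    """Yield (offset, hash) for every k-byte window, O(1) work per new byte."""
    if len(payload) < k:
        return
    top = pow(BASE, k - 1, MOD)       # weight of the byte leaving the window
    h = direct_hash(payload[:k])
    yield 0, h
    for i in range(k, len(payload)):
        h = ((h - payload[i - k] * top) * BASE + payload[i]) % MOD
        yield i - k + 1, h

def value_sampled(payload: bytes, k: int, sample_bits: int = 6):
    """Keep only windows whose hash ends in `sample_bits` zero bits (~1 in 64)."""
    mask = (1 << sample_bits) - 1
    return [(off, h) for off, h in rolling_hashes(payload, k) if h & mask == 0]

payload = bytes(range(256)) * 2
windows = list(rolling_hashes(payload, 40))
print(len(windows), "windows,", len(value_sampled(payload, 40)), "sampled")
```

Because the sampling decision depends only on the content's hash, the same substring is either always tracked or never tracked, no matter which packet or offset it appears in. Random per-packet sampling would not have that property.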

25
Tech Transfer
  • Content sifting technologies patented by UC and
    licensed to startup, Netsift Inc.
  • Netsift significantly improved performance,
    features
  • Hardware implementation, new capabilities
  • In June 2005, Netsift was acquired by Cisco

26
Going Further
  • Network telescopes, content sifting have
    limitations
  • Passive observation, no interaction with malware
  • Lexical domain is limited
  • Evasion through polymorphism, protocol framing,
    encryption
  • Want to answer deeper questions
  • What does a worm/virus/bot do?
  • What vulnerabilities are exploited, and how?
  • Who is controlling it, how is it controlled?
  • Alternative: Endpoint monitoring

27
Active Responders
  • Problem: Telescopes are passive, cannot respond
    to TCP handshake
  • Is a SYN from a host infected by CodeRed or
    Welchia? Dunno.
  • What does the worm payload look like? Dunno.
  • Solution: Proxy responder
  • Stateless: TCP SYN/ACK (Internet Motion Sensor),
    per-protocol responders (iSink)
  • Stateful: Honeyd
  • Can differentiate and fingerprint payload
  • False positives generally low since no regular
    traffic

28
Honeypots
  • Problem: Poor fidelity
  • Do not know what worm/virus would do: no code
    downloaded
  • What bot code would be downloaded? Where from?
    What control channels?
  • Direct traffic to real infectable hosts
    (honeypots)
  • Individual hosts or VM-based: Collapsar,
    HoneyStat, Symantec
  • Can reduce false positives/negatives with host
    analysis (e.g., Vigilante, TaintCheck, Minos)
    and behavioral/procedural signatures
  • Challenges
  • Scalability ($$$)
  • Liability (grey legal areas)
  • Isolation (control for inter-malware competition)
  • Detection (VMware detection code in the wild)

29
Scalability/Fidelity Tradeoff
[Figure: spectrum from most scalable to highest fidelity: network telescopes (passive); telescopes + responders (iSink, honeyd, Internet Motion Sensor); VM-based honeynet (e.g., Collapsar); live honeypot]
30
Can We Achieve Both?
  • Naïve approach: one machine per IP address
  • 1M addresses = 1M hosts = $2B investment
  • Overkill: most resources will be wasted
  • In truth, only necessary to maintain the illusion
    of continuously live honeypot systems
  • Maintain illusion on the cheap using
  • Network multiplexing
  • Host multiplexing

31
Network Multiplexing
  • Most addresses are idle at any given time
  • Late bind honeypots to IP addresses
  • Most traffic does not cause an infection
  • Recycle honeypots if nothing interesting is
    detected
  • Only maintain honeypots of interest for extended
    periods
  • Scalability is related to both the workload and
    the recycling rate

32
Net Multiplexing Efficiency
Only need one honeypot for every 100-1000 IP
addresses
33
Host-level multiplexing
  • CPU utilization in each honeypot is quite low
    (<<1%)
  • Use VMM to multiplex honeypots on single machine
  • Done in practice, but limited by memory
    bottleneck
  • Exploit memory coherence property
  • Few memory pages are actually modified in input
  • Share unmodified pages among VMs
  • Scalability: function of unique memory per VM
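
The effect of page sharing on VM density is easy to see with back-of-the-envelope arithmetic. All numbers below (2 GB host, 128 MB reference image, 4 MB of unique pages per VM) are hypothetical and only illustrate why the unique memory per VM, not the image size, ends up determining scalability.

```python
def vms_without_sharing(host_mem_mb: int, image_mb: int) -> int:
    """Each VM needs a full private copy of the reference image."""
    return host_mem_mb // image_mb

def vms_with_sharing(host_mem_mb: int, image_mb: int, unique_mb_per_vm: int) -> int:
    """One shared read-only image; each VM only pays for the pages it modifies."""
    return (host_mem_mb - image_mb) // unique_mb_per_vm

print(vms_without_sharing(2048, 128))
print(vms_with_sharing(2048, 128, 4))
```

With these made-up figures, sharing unmodified pages raises the count from a handful of VMs to several hundred on the same machine.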

34
Host multiplexing efficiency
Only need one physical machine for every
100-1000 honeypots
35
Potemkin: A High-Fidelity, Large-Scale Honeyfarm
  • Gateway: Multiplexes traffic onto VM honeypots
  • Potemkin VMM: Multiplexes VMs on servers

Vrable et al., Scalability, Fidelity, and
Containment in the Potemkin Virtual Honeyfarm,
SOSP 2005.
36
Potemkin Gateway
  • Gateway terminates inbound GRE tunnels
  • Maintains external IP address → type mapping
  • 132.239.4.8 should be a Windows XP box w/IIS
    version 5, etc.
  • Mapping made concrete when packet arrives
  • Flow entry created and packet dispatched to
    type-compatible physical host
  • VMM on host creates new VM with target IP address
  • VM and flow mapping GC'd after system determines
    that an interaction is uninteresting (detectors)
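
The gateway's late binding can be pictured as a flow table that is filled in only when the first packet for an address arrives and emptied again when the VM is recycled. The sketch below is a simplified illustration, not the Potemkin implementation: flows are keyed by destination IP only, hosts are picked round-robin, and the class, type, and server names are hypothetical.

```python
class Gateway:
    """Binds external IP addresses to physical hosts on first contact."""

    def __init__(self, type_map, hosts_by_type):
        self.type_map = type_map            # external IP -> honeypot type
        self.hosts_by_type = hosts_by_type  # honeypot type -> physical hosts
        self.flows = {}                     # external IP -> bound physical host
        self._cursor = {}                   # round-robin position per type

    def dispatch(self, dst_ip):
        """Return (physical host, needs_new_vm) for a packet sent to dst_ip."""
        if dst_ip in self.flows:
            return self.flows[dst_ip], False
        vm_type = self.type_map[dst_ip]
        hosts = self.hosts_by_type[vm_type]
        idx = self._cursor.get(vm_type, 0)
        host = hosts[idx % len(hosts)]
        self._cursor[vm_type] = idx + 1
        self.flows[dst_ip] = host
        return host, True   # the VMM on `host` should create a VM for dst_ip

    def recycle(self, dst_ip):
        """Forget the binding once the interaction is judged uninteresting."""
        self.flows.pop(dst_ip, None)

gw = Gateway({"132.239.4.8": "winxp-iis5"},
             {"winxp-iis5": ["server1", "server2"]})
first = gw.dispatch("132.239.4.8")
second = gw.dispatch("132.239.4.8")
gw.recycle("132.239.4.8")
third = gw.dispatch("132.239.4.8")
print(first, second, third)
```

Only addresses with live flows consume a VM, so the number of VMs needed tracks the active workload and the recycling rate, not the size of the monitored address space.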

37
Potemkin VMM
  • Modified Xen using shadow translate mode
  • Integrated into VT for Windows support
  • Clone manager instantiates frozen VM image and
    keeps it resident in physical memory
  • Flash cloning: memory instantiated via eager copy
    of PTE pages and lazy faulting of data pages (no
    software startup)
  • Delta virtualization: copy implemented as
    copy-on-write (no memory overhead for shared
    code/data)
  • Supports hundreds of simultaneous VMs per host
  • Overhead: currently takes 200-500ms to create new
    VM
  • Imperceptible to human user and under TCP
    handshake timeout
  • Wildly unoptimized (e.g., includes multiple
    Python invocations)
  • Pre-allocated VMs can be invoked in 5ms

38
Containment
  • Key issue: 3rd party liability and contributory
    damages
  • Honeyfarm = worm accelerator
  • Worse: I knowingly allowed my hosts to be
    infected (premeditated negligence and outside
    best-practice safe harbor)
  • Export policy: tradeoffs between risk and fidelity
  • Block all outbound packets: no TCP connections
  • Only allow outbound packets to hosts that
    previously sent a packet: no outbound DNS, no
    botnet updates
  • Allow outbound, but scrub: is this a best
    practice?
  • In the end, need fairly flexible policy
    capabilities
  • Could do whole talk on interaction between
    technical & legal drivers
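
The first two export policies can be expressed as a small outbound filter. This is a minimal sketch with hypothetical names; it tracks remote peers by IP address only and leaves out the "allow but scrub" option, which would need payload inspection.

```python
from enum import Enum

class Policy(Enum):
    BLOCK_ALL = "block all outbound packets"
    REPLY_ONLY = "only reply to hosts that contacted the honeypot first"

class OutboundFilter:
    def __init__(self, policy: Policy):
        self.policy = policy
        self.contacted_by = {}  # honeypot IP -> remote IPs that sent it packets

    def record_inbound(self, remote: str, honeypot: str) -> None:
        self.contacted_by.setdefault(honeypot, set()).add(remote)

    def allow_outbound(self, honeypot: str, remote: str) -> bool:
        if self.policy is Policy.BLOCK_ALL:
            return False
        return remote in self.contacted_by.get(honeypot, set())

f = OutboundFilter(Policy.REPLY_ONLY)
f.record_inbound("11.12.13.14", "132.239.4.8")
print(f.allow_outbound("132.239.4.8", "11.12.13.14"))  # reply to the attacker
print(f.allow_outbound("132.239.4.8", "8.8.8.8"))      # e.g. a DNS lookup
```

Under the reply-only policy the handshake with the attacker can complete, but the honeypot cannot resolve names or fetch updates, which is the fidelity cost the slide points out.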

39
Summary
  • Internet hosts are highly vulnerable to worm
    outbreaks
  • Millions of hosts can be taken before anyone
    realizes
  • Supports vibrant ecosystem of criminal activity
  • Containment (Quarantine) requires automated
    response
  • Prevention is a critical element, but outbreaks
    inevitable
  • Need scalable detection, can also plan to survive
    (Phoenix)
  • Different detection strategies, monitoring
    approaches
  • High-speed network-based content sifting
    (EarlyBird)
  • Large-scale high-fidelity honeyfarm (Potemkin)
  • Smart bad guys still have a huge advantage
  • Escalation: Rapid innovation in both problems and
    solutions

40
Many Ongoing CCIED Projects
  • Worm forensics with honeyfarm
  • Network dynamics
  • Network and host support
  • Automated botnet identification and analysis
  • Self-moderating outbreaks (get 80% and stop)
  • Automated outbreak dynamics estimation
  • How many hosts are susceptible, when will
    outbreak peak, etc?
  • Automated vulnerability signature generation
  • At state-machine level (w/MSR)
  • Automated exploit behavioral categorization
  • Prevalence of polymorphism in exploits
  • Honeypot camouflage
  • Panel testing
  • Normalize traffic and replay against range of
    reference images

41
For More Info
  • http://www.ccied.org

43
Why Internet Epidemics?
  • Why is this a big issue now?
  • Didn't we have worms and viruses back in the 80s?
  • Isn't the real problem now [spyware | phishing |
    botnets | something new]?
  • Isn't all this Internet threat stuff overhyped?
  • My computer continues to work after all
44
What Service-Oriented Computing Really Means
45
CCIED (Seaside)
  • Joint NSF CyberTrust Center Project
  • Collaboration between UCSD and ICSI in Berkeley
  • 5-year effort: 10/2004-10/2009
  • Industrial support: Microsoft, Intel, HP, VMware,
    AT&T
  • Goal: Develop understanding and technology to
    address the threats of large-scale host
    compromise
  • DDoS, worms, virus, botnets, ...
  • Folks
  • Stefan Savage, George Varghese,
  • Geoff Voelker (UCSD)
  • Vern Paxson, Nick Weaver (ICSI)
  • 25 students and staff

46
Major Research Efforts
  • Internet Epidemiology: Understanding
  • What kinds of new attacks are going on?
  • How are they controlled? How do they behave? What
    are their side-effects? What are their limits?
  • Automated Network Defenses: Reacting
  • Stop new attacks without humans in the loop
  • Forensic, Legal and Economic issues: Deploying
  • What investigatory and evidentiary value can we
    provide with technology? What is the legal
    framework for safely using the technologies we
    developed? How do we create the proper
    incentives for defense deployment?

47
Internet Epidemiology
  • Inferring Internet Denial-of-Service Activity
  • 4,000 DoS attacks/week, everyone a victim,
    intense, periodic

Moore et al., Inferring Internet Denial of
Service Activity, USENIX Security, 2001
48
From Model to Defense
  • Prevention: Reduce the number of susceptible
    hosts
  • Reduce S(t) while I(t) is still small (ideally
    reduce S(0))
  • Software quality, wrappers, artificial
    heterogeneity, patching, known exploit blocking,
    hygiene enforcement
  • Treatment: Reduce the number of infected hosts
  • Reduce I(t) after the fact (clean up)
  • Counter worms, anti-worms, automatic patching
  • Containment: Reduce the contact rate
  • Reduce β while I(t) is still small
  • Proactive: Slow worm down (e.g., tarpits)
  • Reactive: Detect and block

49
Automated Network Defense
  • Internet Quarantine
  • Automated response, content filtering, global
    deployment

[Figures: content filtering deployed at the top 100 ISPs, infected (95th perc.) vs. reaction time (hours); reaction time vs. propagation rate (probes/sec)]
Moore, Shannon, Voelker, and Savage. Internet
Quarantine: Requirements for Containing
Self-Propagating Code. IEEE Infocom 2003
50
Worm Signature Inference
  • Challenge: Need to automatically learn a content
    signature for each new worm (in less than a
    second!)
  • Approach: Monitor network and look for strings
    common to traffic with worm-like behavior
  • Signatures can then be used for content filtering

51
Content Sifting
  • Assume there exists some (relatively) unique
    invariant bitstring W across all instances of a
    particular worm
  • Two consequences
  • Content Prevalence: W will be more common in
    traffic than other bitstrings of the same length
  • Address Dispersion: the set of packets containing
    W will address a disproportionate number of
    distinct sources and destinations
  • Content sifting: find Ws with high content
    prevalence and high address dispersion and drop
    that traffic

Singh, Estan, Varghese and Savage, Automated Worm
Fingerprinting, OSDI 2004
52
EarlyBird
  • Detected and automatically generated signatures
    for every known worm outbreak over eight months
  • Low latency (us), high-bandwidth (s/w at 200
    Mbps)
  • Transferred to startup NetSift, Inc. → Purchased
    by Cisco

53
Honeyfarms
  • Honeypots serve as live sandboxes for exploit
  • Examine active exploit in controlled environment
  • Multiple honeypots form a honeyfarm
  • Challenge: Want scalability & fidelity
  • Scalability: monitor 1 million IP addresses
  • Fidelity: new machine for each connection
  • Observation
  • In truth, only necessary to maintain the illusion
    of continuously live honeypot systems

54
Approach
  • Network-level multiplexing (10^2-10^3
    scalability)
  • Most addresses are idle at any given time
  • Late bind honeypots to IP addresses
  • Most traffic does not cause an infection
  • Recycle honeypots if can't detect anything
    interesting
  • Only maintain honeypots of interest for extended
    periods
  • Host-level multiplexing (another 10^2-10^3)
  • CPU utilization in each honeypot is quite low
    (<<1%)
  • Use VMM to multiplex honeypots on single machine
  • Done in practice, but limited by memory
    bottleneck
  • Memory coherence property
  • Few memory pages are actually modified in input
  • Share unmodified pages between VMs: copy-on-write

55
Yes, we can do Windows...
56
Overall challenges for honeyfarms
  • Depends on asynchronous input
  • What if they don't scan that range (smart bias)?
  • What if they propagate via e-mail, IM? (doable,
    but privacy issues)
  • Inherent tradeoff between liability exposure and
    detectability
  • Honeypot detection software exists; perfect
    virtualization tough (although we're working hard
    on it)
  • Resource exhaustion (from outbreak or DoS)
  • It doesn't necessarily reflect what's happening on
    your network (can't count on it for local
    protection)
  • Hence, there is a need for both honeyfarm and
    in-situ approaches