Title: UCSD Potemkin Honeyfarm
1. UCSD Potemkin Honeyfarm
- Jay Chen, Ranjit Jhala, Chris Kanich, Erin Kenneally, Justin Ma, David Moore, Stefan Savage, Colleen Shannon, Alex Snoeren, Amin Vahdat, Erik Vandekieft, George Varghese, Geoff Voelker, Michael Vrable
2. Network Telescopes
- An infected host scans for other vulnerable hosts by randomly generating IP addresses
- Network telescope: monitor a large range of unused IP addresses, which will receive scans from infected hosts
- Very scalable. UCSD monitors 17M addresses (a /8 plus /16s)
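The scalability argument can be made concrete with a little arithmetic. The sketch below assumes the scanner picks targets uniformly at random over the whole IPv4 space (the slide only says "randomly"); the function names are illustrative and the 17M figure is the one quoted above.

```python
TELESCOPE_ADDRESSES = 17_000_000  # figure quoted on the slide
IPV4_SPACE = 2 ** 32              # total IPv4 addresses


def hit_probability(telescope_size: int = TELESCOPE_ADDRESSES) -> float:
    """Chance that one uniformly random probe lands inside the telescope."""
    return telescope_size / IPV4_SPACE


def expected_probes_until_seen(telescope_size: int = TELESCOPE_ADDRESSES) -> float:
    """Mean number of probes before the first hit (geometric distribution)."""
    return 1.0 / hit_probability(telescope_size)


def prob_seen_within(probes: int, telescope_size: int = TELESCOPE_ADDRESSES) -> float:
    """Chance that at least one of `probes` independent probes hits the telescope."""
    p = hit_probability(telescope_size)
    return 1.0 - (1.0 - p) ** probes
```

Under these assumptions a single probe hits the telescope a bit under 0.4% of the time, so an infected host is expected to show up after a few hundred probes.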
3. Telescopes + Active Responders
- Problem: telescopes are passive, can't respond to TCP handshake
  - Is a SYN from a host infected by CodeRed or Welchia? Dunno.
  - What does the worm payload look like? Dunno.
- Solution: proxy responder
  - Stateless: TCP SYN/ACK (Internet Motion Sensor), per-protocol responders (iSink)
  - Stateful: Honeyd
  - Can differentiate and fingerprint payload
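A stateless responder can answer a SYN using only the fields of the SYN itself, which is enough to coax the sender into transmitting its first payload packet. The following is a minimal sketch of that idea, not the Internet Motion Sensor implementation; `TcpSegment` and `stateless_synack` are hypothetical names and the fixed initial sequence number is an assumption made for brevity.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class TcpSegment:
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    seq: int
    ack: int
    flags: FrozenSet[str]


def stateless_synack(syn: TcpSegment, isn: int = 0) -> Optional[TcpSegment]:
    """Build a SYN/ACK from the SYN alone; no per-connection state is stored."""
    if syn.flags != frozenset({"SYN"}):
        return None  # only pure SYNs are answered in this sketch
    return TcpSegment(
        src_ip=syn.dst_ip,            # swap endpoints
        dst_ip=syn.src_ip,
        src_port=syn.dst_port,
        dst_port=syn.src_port,
        seq=isn % 2 ** 32,            # responder's initial sequence number
        ack=(syn.seq + 1) % 2 ** 32,  # acknowledge the SYN (wraps at 2^32)
        flags=frozenset({"SYN", "ACK"}),
    )
```

Because nothing is remembered between packets, the responder scales like a passive telescope but can only elicit the first payload, which is the gap the stateful approach (Honeyd) fills.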
4. HoneyNets
- Problem: don't know what the worm/virus would do
  - No code ever executes, after all.
- Solution: redirect scans to real infectible hosts (honeypots)
  - Individual hosts or VM-based: Collapsar, HoneyStat, Symantec
  - Can reduce false positives/negatives with host analysis (e.g., TaintCheck, Vigilante, Minos) and behavioral/procedural signatures
- Challenges
  - Scalability
  - Liability (honeywall)
  - Isolation (2000 IP addrs -> 40 physical machines)
  - Detection (VMware detection code in the wild)
5. The Scalability/Fidelity tradeoff
[Figure: spectrum running from Most Scalable to Highest Fidelity. Points along it: Network Telescopes (passive); Telescopes + Responders (iSink, Internet Motion Sensor); VM-based Honeynet; Live Honeypot. A further label reads "Nada".]
6. Potemkin: A large-scale, high-fidelity honeyfarm
- Goal: emulate a significant fraction of Internet hosts (10M)
- Multiplex a large address space on a smaller number of servers
  - Temporal and spatial multiplexing
- "Scalability, Fidelity, and Containment in the Potemkin Virtual Honeyfarm", Vrable, Ma, Chen, Moore, VandeKieft, Snoeren, Voelker, and Savage, SOSP 2005
7. UCSD Honeyfarm Approach
- Make VMs very, very cheap
  - Create one (or more) VM per packet on demand
- Deploy many types of VM systems
  - Plethora of OSes, versions, configurations
- Monitor VM behavior
  - Decide benign or malicious
  - Benign: quickly terminate, recycle resources
  - Malicious: track propagation, save for offline analysis, etc.
- Assumes common case that most traffic is benign
- Key issues for remainder of talk
  - 1) Scaling
  - 2) Containment
8. Scaling
- Naïve approach: one machine per IP address
  - 1M addresses -> 1M hosts -> $2B investment
- However, most of these resources would be wasted
- Claim: should be possible to make do with 5-6 orders of magnitude less
9. Resulting philosophy
- Only commit the minimal resources needed, and only when you need them
- Address space multiplexing
  - Late-bind the assignment of IP addresses to physical machines (on-demand assumption of identity)
- Physical resource multiplexing
  - Multiple VMs per physical machine
  - Exploit memory coherence
  - Delta virtualization (allows 1000 VMs per physical machine)
  - Flash cloning (low-latency creation of on-demand VMs)
10. Address space multiplexing
- For a given unused address range and service time distribution, most addresses are idle
[Figure: /16 network, 500 ms service time]
- But most of these are horizontal port scans!
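Why most addresses sit idle follows from Little's law: the mean number of addresses in service equals the arrival rate times the mean service time. The sketch below uses the /16 size and 500 ms service time from the slide; the arrival rate is a made-up placeholder, not a measurement from the talk, and each arrival is assumed to target a distinct address.

```python
def concurrent_vms(arrivals_per_second: float, service_time_s: float) -> float:
    """Little's law: mean number in service = arrival rate * mean service time."""
    return arrivals_per_second * service_time_s


ADDRESSES_IN_SLASH16 = 2 ** 16   # 65,536 addresses
SERVICE_TIME_S = 0.5             # 500 ms, from the slide
ARRIVALS_PER_SECOND = 200.0      # hypothetical load, NOT from the talk

active = concurrent_vms(ARRIVALS_PER_SECOND, SERVICE_TIME_S)
idle_fraction = 1.0 - active / ADDRESSES_IN_SLASH16
print(f"{active:.0f} addresses busy, {idle_fraction:.2%} idle")
```

With these placeholder numbers only about 100 of the 65,536 addresses need a live VM at any instant, which is the headroom that late binding exploits.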
11. The value of scan filtering
- Heuristic: no more than one (srcip, dstport, protocol) tuple per 60 seconds
[Figure: plot with Max and Mean series]
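The filtering heuristic above can be sketched in a few lines. This is one plausible reading of it, not the Potemkin gateway code: a packet is admitted only if no packet with the same (srcip, dstport, protocol) tuple was admitted in the previous 60 seconds. The class name `ScanFilter` and the choice to measure the window from the last admitted packet are assumptions.

```python
from typing import Dict, Tuple

FlowKey = Tuple[str, int, str]  # (srcip, dstport, protocol)


class ScanFilter:
    """Admit at most one packet per (srcip, dstport, protocol) per window."""

    def __init__(self, window_s: float = 60.0) -> None:
        self.window_s = window_s
        self._last_admitted: Dict[FlowKey, float] = {}

    def admit(self, src_ip: str, dst_port: int, protocol: str, now: float) -> bool:
        key = (src_ip, dst_port, protocol)
        last = self._last_admitted.get(key)
        if last is not None and now - last < self.window_s:
            return False  # same scanner, same service, too soon: drop
        self._last_admitted[key] = now
        return True
```

Note that the destination address is deliberately absent from the key, so a horizontal scan sweeping one port across the whole range triggers only one VM per window.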
12. Implementation
- Gateway (Click-based) terminates inbound GRE tunnels
- Maintains external IP address -> type mapping
  - e.g., 132.239.4.8 should be a Windows XP box w/IIS version 5, etc.
- Mapping made concrete when packet arrives
  - Flow entry created and packet dispatched to type-compatible physical host
  - VMM on host creates new VM with target IP address
- VM and flow mapping GC'd after system determines that no state change has occurred
- Bottom line: 3 orders of magnitude savings
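The late-binding steps above can be sketched as a small flow table. This is an illustrative model only, not the Click configuration used in Potemkin: the `Gateway` class, the type label `winxp-iis5`, the host names, and the least-loaded placement rule are all assumptions, and flows are keyed by destination IP alone for brevity.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Gateway:
    type_of: Dict[str, str]               # external IP -> system type
    hosts_by_type: Dict[str, List[str]]   # system type -> compatible physical hosts
    flows: Dict[str, str] = field(default_factory=dict)  # bound IP -> physical host

    def dispatch(self, dst_ip: str) -> Optional[str]:
        """Bind dst_ip to a physical host on first packet; reuse the binding after."""
        if dst_ip in self.flows:
            return self.flows[dst_ip]
        candidates = self.hosts_by_type.get(self.type_of.get(dst_ip, ""), [])
        if not candidates:
            return None  # no type configured for this address
        # Assumed placement rule: pick the host with the fewest active bindings.
        host = min(candidates,
                   key=lambda h: sum(1 for v in self.flows.values() if v == h))
        self.flows[dst_ip] = host
        return host

    def reclaim(self, dst_ip: str, state_changed: bool) -> bool:
        """Garbage-collect the binding if the VM showed no state change."""
        if state_changed:
            return False  # keep the VM for analysis
        return self.flows.pop(dst_ip, None) is not None
```

A possible use: `Gateway({"132.239.4.8": "winxp-iis5"}, {"winxp-iis5": ["hostA", "hostB"]}).dispatch("132.239.4.8")` binds the address to one of the two hosts only at the moment the first packet arrives.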
13. Physical resource multiplexing
- Can create multiple VMs per host, but expensive
  - Memory: address spaces for each VM (100s of MB)
  - In principle, the limit for VMware is 64 VMs; the practical limit is less
  - Overhead: initializing a new VM is wasteful
- Claim: can support 100s-1000 VMs per host by specializing hosts and VMM
  - Specialize each host to a software type
  - Maintain reference image of an active system of that type
  - Flash cloning: instantiate new VMs by copying the reference image
  - Delta virtualization: share state copy-on-write (COW) for new VMs (state proportional to difference from reference image)
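The idea behind flash cloning and delta virtualization can be illustrated with a toy copy-on-write page store. This is a conceptual sketch in user-level Python, not how the Xen-based VMM manages memory; `ReferenceImage` and `CloneVM` are hypothetical names.

```python
from typing import Dict, List


class ReferenceImage:
    """Frozen memory image shared read-only by every clone."""

    def __init__(self, pages: List[bytes]) -> None:
        self.pages = tuple(pages)  # immutable: clones never modify it


class CloneVM:
    """A clone stores only the pages it has written (its delta)."""

    def __init__(self, image: ReferenceImage) -> None:
        self.image = image                # flash clone: share, do not copy
        self.delta: Dict[int, bytes] = {}

    def read(self, page_no: int) -> bytes:
        shared = self.image.pages[page_no]    # raises IndexError if out of range
        return self.delta.get(page_no, shared)

    def write(self, page_no: int, data: bytes) -> None:
        if not 0 <= page_no < len(self.image.pages):
            raise IndexError(page_no)
        self.delta[page_no] = data            # private copy made on first write

    def unique_pages(self) -> int:
        """Per-VM memory cost, proportional to divergence from the image."""
        return len(self.delta)
```

Creating a clone costs almost nothing up front, and a VM that is torn down before it writes anything never consumes private pages at all.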
14. How much unique memory does a VM need?
15. Potemkin VMM implementation
- Xen-based, using new shadow translate mode
  - New COW architecture being incorporated back into Xen (VT compatible)
- Clone manager instantiates frozen VM image and keeps it resident in physical memory
  - Flash clone memory instantiated via eager copy of PTE pages and lazy faulting of data pages (moving to lazy plus profile-driven eager pre-copy)
  - RAM disk or Parallax FS for COW disks
- Overhead: currently takes 300ms to create new VM
  - Highly unoptimized (e.g., includes Python invocation)
  - Goal: pre-allocated VMs can be invoked in 5ms
16. Containment
- Key issue: 3rd-party liability and contributory damages
  - Honeyfarm = worm accelerator
  - Worse, I knowingly allowed my hosts to be infected (premeditated negligence)
- Export policy: tradeoffs between risk and fidelity
  - Block all outbound packets: no TCP connections
  - Only allow outbound packets to hosts that previously sent a packet: no outbound DNS, no botnet updates
  - Allow outbound, but scrub: is this a best practice?
- In the end, need fairly flexible policy capabilities
  - Could do a whole talk on the interaction between technical and legal drivers
- But it gets more complex...
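The three export policies listed above can be modelled as a small decision function. This is a hedged sketch of the policy space, not the gateway's actual rule language; the names `ExportPolicy` and `Containment` are invented here, and "scrub" is left as an opaque verdict because the talk does not say what scrubbing involves.

```python
from enum import Enum
from typing import Set


class ExportPolicy(Enum):
    BLOCK_ALL = "block_all"            # nothing leaves the honeyfarm
    REPLY_ONLY = "reply_only"          # only to hosts that contacted us first
    ALLOW_SCRUBBED = "allow_scrubbed"  # everything leaves, after scrubbing


class Containment:
    def __init__(self, policy: ExportPolicy) -> None:
        self.policy = policy
        self.seen_sources: Set[str] = set()

    def record_inbound(self, src_ip: str) -> None:
        """Remember external hosts that have sent us a packet."""
        self.seen_sources.add(src_ip)

    def decide_outbound(self, dst_ip: str) -> str:
        """Return 'drop', 'forward', or 'scrub' for an outbound packet."""
        if self.policy is ExportPolicy.BLOCK_ALL:
            return "drop"
        if self.policy is ExportPolicy.REPLY_ONLY:
            return "forward" if dst_ip in self.seen_sources else "drop"
        return "scrub"
```

Under `REPLY_ONLY` a TCP handshake with the attacker can complete, but a DNS lookup or a botnet update to a third host is dropped, which matches the fidelity cost noted in the bullets.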
17. Internal reflection
- If an outbound packet is not permitted to reach the real Internet, it can be sent back through the gateway
  - New VM generated to assume target address (honeyfarm emulates external Internet)
  - Allows causal detection (A->B->C->D) and can dramatically reduce false positives
- However, creates new problem
  - Is there only one version of IP address A?
  - Yes: single universe inside honeyfarm
    - No isolation between infections
    - Also allows cross-contamination (liability rears its head again)
  - No: how are packets routed internally?
18. Causal address space aliasing
- A new packet i destined for address t creates a new universe U_it
- Each VM created by actions rooted at t is said to exist in the same universe, and a single export policy is shared
- In essence, the 32-bit IP address space is augmented with a universe-id that provides aliasing
- Universes are closed: no leaking
- What about symbiotic infections? (e.g., Nimda)
  - When a universe is created it can be made open to multiple outside influences
  - Common use: a fraction of all traffic is directed to a shared universe with draconian export rules
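Universe aliasing amounts to keying every VM by (universe-id, IP address) instead of IP address alone. The sketch below illustrates that with reflected packets; it is a simplified model under assumed names (`Honeyfarm`, `external_packet`, `reflect`) and ignores open or shared universes.

```python
import itertools
from typing import Dict, Tuple


class Honeyfarm:
    def __init__(self) -> None:
        self._next_universe = itertools.count(1)
        # (universe_id, ip) -> VM label; the same IP may exist once per universe
        self.vms: Dict[Tuple[int, str], str] = {}

    def external_packet(self, dst_ip: str) -> int:
        """A packet from outside opens a fresh universe rooted at dst_ip."""
        uid = next(self._next_universe)
        self.vms[(uid, dst_ip)] = f"vm-u{uid}-{dst_ip}"
        return uid

    def reflect(self, universe_id: int, dst_ip: str) -> str:
        """Route a contained outbound packet to a VM inside the same universe."""
        key = (universe_id, dst_ip)
        if key not in self.vms:
            self.vms[key] = f"vm-u{universe_id}-{dst_ip}"  # created on demand
        return self.vms[key]
```

Two infections that both try to reach the same address end up talking to different VMs, so one cannot contaminate the other, while traffic within a universe keeps hitting the same VM and preserves the causal chain.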
19. Overall challenges for honeyfarms
- Depends on worms scanning it
  - What if they don't scan that range (smart bias)?
  - What if they propagate via e-mail, IM? (doable, but privacy issues)
- Camouflage
  - Honeypot detection software exists; perfect virtualization is tough
- It doesn't necessarily reflect what's happening on your network (can't count on it for local protection)
- Hence, there is a need for both honeyfarm and in-situ approaches
20. Summary
- Potemkin: high-fidelity, scalable honeyfarm
  - Fidelity: new virtual host per packet
  - Scalability: 10M IP addresses -> 100 physical machines
- Approach
  - Address multiplexing: late-bind IPs to VMs (10^3:1)
  - Physical multiplexing: VM coherence, state sharing
    - Flash cloning: clone from reference image (milliseconds)
    - Delta virtualization: copy-on-write memory, disk (100 VMs per host)
- Containment
  - Risk vs. fidelity: rich space of export policies in gateway
- Challenges
  - Attracting attacks, camouflage, denial-of-service