Privacy-preserving collaborative network anomaly detection


1
Privacy-preserving collaborative network anomaly
detection
  • Haakon Ringberg

2
Unwanted network traffic
  • Problem
  • Attacks on resources (e.g., DDoS, malware)
  • Lost productivity (e.g., instant messaging)
  • Costs billions of USD every year
  • Goal: detect and diagnose unwanted traffic
  • Scale to large networks by analyzing summarized
    data
  • Greater accuracy via collaboration
  • Protect privacy using cryptography

3
Challenges with detection
  • Data volume
  • Some commonly used algorithms analyze IP packet
    payload info
  • Infeasible at edge of large networks

Network
4
Challenges with detection
Anomaly Detector
  • Data volume
  • Attacks deliberately mimic normal traffic
  • e.g., SQL injection, application-level DoS [1]

[Figure: an anomaly detector guards the network. Beasty: "Let me in!" Detector: "I'm not sure about Beasty."]
[1] Srivatsa, TWEB '08; [2] Jung, WWW '02
5
Challenges with detection
  • Data volume
  • Attacks deliberately mimic normal traffic
  • e.g., SQL injection, application-level DoS [1]
  • Is it a DDoS attack or a flash crowd? [2]
  • A single network in isolation may not be able to
    distinguish

Network
[1] Srivatsa, TWEB '08; [2] Jung, WWW '02
6
Collaborative anomaly detection
  • Bad guys tend to be around when bad stuff
    happens

CNN.com: "I'm just not sure about Beasty :-/"
FOX.com: "I'm just not sure about Beasty :-/"
7
Collaborative anomaly detection
  • Bad guys tend to be around when bad stuff
    happens
  • Targets (victims) could correlate
    attacks/attackers [1]

CNN.com and FOX.com: "Fool us once, shame on you. Fool us, we can't get fooled again!" [2]
[1] Katti, IMC '05; Allman, Hotnets '06; Kannan, SRUTI '06; Moore, INFOC '03
[2] George W. Bush
8
Corporations demand privacy
"I don't want FOX to know my customers"
  • Corporations are reluctant to share sensitive
    data
  • Legal constraints
  • Competitive reasons

CNN.com
FOX.com
9
Common practice
AT&T
Sprint
Every network for themselves!
10
System architecture
  • Snort-like system
  • Greater scalability
  • Provide as a service

AT&T
Sprint
  • Collaboration infrastructure
  • For greater accuracy
  • Protects privacy

N.B. collaboration could also be performed
between stub networks
11
Dissertation Overview

Topic | Providing | Technologies | Venue
Collaboration Infrastructure | Privacy of participants and suspects | Cryptography | Submitted ACM CCS '09
Detection at a single network | Scalable Snort-like IDS system | Machine Learning | Presented IEEE Infocom '09
Collaboration Effectiveness | Quantifying benefits of coll. | Analysis of Measurements | To be submitted
12
Chapter I: Scalable signature-based detection at
individual networks
  • Work with AT&T Labs
  • Nick Duffield
  • Patrick Haffner
  • Balachander Krishnamurthy

13
Background: packet and rule IDSes
[Figure: packet layout (IP header, TCP header, App header, Payload) at the edge of an enterprise network]
  • Intrusion Detection Systems (IDSes)
  • Protect the edge of a network
  • Leverage known signatures of traffic
  • e.g., Slammer worm packets contain MS-SQL (say)
    in payload
  • or AOL IM packets use specific TCP ports and
    application headers

14
Background: packet and rule IDSes
  • A predicate is a boolean function on a packet
    feature
  • e.g., TCP port = 80
  • A signature (or rule) is a set of predicates
  • Benefits
  • Leverage existing community
  • Many rules already exist
  • CERT, SANS Institute, etc
  • Classification for free
  • Accurate (?)

15
Background: packet and rule IDSes
  • A predicate is a boolean function on a packet
    feature
  • e.g., TCP port = 80
  • A signature (or rule) is a set of predicates
  • Drawbacks
  • Too many packets per second
  • Packet inspection at the edge requires deployment
    at many interfaces

Network
16
Background: packet and rule IDSes
  • A predicate is a boolean function on a packet
    feature
  • e.g., TCP port = 80
  • A signature (or rule) is a set of predicates
  • Drawbacks
  • Too many packets per second
  • Packet inspection at the edge requires deployment
    at many interfaces
  • DPI (deep-packet inspection) predicates can be
    computationally expensive
  • Example signature: packet has
  • Port number X, Y, or Z
  • Contains pattern "foo" within the first 20 bytes
  • Contains pattern "bar" within the last 40 bytes
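
A minimal sketch of the predicate and signature vocabulary used on these slides. The `Packet` type, the helper names, and the concrete ports standing in for "X, Y, or Z" are hypothetical, and treating a signature as the conjunction of its predicates is an assumption rather than something the slides state.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Packet:
    dst_port: int
    payload: bytes


# A predicate is a boolean function on one packet feature.
Predicate = Callable[[Packet], bool]


def port_in(*ports: int) -> Predicate:
    allowed = set(ports)
    return lambda pkt: pkt.dst_port in allowed


def contains_within_first(pattern: bytes, n: int) -> Predicate:
    # Payload predicate: scans only the first n bytes.
    return lambda pkt: pattern in pkt.payload[:n]


def contains_within_last(pattern: bytes, n: int) -> Predicate:
    # Payload predicate: scans only the last n bytes.
    return lambda pkt: pattern in pkt.payload[-n:]


def matches(signature: List[Predicate], pkt: Packet) -> bool:
    # Assumption: a signature fires only when all of its predicates hold.
    return all(pred(pkt) for pred in signature)


# Hypothetical stand-ins for "port X, Y, or Z", "foo", and "bar".
signature = [
    port_in(80, 8000, 8080),
    contains_within_first(b"foo", 20),
    contains_within_last(b"bar", 40),
]
```

The two payload predicates are the costly part: they need the packet bytes, which is what a flow record does not carry.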

17
Our idea: IDS on IP flows
  • How well can signature-based IDSes be mimicked on
    IP flows?
  • Efficient
  • Only fixed-offset predicates
  • Flows are more compact
  • Flow collection infrastructure is ubiquitous

src IP | dst IP | src Port | dst Port | Duration | Packets
A | B | | | 5 min | 36
  • IP flows capture the concept of a connection

18
Idea
  • IDSes associate a label with every packet
  • An IP flow is associated with a set of packets
  • Our system associates the labels with flows
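
One way to picture moving labels from packets to flows, as a rough sketch. The 4-tuple flow key and the rule that a flow inherits the union of its packets' labels are assumptions made for illustration; the slides do not spell out either.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# Hypothetical flow key: (src IP, dst IP, src port, dst port).
FlowKey = Tuple[str, str, int, int]


def label_flows(
    packets: Iterable[Tuple[FlowKey, Set[str]]]
) -> Dict[FlowKey, Set[str]]:
    """Collect per-packet IDS labels into per-flow labels.

    Each input item is (flow key of the packet, labels the IDS raised on it).
    Assumption: a flow carries every label raised on any of its packets.
    """
    flow_labels: Dict[FlowKey, Set[str]] = defaultdict(set)
    for key, labels in packets:
        flow_labels[key] |= labels
    return dict(flow_labels)
```

The resulting (flow, label) pairs are the kind of training data a flow-level classifier would need.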

19
Snort rule taxonomy
Header-only | Meta-information | Payload-dependent
Inspect only IP flow header | Inexact correspondence | Inspect packet payload
e.g., port numbers | e.g., TCP flags | e.g., contains "abc"
Meta-information rules rely on features that cannot be exactly reproduced in the IP flow realm
20
Simple translation
  • Our system associates the labels with flows
  • Simple rule translation would capture only flow
    predicates
  • Low accuracy or low applicability

Slammer Worm
  • Snort rule: dst port = MS SQL, contains "Slammer"
  • Only flow predicates: dst port = MS SQL

21
Machine Learning (ML)
  • Our system associates the labels with flows
  • Leverage ML to learn mapping from IP flow space
    to label
  • e.g., IP flow space: src port, packets, flow
    duration

[Figure: flows plotted by src port and packets; the label indicates whether the rule was raised or not]
22
Boosting
[Figure: weak learners h1, h2, and h3 are combined into Hfinal via a sign function]
Boosting combines a set of weak learners to
create a strong learner
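
A compact sketch of boosting with decision stumps as the weak learners. It assumes the textbook AdaBoost weighting, where the final classifier is the sign of a weighted vote; the slides do not say which boosting variant or weak learner the system actually uses, and all names here are hypothetical.

```python
import math
from typing import List, Tuple

# A stump is (feature index, threshold, polarity).
Stump = Tuple[int, float, int]


def stump_predict(stump: Stump, x: List[float]) -> int:
    feat, thresh, polarity = stump
    return polarity if x[feat] > thresh else -polarity


def best_stump(X, y, w) -> Tuple[Stump, float]:
    """Exhaustively pick the stump with the lowest weighted error."""
    best, best_err = None, float("inf")
    for feat in range(len(X[0])):
        for thresh in sorted({x[feat] for x in X}):
            for polarity in (1, -1):
                stump = (feat, thresh, polarity)
                err = sum(
                    wi for xi, yi, wi in zip(X, y, w)
                    if stump_predict(stump, xi) != yi
                )
                if err < best_err:
                    best, best_err = stump, err
    return best, best_err


def adaboost(X, y, rounds: int) -> List[Tuple[float, Stump]]:
    """Labels y must be +1 (alarm raised) or -1 (otherwise)."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump, err = best_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Raise the weight of misclassified examples, lower the others.
        w = [
            wi * math.exp(-alpha * yi * stump_predict(stump, xi))
            for xi, yi, wi in zip(X, y, w)
        ]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict(ensemble, x: List[float]) -> int:
    # H_final(x) = sign(sum of alpha_t * h_t(x))
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1
```

No single threshold can isolate an interval of feature values, but a few stumps voted together can, which is the sense in which weak learners combine into a strong one.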
23
Benefit of Machine Learning (ML)
Slammer Worm
  • Snort rule: dst port = MS SQL, contains "Slammer"
  • Only flow predicates: dst port = MS SQL
  • ML-generated rule: dst port = MS SQL, packet size = 404, flow duration
  • ML algorithms discover new predicates to capture
    rule
  • Latent correlations between predicates
  • Capturing same subspace using different dimensions

24
Evaluation
  • Border router on OC-3 link
  • Used Snort rules in place
  • Unsampled NetFlow v5 and packet traces
  • Statistics
  • One month, 2 MB/s average, 1 billion flows
  • 400k Snort alarms

25
Accuracy metrics
  • Receiver Operating Characteristic (ROC)
  • Full FP vs TP tradeoff
  • But need a single number
  • Area Under Curve (AUC)
  • Average Precision (AP)

An AP of $p$ corresponds to $(1 - p)/p$ FP per TP
26
Classifier accuracy
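Average precision by rule class:
Rule class | Week 1-2 | Week 1-3 | Week 1-4
Header rules | 1.00 | 0.99 | 0.99
Meta-information | 1.00 | 1.00 | 0.95
Payload | 0.70 | 0.71 | 0.70
An AP of 0.95 corresponds to 5 FP per 100 TP; an AP of 0.70 corresponds to 43 FP per 100 TP.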
  • Training on week 1, testing on week n
  • Minimal drift within a month
  • High degree of accuracy for header and meta
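
The "FP per 100 TP" annotations are consistent with reading a precision of p as (1 - p)/p false positives per true positive. The sketch below checks that arithmetic and also shows the usual ranked-list definition of average precision; the slides do not define AP, so that definition and both function names are assumptions.

```python
from typing import Sequence


def average_precision(ranked_hits: Sequence[bool]) -> float:
    """Mean of precision@k over the ranks k that hold a true positive.

    ranked_hits[i] is True when the i-th highest-scored flow is a real alarm.
    """
    hits, total = 0, 0.0
    for k, is_hit in enumerate(ranked_hits, start=1):
        if is_hit:
            hits += 1
            total += hits / k
    return total / hits if hits else 0.0


def fp_per_100_tp(p: float) -> float:
    """False positives per 100 true positives at precision p."""
    if not 0.0 < p <= 1.0:
        raise ValueError("precision must be in (0, 1]")
    return 100.0 * (1.0 - p) / p


print(round(fp_per_100_tp(0.95)))  # 5
print(round(fp_per_100_tp(0.70)))  # 43
```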

27
Variance within payload group
Rule | Average Precision
MS-SQL version overflow | 1.00
ICMP PING speedera | 0.82
NON-RFC HTTP DELIM | 0.48
  • Accuracy is a function of correlation between
    flow and packet-level features

28
Computational efficiency
Our prototype can support OC48 (2.5 Gbps) speeds
  • Machine learning (boosting)
  • 33 hours per rule for one week of OC48
  • Classification of flows
  • 57k flows/sec on a 1.5 GHz Itanium 2
  • Line rate classification for OC48

29
Chapter II: Evaluating the effectiveness of
collaborative anomaly detection
  • Work with
  • Matthew Caesar
  • Jennifer Rexford
  • Augustin Soule

30
Methodology
  1. Identify attacks in IP flow traces
  2. Extract attackers
  3. Correlate attackers across victims

31
Identifying anomalous events
  • Use existing anomaly detectors [1]
  • IP scans, port scans, DoS
  • e.g., IP scan is more than n IP addresses
    contacted
  • Minimize false positives
  • Correlate with DNS BL
  • IP addresses exhibiting open proxy or spambot
    behavior
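
A rough sketch of the kind of threshold detector described here, followed by the blacklist cross-check used to limit false positives. The function names and the flow representation are hypothetical; the detectors actually used are the ones cited on the slide.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def find_ip_scanners(flows: Iterable[Tuple[str, str]], n: int) -> Set[str]:
    """Return sources that contacted more than n distinct destination IPs.

    flows: (src IP, dst IP) pairs taken from flow records.
    """
    contacted: Dict[str, Set[str]] = defaultdict(set)
    for src, dst in flows:
        contacted[src].add(dst)
    return {src for src, dsts in contacted.items() if len(dsts) > n}


def confirmed_by_blacklist(suspects: Set[str], blacklist: Set[str]) -> Set[str]:
    # Keep only suspects that also appear on the DNS blacklist.
    return suspects & blacklist
```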

[1] Allan, IMC '07; Kompella, IMC '04
32
Cooperative blocking
Beasty is very bad!
  • A set S of victims agree to participate
  • Beasty is blocked following initial attack
  • Subsequent attacks by Beasty on members of S
    are deemed ineffective

33
DHCP lease issues
[Figure: who currently holds 10.0.0.1?]
  • Dynamic address allocation
  • IP address first owned by Beasty
  • Then owned by innocent Tweety
  • Should not block Tweety's innocuous queries

34
DHCP lease issues
  • Update DNS BL hourly
  • Block IP addresses for a period shorter than
    most DHCP leases [1]
  • Dynamic address allocation
  • IP address first owned by Beasty
  • Then owned by innocent Tweety
  • Should not block Tweety's innocuous queries

[1] Xie, SIGC '07
35
Methodology
  • IP flow traces from Géant
  • DNS BL to limit FP
  • Cooperative blocking of attackers for a fixed number of hours (the blacklist duration)
  • Metric is fraction of potentially mitigated flows
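
The metric can be illustrated with a small replay over attack records. This is a sketch under stated assumptions: the record format, the function name, and the choices that only unmitigated attacks start a block and that the denominator counts attacks on participants only are mine, not taken from the slides.

```python
from typing import Dict, Iterable, Set, Tuple


def mitigated_fraction(
    attacks: Iterable[Tuple[float, str, str]],
    participants: Set[str],
    block_hours: float,
) -> float:
    """Fraction of attacks on participants that an active block would stop.

    attacks: (time in hours, attacker, victim) records.
    An attacker is blocked for block_hours after an attack that got through.
    """
    blocked_until: Dict[str, float] = {}
    total = mitigated = 0
    for t, attacker, victim in sorted(attacks):
        if victim not in participants:
            continue  # non-participants neither report nor benefit
        total += 1
        if blocked_until.get(attacker, float("-inf")) > t:
            mitigated += 1  # attacker is already on the shared blacklist
        else:
            blocked_until[attacker] = t + block_hours
    return mitigated / total if total else 0.0
```

Sweeping `block_hours` or the size of `participants` over a trace gives curves of the kind the following slides discuss.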

36
Blacklist duration parameter
  • Collaboration between all hosts
  • Majority of benefit can be had with a small blacklist duration

37
Number of participating victims
  • Randomly selecting n victims to collaborate in
    scheme
  • Reported number is the average of 10 random selections

38
Number of participating victims
  • Collaboration between most victimized hosts
  • Attackers are more likely to continue to engage in
    bad action x than in a random other action

39
Chapter conclusion
  • Repeat-attacks often occur within one hour
  • Substantially less than average DHCP lease
  • Collaboration can be effective
  • Attackers contact a large number of victims
  • 10k random hosts could mitigate 50%
  • Some hosts are much more likely victims
  • Subsets of victims can see great improvement

40
Chapter III: Privacy-preserving collaborative
anomaly detection
  • Work with
  • Benny Applebaum
  • Matthew Caesar
  • Michael J Freedman
  • Jennifer Rexford

41
Privacy-Preserving Collaboration
  • Protect privacy of
  • Participants: do not reveal who suspected whom
  • Suspects: only reveal suspects upon correlation

[Figure: CNN, FOX, and Google each submit encrypted data E( ) for secure correlation]
42
System sketch
  • Trusted third party is a point of failure
  • Single rogue employee
  • Inadvertent data leakage
  • Risk of subpoena

[Figure: MSFT, Google, FOX, and CNN submit data to a central secure correlation party]
43
System sketch
  • Trusted third party is a point of failure
  • Single rogue employee
  • Inadvertent data leakage
  • Risk of subpoena
  • Fully distributed is impractical
  • Poor scalability
  • Liveness issues

[Figure: MSFT, Google, FOX, and CNN as fully distributed peers]
44
Split trust
  • Recall
  • Participant privacy
  • Suspect privacy

[Figure: clients CNN and FOX, the proxy, and the DB]
  • Managed by separate organizational entities
  • Honest but curious proxy, DB, participants
    (clients)
  • Secure as long as proxy and DB do not collude

45
Protocol outline
  • Recall
  • Participant privacy
  • Suspect privacy
  • Clients send suspect IP addrs (x)
  • e.g., x = 127.0.0.1
  • DB releases IPs above threshold

[Figure: the Client / Participant sends x to the Proxy, which forwards x to the DB. DB table (value, count): (1, 23), (3, 2).]
But this violates suspect privacy!
46
Protocol outline
  • Recall
  • Participant privacy
  • Suspect privacy
  1. Clients send suspect IP addrs (x)
  2. DB releases IPs above threshold

[Figure: the Client / Participant sends H(x), a hash of the IP address, to the Proxy, which forwards H(x) to the DB. DB table (value, count): (1, 23), (3, 2).]
Still violates suspect privacy!
47
Protocol outline
  • Recall
  • Participant privacy
  • Suspect privacy
  • Clients send suspect IP addrs (x)
  • IP addrs blinded w/Fs(x)
  • Keyed hash function (PRF)
  • Key s held only by proxy
  • DB releases IPs above threshold
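
A small sketch of why a keyed hash lets the DB count matching reports without seeing addresses. HMAC-SHA256 is used here only as a readily available stand-in for the keyed hash Fs; the key value, the class, and the `min_reports` parameter are hypothetical. In the design on these slides the key stays with the proxy, so clients would not compute Fs locally as this illustration does.

```python
import hashlib
import hmac
from collections import Counter
from typing import Set


def blind_ip(key: bytes, ip: str) -> bytes:
    """Stand-in for Fs(x): deterministic per key, opaque without the key."""
    return hmac.new(key, ip.encode("ascii"), hashlib.sha256).digest()


class CorrelationDB:
    """Counts reports per blinded value; never sees raw IP addresses."""

    def __init__(self, min_reports: int) -> None:
        self.min_reports = min_reports
        self.counts: Counter = Counter()

    def report(self, blinded_ip: bytes) -> None:
        self.counts[blinded_ip] += 1

    def release(self) -> Set[bytes]:
        # Blinded values reported at least min_reports times.
        return {b for b, n in self.counts.items() if n >= self.min_reports}
```

An unkeyed hash would not be enough here, since the IPv4 space is small enough for the DB to hash every address and invert the table.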

[Figure: the Client / Participant sends Fs(x), a keyed hash of the IP address, to the Proxy, which forwards Fs(x) to the DB. DB table (value, count): (Fs(1), 23), (Fs(3), 2).]
Still violates suspect privacy!
48
Protocol outline
  • Recall
  • Participant privacy
  • Suspect privacy
  • Clients send suspect IP addrs (x)
  • IP addrs blinded w/EDB(Fs(x))
  • Keyed hash function (PRF)
  • Key s held only by proxy
  • DB releases IPs above threshold

[Figure: the Client / Participant sends EDB(Fs(x)), an encrypted keyed hash of the IP address, to the Proxy; the DB obtains Fs(x). DB table (value, count): (Fs(1), 23), (Fs(3), 2).]
But how do clients learn EDB(Fs(x))?
49
Protocol outline
  • Recall
  • Participant privacy
  • Suspect privacy
  • Clients send suspect IP addrs (x)
  • IP addrs blinded w/EDB(Fs(x))
  • Keyed hash function (PRF)
  • Key s held only by proxy
  • EDB(Fs(x)) learned through secure function
    evaluation
  • DB releases IPs above threshold

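[Figure: the Client / Participant holds x and the Proxy holds s; the client learns EDB(Fs(x)) and sends it on; the DB obtains Fs(x). DB table (value, count): (Fs(1), 23), (Fs(3), 2).]
Possible to reveal IP addresses at the end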
50
Protocol summary
Client
  • Clients send suspect IPs
  • Learns Fs(x) using secure function evaluation
  • Proxy forwards to DB
  • Randomly shuffles suspects
  • Re-randomizes encryptions
  • DB correlates using Fs(x)
  • DB forwards bad IPs to proxy
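
To illustrate what re-randomizing an encryption means, here is a toy sketch that assumes an ElGamal-style scheme (ElGamal appears among the evaluated primitives, but the slides do not say which scheme the proxy re-randomizes). The group parameters are deliberately tiny and insecure, and all function names are hypothetical.

```python
import secrets

# Toy group: P = 2*Q + 1 with both prime; G generates the order-Q subgroup.
P, Q, G = 2039, 1019, 4


def keygen(secret: int) -> int:
    return pow(G, secret, P)


def encrypt(pub: int, m: int, r: int) -> tuple:
    # ElGamal ciphertext (g^r, m * pub^r)
    return pow(G, r, P), (m * pow(pub, r, P)) % P


def rerandomize(pub: int, ct: tuple, r2: int) -> tuple:
    # Multiply in a fresh encryption of 1; the plaintext is unchanged.
    c1, c2 = ct
    return (c1 * pow(G, r2, P)) % P, (c2 * pow(pub, r2, P)) % P


def decrypt(secret: int, ct: tuple) -> int:
    c1, c2 = ct
    shared = pow(c1, secret, P)
    return (c2 * pow(shared, -1, P)) % P  # modular inverse, Python 3.8+


def fresh_randomness() -> int:
    return secrets.randbelow(Q - 1) + 1
```

After shuffling and re-randomizing, the forwarded ciphertexts no longer match the ones clients submitted, so someone who sees both sides cannot link a forwarded entry to its sender by comparing bytes.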

[Figure: protocol steps 1 and 2 for suspect IP 3: the client submits EDB(Fs(3)), the DB correlates on Fs(3), and Ds(Fs(3)) = 3 reveals the IP]
51
Architecture
Components: Clients, Client-Facing Proxies, Proxy Decryption Oracles, Front-End DB Tier, Back-End DB Storage
  • Proxy split into client-facing and decryption
    oracles
  • Proxies and DB are fully parallelizable

52
Evaluation
  • All components implemented
  • 5000 lines of C
  • Utilizing GnuPG, BSD TCP sockets, and Pthreads
  • Evaluated on custom test bed
  • 2 GHz (single, dual, quad-core) Linux machines

Algorithm | Parameter | Value
RSA / ElGamal | key size | 1024 bits
Oblivious Transfer | k | 80
AES | key size | 256
53
Scalability w.r.t. IPs
  • Single CPU core for DB and proxy each

54
Scalability w.r.t. clients
  • Four CPU cores for DB and proxy each

55
Scalability w.r.t. CPU cores
  • n CPU cores for DB and proxy each

56
Summary
  • Collaboration protocol protects privacy of
  • Participants: do not reveal who suspected whom
  • Suspects: only reveal suspects upon agreement
  • Novel composition of crypto primitives
  • One-way function hides IPs from DB
  • Public key encryption allows subsequent revelation
  • Secure function evaluation
  • Efficient implementation of architecture
  • Millions of IPs in hours
  • Scales linearly with computing resources

57
Conclusion
  • Speed
  • ML-based architecture supports accurate and
    scalable Snort-like classification on IP flows
  • Accuracy
  • Collaborating against mutual adversaries
  • Privacy
  • Novel cryptographic protocol supports efficient
    collaboration in privacy-preserving manner

58
Future Work Highlights
  • ML-based Snort-like architecture
  • Cross-site: train on site A and test on site B
  • Performance on sampled flow records
  • Measurement study
  • Biased correlation results due to biased DNSBL
    (ongoing)
  • Rate at which information must be exchanged
  • Who should cooperate: end-points or ISPs?
  • Privacy-preserving collaboration
  • Other applications, e.g., Viacom-vs-YouTube
    concerns

59
Thank you!
Collaborators: Jennifer Rexford, Benny Applebaum,
Matthew Caesar, Nick Duffield, Michael J
Freedman, Patrick Haffner, Balachander
Krishnamurthy, and Augustin Soule
60
Difference in rule accuracy
Rule | Overall Accuracy | w/o dst port | w/o mean packet size
MS-SQL version overflow | 1.00 | 0.99 | 0.83
ICMP PING speedera | 0.82 | 0.79 | 0.06
NON-RFC HTTP DELIM | 0.48 | 0.02 | 0.22
  • Accuracy is a function of correlation between
    flow and packet-level features

61
Choosing an operating point
  • X: alarms we want raised
  • Z: alarms that are raised
  • Y: alarms in both X and Z

Precision (exactness) = Y / Z
Recall (completeness) = Y / X
  • AP is a single number, but not most intuitive
  • Precision and recall are useful for operators
  • "I need to detect 99% of these alarms!"
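
An operator's recall requirement can be turned into an operating point by walking down the classifier's ranked output. The sketch below is one way to do that; the function names and the input format are hypothetical, and tied scores are simply taken in sorted order.

```python
from typing import List, Set, Tuple


def precision_recall(wanted: Set, raised: Set) -> Tuple[float, float]:
    """Precision = |wanted & raised| / |raised|; recall = the same over |wanted|."""
    both = len(wanted & raised)
    precision = both / len(raised) if raised else 0.0
    recall = both / len(wanted) if wanted else 0.0
    return precision, recall


def precision_at_recall(scored: List[Tuple[float, bool]], min_recall: float) -> float:
    """Best precision among score cut-offs whose recall is >= min_recall.

    scored: (classifier score, whether this alarm is one we want raised).
    """
    ranked = sorted(scored, key=lambda item: item[0], reverse=True)
    wanted_total = sum(1 for _, wanted in ranked if wanted)
    if wanted_total == 0:
        return 0.0
    best, hits = 0.0, 0
    for raised, (_, wanted) in enumerate(ranked, start=1):
        if wanted:
            hits += 1
        if hits / wanted_total >= min_recall:
            best = max(best, hits / raised)
    return best
```

Demanding full recall forces the cut-off down to the lowest-scored wanted alarm, which is why precision can fall sharply between a recall of 0.99 and 1.00.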

62
Choosing an operating point
Rule | Precision w/ recall 1.00 | Precision w/ recall 0.99
MS-SQL version overflow | 1.00 | 1.00
ICMP PING speedera | 0.02 | 0.83
CHAT AIM receive message | 0.02 | 0.11
  • AP is a single number, but not most intuitive
  • Precision and recall are useful for operators
  • "I need to detect 99% of these alarms!"

63
Quantifying the benefit of collaboration
  • Effectiveness of collaboration is a function of
  • Whether different victims see the same attackers
  • Whether all victims are equally likely to be
    targeted

64
IP address blinding
EDB(Fs(x))
Client
  • DB requires injective and one-way function on IPs
  • Cannot use simple hash
  • Fs(x) is keyed hash function (PRF) on IPs
  • Key s held only by proxy

65
Secure Function Evaluation
[Figure: the Client holds x and the proxy holds s; the client obtains EDB(Fs(x))]
  • IP address blinding can be split into a per-IP-bit
    (xi) problem
  • Client must learn EDB(Fs(xi))
  • Client must not learn s
  • Proxy must not learn xi
  • Oblivious Transfer (OT) accomplishes this [1,2]
  • Amortized OT makes asymptotic performance equal
    to matrix multiplication [3]

[1] Naor et al., SODA '01; [2] Freedman et al., TCC '05; [3] Ishai et al., CRYPTO '03
66
Public key encryption
  • Clients encrypt suspect IPs (x)
  • First w/ proxy's pubkey
  • Then w/ DB's pubkey
  • Forwarded by proxy
  • Does not learn IPs
  • Decrypted by DB
  • Does not learn IPs
  • Does not allow for DB correlation due to padding
    (e.g., OAEP)

[Figure: messages EDB(EPX(x)) and EPX(x) originating from the Client]
67
How client learns Fs(x)
  • Client must learn Fs(x)
  • Client must not learn s
  • Proxy must not learn x
  • Naor-Reingold PRF
  • $s = \{ s_i \mid 1 \le i \le 32 \}$
  • PRF: $F_s(x) = g^{\prod_{x_i = 1} s_i}$
  • Add randomness $u_i$ to obscure $s_i$ from client

Message: $u_i \cdot s_i$
68
How client learns Fs(x)
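[Figure: the proxy holds $s = (s_0, s_1, \ldots, s_{31})$; for $x = (x_0 = 0, x_1 = 1, \ldots, x_{31} = 1)$ the client receives $u_0$, $u_1 s_1$, ..., $u_{31} s_{31}$ and derives $F_s(x)$]
  • For each bit $x_i$ of the IP, the client learns
  • $u_i s_i$, if $x_i$ is 1
  • $u_i$, if $x_i$ is 0
  • The user also learns $\prod_i u_i$
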
69
How client learns Fs(x)
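$$\prod_{x_i = 1} u_i s_i \cdot \prod_{x_i = 0} u_i = \prod_i u_i \cdot \prod_{x_i = 1} s_i$$
$$\frac{\prod_i u_i \cdot \prod_{x_i = 1} s_i}{\prod_i u_i} = \prod_{x_i = 1} s_i$$
$$F_s(x) = g^{\prod_{x_i = 1} s_i}$$
  • User multiplies together all values
  • Divides out $\prod_i u_i$
  • Acquires $F_s(x)$ w/o having learned $s$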
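
The blinding arithmetic can be checked numerically. The sketch below follows the steps as the slides describe them, in a toy group that is far too small to be secure; the group parameters and function names are hypothetical, and the per-bit oblivious transfer is replaced by simply indexing into the pair of messages.

```python
import ipaddress
import secrets
from typing import List, Tuple

# Toy group: P = 2*Q + 1 with both prime; G has order Q, so exponents live mod Q.
P, Q, G = 2039, 1019, 4


def ip_bits(ip: str) -> List[int]:
    n = int(ipaddress.IPv4Address(ip))
    return [(n >> (31 - i)) & 1 for i in range(32)]


def random_scalars(count: int) -> List[int]:
    return [secrets.randbelow(Q - 1) + 1 for _ in range(count)]


def prf_direct(s: List[int], bits: List[int]) -> int:
    """F_s(x) = g^(product of s_i over bits with x_i = 1), computed with s."""
    exponent = 1
    for s_i, bit in zip(s, bits):
        if bit:
            exponent = (exponent * s_i) % Q
    return pow(G, exponent, P)


def proxy_messages(s: List[int], u: List[int]) -> List[Tuple[int, int]]:
    # Per bit: (message for x_i = 0, message for x_i = 1) = (u_i, u_i * s_i).
    return [(u_i, (u_i * s_i) % Q) for s_i, u_i in zip(s, u)]


def client_unblind(received: List[int], u_product: int) -> int:
    """Multiply all received values, divide out prod(u_i), exponentiate g."""
    product = 1
    for value in received:
        product = (product * value) % Q
    exponent = (product * pow(u_product, -1, Q)) % Q
    return pow(G, exponent, P)
```

The client only ever sees each $s_i$ multiplied by a fresh $u_i$, yet ends with the same value the proxy would compute directly from $s$.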

70
How client learns Fs(x)
  • But how does the client learn
  • $s_i u_i$, if $x_i$ is 1
  • $u_i$, if $x_i$ is 0
  • Without the proxy learning the IP x?
  • User multiplies together all values
  • Divides out $\prod_i u_i$
  • Acquires Fs(x) w/o having learned s

71
Oblivious Transfer (details)
  • Client holds
  • x
  • g(f(x))
  • Public
  • f(x)
  • g(x)
  • Client sends f(0) and f(1)
  • Proxy doesn't learn x
  • Proxy sends
  • $v(0) = E_{g(f(0))}(1 \cdot r)$
  • $v(1) = E_{g(f(1))}(s \cdot r)$
  • Client decrypts v(x) with g(f(x))
  • Calculates g(f(x))
  • Cannot calculate g(f(1-x))

[Figure: the Client sends f(0) and f(1); the proxy, which holds s, returns v(0) and v(1)]
72
Oblivious Transfer (more details)
Preprocessing
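  • Proxy chooses random $c$ and $r$ (at startup)
  • Proxy publishes $c$ and $g^r$
  • Client chooses random $k$ (for each bit)

Client:
  1. $Key_x = g^k$, $Key_{1-x} = c \cdot g^{-k}$
  2. $Key_x^r = (g^r)^k$, used to decrypt $y_x$
Client sends $Key_0$ to the proxy.
Proxy:
  1. $Key_0^r = (Key_0)^r$, $Key_1^r = c^r / Key_0^r$
  2. $y_0 = AES_{Key_0^r}(u)$, $y_1 = AES_{Key_1^r}(s \cdot u)$
Proxy sends $y_0$ and $y_1$ to the client.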
73
Oblivious Transfer (more details)
  • Proxy never learns x
  • Client can calculate $Key_x^r = (g^r)^k$ easily, but
    cannot calculate $c^r$ (due to lack of $r$), which is
    needed for $Key_{1-x}^r = c^r \cdot (g^r)^{-k}$
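
A toy run of the key arithmetic, as a sketch only. The group is tiny and insecure, the class and function names are hypothetical, and a SHA-256-derived XOR pad stands in for the AES encryption so that the example needs nothing beyond the standard library.

```python
import hashlib
import secrets

# Toy group: P = 2*Q + 1 with both prime; G generates the order-Q subgroup.
P, Q, G = 2039, 1019, 4


def xor_cipher(key_elem: int, msg: bytes) -> bytes:
    """Stand-in for AES: XOR with a pad derived from the key (max 32 bytes)."""
    if len(msg) > 32:
        raise ValueError("message too long for this toy cipher")
    pad = hashlib.sha256(str(key_elem).encode()).digest()
    return bytes(a ^ b for a, b in zip(msg, pad))


class Proxy:
    def __init__(self) -> None:
        self.r = secrets.randbelow(Q - 1) + 1
        self.c = pow(G, secrets.randbelow(Q - 1) + 1, P)  # published
        self.g_r = pow(G, self.r, P)                      # published

    def respond(self, key0: int, m0: bytes, m1: bytes):
        key0_r = pow(key0, self.r, P)
        key1_r = (pow(self.c, self.r, P) * pow(key0_r, -1, P)) % P
        return xor_cipher(key0_r, m0), xor_cipher(key1_r, m1)


class Client:
    def __init__(self, choice_bit: int, c: int, g_r: int) -> None:
        self.x = choice_bit
        k = secrets.randbelow(Q - 1) + 1
        key_x = pow(G, k, P)                    # Key_x = g^k
        key_other = (c * pow(key_x, -1, P)) % P  # Key_{1-x} = c * g^(-k)
        self.key0 = key_x if choice_bit == 0 else key_other  # sent to proxy
        self.decrypt_key = pow(g_r, k, P)       # Key_x^r = (g^r)^k

    def receive(self, y0: bytes, y1: bytes) -> bytes:
        return xor_cipher(self.decrypt_key, (y0, y1)[self.x])
```

Either choice bit produces a `key0` that looks like a random group element to the proxy, which is why the proxy cannot tell which of the two messages the client is able to open.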

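Client:
  1. $Key_x = g^k$, $Key_{1-x} = c \cdot g^{-k}$
  2. $Key_x^r = (g^r)^k$, used to decrypt $y_x$
Client sends $Key_0$ to the proxy.
Proxy:
  1. $Key_0^r = (Key_0)^r$, $Key_1^r = c^r / Key_0^r$
  2. $y_0 = AES_{Key_0^r}(u)$, $y_1 = AES_{Key_1^r}(s \cdot u)$
Proxy sends $y_0$ and $y_1$ to the client.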
74
Other usage scenarios
  • Cross-checking certificates
  • e.g., Perspectives [1]
  • Clients: end users
  • Keys: hash of certificates received
  • Distributed ranking
  • e.g., Alexa Toolbar [2]
  • Clients: Web users
  • Keys: hash of web pages

[1] Wendlandt, USENIX '08; [2] www.alexa.com