Title: Intrusion Detection Study Based on Bayesian Network
1 Intrusion Detection Study Based on Bayesian Network
- Qiang Chen
- XiangYang Li
- YeBin Zhang
2 Agenda
- Intrusion Detection
- Bayesian Network
- Application
- Result and Discussion
3 Intrusion Detection - Defensive System
- Security Policy
  - What should we protect?
- Prevention
  - How can we prevent an intrusion?
- Detection
  - If there is an intrusion, how can we detect it?
- Response/Recovery
  - If we detect an intrusion, how can we respond?
  - How can we recover the system from the damage?
4 Intrusion Detection - Methods
- Norm-based Approach
  - Statistical-based Techniques (SPC)
    - Build up a norm profile with statistical methods
  - Specification-based Techniques (ANN, BN, ...)
    - Build up a norm profile with rules and logical specifications
- Signature-based Approach (DT, Clustering, ...)
  - Recognize pre-defined intrusion signatures in system activities
5 Bayesian Network - Algorithms
- Parameter and Structure Learning Algorithms
  - Search-and-Scoring Based
    - MDL
    - Information gain
  - Dependency Analysis Based
  - EM Algorithm
- Inference Algorithms
  - HUGIN
  - Shenoy-Shafer
6 Bayesian Network - Advantages
- In a BN, the state distribution of any object can be predicted from any subset of the other known objects.
- A BN makes explicit statements about the certainty of the state distributions of the known objects in the network.
- As a result, normal and abnormal behaviors can be distinguished by their different probability distributions in a network trained on a given profile.
7 Bayesian Network - Disadvantages
- For learning, finding the best BN structure is an NP-hard problem.
- For inference, the complexity of evidence propagation is linear in the number of cliques and exponential in the size of the largest clique in the network. As a result, it is very difficult to handle large-scale problems with a BN.
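To make the exponential term concrete, the following minimal sketch counts the entries in the joint probability table of a single clique, assuming every node in the clique has the same number of discrete states. The function name `clique_table_size` is illustrative and does not come from the slides.

```python
def clique_table_size(clique_size: int, states_per_node: int = 2) -> int:
    """Number of entries in the joint probability table of one clique.

    Assumes every node in the clique has the same number of states.
    """
    if clique_size < 0 or states_per_node < 1:
        raise ValueError("clique_size must be >= 0 and states_per_node >= 1")
    return states_per_node ** clique_size


if __name__ == "__main__":
    # With binary nodes, each extra node in the clique doubles the table.
    for k in (2, 10, 20, 30):
        print(k, clique_table_size(k))
```

With binary nodes, a clique of 10 nodes already needs 1024 table entries, and every additional node doubles that count.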
8 Bayesian Network - Why Bayesian
- In computer systems, an event of type A always occurs before or after events of type B (C, D, ...).
- The probability that A occurs before/after B (C, D, ...) becomes stable when observed over a long period.
- An attack shows up as a relationship between event types that has a low probability.
9 Bayesian Network - Symmetric Network
[Figure: networks over objects O1, O2, O3, and O4. Relation nodes labeled in the diagrams: R12 and R234 in one, and R12, R23, and R34 in another.]
10 Bayesian Network - Reasons
- We do not care about cause-effect relations in our current study.
- Cause-effect loops exist in computer systems.
- Dependence and independence relations are easy to represent.
- It simplifies the learning procedure and reduces the complexity.
11 Bayesian Network - Q Value
- The stronger the dependence between two nodes, the larger the $\chi^2$ value and the larger the Q value.
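The formula for Q is not reproduced in the slide text, so the sketch below covers only the chi-square (X2) part: Pearson's chi-square statistic for two binary nodes, computed from a 2x2 table of co-occurrence counts. The function name and the example counts are hypothetical.

```python
def chi_square_2x2(n11: int, n10: int, n01: int, n00: int) -> float:
    """Pearson chi-square statistic for a 2x2 contingency table.

    n11: observations where both nodes are 1
    n10: only the first node is 1
    n01: only the second node is 1
    n00: neither node is 1
    """
    observed = [[n11, n10], [n01, n00]]
    total = n11 + n10 + n01 + n00
    row_sums = [n11 + n10, n01 + n00]
    col_sums = [n11 + n01, n10 + n00]
    if total == 0 or 0 in row_sums or 0 in col_sums:
        raise ValueError("every row and column must have a nonzero count")

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count under independence of the two nodes.
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2
```

Independent counts such as (25, 25, 25, 25) give a statistic of 0, while perfectly dependent counts such as (50, 0, 0, 50) give 100, the total number of observations.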
12 Bayesian Network - Select Nodes
[Figure: nodes X1, X2, X3, and X4 shown twice, once with links labeled by Q values (0.05, 0.1, 0.15, 0.20, 0.3) and once with relation nodes R123 and R24.]
13 Bayesian Network - Initialize BN
- Arbitrary Constraints
  - Maximal number of objects: B
  - Maximal number of connections for an object: D
- The order of replacement is from the highest dimension to the lowest dimension.
14 Bayesian Network - Prior
- Estimation of the prior state distribution of an object
  - $P(O = C) = N_{O=C} / N$
- Estimation of the prior probability table of a relation R among the objects
  - $P(O_1 = C_1, O_2 = C_2, \ldots, O_n = C_n) = N_{O_1 = C_1, O_2 = C_2, \ldots, O_n = C_n} / N$
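Both estimates are relative frequencies, so one counting routine can serve for a single object and for a relation. The following is a minimal sketch; the function name `relative_frequencies` and the sample observations are made up for illustration.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence


def relative_frequencies(samples: Sequence[Hashable]) -> Dict[Hashable, float]:
    """Estimate each probability as (count of the value) / (number of samples)."""
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one sample")
    counts = Counter(samples)
    return {value: count / n for value, count in counts.items()}


# Prior state distribution of one object: one observed state per sample.
prior_o1 = relative_frequencies([1, 0, 0, 1, 0])

# Prior table of a relation over (O1, O2): one state combination per sample.
joint_r12 = relative_frequencies([(1, 1), (0, 0), (0, 0), (1, 0), (0, 0)])
```

State combinations that never occur in the samples get no entry in the table, which matters when new combinations show up at test time.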
15 Bayesian Network - Inference
- For an object O: combine all connected and unvisited conjoint relations $R_i$ to update the current state distribution.
- For a relation R: combine all connected and unvisited objects $O_i$ to update the joint probability table.
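One way to read the object update is as ordinary conditioning on a relation's joint table. The sketch below does this for a single relation over two binary objects. It is not the authors' full propagation scheme, and the cell values of the joint table are an assumption: they were chosen only so that the marginals agree with the priors P(O1 = y) = 0.36 and P(O2 = y) = 0.2 quoted on the sample slides.

```python
from typing import Dict, Tuple

State = str  # "y" or "n"


def condition_on_evidence(
    joint: Dict[Tuple[State, State], float],
    evidence_state: State,
) -> Dict[State, float]:
    """Return P(O2 | O1 = evidence_state) from a joint table P(O1, O2).

    Keys of `joint` are (o1_state, o2_state) pairs.
    """
    # Keep only the entries consistent with the evidence on O1.
    consistent = {o2: p for (o1, o2), p in joint.items() if o1 == evidence_state}
    normalizer = sum(consistent.values())
    if normalizer == 0:
        raise ValueError("evidence has zero probability under the joint table")
    # Renormalize so the remaining entries sum to 1.
    return {o2: p / normalizer for o2, p in consistent.items()}


# Hypothetical joint table for a relation R12 over (O1, O2).
joint_r12 = {
    ("y", "y"): 0.20,
    ("y", "n"): 0.16,
    ("n", "y"): 0.00,
    ("n", "n"): 0.64,
}

posterior_o2 = condition_on_evidence(joint_r12, "y")
print(posterior_o2)
```

In a network with several relations, an update of this kind would be repeated for each connected, unvisited relation, which this single-relation sketch does not attempt.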
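16 Bayesian Network - Sample
- Target O2: $P^{(0)}(O_2 \mid E, BN) = P(O_2 \mid BN)$: y = 0.2, n = 0.8
- Target O4: $P^{(0)}(O_4 \mid E, BN) = P(O_4 \mid BN)$: y = 0.1, n = 0.9
- Evidence O1: $P(O_1 \mid BN)$: y = 0.36, n = 0.64; $P(O_1)$: y = 1, n = 0
- Evidence O3: $P(O_3 \mid BN)$: y = 0.272, n = 0.728; $P(O_3)$: y = 1, n = 0
[Figure: network with objects O1, O2, O3, O4 and relation nodes R12 and R234; O1 and O3 are marked as evidence, and a "target" label sits next to O2 and O4.]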
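17 Bayesian Network - Sample
- Target O4: $P^{(1)}(O_4 \mid E, BN)$: y = 0.16, n = 0.84; $P^{(0)}(O_4 \mid E, BN) = P(O_4 \mid BN)$: y = 0.1, n = 0.9
- Target O2: $P^{(1)}(O_2 \mid E, BN)$: y = 0.55, n = 0.44; $P^{(0)}(O_2 \mid E, BN) = P(O_2 \mid BN)$: y = 0.2, n = 0.8
- Evidence O1: $P(O_1 \mid BN)$: y = 0.36, n = 0.64; $P(O_1)$: y = 1, n = 0
- Evidence O3: $P(O_3 \mid BN)$: y = 0.272, n = 0.728; $P(O_3)$: y = 1, n = 0
[Figure: network with objects O1, O2, O3, O4 and relation nodes R12 and R234; O1 and O3 are marked as evidence, and a "target" label sits next to O2 and O4.]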
18 Bayesian Network - Global Probability
19 Application - General
- Training and testing data
- Audit data (BSM)
- Moving window (size N = 100)
- Vector $(X_1, \ldots, X_{284})$ as evidence
- 284 types of events (nodes)
- States (1 or 0)
- $P(X_1, \ldots, X_{284} \mid BN)$
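A minimal sketch of how such evidence vectors could be built from an event stream. It assumes that a node's state is 1 when its event type occurs at least once in the current window and 0 otherwise; the slides list the states but do not spell out this rule. The function name and the integer encoding of event types are illustrative.

```python
from collections import Counter, deque
from typing import Iterable, Iterator, List


def window_vectors(
    events: Iterable[int], num_types: int = 284, window_size: int = 100
) -> Iterator[List[int]]:
    """Yield one binary vector per full moving window over the event stream.

    Event types are assumed to be integers in range(num_types).
    """
    window: deque = deque()
    counts: Counter = Counter()
    for event in events:
        if not 0 <= event < num_types:
            raise ValueError(f"unknown event type: {event}")
        window.append(event)
        counts[event] += 1
        if len(window) > window_size:
            # Slide the window forward by dropping the oldest event.
            oldest = window.popleft()
            counts[oldest] -= 1
        if len(window) == window_size:
            yield [1 if counts[t] > 0 else 0 for t in range(num_types)]
```

For example, `list(window_vectors([0, 1, 1, 2], num_types=3, window_size=2))` produces one vector for each of the three windows `[0, 1]`, `[1, 1]`, and `[1, 2]`.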
20 Application - Data Source
- Normal events come from Lincoln Lab.
  - They provide the log data of their simulation system.
- Intrusion events were collected by ourselves.
  - They were generated by ourselves.
21 Application - New Variables in Testing
- Training
  - Only some of the variables appear.
- Testing
  - New objects appear.
  - New states of objects appear.
  - New state combinations among objects appear.
- Assign an arbitrary small value (0.00001) to any unknown probability.
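The fallback can be sketched as a table lookup with a default. The value 0.00001 is the one quoted on the slide; the function name and the example table are hypothetical.

```python
from typing import Dict, Hashable

UNKNOWN_PROBABILITY = 0.00001  # small value quoted on the slide


def lookup_probability(
    table: Dict[Hashable, float],
    key: Hashable,
    floor: float = UNKNOWN_PROBABILITY,
) -> float:
    """Return table[key], or `floor` when the key never appeared in training."""
    return table.get(key, floor)


# Hypothetical trained table over (O1, O2); the combination ("n", "y") is missing.
trained_table = {("y", "y"): 0.20, ("y", "n"): 0.16, ("n", "n"): 0.64}

print(lookup_probability(trained_table, ("y", "y")))  # seen in training
print(lookup_probability(trained_table, ("n", "y")))  # unseen, falls back to the floor
```

Using a small positive value instead of zero keeps a single unseen combination from forcing the probability of a whole test vector to exactly zero.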
22 Result and Discussion - Result
23 Result and Discussion - Discussion
24 Result and Discussion - Conclusions
- Bayesian networks can be used in the study of intrusion detection.
- A symmetric network is easier to implement than a directed network.
- Computational complexity is still a main barrier to applying the approach to studies with large data sets.
25 Thank you!