Data Mining for Intrusion Detection

1
Data Mining for Intrusion Detection
  • Donghan Li
  • Alex Pivoshenko

2
Overview
  • Intrusion Detection
  • Data Mining for Intrusion Detection
  • Data Mining for Misuse Detection
  • Data Mining for Anomaly Detection
  • Current Research Projects
  • Commercial Products
  • References

3
Intrusion Detection
  • Intrusions
  • DoS (Denial of Service)
  • Probing / Scanning
  • Compromises
  • Trojan horses / Worms
  • Why we need Intrusion Detection

4
Intrusion Detection Systems
5
IDS Taxonomy
6
Traditional IDS
  • Signature-based
  • Limitations
  • Revising signature database
  • Emerging cyber threats
  • Latency in deployment

7
Key Technical Challenges
  • Large data size
  • High Dimensionality
  • Temporal nature of the data
  • Skewed class distribution
  • Data Preprocessing
  • High Performance Computing

8
Data Mining for Intrusion Detection
  • Goal: high detection rate and low false alarm
    rate
  • Data Mining tries to address limitations and
    challenges
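
The two rates named in the goal can be computed from confusion counts. The slides do not define them, so this minimal sketch assumes the usual definitions: detection rate = TP / (TP + FN) and false alarm rate = FP / (FP + TN). The counts below are made up for illustration.

```python
def detection_metrics(tp, fp, tn, fn):
    """Return (detection_rate, false_alarm_rate) from confusion counts."""
    # Detection rate: fraction of real intrusions that were flagged.
    detection_rate = tp / (tp + fn) if (tp + fn) else 0.0
    # False alarm rate: fraction of normal events that were flagged.
    false_alarm_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return detection_rate, false_alarm_rate


# Hypothetical counts: 100 intrusions and 10,000 normal connections.
dr, far = detection_metrics(tp=90, fp=50, tn=9950, fn=10)
print(f"detection rate = {dr:.3f}, false alarm rate = {far:.4f}")
```

With a skewed class distribution, even a small false alarm rate can produce more false alarms than true detections, which is why both numbers are tracked together.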

9
Basic Steps in DM for ID
  • Convert data
  • Build Data Mining models
  • Analysis and summary

10
Feature Construction
  • Network traffic data is collected
  • Start time and duration
  • Protocol type
  • Source and Dest IP address and port, etc
  • KDDCup99 example
  • Content-based features
  • Time-based traffic features
  • Connection-based features
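
As an illustration of a time-based traffic feature, the sketch below counts, for each connection, how many earlier connections went to the same destination host within the previous two seconds (similar in spirit to the KDDCup99 "count" feature). The record layout `(start_time, dst_host)` and the function name are assumptions made for this example, and the input is assumed to be sorted by start time.

```python
def same_host_count(connections, window=2.0):
    """For each (start_time, dst_host) record, count earlier connections
    to the same destination host that started within `window` seconds."""
    counts = []
    for i, (t, dst) in enumerate(connections):
        n = 0
        for t_prev, dst_prev in connections[:i]:
            if dst_prev == dst and 0 <= t - t_prev <= window:
                n += 1
        counts.append(n)
    return counts


# Hypothetical connection records, sorted by start time (seconds).
conns = [
    (0.0, "10.0.0.5"),
    (0.5, "10.0.0.5"),
    (1.0, "10.0.0.9"),
    (1.8, "10.0.0.5"),
    (5.0, "10.0.0.5"),
]
print(same_host_count(conns))
```

A burst of connections to one host in a short window produces a high count, which is the kind of signal a DoS or scanning detector can use.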

11
Data sets in Intrusion Detection
  • DARPA 1998
  • 9 weeks of raw TCP dump data
  • Labeled connections
  • DARPA 1999
  • System call traces
  • Data set with virus files

12
Current ID Approaches
  • Misuse detection
  • Known intrusion patterns
  • Record patterns -> monitor event sequences ->
    report matched events
  • Anomaly detection
  • Deviation from normal pattern
  • Normal behavior profiles -> observe current
    activity -> report deviations

13
Data Mining for Misuse Detection
  • Rule based techniques
  • Tree based approaches
  • Association rules
  • Bayesian classifiers, genetic algorithms
  • Neural networks
  • Cost sensitive modeling

14
PN-rule Learning
  • N-phase
  • Remove FP from examples of P-phase
  • High accuracy and significant support
  • P-phase
  • Positive examples with good support
  • Seek good recall

15
Boosting based algorithms
  • RareBoost
  • Updates the weights differently
  • SMOTEBoost
  • Combination of SMOTE (Synthetic Minority
    Oversampling Technique) and boosting
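
The oversampling half of SMOTEBoost can be sketched as follows: each synthetic minority example is placed at a random position on the line segment between a minority example and one of its k nearest minority neighbours. This is a simplified illustration of SMOTE only (the boosting loop is omitted); the function name, parameters, and sample points are assumptions, and at least two distinct minority points are required.

```python
import math
import random


def smote(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points by interpolating between minority
    examples and their k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest minority neighbours of x, excluding x itself.
        neighbours = sorted(
            (p for p in minority if p is not x),
            key=lambda p: math.dist(x, p),
        )[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from x to nb
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nb)))
    return synthetic


# Hypothetical minority-class (attack) examples in a 2-D feature space.
attacks = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.3), (1.1, 1.1)]
for point in smote(attacks, n_new=3):
    print(point)
```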

16
CREDOS
  • First use ripple down rules to overfit the data
  • Ripple down rules are often used
  • Then prune to improve generalization
  • Different mechanism from decision trees

17
Neural Networks
  • For host-based intrusion detection
  • Build user profiles
  • Build profiles of software behavior
  • For network-based intrusion detection
  • Hierarchical network intrusion detection
  • Multi-layer perceptrons (MLP)

18
Cost Sensitive Modeling
  • Detection rate / False Alarm rate may be
    misleading
  • Cost factors: damage cost, response cost,
    operational cost
  • Costs for TP, FP, TN, FN
  • Define cumulative cost
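
A minimal sketch of a cumulative cost, assuming it is the sum of a per-outcome cost over all TP, FP, TN, and FN events. The cost values and detector counts are hypothetical; they are chosen only to show how a detector with a higher detection rate can still be the more expensive one.

```python
def cumulative_cost(counts, cost):
    """Sum the per-outcome cost over all events of each outcome type."""
    return sum(counts[k] * cost[k] for k in ("TP", "FP", "TN", "FN"))


# Hypothetical per-event costs: a missed attack (FN) incurs damage cost,
# while every raised alarm (TP or FP) incurs response cost.
COST = {"TP": 5, "FP": 5, "TN": 0, "FN": 100}

# Detector A: 90% detection rate, but many false alarms.
detector_a = {"TP": 90, "FP": 500, "TN": 9400, "FN": 10}
# Detector B: 80% detection rate, very few false alarms.
detector_b = {"TP": 80, "FP": 20, "TN": 9880, "FN": 20}

print("A:", cumulative_cost(detector_a, COST))
print("B:", cumulative_cost(detector_b, COST))
```

Under these assumed costs, detector B is cheaper overall despite its lower detection rate.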

19
Anomaly Detection
  • Normal Behavior -> Deviations -> Anomalous Behavior
  • Major approaches
  • Outlier detection
  • Profiling
  • Others
  • Two categories
  • Supervised
  • Unsupervised

20
Sample Data
  • MINDS 01/26/03
  • 48 hours after the Slammer worm

21
Outlier Detection Schemes
  • Detect intrusions (data points) that are very
    different from the normal activities (rest of
    the data points)
  • General Steps
  • Identify normal behavior
  • Construct useful set of features
  • Define similarity function
  • Use outlier detection algorithm
  • Statistics based
  • Distance based
  • Model based

22
Statistics Based Outlier Detection
  • Data points are modeled using stochastic
    distribution
  • Points are determined to be outliers depending on
    their relationship with this model
  • Major approaches
  • Finite Mixtures
  • Using probability distribution
  • Information Theory measures
  • Problems
  • High dimensions -> difficult to estimate
    distributions

23
Statistics Based - Finite Mixture
  • Unsupervised Learning Algorithm
  • Data sources
  • Categorical (e.g. protocol, service)
  • Continuous (e.g. duration, src_bytes)
  • Construct FM model as a representation of
    underlying mechanism of data generation
  • Assign score to new input based on how much the
    model has changed

24
Statistics Based - Probability Distributions
  • Supervised Learning Algorithm
  • Basic assumption for training data
  • # of normal elements >> # of anomalies
  • Construct Probability Distribution
  • M: majority distribution
  • A: anomalous distribution
  • D = (1 - c)M + cA
  • Measure the likelihood L(D) for real data
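
A rough sketch of how the mixture D = (1 - c)M + cA can be used: each element is tentatively moved from the majority set to the anomaly set, and it stays there if the log-likelihood of the data increases by more than a threshold. Estimating M from frequencies, taking A to be uniform over the alphabet, and the values of `c` and `threshold` are all assumptions made for this illustration, as is the sample data.

```python
import math
from collections import Counter


def log_likelihood(m_counts, n_anom, alphabet_size, c):
    """Log-likelihood of the data under D = (1 - c)M + cA, with M estimated
    from frequencies and A assumed uniform over the alphabet."""
    n_major = sum(m_counts.values())
    ll = n_major * math.log(1 - c) + n_anom * math.log(c)
    ll += sum(k * math.log(k / n_major) for k in m_counts.values() if k > 0)
    ll += n_anom * math.log(1 / alphabet_size)
    return ll


def detect(data, alphabet_size, c=0.01, threshold=0.0):
    """Return the elements whose move to the anomaly set raises L(D)."""
    m_counts = Counter(data)
    anomalies = []
    for x in data:
        base = log_likelihood(m_counts, len(anomalies), alphabet_size, c)
        m_counts[x] -= 1  # tentatively move x out of the majority set
        moved = log_likelihood(m_counts, len(anomalies) + 1, alphabet_size, c)
        if moved - base > threshold:
            anomalies.append(x)
        else:
            m_counts[x] += 1  # undo the move
    return anomalies


# Hypothetical service field of 1,000 connections; "irc" appears once.
services = ["http"] * 600 + ["smtp"] * 399 + ["irc"]
print(detect(services, alphabet_size=3))
```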

25
Statistics Based - Information Theory
  • Supervised Learning Algorithm
  • Entropy
  • Measure uncertainty/impurity of data
  • Smaller when the class distribution is more skewed
  • Larger when data is partitioned into more regular
    subsets
  • Anomaly detector sets entropy threshold
  • Below threshold -> potential intrusion
  • Smaller threshold -> More accurate
  • Conditional entropy H(X|Y)
  • How much uncertainty remains in sequence of events
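
Both measures can be computed directly from counts. The sketch below uses the standard definitions H(X) = -sum p(x) log2 p(x) and H(X|Y) = sum p(y) H(X | Y = y); treating X as the next event and Y as the previous event in a sequence is an assumption for the example, as are the event names.

```python
import math
from collections import Counter


def entropy(xs):
    """Shannon entropy (in bits) of the empirical distribution of xs."""
    n = len(xs)
    return -sum((c / n) * math.log2(c / n) for c in Counter(xs).values())


def conditional_entropy(pairs):
    """H(X|Y) for a list of (x, y) observations."""
    n = len(pairs)
    by_y = {}
    for x, y in pairs:
        by_y.setdefault(y, []).append(x)
    return sum(len(xs) / n * entropy(xs) for xs in by_y.values())


# Hypothetical event sequence: each event fully determines the next one.
seq = ["open", "read", "close", "open", "read", "close"]
next_given_prev = [(seq[i + 1], seq[i]) for i in range(len(seq) - 1)]

print(entropy(seq))                          # uncertainty of single events
print(conditional_entropy(next_given_prev))  # uncertainty left given previous event
```

A perfectly regular sequence leaves no uncertainty about the next event once the previous one is known, so its conditional entropy is zero.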

26
Distance Based Outlier Detection
  • Represent data as a vector of features
  • Major approaches
  • Nearest neighbor based
  • Density based
  • Clustering based
  • Problem
  • High dimensionality of data

27
Distance Based - Nearest Neighbor
  • Not enough neighbors -> Outliers
  • Compute distance d to the k-th nearest neighbor
  • Outlier points
  • Located in more sparse neighborhoods
  • Have d larger than a certain threshold
  • Mahalanobis-distance based approach
  • More appropriate for computing distance with
    skewed distributions
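
The k-th nearest neighbor scheme can be sketched with a brute-force search. Euclidean distance is used here for simplicity (not the Mahalanobis variant mentioned above); the sample points, `k`, and `threshold` are illustrative assumptions.

```python
import math


def kth_nn_distance(points, k):
    """Distance from each point to its k-th nearest neighbour."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


def knn_outliers(points, k, threshold):
    """Points whose k-th nearest neighbour is farther away than threshold."""
    scores = kth_nn_distance(points, k)
    return [p for p, d in zip(points, scores) if d > threshold]


# Hypothetical feature vectors: a tight group plus one distant point.
data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
print(knn_outliers(data, k=2, threshold=3.0))
```

The brute-force search is quadratic in the number of points, which is one reason the large data sizes listed among the key challenges matter for this family of methods.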

28
Distance Based - Density
  • Local Outlier Factor (LOF)
  • Average of the ratios of the density of example p
    and the density of its nearest neighbors
  • Compute density of local neighborhood for each
    point
  • Compute LOF
  • Larger LOF -> Outliers
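
A compact sketch of the LOF computation, following the usual formulation: local reachability density is the inverse of the mean reachability distance to the k nearest neighbours, and LOF is the mean ratio of the neighbours' densities to the point's own density. Ties at the k-th distance are ignored and duplicate points are assumed absent; both simplifications, and the sample data, are assumptions of this example.

```python
import math


def lof_scores(points, k):
    """Local Outlier Factor of every point (brute force, ties ignored)."""
    n = len(points)
    neighbours, k_dist = [], []
    for i in range(n):
        order = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(points[i], points[j]),
        )
        neighbours.append(order[:k])
        k_dist.append(math.dist(points[i], points[order[k - 1]]))

    def local_density(i):
        # Reachability distance to neighbour j: max(k-distance(j), d(i, j)).
        reach = [
            max(k_dist[j], math.dist(points[i], points[j]))
            for j in neighbours[i]
        ]
        return len(reach) / sum(reach)

    dens = [local_density(i) for i in range(n)]
    return [
        sum(dens[j] for j in neighbours[i]) / (k * dens[i]) for i in range(n)
    ]


# Hypothetical feature vectors: a tight group plus one distant point.
data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
for point, score in zip(data, lof_scores(data, k=2)):
    print(point, round(score, 2))
```

Points inside the group score close to 1, while the isolated point scores far above 1.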

29
Distance Based - Clustering
  • Radius w of proximity is specified
  • Two points x1 and x2 are near if d(x1, x2) < w
  • Define N(x) as number of points that are within w
    of x
  • Points in small cluster -> Outliers
  • Fixed-width clustering for speedup
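
A single-pass sketch of fixed-width clustering: each point joins the nearest existing cluster whose center is within w, otherwise it starts a new cluster. Keeping the first member as a fixed center, the `min_size` cutoff for "small" clusters, and the sample data are assumptions made for this illustration.

```python
import math


def fixed_width_clusters(points, w):
    """Assign each point to the nearest cluster center within distance w,
    or start a new cluster with the point as its center."""
    centers, members = [], []
    for p in points:
        best = None
        for idx, c in enumerate(centers):
            d = math.dist(p, c)
            if d < w and (best is None or d < best[0]):
                best = (d, idx)
        if best is None:
            centers.append(p)
            members.append([p])
        else:
            members[best[1]].append(p)
    return members


def small_cluster_points(points, w, min_size):
    """Points that end up in clusters with fewer than min_size members."""
    return [
        p
        for cluster in fixed_width_clusters(points, w)
        if len(cluster) < min_size
        for p in cluster
    ]


# Hypothetical feature vectors: a tight group plus one distant point.
data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
print(small_cluster_points(data, w=2.0, min_size=2))
```

Each point is compared only against cluster centers rather than against every other point, which is where the speedup comes from.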

30
Distance Based - Clustering (cont.)
  • K-Nearest Neighbor Canopy Clustering
  • Compute sum of distances to k nearest neighbors
  • Small k-NN distance sum -> point in dense region
  • Canopy clustering for speedup
  • WaveCluster
  • Transform data into multidimensional signals
    using wavelet transformation
  • Remove High/Low frequency parts
  • Remaining parts -> Outliers

31
Model Based Outlier Detection
  • Similar to Probabilistic Based schemes
  • Build prediction model for normal behavior
  • Deviation from model -> potential intrusion
  • Major approaches
  • Neural networks
  • Unsupervised Support Vector Machines (SVMs)

32
Model Based - Neural Networks
  • Use a replicator 4-layer feed-forward neural
    network
  • Input variables are the target output during
    training
  • RNN forms a compressed model for training data
  • Outlyingness is measured by reconstruction error

33
Model Based - SVMs
  • Attempt to separate the entire set of training
    data from the origin
  • Regions where most data lies are labeled as one
    class
  • Parameters
  • Expected outlier rates
  • Good for high quality controlled training data
  • Variance of Radial Basis Function (RBF)
  • Larger -> higher detection rate and more false
    alarms
  • Smaller -> lower detection rate and fewer false
    alarms

34
Profiling Schemes
  • Profiling methods are usually applied to host
    based intrusion detection where users, programs,
    etc are profiled
  • Profiling sequences of Unix shell command lines
  • Profiling user behavior
  • Can also be used to profile alarms produced by
    other ID methods
  • Reduce false positives

35
Profiling - Temporal Sequence
  • Data
  • Sequence of Unix shell command lines
  • Set of sequences (user profiles) are reduced and
    filtered to only critical commands
  • Build Instance Based Learning (IBL) model that
    stores historic examples of normal data
  • Deviations -> Potential intrusions

36
Profiling - Neural Networks
  • Modeling the behavior of individual users
  • Data
  • Audit logs for each user for several days
  • Form distribution vector
  • How often user executes each command
  • Train NN with these vectors
  • Identify whether the user is regular or illegal
    for each new command distribution vector, i.e., for
    each new login session
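
The distribution vector that feeds the network can be sketched as relative command frequencies over a fixed vocabulary. The vocabulary, the session contents, and the function name are hypothetical; training the neural network itself is not shown.

```python
from collections import Counter


def command_distribution(session, vocabulary):
    """Relative frequency of each vocabulary command within one session."""
    counts = Counter(session)
    total = len(session)
    return [counts[c] / total if total else 0.0 for c in vocabulary]


# Hypothetical command vocabulary and one login session.
VOCAB = ["ls", "cd", "vi", "gcc"]
session = ["ls", "ls", "cd", "vi"]
print(command_distribution(session, VOCAB))
```

Every session maps to a vector of the same length, so vectors from several days of audit logs can be stacked into a training set for the per-user model.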

37
Profiling - NNs (cont.)
  • Similar techniques can be applied to profiling
    software behavior in a system
  • Data: sequence of relevant system calls
  • Sum of NN output over certain threshold indicates
    potential malicious software
  • Multi-level NNs architecture
  • Feature detection modules can be combined to meet
    certain IDS need

38
Profiling - Mining Alarms
  • Unusual but legitimate behaviors may trigger
    alarms -> false positives
  • Over time, false alarms can be modeled using IBL,
    association rules, and other DM techniques
  • Can be used to improve the performance of an IDS
    by reducing false alarms

39
Alternative Approaches
  • Artificial Anomalies Generation
  • For sparse regions of data generate more
    artificial anomalies than for the dense data
    regions
  • Filter artificial anomalies to avoid collision
    with known instance
  • Use rule discovery systems (e.g. RIPPER) to form
    anomaly signatures

40
Current Research Projects
  • ADAM (Audit Data Analysis and Mining) - GMU
  • MADAM ID (Mining Audit Data for Automated Models
    for Intrusion Detection) - Columbia, GT, Florida
    Tech.
  • MINDS - Univ. of Minnesota
  • IIDS (Intelligent Intrusion Detection) -
    Mississippi State
  • DM for Network Intrusion Detection - MITRE Corp.
  • Agent based DM system - Iowa State
  • IDDM - Dept. of Defense, Australia

41
Commercial IDS
  • Misuse detection based
  • SNORT (open source NIDS based on signatures)
  • Network Flight Recorder (NFR, detect known
    attacks)
  • NetRanger (CISCO, traffic analyzer)
  • Shadow (collect audit data and run tcpdump
    filters)
  • P-Best (SRI, rule-based expert system)
  • NetStat (UCSB, real time IDS using state
    transition analysis)

42
Commercial IDS (cont.)
  • Anomaly detection based
  • IDES, NIDES (statistical)
  • EMERALD (statistical)
  • SPADE (Statistical Packet Anomaly Detection
    Engine) within SNORT
  • Computer Watch (AT&T, expert system)
  • Wisdom Sense (rule based)

43
References
  • D. Barbara, et al., ADAM: A Testbed for Exploring
    the Use of Data Mining in Intrusion Detection,
    SIGMOD Record 2001
  • M. Joshi, et al., PNrule, Mining Needles in a
    Haystack: Classifying Rare Classes via Two-Phase
    Rule Induction, ACM SIGMOD 2001
  • M. Joshi, et al., Predicting Rare Classes: Can
    Boosting Make Any Weak Learner Strong?, ACM
    SIGKDD 2002
  • M. Joshi, V. Kumar, CREDOS: Classification using
    Ripple Down Structure, ICDE 2003
  • K. Yamanishi, On-line unsupervised outlier
    detection using finite mixtures with discounting
    learning algorithms, KDD 2000
  • W. Lee, et al., Information-Theoretic Measures for
    Anomaly Detection, IEEE Symposium on Security and
    Privacy 2001

44
References (cont.)
  • E. Eskin, Anomaly Detection over Noisy Data using
    Learned Probability Distributions, ICML 2000
  • S. Ramaswamy, R. Rastogi, K. Shim, Efficient
    Algorithms for Mining Outliers from Large Data
    Sets, ACM SIGMOD 2000
  • A. Lazarevic, et al., A Comparative Study of
    Anomaly Detection Schemes in Network Intrusion
    Detection, SIAM 2003
  • E. Eskin, et al., A Geometric Framework for
    Unsupervised Anomaly Detection: Detecting
    Intrusions in Unlabeled Data, 2002
  • S. Hawkins, et al., Outlier detection using
    replicator neural networks, DaWaK 2002
  • A. Lazarevic, et al., DM for Intrusion Detection,
    Tutorial at the Pacific-Asia Conference on KDD 2003

45
Intrusion Detection Links
  • http://www.cs.umn.edu/aleks/intrusion_detection.html
  • http://www.cc.gatech.edu/wenke/ids-readings.html
  • http://www.cerias.purdue.edu/coast/intrusion-detection/welcome.html
  • http://www.cs.ucsb.edu/rsg/STAT/links.html
  • http://cnscenter.future.co.kr/security/ids.html
  • http://www.cs.purdue.edu/homes/clifton/cs590m/
  • http://dmoz.org/Computers/Security/Intrusion_Detection_Systems/
  • http://www.networkice.com/Advice/Countermeasures/Intrusion_Detection/default.htm
  • http://www.infosyssec.net/infosyssec/intdet1.htm

46
Questions?
  • Thank You!