ANFIS Classifier for Network Intrusion Detection System

Transcript and Presenter's Notes

1
ANFIS Classifier for Network Intrusion Detection
System
  • Mohsen Kahani
  • http://www.um.ac.ir/kahani/

2
Network Intrusion Detection
  • Widespread use of computer networks
  • Growing number of attacks, new hacking tools, and
    intrusive methods
  • An Intrusion Detection System (IDS) is one way of
    dealing with suspicious activities within a
    network.
  • IDS
  • Monitors the activities of a given environment
  • Decides whether these activities are malicious
    (intrusive) or legitimate (normal).

3
Soft Computing and IDS
  • Many soft computing approaches have been applied
    to the intrusion detection field.
  • Our novel network IDS includes
  • Neuro-Fuzzy
  • Fuzzy
  • Genetic algorithms
  • Key contribution
  • Using the outputs of the neuro-fuzzy networks as
    linguistic variables that express how reliable
    the current output is.

4
KDD cup 99 Dataset
  • Comparing different works in the IDS area
    requires a standard dataset for evaluating
    computer network IDSes.
  • The Fifth ACM SIGKDD International Conference on
    Knowledge Discovery and Data Mining collected and
    generated TCP dump data of a simulated network,
    in the form of train and test sets of features
    defined for connection records.
  • We refer to this standard dataset as the KDD Cup
    99 dataset and use it for our experiments.

5
KDD cup 99 Dataset
  • 41 features are derived for each connection.
  • A label specifies the status of each connection
    record as either normal or a specific attack type.
  • Features fall into four categories
  • The intrinsic features, e.g. duration of the
    connection, type of the protocol (tcp, udp,
    etc.), network service (http, telnet, etc.).
  • The content features, e.g. number of failed login
    attempts.
  • The same-host features examine established
    connections in the past two seconds that have the
    same destination host as the current connection,
    and calculate statistics related to protocol
    behavior, service, etc. (see the sketch after
    this list).
  • Similarly, the same-service features examine the
    connections in the past two seconds that have the
    same service as the current connection.
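
To make the traffic-feature idea concrete, here is a small Python sketch (not from the original slides; the connection tuples and the inclusion of the current connection in the count are our assumptions) that derives the same-host "count" feature with a two-second window:

from collections import deque

def same_host_count(connections):
    """For each connection (timestamp, dst_host), count connections in
    the preceding two seconds with the same destination host.  This is
    a simplified illustration of how the same-host traffic features are
    derived; the actual dataset was generated from tcpdump traces."""
    window = deque()            # (timestamp, dst_host) within the last 2 s
    counts = []
    for t, host in connections:
        while window and t - window[0][0] > 2.0:
            window.popleft()    # drop connections older than 2 seconds
        counts.append(sum(1 for _, h in window if h == host) + 1)  # include current
        window.append((t, host))
    return counts

stream = [(0.0, "10.0.0.1"), (0.5, "10.0.0.1"), (1.0, "10.0.0.2"), (2.6, "10.0.0.1")]
print(same_host_count(stream))   # [1, 2, 1, 1]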

6
Basic features of individual TCP connections
feature name description  type
duration  length (number of seconds) of the connection  continuous
protocol_type  type of the protocol, e.g. tcp, udp, etc.  discrete
service  network service on the destination, e.g., http, telnet, etc.  discrete
src_bytes  number of data bytes from source to destination  continuous
dst_bytes  number of data bytes from destination to source  continuous
flag  normal or error status of the connection  discrete 
land  1 if connection is from/to the same host/port; 0 otherwise  discrete
wrong_fragment  number of "wrong" fragments  continuous
urgent  number of urgent packets  continuous
7
Content features within a connection suggested by
domain knowledge
feature name description  type
hot  number of "hot" indicators  continuous
num_failed_logins  number of failed login attempts  continuous
logged_in  1 if successfully logged in; 0 otherwise  discrete
num_compromised  number of "compromised" conditions  continuous
root_shell  1 if root shell is obtained; 0 otherwise  discrete
su_attempted  1 if "su root" command attempted; 0 otherwise  discrete
num_root  number of "root" accesses  continuous
num_file_creations  number of file creation operations  continuous
num_shells  number of shell prompts  continuous
num_access_files  number of operations on access control files  continuous
num_outbound_cmds  number of outbound commands in an ftp session  continuous
is_hot_login  1 if the login belongs to the "hot" list; 0 otherwise  discrete
is_guest_login  1 if the login is a "guest" login; 0 otherwise  discrete
8
Traffic features computed using a two-second time
window
feature name description  type
count  number of connections to the same host as the current connection in the past two seconds  continuous
Note: the following features refer to these same-host connections.
serror_rate  % of connections that have "SYN" errors  continuous
rerror_rate  % of connections that have "REJ" errors  continuous
same_srv_rate  % of connections to the same service  continuous
diff_srv_rate  % of connections to different services  continuous
srv_count  number of connections to the same service as the current connection in the past two seconds  continuous
Note: the following features refer to these same-service connections.
srv_serror_rate  % of connections that have "SYN" errors  continuous
srv_rerror_rate  % of connections that have "REJ" errors  continuous
srv_diff_host_rate  % of connections to different hosts  continuous
9
KDD CUP 99 Sample Data
  • 0,tcp,http,SF,200,4213,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,15,15,0.00,0.00,0.00,0.00,1.00,0.00,0.00,31,255,1.00,0.00,0.03,0.02,0.00,0.00,0.00,0.00,normal.
  • 0,tcp,http,SF,293,4203,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,2,2,0.00,0.00,0.00,0.00,1.00,0.00,0.00,4,255,1.00,0.00,0.25,0.02,0.00,0.00,0.00,0.00,normal.
  • 0,tcp,http,SF,296,6903,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,2,0.00,0.00,0.00,0.00,1.00,0.00,1.00,2,255,1.00,0.00,0.50,0.03,0.00,0.00,0.00,0.00,normal.
  • 0,udp,domain_u,SF,104,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,2,0.00,0.00,0.00,0.00,1.00,0.00,1.00,56,56,1.00,0.00,1.00,0.00,0.00,0.00,0.00,0.00,normal.
  • 0,udp,domain_u,SF,103,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,2,0.00,0.00,0.00,0.00,1.00,0.00,1.00,66,66,1.00,0.00,1.00,0.00,0.00,0.00,0.00,0.00,normal.
  • 0,udp,domain_u,SF,89,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,2,0.00,0.00,0.00,0.00,1.00,0.00,1.00,76,76,1.00,0.00,1.00,0.00,0.00,0.00,0.00,0.00,normal.
  • 0,udp,domain_u,SF,79,32,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0.00,0.00,0.00,0.00,1.00,0.00,0.00,86,85,0.99,0.02,0.99,0.00,0.00,0.00,0.00,0.00,normal.
  • 0,tcp,smtp,SF,1367,335,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,1,0.00,0.00,0.00,0.00,1.00,0.00,0.00,21,72,0.90,0.10,0.05,0.04,0.00,0.00,0.00,0.00,normal.
  • 184,tcp,telnet,SF,1511,2957,0,0,0,3,0,1,2,1,0,0,1,0,0,0,0,0,1,1,0.00,0.00,0.00,0.00,1.00,0.00,0.00,1,3,1.00,0.00,1.00,0.67,0.00,0.00,0.00,0.00,buffer_overflow.
  • 305,tcp,telnet,SF,1735,2766,0,0,0,3,0,1,2,1,0,0,1,0,0,0,0,0,1,1,0.00,0.00,0.00,0.00,1.00,0.00,0.00,2,4,1.00,0.00,0.50,0.50,0.00,0.00,0.00,0.00,buffer_overflow.
  • 0,tcp,smtp,SF,1518,405,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,4,0.00,0.00,0.00,0.00,1.00,0.00,1.00,42,108,0.74,0.07,0.02,0.04,0.05,0.00,0.00,0.00,normal.
  • 0,tcp,smtp,SF,1173,403,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,1,0.00,0.00,0.00,0.00,1.00,0.00,0.00,52,116,0.75,0.06,0.02,0.03,0.04,0.00,0.00,0.00,normal.
  • 257,tcp,telnet,SF,181,1222,0,0,0,0,0,1,0,0,0,0,2,0,0,0,0,0,1,1,0.00,0.00,0.00,0.00,1.00,0.00,0.00,62,15,0.21,0.05,0.02,0.13,0.03,0.13,0.00,0.00,normal.
  • 0,tcp,smtp,SF,2302,410,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,1,0.00,0.00,0.00,0.00,1.00,0.00,0.00,72,117,0.76,0.04,0.01,0.03,0.03,0.00,0.00,0.00,normal.
  • 1,tcp,smtp,SF,1587,332,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,1,0.00,0.00,0.00,0.00,1.00,0.00,0.00,3,120,1.00,0.00,0.33,0.04,0.00,0.00,0.00,0.00,normal.
  • 0,tcp,smtp,SF,1552,333,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,2,0.00,0.00,0.00,0.00,1.00,0.00,1.00,13,121,0.85,0.15,0.08,0.04,0.00,0.00,0.00,0.00,normal.
  • 0,tcp,finger,SF,10,223,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0.00,0.00,0.00,0.00,1.00,0.00,0.00,23,14,0.22,0.13,0.04,0.29,0.00,0.00,0.00,0.00,normal.
  • 0,tcp,smtp,SF,971,335,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,1,0.00,0.00,0.00,0.00,1.00,0.00,0.00,16,120,0.94,0.12,0.06,0.03,0.00,0.00,0.00,0.00,normal.
  • 1,tcp,smtp,SF,2007,335,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,3,0.00,0.00,0.00,0.00,1.00,0.00,1.00,26,129,0.92,0.12,0.04,0.03,0.00,0.00,0.00,0.00,normal.
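
Each record above is a comma-separated list of the 41 features followed by its label. A minimal Python sketch for splitting such a record into typed values (the helper name and the choice of which columns stay symbolic are ours, not part of the original work):

SYMBOLIC_INDICES = {1, 2, 3}   # protocol_type, service, flag

def parse_record(line):
    """Split one KDD Cup 99 record into (features, label)."""
    fields = line.strip().rstrip(".").split(",")
    label = fields[-1]                      # e.g. "normal", "buffer_overflow"
    features = []
    for i, value in enumerate(fields[:-1]):
        features.append(value if i in SYMBOLIC_INDICES else float(value))
    return features, label

features, label = parse_record(
    "0,tcp,http,SF,200,4213,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,15,15,"
    "0.00,0.00,0.00,0.00,1.00,0.00,0.00,31,255,1.00,0.00,0.03,0.02,"
    "0.00,0.00,0.00,0.00,normal."
)
print(label, len(features))   # normal 41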

10
KDD cup 99 Dataset
  • Attacks fall into four main categories
  • DoS (Denial of Service) makes some computing or
    memory resources too busy, so that legitimate
    users are denied access to these resources.
  • R2L (Remote to Local) is unauthorized access from
    a remote machine by exploiting the machine's
    vulnerabilities.
  • U2R (User to Root) is unauthorized access to local
    super-user (root) privileges by exploiting the
    system's vulnerabilities.
  • PROBE covers host and port scans as precursors to
    other attacks. An attacker scans a network to
    gather information or find known vulnerabilities.

11
KDD Cup 99 Dataset cont.
  • The KDD dataset is divided into the following
    record sets
  • Training
  • Testing
  • The original training dataset was too large for
    our purpose, so the 10% training dataset was
    employed here for the training phase.

12
KDD Cup 99 Sample Distribution
THE SAMPLE DISTRIBUTIONS ON THE SUBSET OF 10%
DATA OF KDD CUP 99 DATASET
Class  Number of Samples  Percent of Samples
Normal  97277  19.69
Probe  4107  0.83
DoS  391458  79.24
U2R  52  0.01
R2L  1126  0.23
Total  492021  100
THE SAMPLE DISTRIBUTIONS ON THE TEST DATA WITH
THE CORRECTED LABELS OF KDD CUP 99 DATASET
Class  Number of Samples  Percent of Samples
Normal  60593  19.48
Probe  4166  1.34
DoS  229853  73.90
U2R  228  0.07
R2L  16189  5.20
Total  311029  100
13
ANFIS
  • ANFIS is an adaptive neuro-fuzzy inference system
  • Ability to construct models solely from samples
    of the target system (learning)
  • Adapts itself through repeated training
    (adaptation)
  • These abilities, among others, qualify ANFIS as a
    fuzzy classifier for IDS
  • Here we use ANFIS as a neuro-fuzzy classifier to
    detect intrusions in computer networks based on
    the KDD Cup 99 dataset.

14
Generating Target fuzzy Inference System
  • Grid partitioning
  • All possible rules are generated based on the
    number of MFs for each input.
  • For example, in a two-dimensional input space
    with three MFs per input, grid partitioning
    results in 3^2 = 9 rules.
  • Subtractive clustering
  • Subtractive clustering is a fast, one-pass
    algorithm for estimating the number of clusters
    and the cluster centers in a set of data (a
    sketch follows this list).
  • The cluster information obtained by this method
    is used for determining the initial number of
    rules and antecedent membership functions, which
    are used for identifying the FIS.
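
A rough pure-NumPy sketch of Chiu-style subtractive clustering, simplified to a single acceptance threshold; the radii and the stopping ratio below are illustrative, not values taken from the slides:

import numpy as np

def subtractive_clustering(X, ra=0.5, rb=None, accept_ratio=0.5):
    """Estimate cluster centers from data X (rows = samples, columns
    scaled to [0, 1]).  ra is the neighborhood radius; rb (default
    1.5 * ra) is the squash radius used when subtracting potential."""
    rb = rb or 1.5 * ra
    alpha, beta = 4.0 / ra**2, 4.0 / rb**2
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    potential = np.exp(-alpha * sq_dists).sum(axis=1)   # density measure
    centers = []
    first_potential = potential.max()
    while True:
        c = int(potential.argmax())
        if potential[c] < accept_ratio * first_potential:
            break                                        # remaining points too weak
        centers.append(X[c])
        # Subtract the chosen center's influence so nearby points
        # are unlikely to become centers themselves.
        potential -= potential[c] * np.exp(-beta * sq_dists[c])
    return np.array(centers)

# Toy usage: two well-separated blobs should yield two centers.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.2, 0.03, (50, 2)), rng.normal(0.8, 0.03, (50, 2))])
print(len(subtractive_clustering(X, ra=0.5)))   # typically 2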

15
Initial SYSTEM ARCHITECTURE
  • KDD features come in all forms: continuous,
    discrete, and symbolic.
  • Preprocessing: mapping symbolic-valued attributes
    to numeric ones (a sketch follows this list).
  • 150000 randomly selected points of the 10% data
    subset are used as the training data.
  • 40000 randomly selected records are used as the
    checking data (for validating the model).
  • Five trials of 40000 sampled connections each,
    drawn from the source of the training dataset so
    that they overlap neither with the training set
    nor with each other, are used as the testing data.
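
The slides do not say which numeric encoding was used for the symbolic attributes; one common choice, sketched in Python, is to assign integer codes in order of first appearance:

def encode_symbolic(records, symbolic_indices=(1, 2, 3)):
    """Map symbolic attributes (protocol_type, service, flag) to integer
    codes assigned in order of first appearance.  The exact encoding used
    in the original work is not specified; this is one common choice."""
    codebooks = {i: {} for i in symbolic_indices}
    encoded = []
    for record in records:
        row = list(record)
        for i in symbolic_indices:
            codebook = codebooks[i]
            row[i] = codebook.setdefault(row[i], len(codebook))
        encoded.append(row)
    return encoded, codebooks

rows, books = encode_symbolic([[0, "tcp", "http", "SF", 200],
                               [0, "udp", "domain_u", "SF", 104]])
print(rows[1][:4])   # [0, 1, 1, 0]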

16
Initial SYSTEM ARCHITECTURE
  • The subtractive clustering method with ra = 0.5
    (neighborhood radius) partitions the training
    data and generates an FIS structure.
  • Then, for further fine-tuning and adaptation of
    the membership functions, the training dataset
    was used for training ANFIS while the checking
    dataset was used for validating the identified
    model.
  • The final ANFIS contains 212 nodes and a total
    number of 284 fitted parameters, of which 164
    are premise parameters and 84 are consequent
    parameters.

17
Initial SYSTEM ARCHITECTURE
  • Training ANFIS causes further fine-tuning and
    adaptation of initial membership functions.
    Initial and final membership functions of some
    input features are illustrated here.

18
Initial SYSTEM ARCHITECTURE
  • The ANFIS structure has a single output.
  • We obtain an approximate class number by rounding
    off the ANFIS output; G is the rounding-off
    parameter that maps the continuous output to an
    integer class label (see the sketch below).
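
The slides leave the exact rounding rule implicit; one plausible reading, with G as the half-width of the acceptance band around each integer class label (G = 0.5 reduces to ordinary rounding), is:

def round_off(output, g=0.5, num_classes=5):
    """Map a continuous ANFIS output to an integer class label.
    This is one plausible interpretation of the slides' parameter G:
    accept the nearest integer if the output lies within g of it."""
    nearest = int(round(output))
    nearest = min(max(nearest, 0), num_classes - 1)   # clamp to valid labels
    if abs(output - nearest) <= g:
        return nearest
    return None   # output too far from any class; treat as unreliable

print(round_off(1.86))         # 2
print(round_off(1.86, g=0.1))  # None (outside the acceptance band)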

19
Standard metrics for evaluating network IDSes
  • Some definitions (see the sketch after this list)
  • Detection rate is the ratio between the number of
    correctly detected attacks and the total number
    of attacks.
  • False alarm (false positive) rate is the ratio
    between the number of normal connections
    incorrectly classified as attacks and the total
    number of normal connections.
  • Classification rate is the ratio between the
    number of test instances correctly classified
    and the total number of test instances.
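
In Python these three metrics can be computed from true and predicted labels as follows (the label encoding 0 = normal, 1..4 = attack categories is an assumption of this sketch):

import numpy as np

def ids_metrics(y_true, y_pred, normal_label=0):
    """Detection, false-alarm and classification rates (in percent).
    Detection here means an attack record predicted as any non-normal
    class, which is one reading of the definition above."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    attacks = y_true != normal_label
    normals = ~attacks
    detection = 100.0 * np.mean(y_pred[attacks] != normal_label)
    false_alarm = 100.0 * np.mean(y_pred[normals] != normal_label)
    classification = 100.0 * np.mean(y_pred == y_true)
    return detection, false_alarm, classification

print(ids_metrics([0, 0, 2, 2, 3], [0, 1, 2, 2, 2]))  # (100.0, 50.0, 60.0)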

20
Results
  • False alarm, detection, and classification rates
    for the training and checking data, G = 0.5
  • (Figure: error measures vs. epoch number for the
    training dataset)

Data  False Alarm Rate (%)  Detection Rate (%)  Classification Rate (%)
Training 0.61 99.75 99.68
Checking 1.6 91.00 92.44
21
Results
  • Experiment 1
  • All the records of the labeled test dataset
    (corrected) were used as the testing data to
    evaluate our classifier.
  • False alarm, detection, and classification rates
    for the test data of the first experiment, G = 0.5

Data  False Alarm Rate (%)  Detection Rate (%)  Classification Rate (%)
Test 1.6 91.07 92.48
22
Results
  • Experiment 2
  • Five trials of 40000 randomly selected samples
    each.
  • The results are averaged over the trials.
  • We compare our classifier with different fuzzy
    algorithms.
  • Comparison of false alarm rate, detection rate,
    and complexity of the different algorithms.

Algorithm  False Alarm Rate (%)  Detection Rate (%)  Complexity
Neuro-Fuzzy Classifier  0.59  99.54  O(n)
SRPP [1]  3.58  99.08  O(n)
EFRID [7]  7  98.96  O(n)
RIPPER [5]  2.02  94.26  O(n log2 n)
23
Final System architecture
24
Proposed System (Data Sources)
  • The distribution of the samples in the two
    subsets that were used for training
  • SAMPLE DISTRIBUTIONS ON THE FIRST TRAINING AND
    CHECKING DATA, RANDOMLY SELECTED FROM THE 10%
    DATA OF KDD CUP 99 DATASET

Classifier Subset Normal Probe DoS U2R R2L
ANFIS-N Training 20000 4000 15000 40 1000
Checking 2500 107 2000 12 126
ANFIS-P Training 10000 4000 5000 40 1000
Checking 1000 107 500 12 126
ANFIS-D Training 25000 4000 20000 40 1000
Checking 6000 107 5000 12 126
ANFIS-U Training 200 50 50 46 50
Checking 100 25 25 6 25
ANFIS-R Training 4000 1000 2000 40 1000
Checking 2000 500 1000 12 126
25
Proposed System (Data Sources) cont.
  • SAMPLE DISTRIBUTIONS ON THE SECOND TRAINING AND
    CHECKING DATA, RANDOMLY SELECTED FROM THE 10%
    DATA OF KDD CUP 99 DATASET

Classifier Subset Normal Probe DoS U2R R2L
ANFIS-N Training 1500 500 500 52 500
Checking 1500 500 500 0 500
ANFIS-P Training 1500 500 500 52 500
Checking 1500 500 500 0 500
ANFIS-D Training 1500 500 500 52 500
Checking 1500 500 500 0 500
ANFIS-U Training 1500 500 500 46 500
Checking 1500 500 500 6 500
ANFIS-R Training 1500 500 500 52 500
Checking 1500 500 500 0 500
26
Proposed System (ANFIS Classifiers)
  • The subtractive clustering method with ra = 0.5
    (neighborhood radius) has been used to partition
    the training sets and generate an FIS structure
    for each ANFIS.
  • For further fine-tuning and adaptation of the
    membership functions, the training sets were used
    for training each ANFIS.
  • Each ANFIS is trained for 50 epochs, and the
    final FIS associated with the minimum checking
    error is chosen.
  • All the MFs of the input fuzzy sets were
    selected in the form of Gaussian functions with
    two parameters.

27
Proposed System (The Fuzzy Decision Module)
  • A five-input, single-output Mamdani fuzzy
    inference system
  • Centroid-of-area defuzzification
  • Each input and output fuzzy set includes two MFs
  • All the MFs are Gaussian functions specified by
    four parameters.
  • The output of the fuzzy inference engine varies
    between -1 and 1 and specifies how intrusive the
    current record is:
  • 1 for completely intrusive and -1 for completely
    normal
  • FUZZY ASSOCIATIVE MEMORY FOR THE PROPOSED FUZZY
    INFERENCE RULES (a sketch of the inference step
    follows the table)

Normal PROBE DoS U2R R2L Output
High - - - - Normal
- High High High High Normal
- High - - - Attack
- - High - - Attack
- - - High - Attack
- - - - High Attack
Low - - - - Attack
- Low Low Low Low Normal
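A minimal pure-NumPy illustration of one Mamdani inference step of this shape, using ordinary two-parameter Gaussian MFs, illustrative centers and widths, and only three of the rules from the table (the original module uses four-parameter MFs that are tuned by the GA):

import numpy as np

def gauss(x, c, s):
    """Two-parameter Gaussian membership function."""
    return np.exp(-0.5 * ((x - c) / s) ** 2)

def fuzzy_decision(normal, probe, dos, u2r, r2l):
    """Mamdani-style sketch: min for rule firing, max for aggregation,
    centroid defuzzification on [-1, 1].  MF centers/widths and the
    three rules used are illustrative only; dos, u2r and r2l would be
    consulted by the remaining rules of the table."""
    high = lambda v: gauss(v, 1.0, 0.35)     # membership in "High"
    low = lambda v: gauss(v, 0.0, 0.35)      # membership in "Low"
    y = np.linspace(-1.0, 1.0, 201)          # output universe
    out_normal = gauss(y, -1.0, 0.5)         # output MF "Normal"
    out_attack = gauss(y, 1.0, 0.5)          # output MF "Attack"

    w1 = high(normal)   # Rule: IF Normal is High THEN output is Normal
    w2 = low(normal)    # Rule: IF Normal is Low THEN output is Attack
    w3 = high(probe)    # Rule: IF Probe is High THEN output is Attack

    aggregated = np.maximum.reduce([np.minimum(w1, out_normal),
                                    np.minimum(w2, out_attack),
                                    np.minimum(w3, out_attack)])
    return float((y * aggregated).sum() / aggregated.sum())  # centroid

# A record scored high by ANFIS-N and low by the attack classifiers
# should land close to -1 (completely normal).
print(fuzzy_decision(normal=0.95, probe=0.05, dos=0.0, u2r=0.0, r2l=0.1))
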
28
Proposed System (Genetic Algorithm Module)
  • A chromosome consists of 320 bits of binary data.
  • Each group of 8 bits in a chromosome determines
    one of the four parameters of an MF (see the
    decoding sketch below).
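
A small Python sketch of decoding such a chromosome; the parameter range and ordering are assumptions, since the slides only give the bit layout:

import numpy as np

def decode_chromosome(bits, bits_per_param=8, lo=0.0, hi=1.0):
    """Decode a 320-bit chromosome into 40 real-valued MF parameters
    (8 bits each), linearly scaled into [lo, hi].  The range [lo, hi]
    and the parameter ordering are illustrative assumptions."""
    bits = np.asarray(bits).reshape(-1, bits_per_param)
    weights = 2 ** np.arange(bits_per_param - 1, -1, -1)
    integers = bits @ weights                      # each value in 0..255
    return lo + (hi - lo) * integers / (2 ** bits_per_param - 1)

rng = np.random.default_rng(1)
chromosome = rng.integers(0, 2, 320)
params = decode_chromosome(chromosome)
print(params.shape)   # (40,) -> e.g. 10 MFs x 4 parameters each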

29
Proposed System (Some Metrics)
  • Cost Per Example (see the sketch after this list)
  • CPE = (1 / N) * sum over i, j of CM(i, j) * C(i, j)
  • Where CM is a confusion matrix
  • Each column corresponds to a predicted class and
    each row to an actual class. The entry at row i
    and column j, CM(i, j), is the number of
    instances that actually belong to class i but
    were classified as members of class j. The
    entries on the main diagonal, CM(i, i), count the
    properly classified instances.
  • C is a cost matrix
  • As with CM, entry C(i, j) is the cost penalty for
    misclassifying an instance belonging to class i
    into class j.
  • N is the total number of test instances,
  • m is the number of classes in the classification.
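
The same cost-per-example computation in Python, with a toy 2x2 confusion matrix as a sanity check (the matrices below are made-up examples, not results from the slides):

import numpy as np

def cost_per_example(cm, cost):
    """Cost per example: CPE = (1/N) * sum_ij CM[i, j] * C[i, j],
    where N is the total number of test instances."""
    cm, cost = np.asarray(cm), np.asarray(cost)
    return float((cm * cost).sum() / cm.sum())

# Toy check: 90 + 5 correct (cost 0), 10 misses at cost 2, 5 false
# alarms at cost 1 -> (10*2 + 5*1) / 110 = 0.227.
cm = [[90, 5],     # actual normal: 90 right, 5 false alarms
      [10, 5]]     # actual attack: 10 missed, 5 detected
cost = [[0, 1],
        [2, 0]]
print(round(cost_per_example(cm, cost), 3))   # 0.227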

30
Proposed System (Fitness Function for GA)
  • Two different fitness functions were used
  • Cost per example with equal misclassification
    costs
  • Cost per example with the cost matrix used for
    evaluating the results of the KDD'99 competition

Cost matrix used in the KDD'99 competition (rows: actual class, columns: predicted class)
              Normal  PROBE  DoS  U2R  R2L
Actual Normal    0      1     2    2    2
Actual PROBE     1      0     2    2    2
Actual DoS       2      1     0    2    2
Actual U2R       3      2     2    0    2
Actual R2L       4      2     2    2    0
Equal-misclassification-cost matrix (rows: actual class, columns: predicted class)
              Normal  PROBE  DoS  U2R  R2L
Actual Normal    0      1     1    1    1
Actual PROBE     1      0     1    1    1
Actual DoS       1      1     0    1    1
Actual U2R       1      1     1    0    1
Actual R2L       1      1     1    1    0
31
Proposed System (Data Sources for GA)
THE SAMPLE DISTRIBUTIONS ON THE SELECTED SUBSET
OF 10% DATA OF KDD CUP 99 DATASET FOR THE
OPTIMIZATION PROCESS USED BY THE GA
Normal Probe DoS U2R R2L
Number of Samples 200 104 200 52 104
32
Results
  • 10 subsets of training data from both series were
    used for the classifiers.
  • The genetic algorithm was performed three times,
    each time for one of the five series of selected
    subsets.
  • In total, 150 different structures were used, and
    the reported result is the average over these 150
    structures.
  • Two different training datasets for training the
    classifiers and two different fitness functions
    for optimizing the fuzzy decision-making module
    were used.
  • ABBREVIATIONS USED FOR OUR APPROACHES

Abbreviation Approach
ESC-KDD-1 First Training set with fitness function of KDD
ESC-EQU-1 First Training set with fitness function of equal misclassification cost
ESC-KDD-2 Second Training set with fitness function of KDD
ESC-EQU-2 Second Training set with fitness function of equal misclassification cost
33
Results cont.
CLASSIFICATION RATE, DETECTION RATE (DTR), FALSE
ALARM RATE (FA) AND COST PER EXAMPLE OF KDD (CPE)
FOR THE DIFFERENT APPROACHES OF ESC-IDS ON THE
TEST DATASET WITH CORRECTED LABELS OF KDD CUP 99
DATASET
Model Normal Probe DoS U2R R2L DTR FA CPE
ESC-KDD-1 98.2 84.1 99.5 14.1 31.5 95.3 1.9 0.1579
ESC-EQU-1 98.4 89.2 99.5 12.8 27.3 95.3 1.6 0.1687
ESC-KDD-2 96.5 79.2 96.8 8.3 13.4 91.6 3.4 0.2423
ESC-EQU-2 96.9 79.1 96.3 8.2 13.1 88.1 3.2 0.2493
CLASSIFICATION RATE, DETECTION RATE (DTR), FALSE
ALARM RATE (FA) AND COST PER EXAMPLE OF KDD (CPE)
FOR THE DIFFERENT ALGORITHMS PERFORMANCES ON THE
TEST DATASET WITH CORRECTED LABELS OF KDD CUP 99
DATASET (N/R STANDS FOR NOT REPORTED)
Model Normal Probe DoS U2R R2L DTR FA CPE
ESC-IDS 98.2 84.1 99.5 14.1 31.5 95.3 1.9 0.1579
RSS-DSS 96.5 86.8 99.7 76.3 12.4 94.4 3.5 n/r
Parzen-Window 97.4 99.2 96.7 93.6 31.2 n/r 2.6 0.2024
Multi-Classifier n/r 88.7 97.3 29.8 9.6 n/r n/r 0.2285
Winner of KDD 99.5 83.3 97.1 13.2 8.4 91.8 0.6 0.2331
Runner Up of KDD 99.4 84.5 97.5 11.8 7.3 91.5 0.6 0.2356
PNrule 99.5 73.2 96.9 6.6 10.7 91.1 0.4 0.2371