Intrusion Prevention and Detection for Web Databases

About This Presentation

Title:

Intrusion Prevention and Detection for Web Databases

Description:

... security magazine and steal customer's information (http://www.webappsec.org ... a Rhode Island government web site and stole credit card data from individuals ... – PowerPoint PPT presentation

Number of Views:114

Avg rating:3.0/5.0

Slides: 68

Provided by: cse7

Learn more at: https://www.cse.fau.edu

Category:

more less

Transcript and Presenter's Notes

Title: Intrusion Prevention and Detection for Web Databases

1
Intrusion Prevention and Detection for Web
Databases

by
Ehud Gudes and Alex Roichman
June 2008

2
Outline

Web Applications and Databases
Architecture
Attack surface
Multilayered security solution for web databases
Parameterized View prevention method
Session Key parameter
Rolling Key parameter
Implementation and analysis
DIWeDa intrusion detection method
Architecture
Profile building algorithm
Intrusion detection algorithm
Implementation and evaluation
Summary

3
Traditional DB

Two-tier architecture
Users are fixed and known
Users number is limited
All transactions are passed via the same
connection and users connect directly to the DB

4
Traditional DB Access Control

DB can identify and authenticate its users
DB can authorize users by traditional
user/role-based access control
It is quite easy to follow up single user
transactions to seek signs of intrusion
Views can be used to determine for a user the
only part of the database that interests her

5
Web Databases

Three-tier architecture
Users are casual and unknown
Users number is not limited
Users do not connect directly to the DB

6
Connection Pooling

Different web users can run their SQL statements
on the same DB connection and one user can run
her SQL statements on different connections
This technique contributes to application
efficiency since the time to open and close the
connection is saved per each request
It has serious implications on the databases
access control mechanism

7
Web DB Access Control

Web DB does not identify the real application
user who accesses it
The only user accessing the database is the user
of the web application server most often with
full access privileges (administrator or
super-user)
No user-based access control can be applied
No or very limited RBAC can be applied
The principle of minimal privilege is violated
No more fine-grained access control to the web DB
exists

8
Web DB is vulnerable

There is no way to limit the web user privileges
at the database level of the web databases.
There are still many secured web applications,
but their security can be achieved only by
application means (writing lines of code that
implement security policies) and not by database
access control means (GRANT/REVOKE and VIEWS)
This situation is very problematic
Implementing access control by writing code is a
time consuming task
Enterprise may have one DB but many applications,
then access control is distributed and in many
cases not consistent
Programmers must be security specialists
There is no confidence that all accesses to DB
are authorized
As a result "70 of websites are at immediate
risk of being hacked!" (http//www.acunetix.com)

9
Attack example SQL Injection

In many Web application the SQL sentence is
structured as a string and the users parameter
is concatenated to this string

strSQL "SELECT Salary FROM Salary_Table
WHERE Employee_No 123
AND Salary_Date '" dateParam "'"

But the intruder can type 01.2007' or '1' '1
and retrieve the entire table

SELECT Salary FROM Salary_Table WHERE
Employee_No 123 AND Salary_Date
'01/2007' OR '1' '1'
10
Attack example Business Logic Violation

Another attack on web databases is the Business
Logic Violation (BLV) Attack.
For example an enterprise can require that each
buyer must key up her paying details prior to
buying some book in an online bookstore
application. Buying without entering a payment
details will violate a business rule of an
enterprise.

11
Web attacks reports

Many web applications have serious security holes
and vulnerability reports found on the Internet
show how the existing situation is critical
On November 01, 2005 a high school student used
SQL injection to break into the site of a
Taiwanese information security magazine and steal
customer's information (http//www.webappsec.org/p
rojects/whid/list_id_2005-46.shtml).
On January 13, 2006 hackers broke into a Rhode
Island government web site and stole credit card
data from individuals who have done business
online with state agencies (http//www.webappsec.o
rg/projects/whid/list_id_2006-3.shtml).
On March 29, 2007 a SQL injection flaw in
official Indian government site was discovered
(http//www.webappsec.org/projects/whid/list_id_20
07-12.shtml).
On June 29, 2007 hackers defaces Microsoft U.K.
web page by using an SQL injection attack
(http//www.cgisecurity.com/2007/06/27).
On August 12, 2007 the United Nations web site
was defaced by using SQL injection
(http//hackademix.net/2007/08/12/united-nations-v
s-sql-injections/).

12
Web attacks reports SQL Injection is very
popular

SQL injection attacks are extremely prevalent,
and ranked as the second most common form of
attack on web applications in 2006 in CVE (Common
Vulnerabilities and Exposures list).
The percentage of these attacks among the overall
number of attacks reported rose from 5.5 in 2004
to 14 in 2006 (http//cve.mitre.org).
Recent reports suggest that a large number of
applications on the web are vulnerable to SQL
injection attacks (http//www.darkreading.com/docu
ment.asp?docid103774WT.svlnews1_3),
The number of attacks are on the increase, and
SQL Injection is on the list of most prevalent
forms of attack (http//www.computerweekly.com/Art
icles/2004/04/16/201840/Topfivethreats.htm)
The recent SQL Injection attack on CardSystems
Solutions that exposed several hundreds of
thousands of credit card numbers is an example of
how such attack can victimize an organization and
members of the general public (http//www.securewo
rks.com/press/20060718-sql.html).

13
Related work

Conventional methods for providing protection in
databases rely heavily on the identity of the
entity accessing the database
Several suggestions were published in the
literature to prevent SQL injection attacks
There is research that proposes to refer to the
structure of each SQL statement possible in the
system and to fingerprint that structure
An additional approach is to refer to some
interesting properties of each SQL sentence such
as referenced tables and fields
Another approach to database IDS is not to build
profiles for each SQL sentence, but for each
database user
Another approach is to search for data
dependencies among the data items in a database

14
Related work criticism

The proposed prevention methods are applicative
and cannot be implemented by the database
built-in mechanisms they try to shield databases
from attacks instead of making databases
resistant to them
The proposed protection methods do not fit web
databases since they do not propose any algorithm
for dealing with web session traceability
problem they do not work at the session level
and their approach is not role-based thus they
cannot detect many attacks from web applications

15
Our Approach

We propose the two layered security model for web
databases
The purpose of the first line of defense is to
prevent intrusions.
The purpose of the second line of defense is to
detect attacks which elude the prevention layer.
We use the Parameterized Views method - a novel
fine-grained access control mechanism at the
first prevention line of defense.
We use DIWeDa (Detecting Intrusion in Web
Databases) as our method of intrusion detection
at the second line of defense.
Two layers work together to achieve the highest
level of web database protection.

16
Prevention - the first line of defense

The Parameterized View method solves the problems
of absence of web database internal access
control and enable using of views as a means of
access restriction
This method is capable of preventing attacks to
the DB by the existing native database protection
mechanisms
It is attack-independent and significantly
decreases the database attack surface
It enables real-life development of secure
Internet database by using commercially available
tools

17
Parameterized View

Our concept is based on the use of parameterized
views as the means to supervise the accesses to
the DB from web systems
The parameterized view transfers the user's
identity to the DB and the view displays the
relevant data to this user accordingly, thus
providing fine-grained access control to web DB
Example of the Parameterized View definition

CREATE VIEW Student_Marks_View WITH pStudent_No
SELECT FROM Student_Marks_Table WHERE
Student_No pStudent_No
18
Traditional Parameterized Views without random
parameters

Example of select from the parameterized view
But if instead of the course number an intruder
types 12345
UNION Select From Student_Marks_View(2)
then the SQL
injection will still be possible
The solution to this problem is not to use the
explicit user identity in the SQL statement but
instead use a run-time generated identifier which
will be very difficult to fake.

SELECT FROM Student_Marks_View(1) WHERE Cour
se_No 12345
SELECT FROM Student_Marks_View(1) WHERE
Course_No 12345 UNION SELECT
FROM Student_Marks_View(2)
19
Parameterized views flow of events
20
Our Parameterized View

The main requirement is that the parameter will
be difficult to fake
The application will be authorized to
parameterized views only
In the parameterized view, the parameter will
contain the identity of the user
We present two design solutions for the
parameterized views
Application Session Key parameter
Rolling Key parameter
For details See our SACMAT2007 paper

21
Detection - the second line of defense

Although access control can prevent many attacks
on web databases, it cannot prevent all of them.
One kind of an attack on web databases which
cannot be prevented is the Business Logic
Violation(BLV)
This attack cannot be prevented ether by a
database not by the parameterized views since the
existing database/views access control can grant
or revoke access to resources only according to
the accesor identity/role/parameter and cannot
rely on the business logic of an enterprise.
If many data-centric attacks cannot be
affectively prevented, it make a sense at least
to detect them.

22
Detecting BLV at the DB level vs. Application
level

BLV can be detected either at the application or
at the database level, but-
An enterprise in general has several
applications, but one database.
Applications are changed frequently, but business
rules which are seen at the database level are
stable.
Thus it is preferable to have only one IDS at the
database level to enforce stable business rules
and not to have IDS for each application and to
cope with continues application's changes.
The conclusion it is preferable to detect BLV
at the database level by a web database Intrusion
Detection System.

23
DIWeDa our solution for Web Database Intrusion
Detection Systems

Intrusion Detection Systems (IDS) for network and
operating systems have existing for over 20
years.
IDS for databases is relatively new field of
research that has surfaced in the last few years.
IDS for web databases is almost not exist.
We present DIWeDa, which is able to detect not
only most known attacks such as SQL injections,
but also more complicated and complex kinds of
attacks such as business logic violations.

24
DIWeDa role based approach

Since different roles of an enterprise have
different authorizations, the best strategy for
web database IDS is to build profiles not per an
enterprise but per enterprise roles.
Role-based models are widely used for old desktop
applications, but most of the web applications do
not use roles
the real user of a web session is unknown at the
database level due to connection pooling issues,
thus it is impossible to apply role base access
control to web databases
sometimes the actual role a user uses is
determined dynamically only at run time.
DIWeDa identifies database roles from the learnt
profiles and look for intrusions from one role to
another.

25
Employing many intrusion classifiers

We use data mining techniques to detect
intrusions and employ two different intrusion
classifiers
SQL content classifier (this paper)
SQL parameters classifier
The combination of intrusion classifiers help us
to achieve very high accuracy and preciseness in
detecting of attacks.
The combination of intrusion classifiers was
previously used for network or operation systems
IDS, but we first use it for detecting intrusions
for web databases.

26
DIWeDas System Architecture
27
Profile Builder algorithm

During the training phase DIWeDa learns an
application behavior and builds profiles which
will be used later during the detection phase.
Each DIWeDas profile consists of
the SQL Fingerprint-Set that represents the
enterprise SQL space
Statistical characteristics of parameters for
each fingerprint
The Cluster-Set that represents the SQL content
of each enterprise role

28
Profile Builder Fingerprint-Set

SQL fingerprint represents the structure of a SQL
sentence.
DIWeDa learns all possible fingerprints of an
application and save them in the Fingerprint-Set.
Some attacks, such as SQL injection, always
change the structure of SQL, so if a session
contains an SQL statement without known
fingerprint this session might be under attack.
But more complicated attacks such as logic
violations cannot be detected only by looking for
a statement level SQL structure anomaly- we also
must look for a session level SQL content anomaly.

29
Profile Builder Session Vector

Each application session can be presented by its
SQL Session Vector.
A Session Vector is a binary vector SV with the
length equal to the number of fingerprints in the
application, where the ith bit is 1 if the
application session submits SQL with the ith
fingerprint, else bit i is 0. (note order is
immaterial)
For example, if some application session runs SQL
statements with corresponding fingerprints 1, 3,
5 and there are 7 different fingerprints at all,
we have the following session vector
SV 1, 0, 1, 0, 1, 0, 0

30
Profile Builder SQL space

The Session Vector enables us to formally define
the session SQL contents.
We can think about the application SQL space as
an n-dimensional space, where n is the number of
fingerprints for the application.
Then each session's SQL content can be seen as a
vector (or a point with n coordinates) in the
n-dimensional space.

31
How Sessions are identified?

Connection pooling makes identifying and
separating under sessions difficult
But with parameterized views the SQL statements
issued by a user are uniquely identified by the
unique random key of the parameterized view.
Thus, sessions can be identified and monitored.

32
Profile Builder distribution in SQL space

If a session's SQL content was
absolutely random, then the distribution
of vectors in the space should be uniform.
But in reality, a session's SQL content
is not random- this means that
a session can submit only SQL statements
that a session's role is supposed to submit.
The groups of points in the last figure represent
different access roles and show that SQL contents
of sessions from the same role are very similar,
while SQL contents of sessions from different
roles are far apart.

33
Profile Builder looking for intrusions

We will find clusters (application roles) by
using data mining techniques
We will look for intrusion by looking for
sessions with vectors which are far from the
found clusters

34
Profile Builder Cluster

The Cluster is a group of highly similar Session
Vectors which belong to the same role.
The Cluster Centroid is a vector CC with vector
values that are the respective means of the
cluster vectors.
The distance D between two clusters represented
by their centroids is computed as

35
Profile Builder Clustering Algorithm

Now we can use the data mining technique
(hierarchical clustering algorithm) and cluster
application sessions of the same role.
This algorithm builds a tree which is called the
cluster Dendrogram tree.

36
Profile Builder Building Dendrogram

Hierarchical_Clustering_Algorithm
//Building Dendrogram tree
For each application session build its Session
Vector.
Start with each Session Vector as a separate
cluster.
Save all clusters received at this stage in a
Cluster-Set and initialize i to 1.
Select two closest clusters to merge into a
single cluster.
Compute the new cluster centroid for the merged
cluster.
Save all clusters received at this stage in a
corresponding Cluster-Set and advance i by 1.
Repeat steps 4-6 until we get a single cluster.

37
Profile Builder Cluster-Set

Now we will choose the best Cluster-Set (layer of
the dendrogram tree) to serve as a profile for
DIWeDa.
We recommend to choose a Cluster-Set with the
Minimum Total Distance to serve as the profile
for DIWeDa.
This approach finds a cluster-set with small
specific clusters that are far from a global
centroid.
By using this approach, DIWeDa will find specific
separated roles (clusters).

38
Profile Builder Cluster-Set with Minimum Total
Distance

The Minimum Total Distance is computed by the
following formula
where Intra-Cluster Distance is computed by
and Inter-Cluster Distance is computed by

39
Profile Builder Algorithm

To summarize, our algorithm of building
cluster-based profile is as follows
Build_Cluster_Based_Profile
Find all application fingerprints with support gt
SIR and save them in Fingerprint-Set
Run Build_Dendrogram
Select the appropriate set of clusters (layer in
Dendrogram) with the Minimum Total Distance and
save it in Cluster-Set
Delete Clusters from Cluster-Set with a small
Support

40
Detection Phase Session Anomaly by SQL content

The probability of an analyzed session
represented by its session vector SV to have an
abnormal SQL content is defined by the formula
where MD (Minimum Distance) is the distance
between SV and the closest cluster centroid from
the cluster-set and NUS is the number of
unexpected statements in the analyzed session.
This formula is based on the following
assumption
The further a session vector is from any existing
cluster, the more abnormal a session is.
The more unexpected statement a session has, the
more abnormal a session is.

41
Detection Phase Intrusion Threshold

Any session with probability to be abnormal
greater than the Session Intrusion Threshold will
be classified as intrusion.
The Session Intrusion Threshold is the IDS
sensitivity and is represented by a number from
the range 0, 1, where each session with the
probability to be abnormal greater than this
threshold will be classified as intrusion.
The interesting thing that was learned from our
evaluation is that the best threshold is equal to
the Cluster-Set Maximum Radius.

42
Detection Phase Maximum Radius

The Cluster-Set Maximum Radius as the maximum
between Cluster Maximum Distances over all
clusters in the cluster-set.

The idea is that any session with a vector inside
a cluster will be classified as legal and outside
a cluster as intrusion

43
DIWeDa Implementation and Evaluation

We implemented a prototype of the proposed system
and used it to evaluate the system.
The prototype was developed with C and SQL
Server 2005.
The online bookstore application was used as a
web application benchmark.
The profiles were built by manually operating
this application.
The analyzed sessions were created by synthetic
data

44
Inputs to our system - DB log
45
Test case patterns results
of FN of FP of TN of TP of instances Pattern Desc Pattern
0 9 191 0 200 Legal sessions 1
0 0 0 5 5 Where clause modification 2
0 0 0 5 5 Field clause modification 3
0 0 0 5 5 From clause modification 4
0 0 0 10 10 SQL randomization 5
0 0 0 5 5 Business logic escalation 1 (mix of different roles SQL contents) 6
0 0 0 5 5 Business logic escalation 2 (mix of randomly chosen legal SQLs) 7
5 0 0 0 5 Business logic escalation 3 (sessions with random order of SQLs) 8
0 0 0 15 15 Business logic escalation 4 (sessions built on SQLs without their original contents) 9
0 0 0 5 5 Complex attacks scenario 10
46
DIWeDa Implementation and Evaluation

We found that our proposal for the system
threshold of 0.07 was the best as can be seen
from the following ROC.
At this threshold we achieved TPR 92.5 and
FPR5.
The following table shows the evaluation summary

of FN of FP of TN of TP of instances Pattern
0 9 191 0 200 1
0 0 0 5 5 2
0 0 0 5 5 3
0 0 0 5 5 4
0 0 0 10 10 5
0 0 0 5 5 6
0 0 0 5 5 7
5 0 0 0 5 8
0 0 0 15 15 9
0 0 0 5 5 10
47
Evaluation summary

Our True Positive Rate was about 92.5
Our False Positive Rate was about 5

48
Future work

Improving accuracy
increasing TPR and decreasing FPR may be achieved
by employing more than two intrusion classifiers
and by finding a smart algorithm of classifiers
fusion
Adapting the proposed algorithms to new kind of
web attacks

49
Summary

Multilayered security model for web databases
the novel fine-grained access control mechanism
as the first prevention line of defense
general attack independent prevention mechanism
prevention at DB layer
user/session traceability at DB layer
the session anomaly detection mechanism as the
second intrusion detection line of defense
role-based approach
detection of attacks which can be seen only at
session level
detection accuracy due to the combination of
different intrusion classifiers
observing the normal working application with no
assumption that the learning period is clear from
attacks
Our solution proposes a very high protection
level for existing web database environments, and
this is the main significant advantage of this
research.

50
More on Parameterized Views
51
Our Parameterized View

The main requirement is that the parameter will
be difficult to fake
The application will be authorized to
parameterized views only
In the parameterized view, the parameter will
contain the identity of the user
We present two design solutions for the
parameterized views
Application Session Key parameter
Rolling Key parameter
For details See our SACMAT2007 paper

52
Application Session Key

A user requests to perform a process of
identification when she provides a username and a
password to the application
The application runs a DB stored procedure that
accepts the user's username and password and
returns a random number (AS_key)
DB stores the random number in a table of active
users
The application knows the user who works with it,
and stores the AS_key as well. Each SQL sentence
that will be run on behalf of the user will be
run with a parameter of user's corresponding
AS_key
The AS_key is cleared when a user disconnects
from the application

53
Example of Application Session Key

When Alice entering the system, she is
authenticated and receives a random key like
11011
The application is authorized to access only the
parameterized views
In order the view will return the data, it must
get a valid parameter
Now Alice may access information about her salary
from different months, but not the salary of
different employees.

54
Example of Application Session Key

Each SQL of Alice will look as follows
And the definition of the view looks as follows

SELECT Salary FROM Salary_View(110011) WHERE
Salary_Date '01/2007'
CREATE VIEW Salary_View WITH pAS_key SELECT
FROM Salary_Table WHERE Employee_No IN
( SELECT Employee_No FROM Users_Table
WHERE Users_Table.AS_keypAS_key )
55
Rolling Key

The first three steps are similar to the previous
parameter method based on the application session
key, except that now AS_key also serves as the
seed of the rolling key
DB and the application agree on a common
encryption key (Enc_key). This encryption key is
used to generate the next rolling key from the
current one
DB stores the AS_key, Enc_key and the current
Roll_Key in a table of active users

56
Rolling Key

Now, when the SQL sentence arrives, it contains a
request for a parameterized view with 2
parameters the AS_key and the rolling key. As a
result, two things occur
The view returns the filtered data that belongs
only to the user that the session and rolling
keys belong to
The rolling key is advanced to the next number
both in the application and in the DB

57
Example of Rolling Key

When Alice entering the system, she is
authenticated and receives two random keys
Now she can run a statement like

SELECT Salary FROM Salary_View(110111, 000111)
WHERE Salary_Date '01/2007'

But if she wants to run another statement, she
will submit

SELECT Salary FROM Salary_View(110111, 111010)
WHERE Salary_Date '02/2007'
58
Comparison of our two methods

Both approaches significantly decrease the attack
surface of web DB
The safety of the parameter method rests with the
difficulties of guessing the various keys
Example of SQL Injection into the parameterized
view

SELECT Salary FROM Salary_View(11011) WHERE
Salary_Date '01/2007' OR '1' '1'
Because Salary_View returns only the data of the
attacker, she may access information about her
salary from different months, but not the salary
of different employees.
59
Comparison of our two methods

AS key method is preferable from the efficiency
point of view the calculation of the next key in
the rolling key method is expensive
Rolling key method is preferable from security
point of view the rolling key method is
resistant to replay attacks and a correct guess
of a code is only valid for running one sentence
Both methods allow the identification of a real
user at the DB layer, thus fine-grained access
control can be applied at the DB layer

60
Implementation

The parameterized views are not yet the part of
the SQL standard
But we can implement our method with existing
functions stored in DB (table functions)
These functions are supported in most databases
and they can be used in standard DML sentences

SELECT FROM Student_Marks_Func(10101)

The advantage of table functions is their ability
to support delete and update actions, while
limiting the activity of them to the return value
of a function that depends on the parameter passed

61
More on Parameters Classifier
62
Profile Builder - Characterizing SQL Parameters

Detecting anomalous SQL sentences cannot rely
only on the SQL structure (fingerprints), we also
want to characterize the SQL parameter values.
Our goal is to capture normal characteristics of
parameters and then use these characteristics
during the detecting phase of DIWeDa to discover
anomalies in user parameters that are applied to
SQL statements.
We use two statistical models for characterizing
parameters parameter length and parameter
character distribution.

63
Profile Builder - Characterizing SQL Parameters

By computing the parameter length mean and
variance, we can find anomalies in a parameter,
when a malicious input is passed.
Legal SQL parameters have regular structures and
contain only human-readable characters.
In the case of attacks, such as the parameter
buffer overflow or parameter guessing by a brute
force attack, the character distribution is
changed dramatically.

64
Profile Builder - Characterizing SQL Parameters
65
Profile Builder - Characterizing SQL Parameters

Characterize_Parameters
For each fingerprint F from the Fingerprint Set
do
For each parameter P from F do
Find characteristics of P (AVG, STD and
Character Distribution) and save these
characteristics under the indices of F
and P

66
Detection Phase Session Anomaly by Parameters

Testing parameter length by the following
formula
Testing character distribution by Pearson test
Combining the two tests

67
Detection Phase Classifiers Combiner

Our method compute the probability of a session
to be legal based on following features
Session SQL content anomaly
Session parameter anomaly
Two classifiers support us with a value in the
same range 0, 1 that can be used as evidence to
what degree the analyzed session is abnormal.
Now we combine them. We rely on the fact that our
two classifiers are independent values of
parameters do not influence SQL structures and
vise versa, thus we can compute

Write a Comment

User Comments (0)

About PowerShow.com

Intrusion Prevention and Detection for Web Databases - PowerPoint PPT Presentation

Intrusion Prevention and Detection for Web Databases

... security magazine and steal customer's information (http://www.webappsec.org ... a Rhode Island government web site and stole credit card data from individuals ... – PowerPoint PPT presentation