Title: A View Based Security Framework for XML
1 A View Based Security Framework for XML
- Wenfei Fan, Irini Fundulaki, Floris Geerts,
- Xibei Jia, Anastasios Kementsietsidis
- University of Edinburgh
- Digital Curation Center
2 Introduction
- XML data management
- Its importance is clearly demonstrated by the wide adoption of XML-related technologies in eScience projects.
- Selective exposure of information in XML
- a primary concern for data providers, curators and consumers.
- safeguards data confidentiality, privacy and intellectual property
3 Introduction: Security View
- Security view: multiple user groups
- who wish to query the same XML document
- different access policies may be imposed, specifying the portions of the document the users are granted or denied access to.
- Security views are necessarily virtual
- it is prohibitively expensive to materialize and maintain a large number of views.
4 Example: a medical records XML database
[Figure: tree of the medical records XML document. Node labels: Hospital; Psychiatry, Genetics; Record; Date, Doctor, Patient, Bill; Name, Sex, Diagnosis. Name values shown: 'Mary', 'David', 'Mark', 'Angela'.]
Patient Mary can access her own medical records.
The security admin can see the whole database.
Doctor David can only access the records of his patients.
5 Insurer's View
An insurer can read only its customers' billing information.
6 Researcher's View
A medical researcher can retrieve the diagnosis data for research purposes, but not the information on doctors or patients.
7 System Architecture
[Figure: system architecture diagram. Recoverable labels:]
- Users: researchers, security admins
- Modules: Editor, View Derivation, Query Rewriting, Query Optimization, Query Evaluation, Indexer, Viewer
- Legend: input module, output module, core module, optional module, XML data flow, other data flow
- Per role U: a security specification S with XSD D, and a security view V (virtual view)
- Queries: query Q on view V; query QT on document T (XML database, XML schema)
- security spec. lang. LS: used by admins.
- view spec. lang. LV: transparent to users.
- view query lang. LQV: used by users.
- doc query lang. LQR: transparent to users.
8 Security Specification
- hospital -> patient
- (hospital, patient): [visit/treatment/medication = 'autism']
- patient -> pname, visit, parent
- (patient, pname): N
- (patient, visit): N
- parent -> patient
- visit -> treatment, date
- (visit, treatment): [medication]
- treatment -> test | medication
- (treatment, test): N
9 Security Specification
- Classify the nodes in the XML document
- accessible nodes
- inaccessible nodes
- conditionally accessible nodes
- Support
- inheritance
- overriding
- content-based access privileges
- context dependency
- View derivation module
- schema availability
- An XML schema that specifies the structure of the accessible data is critical to users, who can then formulate queries over this schema alone.
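The classification, inheritance and overriding listed above can be sketched in a few lines. This is a minimal Python sketch, not SMOQE's implementation. It assumes the annotations of the hospital specification are stored in a dictionary keyed by (parent, child) element types, that the two conditions are read as "the patient has a medication with value 'autism'" and "the treatment has a medication child", and that an explicit annotation always overrides the inherited value. The names `ANN`, `accessible_nodes` and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical encoding of the annotations: "N" = inaccessible,
# a callable = conditional access, missing key = inherit from the parent.
def has_autism_medication(patient):
    return any(m.text == "autism"
               for m in patient.findall("visit/treatment/medication"))

def has_medication(treatment):
    return treatment.find("medication") is not None

ANN = {
    ("hospital", "patient"): has_autism_medication,
    ("patient", "pname"): "N",
    ("patient", "visit"): "N",
    ("visit", "treatment"): has_medication,
    ("treatment", "test"): "N",
}

def accessible_nodes(node, parent_tag=None, inherited=True, path=""):
    """Yield the paths of accessible nodes; the root is assumed accessible."""
    rule = ANN.get((parent_tag, node.tag))
    if rule is None:
        access = inherited        # inheritance: no annotation on this edge
    elif rule == "N":
        access = False            # explicit denial overrides the parent
    else:
        access = rule(node)       # content-based condition overrides the parent
    here = f"{path}/{node.tag}"
    if access:
        yield here
    for child in node:
        yield from accessible_nodes(child, node.tag, access, here)

# Made-up sample document that follows the hospital schema.
DOC = ET.fromstring("""
<hospital>
  <patient>
    <pname>Ann</pname>
    <visit><treatment><medication>autism</medication></treatment><date>d1</date></visit>
  </patient>
  <patient>
    <pname>Bob</pname>
    <visit><treatment><test>t</test></treatment><date>d2</date></visit>
  </patient>
</hospital>
""")

result = list(accessible_nodes(DOC))
```

On this sample, only the first patient satisfies the condition on (hospital, patient). Her `pname`, `visit` and `date` nodes stay hidden, while the `treatment` node becomes accessible again through overriding and its `medication` child inherits that access.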
10 View Specification
- hospital -> patient
- (hospital, patient): patient[visit/treatment/medication = 'autism']
- patient -> treatment, parent
- (patient, treatment): visit/treatment[medication]
- (patient, parent): parent
- parent -> patient
- (parent, patient): patient
- treatment -> medication
- (treatment, medication): medication
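To illustrate how such a view specification connects a query on the view to the underlying document, the sketch below rewrites a plain child-axis path over the view by substituting, for each view edge, the document path attached to it. It is a simplified illustration under stated assumptions: the dictionary `SIGMA` and the function `rewrite_child_path` are hypothetical names, the root element is assumed to be the same in the view and the document, and predicates, wildcards and recursion in the view query are not handled.

```python
# Hypothetical encoding of the view specification: each (parent, child)
# edge of the view schema maps to a path over the underlying document.
SIGMA = {
    ("hospital", "patient"): "patient[visit/treatment/medication='autism']",
    ("patient", "treatment"): "visit/treatment[medication]",
    ("patient", "parent"): "parent",
    ("parent", "patient"): "patient",
    ("treatment", "medication"): "medication",
}

def rewrite_child_path(view_path):
    """Rewrite a '/'-separated child-axis path over the view."""
    steps = view_path.split("/")
    rewritten = [steps[0]]  # assumption: same root in view and document
    for parent, child in zip(steps, steps[1:]):
        if (parent, child) not in SIGMA:
            raise ValueError(f"{parent}/{child} is not an edge of the view schema")
        rewritten.append(SIGMA[(parent, child)])
    return "/".join(rewritten)

doc_query = rewrite_child_path("hospital/patient/treatment/medication")
```

Here the view query `hospital/patient/treatment/medication` becomes `hospital/patient[visit/treatment/medication='autism']/visit/treatment[medication]/medication` on the document, so the user never sees the hidden `visit` level.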
11 Query Over the View
- Regular XPath Query
- a mild extension of XPath that supports the general Kleene closure (*) instead of the limited recursion //.
- Why? XPath is not closed under query rewriting
- i.e. for an XPath query on a recursively defined view there may not exist an equivalent XPath query on the underlying document.
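To make the Kleene closure concrete, the following sketch evaluates an expression of the form p* (zero or more repetitions of a child-axis path p) by iterating until no new nodes are reached. It uses Python's standard `xml.etree.ElementTree`; the helper names `step` and `kleene_closure` and the sample document are hypothetical, and the sketch says nothing about how SMOQE itself evaluates Regular XPath.

```python
import xml.etree.ElementTree as ET

def step(nodes, path):
    """Apply a child-axis path such as 'parent/patient' to a list of nodes."""
    out = []
    for n in nodes:
        out.extend(n.findall(path))
    return out

def kleene_closure(nodes, path):
    """Return the nodes reachable by zero or more repetitions of `path`."""
    seen = {id(n) for n in nodes}
    reached = list(nodes)          # zero repetitions: the start nodes
    frontier = list(nodes)
    while frontier:                # terminates because the tree is finite
        nxt = [n for n in step(frontier, path) if id(n) not in seen]
        seen.update(id(n) for n in nxt)
        reached.extend(nxt)
        frontier = nxt
    return reached

# Made-up document with a patient, that patient's parent, and a grandparent.
DOC = ET.fromstring("""
<hospital>
  <patient><pname>A</pname>
    <parent><patient><pname>B</pname>
      <parent><patient><pname>C</pname></patient></parent>
    </patient></parent>
  </patient>
</hospital>
""")

# hospital/patient/(parent/patient)*/pname
patients = kleene_closure(DOC.findall("patient"), "parent/patient")
names = [p.findtext("pname") for p in patients]
```

The closure follows only `parent/patient` chains, whereas `//` would accept any descendant path. That difference is what the Kleene closure adds over the limited recursion `//`.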
12 Query Over the Document
- Regular XPath Query
- However, the size of the rewritten query QT, if directly represented in Regular XPath, may be exponential in the size of the input query QV.
- We overcome this challenge by employing an automaton characterization of QT, denoted by MFA (mixed finite state automata), which is linear in the size of QV.
- Query Rewriting Module
13 MFA: Internal Query Representation
hospital/patient[(parent/patient)*/visit/treatment/test and visit/treatment[medication/text() = 'headache']]/pname
14 Query Evaluation: HyPE
- We propose a novel algorithm, HyPE (Hybrid Pass Evaluation), for processing Regular XPath queries represented by MFAs.
- A unique feature of HyPE is that it needs only a single top-down depth-first traversal of the XML tree, during which HyPE both
- evaluates predicates of the input query (equivalently, the AFAs of the MFA) and
- identifies potential answer nodes (by evaluating the NFA of the MFA).
- Previous systems must traverse the XML document at least twice to evaluate XPath queries.
15 HyPE: Cans (candidate answers)
- The potential answer nodes are collected and stored in an auxiliary structure, referred to as Cans (candidate answers), which is often much smaller than the XML document tree.
- A pass over Cans is needed to retrieve the real result nodes.
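The idea of one traversal plus a pass over Cans can be illustrated on a much simpler query class than HyPE handles. The sketch below is not the HyPE algorithm and uses no MFA: it assumes a child-axis selecting path with a single existence predicate on one step, and the function and parameter names are hypothetical. A node that matches the selecting path is stored as a candidate together with a record for its guarding ancestor, because the predicate may only be decided later in the same depth-first traversal.

```python
import xml.etree.ElementTree as ET

def evaluate(root, steps, guard_idx, pred):
    """Evaluate steps[0]/.../steps[guard_idx][pred]/.../steps[-1] in one DFS.

    steps: labels of the selecting path, starting at the root.
    guard_idx: index of the step that carries the predicate.
    pred: labels of a child-axis path that must exist below the guard node.
    """
    cans = []  # candidate answers: (node, guard record) pairs

    def dfs(node, depth, on_path, guard, pred_pos):
        # on_path: this node and its ancestors match steps[:depth + 1]
        on_path = on_path and depth < len(steps) and node.tag == steps[depth]
        if on_path and depth == guard_idx:
            guard = {"ok": False}   # predicate not decided yet
            pred_pos = 0            # start matching pred below this node
        elif pred_pos is not None:
            # node lies below a guard: try to advance the predicate match
            if pred_pos < len(pred) and node.tag == pred[pred_pos]:
                pred_pos += 1
                if pred_pos == len(pred):
                    guard["ok"] = True
            else:
                pred_pos = None
        if on_path and depth == len(steps) - 1:
            cans.append((node, guard))   # answer only if its guard holds
        for child in node:
            dfs(child, depth + 1, on_path, guard, pred_pos)

    dfs(root, 0, True, None, None)
    # Pass over the candidates: keep those whose predicate turned out true.
    return [n for n, g in cans if g is None or g["ok"]]

# Made-up sample document.
DOC = ET.fromstring("""
<hospital>
  <patient><pname>A</pname><visit><treatment><test>x</test></treatment></visit></patient>
  <patient><pname>B</pname><visit><treatment><medication>m</medication></treatment></visit></patient>
</hospital>
""")

# hospital/patient[visit/treatment/test]/pname
answers = evaluate(DOC, ["hospital", "patient", "pname"], 1,
                   ["visit", "treatment", "test"])
names = [n.text for n in answers]
```

Both `pname` nodes are visited before the `visit` subtree that decides the predicate, so both enter the candidate list. The final pass keeps only the one under the patient who has a `test`.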
16 HyPE
17 SMOQE: A Reference Implementation
- We have developed a reference implementation, called SMOQE (Secure MOdular Query Engine), for the security framework we proposed in this paper.
- It is implemented in Java.
- It was demonstrated at VLDB 2006.
18 Conclusion
- A generic, flexible view-based access control framework for protecting XML data, and its implementation SMOQE
- able to enforce fine-grained access policies according to the structure and values of the protected XML data
- schema availability
- view derivation
- efficient enforcement of security constraints during XML query evaluation
- Query rewriting
- Automaton-based representation
- Evaluation using HyPE and optimization
19 Thank you!