Data and Applications Security Developments and Directions

1
Data and Applications Security Developments and
Directions
  • Dr. Bhavani Thuraisingham
  • The University of Texas at Dallas
  • Lecture 19
  • Data Warehousing, Data Mining and Security
  • October 19, 2009

2
Outline
  • Background on Data Warehousing
  • Security Issues for Data Warehousing
  • Data Mining and Security

3
What is a Data Warehouse?
  • A Data Warehouse is a
  • Subject-oriented
  • Integrated
  • Nonvolatile
  • Time-variant
  • Collection of data in support of management's
    decisions
  • From Building the Data Warehouse by W. H. Inmon,
    John Wiley and Sons
  • Integration of heterogeneous data sources into a
    repository
  • Summary reports, aggregate functions, etc.
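
The integration and summary-report ideas above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table names, columns, and rows are invented, and a single in-memory database stands in for what would normally be several separate source systems feeding a repository.

```python
import sqlite3

# In-memory database standing in for the warehouse repository (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (emp_id INTEGER PRIMARY KEY, name TEXT, dept TEXT);
    CREATE TABLE projects  (emp_id INTEGER, project TEXT, hours REAL);
""")

# Rows as they might arrive from two different sources (hypothetical data).
conn.executemany("INSERT INTO employees VALUES (?, ?, ?)",
                 [(1, "Ann", "R&D"), (2, "Bob", "R&D"), (3, "Cy", "Sales")])
conn.executemany("INSERT INTO projects VALUES (?, ?, ?)",
                 [(1, "P1", 10.0), (2, "P1", 30.0), (3, "P2", 20.0)])

def hours_by_dept(db):
    """Summary report: total and average project hours per department."""
    return db.execute("""
        SELECT e.dept, SUM(p.hours), AVG(p.hours)
        FROM employees e JOIN projects p ON e.emp_id = p.emp_id
        GROUP BY e.dept
        ORDER BY e.dept
    """).fetchall()

print(hours_by_dept(conn))
```

The report function only reads from the integrated tables, which matches the nonvolatile, query-oriented use of a warehouse described above.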

4
Example Data Warehouse
[Figure: Users query the data warehouse, which holds data
correlating employees with medical benefits and projects. The
warehouse could be any DBMS and is usually based on the
relational data model. Its sources are an Oracle DBMS for
employees, a Sybase DBMS for projects, and an Informix DBMS
for medical data.]

5
Some Data Warehousing Technologies
  • Heterogeneous Database Integration
  • Statistical Databases
  • Data Modeling
  • Metadata
  • Access Methods and Indexing
  • Language Interface
  • Database Administration
  • Parallel Database Management

6
Data Warehouse Design
  • Appropriate Data Model is key to designing the
    Warehouse
  • Higher Level Model in stages
  • Stage 1 Corporate data model
  • Stage 2 Enterprise data model
  • Stage 3 Warehouse data model
  • Middle-level data model
  • A model, possibly one for each subject area in
    the higher-level model
  • Physical data model
  • Includes features such as keys in the middle-level
    model
  • Need to determine appropriate levels of
    granularity of data in order to build a good data
    warehouse

7
Distributing the Data Warehouse
  • Issues similar to distributed database systems

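[Figure: Two configurations. Distributed warehouse: Branch A
and Branch B each have their own warehouse (Branch A
Warehouse, Branch B Warehouse), and the Central Bank has the
Central Warehouse. Non-distributed warehouse: Branch A,
Branch B, and the Central Bank share a single Central
Warehouse.]
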
8
Multidimensional Data Model
9
Indexing for Data Warehousing
  • Bit-Maps
  • Multi-level indexing
  • Storing parts or all of the index files in main
    memory
  • Dynamic indexing
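
As an illustration of the bit-map idea listed above, here is a minimal sketch in which each distinct column value maps to a Python integer used as a bit vector, one bit per row. The column names and data are invented, and production bitmap indexes add compression and other refinements not shown here.

```python
from collections import defaultdict

def build_bitmap_index(values):
    """Map each distinct value to an int whose bit i is set when row i holds it."""
    index = defaultdict(int)
    for row, value in enumerate(values):
        index[value] |= 1 << row
    return dict(index)

def rows_of(bitmap):
    """Decode a bitmap back into an ascending list of row numbers."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]

# Two low-cardinality columns of a hypothetical fact table.
dept = ["R&D", "Sales", "R&D", "HR", "Sales"]
site = ["Dallas", "Dallas", "Austin", "Dallas", "Austin"]
dept_idx = build_bitmap_index(dept)
site_idx = build_bitmap_index(site)

# The predicate dept = 'R&D' AND site = 'Dallas' becomes one bitwise AND.
match = dept_idx["R&D"] & site_idx["Dallas"]
print(rows_of(match))
```

Combining predicates with bitwise AND and OR is what makes this kind of index attractive for warehouse queries over columns with few distinct values.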

10
Metadata Mappings
11
Data Warehousing and Security
  • Security for integrating the heterogeneous data
    sources into the repository
  • e.g., Heterogeneous Database System Security,
    Statistical Database Security
  • Security for maintaining the warehouse
  • Query, Updates, Auditing, Administration,
    Metadata
  • Multilevel Security
  • Multilevel Data Models, Trusted Components

12
Example Secure Data Warehouse
13
Secure Data Warehouse Technologies
14
Security for Integrating Heterogeneous Data
Sources
  • Integrating multiple security policies into a
    single policy for the warehouse
  • Apply techniques for federated database security?
  • Need to transform the access control rules
  • Security impact on schema integration and
    metadata
  • Maintaining transformations and mappings
  • Statistical database security
  • Inference and aggregation
  • e.g., Average salary in the warehouse could be
    unclassified while the individual salaries in the
    databases could be classified
  • Administration and auditing
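
The salary example above can be made concrete with a small sketch of one classic statistical-database control, a minimum query-set size. The threshold, the field names, and the data are assumptions for illustration. This control alone is known to be insufficient against differencing (tracker) attacks, so it is a starting point rather than a complete inference control.

```python
from statistics import mean

MIN_QUERY_SET = 3  # assumed threshold; a real policy would choose this carefully

# Hypothetical warehouse rows; individual salaries are treated as classified.
staff = [
    {"name": "Ann", "dept": "R&D", "salary": 100},
    {"name": "Bob", "dept": "R&D", "salary": 120},
    {"name": "Cy", "dept": "R&D", "salary": 140},
    {"name": "Dee", "dept": "Sales", "salary": 90},
]

def average_salary(rows, predicate, k=MIN_QUERY_SET):
    """Release an average only when at least k rows contribute to it."""
    selected = [r["salary"] for r in rows if predicate(r)]
    if len(selected) < k:
        raise PermissionError("query set too small; aggregate withheld")
    return mean(selected)

print(average_salary(staff, lambda r: r["dept"] == "R&D"))

try:
    # Only one person is in Sales, so the "average" would reveal a salary.
    average_salary(staff, lambda r: r["dept"] == "Sales")
except PermissionError as err:
    print("refused:", err)
```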

15
Security Policy for the Warehouse
[Figure: Security policy integration and transformation.
Layers, from bottom to top: a component policy for each of
Components A, B, and C; a generic policy for each of
Components A, B, and C; export policies (one for Component A,
two for Component B, one for Component C); and federated
policies for federations F1 and F2. Question posed: do
federated policies become warehouse policies?]

16
Security Policy for the Warehouse - II
17
Secure Data Warehouse Model
18
Methodology for Developing a Secure Data Warehouse
19
Multi-Tier Architecture
[Figure: Multi-tier architecture. Tier 1: secure data
sources. Tier 2 builds on Tier 1. Tier N: (secure) data
warehouse, builds on Tier N-1. Each layer builds on the
previous layer's schemas, metadata, and policies.]

20
Administration
  • What are the roles of Database Administrators,
    Warehouse Administrators, Database System Security
    Officers, and Warehouse System Security Officers?
  • When databases are updated, can a trigger
    mechanism be used to automatically update the
    warehouse?
  • i.e., Will the individual database administrators
    permit such a mechanism?
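
To show what such a trigger mechanism might look like, here is a minimal sketch using SQLite through Python's sqlite3 module. The table and trigger names are invented, and the source table and the warehouse summary live in one database purely for illustration. Across separately administered systems the update would need some agreed propagation channel, which is exactly the administrative question raised above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE source_salary (emp_id INTEGER PRIMARY KEY, dept TEXT, salary REAL);
    CREATE TABLE warehouse_dept_total (dept TEXT PRIMARY KEY, total REAL);

    -- After each insert into the source table, recompute that department's
    -- total and store it in the warehouse summary table.
    CREATE TRIGGER propagate_insert AFTER INSERT ON source_salary
    BEGIN
        INSERT OR REPLACE INTO warehouse_dept_total (dept, total)
        VALUES (NEW.dept,
                (SELECT SUM(salary) FROM source_salary WHERE dept = NEW.dept));
    END;
""")

# Ordinary updates to the source; the warehouse table is never written directly.
conn.executemany("INSERT INTO source_salary VALUES (?, ?, ?)",
                 [(1, "R&D", 100.0), (2, "R&D", 50.0), (3, "Sales", 70.0)])

totals = dict(conn.execute("SELECT dept, total FROM warehouse_dept_total"))
print(totals)
```

This sketch handles inserts only; updates and deletes on the source would need their own triggers.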

21
Auditing
  • Should the Warehouse be audited?
  • Advantages
  • Keep up-to-date information on access to the
    warehouse
  • Disadvantages
  • May need to keep unnecessary data in the
    warehouse
  • May need a lower level of granularity of data
  • May cause changes to the timing of data entry to
    the warehouse as well as backup and recovery
    restrictions
  • Need to determine the relationships between
    auditing the warehouse and auditing the databases
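
As a rough sketch of what keeping up-to-date information on access to the warehouse could involve, the wrapper below records who ran which query and when. The function name, the log structure, and the sample table are assumptions. A real audit trail would go to protected, append-only storage rather than a Python list.

```python
import sqlite3
from datetime import datetime, timezone

audit_log = []  # stand-in for protected, append-only audit storage

def audited_query(conn, user, sql, params=()):
    """Log the access first (so failed attempts are recorded too), then run it."""
    audit_log.append({
        "user": user,
        "sql": sql,
        "at": datetime.now(timezone.utc).isoformat(),
    })
    return conn.execute(sql, params).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.execute("INSERT INTO sales VALUES ('North', 10.0)")

rows = audited_query(conn, "analyst1", "SELECT region, amount FROM sales")
print(rows, len(audit_log))
```

Even this small example stores extra data for every query, which reflects the storage disadvantage listed above.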

22
Multilevel Security
  • Multilevel data models
  • Extensions to the data warehouse model to support
    classification levels
  • Trusted Components
  • How much of the warehouse should be trusted?
  • Should the transformations be trusted?
  • Covert channels, inference problem
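
One way to picture a multilevel extension is to attach a classification level to each row and filter by the reader's clearance. The sketch below assumes row-level labels, four invented levels, and a simple "no read up" rule. It does nothing about covert channels or inference, which remain the hard problems named above.

```python
from enum import IntEnum

class Level(IntEnum):
    UNCLASSIFIED = 0
    CONFIDENTIAL = 1
    SECRET = 2
    TOP_SECRET = 3

# Hypothetical warehouse rows, each paired with its classification level.
rows = [
    ({"emp": "Ann", "project": "P1"}, Level.UNCLASSIFIED),
    ({"emp": "Bob", "project": "P2"}, Level.SECRET),
    ({"emp": "Cy", "project": "P3"}, Level.CONFIDENTIAL),
]

def visible_rows(labeled_rows, clearance):
    """Return only rows at or below the reader's clearance (no read up)."""
    return [data for data, level in labeled_rows if level <= clearance]

print(visible_rows(rows, Level.CONFIDENTIAL))
```

Labels could equally be attached to columns or individual values; row-level labeling is chosen here only to keep the example short.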

23
Inference Controller
24
Status and Directions
  • Commercial data warehouse vendors are
    incorporating role-based security (e.g., Oracle)
  • Many topics need further investigation
  • Building a secure data warehouse
  • Policy integration
  • Secure data model
  • Inference control

25
Data Mining for Counter-terrorism
26
Data Mining Needs for Counterterrorism
Non-real-time Data Mining
  • Gather data from multiple sources
  • Information on terrorist attacks: who, what,
    where, when, how
  • Personal and business data: place of birth,
    ethnic origin, religion, education, work history,
    finances, criminal record, relatives, friends and
    associates, travel history, ...
  • Unstructured data: newspaper articles, video
    clips, speeches, emails, phone records, ...
  • Integrate the data, build warehouses and
    federations
  • Develop profiles of terrorists,
    activities/threats
  • Mine the data to extract patterns of potential
    terrorists and predict future activities and
    targets
  • Find the needle in the haystack - suspicious
    needles?
  • Data integrity is important
  • Techniques have to SCALE

27
Data Mining for Non Real-time Threats
[Figure: Non-real-time data mining process. Stages shown:
data sources with information about terrorists and terrorist
activities; integrate data sources; clean/modify data
sources; build profiles of terrorists and activities; mine
the data; examine results/prune results; report final
results.]

28
Data Mining Needs for Counterterrorism
Real-time Data Mining
  • Nature of data
  • Data arriving from sensors and other devices
  • Continuous data streams
  • Breaking news, video releases, satellite images
  • Some critical data may also reside in caches
  • Rapidly sift through the data and discard
    unwanted data, keeping the rest for later use and
    analysis (non-real-time data mining)
  • Data mining techniques need to meet timing
    constraints
  • Quality of service (QoS) tradeoffs among
    timeliness, precision and accuracy
  • Presentation of results, visualization, real-time
    alerts and triggers
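
The stream-processing needs above can be illustrated with a small sketch that keeps only a fixed-size window of recent readings and raises an alert when a new reading deviates strongly from that window. The window size, the threshold, and the sample readings are assumptions. The bounded window is one simple way to trade precision for timeliness and memory.

```python
from collections import deque
from statistics import mean, pstdev

def stream_alerts(readings, window=5, threshold=3.0):
    """Yield (index, value) for readings far from the recent window's mean."""
    recent = deque(maxlen=window)  # older readings are dropped automatically
    for i, x in enumerate(readings):
        if len(recent) == window:
            mu, sigma = mean(recent), pstdev(recent)
            if sigma > 0 and abs(x - mu) > threshold * sigma:
                yield i, x
        recent.append(x)

# Hypothetical sensor readings with one obvious spike.
sensor = [10, 11, 10, 12, 11, 10, 50, 11, 10]
alerts = list(stream_alerts(sensor))
print(alerts)
```

Because the function is a generator, alerts are produced as readings arrive instead of after the whole stream has been collected.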

29
Data Mining for Real-time Threats
[Figure: Real-time data mining process. Stages shown: data
sources with information about terrorists and terrorist
activities; integrate data sources in real time; rapidly
sift through data and discard irrelevant data; build
real-time models; mine the data; examine results in real
time; report final results.]

30
Data Mining Outcomes and Techniques for
Counter-terrorism
31
Example Success Story - COPLINK
  • COPLINK developed at University of Arizona
  • Research transferred to an operational system
    currently in use by Law Enforcement Agencies
  • What does COPLINK do?
  • Provides integrated system for law enforcement
    integrating law enforcement databases
  • If a crime occurs in one state, this information
    is linked to similar cases in other states
  • It has been stated that the sniper shooting case
    may have been solved earlier if COPLINK had been
    operational at that time

32
Where are we now?
  • We have some tools for
  • building data warehouses from structured data
  • integrating structured heterogeneous databases
  • mining structured data
  • forming some links and associations
  • information retrieval tools
  • image processing and analysis
  • pattern recognition
  • video information processing
  • visualizing data
  • managing metadata

33
What are our challenges?
  • Do the tools scale for large heterogeneous
    databases and petabyte sized databases?
  • Building models in real time needs training data
  • Extracting metadata from unstructured data
  • Mining unstructured data
  • Extracting useful patterns from
    knowledge-directed data mining
  • Rapidly forming links and associations: getting
    the big picture for real-time data mining
  • Detecting/preventing cyber attacks
  • Mining the web
  • Evaluating data mining algorithms
  • Conducting risk analysis / assessing economic
    impact
  • Building testbeds

34
IN SUMMARY
  • Data mining is very useful for solving security
    problems
  • Data mining tools could be used to examine audit
    data and flag abnormal behavior
  • Much recent work in Intrusion detection (unit
    18)
  • e.g., Neural networks to detect abnormal patterns
  • Tools are being examined to determine abnormal
    patterns for national security
  • Classification techniques, Link analysis
  • Fraud detection
  • Credit cards, calling cards, identity theft etc.
  • BUT CONCERNS FOR PRIVACY