Cardinality-based Inference Control in OLAP Systems: An Information Theoretic Approach

Provided by: Freed150
Learn more at: https://cci.drexel.edu

1
Cardinality-based Inference Control in OLAP
Systems: An Information Theoretic Approach
  • Nan Zhang
  • Texas A&M University
  • This is joint work with Dr. Wei Zhao and Dr.
    Jianer Chen

2
Privacy Concern
  • Growing Privacy Concern in Database Applications
    on the Internet (e.g., Data Mining)
  • 17% privacy fundamentalists, 56% pragmatic
    majority, 27% marginally concerned (AT&T survey)
  • Challenge: Can we build accurate models of the
    aggregate data without access to the precise
    values of individual data?

3
Problem Definition
  • Will the application invade privacy?

[Diagram: Application (Data Miner), OLAP Server, Randomization, Data Providers]

4
Inference Problem

5
Inference Problem
  • SU = 20
  • S1 + S3 - SB - ST = 87

6
Goal
  • Reject queries that may result in an inference
    problem
  • Answer as many other queries as we can
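
The goal above can be illustrated with a minimal sketch of a query gate. It assumes every query is a SUM over a set of base cells and uses a generic linear-algebra audit: a query is rejected if, together with the queries already answered, it would pin down the exact value of any single cell. This is not the cardinality-based test developed in these slides, and the names `SumQueryGate`, `exposed_cells`, and `rref` are hypothetical.

```python
from fractions import Fraction


def rref(rows):
    """Reduced row echelon form of a list of rows, using exact arithmetic."""
    m = [[Fraction(x) for x in r] for r in rows]
    n_cols = len(m[0]) if m else 0
    pivot_row = 0
    for col in range(n_cols):
        # Find a row at or below pivot_row with a nonzero entry in this column.
        sel = next((r for r in range(pivot_row, len(m)) if m[r][col] != 0), None)
        if sel is None:
            continue
        m[pivot_row], m[sel] = m[sel], m[pivot_row]
        p = m[pivot_row][col]
        m[pivot_row] = [x / p for x in m[pivot_row]]
        # Clear this column in every other row.
        for r in range(len(m)):
            if r != pivot_row and m[r][col] != 0:
                f = m[r][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[pivot_row])]
        pivot_row += 1
    return m[:pivot_row]


def exposed_cells(rows):
    """Cells whose exact value follows from the given SUM queries.

    A cell is exposed exactly when its unit vector lies in the row space,
    which shows up as an RREF row with a single nonzero entry.
    """
    exposed = []
    for row in rref(rows):
        nonzero = [i for i, x in enumerate(row) if x != 0]
        if len(nonzero) == 1:
            exposed.append(nonzero[0])
    return exposed


class SumQueryGate:
    """Answers a SUM query only if no individual cell becomes inferable."""

    def __init__(self, n_cells):
        self.n_cells = n_cells
        self.answered = []  # one 0/1 indicator row per answered query

    def ask(self, cells):
        row = [1 if i in cells else 0 for i in range(self.n_cells)]
        if exposed_cells(self.answered + [row]):
            return False  # reject: some cell would be determined exactly
        self.answered.append(row)
        return True


gate = SumQueryGate(4)              # four base cells, numbered 0..3
first = gate.ask({0, 1})            # a partial sum
second = gate.ask({0, 1, 2, 3})     # the grand total
third = gate.ask({1, 2, 3})         # total minus this sum would reveal cell 0
```

In this run the first two queries are answered and the third is rejected, because subtracting it from the grand total would reveal cell 0. An audit like this needs the full query history, whereas a cardinality-based criterion decides from counts alone.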

[Diagram: Application (Data Miner), OLAP Server, Database, Data Warehouse]

7
Related Work
  • A lot of work on statistical databases
  • Survey
  • Differences
  • Restriction on OLAP queries
  • Structure of data cube
  • Online response time

8
Related Work
  • A similar scheme
  • Our Advantages
  • Much easier approach
  • A tighter bound
  • More general framework

9
Definition: Query
  • 1-dimensional queries
  • 2-dimensional queries

10
Data Cube and Lattice of Cuboids

11
Definition: Query
  • There exists a unique cuboid S such that a cell
    of S is the aggregation of W.
  • Suppose that S is a k-dimensional cuboid. The
    dimensionality of Q is defined to be n - k.
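
The definition above can be made concrete with a small sketch. It assumes a query is written as one entry per dimension, either a fixed value or an `ALL` marker meaning "aggregate over this dimension". The fixed dimensions are the k dimensions of the cuboid S, so the query's dimensionality n - k is the number of `ALL` entries. The names `ALL`, `covered_cells`, `cuboid_dims`, and `query_dimensionality`, and the example domains, are hypothetical.

```python
from itertools import product

ALL = None  # marker: this dimension is aggregated over


def covered_cells(spec, domains):
    """Base cells W covered by a query spec (one entry per dimension)."""
    choices = [dom if v is ALL else [v] for v, dom in zip(spec, domains)]
    return set(product(*choices))


def cuboid_dims(spec):
    """Indices of the dimensions kept by the cuboid S whose cell aggregates W."""
    return tuple(i for i, v in enumerate(spec) if v is not ALL)


def query_dimensionality(spec):
    """n - k, where n is the cube's dimension count and k is that of S."""
    n = len(spec)
    k = len(cuboid_dims(spec))
    return n - k


# Hypothetical 2-dimensional cube: book condition x month.
domains = [["new", "used"], ["Jan", "Feb", "Mar"]]
used_all_months = ("used", ALL)   # sum over months for used books
w = covered_cells(used_all_months, domains)
dim = query_dimensionality(used_all_months)
```

Here `w` holds the three (used, month) cells, the cuboid S keeps only the first dimension (k = 1), and the query is 1-dimensional. A spec of `(ALL, ALL)` would cover all six cells and be 2-dimensional.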

12
Definition: Compromisability
SU: Sales amount of used books in Feb

13
Definition: Compromisability
  • Compromisability
  • direct inference
  • Compromisability < 1

14
Cardinality-based Inference Control
  • S3, ST: minimum compromisability 2; 2·1·(4+3) - 2·2·2 - 1 = 5 > 2
  • S1, SB: minimum compromisability 2; 2·1·(4+3) - 2·2·2 - 1 = 5 = 5
  • S1, SD: minimum compromisability 2; 2·1·(4+3) - 2·2·2 - 1 = 5 > 4

15
Our Approach
  • A k-dimensional query Q(F, W) can be safely
    answered if every (k+1)-dimensional dice in X
    that
  • Contains W as a subset
  • Can be queried as a cell of an (n-k-1)-dimensional
    cuboid
  • satisfies
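
The dice named in the condition above can be enumerated mechanically. The sketch below assumes a query is written as one entry per dimension, with an `ALL` marker on each of its k aggregated dimensions, and that every dimension has more than one value. Under those assumptions, the (k+1)-dimensional dice that contain W and are cells of an (n-k-1)-dimensional cuboid are exactly the specs obtained by relaxing one more fixed dimension to `ALL`. The condition each dice must satisfy did not survive in this transcript, so it is passed in as a caller-supplied predicate `dice_ok`. All names here are hypothetical.

```python
ALL = None  # marker: this dimension is aggregated over


def enclosing_dice(spec):
    """Yield every (k+1)-dimensional dice spec that contains the query's cells.

    Each one keeps the query's fixed values except for a single dimension,
    which is relaxed to ALL.
    """
    for i, v in enumerate(spec):
        if v is not ALL:
            yield spec[:i] + (ALL,) + spec[i + 1:]


def safe_to_answer(spec, dice_ok):
    """True if every enclosing dice passes the caller-supplied check.

    If the query already aggregates over every dimension there is no
    enclosing dice, and the check passes vacuously.
    """
    return all(dice_ok(d) for d in enclosing_dice(spec))


# Hypothetical 3-dimensional cube: condition x month x region.
query = ("used", ALL, "TX")          # a 1-dimensional query (k = 1)
candidates = list(enclosing_dice(query))
```

For this query the candidates are `(ALL, ALL, "TX")` and `("used", ALL, ALL)`, one per fixed dimension, so a k-dimensional query in an n-dimensional cube has n - k dice to check.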

16
Comparison with Previous Result
  • vs.

17
Proof of Our Bound
  • Basic idea

18
An Information-Theoretic Definition

19
An Information-Theoretic Definition
  • Let
  • we have
  • Thus, no inference problem exists in a data cube
    X if

20
Bounds on fmax(t0)

21
Maximum Non-Compromisable Data Cube

22
Main Theorem
  • Let
  • we have

23
Final Remarks
  • Future Work
  • Quantitative measure of the inference problem
  • Combination of randomization and inference
    control approaches

24
Thank you
  • Questions?