Title: CSCI283 Fall 2005
1 Database Security and Privacy
2 Announcements
- Final Exam December 6, 2006
- Regular classroom: Gelman 607
- Regular class hours: 7:10-9:40 pm
- Syllabus: Lectures 6, 8-13
- Block and Stream Ciphers, Public Key Cryptography, Public Key
  Infrastructure, Identity, Authentication, Malicious Logic, Risk
  Analysis, Database Privacy, Information Flow and Covert Channels:
  slides and associated readings.
3 Databases provide
- Shared access
- Controlled access
- Data consistency
- Data integrity
- Minimal redundancy
4 Database Security Requirements
- Integrity
  - Physical
  - Logical
  - Element
- Access Control
- User Authentication
- Availability
- Auditability
5 Integrity
- Physical integrity, i.e. response to power failures and other
  physical problems
  - Backup
  - Transaction log to update since previous backup
- Integrity
  - Field checks
  - Access control and consistency among values
  - Maintaining change logs
6 More Security Requirements
- Auditability
  - Helps determine if inappropriate disclosure has occurred
  - Helps track divulged information to prevent inference (for
    example, if it is known that I am a woman and over 40, it is
    known that I am losing calcium)
  - Difficult to record access of fields: often a field is reported
    to have been accessed when it has not, e.g. SELECT all entries
    with ZIP 20007
7 More Security Requirements
- Access Control
  - Inference is a problem (fields are related: knowledge of race
    can be a good predictor of salary, for example)
  - Size and granularity different from access control in OS
- User Authentication
  - DBMS does its own authentication because there is no trusted
    path between DBMS and OS when the DBMS is an application program
    on top of the OS
- Availability: important because of the shared access goal.
8 Integrity and Consistency: Two-Phase Update
- Assign seats to passengers
- Phase I: intent phase
  - Collect all resources required to complete task
  - Shadow values store key data points
- Committing: set COMMIT-FLAG. Marks end of intent phase
- Phase II: commit phase
  - Shadow values are copied
9 Seat assignments
- intent phase
  - if COMMIT-FLAG SET
    - loop or halt
  - else
    - seat.shadow_name = John Doe
    - seat.shadow_flag = taken
- commit phase
  - Set COMMIT-FLAG
  - seat.name = seat.shadow_name
  - seat.flag = seat.shadow_flag
  - Unset COMMIT-FLAG
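The seat-assignment steps above can be sketched in Python. This is only an illustration under stated assumptions: the class names `Seat` and `SeatDB`, the seat id "12A", and the `recover` method (redoing the copy after a crash that left COMMIT-FLAG set) are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass
class Seat:
    name: str = ""
    flag: str = "free"
    shadow_name: str = ""
    shadow_flag: str = "free"


class SeatDB:
    def __init__(self):
        self.commit_flag = False
        self.seats = {"12A": Seat()}  # hypothetical seat id

    def intent(self, seat_id, passenger):
        # Intent phase: only shadow values change, so it can be repeated.
        if self.commit_flag:
            raise RuntimeError("commit in progress; halt or retry later")
        seat = self.seats[seat_id]
        seat.shadow_name = passenger
        seat.shadow_flag = "taken"

    def _copy_shadows(self):
        # Copying shadow values is idempotent: redoing it gives the same state.
        for seat in self.seats.values():
            seat.name = seat.shadow_name
            seat.flag = seat.shadow_flag

    def commit(self):
        # Setting the flag marks the end of the intent phase.
        self.commit_flag = True
        self._copy_shadows()
        self.commit_flag = False

    def recover(self):
        # Assumed recovery step: if a failure left the flag set, redo the copy.
        if self.commit_flag:
            self._copy_shadows()
            self.commit_flag = False
```

Because the commit phase only copies shadow values, repeating it after a power failure leaves the same final state, which is the point of separating the two phases.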
10 Consistency - Monitors
- Check values for consistency with field type, or with rest of
  database
- Range comparisons
- State constraints (for example, if all activities are completed,
  COMMIT-FLAG should not be SET. This is a state constraint which
  might not be satisfied if power is lost between finishing
  activities and unsetting COMMIT-FLAG)
- Transition constraints can be difficult to implement
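The three kinds of monitor checks can be sketched as small predicates. The function names and the set of allowed seat-status transitions are hypothetical, chosen only to illustrate the idea.

```python
def in_range(value, low, high):
    """Range comparison: accept the value only if low <= value <= high."""
    return low <= value <= high


def state_ok(all_activities_done, commit_flag_set):
    """State constraint: once all activities are done, COMMIT-FLAG must not be set."""
    return not (all_activities_done and commit_flag_set)


# Hypothetical transition rule: a seat may only move between these states.
ALLOWED_TRANSITIONS = {("free", "taken"), ("taken", "free")}


def transition_ok(old, new, allowed=ALLOWED_TRANSITIONS):
    """Transition constraint: the change old -> new must be explicitly allowed."""
    return old == new or (old, new) in allowed
```

A transition check needs the old value as well as the new one, which is one reason such constraints are harder to implement than range or state checks.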
11 Sensitive Data
- Inherently sensitive
- From a sensitive source (request for information to be kept
  confidential)
- Declared sensitive by database administrator
- Part of a sensitive record (information on George Bush) or
  attribute (salary)
- Sensitive in context of previously-released information
12 Access Control
- Access decisions depend on
  - Availability of data
  - Acceptability of access
  - User Authentication
- Types of Disclosures
  - Exact
  - Bounds
  - Negative/Positive Result
  - Existence of data
  - Probable value
13 Multilevel Databases
- Differentiated Security
- Security at the level of element
- Two levels not sufficient to represent sensitivity of data (hence
  multilevel)
- Security of aggregate or function/combination of fields might be
  different from that of individual values
- Not used much
14 Separation
- Partitioning: each confidentiality level in a separate database
  - Difficult to update
  - Difficult for users who can access multiple security levels
- Encryption
  - Problems?
  - Block Chaining
- Integrity Lock
  - Secure hash of element, position, and sensitivity stored with
    element
- Sensitivity Lock
  - Identifier and sensitivity encrypted
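An integrity lock can be sketched with a keyed hash from the Python standard library. The slides only say "secure hash"; using HMAC-SHA256, the key, the field separator, and the sample record are all assumptions made for this illustration.

```python
import hashlib
import hmac

# Hypothetical secret held by the trusted front end, never by the DBMS itself.
KEY = b"demo-key-not-for-production"


def lock(record_id, field, sensitivity, value):
    """Checksum binding an element to its position and its sensitivity label."""
    message = "|".join([record_id, field, sensitivity, value]).encode()
    return hmac.new(KEY, message, hashlib.sha256).hexdigest()


def verify(record_id, field, sensitivity, value, checksum):
    """True only if value, position and sensitivity are all unchanged."""
    expected = lock(record_id, field, sensitivity, value)
    return hmac.compare_digest(expected, checksum)
```

Because the position (record and field) is part of the hashed message, a stored element cannot be moved to another record or relabeled with a lower sensitivity without the checksum failing.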
15 Inference
- Believe a new fact based on other information (Sweeney)
- Suppose
  - Count(Female AND Caucasian AND HIV) = 1
  - And we know there is exactly one Female Caucasian in the
    database, Mary Smith
- We know Mary Smith has HIV even though we did not ask that, and the
  database would not reveal that
- One way to limit inference is to not allow divulging a small value
16 Other methods for inference
- count(a AND b AND c)
  = count(a) - count(a AND NOT(b AND c))
- Count(F AND C AND HIV)
  = Count(F) - Count(F AND NOT(C AND HIV))
  = Count(F) - Count(F AND (NOT C OR NOT HIV))
[Figure: Venn diagram of sets a, b, and c]
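The identity above shows why suppressing small counts is not enough on its own. In this sketch the threshold `MIN_COUNT`, the function names, and every record except Mary Smith are made up for illustration.

```python
MIN_COUNT = 2  # hypothetical threshold: smaller counts are not divulged

# Made-up records; only "Mary Smith" comes from the slides.
PEOPLE = [
    {"name": "Mary Smith", "sex": "F", "race": "Caucasian", "hiv": True},
    {"name": "P2", "sex": "F", "race": "Asian", "hiv": False},
    {"name": "P3", "sex": "F", "race": "Black", "hiv": True},
    {"name": "P4", "sex": "F", "race": "Asian", "hiv": False},
    {"name": "P5", "sex": "M", "race": "Caucasian", "hiv": False},
]


def guarded_count(predicate):
    """Return the count, or None when it is too small to divulge."""
    n = sum(1 for person in PEOPLE if predicate(person))
    return n if n >= MIN_COUNT else None


def is_f(p):
    return p["sex"] == "F"


def is_c_and_hiv(p):
    return p["race"] == "Caucasian" and p["hiv"]


# The direct query is suppressed because its answer is 1.
direct = guarded_count(lambda p: is_f(p) and is_c_and_hiv(p))

# Two larger, permitted counts give the same answer by subtraction.
total_f = guarded_count(is_f)
rest = guarded_count(lambda p: is_f(p) and not is_c_and_hiv(p))
inferred = total_f - rest
```

Here `direct` is withheld, yet `inferred` recovers the suppressed count from two answers the database was willing to give.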
17 Example of inference
- m1 = median(S1)
- m2 = median(S2)
- m1 < m2
- S2 ⊂ S1
- What can you say about S1 and S2? About S1 AND NOT(S2)?
18 SOURCE: "Housing Characteristics 1990", Residential Energy
Consumption Survey, Energy Information Administration,
DOE/EIA-0314(90), page 54.
19 "Statistical Policy Working Paper 22 - Report on Statistical
Disclosure Limitation Methodology", Chapter 2, Federal Committee on
Statistical Methodology, May 1994.
Social Security Administration (SSA) rules prohibit tabulations in
which a detail cell is equal to a marginal total or which would allow
users to determine an individual's age within a five-year interval,
earnings within a $1000 interval or benefits within a $50 interval.
25 Swapping microdata records
- Find similar records
- Swap some fields
- Perform averages
- Requirement: swap so as to preserve averages in all directions
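One simple way to meet the averaging requirement is to exchange a field only among records that agree on the other fields used in tabulations. This is a sketch under that assumption; the function name, field names, and sample records are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def swap_within_groups(records, match_keys, swap_key):
    """Rotate swap_key values among records that share the same match_keys values."""
    swapped = [dict(r) for r in records]  # leave the input untouched
    groups = defaultdict(list)
    for index, record in enumerate(swapped):
        groups[tuple(record[k] for k in match_keys)].append(index)
    for indexes in groups.values():
        values = [swapped[i][swap_key] for i in indexes]
        rotated = values[1:] + values[:1]
        for i, value in zip(indexes, rotated):
            swapped[i][swap_key] = value
    return swapped


# Made-up microdata for illustration.
RECORDS = [
    {"id": 1, "age_band": "30-39", "sex": "F", "income": 40},
    {"id": 2, "age_band": "30-39", "sex": "F", "income": 60},
    {"id": 3, "age_band": "40-49", "sex": "M", "income": 50},
    {"id": 4, "age_band": "40-49", "sex": "M", "income": 70},
    {"id": 5, "age_band": "40-49", "sex": "M", "income": 90},
]

SWAPPED = swap_within_groups(RECORDS, ["age_band", "sex"], "income")
```

Each group keeps the same set of income values, so the overall average and the average within every (age_band, sex) cell are unchanged, while the link between an individual record and its income is broken.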
26 Aggregation
- Putting together records (from different
databases) that share some attribute values to
create a composite record.
27 k-anonymity
- A single record has the same quasi-identifier as k-1 others.
- Attacks
  - Unsorted matching (see Fig. 3, Sweeney)
  - Complementary release (see Fig. 4 and 5, Sweeney)
  - Temporal attack
28 L. Sweeney. k-anonymity: a model for protecting privacy.
International Journal on Uncertainty, Fuzziness and Knowledge-based
Systems, 10 (5), 2002; 557-570.
     Race   Birth  Gender  ZIP    Problem
t1   Black  1965   m       0214*  short breath
t2   Black  1965   m       0214*  chest pain
t3   Black  1965   f       0213*  hypertension
t4   Black  1965   f       0213*  hypertension
t5   Black  1964   f       0213*  obesity
t6   Black  1964   f       0213*  chest pain
t7   White  1964   m       0213*  chest pain
t8   White  1964   m       0213*  obesity
t9   White  1964   m       0213*  short breath
t10  White  1967   m       0213*  chest pain
t11  White  1967   m       0213*  chest pain
Example of k-anonymity, where k = 2 and QI = {Race, Birth, Gender,
ZIP}
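The k-anonymity property of the table above can be checked by counting how often each quasi-identifier combination occurs. The rows are transcribed from the slide; the function name is hypothetical, and the trailing `*` on ZIP values is assumed notation for a generalized ZIP code.

```python
from collections import Counter

# (Race, Birth, Gender, ZIP, Problem) for t1..t11
ROWS = [
    ("Black", "1965", "m", "0214*", "short breath"),
    ("Black", "1965", "m", "0214*", "chest pain"),
    ("Black", "1965", "f", "0213*", "hypertension"),
    ("Black", "1965", "f", "0213*", "hypertension"),
    ("Black", "1964", "f", "0213*", "obesity"),
    ("Black", "1964", "f", "0213*", "chest pain"),
    ("White", "1964", "m", "0213*", "chest pain"),
    ("White", "1964", "m", "0213*", "obesity"),
    ("White", "1964", "m", "0213*", "short breath"),
    ("White", "1967", "m", "0213*", "chest pain"),
    ("White", "1967", "m", "0213*", "chest pain"),
]

QI = (0, 1, 2, 3)  # column indexes of Race, Birth, Gender, ZIP


def is_k_anonymous(rows, qi_indexes, k):
    """True if every quasi-identifier combination appears in at least k rows."""
    counts = Counter(tuple(row[i] for i in qi_indexes) for row in rows)
    return all(n >= k for n in counts.values())
```

For this table `is_k_anonymous(ROWS, QI, 2)` holds, while `is_k_anonymous(ROWS, QI, 3)` does not, because several quasi-identifier groups contain exactly two rows.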
29 Examples of k-anonymity tables: unsorted matching attack (Sweeney,
2002)
Race   ZIP      Race    ZIP      Race   ZIP
Asian  02138    Person  02138    Asian  02130
Asian  02139    Person  02139    Asian  02130
Asian  02141    Person  02141    Asian  02140
Asian  02142    Person  02142    Asian  02140
Black  02138    Person  02138    Black  02130
Black  02139    Person  02139    Black  02130
Black  02141    Person  02141    Black  02140
Black  02142    Person  02142    Black  02140
White  02138    Person  02138    White  02130
White  02139    Person  02139    White  02130
White  02141    Person  02141    White  02140
White  02142    Person  02142    White  02140
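The unsorted matching attack works because the released tables keep the row order of the original. A minimal sketch, with the names `PT`, `GT1` and `GT2` chosen here for the original table and its two generalizations:

```python
RACES = ["Asian", "Black", "White"]
ZIPS = ["02138", "02139", "02141", "02142"]

# Original table, then two releases that preserve its row order.
PT = [(race, zip_code) for race in RACES for zip_code in ZIPS]
GT1 = [("Person", zip_code) for _, zip_code in PT]          # race generalized
GT2 = [(race, zip_code[:4] + "0") for race, zip_code in PT]  # ZIP generalized

# Attack: pair rows by position, taking ZIP from GT1 and race from GT2.
LINKED = [(race, zip_code) for (_, zip_code), (race, _) in zip(GT1, GT2)]
```

`LINKED` reproduces the original table exactly, which is why releases should have their rows put in a random order before publication.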
30 Complementary release attack: two k-anonymity tables, k = 2
(Sweeney, 2002)
Race  BirthDate  Gender  ZIP    Problem          | Race  BirthDate  Gender  ZIP    Problem
b     65         M       02141  short of breath  | b     65         M       02141  short of breath
b     65         M       02141  chest pain       | b     65         M       02141  chest pain
-     65         F       0213*  painful eye      | b     65         F       02138  painful eye
-     65         F       0213*  wheezing         | b     65         F       02138  wheezing
b     64         F       02138  obesity          | b     64         F       02138  obesity
b     64         F       02138  chest pain       | b     64         F       02138  chest pain
w     64         M       0213*  short of breath  | w     60-69      M       02138  short of breath
-     65         F       0213*  hypertension     | w     60-69      -       02139  hypertension
w     64         M       0213*  obesity          | w     60-69      -       02139  obesity
w     64         M       0213*  fever            | w     60-69      -       02139  fever
w     67         M       02138  vomiting         | w     60-69      M       02138  vomiting
w     67         M       02138  back pain        | w     60-69      M       02138  back pain
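The complementary release attack can be sketched by joining the two releases on the Problem column and keeping the more specific value of each attribute. The rows are transcribed from the two tables on this slide; the function names, the restriction to problems that occur once in each table, and the trailing `*` as notation for a generalized ZIP are assumptions of this sketch.

```python
from collections import Counter

# (Race, BirthDate, Gender, ZIP, Problem); "-", ranges and "*" mark generalized values.
LEFT = [
    ("b", "65", "M", "02141", "short of breath"),
    ("b", "65", "M", "02141", "chest pain"),
    ("-", "65", "F", "0213*", "painful eye"),
    ("-", "65", "F", "0213*", "wheezing"),
    ("b", "64", "F", "02138", "obesity"),
    ("b", "64", "F", "02138", "chest pain"),
    ("w", "64", "M", "0213*", "short of breath"),
    ("-", "65", "F", "0213*", "hypertension"),
    ("w", "64", "M", "0213*", "obesity"),
    ("w", "64", "M", "0213*", "fever"),
    ("w", "67", "M", "02138", "vomiting"),
    ("w", "67", "M", "02138", "back pain"),
]
RIGHT = [
    ("b", "65", "M", "02141", "short of breath"),
    ("b", "65", "M", "02141", "chest pain"),
    ("b", "65", "F", "02138", "painful eye"),
    ("b", "65", "F", "02138", "wheezing"),
    ("b", "64", "F", "02138", "obesity"),
    ("b", "64", "F", "02138", "chest pain"),
    ("w", "60-69", "M", "02138", "short of breath"),
    ("w", "60-69", "-", "02139", "hypertension"),
    ("w", "60-69", "-", "02139", "obesity"),
    ("w", "60-69", "-", "02139", "fever"),
    ("w", "60-69", "M", "02138", "vomiting"),
    ("w", "60-69", "M", "02138", "back pain"),
]


def is_generalized(value):
    return "-" in value or value.endswith("*")


def link_on_problem(left, right):
    """Join rows whose Problem is unique in both tables, keeping specific values."""
    left_n = Counter(row[4] for row in left)
    right_n = Counter(row[4] for row in right)
    right_by_problem = {row[4]: row for row in right}
    linked = {}
    for row in left:
        problem = row[4]
        if left_n[problem] == 1 and right_n[problem] == 1:
            other = right_by_problem[problem]
            linked[problem] = tuple(
                b if is_generalized(a) else a for a, b in zip(row[:4], other[:4])
            )
    return linked


LINKED = link_on_problem(LEFT, RIGHT)
```

Each release is 2-anonymous on its own, but in `LINKED` the hypertension and fever rows each end up with a quasi-identifier combination that no other linked row shares, so the combined release no longer satisfies k = 2.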