Xiaowei Ying, Xintao Wu

About This Presentation

Description:

On Link Privacy in Randomizing Social Networks. Xiaowei Ying, Xintao Wu, Univ. of North Carolina at Charlotte. PAKDD-09, April 28, Bangkok, Thailand

Slides: 18

Transcript and Presenter's Notes

1
On Link Privacy in Randomizing Social Networks

Xiaowei Ying, Xintao Wu
Univ. of North Carolina at Charlotte
PAKDD-09 April 28, Bangkok, Thailand

2
Motivation

Privacy Preserving Social Network Publishing
node-anonymization
cannot guarantee identity/link privacy due to subgraph queries.
Backstrom et al. WWW07, Hay et al. UMass TR07
edge randomization
Random Add/Del
Random Switch
K-anonymity
Hay et al. VLDB08, Liu & Terzi SIGMOD08, Zhou & Pei ICDE08
Utility preserving randomization
Spectral feature preserving, Ying & Wu SDM08
Real space feature preserving, Ying & Wu SDM09

3
Problem Formalization
Add k edges, then delete k edges
Prior belief vs. posterior belief
Ying & Wu SDM08
This paper
similarity measure value between node i and j
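
The "add k, then delete k" randomization and the beliefs it induces can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `rand_add_del` and `beliefs` are hypothetical, and it assumes that the k added edges are drawn from the non-edges of the original graph and the k deleted edges from its true edges. Under that assumption the released graph still has m edges, of which m - k are true, which gives the posterior without similarity information directly.

```python
import random
from itertools import combinations

def rand_add_del(nodes, edges, k, seed=None):
    """Add k false edges, then delete k true edges (undirected simple graph).

    Assumption: additions come from original non-edges and deletions from
    original edges, so the edge count m is unchanged.
    """
    rng = random.Random(seed)
    true_edges = {frozenset(e) for e in edges}
    all_pairs = {frozenset(p) for p in combinations(nodes, 2)}
    non_edges = sorted(all_pairs - true_edges, key=sorted)
    if k > len(true_edges) or k > len(non_edges):
        raise ValueError("k is too large for this graph")
    added = rng.sample(non_edges, k)
    removed = rng.sample(sorted(true_edges, key=sorted), k)
    return (true_edges - set(removed)) | set(added)

def beliefs(n, m, k):
    """Prior and posterior beliefs for a graph with n nodes and m edges."""
    total_pairs = n * (n - 1) // 2
    prior = m / total_pairs                      # P(true edge), no observation
    post_edge = (m - k) / m                      # P(true edge | edge observed)
    post_no_edge = k / (total_pairs - m)         # P(true edge | no edge observed)
    return prior, post_edge, post_no_edge

if __name__ == "__main__":
    # Sizes of the polbooks network with k = 200, as used in the slides.
    print(beliefs(105, 441, 200))
```

The posterior for an observed edge depends only on k and m here; the similarity measures introduced later in the slides are what let an attacker sharpen it for individual node pairs.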
4
Polbooks network
Network of US political books (105 nodes, 441 edges, r8). Books about US politics sold by Amazon.com. Edges represent frequent co-purchasing of books by the same buyers. Nodes have been given colors of blue, white, or red to indicate whether they are "liberal", "neutral", or "conservative".
http://www-personal.umich.edu/~mejn/netdata/
5
Proportion of true edges vs. similarity
After randomly adding/deleting 200 edges (441 edges in total)
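
A data owner who holds both the original and the randomized graph can tabulate the quantity named in this slide's title. The sketch below is illustrative only: the function name is hypothetical, the number of common neighbors stands in for the similarity measure, and it assumes the proportion is taken over all node pairs that share a similarity value in the randomized graph.

```python
from collections import defaultdict
from itertools import combinations

def true_edge_proportion_by_similarity(nodes, original_edges, randomized_edges):
    """Map each similarity value x to the share of node pairs with that
    value (measured in the randomized graph) that are true edges."""
    # Adjacency sets of the randomized (released) graph.
    adj = {v: set() for v in nodes}
    for u, v in randomized_edges:
        adj[u].add(v)
        adj[v].add(u)
    original = {frozenset(e) for e in original_edges}

    counts = defaultdict(lambda: [0, 0])  # x -> [true edges, all pairs]
    for i, j in combinations(nodes, 2):
        x = len(adj[i] & adj[j])          # common neighbors as similarity
        counts[x][1] += 1
        if frozenset((i, j)) in original:
            counts[x][0] += 1
    return {x: t / n for x, (t, n) in sorted(counts.items())}

if __name__ == "__main__":
    nodes = [0, 1, 2, 3]
    original = [(0, 1), (1, 2), (0, 2), (2, 3)]
    randomized = [(0, 1), (1, 2), (0, 2), (1, 3)]  # (2,3) deleted, (1,3) added
    print(true_edge_proportion_by_similarity(nodes, original, randomized))
```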
6
Similarity measures vs. Link prediction

Similarity measures
The number of common neighbors
Adamic/Adar, the weighted number of common neighbors
Katz, a weighted sum of the number of paths connecting two nodes
Commute time, the expected steps of random walks from node i to j and back to i.
Similarity measures have been exploited in the classic link prediction problem.
Liben-Nowell & Kleinberg CIKM03
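
Three of the measures listed above can be sketched in plain Python. This is an illustration under stated assumptions rather than the implementation used in the talk: the function names are hypothetical, Adamic/Adar uses the usual 1/log(degree) weighting, and Katz is truncated at a maximum walk length `max_len` with damping factor `beta` (both parameters are assumptions) instead of summing the full series. Commute time is omitted because it requires solving a linear system on the graph Laplacian.

```python
import math

def adjacency(nodes, edges):
    """Build adjacency sets for an undirected simple graph."""
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def common_neighbors(adj, i, j):
    return len(adj[i] & adj[j])

def adamic_adar(adj, i, j):
    # Each common neighbor z has degree >= 2, so log(degree) > 0.
    return sum(1.0 / math.log(len(adj[z])) for z in adj[i] & adj[j])

def katz_truncated(adj, i, j, beta=0.1, max_len=4):
    """Sum of beta**l * (number of length-l walks from i to j), l <= max_len."""
    walks = {v: 0 for v in adj}   # walks[v]: walks of current length i -> v
    walks[i] = 1
    score = 0.0
    for length in range(1, max_len + 1):
        nxt = {v: 0 for v in adj}
        for u, count in walks.items():
            if count:
                for v in adj[u]:
                    nxt[v] += count
        walks = nxt
        score += beta ** length * walks[j]
    return score

if __name__ == "__main__":
    # Triangle 0-1-2 with a pendant node 3 attached to node 2.
    adj = adjacency(range(4), [(0, 1), (1, 2), (0, 2), (2, 3)])
    print(common_neighbors(adj, 0, 1))
    print(adamic_adar(adj, 0, 1))
    print(katz_truncated(adj, 0, 3, beta=0.5, max_len=2))
```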

7
Proportion of true edges vs. similarity
After randomly adding/deleting 200 edges (441 edges in total)
8
Calculating Posterior belief
Applying Bayes theorem
The attacker does not know this value. What can he do?
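
The slide's own formula did not survive in this transcript. As a reminder of the general shape of such a Bayes step, the identity below uses assumed notation: $a_{ij}=1$ means nodes i and j are linked in the original graph, $\tilde{m}_{ij}$ is their similarity value in the randomized graph, n is the number of nodes, and m is the number of edges. Which events the talk actually conditions on, and which term the attacker cannot observe, cannot be recovered from the text here.

```latex
P\!\left(a_{ij}=1 \mid \tilde{m}_{ij}=x\right)
  = \frac{P\!\left(\tilde{m}_{ij}=x \mid a_{ij}=1\right)\,P\!\left(a_{ij}=1\right)}
         {P\!\left(\tilde{m}_{ij}=x\right)},
\qquad
P\!\left(a_{ij}=1\right) = \frac{m}{\binom{n}{2}},
\\[6pt]
P\!\left(\tilde{m}_{ij}=x\right)
  = P\!\left(\tilde{m}_{ij}=x \mid a_{ij}=1\right) P\!\left(a_{ij}=1\right)
  + P\!\left(\tilde{m}_{ij}=x \mid a_{ij}=0\right) P\!\left(a_{ij}=0\right).
```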
9
MLE estimation

Estimate based on randomized graph

Posterior belief can be calculated by attackers
10
Comparison
11
Comparison
12
Empirical Evaluation

Attacker's Prediction Strategy
Calculate the posterior probability of all node pairs
Choose the top t node pairs (with the highest posterior probability) as predicted candidate links

For each t, the precision of predictions (k = 0.5m)
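
The prediction strategy above amounts to a top-t precision computation, sketched below. The function name `precision_at_t` and the input layout (a dict from node pair to posterior probability) are assumptions made for illustration; ties are broken by Python's stable sort order.

```python
def precision_at_t(scores, true_edges, t):
    """Fraction of the t highest-scoring node pairs that are true edges.

    scores: dict mapping (i, j) -> posterior probability (assumed layout)
    true_edges: iterable of (i, j) pairs from the original graph
    """
    if t < 1:
        raise ValueError("t must be at least 1")
    truth = {frozenset(e) for e in true_edges}
    ranked = sorted(scores, key=lambda pair: scores[pair], reverse=True)
    top = ranked[:t]
    hits = sum(1 for pair in top if frozenset(pair) in truth)
    return hits / len(top)

if __name__ == "__main__":
    scores = {(0, 1): 0.9, (0, 2): 0.8, (1, 2): 0.3, (2, 3): 0.1}
    true_edges = [(0, 1), (2, 3)]
    for t in (1, 2, 4):
        print(t, precision_at_t(scores, true_edges, t))
```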
13
Empirical Evaluation
Posterior beliefs that exploit similarity measures achieve higher precision than those that do not. The measure that works best on one data set is not necessarily the best on another.
14
Determining k to guarantee privacy
Data Owner
15
Conclusion & Future Work