Title: Privacy issues with social networks
1Privacy issues with social networks
2Outline
- Introduction
- Attacks
- Methods countering attacks
- Summary
3Introduction
- Social networks have become popular
- A lot of research can be done with the data from social networks
- Network structure
- Node information
- Edge information
- Releasing the SN data for research
- SN data are about people
- Privacy is a big issue
- No concrete examples of using SN data for privacy attacks yet
- However, it is widely agreed that node identification is the big privacy threat
4Types of potential privacy breach
[Figure: four nodes A, B, C and D, each with the attributes age, salary and hobbies, joined by edges labeled "Friends in messenger", "Send emails", "In the same club" and "Friends in Facebook"]
- Types of privacy threats
- Node identification
- Link disclosure
- Content disclosure
5Types of potential privacy breach
- Most papers focus on node identification
- Passive vs. active attacks
- Passive attacks do not change the network structure
- Active attacks may insert malicious nodes and links into the network
6Passive attacks
- Need certain knowledge about the network
- Colluded users subgraph
- Compromised users
- Node degree
- Degrees of the node's friends (LinkedIn)
- Papers 163-165
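A minimal sketch of how knowledge of a node's degree alone can narrow down a target in a published graph. The toy edge list, the integer ids and the helper name `candidates_by_degree` are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

def degrees(edges):
    """Return {node: degree} for an undirected edge list."""
    deg = defaultdict(int)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return dict(deg)

def candidates_by_degree(edges, known_degree):
    """Nodes of the published graph whose degree matches what the attacker knows."""
    return sorted(n for n, d in degrees(edges).items() if d == known_degree)

# Hypothetical anonymized graph: names have been replaced by integer ids.
published = [(1, 2), (1, 3), (1, 4), (2, 3), (4, 5)]

# The attacker knows the target has exactly 3 friends.
print(candidates_by_degree(published, 3))  # [1]
# A target with 2 friends hides among three candidates.
print(candidates_by_degree(published, 2))  # [2, 3, 4]
```

A unique match re-identifies the node; the larger the candidate set, the better the node is hidden.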
7Active attacks
- Paper 167
- Two active attacks are described
- Walk-based attack
- Cut-based attack
- Basic idea
- Plug a set of fake nodes into the real SN, which have special interconnection structures
- These fake nodes are identifiable in the published/anonymized SN
- Then, the nodes connected to the fake nodes are identified
8Walk-based attack
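- Original SN (G-H): nodes v1, v2, v3, ..., vn
- w1, w2, ..., wi, ..., wb are the targeted nodes
- Fake SN (H): nodes x1 -- x2 -- ... -- xi -- ... -- xk
- Other edges to node xi inside H are added with probability ½
- Each wj is linked to a subset Nj of the fake SN; all Nj are distinct
- Some random edges go from H to the original SN
- xi should have Δi edges in total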
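The construction on this slide can be sketched roughly as follows. This is a simplified illustration, not the attack as published: the function name `plant_fake_subgraph`, the node labels and the `seed` parameter are hypothetical, subsets are drawn at random, and each fake node receives a single random external edge instead of being padded up to a prescribed total degree.

```python
import random

def plant_fake_subgraph(g_nodes, targets, k, seed=0):
    """Return (fake_nodes, new_edges, subsets) for a toy walk-based construction."""
    if len(targets) > 2 ** k - 1:
        raise ValueError("not enough distinct non-empty subsets for the targets")
    rng = random.Random(seed)
    fake = [f"x{i}" for i in range(1, k + 1)]
    edges = set()
    # Path x1 -- x2 -- ... -- xk inside the fake subgraph H.
    for a, b in zip(fake, fake[1:]):
        edges.add((a, b))
    # Every other pair of fake nodes is linked with probability 1/2.
    for i in range(k):
        for j in range(i + 2, k):
            if rng.random() < 0.5:
                edges.add((fake[i], fake[j]))
    # Each target gets its own distinct, non-empty subset of fake nodes.
    subsets, used = {}, set()
    for w in targets:
        while True:
            s = frozenset(rng.sample(fake, rng.randint(1, k)))
            if s not in used:
                break
        used.add(s)
        subsets[w] = s
        for x in sorted(s):
            edges.add((x, w))
    # Simplification: one random edge from each fake node into the rest of the SN.
    others = [v for v in g_nodes if v not in targets]
    for x in fake:
        if others:
            edges.add((x, rng.choice(others)))
    return fake, edges, subsets

fake, new_edges, subsets = plant_fake_subgraph(
    ["v1", "v2", "v3", "v4", "v5"], ["v1", "v2"], k=4, seed=1
)
print(fake)  # ['x1', 'x2', 'x3', 'x4']
```

Because the subsets are distinct, finding the fake nodes in the published graph tells the attacker which anonymized node is which target.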
9Theoretical results
- H can be efficiently recovered from G
- Degree test: xi has degree Δi
- Internal structure test: we know the internal structure among the xi
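A rough sketch of how the two tests can drive the search for H in the published graph: grow a path node by node, keeping a candidate only if its degree matches and its edges to the earlier path nodes match the known internal structure. The function name `recover_fake_path`, the toy graph and its labels are hypothetical, and the sketch assumes consecutive fake nodes are always linked.

```python
from collections import defaultdict

def recover_fake_path(adj, degs, internal):
    """Return every node sequence that passes both tests.

    adj:      {node: set of neighbors} for the published graph
    degs:     expected total degree of x1..xk (degree test)
    internal: set of index pairs (i, j), i < j, that are edges inside H
    """
    k = len(degs)
    found = []

    def extend(path):
        i = len(path)
        if i == k:
            found.append(list(path))
            return
        # Assumption: (i-1, i) is always an internal edge, so walk along neighbors.
        candidates = adj if not path else adj[path[-1]]
        for v in candidates:
            if v in path or len(adj[v]) != degs[i]:
                continue  # degree test failed
            # Internal structure test against every earlier node on the path.
            if all(((j, i) in internal) == (path[j] in adj[v]) for j in range(i)):
                path.append(v)
                extend(path)
                path.pop()

    extend([])
    return found

# Toy published graph: a, b, c play the fake nodes; p..t are ordinary users.
edges = [("p", "q"), ("q", "r"), ("r", "s"), ("s", "t"),   # original SN
         ("a", "b"), ("b", "c"), ("a", "c"),               # inside H
         ("a", "p"), ("a", "r"), ("a", "t"),               # H to the SN
         ("b", "q"), ("b", "s"), ("c", "q")]
adj = defaultdict(set)
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)

print(recover_fake_path(adj, [5, 4, 3], {(0, 1), (1, 2), (0, 2)}))  # [['a', 'b', 'c']]
```

Pruning on both tests at every step is what keeps the search small; on this toy graph only one sequence survives.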
10Cut-based attack
- Let b be the number of users we wish to target, and k = 3b + 3
- Create H
- k fake users xi
- Generate edges between xi in a similar way
- Properties of H
- d(H): minimum degree in H
- r(H): minimum cut in H
- r(H) = d(H) ≥ k/3 > b, with high probability
- H has no non-trivial automorphisms
- Create links to G-H
- Choose b nodes x1, ..., xb in H, create edge (xi, wi)
11Theoretical results
- H can be efficiently recovered
- Using the Gomory-Hu tree of G
- Minimum (v,w) cut = minimum edge weight on the path from v to w in the tree T
- Remove edges of weight ≤ b from T
- T turns into a forest
- Check each component Si to see whether Si is isomorphic to H
- H has no non-trivial automorphisms ⇒ H can be uniquely recovered
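The pruning step can be sketched as below, assuming the Gomory-Hu tree has already been computed elsewhere and is given as a weighted edge list. The function name `components_after_pruning` and the toy tree are hypothetical. Keeping only components with exactly k nodes is a cheap pre-filter added here; the isomorphism test against H is not shown.

```python
from collections import defaultdict

def components_after_pruning(tree_edges, b, k):
    """Drop tree edges of weight <= b and return the components with exactly k nodes."""
    adj = defaultdict(set)
    nodes = set()
    for u, v, w in tree_edges:
        nodes.update((u, v))
        if w > b:  # keep only edges heavier than the b attack links
            adj[u].add(v)
            adj[v].add(u)
    seen, comps = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:  # depth-first walk of one component of the forest
            n = stack.pop()
            if n in comp:
                continue
            comp.add(n)
            stack.extend(adj[n] - comp)
        seen |= comp
        if len(comp) == k:
            comps.append(sorted(comp))
    return comps

# Hypothetical tree for b = 1 target, so k = 3b + 3 = 6 fake nodes.
tree = [("x1", "x2", 3), ("x2", "x3", 2), ("x3", "x4", 3), ("x4", "x5", 2),
        ("x5", "x6", 2), ("x6", "u", 1), ("u", "v", 1), ("v", "w", 2)]
print(components_after_pruning(tree, b=1, k=6))
# [['x1', 'x2', 'x3', 'x4', 'x5', 'x6']]
```

The threshold works because H is attached to the rest of the graph by only b edges, while every cut inside H is larger than b.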
12Protect from passive attacks
- Attackers utilize
- Degree of nodes (paper 164)
- Neighborhood structures (paper 163,165)
- Countermeasures: methods similar to k-anonymization
13K-degree anonymization
- Paper 164
- Definition
- For each degree d that appears in the graph, at least k nodes have that same degree d
- Addresses attacks that rely on node degree to identify a node in the published SN
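The definition translates directly into a check on the degree sequence. A minimal sketch, with the function name `is_k_degree_anonymous` chosen here for illustration:

```python
from collections import Counter

def is_k_degree_anonymous(degree_sequence, k):
    """True if every degree value that occurs is shared by at least k nodes."""
    counts = Counter(degree_sequence)
    return all(c >= k for c in counts.values())

print(is_k_degree_anonymous([3, 3, 2, 2, 2], 2))  # True
print(is_k_degree_anonymous([3, 2, 2, 2, 1], 2))  # False: degrees 3 and 1 are unique
```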
14The method
- Two steps
- K-degree anonymization
- Construct graph with the modification of node
degree
15K-degree anonymization
- Sort the node degrees $d(1), \ldots, d(n)$
- $d_{i,j} = d(i), d(i+1), \ldots, d(j)$
- Cost of anonymization
- $DA(d_i, d_j) = d_i - d_j$
- $I(d_{i,j}) = \sum_{l=i}^{j} (d(i) - d(l))$
- DP Algorithm
- Change some node degrees so that the k-degree anonymization is satisfied and the cost is minimized
- $i < 2k$: $DA(d_{1,i}) = I(d_{1,i})$
- $i \ge 2k$,
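A sketch of the dynamic program over the degree sequence sorted in decreasing order. The slide does not spell out the case i ≥ 2k; the code assumes the usual recurrence for this kind of grouping problem, namely the cheaper of putting all of d(1..i) in one group or splitting at some t with k ≤ t ≤ i-k and paying DA(d(1..t)) + I(d(t+1..i)). Degrees are only raised, never lowered. The function name `anonymize_degrees` is hypothetical.

```python
def anonymize_degrees(degrees, k):
    """Return (cost, anonymized sequence), with every group of equal degrees of size >= k."""
    d = sorted(degrees, reverse=True)
    n = len(d)
    if n < k:
        raise ValueError("need at least k nodes")
    prefix = [0]
    for x in d:
        prefix.append(prefix[-1] + x)

    def group_cost(i, j):
        # I(d(i..j)), 1-based inclusive: raise every degree in the group to d(i).
        return d[i - 1] * (j - i + 1) - (prefix[j] - prefix[i - 1])

    cost = [0] * (n + 1)   # cost[i] = DA(d(1..i))
    split = [0] * (n + 1)  # split[i] = t: last group is d(t+1..i); 0 means one group
    for i in range(1, n + 1):
        cost[i], split[i] = group_cost(1, i), 0
        if i >= 2 * k:  # assumed recurrence: try every legal split point
            for t in range(k, i - k + 1):
                c = cost[t] + group_cost(t + 1, i)
                if c < cost[i]:
                    cost[i], split[i] = c, t
    # Walk the split points backwards to rebuild the anonymized sequence.
    out, i = [0] * n, n
    while i > 0:
        t = split[i]
        for pos in range(t, i):
            out[pos] = d[t]
        i = t
    return cost[n], out

print(anonymize_degrees([5, 5, 3, 3, 2, 1], 2))  # (1, [5, 5, 3, 3, 2, 2])
```

In the example only the last node changes (degree 1 raised to 2), which is the cheapest way to make every degree appear at least twice.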
16Graph construction
- Check whether the degree sequence is realizable
- Lemma from Erdős and Gallai
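The Erdős-Gallai condition says that a non-increasing sequence d(1) ≥ ... ≥ d(n) of non-negative integers is the degree sequence of a simple graph exactly when its sum is even and, for every r, the sum of the first r degrees is at most r(r-1) plus the sum of min(r, d(i)) over the remaining nodes. A direct sketch of that test, with the function name `is_realizable` chosen for illustration:

```python
def is_realizable(degree_sequence):
    """Erdős-Gallai test: can this sequence be the degrees of a simple graph?"""
    d = sorted(degree_sequence, reverse=True)
    n = len(d)
    if any(x < 0 for x in d) or sum(d) % 2:
        return False
    for r in range(1, n + 1):
        left = sum(d[:r])
        right = r * (r - 1) + sum(min(r, x) for x in d[r:])
        if left > right:
            return False
    return True

print(is_realizable([3, 3, 2, 2, 2]))  # True
print(is_realizable([3, 3, 1, 1]))     # False
```

This quadratic version favors readability; prefix sums would make it faster on long sequences.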
17- Realizable with constraints
- Original graph G with edge set E
- Anonymized graph G' with the anonymized degree sequence d'
- E is contained in E', the edge set of G'
- Another lemma states the condition for realizability subject to the input G
- Algorithms for graph construction
- Probing
- Relaxed construction (greedy_swap)
- E' of G' contains almost all of E
18Considering neighborhood structure
- Paper 165
- Attacks based on neighborhood structure
19Idea
- Describe the neighborhood structure of each node
- Definition of neighborhood
- Method for encoding the neighborhood
- For nodes with the same neighborhood structure, apply anonymization
- Minimize some cost
20Definition
- d-neighbors
- d: the distance from the node to its neighbor
- 1-neighbors: all nodes directly connected to the target node
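A short sketch of collecting the d-neighbors of a node with breadth-first search. It assumes "d-neighbors" means all nodes within distance d, so the 1-neighbors are the direct connections; the function name `d_neighbors` and the toy graph are hypothetical.

```python
from collections import deque

def d_neighbors(adj, source, d):
    """Nodes within distance d of `source` (excluding `source`), found by BFS."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == d:
            continue  # do not expand beyond distance d
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return {v for v in dist if v != source}

adj = {"a": {"b", "c"}, "b": {"a", "d"}, "c": {"a"}, "d": {"b", "e"}, "e": {"d"}}
print(sorted(d_neighbors(adj, "a", 1)))  # ['b', 'c']
print(sorted(d_neighbors(adj, "a", 2)))  # ['b', 'c', 'd']
```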
21Encoding the neighborhood
- Neighborhood components
- Encoding each component
22Encoding neighborhood component
- Using DFS-tree code
- minimum DFS code
- Unique for the same structure
- I.e. isomorphic graphs will have the same minimum DFS code
- Check the referenced paper
23Neighborhood Anonymization
- Sorting vertex degrees
- Same degree? check structure
- Cost: minimize the change to the structure
- Anonymize neighborhood components
24Summary
- Current research focuses on node identification
- Through node degree
- Through neighborhood structure
- Discover more attacks
- Design different methods countering the attacks
- Utility metric for SN