Privacy issues with social networks - PowerPoint PPT Presentation

Slides: 25
Provided by: keke9
Transcript and Presenter's Notes

1
Privacy issues with social networks
2
Outline
  • Introduction
  • Attacks
  • Methods countering attacks
  • Summary

3
Introduction
  • Social networks have become popular
  • A lot of research can be done with the data from
    social networks:
  • Network structure
  • Node information
  • Edge information
  • Releasing SN data for research
  • SN data are about people
  • Privacy is a big issue
  • No concrete examples of privacy attacks that
    use SN data have been reported yet
  • However, it is widely agreed that node
    identification is the biggest privacy threat

4
Types of potential privacy breach
[Figure: example network with four nodes A, B, C, D, each carrying the attributes age, salary, hobbies; the edges are labeled "friends in messenger", "send emails", "in the same club", "friends in Facebook"]
  • Types of privacy threats
  • Node identification
  • Link disclosure
  • Content disclosure

5
Types of potential privacy breach
  • Most papers focus on node identification
  • Passive vs. active attacks
  • Passive attacks do not change the network
    structure
  • Active attacks may insert malicious nodes and
    links into the network.

6
Passive attacks
  • Need certain knowledge about the network
  • Subgraph of colluding users
  • Compromised users
  • Node degree
  • Degrees of the node's friends (LinkedIn)
  • Papers 163-165
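
A minimal sketch of the node-degree case, assuming the attacker knows only the target's degree and the published graph is given as an adjacency dictionary. The graph `anon` and the function name `degree_candidates` are hypothetical, chosen for illustration only.

```python
def degree_candidates(adj, known_degree):
    """Return the anonymized nodes whose degree matches what the attacker knows."""
    return sorted(v for v, nbrs in adj.items() if len(nbrs) == known_degree)

# Hypothetical published graph: node labels are meaningless integers.
anon = {
    0: {1, 2, 3},
    1: {0, 2},
    2: {0, 1},
    3: {0},
}

print(degree_candidates(anon, 3))  # [0]: a unique degree re-identifies the node
print(degree_candidates(anon, 2))  # [1, 2]: two candidates remain
```

A candidate list of length one means the node is re-identified by its degree alone.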

7
Active attacks
  • Paper 167
  • Two active attacks are described
  • Walk-based attack
  • Cut-based attack
  • Basic idea
  • Plug a set of fake nodes with a special
    interconnection structure into the real SN
  • These fake nodes are identifiable in the
    published/anonymized SN
  • The nodes connected to the fake nodes can then
    be identified

8
Walk-based attack
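[Figure: walk-based attack]
  • Original SN (G-H): nodes v1, v2, v3, ..., vn
  • w1, w2, ..., wi, ..., wb are the targeted nodes
  • Fake SN (H): nodes x1, x2, ..., xk, joined in a
    path x1--x2--...--xi--...--xk
  • Every other pair of fake nodes is linked with
    probability 1/2
  • Each target wi is linked to a subset Ni of the
    fake nodes; all Ni are distinct
  • Each xi gets some random edges to the original SN
  • xi should have Δi edges in total

9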
Theoretical results
  • H can be efficiently recovered from G
  • Degree test: xi has degree Δi
  • Internal structure test: we know the internal
    structure among the xi
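
The following is a small sketch of the walk-based attack and the two tests, not the construction from the paper. Assumptions: the original SN is a hypothetical 30-node ring, each target is linked to a distinct pair of fake nodes, the extra random edges to the original SN are omitted, and recovery is a plain backtracking search. All function names (`plant`, `recover`, `identify`) are illustrative.

```python
import itertools
import random

def add_edge(adj, u, v):
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

def plant(adj, targets, k, rng):
    """Add fake nodes x0..x(k-1) to adj; return what the attacker remembers."""
    xs = [f"x{i}" for i in range(k)]
    internal = set()
    for i, j in itertools.combinations(range(k), 2):
        # consecutive fake nodes are always joined; other pairs with prob. 1/2
        if j == i + 1 or rng.random() < 0.5:
            internal.add((i, j))
            add_edge(adj, xs[i], xs[j])
    # each target gets a distinct subset of fake nodes (assumed here: a pair)
    subsets = rng.sample(list(itertools.combinations(range(k), 2)), len(targets))
    for w, n_j in zip(targets, subsets):
        for i in n_j:
            add_edge(adj, xs[i], w)
    degrees = [len(adj[x]) for x in xs]
    return internal, degrees, dict(zip(targets, subsets))

def recover(adj, k, internal, degrees):
    """Find all node sequences that pass the degree and internal-structure tests."""
    found = []
    def extend(seq):
        i = len(seq)
        if i == k:
            found.append(list(seq))
            return
        for y in adj:
            if y in seq or len(adj[y]) != degrees[i]:
                continue  # degree test
            # internal structure test against every earlier node in the sequence
            if all(((j, i) in internal) == (seq[j] in adj[y]) for j in range(i)):
                seq.append(y)
                extend(seq)
                seq.pop()
    extend([])
    return found

def identify(adj, seq, subset):
    """Nodes outside seq whose neighbors inside seq are exactly the given subset."""
    want = {seq[i] for i in subset}
    return [v for v in adj if v not in seq and adj[v] & set(seq) == want]

rng = random.Random(1)
g = {}
for v in range(30):                      # hypothetical original SN: a ring
    add_edge(g, v, (v + 1) % 30)
targets = [3, 17]
internal, degrees, subsets = plant(g, targets, k=7, rng=rng)

# The publisher releases the graph under random integer labels.
names = list(g)
perm = list(range(len(names)))
rng.shuffle(perm)
relabel = dict(zip(names, perm))
published = {relabel[u]: {relabel[v] for v in nbrs} for u, nbrs in g.items()}

matches = recover(published, 7, internal, degrees)
true_seq = [relabel[f"x{i}"] for i in range(7)]
print(true_seq in matches, len(matches))
print(identify(published, true_seq, subsets[3]) == [relabel[3]])
```

The planted sequence always passes both tests; whether it is the only match depends on the random structure of H, which is what the theoretical results address.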

10
Cut-based attack
  • Let b be the number of users we wish to target,
    and k = 3b + 3
  • Create H
  • k fake users xi
  • Generate edges between the xi in a similar way
    (as in the walk-based attack)
  • Properties of H
  • d(H): minimum degree in H
  • r(H): minimum cut in H
  • r(H) = d(H) ≥ k/3 > b, with high probability
  • H has no non-trivial automorphisms
  • Create links to G-H
  • Choose b nodes {x1, ..., xb} in H and create
    edge (xi, wi) for each

11
Theoretical results
  • H can be efficiently recovered
  • Using the Gomory-Hu tree T of G
  • Minimum (v,w) cut = minimum edge weight on the
    path from v to w in the tree T
  • Remove edges of weight at most b from T (H is
    attached to G-H by only b edges)
  • T turns into a forest
  • Check whether each component Si is isomorphic to
    H
  • H has no non-trivial automorphisms ⇒ H can be
    uniquely recovered.
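
A sketch of the two tree operations used in the recovery step, assuming a Gomory-Hu tree has already been computed and is given as a list of weighted edges. The tree `T` below is a hypothetical toy example (not sized as k = 3b + 3), and the threshold "weight at most b" is an assumption about the exact cutoff.

```python
def tree_min_cut(tree_edges, s, t):
    """Minimum (s,t) cut = smallest edge weight on the tree path from s to t."""
    adj = {}
    for u, v, w in tree_edges:
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    stack = [(s, None, float("inf"))]
    while stack:
        node, prev, best = stack.pop()
        if node == t:
            return best
        for nxt, w in adj[node]:
            if nxt != prev:
                stack.append((nxt, node, min(best, w)))
    return None

def components_after_pruning(tree_edges, b):
    """Drop tree edges of weight <= b; return the node sets of the forest."""
    parent = {}
    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v
    for u, v, w in tree_edges:
        ru, rv = find(u), find(v)          # registers both endpoints
        if w > b:
            parent[ru] = rv                # heavy edge survives: merge
    groups = {}
    for v in list(parent):
        groups.setdefault(find(v), set()).add(v)
    return sorted(groups.values(), key=lambda s: (-len(s), sorted(s)))

# Hypothetical Gomory-Hu tree: x1..x3 stand for fake nodes, w1 for a target.
T = [("x1", "x2", 4), ("x2", "x3", 5), ("x3", "w1", 2),
     ("w1", "v1", 1), ("v1", "v2", 1)]

print(tree_min_cut(T, "x1", "v2"))          # 1
print(components_after_pruning(T, b=2)[0])  # the component holding x1, x2, x3
```

Each surviving component would then be compared against H with an isomorphism test, which is not shown here.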

12
Protect from passive attacks
  • Attackers utilize
  • Degree of nodes (paper 164)
  • Neighborhood structures (papers 163, 165)
  • Countermeasures use methods similar to
    k-anonymization

13
K-degree anonymization
  • Paper 164
  • Definition
  • For every degree d that occurs in the graph, at
    least k nodes have that degree d.
  • Addresses attacks that depend on node degree to
    identify a node in the published SN
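
The definition can be checked directly on a degree sequence. This is a minimal sketch; the function name `is_k_degree_anonymous` is illustrative.

```python
from collections import Counter

def is_k_degree_anonymous(degrees, k):
    """True if every degree value that occurs is shared by at least k nodes."""
    return all(count >= k for count in Counter(degrees).values())

print(is_k_degree_anonymous([3, 3, 2, 2], 2))  # True
print(is_k_degree_anonymous([3, 2, 2, 2], 2))  # False: only one node has degree 3
```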

14
The method
  • Two steps
  • Anonymize the degree sequence (k-degree
    anonymization)
  • Construct a graph that realizes the modified
    node degrees

15
K-degree anonymization
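  • Sort the node degrees in non-increasing order:
    d(1), ..., d(n)
  • d[i,j] = d(i), d(i+1), ..., d(j)
  • Cost of anonymization
  • DA(d[i,j]): minimum cost of anonymizing d[i,j],
    where changing a degree di to dj costs |di - dj|
  • I(d[i,j]) = Σ_{l=i..j} (d(i) - d(l)): cost of
    putting d(i), ..., d(j) into one group
  • DP algorithm
  • Change some node degrees so that k-degree
    anonymity is satisfied and the cost is
    minimized
  • i < 2k: DA(d[1,i]) = I(d[1,i])
  • i ≥ 2k: DA(d[1,i]) = min{ I(d[1,i]),
    min over k ≤ t ≤ i-k of DA(d[1,t]) + I(d[t+1,i]) }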
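
A sketch of a dynamic program for this step, under these assumptions: degrees are sorted in non-increasing order, every group has at least k members, each group is raised to its largest degree, and the cost is the total increase in degree. The function name `anonymize_degrees` is illustrative, and the straightforward O(n^2) loop is used rather than any optimized variant.

```python
def anonymize_degrees(degrees, k):
    """Return (cost, k-anonymous degree sequence) for the sorted input degrees."""
    d = sorted(degrees, reverse=True)
    n = len(d)
    if n < k:
        raise ValueError("need at least k nodes")
    prefix = [0]
    for x in d:
        prefix.append(prefix[-1] + x)

    def group_cost(i, j):
        # cost of raising d[i..j] (0-based, inclusive) to d[i]
        return d[i] * (j - i + 1) - (prefix[j + 1] - prefix[i])

    INF = float("inf")
    cost = [INF] * (n + 1)  # cost[m]: best cost for the first m degrees
    cut = [0] * (n + 1)     # cut[m]: start index of the last group
    cost[0] = 0
    for m in range(k, n + 1):
        # the last group is d[t..m-1]; prefixes shorter than k stay infeasible
        for t in range(0, m - k + 1):
            c = cost[t] + group_cost(t, m - 1)
            if c < cost[m]:
                cost[m], cut[m] = c, t

    out, m = d[:], n
    while m > 0:            # walk the cuts backwards to rebuild the groups
        t = cut[m]
        for l in range(t, m):
            out[l] = d[t]
        m = t
    return cost[n], out

print(anonymize_degrees([5, 5, 3, 3, 2, 1], 2))  # (1, [5, 5, 3, 3, 2, 2])
```

In the example only the last node changes (degree 1 raised to 2), so the total cost is 1.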

16
Graph construction
  • Check whether the degree sequence is realizable
  • Lemma from Erdős and Gallai
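
The lemma itself is not reproduced in the transcript. As a sketch, the standard Erdős–Gallai condition says that a non-increasing sequence d(1), ..., d(n) is realizable as a simple graph if and only if its sum is even and, for every l, the sum of the first l degrees is at most l(l-1) plus the sum of min(l, d(i)) over the remaining i. The function name `is_realizable` is illustrative.

```python
def is_realizable(degrees):
    """Erdős–Gallai test: can this degree sequence belong to a simple graph?"""
    d = sorted(degrees, reverse=True)
    if sum(d) % 2 != 0 or any(x < 0 for x in d):
        return False
    for l in range(1, len(d) + 1):
        lhs = sum(d[:l])
        rhs = l * (l - 1) + sum(min(l, x) for x in d[l:])
        if lhs > rhs:
            return False
    return True

print(is_realizable([3, 3, 2, 2]))  # True
print(is_realizable([3, 3, 1, 1]))  # False
```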

17
  • Realizable with constraints
  • Original G = (V, E)
  • G' = (V, E') with the degree sequence d'
  • E ⊆ E': every edge of G is kept in G'
  • Another lemma states the condition for d' to be
    realizable subject to the input G
  • Algorithms for graph construction
  • Probing
  • Relaxed construction (greedy_swap)
  • E' of G' contains almost all of E

18
Considering neighborhood structure
  • Paper 165
  • Attacks based on neighborhood structure

19
Idea
  • Describe the neighborhood structure of each node
  • Definition of neighborhood
  • Method for encoding the neighborhood
  • For nodes with the same neighborhood structure,
    apply anonymization
  • Minimize some cost

20
Definition
  • d-neighbors
  • d: the distance from the node to its neighbor
  • 1-neighbors: all nodes directly connected to the
    target node
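
A minimal sketch of collecting d-neighbors with breadth-first search. It assumes "d-neighbors" means all nodes within distance d of the target (rather than exactly at distance d); the function name `d_neighbors` and the example graph are hypothetical.

```python
from collections import deque

def d_neighbors(adj, source, d):
    """Return all nodes within distance d of source, excluding source itself."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == d:
            continue  # do not expand beyond distance d
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return {v for v in dist if v != source}

# Hypothetical path graph 0 - 1 - 2 - 3
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
print(d_neighbors(path, 0, 1))  # {1}
print(d_neighbors(path, 0, 2))  # {1, 2}
```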

21
Encoding the neighborhood
  • Neighborhood components
  • Encoding each component

22
Encoding neighborhood component
  • Using DFS-tree code
  • minimum DFS code
  • Unique for the same structure
  • I.e. isomorphic graphs will have the same minimum
    DFS code.
  • See the referenced paper for details
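
The property that matters here is that isomorphic components get the same code. The sketch below illustrates that property with a brute-force canonical code (the smallest relabeled edge list over all node orderings). It is a stand-in, not the minimum DFS code from the paper, and it is only practical for very small components. The function name `canonical_code` is illustrative.

```python
from itertools import permutations

def canonical_code(nodes, edges):
    """Smallest sorted edge list over all relabelings; equal for isomorphic graphs."""
    nodes = list(nodes)
    best = None
    for perm in permutations(range(len(nodes))):
        pos = dict(zip(nodes, perm))
        code = tuple(sorted(tuple(sorted((pos[u], pos[v]))) for u, v in edges))
        if best is None or code < best:
            best = code
    # the node count is included so that isolated nodes are not lost
    return (len(nodes), best)

path_a = canonical_code("abc", [("a", "b"), ("b", "c")])
path_b = canonical_code("xyz", [("y", "x"), ("x", "z")])
triangle = canonical_code("abc", [("a", "b"), ("b", "c"), ("a", "c")])
print(path_a == path_b)    # True: both are paths on three nodes
print(path_a == triangle)  # False
```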

23
Neighborhood Anonymization
  • Sort the vertices by degree
  • Same degree? Check the structure
  • Cost: minimize the change to the structure
  • Anonymize neighborhood components

24
Summary
  • Current research focuses on node identification
  • Through node degree
  • Through neighborhood structure
  • Discover more attacks
  • Design different methods countering the attacks
  • Utility metric for SN