Generating and Tracking Communities Based on Implicit Affinities

1
Generating and Tracking Communities Based on
Implicit Affinities
  • Matthew Smith (smitty_at_byu.edu)
  • BYU Data Mining Lab
  • April 2007

2
Outline
  • Introduction & Motivation
  • Project
  • Community Generation: IANs
  • Social Capital for Community Tracking
  • Experiments & Observations
  • Conclusions and Future Work

3
Introduction
  • Online Communities
  • Continually emerging; many sites are adding this
    aspect
  • Like offline communities, they are complex and
    dynamic
  • Examples
  • USENET (1980), Google Groups, Wikipedia
  • LinkedIn, Flickr, YouTube, MySpace, Facebook,
    etc.
  • Medical Communities (e.g., DailyStrength, NAAF)
  • Political Communities
  • Blogosphere (focus of experiments)

4
Motivation
Explicit Links
Explicit Social Network (ESN). Links: friends, web
links, etc.
5
Motivation
Explicit Links and Implicit Affinities
[Figure: example affinity labels: cancer, bald, smoke]
ESN and Implicit Affinity Network (IAN).
Applications: medical, blogosphere, etc.
6
Implicit Affinity
  • Affinity
  • The overlap of attribute values for any
    common attribute
  • Community
  • Set of individuals characterized by attributes
  • Linked by affinities rather than explicit
    relationships

7
IAN Community Generation
  • Individuals are nodes
  • Characterized by attributes
  • Affinities are edges
  • Unlike traditional social networks, where links
    represent explicit relationships, the links in
    our approach are based strictly on affinities
  • Connections emerge naturally

8
Affinity Scoring
  • Affinity score for a particular attribute
  • Affinity score for all attributes
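
The transcript names the two scores but not their formulas. As a minimal sketch only, assuming a Jaccard-style overlap per attribute and an unweighted mean over the attributes two individuals have in common (both choices are assumptions, not the presenter's definitions, and the function names are hypothetical), the scores could be computed like this:

```python
def attribute_affinity(values_a, values_b):
    """Overlap of two individuals' values for ONE attribute.

    Assumption: Jaccard overlap (shared values / all values).
    """
    a, b = set(values_a), set(values_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def overall_affinity(person_a, person_b):
    """Score across all attributes.

    Assumption: unweighted mean of the per-attribute scores over the
    attributes that both individuals have in common.
    """
    common = set(person_a) & set(person_b)
    if not common:
        return 0.0
    total = sum(attribute_affinity(person_a[k], person_b[k]) for k in common)
    return total / len(common)


# Made-up individuals, each a mapping of attribute -> set of values.
alice = {"companies": {"Google", "Apple"}, "topics": {"search"}}
bob = {"companies": {"Google", "Microsoft"}, "topics": {"search"}}

print(attribute_affinity(alice["companies"], bob["companies"]))
print(overall_affinity(alice, bob))
```

Here the two bloggers share one of three company values (about 0.33) and all topic values (1.0), so the overall score is about 0.67.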

9
Affinity Network Building
IAN
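
Building the network can be sketched as scoring every pair of individuals and keeping the pairs with a non-zero affinity as edges. The snippet below is an illustration under stated assumptions: a single attribute per individual, a Jaccard-style score, and the hypothetical names `jaccard` and `build_ian`; none of these come from the slides.

```python
from itertools import combinations


def jaccard(a, b):
    """Assumed affinity score: shared values divided by all values."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_ian(individuals, min_score=0.0):
    """Return {(name_a, name_b): score} for each pair scoring above min_score.

    individuals: mapping of name -> set of attribute values.
    """
    edges = {}
    for name_a, name_b in combinations(sorted(individuals), 2):
        score = jaccard(individuals[name_a], individuals[name_b])
        if score > min_score:
            edges[(name_a, name_b)] = score
    return edges


# Made-up individuals using the labels from the Motivation slide.
individuals = {
    "ann": {"cancer", "smoke"},
    "ben": {"smoke"},
    "cy": {"bald"},
}
print(build_ian(individuals))  # {('ann', 'ben'): 0.5}
```

No explicit link between "ann" and "ben" is needed; the edge emerges from the shared value alone.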
10
Social Capital for Community Tracking
  • Social Capital: the advantage available through
    connections between individuals within a
    particular network
  • Bonding and Bridging Metrics
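
The transcript does not preserve the metric definitions. Purely as an illustration, one possible reading is to compare two snapshots of the network: bonding as the affinity gained on edges that already existed, and bridging as the number of newly formed edges. These definitions and the function name are assumptions, not the presenter's formulas.

```python
def bonding_and_bridging(previous, current):
    """Compare two snapshots given as {(a, b): affinity_score}.

    Assumed readings (not the presenter's formulas):
      bonding  = total score gained on edges present in both snapshots
      bridging = number of edges present only in the current snapshot
    """
    shared = previous.keys() & current.keys()
    bonding = sum(max(current[e] - previous[e], 0.0) for e in shared)
    bridging = len(current.keys() - previous.keys())
    return bonding, bridging


# Made-up snapshots for two consecutive days.
day1 = {("ann", "ben"): 0.25}
day2 = {("ann", "ben"): 0.75, ("ben", "cy"): 0.5}
print(bonding_and_bridging(day1, day2))  # (0.5, 1)
```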

11
Preliminary Experiments & Observations
12
Scobleizer's Blog List
  • Robert Scoble (Scobleizer)
  • Blogger and book author
  • Technical evangelist (formerly with Microsoft)
  • Data Set Details
  • Scobleizer's reading list at Bloglines.com
  • 570 blogs
  • 2380 bloggers

13
Data Set Statistics: Blog Posts per Day
Data is lacking for all bloggers during the first few
days
We observe fewer posts during the weekend (Friday and
Saturday)
14
Single Attribute: Companies
  • Motivation
  • Many bloggers talk about various companies and
    what they are doing
  • Methodology
  • Whenever a company is mentioned in a blogger's
    post, it becomes a feature of the blogger
  • Static company list used as attributes
  • 1,914 company names
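
The methodology above can be sketched as a lookup of each post against the static company list. The matching rule here (case-insensitive, whole-name matches) and the name `extract_company_features` are assumptions made for illustration; the slides do not say how mentions were detected.

```python
import re
from collections import defaultdict


def extract_company_features(posts, company_names):
    """Map each blogger to the set of listed companies named in their posts.

    posts: iterable of (blogger, text) pairs
    company_names: the static attribute list
    """
    # Assumption: a mention is a case-insensitive match of the whole name,
    # not embedded inside a longer word.
    patterns = {
        name: re.compile(r"(?<!\w)" + re.escape(name) + r"(?!\w)", re.IGNORECASE)
        for name in company_names
    }
    features = defaultdict(set)
    for blogger, text in posts:
        for name, pattern in patterns.items():
            if pattern.search(text):
                features[blogger].add(name)
    return dict(features)


# Made-up posts and a three-name stand-in for the 1,914-name list.
posts = [
    ("scoble", "Microsoft and Google both shipped updates."),
    ("scoble", "More on Google today."),
    ("dana", "I like apples."),
]
companies = ["Microsoft", "Google", "Apple"]
print(extract_company_features(posts, companies))
```

In this example "apples" does not count as a mention of "Apple", so "dana" ends up with no features.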

15
Cyclic Feature Usage
16
Power-law Behavior: Features
  • Observations
  • Few companies are mentioned by many bloggers
  • Many companies are mentioned by few bloggers
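
One way to check for this long-tail shape is to count how many bloggers mention each company and rank the counts. The sketch below assumes features are stored as a mapping of blogger to company set; the function name and data are hypothetical.

```python
from collections import Counter


def bloggers_per_company(features):
    """Return (company, number_of_bloggers) pairs, most mentioned first.

    features: mapping of blogger -> set of companies they mentioned.
    """
    counts = Counter()
    for companies in features.values():
        counts.update(companies)
    return counts.most_common()


# Made-up features: one popular company and two long-tail companies.
features = {
    "a": {"Google", "Microsoft"},
    "b": {"Google"},
    "c": {"Google", "Acme"},
}
ranked = bloggers_per_company(features)
print(ranked[0])  # ('Google', 3)
```

Plotting rank against count on log-log axes is a common informal check for power-law behavior; a roughly straight line is suggestive, not conclusive.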

17
Blog Community Evolution
  • Observations
  • Weekend bonding?
  • Bridging indicates newly used features and
    new bloggers
  • Overall bonding (expected): static set of
    features, no decay
  • Blogosphere is full of buzz

18
Blog-based IAN: Feb. 24
19
Conclusions
  • Blog posts were cyclic within this community
  • Posted more during the week and less during the
    weekends
  • Interestingly, bonding occurs during the weekends
  • Company mentions followed a power-law distribution
  • Few companies are mentioned often
  • Most companies are mentioned rarely
  • Niche sub-communities
  • Bloggers focusing on long-tail companies were
    identified
  • Blog-based IAN
  • Appears to follow power-law connectivity like ESNs

20
Future Work (In Progress)
  • Compare IAN and ESN of the same community
  • Analyze evolution (social capital vs. density)
  • Compare snapshots
  • Identify and report similarities and differences
  • Develop hybrid sub-community identification
  • Experiment on domain-specific communities
  • Medical patient communities
  • Political: jump-start grass-roots campaigns

21
More Future Work
  • Refine implicit attribute extraction
  • Allow for dynamic feature extraction
  • Allow features to naturally decay with time
  • Use LDA to extract concepts
  • Putnam's puzzle
  • Consider adapting Social Capital measures to
    allow for uncorrelated bonding and bridging

22
Questions
  • ?

23
Affinity Score Distribution
24
Blog-based IANs Filtered by Threshold
Affinity scores ≥ 0.5
Affinity score of 1.0
25
Blog-based IAN Filtered by Thresholds
Affinity thresholds: score ≥ 0.5, count ≥ 3
Panels: 2/15 and 3/15
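
Filtering a network by these thresholds can be sketched as a simple predicate over edges. The slides do not define "count"; the snippet assumes it is the number of attribute values a pair shares, and `filter_edges` is a hypothetical name.

```python
def filter_edges(edges, min_score=0.5, min_count=0):
    """Keep edges that meet both thresholds.

    edges: {(a, b): (score, shared_count)}
    Assumption: shared_count is the number of attribute values in common.
    """
    return {
        pair: (score, count)
        for pair, (score, count) in edges.items()
        if score >= min_score and count >= min_count
    }


# Made-up edges.
edges = {
    ("ann", "ben"): (1.0, 1),
    ("ann", "cy"): (0.6, 4),
    ("ben", "cy"): (0.2, 5),
}
print(filter_edges(edges))                  # score >= 0.5 only
print(filter_edges(edges, 0.5, 3))          # score >= 0.5 and count >= 3
print(filter_edges(edges, min_score=1.0))   # perfect overlap only
```

Adding the count threshold drops pairs such as ("ann", "ben"), whose perfect score rests on a single shared value.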