Title: Modeling the Spread of Influence on the Blogosphere
1Modeling the Spread of Influence on the Blogosphere
- Akshay Java, Pranam Kolari, Tim Finin, and Tim Oates
- UMBC Tech Report
- 04/12/06
2Outline
- What is influence?
- Basic Influence Model
- Influence models for the blogosphere
- Results
- Conclusions
3What is Influence?
- Main Entry: influence
- Pronunciation: 'in-"flü-n(t)s, esp Southern in-'
- Function: noun
- Etymology: Middle English, from Middle French, from Medieval Latin influentia, from Latin influent-, influens, present participle of influere to flow in, from in- + fluere to flow -- more at FLUID
- 1 a: an ethereal fluid held to flow from the stars and to affect the actions of humans b: an emanation of occult power held to derive from stars
- 2: an emanation of spiritual or moral force
- 3 a: the act or power of producing an effect without apparent exertion of force or direct exercise of command b: corrupt interference with authority for personal gain
- 4: the power or capacity of causing an effect in indirect or intangible ways: SWAY
- 5: one that exerts influence
- under the influence: affected by alcohol: DRUNK <was arrested for driving under the influence>
NOT This Kind of Influence! :-)
4Motivation
- Influence models studied for cocitation graphs
- David Kempe, Jon Kleinberg, Eva Tardos: Maximizing the Spread of Influence through a Social Network, KDD 2003
- Applies to blogs also.
- Recent examples: Startups, Microsoft Origami, Walmart, DoD
- GOAL: Predict influential blogs
- Target nodes to help achieve a Tipping Point
The Tipping Point, Malcolm Gladwell
5Influence on the Blogosphere
Post was Influenced by NPR, eWeek
6Influence Models for the Blogosphere
[Figure: a Blog Graph and the corresponding Influence Graph; nodes include U and V, and influence edges carry fractional weights such as 1/3, 2/5, 1/5, 1/2]
W_{u,v} = C_{u,v} / d_v
U links to V => U is influenced by V
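The weighting rule on this slide can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the slide does not define C_{u,v} or d_v, so the code assumes C_{u,v} is the number of links from blog v to blog u (so that u influences v) and d_v is the total number of links leaving v, which makes the weights pointing into each node sum to 1. The function name `influence_weights` and the sample links are hypothetical.

```python
from collections import Counter, defaultdict


def influence_weights(blog_links):
    """Turn blog-graph links into weighted influence-graph edges.

    blog_links: iterable of (source, target) pairs, one per hyperlink;
    repeated pairs count as parallel links.
    Returns {(influencer, influenced): weight}.
    """
    counts = Counter()             # assumed C[(u, v)]: links from v to u
    in_degree = defaultdict(int)   # assumed d[v]: all links leaving v
    for source, target in blog_links:
        # "source links to target" means "source is influenced by target",
        # so the influence edge runs target -> source.
        counts[(target, source)] += 1
        in_degree[source] += 1
    return {(u, v): c / in_degree[v] for (u, v), c in counts.items()}


# Hypothetical blog graph: a links to b twice and to c once; c links to b.
links = [("a", "b"), ("a", "b"), ("a", "c"), ("c", "b")]
weights = influence_weights(links)
```

With this normalization, a blog that cites one source for most of its links is treated as mostly influenced by that source.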
7Basic Influence Models
Influence Graph
- Linear Threshold Model
- Node v becomes active when \sum_{w} b_{v,w} \geq \theta_v
- w ranges over the active neighbors of v
- Cascade Model
- P_{v,w}: probability with which a node can activate each of its neighbors, independent of history.
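The Linear Threshold Model can be illustrated with a short simulation. This is a sketch, assuming edge weights are stored as a dictionary keyed by (influencer, influenced) and that thresholds are drawn uniformly at random when none are given, as in Kempe et al.; the function name and the toy graph are hypothetical.

```python
import random


def linear_threshold(weights, seeds, thresholds=None, rng=None):
    """Run the Linear Threshold Model until no more nodes activate.

    weights: {(w, v): b_vw}, the influence of neighbor w on node v.
    seeds: initially active nodes.
    thresholds: optional {v: theta_v}; drawn uniformly from [0, 1) if omitted.
    Returns the final set of active nodes.
    """
    rng = rng or random.Random(0)
    nodes = {n for edge in weights for n in edge} | set(seeds)
    if thresholds is None:
        thresholds = {v: rng.random() for v in sorted(nodes)}
    incoming = {v: [] for v in nodes}
    for (w, v), b in weights.items():
        incoming[v].append((w, b))

    active = set(seeds)
    changed = True
    while changed:
        changed = False
        for v in nodes - active:
            # Sum the weights of v's currently active neighbors.
            total = sum(b for w, b in incoming[v] if w in active)
            if total >= thresholds[v]:
                active.add(v)
                changed = True
    return active


# Toy graph: b needs both a and c to be active; d follows b.
toy_weights = {("a", "b"): 0.5, ("c", "b"): 0.5, ("b", "d"): 1.0}
toy_thresholds = {"a": 1.0, "b": 0.6, "c": 1.0, "d": 0.5}
print(linear_threshold(toy_weights, {"a", "c"}, toy_thresholds))
```

Because activation is monotone (an active node never deactivates), the final active set does not depend on the order in which inactive nodes are checked.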
[Figure: influence graph with active and inactive nodes, a node threshold \theta_v, and fractional edge weights such as 1/3, 2/5, 1/5, 1/2]
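The Cascade Model can be sketched the same way. The code below is an illustrative sketch, assuming activation probabilities are stored as a dictionary keyed by (v, w) and that each newly active node gets exactly one attempt per neighbor, following the independent cascade formulation in Kempe et al.; `independent_cascade`, `expected_spread`, and the sample graph are hypothetical names and data.

```python
import random


def independent_cascade(probs, seeds, rng=None):
    """Simulate one run of the independent cascade model.

    probs: {(v, w): p_vw}, probability that active v activates neighbor w.
    Returns the set of nodes active when the process stops.
    """
    rng = rng or random.Random(0)
    outgoing = {}
    for (v, w), p in probs.items():
        outgoing.setdefault(v, []).append((w, p))

    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for v in frontier:
            # v gets a single chance to activate each inactive neighbor.
            for w, p in outgoing.get(v, []):
                if w not in active and rng.random() < p:
                    active.add(w)
                    next_frontier.append(w)
        frontier = next_frontier
    return active


def expected_spread(probs, seeds, runs=1000, seed=0):
    """Estimate the expected number of active nodes by repeated simulation."""
    rng = random.Random(seed)
    total = sum(len(independent_cascade(probs, seeds, rng)) for _ in range(runs))
    return total / runs


chain = {("a", "b"): 1.0, ("b", "c"): 1.0}
print(independent_cascade(chain, ["a"]))
```

Averaging many runs, as `expected_spread` does, is one simple way to compare candidate seed sets under a randomized model.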
8Node Selection Heuristics
- Inlinks
- Easily spammed
- Centrality
- Expensive to compute for very large graphs
- PageRank
- Requires link information
- However, is easy to compute
- Greedy Heuristic
- Computationally expensive
- However performs better
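Three of the heuristics above can be compared with a small sketch. It assumes the blog graph is a list of (source, target) link pairs, uses a plain power-iteration PageRank with a damping factor of 0.85 (a conventional default, not a value from the slides), and implements the greedy heuristic as hill climbing over a caller-supplied `spread` function. All function names are hypothetical; in practice `spread` would be estimated by repeatedly simulating an influence model, which is where the computational cost comes from.

```python
from collections import Counter


def top_by_inlinks(links, k):
    """Rank nodes by raw in-link count (cheap, but easy to spam)."""
    counts = Counter(target for _, target in links)
    return [node for node, _ in counts.most_common(k)]


def pagerank(links, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over (source, target) link pairs."""
    nodes = sorted({n for edge in links for n in edge})
    out = {n: [] for n in nodes}
    for source, target in links:
        out[source].append(target)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without out-links is spread evenly.
        dangling = sum(rank[node] for node in nodes if not out[node])
        new = {node: (1 - damping) / n + damping * dangling / n for node in nodes}
        for source in nodes:
            for target in out[source]:
                new[target] += damping * rank[source] / len(out[source])
        rank = new
    return rank


def greedy_select(candidates, k, spread):
    """Pick k seeds, each time adding the node with the best resulting spread.

    spread: function mapping a list of seeds to an (estimated) spread value.
    """
    chosen = []
    for _ in range(min(k, len(candidates))):
        remaining = [c for c in candidates if c not in chosen]
        best = max(remaining, key=lambda c: spread(chosen + [c]))
        chosen.append(best)
    return chosen
```

A usage example with a deterministic stand-in for the spread estimate: given `reach = {"a": {"a", "b"}, "b": {"b"}, "c": {"c", "d", "e"}}` and `spread = lambda seeds: len(set().union(*(reach[s] for s in seeds)))`, calling `greedy_select(["a", "b", "c"], 2, spread)` picks `"c"` first and then `"a"`, since `"a"` adds two new nodes while `"b"` adds one.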
9Effect of Splogs on Node Selection (indegree vs pagerank)
Almost 54% of the links were from splogs/failed to splogs/failed!
10Effect of Splogs on Inlinks
rank URL inlinks
1 http://www.livejournal.com/users/pics 3072
2 http://www.boingboing.net 2191
3 http://www.dailykos.com 2017
4 http://www.engadget.com 1942
5 http://profiles.blogdrive.com 1526
6 http://michellemalkin.com 1242
7 http://www.opinionjournal.com 1232
8 http://instapundit.com 1187
9 http://slashdot.org 1124
10 http://www.powerlineblog.com 909
11 http://www.huffingtonpost.com/theblog 905
12 http://corner.nationalreview.com 853
13 http://www.talkingpointsmemo.com 733
14 http://www.captainsquartersblog.com/mt 728
15 http://espn-presents2003-world-seriesofpoker.blogspot.com 711
16 http://3-world-series-of-poker-online-3.blogspot.com 711
17 http://worldseries-of-poker-network-tv-show.blogspot.com 711
18 http://wsop2003.blogspot.com 711
19 http://wsop-bracelet1.blogspot.com 711
20 http://worldseries-poker.blogspot.com 711
21 http://worldseries-of-poker-official.blogspot.com 711
22 http://worldseries-of-poker-wsop.blogspot.com 711
23 http://world-series-of-poker-nocd-patch66.blogspot.com 711
24 http://4-world-series-of-poker-past-winners.blogspot.com 711
25 http://7-wsop-games-7.blogspot.com 711
Tightly knit community of splogs
11Influence Models (without splog detection)
[Plot; axis label: Number of nodes selected]
12Influence Models (After splog removal)
13Influence Models (w.r.t. Technorati Ranks)
14Conclusions
- Influence models can be applied to blogs, not just cocitation graphs
- Splogs are a problem
- Greedy heuristics work well; PageRank is an inexpensive approximation
15Ideas for CIKM 06
- Good or bad influence? Associating sentiment with links.
- Finding influential blogs for a topic. (SVM accuracy 75-85%)
- Community structure of blogs.
16- Questions
- Comments/ Feedback?
- Thanks!
- Acknowledgement
- Buzzmetrics/Blogpulse for the dataset.