Title: Modeling Trust and Influence on Blogosphere using Link Polarity
1Modeling Trust and Influence on Blogosphere
using Link Polarity
- Anubhav Kale
- Masters Thesis, 2007
2Overview
- Motivation
- Problem Statement
- Approach
- Link Polarity
- Trust Propagation
- Experiments
- Future Work
- Q & A
4Social Media
- "Social media describes the online tools and platforms that people use to share opinions, insights, experiences, and perspectives" (Wikipedia)
- Level of user participation and thought sharing across varied topics
[Screenshot: Twitterment beta]
5Blogs: Essence of Social Media
- Blogs
- Means by which new ideas and information spread rapidly on social media
6Communities in Blogosphere
- Can you track the buzz about the iPod among bloggers?
- Which blogs always criticize the iPod, and which ones are iPod fans?
- Are there any neutral bloggers who would like to have the best of both worlds?
- Can you analyze the changes in opinions/biases?
- Are there any influential blogs in both communities?
- Can you find the right set of (like-minded) individuals to target?
8Problem Statement
- Convert a sparsely connected blog graph, without any knowledge of sentiments across blog-blog links, to a densely connected graph with a sentiment associated with every link.
- Sentiment represents the opinion/trust/distrust between the blogger nodes connected by the link.
- Use the densely connected polar graph to determine like-minded blogs
10Approach
- Identify the polarity of the link that points from one blog post to another
- Simple sentiment detection techniques
- Polarity may be positive, negative or neutral
- Use trust propagation models to spread the sentiment from the subset of connected blogs to all blogs
- Compute polarity from pre-defined influential blogs in each community to deduce like-minded blogs
- Validation with a hand-labeled dataset
11Birds Eye View Step 1
[Figure: blog graph; node labels A, B, C, D, E, F, foo]
12Birds Eye View Step 2
[Figure: same graph with link labels "cool!", "I like him", "What crap!", "He is great", "amazing!", "ridiculous"; legend: -ve bias, +ve bias]
13Birds Eye View Step 3
[Figure: same graph; legend: -ve bias, +ve bias]
14Birds Eye View Step 4
[Figure: same graph; legend: -ve bias, +ve bias]
15Birds Eye View Step 4
[Figure: same graph; legend: -ve bias, +ve bias]
17Link Polarity
- It's very generic!
- In co-authorship graphs, polarity may be defined as the number of times authors have collaborated
- On Amazon.com, polarity is the ranking scheme in the reviews
- How does it apply to blogs?
- Represents the opinion of the source blog about the destination blog
- Sign represents whether the bias is for, against or neutral
- Magnitude represents the strength or weakness of the bias
18How to compute polarity?
- Blogrolls
- Measure of association between blogs
- Indicates that the blogger is interested in following the blog
- May not indicate any bias
- Static nature: once created, never updated
[Screenshot: blogroll from dailykos]
19How to compute polarity?
- Comments
- Feedback on the complete blog post: granularity is coarse
- Verbose comments: a challenge for NLP
- Pull: source blog may not be associated with the comment author
- Tendency to comment anonymously on controversial topics
20How to compute polarity?
- Explicit Links
- Strongest evidence of interaction
- Text surrounding the link generally contains sentiments
- Shallow Natural Language Processing can help since the target text is highly focused.
22Our Approach to Link Polarity
- Sentiment Analysis
- Count the positively oriented words (Np) and negatively oriented words (Nn) in the text window around the link
- Apply stemming and basic canonicalization
- Corpus includes simple bi-grams of the form not_good
- Polarity = (Np - Nn) / (Np + Nn)
- Denominator acts as a normalization mechanism
- Natural Language Processing is shallow, yet large-scale effects help!
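The counting scheme on this slide can be sketched in a few lines of Python. This is only an illustration: the word lists, the tokenizer, the handling of "not X" bi-grams, and the choice to return 0 when no opinion words are found are all assumptions, since the slides do not give the actual corpus or preprocessing. Stemming is left out.

```python
import re

# Hypothetical mini-lexicons; the slides do not list the real corpus.
POSITIVE = {"great", "amazing", "cool", "like", "not_bad"}
NEGATIVE = {"crap", "ridiculous", "bad", "not_good"}


def tokenize(text):
    """Lowercase the text and fuse 'not X' into the bi-gram 'not_X'."""
    words = re.findall(r"[a-z']+", text.lower())
    tokens, i = [], 0
    while i < len(words):
        if words[i] == "not" and i + 1 < len(words):
            tokens.append("not_" + words[i + 1])
            i += 2
        else:
            tokens.append(words[i])
            i += 1
    return tokens


def polarity_from_counts(n_pos, n_neg):
    """Polarity = (Np - Nn) / (Np + Nn); 0.0 if no opinion words (assumption)."""
    total = n_pos + n_neg
    if total == 0:
        return 0.0
    return (n_pos - n_neg) / total


def link_polarity(window_text):
    """Polarity of the text window surrounding a link."""
    tokens = tokenize(window_text)
    n_pos = sum(t in POSITIVE for t in tokens)
    n_neg = sum(t in NEGATIVE for t in tokens)
    return polarity_from_counts(n_pos, n_neg)


print(link_polarity("He is great, amazing!"))   # all opinion words positive
print(link_polarity("What crap! Not good."))    # all opinion words negative
```

Because the denominator is the total number of opinion words, the result always lies between -1 and +1 regardless of how long the window is.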
24Link Polarity Example
- Stephen Colbert's performance at the White House Correspondents' Association dinner has garnered him huge applause in the blogosphere and also on C-Span where it was shown more than once. Those of us who have been angry with Bush for quite some time because of his arrogant and feckless corruption of our country were even more thrilled to see and know that he had no recourse but to sit there and watch his aspirations for greatness be destroyed by a master of irony.
- This [1] will be his legacy: "I stand by this man. I stand by this man because he stands for things. Not only for things, he stands on things. Things like aircraft carriers and rubble and recently flooded city squares. And that sends a strong message, that no matter what happens to America, she will always rebound -- with the most powerfully staged photo ops in the world." We who have been watching Stephen Colbert eviscerate politicians that have come on his show knew he was a gifted comedian. But it took Saturday's dinner to demonstrate how incredibly effective the art form Colbert has chosen is for exposing the Potemkin Regime Bush and his henchmen have created. Rove and the right wing machine have no answer to the performance but to say "it bombed", "it wasn't funny", and to hope that by ignoring it, the caustic cleansing agent it has lobbed into their camp can be contained. Yet, the Republican spinmeisters are the masters of spin. [2]
- Np = 8, Nn = 4, Polarity = (Np - Nn) / (Np + Nn) = 0.33
- [1] "This" links to http://dailykos.com/storyonly/2006/4/30/1441/59811
- [2] http://www.pacificviews.org/weblog/archives/001989.html
26Trust Propagation
- Based on the work of Guha et al. [1] for modeling propagation of trust and distrust
- Framework
- M_ij represents the bias from user i to user j (a value between 0 and 1)
- Belief matrix M represents the initial set of known beliefs
- M_ij can be based on the trust matrix (T), the distrust matrix (D), or a combination of trust and distrust (T - D) from i to j
- T = positive polarities and D = negative polarities
- Goal is to compute all unknown values in M
- Results from validations on a dataset from Epinions are impressive
- [1] Guha R, Kumar R, Raghavan P, Tomkins A. Propagation of trust and distrust. In Proceedings of the Thirteenth International World Wide Web Conference, New York, NY, USA, May 2004. ACM Press, 2004.
27Atomic Propagation
- Direct Propagation
- Given A trusts B and B trusts C
- Implies A trusts C
- Operator: M
- Co-citation
- Given A trusts B and C, and D trusts C
- Implies D trusts B
- Operator: M^T M
[Figure: direct propagation over nodes A, B, C; co-citation over nodes A, B, C, D]
28Atomic Propagation (contd.)
- Transpose Trust
- Given A trusts B and C trusts B
- Implies C trusts A, A trusts C
- Operator: M^T
- Trust Coupling
- Given D trusts A, A trusts C and B trusts C
- Implies D trusts B
- Operator: M M^T
[Figure: transpose trust over nodes A, B, C; trust coupling over nodes A, B, C, D]
29Atomic Propagation (contd.)
- Combined Operator
- C_i = a_1 M + a_2 M^T M + a_3 M^T + a_4 M M^T
- a = {0.4, 0.4, 0.1, 0.1} are the weighting factors
- Belief matrix after the ith atomic propagation
- M_{i+1} = M_i * C_i
- We perform propagations until convergence (until a new iteration does not change any value in M by more than a threshold)
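A minimal sketch of the combined operator and the iteration, in plain Python with no external libraries. It rests on assumptions not stated on the slide: that C_i is recomputed from the current belief matrix at each step, that M_i * C_i is a matrix product, and that convergence means the largest entry-wise change is at most a threshold. The function names, the threshold value and the step cap are hypothetical.

```python
def matmul(X, Y):
    """Matrix product of two square matrices stored as lists of lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(X):
    return [list(col) for col in zip(*X)]


def combined_operator(M, a=(0.4, 0.4, 0.1, 0.1)):
    """C = a1*M + a2*M^T M + a3*M^T + a4*M M^T (weights from the slide)."""
    Mt = transpose(M)
    terms = (M, matmul(Mt, M), Mt, matmul(M, Mt))
    n = len(M)
    return [[sum(w * t[i][j] for w, t in zip(a, terms)) for j in range(n)]
            for i in range(n)]


def propagate_once(M, a=(0.4, 0.4, 0.1, 0.1)):
    """One atomic propagation step: M_next = M * C (assumed matrix product)."""
    return matmul(M, combined_operator(M, a))


def propagate(M, a=(0.4, 0.4, 0.1, 0.1), threshold=1e-3, max_steps=20):
    """Repeat steps until the largest change is within the threshold."""
    steps = 0
    while steps < max_steps:
        nxt = propagate_once(M, a)
        change = max(abs(x - y)
                     for row_new, row_old in zip(nxt, M)
                     for x, y in zip(row_new, row_old))
        M, steps = nxt, steps + 1
        if change <= threshold:
            break
    return M, steps


# Toy graph: A trusts B, B trusts C (rows = source, columns = target).
M0 = [[0, 1, 0],
      [0, 0, 1],
      [0, 0, 0]]
M1 = propagate_once(M0)
print(M1[0][2])  # A -> C was 0 in M0 and is now 0.4 (direct propagation)
```

The step cap is there only as a safety net for this sketch; the slides describe iterating until the values stop changing.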
30Models to compute final belief matrix
- Trust-only
- Ignore distrust (negative polarities) completely
- Final belief matrix = M_K, with M_0 = T
- One-step Distrust
- Distrust propagates a single step while trust propagates repeatedly
- Final belief matrix = M_K * (T - D), with M_0 = T
- Propagated Distrust
- Treat distrust and trust as equivalent
- Final belief matrix = M_K, with M_0 = T - D
- (K = number of atomic propagations until convergence)
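The three models differ only in the starting matrix and in whether (T - D) is applied at the end, which a short Python sketch makes concrete. Several things here are assumptions for illustration: K is passed in as a fixed step count rather than found by a convergence test, the products are taken to be matrix products, and all function names are hypothetical.

```python
def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(X):
    return [list(col) for col in zip(*X)]


def subtract(X, Y):
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def step(M, a=(0.4, 0.4, 0.1, 0.1)):
    """One atomic propagation: M * (a1*M + a2*M^T M + a3*M^T + a4*M M^T)."""
    Mt = transpose(M)
    terms = (M, matmul(Mt, M), Mt, matmul(M, Mt))
    n = len(M)
    C = [[sum(w * t[i][j] for w, t in zip(a, terms)) for j in range(n)]
         for i in range(n)]
    return matmul(M, C)


def run(M0, K):
    """Apply K atomic propagations starting from M0 (K fixed by the caller)."""
    M = [row[:] for row in M0]
    for _ in range(K):
        M = step(M)
    return M


def trust_only(T, D, K):
    return run(T, K)                          # M_0 = T, distrust ignored


def one_step_distrust(T, D, K):
    return matmul(run(T, K), subtract(T, D))  # M_0 = T, then one (T - D) step


def propagated_distrust(T, D, K):
    return run(subtract(T, D), K)             # M_0 = T - D


# Toy data: A trusts B, B trusts C, and C distrusts A.
T = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
D = [[0, 0, 0], [0, 0, 0], [1, 0, 0]]
print(trust_only(T, D, 1)[0][2])         # positive: A comes to trust C
print(one_step_distrust(T, D, 0)[1][0])  # negative: B inherits distrust of A
```

With K = 0 the one-step model reduces to T * (T - D), so B, who trusts C, picks up C's distrust of A as a negative entry.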
33Experiments
- Domain
- Political Blogosphere
- Dataset from Buzzmetrics [2] provides post-post link structure over 14 million posts
- Few off-the-topic posts help aggregation
- Potential business value
- Reference Dataset
- Hand-labeled dataset from Lada Adamic et al. [3] classifying political blogs into right and left leaning bloggers
- Timeframe: 2004 presidential elections, over 1500 blogs analyzed
- Overlap of 300 blogs between Buzzmetrics and the reference dataset
- Goal
- Classify the blogs in the Buzzmetrics dataset as Democrat or Republican and compare with the reference dataset
- [2] Buzzmetrics, www.buzzmetrics.com
- [3] Lada A. Adamic and Natalie Glance, "The political blogosphere and the 2004 US Election", in Proceedings of the WWW-2005 Workshop
34Effect of Link Polarity
- Republican blogs classified more correctly than Democrat blogs
- Trust propagation on polar links is more effective than over non-polar links
- Link polarity improves classification by approximately 26%
35Effect of text window size
- Optimal window size is 750 characters for our experiments
- Small window size: non-opinionated phrases
- Large window size: analysis of non-related text
- Specific to our experiments; numbers may not generalize
36Effect of atomic propagation parameters
- X-axis: bitset over direct trust, co-citation, transpose trust and trust coupling (0001 - 1111)
- Each parameter set to either 0 or its optimal value
- Collective influence of all parameters helps!
37Evaluation Metrics
Confusion Matrix
How did I compute the numbers ?
38Evaluation Metrics (continued)
- Accuracy: 73%
- True Positive Rate (Recall): 78%
- False Positive Rate (FP): 31%
- True Negative Rate (Recall): 69%
- False Negative Rate (FN): 21%
- Precision (Positive): 75%
- Precision (Negative): 72%
- (Positive = Republican, Negative = Democrat)
http://www2.cs.uregina.ca/dbd/cs831/notes/confusion_matrix/confusion_matrix.html
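As a reminder of how these rates follow from a two-class confusion matrix, here is a short Python sketch using the standard definitions. The counts in the example are made-up toy values and are not the counts from the experiments, which the slides do not list; the function name is hypothetical.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Standard two-class metrics from confusion-matrix counts.

    tp/fn: positive (here Republican) blogs labeled right/wrong.
    tn/fp: negative (here Democrat) blogs labeled right/wrong.
    """
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "true_positive_rate": tp / (tp + fn),
        "false_positive_rate": fp / (fp + tn),
        "true_negative_rate": tn / (tn + fp),
        "false_negative_rate": fn / (fn + tp),
        "precision_positive": tp / (tp + fp),
        "precision_negative": tn / (tn + fn),
    }


# Toy counts for illustration only (NOT the thesis data).
metrics = confusion_metrics(tp=8, fp=3, tn=7, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

By construction the true positive and false negative rates sum to 1, as do the true negative and false positive rates.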
39Sample Data
- Trust propagation compensates for initial incorrect polarity (DK-AT)
- Trust propagation does not change correct polarity (AT-DK)
- Trust propagation assigns correct polarity for non-existent direct links (AT-IP)
- Numbers in italics are problematic (MM-AT)
- Improve sentiment detection?
40Main Stream Media Classification
- Goal
- Classify mainstream media news sources (e.g. guardian, foxnews, truthout) as left or right leaning
- Use links from blog posts to media sources (drop blog-blog links)
- Graph Structure
[Figure: graph of blogs and MSM sources; node labels a, b, c, d, P, Q, R]
41MSM Classification Results
42Interesting Observations
- 24 out of 27 sources classified correctly
- Well-known sources like guardian, foxnews, truthout and mediamatters classified correctly
- Main outliers: thenation and boston globe
- google news classified as left leaning
- Both left and right leaning blogs talk negatively about nytimes and abcnews and positively about rawstory and examiner
44Future Work
- Link Polarity
- More sophisticated NLP techniques
- Topic as a parameter
- Trust Propagation
- Evaluate other models
- Augment trust model with data from other domains (communities in MySpace etc.)
- Experiments
- Evaluations on larger heterogeneous datasets
- Domains with noisy data and multi-subject posts
45Thank You !!