Title: Analyzing the Political Blogosphere
1. Analyzing the Political Blogosphere
19 March 2007
2. Social Media
- "Social media describes the online tools and
  platforms that people use to share opinions,
  insights, experiences, and perspectives"
  (Wikipedia)
- Level of user participation and thought sharing
  across varied topics
3. Blogs: Essence of Social Media
Blogs spread new ideas and information rapidly
4. Knowing and Influencing Your Audience
- Your goal is to campaign for a presidential
  candidate
- How can you track the buzz about him/her?
- What are the relevant communities and blogs?
- Which communities are supporters, which are
  skeptical, which are put off by the hype?
- Is your campaign having an effect? The desired
  effect?
- Which bloggers are influential with a political
  audience? Of these, which are already on board and
  which are lost causes?
- Whom should you send details to or talk to?
5. Influence Detection
- Voters are often influenced by opinions and
  reviews on blogs
- Detecting influential nodes and their role in how
  people perceive a political party could be an
  important tool during campaigning
- Using topic, social structure, opinions, biases,
  and temporal information, we can develop an
  accurate model for influence
6. Influence in Communities
http://michellemalkin.com/
http://instapundit.com
http://dailykos.com
http://volokh.com
http://crooksandliars.com
http://rightwingnews.com
Communities detected using "Fast algorithm for
detecting community structure in networks",
M. E. J. Newman
7. Influence of Mainstream Media (MSM)
- Citation count alone is not an indicator of
  influence; who cites is also a factor.
Using a list of 130 Democratic and 140 Republican blogs
8. Computing Influence of MSM
- For Democratic citations:
  $$\mathrm{Score}(i) = P_d(i) \cdot \log\frac{P_d(i)}{P_r(i)} \cdot N_d(i)$$
- where
  - $i$ is the MSM source
  - $P_d(i)$ is the probability that a Democratic blog links
    to MSM source $i$
  - $P_r(i)$ is the probability that a Republican blog links
    to MSM source $i$
  - $N_d(i)$ is the number of distinct Democratic blogs
    linking to $i$
- Similar ranking for Republican blogs
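The score above can be sketched in a few lines of Python. This is only an illustration: the input layout (a dict from blog name to the set of MSM sources it links to) and the add-one `smoothing` term, which keeps the logarithm defined when one side never links to a source, are assumptions and do not come from the slides.

```python
import math

def msm_scores(dem_links, rep_links, smoothing=1.0):
    """Score(i) = Pd(i) * log(Pd(i) / Pr(i)) * Nd(i) for every MSM source i.

    dem_links / rep_links: dict mapping a blog to the set of MSM sources
    it links to (hypothetical input layout).
    smoothing: assumed additive smoothing so Pd and Pr are never zero.
    """
    n_dem, n_rep = len(dem_links), len(rep_links)
    sources = set().union(*dem_links.values(), *rep_links.values())
    scores = {}
    for src in sources:
        # Nd(i), Nr(i): number of distinct blogs on each side linking to src
        nd = sum(1 for linked in dem_links.values() if src in linked)
        nr = sum(1 for linked in rep_links.values() if src in linked)
        # Smoothed link probabilities Pd(i), Pr(i)
        pd = (nd + smoothing) / (n_dem + 2 * smoothing)
        pr = (nr + smoothing) / (n_rep + 2 * smoothing)
        scores[src] = pd * math.log(pd / pr) * nd
    return scores
```

Swapping the two arguments gives the corresponding ranking for Republican blogs.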
9. Opinions in Social Media
Reader's perspective: "Starbucks sandwiches are bad!"
- TREC 2006: finding opinionated posts, either
  positive or negative, about a query
- 2006 TREC Blog corpus
  - 80K blogs
  - 300K posts
  - 50 test queries
- Challenges: open domain sentiment words, slang,
  subject
- "I went to school early so I would have time to
  grab some lunch. Which ended up consisting of a
  crappy sandwich from starbucks and a chai latte.
  Lacey came into Starbucks while I was there so we
  chatted for a little bit and she thought that I
  might be in her class. After I finished eating I
  headed to school and checked the board.." [1]
[Annotations on the excerpt: Narrative; Expressed Opinions]
Opinions can affect the buying decisions of customers
[1] http://annamay13x.livejournal.com/7061.html
10. Finding Feeds That Matter
Analysis of Bloglines feeds:
- 83K publicly listed subscribers
- 2.8M feeds, of which 500K are unique
- 26K users (35%) use folders to organize
  subscriptions
- Data collected in May 2006
[Figure panels: Before Merge; After Merge]
http://ftm.umbc.edu
11. Finding Feeds That Matter
- Top feeds for Politics (merging "political",
  "political blogs")
  - Talking Points Memo by Joshua Micah Marshall
  - Daily Kos: State of the Nation
  - Eschaton
  - The Washington Monthly
  - Wonkette, Politics for People with Dirty Minds
  - http://instapundit.com/
  - Informed Comment
  - Power Line
  - AMERICAblog: Because a great nation deserves the
    truth
  - Crooks and Liars
12. Finding Feeds That Matter
- Tag-based feed recommender: feeds under similar
  folder names
- http://www.dailykos.com
- Recommended feeds
  - http://www.andrewsullivan.com/index.php
  - http://www.talkingpointsmemo.com/
  - http://atrios.blogspot.com
  - http://jameswolcott.com/
  - http://mediamatters.org/
  - http://yglesias.typepad.com/matthew/
  - http://billmon.org/
  - http://digbysblog.blogspot.com
  - http://instapundit.com/
  - http://www.washingtonmonthly.com/
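One way to read "feeds under similar folder names" is to describe each feed by the folder names subscribers file it under and to rank other feeds by cosine similarity to a seed feed. The sketch below follows that reading. The `(user, folder, feed)` input tuples, the lowercase normalization, and the choice of cosine similarity are assumptions made for illustration, not details taken from the slides.

```python
import math
from collections import Counter, defaultdict

def folder_profiles(subscriptions):
    """Build, for every feed, a bag of the folder names it was filed under.

    subscriptions: iterable of (user, folder_name, feed) tuples
    (hypothetical layout of the subscription data).
    """
    profiles = defaultdict(Counter)
    for _user, folder, feed in subscriptions:
        profiles[feed][folder.strip().lower()] += 1
    return profiles

def cosine(a, b):
    """Cosine similarity between two Counter-based folder profiles."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def recommend(seed, subscriptions, top_n=10):
    """Return up to top_n feeds whose folder profile resembles the seed's."""
    profiles = folder_profiles(subscriptions)
    if seed not in profiles:
        return []
    seed_profile = profiles[seed]
    scored = [(cosine(seed_profile, profile), feed)
              for feed, profile in profiles.items() if feed != seed]
    # Keep only feeds that share at least one folder name with the seed
    scored = [(score, feed) for score, feed in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [feed for _score, feed in scored[:top_n]]
```

Merging near-duplicate folder names (for example "political" and "political blogs") before building the profiles would make the folder vocabulary less sparse.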
13. Finding Influential Feeds Using Co-Citations
www.dailykos.com
Feed recommendations
Blogs influenced by seed set
Leading blogs about politics. The seed set is the top
politics blogs from Bloglines; the blog graph
used is from the BlogPulse dataset.
14. Link Polarity / Bias
- Linking alone is not an indicator of influence
- Polarity can indicate the type of influence
- Consistent negative / positive opinion over a
  period of time can indicate bias
- Link polarity/citation signal can also be helpful
  in determining trust
[Figure labels: Strong negative opinion; Mildly negative opinion; Republican blog; Democrat blog]
15. Modeling Influence Using Link Polarity
- Motivation
  - Growing interest in exploring the role of communities
    in social media
  - Better community detection algorithms using
    sentiment associated with links
  - Convert a sparsely connected blog graph into a
    densely connected one with a sentiment weight
    attached to every link
- Approach
  - Link polarity: analyze post text surrounding
    links to determine the bias of bloggers about each
    other
  - Trust propagation: use trust propagation models
    to spread the polarity from a small subset of
    connected bloggers to all bloggers
- Experiments
  - Study the political blogosphere with the goal of
    classifying blogs as left- or right-leaning
  - Bias detection using positive/neutral/negative
    scores from influential bloggers (high in-link
    blogs) in both communities
  - Validation with a hand-labeled dataset indicates
    60% correct classification
16. Bird's Eye View: Step 1
[Figure: graph with nodes A, B, C, D, E, F, and "foo"]
17. Bird's Eye View: Step 2
[Figure: graph with nodes A, B, C, D, E, F, and "foo"; text labels "He is great", "What crap!", "I like him", "ridiculous"; legend: -ve bias, +ve bias]
18. Bird's Eye View: Step 3
[Figure: graph with nodes A, B, C, D, E, F, and "foo"; legend: -ve bias, +ve bias]
19. Bird's Eye View: Step 4
[Figure: graph with nodes A, B, C, D, E, F, and "foo"; legend: -ve bias, +ve bias]
20. Bird's Eye View: Step 4
[Figure: graph with nodes A, B, C, D, E, F, and "foo"; legend: -ve bias, +ve bias]
21. Link Polarity Example
- Stephen Colbert's performance at the White House
Correspondents' Association dinner has garnered
him huge applause in the blogosphere and also on
C-Span where it was shown more than once. Those
of us who have been angry with Bush for quite
some time because of his arrogant and feckless
corruption of our country were even more thrilled
to see and know that he had no recourse but to
sit there and watch his aspirations for greatness
be destroyed by a master of irony.
- This will be his legacy: I stand by this
man. I stand by this man because he stands for
things. Not only for things, he stands on things.
Things like aircraft carriers and rubble and
recently flooded city squares. And that sends a
strong message, that no matter what happens to
America, she will always rebound -- with the most
powerfully staged photo ops in the world. We who
have been watching Stephen Colbert eviscerate
politicians that have come on his show knew he
was a gifted comedian. But it took Saturday's
dinner to demonstrate how incredibly effective
the art form Colbert has chosen is for exposing
the Potemkin Regime Bush and his henchmen have
created. Rove and the right wing machine have no
answer to the performance but to say "it bombed",
"it wasn't funny", and to hope that by ignoring
it, the caustic cleansing agent it has lobbed
into their camp can be contained. Yet, the
Republican spinmeisters are the masters of
spin. [2]
- Link "This": http://dailykos.com/storyonly/2006/4/30/1441/59811
- Np = 8, Nn = 4, Polarity = 0.33
- [2] http://www.pacificviews.org/weblog/archives/001989.html
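The numbers in the example (Np = 8, Nn = 4, polarity 0.33) are consistent with polarity = (Np - Nn) / (Np + Nn), where Np and Nn count positive and negative sentiment words near the link. The sketch below assumes that formula. The two word lists are tiny hypothetical lexicons used only for illustration and are not the ones used in the experiments.

```python
import re

# Hypothetical toy lexicons; a real system would use a full sentiment lexicon.
POSITIVE = {"great", "gifted", "effective", "thrilled", "applause"}
NEGATIVE = {"arrogant", "feckless", "corruption", "crap", "ridiculous"}

def polarity(n_pos, n_neg):
    """Assumed formula: (Np - Nn) / (Np + Nn), in [-1, 1]; 0 if no sentiment words."""
    total = n_pos + n_neg
    return (n_pos - n_neg) / total if total else 0.0

def count_sentiment(text, positive=POSITIVE, negative=NEGATIVE):
    """Count positive and negative lexicon words in the text around a link."""
    words = re.findall(r"[a-z']+", text.lower())
    n_pos = sum(1 for word in words if word in positive)
    n_neg = sum(1 for word in words if word in negative)
    return n_pos, n_neg
```

With the counts from the slide, `polarity(8, 4)` evaluates to 4/12, which rounds to the 0.33 shown.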
22. Trust Propagation
- Based on Guha's work on propagating trust and
  distrust [1]
- Mij represents bias from user i to j (0 < Mij <
  1)
- Belief matrix M represents the initial set of
  known beliefs
- Mij can be based on the trust matrix (T), the distrust
  matrix (D), or a combination of trust and distrust
  (T-D) from i to j
- T: positive polarities; D: negative
  polarities
- Goal is to compute all unknown values in M
- Results from validations on a dataset from
  Epinions are impressive
- [1] Guha R, Kumar R, Raghavan P, Tomkins A.
  Propagation of trust and distrust. In Proc. 13th
  Int. World Wide Web Conf., New York, NY, USA, May
  2004. ACM Press, 2004.
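A rough sketch of the propagation idea, loosely following the atomic propagations described by Guha et al. (direct propagation, co-citation, transpose trust, trust coupling), is given below. It is not the implementation used in this work: the `alphas` weights, the use of T - D as the belief matrix, and the plain repeated multiplication are all assumptions. The default of 20 steps mirrors the convergence figure reported on the "Effect of Link Polarity" slide.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]

def transpose(a):
    return [list(column) for column in zip(*a)]

def weighted_sum(matrices, weights):
    """Element-wise weighted sum of equally sized square matrices."""
    n = len(matrices[0])
    return [[sum(w * m[i][j] for m, w in zip(matrices, weights))
             for j in range(n)] for i in range(n)]

def propagate(trust, distrust, alphas=(0.4, 0.4, 0.1, 0.1), steps=20):
    """Spread known polarities to unknown pairs by repeated propagation.

    trust, distrust: square matrices T and D of known polarities.
    alphas: assumed weights for direct propagation, co-citation,
            transpose trust, and trust coupling.
    """
    n = len(trust)
    belief = [[trust[i][j] - distrust[i][j] for j in range(n)] for i in range(n)]
    belief_t = transpose(belief)
    operator = weighted_sum(
        [belief, matmul(belief_t, belief), belief_t, matmul(belief, belief_t)],
        alphas,
    )
    result = operator  # beliefs after one propagation step
    for _ in range(steps - 1):
        result = matmul(result, operator)
    return result
```

With only direct propagation enabled (`alphas=(1, 0, 0, 0)`), two steps infer a belief from A to C whenever A has a known belief about B and B about C, with the sign following the product of the two polarities.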
23. Experiments
- Domain
  - Political blogosphere
  - Dataset from Buzzmetrics [2] provides post-to-post
    links over 1.5M posts
  - Few off-topic posts, which helps aggregation
  - Potential business value
- Reference dataset
  - Adamic's [3] hand-labeled dataset classifies
    blogs as right- or left-leaning
  - Timeframe: 2004 presidential elections, over 1500
    blogs analyzed
  - Overlap of 300 blogs between Buzzmetrics and
    the reference dataset
- Goal
  - Classify the blogs in the Buzzmetrics dataset as
    Democratic or Republican and compare with the reference
    dataset
- [2] Buzzmetrics, www.buzzmetrics.com
- [3] Lada A. Adamic and Natalie Glance, "The
  political blogosphere and the 2004 US Election",
  in Proceedings of the WWW-2005 Workshop, May
  2005.
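The comparison against the reference dataset could be sketched as follows. The decision rule (label a blog by whichever seed community gives it the larger summed polarity) and the `scores` layout keyed by `(source, target)` pairs are assumptions for illustration; the slides state only that scores from influential bloggers in both communities are used.

```python
def classify(blog, scores, dem_seeds, rep_seeds):
    """Label a blog "left" or "right" from the polarity seed blogs hold toward it.

    scores: dict mapping (source_blog, target_blog) to a polarity score
            (hypothetical layout of the propagated polarities).
    dem_seeds / rep_seeds: influential blogs of each community.
    Returns None when the two sides tie.
    """
    dem = sum(scores.get((seed, blog), 0.0) for seed in dem_seeds)
    rep = sum(scores.get((seed, blog), 0.0) for seed in rep_seeds)
    if dem == rep:
        return None
    return "left" if dem > rep else "right"

def accuracy(predicted, reference):
    """Fraction of blogs labeled in both datasets whose labels agree."""
    overlap = [blog for blog, label in predicted.items()
               if label is not None and blog in reference]
    if not overlap:
        return 0.0
    correct = sum(1 for blog in overlap if predicted[blog] == reference[blog])
    return correct / len(overlap)
```

Only blogs present in both datasets enter the accuracy, which matches the idea of evaluating on the overlap between the Buzzmetrics data and the hand-labeled reference set.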
24. Effect of Link Polarity
Trust propagation on polar links is more effective
than on non-polar links.
Republican blogs are classified more correctly
Link polarity yields a 30% classification
improvement
Convergence after 20 propagations
25. Effect of Text Window Size
- Optimal window size is 750 characters for our
  experiments
- Small window size: non-opinionated phrases
- Large window size: analysis of non-related text
- Specific to our experiments; numbers may not
  generalize
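Extracting the text window around each link might look like the sketch below. It assumes the posts are available as HTML and interprets the window as a number of characters taken on each side of the link; the slides do not say whether 750 characters is a total or a per-side figure. The regular expressions are a simplification and not a full HTML parser.

```python
import re

# Simplified patterns: anchors with a double-quoted href, and any HTML tag.
LINK_RE = re.compile(r'<a\s[^>]*href="([^"]+)"[^>]*>(.*?)</a>',
                     re.IGNORECASE | re.DOTALL)
TAG_RE = re.compile(r"<[^>]+>")

def strip_tags(html):
    return TAG_RE.sub("", html)

def link_windows(post_html, window=750):
    """Return (url, text) pairs, where text is the anchor text plus up to
    `window` characters of tag-free text on each side of the link."""
    results = []
    for match in LINK_RE.finditer(post_html):
        before = strip_tags(post_html[:match.start()])
        after = strip_tags(post_html[match.end():])
        anchor = strip_tags(match.group(2))
        left = before[max(0, len(before) - window):]
        right = after[:window]
        results.append((match.group(1), left + anchor + right))
    return results
```

A small `window` keeps only the words closest to the link and may miss the opinion, while a large one pulls in text about unrelated links, which is the trade-off the slide describes.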
26. Sample Data
- Trust propagation compensates for initial
  incorrect polarity (DK-AT)
- Trust propagation doesn't change correct polarity
  (AT-DK)
- Trust propagation assigns correct polarity for
  non-existent links (AT-IP)
- Numbers in italics are problematic (AT-MM)
  - Make polarities below a threshold zero?
  - Improve sentiment detection?
27. Future Work
- Bias and trustworthiness of MSM sources
- Trend extraction and meme tracking for political
  blogs
- Real-time classification of positive and negative
  opinions for presidential candidates
- Determining genres in political opinions using
  content analysis