Title: Computational Extraction of Social and Interactional Meaning from Speech
1Computational Extraction of Social and Interactional Meaning from Speech
Dan Jurafsky and Mari Ostendorf
Lecture 5: Agreement, Citation, Propositional Attitude
Mari Ostendorf
2Agreement, Citation, Propositional Attitude
- Agreement vs. disagreement with propositions (and people)
- How to make friends & influence people
- Tool for affiliation, indicator of influence
- Tool for distancing, indicator of factions or rifts in groups
- Important component of group problem solving
3Speech Examples Revisited
A: This's probably what the LDC uses. I mean they do a lot of transcription at the LDC.
B: OK.
A: I could ask my contacts at the LDC what it is they actually use.
B: Oh! Good idea, great idea.

A: After all these things, he raises hundreds of millions of dollars. I mean uh the fella
B: but he never stops talking about it.
A: but ok
B: Aren't you supposed to y- I mean
A: well that's a little- the Lord says
B: Does charity mean something if you're constantly using it as a cudgel to beat your enemies over the- I'm better than you. I give money to charity.
A: Well look, now I
4Subgroups Example Wikipedia Talk Page
- By including the "Haditha Massacre" in the Human
Rights Abuse section, we are effectively
convicting the Marines that are currently on
trial. I think we need to wait until the trial is
over. UnregisteredUser1
- Disagree. All I see is the listing "Haditha
killings (Under investigation)." Is the word
Massacre used? If not, I believe it should be
because this word fits every version of the story
presented in the public, including Time, the US
Marines, and the Iraqi Government.
RegisteredUser1
- I agree with RegisteredUser1, this is about
(current) history, not law. Just because
something hasn't been decided by a court doesn't
mean it didn't happen. It should be enough in the
article to just mention that the marines
charged/suspected of the massacre have not yet
been convicted. RegisteredUser2
- I disagree, you cannot call it a human rights
violation if its not stated what happened there.
Also your statement "have not yet been convicted"
is kind of the thing we are attempting to avoid.
Without guilt or a better understanding of the
situation I think its premature to put it in the
human rights violation section. RegisteredUser3
- Actually, as long as NPOV, WP:Verifiability are
maintained you can call it a human rights
violation even if it is untrue. As Wikipedia says
"As counterintuitive as it may seem, the
threshold for inclusion in Wikipedia is
verifiability, not truth." Like it or not, as
long as there are reputable sources calling it a
massacre and/or a human rights violation then it
can be included in the article. RegisteredUser4
- Calling it a human rights violation in itself is
POV. I also do not think anyone would appreciate
you attempting to manipulate wiki policy for the
sake of adding POV into an article.
RegisteredUser3
5Influencing Example
There is a guideline that we shouldn't
semi-protect articles linked from front page, so
as to allow new editors a chance to edit articles
they are most likely to read. But in this case
all we are doing is enabling a swarm of socks.
Semi-protection is definitely needed in this
instance, with an apology should a new,
well-intentioned editor actually show up amidst
the swarm and be prevented from editing.
Semi-protect this sucker, or we'll never
determine the appropriate course of action for
this article. RegUser2
Even though
semi-protection is defidentally good for what is
nominally "my" side it's against policy and not
appropriate. Please take it off. RegUser3
Is is absolutely not against policy.
Wikipedia:Protection policy is very clear: For
this article at this time, it's necessary. That's
in perfect compliance with policy. RegUser2
Removing the image without discussion
is aggressively bad editing (which I am often
guilty of). It's not vandalism. sprotect is only
for vandalism. RegUser3
Repeated violations of 3RR and using sockpuppets,
together with admitting that the purpose of
removing the image is to curry favour with one's
god and not to improve Wikipedia, doesn't so much
cross the line from bad editing to vandalism as
pole vault it. RegUser4
Ok, my WP:AGF is falling. I still think sprotect
is agressive, but not as badly as I did before.
RegUser3
Influenced participant: alignment change
6Online Political Discussion Forum
- Q: Gavin Newsom- I expected more from him when I
supported him in the 2003 election. He showed
himself as a family-man/Catholic, but he ended up
being the exact oppisate, supporting abortion,
and giving homosexuals marriage licenses. I love
San Francisco, but I hate the people. Sometimes,
the people make me want to move to Sacramento or
DC to fix things up.
- R: And what is wrong with giving homosexuals the
right to settle down with the person they love?
What is it to you if a few limp-wrists get
married in San Francisco? Homosexuals are people,
too, who take out their garbage, pay their taxes,
go to work, take care of their dogs, and what
they do in their bedroom is none of your business.
7Citations (from Teufel et al., 2006)
- Following Pereira et al. (1993), we measure word
similarity by the relative entropy or
Kullback-Leibler (KL) distance, between the
corresponding conditional distributions.
- His [Hindle's] notion of similarity seems to
agree with our intuitions in many cases, but it
is not clear how it can be used directly to
construct word classes and corresponding models
of association.
8Overview
- Common threads
- Examples
- Agreements & disagreements in meetings
- Agreements & disagreements in online discussions
- Citation function
- More common threads
(Plus examples from unpublished UW studies on
Wikipedia discussions.)
9Overview
- Common threads
- Examples
- Agreements & disagreements in meetings
- Agreements & disagreements in online discussions
- Citation function
- More common threads
10Common Threads
- Sentiment detection (sort of)
- Discussions: agreement/disagreement/neutral
- Citations: positive/negative/neutral (opt. contrast)
- Most studies detect person/paper as target, not the proposition per se
- Challenges
- Cultural bias: infrequent negatives
- Bag of words is not enough
- Identifying person/paper target of agreement (context can extend beyond the sentiment sentence)
- Computational modeling
11Challenge Cultural Bias
- English meetings: many more agreements than disagreements
- Mandarin wiki discussions: fewer explicit disagreements than in English
- Citations: several studies find that negative citations are rare (presumably because they are politically dangerous)
- People use positive words to soften the blow
- "right but...", "yeah" with negative intonation
12Challenge Polarity Words in BOW
- Need to account for negation
- "agree" vs. "don't agree", "absolutely" vs. "absolutely not"
- BUT fewer than half the positive words in negative turns are lexically negated
- Some part-of-speech issues, e.g. "well"
- People include positive words to soften the blow
- Dissenting turns have more positive words than negative
- "right" occurs 75 times in dissenting turns, 162 times in neutral turns, and only 33 times in supporting turns
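The negation point above can be sketched in a few lines of Python. This is a minimal illustration, not the method of any study cited here: the lexicons `POSITIVE`, `NEGATIVE` and `NEGATORS` are hypothetical toy lists, and the one-token negation window is an assumption.

```python
import re

# Hypothetical toy lexicons; a real system would use a full polarity lexicon.
POSITIVE = {"agree", "right", "yeah", "absolutely", "good", "great"}
NEGATIVE = {"wrong", "bad", "disagree"}
NEGATORS = {"not", "no", "never", "don't", "can't", "doesn't"}


def polarity_counts(turn):
    """Count polarity words, flipping a word when a negator is adjacent to it."""
    tokens = re.findall(r"[a-z']+", turn.lower())
    counts = {"pos": 0, "neg": 0}
    for i, tok in enumerate(tokens):
        if tok not in POSITIVE and tok not in NEGATIVE:
            continue
        prev_tok = tokens[i - 1] if i > 0 else ""
        next_tok = tokens[i + 1] if i + 1 < len(tokens) else ""
        # Covers "don't agree" (negator before) and "absolutely not" (negator after).
        negated = prev_tok in NEGATORS or next_tok in NEGATORS
        is_pos = (tok in POSITIVE) != negated  # negation flips the polarity
        counts["pos" if is_pos else "neg"] += 1
    return counts


print(polarity_counts("I don't agree"))
print(polarity_counts("right but you can't say that"))
```

The second call still counts "right" as positive even though the turn dissents, which is the limitation the slide points out: most positive words in negative turns are not lexically negated.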
13Polarity Word Trickiness (cont.)
- Positive negatives
- yeah larry i i want to correct something randi said of course
- right but but you you can't say that punching him in the back of the head is justified
- Negative positives
- Steph- vent away that sucks
- no you stick with what you're doing
14Challenge Identifying the Target
- Baseline: The target is the most recent speaker
- 67% accurate for Wiki discussions
- 80% accurate for meetings
- Adding names doesn't help much (70% accurate for Wiki discussions)
- Target can be more than one person
- In political discussion forum (Abbott et al. 2011), 82% of posts with quotes have quotes that can be linked to previous post
- Citation information often not in the same sentence as the citation (Teufel et al. 2006).
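The most-recent-speaker baseline can be written as a short Python sketch. The function name `most_recent_speaker` and the `(speaker, text)` turn format are assumptions for illustration, as is the choice to skip earlier turns by the same speaker.

```python
from typing import List, Optional, Tuple


def most_recent_speaker(turns: List[Tuple[str, str]], index: int) -> Optional[str]:
    """Return the closest earlier speaker who differs from the speaker of turns[index]."""
    speaker = turns[index][0]
    for prev_speaker, _ in reversed(turns[:index]):
        if prev_speaker != speaker:  # assumption: a speaker does not target themselves
            return prev_speaker
    return None  # no earlier speaker to point at


turns = [
    ("PubCoord", "Are we agreed on about 60 for soda?"),
    ("Acct", "yeah, only ourselves are set apart, I think"),
    ("Secty", "They can't take a bottle."),
    ("Secty", "Okay, I agree on 60 for soda"),
]
print(most_recent_speaker(turns, 3))
```

Here the baseline picks Acct, although the agreement answers PubCoord's question, which shows how interleaved turns produce the errors behind the accuracy figures above.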
15Chat complication of asynchrony
PubCoord: Are we agreed on about 60 for soda?
Acct: yeah, only ourselves are set apart, I think
Secty: They can't take a bottle.
Secty: Okay, I agree on 60 for soda
PubCoord: Vote
PubCoord: agreed
ProjMgr: Yeah, agree
Secty: How much does ice cost?
PubCoord: 2.50 per pack
Acct: how about 50, because project manager won't drink that much soda
PubCoord: probably
PubCoord: What is he a camel?
Acct: and some folks won't drink any?
Secty: lol
Acct: no, some people dont like flavor, carbonation
ProjMgr: Shut up! Soda can be harsh
Acct: or, OMG calories
Secty: please stay on topic
Acct: yeah, i dont like the carbonation
PubCoord: Alright, I've identified two of you
ProjMgr: I was just going to say that...
Acct: me too!
Secty: so was that 50 for ice?
Acct: actually, I guess I know who everyone is then
PubCoord: What?
Acct: no, 50 for pop
Secty: oh
PubCoord: No, 50 for soda is fine I guess
Secty: please vote between 50 or 60
PubCoord: I think maybe 10 for ice
ProjMgr: Yeah /
Acct: and someone already volunteered their cooler?
PubCoord: Yessir
Secty: please vote between 50 or 60 for soda
Secty: I vote 60
PubCoord: 60
ProjMgr: 50
Acct: i vote 50
ProjMgr: TIE!
PubCoord: then?
Secty: 50 it is
Acct: g d it
Acct: yeah, 55
Secty: okay, 55
Secty: so how much is left, accountant?
16Computational Modeling -- Review
- Standard text classification problem
- Extract feature vector → apply model → score classes
- Choose class with best score
- Popular models
- Naïve Bayes
- Decision trees/forests vs. boostexter/icsiboost
- Maximum entropy
- SVMs
- K-nearest neighbor (lazy learning or memory-based)
- Feature selection or regularization
- Evaluation
- Classification accuracy or Macro F (mean of F measures)
New since Lec 5
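To make the evaluation bullet concrete, here is a minimal Python sketch of macro F as the unweighted mean of per-class F measures. The function names and the toy labels are illustrative assumptions; the class skew loosely mimics the mostly-neutral data discussed in this lecture.

```python
def f_measure(gold, pred, label):
    """F measure (harmonic mean of precision and recall) for one class."""
    tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
    fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
    fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f(gold, pred):
    """Unweighted mean of the per-class F measures over the gold classes."""
    labels = sorted(set(gold))
    return sum(f_measure(gold, pred, label) for label in labels) / len(labels)


# A classifier that always answers "other" on skewed data.
gold = ["other"] * 8 + ["agree", "disagree"]
pred = ["other"] * 10
accuracy = sum(g == p for g, p in zip(gold, pred)) / len(gold)
print(accuracy, macro_f(gold, pred))
```

Accuracy is 0.8 here while macro F is below 0.3, because the two minority classes each contribute an F of zero.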
17Feature Extraction Noise Issue
- Both speech and text have noise challenges
- Speech: speech recognition errors (especially when there is overlapping speech)
- Online discussions: typos and funny spellings
- "defidentally good"
- "the exact oppisate"
- Not a big issue for edited text (e.g. most articles that would have citations)
18Challenge Skewed Priors
- Large percentage of sentences are neutral; standard training algorithms emphasize the frequent classes
- Some solutions
- Use development set to tune detection thresholds
- Random sampling using biased priors and bagging (classifier combination)
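The first solution, tuning a detection threshold on a development set, might look like the following Python sketch. It assumes a classifier that outputs a score for the rare class (for example "disagree") and picks the threshold by F measure; the function name, the candidate grid and the toy scores are all hypothetical.

```python
def tune_threshold(scores, gold, candidates):
    """Pick the candidate threshold with the best F measure on the dev set.

    scores: classifier scores for the rare class, one per dev example
    gold:   True where the dev example really belongs to the rare class
    """
    best_threshold, best_f = None, -1.0
    for t in candidates:
        pred = [s >= t for s in scores]
        tp = sum(1 for p, g in zip(pred, gold) if p and g)
        fp = sum(1 for p, g in zip(pred, gold) if p and not g)
        fn = sum(1 for p, g in zip(pred, gold) if not p and g)
        f = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        if f > best_f:
            best_threshold, best_f = t, f
    return best_threshold, best_f


dev_scores = [0.1, 0.2, 0.35, 0.4, 0.45, 0.6]
dev_gold = [False, False, True, False, True, True]
print(tune_threshold(dev_scores, dev_gold, candidates=[0.5, 0.3]))
```

With the default-style threshold of 0.5 only one of the three rare-class examples is detected; lowering it to 0.3 recovers all three at the cost of one false alarm, so the lower threshold wins on F.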
19Overview
- Common threads
- Examples
- Agreements & disagreements in meetings
- Agreements & disagreements in online discussions
- Citation function
- More common threads
20Detecting (Dis)Agreements in Meetings
A: I could ask my contacts at the LDC what it is they actually use.
B: Oh! Good idea, great idea.
- Adjacency pair speaker detection (given B, find A)
- Target detection for agreements & disagreements
- Also includes question/answer, offer/acceptance, etc.
- Classify B as agreement/disagreement/other
- (Backchannels modeled separately, but included in "other" for scoring.)
Galley et al. 2004
21Meeting Data
- ICSI Meeting corpus
- 75 1-hour meetings, average of 6.5 participants/meeting
- Hand transcribed, audio automatically time aligned
- Hand labeled for adjacency pairs
- 7 meetings pause-segmented into spurts
- Class distribution
- Agree: 12%
- Disagree: 7%
- Other: 81%
22Adjacency Pair Speaker Ranking
- Features (B given, A is candidate target)
- Structural: +/- overlap, number of speakers/spurts between A & B, etc.
- Duration: duration of overlap, duration of A, time between A & B, overlap with others, speaking rate
- Lexical: word counts, counts of shared words, cue word indicators, name indicator, etc.
- Dialog acts (oracle)
- Feature selection: incremental
- Classifier: Maximum entropy
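A few of the structural, duration and lexical features listed above can be computed as in the Python sketch below. The `Spurt` record and the feature names are hypothetical, and this is only a feature extractor: in the setup described on the slide, a maximum entropy model would score each candidate A from features like these.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Spurt:
    speaker: str
    start: float  # seconds
    end: float    # seconds
    words: List[str]


def pair_features(spurts, b_index, a_index):
    """Features for candidate target A given B; assumes spurts are time-ordered and a_index < b_index."""
    a, b = spurts[a_index], spurts[b_index]
    overlap = max(0.0, min(a.end, b.end) - max(a.start, b.start))
    between = spurts[a_index + 1 : b_index]
    return {
        "overlaps": overlap > 0,
        "overlap_duration": overlap,
        "gap": max(0.0, b.start - a.end),
        "spurts_between": len(between),
        "speakers_between": len({s.speaker for s in between}),
        "shared_words": len(set(a.words) & set(b.words)),
        "a_duration": a.end - a.start,
    }


spurts = [
    Spurt("A", 0.0, 4.0, ["i", "could", "ask", "the", "ldc"]),
    Spurt("C", 4.0, 5.0, ["um"]),
    Spurt("B", 5.5, 7.0, ["good", "idea", "ask", "the", "ldc"]),
]
print(pair_features(spurts, b_index=2, a_index=0))
```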
23Adjacency Pair Results
Only small gain from oracle DA information 91.3
24Agreement/Disagreement Classifier
- Features
- Structural: previous & next spurt same/diff
- Duration: spurt, silence & overlap duration, speech rate
- Lexical: similar to adjacency pairs, plus polarity word counts
- Label dependency: contextual tags (a speaker is likely to disagree with someone who disagrees with them)
- Classifier
- Conditional Markov model (Max Entropy Markov Model)
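The label-dependency idea can be illustrated with a small Python sketch that looks up the most recent tags exchanged between a pair of speakers. The function name, feature names and history format are assumptions for illustration, not the exact feature set of Galley et al. In a conditional Markov model the history labels at test time would be the model's own earlier predictions rather than gold tags.

```python
def context_tags(history, speaker, target):
    """Look up the latest earlier tags exchanged between speaker and target.

    history: list of (speaker, target, label) for earlier spurts, oldest first.
    """
    feats = {"prev_same_direction": "none", "prev_reverse_direction": "none"}
    for s, t, label in reversed(history):
        if feats["prev_same_direction"] == "none" and (s, t) == (speaker, target):
            feats["prev_same_direction"] = label
        if feats["prev_reverse_direction"] == "none" and (s, t) == (target, speaker):
            feats["prev_reverse_direction"] = label
    return feats


history = [("A", "B", "agree"), ("B", "A", "disagree"), ("C", "A", "other")]
print(context_tags(history, speaker="A", target="B"))
```

The `prev_reverse_direction` feature captures the pattern named on the slide: if B last disagreed with A, A's next spurt toward B is more likely to be a disagreement.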
25Agreement/Disagreement Results
26Overview
- Common threads
- Examples
- Agreements & disagreements in meetings
- Agreements & disagreements in online discussions
- Citation function
- More common threads
27Detecting (Dis)Agreement in Online Discussions
Task: label R in a Q-R (quote-response) pair as
agreement/disagreement.
28ARGUE Data
- 110k forum posts (11k discussion threads, 2764 authors) from website 4forums.com
- Forums include evolution, gun control, abortion, gay marriage, healthcare, death penalty, etc.
- Annotations by Mechanical Turkers with [-5,5] scale
- Disagree-agree (Krippendorff's α = 0.62)
- Other annotations had α < 0.5: attack, fact/emotion, sarcasm, nice/nasty
- 8k good Q-R pairs annotated → sample; using a (-1,1) threshold gives 682 pairs for testing
- Class distribution resampled to be balanced
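The thresholding and rebalancing steps can be sketched in Python as follows. This is an assumed reading of the slide: mean annotator scores at or below -1 count as disagreement, scores at or above 1 as agreement, and everything in between is dropped as neutral. Whether the original boundaries are inclusive is not stated, and the function names and pair ids are hypothetical.

```python
import random


def label_from_score(mean_score, threshold=1.0):
    """Map a mean [-5, 5] annotation to a class; boundary handling is an assumption."""
    if mean_score <= -threshold:
        return "disagree"
    if mean_score >= threshold:
        return "agree"
    return None  # neutral pairs are dropped


def balanced_sample(pairs, seed=0):
    """pairs: list of (pair_id, mean_score). Downsample to equal class sizes."""
    by_label = {"agree": [], "disagree": []}
    for pair_id, score in pairs:
        label = label_from_score(score)
        if label is not None:
            by_label[label].append(pair_id)
    n = min(len(ids) for ids in by_label.values())
    rng = random.Random(seed)
    return {label: rng.sample(ids, n) for label, ids in by_label.items()}


pairs = [("p1", -3.0), ("p2", -1.5), ("p3", 0.2), ("p4", 2.0), ("p5", -4.0)]
print(balanced_sample(pairs))
```

Dropping the neutral band and balancing the classes makes the task easier than the real distribution, which is why results on such data should be read as optimistic.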
29(Dis)Agree Classifier
- Features
- MetaPost: author info, time between posts, other quotes
- Unigram & bigram counts, initial unigram/bigram/trigram
- Repeated punctuation (collapsed to ??, !!, ?!)
- LIWC measures
- Parse dependencies <relation, wi, wj>, POS-polarity opinion dependencies
- Tf-idf cosine distance to previous post
- Classifier: Naïve Bayes & JRip (WEKA toolkit)
- Chi-squared feature selection, plus feature selection implicit in JRip (rule learner)
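Two of the lexical features above, collapsed repeated punctuation and the initial n-grams of a response, can be sketched in Python. The exact collapsing rules used in the study are not given on the slide, so the mapping below (any run of two or more ? or ! becomes one of three tokens) and both function names are assumptions.

```python
import re


def collapse_punctuation(text):
    """Replace any run of 2+ '?'/'!' with a single token: ??, !! or ?! (mixed)."""
    def repl(match):
        run = match.group(0)
        if "?" in run and "!" in run:
            return " ?! "
        return " ?? " if "?" in run else " !! "
    return re.sub(r"[?!]{2,}", repl, text)


def initial_ngrams(text, max_n=3):
    """Initial unigram, bigram and trigram of a response, as feature strings."""
    tokens = collapse_punctuation(text).lower().split()
    return {
        f"init_{n}gram": " ".join(tokens[:n])
        for n in range(1, max_n + 1)
        if len(tokens) >= n
    }


print(collapse_punctuation("What?!?! Really???"))
print(initial_ngrams("And what is wrong with that???"))
```

Initial n-grams are a cheap way to capture response openers such as "And what is", which often signal the stance of the whole post.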
30Sample (Dis)Agree Classifier
31(Dis)Agree Classification Results
- JRip beats NB
- JRip accuracy
- Local features: 68%
- Other annotations: 81%
Caveat: optimistic, since neutral cases are
removed.
32Overview
- Common threads
- Examples
- Agreements & disagreements in meetings
- Agreements & disagreements in online discussions
- Citation function
- More common threads
33Classification of Citation Function
- Teufel et al., 2006
- Agreement, usage, compatibility (6)
- Weakness (4)
- Contrast
- neutral
34Citation Study Data
- 26 articles w/ 548 citations
- Kappa = 0.72 for 12 categories
- Class distribution: >67% neutral + neutral contrast, 4% negative, 19% usage
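For readers unfamiliar with kappa, the following Python sketch computes a chance-corrected agreement score for two annotators. It implements Cohen's kappa as an illustrative assumption; the slide does not say which kappa variant the study used, and the toy labels are invented.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: (observed - expected agreement) / (1 - expected agreement)."""
    n = len(labels_a)
    observed = sum(1 for a, b in zip(labels_a, labels_b) if a == b) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement from each annotator's own label distribution.
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)


annotator_1 = ["neutral"] * 6 + ["usage"] * 2 + ["negative"] * 2
annotator_2 = ["neutral"] * 5 + ["usage"] * 3 + ["neutral", "negative"]
print(round(cohens_kappa(annotator_1, annotator_2), 2))
```

The two annotators agree on 8 of 10 items, but because "neutral" dominates, much of that agreement is expected by chance, so kappa is noticeably lower than 0.8.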
35Citation Classifier
- Features
- Grammar of 1762 cue phrases, e.g. "as far as we are aware" (from other work; 892 from this corpus)
- 185 POS patterns for recognizing agents (self-cites vs. others) w/ 20 manually acquired verb clusters
- Verb tense, voice, modality
- Sentence location in paragraph & section
- Classifier: K-nearest neighbor (WEKA toolkit)
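The K-nearest-neighbor idea can be shown with a tiny pure-Python sketch. It is not the WEKA implementation used in the study: the Euclidean distance, majority vote, the toy three-dimensional feature vectors and the function name `knn_classify` are all assumptions for illustration.

```python
import math
from collections import Counter


def knn_classify(train, query, k=3):
    """train: list of (feature_vector, label); vectors are equal-length lists of numbers."""
    # Lazy learning: no training step, just sort stored examples by distance to the query.
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical binary features, e.g. [has usage cue phrase, is self-cite, has negation]
train = [
    ([1, 0, 0], "usage"),
    ([1, 0, 1], "usage"),
    ([0, 1, 0], "neutral"),
    ([0, 1, 1], "neutral"),
    ([0, 0, 1], "negative"),
]
print(knn_classify(train, [1, 0, 0.5], k=3))
```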
36Citation Classification Results
K = 0.75 for humans for these categories
37Overview
- Common threads
- Examples
- Agreements & disagreements in meetings
- Agreements & disagreements in online discussions
- Citation function
- More common threads
38Collected Observations re Features
- Phrase patterns and location-based n-grams are useful
- Structural features are useful
- Location of turn relative to other authors/speakers
- Location of sentence in turn & document
- Broader context (beyond target sentence) is useful
- Sequential patterns of disagreement
- Emotion context
- Simple cosine similarity is not so useful
- Prosodic features not being taken advantage of
39More Challenges
- Explicit agreement & disagreement do not capture all the phenomena associated with alignment & distancing
- Implicit (dis)agreement via stating an opposite opinion
- A: The video is still an allegation
- B: The video is hard evidence
- or a rhetorical question
- A: Such a topic is far more broad than the current article but should certainly contain a link back to this one.
- B: How is the Iraq invasion controversy suggestion more broad?
- Support vs. attack
- Well, you have proven yoruself [sic] to be a man with no brain
- Steph- vent away that sucks
- These phenomena are hard for human annotators to label consistently (exception: citation labels?)
- Different studies may group or distinguish them
40Example Wikipedia Talk Page
- The victims were teenagers, not children.
Furthermore, the teenagers were throwing rocks
and makeshift grenades at the soldiers. Second,
the video is still an allegation. We should wait
until the investigation is completed before
putting it up. RegisteredUser1
- The video is hard evidence. If this was 1945,
you'd be telling us not to include any footage of
the Nazi concentration camps until the Germans
had concluded that they committed war crimes.
As for your suggestions that those children
deserved what happened because they allegedly
throw rocks at soldiers carrying assault rifles,
I find that as offensive as suggesting that
America deserved the 9/11 attack because of its
foreign policies. AnonymousUser1
- THEY WEREN'T CHILDREN! The article makes NO
mention of children whatsoever. So before you all
let your emotions run wild over this a) they
weren't children b) they had hand grenades.
RegisteredUser1
- YES THEY WERE CHILDREN! Watch the video. The
soldiers are clearly acting in hatred and
blood-lust, not self-defense. Defending them is
like defending a child molester or serial
murderer. The video SHOWS children being
assaulted. AnonymousUser2
- A 14 year old is definitely a child. There's a
reason we don't let 14 year-olds drink, vote,
drive, "consent" to sex with adults, or sign
legal agreements without a guardian.
RegisteredUser2
- At 14 you are definitely a teenager, not a child.
14 year olds can throw a grenade and shoot a
rifle, and know the consequences of their
actions. Furthermore 18 isn't the age of majority
in Iraq so far as I know. In much of the world
the drinking and driving ages are 14 and 16. The
world is not centered upon our American beliefs,
and it's high time that we started accepting that
in ALL situations, not just the ones we deem
acceptable. I'm absolutely sickened by the
brainwashed vehemence and anti-US hatred
expressed by so many so called "liberals" on
Wikipedia. RegisteredUser1
- In the English language the word adult is
generally not used for people under the age of
18. If you want to use it differently you need to
explain it in the article in order not to be
misleading. Please calm down and do not
personally attack others as "brainwashed" or
spreading "hatred". RegisteredUser4
41Summary
- Why look for (dis)agreement, support, etc?
- Dissecting discussions for influence, subgroups, affiliation, successful problem solving, etc.
- Understanding citation impact
- These tasks are very related to sentiment detection, except that the target is often part of the problem
- Different ways of handling agreement vs. support
- The neutral class is huge: don't ignore it
- Computational advice
- Many better alternatives to Naïve Bayes
- Consider features beyond n-grams