Computational Models of Text Quality

1
Computational Models of Text Quality

Ani Nenkova
University of Pennsylvania
ESSLLI 2010, Copenhagen

2
The ultimate text quality application

Imagine your favorite text editor
With spell-checker and grammar checker
But also functions that tell you
Word W is repeated too many times
"Fill the gap" is a cliché
You might consider using this more figurative
expression
This sentence is unclear and hard to read
What is the connection between these two
sentences?
..

3
Currently

It is our friends who give such feedback
Often conflicting
We might agree that a text is good, but find it
hard to explain exactly why
Computational linguistics should have some
answers
Though far from offering a complete solution yet

4
In this course

We will overview research dealing with various
aspects of text quality
A unified approach does not yet exist, but many
proposals
have been tested on corpus data
integrated in applications

5
Current applications: education

Grading student writing
Is this a good essay?
One of the graders of SAT and GRE essays is in
fact a machine! [1]
http://www.ets.org/research/capabilities/automated_scoring
Providing appropriate reading material
Is this text good for a particular user?
Appropriate grade level
Appropriate language competency in L2 [2, 3]
http://reap.cs.cmu.edu/

6
Current applications: information retrieval

Particularly user generated content
Questions and answers on the web
Blogs and comments
Searching over such content poses new problems
[4]
What is a good question/answer/comment?
http://answers.yahoo.com/
Relevant for general IR as well
Of the many relevant documents, some are better
written

7
Current applications: NLP

Models of text quality
lead to improved systems [5]
offer possibilities for automatic evaluation [6]
Automatic summarization
Select important content and organize it as
well-written text
Language generation
Select, organize and present content on document,
paragraph, sentence and phrase level
Machine translation

8
Text quality factors

Interesting
Style (clichés, figurative language)
Vocabulary use
Grammatical and fluent sentences
Coherent and easy to understand
In most types of writing, well-written means
clear and easy to understand. Not necessarily so
in literary works.
Problems with clarity of instructions motivated a
fair amount of early work.

9
Early work: keep in mind these predate modern
computers!

Common words are easier to understand
stentorian vs. loud
myocardial infarction vs. heart attack
Common words are short
Standard readability metrics
percentage of words not among the N most frequent
average number of syllables per word
Syntactically simple sentences are easier to
understand
average number of words per sentence
Flesch-Kincaid, Automated Readability Index,
Gunning-Fog, SMOG, Coleman-Liau
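
As an illustration of how these classic metrics combine word length and sentence length, here is a minimal sketch of the Flesch-Kincaid Grade Level formula. The syllable counter is a crude vowel-group heuristic and the tokenization is a simple regular expression; both are assumptions made for this example, not part of the original metric definitions.

```python
import re

def count_syllables(word: str) -> int:
    """Rough estimate: one syllable per group of consecutive vowels (heuristic)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid Grade Level from words/sentence and syllables/word."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word and one sentence")
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)

# The rarer, longer medical term should score as harder to read.
hard = flesch_kincaid_grade("He had a myocardial infarction.")
easy = flesch_kincaid_grade("He had a heart attack.")
print(hard > easy)
```

A production system would use a pronunciation dictionary for syllables; the heuristic only needs to be consistent for the comparison to be meaningful.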

10
Modern equivalents

Language models
Word probabilities from a large collection
http://www.speech.cs.cmu.edu/SLM_info.html
Features derived from syntactic parse [2, 7, 8, 9]
Parse tree height
Number of subordinating conjunctions
Number of passive voice constructions
Number of noun and verb phrases

11
Language models

Unigram and bigram language models
Really, just huge tables
Smoothing necessary to account for unseen words

12
Features from language models

Assessing the readability of text t consisting of
m words, for intended audience class c
Number of out of vocabulary words in the text
with respect to the language model for c
Text likelihood and perplexity
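
The three features above can be sketched with a unigram model. This is a toy illustration under stated assumptions: add-one smoothing with a single extra vocabulary slot for unseen words, and hypothetical function names; the cited systems use richer models and smoothing.

```python
import math
from collections import Counter

def train_unigram(tokens):
    """Return (word counts, total token count) for one audience class c."""
    counts = Counter(tokens)
    return counts, sum(counts.values())

def lm_features(text_tokens, counts, total):
    """OOV count, log-likelihood and perplexity of a non-empty text under the model."""
    vocab_size = len(counts) + 1  # assumption: one extra slot for all unseen words
    oov = sum(1 for w in text_tokens if w not in counts)
    log_lik = sum(math.log((counts[w] + 1) / (total + vocab_size))
                  for w in text_tokens)
    perplexity = math.exp(-log_lik / len(text_tokens))
    return {"oov": oov, "log_likelihood": log_lik, "perplexity": perplexity}

counts, total = train_unigram("the cat sat on the mat".split())
seen = lm_features(["the", "cat"], counts, total)
unseen = lm_features(["the", "zebra"], counts, total)
print(seen["oov"], unseen["oov"])
print(seen["perplexity"] < unseen["perplexity"])
```

To predict an audience class, one would train one model per class and compare these features across classes.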

13
Application to grade level prediction
(Collins-Thompson and Callan, NAACL 2004) [10]
14
Application to grade level prediction
(Collins-Thompson and Callan, NAACL 2004) [10]
15
Results on predicting grade level (Schwarm and
Ostendorf, ACL 2005) [11]

Flesch-Kincaid Grade Level index
number of syllables per word
sentence length
Lexile
word frequency
sentence length
SVM features
language models and syntax

16
Models of text coherence

Global coherence
Overall document organization
Local coherence
Adjacent sentences

17
Text structure can be learnt in an unsupervised
manner
Location, time

Human-written examples from a domain

damage
magnitude
relief efforts
18
Content model (Barzilay and Lee, 2004) [5]

Hidden Markov Model (HMM)-based
States: clusters of related sentences (topics)
Transition prob.: sentence precedence in corpus
Emission prob.: bigram language model

Generating sentence in current topic
Earthquake reports
Transition from previous topic
location, magnitude
relief efforts
casualties
19
Generating Wikipedia articles (Sauper and
Barzilay, 2009) [12]

Articles on diseases and American film actors
Create templates of subtopics

Focus only on subtopic level structure
Use paragraphs from documents on the web

20
Template creation

Cluster similar headings
signs and symptoms, symptoms, early symptoms
Choose k clusters
average number of subtopics in that domain
Find majority ordering for the clusters

Biography: Early life, Career, Personal life, Death
Diseases: Symptoms, Causes, Diagnosis, Treatment
21
Extraction of excerpts and ranking

Candidates for a subtopic
Paragraphs from top 10 pages of search results
Measure relevance of candidates for that subtopic
Features: unigrams, bigrams, number of
sentences

22
Need to control redundancy across subtopics

Integer Linear Program
Variables
One per excerpt (value 1 if chosen, 0 otherwise)
Objective
Minimize sum of the ranks of the excerpts chosen

(Figure: candidate excerpts ranked 1 to 5 for each
subtopic: causes, symptoms, diagnosis, treatment)

Constraints
Cosine similarity between any selected pair < 0.5
One excerpt per subtopic
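
To make the optimization concrete, here is a small sketch of the same selection problem: one excerpt per subtopic, minimum total rank, no selected pair with similarity of 0.5 or more. It uses exhaustive search rather than an actual ILP solver, which is only feasible for tiny inputs; the function name and the toy similarity function are assumptions for illustration.

```python
from itertools import product

def select_excerpts(ranked, similarity, max_sim=0.5):
    """ranked: dict subtopic -> list of excerpt ids, best-ranked first.
    similarity: symmetric function (id, id) -> cosine similarity.
    Returns (chosen excerpt per subtopic, sum of ranks), or (None, None).
    """
    subtopics = list(ranked)
    best, best_cost = None, None
    # Enumerate every way of picking one excerpt per subtopic.
    for choice in product(*(range(len(ranked[s])) for s in subtopics)):
        picked = [ranked[s][i] for s, i in zip(subtopics, choice)]
        redundant = any(similarity(a, b) >= max_sim
                        for k, a in enumerate(picked)
                        for b in picked[k + 1:])
        if redundant:
            continue  # violates the redundancy constraint
        cost = sum(i + 1 for i in choice)  # ranks are 1-based
        if best_cost is None or cost < best_cost:
            best, best_cost = dict(zip(subtopics, picked)), cost
    return best, best_cost

# Toy data: the two top-ranked excerpts are near-duplicates of each other.
ranked = {"causes": ["c1", "c2"], "symptoms": ["s1", "s2"]}
def toy_similarity(a, b):
    return 0.9 if {a, b} == {"c1", "s1"} else 0.1

print(select_excerpts(ranked, toy_similarity))
```

With the top pair ruled out as redundant, the search falls back to the cheapest combination that satisfies the constraints.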

23
Linguistic models of coherence (Halliday and
Hasan, 1976) [13]

Coherent text is characterized by the presence of
various types of cohesive links that facilitate
text comprehension
Reference and lexical reiteration
Pronouns, definite descriptions, semantically
related words
Discourse relations (conjunction)
I closed the window because it started raining.
Substitution (one) or ellipses (do)

24
Referential coherence

Centering theory
tracking focus of attention across adjacent
sentences [14, 15, 16, 17]
Syntactic form of references
Particularly first and subsequent mention [18,
19], pronominalization
Lexical chains
Identifying and tracking topics within a text
[20, 21, 22, 23]

25
Discourse relations

Explicit vs. implicit
I stayed home because I had a headache
Signaled by a discourse connective
Inferred without the presence of a connective
I took my umbrella. [Because] The forecast was
for rain in the afternoon.

26
Lexical chains

Often discussed as cohesion indicator,
implemented systems, but not used in text quality
tasks
Find all words that refer to the same topic
Find the correct sense of the words
LexChainer Tool http://www1.cs.columbia.edu/nlp/tools.cgi [23]
Applications: summarization, IR, spell checking,
hypertext construction
John bought a Jaguar. He loves the car.
LC: {jaguar, car, engine, it}

27
Centering theory ingredients (Grosz et al., 1995)

Deals with local coherence
What happens to the flow from sentence to
sentence
Does not deal with global structuring of the text
(paragraphs/segments)
Defines coherence as an estimate of the
processing load required to understand the text

28
Processing load

Upon hearing a sentence, a person
Expends cognitive effort to interpret the
expressions in the utterance
Integrates the meaning of the utterance with that
of the previous sentence
Creates some expectations on what might come next

29
Example

John met his friend Mary today.
He was surprised to see her.
He thought she is still in Italy.
Form of referring expressions
Anaphora needs to be resolved
Create a discourse entity at first mention with
full noun phrase
Creating expectations

30
Creating and meeting expectations

(1) a. John went to his favorite music store to
buy a piano.
b. He had frequented the store for many
years.
c. He was excited that he could finally buy
a piano.
d. He arrived just as the store was closing
for the day.
(2) a. John went to his favorite music store to
buy a piano.
b. It was a store John had frequented for
many years.
c. He was excited that he could finally buy
a piano.
d. It was closing just as John arrived.

31
Interpreting pronouns

Terry really goofs sometimes.
Yesterday was a beautiful day and he was excited
about trying out his new sailboat.
He wanted Tony to join him on a sailing
expedition.
He called him at 6am.
He was sick and furious at being woken up so
early.

32
Basic centering definitions

Centers of an utterance
Set of entities serving to link that utterance to
the other utterances in the discourse segment
that contains it
Not words or phrases themselves
Semantic interpretations of noun phrases

33
Types of centers

Forward looking centers
An ordered set of entities
What could we expect to hear about next
Ordered by salience as determined by grammatical
function
Subject > Indirect object > Object > Others
John gave the textbook to Mary.
Cf = {John, Mary, textbook}
Preferred center, Cp
The highest ranked forward looking center
High expectation that the next utterance in the
segment will be about Cp

34
Backward looking center

Single backward looking center, Cb(U)
For each utterance other than the segment-initial
one
The backward looking center of utterance U_{n+1}
connects with one of the forward looking centers
of U_n
Cb(U_{n+1}) is the most highly ranked element from
Cf(U_n) that is also realized in U_{n+1}

35
Centering transitions ordering

                           | Cb(U_{n+1}) = Cb(U_n)  | Cb(U_{n+1}) != Cb(U_n)
                           | or Cb(U_n) undefined   |
Cb(U_{n+1}) = Cp(U_{n+1})  | continue               | smooth-shift
Cb(U_{n+1}) != Cp(U_{n+1}) | retain                 | rough-shift
36
Centering constraints

There is precisely one backward-looking center
Cb(U_n)
Cb(U_{n+1}) is the highest-ranked element of Cf(U_n)
that is realized in U_{n+1}

37
Centering rules

If some element of Cf(U_n) is realized as a
pronoun in U_{n+1} then so is Cb(U_{n+1})
Transitions not equal
continue > retain > smooth-shift > rough-shift
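
The constraints and transition types can be sketched in a few lines. This is a simplified illustration: each utterance is given as its Cf list already ordered by salience, and "realized in" is approximated as membership in the next utterance's Cf list. Function names are hypothetical, and entity extraction and ranking (the hard parts) are assumed done by hand.

```python
def backward_center(cf_prev, cf_now):
    """Cb(U_{n+1}): highest-ranked element of Cf(U_n) realized in U_{n+1}."""
    for entity in cf_prev:  # cf_prev is ordered by salience
        if entity in cf_now:
            return entity
    return None

def transition(cb_prev, cb_now, cp_now):
    """Classify the transition into U_{n+1}; cb_prev is None if undefined."""
    if cb_now is None:
        return "undefined"
    same_cb = cb_prev is None or cb_now == cb_prev
    if cb_now == cp_now:
        return "continue" if same_cb else "smooth-shift"
    return "retain" if same_cb else "rough-shift"

def analyze(cf_lists):
    """Label the transition between each pair of adjacent utterances."""
    labels, cb_prev = [], None
    for cf_prev, cf_now in zip(cf_lists, cf_lists[1:]):
        cb_now = backward_center(cf_prev, cf_now)
        labels.append(transition(cb_prev, cb_now, cf_now[0]))  # Cp is first in Cf
        cb_prev = cb_now
    return labels

# Hand-built Cf lists for the Terry/Tony sailing discourse.
discourse = [
    ["Terry"],
    ["Terry", "sailboat"],
    ["Terry", "Tony", "expedition"],
    ["Terry", "Tony"],
    ["Tony"],
    ["Tony", "Terry"],
    ["Terry", "Tony"],
]
print(analyze(discourse))
```

Counting the share of "rough-shift" labels in the output is one way such an analysis could feed a coherence score.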

38
Centering analysis

Terry really goofs sometimes.
Cf = {Terry}, Cb = ?, transition undefined
Yesterday was a beautiful day and he was excited
about trying out his new sailboat.
Cf = {Terry, sailboat}, Cb = Terry, continue
He wanted Tony to join him in a sailing
expedition.
Cf = {Terry, Tony, expedition}, Cb = Terry, continue
He called him at 6am.
Cf = {Terry, Tony}, Cb = Terry, continue
Tony was sick and furious at being woken up so
early.
Cf = {Tony}, Cb = Tony, smooth-shift
He told Terry to get lost and hung up.
Cf = {Tony, Terry}, Cb = Tony, continue
Of course, Terry hadn't intended to upset Tony.
Cf = {Terry, Tony}, Cb = Tony, retain

40
Rough shifts in evaluation of writing skills
(Miltsakaki and Kukich, 2002)

Automatic grading of essays by E-rater
Syntactic variety
Represented by features that quantify the
occurrence of clause types
Clear transitions
Cue phrases in certain syntactic constructions
Existence of main and supporting points
Appropriateness of the vocabulary content of the
essay
What about local coherence?

41
Essay score model

Human score available
E-rater prediction available
Percentage of rough-shifts in each essay
analysis done manually
Negative correlation between the human score and
the percentage of rough-shifts

Linear multi-factor regression
Approximate the human score as a linear function
of the e-rater prediction and the percentage of
rough-shifts
Adding rough shifts significantly improves the
model of the score
0.5 improvement on a 1-6 scale
How easy/difficult would it be to fully automate
the rough-shift variable?

43
Variants of centering and application to
information ordering

Karamanis et al. (2009) is the most comprehensive
overview of variants of centering theory and an
evaluation of centering in a specific task
related to text quality

44
Information ordering task

Given a set of sentences/clauses, what is the
best presentation?
Take a newspaper article and jumble the
sentences---the result will be much more
difficult to read than the original
Negative examples constructed by randomly
permuting the original
Criteria for deciding which of two orderings is
better
Centering would definitely be applicable

45
Centering variations

Continuity (NOCB: lack of continuity)
Cf(U_n) and Cf(U_{n+1}) share at least one element
Coherence
Cb(U_n) = Cb(U_{n+1})
Salience
Cb(U) = Cp(U)
Cheapness (fulfilled expectations)
Cb(U_{n+1}) = Cp(U_n)

46
Metrics of coherence

M.NOCB (no continuity)
M.CHEAP (expectations not met)
M.KP: sum of the violations of continuity,
cheapness, coherence and salience
M.BFP: seeks to maximize transitions according to
Rule 2

47
Experimental methodology

Gold-standard ordering
The original order of the text (object
description, news article)
Assume that other orderings are inferior
Classification error rate
Percentage of orderings that score better than the
gold standard + 0.5 x percentage of the orderings
that score the same
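
The classification error rate is simple to compute once every alternative ordering has a metric score. A minimal sketch, assuming higher scores mean "judged more coherent" (for violation-counting metrics such as M.NOCB one would negate the counts first); the function name is hypothetical.

```python
def classification_error_rate(gold_score, other_scores):
    """Fraction of alternative orderings scored better than the gold standard,
    plus half the fraction scored exactly the same."""
    better = sum(1 for s in other_scores if s > gold_score)
    equal = sum(1 for s in other_scores if s == gold_score)
    return (better + 0.5 * equal) / len(other_scores)

# Gold ordering scores 3; one permutation beats it and one ties.
print(classification_error_rate(3, [1, 2, 3, 4]))  # 0.375
```

A metric that cannot distinguish orderings at all scores 0.5, so useful metrics should fall well below that.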

48
Results

NOCB gives best results
Significantly better than the other metrics
Consistent results for three different corpora
Museum artifact descriptions (2)
News
Airplane accidents
M.BFP is the second best metric

49
(No Transcript)
50
Entity grid (Barzilay and Lapata, 2005, 2008)

Inspired by centering
Tracks entities across adjacent sentences, as
well as their syntactic positions
Much easier to compute from raw text
Brown Coherence Toolkit
http://www.cs.brown.edu/melsner/manual.html

51
Entity grid applications

Several applications, with very good results
Information ordering
Comparing the coherence of pairs of summaries
Distinguishing readability levels
Child vs. adult
Improves over Petersen and Ostendorf

52
Entity grid example

1 [The Justice Department]S is conducting an
[anti-trust trial]O against [Microsoft Corp.]X
with [evidence]X that [the company]S is
increasingly attempting to crush [competitors]O.
2 [Microsoft]O is accused of trying to forcefully
buy into [markets]X where [its own products]S are
not competitive enough to unseat [established
brands]O.
3 [The case]S revolves around [evidence]O of
[Microsoft]S aggressively pressuring [Netscape]O
into merging [browser software]O.
4 [Microsoft]S claims [its tactics]S are
commonplace and good economically.
5 [The government]S may file [a civil suit]O
ruling that [conspiracy]S to curb [competition]O
through [collusion]X is [a violation of the
Sherman Act]O.
6 [Microsoft]S continues to show [increased
earnings]O despite [the trial]X.

53
Entity grid representation
54
16 entity grid features

The probability of each type of transition in the
text
Four syntactic distinctions
S, O, X, _
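
The 16 features are the probabilities of the 4 x 4 possible role transitions between adjacent sentences. A minimal sketch, assuming the grid has already been built (one role string per entity, one character per sentence); the three rows below are a hand-built partial grid for illustration, not the full grid from the cited papers.

```python
from collections import Counter
from itertools import product

ROLES = "SOX_"  # subject, object, other, absent

def transition_probabilities(grid):
    """grid: dict entity -> string of roles, one character per sentence.
    Returns the probability of each of the 16 length-2 role transitions."""
    counts = Counter()
    for roles in grid.values():
        for a, b in zip(roles, roles[1:]):
            counts[a + b] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("grid needs at least two sentences")
    return {a + b: counts[a + b] / total for a, b in product(ROLES, repeat=2)}

# Hand-built rows over six sentences (assumed for illustration).
grid = {
    "Microsoft": "SOSS_S",
    "trial":     "O____X",
    "evidence":  "X_O___",
}
features = transition_probabilities(grid)
print(len(features))
print(features["__"], features["SS"])
```

Each document becomes a fixed-length vector regardless of how many entities or sentences it contains, which is what makes the representation convenient for a classifier or ranker.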

55
Type of reference and info ordering (Elsner and
Charniak, 2008)

Entity grid features not concerned with how an
entity is mentioned
Discourse old vs. discourse new
Kent Wells, a BP senior vice president said on
Saturday during a technical briefing that the
current cap, which has a looser fit and has been
diverting about 15,000 barrels of oil a day to a
drillship, will be replaced with a new one in 4
to 7 days.
The new cap will take 4 to 7 days to be
installed, and in case the new cap is not
effective, Mr. Wells said engineers were prepared
to replace it with an improved version of the
current cap.

The probability of a given sequence of discourse
new and old realizations gives a further
indication about ordering
Similarly, pronouns should have reasonable
antecedents
Adding both models to the entity grid improves
performance on the information ordering task

57
Sentence Ordering

n sentences
Output from a generation or summarization system
Find most coherent ordering
n! permutations

With local coherence metrics
Adjacent sentence flow
Finding best ordering is NP complete
Reduction from Traveling Salesman Problem
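
For very small n, the search over all n! permutations can be written directly. This sketch assumes a local coherence function that scores an adjacent pair (higher is better); the toy score table is hypothetical. Because the number of permutations grows factorially, real systems rely on approximate search instead.

```python
from itertools import permutations

def best_ordering(sentences, local_score):
    """Exhaustively pick the ordering with the highest sum of
    adjacent-pair scores. Only feasible for a handful of sentences."""
    def total(order):
        return sum(local_score(a, b) for a, b in zip(order, order[1:]))
    return max(permutations(sentences), key=total)

# Toy pairwise scores: "a" flows into "b", and "b" flows into "c".
toy_scores = {("a", "b"): 1.0, ("b", "c"): 1.0}
def toy_local_score(first, second):
    return toy_scores.get((first, second), 0.0)

print(best_ordering(["c", "a", "b"], toy_local_score))
```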

58
Word co-occurrence model (Lapata, ACL 2003;
Soricut and Marcu, 2005) [23, 24]

Idea from statistical machine translation
Alignment models

John went to a restaurant. He ordered fish. The
waiter was very attentive.
John est allé à un restaurant. Il ordonna de
poisson. Le garçon était très attentif.
John went to a restaurant. He ordered fish. The
waiter was very attentive.
He ordered fish. The waiter was very
attentive. John gave him a huge tip.
P(ordered | restaurant)
P(fish | poisson)
We ate at a restaurant yesterday.
P(waiter | ordered)
We also ordered some take away.
P(tip | waiter)

59
Discourse (coherence) relations

Only recently have empirical results shown that
discourse relations are predictive of text
quality (Pitler and Nenkova, 2008)

60
PDTB discourse relations annotations

Largest corpus of annotated discourse relations
http://www.seas.upenn.edu/pdtb/
Four broad classes of relations
Contingency
Comparison
Temporal
Expansion
Explicit and implicit

61
Implicit and explicit relations

(E1) He is very tired because he played tennis
all morning.
(E2) He is not very strong but he can run
amazingly fast.
(E3) We had some tea in the afternoon and later
went to a restaurant for a big dinner
(I1) I took my umbrella this morning. [because]
The forecast was for rain.
(I2) She is never late for meetings. [but] He
always arrives 10 minutes late.
(I3) She woke up early. [afterwards] She had
breakfast and went for a walk in the park.

62
What is the relative importance of factors in
determining text quality?

Competent readers (native English speakers),
graduate students at Penn
Wall Street Journal texts
30 texts rated on a scale of 1 to 5
How well-written is this article?
How well does the text fit together?
How easy was it to understand?
How interesting is the article?

Several judgments for each text
Final quality score was the average
Scores range from 1.5 to 4.33
Mean 3.2

Which of the many indicators will work best?
Usually research studies focus on only one or two
How do indicators combine?
Metrics
Correlation coefficient
Accuracy of pair-wise ranking prediction

Correlation coefficients between assessor ratings
and different features

66
Baseline measures

Average characters/word: r = -.0859 (p = .6519)
Average words/sentence: r = .1637 (p = .3874)
Max words/sentence: r = .0866 (p = .6489)
Article length: r = -.3713 (p = .0434)

67
Vocabulary factors

Language model probability of the article
M estimated from PTB (WSJ)
M estimated from general news (NEWS)

68
Correlations with well-written assessment

Log likelihood, WSJ: r = .3723 (p = .0428)
Log likelihood, NEWS: r = .4497 (p = .0127)
Log likelihood with length, WSJ: r = .3732 (p = .0422)
Log likelihood with length, NEWS: r = .6359 (p = .0002)

69
Syntactic features

Average parse tree height: r = -.0634 (p = .7439)
Avg. number of noun phrases per sentence: r = .2189 (p = .2539)
Average SBARs: r = .3405 (p = .0707)
Avg. number of verb phrases per sentence: r = .4213 (p = .0228)

70
Elements of lexical cohesion

Avg. cosine similarity between adjacent sentences: r = -.1012 (p = .5947)
Avg. word overlap between adjacent sentences: r = -.0531 (p = .7806)
Avg. noun + pronoun overlap: r = .0905 (p = .6345)
Avg. pronouns/sentence: r = .2381 (p = .2051)
Avg. definite articles: r = .2309 (p = .2196)

71
Correlation with well-written score

Prob. of S-S transition: r = -.1287 (p = .5059)
Prob. of S-O transition: r = -.0427 (p = .8261)
Prob. of S-X transition: r = -.1450 (p = .4529)
Prob. of S-N transition: r = .3116 (p = .0999)
Prob. of O-S transition: r = .1131 (p = .5591)
Prob. of O-O transition: r = .0825 (p = .6706)
Prob. of O-X transition: r = .0744 (p = .7014)
Prob. of O-N transition: r = .2590 (p = .1749)
Prob. of X-S transition: r = .1732 (p = .3688)
Prob. of X-O transition: r = .0098 (p = .9598)
Prob. of X-X transition: r = -.0655 (p = .7357)
Prob. of X-N transition: r = .1319 (p = .4953)
Prob. of N-S transition: r = .1898 (p = .3242)
Prob. of N-O transition: r = .2577 (p = .1772)
Prob. of N-X transition: r = .1854 (p = .3355)
Prob. of N-N transition: r = -.2349 (p = .2200)

73
Well-writtenness and discourse

Log likelihood of discourse rels: r = .4835 (p = .0068)
# of discourse relations: r = -.2729 (p = .1445)
Log likelihood of rels with # of rels: r = .5409 (p = .0020)
# of relations with # of words: r = .3819 (p = .0373)
Explicit relations only: r = .1528 (p = .4203)
Implicit relations only: r = .2403 (p = .2009)

74
Summary significant factors

Log likelihood of discourse relations: r = .4835
Log likelihood, NEWS: r = .4497
Average verb phrases per sentence: r = .4213
Log likelihood, WSJ: r = .3723
Number of words: r = -.3713

75
Text quality prediction as ranking

Every pair of texts with ratings differing by 0.5
Features are the difference of feature values for
each text
Task: predict which of the two articles has the
higher text quality score
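
Building the training instances for this ranking setup can be sketched as follows. The example assumes that "differing by 0.5" means a gap of at least 0.5, and that each text is already a numeric feature vector; the function name and the toy numbers are made up for illustration.

```python
from itertools import combinations

def make_pairs(texts, min_gap=0.5):
    """texts: list of (feature_vector, rating).
    Returns (feature_difference, label) pairs, with label 1 when the
    first text of the pair has the higher rating and 0 otherwise."""
    examples = []
    for (fa, ra), (fb, rb) in combinations(texts, 2):
        if abs(ra - rb) < min_gap:
            continue  # ratings too close to give a reliable preference
        diff = [x - y for x, y in zip(fa, fb)]
        examples.append((diff, 1 if ra > rb else 0))
    return examples

# Toy data: two features per text, ratings on the 1 to 5 scale.
texts = [([1.0, 2.0], 4.0), ([0.5, 3.0], 3.8), ([0.0, 1.0], 2.0)]
for diff, label in make_pairs(texts):
    print(diff, label)
```

The resulting difference vectors can be passed to any binary classifier; accuracy on held-out pairs is then the pairwise ranking accuracy.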

76
Prediction accuracy (10-fold cross validation)

Features                 Accuracy (%)
None (majority class)    50.21
Number of words          65.84
ALL                      88.88
Grid only                79.42
Log l discourse rels     77.77
Avg VPs sen              69.54
Log l NEWS               66.25

77
Findings

Complex interplay between features
Entity grid features not significantly correlated
with well-written score but very useful for the
ranking task
Discourse information is very helpful
But here we used gold-standard annotations
Developing automatic classifier underway

78
Implicit and explicit discourse relations
Class         Explicit (%)   Implicit (%)
Comparison    69             31
Contingency   47             53
Temporal      80             20
Expansion     42             58
79
Sense classification based on connectives only

Four-way classification
Explicit relations only
93% accuracy
All relations (implicit + explicit)
75% accuracy
Implicit relations are the real challenge

80
Explicit discourse relations, tasks (Pitler and
Nenkova, 2009) [25]

Discourse vs. non-discourse use
I will be happier once the semester is over.
I have been to Ohio once.
Relation sense
Contingency, comparison, temporal, expansion
I haven't been to Paris since I went there on a
school trip in 1998. [Temporal]
I haven't been to Antarctica since it is very far
away. [Contingency]

81
Penn Discourse Treebank

Largest available annotated corpus of discourse
relations
Penn Treebank WSJ articles
18,459 explicit discourse relations
100 connectives
although vs. or
91% discourse vs. 3% discourse

82
Discourse Usage Experiments

Positive examples: discourse connectives
Negative examples: same strings in PDTB,
unannotated
10-fold cross validation
Maximum Entropy classifier

83
Discourse Usage Results
84
Discourse Usage Results
85
Sense Disambiguation: Comparison, Contingency,
Expansion, or Temporal?
Features                   Accuracy (%)
Connective                 93.67
Connective + Syntax        94.15
Interannotator agreement   94
86
Tool

Automatic annotation of discourse use and sense
of discourse connectives
Discourse Connectives Tagger
http://www.cis.upenn.edu/epitler/discourse.html

87
What about implicit relations?

Is there hope to have a usable tool soon?
Early studies on unannotated data gave reason for
optimism
But when recently tested on the PDTB, their
performance is poor
Accuracy of contingency, comparison and temporal
is below 50%

88
Classify implicits and explicits together

Not easy to infer from combined results how early
systems performed on implicits
As we saw, one can get reasonable overall
performance by doing nothing for implicits
Same sentence [26]
Graphbank corpus doesn't distinguish implicit
and explicit [27]

89
Classify on large unannotated corpus

Create artificial implicits by deleting
connective [28, 29, 30]
I am in Europe, but I live in the United States.
First proposed by Marcu and Echihabi, 2002
Very good initial results
Accuracy of distinguishing between two rels > 75%
But these were on balanced classes
Not the case in real text
Not tested on real implicits (but see [29, 30])

90
Experiments with PDTB

Pitler et al., ACL 2009 [31]
Wide variety of features to capture semantic
opposition and parallelism
Lin et al., EMNLP 2009 [32]
(Lexicalized) syntactic features
Results improve over baselines, better
understanding of features, but the classifiers
are not suitable for application in real tasks

91
Word pairs as features

Most basic feature for implicits
I_there, I_is, ..., tired_time, tired_difference

I am a little tired
there is a 13 hour time difference
Marcu and Echihabi, 2002
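
Word pair features are the cross product of the words in the two text spans. A minimal sketch, assuming whitespace tokenization and the underscore naming used above; the function name is hypothetical.

```python
from itertools import product

def word_pair_features(span1: str, span2: str) -> set:
    """One feature per (word from span 1, word from span 2) combination."""
    return {f"{a}_{b}" for a, b in product(span1.split(), span2.split())}

features = word_pair_features("I am a little tired",
                              "there is a 13 hour time difference")
print(len(features))
print("tired_difference" in features)
```

The feature space grows with the square of the vocabulary, which is why large amounts of training data, or feature selection, are needed.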
92
Intuition: with large amounts of data, we will find
semantically related pairs

The recent explosion of country funds mirrors the
closed-end fund mania of the 1920s, Mr. Foot
says, when narrowly focused funds grew wildly
popular.
They fell into oblivion after the 1929 crash.

93
Meta error analysis of prior work

Using just content words reduces performance (but
has steeper learning curve)
Marcu and Echihabi, 2002
Nouns and adjectives don't help at all
Lapata and Lascarides, 2004 [33]
Filtering out stopwords lowers results
Blair-Goldensohn et al., 2007

94
Word pairs experiments (Pitler et al., 2009)

Synthetic implicits: Cause/Contrast/None
Explicit instances from Gigaword with connective
deleted
Because -> Cause, But -> Contrast
At least 3 sentences apart -> None
Blair-Goldensohn et al., 2007
Random selection
5,000 Cause
5,000 Other
Computed information gain of word pairs
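
Information gain measures how much knowing whether a word pair is present reduces uncertainty about the relation label. A small sketch, assuming each example is a set of word pair features plus a label; the toy examples and function names are invented for illustration.

```python
import math

def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of labels."""
    n = len(labels)
    return -sum((labels.count(c) / n) * math.log2(labels.count(c) / n)
                for c in set(labels))

def information_gain(examples, feature):
    """examples: list of (set_of_features, label).
    Entropy of the labels minus the entropy remaining after splitting
    the examples on presence vs. absence of the feature."""
    labels = [y for _, y in examples]
    present = [y for f, y in examples if feature in f]
    absent = [y for f, y in examples if feature not in f]
    remaining = sum(len(part) / len(labels) * entropy(part)
                    for part in (present, absent) if part)
    return entropy(labels) - remaining

examples = [
    ({"but_the"}, "Contrast"),
    ({"but_the"}, "Contrast"),
    ({"rain_wet"}, "Cause"),
    ({"rain_wet"}, "Cause"),
]
print(information_gain(examples, "but_the"))
```

Ranking all word pairs by this value and keeping the top ones is one simple form of feature selection.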

95
Function words have highest information gain
But... didn't we remove the connective?
96
"but" signals Not-Comparison in synthetic data

The government says it has reached most isolated
townships by now, but because roads are blocked,
getting anything but basic food supplies to
people remains difficult.
but because -> Comparison
but because -> Contingency

97
Results: word pairs

Pairs of words from the two text spans
What doesn't work
Training on synthetic implicits
What really works
Use synthetic implicits for feature selection
Train on PDTB

98
Best Results (f-scores)
Relation      Best     Baseline
Comparison    21.96    17.13
Contingency   47.13    31.10
Expansion     76.41    63.84
Temporal      16.76    16.21
Comparison/Contingency baseline: synthetic implicits word pairs
Expansion/Temporal baseline: real implicits word pairs
99
Further experiments using context

Results from classifying each relation
independently
Naïve Bayes, MaxEnt, AdaBoost
Since context features were helpful, tried CRF
6-way classification, word pairs as features
Naïve Bayes accuracy: 43.27%
CRF accuracy: 44.58%

100
Do we need more coherence factors? (Louis and
Nenkova, 2010) [34]

If we had perfect co-reference and discourse
relation information, would we be able to explain
local discourse coherence?
Our recent corpus study indicates the answer is
NO
30% of adjacent sentences in the same paragraph
in PDTB
Neither share an entity nor have an implicit
comparison, contingency or temporal relation
Lexical chains?

101
References

[1] Burstein, J. and Chodorow, M. (in press).
Progress and new directions in technology for
automated essay evaluation. In R. Kaplan (Ed.),
The Oxford handbook of applied linguistics (2nd
Ed.). New York: Oxford University Press.
[2] Heilman, M., Collins-Thompson, K., Callan,
J., and Eskenazi, M. (2007). Combining Lexical
and Grammatical Features to Improve Readability
Measures for First and Second Language Texts.
Proceedings of the Human Language Technology
Conference. Rochester, NY.
[3] S. Petersen and M. Ostendorf, A machine
learning approach to reading level assessment,
Computer, Speech and Language, vol. 23, no. 1,
pp. 89-106, 2009
[4] Finding High Quality Content in Social Media,
Eugene Agichtein, Carlos Castillo, Debora Donato,
Aristides Gionis, Gilad Mishne, ACM Web Search
and Data Mining Conference (WSDM), 2008
[5] Regina Barzilay and Lillian Lee, Catching the
Drift: Probabilistic Content Models, with
Applications to Generation and Summarization,
HLT-NAACL 2004: Proceedings of the Main
Conference, pp. 113-120, 2004

102
References

[6] Emily Pitler, Annie Louis and Ani Nenkova,
Automatic Evaluation of Linguistic Quality in
Multi-Document Summarization, Proceedings of ACL
2010
[7] Schwarm, S. E. and Ostendorf, M. 2005.
Reading level assessment using support vector
machines and statistical language models. In
Proceedings of ACL 2005.
[8] Jieun Chae, Ani Nenkova. Predicting the
Fluency of Text with Shallow Structural Features:
Case Studies of Machine Translation and
Human-Written Text. In Proceedings of EACL 2009,
139-147
[9] Charniak, E. and Johnson, M. 2005.
Coarse-to-fine n-best parsing and MaxEnt
discriminative reranking. In Proceedings of ACL
2005.
[10] K. Collins-Thompson and J. Callan. (2004). A
language modeling approach to predicting reading
difficulty. Proceedings of HLT/NAACL 2004.
[11] Sarah E. Schwarm and Mari Ostendorf. Reading
Level Assessment Using Support Vector Machines
and Statistical Language Models. In Proceedings
of ACL, 2005.

103
References

[12] Automatically generating Wikipedia articles:
A structure-aware approach, C. Sauper and R.
Barzilay, ACL-IJCNLP 2009
[13] Halliday, M. A. K., and Ruqaiya Hasan.
1976. Cohesion in English. London: Longman
[14] B. Grosz, A. Joshi, and S. Weinstein. 1995.
Centering: a framework for modelling the local
coherence of discourse. Computational
Linguistics, 21(2):203-226
[15] E. Miltsakaki and K. Kukich. 2000. The role
of centering theory's rough-shift in the teaching
and evaluation of writing skills. In Proceedings
of ACL'00, pages 408-415.
[16] Karamanis, N., Mellish, C., Poesio, M., and
Oberlander, J. 2009. Evaluating centering for
information ordering using corpora. Comput.
Linguist. 35, 1 (Mar. 2009), 29-46.
[17] Regina Barzilay, Mirella Lapata, "Modeling
Local Coherence: An Entity-based Approach",
Computational Linguistics, 2008.
[18] Ani Nenkova, Kathleen McKeown. References to
Named Entities: a Corpus Study. HLT-NAACL 2003

104
References

[19] Micha Elsner, Eugene Charniak.
Coreference-inspired Coherence Modeling. ACL
(Short Papers) 2008, 41-44
[20] Morris, J. and Hirst, G. 1991. Lexical
cohesion computed by thesaural relations as an
indicator of the structure of text. Comput.
Linguist. 17, 1 (Mar. 1991), 21-48.
[21] Regina Barzilay and Michael Elhadad, "Text
summarizations with lexical chains", In Inderjeet
Mani and Mark Maybury, editors, Advances in
Automatic Text Summarization. MIT Press, 1999.
[22] Silber, H. G. and McCoy, K. F. 2002.
Efficiently computed lexical chains as an
intermediate representation for automatic text
summarization. Comput. Linguist. 28, 4 (Dec.
2002), 487-496.
[23] Mirella Lapata, Probabilistic Text
Structuring: Experiments with Sentence Ordering,
Proceedings of ACL 2003.
[24] Discourse generation using utility-trained
coherence models, R. Soricut and D. Marcu,
COLING-ACL 2006

105
References

[25] Emily Pitler and Ani Nenkova. Using Syntax
to Disambiguate Explicit Discourse Connectives in
Text. Proceedings of ACL, short paper, 2009
[26] Radu Soricut and Daniel Marcu. 2003.
Sentence Level Discourse Parsing using Syntactic
and Lexical Information. Proceedings of the Human
Language Technology and North American
Association for Computational Linguistics
Conference (HLT/NAACL-2003)
[27] Ben Wellner, James Pustejovsky, Catherine
Havasi, Roser Sauri and Anna Rumshisky.
Classification of Discourse Coherence Relations:
An Exploratory Study using Multiple Knowledge
Sources. In Proceedings of the 7th SIGDIAL
Workshop on Discourse and Dialogue
[28] Daniel Marcu and Abdessamad Echihabi (2002).
An Unsupervised Approach to Recognizing Discourse
Relations. Proceedings of the 40th Annual Meeting
of the Association for Computational Linguistics
(ACL-2002)
[29] Sasha Blair-Goldensohn, Kathleen McKeown,
Owen Rambow. Building and Refining
Rhetorical-Semantic Relation Models. HLT-NAACL
2007, 428-435

106
References

[30] Sporleder, C. and Lascarides, A. 2008. Using
automatically labelled examples to classify
rhetorical relations: An assessment. Nat. Lang.
Eng. 14, 3 (Jul. 2008), 369-416.
[31] Emily Pitler, Annie Louis, and Ani Nenkova.
Automatic Sense Prediction for Implicit Discourse
Relations in Text. Proceedings of ACL, 2009.
[32] Ziheng Lin, Min-Yen Kan and Hwee Tou Ng
(2009). Recognizing Implicit Discourse Relations
in the Penn Discourse Treebank. In Proceedings of
EMNLP
[33] Lapata, Mirella and Alex Lascarides. 2004.
Inferring Sentence-internal Temporal Relations.
In Proceedings of the North American Chapter of
the Association of Computational Linguistics,
153-160.
[34] Annie Louis and Ani Nenkova, Creating Local
Coherence: An Empirical Assessment, Proceedings
of NAACL-HLT 2010