Title: The Co-occurrence Retrieval Framework applied to Text Classification
1. The Co-occurrence Retrieval Framework applied to Text Classification
- Jonathon Read
- j.l.read_at_sussex.ac.uk
- http://www.sussex.ac.uk/Users/jlr24
2. Outline
- Introduction
- Feature Retrieval for Text Classification
- Model optimisation
- Some initial results
- Some observations
- Future work
3. Introduction
- Co-occurrence Retrieval Framework
- Weeds 2003
- Weeds and Weir 2003
- Measuring the distributional similarity of words using co-occurrence information
4. Introduction
- Measuring the similarity of two feature vectors, w1 and w2
- By analogy with Information Retrieval:
- w1 contains the retrieved features
- w2 contains the desired features
- Precision: the proportion of retrieved features that are correct
- Recall: the proportion of desired features that have been retrieved
5. Introduction
- A similarity metric for feature vectors
- Documents can also be represented as feature vectors; can Co-occurrence Retrieval be used to measure the similarity of documents?
- Test task: Sentiment Classification
6. Feature Retrieval for Text Classification
- A subset, s, is the union of n texts
- A text, t, is a vector of features, f, each with an associated weight, D(t, f)
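A minimal Python sketch of these representations, assuming unigram features and raw frequency as the weight function D(t, f) (both are illustrative choices, not necessarily those used in the talk):

```python
from collections import Counter

def feature_vector(text):
    """A text t as a vector of features f with weights D(t, f).
    Here: unigram features with raw-frequency weights (an assumption)."""
    return dict(Counter(text.lower().split()))

def subset_vector(texts):
    """A subset s as the union of n texts: feature weights are
    accumulated over all member texts."""
    combined = Counter()
    for text in texts:
        combined.update(feature_vector(text))
    return dict(combined)
```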
7. Feature Retrieval for Text Classification
- Subsets and texts are text units, referred to using a polymorphic term, u
- SF is the set of features that are shared by two units of text
8. Feature Retrieval for Text Classification
- The Precision of u1's retrieval of u2's features is the proportion of u1's features that appear in both units, weighted by their importance in u1
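The defining equation did not survive extraction; following the Co-occurrence Retrieval formulation of Weeds and Weir (2003), Precision is plausibly

\[
P(u_1, u_2) = \frac{\sum_{f \in SF} D(u_1, f)}{\sum_{f \in F(u_1)} D(u_1, f)}
\]

where F(u_1) denotes the features of u_1 and SF the features shared by the two units.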
9. Feature Retrieval for Text Classification
- The Recall of u1's retrieval of u2's features is the proportion of u2's features that appear in both units, weighted by their importance in u2
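Symmetrically, under the same assumptions,

\[
R(u_1, u_2) = \frac{\sum_{f \in SF} D(u_2, f)}{\sum_{f \in F(u_2)} D(u_2, f)}
\]

so that R(u_1, u_2) = P(u_2, u_1).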
10. Feature Retrieval for Text Classification
- The measures of Precision and Recall may be combined by weighting the harmonic and arithmetic means (using constants β and γ)
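Written out as in Weeds and Weir (2003), on which these slides build (the harmonic-mean term is the familiar F-score; β weights the arithmetic mean, γ weights the two means against each other):

\[
\mathrm{sim}(u_1, u_2) = \gamma \left[ \frac{2\,P\,R}{P + R} \right] + (1 - \gamma)\left[ \beta P + (1 - \beta) R \right]
\]

with P = P(u_1, u_2), R = R(u_1, u_2), and β, γ ∈ [0, 1].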
11. Feature Retrieval for Text Classification
- Given a corpus C
- We can say that a problem text, t, is predicted to be a member of the subset, s, that has the highest similarity score
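A sketch of this decision rule in Python, reusing feature_vector and subset_vector from the earlier sketch; the similarity function follows the β/γ combination just given (a simplified additive model, not necessarily the exact configuration used in the experiments):

```python
def precision(u1, u2):
    """P(u1, u2): weight of u1's features shared with u2,
    as a proportion of u1's total feature weight."""
    shared = set(u1) & set(u2)
    total = sum(u1.values())
    return sum(u1[f] for f in shared) / total if total else 0.0

def similarity(u1, u2, beta=0.5, gamma=0.5):
    p = precision(u1, u2)
    r = precision(u2, u1)          # R(u1, u2) = P(u2, u1)
    harmonic = 2 * p * r / (p + r) if p + r else 0.0
    return gamma * harmonic + (1 - gamma) * (beta * p + (1 - beta) * r)

def classify(text, subsets, beta=0.5, gamma=0.5):
    """Predict the label of the subset most similar to the problem
    text. `subsets` maps a class label to its subset vector."""
    t = feature_vector(text)
    return max(subsets, key=lambda s: similarity(t, subsets[s], beta, gamma))
```

For sentiment classification, for example, `classify(review, {"pos": subset_vector(pos_texts), "neg": subset_vector(neg_texts)})` predicts the polarity of a review.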
12. Feature Retrieval for Text Classification
- Additive models make no distinction about the extent of feature occurrence with respect to each unit
- Extent can be measured in terms of the precision and recall of individual features
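One way to make the measure sensitive to extent, by analogy with the difference-weighted models in Weeds and Weir's framework (an assumption about what the slide intends, since the formula did not survive), is to credit each shared feature only up to the smaller of its two weights:

\[
P_{dw}(u_1, u_2) = \frac{\sum_{f \in SF} \min\big(D(u_1, f),\, D(u_2, f)\big)}{\sum_{f \in F(u_1)} D(u_1, f)}
\]

and analogously for Recall, with the denominator taken over u_2.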
13. Feature Retrieval for Text Classification
- Weight functions
- Determine the importance of each feature in a
given unit of text
14. Feature Retrieval for Text Classification
- Extent functions
- Determine the extent to which a feature goes with
a unit of text
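The captions later in the deck name specific choices (D_wmi, D_z, E_t, E_wmi), whose definitions do not appear on these slides. As an illustration only, weighted mutual information, a standard weight function in the distributional-similarity literature, might be

\[
D_{wmi}(u, f) = p(f \mid u)\, \log \frac{p(f \mid u)}{p(f)}
\]

and a simple extent function might be binary occurrence, E(u, f) = 1 if f occurs in u and 0 otherwise; both are plausible instances rather than the talk's actual definitions.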
15. Model Optimisation
- Sentiment datasets
- Polarity 1.0 (Movie Reviews before 2002)
- Polarity 2004 (Movie Reviews after 2002)
- Newswire (Business news articles)
- Choosing the optimal
- Weight function
- Extent function
- β and γ parameters
16. Model Optimisation
[Table: optimal parameters for each dataset]
17. Some initial results
[Table: five-fold cross-validated accuracies of the classifiers on each dataset, in percent, with standard deviations]
18. Some observations
[Figure: Optimising β and γ using Polarity 1.0, D_z E_wmi]
19. Some observations
[Figure: Optimising β and γ using Polarity 2004, D_z E_wmi]
20. Some observations
[Figure: Optimising β and γ using Newswire, D_wmi E_t]
21. Some observations
[Figure: Optimising β and γ using Newswire, D_z E_wmi]
22. Some observations
- Interpreting the Information Retrieval metaphor for Text Classification
- Precision measures the similarity using the features observed in the problem text
- Recall measures the similarity using the features absent from the problem text
23. Some observations
- The optimised β indicates the relative importance of Precision or Recall in a set
24. Some observations
[Figure: Optimising β using Polarity 1.0, D_wmi E_t, γ = 0; Precision and Recall curves, with β = 0.43 marked]
25. Some observations
[Figure: Optimising β using Polarity 2004, D_wmi E_t, γ = 0; Precision and Recall curves, with β = 0.26 marked]
26. Some observations
- Feature Retrieval differs from other models in that it considers both the presence and absence of features
- But this is also a drawback: significantly greater computational expense!
27. Future work
- Careful optimisation
- New weight and extent functions
- Impact of features
- Unigrams, n-grams, grammatical relations, etc.
- Optimal feature selection
- Assess other text classification problems
- Investigate similarities with Naïve Bayes, etc.