Stylistics and Stylometry

1
Stylistics and Stylometry
  • CSC 5930 Machine Translation
  • Fall 2012 Dr. Tom Way

2
What is style?
  • Term not much loved by linguists
      • Too vague
      • Has connotations in similar fields (style =
        good style, a value judgment)
  • Many books/articles make reference to the etymology
    of the word (Lat. stilus, "pen"), so it follows
    that style is mainly about written language
  • Various definitions, some very close to things
    already seen (especially register)
  • Two main aspects widely assumed:
      • style is choice
      • style is described by reference to something else

3
Style as choice
  • For any intended meaning there is a range of
    alternative ways of expressing that meaning
  • Different choices express nuances
      • of meaning
      • of other things (style?), e.g. buy vs. purchase
  • Example:
      • "Visitors are respectfully informed that the coin
        required for the meter is a quarter; no other
        coin is acceptable"
      • "Quarters only"
  • Propositional meaning is the same; the difference in
    expression conveys something else

4
Style as choice (2)
  • Style is a choice, but often the choice is
    somewhat predetermined
  • For example, a choice between appropriate and
    inappropriate style
  • So perhaps style does not connote good or bad
    but merely the way in which the author expresses
    or conveys things

5
Style and the norm
  • Some writers define style as
      • individual characteristics of a text
      • total sum of deviations from a norm
  • But what is the norm?
  • Is there some form of the language that is
    neutral as regards style?
  • Note also that the norm shifts: for example, many
    works are written in the vernacular of their time
  • Literary stylistics focuses on the exceptional

6
Style and the norm (2)
  • Even if there is no norm, we can describe style
    comparatively
  • Stylistics mainly involves comparing and
    contrasting texts, and associating linguistic
    variance with contextual explanation
  • Some authors see style as being what is added to
    the text

7
Stylistic analysis
  • Gulf between literary and linguistic stylistics
  • Literary criticism focuses on effect on the
    reader, intended or otherwise, so largely
    intuitive and subjective
  • Linguistic stylistics looks for characterisations
    of style (including literary style) in terms of
    linguistic phenomena at the various levels of
    linguistic description

8
Stylistic analysis (2)
  • Inventory of linguistic devices and their effect
      • usually in a contrastive way
      • in contrast with other writers in a similar genre
      • in contrast with other genres
  • Linguistic devices described in terms of the
    usual linguistic levels of description:
    phonology, morphology, lexis, grammar, etc.
  • Effects can be directly expressive, or
    indirectly so, by association
      • Example: onomatopoeia vs. alliteration as a
        phonological device

9
Stylistic analysis (3)
  • Informally identify stylistic features felt to be
    significant
  • Devise a method of analysis which facilitates
    comparison between usages
  • Identify the stylistic function of the features
    so identified

10
Types of features
  • Invariable features, due to the individual or
    the time: usually of little interest
  • Discourse features
      • Medium (≈ Halliday's mode): what features
        distinguish written language from spoken language
      • Participation: e.g. monologue vs. dialogue
  • Province (≈ field): lexis and syntax
  • Status (≈ tenor): features relating to relative
    social standing of writer/speaker and
    reader/listener
  • Modality (≈ text type): e.g. message delivered as a
    letter, postcard, text message, email, etc.
  • Singularity: deliberate occasional idiosyncrasies

11
Method and function
  • Methods and features determine each other
      • you can only measure features that you can
        extract
      • simple counting features are easy to extract
      • more complex features can be extracted thanks to
        NLP techniques of corpus annotation (tagging,
        parsing, etc.)
  • Describing the function of observed differences
      • could be based on intuition
      • or (see later) partially automated (factor
        analysis)

12
What to count
  • Simple things may characterize different styles
      • average sentence length
      • average word length
      • type-token ratio (vocabulary richness)
          • number of types = number of different words
          • number of tokens = total number of words
      • vocabulary growth (homogeneity of text)
          • number of new types in the 1st, 2nd, ..., nth
            1000 words
          • in rich, varied text, the number will climb
            steadily
  • Especially when used comparatively
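
The counts listed above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names, the regular expressions, and the sample text are all assumptions made here, and the definition of "word" and "sentence" is deliberately crude. It also assumes non-empty input.

```python
import re

def tokenize(text):
    """Lowercased word tokens; keeps internal hyphens and apostrophes."""
    return re.findall(r"[a-z]+(?:['-][a-z]+)*", text.lower())

def split_sentences(text):
    """Naive split: ., ! or ? followed by whitespace ends a sentence."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def simple_counts(text):
    tokens = tokenize(text)
    types = set(tokens)
    sentences = split_sentences(text)
    return {
        "tokens": len(tokens),                          # total number of words
        "types": len(types),                            # number of different words
        "type_token_ratio": len(types) / len(tokens),
        "avg_word_length": sum(len(t) for t in tokens) / len(tokens),
        "avg_sentence_length": len(tokens) / len(sentences),
    }

def vocabulary_growth(tokens, window=1000):
    """Number of previously unseen types in each successive window."""
    seen, growth = set(), []
    for start in range(0, len(tokens), window):
        chunk = set(tokens[start:start + window])
        growth.append(len(chunk - seen))
        seen |= chunk
    return growth

sample = "The cat sat on the mat. The dog sat on the log."
stats = simple_counts(sample)
print(stats["tokens"], stats["types"])                # 12 7
print(vocabulary_growth(tokenize(sample), window=6))  # [5, 2]
```

The tiny window of 6 is only there to make the toy example show something; on real texts the window would be 1000 words, as on the slide. The type-token ratio depends heavily on text length, so it is only meaningful when comparing samples of similar size.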

13
What to count (2)
  • More complex analyses can give a more interesting
    picture
      • specific syntactic structures
      • degree of modification in NPs
      • types of verbs (e.g. verbs of persuasion, speech
        verbs, action verbs, descriptive verbs)
      • distribution of pronouns (1st/2nd/3rd person)
      • etc. (anything you can think of)
  • Quite sophisticated mathematical techniques can
    give an overall picture
      • e.g. factor analysis identifies, from a (big) range
        of variables, which ones best identify/characterize
        differences
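
As one concrete case, the distribution of pronouns by person can be approximated with a word list. The sketch below is an assumption-laden illustration: the `PRONOUNS` table and the function name are invented here, the list covers only common English forms, and ambiguous forms (e.g. "mine" as a noun, "her" as a determiner) are counted without disambiguation.

```python
import re
from collections import Counter

# Illustrative word lists only; not an exhaustive inventory.
PRONOUNS = {
    "1st": {"i", "me", "my", "mine", "myself",
            "we", "us", "our", "ours", "ourselves"},
    "2nd": {"you", "your", "yours", "yourself", "yourselves"},
    "3rd": {"he", "him", "his", "himself", "she", "her", "hers", "herself",
            "it", "its", "itself", "they", "them", "their", "theirs",
            "themselves"},
}

def pronoun_distribution(text):
    """Share of 1st/2nd/3rd person forms among all pronouns found."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for tok in tokens:
        for person, forms in PRONOUNS.items():
            if tok in forms:
                counts[person] += 1
    total = sum(counts.values())
    return {p: (counts[p] / total if total else 0.0) for p in PRONOUNS}

dist = pronoun_distribution("I told you that she had seen us before they left.")
print(dist)  # 1st: 0.4, 2nd: 0.2, 3rd: 0.4
```

A tagged corpus would give a more reliable count, since a tagger can separate pronouns from homographs.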

14
Normalization and significance
  • Always important to compare like with like
      • It is usual when counting things to normalize
        over the length of the text
      • If one text is longer than the other, of course
        you would expect higher frequencies of everything
  • Issue of statistical significance
      • Small differences may not really tell you
        anything
      • Various measures can confirm whether a difference
        is statistically significant or due to random
        fluctuation
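
A sketch of both ideas, under stated assumptions: counts are normalized to a rate per 1000 words, and a Pearson chi-square test on a 2x2 table (feature vs. all other tokens, text A vs. text B) stands in for the "various measures" mentioned above. The slides do not say which test is meant, so the choice of chi-square is an assumption, and the counts below are invented for illustration.

```python
def per_thousand(count, text_length):
    """Normalize a raw count to a rate per 1000 words."""
    return 1000 * count / text_length

def chi_square_2x2(count_a, length_a, count_b, length_b):
    """Pearson chi-square for one feature in two texts (1 degree of freedom)."""
    observed = [
        [count_a, length_a - count_a],
        [count_b, length_b - count_b],
    ]
    total = length_a + length_b
    row_totals = [length_a, length_b]
    col_totals = [count_a + count_b, total - (count_a + count_b)]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2

# Critical value of chi-square for df = 1 at the 5% level.
CRITICAL_05_DF1 = 3.841

# Hypothetical counts: 120 passives in 40,000 words vs. 90 in 20,000 words.
print(per_thousand(120, 40000))  # 3.0
print(per_thousand(90, 20000))   # 4.5
big = chi_square_2x2(120, 40000, 90, 20000)
print(big > CRITICAL_05_DF1)     # True: unlikely to be random fluctuation

# A small difference: 30 vs. 33 occurrences in two 10,000-word texts.
small = chi_square_2x2(30, 10000, 33, 10000)
print(small > CRITICAL_05_DF1)   # False: could easily be chance
```

The test treats every token as an independent observation, which running text does not really satisfy, so the result is best read as a rough guide.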

15
How to count
  • How to recognize paragraph breaks?
  • How to recognize sentence breaks?
      • Headlines don't end in a full stop
      • Not all sentences end in a full stop
      • Not all full stops are sentence-ending
        (abbreviations)
  • How to count words?
      • Hyphenated words, contractions, e.g. don't
  • How to measure word length/complexity?
      • length only roughly corresponds to complexity
      • number of characters vs. number of syllables
      • counting syllables implies either a dictionary or
        an algorithm
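
The decisions above show up directly in code. The sketch below is one possible set of choices, not a standard: the abbreviation list, the function names, and the syllable heuristic are all assumptions introduced here, and each is known to fail on some inputs (noted in the comments).

```python
import re

# Illustrative list only; a real system needs a far larger one.
ABBREVIATIONS = {"dr", "mr", "mrs", "ms", "prof", "etc", "e.g", "i.e", "vs"}

def split_sentences(text):
    """Split on ., ! or ? at the end of a chunk, skipping known abbreviations.
    Misses sentences that end in an abbreviation or a closing quote."""
    sentences, current = [], []
    for chunk in text.split():
        current.append(chunk)
        if chunk[-1] in ".!?":
            word = chunk.rstrip(".!?").lower()
            if chunk.endswith(".") and word in ABBREVIATIONS:
                continue  # full stop belongs to the abbreviation
            sentences.append(" ".join(current))
            current = []
    if current:  # trailing text without final punctuation (e.g. a headline)
        sentences.append(" ".join(current))
    return sentences

def count_words(text, split_hyphens=False, split_contractions=False):
    """Word count under different policies for hyphens and contractions."""
    words = re.findall(r"[A-Za-z]+(?:['’-][A-Za-z]+)*", text)
    if split_hyphens:
        words = [part for w in words for part in w.split("-")]
    if split_contractions:
        words = [part for w in words for part in re.split(r"['’]", w)]
    return len(words)

def count_syllables(word):
    """Heuristic: count vowel groups, discounting a final silent 'e'.
    Misjudges some words, e.g. it gives 2 for 'style'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)

text = ("Dr. Way teaches the course. Not all full stops end a sentence, "
        "e.g. after abbreviations. Is that clear?")
print(len(split_sentences(text)))                   # 3
print(count_words("Don't use well-known tricks"))   # 4
print(count_words("Don't use well-known tricks",
                  split_hyphens=True, split_contractions=True))  # 6
print(count_syllables("stylometry"))                # 4
```

Whichever policy is chosen, it should be applied identically to every text being compared, so that the counts stay comparable.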

16
More sophisticated counting
  • Tagging and parsing allow you to look at
    grammatical and lexical issues
      • Use of particular POSs (conjunctions, pronouns,
        auxiliaries, modals)
      • Use of particular features (tenses, ...)
      • Use of particular constructions (passives,
        interrogatives)
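
Once a text has been tagged, these counts reduce to simple bookkeeping. The sketch below assumes the tagging has already been done by some external tagger and that tags follow the Penn Treebank convention (DT, NN, VBN, MD, ...); the sample sentence, its tags, and the function names are hypothetical. The passive detector is a crude pattern match, not a parser.

```python
from collections import Counter

# Hypothetical pre-tagged input, Penn Treebank-style tags assumed.
tagged = [
    ("The", "DT"), ("report", "NN"), ("was", "VBD"), ("written", "VBN"),
    ("by", "IN"), ("the", "DT"), ("committee", "NN"), (",", ","),
    ("and", "CC"), ("it", "PRP"), ("should", "MD"), ("be", "VB"),
    ("read", "VBN"), ("carefully", "RB"), (".", "."),
]

BE_FORMS = {"be", "am", "is", "are", "was", "were", "been", "being"}

def pos_rates(tagged_tokens, per=1000):
    """Frequency of each tag per `per` words, ignoring punctuation tags."""
    words = [(w, t) for w, t in tagged_tokens if t[0].isalpha()]
    counts = Counter(t for _, t in words)
    return {tag: per * n / len(words) for tag, n in counts.items()}

def count_passives(tagged_tokens):
    """Crude passive pattern: a form of 'be', an optional adverb (RB),
    then a past participle (VBN)."""
    passives = 0
    for i, (word, _) in enumerate(tagged_tokens):
        if word.lower() not in BE_FORMS:
            continue
        j = i + 1
        if j < len(tagged_tokens) and tagged_tokens[j][1] == "RB":
            j += 1
        if j < len(tagged_tokens) and tagged_tokens[j][1] == "VBN":
            passives += 1
    return passives

rates = pos_rates(tagged)
print(round(rates["MD"], 1))   # modals per 1000 words in this toy sample
print(count_passives(tagged))  # 2
```

The pattern will also match adjectival uses such as "was tired", so any count obtained this way is an upper bound that needs spot-checking.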

17
Quantifying register differences
  • Much work based on corpora trying to quantify and
    characterize register differences
  • Work pioneered by Douglas Biber
  • Simple counts like the ones suggested
  • Also, more complex computations

18
Example
From D. Biber, S. Conrad & R. Reppen, Corpus
Linguistics: Investigating Language Structure and
Use, Cambridge University Press, 1998. Ch. 5: The
study of discourse characteristics
19
Multidimensional analysis
  • Collect a huge range of measures of a wide
    variety
      • some simple word counts
      • syntactic features
      • classes and subclasses of N, V, Adj, Adv
  • Factor analysis

21
150 features in all
22
Factor analysis
  • Statistical method to take a large number of
    apparently random variables and group them
    together into factors
  • Factors will be groups of (+ve and -ve) features
  • Linguist might then try to characterize the
    factors in terms of some psycholinguistic feature
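
The grouping idea can be illustrated without a statistics package. The sketch below is not factor analysis proper and does not reproduce Biber's procedure: it swaps in a simpler relative, the first principal component of the feature correlation matrix, found by power iteration. The feature names and the per-1000-word rates are invented for illustration.

```python
import math

# Hypothetical rates per 1000 words; one row per text, one column per feature.
FEATURES = ["1st-person pronouns", "contractions", "nouns", "prepositions"]
RATES = [
    [60, 40, 150, 80],   # conversation 1
    [55, 35, 160, 85],   # conversation 2
    [40, 20, 200, 100],  # personal letter
    [30, 15, 210, 105],  # fiction
    [10, 5, 280, 130],   # news report
    [5, 2, 300, 140],    # academic prose
]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def correlation_matrix(rows):
    cols = list(zip(*rows))
    return [[pearson(a, b) for b in cols] for a in cols]

def leading_component(matrix, iterations=200):
    """Power iteration: approximates the eigenvector with the largest
    eigenvalue. The overall sign of the result is arbitrary."""
    vec = [1.0 / (i + 1) for i in range(len(matrix))]  # non-uniform start
    for _ in range(iterations):
        vec = [sum(m * v for m, v in zip(row, vec)) for row in matrix]
        norm = math.sqrt(sum(v * v for v in vec))
        vec = [v / norm for v in vec]
    return vec

loadings = leading_component(correlation_matrix(RATES))
for name, weight in zip(FEATURES, loadings):
    print(f"{name:22s} {weight:+.2f}")
```

In this toy data the pronoun and contraction features load with one sign and the noun and preposition features with the opposite sign, which is the kind of positive/negative grouping the slide describes. Interpreting what such a dimension means is still left to the linguist.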

24
Example
  • Biber took two Google classifications of text
    types: Home and Science
  • Harvested 1500 webpages in each category (3.74m
    words)
      • originally got 2500 webpages, but some were not
        suitable

http://jan.ucc.nau.edu/biber/Web text types.ppt
26
Summary of analysis