Investigating Chinese Learner English

1
Investigating Chinese Learner English
  • Centre for Linguistics and Applied Linguistics,
  • Guangdong University of Foreign Studies
  • Gui Shichun

2
Background
  • The corpus consists of one million words of
    written compositions by 5 types of learners:
    senior middle-school, tertiary college English
    (band 4), tertiary college English (band 6),
    tertiary majors in English (1st and 2nd years),
    and tertiary majors in English (3rd and 4th
    years). The corpus is annotated with grammatical
    tags (automatically) and error tags (manually).
  • It is available at
    http://www.clal.org.cn/baseinfo/achievement/Achievement1.htm

3
(No Transcript)
4
Areas of Investigation
  • Leech (1998) raises two specific questions in
    connection with the study of learner language:
  • What are the particular areas of overuse,
    underuse and error which native speakers of
    language A are prone to in learning target
    language T, as contrasted with native speakers of
    languages B, C, D . . . ?
  • What, in general, is the proportion of non-native
    target language behaviour (overuse, underuse,
    error) peculiar to native speakers of language A,
    as opposed to such behaviour which is shared by
    all learners of the language, whatever their
    mother tongue?

5
  • Contrastive study must be done very carefully,
    because the corpora under investigation are based
    on different types of language performance. One
    of the key issues is to identify the context-free
    variables, e.g. functional words and some of the
    most frequently used notional words. We believe
    that an annotated learner corpus is useful in the
    following ways:

6
  • Identifying the words and structures that are
    typically underused or overused in the learner
    corpus
  • Identifying the kinds of error learners at
    different levels are likely to commit
  • Predicting the language proficiency of the
    learners
  • Providing diagnostic information to both the
    teachers and the learners.

7
Comparison of grammatical tags
  • In our POS tagging program, we used the same
    133 × 133 matrix of tag-transition frequencies,
    and had CLEC grammatically tagged automatically.
    Then we compared the grammatical tags of CLEC,
    Brown, and LOB.
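
As a rough illustration of what a matrix of tag-transition frequencies holds, the sketch below counts how often one tag directly follows another in already-tagged text. The three-tag set and the sample sequences are hypothetical stand-ins for the full tag set, and the function names are invented here; this is not the tagging program used for CLEC.

```python
from collections import Counter


def transition_counts(tagged_sentences):
    """Count how often each tag is immediately followed by another tag."""
    counts = Counter()
    for tags in tagged_sentences:
        # Pair every tag with its right-hand neighbour in the same sentence.
        for prev_tag, next_tag in zip(tags, tags[1:]):
            counts[(prev_tag, next_tag)] += 1
    return counts


def transition_matrix(counts, tagset):
    """Lay the counts out as a square matrix: rows = first tag, columns = next tag."""
    index = {tag: i for i, tag in enumerate(tagset)}
    matrix = [[0] * len(tagset) for _ in tagset]
    for (prev_tag, next_tag), n in counts.items():
        matrix[index[prev_tag]][index[next_tag]] = n
    return matrix


# Hypothetical toy data: AT = article, NN = noun, VB = verb.
TAGSET = ["AT", "NN", "VB"]
SAMPLE = [["AT", "NN", "VB"], ["AT", "NN"]]

if __name__ == "__main__":
    for row in transition_matrix(transition_counts(SAMPLE), TAGSET):
        print(row)
```

With a real tag set the matrix simply grows to one row and one column per tag; a probabilistic tagger would then normalise each row into transition probabilities.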

8
  • The native corpora (Brown and LOB) are fairly
    consistent in terms of their grammatical tagging.
  • Chinese learners used more pronouns, but fewer
    determiners, prepositions, and numerals. Use of
    more pronouns and fewer numerals reflects the
    differences of subject matter between the learner
    corpus and the native speaker corpus, because
    what the learners have written is related to
    their personal and school life and activities.
    But use of fewer determiners and prepositions may
    have something to do with the learners' problems
    in their writing.
  • Another step forward is to study in greater
    detail each category of the tagging scheme. Let's
    look at the use of determiners as an example.
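
The comparison described on this slide amounts to normalising tag counts by corpus size and checking which categories fall above or below the native-speaker rate. A minimal sketch follows; the counts are made-up placeholders, not figures from CLEC, Brown or LOB, and the function names are illustrative only.

```python
def relative_freq(tag_counts):
    """Turn raw tag counts into proportions of all tagged tokens."""
    total = sum(tag_counts.values())
    return {tag: n / total for tag, n in tag_counts.items()}


def usage_ratio(learner_counts, native_counts):
    """Ratio > 1 suggests overuse by learners, < 1 suggests underuse."""
    learner = relative_freq(learner_counts)
    native = relative_freq(native_counts)
    return {
        tag: learner[tag] / native[tag]
        for tag in learner
        if native.get(tag, 0) > 0  # skip tags absent from the native corpus
    }


# Placeholder counts for illustration only (not real corpus data).
LEARNER = {"pronoun": 120, "determiner": 80, "other": 800}
NATIVE = {"pronoun": 80, "determiner": 120, "other": 800}

if __name__ == "__main__":
    for tag, ratio in sorted(usage_ratio(LEARNER, NATIVE).items()):
        print(f"{tag}: {ratio:.2f}")
```

A ratio alone says nothing about significance; it only shows the direction of the difference for each category.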

9
(No Transcript)
10
(No Transcript)
11
Some observations of Chinese learners' use of
determiners
  • Chinese learners used fewer determiners, but the
    total frequencies of ST6 learners were closer to
    those of the native speakers.
  • Chinese learners used fewer articles; the, no,
    a, and any were the most underused determiners.
  • A tendency can be observed: the more proficient
    the learners, the closer their use of some
    determiners is to that of the native speakers,
    for example quite, rather, half, all, both,
    these, those, many, much, next, former, and
    other. The last five are post-determiners, which
    were used much more often by native speakers.
    They can be considered discourse markers. We
    hypothesize that they can be used for text
    identification, as in the automatic scoring of
    learners' compositions.

12
Underuse of the learner lexicon
  • The most frequently used lexical items are
    more or less context-free, so they are a suitable
    place to start our analysis. They include:
  • Most functional words, like determiners and
    prepositions
  • Some of the modal or auxiliary verbs
  • Some polysemous words like go, make, take,
    great, risk, etc.
  • Some pronouns, especially personal pronouns.

13
Overuse of Modal Verbs (can, may, must, should)
14
  • As observed by Biber et al., can (ability or
    permission) and must (logical necessity) are much
    more common in conversation, so the overuse of
    can and must can be considered an indication of
    Chinese learners' writing style. They make no
    distinction between the styles of the spoken and
    the written forms. The materials of CET learners
    were collected mostly from CET test papers, yet
    they displayed the greatest number of uses of
    can, must, and should. Chinese learners tend to
    write down what they speak, though they may not
    be well versed in speaking, as is indicated by
    the underuse of could, have to, had better and
    have got to.

15
Comparing keyness of CLEC and FLOB
  • By using the keyness programme of WordSmith, we
    are able to identify underuse in the learner
    corpus in terms of keyness, which is the classic
    chi-square test of significance with Yates
    correction for a 2 × 2 table. For better
    estimation of keyness, Ted Dunning's Log
    Likelihood test is used when contrasting long
    texts or a whole genre against the reference
    corpus. The higher the chi-square value, the
    greater the difference between the frequencies
    in the two corpora under observation.
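
The two statistics named on this slide can be sketched for a single word as follows. The table layout (word count against all other tokens, in each of two corpora) and the function names are assumptions made for illustration, and the log-likelihood is computed over all four cells, which may differ from the variant a particular tool implements. The numbers in the usage lines are invented, not CLEC or FLOB frequencies.

```python
import math


def keyness_table(freq_a, size_a, freq_b, size_b):
    """2 x 2 table: the word vs. all other tokens, corpus A vs. corpus B."""
    return [[freq_a, freq_b], [size_a - freq_a, size_b - freq_b]]


def chi_square_yates(table):
    """Pearson chi-square for a 2 x 2 table with Yates continuity correction."""
    (a, b), (c, d) = table
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    # Yates: shrink |ad - bc| by n/2, but never below zero.
    diff = max(0.0, abs(a * d - b * c) - n / 2)
    return n * diff ** 2 / denom


def log_likelihood(table):
    """Log-likelihood ratio (G2) summed over all four cells of the table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    if n == 0:
        return 0.0
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    g2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            if observed > 0:  # a zero cell contributes nothing
                expected = rows[i] * cols[j] / n
                g2 += observed * math.log(observed / expected)
    return 2 * g2


if __name__ == "__main__":
    # Invented example: 10 hits in 1,000 learner tokens, 30 in 1,000 reference tokens.
    table = keyness_table(10, 1000, 30, 1000)
    print(round(chi_square_yates(table), 2), round(log_likelihood(table), 2))
```

Neither value carries a sign, so whether a key word is underused or overused has to be read from the relative frequencies in the two corpora.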

16
Fewer third person pronouns
  • This is the result of transfer from Chinese,
    because in modern colloquial Chinese the third
    person pronouns make no gender distinction.
    There is no underuse of first and second person
    pronouns. ST3-4 learners show wider
    discrepancies, because their compositions are
    mainly thematic writing.

17
Fewer passive voice constructions
  • This is shown by the underuse of been and by,
    and partially by that of was and were. The ST3-4
    group and the ST5-6 group seem to follow the same
    tendency.

18
Fewer relative clauses
  • Chinese learners tend to use fewer relative
    clauses, as is shown by the underuse of wh-words.
    The discrepancies of ST5-6 are smaller, showing
    that they are closer to the native speakers.

19
Contrastive analysis of risk and its synonyms
across a few corpora
Using more danger than risk
20
  • While the frequencies of risk, danger, threat,
    and hazard are fairly consistent in the native
    speaker corpora, the performance of Chinese
    learners is quite different.
  • Danger is a more generic term. The following
    errors are produced as a result of the generic
    use of danger:
  • Fake furniture brings danger to people.
    (It is risky buying fake furniture.)
  • Water is facing the danger of shortage.
    (We are facing the threat of water shortage.)
  • Their knowledge of risk is quite limited. They
    know how to use take the risk (8), at the
    risk (3) and to risk (6), whereas native speakers
    say avoid/carry/eliminate/ignore/increase/
    involve/give/reduce/run/worth/lack of the risk
    and conventional/maximum/no/some/suicide/own/
    unnecessary/hazard/with/without risk.
  • Chinese learners do not know how to use high
    risk, which is used quite often in the native
    speaker corpora.
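
Collocate lists like the ones above are usually produced by counting the words that occur near a node word. The sketch below counts tokens within a small window to the left of the node; the tokenisation, the window size, the function name and the sample sentences are all assumptions made for illustration, and no corpus data is used.

```python
import re
from collections import Counter


def left_collocates(text, node, window=2):
    """Count words appearing within `window` tokens to the left of `node`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == node:
            counts.update(tokens[max(0, i - window):i])
    return counts


# Invented sentences for illustration; not taken from any corpus.
SAMPLE = "They take the risk. We run the risk of failure. He took a high risk."

if __name__ == "__main__":
    for word, n in left_collocates(SAMPLE, "risk").most_common():
        print(word, n)
```

Running the same count over a learner corpus and a native-speaker corpus would show which collocates of the node word are missing from learner writing.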

21
Analyzing learner errors: The Cognitive Model
  • We use error as a cover term for all ways of
    being wrong as an FL learner. Errors are results
    of uncertainty in language performance, and
    there are various kinds of uncertainty that can
    be traced back to cognition:
  • False analogy: books, news →
    knowledges, informations
  • Incomplete application of rules:
    development → advantagement
  • Redundancy: it was a
    three-story-tall building
  • Overgeneralization: entered the
    classroom → returned the classroom

22
  • Verbal behaviour (errors as well as linguistic
    structures) can be considered as an emergence
    process, as a result of competition of cues.
  • To set up our cognitive framework of error
    analysis we make use of only those errors whose
    frequencies are well above 1% of the total. There
    are altogether 21 error types.
  • Errors can be divided into several levels that
    are equivalent to the processes of lexicalization
    → syntacticalization → relexicalization
    (Skehan, 1998).
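
The frequency cut-off used to select error types can be sketched as a simple filter over error-tag counts. Reading the cut-off as 1 percent of all tagged errors is an assumption, and the tag names and counts below are hypothetical, not the CLEC error tags or their frequencies.

```python
def frequent_error_types(error_counts, threshold=0.01):
    """Keep error types whose share of all tagged errors exceeds `threshold`."""
    total = sum(error_counts.values())
    if total == 0:
        return {}
    return {
        tag: n for tag, n in error_counts.items() if n / total > threshold
    }


# Hypothetical tags and counts, for illustration only.
COUNTS = {"spelling": 500, "article": 300, "agreement": 195, "rare_type": 5}

if __name__ == "__main__":
    print(sorted(frequent_error_types(COUNTS)))
```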

23
  • Lexical perceptual level, also known as
    substance errors (James, 1998), and defined by
    MacWhinney as the level that involves the
    acquisition of basic lexical structures in small
    areas of cortex called local maps. These errors
    are related to perceptual representations,
    especially to memory, such as memory failure or
    memory distortion. Typically they can be
    identified at the single-word level, as
  • spelling or number errors (great → graet;
    information → informations),
  • or by looking at a word's close neighbors, as
  • absence of articles or prepositions (the moon
    is ^ brightest; I dressed myself in ^ hurry; I
    sat back ^ my chair).

24
  • Lexico-grammatical (or lexical grammatical)
    level: misconception of the target language
    system. When looking at the errors of our
    learners, it is very difficult to isolate grammar
    and lexis into separate categories, because
    grammar does not exist on its own. James defines
    these as text-level errors, whereas MacWhinney
    chooses to call it the level that involves the
    interaction between lexical structures in terms
    of lexical groups. Typically these errors can be
    identified at the inter-word level, by looking at
    the word and its neighbors:
  • using the wrong part of speech (POS errors: It
    is not difficulty that we can find)
  • wrong word (substitution errors: If you match
    difficult problem; People take (pay) more
    attention to it)
  • wrong collocates (They must listen to the lesson
    more carefully.)
  • verb agreement (People argues that euthanasia or
    mercy killing is humane.)
  • reference (My aunt came to my home with his
    son.).

25
  • Syntactic level. Errors can be identified in a
    broader context, at the sentential level. James
    chooses to call these discourse-level errors, but
    we propose to reserve the word discourse for
    another, higher level. L2 learners may often
    produce grammatical sentences that sound foreign
    (Pawley and Syder, 1983). MacWhinney defines it
    as the level that involves the processing of
    syntactic information across longer neural
    distances in functional neural circuits.
    Syntactic errors vary from

26
  • capitalization (he learned English and Russian
    and Wrote the Civil War in France. )
  • punctuation (When playing football or basketball.
    You might be using 400 calories an hour.), to
  • run-on sentences (If I am not famous, it doesn't
    matter, I don't mind this.),
  • fragmentary sentences (As they do more exercises
    and often think deeply.) and
  • structural deficiency (During I spent my holidays
    in Beijing about ten years ago,).

27
  • Figure 9: The Cognitive Model

28
  • Confirmatory factor analysis was conducted
    using LISREL 8.50. It shows clearly that there
    are 3 factors, grouped under the 3 categories
    defined above. Path analysis shows that all the
    parameters (values of λ) of the hypothesized
    paths are significant except that for run-on
    sentences.

29
Lambda = 0.28, insignificant
30
Correspondence Analysis (learner types by error
types)
31
Some General Remarks
  • On the whole, identifying errors at 3 levels
    seems to be working well in our cognitive model.
    So far we've not covered errors at the discourse
    level, because:
  • It is difficult to set down the standards for
    native-like selection as defined by Pawley and
    Syder (1983).
  • It is even more difficult for Chinese markers of
    errors to observe the standards.

32
  • The grouping of errors is not as clear-cut as
    we had thought. Very often the same type of
    error can be put into different categories, or
    can occur at 3 different levels depending on the
    situation. We can only say that the grouping
    follows the main tendency.
  • At every level, language transfer seems to play
    an important role. This is because the adult
    learners have set up their L1 (more complete)
    linguistic system and are in the process of
    setting up another linguistic system (rather
    incomplete). As mature learners, when they want
    to express their complex thinking, they often
    fall back on using the linguistic system that is
    more familiar to them.

33
  • Occurrences of errors depend very much on the
    writing task and the learners' certainty of
    fulfilling the task. They may not be an
    indication of the language proficiency of the
    learners. CET learners tend to commit more
    lexico-grammatical errors because their data were
    collected mainly from CET compositions.
  • This points to the necessity of inclusion of more
    learner data, so that we can have a more balanced
    collection of error types for further
    investigation.

34
Thanks!
Your Comments are Welcome!