Projet CorrecTools

1
An English Writing Assistant for Non-Native
Speakers
  • Projet CorrecTools
  • (CAPRA: Compagnon d'Apprentissage et de
    Perfectionnement à la Rédaction en Anglais)
  • M. Garnier, A. Rykner
  • Université Toulouse 2
  • P. Saint-Dizier
  • CNRS
  • France

2
Introduction
  • English: main language for international
    communication
  • → Necessity for Non-Native Speakers of English
    (NNS) to produce satisfactory English texts
    (personal/professional spheres)
  • Learning and practice: necessary requirements for
    long-term acquisition of writing skills
  • Each language and linguistic community encounters
    specific problems in writing English (language
    transfer)
  • Need for an automatic English writing
    assistant
  • Presentation of our project:
    • Aims and challenges
    • Corpus constitution and error analysis
    • Annotation of errors
    • Some results of the analysis of a Thai to English
      corpus

3
1. Aims and challenges
  • Presence of grammatical, lexical and stylistic
    errors in the productions of NNS
    • Make comprehension difficult and damage
      credibility
    • Many errors are not handled by text editors
      such as MS Word
  • Didactic perspective: explanation of errors and
    grammar rules given alongside the corrections
  • Focus on pairs of languages (French to English,
    Thai to English)
    • Prototypicality of errors → easier correction
      process
    • Knowledge of the L1 → more efficient analysis and
      correction of errors

4
2. Corpus constitution and error analysis
  • Exploratory corpus: emails, reports, scientific
    publications, web pages, blogs
  • Parameters:
    • Variety of authors (professionals, researchers,
      students)
    • Different domains of production (business,
      research, personal sphere)
    • Different levels of control, i.e. amount of care
      devoted to the production of a document
  • First stage: manual detection, annotation, and
    correction of errors
  • Classification of errors: creation of a system of
    categories
  • Characteristics of the system:
    • Categories created according to linguistic
      criteria, i.e. NP, PP, VP, Clause and Sentence
    • Inclusion of two levels of subtypes of errors
      inside main categories
    • Inclusion of indications concerning broad
      linguistic parameters: Lexicon, Morpho-Syntax,
      Syntax, Semantics, Style

5
2. Corpus constitution and error analysis (2)
  • Example

6
3. The annotation of errors
  • Errors are annotated using a standard XML
    formalism enriched with attributes
  • Schema designed so as to reflect cognitive
    strategies used by human correctors when
    detecting and correcting errors
  • Delimitation and characterization of errors

7
3. The annotation of errors (2)
  • Delimitation and characterization of corrections

8
3. The annotation of errors (3)
  • Example of an annotated error with multiple
    corrections
  • The second stage therefore has two goals ...
    and the construction of the meaning utterance.
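
The annotation figures are not preserved in this transcript. As a rough illustration only, an error with several candidate corrections could be encoded along the following lines. Every tag and attribute name here (sentence, error, segment, correction, category, subtype, axis, rank) is hypothetical and not taken from the project's schema, and the NP / Syntax labels given to the example are an assumption.

```python
import xml.etree.ElementTree as ET


def annotate(before, erroneous, after, category, subtype, axis, corrections):
    """Build a sentence element with one delimited error and its corrections.

    All element and attribute names are hypothetical placeholders; the
    project's actual XML schema is not shown in the slides.
    """
    sentence = ET.Element("sentence")
    sentence.text = before  # text preceding the erroneous span
    error = ET.SubElement(
        sentence, "error", category=category, subtype=subtype, axis=axis
    )
    segment = ET.SubElement(error, "segment")
    segment.text = erroneous  # the delimited erroneous span
    for rank, text in enumerate(corrections, start=1):
        correction = ET.SubElement(error, "correction", rank=str(rank))
        correction.text = text
    error.tail = after  # text following the erroneous span
    return sentence


sentence = annotate(
    "It will decrease ", "the plant quality", ".",
    category="NP",                      # assumed main category
    subtype="noun-noun-construction",   # assumed subtype label
    axis="Syntax",                      # assumed broad linguistic parameter
    corrections=["the quality of the plant", "the plant's quality"],
)
print(ET.tostring(sentence, encoding="unicode"))
```

Keeping the erroneous span and each correction in separate child elements is one way to let a single error carry several alternative corrections; the real schema may organise this differently.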

9
4. Some results on a Thai-English corpus
  • Preliminary study conducted on a limited corpus
    of English texts written by Thai native speakers
  • Description of corpus:
    • 10 scientific abstracts
    • 1755 words
    • Various research domains and writers
  • Steps completed so far:
    • Detection of errors
    • Classification of errors
    • Highlighting several aspects of error
      distribution
  • Future steps:
    • Annotation of errors
    • Collaboration with Thai native speakers in order
      to study the extent of transfer effects
    • Towards a correction system?

10
4. Some results on a Thai-English corpus (2)
  • Distribution of errors according to broad
    linguistic parameters (number of subtypes of
    errors vs. number of errors in total for each
    axis)

[Chart not preserved; surviving axis labels: Lexicon, Morpho-Syntax]

11
4. Some results on a Thai-English corpus (3)
  • Distribution of errors according to main
    categories of our system (number of subtypes of
    errors vs. number of errors in total for each
    category)
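
The charts themselves are not preserved here. The two quantities they compare (number of distinct subtypes vs. total number of errors per category) could be computed as in the minimal sketch below. The records are toy data invented for illustration, not the project's results, and the assignment of each subtype to NP or VP is an assumption.

```python
from collections import defaultdict

# Toy (category, subtype) records, for illustration only.
# These are NOT counts from the Thai-English corpus.
errors = [
    ("NP", "omission of determiner"),
    ("NP", "omission of determiner"),
    ("NP", "omission of plural"),
    ("NP", "abusive noun-noun construction"),
    ("VP", "erroneous subject/verb agreement"),
]


def distribution(records):
    """Return {category: (number of distinct subtypes, total errors)}."""
    totals = defaultdict(int)
    subtypes = defaultdict(set)
    for category, subtype in records:
        totals[category] += 1
        subtypes[category].add(subtype)
    return {c: (len(subtypes[c]), totals[c]) for c in totals}


summary = distribution(errors)
for category, (n_subtypes, n_errors) in sorted(summary.items()):
    print(f"{category}: {n_subtypes} subtypes, {n_errors} errors")
```

The same function would apply to the broad linguistic parameters (Lexicon, Morpho-Syntax, Syntax, Semantics, Style) by replacing the category label with the axis label in each record.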

12
4. Some results on a Thai-English corpus (4)
  • Distribution of errors according to subtypes of
    errors
  • Main types of errors: omission of determiner,
    omission of plural, erroneous subject/verb
    agreement, abusive NØN construction

13
4. Some results on a Thai-English corpus (5)
  • Omission of determiner
    • World of information technologies can be
      classifiied into 2 main groups.
    • → The world of information technologies can be
      classified into 2 main groups.
  • Omission of plural
    • Reading from book and website is a way to
      diagnose diseases.
    • → Reading from books and websites is a way to
      diagnose diseases.
  • Erroneous subject/verb agreement
    • Precision depend on noise in each website.
    • → Precision depends on noise in each website.
  • Abusive NØN construction
    • It will decrease the plant quality.
    • → It will decrease the quality of the plant /
      the plant's quality.
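
For pairs like these, the erroneous span can be delimited automatically by aligning the original sentence with its correction. The sketch below uses a plain word-level diff from Python's standard difflib module; this is a swapped-in illustration, not the manual procedure described in the slides, and the function name delimit is hypothetical.

```python
import difflib


def delimit(erroneous, corrected):
    """Return (operation, original words, corrected words) for each difference.

    A word-level diff is only a rough stand-in for manual delimitation:
    it finds where two sentences differ, not why.
    """
    a, b = erroneous.split(), corrected.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (tag, " ".join(a[i1:i2]), " ".join(b[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


agreement = delimit(
    "Precision depend on noise in each website.",
    "Precision depends on noise in each website.",
)
plural = delimit(
    "Reading from book and website is a way to diagnose diseases.",
    "Reading from books and websites is a way to diagnose diseases.",
)
print(agreement)
print(plural)
```

A diff of this kind only locates the changed words; assigning a category and subtype to each span would still require the linguistic analysis the project describes.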

14
Perspectives
  • French to English:
    • Extend the initial corpus
    • Investigate the relevance of learner corpora
    • Stabilize the classification system and the
      annotation schema
    • Focus on certain errors and start drafting rules
      for correction
    • Evaluate the needs of a population of users and
      the demand for such a tool
  • Thai to English:
    • Extend the initial corpus
    • Work with Thai researchers to evaluate the needs
      of potential users and assess the quality of the
      analyses proposed
    • Draft a roadmap for the continuation of the
      project in Thailand

15
  • Kop khun khà! (Thank you!)
  • CorrecTools website:
    http://www.irit.fr/recherches/ILPL/webct/ct.html