Matthieu Hermet, Stan Szpakowicz

1
Matthieu Hermet, Stan Szpakowicz
  • Automated Analysis of Students' Free-text Answers
    for Computer-Assisted Assessment
  • University of Ottawa, Canada

2
CAA for CALL
  • To address the specificity of CALL
    (Computer-Assisted Language Learning)
  • → where student material contains syntactic and
    orthographic errors
  • with minimal pre-encoded material
  • Content validation: simple
  • Form validation: difficult
  • → A good case for automation based on Natural
    Language Processing

3
Text comprehension
  • The uOttawa project: CALL solutions for helping
    French-as-a-Second-Language students to enhance
    autonomous reading comprehension
  • → master the structure of a text in order to
    understand the author's discursive intention
  • → guess the meaning of unknown words
  • → develop reformulation and synthesis capabilities

4
DidaLect
  • is an FSL (French as a Second Language) tool aimed
    at teaching autonomous reading skills (designed for
    intermediate- and advanced-level students)
  • is an instance of an eLearning Intelligent Tutoring
    System
  • adaptation to individual students' skills and
    agenda
  • access to external resources (dictionaries)
  • built to reflect cognitive concerns such as
    matching feedback to the student's behaviour

5
Intelligence in DidaLect
  • DidaLect begins its operation with a placement
    test to determine a student's initial level
  • varying the order of questions to bring out the
    best of a student's skill
  • the implementation includes fuzzy logic methods
  • A separate element of DidaLect is the processing
    of free-text answers
  • need for a robust CAA component
  • a trade-off between symbolic processing and
    Machine Learning techniques

6
Free-text answer assessment
  • The problem is to know in advance what material
    to expect in student answers.
  • Usually implemented as a classification problem:
    a student answer must match reference answer(s).
  • → Size and form of the reference material affect
    the process
  • Here, a reference answer is the text itself
  • → A case for trying symbolic processing using
    techniques of Computational Linguistics

7
Expected limitations
  • No possibility of modifying the size and form of
    the reference material, except by automatic
    processing to control reformulation.
  • Therefore, this only works for limited forms of
    questions.
  • Strong need to ground selection of questions in a
    firm didactic theory.
  • Questions on texts for Text Comprehension
    (didactics offers a classification of question
    types).

8
Question types 1
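  • Text-Implicit: based on two co-referenced
    sentences
  • « le détecteur de décélération situé à l'avant du
    véhicule génère instantanément un courant
    électrique, qui déclenche une amorce, qui
    elle-même enflamme un mélange allumeur. Ce
    dernier met finalement le feu à l'agent
    propulseur responsable du gonflement du
    coussin. »
    (English: "the deceleration sensor at the front of
    the vehicle instantly generates an electric
    current, which triggers a primer, which in turn
    ignites an igniter mixture. The latter finally
    sets fire to the propellant responsible for
    inflating the cushion.")
  • Q: Quelle est la réaction en chaîne qui se
    produit lorsque survient un impact ?
    (English: "What chain reaction occurs when an
    impact happens?")
  • Text-Explicit: based on a single sentence
  • « d'habitude, l'hermaphrodisme frappe surtout les
    mâles, qu'elle dote de simulacres d'appareils
    génitaux féminins. »
    (English: "usually, hermaphroditism mostly affects
    males, giving them imitation female genitalia.")
  • Q: Qu'arrive-t-il aux ours mâles lorsqu'ils sont
    frappés d'hermaphrodisme ?
    (English: "What happens to male bears when they
    are affected by hermaphroditism?")
  • Ex. R: « Ils ont les génitaux féminins »
    (English: "They have female genitalia")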

9
Question types 2
  • Identification, cause-effect, goal, comparison,
    definition, instrumental
  • These categories are expressed linguistically
    through lexical connectors
  • Goal: for, so that, in order to
  • Cause-Effect: because, therefore
  • So, the control of reformulation can be automated
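
As a rough illustration of how lexical connectors could signal a question category, here is a minimal Python sketch. The connector lists are the ones given on this slide; the `CONNECTORS` table and the `detect_categories` function are hypothetical names introduced only for this example, not part of DidaLect.

```python
import re

# Connectors taken from the slide; the table layout is an assumption.
CONNECTORS = {
    "goal": ["for", "so that", "in order to"],
    "cause-effect": ["because", "therefore"],
}


def detect_categories(text):
    """Return the sorted categories whose connectors occur in the text."""
    lowered = text.lower()
    found = set()
    for category, connectors in CONNECTORS.items():
        for connector in connectors:
            # Whole-word match, so "for" does not fire inside "therefore".
            if re.search(r"\b" + re.escape(connector) + r"\b", lowered):
                found.add(category)
    return sorted(found)


print(detect_categories("Pollution has increased because traffic rose."))
print(detect_categories("He left early in order to catch the train."))
```

A real system would need connectors for the target language (French here) and for the remaining categories, which the slide does not list.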

10
Processing
  • Find lexical differences between the student's
    answer S and the reference R
  • Parse S and R, produce dependency relations
  • Process different words (using a dictionary) to
    detect synonyms
  • Control of syntax in S
  • Control of reformulation in S with respect to R

11
Tools
  • A robust parser that enables partial recovery
    from errors in the student's answer
  • A dictionary of synonyms
  • A derivational dictionary
  • Locally derived resources:
  • State and action verbs
  • Ensemble of typical errors, set of syntactic and
    reformulation structures

12
Semantics and synonyms
  • Examine word set differences and commonalities in
    search of:
  • Common words
  • Reformulated words
  • Different words
  • Detect synonyms across parts of speech:
  • Derive forms for a word lemma
  • Search synonyms for each form and look for a
    match in Word Sets
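
The cross-part-of-speech lookup described above might be sketched as follows. This is only an illustration under stated assumptions: `DERIVATIONS` and `SYNONYMS` are tiny hand-made tables standing in for the derivational dictionary and the dictionary of synonyms, and `find_reformulations` is a hypothetical helper name.

```python
# Toy resources; every entry is an illustrative assumption.
DERIVATIONS = {  # lemma -> derivationally related forms
    "s'approcher": ["approchant", "approche"],
    "augmenter": ["augmentation"],
}
SYNONYMS = {  # dictionary entry -> listed synonyms
    "semblable": ["comparable", "similaire", "approchant"],
}


def forms_of(lemma):
    """The lemma itself plus its derived forms."""
    return [lemma] + DERIVATIONS.get(lemma, [])


def find_reformulations(only_in_s, only_in_r):
    """Pair an S-only lemma with an R-only lemma when one of its
    derived forms is listed as a synonym of the R lemma."""
    pairs = []
    for s_word in only_in_s:
        for r_word in only_in_r:
            for form in forms_of(s_word):
                if form in SYNONYMS.get(r_word, []):
                    pairs.append((s_word, r_word, form))
    return pairs


print(find_reformulations(
    ["s'approcher", "personne"],
    ["animal", "posséder", "système", "semblable"],
))
```

Words that end up in such a pair would count as reformulated; the rest remain different words.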

13
Syntax and reformulation
  • Correct syntactic structures to verify the syntax
    of the student's answer
  • Lexicalized reformulation structures to verify
    discourse conformity
  • Ex.: pollution has increased with the rise of
    transportation
  • Q: Why has pollution increased?
  • Ans.: "With the rise of transportation" is
    partially wrong
  • → "Because of the rise of transportation"
  • OR "It has increased due to the rise of
    transportation"
  • etc.
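
One way to picture a lexicalized reformulation structure is as a pattern that an answer to a "why" question must fit. The sketch below is an assumption-laden illustration: the two patterns mirror the two acceptable answers on this slide, and `CAUSE_PATTERNS` and `check_cause_answer` are hypothetical names.

```python
import re

# Hypothetical lexicalized structures for answers to a "why" question.
CAUSE_PATTERNS = [
    re.compile(r"^because of (?P<cause>.+)$", re.IGNORECASE),
    re.compile(r"^it has \w+ due to (?P<cause>.+)$", re.IGNORECASE),
]


def check_cause_answer(answer, reference_cause):
    """Return 'ok', 'partial' (right content, wrong form) or 'wrong'."""
    text = answer.strip().rstrip(".")
    expected = reference_cause.lower()
    for pattern in CAUSE_PATTERNS:
        match = pattern.match(text)
        if match and match.group("cause").lower() == expected:
            return "ok"
    if expected in text.lower():
        return "partial"
    return "wrong"


cause = "the rise of transportation"
for answer in [
    "With the rise of transportation",
    "Because of the rise of transportation",
    "It has increased due to the rise of transportation.",
]:
    print(answer, "->", check_cause_answer(answer, cause))
```

String matching is used here only to keep the example short; the slides describe the check as operating on parsed structures.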

14
Parsing and tree-building
  • S: Le cardio-vasculaire d'un rat s'approche à
    une personne humain.
  • SUBJ(<approcheapprocher53>, <cardio-vasculairecardio-vasculaire48>)
  • OBJ(<approcheapprocher53>, <personnepersonne56>)
  • VMOD_POSIT1(<approcheapprocher53>, <uneun55>)
  • NMOD_POSIT1(<cardio-vasculairecardio-vasculaire48>, <ratrat51>)
  • PREPOBJ(<uneun55>, <àà54>)
  • PREPOBJ(<ratrat51>, <d'de49>)
  • DETERM(<ratrat51>, <unun50>)
  • The above are incrementally recomposed, based on
    lexical selection that maximizes promise of
    discovering material which diverges from R. That
    material is processed in parallel, in a similar
    fashion:
  • SUBJ(<OBJ(<approcher>, <NMOD(<être>, <humain>)>)>,
    <NMOD(<NMOD(<système>, <cardio-vasculaire>)>, <rat>)>)
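
To make the idea of recomposition concrete, the following Python sketch folds flat dependency triples into one nested expression. It is a simplification made up for illustration, not the parser's or the authors' procedure: it uses lemmas only, ignores determiner and preposition relations, assumes the dependencies form a tree, and wraps each word in the relations it heads. The function name `recompose` is hypothetical.

```python
def recompose(dependencies, root_relation="SUBJ"):
    """Fold flat (relation, head, dependent) triples into a nested string."""
    by_head = {}
    root = None
    for relation, head, dependent in dependencies:
        if relation == root_relation and root is None:
            root = (head, dependent)
        else:
            by_head.setdefault(head, []).append((relation, dependent))
    if root is None:
        raise ValueError("no root relation found")

    def expand(word):
        # Wrap the word in each relation it heads, innermost first.
        form = f"<{word}>"
        for relation, dependent in by_head.get(word, []):
            form = f"<{relation}({form}, {expand(dependent)})>"
        return form

    head, dependent = root
    return f"{root_relation}({expand(head)}, {expand(dependent)})"


flat = [
    ("SUBJ", "approcher", "cardio-vasculaire"),
    ("OBJ", "approcher", "personne"),
    ("NMOD", "cardio-vasculaire", "rat"),
]
print(recompose(flat))
```

For the three relations shown, this yields a structure in which the verb is grouped with its object and the subject noun with its modifier.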

15
Main types of heuristics
  • To address syntactic correctness and/or
    equivalence between S and R: the same sense but
    different structures
  • → bank of typical errors and correct structures
  • To address discursive variations, detected as
    supplementary material
  • → bank of state and action verbs: action verbs
    must be present, possibly reformulated
  • To address, partially, errors in S
  • → word replacement to relaunch parsing when
    stopped due to lexical mistakes
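
The last heuristic, word replacement before relaunching the parser, could be approximated with fuzzy string matching. The sketch below uses Python's standard `difflib.get_close_matches`; the `LEXICON` list, the 0.8 cutoff, and the function names are assumptions chosen for the example, and the slides do not say which matching method the project uses.

```python
import difflib

# Hypothetical lexicon of known lemmas (assumption for this example).
LEXICON = ["rat", "animal", "posséder", "système", "semblable", "humain"]


def suggest_replacement(word, lexicon=LEXICON, cutoff=0.8):
    """Return the closest known word, or None if nothing is close enough."""
    matches = difflib.get_close_matches(word, lexicon, n=1, cutoff=cutoff)
    return matches[0] if matches else None


def repair_tokens(tokens, lexicon=LEXICON):
    """Replace unknown tokens by their closest lexicon entry when one exists."""
    repaired = []
    for token in tokens:
        if token in lexicon:
            repaired.append(token)
        else:
            repaired.append(suggest_replacement(token, lexicon) or token)
    return repaired


print(suggest_replacement("sistème"))
print(repair_tokens(["sistème", "rat"]))
```

After such a repair, the corrected token list would be handed back to the parser for a second attempt.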

16
Reformulation rules
  • Examples
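  • Abstraction: incidence sur le temps de gestation
    → incidence sur la possibilité d'avoir une
    gestation écourtée (words like fact, chance,
    etc.)
    (English: "effect on gestation time" → "effect on
    the possibility of having a shortened gestation")
  • Cause-Effect: Le plasma augmente et dilue les
    paramètres chimiques → L'augmentation du plasma
    dilue les paramètres chimiques
    (English: "Plasma increases and dilutes the
    chemical parameters" → "The increase in plasma
    dilutes the chemical parameters")
  • Is-A: Le rat est un animal qui S → Le rat S
    (English: "The rat is an animal that S" → "The
    rat S")
  • Attribute: Le rat possède un système
    cardio-vasculaire → Le système C-V du rat
    (English: "The rat has a cardiovascular system" →
    "The rat's C-V system")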
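
As a toy illustration of how the Is-A and Attribute rules could be chained, the sketch below rewrites surface strings with regular expressions. This is a deliberate simplification: the slides describe rules over parsed structures, the patterns cover only masculine nouns of the shape shown in the examples, and `RULES` and `normalize` are hypothetical names.

```python
import re

# Toy surface-level versions of two rules from the slide (assumptions).
RULES = [
    # Is-A: "Le rat est un animal qui S" -> "Le rat S"
    ("is-a", re.compile(r"^(Le|La) (\w+) est une? \w+ qui (.+)$"), r"\1 \2 \3"),
    # Attribute: "Le rat possède un X" -> "Le X du rat"
    ("attribute", re.compile(r"^(?:Le|La) (\w+) possède une? (.+)$"), r"Le \2 du \1"),
]


def normalize(sentence):
    """Apply the rules until none fires; return the result and the rule names used."""
    applied = []
    changed = True
    while changed:
        changed = False
        for name, pattern, template in RULES:
            rewritten, count = pattern.subn(template, sentence)
            if count:
                sentence = rewritten
                applied.append(name)
                changed = True
    return sentence, applied


print(normalize("Le rat est un animal qui possède un système cardio-vasculaire"))
```

Chaining the two rules reduces the longer sentence to the noun phrase "Le système cardio-vasculaire du rat", which is the kind of equivalence the rules are meant to capture.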

17
Assessment
  • Must give the student feedback on:
  • Agreement and orthography
  • Syntax: signals errors and provides a correction
    via display of a correct structure
  • Semantics: signals errors and provides admissible
    words
  • Completeness of content with respect to R

18
Example
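  • R: Et puisque le rat est un animal qui possède un
    système cardio-vasculaire très semblable à celui
    de l'humain, il est donc permis de tirer les
    mêmes conclusions pour l'humain.
    (English: "And since the rat is an animal with a
    cardiovascular system very similar to that of
    humans, the same conclusions may therefore be
    drawn for humans.")
  • Q: Pourquoi peut-on tirer les mêmes conclusions
    pour l'humain et pour le rat ?
    (English: "Why can the same conclusions be drawn
    for humans and for rats?")
  • S: Le cardio-vasculaire d'un rat s'approche à une
    personne humain.
    (student answer, errors kept; roughly "A rat's
    cardiovascular comes close to a human person.")
  • Start by creating wordlists
  • Words of S absent in R
  • → s'approcher, personne
  • Words of R absent in S
  • → animal, posséder, système, semblable
  • Common words
  • → rat, cardio-vasculaire, humain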
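
The three word lists can be reproduced with plain set operations. The sketch below assumes that S and R have already been lemmatized and stripped of function words, as the lists on this slide suggest; the lemma lists are copied from the slide and `build_wordlists` is a hypothetical helper name.

```python
def build_wordlists(s_lemmas, r_lemmas):
    """Split the content lemmas of S and R into three sets."""
    s_set, r_set = set(s_lemmas), set(r_lemmas)
    return {
        "only_in_s": s_set - r_set,
        "only_in_r": r_set - s_set,
        "common": s_set & r_set,
    }


# Content lemmas as listed on the slide (lemmatization assumed done upstream).
s_lemmas = ["cardio-vasculaire", "rat", "s'approcher", "personne", "humain"]
r_lemmas = ["rat", "animal", "posséder", "système",
            "cardio-vasculaire", "semblable", "humain"]

for name, words in build_wordlists(s_lemmas, r_lemmas).items():
    print(name, sorted(words))
```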

19
Parse (partial output)
  • R:
  • SUBJ(<estêtre2>, <ratrat1>)
  • OBJ_SPRED(<estêtre2>, <animalanimal4>)
  • OBJ(<possèdeposséder6>, <systèmesystème8>)
  • COREF_REL(<animalanimal4>, <quiqui5>)
  • NMOD_POSIT1(<systèmesystème8>,
    <cardio-vasculairecardio-vasculaire9>)
  • NMOD_POSIT1(<systèmesystème8>,
    <semblablesemblable11>)
  • NMOD_POSIT1(<celuicelui13>, <humainhumain16>)
  • ADJMOD(<semblablesemblable11>,
    <celuicelui13>)
  • PREPOBJ(<humainhumain16>, <dede14>)
  • PREPOBJ(<celuicelui13>, <àà12>)
  • DETERM(<systèmesystème8>, <unun7>)
  • DETERM(<animalanimal4>, <unun3>)
  • DETERM_DEF(<ratrat1>, <lele0>)
  • CONNECT_REL(<possèdeposséder6>, <quiqui5>)
  • S:
  • SUBJ(<approcheapprocher53>, <cardio-vasculairecardio-vasculaire48>)
  • OBJ(<approcheapprocher53>, <personnepersonne56>)
  • VMOD_POSIT1(<approcheapprocher53>, <uneun55>)
  • NMOD_POSIT1(<cardio-vasculairecardio-vasculaire48>, <ratrat51>)
  • PREPOBJ(<uneun55>, <àà54>)
  • PREPOBJ(<ratrat51>, <d'de49>)
  • DETERM(<ratrat51>, <unun50>)

20
Comparison (partial)
  • SUBJ(<OBJ(<approcheapprocher53>,<personnepersonne56>)>,
    <NMOD_POSIT1(<cardio-vasculairecardio-vasculaire48>,<ratrat51>)>)
  • SUBJ(<OBJ_SPRED(<estêtre2>,<COREF_REL(<animalanimal4>,
    <CONNECT_REL(<OBJ(<possèdeposséder6>,
    <NMOD_POSIT1(<NMOD_POSIT1(<systèmesystème8>,
    <cardio-vasculairecardio-vasculaire9>)>,
    <ADJMOD(<semblablesemblable11>,
    <NMOD_POSIT1(<celuicelui13>,<humainhumain16>)>)>)>)>,
    <quiqui5>)>)>)>,<ratrat1>)
  • Consider structures to assess for syntactic
    correctness
  • Heuristics to put some structures into
    equivalence
  • → Here, « Le SCV du rat » and « Le rat est un
    animal qui possède un SCV » are equivalent, but
    expressed in different syntactic structures
    (SCV: système cardio-vasculaire)

21
Synonyms
  • <RESULTS>
  • <DEF>
  • <E L="fr">
  •   <W>semblable</W>
  •   <SW ENC="n">00730065006D0062006C00610062006C0065</SW>
  •   <P>adj.</P>
  •   </E>
  • <DF N="1">
  •   <W>qui ressemble à comparable, similaire.</W>
  •   </DF>
  •   </DEF>
  • <DEF>
  • ..
  • <WD L="fr" W="2">
  •   <W L="fr">approchant</W>
  •   <SW ENC="n">0061007000700072006F006300680061006E0074</SW>
  •   </WD>
  • To retrieve a synonymy relation:
  • 1. Produce derivations for all words in List 1
  • 2. Find matches in a synonymy base under entries
    for words of List 2
  • 3. The search process can be repeated at most once,
    using <DEF> lexemes
  • → semblable = approchant
  • OR
  • → ressembler à = s'approcher de
  • 4. In this way we can catch both synonyms and
    attached prepositions.
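
Two details of this slide lend themselves to a short Python sketch. First, the SW fields look like words written as 16-bit hexadecimal code units; decoding them as big-endian UTF-16 is an assumption, although it does give back the neighbouring W values. Second, the bounded search can be modelled as a lookup that follows definition lexemes at most once. The `ENTRIES` table is a hand-made stand-in for the synonymy base, and `decode_sw` and `find_link` are hypothetical names.

```python
def decode_sw(hex_string):
    """Decode a hex string of 16-bit code units (assumed UTF-16, big-endian)."""
    return bytes.fromhex(hex_string).decode("utf-16-be")


# Toy synonymy base; structure and entries are assumptions for this example.
ENTRIES = {
    "semblable": {"synonyms": ["approchant"], "definition_lexemes": ["ressembler à"]},
    "ressembler à": {"synonyms": ["s'approcher de"], "definition_lexemes": []},
}


def find_link(candidates, entry, max_repeats=1):
    """Look for a candidate among the synonyms of entry; if none is found,
    repeat the search (at most max_repeats times) on its definition lexemes."""
    frontier = [entry]
    for _ in range(max_repeats + 1):
        next_frontier = []
        for word in frontier:
            record = ENTRIES.get(word, {})
            for synonym in record.get("synonyms", []):
                if synonym in candidates:
                    return (word, synonym)
            next_frontier.extend(record.get("definition_lexemes", []))
        frontier = next_frontier
    return None


print(decode_sw("00730065006D0062006C00610062006C0065"))
print(find_link({"approchant"}, "semblable"))
print(find_link({"s'approcher de"}, "semblable"))
```

Capping the number of repeats keeps the search from drifting through long chains of loosely related definitions.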

22
Conclusions and future work
  • Automation is possible, with 2 main restrictions:
  • "bad faith" answers
  • Lexical errors based on homonymy
  • → as long as S contains elements of the answer, S
    can be evaluated
  • Future work: to assemble the parts through
    software engineering!