Creating Morphological Data: From Markup to Generalizations

Transcript and Presenter's Notes


1
Creating Morphological Data: From Markup to
Generalizations
  • Mike Maxwell, SIL International
  • Web-Based Language Documentation and Description,
    12-15 December 2000, Philadelphia, USA.

2
Tools for lesser known languages
  • Languages for which no adequate computer
    processing is being developed, risk gradually
    losing their place in the global Information
    Society, or even disappearing, together with the
    cultures they embody, to the detriment of one of
    humanity's great assets: its cultural diversity.
  • Zampolli and Varile, Foreword to the Survey of
    the state of the art in human language
    technology. (1997: xvi)

3
Methods for creating computational grammars
  • Linguistic Expert approach
  • Machine Learning approach
  • Middle road?
      • Non-expert linguists
      • On-line tools and help
      • Machine testing of hypotheses

4
What would ideal morphology tools do?
  • Handle complex morphologies, including:
      • Morphological processes (reduplication, infixes,
        ablaut, suprafixes)
      • Phonological and morphosyntactic features
      • Multidimensional paradigms
      • Complex affix ordering
  • Automatically turn raw text data into a
    morphological grammar and lexicon

5
Next best tools would
  • Handle complex morphologies
  • Let the user concentrate on the task at hand,
    e.g. morpheme-level markup
  • Learn from the user
  • Help diagnose incorrect parses or missing parses
  • Display the current state of the grammar
  • Provide easy migration path between stages of
    grammatical analysis
  • All of the above, with non-expert users

6
Morphology tools should...
  • Handle morphological processes
      • Reduplication
      • Infixes
      • Ablaut
      • Suprafixes
  • Issue for parser/generator and data model
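
To make the data-model issue concrete, here is a minimal sketch of three of these processes as operations on a stem rather than as strings concatenated to it. The function names and the vowel inventory are assumptions made for illustration, not part of the tools described in the slides; suprafixes (tone, stress) are left out because they need a richer representation than plain strings.

```python
VOWELS = set("aeiou")  # assumed inventory, for illustration only


def reduplicate(stem: str, size: int = 2) -> str:
    """Prefixing partial reduplication: copy the first `size` segments."""
    return stem[:size] + stem


def infix(stem: str, affix: str) -> str:
    """Insert `affix` immediately before the first vowel of the stem."""
    for i, segment in enumerate(stem):
        if segment in VOWELS:
            return stem[:i] + affix + stem[i:]
    return affix + stem  # no vowel found: fall back to prefixing


def ablaut(stem: str, old: str, new: str) -> str:
    """Replace the first occurrence of the stem vowel `old` with `new`."""
    return stem.replace(old, new, 1)


print(infix("sulat", "um"))      # Tagalog-style infixation
print(reduplicate("sulat"))      # Tagalog-style CV reduplication
print(ablaut("sing", "i", "a"))  # English-style ablaut
```

None of these can be stored as a simple affix string in a lexicon, which is why both the parser/generator and the data model have to know about process types.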

7
Morphology tools should...
  • Handle Morphosyntactic features
  • Feature systems should be transparent to
    non-expert user
  • Solution: Grammatical Gloss List incorporating
    morphosyntactic features

8-24
Grammatical Gloss List

25
Grammatical Gloss List Summary
  • Use Grammatical Gloss List to hide
    morphosyntactic feature system
  • Feature system remains accessible to advanced
    users
  • Interlinear glossing builds language-specific
    feature system and feeds lexicon
  • Possibly similar solution for phonological
    features
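
One way to picture how a gloss list can hide a feature system: the user picks familiar gloss labels, and each label expands into feature-value pairs behind the scenes. The sketch below assumes a tiny hypothetical gloss list and a dot-separated gloss notation; the labels, feature names, and `features_for` are all illustrative, not the actual list shown in the slides.

```python
# Hypothetical gloss list: each label stands for a bundle of features.
GLOSS_LIST = {
    "1SG": {"person": "1", "number": "sg"},
    "3PL": {"person": "3", "number": "pl"},
    "PL": {"number": "pl"},
    "PST": {"tense": "past"},
}


def features_for(gloss: str) -> dict:
    """Expand a dotted gloss such as '3PL.PST' into one feature structure."""
    features = {}
    for label in gloss.split("."):
        if label not in GLOSS_LIST:
            raise KeyError(f"unknown gloss label: {label}")
        for name, value in GLOSS_LIST[label].items():
            # Refuse glosses whose labels assign clashing values.
            if features.get(name, value) != value:
                raise ValueError(f"conflicting values for {name!r} in {gloss!r}")
            features[name] = value
    return features


print(features_for("3PL.PST"))
```

A non-expert only ever types `3PL.PST`; an advanced user can still open `GLOSS_LIST` and edit the feature system directly.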

26
Morphology tools should...
  • Learn from the user
      • Lexical information
      • Grammatical information
  • Example: interlinear text glossing

27-36
Interlinear Text

37
Interlinear Text Summary
  • Parser automatically picks up some information
      • Lexical entries for new morphemes (form,
        prefix/suffix/stem status, category of stems,
        gloss/morphosyntactic features)
      • Preferred parse of wordforms
  • User can designate incorrect parses, and diagnose
    them
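
As a rough illustration of how lexical entries might be picked up from glossed text, the sketch below harvests entries from one hyphen-segmented word and its aligned gloss line. It assumes the common convention that grammatical glosses are written in capitals and stem glosses in lower case; `LexEntry` and `harvest` are hypothetical names, and a real tool would also record stem category and the preferred parse.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LexEntry:
    form: str
    kind: str   # "prefix", "stem", or "suffix"
    gloss: str


def harvest(segmented: str, glossed: str) -> list:
    """Derive lexical entries from a segmented word and its aligned gloss."""
    forms = segmented.split("-")
    glosses = glossed.split("-")
    if len(forms) != len(glosses):
        raise ValueError("morphemes and glosses do not align")
    # Assumption: the one gloss that is not all-capitals marks the stem.
    stems = [i for i, g in enumerate(glosses) if not g.isupper()]
    if len(stems) != 1:
        raise ValueError("expected exactly one stem gloss")
    stem_at = stems[0]
    entries = []
    for i, (form, gloss) in enumerate(zip(forms, glosses)):
        if i == stem_at:
            kind = "stem"
        else:
            kind = "prefix" if i < stem_at else "suffix"
        entries.append(LexEntry(form, kind, gloss))
    return entries


# Swahili "nitasoma" ('I will read'), segmented and glossed by the user.
for entry in harvest("ni-ta-soma", "1SG-FUT-read"):
    print(entry)
```

The prefix/suffix/stem status falls out of position relative to the stem, so the user only supplies the segmentation and the glosses.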

38
Morphology tools should...
  • Help diagnose incorrect parses or missing parses

39
Morphology tools should...
40
Morphology tools should...
  • Handle complex multidimensional paradigms
  • Paradigms useful for finding
      • missing inflectional affixes
      • co-occurrence restrictions among inflectional
        affixes
      • syncretism
      • allomorph constraints
  • Solution: Paradigm Charting Tool

41-52
Paradigm Charting Tool

53
Generated paradigm chart
54
Paradigm Charting Tool Summary
  • Dimensions selected from language-specific
    feature gloss list/morphosyntactic feature
    system
  • Cell fillers from attested wordforms and/or
    generator
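
The idea of picking dimensions and filling cells from attested wordforms can be sketched as follows. The data is the Spanish present indicative of "hablar" with one form deliberately left out; `build_chart`, `missing_cells`, and `render` are hypothetical helpers, not the tool's actual interface.

```python
# Attested wordforms with their feature values (2pl deliberately absent).
ATTESTED = [
    ("hablo", {"person": "1", "number": "sg"}),
    ("hablas", {"person": "2", "number": "sg"}),
    ("habla", {"person": "3", "number": "sg"}),
    ("hablamos", {"person": "1", "number": "pl"}),
    ("hablan", {"person": "3", "number": "pl"}),
]


def build_chart(wordforms, row_dim, col_dim):
    """Return (row values, column values, cell map) for two dimensions."""
    rows = sorted({feats[row_dim] for _, feats in wordforms})
    cols = sorted({feats[col_dim] for _, feats in wordforms})
    cells = {}
    for form, feats in wordforms:
        cells.setdefault((feats[row_dim], feats[col_dim]), []).append(form)
    return rows, cols, cells


def missing_cells(rows, cols, cells):
    """List the cells for which no wordform has been attested."""
    return [(r, c) for r in rows for c in cols if (r, c) not in cells]


def render(rows, cols, cells):
    """Render the chart as tab-separated text; '?' marks an empty cell."""
    lines = ["\t".join([""] + cols)]
    for r in rows:
        filled = ["/".join(cells.get((r, c), ["?"])) for c in cols]
        lines.append("\t".join([r] + filled))
    return "\n".join(lines)


rows, cols, cells = build_chart(ATTESTED, "person", "number")
print(render(rows, cols, cells))
print("missing:", missing_cells(rows, cols, cells))
```

Empty cells are where a generator could propose a form, or where the user is prompted to elicit one.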

55
Morphology tools should...
  • Help determine affix orders
  • Example: inflectional templates
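
An inflectional template can be thought of as an ordered list of slots, each of which may be filled at most once. The sketch below checks a sequence of prefix glosses against such a template; the slot names and the gloss-to-slot assignments are invented for illustration.

```python
# Hypothetical prefix template: slots in the order they must appear.
TEMPLATE = ["subject", "tense", "object"]

# Hypothetical assignment of affix glosses to slots.
SLOT_OF = {
    "1SG.SBJ": "subject",
    "3SG.SBJ": "subject",
    "FUT": "tense",
    "PST": "tense",
    "2SG.OBJ": "object",
}


def fits_template(prefix_glosses) -> bool:
    """True if the affixes occur in template order, one per slot at most."""
    last_position = -1
    for gloss in prefix_glosses:
        slot = SLOT_OF.get(gloss)
        if slot is None:
            return False  # affix not assigned to any slot yet
        position = TEMPLATE.index(slot)
        if position <= last_position:
            return False  # out of order, or slot filled twice
        last_position = position
    return True


print(fits_template(["1SG.SBJ", "FUT", "2SG.OBJ"]))  # in template order
print(fits_template(["FUT", "1SG.SBJ"]))             # tense before subject
```

Slots are optional in this sketch; marking a slot as obligatory would be a small extension.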

56-57
Create an Inflectional Template

58
Morphology tools should...
  • Allow the user to debug the grammar
      • Turn rules off and on
      • Change rule order
      • Trace the parse
      • Compare traces after changing the grammar

59-62
Tracing and Debugging

63
Tracing and Debugging Summary
  • Easier to understand generation traces
    (derivations) than parse traces
  • Diffing derivations still messy
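
A small sketch of what tracing a derivation and diffing two traces could look like. The two toy rules (epenthesis and devoicing over an underlying suffix "z") and the `derive` function are assumptions for illustration; the comparison uses Python's standard `difflib` module.

```python
import difflib
import re

# Toy ordered rules over forms like "bus+z" ('+' is the morpheme boundary).
RULES = [
    ("epenthesis", lambda form: re.sub(r"([sz])\+z", r"\1+ez", form)),
    ("devoicing", lambda form: re.sub(r"([ptkfs])\+z", r"\1+s", form)),
]


def derive(underlying, rules=RULES, disabled=()):
    """Apply the rules in order and return a line-by-line derivation trace."""
    trace = [f"underlying: {underlying}"]
    form = underlying
    for name, apply_rule in rules:
        if name in disabled:
            trace.append(f"{name}: (off)")
            continue
        changed = apply_rule(form)
        trace.append(f"{name}: {changed}" if changed != form else f"{name}: -")
        form = changed
    trace.append(f"surface: {form.replace('+', '')}")
    return trace


before = derive("bus+z")
after = derive("bus+z", disabled=("epenthesis",))
print("\n".join(difflib.unified_diff(before, after, "before", "after", lineterm="")))
```

Turning a rule off, or passing the rules in a different order, yields a second trace that can be compared with the first. Even in this toy case the diff shows every line after the first change, which hints at why diffing real derivations gets messy.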

64
Morphology tools should...
  • Help the user see the current state of the
    grammar
  • Dump the grammar out as XML and apply a
    customizable style sheet.
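
A minimal sketch of the dump-then-style idea, using Python's standard `xml.etree.ElementTree`. The element and attribute names are invented, and a plain format string stands in for the customizable style sheet; a real XSLT style sheet would need an XSLT processor, which the Python standard library does not include.

```python
import xml.etree.ElementTree as ET


def grammar_to_xml(affixes) -> str:
    """Dump a list of affix records as an XML string (hypothetical schema)."""
    root = ET.Element("grammar")
    for affix in affixes:
        node = ET.SubElement(root, "affix", kind=affix["kind"], gloss=affix["gloss"])
        node.text = affix["form"]
    return ET.tostring(root, encoding="unicode")


def render(xml_text: str, line_format: str = "{kind}: {form} '{gloss}'"):
    """Render each affix with a user-supplied format (style sheet stand-in)."""
    root = ET.fromstring(xml_text)
    return [
        line_format.format(kind=node.get("kind"), gloss=node.get("gloss"), form=node.text)
        for node in root.iter("affix")
    ]


dump = grammar_to_xml([
    {"form": "ni", "kind": "prefix", "gloss": "1SG"},
    {"form": "ta", "kind": "prefix", "gloss": "FUT"},
])
print(dump)
print("\n".join(render(dump)))
```

Keeping the dump and the presentation separate means the same XML can feed a prose write-up, a chart, or a web page.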

65-69
Grammar Write-up View

70
Grammar Write-up View Summary
  • Grammar information can be dumped out in
    text/chart form
  • Suitable for web
  • Could benefit from NL generation techniques

71
Morphology tools should...
  • Provide easy migration path between stages of
    analysis
  • Allow the user to capture ad hoc restrictions, but
    let the user come back to those restrictions to
    look for generalizations

72-74
Migration Path

75
Migration Path
  • Going from observations to generalizations, e.g.
      • Ad hoc co-occurrence restrictions among
        allomorphs
      • Lexically listed allomorphs with phonological
        restrictions
      • Underlying forms with phonological rules to
        derive allomorphs
  • Parsing requires observational adequacy, not
    necessarily descriptive adequacy
  • Database support for finding less-than-adequate
    areas of analysis
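
The three stages can be illustrated with one toy suffix analyzed three ways. The stems, the suffix shapes, and the function names below are invented for illustration. All three analyses produce the same forms for the observed data, which is all a parser needs; only the later stages extend to stems that have not been seen.

```python
import re

# Stage 1: ad hoc co-occurrence restrictions (each stem lists its allomorph).
AD_HOC = {"kat": "s", "dog": "z", "bus": "ez"}


def stage1(stem: str) -> str:
    return stem + AD_HOC[stem]  # KeyError for any stem not yet observed


# Stage 2: listed allomorphs, each with a phonological environment.
ALLOMORPHS = [("ez", r"[sz]$"), ("s", r"[ptkf]$"), ("z", r"")]  # "" = elsewhere


def stage2(stem: str) -> str:
    for form, environment in ALLOMORPHS:
        if re.search(environment, stem):
            return stem + form
    raise ValueError("no allomorph matched")  # unreachable: "" always matches


# Stage 3: one underlying form plus ordered phonological rules.
def stage3(stem: str) -> str:
    form = stem + "+z"
    form = re.sub(r"([sz])\+z", r"\1+ez", form)   # epenthesis
    form = re.sub(r"([ptkf])\+z", r"\1+s", form)  # devoicing
    return form.replace("+", "")


for stem in AD_HOC:
    print(stem, stage1(stem), stage2(stem), stage3(stem))
print("unseen stem:", stage2("map"), stage3("map"))
```

Database support would then amount to queries such as "which affixes are still described only by stage 1 restrictions?".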

76
Symbiosis
  • The information processing equipment, for its
    part, will convert hypotheses into testable
    models and then test the models against data
    (which the human operator may designate roughly
    and identify as relevant when the computer
    presents them for his approval). The equipment
    will answer questions. It will simulate the
    mechanisms and models, carry out procedures, and
    display the results to the operator... In
    general, it will carry out the routinizable,
    clerical operations that fill the intervals
    between decisions... Finally, it will do as much
    diagnosis, pattern matching, and relevance
    recognizing as it profitably can, but it will
    accept a clearly secondary status in those areas.
  • J.C.R. Licklider, Man-Computer Symbiosis.
    (1960)

77
Current Status
  • Morphology modeling nearly complete
  • Testing model with real data
  • Evaluating parsers/generators
  • Developing tools to diagnose missing/incorrect
    parses
  • Designing migration path tools