Finite State Morphological Parsing

Provided by: chssMon

1
Finite State Morphological Parsing
  • Lecture 3

2
Concepts
  • Task          Input   Output
  • Generating    cat     cats
  • Recognizing   cats    yes
  • Parsing       cats    N pl

3
Definitions
  • Cascading: running rules in series, with the
    output of one feeding the input of the next.
      hope -> hope zpl
      hope zpl -> hope s / -vc __
  • Composing: converting the cascade into a single
    two-level transducer.
  • Stemming: removing affixes.
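
The difference between cascading and composing can be sketched in a few lines of Python. This is a toy illustration, not a real transducer toolkit: each rule is modeled as a finite lookup table, and the names add_plural, spell_out, cascade, and compose are made up for the example. The forms are the ones from the hope example above.

```python
# Each rule is modeled as a finite mapping from input form to output form.
# The rule names and the dictionary representation are illustrative assumptions.
add_plural = {"hope": "hope zpl"}      # hope -> hope zpl
spell_out = {"hope zpl": "hope s"}     # hope zpl -> hope s / -vc __

def cascade(rules, form):
    """Run the rules in series: each output feeds the next rule's input."""
    for rule in rules:
        form = rule[form]
    return form

def compose(first, second):
    """Build one mapping with the same effect as `first` followed by `second`."""
    return {src: second[mid] for src, mid in first.items() if mid in second}

single_rule = compose(add_plural, spell_out)
print(cascade([add_plural, spell_out], "hope"))  # hope s
print(single_rule["hope"])                       # hope s
```

The cascade keeps the intermediate form (hope zpl) around at run time; the composed table goes from input to final output in one lookup.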

4
English Inflectional Morphology
  • Adjective: comparative (-er), superlative (-est)
  • Noun: plural (-s), possessive ('s, s')
  • Verb: 3rd pers. sg. (shows), pres. part.
    (showing), past tense (showed), past part.
    (shown)

5
Rules for English Inflection
  • Mostly local rules involving spelling:
    fox, foxes; beg, begging; busy, busier
  • Some long-distance rules:
    beautiful, *beautifuler
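
A rough sketch of what "local" means here: each of the three spelling changes can be decided by looking only at the last few letters of the stem and the first letter of the suffix. The function name attach and the exact patterns are assumptions made for illustration; the patterns are deliberately simplified and will overgenerate on many real English words.

```python
import re

VOWELS = ("a", "e", "i", "o", "u")

def attach(stem, suffix):
    """Attach a suffix to a stem using three simplified local spelling rules."""
    # e-insertion: fox + s -> foxes
    if suffix == "s" and re.search(r"(s|x|z|ch|sh)$", stem):
        return stem + "es"
    # y -> i after a consonant: busy + er -> busier
    if re.search(r"[^aeiou]y$", stem) and not suffix.startswith("i"):
        return stem[:-1] + "i" + ("es" if suffix == "s" else suffix)
    # consonant doubling before a vowel-initial suffix: beg + ing -> begging
    if suffix.startswith(VOWELS) and re.search(r"[^aeiou][aeiou][^aeiouwxy]$", stem):
        return stem + stem[-1] + suffix
    return stem + suffix

print(attach("fox", "s"))     # foxes
print(attach("beg", "ing"))   # begging
print(attach("busy", "er"))   # busier
```

None of these checks looks further back than three letters, which is why a constraint based on the length of the whole word does not fit the same mold.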

6
English Derivational Morphology
  • Semantic constraints on affixes
  • un + happy = unhappy
  • un + big = *unbig
  • child + ish = childish
  • spoon + ish = *spoonish
  • spoon + ful = spoonful
  • child + ful = *childful

7
Morphological Complexity
  • English is a morphologically finite language
  • All inflectional forms and derivational forms can
    be enumerated in a lexicon without much cost
  • This is not so for agglutinative languages like
    Turkish, Hungarian, and Finnish

8
Turkish example
  • uygarlastiramadiklarimizdanmissinizcasina
  • uygar +las +tir +ama +dik +lar +imiz +dan +mis
    +siniz +casina
  • Behaving as if you are among those whom we could
    not cause to become civilized
  • And this does not include the derivational
    affixes!
  • Consider that the affixes for behave, cause, and
    become could all recur in this example, and you
    can see the possibility of an infinite number of
    words: an uncomputable problem.

9
Templatic Morphology - Arabic
10
Templatic Morphology
  • Far more complex than English
  • But, unlike agglutination, it does not have the
    potential for an infinite number of words

11
Representing *spoonish and *childful (worksheet 2)
12
An FSA for English Nouns with inflection
13
An FSA for English Nouns with inflection
  • In the previous slide, the FSA gives all the
    letters the same status.
  • In particular, the s is not identified as a
    separate morpheme.
  • This FSA works on only one level: the surface
    level.

14
Two FSAs working in parallel: a Finite State
Transducer
  • Lexical
  • Surface

Generation goes from Lexical to Surface.
Recognition and Parsing go from Surface to Lexical.
15
An FST for English Nouns with inflection
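
As a minimal sketch of the idea, the toy transducer below covers the single noun cat and can be run in either direction. Everything about it is an assumption made for illustration: the arc list, the state numbers, the tag names N, SG, and PL, and the function name transduce. It is not the transducer shown on the slide.

```python
EPS = ""  # the empty symbol

# Arcs of a toy transducer: (state, lexical symbol, surface symbol, next state)
ARCS = [
    (0, "c", "c", 1),
    (1, "a", "a", 2),
    (2, "t", "t", 3),
    (3, "N", EPS, 4),    # part of speech has no surface realization
    (4, "SG", EPS, 5),   # singular: nothing on the surface
    (4, "PL", "s", 5),   # plural: the pair PL:s
]
FINAL_STATES = {5}

def transduce(symbols, direction):
    """Return every output sequence the transducer pairs with `symbols`.

    direction="generate" reads the lexical side and writes the surface side;
    direction="parse" reads the surface side and writes the lexical side.
    """
    results = []

    def walk(state, pos, out):
        if pos == len(symbols) and state in FINAL_STATES:
            results.append(out)
        for src, lex, surf, dst in ARCS:
            if src != state:
                continue
            read, write = (lex, surf) if direction == "generate" else (surf, lex)
            emitted = out + [write] if write != EPS else out
            if read == EPS:
                walk(dst, pos, emitted)        # consume no input
            elif pos < len(symbols) and symbols[pos] == read:
                walk(dst, pos + 1, emitted)

    walk(0, 0, [])
    return results

print(transduce(["c", "a", "t", "N", "PL"], "generate"))  # [['c', 'a', 't', 's']]
print(transduce(list("cats"), "parse"))                   # [['c', 'a', 't', 'N', 'PL']]
```

Recognition falls out of the same machinery: a surface string is recognized when parsing it returns at least one analysis.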
16
FST Conventions
  • Symbols:
  • ^ (morpheme boundary)
  • # (word boundary)
  • ε (empty element, null)
  • @ (any element)
  • Input/output pairs are represented as
    input:output, e.g., PL:s

17
An Intermediate Level
  • The FST on the previous slide contains both
    morphological (s) and part-of-speech (N)
    information. This necessitates two levels below
    the surface.

^ marks the morpheme boundary; # marks the word boundary.
18
Implementing an FST Rule
19
e-insertion rule for kisses, foxes, klutzes
(worksheet 3)
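
The effect of the rule can be sketched with a regular expression rather than a transducer. The sketch assumes that ^ marks the morpheme boundary and # the word boundary in the intermediate form, and that the rule fires after x, s, or z before the suffix s; the function name e_insertion is hypothetical.

```python
import re

def e_insertion(intermediate):
    """Map an intermediate form such as 'fox^s#' to its surface spelling."""
    # Insert e when a morpheme ending in x, s, or z is followed by the suffix s.
    surface = re.sub(r"([xsz])\^(?=s#)", r"\1e", intermediate)
    # Drop any boundary symbols that remain.
    return surface.replace("^", "").replace("#", "")

for form in ["kiss^s#", "fox^s#", "klutz^s#", "cat^s#"]:
    print(form, "->", e_insertion(form))
```

The form cat^s# is included to show that the rule leaves stems outside its context alone.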
20
A State Transition Table Example
  • State Transition Table
      State   Input 1   Input 0
      S1      S1        S2
      S2      S2        S1
  • All the possible inputs to the FSA are enumerated
    across the columns of the table.
  • All the possible states are enumerated down the
    rows.
  • From the state transition table, it is easy to
    see that if the FSA is in S1 (the first row) and
    the next input is character 1, the FSA will stay
    in S1. If a character 0 arrives, the machine will
    transition to S2, as can be seen from the second
    column. In the diagram this is shown by the arrow
    from S1 to S2 labeled with a 0.

[State diagram: S1 and S2 each loop on input 1; input 0 moves the machine between S1 and S2.]
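
The table translates directly into a lookup structure. In this sketch the dictionary TRANSITIONS holds the two rows of the table; treating S1 as the start state is an assumption, since the slide does not name one, and the function name run is made up for the example.

```python
# One row per state, one entry per input symbol, exactly as in the table.
TRANSITIONS = {
    "S1": {"1": "S1", "0": "S2"},
    "S2": {"1": "S2", "0": "S1"},
}

def run(inputs, start="S1"):
    """Feed the input characters to the FSA and return the state it ends in."""
    state = start  # assumed start state
    for symbol in inputs:
        state = TRANSITIONS[state][symbol]
    return state

print(run("1"))     # S1
print(run("0"))     # S2
print(run("0110"))  # S1
```

Because every state has an entry for every input, the machine never gets stuck; it ends in S1 exactly when it has seen an even number of 0s.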
21
The State Transition Table for the e-insertion
rule (path for kisses starting at the second s)
22
PC-KIMMO Sample Rule
  • RULE "Voicing s:z <=> V___V" 4 4
    (the two numbers give the number of states and
    the number of columns)
         V  s  s  @    lexical level
         V  z  @  @    surface level
     1:  2  0  1  1
     2:  2  4  3  1
     3:  0  0  1  1
     4.  2  0  0  0
  • State transitions are defined for each state,
    one row per state.
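
One way to read the rule name is: lexical s is realized as surface z if and only if it stands between vowels. The sketch below checks that reading directly on aligned lexical/surface pairs. It does not interpret the state table and is not PC-KIMMO code; the vowel set and the function names are assumptions.

```python
VOWELS = set("aeiou")  # assumed membership of the subset V

def is_vowel_pair(pair):
    """True for a pair whose lexical and surface characters are both vowels."""
    lexical, surface = pair
    return lexical in VOWELS and surface in VOWELS

def obeys_voicing(pairs):
    """Check (lexical, surface) pairs against: s:z if and only if between vowels."""
    for i, (lexical, surface) in enumerate(pairs):
        if lexical != "s":
            continue
        between_vowels = (
            0 < i < len(pairs) - 1
            and is_vowel_pair(pairs[i - 1])
            and is_vowel_pair(pairs[i + 1])
        )
        # The rule fails if s:z occurs outside the context,
        # or if s is not realized as z inside it.
        if (surface == "z") != between_vowels:
            return False
    return True

print(obeys_voicing([("a", "a"), ("s", "z"), ("a", "a")]))  # True
print(obeys_voicing([("a", "a"), ("s", "s"), ("a", "a")]))  # False
```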
23
Problems with FST Analysis
  • Local ambiguity: asses(s)
  • Requires a solution to handle non-determinism
    (back-up, look-ahead, parallelism)
  • Long-distance dependencies: *beautifuler
  • The rule that determines when to add -er depends
    on syllable count, which is outside the scope of
    finite state machines
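
The parallelism option can be sketched by tracking every state the machine could be in at once. The tiny non-deterministic automaton below is hypothetical: its states, arcs, and analysis labels are invented to cover only ass, asses, and assess.

```python
# state -> {input character: set of possible next states}
NFA = {
    0: {"a": {1}},
    1: {"s": {2}},
    2: {"s": {3}},
    3: {"e": {4, 6}},   # ambiguous: plural ending or the longer stem
    4: {"s": {5}},      # ass + es
    6: {"s": {7}},
    7: {"s": {8}},      # assess
}
# Accepting states and the (illustrative) analysis each one stands for.
ANALYSES = {3: "ass N sg", 5: "ass N pl", 8: "assess V"}

def parse(word):
    """Follow all paths in parallel and return the analyses that survive."""
    states = {0}
    for ch in word:
        states = {nxt for s in states for nxt in NFA.get(s, {}).get(ch, set())}
    return sorted(ANALYSES[s] for s in states if s in ANALYSES)

print(parse("asses"))   # ['ass N pl']
print(parse("assess"))  # ['assess V']
```

After reading asses the machine is in two states at once; only the next character, or the end of the word, decides which path survives, so no back-up is needed.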