Title: Lecture 2 Finite State Machines
1 Lecture 2: Finite State Machines and Morphology
CSCE 771 Natural Language Processing
- Topics
- Finite State Machines
- Morphology
- Readings: Chapter 3 (skim Chapter 2)
June 4, 2009
2
- Last Time
- Challenge of 2001's HAL
- Areas of Research
- Examples of Language Processing
- Formal Languages
- Alphabets, strings
- regular expressions denote languages
- finite automata (DFA, NFA) accept languages
- Today
- Slides from Lecture 1 30-
- Regular expressions in Perl, grep, vi, emacs, Word?
- Eliza
- Morphology
3 Python ??
4 Concepts on Regular Languages
- If r is a regular expression, then there is a construction from r that yields an NFA Mr such that L(r) = L(Mr)
- Every NFA can be converted into an equivalent DFA
- There are languages that are not regular.
- Every regular language can be generated by a regular grammar.
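The NFA-to-DFA conversion mentioned above can be sketched with the standard subset construction. This is a minimal illustration, not code from the lecture: the function names, the dictionary encoding of the transition relation, and the example NFA for (a|b)*ab are all assumptions made for this sketch.

```python
from collections import deque

def epsilon_closure(states, eps):
    """All NFA states reachable from `states` using only epsilon moves."""
    closure, stack = set(states), list(states)
    while stack:
        s = stack.pop()
        for t in eps.get(s, ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return frozenset(closure)

def nfa_to_dfa(alphabet, delta, eps, start, accepting):
    """Subset construction: each DFA state is a frozenset of NFA states."""
    d_start = epsilon_closure({start}, eps)
    d_delta, d_accepting = {}, set()
    queue, seen = deque([d_start]), {d_start}
    while queue:
        subset = queue.popleft()
        if subset & accepting:
            d_accepting.add(subset)
        for symbol in alphabet:
            moved = {t for s in subset for t in delta.get((s, symbol), ())}
            target = epsilon_closure(moved, eps)
            d_delta[(subset, symbol)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return d_start, d_delta, d_accepting

def dfa_accepts(dfa, word):
    """Run the DFA on `word` (symbols must come from the DFA's alphabet)."""
    state, d_delta, d_accepting = dfa
    for ch in word:
        state = d_delta[(state, ch)]
    return state in d_accepting

# Hypothetical NFA for (a|b)*ab: state 0 loops, guesses the final "ab".
NFA_DELTA = {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}}
DFA = nfa_to_dfa("ab", NFA_DELTA, {}, 0, {2})
```

With this encoding, `dfa_accepts(DFA, "bbaab")` follows a single path through sets of NFA states, which is what makes the DFA deterministic.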
5 Grep family revisited
- Global match Regular Expression and Print (GREP), from the ed command g/re/p
- grep '[uU]nix' f1 f2 ... fn
- egrep pat files // extended grep; efficient: NFA → DFA, then execute
- fgrep pat files // fixed grep, for fixed strings
- Find, for searching directories (not really reg expr)
- find dir -name pat // search for files with name matching pat
- find dir -exec grep pat {} \; // search in files for the pattern pat
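To make the grep/fgrep distinction concrete, here is a small Python sketch using the standard `re` module. The function names `grep` and `fgrep` and the sample lines are made up for illustration; they mimic the behavior of the Unix tools on a list of lines but are not the tools themselves.

```python
import re

def grep(pattern, lines):
    """Return (line_number, line) pairs where the regex matches anywhere in the line."""
    regex = re.compile(pattern)
    return [(n, line) for n, line in enumerate(lines, start=1) if regex.search(line)]

def fgrep(fixed, lines):
    """Like grep, but `fixed` is a literal string, not a regular expression."""
    return [(n, line) for n, line in enumerate(lines, start=1) if fixed in line]

# Hypothetical input lines, standing in for the contents of a file.
SAMPLE = ["Unix is old", "I like unix tools", "UNIX in caps", "a.b", "axb"]

matches = grep(r"[uU]nix", SAMPLE)   # matches lines 1 and 2 only
literal = fgrep("a.b", SAMPLE)       # the dot is taken literally here
```

Note that `grep("a.b", SAMPLE)` would also match "axb", because in a regular expression the dot matches any character.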
6 Editing scripts
- Create a script of editing commands, then execute with
- ex file1 < edScript
- Example
- 1,$s/[uU]nix/UNIX/g
- 1,$s/langauge/language/g
- g/^$/d // delete empty lines (^ start of line, $ end)
- w
- q
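The same three editing commands can be sketched in Python to show what each one does to the text. The function name `run_edit_script` is hypothetical, and the sketch works on a string rather than editing a file in place as ex does.

```python
import re

def run_edit_script(text):
    """Apply the three editing commands from the example script to `text`."""
    lines = text.split("\n")
    # 1,$s/[uU]nix/UNIX/g : replace every "unix" or "Unix" on every line
    lines = [re.sub(r"[uU]nix", "UNIX", ln) for ln in lines]
    # 1,$s/langauge/language/g : fix the misspelling on every line
    lines = [re.sub(r"langauge", "language", ln) for ln in lines]
    # g/^$/d : delete lines that are empty (start of line immediately followed by end)
    lines = [ln for ln in lines if not re.search(r"^$", ln)]
    return "\n".join(lines)

edited = run_edit_script("unix is a langauge tool\n\nUnix rules")
```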
7 Other Unix Regular Expression Based Tools
- sed (stream editor)
- awk
- Perl scripting language
- Python, Ruby
- NLTK, now in Python
8 Perl Regular Expressions
9 Eliza Substitutions
- Eliza took the input and performed several transformations (substitutions) to produce the output.
- First
- s/[mM]y/YOUR/g   my → YOUR
- s/I'm/YOU ARE/g   I'm → YOU ARE
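These first-pass substitutions can be sketched with Python's `re.sub`. The function name `first_pass` is hypothetical, and the `\b` word-boundary anchors are an addition not shown on the slide; they are there so that a name such as "Amy" is left alone.

```python
import re

def first_pass(text):
    """Apply the two first-pass Eliza substitutions to one input line."""
    # s/I'm/YOU ARE/g (word boundaries added as an assumption)
    text = re.sub(r"\bI'm\b", "YOU ARE", text)
    # s/[mM]y/YOUR/g (word boundaries added as an assumption)
    text = re.sub(r"\b[mM]y\b", "YOUR", text)
    return text

result = first_pass("I'm sad about my job")
```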
10 Eliza in Perl
print "Welcome to Elizalike. Talk to me! (Or type \"bye\" to quit.)\n";
# Start an infinite loop
while (1 == 1) {
    # This line reads in user input, and stores it in the special
    # variable $_, which makes the regular expression
    # statements below more succinct.
    $_ = <STDIN>;
    # Allow users to quit
    if ($_ =~ /bye|Bye|BYE/) {
        print "Elizalike: Well, it was nice talking with you!\n";
        exit (0);
    }
http://en.wikipedia.org/wiki/ELIZA
11
    # Insert a tag at the beginning of the line to identify
    # it as Eliza's response, and to make finding word
    # boundaries easier.
    s/^/Elizalike: /;
    # Replace all instances of "you are" with "Eliza is".
    # Note how (\W) and \1, etc. are used to mark word
    # boundaries and keep whatever non-word character was
    # in the input in the output.
    s/(\W)(you|You) are(\W)/\1Eliza is\3/g;
12 Transformations
13
    # Print the result to STDOUT.
    print STDOUT;
}
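For comparison, the steps of the Perl fragment (tag the line, rewrite "you are", quit on "bye") might look as follows in Python. This is a sketch under assumptions: the function names are invented here, and `chat` takes a list of input lines instead of reading STDIN so that it can be run without a terminal.

```python
import re

def is_goodbye(line):
    """Mirror of the quit test: the line mentions bye, Bye, or BYE."""
    return re.search(r"bye|Bye|BYE", line) is not None

def elizalike_reply(line):
    """Tag the line, then replace "you are" while keeping the neighboring characters."""
    reply = "Elizalike: " + line
    # Groups 1 and 3 capture the non-word characters around "you are";
    # \1 and \3 put them back so spacing and punctuation survive.
    return re.sub(r"(\W)(you|You) are(\W)", r"\1Eliza is\3", reply)

def chat(lines):
    """Process input lines in order, stopping at the first goodbye."""
    replies = []
    for line in lines:
        if is_goodbye(line):
            replies.append("Elizalike: Well, it was nice talking with you!")
            break
        replies.append(elizalike_reply(line))
    return replies

session = chat(["I think you are smart", "bye", "ignored"])
```

The tag inserted at the start of the line guarantees that a leading "you are" is preceded by a non-word character (the space after the colon), which is why the Perl comments say the tag makes word boundaries easier to find.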
14 Morphology
- A writer is someone who writes, a stinger is something that stings. But fingers don't fing, grocers don't groc, hammers don't ham and humdingers don't humding.
- Richard Lederer, Crazy English
- Consider mapping singular nouns to plurals
- Duck → ducks, cat → cats, book → books
- Fox → foxes, bush → bushes
- Goose → geese
- Fish → fish
- Morphological parsing: recognizing plurals
- Spelling rules
- Morphological rules
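The singular-to-plural mapping above combines a lexicon of exceptions with a spelling rule. A minimal sketch, assuming a tiny hand-written exception table and only the "-es after a sibilant" spelling rule (real English needs more rules, such as y → ies):

```python
import re

# Hypothetical exception lexicon; a real system would need a much larger one.
IRREGULAR = {"goose": "geese", "fish": "fish"}

def pluralize(noun):
    """Map a singular noun to its plural using exceptions, then spelling rules."""
    if noun in IRREGULAR:
        return IRREGULAR[noun]          # morphological exceptions come first
    if re.search(r"(s|x|z|ch|sh)$", noun):
        return noun + "es"              # spelling rule: insert e after a sibilant
    return noun + "s"                   # default regular plural

plurals = [pluralize(n) for n in ["duck", "fox", "bush", "goose", "fish"]]
```

Checking the exception table first matters: "fish" ends in "sh" and would otherwise become "fishes".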
15 Morphology
- Morphemes: minimal meaning-bearing units
- Stems and affixes
- Cat + -s
- Affixes
- Prefixes
- Suffixes
- Infixes
- Circumfixes
- Inflection: combination of a word stem with an affix, resulting in a word of the same class (noun → noun)
- Derivation: combination of a word stem with a grammatical morpheme, resulting in another class, verb → noun (e.g. walk + ing → walking)
16 Turkish: Example of a complex language
- The following is a Turkish word and its
decomposition.
17 English Regular Nouns (Number)
- In other languages you typically indicate other features such as gender.
- Plurality
- Possessive
SLP fig from 3.1.1
18 Regular Verbs
SLP fig from 3.1.1
19 Irregular Verbs
SLP fig from 3.1.1
20 To Love in Spanish
OLD SLP fig from 3.1.1
21 Nominalization
- Nominalization: formation of nouns from verbs or adjectives
SLP fig from 3.1.2
22 Adjectives from Nouns and Verbs
23 Morphological Parsed Output
- N = Noun, V = Verb
- PL, SG, Present (default, not shown), Past, Pres-Participle, Past-Participle
SLP fig 3.2
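As an illustration of what parsed output with these tags might look like, the sketch below maps a few surface nouns to a stem followed by feature tags. The "stem +N +PL" output format, the function name `parse_noun`, and the three small lexicons are assumptions for this sketch, not a reproduction of the figure.

```python
# Hypothetical mini-lexicons, just large enough for the examples.
REGULAR_NOUNS = {"cat", "fox", "duck", "book"}
IRREGULAR_SG = {"goose"}
IRREGULAR_PL = {"geese": "goose"}

def parse_noun(word):
    """Return 'stem +N +SG' or 'stem +N +PL', or None if the word is unknown."""
    if word in IRREGULAR_PL:
        return f"{IRREGULAR_PL[word]} +N +PL"
    if word in REGULAR_NOUNS or word in IRREGULAR_SG:
        return f"{word} +N +SG"
    # Try the longer suffix first so that "foxes" is split as fox + es.
    for suffix in ("es", "s"):
        stem = word[: -len(suffix)]
        if word.endswith(suffix) and stem in REGULAR_NOUNS:
            return f"{stem} +N +PL"
    return None

parses = {w: parse_noun(w) for w in ["cats", "foxes", "geese", "goose", "xyz"]}
```

This version accepts some misspellings, such as "cates", because it does not check which stems actually require the -es spelling; a fuller treatment would handle spelling rules separately.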
24 Figure 3.3 FSA for nominal inflection
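A finite-state automaton for nominal inflection can be sketched as a transition table over morpheme classes rather than letters. The state names, the arc layout, and the lexicon entries below are assumptions chosen for illustration and may differ from the figure: a regular noun may optionally take the plural -s, while irregular singular and plural forms are accepted as they are.

```python
# Hypothetical lexicon assigning each morpheme to a class.
REG_NOUN = {"cat", "fox", "dog"}
IRREG_SG = {"goose", "mouse"}
IRREG_PL = {"geese", "mice"}

# state -> {morpheme class: next state}; q2 has no outgoing arcs.
TRANSITIONS = {
    "q0": {"reg-noun": "q1", "irreg-sg-noun": "q2", "irreg-pl-noun": "q2"},
    "q1": {"plural-s": "q2"},
}
FINAL = {"q1", "q2"}

def classify(morpheme):
    """Look up the class of a morpheme; None if it is not in the lexicon."""
    if morpheme == "-s":
        return "plural-s"
    if morpheme in REG_NOUN:
        return "reg-noun"
    if morpheme in IRREG_SG:
        return "irreg-sg-noun"
    if morpheme in IRREG_PL:
        return "irreg-pl-noun"
    return None

def accepts(morphemes):
    """Run the FSA over a sequence of morphemes; accept if it ends in a final state."""
    state = "q0"
    for m in morphemes:
        state = TRANSITIONS.get(state, {}).get(classify(m))
        if state is None:
            return False
    return state in FINAL
```

For example, `accepts(["cat", "-s"])` succeeds, while `accepts(["geese", "-s"])` fails because there is no arc leaving the state reached by an irregular plural.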
25 Figure 3.4 FSA for verbal inflection
26 Verb stem classes
27 Fig 3.5 FSA for Adjective morphology
- Consider
- Cool, happy, natural
- Big, equal
29 Fig 3.6 FSA for another fragment of English derivational morphology
30 Fig 3.7 English nouns with inflection
34 Fig 3.14 FSA for English nominalization