Natural Language Processing - PowerPoint PPT Presentation

Provided by: muk1
Learn more at: https://www.cs.odu.edu

1
Natural Language Processing
  • CS480/580

2
Levels of Linguistic Analysis
  • Phonology---recognize speech sounds
  • Morphology---analysis of word forms (e.g., adding
    s to make a plural etc.)
  • Syntax---sentence structure
  • Semantics---meaning
  • Pragmatics---relation of language to context

3
Tokenization
  • A string is broken into words, punctuation is removed,
    and the key information is represented as a sequence of
    words or tokens.
  • E.g., "How are you today?" is converted to
    [how, are, you, today].

4
Tokenize.pl
  • lower_case(A, B) :- A >= 65, A =< 90, !, B is A+32.
  • lower_case(A, A).
  • tokenize([], []) :- !.
  • tokenize(A, [B|E]) :- grab_word(A, C, D),
    name(B, C), tokenize(D, E).
  • punctuation_mark(A) :- A =< 47.
  • punctuation_mark(A) :- A >= 58, A =< 64.
  • punctuation_mark(A) :- A >= 91, A =< 96.
  • punctuation_mark(A) :- A >= 123.
  • grab_word([32|A], [], A) :- !.
  • grab_word([], [], []).
  • grab_word([A|B], C, D) :- punctuation_mark(A),
    !, grab_word(B, C, D).
  • grab_word([D|A], [E|B], C) :- grab_word(A, B, C),
    lower_case(D, E).
  • ?- tokenize("This is CS480/580 course", X).
  • X = [this, is, cs480580, course].
  • ?- name(john, X).
  • X = [106, 111, 104, 110].
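
The tokenizer above expects its input as a list of character codes. The following is a minimal sketch of how it might be run on SWI-Prolog 7 or later, where a double-quoted literal is a string object rather than a code list. The choice of SWI-Prolog and the wrapper name tokenize_string/2 are assumptions made for illustration; the other clauses repeat the slide.

```prolog
% Sketch for SWI-Prolog 7+ (an assumption); tokenize_string/2 is a
% hypothetical wrapper that is not part of Tokenize.pl.

lower_case(A, B) :- A >= 65, A =< 90, !, B is A + 32.  % 'A'..'Z' -> 'a'..'z'
lower_case(A, A).

punctuation_mark(A) :- A =< 47.
punctuation_mark(A) :- A >= 58, A =< 64.
punctuation_mark(A) :- A >= 91, A =< 96.
punctuation_mark(A) :- A >= 123.

grab_word([32|A], [], A) :- !.          % a space ends the current word
grab_word([], [], []).                  % so does the end of the input
grab_word([A|B], C, D) :- punctuation_mark(A), !, grab_word(B, C, D).
grab_word([D|A], [E|B], C) :- grab_word(A, B, C), lower_case(D, E).

tokenize([], []) :- !.
tokenize(A, [B|E]) :- grab_word(A, C, D), name(B, C), tokenize(D, E).

% Convert the string to a code list first, then reuse tokenize/2.
tokenize_string(String, Tokens) :-
    string_codes(String, Codes),
    tokenize(Codes, Tokens).
```

With these clauses loaded, the query `?- tokenize_string("How are you today?", T).` should give `T = [how, are, you, today]`.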

5
Template System
  • Templates --- stored sentence patterns
  • Each template is accompanied by a translation
    schema
  • E.g., [X, is, a, Y] is translated to Y(X).
  • process([X, is, a, Y]) :- Fact =.. [Y, X],
    assert(Fact).
  • process([is, X, a, Y]) :- Query =.. [Y, X],
    call(Query).
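
The translation schema relies on the built-in `=..` ("univ"), which converts between a term and a list made of its functor and arguments. The sketch below shows one way the two templates could be tried out in SWI-Prolog (an assumption). The helper safe_call/1 is hypothetical and only keeps a question about a never-asserted predicate from raising an error.

```prolog
% Sketch of the translation schema, assuming SWI-Prolog.
% safe_call/1 is a hypothetical helper, not part of the slides.

% Statement "X is a Y": build the term Y(X) and store it as a fact.
process([X, is, a, Y]) :-
    Fact =.. [Y, X],
    assertz(Fact).

% Question "Is X a Y?": build the same term and try to prove it.
process([is, X, a, Y]) :-
    Query =.. [Y, X],
    safe_call(Query).

% Treat a predicate that was never asserted as false
% instead of raising an existence error.
safe_call(Goal) :-
    catch(Goal, error(existence_error(procedure, _), _), fail).
```

After `?- process([cs480, is, a, course]).` the fact `course(cs480)` is in the database, so `?- process([is, cs480, a, course]).` succeeds while `?- process([is, cs480, a, dog]).` fails.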

6
Template.pl
  • grab_word([32|A], [], A) :- !.
  • grab_word([], [], []).
  • grab_word([A|B], C, D) :- punctuation_mark(A),
    !, grab_word(B, C, D).
  • grab_word([D|A], [E|B], C) :- grab_word(A, B, C),
    lower_case(D, E).
  • punctuation_mark(A) :- A =< 47.
  • punctuation_mark(A) :- A >= 58, A =< 64.
  • punctuation_mark(A) :- A >= 91, A =< 96.
  • punctuation_mark(A) :- A >= 123.
  • lower_case(A, B) :- A >= 65, A =< 90, !, B is A+32.
  • lower_case(A, A).
  • write_str([A|B]) :- put(A), write_str(B).
  • write_str([]).
  • read_str_aux(-1, []) :- !.
  • read_str_aux(10, []) :- !.
  • read_str_aux(13, []) :- !.
  • read_str_aux(A, [A|B]) :- read_str(B).
  • do_one_sentence :- write('>'), read_str(A),
    tokenize(A, B), process(B).
  • note(A) :- asserta(A), write('OK'), nl.
  • read_atom(A) :- read_str(B), name(A, B).

7
  • remove_s(A, C) :- name(A, B), remove_s_list(B, D),
    name(C, D).
  • read_str(B) :- get0(A), read_str_aux(A, B).
  • remove_s_list([115], []).
  • remove_s_list([A|B], [A|C]) :- remove_s_list(B, C).
  • process([B, is, a, A]) :- !, C =.. [A, B],
    note(C).
  • process([A, is, an, B]) :- !, process([A, is, a, B]).
  • process([is, B, a, A]) :- !, C =.. [A, B],
    check(C).
  • process([is, A, an, B]) :- !, process([is, A, a, B]).
  • process([A, are, B]) :- !, remove_s(A, D),
    remove_s(B, C), F =.. [C, E], G =.. [D, E],
    note((F :- G)).
  • process([does, B, A]) :- !, C =.. [A, B],
    check(C).
  • process([A, B]) :- \+ remove_s(A, _),
    remove_s(B, C), !, D =.. [C, A], note(D).
  • process([A, B]) :- remove_s(A, C),
    \+ remove_s(B, _), !, E =.. [B, D], F =.. [C, D],
    note((E :- F)).
  • process(_) :- write('I do not understand.'),
    nl.
  • tokenize([], []) :- !.
  • tokenize(A, [B|E]) :- grab_word(A, C, D),
    name(B, C), tokenize(D, E).
  • ?- start.
  • TEMPLATE.PL at your service.
  • Terminate by pressing Break.
  • >CS480 is a course.

8
Generative Grammars
  • Templates are inadequate to describe human
    language (in the last example, the only sentences
    allowed were of the form "X is a Y").
  • John arrived
  • Max said John arrived
  • Bill claimed Max said John arrived
  • Mary thought Bill claimed Max said John arrived
  • Chomsky's suggestion: treat syntax as a problem
    in set theory---express an infinite set by a finite
    description
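
The embedded sentences above can be described by two rules, one of which refers to itself, which is what makes a finite description of an infinite set possible. The sketch below states this in plain Prolog; the predicate names person/1, report_verb/1 and sentence/1 and the word lists are illustrative assumptions, not part of the slides.

```prolog
% person/1 and report_verb/1 are illustrative names chosen for this sketch.
person(john).  person(max).  person(bill).  person(mary).
report_verb(said).  report_verb(claimed).  report_verb(thought).

% Base case: "<person> arrived".
sentence([P, arrived]) :- person(P).

% Recursive case: "<person> <verb> <sentence>".
sentence([P, V | Rest]) :-
    person(P),
    report_verb(V),
    sentence(Rest).
```

The query `?- sentence([mary, thought, bill, claimed, max, said, john, arrived]).` succeeds, and the same two clauses accept embeddings of any depth.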

9
Context Free Grammars
  • Phrase Structure Rules
  • S → NP VP
  • NP → Det N
  • N → N PP
  • N → N N
  • PP → P NP
  • VP → IV
  • VP → TV NP
  • VP → DV NP NP
  • Lexical Entries
  • N → book, cow, course, ...
  • P → in, on, with, ...
  • Det → the, every, ...
  • IV → ran, hid, ...
  • TV → likes, hit, ...
  • DV → gave, showed

[Image: Noam Chomsky]
10
Context-Free Derivations
  • S → NP VP → Det N VP → the N VP → the kid VP →
    the kid IV → the kid ran
  • Penn TreeBank bracketing notation (Lisp-like)
  • (S (NP (Det the)
           (N kid))
       (VP (IV ran)))
  • Theorem: A sequence has a derivation if and only
    if it has a parse tree
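
The derivation above always rewrites the leftmost symbol that still has a rule. A minimal sketch of that process in SWI-Prolog (an assumption) follows. The predicates rule/2, step/2 and derive/2 are hypothetical names, and only the rules needed for "the kid ran" are included.

```prolog
:- use_module(library(lists)).   % append/3

% rule(Symbol, RightHandSide): just enough rules for "the kid ran".
rule(s,   [np, vp]).
rule(np,  [det, n]).
rule(vp,  [iv]).
rule(det, [the]).
rule(n,   [kid]).
rule(iv,  [ran]).

% step(+Form, -Next): rewrite the leftmost symbol that has a rule.
step(Form, Next) :-
    append(Pre, [Sym|Post], Form),
    rule(Sym, Rhs),
    !,
    append(Rhs, Post, Tail),
    append(Pre, Tail, Next).

% derive(+Form, -Steps): collect every form until no rule applies.
derive(Form, [Form|Rest]) :-
    (   step(Form, Next)
    ->  derive(Next, Rest)
    ;   Rest = []
    ).
```

The query `?- derive([s], Steps).` yields `[[s], [np,vp], [det,n,vp], [the,n,vp], [the,kid,vp], [the,kid,iv], [the,kid,ran]]`, the same sequence as the derivation on the slide.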

11
Standard Parse Tree Notation

12
A Simple Parser
  • verb_phrase(A, C) :- verb(A, B), noun_phrase(B, C).
  • verb_phrase(A, C) :- verb(A, B), sentence(B, C).
  • determiner([the|A], A).
  • determiner([a|A], A).
  • sentence(A, C) :- noun_phrase(A, B),
    verb_phrase(B, C).
  • noun_phrase(A, C) :- determiner(A, B), noun(B, C).
  • noun([dog|A], A).
  • noun([cat|A], A).
  • noun([boy|A], A).
  • noun([girl|A], A).
  • verb([chased|A], A).
  • verb([saw|A], A).
  • verb([said|A], A).
  • verb([believed|A], A).
  • 2 ?- sentence([the, cat, saw, the, dog], []).
  • true .
  • 3 ?- sentence([the, dog, saw, the, dog], []).
  • true .
  • 4 ?- sentence([a, dog, chased, the, cat], []).

13
Definite Clause Grammar (DCG)
  • This is a Prolog notation that provides an easy way
    to write grammar rules.
  • E.g., sentence --> noun_phrase, verb_phrase.
  • This is equivalent to the rule
  • sentence(X,Z) :- noun_phrase(X,Y),
    verb_phrase(Y,Z).
  • Also, noun --> [dog] or
    noun --> [dog] ; [cat] ; [boy] ; [girl]
  • or verb --> [gives, up] where "gives up" is a
    single verb.
  • A query against the above sentence rule calls
    sentence/2
  • E.g., sentence([the, dog, chased, the, cat], []).
  • Try sentence([A,B,C,D,E], []) or
    sentence([the, A, B, C, cat|E], []).
  • Non-terminal symbols can also take arguments,
    e.g., sentence(N) --> noun_phrase(N),
    verb_phrase(N).
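
One common use of arguments on non-terminals is to build the parse tree while parsing. The sketch below illustrates this with a trimmed-down grammar; the tree functors s/2, np/2, vp/2, det/1, n/1 and v/1 are naming assumptions, and the query goes through the standard phrase/2 interface instead of calling the translated predicate directly.

```prolog
% Each non-terminal carries one extra argument holding its subtree.
sentence(s(NP, VP))    --> noun_phrase(NP), verb_phrase(VP).
noun_phrase(np(D, N))  --> determiner(D), noun(N).
verb_phrase(vp(V, NP)) --> verb(V), noun_phrase(NP).

determiner(det(the)) --> [the].
determiner(det(a))   --> [a].

noun(n(dog)) --> [dog].
noun(n(cat)) --> [cat].

verb(v(chased)) --> [chased].
verb(v(saw))    --> [saw].
```

The query `?- phrase(sentence(T), [the, dog, chased, the, cat]).` binds `T` to `s(np(det(the), n(dog)), vp(v(chased), np(det(the), n(cat))))`, which has the same shape as the bracketing notation used for parse trees.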

14
Parser2.pl based on DCG
  • sentence --> noun_phrase, verb_phrase.
  • noun_phrase --> determiner, noun.
  • verb_phrase --> verb, noun_phrase.
  • verb_phrase --> verb, sentence.
  • determiner --> [the].
  • determiner --> [a].
  • noun --> [dog] ; [cat] ; [boy] ; [girl].
  • verb --> [chased] ; [saw] ; [said] ; [believed].
  • % The next three rules repeat verbs already covered
    % by the disjunction above.
  • verb --> [saw].
  • verb --> [said].
  • verb --> [believed].

15
Grammatical Features
  • How to handle agreement in tense and number
    between the noun and the verb?
  • sentence(N) --> noun_phrase(N), verb_phrase(N).
  • noun_phrase(N) --> determiner(N), noun(N).
  • verb_phrase(N) --> verb(N), noun_phrase(_).
  • verb_phrase(N) --> verb(N), sentence(_).
  • determiner(singular) --> [a].
  • determiner(_) --> [the].
  • determiner(plural) --> [].
  • noun(singular) --> [dog] ; [cat] ; [boy] ; [girl].
  • noun(plural) --> [dogs] ; [cats] ; [boys] ; [girls].
  • verb(singular) --> [chases] ; [sees] ; [says] ;
    [believes].
  • verb(plural) --> [chase] ; [see] ; [say] ; [believe].

16
  • ?- sentence(plural, [the, dogs, A, B, C], []).
  • A = chase,
  • B = a,
  • C = dog ;
  • A = chase,
  • B = a,
  • C = cat ;
  • A = chase,
  • B = a,
  • C = boy ;
  • A = chase,
  • B = a,
  • C = girl ;
  • A = chase,
  • B = the,
  • C = dog
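
The number feature can also be used to check agreement rather than to enumerate sentences. The following sketch restates the feature grammar so that it loads on its own in SWI-Prolog (an assumption) and queries it through phrase/2. Writing the embedded sentence as `sentence(_)` is an assumption about the intended rule.

```prolog
% Number agreement: the subject noun phrase and the verb share N.
sentence(N)    --> noun_phrase(N), verb_phrase(N).
noun_phrase(N) --> determiner(N), noun(N).
verb_phrase(N) --> verb(N), noun_phrase(_).   % the object may differ in number
verb_phrase(N) --> verb(N), sentence(_).      % assumed: embedded clause is free

determiner(singular) --> [a].
determiner(_)        --> [the].
determiner(plural)   --> [].                  % bare plurals: "dogs chase cats"

noun(singular) --> [dog] ; [cat] ; [boy] ; [girl].
noun(plural)   --> [dogs] ; [cats] ; [boys] ; [girls].

verb(singular) --> [chases] ; [sees] ; [says] ; [believes].
verb(plural)   --> [chase] ; [see] ; [say] ; [believe].
```

Here `?- phrase(sentence(_), [the, dogs, chase, a, cat]).` succeeds, `?- phrase(sentence(_), [the, dogs, chases, a, cat]).` fails because the verb is singular, and `?- phrase(sentence(N), [a, dog, sees, the, cats]).` gives `N = singular`.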

17
Morphology
  • How to generate plural nouns from singular?
  • How to generate third person singular verbs from
    plural verbs?
  • Mostly by adding "s"
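
As a rough illustration of the generation direction, the sketch below appends "s" with the standard atom_concat/3 and consults a small exception table first. The predicate names irregular_plural/2, plural/2 and third_singular/2 are hypothetical, and forms that need "es" or other spelling changes are deliberately left out.

```prolog
% irregular_plural/2, plural/2 and third_singular/2 are illustrative names.
irregular_plural(child, children).

% plural(+Singular, -Plural): check the exception table, else add "s".
plural(Singular, Plural) :-
    irregular_plural(Singular, Plural),
    !.
plural(Singular, Plural) :-
    atom_concat(Singular, s, Plural).

% third_singular(+PluralVerb, -SingularVerb): "chase" becomes "chases".
third_singular(Verb, Singular) :-
    atom_concat(Verb, s, Singular).
```

For example, `?- plural(dog, P).` gives `P = dogs`, `?- plural(child, P).` gives `P = children`, and `?- third_singular(see, V).` gives `V = sees`.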

18
  • sentence(N) --> noun_phrase(N), verb_phrase(N).
  • noun_phrase(N) --> determiner(N), noun(N).
  • verb_phrase(N) --> verb(N), noun_phrase(_).
  • verb_phrase(N) --> verb(N), sentence(_).
  • determiner(singular) --> [a].
  • determiner(_) --> [the].
  • determiner(plural) --> [].
  • noun(N) --> [X], { morph(noun(N),X) }.
  • verb(N) --> [X], { morph(verb(N),X) }.
  • morph(noun(singular),dog).       % Singular nouns
  • morph(noun(singular),cat).
  • morph(noun(singular),boy).
  • morph(noun(singular),girl).
  • morph(noun(singular),child).
  • morph(noun(plural),children).    % Irregular plural nouns
  • morph(noun(plural),X) :-         % Rule for regular plural nouns
        remove_s(X,Y),
        morph(noun(singular),Y).

19
  • morph(verb(plural),chase).       % Plural verbs
  • morph(verb(plural),see).
  • morph(verb(plural),say).
  • morph(verb(plural),believe).
  • morph(verb(singular),X) :-       % Rule for singular verbs
        remove_s(X,Y),
        morph(verb(plural),Y).
  • % remove_s(+X,-X1)  (lifted from TEMPLATE.PL)
    % removes final S from X giving X1,
    % or fails if X does not end in S.
  • remove_s(X,X1) :-
        name(X,XList),
        remove_s_list(XList,X1List),
        name(X1,X1List).
  • remove_s_list("s",[]).           % "s" as a code list, i.e. [115]
  • remove_s_list([Head|Tail],[Head|NewTail]) :-
        remove_s_list(Tail,NewTail).
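
The pieces above can be combined into one small file for experimentation. The following is a shortened sketch, assuming SWI-Prolog: it keeps only a few lexical entries, omits the embedded-sentence rule, and writes the final "s" as the code list `[115]` so that it does not depend on how double-quoted text is read.

```prolog
% Grammar with number agreement; nouns and verbs are looked up via morph/2.
sentence(N)    --> noun_phrase(N), verb_phrase(N).
noun_phrase(N) --> determiner(N), noun(N).
verb_phrase(N) --> verb(N), noun_phrase(_).

determiner(singular) --> [a].
determiner(_)        --> [the].
determiner(plural)   --> [].

noun(N) --> [X], { morph(noun(N), X) }.
verb(N) --> [X], { morph(verb(N), X) }.

morph(noun(singular), dog).
morph(noun(singular), child).
morph(noun(plural), children).                 % irregular plural
morph(noun(plural), X) :-                      % regular plural: strip "s"
    remove_s(X, Y),
    morph(noun(singular), Y).

morph(verb(plural), chase).
morph(verb(plural), see).
morph(verb(singular), X) :-                    % third person singular
    remove_s(X, Y),
    morph(verb(plural), Y).

% remove_s(+X, -X1): strip a final "s" from X, or fail if there is none.
remove_s(X, X1) :-
    name(X, XList),
    remove_s_list(XList, X1List),
    name(X1, X1List).

remove_s_list([115], []).                      % 115 is the code of "s"
remove_s_list([Head|Tail], [Head|NewTail]) :-
    remove_s_list(Tail, NewTail).
```

With this file loaded, `?- phrase(sentence(N), [the, children, chase, a, dog]).` gives `N = plural`, `?- phrase(sentence(N), [the, child, chases, a, dog]).` gives `N = singular`, and `?- phrase(sentence(_), [the, child, chase, a, dog]).` fails because subject and verb disagree.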