Logic, Language and Learning - PowerPoint PPT Presentation

1
Logic, Language and Learning
  • Chapter 11
  • Parsing Techniques
  • Luc De Raedt
  • (Following Covington)

2
Outline
  • Today's lecture: Parsing Techniques

3
Parsing: a survey
4
Grammar for Illustration (1)
5
In Prolog Notation (2)
  • Building interpreters instead of
    using compiled DCGs
  • S -> NP, VP
  • s(L1,L) :- np(L1,L2), vp(L2,L).
  • Works for a variety of parsing
    techniques
  • Phrase-structure rules
  • rule(s,[np,vp]).
  • rule(np,[d,n]).
  • rule(vp,[v,np]).
  • rule(vp,[v,np,pp]).
  • rule(pp,[p,np]).
  • rule(d,[]).

6
In Prolog Notation (2)
  • Lexicon
  • word(d,the).
  • word(p,near).
  • word(n,dog). word(n,dogs).
  • word(n,cat). word(n,cats).
  • word(n,elephant).
  • word(n,elephants).
  • word(v,chase). word(v,chases).
  • word(v,see). word(v,sees).
  • word(v,amuse). word(v,amuses).

7
Top down (1)
  • parse(?C,?S1,?S): Parse a constituent of category C
    starting with input string S1 and ending up
    with input string S.
  • parse(C,[Word|S],S) :-
    word(C,Word).
  • parse(C,S1,S) :-
    rule(C,Cs),
    parse_list(Cs,S1,S).
  • What happens with
    rule(np,[np,conj,np]). ?

8
Top down (1)
  • parse(?C,?S1,?S): Parse a constituent of category C
    starting with input string S1 and ending up
    with input string S.
  • parse_list(+Cs,?S1,?S):
    Like parse/3, but Cs is a list of categories
    to be parsed in succession.
  • parse_list([C|Cs],S1,S) :-
    parse(C,S1,S2),
    parse_list(Cs,S2,S).
  • parse_list([],S,S).
  • What happens with
    rule(np,[np,conj,np]). ?
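
The top-down interpreter on these two slides can be assembled into one small program. The following is a minimal sketch that assumes the grammar and a subset of the lexicon from the earlier slides; it is meant for experimentation, not as the lecture's reference code.

```prolog
% Phrase-structure rules (from the grammar slide).
rule(s,  [np, vp]).
rule(np, [d, n]).
rule(vp, [v, np]).
rule(vp, [v, np, pp]).
rule(pp, [p, np]).
rule(d,  []).            % null determiner

% A subset of the lexicon slide.
word(d, the).
word(p, near).
word(n, dog).       word(n, dogs).
word(n, cat).       word(n, cats).
word(n, elephant).
word(v, chase).     word(v, chases).
word(v, see).       word(v, sees).

% parse(?C, ?S1, ?S): a constituent of category C starts at S1
% and leaves S as the remaining input.
parse(C, [Word|S], S) :-
    word(C, Word).
parse(C, S1, S) :-
    rule(C, Cs),
    parse_list(Cs, S1, S).

% parse_list(+Cs, ?S1, ?S): parse the categories in Cs in succession.
parse_list([C|Cs], S1, S) :-
    parse(C, S1, S2),
    parse_list(Cs, S2, S).
parse_list([], S, S).
```

A query such as `?- parse(s, [the,dog,chases,the,cat,near,the,elephant], []).` should succeed. If the left-recursive rule `rule(np,[np,conj,np]).` is added, `parse(np,S1,S)` eventually calls `parse(np,S1,_)` on the same input, so the parser loops as soon as backtracking reaches that rule (for example on any input that is not a sentence).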

9
Bottom up Shift-Reduce (1)
  • 1. Shift a word onto the stack.
  • 2. Reduce the stack repeatedly using lexical
    entries and PS rules, until no further reductions
    are possible.
  • 3. If there are more words in the input string, go
    to 1. Otherwise stop.
  • Covington, p. 157

10
Bottom up
11
Bottom up Shift reduce (2)
12
Bottom up Shift reduce (3)
  • Here "brule" means "backward rule."
  • brule([vp,np|X],[s|X]).
  • brule([n,d|X],[np|X]).
  • brule([np,v|X],[vp|X]).
  • brule([pp,np,v|X],[vp|X]).
  • brule([np,p|X],[pp|X]).
  • brule([np,conj,np|X],[np|X]).
  • brule([Word|X],[Cat|X]) :- word(Cat,Word).
  • Lexicon
  • word(conj,and).
  • word(p,near).
  • word(d,the).
  • word(n,dog). word(n,dogs).
  • word(n,cat). word(n,cats).
  • word(n,elephant). word(n,elephants).
  • Bottom-up shift-reduce parser
  • parse(+S,?Result): parses input string S, where
    Result is the list of categories to which it
    reduces.
  • parse(S,Result) :-
    shift_reduce(S,[],Result).
  • shift_reduce(+S,+Stack,?Result): parses input
    string S, where Stack is the
    list of categories parsed so far.
  • shift_reduce(S,Stack,Result) :-
    shift(Stack,S,NewStack,S1), % fails if S = []
    reduce(NewStack,ReducedStack),
    shift_reduce(S1,ReducedStack,Result).
  • shift_reduce([],Result,Result).
  • shift(+Stack,+S,-NewStack,-NewS): shifts the first
    element from S onto Stack.
  • shift(X,[H|Y],[H|X],Y).
  • reduce(+Stack,-ReducedStack): repeatedly reduces
    the beginning of Stack
    to form fewer, larger constituents.
  • reduce(Stack,ReducedStack) :-
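
The transcript stops at the head of reduce/2. The sketch below shows one way the whole shift-reduce parser might be completed. The two reduce/2 clauses are an assumption derived from the comment "repeatedly reduces beginning of Stack", not the slide's own code, and the verb entries are borrowed from the earlier lexicon slide.

```prolog
% Backward rules: the stack holds the most recent category first.
brule([vp,np|X],      [s|X]).
brule([n,d|X],        [np|X]).
brule([np,v|X],       [vp|X]).
brule([pp,np,v|X],    [vp|X]).
brule([np,p|X],       [pp|X]).
brule([np,conj,np|X], [np|X]).
brule([Word|X],       [Cat|X]) :- word(Cat, Word).

word(conj, and).
word(p, near).
word(d, the).
word(n, dog).   word(n, cat).   word(n, elephant).
word(v, chases).  word(v, sees).      % verbs taken from the earlier lexicon

% parse(+S, ?Result): Result is the list of categories S reduces to.
parse(S, Result) :-
    shift_reduce(S, [], Result).

shift_reduce(S, Stack, Result) :-
    shift(Stack, S, NewStack, S1),        % fails if S = []
    reduce(NewStack, ReducedStack),
    shift_reduce(S1, ReducedStack, Result).
shift_reduce([], Result, Result).

shift(X, [H|Y], [H|X], Y).

% ASSUMED completion of reduce/2: apply a backward rule and keep
% reducing; on backtracking, stop reducing and return the stack as is.
reduce(Stack, ReducedStack) :-
    brule(Stack, Stack2),
    reduce(Stack2, ReducedStack).
reduce(Stack, Stack).
```

With this completion, `?- parse([the,dog,chases,the,cat], R).` should give `R = [s]` as its first answer. For a sentence ending in a PP, the greedy reduction to `s` comes too early, and the parse is found only after backtracking into reduce/2.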

13
Left-corner parsing: Combining bottom up and top
down
  • To parse a constituent of type C:
  • 1. Accept a word from the input. Call its category W.
  • 2. Complete C. If W = C, you're done. Otherwise:
  • Look at the rules and find a constituent P
    whose expansion begins with W.
  • Recursively left-corner-parse the remaining
    elements of the expansion of P.
  • Put P in place of W and go to 2.
  • Example: "the" is of type D
  • Use rule NP -> D N
  • So W = D and P = NP

14
Left Corner parsing (2)
  • complete(+W,+C,+S1,-S):
    Verifies that W can be the first
    sub-constituent
    of C, then left-corner-parses the rest of C.
  • complete(C,C,S,S). % if C = W, do nothing.
  • complete(W,C,S1,S) :-
    rule(P,[W|Rest]),
    parse_list(Rest,S1,S2),
    complete(P,C,S2,S).
  • rule(pp,[p,np]).
  • rule(d,[]): not suitable for a
    left-corner parser
  • Lexicon
  • word(conj,and).
  • parse(+C,+S1,-S):
    Parse a constituent of category C
    starting with input string S1 and
    ending up with input string S.
  • parse(C,[Word|S2],S) :-
    word(W,Word),
    complete(W,C,S2,S).
  • parse_list(+Cs,+S1,-S):
    Like parse/3, but Cs is a list of
    categories to be parsed in succession.
  • parse_list([C|Cs],S1,S) :-
    parse(C,S1,S2),
    parse_list(Cs,S2,S).
  • parse_list([],S,S).
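
Put together, the left-corner parser on this slide fits in a few lines. The sketch below is a self-contained version under some assumptions: the grammar is the one from the earlier slides without the null determiner, the left-recursive coordination rule `rule(np,[np,conj,np])` is added to show that it is handled, and the lexicon is a small subset.

```prolog
rule(s,  [np, vp]).
rule(np, [d, n]).
rule(np, [np, conj, np]).   % left-recursive rule, added for illustration
rule(vp, [v, np]).
rule(vp, [v, np, pp]).
rule(pp, [p, np]).

word(conj, and).
word(d, the).
word(p, near).
word(n, dog).   word(n, cat).   word(n, elephant).
word(v, chases).  word(v, sees).

% parse(+C, +S1, -S): accept one word, then complete C from its category.
parse(C, [Word|S2], S) :-
    word(W, Word),
    complete(W, C, S2, S).

% parse_list(+Cs, +S1, -S): parse the categories in Cs in succession.
parse_list([C|Cs], S1, S) :-
    parse(C, S1, S2),
    parse_list(Cs, S2, S).
parse_list([], S, S).

% complete(+W, +C, +S1, -S): W is the left corner found so far.
complete(C, C, S, S).                 % W = C: nothing left to do
complete(W, C, S1, S) :-
    rule(P, [W|Rest]),                % W begins the expansion of P
    parse_list(Rest, S1, S2),         % parse the rest of that expansion
    complete(P, C, S2, S).            % now P is the left corner
```

For example, `?- parse(np, [the,dog,and,the,cat], []).` should succeed. Every step through the left-recursive rule has to consume the word `and` before it can recurse, so the rule does not cause the loop it causes in the top-down parser.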

15
Left corner parsing (3)
  • What about rules of the form
  • A -> B C
  • C -> ε (null constituent)
  • Top-down
  • rule(a,[b,c]).
  • rule(c,[]).
  • parse(C,S2,S) :-
    rule(W,[]),
    complete(W,C,S2,S).
  • Looping with parses that do not succeed

16
Left corner with links
  • Modify
  • parse(C,[Word|S2],S) :-
    word(W,Word),
    link(W,C),
    complete(W,C,S2,S).
  • parse(C,S2,S) :-
    rule(W,[]), % for null constituents
    link(W,C),
    complete(W,C,S2,S).
  • Consider
  • S -> NP VP
  • NP -> D N
  • VP -> V N
  • Add
  • link(np,s).
  • link(d,np).
  • link(d,s).
  • link(v,vp).
  • link(X,X).
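
The link table above is written by hand. As an illustration only, the same relation could be derived from the rules as the reflexive-transitive closure of "is the first element of the expansion of". The helper name left_corner/2 is hypothetical, and the sketch assumes the grammar has no left-recursive rule, since the naive closure would loop on one.

```prolog
% The three rules from the slide.
rule(s,  [np, vp]).
rule(np, [d, n]).
rule(vp, [v, n]).

% left_corner(?W, ?C): W is the first element of an expansion of C.
% (Hypothetical helper, not part of the slides.)
left_corner(W, C) :-
    rule(C, [W|_]).

% link(?W, ?C): W can begin a constituent of category C.
link(X, X).
link(W, C) :-
    left_corner(W, P),
    link(P, C).
```

With these definitions `link(d,s)` and `link(v,vp)` hold, while `link(v,np)` fails, which matches the hand-written facts.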

17
BUP
  • Left corner parsing directly in Prolog
  • NP -> D N PP
  • d(C,S1,S) :-
    parse(n,S1,S2),
    parse(pp,S2,S3),
    np(C,S3,S).
  • np(np,S,S).
  • n(n,S,S).
  • Etc.

18
Chart parsing (1)
  • Consider
  • VP -> V NP (PP)
  • Realized as
  • VP -> V NP
  • VP -> V NP PP
  • Double work!

19
Chart parsing first approach (2)
  • parse(C,S1,S) :- chart(C,S1,S).
  • parse(C,[Word|S],S) :-
    word(C,Word).
  • parse(C,S1,S) :-
    rule(C,Cs),
    parse_list(Cs,S1,S),
    asserta(chart(C,S1,S)).
  • clear_chart :- abolish(chart/3).
  • For each constituent store
  • What kind it is
  • Where it begins
  • Where it ends
  • Represent (initially) e.g. as
  • chart(np,[the,cat,into,the,garden],[into,the,garden])

20
Chart parsing numerical positions
  • More compact representation.
  • Use
  • chart(np,0,2)
  • instead of
  • chart(np,[the,cat,into,the,garden],[into,the,garden])
  • Words
  • c(the,0,1).
  • c(dog,1,2).
  • c(sees,2,3).
  • c(the,3,4).
  • c(cat,4,5).
  • c(near,5,6).
  • c(the,6,7).
  • c(elephant,7,8).
  • parse(C,S1,S) :-
    chart(C,S1,S).
  • parse(C,S1,S) :-
    c(Word,S1,S), % this is the only change
    word(C,Word).
  • parse(C,S1,S) :-
    rule(C,Cs),
    parse_list(Cs,S1,S),
    asserta(chart(C,S1,S)).
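
The position-based chart parser can be tried out as a complete program. This is a sketch under stated assumptions: chart/3 is declared dynamic so that it can be called while empty, `retractall/1` is swapped in for the slide's `abolish(chart/3)` (which many current Prolog systems refuse for dynamic predicates or use to remove the declaration as well), and the null determiner rule is left out.

```prolog
:- dynamic chart/3.

rule(s,  [np, vp]).
rule(np, [d, n]).
rule(vp, [v, np]).
rule(vp, [v, np, pp]).
rule(pp, [p, np]).

word(d, the).   word(p, near).  word(v, sees).
word(n, dog).   word(n, cat).   word(n, elephant).

% The input sentence, one fact per word with start and end position.
c(the,0,1).   c(dog,1,2).   c(sees,2,3).  c(the,3,4).
c(cat,4,5).   c(near,5,6).  c(the,6,7).   c(elephant,7,8).

% Assumption: retractall/1 instead of abolish/1 to empty the chart.
clear_chart :- retractall(chart(_,_,_)).

parse(C, S1, S) :-                 % 1. reuse a stored constituent
    chart(C, S1, S).
parse(C, S1, S) :-                 % 2. a single word of category C
    c(Word, S1, S),
    word(C, Word).
parse(C, S1, S) :-                 % 3. expand a rule and store the result
    rule(C, Cs),
    parse_list(Cs, S1, S),
    asserta(chart(C, S1, S)).

parse_list([C|Cs], S1, S) :-
    parse(C, S1, S2),
    parse_list(Cs, S2, S).
parse_list([], S, S).
```

After `?- clear_chart, parse(s,0,8).` the chart should contain, among others, `chart(np,3,5)`. That entry is stored while the rule VP -> V NP is tried and is then reused when the parser backtracks to VP -> V NP PP.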

21
Chart parsing - completeness
  • parse(+C,+S1,-S): Parse a constituent of
    category C starting with input string S1 and
    ending up with input string S.
  • parse(C,[Word|S],S) :-
    word(C,Word).
  • parse(C,S1,S) :-
    complete(C,S1),
    !,
    chart(C,S1,S).
  • parse(C,S1,S) :-
    rule(C,Cs),
    parse_list(Cs,S1,S2),
    asserta(chart(C,S1,S2)),
    S2 = S.
  • parse(C,S1,_) :-
    asserta(complete(C,S1)),
    fail.
  • Problems: chart parsers so far
  • Recall constituents found
  • Forget constituents tried but failed
  • E.g. "the cat" is an NP
  • Is not a PP
  • VP -> V NP (PP) (Adv)
  • NP -> D N (PP)
  • PP -> P NP
  • [vp saw the boy yesterday]
  • Every parse can be found in the chart and by
    backtracking. Double work

22
Chart parsing - completeness
  • When to use the chart?
  • Only if it is complete for a particular type of
    constituent in a particular position
  • complete(np,[the,cat,into,the,garden]).
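
To see the completeness flag at work, the four parse/3 clauses can be run on a small example. The sketch below assumes the list-based positions of the first chart parser, the earlier grammar without the null determiner, and dynamic declarations for chart/3 and complete/2; clear_chart with `retractall/1` is an addition for repeatable runs.

```prolog
:- dynamic chart/3, complete/2.

rule(s,  [np, vp]).
rule(np, [d, n]).
rule(vp, [v, np]).
rule(vp, [v, np, pp]).
rule(pp, [p, np]).

word(d, the).   word(p, near).  word(v, sees).
word(n, dog).   word(n, cat).   word(n, elephant).

% Assumed helper: empty both tables before a new sentence.
clear_chart :-
    retractall(chart(_,_,_)),
    retractall(complete(_,_)).

parse(C, [Word|S], S) :-           % single words are not stored
    word(C, Word).
parse(C, S1, S) :-                 % chart is complete for C at S1:
    complete(C, S1),               % use it and nothing else
    !,
    chart(C, S1, S).
parse(C, S1, S) :-                 % otherwise parse and record
    rule(C, Cs),
    parse_list(Cs, S1, S2),
    asserta(chart(C, S1, S2)),
    S2 = S.
parse(C, S1, _) :-                 % all alternatives tried:
    asserta(complete(C, S1)),      % mark C at S1 as complete
    fail.

parse_list([C|Cs], S1, S) :-
    parse(C, S1, S2),
    parse_list(Cs, S2, S).
parse_list([], S, S).
```

After `?- clear_chart, parse(s,[the,dog,sees,the,cat,near,the,elephant],[]).` the fact `complete(np,[the,cat,near,the,elephant])` should be present. It is asserted when the attempt with VP -> V NP fails and all NP alternatives at that position have been exhausted, so the attempt with VP -> V NP PP takes the NP straight from the chart.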

23
Chart parsing - Subsumption
  • What if nodes have arguments such as np(X) and the
    chart contains np(singular)?
  • Should we use the chart?
  • Definitely not!
  • Add a subsumption check
  • An atom A subsumes an atom B if and only if there
    is a substitution θ such that Aθ = B
  • Cf. Part on Learning

24
Chart parsing - Subsumption
  • subsumes_chk(T1,T2) :-
    \+ (numvars(T2), \+ (T1 = T2)).
  • parse(C,S1,S) :-
    complete(C0,S1),
    subsumes_chk(C0,C),
    !,
    C0 = C,
    chart(C,S1,S).
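
The subsumption check can be tested on its own. The sketch below assumes a Prolog system that provides `numbervars/3` (SWI-Prolog does, for example) and uses it in place of the slide's `numvars/1`. The double negation ensures that no variable stays bound afterwards.

```prolog
% subsumes_chk(+General, +Specific): succeeds if some substitution
% applied to General yields Specific. No bindings are left behind,
% because all the work happens inside \+.
subsumes_chk(General, Specific) :-
    \+ ( numbervars(Specific, 0, _),     % freeze the variables of Specific
         \+ (General = Specific) ).      % General must still unify with it
```

Here `subsumes_chk(np(X), np(singular))` succeeds, whereas `subsumes_chk(np(singular), np(X))` fails: a chart that is complete for `np(singular)` says nothing about the more general goal `np(X)`. Systems that follow the recent ISO corrigenda also offer the built-in `subsumes_term/2` for the same test.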

25
Earley's algorithm
  • Consider
  • S -> NP VP
  • s(S1,S) :- np(S1,S2), vp(S2,S).
  • s([the,dog,barked],[]).
  • s([the,dog,barked],[]) :- np([the,dog,barked],S1),
    vp(S1,[]).
  • s([the,dog,barked],[]) :- vp(S1,[]).
  • s([the,dog,barked],[]).
  • Earley (1970) chart parsing
  • Time O(n^3)
  • Correct for null constituents
  • Does not loop for left-recursive rules A -> A B
  • Combines top-down and bottom up
  • Active chart parser
  • Stores complete and work in progress
  • Does not backtrack
  • Pursues different alternatives in parallel.

26
Earley (2)
  • In Earley's notation
  • S -> . NP VP   0 0
  • S -> NP . VP   0 2
  • S -> NP VP .   0 3
  • In Prolog
  • chart(s,[the,dog,barked],[np,vp],[the,dog,barked]).
  • chart(s,[the,dog,barked],[vp],[barked]).
  • chart(s,[the,dog,barked],[],[]).
  • chart(Constituent,WhereConstituentStarts,Goals,WhereGoalsStart).

27
Earley (3)
  • chart(s,[the,dog,barked],[np,vp],[the,dog,barked])
  • With NP -> D N produces
  • chart(np,[the,dog,barked],[d,n],[the,dog,barked]).
  • Gives
  • chart(np,[the,dog,barked],[n],[dog,barked]).
  • chart(np,[the,dog,barked],[],[barked]).
  • chart(s,[the,dog,barked],[vp],[barked])

28
Earley (3)
  • Predictor
  • Looks for rules expanding current goals and
    creates new goals
  • Scanner
  • Accepts a word from the input string and uses it
    to satisfy the current goals
  • Completer
  • Looks at the output of the scanner and determines
    which, if any, larger constituents have been
    completed.
  • parse(+C,+S1,-S): Parse a constituent of type C
    from input string S1, leaving the remainder of the input
    in S.
  • parse(C,S1,S) :-
    clear_chart,
    store(chart(start,S1,[C],S1)),
    process(S1),
    chart(C,S1,[],S).
  • process(+Position): Starting with input string
    Position, work through the Earley parsing process.
  • process([]) :- !.
  • process(Position) :-
    predictor(Position),
    scanner(Position,NewPosition),
    completer(NewPosition),
    process(NewPosition).

29
(No Transcript)
30
Earley - Predictor
31
Earley - Scanner
  • scanner(+Position,-NewPosition):
    Accept a word and use it to satisfy goals.
  • scanner([W|Words],Words) :-
    chart(C,PC,[G|Goals],[W|Words]),
    % for each current goal at current position
    word(G,W),
    % if category of W matches it
    store(chart(C,PC,Goals,Words)),
    % make a new chart entry
    fail.
  • scanner([_|Words],Words).
    % then succeed with no further action.

32
Earley - Completer
33
Earley Implementation (1)
  • Avoiding loops
  • store(chart(A,B,C,D)):
    Make a new chart entry if it does not already
    exist.
  • store(chart(A,B,C,D)) :-
    \+ chart(A,B,C,D),
    assertz(chart(A,B,C,D)).
  • Only modify the chart if the entry does not yet occur!
  • chart(np,[the,dog,and,the,cat],[d,n],[the,dog,and,the,cat])
  • chart(np,[the,dog,and,the,cat],[np,conj,np],[the,dog,and,the,cat])
34
Earley Implementation (2)
  • Null constituents
  • D -> ε
  • rule(d,[]).
  • Add a clause for predict.
  • predict(Goal,Position) :-
    rule(Goal,[]),
    store(chart(Goal,Position,[],Position)),
    complete(Goal,Position,Position),
    fail.
  • Subsumption
  • Modify store to take into account subsumption

35
Earley Implementation (3)
  • Other modifications
  • Represent positions by numbers
  • Use indexing
  • Reduce the use of assert
  • Restriction
  • Earley deduction
  • David H.D. Warren
  • From f(a) :- g(a) and g(X) :- h(X), j(X)
  • Derive g(a) :- h(a), j(a)
  • From h(a) and k(X) :- h(X), m(X)
  • Derive k(a) :- m(a)

36
Earley deduction at work
37
Performance
38
  • Why ?
  • Lack of optimization
  • Small Grammars
  • Subsumption
  • Complexity of parsing
  • Earley: n^3
  • Recursive descent and shift-reduce: k^n
  • Typically n is small (max 30)
  • Worst case
  • If nodes have arguments (features), then NP-complete
  • Buffalo Buffalo Buffalo Buffalo Buffalo
  • (Boston cattle bewilder Boston cattle)
  • Final note
  • Many of the components can be combined