Title: Syntactical analysis and Prolog
1 Syntactical analysis and Prolog
2 Methods for syntactical analysis
- Notation: an alphabet is a finite set of symbols.
- A word over an alphabet A is a finite sequence of symbols from A.
- A* is the set of all words made from symbols of the alphabet A.
- A grammar G is defined by <N, T, P, S>, where S (start) is one fixed element of N denoting the starting symbol, and the other symbols are sets with the following content:
- N: nonterminal symbols,
- T: terminal symbols,
- P: rewriting rules
- such that
- N and T are disjoint.
- A rewriting rule has the form L --> R,
- where L, R are words over the alphabet N ∪ T meeting the following constraint:
- L contains at least 1 nonterminal (condition referred to as ntl).
- In other words, P is a set of pairs of words such that
- P is a subset of (N ∪ T)* N (N ∪ T)* × (N ∪ T)*
3 Language and grammars
The language L(G) generated by a grammar G consists of all words over the alphabet of terminals, i.e. from T*, which can be obtained from the starting symbol S using the rewriting rules P. Let us denote a finite sequence of rule applications by -->*. Then
L(G) = { w ∈ T* : S -->* w }
Chomsky's hierarchy of grammars is based on the format of the rewriting rules.
Notation: let α, β be any words from (N ∪ T)* and γ a nonempty word from (N ∪ T)*.
- Grammars of type 0: the most general case; the rewriting rules only have to meet the ntl condition. The languages of type 0 are exactly those languages which are recognized by Turing machines.
- Grammars of type 1: context-sensitive grammars. Any rewriting rule must have the form α X β --> α γ β, where X is a symbol from N.
- Grammars of type 2: context-free grammars. Any rewriting rule must have the form X --> α, where X is a symbol from N.
- Grammars of type 3: regular grammars. Any rewriting rule must have the form X --> aY or X --> a, where X, Y are symbols from N and a is from T.
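The rule formats of the hierarchy can be checked mechanically. The following is only a sketch and not part of the original slides: it assumes a hypothetical encoding in which nonterminals are written n(Name), terminals t(Name), and a rule is rule(Left, Right) with both sides given as lists. All predicate names are illustrative.

```prolog
:- use_module(library(lists)).   % append/3, memberchk/2

% Assumed encoding (not from the slides):
%   n(Name) is a nonterminal, t(Name) is a terminal,
%   rule(Left, Right) holds both sides as lists of such symbols.

% ntl condition: the left side contains at least one nonterminal.
ntl(rule(Left, _)) :-
    memberchk(n(_), Left).

% Type 3 format: X --> a Y  or  X --> a.
regular(rule([n(_)], [t(_), n(_)])).
regular(rule([n(_)], [t(_)])).

% Type 2 format: X --> alpha, a single nonterminal on the left.
context_free(rule([n(_)], _)).

% Type 1 format: alpha X beta --> alpha gamma beta, gamma nonempty.
context_sensitive(rule(Left, Right)) :-
    append(Alpha, [n(_) | Beta], Left),   % pick X and its context
    append(Alpha, Rest, Right),           % right side starts with alpha
    append(Gamma, Beta, Rest),            % and ends with beta
    Gamma = [_ | _],                      % gamma must be nonempty
    !.

% rule_type(+Rule, -Type): most restrictive format met by one rule.
% Intended to be called with Type unbound.
rule_type(Rule, 3) :- regular(Rule), !.
rule_type(Rule, 2) :- context_free(Rule), !.
rule_type(Rule, 1) :- context_sensitive(Rule), !.
rule_type(Rule, 0) :- ntl(Rule).
```

For example, `rule_type(rule([n(s)], [t(a), n(s), t(b)]), T)` gives `T = 2`. The type of a whole grammar is then the smallest type found among its rules.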
4 Notation: let a, b, c be terminals and X, Y, S nonterminals.
Example 1: Context-free grammar G1 with 2 rules: S --> a S b, S --> ab.
L(G1) = { a^n b^n : n is any natural number, n ≥ 1 }
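The two rules of G1 can be tried out directly as a Prolog DCG. This is a sketch for illustration: the nonterminal S is written as the lowercase atom s, and the non-recursive rule is placed first (an ordering chosen here, not taken from the slide) so that enumeration of words does not descend forever.

```prolog
% G1:  S --> ab,  S --> a S b
s --> [a, b].
s --> [a], s, [b].
```

A query such as `?- phrase(s, [a, a, b, b]).` succeeds, while `?- phrase(s, [a, a, b]).` fails because the numbers of a's and b's differ.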
Example 2: Context-free grammar G2 with this set of rules: S --> a S a, S --> b S b, S --> aa, S --> bb, S --> a, S --> b.
L(G2) includes every word α α^R, where α is any nonempty word over the symbols a, b and α^R is the reverse of α. More generally, L(G2) contains all the nonempty palindromes over the considered alphabet (a palindrome is a word that reads the same from the left and from the right).
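As a quick check of Example 2, the six rules of G2 can be written as a DCG and compared with a direct palindrome test based on reverse/2. This is a sketch; the names pal and same_verdict/1 are illustrative and not taken from the slides.

```prolog
:- use_module(library(lists)).   % reverse/2

% G2, with the nonterminal S written as pal.
pal --> [a], pal, [a].
pal --> [b], pal, [b].
pal --> [a, a].
pal --> [b, b].
pal --> [a].
pal --> [b].

% same_verdict(+Word): the grammar and the direct test agree on Word.
% The direct test accepts nonempty lists equal to their own reverse.
same_verdict(W) :-
    (   phrase(pal, W) -> G = yes ; G = no ),
    (   W = [_ | _], reverse(W, W) -> R = yes ; R = no ),
    G == R.
```

Parsing a given list always terminates here, since every recursive rule consumes a terminal before it calls pal again.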
Example 3: Find the simplest grammar generating just the words a^n b^n c^n, where n is any natural number.
5 Example 4: Find out which words are generated by the grammar G3 with this set of rewriting rules: S --> LR, L --> LaA, L --> LbB, L --> ε, AR --> Ra, BR --> Rb, R --> ε, Aa --> aA, Ab --> bA, Ba --> aB, Bb --> bB, where ε is the empty word. Indicate the type of this grammar.
6 NLP and the means of Prolog
Context-free grammars correspond to rules of Prolog. The rule
sentence --> noun_phrase, verb_phrase.
can be expressed as
sentence(S) :- noun_phrase(N), verb_phrase(V), append(N,V,S).
Definite Clause Grammars (DCG): Prolog has a built-in operator -->, complemented by several useful predicates, e.g. phrase(Non_terminal, List_of_words, Remainder). It succeeds if List_of_words = (List_of_words1 concatenated with Remainder) and List_of_words1 is correctly derived from Non_terminal using the rules of the considered grammar.
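The correspondence between a grammar rule and a Prolog clause can be seen side by side. The sketch below uses a hypothetical two-word lexicon and suffixed predicate names (_append, _dl) that are chosen only for illustration. The middle version passes the remaining input explicitly, which is roughly what a DCG rule is translated to.

```prolog
:- use_module(library(lists)).   % append/3

% 1) Translation with append/3, as on the slide.
sentence_append(S) :-
    noun_phrase_append(N),
    verb_phrase_append(V),
    append(N, V, S).
noun_phrase_append([achilles]).
verb_phrase_append([sleeps]).

% 2) Hand-written version with an input list and a remainder.
sentence_dl(S0, S) :-
    noun_phrase_dl(S0, S1),
    verb_phrase_dl(S1, S).
noun_phrase_dl([achilles | S], S).
verb_phrase_dl([sleeps | S], S).

% 3) The same grammar in DCG notation.
sentence --> noun_phrase, verb_phrase.
noun_phrase --> [achilles].
verb_phrase --> [sleeps].
```

With this lexicon, `?- phrase(sentence, [achilles, sleeps, soundly], R).` answers `R = [soundly]`, and `?- sentence_dl([achilles, sleeps, soundly], R).` gives the same remainder.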
7 DCG in Prolog
Shortcut notation: a --> [z]. a --> [z], a.
--> is a built-in binary operator; [Expression] is a terminal symbol. To generate all the words of the considered grammar, we can apply the predicate phrase/3:
phrase(Non_terminal, String, Remainder_which_could_not_be_analyzed_as_the_non_terminal)
?- phrase(a, Y, []). Y = [z] ; Y = [z,z] ; Y = [z,z,z]
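Since the grammar for a generates infinitely many words, collecting them needs a bound. The sketch below fixes the length of the word before calling phrase/3. The helper words_up_to/2 is a hypothetical name introduced here, and the non-recursive rule a --> [z] is assumed as the base case.

```prolog
a --> [z].
a --> [z], a.

% words_up_to(+MaxLen, -Words): all words of a with length 1..MaxLen.
words_up_to(MaxLen, Words) :-
    findall(W,
            (   between(1, MaxLen, Len),
                length(W, Len),          % fix the length first
                phrase(a, W, [])         % then let the grammar fill it in
            ),
            Words).
```

The third argument of phrase/3 can also be left open: `?- phrase(a, [z, z, y], R).` first answers `R = [z, y]` and, on backtracking, `R = [y]`.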
8 NLP and means of Prolog
- sentence --> noun_phrase, verb_phrase.
- noun_phrase --> proper_noun.
- noun_phrase --> article, adjective, noun.
- noun_phrase --> article, noun.
- verb_phrase --> itransitive_verb.
- verb_phrase --> transitive_verb, noun_phrase.
- article --> [the].
- adjective --> [lazy] ; [rapid].
- proper_noun --> [achilles].
- noun --> [turtle] ; [president] ; [queen] ; [cat] ; [fish].
- itransitive_verb --> [sleeps].
- transitive_verb --> [beats].
- ?- phrase(sentence, Y, []). Y = [achilles, sleeps]
- ?- phrase(sentence, [achilles, beats, the, lazy, turtle], []).
9 How to ensure agreement between the subject and the other parts of the sentence?
- A)
- sentence --> noun_phrase_singular, verb_phrase_singular.
- sentence --> noun_phrase_plural, verb_phrase_plural.
- B)
- sentence --> noun_phrase(Case), verb_phrase(Case).
- DCGs allow for solution B) with a new argument shared by both nonterminals; this solution is elegant, transparent and easily modifiable.
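Solution B) can be sketched with a small lexicon. The words turtles and sleep, and the values singular and plural for the shared argument, are assumptions added here for illustration; they do not appear on the slides.

```prolog
% The shared argument Num forces subject and verb to agree.
sentence --> noun_phrase(Num), verb_phrase(Num).

noun_phrase(Num) --> article, noun(Num).
verb_phrase(Num) --> intransitive_verb(Num).

article --> [the].

noun(singular) --> [turtle].
noun(plural)   --> [turtles].

intransitive_verb(singular) --> [sleeps].
intransitive_verb(plural)   --> [sleep].
```

Here `?- phrase(sentence, [the, turtles, sleep]).` succeeds and `?- phrase(sentence, [the, turtle, sleep]).` fails. Adding another value for the argument needs only new lexicon entries, whereas solution A) would need a further copy of every rule.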
10 Example Achilles
- sentence(s(NP,VP)) --> noun_phrase(NP), verb_phrase(VP).
- noun_phrase(np(N)) --> proper_noun(N).
- noun_phrase(np(Art,Adj,N)) --> article(Art), adjective(Adj), noun(N).
- noun_phrase(np(Art,N)) --> article(Art), noun(N).
- verb_phrase(vp(IV)) --> itransitive_verb(IV).
- verb_phrase(vp(TV,NP)) --> transitive_verb(TV), noun_phrase(NP).
- article(art(the)) --> [the].
- adjective(adj(lazy)) --> [lazy].
- adjective(adj(rapid)) --> [rapid].
- proper_noun(pn(achilles)) --> [achilles].
- noun(n(turtle)) --> [turtle].
- noun(n(president)) --> [president].
- noun(n(cat)) --> [cat].
- itransitive_verb(iv(sleeps)) --> [sleeps].
- transitive_verb(tv(beats)) --> [beats].
- ?- phrase(sentence(X), S, []).
- X = s(np(pn(achilles)), vp(tv(beats), np(art(the), adj(lazy), n(turtle))))
- S = [achilles, beats, the, lazy, turtle]
11 Utilization of arguments in DCG
?- phrase(sentence(X), Sent).
X = s(np(pn(achilles)), vp(iv(sleeps)))
Sent = [achilles, sleeps]
Arguments help to identify selected parts of the sentence.
?- phrase(sentence(T), [achilles, beats, the, lazy, turtle]).
T = s(np(pn(achilles)), vp(tv(beats), np(art(the), adj(lazy), n(turtle))))
It can be complemented by a self-explanatory printing:
-----s-----np-----pn---achilles
      -----vp-----tv-----beats
            -----np-------art------the
                   -------adj-----lazy
                   ---------n----turtle
An argument can also have a numeric value; see Example numbers.
12 Example Achilles - continued
- ?- phrase(sentence(X), [the, rapid, cat, sleeps], []).
- X = s(np(art(the), adj(rapid), n(cat)), vp(iv(sleeps)))
- ?- phrase(sentence(X), [the, rapid, cat, beats, achilles, with, bag], []).
- No (0.00s cpu)
- ?- phrase(sentence(X), [the, rapid, cat, beats, achilles, with, bag], Z).
- X = s(np(art(the), adj(rapid), n(cat)), vp(tv(beats), np(pn(achilles))))
- Z = [with, bag]
- The last argument returns the rest of the input string, which could not be analyzed with respect to the considered nonterminal.
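For experimenting with these queries, the Achilles grammar with parse-tree arguments can be collected into one loadable file. The sketch below keeps the rule and predicate names used on the slides and writes terminals as lists.

```prolog
sentence(s(NP, VP)) --> noun_phrase(NP), verb_phrase(VP).

noun_phrase(np(N))           --> proper_noun(N).
noun_phrase(np(Art, Adj, N)) --> article(Art), adjective(Adj), noun(N).
noun_phrase(np(Art, N))      --> article(Art), noun(N).

verb_phrase(vp(IV))     --> itransitive_verb(IV).
verb_phrase(vp(TV, NP)) --> transitive_verb(TV), noun_phrase(NP).

article(art(the))           --> [the].
adjective(adj(lazy))        --> [lazy].
adjective(adj(rapid))       --> [rapid].
proper_noun(pn(achilles))   --> [achilles].
noun(n(turtle))             --> [turtle].
noun(n(president))          --> [president].
noun(n(cat))                --> [cat].
itransitive_verb(iv(sleeps)) --> [sleeps].
transitive_verb(tv(beats))   --> [beats].
```

With an empty remainder the sentence ending in "with bag" is rejected, because the grammar has no rule for prepositional phrases. With an open remainder the longest analyzable prefix is parsed and `[with, bag]` is handed back.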
13 Example numbers
- numeral(N) --> n_999(N).
- numeral(N) --> n1_9(N1), [thousand], n_999(N2), {N is N1*1000 + N2}.
- n_999(N) --> n_99(N).
- n_999(N) --> n1_9(N1), [hundred], n_99(N2), {N is N1*100 + N2}.
- n_99(N) --> n0_9(N).
- n_99(N) --> n10_19(N).
- n_99(N) --> n20_90(N).
- n_99(N) --> n20_90(N1), n1_9(N2), {N is N1 + N2}.
- n0_9(0) --> [].
- n0_9(N) --> n1_9(N).
- n1_9(1) --> [one]. n1_9(2) --> [two].
- n10_19(10) --> [ten]. n10_19(11) --> [eleven].
- n20_90(20) --> [twenty]. n20_90(30) --> [thirty].
- ?- phrase(numeral(X), Y, []). Answers include X = 1, Y = [one] and X = 2, Y = [two].
14 Example numbers - continued
- ?- phrase(numeral(X), [one, thousand, two, hundred, eleven], []).
- X = 1211
- ?- phrase(numeral(X), [one, thousand, two], []).
- X = 1002
- ?- phrase(numeral(X), [thousand, two, hundred, one], []).
- No
- ?- phrase(numeral(X), [two, thousand, two, hundred, one], []).
- X = 2201
- ?- phrase(numeral(X), [two, thousand, eleven], []).
- X = 2011
- ?- phrase(numeral(X), [two, thousand], []).
- X = 2000
- Caution!
- ?- phrase(numeral(X), [zero], V).
- X = 0, V = [zero] More (0.00s cpu)
- No
- The word zero is not consumed here: the rule n0_9(0) --> [] matches the empty prefix, so X = 0 is reported and the whole input is returned as the remainder.
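The numeral grammar can be loaded as a single file to reproduce these queries. The sketch below contains only the lexical entries that appear on the slides (one, two, ten, eleven, twenty, thirty); the arithmetic goals are wrapped in curly braces so that they run as ordinary Prolog calls inside the DCG rules.

```prolog
numeral(N) --> n_999(N).
numeral(N) --> n1_9(N1), [thousand], n_999(N2), { N is N1*1000 + N2 }.

n_999(N) --> n_99(N).
n_999(N) --> n1_9(N1), [hundred], n_99(N2), { N is N1*100 + N2 }.

n_99(N) --> n0_9(N).
n_99(N) --> n10_19(N).
n_99(N) --> n20_90(N).
n_99(N) --> n20_90(N1), n1_9(N2), { N is N1 + N2 }.

n0_9(0) --> [].            % the empty string stands for 0
n0_9(N) --> n1_9(N).

n1_9(1) --> [one].
n1_9(2) --> [two].

n10_19(10) --> [ten].
n10_19(11) --> [eleven].

n20_90(20) --> [twenty].
n20_90(30) --> [thirty].
```

The rule for the empty string is what makes `[two, thousand]` come out as 2000, and it is also the reason for the behaviour shown under Caution.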
15 p. 140
Interpretation
- The meaning of the proper noun Socrates is the term socrates
- proper_noun(socrates) --> [socrates].
- The meaning of the property mortal is a mapping from terms to literals containing the unary predicate mortal
- property(X=>mortal(X)) --> [mortal].
- The meaning of a proper noun - verb phrase sentence is a clause with empty body and head obtained by applying the meaning of the verb phrase to the meaning of the proper noun
- sentence((L:-true)) --> proper_noun(X), verb_phrase(X=>L).
- ?- phrase(sentence(C), [socrates, is, mortal]).
- C = (mortal(socrates):-true)
16
- A transitive verb is a binary mapping from a pair of terms to literals
- transitive_verb(Y=>X=>likes(X,Y)) --> [likes].
- A proper noun instantiates one of the arguments, returning a unary mapping
- verb_phrase(M) --> transitive_verb(Y=>M), proper_noun(Y).
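How the two mappings combine can be tried with a few clauses. In the sketch below, the operator declaration for => (priority 600, type xfy) and the extra proper noun plato are assumptions made for illustration; the slides do not show either.

```prolog
:- op(600, xfy, '=>').     % assumed declaration: Y=>X=>T reads Y=>(X=>T)

transitive_verb(Y=>X=>likes(X, Y)) --> [likes].

proper_noun(socrates) --> [socrates].
proper_noun(plato)    --> [plato].      % hypothetical extra entry

% The object noun fills Y; what remains is a unary mapping X=>likes(X, Y).
verb_phrase(M) --> transitive_verb(Y=>M), proper_noun(Y).

% The subject noun is then applied to that unary mapping.
sentence((L :- true)) --> proper_noun(X), verb_phrase(X=>L).
```

With these clauses, `?- phrase(sentence(C), [plato, likes, socrates]).` gives `C = (likes(plato, socrates) :- true)`.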
17
- sentence((L:-true)) --> proper_noun(X), verb_phrase(X=>L).
- sentence((H:-B)) --> [every], noun(X=>B), verb_phrase(X=>H). NB. separate determiner rule removed, see later
- verb_phrase(M) --> [is], property(M).
- property(M) --> [a], noun(M).
- property(X=>mortal(X)) --> [mortal].
- proper_noun(socrates) --> [socrates].
- noun(X=>human(X)) --> [human].
- ?- phrase(sentence(C), S).
- C = (human(X):-human(X)), S = [every, human, is, a, human]
- C = (mortal(X):-human(X)), S = [every, human, is, mortal]
- C = (human(socrates):-true), S = [socrates, is, a, human]
- C = (mortal(socrates):-true), S = [socrates, is, mortal]
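The interpretation grammar can be run as written once => is declared as an operator. The declaration in the sketch below (priority 600, type xfy) is an assumption, since the slides do not show it; the grammar rules themselves are taken from the slide.

```prolog
:- op(600, xfy, '=>').     % assumed declaration, not shown on the slides

sentence((L :- true)) --> proper_noun(X), verb_phrase(X=>L).
sentence((H :- B))    --> [every], noun(X=>B), verb_phrase(X=>H).

verb_phrase(M) --> [is], property(M).

property(M) --> [a], noun(M).
property(X=>mortal(X)) --> [mortal].

proper_noun(socrates) --> [socrates].
noun(X=>human(X))     --> [human].
```

In the rule for every, the same variable X is passed to both the noun and the verb phrase, so the head and the body of the resulting clause share their variable, as in `mortal(X) :- human(X)`. The order in which Prolog reports the four answers depends on the order of the clauses.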