Prolog for Linguists Symbolic Systems 139P/239P - PowerPoint PPT Presentation

About This Presentation
Title:

Prolog for Linguists Symbolic Systems 139P/239P

Description:

Prolog for Linguists Symbolic Systems 139P/239P John Dowding Week 5, Novembver 5, 2001 jdowding_at_stanford.edu – PowerPoint PPT presentation

Number of Views:53
Avg rating:3.0/5.0
Slides: 84
Provided by: JohnDo58
Learn more at: http://web.stanford.edu
Category:

less

Transcript and Presenter's Notes

Title: Prolog for Linguists Symbolic Systems 139P/239P


1
Prolog for Linguists Symbolic Systems 139P/239P
  • John Dowding
  • Week 5, Novembver 5, 2001
  • jdowding_at_stanford.edu

2
Office Hours
  • We have reserved 4 workstations in the Unix
    Cluster in Meyer library, fables 1-4
  • 430-530 on Thursday this week
  • Or, contact me and we can make other arrangements

3
Course Schedule
  • Oct. 8
  • Oct. 15
  • Oct. 22
  • Oct. 29
  • Nov. 5 (double up)
  • Nov. 12
  • Nov. 26 (double up)
  • Dec. 3
  • No class on Nov. 19

4
More about cut!
  • Common to distinguish between red cuts and green
    cuts
  • Red cuts change the solutions of a predicate
  • Green cuts do not change the solutions, but
    effect the efficiency
  • Most of the cuts we have used so far are all red
    cuts
  • delete_all(Element, List, -NewList)
  • delete_all(_Element, , ).
  • delete_all(Element, ElementList, NewList) -
  • !,
  • delete_all(Element, List, NewList).
  • delete_all(Element, HeadList, HeadNewList)
    -
  • delete_all(Element, List, NewList).

5
Green cuts
  • Green cuts can be used to avoid unproductive
    backtracking
  • identical(?Term1, ?Term2)
  • identical(Var1, Var2)-
  • var(Var1), var(Var2),
  • !, Var1 Var2.
  • identical(Atomic1,Atomic2)-
  • atomic(Atomic1), atomic(Atomic2),
  • !, Atomic1 Atomic2.
  • identical(Term1, Term2)-
  • compound(Term1),
  • compound(Term2),
  • functor(Term1, Functor, Arity),
  • functor(Term2, Functor, Arity),
  • identical_helper(Arity, Term1, Term2).

6
Input/Output of Terms
  • Input and Output in Prolog takes place on Streams
  • By default, input comes from the keyboard, and
    output goes to the screen.
  • Three special streams
  • user_input
  • user_output
  • user_error
  • read(-Term)
  • write(Term)
  • nl

7
Example Input/Output
  • repeat/0 is a built-in predicate that will always
    resucceed
  • classifing terms
  • classify_term -
  • repeat,
  • write('What term should I classify? '),
  • nl,
  • read(Term),
  • process_term(Term),
  • Term end_of_file.

8
Streams
  • You can create streams with open/3
  • open(FileName, Mode, -Stream)
  • Mode is one of read, write, or append.
  • When finished reading or writing from a Stream,
    it should be closed with close(Stream)
  • There are Stream-versions of other Input/Output
    predicates
  • read(Stream, -Term)
  • write(Stream, Term)
  • nl(Stream)

9
Characters and character I/O
  • Prolog represents characters in two ways
  • Single character atoms a, b, c
  • Character codes
  • Numbers that represent the character in some
    character encoding scheme (like ASCII)
  • By default, the character encoding scheme is
    ASCII, but others are possible for handling
    international character sets.
  • Input and Output predicates for characters follow
    a naming convention
  • If the predicate deals with single character
    atoms, its name ends in _char.
  • If the predicate deals with character codes, its
    name ends in _code.
  • Characters are character codes is traditional
    Edinburgh Prolog, but single character atoms
    were introduced in the ISO Prolog Standard.

10
Special Syntax I
  • Prolog has a special syntax for typing character
    codes
  • 0a is a expression that means the character codc
    that represents the character a in the current
    character encoding scheme.

11
Special Syntax II
  • A sequence of characters enclosed in double quote
    marks is a shorthand for a list containing those
    character codes.
  • abc 97, 98, 99
  • It is possible to change this default behavior to
    one in which uses single character atoms instead
    of character codes, but we wont do that here.

12
Built-in Predicates
  • atom_chars(Atom, CharacterCodes)
  • Converts an Atom to its corresponding list of
    character codes,
  • Or, converts a list of CharacterCodes to an Atom.
  • put_code(Code) and put_code(Stream, Code)
  • Write the character represented by Code
  • get_code(Code) and get_code(Stream, Code)
  • Read a character, and return its corresponding
    Code
  • Checking the status of a Stream
  • at_end_of_file(Stream)
  • at_end_of_line(Stream)

13
Review homework problems last/2
  • last(?Element, ?List)
  • last(Element, Element).
  • last(Element, _HeadTail)-
  • last(Element, Tail).
  • Or
  • last(Element, List)-
  • append(_EverthingElse, Element, List).

14
evenlist/1 and oddlist/1
  • evenlist(?List).
  • evenlist().
  • evenlist(_HeadTail)-
  • oddlist(Tail).
  • oddlist(List)
  • oddlist(_HeadTail)-
  • evenlist(Tail).

15
palindrome/1
  • palindrome1(List).
  • palindrome1().
  • palindrome1(_OneElement).
  • palindrome1(HeadTail)-
  • append(Rest, Head, Tail),
  • palindrome1(Rest).

16
Or, palindrome/1
  • palindrome(List)
  • palindrome(List)-
  • reverse(List, List).
  • reverse(List, -ReversedList)
  • reverse(List, ReversedList)-
  • reverse(List, , ReversedList).
  • reverse(List, Partial, ReversedList)
  • reverse(, Result, Result).
  • reverse(HeadTail, Partial, Result)-
  • reverse(Tail, HeadPartial, Result).

17
subset/2
  • subset(?Set, ?SubSet)
  • subset(, ).
  • subset(ElementRestSet, ElementRestSubSet)-
  • subset(RestSet, RestSubSet).
  • subset(_ElementRestSet, SubSet)-
  • subset(RestSet, SubSet).

18
union/3
  • union(Set1, Set2, -SetUnion)
  • union(, Set2, Set2).
  • union(ElementRestSet1, Set2,
    ElementSetUnion)-
  • union(RestSet1, Set2, SetUnion),
  • \ member(Element, SetUnion),
  • !.
  • union(_ElementRestSet1, Set2, SetUnion)-
  • union(RestSet1, Set2, SetUnion).

19
intersect/3
  • intersect(Set1, Set2, ?Intersection)
  • intersect(, _Set2, ).
  • intersect(ElementRestSet1, Set2,
    ElementIntersection)-
  • member(Element, Set2),
  • !,
  • intersect(RestSet1, Set2, Intersection).
  • intersect(_ElementRestSet1, Set2,
    Intersection)-
  • intersect(RestSet1, Set2, Intersection).

20
split/4
  • split(List, SplitPoint, -Smaller, -Bigger).
  • split(, _SplitPoint, , ).
  • split(HeadTail, SplitPoint, HeadSmaller,
    Bigger)-
  • Head lt SplitPoint,
  • !, green cut
  • split(Tail, SplitPoint, Smaller, Bigger).
  • split(HeadTail, SplitPoint, Smaller,
    HeadBigger)-
  • Head gt SplitPoint,
  • split(Tail, SplitPoint, Smaller, Bigger).

21
merge/3
  • merge(List1, List2, -MergedList)
  • merge(, List2, List2).
  • merge(List1, , List1).
  • merge(Element1List1, Element2List2,
    Element1MergedList)-
  • Element1 lt Element2,
  • !,
  • merge(List1, Element2List2, MergedList).
  • merge(List1, Element2List2, Element2MergedLis
    t)-
  • merge(List1, List2, MergedList).

22
Sorting quicksort/2
  • quicksort(List, -SortedList)
  • quicksort(, ).
  • quicksort(HeadUnsortedList, SortedList)-
  • split(UnsortedList, Head, Smaller, Bigger),
  • quicksort(Smaller, SortedSmaller),
  • quicksort(Bigger, SortedBigger),
  • append(SortedSmaller, HeadSortedBigger,
    SortedList).

23
Sorting mergesort/2
  • mergesort(List, -SortedList).
  • mergesort(, ).
  • mergesort(_One, _One)-
  • !.
  • mergesort(List, SortedList)-
  • break_list_in_half(List, FirstHalf, SecondHalf),
  • mergesort(FirstHalf, SortedFirstHalf),
  • mergesort(SecondHalf, SortedSecondHalf),
  • merge(SortedFirstHalf, SortedSecondHalf,
    SortedList).

24
Merge sort helper predicates
  • break_list_in_half(List, -FirstHalf,
    -SecondHalf)
  • break_list_in_half(List, FirstHalf, SecondHalf)-
  • length(List, L),
  • HalfL is L /2,
  • first_n(List, HalfL, FirstHalf, SecondHalf).
  • first_n(List, N, -FirstN, -Remainder)
  • first_n(HeadRest, L, HeadFront, Back)-
  • L gt 0,
  • !,
  • NextL is L - 1,
  • first_n(Rest, NextL, Front, Back).
  • first_n(Rest, _L, , Rest).

25
Lexigraphic Ordering
  • We can extending sorting predicates to sort all
    Prolog terms using a lexigraphic ordering on
    terms.
  • Defined recursively
  • Variables _at_lt Numbers _at_lt Atoms _at_lt CompoundTerms
  • Var1 _at_lt Var2 if Var1 is older than Var2
  • Atom1 _at_lt Atom2 if Atom1 is alphabetically earlier
    than Atom2.
  • Functor1(Arg11, Arg1N) _at_lt Functor2(Arg21,,
    Arg2M) if
  • Functor1 _at_lt Functor2, or Functor1 Functor2 and
  • N _at_lt M, or Functor1Functor2, NM, and
  • Arg11 _at_lt Arg21, or
  • Arg11 _at_ Arg21 and Arg12 _at_lt Arg22, or

26
Built-in Relations
  • Less-than _at_lt
  • Greater than _at_gt
  • Less than or equal _at_lt
  • Greater than or equal _at_gt
  • Built-in predicate sort/2 sorts Prolog terms on a
    lexigraphic ordering.

27
Tokenizer
  • A token is a sequence of characters that
    constitute a single unit
  • What counts as a token will vary
  • A token for a programming language may be
    different from a token for, say, English.
  • We will start to write a tokenizer for English,
    and build on it in further classes

28
Homework
  • Read section in SICTus Prolog manual on
    Input/Output
  • This material corresponds to Ch. 5 in Clocksin
    and Mellish, but the Prolog manual is more up to
    date and consistent with the ISO Prolog Standard
  • Improve the tokenizer by adding support for
    contractions
  • cant., wont havent, etc.
  • wouldve, shouldve
  • Ill, shell, hell
  • Hes, Shes, (contracted is and contracted has,
    and possessive)
  • Dont hand this in, but hold on to it, youll
    need it later.

29
My tokenizer
  • First, I modified to turn all tokens into lower
    case
  • Then, added support for integer tokens
  • Then, added support for contraction tokens

30
Converting character codes to lower case
  • occurs_in_word(Code, -LowerCaseCode)
  • occurs_in_word(Code, Code)-
  • Code gt 0'a,
  • Code lt 0'z.
  • occurs_in_word(Code, LowerCaseWordCode)-
  • Code gt 0'A,
  • Code lt 0'Z,
  • LowerCaseWordCode is Code (0'a - 0'A).

31
Converting to lower case
  • case for regular word tokens
  • find_one_token(WordCodeCharacterCodes, Token,
    RestCharacterCodes)-
  • occurs_in_word(WordCode, LowerCaseWordCode),
  • find_rest_word_codes(CharacterCodes,
    RestWordCodes, RestCharacterCodes),
  • atom_chars(Token, LowerCaseWordCodeRestWordCode
    s).
  • find_rest_word_codes(CharacterCodes,
    -RestWordCodes, -RestCharacterCodes)
  • find_rest_word_codes(WordCodeCharacterCodes,
    LowerCaseWordCodeRestWordCodes,
    RestCharacterCodes)-
  • occurs_in_word(WordCode, LowerCaseWordCode),
  • !, red cut
  • find_rest_word_codes(CharacterCodes,
    RestWordCodes, RestCharacterCodes).
  • find_rest_word_codes(CharacterCodes, ,
    CharacterCodes).

32
Adding integer tokens
  • case for integer tokens
  • find_one_token(DigitCodeCharacterCodes, Token,
    RestCharacterCodes)-
  • digit(DigitCode),
  • find_rest_digit_codes(CharacterCodes,
    RestDigitCodes, RestCharacterCodes),
  • atom_chars(Token, DigitCodeRestDigitCodes).
  • find_rest_digit_codes(CharacterCodes,
    -RestDigitCodes, -RestCharacterCodes)
  • find_rest_digit_codes(DigitCodeCharacterCodes,
    DigitCodeRestDigitCodes, RestCharacterCodes)-
  • digit(DigitCode),
  • !, red cut
  • find_rest_digit_codes(CharacterCodes,
    RestDigitCodes, RestCharacterCodes).
  • find_rest_digit_codes(CharacterCodes, ,
    CharacterCodes).

33
Digits
  • digit(Code)
  • digit(Code)-
  • Code gt 0'0,
  • Code lt 0'9.

34
Contactions
  • Turned unambiguous contractions into the
    corresponding English word
  • Left ambiguous contractions contracted.
  • Handled 2 cases
  • Simple contractions
  • Hes gt He s
  • Hell gt He will
  • Theyve gt They have
  • Exceptions
  • cant gt can not
  • wont gt will not

35
Simple Contractions
  • simple_contraction("'re", "are").
  • simple_contraction("'m", "am").
  • simple_contraction("'ll", "will").
  • simple_contraction("'ve", "have").
  • simple_contraction("'d", "'d"). had, would
  • simple_contraction("'s", "'s"). is, has,
    possessive
  • simple_contraction("n't", "not").

36
handle_contractions/2
  • handle_contractions(TokenChars,
    -FrontTokenChars, RestTokenChars)
  • handle_contractions("can't", "can", "not")-
  • !.
  • handle_contractions("won't", "will", "not")-
  • !.
  • handle_contractions(FoundCodes, Front,
    NewCodes)-
  • simple_contraction(Contraction, NewCodes),
  • append(Front, Contraction, FoundCodes),
  • Front \ ,
  • !.

37
Modify find_one_token/3
  • case for regular word tokens
  • find_one_token(WordCodeCharacterCodes, Token,
    RestCharacterCodes)-
  • occurs_in_word(WordCode, LowerCaseWordCode),
  • find_rest_word_codes(CharacterCodes,
    RestWordCodes, TempCharacterCodes),
  • handle_contractions(LowerCaseWordCodeRestWordCo
    des, FirstTokenCodes, CodesToAppend),
  • append(CodesToAppend, TempCharacterCodes,
    RestCharacterCodes),
  • atom_chars(Token, FirstTokenCodes).

38
Dynamic predicates and assert
  • Add or remove clauses from a dynamic predicate at
    run time.
  • To specify that a predicate is dynamic, add
  • - dynamic predicate/Arity.
  • to your program.
  • assert/1 adds a new clause
  • retract/1 removes one or more clauses
  • retractall/1 removes all clauses for the
    predicate
  • Cant modify compiled predicates at run time
  • Modifying a program while it is running is
    dangerous

39
assert/1, asserta/1, and assertz/1
  • Asserting facts (most common)
  • assert(Fact)
  • Asserting rules
  • assert( (Head - Body) ).
  • asserta/1 adds the new clause at the front of the
    predicate
  • assertz/1 adds the new clause at the end of the
    predicate
  • assert/1 leaves the order unspecified

40
Built-In retract/1
  • retract(Goal) removes the first clause that
    matches Goal.
  • On REDO, it will remove the next matching clause,
    if any.
  • Retract facts
  • retract(Fact)
  • Retract rules
  • retract( (Head - Body) ).

41
Built-in retractall/1
  • retractall(Head) removes all facts and rules
    whose head matches.
  • Could be implemented with retract/1 as
  • retractall(Head) -
  • retract(Head),
  • fail.
  • retract(Head)-
  • retract( (Head - _Body) ),
  • fail.
  • retractall(_Head).

42
Built-In abolish(Predicate/Arity)
  • abolish(Predicate/Arity) is almost the same as
  • retract(Predicate(Arg1, , ArgN))
  • except that abolish/1 removes all knowledge
    about the predicate, where retractall/1 only
    removes the clauses of the predicate.
  • That is, if a predicate is declared dynamic,
    that is remembered after retractall/1, but not
    after abolish/1.

43
Example Stacks Queues
  • - dynamic stack_element/1.
  • empty_stack -
  • retractall(stack_selement(_Element)).
  • push_on_stack(Element)
  • push_on_stack(Element)-
  • asserta(stack_element(Element)).
  • pop_from_stack(-Element)
  • pop_from_stack(Element)-
  • var(Element),
  • retract(stack_element(Element)),
  • !.

44
Queues
  • dynamic queue_element/1.
  • empty_queue -
  • retractall(queue_element(_Element)).
  • put_on_queue(Element)
  • put_on_queue(Element)-
  • assertz(queue_element(Element)).
  • remove_from_queue(-Element)
  • remove_from_queue(Element)-
  • var(Element),
  • retract(queue_element(Element)),
  • !.

45
Example prime_number.
  • - dynamic known_prime/1.
  • find_primes(Prime)-
  • retractall(known_prime(_Prime)),
  • find_primes(2, Prime).
  • find_primes(Integer, Integer)-
  • \ composite(Integer),
  • assertz(known_prime(Integer)).
  • find_primes(Integer, Prime)-
  • NextInteger is Integer 1,
  • find_primes(NextInteger, Prime).

46
Example prime_number (cont)
  • composite(Integer)
  • composite(Integer)-
  • known_prime(Prime),
  • 0 is Integer mod Prime,
  • !.

47
Aggregation findall/3.
  • findall/3 is a meta-predicate that collects
    values from multiple solutions to a Goal
  • findall(Value, Goal, Values)
  • findall(Child, parent(james, Child), Children)
  • Prolog has other aggregation predicates setof/3
    and bagof/3, but well ignore them for now.

48
findall/3 and assert/1
  • findall/3 and assert/1 both let you preserve
    information across failure.
  • - dynamic solutions/1.
  • findall(Value, Goal, Solutions)-
  • retractall(solutions/1),
  • assert(solutions()),
  • call(Goal),
  • retract(solutions(S)),
  • append(S, Value, NextSolutions),
  • assert(solutions(NextSolutions)),
  • fail.
  • findall(_Value, Goal, Solutions)-
  • solutions(Solutions).

49
Special Syntax III Operators
  • Convenience in writing terms
  • Weve seem them all over already
  • union(ElementRestSet1, Set2,
    ElementSetUnion)-
  • union(RestSet1, Set2, SetUnion),
  • \ member(Element, SetUnion),
  • !.
  • This is just an easier way to write the term
  • -(union(ElementRestSet,Set2,ElementSetUnio
    n),
  • ,(union(RestSet1,Set2,SetUnion),
  • ,(\(member(Element, SetUnion),
  • !)))

50
Operators (cont)
  • Operators can come before their arguments
    (prefix)
  • \, dynamic
  • Or between their arguments (infix)
  • , is lt
  • Of after their arguments (postfix)
  • Prolog doesnt use any of these (yet)
  • The same Operator can be more than one type
  • -

51
Precedence and Associativity
  • Operators also have precedence
  • 5 2 3 (5 2) 3
  • Operators can be associative, or not,
  • Left associative or right associative
  • Explicit parenthesization can override defaults
    for associatiativity and precendence

52
Built-in current_op/3
  • current_op/3 gives the precedence and
    associativity of all current operators.
  • current_op(Precedence, Associativity, Operator)
  • where Precedence in an integer 1-1200
  • and Associativity is of
  • fx or fy for prefix operators
  • xf or yf for postfix operators
  • xfx, xfy, yfx, yfy for infix operators

53
Associativity
  • These atoms fx, fy, xf, yf, xfx, xfy, yfx, yfy
    draw a picture of the associativity of the
    operator
  • The location of the f tells if the operator is
    prefix, infix, or postfix.
  • x means that the argument must be of lower
    precedence
  • y means that the argument must be of equal or
    lower precedence.
  • A y on the left means the operator is left
    associative
  • A y on the right means the operator is right
    associative

54
Operator Examples
Precedence Associativity Operator
1200 xfx -
1150 fx dynamic
1000 xfy ,
900 fy \
700 xfx
700 xfx is
700 xfx lt
500 yfx
500 fx
400 yfx
300 xfx mod
55
Creating new operators
  • Built-in op/3 creates new operators
  • op(Precedence, Associativity, Operator)
  • - op(700, xfx, equals).
  • - op(650, fx, ).
  • - op(650, xf, cents).
  • Dollars equals Cents cents -
  • Cents is 100 Dollars.

56
Consult
  • The operation for reading in a file of Prolog
    clauses and treating them as a program is
    traditional known as consulting the file.
  • We will write a simple consult/1 predicate, and
    build on it over time.
  • We will write similar

57
Consult (cont)
  • consult_file(File)-
  • open(File, read, Stream),
  • consult_stream(Stream),
  • close(Stream).
  • consult_stream(Strea)-
  • repeat,
  • read(Stream, Term),
  • consult_term(Term),
  • at_end_of_stream(Stream),
  • !.

58
Consult (cont)
  • consult_term((- Goal))-
  • !,
  • call(Goal).
  • consult_term((Goal - Body))-
  • !,
  • assertz((Goal - Body)).
  • consult_term(Fact)-
  • assertz(Fact).

59
Parsing, grammars, and language theory
  • The development of Prolog (by Colmeraur at
    Marseilles) was motivated in part by a desire to
    study logic and language.
  • Grammars are formal specifications of languages
  • Prolog takes these specifications and treats them
    as logical theories about language, and as
    computations
  • Grammar ? Proof ? Computation
  • Pereira and Warren, Parsing as Deduction, 1984.
  • Ideas from Prolog/Logic Programming, particularly
    unification, are found in modern Linguistics.

60
Overview of formal language theory
  • An Alphabet ? is a set of symbols
  • A Sentence is a finite sequence of symbols from
    some alphabet
  • A Language L is a (potentially infinite) set of
    sentences from some alphabet
  • A Grammar is a finite description of a language
  • L(G) is the language described by the grammar G
  • We will be interested in several problems
  • Is a given sentence a member of L(G)?
  • What structure does G assign to the sentence?

61
Context-Free Grammars
  • A Context-Free Grammar consists of
  • An alphabet ?
  • A set of nonterminal symbols N (N???)
  • A distinguished start symbol S?N
  • A set of production rules ? of the form
  • A ? B1 BN, where A ?N and B1 BN ? (N ??)

62
CFG example
  • S ? NP VP
  • NP ? DET N
  • VP ? V
  • VP ? V NP
  • DET ? the
  • DET ? a
  • N ?man
  • N ?men
  • N ?woman
  • N ?women

N ? cat N ? cats N? dog N? dogs V? like V?
likes V? sleep V? sleeps
63
Derivations
  • S gt NP VP
  • gt DET N VP
  • gt the N VP
  • gt the man VP
  • gt the man V NP
  • gt the man likes NP
  • gt the man likes DET N
  • gt the man likes the N
  • gt the man likes the woman

64
A Prolog Program for that CFG
  • s(S) - np(NP), vp(VP), append(NP, VP, S).
  • np(NP) - det(DET), n(N), append(DET, N, NP).
  • vp(VP) - v(V), VVP.
  • vp(VP) - v(V), np(NP),
  • append(V, NP, VP).
  • det(the).
  • det(a).
  • n(man).
  • n(men).
  • n(woman).
  • n(women).
  • n(cat).
  • n(cats).
  • n(dog).
  • n(dogs).
  • v(like).
  • v(likes).
  • v(sleep).
  • v(sleeps).

65
Automatically generating that grammar
  • We can define an operator ? to define grammar
    rules,
  • And update consult_file/1 to translate them into
    Prolog clauses automatically
  • These facilities are already built into the
    built-in consult/1, but we will build them
    ourselves

66
Updates to consult_file
  • - op(1200, xfx, '--gt').
  • Add a new clause to consult_term/1
  • consult_term((NT --gt Rule))-
  • !,
  • grammar_rule_body(Rule, Body, Phrase),
  • functor(Goal, NT, 1),
  • arg(1, Goal, Phrase),
  • assertz((Goal - Body))

67
grammar_rule_body/3
  • grammar_rule_body((Rule1, Rule2),(Body1, Body2,
    append(Phrase1, Phrase2, Phrase)), Phrase)-
  • !,
  • grammar_rule_body(Rule1, Body1, Phrase1),
  • grammar_rule_body(Rule2, Body2, Phrase2).
  • grammar_rule_body(List, true, List)-
  • is_list(List),
  • !.
  • grammar_rule_body(NT, Goal, Phrase)-
  • atom(NT),
  • functor(Goal, NT, 1),
  • arg(1, Goal, Phrase).

68
The grammar can now look like this
  • s --gt np, vp.
  • np --gt det, n.
  • vp --gt v.
  • vp --gt v, np.
  • det --gt the.
  • det --gt a.
  • n --gt man.
  • n --gt men.
  • n --gt woman.
  • n --gt women.
  • n --gt dog.
  • n --gt dogs.
  • v --gt like.
  • v --gt likes.
  • v --gt sleep.
  • v --gt sleeps.

69
A better way to do the translation
  • So, we can transform the grammar into a program
    automatically,
  • But, its not a very good program
  • We could try to move the assert/3 around, but
    that would not be very reversible.
  • Instead, use difference lists
  • Use two variables, one to keep track of the start
    of each phrase, and one to keep track of its
    end.

70
Difference lists as indicies
  • Traditional parsing uses indicies to keep track
    of phrase boundaries
  • the man likes the dog
  • 0 1 2 3 4 5
  • the man is an NP spanning 0-2
  • likes the dog is a VP spanning 2-5
  • Well use difference lists to indicate spans,
  • the dog is an NP spanning the,dog-
  • the man is an NP spanning the,man,likes,the,dog
    -likes,the,dog

71
Difference list grammar rule translation
  • s ? np, vp.
  • Translates to
  • s(S0, SN) - np(S0, S1), vp(S1, SN).
  • Instead of one variable, we have two, for the
    start and end points of the phrase,
  • And the phrases are linked so that the end of one
    phrase is the same as the start of the adjacent
    phrase.

72
Ruling out ungrammatical phrases
  • Weve got a little grammar, but it accepts a lot
    of ungrammatical sentences
  • First, lets deal with number agreement between
    subject NP and the verb
  • Conventional to indicate ungrammatical sentences
    with a
  • The man sleeps.
  • The man sleep.

73
We could just add more rules
  • s ? np_sing, vp_sing
  • s ? np_plural, vp_plural.
  • np_sing ? det, n_sing.
  • np_plural ? det, n_plural.
  • vp_sing ?v_sing.
  • vp_plural ? v_plural.
  • vp_sing ? v_sing np_sing.
  • vp_sing ? v_sing np_plural.
  • vp_plural ? v_plural, np_sing.
  • vp_plural ? v_plural, np_plural.
  • det ? the.
  • det ? a.
  • n_sing ? man.
  • n_sing ? woman.
  • n_sing ? cat.
  • n_sing ? dog.
  • n_plural ? men.
  • n_plural ? women.
  • n_plural ? cats.
  • n_plural ?dogs.
  • v_sing ? likes.
  • v_sing ? sleeps.
  • v_plural ? like.
  • v_plural ? likes.

74
Features
  • But, this leads to duplicating a lot of rules
  • What if we want to eliminate other ungrammatical
    sentences
  • Number agreement between determiner and noun
  • Transitive and Intransitive verbs
  • A man sleeps.
  • A men sleep.
  • The men like the cat.
  • The men like.
  • The men sleep.
  • The men sleep the cat.

75
Features
  • We can add features on rules to express these
    constraints concisely.
  • s(Number) ? np(Number), vp(Number).
  • np(Number) ? det(Number), n(Number).
  • vp(Number) ? v(Number, intranitive).
  • vp(Number) ? v(Number, transitive), np(_).
  • det(singular) ? a.
  • det(_) ? the.
  • n(singular) ? man.
  • n(plural) ? men.
  • v(singular, transitive) ? likes.
  • v(singular, intransitive) ? sleeps.

76
Improved Consult
  • consult_term((NT --gt Rule))-
  • !,
  • grammar_rule_body(Rule, Body, Start, End),
  • make_nonterminal(NT, Start, End, Goal),
  • assertz((Goal - Body)).
  • make_nonterminal(NT, Start, End, Goal)-
  • NT .. List,
  • append(List, Start,End, FullList),
  • Goal .. FullList.

77
Improved Consult (cont)
  • grammar_rule_body((Rule1, Rule2),(Body1, Body2),
    Start, End)-
  • !,
  • grammar_rule_body(Rule1, Body1, Start, Next),
  • grammar_rule_body(Rule2, Body2, Next, End).
  • grammar_rule_body(List, true, Start, End)-
  • is_list(List),
  • !,
  • append(List, End, Start).
  • grammar_rule_body(NT, Goal, Start, End)-
  • make_nonterminal(NT, Start, End, Goal).

78
Possible Class Projects
  • Should demonstrate competence in Prolog
    programming
  • Expect problems with solutions in 5-20 pages of
    code range.
  • Talk/email with me about your project

79
Information extraction from a web page
  • Pick a web page with content that might be well
    represented in a Prolog database
  • Sports statistics
  • TV listings
  • Write a program to parse the HTML, extract the
    relevant information, and turn it into a Prolog
    database.

80
Question-Answering
  • Write a program to accept users questions typed
    at the keyboard, parse them, and generate answers
    from a known database.

81
Breadth-first Prolog interpreter
  • Write a breadth-first Prolog interpreter
  • Test it with some simple programs, and compare it
    with depth-first Prolog, and iterative deepening.

82
Compare/contrast with LP language
  • Select another logical programming language
  • Mercury, Eclipse, etc.
  • Test a variety of the kinds of programs we have
    written in this class (generate-and-test, DCGs,
    etc.), and see how they would be written.
  • Only consider this if you are confident that you
    have already demonstrated Prolog competence.

83
What to cover in remaining weeks
  • Weve got 4 more sessions, I have these plans
  • Another session on DCGs
  • A session on iterative deepening
  • Some time on logical foundations/theorem proving
  • Any thoughts on other things yould like to
    cover?
  • More review?
  • Help with class projects?
Write a Comment
User Comments (0)
About PowerShow.com