Title: LING 406 Intro to Computational Linguistics: Context-Free Grammars
1 LING 406: Intro to Computational Linguistics
Context-Free Grammars
- Richard Sproat
- URL: http://catarina.ai.uiuc.edu/L406_08/
2 This Lecture
- Syntactic constituency
- Context-free grammars
- Chomsky Normal Form grammars
- Other issues
3 Context-free grammars
- Constituent structures in syntax
- Context-Free Grammars
- Power of CFGs compared to Regular Grammars
- Various applications of CFGs
- Limitations of CFGs for natural language description
- A note on dependency representations
4 Constituent structure in syntax
- Consider a sentence:
- The old woman gave a big red book to the small boy.
- Various arguments can be given to show that a given word bonds more tightly to some words than it does to others.
- These units of bonding are called constituents.
5 Arguments based on movement
- Passivization
- The old woman gave a big red book to the small boy.
- A big red book was given by the old woman to the small boy.
- *Book was given by the old woman a big red to the small boy.
- Topicalization
- The old woman gave a big red book to the small boy.
- To the small boy, the old woman gave a big red book.
6 Arguments based on movement
- wh-movement
- The old woman gave a big red book to the small boy.
- Who did the old woman give a big red book to?
- The old woman gave a big red book to the small boy.
- To whom did the old woman give a big red book?
- Pseudoclefts
- The old woman gave a big red book to the small boy.
- What the old woman did was give a big red book to the small boy.
7 Arguments based on anaphora
- In particular, one-anaphora
- The old woman gave a big red book to the small boy, and a small green one to John.
- The old woman gave a big red book to the small boy, and a small one to John.
8 Summary
- We have some evidence that the following strings of words behave as units, either for the purposes of (some) movement operations or for the purposes of anaphora:
- a big red book
- red book
- the old woman
- to the small boy
- the small boy
- gave a big red book to the small boy
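The units listed above can be written down as a labeled bracketing. The sketch below is only an illustration: the category labels (S, NP, VP, PP, N') and the exact shape of the tree are assumptions, not the tree from the slides. It encodes one plausible analysis of the example sentence as nested tuples and lists every phrase the tree contains.

```python
# Hypothetical analysis of the example sentence; labels and shape are assumptions.
TREE = (
    "S",
    ("NP", "the", "old", "woman"),
    ("VP",
     "gave",
     ("NP", "a", ("N'", "big", ("N'", "red", "book"))),
     ("PP", "to", ("NP", "the", "small", "boy"))),
)

def leaves(node):
    """Return the words dominated by a node, left to right."""
    if isinstance(node, str):
        return [node]
    words = []
    for child in node[1:]:
        words.extend(leaves(child))
    return words

def constituents(node):
    """Yield (label, word string) for every phrase in the tree."""
    if isinstance(node, str):
        return
    yield node[0], " ".join(leaves(node))
    for child in node[1:]:
        yield from constituents(child)

units = {text for _, text in constituents(TREE)}
```

Under this analysis, `units` contains every string in the summary list, including "red book" and "to the small boy", while a non-unit such as "gave a big" is absent.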
9 Phrase-structure trees
10 Labeled Tree
11 Uses of trees in syntactic theory
- Syntactic theories in the Extended Standard Theory, Government and Binding, and Principles and Parameters tradition have tended to propose ever more abstract tree structures, encoding more and more grammatical information using arboreal representations.
- Others, such as unification-based theories like Lexical Functional Grammar, have tended to keep trees simple and encode abstract information in other ways.
12 Free word-order languages
- A perhaps not entirely fair example from Catullus
Literally: To whom do I dedicate this charming new little book, with dry just pumice polished? Cornelius, to you, because you were accustomed, my to be something consider trifles.
The point: languages can show linkages between words in other ways than glomming them together into constituents. One common way is with inflectional morphology such as case, number and gender agreement in Latin.
13 Context-free grammars
14 Grammar for example above
15 Convenient shorthand
16 What CFGs can do that regular grammars can't
17 Chomsky-normal form
18 Constructing CNF grammars
19 Constructing CNF grammars
20 Why CNF grammars?
- CNF grammars are useful in some parsing
algorithms where it is more efficient to limit
the possible productions to being either unary or
binary.
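One parsing algorithm that relies on this restriction is CKY. The sketch below is a minimal recognizer, assuming a toy CNF grammar invented for illustration (it is not the grammar from the slides). Because every non-lexical rule has exactly two symbols on its right-hand side, each span only has to be split at a single point.

```python
from itertools import product

# Hypothetical toy grammar in CNF (an assumption, not the lecture's grammar).
BINARY = {
    ("NP", "VP"): {"S"},
    ("Det", "N"): {"NP"},
    ("V", "NP"): {"VP"},
}
LEXICAL = {
    "the": {"Det"},
    "a": {"Det"},
    "woman": {"N"},
    "book": {"N"},
    "gave": {"V"},
}

def cky_recognize(words, start="S"):
    """Return True if the toy grammar derives the word sequence."""
    n = len(words)
    if n == 0:
        return False
    # chart[i][j] holds the nonterminals that derive words[i:j].
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        chart[i][i + 1] = set(LEXICAL.get(word, ()))
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            # Binary rules mean one split point k per combination.
            for k in range(i + 1, j):
                for left, right in product(chart[i][k], chart[k][j]):
                    chart[i][j] |= BINARY.get((left, right), set())
    return start in chart[0][n]
```

For example, `cky_recognize("the woman gave a book".split())` succeeds, while the incomplete string "the woman gave" is rejected.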
21 Applications of CFGs outside Natural Language
- Three application areas:
- Programming language syntax
- Document structure
- Two-dimensional grammars: mathematical formulae, Chinese characters, ...
22 Programming language syntax
23 Document structure
<html>
<head>
<title>LING 406 Introduction to Computational Linguistics</title>
</head>
<center>
<p>
<!-- <font color=red size=3>Under Development</font> -->
<p>
<h1>LING 406 Introduction to Computational Linguistics</h1>
<h2>Richard Sproat</h2>
<h3>Spring 2006</h3>
<h3>MW 4-5:20, Siebel Center 1131</h3>
<p>
Office Hours Wednesdays 10-12, Beckman 2057
</center>
<center>
<a href="overview">Overview</a>&nbsp;&nbsp;
<a href="syllabus">Syllabus</a>&nbsp;&nbsp;
<a href="requirements">Prerequisites and Requirements</a>&nbsp;&nbsp;
<a href="texts">Texts</a>&nbsp;&nbsp;
<a href="homeworks">Homeworks, Exams, Grading</a>
<a href="email">Emailing me</a>
<p>
</center>
HTML (and other SGML/XML-based schemata) are
formally described by Document Type Definitions
(DTDs), which at their core are CFGs.
24 CFG Limitations
- Pure CFGs are rarely used in NLP. One reason is that the grammars can blow up once you start trying to add features. Consider subject-verb agreement in English:
- The boy likes potatoes
- *The boy like potatoes
- *The boys likes potatoes
- The boys like potatoes
- To capture the dependency using purely context-free mechanisms requires expanding the rules:
- S → NP-sg VP-sg
- S → NP-pl VP-pl
- VP-sg → V-sg NP
- VP-pl → V-pl NP
- This might not look so bad, but things can quickly get out of hand.
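To see how quickly the rule set grows, the expansion can be mechanized. The sketch below is an illustration under stated assumptions: the `{agr}` placeholder notation and the `expand` helper are hypothetical, and the person feature is added only to show the growth. Each rule schema is copied once per combination of feature values.

```python
from itertools import product

# Hypothetical rule schemas: "{agr}" marks a category that must agree.
SCHEMAS = [
    ("S", ["NP-{agr}", "VP-{agr}"]),
    ("VP-{agr}", ["V-{agr}", "NP"]),
]

def expand(schemas, features):
    """Copy every schema once per combination of feature values."""
    names = list(features)
    rules = []
    for combo in product(*(features[name] for name in names)):
        agr = "-".join(combo)
        for lhs, rhs in schemas:
            rules.append((lhs.format(agr=agr),
                          [symbol.format(agr=agr) for symbol in rhs]))
    return rules

# Number alone reproduces the four rules listed above.
number_only = expand(SCHEMAS, {"number": ["sg", "pl"]})

# Adding a three-valued person feature (an assumption for illustration)
# multiplies the rule count again.
number_person = expand(SCHEMAS, {"number": ["sg", "pl"],
                                 "person": ["1", "2", "3"]})
```

Two schemas become four rules with number alone and twelve once person is added. Every further feature multiplies the count by its number of values.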
25 Ambiguity
- Some (possibly not all genuine) newspaper headlines (from Jason Eisner):
- Iraqi Head Seeks Arms
- Juvenile Court to Try Shooting Defendant
- Teacher Strikes Idle Kids
- Stolen Painting Found by Tree
- Kids Make Nutritious Snacks
- Local High School Dropouts Cut in Half
- Obesity Study Looks for Larger Test Group
- British Left Waffles on Falkland Islands
- Red Tape Holds Up New Bridges
- Man Struck by Lightning Faces Battery Charges
- Clinton Wins on Budget, but More Lies Ahead
- Hospitals Are Sued by 7 Foot Doctors
26 Ambiguity
27 Ambiguity
- Time flies like an arrow.
- The manner of Time's flying is like that of an arrow (verb is flies)
- The time flies really like a good arrow (verb is like)
- You should time flies the same way you would time an arrow (verb is time)
28 Ambiguity
- I saw the man on the hill with a telescope
- I saw the man on the hill with a telescope
- I saw the man on the hill with a telescope
- I saw the man on the hill with his dog.
- I saw the man on the hill with my own eyes.
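The attachment ambiguity in these sentences can be counted mechanically. The sketch below assumes a small hypothetical CNF grammar, invented for illustration, in which a PP may attach either to a noun phrase or to a verb phrase. A CKY-style chart then stores the number of distinct trees for each span and category.

```python
from collections import defaultdict

# Hypothetical toy grammar (an assumption): PPs attach to NP or to VP.
BINARY = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),
    ("NP", "NP", "PP"),
    ("NP", "Det", "N"),
    ("PP", "P", "NP"),
]
LEXICAL = {
    "I": ["NP"],
    "saw": ["V"],
    "the": ["Det"],
    "a": ["Det"],
    "man": ["N"],
    "hill": ["N"],
    "telescope": ["N"],
    "on": ["P"],
    "with": ["P"],
}

def count_parses(words, start="S"):
    """Count the distinct parse trees the toy grammar assigns to words."""
    n = len(words)
    chart = defaultdict(int)  # (i, j, category) -> number of trees
    for i, word in enumerate(words):
        for category in LEXICAL.get(word, ()):
            chart[i, i + 1, category] += 1
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for parent, left, right in BINARY:
                    chart[i, j, parent] += chart[i, k, left] * chart[k, j, right]
    return chart[0, n, start]

sentence = "I saw the man on the hill with a telescope".split()
n_parses = count_parses(sentence)
```

Under this toy grammar the sentence receives five parses, and each additional prepositional phrase increases the count further.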
29 A note on dependency-based formalisms
- Constituent structure is not the only way to represent the close bonding of words in a sentence.
- Another (much older!) approach is to use dependency representations, where some words (heads) have other words as dependents.
- Such representations are much more popular in Europe than they are in the US.
30 A dependency tree
31 Dependency representations: pros and cons
- Pro: Dependency representations are very convenient for semantic interpretation.
- Con: The dependents of a head are all on a par.
- So there's no way to represent internal structure of the kind we saw with
- the big red book