Applications of Sequence Learning (CMPT 825), Mashaal A. Memon

1
Applications of Sequence LearningCMPT 825
Mashaal A. Memon
2
What We Know of Sequence Learning
  • Part Of Speech (POS) Tagging is a sequence
    learning problem.
  • Three approaches to solving the problem:
  • Noisy-Channel
  • Classification
  • Rule-Based

3
What We Know About POS Tagging
  • A part of speech (POS) explains not what the
    word is, but how it is used.
  • Problem: Which POS does each word represent?
  • Tags: POS tags (e.g. NN = noun, VB = verb, etc.)
  • Training: word sequences with corresponding POS
    tags.
  • Input: word sequences.

4
What We Know About POS Tagging Continued
  • Examples

Anoop is a great professor .
NNP VBZ DT JJ NN .
I am kissing butt right now .
PRP VBP VBG NN RB RB .
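The supervised setup above can be sketched in code. The following is a minimal, illustrative baseline (not from the slides): learn each word's most frequent tag from tagged training sentences, then tag new word sequences, falling back to a default tag (here NN, an assumption) for unseen words.

```python
# Unigram "most frequent tag" baseline for POS tagging.
# All names and data here are illustrative, not from the slides.
from collections import Counter, defaultdict

def train(tagged_sentences):
    """Count how often each word receives each tag in training data."""
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        for word, tag in sentence:
            counts[word][tag] += 1
    # For each word, keep only its most frequent tag.
    return {w: c.most_common(1)[0][0] for w, c in counts.items()}

def tag(words, model, default="NN"):
    """Tag each word independently; unseen words get the default tag."""
    return [model.get(w, default) for w in words]

training = [[("Anoop", "NNP"), ("is", "VBZ"), ("a", "DT"),
             ("great", "JJ"), ("professor", "NN"), (".", ".")]]
model = train(training)
print(tag(["Anoop", "is", "great", "."], model))  # ['NNP', 'VBZ', 'JJ', '.']
```

Real taggers condition on context (e.g. the noisy-channel and classification approaches above); this baseline ignores it entirely, which is exactly why sequence learning helps.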
5
What Is My Point?
  • Other interesting and important problems can be
    represented as tagging problems.
  • The same three approaches can be used.
  • Four such applications will be briefly introduced:
  • Chunking
  • Named Entity Recognition
  • Cascaded Chunking
  • Word Segmentation

6
(1) Chunking
  • A chunk is a syntactically correlated part of a
    language (e.g. a noun phrase or verb phrase).
  • Problem: Which type of chunk does each word or
    group of words belong to?
  • Note: chunks of the same type can sometimes sit
    directly adjacent to each other.

7
(1) Chunking Continued
Noun-Phrase (NP) Chunking
  • Only look for noun phrase chunks.
  • Tags: B = beginning of noun phrase
  • I = inside noun phrase
  • O = other
  • Training: word sequences with corresponding POS
    and NP tags.
  • Input: word sequences and POS tags.

8
(1) Chunking Continued
Noun-Phrase (NP) Chunking
  • Examples

The student talked to Anoop .
B I O O B O
The guy he talked to was smelly .
B I B O O O O O
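The B/I/O scheme is easy to decode back into chunks. Here is a small illustrative helper (an assumption, not from the slides) that recovers the noun phrases from a tagged sequence:

```python
# Recover noun-phrase chunks from a B/I/O tag sequence.
def np_chunks(words, tags):
    chunks, current = [], []
    for word, tag in zip(words, tags):
        if tag == "B":                 # a new chunk starts here
            if current:
                chunks.append(" ".join(current))
            current = [word]
        elif tag == "I" and current:   # continue the open chunk
            current.append(word)
        else:                          # "O": close any open chunk
            if current:
                chunks.append(" ".join(current))
            current = []
    if current:
        chunks.append(" ".join(current))
    return chunks

words = "The guy he talked to was smelly .".split()
tags = ["B", "I", "B", "O", "O", "O", "O", "O"]
print(np_chunks(words, tags))  # ['The guy', 'he']
```

Note how the B tag on "he" is what separates the two adjacent noun phrases; with only I/O tags they would merge into one.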
9
(1) Chunking Continued
General Chunking
  • Look for other syntactical constructs as well as
    noun phrases.
  • Tags: B or I prefix on each chunk type
  • Chunk types: NP = noun phrase, VP = verb
    phrase, PP = prepositional phrase; O = other
  • Training: word sequences with corresponding POS
    and chunk tags.
  • Input: word sequences and POS tags.

10
(1) Chunking Continued
General Chunking
  • Examples

Anoop should give me an A .
B-NP B-VP I-VP B-NP B-NP I-NP O
His presentation is boring me to death .
B-NP I-NP B-VP I-VP B-NP B-PP B-NP O
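Decoding typed B-X/I-X/O tags works the same way as plain B/I/O, with the chunk type carried along. This illustrative helper (an assumption, not from the slides) works for general chunking, and the same scheme covers the NER tags on the next slides (B-PER, I-LOC, etc.):

```python
# Decode typed chunks from B-X / I-X / O tags.
def typed_chunks(words, tags):
    chunks, current, ctype = [], [], None
    for word, tag in zip(words, tags):
        if tag.startswith("B-"):                   # new chunk of type X
            if current:
                chunks.append((ctype, " ".join(current)))
            current, ctype = [word], tag[2:]
        elif tag.startswith("I-") and current and tag[2:] == ctype:
            current.append(word)                   # continue same-type chunk
        else:                                      # "O" or inconsistent I- tag
            if current:
                chunks.append((ctype, " ".join(current)))
            current, ctype = [], None
    if current:
        chunks.append((ctype, " ".join(current)))
    return chunks

words = "Anoop should give me an A .".split()
tags = ["B-NP", "B-VP", "I-VP", "B-NP", "B-NP", "I-NP", "O"]
print(typed_chunks(words, tags))
# [('NP', 'Anoop'), ('VP', 'should give'), ('NP', 'me'), ('NP', 'an A')]
```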
11
(2) Named Entity Recognition
  • A named entity is a phrase that contains the
    name of a person, organization, or location.
  • Problem: Does a word or group of words represent
    a named entity or not?
  • Tags: B or I prefix on each NE type
  • NE types: PER = person, ORG = organization,
    LOC = location; O = other
  • Training: word sequences with corresponding POS
    and NE tags. Sometimes lists of NE data are used
    (cheating!).
  • Input: word sequences with POS tags.

12
(2) Named Entity Recognition Continued
  • Examples

The United States of America
O B-LOC I-LOC I-LOC I-LOC
has an intelligent leader in D.C.
O O O O O B-LOC
, Dick Cheney of Halliburton .
O B-PER I-PER O B-ORG O
13
(3) Cascaded Chunking
  • Cascaded chunking gives us back the parse tree
    of the sentence.
  • Think of it as the chunker taking the initial
    input and then repeatedly running on its OWN
    output until no more changes are made.
  • Difference: chunks may contain other chunks as
    well as POS tags.

14
(3) Cascaded Chunking Continued
CHUNKER(W = w1..wn, T = t1..tn) → T' = t'1..t'n

CASCADE(W = w1..wn, T = t1..tn):
    OutputBefore ← Ø
    OutputAfter ← CHUNKER(W, T)
    while (OutputBefore ≠ OutputAfter) do
        OutputBefore ← OutputAfter
        OutputAfter ← CHUNKER(W, OutputBefore)  /* output of current iteration */
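The fixed-point loop above can be sketched directly. The chunker below is a deliberately tiny stand-in (an assumption, only to make the loop's behaviour observable): each pass merges one adjacent DT NN pair into a single NP, so repeated passes build up larger units until nothing changes.

```python
# CASCADE: re-run the chunker on its own output until it stabilizes.
def cascade(words, tags, chunker):
    before, after = None, chunker(words, tags)
    while before != after:
        before = after
        after = chunker(words, before)   # feed the chunker its own output
    return after

# Toy chunker: merge the first "DT NN" pair into one "NP" per pass.
def toy_chunker(words, tags):
    out = list(tags)
    for i in range(len(out) - 1):
        if out[i] == "DT" and out[i + 1] == "NN":
            out[i:i + 2] = ["NP"]
            break
    return out

print(cascade("the dog saw the cat".split(),
              ["DT", "NN", "VBD", "DT", "NN"], toy_chunker))
# ['NP', 'VBD', 'NP']
```

A real cascade would merge chunks into higher-level chunks as well (NP and VP into S, and so on), which is exactly what makes the output a parse tree.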
15
(3) Cascaded Chunking Continued
  • Example

The effort to establish such a conclusion is unnecessary .
DT NN TO VB PDT DT NN VBZ JJ .

[Parse-tree diagram: successive chunker passes group the tags into
NP, IP, VP, and AP chunks, then into DP and CP chunks, and so on,
until a single S chunk spans the whole sentence.]
  • Chunking is an intermediate step to a full parse.

16
(4) Word Segmentation
  • When written, some languages, like Chinese,
    don't have obvious word boundaries.
  • Problem: Which characters group together to
    form a single word?
  • Tags: B = beginning of word
  • I = in word
  • Training: character sequences with corresponding
    WS tags.
  • Input: character sequences.

17
(4) Word Segmentation Continued
  • Example

[16 Chinese characters; lost in this transcript]
B I I B I B I B I B B B I B B I
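Turning per-character B/I tags back into words is a one-pass decode. In this illustrative sketch (an assumption, not from the slides), Latin letters stand in for the Chinese characters, which did not survive the transcript:

```python
# Rebuild words from per-character B/I word-segmentation tags.
def segment(chars, tags):
    words = []
    for ch, tag in zip(chars, tags):
        if tag == "B" or not words:   # "B" starts a new word
            words.append(ch)
        else:                          # "I" extends the current word
            words[-1] += ch
    return words

print(segment(list("abcdef"), ["B", "I", "B", "I", "I", "B"]))
# ['ab', 'cde', 'f']
```

Note there is no "O" tag here: unlike chunking, every character belongs to some word, so B/I alone suffice.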
18
Conclusion
  • The problems all differ in their goals, but
    once cast in the same representation, they can
    all be solved with the same approaches.
  • We all LOVE sequence learning!

THE END
19
Questions?!
20
References
  • Manning, C. D. and H. Schütze. Foundations of
    Statistical Natural Language Processing. MIT
    Press, 1999.
  • CoNLL 2000 shared task on chunking. Website:
    http://cnts.uia.ac.be/conll2000/chunking/
  • CoNLL 2003 shared task on NER. Website:
    http://cnts.uia.ac.be/conll2003/ner/
  • CoNLL 2002 shared task on NER. Website:
    http://cnts.uia.ac.be/conll2002/ner/
  • Abney, S. Parsing by Chunks. Journal of
    Psycholinguistic Research, 18(1), 1989.
  • Chinese Word Segmentation Bakeoff, 2003. Website:
    http://www.sighan.org/bakeoff2003