74.419 Artificial Intelligence - PowerPoint PPT Presentation

Slides: 83
Provided by: christe
Transcript and Presenter's Notes

1
74.419 Artificial Intelligence
  • Speech and Natural Language Processing

2
Speech and Natural Language Processing
  • Communication
  • Natural Language
  • Syntax
  • Semantics
  • Pragmatics
  • Speech

3
Evolution of Human Language
  • communication for "work"
  • social interaction
  • basis of cognition and thinking
  • (Sapir-Whorf hypothesis)

4
Communication
"Communication is the intentional exchange of
information brought about by the production and
perception of signs drawn from a shared system of
conventional signs." (Russell & Norvig, p. 651)
5
Natural Language - General
  • Natural Language is characterized by
  • a common or shared set of signs: alphabet,
    lexicon
  • a systematic procedure to produce combinations of
    signs: syntax
  • a shared meaning of signs and combinations of
    signs: (constructive) semantics

6
Speech and Natural Language
  • Speech Recognition
  • acoustic signal as input
  • conversion into phonemes and written words
  • Natural Language Processing
  • written text as input: sentences (or
    'utterances')
  • syntactic analysis: parsing, grammar
  • semantic analysis: "meaning", semantic
    representation
  • pragmatics
  • dialogue, discourse
  • Spoken Language Processing
  • transcribed utterances
  • Phenomena of spontaneous speech

7
Speech Recognition
Acoustic signal / sound wave
→ Signal Processing / Analysis (filtering, FFT, spectral analysis)
→ Frequency Spectrum, Features (Phonemes, Context)
→ Phoneme Recognition (HMM, Neural Networks)
→ Phonemes
→ Grammar or Statistics
→ Phoneme Sequences / Words
→ Grammar or Statistics for likely word sequences
→ Word Sequence / Sentence
8
Areas in Natural Language Processing
  • Morphology (word stem + ending)
  • Syntax, Grammar & Parsing (syntactic description &
    analysis)
  • Semantics & Pragmatics (meaning, constructive,
    context-dependent, references, ambiguity)
  • Pragmatic Theory of Language: Intentions,
    Metaphor (Communication as Action)
  • Discourse / Dialogue / Text
  • Spoken Language Understanding
  • Language Learning

9
NLP Syntax Analysis - Processes
[Figure: processing components Morphological Analyzer, Part-of-Speech (POS) Tagging, and Parser, drawing on Linguistic Background Knowledge (Lexicon, Grammar Rules)]
Example: "the" → the → determiner (Det) → rule NP → Det Noun → NP recognized → parse tree
10
NLP - Syntactic Analysis
[Figure: processing components Morphological Analyzer, Part-of-Speech (POS) Tagging, and Parser, drawing on the Lexicon and the Grammar Rules]
Example: "eats" → eat + s → verb, 3rd sing (Verb) → rule VP → Verb Noun → VP recognized → parse tree
11
Morphology
  • A morphological analyzer determines (at least)
  • the stem + ending of a word,
  • and usually delivers related information, like
  • the word class,
  • the number,
  • the person and
  • the case of the word.
  • The morphology can be part of the lexicon or
    implemented as a single component, for example as
    a rule-based system.
  • eats → eat + s: verb, singular, 3rd pers.
  • dog → dog: noun, singular
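
A rule-based analyzer of the kind described on this slide could be sketched as follows. This is only an illustration under simple assumptions: the stem lexicon `STEMS`, the function name `analyze`, and the single "-s" suffix rule are hypothetical and not taken from the course material.

```python
# Hypothetical stem lexicon: stem -> word class (illustrative entries only).
STEMS = {"eat": "verb", "dog": "noun", "bone": "noun"}

def analyze(word):
    """Split a word into stem + ending and return simple features."""
    word = word.lower()
    if word in STEMS:
        pos = STEMS[word]
        # A bare noun is singular; a bare verb is the non-3rd-person form.
        features = ["singular"] if pos == "noun" else ["non-3rd person"]
        return {"stem": word, "ending": "", "pos": pos, "features": features}
    if word.endswith("s") and word[:-1] in STEMS:
        stem = word[:-1]
        pos = STEMS[stem]
        # The -s ending marks 3rd person singular on verbs, plural on nouns.
        features = ["singular", "3rd person"] if pos == "verb" else ["plural"]
        return {"stem": stem, "ending": "s", "pos": pos, "features": features}
    return None  # unknown word

print(analyze("eats"))
print(analyze("dog"))
```

A realistic analyzer would need many more rules (irregular forms, spelling changes), which is one reason morphology is often folded into the lexicon instead.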


12
Lexicon
  • The Lexicon contains information on words, as
  • inflected forms (e.g. goes, eats) or
  • word-stems (e.g. go, eat).
  • The Lexicon usually assigns a syntactic category,
  • the word class or Part-of-Speech category
  • Sometimes also
  • further syntactic information (see Morphology)
  • semantic information (e.g. agent)
  • syntactic-semantic information (e.g. verb
    complements: 'give' requires a direct
    object).

13
Lexicon
  • Example contents
  • eats → verb, singular, 3rd person (-s)
  • can have direct object
  • (verb subcategorization)
  • dog → dog, noun, singular
  • animal
  • (semantic annotation)

14
POS (Part-of-Speech) Tagging
  • POS Tagging determines the word class or
    part-of-speech category (basic syntactic
    categories) of single words or word-stems.
  • The: det (determiner)
  • dog: noun
  • eats: verb (3rd person singular)
  • the: det
  • bone: noun
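
The simplest possible tagger is a lexicon lookup, sketched below for the example sentence. The dictionary `LEXICON` and the function `pos_tag` are hypothetical names for this illustration; practical taggers also use context to choose between several possible categories of a word.

```python
# Hypothetical lexicon: word form -> part-of-speech category.
LEXICON = {
    "the": "det",
    "dog": "noun",
    "eats": "verb (3rd person singular)",
    "bone": "noun",
}

def pos_tag(sentence):
    """Tag each word by dictionary lookup; unknown words get 'unknown'."""
    return [(word, LEXICON.get(word.lower(), "unknown")) for word in sentence.split()]

for word, tag in pos_tag("The dog eats the bone"):
    print(f"{word}\t{tag}")
```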

15
Open Word Class: Nouns
  • Nouns denote objects, concepts, ...
  • Proper Nouns
  • Names for specific individual objects, entities
  • e.g. the Eiffel Tower, Dr. Kemke
  • Common Nouns
  • Names for categories or classes or abstracts
  • e.g. fruit, banana, table, freedom, sleep, ...
  • Count Nouns
  • enumerable entities, e.g. two bananas
  • Mass Nouns
  • not countable items, e.g. water, salt, freedom

16
Open Word Class: Verbs
  • Verbs
  • denote actions, processes, states
  • e.g. smoke, dream, rest, run
  • Several morphological forms, e.g.
  • non-3rd person - eat
  • 3rd person - eats
  • progressive / present participle / gerundive - eating
  • past participle - eaten
  • Auxiliaries, e.g. be, as sub-class of verbs

17
Open Word Class: Adjectives
  • Adjectives
  • denote qualities or properties of objects, e.g.
    heavy, blue, content
  • most languages have concepts for
  • colour - white, green, ...
  • age - young, old, ...
  • value - good, bad, ...
  • not all languages have adjectives as separate
    class

18
Open Word Class: Adverbs
  • Adverbs
  • denote modifications of actions (verbs),
    qualities (adjectives)
  • e.g. walk slowly, heavily drunk
  • Directional or Locational Adverbs
  • Specify direction or location
  • e.g. go home, stay here
  • Degree Adverbs
  • Specify extent of process, action, property
  • e.g. extremely slow, very modest

19
Open Word Class: Adverbs 2
  • Manner Adverbs
  • Specify manner of action or process
  • e.g. walk slowly, run fast
  • Temporal Adverbs
  • Specify time of event or action
  • e.g. yesterday, Monday

20
Closed Word Classes
  • prepositions: on, under, over, at, from, to,
    with, ...
  • determiners: a, an, the, ...
  • pronouns: he, she, it, his, her, who, I, ...
  • conjunctions: and, or, as, if, when, ...
  • auxiliary verbs: can, may, should, are
  • particles: up, down, on, off, in, out, ...
  • numerals: one, two, three, ..., first, second, ...

21
Language and Grammar
  • Natural Language described as Formal Language L
    using a Formal Grammar G
  • start symbol S: sentence
  • non-terminals NT: syntactic constituents
  • terminals T: lexical entries / words
  • production rules P: grammar rules
  • Generate sentences or recognize sentences
    (Parsing) of the language L through the
    application of grammar rules.

22
Grammar
  • Here, POS Tags are included in the grammar rules.
  • det → the
  • noun → dog | bone
  • verb → eat
  • NP → det noun (NP = noun phrase)
  • VP → verb (VP = verb phrase)
  • VP → verb NP
  • S → NP VP (S = sentence)
  • Most often we deal with Context-free Grammars,
    with a distinguished Start-symbol S (sentence).
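
To illustrate how such a grammar generates the sentences of a language, the rules of this slide can be written as a Python dictionary and expanded exhaustively. This is a sketch only: the names `GRAMMAR` and `expand` are hypothetical, and exhaustive expansion works here only because the grammar is not recursive.

```python
from itertools import product

# The grammar of this slide: non-terminal -> list of right-hand sides.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["det", "noun"]],
    "VP": [["verb"], ["verb", "NP"]],
    "det": [["the"]],
    "noun": [["dog"], ["bone"]],
    "verb": [["eat"]],
}

def expand(symbol):
    """Return every word sequence derivable from symbol (finite grammars only)."""
    if symbol not in GRAMMAR:  # terminal word
        return [[symbol]]
    results = []
    for rhs in GRAMMAR[symbol]:
        # Combine the expansions of the RHS symbols from left to right.
        for parts in product(*(expand(s) for s in rhs)):
            results.append([word for part in parts for word in part])
    return results

sentences = [" ".join(words) for words in expand("S")]
for sentence in sentences:
    print(sentence)
```

Because the lexicon contains only the stem "eat", the grammar generates "the dog eat the bone": agreement between subject and verb is not expressed by these rules.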

23
Parsing
  • Parsing
  • derive the syntactic structure of a sentence
    based on a language model (grammar)
  • construct a parse tree, i.e. the derivation of
    the sentence based on the grammar (rewrite system)

24
Parsing (here bottom-up)
  • determine the syntactic structure of the sentence
  • the → det
  • dog → noun
  • det noun → NP
  • eats → verb
  • the → det
  • bone → noun
  • det noun → NP
  • verb NP → VP
  • NP VP → S
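
The reduction sequence above can be reproduced with a small backtracking shift-reduce recognizer. This is a sketch under assumptions: shift-reduce is one possible bottom-up strategy and is not named on the slide; `RULES`, `_search` and `bottom_up_parse` are hypothetical names; "eats" is entered directly as a verb form.

```python
# Rules as (LHS, RHS); word -> category entries are treated like grammar rules.
RULES = [
    ("det", ("the",)),
    ("noun", ("dog",)),
    ("noun", ("bone",)),
    ("verb", ("eats",)),
    ("NP", ("det", "noun")),
    ("VP", ("verb", "NP")),
    ("VP", ("verb",)),
    ("S", ("NP", "VP")),
]

def _search(stack, words, steps):
    """Backtracking search; returns the list of reductions or None."""
    if not words and stack == ("S",):
        return steps
    # Try every reduction whose RHS matches the top of the stack.
    for lhs, rhs in RULES:
        n = len(rhs)
        if stack[-n:] == rhs:
            step = f"{' '.join(rhs)} -> {lhs}"
            found = _search(stack[:-n] + (lhs,), words, steps + [step])
            if found is not None:
                return found
    # If no reduction leads to a parse, shift the next input word.
    if words:
        return _search(stack + (words[0],), words[1:], steps)
    return None

def bottom_up_parse(sentence):
    words = tuple(sentence.lower().rstrip(".").split())
    return _search((), words, [])

for step in bottom_up_parse("The dog eats the bone"):
    print(step)
```

Backtracking is needed because reducing "verb" to VP immediately after "eats" leads to a dead end; the search then undoes that step and shifts the object noun phrase first.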

25
Sample Grammar
Grammar G = (S, NT, T, P)
NT: non-terminals; T: terminals; P: productions
Sentence symbol S ∈ NT
Word classes / part-of-speech ∈ NT
Syntactic constituents ∈ NT
Terminal words ∈ T
Grammar rules P ⊆ NT × (NT ∪ T)*
S → NP VP | Aux NP VP
NP → Det Nominal | Proper-Noun
Nominal → Noun | Nominal PP
VP → Verb | Verb NP | Verb PP | Verb NP PP
PP → Prep NP
Det → that | this | a
Noun → book | flight | meal | money
Proper-Noun → Houston | American Airlines | TWA
Verb → book | include | prefer
Prep → from | to | on
Aux → do | does
26
Sample Parse Tree
Parse of "Does this flight include a meal?":
S
  Aux: does
  NP
    Det: this
    Nominal
      Noun: flight
  VP
    Verb: include
    NP
      Det: a
      Nominal
        Noun: meal
27
Bottom-up vs. Top-Down Parsing
Bottom-up: from word-nodes to sentence-symbol
Top-down Parsing: from sentence-symbol to words
[Figure: parse tree S → Aux NP VP for "does this flight include a meal"]
28
Ambiguity
  • "One morning, I shot an elephant in my pajamas.
  • How he got into my pajamas, I don't know."
  • (Groucho Marx)
  • syntactical or structural ambiguity: several
    parse trees
  • example: sentence above
  • semantic or lexical ambiguity: several word
    meanings
  • bank (where you get money) and (river) bank
  • even different word categories possible (interim)
  • "He books the flight." vs. "The books are
    here."
  • "Fruit flies from the balcony" vs. "Fruit flies
    are on the balcony."

29
Lexical Ambiguity
  • Several word senses or word categories
  • e.g. chase: noun or verb
  • e.g. plant - ????

30
Syntactic Ambiguity
  • Several parse trees
  • e.g. The dog eats the bone in the park.
  • e.g. The dog eats the bone in the package.
  • Who/what is in the park and who/what is in the
    package?
  • Syntactically speaking
  • How do I bind the Prepositional Phrase
  • "in the ... " ?
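
The two attachments of the prepositional phrase can be made visible by counting parse trees. The sketch below uses a CKY-style chart over a small grammar that is an assumption of this example (the rules NP → NP PP and VP → VP PP model the two attachment sites); `BINARY`, `LEXICON` and `count_parses` are hypothetical names.

```python
from collections import defaultdict

# Assumed binary grammar: (LHS, left child, right child).
BINARY = [
    ("S", "NP", "VP"),
    ("NP", "Det", "Noun"),
    ("NP", "NP", "PP"),    # PP attached to the noun phrase
    ("VP", "Verb", "NP"),
    ("VP", "VP", "PP"),    # PP attached to the verb phrase
    ("PP", "Prep", "NP"),
]
LEXICON = {"the": "Det", "dog": "Noun", "bone": "Noun", "park": "Noun",
           "package": "Noun", "eats": "Verb", "in": "Prep"}

def count_parses(sentence):
    """Count the distinct parse trees with root S for the sentence."""
    words = sentence.lower().rstrip(".").split()
    n = len(words)
    # chart[(i, j)][A] = number of trees deriving words[i:j] from category A
    chart = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(words):
        chart[(i, i + 1)][LEXICON[word]] += 1
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for lhs, left, right in BINARY:
                    chart[(i, j)][lhs] += chart[(i, k)][left] * chart[(k, j)][right]
    return chart[(0, n)]["S"]

print(count_parses("The dog eats the bone in the park."))
print(count_parses("The dog eats the bone."))
```

Syntax alone gives both sentences with "in the ..." the same two readings; choosing between them (the dog is in the park, the bone is in the package) requires semantic knowledge.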

31
Problems in Parsing
  • Problems with left-recursive rules like NP → NP
    PP: we don't know how many times the recursion is
    needed.
  • Pure Bottom-up or Top-down Parsing is inefficient
    because it generates and explores too many
    structures which in the end turn out to be
    invalid.
  • Combine top-down and bottom-up approach:
  • Start with sentence; use rules top-down
    (look-ahead); read input; try to find shortest
    path from input to highest unparsed constituent
    (from left to right).
  • → Chart-Parsing / Earley-Parser

32
Chart-Parsing / Earley Algorithm
  • Essence
  • Integrate top-down and bottom-up parsing.
  • Keep recognized sub-structures (sub-trees) for
    shared use during parsing.
  • Top-down Prediction: Start with S-symbol.
    Generate all applicable rules for S. Go further
    down with left-most constituent in rules and add
    rules for these constituents until you encounter
    a left-most node on the RHS which is a word
    category (POS).
  • Bottom-up Completion: Read input word and
    compare. If word matches, mark as recognized and
    continue the recognition bottom-up, trying to
    complete active rules.

33
Earley Algorithm - Functions
  • predictor
  • generates new rules for partly recognized RHS
    with a constituent right of the dot (top-down
    generation)
  • the dot indicates how far a rule has been recognized
  • scanner
  • if a word category (POS) is found right of the dot,
    the Scanner reads the next input word and adds a
    rule for it to the chart (bottom-up mode)
  • completer
  • if a rule is completely recognized (the dot is far
    right), the recognition state of earlier rules in
    the chart advances: the dot is moved over the
    recognized constituent (bottom-up recognition).
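
The interplay of the three functions can be sketched as a compact Earley recognizer. The following is a minimal illustration, not the course's implementation: the grammar is a reduced variant of the sample grammar with an added rule S → VP (an assumption, so that imperatives like "book this flight" are covered), and the names `GRAMMAR`, `POS` and `earley_recognize` are hypothetical.

```python
# Assumed grammar: non-terminal -> right-hand sides (tuples of symbols).
GRAMMAR = {
    "S": [("NP", "VP"), ("Aux", "NP", "VP"), ("VP",)],
    "NP": [("Det", "Nominal")],
    "Nominal": [("Noun",), ("Nominal", "PP")],
    "VP": [("Verb",), ("Verb", "NP")],
    "PP": [("Prep", "NP")],
}
# Assumed lexicon: word -> set of word categories (POS).
POS = {
    "book": {"Noun", "Verb"}, "flight": {"Noun"}, "meal": {"Noun"},
    "this": {"Det"}, "that": {"Det"}, "a": {"Det"},
    "include": {"Verb"}, "does": {"Aux"}, "from": {"Prep"},
}

def earley_recognize(words):
    """Return True if the word list is a sentence of the grammar."""
    words = [w.lower() for w in words]
    # A state is (lhs, rhs, dot, origin); chart[i] holds states ending at i.
    chart = [[] for _ in range(len(words) + 1)]

    def add(i, state):
        if state not in chart[i]:
            chart[i].append(state)

    add(0, ("GAMMA", ("S",), 0, 0))  # dummy start rule GAMMA -> . S
    for i in range(len(words) + 1):
        for lhs, rhs, dot, origin in chart[i]:  # chart[i] grows while iterating
            if dot < len(rhs):
                nxt = rhs[dot]
                if nxt in GRAMMAR:
                    # Predictor: add rules for the constituent right of the dot.
                    for production in GRAMMAR[nxt]:
                        add(i, (nxt, production, 0, i))
                elif i < len(words) and nxt in POS.get(words[i], ()):
                    # Scanner: add a completed rule "POS -> word ." to the next chart entry.
                    add(i + 1, (nxt, (words[i],), 1, i))
            else:
                # Completer: move the dot over lhs in the rules waiting for it.
                for l2, r2, d2, o2 in chart[origin]:
                    if d2 < len(r2) and r2[d2] == lhs:
                        add(i, (l2, r2, d2 + 1, o2))
    return ("GAMMA", ("S",), 1, 0) in chart[len(words)]

print(earley_recognize("book this flight".split()))
print(earley_recognize("does this flight include a meal".split()))
```

The duplicate check in `add` is what makes the left-recursive rule Nominal → Nominal PP harmless: the predictor adds it once per chart position instead of looping.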

34
Chart
S → VP .
VP → V NP .
NP → Det Nom .
Nom → Noun .
V: Book    Det: this    Noun: flight
36
Semantics
37
Semantic Representation
  • Representation of the meaning of a sentence.
  • Generate
  • a logic-based representation or
  • a frame-based representation
  • based on the syntactic structure, lexical
    entries, and particularly the head-verb
    (determines how to arrange parts of the sentence
    in the semantic representation).

38
Semantic Representation
  • Verb-centered Representation
  • The verb (action, head) is regarded as the center of
    the verbal expression and determines the case frame
    with possible case roles; other parts of the
    sentence are described in relation to the action
    as fillers of case slots. (cf. also Schank's CD
    Theory)
  • Typing of case roles possible (e.g. 'agent'
    refers to a specific sort or concept)

39
General Frame for "eat"
  • Agent: animate
  • Action: eat
  • Patient: food
  • Manner: e.g. fast
  • Location: e.g. in the yard
  • Time: e.g. at noon

40
Example-Frame with Fillers
  • Agent: the dog
  • Action: eat
  • Patient: the bone / the bone in the package
  • Location: in the park

41
Role         General frame for "drive"   Frame with fillers
Agent        animate                     she
Action       drive                       drives
Patient      vehicle                     the convertible
Manner       the way it is done          fast
Location     Location-spec               in the Rocky Mountains
Source       Location-spec               from home
Destination  Location-spec               to the ASIC conference
Time         Time-spec                   in the summer holidays
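
A case frame with typed roles can be sketched as a small data structure that checks each filler against the sort required by its role. Everything named here is hypothetical: the class `CaseFrame`, the toy sort table `SORTS`, and the choice to leave the location role unrestricted.

```python
from dataclasses import dataclass, field

# Hypothetical sort table: filler -> sort (concept) it belongs to.
SORTS = {"dog": "animate", "she": "animate", "bone": "food", "convertible": "vehicle"}

@dataclass
class CaseFrame:
    action: str
    role_types: dict                       # case role -> required sort (None = any)
    fillers: dict = field(default_factory=dict)

    def fill(self, role, filler):
        """Fill a case slot after checking the filler against the role type."""
        if role not in self.role_types:
            raise KeyError(f"{self.action!r} has no case role {role!r}")
        required = self.role_types[role]
        if required is not None and SORTS.get(filler) != required:
            raise ValueError(f"{filler!r} is not of sort {required!r}")
        self.fillers[role] = filler
        return self

# General frame for "eat", then the example frame with fillers.
eat = CaseFrame("eat", {"agent": "animate", "patient": "food", "location": None})
eat.fill("agent", "dog").fill("patient", "bone").fill("location", "in the park")
print(eat.fillers)
```

Typing the roles is what lets a system reject a reading such as the bone being the agent of "eat".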

42
Representation in Logic
  • Action: eat
  • Agent: the dog
  • Patient: the bone / the bone in the package
  • Location: in the park

eat (dog-1, bone-1, park-1)
(eat: predicate; dog-1, bone-1, park-1: constants)
43
Representation in Logic
lexical: eat (dog-1, bone-1, park-1)
general (variables): eat (x, y, z)
syntactic: eat (NP-1, NP-2, PP)
semantic frame: animate-being (x), food (y), location (z)
syntactic frame: NP-1 (x), NP-2 (y), PP (z)
44
Pragmatics
45
Pragmatics
  • Pragmatics includes context-related aspects of NL
    expressions (utterances).
  • These are in particular anaphoric references,
    elliptic expressions, and deictic expressions.
  • anaphoric references: refer to items mentioned
    before
  • deictic expressions: simulate pointing
    gestures
  • elliptic expressions: incomplete expressions that
    relate to an item mentioned before

46
Pragmatics
  • "I put the box on the top shelf."
  • "I know that. But I can't find it there."
    ("it": anaphoric reference; "there": deictic expression)
  • "The candy-box?" (elliptic expression)
47
Intentions
  • Intentions
  • One philosophical assumption is that natural
    language is used to achieve things or situations:
    "Do things with words."
  • The meaning of an utterance is essentially
    determined by the intention of the speaker.

48
Intentionality - Examples
What was said                        What was meant
"There is a terrible draft here."    "Can you please close the window."
"How does it look here?"             "I am really mad; clean up your room."
"Will this ever end?"                "I would prefer to be with my friends than to sit in class now."

49
Metaphors
  • Metaphors
  • The meaning of a sentence or expression is not
    directly inferable from the sentence structure
    and the word meanings. Metaphors transfer
    concepts and relations from one area of discourse
    into another area, for example, seeing time as a
    line (in space) or seeing friendship or life as a
    journey.

50
Metaphors - Examples
  • This car eats a lot of gas.
  • She devoured the book.
  • He was tied up with his clients.
  • Marriage is like a journey.
  • Their marriage was a one-way road into hell.
  • (see George Lakoff, Women, Fire and Dangerous
    Things)

51
Dialogue and Discourse
52
Discourse / Dialogue Structure
  • Grammar for various sentence types (speech acts):
    dialogue, discourse, story grammar
  • Distinguish questions, commands, and statements
  • Where is the remote-control?
  • Bring the remote-control!
  • The remote-control is on the brown table.
  • Dialogue Grammars describe possible sequences of
    Speech Acts in communication, e.g. that a
    question is followed by an answer/statement.

53
Speech
55
Speech Production & Reception
  • Sound and Hearing
  • change in air pressure → sound wave
  • reception through inner ear membrane / microphone
  • break-up into frequency components: receptors in
    cochlea / mathematical frequency analysis (e.g.
    Fast Fourier Transform, FFT) → Frequency Spectrum
  • perception/recognition of phonemes and
    subsequently words (e.g. Neural Networks,
    Hidden-Markov Models)

57
Speech Recognition Phases
  • Speech Recognition
  • acoustic signal as input
  • signal analysis - spectrogram
  • feature extraction
  • phoneme recognition
  • word recognition
  • conversion into written words

58
Speech Signal
  • Speech Signal composed of
  • harmonic signal (sine waves)
  • with different frequencies and amplitudes
  • frequency - waves/second → like pitch
  • amplitude - height of wave → like loudness
  • non-harmonic signal (not sine wave): noise

60
glottis and speech signal in lingWAVES (from
http://www.lingcom.de)
61
Speech Signal Analysis
  • Analog-Digital Conversion of Acoustic Signal
  • Sampling in Time Frames (windows)
  • frequency: 0-crossings per time frame
  • → e.g. 2 crossings/second is 1 Hz (1 wave)
  • → e.g. 10 kHz needs sampling rate 20 kHz
  • measure amplitudes of signal in time frame
  • → digitized wave form
  • separate different frequency components
  • → FFT (Fast Fourier Transform)
  • → spectrogram
  • other frequency based representations
  • → LPC (linear predictive coding),
  • → Cepstrum
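
The steps on this slide can be tried out on a synthetic sine wave. The sketch below is an illustration only: all function names are hypothetical, the signal is artificial, and the frequency analysis uses a naive discrete Fourier transform, which computes the same spectrum as an FFT but much more slowly.

```python
import cmath
import math

def sample_sine(freq_hz, sample_rate_hz, n_samples, phase=0.1):
    """Digitize a sine wave: one amplitude value per sampling instant."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate_hz + phase)
            for t in range(n_samples)]

def zero_crossing_frequency(samples, sample_rate_hz):
    """Estimate the frequency from sign changes: 2 crossings/second = 1 Hz."""
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))
    duration_s = len(samples) / sample_rate_hz
    return crossings / (2 * duration_s)

def min_sampling_rate(max_freq_hz):
    """Sampling rate needed for a given highest frequency (twice that frequency)."""
    return 2 * max_freq_hz

def dft_peak_frequency(samples, sample_rate_hz):
    """Strongest frequency component, found with a naive DFT (not an FFT)."""
    n = len(samples)
    magnitudes = [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(samples)))
        for k in range(n // 2 + 1)
    ]
    peak_bin = max(range(len(magnitudes)), key=magnitudes.__getitem__)
    return peak_bin * sample_rate_hz / n

one_second = sample_sine(100, 8000, 8000)   # 100 Hz tone sampled at 8 kHz
print(zero_crossing_frequency(one_second, 8000))
print(min_sampling_rate(10_000))
print(dft_peak_frequency(sample_sine(8, 64, 64), 64))
```

Zero-crossing counts only work for a single clean tone; for real speech with several overlapping frequency components, the spectral analysis is needed.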

62
Waveform
[Figure: waveform of "She just had a baby." (amplitude/pressure over time)]
63
Waveform for Vowel ae
[Figure: amplitude/pressure over time]
64
Waveform and Spectrogram
65
Waveform and LPC Spectrum for Vowel ae
[Figure: waveform (amplitude/pressure over time) and LPC spectrum (energy over frequency, with formants marked)]
66
Phoneme Recognition
  • Recognition Process based on
  • features extracted from spectral analysis
  • phonological rules
  • statistical properties of language/ pronunciation
  • Recognition Methods
  • Hidden Markov Models
  • Neural Networks
  • Pattern Classification in general

67
Speech Signal Characteristics
  • Derive from signal representation
  • formants - dark stripes in spectrum
  • strong frequency components; characterize
    particular vowels; gender of speaker
  • pitch: fundamental frequency
  • baseline for higher frequency harmonics like
    formants; gender characteristic
  • change in frequency distribution
  • characteristic for e.g. plosives (form of
    articulation)

68
Features for Vowels & Consonants
69
Probabilistic FAs as Word Models
70
Word Recognition with Hidden Markov Model
71
Viterbi-Algorithm
  • The Viterbi Algorithm finds an optimal sequence
    of states in continuous Speech Recognition, given
    an observation sequence of phones and a
    probabilistic (weighted) FA (state graph). The
    algorithm returns the path through the automaton
    which has maximum probability and accepts the
    observation sequence.
  • $a_{s,s'}$ is the transition probability (in the
    phonetic word model) from current state $s$ to next
    state $s'$, and $b_{s'}(o_t)$ is the observation
    likelihood of $s'$ given $o_t$. $b_{s'}(o_t)$ is 1 if the
    observation symbol matches the state, and 0
    otherwise.
  • (cf. Jurafsky Ch.5)
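
The standard Viterbi recurrence, $v_t(s') = \max_s v_{t-1}(s)\, a_{s,s'}\, b_{s'}(o_t)$, can be sketched over a small weighted automaton. The network below is invented for illustration (two word models, "need" and "knee", with made-up transition probabilities); `PHONE`, `TRANS` and `viterbi` are hypothetical names, and the observation likelihood is the 0/1 match described above.

```python
# Hypothetical state graph: states q1-q3 model "need", q4-q5 model "knee".
PHONE = {"q1": "n", "q2": "iy", "q3": "d", "q4": "n", "q5": "iy"}
TRANS = {  # transition probabilities a[s][s'] (made-up values)
    "start": {"q1": 0.6, "q4": 0.4},
    "q1": {"q2": 1.0},
    "q2": {"q3": 1.0},
    "q3": {"end": 1.0},
    "q4": {"q5": 1.0},
    "q5": {"end": 1.0},
}

def viterbi(observations):
    """Return (probability, state path) of the best accepting path."""
    # best[s] = (probability, path) of the best path ending in state s
    best = {"start": (1.0, [])}
    for phone in observations:
        following = {}
        for s, (p, path) in best.items():
            for s2, a in TRANS.get(s, {}).items():
                if s2 == "end":
                    continue
                b = 1.0 if PHONE[s2] == phone else 0.0  # observation likelihood
                score = p * a * b
                if score > 0 and score > following.get(s2, (0.0, None))[0]:
                    following[s2] = (score, path + [s2])
        best = following
    # Only paths that can move into the end state accept the sequence.
    finals = [(p * TRANS[s]["end"], path)
              for s, (p, path) in best.items() if "end" in TRANS.get(s, {})]
    return max(finals, default=(0.0, []))

print(viterbi(["n", "iy", "d"]))
print(viterbi(["n", "iy"]))
```

For the observation "n iy" the more probable first transition (into "need") is not chosen, because that path cannot reach the end state after two phones.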

72
Speech Recognizer Architecture
73
Speech Processing - Characteristics
  • Speech Recognition vs. Speaker Identification
    (Voice Recognition)
  • speaker-dependent vs. speaker-independent
  • training
  • unlimited vs. large vs. small vocabulary
  • single word vs. continuous speech

74
Spoken Language
75
Spoken Language
  • Output of Speech Recognition System as input
    "text".
  • Can be associated with probabilities for
    different word sequences.
  • Contains ungrammatical structures, so-called
    "disfluencies", e.g. repetitions and corrections.

77
Spoken Language - Examples
  1. no s- straight southwest
  2. right to my my left
  3. that is that is correct

From Robin J. Lickley, HCRC Disfluency Coding
Manual, http://www.ling.ed.ac.uk/robin/maptask/HCRCdsm-01.html
78
Spoken Language - Examples
  1. we're going to g-- ... turn straight back
    around for testing.
  2. come to ... walk right to the ... right-hand
    side of the page.
  3. right up ... past ... up on the left of the ...
    white mountain walk ... right up past.
  4. i'm still ... i've still gone halfway back
    round the lake again.

79
Spoken Language - Examples
  1. I'd d if I need to go
  2. it's basi-- see if you go over the old mill
  3. you are going make a gradual slope to your
    right
  4. I've got one I don't realize why it is there

80
Spoken Language - Disfluency
  • Reparandum and Repair

come to ... walk right to the ... the
right-hand side of the page
Reparandum
Repair
81
Additional References
  • Jurafsky, D. & J. H. Martin, Speech and Language
    Processing, Prentice-Hall, 2000
  • Huang, X., A. Acero & H.-W. Hon, Spoken Language
    Processing: A Guide to Theory, Algorithms, and
    System Development, Prentice-Hall, NJ, 2001
  • Kemke, C., 74.793 Natural Language and Speech
    Processing - Course Notes, 2nd Term 2004, Dept.
    of Computer Science, U. of Manitoba
  • Robin J. Lickley, HCRC Disfluency Coding Manual,
    http://www.ling.ed.ac.uk/robin/maptask/HCRCdsm-01.html

82
Figures
  • Figures taken from
  • Jurafsky, D. & J. H. Martin, Speech and Language
    Processing, Prentice-Hall, 2000, Chapters 5 and
    7.
  • lingWAVES (from http://www.lingcom.de)