Title: ESSLLI 2006 Summer School
1 ESSLLI 2006 Summer School
- Málaga, Spain
- 31 July - 11 August
2 General Comments
- PLUS
- Courses on time
- Proceedings of all courses
- Workshops
- Student sessions
- Internet connection
- MINUS
- Not well organized
- Site not updated on time
- Lunch tickets
3 Courses
- Counting Words: An Introduction to Lexical Statistics
- Formal Ontology for Communicating Agents (Workshop)
- Word Sense Disambiguation
- Introduction to Corpus Resources, Annotation and Access
- An Empirical View on Semantic Roles Within and Across Languages
- Approximate Reasoning for the Semantic Web
4 Counting Words - Marco Baroni and Stefan Evert
- Contents
- Introduction
- Distributions
- Zipf's Law
- The zipfR package
- Practical Consequences and Conclusion
5 Introduction
- The frequency of words plays an important role in corpus linguistics.
- The study of word frequency distributions is called lexical statistics.
- Word frequency distributions seem to be of more interest to theoretical physicists than to theoretical linguists.
- This course introduces some of the empirical phenomena pertaining to word frequency distributions and the classic models that have been proposed to capture them.
6 Distributions: Basic Terminology
- Types: distinct words
- Tokens: instances of all distinct words
- Corpus size (N): number of tokens in the corpus
- Vocabulary size (V): number of types
- Frequency list: a list that reports the number of tokens of each type in the corpus
- Rank/frequency profile: the frequency list with the types replaced by their frequency ranks
- Frequency spectrum: a list reporting how many types in a frequency list have a certain frequency
7 Distributions: Example
- Sample: a b b c a a b a d
- N = 9, V = 4
- Frequency list, rank/frequency profile, and frequency spectrum (a computational sketch follows below):

    Frequency list      Rank/freq. profile      Freq. spectrum
    type   f            rank   f                f     V(f)
    a      4            1      4                1     2
    b      3            2      3                3     1
    c      1            3      1                4     1
    d      1            4      1
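A minimal Python sketch (standard library only) that computes the frequency list, rank/frequency profile and frequency spectrum for the toy sample above:

    from collections import Counter

    tokens = ["a", "b", "b", "c", "a", "a", "b", "a", "d"]

    # Frequency list: number of tokens of each type
    freq_list = Counter(tokens)                      # {'a': 4, 'b': 3, 'c': 1, 'd': 1}

    # Corpus size N and vocabulary size V
    N = sum(freq_list.values())                      # 9
    V = len(freq_list)                               # 4

    # Rank/frequency profile: frequencies in decreasing order, ranks 1..V replace the types
    rank_freq = {rank: f for rank, (_, f) in
                 enumerate(freq_list.most_common(), start=1)}

    # Frequency spectrum: how many types occur with each frequency f
    spectrum = Counter(freq_list.values())           # {4: 1, 3: 1, 1: 2}

    print(N, V, rank_freq, dict(spectrum))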
8 Distributions: Typical Frequency Patterns
- Top ranks are occupied by function words (the, of, and, ...)
- Frequency decreases quite rapidly
- The lowest-frequency elements are content words
9 Zipf's Law
- Frequency is a non-linear, decreasing function of rank.
- Zipf's model: f(w) = C / r(w)^a
- The model predicts a very rapid decrease in frequency among the most frequent words, which becomes slower as the rank grows.
- Mathematical property
- log f(w) = log C - a * log r(w) (a linear function of log rank; see the fitting sketch below)
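A small illustration of the log-linear property, assuming numpy is available; the frequencies below are invented. The slope of a least-squares fit in log-log space estimates the exponent a:

    import numpy as np

    # Hypothetical frequencies, sorted in decreasing order (e.g. taken from a frequency list)
    freqs = np.array([3120, 1550, 1020, 770, 610, 510, 440, 380, 340, 310], dtype=float)
    ranks = np.arange(1, len(freqs) + 1, dtype=float)

    # Zipf's model f(r) = C / r^a becomes log f = log C - a * log r,
    # so a straight-line fit in log-log space recovers the exponent a.
    slope, intercept = np.polyfit(np.log(ranks), np.log(freqs), deg=1)
    a = -slope
    C = np.exp(intercept)
    print(f"estimated exponent a = {a:.2f}, constant C = {C:.0f}")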
10 Zipf's Law: Applications and Explanations
- Zipfian distributions are encountered in various phenomena
- City populations
- Incomes in economics
- Frequency of citations of scientific papers
- Visits to web sites
- Least effort principle
11 The zipfR Package
- Statistical package for modeling lexical distributions.
- URL: http://www.purl.org/stefan.evert/zipfR
- Dependency: the R environment
- URL: http://www.r-project.org
- Binaries available for Windows and MacOS.
- Source available for Linux.
- Open-source, GNU-licensed project.
12 Practical Consequences and Conclusion
- The Zipfian nature of word frequency distributions causes data sparseness problems.
- Although V grows with corpus size, we cannot use it as a measure of lexical richness when comparing corpora.
- Interested readers should proceed to Baayen (2001) for a thorough introduction to word frequency distributions with an emphasis on statistical modeling.
13 References
- Abney, Steven (1996), Statistical methods and linguistics. In Klavans, J. / Resnik, P. (eds.), The Balancing Act: Combining Symbolic and Statistical Approaches to Language. Cambridge, MA: MIT Press, 1-23.
- Baayen, Harald (2001), Word Frequency Distributions. Dordrecht: Kluwer.
- Baldi, Pierre / Frasconi, Paolo / Smyth, Padhraic (2003), Modeling the Internet and the Web. Chichester: Wiley.
- Biber, Douglas / Conrad, Susan / Reppen, Randi (1998), Corpus Linguistics. Cambridge: Cambridge University Press.
- Creutz, Mathias (2003), Unsupervised segmentation of words using prior distributions of morph length and frequency. In Proceedings of ACL 03, 280-287.
- Dalgaard, Peter (2002), Introductory Statistics with R. New York: Springer.
- Evert, Stefan (2004), The Statistics of Word Co-occurrences: Word Pairs and Collocations. PhD thesis, University of Stuttgart/IMS.
14 References
- Evert, Stefan / Baroni, Marco (2006), Testing the extrapolation quality of word frequency models. In Proceedings of Corpus Linguistics 2005, available from http://www.corpus.bham.ac.uk/PCLC
- Li, Wentian (2002), Zipf's Law everywhere. In Glottometrics 5, 14-21.
- Manning, Christopher / Schütze, Hinrich (1999), Foundations of Statistical Natural Language Processing. Cambridge, MA: MIT Press.
- McEnery, Tony / Wilson, Andrew (2001), Corpus Linguistics, 2nd edition. Edinburgh: Edinburgh University Press.
- Oakes, Michael (1998), Statistics for Corpus Linguistics. Edinburgh: Edinburgh University Press.
- Sampson, Geoffrey (2002), Review of Harald Baayen, Word Frequency Distributions. In Computational Linguistics 28, 565-569.
- Zipf, George Kingsley (1949), Human Behavior and the Principle of Least Effort. Cambridge, MA: Addison-Wesley.
- Zipf, George Kingsley (1965), The Psycho-Biology of Language. Cambridge, MA: MIT Press.
15 Formal Ontology for Communicating Agents (FOCA) Workshop
- Contents
- Introduction
- Communicative acts
- The missing ontological link
- Semantic coordination
- A Communication Acts Ontology for Software Agents Interoperability
- OWL DL as a FIPA ACL Content Language
16 Introduction
- Purpose of the workshop
- To gather contributions that
- Take seriously into account the ontological aspects of communication and interaction
- Use formal ontologies to achieve better semantic coordination between interacting and communicating agents
17 Introduction: Communicative Acts
- According to Austin, three kinds of acts can be performed simultaneously through a single utterance:
- Locutionary act: producing noises that conform to a system
- Illocutionary act: what is performed in saying something
- Perlocutionary act: what is performed by saying something
- An important issue is the distinction between the last two acts.
18 Introduction: The Missing Ontological Link
- Ontological ingredients
- Events, states, actions, speech acts, relations, plans, propositions, arguments, facts, commitments, ...
- Top-level ontologies focus on the sub-domain of concrete entities, like time, space, ...
- There is a need to integrate the large amount of philosophical work on other domains, like that of abstract entities.
19 Introduction: Semantic Coordination
- An important aspect of interaction and communication involves the management of ontologies.
- Scenarios identified w.r.t. semantic coordination:
- With a shared pre-existing ontology
- With different ontologies, but linked to a pre-existing common upper-level ontology
- With different ontologies, but mapped directly onto each other
- When agents are involved, they can
- Keep static ontologies but manage a shared dynamic one
- Create new static ontologies through a negotiation phase
- Modify their ontology during the interaction while maintaining some kind of negotiated meaning
20 A Communication Acts Ontology for Software Agents Interoperability
- Each ACL defines its own classes of communication acts.
- The use of an agreed ontology can open the possibility of real agent interoperation, based on a wide agreement on some classes of communication acts that serve as a bridge among the different ACL islands.
- Main design criterion: follow speech act theory and also embed an approach for expressing the semantics of the communication acts.
- Use the OWL DL language.
21 A Communication Acts Ontology for Software Agents Interoperability
- Upper layer
- CommunicationAct ⊑ ∃hasSender.Actor ⊓ ≤1 hasSender ⊓ ∃hasReceiver.Actor ⊓ ∃hasContent.Content
- Assertive ⊑ CommunicationAct ⊓ ∃hasContent.Proposition ⊓ ∃hasCommit.AssertiveCommitment
- Directive ⊑ CommunicationAct ⊓ ∃hasContent.Action ⊓ ∃hasCommit.DirectiveCommitment
- Commissive ⊑ CommunicationAct ⊓ ∃hasContent.Action ⊓ ∃hasCondition.Proposition ⊓ ∃hasCommit.CommissiveCommitment
- Expressive ⊑ CommunicationAct ⊓ ∃hasContent.Proposition ⊓ ∃hasState.PsyState ⊓ ∃hasCommit.ExpressiveCommitment
- Declarative ⊑ CommunicationAct ⊓ ∃hasContent.Proposition
22 A Communication Acts Ontology for Software Agents Interoperability
- The Standards Layer extends the Upper Layer with terms representing classes of communication acts of general-purpose ACLs, like FIPA-ACL.
- The Applications Layer is the most specific: it defines communication act classes for a specific application.
- Concluding: classes in the upper layer are considered the framework agreement for general communication; classes in the standards layer reflect the classes of communication acts that different standard ACLs define; classes in the applications layer concern the particular communication acts used by each agent system committing to the ontology.
23 References
- J. L. Austin. How to Do Things With Words. Oxford University Press, Oxford, 1962.
- J. R. Searle. Speech Acts: An Essay in the Philosophy of Language. Cambridge University Press, New York, 1969.
- M. P. Singh. Agent Communication Languages: Rethinking the Principles. IEEE Computer, vol. 31, num. 12, pp. 40-47, 1998.
- M. Wooldridge. Semantic Issues in the Verification of Agent Communication Languages. Journal of Autonomous Agents and Multi-Agent Systems, vol. 3, num. 1, pp. 9-31, 2000.
- Y. Labrou, T. Finin, Y. Peng. Agent Communication Languages: the Current Landscape. IEEE Intelligent Systems, vol. 14, num. 2, pp. 45-52, 1999.
- M. P. Singh. A Social Semantics for Agent Communication Languages. Issues in Agent Communication, pp. 31-45. Springer-Verlag, 2000.
- FIPA Communicative Act Library Specification. Foundation for Intelligent Physical Agents, 2005. http://www.fipa.org/specs/fipa00037/SC00037J.html
24 References
- N. Asher and A. Lascarides. Logics of Conversation. Cambridge University Press, 2003.
- S. Levinson. Pragmatics. Cambridge University Press, 1983.
- J. R. Searle and D. Vanderveken. Foundations of Illocutionary Logic. Cambridge University Press, 1975.
- J. R. Searle. The Construction of Social Reality. Free Press, New York, 1995.
- R. Stalnaker. Assertion. Syntax and Semantics, 9:315-332, 1978.
- J. Ginzburg. Dynamics and the Semantics of Dialogue. CSLI, Stanford, 1996.
- H. H. Clark. Using Language. Cambridge University Press, 1996.
- S. Carberry. Plan Recognition in Natural Language Dialogue. MIT Press, 1990.
25 OWL DL as a FIPA ACL Content Language
- The FIPA-SL content language is in general undecidable.
- Use OWL DL to enable semantic validation of the content of the ACL message and to separate speech act semantics from content semantics.
- Their ontology formalizes some of the FIPA specifications (message structure, ontology service, content language, communicative act library).
26 OWL DL as a FIPA ACL Content Language
- Advantages
- Application ontologies are domain independent: they can be applied to a MAS in different domains.
- Various application ontologies in OWL DL are available, which shows great potential for reusing already formulated ontologies.
- W3C suggests the use of OWL within agents.
27 References
- Eric Miller et al. Web Ontology Language (OWL), 2004.
- RACER Systems GmbH. The Features of RacerPro Version 1.9, 2005.
- Foundation for Intelligent Physical Agents. FIPA ACL Message Structure Specification, 2002.
- Foundation for Intelligent Physical Agents. FIPA Ontology Service Specification, 2001.
- Foundation for Intelligent Physical Agents. FIPA SL Content Language Specification, 2002.
- Foundation for Intelligent Physical Agents. FIPA Communicative Act Library Specification, 2002.
- Web Ontology Working Group. OWL Web Ontology Language Use Cases and Requirements, 2004.
- Giovanni Caire. JADE Introduction, AAMAS 2005, 2005.
28 Introduction to Corpus Resources, Annotation and Access - Sabine Schulte im Walde and Heike Zinsmeister
- Contents
- Basic definitions
- Corpora
- Annotation
- Tokenization and Morpho-Syntactic Annotation
29 Introduction to Corpus Resources, Annotation and Access
- Basic Definitions
- Linguistics: characterization and explanation of linguistic observations
- Corpus: any collection of more than one text
- Annotation: the practice of adding interpretative, linguistic information to an electronic corpus of spoken and/or written language
30 Corpora
- Corpora give only a partial description of a language
- They are incomplete
- (e.g. the Brown corpus does not include vocabulary related to the WWW and e-mail)
- They are biased
- They include ungrammatical sentences
- (e.g. typos, copy-and-paste errors, conversion errors)
- We have to sample a corpus according to design criteria such that it is balanced and representative for a specific purpose
31 Annotation
- Levels
- POS tags
- Lemmata
- Senses
- Semantic roles
- Named Entities
- Topic
- Coreference
- Principles
- The raw corpus should be recoverable
- Annotation should be extricable from the corpus
- Easy access to documentation
- Annotation scheme
- How, where, by whom the annotation was applied
32 Tokenization and Morpho-Syntactic Annotation
- Tokenization divides the raw input character sequence of a text into sentences, and the sentences into tokens
- Problems:
- Language-dependent task
- Language dependent task
- Sentence boundaries
- Numbers
- Abbreviations
- Capitalization
- Hyphenation
- Multiword expressions
- Clitics
- So we need to apply disambiguation methods (see the sketch below)
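A naive tokenization sketch in Python (regex-based; the abbreviation handling is deliberately simplistic and illustrates why disambiguation methods are needed):

    import re

    # Naive sentence splitter: breaks on ., ! or ? followed by whitespace.
    # It wrongly splits after abbreviations such as "Dr." -- exactly the kind
    # of ambiguity real tokenizers must resolve.
    def split_sentences(text):
        return re.split(r'(?<=[.!?])\s+', text.strip())

    # Naive word tokenizer: keeps decimal numbers and hyphenated words together,
    # separates other punctuation from words (note what happens to the clitic in "Wasn't").
    def tokenize(sentence):
        return re.findall(r"\d+(?:\.\d+)?|\w+(?:-\w+)*|[^\w\s]", sentence)

    text = "Dr. Smith paid 3.50 euros. Wasn't that cheap?"
    for s in split_sentences(text):
        print(tokenize(s))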
33 Tokenization and Morpho-Syntactic Annotation
- Part-of-speech tagging (POS tagging): the task of labeling each word in a sequence of words with its appropriate part of speech (a small tagging example follows below)
- Performs a limited syntactic disambiguation
- Context helps to disambiguate tags
- Tagset: a set of part-of-speech tags
- Classical 8 classes: noun, verb, article, participle, pronoun, preposition, adverb, conjunction
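A short tagging illustration, assuming the NLTK library and its pretrained English tagger data are installed (the output uses the Penn Treebank tagset rather than the classical 8 classes):

    import nltk

    # Requires the 'punkt' and 'averaged_perceptron_tagger' data packages,
    # e.g. installed via nltk.download(...).
    sentence = "The old man the boats."
    tokens = nltk.word_tokenize(sentence)
    print(nltk.pos_tag(tokens))
    # "man" is a verb in this garden-path sentence; many taggers mis-tag it,
    # which shows both the need for and the limits of contextual disambiguation.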
34 Tokenization and Morpho-Syntactic Annotation
- Morphology is concerned with the inner structure of words and the formation of words from smaller units.
- Root: the base morpheme of a word
- Stemming: a process that strips off affixes and leaves the stem.
- Lemmatization: a process that gives the lemma of a word; it includes disambiguation at the level of lexemes, depending on the part of speech. (A stemming vs. lemmatization sketch follows below.)
- Coreference is the reference in one expression to the same referent as in another expression
- Anaphora is coreference of an expression with its antecedent
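A brief sketch of the difference between stemming and lemmatization, assuming NLTK and its WordNet data are available:

    from nltk.stem import PorterStemmer, WordNetLemmatizer

    # The lemmatizer requires the 'wordnet' data package.
    stemmer = PorterStemmer()
    lemmatizer = WordNetLemmatizer()

    for word in ["studies", "meeting", "better"]:
        print(word,
              "| stem:", stemmer.stem(word),
              "| lemma as verb:", lemmatizer.lemmatize(word, pos="v"),
              "| lemma as noun:", lemmatizer.lemmatize(word, pos="n"))
    # The lemma depends on the part of speech (e.g. "meeting" as a form of the
    # verb "meet" vs. the noun "meeting"), which is why lemmatization involves
    # disambiguation at the level of lexemes.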
35 References
- Tony McEnery (2003). Corpus Linguistics. In The Oxford Handbook of Computational Linguistics, pp. 448-463. Oxford University Press.
- Tony McEnery and Andrew Wilson (2001). Corpus Linguistics. 2nd edition. Edinburgh University Press, chapter 1.
- Sue Atkins, Jeremy Clear and Nicholas Ostler (1992). Corpus Design Criteria. In Literary and Linguistic Computing, 7(1):1-16.
- Nancy Ide (2004). Preparation and Analysis of Linguistic Corpora. In Schreibman, S., Siemens, R., Unsworth, J., eds., A Companion to Digital Humanities. Blackwell.
- Geoffrey Leech (1997). Introducing Corpus Annotation. In Richard Garside, Geoffrey Leech and Tony McEnery, eds., Corpus Annotation. Longman, pp. 1-18.
- Geoffrey Leech (2005). Adding Linguistic Annotation. In Developing Linguistic Corpora: A Guide to Good Practice, ed. M. Wynne. Oxford: Oxbow Books, pp. 17-29. Available online from http://ahds.ac.uk/linguistic-corpora/
- Gregory Grefenstette and Pasi Tapanainen (1994). What is a word, what is a sentence? Problems of tokenization. In Proceedings of the 3rd Conference on Computational Lexicography and Text Research.
36 References
- Andrei Mikheev (2003). Text segmentation. In Ruslan Mitkov, editor, The Oxford Handbook of Computational Linguistics, pp. 376-394. Oxford University Press.
- Helmut Schmid (2007?). Tokenizing. In Anke Lüdeling and Merja Kytö, editors, Corpus Linguistics: An International Handbook. Mouton de Gruyter, Berlin.
- Christopher D. Manning and Hinrich Schütze (1999). Foundations of Statistical Natural Language Processing, chapter 10. MIT Press.
- Atro Voutilainen (2003). Part-of-speech tagging. In Ruslan Mitkov, editor, The Oxford Handbook of Computational Linguistics, pp. 219-232. Oxford University Press.
- John Carroll, Guido Minnen, and Ted Briscoe (1999). Corpus annotation for parser evaluation. In Proceedings of LINC. Bergen.
- Ruslan Mitkov, Richard Evans, Constantin Orasan, Catalina Barbu, Lisa Jones, and Violeta Sotirova (2000). Coreference and anaphora: developing annotating tools, annotated resources and annotation strategies. In Proceedings of the Discourse, Anaphora and Reference Resolution Conference, pp. 49-58.
- Eva Hajičová, Jarmila Panevová, and Petr Sgall (2000). Coreference in annotating a large corpus. In Proceedings of the 2nd International Conference on Language Resources and Evaluation, pp. 497-500.
37 Approximate Reasoning for the Semantic Web - Frank van Harmelen, Pascal Hitzler and Holger Wache
- Contents
- Semantic Web: the Vision
- Ontologies
- XML
- W3C Stack
- Beyond RDF: OWL
- Why Approximate Reasoning
- Reduction of use-cases to reasoning methods
38 Semantic Web: the Vision
- Semantic Web: a Web of Data
- Set of open, stable W3C standards
- Intelligent things we can't do today:
- Search engines: concepts, not keywords
- Personalization
- Web Services: need semantic characterizations to find them and to combine them
- Requirement: Machine-Accessible Meaning
39 Ontologies
- Ontologies ARE shared models of the world constructed to facilitate communication
- Ontologies ARE NOT definitive descriptions of what exists in the world (this is philosophy)
- What's inside an ontology?
- Classes
- Instances
- Values
- Inheritance
- Restrictions
- Relations
- Properties
- We need a machine representation
40 XML
- What was XML again?

    <country name="Greece">
      <capital name="Athens">
        <areacode>210</areacode>
      </capital>
    </country>

- Why not use XML?
- No agreement on
- Structure
- Is country an object? A class? An attribute? A relation?
- What does nesting mean?
- Vocabulary
- Is country the same as nation?
41 W3C Stack
- XML
- Surface syntax, no semantics
- XML Schema
- Describes structure of XML documents
- RDF
- Data model for relations between things
- RDF Schema
- RDF Vocabulary Definition Language
- OWL
- A more expressive Vocabulary Definition Language
42 Beyond RDF: OWL
- OWL extends RDF Schema to a full-fledged ontology representation language:
- Domain / range
- Cardinality
- Quantifiers
- Enumeration
- Equality
- Boolean algebra (union, complement)
- OWL is essentially the Description Logic SHOIN(D) with an RDF/XML syntax.
- 3 flavors: OWL Lite, OWL DL, OWL Full
43 Why Approximate Reasoning
- Current inference is exact: yes or no
- This was OK, because until now ontologies were clean:
- Hand-crafted, well designed, carefully populated, well maintained
- BUT ontologies will be sloppy
- Made by machines
- (e.g. "almost subClassOf")
- Mapping ontologies is almost always messy
- (e.g. "almost equal")
44 Reduction of Use-Cases to Reasoning Methods
- Realization (member of)
- Subsumption (subclass relation)
- Mapping (similar to)
- Retrieval (has member)
- Classification (locate in hierarchy)
- GOAL
- Find approximation methods for these reasoning methods
- Many reasoning methods can be reduced to satisfiability (e.g. C ⊑ D holds iff C ⊓ ¬D is unsatisfiable)
- GOAL: find approximation methods for satisfiability
45 References
- [Cadoli and Schaerf, 1995] Marco Cadoli and Marco Schaerf. Approximate inference in default reasoning and circumscription. Fundamenta Informaticae, 23:123-143, 1995.
- [Cadoli et al., 1994] Marco Cadoli, Francesco M. Donini, and Marco Schaerf. Is intractability of non-monotonic reasoning a real drawback? In National Conference on Artificial Intelligence, pages 946-951, 1994.
- [Dalal, 1996a] M. Dalal. Semantics of an anytime family of reasoners. In W. Wahlster, editor, Proceedings of ECAI-96, pages 360-364, Budapest, Hungary, August 1996. John Wiley & Sons Ltd.
- [Motik, 2006] B. Motik. Reasoning in Description Logics using Resolution and Deductive Databases. PhD thesis, Universität Karlsruhe, 2006.
- [Schaerf and Cadoli, 1995] Marco Schaerf and Marco Cadoli. Tractable reasoning via approximation. Artificial Intelligence, 74:249-310, 1995.
- [Zilberstein, 1993] S. Zilberstein. Operational rationality through compilation of anytime algorithms. PhD thesis, Computer Science Division, University of California at Berkeley, 1993.
- [Zilberstein, 1996] S. Zilberstein. Using anytime algorithms in intelligent systems. Artificial Intelligence Magazine, Fall:73-83, 1996.
46 Word Sense Disambiguation - Rada Mihalcea
- Outline
- Some Definitions
- Basic Approaches: Intro
- Basic Approaches: In More Detail
- Some Examples
47 Word Sense Disambiguation
- Word Sense Disambiguation is the problem of selecting a sense for a word from a set of predefined possibilities (a sense inventory).
- The sense inventory usually comes from a dictionary.
- Word Sense Discrimination is the problem of dividing the usages of a word into different meanings, without regard to existing predefined possibilities.
48 Word Sense Disambiguation
- Knowledge-Based Disambiguation
- - Machine Readable Dictionaries (e.g. WordNet)
- - Raw Corpora (not manually annotated)
- Supervised Disambiguation
- - Manually Annotated Corpora
- - Input of the learning system is
- 1. a training set of the feature-encoded inputs
- 2. their appropriate sense label
- Unsupervised Disambiguation
- - Unlabelled corpora
- - Input of the learning system is
- 1. a training set of feature-encoded inputs
- 2. NOT their appropriate sense label
49 Word Sense Disambiguation
- Knowledge-Based Disambiguation
- Examples
- - Algorithms based on machine-readable dictionaries (e.g. the Lesk algorithm)
- - Semantic similarity metrics
- - rely on semantic networks, like ontologies
- e.g. Sim(a,b) = -log(Path(a,b) / (2D))
- - may utilize an information content metric
- e.g. Sim(a,b) = IC(LCS(a,b)), where IC(a) = -log(P(a))
- - Heuristic-based methods
- e.g. identify the most often used meaning and use it by default.
50 Word Sense Disambiguation
- Knowledge-Based Disambiguation
- Examples
- Disambiguate "plant" in "plant with flower" (a toy sketch follows below)
- 1. plant, works, industrial plant
- 2. plant, flora, plant life
- Sim(plant#1, flower) = 1.0
- Sim(plant#2, flower) = 1.5, so sense 2 wins
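A toy knowledge-based sketch in Python in the spirit of the Lesk algorithm: the two senses come from a tiny invented sense inventory, and the sense whose gloss shares the most words with the context wins. A real system would use WordNet glosses or the similarity metrics above.

    # Hypothetical mini sense inventory for "plant" (glosses invented for illustration)
    senses = {
        "plant#1 (works, industrial plant)": "buildings for carrying on industrial labor",
        "plant#2 (flora, plant life)": "a living organism such as a tree flower or herb",
    }

    def lesk_style(context, senses):
        """Pick the sense whose gloss shares the most words with the context."""
        context_words = set(context.lower().split())
        overlap = lambda gloss: len(context_words & set(gloss.lower().split()))
        return max(senses, key=lambda s: overlap(senses[s]))

    print(lesk_style("plant with flower", senses))   # -> plant#2 (flora, plant life)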
51 Word Sense Disambiguation
- Supervised Disambiguation
- - Class of methods that induce a classifier from manually sense-tagged text using machine learning techniques (SVM, Naïve Bayes, Neural Networks, ...)
- - Resources
- 1. Sense-tagged text
- 2. Dictionary (source of the sense inventory)
- 3. Syntactic analysis (POS tagger, chunker)
- Example: features of training instances for the target word "bank", with senses bank/SHORE and bank/FINANCE (a small classifier sketch follows below)
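A minimal supervised sketch, assuming scikit-learn is installed: bag-of-words context features feed a Naïve Bayes classifier for the two hypothetical senses of "bank" (the training sentences are invented for illustration).

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import make_pipeline

    # Tiny invented training set: context sentences and their sense labels
    contexts = [
        "he sat on the bank of the river watching the water",
        "the muddy bank of the stream was slippery",
        "she opened an account at the bank downtown",
        "the bank approved the loan and the interest rate",
    ]
    labels = ["bank/SHORE", "bank/SHORE", "bank/FINANCE", "bank/FINANCE"]

    # Bag-of-words features + Naive Bayes classifier
    clf = make_pipeline(CountVectorizer(), MultinomialNB())
    clf.fit(contexts, labels)

    print(clf.predict(["the fisherman walked along the river bank"]))
    print(clf.predict(["the bank charged a fee on my account"]))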
52 Word Sense Disambiguation
- Unsupervised Disambiguation
- - Identifies patterns and divides the data into clusters, where each member of a cluster has more in common with the members of its own cluster than with those of any other
- - Words with similar meanings tend to occur in similar contexts, so clustering is based on the context
- - Only raw text is available; no external resources or annotations
- - Usual approaches: agglomerative algorithms, LSA
53 Word Sense Disambiguation
- Unsupervised Disambiguation
- Examples
- - Agglomerative Clustering (McQuitty's Similarity Analysis)
- [Slide figures: first-order representation of the target word "bank" in four sentences; similarity matrix and resulting clustering. A small clustering sketch follows below.]
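A minimal unsupervised sketch, assuming scikit-learn is installed: invented contexts of "bank" are turned into bag-of-words vectors and grouped by agglomerative clustering (plain average linkage rather than McQuitty's variant).

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.cluster import AgglomerativeClustering

    # Four invented contexts of the ambiguous word "bank"
    contexts = [
        "fished along the bank of the river",
        "the grassy bank by the stream",
        "deposited money at the bank branch",
        "the bank raised its interest rates",
    ]

    # First-order representation: each context becomes a word-count vector
    X = CountVectorizer().fit_transform(contexts).toarray()

    # Group the contexts into two clusters (ideally SHORE vs. FINANCE usages)
    clustering = AgglomerativeClustering(n_clusters=2, linkage="average")
    print(clustering.fit_predict(X))   # cluster id per context, e.g. [0 0 1 1]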
54 An Empirical View on Semantic Roles Within and Across Languages - Katrin Erk and Sebastian Pado
- Outline
- - The problem
- - Predicate-argument structure
- - A solution
- Proposition Bank (PropBank)
- (http://www.cs.rochester.edu/gildea/PropBank/Sort)
55 An Empirical View on Semantic Roles Within and Across Languages
- The problem
- - Despite the breakthroughs in NLP based on statistical methods and linguistic representations, accurate information extraction was out of reach
- - A critical element was missing: accurate predicate-argument structure
- - The most important factor for improved quality in language translation is accurate predicate-argument structure
- - A complete grammatical parse and vocabulary coverage are not enough
- - Knowledge of the proper constituents of verb arguments is not enough; their proper position is very important
56 An Empirical View on Semantic Roles Within and Across Languages
- Predicate-argument structure
- - Example
- Sentence: John broke the window
- Associated predicate-argument structure: break(John, window)
- - The recognition of this structure is not a trivial problem
- - In natural language there are several lexical items referring to the same type of event, and several syntactic realizations of the same predicate-argument relations
- - Example
- A will meet/visit/consult/debate (with) B
- A and B met/visited/consulted/debated
- There was a meeting/visit/consultation/debate between A and B
- A had a meeting/visit/consultation/debate with B
57 An Empirical View on Semantic Roles Within and Across Languages
- A solution
- - Create a body of publicly available training data that explicitly annotates predicate-argument positions with labels
- - Highest priority was given to predicate-argument structure for verbs
- - The result was the Proposition Bank (PropBank)
58 An Empirical View on Semantic Roles Within and Across Languages
- Proposition Bank (PropBank)
- - 4000 predicates (verbs only)
- - Process
- 1. For a given predicate, a survey is made of its usages
- 2. The usages are divided into senses if they take a different number of arguments (on syntactic grounds, not semantic)
- 3. The expected arguments of each sense are numbered sequentially from Arg0 to Arg5
- - Example
- draw, sense "pull"
- "... the campaign is drawing fire from the anti-smoking advocates ..."
- Arg0: the campaign
- Rel: drawing
- Arg1: fire
- Arg2-from: anti-smoking advocates
59 An Empirical View on Semantic Roles Within and Across Languages
- Proposition Bank (PropBank)
- - Frame Files (developed by a linguist)
- 1. Contain the sense distinctions of predicates (previous slide)
- 2. Contain role sets. A role set of a verb lists the roles which seem to occur most frequently.
- - Example: the role set for the verb "buy" (a machine-readable sketch follows below)
- BUY
- Arg0: buyer
- Arg1: thing bought
- Arg2: seller, bought-from
- Arg3: price paid
- Arg4: benefactive, bought-for
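The same role set written as a small Python data structure, to show how a frame file entry and an annotated instance might be represented in code; the field names are illustrative, not PropBank's actual file format.

    # Illustrative in-memory representation of a PropBank-style role set for "buy"
    buy_roleset = {
        "lemma": "buy",
        "roles": {
            "Arg0": "buyer",
            "Arg1": "thing bought",
            "Arg2": "seller, bought-from",
            "Arg3": "price paid",
            "Arg4": "benefactive, bought-for",
        },
    }

    # An annotated instance in the style of the "draw" example on the previous slide
    annotation = {
        "predicate": "drawing",
        "arguments": {"Arg0": "the campaign",
                      "Arg1": "fire",
                      "Arg2-from": "anti-smoking advocates"},
    }

    for label, filler in annotation["arguments"].items():
        print(label + ": " + filler)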
60 ESSLLI 2006 Summer School
- 18th European Summer School in Logic, Language and Information
- Thanks!