Title: Temporal Information Extraction
1Temporal Information Extraction
- Inderjeet Mani
- imani_at_mitre.org
2Outline
- Introduction
- Linguistic Theories
- AI Theories
- Annotation Schemes
- Rule-based and machine-learning methods.
- Challenges
- Links
3Motivation Question-Answering
- When is Ramadan this year?
- What was the largest U.S. military operation since Vietnam?
- Tell me the best time of the year to go cherry-picking.
- How often do you feed a pet gerbil?
- Is Gates currently CEO of Microsoft?
- Did the Enron merger with Dynegy take place?
- How long did the hostage situation in Beirut last?
- What is the current unemployment rate?
- How many Iraqi civilian casualties were there in the first week of the U.S. invasion of Iraq?
- Who was Secretary of Defense during the Gulf War?
4Motivation Coherent and Faithful Summaries
..worked in recent summers..
..was the source of the virus last week..
where Morris was a computer science undergraduate until June..
..whose virus program three years ago disrupted
- Single-document sentence extraction summarizers are plagued by dangling references
- especially temporal ones
- Multi-document summarizers can be misled by the weakness of vocabulary overlap methods
- leads to inappropriate merging of distinct events
5An Example Story
- Feb. 18, 2004
- Yesterday Holly was running a marathon when she twisted her ankle. David had pushed her.
1. When did the running occur? Yesterday.
2. When did the twisting occur? Yesterday, during the running.
3. Did the pushing occur before the twisting? Yes.
4. Did Holly keep running after twisting her ankle? Probably not.
6Temporal Information Extraction Problem
- Feb. 18, 2004
- Yesterday Holly was running a marathon when she twisted her ankle. David had pushed her.
- Input: a natural language discourse
- Output: a representation of events and their temporal relations
7IE Methodology
[Diagram: IE methodology pipeline with components Raw Corpus, Initial Tagger, Annotation Editor, Annotation Guidelines, Annotated Corpus, Machine Learning Program, Learned Rules, and Rule Apply]
Idea for Temporal IE: Make progress by focusing on a particular top-down slice (i.e., time), using its rich structure
8Theories
AI logic
Formal Linguistics
9Linguistic Theories
- Events
- Event Structure (event subclasses and parts)
- Tense (indicates location of event in time, via verb inflections, modals, auxiliaries, etc.)
- Grammatical Aspect (indicates whether event is ongoing, finished, completed)
- Time Adverbials
- Relations between events and/or times
- temporal relations
- we will also need discourse relations
10Tense
- All languages that have tense (in the semantic sense of locating events in time) can express location in time
- Location can be expressed relative to a deictic center that is the current moment of speech, or speech time, or speech point
- e.g., tomorrow, yesterday, etc.
- Languages can also express temporal locations relative to a coordinate system
- a calendar, e.g., 1991 (A.D.),
- a cyclically occurring event, e.g., morning, spring,
- an arbitrary event, e.g., the day after he married her.
- A language may have tense in the above semantic sense, without expressing it using tense morphemes
- Instead, aspectual morphemes and/or modals and auxiliaries may be used.
11Mandarin Chinese
- Has semantic tense
- Lacks tense morphemes
- Instead, it uses aspect markers to indicate whether an event is ongoing (-zai, -le), completed (-wan), terminated (-le, -guo), or in a result state (-zhe)
- But aspect markers are often absent
我 看 电视 (wo kan dianshi): I watch / will watch / watched TV
Example from Congmin Min, MS Thesis, Georgetown, 2005.
12Burmese
- No semantic tense, but all languages that lack semantic tense have a realis/irrealis distinction.
- Events that are ongoing or that were observed in the past are expressed by sentence-final realis particles -te, -tha, -ta, and -hta.
- For unreal or hypothetical events (including future and present and hypothetical past events), the sentence-final irrealis particles -me, -ma, and -hma are used.
Comrie, B. Tense. Cambridge, 1985.
13Tense as Anaphor Reichenbach
- A formal method for representing tense, based on which one can locate events in time
- Tensed utterances introduce references to 3 time points
- Speech Time S
- Event Time E
- Reference Time R
- S: I had [mailed the letter]_E when John [came & told me the news]_R
- E < R < S
- Three temporal relations are defined on these time points
- at, before, after
- 13 different relations are possible
- N.B. the concept of time point is an abstraction
- it can map to an interval
14Reichenbachian Tense Analysis
- Tense is determined by relation between R and S
- R=S, R<S, R>S
- Aspect is determined by relation between E and R
- E=R, E<R, E>R
- Relation of E relative to S not crucial
- Represent R<S=E as E>R<S
- Only 7 out of 13 relations are realized in English
- 6 different forms, simple future being ambiguous
- Progressive no different from simple tenses
- But "I was eating a peach" does not entail "I ate a peach"
E>R<S
E<R>S
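The Reichenbachian analysis above can be sketched in a few lines of Python. This is only an illustration: the function names `order` and `reichenbach` are hypothetical, time points are modeled as plain numbers, and the aspect labels (anterior, simple, posterior) are assumed here as Reichenbach-style names for E<R, E=R, and E>R.

```python
def order(a, b):
    """Return '<', '=' or '>' for two numeric time points."""
    return "<" if a < b else ">" if a > b else "="


def reichenbach(e, r, s):
    """Classify a clause from its event (E), reference (R) and speech (S) times.

    Tense comes from R versus S, aspect from E versus R, as on the slide.
    Returns a (label, configuration) pair such as ('anterior past', 'E<R<S').
    """
    tense = {"<": "past", "=": "present", ">": "future"}[order(r, s)]
    aspect = {"<": "anterior", "=": "simple", ">": "posterior"}[order(e, r)]
    return f"{aspect} {tense}", f"E{order(e, r)}R{order(r, s)}S"


# "I had mailed the letter when John came & told me the news": E < R < S
print(reichenbach(1, 2, 3))
# A case where R < S = E, written E>R<S on the slide
print(reichenbach(3, 2, 3))
```

Note that the relation of E to S never enters the classification, which mirrors the point that it is not crucial.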
15Priorean Tense Logic
- Gφ: It is always going to be the case that φ.
- Hφ: It always has been the case that φ.
- Fφ: It will at some point in the future be the case that φ.
- Pφ: It was at some point in the past the case that φ.
- Fφ ≡ ¬G¬φ, Pφ ≡ ¬H¬φ
- System Kt
- (a) φ → HFφ: What is, has always been going to be
- (b) φ → GPφ: What is, will always have been
- (c) H(φ → ψ) → (Hφ → Hψ): Whatever always follows from what always has been, always has been
- (d) G(φ → ψ) → (Gφ → Gψ): Whatever always follows from what always will be, always will be.
16Tense as Operator Prior
- Free iteration captures many more tenses,
- I would have slept: PFPφ
- But also expresses many non-NL tenses
- PPPPφ: [It was the case]^4 John had slept
17Event Classes (Lexical Aspect)
- STATIVES: know, sit, be clever, be happy, killing, accident
- can refer to state itself (ingressive), e.g., John knows, or to entry into a state (inceptive), e.g., John realizes
- *John is knowing Bill, *Know the answer, *What John did was know the answer (* marks an unacceptable sentence)
- ACTIVITIES: walk, run, talk, march, paint
- if it occurs in period t, a part of it (the same activity) must occur for most sub-periods of t
- X is Ving entails that X has Ved
- John ran for an hour, *John ran in an hour
- ACCOMPLISHMENTS: build, cook, destroy
- culminate (telic)
- X is Ving does not entail that X has Ved.
- John booked a flight in an hour, John stopped building a house
- ACHIEVEMENTS: notice, win, blink, find, reach
- instantaneous accomplishments
- *John dies for an hour, *John wins for an hour, *John stopped reaching New York
18Aspectual Composition
- Expressions of one class can be transformed into one of another class by combining with another expression.
- e.g., an activity can be changed into an accomplishment by adding an adverbial phrase expressing temporal or spatial extent
- I walked (activity)
- I walked to the station / a mile / home (accomplishment)
- I built my house (accomplishment).
- I built my house for an hour (activity).
- Moens & Steedman (1988) implement aspectual composition in a transition network
19Example Classifying Question Verbs
- Androutsopoulos's (2002) NLITDB system allows users to pose temporal questions in English to an airport database that uses a temporal extension of SQL
- Verbs in single-clause questions with non-future meanings are treated as states
- Does any tank contain oil?
- Some verbs may be ambiguous between a (habitual) state and an accomplishment
- Which flight lands on runway 2?
- Does flight BA737 land on runway 2 this afternoon?
- Activities are distinguished using the imperfective paradox
- "Were any flights taxiing?" implies that they taxied
- "Were any flights taxiing to gate 2?" does not imply that they taxied.
- So, taxi will be given
- an activity verb sense, one that doesn't expect a destination argument, and
- an accomplishment verb sense, one that expects a destination argument.
20Grammatical Aspect
- Perfective: focus on situation as a whole
- John built a house
- Imperfective: focus on internal phases of situation
- John was building a house
Perfective marking
English: Verbal tense and aspect morphemes, e.g., for present and past perfect
French: Tense (passé composé)
Mandarin: morphemes le and guo
Imperfective marking
English: progressive verbal inflection -ing
French: Tense (imparfait)
Mandarin: progressive morpheme zai and resultative morpheme zhe.
21Inferring Temporal Relations
1. Yesterday Holly was running a marathon when she twisted her ankle. [FINISHES] David had pushed her. [BEFORE]
2. I had mailed the letter when John came & told me the news. [AFTER]
3. Simpson made the call at 3. Later, he was spotted driving towards Westwood. [AFTER]
4. Max entered the room. Mary stood up/was seated on the desk. [AFTER/OVERLAP]
5. Max stood up. John greeted him. [AFTER]
6. Max fell. John pushed him. [BEFORE]
7. Boutros-Ghali Sunday opened a meeting in Nairobi of .... He arrived in Nairobi from South Africa. [BEFORE]
8. John bought Mary some flowers. He picked out three red roses. [DURING]
22Linguistic Information Needed for Temporal IE
- Events
- Tense
- Aspect
- Time adverbials
- Explicit temporal signals (before, since, at, etc.)
- Discourse Modeling
- For disambiguation of time expressions based on context
- For tracking sequences of events (tense/aspect shifts)
- For computing Discourse Relations
- Commonsense Knowledge
- For inferring Discourse Relations
- For inferring event durations
23Narrative Ordering
- Temporal Discourse Interpretation Principle (Dowty 1979)
- Reference time for the current sentence is a time consistent with its time adverbials if any, or else it immediately follows reference time of the previous sentence.
- The overlap of statives is a pragmatic inference (hinting at a theory of defaults)
- A man entered the White Hart. He was wearing a black jacket. Bill served him a beer.
- Discourse Representation Theory (Kamp and Reyle 1993)
- In successive past tense sentences which lack temporal adverbials, events advance the narrative forward, while states do not.
- Overlapping statives come out of semantic inference rules
- Neither theory explicitly represents discourse relations, though they are needed (e.g., examples 6-8 under Inferring Temporal Relations)
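The default narrative-ordering behavior described above (events advance the narrative, states overlap the most recent event) can be sketched as follows. This is a simplified illustration, not an implementation of either theory: `narrative_order` and the event/state labels are hypothetical, time adverbials are ignored, and no discourse relations are modeled.

```python
def narrative_order(clauses):
    """Apply the default ordering to a list of (label, kind) clauses.

    kind is "event" or "state". Each event is placed after the previous
    event; each state is taken to overlap the most recent event.
    Returns a list of (earlier_label, relation, later_label) triples.
    """
    relations = []
    last_event = None
    for label, kind in clauses:
        if last_event is not None:
            relation = "BEFORE" if kind == "event" else "OVERLAP"
            relations.append((last_event, relation, label))
        if kind == "event":
            last_event = label  # only events move the reference point
    return relations


# A man entered the White Hart. He was wearing a black jacket.
# Bill served him a beer.
story = [("enter", "event"), ("wear", "state"), ("serve", "event")]
print(narrative_order(story))
```

Cases like "Max fell. John pushed him." come out wrong under this default, which is why discourse relations are needed.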
24Discourse Representation Theory (example)
- A man entered the White Hart. He was wearing a
black jacket. Bill served him a beer.
Rpt := {}
e1, t1, x, y
enter(e1, x, y), man(x), y = theWhiteHart
t1 < n, e1 ⊆ t1
Rpt := e1
----------------------------------------------------------
e2, t2, x1, y1
PROG(wear(e2, x1, y1)), black-jacket(y1), x1 = x
t2 < n, e2 ⊆ t2, e1 ⊆ e2
----------------------------------------------------------
e3, t3, x2, y2, z
serve(e3, x2, y2, z), beer(z), x2 = Bill, y2 = x
t3 < n, e3 ⊆ t3
Rpt := e3
e1 < e3
25Overriding Defaults
- Lascarides and Asher (1993): temporal ordering is derived entirely from discourse relations (that link together DRSs, based on SDRT formalism).
- Example
- Max switched off the light. The room was pitch dark.
- Default inference: OVERLAP
- Use an inference rule that if the room is dark and the light was just switched off, the switching off caused the room to become dark.
- Inference: AFTER
- Problem: requires large doses of world knowledge
L&P 1993
27Time and Events in Logic
[Diagram relating Events, Time, Instants, and Intervals]
28Instant Ontology
- Consider the event of John's reading the book
- Decompose into an infinite set of infinitesimal instants
- Let T be a set of temporal instants.
- Let < (BEFORE) be a temporal ordering relation between instants
- Properties: irreflexive, antisymmetric, transitive, and complete
- Antisymmetric => time has only one direction of movement
- Irreflexive and Transitive => time is non-cyclical
- Complete => < is a total ordering
29Instants -- Problem Where Truth Values Change
- P: The race is on
- T-R: the time of running the race
- T-AR: the time after running the race
- T-R and T-AR have to meet somewhere
- If we choose instants, there is some instant x where T-R and T-AR meet
- Either we have P and not P both true at x, or there is a truth value gap at x
- This is called the Divided Instant Problem (D.I.P.)
30Ordering Relations on Intervals
- Unlike instants, where we have only <, we can have at least 3 ordering relations on intervals
- Precedence (<): I1 < I2 iff ∀t1 ∈ I1, ∀t2 ∈ I2, t1 < t2 (where < is defined over instants)
- Temporal Overlap (O): I1 O I2 iff I1 ∩ I2 ≠ ∅
- Temporal Inclusion (⊑): I1 ⊑ I2 iff I1 ⊆ I2
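The three interval relations can be tried out directly if intervals are modeled as finite sets of integer instants. That discretization is an assumption made purely for illustration (the slide's instants need not be discrete), and the function names are hypothetical.

```python
def precedes(i1, i2):
    """I1 < I2: every instant of I1 is before every instant of I2."""
    return all(t1 < t2 for t1 in i1 for t2 in i2)


def overlaps(i1, i2):
    """I1 O I2: the two intervals share at least one instant."""
    return bool(set(i1) & set(i2))


def included(i1, i2):
    """I1 is temporally included in I2: I1 is a subset of I2."""
    return set(i1) <= set(i2)


morning = set(range(6, 12))    # instants 6..11
meeting = set(range(9, 11))    # instants 9..10
evening = set(range(18, 22))   # instants 18..21

print(precedes(morning, evening))  # morning wholly before evening
print(overlaps(morning, meeting))  # they share instants 9 and 10
print(included(meeting, morning))  # meeting lies inside morning
```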
31Instants versus Intervals
- Instants
- We understand the idea of truth at an instant
- In cases of continuous change, e.g., a tossed ball, we need a notion of a durationless event in order to explain the trajectory of the ball just before it falls
- Intervals
- We often conceive of time as broken up in terms of events which have a certain duration, rather than as an (infinite) sequence of durationless instants.
- Many verbs do not describe instantaneous events, e.g., has read, ripened
- Duration expressions like yesterday afternoon aren't construed as instants
32Allen's Interval-Based Ontology
- Instants are banished
- So, avoids the divided instant problem
- Short duration intervals will be instant-like
- Uses 13 relations
- Relations are mutually exclusive
- All 13 relations can be expressed using meet
- ∀X ∀Y Before(X, Y) ≡ ∃Z (meet(X, Z) ∧ meet(Z, Y))
James F. Allen, Towards a General Theory of Action and Time, Artificial Intelligence 23 (1984) 123-154.
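33Allen's 13 Temporal Relations
<, > (before, after)
m, mi (meets, met by)
o, oi (overlaps, overlapped by)
s, si (starts, started by)
f, fi (finishes, finished by)
d, di (during, contains)
= (equal)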
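34Composition of Allen's relations (row relation followed by column relation; ? = no information; rows di through fi are blank in the source)
   | < | > | d | di | o | oi | m | mi | s | si | f | fi
<  | < | ? | < o m d s | < | < | < o m d s | < | < o m d s | < | < | < o m d s | <
>  | ? | > | > oi mi d f | > | > oi mi d f | > | > oi mi d f | > | > oi mi d f | > | > | >
d  | < | > | d | ? | < o m d s | > oi mi d f | < | > | d | > oi mi d f | d | < o m d s
di |
o  |
oi |
m  |
mi |
s  |
si |
f  |
fi |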
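The entries of a composition table like this one can be checked, and the blank rows filled in, by brute-force enumeration. The sketch below is an illustration under stated assumptions: intervals are pairs of integers with start < end, `allen` and `compose` are hypothetical helper names, and six distinct endpoints are used because three intervals have at most six distinct endpoints. It is not how any particular closure tool is implemented.

```python
from itertools import combinations


def allen(a, b):
    """Return the Allen relation holding between proper intervals a and b."""
    (a1, a2), (b1, b2) = a, b
    if a2 < b1:
        return "<"
    if b2 < a1:
        return ">"
    if a2 == b1:
        return "m"
    if b2 == a1:
        return "mi"
    if a1 == b1 and a2 == b2:
        return "="
    if a1 == b1:
        return "s" if a2 < b2 else "si"
    if a2 == b2:
        return "f" if a1 > b1 else "fi"
    if b1 < a1 and a2 < b2:
        return "d"
    if a1 < b1 and b2 < a2:
        return "di"
    return "o" if a1 < b1 else "oi"


# All intervals over six integer points: enough to realize any configuration
# of three intervals.
INTERVALS = list(combinations(range(6), 2))


def compose(r1, r2):
    """Relations possible between X and Z when X r1 Y and Y r2 Z."""
    return {
        allen(x, z)
        for x in INTERVALS
        for y in INTERVALS
        if allen(x, y) == r1
        for z in INTERVALS
        if allen(y, z) == r2
    }


print(sorted(compose("<", "d")))   # the "< o m d s" cell of the first row
print(len(compose("d", "di")))     # a "?" cell: all 13 relations are possible
```

Temporal closure then amounts to repeatedly intersecting the known relation set between two intervals with compositions along connecting paths until nothing changes.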
35Temporal Closure Sputlink in TANGO
Verhagen (2005)
36AI Reasoning about Events
- Situation Calculus
- Holds(Have(John, book), t1)
- Holds(Have(Mary, book), t2)
- Holds(Have(Z, Y), Result(give(X, Y, Z), t))
- t-i are states
- Concurrent actions cannot be represented
- No duration of actions or delayed effects
- Event Calculus
- HoldsAt(Have(J, B), t1)
- HoldsAt(Have(M, B), t2)
- Terminates(e1, Have(J, B))
- Initiates(e1, Have(M, B))
- Happens(e, t)
- t is a time point
- Involves non-monotonic reasoning
- Handles frame problem using circumscription
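The Event Calculus predicates above can be illustrated with a very small simulation. This is a sketch only: `holds_at` is a hypothetical helper, time points are integers, the give event is placed at an assumed time 5, and persistence of fluents is hard-coded in the loop rather than derived by circumscription.

```python
def holds_at(fluent, t, happens, initiates, terminates, initially=()):
    """Return True if the fluent holds at time t.

    happens:    list of (time, event) pairs, i.e. Happens(e, t)
    initiates:  dict mapping an event to the fluents it starts
    terminates: dict mapping an event to the fluents it ends
    A fluent keeps its value until an event strictly before t changes it.
    """
    holds = fluent in initially
    for time, event in sorted(happens):
        if time >= t:
            break
        if fluent in terminates.get(event, ()):
            holds = False
        if fluent in initiates.get(event, ()):
            holds = True
    return holds


HAPPENS = [(5, "e1")]                    # e1: John gives the book to Mary
INITIATES = {"e1": {"Have(M, B)"}}       # Initiates(e1, Have(M, B))
TERMINATES = {"e1": {"Have(J, B)"}}      # Terminates(e1, Have(J, B))
INITIALLY = {"Have(J, B)"}

print(holds_at("Have(J, B)", 3, HAPPENS, INITIATES, TERMINATES, INITIALLY))
print(holds_at("Have(M, B)", 6, HAPPENS, INITIATES, TERMINATES, INITIALLY))
```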
37Temporal Question-Answering using IE + Event Calculus
- Mueller (2004): Takes instantiated MUC terrorist event templates and represents information in EC
- Adds commonsense knowledge about terrorist domain
- e.g., if a bomb explodes, it's no longer activated
- Commonsense knowledge includes frame axioms
- e.g., if an object starts falling, then its height will be released from the commonsense law of inertia
- Example temporal questions
- Was the car dealership damaged before the high-power bombs exploded? Ans: No.
- Requires reasoning that the damage did not occur at all times t prior to the explosion
- Problem: requires large doses of world knowledge
Mueller, Erik T. (2004). Understanding script-based stories using commonsense reasoning. Cognitive Systems Research, 5(4), 307-340.
38Temporal Question Answering using IE + Temporal Databases
- In NLITDB, semantic relation between a question event and the adverbial it combines with is inferred by a variety of inference rules.
- State + point adverbial
- Which flight was queueing for runway 2 at 5:00 pm?
- state coerced to an achievement, viewed as holding at the time specified by the adverbial.
- Activity + point adverbial
- can mean that the activity holds at that time, or that the activity starts at that time, e.g., Which flight queued for runway 2 at 5:00 pm?
- An accomplishment may indicate inception or termination
- "Which flight taxied to gate 4 at 5:00 pm?" can mean the taxiing starts or ends at 5 pm.
41Events in NLP
- Topic: well-defined subject for searching
- document- or collection-level
- Template: structure with slots for participant named entities
- document-level
- Mention: linguistic expression that expresses an underlying event
- phrase-level (verb/noun)
42Event Characteristics
- Can have temporal and/or spatial locations
- Can have types
- assassinations, bombings, joint ventures, etc.
- Can have members
- Can have parts
- Can have people and/or other objects as participants
- Can be hypothetical
- Can have not happened
43MUC Event Templates
Wall Street Journal, 06/15/88 MAXICARE HEALTH
PLANS INC and UNIVERSAL HEALTH SERVICES INC have
dissolved a joint venture which provided health
services.
44ACE Event Templates
Type: Subtypes
Life: Be-Born, Marry, Divorce, Injure, Die
Movement: Transport
Transaction: Transfer-Ownership, Transfer-Money
Business: Start-Org, Merge-Org, Declare-Bankruptcy, End-Org
Conflict: Attack, Demonstrate
Contact: Meet, Phone-Write
Personnel: Start-Position, End-Position, Nominate, Elect
Justice: Arrest-Jail, Release-Parole, Trial-Hearing, Charge-Indict, Sue, Convict, Sentence, Fine, Execute, Extradite, Acquit, Appeal, Pardon
- Four additional attributes for each event mention
- Polarity (it did or did not occur)
- Tense (past, present, future)
- Modality (real vs. hypothetical)
- Genericity (specific vs. generic)
- Argument slots (4-7) specific to each event
- E.g., Trial-Hearing event has slots for the Defendant, Prosecutor, Adjudicator, Crime, Time, and Place.
From Lisa Ferro _at_MITRE
45Mention-Level Events
- Event expressions
- tensed verbs: has left, was captured, will resign
- stative adjectives: sunken, stalled, on board
- event nominals: merger, Military Operation, war
- Dependencies between events and times
- Anchoring: John left on Monday.
- Orderings: The party happened after midnight.
- Embedding: John said Mary left.
46TIMEX2 (TIDES/ACE) Annotation Scheme
- Time Points: <TIMEX2 VAL="2000-W42">the third week of October</TIMEX2>
- Durations: <TIMEX2 VAL="PT30M">half an hour long</TIMEX2>
- Indexicality: <TIMEX2 VAL="2000-10-04">tomorrow</TIMEX2>
- He wrapped up a <TIMEX2 VAL="PT3H" ANCHOR_DIR="WITHIN" ANCHOR_VAL="1999-07-15">three-hour</TIMEX2> meeting with the Iraqi president in Baghdad <TIMEX2 VAL="1999-07-15">today</TIMEX2>.
- Sets: <TIMEX2 VAL="XXXX-WXX-2" SET="YES" PERIODICITY="F1W" GRANULARITY="G1D">every Tuesday</TIMEX2>
- Fuzziness: <TIMEX2 VAL="1990-SU">Summer of 1990</TIMEX2>
- <TIMEX2 VAL="1999-07-15TMO">This morning</TIMEX2>
- <TIMEX2 VAL="2000-10-31TNI" MOD="START">early last night</TIMEX2>
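Normalizing indexical expressions to VAL strings requires a reference date, typically the document date. The following is a minimal sketch of that step using only the Python standard library; `resolve_indexical`, `week_value`, and the three-word lexicon are hypothetical and cover only the simplest cases, and the week number follows ISO 8601 as computed by `date.isocalendar()`.

```python
from datetime import date, timedelta

# Day offsets for a tiny, illustrative lexicon of indexical expressions.
OFFSETS = {"yesterday": -1, "today": 0, "tomorrow": 1}


def resolve_indexical(expression, reference):
    """Map an indexical expression to a YYYY-MM-DD value, given a reference date."""
    shifted = reference + timedelta(days=OFFSETS[expression.lower()])
    return shifted.isoformat()


def week_value(day):
    """Return the week-granularity value (YYYY-Wnn) containing the given day."""
    year, week, _ = day.isocalendar()
    return f"{year}-W{week:02d}"


# "tomorrow" in a document dated 2000-10-03
print(resolve_indexical("tomorrow", date(2000, 10, 3)))
# the week containing 16 October 2000
print(week_value(date(2000, 10, 16)))
```

Fuzzy values such as 1990-SU or TNI need a lookup table of tokens rather than date arithmetic, which is one reason value normalization is harder than extent tagging.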
47TIMEX2 Inter-annotator Agreement
- Georgetown/MITRE (2001)
- 193 English docs, .79 F Extent, .86 F VAL
- 5 annotators
- Annotators deviate from guidelines, and produce systematic errors (fatigue?)
- several years ago: PXY instead of PAST_REF
- all day: P1D instead of YYYY-MM-DD
- LDC (2004)
- 49 English docs, .85 F Extent, .80 F VAL
- 19 Chinese docs, .83 Extent
- 2 annotators
48Example of Annotator Difficulties (TERN 2004)
Time Expression Recognition and Normalization
Competition (timex2.mitre.org)
49TIMEX2 A Mature Standard
- Extensively debugged
- Detailed guidelines for English and Chinese
- Evaluated for English, Arabic, Chinese, Korean, Spanish, French, Swedish, and Hindi
- Applied to news, scheduling dialogues, other types of data
- Corpora available through ACE, MITRE
50Temporal Relations in ACE
- Restricted to verbal events (verbs of scheduling, occurrence, aspect etc.)
- The event and the timex must be in the same sentence
- Eight temporal relations
- Within
- The bombing occurred during the night.
- Holds
- They were meeting all night.
- Starting, Ending
- The talks ended (on) Monday.
- Before, After
- The initial briefs have to be filed by 4 p.m. Tuesday
- At-Beginning, At-End
- Sharon met with Bill at the start of the three-day conference
From Lisa Ferro _at_MITRE
52TimeML Annotation Scheme
- A Proposed Metadata Standard for Markup of events, their temporal anchoring, and how they are related to each other
- Marks up mention-level events, time expressions, and links between events (and events and times)
- Developer: James Pustejovsky (& co.)
53An Example Story
- Feb. 18, 2004
- Yesterday Holly was running a marathon when she twisted her ankle. David had pushed her.
1. When did the running occur? Yesterday.
2. When did the twisting occur? Yesterday, during the running.
3. Did the pushing occur before the twisting? Yes.
4. Did Holly keep running after twisting her ankle? Probably not.
54An Attested Story
- AP-NR-08-15-90 1337EDT
- Iraq's Saddam Hussein, facing U.S. and Arab troops at the Saudi border, today sought peace on another front by promising to withdraw from Iranian territory and release soldiers captured during the Iran-Iraq war. Also today, King Hussein of Jordan arrived in Washington seeking to mediate the Persian Gulf crisis. President Bush on Tuesday said the United States may extend its naval quarantine to Jordan's Red Sea port of Aqaba to shut off Iraq's last unhindered trade route.
Timeline: Past < Tuesday < Today < Indef Future
- Past: war, captured
- Tuesday: said
- Today: sought, arrived
- Indef Future: withdraw, release, extend, quarantine
55TimeML Events
- AP-NR-08-15-90 1337EDT
- Iraq's Saddam Hussein, facing U.S. and Arab troops at the Saudi border, today sought peace on another front by promising to withdraw from Iranian territory and release soldiers captured during the Iran-Iraq war. Also today, King Hussein of Jordan arrived in Washington seeking to mediate the Persian Gulf crisis. President Bush on Tuesday said the United States may extend its naval quarantine to Jordan's Red Sea port of Aqaba to shut off Iraq's last unhindered trade route.
- In another mediation effort, the Soviet Union said today it had sent an envoy to the Middle East on a series of stops to include Baghdad. Soviet officials also said Soviet women, children and invalids would be allowed to leave Iraq.
56TimeML Event Classes
- Occurrence
- die, crash, build, merge, sell, take advantage of, ..
- State
- Be on board, kidnapped, recovering, love, ..
- Reporting
- Say, report, announce, ..
- I-Action
- Attempt, try, promise, offer
- I-State
- Believe, intend, want, ..
- Aspectual
- begin, start, finish, stop, continue.
- Perception
- See, hear, watch, feel.
57Temporal Anchoring Links
58TLINK Types
- Simultaneous (happening at the same time)
- Identical (referring to the same event)
- John drove to Boston. During his drive he ate a donut.
- Before the other
- In six of the cases suspects have already been arrested.
- Immediately before the other
- All passengers died when the plane crashed into the mountain.
- Including the other
- John arrived in Boston last Thursday.
- Exhaustively during the duration of the other
- John taught for 20 minutes.
- Beginning of the other
- John was in the gym between 6:00 p.m. and 7:00 p.m.
- Ending of the other
- John was in the gym between 6:00 p.m. and 7:00 p.m.
59TLINK Example
John taught 20 minutes every Monday.
John
<EVENT eid="e1" class="OCCURRENCE">taught</EVENT>
<MAKEINSTANCE eiid="ei1" eventID="e1" pos="VERB" tense="PAST" aspect="NONE" polarity="POS"/>
<TIMEX3 tid="t1" type="DURATION" value="PT20M">20 minutes</TIMEX3>
<TIMEX3 tid="t2" type="SET" value="xxxx-wxx-1" quant="EVERY">every Monday</TIMEX3>
<TLINK timeID="t1" relatedToTime="t2" relType="IS_INCLUDED"/>
<TLINK eventInstanceID="ei1" relatedToTime="t1" relType="DURING"/>
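Once the markup is well-formed XML, the links can be read off with a standard parser. The sketch below uses Python's `xml.etree.ElementTree`; the wrapping `<TimeML>` root element and the helper name `tlinks` are assumptions added for illustration, and only the attributes that appear in the example above are handled.

```python
import xml.etree.ElementTree as ET

# The example annotation, wrapped in a root element so that it parses.
FRAGMENT = """<TimeML>John
<EVENT eid="e1" class="OCCURRENCE">taught</EVENT>
<MAKEINSTANCE eiid="ei1" eventID="e1" pos="VERB" tense="PAST" aspect="NONE" polarity="POS"/>
<TIMEX3 tid="t1" type="DURATION" value="PT20M">20 minutes</TIMEX3>
<TIMEX3 tid="t2" type="SET" value="xxxx-wxx-1" quant="EVERY">every Monday</TIMEX3>
<TLINK timeID="t1" relatedToTime="t2" relType="IS_INCLUDED"/>
<TLINK eventInstanceID="ei1" relatedToTime="t1" relType="DURING"/>
</TimeML>"""


def tlinks(xml_text):
    """Return (source, relType, target) triples for every TLINK, in document order."""
    root = ET.fromstring(xml_text)
    triples = []
    for link in root.iter("TLINK"):
        source = link.get("eventInstanceID") or link.get("timeID")
        target = link.get("relatedToTime") or link.get("relatedToEventInstance")
        triples.append((source, link.get("relType"), target))
    return triples


for triple in tlinks(FRAGMENT):
    print(triple)
```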
60Subordinated Links
61SLINK Types
SLINK or Subordination Link is used for contexts introducing relations between two events, or an event and a signal, of the following sort:
- Modal: Relation introduced mostly by modal verbs (should, could, would, etc.) and events that introduce a reference to a possible world, mainly I_STATEs
- John should have bought some wine.
- Mary wanted John to buy some wine.
- Factive: Certain verbs introduce an entailment (or presupposition) of the argument's veracity. They include forget in the tensed complement, regret, manage
- John forgot that he was in Boston last year.
- Mary regrets that she didn't marry John.
- Counterfactive: The event introduces a presupposition about the non-veracity of its argument: forget (to), unable to (in past tense), prevent, cancel, avoid, decline, etc.
- John forgot to buy some wine.
- John prevented the divorce.
- Evidential: Evidential relations are introduced by REPORTING or PERCEPTION
- John said he bought some wine.
- Mary saw John carrying only beer.
- Negative evidential: Introduced by REPORTING (and PERCEPTION?) events conveying negative polarity
- John denied he bought only beer.
- Negative: Introduced only by negative particles (not, nor, neither, etc.), which will be marked as SIGNALs, with respect to the events they are modifying
- John didn't forget to buy some wine.
- John did not want to marry Mary.
62Aspectual Links
- The U.S. military buildup in Saudi Arabia continued at fever pace, with Syrian troops now part of a multinational force camped out in the desert to guard the Saudi kingdom from any new threat by Iraq.
- In a letter to President Hashemi Rafsanjani of Iran, read by a broadcaster over Baghdad radio, Saddam said he will begin withdrawing troops from Iranian territory a week from tomorrow and release Iranian prisoners of war.
63Towards TIMEX3
- Decompose more
- Smaller tag extents compared to TIMEX2
- <TIMEX2 ID="t28" VAL="2000-10-02">just days after another court dismissed other corruption charges against his father</TIMEX2>.
- N.B. extent marking a source of inter-annotator disagreements in ACE TERN 2004 evaluation
- Avoid tag embedding
- <TIMEX2 VAL="1999-08-03">two weeks from <TIMEX2 VAL="1999-07-20">next Tuesday</TIMEX2></TIMEX2>
- Include temporal functions for delayed evaluation
- Allow non-consuming tags
- Put relationships in Links
64TIMEX3 Annotation
- Time Points
- <TIMEX3 tid="t1" type="TIME" value="T24:00">midnight</TIMEX3>
- <TIMEX3 tid="t2" type="DATE" value="2005-02-15" temporalFunction="TRUE" anchorTimeID="t0">tomorrow</TIMEX3>
- Durations
- <TIMEX3 tid="t6" type="DURATION" value="P2W" beginPoint="t61" endPoint="t62">two weeks</TIMEX3> from <TIMEX3 tid="t61" type="DATE" value="2003-06-07">June 7, 2003</TIMEX3>
- <TIMEX3 tid="t62" type="DATE" value="2003-06-21" temporalFunction="true" anchorTimeID="t6"/>
- Sets
- <TIMEX3 tid="t1" type="SET" value="P1M" quant="EVERY" freq="P3D">three days every month</TIMEX3>
- <TIMEX3 tid="t1" type="SET" value="P1M" freq="P2X">twice a month</TIMEX3>
65TimeML and DAML-Time Ontology
- We [ship]e1 [2 days]t1 after the [purchase]e2
- TimeML
- <TLINK eventInstanceID="e1" relatedToTime="t1" relType="BEGINS"/>
- <TLINK eventInstanceID="e1" relatedToEventInstance="e2" relType="AFTER"/>
- DAML-OWL
- atTime(e1, t1) & atTime(e2, t2) & after(t1, t2) & timeBetween(T, t1, t2) & duration(T, Days) = 2
Hobbs & Pustejovsky, in I. Mani et al., eds., The Language of Time
68Callisto Annotation Tool
69Tabular Annotation of Links
70TANGO Graphical Annotator
73Timex2/3 Extraction
- Accuracy
- Best systems (TIMEX2): .95 F Extent, .8X F VAL (TERN 2004 English)
- GUTime: .85 F Extent, .82 F VAL (TERN 2004 training data, English)
- KTX: .87 F Extent, .86 F VAL (100 Korean documents)
- Machine Learning
- Tagging Extent easily trained
- Normalizing Values harder to train
74TimeML Event Extraction
- Easier than MUC template events (those were .6F)
- Part-of-speech tagging to find verbs
- Lexical patterns to detect tense and lexical and grammatical aspect
- Syntactic rules to determine subordination relations
- Recognition and disambiguation of event nominals, e.g., war, building, construction, etc.
- Evita (Brandeis)
- 0.8 F on verbal events (overgenerates generic events which weren't marked in TimeBank)
- 0.64 F on event nominals (WordNet-derived, disambiguated via SemCor training)
75TempEx in Qanda
76Extracting Temporal Relations based on Tense
Sequences
- Song & Cohen 1991: Adopt a Reichenbachian tense representation
- Use rules for permissible tense sequences
- When the tense moves from simple present to simple past, the event time moves backward, and from simple present to simple future, it moves forward.
- When the tense of two successive sentences is the same, they argue that the event time moves forward, except for statives and unbounded processes, which keep the same time.
- Won't work in cases of discourse moves
- When the tense moves from present perfect to simple past, or present prospective (John is going to run) to simple future, the event time of the second sentence is less than or equal to the event time of the first sentence.
- However, incorrectly rules out, among others, present tense to past perfect transitions.
Song & Cohen, AAAI-91
77Extracting Temporal Relations by Heuristic Rule
Weighting
- Approach assigns weights to different ordering possibilities based on the knowledge sources involved.
- Temporal adverbials and discourse cues are first tried; if neither are present, then default rules based on tense and aspect are used.
- Given a sentence describing past tense activity followed by one describing a past tense accomplishment or achievement, the second event can only occur just after the activity; it can't precede, overlap, or be identical to it.
- If the ordering is still ambiguous at the end of this, semantic rules are used based on modeling the discourse in terms of threads.
- Assumes there is one thread that the discourse is currently following.
- a. John went into the florist shop.
- b. He had promised Mary some flowers.
- c. She said she wouldn't forgive him if he forgot.
- d. So he picked out three red roses.
- Each utterance is associated with exactly one of two threads
- (i) going into the florist's shop and
- (ii) interacting with Mary.
- Prefer an utterance to continue a current thread which has the same tense or is semantically related to it
- (i) would be continued by d. based on tense
Janet Hitzeman, Marc Moens, and Claire Grover, Algorithms for Analysing the Temporal Structure of Discourse, EACL 1995, 253-60.
78Heuristic Rules (Georgetown GTag)
- Uses 187 hand-coded rules
- LHS: tests based on TimeML-related features and POS tags
- RHS: TimeML TLINK classes ( 13 Allen)
- Ordered into classes
- R12: event anchored w/o signal to time in same clause
- R3 (28): main clause event in 2 successive sentences
- R4: reporting verb and document time
- R5 (54): reporting verb and event in same sentence
- R6 (87): events in same sentence
- R7: timex linked to document time
- Rules can have confidence
- ruleNum=6-6
- If sameSentence=YES
- sentenceType=ANY
- conjBetweenEvents=YES
- arg1.class=EVENT
- arg2.class=EVENT
- arg1.tense=PAST
- arg2.tense=PAST
- arg1.aspect=NONE
- arg2.aspect=NONE
- arg1.pos=VB
- arg2.pos=VB
- arg1.firstVbEvent=ANY
- arg2.firstVbEvent=ANY
- then infer relation=BEFORE
- Confidence = 1.0
- Comment = they traveled far and slept the night in a rustic inn
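To make the rule format concrete, here is a minimal sketch of how a rule like 6-6 could be matched against a pair of events. It reads each condition as a feature/value test with ANY as a wildcard, which is an assumption about the notation; the dictionary layout and the functions `rule_matches` and `apply_rules` are hypothetical and are not GTag's actual code.

```python
# Rule 6-6 from the slide, stored as plain data (layout is an assumption).
RULE_6_6 = {
    "ruleNum": "6-6",
    "conditions": {
        "sameSentence": "YES",
        "sentenceType": "ANY",
        "conjBetweenEvents": "YES",
        "arg1.class": "EVENT",
        "arg2.class": "EVENT",
        "arg1.tense": "PAST",
        "arg2.tense": "PAST",
        "arg1.aspect": "NONE",
        "arg2.aspect": "NONE",
        "arg1.pos": "VB",
        "arg2.pos": "VB",
        "arg1.firstVbEvent": "ANY",
        "arg2.firstVbEvent": "ANY",
    },
    "relation": "BEFORE",
    "confidence": 1.0,
}


def rule_matches(rule, features):
    """True if every non-wildcard condition equals the observed feature."""
    return all(
        expected == "ANY" or features.get(name) == expected
        for name, expected in rule["conditions"].items()
    )


def apply_rules(rules, features):
    """Return (relation, confidence) of the first matching rule, else None."""
    for rule in rules:
        if rule_matches(rule, features):
            return rule["relation"], rule["confidence"]
    return None


# Features for "they traveled far and slept the night in a rustic inn".
traveled_slept = {
    "sameSentence": "YES", "sentenceType": "DECLARATIVE",
    "conjBetweenEvents": "YES",
    "arg1.class": "EVENT", "arg2.class": "EVENT",
    "arg1.tense": "PAST", "arg2.tense": "PAST",
    "arg1.aspect": "NONE", "arg2.aspect": "NONE",
    "arg1.pos": "VB", "arg2.pos": "VB",
}
```

Calling `apply_rules([RULE_6_6], traveled_slept)` yields the BEFORE relation with the rule's confidence; changing either tense makes the rule fail to fire.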
79Using Web-Mined Rules
- Lexical relations (capturing causal and other relations, etc.)
- kill > die (always)
- push > fall (sometimes: "Max fell. John pushed him.")
- Idea: leverage the distributions found in large corpora
- VerbOcean: database from ISI that contains lexical relations mined from Google searches
- E.g., X happens before Y, where X and Y are WordNet verbs highly associated in a corpus
- Converted to GUTenLink format
- Yields 4199 rules!
- ruleNum=8-3991
- If arg1.class=EVENT
- arg2.class=EVENT
- arg1.word=learn (uses morph normalization)
- arg2.word=forget
- then infer relation=BEFORE
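A lexical rule of this kind amounts to a lookup on a normalized verb pair. The sketch below assumes a toy lemma table in place of real morphological normalization and a one-entry rule set taken from the slide's example; `LEMMAS`, `LEXICAL_BEFORE`, `normalize`, and `lexical_relation` are hypothetical names, not part of VerbOcean or GUTenLink.

```python
# Toy stand-in for morphological normalization (assumption: a real system
# would use a lemmatizer rather than a hand-written table).
LEMMAS = {
    "learned": "learn", "learns": "learn", "learning": "learn",
    "forgot": "forget", "forgets": "forget", "forgotten": "forget",
}

# Verb pairs (X, Y) for which "X happens before Y" was mined.
LEXICAL_BEFORE = {("learn", "forget")}


def normalize(word):
    """Lowercase the word and map it to its lemma when one is known."""
    lowered = word.lower()
    return LEMMAS.get(lowered, lowered)


def lexical_relation(arg1_word, arg2_word):
    """Return 'BEFORE' if the normalized pair has a mined rule, else None."""
    pair = (normalize(arg1_word), normalize(arg2_word))
    return "BEFORE" if pair in LEXICAL_BEFORE else None
```

The lookup is directional: `lexical_relation("learned", "forgot")` fires, while the reversed pair does not, so argument order has to be fixed before the rule is applied.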
81IE Methodology
[Diagram; components: Raw Corpus, Initial Tagger, Annotation Editor, Annotation Guidelines, Annotated Corpus, Machine Learning Program, Learned Rules, Rule Apply]
82Related Machine Learning Work
- (Li et al., ACL2004) obtained 78-88% accuracy on ordering within-sentence temporal relations in Chinese texts.
- (Mani et al., HLT2003 short) obtained 80.2 F-measure training a decision tree on 2069 clauses in anchoring events to reference times that were inferred for each clause.
- (Lapata and Lascarides, NAACL2004) used found data to successfully learn which (possibly ambiguous) temporal markers connect a main and subordinate clause, without inferring underlying temporal relations.
83CarSim: Text to Accident Simulation System
- Carries out TimeML annotation of Swedish accident reports
- Builds an event ordering graph using machine learning, with separate decision trees for local and global TLINKs
- Generates, based on domain knowledge, a simulation of the accident
Anders Berglund. Extracting Temporal Information and Ordering Events for Swedish. MS Thesis. Lund University. 2004.
84Prior Machine Learning from TimeBank
- Mani (p.c., 2004)
- TLINKs converted into feature vectors from TimeBank 1.0 tags
- TLINK relType converted to feature vector class label, after collapsing
- Accuracy of C5.0.1 decision rules: .55 F
- majority class
- Boguraev & Ando (IJCAI2005)
- Uses features based on local syntactic context (chunks and clause-structure)
- Trained a classifier for within-sentence TLINKs on TimeBank 1.1: .53 F
- Bottom line: TimeBank corpus doesn't provide enough data for training learners?
85Insight: TLINK Annotation (Humans)
- Inter-annotator reliability is .55 F
- But agreement on LINK labels: 77%
- So, the problem is largely which events to link
- Within sentence, adjacent sentences, across document?
- Guidelines aren't that helpful
- Conclusion: global TLINKing is too fatiguing
- 0.84 TLINKs/event in corpus
86Temporal Reasoning to the Rescue
- Earlier experiments with SputLINK in TANGO (interactive, text-segmented closure) indicated that without closure, annotators cover 4% of all possible links.
- With closure, an annotator could cover about 65% of all possible links in a document.
- Of those links, 84% were derived by the algorithm
Link source      Count   Share
Initial Links    36      4%
User Prompts     109     12%
Derived Links    775     84%
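The idea that closure derives most links from a few annotated ones can be illustrated with the simplest possible case. The sketch below applies only the transitivity of BEFORE to a set of (earlier, later) pairs; it is a simplified stand-in, not SputLINK, which reasons over the full set of temporal relations. The function name and the event identifiers are assumptions.

```python
def close_before(links):
    """Transitive closure of a set of (earlier, later) BEFORE links."""
    closed = set(links)
    changed = True
    while changed:
        changed = False
        for a, b in list(closed):
            for c, d in list(closed):
                # a BEFORE b and b BEFORE d  =>  a BEFORE d
                if b == c and (a, d) not in closed:
                    closed.add((a, d))
                    changed = True
    return closed


# Three annotated links forming a chain e1 < e2 < e3 < e4.
annotated = {("e1", "e2"), ("e2", "e3"), ("e3", "e4")}
closed = close_before(annotated)
derived = closed - annotated
```

Here three annotated links yield three more derived ones, so half of the closed set comes from inference; longer chains raise that share quickly, which is the effect the counts on the slide reflect.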
87IE Methodology
[Diagram; components: Raw Corpus, Initial Tagger, Annotation Editor, Annotation Guidelines, Annotated Corpus, Machine Learning Program, Learned Rules, Rule Apply, Axioms]
88Temporal Closure as an Oversampling Method
Corpus: 186 TimeBank 1.2.1, 73 Opinion Corpus
- Closing the corpus (with 745 axioms)
- Number of TLINKs goes up > 11 times!
- BEFORE links go up from 3170 Event-Event and 1229 Event-Time TLINKs to 68,585 Event-Event and 186,65 Event-Time TLINKs
- Before closure: 0.84 TLINKs/event
- After closure: 9.49 TLINKs/event
12750 Events, 2114 Times
Relation        Event-Event     Event-Time
IBEFORE         131             15
BEGINS          160             112
ENDS            208             159
SIMULTANEOUS    1528            77
INCLUDES        950             3001 (65.3%)
BEFORE          3170 (51.6%)    1229
TOTAL           6147            4593
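The figures on this slide can be cross-checked from the table itself. The short sketch below recomputes the column totals, the share of the largest class in each column (the parenthesized percentages, which is what a learner that always guesses the most frequent label would score), and the pre-closure TLINKs/event ratio. Variable and function names are illustrative.

```python
# Pre-closure TLINK counts copied from the table.
event_event = {"IBEFORE": 131, "BEGINS": 160, "ENDS": 208,
               "SIMULTANEOUS": 1528, "INCLUDES": 950, "BEFORE": 3170}
event_time = {"IBEFORE": 15, "BEGINS": 112, "ENDS": 159,
              "SIMULTANEOUS": 77, "INCLUDES": 3001, "BEFORE": 1229}
N_EVENTS = 12750


def majority_share(counts):
    """Return the most frequent label and its fraction of all links."""
    label = max(counts, key=counts.get)
    return label, counts[label] / sum(counts.values())


total_links = sum(event_event.values()) + sum(event_time.values())
tlinks_per_event = total_links / N_EVENTS

ee_label, ee_share = majority_share(event_event)
et_label, et_share = majority_share(event_time)
```

The totals come out to 6147 and 4593, the majority classes are BEFORE (Event-Event) and INCLUDES (Event-Time), and the ratio rounds to 0.84 TLINKs per event, all consistent with the slide.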
89ML Results
- Features: each TLINK is a feature vector
- For each event in the pair:
- event-class: occurrence, state, reporting, i-action, i-state, aspectual, perception
- aspect: progressive, perfective, progressive_perfective
- modality: nominal
- negation: nominal
- string: string
- tense: present, past, future
- signal: string
- shiftAspect: boolean
- shiftTense: boolean
- class: SIMULTANEOUS, IBEFORE, BEFORE, BEGINS, ENDS, INCLUDES
[Chart: Link Labeling Accuracy]
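As a rough illustration of the feature list, the sketch below turns a pair of events into one flat feature vector. Several points are assumptions rather than facts from the slide: that signal, shiftAspect, and shiftTense are pair-level features; that a shift simply means the two events' values differ; and that events without aspect carry the value "none". The function name and the example events are hypothetical.

```python
# Per-event attributes named on the slide.
EVENT_ATTRS = ("event-class", "aspect", "modality", "negation",
               "string", "tense")


def tlink_features(e1, e2, signal=""):
    """Build one feature vector for a candidate TLINK between e1 and e2."""
    features = {}
    for prefix, event in (("e1", e1), ("e2", e2)):
        for name in EVENT_ATTRS:
            features[f"{prefix}.{name}"] = event[name]
    features["signal"] = signal
    # Assumption: a "shift" is any difference between the two events.
    features["shiftAspect"] = e1["aspect"] != e2["aspect"]
    features["shiftTense"] = e1["tense"] != e2["tense"]
    return features


# "Holly was running a marathon when she twisted her ankle."
running = {"event-class": "occurrence", "aspect": "progressive",
           "modality": "none", "negation": "none",
           "string": "running", "tense": "past"}
twisted = {"event-class": "occurrence", "aspect": "none",
           "modality": "none", "negation": "none",
           "string": "twisted", "tense": "past"}
vector = tlink_features(running, twisted, signal="when")
```

The class label (SIMULTANEOUS, IBEFORE, BEFORE, BEGINS, ENDS, or INCLUDES) would be attached to each such vector from the annotated TLINK before training.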
90TLINK Extraction Conclusion
- Annotated TimeML Corpus provides insufficient examples for training machine learners
- Significant result
- Number of examples expanded 11 times by closure
- Training learners on the expanded corpus yields excellent performance
- Performance exceeds human intuitions, even when augmented with lexical rules
- Next steps
- Integrate GUTenLink and VerbOcean rules into machine learning framework
- Integrate with s2tlink and a2tlink
- Feature engineering
91Challenges: Temporal Reasoning
- Temporal reasoning for IE has used qualitative temporal relations
- Trivial metric relations (distances in time) can be extracted from anchored durations and sorted time expressions
- But commonsense metric constraints are missing
- Time(Haircut) << Time(fly Boston2Sydney)
- First steps
- Hobbs et al. at ACL06
- Mani & Wellner at ARTE06 workshop
92Challenges: Integrating Reasoning and Learning
93Difficulties in Annotation
- In an interview with Barbara Walters to be shown on ABC's Friday nights, Shapiro said he tried on the gloves and realized they would never fit Simpson's larger hands.
- BEFORE or MEET?
- More coarse-grained annotation may suffice
94Discourse Relations
- Lexical rules from VerbOcean are still very sparse, even though they are less brittle
- But need to match arguments when applying lexical rules (e.g., subj/obj of push/fall)
- A discourse model should in fact be used
95Temporal Relations as Surrogates for Rhetorical Relations
a. John went into the florist shop.
b. He had promised Mary some flowers.
c. She said she wouldn't forgive him if he forgot.
d. So he picked out three red roses.
- When E1 is left-sibling of E2 and E1 < E2, then typically, Narration(E1, E2)
- When E1 is right-sibling of E2 and E1 < E2, then typically Explanation(E2, E1)
- When E2 is a child node of E1, then typically Elaboration(E1, E2)
[Tree diagram; edge labels: Expl, Elab, Narr]
Constraints: Eb < Ec, Ec < Ea, Ea < Ed
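The three "typically" rules and the ordering constraints can be sketched in a few lines. This is a hedged illustration only: the configuration labels and the function `surrogate_relation` are invented for the example, and the ordering step simply sorts the slide's three constraints with Python's standard `graphlib` module (Python 3.9+).

```python
from graphlib import TopologicalSorter


def surrogate_relation(config, e1_before_e2=False):
    """Map a tree configuration plus temporal order to a rhetorical relation.

    config is one of the hypothetical labels 'E1_left_sibling_of_E2',
    'E1_right_sibling_of_E2', 'E2_child_of_E1'.
    Returns (relation, first_arg, second_arg) or None.
    """
    if config == "E1_left_sibling_of_E2" and e1_before_e2:
        return ("Narration", "E1", "E2")
    if config == "E1_right_sibling_of_E2" and e1_before_e2:
        return ("Explanation", "E2", "E1")
    if config == "E2_child_of_E1":
        return ("Elaboration", "E1", "E2")
    return None


# Constraints from the slide as (earlier, later) pairs.
constraints = [("b", "c"), ("c", "a"), ("a", "d")]

# TopologicalSorter expects each node mapped to its predecessors.
predecessors = {}
for earlier, later in constraints:
    predecessors.setdefault(later, set()).add(earlier)

timeline = list(TopologicalSorter(predecessors).static_order())
```

For the florist story the constraints form a single chain, so the sort has only one possible result: the promise (b), then the warning (c), then entering the shop (a), then picking the roses (d).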
96TLINKs as a Measure of Fluency in Second Language Learning
- Analyzed English oral and written proficiency samples elicited from 16 speakers of English
- 8 native speakers and 8 students in Advanced courses in an Intensive English Program.
- Corpus includes 5888 words elicited from subjects via a written narrative retelling task
- Chaplin's Hard Times
- On average, native speakers (NSs) use significantly fewer words to create TLINKs (8.2/TLINK vs. 10.1 for NNSs).
- Number of closed TLINKs for NS far exceeds the number for NNS (12,330 vs. 4924).
- This means NS have, on the average, longer chains of TLINKs
Joint work with Jeff Connor-Linton at AAAL05.
98Corpora
- News (newswire and broadcast)
- TimeML: TimeBank, AQUAINT Corpus (all English)
- TIMEX2: TIDES and TERN English Corpora, Korean Corpus (200 docs), TERN Chinese and Arabic news data (extents only)
- Weblogs
- TIMEX2: TERN corpus (English, Chinese, Arabic; the latter with extents only)
- Dialogues
- TIMEX2: 95 Spanish Enthusiast dialogs, and their translations
- Meetings
- TIMEX2: Spanish portions of UN Parallel corpus (23,000 words)
- Children's Stories
- Reading Comprehension Exams from MITRE: Remedia (120 stories, 20K words), CBC (259 stories, 1/3 tagged, 50K)
99Links
- TimeBank (April 17, 2006)
- http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2006T08
- TimeML
- www.timeml.org
- TIMEX2/TERN ACE data (English, Chinese, Arabic)
- timex2.mitre.org
- TIMEX2/3 Tagger
- http://complingone.georgetown.edu/linguist/GU_TIME_DOWNLOAD.HTML
- Korean and Spanish data: imani_at_mitre.org
- Callisto: callisto.mitre.org
100References
- Mani, I., Pustejovsky, J., and Gaizauskas, R. (eds.). (2005). The Language of Time: A Reader. Oxford University Press.
- Mani, I., and Schiffman, B. (2004). Temporally Anchoring and Ordering Events in News. In Pustejovsky, J. and Gaizauskas, R. (eds.), Time and Event Recognition in Natural Language. John Benjamins, to appear.
- Mani, I. (2004). Recent Developments in Temporal Information Extraction. In Nicolov, N., and Mitkov, R., Proceedings of RANLP'03. John Benjamins, to appear.
- Jang, S., Baldwin, J., and Mani, I. (2004). Automatic TIMEX2 Tagging of Korean News. In Mani, I., Pustejovsky, J., and Sundheim, B. (eds.), ACM Transactions on Asian Language Processing: Special issue on Temporal Information Processing.
- Mani, I., Schiffman, B., and Zhang, J. (2003). Inferring Temporal Ordering of Events in News. Short Paper. In Proceedings of the Human Language Technology Conference (HLT-NAACL'03).
- Ferro, L., Mani, I., Sundheim, B. and Wilson G. (2001). TIDES Temporal Annotation Guidelines Draft - Version 1.02. MITRE Technical Report MTR 01W000004. McLean, Virginia: The MITRE Corporation.
- Mani, I. and Wilson, G. (2000). Robust Temporal Processing of News. In Proceedings of the 38th Annual Meeting of the Association for Computational Linguistics (ACL'2000), 69-76.