Title: Tree Kernel-based Semantic Relation Extraction using Unified Dynamic Relation Tree
- Reporter: Longhua Qian
- School of Computer Science and Technology
- Soochow University, Suzhou, China
- 2008.07.23
- ALPIT 2008, Dalian, China
Outline
- 1. Introduction
- 2. Dynamic Relation Tree
- 3. Unified Dynamic Relation Tree
- 4. Experimental results
- 5. Conclusion and Future Work
1. Introduction
- Information extraction is an important research topic in NLP.
- It attempts to find relevant information in the large amount of text documents available in digital archives and on the WWW.
- Information extraction tasks defined by NIST ACE:
- Entity Detection and Tracking (EDT)
- Relation Detection and Characterization (RDC)
- Event Detection and Characterization (EDC)
RDC
- Function
- RDC detects and classifies semantic relationships (usually of predefined types) between pairs of entities. Relation extraction is very useful for a wide range of advanced NLP applications, such as question answering and text summarization.
- E.g.
- The sentence "Microsoft Corp. is based in Redmond, WA" conveys the relation GPE-AFF.Based between "Microsoft Corp." (ORG) and "Redmond" (GPE).
Two approaches
- Feature-based methods
- have dominated the research in relation extraction over the past years. However, relevant research shows that it is difficult to extract new effective features and further improve the performance.
- Kernel-based methods
- compute the similarity of two objects (e.g. parse trees) directly. The key problem is how to represent and capture the structured information in complex structures, such as the syntactic information in the parse tree, for relation extraction.
Kernel-based related work
- Zelenko et al. (2003), Culotta and Sorensen (2004), and Bunescu and Mooney (2005) described several kernels between shallow parse trees or dependency trees to extract semantic relations.
- Zhang et al. (2006) and Zhou et al. (2007) proposed composite kernels consisting of a linear kernel and a convolution parse tree kernel; the latter can effectively capture the structured syntactic information inherent in parse trees.
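The convolution parse tree kernel mentioned above can be sketched as follows. This is a minimal, illustrative implementation of the Collins and Duffy (2001) subtree-counting recursion, not the toolkit used in the experiments; the tree encoding and the example sentences are assumptions of the sketch:

```python
# Minimal sketch of the convolution parse tree kernel (Collins & Duffy, 2001).
# A tree is a nested tuple: (label, child, ...); a leaf is a plain string.

LAMBDA = 0.4  # decay factor; the slides report lambda = 0.4

def production(node):
    """The grammar production at a node: its label plus its children's labels."""
    return (node[0],) + tuple(c if isinstance(c, str) else c[0] for c in node[1:])

def delta(n1, n2, lam=LAMBDA):
    """Weighted count of common subtrees rooted at n1 and n2."""
    if production(n1) != production(n2):
        return 0.0
    score = lam
    for c1, c2 in zip(n1[1:], n2[1:]):
        if not isinstance(c1, str):        # recurse into internal children
            score *= 1.0 + delta(c1, c2, lam)
    return score

def internal_nodes(tree):
    yield tree
    for child in tree[1:]:
        if not isinstance(child, str):
            yield from internal_nodes(child)

def tree_kernel(t1, t2, lam=LAMBDA):
    """K(T1, T2): sum delta over all pairs of nodes from the two trees."""
    return sum(delta(n1, n2, lam)
               for n1 in internal_nodes(t1) for n2 in internal_nodes(t2))

t1 = ("S", ("NP", "Microsoft"), ("VP", ("V", "is"), ("PP", "based")))
t2 = ("S", ("NP", "IBM"), ("VP", ("V", "is"), ("PP", "based")))
similarity = tree_kernel(t1, t2)   # the shared VP/V/PP structure scores > 0
```

The key property used for relation extraction is that the kernel scores trees by their shared substructures, so it needs no explicit feature extraction.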
Structured syntactic information
- A tree span for a relation instance
- the part of a parse tree used to represent the structured syntactic information for relation extraction.
- Two currently used tree spans
- PT (Path-enclosed Tree): the sub-tree enclosed by the shortest path linking the two entities in the parse tree
- CSPT (Context-Sensitive Path-enclosed Tree): dynamically determined by further extending PT with the necessary predicate-linked path information outside it.
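As an illustration of PT, the sketch below prunes a parse down to the span covered by the two entity mentions. The tree encoding, example sentence, and entity positions are assumptions of this sketch (the actual systems work on Charniak parser output):

```python
# Hypothetical sketch of extracting a Path-enclosed Tree (PT): keep only the
# parts of the parse whose leaves lie between the two entity mentions.
# A tree is a nested list: [label, child, ...]; a leaf is a plain string.

def leaves(tree):
    if isinstance(tree, str):
        return [tree]
    out = []
    for child in tree[1:]:
        out.extend(leaves(child))
    return out

def prune_to_span(tree, lo, hi, offset=0):
    """Drop every subtree whose leaf positions fall entirely outside [lo, hi]."""
    kept = [tree[0]]
    pos = offset
    for child in tree[1:]:
        width = 1 if isinstance(child, str) else len(leaves(child))
        if pos <= hi and pos + width - 1 >= lo:   # child overlaps the span
            kept.append(child if isinstance(child, str)
                        else prune_to_span(child, lo, hi, pos))
        pos += width
    return kept

# "Microsoft Corp. is based in Redmond ." with entities at leaves 0-1 and 5
sent = ["S",
        ["NP", ["NNP", "Microsoft"], ["NNP", "Corp."]],
        ["VP", ["VBZ", "is"],
               ["VP", ["VBN", "based"],
                      ["PP", ["IN", "in"], ["NP", ["NNP", "Redmond"]]]]],
        [".", "."]]
pt = prune_to_span(sent, 0, 5)   # the trailing "." lies outside the path
```

Here the final period falls outside the entity span and is pruned, which is exactly the kind of material PT is meant to exclude.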
Current problems
- Noisy information
- Both PT and CSPT may still contain noisy information. In other words, more noise should be pruned away from a tree span.
- Useful information
- CSPT captures only the part of the context-sensitive information relating to the predicate-linked path. That is to say, more information outside PT/CSPT may be recovered so as to discern the entities' relationships.
Our solution
- Dynamic Relation Tree (DRT)
- Based on PT, we apply a variety of linguistics-driven rules to dynamically prune noisy information out of the syntactic parse tree and include necessary contextual information.
- Unified Dynamic Relation Tree (UDRT)
- Instead of constructing composite kernels, various kinds of entity-related semantic information, including entity types/subtypes/mention levels etc., are unified into a Dynamic Relation Tree.
2. Dynamic Relation Tree
- Generation of DRT
- Starting from PT, we apply three kinds of operations (i.e. Remove, Compress, and Expand) sequentially to reshape PT, finally giving rise to a Dynamic Relation Tree.
- Remove operations
- DEL_ENT2_PRE: remove all the constituents (except the headword) of the 2nd entity
- DEL_PATH_ADVP/PP: remove adverb or preposition phrases along the path
DRT (cont'd)
- Compress operations
- CMP_NP_CC_NP: compress noun phrase coordination conjunctions
- CMP_VP_CC_VP: compress verb phrase coordination conjunctions
- CMP_SINGLE_INOUT: compress single in-and-out nodes
- Expansion operations
- EXP_ENT2_POS: expand the possessive structure after the 2nd entity
- EXP_ENT2_COREF: expand an entity coreferential mention before the 2nd entity
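As one concrete example of these reshaping rules, a compress rule such as CMP_SINGLE_INOUT can be sketched as collapsing unary chains, i.e. nodes with a single child and a single parent. Which node's label survives the collapse is an assumption of this sketch:

```python
# Sketch of CMP_SINGLE_INOUT: a chain of nodes that each have exactly one
# child contributes no branching structure, so the chain is collapsed onto
# its lowest node. Trees are nested lists: [label, child, ...]; leaves strings.

def compress_single_inout(tree):
    if isinstance(tree, str):
        return tree
    # walk down through unary internal nodes (single in-and-out)
    while len(tree) == 2 and not isinstance(tree[1], str):
        tree = tree[1]
    return [tree[0]] + [compress_single_inout(c) for c in tree[1:]]

chain = ["NP", ["NP", ["NNP", "Redmond"]]]       # NP -> NP -> NNP is unary
branching = ["S", ["NP", "he"], ["VP", "runs"]]  # branching: left untouched
```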
Some examples of DRT
3. Unified Dynamic Relation Tree
- T1: DRT
- T2: UDRT-Bottom
- T3: UDRT-Entity
- T4: UDRT-Top
Four UDRT setups
- T1: DRT
- there is no entity-related information except the entity order (i.e. E1 and E2).
- T2: UDRT-Bottom
- the DRT with entity-related information attached at the bottom of the two entity nodes
- T3: UDRT-Entity
- the DRT with entity-related information attached in the entity nodes
- T4: UDRT-Top
- the DRT with entity-related features attached at the top node of the tree.
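The four setups differ only in where the entity features enter the tree, which a schematic sketch can make concrete. The feature strings ("ORG"/"GPE") and the encoding itself are illustrative assumptions, not the paper's exact scheme:

```python
# Trees are nested lists: [label, child, ...]. ENT maps the entity-order
# labels E1/E2 to an entity-type feature (illustrative values).

ENT = {"E1": "ORG", "E2": "GPE"}

def udrt_bottom(tree):
    """T2: attach the feature as an extra child below each entity node."""
    if isinstance(tree, str):
        return tree
    out = [tree[0]] + [udrt_bottom(c) for c in tree[1:]]
    if tree[0] in ENT:
        out.append([ENT[tree[0]]])
    return out

def udrt_entity(tree):
    """T3: fold the feature into the entity node's own label."""
    if isinstance(tree, str):
        return tree
    label = tree[0] + "-" + ENT[tree[0]] if tree[0] in ENT else tree[0]
    return [label] + [udrt_entity(c) for c in tree[1:]]

def udrt_top(tree):
    """T4: attach the combined features as one node under the root."""
    return [tree[0], [ENT["E1"] + "-" + ENT["E2"]]] + tree[1:]

drt = ["S", ["E1", "Microsoft"], ["VP", "based", ["E2", "Redmond"]]]
```

Because the tree kernel matches subtrees, each placement changes which substructures the entity features can co-occur with during matching.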
4. Experimental results
- Corpus statistics
- The ACE RDC 2004 data contains 451 documents and 5702 relation instances. It defines 7 entity major types, 7 major relation types and 23 relation subtypes.
- Evaluation is done on 347 (nwire/bnews) documents and 4307 relation instances using 5-fold cross-validation.
- Corpus processing
- parsed using Charniak's parser (Charniak, 2001)
- Relation instances are generated by iterating over all pairs of entity mentions occurring in the same sentence.
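The candidate generation step just described can be sketched as follows; the toy document structure is an assumption of the sketch:

```python
# Sketch of relation-candidate generation: one candidate per pair of entity
# mentions occurring in the same sentence (cross-sentence pairs are skipped).
from itertools import combinations

def relation_candidates(sentences):
    for sent in sentences:
        for m1, m2 in combinations(sent["mentions"], 2):
            yield (sent["id"], m1, m2)

doc = [
    {"id": 0, "mentions": ["Microsoft Corp.", "Redmond", "WA"]},  # 3 pairs
    {"id": 1, "mentions": ["Bill Gates"]},                        # no pair
]
candidates = list(relation_candidates(doc))   # 3 candidates in total
```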
Classifier
- Tools
- SVMLight (Joachims, 1998)
- Tree Kernel Toolkit (Moschitti, 2004)
- The training parameters C (SVM) and λ (tree kernel) are set to 2.4 and 0.4 respectively.
- One-vs-others strategy
- which builds K basic binary classifiers so as to separate one class from all the others.
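The one-vs-others scheme can be sketched as below. A trivial nearest-centroid scorer stands in for the SVMLight binary machines, purely to show how K binary decisions are combined into a K-way one; everything here is an illustrative assumption:

```python
# One-vs-others: train one binary classifier per relation class, each
# separating that class from all the others; predict with the highest scorer.

class CentroidStub:
    """Toy stand-in for one binary SVM: scores by distance to its positives."""
    def fit(self, xs, ys):
        pos = [x for x, y in zip(xs, ys) if y == 1]
        self.center = sum(pos) / len(pos)
        return self
    def score(self, x):
        return -abs(x - self.center)   # higher means closer to this class

def train_one_vs_others(xs, labels):
    return {c: CentroidStub().fit(xs, [1 if l == c else -1 for l in labels])
            for c in sorted(set(labels))}

def predict(machines, x):
    return max(machines, key=lambda c: machines[c].score(x))

xs = [0.0, 0.1, 1.0, 1.1, 2.0, 2.1]
ys = ["A", "A", "B", "B", "C", "C"]
machines = train_one_vs_others(xs, ys)   # K = 3 binary machines
```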
Contribution of various operation rules
- Each operation rule is incrementally applied on the previously derived tree span.
- A plus sign preceding a specific rule indicates that the rule is useful and is automatically retained in the next round.
- Otherwise, the rule's performance figures are unavailable (shown as dashes).
Operation rules    P     R     F
PT (baseline)      76.3  59.8  67.1
DEL_ENT2_PRE       76.3  62.1  68.5
DEL_PATH_PP        -     -     -
DEL_PATH_ADVP      -     -     -
CMP_SINGLE_INOUT   76.4  63.1  69.1
CMP_NP_CC_NP       76.1  63.3  69.1
CMP_VP_CC_VP       -     -     -
EXP_ENT2_POS       76.6  63.8  69.6
EXP_ENT2_COREF     77.1  64.3  70.1
Comparison of different UDRT setups
Tree Setups P R F
DRT 68.7 53.5 60.1
UDRT-Bottom 76.2 64.4 69.8
UDRT-Entity 77.1 64.3 70.1
UDRT-Top 76.4 65.2 70.4
- Compared with DRT, the Unified Dynamic Relation Trees (UDRTs) with only entity type information significantly improve the F-measure by about 10 units on average, due to increases in both precision and recall.
- Among the three UDRTs, UDRT-Top achieves slightly better performance than the other two.
Improvements of different tree setups over PT
Tree Setups P R F
CSPT over PT 1.5 1.1 1.3
DRT over PT 0.1 5.4 3.3
UDRT-Top over PT 3.9 9.4 7.2
- The Dynamic Relation Tree (DRT) performs better than the CSPT/PT setups.
- The Unified Dynamic Relation Tree with entity-related semantic features attached at the top node of the parse tree performs best.
Comparison with best-reported systems
Systems                              P     R     F
Zhou et al.: Composite kernel        82.2  70.2  75.8
Zhang et al.: Composite kernel       76.1  68.4  72.1
Zhao and Grishman: Composite kernel  69.2  70.5  70.4
Ours: CTK with UDRT-Top              80.2  69.2  74.3
Zhou et al.: CS-CTK with CSPT        81.1  66.7  73.2
Zhang et al.: CTK with PT            74.1  62.4  67.7
- This shows that our UDRT-Top performs best among the tree setups using one single kernel, and even outperforms two of the three previous composite kernels.
5. Conclusion
- The Dynamic Relation Tree (DRT), which is generated by applying various linguistics-driven rules, can significantly improve the performance over currently used tree spans for relation extraction.
- Integrating entity-related semantic information into DRT can further improve the performance, especially when it is attached at the top node of the tree.
Future Work
- We will focus on semantic matching in computing the similarity between two parse trees, where the semantic similarity between content words (such as "hire" and "employ") would be considered to achieve better generalization.
References
- Bunescu R. C. and Mooney R. J. 2005. A Shortest Path Dependency Kernel for Relation Extraction. EMNLP-2005.
- Charniak E. 2001. Immediate-head Parsing for Language Models. ACL-2001.
- Collins M. and Duffy N. 2001. Convolution Kernels for Natural Language. NIPS-2001.
- Collins M. and Duffy N. 2002. New Ranking Algorithms for Parsing and Tagging: Kernels over Discrete Structures, and the Voted Perceptron. ACL-2002.
- Culotta A. and Sorensen J. 2004. Dependency Tree Kernels for Relation Extraction. ACL-2004.
- Joachims T. 1998. Text Categorization with Support Vector Machines: Learning with Many Relevant Features. ECML-1998.
- Moschitti A. 2004. A Study on Convolution Kernels for Shallow Semantic Parsing. ACL-2004.
- Zelenko D., Aone C. and Richardella A. 2003. Kernel Methods for Relation Extraction. Journal of Machine Learning Research, 3: 1083-1106.
- Zhang M., Zhang J., Su J. and Zhou G.D. 2006. A Composite Kernel to Extract Relations between Entities with Both Flat and Structured Features. COLING-ACL-2006.
- Zhao S.B. and Grishman R. 2005. Extracting Relations with Integrated Information Using Kernel Methods. ACL-2005.
- Zhou G.D., Su J., Zhang J. and Zhang M. 2005. Exploring Various Knowledge in Relation Extraction. ACL-2005.
Thank You!