Title: INEX: Evaluating content-oriented XML retrieval
1INEX: Evaluating content-oriented XML retrieval
- Mounia Lalmas
- Queen Mary University of London
- http://qmir.dcs.qmul.ac.uk
2Outline
- Content-oriented XML retrieval
- Evaluating XML retrieval: INEX
3XML Retrieval
- Traditional IR is about finding documents
relevant to a user's information need, e.g. an
entire book.
- XML retrieval allows users to retrieve document
components that are more focussed on their
information needs, e.g. a chapter of a book
instead of an entire book.
- The structure of documents is exploited to
identify which document components to retrieve.
4Structured Documents
- Linear order of words, sentences, paragraphs
- Hierarchy or logical structure of a book's
chapters, sections
- Links (hyperlinks), cross-references, citations
- Temporal and spatial relationships in multimedia
documents
5Structured Documents
- Explicit structure formalised through
document representation standards (mark-up
languages)
- Layout: LaTeX (publishing), HTML (Web publishing)
- Structure: SGML, XML (Web publishing,
engineering), MPEG-7 (broadcasting)
- Content/Semantic: RDF, DAML+OIL, OWL (semantic web)
<b><font size=2>SDR</font></b><img src="qmir.jpg" border=0>
<section> <subsection> <paragraph> </paragraph> <paragraph> </paragraph> </subsection> </section>
<Book rdf:about="book"> <rdf:author ../> <rdf:title/> </Book>
6XML: eXtensible Mark-up Language
- Meta-language (user-defined tags) currently being
adopted as the document format language by W3C
- Used to describe content and structure (and not
layout)
- Grammar described in a DTD (used for validation)
<lecture>
  <title> Structured Document Retrieval </title>
  <author> <fnm> Smith </fnm> <snm> John </snm> </author>
  <chapter>
    <title> Introduction into XML retrieval </title>
    <paragraph> ... </paragraph>
  </chapter>
</lecture>

<!ELEMENT lecture (title, author, chapter)>
<!ELEMENT author (fnm, snm)>
<!ELEMENT fnm (#PCDATA)>
7XML: eXtensible Mark-up Language
- Use of XPath notation to refer to the XML
structure
chapter/title: title is a direct sub-component of chapter
//title: any title
chapter//title: title is a direct or indirect sub-component of chapter
chapter/paragraph[2]: any direct second paragraph of any chapter
chapter/*: all direct sub-components of a chapter
<lecture>
  <title> Structured Document Retrieval </title>
  <author> <fnm> Smith </fnm> <snm> John </snm> </author>
  <chapter>
    <title> Introduction into SDR </title>
    <paragraph> ... </paragraph>
  </chapter>
</lecture>
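The path expressions on this slide can be tried out with Python's standard `xml.etree.ElementTree` module, which supports a limited XPath subset. The following is only a sketch: the two paragraph texts are placeholders invented for the example, and paths are evaluated relative to the `lecture` root, so `//title` is written `.//title`.

```python
import xml.etree.ElementTree as ET

# The lecture document from the slide; the paragraph texts are
# placeholders made up for this example.
LECTURE = """
<lecture>
  <title>Structured Document Retrieval</title>
  <author><fnm>Smith</fnm><snm>John</snm></author>
  <chapter>
    <title>Introduction into SDR</title>
    <paragraph>First paragraph (placeholder).</paragraph>
    <paragraph>Second paragraph (placeholder).</paragraph>
  </chapter>
</lecture>
"""

root = ET.fromstring(LECTURE)  # root is the <lecture> element

# chapter/title: title is a direct sub-component of chapter
direct_titles = [t.text for t in root.findall("chapter/title")]

# //title: any title (ElementTree needs a leading "." on relative paths)
all_titles = [t.text for t in root.findall(".//title")]

# chapter//title: title is a direct or indirect sub-component of chapter
nested_titles = [t.text for t in root.findall("chapter//title")]

# chapter/paragraph[2]: the second direct paragraph of any chapter
second_paragraphs = [p.text for p in root.findall("chapter/paragraph[2]")]

# chapter/*: all direct sub-components of a chapter
chapter_children = [c.tag for c in root.findall("chapter/*")]

print(direct_titles, all_titles, nested_titles)
print(second_paragraphs, chapter_children)
```

Note that `//title` also matches the title of the lecture itself, whereas `chapter/title` and `chapter//title` only match titles below a chapter.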
8Querying XML documents
- Content-only (CO) queries
- 'open standards for digital video in distance
learning'
- Content-and-structure (CAS) queries
- //article[about(., 'formal methods verify
correctness aviation systems')]/body//section[about(., 'case study application
model checking theorem proving')]
- Structure-only (SA) queries
- /article//section/paragraph[2]
9Content-oriented XML retrieval
- Return document components of varying
granularity (e.g. a book, a chapter, a section, a
paragraph, a table, a figure, etc.), relevant to
the user's information need with regard to both
content and structure.
10Content-oriented XML retrieval
- Retrieve the best components according to
content and structure criteria
- INEX: most specific component that satisfies the
query, while being exhaustive to the query
- Shakespeare study: best entry points, which are
components from which many relevant components
can be reached through browsing
- ???
11Challenges
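[Figure: an Article whose weights for the terms XML, retrieval and authoring are unknown (?), with components Title, Section 1 and Section 2. The components carry term weights (0.9 XML, 0.5 XML, 0.2 XML, 0.4 retrieval, 0.7 authoring); further weights between 0.2 and 0.6 label the figure.]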
- no fixed retrieval unit + nested elements +
element types
- how to obtain document and collection statistics?
- which component is a good retrieval unit?
- which components contribute best to content of
Article?
- how to estimate?
- how to aggregate?
12Approaches
vector space model
Bayesian network
fusion
collection statistics
language model
cognitive model
smoothing
proximity search
tuning
belief model
Boolean model
relevance feedback
phrase
parameter estimation
probabilistic model
logistic regression
component statistics
ontology
term statistics
natural language processing
extending DB model
13Vector space model
- Separate indexes: article index, abstract index,
section index, sub-section index, paragraph index
- Results from the separate indexes are merged
- tf and idf as for fixed and non-nested retrieval
units
(IBM Haifa, INEX 2003)
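The idea of one index per element type, followed by a merge, might be sketched as follows. This is a toy illustration and not the IBM Haifa system: the tf-idf weighting, the normalisation by the best score of each index before merging, the function names and the sample texts are all assumptions made for the example.

```python
import math
from collections import Counter


def build_index(units):
    """Index one element type. units maps an element id to its text.

    tf and idf are computed inside this index only, exactly as they
    would be for a flat collection of non-nested retrieval units.
    """
    tfs = {uid: Counter(text.lower().split()) for uid, text in units.items()}
    df = Counter(term for tf in tfs.values() for term in tf)
    n = len(units)
    idf = {term: math.log(1 + n / d) for term, d in df.items()}
    return tfs, idf


def search(index, query):
    """Return {element id: tf-idf score} for elements matching the query."""
    tfs, idf = index
    q_terms = query.lower().split()
    scores = {}
    for uid, tf in tfs.items():
        s = sum(tf[t] * idf.get(t, 0.0) for t in q_terms)
        if s > 0:
            scores[uid] = s
    return scores


def merge(per_type_scores):
    """Merge the per-index result lists into one ranking.

    Assumption of this sketch: scores are divided by the best score of
    their own index so that different indexes become comparable.
    """
    merged = []
    for scores in per_type_scores.values():
        top = max(scores.values(), default=0.0)
        merged.extend((s / top, uid) for uid, s in scores.items())
    return sorted(merged, reverse=True)


# Toy data: the same article indexed at two levels of granularity.
indexes = {
    "article": build_index({
        "a1": "xml retrieval and evaluation of xml systems",
        "a2": "database query optimisation",
    }),
    "section": build_index({
        "a1/sec[1]": "xml retrieval",
        "a1/sec[2]": "evaluation of systems",
        "a2/sec[1]": "database query optimisation",
    }),
}

results = {name: search(ix, "xml retrieval") for name, ix in indexes.items()}
merged = merge(results)
print(merged)
```

Because every index sees only non-nested units of a single type, the usual tf and idf definitions apply unchanged; the nesting problem is moved into the merge step.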
14Language model
element language model, collection language
model, smoothing parameter λ
element score
high value of λ leads to increase in size of
retrieved elements
rank element: element size, element score, article score
query expansion with blind feedback
ignore short elements (threshold of 20 terms)
results with λ = 0.9, 0.5 and 0.2 similar
(University of Amsterdam, INEX 2003)
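A minimal sketch of an element score that mixes an element language model with a collection language model is given below. It assumes linear interpolation in which the smoothing parameter (called `lam` here) weights the element model; that placement, the function name and the toy data are assumptions of this example rather than details taken from the slide, and the length prior, article score and blind feedback are left out.

```python
import math
from collections import Counter


def element_score(query_terms, element_terms, collection_tf, collection_len, lam):
    """Log-probability of the query under a smoothed element language model.

    Assumed form: P(t | e) = lam * P_ml(t | element)
                             + (1 - lam) * P_ml(t | collection)
    """
    tf = Counter(element_terms)
    length = len(element_terms)
    log_p = 0.0
    for term in query_terms:
        p_elem = tf[term] / length if length else 0.0
        p_coll = collection_tf.get(term, 0) / collection_len
        p = lam * p_elem + (1.0 - lam) * p_coll
        if p == 0.0:
            # The term occurs nowhere in the collection.
            return float("-inf")
        log_p += math.log(p)
    return log_p


# Toy collection of two elements (hypothetical content).
elements = {
    "sec[1]": ["xml", "retrieval", "evaluation"],
    "sec[2]": ["xml", "xml", "authoring"],
}
collection_tf = Counter(t for terms in elements.values() for t in terms)
collection_len = sum(collection_tf.values())

query = ["xml", "retrieval"]
scores = {
    eid: element_score(query, terms, collection_tf, collection_len, lam=0.5)
    for eid, terms in elements.items()
}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Under this form, a larger `lam` penalises elements that miss a query term more heavily, which is one way to read the observation that high values favour larger elements.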
15Evaluation of XML retrieval: INEX
- Evaluating the effectiveness of content-oriented
XML retrieval approaches
- Collaborative effort: participants contribute to
the development of the collection
- queries
- relevance assessments
- Similar methodology as for TREC, but adapted to
XML retrieval
- 40 participants worldwide
- Workshop in Schloss Dagstuhl in December (20
institutions)
16INEX Test Collection
- Documents (500MB), which consist of 12,107
articles in XML format from the IEEE Computer
Society; 8 million elements
- INEX 2002
- 30 CO and 30 CAS queries
- inex2002 metric
- INEX 2003
- 36 CO and 30 CAS queries
- CAS queries are defined according to an enhanced
subset of XPath
- inex2002 and inex2003 metrics
- INEX 2004 is just starting
17Tasks
- CO: aim is to decrease user effort by pointing
the user to the most specific relevant portions
of documents.
- SCAS: retrieve relevant nodes that match the
structure specified in the query.
- VCAS: retrieve relevant nodes that may not be the
same as the target elements, but are structurally
similar.
18Relevance in XML
- An element is relevant if it has significant and
demonstrable bearing on the matter at hand
- Common assumptions in IR
- Objectivity
- Topicality
- Binary nature
- Independence
[Figure: document tree with an article, sections 1 to 3 and paragraphs 1 and 2]
19Relevance in INEX
all sections relevant → article very relevant
all sections relevant → article better than sections
one section relevant → article less relevant
one section relevant → section better than article
[Figure: article with nested sections]
- Exhaustivity
- how exhaustively a document component discusses
the query: 0, 1, 2, 3
- Specificity
- how focused the component is on the query: 0, 1,
2, 3
- Relevance
- (3,3), (2,3), (1,1), (0,0), ...
20Relevance assessment task
- Completeness
- Element → parent element, children elements
- Consistency
- Parent of a relevant element must also be
relevant, although to a different extent
- Exhaustivity increases going up the hierarchy
- Specificity decreases going up the hierarchy
- Use of an online interface
- Assessing a query takes a week!
- Average 2 topics per participant
[Figure: document tree with an article, sections 1 to 3 and paragraphs 1 and 2]
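Two of the consistency rules above lend themselves to an automatic check: the parent of a relevant element must itself be relevant, and exhaustivity must not drop when moving from an element to its parent. The sketch below covers only those two rules. The data layout (element paths mapped to (exhaustivity, specificity) pairs), the function name and the sample assessments are hypothetical and do not describe the INEX assessment interface.

```python
def inconsistencies(assessments, parent_of):
    """Return (element, reason) pairs that violate the two checked rules.

    assessments: element path -> (exhaustivity, specificity), each 0..3
    parent_of:   element path -> path of its parent (root has no entry)
    """
    problems = []
    for child, (exh, _spec) in assessments.items():
        parent = parent_of.get(child)
        if parent is None or parent not in assessments:
            continue  # root element, or parent not assessed yet
        p_exh, _p_spec = assessments[parent]
        if exh > 0 and p_exh == 0:
            problems.append((child, "parent of a relevant element is not relevant"))
        elif p_exh < exh:
            problems.append((child, "exhaustivity decreases going up"))
    return problems


# Hypothetical assessments for a small article tree.
parent_of = {
    "article/sec[1]": "article",
    "article/sec[2]": "article",
    "article/sec[2]/p[1]": "article/sec[2]",
}
assessments = {
    "article": (2, 1),
    "article/sec[1]": (0, 0),
    "article/sec[2]": (1, 2),
    "article/sec[2]/p[1]": (2, 3),  # more exhaustive than its parent
}

problems = inconsistencies(assessments, parent_of)
print(problems)
```

Specificity is deliberately not checked here, since the slide states its behaviour as a tendency rather than as a rule that a program could enforce.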
21Interface
[Screenshot: online assessment interface showing current assessments, groups and navigation]
22Assessments
- With respect to the elements to assess
- 26 assessments on elements in the pool (66
in INEX 2002).
- 68 highly specific elements not in the pool
- 7 elements automatically assessed
- INEX 2002
- 23 inconsistent assessments per query for one
rule
23Metrics
- Need to consider
- Two dimensions of relevance
- Independence assumption does not hold
- No predefined retrieval unit
- Overlap
- Linear vs. clustered ranking
[Figure: article with nested section]
24INEX 2002 metric
- Quantization
- strict
- generalised
25INEX 2002 metric
- Precision as defined by Raghavan89 (based on
expected search length, ESL)
- where n is estimated
26Overlap problem
27INEX 2003 metric
- Ideal concept space (Wong & Yao 95)
[Figure: component c and topic t in the concept space]
28INEX 2003 metric
- Quantization
- strict
- generalised
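A quantisation function maps an (exhaustivity, specificity) assessment to a single relevance value. The sketch below shows the strict variant, which accepts only (3,3), and a generalised variant that gives partial credit. The partial-credit weights are written down from memory of the INEX 2003 definition and should be treated as an assumption to be checked against the official documentation; the function and table names are made up for this example.

```python
def quant_strict(exh, spec):
    """Strict quantisation: only highly exhaustive and highly specific
    elements, i.e. (3, 3), count as relevant."""
    return 1.0 if (exh, spec) == (3, 3) else 0.0


# Assumed partial-credit table for the generalised quantisation.
# Verify these weights against the official INEX definition before use.
GENERALISED = {
    (3, 3): 1.0,
    (2, 3): 0.75, (3, 2): 0.75, (3, 1): 0.75,
    (1, 3): 0.5, (2, 2): 0.5, (2, 1): 0.5,
    (1, 2): 0.25, (1, 1): 0.25,
    (0, 0): 0.0,
}


def quant_generalised(exh, spec):
    """Generalised quantisation: graded credit for partly relevant elements."""
    return GENERALISED[(exh, spec)]


# The example assessments listed on the relevance slide.
for pair in [(3, 3), (2, 3), (1, 1), (0, 0)]:
    print(pair, quant_strict(*pair), quant_generalised(*pair))
```

The strict function suits evaluations that only reward the best possible components, while the generalised function also credits near misses.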
29INEX 2003 metric
30INEX 2003 metric
31INEX 2003 metric
- Penalises overlap by only scoring novel
information in overlapping results
- Assume uniform distribution of relevant
information
- Issue of stability
- Size considered directly in precision (is it
intuitive that large is good or not?)
- Recall defined using exh only
- Precision defined using spec only
32Alternative metrics
- User-effort oriented measures
- Expected Relevant Ratio
- Tolerance to Irrelevance
- Discounted Cumulated Gain
33Lessons learnt
- Good definition of relevance
- Expressing CAS queries was not easy
- Relevance assessment process must be improved
- Further development on metrics needed
- User studies required
34Conclusion
- XML retrieval is not just about the effective
retrieval of XML documents, but also about how to
evaluate effectiveness
- INEX 2004 tracks
- Relevance feedback
- Interactive
- Heterogeneous collection
- Natural language query
http://inex.is.informatik.uni-duisburg.de:2004/