Title: Alvis status report: Index Data
Alvis status report: Index Data
Annual meeting, 23 January 2006
Check out the exciting things to come!
1. Technical contribution
2. Status of tasks
3. Status of milestones
4. Status of deliverables
5. Contribution to other work packages
Alvis status report Index Data
Mike Taylor <mike_at_indexdata.com>
1. Technical contribution
Metadata formats
  Fat Peer description format complete
  Enriched Document format complete
Architecture
  Many details of fat-peer architecture resolved
1. Technical contribution
Indexing Engine (Zebra)
  Reduced performance bottlenecks
  Support for 2^64 word and document occurrences
  Improved indexer performance (approx. 1000 docs/s)
  Improved boolean 'and' search performance
  Implemented approximate hit counts
  Created XML/XSLT indexing input filter
  Fixed truncation error found in WP8 testing
  Set up HP 64-bit dual AMD Opteron box for load testing
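For readers unfamiliar with why boolean 'and' performance matters, here is a minimal sketch of the textbook approach: intersecting two sorted lists of document IDs with a linear merge. It is an illustration only and makes no claim about how Zebra implements the operation; the function name `intersect_postings` is hypothetical.

```python
def intersect_postings(a, b):
    """Intersect two ascending lists of document IDs.

    Illustrative only: a generic linear merge, not Zebra's code.
    """
    i, j, out = 0, 0, []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            # Document contains both terms.
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


if __name__ == "__main__":
    print(intersect_postings([1, 3, 5, 7], [3, 4, 5, 8]))
```

The merge visits each posting at most once, so the cost grows with the combined length of the two lists rather than with their product.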
2. Status of tasks
Task 3.1 - Network Node Metadata Framework.
  Designed and documented (see D3.1), not yet deployed
Task 3.2 - Semantic Document Metadata Framework.
  Designed, documented and tested, except WP5 contribution
Task 3.3 - Database Engine Framework.
  Several releases made
  Documentation and further development work required
2. Status of tasks
Task 3.4 - Semantic Indexing Support.
  Prototype facilities are complete
  Provided XML/XSLT-based indexing specification
  Documentation and further development work required
Task 3.5 - Distributed Network Integration.
  Indexing engine integrated with processing pipeline
  Integration between fat peers still to be done
3. Status of milestones
MS3.1. (M6) First version of network node metadata framework.
MS3.2. (M12) First version of semantic document format.
MS3.3. (M12) Database engine framework initial release.
MS3.4. (M18) Semantic indexing support complete in DB engine.
MS3.5. (M18) Network node metadata framework complete.
MS3.6. (M20) Semantic document metadata framework complete.
MS3.7. (M24) Database engine participating in ALVIS network.
All achieved except MS3.4 ... "complete" is overstating it.
MS3.8 (feature lockdown) is still to come.
4. Status of deliverables
D3.1. (M24) Report on metadata frameworks, including concrete representations, for network nodes and semantic document analyses.
  Delivered this morning :-)
D3.2. (M36) Database engine framework extended to support external semantic indexing modules and the ALVIS network architecture, fully documented and packaged.
  To follow
5. Contribution to other WPs
WP2 (Document Probability Model)
  Implemented static ranking plugin API for Zebra
  Dynamic relevance-scoring plugin API for Zebra
  Experimental support for various TF-IDF algorithms
  Fuzzy set ranking and hit-set computation
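As a reminder of what a TF-IDF relevance score looks like, the sketch below uses one common variant (length-normalised term frequency multiplied by the logarithm of inverse document frequency). The slides do not say which variants were tried in Zebra, so the formula, the function name `tf_idf_scores` and the sample data are all assumptions made for illustration.

```python
import math
from collections import Counter


def tf_idf_scores(query_terms, docs):
    """Score each document against the query with a simple TF-IDF variant.

    docs maps a document ID to its list of tokens. Hypothetical helper,
    not part of Zebra's plugin API.
    """
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in docs.values():
        df.update(set(tokens))

    scores = {}
    for doc_id, tokens in docs.items():
        tf = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if tf[term]:
                idf = math.log(n / df[term])
                score += (tf[term] / len(tokens)) * idf
        scores[doc_id] = score
    return scores


if __name__ == "__main__":
    sample = {
        "a": ["zebra", "index", "zebra"],
        "b": ["index", "search"],
        "c": ["search", "peer"],
    }
    print(tf_idf_scores(["zebra"], sample))
```

A term that occurs in every document gets an inverse document frequency of zero under this variant, which is why common words contribute nothing to the score.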
5. Contribution to other WPs
WP4 (Distributed Search)
  Contribution to architecture decisions
  Wrote use-case document (Digital library system)
  CQL query-trickling design
  Prototype P2P hit-set merge module
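The idea behind a hit-set merge can be sketched as follows: each peer returns hits already sorted by descending score, and the merger interleaves them into one ranked list while dropping duplicates. This is an illustrative sketch, not the prototype module itself. It assumes scores from different peers are directly comparable, and the name `merge_hit_sets` and the `(score, doc_id)` tuple layout are hypothetical.

```python
import heapq


def merge_hit_sets(hit_sets, limit=10):
    """Merge per-peer hit lists into one ranked list of at most `limit` hits.

    Each hit set is a list of (score, doc_id) tuples sorted by descending
    score. Assumes scores are comparable across peers.
    """
    merged = heapq.merge(*hit_sets, key=lambda hit: hit[0], reverse=True)
    seen = set()
    out = []
    for score, doc_id in merged:
        if doc_id in seen:
            # Same document reported by more than one peer.
            continue
        seen.add(doc_id)
        out.append((score, doc_id))
        if len(out) == limit:
            break
    return out


if __name__ == "__main__":
    peer_a = [(0.9, "d1"), (0.4, "d3")]
    peer_b = [(0.7, "d2"), (0.4, "d3"), (0.1, "d4")]
    print(merge_hit_sets([peer_a, peer_b], limit=3))
```

Because `heapq.merge` is lazy, the merger only pulls as many hits from each peer as it needs to fill the requested page.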
5. Contribution to other WPs
WP7 (Topic Specific Crawl)
  Debian packaging of WP7 Crawler
  Integration of crawler into pipeline.
  ZOOM-Perl API for feeding harvested documents into indexer
5. Contribution to other WPs
WP8 (Integration and Evaluation)
  Pipeline protocols and implementation
  AlvisPipeline Perl module
  Pipeline-testing GUI client, DC-TUNES.
  Built local testing database containing 7 GB of Wikipedia
  Snippet generation plugin API for Zebra
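Snippet generation means showing a short extract of a matching document around the query terms. The sketch below shows one simple way to do it: a fixed-width window placed around the earliest matching term. It is not the Zebra plugin API, and the function name `make_snippet` and its `width` parameter are assumptions for illustration.

```python
def make_snippet(text, query_terms, width=60):
    """Return up to `width` characters of text around the earliest query term.

    Illustrative only: case-insensitive substring matching, no stemming.
    """
    lowered = text.lower()
    positions = [lowered.find(term.lower()) for term in query_terms]
    positions = [p for p in positions if p >= 0]
    if not positions:
        # No term found: fall back to the start of the document.
        return text[:width]
    # Place the window so the earliest match sits near its middle.
    start = max(0, min(positions) - width // 2)
    end = min(len(text), start + width)
    prefix = "..." if start > 0 else ""
    suffix = "..." if end < len(text) else ""
    return prefix + text[start:end] + suffix


if __name__ == "__main__":
    print(make_snippet("Zebra is an indexing engine", ["indexing"], width=100))
```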
5. Contribution to other WPs
WP10 (Dissemination and Exploitation)
  Paper presented to the IDDI session of DEXA 2005: "Searching very large bodies of data using a transparent peer-to-peer proxy".
  ZOOM-Perl and AlvisPipeline modules on CPAN
5. Contribution to other WPs
WP11 (Demonstration)
  Preparation of software to participate in demonstration
  Guidance of demo-system integration efforts
The End ... finally
I can't believe we have to sit through eleven of
these presentations.