CSCI 586 Paper Presentation

Provided by: dennyz

1
CSCI 586 Paper Presentation

2
Jena

3
Jena (contd.)

4
Architecture

5
Storage Schema: Storing Arbitrary RDF Statements

Jena 1:
- Two different database schemas, one for relational databases and one for BerkeleyDB
- The relational database schema was normalized
- The BerkeleyDB schema was denormalized
- Graphs stored using BerkeleyDB were accessed faster

6
Storage Schema: Storing Arbitrary RDF Statements (contd.)

7
Storage Schema: Storing Arbitrary RDF Statements (contd.)

8
Storage Schema: Storing Arbitrary RDF Statements (contd.)

9
Storage Schema: Storing Arbitrary RDF Statements (contd.)

Advantage: Many find operations can be performed without a join (less time required)
Disadvantage: More database space is consumed. Addressed by:
- Compressing common prefixes (replacing them with database references)
- Storing long values only once (configurable threshold)
- Not storing the property URI
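
The join trade-off above can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The table and column names (`terms`, `stmt_norm`, `stmt_denorm`) are hypothetical and are not Jena's actual schema. The sketch only shows that a normalized layout has to join back to a side table to answer a find pattern, while a denormalized layout answers it from one table at the cost of repeating the text in every row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized layout (hypothetical): statements hold ids, text lives in a side table.
    CREATE TABLE terms (id INTEGER PRIMARY KEY, text TEXT UNIQUE);
    CREATE TABLE stmt_norm (subj INTEGER, prop INTEGER, obj INTEGER);
    -- Denormalized layout (hypothetical): statements hold the text directly.
    CREATE TABLE stmt_denorm (subj TEXT, prop TEXT, obj TEXT);
""")


def term_id(text):
    """Return the id of a term in the normalized layout, inserting it if needed."""
    conn.execute("INSERT OR IGNORE INTO terms (text) VALUES (?)", (text,))
    return conn.execute("SELECT id FROM terms WHERE text = ?", (text,)).fetchone()[0]


def add(subj, prop, obj):
    """Store one statement in both layouts so they can be compared."""
    conn.execute("INSERT INTO stmt_norm VALUES (?, ?, ?)",
                 (term_id(subj), term_id(prop), term_id(obj)))
    conn.execute("INSERT INTO stmt_denorm VALUES (?, ?, ?)", (subj, prop, obj))


def find_norm(prop):
    """Find by property in the normalized layout: three joins to recover the text."""
    return conn.execute("""
        SELECT s.text, p.text, o.text
        FROM stmt_norm t
        JOIN terms s ON s.id = t.subj
        JOIN terms p ON p.id = t.prop
        JOIN terms o ON o.id = t.obj
        WHERE p.text = ?
        ORDER BY s.text""", (prop,)).fetchall()


def find_denorm(prop):
    """Find by property in the denormalized layout: a single-table scan, no join."""
    return conn.execute(
        "SELECT subj, prop, obj FROM stmt_denorm WHERE prop = ? ORDER BY subj",
        (prop,)).fetchall()


add("ex:doc1", "dc:title", "Jena 2")
add("ex:doc1", "dc:creator", "ex:alice")
add("ex:doc2", "dc:creator", "ex:bob")
```

Both `find_norm("dc:creator")` and `find_denorm("dc:creator")` return the same two statements; only the amount of work pushed onto the database differs.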

10
Storage Schema: Optimizing for Common Statement Patterns

Common patterns arise from the RDF specifications (rdf:predicate, rdf:object, ...) or from user data
The revised RDF specification allows multiple reified instances of any statement
Jena 2: A property table stores subject-value pairs related by a particular property. It stores all instances of that property in the graph
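
A property table can be sketched as follows, again with sqlite3. The names `statements`, `prop_dc_title` and `PROPERTY_TABLES` are made up for illustration and do not come from Jena. The assumption is that every statement using the chosen property goes to a two-column table, so the property URI itself is never stored in a row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- General statement table for arbitrary statements (hypothetical layout).
    CREATE TABLE statements (subj TEXT, prop TEXT, obj TEXT);
    -- Property table: subject-value pairs for one property, no property column.
    CREATE TABLE prop_dc_title (subj TEXT, obj TEXT);
""")

# Fixed mapping from property URI to its property table (hypothetical).
PROPERTY_TABLES = {"dc:title": "prop_dc_title"}


def add(subj, prop, obj):
    """Route a statement to its property table if one exists, else to the general table."""
    table = PROPERTY_TABLES.get(prop)
    if table is not None:
        conn.execute(f"INSERT INTO {table} VALUES (?, ?)", (subj, obj))
    else:
        conn.execute("INSERT INTO statements VALUES (?, ?, ?)", (subj, prop, obj))


def find_by_property(prop):
    """Return all statements with the given property as (subj, prop, obj) tuples."""
    table = PROPERTY_TABLES.get(prop)
    if table is not None:
        rows = conn.execute(f"SELECT subj, obj FROM {table} ORDER BY subj").fetchall()
    else:
        rows = conn.execute(
            "SELECT subj, obj FROM statements WHERE prop = ? ORDER BY subj",
            (prop,)).fetchall()
    return [(s, prop, o) for s, o in rows]


add("ex:doc1", "dc:title", "Jena 2")
add("ex:doc2", "dc:title", "RDF Storage")
add("ex:doc1", "dc:creator", "ex:alice")
```

Here `find_by_property("dc:title")` reads only the property table, and the general table contains no `dc:title` rows at all.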

11
Jena 2 Persistence Architecture: Specialized Graph Interface

The higher-level Graph interface supports the usual add, delete and find operations
Each logical graph is implemented as a list of specialized graphs (individually optimized)
Any operation on the entire logical graph is processed by invoking the operation on each specialized graph in turn
The results are combined and returned as the result for the entire graph
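
The composition described above might look roughly like the following in-memory sketch. The class names and the `accepts` predicate are hypothetical, and the rule that an added triple goes to the first specialized graph that accepts it is an assumption made for the example, not something the slides state.

```python
def matches(pattern, triple):
    """A pattern is a (subj, prop, obj) tuple where None acts as a wildcard."""
    return all(p is None or p == v for p, v in zip(pattern, triple))


class SpecializedGraph:
    """In-memory stand-in for one specialized graph (hypothetical)."""

    def __init__(self, accepts):
        self.accepts = accepts  # predicate deciding which triples this graph stores
        self.triples = set()

    def add(self, triple):
        if self.accepts(triple):
            self.triples.add(triple)
            return True
        return False

    def delete(self, triple):
        self.triples.discard(triple)

    def find(self, pattern):
        return [t for t in sorted(self.triples) if matches(pattern, t)]


class LogicalGraph:
    """A logical graph backed by an ordered list of specialized graphs."""

    def __init__(self, specialized):
        self.specialized = list(specialized)

    def add(self, triple):
        # Assumption: the first specialized graph that accepts the triple stores it.
        for graph in self.specialized:
            if graph.add(triple):
                return
        raise ValueError("no specialized graph accepted the triple")

    def delete(self, triple):
        for graph in self.specialized:
            graph.delete(triple)

    def find(self, pattern):
        # Invoke find on each specialized graph in turn and combine the results.
        results = []
        for graph in self.specialized:
            results.extend(graph.find(pattern))
        return results


titles = SpecializedGraph(lambda t: t[1] == "dc:title")  # optimized for one property
general = SpecializedGraph(lambda t: True)               # catch-all graph
graph = LogicalGraph([titles, general])
graph.add(("ex:doc1", "dc:title", "Jena 2"))
graph.add(("ex:doc1", "dc:creator", "ex:alice"))
```

A call such as `graph.find(("ex:doc1", None, None))` returns the title statement from the first specialized graph followed by the creator statement from the catch-all graph.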

12
Jena 2 Persistence Architecture: Specialized Graph Interface

13
Jena 2 Persistence Architecture: Specialized Graph Interface

Optimization:
- An operation need not use all graphs
- A find operation can be done in a lazy fashion, continuing only if more information is needed
- The overhead of running the operation over the database is amortized across several graphs
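
Lazy evaluation of find can be sketched with a Python generator. This is only an illustration of the idea: the `lazy_find` function, the graph names and the `log` list are invented for the example. A later graph is queried only when the consumer asks for more results than the earlier graphs supplied.

```python
from itertools import islice


def lazy_find(graphs, pattern, log):
    """Yield matches graph by graph; a graph is touched only when it is reached."""
    for name, triples in graphs:
        log.append(name)  # record that this graph was actually queried
        for triple in triples:
            if all(p is None or p == v for p, v in zip(pattern, triple)):
                yield triple


graphs = [
    ("titles", [("ex:doc1", "dc:title", "Jena 2")]),
    ("general", [("ex:doc1", "dc:creator", "ex:alice")]),
]

# The consumer needs only one answer, so the second graph is never queried.
queried = []
first = list(islice(lazy_find(graphs, ("ex:doc1", None, None), queried), 1))

# Asking for everything visits both graphs.
queried_all = []
everything = list(lazy_find(graphs, ("ex:doc1", None, None), queried_all))
```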

14
Jena 2 Persistence Architecture: Database Driver

A generic driver implementation covers SQL databases
Engine-specific drivers handle other databases
Engine-specific drivers override the general methods as necessary
Drivers are responsible for tasks such as database initialization, table creation and deletion
Drivers map Java objects to their database encoding
Drivers use static and dynamically generated SQL for data manipulation
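
The generic-plus-override arrangement can be sketched as a small class hierarchy. The sketch is in Python rather than Java, and every class name, method name, column type and the `U:`/`L:` encoding prefix is an assumption for illustration, not Jena's driver API.

```python
class GenericSQLDriver:
    """Generic driver sketch: produces SQL that most engines would accept."""

    def create_table_sql(self, table):
        # Table creation is one of the driver's responsibilities.
        return (f"CREATE TABLE {table} "
                "(subj VARCHAR(250), prop VARCHAR(250), obj VARCHAR(250))")

    def insert_sql(self, table):
        # Dynamically generated SQL with placeholders for the encoded values.
        return f"INSERT INTO {table} VALUES (?, ?, ?)"

    def encode(self, node):
        """Map an in-memory (kind, value) pair to its database encoding (hypothetical scheme)."""
        kind, value = node
        return ("U:" if kind == "uri" else "L:") + value


class HypotheticalEngineDriver(GenericSQLDriver):
    """Engine-specific driver: overrides only the methods that differ."""

    def create_table_sql(self, table):
        # This imaginary engine prefers unbounded TEXT columns.
        return f"CREATE TABLE {table} (subj TEXT, prop TEXT, obj TEXT)"


generic = GenericSQLDriver()
specific = HypotheticalEngineDriver()
```

The engine-specific driver inherits `insert_sql` and `encode` unchanged and replaces only table creation, which is the kind of selective override the slide describes.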

15
Jena 2 Persistence Architecture: Configuration and Meta-graphs

Configuration parameters are specified as RDF statements (Jena 1 used a configuration file)
This is analogous to storing metadata for relational databases in tables
Default graphs are provided with parameters
The meta-graph contains metadata about each logical graph and can be queried but not modified
The meta-graph contains configuration parameters and other metadata such as the driver version and the list of graphs stored
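
A queryable but read-only meta-graph could be sketched like this. The `MetaGraph` class, the `sys:` property names and the sample values are all invented for the example and should not be read as Jena's actual vocabulary or settings.

```python
class MetaGraph:
    """Read-only view over configuration and metadata triples (sketch)."""

    def __init__(self, triples):
        self._triples = frozenset(triples)  # immutable once constructed

    def find(self, subj=None, prop=None, obj=None):
        """Return matching triples in sorted order; None is a wildcard."""
        pattern = (subj, prop, obj)
        return sorted(t for t in self._triples
                      if all(p is None or p == v for p, v in zip(pattern, t)))

    def add(self, triple):
        raise PermissionError("the meta-graph can be queried but not modified")

    def delete(self, triple):
        raise PermissionError("the meta-graph can be queried but not modified")


# Made-up configuration and metadata, expressed as statements rather than a config file.
meta = MetaGraph([
    ("sys:db", "sys:driverVersion", "2.0"),
    ("sys:db", "sys:hasGraph", "sys:graph1"),
    ("sys:graph1", "sys:longValueThreshold", "250"),
])
```

Because the configuration is itself a graph, the same find operation used for data answers questions such as `meta.find(prop="sys:hasGraph")`.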

16
Jena Query Processing: Find Processing

The pattern to be evaluated is passed to each specific graph handler
Searching is done individually on each graph and a completion flag is set at the end
The results are concatenated and returned to the application
Each specialized graph issues a separate query for the pattern (a single query might be unwieldy for large databases)
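
One possible reading of the completion flag is sketched below: each specialized graph returns its matches together with a flag saying that no other graph can hold further matches, which lets the caller stop early. This interpretation, the `owned_property` attribute and the `queries` counter are assumptions made for the example.

```python
def matches(pattern, triple):
    """None in the pattern acts as a wildcard."""
    return all(p is None or p == v for p, v in zip(pattern, triple))


class SpecializedGraph:
    """Stand-in for a specialized graph that reports a completion flag (hypothetical)."""

    def __init__(self, triples, owned_property=None):
        self.triples = list(triples)
        self.owned_property = owned_property  # property stored exclusively here, if any
        self.queries = 0                      # how many times this graph was searched

    def find(self, pattern):
        self.queries += 1
        found = [t for t in self.triples if matches(pattern, t)]
        # Complete when the pattern fixes the property this graph exclusively owns.
        complete = (self.owned_property is not None
                    and pattern[1] == self.owned_property)
        return found, complete


def find(graphs, pattern):
    """Search each graph individually and concatenate the results."""
    results = []
    for graph in graphs:
        found, complete = graph.find(pattern)
        results.extend(found)
        if complete:
            break  # the flag says no remaining graph can contribute
    return results


titles = SpecializedGraph([("ex:doc1", "dc:title", "Jena 2")], owned_property="dc:title")
general = SpecializedGraph([("ex:doc1", "dc:creator", "ex:alice")])
title_hits = find([titles, general], (None, "dc:title", None))
```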

17
Jena Query Processing: RDQL Processing

Jena 1 converted RDQL queries into a pipeline of find patterns connected by join variables, which was evaluated in a nested fashion
Jena 2 tries to push the join into the database engine. The goal is to convert the patterns into a single query to be evaluated by the database engine
In cases where a single query is not possible, a combination of the two methods is used
Queries in Jena 2 may span graphs
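
The difference between the two strategies can be sketched with sqlite3. The query below has two patterns, (?doc, dc:creator, ?who) and (?who, foaf:name, ?name), joined on the variable ?who. The single `stmt` table, the sample data and the function names are assumptions for illustration; the SQL is hand-written here, not output of Jena's query translator.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stmt (subj TEXT, prop TEXT, obj TEXT)")
conn.executemany("INSERT INTO stmt VALUES (?, ?, ?)", [
    ("ex:doc1", "dc:creator", "ex:alice"),
    ("ex:doc2", "dc:creator", "ex:bob"),
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:bob", "foaf:name", "Bob"),
])


def find(subj=None, prop=None, obj=None):
    """Evaluate one find pattern as one query; None is a wildcard."""
    sql, args = "SELECT subj, prop, obj FROM stmt WHERE 1=1", []
    for col, val in (("subj", subj), ("prop", prop), ("obj", obj)):
        if val is not None:
            sql += f" AND {col} = ?"
            args.append(val)
    return conn.execute(sql + " ORDER BY subj", args).fetchall()


def nested_pipeline():
    """Nested style: one find per pattern, joined in application code on ?who."""
    out = []
    for doc, _, who in find(prop="dc:creator"):
        for _, _, name in find(subj=who, prop="foaf:name"):
            out.append((doc, name))
    return out


def single_query():
    """Pushed-down style: both patterns become one SQL self-join."""
    return conn.execute("""
        SELECT a.subj, b.obj
        FROM stmt a JOIN stmt b ON b.subj = a.obj
        WHERE a.prop = 'dc:creator' AND b.prop = 'foaf:name'
        ORDER BY a.subj""").fetchall()
```

Both functions return the same bindings, but the nested version issues one query per intermediate binding of ?who, while the single-query version lets the database engine perform the join.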

18
Miscellaneous Topics (in work)

Jena 2 Performance Toolkit: Utility programs (data generator, benchmark suite, RDF data analysis tool)
Jena Transaction Management: Richer transaction interface
Bulk Load: Reduction in time to load persistent graphs (denormalized schema, JDBC2, query optimization)

19
Conclusion

Jena 2's denormalized schema is faster than the normalized schema of Jena 1
Increased database size (might be offset by several techniques being implemented, such as URI prefix compression and property class tables for reification)
A comprehensive study of RDQL is still to be performed
Lots of future work is expected in this field. Jena 2, though more efficient than Jena 1, still has a long way to go