Title: Shahriyar Hossain, Munirul Islam, Jesmin, Hasan M Jamil
1 PhyQL: A Phylogenetic Visual Query Engine
- Shahriyar Hossain, Munirul Islam, Jesmin, Hasan M Jamil
- Integration Informatics Laboratory, Computer Science, Wayne State University
- Department of Genetic Engineering and Biotechnology, University of Dhaka, Bangladesh
- BIBM 2008
2 What is a Phylogenetic Tree?
4 Queries
<root>
  <node>rayfinned fish</node>
  <inode>
    <node>lungfish</node>
    <inode>
      <inode>
        <node>salamanders</node>
        <node>frogs</node>
      </inode>
      . . .
    </inode>
  </inode>
</root>
for $root in doc("tree.xml")//root return
  <span> <h1> { $root/node/text() } </h1> </span>
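The query above can be tried without an XQuery processor. The following is a rough Python sketch of the same selection, assuming the tree is available as a string; `TREE_XML` and `root_headings` are illustrative names, and the elided part of the tree (the dots) is simply left out.

```python
import xml.etree.ElementTree as ET

# Illustrative copy of the slide's tree; the elided siblings are omitted.
TREE_XML = """<root>
  <node>rayfinned fish</node>
  <inode>
    <node>lungfish</node>
    <inode>
      <inode>
        <node>salamanders</node>
        <node>frogs</node>
      </inode>
    </inode>
  </inode>
</root>"""

def root_headings(xml_text):
    """For every <root> element, wrap the text of its direct <node> children."""
    doc = ET.fromstring(xml_text)
    headings = []
    # iter("root") visits the document element and any nested <root> elements,
    # which is the closest ElementTree counterpart of //root.
    for root in doc.iter("root"):
        text = "".join(n.text or "" for n in root.findall("node"))
        headings.append("<span><h1>" + text + "</h1></span>")
    return headings

print(root_headings(TREE_XML))
```

Only the direct `node` child of `root` is selected, so the leaves nested inside `inode` elements do not appear in the heading.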
5 Phylogenetic Query Language
- Select: select the subset of trees that match given criteria
- Join: join two trees based on a pair of nodes
- Subset: retrieve part of a given tree
6 Tree Join
Using Path Operators
SubTree Projection
7 PhyQL
[Architecture diagram. Labels: User; Visual Query Interface (SELECT, JOIN, SUBTREE); Translator; XSB; DB; Wrappers; XML / NEXUS from user / interoperable databases]
8 Why XSB?
- Eliminates the left-recursion problem
  - path(X,Z) :- path(X,Y), edge(Y,Z)
- Stores intermediate results (by the tabling method)
- Model-based (the order in which rules are written doesn't matter)
  - path(X,Y) :- edge(X,Y)
  - path(X,Z) :- path(X,Y), edge(Y,Z)
- Its in-memory database queries are an order of magnitude faster than systems such as tuProlog
- :- odbc_import(conn, 'tbl_treeinfo'('rootId', 'author'), tree).
- :- odbc_import(conn, 'tbl_nodeinfo'('nodeId', 'nodename'), node).
- :- odbc_import(conn, 'tbl_edge'('parentId', 'childId'), edge).
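To illustrate why remembering answers makes the left-recursive path rule safe, here is a small Python sketch that evaluates the two path rules bottom-up until no new answers appear. It is not XSB's actual resolution procedure, only an analogy for the role of the answer table; the edge data and the name `transitive_closure` are assumptions made for the example. A top-down Prolog without tabling would loop on the same rules.

```python
def transitive_closure(edges):
    """Least fixpoint of:
         path(X,Y) :- edge(X,Y).
         path(X,Z) :- path(X,Y), edge(Y,Z).
    """
    path = set(edges)        # the "table": every answer found so far
    frontier = set(edges)    # answers not yet joined with edge/2
    while frontier:
        derived = {(x, z)
                   for (x, y) in frontier
                   for (y2, z) in edges
                   if y == y2}
        frontier = derived - path   # keep only answers that are really new
        path |= frontier
    return path

# Hypothetical edges containing a cycle b -> c -> d -> b.
edges = {("a", "b"), ("b", "c"), ("c", "d"), ("d", "b")}
closure = transitive_closure(edges)
print(sorted(closure))
```

Because an answer already in the table is never re-derived, the loop stops even though the graph is cyclic and the rule is left-recursive.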
9
<tree author="stern">
  <node type="">
    <node type="?">
      <node> Stanhopea_gibbosa </node>
      <node> Stanhopea_vasquezii </node>
    </node>
    <node> Stanhopea_shuttleworthii </node>
  </node>
</tree>
node(Y1, 'Stanhopea_shuttleworthii'), node(Y2, 'Stanhopea_gibbosa'), node(Y3, 'Stanhopea_vasquezii'),
edge(Y4,Y2), edge(Y4,Y3), lca(Y0,Y4,Y1), edge(Y0,Y1)
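The translated goal can be read as a pattern match over node and edge facts. The sketch below checks that pattern in Python against a tiny fact base; the numeric node ids, the helper names, and the way `lca` is computed (walking parent links) are all assumptions for illustration, not the rules PhyQL actually ships.

```python
# Hypothetical facts; only the three species names come from the slide.
node = {1: "Stanhopea_shuttleworthii",
        2: "Stanhopea_gibbosa",
        3: "Stanhopea_vasquezii"}
edge = {(0, 4), (0, 1), (4, 2), (4, 3)}      # (parent, child)
parent = {child: par for (par, child) in edge}

def ancestors(n):
    """Return n followed by its ancestors, ending at the root."""
    chain = [n]
    while chain[-1] in parent:
        chain.append(parent[chain[-1]])
    return chain

def lca(a, b):
    """Lowest common ancestor of a and b, or None if they share none."""
    seen = set(ancestors(a))
    for n in ancestors(b):
        if n in seen:
            return n
    return None

def match():
    """Bind Y0..Y4 as in the goal, or return None if the pattern fails."""
    ids = {name: i for i, name in node.items()}
    y1 = ids["Stanhopea_shuttleworthii"]
    y2 = ids["Stanhopea_gibbosa"]
    y3 = ids["Stanhopea_vasquezii"]
    y4 = parent.get(y2)
    if y4 is None or parent.get(y3) != y4:   # edge(Y4,Y2), edge(Y4,Y3)
        return None
    y0 = lca(y4, y1)                         # lca(Y0,Y4,Y1)
    if y0 is None or (y0, y1) not in edge:   # edge(Y0,Y1)
        return None
    return {"Y0": y0, "Y1": y1, "Y2": y2, "Y3": y3, "Y4": y4}

print(match())
```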
12 Integration Informatics Research Group
14 Summary
- PhyQL offers a simple web-based visual query interface
- Logic-based tree query operations
- Modifying the query tools only requires changing the logic rules
- The proposed architecture can also be applied to protein-protein interaction networks, metabolic pathways, etc.
- Future Work
  - Database interoperability: allow retrieving and integrating phylogenetic data during query submission
  - ReQuery: query on the result set
  - Tree similarity estimation
15 Thank You!
me http://homopan.wayne.edu/PhD Students/Munirul Islam/index.htm
16 Uses of Phylogenetic Trees
- Date events of divergence of species
- What is the most recent common ancestor of all living species?
- Identify geographic origins of new disease outbreaks
17 Crimson
- Uses nested subtrees to avoid long strings
- Zheng, Y., S. Fisher, S. Cohen, S. Guo, J. Kim, and S. B. Davidson. 2006. Crimson: A Data Management System to Support Evaluating Phylogenetic Tree Reconstruction Algorithms. 32nd International Conference on Very Large Data Bases, ACM, pp. 1231-1234.
18 Dewey system
19 Find clade for Z (<CSDs)
Find common pattern starting from left
SELECT * FROM nodes WHERE (path LIKE '0.2.1%')
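The Dewey-label lookup can be tried out with an in-memory SQLite table. This is a sketch under assumed data: the node names and paths are invented, and `clade`, `naive_clade`, and `common_prefix` are hypothetical helpers. It also shows one caveat of a bare prefix pattern: a label such as 0.2.10 shares the textual prefix 0.2.1 without being in that clade, so the stricter version matches the label itself or the label followed by a dot.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nodes (name TEXT, path TEXT)")
conn.executemany("INSERT INTO nodes VALUES (?, ?)", [
    ("root", "0"), ("A", "0.1"), ("B", "0.2"),
    ("C", "0.2.1"), ("D", "0.2.1.1"), ("E", "0.2.1.2"),
    ("F", "0.2.2"), ("G", "0.2.10"),          # hypothetical tree
])

def naive_clade(prefix):
    """Bare prefix match; also picks up unrelated labels like 0.2.10."""
    cur = conn.execute(
        "SELECT name FROM nodes WHERE path LIKE ? ORDER BY path",
        (prefix + "%",))
    return [row[0] for row in cur]

def clade(prefix):
    """The node labelled `prefix` plus everything below it."""
    cur = conn.execute(
        "SELECT name FROM nodes WHERE path = ? OR path LIKE ? ORDER BY path",
        (prefix, prefix + ".%"))
    return [row[0] for row in cur]

def common_prefix(a, b):
    """Longest shared run of label components, read from the left."""
    shared = []
    for x, y in zip(a.split("."), b.split(".")):
        if x != y:
            break
        shared.append(x)
    return ".".join(shared)

print(naive_clade("0.2.1"), clade("0.2.1"), common_prefix("0.2.1.1", "0.2.2"))
```

The shared left-hand prefix of two labels is the label of their lowest common ancestor, which is what makes clade lookup a simple string comparison in this scheme.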
20 Depth-first traversal scoring each node with a left and right ID
21 Minimum Spanning Clade of Node 5
SELECT * FROM nodes INNER JOIN nodes AS include
ON (nodes.left_id BETWEEN include.left_id AND include.right_id)
WHERE include.node_id = 5
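The left/right numbering and the clade query can be sketched end to end in Python with SQLite. The tree shape in `children` and the function names are assumptions made for the example; only the column names and the join condition follow the slide.

```python
import sqlite3

# Hypothetical tree: node id -> list of child ids.
children = {1: [2, 5], 2: [3, 4], 5: [6, 7], 7: [8, 9]}

def number_nodes(root):
    """Depth-first traversal assigning (node_id, left_id, right_id)."""
    rows, counter = [], 0
    def visit(n):
        nonlocal counter
        counter += 1
        left = counter                 # set on the way down
        for child in children.get(n, []):
            visit(child)
        counter += 1                   # right id, set on the way back up
        rows.append((n, left, counter))
    visit(root)
    return rows

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE nodes (node_id INTEGER, left_id INTEGER, right_id INTEGER)")
conn.executemany("INSERT INTO nodes VALUES (?, ?, ?)", number_nodes(1))

def clade_of(node_id):
    """All nodes whose left_id falls inside the interval of node_id."""
    cur = conn.execute(
        "SELECT nodes.node_id FROM nodes "
        "INNER JOIN nodes AS include "
        "ON (nodes.left_id BETWEEN include.left_id AND include.right_id) "
        "WHERE include.node_id = ? ORDER BY nodes.left_id",
        (node_id,))
    return [row[0] for row in cur]

print(clade_of(5))
```

A node's interval encloses the intervals of all its descendants, so a single range comparison retrieves the whole clade without recursion.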