Title: caBIG GoMiner Pilot: Progress Report and caBio Integration Demo
2. GoMiner Concept of Operations
- GoMiner is a tool that leverages the Gene Ontology to analyze lists of genes
  - Typically from gene expression microarray experiments
- GoMiner classifies the genes into biologically coherent categories and assesses these categories.
- GoMiner has several major input sources
  - From the GO Consortium
    - Gene Ontology
    - Gene Associations
  - From the user
    - List of Total Genes in the experiment
    - List of Changed Genes in the experiment
- The insights gained through GoMiner can generate hypotheses to guide additional research.
[Diagram: Ontology, Gene Mappings, TotalFile, and ChangedFile as inputs to GoMiner]
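To make the inputs concrete, here is a minimal sketch of how a total gene list and a changed gene list might be tallied per category. It is an illustration only, not GoMiner's actual code: the class and method names, the flattened gene-to-category map, and the sample identifiers are all assumptions, and the sketch ignores the ontology hierarchy and any statistical assessment of the categories.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class CategoryCounts {
    /** Per-category tally of genes from the total list and from the changed list. */
    public static final class Tally {
        public int total;
        public int changed;
    }

    /**
     * associations maps a gene to the categories annotated to it. This flattened
     * form of the gene association data is a hypothetical simplification.
     * Genes with no annotation are skipped.
     */
    public static Map<String, Tally> tally(Map<String, Set<String>> associations,
                                           Set<String> totalGenes,
                                           Set<String> changedGenes) {
        Map<String, Tally> result = new TreeMap<>();
        for (String gene : totalGenes) {
            Set<String> categories =
                    associations.getOrDefault(gene, Collections.<String>emptySet());
            for (String category : categories) {
                Tally t = result.computeIfAbsent(category, k -> new Tally());
                t.total++;
                if (changedGenes.contains(gene)) {
                    t.changed++;
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Placeholder gene and category names, not real annotations.
        Map<String, Set<String>> associations = new HashMap<>();
        associations.put("geneA", new HashSet<>(Arrays.asList("cat1", "cat2")));
        associations.put("geneB", new HashSet<>(Arrays.asList("cat1")));
        Set<String> total = new HashSet<>(Arrays.asList("geneA", "geneB", "geneC"));
        Set<String> changed = new HashSet<>(Arrays.asList("geneB"));

        for (Map.Entry<String, Tally> e : tally(associations, total, changed).entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue().changed
                    + " changed of " + e.getValue().total);
        }
    }
}
```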
3. GoMiner caBIG Pilot
- Bring GoMiner to Silver-level caBIG Compatibility
  - Semantics
  - UML
- Establish interoperability with other caBIG resources
  - caBio
  - caArray
  - caDSR
- Web Services
4. GoMiner API UML Model
- Developed as part of the semantic mapping activity
- Describes GoMiner's interfaces
- We have not finalized all of our modeling decisions
  - Include GO concepts?
- We just received our EVS report
  - Semantic Connector
  - Manual Curation
5. caBIO BioCarta
- Leverage BioCarta to provide a biological context for GoMiner results
  - For GO categories, provide a display that lists all of the BioCarta pathway maps that contain at least one changed gene from that category
  - Include summary data so users can examine the most relevant pathway maps
- Use as an example of interoperability with caBIO
  - Leverage the caBIO web services API
- Note: Builds on simpler linkouts already in GoMiner
6. About the Plumbing
- GoMiner compiles a list of genes
- Opens a web services connection
- For each gene, GoMiner issues a web services request to find all of the pathway maps associated with that gene
- The results are tabulated and presented to the user
  - Remove duplicates
  - Count changed genes on each pathway map
- A plain browser-based web page request is used to display user-selected maps
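The tabulation step can be sketched as follows. This is a minimal illustration under stated assumptions, not the GoMiner implementation: the `lookup` function is a hypothetical stand-in for the per-gene web services request, and pathway maps are represented by plain name strings rather than caBIO objects.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class PathwayTabulator {
    /**
     * Counts, for each pathway map, how many distinct changed genes appear on it.
     * lookup is a hypothetical placeholder for the per-gene web services request.
     */
    public static Map<String, Integer> countChangedGenes(
            Collection<String> changedGenes,
            Function<String, List<String>> lookup) {
        Map<String, Integer> counts = new HashMap<>();
        // De-duplicate the gene list so each gene is queried and counted once.
        for (String gene : new LinkedHashSet<>(changedGenes)) {
            // De-duplicate the pathways returned for a single gene.
            for (String pathway : new LinkedHashSet<>(lookup.apply(gene))) {
                counts.merge(pathway, 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Pathway names ordered by changed-gene count, highest first; ties broken by name. */
    public static List<String> ranked(Map<String, Integer> counts) {
        List<String> names = new ArrayList<>(counts.keySet());
        names.sort((a, b) -> {
            int byCount = Integer.compare(counts.get(b), counts.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        return names;
    }
}
```

Ordering the maps by count is one simple way to surface the most relevant pathway maps first; the summary data actually shown in the demo may differ.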
7. The Demo
8. For the Propeller Heads: Code Details
- Easiest way to get going is to
  - Look at their code samples; you have everything you need there. They have an excellent test class to run all sorts of selects.
  - Base your code on the test cases.
- There is a webservices directory in the distribution, and everything you need is in there. No need to compile or generate anything.
- When making queries to pathways, genes, taxons, etc., remember to register the Gene type along with the other types you need to get back:
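    // Apache Axis 1.x client classes used by this snippet
    import javax.xml.namespace.QName;
    import javax.xml.rpc.ParameterMode;
    import org.apache.axis.encoding.ser.BeanSerializerFactory;
    import org.apache.axis.encoding.ser.BeanDeserializerFactory;

    // Register GeneImpl so it can be serialized and deserialized
    final QName qnGene = new QName("urn:impl.domain.cabio.nci.nih.gov", "GeneImpl");
    call.registerTypeMapping(GeneImpl.class, qnGene,
        new BeanSerializerFactory(GeneImpl.class, qnGene),
        new BeanDeserializerFactory(GeneImpl.class, qnGene));

    // This is added to the second argument of the
    // Call object to search by gene
    call.addParameter(arg2, qnGene, ParameterMode.IN);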
9. More Code Details
- The web service call returns an Object, which contains the objects you were selecting. In our case, PathwayImpl.
- Depending on what you're doing, the Object may need to be rebuilt into an ArrayList.
- In our case, to get things displayed properly in a JTable, we needed to rebuild the array so we could obtain a key/value pair for the TableModel.
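A rough sketch of that rebuilding step is shown below. It assumes the returned Object is an array or a Collection, which may not match what the caBIO client actually returns, and it uses caller-supplied extractor functions instead of real PathwayImpl getters. The class, method, and column names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.function.Function;
import javax.swing.table.DefaultTableModel;

public class ResultTable {
    /** Normalizes a loosely typed result into a list (assumes array or Collection payload). */
    public static List<Object> toList(Object result) {
        if (result == null) {
            return new ArrayList<>();
        }
        if (result instanceof Object[]) {
            return new ArrayList<>(Arrays.asList((Object[]) result));
        }
        if (result instanceof Collection) {
            return new ArrayList<>((Collection<?>) result);
        }
        // Anything else is treated as a single-item result.
        List<Object> single = new ArrayList<>();
        single.add(result);
        return single;
    }

    /** Builds a two-column key/value model that a JTable can display. */
    public static DefaultTableModel toTableModel(List<Object> items,
                                                 Function<Object, Object> key,
                                                 Function<Object, Object> value) {
        DefaultTableModel model = new DefaultTableModel(new Object[] {"Key", "Value"}, 0);
        for (Object item : items) {
            model.addRow(new Object[] {key.apply(item), value.apply(item)});
        }
        return model;
    }
}
```

The resulting model can be handed to `new JTable(model)`; with real results the extractors would call whichever accessors the returned pathway objects provide.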
10. Observations
- The version of caBio in the 3.0 caCore release has much stronger web services features than earlier caBio releases
  - Faster responses
  - Easier to get to the data of interest
  - Described using WSDL
- This is not an ideal interoperability demonstration
  - Our implementation uses a client library that was included in the caCore distribution.
11. Open Issues
- All of the following are issues that we encountered but have not yet resolved
- The pathway query API includes a collection to list the genes on the pathway
  - But for our queries, this collection is always null
- We are only getting human pathways back
  - We are unsure why we are not getting mouse pathways, which appear on the NCICB site
- We have been unable to generate stubs from the WSDL.
  - Workaround is to simply use the client library included in the caCore distribution