Brain Data - PowerPoint PPT Presentation

Title:

Brain Data

Provided by: amarnat
Learn more at: https://users.sdsc.edu

Transcript and Presenter's Notes



1
Brain Data Knowledge Grid
  • (or Towards Services for Knowledge-Based
    Mediation of Neuroscience Information Sources)

National Center for Microscopy and Imaging Research (NCMIR): Mark Ellisman, Maryann Martone, Steve Peltier, Steve Lamont, ...
Data-Intensive Computing Environments, San Diego Supercomputer Center (SDSC): Reagan Moore, Chaitan Baru, Amarnath Gupta, Bertram Ludäscher, Richard Marciano, Arcot Rajasekar, Ilya Zaslavsky, ...
University of California, San Diego
2
Infrastructure for Sharing Neuroscience Data
  • SOURCES
  • NCMIR, U.C. San Diego
  • Caltech Neuroimaging
  • Center for Imaging Science, Johns Hopkins
  • Center for Computational Biology, Montana State
  • Laboratory of Neuro Imaging (LONI), UCLA
  • Computational Neurobiology Laboratory, Salk
    Inst.
  • Van Essen Laboratory, Washington University
  • Data Management Infrastructure (DICE/NPACI)
  • MIX: Mediation of Information using XML
  • MCAT: information discovery
  • SRB: data handling
  • HPSS: storage
  • ...

Knowledge-based GRID infrastructure: ???
Data Management Infrastructure (Data Grid): GTOMO, Telemicroscopy, Globus, SRB/MCAT, HPSS
3
Sharing Resources on the Brain Data Grid
  • Scientific groups ...
  • create data products (e.g., text data, images,
    simulation data, ...)
  • put them in collections
  • add metadata (who created it, what the data is
    about, ...)
  • make it available for sharing (on the web, in
    data caches, in HPSS, ...)
  • Technical challenges ...
  • size: packaging of data
  • heterogeneity: data types, storage technologies,
    transport mechanisms, authentication, ...
  • access levels: collection, object, fragment;
    data-specific functions (data blades)
  • Data Grid technologies can help ...
  • distributed data management, e.g., Storage
    Resource Broker/Metadata Catalog (SRB/MCAT),
    computing (Globus), ...
  • focus is on resource sharing (data, networks,
    cycles)

4
Integration Issue: Semantic Integration/Mediation

??? SEMANTIC INTEGRATION ???
  • SYNTACTIC/STRUCTURAL Integration
  • Integrated Views (Src-XML => Intgr-XML)
  • Schema Integration (DTD => DTD)
  • Wrapping, Data Extraction (Text => XML)

MIX: Mediation of Information using XML
Distributed Query Processing
SRB/MCAT
SYSTEM INTEGRATION
  • storage, query capabilities, protocols, services
  • Globus, JDBC, DOM, CORBA
  • TCP/IP, grid-ftp, HTTP
5
Standard Mediator/Wrapper Architecture
[Figure: layered mediator/wrapper architecture, top to bottom]
  • Client/User-Query (XML Q/A)
  • INTEGRATED VIEW: integration logic (domain
    semantics ???, GRID federation services ???)
  • Wrappers, one per source: protocol translation
    (SRB/MCAT, DOM, X(ML)Query)
  • (Neuro)Science (Re)Sources: Lab1, Lab2, Lab3, Files
  • Levels spanned: structure, syntax, transport,
    storage
6
The Need for Semantic Integration
Cross-source queries:
"What is the cerebellar distribution of rat
proteins with more than 70% homology with human
NCS-1? Any structure specificity? How about other
rodents?"
  • Semantic (knowledge-based) mediation services:
    cross-source relationships are modeled
  • Sources: data, relationships, constraints are
    modeled (CMs)
  • Wrapped sources: protein localization,
    morphometry, neurotransmission, Web (CaBP, Expasy)
7
Hidden Semantics: Protein Localization
  <protein_localization>
    <neuron type="purkinje cell" />
    <protein channel="red">
      <name>RyR</>
      ...
    </protein>
    <region h_grid_pos="1" v_grid_pos="A">
      <density>
        <structure fraction="0.8">
          <name>spine</>
          <amount name="RyR">0</>
        </>
        <structure fraction="0.2">
          <name>branchlet</>
          <amount name="RyR">30</>
        </>
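
A minimal sketch of how a wrapper might read this fragment. It assumes a well-formed variant of the slide's XML: the shorthand closers are expanded, the elided content is omitted, and the enclosing elements are closed. The function name `density_rows` is hypothetical. The code only extracts values; what `fraction` and `amount` actually mean stays implicit, which is the point of the slide.

```python
import xml.etree.ElementTree as ET

# Well-formed variant of the slide's fragment (assumption: closers expanded,
# elided parts omitted, outer elements closed).
FRAGMENT = """
<protein_localization>
  <neuron type="purkinje cell" />
  <protein channel="red"><name>RyR</name></protein>
  <region h_grid_pos="1" v_grid_pos="A">
    <density>
      <structure fraction="0.8">
        <name>spine</name>
        <amount name="RyR">0</amount>
      </structure>
      <structure fraction="0.2">
        <name>branchlet</name>
        <amount name="RyR">30</amount>
      </structure>
    </density>
  </region>
</protein_localization>
"""

def density_rows(xml_text):
    """Return (structure, fraction, protein, amount) tuples for every <structure>."""
    root = ET.fromstring(xml_text)
    rows = []
    for structure in root.iter("structure"):
        amount = structure.find("amount")
        rows.append((
            structure.findtext("name"),
            float(structure.get("fraction")),
            amount.get("name"),
            float(amount.text),
        ))
    return rows

rows = density_rows(FRAGMENT)
```

Nothing in the markup says whether `amount` is a count, a density, or something scaled by `fraction`; a mediator needs that knowledge from a conceptual model, not from the XML.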

8
Hidden Semantics: Morphometry
  <neuron name="purkinje cell">
    <branch level="10">
      <shaft>
      </shaft>
      <spine number="1">
        <attachment x="5.3" y="-3.2" z="8.7" />
        <length>12.348</>
        <min_section>1.93</>
        <max_section>4.47</>
        <surface_area>9.884</>
        <volume>7.930</>
        <head>
          <width>4.47</>
          <length>1.79</>
        </head>
      </spine>

9
Knowledge-Based (Semantic) Mediation
  • Multiple Worlds Integration Problem
  • compatible terms not directly joinable
  • complex, indirect associations among attributes
  • unstated integrity constraints
  • Approach
  • a theory under which terms can be semantically
    joined
  • => lift mediation to the level of conceptual
    models (CMs)
  • => formalize domain knowledge, ICs become rules
    over CMs
  • => Knowledge-Based/Model-Based (Semantic)
    Mediation
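
To illustrate "compatible terms not directly joinable", here is a small sketch of a join that goes through a domain map instead of comparing terms for equality. The `HAS_A` fragment, the row layouts, and the function names are all hypothetical, and the sample amounts are made up for the example.

```python
# Hypothetical domain-map fragment: parent has_a child (illustrative only).
HAS_A = {
    "purkinje_cell": ["dendrite", "axon"],
    "dendrite": ["main_branches"],
    "main_branches": ["spiny_branchlet"],
    "spiny_branchlet": ["spine"],
}

def parts_of(concept):
    """Return the concept plus everything reachable through has_a edges."""
    seen, stack = set(), [concept]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(HAS_A.get(node, []))
    return seen

def semantic_join(coarse_rows, fine_rows):
    """Pair rows whose fine-grained term is a (transitive) part of the coarse term."""
    return [
        (coarse, fine)
        for coarse in coarse_rows
        for fine in fine_rows
        if fine["structure"] in parts_of(coarse["region"])
    ]

# Two sources that never use the same term for the same thing (made-up rows).
morphometry = [{"region": "dendrite", "branch_level": 10}]
localization = [
    {"structure": "spine", "protein": "RyR", "amount": 0},
    {"structure": "axon", "protein": "RyR", "amount": 5},
]

pairs = semantic_join(morphometry, localization)
```

An equality join on `region == structure` would return nothing here; the domain map supplies the "theory" under which `dendrite` and `spine` become joinable.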

10
XML-Based vs. Model-Based Mediation
CM: Descr. Logic, ER, UML, RDF/XML(-Schema), ...
CM-QL: F-Logic, OIL, DAML, ...
XML Models
11
Knowledge-Based Mediator Prototype
USER/Client
CM (Integrated View)
Domain Map (DM)
Integrated View Definition (IVD)
CM Plug-In
CM Queries and Results (exchanged in XML)
Logic API (capabilities)
12
Mediation Services: Source Registration (System
Issues)
Source
  • Data Type: table, tree, file
  • Query Capability: SQL, XML QL, DOOD, ARC; SPJ,
    Selections
  • Result Delivery: Tuple-at-a-time, Set-at-a-time,
    Stream, Binary for Viewer
  • Access Protocol: SRB, HTTP, JDBC
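
One way to picture a registration record along the four dimensions named on this slide (data type, query capability, result delivery, access protocol) is sketched below. The class, the field names, the assignment of slide labels to dimensions, and the sample `PROLAB` entry are assumptions for illustration, not the prototype's actual registry format.

```python
from dataclasses import dataclass

# Allowed values taken from the slide's labels; which label belongs to
# which dimension is an assumption.
DATA_TYPES = {"table", "tree", "file"}
RESULT_DELIVERY = {"tuple-at-a-time", "set-at-a-time", "stream", "binary"}
ACCESS_PROTOCOLS = {"SRB", "HTTP", "JDBC"}

@dataclass(frozen=True)
class SourceRegistration:
    name: str
    data_type: str
    query_capabilities: frozenset
    result_delivery: str
    access_protocol: str

    def __post_init__(self):
        # Reject registrations that use a value outside the known vocabularies.
        checks = [
            (self.data_type, DATA_TYPES, "data type"),
            (self.result_delivery, RESULT_DELIVERY, "result delivery"),
            (self.access_protocol, ACCESS_PROTOCOLS, "access protocol"),
        ]
        for value, allowed, label in checks:
            if value not in allowed:
                raise ValueError(f"unknown {label}: {value!r}")

def can_answer(source, required_ops):
    """True if the source supports every operation the query fragment needs."""
    return set(required_ops) <= source.query_capabilities

# Hypothetical registration of a relational source with SPJ capabilities.
prolab = SourceRegistration(
    name="PROLAB",
    data_type="table",
    query_capabilities=frozenset({"select", "project", "join"}),
    result_delivery="set-at-a-time",
    access_protocol="JDBC",
)
```

A mediator could consult `can_answer` before pushing a subquery to a source and fall back to evaluating the missing operations itself.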
13
Mediation Services: Source Registration
(Semantics Issues)
  • Domain Map Registration
  • provide concept space/ontology
  • as a private object (myANATOM)
  • merge with others (give semantic bridges)
  • and check for conflicts
  • Conceptual Model Registration
  • schema classes, associations, attributes
  • domain constraints
  • put data into context (linking data to the
    domain map)

14
ANATOM Domain Map
[Figure: ANATOM domain map]
15
Senselab (Yale) and NCMIR (UCSD): Semantic
Bridge
anatom_dom(X) :- (ucsd_has_a(X,_) ; ucsd_has_a(_,X) ; ucsd_isa(X,_) ; ucsd_isa(_,X)).
senselab_dom(X) :- (sl_has_a(X,_) ; sl_has_a(_,X) ; sl_isa(X,_) ; sl_isa(_,X)).

% map Senselab anatom terms to equivalent UCSD ANATOM
sl2ucsd(X,X) :- senselab_dom(X), anatom_dom(X).
sl2ucsd('A',axon).
sl2ucsd('AH',axon).
sl2ucsd('Dad',spiny_branchlet).  % should map to a PATH not just the end of the path
sl2ucsd('Dam',main_branches).    % some of the main_branches based on the branch level
sl2ucsd('Dap',main_branches).
sl2ucsd('Dbd',spiny_branchlet).
sl2ucsd('Dbm',main_branches).
sl2ucsd('Dbp',main_branches).
sl2ucsd('Ded',spiny_branchlet).
sl2ucsd('Dem',main_branches).
sl2ucsd('Dep',main_branches).
sl2ucsd('T',axon).

% keep has_a edge if at least one node is known from UCSD
has_a(X,Y) :- sl2ucsd(_,X), ucsd_has_a(X,Y).
has_a(X,Y) :- sl2ucsd(_,Y), ucsd_has_a(X,Y).

% keep all and only UCSD is_a rels
isa(X,Y) :- ucsd_isa(X,Y).
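
The bridge rules on this slide can be paraphrased in Python as a rough sketch. The explicit term mappings are copied from the slide; the `UCSD_HAS_A` edges and the term sets are hypothetical sample data, and passing the two domains in as plain sets (rather than deriving them from the `has_a`/`isa` relations) is a simplification.

```python
# Explicit term mappings copied from the slide's sl2ucsd facts.
EXPLICIT = {
    "A": "axon", "AH": "axon", "T": "axon",
    "Dad": "spiny_branchlet", "Dbd": "spiny_branchlet", "Ded": "spiny_branchlet",
    "Dam": "main_branches", "Dap": "main_branches", "Dbm": "main_branches",
    "Dbp": "main_branches", "Dem": "main_branches", "Dep": "main_branches",
}

def sl2ucsd(senselab_terms, ucsd_terms, explicit=EXPLICIT):
    """Bridge pairs: terms shared by both domains map to themselves, plus the explicit facts."""
    pairs = {(term, term) for term in senselab_terms & ucsd_terms}
    pairs.update(explicit.items())
    return pairs

def bridged_has_a(pairs, ucsd_has_a):
    """Keep a UCSD has_a edge if at least one endpoint is the target of a bridge pair."""
    mapped = {ucsd for _, ucsd in pairs}
    return {(x, y) for x, y in ucsd_has_a if x in mapped or y in mapped}

# Hypothetical UCSD has_a edges (illustrative only).
UCSD_HAS_A = {
    ("purkinje_cell", "axon"),
    ("main_branches", "spiny_branchlet"),
    ("spiny_branchlet", "spine"),
    ("soma", "nucleus"),
}
UCSD_TERMS = {node for edge in UCSD_HAS_A for node in edge}
SENSELAB_TERMS = set(EXPLICIT) | {"spine"}

pairs = sl2ucsd(SENSELAB_TERMS, UCSD_TERMS)
kept = bridged_has_a(pairs, UCSD_HAS_A)
```

With this sample data the `soma`/`nucleus` edge is dropped because neither endpoint is reachable through the bridge, mirroring the "at least one node is known" rule.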
16
Refinement of a Domain Map (Ontology): Putting
Data in Context via Registration of new Classes and
Relationships
[Figure: domain map graph. Nodes: Neuron, MyNeuron, Neostriatum, Compartment, Spiny Neuron, Axon, Soma, Dendrite, Medium Spiny Neuron, Neurotransmitter, MyDendrite, GABA, Substance P, Dopamine R, Substantia Nigra Pc, Substantia Nigra Pr, Globus Pallidus Int., Globus Pallidus Ext. Edge and connective labels: ALLhas, exp, AND, OR.]
17
Mediation Services: Integrated View Definition
  • DERIVE
    protein_distribution(Protein, Organism,
    Brain_region, Feature_name, Anatom, Value)
  • FROM
    I:protein_label_image[proteins ->> Protein;
        organism -> Organism; anatomical_structures ->>
        AS:anatomical_structure[name -> Anatom]],   % from PROLAB
    NAE:neuro_anatomic_entity[name -> Anatom;
        located_in ->> Brain_region],               % from ANATOM
    AS..segments..features[name -> Feature_name;
        value -> Value].
  • provided by the domain expert and mediation
    engineer
  • declarative language (here Frame-logic)

18
Example Query Evaluation (I)
  • Example: protein_distribution
  • given organism, protein, brain_region
  • Use DOMAIN-KNOWLEDGE-BASE
  • recursively traverse the has_a_star paths under
    brain_region; collect all anatomical_entities
  • Source PROLAB
  • join with anatomical structures and collect the
    value of attribute
    image.segments.features.feature.protein_amount
    where image.segments.features.feature.protein_name
    = protein and study_db.study.animal.name =
    organism
  • Mediator
  • aggregate over all parents up to brain_region
  • report distribution
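
The three evaluation steps above can be sketched as follows. This is only an illustration under stated assumptions: the `HAS_A` fragment and the `ROWS` table are hypothetical sample data, the domain map is treated as a tree, and "aggregate" is taken to mean a sum, which the slide does not specify.

```python
# Hypothetical domain-knowledge fragment (parent has_a children), assumed to be a tree.
HAS_A = {
    "cerebellum": ["purkinje_cell"],
    "purkinje_cell": ["dendrite"],
    "dendrite": ["spiny_branchlet", "spine"],
}

# Hypothetical PROLAB-like rows: (organism, protein, structure, protein_amount).
ROWS = [
    ("rat", "RyR", "spine", 0.0),
    ("rat", "RyR", "spiny_branchlet", 30.0),
    ("mouse", "RyR", "spine", 4.0),
]

def has_a_star(region):
    """Step 1: all anatomical entities reachable from region via has_a."""
    entities = [region]
    for child in HAS_A.get(region, []):
        entities.extend(has_a_star(child))
    return entities

def protein_distribution(organism, protein, brain_region):
    # Step 2: select matching amounts at the source, keyed by structure.
    direct = {}
    for org, prot, structure, amount in ROWS:
        if org == organism and prot == protein:
            direct[structure] = direct.get(structure, 0.0) + amount

    # Step 3: aggregate (here: sum) each entity's own amount with its parts'.
    def total(entity):
        return direct.get(entity, 0.0) + sum(total(c) for c in HAS_A.get(entity, []))

    return {entity: total(entity) for entity in has_a_star(brain_region)}

dist = protein_distribution("rat", "RyR", "cerebellum")
```

If the domain map were a DAG rather than a tree, shared parts would be counted more than once by this simple recursion, so a real mediator would need to deduplicate.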

19
Example Query Evaluation (II)
"How does the parallel fiber output
(Yale/SENSELAB) relate to the distribution of
Ryanodine Receptors (UCSD/NCMIR)?"
  • @SENSELAB: X1 := select output from parallel
    fiber
  • @MEDIATOR: X2 := hang off X1 from Domain Map
  • @MEDIATOR: X3 := subregion-closure(X2)
  • @NCMIR: X4 := select PROT-data(X3,
    Ryanodine Receptors)
  • @MEDIATOR: X5 := compute aggregate(X4)
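
The five-step plan can be read as a small dataflow in which each step runs at one site and binds one variable. The sketch below is a toy executor for such a plan; every operation is a stub with made-up return values, and the names (`run_plan`, `SITES`, `locate_in_domain_map`, and so on) are hypothetical rather than the mediator's actual interfaces.

```python
# Hypothetical domain map and protein data used by the stubs below.
HAS_A = {"purkinje_cell": ["dendrite"], "dendrite": ["spiny_branchlet", "spine"]}
PROT_DATA = {
    ("spiny_branchlet", "Ryanodine Receptors"): 30,
    ("spine", "Ryanodine Receptors"): 0,
}

def select_output(source_name):
    return ["purkinje_cell"]  # stub: where the selected output lands

def locate_in_domain_map(terms):
    return [t for t in terms if t in HAS_A]  # keep terms the domain map knows

def subregion_closure(roots):
    seen, stack = set(), list(roots)
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(HAS_A.get(node, []))
    return sorted(seen)

def select_prot_data(structures, protein):
    return {s: PROT_DATA[(s, protein)] for s in structures if (s, protein) in PROT_DATA}

def aggregate(amounts):
    return sum(amounts.values())  # assumption: aggregate means sum

SITES = {
    "SENSELAB": {"select_output": select_output},
    "MEDIATOR": {
        "locate_in_domain_map": locate_in_domain_map,
        "subregion_closure": subregion_closure,
        "aggregate": aggregate,
    },
    "NCMIR": {"select_prot_data": select_prot_data},
}

PLAN = [
    ("SENSELAB", "X1", "select_output", ["parallel fiber"]),
    ("MEDIATOR", "X2", "locate_in_domain_map", ["X1"]),
    ("MEDIATOR", "X3", "subregion_closure", ["X2"]),
    ("NCMIR", "X4", "select_prot_data", ["X3", "Ryanodine Receptors"]),
    ("MEDIATOR", "X5", "aggregate", ["X4"]),
]

def run_plan(plan, sites):
    """Execute steps in order; arguments naming earlier variables are resolved, others are literals."""
    env = {}
    for site, var, op, args in plan:
        values = [env[a] if a in env else a for a in args]
        env[var] = sites[site][op](*values)
    return env

env = run_plan(PLAN, SITES)
```

Keeping the site name on each step makes it explicit which work is shipped to a source and which stays in the mediator.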

20
Mediation Services: Client Registration
[Figure: taxonomy of client types and capabilities. Labels: Client; Update Client; Query Client; Thin Result Viewer; Fat Result Viewer; Navigate/Ad-hoc; Query Capability; Query on Schema; Derive Before Insert; Check Data; Merge Before Insert; Client-side Processing; Client-side Buffer; Send Full Data; Context Sensitive; Server-side Buffer; Server-Push/Client-Pull.]
21
Example Client Query Formulation and Result
Display
  • combination of ad hoc and navigational queries
  • client side visualization (left)
  • results are shown in semantic context (right)

22
Mediation Services: Semantic Annotation Tools
line drawing annotation => (spatial) database
for mediation
23
Mediator Architecture Blueprint
Mediation Services
Mediator Layer
  • Source model lifting
  • domain knowledge reconciliation
  • model transformation
  • Query formulation
  • user query
  • integrated view definition

Deductive Engine
Model Reasoner
  • Source registration
  • domain knowledge
  • model schema
  • query computation capabilities
  • Query processing
  • view unfolding
  • semantic optimization
  • capability-based rewriting

Optimizer
Wrapper Layer
  • Query interface (down API)
  • SDLIP, SOAP, ...
  • (subsets of) SQL, X(ML)-Query, CPL,...
  • DOM
  • SRB-based access
  • Result delivery interface (up API)
  • SDLIP, SOAP, ...
  • pull (tuple/set-at-a-time, DOM) vs. push
    (stream)
  • synchronous/asynchronous
  • direct data/data reference

XML Sources
RDB Sources
File Sources
HTML Sources
Digital Libraries (Collections)
Spatial Sources
Boston Univ.
NCMIR UCSD
Yale Univ.
Montana Univ.
SDLIP
ARC IMS
24
Coming up: Knowledge-Based/Semantic Mediation
of Brain Data
PROTLOC
Result (XML/XSLT)
Result (VML/SVG)
ANATOM
25
Some Open Issues
  • Data/Knowledge Modeling
  • Extensibility: how to handle a source with new
    data types and operations?
  • Temporal Data: instrument readings, video
    microscopy
  • Spatial Data: integrating with spatial database
    systems
  • Image database systems
  • Conflict Management
  • Grades of certainty
  • Alternate Hypothesis
  • Integrating Services
  • Registration and warping of my image slice to a
    reference
  • Integrating into Larger Applications
  • M-Cell simulation
  • Telemicroscopy
  • Visualization

26
References
  • Model-Based Mediation with Domain Maps, Bertram
    Ludäscher, Amarnath Gupta, Maryann Martone, Intl.
    Conference on Data Engineering (ICDE),
    Heidelberg, 2001
  • Knowledge-Based Mediation of Heterogeneous
    Neuroscience Information Sources, Amarnath Gupta,
    Bertram Ludäscher, Maryann Martone, Intl.
    Conference on Scientific and Statistical
    Databases (SSDBM), Berlin, 2000.
  • Model-Based Information Integration in a
    Neuroscience Mediator System, Bertram Ludäscher,
    Amarnath Gupta, Maryann Martone, Intl. Conference
    on Very Large Data Bases (VLDB), Cairo, 2000.