caBIG Overview - PowerPoint PPT Presentation

Provided by: william228
Title: caBIG Overview


1
caGrid 0.5
caGrid Team Mission: Define the caBIG system
architecture that satisfies the requirements of
the caBIG Community
August, 2005
2
caGrid
[Figure: caGrid overview diagram. Labels: Grid-Enabled Client, Grid Portal, Analytical Service, Grid Data Service, Tool 1 to Tool 4, Research Center (two instances), NCICB]
3
Architectural Considerations
  • Requirements
  • Support scientific requirements: use cases from
    the cancer research community
  • Support functional requirements: identifiers,
    workflow, query, etc.
  • Support non-functional requirements: security,
    reliability, performance, open source, etc.
  • Principles
  • Driven by cancer research community requirements
  • Focus on solving a business problem, not a
    technology problem
  • Services-Oriented Architecture
  • Metadata-driven; implements virtualization
  • Expose objects, not backend databases (like
    RDBMS)
  • Standards, compatibility, and community
    acceptance
  • OGSA / OGSI

4
Architectural Considerations
  • Characteristics
  • caGrid presents an Object-Oriented view of data
  • Data types are well-defined and registered in a
    repository
  • Defined by XSD and ISO/IEC 11179
  • Described by UML and semantic ontologies
  • Formal harmonization and curation process
  • Standardized metadata facilitates discovery
  • Leverage existing technologies
  • caDSR, EVS, Mobius GME: common data elements,
    controlled vocabularies, schema management
  • Globus Toolkit (currently version 3.2.1)
  • Core grid services infrastructure
  • Service deployment, service registry, invocation,
    secure communication
  • OGSA-DAI (currently version 5.0)
  • Core support for data services

5
caGrid 0.5 Architecture
[Figure: caGrid 0.5 layered architecture diagram. Labels: Functions, Quality of Service, Business Process, Semantic service, ID Resolution, Analytical, UI, Security, Resource Management, Service Registry, Service, Service Description, Grid Communication Protocol, Transport, GUMS, CAMS, GSI, caDSR, EVS, GME, OGSA-DAI, Index, GT3, Globus Toolkit]
6
Grid View
[Figure: grid resources. Labels: caBIO, caArray, rProteomics, other caBIG data resource, other caBIG analysis tool]
  • Data source exposed as objects
  • Well-defined objects using caDSR / EVS
  • Mobius GME for schemas
  • Metadata identifies services, objects exposed,
    relationships between objects, relationships
    between services
  • Standard Grid interfaces
  • Standard query language and interface
  • Advertisement and Discovery
  • Security
  • Invocation / Schedule
  • Execution / coordination

[Figure: grid view diagram. Labels: Resource API, caBIG Data Resource, caBIG Analytical Service, GRAM, Security, Identifiers, OGSA-DAI, caDSR / EVS, Query, Invocation, Globus, Registry, Grid client API, GUI, Admin]
7
caGrid
caGrid Toolkit/Infrastructure
8
caGrid Metadata and Data Description
  • Client and service APIs are object oriented, and
    operate over well-defined and curated data types
  • Objects are defined in UML and converted into
    Administered Components, which are in turn
    registered in the Cancer Data Standards
    Repository (caDSR)
  • Object definitions draw from vocabulary
    registered in the Enterprise Vocabulary Services
    (EVS), and their relationships are thus
    semantically described
  • XML serializations of objects adhere to XML
    schemas registered in the Global Model Exchange
    (GME)
  • All data in caGrid travel between services and
    between client and services as XML documents that
    conform to well-defined schemas stored in GME
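
The object-to-XML round trip described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not caGrid code: the `Gene` class, its fields, and the `GENE_NS` namespace are hypothetical stand-ins for a curated data type and the namespace of a schema registered in GME. The root-tag check is only a stand-in for real schema validation.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Hypothetical namespace; a real one would come from the schema registered in GME.
GENE_NS = "http://example.org/cabig/gene"


@dataclass
class Gene:
    """Hypothetical domain object standing in for a curated data type."""
    symbol: str
    taxon: str


def gene_to_xml(gene: Gene) -> str:
    """Serialize a Gene to a namespace-qualified XML document."""
    root = ET.Element(f"{{{GENE_NS}}}Gene")
    ET.SubElement(root, f"{{{GENE_NS}}}symbol").text = gene.symbol
    ET.SubElement(root, f"{{{GENE_NS}}}taxon").text = gene.taxon
    return ET.tostring(root, encoding="unicode")


def gene_from_xml(document: str) -> Gene:
    """Rebuild a Gene from XML, rejecting documents with an unexpected root."""
    root = ET.fromstring(document)
    if root.tag != f"{{{GENE_NS}}}Gene":
        raise ValueError(f"unexpected root element: {root.tag}")
    ns = {"g": GENE_NS}
    symbol = root.findtext("g:symbol", namespaces=ns)
    taxon = root.findtext("g:taxon", namespaces=ns)
    if symbol is None or taxon is None:
        raise ValueError("document is missing a required element")
    return Gene(symbol=symbol, taxon=taxon)
```

A client and a service that share the same schema can exchange `gene_to_xml(...)` output and recover an equal object with `gene_from_xml(...)`.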

9
caGrid 0.5 Services -- Metadata and Registry
  • Metadata and Registry Services
  • Support for Advertisement and Discovery processes
  • Metadata and registry services maintain metadata
    associated with data and analytical services
  • All services register information to an Index
    Service
  • Services can be discovered using semantics of
    their data types
  • Three types of Service Metadata
  • Common Metadata: describes generic information
    about the Cancer Center providing the service
  • Data Service Metadata: describes the data exposed
    using terminology and objects from caDSR/EVS
  • Analytical Service Metadata: describes the
    supported operations and their inputs and outputs
    using terminology and objects from caDSR/EVS
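
Advertisement and discovery by data-type semantics can be sketched as a toy in-memory registry. Everything here is an assumption made for illustration: the `ServiceEntry` fields, the `IndexService` class, and the concept codes are hypothetical and do not reflect the actual Index Service interface.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ServiceEntry:
    """Hypothetical registration record combining the metadata types above."""
    url: str
    kind: str             # "data" or "analytical"
    cancer_center: str    # common metadata: who provides the service
    concepts: frozenset   # concept codes of the objects exposed or consumed


class IndexService:
    """Toy in-memory registry; a real index service is a remote grid service."""

    def __init__(self) -> None:
        self._entries: List[ServiceEntry] = []

    def register(self, entry: ServiceEntry) -> None:
        """Advertise a service by storing its metadata."""
        self._entries.append(entry)

    def discover(self, concept: str, kind: Optional[str] = None) -> List[str]:
        """Return URLs of services whose metadata mentions the concept."""
        return [
            e.url
            for e in self._entries
            if concept in e.concepts and (kind is None or e.kind == kind)
        ]
```

A researcher looking for services that expose a given concept would call `discover` with that concept code, optionally restricting the result to data or analytical services.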

10
caGrid 0.5 Services -- Data and Analytical Services
  • Data Services
  • Data services present an object view of data
    sources
  • Objects exposed as data services will comply with
    common data elements registered in the caDSR/EVS
  • Data Services leverage OGSA-DAI 5.0
  • Currently Query only (no update, insert, or
    delete)
  • Analytical Services
  • Analytical Services are base OGSI services
  • Required to be strongly-typed with respect to
    input and output
  • Analytical services input and output objects
    conforming to registered classes in caDSR
  • Graphical tool to automatically create source
    code, configuration files, and build process for
    new analytical services
  • Input and output parameters can be discovered
    from GME
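
The strong-typing requirement on analytical services can be illustrated with a small wrapper that checks the input and output of an operation against declared classes. This is only a sketch of the idea: `typed_operation`, `ExpressionMatrix`, `ClusterResult`, and the toy analysis are hypothetical, and in caGrid the types would be classes registered in caDSR rather than local Python classes.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class ExpressionMatrix:
    """Hypothetical input type: rows of numeric values."""
    values: Tuple[Tuple[float, ...], ...]


@dataclass(frozen=True)
class ClusterResult:
    """Hypothetical output type: one label per input row."""
    labels: Tuple[int, ...]


def typed_operation(input_type: type, output_type: type) -> Callable:
    """Wrap an operation so its input and output types are enforced."""
    def wrap(func: Callable) -> Callable:
        def checked(arg):
            if not isinstance(arg, input_type):
                raise TypeError(f"expected {input_type.__name__}")
            result = func(arg)
            if not isinstance(result, output_type):
                raise TypeError(f"operation must return {output_type.__name__}")
            return result
        # Expose the declared types so they could be published as metadata.
        checked.input_type = input_type
        checked.output_type = output_type
        return checked
    return wrap


@typed_operation(ExpressionMatrix, ClusterResult)
def sign_cluster(matrix: ExpressionMatrix) -> ClusterResult:
    """Toy analysis: label each row 1 if its sum is positive, else 0."""
    return ClusterResult(tuple(1 if sum(row) > 0 else 0 for row in matrix.values))
```

Because the declared types are attached to the wrapped function, the same declaration can drive both runtime checking and the service's advertised metadata.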

11
caGrid 0.5 Services -- Query
  • Query services
  • Federated and semantic queries
  • Once the data sources are identified, the
    researcher can submit queries to data services
    using the web- and Windows-based GUIs
  • The researcher specifies the query in a standard
    way regardless of the data source. The syntax of
    the query is represented in XML
  • Metadata extracted from caDSR provides
    information regarding objects exposed
  • Result sets can be transformed and redirected
    anywhere in the grid. Developers can use the API
    to implement applications
  • Currently using a custom query language
    implemented as an activity
  • Queries and Results are contained in OGSA-DAI
    Activities, Perform, and Response Documents
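
A source-independent query expressed in XML might look like the sketch below. The slides do not show the custom query language, so the element and attribute names (`Query`, `Target`, `Attribute`, `predicate`) and the class name `example.domain.Gene` are invented for illustration. The `evaluate` function stands in for a data service that answers the query over its own objects.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def build_query(target_class: str, attribute: str, value: str) -> str:
    """Build an XML query for objects of one class with attribute == value."""
    query = ET.Element("Query")
    target = ET.SubElement(query, "Target", {"class": target_class})
    ET.SubElement(
        target,
        "Attribute",
        {"name": attribute, "predicate": "EQUAL_TO", "value": value},
    )
    return ET.tostring(query, encoding="unicode")


def evaluate(query_xml: str, objects_by_class: Dict[str, List[dict]]) -> List[dict]:
    """Answer the query over in-memory objects, as a data service might."""
    target = ET.fromstring(query_xml).find("Target")
    condition = target.find("Attribute")
    candidates = objects_by_class.get(target.get("class"), [])
    return [
        obj
        for obj in candidates
        if str(obj.get(condition.get("name"))) == condition.get("value")
    ]
```

The same document produced by `build_query` can be handed to any service that understands the format, which is the point of specifying queries independently of the backing data source.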

12
caGrid 0.5 Services -- Security
  • Secure Communication
  • Authentication: parties involved can be assured
    of one another's identity
  • Message Integrity: a message sent by either party
    is guaranteed to be the same message when it is
    received
  • Privacy: communication between the two parties
    can only be interpreted by the two parties
  • Single Sign On
  • Users and Grid Services should have one method of
    authenticating themselves to the grid, and all
    services in the grid should accept this method
  • Access Control on caBIG Services
  • caBIG services determine which users or services
    may access them
  • User/Organizational Attribute Management
  • Services should have a method for determining the
    attributes of a requesting party. Such
    attributes may be needed to service the request;
    for example, a username and password are needed to
    perform a query on a relational database on the
    party's behalf.
  • Attributes should be standardized such that they
    may be used across institutional and application
    boundaries
  • Delegation
  • caBIG services can interact with other caBIG
    services on a user's behalf
  • User/Organization Management

13
caGrid 0.5 Services -- Security
  • Core Components
  • Globus Security Infrastructure (GSI)
  • Core security infrastructure
  • Grid User Management Service (GUMS)
  • Grid Service for the management and creation of
    grid users and grid user credentials.
  • Attribute Management Service (CAMS)
  • Grid Service for the management of user/virtual
    organization attributes
  • Authorization Manager
  • A general interface that a caBIG service
    calls to determine if a user is authorized to
    perform operation X on resource Y
  • Can be used to integrate grid security with
    external authentication/authorization systems
  • External Components
  • Local Authentication/Authorization Systems
  • General Authorization Systems (e.g., PERMIS)
  • Grid Authorization Services
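
The Authorization Manager idea, a single interface that answers "may this user perform operation X on resource Y", can be sketched as below. The class and method names are hypothetical and do not reproduce the caGrid interface. The table-backed implementation stands in for any external authorization system plugged in behind the same interface.

```python
from abc import ABC, abstractmethod
from typing import Iterable, Tuple


class AuthorizationManager(ABC):
    """Hypothetical interface a service calls before doing any work."""

    @abstractmethod
    def is_authorized(self, user: str, operation: str, resource: str) -> bool:
        """Return True if the user may perform the operation on the resource."""


class TableAuthorizationManager(AuthorizationManager):
    """Toy implementation backed by an explicit set of grants."""

    def __init__(self, grants: Iterable[Tuple[str, str, str]]) -> None:
        self._grants = set(grants)

    def is_authorized(self, user: str, operation: str, resource: str) -> bool:
        return (user, operation, resource) in self._grants


class QueryService:
    """Toy service that delegates its access decision to the manager."""

    def __init__(self, authz: AuthorizationManager) -> None:
        self._authz = authz

    def query(self, user: str, resource: str) -> str:
        if not self._authz.is_authorized(user, "query", resource):
            raise PermissionError(f"{user} may not query {resource}")
        return f"results for {resource}"
```

Keeping the decision behind one interface means the service code does not change when the local table is swapped for a different authorization backend.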

14
caGrid 0.5 Services
  • Portal, GUIs, and Client API
  • Web based UI
  • Graphical User Interfaces
  • Programmatic access to the grid

15
Deployment and Advertising
caGrid provides an easy way to expose data
services in the grid. When the API is generated
with the caCORE SDK, no code is required to
expose the new data service. The researcher
specifies the index service (virtual
organization) where the service will be
registered.
16
Discovery
Discovery enables researchers to find service
providers in the grid. caGrid 0.5 provides web-
and Windows-based discovery applications. The
same functionality can be performed using the API.
17
Analytical Service Creation Tool
  • The developer defines the operations of the
    service and only has to focus on implementing
    them
  • Input and output parameters can be discovered
    from GME
  • Schema types can be automatically downloaded and
    configured as operation parameters
  • Specified types are used to create necessary Java
    Objects using Globus behind the scenes
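
The kind of skeleton generation the creation tool performs can be sketched, in a greatly simplified form, as a function that turns operation definitions into source code the developer then fills in. The real tool generates Java sources, configuration files, and a build process; this Python stub generator and its names are assumptions made only to show the idea.

```python
from typing import Dict, Tuple


def generate_service_stub(service_name: str,
                          operations: Dict[str, Tuple[str, str]]) -> str:
    """Return source for a class with one unimplemented method per operation.

    `operations` maps an operation name to (input type name, output type name).
    """
    names = [service_name, *operations]
    for types in operations.values():
        names.extend(types)
    for name in names:
        if not name.isidentifier():
            raise ValueError(f"not a valid identifier: {name!r}")

    lines = [f"class {service_name}:"]
    if not operations:
        lines.append("    pass")
    for op, (input_type, output_type) in operations.items():
        # Type names are emitted as string annotations so the stub compiles
        # even before the corresponding classes have been generated.
        lines.append(
            f"    def {op}(self, request: '{input_type}') -> '{output_type}':"
        )
        lines.append(f"        raise NotImplementedError('implement {op}')")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"
```

Calling `generate_service_stub("ClusteringService", {"cluster": ("ExpressionMatrix", "ClusterResult")})` yields a class whose `cluster` method raises `NotImplementedError` until the developer supplies the implementation.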

18
Test bed Infrastructure
19
Acknowledgements: caGrid Development Team
  • SAIC
  • William Sanchez
  • Tara Akhavan
  • Manav Kher
  • Rouwei Wu
  • Jijin Yan
  • OSU
  • Scott Oster
  • Shannon Hastings
  • Steve Langella
  • Tahsin Kurc
  • Joel Saltz
  • Panther
  • Brian Gilman
  • Nick Encina
  • Oracle
  • Ram Chilukuri
  • TerpSys
  • Gavin Brennan
  • Troy Smith
  • Wei Lu
  • Doug Kanoza
  • BAH
  • Arumani Manisundaram
  • Mike Keller
  • Brian Davis
  • NCICB
  • Peter Covitz
  • Avinash Shanbhag
  • George Komatsoulis
  • Denise Warzel
  • Frank Hartel

20
Acknowledgements: Reference Implementations
  • Georgetown PIR
  • Baris Suzek
  • Scott Shung
  • Georgetown - caArray
  • Colin Freas
  • Nick Marcou
  • Arnie Miles
  • DUKE - rProteomics
  • Patrick McConnell
  • UPMC - caTIES
  • Rebecca Crawley
  • Kevin Mitchell
  • SAIC
  • John Moy - caArray
  • Sumeet Muju - caArray
  • Juergen Lorenz - caArray
  • Andrew Shinohara - Test
  • Mike Connelly - caBIO
  • Jennifer Zeng - caBIO
  • Nafis Zebarjadi - SDK

21
End of Talk