Introduction to Kepler - PowerPoint PPT Presentation

Title: Introduction to Kepler


1
Introduction to Kepler
  • Deana Pennington
  • University of New Mexico
  • February 4, 2005

2
Scientific Workflows
  • Model the way scientists work with their data now
  • Mentally coordinate export and import of data
    among software systems
  • Workflows emphasize data flow
  • Metadata-driven data ingestion
  • Output generation includes creating appropriate
    metadata

Archive output to EcoGrid with workflow metadata
Query EcoGrid to find data
3
Scientific Workflows are
  • Not linear
  • Involve multiple data sets
  • Involve multiple analytical steps

4
Productivity Example
  • Mental Model

[Figure: mental model of productivity as f(Biomass, Temp, Soil, et al.). Legend: C = Concept.]
5
Technology-enabled
  • Mental Model
  • Semantic Mediation System: ontologies, workflow design
  • Kepler Workflow System: executable workflow, seamless execution

[Figure: a mental model (C = Concept) mapped through ontologies to an executable workflow of steps labeled DS, TS, AS, and SS. Legend: TS = Transformation Step, SS = Sharing Step. Annotations: Automate; Semi-automatic data integration.]
6
Automated Workflows
  • Scripts: single platform
  • Visual modeling: single environment
  • Workflows: cross-platform, cross-environment,
    distributed data analyses

7
Kepler today
  • Supports scientific workflows
  • Ecology, molecular bio, geology,
  • Variety of analytical components (including
    spatial data transformations)
  • Support for R scripts and Matlab scripts
  • Real-time data access via Antelope ORB
  • EcoGrid access to heterogeneous data
  • EML Data support
  • Experimental data, survey data, spatial raster
    and vector data, etc.
  • DarwinCore Data support
  • Museum collections
  • EcoGrid registry to discover data sources
  • Ontology-based browsing for analytical components
  • Exploit semantics to improve the user experience
  • Demonstration workflows
  • Ecological Niche Modeling
  • Promoter Identification Workflow
  • Geologic Map Information Integration
  • Real-time Revelle example of data access

8
Kepler next year
  • Usability engineering
  • Full evaluation and user-oriented customization
    of all UI components
  • Distributed computing/grid computing
  • Large jobs, lots of machines
  • Detached execution
  • Smart data and component discovery
  • Support annotating data sources
  • Component repository / downloadable components
  • Automated data and service integration and
    transformation using ontologies
  • Complete EcoGrid access
  • Full EML support
  • Support for large data and 3rd-party transfer
  • More data sources and types of data sources
    (e.g., JDBC, GEON data)

9
Starting point Ptolemy II
  • Electrical engineering community
  • Large mathematical library

Source: Edward Lee et al., http://ptolemy.eecs.berkeley.edu/ptolemyII/
10
Kepler Contributors, Projects, Sponsors
  • Ilkay Altintas SDM
  • Chad Berkley SEEK
  • Shawn Bowers SEEK
  • Tobin Fricke ROADNet
  • Jeffrey Grethe BIRN
  • Christopher H. Brooks Ptolemy II
  • Zhengang Cheng SDM
  • Dan Higgins SEEK
  • Efrat Jaeger GEON
  • Matt Jones SEEK
  • Edward A. Lee Ptolemy II
  • Kai Lin GEON
  • Ashraf Memon GEON
  • Bertram Ludaescher BIRN, GEON, SDM, SEEK
  • Steve Mock NMI
  • Steve Neuendorffer Ptolemy II
  • Jing Tao SEEK
  • Mladen Vouk SDM
  • Xiaowen Xin SDM

E-Science Link-up Project
11
Grid-enabled data queries
Grid get
  • Grid-enabled data
  • Any registered node
  • Metadata driven
  • Ontology-based

12
EcoGrid Sources
13
EML Metadata in Kepler
14
Kepler Workflow System
  • Grid-enabled analyses
  • Any registered node
  • Any platform (Unix, Windows, Mac)
  • Any environment (C, SAS, GIS)
  • Local programs
  • Web application
  • Web service

15
Biodiversity Indices in Kepler
16
R in Kepler
Source Dan Higgins, Kepler/SEEK
17
(No Transcript)
18
Director/Actor Metaphor
  • Actors know HOW to act: they know their part
  • Directors know WHEN they should act
  • Models of computation
  • Behavioral polymorphism

Examples:
  • Process Network: procedural, single point in time
  • Synchronized Data Flow: subset of Process Network
  • Continuous Time: all points in time

[Figure: one Director coordinating three Actors]
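
The director/actor split can be illustrated with a minimal Python sketch. This is not Kepler's or Ptolemy II's API: the class names `Actor`, `SequentialDirector`, and `RepeatingDirector` are hypothetical and exist only to show the idea that actors hold the "how" while a director decides the "when". Swapping the director changes the schedule without touching the actors.

```python
from typing import Callable, List


class Actor:
    """Knows HOW to act: wraps a single processing step (hypothetical class)."""

    def __init__(self, name: str, action: Callable[[float], float]) -> None:
        self.name = name
        self.action = action

    def fire(self, value: float) -> float:
        return self.action(value)


class SequentialDirector:
    """Knows WHEN actors act: each actor fires once, in list order."""

    def run(self, actors: List[Actor], value: float) -> float:
        for actor in actors:
            value = actor.fire(value)
        return value


class RepeatingDirector:
    """A different schedule for the same actors: the chain fires `rounds` times."""

    def __init__(self, rounds: int) -> None:
        self.rounds = rounds

    def run(self, actors: List[Actor], value: float) -> float:
        for _ in range(self.rounds):
            for actor in actors:
                value = actor.fire(value)
        return value


# The same two actors, executed under two different directors.
actors = [Actor("double", lambda x: x * 2), Actor("increment", lambda x: x + 1)]
once = SequentialDirector().run(actors, 3.0)          # (3 * 2) + 1
twice = RepeatingDirector(rounds=2).run(actors, 3.0)  # ((3 * 2) + 1) * 2 + 1
print(once, twice)
```

The actors are unchanged between the two runs; only the scheduling policy differs, which is the point of the metaphor.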

19
Actors
actor name
data
parameters
Input data
Output data
ports
20
Actors
[Figure: an actor, labeled with its actor name, data, parameters, input data, output data, and ports; this actor has 2 output ports]
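
The parts of an actor named on these slides (name, parameters, input and output ports) can be sketched as a small data structure. The example below is an assumption-laden illustration, not Kepler code: `Port` and `ThresholdSplitter` are hypothetical names, and the actor is given one input port and two output ports to mirror the slide.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Port:
    """A named queue of data tokens (hypothetical, for illustration only)."""
    name: str
    tokens: List[float] = field(default_factory=list)


@dataclass
class ThresholdSplitter:
    """Hypothetical actor: one input port, one parameter, two output ports."""
    name: str
    threshold: float  # the actor's single parameter
    data_in: Port = field(default_factory=lambda: Port("input"))
    above: Port = field(default_factory=lambda: Port("above"))
    below: Port = field(default_factory=lambda: Port("below"))

    def fire(self) -> None:
        # Route each input token to one of the two output ports.
        for value in self.data_in.tokens:
            target = self.above if value >= self.threshold else self.below
            target.tokens.append(value)
        # Input tokens are consumed once they have been routed.
        self.data_in.tokens.clear()


splitter = ThresholdSplitter(name="splitter", threshold=10.0)
splitter.data_in.tokens.extend([4.0, 12.0, 10.0])
splitter.fire()
print(splitter.above.tokens, splitter.below.tokens)
```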
21
Right-click menu
22
Editing parameters
Double-click or right-click
0 to many
23
Configuring Ports
Right-click
Port types: String, Int, Double, array, user-defined
24
Procedure
  • Open a new workflow
  • Add a director
  • Search for data (optional)
  • Add data source (optional)
  • Add an actor
  • Edit parameters
  • Add ports (if needed)
  • Configure ports
  • Add another actor
  • Hook up input/output ports
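
The steps above are performed in Kepler's graphical interface, but the same sequence can be sketched in code to make the order of operations concrete. Everything here is hypothetical: `Workflow`, `chain_director`, and the actor names are invented for this sketch, and the director assumes the links form a single chain listed in execution order.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple


class Workflow:
    """Hypothetical container for a director, actors, parameters, and links."""

    def __init__(self) -> None:
        self.director: Optional[Callable[["Workflow", Any], Any]] = None
        self.actors: Dict[str, Callable[..., Any]] = {}
        self.params: Dict[str, Dict[str, Any]] = {}
        self.links: List[Tuple[str, str]] = []

    def add_actor(self, name: str, func: Callable[..., Any], **params: Any) -> None:
        self.actors[name] = func
        self.params[name] = params

    def connect(self, source: str, target: str) -> None:
        self.links.append((source, target))

    def run(self, data: Any) -> Any:
        if self.director is None:
            raise RuntimeError("add a director before running the workflow")
        return self.director(self, data)


def chain_director(workflow: Workflow, data: Any) -> Any:
    """Fire actors in link order, feeding each output to the next input."""
    if not workflow.links:
        return data
    order = [workflow.links[0][0]] + [target for _, target in workflow.links]
    for name in order:
        data = workflow.actors[name](data, **workflow.params[name])
    return data


wf = Workflow()                      # open a new workflow
wf.director = chain_director         # add a director
wf.add_actor(                        # add an actor and edit its parameters
    "scale", lambda xs, factor: [x * factor for x in xs], factor=2
)
wf.add_actor("total", lambda xs: sum(xs))  # add another actor
wf.connect("scale", "total")         # hook up output to input
result = wf.run([1, 2, 3])           # in-memory list stands in for a data source
print(result)
```

Running the workflow without a director raises an error, which mirrors why "Add a director" comes before adding actors in the procedure.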

25
  • Kepler Exercises

26
Acknowledgements
This material is based upon work supported by the
National Science Foundation under awards 0225676
for SEEK and 0225673 (AWSFL008-DS3) for GEON, by
the Department of Energy under Contract No.
DE-FC02-01ER25486 for SciDAC/SDM, and by DARPA
under Contract No. F33615-00-C-1703 for Ptolemy.
Any opinions, findings and conclusions or
recommendations expressed in this material are
those of the author(s) and do not necessarily
reflect the views of the National Science
Foundation (NSF).
The National Center for Ecological Analysis and
Synthesis, a Center funded by NSF (Grant Number
0072909), the University of California, and the
UC Santa Barbara campus.
The Andrew W. Mellon Foundation.
PBI Collaborators: NCEAS, University of New Mexico
(Long Term Ecological Research Network Office),
San Diego Supercomputer Center, University of
Kansas (Center for Biodiversity Research).
Kepler contributors: SEEK, Ptolemy II,
SDM/SciDAC, GEON.