Managing SCEC Workflows - PowerPoint PPT Presentation

Transcript and Presenter's Notes
1
Managing SCEC Workflows
Ewa Deelman, USC Information Sciences Institute
November 21, 2005
2
Acknowledgements
  • Pegasus Team Ewa Deelman, Carl Kesselman,
    Gaurang Mehta, Gurmeet Singh, Mei-Hui Su, Karan
    Vahi (ISI)
  • Collaboration with Yolanda Gil and her team, ISI
  • VDS is built in collaboration with Ian Foster,
    Mike Wilde (ANL) and Jens Voeckler (UofC)

3
SCEC CyberShake
  • Calculate hazard curves by generating synthetic
    seismograms from an estimated rupture forecast (a
    minimal sketch of this step follows below)

(Figure: Hazard Map; Strain Green Tensor)
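
The slide only names the computation, so here is a minimal
sketch of how per-rupture probabilities and peak intensities
from synthetic seismograms combine into a hazard curve,
assuming a simple Poissonian combination; the rupture list,
intensity values, and thresholds are invented, and this is
not the CyberShake code.

```python
# Minimal sketch (not the CyberShake code): combine per-rupture
# probabilities from a rupture forecast with peak intensities
# measured from synthetic seismograms into a hazard curve.
# All numbers below are illustrative assumptions.

ruptures = [
    # (annual probability of rupture, peak spectral acceleration
    #  in g taken from that rupture's synthetic seismogram)
    (0.010, 0.15),
    (0.004, 0.40),
    (0.001, 0.85),
]

def hazard_curve(ruptures, thresholds):
    """Annual probability of exceeding each ground-motion level,
    assuming independent (Poissonian) ruptures."""
    curve = []
    for x in thresholds:
        p_no_exceed = 1.0
        for p_rup, intensity in ruptures:
            if intensity > x:  # this rupture would exceed level x
                p_no_exceed *= (1.0 - p_rup)
        curve.append((x, 1.0 - p_no_exceed))
    return curve

for x, p in hazard_curve(ruptures, [0.1, 0.3, 0.5, 0.8]):
    print(f"P(SA > {x:.1f} g) per year = {p:.4f}")
```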
4
CyberShake on the SCEC VO

(Architecture diagram: Provenance Catalog, Data Catalog,
Workflow Scheduler Engine, VO Service Catalog, TeraGrid
Storage, SCEC Storage, TeraGrid Compute, VO Scheduler)
5
SCEC software infrastructure
6
SCEC workflows on the TG

(Figure: executable workflow)
7
SCEC Workflows on the TG
Gaurang Mehta at ISI ran the experiments
8
Data Management in CyberShake
  • Data staged out to SCEC resources (Pegasus,
    GridFTP)
  • Data registered in the RLS (Pegasus, RLS)
  • Metadata registered in MCS (SCEC application)
  • Provenance information registered in the PTC (VDS
    kickstart and PTC)
  • Up-to-date Hazard Curves available through a web
    interface (custom code); a sketch of this stage-out
    and registration flow follows below
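
A minimal sketch of the stage-out and registration flow above,
with in-memory stand-ins for the real services (GridFTP, RLS,
MCS, PTC); every name, URL, and record here is hypothetical,
and this is not Pegasus or VDS code.

```python
# Illustrative sketch of the slide's data-management steps with
# dictionary stand-ins for the real catalogs. All names invented.

replica_catalog = {}     # RLS stand-in: logical name -> physical URLs
metadata_catalog = {}    # MCS stand-in: logical name -> attributes
provenance_catalog = []  # PTC stand-in: one kickstart-style record/job

def transfer(src, dest):
    """Stand-in for a GridFTP third-party transfer."""
    print(f"transferring {src} -> {dest}")

def stage_out(logical_name, src_url, dest_url, job_record):
    """Copy an output to SCEC storage, then register it everywhere."""
    transfer(src_url, dest_url)
    replica_catalog.setdefault(logical_name, []).append(dest_url)
    metadata_catalog[logical_name] = {"site": "USC", "type": "seismogram"}
    provenance_catalog.append(job_record)

stage_out(
    "Seismogram_USC_128_3.grm",  # hypothetical output file
    "gsiftp://tg-storage.example.org/scratch/Seismogram_USC_128_3.grm",
    "gsiftp://scec-storage.example.edu/data/Seismogram_USC_128_3.grm",
    {"job": "seismogram_128_3", "exit": 0, "duration_s": 412},
)
```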

9
SCEC computations so far
  • Pasadena
    • 33 workflows
  • USC
    • 26 workflows
  • Each workflow
    • ~11,000 jobs
    • 23 days total runtime
  • Ran on the NCSA and SDSC TeraGrid sites
  • Failed job recovery (a minimal sketch follows this
    list)
    • Retries
    • Rescue DAG
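
A minimal sketch of the two recovery mechanisms named above, in
the spirit of Condor DAGMan: bounded per-job retries, plus a
"rescue" list of unfinished jobs that a later run resumes from.
Job names and failure odds are invented.

```python
# Sketch of retry + rescue-DAG recovery; not the DAGMan code.
import random

MAX_RETRIES = 3

def run_job(name):
    """Stand-in for submitting a grid job; fails 30% of the time."""
    return random.random() > 0.3

def run_workflow(jobs):
    rescue = []
    for job in jobs:
        for attempt in range(1 + MAX_RETRIES):
            if run_job(job):
                print(f"{job}: ok on attempt {attempt + 1}")
                break
        else:
            rescue.append(job)  # retries exhausted
    if rescue:
        # A rerun executes only these jobs and their descendants.
        print("rescue DAG:", rescue)

run_workflow([f"seismogram_{i}" for i in range(5)])
```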

10
So far, 2 SCEC sites done (Pasadena and USC)
11
Distribution of seismogram jobs

(Figure: distribution of seismogram jobs, annotated "70 hours")
12
Observations from working with the Scientists
  • A two-way street: they give us feedback on our
    technologies, and we show them how things run (and
    break) at scale
  • We have seen great performance improvements in
    the codes

13
Future directions
  • Manage multiple workflows as part of a single
    scientific analysis
  • Automate the workflow provisioning and workflow
    mapping transition
  • Improve the failure recovery process
  • Interface with WINGS within the CyberShake
    context
  • Evaluate the new CyberShake codes
  • Enlarge the workflow to encompass more of the
    end-to-end analysis
  • Transition the technology to the SCEC scientists

14
Summary
  • Ran 2 SCEC sites and provided results to
    scientists
  • Deployed a SCEC-oriented virtual grid and
    services
  • Leveraged existing software and hardware
  • Developed application-specific capabilities
  • Real-time data
  • Managed the output data, including registration
    and provenance tracking
  • Provided information that enabled the
    optimization of the CyberShake codes

15
Pegasus: Planning for Execution in Grids
  • Maps from an abstract to an executable workflow (a
    minimal planning sketch follows this list)
  • Automatically locates physical locations for both
    workflow components and data
  • Finds appropriate resources to execute the
    components
  • Reuses existing data products where applicable
  • Publishes newly derived data products
  • Provides provenance information
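
A minimal sketch of the planning steps above: bind each abstract
job to a site where its executable is installed, and prune jobs
whose outputs already exist. The catalogs, job names, and URLs
are invented stand-ins, not the Pegasus API.

```python
# Illustrative Pegasus-style planning; not the Pegasus code.

abstract_workflow = [
    {"job": "sgt_gen", "xform": "StrainGreenTensor",
     "out": "sgt_USC.dat"},
    {"job": "seis_synth", "xform": "SeismogramSynthesis",
     "out": "seis_USC.grm"},
]

# Transformation Catalog stand-in: transformation -> sites
# where a binary is installed.
transformation_catalog = {
    "StrainGreenTensor": ["NCSA", "SDSC"],
    "SeismogramSynthesis": ["SDSC"],
}

# Replica catalog (RLS) stand-in: logical file -> physical URL.
replica_catalog = {
    "sgt_USC.dat": "gsiftp://scec-storage.example.edu/sgt_USC.dat",
}

def plan(workflow):
    """Map an abstract workflow to an executable one."""
    executable = []
    for job in workflow:
        if job["out"] in replica_catalog:
            continue  # data reuse: output exists, skip the job
        site = transformation_catalog[job["xform"]][0]
        executable.append({**job, "site": site})
    return executable

for job in plan(abstract_workflow):
    print(f'{job["job"]}: run {job["xform"]} at {job["site"]}')
```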

16
Information Components used by Pegasus
  • Globus Monitoring and Discovery Service (MDS) (or
    a static file); a resource-selection sketch follows
    this list
    • Locates available resources
    • Finds resource properties
      • Dynamic: load, queue length
      • Static: location of GridFTP server, RLS, etc.
  • Globus Replica Location Service (RLS)
    • Locates data that may be replicated
    • Registers new data products
  • Transformation Catalog
    • Locates installed executables
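
A minimal sketch of choosing a site from MDS-style information,
using one static property (a GridFTP endpoint) and one dynamic
property (queue length); the site records are invented, and this
is not the Globus MDS API.

```python
# Hedged resource-selection sketch over MDS-like site records.

sites = [
    {"name": "NCSA", "queue_length": 120, "load": 0.8,
     "gridftp": "gsiftp://gridftp-ncsa.example.org"},
    {"name": "SDSC", "queue_length": 35, "load": 0.6,
     "gridftp": "gsiftp://gridftp-sdsc.example.org"},
]

def pick_site(sites):
    """Among sites advertising a GridFTP server (static property),
    prefer the shortest queue (dynamic property)."""
    candidates = [s for s in sites if s.get("gridftp")]
    return min(candidates, key=lambda s: s["queue_length"])

print("selected site:", pick_site(sites)["name"])
```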