Grid Analysis Environment (GAE) (overview)

Korea Workshop, May 2005

1
Grid Analysis Environment (GAE) (overview)
  • Frank van Lingen (fvlingen_at_caltech.edu)
  • (on behalf of the GAE group)

2
Outline
System View
Early Results
GAE
Frameworks
Projects
3
Goal
  • Provide a transparent environment for a physicist
    to perform his/her analysis (batch/interactive)
    in a distributed dynamic environment: identify
    your data (Catalogs), submit your (complex) job
    (Scheduling, Workflow, JDL), get fair access to
    resources (Priority, Accounting), monitor job
    progress (Monitor, Steering), get the results
    (Storage, Retrieval), repeat the process and
    refine results
  • Support data transfers ranging from the
    (predictable) movement of large scale (simulated)
    data, to the highly dynamic analysis tasks
    initiated by rapidly changing teams of scientists

4
Constraints
  • The Acid Test for Grids: crucial for LHC
    experiments
  • Large, diverse, distributed community of users
  • Support for 100s to 1000s of analysis
    tasks, shared among dozens of sites
  • Widely varying task requirements and priorities
  • Need for priority schemes, robust authentication
    and security
  • Operates in a resource-limited and
    policy-constrained global system
  • Dominated by collaboration policy and strategy
    (resource usage and priorities)
  • Requires real-time monitoring; task and
    workflow tracking; decisions often based on a
    global system view

5
System View
[Diagram labels: (Domain) Applications; (Domain) Portal; Service Oriented Architecture (Frameworks); Monitoring; Interface Specifications!; (High Level Services) Global Services; Local Services; Input; Resources; Network; Compute; Storage]
[Diagram labels: Software Lifecycle; Multiple System Stages; Design; Development; Testing; Deployment; Support/Feedback]
  • Not only research, but also software engineering
    and sociology

6
System View (Details)
  • Domains
    • Virtual Organization and Role management
  • Service Oriented Architecture
    • Authorized Access
    • Access Control Management (groups/individuals)
    • Discoverable
    • Protocols (XML-RPC, SOAP, ...)
    • Service Version Management
    • Frameworks: Clarens, MonALISA, ...
  • Monitoring
    • End-to-end monitoring, collecting and
      disseminating information
    • Provide visualization of monitor data to users

7
System View (Details)
  • Local Services (Local View)
    • Local Catalogs, Storage Systems, Task Tracking
      (Single User Tasks), Policies, Job Submission
  • Global Services (Global View)
    • Discovery Service, Global Catalogs, Job Tracking
      (Multiple User Tasks), Policies
  • High Level Services (Autonomous)
    • Act on monitor data and have a global view
    • Scheduling, Data Transfer, Network Optimization,
      Task Tracking (many users)

8
System View (Details)
  • (Domain) Portal
    • One-stop shop for applications/users to access
      and use Grid resources
    • Task Tracking (Single User Tasks)
    • Graphical User Interface
    • User session logging (provide feedback when
      failures occur)
  • (Domain) Applications
    • ORCA/COBRA, IGUANA, PHYSH, ...

9
Single User View
[Diagram: numbered steps 1 to 9 connecting Client Application, Steering, Dataset service, Discovery, Catalogs, Planner/Scheduler, Job Submission, Execution, Storage Management, Monitor Information, Data Transfer, Policy]
  • Catalogs to select datasets
  • Resource and application discovery
  • Schedulers guide jobs to resources
  • Policies enable fair access to resources
  • Robust (large size) data (set) transfer
Thousands of user jobs (multi-user environment!)
  • Feedback to users (e.g. status of their jobs)
  • Crash recovery of components (identify and
    restart)
  • Provide secure authorized access to resources and
    services
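
The bullets "Schedulers guide jobs to resources" and "Policies enable fair access to resources" can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Site` fields, the per-user quota model, and the `pick_site` function are hypothetical and do not reflect how SPHINX or any GAE scheduler actually works.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    free_slots: int      # assumed to come from monitor information
    has_dataset: bool    # assumed to come from a catalog lookup


def pick_site(sites, user, usage, quota):
    """Return the site name for the user's next job, or None if no job may run."""
    # Policy check: a user who has used up the quota has to wait.
    if usage.get(user, 0) >= quota.get(user, 0):
        return None
    candidates = [site for site in sites if site.free_slots > 0]
    if not candidates:
        return None
    # Prefer sites that already hold the data, then the one with most free slots.
    best = max(candidates, key=lambda site: (site.has_dataset, site.free_slots))
    return best.name


# Made-up monitor snapshot and policy tables.
sites = [Site("caltech", 2, True), Site("ufl", 10, False), Site("ucsd", 0, True)]
quota = {"alice": 5, "bob": 1}
usage = {"alice": 1, "bob": 1}

choice_alice = pick_site(sites, "alice", usage, quota)
choice_bob = pick_site(sites, "bob", usage, quota)
print(choice_alice, choice_bob)
```

In this toy model the policy decision comes first and the placement decision second, so a user over quota is never scheduled even when resources are idle; a real system would likely weigh the two together.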

10
Service Oriented View
  • Clients talk standard protocols to the "Grid
    Services Web Server"
  • Simple Web service API allows simple or complex
    analysis clients
  • Typical clients: ROOT, Web Browser, ...
  • Clarens portal hides complexity
  • Key features: Global Scheduler, Catalogs,
    Monitoring, Grid-wide Execution service
[Diagram labels: Analysis Client (x2); HTTP, SOAP, XML-RPC; Grid Services Web Server (Discovery, ACL management, Certificate based access); Monitoring; Application Execution; Grid Wide Execution]
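
To make "clients talk standard protocols" concrete, the following sketch shows an XML-RPC call being encoded, dispatched, and answered using only Python's standard `xmlrpc.client` marshalling functions, with no network involved. The method name `catalog.list_datasets`, the dataset names, and the `handle_request` helper are hypothetical; they are not the Clarens API.

```python
import xmlrpc.client


def list_datasets(owner):
    """Hypothetical service method backed by a made-up in-memory catalog."""
    catalog = {"cms": ["dataset_a", "dataset_b"]}
    return catalog.get(owner, [])


METHODS = {"catalog.list_datasets": list_datasets}


def handle_request(request_xml):
    """Server side: decode an XML-RPC call, dispatch it, encode the reply."""
    params, method_name = xmlrpc.client.loads(request_xml)
    if method_name not in METHODS:
        fault = xmlrpc.client.Fault(1, f"unknown method: {method_name}")
        return xmlrpc.client.dumps(fault, methodresponse=True)
    result = METHODS[method_name](*params)
    return xmlrpc.client.dumps((result,), methodresponse=True)


# Client side: build the request document, then decode the response document.
request_xml = xmlrpc.client.dumps(("cms",), methodname="catalog.list_datasets")
response_xml = handle_request(request_xml)
(datasets,), _ = xmlrpc.client.loads(response_xml)
print(datasets)
```

Because both documents are plain XML sent over HTTP, any client with an XML-RPC library (a ROOT plug-in, a script, a browser-side library) can talk to the same service; decoding a fault response with `xmlrpc.client.loads` raises `xmlrpc.client.Fault` on the client.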
11
Peer-2-Peer View
  • Peer-2-Peer configuration enhances robustness and
    scalability
  • Provides
    • Discovery (services, software)
    • Publication
    • Subscription
  • No single point of failure
[Diagram labels: discover; publish; subscribe]
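
As a rough illustration of why a peer-to-peer configuration avoids a single point of failure, the sketch below replicates a published service entry to neighbouring peers and lets a client discover it from any peer that is still reachable. The `Peer` class, the `discover_any` helper, and the URL are hypothetical and much simpler than a real discovery network.

```python
class Peer:
    """One node in a hypothetical peer-to-peer discovery network."""

    def __init__(self, name):
        self.name = name
        self.alive = True
        self.registry = {}      # service name -> URL
        self.neighbours = []    # other Peer objects

    def publish(self, service, url):
        """Record a service locally and replicate it to live neighbours."""
        self.registry[service] = url
        for peer in self.neighbours:
            if peer.alive:
                peer.registry[service] = url

    def discover(self, service):
        """Look up a service; a dead peer behaves like an unreachable host."""
        if not self.alive:
            raise ConnectionError(f"{self.name} is unreachable")
        return self.registry.get(service)


def discover_any(peers, service):
    """Ask each peer in turn, skipping the ones that cannot be reached."""
    for peer in peers:
        try:
            url = peer.discover(service)
        except ConnectionError:
            continue
        if url is not None:
            return url
    return None


# Three fully connected peers (names and URL are made up).
a, b, c = Peer("a"), Peer("b"), Peer("c")
for peer in (a, b, c):
    peer.neighbours = [other for other in (a, b, c) if other is not peer]

a.publish("catalog", "https://example.org/catalog")
a.alive = False  # the publishing peer goes down
found = discover_any([a, b, c], "catalog")
print(found)
```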
12
Framework (Clarens)
  • Authentication (X509)
  • Access control on Web Services
  • Remote file access (with access control)
  • Discovery of Web Services and Software
  • Shell service: shell-like access to remote
    machines (managed by access control lists)
  • Proxy certificate functionality
  • Group management: VO and role management
  • Good performance of the Web Service Framework
  • Integration with MonALISA
[Diagram labels: 3rd party application; Clarens Client; http/https; XML-RPC, SOAP, Java RMI, JSON RPC, ...; Web server; Clarens; Service]
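
Access control on web services, keyed on the certificate's distinguished name (DN) and on group membership, might look roughly like the sketch below. The ACL layout, the rule that the most specific ACL wins, the default-deny behaviour, and all DNs and group names are assumptions for illustration; the slides do not describe the actual Clarens ACL semantics.

```python
from dataclasses import dataclass, field


@dataclass
class ACL:
    allow_dns: set = field(default_factory=set)
    allow_groups: set = field(default_factory=set)
    deny_dns: set = field(default_factory=set)


# Hypothetical groups and per-service ACLs.
GROUPS = {"cms_analysis": {"/O=Example/CN=Alice"}}
ACLS = {
    "file": ACL(allow_groups={"cms_analysis"}),
    "shell": ACL(allow_dns={"/O=Example/CN=Admin"}),
}


def is_allowed(dn, method):
    """Check a call such as 'file.read' against the most specific ACL found."""
    parts = method.split(".")
    # Walk from the full method name up to the top-level service name.
    for depth in range(len(parts), 0, -1):
        acl = ACLS.get(".".join(parts[:depth]))
        if acl is None:
            continue
        if dn in acl.deny_dns:
            return False
        if dn in acl.allow_dns:
            return True
        return any(dn in GROUPS.get(group, set()) for group in acl.allow_groups)
    return False  # no ACL at any level: deny by default


alice_reads = is_allowed("/O=Example/CN=Alice", "file.read")
alice_shell = is_allowed("/O=Example/CN=Alice", "shell.cmd")
admin_shell = is_allowed("/O=Example/CN=Admin", "shell.cmd")
print(alice_reads, alice_shell, admin_shell)
```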
13
Framework (MonALISA)
Monitor Sensors (Web Services/Applications/...)
  • Active filter agents
    • Process data
    • Application specific monitoring
  • Mobile agents
    • Decision support
    • Global optimisations
[Diagram: Web Services (WS) and Applications (App), MonALISA Station Servers (SS) in the MonALISA JINI Network, and AppS, connected by (1) Publish, (2) Disseminate, (3) Subscribe (predicates), (4) Steer/Retrieve]
  • Services are self describing
  • Code updates
    • Automatic and secure
  • Dynamic configuration for services
    • Secure Admin Interface
  • MonALISA able to dynamically register and discover
  • Based on multi-threaded engine
    • Very scalable
Fully distributed, no single point of failure!
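
The publish and "Subscribe (predicates)" steps in the diagram can be sketched as a small in-process example: subscribers register a predicate, and only the monitoring samples that match it are delivered. The `StationServer` class, the sample fields, and the threshold are hypothetical; this is not the MonALISA API.

```python
class StationServer:
    """Toy stand-in for a monitoring station that filters by predicate."""

    def __init__(self):
        self.subscriptions = []  # list of (predicate, callback) pairs

    def subscribe(self, predicate, callback):
        """Register interest in every sample for which predicate(sample) is true."""
        self.subscriptions.append((predicate, callback))

    def publish(self, sample):
        """Deliver a sample to matching subscribers; return how many received it."""
        delivered = 0
        for predicate, callback in self.subscriptions:
            if predicate(sample):
                callback(sample)
                delivered += 1
        return delivered


server = StationServer()
alerts = []

# Subscribe only to high CPU load readings (field names are made up).
server.subscribe(
    lambda s: s["metric"] == "cpu_load" and s["value"] > 0.9,
    alerts.append,
)

hot = server.publish({"source": "node1", "metric": "cpu_load", "value": 0.95})
cool = server.publish({"source": "node2", "metric": "cpu_load", "value": 0.20})
print(hot, cool, alerts)
```

Filtering at the station keeps uninteresting samples off the network, which matters when thousands of sensors publish continuously.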
14
GAE Related Projects
  • DISUN (deployment)
    • Deployment and Support for Distributed Scientific
      Analysis
  • Ultralight (development)
    • Treating the network as a resource
    • Vertically integrated monitor information
    • Multi-user, resource-constrained view
  • MCPS (development)
    • Provide Clarens based Web Services for batch
      analysis (workflow)
  • SPHINX (development)
    • Policy based scheduling (global service) exposed
      as a Clarens Web Service using MonALISA monitor
      information
  • SRM/dCache (development)
    • Service based data transfer (local service)
  • Lambda Station (development)
    • Authorized programmability of routers using
      MonALISA and Clarens
  • PHYSH
    • Clarens based services for command line user
      analysis
  • CRAB
    • Client to support user analysis using the Clarens
      Framework

15
GAE and Ultralight: Make the Network an Integrated
Managed Resource
  • Unpredictable multi-user analysis
  • Overall demand typically fills the capacity of
    the resources
  • Real-time monitor systems for networks, storage,
    computing resources, E2E monitoring
[Diagram labels: Application Interfaces; Request Planning; Monitor; Network Planning; Network Resources]
Support data transfers ranging from the
(predictable) movement of large scale (simulated
and real) data, to the highly dynamic analysis
tasks initiated by rapidly changing teams of
scientists
16
Combining Grid Projects into Grid Analysis
Environment
[Diagram labels: Grid Analysis Environment; Frameworks; Clarens_Applications; MonALISA_Applications; PHEDEX; SPHINX; CRAB; Ultralight; SRM/dCache; PHYSH; Lambda Station; DISUN; MCPS; Policy; ..; Development; Testing; Deployment; Support/Feedback]
GAE focuses on integration
17
GAE Deployment
  • Installation of CMS (ORCA, COBRA, IGUANA, ...) and
    LCG (POOL, SEAL, ...) software on Caltech GAE
    testbed. Serves as environment to integrate
    applications as web services into the Clarens
    framework.
  • Demonstrated distributed multi-user GAE prototype
    at SC03, SC04 (using BOSS and SPHINX)
  • Analysis prototype in January 2005; currently
    being upgraded to work with CRAB
  • PHEDEX deployed at Caltech, UFL, UCSD and
    transferring data

18
GAE Deployment
  • Clarens has been deployed on 30 machines. Other
    sites: Caltech, Florida, Fermilab, CERN,
    Pakistan, INFN (see Clarens Discovery Service)
  • Available in Java and Python
  • Core system: Discovery, Proxy, Authentication,
    Remote File Access, Shell Service, ...
  • Clarens Discovery Service part of OSG (uses
    MonALISA)
  • Talking with EGEE on common interface
  • Software Discovery Service being developed (SCRAM
    based prototype available)
  • GAE distributed test framework (Java, Python)
    available. Daily testing and verification.
  • Generic Catalog Service developed to expose
    pooldbs, refdb, phedex, and other DBs as
    entities with key/values
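
The idea of exposing different databases as "entities with key/values" can be sketched with an in-memory SQLite table standing in for one of those databases. The table layout, the column names, and the `rows_as_entities` and `query` helpers are hypothetical; the slides do not show the Generic Catalog Service interface.

```python
import sqlite3


def rows_as_entities(connection, table, id_column):
    """Expose each row of a table as entity_id -> {key: value}.

    The table name is interpolated directly, so it must come from trusted code.
    """
    connection.row_factory = sqlite3.Row
    entities = {}
    for row in connection.execute(f'SELECT * FROM "{table}"'):
        record = dict(row)
        entities[str(record.pop(id_column))] = record
    return entities


def query(entities, **criteria):
    """Return the ids of entities whose key/values match every criterion."""
    return sorted(
        entity_id
        for entity_id, values in entities.items()
        if all(values.get(key) == wanted for key, wanted in criteria.items())
    )


# A made-up dataset table standing in for one backend database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE datasets (name TEXT, site TEXT, events INTEGER)")
conn.executemany(
    "INSERT INTO datasets VALUES (?, ?, ?)",
    [("ds1", "caltech", 1000), ("ds2", "ufl", 500)],
)

entities = rows_as_entities(conn, "datasets", "name")
at_caltech = query(entities, site="caltech")
print(entities["ds2"], at_caltech)
```

Once every backend is reduced to the same entity and key/value shape, a client can use one query function regardless of which database holds the record.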

19
GAE Deployment
  • PHEDEX and Pubdb Web Services deployed in Florida
    and Caltech
    • Part of the test framework
  • Working with the MCPS group to develop a Clarens
    based MCPS Web Service interface and browser
    based GUI
    • Supporting Production and Analysis
  • Web Service interface to BOSS (CMS Job/Task
    book-keeping system)
  • SPHINX: distributed scheduler developed at UFL
  • Clarens/MonALISA integration: facilitating
    user-level job/task interaction
    • First version of a Steering Service
  • Java Webstart dashboard being developed for
    Clarens based services
    • Also cross-browser support for JavaScript
      browser GUI (Firefox, IE, Safari, ...) based on
      JSON
  • CAVES: analysis code-sharing environment
    developed at UFL
  • Work with CERN to have the GAE components
    included in CMS software distribution
  • GAE components being integrated in the DPE and
    VDT distribution used in US-CMS

20
Lessons learned
  • Quality of (the) service(s)
    • Lot of exception handling needed for robust
      services (graceful failure of services)
    • Time outs are important
    • Need very good performance for composite services
  • Discovery services
    • Enable location independent service composition
    • Semantics of services are important (different
      name, name space, and/or WSDL)
  • Web service design: not every application is
    developed with a web service interface in mind
  • Interfaces of 3rd party applications change:
    Rapid Application Development
  • Overlapping functionality of applications (but
    not same interfaces!)
  • Not one single solution for CMS
  • Not every problem has a technical solution,
    conventions also important
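
The points about time outs and graceful failure can be illustrated with a small wrapper that bounds each call to a service, retries a few times, and returns a fallback value instead of crashing. This is a minimal sketch using Python's standard `concurrent.futures`; the `call_with_timeout` helper and the two toy services are hypothetical.

```python
import concurrent.futures
import time


def call_with_timeout(func, timeout_s, retries=2, default=None):
    """Call func(), waiting at most timeout_s per attempt; fall back to default."""
    for _ in range(retries + 1):
        executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
        try:
            return executor.submit(func).result(timeout=timeout_s)
        except concurrent.futures.TimeoutError:
            continue  # the service was too slow; try again
        except Exception:
            continue  # the service raised; try again
        finally:
            # Do not block on a hung call; its thread finishes in the background.
            executor.shutdown(wait=False)
    return default


calls = {"n": 0}


def flaky():
    """Toy service that fails on the first call and succeeds afterwards."""
    calls["n"] += 1
    if calls["n"] == 1:
        raise RuntimeError("temporary failure")
    return "ok"


def slow():
    """Toy service that always takes longer than the caller will wait."""
    time.sleep(0.3)
    return "late"


result_flaky = call_with_timeout(flaky, timeout_s=1.0)
result_slow = call_with_timeout(slow, timeout_s=0.05, retries=1, default="unavailable")
print(result_flaky, result_slow)
```

For composite services the per-call bound matters most: one slow backend otherwise stalls every service built on top of it.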

21
Future Directions
  • Data movement using PHEDEX
  • Integration of runjob into current deployment of
    services
  • Full chain of end to end analysis
  • Develop/deploy accounting service
  • Improved GUI (Webstart client)
  • Improve exception handling
  • Integrate mass storage (e.g. SRM) applications
    into the Clarens environment and make them
    interoperate with it
  • Steering service
  • Autonomous replication
  • Trend analysis using monitor data
  • E2E error trapping and diagnosis: cause and
    effect
  • Strategic workflow re-planning
  • Adaptive steering and optimization algorithms
  • Multi-user distributed environment

22
More Information
  • GAE: http://ultralight.caltech.edu/gaeweb/portal
  • Ultralight: http://ultralight.caltech.edu
  • WIKI: http://ultralight.caltech.edu/gaeweb/wiki/
Thank You.
Questions?