Status Report - PowerPoint PPT Presentation

Title: Status Report


1
OGSA/GT3 evaluation
  • Status Report
  • D. Foster et al.

2
Table of Contents
  • Introduction
  • GT3 ToolKit Experience
  • GT3 Performance studies
  • Integration of Existing Codes/Services
  • Summary, Conclusions, and Outlook

3
OGSA/GT3 activity: Introduction
4
GT3 Activities
  • Motivation
  • The promise of the web services framework.
  • Rigorous structure for interface specification
    and communication semantics.
  • Basic framework becoming widely used.
  • Much activity in defining required interfaces for
    the Grid (OGSA)
  • First release of the new Globus toolkit in May
    2003.
  • OGSI framework and some grid services
  • GT3 out July the 1st
  • To provide input to the EGEE middleware activity.
  • Input to the strategic planning and architecture
    activity
  • Initial objectives
  • Three primary objectives
  • Understand the GT3 offering and its quality.
  • Test the available hosting environments.
  • Learn how to create new services in this
    framework.
  • Study how to leverage existing developments
    (AliEn) in an OGSA context.

5
OGSA Engineering Group
  • Proposed to the LCG referees (May 2003)
  • Started in June 2003
  • Massimo Lamanna: Overall Coordination (CERN)
  • Ricardo Brito Da Rocha: Test Service Development
    (EDG)
  • Alexander Kryukov: Test Service Development
    (MSU)
  • Andrey Demichev: Testbed Setup (MSU)
  • Volodia Kalyaev: Test Service Development
    (MSU/CERN Summer Student)
  • Viktor Pose: Performance and Testing (JINR,
    Dubna)
  • Claude Wang: AliEn (Academia Sinica, Taipei)
  • Most people at CERN only for short periods.
  • What will be presented represents what we have
    been able to do in the short period since
    starting.

6
Overall Approach
  • Create a simple testbed with the GT3 toolkit.
  • First release was June 30th
  • Create some new simple services.
  • learn by doing
  • Demonstrate the results and measure performance.
  • Start to work with the AliEn components to
    understand them.
  • Can we envisage one framework and competitive
    services?
  • Report on the activity after a few months
    (mid-Sept was the target).
  • Plan the next 6 months.

7
OGSA/OGSI Overview
  • After the first generation of Grid toolkits and
    middleware, emphasis on
  • Agree on standards
  • Build an open system
  • Open Grid Service Architecture (OGSA)
  • Component model and high-level design
  • High-level services
  • Open Grid Service Infrastructure (OGSI)
  • Conventions
  • Detailed implementation issues
  • Actual implementation: Globus Toolkit 3
  • Other implementations could coexist
  • The whole idea behind OGSA/OGSI does require
    heterogeneity
  • Standard components from the Web world
  • SOAP (Simple Object Access Protocol) to convey
    messages (XML payloads)
  • WSDL (Web Services Description Language) to
    describe interfaces
  • They are hosted in specific environments
  • Standalone container, Tomcat, IBM WebSphere
  • .NET

8
What does GT3 offer? (now)
  • The first OGSI implementation (July 2003)
  • The toolkit itself
  • Build new services and extend existing ones
  • Security Infrastructure
  • GSI
  • Services
  • GRAM (GT2 implementation wrapped up as a Grid
    service)
  • IS (new GT3 implementation)
  • RFT (Globus FTP)
  • RLS (GT2 implementation as a Grid service)

9
TestBeds
  • First hand experience on Globus Toolkit 3
  • This can be achieved only by using it!
  • Prototypes, with the following common features
  • Small
  • Working (with limited functionality)
  • No architectural ambition
  • Engineering approach
  • Mapping of functionality to prototype functions

10
TestBeds
  • GT3 TestBed
  • 4 CERN machines + 1 in Moscow
  • Focus on GT3 basic functionality and performance
  • AliEn TestBed
  • 3 CERN TestBed machines
  • Future ARDA TestBed
  • Focus on the complexity of future possible
    architectures

11
GT3 TestBed
12
GT3 TestBed
  • Simple system to distribute jobs and retrieve
    output
  • No security (for most services)
  • The user asks the Resource Broker (RB) to select
    the best Computing Element (CE)
  • The user submits the job to the CE
  • The Information and the Logging and Bookkeeping
    services exchange information mainly with the RB
  • Why did we do it this way?
  • Simple scheme
  • As already mentioned no architectural ambitions
  • Learn by doing!
  • What did we learn out of it?
  • See next slides
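
The job flow on this slide (ask the RB for the best CE, then submit to that CE) can be sketched roughly as follows. This is a minimal illustration, not the testbed code: the class names, the CE names, and the "most free slots" ranking criterion are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ComputingElement:
    """Hypothetical stand-in for a CE; not a GT3 or GRAM class."""
    name: str
    free_slots: int
    jobs: list = field(default_factory=list)

    def submit(self, job: str) -> str:
        # The user submits directly to the CE chosen by the broker.
        if self.free_slots <= 0:
            raise RuntimeError(f"{self.name} has no free slots")
        self.free_slots -= 1
        self.jobs.append(job)
        return f"{job}@{self.name}"


class ResourceBroker:
    """Hypothetical RB: ranks CEs by free slots (an assumed criterion)."""

    def __init__(self, elements):
        self.elements = list(elements)

    def select_best(self) -> ComputingElement:
        return max(self.elements, key=lambda ce: ce.free_slots)


def run_job(broker: ResourceBroker, job: str) -> str:
    ce = broker.select_best()   # step 1: ask the RB for the best CE
    return ce.submit(job)       # step 2: submit the job to that CE


broker = ResourceBroker([
    ComputingElement("cern-ce-1", free_slots=1),
    ComputingElement("moscow-ce", free_slots=3),
])
first = run_job(broker, "job-1")
second = run_job(broker, "job-2")
```

Keeping selection (RB) and execution (CE) in separate objects mirrors the split described above, where the broker only advises and the user submits the job.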

13
GT3 TestBed
  • Resource broker and LB (Custom service)
  • Surprisingly fast to set-up
  • A few computing elements (GT3-GRAM, with
    modifications)
  • 2 PC boxes in the CERN Computing Centre
  • In a second phase, one PC located in Moscow was
    added
  • Some problems (solved) in data stage-in/stage-out
  • See GRAM comments in the performance part
  • Information service (GT3-IS)
  • Native GT3 service
  • In this TestBed talks only with other services

14
GT3 TestBed coverage
[Diagram labels: pull data access; every service must implement this PortType; push data access]
15
First summary
  • GT3 is the first OGSI 1.0 implementation
  • Main focus of all activity so far
  • GT3 (ToolKit and documentation) is in a state
    that allows a quick start
  • Not everything is perfect, but GT3 is more mature
    than expected
  • Development experience and quantitative
    measurements are in the next sections of the
    presentation
  • GT3 provides a few OGSA services by now
  • GRAM and RLS (GT2)
  • IS (Information Service)
  • RFT (Reliable File Transfer, GridFTP based)
  • GT3 encourages the creation of custom services
  • The OGSI system provides the building blocks to
    provide a variety of services

16
GT3 ToolKit Experience
17
Grid Service Development
  • Grid Services
  • Extended Web Services complying with the OGSI
    specification
  • Core Architecture

[Diagram labels: HOSTING ENVIRONMENT, GRID CONTAINER, GRID SERVICES, COMPLEMENTARY, OGSI IMPL., WEB SERVICES ENGINE]
18
Grid Service Development
  • What we get
  • From Web Services
  • Interoperability
  • standard for message creation and definition:
    XML
  • standard for protocol-independent message
    passing: SOAP
  • standard for service definition: WSDL
  • result: choice of hosting environment is left to
    the service provider
  • Service Oriented Design approach
  • From OGSI
  • Stateful Services (Service Data)
  • Other common features on independent services
  • Different from GT2 where nothing is common
    between services apart from GSI
  • Straightforward development: common framework
    for service usage and management

19
Grid Service Development
  • What we get
  • From the Globus Toolkit 3
  • Security Infrastructure
  • Authentication, authorization, delegation,
    message integrity and encryption
  • Higher-Level Services
  • Information Services: Index Service
  • Data Management: RLS and RFT
  • Master Managed Job Factory: GT3 interface for
    GRAM
  • In summary
  • Interoperable and environment independent
    services

20
Grid Service Development
  • Current options
  • Hosting Environments
  • J2EE Application Servers: Jakarta Tomcat, GT3
    Standalone Container, WebSphere
  • Microsoft .NET Platform
  • OGSI implementations
  • J2EE Servers: Globus Toolkit 3
  • Microsoft .NET: OGSI.NET (Virginia Univ.),
    MS.NETGrid (EPCC)
  • Others are appearing
  • Any environment with an existing implementation
    of a Web Services engine is one single step away
    from providing Grid Services
  • E.g. OGSILite (Perl), pyGridWare (Python)

21
Designing Grid Services
  • Important concepts when designing Grid Services
  • Factories and Instances

[Diagram labels: FACTORY, CLIENT, INSTANCE; interactions numbered 1 and 2]
  • Factories create instances and respond to
    instance creation requests by clients
  • Instances respond to clients' service-specific
    interaction requests
  • Advantages
  • Workload balancing between pools of instances
  • User dependent instances
  • Disadvantages
  • Instance creation overhead
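
The factory/instance split can be illustrated with a small in-process sketch. The names below (`EchoFactory`, `EchoInstance`, `create_service`, the `echo-N` handles) are hypothetical and are not GT3 classes; the example assumes only that a factory hands out fresh instances and that each instance holds per-client state.

```python
import itertools


class EchoInstance:
    """Hypothetical service instance holding per-client state."""

    def __init__(self, handle: str, owner: str):
        self.handle = handle
        self.owner = owner
        self.calls = 0
        self.destroyed = False

    def echo(self, message: str) -> str:
        if self.destroyed:
            raise RuntimeError("instance already destroyed")
        self.calls += 1
        return message

    def destroy(self) -> None:
        self.destroyed = True


class EchoFactory:
    """Hypothetical factory: its only job is to create instances."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.instances = {}

    def create_service(self, owner: str) -> EchoInstance:
        handle = f"echo-{next(self._ids)}"
        instance = EchoInstance(handle, owner)
        self.instances[handle] = instance
        return instance


factory = EchoFactory()
alice = factory.create_service("alice")   # 1: client asks the factory
bob = factory.create_service("bob")
reply = alice.echo("hello")               # 2: client talks to its instance
alice.destroy()
```

Each client gets its own instance (the "user dependent instances" advantage), at the price of one extra creation call per client, which is the overhead listed as a disadvantage.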

22
Designing Grid Services
  • Approach
  • Service Data, Subscriptions and Notifications

[Diagram labels: GRID SERVICE A (SDE A1, SDE A2), GRID SERVICE B (SDE B1); 1 - SUBSCRIPTION; 2,.. - NOTIFICATIONS]
  • Each Grid Service has its own Service Data Set -
    collection of Service Data Elements (SDEs)
  • Every SDE has a set of associated values
    concerning its validity in time: goodFrom,
    goodUntil, availableUntil
  • A service or client may declare interest in an SDE
    by issuing a Subscription
  • Service Data flows by means of Notifications,
    normally when a change occurs or the value
    lifetime has expired
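
A minimal sketch of Service Data with subscriptions and notifications might look like this. The classes are illustrative only, not the OGSI or GT3 API; the reading of goodFrom/goodUntil as a validity window is an assumption of the example, and notification on lifetime expiry is left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class ServiceDataElement:
    """Hypothetical SDE value with the three validity attributes."""
    name: str
    value: object
    good_from: float
    good_until: float
    available_until: float

    def is_good(self, now: float) -> bool:
        # Assumed meaning: the value is valid inside this window.
        return self.good_from <= now <= self.good_until


class GridService:
    """Hypothetical service holding a Service Data Set."""

    def __init__(self):
        self.service_data = {}
        self.subscribers = {}   # SDE name -> list of callbacks

    def subscribe(self, sde_name, callback):
        self.subscribers.setdefault(sde_name, []).append(callback)

    def set_sde(self, sde: ServiceDataElement):
        self.service_data[sde.name] = sde
        for callback in self.subscribers.get(sde.name, []):
            callback(sde)   # push a notification on every change


received = []
service_a = GridService()
service_a.subscribe("A1", lambda sde: received.append(sde.value))
service_a.set_sde(ServiceDataElement("A1", "idle", 0.0, 10.0, 20.0))
service_a.set_sde(ServiceDataElement("A2", "n/a", 0.0, 10.0, 20.0))
service_a.set_sde(ServiceDataElement("A1", "busy", 10.0, 20.0, 30.0))
```

Only changes to the subscribed SDE ("A1") reach the subscriber; the update to "A2" is stored but not pushed.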

23
Writing Grid Services in GT3
  • You need
  • A service interface: GWSDL (WSDL extended)
  • manually written or generated from existing Java
    code
  • The service implementation
  • directly extending a basic Grid Service or using
    Operation Providers (delegation) in Java
  • A deployment descriptor
  • defined using WSDD (Web Service Deployment
    Descriptor)
  • A build file
  • For use by the Jakarta Ant build tool
  • RESULT: A JAR file to use for deployment (GAR)
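
The two implementation styles named above (extending a base service directly, or plugging in operation providers by delegation) can be contrasted in a short sketch. GT3 services are written in Java; the Python below only illustrates the design choice, and every class and method name in it is hypothetical.

```python
class BaseGridService:
    """Hypothetical base class; stands in for a basic Grid Service."""

    def __init__(self):
        self._providers = []

    def add_operation_provider(self, provider) -> None:
        self._providers.append(provider)

    def invoke(self, operation: str, *args):
        # Prefer a method defined on the service itself (inheritance),
        # then fall back to the registered providers (delegation).
        if hasattr(self, operation):
            return getattr(self, operation)(*args)
        for provider in self._providers:
            if hasattr(provider, operation):
                return getattr(provider, operation)(*args)
        raise AttributeError(f"unknown operation: {operation}")


class MathService(BaseGridService):
    """Option 1: extend the base service directly."""

    def add(self, a, b):
        return a + b


class MultiplyProvider:
    """Option 2: a plain class plugged in as an operation provider."""

    def multiply(self, a, b):
        return a * b


service = MathService()
service.add_operation_provider(MultiplyProvider())
total = service.invoke("add", 2, 3)
product = service.invoke("multiply", 2, 3)
```

Delegation keeps the provider free of any base class, so existing code can be reused without changing its inheritance tree.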

24
Using Grid Services
  • Grid Services in action

[Diagram labels: HOSTING ENVIRONMENT, GRID CONTAINER, SERVICE IMPLEMENTATION, STUBS, WSDL DESCRIPTION, CLIENT, STUBS, APPLICATION; steps numbered 1 and 2]
25
Work Summary
  • GT3 Testbed
  • Performance Prototypes
  • Dummy Service
  • Dummy Secure Service
  • Dummy Service with Service Data
  • Dummy Service with Notifications
  • Dummy Service Index Service
  • Index Listener
  • Higher Level Prototyping
  • File Catalog Service
  • Metadata Catalog Service
  • Storage Element Service
  • Workload Management Service
  • Computing Element
  • Authentication and Authorization

26
Globus Toolkit 3 Overview
  • The Globus Toolkit 3 is a complete implementation
    of the OGSI specification
  • The development process is much easier when
    compared with previous versions of the toolkit
  • Some additional components to what is in OGSI
    proved essential to achieve this
  • Security Infrastructure
  • GSI3 is an easy to use security provider,
    abstracting the developer from the major issues
    it deals with
  • Deployment Tools
  • By using Ant and providing sample build files for
    service deployment, the developer can focus most
    of his time on the implementation of the service
    features
  • Backward compatibility
  • All GT2 components are shipped with the GT3 full
    bundle
  • Some services remain usable: those where only an
    OGSI-compliant interface was provided (e.g. GRAM)
  • Others are completely independent implementations
    (e.g. MDS2 and MDS3)
  • A large user community is being built

27
Globus Toolkit 3 Overview
  • Steep learning curve - it represents a new
    approach to service design and implementation
    (many small details that take time)
  • Incomplete documentation: this is a real problem
    being faced by developers at this time
  • Several bugs found in these exercises
  • Core implementation related: due to the
    framework's short lifetime
  • From tools deployed with the framework: hard to
    solve (e.g. Axis)
  • From the outside: easy to solve (e.g. Tomcat)
  • Resource Management services still based on GRAM
    with an OGSI-compliant but complex architecture
    behind
  • Good resources for documentation and good
    interaction for problem solving
  • OGSI 1.0 Specification
  • GT3 Tutorial: http://www.casa-sotomayor.net/gt3-tutorial/
  • Globus Discuss: discuss@globus.org
  • Globus Bugzilla

28
GT3 Performance measurements
29
Overview
  • Goal
  • explore GT3 under heavy load/concurrency
  • maximal throughput/rate of GT3 services
  • see the limiting factors
  • GT3 grid services measured
  • GRAM
  • DummyService
  • IndexService

30
GT3 GRAM performance
  • Setup
  • GRAM in GT3 standalone container
  • managed-job-globusrun clients started
    simultaneously on up to 32 client nodes (lxplus)
    in non-batch mode used to submit jobs to GT3 GRAM
  • GRAM hardware: 2 Intel Pentium III 600MHz
    processors, 256MB RAM
  • Note: 1 managed-job-globusrun client can submit
    1 job

31
GT3 GRAM performance
  • Results: service node
  • Saturation throughput for job submission on the
    service node: 3.8 jobs/minute with an average CPU
    user+system usage of 62%
  • Comments
  • scalability issue for heavily used servers

32
GT3 GRAM performance
  • Results: client node
  • on a client node with 2 Intel Pentium III 600MHz
    processors and 256MB RAM, a managed-job-globusrun
    client consumes on average 16 seconds of CPU
    user+system time (on both CPUs) for the
    submission of 1 job
  • Comment
  • lightweight clients (e.g. written in "C") needed

33
DummyService performance
  • Setup (1)
  • each DummyService client executes the following
    steps
  • calls DummyServiceFactory to create a
    DummyService instance
  • executes 2 simple methods (echo and getTime) on
    the DummyService instance
  • calls DummyService instance to destroy itself
  • up to 1000 clients talking to the DummyService
    were run simultaneously on up to 45 client nodes
    (lxplus)
  • with and without authentication via GSI message
    level security, used according to guides and
    tutorials at www.globus.org
  • grid service node hardware: 2 Intel Pentium III
    600MHz processors, 256MB RAM
  • Setup (2)
  • same as Setup (1), but step 2. consists of 100
    cycles, each of them calling the 2 simple methods
    (echo and getTime) on the DummyService instance
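
The client steps in Setup (1) and Setup (2) can be sketched as a small load driver. This is an in-process stand-in, not the measurement code: the real clients called a remote GT3 service from separate nodes, and the thread pool, class names, and parameters here are assumptions for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


class DummyService:
    """In-process stand-in; the real tests called a remote GT3 service."""

    def __init__(self):
        self.alive = True

    def echo(self, message):
        return message

    def get_time(self):
        return time.time()

    def destroy(self):
        self.alive = False


class DummyServiceFactory:
    def create(self):
        return DummyService()


def run_client(factory, cycles):
    """One client: create an instance, call both methods, destroy it."""
    instance = factory.create()           # step 1
    calls = 0
    for _ in range(cycles):               # step 2 (1 cycle or 100 cycles)
        instance.echo("ping")
        instance.get_time()
        calls += 2
    instance.destroy()                    # step 3
    return calls


def run_load(clients, cycles, workers=8):
    """Run many clients concurrently and time the whole batch."""
    factory = DummyServiceFactory()
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda _: run_client(factory, cycles),
                                range(clients)))
    elapsed = time.perf_counter() - start
    return sum(results), elapsed


total_calls, elapsed = run_load(clients=50, cycles=100)
```

Setting `cycles=1` corresponds to Setup (1), where instance creation dominates; `cycles=100` corresponds to Setup (2), where the per-call cost dominates.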

34
DummyService performance
  • Preliminary Results
  • security overhead needs further investigation
  • cross check our implementation/setup with Globus
    team foreseen

35
DummyService performance
  • Conclusions
  • security overhead needs further investigation
  • more tests on more powerful machines
  • container comparison: depending on the setup, the
    Tomcat container may be a bit slower or up to
    50% faster than the standalone container
  • Notes
  • in the results table above the top saturation
    rates are given
  • with a varying number of clients, throughput goes
    down by up to 30% and the average CPU user+system
    usage varies accordingly

37
  • the first time the client contacts the
    DummyServiceFactory, creates a DummyService
    instance, and calls the first method, it takes
    about 10s to accomplish it
  • the following times these actions take about 1s

38
IndexService performance
  • Setup (1): IndexService acting as a notification
    source (pushing data)
  • multiple notification sinks subscribe to the
    IndexService "Host" Service Data Element (SDE),
    and are notified about each update of the "Host"
    SDE, happening at a fixed rate
  • no security
  • grid service node hardware: 2 Intel Pentium III
    600MHz processors, 256MB RAM
  • Setup (2): IndexService responding to
    findServiceData requests (pulling data)
  • multiple ogsi-find-service-data clients are run
    sequentially and in parallel asking for the
    IndexService "Host" Service Data Element
  • no security
  • grid service node hardware: 2 Intel Pentium III
    600MHz processors, 256MB RAM

39
IndexService performance
  • Results

40
IndexService performance
  • Comments
  • saturation throughput with findServiceData is
    about 13-20 times higher than with the
    notification approach
  • Setup (1) measurement using Tomcat failed due to
    a bug concerning threads; the bug is fixed, and
    the fix is announced for the next Tomcat version
    (4.1.28)
  • preliminary measurement with a faster service
    node (2 Intel(R) Xeon(TM) 2.40GHz processors,
    1GB RAM)
  • saturation throughput for setup (1) was about 32
    notifications/s for 800 listeners, compared to 10
    notifications/s with 400 listeners on the 2 x
    600MHz machines: not quite 4 times faster
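
As a quick check of the comparison above, the quoted figures can be worked through directly. Only the two rates and the two clock speeds come from the slide; setting the throughput ratio against the clock ratio is one possible reading of the comparison, not something the slide states.

```python
# Figures quoted on the slide.
fast_rate = 32.0   # notifications/s, 2 x Xeon 2.40GHz, 800 listeners
slow_rate = 10.0   # notifications/s, 2 x Pentium III 600MHz, 400 listeners

# Ratio of the measured saturation throughputs.
throughput_ratio = fast_rate / slow_rate

# Ratio of the CPU clock speeds in MHz, computed only for comparison.
clock_ratio = 2400.0 / 600.0

print(f"throughput ratio: {throughput_ratio:.1f}")
print(f"clock ratio:      {clock_ratio:.1f}")
```

Note that the faster node also served twice as many listeners, so the two rates were not measured under identical load.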

43
Reliable File Transfer Service
  • To complement their GT3 testbed activity, Andrey
    Demichev and Alexander Kryukov from SINP MSU
    Moscow are doing RFT tests
  • reliability means that problems such as
  • dropped connections,
  • machine reboots,
  • temporary network outages, etc.
  • should be handled automatically (usually via
    retry) until the transfers either resume or meet
    some "ultimate failure" condition
  • preliminary tests (up to 3 clients, each
    transferring 100 Mb of data) show that the service
    works perfectly, but further and more
    comprehensive tests are needed
  • Note: data on requests, transfers etc. are
    recorded in a PostgreSQL DB
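
The retry-until-resume-or-ultimate-failure behaviour described above can be sketched as follows. This is not the RFT implementation: the chunked transfer, the retry budget, and the names `transfer_with_retry` and `UltimateFailure` are assumptions made for the example.

```python
class UltimateFailure(Exception):
    """Raised when the retry budget is exhausted (hypothetical name)."""


def transfer_with_retry(send_chunk, total_chunks, max_retries=3):
    """Send chunks in order, resuming from the last confirmed chunk."""
    done = 0          # progress marker (RFT records such data in a DB)
    failures = 0
    while done < total_chunks:
        try:
            send_chunk(done)
        except ConnectionError:
            failures += 1
            if failures > max_retries:
                raise UltimateFailure(f"gave up at chunk {done}")
            continue  # retry the same chunk, earlier chunks are kept
        done += 1
    return done, failures


def make_flaky_sender(fail_on):
    """Test helper: fails once for each chunk index in fail_on."""
    pending = set(fail_on)
    sent = []

    def send_chunk(index):
        if index in pending:
            pending.discard(index)
            raise ConnectionError("dropped connection")
        sent.append(index)

    return send_chunk, sent


sender, sent = make_flaky_sender(fail_on={1, 3})
result = transfer_with_retry(sender, total_chunks=5)
```

Because progress is tracked per chunk, a dropped connection costs only the chunk in flight, and the transfer resumes instead of restarting from the beginning.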

44
Current and next actions
  • performance measurements
  • further investigation of security overhead
  • IndexService acting as a notification sink
  • secure IndexService
  • continue RFT tests
  • redo the measurements on faster hardware (2
    Intel(R) Xeon(TM) 2.40GHz processors, 1GB RAM)
    and possibly nodes with more than 2 CPUs

45
Integration of Existing Codes/Services
46
Why Integration?
  • The Grid is mainly concerned with
    interoperability among heterogeneous grid
    components
  • Heterogeneous Grid environments
  • AliEn (Alice Environment)
  • Web service oriented
  • Heterogeneous Grid technologies
  • Globus Toolkit 3
  • OGSI .NET, MS .NETGrid (.NET environment)
  • Unicore, others

47
AliEn (Alice Environment) Highlights
  • AliEn framework is a lightweight, but
    functionally equivalent, alternative to full GRID
    based on SOAP standard of Web Services.
  • Authentication module which supports various
    authentication methods (Globus/GSI)
  • Distributed file catalogue built on top of RDBMS
    with user interface that provides file system
    functionality
  • Secure file transport and replication Service
  • Task queue which holds commands to be executed in
    the system and Resource Broker
  • Computing and Storage elements
  • Metadata catalogue
  • Configuration and Information Service
  • Monitoring framework

48
  • Installation of AliEn testbed
  • tbed0132 (AliEn Core Service)
  • tbed0134 (CE, one WN, PBS, ClusterMonitor)
  • tbed0135 (SE, file:///, FTD)
  • Small Tests
  • Job submission
  • File catalog
  • Continue to use the TestBed in future

49
.NET
  • The service layer constructed on .NET Web
    services platform
  • SOAP handling
  • Security
  • WSDL self service description
  • Application wrapping-up technologies
  • .NET: the Common Language Runtime (CLR)
  • Java: Java Native Interface (JNI)

51
System Design Investigation
  • Investigation of the service structure of the
    system
  • Combining heterogeneous implementations of systems
  • Understand engineering design options
  • E.g. User Proxy Service

[Diagram labels: User Proxy Service, Authentication Service, Factory, Information Index, UI; legend: Grid Service using 1 Technology, Grid Service using 2 Technology, 1 Grid Environment]
52
Conclusions and Outlook
53
Conclusions
  • GT3 is the natural partner for new middleware
    initiatives
  • Encouraging results so far
  • Requires experience of large scale deployment
  • OGSI seems to be already the lingua franca in
    this field
  • OGSA should provide attractive high level
    services
  • HEP-specific services will be missing
  • A convincing backbone of services should
    materialise (also with HEP contribution)
  • OGSA/OGSI concept will be validated when serious
    challengers are deployed on a large scale

54
Evaluation
  • Effort to try to distill this experience
  • Verify how much of the toolkits we actually
    experienced
  • Effort to have this procedure reproducible
  • Other ToolKits in the near future?
  • Preliminary tests → Test/Performance suites
  • Contacts with embryonic EGEE teams
  • F. Hemmer, B. Jones, E. Laure
  • Use GT2 experience (weak and strong points) from
    EDG, LCG, etc in the GT3 evaluation
  • Deployment experience (weak and strong points)
    from EDG testbeds and LCG

55
Outlook
  • Much progress has been made in a short time
    (just a few months)
  • Generally impressed with GT3 and the overall
    concept
  • Some major issues around the performance of the
    hosting environments and the factories
  • Continue working closely with Globus
  • Continue to validate this approach and prototype
    interfaces and services in a GT3 context
  • Continue working closely with EGEE
  • SC2 RTAG-11 (ARDA) has recently taken an approach
    to identify the infrastructure needs through grid
    service decomposition

56
Special thanks
  • TestBed support
  • B. Panzer, T. Smith, T. Kleiworth
  • Windows Servers support
  • A. Lossent
  • Globus community
  • Many many
  • Special thanks to B. Sotomayor
  • And many others!