GRID Nation: building National Grid Infrastructure for Science
1
GRID Nation: building National Grid Infrastructure for Science
  • Rob Gardner
  • University of Chicago

2
Grid3: an application grid laboratory
  • CERN LHC: US ATLAS testbeds and data challenges
  • CERN LHC: US CMS testbeds and data challenges
  • end-to-end HENP applications
  • virtual data research
  • virtual data grid laboratory
3
Grid3 at a Glance
  • Grid environment built from core Globus and
    Condor middleware, as delivered through the
    Virtual Data Toolkit (VDT)
  • GRAM, GridFTP, MDS, RLS, VDS (basic use sketched
    below)
  • equipped with VO and multi-VO security,
    monitoring, and operations services
  • allowing federation with other Grids where
    possible, e.g. the CERN LHC Computing Grid (LCG)
  • US ATLAS: GriPhyN VDS execution on LCG sites
  • US CMS: storage element interoperability
    (SRM/dCache)
  • Delivering the US LHC Data Challenges
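
As a minimal illustration of how a user exercises these core VDT services from the command line (hostnames and paths below are placeholders, not actual Grid3 endpoints):

  # create a proxy credential from a DOEGrids certificate
  grid-proxy-init

  # run a test job on a remote compute element via GRAM
  globus-job-run ce.example.edu/jobmanager-condor /bin/hostname

  # copy a file to a remote storage element via GridFTP
  globus-url-copy file:///tmp/input.dat \
      gsiftp://se.example.edu/grid3/data/input.dat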

4
Grid3 Design
  • Simple approach
  • Sites consisting of
    • Computing element (CE)
    • Storage element (SE)
    • Information and monitoring services
  • VO level, and multi-VO
    • VO information services
    • Operations (iGOC)
  • Minimal use of grid-wide systems
    • No centralized workload manager, replica or data
      management catalogs, or command line interface
    • Higher-level services are provided by individual
      VOs

5
Site Services and Installation
  • Goal is to install and configure with minimal
    human intervention
  • Use Pacman and distributed software caches (a
    minimal install sketch follows this slide)
  • Registers the site with VO- and Grid3-level
    services
  • Accounts, application install areas, and working
    directories

pacman -get iVDGL:Grid3

[Slide diagram: a Grid3 site installed from the VDT, with a Compute Element and Storage, app and tmp areas, VO services, GIIS registration, information providers, the Grid3 schema, and log management]
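
A minimal install sketch along the lines of this slide (the cache name follows the slide; the install directory is illustrative, and the Pacman tool itself is assumed to be already installed and on the PATH):

  # choose an install area and pull the Grid3 site package
  # from the distributed software cache
  mkdir -p /opt/grid3 && cd /opt/grid3
  pacman -get iVDGL:Grid3

  # pick up the environment of the freshly installed stack
  # (setup script name as typically created by VDT installs)
  source ./setup.sh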
6
Multi-VO Security Model
  • DOEGrids Certificate Authority
  • PPDG or iVDGL Registration Authority
  • Authorization service VOMS
  • Each Grid3 site generates a Globus gridmap file
    with an authenticated SOAP query to each VO
    service
  • Site-specific adjustments or mappings
  • Group accounts to associate VOs with jobs

[Slide diagram: VOMS servers for US ATLAS, US CMS, SDSS, BTeV, LSC, and iVDGL feed each site's Grid3 grid-map file; an example of the resulting file follows]
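
A sketch of what such a generated grid-map file can look like; the DNs and group-account names are purely illustrative:

  $ cat /etc/grid-security/grid-mapfile
  "/DC=org/DC=doegrids/OU=People/CN=Jane Physicist 123456" usatlas
  "/DC=org/DC=doegrids/OU=People/CN=John Astronomer 654321" sdss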
7
iVDGL Operations Center (iGOC)
  • Co-located with Abilene NOC (Indianapolis)
  • Hosts/manages multi-VO services
    • top-level Ganglia and GIIS collectors
    • MonALISA web server and archival service
    • VOMS servers for iVDGL, BTeV, SDSS
    • Site Catalog service, Pacman caches
  • Trouble ticket systems
    • phone (24 hr), web, and email based collection
      and reporting system
  • Investigation and resolution of grid middleware
    problems at the level of 30 contacts per week
  • Weekly operations meetings for troubleshooting

8
Grid3: a snapshot of sites
  • Sep 04
  • 30 sites, multi-VO
  • shared resources
  • 3000 CPUs (shared)

9
Grid3 Monitoring Framework
cf. M. Mambelli, B. Kim, et al., 490
10
Monitors
  • Data I/O (MonALISA)
  • Job Queues (MonALISA)
  • Metrics (MDViewer)
11
Use of Grid3 led by US LHC
  • 7 scientific applications and 3 CS demonstrators
  • A third HEP and two biology experiments also
    participated
  • Over 100 users authorized to run on Grid3
  • Application execution performed by dedicated
    individuals
  • Typically only a few users from a particular
    experiment ran the applications

12
US CMS Data Challenge DC04
[Plot: events produced vs. day, showing CMS-dedicated resources (red) and opportunistic use of non-CMS Grid3 resources (blue)]
cf. A. Fanfani, 497
13
Ramp-up of ATLAS DC2
cf. R. Gardner et al., 503
14
Shared infrastructure, last 6 months
15
ATLAS DC2 production on Grid3: a joint activity
with LCG and NorduGrid
[Plot: validated jobs vs. day, including the total across the three grids]
G. Poulard, 9/21/04
cf. L. Goossens, 501; O. Smirnova, 499
16
Typical Job distribution on Grid3
G. Poulard, 9/21/04
17
Operations Experience
  • iGOC and the US ATLAS Tier1 (BNL) developed an
    operations response model in support of DC2
  • Tier1 center
    • core services; an on-call person is always
      available
    • response protocol developed
  • iGOC
    • Coordinates problem resolution for the Tier1
      during off hours
    • Trouble handling for non-ATLAS Grid3 sites;
      problems resolved at weekly iVDGL operations
      meetings
  • 600 trouble tickets (generic), 20 ATLAS DC2
    specific
  • Extensive use of email lists

18
Not major problems
  • bringing sites into single-purpose grids
  • simple computational grids for highly portable
    applications
  • specific workflows as defined by today's JDL
    and/or DAG approaches (see the sketch below)
  • centralized, project-managed grids up to a
    particular scale (beyond that, yet to be seen)
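
For concreteness, the "DAG approaches" above refer to Condor DAGMan-style workflow descriptions; a minimal, hypothetical example (submit-file names are illustrative):

  $ cat workflow.dag
  JOB generate generate.sub
  JOB analyze analyze.sub
  PARENT generate CHILD analyze
  $ condor_submit_dag workflow.dag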

19
Major problems: two perspectives
  • Site service provider perspective
    • maintaining multiple logical grids with a given
      resource; maintaining robustness; long-term
      management; dynamic reconfiguration; platforms
    • complex resource sharing policies (department,
      university, projects, collaborative), user roles
  • Application developer perspective
    • challenge of building integrated distributed
      systems
    • end-to-end debugging of jobs, understanding
      faults
    • common workload and data management systems
      developed separately for each VO

20
Grid3 is evolving into OSG
  • Main features/enhancements
    • Storage Resource Management
    • Improve authorization service
    • Add data management capabilities
    • Improve monitoring and information services
    • Service challenges and interoperability with
      other Grids
  • Timeline
    • Current Grid3 remains stable through 2004
    • Service development continues
    • Grid3dev platform

21
Consortium Architecture
[Slide diagram, OSG Process Framework: a Consortium Board (1), joint committees (0..N, small), technical groups (0..N, small), and activities (0..N, large) connect service providers, sites, VO organizations, researchers, campuses, labs, enterprise, and research grid projects; participants provide resources, management, and project steering groups]
22
OSG deployment landscape
[Slide diagram: OSG deployment sits among VOs and their applications; technical groups for Monitoring and Information, Policy, Storage, Security, and Support Centers (with their chairs); and architecture, MIS, and policy activities]
23
OSG Integration Activity
  • Integrate middleware services from technology
    providers targeted for the OSG
  • Develop processes for validation and
    certification of OSG deployments
  • Provide testbed for evaluation and testing of new
    services and applications
  • Test and exercise installation and distribution
    methods
  • Devise best practices for configuration management

24
OSG Integration, 2
  • Establish framework for releases
  • Provide feedback to service providers and VO
    application developers
  • Allow contributions and testing of new,
    interoperable technologies with established
    baseline services
  • Supply requirements and feedback for new tools,
    technology, and practices in all these areas

25
Validation and Certification
  • We will need to develop the model for bringing new
    services and distributed applications into the OSG
  • Validation criteria will be specified by the
    service providers (experts)
    • Coherence with the OSG deployment is paramount
  • VOs and experiments will specify what constitutes
    acceptable functionality
  • Validation space:
    • Deployment and configuration process
    • Functionality
    • Scale

26
Conclusions
  • Grid3 taught us many lessons about how to deploy
    and run a production grid
  • Breakthrough in the demonstrated use of
    opportunistic resources, enabled by grid
    technologies
  • Grid3 will be a critical resource for continued
    data challenges through 2004, and an environment
    in which to learn how to operate and upgrade
    large-scale production grids
  • Grid3 is evolving to OSG with enhanced
    capabilities