Transcript and Presenter's Notes

Title: UT Grid: Building a campus grid


1
UT Grid: Building a campus grid
  • Ashok Adiga, Ph.D.
  • Distributed Grid Computing Group
  • Texas Advanced Computing Center
  • The University of Texas at Austin
  • adiga@tacc.utexas.edu
  • (512) 471-8196

2
TACC Grid Program
  • TACC involved in several Grid projects
  • Campus Grid (UT Grid, partially funded by IBM)
  • State Grid (TIGRE)
  • National Grid (ETF)
  • Grid Hardware Resources
  • Wide range of hardware resources available to
    research community at UT and partners
  • Grid Software Resources
  • Significantly leverage NMI GRIDS components
    (Globus Toolkit, GPT, MyProxy, GridPort, GridFTP, ...)
  • Other software where necessary
  • Resource managers (Condor, LSF, PBS, United
    Devices)
  • Schedulers (Condor, Community Scheduling
    Framework)

3
TeraGrid (National)
  • NSF Extensible Terascale Facility (ETF) project
  • build and deploy the world's largest, fastest
    distributed computational infrastructure for
    general scientific research
  • 40 Gbps backbone with hubs in Los Angeles,
    Chicago, and Atlanta
  • UT (led by TACC) going online on TeraGrid October
    1, 2004
  • 10 Gbps network connection to ETF backbone
  • Provide access to high-end computers capable of
    6.2 teraflops, a new terascale visualization
    system, and a 2.8-petabyte mass storage system
  • Provide access to geoscience data collections
    used in environmental, geological, climate, and
    biological research
  • high-resolution digital terrain data
  • worldwide hydrological data
  • global gravity data
  • high-resolution X-ray computed tomography data
  • Current software stack includes Globus (GSI,
    GRAM, GridFTP), MPICH-G2, Condor-G, GPT,
    MyProxy, SRB

4
TIGRE (State-wide Grid)
  • Texas Internet Grid for Research and Education
  • computational grid to integrate computing and
    storage systems, databases, visualization
    laboratories and displays, and instruments and
    sensors across Texas.
  • Funding announced by Gov. Rick Perry at Internet2
  • TIGRE members include several leading state
    institutions
  • Rice, Texas A&M, Texas Tech, U. of Houston, UT
    Austin, UT El Paso, and others
  • Initial software stack will use NMI GRIDS

5
UT Grid Vision: A Powerful, Flexible, and Simple
Virtual Environment for Research & Education
  • The UT Grid vision is the creation of a
    cyberinfrastructure for research and education in
    which people can develop and test ideas,
    collaborate, teach, and learn through
    applications that seamlessly harness the diverse
    campus compute, visualization, storage, data, and
    instrument resources as needed, from their personal
    systems (PCs) and interfaces (web browsers, GUIs, etc.).

6
UT Grid: Develop and Provide a Unique,
Comprehensive Cyberinfrastructure
  • The strategy of the UT Grid project is to
    integrate
  • common security/authentication
  • scheduling and provisioning
  • aggregation and coordination
  • diverse campus resources
  • computational (PCs, servers, clusters)
  • storage (Local HDs, NASes, SANs, archives)
  • visualization (PCs, workstations, displays,
    projection rooms)
  • data collections (sci/eng, social sciences,
    communications, etc.)
  • instruments & sensors (CT scanners, telescopes,
    etc.)
  • from personal scale to terascale
  • personal laptops and desktops
  • department servers and labs
  • institutional (and national) high-end facilities

7
That Provides Maximum Opportunity & Capability
for Impact in Research, Education
  • into a campus cyberinfrastructure
  • evaluate existing grid computing technologies
  • develop new grid technologies
  • deploy and support appropriate technologies for
    production use
  • continue evaluation, R&D on new technologies
  • share expertise, experiences, software &
    techniques
  • that provides simple access to all resources
  • through web portals
  • from personal desktop/laptop PCs, via custom CLIs
    and GUIs
  • to the entire community for maximum impact on
  • computational research in applications domains
  • educational programs
  • grid computing R&D

8
UT Grid Approach: Leverage Strengths of Campus
Environment
  • Like any grid, a campus grid must provide services
    to simplify use of distributed resources
  • But
  • Focus must be to support research and/or
    education mission of the university
  • Campus grid can leverage vast numbers of PCs and
    large numbers of clusters
  • Campus grid can integrate novel scientific data
    collections and research instruments

9
UT Grid Approach: Leverage Strengths of Campus
Environment
  • Important differences from multi-institution
    grids
  • Staff in one location, can collaborate
    face-to-face
  • Controlled network environment
  • High-end computing center can lead deployment
  • Important differences from enterprise grids
  • Researchers generally more independent than in
    company
  • No central IT group governs researchers' systems
  • Usage models driven by different priorities
  • Important differences from domain-specific grids
  • Might require integration of wider variety of
    resources
  • Must support wider variety of usage models

10
UT Has Massive Scale and Unique Deployment
Environment
  • ACES building is a model for a university grid
  • Massive bandwidth
  • Multidisciplinary users
  • Numerous PCs, clusters, visualization systems,
    and storage resources
  • UT main campus & UT research campus can be a model
    for a multi-institution grid
  • Separated by true WAN, but UT controls paths
  • Massive bandwidth (10 GigE) between campuses
  • TACC controls resources on both campuses

11
UT Grid Project Team Has Participation From
Several Campus Departments
  • Additional UT Partners
  • Information Technology Services (ITS)
  • deploying Roundup clients; will include client
    software in BevoWare
  • College of Engineering IT Group
  • deploying Roundup clients
  • Center for Instructional Technology (CIT)
  • Helped with Web site, will create education
    content
  • Department of Computer Sciences
  • integrating Condor flock, partnering in R&D
    proposals
  • Institute for Computational Engineering &
    Sciences (ICES)
  • integrating clusters and Condor flock

12
…and Participation Will Grow Significantly as We
Enter Production
  • Additional Partners Expected in next 6 months
  • Mary Wheeler, ICES
  • integrating cluster, leading-edge user
  • Kamy Sepehrnoori, Dept. of Petroleum & Geosystems
    Engineering
  • integrating cluster, leading-edge user
  • College of Fine Arts
  • providing Roundup clients
  • College of Communications
  • interested in storage services
  • Additional outreach through UT Tech Deans
    Committee
  • Additional users through TACC User Services

13
UT Grid Components
  • Grid User Interfaces
  • Typical grid interface is via user portals
  • Grid User Nodes provide users with command line
    (shell) interfaces to the grid
  • Grid Resources
  • Compute, storage, visualization, instruments
  • Grid software must provide security, monitoring,
    remote access
  • Grid Services
  • Authentication (GSI, MyProxy)
  • Scheduling (Condor, CSF)
  • Data management (SRB, Avaki)
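
A minimal sketch of how a user might exercise these services from a Grid User Node, assuming the Globus/NMI command-line clients named above (myproxy-logon for GSI single sign-on, globus-job-run for pre-WS GRAM submission) are installed; the server and cluster host names are placeholders, not actual UT Grid endpoints:

import subprocess

# Placeholder endpoints -- substitute real UT Grid hosts.
MYPROXY_SERVER = "myproxy.example.utexas.edu"   # hypothetical MyProxy host
GRAM_HOST = "cluster.example.utexas.edu"        # hypothetical GRAM gatekeeper

def get_proxy(username, hours=12):
    """Retrieve a short-lived GSI proxy credential from a MyProxy server."""
    subprocess.run(
        ["myproxy-logon", "-s", MYPROXY_SERVER, "-l", username, "-t", str(hours)],
        check=True)

def run_remote(command):
    """Run a simple command on a remote cluster through pre-WS GRAM."""
    result = subprocess.run(
        ["globus-job-run", GRAM_HOST] + command,
        check=True, capture_output=True, text=True)
    return result.stdout

if __name__ == "__main__":
    get_proxy("jdoe")                      # prompts for the MyProxy passphrase
    print(run_remote(["/bin/hostname"]))   # executes on the remote cluster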

14
UT Grid Current Status
  • Providing compute services to users today
  • Heterogeneous set of cluster resources (LSF, PBS,
    LoadLeveler, Condor) and desktop resources
    (United Devices, Condor)
  • Single sign-on access via user portal
  • Allocation and support procedures
  • Resource monitoring
  • Serial and parallel job submission to clusters
    and desktop resources
  • Evaluation of scheduling technologies (Condor,
    CSF)
  • Evaluating workflow solutions (Pegasus)
  • Basic data services
  • Reliable File Transfer tool built using GridFTP,
    NWS, GPIR
  • Share data across resources using Avaki data grid
  • SRB
  • Visualization services coming soon
  • Remote interactive visualization, batch
    rendering, computational steering
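
The Reliable File Transfer tool itself is not reproduced here; purely as an illustration of the underlying GridFTP step, the sketch below wraps globus-url-copy (the standard GridFTP client in the Globus stack listed earlier), with hypothetical host names and paths:

import subprocess

def gridftp_copy(src_url, dst_url, streams=4):
    """Copy a file between GridFTP servers using parallel TCP streams.

    Assumes a valid GSI proxy (e.g. obtained via myproxy-logon) and that
    globus-url-copy from the Globus Toolkit is on the PATH.
    """
    subprocess.run(
        ["globus-url-copy", "-p", str(streams), src_url, dst_url],
        check=True)

if __name__ == "__main__":
    # Hypothetical endpoints and paths -- not actual UT Grid resources.
    gridftp_copy(
        "gsiftp://clusterA.example.utexas.edu/work/jdoe/input.dat",
        "gsiftp://clusterB.example.utexas.edu/scratch/jdoe/input.dat")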

15
Challenges Include Scale, Heterogeneity, Purpose,
and Policies
  • Usage models
  • research vs. education (vs. administrative)
  • ISV apps vs. custom apps
  • Interactive vs. batch
  • Serial codes vs. parallel codes
  • Etc.
  • Most resources are locally managed
  • Local policies and procedures
  • Different priorities
  • Sense of ownership
  • Varying expertise levels of administrators
  • Varying levels of support

16
UT Grid: Approach to Building the Grid
  • Challenge: Getting scientists to use UT Grid
  • Gain confidence that they can meet their
    computing goals and benefit from using the grid
  • Share their resources by making them available to
    other grid users
  • Hub & spoke approach rather than peer resources
  • Leverage existing trust relationships between
    TACC and campus research users
  • As users become comfortable with grid software,
    convince them to share their resources

17
UT Grid Logical View
  • Integrate each set of resources (compute, vis,
    storage, data) within TACC first

[Diagram: TACC Compute, Vis, Storage, Data (actually spread across two campuses)]
18
UT Grid Logical View
  • Next add other UT resources using same tools
    and procedures

[Diagram: TACC hub (Compute, Vis, Storage, Data) with spokes: ACES Cluster, ACES Data, ACES PCs]
19
UT Grid Logical View
  • Next add other UT resources using same tools
    and procedures

[Diagram: TACC hub (Compute, Vis, Storage, Data) with spokes: ACES Cluster, ACES Data, ACES PCs, GEO Clusters, GEO Data]
20
UT Grid Logical View
  • Next add other UT resources using same tools
    and procedures

[Diagram: TACC hub (Compute, Vis, Storage, Data) with spokes: ACES Cluster, ACES Data, ACES PCs, GEO Clusters, GEO Data, PGE Cluster, PGE Data, PGE Instrument, BIO Data, BIO Instrument]
21
UT Grid Logical View
  • Finally negotiate connections between spokes for
    willing participants to develop a P2P grid.

[Diagram: same hub-and-spoke resources as above, now with peer-to-peer connections negotiated between spokes]
22
Accessing UT Grid: Portals vs. CLIs
  • Choice of portals over command line interfaces is
    not universal
  • Some researchers prefer to use their current
    shell interface to access the grid
  • UT Grid supports Grid User Portals (GUPs) and
    Grid User Nodes (GUNs)

23
Why Are GUPs Important?
  • Lower the barrier to entry into grid computing
  • Easy access to multiple resources through a
    single interface
  • Simple GUI interface to complex grid computing
    capabilities
  • Present a Virtual Organization view of the Grid
    as a whole

24
UT GUP Infrastructure
  • Portal based on
  • Grid Portal Toolkit 3 (NMI component)
  • Jetspeed Portal infrastructure
  • Underlying Grid Middleware
  • Globus
  • Community Scheduling Framework
  • Network Weather Service
  • Soon: Avaki, SRB

25
UT GUP Capabilities
  • Initial GUP capabilities include
  • View information on resources within UT Grid,
    including status, load, jobs, queues, etc.
  • View network bandwidth and latency between
    systems, aggregate capabilities for all systems.
  • Submit user jobs and run hosted applications
  • Manage files across systems, and move/copy
    multiple files between resources with transfer
    time estimates
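
The transfer-time estimates presumably combine file size with a measured bandwidth figure (e.g. from the Network Weather Service in the portal's middleware stack); the arithmetic is straightforward. A rough sketch with made-up numbers and an assumed fixed overhead:

def estimate_transfer_seconds(size_bytes, bandwidth_mbps, overhead_s=2.0):
    """Rough transfer-time estimate from an NWS-style bandwidth measurement.

    bandwidth_mbps: measured throughput in megabits per second.
    overhead_s: assumed fixed cost for authentication and connection setup.
    """
    bits = size_bytes * 8
    return overhead_s + bits / (bandwidth_mbps * 1e6)

# Example: a 500 MB file over a path measured at 300 Mbit/s -> about 16 s.
print(round(estimate_transfer_seconds(500 * 1024**2, 300.0), 1))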

26
UT Grid User Portal
27
Job Submission templates
28
Community Scheduling Framework (CSF)
  • Open-source metascheduler written by Platform
    Computing
  • Distributed under Globus Public License
  • Developed using GT3.0.2 and OGSI
  • Will be part of future Globus Toolkit
    distribution
  • Schedules jobs across heterogeneous resources
  • Advanced reservation support
  • Architecture allows pluggable scheduling policies
  • Resource Manager Adapters are required to translate
    requests for each local resource manager
  • Dynamic performance information stored in Global
    Information Service
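
CSF's real adapter interface is defined by its GT3/OGSI services and is not reproduced here; the toy Python sketch below only illustrates the idea of one adapter per local resource manager behind a common submit call, using the standard PBS (qsub) and LSF (bsub) submission commands:

from abc import ABC, abstractmethod
import subprocess

class ResourceManagerAdapter(ABC):
    """Toy stand-in for a CSF resource manager adapter: translates a
    generic job request into the local scheduler's own submission call."""

    @abstractmethod
    def submit(self, script_path):
        """Submit a job script; return the local scheduler's job id text."""

class PBSAdapter(ResourceManagerAdapter):
    def submit(self, script_path):
        out = subprocess.run(["qsub", script_path],
                             check=True, capture_output=True, text=True)
        return out.stdout.strip()           # qsub prints the PBS job id

class LSFAdapter(ResourceManagerAdapter):
    def submit(self, script_path):
        with open(script_path) as f:        # bsub reads the script on stdin
            out = subprocess.run(["bsub"], stdin=f,
                                 check=True, capture_output=True, text=True)
        return out.stdout.strip()           # e.g. "Job <1234> is submitted ..."

# A metascheduler-style dispatcher picks the adapter for the chosen resource.
ADAPTERS = {"pbs": PBSAdapter(), "lsf": LSFAdapter()}

def schedule(resource, script_path):
    return ADAPTERS[resource].submit(script_path)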

29
UT Grid CSF Configuration
[Diagram: UT Grid CSF configuration — User Portal (GridPort) and Web Server connect through GT3.0 to the CSF Server (Queuing Service, Job Service, Reservation Service), which dispatches jobs to PBS and LSF clusters. Queues implement customizable scheduling policies using plug-ins.]
30
Why Are GUNs Important?
  • Most campus users have PCs for their research &
    education projects
  • They are used to their local systems
  • They also often need additional resources
  • They may want more flexibility than a portal
    provides
  • They need to be able to keep doing what they
    know, issuing same commands, but reaching
    additional resources
  • They would like access to those resources easily,
    even transparently
  • The Grid User Node concept is designed to provide
    these features and capabilities

31
Current Linux GUN Software
  • Users have the option of installing the software
    stack on their desktops or using a hosted GUN.
  • Red Hat Linux 9.0
  • Globus 3.2.1 NMI Release 5
  • Ant v1.6.2
  • Java J2SE SDK v1.4
  • Grid Packaging Tools (GPT) v3.2.1 NMI 5
  • GridShell (pre-release version)
  • Condor
  • MPICH
  • United Devices SDK 4.1
  • Perl v5.6
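
Since the GUN stack includes Condor, a user at a GUN can also submit local high-throughput work with an ordinary Condor submit description. A small sketch with hypothetical executable and file names (vanilla universe, single job):

import subprocess
import textwrap

# Hypothetical executable and file names -- adjust for a real job.
SUBMIT_DESCRIPTION = textwrap.dedent("""\
    universe   = vanilla
    executable = ./simulate
    arguments  = --steps 1000
    output     = simulate.out
    error      = simulate.err
    log        = simulate.log
    queue
""")

def submit_condor_job():
    """Write a Condor submit description and hand it to condor_submit."""
    with open("simulate.submit", "w") as f:
        f.write(SUBMIT_DESCRIPTION)
    subprocess.run(["condor_submit", "simulate.submit"], check=True)

if __name__ == "__main__":
    submit_condor_job()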

32
What is GridShell?
  • GridShell is an extension of the TCSH and BASH shells
  • includes transparent distributed execution and
    data-transfer features for intra- and inter-cluster
    execution of programs
  • Currently supports LSF, Condor and Globus
    environments
  • Goal is to extend services to match portal
    services

33
Grid User Node
34
UT Grid: Application-Driven Design
  • UT Grid design based on user requirements
  • Initial user set has been identified
  • Monthly meetings, mailing lists
  • Interviews to understand use cases
  • Initial set of application areas has compute,
    storage, and visualization requirements
  • Computational Fluid Dynamics (Dr. Carey)
  • Reservoir modeling (Dr. Wheeler)
  • Flood prediction (Dr. Wells & Dr. Maidment)

35
UT Grid Education
  • Training courses offered 3-4 times/year
  • GridPort (offered via Access Grid)
  • Running applications using United Devices
  • HPC training (MPI apps, tools)
  • Courses offered through CS department
  • High Performance Computing for scientists (this
    semester)
  • Grid Computing in science and engineering (summer
    05)
  • CIT planning to provide educational content about
    UT Grid research applications

36
NMI Experiences
  • TACC has benefited from using NMI
  • Easier to install & configure components
  • Better documentation & support
  • Software is more robust since it has gone through
    a level of integration testing
  • Exposure to new components (GridSolve)
  • Working with other NMI Testbed members
  • Although NMI components are fairly reliable
  • They are still evolving, and occasionally cause
    backward compatibility issues (e.g. between
    Globus versions 3.0.2, 3.2, 3.2.1, and 4.0)
  • NMI not a complete grid solution
  • Components do not address scheduling, workflow,
    accounting, ...

37
UT Grid Project Team
  • Jay Boisseau
  • Maytal Dahan
  • Edward Walker
  • Ashok Adiga
  • Ashesh Sahib
  • CJ Barker
  • Akhil Seth
  • David Walling
  • Eric Roberts
  • Jeff Mausolf (IBM)
  • Nina Wilner (IBM)

38
  • Texas Advanced Computing Center
  • www.tacc.utexas.edu
  • (512) 475-9411