FutureGrid Training, Education and Outreach

About This Presentation
Title:

FutureGrid Training, Education and Outreach

Description:

Title: Cloud Data mining and FutureGrid Author: Judy Qiu Last modified by: Geoffrey Fox Created Date: 11/17/2010 3:44:50 AM Document presentation format – PowerPoint PPT presentation

Slides: 33
Provided by: Judy120

Transcript and Presenter's Notes

Title: FutureGrid Training, Education and Outreach


1
FutureGrid Training, Education and Outreach
  • Bloomington Indiana
  • January 17 2010
  • Presented by Renato Figueiredo
  • renato_at_acis.ufl.edu
  • Associate Professor
  • University of Florida

2
Overview
  • Traditional ways of delivering hands-on training
    and education in parallel/distributed computing
    have non-trivial dependences on the environment
  • Difficult to replicate same environment on
    different resources (e.g. HPC clusters, desktops)
  • Difficult to cope with changes in the environment
    (e.g. software upgrades)
  • Virtualization technologies remove key software
    dependences through a layer of indirection

3
Overview
  • FutureGrid enables new approaches to education
    and training and opportunities to engage in
    outreach
  • Cloud, virtualization and dynamic provisioning
    environment can adapt to the user, rather than
    expect user to adapt to the environment
  • Focus of FutureGrid TEO is on leveraging the
    unique capabilities of the infrastructure and its
    software to
  • Reduce barriers to entry and engage new users
  • Use of encapsulated environments (appliances)
    as a primary delivery mechanism of
    education/training modules promoting reuse,
    replication, and sharing

4
Summary of activities (1)
  • Focus activities in the first year
  • Infrastructure supporting TEO activities
  • Documentation, integration of educational
    materials, input/recommendations for portal and
    computing infrastructure
  • Development of hands-on tutorials tailored to
    FutureGrid technologies and resources
  • Development, integration, testing of educational
    virtual appliances

5
Summary of activities (2)
  • Focus activities in the first year
  • Education activities
  • Working with early adopters in class environments
  • Understand requirements, opportunities,
    challenges
  • Outreach activities
  • Demonstrations and presentations highlighting
    FutureGrid's unique capabilities at conferences
    and workshops
  • Engaging with minority serving institutions

6
TEO Infrastructure - guiding principles
  • Fidelity: TEO activities should use full-fledged,
    executable software education/training modules
  • Learn using the proper tools
  • Reproducibility: Creators of content should be
    able to install, configure, and test their
    modules once, and be assured of the same
    functional behavior regardless of where the
    module is deployed
  • Incentive to invest effort in developing, testing
    and documenting new modules

7
TEO Infrastructure - guiding principles
  • Deployability: Students and users should be able
    to deploy modules in a simple manner, and on a
    variety of resources
  • Reduce barriers to entry: avoid dependences upon
    a particular infrastructure
  • Community-oriented: Modules should be simple to
    share, discover, reuse, and expand
  • Create conditions for viral growth

8
Towards this vision in FutureGrid
  • Executable modules: virtual appliances
  • Deployable on FutureGrid resources
  • Deployable on other cloud platforms, as well as
    virtualized desktops
  • Community sharing: Web 2.0 portal, appliance
    image repositories
  • An aggregation hub for executable modules and
    documentation

9
Educational appliances
  • A flexible, extensible platform for hands-on,
    lab-oriented education on FutureGrid
  • Need to support clustering of resources
  • Virtual machines + social/virtual networking to
    create sandboxed modules
  • Virtual Grid appliances: self-contained,
    pre-packaged execution environments
  • Group VPNs: simple management of virtual clusters
    by students and educators

10
Virtual appliance example
  • Linux, Java, Hadoop, configuration scripts

(Figure: a Hadoop image is copied and instantiated on the virtualization layer as a Hadoop worker; repeating the step creates another Hadoop worker)
11
Virtual Networking
  • A single appliance encapsulates software and
    configuration
  • Cluster/Grid/Cloud computing
  • Middleware expects a collection of machines,
    typically on a LAN (Local Area Network)
  • Appliances need to communicate and coordinate
    with each other
  • Each worker needs an IP address, uses TCP/IP
    sockets
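
The last point can be made concrete with a minimal sketch of a worker and a master exchanging one message over a TCP socket. It assumes both ends run on the same host (127.0.0.1) purely for illustration; on a virtual cluster the master would connect to the worker's virtual IP instead. The function names and the message format are hypothetical.

```python
import socket
import threading


def run_worker(server_sock: socket.socket) -> None:
    """Worker side: accept one connection, read a task, reply with a result."""
    conn, _addr = server_sock.accept()
    with conn:
        task = conn.recv(1024).decode("utf-8")
        conn.sendall(f"done:{task}".encode("utf-8"))


def send_task(host: str, port: int, task: str) -> str:
    """Master side: connect to a worker, send a task, return the reply."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(task.encode("utf-8"))
        return sock.recv(1024).decode("utf-8")


def demo() -> str:
    # Bind to port 0 so the OS picks a free port (illustration only).
    server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_sock.bind(("127.0.0.1", 0))
    server_sock.listen(1)
    host, port = server_sock.getsockname()
    worker = threading.Thread(target=run_worker, args=(server_sock,))
    worker.start()
    try:
        return send_task(host, port, "wordcount")
    finally:
        worker.join()
        server_sock.close()


if __name__ == "__main__":
    print(demo())
```

The only thing the master needs to know about the worker is an address and a port, which is why a virtual network that gives every appliance a routable IP is enough for unmodified middleware to run.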

12
Virtual cluster appliances
  • Virtual appliance + virtual network

(Figure: a virtual machine image is copied and instantiated as a Hadoop worker; repeating the step adds another Hadoop worker, and the workers are connected by a Hadoop virtual network)
13
Support for clustering
  • Network virtualization software on FutureGrid
    includes ViNe and GroupVPN
  • Nimbus has support for contextualization of
    one-click virtual clusters
  • Within a LAN, or coupled with ViNe
  • Grid appliances use peer-to-peer overlay for
    discovery and configuration of virtual addresses
    (DHCP) and cluster middleware

14
GroupVPN Overview
  • Bootstrapping private links through Web 2.0
    interfaces and IP-over-P2P overlay tunneling
  • Private IP address spaces, DHCP
  • Appliances perceive a virtual LAN
(Figure: a virtual network connecting Alice, Carol, and Bob)
15
Deploying virtual clusters
  • Same image, different VPNs

(Figure: the same virtual machine image is copied and instantiated as Hadoop workers; each instance is given GroupVPN credentials (from Web site), joins the Hadoop virtual network on the Group VPN, and receives a virtual IP through DHCP: 10.10.1.1 and 10.10.1.2)
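
The DHCP-assigned virtual addresses on this slide (10.10.1.1, 10.10.1.2) can be mimicked with a toy allocator that hands each new appliance the next free address in a private range. This only illustrates the idea and is not GroupVPN's actual DHCP implementation; the 10.10.1.0/24 range is an assumption inferred from the two addresses shown, and the class and appliance names are hypothetical.

```python
import ipaddress


class VirtualIpPool:
    """Toy DHCP-like allocator for one virtual network (hypothetical)."""

    def __init__(self, cidr: str) -> None:
        # Iterator over the usable host addresses of the private range.
        self._free = iter(ipaddress.ip_network(cidr).hosts())
        self._leases = {}

    def lease(self, appliance_id: str) -> str:
        """Return the appliance's address, allocating one on first request."""
        if appliance_id not in self._leases:
            self._leases[appliance_id] = next(self._free)
        return str(self._leases[appliance_id])


if __name__ == "__main__":
    pool = VirtualIpPool("10.10.1.0/24")  # assumed range
    print(pool.lease("hadoop-worker-a"))
    print(pool.lease("hadoop-worker-b"))
```

Creating a second `VirtualIpPool` models a second GroupVPN: the same image can be instantiated in both, and the two clusters can reuse the same private addresses without conflict because each pool is private to its group.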
16
FutureGrid example
  • Deploying a Condor virtual appliance cluster on
    FutureGrid or desktop resources
  • Nimbus: cloud-client.sh --run --name
    grid-appliance-amd64.tar.gz
  • Eucalyptus: euca-run-instances ami-fd4aa494
    --instance-type m1.large -k keypair
  • VMware Player: double-click Grid-appliance.vmx
  • Upload GroupVPN configuration file to appliances

17
FG appliances - Status
  • Nimbus, Eucalyptus
  • Appliance image
  • FutureGrid resources, appliance images
    (Condor, Hadoop), tutorials
  • GroupVPN portal, image downloads, bootstrap
    routers
18
Use of FutureGrid in classes
  • First-year ramp-up of hardware and software
  • Training and education emphasis has been use in
    classes, tutorials with early adopters
  • Highlights
  • Cloud computing class at Indiana University
  • Distributed Scientific Computing class at
    Louisiana State University (LSU)
  • Big data summer school at IU
  • Nimbus tutorial at CloudCom conference

19
Big Data for Science
July 26-30, 2010 NCSA Summer School Workshop
http://salsahpc.indiana.edu/tutorial
300 students (200 on site from 10 institutes,
100 online)
IU MapReduce and UF Virtual Appliance technologies
are supported by FutureGrid.
(Slide courtesy of Judy Qiu)
20
Cloud computing class at IU
  • Graduate-level Cloud computing for
    Data-Intensive Sciences (Judy Qiu, Fall 2010)
  • Virtualization technologies and tools
  • Infrastructure as a service
  • Parallel programming (MPI, Hadoop)
  • FutureGrid provided a set of software options
    that made it possible for students to work on
    different projects along the system stack.

21
Term Projects
  • Dryad/DryadLINQ: (1) Matrix Multiplication
    (Swapnil, Amit, Pradnay); (2) PhyloD (Ratul,
    Adrija, Chengming)
  • Iterative MapReduce: (3) LDA (Changsi, Yang);
    (4) MemCache (Saliya, Yiming, Jerome); (5) Avro
    (Yuduo, Yuan, Patanachai); (6) PageRank
    (Shuo-Huan, Parag)
  • Cloud Infrastructure: (7) Nimbus, Eucalyptus
    (Stephen, Sonali, Shakeela)
  • Cloud Storage: (8) Cloud Storage Survey
    (Xiaoming, Nixiaogang)
  • Virtualization: (9) Hypervisor Performance
    Analysis Project (James, Andrew)
  • System stack layers labeled on the slide: Higher
    Level Languages, Cloud Platform, Cloud
    Infrastructure, Hypervisor/Virtualization
(Slide courtesy of Judy Qiu)
22
Distributed Scientific Computing class at LSU
  • FutureGrid supported activities in a new
    semester-long class offered Fall 2010 at LSU
    (Gabrielle Allen, Shantenu Jha)
  • A practical and comprehensive graduate course
    preparing students for research involving
    scientific computing
  • Module E (Distributed Scientific Computing)
    taught by Shantenu Jha
  • Topics where FutureGrid was used
  • Introduction to the practice of distributed
    computing
  • Cloud computing and master-worker pattern
  • Distributed application case studies
  • Approximately half of a lecture provided an
    overview of FutureGrid and the process to get
    accounts and get started
  • As part of the homework assignment associated
    with lecture E0, each student had to confirm
    access and successful login to FG-Sierra and
    FG-India

23
Distributed Scientific Computing class at LSU
  • FutureGrid (FG) was used by students to
  • (i) compile, deploy and execute basic SAGA
    commands
  • (ii) learn the basics of remote job submission
    and elementary Master-Worker based distributed
    applications (such as MapReduce and computing the
    Mandelbrot Set) using FG-India and FG-Sierra
    nodes
  • (iii) get hands-on training with IaaS Clouds,
    namely standing up virtual machines using
    Eucalyptus and deploying software and/or
    applications from (i) and (ii)
  • Students also used Eucalyptus on FG-India and
    FG-Sierra to do their Module E projects, which
    ranged from
  • (a) Clouds as accelerators for Cactus-based
    applications,
  • (b) calculating pi using distributed tasks,
  • (c) extending the calculation of the Mandelbrot
    Set to "new" backends on FutureGrid (in addition
    to the "default" remote/ssh backends), and
  • (d) executing workers on bare-metal and Cloud
    resources concurrently (i.e., hybrid Grid-Cloud
    infrastructure) for master-worker applications.
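
Project (b) is a compact instance of the master-worker pattern. The sketch below is a minimal illustration, not the students' code: a master splits the midpoint-rule integration of 4/(1+x^2) over [0, 1] into independent tasks and sums the partial results. A local thread pool stands in for remote workers; on FutureGrid the tasks would instead be submitted to remote nodes, which is not shown here. All function names and parameters are hypothetical.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def partial_pi(start: int, stop: int, n: int) -> float:
    """Worker task: midpoint-rule sum of 4/(1+x^2) over slices [start, stop)."""
    h = 1.0 / n
    return h * sum(4.0 / (1.0 + ((i + 0.5) * h) ** 2) for i in range(start, stop))


def compute_pi(n: int = 100_000, num_tasks: int = 8, num_workers: int = 4) -> float:
    """Master: split n slices into num_tasks chunks and sum the partial results."""
    bounds = [(k * n // num_tasks, (k + 1) * n // num_tasks) for k in range(num_tasks)]
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        futures = [pool.submit(partial_pi, lo, hi, n) for lo, hi in bounds]
        return sum(f.result() for f in futures)


if __name__ == "__main__":
    estimate = compute_pi()
    print(estimate, abs(estimate - math.pi))
```

Because each task depends only on its own slice bounds, the same decomposition works whether the workers are threads, bare-metal nodes, or cloud instances; only the submission mechanism changes.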

24
Images
  • IMAGE emi-8D2A13F7 smaddi2-saga-bucket/saga153-ubuntu.manifest.xml
    smaddi2 available public x86_64 machine
    eri-5BB61255 eki-78EF12D2
  • IMAGE emi-DBD61078 ubuntu-0904-saga-1.5.2/image.manifest.xml
    luckow available public x86_64 machine
    eri-5BB61255 eki-78EF12D2
  • IMAGE emi-0E0E165E ajyounge/ubuntu-twister-memcached.img.manifest.xml
    ajyounge available public x86_64 machine
    eri-5BB61255 eki-78EF12D2
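
Listings like these are whitespace-separated records, so they are easy to process in scripts. The sketch below parses one IMAGE record into named fields. The field names are illustrative labels inferred from the values on the slide (for example, the `eri-` and `eki-` identifiers are taken to be ramdisk and kernel images); they are an assumption, not an official schema.

```python
from typing import Dict

# Field labels are assumptions inferred from the values shown on the slide.
FIELDS = (
    "record", "image_id", "manifest", "owner", "state",
    "visibility", "architecture", "image_type", "ramdisk_id", "kernel_id",
)


def parse_image_line(line: str) -> Dict[str, str]:
    """Split one whitespace-separated IMAGE record into labeled fields."""
    parts = line.split()
    if len(parts) != len(FIELDS) or parts[0] != "IMAGE":
        raise ValueError(f"unexpected record: {line!r}")
    return dict(zip(FIELDS, parts))


# One of the records from the slide, with the wrapped line rejoined.
SAMPLE = (
    "IMAGE emi-DBD61078 ubuntu-0904-saga-1.5.2/image.manifest.xml "
    "luckow available public x86_64 machine eri-5BB61255 eki-78EF12D2"
)

if __name__ == "__main__":
    image = parse_image_line(SAMPLE)
    print(image["image_id"], image["owner"], image["architecture"])
```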

25
Nimbus tutorial at CloudCom
  • Half-day (3-hour) presentation and hands-on
    activities
  • 30 attendees used their own computers to
    instantiate virtual machines on FutureGrid
    resources
  • Template for a self-learning tutorial for new
    users and prospective users

26
Nimbus tutorial at CloudCom
27
FutureGrid tutorials
  • Tutorial topic 1: Cloud Provisioning Platforms
  • Using Nimbus on FutureGrid
  • Nimbus One-click Cluster Guide
  • Using the Grid Appliances to run FutureGrid Cloud
    Clients
  • Using Eucalyptus on FutureGrid
  • Tutorial topic 2: Cloud Run-time Platforms
  • Introduction to Hadoop using the Grid Appliance
  • Running Hadoop on FG using Eucalyptus (.ppt)
  • Running Hadoop on Eucalyptus
  • Tutorial topic 3: Educational Virtual Appliances
  • Introduction to the Grid Appliance
  • Creating Grid Appliance Clusters
  • Building an educational appliance from Ubuntu
    10.04
  • Deploying Grid Appliances using Nimbus
  • Deploying Grid Appliances using Eucalyptus
  • Customizing and registering Grid Appliance images
    using Eucalyptus
  • MPI Virtual Clusters with the Grid Appliances and
    MPICH2
  • Tutorial topic 4: High Performance Computing
  • Performance Analysis with Vampir

28
Year-1 Outreach activities
  • Demonstrations, presentations, booths at major
    events
  • SuperComputing, TeraGrid Conference, OGF (Open
    Grid Forum), CloudCom, CCGrid, Grid5000 meeting,
    Vampir workshop

1114 CPU cores (457 VMs) distributed over 3 sites
in FutureGrid and 3 sites in Grid5000 (P. Riteau
et al, OGF-29 demo, Chicago, IL, June 2010).
29
Outreach activities
  • At IU, working with dean for diversity and
    education to organize outreach and pursue REU
    funding to bring MSI students to IU for summer
    internships and to coordinate education and
    training workshops
  • Involvement of students from Historically Black
    Colleges and Universities (HBCUs)
  • REU supplement for FutureGrid this year funded 2
    HBCU students in summer 2010; will apply each year

30
Planned TEO activities
  • Plan to engage MSIs with which IU has already
    established formal collaborative agreements
  • MSI Cyberinfrastructure Empowerment Coalition
    (MSI-CIEC). Primary theme: teach the teachers
    at MSIs so that they can incorporate
    cyberinfrastructure into their research and
    involve students and staff at their home
    institutions.
  • MSI-CIEC's principal activity: Cyberinfrastructure
    Days, daylong workshops that feature prominent
    speakers who discuss the application of
    cyberinfrastructure to research and education

31
Planned TEO activities
  • With Elizabeth City State University
  • Planning summer school on cloud computing for
    ADMI (Association of Computer/Information
    Sciences and Engineering Departments at Minority
    Institutions) faculty and students
  • Leverage Indiana University's STEM Initiative
  • Provides travel, housing, and support for HBCU
    students to intern at Indiana University during
    the summer

32
Planned TEO activities
  • Coordinate Web tutorials and documentation, with
    emphasis on supporting short tutorials that can be
    given by partners at conferences, and self-guided
    learning by new or prospective users
  • Continuously provide recommendations and
    guidance on the Web portal and user accounts
  • Engage with potential early adopters in computer
    science and engineering classes
  • Leverage existing MSI contacts, and use of
    FutureGrid in workshops, summer schools, and
    internships