Grid Computing: Harnessing Underutilized Resources

1
Grid Computing: Harnessing Underutilized Resources
  • UNCW Department of Chemistry and Biochemistry
    Seminar
  • September 24, 2004
  • Ned H. Martin

2
Outline
  • Definition of Grid computing
  • A brief history of computing
  • Growth of computing power
  • Rationale for Grid computing
  • How a Grid works
  • Examples of Grid projects
  • Grid computing in NC
  • Limitations of Grid computing
  • UNCW Grid initiative: GridNexus
  • What's next?

3
Definition of Grid Computing
  • Grid computing is a form of distributed computing
    that involves coordinated and controlled sharing
    of diverse computing, application, data,
    storage, or network resources across dynamic and
    geographically dispersed multi-institutional
    virtual organizations.
  • A user of Grid computing does not need to have
    the data and the software on the same computer,
    and neither must be on the user's home (login)
    computer.

4
Grid Computing
  • The term Grid computing suggests a computing
    paradigm similar to an electric power grid - a
    variety of resources contribute power into a
    shared "pool" for many consumers to access on an
    as-needed basis.

5
Background of Grid Computing
  • The idea of Grid computing resulted from the
    confluence of three developments:
  • The proliferation of largely unused computing
    resources (especially desktop computers)
  • Their greatly increased CPU speed in recent years
  • The widespread availability of fast, universal
    network connections (the Internet).

6
Brief History of Computing
  • 1943 "I think there is a world market for maybe
    5 computers." -Thomas Watson, chairman of IBM
  • 1947 Testudo: the very first computer in the
    Netherlands. The relay-based machine was 5 m
    long. Adding took 30 s and multiplication 45 s.

7
Brief History of Computing
  • 1949 "Computers in the future may weigh no more
    than 1.5 tons." -Popular Mechanics, forecasting
    the relentless march of science
  • 1957 "I have traveled the length and breadth of
    this country and talked with the best people, and
    I can assure you that data processing is a fad
    that won't last out the year." -The business book
    editor for Prentice Hall.

8
Brief History of Computing
  • 1977 "There is no reason anyone would want a
    computer in their home." -Ken Olsen, president,
    chairman and founder of Digital Equipment Corp.
  • 1980 "DOS addresses only 1 Megabyte of RAM
    because we cannot imagine any applications
    needing more." -Microsoft on the development of
    DOS.
  • 1981 "640k ought to be enough for anybody."
    -Bill Gates

9
Brief History of Computing
  • 1979 The Intel 8086 chip (introduced in 1978)
    used a 16-bit design that was too expensive for
    many systems, so a version with an 8-bit external
    bus was developed (the 8088), which was chosen by
    IBM for the first IBM PC. Available clock
    frequencies ranged up to 10 MHz. It had an
    instruction set of about 300 operations. At
    introduction the fastest processor was the 8 MHz
    version, which achieved 0.8 MIPS (0.8 x 10^6
    instructions per second) and contained 29,000
    transistors.

10
Brief History of Computing
  • 1982 Intel 80286 released. It supported clock
    frequencies of up to 20 MHz. At introduction the
    fastest version ran at 12.5 MHz, achieved 2.7
    MIPS and contained 134,000 transistors.
  • 1985 Intel 80386 DX released. It supported clock
    frequencies of up to 33 MHz. At the date of
    release the fastest version ran at 20 MHz and
    achieved 6.0 MIPS. It contained 275,000
    transistors.

11
Brief History of Computing
  • 1989 Intel 80486 DX released. It
    contained the equivalent of about 1.2 million
    transistors. At the time of release the fastest
    version ran at 25 MHz and achieved up to 20 MIPS.
    Later versions had clock speeds up to 100 MHz.
  • 1993 Intel Pentium released. At that time it was
    only available in 60 and 66 MHz versions, which
    achieved up to 100 MIPS, with over 3.1 million
    transistors.

12
Brief History of Computing
  • 1995 Pentium Pro released. At introduction it
    achieved a clock speed of up to 200 MHz. It
    achieved 440 MIPS and contained 5.5 million
    transistors. This was nearly 2400 times as many
    as the first microprocessor in 1971, and it was
    capable of 70,000 times as many instructions per
    second.
  • 2004 Pentium 4 chips available with clock speeds
    of up to 3.6 GHz, providing 11,356 MIPS and
    containing 125,000,000 transistors.
  • 2005 500,000,000 transistors!!!

13
Growth of Computing Power
  [Chart: growth of computing power up to 2004]

14
Rationale for Grid Computing
  • The proliferation of largely unused computing
    resources (especially desktop computers, of which
    152 million were sold in 2003).
  • Their greatly increased CPU speed in recent years
    (now >3 GHz).
  • The widespread availability of fast, universal
    network connections (the Internet).

15
Rationale for Grid Computing
  • High performance computers (formerly called
    supercomputers) are very expensive to buy and
    maintain.
  • Much of the recent enhancement of computing power
    has come through the application of
    multiple CPUs to a problem (e.g., NCSC had a
    720-processor IBM parallel computer).
  • Many computing tasks relegated to these
    (especially massively parallel) computers could
    be performed by a divide-and-conquer strategy
    using many more, although slower, processors,
    such as are available on a Grid.
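
The divide-and-conquer idea on this slide can be illustrated with a minimal Python sketch. It is not taken from the presentation: the function names are hypothetical, local threads stand in for Grid nodes, and the task (a sum of squares) was chosen only because its partial results are trivial to combine.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, n_chunks):
    """Split a list into n_chunks contiguous pieces of nearly equal size."""
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks each take one additional item.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def partial_sum_of_squares(chunk):
    """Work unit that one (hypothetical) Grid node handles on its own."""
    return sum(x * x for x in chunk)


def grid_sum_of_squares(items, n_workers=8):
    """Divide the data, process each piece independently, combine results."""
    chunks = split_into_chunks(items, n_workers)
    # Threads are only a stand-in here; on a real Grid each chunk
    # would be shipped to a different machine by the middleware.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(partial_sum_of_squares, chunks))
    return sum(partials)


if __name__ == "__main__":
    print(grid_sum_of_squares(list(range(1000))))
```

The approach works well only when the pieces do not depend on each other, which is why loosely coupled problems such as screening many independent molecules suit a Grid of slower processors.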

16
How a Grid Works
  • The term "grid computing" suggests a computing
    paradigm similar to an electric power grid - a
    variety of resources contribute power into a
    shared "pool" for many consumers to access on an
    as-needed basis.
  • Ideally the user does not know or care where the
    computing operation is being performed; the
    process is invisible to the user.
  • Middleware handles security, authentication,
    authorization, resource selection and routing of
    input and output seamlessly.

17
Examples of Grid Projects
  • SETI@home
  • DNet (distributed.net)
  • GRID.ORG (anti-cancer ligand screening)
  • IBM smallpox cure
  • Entropia.org
  • CERN

18
Grid Projects: SETI@home
  • SETI@home
  • A large-scale search through data gathered by
    radio telescopes in Puerto Rico for evidence of
    extraterrestrial life
  • Involved more than 3 million computers, together
    averaging about 14 teraflops (14 trillion
    floating-point operations per second)
  • Utilized over 500,000 years of processing time in
    the past year and a half.

19
Grid Projects: DNet
  • DNet (distributed.net)
  • Began in 1997 as the first general-purpose
    distributed computing network on the Internet
  • Highly successful in bringing individuals
    together to complete cryptographic challenges via
    a distributed environment.
  • Equivalent to more than 160,000 Pentium II
    266 MHz computers working 24 hours a day, 7 days
    a week, 365 days a year!
  • The core distributed.net development team joined
    United Devices in 2000.

20
Grid Projects: GRID.ORG
  • The United Devices Cancer Research Project
    (GRID.ORG) will advance research to uncover new
    cancer drugs through the combination of
    chemistry, computers, and specialized software.
  • The research centers on proteins that have been
    determined to be a possible target for cancer
    therapy. Through a process called "virtual
    screening", LigandFit docking software by
    Accelrys identifies molecules that interact with
    these proteins, and determines which ones have a
    high likelihood of being developed into a drug.
  • In the first year and a half, over 3.5 million
    drug candidates were screened using over a
    million personal computers.

21
Grid Projects: Smallpox Cure
  • Smallpox cure
  • To help find a cure for smallpox, IBM and a group
    of partners harnessed the processing power of 2
    million idle PCs. They then screened 35 million
    drug compounds against smallpox proteins to find
    the most effective cure.

22
Grid Projects: Entropia
  • In 1997, Entropia applied idle computers
    worldwide to problems of scientific interest. In
    just two years, this network grew to encompass
    30,000 computers with an aggregate speed of over
    one teraflops. Among its several
    scientific achievements is the identification of
    the largest known prime number.

23
Grid Projects: CERN
  • CERN
  • By 2005, detectors at the Large Hadron Collider
    at CERN, the European Laboratory for Particle
    Physics, will produce several petabytes of data
    per year - a million times the storage capacity
    of a desktop computer.
  • Just the basic data analysis requires 20
    teraflops of computing power (the fastest
    supercomputer delivers 3 teraflops).
  • More sophisticated analyses will need orders of
    magnitude more computing power.

24
Grid Computing in NC
  • NCBioGrid (www.ncbiogrid.org/), an outgrowth of
    the High Performance Computing and Data Storage
    Focus Group of the NC Genomics and Bioinformatics
    Consortium
  • NC Computing Grid: now includes 7 universities
    plus MCNC; UNCW will be joining soon
  • UNCW Grid: started as a grid for UNCW
    bioinformatics/genomics research, now expanded
    into chemistry and business applications.

25
Limitations of Grid Computing
  • Currently, although efforts are being made to
    standardize protocols (e.g., Globus toolkit and
    Avaki), interacting with Grid services remains a
    complex process.
  • Most of the existing applications that access
    Grid services require the user to type cumbersome
    commands, often using a command-line interface.
  • Creating new clients and services requires
    programming in a language such as C or Java and
    using a host of libraries for interacting with
    Open Grid Services Infrastructure, Grid Security
    Infrastructure, Web Services Description Language
    and other standards.

26
Limitations of Grid Computing
  • These tools and techniques are useful to a select
    group of computing specialists; however, the only
    way to make Grid resources accessible to a wide
    range of users is to provide a relatively simple
    graphical user interface (GUI).
  • The UNCW Grid project proposes to develop a
    Graphical Grid User Interface that is easy to use
    and can access a wide range of applications.
  • Our hope is to create an interface to Grid
    computing that accomplishes what Internet
    browsers (Netscape and Internet Explorer) did to
    open up the WWW.

27
UNCW Grid Initiative: GridNexus
  • This initiative grew in part out of a need for
    HPC resources following the closure of the NCSC
    in June 2003, coupled with the availability of
    faculty with software programming expertise and
    others with computing applications that could
    benefit from use of a Grid.
  • The UNC-OP funded UNCW's proposal for $557,634
    over two years to develop Grid portals (GUI
    middleware that allows users to access software
    on computers on a Grid).

28
UNCW Grid Initiative: GridNexus
  • The UNCW Grid Computing Project is a two-year
    collaborative project among a multidisciplinary,
    multi-investigator core research team at UNCW and
    several discipline-focused researchers at partner
    institutions: NCSU, WCU, NCCU, ECU, and CFCC. The
    research areas and institutional interests of
    this project are:
  • Advanced Grid Software Development (UNCW)
  • Computational Chemistry (UNCW and ECU)
  • Bioinformatics (UNCW, NCSU, and NCCU)
  • Combinatorics (UNCW)
  • Business Computing (UNCW and NCCU)
  • Education and Training (UNCW, WCU, CFCC)
  • This project proposes to develop a Grid interface
    that is easy to use and may be used by a
    wide range of applications and users. We have
    developed an innovative graphical user interface
    (GUI) for grid applications. In particular, we
    introduced a new scripting language (JXPL)
    designed for web-based services, along with a GUI
    for creating scripts, and have demonstrated the
    use of these tools with grid services.

29
UNCW Grid Initiative: GridNexus
  • UNCW's initiative is unique in that it involves
    undergraduate students as the main players in the
    development of the Grid portal (GUI).
  • Undergraduate computer science students are
    partnered with faculty and students in
    application areas (chemistry, biology, business)
    to develop graphical front-ends to access
    services (programs) on computers on the Grid.
  • Grid portals are being developed for the two
    computational chemistry programs (Gaussian 03 and
    DMol) most often used in research by our faculty
    and students.

30
Resources of UNCW Grid
  • Beowulf cluster: 16 PIII processors in the
    Computer Science Department
  • Fire and FireDev servers plus disk storage
    devices
  • PQS Quantum Cube: 8-CPU cluster with PQS and
    Gaussian 03 computational chemistry software,
    plus TCP-Linda environment.
  • An 8-processor IBM blade cluster with 0.5 TB disk
    storage will be added soon.
  • Other computers may be added, including the
    possibility of using all computing lab computers,
    or possibly even all faculty/staff computers
    (when not in use).

31
Remote Computing before Grid
  • Now, to submit a quantum chemistry calculation
    to a remote computer, e.g., at NCSU, one must:
  • Telnet to the remote computer and log in
    (separate login and password for each user
    account and for each computer)
  • FTP the input data file from the local computer
    to the remote machine (requires login, password)
  • Create and edit an input file for the job (using
    vi or another text editor)
  • Create a .job file and edit it if necessary
  • Select a queue based on the CPUs and time
    required, then submit the .job file
  • Check progress of the calculation by periodically
    telnetting to the remote machine and looking
    for a file that indicates completion of the job
  • FTP the output file to the local computer
  • Open the output file in a text editor and examine
    the numerical data
  • Open the output file in a commercial program on
    the local computer to visualize the structure

32
Remote Computing on a Grid
  • In the future, using Grid middleware to submit a
    quantum chemistry calculation to a remote
    computer at NCSU:
  • Log in to the Grid (single user login and
    password to access ANY Grid resource)
  • Select a data file and job parameters from
    pull-down menus, then click to submit (the .input
    and .job files are created automatically by Grid
    middleware, and the job is submitted
    automatically to an appropriate available
    computer)
  • Upon completion of the computation, the output
    file is automatically sent to the local computer
    to visualize the structure (which can also be
    automated).

33
Development of a Grid Portal
  • The objective is to make accessing HPC resources
    (wherever they may be located) easy for
    scientists who are not computer savvy.
  • Most computation involves doing various
    mathematical operations on a dataset.
  • A GUI approach is employed, in which the user,
    after a single login that checks authentication
    and authorization, can create a workflow of
    functions/operations graphically by connecting
    boxes dragged from a series of lists of options,
    then applying that series of steps to a dataset.
  • Such a workflow can be saved for subsequent
    application to another dataset.
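
A saved, reusable workflow of this kind can be modeled as an ordered list of functions. The Python sketch below is only an illustration under that assumption; the `Workflow` class and the three step functions are hypothetical and are not part of GridNexus or JXPL.

```python
class Workflow:
    """An ordered series of operations that can be reapplied to any dataset."""

    def __init__(self, name, steps=None):
        self.name = name
        self.steps = list(steps or [])

    def then(self, func):
        """Append one operation (one 'box' in the GUI) and allow chaining."""
        self.steps.append(func)
        return self

    def run(self, dataset):
        """Feed the dataset through every step, in the order they were linked."""
        result = dataset
        for step in self.steps:
            result = step(result)
        return result


# Hypothetical operations a user might drag into a workflow.
def drop_missing(values):
    return [v for v in values if v is not None]


def square_all(values):
    return [v * v for v in values]


def total(values):
    return sum(values)


if __name__ == "__main__":
    wf = Workflow("sum-of-squares").then(drop_missing).then(square_all).then(total)
    print(wf.run([1, None, 2, 3]))  # first dataset
    print(wf.run([None, 4]))        # same saved workflow, another dataset
```

Because the workflow holds only the sequence of operations and not the data, the same object can be stored once and run against each new dataset.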

34
Development of a Grid Portal
  • Job submission: ideally in a grid, the grid
    middleware should select the best resource, that
    is, those computers that are available, capable,
    and have the software needed to handle the job.
  • The user need not select nor know where the
    computation is taking place. In fact, the job
    may even be passed from one computer to another
    for various aspects of the calculation.
  • The output is returned to the user's workstation
    or account, rather than the user having to access
    and download the output file from a remote
    computer.
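
The selection rule described on this slide (available, capable, and has the needed software) can be sketched in a few lines of Python. This is an assumption-laden illustration rather than how any particular Grid middleware works: the `Resource` and `Job` classes, the free-CPU counts, and the "most free CPUs wins" tie-break are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    name: str
    free_cpus: int
    software: set = field(default_factory=set)
    available: bool = True


@dataclass
class Job:
    program: str
    cpus: int


def select_resource(job, resources):
    """Return the best resource for the job, or None if nothing qualifies."""
    candidates = [
        r for r in resources
        if r.available                  # machine is up and accepting work
        and r.free_cpus >= job.cpus     # capable: enough free processors
        and job.program in r.software   # has the software the job needs
    ]
    if not candidates:
        return None
    # Assumed policy: prefer the resource with the most free CPUs.
    return max(candidates, key=lambda r: r.free_cpus)


if __name__ == "__main__":
    grid = [
        Resource("PQS Quantum Cube", 8, {"PQS", "Gaussian 03"}),
        Resource("Beowulf cluster", 16, {"Gaussian 03"}),
    ]
    chosen = select_resource(Job("Gaussian 03", 4), grid)
    print(chosen.name if chosen else "no suitable resource")
```

The user supplies only the job description; where it runs is decided by the selection function, which mirrors the point that the user need not know where the computation takes place.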

35
UNCW's Grid Portal: GridNexus
  • 3 main application types: genomics/
    bioinformatics, business, and chemistry
  • Chemistry resources on UNCW Grid:
  • PQS Quantum Cube: 8-CPU cluster with PQS and
    Gaussian 03 computational chemistry software and
    TCP-Linda
  • Beowulf Cluster: 16-CPU cluster with Gaussian 03
    computational chemistry software and TCP-Linda
  • Soon to be added: IBM blade server with 8 or 16
    CPUs; Gaussian 03 will be installed on it.
  • Java script for file transformation, e.g., to
    convert a HyperChem file into a Gaussian 03 input
    file

36
Quantum Chemistry Portal
  • A GUI is under development to allow a user
    to select the following from pull-down menus
    within boxes that are linked into a workflow:
  • Data input file
  • Transform to another file type if necessary
  • Level of calculation: HF, DFT, MP2, etc.
  • Basis set: 6-31G(d,p), 6-311G(2d,p), etc.
  • Number of processors needed
  • CPU time requested
  • Keywords: opt, nmr, freq, pop=npa, etc.
  • Charge and multiplicity
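
To show how the menu choices listed above could be turned into an input file automatically, here is a minimal Python sketch. It is not the portal's code: the function name and the water geometry are hypothetical, and the layout (a `%NProcShared` line, a route line beginning with `#`, a title, then charge and multiplicity followed by coordinates) reflects the usual Gaussian input conventions as assumed here, so it should be checked against the Gaussian manual. CPU time is omitted because it would normally belong in the separate .job file.

```python
def build_gaussian_input(title, atoms, method="HF", basis="6-31G(d)",
                         keywords=("NMR",), charge=0, multiplicity=1,
                         nproc=4):
    """Assemble a Gaussian-style input file from GUI-style selections.

    atoms is a list of (symbol, x, y, z) tuples in Angstroms.
    """
    # Route section: level of theory, basis set, then any extra keywords.
    route = f"# {method}/{basis}"
    if keywords:
        route += " " + " ".join(keywords)

    lines = [
        f"%NProcShared={nproc}",      # assumed Link 0 line for processor count
        route,
        "",                           # blank line ends the route section
        title,
        "",                           # blank line ends the title section
        f"{charge} {multiplicity}",   # charge and spin multiplicity
    ]
    for symbol, x, y, z in atoms:
        lines.append(f"{symbol:<2} {x:12.6f} {y:12.6f} {z:12.6f}")
    lines.append("")                  # blank line terminates the geometry
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Illustrative geometry only; not taken from the presentation.
    water = [
        ("O", 0.000000, 0.000000, 0.117300),
        ("H", 0.000000, 0.757200, -0.469200),
        ("H", 0.000000, -0.757200, -0.469200),
    ]
    print(build_gaussian_input("Water NMR shielding", water))
```

The defaults mirror the example preferences shown in the GUI design slides (6-31G(d), HF, 4 processors, charge and multiplicity 0,1, keyword NMR).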

37
Design of UNCW Grid GUI
  • Select from pull-down menus in categories

  [Diagram boxes: Basis Set; Data sets (Windows Explorer-like file browser); Level of Theory; CPU Time; Processors; Charge and Multiplicity; Keywords; File Type Transformer; Submit; Visualize]

38
Design of UNCW Grid GUI
  • Select from pull-down menus in categories

  [Diagram boxes: Basis Set; Data sets (Windows Explorer-like file browser); Level of Theory (pull-down options: HF, MP2, DFT); CPU Time; Processors; Charge and Multiplicity; Keywords; File Type Transformer; Submit; Visualize]

39
Design of UNCW Grid GUI
  • Functions can be grouped into sets called
    workflows for repetitive operations

  [Diagram boxes: Basis Set; Data sets (Windows Explorer-like file browser); Level of Theory; CPU Time; Processors; Charge and Multiplicity; Keywords; File Type Transformer; Submit; Visualize]

40
Design of UNCW Grid GUI
  • Preferences among choices can be saved as part of
    the workflow

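  [Diagram boxes with saved values: Basis Set = 6-31G(d); Data sets (Windows Explorer-like file browser); Level of Theory = HF; CPU Time = 4000; Processors = 4; Charge and Multiplicity = 0,1; Keywords = NMR; File Type Transformer; Submit; Visualize]
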
41
Design of UNCW Grid GUI
  • The result is a much simpler process for
    the user

  [Diagram boxes: Select data; Transform it; Calculate; Visualize]

42
Design of UNCW Grid GUI
  • Multiple repeatedly used sets of commands
    (workflows) can be saved
  • A user's preferences within a workflow (e.g.,
    level of theory, basis set, processors, CPU
    time requested, keywords, charge and
    multiplicity) could also be saved (future design
    feature).
  • In the future a user may need only to specify a
    data set (file) and link it to a pre-set
    workflow to initiate a calculation!

43
Chemistry Portal
  • Initially, the portal will operate under Linux
  • Next it will be ported to operate under Windows
  • Eventually, computations will be submitted online
    through web browsers
  • This could be accomplished from any device (e.g.,
    PC, laptop, or even a cell phone) that can access
    the Internet.

44
JXPL Language
  • UNCW Mathematics faculty member Dr. Jeff Brown,
    with help from Computer Science faculty member
    Dr. Clayton Ferner and recent graduate Mike Wood,
    developed a new Java-based programming language
    called JXPL.
  • JXPL is the language used in the GridNexus
    project and is designed for use with web
    services and grid services.
  • The advantages of JXPL include:
  • It is readily extensible
  • It interfaces easily with (LISP-like) data
    structures in the GUI
  • JXPL scripts are written in XML, a commonly used
    language

45
What's Next?
  • More filters to transform data need to be
    developed and tested
  • Fancier graphics may be added to the GUIs
  • More computational nodes will be added to the
    Grid. The eventual goal is to include all NC
    institutions of higher learning.
  • Extend Grid to include more software applications
  • Extend Grid services to other disciplines
  • Include industry and businesses as users and
    developers.

46
References
  • http://people.uncw.edu/vetterr/grid/proposal/UNC-OP_Grid_Project%20Overview.htm
  • http://www.ox.compsoc.net/swhite/history/
  • http://www.grid.org
  • http://www.gridcomputingplanet.com/
  • http://www.globus.org/research/papers/anatomy.pdf
  • http://www.ibm.com/grid
  • http://www.globus.org
  • http://www.usatlas.bnl.gov/computing/grid/

47
Acknowledgments
  • UNC-OP for funding the UNCW Grid Initiative
    Proposal:
  • "Fostering Undergraduate Research Partnerships
    through a Graphical User Environment for the
    North Carolina Computing Grid," Dr. Ron Vetter,
    PI
  • Co-PIs: Dr. Rebecca S. Boston, NCSU; Dr. Anthony
    Wilkinson, WCU; Dr. Marilyn McClelland, NCCU; Dr.
    Libero Bartolotti, ECU; Ms. Judy Porter, CFCC.
  • UNCW Participants: Computer Science: Dr. Ron
    Vetter, Dr. Clayton Ferner, Dr. David Berman, and
    Dr. Tom Hudson. Information Technology Systems:
    Dr. Bob Tyndall and Mr. Bobby Miller.
    Mathematics and Statistics: Dr. Jeff Brown.
    Chemistry and Biochemistry: Dr. Ned H. Martin.
    Biological Sciences: Dr. Ann Stapleton.
    Information Systems and Operations Management:
    Dr. Tom Janicki.
  • UNCW Computer Science students working on the
    Chemistry portal: Tristan Carland, Jerry Martin,
    Andrew Martin