Title: Grid Computing Technology: A Review
1Grid Computing Technology: A Review
- Jun Ni, Ph.D. M.E.
- Department of Computer Science
- The University of Iowa
2Outline
- History of Grid Computing
- Definition of Grid Computing
- A Grid Computing Model
- Grid Computing Protocols
- Types of Grids
3Introduction
- Grid Computing is the evolution and amalgamation
of numerous development efforts that have been
going on for many years.
- Infrastructure of IT
- Innovative Distributed Computing Technology
4Grid Computing Protocols and Internet Protocol
- Grid Protocol Architecture (top to bottom):
Applications, Collection, Resources, Connectivity,
Fabric
- Internet Protocol Architecture (top to bottom):
Application, Transport, Internet, Link
5History
- In the early-to-mid 1990s, there were numerous
research projects underway in the academic and
research community that were focused on
distributed computing.
- One key area of research focused on developing
tools that would allow distributed
high-performance computing systems to act like one
large computer.
6History
- At the IEEE/ACM Supercomputing conference in San
Diego in 1995, 11 high-speed networks were used to
connect 17 sites with high-end computing
resources for a demonstration to create one super
metacomputer.
- This demonstration was called I-Way and was led
by Ian Foster of the United States Department of
Energy's Argonne National Laboratory and the
University of Chicago [1].
- [1] http://www-fp.mcs.anl.gov/foster/
8History
- Sixty different applications, spanning various
fields of science and engineering, were
developed and run over this demonstration
network.
- Many of the early Grid Computing concepts were
explored in this demonstration as the team
created various software programs to make all
computing resources work together.
9History
- The success of the I-Way demonstration led the
United States government's DARPA agency, in
October 1996, to fund a project to create
foundation tools for distributed computing.
- The research project was led by Ian Foster of ANL
and Carl Kesselman of the University of Southern
California.
11History
- The project was named Globus.
- It produced a suite of tools that laid the
foundation for Grid Computing activities in the
academic and research communities.
- http://www.globus.org/
- http://www.globus.org/alliance/publications/papers.php
- At the 1997 Supercomputing conference, 80 sites
worldwide running software based on the Globus
Toolkit were connected together.
12History
- This effort started to be referred to as Grid
Computing, a term coined to play on the analogy to
the electrical power grid.
- Grid computing would make tremendous computing
power available to anybody, at any time, and in a
truly transparent manner, just as, today, the
electric power grid makes power available to
billions of electrical outlets.
13History
- Grid Computing in the academic and research
communities remained focused on creating an
efficient framework to leverage distributed
high-performance computing systems.
- But with the explosion of the Internet and the
increasing power of the desktop computer during
the same period, many efforts were launched to
create powerful distributed computing systems by
connecting together PCs on the network.
14History
- In 1997, Entropia was launched to harness
idle computers worldwide to solve problems of
scientific interest. The Entropia network grew to
30,000 computers with an aggregate speed of over
one teraflop.
- A whole new field of philanthropic computing came
about in which ordinary users volunteered their
PCs to work on research problems such as analyzing
patients' responses to chemotherapy, discovering
drugs for AIDS, and finding potential cures for
anthrax.
15History
- Although none of the above projects could be
successfully monetized by companies, they did
attract a lot more media attention than
any of the earlier projects in the academic and
research world.
- Starting in late 2000, articles on Grid Computing
moved from the trade press to the popular press.
In rapid-fire succession, articles appeared in
many major publications.
16History
- Examples include the New York Times, the
Economist, Business 2.0, Red Herring, the
Washington Post, the Financial Times, Yomiuri
Shimbun, The Herald of Glasgow, the Jakarta Post,
and Dawn of Karachi. In fact, the trend has only
accelerated recently.
- An analysis of LexisNexis data shows that Grid
Computing references in popular U.S. media are up
dramatically, to 500 citations from 100 in Q4 of
2001.
17History
- Grid Technology Partners' Grid Looking Glass
Index, which tracks Grid Computing-related
searches on the Google search engine, tripled in
its first year.
19History
- Today, large corporations such as IBM, Sun
Microsystems, Intel, and Hewlett Packard, along
with smaller companies such as Platform
Computing, Avaki, Entropia, DataSynapse, and
United Devices, are putting their marketing
dollars to good use and creating the next
generation of thought leadership around Grid
Computing that is focused on business
applications rather than academic and basic
research applications.
20High Performance Computing
- High-performance computing generally refers to
what has traditionally been called
supercomputing.
- There are hundreds of supercomputers deployed
throughout the world.
- Key parallel processing algorithms have already
been developed to support execution of programs
on different, but co-located, processors.
21High Performance Computing
- High-performance computing system deployment,
contrary to popular belief, is not limited to
academic or research institutions.
- In fact, more than half of the supercomputers
deployed in the world today are in use at various
corporations.
22High Performance Computing
- High-performance systems are deployed in
numerous industries.
- The accompanying figure (not transcribed) shows
the distribution of the top 500 supercomputers by
industry.
24High Performance Computing
- It was the desire to share high-performance
computing resources amongst researchers that led
to the development of Grid Computing technology
and some of its fundamental infrastructure.
- High-performance computing statistics are
extremely important because they tell us which
industries already have demand for tremendous
computing power.
- Information on the world's top 500 supercomputers
is compiled twice a year jointly by the
University of Mannheim and the University of
Tennessee. It can be found at www.top500.org.
25Cluster Computing
- Cluster computing came about as a response to the
high prices of supercomputers, which put those
systems out of reach for many research projects.
- Clusters are high-performance, massively parallel
computers built primarily out of commodity
hardware components, running a free-software
operating system such as Linux or FreeBSD, and
interconnected by a private high-speed network.
26Cluster Computing
- A cluster consists of a group of PCs or
workstations dedicated to running
high-performance computing tasks.
- The nodes in the cluster do not sit on users'
desks, but are dedicated to running cluster jobs.
- A cluster is usually connected to the outside
world through only a single node.
27Cluster Computing
- Cluster computing has been around since 1994, when
the first Beowulf clusters were developed and
deployed.
- Since then, numerous tools have been developed to
run and manage clusters.
28Cluster Computing
- Platform Computing, a firm that is now a leader
in Grid Computing, developed many of the early
load-balancing tools for clusters.
- Additionally, tools have been developed to
adapt applications to run in the parallel cluster
environment.
29Cluster Computing
- One such tool, ForgeExplorer, can check if
particular applications are suitable for
parallelization and determine if they would be
suitable to run on a cluster.
30Cluster Computing
- The exponential growth in microprocessor speeds
over the last decade has now made it possible to
create truly impressive clusters.
- An AMD Athlon-based cluster at the University of
Heidelberg in Germany was tested at 825 Gflops,
making it the 35th fastest high-performance
computer in the world.
- Clusters are widely deployed in industries such
as life sciences, digital entertainment, and
finance.
31Cluster Computing
- The IEEE Computer Society Task Force on Cluster
Computing conducts a yearly conference that is
designed to bring together international cluster
and Grid Computing researchers, developers, and
users to present and exchange the latest
innovations and findings that drive future
research and products.
- It is expected that the Grid Computing community
will benefit from the years of experience that
the cluster community has in building tools that
allow applications to share distributed computing
resources.
32Cluster Computing
- Beowulf clusters were developed by Thomas
Sterling and Don Becker while working at the
Center of Excellence in Space Data and
Information Sciences, a division of the
Universities Space Research Association located
at NASA Goddard Space Flight Center.
- The AMD-based cluster was tested using the
Linpack benchmark, as performed by Top500.org.
33Peer-to-peer Computing
- Although the recent growth of user-friendly
file-sharing networks, such as Napster or Kazaa,
has only now brought Peer-to-Peer (P2P) networks
and file sharing into the public eye, methods for
transferring files and information between
computers have, in fact, been around almost as
long as computing itself.
- Until recently, however, systems for sharing
files and information between computers were
exceedingly limited.
34Peer-to-peer Computing
- They were largely confined to Local Area Networks
(LANs) and the exchange of files with known
individuals over the Internet.
- LAN transfers were executed mostly via a built-in
system or network software, while Internet file
exchanges were mostly executed over an FTP (File
Transfer Protocol) connection.
35Peer-to-peer Computing
- The reach of this Peer-to-Peer sharing was
limited to the circle of computer users an
individual knew and agreed to share files with.
- Users who wanted to communicate with new or
unknown users could transfer files using IRC
(Internet Relay Chat) or similar bulletin
boards dedicated to specific subjects, but these
methods never gained mainstream popularity
because they were somewhat difficult to use.
36Peer-to-peer Computing
- Today, there are a number of advanced P2P
file-sharing applications, and the reach and scope
of peer networks have increased dramatically.
- The two main models that have evolved are the
centralized model, such as the one used by
Napster, and the decentralized model, like the one
used by Gnutella.
37Peer-to-peer Computing
- In the centralized model of P2P, file sharing is
based around the use of a central server system
that directs traffic between individual
registered users.
- The central servers maintain directories of the
shared files stored on the respective PCs of
registered users of the network.
- These directories are updated every time a user
logs on or off the Napster server network.
38Peer-to-peer Computing
- Each time a user of a centralized P2P
file-sharing system submits a request or searches
for a particular file, the central server creates
a list of files matching the search request by
cross-checking the request with the server's
database of files belonging to users who are
currently connected to the network.
- The central server then displays that list to the
requesting user. The requesting user can then
select the desired file from the list and open a
direct HTTP link with the individual computer
that currently possesses that file.
- The download of the actual file takes place
directly, from one network user to the other. The
actual file is never stored on the central server
or on any intermediate point on the network.
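The centralized model described on these slides can be sketched in a few lines of Python. This is only an illustrative sketch, not Napster's actual protocol: the class name `CentralIndex`, its methods, and the sample users and file names are all assumptions made for the example. It shows the key property that the server keeps a directory of who shares what, while the files themselves stay on the users' machines.

```python
from collections import defaultdict


class CentralIndex:
    """Hypothetical central directory: it stores who shares what, never the files."""

    def __init__(self):
        # file name -> set of connected users sharing that file
        self._owners = defaultdict(set)
        # user name -> file names that user registered at logon
        self._shared = {}

    def log_on(self, user, file_names):
        """Register a user's shared files when the user connects."""
        self.log_off(user)  # discard any stale entries for this user
        self._shared[user] = set(file_names)
        for name in file_names:
            self._owners[name].add(user)

    def log_off(self, user):
        """Drop a user's entries so searches only return connected peers."""
        for name in self._shared.pop(user, set()):
            self._owners[name].discard(user)
            if not self._owners[name]:
                del self._owners[name]

    def search(self, term):
        """Return sorted (file name, owner) pairs whose name contains the term."""
        term = term.lower()
        return sorted(
            (name, owner)
            for name, owners in self._owners.items()
            if term in name.lower()
            for owner in owners
        )


index = CentralIndex()
index.log_on("alice", ["grid_intro.pdf", "song.mp3"])
index.log_on("bob", ["song.mp3"])
hits = index.search("song")            # both alice and bob are listed
index.log_off("bob")
hits_after_logoff = index.search("song")  # only alice remains
```

After a search, the requester would contact the listed owner directly to download the file; that transfer step is deliberately left out of the sketch because it never touches the central server.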
39Peer-to-peer Computing
- The decentralized model of P2P file sharing does
not use a central server to keep track of files.
- Instead, it relies on each individual computer to
announce its existence to a peer, which in turn
announces it to all the users that it is
connected to, and so on.
40Peer-to-peer Computing
- The search for a file follows a similar path. If
one of the computers in the peer network has a
file that matches the request, it transmits the
file information (name, size, etc.) back through
all the computers in the pathway to the user that
requested the file.
- A direct connection between the requester and the
owner of the file is then established and the
file is transferred.
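The decentralized search can be illustrated with a small breadth-first flood over a peer graph. This is a sketch under stated assumptions, not the Gnutella wire protocol: the function name `flood_search`, the dictionary-based peer graph, and the `ttl` hop limit are introduced here for illustration (the slides do not mention a hop limit; it is added so the flood terminates on large networks).

```python
from collections import deque


def flood_search(neighbors, files, start, wanted, ttl=4):
    """Breadth-first flood from `start`; returns (owner, path) or None.

    neighbors: peer -> list of directly connected peers
    files:     peer -> set of file names held by that peer
    ttl:       maximum number of hops a query may travel (an assumption here)
    """
    queue = deque([(start, [start])])
    seen = {start}
    while queue:
        peer, path = queue.popleft()
        if peer != start and wanted in files.get(peer, set()):
            # The hit travels back along `path`; the transfer itself is direct.
            return peer, path
        if len(path) - 1 >= ttl:
            continue  # this query has used up its hops
        for nxt in neighbors.get(peer, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [nxt]))
    return None


neighbors = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a"], "d": ["b"]}
files = {"d": {"paper.pdf"}}
result = flood_search(neighbors, files, "a", "paper.pdf")
too_far = flood_search(neighbors, files, "a", "paper.pdf", ttl=1)
```

Here `result` names peer `d` as the owner together with the path `a`, `b`, `d` that the query followed, while `too_far` is `None` because the file is two hops away and the query was limited to one.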
42Internet Computing
- The explosion of the Internet and the increasing
power of the home computer prompted computer
scientists and engineers to apply techniques
learned in high-performance and cluster-based
distributed computing to utilize the vast
processing cycles available at users' desktops.
- This has come to be known as Internet computing.
43Internet Computing
- Large compute-intensive projects are coded so
that tasks can be broken down into smaller
subtasks and distributed over the Internet for
processing.
- Volunteer users then download a lightweight
client onto their desktop, which periodically
communicates with the central server to receive
tasks.
44Internet Computing
- The client initiates the tasks only when the
desktop CPU is not in use. Upon completion of the
task, it communicates results back to the central
server.
- The central server aggregates the information
received from all the different desktops and
compiles the results.
- United Devices, Entropia, and others have
established large groups of users that volunteer
their desktops for large computing projects.
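The split, distribute, and aggregate cycle described above can be sketched as follows. This is a toy, single-process illustration and not the design of any real Internet computing product: `TaskServer`, `run_client`, the `cpu_is_idle` callback, and the sum-of-squares workload are all hypothetical names and choices made for the example.

```python
class TaskServer:
    """Hypothetical central server that hands out subtasks and collects results."""

    def __init__(self, numbers, chunk_size):
        # Break one large job (summing squares) into independent subtasks.
        self._pending = [
            numbers[i:i + chunk_size] for i in range(0, len(numbers), chunk_size)
        ]
        self._results = []

    def next_task(self):
        """Return the next subtask, or None when no work is left."""
        return self._pending.pop() if self._pending else None

    def submit(self, partial):
        """Record one partial result sent back by a client."""
        self._results.append(partial)

    def aggregate(self):
        """Combine the partial results into the final answer."""
        return sum(self._results)


def run_client(server, cpu_is_idle):
    """Process subtasks while the desktop is idle; return how many were done."""
    done = 0
    while cpu_is_idle():
        task = server.next_task()
        if task is None:
            break
        server.submit(sum(x * x for x in task))
        done += 1
    return done


server = TaskServer(list(range(10)), chunk_size=3)
tasks_done = run_client(server, cpu_is_idle=lambda: True)
total = server.aggregate()
```

In a real deployment the client and server would communicate over the network and many clients would run at once; the sketch keeps everything in one process so that the control flow is easy to follow.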
46Internet Computing
- Internet computing projects, although not
profitable, have allowed companies to understand
large-scale distributed computation projects.
- Many of these companies, which were originally
funded to harness the power of consumers'
desktops connected to the Internet, are retooling
their products for enterprise applications.
- Some people classify some of the Internet
computing projects as an emerging area called
philanthropic computing.
47Grid Computing
- Grid computing tries to bring under one
definitional umbrella all the work being done in
the high-performance, cluster, peer-to-peer, and
Internet computing arenas.
- Coming up with a definition for Grid Computing,
therefore, is not as easy as one would have
expected. Vendors, academics, and the trade press,
as well as the popular press, have all tried to
define Grid Computing.
48Grid Computing
- Some of the definitions of Grid Computing that we
have uncovered include:
- The flexible, secure, coordinated resource
sharing among dynamic collections of individuals,
institutions, and resources.
- Transparent, secure, and coordinated resource
sharing and collaboration across sites.
- The ability to form virtual, collaborative
organizations that share applications and data in
an open heterogeneous server environment in order
to work on common problems.
- The ability to aggregate large amounts of
computing resources which are geographically
dispersed to tackle large problems and workloads
as if all the servers and resources are located
in a single site.
- A hardware and software infrastructure that
provides dependable, consistent, pervasive, and
inexpensive access to computational resources.
- The Web provides us information; the grid allows
us to process it.
49Grid Computing
- The following broader definition of Grid
Computing serves the purpose of this book more
fully and will be used to describe grid systems:
50Grid Computing
- Grid Computing enables virtual organizations to
share geographically distributed resources as
they pursue common goals, assuming the absence of
central location, central control, omniscience,
and an existing trust relationship.
51Grid Computing
- Virtual organizations can range from small
corporate departments that are in the same
physical location to large groups of people from
different organizations that are spread out
across the globe.
- Virtual organizations can be large or small,
static or dynamic. Some may come together for a
particular event and then be disbanded once the
event ends.
52Grid Computing
- Some examples of a virtual organization are:
- Boeing's Blended Wing Body design team, located
in numerous Boeing offices around the world.
- WorldCom's Global VPN Product Management team,
with members in 28 countries working on defining
product specifications.
- An accounting department of a company.
- An emergency response team created to tackle an
oil spill in the Gulf of Mexico.
53Grid Computing
- A resource is an entity that is to be shared.
- It can be a computational resource, such as a
personal digital assistant, laptop, desktop,
workstation, server, cluster, or supercomputer, or
a storage resource, such as a hard drive in a
desktop, a RAID (Redundant Array of Inexpensive
Disks), or a terabyte storage device.
- Sensors are another type of resource.
- Bandwidth is yet another resource that is used in
the activities of the virtual organization.
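The relationship between a virtual organization and the resources it shares can be pictured as a small data model. The sketch below is purely illustrative: `ResourceKind`, `Resource`, `VirtualOrganization`, and the sample resource names and sites are hypothetical and are not taken from any grid toolkit. The four resource kinds mirror the categories listed on this slide.

```python
from dataclasses import dataclass, field
from enum import Enum


class ResourceKind(Enum):
    COMPUTE = "compute"      # PDA, laptop, desktop, server, cluster, ...
    STORAGE = "storage"      # hard drive, RAID, terabyte storage device
    SENSOR = "sensor"
    BANDWIDTH = "bandwidth"


@dataclass(frozen=True)
class Resource:
    name: str
    kind: ResourceKind
    site: str  # where the resource physically lives


@dataclass
class VirtualOrganization:
    goal: str
    resources: list = field(default_factory=list)

    def share(self, resource):
        """Add a resource to the pool shared by this organization."""
        self.resources.append(resource)

    def by_kind(self, kind):
        """List the shared resources of one kind."""
        return [r for r in self.resources if r.kind is kind]

    def sites(self):
        """Return the distinct sites contributing resources, sorted by name."""
        return sorted({r.site for r in self.resources})


vo = VirtualOrganization(goal="oil spill response")
vo.share(Resource("cluster-1", ResourceKind.COMPUTE, "Houston"))
vo.share(Resource("raid-7", ResourceKind.STORAGE, "Iowa City"))
vo.share(Resource("buoy-sensor", ResourceKind.SENSOR, "Gulf of Mexico"))
compute_names = [r.name for r in vo.by_kind(ResourceKind.COMPUTE)]
vo_sites = vo.sites()
```

Note that the model has no central owner of the resources: each one records its own site, which reflects the idea that a virtual organization pools geographically distributed resources rather than relocating them.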
54Grid Computing
- Absence of a central location and central control
implies that grid resources do not require a
particular central location for their management.
- The final key point is that in a grid environment
the resources do not have prior information about
each other nor do they have pre-defined security
relationships.
55Grid Computing
- We mentioned earlier that this is a broad and
all-encompassing definition of Grid Computing.
- There will be degrees to which certain grid
deployments and grid products meet or do not meet
the above criteria.
56Relationship between Grid computing and others
- Peer-to-peer networks fall within our definition
of Grid Computing.
- The resource in peer-to-peer networks is the
storage capacity of each node (mostly desktops).
- Desktops are globally distributed and there is no
central controlling authority.
- The exchange of files between users also does not
presuppose any pre-existing trust relationship.
- It is not surprising, given how snugly P2P fits
in our definition of Grid Computing, that the
Peer to Peer Working Group has become part of the
grid standards body, the Global Grid Forum (GGF).
57Relationship between Grid computing and others
- From a Grid Computing perspective, a cluster is a
resource that is to be shared. A grid can be
considered a cluster of clusters.
58Relationship between Grid computing and others
- The Internet computing examples presented
earlier, in our opinion, fit this broad definition
of Grid Computing. A virtual organization is
assembled for a particular project and disbanded
once the project is complete. The shared resource,
in this case, is the Internet-connected desktop.