Title: Dong Lu, Peter A. Dinda
1 GridG: Synthesizing Realistic Computational Grids
Dong Lu, Peter A. Dinda
Prescience Laboratory, Department of Computer Science
Northwestern University, Evanston, IL 60201
2 Outline
- Why GridG?
- What is GridG?
- Topology generation
- Hierarchical vs. degree-based?
- What are the relationships among the power laws of Internet topology?
- Annotation
- What are the correlations among hosts (inter-host) and within a host (intra-host)?
- How to build the correlations into GridG?
- Conclusions and future work
3 Why GridG?
- Synthetic grids are needed to evaluate middleware
- Existing physical grids are too small
- Can't control their parameters
- Example: evaluation of our RGIS system
- Example: grid simulation projects
- GridSim and SimGrid
- Example: overlay network simulations
- Application-level multicast
4 GridG: A Synthetic Grid Generator
- Output: network topology annotated with the hardware and software available on each node and link
- Layer 3 network: hosts, routers, links
- Hosts: memory, architecture, number of CPUs, disk, operating system, vendor, clock rate
- Routers: switching capacity
- Links: bandwidth and latency
5 Example 1
Router (switching capacity)
Link (bw, latency)
Host (arch, numcpu, clock rate, osvendor, mem, disk, ...)
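As a rough illustration of the annotated graph in this example, the three element types could be modeled as below. This is only a sketch: the class and field names (`Host`, `Router`, `Link`, `Grid`, `switching_capacity`, and so on) are hypothetical, units are assumed, and GridG's actual text-based representation is not shown in these slides.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    arch: str
    num_cpu: int
    clock_rate: float  # unit assumed for this sketch (e.g. MHz)
    os: str
    os_vendor: str
    mem_mb: int
    disk_gb: int


@dataclass
class Router:
    name: str
    switching_capacity: float  # unit assumed


@dataclass
class Link:
    src: str  # name of a host or router
    dst: str
    bandwidth: float  # unit assumed
    latency: float    # unit assumed


@dataclass
class Grid:
    hosts: dict = field(default_factory=dict)
    routers: dict = field(default_factory=dict)
    links: list = field(default_factory=list)

    def add_link(self, link):
        # A link may only connect nodes that already exist in the grid.
        known = self.hosts.keys() | self.routers.keys()
        if link.src not in known or link.dst not in known:
            raise ValueError("link endpoints must be existing nodes")
        self.links.append(link)
```

Keeping hosts and routers in name-keyed dictionaries makes it cheap to check that every link refers to a known node.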
6 Requirements
- Realistic topologies
- Connected
- Hierarchical topology
- Power laws of Internet topology
- Realistic annotations
- Distributions of attributes
- Correlations of attributes
- Intra-host
- Inter-host
7 GridG architecture
- A sequence of transformations on a text-based representation of an annotated graph
8 Outline
- Why GridG?
- What is GridG?
- Topology generation
- Hierarchical vs. degree-based?
- What are the relationships among the power laws of Internet topology?
- Annotation
- What are the correlations among hosts (inter-host) and within a host (intra-host)?
- How to build the correlations into GridG?
- Conclusions and future work
9 Quick review of the power laws of Internet topology
Power Laws Expression
Rank exponent
Outdegree exponent
Eigen exponent
Hop-plot exponent
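For reference, the four power laws are commonly written as follows in the Internet topology literature (after Faloutsos, Faloutsos, and Faloutsos). The notation is an assumption about what the slide displayed: $d_v$ is the outdegree of node $v$, $r_v$ its rank when nodes are sorted by decreasing outdegree, $f_d$ the frequency of outdegree $d$, $\lambda_i$ the $i$-th largest eigenvalue of the adjacency matrix, $P(h)$ the number of node pairs within $h$ hops, and $\delta$ the graph diameter.

```latex
\begin{align*}
\text{Rank exponent } R:       \quad & d_v \propto r_v^{R} \\
\text{Outdegree exponent } O:  \quad & f_d \propto d^{O} \\
\text{Eigen exponent } \varepsilon: \quad & \lambda_i \propto i^{\varepsilon} \\
\text{Hop-plot exponent } H:   \quad & P(h) \propto h^{H}, \qquad h \ll \delta
\end{align*}
```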
10 Current graph generators
- Random (Waxman)
- Hierarchical
- Tiers, Transit-Stub, etc. have a clear network hierarchy, but don't follow the power laws
- Degree-based
- Inet, Brite, PLRG, etc. follow the power laws, but don't have a clear network hierarchy
11 Topology Generation in GridG (1/2)
1. Generate a basic graph without any redundant links using Tiers
- This is a hierarchical graph
2. Assign each node an outdegree drawn at random from the outdegree exponent power law
- This enforces all the power laws!
- Scale-free
3. Determine the remaining outdegree of each node by taking the original hierarchical links into consideration
12 Topology Generation in GridG (2/2)
4. Add redundant links between randomly chosen pairs of nodes with sufficient remaining outdegree
- Nodes at higher levels (e.g., WAN) are given priority over nodes at lower levels (e.g., MAN)
5. Repeat step 4 until there is no pair of nodes with positive remaining outdegree
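The five steps can be sketched roughly as follows. This is an illustration under several assumptions, not GridG's implementation: `build_tree` is a hypothetical stand-in for Tiers (it just grows a random tree and labels each node with a level), "outdegree" is treated as undirected degree, duplicate links are skipped, and all parameter defaults are arbitrary.

```python
import random


def sample_outdegree(rng, exponent, max_degree):
    """Draw d in 1..max_degree with probability proportional to d**exponent."""
    degrees = list(range(1, max_degree + 1))
    weights = [d ** exponent for d in degrees]
    return rng.choices(degrees, weights=weights, k=1)[0]


def build_tree(rng, n_nodes, n_levels):
    """Stand-in for Tiers: a random tree; level 0 is the top of the hierarchy."""
    level = {0: 0}
    edges = set()
    for v in range(1, n_nodes):
        parent = rng.randrange(v)
        level[v] = min(level[parent] + 1, n_levels - 1)
        edges.add((parent, v))
    return edges, level


def generate_topology(n_nodes=200, exponent=-2.5, max_degree=20,
                      n_levels=3, seed=0):
    rng = random.Random(seed)
    # Step 1: hierarchical graph without redundant links.
    edges, level = build_tree(rng, n_nodes, n_levels)
    # Step 2: target degree for each node from the outdegree power law.
    target = {v: sample_outdegree(rng, exponent, max_degree)
              for v in range(n_nodes)}
    # Step 3: remaining degree after accounting for the tree links.
    used = {v: 0 for v in range(n_nodes)}
    for a, b in edges:
        used[a] += 1
        used[b] += 1
    remaining = {v: max(target[v] - used[v], 0) for v in range(n_nodes)}
    # Steps 4 and 5: add redundant links until no eligible pair is left.
    while True:
        candidates = [v for v in remaining if remaining[v] > 0]
        rng.shuffle(candidates)                  # random order within a level
        candidates.sort(key=lambda v: level[v])  # higher levels get priority
        pair = None
        for i, a in enumerate(candidates):
            for b in candidates[i + 1:]:
                if (a, b) not in edges and (b, a) not in edges:
                    pair = (a, b)
                    break
            if pair is not None:
                break
        if pair is None:
            break
        edges.add(pair)
        remaining[pair[0]] -= 1
        remaining[pair[1]] -= 1
    return edges, level, remaining
```

Because the starting tree is connected and links are only ever added, the result stays connected, which matches the "Connected" requirement listed earlier.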
13 Evaluation: Topology Obeys Rank Exponent Law
14 Evaluation: Topology Obeys Outdegree Exponent Law
15 Evaluation: Topology Obeys Hop-plot Law
16 Evaluation: Topology Obeys Eigenvalue Exponent Law
17 Comparing To The Internet
Power Law    Internet Routers    GridG    Tiers
Rank         -0.49               -0.51    -0.18
  R^2                            0.94     0.89
Outdegree    -2.49               -2.63    -3.4
  R^2                            0.97     0.55
Eigen        -0.18               -0.24    -0.23
  R^2                            0.97     0.97
Hop-plot     2.84                2.88     1.64
  R^2                            0.99     0.99
Notice: close match
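Exponents and R^2 values of this kind are typically obtained from a straight-line fit on a log-log plot. The sketch below shows one way to do that for the rank law, assuming R^2 means the coefficient of determination of an ordinary least-squares fit; the function names are hypothetical and this is not necessarily the procedure used to produce the table.

```python
import math


def fit_power_law(xs, ys):
    """Least-squares fit of log(y) = a + b*log(x).

    Returns (b, r_squared): the fitted exponent and the coefficient
    of determination of the fit in log-log space.
    """
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mx = sum(lx) / n
    my = sum(ly) / n
    sxx = sum((x - mx) ** 2 for x in lx)
    syy = sum((y - my) ** 2 for y in ly)
    sxy = sum((x - mx) * (y - my) for x, y in zip(lx, ly))
    slope = sxy / sxx
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, r_squared


def rank_exponent(degrees):
    """Fit degree against rank, with nodes sorted by decreasing degree."""
    ranked = sorted(degrees, reverse=True)
    ranks = range(1, len(ranked) + 1)
    return fit_power_law(ranks, ranked)
```

All degrees passed to `rank_exponent` must be positive, since the fit takes logarithms.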
18 Relationship among power laws (0)
- An interesting phenomenon: GridG and several other graph generators generate graphs according to the outdegree law only, yet the generated graphs follow all four power laws!
- How is this possible?
- The power laws are closely related
- Can we deduce the other power laws from the outdegree power law?
19 Relationship among power laws (1)
- Eigenvalue law follows from the outdegree law (Mihail and Papadimitriou)
- Hop-plot and eigenvalue power laws are followed by many topologies (Medina, et al.)
- Our results:
- Outdegree law follows from the rank law
- Rank law does not follow from the outdegree law
- Alternative rank law follows from the outdegree law and fits the data better
20 Relationship among power laws (2)
Rank law → Outdegree law
This is a power law
21 Relationship among power laws (3)
Log-log plot of the derived outdegree law. Perfect power law fit. So we can derive Rank law → Outdegree law.
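One way to see why a rank power law yields an outdegree power law is the continuous approximation below. It is an illustrative sketch and not necessarily the derivation shown on the slides; the constant $c$, the node count $N$, and the treatment of rank as a continuous variable are assumptions.

```latex
\begin{align*}
d &= c\, r^{R}, \quad R < 0
  && \text{(rank law)} \\
r(d) &= \left(\frac{d}{c}\right)^{1/R}
  && \text{(number of nodes with outdegree} \ge d\text{)} \\
f_d &\approx -\frac{1}{N}\,\frac{\mathrm{d}r}{\mathrm{d}d}
   = -\frac{1}{N R\, c^{1/R}}\; d^{\,1/R - 1}
   \;\propto\; d^{\,1/R - 1}
  && \text{(fraction of nodes with outdegree } d\text{)}
\end{align*}
```

Under these assumptions the derived frequency is again a pure power law in $d$, with exponent $1/R - 1$.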
22 Relationship among power laws (4)
Outdegree law → Rank law
This is NOT a power law
23 Relationship among power laws (5)
Log-log plot of the derived rank law. Not a power law! So we can NOT derive Outdegree law → Rank law.
Corresponds well to the Faloutsos Internet data
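A possible way to see why the reverse direction does not give a pure power law is to integrate the outdegree law up to a finite maximum outdegree. This is a hedged sketch under a continuous approximation, and it may differ from the derivation on the slides; the constant $a$, the node count $N$, and the cutoff $d_{\max}$ are assumptions.

```latex
\begin{align*}
f_d &= a\, d^{O}, \quad O < -1
  && \text{(outdegree law)} \\
r(d) &\approx N \int_{d}^{d_{\max}} a\, k^{O}\, \mathrm{d}k
   = \frac{N a}{|O + 1|}\left(d^{\,O+1} - d_{\max}^{\,O+1}\right)
  && \text{(rank of a node with outdegree } d\text{)} \\
d(r) &= \left(\frac{|O + 1|}{N a}\, r + d_{\max}^{\,O+1}\right)^{1/(O+1)}
\end{align*}
```

The additive term $d_{\max}^{\,O+1}$ means $\log d$ is not linear in $\log r$, so the curve bends for the highest-ranked nodes; only when that term is negligible does the expression reduce to $d \propto r^{1/(O+1)}$.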
24 Relationship among power laws (6)
- Log-log plot of the derived outdegree law using the new rank law. It is a perfect power law.
25 Relationship among power laws (7)
We propose the following as the relationships among Internet topology power laws:
New rank law
Outdegree power law
Eigenvalue law
26 Outline
- Why GridG?
- What is GridG?
- Topology generation
- Hierarchical vs. degree-based?
- What are the relationships among the power laws of Internet topology?
- Annotation
- What are the correlations among hosts (inter-host) and within a host (intra-host)?
- How to build the correlations into GridG?
- Conclusions and future work
27 Annotation Generator
- Distributions for attributes
- Example: Smith MDS trace for memory
- Intra-host correlation of attributes
- Example: memory and CPU
- Inter-host correlations of attributes
- Example: cluster of identical machines
28 Intra-host correlations
- The memory size, architecture, CPU clock rate, number of CPUs, disk size, etc. all have certain distributions. These distributions are not independent, however
- Example: a host with 64 CPUs is likely to have very large memory. Similarly, a host with a 3 GHz processor is likely to have more memory than a host with a 1 GHz processor
- Many intra-host correlations are unknown
- GridG has heuristic rules and can be extended by the user
29 Heuristic Intra-host rules
- A single-processor host will have between 64 MB and 4 GB of memory
- More CPUs: more likely to have bigger memory and disk
- More memory: more likely to have a bigger disk, and vice versa
- Windows machines won't have more than 4 processors
- Machines with different architectures have different distributions of CPU clock rate
- Host load is not correlated with other attributes
30 Assumed Dependence Tree
31 Inter-host correlations
- Hosts that are close to each other are likely to share some attributes
- For example: OS concentration
- Every IP subnet we probed had a dominant OS
- OS concentration rule built into GridG
- User can disable
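The OS concentration rule might be sketched as follows: each subnet draws a dominant OS, and each host in that subnet takes it with some probability. This is an illustration only; the function name `assign_os`, the `dominance` probability, the `enabled` switch, and the grouping of hosts by subnet are all assumptions rather than GridG's actual interface.

```python
import random


def assign_os(subnets, os_weights, dominance=0.8, seed=0, enabled=True):
    """Assign an OS to every host.

    subnets:    dict mapping a subnet name to a list of host names
    os_weights: dict mapping an OS name to its overall weight
    dominance:  probability that a host takes its subnet's dominant OS
    enabled:    set to False to turn the concentration rule off
    """
    rng = random.Random(seed)
    names = list(os_weights)
    weights = [os_weights[name] for name in names]
    result = {}
    for hosts in subnets.values():
        # One dominant OS per subnet, drawn from the overall distribution.
        dominant = rng.choices(names, weights=weights, k=1)[0]
        for host in hosts:
            if enabled and rng.random() < dominance:
                result[host] = dominant
            else:
                # Fall back to an independent draw for this host.
                result[host] = rng.choices(names, weights=weights, k=1)[0]
    return result
```

With `enabled=False` every host is drawn independently, which corresponds to the user disabling the rule.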
32 Annotation Algorithm: Basic
- Based on the dependence tree, make the grid conform to the correlations by applying conditional probability
- Choose the distribution of an attribute based on the attributes picked before it
- For example: first choose the architecture according to a distribution, then choose the number of CPUs based on it, and finally choose the size of memory based on the previous two choices
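The architecture, then CPU count, then memory example can be sketched as a chain of conditional draws. Every probability table below is invented for illustration and is not taken from GridG or from measurement data; the names `ARCH_DIST`, `NUM_CPU_GIVEN_ARCH`, `MEM_PER_CPU_MB_GIVEN_ARCH`, `pick`, and `annotate_host` are hypothetical.

```python
import random

# P(arch): made-up values.
ARCH_DIST = {"IA32": 0.7, "SPARC32": 0.2, "MIPS": 0.1}

# P(num_cpu | arch): made-up values.
NUM_CPU_GIVEN_ARCH = {
    "IA32": {1: 0.7, 2: 0.2, 4: 0.1},
    "SPARC32": {1: 0.3, 4: 0.4, 16: 0.3},
    "MIPS": {16: 0.5, 512: 0.5},
}

# P(memory per CPU in MB | arch): made-up values.
MEM_PER_CPU_MB_GIVEN_ARCH = {
    "IA32": {256: 0.5, 512: 0.3, 1024: 0.2},
    "SPARC32": {256: 0.4, 512: 0.6},
    "MIPS": {128: 0.5, 256: 0.5},
}


def pick(rng, dist):
    """Draw one key of dist with probability proportional to its value."""
    values = list(dist)
    return rng.choices(values, weights=[dist[v] for v in values], k=1)[0]


def annotate_host(rng):
    # Walk the dependence chain: each draw conditions on earlier choices.
    arch = pick(rng, ARCH_DIST)
    num_cpu = pick(rng, NUM_CPU_GIVEN_ARCH[arch])
    # Memory depends on both previous choices: arch selects the per-CPU
    # distribution, and num_cpu scales the result.
    mem_mb = num_cpu * pick(rng, MEM_PER_CPU_MB_GIVEN_ARCH[arch])
    return {"arch": arch, "num_cpu": num_cpu, "mem_mb": mem_mb}
```

A call such as `annotate_host(random.Random(0))` returns one host whose memory always scales with its CPU count, so combinations like 512 CPUs with 256 MB cannot occur.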
33 Annotation Algorithm: user rules
- User can add rules to GridG; for example, all the hosts with N or more processors will have more than N*1024 MB of memory, etc.
- User rules appear as Perl functions
- User can also configure the distribution of host attributes in the config file
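GridG's user rules are Perl functions; the Python sketch below only illustrates the idea behind the example rule. Enforcing a rule by redrawing hosts until it passes is an assumption of this sketch, and `min_memory_rule`, `sample_host`, and `generate_with_rules` are hypothetical names with made-up value ranges.

```python
import random


def min_memory_rule(n):
    """Rule: hosts with n or more processors have more than n*1024 MB."""
    def rule(host):
        if host["num_cpu"] >= n:
            return host["mem_mb"] > n * 1024
        return True
    return rule


def sample_host(rng):
    # Independent draws from made-up value sets, for illustration only.
    return {
        "num_cpu": rng.choice([1, 2, 4, 8]),
        "mem_mb": rng.choice([512, 1024, 4096, 8192, 16384]),
    }


def generate_with_rules(rng, rules, max_tries=1000):
    """Redraw hosts until one satisfies every rule (rejection sampling)."""
    for _ in range(max_tries):
        host = sample_host(rng)
        if all(rule(host) for rule in rules):
            return host
    raise RuntimeError("no host satisfied all rules")
```

For example, `generate_with_rules(random.Random(0), [min_memory_rule(4)])` only returns hosts with 4 or more CPUs when their memory exceeds 4096 MB.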
34 Examples: Silly hosts
Host  NumCPU  Clock rate  Mem (MB)  Disk (GB)  Arch     OS       OS vendor
1     512     1200        256       40         IA32     DUX      Sun
2     16      1000        512       800        PARISC   NetBSD   Microsoft
3     4       1600        512       160        SPARC32  DUX      RedHat
4     1       1800        65536     400        IA32     Solaris  Microsoft
Hosts generated without considering intra-host correlations; each attribute follows its own distribution.
35 Examples: Sensible hosts
Host  NumCPU  Clock rate  Mem (MB)  Disk (GB)  Arch     OS       OS vendor
1     512     1200        65536     10240      MIPS     FreeBSD  FreeBSD
2     16      1000        8192      800        PARISC   NetBSD   NetBSD
3     4       1600        1024      160        SPARC32  Solaris  Sun
4     1       1800        512       80         IA32     Win2k    Microsoft
Hosts generated taking intra-host correlations into account.
36 Open questions
- What are the real distributions of host attributes?
- What are the real intra- and inter-host correlations?
Difficult to answer without measurement data
Difficult to acquire measurement data (see paper)
We would appreciate your help!
37 Conclusions
- We have presented GridG, a toolkit for generating synthetic computational grids
- The topology generation component can produce structured network topologies that obey the power laws of Internet topology
- The annotation generation component of GridG is built upon Internet measurements and a set of heuristic rules
38 Conclusions
- While developing GridG's topology generator, we discovered an interesting relationship among the power laws, and proposed a new one that better fits the data
- While measuring the Internet, we found the OS concentration phenomenon and built it into GridG as a user option
39 For More Information
- GridG is released online at
- http://www.cs.northwestern.edu/urgis/GridG
- http://www.cs.northwestern.edu/urgis
- Related RGIS project papers
- Nondeterministic queries in a Relational Grid Information Service, in Proceedings of SC03
- Scoped and Approximate queries in a Relational Grid Information Service, in Proceedings of Grid2003