Title: Data Grid Technologies
1 Data Grid Technologies
- Sathish Vadhiyar
- Sources/Credits: technical papers listed in the
references
2 Replica Strategies
3 Problem Motivation
- Replication helps deal with faults and provides
scheduling flexibility.
- Given a file that is partitioned into blocks that
are replicated throughout a wide-area file
system, how can a client retrieve the file with
the best performance?
- Various algorithms
4 Basic Downloading Algorithm
- The client opens a thread to each server
containing the file
- A block size is chosen
- Each thread selects a different block to download,
and all threads start downloading
- When a thread finishes, it chooses a new block that is
currently not being downloaded by any other
thread
- Adaptive: servers with higher bandwidths to the
client download more blocks
- Selection of block size is tricky
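The algorithm above can be sketched as follows. This is a minimal illustration, not the authors' implementation; `fetch_block` is a hypothetical callback standing in for the actual network transfer. Faster servers finish their blocks sooner, loop back sooner, and so naturally claim more blocks, which is where the adaptivity comes from.

```python
import threading

def download_file(num_blocks, servers, fetch_block):
    """One thread per server; each thread repeatedly claims the lowest
    block no other thread has claimed and downloads it."""
    lock = threading.Lock()
    claimed = set()   # blocks some thread has taken responsibility for
    result = {}       # block index -> downloaded data

    def worker(server):
        while True:
            with lock:
                remaining = [b for b in range(num_blocks) if b not in claimed]
                if not remaining:
                    return
                block = remaining[0]
                claimed.add(block)
            # Outside the lock: the (simulated) transfer itself.
            result[block] = fetch_block(server, block)

    threads = [threading.Thread(target=worker, args=(s,)) for s in servers]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [result[b] for b in range(num_blocks)]
```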
5 Aggressive Redundancy
- To provide fault tolerance and to improve
download time
- A redundancy factor, R
- The client downloads a block simultaneously from
R servers
- Only one copy is kept - whichever returns first
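A sketch of the per-block redundancy, again with a hypothetical `fetch_block` callback. A real client would cancel the losing transfers; here the pool simply lets them finish and discards their results.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait

def fetch_redundant(block, servers, r, fetch_block):
    """Request `block` from r servers at once and keep whichever
    copy arrives first."""
    with ThreadPoolExecutor(max_workers=r) as pool:
        futures = [pool.submit(fetch_block, s, block) for s in servers[:r]]
        done, _ = wait(futures, return_when=FIRST_COMPLETED)
        return next(iter(done)).result()
```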
6 Progress-Driven Redundancy
- Retry a download only when it is progressing
slowly
- Progress number P, redundancy factor R
- Each block is assigned a download number, initialized
to 0
- When a thread attempts to download a block, it
increments the block's download number
7 Progress-Driven Redundancy (Continued)
- For selecting a new block to download:
- If there is a block B whose download number is less than R,
and there are P blocks after B whose downloads
have completed, then select B
- Else select the next block whose download number is
zero
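The selection rule above, as a sketch (the exact bookkeeping in the paper may differ):

```python
def select_block(download_num, completed, p, r):
    """download_num[i]: how many threads have attempted block i;
    completed[i]: whether block i has finished downloading.
    Returns the index of the block to download next, or None."""
    n = len(download_num)
    # Rule 1: retry a lagging block B (started, unfinished, download
    # number still below R) once P blocks after it have completed.
    for b in range(n):
        if (not completed[b] and 0 < download_num[b] < r
                and sum(completed[b + 1:]) >= p):
            return b
    # Rule 2: otherwise take the next block nobody has started.
    for b in range(n):
        if download_num[b] == 0:
            return b
    return None
```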
8 Fastest1
- Another approach
- For downloading a block, choose the server that has
the minimum value of time * (l + 1)
- time: predicted time to download a block when
there is no contention, obtained from NWS numbers
before the download is initiated
- l: number of threads currently downloading from
the server
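The selection rule is one line; a sketch:

```python
def fastest1(predicted_time, active_threads):
    """Pick the server minimizing t * (l + 1), where t is the
    NWS-predicted contention-free download time for that server and
    l is the number of threads currently downloading from it."""
    return min(predicted_time,
               key=lambda s: predicted_time[s] * (active_threads[s] + 1))
```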
9 Results
10 Multiple clients
- The situation arises when parallel data for
computation on parallel clients has to be
selected from available replica server locations
- More challenges: a download decision by one client
can impact download performance of other clients.
Need to predict this impact.
- Periodic network monitoring has to be augmented
by measurements corresponding to current
downloads
11 Collective Download Algorithm
- Each client connects to a server only once,
even if some of the data belongs to other clients
(download phase)
- The clients then redistribute data among
themselves (redistribution phase)
- Widely followed in parallel I/O
- Especially useful when clients and servers are on
either side of a WAN: multiple latencies can be
avoided at the cost of a less expensive
redistribution phase
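One way to read the two phases, as a rough sketch. The round-robin assignment of servers to downloading clients and the `fetch` bulk-transfer callback are assumptions for illustration, not from the paper.

```python
def collective_download(clients, servers, need, fetch):
    """need[c]: list of (server, block) pairs client c wants.
    fetch(client, server, blocks): one bulk transfer, returning
    {block: data}. Phase 1: each server is contacted by exactly one
    client, which pulls every block any client needs from it.
    Phase 2: blocks are redistributed to the clients that want them."""
    # Phase 1: gather, one connection per server.
    per_server = {s: set() for s in servers}
    for c in clients:
        for s, b in need[c]:
            per_server[s].add(b)
    staged = {}  # (server, block) -> data, held collectively
    for i, s in enumerate(servers):
        downloader = clients[i % len(clients)]  # round robin (assumption)
        for b, data in fetch(downloader, s, sorted(per_server[s])).items():
            staged[(s, b)] = data
    # Phase 2: cheap redistribution among the clients.
    return {c: {key: staged[key] for key in need[c]} for c in clients}
```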
12 Replica Placement Strategies
- Replica placement questions
- When should replicas be created?
- Which files should be replicated?
- Where should replicas be placed?
- The model assumes that data is produced in tier-1
(the root) and that there is storage space at various
tiers (levels of the hierarchy)
- Clients that request data form the leaves of the
hierarchy
13 Placement strategies
- Best client
- Each storage node maintains a history of the
number of requests for the files it contains
- If the number of requests for a file exceeds a
threshold, the node creates a replica of the file
at the client node that has generated the most
requests for that file (the best client)
- The request details for the file are then cleared.
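A sketch of the best-client decision, using an in-memory request counter (the data structure is an assumption; the paper only describes the policy):

```python
def maybe_replicate(counts, threshold):
    """counts[f][c]: number of requests for file f from client node c.
    When a file's total request count exceeds the threshold, return
    (file, best_client) and clear that file's history; else None."""
    for f, per_client in counts.items():
        if sum(per_client.values()) > threshold:
            best = max(per_client, key=per_client.get)  # best client
            counts[f] = {}  # clear the request details for this file
            return f, best
    return None
```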
14 Strategies
- Cascading replication
- Analogy to a 3-tiered fountain
- Once a threshold for a file is exceeded at the
root, a replica is created at the next level on
the path to the best client, and so on
- Geographical locality is exploited
- Plain caching done at the client
- Caching plus cascading replication
15 Strategies
- Fast Spread
- A replica of the file is stored at each node
along its path to the client
- Replica selection: closest replica
- Replica replacement: the least popular file with the
oldest age is replaced. Popularity logs are
cleared periodically
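The Fast Spread replacement rule in miniature (file metadata layout is an assumption):

```python
def evict_victim(files):
    """files: name -> (popularity, age). Evict the least popular file,
    breaking ties by oldest age."""
    return min(files, key=lambda f: (files[f][0], -files[f][1]))
```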
16 Findings
- Best-client performs worst for random access
patterns and improves for access
patterns with a degree of geographical locality
- Fast Spread works much better than cascading for
random data access
- Bandwidth savings are greater with Fast Spread than with
cascading
- Fast Spread has high storage requirements
17 Computation and Data
18 GriPhyN
- Focuses on virtual data grid technologies
- Allows exploitation of computation procedures and
results as community resources
- A request for data can either retrieve the data or
execute the computation procedures that produce
it
19 Challenges
- Representing transformations in a virtual data
catalog
- Tracing derived data
- Mapping computations onto effective flow graphs
- Rebuilding dependent objects when code or data
changes
- Automated generation and scheduling of the
computations required to instantiate data products
20 Chimera
- A virtual data system that supports capture and
reuse of data generated by computations
- Consists of a virtual data catalog (VDC) and a virtual data
language (VDL) interpreter
- The VDC tracks how data is derived
- Transformation: an abstract definition of how a
program is to be invoked, what parameters and
input files it needs, etc.
- Derivation: an invocation of a transformation with a
specific set of inputs and files
- Execution of all transformations is recorded in the
Chimera database
- VDL query functions allow searching the VDC for
derivations or transformations, queried by
application, transformation, input, or output name
21 Chimera architecture
22 Transformation and Derivation Example
23 Chimera-Pegasus Architecture
24 Workflows
25 Decoupling computation and data movement
26 Architecture
- External Scheduler (ES)
- Decides which remote site to send the job to
- Local Scheduler (LS)
- Follows its own policies
- Data Scheduler (DS)
- Replicates popular data sets to remote sites
following some algorithm
27 Algorithms
- 4 different ES algorithms
- JobRandom
- JobLeastLoaded
- JobDataPresent
- JobLocal
- 3 different DS algorithms
- DataDoNothing
- DataRandom
- DataLeastLoaded
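To give the flavor of these, a sketch of JobDataPresent: send the job where its input already sits. The tie-break by queue length and the fallback argument are assumptions for illustration, not details from the paper.

```python
def job_data_present(job_input, sites, fallback):
    """sites: name -> (queue_length, set_of_local_files). Prefer a site
    that already holds the job's input file (least loaded among them);
    otherwise fall back to `fallback`."""
    holders = [s for s, (_, files) in sites.items() if job_input in files]
    if holders:
        return min(holders, key=lambda s: sites[s][0])
    return fallback
```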
28 Simulation
- A discrete event simulator was used
- Resource capacities were modeled
- Dataset sizes: uniform distribution between 500
MB and 2 GB
- Initially, only one replica per data set
- Users mapped evenly across sites
- Each job requires a single input file and
runs for 300 * D seconds, where D is the input size
in GB
- Network contention modeled based on the number of
simultaneous data transfers
- Input file requests generated randomly according
to a geometric distribution based on the popularity of
files
29 Popularity distribution
30 Results
31 Sources / References / Credits
- Algorithms for High Performance, Wide-Area
Distributed File Downloads. J.S. Plank, S.
Atchley, Y. Ding and M. Beck. Parallel Processing
Letters, vol. 13, no. 2, pp. 207-224, June 2003.
- Downloading Replicated Wide-Area Files: a
Framework and Empirical Evaluation. R.L. Collins
and J.S. Plank. NCA 2004.
- Identifying Dynamic Replication Strategies for a
High-Performance Data Grid. K. Ranganathan and I.
Foster. Grid 2002.
32 Sources / References / Credits
- Grid-Based Galaxy Morphology Analysis for the
National Virtual Observatory. Ewa Deelman,
Raymond Plante, Carl Kesselman, Gurmeet Singh,
Mei-Hui Su, Gretchen Greene, Robert Hanisch,
Niall Gaffney, Antonio Volpicelli, James Annis,
Vijay Sekhri, Tamas Budavari, Maria
Nieto-Santisteban, William O'Mullane, David
Bohlender, Tom McGlynn, Arnold Rots, Olga
Pevunova. Supercomputing 2003.
- Applying Chimera Virtual Data Concepts to Cluster
Finding in the Sloan Sky Survey. James Annis,
Yong Zhao, Jens Voeckler, Michael Wilde, Steve
Kent, and Ian Foster. SC 2002.
33 Sources / References / Credits
- Kavitha Ranganathan and Ian Foster, Decoupling
Computation and Data Scheduling in Distributed
Data Intensive Applications, Proceedings of the
11th International Symposium for High Performance
Distributed Computing (HPDC-11), Edinburgh, July
2002.
34 Replica Creation and Elimination Policy
- More replicas improve load balance but put
pressure on storage capacities
- Replica creation
- On demand
- By replica managers in the background
- Replica management decisions for on-demand creation
- Replica decision: should a remote file be
replicated at the local site in response to a
file request?
- Replica selection, and
- Replica replacement
35 Replica Optimization Strategies
- LRU
- Replication decision: always replicate
- Replica selection: based on closest location
- Replica replacement: LRU
- Binomial
- Replica decision: based on a file value calculated
from a binomial prediction of file popularity
- Replica selection: using auction and bidding
- Replica replacement: replace the local replica with the
lowest file value
- Zipf
- Same as binomial, but a Zipf distribution is used
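The LRU strategy is simple enough to sketch end to end (`fetch_remote` is a hypothetical callback for pulling a remote replica):

```python
from collections import OrderedDict

class LRUReplicaCache:
    """Always replicate on access; evict the least-recently-used
    local replica when storage is full."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.store = OrderedDict()  # file -> data, least recent first

    def access(self, name, fetch_remote):
        if name in self.store:
            self.store.move_to_end(name)        # refresh recency
        else:
            if len(self.store) >= self.capacity:
                self.store.popitem(last=False)  # evict LRU replica
            self.store[name] = fetch_remote(name)  # always replicate
        return self.store[name]
```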
36 Scheduling optimizations
- Random: assign the job to a random host
- Shortest queue: assign the job to the host whose queue
is smallest
- Access cost: assign the job to the host whose access
cost for the files required by the job is smallest
- Queue access cost: assign the job to the host where the
sum of access costs for this job and all the jobs
in the queue is smallest
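The queue-access-cost rule, sketched with a deliberately crude cost model (a uniform per-file transfer cost for non-local files is an assumption; a real scheduler would use measured bandwidths):

```python
def access_cost(files, local, transfer_cost):
    """Cost for a host holding `local` replicas to access `files`:
    zero for local files, `transfer_cost` per remote file."""
    return sum(0 if f in local else transfer_cost for f in files)

def schedule(job_files, hosts, transfer_cost=1.0):
    """hosts: name -> (local_replicas, queued_jobs_file_lists).
    Choose the host minimizing this job's access cost plus the access
    cost of every job already queued there."""
    def total(h):
        local, queue = hosts[h]
        return sum(access_cost(j, local, transfer_cost)
                   for j in queue + [job_files])
    return min(hosts, key=total)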
37 Results: Impact of Network Performance