Title: Optimizing of data access using replication technique
1Optimizing of data access using replication
technique
- Renata Slota1, Darin Nikolow1, Lukasz Skital2, Jacek Kitowski1,2
- 1 Institute of Computer Science AGH-UST, Cracow
- 2 ACC CYFRONET AGH, Cracow
2Agenda
- Motivation of the work
- Why does grid computing need replication today?
- Replication basics
- Clusterix Data Management System
- Architecture, optimization and replication algorithms
- Optimization Example
- Replication Example
- Summary, conclusions
3Site-level vs. Grid-level replication
- Site-level replication
- Replicas in one site
- Implementation examples
- RAID
- HSM
- Grid-level replication
- Data management systems
- Replicas spread on many sites
4Motivation of the work: Why does grid computing need replication today?
- Data protection and availability
- Malfunction of one storage element does not affect the data itself; only performance is affected
- Performance
- Low-level optimization and replication (RAID, HSM) are not sufficient
- Limited network bandwidth
- Limited storage performance
5Replication scenarios
- Static replication
- Decision made by system administrator or user
- Limited system support: replica selection, replica coherency, replica ordering
- Dynamic replication
- Decision made by a dedicated grid component based on the current data access pattern of users
- Full system support
6Replication consequences
- Optimal replica selection algorithm
- Replica creation and removal algorithm
- Cost of replica creation, update and storage
- Replica coherency
7Clusterix: National Cluster of Linux Systems
- Project aim
- To develop a set of tools and procedures for building a productive Grid environment based on local PC clusters spread across independent supercomputing centers
- Network Layer
- Pionier Polish optical networks
8Clusterix Data Management System: Architecture
9Optimization Algorithm
- Selects the optimal storage element for
- data access
- replica creation
- Takes into consideration the current state of the system
- The optimal storage element is the one with the maximal weight W(s,d)
- $W(s,d) = \min\big((1-\mathrm{NetLoad}(s))\cdot\mathrm{Bandwidth}(s,d),\ (1-\mathrm{Sload}(s))\cdot\mathrm{Sbandwidth}(s)\big)$
- s: storage element
- d: destination node
- NetLoad(s): network interface load of s
- Bandwidth(s,d): available bandwidth between s and d
- Sload(s): storage system load
- Sbandwidth(s): storage system bandwidth
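The selection rule above can be sketched in Python. This is a minimal illustration, assuming W(s,d) is the smaller of the free network throughput, (1 - NetLoad(s)) times Bandwidth(s,d), and the free storage throughput, (1 - Sload(s)) times Sbandwidth(s). The class name, function names and sample numbers are hypothetical and are not taken from CDMS.

```python
from dataclasses import dataclass


@dataclass
class StorageElement:
    """Hypothetical snapshot of the monitored state of one storage element."""
    name: str
    net_load: float           # NetLoad(s), a fraction between 0.0 and 1.0
    storage_load: float       # Sload(s), a fraction between 0.0 and 1.0
    storage_bandwidth: float  # Sbandwidth(s), e.g. in MB/s


def weight(se: StorageElement, bandwidth_to_dest: float) -> float:
    """W(s,d): the smaller of the free network and free storage throughput."""
    network_part = (1.0 - se.net_load) * bandwidth_to_dest
    storage_part = (1.0 - se.storage_load) * se.storage_bandwidth
    return min(network_part, storage_part)


def select_storage_element(candidates, bandwidth_to_dest):
    """Return the candidate with the maximal weight W(s,d).

    `bandwidth_to_dest` maps a storage element name to Bandwidth(s,d)
    for one fixed destination node d.
    """
    return max(candidates, key=lambda se: weight(se, bandwidth_to_dest[se.name]))


if __name__ == "__main__":
    # Illustrative values only; they are not measurements from Clusterix.
    elements = [
        StorageElement("SE1", net_load=0.5, storage_load=0.25, storage_bandwidth=80.0),
        StorageElement("SE2", net_load=0.25, storage_load=0.5, storage_bandwidth=100.0),
        StorageElement("SE3", net_load=0.75, storage_load=0.0, storage_bandwidth=60.0),
    ]
    bandwidth = {"SE1": 100.0, "SE2": 40.0, "SE3": 100.0}
    best = select_storage_element(elements, bandwidth)
    print(best.name, weight(best, bandwidth[best.name]))  # prints: SE1 50.0
```

Taking the minimum of the two terms means the slower of the network path and the storage system decides the weight, so a fast disk behind a congested link is not preferred.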
10Automatic replication algorithm
- Takes into consideration the gain from replication G(), the cost of replica creation C(), the cost of replica updates U() and an administrative factor A().
- Replication profit
- $P(d,R,S,f) = G(d,R,S,f) - C(d,R,f) - U(d,R,S,f) + A(d,f)$
- d: storage element for which the profit is computed
- R: set of storage elements containing replicas of f
- S: statistical data (history of file usage)
- f: the considered file
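A minimal Python sketch of how the profit could drive a replication decision. It rests on two assumptions that the slide does not confirm: the terms combine as gain minus creation cost minus update cost plus administrative factor, and a replica is created only when the best profit is positive. The function names and the `estimate_terms` callback are hypothetical.

```python
def replication_profit(gain, creation_cost, update_cost, admin_factor):
    """P = G - C - U + A (sign convention assumed for this sketch)."""
    return gain - creation_cost - update_cost + admin_factor


def choose_replica_target(candidates, replica_holders, estimate_terms):
    """Return the storage element d with the highest positive profit, or None.

    candidates      -- names of all storage elements
    replica_holders -- the set R of elements that already hold a replica of f
    estimate_terms  -- hypothetical callback mapping d to a tuple (G, C, U, A)
    """
    best_name, best_profit = None, 0.0
    for d in candidates:
        if d in replica_holders:
            continue  # f is already stored on d
        profit = replication_profit(*estimate_terms(d))
        if profit > best_profit:
            best_name, best_profit = d, profit
    return best_name


if __name__ == "__main__":
    # Illustrative (G, C, U, A) values only; not Clusterix measurements.
    terms = {"SE3": (10.0, 4.0, 2.0, 0.0), "SE4": (6.0, 4.0, 3.0, 0.0)}
    target = choose_replica_target(
        ["SE1", "SE2", "SE3", "SE4"], {"SE1", "SE2"}, lambda d: terms[d]
    )
    print(target)  # prints: SE3
```

In this sketch the usage statistics S and the replica set R are assumed to be folded into the estimates returned by the callback, which keeps the decision loop independent of how G, C, U and A are measured.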
11Storage oriented problems: Data intensive applications for Clusterix
- Simulation of transonic flow past wing tips
- Visualization of complex multidimensional structures
- Ecosystem modeling and simulation
12Optimization Example
- Node A needs file F stored on SE1, SE2 and SE3
[Diagram: Node A, CDMS with Optimizer, and storage elements SE1, SE2 and SE3, each holding a copy of F; NMS and JIMS monitoring components attached]
13Optimization Example
- Node A sends a request to CDMS
14Optimization Example
- CDMS uses the Optimizer to choose the optimal SE
15Optimization Example
$W(s_3,d) = \min\big((1-\mathrm{NetLoad}(s_3))\cdot\mathrm{Bandwidth}(s_3,d),\ (1-\mathrm{Sload}(s_3))\cdot\mathrm{Sbandwidth}(s_3)\big)$
$W(s_2,d) = \min\big((1-\mathrm{NetLoad}(s_2))\cdot\mathrm{Bandwidth}(s_2,d),\ (1-\mathrm{Sload}(s_2))\cdot\mathrm{Sbandwidth}(s_2)\big)$
$W(s_1,d) = \min\big((1-\mathrm{NetLoad}(s_1))\cdot\mathrm{Bandwidth}(s_1,d),\ (1-\mathrm{Sload}(s_1))\cdot\mathrm{Sbandwidth}(s_1)\big)$
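To make the comparison concrete, here is a worked instance of the three weights. All loads and bandwidths (in MB/s) are hypothetical values chosen for illustration, and the weight is assumed to be the minimum of the free network throughput and the free storage throughput.

```latex
\begin{align*}
W(s_1,d) &= \min\big((1-0.2)\cdot 50,\ (1-0.5)\cdot 60\big) = \min(40,\ 30) = 30\\
W(s_2,d) &= \min\big((1-0.6)\cdot 100,\ (1-0.1)\cdot 50\big) = \min(40,\ 45) = 40\\
W(s_3,d) &= \min\big((1-0.1)\cdot 20,\ (1-0.0)\cdot 80\big) = \min(18,\ 80) = 18
\end{align*}
```

With these numbers SE2 has the maximal weight and would be selected. SE1 is limited by its storage system, while SE2 and SE3 are limited by the network.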
16Initial replication example
[Diagram: User Workstation, Clusterix Entry point, CDMS with Optimizer, and storage elements SE1, SE2 and SE3; NMS and JIMS monitoring components attached]
17Dynamic replication in Clusterix
- Initial replication
- Every stored data file should be replicated
- Replication on demand
- Job driven replication
- Replication ordered by external process
- Replication based on statistical analysis
- Data access pattern driven replication
18Automatic replication example: Situation
- 3 clusters
- 4 storage elements
- 2 contain a replica of file F
- A set of applications running on these clusters and accessing file F
[Diagram: storage elements SE1, SE2, SE3 and SE4 with the replicas of F]
19Automatic replication example
[Diagram: CDMS with Optimizer, Statistic Module and Replication Module (state labels: Sleeping, Working) and storage elements SE1, SE2, SE3 and SE4; labels for gain, cost of replication, cost of update and administrative factor]
20Automatic replication example
[Diagram: CDMS with Optimizer, Statistic Module and Replication Module (state labels: Working, Sleeping); storage elements SE1, SE2, SE3 and SE4 with copies of F]
21Automatic replication example
[Diagram: CDMS with Optimizer, Statistic Module and Replication Module (state label: Sleeping); storage elements SE1, SE2, SE3 and SE4 with copies of F]
22Summary
- The architecture of CDMS with Optimization and Replication modules has been designed
- Replication and optimization algorithms have been specified
- Module interfaces have been specified
- Future work
- Integration and tests
23Conclusions
- Simulation of replication vs. real system implementation
- Replication should be designed to meet the specific profile of Clusterix applications
- Data availability
- Replication drawbacks
24Publications
- Extended functionality of Virtual Storage System for grid
- Renata Slota, Darin Nikolow, Lukasz Skital, Jacek Kitowski
- Cracow Grid Workshop 2004, poster no. 13
- Application of data replication methods in Clusterix project (in Polish)
- Renata Slota, Darin Nikolow, Lukasz Skital, Jacek Kitowski
- Pionier 2004, 19-20 May, Poznan, electronic publication
- Implementation of replication methods in the Grid Environment
- Renata Slota, Darin Nikolow, Lukasz Skital, Jacek Kitowski
- Submitted to European Grid Conference
25Thank You!
26Clusterix Data Management System: Architecture
- Replication module
- Responsible for
- Automatic replica creation/removal
- Implementation
- Java
- Apache SOAP
- Cooperates with
- Optimization module
- Statistic module
27Clusterix Data Management System: Architecture
- Optimization Module
- Responsible for
- storage element selection for a newly created replica,
- optimal replica selection.
- Implementation
- C/C++
- gSOAP
- Cooperates with
- Network Monitoring System (NMS)
- Information System
- JMX-based Infrastructure Monitoring System (JIMS)
28Clusterix Data Management System: Architecture
- Information System (JIMS)
- Department of Computer Science, AGH University of Science and Technology
- Provides the following information for a selected node
- Available storage capacity
- Total storage capacity
- Network interface load
- Network interface bandwidth
- Storage system load
- Average storage system load
- Maximal measured storage bandwidth
29Clusterix Data Management System: Architecture
- Network Monitoring System
- Poznan Supercomputing and Networking Center
- Provides the following information
- Maximum bandwidth between two network nodes
- Current load between two network nodes
- Nodes availability
30Clusterix Data Management System: Architecture
- Statistic Module
- Bialystok Technical University
- Responsible for gathering information about past data usage