MINERVA: an automated resource provisioning tool for large-scale storage systems


1
MINERVA: an automated resource provisioning tool for large-scale storage systems
  • G. Alvarez, E. Borowsky, S. Go, T. Romer, R.
    Becker-Szendy, R. Golding, A. Merchant, M.
    Spasojevic, A. Veitch, J. Wilkes

2
Large-Scale Storage Systems
  • Very difficult to design and configure
  • 10s-100s of host computers
  • 10s-100s of storage devices
  • 10s-1000s of disks/logical volumes
  • Terabytes of capacity
  • Must meet throughput demands
  • Must maximize capacity utilization
  • Automation would be nice

3
MINERVA
  • Subdivides the problem into three stages:
    • Choose the correct device set
    • Choose the correct configuration parameters
    • Map user data onto the devices
  • The overall problem is NP-hard
  • Architectural elements:
    • Declarative descriptions of storage workload requirements
    • Constraint-based problem representation
    • Optimization strategies and heuristics
    • Analytic performance models

4
MINERVA Inputs
  • Workload description
    • Data type descriptions and access patterns
    • Two object types:
      • Stores: logically contiguous data (e.g., a database table or a filesystem)
      • Streams: sequences of accesses on a store (access pattern and throughput)
  • Device descriptions
    • Disk information (number, size, and type)
    • Array information (number of LUNs)

5
MINERVA Objects
6
MINERVA Outputs
  • Assignment
    • Device set taken from the device descriptions
    • Mapping of stores to devices
    • 2^n · n^m possible configurations
    • O((2m)^m) complexity
  • Goal: the minimum-cost system that meets the performance requirements
  • Effector tool
    • Takes the assignment as input
    • Automatically configures the physical devices
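
To get a feel for those bounds (a hedged illustration; the figures below are ours, not the paper's): with n = 10 candidate LUNs and m = 30 stores, the store-to-LUN mappings alone number n^m = 10^30, so even evaluating 10^9 candidates per second would take roughly 10^21 seconds, vastly longer than the age of the universe. Heuristic search is therefore unavoidable.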

7
Storage System Lifecycle
8
Architecture
  • Array allocation
    • Tagger: assigns a preferred RAID level
    • Allocator: determines the number of arrays
  • Array configuration
    • Array Designer: actually configures the arrays
  • Store assignment
    • Solver: assigns stores to LUNs
    • Optimizer: prunes unused resources and balances load
    • Evaluator: verifies the design with analytic models

9
Architecture
10
MINERVA Process
11
Analytical Device Models
  • Used to determine the feasibility of a design
  • Predicted throughput error rate within 20%
  • Streams
    • Modeled as an ON-OFF Markov-modulated Poisson process
  • Arrays
    • Modeled as array controller, bus connection, and disks
  • Case study: HP SureStore Model 30/FC (FC-30) high-availability disk array
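
As a rough illustration of the stream model, here is a minimal Python sketch of an ON-OFF Markov-modulated Poisson process; the rates and sojourn times are invented for illustration and are not parameters from the paper.

    import random

    def simulate_on_off_mmpp(rate_on, mean_on, mean_off, duration):
        """Generate request arrival times from a two-state (ON-OFF)
        Markov-modulated Poisson process: Poisson arrivals at rate_on
        while ON, silence while OFF, exponential sojourn times."""
        t, on, arrivals = 0.0, True, []
        while t < duration:
            sojourn = random.expovariate(1.0 / (mean_on if on else mean_off))
            if on:
                nxt = t + random.expovariate(rate_on)
                while nxt < t + sojourn:          # arrivals within the ON burst
                    arrivals.append(nxt)
                    nxt += random.expovariate(rate_on)
            t += sojourn
            on = not on
        return arrivals

    # Example: a bursty stream, ~50 req/s during 1 s ON bursts
    print(len(simulate_on_off_mmpp(rate_on=50, mean_on=1.0, mean_off=3.0, duration=60)))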

12
Tagger
  • Chooses a storage class for each store based on its access pattern
    • RAID 1/0 or RAID 5
  • Rule based
  • Determines which stores are capacity bound
  • Estimates the average number of I/O operations per second (IOPS)

13
Capacity Rules
  • Rules are calculated per GB of storage
  • Capacity-bound stores are tagged RAID 5

14
IOPS Estimation
  • Choose the RAID level that requires the fewest per-disk IOPS
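
A minimal sketch of how slides 12-14 fit together, assuming the textbook back-end costs (a mirrored write costs 2 physical I/Os under RAID 1/0, a small RAID 5 write costs 4 for the read-modify-write); the threshold, disk count, and function names are illustrative, not MINERVA's actual rules.

    def per_disk_iops(read_rate, write_rate, n_disks, raid_level):
        """Estimate per-disk IOPS for a store under a given RAID level,
        using classic small-write penalties (assumption, see above)."""
        if raid_level == "RAID1/0":
            physical = read_rate + 2 * write_rate   # mirror: writes doubled
        elif raid_level == "RAID5":
            physical = read_rate + 4 * write_rate   # read-modify-write
        else:
            raise ValueError(raid_level)
        return physical / n_disks

    def tag_store(read_rate, write_rate, capacity_gb,
                  iops_per_gb_threshold=10.0, n_disks=6):
        """Tag capacity-bound stores RAID 5; otherwise pick the RAID
        level with the fewest per-disk IOPS (illustrative threshold)."""
        if (read_rate + write_rate) / capacity_gb < iops_per_gb_threshold:
            return "RAID5"                           # capacity bound
        return min(("RAID1/0", "RAID5"),
                   key=lambda r: per_disk_iops(read_rate, write_rate, n_disks, r))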

15
Allocator
  • Chooses a reasonable set of arrays
  • Three steps:
    • Consider the type and number of arrays
    • Consider array configurations
    • Consider LUN divisions and RAID configurations

16
Allocator models
  • Can only use the analytic device models
    • Ignores stream phasing
  • The Rillifier handles outsized resource demands
    • Distributes a workload among multiple LUNs
    • Stores with excessive capacity requirements become shards
    • Streams with excessive throughput requirements become rills

17
Allocator Search
  • Uses a branch-and-bound strategy
  • Determines the number of arrays of each type
    • Chooses the lowest-cost set that supports the workload
  • Searches array configurations
    • Starts with mixed arrays
    • Iteratively converts arrays to dedicated types
  • Branch and bound is biased toward dedicated types
    • Searches in reverse order, starting with dedicated types
  • Calls the Array Designer with each candidate configuration
    • If the Array Designer fails, the search continues
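
A minimal branch-and-bound sketch in the spirit of this search; the `fits` callback stands in for the Array Designer's feasibility check, and all prices and bounds are invented for illustration.

    def cheapest_allocation(array_prices, max_per_type, fits):
        """Branch on the number of arrays of each type; bound by cost.
        fits(counts) -> True if the designer can configure `counts`
        arrays per type to support the workload (assumed callback)."""
        best = {"cost": float("inf"), "counts": None}

        def search(i, counts, cost):
            if cost >= best["cost"]:       # bound: cannot beat current best
                return
            if i == len(array_prices):
                if fits(counts):           # leaf: ask the (stand-in) designer
                    best["cost"], best["counts"] = cost, list(counts)
                return
            for n in range(max_per_type + 1):
                counts.append(n)
                search(i + 1, counts, cost + n * array_prices[i])
                counts.pop()

        search(0, [], 0.0)
        return best["counts"], best["cost"]

    # Example: two array types; the lambda is a toy feasibility test
    print(cheapest_allocation([40.0, 15.0], 3, lambda c: 3 * c[0] + c[1] >= 5))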

18
Array Designer
  • Determines LUN sizes and array parameters
  • Starts with the simple case of equal-size LUNs
  • Also considers a greedy configuration
    • The workload description determines the LUN sizes
  • Relies on the Optimizer to take care of unused capacity
  • Disks are assigned to LUNs round-robin across buses
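
A sketch of the round-robin placement across buses; the data shapes are assumptions.

    from itertools import cycle

    def assign_disks_round_robin(disks, buses):
        """Spread disks over buses in round-robin order so no single
        bus becomes a bandwidth hot spot."""
        placement = {bus: [] for bus in buses}
        for disk, bus in zip(disks, cycle(buses)):
            placement[bus].append(disk)
        return placement

    # 10 disks over 4 buses -> groups of sizes 3, 3, 2, 2
    print(assign_disks_round_robin(list(range(10)), ["bus0", "bus1", "bus2", "bus3"]))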

19
Solver
  • Assigns stores to LUNs
  • Multidimensional constrained bin packing
  • Uses the analytic device models to evaluate the objective function
  • Constraints:
    • LUN capacity
    • LUN phased utilization
    • Array bus bandwidth
    • Array controller utilization
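
A minimal first-fit sketch of the constrained packing, keeping only two of the four constraint dimensions (LUN capacity and utilization) for brevity; the data shapes are assumptions.

    def first_fit(stores, luns):
        """Assign each store to the first LUN with enough remaining
        capacity and utilization headroom; None marks a failure.
        stores: (capacity_gb, utilization) demand pairs
        luns:   dicts with remaining 'cap' (GB) and 'util' (fraction)"""
        assignment = []
        for cap, util in stores:
            for i, lun in enumerate(luns):
                if lun["cap"] >= cap and lun["util"] >= util:
                    lun["cap"] -= cap
                    lun["util"] -= util
                    assignment.append(i)
                    break
            else:
                assignment.append(None)    # no LUN could absorb this store
        return assignment

    luns = [{"cap": 100.0, "util": 1.0}, {"cap": 100.0, "util": 1.0}]
    print(first_fit([(60, 0.5), (60, 0.3), (30, 0.4)], luns))  # [0, 1, 0]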

20
Solver Heuristics
  • Simple Random
    • 50 random cases using first fit
  • Toyoda
    • Best fit using a gradient function
    • Objective function combined with economic utilization:
      gradient = 1 / (penalty * lun_cost)
    • Favors LUNs that are already in use or low cost
    • LUNs are filled in order of increasing cost
    • Minimizes resource contention

21
Solver Heuristics 2
  • ToyodaWeighted
    • Maps gradients against the remaining available resources
    • Maps stores to LUNs such that utilization is balanced
    • objective_function = cos(a), where a is the angle between the store's demand vector and the LUN's remaining-resource vector
    • objective_function = max_lun_cost - lun_cost
    • Minimizes cost
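
A sketch of the cosine gradient; combining the fit and the cost objective by multiplication is our assumption, not the paper's exact formula.

    import math

    def cos_gradient(demand, remaining):
        """cos(a), a being the angle between the store's resource-demand
        vector and the LUN's remaining-resource vector; high values mean
        the store's shape matches the LUN's free resources."""
        dot = sum(d * r for d, r in zip(demand, remaining))
        norms = (math.sqrt(sum(d * d for d in demand)) *
                 math.sqrt(sum(r * r for r in remaining)))
        return dot / norms if norms else 0.0

    def pick_lun(demand, luns, max_lun_cost):
        """Pick the LUN maximizing cosine fit weighted by the cost
        objective (max_lun_cost - cost), favoring cheap LUNs."""
        return max(range(len(luns)),
                   key=lambda i: cos_gradient(demand, luns[i]["remaining"]) *
                                 (max_lun_cost - luns[i]["cost"]))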

22
Toyoda and ToyodaWeighted
23
Optimizer
  • Reruns the Solver against the existing configuration
    • Reduces the number of required arrays
  • Runs ToyodaWeighted with a new objective function:
    • objective_value = 1 - lun_utilization
    • Assigns stores to underutilized LUNs
  • Variations
    • Simple Random: randomized first fit; keeps the result with the lowest utilization variance
    • Simple Balanced: round-robin first fit, based on capacity and utilization constraints
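
A sketch of the Simple Random variant: randomized first fit repeated over several orderings, keeping the run with the lowest per-LUN utilization variance. The trial count and data shapes are illustrative.

    import random
    from statistics import pvariance

    def randomized_first_fit(stores, n_luns, util_limit=1.0, trials=50):
        """Run first fit over random store orderings; keep the most
        balanced feasible assignment (lowest utilization variance)."""
        best_var, best = float("inf"), None
        for _ in range(trials):
            utils = [0.0] * n_luns
            feasible = True
            for u in random.sample(stores, len(stores)):
                for j in range(n_luns):            # first LUN with headroom
                    if utils[j] + u <= util_limit:
                        utils[j] += u
                        break
                else:
                    feasible = False
                    break
            if feasible and pvariance(utils) < best_var:
                best_var, best = pvariance(utils), utils[:]
        return best, best_var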

24
Clusterer
  • Addresses performance-scaling issues
    • With many stores, runtime grew to days
  • Combines multiple stores into a cluster
    • The cluster is mapped instead of the individual stores
  • Cluster limits are based on observation:
    • 10 MB/s bandwidth
    • 2 GB size
  • Increases system cost by about 3%
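
A sketch of the clustering rule, greedily merging stores until either limit above would be exceeded; the greedy order and data shapes are assumptions.

    def cluster_stores(stores, max_bw_mbs=10.0, max_size_gb=2.0):
        """Greedily pack (bandwidth_mbs, size_gb) stores into clusters
        capped at 10 MB/s and 2 GB, so the Solver maps far fewer objects."""
        clusters, cur, bw, sz = [], [], 0.0, 0.0
        for b, s in stores:
            if cur and (bw + b > max_bw_mbs or sz + s > max_size_gb):
                clusters.append(cur)               # close the full cluster
                cur, bw, sz = [], 0.0, 0.0
            cur.append((b, s))
            bw, sz = bw + b, sz + s
        if cur:
            clusters.append(cur)
        return clusters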

25
Evaluation
  • Validate the analytic models' performance predictions
  • Evaluate sensitivity to workload changes
  • Evaluate the effect of design changes
  • Measure a live system

26
Model Validation
  • Based on a single FC-30 array
  • Ran performance tests on the physical system
  • Compared the results against the model predictions
  • Results showed a mean error rate of 5.4%
    • Range of -11% to +19%

27
Safety and Sensitivity
  • Examined scaling of workload parameters
  • Start with a baseline workload, then modify a single parameter
  • The baseline was designed to have three properties:
    • A mix of appropriate RAID levels
    • A non-trivial number of arrays (≥ 2)
    • Balanced store performance requirements

28
Scaling Store Size and Bandwidth
  • Store size scaling
    • The system becomes capacity bound
    • Creates RAID 5 LUNs
    • System size scales linearly with store size
  • Bandwidth scaling
    • The ratio of RAID 1/0 to RAID 5 increases linearly

30
Scaling Number of Stores
  • The number of arrays scales linearly with the number of stores

31
Running time
  • Runtime increases quadratically with the number of stores

32
Workload Variability
  • Workload attributes drawn randomly from a log-normal distribution
    • Baseline values serve as the distribution means
  • Capacity utilization drops with increased variability
    • The number of RAID 5 LUNs increases
    • Segmentation increases

33
Workload Variance
34
Whole System Validation
  • MINERVA vs. a human expert
  • Three aspects:
    • Comparison of resulting system cost
    • Comparison of application performance
    • Low runtime and minimal human interaction
  • Based on the TPC-D benchmark
    • A decision-support benchmark built on database queries
  • Human designers drawn from an HP system benchmarking team

35
Execution Times