Title: An Adaptable Benchmark for MPFS Performance Testing
1. An Adaptable Benchmark for MPFS Performance Testing
- A Master's Thesis Presentation
- Yubing Wang
- Advisor: Prof. Mark Claypool
2. Outline of the Presentation
- Background
- MPFS Benchmarking Approaches
- Benchmarking Testbed
- Performance Data
- Conclusion
- Future Work
3. SAN File System
- Storage Area Networks (SAN)
  - Architecture: NAS, Fibre Channel switch, HBA (Host Bus Adapter)
- SAN File Systems
  - An architecture for distributed file systems based on shared storage.
  - Fully exploits the special characteristics of Fibre Channel-based LANs.
  - Key feature: clients transfer data directly from the device across the SAN.
  - Advantages include availability, load balancing, and scalability.
4. MPFS
- Drawbacks of conventional network file sharing
  - The server is the bottleneck.
  - The store-and-forward model results in higher response time.
- MPFS Architecture
  - The server is involved only in control-data (metadata) operations, while file data operations are performed directly between clients and disks.
  - MPFS uses standard protocols such as NFS and CIFS for control and metadata operations.
  - Potential advantages include better scalability and higher availability.
5. File System Benchmarks
- SPEC SFS
  - Measures only server performance
  - Generates only RPC load
  - NFS protocol only
  - Unix only
- NetBench
  - Windows only
  - CIFS protocol only
6. Ideal MPFS Benchmark
- Help in understanding MPFS performance.
- Be relevant to a wide range of applications.
- Be scalable and target both large and small files.
- Provide workloads across various platforms.
- Allow for fair comparisons across products.
7. Motivations
- Current file system benchmarks are not suitable for MPFS performance measurement:
  - They measure only the server's performance.
  - They target only specific file access protocols.
- MPFS is a new file access protocol and demands a new file system benchmark:
  - The split-data/metadata architecture will prevail in the SAN industry.
  - Performance is critical to SAN file systems.
8. Performance Metrics
- Throughput
  - I/O rate, measured in operations/second
  - Data rate, measured in bytes/second
- Response Time
  - Overall average response time for all mixed operations
  - Average response time for each individual operation
  - Measured in msec/op
- Scalability
  - Number of client hosts the system supports with acceptable performance
- Sharing
  - System throughput and response time when multiple clients access the same data
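The throughput and response-time metrics above reduce to simple arithmetic over timed operation samples. A minimal sketch of that reduction; the sample tuple layout and the function name are our own, not part of the benchmark:

```python
import statistics
from collections import defaultdict

def summarize(samples, duration_s):
    """Reduce benchmark samples to the metrics on this slide.

    samples: iterable of (op_name, latency_ms, bytes_moved) tuples.
    duration_s: wall-clock length of the measurement interval.
    """
    per_op = defaultdict(list)
    total_bytes = 0
    for op, latency_ms, nbytes in samples:
        per_op[op].append(latency_ms)
        total_bytes += nbytes
    all_latencies = [l for ls in per_op.values() for l in ls]
    return {
        "io_rate_ops_per_s": len(all_latencies) / duration_s,  # I/O rate
        "data_rate_bytes_per_s": total_bytes / duration_s,     # data rate
        "avg_response_ms": statistics.mean(all_latencies),     # overall msec/op
        "per_op_avg_ms": {op: statistics.mean(ls)              # per operation
                          for op, ls in per_op.items()},
    }
```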
9. MPFS Benchmark Overview
- Application groups target the critical performance characteristics of MPFS.
- Application mix percentages are derived from the low-level NFS or CIFS operation mix percentages.
- The file set is scalable and targets both big files and small files.
- The file access pattern is based on an earlier file access trace study.
- The load generated by the processes in each load generator is Poisson distributed.
- The embedded micro-benchmarks measure how MPFS performs under intensive I/O traffic.
- The huge file set and random file selection avoid caching effects.
10. Application Groups
- An application group is a mix of system calls that mimics MPFS applications.
- The applications are selected by profiling real-world MPFS applications.
- The applications include both I/O operations and metadata operations.
- The operation groups for Windows NT follow the file I/O calls used in the Disk Mix test of NetBench.
11. Application Mix Tuning
- The application mix percentage is derived from the low-level NFS or CIFS operation mix percentage.
- The default NFS operation mix percentage we use is the NFS version 3 mix published with SPEC SFS 2.0.
- The default CIFS operation mix percentage is the CIFS operation mix used in NetBench.
- We allow users to specify the mix percentage for their specific applications.
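Deriving each next operation from a mix percentage amounts to a weighted random choice. A sketch under that assumption; the percentages below are illustrative placeholders, not the published SPEC SFS 2.0 or NetBench numbers:

```python
import random

# Illustrative placeholder mix (NOT the actual SPEC SFS 2.0 mix).
EXAMPLE_MIX = {
    "lookup": 27, "read": 18, "getattr": 11, "write": 9,
    "readlink": 7, "readdir": 2, "create": 1, "other": 25,
}

def pick_operation(mix, rng=random):
    """Select the next operation in proportion to its mix percentage."""
    ops, weights = zip(*mix.items())
    return rng.choices(ops, weights=weights, k=1)[0]
```

A user-supplied mix is just another dictionary passed to `pick_operation`, which is what keeps per-application tuning a data change rather than a code change.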
12. File Set Construction
- We build three types of file set, small, medium, and large, with different file size distributions.
- The small file set comprises 88% small files (< 16 KB).
- The large file set comprises 18% large files (> 128 MB).
- We build a huge file set to avoid caching effects.
- The number of files and the amount of data in our file set are scaled to the target load levels.
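Construction of a file set with a given size distribution can be sketched as drawing sizes from weighted buckets. Apart from the "88% under 16 KB" figure above, the bucket boundaries here are assumptions:

```python
import random

# 88% of files under 16 KB, as in the small file set above; the
# 1 KB floor and 1 MB ceiling of the buckets are assumptions.
SMALL_SET = [(0.88, (1 << 10, 16 << 10)),
             (0.12, (16 << 10, 1 << 20))]

def draw_file_size(buckets, rng=random):
    """Draw one file size from (probability, (lo, hi)) buckets."""
    r = rng.random()
    cum = 0.0
    for prob, (lo, hi) in buckets:
        cum += prob
        if r < cum:
            return rng.randrange(lo, hi)
    lo, hi = buckets[-1][1]  # guard against rounding in the probabilities
    return rng.randrange(lo, hi)
```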
13. File Access Pattern
- Based on an empirical file system workload study.
- File access order: sequential access or random access.
- File access locality: the same files tend to get the same type of access repeatedly.
- File access bursts: certain file access patterns occur in bursts.
- Overwrite/append ratio: informs pre-fetching and space allocation.
14. Workload Management
- Think time follows the exponential distribution.
- Operation selection is based on the specified mix percentage, the operation context, and the file access patterns.
- Operation context is determined by profiling the MPFS applications.
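Exponentially distributed think time is what makes each load generator's request stream a Poisson process at the target rate. A one-line sketch (the function name is ours):

```python
import random

def think_time_s(target_ops_per_s, rng=random):
    """Draw an exponential inter-operation think time with mean
    1 / target_ops_per_s, so operation arrivals form a Poisson process."""
    return rng.expovariate(target_ops_per_s)
```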
15. File Sharing
- Mainly measures how the locking mechanism affects performance.
- Includes both read and write sharing.
- Multiple processes in a single client access the same file simultaneously.
- Multiple clients access the same file simultaneously.
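The write-sharing case above can be illustrated with POSIX byte-range locks; this is a Unix-only sketch of one plausible client-side locking pattern and says nothing about MPFS's actual lock protocol:

```python
import fcntl
import os

def locked_write(path, offset, data):
    """Exclusively lock the target byte range, write, then unlock,
    so concurrent writers to the same region serialize."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX, len(data), offset)
        os.pwrite(fd, data, offset)
        fcntl.lockf(fd, fcntl.LOCK_UN, len(data), offset)
    finally:
        os.close(fd)
```

Timing the same workload with and without the `lockf` calls is the simplest way to isolate what the locking mechanism itself costs.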
16. Embedded Micro-benchmarks
- Measure the I/O performance of MPFS.
- Include sequential read, sequential write, random read, random write, and random read/write.
- Report the throughput, measured in megabytes/second, for each I/O test.
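A sequential-read micro-benchmark of this kind is just a timed block-read loop. A minimal sketch; the block size and function name are assumptions:

```python
import time

def sequential_read_mb_s(path, block=64 << 10):
    """Read the whole file sequentially in fixed-size blocks and
    report throughput in megabytes/second."""
    nbytes = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while chunk := f.read(block):
            nbytes += len(chunk)
    elapsed = time.perf_counter() - start
    return nbytes / (1 << 20) / elapsed
```

The random-read variant would seek to a random block-aligned offset before each read; the write tests are symmetric.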
17. Caching
- A larger client cache or more effective client caching may greatly affect the performance measurement, since our benchmark runs at the application level.
- The huge file set and random file selection help avoid the caching effect.
18. Testbed Configuration
19. System Monitors
- Network Monitor
  - monitors the network state
  - collects network traffic statistics
- I/O Monitor
  - monitors disk I/O activity
  - collects I/O statistics
- CPU Monitor
  - monitors CPU usage
- Protocol Statistics Monitor
  - collects MPFS/NFS/CIFS statistics
20. Web Interface
21. Throughput and Response Time
Generated load vs. response time for the MPFS benchmarking testbed with 8 Solaris clients.
22. Scalability
Measured maximum aggregate throughput versus number of Solaris clients.
23. Change of the Mix Percentage
Generated load vs. response time for different operation group mixes.
24. Comparison between NFS and MPFS
Generated load vs. response time for three different MPFS and NFS Solaris client combinations.
25. Conclusion (1)
- Our benchmark achieves four major goals.
- It helps in understanding MPFS performance:
  - Measures throughput and response time.
  - Measures scalability.
  - Measures the performance of each individual operation.
  - Compares the performance of MPFS with that of NFS or CIFS.
- It generates a realistic workload:
  - Operations are selected by profiling real-life MPFS applications.
  - File access patterns are derived from an empirical file system workload study.
  - File set construction mimics the real-world environment.
26. Conclusion (2)
- The file set is scalable and targets both large and small files:
  - The number of files and the amount of data in our file set are scaled to the target load levels.
  - The file sets have different file size distributions.
- It provides workloads across various platforms:
  - Our benchmark supports both Unix and Windows NT systems.
27. Future Work
- Create a more realistic workload:
  - Build up a large archive of MPFS traces.
  - Develop a profiling model to characterize the traces.
- Improve the scalability measurement:
  - Our benchmark uses the number of clients (load generators) to represent scalability.
  - The mapping between a load generator and a client in a real-world application is subject to further investigation.
- Develop a more general workload model for SAN file systems:
  - Different SAN file systems may have different implementations.
  - A general benchmark should be independent of the implementation.