1
An Adaptable Benchmark for MPFS Performance
Testing
  • A Master's Thesis Presentation
  • Yubing Wang
  • Advisor: Prof. Mark Claypool

2
Outline of the Presentation
  • Background
  • MPFS Benchmarking Approaches
  • Benchmarking Testbed
  • Performance Data
  • Conclusion
  • Future Work

3
SAN File System
  • Storage Area Networks (SAN)
  • Built from components such as NAS heads, Fibre Channel switches, and HBAs (Host Bus Adapters)
  • Architecture (diagram)
  • SAN File Systems
  • An architecture for distributed file systems based on shared storage
  • Fully exploits the special characteristics of Fibre Channel-based SANs
  • Key feature: clients transfer data directly from the storage devices across the SAN
  • Advantages include availability, load balancing, and scalability

4
MPFS
  • Drawbacks of conventional network file sharing
  • The server is the bottleneck.
  • The store-and-forward model results in higher response times.
  • MPFS Architecture (sketched in the fragment after this list)
  • The server is involved only in control-data (metadata) operations, while file-data operations are performed directly between clients and disks.
  • MPFS uses standard protocols such as NFS and CIFS for control and metadata operations.
  • Potential advantages include better scalability and higher availability.
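The split read path can be pictured in code. The following is a rough, hypothetical sketch, not the actual MPFS client interface: get_extent_map() stands in for the control-path RPC to the server, and san_read_blocks() stands in for the direct Fibre Channel transfer; both are stubbed here so the fragment compiles.

    #include <string.h>

    /* Hypothetical stand-ins for the server RPC and the direct SAN
     * read; real MPFS uses its own protocol for these steps. */
    struct extent { long device_block; long length; };

    static int get_extent_map(const char *path, long off, long len,
                              struct extent *out)
    {
        (void)path; (void)off;
        out->device_block = 0;          /* fake block mapping */
        out->length = len;
        return 0;
    }

    static long san_read_blocks(long block, long len, char *buf)
    {
        (void)block;
        memset(buf, 0, (size_t)len);    /* fake data transfer */
        return len;
    }

    /* An MPFS-style read: metadata from the server (control path),
     * file data moved directly between client and SAN (data path). */
    long mpfs_read(const char *path, long offset, long len, char *buf)
    {
        struct extent map;
        if (get_extent_map(path, offset, len, &map) < 0)
            return -1;
        return san_read_blocks(map.device_block, map.length, buf);
    }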

5
File System Benchmarks
  • SPEC SFS
  • Measures only server performance
  • Generates only an RPC load
  • NFS protocol only
  • Unix only
  • NetBench
  • Windows only
  • CIFS protocol only

6
Ideal MPFS Benchmark
  • Help in understanding MPFS performance.
  • Be relevant to a wide range of applications.
  • Be scalable and target both large and small
    files.
  • Provide workloads across various platforms.
  • Allow for fair comparisons across products.

7
Motivations
  • Current file system benchmarks are not suitable for MPFS performance measurement
  • They measure only the server's performance.
  • They target only specific file access protocols.
  • MPFS is a new file access protocol and demands a new file system benchmark
  • The split data/metadata architecture will prevail in the SAN industry.
  • Performance is critical to SAN file systems.

8
Performance Metrics
  • Throughput
  • I/O rate, measured in operations/second
  • Data rate, measured in bytes/second
  • Response Time
  • Overall average response time across the mix of operations
  • Average response time for each individual operation
  • Measured in msec/op
  • Scalability
  • Number of client hosts the system can support with acceptable performance
  • Sharing
  • System throughput and response time when multiple clients access the same data (a metrics sketch follows this list)
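As a concrete illustration (not the thesis code), the metrics above can be derived from a handful of raw counters kept by each load generator; the struct layout here is an assumption.

    #include <stdio.h>

    /* Hypothetical per-load-generator counters. */
    struct metrics {
        long   ops;          /* completed operations              */
        long   bytes;        /* bytes transferred                 */
        double elapsed_sec;  /* wall-clock length of the test run */
        double total_msec;   /* sum of per-op response times      */
    };

    /* Derive the three headline metrics from the raw counters. */
    static void report(const struct metrics *m)
    {
        printf("I/O rate:      %.1f ops/sec\n",   m->ops   / m->elapsed_sec);
        printf("Data rate:     %.1f bytes/sec\n", m->bytes / m->elapsed_sec);
        printf("Response time: %.2f msec/op\n",   m->total_msec / m->ops);
    }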

9
MPFS Benchmark Overview
  • Application groups target the critical
    performance characteristics of MPFS.
  • Application mix percentages are derived from the
    low-level NFS or CIFS operation mix percentages.
  • The file set is scalable and targets both big
    files and small files.
  • The file access pattern is based on an earlier
    file access trace study.
  • The load-generating processes on each load generator issue requests whose arrivals follow a Poisson distribution.
  • The embedded micro-benchmarks measure how MPFS performs under intensive I/O traffic.
  • The huge file set and random file selection avoid caching effects.

10
Application Groups
  • The application group is a mix of system calls
    that mimic MPFS applications.
  • The applications are selected by profiling some
    real-world MPFS applications.
  • The applications include both I/O operations and
    metadata operations.
  • The operation groups for Windows NT follow the
    file I/O calls used in the Disk Mix test of
    NetBench.

11
Application Mix Tuning
  • The application mix percentage is derived from
    the low-level NFS or CIFS operation mix
    percentage.
  • The default NFS operation mix percentage we use
    is the NFS version 3 mix published by SPEC SFS
    2.0.
  • The default CIFS operation mix percentage is the
    CIFS operation mix used in NetBench.
  • We allow users to specify the mix percentages for their specific applications (a selection sketch follows this list).
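A minimal sketch of mix-driven operation selection, assuming a simple cumulative-percentage draw; the operation names and percentages below are illustrative placeholders, not the SPEC SFS 2.0 or NetBench numbers.

    #include <stdlib.h>

    struct op_mix { const char *name; int percent; };

    /* Illustrative mix; entries must sum to 100. */
    static struct op_mix mix[] = {
        { "read",    30 }, { "write",   20 },
        { "lookup",  25 }, { "getattr", 25 },
    };

    /* Draw the next operation according to the mix percentages. */
    static const char *pick_op(void)
    {
        int r = rand() % 100, acc = 0;
        for (size_t i = 0; i < sizeof mix / sizeof mix[0]; i++) {
            acc += mix[i].percent;
            if (r < acc)
                return mix[i].name;
        }
        return mix[0].name;  /* unreachable when the mix sums to 100 */
    }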

12
File Set Construction
  • We build three types of file sets with different file size distributions.
  • We have small, medium, and large file sets.
  • The small file set comprises 88% small files (< 16 KB).
  • The large file set comprises 18% large files (> 128 MB).
  • We build a huge file set to avoid caching effects.
  • The number of files and the amount of data in our file set are scaled to the target load levels (a size-selection sketch follows this list).
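A sketch of how a file size might be drawn when building the small file set. Only the 88% small-file share and the < 16 KB / > 128 MB bounds come from the slides; the medium/large split and the in-bucket size ranges are illustrative assumptions.

    #include <stdlib.h>

    /* Draw a file size (in bytes) for the small file set. */
    static long pick_file_size(void)
    {
        int r = rand() % 100;
        if (r < 88)                           /* 88%: small, < 16 KB */
            return 1024 + rand() % (15 * 1024);
        else if (r < 98)                      /* assumed 10%: medium */
            return 16 * 1024 + rand() % (1024 * 1024);
        else                                  /* assumed 2%: large,  */
            return 128L * 1024 * 1024;        /* >= 128 MB           */
    }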

13
File Access Pattern
  • Based on an empirical file system workload study.
  • File access order: sequential access or random access (sketched after this list)
  • File access locality: the same files tend to get the same type of access repeatedly
  • File access bursts: certain file access patterns occur in bursts
  • Overwrite/append ratio: affects pre-fetching and space allocation
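A sketch of the two access orders named above, assuming fixed-size block I/O; the 8 KB block size is an illustrative assumption.

    #include <stdlib.h>

    #define BLOCK 8192L  /* assumed I/O block size */

    /* Choose the next read/write offset; file_len must be >= BLOCK. */
    static long next_offset(long prev, long file_len, int sequential)
    {
        if (sequential)
            return (prev + BLOCK) % file_len;          /* step through */
        return (rand() % (file_len / BLOCK)) * BLOCK;  /* random block */
    }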

14
Workload Management
  • Think time follows an exponential distribution (see the sampling sketch after this list).
  • Operation selection is based on the specified mix percentages, the operation context, and the file access patterns.
  • Operation context is determined by profiling the
    MPFS applications.
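A minimal sketch of exponential think-time sampling via the inverse CDF, assuming a caller-supplied mean in milliseconds; drand48() and usleep() are standard POSIX calls.

    #include <math.h>
    #include <stdlib.h>
    #include <unistd.h>

    /* Sleep for an exponentially distributed think time. */
    static void think(double mean_msec)
    {
        double u = drand48();                  /* uniform in [0, 1)    */
        double t = -mean_msec * log(1.0 - u);  /* inverse-CDF sampling */
        usleep((useconds_t)(t * 1000.0));      /* msec -> usec         */
    }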

15
File Sharing
  • Mainly measures how the locking mechanism affects performance (see the locking sketch after this list).
  • Includes read and write sharing.
  • Multiple processes in a single client access the
    same file simultaneously.
  • Multiple clients access the same file
    simultaneously.
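One way to exercise the locking path is with POSIX advisory byte-range locks; timing how long F_SETLKW blocks under contention exposes the locking cost. Whether MPFS locking maps onto fcntl() this way is an assumption; the call itself is standard.

    #include <fcntl.h>

    /* Block until a read or write lock on [start, start+len) is granted. */
    static int lock_range(int fd, off_t start, off_t len, int write_lock)
    {
        struct flock fl = {
            .l_type   = write_lock ? F_WRLCK : F_RDLCK,
            .l_whence = SEEK_SET,
            .l_start  = start,
            .l_len    = len,
        };
        return fcntl(fd, F_SETLKW, &fl);  /* waits for competing locks */
    }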

16
Embedded Micro-benchmarks
  • Measures the I/O performance of MPFS.
  • Includes sequential read, sequential write, random read, random write, and mixed random read/write tests.
  • Reports the throughput, measured in megabytes/second, for each I/O test (a sequential-read sketch follows this list).
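A sketch of the sequential-read micro-benchmark: read a file front to back and report megabytes/second. The 64 KB buffer size is an assumption.

    #include <fcntl.h>
    #include <sys/time.h>
    #include <unistd.h>

    /* Read the whole file sequentially and return throughput in MB/s. */
    static double seq_read_mbps(const char *path)
    {
        char buf[64 * 1024];
        struct timeval t0, t1;
        long total = 0;
        ssize_t n;
        int fd = open(path, O_RDONLY);
        if (fd < 0)
            return -1.0;

        gettimeofday(&t0, NULL);
        while ((n = read(fd, buf, sizeof buf)) > 0)
            total += n;
        gettimeofday(&t1, NULL);
        close(fd);

        double sec = (t1.tv_sec - t0.tv_sec)
                   + (t1.tv_usec - t0.tv_usec) / 1e6;
        return total / (1024.0 * 1024.0) / sec;
    }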

17
Caching
  • A larger client cache or more effective client caching may greatly affect the performance measurement, since our benchmark runs at the application level.
  • The huge file set and random file selection help avoid caching effects.

18
Testbed Configuration
19
System Monitors
  • Network Monitor: monitors network state and collects network traffic statistics
  • I/O Monitor: monitors disk I/O activity and collects I/O statistics
  • CPU Monitor: monitors CPU usage
  • Protocol Statistics Monitor: collects MPFS/NFS/CIFS statistics

20
Web Interface
21
Throughput and Response Time
Generated load vs. response time for the MPFS benchmarking testbed with 8 Solaris clients
22
Scalability
Measured maximum aggregate throughput vs. number of Solaris clients
23
Change of the Mix Percentage
Generated load vs. response time for different operation group mixes
24
Comparison between NFS and MPFS
Generated load vs. response time for three different MPFS and NFS Solaris client combinations
25
Conclusion (1)
  • Our benchmark achieves four major goals
  • Helps in understanding MPFS performance
  • Measures throughput and response time
  • Measures scalability
  • Measures the performance of each individual operation
  • Compares the performance of MPFS with that of NFS or CIFS
  • Generates realistic workloads
  • Operations are selected by profiling real-life MPFS applications.
  • File access patterns are derived from an empirical file system workload study.
  • File set construction mimics real-world environments.

26
Conclusion (2)
  • The file set is scalable and targets both large and small files
  • The number of files and the amount of data in our file set are scaled to the target load levels.
  • The file sets have different file size distributions.
  • Provides workloads across various platforms
  • Our benchmark supports both Unix and Windows NT systems.

27
Future Work
  • Create more realistic workloads
  • Build a large archive of MPFS traces
  • Develop a profiling model to characterize the traces
  • Improve the scalability measurement
  • Our benchmark uses the number of clients (load generators) to represent scalability.
  • The mapping between a load generator and a client in a real-world application needs further investigation.
  • Develop a more general workload model for SAN file systems
  • Different SAN file systems may have different implementations.
  • A general benchmark should be independent of the implementation.