Title: Effective Strategies for SAN Performance Monitoring


1
Effective Strategies for SAN Performance
Monitoring
with PerformanceVSN
NTSMF Users Group - CMG
  • David Signori
  • Product Marketing Manager, Software Solutions
  • INRANGE Technologies Corporation
  • 12/9/02

2
Current Challenges in Storage Networking
Administration
  • Planning network requirements for Business
    Continuance applications.
  • Planning network requirements for the
    ever-increasing size and complexity of the
    storage environment.
  • Lowering management cost while increasing storage
    networking performance
  • Implementing a Service Provider model consisting
    of charge back, reporting, and service level
    agreements to end users.
  • Eliminating finger pointing with Server, Network,
    and Database administration groups.
  • Managing heterogeneous environments.
  • Decreasing or eliminating downtime.

Ultimately, how do I increase and guarantee
performance while lowering cost?
3
Storage Networking Performance Monitoring Solution
  • Requirements
  • Session Layer Traffic Flow Monitoring
  • External to the Storage Networking Equipment
  • Standards-based Management, Collection, and
    Reporting Interfaces
  • Simple Plug-and-Play Configuration and Operation
  • Persistence: Permanent Records of Traffic
    Behavior
  • Flexible Reporting Capabilities
  • Policy Monitoring and Alerting
  • Enhance Storage Network Security
  • Scalable

A Comprehensive Storage Networking Performance
Monitoring Solution will increase performance and
lower cost.
4
What is PerformanceVSN? Product Overview
  • Definition
  • INRANGE Storage Networking Performance Monitoring
    Solution for Capacity Planning and Service Level
    Management.
  • Components
  • PerformanceVSN Server (Appliance)
  • PerformanceVSN Server Software
  • Optional PerformanceVSN Probe
  • Base Functionality
  • PerformanceVSN Server and Server Software
  • Port-level statistics collection, both real-time
    and historical
  • Statistics gathered from INRANGE Directors and
    switches
  • Advanced Functionality
  • PerformanceVSN Server, Server Software, and
    Probe(s)
  • Session-level statistics collection, both
    real-time and historical
  • Statistics gathered from INRANGE Directors,
    switches, and Probe(s)

PerformanceVSN Server
PerformanceVSN Probe
5
Performance Monitoring Requirements
Session Layer Traffic Flow Monitoring
Port vs. Session Layer Statistics
[Diagram: hosts Server_A through Server_G and storage devices RAID_A, RAID_B, and RAID_C, each with LUNs 1..n, joined by an ISL.]
  • Port statistics: ISL at 60% utilization
  • Session statistics: total ISL utilization 60%
  • Server_A to RAID_B: 35%
  • Server_A to RAID_B / LUN 3: 10%
  • Server_A to RAID_B / LUN 9: 15%
  • Server_A to RAID_B / LUN 5: 10%
  • Server_B to RAID_C: 25%
  • Server_B to RAID_C / LUN 2: 22%
  • Server_B to RAID_C / LUN 7: 3%
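The difference between the two views can be sketched in a few lines of Python. The figures are the ones on this slide; the tuple layout and the `rollup` helper are illustrative assumptions, not part of PerformanceVSN.

```python
from collections import defaultdict

# Session-level samples: (host, storage device, LUN, percent of ISL used).
# The values come from the slide; the data layout is an assumption.
SESSIONS = [
    ("Server_A", "RAID_B", 3, 10),
    ("Server_A", "RAID_B", 9, 15),
    ("Server_A", "RAID_B", 5, 10),
    ("Server_B", "RAID_C", 2, 22),
    ("Server_B", "RAID_C", 7, 3),
]

def rollup(sessions):
    """Sum per-LUN utilization up to host/storage pairs and to the whole link."""
    per_pair = defaultdict(int)
    for host, storage, _lun, util in sessions:
        per_pair[(host, storage)] += util
    return dict(per_pair), sum(per_pair.values())

per_pair, isl_total = rollup(SESSIONS)
print(per_pair)   # utilization per host/storage pair
print(isl_total)  # the single number a port-level counter would report
```

A port counter only yields the final total; the session view keeps the breakdown that explains it.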
6
Performance Monitoring Requirements
FICON Layer 2 Session Layer Traffic Flow
Monitoring
[Diagram: Server_A (Channel_A1, Channel_A2), Server_B (Channel_B1, Channel_B2), and Server_C (Channel_C1, Channel_C2) attached to FICON_Storage_A (CU_A1, CU_A2) and FICON_Storage_B (CU_B1, CU_B2).]
  • Port statistics: CU_B2 at 60% utilization
  • Session statistics: total CU_B2 utilization 60%
  • Channel_A1 to CU_B2: 35%
  • Channel_B2 to CU_B2: 20%
  • Channel_C1 to CU_B2: 5%
7
FICON Cascading High Integrity Fabric
[Diagram: Server_A through Server_D (Channel_A1 through Channel_D2) attached to FICON_Storage_A (CU_A1, CU_A2), FICON_Storage_B (CU_B1, CU_B2), and FICON_Storage_C (CU_C1, CU_C2) across an ISL.]
  • Port statistics: ISL at 60% utilization
  • Session statistics: total ISL utilization 60%
  • Channel_D1 to CU_B2: 35%
  • Channel_A2 to CU_C1: 20%
  • Channel_C1 to CU_C2: 5%
8
Performance Monitoring
FICON ULP Session Layer Traffic Flow Monitoring
[Diagram: Server_A (Channel_A1, Channel_A2), Server_B (Channel_B1, Channel_B2), and Server_C (Channel_C1, Channel_C2) attached to FICON_Storage_A (CU_A1, CU_A2) and FICON_Storage_B (CU_B1, CU_B2); CU_B2 contains Device_B2A1, Device_B2A2, Device_B2B1, Device_B2B2, and Device_B2B3.]
  • Port statistics: CU_B2 at 60% utilization
  • Session statistics: total CU_B2 utilization 60%
  • Channel_A1 to CU_B2: 35%
  • Channel_A1 to CUADD_B2B: 20%
  • Channel_A1 to Device_B2B1: 15%
  • Channel_A1 to Device_B2B3: 5%
  • Channel_A1 to CUADD_B2A: 15%
  • Channel_A1 to Device_B2A1: 10%
  • Channel_A1 to Device_B2A2: 5%
  • Channel_B2 to CU_B2: 20%
  • Channel_C1 to CU_B2: 5%
9
Session Layer Reporting Examples
  • Real-time Summary of the Selected LUNs in SCSI
    Read Mbytes/Sec being currently accessed by all
    hosts.
  • Note that this is a system wide report across all
    servers on the network.

10
Session Layer Reporting Examples
  • Real-time Summary of the Top 5 LUNs in Total
    Mbytes/Sec being currently accessed by Host
    Server_A.
  • Note LUNs 9, 5, 7, 6, and 8 on storage device
    RAID_A

11
Session Layer Reporting Examples
  • Real-time Summary of the Top 5 LUNs in Read
    Duration for Host Server_A.
  • Note that this is a measure of latency; the
    report shows the 5 LUNs with the highest latency
    on the network.

12
Session Layer Reporting Examples
  • Trend of SCSI Exchanges/Sec between host
    Server_A and storage device RAID_A for the
    past 2 hours.

13
Performance Monitoring Requirements
External to Storage Networking Devices
  • Resources in network devices should be dedicated
    to the distribution and handling of incoming and
    outgoing data streams.
  • Many potential problems at the framing and upper
    layers are not reported.
  • Although external, the probe should be
    non-intrusive.

[Diagram: Performance Monitoring Probe placement. Labels: Servers, Storage, Metro Disk Mirroring, WAN, WAN Disk Mirroring, Remote Storage.]
14
Performance Monitoring Requirements
Standards Based
[Diagram: Performance Monitoring Platform and its standards-based interfaces.]
  • Reporting: Java GUI, Spreadsheets, SAS, Home grown
    (via CSV, SQL, HTTP)
  • Management: SAN Management, Data Management,
    Virtualization, SRM, Enterprise Management (via
    SNMP, CIM/XML)
  • Collection: Switches/Directors, Probes,
    Routers/Channel Extension, 3rd Party Devices (via
    TCP/IP, SNMP Fibre Alliance MIB)
15
Performance Monitoring Requirements
Standards Based
  • Should Support Heterogeneous Environments
  • Multi-Vendor Equipment
  • FICON, FCP, IP, and VI
  • Fibre Channel and WAN
  • Should Support Standalone Deployment or as a
    Plug-In to Chosen SAN Management Application
  • Adds value to chosen storage management
    applications
  • Should Function as a Plug-In to Chosen Enterprise
    Management System.
  • Should Leverage Performance Monitoring
    Capabilities in Existing Equipment: Metrics and
    Access
  • Service Provider-Type Reporting

16
Performance Monitoring Requirements
Simple Plug-and-Play Configuration and Operation
  • Should Support Topology Rollup and Automatic
    Discovery of ports, devices, and LUNs.
  • Session and SCSI layer monitoring should be
    reported by human-readable logical port and
    device names
  • Permanent Statistics Logging should start
    automatically and have easily configurable
    sampling periods
  • Should Support a Dashboard for Quick Health
    Assessment
  • Should Support Open Systems Management for Remote
    and Desktop Access.

17
Performance Monitoring Requirements
Persistence: Permanent Records of Traffic Behavior
  • Should support user-configurable historic
    sampling intervals
  • Should support user-configurable rollup periods
    and retention times for efficient database usage
  • Should support archival and export of database
    for long term capacity planning
  • Persistent statistical storage enables capacity
    planning and trouble-shooting of problems that
    occurred in the past
  • Should support historical trend reports for
    capacity planning and performance tuning.
  • Should support historical summaries for Service
    Provider-Type Reporting.
  • Should support bookmarks and pre-configured time
    durations for frequently viewed reports and
    Service Provider-Type Reporting
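Rollup periods and retention times can be illustrated with a small Python sketch. The function names, the `(timestamp, value)` sample layout, and the numbers are assumptions chosen for illustration; they do not describe how any particular product stores its history.

```python
from statistics import mean

def rollup(samples, period_s):
    """Average (timestamp, value) samples into fixed buckets of period_s seconds."""
    buckets = {}
    for ts, value in samples:
        start = ts - (ts % period_s)  # start of the bucket this sample falls in
        buckets.setdefault(start, []).append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]

def prune(samples, now_s, retention_s):
    """Keep only samples that are no older than the retention time."""
    return [(ts, v) for ts, v in samples if now_s - ts <= retention_s]

# Hypothetical 30-second samples of MB/sec on one port.
raw = [(0, 10.0), (30, 20.0), (60, 30.0), (90, 50.0), (120, 40.0)]

per_minute = rollup(raw, period_s=60)           # coarser records for long-term storage
recent = prune(raw, now_s=120, retention_s=60)  # fine-grained records kept briefly
print(per_minute)
print(recent)
```

Keeping fine-grained samples only for a short window and rolled-up averages for longer is one way to get the efficient database usage the slide asks for.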

18
Performance Monitoring Requirements
Persistence: Permanent Records of Traffic
Behavior (Examples)
  • Trend of Total Mbytes/Sec In and Out for a
    selected port over the past 2 hours
  • Note that the report was requested at 18:30 and
    displayed historical data. This is not a trace
    that began at 16:30.

19
Performance Monitoring Requirements
Persistence: Permanent Records of Traffic
Behavior (Examples)
  • Trend of Total Mbytes/Sec In and Out for a
    selected port over the past 8 hours
  • Note that in addition to customized time periods,
    pre-configured time periods like Today,
    Yesterday, Current Week, and Last Month should be
    possible.

20
Performance Monitoring Requirements
Persistence: Permanent Records of Traffic
Behavior (Examples)
  • Trend of SCSI Exchanges/Sec between host
    Server_A and storage device RAID_A for the
    past 2 hours.

21
Performance Monitoring Requirements
Persistence: Permanent Records of Traffic
Behavior (Examples)
  • Summary of the Top 5 LUNs in Total Mbytes/Sec
    accessed by Host Server_A for the month of
    May 2002
  • Note LUNs 9, 5, 7, 6, and 8 on storage device
    RAID_A

22
Performance Monitoring Requirements
Flexible Reporting Capabilities
  • Should Support Real-Time Monitoring
  • Should Support Collection of Hundreds of Metrics
    including Diagnostics
  • Should Include Value-Added Derived Reports like
    TopN, Rates, and Multiple Devices and Statistics
    in a Single Report
  • Should Support Configurable Sampling Intervals
  • Should Support Bookmarks to Easily Return to
    Frequently Viewed Reports.
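Derived reports such as rates and TopN are simple to compute once raw counters are collected. The sketch below is illustrative only: the helper names and the sample values are assumptions, and counter wrap-around is not handled.

```python
import heapq

def rate_mb_per_s(bytes_before, bytes_after, interval_s):
    """Turn two cumulative byte counters into an MB/sec rate over the interval."""
    return (bytes_after - bytes_before) / interval_s / 1_000_000

def top_n(rates, n):
    """Return the n (name, rate) pairs with the highest rates, largest first."""
    return heapq.nlargest(n, rates.items(), key=lambda item: item[1])

# Hypothetical per-LUN rates in MB/sec for one host.
lun_rates = {
    "LUN 2": 0.5, "LUN 5": 12.5, "LUN 6": 4.0,
    "LUN 7": 9.0, "LUN 8": 1.5, "LUN 9": 20.0,
}

print(rate_mb_per_s(1_000_000_000, 3_000_000_000, 10))
print(top_n(lun_rates, 5))
```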

23
Performance Monitoring Requirements
Flexible Reporting Capabilities: Hundreds of
Metrics, Examples
  • Utilization
  • Frames (In/Out)
  • FC-2 MB/Sec (In, Out)
  • FC-4 MB/Sec (In, Out by ULP: SCSI, IP, VI,
    FICON, and others)
  • Errors MB/Sec (In, Out)
  • SCSI IO/Sec (Read, Write, Other)
  • SCSI Read (avg, min, max, read percentage)
  • SCSI Write (avg, min, max, write percentage)
  • SCSI Other (other percentage)
  • SCSI Read/Write Payload Size Ranges (percentage)
  • Throughput Errors
  • Busy Frames
  • Rejected Frames
  • Link Failures
  • Aborts
  • Primitive Seq Protocol Errors
  • Invalid Tx Words
  • Delimiter Errors
  • Discarded Frames
  • BSYs and RJTs (Port, Fabric)
  • CRC Errors
  • Availability
  • Link Resets (In/Out)
  • OLS (In/Out)
  • LOGIs (Port, Fabric)
  • Available
  • Link Integrity
  • Sync Loss
  • Sig Loss
  • Capacity
  • capacity for all frames
  • FC-4 capacity (SCSI,IP,VI,FICON, other)
  • capacity link control
  • capacity link services
  • Latency
  • SCSI Read/Write Duration (ms)

24
Performance Monitoring Requirements
Flexible Reporting Capabilities (Examples)
  • Real-time Summary of Total Mbytes/Sec for 24
    selected ports.
  • Note that multiple ports across multiple switches
    can be added to a single report.
  • Note that the report is accessed using a bookmark.

25
Performance Monitoring Requirements
Flexible Reporting Capabilities (Examples)
  • Real-time Summary of percent read exchange size
    to storage device RAID_A from all hosts on the
    network.
  • Real-time sampling interval can be modified.
  • The report can be toggled to a trend view by
    selecting a toolbar button.
  • Multiple metrics in a single report

26
Performance Monitoring Requirements
Policy Monitoring and Alerting
  • Should support proactive troubleshooting to
    eliminate or decrease downtime
  • Should support open real-time alerting (e.g.,
    SNMP, email)
  • Should support multiple levels of thresholds
  • Should support pre-defined threshold definitions
    for quick and easy configuration
  • Thresholds should be supported on all metrics
    collected, including errors, type of traffic, and
    size of traffic, and on all objects, including
    ports, devices, and logical units
  • Ideal for Service Provider Model since
    administrator knows about potential problems
    before end-user.
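Multiple threshold levels applied to any metric on any object might look like the following Python sketch. The metric names, threshold values, and object names are hypothetical; a real deployment would forward the resulting alerts over SNMP or email rather than print them.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Threshold:
    warning: float
    critical: float

    def level(self, value):
        """Classify a value against the two threshold levels."""
        if value >= self.critical:
            return "critical"
        if value >= self.warning:
            return "warning"
        return "ok"

# Hypothetical pre-defined thresholds, keyed by metric name.
THRESHOLDS = {
    "utilization_pct": Threshold(warning=70, critical=90),
    "crc_errors_per_s": Threshold(warning=1, critical=10),
}

def check(samples, thresholds):
    """Return (object, metric, level) for every sample that violates a threshold."""
    alerts = []
    for obj, metric, value in samples:
        threshold = thresholds.get(metric)
        if threshold is None:
            continue  # no policy defined for this metric
        level = threshold.level(value)
        if level != "ok":
            alerts.append((obj, metric, level))
    return alerts

samples = [
    ("ISL port 12", "utilization_pct", 75),
    ("RAID_A / LUN 9", "utilization_pct", 95),
    ("Server_A", "crc_errors_per_s", 0),
]
alerts = check(samples, THRESHOLDS)
print(alerts)
```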

27
Performance Monitoring Requirements
Enhanced Security Policies
  • Role-Based Security
  • Event Logging
  • Security Policy Monitoring: alerting on
    unauthorized host-to-LUN access
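Because session-layer monitoring sees which host talks to which LUN, a host-to-LUN policy check reduces to a lookup. The sketch below assumes a hypothetical allow-list and observed-session layout; neither is taken from a real product.

```python
# Hypothetical policy: the LUNs each host may access on each storage device.
ALLOWED = {
    ("Server_A", "RAID_A"): {5, 6, 7, 8, 9},
    ("Server_B", "RAID_C"): {2, 7},
}

def unauthorized(sessions, allowed):
    """Return observed (host, storage, lun) sessions that the policy does not permit."""
    return [
        (host, storage, lun)
        for host, storage, lun in sessions
        if lun not in allowed.get((host, storage), set())
    ]

observed = [
    ("Server_A", "RAID_A", 9),  # permitted
    ("Server_B", "RAID_C", 3),  # LUN not in the policy for this pair
    ("Server_C", "RAID_A", 5),  # host has no policy entry at all
]
violations = unauthorized(observed, ALLOWED)
print(violations)
```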

28
Performance Monitoring Requirements
Scalability
  • Should Support a Combination of Software and
    Hardware to Suit Your Needs.
  • Should Support an Inexpensive Entry Point that is
    Easily Expandable as Your Network Grows.
  • Should Support a Roadmap around Future Storage
    Networking Requirements (e.g., 10G, FC-IP, iSCSI,
    InfiniBand)
  • Should be Data Center Ready (e.g., multiple
    interfaces in a single enclosure, rack-mountable)

29
Performance Monitoring Life-Cycle
Putting it all together
  • Performance Profiling: Record and monitor current
    network performance levels.
  • Performance Thresholding: Set thresholds based on
    profiles for real-time alerting to throughput and
    availability problems.
  • Performance Tuning: Adjust traffic flows based on
    profiles for better network performance without
    spending for more resources.
  • Capacity Planning: Know exactly when and how much
    more resources are needed without overspending.
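One common way to set thresholds from a profile is to alert when a metric rises well above its recorded baseline. The Python sketch below uses mean plus k standard deviations; that rule, the value of k, and the sample data are assumptions for illustration, not a method stated in the presentation.

```python
from statistics import mean, pstdev

def threshold_from_profile(profile, k=3.0):
    """Alert level = baseline mean plus k standard deviations of the profile."""
    return mean(profile) + k * pstdev(profile)

# Hypothetical profiled utilization (percent) for one port during normal operation.
profile = [40, 42, 38, 41, 39]
limit = threshold_from_profile(profile)

def violates(value, limit):
    """True when a real-time sample exceeds the profile-derived threshold."""
    return value > limit

print(round(limit, 2))
print(violates(43, limit), violates(60, limit))
```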

30
Case Study and ROI
Large Financial Brokerage: Metro Area Disk
Mirroring
[Diagram: Servers, Storage, and Remote Storage connected over FICON and FCP links across DWDM.]
31
Case Study and ROI
Performance Profiling
MAN extender usage across a selected week. Note
spikes in traffic.
32
Case Study and ROI
Performance Profiling
Drilling into MAN extender usage for a specific
day. Note the spike in traffic between noon and
1 PM.
33
Case Study and ROI
Performance Tuning
Drilling into storage port usage identifies the
offending storage device.
34
Case Study and ROI
Large Financial Brokerage: Metro Area Disk
Mirroring
  • Given
  • A DWDM channel costs $16k/month.
  • The customer was considering going to 4 channels
    per fabric but was able to justify that, for the
    time being, 3 per fabric was adequate.
  • Result
  • ROI was less than 2 months for this particular
    solution.
  • Additional Benefits
  • Capacity Planning
  • Visibility into utilization trends determines
    exactly when additional channels will be needed.
  • Performance Tuning
  • Visibility into the offending storage device
    provides load-balancing feedback to re-map devices
    to lower-utilized links, thus optimizing channels.
  • Standards-Based
  • Provides seamless visibility into the FICON
    portion of the fabric as well.
  • Real-Time Monitoring
  • Reports on errors for troubleshooting and
    diagnostics.
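The ROI arithmetic behind this case study is a payback-period calculation. The slide gives the channel cost (16k per month) and the decision to stay at 3 channels per fabric instead of 4, but not the solution price or the number of fabrics, so both are hypothetical placeholders in this Python sketch.

```python
def payback_months(solution_cost, channels_avoided, channel_cost_per_month):
    """Months until avoided channel fees cover the cost of the monitoring solution."""
    monthly_savings = channels_avoided * channel_cost_per_month
    if monthly_savings <= 0:
        raise ValueError("no savings, so the cost is never recovered")
    return solution_cost / monthly_savings

CHANNEL_COST_PER_MONTH = 16_000  # from the slide

# Hypothetical inputs: two fabrics, each avoiding one channel (4 -> 3),
# and an assumed solution cost of 50,000.
months = payback_months(
    solution_cost=50_000,
    channels_avoided=2,
    channel_cost_per_month=CHANNEL_COST_PER_MONTH,
)
print(months)
```

With these assumed inputs the payback lands under two months, which is consistent with the result quoted on the slide but does not reconstruct the customer's actual figures.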

35
Performance Monitoring Solutions to Current
Challenges
  • Planning network requirements for Business
    Continuance applications
  • Planning network requirements for the
    ever-increasing size and complexity of the
    storage environment
  • Answers question of how many MAN extender links
    you need.
  • Answers question of how much WAN extender
    bandwidth you need.
  • Traces spikes in MAN/WAN extender link back to
    the device and volume that caused it.
  • Enables you to know when you will need more
    bandwidth.
  • Reports on Latency
  • Answers question of how many ISLs you need.
  • Answers question of what is the optimum
    server-to-storage ratio.
  • Enables you to know when you will need more
    ports.
  • Traces spikes in ISL and storage port back to the
    device and volume (LUNs) that caused it.
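Knowing when more bandwidth will be needed amounts to extrapolating a historical trend. The Python sketch below fits a least-squares line to periodic utilization figures and estimates when it crosses a limit. The linear-growth assumption, the function name, and the sample history are all illustrative.

```python
def periods_until(limit, history):
    """Fit a least-squares line to history (oldest first) and return how many
    periods after the last sample the line reaches limit, or None if the
    trend is flat or falling."""
    n = len(history)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(history) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, history))
    slope = sxy / sxx
    if slope <= 0:
        return None
    intercept = y_mean - slope * x_mean
    return (limit - intercept) / slope - (n - 1)

# Hypothetical monthly peak utilization (percent) of a MAN extender link.
history = [40, 45, 50, 55]
remaining = periods_until(80, history)
print(remaining)
```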

36
Performance Monitoring Solutions to Current
Challenges
  • Lowering management cost while increasing storage
    networking performance
  • Implementing a Service Provider model consisting
    of charge back, reporting, and service level
    agreements to end users.
  • Eliminating finger pointing with Server, Network,
    and Database administration groups.
  • Reports, both real-time and historical, are only
    a mouse click away. No need for tedious
    spreadsheet crunching.
  • Command line launch and open APIs for seamless
    integration with 3rd party storage management
    application.
  • Since Session Layer Monitoring correlates usage
    and errors to the individual server, storage
    device, and volume (LUN), accountability can be
    maintained at the department level.
  • Session layer response time metrics allow you to
    distinguish between network, server, and storage
    device latency.

37
Performance Monitoring Solutions to Current
Challenges
  • Managing heterogeneous environments.
  • Decreasing or eliminating downtime with proactive
    policy-based monitoring.
  • Because the solution is external to networking
    devices and uses standard collection interfaces,
    it is independent of fabric vendor and ULP and
    can extend to the WAN.
  • Real-time and SNMP alerts on user-defined
    thresholds. You profile the network and define
    behavior. Solution provides real-time
    notification of policy violation.
  • Combines the best of both worlds
  • Level of visibility on par with expensive
    diagnostic tools
  • Ease of use and capacity planning of an
    Enterprise service level management application.

38
Advanced Performance Monitoring Solutions
  • Capacity Planning/Modeling: Planning for network
    usage of resources not yet added. For example, a
    new department adds 10 clients that access
    application X on Server A, which already has 100
    clients. Throughput from Server A to which disks
    will increase by 10%? ROI potential: if you
    under-use ISLs, you are over-spending.
  • Service Duplication/Modeling: Planning for WAN
    usage of an application not yet added. For
    example, the WAN will support a disk mirror. How
    much bandwidth is needed to adequately support
    write I/O to particular disks or volumes? ROI
    potential: if you under-use WAN links, you are
    over-spending.
  • Performance Tuning: An application/server
    consolidation example. Applications needing
    access to much of the same data are candidates to
    run on the same server or in the same cluster.
    ROI potential: if you under-use servers, you are
    over-spending.
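The capacity-modeling example above (10 new clients on a server that already has 100) can be worked through in Python. The sketch assumes throughput scales linearly with client count, which is a simplification, and the current throughput figure is hypothetical.

```python
def projected_throughput(current_mb_s, current_clients, added_clients):
    """Project throughput, assuming it scales linearly with the number of clients."""
    return current_mb_s * (current_clients + added_clients) / current_clients

def growth_pct(current_clients, added_clients):
    """Percentage increase in load implied by the added clients."""
    return added_clients / current_clients * 100

# 100 existing clients and 10 new ones, as in the slide;
# 50 MB/sec of current Server A traffic is an assumed figure.
projected = projected_throughput(50.0, current_clients=100, added_clients=10)
increase = growth_pct(100, 10)
print(projected, increase)
```

Applying the same scaling to the per-LUN session statistics would show which disks absorb the extra load.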

39
Advanced Performance Monitoring Solutions
  • Performance Tuning: Save cost by separating the
    types of transactions on the network. For
    instance, separating transaction (I/O) and
    data-intensive operations will allow more
    transactions and deeper data mining.
  • Add value to storage management applications.
    Example: a performance monitoring application
    feeds a data backup/replication application so
    that the backup time period is automatically
    selected and optimized.
  • Performance Management: Automate actions based on
    conditions detected. Example: a feedback loop to
    switching devices for intelligent routing
    decisions.
  • Life-Cycle Data Monitoring: Based on the level of
    access over the network, determine the appropriate
    storage type for particular data or applications.
    Provides feedback for HSM.

40
Questions, or for a copy of the presentation:
David.Signori_at_Inrange.com
703-442-3284