Title: Metron
1. Windows 2000 - The Rhetoric and the Reality for Performance Managers
Des Atkinson, Metron Technology Ltd, desa_at_metron.co.uk
2. Contents
- NT 4.0 Performance Monitoring
- the shortcomings
- Windows 2000
- System Monitor
- New Metrics
- Event Trace
- Windows 2000 Scalability and other issues
3. NT 4.0 PM Shortcomings
- All tools were snapshot-based
- Process/thread termination between sampling intervals
- no accounting function unlike UNIX or a mainframe OS
- threads come and go much more than processes, so loss not too great
- dependent on sampling granularity
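The sampling loss described on this slide can be illustrated with a small sketch. The process lifetimes, the sampling intervals and the function name are all assumptions chosen for illustration; they are not measurements from NT.

```python
def missed_by_sampling(lifetimes, interval):
    """Return the (start, end) lifetimes that no sample instant falls inside.

    Samples are assumed to be taken at t = 0, interval, 2*interval, ...
    A process is seen only if it is alive at one of those instants.
    """
    missed = []
    for start, end in lifetimes:
        # First sample instant at or after the process start (ceiling division).
        first_sample = -(-start // interval) * interval
        if first_sample > end:
            missed.append((start, end))
    return missed


# Hypothetical process lifetimes in seconds: (start, end).
lifetimes = [(1, 4), (3, 40), (16, 29), (31, 44), (50, 200)]

coarse = missed_by_sampling(lifetimes, 15)  # 15-second snapshots
fine = missed_by_sampling(lifetimes, 5)     # 5-second snapshots
print("missed at 15s:", coarse)
print("missed at 5s: ", fine)
```

With these made-up lifetimes, three short-lived processes are never seen at a 15-second interval and one is still missed at 5 seconds, which is the dependence on sampling granularity noted above.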
4. NT 4.0 PM Shortcomings
- Per Process I/O Counts
- Counters in NT 3.1 returned crazy values
- In NT 3.51 values were consistently zero
- In NT 4.0 they disappeared!
- Can attempt some inferences using PF Delta
- If server dedicated to a single app, e.g. Oracle, then data available at that level
- I/O reads per user session
- I/O writes by the DBWR
5. NT 4.0 PM Shortcomings
- Per Process/Thread per Device I/O
- a dream for capacity planning/modelling
- not present even on most mainframe or mid-range systems
- Workarounds for the above
- work it out once then remember
- infer by file to device mappings, e.g. from Oracle statistics
6. NT 4.0 PM Shortcomings
- Transaction Boundaries
- Mainframes have TP Monitors such as CICS on MVS or TPMS on VME
- Block mode terminals with associated message types make transaction boundaries clear, both conceptually and in their metrics
- Midrange systems such as OpenVMS or UNIX lack this clarity, as does NT
7. NT 4.0 PM Shortcomings
- Per Process/Thread Hard Page Fault Counts
- Which processes are the most memory intensive?
- Which processes' work actually results in physical page faults as opposed to logical page faults?
- What is the size and rate of these hard page faults?
8. NT 4.0 Performance Monitor
- Good for real-time monitoring
- Data logging is not sophisticated
- NT Resource Kit has a data logging service (MONITOR.EXE, DATALOG.EXE)
- does not pick up new processes
- No built-in data manipulation, trending or modelling
9. Enhanced Disk Performance counters on this system are currently set to start at boot.
Note that Logical Disk counters of striped disk sets may not be correct.

DISKPERF [-Y[E] | -N] [\\computername]

  -Y[E]           Sets the system to start disk performance counters when the system is restarted.
                  E Enables the disk performance counters used for measuring performance of the
                  physical drives in striped disk set when the system is restarted.
                  Specify -Y without the E to restore the normal disk performance counters.
  -N              Sets the system disable disk performance counters when the system is restarted.
  \\computername  Is the name of the computer you want to see or set disk performance counter use.
10. New PM Features in W2K
- System Monitor
- Supersedes the NT 4.0 Performance Monitor
- An embeddable component
- May be programmed/configured
- Performance logs and alerts service
- Event Trace as additional data collection technology
- OS Kernel instrumented since Beta 1
- Active Directory, Exchange etc. in progress
11. System Monitor Architecture
[Diagram. Components shown: User defined VB or HTML application; Windows NT Performance Monitor; System Monitor graph control; Custom Performance Tool; Sysmon log and alert service; PDH.DLL; WMI; RegQueryValueEx(); Perflib; Performance Extension DLLs; System Performance DLLs; Hi-Perf Data Provider Object; Sysmon Log Service Files]
12. System Monitor Interfaces
- Methods
- e.g. typical user interface tasks such as adding counters
- Properties
- e.g. data source properties or those of counter displays
- Events
- e.g. where a control has been changed, such as when a counter has been added
16. New W2K System Metrics
- New PnP DiskPerf
- Correct logical and physical counters for FT devices at the same time!
- Counters on a per disk or per volume basis
- Disk counters
- Idle time
- Split I/O count
- Count of I/Os that were sub-divided internally
- Provides an indication of disk fragmentation
17. System Monitor Counter Fixes
- For disk counters, Microsoft say you no longer have to run diskperf -y to switch these on
- confirmed this to be true
- Microsoft claim to have fixed the I/O by process counters in beta 3
- confirmed that these metrics do return values, but they look rather strange (see earlier slide)
18. What is Event Tracing?
- An event trace is a recorded and ordered set of events
- Provides information to supplement the standard counters
- Events traced may include disk I/O, TCP/IP traffic, thread creation/deletion, file I/O
19. Exploiting Trace Logs
- Running a trace creates an event log file (e.g. test_000001.etl) in the PerfLogs directory (6MB in 30 minutes!)
- Microsoft talk of detailed analysis tools or enterprise tools
- BUT nothing in beta 3 will actually look at these .etl files (Jee Pang's tracedmp.c files not on the CDs!)
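To put the 6MB in 30 minutes figure in context, a rough projection can be sketched as follows. It assumes the log grows linearly at the observed rate, which a real trace need not do; the function name is hypothetical.

```python
def projected_log_mb(observed_mb, observed_minutes, target_minutes):
    """Scale an observed trace log size linearly to a longer run."""
    return observed_mb * target_minutes / observed_minutes


# Observed figure from the slide: 6 MB after 30 minutes of tracing.
per_hour = projected_log_mb(6, 30, 60)
per_day = projected_log_mb(6, 30, 24 * 60)
print(f"{per_hour:.0f} MB per hour, {per_day:.0f} MB per day")
```

Under that linear assumption the trace would produce about 12 MB per hour and 288 MB per day.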
20. Tracing Active Directory
- Microsoft claim that every request or transaction may be traced for the following
- LDAP (Lightweight Directory Access Protocol)
- Replication
- Kerberos authentication
- I have not yet been able to confirm this or measure any overhead
21. System Monitor vs Event Trace Data
- System Monitor Counters
- very low overhead
- best for continuous monitoring
- problems relating system to user
- Event Trace Data
- higher overhead
- best for detailed analysis or capacity planning
- good for relating system to user
28. Microsoft Help Extract
Note: Trace logging of file I/O and page faults can generate an extremely large amount of data. It is recommended that you limit trace logging using the file I/O and page fault options to a maximum of two hours.
29. What Overhead does Event Tracing Impose?
- Jee Pang claims that the overhead of the Kernel tracing is very low
- Decided to test this by running a set of benchmarks and measuring elapsed times before and after enabling tracing
- 3 benchmarks were run (Perl scripts)
- I/O intensive
- Memory intensive
- Compute intensive
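The before/after comparison can be reduced to a single overhead percentage, as in the sketch below. The elapsed times are hypothetical placeholders, not the results of these benchmarks, and the function name is an assumption.

```python
from statistics import mean


def tracing_overhead_pct(baseline_secs, traced_secs):
    """Percentage increase in mean elapsed time when tracing is enabled."""
    base = mean(baseline_secs)
    traced = mean(traced_secs)
    return (traced - base) / base * 100.0


# Hypothetical elapsed times in seconds for repeated runs of one benchmark.
baseline = [50, 50, 50]   # tracing off
traced = [52, 53, 54]     # tracing on

overhead = tracing_overhead_pct(baseline, traced)
print(f"Tracing overhead: {overhead:.1f}%")
```

Averaging several runs per configuration helps separate any tracing cost from ordinary run-to-run variation, which matters when the claimed overhead is very low.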
30. I/O INTENSIVE BENCHMARK
    print "Enter required number of iterations ";
    # Typical number of iterations was 250, with avg elapsed time of 53 secs
    chomp($input_var = <STDIN>);
    $fname = time;      # output file is named after the current timestamp
    $start_t = time;
    for ($x = 1; $x <= $input_var; $x++) {
        for ($y = 1; $y <= $input_var; $y++) {
            $z = $x * $y;   # operator lost in the transcript; multiplication assumed
            # Open and close the file each time to increase the I/O overhead
            open(OUTF, ">>$fname") or die "Cannot open $fname: $!";
            print OUTF "$z\n";
            close(OUTF);
        }
    }
    $end_t = time;
    $tot_t = $end_t - $start_t;
    print "Start time $start_t\n";
    print "End time $end_t\n";
    print "Elapsed time $tot_t secs for $input_var iterations\n";
    print "Output file $fname\n";
32. MEMORY INTENSIVE BENCHMARK
    print "Enter number of iterations ";
    # Typical number of iterations was 3000000 with average elapsed time of 19 secs
    chomp($iters = <STDIN>);
    $start_t = time;
    # Grow an array one element at a time to exercise memory allocation
    for ($num = 0; $num < $iters; $num++) {
        $grocerylist[$num] = $num;
    }
    $end_t = time;
    $tot_t = $end_t - $start_t;
    print "Start time $start_t\n";
    print "End time $end_t\n";
    print "Elapsed time $tot_t secs for $iters iterations\n";
34. COMPUTE INTENSIVE BENCHMARK
    print "Enter required number of iterations ";
    # Typical number of iterations was 5000 with average elapsed time of 45 seconds
    chomp($input_var = <STDIN>);
    $start_t = time;
    $blob = 0;
    for ($x = 1; $x <= $input_var; $x++) {
        for ($y = 1; $y <= $input_var; $y++) {
            $z = $x * $y;   # operator lost in the transcript; multiplication assumed
        }
    }
    $end_t = time;
    $tot_t = $end_t - $start_t;
    print "\nStart time $start_t\n";
    print "End time $end_t\n";
    print "Elapsed time $tot_t secs for $input_var iterations\n";
36. Event Trace and Applications
- Event Trace has an API that can be embedded in a user-written application
- Per process buffer pool etc.
- Can be used to measure transaction throughputs/response times
- Hey - what about ARM?
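How application-level trace events could yield response times is sketched below. This does not use the Event Trace API itself: the event tuples, field layout and function name are hypothetical, and stand in for whatever start and end events an instrumented application would log.

```python
from collections import defaultdict


def response_times(events):
    """Pair start/end events by transaction id.

    Each event is (timestamp_ms, txn_id, txn_type, kind), with kind being
    "start" or "end". Returns {txn_type: [elapsed_ms, ...]}.
    """
    started = {}
    elapsed = defaultdict(list)
    for ts, txn_id, txn_type, kind in sorted(events, key=lambda e: e[0]):
        if kind == "start":
            started[txn_id] = ts
        elif kind == "end" and txn_id in started:
            elapsed[txn_type].append(ts - started.pop(txn_id))
    return dict(elapsed)


# Hypothetical trace records from an instrumented application.
events = [
    (0, 1, "order", "start"),
    (100, 2, "query", "start"),
    (250, 2, "query", "end"),
    (500, 1, "order", "end"),
]

times = response_times(events)
print(times)
```

Because each transaction is marked explicitly by the application, the boundaries that sampled counters cannot show become visible, and throughput follows from counting completed transactions per interval.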
37. Clustering Technologies on Windows
- MSCS
- 2-node failover cluster
- no scalability features
- Load Balancing Server
- a step-up from DNS or IP load balancing
- designed for the middle-tier on HTTP, FTP etc.
- 3rd party solutions such as Oracle Parallel Server
38. Two-node failover database solution
[Diagram: two nodes, each labelled Database]
39. Oracle Parallel Server on Windows NT Architecture
40. W2K and Clusters
- W2K Datacenter Server will be shipped 90 to 180 days after rest of W2K
- Beta 3 issued early May 1999 so probable shipment date of final release of W2K is October 1999
- Assume therefore that Datacenter Server will ship around April 2000
- Microsoft still ambivalent about what extensions to clustering will be in it
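The date estimate on this slide is simple calendar arithmetic, sketched below. The slide gives only the month of October 1999, so the first and last days of that month are used here as assumed bounds; the function name is hypothetical.

```python
from datetime import date, timedelta


def ship_window(base_release, min_days=90, max_days=180):
    """Earliest and latest ship dates, 90 to 180 days after the base release."""
    return (base_release + timedelta(days=min_days),
            base_release + timedelta(days=max_days))


# Assumed bounds for a W2K release "in October 1999".
early = ship_window(date(1999, 10, 1))
late = ship_window(date(1999, 10, 31))
print("Earliest:", early[0], "Latest:", late[1])
```

Under these assumptions the window runs from 30 December 1999 to 28 April 2000, so "around April 2000" corresponds to the late end of the range.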
41. Terminal Server Counters
- Terminal Services Object
- Session counts (active, inactive, total)
- Terminal Services Session
- 75 counters available
- CPU and memory usage
- Many relating to transmission of data
- Protocol Glyph Cache Hit Ratio!