Title: NCCS User Forum
1. NCCS User Forum
2. Agenda
- Welcome / Introduction: Phil Webster, CISTO Chief
- SSP Test: Matt Koop, User Services
- Current System Status: Fred Reitz, Operations Lead
- Discover Job Monitor: Tyler Simon, User Services
- NCCS Compute Capabilities: Dan Duffy, Lead Architect
- Analysis System Updates: Tom Maxwell, Analysis Lead
- PoDS: Jules Kouatchou, SIVO
- User Services Updates: Bill Ward, User Services Lead
- Questions and Comments: Phil Webster, CISTO Chief
3. Key Accomplishments
- Incorporation of SCU5 processors into the general queue pool
- Capability to run large jobs (4,000 cores) on SCU5
- Analysis nodes placed in production
- Migrated DMF from Dirac (Irix) to Palm (Linux)
4. New NCCS Staff Members
- Lynn Parnell, Ph.D. Engineering Mechanics, High Performance Computing Lead
- Matt Koop, Ph.D. Computer Science, User Services
- Tom Maxwell, Ph.D. Physics, Analysis System Lead
5. Agenda (section divider; repeats the agenda from slide 2)
6. Key Accomplishments
- Discover/Analysis Environment
  - Added SCU5 (cluster totals 10,840 compute CPUs, 110 TF)
  - Placed analysis nodes (dali01-dali06) in production status
  - Implemented storage area network (SAN)
  - Implemented GPFS multicluster feature
  - Upgraded GPFS
  - Implemented RDMA
  - Implemented InfiniBand token network
- Discover/Data Portal
  - Implemented NFS mounts for select Discover data on Data Portal
- Data Portal
  - Migrated all users/applications to HP Bladeservers
  - Upgraded GPFS
  - Implemented GPFS multicluster feature
  - Implemented InfiniBand IP network
  - Upgraded SLES10 operating system to SP2
- DMF
  - Migrated DMF from Irix to Linux
- Other
7. Discover 2009 Daily Utilization Percentage (chart)
8. Discover Daily Utilization Percentage by Group, May-August 2009 (chart)
- 8/13/09: SCU5 (4,128 cores added)
9. Discover Total CPU Consumption, Past 12 Months (CPU Hours) (chart)
- 9/4/08: SCU3 (2,064 cores added)
- 2/4/09: SCU4 (544 cores added)
- 2/19/09: SCU4 (240 cores added)
- 2/27/09: SCU4 (1,280 cores added)
- 8/13/09: SCU5 (4,128 cores added)
10. Discover Job Analysis, August 2009 (chart)
11. Discover Job Analysis, August 2009 (chart)
12. Discover Availability
Scheduled maintenance, June-August:
- 10 Jun: 17 hrs 5 min, GPFS (Token and Subnets, 3.2.1-12)
- 24 Jun: 12 hours, GPFS (RDMA, multicluster, SCU5 integration)
- 29 Jul: 12 hours, GPFS 3.2.1-13, OFED 1.4, DDN firmware
- 30 Jul: 2 hours 20 minutes, DDN controller replacement
- 19 Aug: 4 hours, NASA AUID transition
Unscheduled outages, June-August:
- 16 Jun: 3 hrs 35 min, nodes out of memory
- 24 Jun: 4 hrs 39 min, maintenance extension
- 6-7 Jul: 4 hrs 18 min, internal switch error
- 13 Jul: 2 hrs 59 min, GPFS error
- 14 Jul: 26 min, nodes out of memory
- 20 Jul: 2 hrs 2 min, GPFS error
- 29 Jul: 55 min, maintenance extension
- 19 Aug: 2 hrs 45 min, maintenance extension
13. Current Issues on Discover: Login Node Hangs
- Symptom: Login nodes become unresponsive.
- Impact: Users cannot log in.
- Status: Developing and testing a solution. The issue arose during critical security patch installation.
14. Current Issues on DMF: Post-Migration Clean-Up
- Symptoms: Various.
- Impact: Various.
- Status: Issues are addressed as they are encountered and reported.
15. Future Enhancements
- Discover Cluster
  - PBS v10
  - Additional storage
  - SLES10 SP2
- Data Portal
  - GDS OPeNDAP performance enhancements
  - Use of GPFS-CNFS for improved NFS mount availability
16. Agenda (section divider; repeats the agenda from slide 2)
17. I/O Study Team
- Dan Kokron
- Bill Putman
- Dan Duffy
- Bill Ward
- Tyler Simon
- Matt Koop
- Harper Pryor
- Building on work by SIVO and GMAO (Brent Swartz)
18. Representative GEOS Output
- Dan Kokron has generated many runs containing data in order to characterize GEOS I/O
- 720-core, quarter-degree GEOS with YOTC-like history:
  - Number of processes that write: 67
  - Total amount of data: 225 GB (written to multiple files)
  - Average write size: 1.7 MB
  - Running in dnb33
  - Using Nehalem cores (GPFS with RDMA)
- Average bandwidth:
  - Timing the entire CFIO calls results in a bandwidth of 3.8 MB/sec
  - Timing just the NetCDF ncvpt calls results in a bandwidth of 44.4 MB/sec
- Why is this so slow?
19. Kernel Benchmarks
- Used the open-source I/O kernel benchmarks xdd and iozone
- Achieved over 1 GB/sec to all the new nobackup file systems
- Wrote two representative one-node C benchmarks (a Python sketch of the plain-write case follows this slide)
  - Using C writes and appending to files
  - Using NetCDF writes with chunking and appending to files
- Ran these benchmarks writing out exactly the same data as process 0 in the GEOS run
  - C writes: average bandwidth of around 900 MB/sec (consistent with the kernel benchmarks)
  - NetCDF writes: average bandwidth of around 600 MB/sec
- Why is GEOS I/O running so slow?
Callouts: C writes, average bandwidth 900 MB/sec; NetCDF writes, average bandwidth 600 MB/sec
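The team's benchmarks were written in C; as a rough, hypothetical stand-in, the sketch below does the same kind of single-node append-and-time measurement in Python. The record shape, write count, and file name are illustrative only, not the actual GEOS history layout.

    import time
    import numpy as np

    # One 2-D record roughly the size of a GEOS history write (~3 MB of 4-byte floats).
    record = np.random.rand(721, 1080).astype('f4')
    n_writes = 100

    start = time.time()
    with open('bench_raw.dat', 'ab') as f:
        for _ in range(n_writes):
            record.tofile(f)   # plain appending writes, analogous to the C benchmark
            f.flush()          # flush Python's buffer after each record
    elapsed = time.time() - start

    total_mb = n_writes * record.nbytes / 1e6
    print('wrote %.0f MB in %.2f s -> %.1f MB/sec' % (total_mb, elapsed, total_mb / elapsed))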
20. Effect of NetCDF Chunking
- How does changing the NetCDF chunk size affect the overall performance?
- The table shows runs varying the chunk size, averaged over 10 runs for each chunk size
- Used the NetCDF kernel benchmark
- The smallest chunk size reproduces the GEOS bandwidth
  - As best as we can tell, this is roughly equivalent to the default chunk size
- The best chunk size turned out to be about the size of the array being written (3 MB)

Chunk size (floats)   Chunk size (KB)   Average bandwidth (MB/sec)
1K                    4                 37
32K                   128               262
128K                  512               492
512K                  2,048             537
1M                    4,096             596
2M                    8,192             497
3M                    12,288            369
6M                    24,576            477
10M                   40,960            327

- References
  - NetCDF-4 Performance Report, Lee et al., June 2008
  - NetCDF online tutorial: http://www.unidata.ucar.edu/software/netcdf/docs_beta/netcdf-tutorial.html
  - "Benchmarking I/O Performance with GEOSdas" and other Modeling Guru posts: https://modelingguru.nasa.gov/clearspace/message/56155615
21. Setting Chunk Size in GEOS
- Dan K. ran several baseline runs to make sure we were measuring things correctly
- Turned on chunking and set the chunk size equal to the write size (1080x721x1x1); see the sketch after the results below
- Dramatic improvement in ncvpt bandwidth
- Why was the last run so slow?
  - Because we had a file system hang during that run

Ncvpt bandwidth (MB/sec) by run:
- Base Line 1 (baseline run with time stamps at each write statement): 44.47
- Base Line 2 (time stamps printed before and after the call to ncvpt): 76.35
- Base Line 3 (time stamp printing moved after the call to ncvpt): 64.69
- Using NetCDF chunking (initial run with NetCDF chunking turned on): 409.87
- NetCDF chunking plus Fortran buffering, run 1 (I/O buffering in the Intel I/O library on top of NetCDF chunking): 421.23
- NetCDF chunking plus Fortran buffering, run 2 (same as the previous run, with very different results): 45.17
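For orientation, the sketch below shows how a chunk size equal to the write size can be requested when a variable is created. It uses the netCDF4 Python module rather than the CFIO/Fortran path GEOS actually uses, and the file and variable names are invented; it only illustrates the setting behind the improvement in the results above.

    import numpy as np
    from netCDF4 import Dataset

    nc = Dataset('history_sketch.nc4', 'w', format='NETCDF4')
    nc.createDimension('time', None)
    nc.createDimension('lat', 721)
    nc.createDimension('lon', 1080)

    # Chunk size set equal to one write: a full 1080 x 721 level at a single time.
    var = nc.createVariable('q', 'f4', ('time', 'lat', 'lon'),
                            chunksizes=(1, 721, 1080))

    level = np.random.rand(721, 1080).astype('f4')
    for t in range(10):
        var[t, :, :] = level   # each write maps onto exactly one chunk
    nc.close()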
22. What next?
- Further explore chunk sizes in NetCDF
  - What is the best chunk size?
  - Do you set the chunk sizes for write performance or for read performance?
  - Once a file has been written with a set chunk size, it cannot be changed without rewriting the file.
- Need to better understand the variability seen in the file system performance
  - It is not uncommon to see a 2x or greater difference in performance from run to run
- Turn the NetCDF kernel benchmark into a multi-node benchmark
  - Use this benchmark for testing system changes and potential new systems
  - Compare performance across NCCS and NAS systems
- Write up results
23. Agenda (section divider; repeats the agenda from slide 2)
24. Ticket Closure Percentiles, 1 March to 31 August 2009 (chart)
25. Issue: Parallel Jobs > 1,500 CPUs
- Original problem: Many jobs would not run at > 1,500 CPUs
- Status at last Forum: Resolved using a different version of the DAPL library
- Current status: Now able to run at 4,000 CPUs using MVAPICH on SCU5
26. Issue: Getting Jobs into Execution
- Long wait for queued jobs before launching
- Reasons:
  - SCALI=TRUE is restrictive
  - Per-user, per-project limits on the number of eligible jobs (use qstat -is)
  - Scheduling policy: first fit on the job list ordered by queue priority and queue time
- User Services will be contacting folks using SCALI=TRUE to assist them in migrating away from this feature
27. Future User Forums
- NCCS User Forum schedule:
  - 8 Dec 2009, 9 Mar 2010, 8 Jun 2010, 14 Sep 2010, and 7 Dec 2010
  - All on Tuesdays
  - All 2:00-3:30 PM
  - All in Building 33, Room H114
- Published:
  - On http://nccs.nasa.gov/
  - On GSFC-CAL-NCCS-Users
28. Agenda (section divider; repeats the agenda from slide 2)
29. Sustained System Performance
- What is the overall system performance?
- Many different benchmarks or peak numbers are available
  - Often unrealistic or not relevant
- SSP refers to a set of benchmarks that evaluates performance as related to real workloads on the system
- SSP concepts originated from NERSC (LBNL)
30. Performance Monitoring
- Not just for evaluating a new system
- Ever wonder if a system change has affected performance?
- Often changes can be subtle and not detected with normal system validation tools
  - Silent corruption
  - Slowness
- Find out immediately instead of after running the application and getting an error
31. Performance Monitoring (cont'd)
- Run real workloads (SSP) to determine performance changes over time
  - Quickly determine if something is broken or slow
  - Perform data verification
- Run automatically on a regular basis as well as after system changes
  - e.g., a change to a compiler or MPI version, or an OS update (a minimal sketch of this kind of tracking follows)
Figure: NERSC SSP example chart
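A minimal sketch of this kind of tracking, assuming a hypothetical benchmark command, log file, and 2x alert threshold; this is not the actual NCCS SSP harness.

    import csv, subprocess, time
    from datetime import datetime

    LOG = 'ssp_history.csv'                  # hypothetical log location
    CMD = ['./ssp_benchmark', '--small']     # hypothetical benchmark command

    start = time.time()
    subprocess.run(CMD, check=True)          # run the fixed workload
    elapsed = time.time() - start

    # Load earlier run times, then append this run with a timestamp.
    history = []
    try:
        with open(LOG) as f:
            history = [float(row[1]) for row in csv.reader(f)]
    except FileNotFoundError:
        pass
    with open(LOG, 'a', newline='') as f:
        csv.writer(f).writerow([datetime.now().isoformat(), '%.1f' % elapsed])

    # Flag runs that drift well outside the historical average.
    if history and elapsed > 2.0 * (sum(history) / len(history)):
        print('WARNING: run took %.1f s, well above the historical average' % elapsed)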
32. Meaningful Measurements
- How you can help:
  - We need your application and a representative dataset for it
  - Ideally it should take 20-30 minutes to run at various processor counts
- Your benefits:
  - Changes to the system that affect your application will be noticed immediately
  - Data will be placed on the NCCS website to show system performance over time
33. Agenda (section divider; repeats the agenda from slide 2)
34. Discover Job Monitor
- All data is presented as a current system snapshot, at 5-minute intervals
- Displays system load as a percentage
- Displays the number of running jobs and cores in use
- Queued jobs and job wait times
- Displays current qstat -a output
- Interactive historical utilization chart
- Message of the day
- Displays average number of cores per job
- Job Monitor
35. Agenda (section divider; repeats the agenda from slide 2)
36. Climate Data Analysis
- Climate models are generating ever-increasing amounts of output data.
- Larger datasets are making it increasingly cumbersome for scientists to perform analyses on their desktop computers.
- Server-side analysis of climate model results is quickly becoming a necessity.
37. Parallelizing Application Scripts
- Many data processing shell scripts can be easily parallelized
  - MatLab, IDL, etc.
- Use task parallelism to process multiple files in parallel (see the sketch after the examples below)
- Each file is processed on a separate core within a single dali node
- Limit load on dali (16 cores per node)
  - Maximum of 10 compute-intensive processes per node

Serial version:
    while ( )
        # process another file
        run.grid.qd.s
    end

Parallel version:
    while ( )
        # process another file in the background
        run.grid.qd.s &
    end
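As a hedged illustration of the same task-parallel pattern in Python, the sketch below fans a list of files out to a pool capped at 10 worker processes, matching the per-node limit above. The input file pattern and the idea of passing each file as an argument to run.grid.qd.s are assumptions, not the actual script interface.

    import glob
    import subprocess
    from multiprocessing import Pool

    def run_grid(path):
        # Placeholder: process one file, here by calling the existing serial script
        # with the file path as an argument (assumed interface).
        subprocess.run(['./run.grid.qd.s', path], check=True)
        return path

    if __name__ == '__main__':
        files = sorted(glob.glob('data/*.nc'))    # hypothetical input files
        with Pool(processes=10) as pool:          # at most 10 compute-intensive processes per node
            for done in pool.imap_unordered(run_grid, files):
                print('finished', done)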
38. ParaView
- Open-source, multi-platform visualization application
  - Developed by Kitware, Inc. (authors of VTK)
  - Designed to process large data sets
  - Built on parallel VTK
- Client-server architecture (a small pvpython sketch follows this list)
  - Client: Qt-based desktop application
  - Data Server: MPI-based parallel application on dali
- Parallel streaming filters for data processing
  - Large library of existing filters
  - Highly extensible using plugins
  - Plugin development required for HDF, NetCDF, OBS data
  - No existing climate-specific tools or algorithms
- Data Server being integrated into ESG
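A small pvpython sketch of the client-server usage, under the assumption that a pvserver is already listening on an analysis node; the host name, data file, and contour level are hypothetical, and the real connection details would come from NCCS documentation.

    # Run with pvpython; connects the client to a remote ParaView data server.
    from paraview.simple import Connect, OpenDataFile, Contour, Show, Render

    Connect('dali01')                      # hypothetical data-server host
    reader = OpenDataFile('fields.vtk')    # hypothetical dataset on the server side
    iso = Contour(Input=reader)            # parallel filter executed on the server
    iso.Isosurfaces = [0.5]                # single contour level, chosen arbitrarily
    Show(iso)
    Render()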
39. ParaView Client
- Qt desktop application that controls data access, processing, analysis, and visualization
40. ParaView Client Features
41. Analysis Workflow Configuration
- Configure a parallel streaming pipeline for data analysis
42. ParaView Applications
- Polar Vortex Breakdown Simulation
- Golevka Asteroid Explosion Simulation
- 3D Rayleigh-Benard problem
- Cross Wind Fire Simulation
43. Climate Data Analysis Toolkit (CDAT)
- Integrated environment for data processing, visualization, and analysis
- Integrates numerous software modules in a Python shell
- Open source with a large, diverse set of contributors
- Analysis environment for ESG, developed at LLNL
44. Data Manipulation
- Exploits NumPy Array and Masked Array (a plain-NumPy sketch follows this list)
- Adds persistent climate metadata
- Exposes NumPy, SciPy, and RPy mathematical operations:
  - Clustering, FFT, image processing, linear algebra, interpolation, max entropy, optimization, signal processing, statistical functions, convolution, sparse matrices, regression, spatial algorithms
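A plain-NumPy sketch of the masked-array behavior CDAT builds on; the values and fill value are made up, and this is not CDAT-specific code.

    import numpy as np

    # A small temperature field with a missing-data fill value of 1.0e20.
    field = np.array([[280.1, 281.4, 1.0e20],
                      [279.8, 1.0e20, 282.0]])

    masked = np.ma.masked_values(field, 1.0e20)   # mask the fill value
    print(masked.mean())                          # mean over valid points only
    print(masked.count(), 'valid of', field.size, 'points')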
45. Grid Support
- Spherical Coordinate Remapping and Interpolation Package (SCRIP)
  - Remapping and interpolation between grids on a sphere
  - Maps between any pair of lat-lon grids
- GridSpec
  - Standard description of earth system model grids
  - To be implemented in the NetCDF CF convention
  - Implemented in CMOR
- MoDAVE
  - Grid visualization
46. Climate Analysis
- Genutil / Cdutil (PCMDI)
  - General utilities for climate data analysis
    - Statistics, array and color manipulation, selection, etc.
  - Climate utilities
    - Time extraction, averages, bounds, interpolation
    - Masking/regridding, region extraction
- PyClimate
  - Toolset for analyzing climate variability (a bare-NumPy sketch of the EOF step follows this list)
  - Empirical Orthogonal Function (EOF) analysis
  - Analysis of coupled data sets
    - Singular Value Decomposition (SVD)
    - Canonical Correlation Analysis (CCA)
  - Linear digital filters
  - Kernel-based probability density function estimation
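A bare-NumPy sketch of the EOF step, using random numbers in place of a real time-by-space anomaly field; PyClimate and related CDAT packages wrap this kind of computation with metadata handling.

    import numpy as np

    # Synthetic anomaly matrix: 120 time steps x 500 grid points, mean removed along time.
    rng = np.random.default_rng(0)
    data = rng.standard_normal((120, 500))
    anomalies = data - data.mean(axis=0)

    # EOFs are the right singular vectors; explained variance comes from the singular values.
    u, s, vt = np.linalg.svd(anomalies, full_matrices=False)
    eofs = vt                              # spatial patterns, one per row
    pcs = u * s                            # principal component time series
    explained = s**2 / np.sum(s**2)
    print('leading EOF explains %.1f%% of the variance' % (100 * explained[0]))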
47. CDAT Climate Diagnostics
- Provides a common environment for climate research
- Uniform diagnostics for model evaluation and comparison
Example diagnostics: Taylor diagram, thermodynamic plot, performance portrait plot, Wheeler-Kiladis analysis
48. Contributed Packages
- PyGrADS (potential)
- AsciiData
- BinaryIO
- ComparisonStatistics
- CssGrid
- DsGrid
- Egenix
- EOF
- EzTemplate
- HDF5Tools
- IOAPITools
- Ipython
- Lmoments
- MSU
- NatGrid
- ORT
- PyLoapi
- PynCl
- RegridPack
- ShGrid
- SP
- SpanLib
- SpherePack
- Trends
- Twisted
- ZonalMeans
- ZopeInterface
49. Visualization
- Visualization and Control System (VCS)
  - Standard CDAT 1D and 2D graphics package
- Integrated contributed 2D packages
  - Xmgrace
  - Matplotlib
  - IaGraph
- Integrated contributed 3D packages
  - ViSUS
  - VTK
  - NcVTK
  - MoDAVE
50. Visual Climate Data Analysis Tools (VCDAT)
- CDAT GUI; facilitates:
  - Data access
  - Data processing and analysis
  - Data visualization
- Accepts Python input
  - Commands and scripts
- Saves state
  - Converts keystrokes to Python
- Online help
51. MoDAVE
- Visualization of mosaic grids
- Parallelized using MPI
- Integration into CDAT in progress
- Developed by Tech-X and LLNL
Figure: cubed-sphere visualization
52. ViSUS in CDAT
- Data streaming application
- Progressive processing and visualization of large scientific datasets
- Future capabilities for petascale dataset streaming
- Simultaneous visualization of multiple (1D, 2D, 3D) data representations
53. VisTrails
- Scientific workflow and provenance management system
- Interface for the next version of CDAT
  - History trees, data pipelines, visualization spreadsheet, provenance capture
54. Agenda (section divider; repeats the agenda from slide 2)
55. Background
- Scientists generate large data files
- Processing the files consists of executing a series of independent tasks
- Ensemble runs of models
- All the tasks are run on one CPU
56. PoDS
- Task parallelism tool that takes advantage of distributed architectures as well as multi-core capabilities
- For running serial, independent tasks across nodes
- Makes no assumptions about the underlying applications to be executed
- Can be ported to other platforms
57. PoDS Features
- Dynamic assessment of resource availability
- Each task is timed
- A summary report is provided
58. Task Assignment
Diagram: tasks (Command 1 through Command 9) from the execution file are distributed across Node 1, Node 2, and Node 3.
59. PoDS Usage
- pods.py -help execFile CpusPerNode
  - execFile: file listing all the independent tasks to be executed
  - CpusPerNode: number of CPUs per node. If not provided, PoDS will automatically use the number of CPUs available on each node.
60. Simple Example
- The application randomly generates an integer n between 0 and 10^9
- It loops over n to perform some basic operations
- Each time the application is called, a different n is obtained. We want to run the application 150 times (a sketch of the execution file follows).
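A hedged sketch of how the 150-task execution file might be generated and handed to PoDS; the application name (random_loops.py) and the log naming are invented for illustration.

    # Write 150 independent task lines, one run of the application per line.
    with open('exec_file.txt', 'w') as f:
        for i in range(150):
            f.write('./random_loops.py > run_%03d.log\n' % i)

    # Then hand the list to PoDS, e.g. with 8 CPUs per node:
    #   pods.py exec_file.txt 8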
61. Timing Numbers
Nodes   Cores/Node   Time (s)
1       1            990
1       2            496
1       4            256
1       8            133
2       1            497
2       2            247
2       4            131
2       8            61
62. More Information
- User's Guide on Modeling Guru:
  - https://modelingguru.nasa.gov/clearspace/docs/DOC-1582
- Package available at:
  - /usr/local/other/pods
63. Agenda (section divider; repeats the agenda from slide 2)
64. Important Contacts
- NCCS Support: support@nccs.nasa.gov, (301) 286-9120
- Analysis Lead: Thomas.Maxwell@nasa.gov, (301) 286-7810
- I/O Improvements: Daniel.Q.Duffy@nasa.gov, (301) 286-8830
- PoDS Info: Jules.Kouatchou-1@nasa.gov, (301) 286-6059
- User Services Lead: William.A.Ward@nasa.gov, (301) 286-2954