Title: NERSC 3rd Party IPI-3 Project
1 NERSC 3rd Party IPI-3 Project
Michael K. Gleicher, Gleicher Enterprises, L.L.C.
2 Topics
- Project Phases
- HPSS (Server) mods
- HSI I/O Rewrite
- Results
- New Capabilities
- Issues
- Implementation
- Summary
3 Project Phases
- Requirements phase
- Feasibility study
- Code/test
- Pre-production phase
- Stress test
- Production
4 Background
- Requirements for high-speed transfer rates to/from HPSS (10s - 100s of MB/s)
- 3rd party IPI-3 capability existed in Unicos for C90 (used with NSL UniTree)
- HPSS supports 3rd party IPI-3 transfers
- but
- No Unicos support for IPI-3 3rd party on T3E/J90/T90
5 Project Goals
- High Speed File Transfers between Crays and HPSS
- IPI-3 3rd Party Transfers Directly Between Max Strat Disk Drives and Unicos system (ethernet/fddi control, HIPPI data transfer)
- No Unicos Kernel Changes
6 Project Goals (2)
- Develop Library to Provide Multiplexed User-Level Access to HIPPI Device(s)
- Minimal changes to HSI/PFTP to use new library
- Minimal changes to HPSS
- Integrate Code into HPSS Baseline
7 Software Architecture
8 Current Non-DCE HSI I/O Architecture - Overview
(Diagram labels: HSI Client; hpss_Read, hpss_Write; Non-DCE Client Library; tcp/ip transfers)
- serial process
- no parallel transfers
- no shared memory (AIX)
- no IPI-3 capability
- doesn't use hpss_netopt config file
9 Current Non-DCE HSI I/O Data Flow Overview
(Diagram labels: control ethernet; IPI-3; "read"; tcp/ip; alternate tcp/ip path, e.g., fddi; HPSS (tcp/ip))
10 New and Improved HSI I/O Architecture - Overview
(Diagram labels: sockets, shared memory, IPI-3; HSI Client; hpss_Read, hpss_Write; hpss_ReadList, hpss_WriteList; piped files; socket)
- multi-threaded I/O
- parallel transfers fully and automatically supported
- socket transfers
- shared memory (AIX)
- IPI-3 capability (AIX, Unicos)
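As a rough illustration of what a multi-threaded, parallel transfer involves, the sketch below copies a local file with several worker threads, each handling every Nth block. It is not HSI code and does not use the HPSS mover protocol. The names `parallel_copy`, `stripe_width`, and `block`, and the 64K block size, are assumptions chosen for the example.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def parallel_copy(src_path, dst_path, stripe_width=4, block=64 * 1024):
    """Copy src_path to dst_path using stripe_width threads; return bytes copied."""
    size = os.path.getsize(src_path)
    src_fd = os.open(src_path, os.O_RDONLY)
    dst_fd = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        def copy_stripe(index):
            # Worker `index` handles blocks index, index + stripe_width, ...
            copied = 0
            for offset in range(index * block, size, stripe_width * block):
                data = os.pread(src_fd, block, offset)   # positional read, no shared seek pointer
                os.pwrite(dst_fd, data, offset)          # sketch: short writes are not retried
                copied += len(data)
            return copied

        with ThreadPoolExecutor(max_workers=stripe_width) as pool:
            return sum(pool.map(copy_stripe, range(stripe_width)))
    finally:
        os.close(src_fd)
        os.close(dst_fd)


# Small demonstration on temporary files.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "src.bin")
    dst = os.path.join(tmp, "dst.bin")
    payload = bytes(range(256)) * 1000
    with open(src, "wb") as f:
        f.write(payload)
    copied = parallel_copy(src, dst)
    with open(dst, "rb") as f:
        result = f.read()
```

Positional reads and writes let the workers share one descriptor per file without coordinating a seek position, which is why each stripe can proceed independently.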
11 New Non-DCE HSI I/O Data Flow Overview
(Diagram labels: control ethernet; IPI-3; partial blocks; "read"; tcp/ip; HPSS (tcp/ip))
12 Current HSI I/O Architecture
- carried forward from UTI NSL UniTree
- simple read/check/write loop to copy file(s)
- fixed-size I/O buffer
- no partial copies
- no prepositioning
- not double-buffered
(Diagram labels: DATA; I/O BUFFER; DATA)
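The simple read/check/write loop with a fixed-size buffer described on this slide can be sketched roughly as follows. This is an illustration only, not the HSI source. The function name `copy_stream` and the 64K buffer size are assumptions, since the slides do not give the actual buffer size.

```python
import io

BUFFER_SIZE = 64 * 1024  # hypothetical fixed buffer size, not taken from the slides


def copy_stream(src, dst, buffer_size=BUFFER_SIZE):
    """Copy src to dst one buffer at a time; return the number of bytes copied."""
    total = 0
    while True:
        chunk = src.read(buffer_size)        # read
        if not chunk:                        # check: end of data
            break
        written = dst.write(chunk)           # write
        if written != len(chunk):            # check: short write
            raise IOError("short write")
        total += written
    return total


src = io.BytesIO(b"x" * 200_000)
dst = io.BytesIO()
copied = copy_stream(src, dst)
```

Because the loop is single-buffered, the read and the write never overlap: the source sits idle during each write and the destination sits idle during each read.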
13 HSI New I/O Architecture Overview
I/O BUFFER (shared memory)
I/O BUFFER (IPI-3 memory)
Parallel I/O (default)
Serial I/O (e.g., piped files)
- multi-threaded (1 process vs. ftp fork/exec)
- automatic stripe width (user can override)
- auto buffer size (user can override)
- local memory, shared memory (AIX)
- IPI-3 memory (Unicos)
- multiple memory types may be used per xfer
- standard HPSS mover protocol
- multi-threaded
- double-buffered
- automatically selected for piped files
- local memory (malloc) used for I/O buffer
Prepositioning and partial transfers coded, not tested
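The multi-threaded, double-buffered serial path listed above can be sketched as a reader thread that fills the next buffer while the main thread writes the previous one. This is a minimal sketch under stated assumptions, not the HSI implementation. The name `double_buffered_copy` and the buffer size are hypothetical.

```python
import io
import queue
import threading


def double_buffered_copy(src, dst, buffer_size=64 * 1024):
    """Overlap reads and writes using a reader thread and a bounded queue."""
    filled = queue.Queue(maxsize=1)   # bounded so the reader runs only slightly ahead
    errors = []

    def reader():
        try:
            while True:
                chunk = src.read(buffer_size)
                filled.put(chunk)         # an empty chunk marks end of data
                if not chunk:
                    return
        except Exception as exc:          # pass read errors to the writer side
            errors.append(exc)
            filled.put(b"")

    # daemon=True so a failed write cannot leave the program hanging on exit
    thread = threading.Thread(target=reader, daemon=True)
    thread.start()

    total = 0
    while True:
        chunk = filled.get()
        if not chunk:
            break
        dst.write(chunk)                  # overlaps with the reader's next read
        total += len(chunk)
    thread.join()
    if errors:
        raise errors[0]
    return total


source = io.BytesIO(bytes(range(256)) * 1200)
sink = io.BytesIO()
moved = double_buffered_copy(source, sink)
```

The bounded queue is the design choice that matters here: it caps memory use while still letting a read and a write be in progress at the same time.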
14 Unicos 3rd Party I/O Components
Application (HSI, PFTP, ...)
Mover
All HPSS I/O
non-DCE client API library
HPSS message-passing interface
AIX IPI-3 library
IPI-3 protocol message encode/decode, I/O coordination
IPI-3 slave I/O interface
ULP Library
Upper Layer Protocol reservation
AIX HIPPI Driver (OS)
IPI-3 master/slave transport
HIPPI FP interface
IPI-3 protocol message encode/decode, and I/O coordination
Unicos IPI-3 library
Unicos master IPI-3 thread
Transfer Notification Handler, HIPPI I/O
15 IPI-3 Slave I/O - Overview
Max Strat
64K data blocks
HSI
64K data block
MOVER
partial 1st and/or last block (< 64K)
- Slave capability provides 3rd party transfers for non-Max Strat devices (SCSI disk/tape, SSA disk, etc.)
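The division of a transfer into whole 64K blocks plus a partial first and/or last block can be sketched as below. The slide gives only the 64K block size and the fact that partial blocks are handled separately. The assumption that blocks are aligned to 64K boundaries of the file offset, and the name `split_transfer`, are illustrative.

```python
BLOCK = 64 * 1024  # block size from the slide


def split_transfer(offset, length, block=BLOCK):
    """Split [offset, offset + length) into (partial, full) lists of (offset, length).

    Assumes blocks are aligned to multiples of `block` in file-offset space.
    """
    end = offset + length
    first_aligned = -(-offset // block) * block   # round offset up to a boundary
    last_aligned = (end // block) * block         # round end down to a boundary

    if first_aligned >= last_aligned:
        # No whole aligned block fits: the entire range is one partial piece.
        partial = [(offset, length)] if length else []
        return partial, []

    partial = []
    if offset < first_aligned:
        partial.append((offset, first_aligned - offset))     # partial first block
    if last_aligned < end:
        partial.append((last_aligned, end - last_aligned))   # partial last block
    full = [(o, block) for o in range(first_aligned, last_aligned, block)]
    return partial, full


partial, full = split_transfer(1000, 200_000)
```

For this example range, `partial` holds the leading and trailing pieces and `full` holds the two aligned 64K blocks in between.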
16 HIPPI Switch Source Routing Issue as a Result of Alternate FP Header
(Diagram labels: logical addr 16; logical addr 26; logical addr 36; Cray; port 1; port 3; port 4; port 5; port 7; port B; port C)
- slave IPI transfers not possible with this example configuration
- solution: use logical routing
Mover/Max Strat must have the same switch path to the client
- Mover -> Max Strat: 43B
- Max Strat -> Mover: C17
- Max Strat -> Cray: 5
- Mover -> Cray: 53B
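The constraint that the Mover and the Max Strat must share the same switch path to the client can be checked mechanically against the routes listed on this slide. The sketch below treats each route as an opaque string and only compares them. Reading the strings as per-hop switch port digits is an assumption, and `same_path_to_client` is a hypothetical helper.

```python
# Source routes as listed on the slide (assumed to be per-hop switch port digits).
ROUTES = {
    ("mover", "maxstrat"): "43B",
    ("maxstrat", "mover"): "C17",
    ("maxstrat", "cray"): "5",
    ("mover", "cray"): "53B",
}


def same_path_to_client(routes, client="cray", senders=("mover", "maxstrat")):
    """Return True if every sender uses an identical route to the client."""
    paths = {routes[(sender, client)] for sender in senders}
    return len(paths) == 1


slave_ok = same_path_to_client(ROUTES)
```

With the example configuration the two routes to the Cray differ ("53B" vs. "5"), so the check fails, which matches the slide's conclusion that slave IPI transfers are not possible here with source routing.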
17 Potential Max Strat Problem on Complex Writes if Alt. FP Header Used
(Diagram labels: MOVER; D2 data burst(s); 1st burst (ignored); cmd ref.; inbound HIPPI channel; Unicos Client (HSI); Transfer Notification Response; Alt. FP Header; outbound HIPPI channel)
Question: Can a race condition occur that could corrupt data?
18 Some Test Results
19-28 (Test result slides; no transcript)
29 Issues
- zeroed out I-field source address (Essential switch problem? Unicos problem?)
- swapout (?) causes IPI-3 transfers to fail
30 Potential Next Steps
- Isolate, repair problems
- stress tests
- phase into production
31 Summary of New Capabilities
- New multithreaded HSI I/O with direct mover transfers. No store/forward for non-piped I/O
- Administrative controls to easily and dynamically enable/disable reads and/or writes
- Other HSI enhancements, e.g. interactive tar
- Network options fully supported
- 3rd party IPI-3 for Max Strats, supported on Unicos and AIX
- 3rd party IPI-3 for HIPPI-attached non-MaxStrat movers
- HPSS baseline mods for non-AIX IPI-3
32 HPSS Baseline vs. Local Mods
- Initial informal code review conducted