Title: Cplant™ Release 0.5 Update
1. Cplant™ Release 0.5 Update
- Rolf Riesen, et al.
- April 4, 2000
2. Highlights
- Bug fixes
- Meminfo() to get at /proc/meminfo
- New PBS policy (already on Alaska)
- Portals 3.0
- More (will send list when Release comes out)
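The highlights mention a Meminfo() call for getting at /proc/meminfo, but the slides do not show its signature or return type. As a rough illustration of what reading that file involves on Linux, here is a minimal sketch; `parse_meminfo` and `read_meminfo` are hypothetical helper names, not the Cplant call, and the sample text is made up.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a {field: integer value} dict.

    Hypothetical helper, not the Cplant Meminfo() call. Values are kept in
    the units the file reports (usually kB). Lines that are not of the form
    "Key: <number> ..." are skipped.
    """
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if not sep or not fields or not fields[0].isdigit():
            continue
        info[key.strip()] = int(fields[0])
    return info


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the file at path (only meaningful on Linux)."""
    with open(path) as f:
        return parse_meminfo(f.read())


# Made-up sample so the sketch can run on any system.
SAMPLE = "MemTotal:  262144 kB\nMemFree:  131072 kB\nHugePages_Total:  0\n"
parsed = parse_meminfo(SAMPLE)
print(parsed["MemFree"])
```

On a Linux node, `read_meminfo()` would return the live values instead of the sample.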
3. Portals 3.0
- Data movement layer from SUNMOS, Puma, and Tflops
- 3.0 is a complete redesign to make it suitable for full-featured OSs (such as Linux) and COTS hardware
- API intended for library writers, not application programmers
- Tech report SAND99-2959
4. Portals 3.0 Release 0.5
- We're behind; due date was March 17
- 3.0 is already better than 2.0, but some known bugs remain
- Want to release it to first few test users without any known bugs
- Fix whatever test users uncover, then release
- First release will be on Siberia: 45 µs, 100 MB/s under MPI
- Fixes problems seen under 2.0 on Alaska
5. Portals 3.0 Phase 1 (Release 0.5)
[Figure: software stack. Boxes: Application; Runtime environment (yod, pct, etc.); MPI Library; Std I/O Library; P3 Module; Old P2 Module; RTS/CTS Module; MCP on Myrinet card. Regions labeled: User space, Kernel space.]
6. Portals 3.0 Phase 2
[Figure: software stack. Boxes: Application; Runtime environment (yod, pct, etc.); MPI Library; Std I/O Library; P3 Module; RTS/CTS Module; MCP on Myrinet card. Regions labeled: User space, Kernel space.]
7. Portals 3.0 Phase 3
[Figure: software stack. Boxes: Application; Runtime environment (yod, pct, etc.); MPI Library; Std I/O Library; Cplant Module; P3 on Myrinet card. Regions labeled: User space, Kernel space.]
8. Interface Concepts
- One-sided operations
- Put and Get
- Zero copy message passing
- Increased bandwidth
- OS Bypass
- Reduced latency
- Application Bypass
- No polling, no threads
- Reduced processor utilization
- Reduced software complexity
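To make the one-sided Put and Get idea concrete, the following toy model keeps everything in one process: a target exposes a region of memory, and an initiator writes into it or reads from it without the target calling any receive function. All names here (`ExposedRegion`, `put`, `get`) are hypothetical and are not the Portals 3.0 API. The sketch models control flow only, not zero copy or OS bypass.

```python
class ExposedRegion:
    """Memory that a target process exposes for remote access (toy model)."""

    def __init__(self, size):
        self.buf = bytearray(size)
        self.events = []  # completion records the owner may inspect later


def put(region, offset, data):
    """Initiator-side write into the target's region; the target runs no code."""
    if offset < 0 or offset + len(data) > len(region.buf):
        raise ValueError("put outside the exposed region")
    region.buf[offset:offset + len(data)] = data
    region.events.append(("put", offset, len(data)))


def get(region, offset, length):
    """Initiator-side read from the target's region."""
    if offset < 0 or length < 0 or offset + length > len(region.buf):
        raise ValueError("get outside the exposed region")
    region.events.append(("get", offset, length))
    return bytes(region.buf[offset:offset + length])


target = ExposedRegion(16)
put(target, 4, b"ping")    # no matching receive call on the target side
reply = get(target, 4, 4)  # read the same bytes back
print(reply, target.events)
```

The target only learns about the transfers by looking at `events` afterwards, which is the sense in which no polling loop or helper thread is needed in this model.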
9. MPP Network: Paragon and Tflops
Network interface is on the memory bus
[Figure labels: Network; Memory; Memory Bus; Processor; Processor; Message passing or computational co-processor.]
10. Commodity: Myrinet
Network is far from the memory
[Figure labels: Processor; Memory; Memory Bus; Bridge; PCI Bus; NIC; Network; OS Bypass.]
11. Must Requirements
- Common protocols (MPI, system protocols, I/O)
- Portability
- Scalability to 1000s of nodes
- High-performance
- Multiple process access
- Heterogeneous processes (binaries)
- Runtime independence
- Memory protection
- Reliable message delivery
- Pairwise message ordering
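Pairwise message ordering means that messages from one sender to one receiver are delivered in the order they were sent, with no ordering promised between different senders. The slides do not say how Cplant achieves this; one common approach, sketched here under that assumption, is a per-sender sequence number checked at the receiver. The class and method names are hypothetical.

```python
from collections import defaultdict


class OrderedReceiver:
    """Delivers each sender's messages in sequence order; drops duplicates."""

    def __init__(self):
        self.next_seq = defaultdict(int)   # next expected sequence number per sender
        self.pending = defaultdict(dict)   # out-of-order arrivals per sender
        self.delivered = []                # (sender, payload) in delivery order

    def receive(self, sender, seq, payload):
        if seq < self.next_seq[sender] or seq in self.pending[sender]:
            return  # duplicate, for example a retransmission
        self.pending[sender][seq] = payload
        # Hand over every message that is now in order for this sender.
        while self.next_seq[sender] in self.pending[sender]:
            s = self.next_seq[sender]
            self.delivered.append((sender, self.pending[sender].pop(s)))
            self.next_seq[sender] = s + 1


rx = OrderedReceiver()
rx.receive("a", 1, "a1")  # arrives early, held back
rx.receive("b", 0, "b0")  # other senders are not blocked
rx.receive("a", 0, "a0")  # releases a0 and then a1
rx.receive("a", 0, "a0")  # duplicate, ignored
print(rx.delivered)
```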
12. Will Requirements
- Operational API
- Zero-copy MPI
- Myrinet
- Sockets implementation
- Unrestricted message size
- OS Bypass, Application Bypass
- Put/Get
13. Will Requirements (continued)
- Packetized implementations
- Receive uses start and length
- Receiver managed
- Sender managed
- Gateways
- Asynchronous operations
- Threads
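A packetized implementation in which the receive side works from a start and a length can be pictured as follows: each packet carries the offset at which its payload belongs, so the receiver can place packets directly into the destination buffer in any arrival order. This is a sketch under that reading of the bullets; the function names and the tuple layout are assumptions, not part of the Portals specification.

```python
def packetize(message, packet_size):
    """Split message into (start, length, chunk) packets of at most packet_size bytes."""
    packets = []
    for start in range(0, len(message), packet_size):
        chunk = message[start:start + packet_size]
        packets.append((start, len(chunk), chunk))
    return packets


def reassemble(packets, total_length):
    """Place each chunk at its start offset; arrival order does not matter."""
    buf = bytearray(total_length)
    for start, length, chunk in packets:
        buf[start:start + length] = chunk
    return bytes(buf)


msg = b"portals-3.0-message"
pkts = packetize(msg, 4)
# Deliver the packets in reverse order to show that placement uses start/length only.
rebuilt = reassemble(reversed(pkts), len(msg))
print(len(pkts), rebuilt == msg)
```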
14. Should Requirements
- No message alignment restrictions
- Striping over multiple channels
- Socket API
- Implement on ST
- Implement on VIA
- No consistency/coherency
- Ease of use
- Topology information
15. Portal Addressing
[Figure labels: Portal Table; Match List; Memory Descriptors; Memory Region; Event Queue. Regions: Portal API Space, Application Space; Operational Boundary.]
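The addressing structures named on this slide can be read as a lookup chain: a portal table index selects a match list, the first matching entry selects a memory descriptor, the descriptor names a memory region in application space, and an event is logged to an event queue. The sketch below models that chain in plain Python. The class and field names are hypothetical, and the match-bits/ignore-bits comparison is an assumption about the matching rule that the slide itself does not state.

```python
from dataclasses import dataclass


@dataclass
class MemoryDescriptor:
    region: bytearray  # the application's memory region
    events: list       # event queue this descriptor reports to


@dataclass
class MatchEntry:
    match_bits: int
    ignore_bits: int   # bit positions that are not compared (assumed rule)
    md: MemoryDescriptor

    def matches(self, bits):
        return ((bits ^ self.match_bits) & ~self.ignore_bits) == 0


def deliver(portal_table, index, bits, data):
    """Walk the match list at portal_table[index]; deposit data on the first match."""
    for entry in portal_table[index]:
        if entry.matches(bits) and len(data) <= len(entry.md.region):
            entry.md.region[:len(data)] = data
            entry.md.events.append(("put", index, bits, len(data)))
            return True
    return False  # in this toy model, unmatched messages are dropped


eq = []
small = MemoryDescriptor(bytearray(4), eq)
large = MemoryDescriptor(bytearray(32), eq)
table = {0: [MatchEntry(0x10, 0x00, small), MatchEntry(0x20, 0x0F, large)]}

accepted = deliver(table, 0, 0x27, b"hello")  # low 4 bits ignored by the second entry
rejected = deliver(table, 0, 0x30, b"x")      # matches neither entry
print(accepted, rejected, eq)
```

Because the whole walk happens on the receiving side of the operational boundary, the application only sees the filled region and the logged event.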
16. Portals 3.0 Status
- Currently testing Cplant Release 0.5
- Portals 3.0 kernel module using the RTS/CTS module over Myrinet
- Port of MPICH 1.2.0 over Portals 3.0
- TCP/IP reference implementation ready
- Design of port to LANai begun
17. http://www.cs.sandia.gov/cplant