Title: The Madeleine Communication Library
1The Madeleine Communication Library
- Olivier Aumage
- LIP, ENS Lyon
Amsterdam March 2002
2Introduction
3Madeleine, PM2
Application
Interface
PM2
DSM-PM2
Iso-malloc
Marcel
Madeleine
Net-Toolbox
Toolbox
8Madeleine, PM2
Application
Interface
PM2
DSM-PM2
Iso-malloc
- Iso-address memory allocator
Marcel
- Thread management
- Thread migration
- SMP support
- Scheduler activations
- Synchronization
- Event polling
Madeleine
- Communication management
- Message-passing paradigm
- Session management
- Generic communication interface
Net-Toolbox
- Auxiliary communication
- TCP support
Toolbox
- Lists
- Hash
- Dyn. arrays
- Arguments
- String
- Fast mem. alloc.
- Macros
10Madeleine
- Generic communication interface
- Network support
- Session management
- Efficiency
- Portability
- Functionalities
- Simplicity
11Objectives
12Objectives
- PM2 multithread environment
- High performance clusters
- Support for RPC-like communications
- Efficiency
- Reactivity
Node 1
Network
Node 2
13Objectives - networks
- Adaptivity
- Multi-paradigm network interfaces
- VIA: message passing, remote DMA
- SCI: shared memory, DMA
- Static-buffer based network interfaces
- SBP
- Multi-mode network interfaces
- BIP: short/long messages
- Exhaustivity
- Multi-protocol support
- Multi-adapter support
14Objectives - multi-clusters
- Multi-cluster exploitation
- Fast intra-cluster links
- Fast inter-cluster links
- Network-level heterogeneity
15Madeleine
16Interface
17Packing - Unpacking
- Commands
- Mad_pack(cnx, buffer, len, pack_mode, unpack_mode)
- Mad_unpack(cnx, buffer, len, pack_mode, unpack_mode)
- Modes
20Send
Send_SAFER
Send_LATER
Send_CHEAPER
Pack
Modification
?
End_packing
Transmitted version
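The question raised by this slide (which version of the data is transmitted when the buffer is modified between Pack and End_packing) can be illustrated with a toy model. This is a minimal sketch, not the Madeleine implementation: every `toy_` name is hypothetical, and the semantics are assumptions: Send_SAFER snapshots the data at pack time, Send_LATER reads it at End_packing, and Send_CHEAPER leaves the choice to the library, so the caller should not modify the buffer in between.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names: a toy model of the send modes, not the Madeleine API. */
typedef enum { TOY_SEND_SAFER, TOY_SEND_LATER, TOY_SEND_CHEAPER } toy_send_mode;

#define TOY_MAX_ITEMS 8
#define TOY_MAX_BYTES 256

typedef struct {
    const void   *user[TOY_MAX_ITEMS];  /* caller's buffers                  */
    size_t        len[TOY_MAX_ITEMS];   /* length of each packed item        */
    size_t        off[TOY_MAX_ITEMS];   /* offset of each item in the wire   */
    toy_send_mode mode[TOY_MAX_ITEMS];
    size_t        count;
    unsigned char wire[TOY_MAX_BYTES];  /* stands for the transmitted version */
    size_t        wire_len;
} toy_message;

void toy_begin_packing(toy_message *m)
{
    m->count = 0;
    m->wire_len = 0;
}

void toy_pack(toy_message *m, const void *buf, size_t len, toy_send_mode mode)
{
    assert(m->count < TOY_MAX_ITEMS && m->wire_len + len <= TOY_MAX_BYTES);
    size_t i = m->count++;
    m->user[i] = buf;
    m->len[i]  = len;
    m->off[i]  = m->wire_len;
    m->mode[i] = mode;
    if (mode == TOY_SEND_SAFER)          /* assumed: snapshot taken right now */
        memcpy(m->wire + m->off[i], buf, len);
    m->wire_len += len;
}

void toy_end_packing(toy_message *m)
{
    /* Assumed: LATER data is read here. In this toy, CHEAPER is read here as
       well, but callers should treat its read time as unspecified. */
    for (size_t i = 0; i < m->count; i++)
        if (m->mode[i] != TOY_SEND_SAFER)
            memcpy(m->wire + m->off[i], m->user[i], m->len[i]);
}
```

Packing a variable with `TOY_SEND_SAFER` and another with `TOY_SEND_LATER`, then modifying both before `toy_end_packing`, leaves the original value of the first and the modified value of the second in `wire`.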
23Receive
Receive_EXPRESS
Receive_CHEAPER
Unpack
After Unpack
Data is available
Data availability???
End_packing
Data is available
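The difference in data availability between the two receive modes can be sketched with a toy receiver. This is an illustration under stated assumptions, not the library's code: the `toy_` names are hypothetical, Receive_EXPRESS is modeled as an immediate copy at Unpack, and Receive_CHEAPER as a copy that may be deferred until End_packing on the receiving side.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names: a toy model of the receive modes, not the Madeleine API. */
typedef enum { TOY_RECV_EXPRESS, TOY_RECV_CHEAPER } toy_recv_mode;

#define TOY_MAX_DEFERRED 8

typedef struct {
    const unsigned char *wire;          /* incoming message bytes          */
    size_t wire_len;
    size_t pos;                         /* read cursor in the message      */
    void  *dst[TOY_MAX_DEFERRED];       /* destinations of deferred copies */
    size_t off[TOY_MAX_DEFERRED];
    size_t len[TOY_MAX_DEFERRED];
    size_t deferred;
} toy_incoming;

void toy_begin_unpacking(toy_incoming *in, const unsigned char *wire, size_t wire_len)
{
    in->wire = wire;
    in->wire_len = wire_len;
    in->pos = 0;
    in->deferred = 0;
}

void toy_unpack(toy_incoming *in, void *buf, size_t len, toy_recv_mode mode)
{
    assert(in->pos + len <= in->wire_len);
    if (mode == TOY_RECV_EXPRESS) {
        /* EXPRESS: the data is usable as soon as unpack returns. */
        memcpy(buf, in->wire + in->pos, len);
    } else {
        /* CHEAPER: only record the request; the copy may happen later. */
        assert(in->deferred < TOY_MAX_DEFERRED);
        size_t i = in->deferred++;
        in->dst[i] = buf;
        in->off[i] = in->pos;
        in->len[i] = len;
    }
    in->pos += len;
}

void toy_end_unpacking(toy_incoming *in)
{
    /* From here on, every unpacked item is guaranteed to be available. */
    for (size_t i = 0; i < in->deferred; i++)
        memcpy(in->dst[i], in->wire + in->off[i], in->len[i]);
    in->deferred = 0;
}
```

A receiver that needs a value to continue unpacking (a length, for instance) would use the EXPRESS mode for it; bulk data can use CHEAPER and be read only after the end call.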
30Example
Send
char *s = "Hello, World !";
int n;
p_mad_connection_t cnx;
cnx = mad_begin_packing(channel, dest);
n = strlen(s) + 1;
mad_pack(cnx, &n, sizeof(int), send_CHEAPER, receive_EXPRESS);
mad_pack(cnx, s, n, send_CHEAPER, receive_CHEAPER);
mad_end_packing(cnx);
Receive
char *s = NULL;
int n;
p_mad_connection_t cnx;
cnx = mad_begin_unpacking(channel);
mad_unpack(cnx, &n, sizeof(int), send_CHEAPER, receive_EXPRESS);
s = malloc(n);
mad_unpack(cnx, s, n, send_CHEAPER, receive_CHEAPER);
mad_end_unpacking(cnx);
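The example sends the length first and in EXPRESS mode because the receiver needs `n` before it can allocate the string buffer. The same length-prefixed layout can be tried without any network, as in the sketch below. It uses a plain byte array in place of a Madeleine connection; `toy_encode_string` and `toy_decode_string` are hypothetical helpers written for this illustration, not library functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Sender side: write n = strlen(s) + 1, then the n bytes of the string.
   Returns the number of bytes written to the wire buffer. */
size_t toy_encode_string(unsigned char *wire, size_t cap, const char *s)
{
    int n = (int)strlen(s) + 1;
    assert(sizeof n + (size_t)n <= cap);
    memcpy(wire, &n, sizeof n);
    memcpy(wire + sizeof n, s, (size_t)n);
    return sizeof n + (size_t)n;
}

/* Receiver side: read n first, allocate n bytes, then read the string.
   Returns a malloc'ed string (caller frees it) or NULL on malformed input. */
char *toy_decode_string(const unsigned char *wire, size_t wire_len)
{
    int n;
    if (wire_len < sizeof n)
        return NULL;
    memcpy(&n, wire, sizeof n);            /* must be known before malloc */
    if (n <= 0 || (size_t)n > wire_len - sizeof n)
        return NULL;
    char *s = malloc((size_t)n);
    if (s == NULL)
        return NULL;
    memcpy(s, wire + sizeof n, (size_t)n);
    s[n - 1] = '\0';                       /* defensive termination */
    return s;
}
```

Encoding "Hello, World !" writes `sizeof(int) + 15` bytes, and decoding those bytes yields an equal string.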
31Madeleine
32Architecture
- Modular approach
- Buffer Management Modules (BMM)
- Transmission Modules (TM)
Interface
BMM
BMM
Buffer management
TM
TM
TM
Network management
Network
33Buffers
- Generic buffer management layer
- Virtual buffers
- Static
- Dynamic
- Buffer groups
- Aggregation
- Splitting
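Aggregation and splitting over static buffers can be pictured with a small sketch: several user buffers are appended into fixed-size buffers, sharing one while there is room (aggregation) and spilling into the next when it is full (splitting). The buffer size, the `toy_static_buffer` type and `toy_append` are assumptions made for the illustration; the slides do not describe the actual data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_STATIC_SIZE 8   /* hypothetical size of one static buffer */

typedef struct {
    unsigned char data[TOY_STATIC_SIZE];
    size_t        used;
} toy_static_buffer;

/* Append len bytes from src to an array of static buffers.
   in_use is the number of buffers already started (the last one may be
   partially filled). Returns the new number of buffers in use. */
size_t toy_append(toy_static_buffer *bufs, size_t nbufs, size_t in_use,
                  const void *src, size_t len)
{
    const unsigned char *p = src;
    while (len > 0) {
        if (in_use == 0 || bufs[in_use - 1].used == TOY_STATIC_SIZE) {
            assert(in_use < nbufs);        /* out of static buffers */
            bufs[in_use].used = 0;         /* start a fresh buffer  */
            in_use++;
        }
        toy_static_buffer *b = &bufs[in_use - 1];
        size_t room = TOY_STATIC_SIZE - b->used;
        size_t n = len < room ? len : room;
        memcpy(b->data + b->used, p, n);   /* fill the remaining room */
        b->used += n;
        p += n;
        len -= n;
    }
    return in_use;
}
```

Appending 3 bytes and then 13 bytes with 8-byte static buffers fills exactly two buffers: the first holds both the short item and the start of the long one.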
34Networks
- Network management layer
- Data transfers
- Send, receive
- Aggregated transfers
- Transmission mode selection
- Selection function
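A selection function of this kind can be sketched as a lookup that maps a packed item to one of the transmission modules of the driver. The sketch below selects on message length only, in the spirit of the short/long message modes mentioned for BIP earlier; the table, the 1024-byte threshold and all `toy_` names are hypothetical, and a real selection function may also take the send and receive modes into account.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of a transmission module (TM). */
typedef struct {
    const char *name;
    size_t      max_len;   /* largest item this TM is meant to carry */
} toy_tm;

/* Assumed table for one network driver: a short-message TM and a
   long-message TM. The 1024-byte threshold is an arbitrary example. */
static const toy_tm toy_tms[] = {
    { "short", 1024 },
    { "long",  (size_t)-1 },
};

/* Selection function: return the first TM able to carry len bytes. */
const toy_tm *toy_select_tm(size_t len)
{
    for (size_t i = 0; i < sizeof toy_tms / sizeof toy_tms[0]; i++)
        if (len <= toy_tms[i].max_len)
            return &toy_tms[i];
    assert(!"no transmission module accepts this length");
    return NULL;
}
```

With this table, `toy_select_tm(16)` returns the short-message module and `toy_select_tm(4096)` the long-message one.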
35Adaptivity
Interface
Pack
Buffer Management
?
Network Management
36Madeleine
37Multiplexing
- Channels
- Network
- Set of nodes
- Set of point to point connections
38Real Channels
- One-to-one mapping to physical networks
- Partially cover the configuration
39Virtual Channels
- Cover the whole configuration
- Built on top of real channels
Virtual
TCP
TCP
40Functioning
- Automatic multi-network forwarding support
- MTU negotiation
- Static routes
- Multi-threaded handling
- Generic approach
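Static routes and MTU negotiation can be combined in a short sketch: each node holds a static table giving the next hop toward every destination, and the MTU usable on a virtual channel route is taken as the smallest MTU of the links crossed. Both the table layout and the "minimum over the route" rule are assumptions made for this illustration; the slides do not specify how the negotiation is done.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NODES 4

/* Static route entry held by a node for a given destination. */
typedef struct {
    int    next_hop;   /* neighbour to forward to                 */
    size_t link_mtu;   /* MTU of the real channel toward next_hop */
} toy_route;

/* Follow the static routes from src to dst and return the smallest
   link MTU met on the way (SIZE_MAX when src == dst). */
size_t toy_path_mtu(toy_route table[TOY_NODES][TOY_NODES], int src, int dst)
{
    size_t mtu = SIZE_MAX;
    int node = src;
    int hops = 0;
    while (node != dst) {
        toy_route *r = &table[node][dst];
        assert(hops < TOY_NODES);                       /* guard against routing loops */
        assert(r->next_hop >= 0 && r->next_hop < TOY_NODES);
        if (r->link_mtu < mtu)
            mtu = r->link_mtu;
        node = r->next_hop;
        hops++;
    }
    return mtu;
}
```

For a route from node 0 to node 2 through gateway 1, with link MTUs of 4096 and 1500 bytes, the function returns 1500.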
41Bandwidth preservation
- A single copy
- Same buffer used for reception and forwarding
LANai
- Pipeline
- Simultaneous receive and send operations
42Integration
- Generic forwarding transmission module
- Limitation of code traversal on gateways
Interface
BMM
BMM
Buffer management
Generic TM
TM
TM
TM
Network management
Network
43Madeleine
44Polling support
- Interaction with the Marcel thread scheduler
- Request aggregation
- Channel level aggregation
- Low level request
- Support is not reentrant
- Polling frequency
- Coarse-grained control
- Timer, yields, idle
- No fine-grained network-specific polling frequency
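Channel-level request aggregation can be sketched as follows: when several threads wait on the same channel, one low-level poll is issued for that channel instead of one per thread. The model is deliberately simple and hypothetical (`toy_request`, `toy_poll_round` and the `channel_ready` array standing for the network state are not Marcel or Madeleine names), and it assumes that one network event completes one waiting request.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_CHANNELS 4

typedef struct {
    int channel;   /* channel the thread is waiting on      */
    int done;      /* set to 1 once the request is complete */
} toy_request;

/* One polling round. channel_ready[c] is non-zero when an event is pending
   on channel c. Returns the number of low-level polls actually issued. */
int toy_poll_round(toy_request *reqs, size_t n, const int *channel_ready)
{
    /* Per channel: 0 = not polled yet, 1 = polled, nothing left,
       2 = polled, one event still to hand out. */
    int state[TOY_CHANNELS] = {0};
    int polls = 0;

    for (size_t i = 0; i < n; i++) {
        if (reqs[i].done)
            continue;
        int ch = reqs[i].channel;
        assert(ch >= 0 && ch < TOY_CHANNELS);
        if (state[ch] == 0) {              /* first waiter: poll the channel once */
            state[ch] = channel_ready[ch] ? 2 : 1;
            polls++;
        }
        if (state[ch] == 2) {              /* hand the event to this request */
            reqs[i].done = 1;
            state[ch] = 1;
        }
    }
    return polls;
}
```

With two requests on channel 0 and one on channel 1, a round issues two low-level polls rather than three.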
45Marcel's polling support
Process
Network
Node
Process
LANai
Marcel
Thread
Process
46Polling support To Do list
- Request handling
- Multi-level polling
- Better forwarding support on gateways
- Polling vs. interruptions
- Automatic switch mechanism between blocking and non-blocking network listening methods
- Polling frequency
- Request priorities
- Higher polling priority for efficient networks
47Madeleine
48Session management
- Startup
- Modular approach
- Flexibility
- Scalability
- Two modules
- Madeleine
- Communication
- Léonie
- Session control
49Léonie
- Sessions
- Multi-cluster configurations
- Unified process spawning
- Grouped process spawning
- Support for network-specific launchers (e.g. bipload)
- Support for optimized process launchers
- Network
- Internal tables building
- Information directory
- Virtual channel routing tables
- Scheduling
- NIC initialization, channel setup
50Configuration Structure
Madeleine
Léonie
51Madeleine
52Myrinet
53Myrinet
54SCI
55SCI
56Multi-cluster
57Conclusion
- The Madeleine communication library
- Portability
- Linux, Solaris, Aix
- x86, Sparc, Alpha, PowerPC
- Heterogeneous cluster support
- Efficiency
- Multi-protocol support
- BIP, SISCI, VIA, SBP, MPI, TCP, UDP
- Dynamic transfer mode selection
- Multi-cluster support
- Automatic message forwarding on gateways
58On-going and future work
- Various improvements
- Dynamicity
- Fault tolerance
- The GRID
- Madeleine and the GRID
59Madeleine
60Madeleine Grid Component
- Multiprotocol communication device for
- Nexus
- MPICH
- Provide cluster-level communication support
- Generic
- Efficient
61Structure
Nexus
Nexus/Madeleine module
Message Passing module
TCP module
Other modules
TCP protocol
MPL protocol
INX protocol
MAD SCI protocol
MAD TCP protocol
Madeleine
MPL Library
INX Library
Sockets
SCI
TCP
62Latency
63Bandwidth
64MPICH/Madeleine
MPI API
Generic part (collective operations, context/group management, ...)
ADI
Generic ADI code, datatype management, request queue management
Protocol Interface
CH_MAD device: inter-node communication, polling loops, eager protocol, rendez-vous protocol
SMP_PLUG device: intra-node communication
CH_SELF device: self communication
Madeleine: heterogeneity management
TCP
SISCI
BIP
Fast-Ethernet
SCI
Myrinet
65Latency
66Bandwidth
67GRID-RMI Project
- Environment for code coupling applications
- French Research Department supported project
- Involves many French research teams
- Madeleine used as the underlying communication layer
68Project Architecture
Simulation Code Coupling
C3D
Plants growing
ProActivePDC
OpenCCM
Do!
PaCO
GK
MPI
DSM Mome
Java VM
CORBA
PadicoTM
Madeleine
Marcel
71Scheduler Activations
- Support for blocking syscalls
- and interrupts
- Vincent Danjean LIP, ENS Lyon
72Interrupts and system
- Scheduler activations [Anderson et al. 91]
- Idea: bidirectional cooperation between two schedulers (user and kernel level)
- User-level scheduler: syscalls
- Kernel-level scheduler: upcalls!
- Upcall
- Tell the application about kernel events
- Activations (virtual processors)
- As many running activations as the number of processors
- Kernel controls creation/destruction of activations
73Idea
- Blocking syscall / 2 processors
81Activation improvements
- Extension of Anderson's model
- Kernel/Application independence
- Activation number not bounded
- Support for every blocking syscall
- Better upcalls management
- new, block, preempt, unblock
- Optimization
- Pool of ready activations in the kernel
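How a user-level scheduler might react to the four upcalls can be sketched with a small state table. This is a toy model under assumptions, not Marcel code: the `toy_` names are hypothetical, a blocked or preempted activation is identified by the user thread it was running, and the `new` upcall simply picks the first ready thread.

```c
#include <assert.h>

/* The four upcalls named on the slide. */
typedef enum {
    TOY_UPCALL_NEW,      /* a fresh activation is offered to the application      */
    TOY_UPCALL_BLOCK,    /* the activation running `thread` blocked in the kernel */
    TOY_UPCALL_PREEMPT,  /* the activation running `thread` was preempted         */
    TOY_UPCALL_UNBLOCK   /* the blocking call of `thread` has completed           */
} toy_upcall;

typedef enum { TOY_READY, TOY_RUNNING, TOY_BLOCKED } toy_state;

#define TOY_THREADS 4

typedef struct {
    toy_state state[TOY_THREADS];   /* state of each user-level thread */
} toy_sched;

/* Handle one upcall. For TOY_UPCALL_NEW the thread argument is ignored and
   the function returns the thread chosen to run on the new activation, or -1
   if none is ready. For the other upcalls it returns -1. */
int toy_handle_upcall(toy_sched *s, toy_upcall up, int thread)
{
    if (up != TOY_UPCALL_NEW)
        assert(thread >= 0 && thread < TOY_THREADS);

    switch (up) {
    case TOY_UPCALL_BLOCK:
        s->state[thread] = TOY_BLOCKED;
        return -1;
    case TOY_UPCALL_PREEMPT:
    case TOY_UPCALL_UNBLOCK:
        s->state[thread] = TOY_READY;    /* runnable again, waiting for a processor */
        return -1;
    case TOY_UPCALL_NEW:
        for (int t = 0; t < TOY_THREADS; t++) {
            if (s->state[t] == TOY_READY) {
                s->state[t] = TOY_RUNNING;
                return t;
            }
        }
        return -1;
    }
    return -1;
}
```

In the two-processor scenario of the earlier slides, a `block` upcall for one running thread followed by a `new` upcall lets a ready thread take over the processor, and a later `unblock` makes the first thread ready again.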
83Aggregation
Flush
Flush
TM1
TM1
TM2
84Aggregation
Main
TM 1
TM 2
85Symmetry
Flush
Flush
Flush
Send
Receive
86Symmetry modes
Flush
Flush
Flush
Send
Receive
Flush
Flush
Flush
Flush
87Special cases
- Send_LATER / Receive_CHEAPER
Main
Pack
Pack
TM 1
TM 2
88Special cases
- Send_LATER / Receive_EXPRESS
- Actual send is delayed until the call of
End_packing