Title: Service Differentiation and Grids
1Service Differentiation and Grids
- Pascale Vicat-Blanc Primet
- Benjamin Gaidioz, Pierre Billiau, François Echantillac, Mathieu Goutelle, Fabien Chanussot
- INRIA - RESO
- LIP Laboratory
- Ecole Normale Supérieure de Lyon
- France
- Pascale.primet_at_inria.fr
2Outline
- Requirements for E2E Service Differentiation in Grids
- The EDS approach (DataTAG project)
- The QoSinus approach (e-Toile/VTHD project)
- Conclusion
3Typology of Grid flows
- Applications flows
- Input Output data
- Inter-process communication messages (MPI, DSM, synchronization)
- Code coupling
- Interactions
- Visualizations
- Voice/Video in collaborative environments
- Control flows
- Grid environment deployment
- Applications deployment
- Control and Management of the Grid (middleware)
- Monitoring, scheduling, loading, reporting, alarms
- All these flows share the same network resource and the same bottlenecks
4Example e-toile Infrastructure
[Figure: map of the e-Toile infrastructure, labelled Europe / US, with an experimental testbed and a production testbed interconnected by VTHD (2.5 to 10 Gb/s). Sites shown: ID-IMAG Grenoble, SUN Grenoble, IRISA Rennes, CERN, CEA Saclay, ENS Lyon, PRiSM Versailles, EDF Clamart, IBCP. Access links are labelled 1 Gb/s or 2 Gb/s. Equipment labels: 12 dual-processor PCs; 250 PCs in a cluster; 8-processor server; 8 x 2 PCs linked by SCI; dual-processor MP760 server; 1 PC cluster; SMP machine; 16 PowerPCs linked by Myrinet; active routers; 1 Myrinet cluster of 10 PCs and 1 cluster of 8 PCs; IBP data depot service; server, 3 dual-processor; 16 Sun Cobalt.]
5Grid Flows characteristics
- Mice, elephants, hares and tortoises
- Throughput
- Rates span more than 9 orders of magnitude
- From a few bytes for interactive or control traffic
- To petabytes for bulk data transfer
- Delay
- Very heterogeneous needs
- Some applications are very sensitive to latency (MPI, visualization)
- Bulk data transfer delays have to be controlled
- Reliability
- Generally reliable (hence TCP), but some applications are loss tolerant (Astro)
- Communication models
- Point to point, point to multipoint, multipoint to point, multipoint to multipoint
- Collective operations, synchronization barriers...
6Medical Images processing Pipeline
Tagged MRI sequences: from 20 MB to 2 GB/frame
1. Tags and myocardium automatic extraction
2. Motion estimation
3. Quantification
7How to control the performances?
- Packet level (Network QoS)
- 1 to 100 ms
- Mechanisms: classifiers, markers and conditioners (routers)
- Models: IntServ, DiffServ, Core-Stateless, Proportional, EDS
- Round trip time level (E2E QoS)
- 1 to 100 ms
- Congestion control and flow control (TCP, TFRC)
- Session level
- Seconds, minutes or hours
- Admission control, resource reservation (RSVP), routing
- Load sharing, MPLS-TE, BoD
- Long term
- Days, months...
- Provisioning, planning, loD
8Explored Approaches (INRIA RESO)
- Grids really need end-to-end QoS (from bulk transfer to MPI and visualization)
- Packet differentiation is already available in IP equipment
- PQ, WFQ, CBQ, WRR, RED, WRED
- Lots of issues with IntServ and DiffServ
- Service differentiation at transport level
- Two approaches have been explored at INRIA
- End to end: DataTAG (assumption: the bottleneck is in the access network or LAN)
- Relative differentiated forwarding of IP packets
- Each connection manages its individual QoS
- The end protocol has to be adapted (Slow Start or AIMD)
- Edge to edge: e-Toile (assumption: the bottleneck is in the WAN)
- An independent API, defined and integrated in the middleware, to specify session QoS goals
- QoSINUS as a Grid network service
- Interacts with the Grid measurement infrastructure
9EDS approach
10Equivalent Differentiated Service Model
- Goal: share the network resources (bottleneck) and control the E2E performance according to application-specific requirements
- => delay sensitive / loss sensitive / rate sensitive
- Constraints: new PHB at IP level
- Differentiated forwarding services without pricing
- No admission control required
- PHB definition restricted to local parameters (no layer violation)
- The transport layer has to integrate some adaptation mechanisms to contribute to end-to-end performance control
11Equivalent Differentiated Services
- Proportionality
- Asymmetry (cf ABE)
12Equivalent Differentiated Services
- The EDS model defines an arbitrary number N of classes.
- Differentiation applies to delay and loss rate for each class.
- A class i gets a delay coefficient d_i and a loss rate coefficient l_i.
- These coefficients are constants.
- Let i and j be two classes: the router schedules and drops their packets so that there is a ratio d_i/d_j between their local queuing delays and a ratio l_i/l_j between their local loss rates.
- In order to avoid having privileged classes, the coefficients are set so that
- if d_i < d_j then l_i > l_j
- or
- if d_i > d_j then l_i < l_j
- for all i in 1..N and j in 1..N
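The coefficient rule above can be checked mechanically. The following is a minimal sketch, not the authors' implementation: `is_equivalent`, `target_ratio` and `pick_class_wtp` are hypothetical names. The last function uses waiting-time priority (WTP), a scheduler known from the proportional differentiation literature, only to illustrate how a delay ratio d_i/d_j can be approached locally; the slides do not say which scheduler EDS routers actually use.

```python
from itertools import combinations


def is_equivalent(delay_coefs, loss_coefs):
    """Return True if no class is privileged.

    For every pair of distinct classes, a strictly smaller delay
    coefficient must come with a strictly larger loss coefficient.
    """
    for i, j in combinations(range(len(delay_coefs)), 2):
        delay_diff = delay_coefs[i] - delay_coefs[j]
        loss_diff = loss_coefs[i] - loss_coefs[j]
        # The two differences must have opposite signs.
        if delay_diff * loss_diff >= 0:
            return False
    return True


def target_ratio(coefs, i, j):
    """Ratio the router aims for between classes i and j (d_i/d_j or l_i/l_j)."""
    return coefs[i] / coefs[j]


def pick_class_wtp(head_waits, delay_coefs):
    """Pick the class to serve next with waiting-time priority (assumption).

    head_waits[c] is the waiting time of the head-of-line packet of
    class c, or None if that queue is empty. The class with the largest
    wait / d_c is served, so classes with a small d_c are served sooner.
    """
    best, best_priority = None, float("-inf")
    for cls, (wait, coef) in enumerate(zip(head_waits, delay_coefs)):
        if wait is None:
            continue
        priority = wait / coef
        if priority > best_priority:
            best, best_priority = cls, priority
    return best
```

For example, delay coefficients (1, 2, 4) paired with loss coefficients (4, 2, 1) satisfy the rule, whereas (1, 2) paired with (1, 2) would make class 1 better on both axes and is rejected.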
13Adaptive Packet Marking simple algorithm
[Figure: loss and delay plotted against time t, showing the selected class together with the delay constraint and the loss constraint.]
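A minimal sketch of what such an adaptive marking rule could look like. Everything here is an assumption made for illustration: the slides give no pseudocode, `next_class` is a hypothetical name, and classes are assumed to be indexed from 0 (lowest delay, highest loss) to N-1 (highest delay, lowest loss), which matches the asymmetry of the EDS coefficients.

```python
def next_class(current, num_classes, delay, loss, max_delay, max_loss):
    """Choose the class used to mark the next packets (illustrative only).

    current      -- index of the class in use, 0 .. num_classes - 1
    delay, loss  -- latest measurements for the flow
    max_delay    -- delay constraint of the application
    max_loss     -- loss constraint of the application
    """
    delay_violated = delay > max_delay
    loss_violated = loss > max_loss

    # Delay constraint broken, loss still fine: trade loss for delay.
    if delay_violated and not loss_violated and current > 0:
        return current - 1
    # Loss constraint broken, delay still fine: trade delay for loss.
    if loss_violated and not delay_violated and current < num_classes - 1:
        return current + 1
    # Both satisfied, both violated, or no class left in that direction.
    return current
```

When both constraints are violated at once, this sketch keeps the current class, since moving in either direction would worsen one of the two metrics.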
14AIMD EDS packet marking principle
15Validation
- EDS layer 3 has been implemented in NS and in the Linux QoS kernel
- EDS layer 4 has been implemented in SCTP via an adaptation of the AIMD algorithm, in NS and in the Linux kernel, and tested on a local emulated platform (NistNet) and on the DataTAG link
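For reference, the baseline AIMD (additive increase, multiplicative decrease) rule that the layer 4 work adapts can be sketched as follows. This is plain textbook AIMD, not the EDS variant: the slides do not describe how the SCTP implementation modifies it, and the function names and default parameters below are assumptions.

```python
def aimd_step(cwnd, loss, increase=1.0, decrease=0.5, min_cwnd=1.0):
    """One AIMD update of the congestion window, in packets per RTT.

    Without loss the window grows by `increase`; on loss it is
    multiplied by `decrease`, never dropping below `min_cwnd`.
    """
    if loss:
        return max(min_cwnd, cwnd * decrease)
    return cwnd + increase


def run_aimd(loss_events, cwnd=1.0):
    """Apply aimd_step once per RTT and return the window after each RTT."""
    trace = []
    for loss in loss_events:
        cwnd = aimd_step(cwnd, loss)
        trace.append(cwnd)
    return trace
```

Three loss-free round trips followed by one loss and one more loss-free round trip give the familiar sawtooth: the window climbs to 4, halves to 2, then resumes growing.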
16Results for a mix of traffic (NS simulations)
EDS3/4
Interactive traffic: transfer delay < 60
Real-time traffic: latency constraint respect 2x >
Bulk transfer timeout
17QoSINUS approach
18e-Toile GRID project goals
- Develop a Grid testbed
- On the Very High Bandwidth experimental network (VTHD)
- Active Grid technology (dynamicity of the grid)
- Develop a middleware prototype
- Programmable network and communication libraries
- NFSp GXFER, MPI Madeleine, MOME (DSM)
- Active network services (QoS, multicast)
- Perform tests with high-end applications
- Computing intensive, data intensive, network intensive
- Validation of a high performance grid model targeting large scale numerical simulations
Layer | Management | Monitoring | Security | UI
Globus 2.2 | Duroc, GRAM | MDS, GRIS/GIIS | GSI | RSL
e-Toile | Allocator, Loader | SIC - SPAM | GSI - authoriz. | LDT - GUIDE
19Programmable network (INRIA RESO/LIP)
- Active nodes TAMANOIR and IBP depot (Loci/UTK) deployed at the edge of VTHD
- Gigabit supported with a TAN cluster (1.3 Gbit/s)
- TAN cluster: a front-end with back-ends for load balancing
[Figure: active flows crossing VTHD through TAN active nodes at CEA (Paris) and CERN (Geneva), towards two receivers.]
20QoSINUS E2E Performance controllability
- QoSINUS: Quality of Service Negotiate, Invoke, Use
- Goals
- E2E QoS: an interface application <-> network
- Application QoS objective, e.g. E2E transfer delay
- Use network QoS: DiffServ (packet prioritization)
- A programmable service (adapt API, algorithm)
- QoSINUS principles
- Specification and negotiation of an SLS for a microflow by the Grid scheduler or the application
- Programmable mapping of the QoS objective onto a packet DSCP in the first active node (uses EF, AF, BE, LBE)
- Dynamic adaptation of packet marking based on measurement results (network and flow)
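The mapping and adaptation principles above can be illustrated with a short sketch. It is not the QoSINUS code or API, which the slides do not show. The assumptions are: the objective is a transfer deadline; the class ladder is LBE, BE, AF11, EF; and LBE is marked with CS1 (DSCP 8), a common convention for lower-effort traffic. EF (46), AF11 (10) and BE (0) are the standard DiffServ code points. `tos_byte` and `adapt_marking` are hypothetical names.

```python
# DSCP code points; the LBE value (CS1) is an assumed convention.
DSCP = {"LBE": 8, "BE": 0, "AF11": 10, "EF": 46}

# Classes ordered from least to most favourable forwarding treatment.
LADDER = ["LBE", "BE", "AF11", "EF"]


def tos_byte(dscp):
    """Value of the IPv4 TOS byte carrying this DSCP (upper six bits)."""
    return dscp << 2


def adapt_marking(current, bytes_left, measured_bps, seconds_left, slack=0.5):
    """Re-evaluate the marking of a transfer against its deadline.

    current       -- class name currently used to mark the flow
    bytes_left    -- bytes still to transfer
    measured_bps  -- throughput measured for the flow, in bits per second
    seconds_left  -- time remaining before the deadline
    slack         -- step down only if this fraction of the measured
                     rate would still meet the deadline
    """
    idx = LADDER.index(current)
    if seconds_left <= 0:
        return LADDER[-1]  # deadline already missed: best class available
    needed_bps = 8 * bytes_left / seconds_left
    if measured_bps < needed_bps and idx < len(LADDER) - 1:
        return LADDER[idx + 1]  # falling behind: step up one class
    if measured_bps * slack > needed_bps and idx > 0:
        return LADDER[idx - 1]  # comfortably ahead: release resources
    return current
```

A flow with 1 GB left and 100 s to go needs 80 Mbit/s: measured at 50 Mbit/s in BE it would be re-marked AF11, while measured at 400 Mbit/s in AF11 it would step back down to BE.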
21QoS objectives programming
22VTHD plate-forme
[Figure: map of the VTHD platform. Sites labelled include FTRD (Caen, Lannion, Rennes, Issy, Grenoble, Sophia), INRIA (Nancy, Rennes, Lyon, Grenoble, Sophia), ENST and ENST Br (Brest, Rennes), INT, CHU, HEGP, CERN, EDF, CEA, PRISM, Sun, IMAG and Eurecom, around Rouen, Paris (AUB, STL, MSO), Rennes, Nancy, Lyon, Grenoble and Nice. Legend: IPv6/IPv4 links at 2.5 Gbps, 1 Gbps and STM1/4; IPv4 only; IPv6 over tunnel (IPv6inIPv4); IPv6 over MPLS; Opentransit IPv6 connectivity; Juniper M20/M40/T640, Cisco GSR 12000 and Avici TSR routers; VTHD/eToile site routers.]
23The VTHD backbone
- Really very high bandwidth
- Provides 1 Gb/s to 2 Gb/s direct access links
- Up to 4 x 2.5 Gb/s in the core
- Experimental network
- Great availability
- Advanced services (multicast, DiffServ, IPv6, MPLS, GMPLS/UNI)
- Connected to other research networks in the EU through the DataTAG link (CERN in Geneva)
- The VTHD network is deployed by France Telecom
- RNRT projects VTHD and VTHD++
24DiffServ in VTHD
25Experimental results in e-Toile/ VTHD
26Conclusion
- The DiffServ philosophy provides the means to extend the IP forwarding model with scalable and easy-to-deploy service differentiation mechanisms
- Difficult to avoid if we want to control performance in Grids!
- Standard PHBs are deployed (Premium, LBE) in EU NRNs
- EDS or propDS provide simple and autonomous solutions to add differentiated services in an IP network
- An incremental solution (for access links and LANs)
- Adaptive end-to-end transport protocols (packet marking in AIMD...)
- QoSINUS exploits and controls the DiffServ ingress point transparently
- Provides a simple and extensible API to applications (XML)
- Provides a multi-domain and transparent solution
27Future Work Grid5000
- Measure the gain obtained with challenging grid applications and grid infrastructures
- Interaction with novel transport protocols for bulk transfers
- Explore in depth the multi-domain, multi-service problem
- Explore the scalability of the EDS and QoSINUS approaches
- GRID5000 project: a large scale cluster interconnection in France
- With about 5000 processors aggregated
- With high performance DiffServ network links (RENATER)
- High performance latency emulation tools
- http://www.grid5000.org
- Interconnected with GN2
28More info
- RESO project at INRIA
- http://www.ens-lyon.fr/LIP/RESO
- e-Toile: http://www.urec.cnrs.fr/etoile
- VTHD: http://www.vthd.org
- GRID5000: http://www.grid5000.org
- Pascale.Primet_at_inria.fr