Dynamic Topology Optimization for Supercomputer Interconnection Networks

Provided by: johns149
1
Dynamic Topology Optimization for Supercomputer
Interconnection Networks
Definitions for Packet Switches
  • Layer-1 (L1) switch
      • Dumb switch, electronic patch panel
      • Establishes hard links (real circuits) between
        endpoints
      • Does not read, respect, or otherwise understand
        packet boundaries or any content it transmits
      • Less expensive per port than a Layer-2 switch due
        to lower complexity
      • No header parsing means minimal latency
        (end-to-end propagation delay)
  • Layer-2 (L2) switch
      • Packet switch that must parse packet headers to
        determine which output port to route each input
        packet to. Requires line-rate packet decisions
      • Capable of multiplexing/demultiplexing messages
        encoded as streams of packets (Layer-1 cannot do
        this at line-rate packet granularity)
      • Complexity increases cost
      • Delays packets due to buffering, packet header
        parsing, and routing decisions
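
The difference between the two switch types can be sketched as a toy model in Python. This is an illustration only, not a description of any real switch: the class names `L1Switch` and `L2Switch`, the dict-based packet format, and the forwarding table are all assumptions made for the example.

```python
class L1Switch:
    """Toy Layer-1 switch: a fixed circuit map from input port to output port."""

    def __init__(self):
        self.circuits = {}  # input port -> output port

    def connect(self, in_port, out_port):
        # Establish a hard circuit; it stays in place until reconfigured.
        self.circuits[in_port] = out_port

    def forward(self, in_port, data):
        # The data is never inspected: the output depends only on the input port.
        return self.circuits[in_port], data


class L2Switch:
    """Toy Layer-2 switch: picks the output port per packet from its header."""

    def __init__(self, table):
        self.table = table  # destination address -> output port

    def forward(self, in_port, packet):
        # The header has to be read for every packet before it can be forwarded.
        return self.table[packet["dst"]], packet


l1 = L1Switch()
l1.connect(0, 3)
l2 = L2Switch({"P1": 1, "P2": 2})

# Two packets for different destinations arrive on the same input port.
packets = [{"dst": "P1", "payload": "a"}, {"dst": "P2", "payload": "b"}]
l1_ports = [l1.forward(0, p)[0] for p in packets]
l2_ports = [l2.forward(0, p)[0] for p in packets]
```

In this sketch the L1 switch sends both packets out of the same port because it cannot tell them apart, while the L2 switch separates them by destination. That is the multiplexing/demultiplexing capability the L2 definition refers to.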

2
Hybrid Interconnect
An L1 crossbar connects the NICs on nodes (P1-Pn) to
Layer-2 switch ports (SW1-SWm). The L1 crossbar also
connects Layer-2 switch ports (SW1-SWm) to each
other to form custom topological neighborhoods.
It dynamically provisions custom interconnect
topologies on a per-job basis.
[Figure: nodes P1-P4 attach through the Layer-1 crossbar (electronic patch panel) to Layer-2 switch blocks SW1-SW4]
  • Lower Cost (L1 is less expensive; L2 costs are
    lower with low port counts)
  • Lower Latency (L1 contributes little latency)
  • Improved Fault Tolerance / Fault Recovery
  • Optimal Scheduling (eliminate job fragmentation
    of SW topology)
  • Better Shelf Life (L1 switches are usable for
    several generations of L2 switch technology)
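
A minimal sketch of per-job provisioning through the crossbar, under stated assumptions: the `Crossbar` class, the `provision_job` helper, the `SWx.n` port naming, and the rule that the first and last port of each switch block are reserved for block-to-block links are all hypothetical and chosen only to keep the example small.

```python
class Crossbar:
    """Toy Layer-1 crossbar: each port is patched to at most one other port."""

    def __init__(self):
        self.peer = {}  # port name -> port it is patched to

    def patch(self, a, b):
        if a in self.peer or b in self.peer:
            raise ValueError(f"port already in use: {a} or {b}")
        self.peer[a] = b
        self.peer[b] = a

    def release(self, ports):
        # Tear down every circuit that touches one of the given ports.
        for p in ports:
            q = self.peer.pop(p, None)
            if q is not None:
                self.peer.pop(q, None)


def provision_job(xbar, nodes, switches, ports_per_switch=4):
    """Attach nodes to switch blocks, then chain the blocks together.

    Assumed layout: port 0 and the last port of each block are reserved for
    links to neighbouring blocks; the remaining ports take nodes.
    """
    node_ports = [f"{sw}.{i}" for sw in switches
                  for i in range(1, ports_per_switch - 1)]
    if len(nodes) > len(node_ports):
        raise ValueError("not enough switch ports for this job")
    for node, port in zip(nodes, node_ports):
        xbar.patch(node, port)
    # Link consecutive switch blocks into a 1-D chain (a small custom neighbourhood).
    for left, right in zip(switches, switches[1:]):
        xbar.patch(f"{left}.{ports_per_switch - 1}", f"{right}.0")


xbar = Crossbar()
provision_job(xbar, ["P1", "P2", "P3", "P4"], ["SW1", "SW2"])
```

When the job ends, `xbar.release(...)` frees the same ports so the next job can be given a different topology. A chain is used here only because it is the simplest shape to show; a mesh or any other neighbourhood would be built from the same `patch` calls.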

3
Hybrid Interconnect (details)
  • Lower Cost
      • L1 switches cost a fraction of L2 per port
      • Design allows bias towards cheaper,
        low-port-count L2 switches
      • L2 switch infrastructure costs can scale linearly
        with system size (e.g. provision optimal mesh
        topology for each application)
  • Lower Latency
      • L1 switches form hard circuits -- contributing
        virtually no latency to the switch fabric
        (propagation delay due to speed of light) -- L2
        stage delays are larger due to packet header
        parsing
      • L1 light path for MEMS-based optical switches
        requires no OEO conversion! (getting 90% of the
        way to an all-optical switching fabric!)
      • Goal: a single L2 switch hop for any
        point-to-point (p2p) message
  • Fault Tolerance
      • Lock out failing L2 switch blocks (in a torus or
        mesh, failures induce a significant runtime
        performance hit)
  • Optimal Scheduling (eliminate job fragmentation)
      • Prevents the fragmentation of the network
        topology that occurs when jobs are scheduled that
        do not fit into available dense slots in the
        network topology (as much a problem for fat-tree
        as it is for mesh/torus; ORNL Altix example)
      • Optimal mesh topology can be provisioned for each
        job regardless of current system state
  • Better Shelf Life
      • Same L1 crossbar switch can be used for multiple
        generations of L2 switch implementations
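
The fault-tolerance and scheduling points can be sketched together. This is a minimal illustration under assumptions, not a real scheduler: the function `allocate_blocks`, the block names, and the fixed `nodes_per_block` capacity are hypothetical.

```python
def allocate_blocks(free_blocks, failed_blocks, nodes_needed, nodes_per_block):
    """Pick switch blocks for a job from whatever is free and healthy.

    Because the L1 crossbar can patch any block to any other, the chosen
    blocks do not have to be adjacent in any physical layout.
    """
    # Lock out failed blocks before considering anything else.
    usable = sorted(b for b in free_blocks if b not in failed_blocks)
    blocks_needed = -(-nodes_needed // nodes_per_block)  # ceiling division
    if blocks_needed > len(usable):
        return None  # not enough healthy blocks; the job has to wait
    return usable[:blocks_needed]


free = {"SW2", "SW5", "SW7", "SW8"}
failed = {"SW5"}
small_job = allocate_blocks(free, failed, nodes_needed=10, nodes_per_block=4)
large_job = allocate_blocks(free, failed, nodes_needed=13, nodes_per_block=4)
```

Here the 10-node job receives SW2, SW7, and SW8 even though they are scattered, and the failed block SW5 is never handed out. In a fixed mesh or torus the same job would need a contiguous region, which is where fragmentation comes from.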

4
Investigation Plan
  • Analyze communication topology requirements of
    existing DOE applications
  • Collaboration with Jeff Vetter (ORNL) to capture
    communication requirements
  • Use captured communication requirements to define
    proper balance between L1 and L2 switch layers
  • Use cost model for existing L1 and L2 switch
    components to predict cost benefits for hybrid
    infrastructure
  • Develop communication performance models for
    specific codes to predict benefits for
    lower-latency interconnects
  • Collaboration with UIC/StarLight facility to test
    using their Glimmerglass optical crossbar
    switches.
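
As a rough sketch of what analyzing captured communication requirements might involve, the snippet below counts how many distinct partners each process exchanges significant traffic with. The trace format of `(src, dst, nbytes)` tuples, the function name `partner_counts`, and the `min_bytes` cutoff are assumptions made for illustration; the slides do not specify how the requirements are captured or analyzed.

```python
def partner_counts(messages, min_bytes=0):
    """Count distinct communication partners per process.

    `messages` is an iterable of (src, dst, nbytes) tuples (assumed format).
    Pairs whose total traffic is at or below `min_bytes` are ignored.
    """
    volume = {}
    for src, dst, nbytes in messages:
        key = (min(src, dst), max(src, dst))  # treat traffic as undirected
        volume[key] = volume.get(key, 0) + nbytes
    partners = {}
    for (a, b), total in volume.items():
        if total > min_bytes:
            partners.setdefault(a, set()).add(b)
            partners.setdefault(b, set()).add(a)
    return {rank: len(peers) for rank, peers in partners.items()}


trace = [(0, 1, 4096), (1, 0, 4096), (1, 2, 8192), (2, 3, 8192), (0, 3, 64)]
counts = partner_counts(trace, min_bytes=1024)
all_counts = partner_counts(trace)
```

With the cutoff applied, the small exchange between processes 0 and 3 is dropped and no process has more than two heavy partners. A per-process partner count of this kind is one plausible input when deciding how much connectivity to provide with L1 circuits and how much to leave to L2 switching.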