MESSAGE ROUTING SCHEMES IN A HYPERCUBE MACHINE

1
MESSAGE ROUTING SCHEMES IN A HYPERCUBE MACHINE
  • S. Raghupathy,
  • M. R. Leuze, and
  • S. R. Schach
  • Presented by Syed Md. Shakir

2
What are Interconnection Networks and why do we
need them?
  • One way for processors to communicate data is to
    use a shared memory and shared variables.
    However, this is unrealistic for large numbers of
    processors. A more realistic assumption is that
    each processor has its own private memory and
    data communication takes place using message
    passing via an interconnection network.
  • The interconnection network plays a central role
    in determining the overall performance of a
    multicomputer system. If the network cannot
    provide adequate performance for a particular
    application, nodes will frequently be forced to
    wait for data to arrive.

3
Parallel Computers
  • Large-scale parallel computers are potential
    candidates for providing very high computational
    power.
  • These systems are usually organized as an
    ensemble of nodes, each with its own processor,
    local memory, and other supporting devices.
  • The nodes are interconnected using a variety of
    topologies that can be classified into two broad
    categories:
  • Direct
  • Indirect

4
Direct Networks
  • In direct networks, each node has a
    point-to-point or direct connection to some of
    the other nodes, called neighboring nodes.
  • Examples of direct network topologies include
    hypercube, mesh, and tree.

5
Indirect Networks
  • In indirect networks, the nodes are connected to
    other nodes or a shared memory through one or
    more switching elements.
  • Examples of indirect networks include crossbar,
    bus, and multistage interconnection networks.

6
Indirect Network
[Figures: multistage interconnection network; crossbar]

7
Communication Latency
  • The communication latency of direct networks
    depends on several factors including switching,
    routing, flow control, and topology. Several
    switching techniques have been proposed for
    direct networks.
  • Wormhole switching has emerged as a popular
    technique and has been used in both commercial
    and experimental systems.
  • Wormhole switching can be employed in both direct
    and indirect networks. It is widely used in
    contemporary multicomputers because of its low
    latency and its small buffer requirements at the
    nodes.

8
cont...
  • The mesh is an asymmetrical topology in which the
    node degree depends on its location.
  • Interprocessor communication performance depends
    on the location of source and destination.
  • The torus and hypercube are symmetrical
    topologies in which the degree of a node is the
    same irrespective of its location in the network.
    Thus, unlike the mesh, all the nodes in tori and
    hypercubes are identical in connectivity.

9
Routing in Parallel Computers
  • Parallel computers are modeled by directed graphs
  • All interconnections between processors (nodes)
    occur in synchronous steps
  • Each link can carry at most one unit message
    (packet) in one step
  • During a step, a node can send at most one packet
    to each of its neighbors
  • Each node is uniquely identified by a number
    between 1 and N

10
Switching Techniques
  • In most multicomputer systems, a message enters
    the network from a source node and is switched or
    routed towards its destination through a series
    of intermediate nodes.
  • Four types of switching techniques are usually
    used for this purpose:
  • circuit switching
  • packet switching
  • virtual cut-through switching
  • wormhole switching

11
Circuit Switching
  • In circuit switching, a dedicated path is
    established between the source and the
    destination before data transfer begins.
  • Once the data transfer is initiated, the message
    is never blocked.
  • As the channels creating the path are reserved
    exclusively, buffering of data is not required.
  • On the other hand, establishing the path requires
    significant overhead, and during the
    data-transmission phase all channels are reserved
    for the entire duration of message transfer.
  • Circuit switching thus degrades performance and
    is no longer used in commercial multicomputer
    systems.

12
Packet Switching
  • In packet switching, a message is divided into
    packets that are independently routed towards
    their destination.
  • The destination address is encoded in the header
    of each packet. The entire packet is stored at
    every intermediate node and then forwarded to the
    next node in its path.
  • The main advantage of packet switching is that
    the channel resource is occupied only when a
    packet is actually transferred.

13
Packet Switching cont...
  • Each packet contains the routing information and
    alternative paths can be selected upon
    encountering network congestion or faulty nodes.
  • The major drawbacks of packet switching:
  • Since the packet is stored entirely at each
    intermediate node, the time to transmit a packet
    from source to destination is directly
    proportional to the number of hops in the path.
  • At each intermediate node, we need buffer space
    to hold at least one packet.

14
Virtual Cut Through
  • In order to reduce the time spent storing packets
    at each node, Kermani and Kleinrock introduced a
    technique called virtual cut-through.
  • In this technique, while routing toward its
    destination, a message is stored at an
    intermediate node only if the next channel
    required is occupied by another packet.
  • As a result, when there is no blocking, the
    distance between the source and destination has
    little effect on communication latency.

15
cont...
  • In an extreme case, when a message encounters
    blocking at all the intermediate nodes, the
    virtual cut-through technique reduces to packet
    switching.
  • The disadvantage of the virtual cut-through
    technique is its implementation cost.
  • Each node must provide sufficient buffer space
    for all the messages passing through it, and
    because multiple messages may be blocked at any
    node, a very large buffer space is required at
    each node.
  • This implementation constraint limits the use of
    the virtual cut-through technique.

16
Wormhole Switching
  • Wormhole switching is a variant of the virtual
    cut-through technique that avoids the need for
    large buffer spaces.
  • In wormhole switching, a packet is transmitted
    between the nodes in units of flits, the smallest
    units of a message on which flow control can be
    performed.
  • The header flit(s) of a message contain all the
    necessary routing information and all the other
    flits contain the data elements.
  • The flits of the message are transmitted through
    the network in a pipelined fashion.

17
cont...
  • Since only the header flit(s) have the routing
    information, all the trailing flits follow the
    header flit(s) contiguously.
  • Flits of two different messages cannot be
    interleaved at any intermediate node.
  • Successive flits in a packet are pipelined
    asynchronously in hardware using a handshaking
    protocol.
  • When the header flit is blocked, all the
    trailing flits stay in the buffers of the
    intermediate nodes they have reached.

18
Wormhole Switching
[Figure: Message format and routing in wormhole switching, panels (a) and (b). Labels: messages, packets, flits. D = data flit, H = header flit.]

19
Advantages of Wormhole Switching
  • The main advantage of wormhole switching derives
    from the pipelined message flow: transmission
    latency is nearly insensitive to the distance
    between the source and destination.
  • Moreover, since the message moves flit by flit
    across the network, each node needs to store only
    one flit.
  • Some implementations, however, require storage of
    multiple flits at each node to improve routing
    performance.
  • The reduction of buffer requirements at each node
    has a major effect on the cost and size of
    multicomputer systems.
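The distance argument can be made concrete with a simplified, contention-free latency model. This is a common textbook approximation rather than something taken from the slides, and the function and parameter names are assumptions chosen for illustration.

```python
def store_and_forward_latency(packet_bits, hops, bandwidth):
    """Packet switching: every hop receives the whole packet before forwarding it."""
    return hops * (packet_bits / bandwidth)


def wormhole_latency(packet_bits, flit_bits, hops, bandwidth):
    """Wormhole switching: only the header flit pays the per-hop cost;
    the remaining flits stream behind it in a pipeline."""
    return hops * (flit_bits / bandwidth) + packet_bits / bandwidth


if __name__ == "__main__":
    # Assumed example values: 1024-bit packet, 8-bit flits, 1 bit per time unit.
    for hops in (1, 5, 10):
        saf = store_and_forward_latency(1024, hops, 1)
        wh = wormhole_latency(1024, 8, hops, 1)
        print(hops, saf, wh)
```

Under these assumptions the store-and-forward time grows in proportion to the hop count, while the wormhole time grows only by one flit time per extra hop.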

20
Disadvantages of Wormhole Switching
  • The main disadvantage of wormhole switching comes
    from the fact that only the header flit has the
    routing information.
  • If the header flit cannot advance in the network
    due to resource contention, all the trailing
    flits are also blocked along the path and these
    blocked messages can block other messages.
  • This chained blocking can also lead to deadlock
    where messages wait for each other in a cycle and
    hence no message can advance any further.
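A minimal sketch of the "waiting in a cycle" condition follows. It assumes a hypothetical `waits_for` mapping in which each blocked message names the single message that holds the channel it needs; real routers do not keep such a table, so this only illustrates the definition.

```python
def find_deadlock(waits_for):
    """Return the messages that wait for each other in a cycle, or None.

    waits_for maps a blocked message to the message holding the channel
    it needs (an assumed representation, for illustration only).
    """
    for start in waits_for:
        seen = []
        current = start
        # Follow the chain of blocked messages until it ends or repeats.
        while current in waits_for and current not in seen:
            seen.append(current)
            current = waits_for[current]
        if current in seen:
            return seen[seen.index(current):]
    return None


if __name__ == "__main__":
    print(find_deadlock({"A": "B", "B": "C", "C": "A"}))
    print(find_deadlock({"A": "B", "B": "C"}))
```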

21
cont...
  • Prevention of deadlock is one of the main issues
    in wormhole switching, and is usually
    accomplished by a suitable choice of routing
    function that selectively prohibits messages from
    taking some of the available paths, thus
    preventing cycles in the network.
  • Selection of a routing algorithm is thus a major
    issue in wormhole-switched networks.

22
Hypercube Network
  • An n-dimensional hypercube network
  • Number of nodes: $N = 2^n$
  • Degree: $n$
  • The node i with address
    $(i_1, i_2, \ldots, i_n) \in \{0, 1\}^n$ and the
    node j with address
    $(j_1, j_2, \ldots, j_n) \in \{0, 1\}^n$ are
    connected if the Hamming distance between
    $(i_1, i_2, \ldots, i_n)$ and
    $(j_1, j_2, \ldots, j_n)$ is 1.
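The connection rule can be sketched in a few lines. The sketch assumes node addresses are stored as integers in the range 0 to 2^n - 1, so that flipping one address bit is an XOR with a power of two; the function names are illustrative.

```python
def hamming_distance(a, b):
    """Number of bit positions in which two addresses differ."""
    return bin(a ^ b).count("1")


def neighbors(node, n):
    """All nodes whose address differs from `node` in exactly one of n bits."""
    return [node ^ (1 << d) for d in range(n)]


if __name__ == "__main__":
    n = 3
    for node in range(2 ** n):
        print(format(node, "03b"), [format(v, "03b") for v in neighbors(node, n)])
```

Every node gets exactly n neighbors, which matches the stated degree.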

23
Hypercube Topology

24
4-D Hypercube
  • A k-dimensional hypercube is formed by combining
    two (k-1)-dimensional hypercubes and connecting
    corresponding nodes; that is, hypercubes are
    recursive.
  • Each node is connected to k other nodes; that
    is, each node has degree k.
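The recursive construction can be written down directly. This is a sketch under the assumption that nodes are numbered 0 to 2^k - 1 and that the second copy of the smaller cube is obtained by setting the new top address bit; `hypercube_edges` is an illustrative name.

```python
def hypercube_edges(k):
    """Edge set of a k-dimensional hypercube, built from two (k-1)-cubes."""
    if k == 0:
        return set()                      # a single node, no links
    lower = hypercube_edges(k - 1)        # first (k-1)-dimensional cube
    offset = 1 << (k - 1)
    # Second copy: the same cube with the new top bit set on every node.
    upper = {(u + offset, v + offset) for (u, v) in lower}
    # Links joining corresponding nodes of the two copies.
    bridges = {(u, u + offset) for u in range(offset)}
    return lower | upper | bridges


if __name__ == "__main__":
    edges = hypercube_edges(4)
    print(len(edges))
    print(sorted(edges)[:5])
```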

25
Static routing in Hypercube
  • Given a source node Ns and a destination node Nd.
  • The addresses of the $2^n$ processors can be
    represented using n bits.
  • If the current node has bit pattern
    $(c_{n-1}, \ldots, c_1, c_0)$ and i is a bit
    position in which it differs from Nd, then the
    next node on the route from Ns to Nd is the node
    with the same bit pattern but with bit i flipped;
    that is to say, the message is routed in
    dimension i.
  • The algorithm continues in this way until
    the message arrives at node Nd.

26
Static routing
  • Algorithm
  • Given a destination address d(i) and the address
    of the intermediate node currently holding the
    packet:
  • Compare the bits of d(i) with those of the
    intermediate node's address from left to right.
  • Identify the first bit position at which these
    two addresses differ.
  • Route the packet to the neighbor n(i) whose
    address differs from the intermediate node's
    address only in this bit position.

27
Static Routing Algorithm
  • Example
  • Source (0, 0, 0, 0, 0, 0)
  • Destination (1, 0, 1, 0, 1, 1)
  • (0, 0, 0, 0, 0, 0) → (1, 0, 0, 0, 0, 0) →
    (1, 0, 1, 0, 0, 0) → (1, 0, 1, 0, 1, 0) →
    (1, 0, 1, 0, 1, 1)
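The left-to-right rule and the example above can be reproduced with a short sketch. It assumes addresses are given as tuples of bits written left to right; `static_route` is an illustrative name, not one used in the paper.

```python
def static_route(source, destination):
    """Return every node visited when differing bits are fixed left to right."""
    current = list(source)
    path = [tuple(current)]
    for position in range(len(current)):
        if current[position] != destination[position]:
            # Move to the neighbor that differs only in this bit position.
            current[position] = destination[position]
            path.append(tuple(current))
    return path


if __name__ == "__main__":
    for node in static_route((0, 0, 0, 0, 0, 0), (1, 0, 1, 0, 1, 1)):
        print(node)
```

Because the order of dimensions is fixed, the route between two given nodes is always the same, which is what makes the scheme static.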

28
Advantages and Disadvantages
  • Advantage
  • No overhead for calculating new routes.
  • The CPU cycles saved can be used for other
    computation.
  • Disadvantage
  • Blocking is a common consequence.

29
Dynamic routing
  • It allows every message to select the (locally)
    optimal route under the current circumstances.
  • In dynamic routing, if a link is blocked, an
    attempt is made to pass the message through
    another link.
  • Better utilization of the network.
  • It uses local knowledge.

30
Dynamic routing
  • Allows the message to be routed from Ns to Nd
    depending on circumstances.
  • Allows the optimal route under the current
    circumstances.
  • Dynamic routing has an implementation overhead:
  • At each node, calculations have to be performed
    to determine the next node to which the message
    should be routed,
  • and links have to be tested to see which ones
    are free.
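The two per-node tasks (finding candidate next nodes and testing links) can be sketched as follows. The sketch assumes integer node addresses and a caller-supplied `link_is_free(node, dimension)` test, both hypothetical; scanning dimensions from lowest to highest is also an arbitrary choice made here.

```python
def productive_dimensions(current, destination, n):
    """Dimensions in which the two addresses differ, i.e. links that
    move the message one hop closer to its destination."""
    diff = current ^ destination
    return [d for d in range(n) if diff & (1 << d)]


def dynamic_next_hop(current, destination, n, link_is_free):
    """Return the next node over the first free productive link, or None if blocked."""
    for d in productive_dimensions(current, destination, n):
        if link_is_free(current, d):
            return current ^ (1 << d)
    return None


if __name__ == "__main__":
    busy = {(0b000, 0)}                         # link from node 000 in dimension 0 is in use
    free = lambda node, d: (node, d) not in busy
    print(dynamic_next_hop(0b000, 0b011, 3, free))
```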

31
Advantages And Disadvantages
  • Advantages
  • Blocking is not a major problem.
  • Disadvantages
  • Overhead of implementing dynamic routing:
  • At each node, calculations have to be performed
    to determine the next node to which the message
    should be routed, and
  • links have to be tested to see which ones are
    free.
  • The size of the overhead will vary from
    hypercube to hypercube. In some machines, the
    additional work can be done in hardware in
    parallel with other operations; in other
    machines, it must be done in software, using
    machine cycles that could otherwise be used for
    productive computing.

32
PRIORITIZATION
  • If a number of messages are waiting to use a
    link, one method of choosing which message to
    transmit is first in, first out (FIFO), the
    method used in commercial hypercubes.
  • The paper also considers alternative
    prioritization schemes, such as LIFO and giving
    priority to the message with the maximum number
    of remaining hops.
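The schemes named above can be expressed as sort keys. This is only a sketch: the `Message` record and its field names are assumptions for illustration, and a smaller key value means higher priority.

```python
from collections import namedtuple

# Hypothetical message record; only the fields the schemes need.
Message = namedtuple("Message", "name arrival_time remaining_hops")

SCHEMES = {
    "FIFO": lambda m: m.arrival_time,                 # earliest arrival first
    "LIFO": lambda m: -m.arrival_time,                # latest arrival first
    "MAX_REMAINING_HOPS": lambda m: -m.remaining_hops,
}


def choose(waiting, scheme):
    """Pick the message that gets the link under the given scheme."""
    return min(waiting, key=SCHEMES[scheme])


if __name__ == "__main__":
    waiting = [Message("a", 1, 2), Message("b", 3, 1), Message("c", 2, 5)]
    for scheme in SCHEMES:
        print(scheme, choose(waiting, scheme).name)
```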

33
Other Prioritization Schema
  • Because the processes form a directed acyclic
    graph (DAG), each process can be assigned a
    sequence number such that every message is sent
    to a process with a higher sequence number than
    that of the process that generated the message.
  • The sequence number of the generating process can
    then be used to prioritize messages.
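One way to obtain such sequence numbers is a topological ordering of the process graph. The sketch below uses a depth-first search and assumes a hypothetical `sends_to` mapping from each process to the processes it sends messages to; the paper may assign the numbers differently.

```python
def sequence_numbers(sends_to):
    """Number processes so that every message goes to a higher-numbered process."""
    finished = []      # processes in depth-first post-order
    state = {}         # process -> "active" or "done"

    def visit(process):
        if state.get(process) == "done":
            return
        if state.get(process) == "active":
            raise ValueError("process graph has a cycle, so it is not a DAG")
        state[process] = "active"
        for receiver in sends_to.get(process, ()):
            visit(receiver)
        state[process] = "done"
        finished.append(process)

    for process in sends_to:
        visit(process)
    # Reverse post-order is a topological order: senders precede receivers.
    return {p: number for number, p in enumerate(reversed(finished))}


if __name__ == "__main__":
    graph = {"p": ["q", "r"], "q": ["r"], "r": []}
    print(sequence_numbers(graph))
```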

34
Message Format

35
The Prioritization Schema

36
The Simulator
  • The simulator was constructed to investigate
    routing strategies.
  • The message header contains information such as
    the source and destination node, as well as
    information needed when the order of transmission
    of messages is based on prioritization, such as
    sequence number, time generated, arrival time at
    the current node, and number of hops that still
    have to be traversed.
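The header fields listed above map naturally onto a small record type. The field names, integer types, and the `make_header` helper are assumptions for illustration; the slides do not give the simulator's actual layout.

```python
from dataclasses import dataclass


@dataclass
class MessageHeader:
    source: int            # address of the sending node
    destination: int       # address of the receiving node
    sequence_number: int   # sequence number of the generating process
    time_generated: int
    arrival_time: int      # arrival time at the current node
    remaining_hops: int    # hops that still have to be traversed


def make_header(source, destination, sequence_number, now):
    """Build a header for a freshly generated message (hypothetical helper)."""
    # In a hypercube the hop count is the Hamming distance of the addresses.
    hops = bin(source ^ destination).count("1")
    return MessageHeader(source, destination, sequence_number, now, now, hops)


if __name__ == "__main__":
    print(make_header(0b0000, 0b1011, sequence_number=3, now=7))
```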

37
Execution Cycle Of The Simulator
  • The simulator has three phases:
  • Message generation
  • Message ordering
  • Message routing

38
Message Generation Phase
  • In this phase, each active process is checked to
    see if it has received all the messages it
    requires.
  • If so, the messages it is to transmit are
    generated and placed in the message buffer.
  • The process then terminates.
  • After all possible messages have been generated,
    the simulator enters the message ordering phase.
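The generation phase can be sketched as a single pass over the processes. The dictionary layout below (`needs`, `received`, `sends_to`, `active`) is an assumed representation, not the simulator's own data structure.

```python
def generation_phase(processes, message_buffer):
    """Generate messages for every active process whose inputs have all arrived."""
    for name, proc in processes.items():
        # proc["needs"] <= proc["received"] is a subset test on sets.
        if proc["active"] and proc["needs"] <= proc["received"]:
            for target in proc["sends_to"]:
                message_buffer.append((name, target))
            proc["active"] = False          # the process then terminates
    return message_buffer


if __name__ == "__main__":
    processes = {
        "p": {"needs": set(), "received": set(), "sends_to": ["q"], "active": True},
        "q": {"needs": {"p"}, "received": set(), "sends_to": [], "active": True},
    }
    print(generation_phase(processes, []))
```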

39
Message Ordering Phase
  • In the message ordering phase, the messages in
    each buffer are ordered according to the
    prioritization scheme currently being evaluated.
  • In the case of equal priorities, ties are broken
    randomly.
  • Finally, the message routing phase commences.
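Ordering with random tie-breaking can be sketched by shuffling first and then applying a stable sort. The `priority` argument is an assumed key function in which a smaller value means higher priority.

```python
import random


def order_buffer(buffer, priority, rng=random):
    """Return the messages sorted by priority, with ties in random order."""
    shuffled = list(buffer)
    rng.shuffle(shuffled)                    # randomize the starting order
    # sorted() is stable, so messages with equal priority keep the
    # random relative order produced by the shuffle.
    return sorted(shuffled, key=priority)


if __name__ == "__main__":
    buffer = [("a", 2), ("b", 1), ("c", 2)]
    print(order_buffer(buffer, priority=lambda m: m[1]))
```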

40
Message Routing Phase
  • Each message is fetched from the message buffer
    and an attempt is made to transmit it to a
    neighboring node.
  • If static routing is being used and the
    predetermined link is in use, then that
    particular message is blocked.
  • When dynamic routing is used, an attempt is made
    to transmit the message over the first unused
    link that will move it closer to its destination.
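The difference between the two policies shows up when two messages at one node want the same link. The sketch below models a single node for one routing phase; it assumes integer addresses, at most one message per outgoing link per phase, destinations different from the node itself, and links tried from the highest dimension down (left to right). None of these details are confirmed by the slides.

```python
def routing_phase(node, destinations, n, dynamic):
    """Route the buffered messages of one node for one phase.

    destinations lists the destination addresses in priority order.
    Returns (sent, blocked): sent holds (destination, next_node) pairs.
    """
    used = set()                 # dimensions whose outgoing link is taken
    sent, blocked = [], []
    for dest in destinations:
        diff = node ^ dest
        wanted = [d for d in range(n - 1, -1, -1) if diff & (1 << d)]
        if not dynamic:
            wanted = wanted[:1]  # static routing: only the predetermined link
        for d in wanted:
            if d not in used:
                used.add(d)
                sent.append((dest, node ^ (1 << d)))
                break
        else:
            blocked.append(dest)  # no usable link in this phase
    return sent, blocked


if __name__ == "__main__":
    print(routing_phase(0b000, [0b110, 0b101], 3, dynamic=False))
    print(routing_phase(0b000, [0b110, 0b101], 3, dynamic=True))
```

In this example the second message is blocked under static routing but leaves over its other productive link under dynamic routing.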

41
Results
  • Dynamic routing performs better than static
    routing, but the improvement factor varies
    depending on the prioritization scheme. At best,
    the improvement is by a factor of two.
  • Best results occur when priority is given to
    messages with the lowest sequence number.
  • Results almost as good are obtained when priority
    is given to messages with the fewest hops,
    either in the original message or remaining
    to be traversed.

42
Results Continued...
  • Messages of lowest sequence number are
    essentially those transmitted earliest in the
    computation sequence. Giving priority to such
    messages speeds up the rate at which processes
    can begin transmitting, and hence speeds up the
    computation as a whole.
  • Traffic congestion in the hypercube is decreased
    by giving priority to messages with the fewest
    hops, which allows messages with longer routes
    to proceed with less blocking than would
    otherwise be the case.
  • By giving priority to messages with fewer
    choices, the overall amount of blocking is
    decreased.
43
One Bidirectional Link Between Nodes

44
Two Unidirectional Links Between Nodes

45
Percentage Improvement When Two Unidirectional
Lines Are Used

46
Observations From The Graphs Above
  • Having two unidirectional links improves
    throughput over one bidirectional link.
  • Note: the improvement depends on the
    prioritization scheme.
  • The percentage improvement is rarely more than
    fifteen per cent, and is usually much smaller.
  • This effect may be caused by the fact that the
    problem graph is a DAG, thereby imposing a
    directionality on the flow of messages.

47
Conclusions
  • Throughput of a certain class of problems on a
    hypercube can be increased by up to a factor of
    two through use of dynamic rather than static
    routing algorithms, and also by prioritizing the
    messages.
  • It is likely that different prioritization
    schemes would yield improved throughput for
    other classes of problems.

48
Questions ?