M-Machine and Grids: Parallel Computer Architectures

1
M-Machine and Grids: Parallel Computer
Architectures
  • Navendu Jain

2
Readings
  • The M-Machine Multicomputer, Fillo et al.,
    MICRO 1995
  • Exploiting Fine-Grain Thread Level Parallelism
    on the MIT Multi-ALU Processor, Keckler et al.,
    ISCA 1998
  • A Design Space Evaluation of Grid Processor
    Architectures, Nagarajan et al., MICRO 2001

3
Outline
  • The M-Machine Multicomputer
  • Thread Level Parallelism on M-Machine
  • Grid Processor Architectures
  • Review and Discussion

4
The M-Machine Multicomputer
5
Design Motivation
  • Achieve higher throughput from memory resources
  • Increase the chip area devoted to processors
  • Arithmetic-to-bandwidth ratio of 12
    operations/word
  • Minimize global communication (local sync.)
  • Faster execution of fixed-size problems
  • Easier programmability of parallel computers
  • Incremental approach

6
Architecture
  • A bi-directional 3-D mesh network of
    multi-threaded processing nodes
  • Each chip comprises a multi-ALU processor (MAP)
    and 128KB of on-chip synchronous DRAM
  • A user-accessible message-passing system (SEND)
  • Single global virtual address space
  • Target clock 100 MHz (control logic 40 MHz)

7
Multi-ALU Processor (MAP)
  • A MAP chip comprises
  • Three 64-bit, 3-issue clusters
  • A 2-way interleaved on-chip cache
  • A memory switch
  • A cluster switch
  • An external memory interface
  • On-chip network interfaces and routers

8
A MAP Cluster
  • A 64-bit, three-issue pipelined processor
  • 2 integer ALUs
  • 1 floating-point ALU
  • Register files
  • A 4KB instruction cache
  • A MAP instruction holds 1, 2, or 3 operations

9
MAP Chip Die (18 mm on a side, 5M transistors)
10
Exploiting Parallelism on M-Machine
11
Threads
  • Exploit ILP both within and across the clusters
  • Horizontal Threads (H-Threads)
  • Instruction-level parallelism
  • Execute on a single MAP cluster
  • 3-wide instruction stream
  • Communication/synchronization through
    messages/registers/memory
  • Up to 6 H-Threads can be interleaved dynamically
    on a cycle-by-cycle basis

12
Threads (contd.)
  • Vertical Threads (V-Threads)
  • Thread-level parallelism (a standard process)
  • Contains up to 4 H-Threads (one per cluster)
  • Flexible scheduling (compiler/run-time)
  • Communication/synchronization through registers
  • At most 6 resident V-Threads
  • 4 user slots, 1 event slot, 1 exception slot
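
As a rough illustration of this two-level hierarchy, the toy scheduler below (an illustrative sketch only; the class names, instruction lists, and strict round-robin policy are invented here, not taken from the MAP ISA) interleaves the H-Threads of several V-Threads on each cluster, cycle by cycle:

```python
# Toy model of the M-Machine's two-level thread hierarchy.
# Numbers follow the slides: 4 clusters, at most 6 resident V-Threads.

NUM_CLUSTERS = 4      # one H-Thread slot per cluster within a V-Thread
MAX_VTHREADS = 6      # 4 user slots + 1 event slot + 1 exception slot

class HThread:
    """An H-Thread: an instruction stream bound to one cluster."""
    def __init__(self, name, ops):
        self.name, self.ops = name, list(ops)

class VThread:
    """A V-Thread (a standard process): up to one H-Thread per cluster."""
    def __init__(self, name, hthreads):
        assert len(hthreads) <= NUM_CLUSTERS
        self.name, self.hthreads = name, hthreads

def run(vthreads):
    assert len(vthreads) <= MAX_VTHREADS
    trace = []
    # Each cluster interleaves its resident H-Threads cycle by cycle.
    for cluster in range(NUM_CLUSTERS):
        resident = [vt.hthreads[cluster] for vt in vthreads
                    if cluster < len(vt.hthreads)]
        while any(ht.ops for ht in resident):
            for ht in resident:            # round-robin, zero-cost switch
                if ht.ops:
                    trace.append((cluster, ht.name, ht.ops.pop(0)))
    return trace

trace = run([VThread("A", [HThread("A0", ["add", "mul"])]),
             VThread("B", [HThread("B0", ["ld", "st"])])])
# Cluster 0 alternates between A0 and B0 on successive cycles.
```

With two single-cluster V-Threads, cluster 0 alternates between them every cycle, which is the dynamic cycle-by-cycle interleaving the slides describe.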

14
Concurrency Model: Three Levels of Parallelism
  • Instruction-level parallelism (~1 instruction)
  • VLIW, superscalar processors
  • Issues: control flow, data dependency,
    scalability
  • Thread-level parallelism (~1000 instructions)
  • Chip multiprocessors
  • Issues: limited to coarse TLP; inner cores
    non-optimal
  • Fine-grain parallelism (~50–1000 instructions)

15
Mapping
(Figure: matching program granularity to the architecture)
16
Fine-grain Overheads
  • Thread creation (11 cycles, hfork)
  • Communication
  • Register-to-register reads/writes
  • Message passing / on-chip cache
  • Synchronization
  • Blocking on a register (full/empty bit)
  • Barrier (cbar instruction)
  • Memory (sync bit)
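
The register full/empty bit can be pictured with the following software sketch. The hardware blocks the consuming thread directly; here that blocking is modeled (as an assumption of this sketch, not the real mechanism) with a Python condition variable:

```python
# Illustrative model of full/empty-bit synchronization: a register
# whose reader blocks until a writer marks it full.
import threading

class SyncRegister:
    """A register with a full/empty presence bit."""
    def __init__(self):
        self._full = False
        self._value = None
        self._cv = threading.Condition()

    def write(self, value):
        # Producer: deposit the value and set the full bit.
        with self._cv:
            self._value = value
            self._full = True
            self._cv.notify_all()

    def read(self):
        # Consumer: block until the register is full, then read.
        with self._cv:
            self._cv.wait_for(lambda: self._full)
            return self._value

r = SyncRegister()
threading.Thread(target=lambda: r.write(42)).start()
result = r.read()   # blocks until the producer's write lands
```

This is the producer/consumer pattern behind "blocking on a register": synchronization costs one register access rather than a heavyweight OS primitive.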

17
Grid Processor Architecture
18
Design Motivation
  • Continued scaling of the clock rate
  • Scalability of the processing core
  • Higher ILP and instruction throughput (IPC)
  • Mitigate global wire-delay overheads
  • Closer coupling of architecture and compiler

19
Architecture
  • An interconnected 2-D array of ALU nodes
  • Each node has an instruction buffer (IB) and an
    execution unit
  • A single control thread maps instructions to
    nodes
  • Block-atomic execution model
  • Blocks of statically scheduled instructions are
    mapped as a unit
  • Dynamic execution in data-flow order
  • Temporary values are forwarded to the consumer
    ALUs
  • The critical path is scheduled along the
    shortest physical path
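
The block-atomic, data-flow execution model can be sketched as a tiny interpreter. The block encoding here (instruction id mapped to an operation and its source ids) is invented for illustration, not the GPA's actual instruction format; the point is the firing rule, where a node executes once all its operands have arrived and forwards its result to its consumers:

```python
# Minimal dataflow-firing sketch of block-atomic execution.
import operator

OPS = {"add": operator.add, "mul": operator.mul}

def execute_block(instrs, inputs):
    """instrs: id -> (op, [source ids]); inputs: id -> initial value."""
    values = dict(inputs)
    # A node is ready once every source operand has a value.
    ready = [i for i, (op, srcs) in instrs.items()
             if all(s in values for s in srcs)]
    while ready:
        i = ready.pop(0)
        op, srcs = instrs[i]
        values[i] = OPS[op](*(values[s] for s in srcs))   # node fires
        # Forwarding: wake any consumer whose operands just completed.
        for j, (_, jsrcs) in instrs.items():
            if (j not in values and j not in ready
                    and all(s in values for s in jsrcs)):
                ready.append(j)
    return values

block = {"t1": ("add", ["a", "b"]),    # t1 = a + b
         "t2": ("mul", ["t1", "a"])}   # t2 = t1 * a
out = execute_block(block, {"a": 2, "b": 3})
# out["t2"] == (2 + 3) * 2 == 10
```

No centralized issue logic is consulted inside the block: execution order emerges from operand arrival, which is what lets the grid avoid global wires on the critical path.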

20
GPA Architecture
21
Example Block-Atomic Mapping
22
Implementation
  • Instruction fetch and map
  • Predicated hyper-blocks; move instructions
  • Execution control logic
  • Operand routing: max. 3 destinations; split
    instructions
  • Hyper-block control
  • Predication (execute-all approach); cmove
    instructions
  • Block commit
  • Block stitching
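
The execute-all predication style can be illustrated with a brief sketch of cmove semantics (the helper names below are invented for this example): both arms of a branch execute unconditionally, and the predicate merely selects which result commits.

```python
# Sketch of execute-all predication with a conditional move.
def cmove(pred, a, b):
    """Hardware-style conditional move: returns a if pred else b."""
    return a if pred else b

def abs_hyperblock(x):
    # Both candidate results are computed unconditionally...
    pos, neg = x, -x
    # ...and the predicate selects the one that commits.
    return cmove(x >= 0, pos, neg)
```

Trading redundant work for the removal of branches keeps the mapped hyper-block free of control-flow joins inside the grid.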

23
Review and Discussion
24
Key Ideas: Convergence
  • Microprocessor as a number of superscalar
    processors; comm./sync. via registers; low
    overheads
  • Exploiting both ILP and TLP granularities
  • Dependences mapped onto a grid of ALUs
  • Replication reduces design/verification effort
  • Point-to-point communication
  • Exposing architecture partitioning and the flow
    of operations to the compiler
  • Avoid wire and routing delays and the
    memory-wall problem

25
Ideas: Divergence
  • M-Machine
  • On-chip cache and register-based mechanisms:
    delays
  • Broadcast and point-to-point communication
  • GPA
  • Register set replaced by grid chaining:
    scalability
  • Point-to-point communication only
  • TERA
  • Fine-grain threads; memory comm./sync.
    (full/empty bits)
  • No support for single-threaded code

26
Drawbacks (Unresolved Issues)
  • M-Machine
  • Scalability
  • Clock speeds
  • Memory synchronization (use hfork)
  • Grid Processor Arch.
  • Data caches far from the ALUs
  • Delays between dependent operations due to
    network routers and wires
  • Complex frame management and block stitching
  • Explicit compiler dependence

27
Challenges/Future Directions
  • Architectural support to extract TLP
  • Parallelizing compiler technology
  • How many cores/threads?
  • No. of threads vs. memory latency and wire
    delays (Flynn)
  • Inter-thread communication
  • Grid height of 8 gives IPC of 5–6 (GPA; Peter)
  • Optimization as f(comm., delays, memory costs)

28
Challenges (contd.)
  • On-the-fly data-dependence detection (RAW/WAR)
  • TLP/ILP balance on the M-Machine multicomputer

29
Thanks