CS184a: Computer Architecture (Structure and Organization) - PowerPoint PPT Presentation



Slides: 48

Provided by: andre57

Learn more at: http://courses.cms.caltech.edu


Transcript and Presenter's Notes


1
CS184a: Computer Architecture (Structure and Organization)

Day 9: January 29, 2003
Compute 1: LUTs

2
Previously

Instruction Space Modeling
huge range of densities
huge range of efficiencies
large architecture space
modeling to understand design space
Empirical Comparisons
Ground cost of programmability

3
Today

Look at Programmable Compute Blocks
Specifically LUTs Today
Recurring theme
define parameterized space
identify costs and benefits
look at typical application requirements
compose results, try to find best point

4
Compute Function

What do we use for compute function
Any Universal
NANDx
ALU
LUT

5
Lookup Table

Load bits into table
$2^N$ bits to describe
$\Rightarrow 2^{2^N}$ different functions
Table translation
performs logic transform
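
As a minimal sketch of the lookup-table idea on this slide: the table below holds one output bit per input combination, and evaluation is just an indexed read. The helper names (`make_lut`, `lut_eval`, `num_functions`) and the MSB-first addressing convention are assumptions made for illustration; they are not from the slides.

```python
from itertools import product

def make_lut(func, n):
    """Build the 2**n-entry truth table for an n-input Boolean function."""
    return [func(*bits) & 1 for bits in product((0, 1), repeat=n)]

def lut_eval(table, inputs):
    """Table translation: the input bits form the address (MSB first)."""
    addr = 0
    for b in inputs:
        addr = (addr << 1) | (b & 1)
    return table[addr]

def num_functions(n):
    """Each of the 2**n table bits is free, so there are 2**(2**n) functions."""
    return 2 ** (2 ** n)

# A 3-input AND loaded into a 3-LUT (8 table bits).
and3 = make_lut(lambda a, b, c: a & b & c, 3)
print(len(and3), lut_eval(and3, (1, 1, 1)), lut_eval(and3, (1, 0, 1)))
print(num_functions(4))  # functions a single 4-LUT can realize
```

Loading a different table changes the function without changing the hardware, which is the sense in which the LUT is a programmable compute block.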

6
Lookup Table
7
We could...

Just build a large memory = large LUT
Put our function in there
What's wrong with that?

8
FPGA = Many small LUTs
Alternative to one big LUT
9
Toronto FPGA Model
10
What's best to use?

Small LUTs
Large Memories
small LUTs or large LUTs
or, how big should the memory blocks we use to
perform computation be?

11
Start to Sort Out Big vs. Small LUTs

Establish equivalence
how many small LUTs equal one big LUT?

12
How many gates in a 2-LUT?
13
How Much Logic in a LUT?

Lower Bound?
Concrete: how many 4-LUTs to implement an M-LUT?
Not use all inputs?
0, maybe 1
Use all inputs?
(M-1)/3

(M-1)/(k-1) for K-LUT
14
How much logic in a LUT?

Upper Upper Bound
M-LUT implemented w/ 4-LUTs
M-LUT $\le 2^{M-4} + (2^{M-4} - 1) \le 2^{M-3}$ 4-LUTs
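
One construction that meets this upper bound can be sketched as follows, assuming the bound comes from splitting the M-input truth table into 4-input leaf tables and selecting among them with a tree of 2:1 multiplexers (each mux fits in one 4-LUT). The function names and the MSB-first bit ordering are illustrative assumptions.

```python
def decompose(table, m):
    """Split a 2**m-entry truth table into 2**(m-4) sixteen-entry 4-LUT tables.
    The top m-4 input bits pick a leaf; the low 4 bits index inside it."""
    return [table[i:i + 16] for i in range(0, 2 ** m, 16)]

def evaluate(leaves, inputs):
    """Evaluate with leaf 4-LUTs plus a 2:1 mux tree; return (value, mux_count)."""
    m = len(inputs)
    addr = 0
    for b in inputs[m - 4:]:          # all leaves share the same 4 low inputs
        addr = (addr << 1) | b
    level = [leaf[addr] for leaf in leaves]
    muxes = 0
    # Each remaining input, least significant first, drives one mux stage.
    for sel in reversed(inputs[:m - 4]):
        level = [level[i + sel] for i in range(0, len(level), 2)]
        muxes += len(level)
    return level[0], muxes

# Arbitrary 6-input function, used only to exercise the construction.
M = 6
table = [((i * 37 + 11) % 7) & 1 for i in range(2 ** M)]
leaves = decompose(table, M)
value, muxes = evaluate(leaves, [1, 0, 1, 1, 0, 1])
print(len(leaves), muxes, len(leaves) + muxes)  # leaf LUTs, mux LUTs, total
```

For M = 6 this uses 4 leaf LUTs and 3 mux LUTs, i.e. 2^(M-4) + (2^(M-4) - 1) = 7, which is below 2^(M-3) = 8.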

15
How Much?

Lower Upper Bound
$2^{2^M}$ functions realizable by M-LUT
Say we need n 4-LUTs to cover; compute n
strategy: count functions realizable by each
$(2^{2^4})^n \ge 2^{2^M}$
$n \log(2^{2^4}) \ge \log(2^{2^M})$
$n \cdot 2^4 \log(2) \ge 2^M \log(2)$
$n \cdot 2^4 \ge 2^M$
$n \ge 2^{M-4}$

16
How Much?

Combine
Lower Upper Bound
Upper Upper Bound
(number of 4-LUTs in M-LUT)
$2^{M-4} \le n \le 2^{M-3}$
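
To get a feel for how tight the combined bounds are, the short sketch below tabulates them for a few values of M. The lower value is the counting-argument bound and the upper value is the leaf-plus-mux-tree count 2^(M-4) + (2^(M-4) - 1); the function name is a hypothetical helper, not something from the lecture.

```python
def lut4_count_bounds(m):
    """Bounds on the 4-LUTs needed for the most complex m-input functions."""
    if m < 4:
        raise ValueError("need at least 4 inputs")
    lower = 2 ** (m - 4)                       # counting argument
    upper = 2 ** (m - 4) + (2 ** (m - 4) - 1)  # leaf LUTs plus mux tree
    return lower, upper

for m in (4, 8, 15):
    lower, upper = lut4_count_bounds(m)
    print(f"M={m}: {lower} <= n <= {upper}")
```

The two bounds differ by less than a factor of two, so for the most complex functions the 4-LUT count is pinned to roughly 2^(M-4).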

17
Memories and 4-LUTs

For the most complex functions an M-LUT has $2^{M-4}$
4-LUTs
SRAM 32K×8, λ=0.6µm
170Mλ² (21ns latency)
$8 \times 2^{11}$ = 16K 4-LUTs
XC3042, λ=0.6µm
180Mλ² (13ns delay per CLB)
288 4-LUTs
Memory is 50x denser than FPGA
and faster
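
The density comparison on this slide can be rechecked with a few lines of arithmetic. The sketch treats each of the 8 SRAM outputs as a 15-input LUT worth 2^(15-4) 4-LUTs and uses the area and LUT figures quoted on the slide; the function and variable names are assumptions for illustration.

```python
def lut4_equivalents(addr_bits, data_bits):
    """4-LUT equivalents of a memory when it holds the most complex functions:
    each output bit is an addr_bits-input LUT worth 2**(addr_bits - 4) 4-LUTs."""
    return data_bits * 2 ** (addr_bits - 4)

sram_luts = lut4_equivalents(addr_bits=15, data_bits=8)  # 32Kx8 SRAM
sram_area = 170e6   # lambda^2, figure quoted on the slide
fpga_luts = 288     # XC3042 4-LUT count quoted on the slide
fpga_area = 180e6   # lambda^2, figure quoted on the slide

density_ratio = (sram_luts / sram_area) / (fpga_luts / fpga_area)
print(sram_luts, round(density_ratio))
```

With these inputs the ratio comes out near 60, which agrees with the slide's rounded claim that memory is about 50x denser for the most complex functions.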

18
Memory and 4-LUTs

For regular functions?
15-bit parity
entire 32K×8 SRAM
5 4-LUTs
(2% of XC3042, 3.2Mλ² ≈ 1/50th Memory)
7b Add
entire 32K×8 SRAM
14 4-LUTs
(5% of XC3042, 8.8Mλ² ≈ 1/20th Memory)
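
The 5-LUT figure for 15-bit parity can be reproduced with a small sketch: each 4-LUT XORs up to four signals into one, so every LUT removes at most three signals from the pool. The queue-based packing below is one assumed way to build such a tree, not necessarily how the slide's mapping was produced.

```python
from collections import deque

def parity_with_luts(bits, k=4):
    """Compute parity using k-input XOR LUTs; return (parity, LUTs used)."""
    queue = deque(bits)
    luts = 0
    while len(queue) > 1:
        # One LUT consumes up to k pending signals and emits their XOR.
        group = [queue.popleft() for _ in range(min(k, len(queue)))]
        out = 0
        for b in group:
            out ^= b
        queue.append(out)
        luts += 1
    return queue[0], luts

bits = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0]  # 15 inputs, eight 1s
print(parity_with_luts(bits))
```

The same 15-input function occupies the whole 32K×8 SRAM when stored as a flat table, which is the point of the comparison: interconnect between small LUTs lets the implementation follow the structure of the function.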

19
LUT Interconnect

Interconnect allows us to exploit structure in
computation
Already know
LUT Area << Interconnect Area
Area of an M-LUT on FPGA >> M-LUT Area
but most M-input functions
complexity << $2^M$

20
Different Instance, Same Concept

Most general functions are huge
Applications exhibit structure
Exploit structure to optimize common case

21
LUT Count vs. base LUT size
22
LUT vs. K

DES MCNC Benchmark
moderately irregular

23
Toronto Experiments

Want to determine best K for LUTs
Bigger LUTs
handle complicated functions efficiently
less interconnect overhead
Smaller LUTs
handle regular functions efficiently
interconnect allows exploitation of compute
structure
What's the typical complexity/structure?

24
Familiar Systematization

Define a design/optimization space
pick key parameters
e.g. K number of LUT inputs
Build a cost model
Map designs
Look at resource costs at each point
Compose
Logical Resources $\times$ Resource Cost
Look for best design points

25
Toronto LUT Size

Map to K-LUT
use Chortle
Route to determine wiring tracks
global route
different channel width W for each benchmark
Area Model for K and W
$A_{LUT}$ exponential in K
Interconnect area based on switch count.

26
LUT Area vs. K

Routing Area roughly linear in K ?

27
Mapped LUT Area

Compose Mapped LUTs and Area Model
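
A hedged sketch of what "compose" means here: multiply the number of K-LUTs the mapper needed by a per-block area that has an exponential memory term and a roughly linear routing term. The coefficients and the LUT counts in the usage lines are invented purely to exercise the code; they are not the Toronto data.

```python
def mapped_area(lut_count, k, bit_area=1.0, route_area_per_input=1.0):
    """Mapped area = LUT count x (LUT memory area + routing area per block).
    Memory grows as 2**k; routing is modeled as linear in k (an assumption)."""
    block_area = bit_area * 2 ** k + route_area_per_input * k
    return lut_count * block_area

def best_k(lut_counts, **model):
    """lut_counts maps K -> K-LUTs needed after mapping; return the cheapest K."""
    return min(lut_counts, key=lambda k: mapped_area(lut_counts[k], k, **model))

# Made-up LUT counts for one hypothetical benchmark, for illustration only.
counts = {2: 100, 4: 40, 6: 30}
for k, n in counts.items():
    print(k, mapped_area(n, k, route_area_per_input=20.0))
print("best K:", best_k(counts, route_area_per_input=20.0))
```

The shape of the tradeoff is what matters: LUT count falls with K while per-block area rises, so the product has a minimum at some intermediate K.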

28
Mapped Area vs. LUT K
N.B. unusual case: minimum area at K=3
29
Toronto Result

Minimum LUT Area
at K=4
Important to note minimum on previous slides
based on particular cost model
robust for different switch sizes
(wire widths)
see graphs in paper

30
Implications
31
Implications

Custom? / Gate Arrays?
More restricted logic functions?

32
Relate to Sequential?

How does this result relate to sequential
execution case?
Number of LUTs = Number of Cycles
Interconnect Cost?
Naïve
structure in practice?
Instruction Cost?

33
Delay

Back to Spatial

34
Delay?

Circuit Depth in LUTs?
Simple Function → M-input AND

1 table lookup in M-LUT
$\log_k(M)$ lookups in K-LUT
35
Delay?

M-input Complex function
1 table lookup for M-LUT
Lower bound: $\lceil \log_k(2^{M-k}) \rceil + 1$
$\log_k(2^{M-k}) = (M-k)\log_k(2)$

36
Some Math

$Y = \log_k(2)$
$k^Y = 2$
$Y \log_2(k) = 1$
$Y = 1/\log_2(k)$
$\log_k(2) = 1/\log_2(k)$

$(M-k)\log_k(2)$
$= (M-k)/\log_2(k)$

37
Delay?

M-input Complex function
Lower bound: $\lceil \log_k(2^{M-k}) \rceil + 1$
$\log_k(2^{M-k}) = (M-k)\log_k(2)$
Lower Bound: $\lceil (M-k)/\log_2(k) \rceil + 1$

38
Delay?

M-input Complex function
Upper Bound
use each k-LUT as a $(k - \log_2(k))$-input mux
Upper Bound: $\lceil (M-k)/\log_2(k - \log_2(k)) \rceil + 1$

39
Delay?

M-input Complex function
1 table lookup for M-LUT
between $\lceil (M-k)/\log_2(k) \rceil + 1$
and $\lceil (M-k)/\log_2(k - \log_2(k)) \rceil + 1$

40
Delay

Simple: log M
Complex: linear in M
Both scale as 1/log(k)
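
The depth formulas from the preceding slides can be evaluated directly. The sketch below assumes the simple case is a balanced tree of k-input LUTs (depth about log_k M) and the complex case lies between ceil((M-k)/log2(k)) + 1 and ceil((M-k)/log2(k - log2(k))) + 1; the function names are hypothetical, and the upper bound is only defined for k > 2.

```python
import math

def simple_depth(m, k):
    """LUT levels for a simple m-input function (e.g. AND) as a k-ary tree."""
    depth, reach = 0, 1
    while reach < m:      # each level multiplies the inputs covered by k
        reach *= k
        depth += 1
    return depth

def complex_depth_bounds(m, k):
    """(lower, upper) bounds on LUT depth for a complex m-input function."""
    if k <= 2:
        raise ValueError("upper bound needs k > 2 so k - log2(k) > 1")
    lower = math.ceil((m - k) / math.log2(k)) + 1
    upper = math.ceil((m - k) / math.log2(k - math.log2(k))) + 1
    return lower, upper

print(simple_depth(15, 4))          # levels for a 15-input AND in 4-LUTs
print(complex_depth_bounds(15, 4))  # bounds for a complex 15-input function
```

For M = 15 and k = 4 the simple function needs 2 levels while the complex one needs between 7 and 12, against a single lookup in a 15-LUT.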

41
Circuit Depth vs. K
42
LUT Delay vs. K

For small LUTs
$t_{LUT} \approx c_0 + c_1 \cdot K$

Large LUTs
add length term
$c_2 \cdot \sqrt{2^K}$
Plus Wire Delay
$\sim \sqrt{\text{area}}$

43
Delay vs. K
Why not satisfied with this model?
Delay = Depth $\times$ ($t_{LUT}$ + $t_{Interconnect}$)
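
A small sketch of this delay model, reading the slides as: total delay is depth times the sum of LUT delay and interconnect delay, with LUT delay modeled as a constant plus a term linear in K plus a length term growing as the square root of 2^K. The square-root form is my reading of the garbled slide text, and the coefficient values are placeholders, not measured constants.

```python
import math

def t_lut(k, c0=1.0, c1=0.1, c2=0.01):
    """LUT delay: fixed term + linear-in-K term + length term ~ sqrt(2**K)."""
    return c0 + c1 * k + c2 * math.sqrt(2 ** k)

def total_delay(depth, k, t_interconnect, **coeffs):
    """Delay = Depth x (t_LUT + t_Interconnect)."""
    return depth * (t_lut(k, **coeffs) + t_interconnect)

# Placeholder coefficients chosen only to make the arithmetic easy to follow.
print(t_lut(4, c0=1.0, c1=0.5, c2=0.25))
print(total_delay(3, 4, t_interconnect=2.0, c0=1.0, c1=0.5, c2=0.25))
```

A fixed interconnect delay per level is the weak point the slide's question hints at: larger blocks also cover more area, so the wire delay between them is unlikely to stay constant as K grows.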
44
Observation

General interconnect is expensive
Larger logic blocks
less interconnect crossing
lower interconnect delay
get larger
get slower
Happens faster than modeled here due to area
less area efficient
don't match structure in computation

45
Big Ideas [MSB Ideas]

Memory most dense programmable structure for the
most complex functions
Memory inefficient (scales poorly) for structured
compute tasks
Most tasks have some structure
Programmable interconnect allows us to exploit
that structure

46
Big Ideas [MSB-1 Ideas]

Area
LUT count decrease w/ K, but slower than
exponential
LUT size increase w/ K
exponential LUT function
empirically linear routing area
Minimum area around K=4

47
Big Ideas [MSB-1 Ideas]

Delay
LUT depth decreases with K
in practice closer to log(K)
Delay increases with K
small K: linear + large fixed term
minimum around 5-6
