Chapter 8 CPU and Memory: Design, Implementation, and Enhancement

Slides: 39

Provided by: anvariNe68

1
Chapter 8: CPU and Memory: Design, Implementation,
and Enhancement

The Architecture of Computer Hardware and Systems
Software: An Information Technology Approach
3rd Edition, Irv Englander
John Wiley and Sons ©2003

2
CPU Architecture Overview

CISC: Complex Instruction Set Computer
RISC: Reduced Instruction Set Computer
CISC vs. RISC Comparisons
VLIW: Very Long Instruction Word
EPIC: Explicitly Parallel Instruction Computing

3
CISC Architecture

Examples
Intel x86, IBM Z-Series Mainframes, older CPU
architectures
Characteristics
Few general purpose registers
Many addressing modes
Large number of specialized, complex instructions
Instructions are of varying sizes

4
Limitations of CISC Architecture

Complex instructions are infrequently used by
programmers and compilers
Memory references, loads and stores, are slow and
account for a significant fraction of all
instructions
Procedure and function calls are a major
bottleneck
Passing arguments
Storing and retrieving values in registers

5
RISC Features

Examples
PowerPC, Sun SPARC
Limited and simple instruction set
Fixed length, fixed format instruction words
Enable pipelining, parallel fetches and
executions
Limited addressing modes
Reduce complicated hardware
Register-oriented instruction set
Reduce memory accesses
Large bank of registers
Reduce memory accesses
Efficient procedure calls

6
CISC vs. RISC Processing
7
Circular Register Buffer
8
Circular Register Buffer- After Procedure Call
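
The circular register buffer can be sketched in a few lines. This is a minimal illustration, not the layout of any particular CPU: the class name, the window count, and the 8 in / 8 local / 8 out split are assumptions chosen for the example, and window overflow (spilling to memory when the ring is full) is not handled. The point it shows is that the caller's "out" registers are the callee's "in" registers, so arguments are passed without any memory access.

```python
class RegisterWindows:
    """Ring of overlapping register windows (sizes are illustrative)."""

    def __init__(self, n_windows=8, n_in=8, n_local=8):
        self.n_windows = n_windows
        self.n_in = n_in
        # Each window adds n_in + n_local fresh registers; its "out"
        # registers are the "in" registers of the next window in the ring.
        self.stride = n_in + n_local
        self.size = n_windows * self.stride
        self.regs = [0] * self.size
        self.cwp = 0  # current window pointer

    def _index(self, kind, i):
        offset = {"in": 0, "local": self.n_in, "out": self.stride}[kind]
        return (self.cwp * self.stride + offset + i) % self.size

    def read(self, kind, i):
        return self.regs[self._index(kind, i)]

    def write(self, kind, i, value):
        self.regs[self._index(kind, i)] = value

    def call(self):
        # Procedure call: rotate to the next window.
        self.cwp = (self.cwp + 1) % self.n_windows

    def ret(self):
        # Procedure return: rotate back to the caller's window.
        self.cwp = (self.cwp - 1) % self.n_windows


rw = RegisterWindows()
rw.write("out", 0, 42)      # caller places an argument
rw.call()
arg = rw.read("in", 0)      # callee sees it as an "in" register
rw.write("in", 0, arg + 1)  # callee leaves a return value
rw.ret()
result = rw.read("out", 0)  # caller reads it back from its "out" register
```

Only the window pointer changes on a call or return; no register contents are copied.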
9
CISC vs. RISC Performance Comparison

RISC → simpler instructions
→ more instructions
→ more memory accesses
RISC → more bus traffic and
increased cache memory misses
More registers would improve CISC performance, but
there is no space available for them
Modern CISC and RISC architectures are becoming
similar

10
VLIW Architecture

Transmeta Crusoe CPU
128-bit instruction bundle (molecule)
Four 32-bit atoms (atom = instruction)
Parallel processing of 4 instructions
64 general purpose registers
Code morphing layer
Translates instructions written for other CPUs
into molecules
Instructions are not written directly for the
Crusoe CPU

11
EPIC Architecture

Intel Itanium CPU
128-bit instruction bundle
3 41-bit instructions
5 bits to identify type of instructions in bundle
128 64-bit general purpose registers
128 82-bit floating point registers
Intel x86 instruction set included
Programmers and compilers follow guidelines to
ensure parallel execution of instructions
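
The bundle arithmetic (3 x 41 + 5 = 128 bits) can be checked with a short packing sketch. The function names are hypothetical, and placing the 5 type bits in the low-order positions is an assumption of this sketch; the slot contents are arbitrary integers, not real Itanium encodings.

```python
SLOT_BITS = 41      # width of one instruction slot
TEMPLATE_BITS = 5   # bits identifying the instruction types in the bundle
SLOTS_PER_BUNDLE = 3


def pack_bundle(template, slots):
    """Pack a 5-bit template and three 41-bit slots into one 128-bit integer."""
    if not 0 <= template < (1 << TEMPLATE_BITS):
        raise ValueError("template must fit in 5 bits")
    if len(slots) != SLOTS_PER_BUNDLE:
        raise ValueError("a bundle holds exactly three instructions")
    bundle = template
    for n, slot in enumerate(slots):
        if not 0 <= slot < (1 << SLOT_BITS):
            raise ValueError("each instruction must fit in 41 bits")
        bundle |= slot << (TEMPLATE_BITS + n * SLOT_BITS)
    return bundle


def unpack_bundle(bundle):
    """Split a 128-bit bundle back into (template, [slot0, slot1, slot2])."""
    template = bundle & ((1 << TEMPLATE_BITS) - 1)
    mask = (1 << SLOT_BITS) - 1
    slots = [(bundle >> (TEMPLATE_BITS + n * SLOT_BITS)) & mask
             for n in range(SLOTS_PER_BUNDLE)]
    return template, slots


bundle = pack_bundle(0x10, [1, 2, 3])
```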

12
Paging

Managed by the operating system
Built into the hardware
Independent of application

13
Logical vs. Physical Addresses

Logical addresses are relative locations of data,
instructions, and branch targets, and are separate
from physical addresses
Logical addresses are mapped to physical addresses
Physical addresses do not need to be consecutive
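
A minimal sketch of the mapping, assuming a 4 KB page size and a page table held in a Python dict (both are assumptions for illustration; real page tables are hardware-defined structures maintained by the operating system). The logical address is split into a page number and an offset, the page number is looked up to find a frame, and the offset is carried over unchanged.

```python
OFFSET_BITS = 12                 # assumed page size: 2**12 = 4096 bytes
PAGE_SIZE = 1 << OFFSET_BITS


def translate(logical_address, page_table):
    """Map a logical address to a physical address through a page table."""
    page = logical_address >> OFFSET_BITS
    offset = logical_address & (PAGE_SIZE - 1)
    if page not in page_table:
        # In a real system this is a page fault handled by the OS.
        raise LookupError(f"page fault on page {page}")
    frame = page_table[page]
    return (frame << OFFSET_BITS) | offset


# Consecutive logical pages 0 and 1 map to non-consecutive frames 5 and 2.
page_table = {0: 5, 1: 2}
physical_a = translate(0x0010, page_table)  # page 0, offset 0x010
physical_b = translate(0x1004, page_table)  # page 1, offset 0x004
```

The two logical pages are adjacent, but the frames they land in are not, which is the sense in which physical addresses need not be consecutive.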

14
Logical vs. Physical Address
15
Page Address Layout
16
Page Translation Process
17
Memory Enhancements

Memory is slow compared to CPU processing speeds!
2 GHz CPU: 1 cycle in ½ of a billionth of a
second (0.5 ns)
70 ns DRAM: 1 access in 70 billionths of a second
Methods to improve memory access
Wide Path Memory Access
Retrieve multiple bytes instead of 1 byte at a
time
Memory Interleaving
Partition memory into subsections, each with its
own address register and data register
Cache Memory
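
One common way to partition memory for interleaving is by the low-order address bits, so that consecutive addresses fall in different subsections (banks). The sketch below assumes that scheme and four banks; the slides do not say which scheme is meant, and the function name is hypothetical.

```python
def interleave(address, n_banks=4):
    """Return (bank, location within bank) under low-order interleaving."""
    return address % n_banks, address // n_banks


# Consecutive addresses rotate through the banks, so a sequential run of
# accesses can be overlapped across banks instead of queuing on one.
banks = [interleave(a)[0] for a in range(8)]
```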

18
Memory Interleaving
19
Why Cache?

Even the fastest hard disk has an access time of
about 10 milliseconds
A 2 GHz CPU waiting 10 milliseconds wastes 20
million clock cycles!
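
The arithmetic is clock rate times waiting time. The helper name below is hypothetical; the 70 ns DRAM figure is the one quoted on the Memory Enhancements slide.

```python
def cycles_waiting(clock_hz, wait_seconds):
    """Clock cycles that elapse while the CPU waits for a device."""
    return round(clock_hz * wait_seconds)


disk_cycles = cycles_waiting(2e9, 10e-3)  # 10 ms disk access: 20,000,000 cycles
dram_cycles = cycles_waiting(2e9, 70e-9)  # 70 ns DRAM access: 140 cycles
```

Even a single DRAM access costs on the order of a hundred cycles, which is why a small, fast memory close to the CPU pays off.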

20
Cache Memory

Blocks: 8 or 16 bytes
Tags: location in main memory
Cache controller:
hardware that checks tags
Cache line:
unit of transfer between storage and cache memory
Hit ratio: ratio of hits out of total requests
Synchronizing cache and memory
Write through
Write back
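
The roles of blocks, tags, and the hit ratio can be seen in a small simulation. This sketch assumes a direct-mapped cache (each memory block can live in exactly one cache line), which the slides do not specify; the class name and the sizes are illustrative, and only tags are tracked, not the data or the write policy.

```python
class DirectMappedCache:
    """Tag-only model of a direct-mapped cache (illustrative sizes)."""

    def __init__(self, n_lines=4, block_size=16):
        self.n_lines = n_lines
        self.block_size = block_size
        self.tags = [None] * n_lines  # one tag per cache line
        self.hits = 0
        self.requests = 0

    def access(self, address):
        """Record one memory request; return True on a hit."""
        block = address // self.block_size  # which memory block
        line = block % self.n_lines         # the one line it may occupy
        tag = block // self.n_lines         # identifies the block in that line
        self.requests += 1
        if self.tags[line] == tag:
            self.hits += 1
            return True
        self.tags[line] = tag               # miss: load block, replace old tag
        return False

    def hit_ratio(self):
        return self.hits / self.requests if self.requests else 0.0


cache = DirectMappedCache()
# Addresses 0, 4, 8 share one 16-byte block; 64 maps to the same line and
# evicts it, so the final access to 0 misses again.
results = [cache.access(a) for a in (0, 4, 8, 64, 0)]
```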

21
Step-by-Step Use of Cache
22
Step-by-Step Use of Cache
23
Performance Advantages

Hit ratios of 90% are common
50% improved execution speed
Locality of reference is why caching works
Most memory references confined to small region
of memory at any given time
Well-written program in small loop, procedure or
function
Data likely in array
Variables stored together
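
The effect of the hit ratio on average access time can be estimated with a simple weighted average. The formula below is a common simplified model, not one given in the slides, and the 10 ns cache time is an assumed figure for illustration; the 70 ns memory time is the DRAM figure quoted earlier in the deck.

```python
def effective_access_time(hit_ratio, cache_ns, memory_ns):
    """Average access time: hits served by cache, misses by main memory."""
    return hit_ratio * cache_ns + (1 - hit_ratio) * memory_ns


# 90% of requests hit a 10 ns cache (assumed); the rest go to 70 ns DRAM.
average_ns = effective_access_time(0.9, 10, 70)
```

With these numbers the average comes out at roughly 16 ns, against 70 ns with no cache at all.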

24
Two-level Caches

Why do the sizes of the caches have to be
different?

25
Cache vs. Virtual Memory

Cache speeds up memory access
Virtual memory increases amount of perceived
storage
independence from the configuration and capacity
of the memory system
low cost per bit

26
Modern CPU Processing Methods

Timing Issues
Separate Fetch/Execute Units
Pipelining
Scalar Processing
Superscalar Processing

27
Timing Issues

Computer clock used for timing purposes
MHz: million steps per second
GHz: billion steps per second
Instructions can (and often do) take more than one
step
Data word width can require multiple steps

28
Separate Fetch-Execute Units

Fetch Unit
Instruction fetch unit
Instruction decode unit
Determine opcode
Identify type of instruction and operands
Several instructions are fetched in parallel and
held in a buffer until decoded and executed
IP: Instruction Pointer register
Execute Unit
Receives instructions from the decode unit
Appropriate execution unit services the
instruction

29
Alternative CPU Organization
30
Instruction Pipelining

Assembly-line technique to allow overlapping
between fetch-execute cycles of sequences of
instructions
Only one instruction is being executed to
completion at a time
Scalar processing
Average instruction execution rate is approximately
equal to the clock speed of the CPU (about one
instruction completed per clock cycle)
Problems from stalling
Instructions have different numbers of steps
Problems from branching
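
The benefit of overlapping can be estimated by counting cycles. This sketch assumes an idealized pipeline in which every instruction passes through the same number of one-cycle stages; the function names and the five-stage figure are illustrative, and stalls are modeled only as a lump sum of extra cycles.

```python
def unpipelined_cycles(n_instructions, n_stages):
    """Each instruction runs all its stages before the next one starts."""
    return n_instructions * n_stages


def pipelined_cycles(n_instructions, n_stages, stall_cycles=0):
    """First instruction fills the pipeline; each later one adds one cycle."""
    return n_stages + (n_instructions - 1) + stall_cycles


serial = unpipelined_cycles(100, 5)         # 500 cycles
overlapped = pipelined_cycles(100, 5)       # 104 cycles
with_stalls = pipelined_cycles(100, 5, 20)  # 124 cycles
```

Without stalls, 100 instructions finish in 104 cycles, close to one instruction per clock; every stall or mispredicted branch adds to that total.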

31
Branch Problem Solutions