Title: Memory Subsystem
1Memory Subsystem
- Lecture 3
- MSIT 123 Computer Architecture and Operating
Systems
2Digital Computer Subsystems
System bus
The (three, four, five, or) six main units of a
digital computer. Usually, the link unit (a
simple bus or a more elaborate network) is not
explicitly included in such diagrams.
3Memory
- A computer's main memory (or simply memory) holds currently active programs and data
- A program must be stored in memory before it can be executed, and
- Data must be stored in memory before the computer can manipulate them.
4Memory
- Different types of electronic memory
- RAM
- ROM
- Cache
- Dynamic RAM
- Static RAM
- Flash memory
- Memory Sticks
- Virtual memory
- Video memory
- BIOS
- Electronic items that have some form of memory
- Computers
- Cell phones
- PDAs
- Game consoles
- Car radios
- VCRs
- TVs
5Memory
6Memory Hierarchy
Cache memory
Primary memory
Secondary memory
Tertiary memory
7Typical Levels in a Hierarchical Memory
registers
cache
primary
secondary
tertiary
Names and key characteristics of levels in a
memory hierarchy.
8Physical Memory Devices
- Computer memory holds binary digits, or bits
- Any device that can assume either of two states (on, off) can serve as computer memory
- Most computers use integrated circuits
- Basic operations supported: reading and writing
- Some examples: Random Access Memory (RAM), Read Only Memory (ROM).
9Memory Technologies
- Cache: Bipolar, Bipolar/CMOS (BiCMOS), High-Speed CMOS (Static RAMs)
- Primary: CMOS, High-Speed CMOS (Dynamic or Static RAMs, ROMs)
- Secondary: Magnetic Media (Floppy Disk, Hard Drive, Tape), Optical Media (CD-ROM, WORM, R/W CD)
10Classifying Memory Technologies
- Ability to retain content: volatile and non-volatile
- Manner in which the device is accessed: Random Access (read and write), Read Only (read, not write)
- Semiconductor technology used: Bipolar (high speed, low density, high power consumption), CMOS (high density, low speed, low power), BiCMOS (high density, higher speed, lower power)
- Refreshing the content: static (no need to refresh), dynamic (needs to be refreshed)
11Bytes and Words
- Bit (0 or 1)
- Byte (usually 8 bits)
- Word (8-, 16-, 32-, 64-bits)
12Memory Interfacing
- Memory has to be properly interfaced to the processor so that it can be accessed efficiently.
- The basic signals needed for interfacing to the processor are:
- Address Bus
- Data Bus
- Control Signals
13Three-Bus Memory Interface
The Processor
K-bit Address Bus
- Addressable space: 2^K
- Ex.: K = 20 -> 2^20 = 1M locations (1 MB if byte-addressable)
- Data width or bandwidth: N
- N = 8 (byte)
- N = 16 (word)
- N = 32 (double word)
MAR
Memory
N-bit Bidirectional Data Bus
MDR
- Control signals
- Read/write
- Chip select
- Clock
Control Unit
- MAR: Memory Address Register
- MDR: Memory Data Register
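The relationship between address-bus width and addressable space can be checked with a short calculation. This is a minimal sketch, assuming each address selects exactly one N-bit location; the function names are illustrative and not part of the lecture.

```python
def addressable_locations(k: int) -> int:
    """Number of distinct addresses a K-bit address bus can express (2**K)."""
    return 2 ** k


def capacity_bytes(k: int, n: int) -> int:
    """Total capacity in bytes, assuming one N-bit location per address."""
    return addressable_locations(k) * n // 8


# K = 20 address lines with byte-wide data (N = 8)
print(addressable_locations(20))
print(capacity_bytes(20, 8))
```

With K = 20 and N = 8 both values equal 2^20, i.e. one megabyte of byte-addressable memory.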
14Intel Family of 80X86 Microprocessors Memory
Interface
15The Need for a Memory Hierarchy
The widening speed gap between CPU and main memory:
- Processor operations take on the order of 1 ns
- Memory access requires 10s or even 100s of ns
Memory bandwidth limits the instruction execution rate:
- Each instruction executed involves at least one memory access
- Hence, a few to 100s of MIPS is the best that can be achieved
- A fast buffer memory can help bridge the CPU-memory gap
- The fastest memories are expensive and thus not very large
- A second (third?) intermediate cache level is thus often used
16Memory Interleaving
Interleaved memory is more flexible than
wide-access memory in that it can handle multiple
independent accesses at once.
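One common way to spread addresses across banks is low-order interleaving, where consecutive addresses fall in different banks. The slide does not say which scheme is meant, so the sketch below is an assumption, and `bank_and_row` and the bank count of 4 are hypothetical choices for illustration.

```python
def bank_and_row(addr: int, num_banks: int = 4) -> tuple[int, int]:
    """Map a word address to (bank, row within bank), assuming low-order interleaving."""
    return addr % num_banks, addr // num_banks


# Consecutive addresses land in different banks, so they can be accessed concurrently.
for addr in range(8):
    print(addr, bank_and_row(addr))
```

Under this assumed mapping, independent accesses conflict only when they target the same bank at the same time.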
17Cache Memory
Main Memory
Cache Memory
The complete program
The Processor
Active data and instructions
- Many computers contain a block of high-speed cache memory; i.e., the cache is effectively a staging area for the processor.
18The Need for a Cache
One level of cache with hit rate h:
$C_{\text{eff}} = h\,C_{\text{fast}} + (1 - h)(C_{\text{slow}} + C_{\text{fast}}) = C_{\text{fast}} + (1 - h)\,C_{\text{slow}}$
Cache memories act as intermediaries between the
superfast processor and the much slower main
memory.
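The effective access time for a single cache level can be evaluated numerically. This is a minimal sketch: the function names and the sample timings (1 ns cache, 100 ns main memory) are illustrative assumptions, chosen to match the orders of magnitude quoted earlier in the lecture.

```python
def effective_cycle(h: float, c_fast: float, c_slow: float) -> float:
    """Effective access time: a hit costs c_fast, a miss costs c_slow + c_fast."""
    return h * c_fast + (1.0 - h) * (c_slow + c_fast)


def effective_cycle_simplified(h: float, c_fast: float, c_slow: float) -> float:
    """Algebraically equivalent form: c_fast + (1 - h) * c_slow."""
    return c_fast + (1.0 - h) * c_slow


# Assumed timings: 1 ns cache, 100 ns main memory
for h in (0.80, 0.95, 0.99):
    print(h, round(effective_cycle(h, 1.0, 100.0), 2))
```

Even a few percent of misses dominates the result: at a 95% hit rate the effective time is about 6 ns, six times the cache's own access time.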
19Cache Memory Design Parameters
- Cache size (in bytes or words). A larger cache can hold more of the program's useful data but is more costly and likely to be slower.
- Block or cache-line size (unit of data transfer between cache and main). With a larger cache line, more data is brought into the cache with each miss. This can improve the hit rate but may also bring low-utility data in.
- Placement policy. Determining where an incoming cache line is stored. More flexible policies imply higher hardware cost and may or may not have performance benefits (due to more complex data location).
- Replacement policy. Determining which of several existing cache blocks (into which a new cache line can be mapped) should be overwritten. Typical policies: choosing a random or the least recently used block.
- Write policy. Determining if updates to cache words are immediately forwarded to main (write-through) or modified blocks are copied back to main if and when they must be replaced (write-back or copy-back).
20What Makes a Cache Work?
Temporal locality Spatial locality
Assuming no conflict in address mapping, the
cache will hold a small program loop in its
entirety, leading to fast execution.
21Desktop, Drawer, and File Cabinet Analogy
Once the working set is in the drawer, very few
trips to the file cabinet are needed.
Items on a desktop (register) or in a drawer
(cache) are more readily accessible than those in
a file cabinet (main memory).
22Compulsory, Capacity, and Conflict Misses
- Compulsory misses: With on-demand fetching, first access to any item is a miss. Some compulsory misses can be avoided by prefetching.
- Capacity misses: We have to oust some items to make room for others. This leads to misses that are not incurred with an infinitely large cache.
- Conflict misses: Occasionally, there is free room, or space occupied by useless data, but the mapping/placement scheme forces us to displace useful items to bring in other items. This may lead to misses in the future.
Given a fixed-size cache, dictated, e.g., by cost factors or availability of space on the processor chip, compulsory and capacity misses are pretty much fixed. Conflict misses, on the other hand, are influenced by the data mapping scheme, which is under our control. We study two popular mapping schemes: direct and set-associative.
23Direct-Mapped Cache
Direct-mapped cache holding 32 words within eight
4-word lines. Each line is associated with a tag
and a valid bit.
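For the cache described here (eight lines of four words each), a word address can be split into a tag, a line index, and a word offset. The sketch below assumes word addressing and the usual modulo placement; `split_address` is a hypothetical helper, not something defined in the lecture.

```python
WORDS_PER_LINE = 4   # 4-word lines
NUM_LINES = 8        # eight lines, 32 words in total


def split_address(word_addr: int) -> tuple[int, int, int]:
    """Return (tag, line index, word offset) for a word address."""
    offset = word_addr % WORDS_PER_LINE
    line = (word_addr // WORDS_PER_LINE) % NUM_LINES
    tag = word_addr // (WORDS_PER_LINE * NUM_LINES)
    return tag, line, offset


print(split_address(5))    # tag 0, line 1, offset 1
print(split_address(37))   # tag 1, line 1, offset 1
```

Addresses 5 and 37 map to the same line with different tags, so in a direct-mapped cache each one displaces the other.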
24Set-Associative Cache
Two-way set-associative cache holding 32 words of
data within 4-word lines and 2-line sets.
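The effect of associativity on conflict misses can be seen in a small simulation. This is a sketch under stated assumptions: word addressing, modulo set selection, and least-recently-used replacement (one of the typical policies named earlier). The class and function names are illustrative only.

```python
from collections import OrderedDict


class SetAssociativeCache:
    """Tiny tag-only cache model with LRU replacement inside each set."""

    def __init__(self, num_sets: int = 4, ways: int = 2, words_per_line: int = 4):
        self.num_sets = num_sets
        self.ways = ways
        self.words_per_line = words_per_line
        # One ordered dict of tags per set; insertion order tracks recency.
        self.sets = [OrderedDict() for _ in range(num_sets)]

    def access(self, word_addr: int) -> bool:
        """Access one word; return True on a hit, False on a miss."""
        block = word_addr // self.words_per_line
        index = block % self.num_sets
        tag = block // self.num_sets
        lines = self.sets[index]
        if tag in lines:
            lines.move_to_end(tag)        # mark as most recently used
            return True
        if len(lines) >= self.ways:
            lines.popitem(last=False)     # evict the least recently used line
        lines[tag] = True
        return False


def count_hits(cache: SetAssociativeCache, trace: list[int]) -> int:
    return sum(cache.access(addr) for addr in trace)


trace = [0, 32, 0, 32]
two_way = SetAssociativeCache(num_sets=4, ways=2)      # 32 words, 2-line sets
direct = SetAssociativeCache(num_sets=8, ways=1)       # 32 words, direct-mapped
print(count_hits(two_way, trace), count_hits(direct, trace))
```

Addresses 0 and 32 compete for the same location in both organizations. The two-way cache keeps both lines and hits on the repeated accesses, while the direct-mapped configuration misses every time.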
25Cache and Main Memory
- Split cache: separate instruction and data caches (L1)
- Unified cache: holds instructions and data (L1, L2, L3)
- Harvard architecture: separate instruction and data memories
- von Neumann architecture: one memory for instructions and data
The writing problem:
- Write-through slows down the cache to allow main to catch up
- Write-back or copy-back is less problematic, but still hurts performance due to two main memory accesses in some cases
- Solution: provide write buffers for the cache so that it does not have to wait for main memory to catch up
26Virtual Memory and Paging
- Managing data transfers between main and mass memory is cumbersome
- Virtual memory automates this process
- The key to virtual memory's success is the same as for cache
27The Need for Virtual Memory
Program segments in main memory and on disk.
28Page Tables and Address Translation
The role of page table in the virtual-to-physical
address translation process.
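The translation step can be sketched as a table lookup on the virtual page number. The page size of 4096 bytes, the dictionary-based page table, and the names `translate` and `PageFault` are all assumptions made for illustration; real page tables are hardware-defined structures with additional flag bits.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


class PageFault(Exception):
    """Raised when the requested page is not resident in main memory."""


def translate(virtual_addr: int, page_table: dict[int, int]) -> int:
    """Translate a virtual address using a {virtual page: physical frame} table."""
    vpn, offset = divmod(virtual_addr, PAGE_SIZE)
    frame = page_table.get(vpn)
    if frame is None:
        raise PageFault(f"virtual page {vpn} is not in main memory")
    # The offset within the page is unchanged by translation.
    return frame * PAGE_SIZE + offset


page_table = {0: 5, 1: 2}           # hypothetical mapping
print(translate(4100, page_table))  # page 1, offset 4 -> frame 2 -> 8196
```

A missing entry models a page that currently resides on disk, which the operating system would have to bring into main memory before the access can complete.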
29Protection and Sharing in Virtual Memory
Virtual memory as a facilitator of sharing and
memory protection.
30Memory Hierarchy The Big Picture
Fig. 20.2 Data movement in a memory hierarchy.
31Summary of Memory Hierarchy
Cache memory provides illusion of very high speed
Virtual memory provides illusion of very large
size
Data movement in a memory hierarchy.
32Role of Memory The PC Process
- All of the components in your computer, such as the CPU, the hard drive and the operating system, work together as a team, and memory is one of the most essential parts of this team. From the moment you turn your computer on until the time you shut it down, your CPU is constantly using memory. Let's take a look at a typical scenario:
- You turn the computer on.
- The computer loads data from read-only memory (ROM) and performs a power-on self-test (POST) to make sure all the major components are functioning properly. As part of this test, the memory controller checks all of the memory addresses with a quick read/write operation to ensure that there are no errors in the memory chips. Read/write means that data is written to a bit and then read from that bit.
- The computer loads the basic input/output system (BIOS) from ROM. The BIOS provides the most basic information about storage devices, boot sequence, security, Plug and Play (auto device recognition) capability and a few other items.
- The computer loads the operating system (OS) from the hard drive into the system's RAM. Generally, the critical parts of the operating system are maintained in RAM as long as the computer is on. This allows the CPU to have immediate access to the operating system, which enhances the performance and functionality of the overall system.
- When you open an application, it is loaded into RAM. To conserve RAM usage, many applications load only the essential parts of the program initially and then load other pieces as needed.
- After an application is loaded, any files that are opened for use in that application are loaded into RAM.
- When you save a file and close the application, the file is written to the specified storage device, and then it and the application are purged from RAM.
- Every time something is loaded or opened, it is placed into RAM. This simply means that it has been put in the computer's temporary storage area so that the CPU can access that information more easily. The CPU requests the data it needs from RAM, processes it and writes new data back to RAM in a continuous cycle. In most computers, this shuffling of data between the CPU and RAM happens millions of times every second. When an application is closed, it and any accompanying files are usually purged (deleted) from RAM to make room for new data. If the changed files are not saved to a permanent storage device before being purged, they are lost.
33Appendix
34Primary Memory Technology Details
35Memory Structure and SRAM
Conceptual inner structure of a 2^h × g SRAM chip
and its shorthand representation.
36Multiple-Chip SRAM
Eight 128K × 8 SRAM chips forming a 256K × 32
memory unit.
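The chip count for such a memory unit follows from dividing the unit's width and depth by those of a single chip. A minimal sketch, assuming both dimensions divide evenly; `chips_needed` is a hypothetical helper name.

```python
K = 1024


def chips_needed(unit_words: int, unit_width: int,
                 chip_words: int, chip_width: int) -> tuple[int, int, int]:
    """Return (rows of chips, chips per row, total chips) for a memory unit."""
    chips_per_row = unit_width // chip_width   # chips side by side to widen the word
    rows = unit_words // chip_words            # rows of chips to deepen the address space
    return rows, chips_per_row, rows * chips_per_row


# 256K x 32 unit built from 128K x 8 chips
print(chips_needed(256 * K, 32, 128 * K, 8))   # (2, 4, 8)
```

Four chips side by side supply the 32-bit word, and two such rows double the depth from 128K to 256K, giving eight chips in total.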
37SRAM with Bidirectional Data Bus
When data input and output of an SRAM chip are
shared or connected to a bidirectional data bus,
output must be disabled during write operations.
38DRAM and Refresh Cycles
DRAM vs. SRAM Memory Cell Complexity
Single-transistor DRAM cell, which is
considerably simpler than SRAM cell, leads to
dense, high-capacity DRAM memory chips.
39DRAM Refresh Cycles and Refresh Rate
Variations in the voltage across a DRAM cell
capacitor after writing a 1 and subsequent
refresh operations.
40DRAM Packaging
24-pin dual in-line package (DIP)
Typical DRAM package housing a 16M × 4 memory.
41Nonvolatile Memory
ROM PROM EPROM
Read-only memory organization, with the fixed
contents shown on the right.
42Flash Memory
EEPROM or Flash memory organization. Each memory
cell is built of a floating-gate MOS transistor.
43Secondary Memory Concepts
- Today's main memory is huge, but still inadequate for all needs
- Magnetic disks provide extended and back-up storage
- Optical disks and disk arrays are other mass storage options
44Disk Memory Basics
Disk memory elements and key terms.
45Disk Drives
Typically 2-8 cm
Comprehensive info about disk memory:
http://www.storageview.com/guide/
46Access Time for a Disk
The three components of disk access time. Disks
that spin faster have a shorter average and
worst-case access time.
47Representative Magnetic Disks
Table 19.1 Key attributes of three
representative magnetic disks, from the highest
capacity to the smallest physical size (ca. early
2003). More detail (weight, dimensions,
recording density, etc.) in textbook.
48Organizing Data on Disk
Fig. 19.2 Magnetic recording along the tracks
and the read/write head.
Fig. 19.3 Logical numbering of sectors on
several adjacent tracks.
49Disk Performance
Seek time = $a + b(c - 1) + \beta (c - 1)^{1/2}$
Average rotational latency = $30 / \text{rpm}$ s = $30\,000 / \text{rpm}$ ms
Reducing average seek time and rotational latency
by performing disk accesses out of order.
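The average rotational latency is the time for half a revolution, which is where the 30 000 / rpm figure in milliseconds comes from. A minimal sketch; the function name and the sample spindle speeds are illustrative assumptions.

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average rotational latency in ms: half of one revolution (60 000 / rpm ms)."""
    return 30_000 / rpm


# Sample spindle speeds (assumed for illustration)
for rpm in (5400, 7200, 10_000, 15_000):
    print(rpm, round(avg_rotational_latency_ms(rpm), 2))
```

Doubling the spindle speed halves the average wait for the desired sector to arrive under the head.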
50Disk Caching
Same idea as processor cache: bridge the main-disk speed gap
- Read/write an entire track with each disk access: "Access one sector, get 100s free"; hit rate around 90%
- Disks listed in Table 19.1 have buffers from 1/8 to 16 MB
- Rotational latency eliminated; can start from any sector
- Need back-up power so as not to lose changes in disk cache (need it anyway for head retraction upon power loss)
Placement options for disk cache:
- In the disk controller: suffers from bus and controller latencies even for a cache hit
- Closer to the CPU: avoids latencies and allows for better utilization of space
- Intermediate or multilevel solutions
51Disk Arrays and RAID
RAID levels 0-6, with a simplified view of data
organization.
52Other Types of Mass Memory
Magnetic and optical disk memory units.
53Optical Disks
Spiral, rather than concentric, tracks
Simplified view of recording format and access
mechanism for data on a CD-ROM or DVD-ROM.