Mid3 Revision 2 - PowerPoint PPT Presentation

About This Presentation

Title:

Mid3 Revision 2

Description:

Mid3 Revision 2 Prof. Sin-Min Lee – PowerPoint PPT presentation

Slides: 81

Provided by: Lee149

Transcript and Presenter's Notes

Title: Mid3 Revision 2

1
Mid3 Revision 2
CS147
Lecture 21

Prof. Sin-Min Lee

2
Parallel Architectures
MIMD Machines
3
From the beginning of time, computer scientists
have been challenging computers with larger and
larger problems. Eventually, computer processors
were combined together in parallel to work on the
same task together. This is parallel processing.
Types of Parallel Processing
SISD: Single Instruction stream, Single Data stream
MISD: Multiple Instruction stream, Single Data stream
SIMD: Single Instruction stream, Multiple Data stream
MIMD: Multiple Instruction stream, Multiple Data stream
4
SISD
One piece of data is sent to one processor.
Ex: To multiply one hundred numbers by the number
three, each number would be sent to the processor and
multiplied in turn until all one hundred results were calculated.
5
MISD
One piece of data is broken up and sent to many
processors.
[Diagram: one Data source feeding four CPUs; operation: Search]
Ex: A database is broken up into sections of
records and sent to several different processors,
each of which searches its section for a specific
key.
6
SIMD
Multiple processors execute the same instruction
on separate data.
Ex: A SIMD machine with 100 processors could
multiply 100 numbers, each by the number three,
at the same time.
7
MIMD
Multiple processors execute different instructions
on separate data.
[Diagram: four CPUs, each with its own Data, performing Multiply, Search, Add, and Subtract respectively]
This is the most complex form of parallel
processing. It is used on complex simulations
like modeling the growth of cities.
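A minimal sketch of the MIMD idea, assuming Python threads stand in for processors. The four task functions and their sample data are hypothetical, chosen to mirror the Multiply, Search, Add, and Subtract labels on the slide. Python threads share one interpreter, so this shows the structure (different instruction streams on different data), not a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def multiply(numbers):
    # Instruction stream 1: multiply every number by three.
    return [3 * x for x in numbers]


def search(records, key):
    # Instruction stream 2: look for a key in a set of records.
    return key in records


def add(numbers):
    # Instruction stream 3: sum a list of numbers.
    return sum(numbers)


def subtract(a, b):
    # Instruction stream 4: subtract one number from another.
    return a - b


def run_mimd():
    # Each worker runs a different function on its own data.
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [
            pool.submit(multiply, [1, 2, 3]),
            pool.submit(search, ["ann", "bob"], "bob"),
            pool.submit(add, [4, 5, 6]),
            pool.submit(subtract, 10, 7),
        ]
        # Results are collected in submission order.
        return [f.result() for f in futures]
```

Calling run_mimd() returns the four results in the order the tasks were submitted, regardless of which worker finished first.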
8
The Granddaddy of Parallel Processing
MIMD
9
MIMD computers usually have a different program
running on every processor. This makes for a
very complex programming environment.
What's doing what, when?
Which processor? Doing which task? At what time?
10
Memory latency: the time between issuing a memory
fetch and receiving the response.
Simply put, if execution proceeds before the
memory request responds, unexpected results will
occur. What values are being used? Not the
ones requested!
11
A similar problem can occur with instruction
executions themselves.
Synchronization: the need to enforce the ordering
of instruction executions according to their data
dependencies.
Instruction b must occur before instruction a.
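One way to enforce such an ordering between two threads is an event flag. This is a sketch only: the names instruction_a, instruction_b, and b_done are hypothetical, and threading.Event is just one of several primitives that would work.

```python
import threading


def run_ordered():
    log = []
    b_done = threading.Event()  # set once instruction b has finished

    def instruction_a():
        b_done.wait()           # block until b has run
        log.append("a")

    def instruction_b():
        log.append("b")
        b_done.set()            # release anyone waiting on b

    thread_a = threading.Thread(target=instruction_a)
    thread_b = threading.Thread(target=instruction_b)
    # Start a first on purpose: it still cannot proceed before b.
    thread_a.start()
    thread_b.start()
    thread_a.join()
    thread_b.join()
    return log
```

Even though thread a is started first, the log always records b before a.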
12
Despite potential problems, MIMD can prove larger
than life.
MIMD Successes
IBM Deep Blue: computer beats professional chess
player.
Some may not consider this to be a fair example,
because Deep Blue was built to beat Kasparov
alone. It knew his play style so it could
counter his projected moves. Still, Deep Blue's
win marked a major victory for computing.
13
IBM's latest, a supercomputer that models nuclear
explosions.
IBM Poughkeepsie built the world's fastest
supercomputer for the U.S. Department of Energy.
Its job was to model nuclear explosions.
14
MIMD: it's the most complex, fastest, and most flexible
parallel paradigm. It has beaten a world-class chess
player at his own game. It models things that
few people understand. It is parallel processing
at its finest.
20
Midterm Gate Problem
[Circuit diagram; labels: input I, a D flip-flop, a T flip-flop, outputs Q¹ and Q, output Y, and a shared Clock]
21
Start
0
1
0
0
0
0
1
1
0
0
Start
I Q¹ Q Y
22
Clock Cycle 1
0
1
1
1
0
0
Note: Q outputs are dependent on the state
of inputs present on the previous cycle.
1
0
1
1
I Q¹ Q Y
23
Clock Cycle 2
0
1
1
1
0
0
Note: Q outputs are dependent on the state
of inputs present on the previous cycle.
1
0
1
1
I Q¹ Q Y
24
Clock Cycle 3
1
0
1
1
1
1
Note: Q outputs are dependent on the state
of inputs present on the previous cycle.
0
0
1
1
I Q¹ Q Y
25
Clock Cycle 4
1
0
1
1
1
1
Note: Q outputs are dependent on the state
of inputs present on the previous cycle.
0
0
1
1
I Q¹ Q Y
26
Clock Cycle 5
0
1
0
1
0
0
Note: Q outputs are dependent on the state
of inputs present on the previous cycle.
1
1
0
0
I Q¹ Q Y
27
Clock Cycle 6
0
1
1
1
1
0
Note: Q outputs are dependent on the state
of inputs present on the previous cycle.
1
0
0
0
I Q¹ Q Y
28
Clock Cycle 7
0
1
0
1
0
0
Note: Q outputs are dependent on the state
of inputs present on the previous cycle.
1
1
0
0
I Q¹ Q Y
29
Clock Cycle 8
0
1
1
1
0
0
Note: Q outputs are dependent on the state
of inputs present on the previous cycle.
1
0
1
1
I Q¹ Q Y
30
Some commonly used components

Decoders: n inputs, 2^n outputs.
The inputs are used to select which output is
turned on. At any time exactly one output is on.
Multiplexors: 2^n inputs, n selection bits, 1
output.
The selection bits determine which input will
become the output.
Adder: 2n inputs, 2n outputs.
Computer arithmetic.

31
Multiplexer

Selects binary information from one of many
input lines and directs it to a single output
line.
Also known as the selector circuit.
Selection is controlled by a particular set of
input lines whose number depends on the number of data
input lines.
For a 2^n-to-1 multiplexer, there are 2^n data
input lines and n selection lines whose bit
combination determines which input is selected.
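The selection rule can be modeled in a few lines. This is a behavioral sketch, not a gate-level design; the function name mux and the convention that the first select bit is the most significant are assumptions made for illustration.

```python
def mux(data, select):
    """Behavioral model of a 2^n-to-1 multiplexer.

    data:   list of 2^n input bits (data[0] is input line 0)
    select: list of n selection bits, most significant first (assumed)
    """
    n = len(select)
    if len(data) != 2 ** n:
        raise ValueError("need exactly 2^n data inputs for n select bits")
    index = 0
    for bit in select:
        # Build the integer encoded by the selection lines.
        index = (index << 1) | bit
    return data[index]
```

For example, mux([0, 1, 1, 0], [1, 0]) picks input line 2 and returns 1.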

32
[Block diagram: MUX with 2^n data inputs, n input-select lines, an Enable input, and one data output]
33
Remember the 2-to-4 Decoder?
[Diagram: 2-to-4 decoder with inputs S1, S0 and outputs Sel(3), Sel(2), Sel(1), Sel(0)]
Mutually exclusive (only one output asserted at any
time)
34
4-to-1 MUX
[Block diagram. Dataflow: data inputs D3..D0 (4 lines) to Dout. Control: a 2-to-4 decoder takes S1, S0 (2 lines) and produces Sel(3..0) (4 lines).]
35
4-to-1 MUX (Gate level)
Control Section
Three of these signal inputs will always be 0.
The other will depend on the data value selected
36
Multiplexer (cont.)

Until now, we have examined single-bit data
selected by a MUX. What if we want to select
m-bit data/words? Combine MUX blocks in
parallel with common select and enable signals.
Example: Construct a logic circuit that selects
between 2 sets of 4-bit inputs (see next slide
for solution).

37
Example Quad 2-to-1 MUX

Uses four 2-to-1 MUXes with common select (S) and
enable (E).
The select line chooses between the Ai's and the Bi's. The
selected four-wire digital signal is sent to the
Yi's.
The enable line turns the MUX on and off (E = 1 is on).
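A behavioral sketch of the quad 2-to-1 MUX follows. Two details are assumptions, not stated on the slide: S = 0 selects the A inputs, and a disabled MUX (E = 0) drives all four outputs to 0.

```python
def quad_mux_2to1(a, b, s, e=1):
    """Select between two 4-bit words with a common select and enable.

    a, b: lists of four bits (the Ai and Bi inputs)
    s:    common select line (assumed: 0 picks A, 1 picks B)
    e:    common enable line (1 is on)
    """
    if len(a) != 4 or len(b) != 4:
        raise ValueError("a and b must each be 4 bits wide")
    if e != 1:
        # Assumed behavior when disabled: all Yi outputs are 0.
        return [0, 0, 0, 0]
    # Four 1-bit 2-to-1 MUXes working in parallel on the same select.
    return [bi if s else ai for ai, bi in zip(a, b)]
```

The same pattern extends to m-bit words by using m single-bit MUXes side by side.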

38
Implementing Boolean functions with Multiplexers

Any Boolean function of n variables can be
implemented using a 2^(n-1)-to-1 multiplexer. A MUX
is basically a decoder with outputs ORed
together, hence this isn't surprising.
The SELECT signals generate the minterms of the
function.
The data inputs identify which minterms are to be
combined with an OR.

39
Example

F(X,Y,Z) = !X !Y Z + !X Y !Z + X Y !Z + X Y Z
= Σm(1,2,6,7)
There are n = 3 inputs, thus we need a 2^2-to-1 MUX
The first n-1 (2) inputs serve as the selection
lines

40
Efficient Method for implementing Boolean
functions

For an n-variable function (e.g., f(A,B,C,D)):
Need a 2^(n-1)-line MUX with n-1 select lines.
Enumerate the function as a truth table with
consistent ordering of variables (e.g., A,B,C,D)
Attach the most significant n-1 variables to the
n-1 select lines (e.g., A,B,C)
Examine pairs of adjacent rows (only the least
significant variable differs, e.g., D=0 and D=1).
Determine whether the function output for the
(A,B,C,0) and (A,B,C,1) combination is (0,0),
(0,1), (1,0), or (1,1).
Attach 0, D, !D, or 1 to the data input
corresponding to (A,B,C) respectively.
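The row-pairing step above can be sketched as a small function. The name mux_data_inputs and the string labels it returns are illustrative choices; the function is described by its list of minterms, with the least significant variable named by the lsv parameter.

```python
def mux_data_inputs(minterms, n, lsv="D"):
    """Return what to attach to each data input of a 2^(n-1)-to-1 MUX.

    minterms: set of minterm numbers where the function is 1
    n:        number of variables
    lsv:      name of the least significant variable
    """
    minterms = set(minterms)
    attach = {
        (0, 0): "0",
        (0, 1): lsv,
        (1, 0): "!" + lsv,
        (1, 1): "1",
    }
    inputs = []
    for row in range(2 ** (n - 1)):
        # Adjacent truth-table rows differ only in the last variable.
        f0 = int(2 * row in minterms)      # last variable = 0
        f1 = int(2 * row + 1 in minterms)  # last variable = 1
        inputs.append(attach[(f0, f1)])
    return inputs
```

For F(X,Y,Z) = Σm(1,2,6,7), mux_data_inputs({1, 2, 6, 7}, 3, lsv="Z") gives ["Z", "!Z", "0", "1"] for data inputs 0 through 3.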

41
Another Example

Consider F(A,B,C) = Σm(1,3,5,6). We can implement
this function using a 4-to-1 MUX as follows.
The index is ABC. Apply A and B to the S1 and S0
selection inputs of the MUX (A is most significant,
S1 is most significant).
Enumerate function in a truth table.

42
MUX Example (cont.)
A B C F
0 0 0 0
0 0 1 1
0 1 0 0
0 1 1 1
1 0 0 0
1 0 1 1
1 1 0 1
1 1 1 0
When A=B=0, F=C
When A=0, B=1, F=C
When A=1, B=0, F=C
When A=B=1, F=!C
43
MUX implementation of F(A,B,C) = Σm(1,3,5,6)
[Diagram: 4-to-1 MUX with select inputs S1 = A and S0 = B; data inputs C, C, C, and !C; output F]
44
1 input Decoder
[Block diagram: Decoder with input I and outputs O0, O1]
Treat I as a 1-bit integer i. The ith output will
be turned on (Oi = 1), the other one off.
45
1 input Decoder
[Gate-level diagram: input I, outputs O0 and O1]
46
2 input Decoder
[Block diagram: Decoder with inputs I0, I1 and outputs O0, O1, O2, O3]
Treat I0I1 as a 2-bit integer i. The ith output
will be turned on (Oi = 1), all the others off.
47
2 input Decoder
I1
I0
O0 = !I0 !I1
O1 = !I0 I1
O2 = I0 !I1
O3 = I0 I1
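The decoder behavior can be sketched for any number of inputs. Following the equations above, the first input (I0) is treated as the most significant bit; the function name decoder is an illustrative choice.

```python
def decoder(inputs):
    """Behavioral model of an n-input decoder with 2^n outputs.

    inputs: list of bits [I0, I1, ...], I0 most significant
            (matching O1 = !I0 I1 on the slide)
    Returns a list of 2^n output bits with exactly one bit set.
    """
    i = 0
    for bit in inputs:
        # Read the inputs as an unsigned integer i.
        i = (i << 1) | bit
    return [1 if k == i else 0 for k in range(2 ** len(inputs))]
```

For example, decoder([0, 1]) turns on O1 only, and decoder([1, 0]) turns on O2 only.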
48
3 Input Decoder
[Block diagram: Decoder with inputs I0, I1, I2 and outputs O0 through O7]
49
3-Decoder Partial Implementation
[Gate-level diagram (partial): inputs I2, I1, I0; outputs O0, O1, . . .]
50
2 Input Multiplexor
Inputs: I0 and I1
Selector: S
Output: O
If S is a 0: O = I0
If S is a 1: O = I1
[Block diagram: Mux with inputs I0, I1, selector S, and output O]
51
2-Mux Logic Design
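[Gate-level diagram: inputs I1, I0, and S; one AND gate forms I0 !S, another forms I1 S, and an OR gate combines them into O]
O = (I0 !S) + (I1 S)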
52
4 Input Multiplexor
Inputs: I0, I1, I2, I3
Selectors: S0, S1
Output: O
[Block diagram: Mux with inputs I0 through I3, selectors S0 and S1, and output O]
S0 S1 O
0 0 I0
0 1 I1
1 0 I2
1 1 I3
53
One Possible 4-Mux
[Diagram: a 2-Decoder driven by S0 and S1 gates one of the inputs I0, I1, I2, I3 through to the output O]
54
Adder

We want to build a box that can add two 32 bit
numbers.
Assume 2's complement representation
We can start by building a 1 bit adder.

55
Addition

We need to build a 1-bit adder:
it computes the binary addition of 2 bits.
We already know that the result is 2 bits.

A B O0 O1
0 0 0 0
0 1 0 1
1 0 0 1
1 1 1 0
This is addition!
A B O0 O1
56
One Implementation
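O0 = A B
O1 = (!A B) + (A !B)
[Gate-level diagram: an AND gate on A and B produces O0; two AND gates on !A, B and A, !B feed an OR gate that produces O1]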
57
Binary addition and our adder
Carry:   1  1
         01001
       + 01101
         -----
         10110

What we really want is something that can be used
to implement the binary addition algorithm.
O0 is the carry
O1 is the sum

58
What about the second column?
Carry:   1  1
         01001
       + 01101
         -----
         10110

We are adding 3 bits:
the new bit is the carry from the first column.
The output is still 2 bits, a sum and a carry

59
Truth Table for Addition
A  B  Carry In  Carry Out  Sum
0  0  0         0          0
0  0  1         0          1
0  1  0         0          1
0  1  1         1          0
1  0  0         0          1
1  0  1         1          0
1  1  0         1          0
1  1  1         1          1
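The truth table above translates directly into a 1-bit full adder, and chaining full adders column by column gives the binary addition algorithm. This is a sketch: the names full_adder and ripple_add, and the use of bit strings written most significant bit first, are illustrative choices.

```python
def full_adder(a, b, carry_in):
    """1-bit full adder; returns (carry_out, sum) as in the truth table."""
    total = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return carry_out, total


def ripple_add(x_bits, y_bits):
    """Add two equal-length bit strings, most significant bit first.

    Returns (sum_bits, final_carry).
    """
    if len(x_bits) != len(y_bits):
        raise ValueError("operands must have the same width")
    carry = 0
    out = []
    # Work from the rightmost column to the leftmost, passing the carry.
    for a, b in zip(reversed(x_bits), reversed(y_bits)):
        carry, s = full_adder(int(a), int(b), carry)
        out.append(str(s))
    return "".join(reversed(out)), carry
```

With the operands from the earlier slide, ripple_add("01001", "01101") gives ("10110", 0). A 32-bit adder would chain 32 of these stages in the same way.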
60
Swapping
Disk
Monitor
User 1
User Partition
61
Swapping
Disk
Monitor
User 1
User Partition
User 1
62
Swapping
Disk
Monitor
User 1
User Partition
User 1
User 2
63
Swapping
Disk
Monitor
User 1
User Partition
User 2
User 2
64
Swapping
Disk
Monitor
User 1
User Partition
User 2
User 2
65
Swapping
Disk
Monitor
User 1
User Partition
User 1
User 2
66
Paging Request
67
Paging
68
Paging
69
Paging
70
Page Mapping Hardware
[Diagram: a virtual address (P,D) in virtual memory is split into page number P and offset D. The page table maps P→F. The physical address (F,D) selects frame F and offset D in physical memory.]
71
Page Mapping Hardware
Virtual Address (004006): page P = 004, offset D = 006
Page Table: 4→5
Physical Address (F,D): frame F = 005, offset D = 006
Page size: 1000
Number of possible virtual pages: 1000
Number of page frames: 8
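The mapping in this example can be sketched as follows, using the slide's decimal page size of 1000. The function name translate and the dictionary used as a page table are illustrative; an unmapped page raises an error here where real hardware would trigger a page fault.

```python
PAGE_SIZE = 1000  # decimal page size, as on the slide


def translate(virtual_address, page_table):
    """Map a virtual address to a physical address.

    page_table: dict from virtual page number P to frame number F
    """
    # Split the virtual address into page number P and offset D.
    page, offset = divmod(virtual_address, PAGE_SIZE)
    if page not in page_table:
        # Stand-in for a hardware page fault.
        raise KeyError(f"page fault on virtual page {page}")
    frame = page_table[page]
    # The offset is carried over unchanged into the physical address.
    return frame * PAGE_SIZE + offset
```

With the page table {4: 5}, translate(4006, {4: 5}) gives 5006, that is, frame 005 and offset 006.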
72
Page Fault

Access a virtual page that is not mapped into any
physical page
A fault is triggered by hardware
Page fault handler (in the OS's VM subsystem):
Find if there is any free physical page available
If not, evict some resident page to disk (swapping
space)
Allocate a free physical page
Load the faulted virtual page into the prepared
physical page
Modify the page table

73
Placement Policy

Determines where in real memory a process piece
is to reside
Important in a segmentation system
With paging, or paging combined with segmentation,
placement matters little because the hardware performs
address translation for any page-frame combination

74
Replacement Policy

Which page is replaced?
Page removed should be the page least likely to
be referenced in the near future
Most policies predict the future behavior on the
basis of past behavior

75
Replacement Policy

Frame Locking
If a frame is locked, it may not be replaced
Examples: kernel of the operating system,
control structures, I/O buffers
Associate a lock bit with each frame

76
Basic Replacement Algorithms

Optimal policy
Selects for replacement that page for which the
time to the next reference is the longest
Impossible to have perfect knowledge of future
events
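Although the optimal policy cannot be used in a live system, it can be simulated offline once the full reference string is known, which makes it a useful baseline. The sketch below counts page faults; the function name and the sample reference string in the usage note are illustrative and not taken from the slides.

```python
def optimal_faults(refs, n_frames):
    """Count page faults under the optimal replacement policy."""
    refs = list(refs)
    frames = []
    faults = 0
    for i, page in enumerate(refs):
        if page in frames:
            continue
        faults += 1
        if len(frames) < n_frames:
            frames.append(page)
            continue
        future = refs[i + 1:]

        def next_use(p):
            # Distance to the next reference; never used again counts as farthest.
            return future.index(p) if p in future else len(future)

        # Evict the resident page whose next reference is farthest away.
        victim = max(frames, key=next_use)
        frames[frames.index(victim)] = page
    return faults
```

For the sample string 2 3 2 1 5 2 4 5 3 2 5 2 with 3 frames, this counts 6 faults, including the 3 needed to fill the empty frames.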

77
Basic Replacement Algorithms

Least Recently Used (LRU)
Replaces the page that has not been referenced
for the longest time
By the principle of locality, this should be the
page least likely to be referenced in the near
future
Each page could be tagged with the time of last
reference. This would require a great deal of
overhead.
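A small simulation of LRU, assuming an ordered dictionary stands in for the per-page timestamps mentioned above. The function name and the sample reference string in the usage note are illustrative and not taken from the slides.

```python
from collections import OrderedDict


def lru_faults(refs, n_frames):
    """Count page faults under least-recently-used replacement."""
    frames = OrderedDict()  # ordered from least to most recently used
    faults = 0
    for page in refs:
        if page in frames:
            # A hit makes this page the most recently used.
            frames.move_to_end(page)
            continue
        faults += 1
        if len(frames) == n_frames:
            # Evict the page that has gone unreferenced the longest.
            frames.popitem(last=False)
        frames[page] = True
    return faults
```

For the sample string 2 3 2 1 5 2 4 5 3 2 5 2 with 3 frames, this counts 7 faults, including the 3 needed to fill the empty frames.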

78
Basic Replacement Algorithms

First-in, first-out (FIFO)
Treats page frames allocated to a process as a
circular buffer
Pages are removed in round-robin style
Simplest replacement policy to implement
Page that has been in memory the longest is
replaced
These pages may be needed again very soon
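FIFO needs only a queue of resident pages, which is why it is the simplest policy to implement. The sketch below counts page faults; the function name and the sample reference string in the usage note are illustrative and not taken from the slides.

```python
from collections import deque


def fifo_faults(refs, n_frames):
    """Count page faults under first-in, first-out replacement."""
    frames = deque()  # oldest resident page sits at the left
    faults = 0
    for page in refs:
        if page in frames:
            # A hit does not change a page's position in the queue.
            continue
        faults += 1
        if len(frames) == n_frames:
            # Evict the page that has been in memory the longest.
            frames.popleft()
        frames.append(page)
    return faults
```

For the sample string 2 3 2 1 5 2 4 5 3 2 5 2 with 3 frames, this counts 9 faults, including the 3 needed to fill the empty frames. Page 2 is evicted while still in active use, which illustrates the weakness noted above.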

79
Basic Replacement Algorithms

Clock Policy
Additional bit called a use bit
When a page is first loaded in memory, the use
bit is set to 1
When the page is referenced, the use bit is set
to 1
When it is time to replace a page, the first
frame encountered with the use bit set to 0 is
replaced.
During the search for replacement, each use bit
set to 1 is changed to 0
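The clock policy can be simulated with a list of frames, a parallel list of use bits, and a pointer (the clock hand). Two details are assumptions of this sketch: empty frames start with a use bit of 0 so they are filled first, and the hand advances one position after each replacement. The function name and the sample reference string in the usage note are illustrative.

```python
def clock_faults(refs, n_frames):
    """Count page faults under the clock (use-bit) replacement policy."""
    frames = [None] * n_frames
    use = [0] * n_frames   # one use bit per frame
    hand = 0               # next frame the clock hand will examine
    faults = 0
    for page in refs:
        if page in frames:
            # A reference sets the use bit to 1.
            use[frames.index(page)] = 1
            continue
        faults += 1
        # Skip frames with use bit 1, clearing each bit as the hand passes.
        while use[hand] == 1:
            use[hand] = 0
            hand = (hand + 1) % n_frames
        # First frame found with use bit 0 is replaced.
        frames[hand] = page
        use[hand] = 1      # a newly loaded page starts with use bit 1
        hand = (hand + 1) % n_frames
    return faults
```

For the sample string 2 3 2 1 5 2 4 5 3 2 5 2 with 3 frames, this counts 8 faults, including the 3 needed to fill the empty frames.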
