Title: Digital Logic Level Chapter 3
1Digital Logic Level, Chapter 3
- Gates and Boolean Algebra
- Basic Digital Logic Circuits
- Memory
- CPU Chips and Buses
- Interfacing
12/6/00
2Gates
- Digital Circuit has two logic values
- 0 for low voltage
- 1 for high voltage
- Other voltages are not permitted
- Gates are tiny electronic devices that compute functions of these voltages
- They form the basis of computers
3Gates A Bipolar Transistor
- Transistor (the circle)
- Has three connections
- Collector
- Base
- Emitter
- When Vin is below a critical value (logic 0), the transistor turns off and Vout is close to Vcc (high, logic 1)
- Otherwise the transistor turns on and Vout is low (logic 0)
4Inverter
- When input is 0 output is 1
- When input is 1 output is 0
- Inverts input
- Has only one input
5Inverter
- Input-output behavior is given by a table called the truth table
- The symbol used in circuits to denote an inverter is given
- The bubble at the end is called the inversion bubble
- Forms a basic function in Boolean algebra and circuits
6NAND Gates
- Two transistors cascaded in series
- Two inputs V1 and V2
- If either input is low, its transistor turns off and the output is high
- Vout is 0 (low) if and only if both V1 and V2 are 1 (high)
- NAND means NOT AND
7NAND GATES
- Symbol above is for NAND
- A, B input, X output
- Notice the inversion bubble at the output
- Output is 0 iff both inputs are 1
- Output is 1 if at least one input is 0
- Will see NAND is NOT AND
8AND Gates
- Has corresponding electronic circuit
- A,B inputs, X output
- Output is 1 iff both inputs are 1
- Output is 0 iff at least one input is 0
- Notice the diagram is NAND without the inversion bubble
- NAND = NOT AND
9OR Gates
- Two inputs A, B
- One output X
- X is 1 iff at least one of A, B is 1
- X is 0 iff both A and B are 0
- Symbol on top
- Truth table on bottom
- Note: no inversion bubble
10NOR Gates
- Two inputs, one output
- If both V1 and V2 are low (0), Vout is high (1)
- Otherwise Vout is low (0)
11Nor Gates
- Symbol on top
- Truth table on bottom
- Notice the inversion bubble at the output of the OR symbol
- Get the output value by inverting the value coming out of the OR gate
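The gates described so far can be sketched in a few lines of Python. This is only an illustration: logic values are modeled as the integers 0 and 1, and the function names are chosen here, not taken from the slides.

```python
# Logic values are modeled as the integers 0 and 1 (an assumption of this sketch).
def NOT(a):
    return 1 - a

def NAND(a, b):
    return NOT(a & b)

def NOR(a, b):
    return NOT(a | b)

def AND(a, b):
    return a & b

def OR(a, b):
    return a | b

def truth_table(gate):
    """Outputs for inputs (0,0), (0,1), (1,0), (1,1), in that order."""
    return [gate(a, b) for a in (0, 1) for b in (0, 1)]

for gate in (NAND, AND, NOR, OR):
    print(gate.__name__, truth_table(gate))
```

Each printed list is the output column of the corresponding truth table, read top to bottom.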
12Notes
- NAND and NOR gates require two transistors each
- NOT requires one transistor
- Hence AND and OR require three transistors each
- So computers use NAND and NOR as their basic gates
13Physical Characteristics
- Bipolar transistors come in two types
- TTL (Transistor-Transistor Logic), used often
- ECL (Emitter-Coupled Logic), used for high speed
- MOS (Metal Oxide Semiconductor) gates come in
- PMOS, NMOS
- CMOS, used for CPUs and memory
- MOS gates are slower, but smaller and take less power, hence can be packed tightly
14Boolean Algebra
- Boolean algebra: a function has only two outcomes, 1 (true) and 0 (false)
- A Boolean function of n variables has only $2^n$ possible input combinations, so it can be written as a table with $2^n$ rows
- If we agree on the order of listing the argument combinations (such as counting in base 2), we can write the output column as a $2^n$-bit binary number
15Example Majority Function
- Three inputs A, B, C
- One output M
- Output takes the truth value of the majority of the inputs, i.e.
- M is 1 iff at least two of A, B, C are 1
- M is 0 iff at least two of A, B, C are 0
- Notice that writing large truth tables is cumbersome
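As a small illustration of describing a function by its output column, the sketch below lists the eight input combinations of the majority function in base-2 counting order and reads the outputs as one binary number. The helper names are hypothetical.

```python
from itertools import product

def majority(a, b, c):
    """1 iff at least two of the three inputs are 1."""
    return int(a + b + c >= 2)

# Rows in base-2 counting order: 000, 001, 010, ..., 111
rows = list(product((0, 1), repeat=3))
outputs = [majority(a, b, c) for a, b, c in rows]

# The output column, read top to bottom, is a 2**3 = 8 bit binary number.
encoded = "".join(str(bit) for bit in outputs)
print(encoded)
```

With this row order the whole truth table is captured by the single string `encoded`.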
16Alternative Representation
- Collect the combinations of variables that give 1 for the output
- Write the function as a SUM of these terms
- In each term, write the variable name for value 1, and a bar over the name for 0
- E.g. $M = \bar{A}BC + A\bar{B}C + AB\bar{C} + ABC$
17Rationale for New Notation
- Consider $ABC$: the product stands for AND
- Consider $AB\bar{C} + ABC$: the sum stands for OR
- So we are writing the function as a sum of products
- I.e. OR-ing AND-terms, called disjunctive normal form
- Consider $\bar{A}BC$: this is 1 iff A = 0, B = 1 and C = 1
- A function of n variables can be given as a sum of at most $2^n$ n-variable products
18Creating Circuits for Boolean Functions
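- $M = \bar{A}BC + A\bar{B}C + AB\bar{C} + ABC$
- 1, 2, 3 are NOT gates feeding the lines $\bar{A}$, $\bar{B}$, $\bar{C}$
- 4, 5, 6, 7 are AND gates corresponding to the four product terms
- 8 is an OR gate corresponding to the sum
- The lines $\bar{A}$, $\bar{B}$, $\bar{C}$ have been inserted to avoid clutter; the AND gates could be connected directly to the outputs of the NOT gates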
19Implementing Boolean Functions
- Write the truth table
- Provide inverters for complementing inputs
- Draw an AND gate for each term with a 1 in the output column
- Wire the AND gates to the appropriate inputs
- Feed the outputs of all AND gates into an OR gate
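The recipe above can be sketched in Python as follows. This is a minimal illustration, not a circuit design tool: the function names are hypothetical, and a complemented input is written with a trailing apostrophe (A') instead of an overbar.

```python
from itertools import product

def minterms(func, n):
    """Truth-table rows (input combinations) for which func outputs 1."""
    return [row for row in product((0, 1), repeat=n) if func(*row)]

def term_name(row, names):
    """Product term for one row: plain name for 1, name followed by ' for 0."""
    return "".join(n if v else n + "'" for n, v in zip(names, row))

def build_circuit(func, n):
    """Return a function that evaluates the AND-OR circuit for func."""
    rows = minterms(func, n)
    def circuit(*inputs):
        # One AND gate per row with output 1; a single OR gate combines them.
        and_outputs = [all(i == v for i, v in zip(inputs, row)) for row in rows]
        return int(any(and_outputs))
    return circuit

def majority(a, b, c):
    return int(a + b + c >= 2)

terms = [term_name(row, "ABC") for row in minterms(majority, 3)]
print(" + ".join(terms))
```

`build_circuit(majority, 3)` returns a function that behaves like the four-AND, one-OR circuit and agrees with `majority` on every input.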
20Using A Single Gate Type
- It is desirable to use only one type of gate to generate the whole circuit
- Can use NAND or NOR gates
- In order to do so, it is enough to show that
- NOT, AND, OR and NAND can be generated by NOR gates
- NOT, AND, OR and NOR can be generated by NAND gates
- We say that NAND and NOR are complete for Boolean circuits
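One way to convince yourself of completeness is to build NOT, AND and OR from a single gate type and check every input combination. The constructions below are the standard ones; the function names are chosen for this sketch.

```python
def NAND(a, b):
    return 1 - (a & b)

def NOR(a, b):
    return 1 - (a | b)

# Built from NAND gates only
def not_nand(a):
    return NAND(a, a)

def and_nand(a, b):
    return not_nand(NAND(a, b))

def or_nand(a, b):
    return NAND(not_nand(a), not_nand(b))

# Built from NOR gates only
def not_nor(a):
    return NOR(a, a)

def or_nor(a, b):
    return not_nor(NOR(a, b))

def and_nor(a, b):
    return NOR(not_nor(a), not_nor(b))
```

NOT takes one gate, while AND and OR take two or three gates of the chosen type.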
21Completeness of NAND
22Completeness of NOR
23Circuit Equivalence
- Sometimes need to minimize the number of elements on a board
- Get the minimum number of gates
- Use two-input gates instead of four-input gates
- Need to find an equivalent circuit for the given circuit
- Equivalent: having the same input-output behavior, i.e. computing the same Boolean function
- Use Boolean algebra
24Example: Using AB + AC = A(B + C)
25Some Laws of Boolean Algebra
26Consequences of De Morgan's Law
27Using De Morgan's Laws to convert sum of products to NAND
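A brute-force check of De Morgan's laws, and of the sum-of-products to NAND conversion, can be sketched as below. The four-input function AB + CD is an example chosen here for illustration; it is not taken from the slide figures.

```python
from itertools import product

def NAND(a, b):
    return 1 - (a & b)

def de_morgan_holds():
    """Check NOT(A AND B) = NOT A OR NOT B and NOT(A OR B) = NOT A AND NOT B."""
    for a, b in product((0, 1), repeat=2):
        if 1 - (a & b) != ((1 - a) | (1 - b)):
            return False
        if 1 - (a | b) != ((1 - a) & (1 - b)):
            return False
    return True

def sum_of_products(a, b, c, d):
    return (a & b) | (c & d)

def nand_only(a, b, c, d):
    # Replace both AND gates and the OR gate by NAND gates.
    return NAND(NAND(a, b), NAND(c, d))

print(de_morgan_holds())
```

The NAND-only form works because inverting both inputs of the final OR and then applying De Morgan's law turns the OR into a NAND.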
28Converting Voltages to Logical Values and Truth Tables
- Can map any two voltages A and B to 0 and 1 in more than one way, resulting in different truth tables
- 0V -> 0 and 5V -> 1 in (a) gives (b), the AND table; the opposite mapping gives the OR table
29Integrated Circuits
- Gates are packaged together to build Integrated Circuits (ICs)
- They are mounted on plastic or ceramic packages with some combination of pins for input/output, power and ground connections
- Types based on number of gates
- (SSI) Small Scale Integrated Circuit: 1 to 10
- (MSI) Medium Scale IC: 10 to 100
- (LSI) Large Scale IC: 100 to 100,000
- (VLSI) Very Large Scale IC: over 100,000
- Current state of the art: 10 million
30Integrated Circuits
- Chips have finite delay (about 1-10 nsec)
- Propagation delay
- Switching time
- Pins take space, hence gates have to be wired in commonly used ways
31Combinatorial Circuits
- Circuits with
- multiple inputs and multiple outputs
- Inputs determine value of outputs
- Example
- Combinatorial circuits implementing binary functions
- Non-combinatorial: memory, where the output depends on the inputs and the stored values
- CCs: multiplexers, decoders, comparators, Programmable Logic Arrays
32Multiplexers
- $2^n$ data inputs, n control inputs, one data output
- The data input selected by the control lines is gated to the output
- Each AND gate gets 3 control inputs and one data input, and selects its input based on the control lines
- The OR gate combines the outputs of all AND gates
33Majority Function Using a Multiplexer
- Each input is wired to 1 or 0
- If the table has 0, connect to ground; else connect to Vcc. Check that it works!
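The check can be done in Python. The sketch below models an 8-input multiplexer as a table lookup and wires its data inputs to the output column of the majority truth table; the names and the bit order (A is the most significant control bit) are assumptions of this example.

```python
def multiplexer(data, control):
    """Select one of 2**n data inputs using n control bits (first bit is most significant)."""
    index = 0
    for bit in control:
        index = (index << 1) | bit
    return data[index]

# Data inputs D0..D7: 0 means wired to ground, 1 means wired to Vcc.
MAJORITY_WIRING = [0, 0, 0, 1, 0, 1, 1, 1]

def majority_via_mux(a, b, c):
    return multiplexer(MAJORITY_WIRING, (a, b, c))

print(majority_via_mux(1, 0, 1))
```

Stepping the control bits through 000 to 111 reads the data inputs out one after another, which is the parallel-to-serial use of a multiplexer.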
34Other Uses of Multiplexers
- Parallel-to-serial conversion
- Put 8-bit data on the input lines
- Step through 000 to 111 on the control lines to select the inputs serially
- Used in serializing device inputs, such as keyboard inputs sent over telephone lines
- Inverse operation: demultiplexing routes a single serial input to one of multiple outputs depending on the value of the control lines
35Decoders
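- Takes an n-bit number as input and selects (sets to 1) exactly one of $2^n$ output lines
- Each AND gate implements one Boolean expression: $\bar{A}\bar{B}\bar{C}$, $\bar{A}\bar{B}C$, etc.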
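A 3-to-8 decoder can be sketched as eight AND gates, one per input combination. The function below is an illustrative model with names chosen here; output i is 1 exactly when the inputs A, B, C spell the number i in binary, with A as the most significant bit.

```python
def decoder(a, b, c):
    """Return the 8 output lines of a 3-to-8 decoder as a list of 0/1 values."""
    outputs = []
    for i in range(8):
        # Bits of i that this AND gate is wired to detect.
        wanted = ((i >> 2) & 1, (i >> 1) & 1, i & 1)
        outputs.append(int((a, b, c) == wanted))
    return outputs

print(decoder(1, 0, 1))
```

Exactly one output line is 1 for any input, which is why decoders are used to select one of several chips or memory words.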
36Comparators
- Two 4-bit words A and B are compared
- Output is 1 iff A = B
- Uses XOR gates, whose output is 0 iff both inputs are the same
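A 4-bit equality comparator can be sketched with one XOR gate per bit position. Combining the XOR outputs with a NOR gate is an assumption of this sketch (it is one common way to do it); the function names are hypothetical.

```python
def XOR(a, b):
    return a ^ b

def comparator(a_bits, b_bits):
    """Return 1 iff the two equal-length bit lists are identical."""
    # Each XOR output is 1 where the corresponding bits differ.
    differences = [XOR(a, b) for a, b in zip(a_bits, b_bits)]
    # NOR of the XOR outputs: 1 only when no bit position differs.
    return int(not any(differences))

print(comparator([1, 0, 1, 1], [1, 0, 1, 1]))
print(comparator([1, 0, 1, 1], [1, 0, 0, 1]))
```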
37Programmable Logic Arrays (PLA)
- Used to form Sums of products
- Select inputs by burning out fuses
- Example has 12 inputs, 6 outputs, PWR and GND
38PLA Computing Majority Function
- Can burn appropriate fuses to fabricate the majority function from a PLA. Choose
- 3 inputs, 4 AND gates and 1 OR gate
- Burn appropriate fuses
- Which one is best for majority?
- SSI with 4 gates
- 1 MSI multiplexer
- PLA more efficient
39Arithmetic Circuits
- Discussed general purpose MSI circuits
- Have MSI circuits for arithmetic operations
- Used in the ALU inside the CPU
- Basic Components
- Shifters: shift input left or right
- Adders: addition of binary numbers
40 1-Bit Left/Right Shifter
- Inputs D0, ..., D7; outputs S0, ..., S7
- When C = 1, shift right; else shift left
- Go through and convince yourself it works!
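To go through it, the shifter's behavior can be modeled in Python. This sketch assumes that position 0 of the list is the leftmost bit (D0), that the bit shifted out is lost, and that a 0 is shifted in; these conventions are assumptions, since the figure is not reproduced here.

```python
def shift(d, c):
    """Shift the bit list d by one position: right if c == 1, left otherwise."""
    if c == 1:
        return [0] + d[:-1]   # every bit moves one place to the right
    return d[1:] + [0]        # every bit moves one place to the left

data = [1, 0, 1, 1, 0, 0, 0, 1]
print(shift(data, 1))
print(shift(data, 0))
```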
41Adders: Half Adder
Problem: good for the rightmost (low-order) bit, but cannot handle a carry coming in from the position to its right
42Full Adder
43Building Adders
- To build a 16-bit adder
- Replicate the full adder 16 times
- The carry out bit is wired to the carry in of the left neighbor
- Rightmost carry in is 0
- Leftmost carry out is overflow
- The adder in the example has to wait until carries ripple from right to left: ripple carry adder
- Solution: carry select adders
- Break the 16-bit adder into three 8-bit adders, two for the upper half and one for the lower half
- Select the correct upper half depending on the carry out of the lower half
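A ripple carry adder built from replicated full adders can be sketched as below. The full adder equations are the standard ones; the function names and the use of Python integers to hold the 16-bit operands are choices made for this example.

```python
def full_adder(a, b, carry_in):
    """Add three bits; return (sum bit, carry out)."""
    total = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return total, carry_out

def ripple_carry_add(x, y, width=16):
    """Add two width-bit numbers bit by bit, rightmost bit first."""
    carry = 0            # rightmost carry in is 0
    result = 0
    for i in range(width):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result, carry  # final carry out signals overflow

print(ripple_carry_add(1234, 4321))
print(ripple_carry_add(0xFFFF, 1))
```

The loop makes the weakness visible: bit i cannot be computed until the carry from bit i - 1 is known, which is the delay a carry select adder avoids by computing the upper half for both possible carries in advance.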
44Arithmetic Logic Unit (ALU)
- Most machines have a single circuit for AND, OR and sum
- Has 4 units
- Decoder to select the operation (control inputs select 00, 01, 10, 11 in the figure); allows the result to pass to the final OR gate
- Logic units: ENA and ENB are used to control feeding A and B, INVA is used to obtain $\bar{A}$
- Full adders
46Bit Slices
- Circuits like the one in the previous figure allow building adders of any width
- Extra input INC is used to add 1
47Clocks
- To ensure order and timeliness, many digital circuits use clocks
- A clock is a circuit that emits a series of pulses with a precise pulse width and a precise interval between pulses, called the cycle time
- Pulse frequencies are in the 1 to 500 MHz range, i.e. clock cycles of 1000 nsec down to 2 nsec
48Finer Granularity than a Clock
- Divide the clock cycle into sub-cycles
- Insert a circuit with known delay to get a phase shift
- To get different time intervals, AND together combinations of the original and shifted clocks
49Memory
- Used for storing data and instructions
- Latches
- SR, Clocked SR and Clocked D Latches
- Flip-Flops
- Registers
- Memory Organization
- Memory Chips
- RAMs and ROMs
50SR-Latches
- Output is NOT uniquely determined by the current inputs
- Two stable states
- 1. R = S = 0 and Q = 0; 2. R = S = 0 and Q = 1
- S changing to 1 while Q = 0 switches from state 1 to state 2, but setting R in state 1 has no effect
- Similarly in state 2, setting S has no effect, but setting R does
51SR Latch
- Key property
- Setting S to 1 in state 2 (Q = 1) has no effect
- Setting R to 1 in state 2 (Q = 1) sets Q = 0
- Setting S to 1 in state 1 (Q = 0) sets Q = 1
- Setting R to 1 in state 1 (Q = 0) has no effect
- Circuit remembers whether S or R was last set: memory
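The latch behavior can be explored with a small simulation. The sketch assumes the usual construction from two cross-coupled NOR gates (the figure is not reproduced here) and simply re-evaluates the gates until the outputs stop changing; all names are illustrative.

```python
def NOR(a, b):
    return 1 - (a | b)

def sr_latch(s, r, q):
    """Return the new Q of a NOR-based SR latch, given inputs S, R and the old Q."""
    q_bar = 1 - q
    for _ in range(4):  # a few passes are enough for the gates to settle
        new_q = NOR(r, q_bar)
        new_q_bar = NOR(s, new_q)
        if (new_q, new_q_bar) == (q, q_bar):
            break
        q, q_bar = new_q, new_q_bar
    return q

q = 0
q = sr_latch(1, 0, q)   # set
q = sr_latch(0, 0, q)   # inputs removed: the latch remembers
print(q)
```

The case S = R = 1 is deliberately not exercised here, since the latch has no well-defined stored value for it.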
52Clocked SR Latches
- With clock 0, the latch does not change state
- When the clock ticks (i.e. is 1), the latch is enabled
- With R = S = 1 the circuit is unstable; it jumps back to a stable state
53Clocked D Latches
- Avoid the ambiguity in SR by using one input D
- When the clock ticks, the value of D is sampled and stored in the circuit
54Flip-Flops
- Purpose: sample and store the value of a line at a given moment
- State transition occurs at a clock transition (edge)
- I.e. transitions are edge triggered and not level triggered
- Feed a pulse to an AND gate
- Need a short pulse
- Can get a short pulse of about 5 nsec
55Flip-Flops
56D Flip-Flop
57Notation for Latches and Flip-Flops
- A, B: latches; C, D: flip-flops
- Have Set (force Q = 1) and Clear (force Q = 0)
- A is loaded when the clock is 1, B when the clock is 0
- C is loaded on the rising clock edge, D on the falling edge
58Registers
- Two independent flip-flops with clear and preset
59Registers
- $\bar{Q}$ and preset lines are missing; clocks are ganged together
- Inversion bubbles cancel, so it is loaded on the rising edge
- Can make an 8-bit register with this
60Non-Inverting and Inverting Buffers
- 1 input, 1 output and 1 control signal
- Purpose: switch on/off within a few nanoseconds
61Memory Organization
62Memory Chips
- Earlier design: 4 x 3 (4 words, 3 bits each)
- Can use the same data lines for input and output
- Notation: pins that are set are said to be asserted
- Some are asserted HIGH: set with a high (1) voltage, no bar above the pin name, e.g. RD
- Some are asserted LOW: set with a low (0) voltage, bar above the pin name, e.g. $\overline{RD}$
- When pins are not asserted, they are negated
63Memory Chips: Example 1
- 512K x 8: 512K words, each 8 bits wide
- Requires 19 address pins ($2^{19}$ = 512K) and 8 data pins
64Memory Chips: Example 2
- 4096K x 1 memory chip
- Internally arranged as a 2048 x 2048 matrix of 1-bit cells
- Row selected by an 11-bit row address
- RAS asserted
- Column selected by an 11-bit column address
- CAS asserted
- Chip responds by outputting the data bit
65Memory Organization Addressing
- Large chips are N x N matrices
- The row/column addressing scheme above takes two cycles (or more)
- Addressed by setting a row and then a sequence of column addresses
- Current chip families have 1-, 4-, 8- and 16-bit widths
66RAMs and ROMs
- SRAM: Static RAM, made of D flip-flops, used in L2 caches
- DRAM: Dynamic RAM, made of 2-D arrays of cells, each made of a transistor and a capacitor
- Needs to be refreshed; more complex interfaces
- Larger capacity, slower speed (10 nsec)
- Two kinds
- FPM (Fast Page Mode): organized like the last example, with row/column addressing
- EDO (Extended Data Output): second reference begins before the first is finished
- SDRAM (Synchronous DRAM): address and data lines driven by the same clock
- Usually systems have SRAM caches and DRAM main memory
67ROMs
- ROMs: Read Only Memories
- Bits etched during manufacture
- Cheaper than RAM
- PROM
- Can be programmed once by burning fuses
- EPROM (Erasable PROM)
- Can be programmed and erased in the field by exposure to UV light in a special chamber
- EEPROM: field programmed by applying voltage
- 1/64 size of EPROM
- 10-100 times slower than DRAM or SRAM
- Flash memory: block erasable, used in cameras etc. Can fail after some number of erasures. May replace disks; 100 nsec access
68RAMs and ROMs Summary
69CPU Operations
- Has pins connected to pins on memory and I/O chips via buses, to get and send
- Address
- Data
- Control
- To fetch an instruction
- Puts the memory address on the address pins
- Asserts control lines to inform memory
- Memory replies by putting bits on the data pins and asserting control lines
70Key Parameters of CPU Performance
- Number of address pins: m address pins can address up to $2^m$ locations. Common: m = 16, 20, 32, 64
- Number of data pins: n data pins can read n data bits in a single operation. Common: n = 8, 16, 32, 36, 64
- Example: a chip with 8 data pins takes 4 cycles to read a 32-bit word
- Other pins
- Bus control: inform of reads, writes etc.
- Interrupts: from DMA to CPU to accept data
- Bus arbitration: negotiating bus traffic
- Coprocessor signals: floating-point, video co-processor
- Status: overflow, unusual conditions
- Miscellaneous: resetting, compatibility
71CPU Pins: An Example
72Buses
- A bunch of parallel wires carrying signals between devices/units
- Drawn as FAT arrows
- When all lines carry the same type of data, drawn as one line with a short diagonal line across it (with the bit count)
- For devices made by different people to talk to each other, they must have (or agree on)
- The same protocol
- Power
- Timing
- Card cage
73Example Buses
- PCI Buses (in PCs)
- SCSI Buses (connects IO devices)
- SBUS (In Sun Machines)
- RS-485 (in control devices)
74Buses Function
- Need a master-slave relationship for data transfer
- Bus driver: amplifies signals to avoid fading
- Bus receiver: chips connecting slaves
- Bus transceiver: chips connecting devices that can be both master and slave
- Need decoder chips to match pin assemblies
- Design issues: width, arbitration, clocking, operation
75Bus Width
- Wider Buses
- Transfer larger addresses, can have more memory
- Cost more, Take more space
- Usually results in adding more lines later
76Solutions
- Decrease cycle time
- Results in bus skew: speed differences between different lines mess up the protocol
- Backward incompatibility
- Multiplexing Buses
- Address and data lines share same pins
- Results in complex protocol
- Slows down transfer speed
77Bus Clocking
- Synchronous Buses
- A clock line driven by a single oscillator, 5-100 MHz
- All activities take a whole number of these cycles
- Asynchronous Buses
- No master clock
- Cycle time not the same between all devices
78Synchronous Bus Clocking Example
79Example Continued
- Memory read time: 40 nsec
- Signal change time: 1 nsec
- Cycle time: 25 nsec
80Example Explanation
- CPU puts the address on the address lines
- After it settles down (6 nsec), MREQ and RD are asserted
- Memory does not have enough time (40 nsec) to read during the second cycle (25 nsec), so it asserts WAIT, resulting in a wasted cycle
- During T3 memory negates WAIT; CPU strobes and latches the data
- Tad <= 11 nsec, guaranteed by the CPU manufacturer
- Tds >= 5 nsec, bus requirement: data needs to settle down before strobing
- Worst case: time available to the memory is 62.5 - 11 - 5 = 46.5 nsec
81Example Notes
- A 50 nsec memory would have needed one more cycle, inserting one more WAIT
- Specification requires Tml >= 6 nsec (i.e. MREQ is asserted at least 6 nsec after the address is stable); a memory chip needing a 10 nsec setup time is not fast enough
- Tm <= 8 nsec means MREQ needs to be asserted within 8 nsec of the falling edge of T1
- Trl <= 8 nsec means the same for RD, so the memory chip needs to get its data out within 2 cycles - Tm - Tds = 25 + 25 - 8 - 5 = 37 nsec of MREQ and RD being asserted, in addition to the 40 nsec read time
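The timing arithmetic of this example can be reproduced with a short script. The sketch assumes that the data is strobed on the falling edge halfway through the last cycle, which is consistent with the 62.5 nsec figure (2.5 cycles) when one wait cycle is inserted; the function names are hypothetical.

```python
CYCLE = 25.0   # bus cycle time in nsec
T_AD = 11.0    # worst-case delay before the address is on the bus
T_DS = 5.0     # data must be stable this long before it is strobed
T_M = 8.0      # worst-case delay before MREQ is asserted

def time_available(wait_states):
    """Time the memory has between a stable address and stable data, in nsec."""
    # Assumption: data is strobed on the falling edge in the middle of the last cycle.
    strobe_time = (1.5 + wait_states) * CYCLE
    return strobe_time - T_AD - T_DS

def wait_states_needed(read_time):
    """Smallest number of wait cycles that gives the memory enough time."""
    n = 0
    while time_available(n) < read_time:
        n += 1
    return n

print(time_available(1))         # one wait cycle, as in the example
print(wait_states_needed(40))
print(wait_states_needed(50))
print(2 * CYCLE - T_M - T_DS)    # time left after MREQ and RD are asserted
```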
82Further Assertions
- Tmh and Trh say how long it takes MREQ and RD to be negated after the data is strobed
- Tdh gives the time memory must hold the data after it has been read
- Disadvantage of synchronous bus
- Fractions of cycles wasted
- Cannot take advantage of new, faster memories once the bus cycle has been decided
- Bus works at the speed of the slowest device on it
83Asynchronous Bus
84Asynchronous Bus Functionality: Full Handshake
- Master does its work (asserts address)
- Asserts MREQ, RD
- Asserts special signal MSYN (Master Sync)
- When the slave sees it, it does its job
- Slave asserts SSYN (Slave Sync)
- When the master sees SSYN, it reads the data and negates MREQ and RD
- Faster than a synchronous bus
- Difficult to build
85Bus Arbitration
- Several devices may want to be the bus master at the same time
- There must be a procedure to determine who gets it. It is built into the bus arbiter
- The request line is asserted by all who want the bus
- Can be centralized or decentralized
- Arbiters can be separate chips or reside in the CPU
- Centralized arbiters can be prioritized
- Within one priority level, grants are daisy chained
86Centralized Bus Arbiters
87Decentralized Bus Arbitration
- 16 prioritized request lines
- Each device in need sends a request; everyone sees all requests
- Highest priority requester wins
- Pros and cons
- Does not need an arbiter
- Needs more lines
- Same type of algorithms used in networks
88Another Decentralized Bus Arbitration
- 3 lines
- Wired-OR bus request line
- BUSY, asserted by the current bus master
- Arbitration line
- To acquire the bus
- If the bus is idle, check the IN signal
- If IN is negated, the device loses and negates OUT
- If IN is asserted, the device also negates OUT
- This causes all downstream devices to have IN negated, hence they negate OUT
- Only one device has IN asserted and OUT negated. It is the master
- Used in networks!
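The arbitration chain can be simulated in a few lines. The sketch assumes that the first device in the chain sees IN asserted when the bus is idle and that a device which does not want the bus passes its IN value through to OUT; the function and parameter names are hypothetical.

```python
def arbitrate(wants_bus):
    """Return the index of the device that becomes master, or None.

    wants_bus[i] is True if device i wants the bus; device 0 is first in the chain.
    """
    incoming = True   # assumption: the head of the chain sees IN asserted
    master = None
    for i, wants in enumerate(wants_bus):
        if incoming and wants:
            master = i        # IN asserted, OUT negated: this device wins
            outgoing = False
        else:
            outgoing = incoming  # pass IN through (it stays negated once negated)
        incoming = outgoing
    return master

print(arbitrate([False, True, True]))
```

Because the first requesting device negates OUT, every device after it sees IN negated, so position in the chain acts as the priority.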
89Example
90Bus Operation Block Transfers
91Block Transfers
- Cheaper to transfer blocks of memory at once
- When a block read starts, the master tells the slave how many words to transfer by asserting BLOCK and putting the count on the data lines
- Slave outputs one word in each cycle until the count is exhausted
92Read-Modify-Write Cycle
- In multiprocessor systems, used to avoid two CPUs modifying the same data item simultaneously
- Have a special read-modify-write bus cycle to read from memory, inspect and write back without releasing the bus
93Interrupt Handling
- More than one device may want to interrupt
- Need to prioritize and give the bus to the highest priority interrupt
- A device controller asserts its own interrupt line to the interrupt controller
- The interrupt controller asserts the interrupt pin on the CPU
- When the CPU is able, it sends a pulse back to the interrupt controller
- The interrupt controller gives the interrupt vector to the CPU through a special bus cycle
- CPU calls the appropriate interrupt handler
94Interrupt Handling
95Interfacing
- Universal Asynchronous Receiver Transmitter (UART)
- Data bus to serial interface, 1 bit at a time
- Speeds: 50 to 19,200 bps
- Character width: 5 to 8 bits
- 1, 1.5 or 2 stop bits
- Even or odd parity bits
- Under program control
- Universal Synchronous Asynchronous Receiver Transmitter (USART)
- Handles synchronous transfers under program control
96Parallel I/O Chips
- 24 I/O lines, can be used in any way
- One way to operate
- CPU writes registers, and the values appear on the output lines until the register value changes
97Address Decoding Example
- 16-bit machine has
- CPU, PIO (needs 3 bytes + 1 byte for control register)
- 2K x 8 EPROM, 2K x 8 RAM
- Uses memory-mapped I/O
98Address Decoding Example
- EPROM selected by 16-bit addresses 00000xxxxxxxxxxx
- Can be wired to a 5 bit comparator
- Or use 5 OR gates use 8 NOR gates
- RAM selected by 16-bit addresses 10000xxxxxxxxxxx
- Need inverter
- PIO selected by addresses 11111111111111xx
- Use two 8-input NAND gates to feed an OR gate
99Same Example
100Partial Address Decoding
- Only EPROM addresses have a 0 in the highest-order bit
- RAM addresses are 10xx... addresses
- PIO addresses are 11xx... addresses
- Hence only need to decode the first 2 bits
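The difference between full and partial decoding can be sketched as two chip-select functions over 16-bit addresses. The bit patterns come from the example; the function names and the returned strings are illustrative.

```python
def select_full(address):
    """Full decoding: compare all the fixed high-order bits."""
    if address >> 11 == 0b00000:
        return "EPROM"   # 00000xxxxxxxxxxx
    if address >> 11 == 0b10000:
        return "RAM"     # 10000xxxxxxxxxxx
    if address >> 2 == 0b11111111111111:
        return "PIO"     # 11111111111111xx
    return None          # address not used by any chip

def select_partial(address):
    """Partial decoding: look at the top two address bits only."""
    top = address >> 14
    if top in (0b00, 0b01):
        return "EPROM"   # highest-order bit is 0
    if top == 0b10:
        return "RAM"
    return "PIO"

print(select_full(0x8000), select_partial(0x8000))
print(select_full(0x4000), select_partial(0x4000))
```

Partial decoding needs far fewer gates, at the price that every chip now responds to many addresses, so the unused address space can no longer be given to other devices.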
101Examples
- Example CPU Chips
- PicoJava II
- Pentium II
- Buses
- PCI Bus
102PicoJava II
- Designed by Sun, licensed to other companies
- Has the JVM instruction set as its native instruction set
- CPU is Sun MicroJava 701
103PicoJava II -Continued
- Single CPU with two bus interfaces
- For 32 bit interface
- For 64 bit interface
- Optional split level cache
- 16 KB for instructions
- 16 KB for data
- No L2 cache
- Includes a flash PROM to store programs for embedded applications
104PicoJava II
- Has 16 programmable I/O lines to interface to buttons etc.
- 316 pins
- 59 connected to the PCI bus
- 123 for the memory bus (64 bidirectional data pins)
- Others
- Control 7
- Timers 3
- Interrupts 11
- Testing 10
- Programmable I/O 16
- Others for Power, ground etc.
105Pentium II
- 32 bit machine, Can address 64GB memory
- Can transfer data in 64 bit units
- Has two levels of cache
- 16 KB for instructions and 16 KB for data, 32-byte cache lines
- 512 KB unified L2 cache
106Pentium II - Continued
- Two synchronous external buses
- Memory bus to access DRAM
- PCI bus to access I/O devices
- Can have one or two CPU's sharing common memory
- Snooping provided for cache coherence
- Has 242 connectors, single edge cartridge
- Dissipated power 30-50 watts
- Has a heat sink
- To lower power consumption has sleep and deep
sleep states
107Pentium II - Continued
- 242 pins (# = asserted low)
- 170 signals
- 27 power lines
- 35 grounds
- 10 spares for future use
- BPRI: high priority bus request
- LOCK: CPU locks bus
108Pentium II - Continued
- A: 33 address lines; addresses are 36 bits, but the low 3 bits are always 0, hence have no pins
- Transfers are 8 bytes, aligned on 8-byte boundaries
- Max addressable space $2^{36}$ bytes = 64 GB
- ADS: asserted when an address is put onto the bus
- REQ: type of request (read block, write word, etc.)
- Parity (3): 2 protect A, one protects ADS and REQ
109Pentium II - Continued
- 5 error lines, used by the slave to report parity and other errors
- Snoop: used if another CPU is using the wanted word
- RS: status code for the slave to report back to the master
- Data
- D: data pins
- DRDY: data ready signal
- DBSY: bus busy
- RESET: reset CPU in case of calamity
110Pentium II - Pipelining
- CPU much faster than DRAM, hence pipeline to avoid starving the CPU. Six stages of a memory request
- Bus arbitration phase
- Determines who becomes master next
- Request phase
- Address passed, request made
- Error reporting phase
- Slave reports (parity) errors
- Snoop phase
- One CPU snoops on other CPU
- Response phase
- Master learns if request will be honored
- Data phase
- Data transferred
111Pentium II - Pipelining
- T3 introduces longer data phase (block transfer)
- T4 sees DBSY and waits
- T5 response phase takes multiple cycles
- Delays T6, T7 A delay bubble remains for a while!
112PCI Bus
- Synchronous
- Original PCI bus (now)
- 33 MHz, 30 nsec cycle time (66 MHz now)
- 32-bit transfers (64-bit transfers now)
- 133 MB/sec (528 MB/sec now)
- Not good enough for a memory bus
- Not compatible with old ISA cards
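The bandwidth figures follow from clock rate times transfer width. The sketch below redoes the arithmetic; it assumes one transfer per clock cycle and takes the original clock as 33.33 MHz, since a round 33 MHz would give 132 rather than 133 MB/sec.

```python
def peak_bandwidth_mb_per_sec(clock_hz, width_bits):
    """Peak transfer rate in MB/sec, assuming one transfer per clock cycle."""
    bytes_per_transfer = width_bits / 8
    return clock_hz * bytes_per_transfer / 1e6

original = peak_bandwidth_mb_per_sec(33.33e6, 32)
newer = peak_bandwidth_mb_per_sec(66e6, 64)
print(round(original), round(newer))
```

Doubling both the clock rate and the width multiplies the peak bandwidth by four, which is the step from about 133 to 528 MB/sec.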
113Multi Bus Computers
114PCI Bus Arbitration
- Centralized Bus arbitration
- Each device has dedicated lines REQ and GNT to
arbiter - Device asserts REQ, waits for arbiter to assert
GNT
115PCI Signals
- If a master is making a very long transfer, the arbiter can negate GNT; the master MUST then give up the bus
116PCI Signals
- Some signals are mandatory, others optional
- CLK: drives the bus
- Master asserts
- AD (32): address and data
- Cycle 1: address asserted; cycle 3: data asserted
- CBE: in cycle 1 gives the operation (i.e. read, block transfer etc.)
- In cycle 2, a bit map of valid bytes
- Slave asserts
- DEVSEL
- Notifies that the slave has detected its address; if not notified, the master times out and decides the slave is dead
- TRDY: data ready on AD lines
- STOP: slave says disaster, abort current transaction
- PERR: data parity error
- SERR: address or system error
117PCI Bus Transactions
118PCI Bus Timing Diagram
- Read transaction, idle cycle, write transaction
- T1: master puts the address on AD and the command on CBE, asserts FRAME
- T2: master floats AD, negates CBE
- T3: slave asserts DEVSEL, puts data on AD and asserts TRDY
- T4: idle
- T5: same master initiates a write