Title: Design Technology
1 Design Technology
- By
- Hassan Al manasrah
- Tamir Al zubi
2 Outline
- Introduction
- Automation: synthesis
- Verification: hardware/software co-simulation
- Reuse: intellectual property cores
- Design process models
3 Introduction
System Design Goals
4 Introduction
- What does "design" mean?
- The task of defining system functionality and converting that functionality into a physical implementation
- Convert functionality to a physical implementation while
- Satisfying constrained metrics
- Optimizing other design metrics
- Designing embedded systems is hard because of
- Complex functionality
- Millions of possible environment scenarios, e.g., an elevator controller facing many possible combinations of buttons being pressed
- Many competing, tightly constrained metrics
- Productivity gap
- As low as 10 lines of code or 100 transistors produced per day
5 Improving productivity
- Design technologies were developed to improve productivity; we focus on technologies advancing the hardware/software view
- Automation: synthesis
- A computer program replaces manual design
- Makes hardware design look more like software design
- Reuse
- The process of using predesigned components
- Called cores in the hardware domain
- Verification
- The task of ensuring correctness/completeness of each design step
- E.g., hardware/software co-simulation
6 Automation: synthesis
- The parallel evolution of compilation and synthesis
- Synthesis levels
- Logic synthesis
- Two-level logic minimization
- Multi-level logic minimization
- FSM synthesis
- Technology mapping
- Register-transfer synthesis
- Behavioral synthesis
- System synthesis and hardware/software co-design
7 The parallel evolution of compilation and synthesis
- Early design was mostly hardware; software was fairly simple
- Software complexity increased with the advent of the general-purpose processor
- Different techniques for software design and hardware design
- Caused a division of the two fields
- The hardware and software design fields are now rejoining
- Both can start from a behavioral description in the sequential program model
8 Cont.
- Software design evolution
- Machine instructions
- A collection of machine instructions (0s and 1s) is called a program
- Assemblers
- Convert assembly programs into machine instructions; writing huge numbers of 0s and 1s by hand is too hard
- Compilers
- Translate sequential programs into assembly
- Hardware design evolution
- Interconnected logic gates
- Logic synthesis
- Converts logic equations or FSMs into gates
- Register-transfer (RT) synthesis
- Converts FSMDs into FSMs, logic equations, and predesigned RT components (registers, adders, etc.)
- Behavioral synthesis
- Converts sequential programs into FSMDs
- Hardware design involves many more dimensions: a compiler need only generate assembly instructions to implement the behavior, while the hardware designer is also concerned about size, power, performance, and other metrics
9 Synthesis Levels
- Gajski's Y-chart
- Each axis represents a type of description
- Behavioral
- Defines outputs as a function of inputs
- Structural
- Implements behavior by connecting components with known behavior
- Physical
- Gives sizes/locations of components and wires on the chip/board
- Synthesis converts behavior at a given level to structure at the same level or lower
- E.g.,
- FSM -> gates, flip-flops (same level)
- FSM -> transistors (lower level)
- FSM -/-> registers, FUs (higher level: not synthesis)
- FSM -/-> processors, memories (higher level: not synthesis)
(Figure: Gajski's Y-chart example, the behavior "Addition" implemented as a carry-ripple adder structure)
10 Logic Synthesis
- Converting logic-level behavior to a structural implementation
- I.e., converting logic equations and/or FSMs into connected gates
- Combinational logic synthesis
- Two-level minimization
- Multilevel minimization
- FSM synthesis
- State minimization
- State encoding
11 Two-level minimization
- Represent the logic function as a sum of products (or a product of sums)
- AND gate for each product
- OR gate for each sum
- Two-level gives the best possible performance: at most a 2-gate delay
- Goal: minimize size
- Minimum cover
- Minimum cover that is prime
12 Minimum Cover
- Minimum number of AND gates (sum of products)
- Literal: a variable or its complement
- a or a', b or b', etc.
- Minterm: a product of literals
- Each variable's literal appears exactly once
- a'b'cd, ab'c'd, abcd, etc.
- Implicant: a product of literals
- Each variable's literal appears no more than once
- a'b'cd, a'cd, etc.
- Covers 1 or more minterms
- a'cd covers a'bcd and a'b'cd
- Cover: a set of implicants that covers all minterms of the function
- Minimum cover: a cover with the minimum number of implicants
13 Cont.
- Minimum cover: K-map approach
- Karnaugh map (K-map)
- A 1 represents a minterm
- A circle represents an implicant
- Minimum cover
- Cover all 1s with the minimum number of circles
- Example: direct vs. minimum cover
- Fewer gates
- 4 vs. 5
- Fewer transistors
- 28 vs. 40
(Figures: K-map of the direct sum of products and K-map of the minimum cover)
Minimum cover: F = abc'd' + a'cd + ab'cd
Minimum cover implementation: 2 4-input AND gates, 1 3-input AND gate, and 1 3-input OR gate -> 28 transistors (checked in the sketch below)
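The slide's example can be checked mechanically. Below is a minimal Python sketch (not from the slides) that verifies the minimum cover F = abc'd' + a'cd + ab'cd against the four minterms it implies (abc'd', a'bcd, a'b'cd, ab'cd; the original K-map is not reproduced here, so this minterm set is an assumption), and reproduces the 40 vs. 28 transistor counts at 2 transistors per gate input.

from itertools import product

# Assumed original function: the four minterms implied by the slide's cover
# F_direct = abc'd' + a'bcd + a'b'cd + ab'cd
def f_direct(a, b, c, d):
    return (a and b and not c and not d) or \
           (not a and b and c and d) or \
           (not a and not b and c and d) or \
           (a and not b and c and d)

# Minimum cover from the slide: F = abc'd' + a'cd + ab'cd
def f_min_cover(a, b, c, d):
    return (a and b and not c and not d) or \
           (not a and c and d) or \
           (a and not b and c and d)

# Exhaustive check over all 2^4 input combinations
assert all(f_direct(*v) == f_min_cover(*v)
           for v in product([False, True], repeat=4))

# Size comparison at 2 transistors per gate input
direct_inputs = 4 * 4 + 4        # four 4-input ANDs + one 4-input OR
cover_inputs = 4 + 4 + 3 + 3     # two 4-input ANDs, one 3-input AND, one 3-input OR
print(2 * direct_inputs, 2 * cover_inputs)   # 40 28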
14 Minimum cover that is prime
- Minimize the number of inputs to AND gates
- Prime implicant
- An implicant not covered by any other implicant
- A max-sized circle in the K-map
- Minimum cover that is prime
- A cover using the minimum number of prime implicants
- Minimum number of max-sized circles
- Example: prime cover vs. min cover
- Same number of gates
- 4 vs. 4
- Fewer transistors
- 26 vs. 28
15 Minimum cover heuristics
- K-maps give the optimal solution every time
- But functions with more than 6 inputs become too complicated for a human
- Use a computer-based tabular method instead
- Finds all prime implicants
- Finds the minimum cover that is prime
- Also gives the optimal solution every time
- Problem: 2^n minterms for n inputs
- 32 inputs means about 4 billion minterms
- Exponential complexity
- Heuristic
- A solution technique where the optimal solution is not guaranteed
- Hopefully it comes close
16 Heuristics: iterative improvement
- Start with an initial solution
- I.e., the original logic equation
- Repeatedly make modifications toward a better solution
- Common modifications
- Expand
- Replace each nonprime implicant with a prime implicant covering it
- Delete all implicants covered by the new prime implicant
- Reduce
- Opposite of expand
- Reshape
- Expands one implicant while reducing another
- Maintains the total number of implicants
- Irredundant
- Selects the minimum number of existing implicants that still cover the function
- Synthesis tools differ in the modifications used and the order they are applied (a sketch of the expand step follows this slide)
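As an illustration of the expand step, here is a minimal Python sketch under assumed representations: an implicant is a dict from variable name to 0/1, missing variables are don't-cares, and the function is given by an explicit ON-set (the four minterms assumed earlier). It greedily drops literals as long as the enlarged implicant never covers a 0; real synthesis tools use far more sophisticated cube algebra.

from itertools import product

VARS = ["a", "b", "c", "d"]

# The function's ON-set: the four minterms assumed in the earlier example
ON_SET = [
    {"a": 1, "b": 1, "c": 0, "d": 0},
    {"a": 0, "b": 1, "c": 1, "d": 1},
    {"a": 0, "b": 0, "c": 1, "d": 1},
    {"a": 1, "b": 0, "c": 1, "d": 1},
]

def covers(implicant, assignment):
    # True if the implicant evaluates to 1 under the given assignment
    return all(assignment[v] == val for v, val in implicant.items())

def f(assignment):
    # The function is 1 exactly on the ON-set minterms
    return any(covers(m, assignment) for m in ON_SET)

def expand(implicant):
    # Greedily drop literals while the enlarged implicant still covers only 1s
    result = dict(implicant)
    for var in list(result):
        trial = {v: val for v, val in result.items() if v != var}
        all_ones = all(f(dict(zip(VARS, bits)))
                       for bits in product([0, 1], repeat=len(VARS))
                       if covers(trial, dict(zip(VARS, bits))))
        if all_ones:
            result = trial   # the literal was redundant, keep the expansion
    return result

# Expanding the minterm a'bcd yields the larger implicant a'cd
print(expand({"a": 0, "b": 1, "c": 1, "d": 1}))   # {'a': 0, 'c': 1, 'd': 1}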
17 Multilevel logic minimization
- Trade performance for size
- Increase delay to lower the number of gates
- The gray area represents all possible solutions
- The circle with an X represents the ideal solution
- Generally not achievable
- 2-level gives the best performance
- Max delay of 2 gates
- Solve for the smallest size
- Multilevel gives a Pareto-optimal solution
- Minimum delay for a given size
- Minimum size for a given delay
(Figure: delay vs. size plot showing the 2-level and multi-level minimization solution points)
18 Example
- Minimized 2-level logic function
- F = adef + bdef + cdef + gh
- Requires 5 gates with 18 total gate inputs
- 4 ANDs and 1 OR
- After algebraic manipulation
- F = (a + b + c)def + gh
- Requires only 4 gates with 11 total gate inputs
- 2 ANDs and 2 ORs
- Fewer inputs per gate
- Assume each gate input costs 2 transistors
- Reduced by 14 transistors
- 36 (18 x 2) down to 22 (11 x 2)
- Sacrifices performance for size
- Inputs a, b, and c now see a 3-gate delay
- An iterative improvement heuristic is commonly used (the two forms are compared in the sketch below)
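A quick way to gain confidence in such a manipulation is exhaustive comparison, which is feasible here because the function has only 8 inputs. The following Python sketch (illustrative, not from the slides) checks that the factored form equals the 2-level form and recomputes the transistor counts.

from itertools import product

def f_two_level(a, b, c, d, e, f, g, h):
    # F = adef + bdef + cdef + gh
    return (a and d and e and f) or (b and d and e and f) or \
           (c and d and e and f) or (g and h)

def f_multilevel(a, b, c, d, e, f, g, h):
    # F = (a + b + c)def + gh
    return ((a or b or c) and d and e and f) or (g and h)

# Exhaustive equivalence check over all 2^8 input combinations
assert all(f_two_level(*v) == f_multilevel(*v)
           for v in product([False, True], repeat=8))

# Gate-input counts from the slide, at 2 transistors per gate input
two_level_inputs = 4 + 4 + 4 + 2 + 4   # three 4-input ANDs, a 2-input AND, a 4-input OR
multilevel_inputs = 3 + 4 + 2 + 2      # 3-input OR, 4-input AND, 2-input AND, 2-input OR
print(2 * two_level_inputs, 2 * multilevel_inputs)   # 36 22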
19 FSM synthesis
- Converting an FSM to gates
- State minimization
- Reduce the number of states
- Identify and merge equivalent states
- Equivalent: same outputs and next states for all possible inputs
- A tabular method gives the exact solution
- Table of all possible state pairs
- If n states, n^2 table entries
- Heuristics are used when the number of states is large
- State encoding
- Assign a unique bit sequence to each state
- If n states, log2(n) bits suffice for n unique encodings
- n! possible encodings
- Thus, heuristics are common
- Fewer states means a smaller state register and fewer gates (a small state-minimization sketch follows this slide)
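State minimization by partition refinement can be sketched in a few lines. The FSM below is a made-up 4-state Moore machine (output depends only on the state), not one from the slides; the sketch groups states by output and then repeatedly splits groups whose members lead to different groups.

# Made-up 4-state Moore machine: output depends only on the state
states = ["S0", "S1", "S2", "S3"]
inputs = [0, 1]
output = {"S0": 0, "S1": 1, "S2": 0, "S3": 1}
next_state = {("S0", 0): "S1", ("S0", 1): "S2",
              ("S1", 0): "S0", ("S1", 1): "S3",
              ("S2", 0): "S3", ("S2", 1): "S0",
              ("S3", 0): "S2", ("S3", 1): "S1"}

def block_of(partition, s):
    # Index of the block that currently contains state s
    return next(i for i, blk in enumerate(partition) if s in blk)

# Start with states grouped by output
partition = [{s for s in states if output[s] == v} for v in sorted(set(output.values()))]

# Split any block whose members lead to different blocks on some input
changed = True
while changed:
    changed = False
    refined = []
    for blk in partition:
        groups = {}
        for s in blk:
            signature = tuple(block_of(partition, next_state[(s, i)]) for i in inputs)
            groups.setdefault(signature, set()).add(s)
        refined.extend(groups.values())
        changed = changed or len(groups) > 1
    partition = refined

print(partition)   # S0/S2 and S1/S3 are equivalent: 4 states reduce to 2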
20 Technology mapping
- Mapping to a library of gates available for implementation
- Simple library
- Only 2-input AND and OR gates
- Complex library
- Various-input AND, OR, NAND, NOR, etc. gates
- Efficiently implemented meta-gates (e.g., AND-OR-INVERT, MUX)
- The final structure consists only of the specified library's components
- If technology mapping is integrated with logic synthesis
- More efficient circuit
- But a more complex problem
21 Register-transfer synthesis
- Converts FSMD to custom single-purpose processor
- Datapath
- Register units to store variables
- Complex data types
- Functional units
- Arithmetic operations
- Connection units
- Buses, MUXs
- FSM controller
- Controls datapath
- Key subproblems
- Allocation
- Instantiate storage, functional, connection units
- Binding
- Mapping FSMD operations to specific units
22 Behavioral synthesis
- Also called high-level synthesis
- Converts a single sequential program to a single-purpose processor
- Unlike RT synthesis, it does not require the designer to first schedule the program into the states of an FSMD
- The behavioral synthesis tool uses advanced techniques to carry out scheduling and allocation
- Key subproblems
- Allocation
- Binding
- Scheduling
- Assign the sequential program's operations to states (a small scheduling sketch follows this slide)
- Optimizations important
- Compiler optimizations
- Constant propagation, dead-code elimination, loop unrolling
- Advanced techniques for allocation, binding, scheduling
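Scheduling can be illustrated with an ASAP (as-soon-as-possible) scheduler: each operation is assigned to the earliest state in which all of its operands are ready. The dataflow graph below is a made-up example; real behavioral synthesis must also respect resource constraints.

# Dependence graph of a made-up dataflow example: op -> ops it depends on
deps = {
    "t1 = a * b": [],
    "t2 = c * d": [],
    "t3 = t1 + t2": ["t1 = a * b", "t2 = c * d"],
    "t4 = t3 + e": ["t3 = t1 + t2"],
}

def asap_schedule(deps):
    state = {}
    remaining = set(deps)
    while remaining:
        for op in sorted(remaining):
            if all(d in state for d in deps[op]):
                # Earliest state: one after the latest predecessor
                state[op] = 1 + max((state[d] for d in deps[op]), default=0)
                remaining.discard(op)
                break
    return state

for op, s in sorted(asap_schedule(deps).items(), key=lambda x: x[1]):
    print("state", s, ":", op)
# t1 and t2 land in state 1, t3 in state 2, t4 in state 3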
23 System synthesis
A collection of processors
- Embedded systems are becoming much more complex
- Multiple processes may provide better performance/power
- May be better described using concurrent sequential programs
- System synthesis: convert 1 or more processes into 1 or more processors
- Tasks
- Transformation
- Can merge 2 mutually exclusive processes into 1 process
- Can break 1 large process into separate processes
- Allocation
- Essentially the design of the system architecture
- Select processors to implement the processes
- Also select memories and buses
24 Cont.
- Tasks (cont.)
- Partitioning
- Mapping 1 or more processes to 1 or more processors (a tiny partitioning sketch follows this slide)
- Also variables among memories
- And communications among buses
- Scheduling
- Determining when each of the multiple processes on a single processor gets a chance to execute
- Memory accesses and bus communications must also be scheduled
- The tasks are performed in a variety of orders
- Iteration among the tasks is common
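Partitioning can be illustrated with a simple greedy heuristic: repeatedly place the largest unassigned process on the currently less-loaded processor. The process set and execution times below are made up, and real system synthesis weighs many more metrics (power, cost, communication).

# Made-up processes with assumed execution times
processes = {"P1": 30, "P2": 25, "P3": 20, "P4": 10, "P5": 5}

loads = {"CPU0": 0, "CPU1": 0}
assignment = {}
for proc, time in sorted(processes.items(), key=lambda x: -x[1]):
    target = min(loads, key=loads.get)   # currently less-loaded processor
    assignment[proc] = target
    loads[target] += time

print(assignment)   # {'P1': 'CPU0', 'P2': 'CPU1', 'P3': 'CPU1', 'P4': 'CPU0', 'P5': 'CPU0'}
print(loads)        # {'CPU0': 45, 'CPU1': 45}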
25 Cont.
- Synthesis is driven by constraints
- E.g., meet performance requirements at minimum cost
- Allocate as much behavior as possible to a general-purpose processor
- Low-cost, flexible implementation
- Use the minimum number of SPPs needed to meet performance
- System synthesis for GPPs only (software)
- Common for decades
- Multiprocessing
- Parallel processing
- Real-time scheduling
- Hardware/software codesign
- Simultaneous consideration of GPPs and SPPs during synthesis
- Made possible by the maturation of behavioral synthesis in the 1990s
27 Verification
- Verification is the task of ensuring that a design is correct and complete
- Correctness
- The design implements its specification correctly
- Completeness
- The design's specification describes appropriate output responses to all relevant input sequences
- There are two main verification approaches
- Formal verification
- Simulation
28 Formal Verification
- An approach of analyzing a design to prove or disprove certain properties
- Done by verifying the correctness of a particular design or the completeness of a behavioral description
- Correctness verification
- Verify that a particular structural description correctly implements a behavioral description by proving the equivalence of the two descriptions
- Example (a small equivalence-check sketch follows this slide)
- Prove an ALU structural implementation equivalent to its behavioral description
- Derive Boolean equations for the outputs
- Create a truth table for the equations
- Compare to the truth table of the original behavior
- Completeness verification
- Prove that certain situations can never occur
- Example
- Formally prove the elevator door can never open while the elevator is moving
- Derive the conditions for the door being open
- Show these conditions conflict with the conditions for the elevator moving
- Drawbacks
- Formal verification is very hard
- Limited to small designs or to verifying only certain key properties
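The truth-table comparison can be sketched concretely. Since a full ALU is large, the sketch below uses a 1-bit full adder as a stand-in: the behavioral description is arithmetic addition, the structural description is the usual gate equations, and the two truth tables are compared exhaustively.

from itertools import product

def behavioral(a, b, cin):
    # Behavioral description: arithmetic addition
    total = a + b + cin
    return total & 1, (total >> 1) & 1       # (sum, carry-out)

def structural(a, b, cin):
    # Structural description: gate-level equations of a full adder
    p = a ^ b
    s = p ^ cin
    cout = (a & b) | (p & cin)
    return s, cout

# Compare the two truth tables exhaustively
for a, b, cin in product([0, 1], repeat=3):
    assert behavioral(a, b, cin) == structural(a, b, cin)
print("structural implementation matches behavioral description")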
29 Simulation
- An approach in which we create a model of the design that can be executed on a computer
- We apply input values to the model and check that its output values match the expected values
- Correctness verification
- Example
- Show an ALU structural implementation equivalent to its behavioral description
- Provide all possible input combinations to the model
- Check the ALU outputs for correct results
- Completeness verification
- Example
- Show the elevator door stays closed while the elevator is moving
- Provide all possible input sequences
- Check that the door is always closed when the elevator is moving
- Simulating all possible inputs is impossible; e.g., all possible inputs of a 32-bit ALU means 2^32 x 2^32 = 2^64 combinations, which would take far too long to simulate
- The designer can only simulate a tiny subset of possible inputs, including typical values and boundary inputs (see the sketch after this slide)
- Simulation increases confidence in the correctness/completeness of the design but does not prove anything
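The subset idea can be sketched as follows: a 32-bit ripple-carry adder model (again a stand-in for the ALU) is simulated only on boundary values plus a handful of random typical values, since the full 2^64 input space is out of reach. Passing such a run raises confidence but proves nothing.

import random
random.seed(0)

def ripple_carry_add(a, b):
    # Structural-style model under test: 32 chained 1-bit full adders
    carry, result = 0, 0
    for i in range(32):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        result |= (ai ^ bi ^ carry) << i
        carry = (ai & bi) | (ai & carry) | (bi & carry)
    return result

boundary = [0, 1, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF]
typical = [random.getrandbits(32) for _ in range(50)]

for a in boundary + typical:
    for b in boundary + typical:
        assert ripple_carry_add(a, b) == (a + b) % (1 << 32)   # expected behavior
print("subset passed: confidence raised, correctness not proven")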
30 Simulation advantages and disadvantages
- Simulation has several advantages over a physical implementation for testing and debugging the system
- Controllability
- The ability to control the execution of the system, such as time and the data inputs
- Observability
- The ability to examine system values; the user can stop the simulation and observe internal values
- Debugging
- The user can stop the simulation at any time, change input values, internal values, or environment values, and then restart
- Setup time
- Simulation takes less setup time than a physical implementation and lets the designer test the system and check its outputs before building any hardware
- Simulation also has disadvantages
- Setting up a simulation of a complex external environment can take much time
- The environment models are likely incomplete, so the environment's behavior may not be modeled correctly
- Simulation speed is much slower than physical implementation speed
31 Cont.
- The most significant disadvantage is simulation speed
- E.g., a physical implementation of a microprocessor may execute 100 million instructions per second, while a simulation of a gate-level model may execute only 10 instructions per second: a huge gap
- Simulation is slow for several reasons
- Sequentializing a parallel design
- A design may have 1,000,000 logic gates that all operate in parallel, yet the simulator must compute each gate's inputs and outputs one gate at a time
- Several layers of software sit between the simulated system and the real hardware
- The simulator has to interpret the model, read the inputs, compute, and generate the outputs, all of which takes time; in addition, the simulator runs on top of an OS, which adds further delay
- Overcoming slow simulation speed
- Reduce the amount of real time simulated
- Instead of simulating hours of real time, we might simulate only milliseconds
- Use a faster simulator
- There are two ways to make a simulator faster
- Build or use special hardware for simulation, known as emulators
- Use a less precise/accurate simulator, trading away controllability and observability
32 Cont.
- We don't need gate-level analysis for all simulations
- E.g., cruise control
- We don't care what happens at every input/output of each logic gate
- Simulating RT components: roughly 10x faster
- Cycle-based simulation: roughly 100x faster
- Accurate at clock boundaries only
- No information on signal changes between boundaries
- A faster simulator is often combined with a reduction in simulated real time
- If willing to simulate for 10 hours
- Use an instruction-set simulator (a toy ISS sketch follows this slide)
- Real execution time simulated
- 10 hours x 1/10,000
- = 0.001 hour
- = 3.6 seconds
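An instruction-set simulator is essentially a fetch/decode/execute loop over a processor model. The sketch below simulates a tiny made-up accumulator machine; it is this kind of loop that runs orders of magnitude faster than a gate-level model of the same processor.

# Made-up accumulator machine: (opcode, operand) pairs
program = [
    ("LOADI", 5),    # acc = 5
    ("ADDI", 7),     # acc = acc + 7
    ("STORE", 0),    # mem[0] = acc
    ("HALT", 0),
]

mem = [0] * 16
acc = 0
pc = 0
while True:
    opcode, operand = program[pc]   # fetch
    pc += 1
    if opcode == "LOADI":           # decode and execute
        acc = operand
    elif opcode == "ADDI":
        acc += operand
    elif opcode == "STORE":
        mem[operand] = acc
    elif opcode == "HALT":
        break

print("acc =", acc, "mem[0] =", mem[0])   # acc = 12, mem[0] = 12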
33 Hardware/software co-simulation
- A co-simulator is designed to hide the details of integrating an ISS and an HDL simulator
- There are many simulation approaches, varying in speed, precision, and accuracy
- From very detailed simulation, like a gate-level model, to very abstract simulation, like an instruction-level model
- Simulation tools evolved separately for hardware and for software, so each has a separate design evolution
- Software: general-purpose processor (GPP)
- Typically simulated with an instruction-set simulator (ISS)
- Hardware: single-purpose processor (SPP)
- Typically simulated with models in an HDL environment
- The integration of GPPs and SPPs onto a single IC increased the need to simulate the two processors together by merging the software and hardware simulation tools
- There are two approaches to merging software and hardware simulation
- The simple way is to create an HDL model of the GPP, run the system's software on it, and integrate it with the HDL model of the SPP; this has two disadvantages
- Much slower than an ISS
- Less observable/controllable than an ISS
- The alternative is to create communication between the GPP (ISS) and the SPP (HDL simulator): each runs in its own simulator and they transfer data through shared communication when needed; this is known as hardware/software co-simulation
34 Cont.
- Modern hardware/software co-simulators not only integrate the two simulators, they also minimize the communication between them
- E.g., memory shared between a GPP and an SPP: both processors must access the memory
- Where should the memory go?
- In the ISS?
- The HDL simulator must stall for every memory access
- In the HDL simulator?
- The ISS must stall when fetching each instruction
- The solution is to keep a local copy of the memory in both the ISS and the HDL simulator, and update only the shared data in both (see the sketch after this slide)
- Huge speedups (100x or more) are reported with this technique
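The local-memory idea can be sketched as follows: each simulator keeps its own copy of memory, reads are always served locally, and only writes to addresses in an assumed shared region are propagated to the other simulator. The two objects below are trivial stand-ins for an ISS and an HDL model.

SHARED_ADDRS = {0x10, 0x11}     # assumed shared-data region

class SimNode:
    # Trivial stand-in for one side of the co-simulation (ISS or HDL model)
    def __init__(self, name):
        self.name = name
        self.mem = [0] * 256    # local copy of memory
        self.peer = None

    def write(self, addr, value):
        self.mem[addr] = value
        if addr in SHARED_ADDRS and self.peer is not None:
            # Only shared data crosses the (slow) simulator boundary
            self.peer.mem[addr] = value

    def read(self, addr):
        return self.mem[addr]   # always served locally, no stall

iss, hdl = SimNode("ISS"), SimNode("HDL")
iss.peer, hdl.peer = hdl, iss

iss.write(0x10, 42)             # shared address: propagated to the HDL side
iss.write(0x20, 7)              # private address: stays local to the ISS
print(hdl.read(0x10), hdl.read(0x20))   # 42 0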
35 Emulators
- An emulator is a general physical device onto which a system can be mapped relatively quickly and which can be placed in the system's real environment
- Created to address the problems of simulation: expensive environment setup, incomplete environment models, and slow simulation speed
- A microprocessor emulator consists of the microprocessor IC plus monitoring and controlling circuitry
- An emulator may contain tens or hundreds of FPGAs and usually supports debugging tasks
- Emulation has several advantages over simulation
- Mapped relatively quickly
- Hours or days
- Can be placed in the real environment
- No environment setup time
- No incomplete environment models
- Typically faster than simulation
- It is a hardware implementation
36 Cont.
- Emulation also has disadvantages
- Still not as fast as the real implementation
- E.g., an emulated cruise controller may not respond fast enough to keep control of the car
- Mapping is still time consuming
- E.g., for a complex SOC mapped to 10 FPGAs, just partitioning the design into 10 parts could take weeks
- Can be very expensive
- A top-of-the-line FPGA-based emulator costs 100,000 to 1 million dollars
- This leads to a resource bottleneck: a company may only be able to afford one emulator, causing groups to wait for it
37 Reuse: intellectual property cores
- Designers have long used commercial off-the-shelf (COTS) components: predesigned, packaged ICs that reduce design and debug time
- System-on-chip (SOC): implementing all components of a system on a single chip, made possible by growing IC capacities
- SOC changes the way COTS components are sold: they are now sold as intellectual property (IP) rather than as actual ICs
- I.e., sold as behavioral, structural, or physical descriptions rather than physical ICs
- Designers can integrate these descriptions with others to form one large SOC
- Processor-level IP components, whether GPP or SPP, are known as cores
38 Cont.
- Soft core
- Synthesizable behavioral description
- Typically written in HDL (VHDL/Verilog)
- Firm core
- Structural description
- Typically provided in HDL
- Hard core
- Physical description
- Provided in a variety of physical layout file formats
(Figure: Gajski's Y-chart)
39 Hard/soft core advantages and disadvantages
- Hard cores
- Ease of use
- The developer has already designed and tested the hard core
- Can be used right away
- Can be expected to work correctly
- Predictability
- Size, power, and performance can be predicted accurately
- But a hard core is specific to an exact IC process and cannot easily be mapped (retargeted) to a different process
- E.g., a core available for vendor X's 0.25 micrometer CMOS process
- Can't be used with vendor X's 0.18 micrometer process
- Can't be used with vendor Y
- Soft cores
- Can be synthesized to nearly any technology
- Can be optimized for a particular use
- E.g., deleting unused portions of the core yields lower power and smaller designs
- Require more design effort
- May not work in a technology they were not tested for
- Not as optimized as a hard core for the same processor, since hard cores have received more attention
40 Firm core advantages and disadvantages
- Compromise between hard and soft cores
- Some retargetability
- Limited optimization
- Better predictability/ease of use
41 New challenges to processor providers
- Cores have dramatically changed the business model of GPP and SPP vendors
- The main changes concern pricing models and IP protection
- Pricing models
- In the past
- Vendors sold their product as ICs to designers
- Designers had to buy any additional copies, since ICs could not (economically) be copied from the original
- Today
- Vendors can sell the product as IP instead of as ICs
- Designers incorporate the IP into an SOC
- Designers could make as many copies as needed, so vendors use different pricing models
- Royalty-based model
- Similar to the old IC model
- The designer pays for each additional copy created
- Fixed-price model
- One price for the IP; designers can make as many copies as needed
- Many other models are used
- IP protection (next slide)
42 IP protection
- IP protection has become a key concern of core providers
- In the past
- Illegally copying an IC was very difficult
- Reverse engineering required tremendous, deliberate effort
- Accidental copying was not possible
- Today
- Cores are sold in electronic format
- Both deliberate and accidental unauthorized copying are much easier
- Vendors therefore place great weight on safeguards when selling their products
- Contracts between vendors and designers ensure the IP is not copied or distributed
- Encryption techniques are used by vendors to limit actual exposure of the IP
- Watermarking
- Determines whether a particular instance of a processor was copied
- And whether the copy was authorized
43 New challenges to processor users
- Cores pose new challenges for designers using GPPs and SPPs
- Licensing arrangements
- Purchasing cores is not as easy as purchasing ICs
- More contracts enforcing the pricing model and IP protection, possibly requiring legal assistance
- Extra design effort
- Especially for soft cores
- They must still be synthesized and tested
- Minor differences in synthesis tools can cause problems
- Verification requirements are more difficult
- Extensive testing for synthesized soft cores and for soft/firm cores mapped to a particular technology
- Ensure correct synthesis
- Timing and power vary between implementations
- There is no direct access to a core once it has been integrated into a chip
- Cores are buried within the IC
- Cannot simply replace a bad core the way a bad IC could be replaced in the past
44 Design process models
- A design process model describes the order in which design steps are processed; each step has many sub-steps
- Behavior description step
- Behavior-to-structure conversion step
- Structure-to-physical-implementation mapping step
- Waterfall model
- Proceed to the next step only after the current step is completed
- Spiral model
- Proceed through the 3 steps in order but with less detail
- Repeat the 3 steps, gradually increasing the detail
- Keep repeating until the desired system is obtained
- Becoming extremely popular (in hardware and software development)
45 Waterfall method
- Suppose the designer has 6 months to build a system
- The designer starts by describing the behavior of the system completely, which may take two months
- Once fully satisfied that the behavior is correct, the designer moves to the structural design, which also takes about two months
- Once fully satisfied that the structure is correct, the physical implementation is done
- Drawbacks
- Once we have moved to the next step we cannot go back to the previous one
- Not very realistic
- Bugs are often found in later steps that must be fixed in an earlier step
- E.g., when testing the structure we notice that we forgot to handle a certain input condition at the behavior level
- A prototype is often needed to know the complete desired behavior
- E.g., the customer adds features after a product demo
- System specifications commonly change
- E.g., to remain competitive by reducing power or size, certain features may be dropped
- Unexpected iterations back through the 3 steps cause missed deadlines
- Lost revenues
- The product may never make it to market
46 Spiral method
- Suppose again the designer has 6 months to build a system
- The designer starts by describing the basic behavior of the system, incompletely; this may take a few weeks
- Proceeds to the structural design, which also may take a few weeks
- Then creates a physical prototype of the system, which is used to test the basic functions
- Go back to the first step and continue
- The first iteration of the 3 steps is incomplete
- But much faster
- Ends up with a prototype
- Used to test basic functions
- Gives an idea of functions to add or remove
- Experience from the first iteration helps in following iterations of the 3 steps
- Drawbacks
- The designer must come up with ways to obtain the structure and physical implementation quickly
- E.g., the designer may use FPGAs for the prototype, whereas generating new silicon for the final product takes a long time
- May have to use more tools
- Could require extra effort/cost for the extra tools
- Could require more time than the waterfall method, due to the overhead of creating physical prototypes, if the waterfall implementation would have been correct the first time
47 General-purpose processor design models
- Previous slides focused on SPPs
- Can apply equally to GPPs
- Waterfall model
- Structure developed by particular company
- Acquired by embedded system designer
- Designer develops software (behavior)
- Designer maps application to architecture
- Compilation
- Manual design
- Spiral-like model
- Beginning to be applied by embedded system designers
48 Spiral-like model
- Designer develops or acquires architecture
- Develops application(s)
- Maps application to architecture
- Analyzes design metrics
- Now makes choice
- Modify mapping
- Modify application(s) to better suit architecture
- Modify architecture to better suit application(s)
- Not as difficult now
- Maturation of synthesis/compilers
- IPs can be tuned
- Continue refining to a lower abstraction level until a particular implementation is chosen
49 Summary
- Design technology seeks to reduce the gap between IC capacity growth and designer productivity growth
- Synthesis has changed digital design
- Increased IC capacity means software and hardware components coexist on one chip
- Design paradigm shift to core-based design
- Simulation is essential but hard
- The spiral design process is popular
50 References
- Embedded System Design: A Unified Hardware/Software Introduction, Frank Vahid and Tony Givargis