Title: Design Technology
1 Design Technology
- By
- Hassan Al manasrah
- Tamir Al zubi
2 Outline
- Introduction
- Automation: synthesis
- Verification: hardware/software co-simulation
- Reuse: intellectual property cores
- Design process models
3 Introduction
System Design Goals
4 Introduction
- What does "design" mean?
- The task of defining system functionality and converting that functionality into a physical implementation
- Convert functionality to a physical implementation while
- Satisfying constrained metrics
- Optimizing other design metrics
- Designing embedded systems is hard because of
- Complex functionality
- Millions of possible environment scenarios, e.g., an elevator controller facing many possible combinations of buttons being pressed
- Many competing, tightly constrained metrics
- Productivity gap
- As low as 10 lines of code or 100 transistors produced per day
5 Improving productivity
- Design technologies were developed to improve productivity; we focus on technologies advancing the hardware/software view
- Automation: synthesis
- A computer program replaces manual design
- Makes hardware design look more like software design
- Reuse
- The process of using predesigned components
- Called cores in the hardware domain
- Verification
- The task of ensuring correctness/completeness of each design step
- E.g., hardware/software co-simulation
6 Automation: synthesis
- The parallel evolution of compilation and synthesis
- Synthesis levels
- Logic synthesis
- Two-level logic minimization
- Multi-level logic minimization
- FSM synthesis
- Technology mapping
- Register-transfer synthesis
- Behavioral synthesis
- System synthesis and hardware/software co-design
7 The parallel evolution of compilation and synthesis
- Early design was mostly hardware; software was fairly simple
- Software complexity increased with the advent of the general-purpose processor
- Different techniques for software design and hardware design
- Caused a division of the two fields
- The hardware and software design fields are now rejoining
- Both can start from a behavioral description in the sequential program model
8 Cont.
- Software design evolution
- Machine instructions
- A collection of machine instructions (0s and 1s) is called a program
- Assemblers
- Convert assembly programs into machine instructions; writing huge numbers of 0s and 1s by hand is too hard
- Compilers
- Translate sequential programs into assembly
- Hardware design evolution
- Interconnected logic gates
- Logic synthesis
- Converts logic equations or FSMs into gates
- Register-transfer (RT) synthesis
- Converts FSMDs into FSMs, logic equations, and predesigned RT components (registers, adders, etc.)
- Behavioral synthesis
- Converts sequential programs into FSMDs
- Hardware design involves many more dimensions: a compiler need only generate assembly instructions to implement the behavior, while the hardware designer is also concerned about size, power, performance, and other metrics
9 Synthesis Levels
- Gajski's Y-chart
- Each axis represents a type of description
- Behavioral
- Defines outputs as a function of inputs
- Structural
- Implements behavior by connecting components with known behavior
- Physical
- Gives sizes/locations of components and wires on the chip/board
- Synthesis converts behavior at a given level to structure at the same level or lower
- E.g.,
- FSM -> gates, flip-flops (same level)
- FSM -> transistors (lower level)
- FSM -/-> registers, FUs (higher level: not synthesis)
- FSM -/-> processors, memories (higher level: not synthesis)
(Figure: Gajski's Y-chart example, the behavior "Addition" implemented as a carry-ripple adder structure)
10 Logic Synthesis
- Converting logic-level behavior to a structural implementation
- I.e., converting logic equations and/or FSMs into connected gates
- Combinational logic synthesis
- Two-level minimization
- Multilevel minimization
- FSM synthesis
- State minimization
- State encoding
11 Two-level minimization
- Represent the logic function as a sum of products (or a product of sums)
- AND gate for each product
- OR gate for each sum
- Two-level gives the best possible performance: at most a 2-gate delay
- Goal: minimize size
- Minimum cover
- Minimum cover that is prime
12 Minimum Cover
- Minimum number of AND gates (sum of products)
- Literal: a variable or its complement
- a or a', b or b', etc.
- Minterm: a product of literals
- Each variable's literal appears exactly once
- a'b'cd, ab'c'd, abcd, etc.
- Implicant: a product of literals
- Each variable's literal appears no more than once
- a'b'cd, a'cd, etc.
- Covers 1 or more minterms
- a'cd covers a'bcd and a'b'cd
- Cover: a set of implicants that covers all minterms of the function
- Minimum cover: a cover with the minimum number of implicants
13 Cont.
- Minimum cover: K-map approach
- Karnaugh map (K-map)
- A 1 represents a minterm
- A circle represents an implicant
- Minimum cover
- Cover all 1s with the minimum number of circles
- Example: direct vs. minimum cover
- Fewer gates
- 4 vs. 5
- Fewer transistors
- 28 vs. 40
(Figures: K-map of the direct sum of products and K-map of the minimum cover)
Minimum cover: F = abc'd' + a'cd + ab'cd
Minimum cover implementation: 2 4-input AND gates, 1 3-input AND gate, and 1 3-input OR gate -> 28 transistors (checked in the sketch below)
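The slide's example can be checked mechanically. Below is a minimal Python sketch (not from the slides) that verifies the minimum cover F = abc'd' + a'cd + ab'cd against the four minterms it implies (abc'd', a'bcd, a'b'cd, ab'cd; the original K-map is not reproduced here, so this minterm set is an assumption), and reproduces the 40 vs. 28 transistor counts at 2 transistors per gate input.

from itertools import product

# Assumed original function: the four minterms implied by the slide's cover
# F_direct = abc'd' + a'bcd + a'b'cd + ab'cd
def f_direct(a, b, c, d):
    return (a and b and not c and not d) or \
           (not a and b and c and d) or \
           (not a and not b and c and d) or \
           (a and not b and c and d)

# Minimum cover from the slide: F = abc'd' + a'cd + ab'cd
def f_min_cover(a, b, c, d):
    return (a and b and not c and not d) or \
           (not a and c and d) or \
           (a and not b and c and d)

# Exhaustive check over all 2^4 input combinations
assert all(f_direct(*v) == f_min_cover(*v)
           for v in product([False, True], repeat=4))

# Size comparison at 2 transistors per gate input
direct_inputs = 4 * 4 + 4        # four 4-input ANDs + one 4-input OR
cover_inputs = 4 + 4 + 3 + 3     # two 4-input ANDs, one 3-input AND, one 3-input OR
print(2 * direct_inputs, 2 * cover_inputs)   # 40 28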
14 Minimum cover that is prime
- Minimize the number of inputs to AND gates
- Prime implicant
- An implicant not covered by any other implicant
- A max-sized circle in the K-map
- Minimum cover that is prime
- A cover using the minimum number of prime implicants
- Minimum number of max-sized circles
- Example: prime cover vs. min cover
- Same number of gates
- 4 vs. 4
- Fewer transistors
- 26 vs. 28
15 Minimum cover heuristics
- K-maps give the optimal solution every time
- But functions with more than 6 inputs become too complicated for a human
- Use a computer-based tabular method instead
- Finds all prime implicants
- Finds the minimum cover that is prime
- Also gives the optimal solution every time
- Problem: 2^n minterms for n inputs
- 32 inputs means about 4 billion minterms
- Exponential complexity
- Heuristic
- A solution technique where the optimal solution is not guaranteed
- Hopefully it comes close
16 Heuristics: iterative improvement
- Start with an initial solution
- I.e., the original logic equation
- Repeatedly make modifications toward a better solution
- Common modifications
- Expand
- Replace each nonprime implicant with a prime implicant covering it
- Delete all implicants covered by the new prime implicant
- Reduce
- Opposite of expand
- Reshape
- Expands one implicant while reducing another
- Maintains the total number of implicants
- Irredundant
- Selects the minimum number of existing implicants that still cover the function
- Synthesis tools differ in the modifications used and the order they are applied (a sketch of the expand step follows this slide)
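As an illustration of the expand step, here is a minimal Python sketch under assumed representations: an implicant is a dict from variable name to 0/1, missing variables are don't-cares, and the function is given by an explicit ON-set (the four minterms assumed earlier). It greedily drops literals as long as the enlarged implicant never covers a 0; real synthesis tools use far more sophisticated cube algebra.

from itertools import product

VARS = ["a", "b", "c", "d"]

# The function's ON-set: the four minterms assumed in the earlier example
ON_SET = [
    {"a": 1, "b": 1, "c": 0, "d": 0},
    {"a": 0, "b": 1, "c": 1, "d": 1},
    {"a": 0, "b": 0, "c": 1, "d": 1},
    {"a": 1, "b": 0, "c": 1, "d": 1},
]

def covers(implicant, assignment):
    # True if the implicant evaluates to 1 under the given assignment
    return all(assignment[v] == val for v, val in implicant.items())

def f(assignment):
    # The function is 1 exactly on the ON-set minterms
    return any(covers(m, assignment) for m in ON_SET)

def expand(implicant):
    # Greedily drop literals while the enlarged implicant still covers only 1s
    result = dict(implicant)
    for var in list(result):
        trial = {v: val for v, val in result.items() if v != var}
        all_ones = all(f(dict(zip(VARS, bits)))
                       for bits in product([0, 1], repeat=len(VARS))
                       if covers(trial, dict(zip(VARS, bits))))
        if all_ones:
            result = trial   # the literal was redundant, keep the expansion
    return result

# Expanding the minterm a'bcd yields the larger implicant a'cd
print(expand({"a": 0, "b": 1, "c": 1, "d": 1}))   # {'a': 0, 'c': 1, 'd': 1}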
17 Multilevel logic minimization
- Trade performance for size
- Increase delay to lower the number of gates
- The gray area represents all possible solutions
- The circle with an X represents the ideal solution
- Generally not achievable
- 2-level gives the best performance
- Max delay of 2 gates
- Solve for the smallest size
- Multilevel gives a Pareto-optimal solution
- Minimum delay for a given size
- Minimum size for a given delay
(Figure: delay vs. size plot showing the 2-level and multi-level minimization solution points)
18 Example
- Minimized 2-level logic function
- F = adef + bdef + cdef + gh
- Requires 5 gates with 18 total gate inputs
- 4 ANDs and 1 OR
- After algebraic manipulation
- F = (a + b + c)def + gh
- Requires only 4 gates with 11 total gate inputs
- 2 ANDs and 2 ORs
- Fewer inputs per gate
- Assume each gate input costs 2 transistors
- Reduced by 14 transistors
- 36 (18 x 2) down to 22 (11 x 2)
- Sacrifices performance for size
- Inputs a, b, and c now see a 3-gate delay
- An iterative improvement heuristic is commonly used (the two forms are compared in the sketch below)
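A quick way to gain confidence in such a manipulation is exhaustive comparison, which is feasible here because the function has only 8 inputs. The following Python sketch (illustrative, not from the slides) checks that the factored form equals the 2-level form and recomputes the transistor counts.

from itertools import product

def f_two_level(a, b, c, d, e, f, g, h):
    # F = adef + bdef + cdef + gh
    return (a and d and e and f) or (b and d and e and f) or \
           (c and d and e and f) or (g and h)

def f_multilevel(a, b, c, d, e, f, g, h):
    # F = (a + b + c)def + gh
    return ((a or b or c) and d and e and f) or (g and h)

# Exhaustive equivalence check over all 2^8 input combinations
assert all(f_two_level(*v) == f_multilevel(*v)
           for v in product([False, True], repeat=8))

# Gate-input counts from the slide, at 2 transistors per gate input
two_level_inputs = 4 + 4 + 4 + 2 + 4   # three 4-input ANDs, a 2-input AND, a 4-input OR
multilevel_inputs = 3 + 4 + 2 + 2      # 3-input OR, 4-input AND, 2-input AND, 2-input OR
print(2 * two_level_inputs, 2 * multilevel_inputs)   # 36 22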
19 FSM synthesis
- Converting an FSM to gates
- State minimization
- Reduce the number of states
- Identify and merge equivalent states
- Equivalent: same outputs and next states for all possible inputs
- A tabular method gives the exact solution
- Table of all possible state pairs
- If n states, n^2 table entries
- Heuristics are used when the number of states is large
- State encoding
- Assign a unique bit sequence to each state
- If n states, log2(n) bits suffice for n unique encodings
- n! possible encodings
- Thus, heuristics are common
- Fewer states means a smaller state register and fewer gates (a small state-minimization sketch follows this slide)
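State minimization by partition refinement can be sketched in a few lines. The FSM below is a made-up 4-state Moore machine (output depends only on the state), not one from the slides; the sketch groups states by output and then repeatedly splits groups whose members lead to different groups.

# Made-up 4-state Moore machine: output depends only on the state
states = ["S0", "S1", "S2", "S3"]
inputs = [0, 1]
output = {"S0": 0, "S1": 1, "S2": 0, "S3": 1}
next_state = {("S0", 0): "S1", ("S0", 1): "S2",
              ("S1", 0): "S0", ("S1", 1): "S3",
              ("S2", 0): "S3", ("S2", 1): "S0",
              ("S3", 0): "S2", ("S3", 1): "S1"}

def block_of(partition, s):
    # Index of the block that currently contains state s
    return next(i for i, blk in enumerate(partition) if s in blk)

# Start with states grouped by output
partition = [{s for s in states if output[s] == v} for v in sorted(set(output.values()))]

# Split any block whose members lead to different blocks on some input
changed = True
while changed:
    changed = False
    refined = []
    for blk in partition:
        groups = {}
        for s in blk:
            signature = tuple(block_of(partition, next_state[(s, i)]) for i in inputs)
            groups.setdefault(signature, set()).add(s)
        refined.extend(groups.values())
        changed = changed or len(groups) > 1
    partition = refined

print(partition)   # S0/S2 and S1/S3 are equivalent: 4 states reduce to 2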
20 Technology mapping
- Mapping to a library of gates available for implementation
- Simple library
- Only 2-input AND and OR gates
- Complex library
- Various-input AND, OR, NAND, NOR, etc. gates
- Efficiently implemented meta-gates (e.g., AND-OR-INVERT, MUX)
- The final structure consists only of the specified library's components
- If technology mapping is integrated with logic synthesis
- More efficient circuit
- But a more complex problem
21 Register-transfer synthesis
- Converts FSMD to custom single-purpose processor
- Datapath
- Register units to store variables
- Complex data types
- Functional units
- Arithmetic operations
- Connection units
- Buses, MUXs
- FSM controller
- Controls datapath
- Key subproblems
- Allocation
- Instantiate storage, functional, connection units
- Binding
- Mapping FSMD operations to specific units
22 Behavioral synthesis
- Also called high-level synthesis
- Converts a single sequential program to a single-purpose processor
- Unlike RT synthesis, it does not require the designer to first schedule the program into the states of an FSMD
- The behavioral synthesis tool uses advanced techniques to carry out scheduling and allocation
- Key subproblems
- Allocation
- Binding
- Scheduling
- Assign the sequential program's operations to states (a small scheduling sketch follows this slide)
- Optimizations important
- Compiler optimizations
- Constant propagation, dead-code elimination, loop unrolling
- Advanced techniques for allocation, binding, scheduling
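Scheduling can be illustrated with an ASAP (as-soon-as-possible) scheduler: each operation is assigned to the earliest state in which all of its operands are ready. The dataflow graph below is a made-up example; real behavioral synthesis must also respect resource constraints.

# Dependence graph of a made-up dataflow example: op -> ops it depends on
deps = {
    "t1 = a * b": [],
    "t2 = c * d": [],
    "t3 = t1 + t2": ["t1 = a * b", "t2 = c * d"],
    "t4 = t3 + e": ["t3 = t1 + t2"],
}

def asap_schedule(deps):
    state = {}
    remaining = set(deps)
    while remaining:
        for op in sorted(remaining):
            if all(d in state for d in deps[op]):
                # Earliest state: one after the latest predecessor
                state[op] = 1 + max((state[d] for d in deps[op]), default=0)
                remaining.discard(op)
                break
    return state

for op, s in sorted(asap_schedule(deps).items(), key=lambda x: x[1]):
    print("state", s, ":", op)
# t1 and t2 land in state 1, t3 in state 2, t4 in state 3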
23 System synthesis
A collection of processors
- Embedded systems are becoming much more complex
- Multiple processes may provide better performance/power
- May be better described using concurrent sequential programs
- System synthesis: convert 1 or more processes into 1 or more processors
- Tasks
- Transformation
- Can merge 2 mutually exclusive processes into 1 process
- Can break 1 large process into separate processes
- Allocation
- Essentially the design of the system architecture
- Select processors to implement the processes
- Also select memories and buses
24 Cont.
- Tasks (cont.)
- Partitioning
- Mapping 1 or more processes to 1 or more processors (a tiny partitioning sketch follows this slide)
- Also variables among memories
- And communications among buses
- Scheduling
- Determining when each of the multiple processes on a single processor gets a chance to execute
- Memory accesses and bus communications must also be scheduled
- The tasks are performed in a variety of orders
- Iteration among the tasks is common
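Partitioning can be illustrated with a simple greedy heuristic: repeatedly place the largest unassigned process on the currently less-loaded processor. The process set and execution times below are made up, and real system synthesis weighs many more metrics (power, cost, communication).

# Made-up processes with assumed execution times
processes = {"P1": 30, "P2": 25, "P3": 20, "P4": 10, "P5": 5}

loads = {"CPU0": 0, "CPU1": 0}
assignment = {}
for proc, time in sorted(processes.items(), key=lambda x: -x[1]):
    target = min(loads, key=loads.get)   # currently less-loaded processor
    assignment[proc] = target
    loads[target] += time

print(assignment)   # {'P1': 'CPU0', 'P2': 'CPU1', 'P3': 'CPU1', 'P4': 'CPU0', 'P5': 'CPU0'}
print(loads)        # {'CPU0': 45, 'CPU1': 45}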
25 Cont.
- Synthesis is driven by constraints
- E.g., meet performance requirements at minimum cost
- Allocate as much behavior as possible to a general-purpose processor
- Low-cost, flexible implementation
- Use the minimum number of SPPs needed to meet performance
- System synthesis for GPPs only (software)
- Common for decades
- Multiprocessing
- Parallel processing
- Real-time scheduling
- Hardware/software codesign
- Simultaneous consideration of GPPs and SPPs during synthesis
- Made possible by the maturation of behavioral synthesis in the 1990s
27 Verification
- Verification is the task of ensuring that a design is correct and complete
- Correctness
- The design implements its specification correctly
- Completeness
- The design's specification describes appropriate output responses to all relevant input sequences
- There are two main verification approaches
- Formal verification
- Simulation
28 Formal Verification
- An approach of analyzing a design to prove or disprove certain properties
- Done by verifying the correctness of a particular design or the completeness of a behavioral description
- Correctness verification
- Verify that a particular structural description correctly implements a behavioral description by proving the equivalence of the two descriptions
- Example (a small equivalence-check sketch follows this slide)
- Prove an ALU structural implementation equivalent to its behavioral description
- Derive Boolean equations for the outputs
- Create a truth table for the equations
- Compare to the truth table of the original behavior
- Completeness verification
- Prove that certain situations can never occur
- Example
- Formally prove the elevator door can never open while the elevator is moving
- Derive the conditions for the door being open
- Show these conditions conflict with the conditions for the elevator moving
- Drawbacks
- Formal verification is very hard
- Limited to small designs or to verifying only certain key properties
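The truth-table comparison can be sketched concretely. Since a full ALU is large, the sketch below uses a 1-bit full adder as a stand-in: the behavioral description is arithmetic addition, the structural description is the usual gate equations, and the two truth tables are compared exhaustively.

from itertools import product

def behavioral(a, b, cin):
    # Behavioral description: arithmetic addition
    total = a + b + cin
    return total & 1, (total >> 1) & 1       # (sum, carry-out)

def structural(a, b, cin):
    # Structural description: gate-level equations of a full adder
    p = a ^ b
    s = p ^ cin
    cout = (a & b) | (p & cin)
    return s, cout

# Compare the two truth tables exhaustively
for a, b, cin in product([0, 1], repeat=3):
    assert behavioral(a, b, cin) == structural(a, b, cin)
print("structural implementation matches behavioral description")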
29 Simulation
- An approach in which we create a model of the design that can be executed on a computer
- We apply input values to the model and check that its output values match the expected values
- Correctness verification
- Example
- Show an ALU structural implementation equivalent to its behavioral description
- Provide all possible input combinations to the model
- Check the ALU outputs for correct results
- Completeness verification
- Example
- Show the elevator door stays closed while the elevator is moving
- Provide all possible input sequences
- Check that the door is always closed when the elevator is moving
- Simulating all possible inputs is impossible; e.g., all possible inputs of a 32-bit ALU means 2^32 x 2^32 = 2^64 combinations, which would take far too long to simulate
- The designer can only simulate a tiny subset of possible inputs, including typical values and boundary inputs (see the sketch after this slide)
- Simulation increases confidence in the correctness/completeness of the design but does not prove anything
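The subset idea can be sketched as follows: a 32-bit ripple-carry adder model (again a stand-in for the ALU) is simulated only on boundary values plus a handful of random typical values, since the full 2^64 input space is out of reach. Passing such a run raises confidence but proves nothing.

import random
random.seed(0)

def ripple_carry_add(a, b):
    # Structural-style model under test: 32 chained 1-bit full adders
    carry, result = 0, 0
    for i in range(32):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        result |= (ai ^ bi ^ carry) << i
        carry = (ai & bi) | (ai & carry) | (bi & carry)
    return result

boundary = [0, 1, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF]
typical = [random.getrandbits(32) for _ in range(50)]

for a in boundary + typical:
    for b in boundary + typical:
        assert ripple_carry_add(a, b) == (a + b) % (1 << 32)   # expected behavior
print("subset passed: confidence raised, correctness not proven")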
30 Simulation advantages and disadvantages
- Simulation has several advantages over a physical implementation for testing and debugging the system
- Controllability
- The ability to control the execution of the system, such as time and the data inputs
- Observability
- The ability to examine system values; the user can stop the simulation and observe internal values
- Debugging
- The user can stop the simulation at any time, change input values, internal values, or environment values, and then restart
- Setup time
- Simulation takes less setup time than a physical implementation and lets the designer test the system and check its outputs before building any hardware
- Simulation also has disadvantages
- Setting up a simulation of a complex external environment can take much time
- The environment models are likely incomplete, so the environment's behavior may not be modeled correctly
- Simulation speed is much slower than physical implementation speed
31 Cont.
- The most significant disadvantage is simulation speed
- E.g., a physical implementation of a microprocessor may execute 100 million instructions per second, while a simulation of a gate-level model may execute only 10 instructions per second: a huge gap
- Simulation is slow for several reasons
- Sequentializing a parallel design
- A design may have 1,000,000 logic gates that all operate in parallel, yet the simulator must compute each gate's inputs and outputs one gate at a time
- Several layers of software sit between the simulated system and the real hardware
- The simulator has to interpret the model, read the inputs, compute, and generate the outputs, all of which takes time; in addition, the simulator runs on top of an OS, which adds further delay
- Overcoming slow simulation speed
- Reduce the amount of real time simulated
- Instead of simulating hours of real time, we might simulate only milliseconds
- Use a faster simulator
- There are two ways to make a simulator faster
- Build or use special hardware for simulation, known as emulators
- Use a less precise/accurate simulator, trading away controllability and observability
32 Cont.
- We don't need gate-level analysis for all simulations
- E.g., cruise control
- We don't care what happens at every input/output of each logic gate
- Simulating RT components: roughly 10x faster
- Cycle-based simulation: roughly 100x faster
- Accurate at clock boundaries only
- No information on signal changes between boundaries
- A faster simulator is often combined with a reduction in simulated real time
- If willing to simulate for 10 hours
- Use an instruction-set simulator (a toy ISS sketch follows this slide)
- Real execution time simulated
- 10 hours x 1/10,000
- = 0.001 hour
- = 3.6 seconds
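An instruction-set simulator is essentially a fetch/decode/execute loop over a processor model. The sketch below simulates a tiny made-up accumulator machine; it is this kind of loop that runs orders of magnitude faster than a gate-level model of the same processor.

# Made-up accumulator machine: (opcode, operand) pairs
program = [
    ("LOADI", 5),    # acc = 5
    ("ADDI", 7),     # acc = acc + 7
    ("STORE", 0),    # mem[0] = acc
    ("HALT", 0),
]

mem = [0] * 16
acc = 0
pc = 0
while True:
    opcode, operand = program[pc]   # fetch
    pc += 1
    if opcode == "LOADI":           # decode and execute
        acc = operand
    elif opcode == "ADDI":
        acc += operand
    elif opcode == "STORE":
        mem[operand] = acc
    elif opcode == "HALT":
        break

print("acc =", acc, "mem[0] =", mem[0])   # acc = 12, mem[0] = 12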
33 Hardware/software co-simulation
- A co-simulator is designed to hide the details of integrating an ISS and an HDL simulator
- There are many simulation approaches, varying in speed, precision, and accuracy
- From very detailed simulation, like a gate-level model, to very abstract simulation, like an instruction-level model
- Simulation tools evolved separately for hardware and for software, so each has a separate design evolution
- Software: general-purpose processor (GPP)
- Typically simulated with an instruction-set simulator (ISS)
- Hardware: single-purpose processor (SPP)
- Typically simulated with models in an HDL environment
- The integration of GPPs and SPPs onto a single IC increased the need to simulate the two processors together by merging the software and hardware simulation tools
- There are two approaches to merging software and hardware simulation
- The simple way is to create an HDL model of the GPP, run the system's software on it, and integrate it with the HDL model of the SPP; this has two disadvantages
- Much slower than an ISS
- Less observable/controllable than an ISS
- The alternative is to create communication between the GPP (ISS) and the SPP (HDL simulator): each runs in its own simulator and they transfer data through shared communication when needed; this is known as hardware/software co-simulation
34 Cont.
- Modern hardware/software co-simulators not only integrate the two simulators, they also minimize the communication between them
- E.g., memory shared between a GPP and an SPP: both processors must access the memory
- Where should the memory go?
- In the ISS?
- The HDL simulator must stall for every memory access
- In the HDL simulator?
- The ISS must stall when fetching each instruction
- The solution is to keep a local copy of the memory in both the ISS and the HDL simulator, and update only the shared data in both (see the sketch after this slide)
- Huge speedups (100x or more) are reported with this technique
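The local-memory idea can be sketched as follows: each simulator keeps its own copy of memory, reads are always served locally, and only writes to addresses in an assumed shared region are propagated to the other simulator. The two objects below are trivial stand-ins for an ISS and an HDL model.

SHARED_ADDRS = {0x10, 0x11}     # assumed shared-data region

class SimNode:
    # Trivial stand-in for one side of the co-simulation (ISS or HDL model)
    def __init__(self, name):
        self.name = name
        self.mem = [0] * 256    # local copy of memory
        self.peer = None

    def write(self, addr, value):
        self.mem[addr] = value
        if addr in SHARED_ADDRS and self.peer is not None:
            # Only shared data crosses the (slow) simulator boundary
            self.peer.mem[addr] = value

    def read(self, addr):
        return self.mem[addr]   # always served locally, no stall

iss, hdl = SimNode("ISS"), SimNode("HDL")
iss.peer, hdl.peer = hdl, iss

iss.write(0x10, 42)             # shared address: propagated to the HDL side
iss.write(0x20, 7)              # private address: stays local to the ISS
print(hdl.read(0x10), hdl.read(0x20))   # 42 0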
35 Emulators
- An emulator is a general physical device onto which a system can be mapped relatively quickly and which can be placed in the system's real environment
- Created to address the problems of simulation: expensive environment setup, incomplete environment models, and slow simulation speed
- A microprocessor emulator consists of the microprocessor IC plus monitoring and controlling circuitry
- An emulator may contain tens or hundreds of FPGAs and usually supports debugging tasks
- Emulation has several advantages over simulation
- Mapped relatively quickly
- Hours or days
- Can be placed in the real environment
- No environment setup time
- No incomplete environment models
- Typically faster than simulation
- It is a hardware implementation
36 Cont.
- Emulation also has disadvantages
- Still not as fast as the real implementation
- E.g., an emulated cruise controller may not respond fast enough to keep control of the car
- Mapping is still time consuming
- E.g., for a complex SOC mapped to 10 FPGAs, just partitioning the design into 10 parts could take weeks
- Can be very expensive
- A top-of-the-line FPGA-based emulator costs 100,000 to 1 million dollars
- This leads to a resource bottleneck: a company may only be able to afford one emulator, causing groups to wait for it
37 Reuse: intellectual property cores
- Designers have long used commercial off-the-shelf (COTS) components: predesigned, packaged ICs that reduce design and debug time
- System-on-chip (SOC): implementing all components of a system on a single chip, made possible by growing IC capacities
- SOC changes the way COTS components are sold: they are now sold as intellectual property (IP) rather than as actual ICs
- I.e., sold as behavioral, structural, or physical descriptions rather than physical ICs
- Designers can integrate these descriptions with others to form one large SOC
- Processor-level IP components, whether GPP or SPP, are known as cores
38 Cont.
- Soft core
- Synthesizable behavioral description
- Typically written in HDL (VHDL/Verilog)
- Firm core
- Structural description
- Typically provided in HDL
- Hard core
- Physical description
- Provided in a variety of physical layout file formats
(Figure: Gajski's Y-chart)
39 Hard/soft core advantages and disadvantages
- Hard cores
- Ease of use
- The developer has already designed and tested the hard core
- Can be used right away
- Can be expected to work correctly
- Predictability
- Size, power, and performance can be predicted accurately
- But a hard core is specific to an exact IC process and cannot easily be mapped (retargeted) to a different process
- E.g., a core available for vendor X's 0.25 micrometer CMOS process
- Can't be used with vendor X's 0.18 micrometer process
- Can't be used with vendor Y
- Soft cores
- Can be synthesized to nearly any technology
- Can be optimized for a particular use
- E.g., deleting unused portions of the core yields lower power and smaller designs
- Require more design effort
- May not work in a technology they were not tested for
- Not as optimized as a hard core for the same processor, since hard cores have received more attention
40 Firm core advantages and disadvantages
- Compromise between hard and soft cores
- Some retargetability
- Limited optimization
- Better predictability/ease of use
41 New challenges to processor providers
- Cores have dramatically changed the business model of GPP and SPP vendors
- The main changes concern pricing models and IP protection
- Pricing models
- In the past
- Vendors sold their product as ICs to designers
- Designers had to buy any additional copies, since ICs could not (economically) be copied from the original
- Today
- Vendors can sell the product as IP instead of as ICs
- Designers incorporate the IP into an SOC
- Designers could make as many copies as needed, so vendors use different pricing models
- Royalty-based model
- Similar to the old IC model
- The designer pays for each additional copy created
- Fixed-price model
- One price for the IP; designers can make as many copies as needed
- Many other models are used
- IP protection (next slide)
42 IP protection
- IP protection has become a key concern of core providers
- In the past
- Illegally copying an IC was very difficult
- Reverse engineering required tremendous, deliberate effort
- Accidental copying was not possible
- Today
- Cores are sold in electronic format
- Both deliberate and accidental unauthorized copying are much easier
- Vendors therefore place great weight on safeguards when selling their products
- Contracts between vendors and designers ensure the IP is not copied or distributed
- Encryption techniques are used by vendors to limit actual exposure of the IP
- Watermarking
- Determines whether a particular instance of a processor was copied
- And whether the copy was authorized
43 New challenges to processor users
- Cores pose new challenges for designers using GPPs and SPPs
- Licensing arrangements
- Purchasing cores is not as easy as purchasing ICs
- More contracts enforcing the pricing model and IP protection, possibly requiring legal assistance
- Extra design effort
- Especially for soft cores
- They must still be synthesized and tested
- Minor differences in synthesis tools can cause problems
- Verification requirements are more difficult
- Extensive testing for synthesized soft cores and for soft/firm cores mapped to a particular technology
- Ensure correct synthesis
- Timing and power vary between implementations
- There is no direct access to a core once it has been integrated into a chip
- Cores are buried within the IC
- Cannot simply replace a bad core the way a bad IC could be replaced in the past
44 Design process models
- A design process model describes the order in which design steps are processed; each step has many sub-steps
- Behavior description step
- Behavior-to-structure conversion step
- Structure-to-physical-implementation mapping step
- Waterfall model
- Proceed to the next step only after the current step is completed
- Spiral model
- Proceed through the 3 steps in order but with less detail
- Repeat the 3 steps, gradually increasing the detail
- Keep repeating until the desired system is obtained
- Becoming extremely popular (in hardware and software development)
45 Waterfall method
- Suppose the designer has 6 months to build a system
- The designer starts by describing the behavior of the system completely, which may take two months
- Once fully satisfied that the behavior is correct, the designer moves to the structural design, which also takes about two months
- Once fully satisfied that the structure is correct, the physical implementation is done
- Drawbacks
- Once we have moved to the next step we cannot go back to the previous one
- Not very realistic
- Bugs are often found in later steps that must be fixed in an earlier step
- E.g., when testing the structure we notice that we forgot to handle a certain input condition at the behavior level
- A prototype is often needed to know the complete desired behavior
- E.g., the customer adds features after a product demo
- System specifications commonly change
- E.g., to remain competitive by reducing power or size, certain features may be dropped
- Unexpected iterations back through the 3 steps cause missed deadlines
- Lost revenues
- The product may never make it to market
46 Spiral method
- Suppose again the designer has 6 months to build a system
- The designer starts by describing the basic behavior of the system, incompletely; this may take a few weeks
- Proceeds to the structural design, which also may take a few weeks
- Then creates a physical prototype of the system, which is used to test the basic functions
- Go back to the first step and continue
- The first iteration of the 3 steps is incomplete
- But much faster
- Ends up with a prototype
- Used to test basic functions
- Gives an idea of functions to add or remove
- Experience from the first iteration helps in following iterations of the 3 steps
- Drawbacks
- The designer must come up with ways to obtain the structure and physical implementation quickly
- E.g., the designer may use FPGAs for the prototype, whereas generating new silicon for the final product takes a long time
- May have to use more tools
- Could require extra effort/cost for the extra tools
- Could require more time than the waterfall method, due to the overhead of creating physical prototypes, if the waterfall implementation would have been correct the first time
47 General-purpose processor design models
- Previous slides focused on SPPs
- Can apply equally to GPPs
- Waterfall model
- Structure developed by particular company
- Acquired by embedded system designer
- Designer develops software (behavior)
- Designer maps application to architecture
- Compilation
- Manual design
- Spiral-like model
- Beginning to be applied by embedded system designers
48 Spiral-like model
- Designer develops or acquires architecture
- Develops application(s)
- Maps application to architecture
- Analyzes design metrics
- Now makes choice
- Modify mapping
- Modify application(s) to better suit architecture
- Modify architecture to better suit application(s)
- Not as difficult now
- Maturation of synthesis/compilers
- IPs can be tuned
- Continue refining to a lower abstraction level until a particular implementation is chosen
49 Summary
- Design technology seeks to reduce the gap between IC capacity growth and designer productivity growth
- Synthesis has changed digital design
- Increased IC capacity means software and hardware components coexist on one chip
- Design paradigm shift to core-based design
- Simulation is essential but hard
- The spiral design process is popular
50 References
- Embedded System Design: A Unified Hardware/Software Introduction, Frank Vahid and Tony Givargis