Title: Formal Verification of Pipelined Processors
1. Formal Verification Using Infinite-State Models
Randal E. Bryant
Carnegie Mellon University
http://www.cs.cmu.edu/~bryant
Contributions by graduate students Miroslav
Velev, Sanjit Seshia, Shuvendu Lahiri
2. Outline
- Task
- Formally verify hardware and software systems
- Build on success in verifying finite models
- Infinite-State Models
- How do they arise?
- Need logic that is suitably expressive, yet remains reasonably tractable
- Verification Techniques
- Range of methods with varying capabilities and limitations
- Solve problems by mapping into propositional logic
- Proof engines can use powerful Boolean methods
3. Example: HP/Compaq Alpha 21264
- Pipeline State
- Multiple caches
- Instruction queues
- Dynamically-allocated registers
- Memory queue
- Many buffers between stages
- Verification Tasks
- Does it implement the Alpha ISA?
- Do specific units satisfy desired properties?
Microprocessor Report, Oct. 28, 1996
4. Temporal Logic Model Checking
- Verify Reactive Systems
- Construct state machine representation of reactive system
- Nondeterminism expresses range of possible behaviors
- Product of component state machines
- Express desired behavior as formula in temporal logic
- Determine whether or not property holds
[Figure: the model checker takes a traffic light controller design and the property "It is never possible to have a green light for both N-S and E-W", and reports either True or False with a counterexample.]
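The traffic light property can be illustrated with a very small explicit-state checker. This is only a sketch: the controller below (two lights, each cycling red, green, yellow, with an interlock on turning green) is a hypothetical design invented for the example, and the search is plain breadth-first reachability rather than a temporal logic model checker.

```python
from collections import deque

# Hypothetical controller: each light cycles red -> green -> yellow -> red.
NEXT_COLOR = {"red": "green", "green": "yellow", "yellow": "red"}

def guarded_successors(state):
    """One light advances per step; a light may turn green only if the other is red."""
    ns, ew = state
    if NEXT_COLOR[ns] != "green" or ew == "red":
        yield (NEXT_COLOR[ns], ew)
    if NEXT_COLOR[ew] != "green" or ns == "red":
        yield (ns, NEXT_COLOR[ew])

def unguarded_successors(state):
    """Buggy variant: the interlock on turning green is missing."""
    ns, ew = state
    yield (NEXT_COLOR[ns], ew)
    yield (ns, NEXT_COLOR[ew])

def check_never(initial, successors, bad):
    """Return None if no bad state is reachable, else a shortest counterexample trace."""
    parent = {initial: None}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        if bad(state):
            trace = []
            while state is not None:
                trace.append(state)
                state = parent[state]
            return trace[::-1]
        for nxt in successors(state):
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None

def both_green(state):
    return state == ("green", "green")

START = ("red", "red")
print(check_never(START, guarded_successors, both_green))
print(check_never(START, unguarded_successors, both_green))
```

The first call corresponds to the "True" outcome; the second returns a trace ending in the state where both lights are green, which plays the role of the counterexample.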
5. Finite System Modeling Example
Distributed, Shared Memory System
- Simplifying Abstractions
- Single word cache
- Single bit/word
- Abstract other clusters
- Imprecise timing
Arbitrary reads & writes
6. Symbolic FSM Analysis Example
- K. McMillan, E. Clarke (CMU), J. Schwalbe (Encore Computer)
- Encore Gigamax Cache System
- Distributed memory multiprocessor
- Cache system to improve access time
- Complex hardware and synchronization protocol
- Verification
- Create simplified finite state model of system (10^9 states!)
- Verify properties about set of reachable states
- Bug Detected
- Sequence of 13 bus events leading to deadlock
- With random simulations, would require ≈2 years to generate failing case
- In real system, would yield MTBF < 1 day
7. Boolean Manipulation with OBDDs
- Ordered Binary Decision Diagrams
- Data structure for representing Boolean functions
- Key to success in hardware verification
- Example
- (x1 ∨ x2) ∧ x3
- Nodes represent variable tests
- Branches represent variable values
- Dashed for value 0
- Solid for value 1
- Canonical representation
- when reduction rules applied
- Makes equivalence trivial
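The claim that canonicity makes equivalence trivial can be seen in a toy implementation. The sketch below is a minimal reduced ordered BDD manager written for this illustration (the class name `BDD` and its methods are assumptions, not the API of any real BDD package). Because the two reduction rules are applied whenever a node is created, two functions are equivalent exactly when they map to the same node id.

```python
import operator

class BDD:
    """Minimal reduced ordered BDD manager (illustrative sketch only)."""

    def __init__(self, num_vars):
        self.num_vars = num_vars
        self.nodes = [None, None]   # ids 0 and 1 are the terminal nodes
        self.unique = {}            # (var, low, high) -> node id
        self.cache = {}             # memo table for apply()

    def mk(self, var, low, high):
        # Reduction rule 1: drop a test whose two branches agree.
        if low == high:
            return low
        # Reduction rule 2: share structurally identical nodes.
        key = (var, low, high)
        if key not in self.unique:
            self.nodes.append(key)
            self.unique[key] = len(self.nodes) - 1
        return self.unique[key]

    def var(self, index):
        return self.mk(index, 0, 1)

    def _level(self, u):
        # Terminals sit below every variable in the ordering.
        return self.num_vars if u < 2 else self.nodes[u][0]

    def _cofactors(self, u, level):
        if self._level(u) == level:
            _, low, high = self.nodes[u]
            return low, high
        return u, u

    def apply(self, op, u, v):
        """Combine two BDDs with a binary Boolean operator."""
        if u < 2 and v < 2:
            return int(op(bool(u), bool(v)))
        key = (op, u, v)
        if key not in self.cache:
            level = min(self._level(u), self._level(v))
            u0, u1 = self._cofactors(u, level)
            v0, v1 = self._cofactors(v, level)
            self.cache[key] = self.mk(
                level, self.apply(op, u0, v0), self.apply(op, u1, v1))
        return self.cache[key]

bdd = BDD(3)
x1, x2, x3 = bdd.var(0), bdd.var(1), bdd.var(2)
# Two syntactically different formulas for the same function.
f = bdd.apply(operator.and_, bdd.apply(operator.or_, x1, x2), x3)
g = bdd.apply(operator.or_,
              bdd.apply(operator.and_, x1, x3),
              bdd.apply(operator.and_, x2, x3))
print(f == g)   # equivalence check is a comparison of node ids
```

Here `f` encodes (x1 ∨ x2) ∧ x3 and `g` encodes its distributed form; with the variable order x1, x2, x3 both reduce to the same graph.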
8. Representing Circuit Functions
- Functions
- All outputs of 4-bit adder
- Functions of data inputs
- Shared Representation
- Graph with multiple roots
- 31 nodes for 4-bit adder
- 571 nodes for 64-bit adder
- Linear growth
9. Simplified Processor Example
[Figure: simplified pipeline block diagram]
- Simplified RISC pipeline
- Register-Register and Register-Immediate operations
- Data hazards handled by register forwarding
- Each step of operation defined by function δpipe
10. ISA Reference Model
[Figure: ISA reference model block diagram]
- Only programmer-visible state
- Much simpler control logic
- Assume verified against instruction set definition
- Each step of operation defined by function δspec
11. Abstracting Data from Bits to Integers
[Figure: data word made up of bits x0, x1, x2, …, xn-1]
- View Data as Symbolic Terms
- Arbitrary integers
- Verification proves correctness of design for all possible word sizes
- Can store in memories & registers
- Can select with multiplexors
- ITE: If-Then-Else operation
12. Abstraction Via Uninterpreted Functions
[Figure: functional block replaced by uninterpreted function f]
- For any Block that Transforms or Evaluates Data
- Replace with generic, unspecified function
- Only assumed property is functional consistency
- a = x ∧ b = y ⇒ f(a, b) = f(x, y)
13. Abstraction Via Uninterpreted Functions
[Figure: pipeline blocks replaced by uninterpreted functions F1, F2, F3]
- For any Block that Transforms or Evaluates Data
- Replace with generic, unspecified function
- Also view instruction memory as function
14. Abstracting Reference Model
[Figure: ISA reference model with its functional blocks abstracted]
- Abstract with identical functions as in pipeline model
15. EUF: Equality with Uninterpreted Functions
- Decidable fragment of first order logic
- Formulas (F): Boolean Expressions
- ¬F, F1 ∧ F2, F1 ∨ F2: Boolean connectives
- T1 = T2: Equation
- P(T1, …, Tk): Predicate application
- Terms (T): Integer Expressions
- ITE(F, T1, T2): If-then-else
- Fun(T1, …, Tk): Function application
- Functions (Fun): Integer → Integer
- f: Uninterpreted function symbol
- Read, Write: Memory operations
- Predicates (P): Integer → Boolean
- p: Uninterpreted predicate symbol
16. Correctness of Pipeline
[Figure: commutative diagram over pipeline states Qpipe, Q′pipe and specification states Qspec, Q′spec]
- Abstraction Function Abs
- Relates state of pipeline to program state
- Result of completing partially-executed instructions
- Requirement
- Pipeline step δpipe matches k instruction executions δspec^k
- For our pipeline, k = 1
- When pipeline stalls, have k = 0
- Superscalar pipelines can have k > 1
17. Correspondence Checking
- Burch & Dill, Computer-Aided Verification '94
- Exploit State Structure
- State held in memories and pipeline latches
- Memories match those of instruction set model
- Latches hold additional pipeline state
- Pipeline State can be flushed
- Control logic to support external interrupts
- Complete in-flight instructions
- Without fetching any new ones
18. Computing Abstraction Function
- Method
- Start with arbitrary pipeline state Qpipe
- Symbolically simulate processor with stall asserted
- Project out all but programmer-visible state
- Effect
- Processor computes its own abstraction function!
[Figure: flushing maps pipeline state Qpipe to specification state Qspec]
19. Computational Task: Single-Issue Processor
[Figure: two simulation paths starting from Qpipe, with the resulting states compared]
- Compare results of two symbolic simulations
- Starting from same initial state
- Number of simulation steps pipeline depth
- Check that resulting user-visible states identical
- Disjunctive acceptance condition
- Extra clock cycle causes either 0 or 1 new instructions to complete
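The comparison of the two simulations can be tried on a toy machine. The sketch below is an assumption-heavy miniature: a hypothetical two-register ISA with one add instruction format, a pipeline with a single write-back latch, and exhaustive enumeration over a tiny value domain in place of term-level symbolic simulation. Flushing is one step with stall asserted, and the check is Abs(δpipe(Q)) = δspec(Abs(Q)) for every pipeline state Q.

```python
from itertools import product

MOD = 3                              # tiny value domain so all states can be enumerated
PROGRAM = [(0, 0, 1), (1, 0, 0)]     # hypothetical (dst, src1, src2): r[dst] = r[src1] + r[src2]

def spec_step(state):
    """ISA model: execute one instruction on the programmer-visible state."""
    pc, regs = state
    dst, a, b = PROGRAM[pc]
    new_regs = list(regs)
    new_regs[dst] = (regs[a] + regs[b]) % MOD
    return ((pc + 1) % len(PROGRAM), tuple(new_regs))

def pipe_step(state, stall, forwarding=True):
    """Pipeline model: the latch holds a computed result awaiting write-back."""
    pc, regs, latch = state
    new_regs = list(regs)
    if latch is not None:            # write-back stage
        new_regs[latch[0]] = latch[1]
    if stall:                        # no fetch: the latch drains
        return (pc, tuple(new_regs), None)
    dst, a, b = PROGRAM[pc]

    def read(r):
        # Forward the in-flight result when it targets the register being read.
        if forwarding and latch is not None and latch[0] == r:
            return latch[1]
        return regs[r]               # stale if a data hazard is ignored

    new_latch = (dst, (read(a) + read(b)) % MOD)
    return ((pc + 1) % len(PROGRAM), tuple(new_regs), new_latch)

def abstraction(state):
    """Flush with stall asserted, then keep only programmer-visible state."""
    pc, regs, _ = pipe_step(state, stall=True)
    return (pc, regs)

def all_pipeline_states():
    latches = [None] + [(d, v) for d in range(2) for v in range(MOD)]
    for pc in range(len(PROGRAM)):
        for regs in product(range(MOD), repeat=2):
            for latch in latches:
                yield (pc, regs, latch)

def burch_dill_holds(forwarding):
    """Check Abs(pipe_step(Q)) == spec_step(Abs(Q)) for every state Q (k = 1)."""
    for q in all_pipeline_states():
        after_pipe = abstraction(pipe_step(q, stall=False, forwarding=forwarding))
        after_spec = spec_step(abstraction(q))
        if after_pipe != after_spec:
            return False
    return True

print(burch_dill_holds(forwarding=True))
print(burch_dill_holds(forwarding=False))
```

Turning forwarding off models a pipeline that ignores data hazards, so the two paths around the diagram disagree for some state. A real verifier would replace the concrete adder and program with uninterpreted functions, so that one symbolic run covers all data values.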
20. Computational Task: Dual-Issue Processor
- Extra clock cycle causes 0, 1, or 2 new instructions to complete
21. Term-Level Symbolic Simulation
- Simulator Operation
- Register states are term-level expressions
- Denoted by pointers to nodes in Directed Acyclic Graph (DAG)
- Simulate each cycle of circuit by adding new nodes to DAG
- Based on circuit operations
- Construct DAG denoting correctness condition
22. Decision Problem
- Logic of Equality with Uninterpreted Functions (EUF)
- Truth Values
- Dashed Lines
- Model Control
- Logical connectives
- Equations
- Integer Values
- Solid lines
- Model Data
- Uninterpreted functions
- If-Then-Else operation
- Task
- Determine whether formula is universally valid
- True for all interpretations of variables and function symbols
23. Finite Model Property for EUF
- Observation
- Any formula has limited number of distinct expressions
- Only property that matters is whether or not different terms are equal
24. Boolean Encoding of Integer Values
Expression   Possible Values   Bit Encoding
x0           0                 0    0
d0           0,1               0    b10
f(x0)        0,1,2             b21  b20
f(d0)        0,1,2,3           b31  b30
- For Each Expression
- Either equal to or distinct from each preceding expression
- Boolean Encoding
- Use Boolean values to encode integers over small range
- EUF formula can be translated into propositional logic
- Tautology iff original formula valid
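The finite model idea can be illustrated by brute force. The sketch below assumes that a formula over n distinct term expressions only needs to be checked over a domain of n values, and it enumerates every interpretation of two variables and one unary function over that domain. This is plain enumeration, not the bit-level encoding in the table above; the function name `is_valid` and the two sample formulas are made up for the example.

```python
from itertools import product

def is_valid(formula, num_terms):
    """Check a formula over x, y and a unary function f for all small interpretations.

    num_terms is the number of distinct term expressions in the formula;
    it is used as the domain size (an assumption of this sketch).
    """
    domain = range(num_terms)
    for x, y in product(domain, repeat=2):
        # Every function from the domain to itself, given as a lookup table.
        for table in product(domain, repeat=num_terms):
            f = lambda v, table=table: table[v]
            if not formula(x, y, f):
                return False
    return True

# x = y  =>  f(x) = f(y): functional consistency, expected to be valid.
consistency = lambda x, y, f: (x != y) or (f(x) == f(y))

# f(x) = f(y)  =>  x = y: injectivity, not implied by functional consistency.
injectivity = lambda x, y, f: (f(x) != f(y)) or (x == y)

# Both formulas mention four distinct terms: x, y, f(x), f(y).
print(is_valid(consistency, 4))
print(is_valid(injectivity, 4))
```

The second formula fails for any constant function with x and y distinct, which is the kind of falsifying interpretation a propositional encoding would expose as a non-tautology.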
25. Benchmark Circuits
- Single Issue Pipeline: 1xDLX
- Analogous to DLX model in Hennessy & Patterson
- Verified in '94 by Burch & Dill
- Dual Issue Pipeline: 2xDLX-CC
- Superscalar operation with two complete pipelines
- Full-Featured Pipeline: 2xDLX-
- Multi-cycle function units, exception handling, branch prediction
26. Evaluation
- Using BDD Evaluation to Prove Tautology
Circuit    BDD Vars.  BDD Nodes  CPU Secs.
1xDLX      63         2,127      0.2
2xDLX-CC   173        51,826     20
2xDLX-     418        986,740    2,635
- Using SAT Checkers to Prove Tautology
- Chaff (Malik, Princeton)
- Major advances in last few years
Circuit    CNF Vars.  Clauses    CPU Secs.
2xDLX-     4,583      41,704     22
27. An Out-of-order Processor (OOO)
[Figure: out-of-order processor block diagram. Labels: PC, incr, program memory, decode, dispatch, register rename unit, valid, tag, val, 1st operand, 2nd operand, ALU, execute, result, result bus, retire, reorder buffer with head and tail. Reorder buffer fields: valid, value, src1valid, src1val, src1tag, src2valid, src2val, src2tag, dest, op.]
- Data Dependencies Resolved by Register Renaming
- Mapping from register ID to instruction in reorder buffer that will generate register value
- Inorder Retirement Managed by Retirement Buffer
- FIFO buffer keeping pending instructions in program order
28. Access Modes for Reorder Buffer
- FIFO
- Insert when dispatch
- Remove when retire
- Content Addressable
- Broadcast result to all entries with matching
source tag
- Global
- Flush all queue entries when instruction at head
causes exception
29. Required Logic
- Increased Expressive Power
- Model queue pointers
- Increment & decrement operations
- Relative ordering
- Ability to construct complex memory structures
- Not just set of fixed memory types
- Don't Go Too Far
- Want practical decision procedures
- Efficient reduction to propositional logic
30. EUF → CLU
- Terms (T)
- ITE(F, T1, T2): If-then-else
- Fun(T1, …, Tk): Function application
- Formulas (F)
- ¬F, F1 ∧ F2, F1 ∨ F2: Boolean connectives
- T1 = T2: Equation
- P(T1, …, Tk): Predicate application
31. EUF → CLU (Cont.)
- Functions (Fun)
- f: Uninterpreted function symbol
- Read, Write: Memory operations
- Predicates (P)
- p: Uninterpreted predicate symbol
32. Modeling Memories with λs
- Memory M Modeled as Function
- M(a): Value at location a
- Initially
- Arbitrary state
- Modeled by uninterpreted function m0
- Writing Transforms Memory
- M′ = Write(M, wa, wd)
- λ a . ITE(a = wa, wd, M(a))
- Future reads of address wa will get wd
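The lambda view of memory maps directly onto closures. In the sketch below, `write` builds the function λ a . ITE(a = wa, wd, M(a)); the initial memory `m0` is a stand-in that returns a tagged tuple so the example can run, whereas in the logic it would remain an uninterpreted function.

```python
def write(mem, wa, wd):
    """Return M' = lambda a . ITE(a = wa, wd, M(a)) without modifying mem."""
    return lambda a: wd if a == wa else mem(a)

# Stand-in for the uninterpreted initial memory m0 (an assumption of this
# sketch): reading address a just yields the symbolic-looking value ("m0", a).
def m0(a):
    return ("m0", a)

m1 = write(m0, 4, "x")
m2 = write(m1, 8, "y")
m3 = write(m2, 4, "z")   # a later write to the same address shadows the earlier one

print(m3(4), m3(8), m3(0))
print(m1(4))             # older memory states remain readable
```

Each write adds one ITE layer, so a read after n writes unfolds into a chain of n address comparisons, which is what lambda expansion produces in the decision procedure.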
33. Modeling Unbounded FIFO Buffer
- Queue is Subrange of Infinite Sequence
- Q.head = h
- Index of oldest element
- Q.tail = t
- Index of insertion location
- Q.val = q
- Function mapping indices to values
- q(i) valid only when h ≤ i < t
- Initial State: Arbitrary Queue
- Q.head = h0, Q.tail = t0
- Impose constraint that h0 ≤ t0
- Q.val = q0
- Uninterpreted function
[Figure: infinite sequence with increasing indices: …, q(h-2), q(h-1) (already popped); q(h) at head, q(h+1), …, q(t-2), q(t-1); q(t) at tail, q(t+1), … (not yet inserted)]
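A small executable version of this queue model may help. The sketch below keeps the three components head, tail, and val, where val is a function from indices to values. The push and pop rules are assumptions of this sketch (the slide text gives only the state representation): push writes at index t and increments the tail, pop reads index h and increments the head.

```python
class UnboundedFifo:
    """Queue as the subrange [head, tail) of an infinite sequence (sketch)."""

    def __init__(self, head, tail, val):
        assert head <= tail, "initial constraint h0 <= t0"
        self.head, self.tail, self.val = head, tail, val

    def push(self, x):
        # New value function: lambda i . ITE(i = t, x, q(i)); tail moves up by one.
        t, q = self.tail, self.val
        return UnboundedFifo(self.head, t + 1, lambda i: x if i == t else q(i))

    def pop(self):
        # Only defined for a non-empty queue; head moves up by one.
        assert self.head < self.tail, "queue is empty"
        return self.val(self.head), UnboundedFifo(self.head + 1, self.tail, self.val)

    def contents(self):
        # q(i) is valid only when head <= i < tail.
        return [self.val(i) for i in range(self.head, self.tail)]

# Arbitrary initial queue: concrete stand-ins for h0, t0 and the
# uninterpreted function q0 so that the example can run.
initial = UnboundedFifo(5, 7, lambda i: f"q0({i})")
after_push = initial.push("a")
front, after_pop = after_push.pop()

print(after_push.contents())
print(front, after_pop.contents())
```

Because operations return new queue objects, earlier states stay available, mirroring how a symbolic simulator keeps every intermediate state as an expression.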
34. Modeling FIFO Buffer (cont.)
35. Decision Procedure
- Operation
- Series of transformations leading to propositional formula
- Propositional formula checked with BDD or SAT tools
- Bryant, Lahiri, Seshia, CAV '02
[Figure: flow CLU Formula → Lambda Expansion → λ-free Formula → Function & Predicate Elimination → Function-free Formula → Convert to Boolean Formula → Boolean Formula → Boolean Satisfiability]
36. Finite Model Property for CLU
x ≤ y ⇒ succ(x) > pred(y)
- Observation
- Need to encode all possible relative orderings of expressions
- Each symbolic value has maximum range of increments & decrements
- Can use Boolean encodings of small integer ranges
37. Verification Techniques in UCLID
- Bounded Property Checking
- Start in reset state
- Symbolically simulate for fixed number of steps
- Verify a safety property for all states reachable within the fixed number of steps from the start state
- Correspondence Checking
- Run 2 different simulations starting in most general state
- Prove that final states equivalent
- E.g., Burch-Dill Technique
- Invariant Checking
- Start in general state s
- Prove Inv(s) ⇒ Inv(next_s)
- Limited support for automatic quantifier instantiation
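The difference between bounded property checking and invariant checking shows up even on a trivial system. The sketch below uses a made-up 3-bit counter that adds 2 on each step, and concrete enumeration instead of symbolic simulation. The property "x is never 3" holds on every reachable state but is not inductive from a general state; the stronger invariant "x is even" is inductive and implies it.

```python
NUM_STATES = 8      # hypothetical 3-bit counter
RESET = 0

def next_state(x):
    return (x + 2) % NUM_STATES

def bounded_check(prop, steps):
    """Start in the reset state and check prop on every state within `steps` steps."""
    x = RESET
    for _ in range(steps + 1):
        if not prop(x):
            return False
        x = next_state(x)
    return True

def is_inductive(inv):
    """Base case plus the step Inv(s) => Inv(next_s) over every (general) state s."""
    base = inv(RESET)
    step = all(inv(next_state(s)) for s in range(NUM_STATES) if inv(s))
    return base and step

def not_three(x):
    return x != 3

def is_even(x):
    return x % 2 == 0

print(bounded_check(not_three, 10))   # holds on the explored prefix
print(is_inductive(not_three))        # fails: unreachable state 1 steps to 3
print(is_inductive(is_even))          # strengthened invariant is inductive
```

This is the usual reason invariant checking needs auxiliary invariants: the general state s includes unreachable states, so the property has to be strengthened until it rules them out.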
38. Verification of OOO: Automation vs. Guarantee
Method                        Resources  Verification (# of steps)  Auxiliary variables  Invariants
Bounded Property Checking     Unbounded  Bounded                    None                 None
Burch-Dill Technique          Fixed      Unbounded                  None                 Very few
Inductive Invariant Checking  Unbounded  Unbounded                  Significant          Significant, including those for auxiliary variables
- Presence of decision procedure
- Efficiency: Allows improved bounded property checking and Burch-Dill method
- Automation: Reduces manual guidance in proving invariants
- Automatic instantiation of quantifiers
39. Technique 1: Bounded Property Checking
- Debugging OOO using Bounded Property Checking
- All the errors were discovered during this phase
- Counterexample trace of great help
- Debugging Motorola ELF
- Superscalar out-of-order processor
- Reorder Buffer, memory unit, load-store queues, etc.
- Applied during early design exploration phase
40. Bounded Property Checking Results
Model     Steps  Terms  Term formula size  Prop. formula size  UCLID time (s)  SVC time (s)
OOO unit  10     59     2566               15290               10.8            233.18
          14     87     7480               62504               76.55           > 5 hrs
          20     129    19921              263413              1679.12         > 1 day
Elf       6      33     218                942                 1.2             10.9
          8      70     1085               4481                8.4             1851.6
          10     104    2467               16453               30.6            > 1 day
          12     149    4553               54288               111.0           > 1 day
- SVC (Stanford): Another decision procedure to solve CLU formulas
- Can decide more expressive class
- CVC (Successor of SVC) runs out of memory on larger cases
41. Burch-Dill Technique for OOO
- Exponential blowup with the number of ROB entries
- Limited to r = 8 entries currently
- r = 8 finished after case-splitting in 2.5 hrs
# of ROB entries  # of terms  Term formula size  Prop. formula size  UCLID time (s)
2                 63          398                5325                6.83
3                 83          618                10248               30.23
4                 103         886                18175               157.41
6                 143         1534               41208               3051.79
8                 183         2342               82915               >31hrs
42. Technique 3: Invariant Checking
- Deriving the inductive invariants
- Require additional (auxiliary) variables to express invariants
- Auxiliary variables do not affect system operation
- Proving that the invariants are inductive
- Automate proof of invariants in UCLID
- Eliminates need for large (often fragile) proof script
43. Restricted Invariants and Proofs
- Restricted classes of invariants
- ∀x1 ∀x2 … ∀xk Ψ(x1, …, xk)
- Ψ(x1, …, xk) is a CLU formula without quantifiers
- x1, …, xk are integer variables free in Ψ(x1, …, xk)
- Proving these invariants requires quantifiers
- (∀x1 ∀x2 … ∀xk Ψ(x1, …, xk)) ⇒ (∀y1 ∀y2 … ∀ym Φ(y1, …, ym))
- ∃x1 ∃x2 … ∃xk ∀y1 ∀y2 … ∀ym (¬Ψ(x1, …, xk) ∨ Φ(y1, …, ym))
- Automatic instantiation of x1, …, xk with concrete terms
- Sound but incomplete method
- Reduce the quantified formula to a CLU formula
- Can use the decision procedure for CLU
44. Proving Invariants
- Proved automatically
- Quantifier instantiation was sufficient in these cases
- Relieves the user of writing proof scripts to discharge the proofs
- Time spent: 54 s on a 1.4 GHz machine
- Total effort: 2 person-days
45. Extending the Design
- base
- Executes ALU instructions only
- exc
- Handles arithmetic exceptions
- Must flush reorder buffer
- exc/br
- Handles branches
- Predicts branch & speculatively executes along path
- exc/br/mem-simp
- Adds load & store instructions
- Store commits as instruction retires
- exc/br/mem
- Stores held in buffer
- Can commit later
- Loads must scan buffer for matching addresses
46. Comparative Verification Effort
                      base    exc     exc/br  exc/br/mem-simp  exc/br/mem
Total invariants      13      34      39      67               71
Manually instantiate  0       0       0       4                8
UCLID time            54 s    236 s   403 s   1594 s           2200 s
Person time           2 days  5 days  2 days  15 days          10 days
47. Beyond Processor Verification
- Systems of Identical Processes
- E.g., synchronization protocols
- Arbitrary number of processes, each having same operation
- Software
- Create finite model by predicate abstraction
48. Systems of Identical Processes
- Each Process has k State Variables
- Each state variable represented as array
- Indexed by process Id
[Figure: arrays sv1, sv2, …, svk; the entries at index i make up the state of process i]
49. Modeling System of Identical Processes
- On Each Step
- Select arbitrary process index p
- As if chosen by nondeterministic scheduler
- Update state for selected process
    nextstate := lambda(i) case
        i = p & state(i) = IDLE            : TRYING;
        i = p & state(i) = TRYING & inuse  : TRYING;
        i = p & state(i) = TRYING & !inuse : CRITICAL;
        default                            : state(i);
    esac
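The update rule can be mimicked with a closure that returns the new state function. In the sketch below, the case arms follow the rule above, while two things are assumptions added for the example: the flag `inuse` is taken to be true whenever some process is CRITICAL, and the number of processes is fixed so the reachable states can be enumerated (the term-level model leaves it arbitrary).

```python
IDLE, TRYING, CRITICAL = "IDLE", "TRYING", "CRITICAL"

def next_state(state, p, inuse):
    """state maps a process index to its local state; returns the updated function."""
    def new_state(i):
        if i == p and state(i) == IDLE:
            return TRYING
        if i == p and state(i) == TRYING and inuse:
            return TRYING
        if i == p and state(i) == TRYING and not inuse:
            return CRITICAL
        return state(i)                  # default: all other processes unchanged
    return new_state

def reachable(num_procs):
    """Enumerate reachable global states for a fixed number of processes."""
    initial = (IDLE,) * num_procs
    seen = {initial}
    frontier = [initial]
    while frontier:
        current = frontier.pop()
        inuse = CRITICAL in current      # ASSUMPTION: how inuse is derived
        for p in range(num_procs):       # nondeterministic scheduler choice
            updated = next_state(lambda i: current[i], p, inuse)
            successor = tuple(updated(i) for i in range(num_procs))
            if successor not in seen:
                seen.add(successor)
                frontier.append(successor)
    return seen

states = reachable(3)
print(len(states))
print(all(s.count(CRITICAL) <= 1 for s in states))   # mutual exclusion
```

The rule shown has no exit from CRITICAL, so in this sketch the first process to enter stays there; that is enough to exercise the mutual exclusion check.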
50. Model Checking Software
- Program is Hard to Model as Finite-State Machine
- Large number of large data words means lots of bits
- Although finite, bound is very large
- Recursion requires stack
- Conceptually unbounded
- Creating Finite State Abstraction
- Microsoft SLAM verifier
- Focus on device drivers
- Start with very abstract model of program
- Every conditional can arbitrarily be taken/not-taken
- Check properties
- E.g., always close files
- Refine when find counterexample
- More careful analysis of conditionals
51. Code Verification Example
- Adapted by Tom Ball from PCI device driver code
- Initial verification run based on simple model of control flow
- Properties to Check
- Cannot unlock v unless locked
- Cannot lock v unless unlocked
- Must exit code with v unlocked
    do {
        lock(v); old = new;
        if (test()) { unlock(v); new++; }
    } while (new != old);
    unlock(v);
52. Model as Boolean Program
- All conditionals abstracted as Boolean variables
- Allows arbitrary branching
- Finite-state approximation of program
    do {
        lock();
        if (a) { unlock(); }
    } while (b);
    unlock();
53. Refining Abstraction
- Add more detail to model to prove that errors do not occur
- Use lightweight theorem prover to check
Double locking
!a ⇒ !b
Original program:
    do {
        lock(); old = new;
        if (test()) { unlock(); new++; }
    } while (new != old);
    unlock();
Boolean program:
    do {
        lock();
        if (a) { unlock(); }
    } while (b);
    unlock();
[Callout: old = new]
54. Refining Abstraction (cont.)
- Continue using counterexamples to generate more constraints on allowed state transitions
Double unlocking
a ⇒ b
Original program:
    do {
        lock(); old = new;
        if (test()) { unlock(); new++; }
    } while (new != old);
    unlock();
Boolean program:
    do {
        lock(); old = new;
        if (a) { unlock(); new++; }
    } while (b);
    unlock();
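The refinement loop over the Boolean program can be replayed with a short search. The sketch below enumerates complete executions of the do/while skeleton for up to a few iterations, treating a and b as free choices per iteration, and reports the first violation of the locking rules. The two constraints used as refinements (not a implies not b, and a implies b) are this sketch's reading of the annotations on the two refinement slides; the function names are made up.

```python
from itertools import product

def run(path):
    """Execute the Boolean program along one complete path of (a, b) choices."""
    locked = False
    for a, _b in path:
        if locked:
            return "double lock"      # lock() while already locked
        locked = True                 # lock()
        if a:
            locked = False            # unlock() inside the if
    if not locked:
        return "double unlock"        # final unlock() while already unlocked
    return None

def find_error(allowed, max_iters=3):
    """Search complete executions whose per-iteration (a, b) pairs satisfy `allowed`."""
    pairs = [(a, b) for a in (False, True) for b in (False, True) if allowed(a, b)]
    for n in range(1, max_iters + 1):
        for path in product(pairs, repeat=n):
            # Complete execution: b holds in every iteration except the last.
            if any(not b for _, b in path[:-1]) or path[-1][1]:
                continue
            error = run(path)
            if error is not None:
                return error, path
    return None

unconstrained = lambda a, b: True
only_a_implies_b = lambda a, b: (not a) or b
both_constraints = lambda a, b: a == b     # (!a => !b) and (a => b)

print(find_error(unconstrained))
print(find_error(only_a_implies_b))
print(find_error(both_constraints))
```

With both constraints in place the search finds no violating path within the bound, which matches the intent of the refinement: each spurious counterexample is ruled out by a fact about `old` and `new` in the original code.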
55. Software Verification Status
- Shows Promise
- Reason about real-life code
- Fully automatic
- No user-supplied assertions or induction hypotheses
- Still in Early Stages
- Can only deal with limited class of programs
- Memory referencing & aliasing possibilities difficult to decipher
- Look for particular classes of errors
- Property checking rather than comprehensive verification