ChingTsun Chou March 2003 Slide 0 - PowerPoint PPT Presentation

1
Applying Formal Methods to Protocol
Specifications and System Architecture
Ching-Tsun Chou
Multi-Processor Architecture, Enterprise Platforms Group
Intel Corporation
2
Disclaimer
  • The views expressed in this talk are the
    presenter's alone and not necessarily those of
    Intel Corporation

3
Why formal methods?
  • Architectural specifications contain complex
    distributed protocols whose correctness is
    nontrivial to establish
  • Examples: directory-based cache coherence
    protocols, forward-progress mechanisms,
    variations of sliding window protocols, ...
  • Goal: get protocol specifications correct before
    implementations commence
  • The earlier a bug is found, the easier it is to
    fix it, and the more flexibility there is in
    possible fixes
  • Formal verification (FV) is a body of powerful
    techniques for achieving this goal
  • Formal modeling promotes clear thinking and
    minimizes misunderstanding and misinterpretation
    of specifications
  • In the early stages of protocol design, more bugs
    are found during formal modeling than by model
    checking
  • Protocol design and formal modeling should go
    hand in hand
  • Formal modeling produces unambiguous golden
    models of at least some aspects of complex
    protocols
  • Executable reference models can be generated from
    formal models
  • Experience shows that formal methods work
  • Already a standard industry practice: Intel, Sun,
    IBM, Compaq, SGI, ...
  • As this talk hopes to demonstrate

4
Formal vs simulation-based verification
  • Simulation-based verification
      • Check a small fraction of all possible behaviors
        of a large model
      • Very large and relatively complete model
      • Model need not be simplified or abstracted
      • Only a very small part of the state space is
        explored
      • Need to generate tests and collect coverage
        feedback
      • => Results are only as good as your tests and
        checkers
  • Formal verification
      • Check all possible behaviors of a small model
      • All states are exhaustively explored
      • No tests are needed and coverage is 100%
      • Only very small models can be handled
      • Often need drastic simplification and
        abstraction
      • => Results are only as good as your models and
        properties

Moral: There is no free lunch!
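
The contrast above can be made concrete with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy model (two counters of size N), the function names, and the run length are hypothetical and are not taken from the SP work. Breadth-first search stands in for formal verification and a single seeded random run stands in for simulation.

```python
import random
from collections import deque

N = 6  # hypothetical size of each counter in the toy model


def successors(state):
    """Next states of the toy model: either counter steps, or both reset."""
    a, b = state
    return [((a + 1) % N, b), (a, (b + 1) % N), (0, 0)]


def explore_all(init):
    """Formal-style check: breadth-first search visits every reachable state."""
    seen, queue = {init}, deque([init])
    while queue:
        s = queue.popleft()
        for t in successors(s):
            if t not in seen:
                seen.add(t)
                queue.append(t)
    return seen


def simulate(init, steps, seed):
    """Simulation-style check: one random run visits only states on its path."""
    rng = random.Random(seed)
    s, seen = init, {init}
    for _ in range(steps):
        s = rng.choice(successors(s))
        seen.add(s)
    return seen


reachable = explore_all((0, 0))
sampled = simulate((0, 0), steps=20, seed=0)
coverage = len(sampled) / len(reachable)
print(f"{len(sampled)} of {len(reachable)} states visited ({coverage:.0%})")
```

A 20-step run can touch at most 21 of the 36 reachable states, so a property violated only in an unvisited state would go unnoticed by the run but not by the exhaustive search. The price is that the exhaustive search is feasible only because the model is tiny.
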
5
Overview of Intel's Scalability Port (SP)
architecture
  • Designed for mid-range shared memory
    multiprocessors
  • Employ high-speed point-to-point interconnect
    that provides good scalability for mid-range to
    high-end systems
  • Shared buses are neither cost-effective nor
    scalable beyond a limited number of processors due
    to signaling, thermal, mechanical, and other
    challenges
  • Support flexible system architecture
  • Enable a range from cost-optimal small systems to
    scalable high-end systems
  • Enable system vendors with proprietary system
    interconnects and components to use Intel
    building blocks
  • An instance of SP was implemented in Intel's
    E8870 chipset

6
Overview of SP cache coherence protocol
  • Make no assumption whatsoever about the relative
    timing of events or the ordering of messages
  • Completely asynchronous, event-driven
    specification
  • Directory-based, though the directory
      • is optional (no directory = null directory)
      • may or may not be physically distributed
  • A generalization of the invalidation-based MESI
    protocol, but different caching agents may do
    MESI-state transitions at different times
  • Employ mechanisms to resolve conflicts
    (collisions of requests from different requesters
    to the same cache line) in a distributed manner
  • Table-based specification with 1,000 rows in all
    tables
  • >20 transaction types, each of which has a
    different behavior by itself and can interact
    with every other transaction type

7
SP cache coherence protocol validation flow
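[Flow diagram. Elements: protocol specification; Boolean rules;
non-table-based code; extracted p-tables; generated p-tables;
formal verification model; C reference model. The "?" marks an
equality check, apparently between the extracted and generated
p-tables. Activities: model checking; simulation. Outcomes: find
easy bugs in protocol spec; find hard bugs in protocol spec; find
bugs in implementations.]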
8
Properties verified
  • Data consistency
      • If a cache's state is valid (i.e., S, E, or M),
        then its data is up to date
  • Cache and directory state consistency
      • If any cache is in state E or M, the other caches
        must be in I
      • If a presence bit in directory is 0, the
        corresponding cache must be in I
      • If the directory state is I, all presence bits
        are 0 (and hence all caches are I)
      • If the directory state is S, the caches whose
        presence bits are 1 are in I or S
      • If the directory state is E, there is exactly one
        presence bit being 1
  • Weak liveness properties: AG EF (cs = CS), for
    each control state cs and each possible value
    CS of cs
      • Excellent guard against missing rows in protocol
        tables and other unexpected cases
      • Detect both global and local deadlocks, but not
        livelock or starvation
      • Do not rely on fairness assumptions
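
As a hedged illustration of the two kinds of property above, the following Python sketch checks one state-consistency invariant and the weak liveness property AG EF (cs = CS) on a toy model. The model is an assumption made for this example: two caches, atomic transitions, no directory, no messages. It is far simpler than the SP protocol, and the function names are hypothetical.

```python
from collections import deque
from itertools import product

N_CACHES = 2
VALUES = "MESI"


def successors(state):
    """Atomic transitions of a toy invalidation-based protocol (illustration only)."""
    result = set()
    for i, s in enumerate(state):
        others = [j for j in range(N_CACHES) if j != i]
        if s == "I":  # read miss
            nxt = list(state)
            if all(state[j] == "I" for j in others):
                nxt[i] = "E"  # no other copy exists: take the line exclusively
            else:
                for j in others:
                    if nxt[j] in "EM":
                        nxt[j] = "S"  # downgrade the current owner
                nxt[i] = "S"
            result.add(tuple(nxt))
        if s in "ISE":  # write: invalidate all other copies
            nxt = ["I"] * N_CACHES
            nxt[i] = "M"
            result.add(tuple(nxt))
        if s != "I":  # eviction
            nxt = list(state)
            nxt[i] = "I"
            result.add(tuple(nxt))
    return result


def reachable_from(init):
    """Exhaustive forward exploration of the state space."""
    seen, queue = {init}, deque([init])
    while queue:
        s = queue.popleft()
        for t in successors(s):
            if t not in seen:
                seen.add(t)
                queue.append(t)
    return seen


def exclusive_ok(state):
    """If any cache is in E or M, every other cache must be in I."""
    return all(
        s not in "EM" or all(state[j] == "I" for j in range(N_CACHES) if j != i)
        for i, s in enumerate(state)
    )


def ag_ef(states, goal):
    """AG EF goal: from every reachable state, some path leads to a goal state."""
    can_reach = {s for s in states if goal(s)}
    changed = True
    while changed:  # backward fixpoint over the reachable states
        changed = False
        for s in states - can_reach:
            if successors(s) & can_reach:
                can_reach.add(s)
                changed = True
    return can_reach == states


states = reachable_from(("I",) * N_CACHES)
invariant_holds = all(exclusive_ok(s) for s in states)
liveness = {
    (i, v): ag_ef(states, lambda s, i=i, v=v: s[i] == v)
    for i, v in product(range(N_CACHES), VALUES)
}
```

Deleting one of the three transition cases in `successors` mimics a missing table row: some value of some cache then becomes unreachable from part of the state space, and the corresponding `liveness` entry turns false, which is the kind of gap the weak liveness check is meant to expose.
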

9
Results of SP cache coherence protocol FV
  • The SP cache coherence protocol has >10^33
    reachable states for a configuration containing 1
    cache-line address, 1 home node, 2 caching nodes,
    and all >20 transaction types
  • Each property takes (on the average) 45 hours to
    model-check on a 700 MHz Pentium III Xeon machine
    with 4 GB of physical memory
  • Many interesting bugs were found in successive
    versions of SP cache coherence protocol by both
    formal modeling and model checking
  • In fact, more bugs were found by the former than
    by the latter in the early phase of SP protocol
    design
  • Not surprisingly, most problems were found when
    SP was first designed and during major revisions
    (e.g., when new transaction types were added)
  • But even minor revisions could introduce problems
  • Moral: As far as cache coherence protocols are
    concerned, unaided human reasoning should not be
    trusted

10
Rule-based table checking flow
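[Flow diagram. Specification document in word processor (e.g.,
FrameMaker) -> convert -> specification document in HTML ->
extract & flatten -> pre-processed table -> post-process ->
post-processed table. Rules -> generate -> generated table. The
"?" marks the comparison of the post-processed table with the
generated table.]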
11
Why rule-based table checking works
  • Tables and rules take two fundamentally different
    but complementary views
  • Tables are row-centric: an enumeration of cases
    (row = case)
  • Rules are column-centric: an expression of
    relationships between columns
  • By comparing the two views against each other,
    the chance of a bug escaping is minimized
  • Ideally, tables and rules should be constructed
    by two different persons
  • Expression of complex relationships between
    visible columns is simplified by means of hidden
    columns
  • Cause-and-effect metaphor: hidden columns are
    the ultimate but invisible causes of visible
    columns
  • Hidden columns are hidden by existentially
    quantifying them away
  • Hidden columns are used to increase further the
    difference of the two views
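
The two views can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the SP tool flow: the column names, the tiny snoop-response table, and the hidden column `dirty` are all hypothetical, and brute-force enumeration is used here in place of BDDs.

```python
from itertools import product

STATES, SNOOPS = ("I", "S", "M"), ("Read", "Inval")


def rules(state, snoop, nxt, writeback, dirty):
    """Column-centric view: relationships between columns; `dirty` is hidden."""
    return (
        dirty == (state == "M")  # hidden cause: only M lines are dirty
        and writeback == dirty  # a dirty line is written back on any snoop
        and nxt == ("I" if snoop == "Inval" or state == "I" else "S")
    )


def generate_table():
    """Enumerate satisfying assignments, then quantify `dirty` away by projection."""
    return {
        (state, snoop, nxt, wb)
        for state, snoop, nxt, wb, dirty in product(
            STATES, SNOOPS, STATES, (False, True), (False, True)
        )
        if rules(state, snoop, nxt, wb, dirty)
    }


# Row-centric view: a hand-written table, one row per case
# (state, snoop, next state, writeback).
table = {
    ("I", "Read", "I", False),
    ("I", "Inval", "I", False),
    ("S", "Read", "S", False),
    ("S", "Inval", "I", False),
    ("M", "Read", "S", True),
    ("M", "Inval", "I", True),
}

generated = generate_table()

# A deliberately damaged copy of the table: one forgotten case and one typo.
buggy = set(table)
buggy.remove(("S", "Inval", "I", False))
buggy.remove(("M", "Read", "S", True))
buggy.add(("M", "Read", "S", False))

missing_rows = generated - buggy  # cases the rules require but the table lacks
unexpected_rows = buggy - generated  # rows that violate some rule
```

Here `generated == table` holds for the intact table, while the damaged copy yields two missing rows and one unexpected row. Because the table author thinks in cases and the rule author thinks in column relationships, the same mistake is unlikely to appear in both.
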

12
Results of rule-based table checking
  • Coded boolean rules for SP protocol tables and
    checked them against each other
  • Typically dozens of errors were found before
    tables and rules agreed
  • Most errors were trivial (e.g., typos), but some
    were more serious (e.g., missing cases or
    systematic misunderstanding)
  • Maintained the agreement between tables and rules
    over 2 years and tens of major and minor protocol
    revisions
  • Changing rules to keep up with tables almost
    never required more effort than changing tables
    themselves
  • Rule-based table checking is our first line of
    defense, flushes out virtually all easy bugs,
    and has very low computational overhead
  • It takes < 5 minutes to extract and verify by
    rules all SP protocol tables
  • We are not advocating that code review of
    tables be eliminated
  • Code review is still a must at the beginning
  • We do advocate that insights from code review
    be captured and codified by rules and re-used
    later when tables are changed

13
Novel applications of binary decision diagrams
  1. Rule-based table generation and checking
      • Boils down to enumerating satisfying truth
        assignments of boolean expressions over
        enumerated types
  2. Search for minimal deadlock-free wormhole routing
     scheme
      • A wormhole routing scheme is deadlock-free <=> its
        channel dependency graph is acyclic <=> the
        transitive closure of the graph contains no
        self-loop
      • Hence reducible to BDD fixpoint computation
      • Details in our FMSD paper
  3. Search for fault-tolerant link initialization
     sequences
      • Details in our FMSD paper
  • Observations
      • Formal methods thinking leads to new ways of
        looking at old problems
      • A little BDD goes a long way
          • BDD is an efficient representation of boolean
            rules (1 above)
          • BDD supports exhaustive search of a solution
            space (2 and 3 above)
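
The deadlock-freedom criterion above reduces to a fixpoint computation, which the following Python sketch illustrates. It is an assumption-laden stand-in: plain sets of edges replace the BDDs used in the original work, and the channel names and the four-channel ring are hypothetical.

```python
def transitive_closure(edges):
    """Least fixpoint of R = edges + (R composed with edges), by iteration."""
    closure = set(edges)
    while True:
        step = {(a, d) for (a, b) in closure for (c, d) in edges if b == c}
        if step <= closure:  # nothing new: the fixpoint has been reached
            return closure
        closure |= step


def is_deadlock_free(dependencies):
    """Acyclic channel dependency graph <=> its closure has no self-loop."""
    return all(a != b for (a, b) in transitive_closure(dependencies))


# Channel dependencies of a hypothetical unidirectional ring: each channel
# waits on the next one, and the last waits on the first.
ring = {("c0", "c1"), ("c1", "c2"), ("c2", "c3"), ("c3", "c0")}

# Removing the wrap-around dependency breaks the cycle.
chain = ring - {("c3", "c0")}

print(is_deadlock_free(ring), is_deadlock_free(chain))
```

Searching for a minimal deadlock-free scheme would repeat this test over candidate dependency sets. A symbolic representation matters at that point because the number of candidates grows quickly, which this explicit-set version does not attempt to handle.
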

14
Lessons learned
  • Formal modeling steered us toward more precise
    and concrete protocol specifications than we
    would have written without it
  • Even an abstract formal model requires one to
    spell out what exactly one means by each protocol
    structure and action
  • Formal modeling also turned out to be an
    excellent way to help architects articulate their
    ideas
  • Formal verification gave us much higher
    confidence in the correctness of our protocol
    specifications than we would have without it
  • Certain distributed protocols (e.g.,
    directory-based cache coherence protocols) are
    too complex for unaided human reasoning alone to
    get correct
  • Formal verification makes it less risky to modify
    protocol specifications
  • Architecture definition affords a rich and
    fruitful area for the application of formal
    methods
  • Avoid state explosion with a high level of
    abstraction
  • Get to bugs at the earliest possible stage
  • Encourage architects to choose more
    validation-friendly schemes
  • Applying formal methods to a specification
    enables the exploration of design spaces that are
    beyond the scope of any particular implementation
  • Especially important for Intel, which defines
    architectures that will be implemented by
    multiple vendors over multiple product generations

15
Acknowledgements
  • Mani Azimi, Jay Jayasimha, Akhilesh Kumar, Victor
    W. Lee, Phanindra K. Mannava, Seungjoon Park, and
    Aniruddha Vaidya all contributed to the work
    described above.