Validating the Intel Pentium 4 Microprocessor

1
Validating the Intel Pentium 4 Microprocessor
  • Bob Bentley
  • Intel Corporation
  • June 20, 2001

2
Pentium 4 Development Timeline
  • Structural RTL coding start: 2H 1996
  • First cluster models released: late 1996
  • First full-chip model released: Q1 1997
  • Structural RTL coding complete: Q2 1998
    • All bugs coded for the first time!
  • Structural RTL under full ECO control: Q2 1999
  • Structural RTL frozen: Q3 1999
  • A-0 tapeout: December 1999
  • First packaged parts available: January 2000
  • First samples shipped to customers: Q1 2000
  • Production ship qualification granted: October 2000

3
Pentium 4 Validation Staffing

4
Pre-silicon validation environment
  • SRTL validation is MUCH slower than real silicon
    • Typical full-chip simulation with checkers ran at
      3-5 Hz on a Pentium III machine
    • We used a compute farm containing thousands of
      machines running 24/7 to get 6 billion
      cycles/week
    • ALL the SRTL simulation cycles we recorded
      amounted to less than 2 minutes on a single 1 GHz
      system!
  • But pre-silicon validation has some advantages
    • Fine-grained (cycle-by-cycle) checking
    • Visibility of internal state (e.g. caches)
    • APIs to allow event injection
  • No amount of dynamic validation is enough to
    exhaustively test a complex microprocessor
    • A single dyadic extended-precision FP instruction
      has O(10^50) combinations
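
The figures on this slide can be cross-checked with back-of-envelope arithmetic. The sketch below is illustrative only: the function names are invented for this example, and the operand count assumes that "dyadic extended-precision" means two 80-bit operands and counts raw bit patterns only, ignoring rounding modes, precision control and other machine state.

```python
SECONDS_PER_WEEK = 7 * 24 * 60 * 60  # 604,800 seconds


def machines_needed(cycles_per_week, sim_hz):
    """Machines required to sustain cycles_per_week if each one
    simulates at sim_hz cycles/second, around the clock."""
    return cycles_per_week / (sim_hz * SECONDS_PER_WEEK)


def cycles_in(seconds, clock_hz=1e9):
    """Cycles executed by real silicon at clock_hz in the given time."""
    return seconds * clock_hz


def operand_pairs(bits_per_operand=80):
    """Raw input combinations for a two-operand instruction
    (assumption: 80-bit extended-precision operands)."""
    return 2 ** (2 * bits_per_operand)


if __name__ == "__main__":
    print(machines_needed(6e9, 5), machines_needed(6e9, 3))
    print(cycles_in(120))          # upper bound implied by "less than 2 minutes"
    print(float(operand_pairs()))  # size of the raw operand space
```

At 3-5 Hz per machine, 6 billion cycles/week works out to roughly 2,000 to 3,300 machines, which is consistent with "thousands of machines". The raw operand space of 2^160 is about 1.5 x 10^48, within about two orders of magnitude of the slide's O(10^50) figure before any other state is counted.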

5
Pentium 4 Formal Verification
  • First large-scale effort (60 person years) at
    Intel to apply formal verification techniques to
    CPU design
    • Applying FV to a moving target is a big
      challenge!
    • Mostly model checking, with some recent work
      using theorem proving to connect FP proofs to
      IEEE 754
  • More than 10,000 proofs in key areas
    • FP execution units
    • Instruction decode
    • Out-of-order control mechanisms
  • Found 20 high-quality bugs that would have
    been hard to detect by dynamic testing
  • No silicon bugs found to date in areas proved by
    FV
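
To make "model checking" concrete, here is a toy explicit-state invariant checker. It is not the tooling used on the Pentium 4 project, which the slides do not describe; the function name and the counter example are assumptions made purely for illustration. The idea it shows is the core one: explore every reachable state and either prove the property holds everywhere or return a counterexample trace.

```python
from collections import deque


def check_invariant(initial, successors, invariant):
    """Breadth-first search over all states reachable from `initial`.

    Returns None if `invariant` holds in every reachable state,
    otherwise the shortest list of states leading to a violation.
    States must be hashable and must not be None.
    """
    parent = {initial: None}       # also serves as the visited set
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        if not invariant(state):
            # Walk the parent links back to the initial state.
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for nxt in successors(state):
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None


if __name__ == "__main__":
    # Hypothetical design: a counter that steps by 2, modulo 6.
    step = lambda s: [(s + 2) % 6]
    print(check_invariant(0, step, lambda s: s != 3))  # 3 is unreachable
    print(check_invariant(0, step, lambda s: s != 4))  # 4 is reachable
```

Unlike random simulation, the search covers every reachable state, which is why a proof of this kind can rule out bugs that dynamic testing would be unlikely to hit.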

6
Cluster Test Environments
  • Cluster test environments (CTEs) were developed for
    each of the Pentium 4 hardware clusters, plus
    microcode
  • A BIG win overall, even though it took a lot of
    work to develop and maintain them
    • Almost 60% of all the bugs found by dynamic
      testing were caught at CTE level
    • Moved bug detection upstream: earlier detection
      is less costly and less disruptive
    • Decoupled validation of different parts of the
      chip: bugs in one area didn't block progress
      elsewhere
    • Provided much better controllability than the
      full-chip model, especially for downstream
      microarchitecture pipeline stages
    • CTEs helped to maintain a healthy full-chip model

7
Power Reduction Validation
  • Power consumption was a big concern for Pentium 4
    • Need to stay within the cost-effective thermal
      envelope for desktop systems at 1.5 GHz
    • Extensive clock gating in every part of the
      design
  • Mounted a focused effort to validate that:
    • Committed features were implemented as per plan
    • Functional correctness was maintained in the face
      of clock gating
    • Changes to the design did not impact power
      savings
  • 12 person years of effort, 5 heads at peak
  • Fully functional on A-step silicon; measured
    savings of 20 W achieved for typical workloads

8
Coverage
  • Testing without coverage feedback is like driving
    a car with a blindfold on
    • You may think you know where you are going, but
      where you end up is not where you meant to go
  • We made extensive use of directed random test
    generators, with coverage feedback to tell us
    what we were, and were not, hitting
    • Intuition is a poor guide, especially for a
      complex microarchitecture like the Pentium 4
      microprocessor
  • The purpose of coverage is not necessarily to hit
    100% of the identified conditions
    • Rather, it is a way of directing future testing
      to the places it is most needed
    • It helps to avoid the trap of spinning your
      wheels and testing the same areas over and over
      again
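
The feedback loop described on this slide can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the project's actual generator: the bin names, the `run_test` callback and the inverse-hit-count weighting rule are all hypothetical choices made for this example.

```python
import random


def run_with_coverage_feedback(bins, run_test, rounds, tests_per_round, seed=0):
    """Bias random test targets toward coverage bins with few hits so far.

    bins: list of coverage condition names (hypothetical).
    run_test(target, rng): runs one random test aimed at `target` and
        returns the list of bins (taken from `bins`) that it actually hit.
    """
    rng = random.Random(seed)        # seeded so runs are reproducible
    hits = {b: 0 for b in bins}
    weights = [1.0] * len(bins)      # no bias before any feedback exists
    for _ in range(rounds):
        for _ in range(tests_per_round):
            target = rng.choices(bins, weights=weights)[0]
            for b in run_test(target, rng):
                hits[b] += 1
        # Feedback step: the fewer hits a bin has, the more weight it gets.
        weights = [1.0 / (1 + hits[b]) for b in bins]
    return hits


def unhit(hits):
    """Bins that no test has reached yet: where to direct testing next."""
    return [b for b, n in hits.items() if n == 0]
```

The weights are only nudged, never forced, which matches the slide's point: the goal is to steer effort toward neglected areas rather than to chase 100% of the identified conditions.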

9
SRTL Model Release Process
  • Integrating an SRTL model for a design as complex
    as the Pentium 4 is a real challenge
    • 1.5 million lines of SRTL code
    • Massively parallel development
  • We put together a code release process based on
    our experience from previous projects
    • Build and test a graft cluster model
    • Turn in changes for inclusion in the next cluster
      model release, along with other designers'
      changes
    • Build and test a graft full-chip model
    • Turn in changes for inclusion in the next
      full-chip model release, along with changes for
      other clusters
  • Although this may seem overly bureaucratic, it
    kept the full-chip model healthy even in the face
    of major waves of new functionality
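
The four-step release process above amounts to an ordered series of gates, where a change only moves on once the previous gate has passed. The sketch below shows that shape only; the gate names paraphrase the slide, while the function, the `checks` mapping and the idea of a boolean check per gate are assumptions for illustration.

```python
# Gate names paraphrase the slide; the order is the point.
GATE_ORDER = [
    "build and test graft cluster model",
    "turn in to next cluster model release",
    "build and test graft full-chip model",
    "turn in to next full-chip model release",
]


def run_release_gates(change, checks):
    """Run each gate in order and stop at the first failure.

    checks: maps every gate name to a callable(change) -> bool
        (hypothetical; a real flow would build and simulate models here).
    Returns (gates_passed, failed_gate), with failed_gate None on success.
    """
    passed = []
    for name in GATE_ORDER:
        if not checks[name](change):
            return passed, name
        passed.append(name)
    return passed, None
```

Stopping at the first failed gate is what keeps a broken change from reaching the shared full-chip model.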

10
Feature Pioneering
  • Sometimes, we had to make exceptions to the
    general integration methodology
    • Some features required simultaneous turn-ins from
      multiple sources (almost always including
      microcode)
  • In these cases, we adopted a scheme called
    feature pioneering
    • A feature owner created a prototype model
      containing all the first-cut changes
    • A feature validator put together a set of
      broad-brush tests designed to exercise the basic
      functionality
    • The two of them sat together and rapidly iterated
      through test and fix cycles until the feature was
      healthy enough to be released to a wider audience

11
How do you know you're done?
  • Short answer: you are never done
    • There are always more tests that can be run, more
      coverage that can be obtained, ...
  • More useful answer: you are done when you have
    exhausted the usefulness of the SRTL model as a
    vehicle for finding bugs
  • A good first-order approximation is to track:
    • New bug rate
    • Cycles run
    • Coverage
  • If you are trying really hard to find bugs, and
    not finding anything significant, and you've
    covered the whole target space, then it may be
    time to tape out
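
The three tracked indicators can be combined into a simple readiness check. This is a hedged sketch of the slide's rule of thumb, not a criterion stated in the presentation: the `WeekStats` record, the function name and every threshold value are hypothetical and would need to be set per project.

```python
from dataclasses import dataclass


@dataclass
class WeekStats:
    new_bugs: int     # significant new bugs found this week
    cycles: float     # simulation cycles run this week
    coverage: float   # fraction of identified conditions hit so far (0..1)


def may_be_time_to_tape_out(history, quiet_weeks=4, max_new_bugs=1,
                            min_cycles=1e9, min_coverage=0.9):
    """True if, over the last `quiet_weeks`, we were trying hard
    (enough cycles), finding little (few new bugs), and had covered
    the target space (coverage above a chosen threshold).
    All thresholds are illustrative assumptions."""
    if len(history) < quiet_weeks:
        return False
    recent = history[-quiet_weeks:]
    trying_hard = all(w.cycles >= min_cycles for w in recent)
    finding_little = all(w.new_bugs <= max_new_bugs for w in recent)
    covered = recent[-1].coverage >= min_coverage
    return trying_hard and finding_little and covered
```

A low bug rate only counts when the cycle count shows that testing effort was still high, which is why the check requires all three conditions together.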

12
Pre-silicon Bug Rate

13
Pre-silicon validation cycles
Full-chip

14
Unit-Level Coverage

15
Pre-silicon Bug Causes