SEU Tolerant Device, Circuit and Processor Design

1
SEU Tolerant Device, Circuit and Processor Design
  • William Heidergott
  • General Dynamics C4 Systems
  • Scottsdale, Arizona, USA

2
Outline
  • Energetic Particle Environments
  • Space and Terrestrial Particles
  • Single Particle Interaction With Devices
  • Charge Generation and Collection
  • Single Event Effects (SEE)
  • Fault Tolerant Systems
  • Fault Avoidance, Fault Masking and Detection,
    Containment and Recovery Techniques
  • Validation
  • Summary

3
Energetic Particle Environments
  • Interplanetary Space Particles and Spectra
  • Galactic and Solar Cosmic Rays
  • Highly Ionized
  • Influenced by Magnetic Fields
  • Very Energetic
  • Relativistic (GeV) Energy

4
Energetic Particle Environments
  • Interplanetary Space Particles Interact with
    Earth's Atmosphere
  • Nucleonic Component of Importance to Terrestrial
    Single Event Effects
  • Flux of Atmospheric Neutrons Varies With Altitude
  • Production vs. Removal Processes
  • Maximum Flux at 60,000 ft

5
Energetic Particle Environments
  • Long-Term Variation in Terrestrial Neutron Flux
  • Variation Over Solar Cycle (30% Variation)

6
Energetic Particle Environments
  • Short-Term Variation in Interplanetary Flux
  • Transient Variation Due to Solar Particle Event

7
Energetic Particle Environments
  • Short-Term Variation in the Environment
  • Transient Variation Due to Solar Particle Event

8
Energetic Particle Environments
  • Terrestrial Energetic Particle Environment
  • High Energy Neutrons
  • Thermal Neutrons
  • Alpha Particles

9
Single Particle Interaction
  • High Energy Neutron Interaction

10
Single Particle Interaction
  • Low Energy (Thermal) Neutron Interaction

11
Single Particle Interaction
  • Alpha Particle Interaction

12
Single Particle Interaction
  • Ionization Track

13
Single Event Effects
  • Single Event Effects (SEE)
  • Destructive Single Event Effects
  • Dielectric Rupture (SEDR): Thin Gate Oxides
  • Gate Rupture (SEGR) and Burnout (SEB): Power
    MOSFETs
  • Potentially Destructive
  • Single Event Latchup (SEL): Bulk CMOS ICs
  • Snapback (SES): SOI CMOS ICs
  • Soft Errors
  • Single Event Transient (SET)
  • Single Event Upset (SEU)
  • Single Event Functional Interrupt (SEFI)

14
Fault Tolerant Systems
  • Process of Single Event Upset / Soft Error
    Generation and Effect
  • External Energetic Particle Environment
  • Transport of Energetic Particle Environment
    to Semiconductor Sensitive Volume
  • Charge Generation and Collection
  • Single Event Transient (SET) Generation and
    Propagation
  • Single Event Upset (SEU)
  • Undesired State of System
  • Inability to Provide Service (Failure)

15
Fault Tolerant Systems
  • Fault Avoidance, Fault Masking and Detection,
    Containment and Recovery Techniques
  • External Energetic Particle Environment
  • Transport of Energetic Particle Environment
    to Semiconductor Sensitive Volume
  • Charge Generation and Collection
  • Single Event Transient (SET) Generation and
    Propagation
  • Single Event Upset
  • Undesired State of System
  • Inability to Provide Service (Failure)

16
Fault Tolerant Systems
  • Fault Avoidance
  • Prevent Critical System Operations During Severe
    Environmental Conditions
  • Avionics Applications

17
Fault Tolerant Systems
  • Fault Avoidance
  • Reduce Severity of the Energetic Particle
    Environment (? Shielding)
  • C4 Solder Alpha Emission Keep-Out Areas for SEU
    Susceptible Elements

18
Fault Tolerant Systems
  • Fault Avoidance
  • Attenuate the Transport of Energetic Particle
    Environment to Semiconductor Sensitive Volume
    (Shielding)
  • Polyimide Layers Between Alpha Emitting Packaging
    Materials and Sensitive Device Structures
  • Alpha Particle Range in Materials
  • Si: 23.6 µm; Polyimide: 28.0 µm
  • Pb: 11.5 µm; Au: 6.6 µm
  • Al: 19.5 µm; Resist: 24.0 µm
  • Cu: 7.0 µm; Air: 4.7 cm
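
The range figures above lend themselves to a rough screening calculation. The sketch below assumes that each layer uses up a share of the alpha particle's range in proportion to its thickness, which ignores the change of stopping power along the track, and the slide does not state the alpha energy the ranges refer to. The function names and the example layer stack are hypothetical.

```python
# Alpha particle ranges quoted on the slide, converted to micrometres.
ALPHA_RANGE_UM = {
    "Si": 23.6, "Polyimide": 28.0, "Pb": 11.5, "Au": 6.6,
    "Al": 19.5, "Resist": 24.0, "Cu": 7.0, "Air": 4.7e4,
}

def range_fraction_used(stack):
    """Sum thickness/range over a list of (material, thickness_um) layers.

    Assumption: range is consumed linearly with thickness in each layer.
    This is only a first-order screen, not a transport calculation.
    """
    return sum(thickness / ALPHA_RANGE_UM[material]
               for material, thickness in stack)

def is_stopped(stack):
    """True if the layers together use up the whole range (fraction >= 1)."""
    return range_fraction_used(stack) >= 1.0

# Hypothetical stack: 20 um of polyimide over 5 um of aluminium.
example = [("Polyimide", 20.0), ("Al", 5.0)]
print(round(range_fraction_used(example), 3), is_stopped(example))
```

Under this approximation a stack that yields a fraction just below 1 still lets the particle reach the device with little residual energy, so a real design would add margin.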

19
Fault Tolerant Systems
  • Fault Avoidance
  • Attenuate the Transport of Energetic Particle
    Environment to Semiconductor Sensitive Volume
    (Shielding)
  • Terrestrial and Avionics Systems
  • Thermal Neutron Shielding Work to be Published by
    Full Circle Research, NASA, and Hybrid Plastics
    Inc. at IEEE NSREC in July 2005
  • Metallized Polyhedral Oligomeric Silsesquioxanes
    (POSS) Board Coating Material
  • Naturally Occurring Gadolinium
  • Thermal Neutron Capture Cross Section of
    48,890 Barns
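
To give a feel for what a cross section of this size implies, the sketch below applies the standard exponential attenuation law I/I0 = exp(-N·σ·x). It assumes pure metallic gadolinium (density 7.90 g/cm³, molar mass 157.25 g/mol); these values are not from the slide, and a Gd-loaded POSS coating would have a lower atom density and therefore weaker attenuation. The function name is hypothetical.

```python
import math

AVOGADRO = 6.022e23        # atoms per mole
BARN_CM2 = 1e-24           # one barn expressed in cm^2
SIGMA_GD_BARNS = 48_890    # capture cross section quoted on the slide

def transmitted_fraction(thickness_um, density_g_cm3=7.90, molar_mass=157.25):
    """Fraction of thermal neutrons that pass an absorber layer.

    Uses I/I0 = exp(-N * sigma * x). The defaults describe pure metallic
    gadolinium, which is an assumption made for illustration only.
    """
    n_atoms = density_g_cm3 * AVOGADRO / molar_mass     # atoms per cm^3
    macroscopic = n_atoms * SIGMA_GD_BARNS * BARN_CM2   # per cm
    return math.exp(-macroscopic * thickness_um * 1e-4) # um -> cm

for t in (1, 5, 10, 50):
    print(t, "um:", round(transmitted_fraction(t), 4))
```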

20
Fault Tolerant Systems
  • Fault Avoidance
  • Reduce Charge Generation and Collection Processes
  • Silicon-On-Insulator (SOI)
  • Removes Reverse Biased Source / Drain Node
    Junction From Device Cross Section (Potentially
    Reduced Cross Section)
  • Epi, Retrograde and Double/Triple Well Structures
  • Reduces Carrier Lifetime in Region Below Device
    Structure
  • Non-Ionizing Energy Deposition and Low
    Temperature Buffer Layer (LT GaAs)
  • Reduces Carrier Lifetime in Region Below Device
    Structure

21
Fault Tolerant Systems
  • Fault Avoidance
  • Attenuate Single Event Transient (SET) Pulse
    Generation and Propagation

22
Fault Tolerant Systems
  • Fault Avoidance
  • Attenuate Single Event Transient (SET) Pulse
    Generation and Propagation
  • Increase Memory Cell Node Capacitance (Increase
    Critical Charge)
  • SRAM Metal-Insulator-Metal (MIM) Capacitor
  • DRAM Capacitor on Top of Memory Cell
  • Trench DRAM Cell
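
The link between node capacitance and critical charge can be illustrated with the common first-order estimate Q_crit ≈ C_node × V_dd. This is a simplification that neglects the restoring current of the cell's feedback path, and the capacitance, voltage and collected-charge values below are hypothetical.

```python
def critical_charge_fc(node_capacitance_ff, supply_voltage_v):
    """First-order estimate Q_crit ~ C_node * V_dd (fF * V gives fC)."""
    return node_capacitance_ff * supply_voltage_v

def upsets(collected_charge_fc, node_capacitance_ff, supply_voltage_v):
    """True if the collected charge exceeds the estimated critical charge."""
    return collected_charge_fc > critical_charge_fc(node_capacitance_ff,
                                                    supply_voltage_v)

# Hypothetical cell: 1 fF node at 1.2 V, then the same node with 4 fF added.
print(upsets(3.0, 1.0, 1.2))   # small node: 3 fC exceeds about 1.2 fC
print(upsets(3.0, 5.0, 1.2))   # with added capacitance: 3 fC is below about 6 fC
```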

23
Fault Tolerant Systems
  • Fault Avoidance
  • Block Single Event Transient (SET) Pulse From
    Producing a Single Event Upset

24
Fault Tolerant Systems
  • Fault Avoidance
  • Block Single Event Transient (SET) Pulse From
    Producing a Single Event Upset

25
Fault Tolerant Systems
  • Fault Avoidance
  • Block Single Event Transient (SET) Pulse From
    Producing a Single Event Upset

26
Fault Tolerant Systems
  • Fault Masking Techniques
  • Prevent Single Event Upsets From Producing an
    Undesired State of the System
  • Redundancy
  • Informational Redundancy
  • Error Detection and Correction Coding
  • Significant Use of EDAC in Systems
  • Byte Correction to Mitigate SEFI in SDRAM
  • Arithmetic Codes
  • No Efficient Techniques Identified
  • Spatial Redundancy
  • n-Modular Redundant (nMR) Structures
  • Significant Use in Systems
  • Temporal Redundancy

27
Fault Tolerant Systems
  • Redundancy
  • Informational Redundancy
  • Error Detection and Correction Coding
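
As one concrete instance of error detection and correction coding, the sketch below implements a Hamming(7,4) single-error-correcting code. The slides do not say which codes the author has in mind; this choice is an assumption made because it is the smallest standard example. Practical memory EDAC typically uses wider words and adds double-error detection.

```python
def hamming74_encode(d):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit codeword.

    Bit positions run 1..7 with parity bits at positions 1, 2 and 4.
    """
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4      # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4      # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4      # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]

def hamming74_decode(c):
    """Correct up to one flipped bit; return (data_bits, error_position)."""
    c = list(c)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s4   # 0 = no error, else 1-based position
    if syndrome:
        c[syndrome - 1] ^= 1          # flip the indicated bit back
    return [c[2], c[4], c[5], c[6]], syndrome

word = hamming74_encode([1, 0, 1, 1])
word[4] ^= 1                          # simulated single event upset
print(hamming74_decode(word))
```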

28
Fault Tolerant Systems
  • Redundancy
  • Error Detection and Correction Coding

29
Fault Tolerant Systems
  • Redundancy
  • Error Detection and Correction Coding

30
Fault Tolerant Systems
  • Redundancy
  • Spatial Redundancy
  • n-Modular Redundant (nMR) Structures
  • Triple Modular Redundancy (TMR)
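
The voting step at the heart of TMR can be sketched as a bitwise 2-of-3 majority function. This is a minimal software illustration, not a description of any particular hardware voter, and the names are hypothetical.

```python
def majority_vote(a, b, c):
    """Bitwise 2-of-3 majority of three integer words."""
    return (a & b) | (a & c) | (b & c)

def tmr_read(copies):
    """Vote across three stored copies and write the result back (scrub)."""
    voted = majority_vote(*copies)
    copies[:] = [voted, voted, voted]
    return voted

copies = [0b1010, 0b1010, 0b1010]
copies[2] ^= 0b1000                   # simulated upset in one copy
print(bin(tmr_read(copies)), copies)
```

Because the vote is taken per bit, upsets in different copies are masked as long as no bit position is wrong in two copies at once. Writing the voted value back keeps errors from accumulating.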

31
Fault Tolerant Systems
  • Redundancy
  • Spatial Redundancy
  • n-Modular Redundant (nMR) Structures
  • Triple Modular Redundancy (TMR)

32
Fault Tolerant Systems
  • Redundancy Techniques
  • Spatial Redundancy
  • n-Modular Redundant (nMR) Structures

33
Fault Tolerant Systems
  • Recent Onset of Combinatorial Logic Single Event
    Transient Susceptibility

34
Fault Tolerant Systems
  • Recent Onset of Combinatorial Logic Single Event
    Transient Susceptibility
  • Current Technology Provides Bandwidth for
    Response
  • Capability to Propagate Short Pulses
  • Higher Clock Speed Increases Probability of a
    Clock Edge Coinciding With an SET Within the
    Set-Up/Hold Window
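
The dependence on clock speed can be made concrete with a simple window-overlap model. It assumes that the transient arrives at a time uniformly distributed over the clock period, that there is one capturing edge per period, and that no logical or electrical masking occurs. The pulse width and window values are hypothetical.

```python
def set_latch_probability(pulse_width_ps, window_ps, clock_mhz):
    """Probability that an SET overlaps the set-up/hold window.

    A pulse of width w overlaps a window of width s whenever its start
    falls within an interval of length w + s, out of a period T.
    """
    period_ps = 1e6 / clock_mhz
    return min(1.0, (pulse_width_ps + window_ps) / period_ps)

# Hypothetical 100 ps transient and 50 ps set-up/hold window.
for f_mhz in (100, 500, 1000, 2000):
    print(f_mhz, "MHz:", set_latch_probability(100, 50, f_mhz))
```

In this model the capture probability grows linearly with clock frequency until the period shrinks to the combined pulse and window width.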

35
Fault Tolerant Systems
  • Redundancy
  • Temporal Redundancy
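
One common form of temporal redundancy samples the same signal at three instants and votes on the samples. The sketch below simulates this with a hypothetical signal model; it assumes the sample spacing must exceed the transient width for the scheme to work, which the second call illustrates by violating it.

```python
def signal_with_set(t_ns, true_value=0, set_start_ns=4.0, set_width_ns=0.5):
    """Logic level at time t: the true value, inverted during the SET."""
    in_pulse = set_start_ns <= t_ns < set_start_ns + set_width_ns
    return true_value ^ 1 if in_pulse else true_value

def temporal_sample(signal, t_ns, delay_ns):
    """Sample at t, t+delay and t+2*delay, then take the 2-of-3 majority."""
    samples = [signal(t_ns + k * delay_ns) for k in range(3)]
    return 1 if sum(samples) >= 2 else 0

# Spacing longer than the transient: only one sample is corrupted.
print(temporal_sample(signal_with_set, 4.2, 1.0))
# Spacing shorter than the transient: all three samples are corrupted.
print(temporal_sample(signal_with_set, 4.1, 0.1))
```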

36
Fault Tolerant Systems
  • Redundancy
  • Temporal and Spatial Redundancy

37
Fault Tolerant Systems
  • Detection, Containment and Recovery
  • Prevent an Undesired State of System From
    Resulting in Failure
  • Detection
  • Detection is the Difficult Aspect of This
    Approach
  • Application-Oriented Fault Tolerance
  • Acceptance Testing
  • Algorithm Based Fault Tolerance (ABFT)
  • Containment
  • Hierarchical Error Containment Boundaries
  • Confine Errors to Module or Subsystem
  • Subsystems Validate Inputs and Check Results
  • Recovery
  • Recovery Blocks

38
Fault Tolerant Systems
  • Application-Oriented Fault Tolerance
  • Constraint Predicates
  • Identifies Specific Properties of Problems Which
    Enable or Constrain Application Oriented Fault
    Tolerance
  • Progress - Decompose Process Into Operations
    Blocks, Providing Testability at Intermediate
    Points
  • Surfaces Notion That the Number of Process Steps
    is Known a priori
  • Feasibility - Constraints Which are Apparent From
    the Nature of the Problem
  • Boundary Conditions
  • Results Must Be Within the Solution Space of the
    Problem
  • Consistency - Ability to Infer Validity of
    Intermediate or Final Results
  • Input Variables
  • Previous Intermediate or Final Results

39
Fault Tolerant Systems
  • Application-Oriented Fault Tolerance
  • Software Components - Executable Assertions
  • if not ASSERTION then ERROR
  • Detection Capability is Determined by the
    Perceptiveness of the ASSERTION
  • Containment and Recovery is Determined by the
    Response Embedded in ERROR
  • N-Version Programming
  • Parallel or Sequential Execution of Programs and
    Comparing the Results
  • Design Diversity vs Redundancy
  • Developed to Protect Against Design Defects
  • Redundant Execution Against Transient Faults

40
Fault Tolerant Systems
  • Acceptance Testing
  • Functionality and Data
  • Assesses Reasonableness of Computation Results
  • Allowable Range
  • Consistency With Input Variables
  • Consistency With Previous Results
  • Mostly Ad-Hoc Developed Techniques
  • Control Flow
  • Validates Execution Flow Within Blocks and Paths
    Between Blocks
  • Within Blocks - Set Block Tag to Key Value on
    Entry, Test for Validity on Completion
  • Between Blocks - Set Path Tag to Key Value on
    Branch Decision, Verify Proper Path Tag on
    Destination Block Entry
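
The block-tag and path-tag checks described above can be sketched as a small monitor object. The key values, block names and class names are hypothetical; a real implementation would typically place these checks in instrumented code or in a separate checker.

```python
class ControlFlowError(Exception):
    """Raised when a block or path tag does not hold the expected value."""

BLOCK_KEYS = {"filter": 0x5A, "update": 0xA5}   # hypothetical key values

class FlowMonitor:
    def __init__(self):
        self.block_tag = None
        self.path_tag = None

    def enter(self, block, expected_from=None):
        """On block entry: verify the path tag, then set the block tag."""
        if expected_from is not None and self.path_tag != (expected_from, block):
            raise ControlFlowError(f"illegal entry into {block}")
        self.block_tag = BLOCK_KEYS[block]

    def leave(self, block, next_block=None):
        """On completion: test the block tag, then set the path tag."""
        if self.block_tag != BLOCK_KEYS[block]:
            raise ControlFlowError(f"block tag invalid on leaving {block}")
        self.path_tag = (block, next_block)

monitor = FlowMonitor()
monitor.enter("filter")
monitor.leave("filter", next_block="update")    # branch decision recorded
monitor.enter("update", expected_from="filter") # verified at destination
monitor.leave("update")
print("control flow verified")
```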

41
Fault Tolerant Systems
  • Acceptance Testing
  • Watchdog Coprocessor
  • Extends Notion of Watchdog Timer to Include
    Checking of On-Line Processor Operation and
    Results
  • Classes of Assertions
  • Inverse - Uses Output Results to Infer Required
    Input Variables
  • Transformation - Converts Problem to a Simpler
    One and Compares Approximated Results
  • Range - Pre-Established Limits on Results
  • State - Coprocessor Execution of Self-Checking
    Software
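
Two of these assertion classes, inverse and range, can be illustrated on a square-root computation. The example and its tolerances are assumptions chosen for brevity; in a watchdog coprocessor arrangement the checks would run on the separate checker rather than inline.

```python
import math

def inverse_assertion(x, root, rel_tol=1e-9):
    """Inverse: recompute the input from the output and compare."""
    return math.isclose(root * root, x, rel_tol=rel_tol, abs_tol=1e-12)

def range_assertion(x, root):
    """Range: a square root of x >= 0 lies between 0 and max(1, x)."""
    return 0.0 <= root <= max(1.0, x)

def checked_sqrt(x):
    """Compute sqrt(x) and reject the result if either assertion fails."""
    root = math.sqrt(x)
    if not (range_assertion(x, root) and inverse_assertion(x, root)):
        raise ArithmeticError("assertion failed: result rejected")
    return root

print(checked_sqrt(2.0))
print(inverse_assertion(2.0, 1.5))    # a corrupted result fails the check
```

The range check is cheap but coarse, while the inverse check is tighter at the cost of an extra multiplication.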

42
Fault Tolerant Systems
  • Algorithm Based Fault Tolerance (ABFT)
  • Most ABFT Techniques Address Computational
    Problems Which Exhibit Structure and Regularity
  • Matrix Computation
  • Fourier Transform
  • Least Squares Minimization
  • Sorting
  • QR Factorization
  • Singular Value Decomposition
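
For matrix computation, a well-known ABFT scheme augments the operands with checksums so that the product carries its own consistency check. The sketch below uses integer matrices so that equality tests are exact; floating-point data would need a tolerance. It locates and corrects a single wrong element and is an illustration of the idea, not the specific techniques surveyed in the presentation.

```python
def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def with_column_checksum(a):
    """Append a row holding the sum of each column."""
    return a + [[sum(col) for col in zip(*a)]]

def with_row_checksum(b):
    """Append to each row the sum of that row."""
    return [row + [sum(row)] for row in b]

def locate_error(c_full):
    """Return (row, col) of a single inconsistent element, or None."""
    n, m = len(c_full) - 1, len(c_full[0]) - 1
    bad_rows = [i for i in range(n) if sum(c_full[i][:m]) != c_full[i][m]]
    bad_cols = [j for j in range(m)
                if sum(c_full[i][j] for i in range(n)) != c_full[n][j]]
    if bad_rows and bad_cols:
        return bad_rows[0], bad_cols[0]
    return None

def correct_error(c_full, pos):
    """Rebuild the element at pos from its row checksum."""
    i, j = pos
    m = len(c_full[0]) - 1
    others = sum(c_full[i][:m]) - c_full[i][j]
    c_full[i][j] = c_full[i][m] - others

a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
c_full = matmul(with_column_checksum(a), with_row_checksum(b))
c_full[0][1] ^= 8                     # simulated upset in one result element
pos = locate_error(c_full)
correct_error(c_full, pos)
print(pos, [row[:2] for row in c_full[:2]])
```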

43
Fault Tolerant Systems
  • Recovery
  • System Recovery
  • Check-Pointing
  • Backward Error Recovery
  • Micro-Rollback
  • Rollback
  • Forward Operational Recovery
  • Safe Point to Resume With Loss of Previous
    Results
  • Redundant Modules to Take Over for Failed
    Subsystem Until It Can Be Reinitialized
  • Hot, Warm, or Cold Spare
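
Check-pointing with backward error recovery can be sketched as follows: state is saved after each accepted step, and a failed acceptance check triggers a rollback and re-execution. The class, the step function and the simulated upset are hypothetical, and the final escalation stands in for handing over to a spare.

```python
import copy

class CheckpointedTask:
    """Keeps a saved copy of state so a detected error can be undone."""

    def __init__(self, state):
        self.state = state
        self._saved = copy.deepcopy(state)

    def checkpoint(self):
        self._saved = copy.deepcopy(self.state)

    def rollback(self):
        self.state = copy.deepcopy(self._saved)

def run_steps(task, steps, acceptable, max_retries=3):
    """Apply each step; on a failed check roll back and re-execute it."""
    for step in steps:
        for _ in range(max_retries):
            step(task.state)
            if acceptable(task.state):
                task.checkpoint()
                break
            task.rollback()
        else:
            raise RuntimeError("step failed after retries; escalate to a spare")

fault = {"armed": True}

def add_ten(state):
    """Add 10 to the total; the second step is hit by one simulated upset."""
    state["total"] += 10
    state["count"] += 1
    if fault["armed"] and state["count"] == 2:
        fault["armed"] = False
        state["total"] ^= 0x40        # simulated single-bit upset

task = CheckpointedTask({"total": 0, "count": 0})
run_steps(task, [add_ten] * 3, lambda s: s["total"] == 10 * s["count"])
print(task.state)
```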

44
Fault Tolerant Systems
  • Validation and Verification
  • Analytical Modeling
  • Experimental Techniques
  • Hardware Pin Faults
  • Memory Corruption
  • Ion Irradiation
  • Simulation Modeling
  • Op-Code Level Simulation
  • Gate Level Simulation
  • Register-Transfer Level
  • Fault Emulation
  • Memory
  • Register Transfer Level
  • Bus
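
A memory-corruption experiment of the kind listed above can be sketched as a small fault injection campaign: flip random bits in a buffer and count how often a detection mechanism fires. Per-byte parity is used here only as a stand-in detector; the buffer size, trial count and function names are hypothetical.

```python
import random

def parity(byte):
    """Even-parity bit of one byte."""
    return bin(byte).count("1") & 1

def inject_bit_flips(memory, n_flips, rng):
    """Flip n_flips distinct randomly chosen bits in a bytearray, in place."""
    positions = rng.sample(range(len(memory) * 8), n_flips)
    for pos in positions:
        memory[pos // 8] ^= 1 << (pos % 8)
    return positions

def detected(memory, stored_parity):
    """True if any byte no longer matches its stored parity bit."""
    return any(parity(b) != p for b, p in zip(memory, stored_parity))

def campaign(n_trials, n_flips, seed=0):
    """Fraction of injection trials in which the parity check fires."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        memory = bytearray(rng.randrange(256) for _ in range(64))
        stored = [parity(b) for b in memory]
        inject_bit_flips(memory, n_flips, rng)
        hits += detected(memory, stored)
    return hits / n_trials

print("single-bit coverage:", campaign(200, 1))
print("double-bit coverage:", campaign(200, 2))
```

Single flips are always caught by parity, whereas two flips landing in the same byte cancel out, which is the kind of coverage gap such a campaign is meant to expose.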

45
Summary
  • Space Systems Applications
  • Most Severe Environment and Significant
    Consequence of Failure
  • Heritage of Most Single Event Effects
    Understanding and Mitigation Techniques
  • Fault Tolerance Provisions
  • Fault Avoidance
  • Fault Masking Techniques
  • Detection, Containment and Recovery Strategies
  • Validation and Verification
  • Access to Background and Current Information

46
Information Sources
  • Single Event Effects (December Issue of TNS)
  • Short Course, Data Workshop

47
Information Sources
  • Single Event Effects (IRPS Conference Proc.)

48
Information Sources
  • Single Event Effects

49
Information Sources
  • Single Event Effects

50
Information Sources
  • Fault Tolerant Systems

51
Information Sources
  • Fault Tolerant Systems

52
Information Sources
  • Heightened SEE Awareness