SEU Mitigation Techniques for Xilinx Virtex-II Pro FPGA

Mandy M. Wang, JPL R&TD Mobility Avionics

Provided by: klabsOrg
Learn more at: http://klabs.org

1
SEU Mitigation Techniques for Xilinx Virtex-II Pro FPGA
  • Mandy M. Wang
  • JPL R&TD Mobility Avionics

2
Agenda
  • Project Background
  • SEU Sensitive Areas and Mitigation Approaches
  • Design Details
  • Conclusion

3
Project Objective
  • The Mobility Avionics project aims to develop an
    embedded platform for space flight instruments
    and systems that is scalable, configurable, and
    capable of withstanding low-to-medium radiation
    environments.

4
Multi-Tiered Strategy
[Figure: applications arranged by time criticality
(Not Time Critical / Time Critical) and mission
criticality (Not Mission Critical / Mission Critical).
Strategy labels: Simple Strategy, Robust Strategy,
Always Available Strategy.
Applications shown: Science Data Processor (appears
twice), Orbiter Command Data Handler, Image Processor,
Micro-Mobility Controller, Motor Control, EDL
Controller, Ground Support Equipment.]
Low to Medium Radiation Tolerance is Assumed

5
Strategies
  • Simple Strategy: A quick-and-dirty approach. It
    uses less-than-desirable techniques such as
    device reset and reconfiguration as a means of
    error correction. It may require an external
    computer for configuration checking.
  • Robust Strategy: A refinement of the simple
    strategy. It uses an SEU-immune FPGA as a
    monitoring device for the system board, which is
    based on a Xilinx FPGA device. As a result, no
    external computer is needed.

6
SEU Sensitive Areas
  • Xilinx Virtex-II Pro SEU-sensitive areas include:
      • PPC405 core registers
      • Configuration memory (LUT equations and routing)
      • Data path registers
      • User memory (block or distributed RAMs)

[Chart: normalized data based on predicted upset
rates (XC2VP20)]
7
Mitigation Approaches
8
System Design - Overview

[Block diagram. Components shown:
  • PPC405 1 and PPC405 2, with a block labeled C
  • Serial Port Decoder (injects fault signals)
  • Fault injection points (FI) at several locations
  • EXT MEM (128MB) and DDR SDRAM Cntl
  • OCM BRAM (8K)
  • PLB BRAMs (Firmware) (32K)
  • Status BRAMs (4K)
  • EDC blocks and an EDC Controller
  • PLB ARB, OPB ARB, and PLB2OPB Bridge
  • UARTs
  • Crit. INTC and Non-Crit INTC
  • (External Devices)]
9
Dual-processor Comparator
[Block diagram. PPC 405 Block 1 and PPC 405 Block 2
each contain Cache Units, MMU, CPU, and Timers and
Debug, and each connects through a PLB IPIF to the
PLB Bus. Other blocks: C, PC, Arbiter, DDR SDRAM
Controller. Off Chip Area: External SDRAM. Fault
insertion points (FI) appear at several locations.]
Note: Yellow lines are PLB master read/write
signals for the D-Cache. Green lines are PLB
master read signals for the I-Cache.
Legend:
  • FI: Fault insertion point
  • PC: Parity Check
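
The comparator idea on this slide can be illustrated with a minimal software sketch. It assumes the two PPC405 blocks run the same code in lockstep and that their PLB master signals are sampled once per cycle; the `BusCycle` fields, `first_mismatch`, and `inject_fault` are hypothetical names chosen for illustration and are not taken from the design.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class BusCycle:
    """One sampled PLB master cycle (hypothetical field set)."""
    address: int
    write_data: int
    is_write: bool


def first_mismatch(cpu1: Iterable[BusCycle], cpu2: Iterable[BusCycle]) -> Optional[int]:
    """Return the index of the first cycle where the two cores disagree, else None."""
    for index, (a, b) in enumerate(zip(cpu1, cpu2)):
        if a != b:
            return index
    return None


def inject_fault(trace: List[BusCycle], cycle: int, bit: int) -> List[BusCycle]:
    """Flip one address bit in one cycle, imitating a fault insertion (FI) point."""
    out = list(trace)
    c = out[cycle]
    out[cycle] = BusCycle(c.address ^ (1 << bit), c.write_data, c.is_write)
    return out


# Assumed example traffic: eight cycles with increasing addresses.
golden = [BusCycle(0x1000 + 4 * i, i, i % 2 == 0) for i in range(8)]
faulty = inject_fault(golden, cycle=5, bit=3)

print(first_mismatch(golden, golden))  # None: the cores agree
print(first_mismatch(golden, faulty))  # 5: mismatch at the injected cycle
```

In hardware the comparison would be combinational logic on the bus signals rather than a loop; the sketch only shows what is compared and when a mismatch is flagged.
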
10
Dual-Processor Voting Simulation
11
EDAC OCM BRAMs (Read/Write)
  • Hamming code with 32 data bits and 7 check bits
    (39-bit code word)
  • Read-modify-write to support the byte-enable
    feature
  • Error information is stored in a separate
    memory space
  • A single-bit error triggers a CPU interrupt
  • A double-bit error triggers a CPU reset

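[Block diagram. PPC405 1 and PPC405 2 connect through
Glue Logic to a Parity Encoder (ENCIN, ENOUT), the
BRAMs (8KB), and an Error Detection Correction block
(DECIN, DECOUT). Data paths are 32 bits wide and
parity paths are 7 bits wide. Data Out discards the
parity bits.
Signals: FORCE ERROR, PARITY_OUT, PARITY_IN, ADDR,
EN, W_EN30, CLK, ERROR.]
Reference: Xilinx XAPP645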
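
The single-error-correct, double-error-detect behavior described on this slide can be sketched in software. This is a minimal illustration that assumes a textbook extended Hamming layout (check bits at power-of-two positions plus one overall parity bit); the actual check-bit equations in XAPP645 or in this design may differ, and the names `encode` and `decode` are hypothetical.

```python
# Extended Hamming code: 32 data bits + 6 Hamming check bits + 1 overall parity = 39 bits.
PARITY_POS = [1, 2, 4, 8, 16, 32]                                # check-bit positions
DATA_POS = [p for p in range(1, 39) if p not in PARITY_POS]      # 32 data positions


def encode(data: int) -> int:
    """Return a 39-bit code word: bit 0 is overall parity, bits 1..38 are Hamming."""
    word = 0
    syndrome = 0
    for i, pos in enumerate(DATA_POS):
        if (data >> i) & 1:
            word |= 1 << pos
            syndrome ^= pos          # XOR of the positions of all set data bits
    for p in PARITY_POS:
        if syndrome & p:
            word |= 1 << p           # set check bits so the total syndrome is zero
    if bin(word).count("1") % 2:
        word |= 1                    # overall parity makes the number of ones even
    return word


def decode(word: int):
    """Return (data, status); status is 'ok', 'corrected', or 'double'."""
    syndrome = 0
    for pos in range(1, 39):
        if (word >> pos) & 1:
            syndrome ^= pos
    odd = bin(word).count("1") % 2
    if syndrome == 0 and not odd:
        status = "ok"
    elif odd:
        word ^= 1 << syndrome        # single-bit error: the syndrome is its position
        status = "corrected"
    else:
        status = "double"            # nonzero syndrome with even parity: uncorrectable
    data = 0
    for i, pos in enumerate(DATA_POS):
        if (word >> pos) & 1:
            data |= 1 << i
    return data, status


codeword = encode(0xDEADBEEF)
print(decode(codeword)[1])                          # ok
print(decode(codeword ^ (1 << 17))[1])              # corrected
print(decode(codeword ^ (1 << 17) ^ (1 << 3))[1])   # double
```

Flipping a bit of the stored code word plays the role of the FORCE ERROR input here. Per the slide, a "corrected" result would raise a CPU interrupt and a "double" result would trigger a CPU reset.
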
12
EDAC PLB BRAMs (Read Only)
  • Hamming code with 64 data bits and 8 check bits
    (72-bit code word)
  • Read-modify-write to support the byte-enable
    feature
  • Single-bit error information is stored in a
    separate memory space
  • A single-bit error triggers a CPU interrupt
  • A double-bit error triggers a device
    reconfiguration

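[Block diagram. The Processor Local Bus connects
through a PLB Interface and Glue Logic to the PLB BRAM
Controller, a Parity Encoder (ENCIN, ENOUT), the
BRAMs (32KB 8 KB), and an Error Detection Correction
block (DECIN, DECOUT). Data paths are 64 bits wide and
parity paths are 8 bits wide. Data Out discards the
parity bits.
Signals: FORCE ERROR (2), PARITY_OUT, PARITY_IN, ADDR,
EN, W_EN, CLK, ERROR (2).]
Reference: Xilinx XAPP645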
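
The read-modify-write bullet follows from the fact that the check bits cover the whole 64-bit word, so a partial (byte-enabled) write has to read the old word, merge the new bytes, and re-encode. The sketch below illustrates this under stated assumptions: the check-bit layout is a generic extended Hamming code rather than the one used in the design, byte lane 0 is taken to be the least-significant byte, and the sketch only detects errors on read instead of correcting them. `EccMemory` and `check_bits` are hypothetical names.

```python
MASK64 = (1 << 64) - 1
# Positions 1..71 that are not powers of two: one per data bit (64 in total).
DATA_POS = [p for p in range(1, 72) if p & (p - 1)]


def check_bits(data: int) -> int:
    """8 check bits for a 64-bit word: 7 Hamming bits plus overall parity (assumed layout)."""
    hamming = 0
    for i, pos in enumerate(DATA_POS):
        if (data >> i) & 1:
            hamming ^= pos
    overall = (bin(data).count("1") + bin(hamming).count("1")) & 1
    return (overall << 7) | hamming


class EccMemory:
    """Word-addressed memory that stores check bits next to each 64-bit word."""

    def __init__(self, words: int):
        self.data = [0] * words
        self.check = [check_bits(0)] * words

    def read(self, addr: int) -> int:
        if check_bits(self.data[addr]) != self.check[addr]:
            raise ValueError("ECC mismatch at word %d" % addr)
        return self.data[addr]

    def write(self, addr: int, value: int, byte_enable: int = 0xFF) -> None:
        if byte_enable != 0xFF:
            old = self.read(addr)                    # read
            mask = 0
            for lane in range(8):
                if (byte_enable >> lane) & 1:
                    mask |= 0xFF << (8 * lane)
            value = (old & ~mask) | (value & mask)   # modify
        self.data[addr] = value & MASK64             # write data ...
        self.check[addr] = check_bits(self.data[addr])  # ... and fresh check bits


mem = EccMemory(words=4)
mem.write(0, 0x1122334455667788)
mem.write(0, 0xAA00, byte_enable=0b00000010)         # update byte lane 1 only
print(hex(mem.read(0)))                              # 0x112233445566aa88
```

Without the read step, the controller could not compute valid check bits for the bytes that the write leaves unchanged.
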
13
EDAC DDR SDRAM
  • Hamming code with 64 data bits and 8 check bits
    (72-bit code word)
  • Read-modify-write to support the byte-enable and
    two-word burst features
  • Single-bit error information is stored in a
    separate memory space
  • A single-bit error triggers a CPU interrupt
  • A double-bit error triggers a device
    reconfiguration

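[Block diagram. The Processor Local Bus connects
through PLB interface modules and Glue Logic to the
DDR SDRAM Controller, a Parity Encoder (ENCIN, ENOUT),
the DDR SDRAM (128MB 32MB), and an Error Detection
Correction block (DECIN, DECOUT). Mux and Demux blocks
convert between 64-bit and 32-bit data paths and
between 8-bit and 4-bit parity paths. Data Out
discards the parity bits.
Signals: FORCE ERROR (2), PARITY_OUT, PARITY_IN, ADDR,
CLK, CLKn, ERROR (2).]
Reference: Xilinx XAPP645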
14
Self Configuration Checker
[Figure: design flow and checker block diagram.
Flow: Digital Design, then Implementation, which
produces top.bit and top.ll (top.ll contains the
frame addresses used for the design). A C script
formats the frame address data for BRAMs.
Blocks in the Virtex-II Pro: ICAP Controller, ICAP,
Frame Address Memory (BRAMs), Read Back Commands
(44 bytes), CRC Checker.
Other labels: "4 Bytes", "(BRAMS)".]
Note: This portion can be ported to a
radiation-hardened FPGA in the case of the robust
strategy.
15
Self Configuration Checker: Design Highlights
  • No external I/O access required
  • Frame-by-frame readback required
  • 32-bit CRC algorithm implemented
    (a CRC signature is generated after device
    power-up)
  • No SRL16s or distributed SelectRAMs
    used in the design
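
The checking scheme in these highlights (compute a CRC signature once after power-up, then re-read the frames and compare) can be sketched as follows. This is an illustration only: configuration memory is modeled as a dictionary of frame address to bytes, the frame size is arbitrary, and `zlib.crc32` stands in for whatever 32-bit CRC polynomial the design uses. `ConfigChecker` and `crc_signature` are hypothetical names.

```python
import zlib

FRAME_BYTES = 16  # hypothetical frame size, chosen only for the example


def crc_signature(frames) -> int:
    """Fold every frame, in order, into one running 32-bit CRC."""
    crc = 0
    for frame in frames:
        crc = zlib.crc32(frame, crc)
    return crc & 0xFFFFFFFF


class ConfigChecker:
    """Reads back frames by address and compares against a power-up signature."""

    def __init__(self, read_frame, frame_addresses):
        self.read_frame = read_frame                 # stands in for ICAP readback
        self.frame_addresses = list(frame_addresses)
        self.golden = self._scan()                   # signature taken at power-up

    def _scan(self) -> int:
        return crc_signature(self.read_frame(a) for a in self.frame_addresses)

    def check(self) -> bool:
        """True if the configuration still matches the power-up signature."""
        return self._scan() == self.golden


# Model of configuration memory: frame address -> frame contents.
config_memory = {addr: bytes([addr & 0xFF] * FRAME_BYTES) for addr in range(0, 64, 4)}
checker = ConfigChecker(config_memory.__getitem__, sorted(config_memory))
print(checker.check())          # True: nothing has changed since power-up

# Emulate an SEU by flipping one bit in one frame.
frame = bytearray(config_memory[8])
frame[3] ^= 0x10
config_memory[8] = bytes(frame)
print(checker.check())          # False: the upset changes the signature
```

A signature taken at power-up needs no stored golden bitstream, but it also means that an upset occurring before the first scan would be folded into the reference value.
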

16
LabVIEW Fault Injection Panel
Screenshot of the fault injection emulator that
interfaces with the prototype board. Callouts:
  • Process Bus Fault Injection Buttons
  • ASCII Command Input window
  • Fault Injection Error Counters
  • Processors Mismatch LED Indicator
  • Fault location map
  • Program counter resets to zero when a CPU reset
    occurs.
17
XC2VP20 Device Utilization (without TMR)
Resource         Used   Available   Utilization
External IOBs      57         564           10%
PPC405s             2           2          100%
RAMB16s            30          88           34%
SLICEs           4334        9280           46%
BUFGMUXs            6          16           37%
DCMs                2           8           25%
ICAPs               1           1          100%
JTAGPPCs            1           1          100%
18
Slice Utilization (without TMR)
Note: The shaded modules can be replaced by other
approaches.
19
Mitigation State Machine
Mitigation severity, from lowest to highest:
  • Normal
  • CPU Interrupt
      1) OCM BRAM single-bit error
      2) PLB BRAM single-bit error
      3) DDR SDRAM single-bit error
  • CPU Reset
      1) CPU mismatch
      2) CPU watchdog timer
      3) OCM EDC double-bit error
  • System Reset
      1) OPB Bus error
      2) PLB Bus error
  • FPGA Reconfiguration
      1) Configuration check fail
      2) PLB EDC double-bit error
      3) DDR SDRAM double-bit error
Escalation conditions shown in the diagram: CPU
reset counter full, System reset counter full.
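
The severity levels on this slide can be sketched as a small event-to-action table with escalation counters. The sketch assumes that "CPU reset counter full" escalates a CPU reset to a system reset and that "System reset counter full" escalates a system reset to FPGA reconfiguration; the counter limits, the event names, and the `Mitigator` class are hypothetical and are not taken from the design.

```python
from enum import IntEnum


class Action(IntEnum):
    NORMAL = 0
    CPU_INTERRUPT = 1
    CPU_RESET = 2
    SYSTEM_RESET = 3
    FPGA_RECONFIGURATION = 4


# Trigger lists taken from the slide; the key names are illustrative.
EVENT_ACTION = {
    "ocm_bram_single_bit": Action.CPU_INTERRUPT,
    "plb_bram_single_bit": Action.CPU_INTERRUPT,
    "ddr_sdram_single_bit": Action.CPU_INTERRUPT,
    "cpu_mismatch": Action.CPU_RESET,
    "cpu_watchdog": Action.CPU_RESET,
    "ocm_edc_double_bit": Action.CPU_RESET,
    "opb_bus_error": Action.SYSTEM_RESET,
    "plb_bus_error": Action.SYSTEM_RESET,
    "config_check_fail": Action.FPGA_RECONFIGURATION,
    "plb_edc_double_bit": Action.FPGA_RECONFIGURATION,
    "ddr_sdram_double_bit": Action.FPGA_RECONFIGURATION,
}


class Mitigator:
    def __init__(self, cpu_reset_limit: int = 3, system_reset_limit: int = 3):
        # Assumed limits: the slide does not say when a counter is "full".
        self.cpu_reset_limit = cpu_reset_limit
        self.system_reset_limit = system_reset_limit
        self.cpu_resets = 0
        self.system_resets = 0

    def handle(self, event: str) -> Action:
        """Return the mitigation action for one event, escalating on full counters."""
        action = EVENT_ACTION[event]
        if action == Action.CPU_RESET:
            self.cpu_resets += 1
            if self.cpu_resets >= self.cpu_reset_limit:        # CPU reset counter full
                self.cpu_resets = 0
                action = Action.SYSTEM_RESET
        if action == Action.SYSTEM_RESET:
            self.system_resets += 1
            if self.system_resets >= self.system_reset_limit:  # system reset counter full
                action = Action.FPGA_RECONFIGURATION
        if action == Action.FPGA_RECONFIGURATION:
            self.cpu_resets = 0                                # start fresh after reconfiguration
            self.system_resets = 0
        return action


m = Mitigator(cpu_reset_limit=2, system_reset_limit=2)
for event in ["cpu_mismatch", "cpu_watchdog", "opb_bus_error", "ddr_sdram_single_bit"]:
    print(event, "->", m.handle(event).name)
```

With both limits set to 2, the four events above yield CPU_RESET, SYSTEM_RESET, FPGA_RECONFIGURATION, and CPU_INTERRUPT in that order.
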
20
Conclusion
  • Identified and categorized error-prone
    regions on the Virtex-II Pro into four
    types.
  • Developed mitigation strategies for each
    region.
  • Radiation test on the overall system is in
    progress.

21
Acronyms
  • SEU: Single Event Upset
  • FPGA: Field Programmable Gate Array
  • LUT: Look-Up Table
  • PLB: Processor Local Bus
  • OPB: On-Chip Peripheral Bus
  • OCM: On-Chip Memory
  • EDAC: Error Detection and Correction
  • ICAP: Internal Configuration Access Point