Single Event Upset Mitigation Techniques for SRAM-based FPGAs

1
Single Event Upset Mitigation Techniques for
SRAM-based FPGAs
  • Fernanda de Lima (1, 2), Luigi Carro (1), Ricardo Reis (1)

(1) Federal University of Rio Grande do Sul (UFRGS), Instituto de Informatica, PPGC, DELET, Porto Alegre, RS, Brazil
(2) State University of Rio Grande do Sul (UERGS), Digital Systems Engineering Department, Guaíba, RS, Brazil
2
Introduction: FPGAs for Space
Why are SRAM-based FPGAs attractive for space applications?
  • High density
  • High performance
  • High flexibility
  • Low Non-Recurring Engineering (NRE) cost
  • Re-programmability
      • valuable for remote missions
      • reduces the mission cost
      • enables correcting errors or improving system performance after launch

[Figure: new generation of FPGAs, with dedicated routing, soft IP cores, hard IP cores, ultra-fast I/O, Customizable Logic Blocks (CLBs) and embedded memory]
3
Radiation Effects
  • Charged particles emitted by solar activity can provoke a Single Event Upset (SEU)
      • in combinational logic: a transient current pulse
      • in sequential logic: a bit flip

[Figure: circuit with inputs E1, E2, E3 (product terms E1 E2, E1 E3, E2 E3) and clocked storage (clk), showing one upset that produces an error and one that produces no error]
4
SEU in SRAM-based FPGAs
CLB flip-flops: 0.5% of the FPGA sensitive area
  • Bit flip
  • Transient effect
  • Corrected at the next load

[Figure: Virtex (Xilinx) CLB showing the LUT (inputs F1 to F4), flip-flop (ff), BlockRAM and configuration memory cells (M), with an SEU (bit flip) marked]
5
SEU in SRAM-based FPGAs
CLB LUTs: 8% of the FPGA sensitive area
  • Bit flip
  • Permanent effect
  • Corrected by reconfiguration

[Figure: Virtex (Xilinx) CLB showing the LUT (inputs F1 to F4), flip-flop (ff), BlockRAM and configuration memory cells (M), with an SEU (bit flip) marked]
6
SEU in SRAM-based FPGAs
Routing and CLB customization: 91.5% of the FPGA sensitive area
  • Short or open circuit
  • Corrected by reconfiguration

[Figure: Virtex (Xilinx) CLB showing the LUT (inputs F1 to F4), flip-flop (ff), BlockRAM and configuration memory cells (M), with an SEU (bit flip) marked]
7
SEU Mitigation Techniques for SRAM-based FPGAs
  • Triple Modular Redundancy (TMR): full hardware redundancy
  • Continuous reconfiguration (scrubbing)

[Figure: TMR scheme with 3x input pads, triplicated combinational logic, sequential logic built from triplicated flip-flops and majority voters (MAJ; registers tr0, tr1, tr2 clocked by clk0, clk1, clk2; signals check0, check1, check2), and 3x output pads with minority voters. Annotations: "Corrected by voter" and "Corrected by scrubbing".]
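
The majority voting used by TMR can be illustrated with a small software model. This is only a sketch: the function names `majority` and `flip_bit` and the 8-bit example word are hypothetical and are not taken from the slides.

```python
def majority(a: int, b: int, c: int) -> int:
    """Bitwise 2-out-of-3 vote: each output bit follows at least two inputs."""
    return (a & b) | (a & c) | (b & c)


def flip_bit(word: int, bit: int) -> int:
    """Model an SEU as a single bit flip in one redundant copy."""
    return word ^ (1 << bit)


golden = 0b10110010                 # value all three copies should hold
tr0, tr1, tr2 = golden, flip_bit(golden, 5), golden
voted = majority(tr0, tr1, tr2)     # the upset copy is outvoted
```

With a single upset copy the voter still returns the golden value. Two upset copies that flip the same bit would defeat it, which is the case scrubbing is meant to prevent.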
8
Goal
  • The goal is to develop a high-level technique to reduce TMR overheads
      • input and output pin count
      • area
      • power
  • No changes in the process technology
  • No changes at mask level
  • Coping with the permanent effect of an SEU in circuits mapped to SRAM-based FPGAs

9
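SEU Mitigation for ASICs
[Figure: two schemes. Full time redundancy: the output of a single combinational logic block is captured by sample flip-flops at clk, clkd and clk2d and resolved by a majority voter (MAJ). Full hardware redundancy: replicated hardware.]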
10
Tradeoffs
  • Full time redundancy: not able to cope with permanent faults
  • Full hardware redundancy: able to cope with permanent faults
11
Reducing Overheads from TMR
[Figure: TMR register with three majority voters (MAJ) and registers tr0, tr1, tr2, clocked by clk0, clk1, clk2]
12
Reducing Overheads from TMR
[Figure: the TMR register with majority voters (MAJ) fed by only two modules (tr0, tr1); the third input is labelled "tr0 or tr1 ?"]
  • How to detect an SEU?
  • How to recognize the faulty module?
  • How to vote?

13
How to detect an SEU?
  • Duplication with comparison (DWC)

[Figure: duplicated combinational logic (dr0, dr1) feeding a hardware comparator Hc (dr0 vs. dr1) and a TMR register (tr0, tr1, tr2, clocked by clkd) with majority voters (MAJ)]
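
Duplication with comparison can be sketched in software as follows. The fault model (one output bit stuck at 1) and all function names are assumptions made for illustration; the slides do not give an implementation.

```python
def multiplier(a: int, b: int) -> int:
    """Fault-free combinational block (here: an unsigned multiplier)."""
    return a * b


def stuck_at_1(unit, bit: int):
    """Hypothetical fault model: wrap a unit so one output bit is stuck at 1."""
    def faulty(a: int, b: int) -> int:
        return unit(a, b) | (1 << bit)
    return faulty


def dwc(dr0, dr1, a: int, b: int):
    """Run both copies and return (dr0 output, dr1 output, Hc).

    Hc is True when the two outputs match, False on a mismatch.
    """
    out0, out1 = dr0(a, b), dr1(a, b)
    return out0, out1, out0 == out1


faulty_dr0 = stuck_at_1(multiplier, bit=7)
detected = dwc(faulty_dr0, multiplier, 0x45, 0x4C)   # product bit 7 is 0: fault shows
masked = dwc(faulty_dr0, multiplier, 0x10, 0x08)     # product bit 7 is 1: fault hidden
```

The comparison only reports that the copies disagree. It cannot say which copy is wrong, and it reports nothing when the input vector does not activate the fault.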
14
How to recognize the faulty module?
  • Permanent faults to detect
      • Stuck at 1
      • Stuck at 0
  • Technique: recomputing with encoded operands

[Figure: combinational logic block with a permanent fault, input x and output sampled at clk, evaluated at time t0]
15
Recomputing with encoded operands
  • Which encode and decode blocks to use?
      • It depends on the properties of the combinational block
  • For example
      • Arithmetic blocks: recomputing with shifted operands (RESO)

A + B = C
SHL(A) + SHL(B) = SHL(C)
C = SHR(SHL(C))
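
A minimal sketch of recomputing with shifted operands, assuming the combinational block is an adder and the fault is a single output bit stuck at 1. Both choices, and every name in the code, are illustrative assumptions rather than details from the slides.

```python
def shl(x: int) -> int:
    """Shift encode: one position to the left."""
    return x << 1


def shr(x: int) -> int:
    """Shift decode: one position to the right."""
    return x >> 1


def good_adder(a: int, b: int) -> int:
    return a + b


def stuck_adder(a: int, b: int) -> int:
    """Hypothetical permanent fault: output bit 3 stuck at 1."""
    return (a + b) | (1 << 3)


def reso_agrees(unit, a: int, b: int) -> bool:
    """True when the direct result equals the decoded shifted-operand result."""
    direct = unit(a, b)
    recomputed = shr(unit(shl(a), shl(b)))
    return direct == recomputed


ok = reso_agrees(good_adder, 5, 9)      # 14 both times
bad = reso_agrees(stuck_adder, 1, 2)    # direct gives 11, recomputed gives 7
```

The stuck bit corrupts a different bit position of the result in each pass, because the second pass moves the data by one position while the fault stays in place. That disagreement is what exposes a permanent fault that plain recomputation on the same hardware would repeat identically.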
16
Recomputing with shifted operands
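[Figure: duplicated combinational logic (dr0, dr1, clocked by clk) with the hardware comparator Hc (dr0 vs. dr1), a time comparator Tc1, and the TMR register (tr0, tr1, tr2, clocked by clkd) with majority voters (MAJ)]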
17
Recomputing with shifted operands
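[Figure: operands A and B and their shifted versions (shift encode) are selected by multiplexers controlled by ST0 and ST1 and fed to the duplicated combinational logic (dr0, dr1, clocked by clk). Sample latches, shift decode and comparators produce Tc0 (dr0 vs. dr0 shifted), Tc1 (dr1 vs. dr1 shifted) and Hc (dr0 vs. dr1). The outputs feed the TMR register (tr0, tr1, tr2, clocked by clkd) with majority voters (MAJ).]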
18
How to vote?
  • Concurrent Error Detection (CED) block
      • Inputs
          • Hardware Comparison (Hc)
          • Time Comparison for dr0 (Tc0)
          • Time Comparison for dr1 (Tc1)
      • Small state machine
      • Initial state: dr0 is the fault-free module

[Figure: comparators Hc (dr0 vs. dr1), Tc0 (dr0 vs. dr0 shifted) and Tc1 (dr1 vs. dr1 shifted) feed the CED block, which outputs the fault-free module selection (ST0, ST1)]
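
One possible reading of the CED state machine is sketched below. The slides only name its inputs (Hc, Tc0, Tc1) and its initial state, so the transition rule here is an assumption, and the class and method names are hypothetical.

```python
class CED:
    """Hypothetical model of the Concurrent Error Detection state machine.

    State 0 means dr0 is trusted (ST0), state 1 means dr1 is trusted (ST1).
    """

    def __init__(self) -> None:
        self.trusted = 0  # initial state: dr0 is the fault-free module

    def step(self, hc: bool, tc0: bool, tc1: bool) -> int:
        """Update the trusted module from the three comparator outputs.

        hc:  True when dr0 and dr1 agree.
        tc0: True when dr0 agrees with its own shifted recomputation.
        tc1: True when dr1 agrees with its own shifted recomputation.
        Assumed rule: on a hardware mismatch, trust the module whose time
        comparison passes; otherwise keep the current state.
        """
        if not hc:
            if tc0 and not tc1:
                self.trusted = 0
            elif tc1 and not tc0:
                self.trusted = 1
        return self.trusted


ced = CED()
trace = [
    ced.step(hc=True, tc0=True, tc1=True),    # no fault: stay on dr0
    ced.step(hc=False, tc0=False, tc1=True),  # dr0 fails its time check
    ced.step(hc=True, tc0=True, tc1=True),    # state is kept afterwards
]
```

Keeping the state after a switch reflects the idea that a permanent fault stays in place until the configuration is scrubbed, but that behaviour is part of the assumption, not something the slides spell out.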
19
The proposed method
  • Time t0 (normal operation)
      • modules dr0 and dr1 are constantly compared to detect a fault
      • the operands are used directly in the combinational block
      • the result is stored for further comparison
  • If the dr0 and dr1 outputs mismatch (a fault is detected)
      • an extra clock cycle is used to recognize the faulty module
      • permanent fault detection in the module is done by recomputation with shifted operands (RESO)
  • Time t0+d (next clock cycle)
      • the CED block votes for the fault-free module
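
The sequence above can be sketched end to end as a single function. This is a behavioural model under stated assumptions: the combinational block is an adder, the fault is one output bit stuck at 1, and the name `dwc_ced_step` and its return convention are invented for the example.

```python
def shl(x: int) -> int:
    return x << 1


def shr(x: int) -> int:
    return x >> 1


def good(a: int, b: int) -> int:
    return a + b


def faulty(a: int, b: int) -> int:
    """Hypothetical permanent fault: output bit 3 stuck at 1."""
    return (a + b) | (1 << 3)


def dwc_ced_step(dr0, dr1, a: int, b: int, trusted: int):
    """Return (value forwarded to tr2, trusted module, clock cycles used)."""
    r0, r1 = dr0(a, b), dr1(a, b)       # time t0: operands used directly
    if r0 == r1:                        # Hc reports no mismatch
        return (r0 if trusted == 0 else r1), trusted, 1
    # Mismatch: spend one extra cycle recomputing with shifted operands.
    tc0 = shr(dr0(shl(a), shl(b))) == r0
    tc1 = shr(dr1(shl(a), shl(b))) == r1
    if tc0 and not tc1:
        trusted = 0
    elif tc1 and not tc0:
        trusted = 1
    return (r0 if trusted == 0 else r1), trusted, 2


normal = dwc_ced_step(good, good, 1, 2, trusted=0)
recovered = dwc_ced_step(faulty, good, 1, 2, trusted=0)
```

In the fault-free case the result is available after one cycle. When dr0 is faulty, the extra cycle identifies dr1 as the fault-free module and its result is the one forwarded.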

20
The proposed method (DWC-CED)
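[Figure: complete DWC-CED architecture. Operands A and B and their shifted versions (shift encode) are selected by multiplexers controlled by ST0 and ST1 and fed to the duplicated combinational logic (dr0, dr1, clocked by clk). Sample latches, shift decode and comparators produce Hc (dr0 vs. dr1), Tc0 (dr0 vs. dr0 shifted) and Tc1 (dr1 vs. dr1 shifted). The CED block takes Hc, Tc0 and Tc1 and drives ST0 and ST1. The outputs feed the TMR register (tr0, tr1, tr2, clocked by clkd) with majority voters (MAJ).]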
21
The proposed method (DWC-CED)
[Figure: the DWC-CED architecture in normal operation. Same blocks as the complete architecture: operand multiplexers (ST0, ST1), duplicated combinational logic (dr0, dr1), comparators Hc, Tc0 and Tc1, the CED block, and the TMR register (tr0, tr1, tr2, clocked by clkd) with majority voters (MAJ).]
22
The proposed method (DWC-CED)
[Figure: the DWC-CED architecture during faulty module recognition. Same blocks as the complete architecture: operand multiplexers (ST0, ST1), duplicated combinational logic (dr0, dr1), comparators Hc, Tc0 and Tc1, the CED block, and the TMR register (tr0, tr1, tr2, clocked by clkd) with majority voters (MAJ).]
23
Experimental Setup
  • VHDL fault injection
  • Experiment emulated on an AFX board
      • parts XCV300-240 and XCV600E-240
  • Case study: arithmetic-based circuits
      • 8x8-bit multiplier
      • Digital filter
  • Fault injection control block (VHDL) decides
      • When?
      • Where? Module dr0 or dr1, combinational nodes
      • Which input vectors?
      • Type of SEU? Stuck at 0 or stuck at 1

[Figure: FPGA board with inputs A and B feeding dr0 and dr1, the CED block and the fault injection control block]
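
A fault injection campaign of this shape can be mimicked in software. The sketch below is a strong simplification and not the authors' VHDL setup: it uses a 4x4-bit multiplier instead of 8x8, treats each output bit as the only injectable node, and counts a fault as detected when any input vector makes the two copies disagree. All names are hypothetical.

```python
from itertools import product

BITS = 4  # assumed operand width, kept small so the exhaustive loop is quick


def multiplier(a: int, b: int) -> int:
    return a * b


def inject(unit, bit: int, stuck_value: int):
    """Wrap a unit so that one output bit is stuck at 0 or stuck at 1."""
    mask = 1 << bit

    def faulty(a: int, b: int) -> int:
        y = unit(a, b)
        return (y | mask) if stuck_value else (y & ~mask)

    return faulty


def campaign():
    """Return (detected faults, injected faults) over all input vectors."""
    vectors = list(product(range(2 ** BITS), repeat=2))
    faults = [(module, bit, sv)
              for module in ("dr0", "dr1")      # where: which copy
              for bit in range(2 * BITS)        # where: which output bit
              for sv in (0, 1)]                 # type: stuck at 0 or 1
    detected = 0
    for module, bit, sv in faults:
        bad = inject(multiplier, bit, sv)
        dr0 = bad if module == "dr0" else multiplier
        dr1 = bad if module == "dr1" else multiplier
        if any(dr0(a, b) != dr1(a, b) for a, b in vectors):
            detected += 1
    return detected, len(faults)


detected, total = campaign()
coverage = 100.0 * detected / total
```

Every injected fault is caught in this toy model because each product bit takes both values somewhere in the exhaustive input set. Internal nodes of a real netlist are harder to activate, which is why the slides report coverage measured on the mapped FPGA nodes.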
24
Experimental Results I
CASE STUDY: 8x8-bit Multiplier
  • Fault coverage
      • 384 topological FPGA nodes
      • stuck-at-0 faults
      • stuck-at-1 faults
      • 2^16 input vectors
  • The experiment showed that 100% of the faults were detected and voted out when using DWC with a 9-bit multiplier.

25
Waveform
[Figure: simulation waveform with signals clock, scrubbing, input A, input B, out_reg_dr0, out_reg_dr1, st_perm_dr0, st_perm_dr1, ctrl_mux(tr2) and the value voted to tr2. Once the fault effect appears, out_reg_dr0 diverges from out_reg_dr1 (for example 14FC versus 147C), an extra clock cycle is spent, and tr2 receives the dr1 values.]
26
Experimental Results I
CASE STUDY: 8x8-bit Multiplier
Area and pin count comparison. Part: XCV300 (240 pins)
I/O pins were out of range for the TMR approach, so the part XCV300-BG432 was used.
[Table not preserved: columns "Registered output" and "Non registered output"]
27
Experimental Results I
CASE STUDY: 8x8-bit Multiplier
Performance
  • TMR: 33.8 MHz (29.5 ns)
  • Proposed method (DMR-CED): 26.7 MHz (37.4 ns), -21%
28
Experimental Results II
CASE STUDY: DIGITAL FILTER
  • An 8-bit canonical FIR filter with 9 taps was synthesized in an XCV600E FPGA to evaluate area and pin count.
  • The multiplier coefficients are 2, 6, 17, 32 and 38.
  • Registers: protected with TMR
  • Combinational logic: protected with DWC-CED
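
For orientation, a direct-form FIR filter of this size can be modelled as below. The slides list five coefficients for nine taps, so the sketch assumes a symmetric arrangement around the centre tap; that ordering, and the names `COEFFS` and `fir`, are assumptions rather than facts from the presentation.

```python
# Assumed symmetric tap order built from the five listed coefficients.
COEFFS = [2, 6, 17, 32, 38, 32, 17, 6, 2]


def fir(samples, coeffs=COEFFS):
    """Direct-form FIR: each output is the dot product of the coefficients
    with the most recent len(coeffs) input samples."""
    delay = [0] * len(coeffs)       # delay line, newest sample first
    out = []
    for x in samples:
        delay = [x] + delay[:-1]
        out.append(sum(c * d for c, d in zip(coeffs, delay)))
    return out


impulse_response = fir([1] + [0] * 8)
step_final = fir([1] * 9)[-1]
```

Feeding a unit impulse returns the coefficient list, which is a quick way to check the tap order. In the scheme described on the slides, the delay-line registers would be the TMR-protected part and the multiply-accumulate logic the DWC-CED-protected part.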

29
Experimental Results II
CASE STUDY DIGITAL FILTER
30
Experimental Results II
CASE STUDY: DIGITAL FILTER
  • Fault coverage
      • 4208 topological FPGA nodes
      • stuck-at-0 faults
      • stuck-at-1 faults
      • 2^8 input vectors
  • The experiment showed that 100% of the faults were detected and voted out.

31
Experimental Results II
CASE STUDY: DIGITAL FILTER
Area and pin count comparison. Part: XCV600E (240 pins)
Performance
  • TMR: 40 MHz (25.0 ns)
  • Proposed method (DMR-CED): 22 MHz (45.4 ns), -45%
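
The percentages quoted for performance follow from the clock frequencies reported for the two case studies. A short check, with hypothetical helper names:

```python
def period_ns(freq_mhz: float) -> float:
    """Clock period in nanoseconds for a frequency in MHz."""
    return 1000.0 / freq_mhz


def relative_change(base: float, new: float) -> float:
    """Change of `new` relative to `base`, in percent."""
    return (new - base) / base * 100.0


# (TMR frequency, proposed-method frequency) in MHz, as given on the slides
multiplier_case = relative_change(33.8, 26.7)
filter_case = relative_change(40.0, 22.0)
filter_tmr_period = period_ns(40.0)
```

Rounded to whole percent, the two changes come out at -21 for the multiplier and -45 for the filter, matching the figures on the slides.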
32
Discussion
  • Constraints for both techniques (TMR and the proposed method)
      • only one upset per design
          • in dr0, dr1, encoding, decoding or CED
      • the scrubbing rate should be fast enough
          • to avoid accumulation of upsets in two different redundant blocks
      • dedicated floorplanning
          • can improve robustness in the presence of more than one upset

33
Conclusions
  • New technique for permanent fault detection and voting in combinational circuits mapped to SRAM-based FPGAs, based on
      • duplication with comparison (hardware redundancy)
      • recomputation with coding and decoding (time redundancy)
  • The proposed approach
      • reduces the number of input and output pins
      • reduces area when large combinational blocks are used
  • Fault coverage: 100%
      • 8x8-bit multiplier
      • Digital filter

34
Future Work
  • Optimizing the number of extra flip-flops.
  • Improving performance
  • Studying the applicability of this method for
    other cores such as microprocessors.
  • Performing fault injection tests directly on the
    component bitstream by partial reconfiguration.
  • Analysing the impact of dedicated floorplanning
    on the robustness of the method.

35
  • Thank you!

Contact: fglima_at_inf.ufrgs.br, carro_at_eletro.ufrgs.br, reis_at_inf.ufrgs.br