Title: Tinoosh Mohsenin and Bevan M. Baas
1Split-Row: A Reduced Complexity, High Throughput Low Density Parity Check (LDPC) Decoder Architecture
- Tinoosh Mohsenin and Bevan M. Baas
- VLSI Computation Lab, ECE Department
- University of California, Davis
2Outline
- Introduction to LDPC Codes
- Split-Row Decoder Algorithm
- Error Performance Comparison
- Decoder Implementation Results
- Conclusion
3Error Correction in Communication Systems
Error correction is used in most communication systems.
4 LDPC Codes Applications
- Standards
- 10 Gigabit Ethernet (10GBASE-T), 2006
- Digital Video Broadcasting (DVB-S2), 2005
- Next generation of WiFi and WiMAX
- Problems with current LDPC decoders
- Lack of enough memory bandwidth
- High interconnect complexity
www.ieee802.org/3/an/
5LDPC Coding
Transmitter
Noisy Channel
Encoded Image
Receiver
Decoded Image
Received Image
Iteration 1
Iteration 14
Modified images from MacKay 2001
6LDPC Decoding Message Passing Algorithm
- Performs row and column operations iteratively.
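The iterative row and column operations can be sketched as a small MinSum decoder. This is a minimal illustration, not the authors' implementation: the function name `minsum_decode`, the list-of-lists matrix format, and the convention that a positive channel value means bit 0 are all assumptions made for the example.

```python
def minsum_decode(H, llr, max_iters=15):
    """Sketch of MinSum message passing.

    H   : parity check matrix as a list of 0/1 rows (every row weight >= 2)
    llr : channel values, one per bit; positive is assumed to mean bit 0
    """
    m, n = len(H), len(H[0])
    rows = [[j for j in range(n) if H[i][j]] for i in range(m)]
    # beta[i][j]: column-to-row message, initialised with the channel values
    beta = [{j: llr[j] for j in rows[i]} for i in range(m)]
    hard = [0 if v >= 0 else 1 for v in llr]
    for _ in range(max_iters):
        # Row processing: sign product and minimum magnitude of the other inputs
        alpha = []
        for i in range(m):
            out = {}
            for j in rows[i]:
                others = [beta[i][k] for k in rows[i] if k != j]
                sgn = 1.0
                for v in others:
                    sgn *= -1.0 if v < 0 else 1.0
                out[j] = sgn * min(abs(v) for v in others)
            alpha.append(out)
        # Column processing: channel value plus all incoming row messages
        total = list(llr)
        for i in range(m):
            for j in rows[i]:
                total[j] += alpha[i][j]
        # Each outgoing message excludes the row it is sent to
        for i in range(m):
            for j in rows[i]:
                beta[i][j] = total[j] - alpha[i][j]
        hard = [0 if v >= 0 else 1 for v in total]
        # Stop early once every parity check is satisfied
        if all(sum(hard[j] for j in rows[i]) % 2 == 0 for i in range(m)):
            break
    return hard
```

With a hypothetical 3x6 matrix and one weakly wrong channel value, the sketch recovers the all-zero code word in one iteration.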
7Serial Decoders
- One or a few row and column processing units.
- Features
- Simple
- Small area
- Small number of memories
- Disadvantages
- Low memory bandwidth
- Low throughput: 100 Kbps to 10 Mbps
8Full Parallel Decoders
- Row and column processors are directly mapped according to the parity check matrix
- High throughput
- Disadvantages
- Large circuit area
- High interconnect complexity
- Example: 2048-bit, 10GBASE-T
- Row weight = 32, column weight = 6, 5 quantization bits
- 139 mm² in 0.18 µm CMOS
- 122,000 long inter-processor wires
- 1.3 Gbps
9Outline
- Introduction to LDPC Codes
- Split-Row Decoder Algorithm
- Error Rate Comparison
- Decoder Implementation Results
- Conclusion
10Key Features of Split-Row Decoder
- Row processing (dominates decoder complexity)
- Increased parallelism
- Reduced number of memory accesses
- Reduced processor complexity
- Results
- Smaller decoder area and higher utilization
- Lower interconnect complexity
- Higher throughput
- Simpler hardware implementation
11Standard vs. Split-Row Decoder
Standard Decoder
Split-Row Decoder
12Split-Row Algorithm-Mathematical View
- The magnitude of the row processor output α is larger for the Split-Row decoder
- Normalizing the α values with a scale factor S < 1 improves the error performance of the Split-Row decoder
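A small sketch can make the magnitude effect concrete. It assumes, based on these slides, that each half of a split row takes the minimum over its own inputs only and receives a single sign bit from the other half. The function names and the input values are hypothetical, and every half is assumed to have at least two inputs.

```python
def sign_product(values):
    """Product of the signs of the given values (+1.0 or -1.0)."""
    sgn = 1.0
    for v in values:
        sgn *= -1.0 if v < 0 else 1.0
    return sgn


def row_minsum(beta):
    """Standard MinSum row update over the full row."""
    out = []
    for j in range(len(beta)):
        others = beta[:j] + beta[j + 1:]
        out.append(sign_product(others) * min(abs(v) for v in others))
    return out


def row_split(beta, scale):
    """Split-Row style update: minimum per half, sign bit from the other half."""
    half = len(beta) // 2
    parts = [beta[:half], beta[half:]]
    out = []
    for p, part in enumerate(parts):
        # Single sign bit received from the other half
        other_sign = sign_product(parts[1 - p])
        for j in range(len(part)):
            others = part[:j] + part[j + 1:]
            sgn = other_sign * sign_product(others)
            # The minimum is taken over this half only, then scaled by S
            out.append(scale * sgn * min(abs(v) for v in others))
    return out


beta = [0.9, -1.4, 2.0, 0.2, -1.1, 3.0]   # hypothetical row inputs
standard = row_minsum(beta)
split_unscaled = row_split(beta, 1.0)
split_scaled = row_split(beta, 0.5)       # S = 0.5 is an arbitrary choice
```

Because each half sees fewer inputs, its minimum can only be equal to or larger than the full-row minimum, while the signs stay identical. The scale factor pulls the magnitudes back down.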
13Outline
- Introduction to LDPC Codes
- Split-Row Decoder Algorithm
- Error Performance Comparison
- Decoder Implementation Results
- Conclusion
14Bit Error Rate Performance Comparison
- Code length: 1536 bits
- Message length: 1155 bits
- Row weight: 16
- Column weight: 4
- No. of iterations: 15
- MS: MinSum
- MS Split-Row: MinSum Split-Row
- S: Scale factor
0.6 dB
15Bit Error Rate Performance Comparison
- Code length: 2048 bits
- Message length: 1723 bits
- Row weight: 32
- Column weight: 6
- No. of iterations: 15
- MS: MinSum
- MS Split-Row: MinSum Split-Row
- S: Scale factor
0.3 dB
16Outline
- Introduction to LDPC Codes
- Split-Row Decoder Algorithm
- Error Rate Comparison
- Decoder Implementation Results
- Conclusion
17A Full-Parallel Decoder Implementation
- LDPC code example
- Code length: 1536 bits
- Message length: 770 bits
- Row weight: 6
- Column weight: 3
- In the Split-Row decoder
- The number of wires between the two halves is 3% of the total wires
- Row processors in each half are 2.7 times smaller
- Each row processor in each half is connected to only 3 column processors
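The structural numbers for this code follow from its parameters. The sketch below is only a bookkeeping aid: the function name and dictionary keys are hypothetical, and it assumes a regular code split into two equal halves.

```python
def split_row_structure(code_length, row_weight, col_weight, halves=2):
    """Processor counts for a regular LDPC code split into equal halves."""
    ones_in_h = code_length * col_weight      # total number of 1s in H
    rows = ones_in_h // row_weight            # number of parity checks
    return {
        # every row has one (smaller) processor in each half
        "row_processors_per_half": rows,
        # columns are divided between the halves
        "column_processors_per_half": code_length // halves,
        # each half-row processor sees only its share of the row inputs
        "inputs_per_half_row_processor": row_weight // halves,
    }


structure = split_row_structure(code_length=1536, row_weight=6, col_weight=3)
```

For the 1536-bit code with row weight 6 and column weight 3, this gives 768 row processors and 768 column processors per half, with 3 inputs per half-row processor.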
18Full Parallel Decoder Architecture
0.18 µm CMOS technology, 6 metal layers
- Split-Row, each half includes
- 768 row processors
- 768 column processors
Standard MinSum
19Split-Row vs. Standard Decoder
[Comparison table: values not transcribed; units were mm², MHz, Gbps, and mm]
- 1536-bit (3,6) Quasi-cyclic LDPC code
- No. of quantization bits is set to 5 bits per message.
- For throughput computation, the no. of decoding iterations is set to 15.
- Reported numbers are based on chip implementation results in 0.18 µm
20Conclusion
- The Split-Row decoder method provides a significant reduction in circuit area
- Results in
- Reduced wire interconnect complexity
- Increased circuit area utilization
- Increased speed
- Simpler implementation
- A good tradeoff between hardware complexity and
error performance
21Acknowledgments
- Intel Corporation
- UC Micro
- NSF Grant No. 0430090
- UCD Faculty Research Grant
22Message Passing (Row processing )
23Message Passing (Column processing )
λj is the received information.
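The equations from these two slides did not survive in the transcript. For reference, the standard MinSum updates are commonly written as below. The notation is an assumption: α for row-to-column messages, β for column-to-row messages, λ_j for the received information, V(i) for the columns checked by row i, and C(j) for the rows that check column j. The slides may have used a different form.

```latex
% Row processing (check node update), standard MinSum
\alpha_{ij} = \prod_{j' \in V(i) \setminus j} \operatorname{sign}(\beta_{ij'})
              \times \min_{j' \in V(i) \setminus j} \lvert \beta_{ij'} \rvert

% Column processing (variable node update)
\beta_{ij} = \lambda_j + \sum_{i' \in C(j) \setminus i} \alpha_{i'j}
```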
24?1
26LDPC Codes
- An LDPC code is defined by a binary matrix called the parity check matrix H.
- Rows define parity check equations (constraints) between encoded symbols in a code word, and columns define the length of the code.
- V is a valid code word if H·V^T = 0
- The decoder in the receiver checks whether the condition H·V^T = 0 holds.
- Example: parity check matrix for a (9, 5) LDPC code, row weight 4, column weight 2
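The code word condition can be checked in a few lines. The matrix below is a hypothetical 4x6 example with row weight 3 and column weight 2, chosen for brevity. It is not the (9, 5) matrix from the slide, which was not transcribed.

```python
def is_codeword(H, v):
    """Return True if H * v^T = 0 over GF(2), i.e. every parity check is even."""
    return all(sum(h * x for h, x in zip(row, v)) % 2 == 0 for row in H)


# Hypothetical regular parity check matrix (row weight 3, column weight 2)
H = [
    [1, 1, 1, 0, 0, 0],
    [1, 0, 0, 1, 1, 0],
    [0, 1, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 1],
]

valid = is_codeword(H, [1, 1, 0, 1, 0, 0])      # satisfies all four checks
corrupted = is_codeword(H, [0, 1, 0, 1, 0, 0])  # first bit flipped
```

Flipping a single bit violates every check that bit participates in, which is what the decoder detects at the receiver.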
27Row and Column Processor Architecture
28RowCol Procs. Right
RowCol Procs. left
30- Throughput = Clk × Code length / Imax
- P = C·f·V²
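The throughput relation can be evaluated directly. In this sketch the function name is hypothetical and the 100 MHz clock is an arbitrary illustrative value, not a figure reported in the slides. The code length and iteration count are the ones used elsewhere in the deck.

```python
def throughput_bps(clock_hz, code_length_bits, max_iterations):
    """Throughput = Clk x Code length / Imax (one iteration per clock cycle)."""
    return clock_hz * code_length_bits / max_iterations


# Illustrative only: 100 MHz is an assumed clock, not a measured result
example = throughput_bps(clock_hz=100e6, code_length_bits=1536, max_iterations=15)
example_gbps = example / 1e9
```

With these assumed inputs, a full-parallel decoder for the 1536-bit code at 15 iterations would deliver 10.24 Gbps. Halving the iteration count doubles the throughput at the same clock.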
31- What is the critical path, and how do you make sure that the sign is computed correctly?
- Answer: The critical path is the sign computation, which depends on the other side. The statistical timing analysis in place and route reports the slowest path delay, so it ensures that the circuit works correctly.
- Why does the decoder chip become smaller even when you split it in half?
- Answer: First, the size and total number of column processors do not change. The main benefit comes from the row processors, which shrink by more than a factor of two. The reason is that each row processor contains several stages of comparators, and these decrease by more than a factor of two when the number of inputs is halved.
- You mentioned the design is power efficient, but you didn't report any power numbers.
- Answer: For this paper we didn't get the power numbers, but power can be estimated from the fact that the major energy comes from the wires (p1/2cf2), and we can say it scales down linearly, so it is about a 58% reduction.
- Are there other works close to your design?
32- Which applications can tolerate this error performance loss?
- This is a very broad question. It really depends on the power budget and how low you want to go on BER.
- What is the difference between Viterbi and LDPC codes?
- What is the difference between turbo and LDPC codes?
- If I don't know the answer:
- I was not involved in that part of the project, but from what I know ...
- Review the previous works
- If asked why the chip figure is not square?
- If somebody asks: the way you proposed didn't decrease the number of wires, so how can you say that it decreases the interconnect complexity?
- You should notice that we are talking about long wires. Because when there is a large number of wires connecting one
33- Hard decision vs. soft
- In hard decision decoding, each received symbol is thresholded to yield a single received bit as input to the decoding algorithm, and messages are passed between variable and check nodes as single bits only.
- In soft decision decoding, multiple bits are used to represent each received symbol and the messages passed between variable and check nodes.
- How did you compute