Algorithms and Architectures for Efficient Low Density Parity Check (LDPC) Decoder Hardware

Tinoosh Mohsenin
Ph.D. Dissertation
Computer Engineering Research Laboratory
Department of Electrical and Computer Engineering
University of California, Davis
Technical Report ECE-CE-2010-4, Computer Engineering Research Laboratory, University of California, Davis, 2010.

Abstract

Many emerging and future communication applications require a significant amount of high throughput data processing and operate with decreasing power budgets. This need for greater energy efficiency and improved performance of electronic devices demands a joint optimization of algorithms, architectures, and implementations.

Low Density Parity Check (LDPC) decoding has received significant attention due to its superior error correction performance, and has been adopted by recent communication standards such as 10GBASE-T 10 Gigabit Ethernet. Currently, high performance LDPC decoders are designed as dedicated blocks within a System-on-Chip (SoC) and require many processing nodes. These nodes require a large amount of interconnect circuitry whose delay and power are dominated by wires. Low clock rates and increased area are therefore a common result of the codes' inherently irregular and global communication patterns. Because the delay and energy costs caused by wires are likely to increase in future fabrication technologies, new solutions that address these VLSI challenges must be considered.

Three novel message-passing decoding algorithms (Split-Row, Multi-Split, and Split-Row Threshold) are introduced, which significantly reduce processor logic complexity and both local and global interconnections. One conventional and four Split-Row Threshold LDPC decoders compatible with the 10GBASE-T standard are implemented in 65 nm CMOS and presented along with their trade-offs in error correction performance, wire interconnect complexity, decoder area, power dissipation, and speed. For additional power saving, an adaptive wordwidth decoding algorithm is proposed which switches between a 6-bit Normal Mode and a reduced 3-bit Low Power Mode depending on the SNR and decoding iteration.
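The abstract does not define the algorithms, so the following is only a minimal sketch of the general idea, based on how Split-Row decoding is commonly described: each row of the parity check matrix is divided into partitions, each partition computes its minimum magnitudes locally, and only sign information crosses partition boundaries. The function names, the scale factor of 0.5, and the equal-size partitioning are assumptions made for illustration and are not taken from the dissertation. The Threshold variant and the adaptive wordwidth scheme are not modeled.

```python
from math import prod


def _sign(x):
    """Return -1.0 for negative inputs and +1.0 otherwise."""
    return -1.0 if x < 0 else 1.0


def _local_update(msgs, row_sign, scale):
    """Min-sum update over one group of incoming messages.

    row_sign is the sign product over the WHOLE row; multiplying it by a
    message's own sign removes that message from the product.
    """
    mags = [abs(m) for m in msgs]
    i_min = min(range(len(mags)), key=mags.__getitem__)
    min1 = mags[i_min]
    min2 = min(m for i, m in enumerate(mags) if i != i_min)
    return [
        scale * row_sign * _sign(m) * (min2 if i == i_min else min1)
        for i, m in enumerate(msgs)
    ]


def minsum_normalized(msgs, scale=0.5):
    """Conventional normalized min-sum check-node update for one row."""
    row_sign = prod(_sign(m) for m in msgs)
    return _local_update(msgs, row_sign, scale)


def split_row(msgs, n_parts, scale=0.5):
    """Split-Row style update (illustrative): magnitudes stay local to each
    partition, and only the row's sign product is shared between them."""
    assert len(msgs) % n_parts == 0, "sketch assumes equal-size partitions"
    size = len(msgs) // n_parts
    assert size >= 2, "each partition needs at least two messages"
    row_sign = prod(_sign(m) for m in msgs)  # the only cross-partition data
    out = []
    for p in range(n_parts):
        out += _local_update(msgs[p * size:(p + 1) * size], row_sign, scale)
    return out


if __name__ == "__main__":
    row = [2.0, -1.0, 4.0, 3.0]
    print(minsum_normalized(row))
    print(split_row(row, n_parts=2))
```

In this toy row, the partition that does not contain the row's global minimum produces larger output magnitudes than the conventional update does. That overestimate is one plausible source of the error-correction trade-off mentioned above, while the reduced cross-partition wiring is where the interconnect savings would come from.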

A 16-way Split-Row Threshold with adaptive wordwidth implementation achieves improvements in area, throughput, and energy efficiency of 3.9x, 2.6x, and 3.6x respectively, compared to a MinSum Normalized implementation, with an SNR loss of 0.25 dB at BER = 10^-7. The decoder occupies a die area of 5.10 mm^2, operates up to 185 MHz at 1.3 V, and attains an average throughput of 85.7 Gbps with early termination. Low power operation at 0.6 V gives a worst case throughput of 9.3 Gbps (above the 6.4 Gbps 10GBASE-T requirement) and an average power of 31 mW.

Paper

PDF (980 KB)

Reference

Tinoosh Mohsenin, "Algorithms and Architectures for Efficient Low Density Parity Check (LDPC) Decoder Hardware," Technical Report ECE-CE-2010-4, Computer Engineering Research Laboratory, ECE Department, University of California, Davis, 2010.

BibTeX entry

@phdthesis{Tinoosh:phdthesis,
   author      = {Tinoosh Mohsenin},
   title       = {Algorithms and Architectures for Efficient Low Density
Parity Check (LDPC) Decoder Hardware},
   school      = {University of California},
   year        = 2010,
   address     = {Davis, CA, USA},
   month       = Nov,
   note        = {\url{http://www.ece.ucdavis.edu/vcl/pubs/theses/2010-4}}
   }

Support Acknowledgment

This work was supported in part by Intel Corporation, UC MICRO, the National Science Foundation under Grant No. 0430090 and CAREER Award 0546907, SRC GRC Grant 1598.001, IntellaSys Corporation, ST Microelectronics, S Machines, MOSIS, Artisan, and a University of California, Davis, Faculty Research Grant. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation (NSF) or other sponsors.
