High Throughput and Resource Efficient Pipelined Decoder Designs for Projective Geometry LDPC Codes

Ved Mitra; Mahesh C. Govil; Girdhari Singh; Sanjeev Agrawal

doi:10.3311/ppee.14807

Periodica Polytechnica Electrical Engineering and Computer Science ◽

10.3311/ppee.14807 ◽

2019 ◽

Vol 64 (2) ◽

pp. 179-191

Author(s):

Ved Mitra ◽

Mahesh C. Govil ◽

Girdhari Singh ◽

Sanjeev Agrawal

Keyword(s):

Projective Geometry ◽

Critical Path ◽

LDPC Codes ◽

CMOS Technology ◽

Quantization Scheme ◽

Path Delay ◽

Bit Rate ◽

Error Performance ◽

LDPC Decoder ◽

Decoder Design

Designing a projective geometry (PG) based low-density parity-check (LDPC) decoder around the iterative sum-product decoding algorithm (SPA) is challenging: the relatively high node degrees of PG codes increase the interconnection and computational complexity and enlarge the memory requirement. PG-LDPC codes decoded with SPA exhibit the best error performance and faster convergence. This paper presents an efficient decoding method, modified SPA (MSPA), that not only shortens the critical-path delay but also improves the hardware utilization and throughput of the decoder while maintaining the error performance of SPA. Three fully parallel LDPC decoder designs based on the PG structure PG(2, GF(2^s)) are introduced. These designs differ in their bit-node (BN) and check-node (CN) architectures. A fixed-point, 9-bit quantization scheme is used to achieve better error performance. Another significant contribution of this work is the pipelining of the proposed decoder architectures to further enhance the overall throughput. The parallel and pipelined designs are implemented for 73-bit (rate 0.616) and 1057-bit (rate 0.769) regular-structured PG-LDPC codes on a Xilinx Virtex-6 LX760 FPGA and in 0.18 μm CMOS technology for ASIC. Synthesis and simulation results show the improved performance, throughput, and effectiveness of the proposed designs.
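The two code lengths and rates quoted in the abstract can be checked with a short sketch. It assumes the standard type-I PG-LDPC construction over PG(2, GF(2^s)), whose length, parity-bit count, and node degree are given by the closed-form expressions in the comments; these formulas are not stated in the abstract, and the function name is hypothetical.

```python
def pg_ldpc_parameters(s: int) -> dict:
    """Parameters of a type-I PG-LDPC code over PG(2, GF(2^s)).

    Assumption: the standard construction, where the parity-check matrix
    is the point-line incidence matrix of the projective plane.
    """
    q = 2 ** s
    n = q * q + q + 1          # number of points (= number of lines)
    parity_bits = 3 ** s + 1   # GF(2) rank of the incidence matrix
    k = n - parity_bits        # information bits
    return {
        "n": n,
        "k": k,
        "rate": round(k / n, 3),
        "node_degree": q + 1,  # both bit nodes and check nodes
    }


if __name__ == "__main__":
    for s in (3, 5):
        print(s, pg_ldpc_parameters(s))
```

Under this assumption, s = 3 and s = 5 reproduce the 73-bit and 1057-bit codes, and the node degree of 2^s + 1 shows why interconnection grows quickly with code length.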

Low-Complexity Hardware Interleaver/Deinterleaver for IEEE 802.11a/g/n WLAN

VLSI Design ◽

10.1155/2012/948957 ◽

2012 ◽

Vol 2012 ◽

pp. 1-7

Author(s):

Zhen-dong Zhang ◽

Bin Wu ◽

Yu-mei Zhou ◽

Xin Zhang

Keyword(s):

High Speed ◽

Critical Path ◽

Low Complexity ◽

CMOS Technology ◽

Path Delay ◽

Hardware Complexity ◽

IEEE 802.11a ◽

Comparison Results ◽

Mathematical Formulas ◽

High Flexibility

A high-speed, low-complexity hardware interleaver/deinterleaver is presented. It supports all 77 802.11n high-throughput (HT) modulation and coding schemes (MCSs) with short and long guard intervals, as well as the 8 non-HT MCSs defined in 802.11a/g. The paper proposes a design methodology that distributes the three permutations of an interleaver across both the write address and the read address. The methodology not only reduces the critical path delay but also simplifies address generation. In addition, the complex mathematical formulas are replaced with optimized hardware structures that avoid hardware-intensive dividers and multipliers. Using 0.13 μm CMOS technology, the cell area of the proposed interleaver/deinterleaver is 0.07 mm², and the synthesized maximum working frequency is 400 MHz. Comparison results show that it outperforms three other similar works with respect to hardware complexity and maximum frequency while maintaining high flexibility.
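As a reference point for the "complex mathematical formulas" mentioned above, the following sketch evaluates the first two interleaver permutations as defined for 802.11a/g. It is a direct software model, not the paper's hardware structure: it uses exactly the divisions and multiplications the paper avoids, and it omits the third permutation (the frequency rotation that 802.11n adds for multiple spatial streams). Function names are hypothetical.

```python
def interleave_index(k: int, n_cbps: int, n_bpsc: int) -> int:
    """Map input bit index k to its output position (802.11a/g, 16 columns).

    n_cbps: coded bits per OFDM symbol; n_bpsc: coded bits per subcarrier.
    """
    s = max(n_bpsc // 2, 1)
    # First permutation: adjacent coded bits go to non-adjacent subcarriers.
    i = (n_cbps // 16) * (k % 16) + k // 16
    # Second permutation: alternate between more and less significant
    # constellation bits.
    j = s * (i // s) + (i + n_cbps - (16 * i) // n_cbps) % s
    return j


def interleave(bits, n_bpsc):
    """Interleave one OFDM symbol's worth of coded bits."""
    n_cbps = len(bits)
    out = [0] * n_cbps
    for k, b in enumerate(bits):
        out[interleave_index(k, n_cbps, n_bpsc)] = b
    return out
```

A model like this is mainly useful as a golden reference when verifying an address generator that produces the same mapping without arithmetic operators.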

Design Space of Flexible Multigigabit LDPC Decoders

VLSI Design ◽

10.1155/2012/942893 ◽

2012 ◽

Vol 2012 ◽

pp. 1-10 ◽

Cited By ~ 7

Author(s):

Philipp Schläfer ◽

Christian Weis ◽

Norbert Wehn ◽

Matthias Alles

Keyword(s):

Design Space ◽

State Of The Art ◽

CMOS Technology ◽

Systematic Investigation ◽

Design Parameters ◽

LDPC Decoder ◽

The Past ◽

IEEE 802.11ad ◽

Decoder Design ◽

First Time

Multigigabit LDPC decoders are demanded by standards such as IEEE 802.15.3c and IEEE 802.11ad. Achieving high throughput while supporting the needed flexibility requires sophisticated architectures. This paper comprehensively presents the design space for flexible multigigabit LDPC applications for the first time. The influence of various design parameters on the hardware is investigated in depth. Two new decoder architectures in a 65 nm CMOS technology are presented to further explore the design space. In the past, memory domination was the bottleneck for throughputs of up to 1 Gbit/s. Our systematic investigation of column- versus row-based partially parallel decoders shows that this is no longer a bottleneck for multigigabit architectures. The evolutionary progress in flexible multigigabit LDPC decoder design is highlighted in an extensive comparison of state-of-the-art decoders.

STUDY OF BIFURCATION BEHAVIOR OF LDPC DECODERS

International Journal of Bifurcation and Chaos ◽

10.1142/s0218127406016926 ◽

2006 ◽

Vol 16 (11) ◽

pp. 3435-3449 ◽

Cited By ~ 3

Author(s):

XIA ZHENG ◽

FRANCIS C. M. LAU ◽

CHI K. TSE ◽

S. C. WONG

Keyword(s):

Finite Length ◽

Convergence Rates ◽

Nonlinear Dynamical Systems ◽

Dynamical Behavior ◽

LDPC Codes ◽

Error Performance ◽

Signal To Noise ◽

Nonlinear Dynamical ◽

LDPC Decoder ◽

Bifurcation Phenomena

The use of low-density parity-check (LDPC) codes in coding digital messages has attracted much research interest because of their excellent bit-error performance. The behavior of iterative LDPC decoders of finite length, however, has not been fully evaluated under different signal-to-noise conditions. By treating finite-length LDPC decoders as high-dimensional nonlinear dynamical systems, we investigate their dynamical behavior and bifurcation phenomena over a range of signal-to-noise ratios (SNRs). Extensive simulations have been performed on both regular and irregular LDPC codes. Moreover, we derive the Jacobian of the system and calculate the corresponding eigenvalues. Results show that the LDPC decoder exhibits bifurcations, including fold, flip and Neimark–Sacker bifurcations. The results are useful for optimizing the choice of parameters that may enhance the effectiveness of the decoding algorithm and improve the convergence rates.
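For a discrete-time iteration, the three bifurcation types named above correspond to where an eigenvalue of the Jacobian at a fixed point reaches the unit circle. The sketch below encodes only that textbook classification; it does not reproduce the paper's Jacobian derivation, and the function name and tolerance are assumptions for illustration.

```python
def classify_bifurcation(eigenvalues, tol=1e-6):
    """Label the unit-circle crossings indicated by Jacobian eigenvalues.

    fold:           a real eigenvalue at +1
    flip:           a real eigenvalue at -1
    Neimark-Sacker: a complex-conjugate pair with modulus 1
    """
    labels = set()
    for lam in eigenvalues:
        lam = complex(lam)
        if abs(abs(lam) - 1.0) > tol:
            continue                      # strictly inside or outside the circle
        if abs(lam.imag) > tol:
            labels.add("Neimark-Sacker")  # complex eigenvalue on the circle
        elif lam.real > 0:
            labels.add("fold")            # eigenvalue at +1
        else:
            labels.add("flip")            # eigenvalue at -1
    return sorted(labels)
```

In practice the eigenvalues would be tracked as the SNR is swept, and a bifurcation is flagged where one of them reaches modulus 1.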

DESIGN OF AN AREA-EFFICIENT HIGH-THROUGHPUT SHIFT-BASED LDPC DECODER

Journal of Circuits, Systems and Computers ◽

10.1142/s0218126613500394 ◽

2013 ◽

Vol 22 (06) ◽

pp. 1350039 ◽

Cited By ~ 1

Author(s):

Yun-Ching Tang ◽

Hong-Ren Wang ◽

Hongchin Lin ◽

Jun-Zhe Huang

Keyword(s):

High Throughput ◽

Supply Voltage ◽

Critical Path ◽

CMOS Process ◽

Path Delay ◽

Clock Frequency ◽

Storage Unit ◽

LDPC Decoder ◽

Routing Congestion ◽

Area Efficient

An area-efficient, high-throughput shift-based LDPC decoder architecture is proposed. The specially designed (512, 1,024) parity-check matrix is effective for partially parallel decoding with the min-sum algorithm (MSA). To increase throughput during decoding, two data frames are fed into the decoder to minimize the idle time of the check node unit (CNU) and the variable node unit (VNU). The throughput is thereby nearly doubled. Unlike the conventional architecture, the message storage unit contains shift registers instead of de-multiplexers and registers, which reduces hardware costs. Routing congestion and critical path delay are also reduced, which increases energy efficiency. An implementation of the proposed decoder in a TSMC 0.18 μm CMOS process achieves a decoding throughput of 1.725 Gbps at a clock frequency of 56 MHz, a supply voltage of 1.8 V, and a core area of 5.18 mm². The normalized area is smaller and the throughput per normalized power consumption is higher than those reported for conventional architectures.
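The check node unit in a min-sum decoder computes, for every edge, the sign product and the minimum magnitude of all the other incoming messages. A minimal software sketch of that update follows; it is the generic (unscaled) min-sum rule, not the paper's shift-register datapath, and the function name is hypothetical.

```python
def min_sum_check_node(llrs):
    """Min-sum check-node update for one check node of degree >= 2.

    Returns one outgoing message per edge: the product of the signs and
    the minimum of the magnitudes of all *other* incoming messages.
    """
    mags = [abs(x) for x in llrs]
    # Only the two smallest magnitudes are ever needed.
    min1_idx = min(range(len(mags)), key=mags.__getitem__)
    min1 = mags[min1_idx]
    min2 = min(m for i, m in enumerate(mags) if i != min1_idx)

    total_sign = 1
    for x in llrs:
        if x < 0:
            total_sign = -total_sign

    out = []
    for i, x in enumerate(llrs):
        sign_i = -1 if x < 0 else 1
        mag = min2 if i == min1_idx else min1
        # Multiplying by sign_i removes this edge's own sign from the product.
        out.append(total_sign * sign_i * mag)
    return out
```

Tracking only the two smallest magnitudes is what keeps the CNU small in hardware, since every output is one of those two values with a sign attached.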

A Modified Shuffling Method to Split the Critical Path Delay in Layered Decoding of QC-LDPC Codes

2019 IEEE 30th Annual International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC) ◽

10.1109/pimrc.2019.8904435 ◽

2019 ◽

Author(s):

Alireza Hasani ◽

Lukasz Lopacinski ◽

Steffen Buchner ◽

Jorg Nolte ◽

Rolf Kraemer

Keyword(s):

Critical Path ◽

LDPC Codes ◽

Path Delay ◽

Critical Path Delay ◽

Layered Decoding

Majority and Minority Voted Redundancy Scheme for Safety-Critical Applications with Error/No-Error Signaling Logic

Electronics ◽

10.3390/electronics7110272 ◽

2018 ◽

Vol 7 (11) ◽

pp. 272 ◽

Cited By ~ 2

Author(s):

Padmanabhan Balasubramanian ◽

Douglas Maskell ◽

Nikos Mastorakis

Keyword(s):

Critical Path ◽

CMOS Technology ◽

Path Delay ◽

Silicon Area ◽

Multiple Faults ◽

Design Metrics ◽

Safety Critical ◽

Critical Path Delay ◽

Function Blocks ◽

Modular Redundancy

In the era of nanoelectronics, multiple faults or failures of function blocks are likely to occur. To withstand these, it has been suggested that higher levels of redundancy be employed in at least the sensitive portions of a circuit or system. In this context, the N-modular redundancy (NMR) scheme may be used to guard against multiple faults or failures of function blocks. However, implementing higher-order redundancy with the NMR scheme would worsen the weight, cost, and design metrics. Hence, as an alternative to NMR, the majority and minority voted redundancy (MMR) scheme was proposed recently. However, that proposal was restricted to the basic implementation, with no provision for indicating the correct or incorrect operation of the MMR. In this work, we therefore present the MMR scheme with error/no-error signaling logic (ESL). Example NMR circuits without and with the ESL (NMRESL), and example MMR circuits without and with the proposed ESL (MMRESL), were implemented to achieve similar degrees of fault tolerance using a 32/28-nm CMOS technology. The results show that, on average, the proposed MMRESL circuits have 18.9% less critical path delay, dissipate 64.8% less power, and require 49.5% less silicon area than their counterpart NMRESL circuits.
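The baseline NMR scheme referred to above can be modeled as a bitwise majority vote over an odd number N of identical function blocks, which masks up to (N - 1) / 2 faulty blocks. The sketch below covers only that baseline voter; the MMR scheme and the ESL are not reproduced because the abstract does not specify them. The function name is hypothetical.

```python
def majority_vote(outputs):
    """Bitwise majority over the outputs of an odd number N of modules.

    Each output is a non-negative integer holding one module's result bits.
    """
    n = len(outputs)
    if n % 2 == 0:
        raise ValueError("NMR needs an odd number of modules")
    width = max(o.bit_length() for o in outputs)
    result = 0
    for bit in range(width):
        ones = sum((o >> bit) & 1 for o in outputs)
        if ones > n // 2:          # strict majority of modules agree on 1
            result |= 1 << bit
    return result
```

With N = 3 this is the familiar triple modular redundancy voter; raising N to tolerate more faults is what drives up the area and power that the MMR scheme aims to reduce.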

Design and Implementation of a Farrow-Interpolator-Based Digital Front-End in LTE Receivers for Carrier Aggregation

Electronics ◽

10.3390/electronics10030231 ◽

2021 ◽

Vol 10 (3) ◽

pp. 231

Author(s):

Chester Sungchung Park ◽

Sunwoo Kim ◽

Jooho Wang ◽

Sungkyung Park

Keyword(s):

Integrated Circuit ◽

Building Block ◽

Orthogonal Frequency Division Multiplexing ◽

Critical Path ◽

Phase Error ◽

System Level ◽

Comb Filter ◽

Carrier Aggregation ◽

Path Delay ◽

Front End

A digital front-end decimation chain based on both a Farrow interpolator for fractional sample-rate conversion and a digital mixer is proposed in order to comply with the long-term evolution standards in radio receivers with ten frequency modes. Design requirement specifications for adjacent channel selectivity, inband blockers, and narrowband blockers are all satisfied, so the proposed digital front-end is 3GPP-compliant. Furthermore, the proposed digital front-end addresses carrier aggregation in the standards via appropriate frequency translations. The digital front-end has a cascaded integrator comb filter prior to the Farrow interpolator and also has a per-carrier carrier aggregation filter and channel selection filter following the digital mixer. A Farrow interpolator with integrate-and-dump circuitry controlled by a condition signal is proposed, as is a digital mixer with periodic reset to prevent phase error accumulation. From the standpoint of design methodology, three models are developed for the overall digital front-end, namely, functional models, cycle-accurate models, and bit-accurate models. Performance is verified by means of the cycle-accurate model; subsequently, by means of a special C++ class, the bitwidths are minimized methodically for area minimization. For system-level performance verification, the orthogonal frequency division multiplexing receiver is also modeled. The critical path delay of each building block is analyzed, and the spectral-domain view is obtained for each building block of the digital front-end circuitry. The proposed digital front-end circuitry is designed, simulated, synthesized in a 180 nm CMOS application-specific integrated circuit technology, and implemented in the Xilinx XC6VLX550T field-programmable gate array (Xilinx, San Jose, CA, USA).
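The cascaded integrator comb (CIC) filter that precedes the Farrow interpolator can be illustrated with a small integer model. This is a generic CIC decimator with a differential delay of 1; the decimation rate, stage count, and function name are assumptions for illustration, and the Farrow interpolator and mixer are not modeled.

```python
def cic_decimate(samples, rate, stages):
    """N-stage CIC decimator (differential delay 1, integer arithmetic).

    Integrators run at the input rate, combs at the decimated rate.
    The DC gain is rate ** stages.
    """
    integrators = [0] * stages
    comb_delays = [0] * stages
    out = []
    for n, x in enumerate(samples):
        acc = x
        for i in range(stages):              # integrator cascade
            integrators[i] += acc
            acc = integrators[i]
        if (n + 1) % rate == 0:              # keep every rate-th sample
            for i in range(stages):          # comb cascade
                # New comb output is input minus previous input; then the
                # current input is stored as the next "previous" value.
                acc, comb_delays[i] = acc - comb_delays[i], acc
            out.append(acc)
    return out
```

Because the structure needs only adders and registers, it suits the first, highest-rate stage of a decimation chain; the growth in word length with rate ** stages is the kind of bitwidth question a bit-accurate model is meant to settle.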

High Area-Efficient Parallel Encoder with Compatible Architecture for 5G LDPC Codes

Symmetry ◽

10.3390/sym13040700 ◽

2021 ◽

Vol 13 (4) ◽

pp. 700

Author(s):

Yufei Zhu ◽

Zuocheng Xing ◽

Zerun Li ◽

Yang Zhang ◽

Yifan Hu

Keyword(s):

High Performance ◽

LDPC Codes ◽

Low Complexity ◽

CMOS Technology ◽

Area Efficiency ◽

Prior Art ◽

New Radio ◽

High Area ◽

Significant Area ◽

Area Efficient

This paper presents a novel low-complexity parallel quasi-cyclic low-density parity-check (QC-LDPC) encoding algorithm that is compatible with the 5th generation (5G) new radio (NR). Based on this algorithm, we propose a highly area-efficient parallel encoder with a compatible architecture. The proposed encoder has the advantages of parallel encoding and pipelined operation. Furthermore, it is designed as a configurable encoding structure that is fully compatible with the different base graphs of 5G LDPC. The encoder architecture is therefore adaptable to various 5G LDPC codes. The proposed encoder was synthesized in a 65 nm CMOS technology. Following the encoder architecture, we implemented nine encoders for distributed lifting sizes of the two base graphs. The experimental results show that the encoder achieves high performance and significant area efficiency, better than related prior art. This work includes a complete encoding algorithm and the compatible encoders, which are fully compatible with the different base graphs of 5G LDPC codes, and is therefore adaptable to various 5G application scenarios.
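In a QC-LDPC code, each nonzero base-graph entry expands to a Z x Z circulant permutation matrix, so multiplying by it is just a cyclic shift of a length-Z block. The sketch below shows that building block and the XOR accumulation over one base-graph row. It is not the paper's encoding algorithm; the shift direction, the use of -1 for an all-zero block, and the function names are assumptions for illustration.

```python
def cyclic_shift(block, shift):
    """Apply a circulant permutation to a length-Z block of bits.

    Assumed convention: output bit i is input bit (i + shift) mod Z.
    """
    z = len(block)
    return [block[(i + shift) % z] for i in range(z)]


def accumulate_row(message_blocks, shifts):
    """XOR together the shifted message blocks of one base-graph row.

    shifts[c] < 0 marks an all-zero block (no connection) in column c.
    """
    z = len(message_blocks[0])
    acc = [0] * z
    for block, shift in zip(message_blocks, shifts):
        if shift < 0:
            continue
        shifted = cyclic_shift(block, shift % z)   # shift value taken mod Z
        acc = [a ^ b for a, b in zip(acc, shifted)]
    return acc
```

Since only the lifting size Z and the shift table change between configurations, a datapath built from shifters and XOR trees can be reused across base graphs, which is the kind of configurability the abstract describes.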

A novel error correction protocol for continuous variable quantum key distribution

Scientific Reports ◽

10.1038/s41598-021-90055-3 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Kadir Gümüş ◽

Tobias A. Eriksson ◽

Masahiro Takeoka ◽

Mikio Fujiwara ◽

Masahide Sasaki ◽

...

Keyword(s):

Error Correction ◽

Quantum Key Distribution ◽

LDPC Codes ◽

Key Distribution ◽

Continuous Variable ◽

Early Termination ◽

Parity Check ◽

Secret Key ◽

LDPC Decoder ◽

And Performance

Reconciliation is a key element of continuous-variable quantum key distribution (CV-QKD) protocols, affecting both the complexity and performance of the entire system. During the reconciliation protocol, error correction is typically performed using low-density parity-check (LDPC) codes with a single decoding attempt. In this paper, we propose a modification to a conventional reconciliation protocol used in four-state protocol CV-QKD systems, called the multiple decoding attempts (MDA) protocol. MDA uses multiple decoding attempts with LDPC codes, each attempt having fewer decoding iterations than the conventional protocol. Between decoding attempts, we propose to reveal information bits, which effectively lowers the code rate. MDA is shown to outperform the conventional protocol with regard to the secret key rate (SKR). A 10% decrease in frame error rate and an 8.5% increase in SKR are reported in this paper. A simple early termination for the LDPC decoder is also proposed and implemented. With early termination, MDA has decoding complexity similar to the conventional protocol while having an improved SKR.

High Efficiency Generalized Parallel Counters for Look-Up Table Based FPGAs

International Journal of Reconfigurable Computing ◽

10.1155/2015/518272 ◽

2015 ◽

Vol 2015 ◽

pp. 1-16 ◽

Cited By ~ 4

Author(s):

Burhan Khurshid ◽

Roohie Naaz Mir

Keyword(s):

Power Dissipation ◽

High Speed ◽

High Efficiency ◽

Critical Path ◽

FIR Filters ◽

Path Delay ◽

Look Up Table ◽

Improved Performance ◽

IP Cores ◽

Low Efficiency

Generalized parallel counters (GPCs) are used in constructing high-speed compressor trees. Prior work has focused on utilizing the fast carry chain and mapping the logic onto Look-Up Tables (LUTs). This mapping is not optimal because the LUT fabric is not fully utilized, which results in low-efficiency GPCs. In this work, we present a heuristic that efficiently maps the GPC logic onto the LUT fabric. We have used our heuristic on various GPCs and have achieved an improvement in efficiency ranging from 33% to 100% in most cases. Experimental results using Xilinx 5th-, 6th-, and 7th-generation FPGAs and Stratix IV and V devices from Altera show a considerable reduction in resource utilization and dynamic power dissipation for almost the same critical path delay. We have also implemented GPC-based FIR filters on 7th-generation Xilinx FPGAs using our proposed heuristic and compared their performance against conventional implementations. Implementations based on our heuristic show improved performance. Comparisons are also made against filters based on integrated DSP blocks and inherent IP cores from Xilinx. The results show that the proposed heuristic provides performance comparable to that of structures based on these specialized resources.
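Functionally, a GPC takes input bits of several binary weights and outputs their weighted sum as a binary number. The sketch below is a behavioral model of that function only; it says nothing about the LUT mapping heuristic, and the function names are hypothetical.

```python
def gpc(columns):
    """Behavioral model of a generalized parallel counter.

    columns[r] is the list of input bits with weight 2**r.
    Returns the weighted sum of all input bits.
    """
    return sum(sum(bits) << rank for rank, bits in enumerate(columns))


def output_width(column_sizes):
    """Number of output bits needed when column r has column_sizes[r] inputs."""
    max_sum = sum(size << rank for rank, size in enumerate(column_sizes))
    return max_sum.bit_length()
```

For example, a plain 6-input counter needs 3 output bits, while a GPC with five weight-1 inputs and three weight-2 inputs needs 4, because its largest possible sum is 5 + 2 * 3 = 11.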
