Understanding all error control codes in one post
This post is a work in progress…
Introduction
The purpose of this post is to explain, in as few words as possible, the principle behind the encoding and decoding of each error control code, as well as to point to some good sources for in-depth further study. The coding schemes covered are:
- Linear Block Code
- Convolutional Code
- Binary Cyclic Code
- Reed-Solomon Code
- Turbo Code
- Low Density Parity Check Code
Motivation
Although most information-theoretic proofs in papers leverage random codes or construct codes that are not practical (i.e., there is no systematic way to encode or decode them on actual hardware), it is important to at least understand the various error control coding schemes. Unfortunately, coding theory is no longer taught at Georgia Tech due to ML taking over everything (another reason to hate ML). One of the main differentiation points between communication theory and information theory research is the absence of coding in the former literature. Coming from a communication theory background, this post is a compilation of various resources and some simplified explanations of coding schemes. The goal is to explain in very simple terms how these codes are constructed and used, and to relegate technical details on, for instance, field and ring theories, to external sources where readers can explore on their own.
Some general resources
This post relies heavily on Error Correction Coding: Mathematical Methods and Algorithms by Todd K. Moon. However, the book is very heavy on actual hardware implementations and may come across as unfriendly to beginners. I also recommend the lecture series on Algebraic Coding Theory by Prof. Mary Wootters from Stanford. In addition, material is also drawn from unpublished lecture notes for ECE 6606 Coding Theory taught by Prof. Matthieu Bloch at Georgia Tech.
Linear Block Code
An \((n,k,d)\) binary block code \(\mathcal{C}\) has codewords forming a subspace of dimension \(k\) of the vector space \(\mathbb{F}_2^n\) (the Galois field of size 2), with minimum Hamming distance \(d\). This is just a fancy way of saying that all codewords are comprised of 1s and 0s with modulo-2 arithmetic, and that the minimum number of bit differences between any two of the codewords is \(d\).
Essentially, an \((n,k,d)\) code transforms message bits \(\mathbf{s} \in \mathbb{F}_2^k\) into a codeword in \(\mathbb{F}_2^n\). Conceptually, we want \(d\) as large as possible so that distinct codewords are far apart in the codeword space and thus less likely to be confused at the receiver. Needless to say, there exists a tradeoff between the rate \(k/n\) and the distance \(d\); one form of it is the Hamming bound \(\begin{equation} R = \frac{k}{n} \leq 1 - \frac{\log_2 \left(\sum_{i=0}^t \binom{n}{i}\right)}{n} \end{equation}\) for a binary linear block code that corrects up to \(t = \left\lfloor \frac{d-1}{2} \right\rfloor\) errors.
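As a quick sanity check, the sketch below (plain MATLAB, no toolboxes) evaluates the bound for the \((7,4,3)\) Hamming code with \(t = 1\); the bound holds with equality, which is another way of saying the Hamming code is perfect.
n = 7; k = 4; d = 3;
t = floor((d-1)/2);                                % = 1 correctable error
vol = sum(arrayfun(@(i) nchoosek(n, i), 0:t));     % Hamming ball volume = 8
bound = 1 - log2(vol)/n;                           % = 4/7
fprintf('R = %.4f, Hamming bound = %.4f\n', k/n, bound);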
Encoding
Consider a \((7,4,3)\) Hamming code that can be uniquely described by a generator matrix of size \(k \times n\) of the form \(\begin{equation} \mathbf{G} = \left[ \begin{array}{cccc|ccc} 1 & 0 & 0 & 0 & 1 & 1 & 0 \\ 0 & 1 & 0 & 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 & 1 & 1 & 1 \\ \end{array} \right] \end{equation}\) Consider a message \(\mathbf{m} = (1,1,0,1)\) in \(\mathbb{F}_2^4\); the codeword is generated as \(\mathbf{mG} = (1,1,0,1,1,0,0)\).
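In MATLAB, this encoding is a single modulo-2 matrix multiplication:
G = [1 0 0 0 1 1 0;
     0 1 0 0 1 0 1;
     0 0 1 0 0 1 1;
     0 0 0 1 1 1 1];
m = [1 1 0 1];
c = mod(m*G, 2)          % returns (1,1,0,1,1,0,0)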
Decoding
For every generator matrix, there exists a corresponding parity check matrix \(\mathbf{H}\) of size \((n-k) \times n\) of the form \(\begin{equation} \mathbf{H} = \left[ \begin{array}{ccccccc} 1 & 1 & 0 & 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 1 & 0 & 0 & 1 \\ \end{array} \right] \end{equation}\) such that \(\mathbf{GH}^T = \mathbf{0}\). The idea of a parity check is that for any \(\mathbf{c} \in \mathcal{C}\), \(\mathbf{cH}^T = \mathbf{0}\). When the received codeword \(\mathbf{r} = \mathbf{c}+\mathbf{e}\) is corrupted by some \(\mathbf{e}\neq \mathbf{0}\), we see that \(\mathbf{rH}^T = \mathbf{eH}^T\) is a syndrome that can be used for error detection. For every syndrome (of which there are \(2^{n-k}-1 = 7\) nonzero ones), a corresponding error position can be tabulated out of all \(n = 7\) possible positions. This can be implemented the hard way with the following code (which uses an equivalent Hamming parity-check matrix whose identity block comes first).
function Corrected_bits = Hamming_correction(Coded_bits, Block_Num, k, n)
% Syndrome decoder for the (7,4) Hamming code: corrects single-bit errors,
% then strips the parity bits.
Corrected_bits = zeros(1, n, Block_Num);
H = [1 0 0 1 0 1 1;   % Hamming parity-check matrix (systematic code)
     0 1 0 1 1 1 0;
     0 0 1 0 1 1 1];
T = [1 0 0 0;         % truncating matrix: keeps the k message bits
     0 1 0 0;
     0 0 1 0;
     0 0 0 1;
     0 0 0 0;
     0 0 0 0;
     0 0 0 0];
for i = 1:Block_Num   % the syndrome equals the column of H at the error position
    s = mod(Coded_bits(:,:,i)*H.', 2);
    if isequal(s, [1 0 1])
        Corrected_bits(:,:,i) = mod(Coded_bits(:,:,i) + [0 0 0 0 0 0 1], 2);
    elseif isequal(s, [1 1 1])
        Corrected_bits(:,:,i) = mod(Coded_bits(:,:,i) + [0 0 0 0 0 1 0], 2);
    elseif isequal(s, [0 1 1])
        Corrected_bits(:,:,i) = mod(Coded_bits(:,:,i) + [0 0 0 0 1 0 0], 2);
    elseif isequal(s, [1 1 0])
        Corrected_bits(:,:,i) = mod(Coded_bits(:,:,i) + [0 0 0 1 0 0 0], 2);
    elseif isequal(s, [0 0 1])
        Corrected_bits(:,:,i) = mod(Coded_bits(:,:,i) + [0 0 1 0 0 0 0], 2);
    elseif isequal(s, [0 1 0])
        Corrected_bits(:,:,i) = mod(Coded_bits(:,:,i) + [0 1 0 0 0 0 0], 2);
    elseif isequal(s, [1 0 0])
        Corrected_bits(:,:,i) = mod(Coded_bits(:,:,i) + [1 0 0 0 0 0 0], 2);
    else              % zero syndrome: no detectable error
        Corrected_bits(:,:,i) = Coded_bits(:,:,i);
    end
end
Truncated_bits = zeros(1, k, Block_Num);   % remove appended parity bits
for i = 1:Block_Num
    Truncated_bits(:,:,i) = Corrected_bits(:,:,i)*T;
end
Corrected_bits = Truncated_bits;
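As a quick check of the function above: the all-zero word is a codeword of every linear code, so corrupting one of its bits exercises the corrector (a hypothetical usage example, matching the array shapes the function expects):
Block_Num = 1; k = 4; n = 7;
r = zeros(1, n, Block_Num);
r(1, 5, 1) = 1;                                   % single-bit error in position 5
m_hat = Hamming_correction(r, Block_Num, k, n)    % returns [0 0 0 0]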
In general, linear binary block codes are by far the easiest to understand, and readers may consult the first several videos of Algebraic Coding Theory or the Hamming code videos by 3B1B for details.
Convolutional Code
TBC
Encoding
TBC
Decoding
TBC
An example of a rate-1/2 convolutional code with a trellis encoder and Viterbi decoder in MATLAB:
%Convolutional coded QPSK using Viterbi decoder (4-state trellis)
%Author: Sidong Guo
clear; clc; close all;
tb = 2;                                       % traceback depth
n = 4;                                        % message bits per block
Block_Num = 5000;
Bits = randi([0 1], [1, n, Block_Num]);
Trellis = poly2trellis(3, [7, 5]);            % rate-1/2, constraint length 3
Coded_bits = zeros(1, 2*n, Block_Num);
for i = 1:Block_Num
    Coded_bits(:,:,i) = convenc(Bits(:,:,i), Trellis);
end
% Group bits in pairs to form QPSK symbol indices
Premod_bits = zeros(1, n, Block_Num);
Premod_bits2 = zeros(1, n/2, Block_Num);
for i = 1:Block_Num
    b = 1;
    while b < length(Coded_bits(:,:,i))
        Premod_bits(:,(b+1)/2,i) = bin2dec(num2str(Coded_bits(:,b:b+1,i)));
        b = b + 2;
    end
end
for i = 1:Block_Num
    b = 1;
    while b < length(Bits(:,:,i))
        Premod_bits2(:,(b+1)/2,i) = bin2dec(num2str(Bits(:,b:b+1,i)));
        b = b + 2;
    end
end
Coded_Symbols = qammod(Premod_bits, 4)*sqrt(0.5);   % unit-energy QPSK
Symbols = qammod(Premod_bits2, 4)*sqrt(0.5);
nr = randn(1, n, Block_Num);
ni = randn(1, n, Block_Num);
Noise_coded = (sqrt(2)/2)*(nr + 1i*ni);             % unit-variance complex noise
nr = randn(1, n/2, Block_Num);
ni = randn(1, n/2, Block_Num);
Noise = (sqrt(2)/2)*(nr + 1i*ni);
ratio = zeros(1, 6);
Coded_ratio = zeros(1, 6);
Precorrect_bits = zeros(1, 2*n, Block_Num);
Decoded = zeros(1, n, Block_Num);
Corrected_bits = zeros(1, n, Block_Num);
for SNRdb = 0:2:10
    disp(SNRdb);
    SNR = 10^(SNRdb/10);
    Symbols2 = Symbols + (1/sqrt(SNR))*Noise;
    Coded_Symbols2 = Coded_Symbols + (1/sqrt(SNR))*Noise_coded;
    Decoded_Symbol = qamdemod(Symbols2, 4);
    Decoded_Symbol2 = qamdemod(Coded_Symbols2, 4);
    for i = 1:Block_Num                       % uncoded symbol decisions back to bits
        for b = 1:length(Decoded_Symbol(:,:,i))
            c = dec2bin(Decoded_Symbol(:,b,i), 2);
            Decoded(:,(2*b-1):(2*b),i) = [str2double(c(1)), str2double(c(2))];
        end
    end
    for i = 1:Block_Num                       % coded symbol decisions back to bits
        for b = 1:length(Decoded_Symbol2(:,:,i))
            a = dec2bin(Decoded_Symbol2(:,b,i), 2);
            Precorrect_bits(:,(2*b-1):(2*b),i) = [str2double(a(1)), str2double(a(2))];
        end
    end
    for i = 1:Block_Num                       % hard-decision Viterbi decoding
        Corrected_bits(:,:,i) = vitdec(Precorrect_bits(:,:,i), Trellis, tb, 'trunc', 'hard');
    end
    ratio(SNRdb/2+1) = sum(sum(Decoded ~= Bits))/(Block_Num*n);
    Coded_ratio(SNRdb/2+1) = sum(sum(Corrected_bits ~= Bits))/(Block_Num*n);
end
figure()
semilogy(0:2:10, ratio)
hold on
semilogy(0:2:10, Coded_ratio)
xlabel('SNR (dB)')
ylabel('BER')
legend('QPSK', 'Coded QPSK')
Binary Cyclic Code
A binary cyclic code is another linear code, whose name comes from the fact that any cyclic right shift of a codeword is another codeword. The crucial idea is that each codeword in a cyclic code can be associated with a polynomial, where the coefficients correspond to the elements of the codeword. A codeword \(\mathbf{c}\) is represented as: \(\begin{equation} c(x) = c_0 + c_1x + c_2x^2 + \ldots + c_{n-1}x^{n-1} \end{equation}\) A cyclic shift to the right can then be represented in polynomial form as \(x\,c(x) \bmod (x^n-1)\).
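The sketch below verifies this identity numerically for a hypothetical length-7 word: multiplying by \(x\) and folding the \(x^7\) term back into \(x^0\) gives exactly a cyclic right shift.
c = [1 0 1 1 0 0 0];               % c(x) = 1 + x^2 + x^3, ascending coefficients
d = mod(conv(c, [0 1]), 2);        % x*c(x), now of length n+1
d(1) = mod(d(1) + d(8), 2);        % reduce mod x^7 - 1: the x^7 term wraps to x^0
d = d(1:7);
isequal(d, circshift(c, 1))        % true: identical to a cyclic right shift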
Encoding
Based on the previous discussion, a message \(\mathbf{m} = (m_0,m_1,\dots,m_{k-1})\) can also be represented by the polynomial \(m(x) = \sum_{i=0}^{k-1}m_i x^i\), and the corresponding codeword polynomial is obtained as: \(\begin{equation} c(x) = m(x)g(x) =\sum_{i=0}^{k-1}m_ix^i g(x) \end{equation}\) where \(g(x)\) is the degree-\((n-k)\) generator polynomial of the code. In matrix form we have \(\begin{equation} \mathbf{c} = (m_0,m_1,\dots,m_{k-1}) \begin{bmatrix} g_0 & g_1 & \cdots & g_{n-k} & 0 & \cdots & 0 \\ 0 & g_0 & g_1 & \cdots & g_{n-k} & \ddots & \vdots \\ \vdots & \ddots & \ddots & \ddots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & g_0 & g_1 & \cdots & g_{n-k} \end{bmatrix} \end{equation}\)
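Since polynomial multiplication is convolution of the coefficient sequences, encoding is a one-liner in MATLAB (shown here with the \((7,3)\) generator used in the next subsection and an arbitrary example message):
g = [1 0 1 1 1];        % g(x) = 1 + x^2 + x^3 + x^4, ascending coefficients
m = [1 0 1];            % m(x) = 1 + x^2
c = mod(conv(m, g), 2)  % c(x) = m(x)g(x), a length-7 codeword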
Decoding
Principally, syndrome decoding can also be applied to binary cyclic codes. For example, assume we have the generator polynomial \(g(x) = x^4 + x^3 + x^2 + 1\) of a \((7,3)\) code; the corresponding parity check polynomial is calculated as \(h(x) = (x^7 + 1)/(x^4 + x^3 + x^2 + 1) = x^3 + x^2 + 1\), giving \(\begin{equation} H = \left[ \begin{array}{ccccccc} 1 & 1 & 0 & 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 & 1 & 0 & 1 \end{array} \right] \end{equation}\) Using traditional syndrome decoding, decoding a cyclic code can be accomplished with
n = 7; % Codeword length
k = 3; % Message length
% Generator polynomial coefficients in ascending order, zero-padded to length n
g = [1 0 1 1 1 0 0]; % Coefficients of 1 + x^2 + x^3 + x^4
% Create generator matrix G from the generator polynomial
G = zeros(k, n);
for i = 0:k-1
    G(i+1, :) = circshift(g, i);
end
% Create the parity-check matrix H
% Parity check polynomial is (x^7 + 1)/(x^4 + x^3 + x^2 + 1) = x^3 + x^2 + 1,
% entered here with its coefficients reversed and zero-padded
h = [1 1 0 1 0 0 0];
H = zeros(n-k, n);
for i = 0:n-k-1
    H(i+1, :) = circshift(h, i);
end
% Example message vector (random binary values of length k)
msg = randi([0 1], 1, k);
% Encode the message using the generator matrix
codeword = mod(msg * G, 2);
% Introduce a single bit error for testing
error_position = randi([1 n]);
received = codeword;
received(error_position) = ~received(error_position);
% Decode and correct the error
syndrome = mod(received * H', 2);
error_pattern = zeros(1, n);
if any(syndrome)
    % Check each column of H to find a match with the syndrome
    for i = 1:n
        if all(H(:, i) == syndrome')
            error_pattern(i) = 1;
            break;
        end
    end
end
% Correct the codeword by flipping the error bit
corrected_codeword = mod(received + error_pattern, 2);
However, the beauty of cyclic codes, from a hardware perspective, is that they can leverage shift registers in a systematic implementation. Here is a systematic shift-register decoding algorithm for cyclic codes (a code sketch follows the list):
1. Let \(i = 0\) and compute the syndrome \(\mathbf{s}\) for a received codeword \(\mathbf{r}\).
2. If \(\mathbf{s}\) is in the syndrome look-up table, go to step 5.
3. Let \(i = i+1\), enter a 0 into the shift register, and compute \(\mathbf{s}_i\).
4. If \(\mathbf{s}_i\) is not in the syndrome look-up table, go to step 3.
5. Let \(\mathbf{e}_i\) be the error pattern corresponding to syndrome \(\mathbf{s}_i\). Determine \(\mathbf{e}\) by cyclically shifting \(\mathbf{e}_i\) \(i\) times to the left.
6. Let \(\mathbf{c} = \mathbf{r} - \mathbf{e}\) and output \(\mathbf{c}\).
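Below is a minimal sketch of these steps for the \((7,3)\) code above, with a one-entry look-up table holding the syndrome of an error in the last position (enough for single-error correction). Here gf2mod is a small hypothetical helper, defined at the end of the script, that computes polynomial remainders over GF(2) with ascending coefficients.
n = 7;
g = [1 0 1 1 1];                        % g(x) = 1 + x^2 + x^3 + x^4
r = [1 1 1 1 1 0 0];                    % g(x) encoded from m(x) = 1, bit 2 flipped
s = gf2mod(r, g);                       % step 1: s(x) = r(x) mod g(x)
s_table = gf2mod([zeros(1, n-1) 1], g); % syndrome of an error at the last position
e = zeros(1, n);
if any(s)                               % zero syndrome: nothing to correct
    i = 0;
    while ~isequal(s, s_table) && i < n % steps 2-4: shift until a table hit
        i = i + 1;
        s = gf2mod([0 s], g);           % entering a 0 maps s(x) to x*s(x) mod g(x)
    end
    e = circshift([zeros(1, n-1) 1], -i);  % step 5: shift the pattern back left
end
c_hat = mod(r + e, 2)                   % step 6: recovers [1 0 1 1 1 0 0]

function s = gf2mod(a, g)
    % remainder of a(x) mod g(x) over GF(2), ascending coefficients
    dg = find(g, 1, 'last');
    a = [mod(a, 2), zeros(1, max(0, dg - numel(a)))];
    da = find(a, 1, 'last');
    while ~isempty(da) && da >= dg
        a(da-dg+1:da) = mod(a(da-dg+1:da) + g(1:dg), 2);
        da = find(a, 1, 'last');
    end
    s = a(1:dg-1);
end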
Reed-Solomon Code
The RS code is a special cyclic code: it is an \((n,k,n-k+1)\) linear code, meaning that it achieves the Singleton bound \(d \leq n-k+1\).
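Even with the encoding and decoding details deferred, the distance property is easy to check empirically with MATLAB's Communications Toolbox (assuming rsenc/rsdec are available): a \((7,3)\) RS code over GF(8) has \(d = 5\) and therefore corrects \(t = 2\) symbol errors.
m = 3; n = 2^m - 1; k = 3;          % (7,3) RS code over GF(8)
msg = gf([1 6 3], m);               % one message of k symbols
cw = rsenc(msg, n, k);              % length-n codeword
cw(2) = cw(2) + gf(5, m);           % corrupt two symbols
cw(5) = cw(5) + gf(2, m);
decoded = rsdec(cw, n, k)           % recovers [1 6 3]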
Encoding
TBC
Direct Method Decoding
Berlekamp-Massey Algorithm
Turbo Code
Turbo codes are a class of high-performance error-correcting codes introduced in 1993 by Claude Berrou and his colleagues. They have significantly improved the efficiency of communication over noisy channels, approaching the Shannon limit, the theoretical maximum efficiency for error-free communication at a given noise level.
Turbo codes are based on the principle of iterative decoding of parallel concatenated convolutional codes (PCCCs). The basic structure of a turbo code involves two or more convolutional codes (usually 2) separated by an interleaver.
**Interleaver**: An interleaver is a device that permutes the order of the input bits. The purpose of interleaving in turbo codes is to spread out error bursts that might occur in the channel, making it easier for the error correction process to handle them. Mathematically, we use \(\pi\) to represent the interleaver function.
Encoding
Given the message sequence \(\mathbf{m} = (m_1, m_2, \ldots, m_k)\), the encoding process involves the following steps:
- First encoder output: \(c_1 = G_1(D) \cdot m(D)\), where \(G_1(D)\) are the generator polynomials for the first encoder, and \(m(D)\) is the polynomial representation of \(\mathbf{m}\).
- Interleaving: applying the interleaver \(\pi\) gives \(m'(D) = \pi(m(D))\), where \(\pi\) permutes the bits of \(\mathbf{m}\) to produce \(\mathbf{m}'\).
- Second encoder output: \(c_2 = G_2(D) \cdot m'(D)\), where \(G_2(D)\) are the generator polynomials for the second encoder.
The final codeword is a combination of \(c_1\) and \(c_2\) and possibly the original message itself. This depends on whether the turbo code is systematic (where the message forms part of the codeword).
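Below is a minimal sketch of a rate-1/3 systematic PCCC encoder in MATLAB, using two identical recursive systematic constituent encoders and a random interleaver (trellis termination is glossed over, and the generator choice [7 5] with feedback 7 is just an illustrative assumption):
k = 10;
m = randi([0 1], 1, k);               % message bits
trellis = poly2trellis(3, [7 5], 7);  % recursive systematic (RSC) encoder
perm = randperm(k);                   % interleaver pi
out1 = convenc(m, trellis);           % first encoder: systematic + parity bits
c0 = out1(1:2:end);                   % systematic part (equals m)
c1 = out1(2:2:end);                   % parity from the first encoder
out2 = convenc(m(perm), trellis);     % second encoder sees interleaved bits
c2 = out2(2:2:end);                   % parity from the second encoder
codeword = [c0, c1, c2];              % rate 1/3 turbo codeword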
BCJR
There are now many famed algorithms for decoding turbo codes; here we introduce the celebrated BCJR algorithm, which is also detailed in this tutorial.
The idea is the following: BCJR iteratively computes the a posteriori likelihood, given a received codeword \(\mathbf{y}\), of the input bit at time \(t\) being either \(+1\) or \(-1\) (bit 1 or 0), via the metric \(\begin{equation} L(u_t|\mathbf{y}) = \ln\left(\frac{P(u_t = +1|\mathbf{y})}{P(u_t =-1|\mathbf{y})}\right) \end{equation}\)
Recalling trellis decoding from convolutional codes, suppose we are at time \(t\) on the trellis in state \(s_t = s\). The received codeword \(\mathbf{y}\) can be divided into three regions (prior, current, and posterior) in the following way: \(\begin{equation} \mathbf{y} = \mathbf{y}_{<t}\,y_t\,\mathbf{y}_{>t} \end{equation}\)
Averaging over all state transitions on the trellis, the metric known as the log-likelihood ratio can be expressed as \(\begin{equation} L(u_t|\mathbf{y}) = \ln\left(\frac{\sum_{s_{t-1}\rightarrow s_t |u_t = +1}P(s_{t-1},s_t,\mathbf{y})}{\sum_{s_{t-1}\rightarrow s_t |u_t = -1}P(s_{t-1},s_t,\mathbf{y})}\right) \end{equation}\) where we have \(P(s_{t-1},s_t,\mathbf{y}) = P(s_{t-1},\mathbf{y}_{<t})P(y_t,s_t|s_{t-1})P(\mathbf{y}_{>t}|s_t) = \alpha_{t-1}(s_{t-1})\gamma_t(s_{t-1},s_t)\beta_{t}(s_t)\), and we want to iteratively find these three quantities. In the following, for simplicity, let \(s = s_t\) and \(s' = s_{t-1}\).
- Initialization: \(\alpha_0(s) = \begin{cases} 1 & \text{if } s = 0 \\ 0 & \text{otherwise} \end{cases}\) and \(\beta_N(s) = \begin{cases} 1 & \text{if } s = 0 \\ 0 & \text{otherwise} \end{cases}\), where \(N\) is the length of the trellis.
- Calculate \(\gamma\): the gamma values are calculated based on the transition probabilities between states and the observed received symbols: \(\begin{equation} \gamma_t(s', s) = \exp\left(\sum_{i=1}^{n} \frac{y_{t,i} \cdot x_{t,i}(s', s) - x_{t,i}^2(s', s)/2}{2\sigma^2}\right) \end{equation}\) Here, \(x_{t,i}(s', s)\) is the \(i\)-th coded bit for the transition from state \(s'\) to \(s\) at time \(t\).
- Iteratively (forward and backward) calculate \(\alpha\) and \(\beta\): \(\begin{equation} \alpha_t(s) = \sum_{s'} \alpha_{t-1}(s') \gamma_t(s', s), \qquad \beta_{t-1}(s') = \sum_{s} \gamma_t(s', s) \beta_t(s) \end{equation}\)
- Calculate the LLR by substituting these quantities into the expression above: \(\begin{equation} L(u_t|\mathbf{y}) = \ln\left(\frac{\sum_{s'\rightarrow s |u_t = +1}\alpha_{t-1}(s')\gamma_t(s',s)\beta_t(s)}{\sum_{s'\rightarrow s |u_t = -1}\alpha_{t-1}(s')\gamma_t(s',s)\beta_t(s)}\right) \end{equation}\)
- Normalization for stability: \(\begin{equation} \alpha_t(s) = \frac{\alpha_t(s)}{\sum_{s'} \alpha_t(s')}, \qquad \beta_t(s) = \frac{\beta_t(s)}{\sum_{s'} \beta_t(s')} \end{equation}\) Normalization helps prevent the probabilities from becoming too small or too large, which could lead to underflow or overflow in digital computations. A compact sketch of these recursions is given after this list.
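The following is a didactic MATLAB sketch of the steps above for the rate-1/2 [7,5] code over BPSK/AWGN. Several simplifying assumptions are made: the proportional branch metric \(\exp(\mathbf{y}_t\cdot\mathbf{x}_t/\sigma^2)\) (constant factors cancel in the LLR), a uniform \(\beta\) initialization since the trellis is left unterminated, and bit 1 mapped to \(-1\).
trellis = poly2trellis(3, [7 5]);
nS = trellis.numStates;
K = 100; sigma2 = 0.5;                       % message length, noise variance
u = randi([0 1], 1, K);
x = 1 - 2*convenc(u, trellis);               % BPSK: bit 0 -> +1, bit 1 -> -1
y = x + sqrt(sigma2)*randn(size(x));
gam = zeros(nS, 2, K);                       % branch metrics for u = 0 and u = 1
for t = 1:K
    yt = y(2*t-1:2*t);
    for s = 1:nS
        for uu = 0:1
            out = trellis.outputs(s, uu+1);          % 2-bit branch label
            xt = 1 - 2*[floor(out/2), mod(out, 2)];  % branch BPSK symbols
            gam(s, uu+1, t) = exp(sum(yt.*xt)/sigma2);
        end
    end
end
alpha = zeros(nS, K+1); alpha(1, 1) = 1;     % start in the all-zero state
beta = ones(nS, K+1)/nS;                     % uniform: unterminated trellis
for t = 1:K                                  % forward recursion
    for s = 1:nS
        for uu = 0:1
            ns = trellis.nextStates(s, uu+1) + 1;
            alpha(ns, t+1) = alpha(ns, t+1) + alpha(s, t)*gam(s, uu+1, t);
        end
    end
    alpha(:, t+1) = alpha(:, t+1)/sum(alpha(:, t+1));  % normalize
end
for t = K:-1:1                               % backward recursion
    b = zeros(nS, 1);
    for s = 1:nS
        for uu = 0:1
            ns = trellis.nextStates(s, uu+1) + 1;
            b(s) = b(s) + gam(s, uu+1, t)*beta(ns, t+1);
        end
    end
    beta(:, t) = b/sum(b);                   % normalize
end
L = zeros(1, K);                             % LLR of each message bit
for t = 1:K
    num = 0; den = 0;
    for s = 1:nS
        ns0 = trellis.nextStates(s, 1) + 1;  % u = 0 branch
        ns1 = trellis.nextStates(s, 2) + 1;  % u = 1 branch
        den = den + alpha(s, t)*gam(s, 1, t)*beta(ns0, t+1);
        num = num + alpha(s, t)*gam(s, 2, t)*beta(ns1, t+1);
    end
    L(t) = log(num/den);                     % ln P(u_t = 1|y) / P(u_t = 0|y)
end
u_hat = L > 0;                               % hard decisions
ber = mean(u_hat ~= u)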
LDPC Code
There is an excellent tutorial on LDPC codes by our very own Prof. John Barry.