Understanding all error control codes in one post

This post is a work in progress…

Introduction

The purpose of this post is to explain, in as few words as possible, the principle behind the encoding and decoding of each error control code, and to point to some good sources for further in-depth study. The coding schemes covered are:

  1. Linear Block Code
  2. Convolutional Code
  3. Binary Cyclic Code
  4. Reed-Solomon Code
  5. Turbo Code
  6. Low Density Parity Check Code

Motivation

Although most information-theoretic proofs in papers leverage random codes or construct codes that are not practical (i.e., there is no systematic way to encode or decode them on actual hardware), it is important to at least understand the various error control coding schemes. Unfortunately, coding theory is no longer taught at Georgia Tech due to ML taking over everything (another reason to hate ML). One of the main differentiation points between communication theory and information theory research is the absence of coding in the former literature. Coming from a communication theory background, I have compiled in this post various resources and simplified explanations of coding schemes. The goal is to explain in very simple terms how these codes are constructed and used, and to relegate technical details (on, for instance, field and ring theory) to external sources that readers can explore on their own.

Some general resources

The post relies heavily on Error Correction Coding: Mathematical Methods and Algorithms by Todd K. Moon. However, the book leans heavily on actual hardware implementations and may come across as unfriendly to beginners. I also recommend the lecture series on Algebraic Coding Theory by Prof. Mary Wootters of Stanford. In addition, material is drawn from unpublished lecture notes for ECE 6606 Coding Theory taught by Prof. Matthieu Bloch at Georgia Tech.

Linear Block Code

An \((n,k,d)\) binary block code \(\mathcal{C}\) has codewords forming a subspace of dimension \(k\) of the vector space \(\mathbb{F}_2^n\) (over the Galois field of size 2), with minimum Hamming distance \(d\). This is just a fancy way to say that all codewords are made up of 1s and 0s under modulo-2 arithmetic, and that the minimum number of bit differences between any two codewords is \(d\).

Essentially, an \((n,k,d)\) code transforms message bits \(\mathbf{s} \in \mathbb{F}_2^k\) into a codeword in \(\mathbb{F}_2^n\). Conceptually, we want \(d\) as large as possible so that distinct codewords are far apart in the codeword space and thus less likely to be confused at the receiver. Needless to say, there exists a tradeoff between the rate \(k/n\) and the distance \(d\), and one form of it is the Hamming bound \(\begin{equation} R = \frac{k}{n} \leq 1 - \frac{\log_2 \left(\sum_{i=0}^t \binom{n}{i}\right)}{n} \end{equation}\) for a binary linear block code that corrects up to \(t = \left\lfloor \frac{d-1}{2} \right\rfloor\) errors.
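As a quick sanity check, here is a minimal MATLAB sketch (variable names are my own) verifying that the \((7,4,3)\) Hamming code introduced below meets this bound with equality, which is why it is called a perfect code:

    % Hamming bound check for the (7,4,3) code: t = 1, and the rate
    % k/n = 4/7 should match the bound exactly (a "perfect" code).
    n = 7; k = 4; d = 3;
    t = floor((d-1)/2);
    V = sum(arrayfun(@(i) nchoosek(n,i), 0:t)); % volume of a radius-t Hamming ball
    bound = 1 - log2(V)/n;
    fprintf('R = %.4f, Hamming bound = %.4f\n', k/n, bound);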

Encoding

Consider a \((7,4,3)\) Hamming code that can be uniquely described by a generator matrix of size \(k \times n\) of the form \(\begin{equation} \mathbf{G} = \left[ \begin{array}{cccc|ccc} 1 & 0 & 0 & 0 & 1 & 1 & 0 \\ 0 & 1 & 0 & 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 & 1 & 1 & 1 \\ \end{array} \right] \end{equation}\) For a message \(\mathbf{m} = (1,1,0,1)\) in \(\mathbb{F}_2^4\), the codeword is generated as \(\mathbf{c} = \mathbf{mG} = (1,1,0,1,1,0,0)\), with all operations modulo 2.
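In MATLAB, this encoding is a one-liner (a minimal sketch reproducing the example above):

    % Encode the example message with the (7,4,3) generator matrix, mod 2.
    G = [1 0 0 0 1 1 0;
         0 1 0 0 1 0 1;
         0 0 1 0 0 1 1;
         0 0 0 1 1 1 1];
    m = [1 1 0 1];
    c = mod(m*G, 2)   % returns (1,1,0,1,1,0,0)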

Decoding

For every generator matrix, there exists a corresponding parity check matrix \(\mathbf{H}\) of size \((n-k) \times n\); for \(\mathbf{G} = [\mathbf{I}_k \mid \mathbf{P}]\) above, it is \(\mathbf{H} = [\mathbf{P}^T \mid \mathbf{I}_{n-k}]\), i.e., \(\begin{equation} \mathbf{H} = \left[ \begin{array}{ccccccc} 1 & 1 & 0 & 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 1 & 0 & 0 & 1 \\ \end{array} \right] \end{equation}\) such that \(\mathbf{GH}^T = \mathbf{0}\). The idea of a parity check is that for any \(\mathbf{c} \in \mathcal{C}\), \(\mathbf{cH}^T = \mathbf{0}\). When the received codeword \(\mathbf{r} = \mathbf{c}+\mathbf{e}\) is corrupted by some \(\mathbf{e}\neq \mathbf{0}\), we see that \(\mathbf{rH}^T = \mathbf{eH}^T\) is a syndrome that can be used for error detection. For every nonzero syndrome (of which there are \(2^{n-k}-1 = 7\) in total), a corresponding error position can be tabulated out of all \(n = 7\) possible positions. This can be implemented the hard way with the following code (note that the hard-coded \(\mathbf{H}\) below follows a different, parity-first systematic convention):

    function Corrected_bits=Hamming_correction(Coded_bits,Block_Num,k,n)
    Corrected_bits=zeros(1,n,Block_Num);      %corrected codewords keep full length n before truncation
    H=[1 0 0 1 0 1 1;      %Hamming code parity check matrix (systematic code)
    0 1 0 1 1 1 0;
    0 0 1 0 1 1 1];
    T=[1,0,0,0;      %Truncating matrix 
        0,1,0,0;
        0,0,1,0;
        0,0,0,1;
        0,0,0,0;
        0,0,0,0;
        0,0,0,0];
    for i=1:Block_Num     %multiply received bits by the parity check matrix and correct any single-bit error
        if mod(Coded_bits(:,:,i)*H.',2)==[1,0,1]
            Corrected_bits(:,:,i)=mod(Coded_bits(:,:,i)+[0,0,0,0,0,0,1],2);
        end
        if mod(Coded_bits(:,:,i)*H.',2)==[1,1,1]
            Corrected_bits(:,:,i)=mod(Coded_bits(:,:,i)+[0,0,0,0,0,1,0],2);
        end
        if mod(Coded_bits(:,:,i)*H.',2)==[0,1,1]
            Corrected_bits(:,:,i)=mod(Coded_bits(:,:,i)+[0,0,0,0,1,0,0],2);
        end
        if mod(Coded_bits(:,:,i)*H.',2)==[1,1,0]
            Corrected_bits(:,:,i)=mod(Coded_bits(:,:,i)+[0,0,0,1,0,0,0],2);
        end
        if mod(Coded_bits(:,:,i)*H.',2)==[0,0,1]
            Corrected_bits(:,:,i)=mod(Coded_bits(:,:,i)+[0,0,1,0,0,0,0],2);
        end
        if mod(Coded_bits(:,:,i)*H.',2)==[0,1,0]
            Corrected_bits(:,:,i)=mod(Coded_bits(:,:,i)+[0,1,0,0,0,0,0],2);
        end
        if mod(Coded_bits(:,:,i)*H.',2)==[1,0,0]
            Corrected_bits(:,:,i)=mod(Coded_bits(:,:,i)+[1,0,0,0,0,0,0],2);
        end
        if mod(Coded_bits(:,:,i)*H.',2)==[0,0,0]
            Corrected_bits(:,:,i)=mod(Coded_bits(:,:,i)+[0,0,0,0,0,0,0],2);
        end
    end
    Truncated_bits=zeros(1,k,Block_Num);     %remove appended parity bits, keeping the k message bits
    for i=1:Block_Num
        Truncated_bits(:,:,i)=Corrected_bits(:,:,i)*T;
    end
    Corrected_bits=Truncated_bits;

In general, the linear binary block code is by far the easiest to understand, and readers may consult the first several videos of Algebraic Coding Theory, or the Hamming code videos by 3B1B, for details.

Convolutional Code

TBC

Encoding

TBC

Decoding

TBC

Example of a rate-1/2 convolutional code with a trellis encoder and Viterbi decoder in MATLAB:

    %Convolutional Coded QPSK using Viterbi Decoder (4-state trellis)
    %Author: Sidong Guo 
    clear;clc;close all;
    tb=2;                               %traceback depth for vitdec
    n=4;                                %message bits per block
    Block_Num=5000;                     %number of blocks to simulate
    Bits=randi([0 1],[1,n,Block_Num]);  %random message bits
    Trellis=poly2trellis(3,[7,5]);      %4-state, rate-1/2 trellis
    Coded_bits=zeros(1,2*n,Block_Num);  %rate 1/2 doubles the bit count

    for i=1:Block_Num
        Coded_bits(:,:,i)=convenc(Bits(:,:,i),Trellis);
    end
    Premod_bits=zeros(1,n,Block_Num);      %coded bit pairs grouped into 4-ary symbols
    Premod_bits2=zeros(1,n/2,Block_Num);   %uncoded bit pairs grouped into 4-ary symbols
    for i=1:Block_Num
        b=1;
        while b<length(Coded_bits(:,:,i))
            Premod_bits(:,(b+1)/2,i)=bin2dec(num2str(Coded_bits(:,b:b+1,i)));
            b=b+2;
        end
    end
    for i=1:Block_Num
        b=1;
        while b<length(Bits(:,:,i))
            Premod_bits2(:,(b+1)/2,i)=bin2dec(num2str(Bits(:,b:b+1,i)));
            b=b+2;
        end
    end
    Coded_Symbols=qammod(Premod_bits,4)*sqrt(0.5);    %QPSK, scaled to unit average energy
    Symbols=qammod(Premod_bits2,4)*sqrt(0.5);

    nr=randn(1,n,Block_Num);
    ni=randn(1,n,Block_Num);
    Noise_coded=(sqrt(2)/2)*(nr+1i*ni);
    nr=randn(1,n/2,Block_Num);
    ni=randn(1,n/2,Block_Num);
    Noise=(sqrt(2)/2)*(nr+1i*ni);

    ratio=zeros(1,6);
    Coded_ratio=zeros(1,6);
    Precorrect_bits=zeros(1,2*n,Block_Num);
    Decoded=zeros(1,n,Block_Num);
    Corrected_bits=zeros(1,n,Block_Num);
    for SNRdb=0:2:10
        disp(SNRdb);
        Coded_error=0;
        SNR=10^(SNRdb/10);
        Symbols2=Symbols+(1/sqrt(SNR))*Noise;
        Coded_Symbols2=Coded_Symbols+(1/sqrt(SNR))*Noise_coded;
        Decoded_Symbol=qamdemod(Symbols2,4);
        Decoded_Symbol2=qamdemod(Coded_Symbols2,4);
        for i=1:Block_Num
            for b=1:length(Decoded_Symbol(:,:,i))
                c=dec2bin(Decoded_Symbol(:,b,i),2);
                Decoded(:,(2*b-1):(2*b),i)=[str2double(c(1)),str2double(c(2))];
            end
        end
        for i=1:Block_Num
            for b=1:length(Decoded_Symbol2(:,:,i))
                a=dec2bin(Decoded_Symbol2(:,b,i),2);
                Precorrect_bits(:,(2*b-1):(2*b),i)=[str2double(a(1)),str2double(a(2))];
            end
        end
        for i=1:Block_Num     %hard-decision Viterbi decoding (truncated, no tail bits)
            Corrected_bits(:,:,i)=vitdec(Precorrect_bits(:,:,i),Trellis,tb,'trunc','hard');
        end
        ratio(SNRdb/2+1)=sum(sum(Decoded~=Bits))/(Block_Num*n);                %uncoded BER
        Coded_ratio(SNRdb/2+1)=sum(sum(Corrected_bits~=Bits))/(Block_Num*n);   %coded BER
    end

    figure()  
    semilogy(0:2:10,ratio)
    hold on 
    semilogy(0:2:10,Coded_ratio)
    xlabel('SNRdB')
    ylabel('BER')
    legend('QPSK','Coded-QPSK')

Binary Cyclic Code

The binary cyclic code is another linear code, whose name comes from the fact that any cyclic right shift of a codeword is another codeword. The crucial idea is that each codeword in a cyclic code can be associated with a polynomial, whose coefficients correspond to the elements of the codeword. A codeword \(\mathbf{c}\) is represented as: \(\begin{equation} c(x) = c_0 + c_1x + c_2x^2 + \ldots + c_{n-1}x^{n-1} \end{equation}\) A cyclic shift to the right can then be represented in polynomial form as \(x\,c(x) \bmod (x^n-1)\).
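As a quick illustration (a minimal sketch with an arbitrary codeword), multiplying by \(x\) modulo \(x^n-1\) is exactly a circular shift of the coefficient vector:

    % A cyclic right shift of the coefficient vector equals x*c(x) mod (x^7 - 1):
    % the x^6 coefficient wraps around to become the constant term.
    c = [1 0 1 1 1 0 0];        % ascending coefficients of c(x)
    shifted = circshift(c, 1)   % ascending coefficients of x*c(x) mod (x^7 - 1)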

Encoding

Based on the previous discussion, a message \(\mathbf{m} = (m_0,m_1,\dots,m_{k-1})\) can also be represented by the polynomial \(m(x) = \sum_{i=0}^{k-1}m_i x^i\), and the corresponding codeword polynomial is obtained as: \(\begin{equation} c(x) = m(x)g(x) =\sum_{i=0}^{k-1}m_ix^i g(x) \end{equation}\) where \(g(x)\) is the degree-\((n-k)\) generator polynomial. In matrix form we then have \(\begin{equation} \mathbf{c} = (m_0,m_1,\dots,m_{k-1}) \begin{bmatrix} g_0 & g_1 & \cdots & g_{n-k} & 0 & \cdots & 0 \\ 0 & g_0 & g_1 & \cdots & g_{n-k} & \ddots & \vdots \\ \vdots & \ddots & \ddots & \ddots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & g_0 & g_1 & \cdots & g_{n-k} \end{bmatrix} \end{equation}\)
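Since polynomial multiplication is a convolution of coefficient vectors, encoding takes one line in MATLAB (a minimal sketch using the \((7,3)\) generator polynomial from the next subsection):

    % Cyclic encoding as polynomial multiplication over GF(2):
    % conv multiplies the polynomials; mod 2 reduces the coefficients.
    g = [1 0 1 1 1];         % g(x) = 1 + x^2 + x^3 + x^4 (ascending order)
    m = [1 0 1];             % m(x) = 1 + x^2
    c = mod(conv(m, g), 2)   % coefficients of c(x) = m(x)g(x), length n = 7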

Decoding

In principle, syndrome decoding can also be applied to binary cyclic codes. For example, assume a \((7,3)\) code with generator polynomial \(g(x) = x^4 + x^3 + x^2 + 1\); the corresponding parity check polynomial is calculated as \((x^7 + 1)/(x^4 + x^3 + x^2 + 1) = x^3 + x^2 + 1\), giving \(\begin{equation} H = \left[ \begin{array}{ccccccc} 1 & 1 & 0 & 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 & 1 & 0 & 1 \end{array} \right] \end{equation}\) Using traditional syndrome decoding, decoding a cyclic code can be accomplished with

    n = 7; % Codeword length
    k = 3; % Message length

    % Generator polynomial coefficients in ascending order, zero-padded to length n
    g = [1 0 1 1 1 0 0]; % Coefficients of g(x) = 1 + x^2 + x^3 + x^4

    % Create generator matrix G from the generator polynomial
    G = zeros(k, n);
    for i = 0:k-1
        G(i+1, :) = circshift(g,i);
    end


    % Create the parity-check matrix H
    % Parity check polynomial is (x^7 + 1)/(x^4 + x^3 + x^2 + 1) = (x^3 + x^2 +
    % 1)
    h = [1 1 0 1 0 0 0];
    H = zeros(n-k, n);
    for i = 0:n-k-1
        H(i+1, :) = circshift(h,i);
    end

    % Example message vector (random binary values of length k)
    msg = randi([0 1], 1, k);

    % Encode the message using the generator matrix
    codeword = mod(msg * G, 2);

    % Introduce a single bit error for testing
    error_position = randi([1 n]);
    received = codeword;
    received(error_position) = ~received(error_position);

    % Decode and correct the error
    syndrome = mod(received * H', 2);
    error_pattern = zeros(1, n);
    if any(syndrome)
        % Check each column of H to find a match with the syndrome
        for i = 1:n
            if all(H(:, i) == syndrome')
                error_pattern(i) = 1;
                break;
            end
        end
    end

    % Correct the codeword by flipping the error bit
    corrected_codeword = mod(received + error_pattern, 2);

However, the beauty of cyclic codes, from a hardware perspective, is that they can leverage shift registers in a systematic implementation. Here is a shift-register-based decoding algorithm for cyclic codes (a MATLAB sketch follows the list):

  1. Let \(i = 0\), compute the syndrome \(\mathbf{s}\) for a received codeword \(\mathbf{r}\)
  2. If \(\mathbf{s}\) is in the syndrome look-up table, go to 6.
  3. Let \(i = i+1\), shift a 0 into the register, and compute \(\mathbf{s}_i\)
  4. If \(\mathbf{s}_i\) is not in the syndrome look-up table, go to 3
  5. Let \(\mathbf{e}_i\) be the error pattern corresponding to syndrome \(\mathbf{s}_i\). Determine \(\mathbf{e}\) by cyclically shifting \(\mathbf{e}_i\) \(i\) times to the left.
  6. Let \(\mathbf{c} = \mathbf{r} - \mathbf{e}\) and output \(\mathbf{c}\)
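Below is a minimal MATLAB sketch of this shift-based decoder for the \((7,3)\) code above. Since that code corrects only a single error, the syndrome look-up table reduces to one entry, the syndrome of an error at the highest-degree position, \(x^6 \bmod g(x) = x + x^2 + x^3\); the helper polymod2 is my own stand-in for the shift-register division circuit.

    n = 7;
    g = [1 0 1 1 1];                 % ascending coefficients of g(x)
    s_table = [0 1 1 1];             % syndrome of e(x) = x^6, the only table entry

    r = [1 1 1 1 1 0 0];             % example: the codeword g(x) with its x bit flipped
    s = polymod2(r, g);              % step 1: initial syndrome
    i = 0;
    while any(s) && ~isequal(s, s_table) && i < n
        r = circshift(r, 1);         % steps 3-4: multiply r(x) by x mod (x^7 - 1)
        s = polymod2(r, g);          % and recompute the syndrome
        i = i + 1;
    end
    if any(s)
        r(n) = mod(r(n) + 1, 2);     % step 5: correct the highest-degree position
    end
    c = circshift(r, -i)             % step 6: shift back to undo the i rotations

    function s = polymod2(r, g)
    % Remainder of r(x) divided by g(x) over GF(2), ascending coefficients.
    s = r;
    for j = length(s):-1:length(g)
        if s(j)
            s(j-length(g)+1:j) = mod(s(j-length(g)+1:j) + g, 2);
        end
    end
    s = s(1:length(g)-1);
    end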

Reed-Solomon Code

The RS code is a special cyclic code: an \((n,k,n-k+1)\) linear code over a larger symbol alphabet \(\mathbb{F}_q\), meaning that it achieves the Singleton bound \(d \leq n-k+1\) with equality.
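Although the encoding and decoding details are still to be written below, the Communications Toolbox functions rsenc and rsdec already let us see the \(t = \lfloor (n-k)/2 \rfloor\) symbol-error correction in action (a minimal sketch; the message symbols are arbitrary):

    % (7,3) Reed-Solomon code over GF(2^3): corrects t = (n-k)/2 = 2 symbol errors.
    n = 7; k = 3;
    msg = gf([1 6 3], 3);        % k message symbols in GF(8)
    cw  = rsenc(msg, n, k);      % n-symbol RS codeword
    cw(2) = cw(2) + gf(4, 3);    % corrupt two symbols
    cw(5) = cw(5) + gf(1, 3);
    dec = rsdec(cw, n, k)        % recovers the original message symbols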

Encoding

TBC

Direct Method Decoding

Berlekamp-Massey Algorithm

Turbo Code

Turbo codes are a class of high-performance error-correcting codes introduced in 1993 by Claude Berrou and his colleagues. They have significantly improved the efficiency of communications over noisy channels, approaching the Shannon limit, the theoretical maximum rate of error-free communication at a given noise level.

Turbo codes are based on the principle of iterative decoding of parallel concatenated convolutional codes (PCCCs). The basic structure of a turbo code involves two or more convolutional codes (usually 2) separated by an interleaver.

Interleaver

An interleaver is a device that permutes the order of the input bits. The purpose of interleaving in turbo codes is to spread out error bursts that might occur in the channel, making it easier for the error correction process to handle them. Mathematically, we use \(\pi\) to represent the interleaver function.

Encoding

Given the message sequence \(\mathbf{m} = (m_1, m_2, \ldots, m_k)\), the encoding process involves the following steps:

  1. First Encoder Output: \(c_1 = G_1(D) \cdot m(D)\), where \(G_1(D)\) are the generator polynomials for the first encoder, and \(m(D)\) is the polynomial representation of \(m\).
  2. After applying the interleaver \(\pi\): \(m'(D) = \pi(m(D))\), where \(\pi\) permutes the bits of \(\mathbf{m}\) to produce \(\mathbf{m}'\).
  3. Second Encoder Output: \(c_2 = G_2(D) \cdot m'(D)\), where \(G_2(D)\) are the generator polynomials for the second encoder.

The final codeword is a combination of \(c_1\) and \(c_2\), and possibly the original message itself, depending on whether the turbo code is systematic (where the message forms part of the codeword).
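Here is a minimal MATLAB sketch of the parallel concatenation, reusing poly2trellis and convenc from the convolutional example; the recursive systematic generator (octal [13 15] with feedback 13) and the random interleaver are illustrative choices, not those of any particular standard:

    K = 64;
    m = randi([0 1], 1, K);                 % message bits
    piIdx = randperm(K);                    % interleaver pi as a permutation
    trellis = poly2trellis(4, [13 15], 13); % recursive systematic encoder
    c1 = convenc(m, trellis);               % encoder 1: systematic + parity stream
    c2 = convenc(m(piIdx), trellis);        % encoder 2 acts on interleaved bits
    p1 = c1(2:2:end);                       % parity stream of encoder 1
    p2 = c2(2:2:end);                       % parity stream of encoder 2
    codeword = [m, p1, p2];                 % systematic rate-1/3 turbo codeword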

BCJR

There are now many famed algorithms for decoding turbo codes; here we introduce the celebrated BCJR algorithm, which is also detailed in this tutorial.

The idea is the following: given a received codeword \(\mathbf{y}\), BCJR iteratively computes the a posteriori likelihood of the input bit at time \(t\) being either \(+1\) or \(-1\) (1 or 0) via the metric \(\begin{equation} L(u_t|\mathbf{y}) = \ln\left(\frac{P(u_t = +1|\mathbf{y})}{P(u_t = -1|\mathbf{y})}\right) \end{equation}\)

Recall trellis decoding from the convolutional code section: suppose we are at time \(t\) on the trellis in state \(s_t = s\). The received codeword \(\mathbf{y}\) can then be divided into three regions, prior, current, and posterior, in the following way: \(\begin{equation} \mathbf{y} = \mathbf{y}_{<t}\,y_t\,\mathbf{y}_{>t} \end{equation}\)

Averaging over all state transitions on the trellis, the metric, known as the log-likelihood ratio (LLR), can be expressed as \(\begin{equation} L(u_t|\mathbf{y}) = \ln\left(\frac{\sum_{s_{t-1}\rightarrow s_t |u_t = +1}P(s_{t-1},s_t,\mathbf{y})}{\sum_{s_{t-1}\rightarrow s_t |u_t = -1}P(s_{t-1},s_t,\mathbf{y})}\right) \end{equation}\) where \(P(s_{t-1},s_t,\mathbf{y}) = P(s_{t-1},\mathbf{y}_{<t})P(y_t,s_t|s_{t-1})P(\mathbf{y}_{>t}|s_t) = \alpha_{t-1}(s_{t-1})\gamma_t(s_{t-1},s_t)\beta_{t}(s_t)\), and we want to iteratively compute these three quantities. In the following, for simplicity, let \(s = s_t\) and \(s' = s_{t-1}\).

  1. Initialization: \(\alpha_0(s) = \begin{cases} 1 & \text{if } s = 0 \\ 0 & \text{otherwise} \end{cases}\) and \(\beta_N(s) = \begin{cases} 1 & \text{if } s = 0 \\ 0 & \text{otherwise} \end{cases}\), where \(N\) is the length of the trellis.
  2. Calculate \(\gamma\): the gamma values are calculated from the transition probabilities between states and the observed received symbols: \(\gamma_t(s', s) = \exp\left(\sum_{i=1}^{n} \frac{y_{t,i} \cdot x_{t,i}(s', s) - x_{t,i}^2(s', s)/2}{2\sigma^2}\right)\) Here, \(x_{t,i}(s', s)\) is the \(i\)-th coded bit for the transition from state \(s'\) to \(s\) at time \(t\).
  3. Iteratively (forward and backward) calculate \(\alpha\) and \(\beta\): \(\alpha_t(s) = \sum_{s'} \alpha_{t-1}(s') \gamma_t(s', s)\) and \(\beta_{t-1}(s') = \sum_{s} \beta_t(s) \gamma_t(s', s)\)
  4. Calculate the LLR: \(\text{LLR}(u_t) = \log \frac{\sum_{s, s' : x_t(s', s) = 1} \alpha_{t-1}(s') \gamma_t(s', s) \beta_t(s)}{\sum_{s, s' : x_t(s', s) = 0} \alpha_{t-1}(s') \gamma_t(s', s) \beta_t(s)}\)
  5. Normalize for stability: \(\alpha_t(s) \leftarrow \frac{\alpha_t(s)}{\sum_{s'} \alpha_t(s')}\) and \(\beta_t(s) \leftarrow \frac{\beta_t(s)}{\sum_{s'} \beta_t(s')}\). Normalization prevents the probabilities from becoming too small or too large, which could lead to underflow or overflow in digital computations.
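To make the recursions concrete, here is a minimal MATLAB sketch of the forward-backward pass on a toy 2-state trellis; the branch metrics gamma and the input labeling u are stand-ins (random numbers and a hypothetical labeling), not a real turbo constituent code:

    % Forward-backward (BCJR-style) recursions on a toy 2-state trellis.
    T = 6; S = 2;
    gamma = rand(T, S, S);                  % stand-in branch metrics gamma_t(s', s)
    u = [0 1; 1 0];                         % hypothetical input label per transition
    alpha = zeros(T+1, S); alpha(1,1) = 1;  % start in state 1
    beta  = zeros(T+1, S); beta(T+1,1) = 1; % terminated in state 1
    for t = 1:T                             % forward recursion with normalization
        for s2 = 1:S
            alpha(t+1,s2) = alpha(t,:) * squeeze(gamma(t,:,s2));
        end
        alpha(t+1,:) = alpha(t+1,:) / sum(alpha(t+1,:));
    end
    for t = T:-1:1                          % backward recursion with normalization
        for s1 = 1:S
            beta(t,s1) = squeeze(gamma(t,s1,:)).' * beta(t+1,:).';
        end
        beta(t,:) = beta(t,:) / sum(beta(t,:));
    end
    LLR = zeros(1,T);                       % per-bit log-likelihood ratios
    for t = 1:T
        num = 0; den = 0;
        for s1 = 1:S
            for s2 = 1:S
                p = alpha(t,s1) * gamma(t,s1,s2) * beta(t+1,s2);
                if u(s1,s2) == 1, num = num + p; else, den = den + p; end
            end
        end
        LLR(t) = log(num/den);
    end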

LDPC Code

There is an excellent tutorial on LDPC codes by our very own Prof. John Barry.



