We are now at the stage where we are able to generate linear codes that detect and correct errors fairly easily, but it is still a time-consuming process to decode a received \(n\)-tuple and determine which is the closest codeword, because the received \(n\)-tuple must be compared to each possible codeword to determine the proper decoding. This can be a serious impediment if the code is very large.

Example8.35

Given the binary matrix
\begin{equation*}H =
\begin{pmatrix}
1 & 1 & 1 & 0 & 0 \\
0 & 1 & 0 & 1 & 0 \\
1 & 0 & 0 & 0 & 1
\end{pmatrix}\end{equation*}
and the 5-tuples \({\mathbf x} = (11011)^{\rm t}\) and \({\mathbf y} = (01011)^{\rm t}\), we can compute
\begin{equation*}H{\mathbf x} =
\begin{pmatrix}
0 \\ 0 \\ 0
\end{pmatrix}
\qquad
\text{and}
\qquad
H{\mathbf y} =
\begin{pmatrix}
1 \\ 0 \\ 1
\end{pmatrix}.\end{equation*}
Hence, \({\mathbf x}\) is a codeword and \({\mathbf y}\) is not, since \({\mathbf x}\) is in the null space and \({\mathbf y}\) is not. Notice that \(H{\mathbf y}\) is identical to the first column of \(H\). In fact, this is where the error occurred. If we flip the first bit in \({\mathbf y}\) from 0 to 1, then we obtain \({\mathbf x}\).

If \(H\) is an \(m \times n\) matrix and \({\mathbf x} \in {\mathbb Z}_2^n\), then we say that the syndrome of \({\mathbf x}\) is \(H{\mathbf x}\). The following proposition allows the quick detection and correction of errors.

Proposition8.36

Let the \(m \times n\) binary matrix \(H\) determine a linear code and let \({\mathbf x}\) be the received \(n\)-tuple. Write \({\mathbf x}\) as \({\mathbf x} = {\mathbf c} +{\mathbf e}\), where \({\mathbf c}\) is the transmitted codeword and \({\mathbf e}\) is the transmission error. Then the syndrome \(H{\mathbf x}\) of the received codeword \({\mathbf x}\) is also the syndrome of the error \({\mathbf e}\).

The proof follows from the fact that
\begin{equation*}H{\mathbf x} = H({\mathbf c} +{\mathbf e}) = H{\mathbf c} + H{\mathbf e} = {\mathbf 0} + H{\mathbf e} = H{\mathbf e}.\end{equation*}

This proposition tells us that the syndrome of a received word depends solely on the error and not on the transmitted codeword. The proof of the following theorem follows immediately from Proposition 8.36 and from the fact that \(H{\mathbf e}\) is the \(i\)th column of the matrix \(H\).

Theorem8.37

Let \(H \in {\mathbb M}_{ m \times n} ( {\mathbb Z}_2)\) and suppose that the linear code corresponding to \(H\) is single error-correcting. Let \({\mathbf r}\) be a received \(n\)-tuple that was transmitted with at most one error. If the syndrome of \({\mathbf r}\) is \({\mathbf 0}\), then no error has occurred; otherwise, if the syndrome of \({\mathbf r}\) is equal to some column of \(H\), say the \(i\)th column, then the error has occurred in the \(i\)th bit.

Example8.38

Consider the matrix
\begin{equation*}H =
\begin{pmatrix}
1 & 0 & 1 & 1 & 0 & 0 \\
0 & 1 & 1 & 0 & 1 & 0 \\
1 & 1 & 1 & 0 & 0 & 1
\end{pmatrix}\end{equation*}
and suppose that the 6-tuples \({\mathbf x} = (111110)^{\rm t}\), \({\mathbf y} = (111111)^{\rm t}\), and \({\mathbf z} = (010111)^{\rm t}\) have been received. Then
\begin{equation*}H{\mathbf x} =
\begin{pmatrix}
1 \\ 1 \\ 1
\end{pmatrix},
H{\mathbf y} =
\begin{pmatrix}
1 \\ 1 \\ 0
\end{pmatrix},
H{\mathbf z} =
\begin{pmatrix}
1 \\ 0 \\ 0
\end{pmatrix}.\end{equation*}
Hence, \({\mathbf x}\) has an error in the third bit and \({\mathbf z}\) has an error in the fourth bit. The transmitted codewords for \({\mathbf x}\) and \({\mathbf z}\) must have been \((110110)\) and \((010011)\), respectively. The syndrome of \({\mathbf y}\) does not occur in any of the columns of the matrix \(H\), so multiple errors must have occurred to produce \({\mathbf y}\).

We can use group theory to obtain another way of decoding messages. A linear code \(C\) is a subgroup of \({\mathbb Z}_2^n\). Coset or standard decoding uses the cosets of \(C\) in \({\mathbb Z}_2^n\) to implement maximum-likelihood decoding. Suppose that \(C\) is an \((n,m)\)-linear code. A coset of \(C\) in \({\mathbb Z}_2^n\) is written in the form \({\mathbf x} + C\), where \({\mathbf x} \in {\mathbb Z}_2^n\). By Lagrange's Theorem (Theorem 6.10), there are \(2^{n-m}\) distinct cosets of \(C\) in \({\mathbb Z}_2^n\).

Example8.39

Let \(C\) be the \((5,3)\)-linear code given by the parity-check matrix
\begin{equation*}H =
\begin{pmatrix}
0 & 1 & 1 & 0 & 0 \\
1 & 0 & 0 & 1 & 0 \\
1 & 1 & 0 & 0 & 1
\end{pmatrix}.\end{equation*}
The code consists of the codewords
\begin{equation*}(00000) \quad (01101) \quad (10011) \quad (11110).\end{equation*}
There are \(2^{5-2} = 2^3\) cosets of \(C\) in \({\mathbb Z}_2^5\), each with order \(2^2 =4\). These cosets are listed in Table 8.40.

Coset

Coset

Representative

\(C\)

(00000) (01101) (10011) (11110)

\((10000) + C\)

(10000) (11101) (00011) (01110)

\((01000) + C\)

(01000) (00101) (11011) (10110)

\((00100) + C\)

(00100) (01001) (10111) (11010)

\((00010) + C\)

(00010) (01111) (10001) (11100)

\((00001) + C\)

(00001) (01100) (10010) (11111)

\((10100) + C\)

(00111) (01010) (10100) (11001)

\((00110) + C\)

(00110) (01011) (10101) (11000)

Table8.40Cosets of \(C\)

Our task is to find out how knowing the cosets might help us to decode a message. Suppose that \({\mathbf x}\) was the original codeword sent and that \({\mathbf r}\) is the \(n\)-tuple received. If \({\mathbf e}\) is the transmission error, then \({\mathbf r} = {\mathbf e} + {\mathbf x}\) or, equivalently, \({\mathbf x} = {\mathbf e} + {\mathbf r}\). However, this is exactly the statement that \({\mathbf r}\) is an element in the coset \({\mathbf e} + C\). In maximum-likelihood decoding we expect the error \({\mathbf e}\) to be as small as possible; that is, \({\mathbf e}\) will have the least weight. An \(n\)-tuple of least weight in a coset is called a coset leader. Once we have determined a coset leader for each coset, the decoding process becomes a task of calculating \({\mathbf r} + {\mathbf e}\) to obtain \({\mathbf x}\).

Example8.41

In Table 8.40, notice that we have chosen a representative of the least possible weight for each coset. These representatives are coset leaders. Now suppose that \({\mathbf r} = (01111)\) is the received word. To decode \({\mathbf r}\), we find that it is in the coset \((00010) + C\); hence, the originally transmitted codeword must have been \((01101) = (01111) + (00010)\).

A potential problem with this method of decoding is that we might have to examine every coset for the received codeword. The following proposition gives a method of implementing coset decoding. It states that we can associate a syndrome with each coset; hence, we can make a table that designates a coset leader corresponding to each syndrome. Such a list is called a decoding table.

Syndrome

Coset Leader

(000)

(00000)

(001)

(00001)

(010)

(00010)

(011)

(10000)

(100)

(00100)

(101)

(01000)

(110)

(00110)

(111)

(10100)

Table8.42Syndromes for each coset

Proposition8.43

Let \(C\) be an \((n,k)\)-linear code given by the matrix \(H\) and suppose that \({\mathbf x}\) and \({\mathbf y}\) are in \({\mathbb Z}_2^n\). Then \({\mathbf x}\) and \({\mathbf y}\) are in the same coset of \(C\) if and only if \(H{\mathbf x} = H{\mathbf y}\). That is, two \(n\)-tuples are in the same coset if and only if their syndromes are the same.

Two \(n\)-tuples \({\mathbf x}\) and \({\mathbf y}\) are in the same coset of \(C\) exactly when \({\mathbf x} - {\mathbf y} \in C\); however, this is equivalent to \(H({\mathbf x} - {\mathbf y}) = 0\) or \(H {\mathbf x} = H{\mathbf y}\).

Example8.44

Table 8.42 is a decoding table for the code \(C\) given in Example 8.39. If \({\mathbf x} = (01111)\) is received, then its syndrome can be computed to be
\begin{equation*}H {\mathbf x} =
\begin{pmatrix}
0 \\ 1 \\ 1
\end{pmatrix}.\end{equation*}
Examining the decoding table, we determine that the coset leader is \((00010)\). It is now easy to decode the received codeword.

Given an \((n,k)\)-block code, the question arises of whether or not coset decoding is a manageable scheme. A decoding table requires a list of cosets and syndromes, one for each of the \(2^{n - k}\) cosets of \(C\). Suppose that we have a \((32, 24)\)-block code. We have a huge number of codewords, \(2^{24}\), yet there are only \(2^{32 - 24} = 2^{8} = 256\) cosets.