Applied Discrete Structures

Section 15.5 Coding Theory, Linear Codes

A Transmission Problem.

In this section, we will introduce the basic ideas involved in coding theory and consider solutions of a coding problem by means of linear codes.
Imagine a situation in which information is being transmitted between two points. The information takes the form of high and low pulses (for example, radio waves or electric currents), which we will label 1 and 0, respectively. As these pulses are sent and received, they are grouped together in blocks of fixed length. The length determines how much information can be contained in one block. If the length is \(r\text{,}\) there are \(2^r\) different values that a block can have. If the information being sent takes the form of text, each block might be a character. In that case, the length of a block may be seven, so that \(2^7 = 128\) block values can represent letters (both upper and lower case), digits, punctuation, and so on. During the transmission of data, noise can alter the signal so that what is received differs from what is sent. Figure 15.5.1 illustrates the problem that can be encountered if information is transmitted between two points.
Figure 15.5.1. A noisy transmission
Noise is a fact of life for anyone who tries to transmit information. Fortunately, in most situations we could expect a high percentage of the pulses that are sent to be received properly. However, when large numbers of pulses are transmitted, there are usually some errors due to noise. For the remainder of the discussion, we will make assumptions about the nature of the noise and the message that we want to send. Henceforth, we will refer to the pulses as bits.
We will assume that our information is being sent along a binary symmetric channel. By this, we mean that any single bit that is transmitted will be received improperly with a certain fixed probability, \(p\text{,}\) independent of the bit value. The magnitude of \(p\) is usually quite small. To illustrate the process, we will assume that \(p = 0.001\text{,}\) which, in the real world, would be considered somewhat large. Since \(1 - p = 0.999\text{,}\) we can expect \(99.9\%\) of all bits to be properly received.
In addition to assuming \(p=0.001\) throughout, we will also suppose that our message consists of 3,000 bits of information. Two factors will be considered in evaluating a method of transmission. The first is the probability that the message is received with no errors. The second is the number of bits that will be transmitted in order to send the message. This quantity is called the rate of transmission:
\begin{equation*} \textrm{ Rate}= \frac{\textrm{ Message length}}{\textrm{ Number of bits transmitted}} \end{equation*}
As you might expect, as we devise methods to improve the probability of success, the rate will decrease.
Suppose that we ignore the noise and transmit the message without any coding. The probability of success is \(0.999^{3000}= 0.0497124\text{.}\) Therefore we only successfully receive the message in a totally correct form less than \(5\%\) of the time. The rate of \(\frac{3000}{3000} = 1\) certainly doesn’t offset this poor probability.
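This figure is easy to check directly; here is a one-line computation in Python, which we will use for short computational sketches throughout this section:

```python
# Probability that all 3000 bits are received correctly with no coding.
print(0.999 ** 3000)  # approximately 0.0497
```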
Our strategy for improving our chances of success will be to send an encoded message. The encoding will be done in such a way that small errors can be identified and corrected. This idea is illustrated in Figure 15.5.2.
Figure 15.5.2. The Coding Process
In all of our examples, the functions that will correspond to our encoding devices will involve multiplication of messages by matrices using mod 2 arithmetic. First we will introduce some geometric ideas to make the process more intuitive.

Subsection 15.5.1 Introduction

Although we’ll be using algebra to help improve communications, the basic solution can be imagined from a geometric point of view. For any positive integer \(n\text{,}\) we define a distance function on the elements of the group \(\mathbb{Z}_2{}^n\text{.}\) This distance is called the Hamming Distance.

Definition 15.5.3. Hamming Distance.

Given two elements of \(\mathbb{Z}_2{}^n\text{,}\) \(a\) and \(b\text{,}\) the Hamming Distance, \(d_H(a,b)\) between them is the number of positions in which they differ.
For example, \(d_H((1,1,0,0),(1,1,0,1))=1\) since these two elements of \(\mathbb{Z}_2{}^4\) differ in just the last position; and \(d_H((1,1,0,0),(1,1,0,0))=0\text{.}\) Notice that we can compute the distance between two bit strings by adding them coordinatewise in the Cartesian product and counting the number of \(1\)’s that appear in the sum. For example, \((1,1,0,0)+(1,0,0,1)=(0,1,0,1)\text{.}\) The sum has two \(1\)’s, so the distance between \((1,1,0,0) \textrm{ and }(1,0,0,1)\) is \(2\text{.}\) In addition, the locations of the \(1\)’s in the sum tell us where the two bit strings differ.
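This coordinatewise computation translates directly into a short Python sketch, with bit strings represented as tuples of 0s and 1s:

```python
def hamming_distance(a, b):
    """Number of positions in which bit strings a and b differ,
    found by adding coordinatewise mod 2 and counting the 1's."""
    return sum((x + y) % 2 for x, y in zip(a, b))

print(hamming_distance((1, 1, 0, 0), (1, 0, 0, 1)))  # 2
```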
When we look at groups like \(\mathbb{Z}_2{}^4\) from this geometric point of view, we refer to these sets as metric spaces or simply spaces. In the case of \(\mathbb{Z}_2{}^4\text{,}\) there are just \(2^4=16\) points in the space and the maximum distance between the points is \(4\text{.}\) More generally, \(\mathbb{Z}_2{}^n\) has \(2^n\) points and the maximum distance between points in that space is \(n\text{.}\) Viewed this way, the group \(\mathbb{Z}_2{}^n\) is essentially the same as the \(n\)-cube (9.4.17) we considered in discussing Hamiltonian graphs. In this section we will use \(n\)-tuples such as \((1,1,0,1)\) interchangeably with strings of bits such as \(1101\text{.}\)
For any distance \(r\) in a space, the ball of radius \(r\) centered at a point \(a\text{,}\) denoted \(B_r(a)\text{,}\) is the set of all points whose distance from \(a\) is \(r\) or less. For example, in the space \(\mathbb{Z}_2{}^4\text{,}\)
\begin{equation*} B_1((1,1,0,0))=\{(1,1,0,0), (0,1,0,0), (1,0,0,0), (1,1,1,0), (1,1,0,1)\}. \end{equation*}
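For a space as small as \(\mathbb{Z}_2{}^4\text{,}\) a ball can be enumerated by brute force. A sketch, reusing hamming_distance from the previous sketch:

```python
from itertools import product

def ball(a, r):
    """All points of the space whose distance from a is r or less."""
    n = len(a)
    return [p for p in product((0, 1), repeat=n)
            if hamming_distance(p, a) <= r]

print(ball((1, 1, 0, 0), 1))
# [(0, 1, 0, 0), (1, 0, 0, 0), (1, 1, 0, 0), (1, 1, 0, 1), (1, 1, 1, 0)]
```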
The ultimate goal of our encoding will be to take a set of possible messages, the message space, and distribute them in a larger space, the code space, in such a way that each encoded message, called a code word, is at least a certain distance away from any other code word. The minimum distance between the code words will determine whether we can correct errors or just detect them. Now let’s turn to some examples.

Subsection 15.5.2 Error Detection

Suppose that each block of three bits \(a = \left(a_1, a_2 , a_3 \right)\) is encoded with the function \(e: \mathbb{Z}_2{}^3\to \mathbb{Z}_2{}^4\text{,}\) where
\begin{equation*} e(a) = \left(a _1, a _2 , a _3, a_1+_2a_2+_2a_3 \right) \end{equation*}
The fourth bit of \(e(a)\) is called the parity-check bit. When the encoded block is received, the four bits will probably all be correct (they are correct approximately \(99.6\%\) of the time under our assumed parameters), but the added bit that is sent will make it possible to detect single bit errors in the block. Note that when \(e(a)\) is transmitted, the sum of its components is \(a_1+_2 a_2 +_2 a_3+_2 \left( a_1+_2 a_2+_2 a_3\right)= 0\text{,}\) since \(a_i+a_i=0\) in \(\mathbb{Z}_2\text{.}\)
If any single bit is garbled by noise, the sum of the received bits will be 1; when this happens, we say a parity error has occurred. Since more than one error in a block is unlikely when \(p\) is small, a high percentage of all errors can be detected.
At the receiving end, the decoding function acts on the four-bit block \(b = \left(b_1,b _2 ,b_3,b_4 \right)\) with the function \(d: \mathbb{Z}_2{}^4\to \mathbb{Z}_2^4\text{,}\) where
\begin{equation*} d(b) = \left(b_1,b _2 ,b_3,b_1+_2b _2 +_2b_3+_2b_4 \right) \end{equation*}
Notice that the fourth bit of \(d(b)\) is an indicator of whether there is a parity error: 0 if no error, and 1 if an error. If no parity error occurs, the first three bits are recorded as part of the message. If a parity error occurs, we will assume that a retransmission of that block can be requested. This request can take the form of automatically having the parity-check bit of \(d(b)\) sent back to the source. If 1 is received, the previous block is retransmitted; if 0 is received, the next block is sent. This assumption of two-way communication is significant, but it is necessary to make this coding system useful. For our calculations, it is reasonable to expect that the probability of a transmission error in the opposite direction is also 0.001. Without going into the details, we will report that the probability of success in sending 3000 bits is approximately 0.990 and the rate is approximately \(3/5\text{.}\) The rate accounts for the transmission of the parity-check bit back to the source and is only approximate because the resent blocks will decrease the rate somewhat below \(3/5\text{.}\)
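The two formulas above translate directly into code. A minimal sketch, with blocks represented as tuples:

```python
def e(a):
    """Encode a three-bit block by appending a parity-check bit."""
    a1, a2, a3 = a
    return (a1, a2, a3, (a1 + a2 + a3) % 2)

def d(b):
    """Decode a four-bit block; the fourth bit of the result is 1
    exactly when a parity error has been detected."""
    b1, b2, b3, b4 = b
    return (b1, b2, b3, (b1 + b2 + b3 + b4) % 2)

print(d(e((1, 0, 1))))   # (1, 0, 1, 0): no parity error
print(d((1, 1, 1, 0)))   # (1, 1, 1, 1): parity error detected
```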
Figure 15.5.4. The \(4\)-cube with code words displayed as larger vertices
Let’s consider the geometry of this code. If we examine the \(4\)-cube in Figure 15.5.4, the code words are the strings of four bits with an even number of ones. These vertices are the larger ones. Notice that the ball of radius 1 centered around any of the code words consists of that code word and the smaller vertices that are connected to the code word with an edge of the \(4\)-cube. Since there are no other code words in the ball, a single bit error produces a non-code word and so an error can be detected.

Subsection 15.5.3 Error Correction

Next, we will consider coding functions that allow us to correct errors at the receiving end so that only one-way communication is needed. Before we begin, recall that every element of \(\mathbb{Z}_2{}^n\text{,}\) \(n \geq 1\text{,}\) is its own inverse; that is, \(-b = b\text{.}\) Therefore, \(a - b = a + b\text{.}\)

Example 15.5.5. The Triple Repetition Code.

Suppose we take each individual bit in our message and encode it by repeating it three times. In other words, if \(a\) is a single bit, \(e(a)=(a,a,a)\text{.}\) The code words for this code are \((0,0,0)\) and \((1,1,1)\text{.}\) Let’s look at the geometry behind this code. The message space has just two points, but the code space is \(\mathbb{Z}_2{}^3\text{,}\) which has 8 points, the vertices of the \(3\)-cube, which appears in Figure 15.5.6.
Figure 15.5.6. The \(3\)-cube with code words displayed as circular vertices
In the figure for this code, the code words are the circular vertices. If we identify the balls of radius 1 centered around the two code words, notice that the two balls do not intersect. Besides its code word, each ball contains three vertices, drawn with triangular, square, and pentagonal shapes. From a geometric point of view, this is why we can correct a single bit error: any string of three bits that is received lies in exactly one of the two balls, and the code word in that ball is taken to be the one that was transmitted.
Regarding the actual correction process, the shapes have a meaning, as outlined in the following list.
  • Circle: No correction needed
  • Pentagon: Correct the first bit
  • Square: Correct the second bit
  • Triangle: Correct the third bit
Of course, once the correction is made, only the first bit is extracted from the code word since all bits will be equal. The simplicity of the final result masks an important property of all error correcting codes we consider. All of the possible points in the code space can be partitioned in such a way that each block in the partition corresponds with a specific correction that we can make to recover the correct code word.
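In code, the correction amounts to a majority vote among the three received bits. A minimal sketch:

```python
def encode_rep(a):
    """Triple repetition: a single bit becomes (a, a, a)."""
    return (a, a, a)

def decode_rep(b):
    """A majority vote corrects any single bit error."""
    return 1 if sum(b) >= 2 else 0

print(decode_rep((1, 0, 1)))  # 1: the second bit is corrected
print(decode_rep((0, 0, 1)))  # 0: the third bit is corrected
```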
If you have read about cosets, you will see that the partition we refer to is the set of left cosets of the set of code words.
Triple repetition is effective, but not very efficient, since its rate is quite low: \(1/3\text{.}\) Next we consider a slightly more efficient error correcting code based on matrix multiplication. Any such code that is computed with a matrix multiplication is called a linear code. We should point out that both the parity check code and the triple repetition code are linear codes. For the parity check code, the encoding function can be thought of as acting on a \(1 \times 3\) row vector \(a=(a_1,a_2,a_3)\) by multiplying it by a \(3 \times 4\) matrix:
\begin{equation*} e(a) = (a_1,a_2,a_3) \left( \begin{array}{cccc} 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 \\ \end{array} \right)=(a_1,a_2,a_3,a_1 +_2 a_2+_2 a_3) \end{equation*}
For triple repetition, the encoding function can be thought of as acting on a \(1 \times 1\) matrix \(a\) by multiplying it by a \(1 \times 3\) matrix:
\begin{equation*} e(a) = \left(a\right) \left( \begin{array}{ccc} 1 & 1 & 1 \\ \end{array} \right)= \left( \begin{array}{ccc} a & a & a \\ \end{array} \right) \end{equation*}
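Both encodings can be carried out as matrix products over \(\mathbb{Z}_2\text{.}\) A sketch of the parity check encoding, using a small helper for mod 2 matrix multiplication:

```python
def mat_mult_mod2(v, M):
    """Row vector v times matrix M, with all arithmetic mod 2."""
    return tuple(sum(v[i] * M[i][j] for i in range(len(v))) % 2
                 for j in range(len(M[0])))

G_parity = [[1, 0, 0, 1],
            [0, 1, 0, 1],
            [0, 0, 1, 1]]

print(mat_mult_mod2((1, 0, 1), G_parity))  # (1, 0, 1, 0)
```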

Example 15.5.7. A Somewhat More Efficient Linear Code.

The encoding that we will consider here takes a block \(a = \left(a_1, a_2, a_3 \right)\) and produces a code word of length 6. As in the triple repetition code, each code word will differ from each other code word by at least three bits. As a result, any single error will not push a code word close enough to another code word to cause confusion. Now for the details.
Let
\begin{equation*} G=\left( \begin{array}{cccccc} 1 & 0 & 0 & 1 & 1 & 0 \\ 0 & 1 & 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 0 & 1 & 1 \\ \end{array} \right). \end{equation*}
We call \(G\) the generator matrix for the code, and let \(a = \left(a_1, a_2, a_3 \right)\) be our message. Define \(e : \mathbb{Z}_2{}^3 \to \mathbb{Z}_2{}^6\) by
\begin{equation*} e(a) = a G = \left(a_1, a_2, a_3,a_4, a_5, a_6\right) \end{equation*}
where
\begin{equation*} \begin{array}{r@{}r@{}r@{}r@{}r@{}r@{}r@{}} a_4 &= & a_1 & {}+_2{} & a_2 & & \\ a_5 &= & a_1 & & &{}+_2{} & a_3 \\ a_6 &= & & & a_2 &{}+_2{} & a_3 \\ \end{array} \end{equation*}
Notice that since matrix multiplication is distributive over addition, we have
\begin{equation*} e(a + b)= (a+b)G = aG + bG = e(a)+e(b) \end{equation*}
for all \(a, b \in \mathbb{Z}_2{}^3\text{.}\) This equality may look familiar from the definition of an isomorphism, but in this case the function \(e\) is not onto. If you’ve read about homomorphisms, this is indeed an example of one.
One way to see that any two distinct code words have a distance from one another of at least 3 is to consider the images of any two distinct messages. If \(a\) and \(b\) are distinct elements of \(\mathbb{Z}_2{}^3\text{,}\) then \(c = a + b\) has at least one coordinate equal to 1. Now consider the difference between \(e(a)\) and \(e(b)\text{:}\)
\begin{equation*} \begin{split} e(a) + e(b) &= e(a + b) \\ & = e(c)\\ \end{split} \end{equation*}
Whether \(c\) has 1, 2, or 3 ones, \(e(c)\) must have at least three ones. This can be seen by considering the three cases separately; for example, if \(c\) has a single one, two of the parity bits are also 1. Therefore, \(e(a)\) and \(e(b)\) differ in at least three bits. By the same logic as with triple repetition, a single bit error in any code word produces an element of the code space that is contained in one of the balls of radius 1 centered about a code word.
Now consider the problem of decoding received transmissions. Imagine that a code word, \(e(a)\text{,}\) is transmitted, and \(b= \left(b_1, b_2, b_3,b_4, b_5, b_6\right)\) is received. At the receiving end, we know the formula for \(e(a)\text{,}\) and if no error has occurred in transmission,
\begin{equation*} \begin{array}{c} b_1= a_1 \\ b_2=a_2 \\ b_3=a_3 \\ b_4=a_1+_2a_2 \\ b_5=a_1+_2a_3 \\ b_6=a_2+_2a_3 \\ \end{array} \Rightarrow \begin{array}{r@{}r@{}r@{}r@{}r@{}r@{}r@{}r@{}r@{}r@{}r@{}r@{}} b_1 & +_2 & b_2 & & &+_2& b_4& & & & &=0\\ b_1 & & & +_2 &b_3& & & +_2& b_5 & & &=0\\ & & b_2 & +_2 &b_3& & & & &+_2 &b_6&=0\\ \end{array} \end{equation*}
The three equations on the right are called parity-check equations. If any of them are not true, an error has occurred. This error checking can be described in matrix form.
Let
\begin{equation*} H=\left( \begin{array}{ccc} 1 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ \end{array} \right) \end{equation*}
The matrix \(H\) is called the parity-check matrix for this code. Now define \(p:\mathbb{Z}_2{}^6 \to \mathbb{Z}_2{}^3\) by \(p(b) = b H\text{.}\) We call \(p(b)\) the syndrome of the received block. For example, \(p(0,1,0,1,0,1)=(0,0,0)\) and \(p(1,1,1,1,0,0)=(1,0,0)\text{.}\)
Note that \(p\) has the same additive property as \(e\text{;}\) that is, \(p(b_1+b_2)=p(b_1)+p(b_2)\text{.}\) If the syndrome of a block is \((0, 0, 0)\text{,}\) we can be almost certain that the message block is \(\left(b_1, b_2, b_3\right)\text{.}\)
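The syndrome is another mod 2 matrix product. A sketch, reusing mat_mult_mod2 from the earlier sketch:

```python
H = [[1, 1, 0],
     [1, 0, 1],
     [0, 1, 1],
     [1, 0, 0],
     [0, 1, 0],
     [0, 0, 1]]

def p(b):
    """Syndrome of a received six-bit block: p(b) = bH."""
    return mat_mult_mod2(b, H)

print(p((0, 1, 0, 1, 0, 1)))  # (0, 0, 0)
print(p((1, 1, 1, 1, 0, 0)))  # (1, 0, 0)
```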
Next we turn to the method of correcting errors. Despite the fact that there are only eight code words, one for each three-bit block value, the set of possible received blocks is \(\mathbb{Z}_2{}^6\text{,}\) with 64 elements. Suppose that \(b\) is not a code word, but that it differs from a code word by exactly one bit. In other words, it is the result of a single error in transmission. Suppose that \(w\) is the code word that \(b\) is closest to and that they differ in the first bit. Then \(b + w = (1, 0, 0, 0, 0, 0)\) and
\begin{equation*} \begin{split} p(b) & = p(b) + p(w)\quad \textrm{ since }p(w) = (0, 0, 0)\\ &=b H + w H \\ &=(b+w)H\quad \textrm{ by the distributive property}\\ &=p(b+w)\\ & =p(1,0,0,0,0,0)\\ & =(1,1,0)\\ \end{split} \end{equation*}
This is the first row of \(H\text{!}\)
Note that we haven’t specified \(b\) or \(w\text{,}\) only that they differ in the first bit. Therefore, if \(b\) is received and \(p(b)= (1, 1, 0)\text{,}\) there was probably an error in the first bit: the transmitted code word was probably \(b + (1, 0, 0, 0, 0, 0)\) and the message block was \(\left(b_1+_2 1, b_2, b_3\right)\text{.}\) The same analysis can be done if \(b\) and \(w\) differ in any of the other five bits.
In general, if the syndrome of a received string of bits is the \(k\)th row of the parity check matrix, the error has occurred in the \(k\)th bit.
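This observation gives a complete decoding procedure: compute the syndrome and, if it is nonzero, flip the bit whose row of \(H\) it matches. A sketch, using p and H from the previous sketch:

```python
def correct(b):
    """Correct a single bit error; return None when the syndrome
    matches no row of H (more than one error has occurred)."""
    s = p(b)
    if s == (0, 0, 0):
        return tuple(b)            # no error detected
    for k in range(len(H)):
        if s == tuple(H[k]):       # syndrome is the (k+1)st row of H
            c = list(b)
            c[k] = (c[k] + 1) % 2  # flip bit k + 1
            return tuple(c)
    return None

print(correct((1, 1, 1, 1, 0, 0)))  # (1, 1, 1, 0, 0, 0): message (1, 1, 1)
```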
Probability Epilog. For the two error correction examples we’ve looked at, we can compare their probabilities of successfully receiving all 3000 bits correctly over a binary symmetric channel with \(p=0.001.\)
For the triple repetition code, the probability is
\begin{equation*} \left(0.999^3 + 3 \cdot 0.999^2 \cdot 0.001\right)^{3000}= 0.991, \end{equation*}
and the rate of this code is \(\frac{1}{3}\text{,}\) which means we need to transmit 9000 bits.
For the second code, the probability of success is
\begin{equation*} \left(0.999^6 + 6 \cdot 0.999^5 \cdot 0.001\right)^{1000}=0.985\text{,} \end{equation*}
and the rate for this code is \(\frac{1}{2}\text{,}\) which means we need to transmit 6000 bits.
Clearly, there is a trade-off between accuracy and speed.
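Both probabilities reflect the fact that a block is decoded correctly exactly when it suffers zero errors or one error. A quick check of the arithmetic:

```python
q, perr = 0.999, 0.001
print((q**3 + 3 * q**2 * perr) ** 3000)  # ~0.991: triple repetition, 3000 blocks
print((q**6 + 6 * q**5 * perr) ** 1000)  # ~0.985: length-six code, 1000 blocks
```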

Example 15.5.8. Another Linear Code.

Consider the linear code with generator matrix
\begin{equation*} G = \left( \begin{array}{ccccc} 1 & 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 1 \\ \end{array} \right). \end{equation*}
Since \(G\) is \(3 \times 5\text{,}\) this code encodes three bits into five bits. The natural question to ask is what detection or correction does it afford? We can answer this question by constructing the parity check matrix. We observe that if \(\vec{a}=(a_1, a_2, a_3)\) the encoding function is
\begin{equation*} e(\vec{a}) = \vec{a}G = (a_1,a_1+a_2,a_2,a_1+a_3,a_3) \end{equation*}
where addition is mod 2 addition. If we receive five bits \((c_1,c_2,c_3,c_4,c_5)\) and no error has occurred, the following two equations would be true.
\begin{gather} c_1+c_2+c_3 = 0\tag{15.5.1}\\ c_1+c_4+c_5 = 0\tag{15.5.2} \end{gather}
Notice that in general, the number of parity check equations is equal to the number of extra bits that are added by the encoding function. These equations are equivalent to the single matrix equation \((c_1,c_2,c_3,c_4,c_5)H = \vec{0}\text{,}\) where
\begin{equation*} H = \left(\begin{array}{cc} 1 & 1 \\ 1 & 0 \\ 1 & 0 \\ 0 & 1 \\ 0 & 1 \\ \end{array} \right) \end{equation*}
At a glance, we can see that this code will not correct most single bit errors. Suppose an error \(\vec{e}=(e_1,e_2,e_3,e_4,e_5)\) is added in the transmission of the five bits. Specifically, suppose that 1 is added (mod 2) in position \(j\text{,}\) where \(1 \leq j\leq 5\) and the other coordinates of \(\vec{e}\) are 0. Then when we compute the syndrome of our received transmission, we see that
\begin{equation*} \vec{c}H = (\vec{a}G + \vec{e})H = (\vec{a}G)H + \vec{e}H = \vec{e}H. \end{equation*}
But \(\vec{e}H\) is the \(j^{th}\) row of \(H\text{.}\) If the syndrome is \((1,1)\) we know that the error occurred in position 1 and we can correct it. However, if the error is in any other position we can’t pinpoint its location. If the syndrome is \((1,0)\text{,}\) then the error could have occurred in either position 2 or position 3. This code does detect all single bit errors but only corrects one fifth of them.
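The collisions are easy to see by listing the syndrome of each single bit error, which is simply the corresponding row of \(H\text{.}\) A self-contained sketch:

```python
H5 = [[1, 1],
      [1, 0],
      [1, 0],
      [0, 1],
      [0, 1]]

for j in range(5):
    err = [0] * 5
    err[j] = 1  # a single bit error in position j + 1
    syndrome = tuple(sum(err[i] * H5[i][k] for i in range(5)) % 2
                     for k in range(2))
    print(j + 1, syndrome)
# 1 (1, 1)   unique: correctable
# 2 (1, 0)   collides with position 3
# 3 (1, 0)
# 4 (0, 1)   collides with position 5
# 5 (0, 1)
```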

Exercises 15.5.4 Exercises

1.

If the error-detecting code is being used, how would you act on the following received blocks?
  1. \(\displaystyle (1, 0, 1, 1)\)
  2. \(\displaystyle (1, 1, 1, 1)\)
  3. \(\displaystyle (0, 0, 0, 0)\)
Answer.
  1. Error detected, since an odd number of 1’s was received; ask for retransmission.
  2. No error detected; accept this block.
  3. No error detected; accept this block.

2.

Determine the parity check matrix for the triple repetition code.

3.

If the error-correcting code from this section is being used, how would you decode the following blocks? One of them contains an error that cannot be fixed.
  1. \(\displaystyle (1,0,0,0,1,1)\)
  2. \(\displaystyle (1,0,1,0,1,1)\)
  3. \(\displaystyle (0,1,1,1,1,0)\)
  4. \(\displaystyle (0,0,0,1,1,0)\)
  5. \(\displaystyle (1,0,0,0,0,1)\)
  6. \(\displaystyle (1,0,0,1,0,0)\)
Answer.
  1. Syndrome = \((1,0,1)\text{.}\) Corrected coded message is \((1,1,0,0,1,1)\) and original message was \((1, 1, 0)\text{.}\)
  2. Syndrome = \((1,1,0)\text{.}\) Corrected coded message is \((0,0,1,0,1,1)\) and original message was \((0, 0, 1)\text{.}\)
  3. Syndrome = \((0,0,0)\text{.}\) No error, coded message is \((0,1,1,1,1,0)\) and original message was \((0, 1, 1)\text{.}\)
  4. Syndrome = \((1, 1,0)\text{.}\) Corrected coded message is \((1,0,0,1,1,0)\) and original message was \((1, 0, 0)\text{.}\)
  5. Syndrome = \((1,1,1)\text{.}\) This syndrome occurs only if two bits have been switched. No reliable correction is possible.
  6. Syndrome = \((0,1,0)\text{.}\) Corrected coded message is \((1,0,0,1,1,0)\) and original message was \((1, 0, 0)\text{.}\)

4.

Suppose that the code words of a coding function have the property that any two of them have a Hamming distance of at least 5. How many bit errors could be corrected with such a code?

5.

Consider the linear code defined by the generator matrix
\begin{equation*} G=\left( \begin{array}{cccc} 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 1 \\ \end{array} \right) \end{equation*}
  1. What size blocks does this code encode and what is the length of the code words?
  2. What are the code words for this code?
  3. With this code, can you detect single bit errors? Can you correct all, some, or no single bit errors?
Answer.
  1. Blocks of two bits are encoded into code words of length 4.
  2. The code words are 0000, 1010, 0111 and 1101.
  3. Since the first two code words have a Hamming distance of 2, not all single bit errors can be corrected. For example, if 0000 is transmitted and the first bit is switched, then 1000 is received and we can’t tell for sure whether this came from 0000 or 1010. To see what can be corrected, we note that \(a_1 a_2\) is encoded to \(a_1 a_2 (a_1 +_2 a_2) a_2\) and so if \(b_1 b_2 b_3 b_4\) is received and no error has occurred,
    \begin{gather*} b_1 +_2 b_2 +_2 b_3 =0\\ b_2 +_2 b_4 =0 \end{gather*}
    We can extract the parity check matrix from this set of equations. It is
    \begin{equation*} \begin{pmatrix} 1 & 0 \\ 1 & 1 \\ 1 & 0 \\ 0 & 1 \end{pmatrix} \end{equation*}
    The rows of this matrix correspond with the syndromes for errors in bits 1 through 4, which are all nonzero, so we can detect any single bit error. Notice that the syndromes for bits 1 and 3 are identical. This reflects the fact that errors in these bits can’t be corrected. However, the syndromes for bits 2 and 4 are unique and so we can correct them. Therefore the second bit of the original message can be sent with more confidence than the first.

6. Rectangular codes.

To build a rectangular code, you partition your message into blocks of length \(m\) and then factor \(m\) into \(k_1\cdot k_2\) and arrange the bits in a \(k_1 \times k_2\) rectangular array as in the figure below. Then you add parity bits along the right side and bottom of the rows and columns. The code word is then read row by row.
\begin{equation*} \begin{array}{cccccc} \blacksquare & \blacksquare & \blacksquare & \cdots & \blacksquare & \square \\ \blacksquare & \blacksquare & \blacksquare & \cdots & \blacksquare & \square \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ \blacksquare & \blacksquare & \blacksquare & \cdots & \blacksquare & \square \\ \square & \square & \square & \cdots & \square & \\ \end{array} \quad \begin{array}{c} \textrm{ }\blacksquare = \textrm{ message} \textrm{ bit} \\ \square =\textrm{ parity} \textrm{ bit} \\ \end{array} \end{equation*}
For example, if \(m\) is 4, then our only choice is a 2 by 2 array. The message 1101 would be encoded as
\begin{equation*} \begin{array}{cc|c} 1 & 1 & 0\\ 0 & 1 & 1\\ \hline 1 & 0 &\\ \end{array} \end{equation*}
and the code word is the string 11001110.
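A sketch of this 2 by 2 encoder, with the four message bits read into the array row by row:

```python
def encode_rect(m):
    """Encode a four-bit message with the 2 by 2 rectangular code."""
    a, b, c, d = m
    return (a, b, (a + b) % 2,          # first row, then its parity bit
            c, d, (c + d) % 2,          # second row, then its parity bit
            (a + c) % 2, (b + d) % 2)   # column parity bits

print(encode_rect((1, 1, 0, 1)))  # (1, 1, 0, 0, 1, 1, 1, 0)
```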
  1. Suppose that you were sent four bit messages using this code and you received the following strings. What were the messages, assuming no more than one error in the transmission of coded data?
    1. 11011000
    2. 01110010
    3. 10001111
  2. If you encoded \(n^2\) bits in this manner, what would be the rate of the code?
  3. Rectangular codes are linear codes. For the 2 by 2 rectangular code, what are the generator and parity check matrices?

7.

Suppose that the code in Example 15.5.8 is expanded to add the column
\begin{equation*} \left(\begin{array}{c} 0 \\ 1 \\ 1 \\ \end{array} \right) \end{equation*}
to the generator matrix \(G\text{.}\) Can all single bit errors be corrected with the expanded code? Explain your answer.
Solution.
Yes, you can correct all single bit errors because the parity check matrix for the expanded code is
\begin{equation*} H = \left(\begin{array}{ccc} 1 & 1 & 0\\ 1 & 0 & 0\\ 1 & 0 & 1\\ 0 & 1 & 0\\ 0 & 1 & 1\\ 0 & 0 & 1\\ \end{array} \right). \end{equation*}
Since each possible syndrome of single bit errors is unique we can correct any error.

8.

Suppose that a linear code has parity check matrix
\begin{equation*} H=\left( \begin{array}{ccc} 1 & 1 & 1 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ \end{array} \right) \end{equation*}
Determine the generator matrix, \(G\text{,}\) and in so doing, identify the number of bits in each message block and the number of parity bits.
Hint.
There is a parity check equation for each parity bit.

9.

A code with minimum distance \(d\) is called perfect if every string of bits is within Hamming distance \(r=\frac{d-1}{2}\) of some code word. For such a code, the spheres of radius \(r\) around the code words partition the set of all strings. This is analogous to packing objects into a box with no wasted space. Using just the number of bit strings of length \(n\) and the number of strings in a sphere of radius \(1\text{,}\) for what values of \(n\) is it possible to find a perfect code of distance \(3\text{?}\) You don’t have to actually find the codes.

10.

  1. Prove that the code words of a linear code are a subgroup of the code space.
  2. Prove that if \(C\) is a left coset of the set of code words, then all elements of \(C\) will have the same syndrome.