Application to
coding theory

Introduction Transmitted messages, like data from a
satellite, are always subject to noise. It is important; therefore, to be able
to encode a message in such a way that after noise scrambles it, it can be
decoded to its original form. This is done sometimes by repeating the message
two or three times, something very common in human speech. However, copying
data stored on a compact disk, or a floppy disk once or twice requires extra
space to store. In this application, we will examine ways of decoding a message
after it gets distorted by some kind of noise. This process is called coding. A code that detects errors in a
scrambled message is called error
detecting. If, in addition, it can correct the error it is called error correcting. It is much harder to find error correcting
than error-detecting codes.
Some basic coding techniques Most messages are sent as digital-sequences of
0’s and 1’s, such as 10101 or 1010011, so let us assume we want to send the
message 1011. This binary “word” may stand for a real word, such as buy, or a
sentence such as buy stocks. One way of encoding 1011 would be to attach a
binary “tail” to it so that if the message gets distorted to, say, 0011, we can
detect the error. One such tail could be a 1 or 0, depending on whether we have
an odd or an even number of 1’s in the word. This way all encoded words will
have an even number of 1’s. So 1011 will be encoded as 10111. Now if this is
distorted to 00111 we know that an error has occurred, because we only received
an odd number of 1’s. This error-detecting code is called a parity check and is too simple to be
very useful. For example, if two digits were changed, our scheme will not
detect the error, so this is definitely not an error-correcting code. Another
approach would be to encode the message by repeating it twice, such as
10111011. Then if 00111011 were received, we know that one of the two equal
halves was distorted. If only one error occurred, then it is clearly at
position 1. This coding scheme also gives poor results and is not often used.
We could get better results by repeating the message several times, but that
takes space and time.
An advanced coding technique: Hamming code In the 1950’s, R.H. Hamming introduced an
interesting single error-correcting code that became known as the Hamming code. Before we can
examine the details of that technique, we need some background from linear
algebra.
Vector Spaces over Z2
In a typical first
year linear algebra course, when students are introduced to the notion of a
vector space, the word “scalar” means a real or a complex number. This can be
generalized to an arbitrary element of a certain “field”.
A field is a set F with two operations, addition and multiplication, which satisfies the following axioms: 1. Closure of addition: if x, y are in F, then x+y is in F. 2. Closure of multiplication: if x, y are in F, then xy is in F. 3. Associative Law of Addition: if x, y, z are in F, then (x+y)+z=x+(y+z) 4. Associative Law of Multiplication: if x, y, z are in F, then (xy)z=x(yz) 5. Distributive Law: if x, y, z are in F, then x(y+z)=xy+yz 6. Existence of 0: an element of F satisfying x+0=x for all x in F 7. Existence of 1: an element of F satisfying x.1=x for all x in F 8. Existence of additive inverses (negatives):If x is in F, there exists y in F such that x+y=0 9. Existence of multiplicative inverses (reciprocals), except for: If x is in F is not the zero element, then there exists an element y in F such that xy=1. 10. Commutative Law of Addition: If x, y are in F, then x+y=y+x 11. Commutative Law of Multiplication: If x, y are in F, then xy=yx. Examples of fields are Q (rational numbers), R (real numbers), C (complex numbers), and Z/pZ if p is a prime number (integers modulo a prime p):
The last example of
fields is one of special interest of us. In particular the case when p=2. In this case, the field Z/2Z is denoted by Z2. It consists
of two numbers only: 0 and 1:
![]()
In Z2, addition and
multiplication rules are defined as follows:
![]()
Thus, addition and
subtraction are the same in Z2.
Recall the vector
space structure of Rn (R is the set of all real numbers) over R:
1.
(x1,…, xn)+ (y1,…, yn)=
(x1+ y1,…, xn+yn)
2.
a(x1,…, xn)=
(ax1,…,a xn) if a is a real number.
The same structure can
be defined on Z2n.
We equip Z2n
with an addition and a scalar multiplication (multiplication with 0 and 1). For
example, in Z25
we have:
![]()
Equipped with these
two operations, Z2n
becomes a vector space over the field Z2
(the scalars here are 0 and 1). All the basic concepts of vector spaces like
linear independence, spanning set, subspaces, dimension, row space, null space,
… apply in this case. A big difference from the vector space Rn is that Z2n contains a
finite number of vectors, namely 2n vectors.
The Hamming
(7,4) code Given two
integers k≤ n, a subspace Z2n
of dimension k is called an (n,k) linear code. The
elements of a linear code are called encoded words.
Consider the matrix H
over Z2 whose columns c1,
…, c7 are all the nonzero vectors of Z23:

The null space, Null(H)
(also called the kernel), of H is called a Hamming (7,4) code. Now
recall that Null(H) is nothing but the set of all solutions to the
homogeneous linear system HX=0 that corresponds to H. We say that
H is a parity check matrix for the code Null(H). Let us solve the
system HX=0 to see what Null(H) looks like.
Using Gauss-Jordan elimination (together with the arithmetic of Z2),
we get the following reduced echelon form of H:

and since the rank of H
is 3, the dimension of Null(H) is 7-3=4. Actually, one can show easily
(by writing the general solution in parametric form) that
![]()
is a basis of Null(H)
over Z2.
Remark Let {e1,…,e7} be
the standard basis of Z27, then Hei=ci
for all i=1,…,7 (check this), and therefore none of the standard vectors
ei is in Null(H). As a consequence, we have the two
observations:
1. If v is a vector of Null(H), then
v+ ei is not in Null(H) for i=1,2,…,7.
2. If v is a vector of Z27 such that
Hv=ci for some i, then v+ ei
is a vector of Null(H). Furthermore, v+ ej
is not in Null(H) for all j
i.
The matrix G whose
rows are the elements of the basis B is called the generator matrix of
the Hamming (7,4)-code:

We are now ready to
explain the procedure of Hamming decoding and correcting errors:
Suppose we want to
send a word u consisting of four
binary digits u1 u2
u3 u4, and
suppose that it is known that the encoded word might get distorted by some noise changing only one of its
components. Let w be the received
word.
1. To encode u,
we form the linear combination v of the elements of the basis B
above with the four digits of u as
coefficients. Note that v can be
obtained from the original word by performing the matrix multiplication v=[u1
u2 u3 u4]G, where G is the matrix above. By construction, the vector v is in Null(H). Note also that [u1 u2 u3 u4]G would give a seven digit vector whose first four digits represent
the original word.
2. Compute Hw,
where H is the matrix described
above.
3. If Hw=0, then
w is in Null(H).Thus, a single
error would mean w is not in Null(H)
by the first part of the above remark. We conclude that there is no distortion,
and u is the first four digits of w.
4. If Hw=ci
for some i, then then v+ ei is
a vector of Null(H), and v+ ej is not in Null(H)
for all j
i. That suggests
changing the ith component of w (from 0 to 1 or from 1
to 0) and get a new vector w’. The first four digits of w’ represent the word u.
Let see how this works
with two examples.
Example 1 Suppose that we received the message w=1100011 encoded by the
Hamming (4, 7)-code. Suppose that there is at most one error in the
transmission, find the original message.

Since Hw is
equal to the second column of H, changing the second component of w
gives the encoded word: 1000011. We conclude that the original message is 1000.
Example 2 Supposing that we received the message w=0101010
encoded by the Hamming (4, 7)-code. Suppose that there is at most one error in
the transmission, find the original message.
Solution

Since Hw =0,
there is no error in the transmission of this message, which was 0101.
In the above
technique, the words we sent were very short: 4 digits only. There are only 24 such words. In real life, electronic messages consist of many more
digits. Another problem with the Hamming (4, 7)-code is that it cannot detect
more than one error in an encoded message.
With the “electronic revolution” of our time, one can imagine that there
are other types of more efficient codes.
If you want to learn
more about coding theory, check the following links:
1. The error correcting
codes site (ECC) (many programs for coding theory are included)
2. The
coding theory group at Notre Dame university (for advanced coding theory)
3. A neat talk
given by Professor Burgess, University of Ottawa
Some books:
1. Discrete and combinatorial Mathematics, fourth edition, R.P Grimaldi, Addison Wesley
Longman, Reading, Mass., 1999.
2. Coding Theory, The Essentials. K.G. Hoffman, et al, Marcel Dekker, New York,
1991.
3. Introduction to Coding Theory. J.H. van lint, Springer Verlag, New York, 1982.