BULETIN PEMODELAN
MATEMATIKA Nomor 4, Volume 1, September 2006 DAFTAR ISI
Sepatah kata dari redaksi
Sepatah kata dari redaksi …………………………….. 1 Pengumuman lomba pemodelan matematika …………………………….. 1 Coding Theory ……….……………… 2, 3, 4
IKUTILAH KOMPETISI PEMODELAN MATEMATIKA ! Kompetisi ini terbuka bagi seluruh pelajar SMA dan diikuti secara berkelompok yang terdiri dari 2 s/d 4 orang. Setiap kelompok dipimpin oleh seorang ketua dan dapat dibimbing oleh seorang guru. Cara mengikuti kompetisi: kirimkan jawaban kelompok anda terhadap permasalahan matematika yang ada pada setiap penerbitan. Apabila anda mengirimkan paling sedikit 2 jawaban dari 4 permasalahan yang dimuat dalam 4 nomor/edisi (per tahun), anda akan diundang untuk mengikuti kompetisi. Jawaban anda boleh ditulis dalam bahasa Indonesia. ----------------------------------------------------Diterbitkan oleh LabMath-Indonesia bekerja-sama dengan Pusat Pemodelan Matematika dan Simulasi (P2MS) ITB Penanggung jawab Pimpinan LabMath – Indonesia Dewan Redaksi Ketua: Dr. Rinovia Simanjuntak Anggota: Dr. Gerard Jeurnink Dr. Wono Setyabudhi Dr. Andonowati Prof. Dr. E. (Brenny) van Groesen E-mail:
[email protected] Website: www.labmath-indonesia.or.id
Dalam nomor kedua Buletin yang lalu, telah dibahas mengenai salah satu cabang matematika yaitu Teori Koding dengan salah satu aplikasinya yaitu error correcting code. Banyak hal yang menarik dari salah satu bidang matematika ini mengingat aplikasi-aplikasinya yang banyak ditemukan dalam berbagai bidang, terutama ilmu computer dan teknik elektro. Sesuai dengan namanya, Teori Koding merupakan salah satu bidang matematika yang mempelajari banyak hal tentang pengkodean. Pengkodean adalah metode yang mengubah suatu informasi menjadi kode (yang dinamakan encoding) dan mengembalikan kode tersebut ke dalam informasi semula (atau juga disebut decoding). Dari proses encoding, penyimpanan (storage), pengiriman, sampai pada proses decoding, kode-kode tersebut sangat besar kemungkinannya untuk mengalami perubahan (atau biasa kita sebut error). Penyebabnya, bisa berbagai macam hal, contohnya goresan kecil pada CD penyimpan data. Akibatnya tentu saja data yang ada pada CD tersebut menjadi rusak (error) sebagian atau malah rusak seluruhnya. Salah satu hal yang menarik dari Teori Koding adalah error correcting code. Error correcting code adalah suatu kode yang memperbaiki sendiri kesalahan (error) yang terjadi selama proses pengiriman atau penyimpanan kode, sehingga pembaca kode tetap dapat mendecode kode yang telah terkontaminasi error tersebut menjadi informasi yang benar. Jika pada buletin sebelumnya dibahas sedikit tentang bilangan biner (yang menjadi bagian penting dalam kode komputer) dan error correcting code, pada bulletin ini akan dibahas mengenai The Hamming code untuk menggambarkan bagaimana error correcting code ini bekerja serta perluasannya melalui pendekatan aljabar. Error correcting code pada Teori Koding menjadi bagian yang penting dalam pembuatan CD. Banyak peneliti diseluruh dunia yang terlibat dalam pengembangan bidang ini. Tertarik untuk menjadi penemu CD yang tercanggih dimasa mendatang?; atau sekedar ingin mengetahui bagaimana prinsip kerja error correcting code dalam menyelamatkan data anda? Anda bisa memulainya dengan membaca bulletin ini. Redaksi
KESEMPATAN TERBATAS Anda ingin tahu apa yang dikerjakan oleh para matematikawan/wati? Pusat Pemodelan Matematika dan Simulasi, Institut Teknologi Bandung membuka kesempatan terbatas bagi para pelajar, guru SMA dan masyarakat umum untuk terlibat dan berpatisipasi secara langsung dalam aktifitas kami; misalnya bagaimana mengkoordinasi konferensi, seminar rutin, penelitian, dsb. Keterangan lebih lanjut dapat diperoleh melalui email:
[email protected] .
LabMath-Indonesia is a private non-commercial research institute aimed to facilitate the execution of scientific research and to disseminate the results to the community.
Coding Theory: the mathematics of error correcting Computers are widely used nowadays to store and handle information in a digital way: plain text files, audio files, data bases, etc. Information is stored in a computer using the binary alphabet consisting of two ‘bits', a 0 and a 1. The binary alphabet is essential, since we can let 0 and 1 correspond to ‘switch on' and ‘switch off’ in electrical circuitry. This project is about the crucial role of mathematics in storing digital information and correcting errors that arise in handling it, the so-called field of coding theory. For all sorts of reasons, individual bits used in storing information may change, for instance, in saving a file to disk, or during the production process of a music cd, or in transmitting a file over the internet or through space. There is always the risk that some bits change, like in the pictures below, where the second one contains errors.
Box 4. The Hamming code In this section we use the Hamming code ‘of length 7’ to illustrate the concept of an error correcting code. (There is a whole ‘family’ of Hamming codes, but we limit our discussion here to the simplest one.) Using 4 bits, we can form 24=16 distinct strings of length 4: 0000, 0001, etc. The Hamming code is a code that adds three bits to every string of 4 bits according to a certain rule. As it turns out, this way a code arises with still 16 code words that is one error correcting, i.e., if at one spot an error is introduced in such a 7 bit string, then this error can be detected and corrected by finding the code word which is closest to the received string. In cd's more involved correcting codes are used, but they are basically of a similar nature. Here is a geometric way of introducing this code. This explanation only tells you how to add the three extra bits and how to correct single errors. It does not explain the idea behind this code. To explain how to add three extra bits to a 4-bit codeword, let us look at 1011.
Draw three circles I, II, III and label the seven regions that arise as in the figure. The seven regions 1 up to 7 correspond to the seven positions of the final code word. In every region a 0 or a 1 of the final code word will appear. The rule is that every one of the three circles should contain an even number of 1's. Start by writing the first symbol (in our case a 1) in region 1, the second symbol (this is a 0) in region 2, the third symbol (1) in the region 3 and the fourth symbol (again a 1) in region 4. Next, we determine the remaining three bits of our code word with the rule: every circle contains exactly an even number of 1's. To determine the bit in position 5, take a look at circle III: There are two 1's and a 0 so far. So region 5 should be filled with a 0. This 0 is already in the picture. 4.1
Now try to fill the 6th and 7th region with a 0 or 1 yourself. So what will be the code word?
By adding three bits to every 4-bit code word, the new set of 16 code words becomes ‘1-error correcting’: if during transmission of a such a 7-bit word exactly one bit changes, then the receiver is able to detect this error and correct it. Here is how this error correction works.
Suppose we transmit 1100110 (this is a code word) and upon transmission the 4th bit changes into a 1. So on the other end of the channel the string 1101110 is received. Then he/she takes the three circles, fills in the 7 bits and notes that in circles I, II and III something goes wrong: the sum of the bits is not even. The receiver concludes that in region 4, the region common to all three circles, something must be wrong. So he/she changes the bit in position 4 into a 0. 4.2
The string 0111101 is received. How would you decode it?
Box 5. Hamming code: algebraic approach If x1, x2, x3, x4 are the first 4 bits, then the 5th, 6th and 7th bit, denoted by x5, x6, x7, respectively are determined as follows. To find x5, we use circle III. The sum of the bits should be 0 (why?), so x5 = x2+x3+x4. Likewise, we find x6 = x1 + x3+x4 (circle I) and x7 = x1 + x2 + x4 (circle II). In summary: x5 = x2 + x3 + x4 x6 = x1+ x3 + x4 x7 = x1 + x2 + x4 An important property of the Hamming code is that it is linear: sums of code words are again code words, i.e., if (x1, … ,x7) and (y1, … ,y7) are codewords, then the sum (x1+y1 , … , x7+y7) is also a code word. In order to see this, we have to verify that the three equations are satisfied for this sum, i.e., the 5th bit should be the sum of the 2nd, 3rd and 4th bit; the 6th bit should be the sum of the 1st, 3rd and 4th; and the 7th bit should be the sum of the 1st, the 2nd and the 4th bit. To check this for the 5th bit, we use the fact that x5 = x2+x3+x4 and y5 = y2+y3+y4. Add these two relations, rearrange a bit, and you find: x5+y5 = (x2+y2)+(x3+y3)+(x4+y4) as desired. Mathematicians have set up the equations in such a way that every two distinct length 7 words of our code have distance at least 3. Here is a way to see this. Suppose that two code words of length 7 differ in at most two places. We will argue that this is impossible. The sum of the two code words contains at most two 1's, is still a code word, and therefore satisfies the equations. It is easily seen from these equations that the sum can not contain exactly one 1. If the sum contains exactly two 1's, for instance in positions 2 and 4, then the first and third equation are ok, but the second equation is not satisfied, so this is impossible. We use the following exercise to deal with all cases. To check the impossibility of all cases, proceed for instance as follows.
5.1 5.2
5.3
Suppose the 1's are both among positions 5, 6 and 7. Why is this impossible? Suppose the two 1's are both among positions 1, 2, 3 and 4. Show that this is impossible. [Hint: sum the three relations.] Now deal with the remaining case. [Hint: sum the three relations.]
There are faster ways to see that every two code words differ in at least 3 positions, but that is beyond the scope of this lesson. It involves viewing the code as a linear space. It makes use of techniques in linear algebra applied to our binary context. In conclusion The field of coding theory is a vital branch of mathematics close to computer science and electrical engineering. The field was of crucial importance for the development of the cd. Many researchers around the world are involved in further developing the field, to improve on existing solutions, or to tackle new challenges arising as new technologies and products emerge. Box 6: Linear Algebra In box 5 we enlarged a codeword
(x1 , x2 , x3 , x4 )
to (x1 , x2 , x3 , x4 , x5 , x6 , x7 ) . In fact this produces a map between the space of all 0-1-sequences of length four to the space of 0-1-sequences of length 7. More precisely, using the algebraic notion of matrix (= rectangle scheme of numbers, in this case a matrix with 7 rows and 4 columns) we have:
⎛1 ⎜ ⎜0 ⎜0 ⎜ ⎜0 ⎜0 ⎜ ⎜1 ⎜ ⎝1
0 1
0 0
0
1
0 1
0 1
0 1
1 0
0⎞ ⎟ 0⎟ ⎛ 0 ⎟⎜ ⎟⎜ 1 ⎟⎜ 1 ⎟⎟ ⎜⎜ ⎝ 1⎟ ⎟ 1⎠
⎛ ⎜ ⎜ x1 ⎞ ⎜ ⎟ x2 ⎟ ⎜ = ⎜ x3 ⎟ ⎜ ⎟ x 4 ⎟⎠ ⎜ ⎜ ⎜ ⎝
x1 ⎞ ⎟ x2 ⎟ x3 ⎟ ⎟ x4 ⎟ x 5 ⎟⎟ x6 ⎟ ⎟ x7 ⎠
where the fifth row represents the sixth equation x 2 + x3 + x 4 = x5 ,
the row
x1 + x3 + x 4 = x6 and finally x1 + x 2 + x 4 = x 7 is indicated by the seventh row. The matrix determines the map completely.
So far we have made a one-error-correcting code for an alphabet of 16 words. How should the receiver interpret his received information? Well, First he tests whether a received word ( y1 , y 2 , y3 , y 4 , y5 , y 6 , y 7 ) fulfills the equations
y 2 + y3 + y 4 + y5 = 0 y1 + y 3 + y 4 + y 6 = 0 y1 + y 2 + y 4 + y 7 = 0 In other words (using matrix notation, the so-called parity-check-matrix) :
⎛ y1 ⎞ ⎜ ⎟ ⎜ y2 ⎟ ⎛ 0 1 1 1 1 0 0 ⎞⎜ y3 ⎟ ⎛ 0 ⎞ ⎟⎜ ⎟ ⎜ ⎟ ⎜ ⎜ 1 0 1 1 0 1 0 ⎟⎜ y4 ⎟ = ⎜ 0 ⎟ ⎜ 1 1 0 1 0 0 1 ⎟⎜ y ⎟ ⎜ 0 ⎟ ⎠⎜ 5 ⎟ ⎝ ⎠ ⎝ ⎜ y6 ⎟ ⎜ ⎟ ⎝ y7 ⎠
Remark: If we continue with one-error-correcting codes and are willing to add four bits (instead of three) to a word, then we are able to code message-words of length 11 with words of length 11 + 4 = 15. This is called the Hamming code H 4 , a parity-checkmatrix to
H 4 has 4 rows and 15 columns.
--------------------------------------------------------------This project on coding was developed by Dr. Hans Sterk Technische Universiteit Eindhoven, The Netherlands
Coordinator Dr. Gerard Jeurnink Department of Applied Mathematics
[email protected]
Now suppose this equation is not fulfilled. So there is one error, let’s say y 3 = 0 where it should be y 3 = 1 . Then the receiver has gotten correct codeword + (0,0,1,0,0,0,0) . If we multiply this by the parity-check-matrix then
⎛ 0⎞ ⎛1⎞ ⎛1⎞ ⎜ ⎟ ⎜ ⎟ ⎜ ⎟ we get ⎜ 0 ⎟ + ⎜ 1 ⎟ = ⎜ 1 ⎟ , which is exactly the third ⎜ 0⎟ ⎜ 0⎟ ⎜ 0⎟ ⎝ ⎠ ⎝ ⎠ ⎝ ⎠ column. So the receiver acknowledges that y 3 wasn’t correct and now he is able to decode the word.
Kami Mengundang para pembaca untuk mengirimkan naskah berisi permasalahan metematika yang berasal dari masalah nyata untuk dimuat di buletin ini. Contoh Penulisan naskah dapat di lihat pada halaman 2, 3 dan 4 di buletin ini. Naskah yang dimuat akan memperoleh imbalan sebesar Rp. 100.000,- dan 25 kopi buletin. Kirim naskah versi elektronik anda melalui email
[email protected]
We end this letter with some more advanced exercises.
6.1.
6.2.
Find a set of 16 code-words (0-1sequences) with length 7 for which every word has a distance at least 3 to every other word in the set.
Convince yourself that we need length 4+3 = 7 in order to have such a set of words with a one-error16 = 2 4 correcting code.
Kirimkan jawaban, saran-saran anda ke
[email protected] atau Laboratorium Matematika Indonesia (LabMath-Indonesia) Jl.Cigadung Raya Barat 7A kav A2 Bandung 40191 Tlp +62-22-91271863