582650 Information-Theoretic Modeling (Autumn 2012) Homework 3 (due 25 September)

You should turn in your solutions on paper at the start of the exercise session. For problem 3 and the bonus problem, send your code to Hannes Wettig by e-mail.

1. (Cover & Thomas, Ex. 5.3.) Consider a prefix code C : X → {0,1}* where |X| = m, and assume the code lengths ℓ1, . . . , ℓm satisfy

    2^(−ℓ1) + . . . + 2^(−ℓm) < 1

(i.e., the Kraft inequality is not tight). Show that there are arbitrarily long bit strings that are not codes for any sequence of symbols. That is, there is an infinite number of bit strings s ∈ {0,1}* such that for all x1 . . . xn ∈ X^n we have C*(x1 . . . xn) ≠ s, where C* is the extension of C. (Hint: find a string s that is not a prefix of any codeword.)
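The claim can be checked numerically on a small example. The sketch below uses the hypothetical code {a ↦ 0, b ↦ 10}, whose Kraft sum is 1/2 + 1/4 = 3/4 < 1; since no codeword has 11 as a prefix, no concatenation of codewords ever begins with 11, so the strings 11, 111, 1111, . . . are never of the form C*(x1 . . . xn):

```python
from itertools import product

# Hypothetical example code with Kraft sum 1/2 + 1/4 = 3/4 < 1.
code = {'a': '0', 'b': '10'}

def encodings(max_len):
    """All bit strings C*(x1 ... xn) of length <= max_len."""
    out = set()
    for n in range(1, max_len + 1):
        for seq in product(code, repeat=n):
            s = ''.join(code[x] for x in seq)
            if len(s) <= max_len:
                out.add(s)
    return out

enc = encodings(8)
# Every concatenation of codewords starts with '0' or '10', so no
# image under C* ever starts with '11'.
assert all(not s.startswith('11') for s in enc)
```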

2. Consider a source alphabet X = {x1, . . . , xm} of size m. Assume we are given probabilities p1, . . . , pm for the source symbols, so Pr[X = xi] = pi. Recall that the Shannon–Fano code works as follows:

(a) Sort the symbols according to decreasing probability so that we can assume p1 ≥ p2 ≥ . . . ≥ pm.

(b) Initialize all codewords w1, . . . , wm as the empty string.

(c) Split the symbols in two sets, (x1, . . . , xk) and (xk+1, . . . , xm), so that the total probabilities of the two sets are as equal as possible, i.e., minimize the difference |(p1 + . . . + pk) − (pk+1 + . . . + pm)|.

(d) Append the bit ‘0’ to all codewords in the first set, wi ↦ wi0 for all 1 ≤ i ≤ k, and ‘1’ to all codewords in the second set, wi ↦ wi1 for all k < i ≤ m.

(e) Keep splitting both sets recursively (Step (c)) until each set contains only a single symbol.

Simulate the Shannon–Fano code, either on paper or by computer, for a source with symbols A : 0.2, B : 0.22, C : 0.25, D : 0.15, E : 0.13, F : 0.05, where the numbers indicate the probabilities pi.
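For the computer route, the five steps above can be sketched for instance as follows (the split point in step (c) is found by brute force; the exact codewords depend on how ties are broken):

```python
from math import log2

def shannon_fano(items):
    """items: list of (symbol, probability); returns {symbol: codeword}."""
    items = sorted(items, key=lambda sp: -sp[1])   # step (a)
    codes = {s: '' for s, _ in items}              # step (b)

    def split(group):
        if len(group) <= 1:
            return
        total = sum(p for _, p in group)
        # Step (c): pick the split index k minimizing |left - right|.
        best_k, best_diff, run = 1, float('inf'), 0.0
        for k in range(1, len(group)):
            run += group[k - 1][1]
            diff = abs(run - (total - run))
            if diff < best_diff:
                best_k, best_diff = k, diff
        # Step (d): extend the codewords of both halves.
        for s, _ in group[:best_k]:
            codes[s] += '0'
        for s, _ in group[best_k:]:
            codes[s] += '1'
        # Step (e): recurse on both halves.
        split(group[:best_k])
        split(group[best_k:])

    split(items)
    return codes

src = [('A', 0.20), ('B', 0.22), ('C', 0.25),
       ('D', 0.15), ('E', 0.13), ('F', 0.05)]
codes = shannon_fano(src)
avg = sum(p * len(codes[s]) for s, p in src)       # expected code length
H = -sum(p * log2(p) for _, p in src)              # entropy H(X)
print(codes)
print(avg, H)   # expected length 2.51 bits vs. entropy ~2.45 bits
```

With this tie-breaking, C, B and A get 2-bit codewords, D a 3-bit one, and E and F 4-bit ones; the same function can be reused for Problem 3 once the probabilities are estimated from text.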

3. Take a piece of text and estimate the symbol occurrence probabilities from it. Then use them to encode the text with the Shannon–Fano code (on a computer). Compare the code length to the entropy H(X) of the symbol distribution.

4. (Cover & Thomas, Ex. 5.12.) Consider a random variable X that takes on four values, a, b, c, !, with respective probabilities 1/3, 1/3, 1/4, 1/12.


(a) Construct a Huffman code for this random variable.

(b) Show that there exist two different sets of optimal (Huffman) lengths for the codewords; namely, show that codeword length assignments (1, 2, 3, 3) and (2, 2, 2, 2) are both optimal.

(c) Conclude that there are optimal codes with codeword lengths for some symbols that exceed the Shannon code length ⌈log2(1/p(x))⌉.
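As a starting point for parts (a) and (b), a minimal Huffman construction over a min-heap might look as below; exact fractions avoid floating-point issues, and the arbitrary tie-breaking among equal heap entries is precisely what makes several optimal length assignments possible:

```python
import heapq
from fractions import Fraction as F

def huffman_lengths(probs):
    """Return Huffman codeword lengths, in the order of the input probabilities."""
    # Each heap entry: (probability, tie-break counter, {symbol index: depth}).
    heap = [(p, i, {i: 0}) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    tick = len(probs)  # unique counter so dicts are never compared
    while len(heap) > 1:
        p1, _, d1 = heapq.heappop(heap)   # two least probable trees
        p2, _, d2 = heapq.heappop(heap)
        merged = {k: v + 1 for k, v in {**d1, **d2}.items()}  # all depths grow by 1
        heapq.heappush(heap, (p1 + p2, tick, merged))
        tick += 1
    _, _, depths = heap[0]
    return [depths[i] for i in range(len(probs))]

p = [F(1, 3), F(1, 3), F(1, 4), F(1, 12)]   # a, b, c, !
lengths = huffman_lengths(p)
print(lengths)
# Both assignments from part (b) have expected length exactly 2 bits:
for ls in [(1, 2, 3, 3), (2, 2, 2, 2)]:
    print(ls, sum(pi * li for pi, li in zip(p, ls)))
```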

Continues on the next page!


5. Assume that a person x is picked at random according to distribution p from a finite population X. In order to identify x, you are allowed to ask an oracle a sequence of yes/no questions like “Does x belong to the subset Si = {xi,1, xi,2, . . . , xi,n(i)}?”.

(a) What is the optimal strategy if you want to minimize the expected number of questions required? What can you say about the expected number of questions that are required?

(Hint: the sequence of answers can be regarded as the codeword ofx.)

(b) What if the goal is to minimize the worst-case (maximum) number of questions?
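Following the hint, the answer sequence to the questions is a binary codeword for x, so the expected and worst-case question counts are just expected and maximum codeword lengths. The snippet below illustrates this on an assumed example distribution (the lengths (1, 2, 3, 3) are the Huffman lengths for this dyadic p, stated here rather than computed):

```python
from math import ceil, log2

# Assumed example: a dyadic distribution over a 4-person population.
p = [0.5, 0.25, 0.125, 0.125]

H = -sum(pi * log2(pi) for pi in p)              # entropy of p
huffman_lens = (1, 2, 3, 3)                      # Huffman lengths for this p
expected_questions = sum(pi, * (li,))[0] if False else sum(
    pi * li for pi, li in zip(p, huffman_lens))  # expected number of questions
worst_case = ceil(log2(len(p)))                  # halving strategy bound

print(H, expected_questions, worst_case)  # 1.75, 1.75, 2
```

Here the expected number of questions equals H(p) exactly because p is dyadic; in general it lies between H(p) and H(p) + 1, while repeatedly halving the candidate set needs at most ⌈log2 |X|⌉ questions in the worst case.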

Bonus problem. Implement the Hamming(7,4) code. The decoder should try to find a sequence of data bits d1, d2, d3, d4 for which the difference between the received seven-bit sequence and the seven-bit codeword corresponding to the data bits is minimal (at most one bit). For example, in Lecture 2, the (Hamming) distance between the received sequence 1111010 and the codeword 1011010 is one, so the data bits are decoded as 1011. Test your code by encoding the sequence rat.txt. Simulate the binary symmetric channel, p = 0.01, by running ‘bsc.py 0.01’, then decode the noisy signal. You can create a PBM image file by the script ‘txt2pbm.py’. You can display a PBM file by ‘gimp file.pbm’. Compare the result with and without Hamming coding. Your experiment should be something like:

cat rat.txt | ./bsc.py 0.01 | ./txt2pbm.py > ratnoisy.pbm
enc < rat.txt | ./bsc.py 0.01 | dec | ./txt2pbm.py > ratham.pbm

where enc and dec are the encoding and decoding commands, respectively. The example file rat.txt and the scripts bsc.py and txt2pbm.py are available at the course web page.

Figure 1: ratnoisy.pbm
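The minimal-distance decoding rule can be sketched as below. The parity-bit layout is one common Hamming(7,4) convention, chosen here as an assumption (it happens to reproduce the lecture example above); the course's bit ordering may differ:

```python
from itertools import product

def encode(d):
    """Map 4 data bits to a 7-bit codeword: data first, then 3 parity bits."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    return (d1, d2, d3, d4, p1, p2, p3)

# All 16 codewords, keyed by codeword for reverse lookup.
CODEBOOK = {encode(d): d for d in product((0, 1), repeat=4)}

def decode(received):
    """Return the data bits of the codeword at minimal Hamming distance."""
    def dist(a, b):
        return sum(x != y for x, y in zip(a, b))
    best = min(CODEBOOK, key=lambda c: dist(c, received))
    return CODEBOOK[best]

# With this layout, data bits 1011 encode to 1011010, and the lecture's
# received sequence 1111010 (distance one) decodes back to 1011.
assert encode((1, 0, 1, 1)) == (1, 0, 1, 1, 0, 1, 0)
assert decode((1, 1, 1, 1, 0, 1, 0)) == (1, 0, 1, 1)
```

Since the code's minimum distance is 3, any single bit flip leaves the received word strictly closer to the transmitted codeword than to the 15 others, so the brute-force search over the 16 codewords corrects all one-bit errors.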



