
582669 Supervised Machine Learning (Spring 2012) Homework 5 (24 February)

Solutions are due on Friday, 24 February, at 9:00 am. Notice that some of the problems are based on material that will be presented during the last week of lectures.

1. Consider the problem
\[
\text{minimise } x^2 \quad \text{subject to } (x-2)^2 \le 1.
\]
Write out the Lagrangian $L(x, \lambda)$. Solve the problem using the KKT conditions. Write out also the dual $g(\lambda)$ in closed form. Repeat the exercise with the constraint $(x-2)^2 \le 8$.

The following is not a compulsory part of the homework (so you don't need to provide plots etc. as part of your homework), but you might find it useful to illustrate the notions of Lagrangian and dual: Use some mathematical software (such as Maple) to plot $L(x, \lambda)$ (as a function of two variables, so you get a three-dimensional picture). Find from this plot the functions $x \mapsto \max_\lambda L(x, \lambda)$ and $\lambda \mapsto \min_x L(x, \lambda)$. Notice the point where they coincide.
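As a reminder for getting started (this is standard material, not part of the required solution), the KKT conditions for the first version of the problem take the following form:

```latex
% KKT conditions for: minimise x^2 subject to (x-2)^2 <= 1,
% with Lagrangian L(x, lambda) = x^2 + lambda * ((x-2)^2 - 1).
\begin{align*}
  \frac{\partial L}{\partial x} = 2x + 2\lambda(x - 2) &= 0
    && \text{(stationarity)} \\
  (x - 2)^2 - 1 &\le 0
    && \text{(primal feasibility)} \\
  \lambda &\ge 0
    && \text{(dual feasibility)} \\
  \lambda \bigl( (x - 2)^2 - 1 \bigr) &= 0
    && \text{(complementary slackness)}
\end{align*}
```

Complementary slackness splits the analysis into two cases: either $\lambda = 0$ (constraint inactive) or $(x-2)^2 = 1$ (constraint active).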

2. Consider a variant of the Perceptron algorithm, where the new weight vector $w_{t+1}$ is defined as the $w$ that solves the minimisation problem
\[
\text{minimise } \|w - w_t\|^2 \quad \text{subject to } y_t\, w \cdot x_t \ge 1.
\]
Write $w_{t+1}$ as a function of $w_t$, $x_t$ and $y_t$ (in other words, solve the optimisation problem). Assuming that the initial weight vector $w_1$ is zero, show that $w_t$ is a linear combination of $x_1, \ldots, x_{t-1}$. Write the kernelised form of the algorithm.

Remark: The idea is to always update the weight vector so that the latest data point is classified with a positive margin, while staying close to the old weight vector, since it contains all the information gained from the previous trials. This particular algorithm is probably not very good (at least not with noisy data), but the general idea of deriving online algorithms from minimisation problems like this is very useful.
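If you want to sanity-check the closed form you derive, the sketch below implements one step of this update in plain NumPy. It uses the projection-onto-halfspace solution (the same update that appears in the passive-aggressive algorithms of Crammer et al.); compare it against your own derivation rather than taking it on faith. The data point in the example is made up for illustration.

```python
import numpy as np

def update(w_t, x_t, y_t):
    """One step of the projection variant of the Perceptron:
        w_{t+1} = argmin_w ||w - w_t||^2  s.t.  y_t * (w . x_t) >= 1.
    If the constraint already holds, the minimiser is w_t itself;
    otherwise the minimiser is the projection of w_t onto the
    hyperplane  y_t * (w . x_t) = 1."""
    margin = y_t * np.dot(w_t, x_t)
    if margin >= 1:
        return w_t
    tau = (1 - margin) / np.dot(x_t, x_t)
    return w_t + tau * y_t * x_t

# Example: starting from w_1 = 0, one update on (x, y) = ((2, 0), +1).
w = update(np.zeros(2), np.array([2.0, 0.0]), 1)
print(w)  # afterwards the constraint y * (w . x) >= 1 holds with equality
```

Note that the returned $w_{t+1}$ is $w_t$ plus a multiple of $y_t x_t$, which is exactly the structure needed for the linear-combination and kernelisation parts of the problem.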

3. The 2-norm soft margin SVM is like the 1-norm soft margin SVM except that the objective function is
\[
-\mu + C \sum_{i=1}^{m} \xi_i^2
\]
(the slack variables are squared). Show that now $\alpha$ is obtained by maximising
\[
W(\alpha) = -\left( \frac{1}{4C} \sum_{i=1}^{m} \alpha_i^2
  + \sum_{i,j=1}^{m} \alpha_i \alpha_j y_i y_j k(x_i, x_j) \right)^{1/2}.
\]
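A hint for the key step (a sketch, assuming the margin constraints carry multipliers $\alpha_i$ as in the 1-norm derivation): because each $\xi_i$ now enters the Lagrangian quadratically rather than linearly, it can be eliminated at the stationary point instead of merely bounding the multipliers.

```latex
% For each i the Lagrangian contains the slack terms
%   C \xi_i^2 - \alpha_i \xi_i .
% Setting the partial derivative with respect to \xi_i to zero:
\frac{\partial}{\partial \xi_i}\left( C\xi_i^2 - \alpha_i \xi_i \right)
  = 2C\xi_i - \alpha_i = 0
  \quad\Longrightarrow\quad
  \xi_i = \frac{\alpha_i}{2C},
\qquad
C\xi_i^2 - \alpha_i \xi_i = -\frac{\alpha_i^2}{4C}.
```

Substituting this back into the Lagrangian is what produces the extra $\frac{1}{4C}\sum_i \alpha_i^2$ term in $W(\alpha)$ compared with the 1-norm case.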


4. The following is written assuming you use Matlab and LIBSVM, but as usual feel free to use whatever tools you prefer.

The file hw05data.txt available from the course home page contains a sample of 400 points, one on each row. Each row contains the values $x_1$, $x_2$ and $y$ for a data point $(x, y)$ where $x \in \mathbb{R}^2$ and $y \in \{-1, 1\}$. You can read the file into a $400 \times 3$ Matlab array with the command dlmread.

The data consists of two concentric rings, with some noise near their boundary (plot it to see for yourself). Split the data into two parts, half as a training set and half as the test set.

Download the LIBSVM package from http://www.csie.ntu.edu.tw/~cjlin/libsvm/. We will use it to learn a soft margin support vector machine with the Gaussian kernel (i.e., radial basis function) using the training set, and test it using the test set.

The objective of the exercise is to see how the parameters $C$ and $\gamma$ (the "width" of the kernel) affect learning. Try fixing $C$ to a suitable value and vary $\gamma$. At extreme values you should see overfitting (small error on training set, large error on test set) and underfitting (large error on training set), with the optimum somewhere in between. Repeat the same with $\gamma$ fixed and $C$ varying. You don't have to find the optimal parameter combination. Just try to produce some graphics that show this effect of varying the parameters. Include in your solution also a brief description of what you did and what observations you made.
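If you prefer Python to Matlab, the experiment can be sketched with scikit-learn, whose SVC class is itself built on LIBSVM (its gamma and C parameters correspond to LIBSVM's -g and -c options). Since hw05data.txt is only on the course page, the sketch below substitutes synthetic concentric rings generated with make_circles; that stand-in is an assumption, so expect the numbers to differ from the real data.

```python
import numpy as np
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Synthetic stand-in for hw05data.txt: two noisy concentric rings,
# shuffled, with labels relabelled from {0, 1} to {-1, +1}.
X, y = make_circles(n_samples=400, noise=0.1, factor=0.5, random_state=0)
y = 2 * y - 1

# Half as training set, half as test set.
X_tr, X_te = X[:200], X[200:]
y_tr, y_te = y[:200], y[200:]

# Fix C and vary gamma to look for under- and overfitting:
# a tiny gamma should underfit, a huge gamma should overfit.
for gamma in [0.01, 1.0, 1000.0]:
    clf = SVC(kernel="rbf", C=1.0, gamma=gamma).fit(X_tr, y_tr)
    print(f"gamma={gamma}: train acc {clf.score(X_tr, y_tr):.2f}, "
          f"test acc {clf.score(X_te, y_te):.2f}")
```

Plotting training and test accuracy against $\gamma$ (and then against $C$) gives exactly the kind of graphic the exercise asks for.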

Before you start running the data through Matlab, it may be useful to familiarise yourself with the parameterisation etc. of LIBSVM using the svm-toy that comes with the download and can also be used on the web site. Use the graphical user interface to create a data set that looks similar to hw05data.txt (but can be much smaller) and try different parameter combinations.

