Transcript and Presenter's Notes

Title: COMP3170 Machine Learning


1
COMP3170 Machine Learning
  • Tutorial

2
Q (Lecture 1)
  • Describe informally, in one paragraph of English,
    the task of learning to recognize handwritten
    numerical digits.

3
  • A sample answer
  • Construct a learning system F(W), based on the
    training digit examples E(X, D), that predicts the
    classification F(W, X) of a new, unknown input
    digit X.

4
  • 2. Describe the various steps involved in
    designing a learning system to perform the task
    of question 1, giving as much detail as possible
    about the tasks that have to be performed in each
    step.

5
  • A sample answer
  • The main steps are given below; please refer to
    the details in lecture note 1 (pp. 10-21).
  • Step 1: Collect training examples.
  • Step 2: Represent the experience by a certain
    representation scheme.
  • Step 3: Choose a representation for the black
    box (F).
  • Step 4: Learn/adjust the parameters of F.
  • Step 5: Use/test the system.

6
  • 3. For the tasks of learning to recognize human
    faces and fingerprints respectively, redo
    questions 1 and 2.

7
  • 4. In the lecture, we used a very long binary
    vector to represent the handwritten digits. Can
    you think of other representation methods?

8
  • One possible approach by projection

9
Q (Lecture 2)
  • What are the weight values of a perceptron having
    the following decision surfaces?

10
  • (a)
  • (b)

11
  • 2. Design two-input perceptrons implementing the
    following boolean functions:
  • AND, OR, NAND, NOR

12
  • AND (one possible weight choice is checked in the
    sketch below)
  • X1 X2 D
  • -1 -1 -1
  • -1 1 -1
  • 1 1 1
  • 1 -1 -1
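One possible AND solution (the weights below are an assumed example, not given on the slide): w1 = w2 = 1 with bias b = -1 and a sign activation. A minimal sketch checking this choice against the truth table:

```python
# Sketch: verify that w1 = w2 = 1, b = -1 implements AND with a sign activation
# (an assumed, not unique, choice of weights).

def perceptron(x1, x2, w1=1.0, w2=1.0, b=-1.0):
    """Two-input perceptron: output +1 if w1*x1 + w2*x2 + b >= 0, else -1."""
    return 1 if w1 * x1 + w2 * x2 + b >= 0 else -1

# Truth table from the slide: (x1, x2, desired)
and_table = [(-1, -1, -1), (-1, 1, -1), (1, 1, 1), (1, -1, -1)]

for x1, x2, d in and_table:
    y = perceptron(x1, x2)
    print(f"x1={x1:+d} x2={x2:+d} desired={d:+d} output={y:+d}")
```

Under the same assumptions, OR is obtained with b = +1, NAND by negating the AND weights and bias, and NOR by negating the OR weights and bias.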

13
  • 3. A single-layer perceptron is incapable of
    learning simple functions such as XOR (exclusive
    OR). Explain why this is the case (hint: use the
    decision boundary).

14
  • X1 X2 D
  • -1 -1 -1
  • -1 1 1
  • 1 1 -1
  • 1 -1 1
  • No separating hyperplane exists in this case!

15
  • A single-layer perceptron is as follows
  • Write down and plot the equation of the decision
    boundary of this device
  • Change the values of w1 and w2 so that the
    perceptron can separate the following two-class
    patterns
  • Class 1 patterns: (1, 2), (1.5, 2.5), (1, 3)
  • Class 2 patterns: (2, 1.5), (2, 1)

16
  • Class 1: W1 + 2W2 > 0, 1.5W1 + 2.5W2 > 0, W1 + 3W2 > 0
  • ⇒ W1 > -2W2, W1 > -(5/3)W2, W1 > -3W2
  • ⇒ W1 > -(5/3)W2
  • Class 2: 2W1 + 1.5W2 < 0, 2W1 + W2 < 0
  • ⇒ W1 < -(1.5/2)W2, W1 < -(1/2)W2
  • ⇒ W1 < -(1.5/2)W2
  • ⇒ -(5/3)W2 < W1 < -(1.5/2)W2 (taking W2 > 0)
  • e.g. W2 = 1 gives -5/3 < W1 < -3/4, so (W1, W2) = (-1, 1)
    is one valid choice (checked in the sketch below)
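A quick numeric check of the assumed choice (W1, W2) = (-1, 1), using the decision boundary w1*x1 + w2*x2 = 0 from the question (no bias term):

```python
# Sketch: check that the assumed weights w1 = -1, w2 = 1 separate the two classes.
w1, w2 = -1.0, 1.0

class1 = [(1, 2), (1.5, 2.5), (1, 3)]   # should give w1*x1 + w2*x2 > 0
class2 = [(2, 1.5), (2, 1)]             # should give w1*x1 + w2*x2 < 0

for x1, x2 in class1:
    print((x1, x2), w1 * x1 + w2 * x2, "(> 0 expected)")
for x1, x2 in class2:
    print((x1, x2), w1 * x1 + w2 * x2, "(< 0 expected)")
```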

17
Perceptron pseudo-code
  • Pseudo code (a runnable Python sketch follows)
  • Input: X, delta, max_iteration, Error_criterion
  • Output: w
  • Begin
  •   Initialize w
  •   Do loop max_iteration
  •     For each data x:
  •       step 1: calculate R (the output)
  •       step 2: update w
  •     error(iteration) = (# of misclassified data) / (# of data)
  •     if error < Error_criterion then Break
  •   Return w
  • End
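A runnable Python sketch of the pseudo-code above. Assumed details not fixed by the slide: a sign activation, labels in {-1, +1}, the update w <- w + delta * d * x applied only to misclassified samples, and a bias handled by appending a constant +1 input.

```python
import numpy as np

def train_perceptron(X, D, delta=0.1, max_iteration=100, error_criterion=0.0):
    """Perceptron training following the pseudo-code above.
    X: (N, n_inputs) data, D: (N,) desired outputs in {-1, +1}; returns w."""
    X = np.hstack([X, np.ones((len(X), 1))])          # append bias input of +1
    w = np.zeros(X.shape[1])                          # initialize w
    for iteration in range(max_iteration):
        misclassified = 0
        for x, d in zip(X, D):
            r = 1.0 if w @ x >= 0 else -1.0           # step 1: calculate R (output)
            if r != d:                                # step 2: update w
                w += delta * d * x
                misclassified += 1
        error = misclassified / len(X)                # (# misclassified) / (# of data)
        if error < error_criterion:
            break
    return w

# Example: learn AND from the truth table used earlier
X = np.array([[-1, -1], [-1, 1], [1, 1], [1, -1]], dtype=float)
D = np.array([-1, -1, 1, -1], dtype=float)
print(train_perceptron(X, D, delta=0.1, max_iteration=50, error_criterion=1e-3))
```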

18
Q (Lecture 3)
  • The k-nearest neighbor classifier has to store all
    training data, creating a high storage
    requirement. Can you think of ways to reduce the
    storage requirement without affecting the
    performance? (Hint: search the Internet; you will
    find many approximation methods.)

19
  • Use a training-set size reduction scheme (select a
    few prototypes); one such scheme is sketched below
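One well-known scheme of this kind is condensed nearest neighbour (Hart's rule), which keeps only the prototypes needed for 1-NN to still classify every training sample correctly. A minimal sketch, assuming numeric feature vectors:

```python
import numpy as np

def condensed_nn(X, y):
    """Hart's condensed nearest neighbour: retain a subset of prototypes so that
    1-NN over the retained set still classifies every training sample correctly."""
    keep = [0]                                    # start with one stored sample
    changed = True
    while changed:
        changed = False
        for i in range(len(X)):
            # classify sample i with 1-NN over the currently kept prototypes
            d = np.linalg.norm(X[keep] - X[i], axis=1)
            nearest = keep[int(np.argmin(d))]
            if y[nearest] != y[i] and i not in keep:
                keep.append(i)                    # absorb misclassified samples
                changed = True
    return X[keep], y[keep]

# Example usage (toy data)
X = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1])
Xp, yp = condensed_nn(X, y)
print(len(Xp), "prototypes kept out of", len(X))
```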

20
Q (Lecture 4)
  • An application of the k-means algorithm to a
    2-dimensional feature space has produced the
    following three cluster prototypes: M1 = (1, 2),
    M2 = (2, 1) and M3 = (2, 2). Determine which
    cluster each of the following feature vectors will
    be classified into (hint: minimum Euclidean
    distance; the distances are computed in the sketch
    below).
  • (i) X1 = (1, 1)
  • (ii) X2 = (2, 3)
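A minimal sketch of the distance computation (note that X1 = (1, 1) is equidistant from M1 and M2, so its assignment depends on the tie-breaking rule):

```python
import numpy as np

# Cluster prototypes and feature vectors from the question
M = {"M1": np.array([1.0, 2.0]), "M2": np.array([2.0, 1.0]), "M3": np.array([2.0, 2.0])}
X = {"X1": np.array([1.0, 1.0]), "X2": np.array([2.0, 3.0])}

for xname, x in X.items():
    d = {mname: float(np.linalg.norm(x - m)) for mname, m in M.items()}
    nearest = min(d, key=d.get)        # minimum Euclidean distance (ties -> first)
    print(xname, {k: round(v, 3) for k, v in d.items()}, "-> assigned to", nearest)
```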

21
  • The k-means algorithm has been shown to minimize
    the following cost function. Derive an online
    version of the k-means algorithm following similar
    ideas to the delta rule. In this case the
    prototype is updated every time a training sample
    is presented; this is in contrast to the (batch)
    k-means algorithm, where the prototypes are
    updated only after all samples have been
    presented.

22
Online K-means
  • 1. Initialize K group centroids.
  • 2. For each sample:
  •    a. Assign the sample to the group that has the
       closest centroid.
  •    b. Recalculate the position of the assigned
       centroid: m_k ← m_k + delta·(x - m_k), as in
       the sketch below.
  • 3. Repeat Step 2 until the centroids no longer
    move or the maximum iteration number has been
    reached.
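A minimal Python sketch of this online version, assuming the delta-rule-style step m_k <- m_k + delta * (x - m_k) with a small learning rate delta:

```python
import numpy as np

def online_kmeans(X, K, delta=0.1, max_iteration=100, tol=1e-4):
    """Online k-means: the closest centroid moves towards each sample as it arrives."""
    rng = np.random.default_rng(0)
    M = X[rng.choice(len(X), K, replace=False)].copy()         # 1. initialize K centroids
    for _ in range(max_iteration):
        M_old = M.copy()
        for x in X:                                            # 2. for each sample
            k = int(np.argmin(np.linalg.norm(M - x, axis=1)))  # 2a. closest centroid
            M[k] += delta * (x - M[k])                         # 2b. delta-rule update
        if np.linalg.norm(M - M_old) < tol:                    # 3. centroids settled?
            break
    return M

# Example usage on two well-separated clusters
X = np.array([[0, 0], [0, 1], [1, 0], [9, 9], [9, 10], [10, 9]], dtype=float)
print(online_kmeans(X, K=2))
```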

23
Q (Lecture 5)
  • Derive a gradient descent training rule for a
    single unit with output y (hint: batch)

24
(No Transcript)
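A standard batch derivation sketch, assuming a linear unit y_n = w·x_n and a squared-error cost over the whole training set:

```latex
% Batch gradient descent for a single linear unit (assumption: y_n = w^T x_n)
\begin{align*}
E(\mathbf{w}) &= \tfrac{1}{2}\sum_{n}\left(d_n - y_n\right)^2,
  \qquad y_n = \mathbf{w}^{\top}\mathbf{x}_n \\
\frac{\partial E}{\partial w_i} &= -\sum_{n}\left(d_n - y_n\right)x_{n,i} \\
\Delta w_i &= -\eta\,\frac{\partial E}{\partial w_i}
            = \eta\sum_{n}\left(d_n - y_n\right)x_{n,i}
\end{align*}
```

With a sigmoid output y_n = σ(wᵀxₙ), the same steps simply add a factor y_n(1 - y_n) inside the sum.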
25
  • 2. A network consisting of two ADALINE units, N1
    and N2, is shown as follows. Derive a delta
    training rule for all the weights (hint: online)

26
(No Transcript)
27
  • 3. The connection weights of a two-input ADALINE
    at time n have the following values:
  • w0(n) = -0.5, w1(n) = 0.1, w2(n) = -0.3
  • The training sample at time n is:
  • x1(n) = 0.6, x2(n) = 0.8
  • The corresponding desired output is d(n) = 1
  • a) Based on the Least-Mean-Square (LMS) algorithm,
    derive the learning equations for each weight at
    time n
  • b) Assuming a learning rate of 0.1, compute the
    weights at time (n+1):
  • w0(n+1), w1(n+1), and w2(n+1).

28
  • Similar to the last question (omitted); a numeric
    sketch follows
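A minimal numeric sketch for question 3, assuming a linear ADALINE output with the bias treated as weight w0 on a constant input x0 = +1, and the LMS update w_i(n+1) = w_i(n) + eta * e(n) * x_i(n):

```python
# LMS (Widrow-Hoff) update for the two-input ADALINE of question 3.
# Assumptions: linear output, bias input x0 = +1, learning rate eta = 0.1.
w = [-0.5, 0.1, -0.3]          # w0(n), w1(n), w2(n)
x = [1.0, 0.6, 0.8]            # x0 = 1 (bias), x1(n), x2(n)
d, eta = 1.0, 0.1

y = sum(wi * xi for wi, xi in zip(w, x))      # y(n) = w0 + w1*x1 + w2*x2 = -0.68
e = d - y                                     # e(n) = d(n) - y(n) = 1.68
w_next = [wi + eta * e * xi for wi, xi in zip(w, x)]
print("y(n) =", y, " e(n) =", e)
print("w(n+1) =", w_next)      # approximately [-0.332, 0.2008, -0.1656]
```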

29
Q (Lecture 6)
  • Assume that a system uses a three-layer
    perceptron neural network to recognize 10
    hand-written digits 0, 1, 2, 3, 4, 5, 6, 7, 8, 9.
    Each digit is represented by a 9 × 9-pixel binary
    image, and therefore each sample is represented by
    an 81-dimensional binary vector. The network uses
    10 neurons in the output layer, each of which
    signifies one of the digits. The network uses 120
    hidden neurons. Each hidden neuron and output
    neuron also has a bias input.
  • (i) How many connection weights does the network
    contain?
  • (ii) For the training samples from each of the 10
    digits, write down their possible corresponding
    desired output vectors.
  • (iii) Describe briefly how the backpropagation
    algorithm can be applied to train the network.
  • (iv) Describe briefly how a trained network will
    be applied to recognize an unknown input.

30
  • (i) (81 × 120 + 120) + (120 × 10 + 10) = 9840 +
    1210 = 11050 (checked in the sketch below)
  • (ii) 0000000001 → digit 0
  •      0000000010 → digit 1
  •      0000000100 → digit 2
  •      ...
  •      1000000000 → digit 9
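A quick check of the weight count and of one possible target coding (following the slide's convention, where digit k has its 1 in position k counting from the right):

```python
import numpy as np

# (i) Connection weights of an 81-120-10 MLP, counting the bias of every
#     hidden and output neuron as an extra weight.
n_in, n_hidden, n_out = 81, 120, 10
n_weights = (n_in * n_hidden + n_hidden) + (n_hidden * n_out + n_out)
print(n_weights)                                   # 9840 + 1210 = 11050

# (ii) One possible desired-output coding: digit k -> 1 in position k from the right.
targets = np.eye(n_out, dtype=int)[::-1]
for k in (0, 1, 2, 9):
    print("".join(map(str, targets[k])), "->", k)
```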

31
  • (iii)

32
  • (iv) Classification is based on the output y(k) of
    the multilayer perceptron given the unknown input:
    the digit whose output neuron responds most
    strongly is chosen.
33
  • The network shown in the Figure is a 3-layer
    feed-forward network. Neuron 1, Neuron 2 and
    Neuron 3 are McCulloch-Pitts neurons which use a
    threshold function as their activation function.
    All the connection weights and the biases of
    Neuron 1 and Neuron 2 are shown in the Figure.
    Find an appropriate value for the bias of Neuron
    3, b3, to enable the network to solve the XOR
    problem (assume bits 0 and 1 are represented by
    levels 0 and 1, respectively). Show your working.

34
(No Transcript)
35
  • Considering case 2, we conclude that b3 should be
    chosen in the range (-0.5, 0).

36
  • Consider a 3-layer perceptron with two inputs a
    and b, one hidden unit c and one output unit d.
    The network has five weights, which are all
    initialized to 0.1. Give their values after the
    presentation of each of the following training
    samples:
  • Input            Desired output
  • a = 1, b = 0     1
  • a = 0, b = 1     0

37
  • Y = sigmoid(sigmoid(0.1a + 0.1b + 0.1·1)·0.1 + 0.1)
  • Output: delta_O = y(1 - y)(d - y)
  • Hidden: delta_H = sigmoid(0.1a + 0.1b + 0.1·1)·
    [1 - sigmoid(0.1a + 0.1b + 0.1·1)]·0.1·delta_O

38
  • W_cd = 0.1 + eta·delta_O·sigmoid(0.1a + 0.1b + 0.1·1)
  • W_ac = 0.1 + eta·delta_H·a
  • W_bc = 0.1 + eta·delta_H·b
  • (one full training step is sketched below)
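A minimal sketch of the two sample presentations, assuming a logistic sigmoid, bias inputs fixed at +1, and an assumed learning rate eta = 0.1; the two biases are updated in the same way even though the slide lists only W_cd, W_ac and W_bc:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_step(a, b, d, w_ac, w_bc, b_c, w_cd, b_d, eta=0.1):
    """One online backpropagation step for the a,b -> c -> d network above."""
    c = sigmoid(w_ac * a + w_bc * b + b_c)        # hidden output
    y = sigmoid(w_cd * c + b_d)                   # network output
    delta_o = y * (1 - y) * (d - y)               # output delta
    delta_h = c * (1 - c) * w_cd * delta_o        # hidden delta (uses old w_cd)
    w_cd += eta * delta_o * c                     # W_cd update
    b_d  += eta * delta_o                         # output bias update
    w_ac += eta * delta_h * a                     # W_ac update
    w_bc += eta * delta_h * b                     # W_bc update
    b_c  += eta * delta_h                         # hidden bias update
    return w_ac, w_bc, b_c, w_cd, b_d

# All five weights start at 0.1; present the two samples in turn
params = (0.1, 0.1, 0.1, 0.1, 0.1)
params = train_step(1, 0, 1, *params)             # a=1, b=0, desired 1
print(params)
params = train_step(0, 1, 0, *params)             # a=0, b=1, desired 0
print(params)
```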

39
Q (Lecture 7)
  • Customers' responses to a market survey are as
    follows. The attributes are age, which takes the
    values young (Y), middle age (M) and old (O);
    income, which can be low (L) or high (H); and
    ownership of a credit card, which can be yes (Y)
    or no (N). Design a Naïve Bayesian classifier to
    decide whether customer David will respond or not.

40
  • David's response

41
  • P(response = Y) · P(Age = M | response = Y) ·
    P(Income = L | response = Y) · P(Credit = Y | response = Y)
    = 0.4 × 0.2 × 0.3 × 0.7 = 0.0168
  • P(response = N) · P(Age = M | response = N) ·
    P(Income = L | response = N) · P(Credit = Y | response = N)
    = 0.6 × 0.33 × 0.42 × 0.42 ≈ 0.0349
  • 0.0349 > 0.0168 (the comparison is reproduced in
    the sketch below)
  • ⇒ David's response should be NO!
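A minimal check of the comparison, using the probability values read from the slide:

```python
# Naive Bayes score for each class: prior times the product of the
# class-conditional probabilities of David's attribute values (from the slide).
p_yes = 0.4 * 0.2 * 0.3 * 0.7     # P(Y) * P(Age=M|Y) * P(Income=L|Y) * P(Credit=Y|Y)
p_no  = 0.6 * 0.33 * 0.42 * 0.42  # P(N) * P(Age=M|N) * P(Income=L|N) * P(Credit=Y|N)
print(f"score(respond=Yes) = {p_yes:.4f}")   # 0.0168
print(f"score(respond=No)  = {p_no:.4f}")    # 0.0349
print("Predicted response:", "Yes" if p_yes > p_no else "No")
```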