lxmonk/nlg12_hw2


NLP12 Assignment 2: Bayesian Curve Fitting, Classification

NOTES:

  1. The script I used to run the code while preparing this assignment is written for IPython [1]. A detailed session, including outputs, is given in session.ipy.
  2. Some equations in this document need JavaScript and an internet connection (to http://orgmode.org/ for the rendering functions) to display.

1 Polynomial Curve Fitting

1.1 Synthetic Dataset Generation

I used this code: #+INCLUDE “code/hw2.py” src python :lines “1-17”

#+INCLUDE “code/session.ipy” src python :lines “5-14”

And got this scatter plot (Figure 1):

images/generateDataset(50,sin,0.03).png
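
The included listing is not reproduced in this rendering. As a rough sketch of what such a generator might look like, assuming NumPy, evenly spaced inputs on [0, 1], and additive Gaussian noise with standard deviation sigma (the name `generate_dataset`, the interval, and the `seed` argument are illustrative assumptions, not taken from hw2.py):

```python
import numpy as np

def generate_dataset(n, f, sigma, low=0.0, high=1.0, seed=None):
    """Return (x, t): n evenly spaced inputs and noisy targets f(x) + noise.

    The interval [low, high] and the seed argument are assumptions made
    for this sketch; the noise is Gaussian with standard deviation sigma.
    """
    rng = np.random.default_rng(seed)
    x = np.linspace(low, high, n)
    t = f(x) + rng.normal(0.0, sigma, size=n)
    return x, t

# Same arguments as in the figure name: N=50, f=sin, sigma=0.03.
x, t = generate_dataset(50, np.sin, 0.03, seed=0)
```

Fixing the seed makes the scatter plot reproducible between runs.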

1.2 Polynomial Curve Fitting

I used #+INCLUDE “code/hw2.py” src python :lines “19-30”

and ran #+INCLUDE “code/session.ipy” src python :lines “16-38”

to get Figure 2

images/Q1.2_sigma=0.03.png

but this noise level ($\sigma = 0.03$) seemed a bit too small, so I also ran: #+INCLUDE “code/session.ipy” src python :lines “39-56”

to get Figure 3:

images/Q1.2_sigma=0.1.png

I feel this makes the over-fitting more obvious.
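
For readers without access to the included listings, here is a minimal sketch of an unregularized polynomial least-squares fit. It assumes NumPy and a monomial basis $(1, x, \dots, x^M)$; the names `design_matrix`, `optimize_ls`, and `predict` are hypothetical and may differ from those in hw2.py.

```python
import numpy as np

def design_matrix(x, M):
    """Rows are (1, x, x^2, ..., x^M), one row per input point."""
    return np.vander(np.asarray(x, dtype=float), M + 1, increasing=True)

def optimize_ls(x, t, M):
    """Least-squares weights of a degree-M polynomial fitted to (x, t)."""
    Phi = design_matrix(x, M)
    w, *_ = np.linalg.lstsq(Phi, t, rcond=None)
    return w

def predict(w, x):
    """Evaluate the polynomial with weights w at the points x."""
    return design_matrix(x, len(w) - 1) @ w

# Noise-free sanity check: a quadratic is recovered by a degree-2 fit.
x = np.linspace(0.0, 1.0, 20)
t = 1.0 + 2.0 * x - 3.0 * x**2
w = optimize_ls(x, t, 2)
```

Raising M toward the number of points lets the polynomial chase the noise, which is the over-fitting effect the figures illustrate.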

1.3 Polynomial Curve Fitting with Regularization

Using the standard penalty function:

\begin{equation} E_W(w) = \frac{1}{2} W^T \cdot W = \frac{1}{2} \sum_{m=1}^{M} W_m^2 \end{equation}

and the given solution to the penalized least-squares problem: \begin{equation} W_{PLS} = (\Phi^T \Phi + \lambda \mathrm{I})^{-1} \Phi^T t \end{equation}

I wrote: #+INCLUDE “code/hw2.py” src python :lines “31-46”
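
The closed-form solution above can be sketched directly in NumPy. This is an illustration under assumptions, not the listing from hw2.py: the basis is the monomial basis, the names `design_matrix` and `optimize_pls` are hypothetical, and the linear system is solved with `np.linalg.solve` instead of forming the inverse explicitly.

```python
import numpy as np

def design_matrix(x, M):
    """Rows are (1, x, x^2, ..., x^M), one row per input point."""
    return np.vander(np.asarray(x, dtype=float), M + 1, increasing=True)

def optimize_pls(x, t, M, lam):
    """Penalized least squares: solve (Phi^T Phi + lam * I) w = Phi^T t."""
    Phi = design_matrix(x, M)
    A = Phi.T @ Phi + lam * np.eye(M + 1)
    return np.linalg.solve(A, Phi.T @ t)

x = np.linspace(0.0, 1.0, 20)
t = np.sin(2 * np.pi * x)
w_free = optimize_pls(x, t, 3, 0.0)   # lam = 0 reduces to ordinary least squares
w_reg = optimize_pls(x, t, 3, 10.0)   # a large lam shrinks the weights toward zero
```

Comparing the norms of `w_free` and `w_reg` shows the shrinkage that the penalty term is meant to produce.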

To generate the 3 slices of the data set: #+INCLUDE “code/hw2.py” src python :lines “47-59”
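
One way to produce three slices is to generate a single noisy dataset of 3N points and partition it at random. The sketch below assumes that scheme; the name `generate_dataset3`, the [0, 1] interval, and the equal slice sizes are illustrative assumptions and may not match hw2.py.

```python
import numpy as np

def generate_dataset3(n, f, sigma, seed=None):
    """Return three disjoint (x, t) slices of n points each.

    A single dataset of 3n noisy points is generated and randomly
    partitioned into training, validation, and test slices.
    """
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, 3 * n)
    t = f(x) + rng.normal(0.0, sigma, size=3 * n)
    # Shuffle the indices, cut them into three equal parts, and sort each
    # part so that every slice is ordered by x (convenient for plotting).
    parts = np.split(rng.permutation(3 * n), 3)
    return [(x[np.sort(p)], t[np.sort(p)]) for p in parts]

train_set, valid_set, test_set = generate_dataset3(10, np.sin, 0.1, seed=0)
```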

To get the error term for given $x_i$, $t_i$, and $M$, I used the normalized error function on the training set and on the other sets.
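
As an illustration of such an error measure, the sketch below uses the root-mean-square residual. The exact normalization in hw2.py is not shown in this document, so the RMS form and the names `predict` and `normalized_error` are assumptions.

```python
import numpy as np

def predict(w, x):
    """Evaluate the polynomial with weights (w_0, ..., w_M) at the points x."""
    x = np.asarray(x, dtype=float)
    return np.vander(x, len(w), increasing=True) @ w

def normalized_error(x, t, w):
    """Root-mean-square residual of the model w on the set (x, t)."""
    residual = np.asarray(t, dtype=float) - predict(w, x)
    return np.sqrt(np.mean(residual**2))

# The model y(x) = x has zero error on t = x and error 2 on t = x + 2.
w = np.array([0.0, 1.0])
x = np.array([0.0, 1.0, 2.0])
err_exact = normalized_error(x, x, w)
err_shifted = normalized_error(x, x + 2.0, w)
```

Dividing by the number of points keeps errors comparable between sets of different sizes, which matters when the training and validation sets differ in size.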

1.3.1 N=10

#+INCLUDE “code/session.ipy” src python :lines “57-82” Producing:

images/Q1.3_M=1_N=10_sigma=0.1.png

images/Q1.3_M=3_N=10_sigma=0.1.png

images/Q1.3_M=5_N=10_sigma=0.1.png

images/Q1.3_M=10_N=10_sigma=0.1.png

1.3.2 N=100

#+INCLUDE “code/session.ipy” src python :lines “84-116”

images/Q1.3_M=1_N=100_sigma=0.1.png

images/Q1.3_M=3_N=100_sigma=0.1.png

images/Q1.3_M=5_N=100_sigma=0.1.png

images/Q1.3_M=10_N=100_sigma=0.1.png

images/Q1.3_M=20_N=100_sigma=0.1.png

images/Q1.3_M=40_N=100_sigma=0.1.png

images/Q1.3_M=60_N=100_sigma=0.1.png

images/Q1.3_M=80_N=100_sigma=0.1.png

images/Q1.3_M=100_N=100_sigma=0.1.png

My conclusion is that (as pointed out in class) choosing the $\lambda$ value that minimizes the error on the validation set is a good heuristic for finding the value that will minimize the error on the test set. Therefore, I wrote LoptimizePLS(xt, tt, xv, tv, M) so that it chooses the $\lambda$ with the minimal error on the validation set. It is also worth mentioning that, in these runs, a $\lambda$ value greater than 1 was not very helpful.

#+INCLUDE “code/hw2.py” src python :lines “87-104”
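
The selection rule described above can be sketched as a grid search over $\lambda$. This is an assumption-laden illustration, not the included listing: the log-spaced grid, the RMS error measure, and the snake_case names (`l_optimize_pls` and the helpers) are hypothetical.

```python
import numpy as np

def design_matrix(x, M):
    """Rows are (1, x, x^2, ..., x^M), one row per input point."""
    return np.vander(np.asarray(x, dtype=float), M + 1, increasing=True)

def optimize_pls(x, t, M, lam):
    """Penalized least-squares weights for a given lam."""
    Phi = design_matrix(x, M)
    return np.linalg.solve(Phi.T @ Phi + lam * np.eye(M + 1), Phi.T @ t)

def rms_error(x, t, w):
    """Root-mean-square residual of the model w on the set (x, t)."""
    return np.sqrt(np.mean((t - design_matrix(x, len(w) - 1) @ w) ** 2))

def l_optimize_pls(xt, tt, xv, tv, M, log_lambdas=None):
    """Fit on (xt, tt) for each candidate lam; keep the lam that gives
    the smallest error on the validation set (xv, tv)."""
    if log_lambdas is None:
        log_lambdas = np.arange(-20, 6)  # assumed grid: ln(lam) from -20 to 5
    lambdas = np.exp(np.asarray(log_lambdas, dtype=float))
    errors = [rms_error(xv, tv, optimize_pls(xt, tt, M, lam)) for lam in lambdas]
    best = int(np.argmin(errors))
    return lambdas[best], errors[best]

rng = np.random.default_rng(0)
xt = np.linspace(0.0, 1.0, 10)
xv = np.linspace(0.05, 0.95, 10)
tt = np.sin(2 * np.pi * xt) + rng.normal(0.0, 0.1, size=10)
tv = np.sin(2 * np.pi * xv) + rng.normal(0.0, 0.1, size=10)
best_lam, best_err = l_optimize_pls(xt, tt, xv, tv, M=5)
```

The test set is deliberately left out of the search, so it can still serve as an unbiased check of the chosen $\lambda$.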

1.4 Probabilistic Regression Framework

The goal is to return the predictive mean and variance defined by the following equations:

\begin{equation} m(x) = \frac{1}{\sigma^2} \Phi(x)^T S \sum_{n=1}^{N} \Phi(x_n) t_n \end{equation}

\begin{equation} var(x) = S^2(x) = \sigma^2 + \Phi(x)^T S \Phi(x) \end{equation}

\begin{equation} S^{-1} = \alpha I + \frac{1}{\sigma^2} \sum_{n=1}^{N} \Phi(x_n) \Phi(x_n)^T \end{equation}

The implementation is: #+INCLUDE “code/hw2.py” src python :lines “106-128”
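
The three equations above translate almost line by line into NumPy. The sketch below is an illustration under assumptions and not the included listing: it uses a monomial basis, the name `bayesian_estimator` and its return convention (two callables) are hypothetical, and the values of `alpha` and `sigma2` are arbitrary example settings.

```python
import numpy as np

def design_matrix(x, M):
    """Rows are (1, x, ..., x^M); scalar inputs are accepted as well."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    return np.vander(x, M + 1, increasing=True)

def bayesian_estimator(x, t, M, alpha, sigma2):
    """Return callables (mean, var) for the predictive distribution."""
    Phi = design_matrix(x, M)
    # S^{-1} = alpha * I + (1 / sigma^2) * sum_n Phi(x_n) Phi(x_n)^T
    S_inv = alpha * np.eye(M + 1) + (Phi.T @ Phi) / sigma2
    S = np.linalg.inv(S_inv)
    weighted_targets = Phi.T @ t  # sum_n Phi(x_n) t_n

    def mean(x_new):
        # m(x) = (1 / sigma^2) * Phi(x)^T S sum_n Phi(x_n) t_n
        return design_matrix(x_new, M) @ (S @ weighted_targets) / sigma2

    def var(x_new):
        # var(x) = sigma^2 + Phi(x)^T S Phi(x), evaluated row by row
        Phi_new = design_matrix(x_new, M)
        return sigma2 + np.sum((Phi_new @ S) * Phi_new, axis=1)

    return mean, var

x = np.linspace(0.0, 1.0, 10)
t = np.sin(2 * np.pi * x)
mean, var = bayesian_estimator(x, t, M=3, alpha=0.005, sigma2=0.01)
grid = np.linspace(0.0, 1.0, 50)
m_grid, v_grid = mean(grid), var(grid)
```

Note that the predictive mean coincides with the penalized least-squares fit for $\lambda = \alpha \sigma^2$, which gives a convenient cross-check between the two parts of the assignment.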

running: #+INCLUDE “code/session.ipy” src python :lines “112-127” resulted in Figure 4:

images/bishop_N=10_sin(x).png

and for $N=100$: #+INCLUDE “code/session.ipy” src python :lines “112-127” resulted in Figure 5:

images/bishop_N=100_sin(x).png

However, Bishop used $\sin(2 \pi x)$, which looks nicer, so I tried that too: #+INCLUDE “code/session.ipy” src python :lines “147-183”

images/bishop_N=10_sin(2*pi*x).png

images/bishop_N=100_sin(2*pi*x).png

We should notice that, in contrast to Bishop's figure (see below), in our graphs the $\sigma^2$ values visibly decrease on the ‘linear’ parts of the sinusoid and increase on the ‘curved’ ones.

http://www.cs.bgu.ac.il/~elhadad/nlp12/prmlfigs-png/Figure1.17.png

2 Footnotes

[1] Fernando Pérez, Brian E. Granger, IPython: A System for Interactive Scientific Computing, Computing in Science and Engineering, vol. 9, no. 3, pp. 21-29, May/June 2007, doi:10.1109/MCSE.2007.53. URL: http://ipython.org
