Ex. 4.6

Suppose we have $N$ points $x_i$ in $\mathbb{R}^p$ in general position, with class labels $y_i \in \{-1, 1\}$. Prove that the perceptron learning algorithm converges to a separating hyperplane in a finite number of steps:

  • (a) Denote a hyperplane by $f(x) = \beta_1^T x + \beta_0 = 0$, or in the more compact notation $\beta^T x^* = 0$, where $x^* = (x^T, 1)^T$ and $\beta = (\beta_1^T, \beta_0)^T$. Let $z_i = x_i^* / \|x_i^*\|$. Show that separability implies the existence of a $\beta_{\text{sep}}$ such that $y_i \beta_{\text{sep}}^T z_i \ge 1$ for all $i$.
  • (b) Given a current $\beta_{\text{old}}$, the perceptron algorithm identifies a misclassified point $z_i$ and produces the update $\beta_{\text{new}} \leftarrow \beta_{\text{old}} + y_i z_i$. Show that $\|\beta_{\text{new}} - \beta_{\text{sep}}\|^2 \le \|\beta_{\text{old}} - \beta_{\text{sep}}\|^2 - 1$, and hence that the algorithm converges to a separating hyperplane in no more than $\|\beta_{\text{start}} - \beta_{\text{sep}}\|^2$ steps (Ripley, 1996, Pattern Recognition and Neural Networks).
Soln. 4.6
  • (a) By definition of separability, there exists a $\beta$ such that
$$\beta^T x_i^* > 0 \quad \text{for } y_i = 1, \qquad \beta^T x_i^* < 0 \quad \text{for } y_i = -1.$$

Thus $y_i \beta^T x_i^* > 0$ for all $i$, and hence $y_i \beta^T z_i > 0$ for all $i$, since dividing by $\|x_i^*\| > 0$ preserves the sign.

Define
$$m := \min_i y_i \beta^T z_i > 0.$$

Then $y_i \left(\tfrac{1}{m}\beta\right)^T z_i \ge 1$ for all $i$, so $\beta_{\text{sep}} := \tfrac{1}{m}\beta$ satisfies $y_i \beta_{\text{sep}}^T z_i \ge 1$ for all $i$.
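The rescaling argument above can be checked numerically. The sketch below builds $\beta_{\text{sep}} = \beta/m$ on toy data; the two Gaussian clusters, the candidate separator `beta`, and all variable names are illustrative assumptions, not part of the exercise.

```python
import numpy as np

# Hypothetical toy data: two separable Gaussian clusters in R^2
# (an assumption for illustration; the exercise specifies no data).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(2.0, 0.5, size=(10, 2)),
               rng.normal(-2.0, 0.5, size=(10, 2))])
y = np.array([1] * 10 + [-1] * 10)

# Augment each point with a 1 for the intercept, then normalize:
# z_i = x_i^* / ||x_i^*||.
X_star = np.hstack([X, np.ones((len(X), 1))])
Z = X_star / np.linalg.norm(X_star, axis=1, keepdims=True)

# Any beta that separates the data works; (1, 1, 0) does here.
beta = np.array([1.0, 1.0, 0.0])
m = np.min(y * (Z @ beta))   # m = min_i y_i beta^T z_i, positive by separability
beta_sep = beta / m          # rescale so the smallest margin becomes 1

# Every point now has margin >= 1 (up to floating-point rounding).
print(np.min(y * (Z @ beta_sep)))
```

The key step is that normalizing $z_i$ and dividing $\beta$ by the smallest margin $m$ only rescales the score, so the sign pattern, and hence the hyperplane, is unchanged.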

  • (b) Since $z_i$ is misclassified by $\beta_{\text{old}}$, we have $y_i \beta_{\text{old}}^T z_i \le 0$, and $\|y_i z_i\| = \|z_i\| = 1$ by construction. Therefore
$$\begin{aligned}
\|\beta_{\text{new}} - \beta_{\text{sep}}\|^2 &= \|\beta_{\text{old}} - \beta_{\text{sep}} + y_i z_i\|^2 \\
&= \|\beta_{\text{old}} - \beta_{\text{sep}}\|^2 + \|y_i z_i\|^2 + 2 y_i (\beta_{\text{old}} - \beta_{\text{sep}})^T z_i \\
&= \|\beta_{\text{old}} - \beta_{\text{sep}}\|^2 + 1 + 2 y_i \beta_{\text{old}}^T z_i - 2 y_i \beta_{\text{sep}}^T z_i \\
&\le \|\beta_{\text{old}} - \beta_{\text{sep}}\|^2 + 1 + 2 \cdot 0 - 2 \cdot 1 \\
&= \|\beta_{\text{old}} - \beta_{\text{sep}}\|^2 - 1.
\end{aligned}$$

Since $\|\beta - \beta_{\text{sep}}\|^2 \ge 0$ and each update decreases it by at least 1, the algorithm can make at most $\|\beta_{\text{start}} - \beta_{\text{sep}}\|^2$ updates before no misclassified points remain.
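The per-step contraction and the resulting step bound can be sketched numerically. Below, a minimal perceptron loop runs on normalized toy points, asserting that each update shrinks $\|\beta - \beta_{\text{sep}}\|^2$ by at least 1 and that the total step count respects the bound; the data, the separator used to build `beta_sep`, and all names are illustrative assumptions.

```python
import numpy as np

# Hypothetical separable toy data, augmented and normalized as in part (a).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(2.0, 0.5, size=(10, 2)),
               rng.normal(-2.0, 0.5, size=(10, 2))])
y = np.array([1] * 10 + [-1] * 10)
X_star = np.hstack([X, np.ones((len(X), 1))])
Z = X_star / np.linalg.norm(X_star, axis=1, keepdims=True)

# Build beta_sep from a known separator, as in part (a).
beta = np.array([1.0, 1.0, 0.0])
beta_sep = beta / np.min(y * (Z @ beta))

# Perceptron: start from zero and update on any misclassified point
# (y_i beta^T z_i <= 0) until none remain.
b = np.zeros(3)
steps = 0
while True:
    mis = np.flatnonzero(y * (Z @ b) <= 0)
    if mis.size == 0:
        break
    i = mis[0]
    d_old = np.sum((b - beta_sep) ** 2)
    b = b + y[i] * Z[i]                  # beta_new = beta_old + y_i z_i
    d_new = np.sum((b - beta_sep) ** 2)
    assert d_new <= d_old - 1 + 1e-9     # distance drops by at least 1
    steps += 1

# Bound: at most ||beta_start - beta_sep||^2 = ||beta_sep||^2 updates.
assert steps <= np.sum(beta_sep ** 2) + 1e-9
print(steps, np.sum(beta_sep ** 2))
```

Starting from $\beta_{\text{start}} = 0$ makes the bound simply $\|\beta_{\text{sep}}\|^2$; the asserts mirror the two inequalities of the proof line by line.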