Ex. 2.1

Ex. 2.1

Suppose each of \(K\)-classes has associated target \(t_k\), which is a vector of all zeros, except a one in the \(k\)-th position. Show that classifying to the largest of \(\hat y\) amounts to choosing the closet target, \(\min_k\|t_k-\hat y\|\), if the elements of \(\hat y\) sum to one.

Soln. 2.1

We need to prove:

\[\begin{equation} \underset{k}{\operatorname{argmax}} \hat y_k = \underset{k}{\operatorname{argmin}} \|t_k-\hat y\|^2 \label{eq:2-1a} \end{equation}\]

By definition of \(t_k\), we have

\[\begin{align} \|t_k-\hat y\|^2 &= (1-\hat y_k)^2 + \sum_{l \neq k }(0 - \hat y_l)^2\nonumber\\ &= (1-\hat y_k)^2 + \sum_{l \neq k }\hat y_l^2\nonumber\\ &= 1 - 2\hat y_k + \sum\hat y_l^2 \label{eq:2-1b} \end{align}\]

Given \(\eqref{eq:2-1b}\), it's straightforward to see that \(\eqref{eq:2-1a}\) indeed holds because only \(-2\hat y_k\) depends on \(k\).

Remark

The assumption \(\sum_{k=1}^K\hat y_k=1\) is actually not required.