Ex. 2.1
Suppose each of the \(K\) classes has an associated target \(t_k\), which is a vector of all zeros, except a one in the \(k\)-th position. Show that classifying to the largest element of \(\hat y\) amounts to choosing the closest target, \(\min_k\|t_k-\hat y\|\), if the elements of \(\hat y\) sum to one.
Soln. 2.1
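Since the norm is nonnegative, minimizing \(\|t_k-\hat y\|\) is equivalent to minimizing its square, so we need to prove: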
\[\begin{equation}
\underset{k}{\operatorname{argmax}} \hat y_k = \underset{k}{\operatorname{argmin}} \|t_k-\hat y\|^2
\label{eq:2-1a}
\end{equation}\]
By definition of \(t_k\), we have
\[\begin{align}
\|t_k-\hat y\|^2
&= (1-\hat y_k)^2 + \sum_{l \neq k }(0 - \hat y_l)^2\nonumber\\
&= (1-\hat y_k)^2 + \sum_{l \neq k }\hat y_l^2\nonumber\\
&= 1 - 2\hat y_k + \sum_{l=1}^K\hat y_l^2
\label{eq:2-1b}
\end{align}\]
In \(\eqref{eq:2-1b}\), only the term \(-2\hat y_k\) depends on \(k\), so minimizing \(\|t_k-\hat y\|^2\) over \(k\) is the same as maximizing \(\hat y_k\), which is exactly \(\eqref{eq:2-1a}\).
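As a sanity check, the identity can also be tested numerically. The following is a minimal sketch in plain Python, not part of the original solution; the helper names `one_hot`, `sq_dist`, `closest_target` and `largest_component` are hypothetical, and indices are 0-based. It draws random vectors \(\hat y\) whose elements sum to one, compares the squared distance with the closed form \(1 - 2\hat y_k + \sum_l \hat y_l^2\), and checks that the closest target matches the largest component.

```python
import random


def one_hot(k, K):
    """Target t_k: all zeros except a one in position k (0-based here)."""
    return [1.0 if l == k else 0.0 for l in range(K)]


def sq_dist(t, y):
    """Squared Euclidean distance ||t - y||^2."""
    return sum((ti - yi) ** 2 for ti, yi in zip(t, y))


def closest_target(y):
    """Index k that minimizes ||t_k - y||^2."""
    K = len(y)
    return min(range(K), key=lambda k: sq_dist(one_hot(k, K), y))


def largest_component(y):
    """Index k that maximizes y_k."""
    return max(range(len(y)), key=lambda k: y[k])


random.seed(0)  # fixed seed so the check is reproducible
K = 5
for _ in range(1000):
    raw = [random.random() for _ in range(K)]
    total = sum(raw)
    y = [v / total for v in raw]  # normalize so the elements sum to one

    # Closed form from the last line of the derivation.
    sum_sq = sum(v * v for v in y)
    for k in range(K):
        closed_form = 1 - 2 * y[k] + sum_sq
        assert abs(sq_dist(one_hot(k, K), y) - closed_form) < 1e-12

    # Closest target and largest component pick the same class.
    assert closest_target(y) == largest_component(y)
```

Exact ties between components are not handled specially here; with continuous random draws they are not expected to occur.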
Remark
The assumption \(\sum_{k=1}^K\hat y_k=1\) is actually not required: the derivation above never uses it, so the result holds for any real vector \(\hat y\).
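The remark can be illustrated with vectors that do not sum to one and may even have negative entries. This is a small sketch under the same assumptions as the exercise (one-hot targets, Euclidean distance); the function name `nearest_one_hot` is hypothetical and indices are 0-based.

```python
import random


def nearest_one_hot(y):
    """Index k of the one-hot target t_k closest to y in Euclidean distance."""
    K = len(y)

    def sq_dist_to_target(k):
        # ||t_k - y||^2, where t_k has a one in position k and zeros elsewhere.
        return sum(((1.0 if l == k else 0.0) - y[l]) ** 2 for l in range(K))

    return min(range(K), key=sq_dist_to_target)


random.seed(1)  # fixed seed so the check is reproducible
for _ in range(1000):
    # Arbitrary real entries: no normalization, negative values allowed.
    y = [random.uniform(-5.0, 5.0) for _ in range(4)]
    assert nearest_one_hot(y) == max(range(4), key=lambda k: y[k])
```

For example, `nearest_one_hot([-1.0, -0.5, -4.0])` selects index 1, the position of the largest entry, even though every entry is negative.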