Ex. 11.4

Ex. 11.4

Consider a neural network for a \(K\) class outcome that uses cross-entropy loss. If the network has no hidden layer, show that the model is equivalent to the multinomial logistic model described in Chapter 4.

Soln. 11.4

From (11.5) in the text, if there are no hidden layers, then

\[\begin{equation} T_k = \beta_{0k} + \beta_k^TX, \ k=1,...,K\non \end{equation}\]

thus by (11.6) in the text

\[\begin{eqnarray} f_k(X) &=& g_k(X)\non\\ &=& \frac{\exp(\beta_{0k} + \beta_k^TX)}{\sum_{l=1}^K\exp(\beta_{0l} + \beta_l^TX)}.\non \end{eqnarray}\]

If we normalize these probabilities by

\[\begin{equation} f_k(x) \leftarrow f_k(x)/f_K(x) \cdot \frac{1}{1+\sum_{l=1}^{K-1}\exp(\beta_{0l} + \beta_l^TX)},\ k=1,...,K \non \end{equation}\]

we get exactly the multinomial logistic model studied in Ex. 4.4.