Ex. 11.1

Ex. 11.1

Establish the exact correspondence between the projection pursuit regression model (11.1) and the neural network (11.5). In particular, show that the single-layer regression network is equivalent to a PPR model with \(g_m(\omega_m^Tx) = \beta_m\sigma(\alpha_{0m}+s_m(\omega_m^Tx))\), where \(\omega_m\) is the \(m\)th unit vector. Establish a similar equivalence for a classification network.

Soln. 11.1

Let \(K=1\), from (11.5) we have

\[\begin{eqnarray} f(X) &=& g(T) \non\\ &=&g(\beta_0+\beta^TZ)\non\\ &=&g(\beta_0 + \sum_{m=1}^M\beta_m\sigma(\alpha_{0m} + \alpha_m^TX)).\non \end{eqnarray}\]

Consider \(\beta_0\) added as the bias term into \(X\), and assume as usual that \(g\) is the identity function, we have

\[\begin{equation} f(X) = \sum_{m=0}^M\beta_m\sigma(\alpha_{0m} + \alpha_m^TX).\non \end{equation}\]

Comparing with (11.1), we have

\[\begin{equation} g_m(\omega_m^TX) = \beta_m\sigma(\alpha_{0m} + \|\alpha_m\|(\omega_m^TX))\non \end{equation}\]

where \(\omega_m = \alpha_m/\|\alpha_m\|\).

For a classification network, let \(K > 1\). Assume that (see (11.6) in the text)

\[\begin{equation} g_k(T) = \frac{e^{T_k}}{\sum_{l=1}^Ke^{T_l}},\non \end{equation}\]

by similar calculations above we get

\[\begin{equation} f_k(X) = \frac{e^{\sum_{m=0}^M\beta_{mk}\sigma(\alpha_{0m} + \|\alpha_m\|(\omega_m^TX))}}{\sum_{l=1}^Ke^{\sum_{m=0}^M\beta_{ml}\sigma(\alpha_{0m} + \|\alpha_m\|(\omega_m^TX))}}.\non \end{equation}\]

Note that \(\sum_{k=1}^Kf_k(X) = 1\). Instead of model \(f_k(X)\), it's more convenient to model the log ratio \(\log(f_k(X)/f_K(X))\), which is then simplified to

\[\begin{equation} \sum_{m=0}^M(\beta_{km}-\beta_{Km})\sigma(\alpha_{0m} + \|\alpha_m\|(\omega_m^TX)),\non \end{equation}\]

which is in the PPR form of (11.1).