Ex. 10.1

Derive expression (10.12) for the update parameter in AdaBoost.

Soln. 10.1

Plugging \(G_m\) from (10.11) into the criterion (10.9), we solve for \(\beta\) by taking the partial derivative with respect to \(\beta\) and setting it to zero, which gives

\[\begin{equation} \sum_{i=1}^Nw_i^{(m)}y_iG_m(x_i)\exp(-\beta y_i G_m(x_i)) = 0,\non \end{equation}\]

Noting that \(y_iG_m(x_i) = 1\) when \(y_i = G_m(x_i)\) and \(y_iG_m(x_i) = -1\) otherwise, this is equivalent to

\[\begin{equation} \sum_{y_i = G_m(x_i)}w_i^{(m)}\exp(-\beta) - \sum_{y_i \neq G_m(x_i)}w_i^{(m)}\exp(\beta) = 0.\non \end{equation}\]

Multiplying both sides by \(\exp(\beta)\) and rearranging, we get

\[\begin{eqnarray} \exp(2\beta) &=& \frac{\sum_{y_i = G_m(x_i)}w_i^{(m)}}{\sum_{y_i \neq G_m(x_i)}w_i^{(m)}}\non\\ &=&\frac{1-\text{err}_m}{\text{err}_m},\non \end{eqnarray}\]

where \(\text{err}_m\) is the minimized weighted error rate

\[\begin{equation} \text{err}_m = \frac{\sum_{i=1}^Nw_i^{(m)}\bb{1}(y_i \neq G_m(x_i))}{\sum_{i=1}^N w_i^{(m)}}.\non \end{equation}\]
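As a quick illustration, \(\text{err}_m\) is just the weighted fraction of misclassified observations. A minimal Python sketch, with made-up weights, labels, and predictions (the names `w`, `y`, `y_pred` are not from the text), is:

```python
import numpy as np

# Illustrative only: w, y, and y_pred stand for the weights w_i^{(m)},
# the labels y_i in {-1, +1}, and the classifier outputs G_m(x_i).
w = np.array([0.1, 0.2, 0.3, 0.4])    # current observation weights w_i^{(m)}
y = np.array([1, -1, 1, -1])          # true labels y_i
y_pred = np.array([1, 1, 1, -1])      # predictions G_m(x_i)

# Weighted error rate err_m: weighted fraction of misclassified observations.
err_m = np.sum(w * (y != y_pred)) / np.sum(w)
print(err_m)  # 0.2
```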

Taking logarithms on both sides of the identity for \(\exp(2\beta)\) and dividing by two, we obtain (10.12):

\[\begin{equation} \beta_m = \frac{1}{2}\log \frac{1-\text{err}_m}{\text{err}_m}.\non \end{equation}\]
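As a sanity check, the following Python sketch (all data and names here are illustrative, not from the text) computes the closed-form \(\beta_m\) and confirms by a brute-force grid search that it minimizes the weighted exponential criterion (10.9):

```python
import numpy as np

# All names and data below are illustrative; labels y_i are in {-1, +1}.
rng = np.random.default_rng(0)
n = 200
w = rng.random(n)                              # current weights w_i^{(m)}
y = rng.choice([-1, 1], size=n)                # true labels y_i
y_pred = np.where(rng.random(n) < 0.7, y, -y)  # G_m agrees with y ~70% of the time

def weighted_exp_loss(beta):
    """Criterion (10.9): sum_i w_i^{(m)} * exp(-beta * y_i * G_m(x_i))."""
    return np.sum(w * np.exp(-beta * y * y_pred))

# Closed-form minimizer (10.12).
err_m = np.sum(w * (y != y_pred)) / np.sum(w)
beta_closed = 0.5 * np.log((1 - err_m) / err_m)

# Brute-force check: the loss over a fine grid of beta values should
# bottom out at (approximately) the closed-form value.
grid = np.linspace(-2.0, 2.0, 4001)
beta_grid = grid[np.argmin([weighted_exp_loss(b) for b in grid])]
print(beta_closed, beta_grid)  # the two values agree to about 3 decimals
```

Any other positive weights and predictions give the same agreement, provided \(\text{err}_m\) lies strictly between 0 and 1 (otherwise the minimizing \(\beta\) is unbounded).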