Ex. 7.2
For 0-1 loss with \(Y\in \{0, 1\}\) and \(\text{Pr}(Y=1|x_0) = f(x_0)\), show that
\[
\text{Err}(x_0) = \text{Pr}(Y\neq \hat G(x_0)|X=x_0) = \text{Err}_B(x_0) + |2f(x_0)-1|\,\text{Pr}(\hat G(x_0)\neq G(x_0)|X=x_0),
\]
where \(\hat G(x) = \bb{1}(\hat f(x) > 1/2)\), \(G(x) = \bb{1}(f(x) > 1/2)\) is the Bayes classifier, and \(\text{Err}_B(x_0) = \text{Pr}(Y\neq G(x_0)|X=x_0)\), the irreducible Bayes error at \(x_0\). Using the approximation \(\hat f(x_0)\sim N(E\hat f(x_0), \text{Var}(\hat f(x_0)))\), show that
\[
\text{Pr}(\hat G(x_0)\neq G(x_0)|X=x_0) \approx \Phi\left(\frac{\text{sign}\left(\frac{1}{2}-f(x_0)\right)\left(E\hat f(x_0) - \frac{1}{2}\right)}{\sqrt{\text{Var}(\hat f(x_0))}}\right).
\]
In the above,
\[
\Phi(t) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{t}\exp(-s^2/2)\,ds,
\]
the cumulative Gaussian distribution function. This is an increasing function, with value 0 at \(t=-\infty\) and value 1 at \(t=+\infty\).
We can think of \(\text{sign}(\frac{1}{2}-f(x_0))(E\hat f(x_0) - \frac{1}{2})\) as a kind of boundary-bias term, as it depends on the true \(f(x_0)\) only through which side of the boundary \((\frac{1}{2})\) it lies on. Notice also that the bias and variance combine in a multiplicative rather than additive fashion. If \(E\hat f(x_0)\) is on the same side of \((\frac{1}{2})\) as \(f(x_0)\), then the bias is negative, and decreasing the variance will decrease the misclassification error. On the other hand, if \(E\hat f(x_0)\) is on the opposite side of \((\frac{1}{2})\) to \(f(x_0)\), then the bias is positive and it pays to increase the variance! Such an increase will improve the chance that \(\hat f(x_0)\) falls on the correct side of \((\frac{1}{2})\) (see Friedman, "On bias, variance, 0/1-loss, and the curse-of-dimensionality").
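To make the multiplicative interplay concrete, here is a minimal Monte Carlo sketch of the identity and the Gaussian approximation above. It assumes \(\hat f(x_0)\) really is normally distributed; the particular values of \(f(x_0)\), \(E\hat f(x_0)\), and the standard deviation are arbitrary illustrative choices, not part of the exercise.

```python
# Minimal Monte Carlo sketch of the identity and the Gaussian approximation above.
# f0, mu and sd below are arbitrary illustrative values, not part of the exercise.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
f0 = 0.7                                     # true Pr(Y = 1 | x_0), so G(x_0) = 1
n = 1_000_000

for mu, sd in [(0.6, 0.05), (0.45, 0.05), (0.45, 0.20)]:
    fhat = rng.normal(mu, sd, size=n)        # simulated \hat f(x_0) ~ N(mu, sd^2)
    Ghat = (fhat > 0.5).astype(int)          # \hat G(x_0) = 1(\hat f(x_0) > 1/2)
    Y = rng.binomial(1, f0, size=n)          # independent test responses at x_0
    err = np.mean(Y != Ghat)                 # Monte Carlo estimate of Err(x_0)
    err_B = min(f0, 1 - f0)                  # Bayes error at x_0
    mismatch = np.mean(Ghat != 1)            # Pr(\hat G(x_0) != G(x_0) | x_0)
    rhs = err_B + abs(2 * f0 - 1) * mismatch # right-hand side of the identity
    phi = norm.cdf(np.sign(0.5 - f0) * (mu - 0.5) / sd)  # Gaussian approximation
    print(f"mu={mu}, sd={sd}: Err={err:.3f}, rhs={rhs:.3f}, "
          f"mismatch={mismatch:.3f}, Phi={phi:.3f}")
```

In the last two settings \(E\hat f(x_0)\) sits on the wrong side of \((\frac{1}{2})\), and raising the standard deviation from 0.05 to 0.20 drops the mismatch probability from roughly 0.84 to roughly 0.60, which is the "it pays to increase the variance" effect noted above.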
Soln. 7.2
First consider the case when \(f(x_0) \ge 1/2\), so that \(G(x_0) = 1\). Since \(Y\) and \(\hat G(x_0)\) (which depends only on the training data) are independent given \(X=x_0\), we have
\[
\begin{aligned}
\text{Err}(x_0) &= \text{Pr}(Y\neq \hat G(x_0)|X=x_0)\\
&= \text{Pr}(Y=1|x_0)\text{Pr}(\hat G(x_0)=0|x_0) + \text{Pr}(Y=0|x_0)\text{Pr}(\hat G(x_0)=1|x_0)\\
&= f(x_0)\text{Pr}(\hat G(x_0)=0|x_0) + (1-f(x_0))\left(1-\text{Pr}(\hat G(x_0)=0|x_0)\right)\\
&= (1-f(x_0)) + (2f(x_0)-1)\text{Pr}(\hat G(x_0)=0|x_0)\\
&= \text{Err}_B(x_0) + |2f(x_0)-1|\,\text{Pr}(\hat G(x_0)\neq G(x_0)|X=x_0),
\end{aligned}
\]
where the last equality holds because \(G(x_0)=1\) gives \(\text{Err}_B(x_0) = \text{Pr}(Y=0|x_0) = 1-f(x_0)\), \(\{\hat G(x_0)\neq G(x_0)\} = \{\hat G(x_0)=0\}\), and \(2f(x_0)-1 = |2f(x_0)-1|\ge 0\).
Similar arguments hold for the case when \(f(x_0) < 1/2\) and \(G(x_0) = 0\). Therefore, we have shown that
\[
\text{Err}(x_0) = \text{Err}_B(x_0) + |2f(x_0)-1|\,\text{Pr}(\hat G(x_0)\neq G(x_0)|X=x_0).
\]
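As a quick sanity check of this algebra, with arbitrary illustrative numbers (not from the text), take \(f(x_0) = 0.7\) and \(\text{Pr}(\hat G(x_0)=0|x_0) = 0.3\); the direct computation of \(\text{Err}(x_0)\) and the right-hand side of the identity both come out to 0.42:

```python
# Arbitrary illustrative numbers: f(x_0) = 0.7, Pr(\hat G(x_0) = 0 | x_0) = 0.3.
f0, p0 = 0.7, 0.3
err = f0 * p0 + (1 - f0) * (1 - p0)     # Err(x_0) computed directly
err_B = 1 - f0                          # Bayes error, since G(x_0) = 1
rhs = err_B + abs(2 * f0 - 1) * p0      # Err_B(x_0) + |2f(x_0)-1| Pr(mismatch)
print(err, rhs)                         # both ~= 0.42
```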
For the second part, again, we first consider the case when \(f(x_0) \ge 1/2\) (thus \(G(x_0) = 1\)). In that case \(\hat G(x_0)\neq G(x_0)\) exactly when \(\hat f(x_0) \le 1/2\), so under the normal approximation we have
\[
\begin{aligned}
\text{Pr}(\hat G(x_0)\neq G(x_0)|X=x_0) &= \text{Pr}(\hat f(x_0) \le 1/2)\\
&\approx \Phi\left(\frac{\frac{1}{2} - E\hat f(x_0)}{\sqrt{\text{Var}(\hat f(x_0))}}\right)\\
&= \Phi\left(\frac{\text{sign}\left(\frac{1}{2}-f(x_0)\right)\left(E\hat f(x_0) - \frac{1}{2}\right)}{\sqrt{\text{Var}(\hat f(x_0))}}\right),
\end{aligned}
\]
since \(\text{sign}(\frac{1}{2}-f(x_0)) = -1\) when \(f(x_0) > 1/2\).
Similar arguments hold for the case when \(f(x_0) < 1/2\) as well.
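Finally, a small numeric check, with arbitrary illustrative values, that the \(\text{sign}(\frac{1}{2}-f(x_0))\) factor lets a single \(\Phi\) expression cover both cases:

```python
# Illustrative check that one Phi expression handles f(x_0) > 1/2 and f(x_0) < 1/2.
import numpy as np
from scipy.stats import norm

for f0, mu, sd in [(0.7, 0.6, 0.1), (0.3, 0.4, 0.1)]:   # arbitrary illustrative values
    G = int(f0 > 0.5)
    # Under \hat f(x_0) ~ N(mu, sd^2), Pr(\hat G != G) is the mass on the wrong side of 1/2.
    direct = norm.cdf((0.5 - mu) / sd) if G == 1 else 1 - norm.cdf((0.5 - mu) / sd)
    formula = norm.cdf(np.sign(0.5 - f0) * (mu - 0.5) / sd)
    print(f"f(x_0)={f0}: direct={direct:.4f}, formula={formula:.4f}")   # values agree
```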