Ex. 6.10
Suppose we have \(N\) samples generated from the model \(y_i=f(x_i)+\epsilon_i\), with \(\epsilon_i\) independent and identically distributed with mean zero and variance \(\sigma^2\), and the \(x_i\) assumed fixed (nonrandom). We estimate \(f\) using a linear smoother (local regression, smoothing spline, etc.) with smoothing parameter \(\lambda\). Thus the vector of fitted values is given by \(\boldsymbol{\hat f} = \bm{S}_\lambda \bb{y}\). Consider the in-sample prediction error
\[
\text{PE}(\lambda) = E\,\frac{1}{N}\sum_{i=1}^N \left(y_i^\ast - \hat f_\lambda(x_i)\right)^2
\]
for predicting new responses at the \(N\) input values. Show that the average squared residual on the training data, ASR(\(\lambda\)), is a biased estimate (optimistic) for PE(\(\lambda\)), while
\[
C_\lambda = \text{ASR}(\lambda) + \frac{2\sigma^2}{N}\text{trace}(\bm{S}_\lambda)
\]
is unbiased.
Soln. 6.10
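The proof follows directly from Ex. 7.4 and Ex. 7.5 for a general linear smoother. Here \(\text{ASR}(\lambda) = \frac{1}{N}\sum_{i=1}^N (y_i - \hat f_\lambda(x_i))^2\) plays the role of the training error and PE(\(\lambda\)) the role of the expected in-sample error. Specifically, by Ex. 7.4, we know
\[
\text{PE}(\lambda) - E[\text{ASR}(\lambda)] = \frac{2}{N}\sum_{i=1}^N \text{Cov}(\hat y_i, y_i),
\]
and from Ex. 7.5 we have
\[
\sum_{i=1}^N \text{Cov}(\hat y_i, y_i) = \text{trace}(\bm{S}_\lambda)\sigma^2.
\]
Combining the two gives \(E[\text{ASR}(\lambda)] = \text{PE}(\lambda) - \frac{2\sigma^2}{N}\text{trace}(\bm{S}_\lambda)\), so ASR(\(\lambda\)) is optimistic (it underestimates PE(\(\lambda\)) on average) whenever \(\text{trace}(\bm{S}_\lambda) > 0\). Adding the correction term back, \(E[\text{ASR}(\lambda)] + \frac{2\sigma^2}{N}\text{trace}(\bm{S}_\lambda) = \text{PE}(\lambda)\), so \(C_\lambda = \text{ASR}(\lambda) + \frac{2\sigma^2}{N}\text{trace}(\bm{S}_\lambda)\) is unbiased.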
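The result can also be checked numerically. The following is a minimal Monte Carlo sketch, not part of the original solution. It assumes a ridge-regression smoother on a polynomial basis as the linear smoother, \(f(x)=\sin(2\pi x)\) as the true function, Gaussian noise, and a known \(\sigma\); the function names `ridge_smoother` and `simulate` and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def ridge_smoother(X, lam):
    """Smoother matrix S = X (X^T X + lam I)^{-1} X^T (assumed example of a linear smoother)."""
    p = X.shape[1]
    return X @ np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)


def simulate(n=50, lam=1.0, sigma=0.5, reps=20000, seed=0):
    """Monte Carlo estimates of E[ASR], E[C_lambda], and PE for a fixed design."""
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, n)            # fixed (nonrandom) inputs
    X = np.vander(x, 6, increasing=True)    # polynomial basis, degree 5
    f = np.sin(2 * np.pi * x)               # assumed true regression function
    S = ridge_smoother(X, lam)

    # Each row is one training sample y and one independent new sample y*.
    y = f + sigma * rng.standard_normal((reps, n))
    y_new = f + sigma * rng.standard_normal((reps, n))

    fit = y @ S.T                           # row r holds S @ y[r]
    asr = np.mean((y - fit) ** 2, axis=1)   # average squared residual per replicate
    c = asr + 2 * sigma**2 * np.trace(S) / n  # C_lambda, using the known sigma
    pe = np.mean((y_new - fit) ** 2)        # in-sample prediction error estimate
    return asr.mean(), c.mean(), pe


if __name__ == "__main__":
    mean_asr, mean_c, pe = simulate()
    print(f"E[ASR] ~ {mean_asr:.4f}, E[C] ~ {mean_c:.4f}, PE ~ {pe:.4f}")
```

Under these assumptions the averaged ASR should fall below the estimated PE by roughly \(2\sigma^2\,\text{trace}(\bm{S}_\lambda)/N\), while the averaged \(C_\lambda\) should agree with PE up to Monte Carlo error.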