Ex. 5.13

You have fitted a smoothing spline \(\hat f_\lambda\) to a sample of \(N\) pairs \((x_i, y_i)\). Suppose you augment your original sample with the pair \(x_0, \hat f_\lambda(x_0)\), and refit; describe the result. Use this to derive the \(N\)-fold cross-validation formula (5.26).

Soln. 5.13

Let \(S_\lambda\) denote the \(N\times N\) smoother matrix with \((i,j)\) entry \(S_\lambda(i,j)\), and let \(\hat f^{(-i)}_\lambda(x_i)\) denote the predicted value for the \(i\)-th case when the pair \((x_i, y_i)\) is left out of the fit. We claim that

\[\begin{equation} \label{eq:5-13leave} \hat f^{(-i)}_\lambda(x_i) = \frac{1}{1-S_\lambda(i,i)}\sum_{j\ne i} S_\lambda(i,j)y_j. \end{equation}\]

Multiplying both sides of \(\eqref{eq:5-13leave}\) by \(1-S_\lambda(i,i)\) and moving the term \(S_\lambda(i,i)\hat f^{(-i)}_\lambda(x_i)\) to the right-hand side gives

\[\begin{equation} \hat f^{(-i)}_\lambda(x_i) = \sum_{j\ne i} S_\lambda(i,j)y_j + S_\lambda(i,i)\hat f^{(-i)}_\lambda(x_i).\non \end{equation}\]

Recall that

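\[\begin{equation} \hat f_\lambda(x_i) = \sum_{j=1}^N S_\lambda(i,j)y_j,\non \end{equation}\]

so \(\sum_{j\ne i} S_\lambda(i,j)y_j = \hat f_\lambda(x_i) - S_\lambda(i,i)y_i\), and substituting this we have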

\[\begin{equation} \hat f^{(-i)}_\lambda(x_i) = \hat f_\lambda(x_i) + S_\lambda(i,i)\hat f^{(-i)}_\lambda(x_i) - S_\lambda(i,i)y_i,\non \end{equation}\]

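Solving for \(\hat f^{(-i)}_\lambda(x_i)\) and subtracting the result from \(y_i\), we obtain

\[\begin{equation} y_i - \hat f^{(-i)}_\lambda(x_i) = \frac{y_i-\hat f_\lambda(x_i)}{1-S_\lambda(i,i)}.\non \end{equation}\]

Squaring both sides and averaging over \(i = 1, \ldots, N\) gives the \(N\)-fold cross-validation formula (5.26):

\[\begin{equation} \mathrm{CV}(\hat f_\lambda) = \frac{1}{N}\sum_{i=1}^N\left(y_i - \hat f^{(-i)}_\lambda(x_i)\right)^2 = \frac{1}{N}\sum_{i=1}^N\left(\frac{y_i-\hat f_\lambda(x_i)}{1-S_\lambda(i,i)}\right)^2.\non \end{equation}\]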
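The leave-one-out identity above can be checked numerically. The sketch below does not fit a smoothing spline; as a stand-in it assumes a penalized least-squares smoother on a fixed polynomial basis, which is also a linear smoother of the form \(\hat{\mathbf f} = S_\lambda \mathbf y\). The basis, the penalty matrix `P`, the simulated data, and all function names are illustrative assumptions, not part of the original solution.

```python
import numpy as np

def smoother_matrix(B, lam, P):
    """Hat matrix S = B (B^T B + lam * P)^{-1} B^T of a penalized least-squares fit."""
    return B @ np.linalg.solve(B.T @ B + lam * P, B.T)

def loo_residuals_brute_force(B, y, lam, P):
    """Refit N times, dropping one observation each time; return y_i - f^{(-i)}(x_i)."""
    N = len(y)
    res = np.empty(N)
    for i in range(N):
        keep = np.arange(N) != i
        Bk, yk = B[keep], y[keep]
        beta = np.linalg.solve(Bk.T @ Bk + lam * P, Bk.T @ yk)
        res[i] = y[i] - B[i] @ beta
    return res

def loo_residuals_shortcut(S, y):
    """Single-fit shortcut: (y_i - f(x_i)) / (1 - S(i, i))."""
    return (y - S @ y) / (1.0 - np.diag(S))

# Simulated data (an assumption made only for this check)
rng = np.random.default_rng(0)
N, degree, lam = 30, 5, 0.1
x = np.sort(rng.uniform(0.0, 1.0, N))
y = np.sin(2 * np.pi * x) + 0.3 * rng.standard_normal(N)

B = np.vander(x, degree + 1, increasing=True)  # columns 1, x, ..., x^degree
P = np.diag([0.0] + [1.0] * degree)            # intercept left unpenalized

S = smoother_matrix(B, lam, P)
brute = loo_residuals_brute_force(B, y, lam, P)
short = loo_residuals_shortcut(S, y)

# The two cross-validation scores should agree up to rounding error
cv_brute = np.mean(brute ** 2)
cv_short = np.mean(short ** 2)
print(np.allclose(brute, short), cv_brute, cv_short)
```

The brute-force route solves \(N\) separate penalized problems, while the shortcut needs only the single full-data fit and the diagonal of \(S_\lambda\), which is the practical point of the formula.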

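It remains to prove \(\eqref{eq:5-13leave}\). Intuitively, any reasonable smoother preserves constants, that is, \(S_\lambda \bb{1} = \bb{1}\), so the rows of \(S_\lambda\) sum to one. If we use the same smoother with the \(i\)-th row and column deleted, we must re-normalize the remaining entries of each row to sum to one, which gives \(\eqref{eq:5-13leave}\). The augmentation in the exercise makes this precise: adding the pair \((x_0, \hat f_\lambda(x_0))\) to the sample and refitting returns the same \(\hat f_\lambda\), because \(\hat f_\lambda\) already minimizes the original penalized criterion and fits the new point with zero residual. Applying this to the leave-one-out fit with \(x_0 = x_i\) shows that \(\hat f^{(-i)}_\lambda\) is the fit to the full set of \(x\)-values with \(y_i\) replaced by \(\hat f^{(-i)}_\lambda(x_i)\), that is, \(\hat f^{(-i)}_\lambda(x_i) = \sum_{j\ne i} S_\lambda(i,j)y_j + S_\lambda(i,i)\hat f^{(-i)}_\lambda(x_i)\), which rearranges to \(\eqref{eq:5-13leave}\). For a rigorous proof, please see Ex. 7.3 (a).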
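As a numerical illustration of the argument above, the sketch below checks three things for one left-out index: the rows of the smoother matrix sum to one, the re-normalized row reproduces the leave-one-out prediction, and refitting on all \(N\) points with \(y_i\) replaced by that prediction (the augmentation described in the exercise) returns the same value at \(x_i\). As an assumption, it uses a penalized polynomial least-squares smoother with an unpenalized intercept in place of a smoothing spline; the data, the index `i`, and the variable names are hypothetical.

```python
import numpy as np

# Simulated data (an assumption made only for this check)
rng = np.random.default_rng(1)
N, degree, lam = 25, 4, 0.05
x = np.sort(rng.uniform(0.0, 1.0, N))
y = np.cos(3 * x) + 0.2 * rng.standard_normal(N)

B = np.vander(x, degree + 1, increasing=True)  # columns 1, x, ..., x^degree
P = np.diag([0.0] + [1.0] * degree)            # constants are not penalized

# Full-data smoother matrix S = B (B^T B + lam * P)^{-1} B^T
S = B @ np.linalg.solve(B.T @ B + lam * P, B.T)
row_sums = S.sum(axis=1)                       # constant preserving: each row sums to 1

# Leave out observation i and predict at x_i from the remaining N - 1 points
i = 7
keep = np.arange(N) != i
beta_loo = np.linalg.solve(B[keep].T @ B[keep] + lam * P, B[keep].T @ y[keep])
f_loo_i = B[i] @ beta_loo

# Re-normalized row i of S applied to the remaining responses
renorm_i = (S[i] @ y - S[i, i] * y[i]) / (1.0 - S[i, i])

# Augmentation: use the leave-one-out prediction as the response at x_i and refit
y_aug = y.copy()
y_aug[i] = f_loo_i
refit_i = S[i] @ y_aug

print(np.allclose(row_sums, 1.0), f_loo_i, renorm_i, refit_i)
```

The three printed values should agree up to rounding error, which is the content of the claimed identity for this particular smoother.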