Ex. 7.1
Derive the estimate of in-sample error (7.24).
Soln. 7.1
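Since the expected in-sample error satisfies \(E_y(\text{Err}_{in}) = E_y(\overline{\text{err}}) + \frac{2}{N}\sum_{i=1}^N\text{Cov}(\hat y_i, y_i)\), it suffices to show that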
\[\begin{equation}
\sum_{i=1}^N\text{Cov}(\hat y_i, y_i) = d\sigma^2_\epsilon.\non
\end{equation}\]
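Note that for a linear fit with \(d\) inputs, we have \(\hat y = \bb{X}(\bb{X}^T\bb{X})^{-1}\bb{X}^Ty\), where \(\bb{X}\) is the \(N\times d\) design matrix, assumed to have full column rank so that \(\bb{X}^T\bb{X}\) is invertible. Under the additive-error model with uncorrelated errors of variance \(\sigma^2_\epsilon\), we have \(\text{Cov}(y, y) = \sigma^2_\epsilon\bb{I}_N\), so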
\[\begin{eqnarray}
\text{Cov}(\hat y, y) &=& \text{Cov}(\bb{X}(\bb{X}^T\bb{X})^{-1}\bb{X}^Ty, y)\non\\
&=& \bb{X}(\bb{X}^T\bb{X})^{-1}\bb{X}^T\text{Cov}(y, y)\non\\
&=& \bb{X}(\bb{X}^T\bb{X})^{-1}\bb{X}^T\sigma^2_\epsilon.\non
\end{eqnarray}\]
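The terms \(\text{Cov}(\hat y_i, y_i)\) are the diagonal entries of \(\text{Cov}(\hat y, y)\), so their sum is its trace. Therefore, by the cyclic property of the trace operator,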
\[\begin{eqnarray}
\sum_{i=1}^N\text{Cov}(\hat y_i, y_i) &=& \text{trace}(\bb{X}(\bb{X}^T\bb{X})^{-1}\bb{X}^T)\sigma^2_\epsilon\non\\
&=&\text{trace}(\bb{X}^T\bb{X}(\bb{X}^T\bb{X})^{-1})\sigma^2_\epsilon\non\\
&=&\text{trace}(\bb{I}_d)\sigma^2_\epsilon\non\\
&=&d\sigma^2_\epsilon.\non
\end{eqnarray}\]
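The identity can also be checked numerically. The following is a minimal sketch, not part of the original solution: it assumes NumPy is available, and the values of `N`, `d`, `sigma`, the random design matrix and the number of replications are illustrative choices only. It computes the trace of the hat matrix directly and estimates \(\sum_{i=1}^N\text{Cov}(\hat y_i, y_i)\) by Monte Carlo over repeated draws of \(y\) with \(\bb{X}\) held fixed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes and noise level (assumptions, not from the exercise).
N, d, sigma = 50, 4, 1.5

# Fixed N x d design matrix; a Gaussian draw has full column rank almost surely.
X = rng.normal(size=(N, d))

# Hat matrix H = X (X^T X)^{-1} X^T, computed via a linear solve.
H = X @ np.linalg.solve(X.T @ X, X.T)
trace_H = np.trace(H)  # should equal d up to rounding error

# Arbitrary fixed mean vector; the covariance does not depend on it.
f = X @ rng.normal(size=d)

# Draw many response vectors y = f + eps with Var(eps_i) = sigma^2.
reps = 20000
Y = f + sigma * rng.normal(size=(reps, N))

# Each row of Y_hat is the fitted vector H y (H is symmetric).
Y_hat = Y @ H.T

# Sample covariance of (y_hat_i, y_i) for each i, summed over i.
Yc = Y - Y.mean(axis=0)
Yhc = Y_hat - Y_hat.mean(axis=0)
cov_sum = np.sum(np.mean(Yhc * Yc, axis=0))

print("trace(H)         =", trace_H)
print("sum of Cov       =", cov_sum)
print("d * sigma^2      =", d * sigma**2)
```

Under these assumptions the trace should match \(d\) to machine precision, while the Monte Carlo estimate should agree with \(d\sigma^2_\epsilon\) only up to sampling noise.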