Ex. 7.1

Derive the estimate of in-sample error (7.24).

Soln. 7.1

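Since the expected in-sample error satisfies \(\text{E}_y(\text{Err}_{\text{in}}) = \text{E}_y(\overline{\text{err}}) + \frac{2}{N}\sum_{i=1}^N\text{Cov}(\hat y_i, y_i)\), it suffices to show that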

\[\begin{equation} \sum_{i=1}^N\text{Cov}(\hat y_i, y_i) = d\sigma^2_\epsilon.\non \end{equation}\]

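For a linear fit with \(d\) inputs, \(\bb{X}\) is an \(N\times d\) matrix (assumed to have full column rank, so that \(\bb{X}^T\bb{X}\) is invertible) and \(\hat y = \bb{X}(\bb{X}^T\bb{X})^{-1}\bb{X}^Ty\). Treating \(\bb{X}\) as fixed and using \(\text{Cov}(y, y) = \sigma^2_\epsilon\bb{I}_N\), we get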

\[\begin{eqnarray} \text{Cov}(\hat y, y) &=& \text{Cov}(\bb{X}(\bb{X}^T\bb{X})^{-1}\bb{X}^Ty, y)\non\\ &=& \bb{X}(\bb{X}^T\bb{X})^{-1}\bb{X}^T\text{Cov}(y, y)\non\\ &=& \bb{X}(\bb{X}^T\bb{X})^{-1}\bb{X}^T\sigma^2_\epsilon.\non \end{eqnarray}\]

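Therefore, since \(\sum_{i=1}^N\text{Cov}(\hat y_i, y_i)\) is the trace of \(\text{Cov}(\hat y, y)\), the cyclic property of the trace operator gives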

\[\begin{eqnarray} \sum_{i=1}^N\text{Cov}(\hat y_i, y_i) &=& \text{trace}(\bb{X}(\bb{X}^T\bb{X})^{-1}\bb{X}^T)\sigma^2_\epsilon\non\\ &=&\text{trace}(\bb{X}^T\bb{X}(\bb{X}^T\bb{X})^{-1})\sigma^2_\epsilon\non\\ &=&\text{trace}(\bb{I}_d)\sigma^2_\epsilon\non\\ &=&d\sigma^2_\epsilon.\non \end{eqnarray}\]
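The trace identity can be checked numerically. The sketch below is only an illustration under stated assumptions: a simulated design with an intercept and one Gaussian feature (so \(d = 2\)), a hypothetical noise variance `sigma2`, and a hypothetical helper `hat_diagonal` that uses the closed-form inverse of the \(2\times 2\) matrix \(\bb{X}^T\bb{X}\). It sums the diagonal entries \(h_{ii} = x_i^T(\bb{X}^T\bb{X})^{-1}x_i\) of the hat matrix and compares the result with \(d\).

```python
import random


def hat_diagonal(X):
    """Diagonal entries h_ii = x_i^T (X^T X)^{-1} x_i for an N x 2 design matrix."""
    # Entries of the symmetric 2 x 2 Gram matrix X^T X = [[a, b], [b, c]].
    a = sum(r[0] * r[0] for r in X)
    b = sum(r[0] * r[1] for r in X)
    c = sum(r[1] * r[1] for r in X)
    det = a * c - b * b
    if abs(det) < 1e-12:
        raise ValueError("X^T X is singular; columns are linearly dependent")
    # Closed-form inverse of the 2 x 2 Gram matrix.
    inv = [[c / det, -b / det], [-b / det, a / det]]
    return [
        r[0] * (inv[0][0] * r[0] + inv[0][1] * r[1])
        + r[1] * (inv[1][0] * r[0] + inv[1][1] * r[1])
        for r in X
    ]


rng = random.Random(0)  # fixed seed so the simulated design is reproducible
N, d = 50, 2
X = [[1.0, rng.gauss(0.0, 1.0)] for _ in range(N)]  # intercept + one feature

trace_H = sum(hat_diagonal(X))  # trace of X (X^T X)^{-1} X^T
sigma2 = 1.5                    # hypothetical noise variance (an assumption)
cov_sum = trace_H * sigma2      # sum_i Cov(yhat_i, y_i) under the linear model

print(f"trace(H) = {trace_H:.6f}, d = {d}")
print(f"sum of covariances = {cov_sum:.6f}, d * sigma2 = {d * sigma2:.6f}")
```

Up to floating-point error, `trace_H` agrees with \(d\) regardless of \(N\) or of the particular feature values, which is what the derivation predicts.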