Ex. 3.25

Ex. 3.25

LAR look-ahead (Efron et al., 2004, Sec. 2 Least Angle Regression).} Starting at the beginning of the \(k\)th step of the LAR algorithm, derive expressions to identify the next variable to enter the active set at step \(k+1\), and the value of \(\alpha\) at which this occurs (using the notation around equation (3.55) on page 74).

Soln. 3.25

Suppose the next step of LAR algorithm update is

\[\begin{equation} \label{eq:3-25a} \hat{\textbf{f}}_k(\alpha) = \hat{\textbf{f}}_k + \alpha\cdot \textbf{u}_k. \end{equation}\]

Then, the current correlation with residual, for each \(\textbf{x}_j\), becomes

\[\begin{equation} \label{eq:3-25b} c_j(\alpha) = \textbf{x}_j^T(\textbf{y}-\hat{\textbf{f}}_k(\alpha)) = \textbf{x}_j^T\textbf{r}_k - \alpha\textbf{x}_j^T\textbf{u}_k. \end{equation}\]

By Ex. 3.23 and Ex. 3.24, \(c_j(\alpha)\) and \(\textbf{x}_j^T\textbf{u}_k\) are both the same for \(j\in \mathcal{A}_k\). Therefore, for \(j\in\mathcal{A}_k\), we have

\[\begin{equation} \label{eq:3-25c} |c_j(\alpha)| = \hat C - \alpha A, \end{equation}\]

showing that all of the maximal absolute current correlations decline equally where

\[\begin{eqnarray} \hat C &=& \max_{j}|\textbf{x}_j^T\textbf{r}_k|\non\\ A &=& \textbf{x}_j^T\textbf{u}_k.\non \end{eqnarray}\]

For \(j\notin \mathcal{A}_k\), equating \(\eqref{eq:3-25b}\) with \(\eqref{eq:3-25c}\) shows that the optimal value \(\hat\alpha\) is

\[\begin{equation} \hat\alpha = \frac{\hat C - \textbf{x}_j^T\textbf{r}_k}{A-\textbf{x}_j^T\textbf{u}_k}.\non \end{equation}\]

Likewise for \(-c_j(\alpha)\), the optimal values is

\[\begin{equation} \hat\alpha = \frac{\hat C + \textbf{x}_j^T\textbf{r}_k}{A + \textbf{x}_j^T\textbf{u}_k}.\non \end{equation}\]

Therefore, we conclude that the optimal value \(\hat\alpha\) can be written as

\[\begin{equation} \label{eq:3-25d} \hat\alpha = {\min_{j\in\mathcal{A}_k^c}}^+\Big\{\frac{\hat C - \textbf{x}_j^T\textbf{r}_k}{A-\textbf{x}_j^T\textbf{u}_k}, \frac{\hat C + \textbf{x}_j^T\textbf{r}_k}{A + \textbf{x}_j^T\textbf{u}_k}\Big\}, \end{equation}\]

where indicates that the minimum is taken over only positive components within each choice of \(j\in\mathcal{A}_k^c\). Note that \(\eqref{eq:3-25d}\) here is the same as the formula (2.13) given in Least Angle Regression.