Ex. 3.15

Verify expression (3.64), and hence show that the partial least squares directions are a compromise between the ordinary regression coefficient and the principal component directions.

Soln. 3.15

Note that

$$\mathrm{Corr}^2(y, X\alpha)\,\mathrm{Var}(X\alpha) = \frac{\mathrm{Cov}^2(y, X\alpha)}{\mathrm{Var}(y)\,\mathrm{Var}(X\alpha)}\,\mathrm{Var}(X\alpha) = \frac{\mathrm{Cov}^2(y, X\alpha)}{\mathrm{Var}(y)}.$$
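This identity is straightforward to confirm with sample moments. The sketch below uses synthetic data of our own choosing (the variable names are illustrative, not from the text):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1000
X = rng.standard_normal((n, 3))
alpha = np.array([0.5, -1.0, 2.0])      # an arbitrary direction
y = rng.standard_normal(n)
z = X @ alpha                           # the derived variable X alpha

cov = np.cov(y, z)[0, 1]
var_y = np.var(y, ddof=1)
var_z = np.var(z, ddof=1)
corr2 = cov ** 2 / (var_y * var_z)

# Corr^2(y, Xa) * Var(Xa) = Cov^2(y, Xa) / Var(y)
assert np.isclose(corr2 * var_z, cov ** 2 / var_y)
```

Since $\mathrm{Var}(y)$ does not depend on $\alpha$, maximizing $\mathrm{Corr}^2 \cdot \mathrm{Var}$ over $\alpha$ is the same as maximizing $\mathrm{Cov}^2(y, X\alpha)$.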

We are essentially solving

$$\max_\alpha\ (y^T X\alpha)^2 \quad \text{s.t.}\quad \|\alpha\| = 1,\ \alpha^T S\hat\varphi_l = 0,\ l = 1, \dots, m-1,$$

where $S = X^TX$ is, up to a scale factor, the sample covariance matrix of the $x_j$.

We start with the case $m = 1$, which immediately gives what we call the first canonical covariance variable (see Ex. 3.20) with

$$\hat\alpha_1 = X^Ty / \|X^Ty\|_2.$$

Note that $\hat\alpha_1 \propto \hat\varphi_1$ in Algorithm 3.3 in the text.
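The fact that $\hat\alpha_1$ maximizes $(y^TX\alpha)^2$ over unit vectors is just Cauchy–Schwarz, and it can be sanity-checked numerically. A minimal sketch, assuming centered synthetic data (the setup and names are ours):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 4
X = rng.standard_normal((n, p))
X -= X.mean(axis=0)                     # centered, as assumed throughout
y = rng.standard_normal(n)
y -= y.mean()

alpha1 = X.T @ y
alpha1 /= np.linalg.norm(alpha1)        # alpha_1 = X^T y / ||X^T y||_2
best = (y @ X @ alpha1) ** 2

# by Cauchy-Schwarz, no other unit vector achieves a larger objective
for _ in range(1000):
    a = rng.standard_normal(p)
    a /= np.linalg.norm(a)
    assert (y @ X @ a) ** 2 <= best + 1e-8
```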

The second canonical covariance variable, namely $\hat\alpha_2$, has to maximize the same objective subject to the additional constraint $\hat\alpha_2^T S\hat\alpha_1 = 0$. It turns out that

$$\hat\alpha_2 \propto X^Ty - \left(\frac{y^TXSX^Ty}{y^TXS^2X^Ty}\right)SX^Ty. \tag{1}$$

To see that, we first verify that

$$\hat\alpha_2^T S\hat\alpha_1 \propto y^TXSX^Ty - \left(\frac{y^TXSX^Ty}{y^TXS^2X^Ty}\right)y^TXS^TSX^Ty = 0,$$

using $S^TS = S^2$ by the symmetry of $S$.

Second, for any $\alpha_2$ satisfying $\alpha_2^T S\hat\alpha_1 = 0$, that is, $\alpha_2^T SX^Ty = 0$, the objective to maximize is

$$\alpha_2^T X^Ty = \alpha_2^T\left(X^Ty - \left(\frac{y^TXSX^Ty}{y^TXS^2X^Ty}\right)SX^Ty\right).$$

By Cauchy–Schwarz, among unit vectors satisfying the constraint, this is maximized by the normalized vector on the right-hand side, which itself satisfies the constraint; therefore (1) holds. Note that $\hat\alpha_2 \propto \hat\varphi_2$ in Algorithm 3.3 in the text. Continuing in this way, we can derive $\hat\varphi_m$ for all $m \ge 1$.
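Both properties of (1) — that it satisfies the constraint and that no other feasible unit vector does better — can be verified numerically. A sketch on synthetic data (our own setup; `alpha2` implements the right-hand side of (1)):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 200, 4
X = rng.standard_normal((n, p))
X -= X.mean(axis=0)
y = rng.standard_normal(n)
y -= y.mean()
S = X.T @ X

s = X.T @ y                              # X^T y, i.e. the alpha_1 direction
c = (s @ S @ s) / (s @ S @ S @ s)        # the scalar coefficient in (1)
alpha2 = s - c * (S @ s)

# (1) satisfies the constraint: alpha_2^T S alpha_1 = 0
assert abs(alpha2 @ S @ s) < 1e-8 * abs(s @ S @ s)

# and no random unit vector obeying the constraint beats it
alpha2 /= np.linalg.norm(alpha2)
best = (y @ X @ alpha2) ** 2
v = S @ s                                # constraint reads a . v = 0
for _ in range(1000):
    a = rng.standard_normal(p)
    a -= (a @ v) / (v @ v) * v           # project onto the constraint set
    a /= np.linalg.norm(a)
    assert (y @ X @ a) ** 2 <= best + 1e-8
```

The optimality check works because `alpha2` is exactly the projection of $X^Ty$ onto the hyperplane $\{a : a^T S X^Ty = 0\}$, so every feasible $a$ has $(a^T X^Ty)^2 \le \|\text{proj}\|^2$.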

Now we are ready to show that the partial least squares (PLS) directions are a compromise between the ordinary least squares (OLS) coefficient and the principal component regression (PCR) directions. The regressors for OLS, PCR and PLS may be referred to as canonical correlation, canonical variance and canonical covariance variables, respectively. A generalized criterion that encompasses all three methods is

$$\max_\alpha\ (y^TX\alpha)^2\,(\alpha^TX^TX\alpha)^{\frac{r}{1-r}-1} \quad \text{s.t.}\quad \|\alpha\| = 1,\ \alpha^T S\hat\alpha_l = 0,\ l = 1, \dots, m-1,$$

where $r \in [0, 1)$. When $r = 0$ we recover OLS, as $r \to 1$ we approach PCR, and the case $r = 1/2$ gives PLS. This generalized regression is referred to as continuum regression; see the paper *Continuum Regression and Ridge Regression* for more details.
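The three special cases can be illustrated by brute force: for $p = 2$, maximize the criterion over a grid of unit directions and compare the winner against the known OLS, PLS and first-principal-component directions. This is a sketch on synthetic data; `argmax_direction` and the data-generating choices are our own, not from the text:

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 500, 2
# correlated predictors so that the OLS, PLS and PCR directions differ
X = rng.standard_normal((n, p)) @ np.array([[2.0, 0.6], [0.0, 1.0]])
X -= X.mean(axis=0)
y = X @ np.array([1.0, -1.0]) + rng.standard_normal(n)
y -= y.mean()
S = X.T @ X

def argmax_direction(r, n_grid=100_000):
    """Maximize (y^T X a)^2 (a^T S a)^{r/(1-r)-1} over unit vectors by grid search."""
    theta = np.linspace(0.0, np.pi, n_grid)           # directions up to sign
    A = np.stack([np.cos(theta), np.sin(theta)])      # (2, n_grid) unit vectors
    cov2 = (y @ X @ A) ** 2
    var = np.einsum('ij,ik,kj->j', A, S, A)           # a^T S a per column
    T = cov2 * (var / var.max()) ** (r / (1 - r) - 1)  # rescale to avoid overflow
    return A[:, np.argmax(T)]

def align(a, b):
    """|cos| of the angle between a and b (directions are sign-free)."""
    return abs(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

ols = np.linalg.solve(S, X.T @ y)        # r = 0: OLS coefficient
pls = X.T @ y                            # r = 1/2: first PLS direction
w, V = np.linalg.eigh(S)
pc1 = V[:, -1]                           # r -> 1: first principal component

assert align(argmax_direction(0.0), ols) > 0.999
assert align(argmax_direction(0.5), pls) > 0.999
assert align(argmax_direction(0.99), pc1) > 0.99
```

As $r$ sweeps from 0 toward 1, the maximizing direction rotates continuously from the OLS coefficient through the PLS direction to the leading eigenvector of $S$, which is precisely the advertised compromise.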