Ex. 14.10
Derive the solution to the affine-invariant average problem (14.60). Apply it to the three S’s, and compare the results to those computed in Exercise 14.9.
Soln. 14.10
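We need to solve the problem
\[
\min_{\{\bA_l\}_{l=1}^L,\ \bM}\ \sum_{l=1}^L \|\bX_l\bA_l - \bM\|_F^2,
\]
where the \(\bX_l\) are the \(N\times p\) shape matrices, the \(\bA_l\) are \(p\times p\) nonsingular matrices, and \(\bM^T\bM = \bb{I}\).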
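The Lagrangian for the problem is
\[
\mathcal{L}(\{\bA_l\}, \bM, \Lambda) = \sum_{l=1}^L \|\bX_l\bA_l - \bM\|_F^2 + \operatorname{tr}\left(\Lambda(\bM^T\bM - \bb{I})\right),
\]
where \(\Lambda\) is a symmetric \(p\times p\) matrix of Lagrange multipliers.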
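Taking the derivative w.r.t. \(\bA_l\) and setting it to zero, we have
\[
2\bX_l^T(\bX_l\bA_l - \bM) = \bb{0},
\]
thus \(\bA_l = (\bX_l^T\bX_l)^{-1}\bX_l^T\bM\). Denote \(\bb{H}_l = \bX_l(\bX_l^T\bX_l)^{-1}\bX_l^T\), the projection matrix onto the column space of \(\bX_l\). Plugging the solution for \(\bA_l\) into the original problem, it reduces to minimizing
\[
\sum_{l=1}^L \|(\bb{H}_l - \bb{I})\bM\|_F^2 = \sum_{l=1}^L \operatorname{tr}\left(\bM^T(\bb{I} - \bb{H}_l)\bM\right) = Lp - \operatorname{tr}\Big(\bM^T\Big(\sum_{l=1}^L \bb{H}_l\Big)\bM\Big),
\]
where we used that \(\bb{H}_l\) is symmetric and idempotent and that \(\operatorname{tr}(\bM^T\bM) = p\).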
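Writing \(\bar{\bb{H}} = \frac{1}{L}\sum_{l=1}^L \bb{H}_l\) for the average projection matrix, we see the problem reduces to
\[
\max_{\bM}\ \operatorname{tr}\left(\bM^T\bar{\bb{H}}\bM\right)
\]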
under the constraint \(\bM^T\bM = \bb{I}\). The optimum is attained when the columns of \(\bM\) form an orthonormal basis of the eigenspace associated with the \(p\) largest eigenvalues of \(\bar{\bb{H}}\). That is, \(\bM\) can be taken as the \(N\times p\) matrix whose columns are the eigenvectors of \(\bar{\bb{H}}\) corresponding to its \(p\) largest eigenvalues. This follows directly from the Courant-Fischer characterization of eigenvalues.
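The closed-form solution can be checked numerically. The following is a minimal NumPy sketch, not the computation asked for in the exercise: the three S shapes are not reproduced here, so it uses random \(N\times p\) matrices as stand-ins, and the function names `affine_invariant_average` and `objective` are hypothetical.

```python
import numpy as np

def affine_invariant_average(shapes):
    """Return (M, A_list) minimizing sum_l ||X_l A_l - M||_F^2 with M^T M = I."""
    N, p = shapes[0].shape
    # Projection matrix H_l = X_l (X_l^T X_l)^{-1} X_l^T for each shape.
    H_list = [X @ np.linalg.solve(X.T @ X, X.T) for X in shapes]
    H_bar = sum(H_list) / len(shapes)
    # eigh returns eigenvalues in ascending order, so the last p columns
    # are the eigenvectors for the p largest eigenvalues.
    _, eigvecs = np.linalg.eigh(H_bar)
    M = eigvecs[:, -p:]
    # Least-squares coefficients A_l = (X_l^T X_l)^{-1} X_l^T M.
    A_list = [np.linalg.solve(X.T @ X, X.T @ M) for X in shapes]
    return M, A_list

def objective(shapes, A_list, M):
    """Sum of squared Frobenius norms of X_l A_l - M."""
    return sum(np.linalg.norm(X @ A - M, "fro") ** 2
               for X, A in zip(shapes, A_list))

rng = np.random.default_rng(0)
N, p, L = 12, 2, 3
# Assumption: synthetic stand-ins for the three shapes, not the S data.
shapes = [rng.normal(size=(N, p)) for _ in range(L)]

M, A_list = affine_invariant_average(shapes)
best = objective(shapes, A_list, M)

# Baseline: a random orthonormal M (from QR) with its own optimal A_l.
Q, _ = np.linalg.qr(rng.normal(size=(N, p)))
A_rand = [np.linalg.solve(X.T @ X, X.T @ Q) for X in shapes]
other = objective(shapes, A_rand, Q)

print("objective at eigenvector solution:", best)
print("objective at random orthonormal M:", other)
```

On any such input the eigenvector solution should give an objective no larger than the random baseline, and its value should equal \(Lp - L\operatorname{tr}(\bM^T\bar{\bb{H}}\bM)\).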
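We give a short proof here. Consider the Lagrangian again, now for the reduced problem, \(\operatorname{tr}(\bM^T\bar{\bb{H}}\bM) - \operatorname{tr}(\bar{\bb{A}}(\bM^T\bM - \bb{I}))\). Taking the derivative w.r.t. \(\bM\) and setting it to zero, we get
\[
\bar{\bb{H}}\bM = \bM\bar{\bb{A}}
\]
for some symmetric \(\bar{\bb{A}}\). This is an invariant subspace equation: the column space of \(\bM\) is mapped into itself by \(\bar{\bb{H}}\), so it is spanned by \(p\) eigenvectors of the symmetric matrix \(\bar{\bb{H}}\). The equation fixes only this subspace. \(\bM\) may be any orthonormal basis of it, since replacing \(\bM\) by \(\bM\bb{Q}\) for a \(p\times p\) orthogonal \(\bb{Q}\) leaves the objective unchanged. On such a subspace the objective \(\operatorname{tr}(\bM^T\bar{\bb{H}}\bM) = \operatorname{tr}(\bar{\bb{A}})\) equals the sum of the corresponding \(p\) eigenvalues, so the choice of \(\bM\) described earlier, which picks the \(p\) largest, is optimal.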
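The invariant subspace equation and the freedom in choosing the basis can also be illustrated numerically. This is a hedged sketch on synthetic data (random matrices standing in for the shapes). It rotates the top-\(p\) eigenvector basis by a random orthogonal matrix and checks that the rotated basis still satisfies \(\bar{\bb{H}}\bM = \bM\bar{\bb{A}}\) with a symmetric \(\bar{\bb{A}}\), and that the trace objective is unchanged.

```python
import numpy as np

rng = np.random.default_rng(1)
N, p, L = 10, 2, 3
# Assumption: synthetic stand-ins for the shape matrices X_l.
shapes = [rng.normal(size=(N, p)) for _ in range(L)]

# Average projection matrix H_bar.
H_bar = sum(X @ np.linalg.solve(X.T @ X, X.T) for X in shapes) / L

# Eigenvectors for the p largest eigenvalues (eigh sorts ascending).
M = np.linalg.eigh(H_bar)[1][:, -p:]

# Rotate the basis inside the same subspace with an orthogonal p x p Q.
Q, _ = np.linalg.qr(rng.normal(size=(p, p)))
M_rot = M @ Q

# Candidate multiplier matrix A_bar = M^T H_bar M for the rotated basis.
A_bar = M_rot.T @ H_bar @ M_rot
residual = np.linalg.norm(H_bar @ M_rot - M_rot @ A_bar)
symmetric = np.allclose(A_bar, A_bar.T)
trace_gap = abs(np.trace(M.T @ H_bar @ M) - np.trace(A_bar))

print("residual of H_bar M = M A_bar:", residual)
print("A_bar symmetric:", symmetric)
print("change in objective after rotation:", trace_gap)
```

The residual and the change in objective should both be zero up to floating-point error, which matches the statement that only the subspace, not the particular basis, is determined.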