Ex. 12.10

Kernels and linear discriminant analysis. Suppose you wish to carry out a linear discriminant analysis (two classes) using a vector of transformations of the input variable \(h(x)\). Since \(h(x)\) is high-dimensional, you will use a regularized within-class covariance matrix \(\bb{W}_h + \gamma\bb{I}\). Show that the model can be estimated using only the inner products \(K(x_i, x_{i'})=\langle h(x_i), h(x_{i'}) \rangle\). Hence the kernel property of support vector machines is also shared by regularized linear discriminant analysis.

Soln. 12.10

This problem has two parts. First, with the kernel \(K\), LDA can be solved by the same argument used for the SVM in Section 12.3.3, or more generally in Section 5.8: the discriminant direction lies in the span of the \(h(x_i)\), so every quantity needed for estimation can be written in terms of the inner products \(K(x_i, x_{i'})\). See Ex. 18.13 for the analogous kernelization of logistic regression.
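A sketch of this representer-style argument follows; the symbols \(\bb{H}\), \(\bb{P}_G\), \(d\) and \(\alpha\) are notation introduced here, not from the text. Let \(\bb{H}\) be the \(N \times p\) matrix with rows \(h(x_i)^\top\), so that \(\bb{K} = \bb{H}\bb{H}^\top\) has entries \(K(x_i, x_{i'})\), and assume the pooled within-class covariance convention \(\bb{W}_h = \bb{H}^\top(\bb{I} - \bb{P}_G)\bb{H}/(N-2)\), where \(\bb{P}_G\) averages within each class.

```latex
% Sketch only; two classes of sizes N_1, N_2, pooled within-class
% covariance W_h = H^T (I - P_G) H / (N - 2).  \bb{} denotes bold, as in the text.
\begin{align*}
  (\bb{W}_h + \gamma\bb{I})\,\beta
    &= \bar{\mu}_1 - \bar{\mu}_2 = \bb{H}^\top d,
    \qquad d_i = \begin{cases} \phantom{-}1/N_1 & g_i = 1,\\ -1/N_2 & g_i = 2, \end{cases} \\
  \beta &= \tfrac{1}{\gamma}\bigl(\bb{H}^\top d - \bb{W}_h\beta\bigr)
    \;\Longrightarrow\; \beta \in \operatorname{span}\{h(x_1),\dots,h(x_N)\},
    \;\text{ so write } \beta = \bb{H}^\top\alpha, \\
  \text{and it suffices that }\;
    &\Bigl(\tfrac{1}{N-2}(\bb{I} - \bb{P}_G)\,\bb{K} + \gamma\,\bb{I}\Bigr)\alpha = d, \\
  \beta^\top h(x) &= \sum_{i=1}^N \alpha_i\,K(x_i, x),
    \qquad \beta^\top\bar{\mu}_k = \alpha^\top \bb{K}\, d_k .
\end{align*}
```

Thus the coefficient vector \(\alpha\), the projection of any new point, and the projections of the class means (hence the decision threshold) all depend on the data only through \(\bb{K}\), which is the claim.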

Second, regularized LDA (which may arise in high-dimensional problems, where \(\bb{W}_h\) is singular) is estimated by the same logic as ordinary LDA, with \(\bb{W}_h\) replaced by \(\bb{W}_h + \gamma\bb{I}\). See Ex. 12.6 and Ex. 12.7 for the comparison with the optimal scoring problem (whose solution is proportional to the LDA directions). In addition, see Ex. 18.10 for its connection to the maximal data piling direction. A small numerical sketch is given below.
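As a concrete illustration, here is a minimal numerical sketch of the derivation above, under the assumptions stated in the comments (two classes, equal priors, linear kernel, the pooled-covariance convention used above); the names `kernel_rda_fit` and `kernel_rda_predict` are ours, not from any library.

```python
import numpy as np

def kernel_rda_fit(K, y, gamma):
    """Two-class regularized LDA using only the kernel matrix K.

    Representer sketch: beta = sum_i alpha_i h(x_i), and alpha solves an
    N-dimensional linear system involving only K[i, i'] = <h(x_i), h(x_i')>.
    Assumes equal class priors and W_h = H^T (I - P_G) H / (N - 2).
    """
    N = len(y)
    classes = np.unique(y)
    assert len(classes) == 2
    # P_G replaces each observation by its class mean (block-constant matrix).
    P = np.zeros((N, N))
    for c in classes:
        idx = np.where(y == c)[0]
        P[np.ix_(idx, idx)] = 1.0 / len(idx)
    # d encodes the mean difference: mu_bar_1 - mu_bar_2 = H^T d.
    n1, n2 = np.sum(y == classes[0]), np.sum(y == classes[1])
    d = np.where(y == classes[0], 1.0 / n1, -1.0 / n2)
    # Solve ((I - P_G) K / (N - 2) + gamma I) alpha = d.
    A = (np.eye(N) - P) @ K / (N - 2) + gamma * np.eye(N)
    alpha = np.linalg.solve(A, d)
    # Projected class means beta^T mu_bar_k = alpha^T K d_k; midpoint threshold.
    m1 = alpha @ K @ ((y == classes[0]).astype(float) / n1)
    m2 = alpha @ K @ ((y == classes[1]).astype(float) / n2)
    return alpha, 0.5 * (m1 + m2), classes

def kernel_rda_predict(alpha, threshold, classes, K_new):
    """K_new[j, i] = K(x_new_j, x_i); classify by side of the midpoint."""
    scores = K_new @ alpha
    return np.where(scores > threshold, classes[0], classes[1])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0.0, 1.0, (30, 2)), rng.normal(2.0, 1.0, (30, 2))])
    y = np.array([0] * 30 + [1] * 30)
    K = X @ X.T                       # linear kernel, i.e. h(x) = x
    alpha, thr, cls = kernel_rda_fit(K, y, gamma=1e-2)
    preds = kernel_rda_predict(alpha, thr, cls, K)
    print("training accuracy:", np.mean(preds == y))
```

Since only the kernel matrix enters the computation, the same code applies unchanged to a nonlinear kernel such as a radial basis kernel, exactly as for the SVM.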