Ex. 12.6
Suppose that the regression procedure used in FDA (Section 12.5.1) is a linear expansion of basis functions \(h_m(x), m=1,...,M\). Let \(\bb{D}_\pi = \bb{Y}^T\bb{Y}/N\) be the diagonal matrix of class proportions.
(a) Show that the optimal scoring problem (12.52) can be written in vector notation as
\[\min_{\theta,\beta}\ \|\bb{Y}\theta - \bb{H}\beta\|^2,\tag{12.65}\]
where \(\theta\) is a vector of \(K\) real numbers, and \(\bb{H}\) is the \(N\times M\) matrix of evaluations \(h_j(x_i)\).
(b) Suppose that the normalization on \(\theta\) is \(\theta^T\bb{D}_\pi 1 = 0\) and \(\theta^T\bb{D}_\pi\theta=1\). Interpret these normalizations in terms of the original scores \(\theta(g_i)\).
(c) Show that, with this normalization, (12.65) can be partially optimized w.r.t. \(\beta\), and leads to
\[\max_\theta\ \theta^T\bb{Y}^T\bb{S}\bb{Y}\theta,\]
subject to the normalization constraints, where \(\bb{S}\) is the projection operator corresponding to the basis matrix \(\bb{H}\).
(d) Suppose that the \(h_j\) include the constant function. Show that the largest eigenvalue of \(\bb{S}\) is 1.
(e) Let \(\Theta\) be a \(K\times K\) matrix of scores (in columns), and suppose the normalization is \(\Theta^T\bb{D}_\pi\Theta=\bb{I}\). Show that the solution to (12.53) is given by the complete set of eigenvectors of \(\bb{S}\); the first eigenvector is trivial, and takes care of the centering of the scores. The remainder characterize the optimal scoring solution.
Soln. 12.6
(a) Let \(\bb{Y}\) be the \(N\times K\) indicator response matrix, each row encoding the class of one observation among the \(K\) classes; see Section 12.5.1.
Let \(h(x_i) = (h_1(x_i),...,h_M(x_i))^T\) for \(i=1,...,N\). Then we have
\[\bb{H}\beta = \begin{pmatrix} h(x_1)^T\beta \\ \vdots \\ h(x_N)^T\beta \end{pmatrix} \quad\text{and}\quad \bb{Y}\theta = \begin{pmatrix} \theta_{j_1} \\ \vdots \\ \theta_{j_N} \end{pmatrix},\]
where \(j_1, j_2,...,j_N\) are determined by \(\bb{Y}\)'s encoding, i.e., \(j_i\) is the class index of observation \(i\). Then we only need to choose \(\theta\) such that \(\theta_{j_i} = \theta(g_i)\) for \(i=1,...,N\), so that
\[\|\bb{Y}\theta - \bb{H}\beta\|^2 = \sum_{i=1}^N \big(\theta(g_i) - h(x_i)^T\beta\big)^2,\]
which is \(N\) times the criterion in (12.52); the factor \(N\) does not affect the minimizer. The rest of the proof follows directly.
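As a sanity check, the identity above can be verified numerically. The following is a minimal sketch on synthetic data; all names (`N`, `K`, `M`, `g`, etc.) are illustrative and not from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
N, K, M = 8, 3, 4
g = rng.integers(0, K, size=N)      # class labels g_i in {0, ..., K-1}
Y = np.eye(K)[g]                    # N x K indicator response matrix
H = rng.normal(size=(N, M))         # N x M basis evaluations h_j(x_i)
theta = rng.normal(size=K)          # one score per class
beta = rng.normal(size=M)           # basis coefficients

# ||Y theta - H beta||^2 equals the sum of squared score residuals
lhs = np.sum((Y @ theta - H @ beta) ** 2)
rhs = sum((theta[g[i]] - H[i] @ beta) ** 2 for i in range(N))
assert np.isclose(lhs, rhs)
```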
(b) Since
\[\bb{D}_\pi = \bb{Y}^T\bb{Y}/N = \mathrm{diag}(N_1/N,...,N_K/N),\]
where \(N_k\) is the number of observations in class \(k\), we have
\[\theta^T\bb{D}_\pi 1 = \sum_{k=1}^K\frac{N_k}{N}\theta_k = \frac{1}{N}\sum_{i=1}^N\theta(g_i) = 0,\]
that is, the scores \(\theta(g_i)\) have zero mean over the training sample. Similarly we know \(\theta^T\bb{D}_\pi\theta=1\) implies that
\[\theta^T\bb{D}_\pi\theta = \sum_{k=1}^K\frac{N_k}{N}\theta_k^2 = \frac{1}{N}\sum_{i=1}^N\theta(g_i)^2 = 1,\]
that is, the scores have unit mean square, hence unit variance given the zero-mean constraint.
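These two identities are easy to confirm numerically; a quick sketch under the same kind of synthetic setup (illustrative names only):

```python
import numpy as np

rng = np.random.default_rng(1)
N, K = 10, 3
g = rng.integers(0, K, size=N)      # class labels
Y = np.eye(K)[g]                    # indicator response matrix
D_pi = Y.T @ Y / N                  # diagonal matrix of class proportions
theta = rng.normal(size=K)
scores = theta[g]                   # theta(g_i) for each observation

# theta^T D_pi 1 is the sample mean of the scores ...
assert np.isclose(theta @ D_pi @ np.ones(K), scores.mean())
# ... and theta^T D_pi theta is their mean square
assert np.isclose(theta @ D_pi @ theta, np.mean(scores ** 2))
```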
(c) With \(\theta\) fixed, the minimizing \(\beta\) for the optimal scoring problem is the least squares estimate
\[\hat\beta = (\bb{H}^T\bb{H})^{-1}\bb{H}^T\bb{Y}\theta.\]
Then the minimization objective becomes, since \(\bb{I}-\bb{S}\) is symmetric and idempotent and the constraint \(\theta^T\bb{D}_\pi\theta=1\) gives \(\theta^T\bb{Y}^T\bb{Y}\theta = N\theta^T\bb{D}_\pi\theta = N\),
\[\|\bb{Y}\theta - \bb{H}\hat\beta\|^2 = \|(\bb{I}-\bb{S})\bb{Y}\theta\|^2 = \theta^T\bb{Y}^T(\bb{I}-\bb{S})\bb{Y}\theta = N - \theta^T\bb{Y}^T\bb{S}\bb{Y}\theta,\]
which is equivalent to solving
\[\max_\theta\ \theta^T\bb{Y}^T\bb{S}\bb{Y}\theta \quad\text{subject to } \theta^T\bb{D}_\pi 1 = 0,\ \theta^T\bb{D}_\pi\theta = 1,\]
where
\[\bb{S} = \bb{H}(\bb{H}^T\bb{H})^{-1}\bb{H}^T\]
is the projection operator (see, e.g., p. 153 in the text) corresponding to the basis matrix \(\bb{H}\).
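The partial optimization can likewise be checked numerically. A minimal sketch, assuming synthetic data and a \(\theta\) rescaled to satisfy \(\theta^T\bb{D}_\pi\theta=1\) (names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
N, K, M = 12, 3, 4
g = rng.integers(0, K, size=N)
Y = np.eye(K)[g]
H = rng.normal(size=(N, M))
D_pi = Y.T @ Y / N

theta = rng.normal(size=K)
theta /= np.sqrt(theta @ D_pi @ theta)   # enforce theta^T D_pi theta = 1

S = H @ np.linalg.solve(H.T @ H, H.T)    # projection onto col(H)
beta_hat = np.linalg.solve(H.T @ H, H.T @ Y @ theta)  # least squares fit

# objective at beta_hat equals N - theta^T Y^T S Y theta
obj = np.sum((Y @ theta - H @ beta_hat) ** 2)
assert np.isclose(obj, N - theta @ Y.T @ S @ Y @ theta)
```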
(d) Note that \(\bb{S}\) is idempotent, i.e., \(\bb{S}\bb{S} = \bb{S}\), so every eigenvalue \(\lambda\) of \(\bb{S}\) satisfies \(\lambda^2=\lambda\), i.e., \(\lambda\in\{0,1\}\). Since the \(h_j\) include the constant function, the vector \(1\) lies in the column space of \(\bb{H}\), so \(\bb{S}1 = 1\) and the eigenvalue \(1\) is attained. Hence the largest eigenvalue of \(\bb{S}\) is 1.
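A short numerical illustration of (d): with the constant function in the basis, \(\bb{S}\) has eigenvalues in \(\{0,1\}\) and \(\bb{S}1=1\). Synthetic data; names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
N, M = 10, 4
# basis evaluations with the constant function as the first column
H = np.column_stack([np.ones(N), rng.normal(size=(N, M - 1))])
S = H @ np.linalg.solve(H.T @ H, H.T)   # projection onto col(H)

lam = np.linalg.eigvalsh(S)             # eigenvalues in ascending order
assert np.allclose(lam[:N - M], 0) and np.allclose(lam[N - M:], 1)
assert np.allclose(S @ np.ones(N), np.ones(N))  # S 1 = 1: eigenvalue 1 attained
```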
(e) From (c) we know that the problem formulated in (12.53) can be written as the generalized eigenvalue problem
\[\frac{1}{N}\,\bb{Y}^T\bb{S}\bb{Y}\,\theta_k = \lambda_k\bb{D}_\pi\theta_k,\quad k=1,...,K.\]
By classical results for generalized eigenvalue problems (see, e.g., Ghojogh et al., Eigenvalue and Generalized Eigenvalue Problems: Tutorial), the solution to (12.53) is given by the complete set of these eigenvectors, collected as the columns of \(\Theta\) and normalized so that \(\Theta^T\bb{D}_\pi\Theta=\bb{I}\). The first eigenvector is trivial: by (d), \(\bb{S}1=1\), and since \(\bb{Y}1=1\), the constant score \(\theta_1\propto 1\) attains the largest eigenvalue \(\lambda_1=1\). The remaining eigenvectors are \(\bb{D}_\pi\)-orthogonal to \(\theta_1\), i.e., \(\theta_k^T\bb{D}_\pi 1=0\) for \(k\ge 2\); this takes care of the centering of the scores, and they characterize the optimal scoring solution.
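Finally, a sketch of the eigenproblem in (e), assuming `scipy` is available (`scipy.linalg.eigh` solves the symmetric-definite generalized problem and returns \(\bb{D}_\pi\)-orthonormal eigenvectors); the labels are chosen so every class appears, and all names are illustrative.

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(4)
N, K, M = 12, 3, 4
g = np.arange(N) % K                     # labels; every class appears
Y = np.eye(K)[g]
H = np.column_stack([np.ones(N), rng.normal(size=(N, M - 1))])  # constant included
D_pi = Y.T @ Y / N
S = H @ np.linalg.solve(H.T @ H, H.T)

# solve (1/N) Y^T S Y theta = lambda D_pi theta; eigenvalues ascend
lam, Theta = eigh(Y.T @ S @ Y / N, D_pi)
assert np.isclose(lam[-1], 1.0)          # trivial top eigenvalue is 1
top = Theta[:, -1]
assert np.allclose(top, top[0])          # ...with a constant score vector
assert np.allclose(Theta.T @ D_pi @ Theta, np.eye(K))  # Theta^T D_pi Theta = I
```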