Ex. 13.5

Ex. 13.5

Let \(\bb{B}_i, i=1,2,...,N\) be square \(p\times p\) positive semi-definite matrices and let \({\bar{\bb B}} = (1/N)\sum\bb{B}_i\). Write the eigen-decomposition of \({\bar{\bb B}}\) as \(\sum_{\ell=1}^p\theta_\ell e_\ell e_\ell^T\) with \(\theta_\ell\ge\theta_{\ell-1}\ge \cdots \ge\theta_1\). Show that the best rank-\(L\) approximation for the \(\bb{B}_i\),

\[\begin{equation} \min_{\text{rank}(M)=L}\sum_{i=1}^N\text{trace}[(\bb{B}_i-\bb{M})^2],\non \end{equation}\]

is given by \({\bar{\bb B}}_{[L]}=\sum_{\ell=1}^L\theta_\ell e_\ell e^T_\ell\). Hint: Write \(\sum_{i=1}^N\text{trace}[(\bb{B}_i-\bb{M})^2]\) as

\[\begin{equation} \sum_{i=1}^N\text{trace}[(\bb{B}_i-{\bar{\bb B}})^2] + \sum_{i=1}^N\text{trace}[(\bb{M}-{\bar{\bb B}})^2]).\non \end{equation}\]
Soln. 13.5

By properties of trace operator (see \cite{matbook}), we have

\[\begin{eqnarray} &&\sum_{i=1}^N\text{trace}[(\bb{B}_i - \bb{M})^2]\non\\ &=&\sum_{i=1}^N\text{trace}[(\bb{B}_i - \bar{\bB})^2 + (\bar\bB - \bb{M})^2] + \sum_{i=1}^N\text{trace}[(\bB_i-\bar\bB)(\bar\bB-\bb{M})]\non\\ &=&\sum_{i=1}^N\text{trace}[(\bb{B}_i - \bar{\bB})^2 + (\bar\bB - \bb{M})^2]\non\\ &&+\sum_{i=1}^N\text{trace}[\bar\bB_i(\bar\bB-\bb{M})] - N\cdot\text{trace}[\bar\bB(\bar\bB-\bb{M})]\non\\ &=&\sum_{i=1}^N\text{trace}[(\bb{B}_i - \bar{\bB})^2] +\sum_{i=1}^N\text{trace}[(\bar\bB - \bb{M})^2],\non \end{eqnarray}\]

since the last summand vanishes.

It suffices to show that \(\bar\bB_{[L]}\) solves

\[\begin{equation} \min_{\text{rank}(\bb{M})=L}\text{trace}[(\bb{M}-\bar\bB)^2].\non \end{equation}\]

Recall that the Frobenius norm of a matrix \(M\) is defined as

\[\begin{equation} \|M\|_F^2 = \text{trace}[M^TM].\non \end{equation}\]

We are essentially solving the low-rank matrix approximation problem under Forbenius norm (in our case, for matrix \(\bb{M}-\bar\bB\)). The desired solution \(\bar\bB_{[L]}=\sum_{l=1}^L\theta_le_le_l^T\) follows directly from Eckart–Young–Mirsky theorem. We stated below for reference.

For any matrix \(M\in \mathbb{R}^{m\times n}\) (with \(m\le n\)) with singular values \(\sigma_1\le \sigma_2 \le ... \le \sigma_m\),

\[\begin{equation} \min_{\text{rank}(M_k)=k}\|M-M_k\|_F^2 = \sum_{i=k+1}^m\sigma_i^2.\non \end{equation}\]