Ex. 6.11

Ex. 6.11

Show that for the Gaussian mixture model (6.32) the likelihood is maximized at \(+\infty\), and describe how.

Soln. 6.11

Recall that

\[\begin{equation} f(x) = \sum_{m=1}^M\alpha_m\phi(x;\mu_m,\bm{\Sigma}_m)\non \end{equation}\]

with \(\sum_{m=1}^M\alpha_m=1\) and

\[\begin{equation} \phi(x;\mu_m,\bm{\Sigma}_m) = (2\pi)^{-\frac{d}{2}}\text{det}(\bm{\Sigma})^{-\frac{1}{2}}e^{-\frac{1}{2}(x-\mu_m)^T\bm{\Sigma}_m^{-1}(x-\mu_m)}.\non \end{equation}\]

Suppose that \(M > 1\). Without loss of generality, assume that \(\bm{\Sigma}_m = \sigma_m\bb{I}\). Given \(N \ge 1\) observations, the density becomes

\[\begin{equation} \prod_{i=1}^Nf(x_i) \propto \prod_{i=1}^N\left(\sum_{m=1}^M\alpha_m \sigma_m^{-\frac{d}{2}}e^{-\frac{(x_i-\mu_m)^T(x_i-\mu_m)}{2\sigma_m^d}}\right). \non \end{equation}\]

If \(x_i = \mu_m\) for some \(1\le i\le N\) and \(1\le m \le M\), then it's density term reduces to

\[\begin{equation} \alpha_m \sigma_m^{-\frac{d}{2}}\non \end{equation}\]

and we can let \(\sigma_m\ra 0\) so that \(\prod_{i=1}^Nf(x_i)\ra\infty\).

The case when \(M=1\) (a single Gaussian model without mixture) worth some discussions. For simplicity let's consider the 1-dimensional case, that is, \(d=1\). Given \(N > 1\) observations, the density becomes proportional to

\[\begin{equation} \frac{1}{\sigma^N}e^{-\frac{\sum_{i=1}^N(x_i-\mu)^2}{2\sigma^2}}.\non \end{equation}\]

In this case, as long as \(x_i\neq \mu\) for at least one \(i\in \{1,...,N\}\), the density does not explode when \(\sigma\ra 0\) because the exponential term dominates the convergence rate.