Ex. 12.11
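The MDA procedure models each class as a mixture of Gaussians. Hence each mixture center belongs to one and only one class. A more general model allows each mixture center to be shared by all classes. We take the joint density of labels and features to be
$$P(G, X) = \sum_{r=1}^{R} \pi_r P_r(G, X),$$
a mixture of joint densities. Furthermore we assume
$$P_r(G, X) = P_r(G)\,\phi(X; \mu_r, \Sigma).$$
This model consists of regions centered at $\mu_r$, and for each there is a class profile $P_r(G)$. The posterior class distribution is given by
$$P(G = k \mid X = x) = \frac{\sum_{r=1}^{R} \pi_r P_r(G = k)\,\phi(x; \mu_r, \Sigma)}{\sum_{r=1}^{R} \pi_r\,\phi(x; \mu_r, \Sigma)},$$
where the denominator is the marginal distribution $P(X)$.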
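(a) Show that this model (called MDA2) can be viewed as a generalization of MDA since
$$P(X = x \mid G = k) = \frac{\sum_{r=1}^{R} \pi_r P_r(G = k)\,\phi(x; \mu_r, \Sigma)}{\sum_{r=1}^{R} \pi_r P_r(G = k)},$$
where $\pi_{rk} = \pi_r P_r(G = k) / \sum_{r=1}^{R} \pi_r P_r(G = k)$ corresponds to the mixing proportions for the $k$th class.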
(b) Derive the EM algorithm for MDA2.
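(c) Show that if the initial weight matrix is constructed as in MDA, involving separate $k$-means clustering in each class, then the algorithm for MDA2 is identical to the original MDA procedure.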
Soln. 12.11
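(a) We have
$$P(X = x \mid G = k) = \frac{P(G = k, X = x)}{P(G = k)} = \frac{\sum_{r=1}^{R} \pi_r P_r(G = k)\,\phi(x; \mu_r, \Sigma)}{\sum_{r=1}^{R} \pi_r P_r(G = k)} = \sum_{r=1}^{R} \pi_{rk}\,\phi(x; \mu_r, \Sigma),$$
where
$$\pi_{rk} = \frac{\pi_r P_r(G = k)}{\sum_{l=1}^{R} \pi_l P_l(G = k)}, \qquad \sum_{r=1}^{R} \pi_{rk} = 1.$$
The denominator $P(G = k) = \sum_{r} \pi_r P_r(G = k)$ follows by integrating the joint density over $x$, since each $\phi(x; \mu_r, \Sigma)$ integrates to one.
Intuitively, we can think that, in MDA, there are a total of $R = \sum_{k=1}^{K} R_k$ mixture centers, each owned by a single class: $P_r(G = k) = 1$ if center $r$ belongs to class $k$ and $0$ otherwise. Then $\pi_{rk}$ vanishes outside the $R_k$ centers of class $k$, and the expression above reduces to the MDA class density, a mixture of $R_k$ Gaussians.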
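The identity in (a) can be checked numerically. The following is a minimal sketch, assuming one-dimensional features (so $\Sigma$ is a scalar variance) and made-up parameter values; the names `phi`, `class_mixing`, `class_density_direct` and `class_density_mixture` are illustrative only.

```python
import math

def phi(x, mu, var):
    """Univariate Gaussian density with mean mu and variance var."""
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2 * math.pi * var)

# Hypothetical MDA2 parameters: R = 3 shared centers, K = 2 classes.
pi = [0.5, 0.3, 0.2]                              # mixing proportions pi_r
mu = [-2.0, 0.0, 3.0]                             # centers mu_r
var = 1.0                                         # common variance (Sigma in 1-D)
profile = [[0.9, 0.1], [0.4, 0.6], [0.0, 1.0]]    # profile[r][k] = P_r(G = k)
R = len(pi)

def class_mixing(k):
    """pi_rk = pi_r P_r(G=k) / sum_l pi_l P_l(G=k)."""
    denom = sum(pi[r] * profile[r][k] for r in range(R))
    return [pi[r] * profile[r][k] / denom for r in range(R)]

def class_density_direct(x, k):
    """P(x | G=k) computed as P(G=k, x) / P(G=k)."""
    joint = sum(pi[r] * profile[r][k] * phi(x, mu[r], var) for r in range(R))
    prior = sum(pi[r] * profile[r][k] for r in range(R))
    return joint / prior

def class_density_mixture(x, k):
    """P(x | G=k) computed as the mixture sum_r pi_rk phi(x; mu_r, var)."""
    w = class_mixing(k)
    return sum(w[r] * phi(x, mu[r], var) for r in range(R))

for k in range(2):
    for x in (-1.5, 0.3, 2.0):
        print(k, x, class_density_direct(x, k), class_density_mixture(x, k))
```

The third center has $P_3(G = 1) = 0$ for the first class, so its mixing proportion for that class is zero: this is the MDA situation in which a center does not contribute to a class it does not belong to.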
(b) The EM algorithm is similar to that of MDA. Specifically, we have
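E-step: Given current parameters ($\pi_r$, $P_r(G)$, $\mu_r$, $\Sigma$), compute for each observation $(x_i, g_i)$ the responsibility of each center $c_r$:
$$W(c_r \mid x_i, g_i) = \frac{\pi_r P_r(G = g_i)\,\phi(x_i; \mu_r, \Sigma)}{\sum_{l=1}^{R} \pi_l P_l(G = g_i)\,\phi(x_i; \mu_l, \Sigma)}.$$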
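M-step: Compute the weighted MLEs for the parameters of each of the component Gaussians, using the weights from the E-step. Because the centers are shared, the sums run over all observations rather than within each class. Writing $W_{ir} = W(c_r \mid x_i, g_i)$,
$$\pi_r = \frac{1}{N}\sum_{i=1}^{N} W_{ir}, \qquad P_r(G = k) = \frac{\sum_{g_i = k} W_{ir}}{\sum_{i=1}^{N} W_{ir}}, \qquad \mu_r = \frac{\sum_{i=1}^{N} W_{ir}\,x_i}{\sum_{i=1}^{N} W_{ir}}, \qquad \Sigma = \frac{1}{N}\sum_{i=1}^{N}\sum_{r=1}^{R} W_{ir}\,(x_i - \mu_r)(x_i - \mu_r)^T.$$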
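The two steps can be sketched in code. This is an illustrative implementation under simplifying assumptions, not a reference one: features are one-dimensional (so $\Sigma$ is a scalar `var`), the data and starting centers are made up, and all function names are hypothetical. No safeguards against empty centers or numerical underflow are included.

```python
import math

def phi(x, mu, var):
    """Univariate Gaussian density with mean mu and variance var."""
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2 * math.pi * var)

def e_step(xs, gs, pi, profile, mu, var):
    """W[i][r] is proportional to pi_r * P_r(G = g_i) * phi(x_i; mu_r, var)."""
    W = []
    for x, g in zip(xs, gs):
        row = [pi[r] * profile[r][g] * phi(x, mu[r], var) for r in range(len(pi))]
        total = sum(row)
        W.append([v / total for v in row])
    return W

def m_step(xs, gs, W, K):
    """Weighted MLEs; sums run over all observations because centers are shared."""
    N, R = len(xs), len(W[0])
    mass = [sum(W[i][r] for i in range(N)) for r in range(R)]
    pi = [m / N for m in mass]
    profile = [[sum(W[i][r] for i in range(N) if gs[i] == k) / mass[r]
                for k in range(K)] for r in range(R)]
    mu = [sum(W[i][r] * xs[i] for i in range(N)) / mass[r] for r in range(R)]
    var = sum(W[i][r] * (xs[i] - mu[r]) ** 2
              for i in range(N) for r in range(R)) / N
    return pi, profile, mu, var

def log_likelihood(xs, gs, pi, profile, mu, var):
    """Log of the joint density P(G = g_i, X = x_i), summed over observations."""
    return sum(math.log(sum(pi[r] * profile[r][g] * phi(x, mu[r], var)
                            for r in range(len(pi))))
               for x, g in zip(xs, gs))

def fit_mda2(xs, gs, K, mu_init, n_iter=50):
    """Run EM from uniform pi_r and uniform class profiles (an arbitrary choice)."""
    R = len(mu_init)
    pi = [1.0 / R] * R
    profile = [[1.0 / K] * K for _ in range(R)]
    mu, var = list(mu_init), 1.0
    history = []
    for _ in range(n_iter):
        W = e_step(xs, gs, pi, profile, mu, var)
        pi, profile, mu, var = m_step(xs, gs, W, K)
        history.append(log_likelihood(xs, gs, pi, profile, mu, var))
    return pi, profile, mu, var, history

# Made-up data: three clusters, the middle one shared by both classes.
xs = [-2.1, -1.9, -2.0, 0.1, -0.2, 0.0, 3.1, 2.9, 3.0]
gs = [0, 0, 0, 0, 1, 1, 1, 1, 1]
pi, profile, mu, var, history = fit_mda2(xs, gs, K=2, mu_init=[-1.0, 0.5, 2.0])
print("centers:", mu)
print("class profiles:", profile)
```

Since this is an exact EM iteration on the joint likelihood of $(g_i, x_i)$, the values in `history` should be non-decreasing, which is a convenient sanity check on the updates.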
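(c) For MDA, the initial weight matrix is built from a separate $k$-means clustering in each class, so $W(c_r \mid x_i, g_i) = 0$ whenever center $c_r$ was not fitted to class $g_i$. The first M-step of MDA2 then gives $P_r(G = k) = 1$ if $c_r$ belongs to class $k$ and $0$ otherwise. In the next E-step the responsibility of $c_r$ for $(x_i, g_i)$ is proportional to $\pi_r P_r(G = g_i)\,\phi(x_i; \mu_r, \Sigma)$, which is again zero for centers outside class $g_i$, so the block structure of the weight matrix is preserved in every iteration.
Note that from (a), MDA2 is a generalization of MDA. Once the class profiles $P_r(G)$ are indicators, the mixing proportions reduce to $\pi_{rk} = \pi_r / \sum_{l \in \text{class } k} \pi_l$ on the centers of class $k$, the E-step weights coincide with those of MDA, and the M-step updates for $\mu_r$ and $\Sigma$ are the same weighted MLEs. Hence the MDA2 iterations are identical to those of the original MDA procedure.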
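The preservation of the zero pattern in (c) can be illustrated with a short sketch. It assumes one-dimensional features, made-up data, and a hand-written hard initial assignment standing in for the per-class $k$-means step (no clustering is actually run); `owner`, `m_step` and `e_step` are hypothetical names.

```python
import math

def phi(x, mu, var):
    """Univariate Gaussian density with mean mu and variance var."""
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2 * math.pi * var)

xs = [-2.0, -1.0, 0.5, 1.5, 2.5, 4.0]
gs = [0, 0, 0, 1, 1, 1]
owner = [0, 0, 1, 1]          # owner[r] = class whose clustering produced center r
K, R, N = 2, len(owner), len(xs)

# MDA-style initial weights: each point is assigned to a center of its own class.
W = [[1.0, 0.0, 0.0, 0.0],
     [1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0, 1.0]]

def m_step(W):
    """MDA2 weighted MLEs with sums over all observations."""
    mass = [sum(W[i][r] for i in range(N)) for r in range(R)]
    pi = [m / N for m in mass]
    profile = [[sum(W[i][r] for i in range(N) if gs[i] == k) / mass[r]
                for k in range(K)] for r in range(R)]
    mu = [sum(W[i][r] * xs[i] for i in range(N)) / mass[r] for r in range(R)]
    var = sum(W[i][r] * (xs[i] - mu[r]) ** 2
              for i in range(N) for r in range(R)) / N
    return pi, profile, mu, var

def e_step(pi, profile, mu, var):
    """MDA2 responsibilities, proportional to pi_r P_r(g_i) phi(x_i; mu_r, var)."""
    W = []
    for x, g in zip(xs, gs):
        row = [pi[r] * profile[r][g] * phi(x, mu[r], var) for r in range(R)]
        total = sum(row)
        W.append([v / total for v in row])
    return W

for _ in range(5):
    pi, profile, mu, var = m_step(W)
    W = e_step(pi, profile, mu, var)

print("class profiles:", profile)
print("weights:", W)
```

After every iteration the class profiles remain indicators of `owner`, and each row of `W` has nonzero entries only for the centers of that observation's class, which is the structure MDA maintains by construction.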