Ex. 2.2
Show how to compute the Bayes decision boundary for the simulation example in Figure 2.5.
Soln. 2.2
Let's first recall how the data is generated (starting from the bottom of page 16 in the text).
First, we generate 10 means \(m_k\) from a bivariate Gaussian \(N((1,0)^T, \textbf{I})\) and label this class BLUE. Similarly, we generate 10 more means, denoted \(o_k\), from \(N((0,1)^T, \textbf{I})\) and label this class ORANGE. We regard the \(m_k\) and \(o_k\) as fixed for this problem.
Next, for each color (class), we generate 100 observations in the following way. For each observation, we pick an \(m_k\) (\(o_k\), respectively) at random with probability \(1/10\), and generate a variable with distribution \(N(m_k, \textbf{I}/5)\) (\(N(o_k, \textbf{I}/5)\), respectively), thus leading to a mixture of Gaussian clusters for each class.
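The generative process above can be sketched in a few lines of NumPy. The seed, and the resulting centers, are arbitrary choices for illustration; only the distributional structure mirrors the text.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only

# 10 class centers each: BLUE ~ N((1,0)^T, I), ORANGE ~ N((0,1)^T, I)
m = rng.multivariate_normal([1, 0], np.eye(2), size=10)  # BLUE centers m_k
o = rng.multivariate_normal([0, 1], np.eye(2), size=10)  # ORANGE centers o_k

def sample_class(means, n=100):
    """Draw n observations: pick a center uniformly at random (prob 1/10),
    then add N(0, I/5) noise around it."""
    idx = rng.integers(0, len(means), size=n)
    noise = rng.multivariate_normal([0, 0], np.eye(2) / 5, size=n)
    return means[idx] + noise

blue = sample_class(m)      # 100 BLUE observations
orange = sample_class(o)    # 100 ORANGE observations
```

The two resulting clouds are each a 10-component Gaussian mixture, as described above.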
Therefore we have
\begin{equation}\label{eq:2-2blue}
P(X=x\mid \text{BLUE}) = \frac{1}{10}\sum_{i=1}^{10}\phi(x; m_i),
\end{equation}
where \(\phi(\cdot\,; m_i)\) is the density of \(N(m_i, \textbf{I}/5)\).
Similarly,
\begin{equation}\label{eq:2-2organge}
P(X=x\mid \text{ORANGE}) = \frac{1}{10}\sum_{i=1}^{10}\phi(x; o_i).
\end{equation}
The Bayes decision boundary is determined by
\begin{equation}\label{eq:2-2bound}
P(\text{BLUE}\mid X=x) = P(\text{ORANGE}\mid X=x).
\end{equation}
From Bayes' formula we have
\begin{equation}
P(\text{BLUE}\mid X=x) = \frac{P(X=x\mid \text{BLUE})\,P(\text{BLUE})}{P(X=x\mid \text{BLUE})\,P(\text{BLUE}) + P(X=x\mid \text{ORANGE})\,P(\text{ORANGE})}.
\end{equation}
Similarly we have
\begin{equation}
P(\text{ORANGE}\mid X=x) = \frac{P(X=x\mid \text{ORANGE})\,P(\text{ORANGE})}{P(X=x\mid \text{BLUE})\,P(\text{BLUE}) + P(X=x\mid \text{ORANGE})\,P(\text{ORANGE})}.
\end{equation}
Therefore the boundary equation \(\eqref{eq:2-2bound}\) reduces to
\begin{equation}
P(X=x\mid \text{BLUE})\,P(\text{BLUE}) = P(X=x\mid \text{ORANGE})\,P(\text{ORANGE}).
\end{equation}
Note that in this example \(P(\text{BLUE}) = P(\text{ORANGE}) = 1/2\), so we have
\begin{equation}
P(X=x\mid \text{BLUE}) = P(X=x\mid \text{ORANGE}).
\end{equation}
Recalling \(\eqref{eq:2-2blue}\) and \(\eqref{eq:2-2organge}\), the Bayes decision boundary is the set of points \(x\) satisfying \(\sum_{i=1}^{10}\phi(x; m_i) = \sum_{i=1}^{10}\phi(x; o_i)\). Since the means \(m_k\) and \(o_k\) are known, this boundary can be computed numerically, for instance by evaluating both mixture densities on a fine grid and tracing the zero level set of their difference.
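As a numerical sketch of this last step: the code below evaluates the two mixture densities on a grid and computes their difference, whose zero level set is the Bayes boundary. The centers are drawn here only for illustration; in the book's example they are the fixed means used to generate Figure 2.5.

```python
import numpy as np

def mixture_density(x, means, var=1/5):
    """(1/10) * sum_k phi(x; m_k), with phi the N(m_k, var*I) density in 2D.
    x has shape (n, 2), means has shape (10, 2)."""
    d2 = ((x[:, None, :] - means[None, :, :]) ** 2).sum(-1)  # (n, 10)
    phi = np.exp(-d2 / (2 * var)) / (2 * np.pi * var)        # bivariate normal
    return phi.mean(axis=1)                                  # average over the 10 components

# illustrative centers (in the actual example these are the fixed m_k, o_k)
rng = np.random.default_rng(0)
m = rng.multivariate_normal([1, 0], np.eye(2), size=10)
o = rng.multivariate_normal([0, 1], np.eye(2), size=10)

# evaluate P(x|BLUE) - P(x|ORANGE) on a grid; the Bayes decision
# boundary is where this difference changes sign (the zero contour)
xs, ys = np.meshgrid(np.linspace(-3, 4, 200), np.linspace(-3, 4, 200))
grid = np.column_stack([xs.ravel(), ys.ravel()])
diff = mixture_density(grid, m) - mixture_density(grid, o)
```

Plotting, e.g. `plt.contour(xs, ys, diff.reshape(xs.shape), levels=[0])` with matplotlib, then traces the boundary curve as in Figure 2.5.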