Ex. 18.14
Distance weighted 1-NN classification. Consider the 1-nearest-neighbor method (Section 13.3) in a two-class classification problem. Let \(d_+(x_0)\) be the shortest distance to a training observation in class \(+1\), and likewise \(d_-(x_0)\) the shortest distance for class \(-1\). Let \(N_-\) be the number of samples in class \(-1\), \(N_+\) the number in class \(+1\), and \(N=N_- + N_+\).
(a) Show that
\[
\delta(x_0) = \log d_-(x_0) - \log d_+(x_0)
\]
can be viewed as a nonparametric discriminant function corresponding to \(1\)-NN classification. [Hint: Show that \(\hat f_+(x_0)=\frac{1}{N_+d_+(x_0)}\) can be viewed as a nonparametric estimate of the density in class +1 at \(x_0\)].
(b) How would you modify this function to introduce class prior probabilities \(\pi_+\) and \(\pi_-\) different from the sample priors \(N_+/N\) and \(N_-/N\)?
(c) How would you generalize this approach for \(K\)-NN classification?
Soln. 18.14
(a) Following the hint, consider \(\hat f_+(x_0)=\frac{1}{N_+ d_+(x_0)}\). A neighborhood of \(x_0\) of radius \(d_+(x_0)\) contains roughly one of the \(N_+\) class \(+1\) training points, so \(\hat f_+\) can be viewed (up to a constant) as a nonparametric estimate of the density in class \(+1\) at \(x_0\); likewise \(\hat f_-(x_0)=\frac{1}{N_- d_-(x_0)}\) for class \(-1\). Using the sample priors \(\hat\pi_\pm = N_\pm/N\), the log posterior odds are
\[
\log\frac{\hat\pi_+ \hat f_+(x_0)}{\hat\pi_- \hat f_-(x_0)}
= \log\frac{(N_+/N)\cdot\frac{1}{N_+ d_+(x_0)}}{(N_-/N)\cdot\frac{1}{N_- d_-(x_0)}}
= \log d_-(x_0) - \log d_+(x_0) = \delta(x_0).
\]
Moreover, \(\delta(x_0) > 0\) is equivalent to \(d_-(x_0) > d_+(x_0)\), in which case the nearest neighbor of \(x_0\) belongs to class \(+1\) and 1-NN assigns \(x_0\) to class \(+1\). Therefore \(\delta(x_0)\) can be viewed as a nonparametric discriminant function corresponding to 1-NN classification.
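As a concrete check, here is a minimal NumPy sketch of this rule (the function and variable names are our own, not from the text): it computes the two nearest-neighbor distances and classifies by the sign of \(\delta(x_0)\).

```python
import numpy as np

def delta_1nn(x0, X_pos, X_neg):
    """1-NN discriminant: log d_-(x0) - log d_+(x0).

    X_pos, X_neg: training points in classes +1 and -1.
    A positive value means x0 is closer to class +1, so predict +1.
    """
    d_plus = np.min(np.linalg.norm(X_pos - x0, axis=1))   # d_+(x0)
    d_minus = np.min(np.linalg.norm(X_neg - x0, axis=1))  # d_-(x0)
    return np.log(d_minus) - np.log(d_plus)

# toy example: two clusters on the real line (points stored as 1-vectors)
X_pos = np.array([[1.0], [1.2], [0.8]])
X_neg = np.array([[-1.0], [-1.1], [-0.9]])
x0 = np.array([0.5])
print(delta_1nn(x0, X_pos, X_neg) > 0)  # True: assign x0 to class +1
```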
(b) Note that by Bayes' formula, we have
\[
\Pr(+1 \mid x_0) = \frac{\pi_+ f_+(x_0)}{\pi_+ f_+(x_0) + \pi_- f_-(x_0)},
\]
so the discriminant is again the log posterior odds, now with the priors \(\pi_\pm\) in place of \(N_\pm/N\). Therefore we have
\[
\delta(x_0) = \log\frac{\pi_+ \hat f_+(x_0)}{\pi_- \hat f_-(x_0)}
= \log d_-(x_0) - \log d_+(x_0) + \log\frac{\pi_+ N_-}{\pi_- N_+},
\]
which reduces to the discriminant in (a) when \(\pi_\pm = N_\pm/N\).
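A hedged sketch of the prior-adjusted rule, reusing the names from the block above: the only change relative to part (a) is the additive offset \(\log\frac{\pi_+ N_-}{\pi_- N_+}\).

```python
import numpy as np

def delta_1nn_priors(x0, X_pos, X_neg, pi_pos, pi_neg):
    """1-NN discriminant with user-specified priors pi_+ and pi_-.

    Equals log d_-(x0) - log d_+(x0) + log(pi_+ N_- / (pi_- N_+)),
    which recovers the rule in (a) when pi_+/pi_- = N_+/N_-.
    """
    d_plus = np.min(np.linalg.norm(X_pos - x0, axis=1))
    d_minus = np.min(np.linalg.norm(X_neg - x0, axis=1))
    offset = np.log(pi_pos * len(X_neg) / (pi_neg * len(X_pos)))
    return np.log(d_minus) - np.log(d_plus) + offset
```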
(c) For \(K\)-NN, given \(x_0\), compute the distances \(d(x_0, x_k)\) between \(x_0\) and its \(K\) closest training samples \(x_k\), \(k=1,\ldots,K\), with class labels \(y_k \in \{+1, -1\}\). Then we can choose the nonparametric discriminant function as the distance-weighted vote
\[
\delta(x_0) = \sum_{k=1}^{K} w_k\, y_k,
\]
where the \(w_k\) are the weights, in our case \(w_k = \frac{1}{d(x_0, x_k)}\) for \(k=1,\ldots,K\), and we assign \(x_0\) to class \(+1\) when \(\delta(x_0) > 0\).
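A minimal sketch of this weighted vote (again with names of our own choosing): each of the \(K\) nearest neighbors contributes its label \(y_k\) with weight \(1/d(x_0, x_k)\).

```python
import numpy as np

def delta_knn(x0, X, y, K):
    """Distance-weighted K-NN discriminant: sum_k w_k * y_k, w_k = 1/d_k.

    X: (N, p) training inputs; y: labels in {+1, -1}; predict sign(delta).
    """
    d = np.linalg.norm(X - x0, axis=1)  # distances to all N training points
    idx = np.argsort(d)[:K]             # indices of the K nearest neighbors
    return np.sum(y[idx] / d[idx])      # weighted vote

# toy example with K = 3
X = np.array([[1.0], [1.2], [0.8], [-1.0], [-1.1]])
y = np.array([1, 1, 1, -1, -1])
print(np.sign(delta_knn(np.array([0.5]), X, y, K=3)))  # 1.0: class +1
```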