Ex. 15.4
Suppose \(x_i, i =1,...,n\) are iid with mean \(\mu\) and variance \(\sigma^2\). Let \(\bar x_1^\ast\) and \(\bar x_2^\ast\) be two bootstrap realizations of the sample mean. Show that the sampling correlation \(\text{corr}(\bar x_1^\ast, \bar x_2^\ast)=\frac{n}{2n-1}\approx 50\%\). Along the way, derive \(\text{var}(\bar x_1^\ast)\) and the variance of the bagged mean \(\bar x_{\text{bag}}\). Here \(\bar x\) is a linear statistic; bagging produces no reduction in variance for linear statistics.
Soln. 15.4
Denote
\[\begin{eqnarray}
\bar x_1^\ast = \frac{1}{n}\sum_{i=1}^n\hat x_i,\ \ \bar x_2^\ast = \frac{1}{n}\sum_{i=1}^n\tilde x_i,\non
\end{eqnarray}\]
where \(\{\hat x_i, i=1,...,n\}\) and \(\{\tilde x_i, i=1,...,n\}\) are the first and the second bootstrap samples, respectively.
Note that both \(\hat x_i\) and \(\tilde x_i\) are drawn uniformly at random, with replacement, from \(\{x_i, i=1,...,n\}\).
Marginally, each draw therefore has the same distribution as a single \(x_i\), so for any \(1\le i\le n\) we have
\[\begin{eqnarray}
E[\hat x_i] &=& E[\tilde x_i] = \mu,\non\\
\text{var}(\hat x_i) &=& \text{var}(\tilde x_i) = \sigma^2.\non
\end{eqnarray}\]
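Also, any two distinct bootstrap draws have the same covariance, whether they come from different bootstrap samples or from the same one. For any \(1\le i, j\le n\) and any \(1\le k\neq l\le n\), we have
\[\begin{eqnarray}
\text{cov}(\hat x_i, \tilde x_j) = \text{cov}(\hat x_k, \hat x_l) = \frac{\sigma^2}{n}.\non
\end{eqnarray}\]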
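One way to see where the factor \(1/n\) comes from is to condition on which original observations the two draws select. This is a sketch; the index variables \(I\) and \(J\) are notation introduced here and are not part of the exercise. Write \(\hat x_i = x_I\) and \(\tilde x_j = x_J\), where \(I\) and \(J\) are independent, uniform on \(\{1,...,n\}\), and independent of the data. Given \((I, J)\), the two draws are the same observation if \(I=J\) and independent otherwise, and each has conditional mean \(\mu\). The law of total covariance then gives
\[\begin{eqnarray}
\text{cov}(x_I, x_J) &=& E\left[\text{cov}(x_I, x_J\mid I, J)\right] + \text{cov}\left(E[x_I\mid I, J], E[x_J\mid I, J]\right)\non\\
&=& \sigma^2\cdot P(I=J) + \text{cov}(\mu, \mu)\non\\
&=& \frac{\sigma^2}{n}.\non
\end{eqnarray}\]
The same argument applies to two distinct draws within one bootstrap sample, because sampling with replacement makes their indices independent as well.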
Then, we know
\[\begin{eqnarray}
\text{cov}(\bar x_1^\ast, \bar x_2^\ast) = \frac{1}{n^2}\left(\sum_{i,j}^n\text{cov}(\hat x_i, \tilde x_j)\right)=\frac{\sigma^2}{n},\non
\end{eqnarray}\]
and
\[\begin{eqnarray}
\text{var}(\bar x_1^\ast)
&=&\text{var}\left(\frac{1}{n}\sum_{i=1}^n\hat x_i\right)\non\\
&=&\frac{1}{n^2}\left(\sum_{i=1}^n\text{var}(\hat x_i) + \sum_{j\neq k}^n\text{cov}(\hat x_j, \hat x_k)\right)\non\\
&=&\frac{1}{n^2}\left(n\cdot \sigma^2 + (n^2-n)\cdot \frac{\sigma^2}{n}\right)\non\\
&=&\frac{(2n-1)\sigma^2}{n^2}.\non
\end{eqnarray}\]
Since \(\text{var}(\bar x_2^\ast) = \text{var}(\bar x_1^\ast)\) by symmetry, it follows that
\[\begin{equation}
\text{corr}(\bar x_1^\ast, \bar x_2^\ast) = \frac{\text{cov}(\bar x_1^\ast, \bar x_2^\ast)}{\sqrt{\text{var}(\bar x_1^\ast)\text{var}(\bar x_2^\ast)}} = \frac{n}{2n-1}.\non
\end{equation}\]
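The correlation and variance formulas can be checked numerically. The following is a minimal Monte Carlo sketch, not part of the original solution. It assumes Gaussian data with \(\mu=0\), \(\sigma=1\) purely for illustration (the result only needs iid data with finite variance), and the function names `bootstrap_mean_pair`, `sample_cov`, and `simulate` are hypothetical helpers. Note that the sampling correlation is taken over both the original sample and the bootstrap draws, so each replicate draws a fresh sample.

```python
import random


def bootstrap_mean_pair(n, rng, mu=0.0, sigma=1.0):
    """Draw a fresh sample of size n, then return two bootstrap means from it."""
    # Assumption for illustration: the data are Gaussian.
    x = [rng.gauss(mu, sigma) for _ in range(n)]
    # Each bootstrap sample is n draws with replacement from x.
    m1 = sum(rng.choice(x) for _ in range(n)) / n
    m2 = sum(rng.choice(x) for _ in range(n)) / n
    return m1, m2


def sample_cov(a, b):
    """Unbiased sample covariance of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / (len(a) - 1)


def simulate(n=10, reps=20000, seed=0):
    """Estimate corr(m1, m2) and var(m1) across independent replicates."""
    rng = random.Random(seed)
    pairs = [bootstrap_mean_pair(n, rng) for _ in range(reps)]
    m1 = [p[0] for p in pairs]
    m2 = [p[1] for p in pairs]
    var1 = sample_cov(m1, m1)
    var2 = sample_cov(m2, m2)
    corr = sample_cov(m1, m2) / (var1 * var2) ** 0.5
    return corr, var1


n = 10
corr_hat, var_hat = simulate(n)
print(f"corr: simulated {corr_hat:.3f}, theory {n / (2 * n - 1):.3f}")
print(f"var:  simulated {var_hat:.3f}, theory {(2 * n - 1) / n**2:.3f}")
```

With \(\sigma=1\) the theoretical values are \(n/(2n-1)\) for the correlation and \((2n-1)/n^2\) for the variance; the simulated values should agree with them up to Monte Carlo error, which shrinks as `reps` grows.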
We have already derived \(\text{var}(\bar x_1^\ast)\) above. For \(\bar x_{\text{bag}}\), suppose we average \(B\) bootstrap realizations \(\bar x_1^\ast,...,\bar x_B^\ast\). Then
\[\begin{eqnarray}
\text{var}(\bar x_{\text{bag}}) &=& \text{var}\left(\frac{1}{B}\sum_{i=1}^B\bar x_i^\ast\right)\non\\
&=&\frac{1}{B^2}\sum_{i=1}^B\text{var}(\bar x_i^\ast) + \frac{1}{B^2}\sum_{j\neq k}^B\text{cov}(\bar x_j^\ast, \bar x_k^\ast)\non\\
&=&\frac{1}{B}\cdot\frac{(2n-1)\sigma^2}{n^2}+\frac{B-1}{B}\cdot \frac{\sigma^2}{n}\non\\
&=&\frac{(2n-1)+(B-1)n}{Bn^2}\cdot \sigma^2.\non
\end{eqnarray}\]
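As a consistency check on the last line (a sketch that only rearranges the formula above), the numerator can be regrouped as \((2n-1)+(B-1)n = Bn + (n-1)\), which gives
\[\begin{eqnarray}
\text{var}(\bar x_{\text{bag}}) &=& \frac{Bn + (n-1)}{Bn^2}\cdot\sigma^2\non\\
&=& \frac{\sigma^2}{n} + \frac{(n-1)\sigma^2}{Bn^2}.\non
\end{eqnarray}\]
With \(B=1\) this reduces to \(\frac{(2n-1)\sigma^2}{n^2}=\text{var}(\bar x_1^\ast)\). As \(B\to\infty\) it decreases to \(\frac{\sigma^2}{n}=\text{var}(\bar x)\) and never falls below it, which is consistent with the claim that bagging does not reduce the variance of a linear statistic.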