Ex. 9.2

Let \(\bb{A}\) be a known \(k \times k\) matrix, \(\bb{b}\) be a matrix known \(k\)-vector, and \(\bb{z}\) be an unknown \(k\)-vector. A Gauss-Seidel algorithm for solving the linear system of equations \(\bb{A}\bb{z} = \bb{b}\) works by successively solving for element \(z_j\) in the \(j\)th equation, fixing all other \(z_j\)'s at their current guesses. This process is repeated for \(j=1,2,...,k,1,2,...,k,...,\) until convergence (Matrix computations).

(a)

Consider an additive model with \(N\) observations and \(p\) terms, with the \(j\)th term to be fit by a linear smoother \(\bb{S}_j\). Consider the following system of equations:

\[\begin{equation} \begin{pmatrix} \bb{I} & \bb{S}_1 & \bb{S}_1 & \cdots & \bb{S}_1 \\ \bb{S}_2 & \bb{I} & \bb{S}_2 & \cdots & \bb{S}_2 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \bb{S}_p & \bb{S}_p & \bb{S}_p & \cdots & \bb{I} \end{pmatrix} \begin{pmatrix} \bb{f}_1\\ \bb{f}_2\\ \vdots \\ \bb{f}_p \end{pmatrix} = \begin{pmatrix} \bb{S}_1\by\\ \bb{S}_2\by\\ \vdots\\ \bb{S}_p\by \end{pmatrix}.\non \end{equation}\]

Here each \(\bb{f}_j\) is an \(N\)-vector of evaluations of the \(j\)th function at the data points, and \(\by\) is an \(N\)-vector of the response values. Show that backfitting is a blockwise Gauss-Seidel algorithm for solving system of equations.

(b)

Let \(\bb{S}_1\) and \(\bb{S}_2\) be symmetric smoothing operators (matrices) with eigenvalues in \([0,1)\). Consider a backfitting algorithm with response vector \(\by\) and smoothers \(\bb{S}_1\), \(\bb{S}_2\). Show that with any starting values, the algorithm converges and give a formula for the final iterates.

Soln. 9.2

(a)

For \(j\)-th equation, we have

\[\begin{equation} \bb{S}_j\bb{f}_1 + ... \bb{S}_j\bb{f}_{j-1} + \bb{f}_j + ... + \bb{S}_j\bb{f}_p = \bb{S}_j\by,\non \end{equation}\]

so that

\[\begin{equation} \bb{f}_j = \bb{S}_j\left(\by - \sum_{k\neq j}\bb{f}_k\right)\non \end{equation}\]

which has the same as the second step in Algorithm 9.1. Therefore we can regard backfitting as a blockwise Gauss-Seidel algorithm.

(b)

Denote \(\bb{f}_1(k)\) and \(\bb{f}_2(k)\) as the value for the \(k\)-th iteration with initial values \(\bb{f}_1(0)\) and \(\bb{f}_2(0)\), we have

\[\begin{eqnarray} \bb{f}_1(k) &=& \bb{S}_1(\by-\bb{f}_2(k-1))\non\\ \bb{f}_2(k) &=& \bb{S}_2(\by-\bb{f}_1(k-1)).\non \end{eqnarray}\]

Therefore it's easy to derive

\[\begin{eqnarray} \bb{f}_2(k) &=& \bb{S}_2\bb{S}_1\bb{f}_2(k-1) + \bb{S}_2\by-\bb{S}_2\bb{S}_1\by\non\\ &=&(\bb{S}_2\bb{S}_1)^2\bb{f}_2(k-2) + (\bb{S}_2\bb{S}_1+\bb{I})(\bb{S}_2\by-\bb{S}_2\bb{S}_1\by)\non\\ &=&\cdots\non\\ &=&(\bb{S}_2\bb{S}_1)^k\bb{f}_2(0) + [(\bb{S}_2\bb{S}_1)^{k-1} + \cdots + \bb{S}_2\bb{S}_1+\bb{I}](\bb{S}_2\by-\bb{S}_2\bb{S}_1\by).\non \end{eqnarray}\]

Thus, as \(k\ra\infty\), we have \(\bb{f}_2(k)\) converges to

\[\begin{equation} (\bb{I}-\bb{S}_2\bb{S}_1)^{-1}(\bb{S}_2-\bb{S}_2\bb{S}_1)\by.\non \end{equation}\]

Similarly for \(\bb{f}_1(k)\), it converges to

\[\begin{equation} (\bb{I}-\bb{S}_1\bb{S}_2)^{-1}(\bb{S}_1-\bb{S}_1\bb{S}_2)\by.\non \end{equation}\]