Ex. 14.8

Ex. 14.8

Derive the solution (14.57) to the Procrustes problem (14.56). Derive also the solution to the Procrustes problem with scaling (14.58).

Soln. 14.8

We first derive the solution (14.57). We need to solve the following optimization problem

minμ,Rtrace[(X2(X1R+1μT))T(X2(X1R+1μT))]s.t. RTR=I.

The Lagrangian for the problem is

L(μ,R,A)=trace[(X2(X1R+1μT))T(X2(X1R+1μT))] +trace[A(RTRI)].

By properties of derivatives on trace operator, we calculate the derivatives w.r.t to μ,R and set them to zero, which gives

L(μ,R,A)μ=2X2T1+RTX1T1+2Nμ=0L(μ,R,A)R=2X1TX1R2X1TX22X1T1μT+R(A+AT)=0

From the first equation above, we get

μ^=1NX2T11NRTX1T1=x¯2RTx¯1.

Plug above into the second condition for R we get (by some algebra)

(1)X~1TX~2=X~1TX~1R+RA¯,

where

X~i=Xi1x¯iTA¯=A+AT2.

Since RTR=I, multiply RT on both sides of (1) we get

RTX~1TX~2=RTX~1TX~1R+A¯.

Since both summands on the right hand side are symmetric, we know that RTX~1TX~2=X~2TX~1TR. Now it's easy to verify that

R^=UVT

satisfies the condition, where X~1X~2=UDVT.

For the problem with scaling (14.58), the idea is the similar. The Lagrangian is

L(β,R,A)=trace[(X2βX1R)T(X2βX1R)]+trace[A(RTRI)].

We have

L(β,R,A)β=2trace[X2RTX1T]+2βtrace[RTX1TX1R]=2trace[X2RTX1T]+2βtrace[RRTX1TX1]=2trace[X2RTX1T]+2βtrace[X1TX1].

Therefore we see

(2)β^=trace[RTX1TX2]X1F.

With β^ fixed, we already derived R^ in the first part of this exercise, with X~1 replaced with β^X~1. Therefore we know R^=UVT as before. Plug R^ into (2) we have

β^=trace[VUTUDVT]X1F=trace[D]X1F.