Estimation of Rotations and Translations

There are several techniques for obtaining an optimal estimate of the rigid rotation and translation relating two sets of points in the same space. Optimal here means the maximum likelihood estimate, under the assumption that the observation noise $\xi_i$ is purely additive in the space $\mathbb{R}^{n}$ in which the points reside.

Let $\left\{ \prescript{1}{}{\mathbf{x}}_i \right\}$ and $\left\{ \prescript{2}{}{\mathbf{x}}_i \right\}$ be two sets of points related by

\begin{displaymath}
\prescript{2}{}{\mathbf{x}}_i = \prescript{2}{}{\mathbf{R}}_1 \prescript{1}{}{\mathbf{x}}_i + \prescript{2}{}{\mathbf{t}} + \xi_i
\end{displaymath} (1.68)

as in equation (1.64), but with the addition of the noise vector $\xi_i$.

To find the optimal estimate $(\hat{\mathbf{R}}, \hat{\mathbf{t}})$ of the transformation that maps the points $\prescript{1}{}{\mathbf{x}}_i$ onto $\prescript{2}{}{\mathbf{x}}_i$, a least-squares criterion is adopted, minimizing a cost function of the form

\begin{displaymath}
S = \sum_i w_i \Vert \left( \prescript{2}{}{\mathbf{R}}_1 \prescript{1}{}{\mathbf{x}}_i + \prescript{2}{}{\mathbf{t}} \right) - \prescript{2}{}{\mathbf{x}}_i \Vert^2
\end{displaymath} (1.69)

where the $w_i$ are optional weights (priors) associated with each sample. Solved directly in this non-linear form, the least-squares problem clearly suffers from local minima.

A notable result concerning the translation vector is obtained by imposing the classic condition $0 = \partial S / \partial \mathbf{t}$ on equation (1.69), whose derivative vanishes at

\begin{displaymath}
\hat{\mathbf{t}} = \bar{\prescript{2}{}{\mathbf{x}}} - \hat{\mathbf{R}} \bar{\prescript{1}{}{\mathbf{x}}}
\end{displaymath} (1.70)

and therefore, given $\mathbf{R}$, the optimal translation follows immediately (or vice versa): it suffices to estimate the rotation $\hat{\mathbf{R}}$ alone to solve the pose problem. The same result can be reached intuitively: after a rigid transformation, the centroids of the two point sets must themselves satisfy the relationship of equation (1.68).
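Explicitly, the stationarity condition that leads to (1.70) reads

\begin{displaymath}
\frac{\partial S}{\partial \mathbf{t}} = 2 \sum_i w_i \left( \hat{\mathbf{R}} \, \prescript{1}{}{\mathbf{x}}_i + \mathbf{t} - \prescript{2}{}{\mathbf{x}}_i \right) = 0
\qquad \Rightarrow \qquad
\hat{\mathbf{t}} = \frac{\sum_i w_i \, \prescript{2}{}{\mathbf{x}}_i}{\sum_i w_i} - \hat{\mathbf{R}} \, \frac{\sum_i w_i \, \prescript{1}{}{\mathbf{x}}_i}{\sum_i w_i}
\end{displaymath}

so the barred quantities in (1.70) are to be read as the (weighted) centroids of the two point sets.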

The optimal rotation $\mathbf{R}$ must therefore minimize

\begin{displaymath}
S = \sum_i w_i \Vert \prescript{2}{}{\mathbf{R}}_1 \prescript{1}{}{\mathbf{p}}_i - \prescript{2}{}{\mathbf{p}}_i \Vert^2
\end{displaymath} (1.71)

having defined the centered points $\mathbf{p}_i = \mathbf{x}_i - \bar{\mathbf{x}}$. Any pose registration problem in $n$ dimensions therefore reduces to estimating the rotation alone, i.e. a problem with the $n(n-1)/2$ degrees of freedom of the rotation.

A first technique for computing the rotation makes use of principal components (see section 2.10.1). The principal components extracted separately from each point set form a basis of the space, and a rotation aligning these bases is obtained directly from the matrices of column eigenvectors $\mathbf{R}_1$ and $\mathbf{R}_2$ as $\mathbf{R} = \mathbf{R}_2 \left( \mathbf{R}_1 \right)^{\top}$. However, multiple solutions may exist owing to the sign ambiguity of the eigenvectors (each candidate should be verified), and under noise the axes estimated by PCA can become extremely unreliable (in the limit of a circular distribution, any estimate is impossible).
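As a concrete illustration, the following is a minimal NumPy sketch of this idea; the function name \texttt{pca\_align} and the sign-fixing convention are assumptions of the sketch, not part of the original method.

\begin{verbatim}
import numpy as np

def pca_align(X1, X2):
    """Rough rotation estimate by aligning principal axes.

    X1, X2: (m, n) arrays, one point per row; no correspondences needed.
    """
    # Eigenvectors of each covariance matrix form a basis of the space
    # (columns sorted by ascending eigenvalue in both clouds).
    _, R1 = np.linalg.eigh(np.cov(X1.T))
    _, R2 = np.linalg.eigh(np.cov(X2.T))
    # Each eigenvector's sign is arbitrary: force right-handed bases here;
    # the remaining sign combinations should each be verified separately.
    if np.linalg.det(R1) < 0:
        R1[:, 0] *= -1
    if np.linalg.det(R2) < 0:
        R2[:, 0] *= -1
    return R2 @ R1.T  # candidate rotation mapping frame 1 onto frame 2
\end{verbatim}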

The better approach to minimizing (1.71) is to recast the minimization as an equivalent maximization:

\begin{displaymath}
\argmin_{\mathbf{R}} \sum_i w_i \Vert \prescript{2}{}{\mathbf{R}}_1 \prescript{1}{}{\mathbf{p}}_i - \prescript{2}{}{\mathbf{p}}_i \Vert^2 = \argmax_{\mathbf{R}} \tr \left( \mathbf{R} \mathbf{P}_1 \mathbf{W} \mathbf{P}^{\top}_2 \right)
\end{displaymath} (1.72)

where $\mathbf{P}_1$ and $\mathbf{P}_2$ are the matrices whose columns are the centered points and $\mathbf{W}$ is the diagonal matrix of the weights $w_i$: minimizing the cost (1.71) is equivalent to maximizing the trace of the matrix $\mathbf{R} \mathbf{H}$, where $\mathbf{H} = \mathbf{P}_1 \mathbf{W} \mathbf{P}^{\top}_2$ is the correlation matrix between the two point clouds, defined as

\begin{displaymath}
\mathbf{H} = \cov (\prescript{1}{}{\mathbf{x}}, \prescript{2}{}{\mathbf{x}}) = \sum_i w_i \left( \prescript{1}{}{\mathbf{x}}_i - \bar{\prescript{1}{}{\mathbf{x}}} \right) \left( \prescript{2}{}{\mathbf{x}}_i - \bar{\prescript{2}{}{\mathbf{x}}} \right)^{\top}
\end{displaymath} (1.73)

It can be shown that the matrix $\hat{\mathbf{R}}$ that maximizes the trace of $\mathbf{R} \mathbf{H}$ is
\begin{displaymath}
\hat{\mathbf{R}} = \mathbf{V} \mathbf{U}^{\top}
\end{displaymath} (1.74)

having decomposed $\mathbf{H} = \mathbf{U} \Sigma \mathbf{V}^{\top}$ by singular value decomposition: indeed $\tr ( \hat{\mathbf{R}} \mathbf{H} ) = \tr ( \mathbf{V} \mathbf{U}^{\top} \mathbf{U} \Sigma \mathbf{V}^{\top} ) = \tr ( \Sigma )$, the largest value attainable by any orthogonal matrix. The algorithm was first described in (Kab76), rediscovered in (AHB87), and notably improved in (Ume91) (in particular for the case $\det\hat{\mathbf{R}} = -1$); it is commonly referred to as the Kabsch-Umeyama algorithm.

This solution, much more stable than the PCA-based one, requires particular attention only to handle potential reflections: the determinant of the resulting matrix may be $-1$ rather than $+1$ (typically with degenerate or very noisy data), a case handled by the correction of (Ume91).
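A minimal NumPy sketch of the complete SVD-based estimator, including the determinant correction of (Ume91), might look as follows; the function name and the weight normalization are illustrative choices of this sketch.

\begin{verbatim}
import numpy as np

def kabsch_umeyama(X1, X2, w=None):
    """Least-squares rigid transform: x2_i ~ R @ x1_i + t for paired rows.

    X1, X2: (m, n) arrays of corresponding points, one per row.
    w: optional (m,) array of per-sample weights.
    Returns (R, t) with R an (n, n) proper rotation.
    """
    m, n = X1.shape
    w = np.ones(m) if w is None else np.asarray(w, float)
    w = w / w.sum()
    # Weighted centroids; by (1.70) only the rotation remains to solve.
    c1, c2 = w @ X1, w @ X2
    P1, P2 = X1 - c1, X2 - c2
    # Cross-covariance H of equation (1.73).
    H = P1.T @ (w[:, None] * P2)
    U, S, Vt = np.linalg.svd(H)
    V = Vt.T
    # Umeyama's correction: if V @ U.T is a reflection (det = -1),
    # negate the axis associated with the smallest singular value.
    D = np.eye(n)
    D[-1, -1] = np.sign(np.linalg.det(V @ U.T))
    R = V @ D @ U.T
    t = c2 - R @ c1
    return R, t
\end{verbatim}

Given matched clouds \texttt{X1} and \texttt{X2}, the returned pair satisfies $\prescript{2}{}{\mathbf{x}}_i \approx \hat{\mathbf{R}} \, \prescript{1}{}{\mathbf{x}}_i + \hat{\mathbf{t}}$ in the least-squares sense.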

The "disadvantage" of the SVD-based technique compared to the PCA-based one is the requirement that the associations between the points of the two distributions be correct.

The combination of the two techniques with an iterative scheme, alternating correspondence search and closed-form pose estimation, is called Iterative Closest Point (ICP), sketched below.
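To illustrate how the pieces fit together, here is a bare-bones ICP loop built on the two sketches above; the nearest-neighbour search via \texttt{scipy.spatial.cKDTree}, the iteration cap and the convergence threshold are all assumptions of this sketch.

\begin{verbatim}
import numpy as np
from scipy.spatial import cKDTree

def icp(X1, X2, iters=50, tol=1e-8):
    """Align cloud X1 onto X2 without known correspondences.

    Alternates nearest-neighbour association with the closed-form
    kabsch_umeyama() step defined above; assumes the clouds already
    overlap reasonably (e.g. after the PCA-based initialization).
    """
    n = X1.shape[1]
    R, t = np.eye(n), np.zeros(n)
    tree = cKDTree(X2)
    prev = np.inf
    for _ in range(iters):
        Y = X1 @ R.T + t              # transform with current estimate
        dist, idx = tree.query(Y)     # closest point of X2 for each row
        R, t = kabsch_umeyama(X1, X2[idx])
        err = np.mean(dist ** 2)
        if abs(prev - err) < tol:     # stop when the MSE stalls
            break
        prev = err
    return R, t
\end{verbatim}

Like any alternating scheme, this loop converges only to a local minimum, which is why a rough initial estimate (for instance from PCA) matters.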
