
MLE Calculation of the Homography

From a computational perspective, the equation (8.48) is ill-conditioned since each column represents a quantity with a different order of magnitude. To obtain a correct linear solution, a prior normalization phase is required. Hartley and Zisserman (HZ04) emphasize that normalization in the DLT is an essential step and cannot be considered purely optional.
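The normalization step can be sketched in a few lines: the points are translated so that their centroid lies at the origin and scaled so that their mean distance from it is $\sqrt{2}$, the conditioning recommended by Hartley and Zisserman. A minimal Python sketch follows (function names are illustrative, not from any particular library):

```python
import math

def normalization_transform(points):
    """Similarity T mapping the points so that their centroid is the origin
    and their mean distance from it is sqrt(2)."""
    n = len(points)
    cx = sum(u for u, v in points) / n
    cy = sum(v for u, v in points) / n
    mean_dist = sum(math.hypot(u - cx, v - cy) for u, v in points) / n
    s = math.sqrt(2.0) / mean_dist
    # T acts on homogeneous coordinates: m_norm = T m
    return [[s, 0.0, -s * cx],
            [0.0, s, -s * cy],
            [0.0, 0.0, 1.0]]

def apply_transform(T, p):
    """Apply a 3x3 transform to an inhomogeneous 2-D point."""
    u, v = p
    w = T[2][0] * u + T[2][1] * v + T[2][2]
    return ((T[0][0] * u + T[0][1] * v + T[0][2]) / w,
            (T[1][0] * u + T[1][1] * v + T[1][2]) / w)

pts = [(100.0, 200.0), (300.0, 250.0), (150.0, 400.0), (320.0, 380.0)]
T = normalization_transform(pts)
norm_pts = [apply_transform(T, p) for p in pts]
```

The same transform is computed for both images; the homography estimated on the normalized coordinates is then denormalized at the end.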

The calculation of the homography in equation (8.49) has the drawback of not accounting for measurement errors on the points. The SVD in fact minimizes a residual that only incidentally resembles an error on the known term (which it is not, in equation (8.48)), and in any case the uncertainty on the parameter matrix cannot be assessed. When, as in this case, a purely mathematical error with no geometric interpretation is minimized by least squares, the procedure is referred to as algebraic least squares (ALS).

Since the DLT minimizes an algebraic rather than a geometric error, the normalized DLT, although superior from a computational standpoint, may yield poorer results in terms of geometric fit to the data. The least-squares solution of the normalized system (8.48) is referred to as normalized algebraic least squares (NALS).

To overcome the limitations of algebraic error calculation, it is necessary to return to the original problem and not attempt to transform it into a linear problem, but rather to solve it, for instance, iteratively, using a nonlinear minimizer.

If noise is present in only one of the two images, an appropriate cost function with geometric meaning is the Euclidean distance between the measured points and the transferred points. This quantity is commonly referred to as the transfer error, and its minimization leads to a nonlinear cost function of the form

\begin{displaymath}
\argmin_\mathbf{H} \sum \Vert \mathbf{m}'_i - \mathbf{H} \mathbf{m}_i \Vert^2
\end{displaymath} (8.53)

where $\mathbf{m}'_i$ is the image point affected by white Gaussian noise, while $\mathbf{m}_i$ is a perfectly known point. In this case, the function that minimizes the geometric error is also the one that represents the best estimate of the result from a Bayesian perspective (Maximum Likelihood Estimator or MLE).
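For illustration, once a candidate $\mathbf{H}$ is available the transfer error (8.53) can be evaluated directly in inhomogeneous coordinates. A minimal Python sketch (function names are illustrative):

```python
def project(H, m):
    """Apply the homography H (3x3 nested lists) to the point m = (u, v),
    returning inhomogeneous coordinates."""
    u, v = m
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    return ((H[0][0] * u + H[0][1] * v + H[0][2]) / w,
            (H[1][0] * u + H[1][1] * v + H[1][2]) / w)

def transfer_error(H, matches):
    """Cost (8.53): sum of squared distances between the measured points m'
    and the transferred points H m."""
    err = 0.0
    for m, m_prime in matches:
        u, v = project(H, m)
        err += (m_prime[0] - u) ** 2 + (m_prime[1] - v) ** 2
    return err

# Pure translation by (1, 2): with exact correspondences the cost is zero.
H = [[1.0, 0.0, 1.0], [0.0, 1.0, 2.0], [0.0, 0.0, 1.0]]
matches = [((0.0, 0.0), (1.0, 2.0)), ((3.0, 4.0), (4.0, 6.0))]
```

A nonlinear minimizer would iterate over the nine entries of $\mathbf{H}$ to drive this cost to its minimum.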

However, when both sets of points are affected by noise, the cost function (8.53) is no longer optimal. The simplest way to extend the previous solution is to minimize both the direct and the inverse transfer error (symmetric transfer error):

\begin{displaymath}
\argmin_\mathbf{H} \sum \Vert \mathbf{m}'_i - \mathbf{H} \mathbf{m}_i \Vert^2 + \Vert \mathbf{m}_i - \mathbf{H}^{-1} \mathbf{m}'_i \Vert^2
\end{displaymath} (8.54)

In this way, both contributions are taken into account in the solution of the problem.
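The symmetric cost (8.54) adds the backward residual through $\mathbf{H}^{-1}$; since a homography is non-singular by construction, the inverse can be computed directly, here via the adjugate of the $3 \times 3$ matrix. A self-contained Python sketch (function names are illustrative):

```python
def project(H, m):
    """Apply the 3x3 homography H to the inhomogeneous point m = (u, v)."""
    u, v = m
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    return ((H[0][0] * u + H[0][1] * v + H[0][2]) / w,
            (H[1][0] * u + H[1][1] * v + H[1][2]) / w)

def inverse3(H):
    """Inverse of a 3x3 matrix via the adjugate formula."""
    a, b, c = H[0]; d, e, f = H[1]; g, h, i = H[2]
    det = a * (e*i - f*h) - b * (d*i - f*g) + c * (d*h - e*g)
    return [[(e*i - f*h) / det, (c*h - b*i) / det, (b*f - c*e) / det],
            [(f*g - d*i) / det, (a*i - c*g) / det, (c*d - a*f) / det],
            [(d*h - e*g) / det, (b*g - a*h) / det, (a*e - b*d) / det]]

def symmetric_transfer_error(H, matches):
    """Cost (8.54): forward residual ||m' - H m||^2 plus
    backward residual ||m - H^-1 m'||^2."""
    Hinv = inverse3(H)
    err = 0.0
    for m, mp in matches:
        f = project(H, m)       # forward transfer  H m
        b = project(Hinv, mp)   # backward transfer H^-1 m'
        err += (mp[0] - f[0]) ** 2 + (mp[1] - f[1]) ** 2
        err += (m[0] - b[0]) ** 2 + (m[1] - b[1]) ** 2
    return err

# Translation by (1, 0); the measured m' is off by 0.5 along u, so the
# forward and backward residuals each contribute 0.25.
H = [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
matches = [((0.0, 0.0), (1.5, 0.0))]
```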

This, however, is not yet the optimal solution, at least from a statistical standpoint. A maximum likelihood estimator must indeed properly account for the noise in both datasets when present (what Hartley and Zisserman refer to as the Gold Standard). The alternative solution, which is in fact the more accurate one, consists of minimizing the Reprojection error.

This solution significantly increases the size of the problem, as it aims to identify the optimal points that are not affected by noise $\hat{\mathbf{m}}_i$ and $\hat{\mathbf{m}}'_i$:

\begin{displaymath}
\argmin_\mathbf{H} \sum \Vert \mathbf{m}'_i - \hat{\mathbf{m}}'_i \Vert^2 + \Vert \mathbf{m}_i - \hat{\mathbf{m}}_i \Vert^2
\end{displaymath} (8.55)

under the constraint $\hat{\mathbf{m}}'_i = \mathbf{H} \hat{\mathbf{m}}_i$.
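The structure of the cost (8.55) can be made explicit in a short Python sketch (function names are illustrative): the estimated ideal points $\hat{\mathbf{m}}_i$ enter as additional optimization variables, and the constraint is enforced by construction, transferring each $\hat{\mathbf{m}}_i$ through $\mathbf{H}$ rather than estimating $\hat{\mathbf{m}}'_i$ independently:

```python
def project(H, m):
    """Apply the 3x3 homography H to the inhomogeneous point m = (u, v)."""
    u, v = m
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    return ((H[0][0] * u + H[0][1] * v + H[0][2]) / w,
            (H[1][0] * u + H[1][1] * v + H[1][2]) / w)

def reprojection_error(H, matches, est_points):
    """Cost (8.55): est_points holds the auxiliary variables m_hat_i; the
    constraint m_hat'_i = H m_hat_i is enforced by construction."""
    err = 0.0
    for (m, mp), m_hat in zip(matches, est_points):
        mp_hat = project(H, m_hat)                             # H m_hat_i
        err += (mp[0] - mp_hat[0]) ** 2 + (mp[1] - mp_hat[1]) ** 2  # image 2
        err += (m[0] - m_hat[0]) ** 2 + (m[1] - m_hat[1]) ** 2      # image 1
    return err

# Translation by (1, 2): on exact data, choosing m_hat_i = m_i zeroes the cost.
H = [[1.0, 0.0, 1.0], [0.0, 1.0, 2.0], [0.0, 0.0, 1.0]]
matches = [((0.0, 0.0), (1.0, 2.0))]
```

A Gold Standard minimizer would search jointly over $\mathbf{H}$ and `est_points`, which is what makes the problem considerably larger than (8.53) or (8.54).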

In the even more general case with covariance noise measured for each individual point, the correct metric is the Mahalanobis distance (see section 2.4):

\begin{displaymath}
\Vert \mathbf{m} - \hat{\mathbf{m}} \Vert^{2}_{\Gamma} = (\mathbf{m} - \hat{\mathbf{m}})^{\top} \Gamma^{-1} (\mathbf{m} - \hat{\mathbf{m}})
\end{displaymath} (8.56)

When the noise is isotropic and identical for every point, the previous expression reduces (up to a scale factor) to the more intuitive Euclidean distance.
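For a 2-D point the Mahalanobis distance (8.56) involves only a $2 \times 2$ covariance, whose inverse is available in closed form. A minimal Python sketch (function name illustrative), which also shows the reduction to the Euclidean case when $\Gamma = \sigma^2 \mathbf{I}$:

```python
def mahalanobis2(m, m_hat, Gamma):
    """Squared Mahalanobis distance (8.56) between 2-D points m and m_hat
    under the 2x2 covariance Gamma."""
    du, dv = m[0] - m_hat[0], m[1] - m_hat[1]
    a, b = Gamma[0]
    c, d = Gamma[1]
    det = a * d - b * c
    # Gamma^{-1} (m - m_hat), using the closed-form 2x2 inverse
    iu = ( d * du - b * dv) / det
    iv = (-c * du + a * dv) / det
    # (m - m_hat)^T Gamma^{-1} (m - m_hat)
    return du * iu + dv * iv
```

With $\Gamma = \mathbf{I}$ the result equals the squared Euclidean distance; with $\Gamma = \sigma^2 \mathbf{I}$ it is the same quantity scaled by $1/\sigma^2$.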

Since this is a nonlinear minimization problem, an initial solution is required to start the search for the minimum of the cost function: the linear solution remains useful here, serving as the starting point from which a minimum under the geometric metric is sought.

The MLE estimator requires the use of an additional auxiliary variable $\hat{\mathbf{m}}_i$ for each point and iterative techniques to solve the problem. It is possible to use the Sampson error as an approximation of the geometric distance, as discussed in section 3.3.7. The homographic constraint (1.75) that relates the points of the two images can be expressed in the form of a two-dimensional manifold $\mathcal{V}_H$

\begin{displaymath}
\begin{array}{l}
h_0 u_1 + h_1 v_1 + h_2 - h_6 u_1 u_2 - h_7 v_1 u_2 - h_8 u_2 = 0 \\
h_3 u_1 + h_4 v_1 + h_5 - h_6 u_1 v_2 - h_7 v_1 v_2 - h_8 v_2 = 0 \\
\end{array}\end{displaymath} (8.57)

from which the Jacobian
\begin{displaymath}
\mathbf{J}_\mathcal{V} = \begin{bmatrix}
h_0 - h_6 u_2 & h_1 - h_7 u_2 & -h_6 u_1 - h_7 v_1 - h_8 & 0 \\
h_3 - h_6 v_2 & h_4 - h_7 v_2 & 0 & -h_6 u_1 - h_7 v_1 - h_8 \\
\end{bmatrix}\end{displaymath} (8.58)

can be derived for use in the calculation of the Sampson distance (CPS05).
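Putting (8.57) and (8.58) together, the Sampson distance is $\boldsymbol\epsilon^{\top} (\mathbf{J}_\mathcal{V} \mathbf{J}_\mathcal{V}^{\top})^{-1} \boldsymbol\epsilon$, where $\boldsymbol\epsilon$ is the residual of the two manifold equations; since $\mathbf{J}_\mathcal{V} \mathbf{J}_\mathcal{V}^{\top}$ is only $2 \times 2$, it can be inverted in closed form. A Python sketch (function name illustrative):

```python
def sampson_distance(h, m1, m2):
    """First-order (Sampson) approximation of the squared geometric distance
    from the pair (m1, m2) to the manifold V_H of (8.57); h is the 9-vector
    (h_0 .. h_8) of the entries of H."""
    u1, v1 = m1
    u2, v2 = m2
    h0, h1, h2, h3, h4, h5, h6, h7, h8 = h
    # residuals of the two constraint equations (8.57)
    e1 = h0*u1 + h1*v1 + h2 - h6*u1*u2 - h7*v1*u2 - h8*u2
    e2 = h3*u1 + h4*v1 + h5 - h6*u1*v2 - h7*v1*v2 - h8*v2
    k = -(h6*u1 + h7*v1 + h8)   # shared entry of the Jacobian (8.58)
    J = [[h0 - h6*u2, h1 - h7*u2, k,   0.0],
         [h3 - h6*v2, h4 - h7*v2, 0.0, k]]
    # J J^T is 2x2: invert it in closed form and evaluate e^T (J J^T)^-1 e
    a = sum(x * x for x in J[0])
    b = sum(x * y for x, y in zip(J[0], J[1]))
    d = sum(x * x for x in J[1])
    det = a * d - b * b
    return (e1 * (d*e1 - b*e2) + e2 * (-b*e1 + a*e2)) / det

h = (1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0)  # identity homography
```

For an exact correspondence the residual, and hence the distance, is zero; for the identity homography with $u_2$ perturbed by $\delta$, the distance is $\delta^2/2$, i.e. the squared perpendicular distance to the manifold $u_1 = u_2$.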

Error Propagation in Homography Calculation

In the case of an error on a single image, to calculate how the error propagates through the matrix $\mathbf{H}$, it is necessary to compute the Jacobian of the cost function (8.53). By specifying the homographic transformation, one obtains (HZ04)

\begin{displaymath}
\mathbf{J}_i = \frac{\partial r}{\partial \mathbf{h}} = \frac{1}{\hat{w}'} \begin{bmatrix}
\mathbf{m}_i^{\top} & \mathbf{0}^{\top} & - \hat{u}'_i \mathbf{m}_i^{\top} / \hat{w}' \\
\mathbf{0}^{\top} & \mathbf{m}_i^{\top} & - \hat{v}'_i \mathbf{m}_i^{\top} / \hat{w}' \\
\end{bmatrix}\end{displaymath} (8.59)

with $\mathbf{m}_i = (u_i, v_i, 1)^{\top}$ and $\hat{\mathbf{m}}'_i = (\hat{u}'_i, \hat{v}'_i, \hat{w}'_i)^{\top} = \mathbf{H} \mathbf{m}_i$.
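The $2 \times 9$ Jacobian (8.59) can be checked numerically against central finite differences of the transferred point $(\hat{u}'/\hat{w}', \hat{v}'/\hat{w}')$. A Python sketch (function names are illustrative):

```python
def transferred_point(h, m):
    """H m in inhomogeneous coordinates; h is the 9-vector of H entries."""
    u, v = m
    up = h[0]*u + h[1]*v + h[2]
    vp = h[3]*u + h[4]*v + h[5]
    wp = h[6]*u + h[7]*v + h[8]
    return (up / wp, vp / wp)

def jacobian_859(h, m):
    """Analytic 2x9 Jacobian (8.59) of the transferred point w.r.t. h."""
    u, v = m
    up, vp = transferred_point(h, m)        # already divided by w_hat'
    wp = h[6]*u + h[7]*v + h[8]
    row_u = [u/wp, v/wp, 1.0/wp, 0.0, 0.0, 0.0, -up*u/wp, -up*v/wp, -up/wp]
    row_v = [0.0, 0.0, 0.0, u/wp, v/wp, 1.0/wp, -vp*u/wp, -vp*v/wp, -vp/wp]
    return [row_u, row_v]

def numeric_jacobian(h, m, eps=1e-7):
    """Central finite differences of transferred_point w.r.t. each h_k."""
    J = [[0.0] * 9, [0.0] * 9]
    for k in range(9):
        hp = list(h); hp[k] += eps
        hm = list(h); hm[k] -= eps
        fu, fv = transferred_point(hp, m)
        bu, bv = transferred_point(hm, m)
        J[0][k] = (fu - bu) / (2 * eps)
        J[1][k] = (fv - bv) / (2 * eps)
    return J

h = (1.1, 0.02, 3.0, -0.01, 0.95, 1.5, 0.001, 0.002, 1.0)
m = (10.0, 20.0)
```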

Using the theory presented in section 3.5, it is possible to compute the covariance matrix of the homography parameters given the covariance on the points $\mathbf{m}'_i$. Since the total covariance matrix $\boldsymbol\Sigma$ of the noise on the individual points will be very sparse, as different points are assumed to have independent noise, the covariance $\boldsymbol\Sigma_h$ on the obtained parameters is given by (HZ04)

\begin{displaymath}
\boldsymbol\Sigma_h = \left( \sum \mathbf{J}_i^{\top} \boldsymbol\Sigma^{-1}_i \mathbf{J}_i \right)^{+}
\end{displaymath} (8.60)

with $\boldsymbol\Sigma_i$ being the covariance matrix of the noise on the individual point.
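The accumulation in (8.60) can be sketched with NumPy, assuming for simplicity isotropic per-point noise $\boldsymbol\Sigma_i = \sigma^2 \mathbf{I}$ (function name and example values are illustrative); the pseudoinverse is needed because the $9 \times 9$ information matrix is rank-deficient, the overall scale of $\mathbf{H}$ being unobservable:

```python
import numpy as np

def homography_covariance(H, points, sigma2):
    """Backward propagation (8.60): Sigma_h = (sum_i J_i^T Sigma_i^-1 J_i)^+,
    with isotropic noise Sigma_i = sigma2 * I on each point m'_i."""
    A = np.zeros((9, 9))
    for (u, v) in points:
        up = H[0, 0]*u + H[0, 1]*v + H[0, 2]
        vp = H[1, 0]*u + H[1, 1]*v + H[1, 2]
        wp = H[2, 0]*u + H[2, 1]*v + H[2, 2]
        m = np.array([u, v, 1.0])
        J = np.zeros((2, 9))           # Jacobian (8.59) for this point
        J[0, 0:3] = m / wp
        J[1, 3:6] = m / wp
        J[0, 6:9] = -up * m / wp**2
        J[1, 6:9] = -vp * m / wp**2
        A += J.T @ J / sigma2          # J_i^T Sigma_i^-1 J_i
    # A has a null direction along h (scale gauge), hence the pseudoinverse
    return np.linalg.pinv(A)

H = np.array([[1.0, 0.1, 5.0], [-0.1, 1.0, 2.0], [1e-4, 2e-4, 1.0]])
pts = [(0.0, 0.0), (100.0, 0.0), (0.0, 80.0), (100.0, 80.0), (50.0, 40.0)]
Sigma_h = homography_covariance(H, pts, sigma2=0.25)
```

The resulting $\boldsymbol\Sigma_h$ is symmetric and positive semidefinite, as a covariance matrix must be.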

Paolo Medici
2025-10-22