The fundamental idea of the Direct Linear Transformation (DLT), proposed by Abdel-Aziz and Karara (AAK71), is to compute the coefficients of the matrices (8.47), (8.50), or the matrix (8.15) directly, completely disregarding the parameters and the structure of the perspective transformation model. The same article also presents an approach for solving overdetermined problems using the pseudoinverse technique.
Given the system (8.15), it is necessary to derive the 12 parameters of the projection matrix to achieve an implicit calibration of the system, where the internal parameters (ranging from 9 to 11 depending on the model) that generated the elements of the matrix itself are unknown. This representation of the pin-hole camera is, of course, ideal (without non-linearities from the model).
The perspective function written in implicit form is
Being a homogeneous system, its solution is a null subspace of $\mathbb{R}^{12}$, namely the kernel of the matrix of known terms. For this reason, the projection matrix is known only up to a multiplicative factor, leaving only 11 degrees of freedom (even fewer when considering that a modern camera typically has only 3-4 intrinsic parameters and 6 extrinsic parameters).
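As an illustration of the system being described, assume (a labelling chosen here for the sketch, not taken from the text) that the 12 entries of the projection matrix are $p_0, \dots, p_{11}$ in row-major order, and that a world point $(x_i, y_i, z_i)$ is observed at the image point $(u_i, v_i)$. Under those assumptions, the projection and its rearrangement into two linear homogeneous constraints per point could be written as:

```latex
\begin{aligned}
u_i &= \frac{p_0 x_i + p_1 y_i + p_2 z_i + p_3}{p_8 x_i + p_9 y_i + p_{10} z_i + p_{11}}, \qquad
v_i = \frac{p_4 x_i + p_5 y_i + p_6 z_i + p_7}{p_8 x_i + p_9 y_i + p_{10} z_i + p_{11}} \\[4pt]
0 &= p_0 x_i + p_1 y_i + p_2 z_i + p_3 - u_i \,(p_8 x_i + p_9 y_i + p_{10} z_i + p_{11}) \\
0 &= p_4 x_i + p_5 y_i + p_6 z_i + p_7 - v_i \,(p_8 x_i + p_9 y_i + p_{10} z_i + p_{11})
\end{aligned}
```

Multiplying every $p_k$ by the same non-zero constant leaves both constraints satisfied, which is the scale ambiguity noted above: 12 entries, but only 11 degrees of freedom.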
Because the system has been rearranged, the noise on the points no longer propagates linearly, and this solution does not satisfy the maximum likelihood criterion. The matrix obtained through this procedure, although it conceals the internal structure of the sensor, makes it possible to project a point from world coordinates to image coordinates and, given a point in image coordinates, to derive the line in the world that projects onto it.
The result is generally unstable when using only 6 points; therefore, the estimation is typically performed with more points than the minimum required. Techniques such as the pseudoinverse are then used to find the solution that minimizes the error in the least-squares sense.
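The overdetermined estimate can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the book's implementation: the function names `dlt_projection_matrix` and `project`, the row-major ordering of the 12 unknowns, and the synthetic camera are all hypothetical choices made for the example. The kernel is taken as the right singular vector associated with the smallest singular value.

```python
import numpy as np

def dlt_projection_matrix(world_pts, image_pts):
    """Estimate a 3x4 projection matrix (up to scale) from >= 6 correspondences."""
    world_pts = np.asarray(world_pts, dtype=float)
    image_pts = np.asarray(image_pts, dtype=float)
    rows = []
    for (x, y, z), (u, v) in zip(world_pts, image_pts):
        X = [x, y, z, 1.0]
        # Two linear homogeneous constraints per correspondence.
        rows.append(X + [0.0] * 4 + [-u * c for c in X])
        rows.append([0.0] * 4 + X + [-v * c for c in X])
    A = np.array(rows)
    # The right singular vector of the smallest singular value is the
    # least-squares kernel of A; it has unit norm by construction.
    _, _, vt = np.linalg.svd(A)
    return vt[-1].reshape(3, 4)

def project(P, world_pts):
    """Project Nx3 world points to Nx2 image points with a 3x4 matrix."""
    world_pts = np.asarray(world_pts, dtype=float)
    homog = np.hstack([world_pts, np.ones((len(world_pts), 1))])
    proj = homog @ P.T
    return proj[:, :2] / proj[:, 2:3]

# Synthetic, noise-free data from an assumed ideal pinhole camera.
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([[0.1], [-0.2], [5.0]])
P_true = K @ np.hstack([R, t])

rng = np.random.default_rng(0)
world = rng.uniform(-1.0, 1.0, size=(8, 3))   # 8 points, more than the minimum of 6
image = project(P_true, world)

P_est = dlt_projection_matrix(world, image)
reproj_error = np.abs(project(P_est, world) - image).max()
```

With noisy measurements the same code returns the minimizer of the algebraic residual, which, as noted above, is not the maximum likelihood estimate.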
The problem is the same as the one encountered previously: a homogeneous solution exists, and the homogeneous equation (8.44) used to solve it generalizes to
This formulation is useful when the projective model does not adhere to the pinhole model, but it is still possible to derive the "camera" coordinates of the optical rays corresponding to the pixel, which are therefore available in homogeneous format.
Typically, to reduce the number of elements in the projection matrix, one can impose the constraint that all points involved in the calibration process lie on a specific plane (for example, the ground). This means setting the condition $z_i = 0$ for every point, which eliminates a column (the one related to the $z$ axis) from the matrix. The matrix then reduces to size $3 \times 3$, becomes invertible, and can be described as a homography (see section 1.10).
We therefore define the resulting matrix (see (8.27)) as
As in the previous case, it is possible to transform the nonlinear relationship (8.47) in order to obtain linear constraints:
If you have a sufficiently modern linear systems solver, the additional constraint that the solution vector has unit norm is automatically satisfied during the computation of the kernel of the matrix of known terms (via QR factorization or SVD).
Another simpler and more intuitive method consists of imposing an additional constraint that fixes one element of the matrix, conventionally the last one, to 1: in this way, instead of solving a homogeneous system, one can solve a traditional linear problem. The system (8.47) can also be rearranged to obtain linear constraints in the form:
However, fixing the last element to 1 implies that the origin cannot be a singularity of the image (e.g., the horizon line), and in general, it is not an optimal choice in terms of solution accuracy, as previously discussed.
It is important to note that the solution depends heavily on the chosen normalization. The choice of fixing an element to 1 can be referred to as standard least squares.
In both cases, at least 4 points are required to obtain a homography, and each additional point allows for a solution with a lower error. These systems, when overdetermined, can be solved using the pseudoinverse method.
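The two normalizations discussed above can be compared with a short NumPy sketch. It assumes, purely for illustration, that the nine unknowns are stored in row-major order and that the element fixed to 1 is the last one; the function names are hypothetical. The first solver takes the unit-norm kernel from the SVD, the second moves the last column to the right-hand side and applies the pseudoinverse.

```python
import numpy as np

def homography_rows(src, dst):
    """Two linear constraints per correspondence (x, y) -> (u, v)."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y, -u])
        rows.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y, -v])
    return np.array(rows)

def homography_unit_norm(src, dst):
    """Homogeneous solution: kernel of the constraint matrix, unit norm."""
    A = homography_rows(src, dst)
    _, _, vt = np.linalg.svd(A)
    return vt[-1].reshape(3, 3)

def homography_last_one(src, dst):
    """Inhomogeneous solution: last element fixed to 1, solved by pseudoinverse."""
    A = homography_rows(src, dst)
    # A[:, :8] @ h + A[:, 8] * 1 = 0  ->  A[:, :8] @ h = -A[:, 8]
    h = np.linalg.pinv(A[:, :8]) @ (-A[:, 8])
    return np.append(h, 1.0).reshape(3, 3)

def apply_h(H, pts):
    """Map Nx2 points through a 3x3 homography."""
    pts = np.asarray(pts, dtype=float)
    q = np.hstack([pts, np.ones((len(pts), 1))]) @ H.T
    return q[:, :2] / q[:, 2:3]

# Synthetic example: 5 points, one more than the minimum of 4.
H_true = np.array([[1.2, 0.1, 5.0],
                   [-0.2, 0.9, 3.0],
                   [0.001, 0.002, 1.0]])
src = np.array([[0, 0], [10, 0], [10, 10], [0, 10], [5, 3]], dtype=float)
dst = apply_h(H_true, src)

H_svd = homography_unit_norm(src, dst)
H_one = homography_last_one(src, dst)
```

On noise-free data both solvers return the same homography up to scale; with noise they generally differ, which is the dependence on normalization mentioned above.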
The matrix is defined by 4 intrinsic parameters and 6 extrinsic parameters. The separation of intrinsic parameters from extrinsic parameters suggests that these parameters should be extracted independently to strengthen the calibration process. After all, intrinsic parameters can be determined with a certain degree of accuracy offline and are applicable to all possible camera placements (see also 8.5.4).
Let us define the matrix (see (8.28)) as
The homography is defined up to a scaling factor, while the matrix introduced above allows the scale to be fixed, since it still has two orthonormal columns. Knowing two columns of the rotation matrix makes it possible to derive the third, and therefore this calibration becomes valid even for points outside the plane $z = 0$.
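A minimal sketch of this step follows, assuming the intrinsic parameters have already been removed so that the available matrix has the form s [r1 r2 t] with an unknown positive scale s. The function name `complete_pose`, the symbol names, and the positivity assumption on s are choices made for the example. The scale is taken from the norms of the first two columns, and the third rotation column is their cross product.

```python
import numpy as np

def complete_pose(M):
    """Recover R (3x3) and t from M = s * [r1 r2 t], assuming s > 0."""
    M = np.asarray(M, dtype=float)
    # Both rotation columns have unit norm, so their average norm estimates s.
    scale = 0.5 * (np.linalg.norm(M[:, 0]) + np.linalg.norm(M[:, 1]))
    r1 = M[:, 0] / scale
    r2 = M[:, 1] / scale
    # For a proper rotation matrix the third column is r1 x r2.
    r3 = np.cross(r1, r2)
    R = np.column_stack([r1, r2, r3])
    t = M[:, 2] / scale
    return R, t

# Synthetic check with an assumed rotation, translation, and scale.
a, b = 0.3, -0.5
Rx = np.array([[1.0, 0.0, 0.0],
               [0.0, np.cos(a), -np.sin(a)],
               [0.0, np.sin(a), np.cos(a)]])
Rz = np.array([[np.cos(b), -np.sin(b), 0.0],
               [np.sin(b), np.cos(b), 0.0],
               [0.0, 0.0, 1.0]])
R_true = Rz @ Rx
t_true = np.array([0.2, -0.1, 4.0])
M = 2.5 * np.column_stack([R_true[:, 0], R_true[:, 1], t_true])

R_est, t_est = complete_pose(M)
```

With noisy data the two recovered columns are only approximately orthonormal, so a re-orthogonalization of `R_est` would normally follow; that step is omitted here.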
As done previously, a non-linear system consisting of 3 homogeneous equations, when appropriately rearranged, yields two linear constraints:
(Abdel-Aziz and Karara (AAK71)). It is therefore possible to construct a system of equations over all the control points in order to solve for the 9 unknowns. The matrix is defined up to a multiplicative factor, but in this case its internal structure can be helpful in deriving the extrinsic parameters (see section 8.5.3). In fact, the two columns of the matrix must be orthonormal:
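$$\mathbf{r}_1^\top \mathbf{r}_2 = 0, \qquad \lVert \mathbf{r}_1 \rVert = \lVert \mathbf{r}_2 \rVert = 1 \tag{8.51}$$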
The equations (8.44) and (8.48) can also be derived from purely geometric considerations since the image and camera vectors must be parallel (the factor is purely multiplicative and at most affects the vector through an affine transformation):
$$\mathbf{m} \times (\mathbf{P}\,\mathbf{x}) = \mathbf{0} \tag{8.52}$$
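The parallelism argument can be checked numerically. In the sketch below the symbols are assumptions made for the example: `m` is the image point in homogeneous form, `P` an arbitrary 3x4 projection matrix, and `x_world` a homogeneous world point. Writing the cross product with a skew-symmetric matrix shows that the three resulting equations are linear in the entries of `P` and that only two of them are independent.

```python
import numpy as np

def skew(m):
    """Skew-symmetric matrix such that skew(m) @ a equals np.cross(m, a)."""
    return np.array([[0.0, -m[2], m[1]],
                     [m[2], 0.0, -m[0]],
                     [-m[1], m[0], 0.0]])

# Assumed example camera and point.
P = np.array([[800.0, 0.0, 320.0, 100.0],
              [0.0, 800.0, 240.0, -50.0],
              [0.0, 0.0, 1.0, 5.0]])
x_world = np.array([0.3, -0.4, 2.0, 1.0])

projected = P @ x_world
m = projected / projected[2]          # image point as (u, v, 1)

# Parallel vectors have a zero cross product, whatever the scale factor.
residual = np.cross(m, projected)

# The same constraint written as a 3x12 matrix acting on the row-major
# vector of the entries of P.
constraints = np.kron(skew(m), x_world[None, :])
rank = np.linalg.matrix_rank(constraints)
```

The rank of 2 matches the two linear constraints per point obtained earlier by rearranging the ratio form.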
Paolo Medici