Factorization of the Essential Matrix

The previous sections showed that at least 5 correspondences between homologous points in two images suffice to estimate the Essential matrix, which encodes the relative pose between the two cameras. The Essential matrix can in turn be factorized into a rotation and a translation. This recovers the relative pose parameters of the two cameras and, with them, makes a three-dimensional reconstruction of the observed scene possible.

As suggested by Trivedi, it follows from the definition of the Essential matrix (9.41) that the symmetric matrix $\mathbf{E} \mathbf{E}^{\top}$ is independent of the rotation, because the product $\mathbf{R} \mathbf{R}^{\top} = \mathbf{I}$ cancels:

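\begin{displaymath}
\mathbf{E} \mathbf{E}^{\top} = [\mathbf{t}]_{\times} [\mathbf{t}]_{\times}^{\top} = \begin{bmatrix}
t^2_y + t^2_z & - t_x t_y & - t_x t_z \\
- t_y t_x & t^2_z + t^2_x & - t_y t_z \\
- t_z t_x & - t_z t_y & t^2_x + t^2_y \\
\end{bmatrix}\end{displaymath} (9.70)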

The translation vector $\mathbf{t}$ can be derived from the matrix $\mathbf{E} \mathbf{E}^{\top}$, keeping in mind that it is known only up to a multiplicative factor, and therefore also only up to sign. Once $\mathbf{t}$ is available, it can be used to obtain $\mathbf{R}$.
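As a minimal numerical sketch of this step, assume $\mathbf{E} = [\mathbf{t}]_{\times} \mathbf{R}$, so that (9.70) can be written compactly as $\mathbf{E}\mathbf{E}^{\top} = \vert\mathbf{t}\vert^2 \mathbf{I} - \mathbf{t}\mathbf{t}^{\top}$, whose trace is $2\vert\mathbf{t}\vert^2$. The function names `skew` and `translation_from_EEt` and the test values below are illustrative assumptions, not part of the original text.

```python
import numpy as np

def skew(t):
    """Return the cross-product matrix [t]_x of a 3-vector t."""
    tx, ty, tz = t
    return np.array([[0.0, -tz,  ty],
                     [ tz, 0.0, -tx],
                     [-ty,  tx, 0.0]])

def translation_from_EEt(E):
    """Recover t, up to sign, from M = E E^T = |t|^2 I - t t^T."""
    M = E @ E.T
    norm2 = 0.5 * np.trace(M)        # trace(M) = 2 |t|^2
    T = norm2 * np.eye(3) - M        # T = t t^T, a rank-1 matrix
    k = np.argmax(np.diag(T))        # pick the largest t_k^2 for stability
    return T[:, k] / np.sqrt(T[k, k])  # column k of t t^T is t_k * t

# Hypothetical ground truth, used only to exercise the function.
t_true = np.array([0.6, 0.0, 0.8])
c, s = np.cos(0.3), np.sin(0.3)
R_true = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
E = skew(t_true) @ R_true
t_est = translation_from_EEt(E)      # equals t_true or -t_true
```

The returned vector inherits the scale of $\mathbf{E}$, and its sign is arbitrary: both $\mathbf{t}$ and $-\mathbf{t}$ produce the same $\mathbf{E}\mathbf{E}^{\top}$.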

The Essential matrix can also be factorized directly through Singular Value Decomposition. Let $\mathbf{E} = \mathbf{U}\mathbf{D}\mathbf{V}^{\top}$ be the SVD of $\mathbf{E}$, with $\mathbf{D}=\diag (1,1,0)$ (if the singular values do not have this form, the matrix $\mathbf{E}$ can still be projected onto the space of Essential matrices, as described in section 9.4.1). From this decomposition, the generating components of $\mathbf{E}$ can be extracted:

\begin{displaymath}
[\mathbf{t}]_{\times} = \mathbf{U} \left( \mathbf{R}^{\top}_z \mathbf{D} \right) \mathbf{U}^{\top} \qquad \mathbf{R} = \mathbf{U} \mathbf{R}_{z} \mathbf{V}^{\top} \vert \mathbf{U} \mathbf{R}_{z}^{\top} \mathbf{V}^{\top}
\end{displaymath} (9.71)

where
\begin{displaymath}
\mathbf{R}^{\top}_z \mathbf{D} = \begin{bmatrix}
0 & 1 & 0 \\
-1 & 0 & 0 \\
0 & 0 & 0 \\
\end{bmatrix} \qquad \mathbf{R}_z = \begin{bmatrix}
0 & -1 & 0 \\
1 & 0 & 0 \\
0 & 0 & 1 \\
\end{bmatrix}\end{displaymath} (9.72)

with $\mathbf{R}_{z}$ the rotation around the $z$ axis by an angle of $\frac{\pi}{2}$. Note that $[\mathbf{t}]_{\times} \mathbf{t} = 0$ for every possible $\mathbf{t}$, so $\mathbf{t}$ must lie in the null space of $[\mathbf{t}]_{\times}$. Given (9.71), it can be shown that this holds, up to scale, only for $\mathbf{t} = \mathbf{U} (0,0,1)^\top = \mathbf{u}_3$, the last column of the matrix $\mathbf{U}$.
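The SVD-based factorization can be sketched numerically as follows. This is an illustrative sketch rather than a reference implementation: the function name `factor_essential` and the test values are assumptions, and the sign flips applied to $\mathbf{U}$ and $\mathbf{V}^{\top}$ are an added step (not stated in the text) that forces both rotation candidates to have determinant $+1$. The flips change $\mathbf{E}$ only by a sign, which is immaterial since $\mathbf{E}$ is defined up to scale.

```python
import numpy as np

# Rotation by pi/2 around the z axis.
R_Z = np.array([[0.0, -1.0, 0.0],
                [1.0,  0.0, 0.0],
                [0.0,  0.0, 1.0]])

def skew(t):
    """Return the cross-product matrix [t]_x of a 3-vector t."""
    tx, ty, tz = t
    return np.array([[0.0, -tz,  ty],
                     [ tz, 0.0, -tx],
                     [-ty,  tx, 0.0]])

def factor_essential(E):
    """Return (t, Ra, Rb): unit translation (sign unknown) and the
    two rotation candidates U R_z V^T and U R_z^T V^T."""
    U, _, Vt = np.linalg.svd(E)
    # Assumption: enforce det(U) = det(V) = +1 so that Ra, Rb are rotations.
    if np.linalg.det(U) < 0:
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    t = U[:, 2]                      # last column of U
    Ra = U @ R_Z @ Vt
    Rb = U @ R_Z.T @ Vt
    return t, Ra, Rb

# Hypothetical ground truth, used only to exercise the function.
t_true = np.array([0.6, 0.0, 0.8])   # unit norm
c, s = np.cos(0.3), np.sin(0.3)
R_true = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
E = skew(t_true) @ R_true
t, Ra, Rb = factor_essential(E)
```

In this noise-free setting one of `Ra`, `Rb` coincides with the rotation used to build `E`, and `t` matches the true translation up to sign.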

The rotation matrix $\mathbf{R}$ thus has two possible solutions, which differ by a rotation of $180^{\circ}$ about the axis connecting the two pinholes. Since the vector $\mathbf{t}$ is known only up to a multiplicative factor, and the constraint $\vert\mathbf{t}\vert=1$ does not determine its sign, the sign ambiguity of $\mathbf{t}$ doubles the number of alternatives. There are therefore 4 plausible factorizations of an Essential matrix; among these, the one that places all points (or the majority of them) in front of both cameras must be selected.
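The selection among the 4 candidates can be sketched as a positive-depth test on triangulated points. The sketch below rests on assumptions not fixed by the text: points are in normalized image coordinates, the cameras are $\mathbf{P}_1 = [\mathbf{I} \vert \mathbf{0}]$ and $\mathbf{P}_2 = [\mathbf{R} \vert \mathbf{t}]$ with $\mathbf{X}_2 = \mathbf{R}\mathbf{X}_1 + \mathbf{t}$, and a simple linear triangulation is used. The names `triangulate` and `select_pose` and the synthetic data are hypothetical.

```python
import numpy as np

def triangulate(x1, x2, R, t):
    """Linear triangulation of one correspondence (normalized coordinates).
    Assumes P1 = [I|0], P2 = [R|t]; returns the point in camera-1 frame."""
    P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = np.hstack([R, t.reshape(3, 1)])
    A = np.vstack([x1[0] * P1[2] - P1[0],
                   x1[1] * P1[2] - P1[1],
                   x2[0] * P2[2] - P2[0],
                   x2[1] * P2[2] - P2[1]])
    X = np.linalg.svd(A)[2][-1]      # null vector of A (homogeneous point)
    return X[:3] / X[3]              # sketch: no guard for points at infinity

def select_pose(candidates, pts1, pts2):
    """Return the (R, t) with the most points in front of both cameras."""
    best, best_count = None, -1
    for R, t in candidates:
        count = 0
        for x1, x2 in zip(pts1, pts2):
            X1 = triangulate(x1, x2, R, t)
            X2 = R @ X1 + t
            if X1[2] > 0 and X2[2] > 0:
                count += 1
        if count > best_count:
            best, best_count = (R, t), count
    return best, best_count

# Synthetic, noise-free scene (hypothetical values).
c, s = np.cos(0.1), np.sin(0.1)
R_true = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
t_true = np.array([1.0, 0.0, 0.0])
X = np.array([[0.5, -0.2, 4.0], [-1.0, 0.3, 6.0],
              [0.2, 0.8, 5.0], [-0.4, -0.6, 7.0]])
pts1 = X[:, :2] / X[:, 2:]
X2 = X @ R_true.T + t_true
pts2 = X2[:, :2] / X2[:, 2:]

# The second rotation is the first rotated by 180 degrees about the baseline.
R_alt = (2.0 * np.outer(t_true, t_true) - np.eye(3)) @ R_true
candidates = [(R_alt, t_true), (R_alt, -t_true),
              (R_true, -t_true), (R_true, t_true)]
best, best_count = select_pose(candidates, pts1, pts2)
```

Counting points rather than requiring all of them to pass reflects the "or the majority" clause: with noisy correspondences a few points may triangulate behind a camera even for the correct factorization.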

Paolo Medici
2025-10-22