Perspective Mapping and Inverse Perspective Mapping

Using a homography, inverse perspective mapping (the bird's-eye view) can be obtained simply by inverting the matrix of the perspective mapping.

The homography matrix $\mathbf{H} = \mathbf{P}_{Z}$ of the perspective mapping of a plane at constant height $z$ (typically $z=0$, the ground, which is the most significant plane) can be derived quite simply as follows:

\begin{displaymath}
\mathbf{P}_{Z} = \mathbf{K} \cdot \mathbf{R}_{Z}
\end{displaymath} (8.27)

where $\mathbf{R}_{Z}$ is the rotation-translation matrix of a plane that can be expressed as
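\begin{displaymath}
\mathbf{R}_{Z} = \begin{bmatrix}\mathbf{r}_1 & \mathbf{r}_2 & z \mathbf{r}_3 + \mathbf{\tilde{t}}\end{bmatrix} = \begin{bmatrix}
r_0 & r_1 & r_2 z + \tilde{t}_x \\
r_3 & r_4 & r_5 z + \tilde{t}_y \\
r_6 & r_7 & r_8 z + \tilde{t}_z \\
\end{bmatrix}\end{displaymath} (8.28)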

having defined the vector $\mathbf{\tilde{t}}$ as the translation expressed in camera coordinates, as shown in equation (8.13).

This matrix is very important and will be discussed extensively in section 8.5 on calibration.
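Equations (8.27) and (8.28) can be illustrated with a minimal sketch in plain Python. The helper names (`matmul3`, `plane_homography`) and the numeric values of the camera are hypothetical, chosen only for illustration; the rotation is assumed to be stored row by row as $r_0 \ldots r_8$.

```python
def matmul3(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def plane_homography(K, R, t_tilde, z=0.0):
    """Return H = K * R_Z for the world plane at constant height z."""
    # R_Z keeps the first two columns of R; its third column is
    # z times the third column of R, plus the translation t_tilde.
    R_Z = [[R[i][0], R[i][1], R[i][2] * z + t_tilde[i]] for i in range(3)]
    return matmul3(K, R_Z)


# Hypothetical intrinsics: focal length 800 px, principal point (320, 240).
K = [[800.0, 0.0, 320.0],
     [0.0, 800.0, 240.0],
     [0.0, 0.0, 1.0]]
# Hypothetical pose: camera looking along the world Y axis, image y pointing down.
R = [[1.0, 0.0, 0.0],
     [0.0, 0.0, -1.0],
     [0.0, 1.0, 0.0]]
t_tilde = [0.0, 1.5, 5.0]  # translation expressed in camera coordinates

H = plane_homography(K, R, t_tilde, z=0.0)  # homography of the ground plane
```

Passing a different `z` to the same function gives the homography of another plane parallel to the ground.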

The transformation (8.27), being a homography, is invertible. When it densely transforms all image points into world points, it is referred to as Inverse Perspective Mapping, whereas when it transforms all world points into image points, it is denoted as Perspective Mapping. In both cases, only the points lying on the plane $z$ are projected correctly.
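The two directions of the mapping can be sketched for a single point as follows. This is only an illustration under stated assumptions: the matrix `H` holds hypothetical values for a ground-plane homography, and `invert3` and `apply_homography` are helper names introduced here. A dense Inverse Perspective Mapping would apply the same operation to every cell of the output grid.

```python
def invert3(m):
    """Inverse of a 3x3 matrix via the adjugate (cofactor) formula."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("matrix is singular, not a valid homography")
    return [[(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
            [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
            [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det]]


def apply_homography(H, x, y):
    """Map the point (x, y) through H and divide by the homogeneous coordinate."""
    u, v, w = (H[i][0] * x + H[i][1] * y + H[i][2] for i in range(3))
    return u / w, v / w


# Hypothetical homography H = K * R_Z of the ground plane z = 0.
H = [[800.0, 320.0, 1600.0],
     [0.0, 240.0, 2400.0],
     [0.0, 1.0, 5.0]]
H_inv = invert3(H)

# Perspective Mapping: ground point (X, Y) -> image point (u, v).
pixel = apply_homography(H, 1.0, 5.0)
# Inverse Perspective Mapping: image point (u, v) -> ground point (X, Y).
ground = apply_homography(H_inv, *pixel)
```

The round trip returns the original ground coordinates only because the point lies on the plane for which `H` was built; a pixel showing an object above the ground would be mapped to the wrong ground position.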

It is worth noting that even the simplest pin-hole camera model, with 9 parameters (6 extrinsic and 3 intrinsic), cannot be recovered from the 8 constraints provided by the homography matrix. However, if the intrinsic parameters are known, the camera's rotation and position can be estimated (section 8.5), because equation (8.27) can then be solved for $\mathbf{R}_{Z}$:

\begin{displaymath}
\mathbf{R}_{Z} = \mathbf{K}^{-1} \mathbf{H}
\end{displaymath} (8.29)
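One possible way to use equation (8.29) is sketched below. It relies on assumptions not stated in the text above: the plane is $z=0$, the matrix $\mathbf{K}$ has zero skew, the homography is exact but known only up to a scale factor, and points are mapped as $\mathbf{x}_{cam} = \mathbf{R} \mathbf{x}_{world} + \mathbf{\tilde{t}}$. The function names are hypothetical. With a homography estimated from noisy data the recovered matrix is only approximately a rotation and would need to be re-orthogonalized.

```python
import math


def matmul3(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def pose_from_homography(K, H):
    """Estimate (R, t_tilde) from the homography of the plane z = 0."""
    fx, fy, cx, cy = K[0][0], K[1][1], K[0][2], K[1][2]
    # Closed-form inverse of a zero-skew intrinsic matrix (assumption).
    K_inv = [[1.0 / fx, 0.0, -cx / fx],
             [0.0, 1.0 / fy, -cy / fy],
             [0.0, 0.0, 1.0]]
    M = matmul3(K_inv, H)  # proportional to [r_1  r_2  t_tilde]
    c1, c2, c3 = ([M[i][j] for i in range(3)] for j in range(3))
    # H is defined up to scale: rescale so the first column has unit norm,
    # with the sign that puts the plane origin in front of the camera.
    s = math.sqrt(sum(v * v for v in c1))
    if c3[2] < 0:
        s = -s
    r1 = [v / s for v in c1]
    r2 = [v / s for v in c2]
    t_tilde = [v / s for v in c3]
    # The third column follows from orthogonality: r_3 = r_1 x r_2.
    r3 = [r1[1] * r2[2] - r1[2] * r2[1],
          r1[2] * r2[0] - r1[0] * r2[2],
          r1[0] * r2[1] - r1[1] * r2[0]]
    R = [[r1[i], r2[i], r3[i]] for i in range(3)]
    return R, t_tilde


def camera_position(R, t_tilde):
    """Camera centre in world coordinates, C = -R^T * t_tilde."""
    return [-sum(R[k][i] * t_tilde[k] for k in range(3)) for i in range(3)]


# Hypothetical inputs: known intrinsics and a homography with arbitrary scale.
K = [[800.0, 0.0, 320.0],
     [0.0, 800.0, 240.0],
     [0.0, 0.0, 1.0]]
H = [[1600.0, 640.0, 3200.0],
     [0.0, 480.0, 4800.0],
     [0.0, 2.0, 10.0]]

R, t_tilde = pose_from_homography(K, H)
C = camera_position(R, t_tilde)
```

The normalization step is what compensates for the missing ninth constraint: the overall scale of the homography carries no information, so it is fixed by requiring the columns of the rotation to have unit length.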

Paolo Medici
2025-10-22