Previous chapters have repeatedly emphasized that the world coordinates of the points that make up an image cannot be obtained from a single image alone, without additional information.
Given equation (8.16) of the pin-hole camera, the only thing a generic image point can provide is a constraint on the (infinitely many) world coordinates underlying it, that is, the locus of world points that, when projected, would yield exactly that image point. This locus is a line passing through the pin-hole and the point on the sensor corresponding to the image point.
Rewriting equation (8.16) makes explicit the dependency between the parameters of the i-th camera, the image point, and the line that represents all possible world points underlying that image point, the line (9.7).
As both experience and the linear relationship connecting these points suggest, the underlying world point is known only up to a scale factor.
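The scale ambiguity can be checked numerically. The following is a minimal sketch, assuming that equation (8.16) has the common form λ·p = K·R·(x − t0), with a zero-skew intrinsic matrix K, a rotation matrix R and a pin-hole position t0 expressed in world coordinates. The function names `project` and `back_project` and all numeric values are hypothetical, chosen only for illustration.

```python
# Assumed pin-hole model (hypothetical form of eq. 8.16):
#   lam * p = K * R * (x - t0),  with p = (u, v, 1)

def matvec(M, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def transpose(M):
    """Transpose a 3x3 matrix; for a rotation this is also its inverse."""
    return [list(col) for col in zip(*M)]

def project(K, R, t0, x):
    """Project the world point x to pixel coordinates (u, v)."""
    q = matvec(K, matvec(R, [a - b for a, b in zip(x, t0)]))
    return (q[0] / q[2], q[1] / q[2])

def back_project(K, R, t0, pixel):
    """Return the direction v of the world ray x(lam) = t0 + lam * v."""
    fx, fy, u0, v0 = K[0][0], K[1][1], K[0][2], K[1][2]
    # K^-1 * p, written out for a zero-skew intrinsic matrix
    d_cam = [(pixel[0] - u0) / fx, (pixel[1] - v0) / fy, 1.0]
    # Rotate the direction from camera coordinates back to world coordinates
    return matvec(transpose(R), d_cam)

K = [[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]]
R = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]  # 90 degrees about z
t0 = [0.1, -0.2, 0.3]

pixel = (400.0, 300.0)
v = back_project(K, R, t0, pixel)

# Every point on the ray, whatever the scale factor, maps to the same pixel
reprojections = []
for lam in (0.5, 2.0, 10.0):
    x = [c + lam * d for c, d in zip(t0, v)]
    reprojections.append(project(K, R, t0, x))
```

All entries of `reprojections` coincide with `pixel` up to rounding, which is why a single image cannot fix the scale factor.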
In the case of stereo vision there are two sensors, so two reference systems must be defined, each with its own camera parameters and its own pin-hole position; both pin-hole positions are always expressed in world coordinates.
The line (9.7), the locus of world points associated with the image point observed in the first reference frame, can be projected into the view of the second camera. The result is again a line, made of a term that varies with the scale factor and a constant point that does not depend on the observed image point.
This constant point is the epipole. The epipole is the intersection point of all epipolar lines and represents the projection of the pin-hole of one camera onto the image of the other, or the "vanishing point" of the epipolar lines.
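As a sketch of this step, assume again that equation (8.16) has the form λ·p = K·R·(x − t0). The symbols below (K, R, t0 with camera indices, the direction v1 and the second scale factor μ) are assumptions introduced for illustration and may differ from the notation of the original equations.

```latex
\begin{aligned}
\mathbf{x}(\lambda) &= \mathbf{t}_{0,1} + \lambda\,\mathbf{v}_1,
  \qquad \mathbf{v}_1 = (\mathbf{K}_1\mathbf{R}_1)^{-1}\mathbf{p}_1 \\
\mu\,\mathbf{p}_2 &= \mathbf{K}_2\mathbf{R}_2\bigl(\mathbf{x}(\lambda) - \mathbf{t}_{0,2}\bigr)
  = \underbrace{\mathbf{K}_2\mathbf{R}_2\bigl(\mathbf{t}_{0,1} - \mathbf{t}_{0,2}\bigr)}_{\mathbf{e}_2}
  + \lambda\,\mathbf{K}_2\mathbf{R}_2\,\mathbf{v}_1
\end{aligned}
```

Under these assumptions the first term does not depend on the image point p1: it is the projection of the first pin-hole onto the second image. The second term moves along a direction fixed by p1 as the scale factor λ varies.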
Given two cameras, the projections of the two pin-holes onto the opposite images, that is, the two epipoles, are given by equation (9.10).
The rotation matrix converts from camera 1 coordinates to camera 2 coordinates, and the translation vector represents the position of the pin-hole of camera 1 expressed in the reference frame of camera 2.
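The epipoles and the relative pose can be computed numerically as in the following sketch. It assumes the projection model λ·p = K·R·(x − t0) and a relative pose defined as R21 = R2·R1ᵀ, t21 = R2·(t0,1 − t0,2); these names, like all numeric values, are hypothetical and not taken from the original equations.

```python
import math

def matvec(M, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def matmul(A, B):
    """Multiply two 3x3 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(M):
    return [list(col) for col in zip(*M)]

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

def to_pixel(q):
    """Convert homogeneous image coordinates to pixel coordinates."""
    return (q[0] / q[2], q[1] / q[2])

K1 = K2 = [[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]]
R1 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t01 = [0.0, 0.0, 0.0]                      # pin-hole of camera 1 (world)
a = math.radians(10.0)
R2 = [[math.cos(a), 0.0, -math.sin(a)],
      [0.0, 1.0, 0.0],
      [math.sin(a), 0.0, math.cos(a)]]
t02 = [1.0, 0.0, 0.2]                      # pin-hole of camera 2 (world)

# Each epipole is the projection of the other camera's pin-hole
e1 = to_pixel(matvec(K1, matvec(R1, sub(t02, t01))))
e2 = to_pixel(matvec(K2, matvec(R2, sub(t01, t02))))

# Relative pose: camera 1 coordinates -> camera 2 coordinates
R21 = matmul(R2, transpose(R1))
t21 = matvec(R2, sub(t01, t02))            # pin-hole 1 seen from camera 2
e2_from_relative_pose = to_pixel(matvec(K2, t21))

# Consistency check on an arbitrary world point
x = [0.3, -0.1, 4.0]
x_cam1 = matvec(R1, sub(x, t01))
x_cam2_direct = matvec(R2, sub(x, t02))
x_cam2_via_relative = [r + t for r, t in zip(matvec(R21, x_cam1), t21)]
```

With these values the pin-hole of camera 1 lies behind camera 2, so `e2` falls outside the visible image; the epipole remains well defined as a projective point.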
The lines generated by the points of the first image all converge at a single point, the projection of the first pin-hole onto the second image. In fact, the point in world coordinates and the two epipoles define a plane (the epipolar plane) that contains the possible solutions of the three-dimensional reconstruction problem, that is, the candidate points in camera coordinates (figure 9.1).
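The convergence of the epipolar lines can be verified with a short sketch. It assumes the model λ·p = K·R·(x − t0) with the same intrinsic matrix for both cameras; the helper `epipolar_line` and the numeric values are hypothetical. Image lines are represented in homogeneous form, so a point q lies on a line l when l·q = 0.

```python
import math

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

K = [[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]]
R1 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t01 = [0.0, 0.0, 0.0]
a = math.radians(10.0)
R2 = [[math.cos(a), 0.0, -math.sin(a)],
      [0.0, 1.0, 0.0],
      [math.sin(a), 0.0, math.cos(a)]]
t02 = [1.0, 0.0, 0.2]

# Homogeneous epipole in image 2: projection of the first pin-hole
e2_h = matvec(K, matvec(R2, sub(t01, t02)))

def epipolar_line(pixel):
    """Line in image 2 through the images of two points of the ray of `pixel`."""
    d_cam = [(pixel[0] - K[0][2]) / K[0][0], (pixel[1] - K[1][2]) / K[1][1], 1.0]
    v = matvec(transpose(R1), d_cam)       # ray direction in world coordinates
    images = []
    for lam in (1.0, 5.0):
        x = [c + lam * d for c, d in zip(t01, v)]
        images.append(matvec(K, matvec(R2, sub(x, t02))))
    return cross(images[0], images[1])

# Normalised incidence residual between each epipolar line and the epipole
residuals = []
for pixel in [(100.0, 50.0), (320.0, 240.0), (600.0, 400.0)]:
    line = epipolar_line(pixel)
    residuals.append(abs(dot(line, e2_h)) / (norm(line) * norm(e2_h)))
```

Every residual is zero up to rounding, so the three lines, although generated by different points of the first image, all pass through the same epipole.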
Epipolar geometry is the geometry that connects two images captured from different viewpoints. The relationships between the images, however, do not depend on the observed scene but solely on the intrinsic parameters of the cameras and their relative poses.
For each observed point, the epipolar plane is the plane formed by the point in world coordinates and the two optical centers.
The epipolar line is the intersection between the epipolar plane and the image plane of the second camera. In fact, the epipolar plane intersects the image planes of both cameras along the epipolar lines, and thereby defines the correspondence between the epipolar lines of the two images.
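The correspondence between epipolar lines can be illustrated as follows: any world point lying on the epipolar plane of a point X projects onto the epipolar lines of X in both images. This is a minimal sketch under the assumed model λ·p = K·R·(x − c), where c is the optical centre; the helpers `project_h` and `incidence` and all values are hypothetical.

```python
import math

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

K = [[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]]
R1 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
c1 = [0.0, 0.0, 0.0]
a = math.radians(10.0)
R2 = [[math.cos(a), 0.0, -math.sin(a)],
      [0.0, 1.0, 0.0],
      [math.sin(a), 0.0, math.cos(a)]]
c2 = [1.0, 0.0, 0.2]

def project_h(R, c, x):
    """Homogeneous image coordinates of the world point x."""
    return matvec(K, matvec(R, sub(x, c)))

def incidence(line, q):
    """Normalised point-line residual; zero when q lies on the line."""
    return abs(dot(line, q)) / (norm(line) * norm(q))

X = [0.3, -0.1, 4.0]                       # observed world point
# Epipolar lines of X: each joins the epipole and the image of X
l1 = cross(project_h(R1, c1, c2), project_h(R1, c1, X))
l2 = cross(project_h(R2, c2, c1), project_h(R2, c2, X))

# Another point on the plane spanned by X and the two optical centres
Y = [c1[i] + 1.7 * (X[i] - c1[i]) + 0.4 * (c2[i] - c1[i]) for i in range(3)]
r1_in_plane = incidence(l1, project_h(R1, c1, Y))
r2_in_plane = incidence(l2, project_h(R2, c2, Y))

# A point moved off the plane along its normal no longer lies on the line
n = cross(sub(X, c1), sub(c2, c1))
Z = [Y[i] + 0.5 * n[i] / norm(n) for i in range(3)]
r1_off_plane = incidence(l1, project_h(R1, c1, Z))
```

The two in-plane residuals vanish up to rounding, while the off-plane residual is clearly non-zero.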
In the following sections, we will discuss both how to derive the line along which a point belonging to one image must be located in another image, and how to obtain the corresponding three-dimensional point given two (or more) homologous points.
Paolo Medici