If corresponding points in the two images of a stereo pair lay on the same image row (i.e., had the same vertical coordinate), highly optimized code could be used to search for correspondences (LZ99) and obtain dense disparity maps.
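To illustrate why the same-row condition simplifies the search, here is a minimal sketch of a one-dimensional correspondence search along a single pair of rows. It uses a plain sum of absolute differences (SAD) cost, which is an illustrative choice and not the optimized method of (LZ99); the function name `scanline_disparity` and its parameters are hypothetical.

```python
import numpy as np

def scanline_disparity(left_row, right_row, half_window=2, max_disparity=16):
    """Estimate integer disparities along one pair of rows (assumed rectified).

    Assumption: a scene point seen at column u in the left row appears at
    column u - d (d >= 0) in the right row.
    """
    left_row = np.asarray(left_row, dtype=np.float64)
    right_row = np.asarray(right_row, dtype=np.float64)
    width = left_row.shape[0]
    disparity = np.zeros(width, dtype=np.int64)
    for u in range(half_window, width - half_window):
        patch = left_row[u - half_window:u + half_window + 1]
        best_cost, best_d = np.inf, 0
        # Candidates lie on the same row only: a 1D search instead of a 2D one.
        for d in range(0, min(max_disparity, u - half_window) + 1):
            candidate = right_row[u - d - half_window:u - d + half_window + 1]
            cost = np.abs(patch - candidate).sum()  # SAD matching cost
            if cost < best_cost:
                best_cost, best_d = cost, d
        disparity[u] = best_d
    return disparity
```

Running this on every row of the pair would yield a dense disparity map; border pixels closer than `half_window` to the row ends are left at zero in this sketch.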
There is a particular two-camera configuration in which this condition holds: the intrinsic parameters are equal and the optical axes are perpendicular to the vector connecting the pinholes. For instance, when the vector connecting the pinholes lies along a single coordinate axis, corresponding points fall on the same row if both cameras have zero roll and zero yaw and share the same pitch angle.
When the hardware does not satisfy these constraints, the configuration is obtained in software through rectification (see 8.3.4). Specifically, starting from an image acquired with one set of (hardware) parameters, a new view of the same scene is synthesized with the desired parameters: intrinsics, yaw, pitch, and roll.
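Because the new view shares the pinhole of the original one, changing intrinsics and orientation amounts to a homography between the two images. The sketch below builds it as `K_dst @ R @ inv(K_src)`. It assumes the common pinhole convention in which the camera Z axis is the optical axis, which may differ from the axis convention used elsewhere in this text; `rotation_homography` and `warp_point` are hypothetical helper names.

```python
import numpy as np

def rotation_homography(K_src, K_dst, R):
    """Homography mapping source pixels to the virtual (rectified) view.

    K_src, K_dst: 3x3 intrinsic matrices of the real and desired cameras.
    R: 3x3 rotation taking source camera-frame directions to the desired
       camera frame (assumption: Z is the optical axis in both frames).
    """
    return K_dst @ R @ np.linalg.inv(K_src)

def warp_point(H, u, v):
    """Apply a homography to one pixel and return inhomogeneous coordinates."""
    p = H @ np.array([u, v, 1.0])
    return p[0] / p[2], p[1] / p[2]
```

In practice the inverse of this homography is applied to every pixel of the output image, with interpolation, so that the rectified image has no holes.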
With this in mind, the three-dimensional reconstruction problem can always be reduced to a pair of cameras perfectly aligned with each other and with the axes, plus a rigid transformation that converts coordinates from this sensor frame to the actual world frame.
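The two-step reduction can be sketched as follows: reconstruct the point in the frame of the aligned pair, then apply the rigid transformation. The sketch assumes a sensor frame centered on the left camera with Z along the optical axis, disparity defined as the left column minus the right column, and square pixels with focal length `f` in pixels. These conventions, like the function names, are illustrative assumptions and not necessarily those adopted in the following sections.

```python
import numpy as np

def triangulate_aligned(u_left, v, disparity, f, cx, cy, baseline):
    """Reconstruct a point in the frame of the left camera of an aligned pair.

    Assumptions: Z is the optical axis, disparity = u_left - u_right > 0,
    (cx, cy) is the principal point, baseline is in metric units.
    """
    z = f * baseline / disparity   # depth is inversely proportional to disparity
    x = (u_left - cx) * z / f
    y = (v - cy) * z / f
    return np.array([x, y, z])

def sensor_to_world(p_sensor, R, t):
    """Rigid transformation from the sensor frame to the world frame."""
    return R @ p_sensor + t
```

For example, `sensor_to_world(triangulate_aligned(370, 240, 10, 500, 320, 240, 0.2), R, t)` reconstructs the point in the aligned frame and then expresses it in world coordinates for a given rotation `R` and translation `t`.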
In the following sections, we will present the specific cases of both perfectly aligned cameras with respect to the axes, as well as cameras that are aligned but tilted (with a non-zero pitch angle), and finally, cameras that are arbitrarily oriented.
Paolo Medici