Multi-Camera Vision

This chapter discusses algorithms for analyzing images acquired by multiple cameras, with particular emphasis on the case of stereoscopic vision.

Stereoscopic vision (stereopsis) is the process by which the distances and positions of objects observed by two visual sensors can be estimated and, from this information, the observed scene reconstructed. The concept extends naturally to the case where the scene is observed not by two but by several cameras (multiple view geometry).
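
As a concrete illustration, consider the simplest configuration: a rectified pair of cameras with focal length f (in pixels), baseline b, and disparity d, i.e. the horizontal offset between corresponding pixels in the two images (values not taken from this chapter, just a standard result). The depth Z of an observed point then follows directly from its disparity:

    Z = f b / d

The farther the point, the smaller its disparity, which is why the accuracy of stereoscopic distance estimation degrades with distance.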

These views may be acquired at the same instant (for instance, by the pair of cameras that make up a stereo camera) or from different points in space and time, as happens when processing images from a single camera moving through the scene (motion stereo, structure from motion).

Stereoscopic analysis can be implemented primarily through two techniques, discussed in the sections that follow.

A necessary condition for obtaining a complete three-dimensional reconstruction of the observed scene from multiple images acquired from different viewpoints is knowledge of the intrinsic parameters of the cameras involved and of their relative poses.
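
A minimal sketch (Python with OpenCV and NumPy) of how a metric reconstruction is obtained once intrinsics and relative pose are known; the intrinsic matrices, the relative pose and the pixel correspondences below are illustrative placeholders, not values from this chapter:

import numpy as np
import cv2

# Intrinsic parameters of the two cameras (placeholder values)
K1 = np.array([[700.0,   0.0, 320.0],
               [  0.0, 700.0, 240.0],
               [  0.0,   0.0,   1.0]])
K2 = K1.copy()

# Relative pose: x_cam2 = R @ x_cam1 + t (here a pure 0.1 m baseline along x)
R = np.eye(3)
t = np.array([[0.1], [0.0], [0.0]])

# Projection matrices, taking camera 1 as the world reference frame
P1 = K1 @ np.hstack((np.eye(3), np.zeros((3, 1))))
P2 = K2 @ np.hstack((R, t))

# Corresponding pixels in the two images, as 2 x N arrays
pts1 = np.array([[400.0, 420.0],    # x coordinates in image 1
                 [250.0, 260.0]])   # y coordinates in image 1
pts2 = np.array([[420.0, 441.0],    # x coordinates in image 2
                 [250.0, 260.0]])   # y coordinates in image 2

# Triangulation: homogeneous 3D points, then normalization
X_h = cv2.triangulatePoints(P1, P2, pts1, pts2)
X = (X_h[:3] / X_h[3]).T
print(X)  # metric coordinates, because K1, K2, R and t are fully known

Since camera 1 is taken as the reference, the resulting coordinates are expressed in its frame and in the same unit as the baseline (meters, in this sketch).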

If the relative pose is unknown, it can be estimated from the images themselves. However, as will be shown later, the distance between the cameras can be recovered only up to a multiplicative scale factor, and consequently the three-dimensional reconstruction is also known only up to that factor.
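
A minimal sketch (Python with OpenCV) of this estimation, assuming the intrinsic matrix K is known and a set of point correspondences is available; note that the translation returned is a unit vector, which is exactly the scale ambiguity described above:

import numpy as np
import cv2

def relative_pose_from_matches(pts1, pts2, K):
    """pts1, pts2: N x 2 arrays of corresponding pixels; K: 3 x 3 intrinsics."""
    pts1 = np.ascontiguousarray(pts1, dtype=np.float64)
    pts2 = np.ascontiguousarray(pts2, dtype=np.float64)
    # Essential matrix from the correspondences, with RANSAC to reject outliers
    E, mask = cv2.findEssentialMat(pts1, pts2, K,
                                   method=cv2.RANSAC,
                                   prob=0.999, threshold=1.0)
    # Decompose E into the candidate poses and keep the one that places
    # the inlier points in front of both cameras
    _, R, t, mask = cv2.recoverPose(E, pts1, pts2, K, mask=mask)
    return R, t  # ||t|| = 1: the baseline, and hence the scale, is undetermined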

If the intrinsic parameters are not known, it is still possible to relate corresponding points between the two images through the epipolar constraint. This relation can speed up the matching of keypoints, since the search for a correspondence is restricted to an epipolar line, but nothing metric can be said about the three-dimensional reconstruction of the observed scene (the reconstruction is known only up to a projective transformation).
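
A minimal sketch (Python with OpenCV) of this uncalibrated case: the fundamental matrix estimated from the matches encodes the relation between corresponding points, and each point of one image is associated with an epipolar line in the other, along which the matching keypoint must be searched:

import numpy as np
import cv2

def fundamental_and_epilines(pts1, pts2):
    """pts1, pts2: N x 2 arrays of corresponding pixels (N >= 8)."""
    pts1 = np.ascontiguousarray(pts1, dtype=np.float32)
    pts2 = np.ascontiguousarray(pts2, dtype=np.float32)
    # Fundamental matrix from pixel correspondences alone (no intrinsics needed)
    F, mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC,
                                     ransacReprojThreshold=1.0,
                                     confidence=0.999)
    # Epipolar lines in image 2 associated with the points of image 1:
    # each line (a, b, c) satisfies a*x + b*y + c = 0 at the matching point
    lines2 = cv2.computeCorrespondEpilines(pts1.reshape(-1, 1, 2), 1, F)
    return F, lines2.reshape(-1, 3)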


