The Speeded Up Robust Features (SURF) algorithm (BETVG08) is inspired by the SIFT algorithm and by scale-space representation theory. It is an optimized variant that approximates the Hessian with box filters evaluated on the integral image, both for detecting keypoints and for extracting their descriptors.
SURF is invariant to translation, scale, and rotation, but there exists a simplified variant, referred to as "U-SURF," which is only invariant to variations in translation and scale. In this case, the area around the identified point is not normalized with respect to rotation when the descriptor is extracted.
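The integral image mentioned above is what makes the filters cheap: once it is built, the sum over any axis-aligned rectangle costs four lookups, whatever the rectangle size. The following is a minimal sketch in plain Python; the function names `integral_image` and `box_sum` are illustrative and do not come from the text.

```python
def integral_image(img):
    """Return the summed-area table of img, padded with a zero row and column.

    ii[y][x] holds the sum of img over rows 0..y-1 and columns 0..x-1.
    """
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii, x0, y0, x1, y1):
    """Sum of the image over rows y0..y1-1 and columns x0..x1-1 (half-open)."""
    return ii[y1][x1] - ii[y0][x1] - ii[y1][x0] + ii[y0][x0]


if __name__ == "__main__":
    image = [[1, 2, 3],
             [4, 5, 6],
             [7, 8, 9]]
    table = integral_image(image)
    print(box_sum(table, 1, 1, 3, 3))  # sum of the bottom-right 2x2 block
```

The padding row and column avoid special cases at the image border, at the cost of an index offset of one.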
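In SURF, the characteristic points are detected by calculating local maxima on the determinant of the Hessian image defined as:
$$ H(x, y) = \begin{bmatrix} D_{xx}(x, y) & D_{xy}(x, y) \\ D_{xy}(x, y) & D_{yy}(x, y) \end{bmatrix} \tag{5.12} $$
where $D_{xx}$, $D_{yy}$, and $D_{xy}$ are the responses of box filters that approximate the second derivatives of a Gaussian.
The bandwidth of these approximate filters, for a filter of side $l$ pixels, can be estimated as
$$ \sigma = \frac{1.2}{9}\, l \tag{5.13} $$
The determinant image is calculated as
$$ \det(H) = D_{xx} D_{yy} - (w D_{xy})^2 \tag{5.14} $$
where the weight $w \approx 0.9$ compensates for the box-filter approximation.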
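To make the determinant computation concrete, the sketch below evaluates the approximate Hessian determinant at a single pixel from an integral image. It rests on several assumptions that are not stated in the text: the lobe layout of the box filters follows the one commonly used in open-source SURF implementations, the responses are normalized by the filter area, the weight `w = 0.9` is the value reported by Bay et al., and the pixel is far enough from the border that no clamping is needed. All function names are illustrative.

```python
def integral_image(img):
    """Summed-area table padded with a zero row and column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box(ii, row, col, rows, cols):
    """Sum over the rows x cols rectangle whose top-left pixel is (row, col)."""
    return (ii[row + rows][col + cols] - ii[row][col + cols]
            - ii[row + rows][col] + ii[row][col])


def det_hessian(ii, r, c, size=9, w=0.9):
    """Approximate det(H) at pixel (r, c) for a box filter of side `size`."""
    lobe = size // 3          # width of one filter lobe
    half = (size - 1) // 2    # distance from the centre to the filter edge
    # Whole band minus 3x the central lobe yields the weights +1, -2, +1.
    dxx = (box(ii, r - lobe + 1, c - half, 2 * lobe - 1, size)
           - 3 * box(ii, r - lobe + 1, c - lobe // 2, 2 * lobe - 1, lobe))
    dyy = (box(ii, r - half, c - lobe + 1, size, 2 * lobe - 1)
           - 3 * box(ii, r - lobe // 2, c - lobe + 1, lobe, 2 * lobe - 1))
    # Four square lobes placed diagonally around the centre.
    dxy = (box(ii, r - lobe, c + 1, lobe, lobe)
           + box(ii, r + 1, c - lobe, lobe, lobe)
           - box(ii, r - lobe, c - lobe, lobe, lobe)
           - box(ii, r + 1, c + 1, lobe, lobe))
    area = float(size * size)
    dxx, dyy, dxy = dxx / area, dyy / area, dxy / area
    return dxx * dyy - (w * dxy) ** 2


if __name__ == "__main__":
    # A single bright pixel behaves like a small blob: the determinant is positive.
    image = [[0] * 15 for _ in range(15)]
    image[7][7] = 1
    print(det_hessian(integral_image(image), 7, 7))
```

Because every lobe is a rectangle, the cost per pixel is the same for a 9-pixel filter and for a 99-pixel one, which is why the image itself never needs to be downsampled.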
The image is analyzed across multiple octaves (each octave has a scale factor that is double that of the previous octave). Each octave is divided into an equal number of scale levels. The number of scales per octave is constrained by the inherently quantized nature of the filter, and the approximated Gaussians are not as evenly spaced as in the case of SIFT. In fact, 4 intervals per octave is the only feasible number of subdivisions.
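As an illustration of this quantization, the sketch below lists one possible set of filter sizes per octave together with the corresponding bandwidth. The size rule `3 * (2**o * i + 1)` is an assumption taken from the filter sizes reported by Bay et al. (9, 15, 21, 27 for the first octave), not from this text, and the bandwidth uses the usual SURF convention that a 9-pixel filter corresponds to a scale of 1.2. The function names are illustrative.

```python
def filter_sizes(octaves=4, intervals=4):
    """Box-filter side lengths in pixels, one list per octave.

    Assumed rule: size = 3 * (2**o * i + 1), o = 1..octaves, i = 1..intervals.
    The step between sizes doubles from one octave to the next.
    """
    return [[3 * (2 ** o * i + 1) for i in range(1, intervals + 1)]
            for o in range(1, octaves + 1)]


def filter_scale(size):
    """Equivalent Gaussian bandwidth of a box filter of the given side."""
    return 1.2 * size / 9.0


if __name__ == "__main__":
    for index, sizes in enumerate(filter_sizes(), start=1):
        scales = [round(filter_scale(s), 2) for s in sizes]
        print("octave", index, "sizes", sizes, "scales", scales)
```

Sizes must stay odd and multiples of 3 so that the filter has a central pixel and three equal lobes, which is what limits the admissible scale levels.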
Within each octave, as the scale and position vary, a $3 \times 3 \times 3$ Non-Maxima Suppression is performed on the determinant image of $H$. The local minima/maxima, interpolated through a three-dimensional quadratic as in SIFT, are the interest points identified by SURF. The scale is set equal to the bandwidth $\sigma$ of the associated filter.
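The suppression step can be sketched as a search for strict maxima over a 3x3x3 neighbourhood in scale and position. The sketch assumes that the determinant responses of one octave are stored as `stack[s][y][x]`, all at the same resolution; it keeps only maxima above a threshold and omits the quadratic sub-pixel interpolation. The function name and the `threshold` parameter are illustrative.

```python
def local_maxima_3x3x3(stack, threshold=0.0):
    """Return (scale, y, x) indices of strict maxima in a 3x3x3 neighbourhood.

    stack[s][y][x] is the determinant response at scale level s.
    Border scales, rows and columns are skipped because they lack neighbours.
    """
    found = []
    for s in range(1, len(stack) - 1):
        for y in range(1, len(stack[s]) - 1):
            for x in range(1, len(stack[s][y]) - 1):
                value = stack[s][y][x]
                if value <= threshold:
                    continue
                neighbours = [stack[s + ds][y + dy][x + dx]
                              for ds in (-1, 0, 1)
                              for dy in (-1, 0, 1)
                              for dx in (-1, 0, 1)
                              if (ds, dy, dx) != (0, 0, 0)]
                if all(value > n for n in neighbours):
                    found.append((s, y, x))
    return found


if __name__ == "__main__":
    levels = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
    levels[1][1][1] = 5.0
    print(local_maxima_3x3x3(levels))
```

Since the first and last scale levels of an octave only serve as neighbours, an octave with 4 levels can yield keypoints on its 2 inner levels only.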
From the identified maxima, using the integral image, the dominant orientation is extracted in the vicinity of the point (within a radius of $6\sigma$, sampled at a step of $\sigma$). In this case, Haar features of size $4\sigma$ are used and weighted with a Gaussian of standard deviation $2\sigma$.
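The sampling stage of the orientation estimate can be sketched as follows. The numeric parameters (radius of 6 scale units, step of 1 unit, Haar size of 4 units, Gaussian of 2 units) are the values given by Bay et al. and are assumptions here; the scale `s` is taken as an already rounded integer, and border handling is omitted. The subsequent search for the dominant direction among the collected responses is not shown. All function names are illustrative.

```python
import math


def integral_image(img):
    """Summed-area table padded with a zero row and column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box(ii, row, col, rows, cols):
    """Sum over the rows x cols rectangle whose top-left pixel is (row, col)."""
    return (ii[row + rows][col + cols] - ii[row][col + cols]
            - ii[row + rows][col] + ii[row][col])


def haar_x(ii, r, c, size):
    """Horizontal Haar response: right half minus left half."""
    half = size // 2
    return box(ii, r - half, c, size, half) - box(ii, r - half, c - half, size, half)


def haar_y(ii, r, c, size):
    """Vertical Haar response: bottom half minus top half."""
    half = size // 2
    return box(ii, r, c - half, half, size) - box(ii, r - half, c - half, half, size)


def orientation_samples(ii, r, c, s):
    """Gaussian-weighted (dx, dy) Haar responses on a disc of radius 6s, step s."""
    sigma = 2.0 * s
    samples = []
    for i in range(-6, 7):
        for j in range(-6, 7):
            if i * i + j * j >= 36:
                continue  # outside the circular neighbourhood
            weight = math.exp(-((i * s) ** 2 + (j * s) ** 2) / (2.0 * sigma ** 2))
            y, x = r + i * s, c + j * s
            samples.append((weight * haar_x(ii, y, x, 4 * s),
                            weight * haar_y(ii, y, x, 4 * s)))
    return samples


if __name__ == "__main__":
    # Horizontal intensity ramp: every response points along +x.
    ramp = [[x for x in range(40)] for _ in range(40)]
    responses = orientation_samples(integral_image(ramp), 20, 20, 1)
    print(len(responses), responses[0])
```

On a horizontal ramp all vertical responses vanish, so any reasonable estimator applied to these samples would report an orientation along the x axis.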
Through the orientation information, a descriptor is generated based on the directions of the gradients by sampling a square area of side $20\sigma$ around the point, divided into $4 \times 4$ regions, and weighting the points with a Gaussian of standard deviation $3.3\sigma$. Within each region, $\sum d_x$, $\sum d_y$, $\sum |d_x|$, and $\sum |d_y|$ are calculated. Both the orientation and the gradient histogram are extracted at the detection scale of the feature.
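The assembly of the descriptor from the per-region sums can be sketched as below. The sketch assumes that the Haar responses of each region have already been rotated to the dominant orientation and Gaussian-weighted, and that the regions are supplied as 16 lists of `(dx, dy)` pairs in a fixed order; with four sums per region this gives 16 x 4 = 64 values. Any final normalization of the vector is left out. The function names are illustrative.

```python
def region_sums(responses):
    """Return [sum dx, sum dy, sum |dx|, sum |dy|] for one region."""
    sum_dx = sum(dx for dx, _ in responses)
    sum_dy = sum(dy for _, dy in responses)
    sum_abs_dx = sum(abs(dx) for dx, _ in responses)
    sum_abs_dy = sum(abs(dy) for _, dy in responses)
    return [sum_dx, sum_dy, sum_abs_dx, sum_abs_dy]


def surf_descriptor(regions):
    """Concatenate the four sums of each of the 16 regions into one vector."""
    if len(regions) != 16:
        raise ValueError("expected 4 x 4 = 16 regions")
    vector = []
    for responses in regions:
        vector.extend(region_sums(responses))
    return vector


if __name__ == "__main__":
    one_region = [(1.0, -2.0), (-3.0, 4.0)]
    print(region_sums(one_region))
    print(len(surf_descriptor([one_region] * 16)))
```

Keeping the absolute sums next to the signed ones lets the descriptor distinguish a flat region from one whose gradients alternate in sign, since both have signed sums close to zero.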
Paolo Medici