Lucas-Kanade

The Lucas-Kanade optical flow estimation method (LK81) is a technique for estimating the motion of interesting features across successive frames of a video. The goal is to associate a motion vector $(u,v)$ with each "interesting" pixel in the scene by comparing two consecutive images.

The algorithm makes the following assumptions:

Starting from the optical flow equation for each point $(x,y)$:

\begin{displaymath}
I (x + u\delta t, y + v\delta t, t + \delta t) = I(x,y,t)
\end{displaymath} (7.2)

where $I(t)$ is one image and $I(t + \delta t)$ is the subsequent one. With the first-order Taylor series expansion:
\begin{displaymath}
\begin{array}{l}
I (x + u\delta t, y + v\delta t, t + \delt...
...,y,t) v + \frac{\partial I}{\partial t} (x,y,t) = 0
\end{array}\end{displaymath} (7.3)

The Lucas-Kanade algorithm assumes that the change in brightness of a pixel in the scene is fully compensated by the gradient of the scene itself, that is:
\begin{displaymath}
I_x u + I_y v + I_t = 0
\end{displaymath} (7.4)

given the temporal gradient $I_t$ and the spatial gradient $(I_x,I_y)$.

Obviously, a single pixel does not contain enough information to solve this problem. To gather more observations, it is assumed that a neighborhood of the pixel exhibits the same motion, that is,


\begin{displaymath}
\begin{bmatrix}
I_x(\mathbf{p}_1) & I_y(\mathbf{p}_1) \\
...
... (\mathbf{p}_2) \\
\vdots \\
I_t (\mathbf{p}_n)
\end{bmatrix}\end{displaymath} (7.5)

where $\mathbf{p}_1 \ldots \mathbf{p}_n$ are the points in the neighborhood of the point to be estimated. The solution can be obtained through the method of normal equations


\begin{displaymath}
\begin{bmatrix}
\sum I_x I_y & \sum I_x I_y \\
\sum I_x ...
...egin{bmatrix}
\sum I_x I_t \\
\sum I_y I_t \\
\end{bmatrix}\end{displaymath} (7.6)

It is noteworthy that this is also the matrix of characteristic points utilized by Shi-Tomasi or Harris (see 5.2): the characteristic points of this matrix are points that can be easily tracked using the Lucas-Kanade algorithm.

When the motion is greater than one pixel, an iterative algorithm is required to solve the problem, along with a coarse-to-fine approach to avoid local minima: there will exist a scale at which the motion of the pixel is less than one pixel.

Paolo medici
2025-10-22