Kalman Filter

The Kalman filter (WB95) estimates the internal state $\mathbf{x} \in \mathbb{R}^{n}$ of a discrete-time system, a state that is not directly accessible, under the assumption that the model of the system is completely known. The Kalman filter is in fact the optimal recursive estimator: when the noise in the problem is Gaussian, it provides the least-squares estimate of the internal state of the system.

For historical reasons, the Kalman filter specifically refers to the filtering of a system where the state transition and the observation are linear functions of the current state.

According to linear systems theory, the dynamics of a continuous-time "linear" system are described by a differential equation of the form

\begin{displaymath}
\dot{\mathbf{x}} = \mathbf{A}(t) \mathbf{x}(t) + \mathbf{B} \mathbf{u}(t) + \mathbf{w}(t)
\end{displaymath} (2.93)

the state update equation, to which an indirect observation of this state is associated through a second linear relation:
\begin{displaymath}
\mathbf{z}(t) = \mathbf{H}(t) \mathbf{x}(t) + \mathbf{v}(t)
\end{displaymath} (2.94)

with $\mathbf{z} \in \mathbb{R}^{m}$ the observable.

The discrete-time Kalman filter addresses real systems, in which the world is sampled at discrete intervals; the continuous-time linear system is transformed into a discrete linear system of the form

\begin{displaymath}
\left\{
\begin{array}{l}
\mathbf{x}_{k+1} = \mathbf{A}_{k} \mathbf{x}_{k} + \mathbf{B}_{k} \mathbf{u}_{k} + \mathbf{w}_{k} \\
\mathbf{z}_{k} = \mathbf{H}_k \mathbf{x}_{k} + \mathbf{v}_{k}
\end{array}\right.
\end{displaymath} (2.95)

If the system evolves according to this model, it is referred to as a Linear-Gaussian State Space Model or a Linear Dynamic System. If the values of the matrices are time-independent, the model is termed stationary.

The variables $\mathbf{w}_{k}$ and $\mathbf{v}_{k}$ represent the process noise and the observation noise, respectively, with zero mean $\bar{\mathbf{w}}_k=\bar{\mathbf{v}}_k=0$ and known covariances $\mathbf{Q}$ and $\mathbf{R}$ (under the assumption of white Gaussian noise). $\mathbf{A}$ is the $n \times n$ state transition matrix, $\mathbf{B}$ is the $n \times l$ matrix that relates the optional control input $\mathbf{u} \in \mathbb{R}^{l}$ to the state, and $\mathbf{H}$ is the $m \times n$ matrix that relates the state $\mathbf{x}$ to the measurement. All these matrices, which represent the system model, must be known with absolute precision, otherwise systematic errors will be introduced.
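As a concrete illustration, the following is a minimal sketch in Python of such a stationary model; the constant-velocity system and all numerical values are hypothetical, chosen only for the example:

\begin{verbatim}
import numpy as np

# Hypothetical stationary model of the form (2.95): constant-velocity
# motion with state x = [position, velocity] and a noisy position sensor.
dt = 0.1                              # sampling interval (illustrative)
A = np.array([[1.0, dt],
              [0.0, 1.0]])            # n x n state transition matrix
B = np.array([[0.5 * dt**2],
              [dt]])                  # n x l control matrix (acceleration input)
H = np.array([[1.0, 0.0]])            # m x n observation matrix
Q = 0.01 * np.eye(2)                  # process noise covariance (assumed)
R = np.array([[0.25]])                # observation noise covariance (assumed)
\end{verbatim}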

The Kalman filter is a recursive estimation filter that requires, at each iteration, only the state estimate from the previous step $\hat{\mathbf{x}}_{k-1}$ and the current measurement $\mathbf{z}_k$, an indirect observation of the system's state.

Let $\hat{\mathbf{x}}^{-}_{k}$ be the a priori estimate of the system state, based on the estimate obtained at time $k-1$ and on the dynamics of the problem, and let $\hat{\mathbf{x}}_{k}$ be the a posteriori estimate of the state, based on the observation $\mathbf{z}_k$. From these definitions it is possible to define the errors of the a priori and a posteriori estimates as

\begin{displaymath}
\begin{array}{l}
\mathbf{e}^{-}_{k} = \mathbf{x}_k - \hat{\mathbf{x}}^{-}_{k} \\
\mathbf{e}_{k} = \mathbf{x}_k - \hat{\mathbf{x}}_{k}
\end{array}\end{displaymath} (2.96)

To these errors, it is possible to associate
\begin{displaymath}
\begin{array}{l}
\mathbf{P}^{-}_k = \E[\mathbf{e}^{-}_k {\mathbf{e}^{-}_k}^{\top}] \\
\mathbf{P}_k = \E[\mathbf{e}_k \mathbf{e}^{\top}_k]
\end{array}\end{displaymath} (2.97)

the a priori and a posteriori covariance matrices, respectively.

The objective of the Kalman filter is to minimize the covariance of the a posteriori error $\mathbf{P}_k$ and to provide a method for obtaining the estimate $\hat{\mathbf{x}}_{k}$ given the a priori estimate $\hat{\mathbf{x}}^{-}_{k}$ and the observation $\mathbf{z}_k$.

The Kalman filter forms the a posteriori state estimate as a linear combination of the a priori estimate and the observation residual:

\begin{displaymath}
\hat{\mathbf{x}}_{k} = \hat{\mathbf{x}}^{-}_{k} + \mathbf{K}_k( \mathbf{z}_k - \mathbf{H}_k \hat{\mathbf{x}}^{-}_{k})
\end{displaymath} (2.98)

shifting the estimation problem to that of deriving the gain factor (blending factor) $\mathbf{K}_k$. The difference $\mathbf{z}_k - \mathbf{H}_k \hat{\mathbf{x}}^{-}_{k}$ is referred to as the residual, or innovation, and represents the discrepancy between the predicted observation and the actual observation. It is worth noting that the metric used to compute the residual may depend on the specific characteristics of the problem.

The Kalman filter is typically presented in two phases: the time update (prediction phase) and the measurement update (observation phase).

In the first phase, the a priori estimates of both the state $\hat{\mathbf{x}}^{-}_k$ and the covariance $\mathbf{P}^{-}_{k}$ are obtained. The a priori state estimate $\hat{\mathbf{x}}^{-}_{k}$ derives from the knowledge of the system dynamics given by equation (2.95):

\begin{displaymath}
\hat{\mathbf{x}}^{-}_{k} = \mathbf{A} \hat{\mathbf{x}}_{k-1} + \mathbf{B} \mathbf{u}_{k}
\end{displaymath} (2.99)

Similarly, the a priori estimate of the error covariance is updated:
\begin{displaymath}
\mathbf{P}^{-}_{k} = \mathbf{A} \mathbf{P}_{k-1} \mathbf{A}^{\top} + \mathbf{Q}_k
\end{displaymath} (2.100)

These are the best estimates of the state and the covariance at the instant $k$ that can be obtained a priori from the observation of the system.
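A minimal sketch of this prediction phase in Python, continuing the hypothetical model above (the function name is illustrative):

\begin{verbatim}
def predict(x_est, P_est, A, B, u, Q):
    """Time update: a priori state (2.99) and covariance (2.100)."""
    x_prior = A @ x_est + B @ u       # equation (2.99)
    P_prior = A @ P_est @ A.T + Q     # equation (2.100)
    return x_prior, P_prior
\end{verbatim}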

In the second phase, the gain

\begin{displaymath}
\mathbf{K}_k = \mathbf{P}^{-}_{k} \mathbf{H}_k^{\top} \left( \mathbf{H}_k \mathbf{P}^{-}_{k} \mathbf{H}_k^{\top} + \mathbf{R}_k \right)^{-1}
\end{displaymath} (2.101)

is calculated to minimize the a posteriori covariance, and with this factor, the a posteriori state is updated using equation (2.98).

Using this value for the gain $\mathbf{K}$, the a posteriori estimate of the covariance matrix becomes

\begin{displaymath}
\mathbf{P}_{k} = (\mathbf{I} - \mathbf{K}_k \mathbf{H}_k) \mathbf{P}^{-}_{k}
\end{displaymath} (2.102)
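The corresponding measurement update, again as a sketch under the same assumptions:

\begin{verbatim}
def update(x_prior, P_prior, z, H, R):
    """Measurement update: gain (2.101), state (2.98), covariance (2.102)."""
    S = H @ P_prior @ H.T + R                   # innovation covariance
    K = P_prior @ H.T @ np.linalg.inv(S)        # Kalman gain (2.101)
    x_est = x_prior + K @ (z - H @ x_prior)     # a posteriori state (2.98)
    n = P_prior.shape[0]
    P_est = (np.eye(n) - K @ H) @ P_prior       # a posteriori covariance (2.102)
    return x_est, P_est
\end{verbatim}

A complete iteration of the filter is then a call to predict followed by a call to update.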

To unify the various forms of the Kalman filter, these equations can be expressed using the variance-covariance matrices as follows:


\begin{displaymath}
\begin{array}{l}
\cov (x_k, z_k) = \mathbf{P}^{-}_{k} \mathbf{H}_k^{\top} \\
\cov (z_k) = \mathbf{H}_k \mathbf{P}^{-}_{k} \mathbf{H}_k^{\top}
\end{array}\end{displaymath} (2.103)

so that we can express equation (2.101) as
\begin{displaymath}
\mathbf{K}_k = \cov (x_k, z_k) \left( \cov (z_k) + \mathbf{R}_k \right)^{-1}
\end{displaymath} (2.104)

and, by substituting the covariances (2.103) into (2.102), we obtain
\begin{displaymath}
\mathbf{P}_{k} = \mathbf{P}^{-}_{k} - \mathbf{K}_k \cov (x_k, z_k)^{\top}
\end{displaymath} (2.105)
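Explicitly, the substitution uses the symmetry of $\mathbf{P}^{-}_{k}$:
\begin{displaymath}
\mathbf{P}_{k} = \mathbf{P}^{-}_{k} - \mathbf{K}_k \mathbf{H}_k \mathbf{P}^{-}_{k} = \mathbf{P}^{-}_{k} - \mathbf{K}_k \left( \mathbf{P}^{-}_{k} \mathbf{H}_k^{\top} \right)^{\top} = \mathbf{P}^{-}_{k} - \mathbf{K}_k \cov (x_k, z_k)^{\top}
\end{displaymath}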

It can be easily observed that the covariance matrix and the Kalman gain do not depend at all on the state, on the observations, or on the residual: their evolution is completely independent of the data, so they can even be computed in advance.
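For a stationary model the entire gain sequence can therefore be tabulated offline; a sketch, continuing the Python example above (the number of iterations and the initial covariance are arbitrary):

\begin{verbatim}
# The sequences P_k and K_k depend only on A, H, Q, R and the initial
# covariance, so they can be computed before any observation arrives.
P = 10.0 * np.eye(2)                      # illustrative initial covariance
gains = []
for _ in range(50):
    P = A @ P @ A.T + Q                              # a priori covariance (2.100)
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)     # gain (2.101)
    P = (np.eye(2) - K @ H) @ P                      # a posteriori covariance (2.102)
    gains.append(K)
\end{verbatim}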

However, the Kalman filter requires initial values for the state variable and for the covariance matrix: the initial state should be as close as possible to the true value, and the confidence placed in this initial value should be reflected in the initial covariance matrix.


One-Dimensional Kalman Filter

It is instructive to illustrate, as an example, the simplified case of a one-dimensional Kalman filter in which the state coincides with the observable. The transition and observation equations are


\begin{displaymath}
\begin{array}{l}
x_{i} = x_{i-1} + u_{i} + w_{i} \\
z_{i} = x_{i} + v_{i}
\end{array}\end{displaymath} (2.106)

where $w_i$ is the process noise, whose variance $q_i$ expresses how much the signal itself is expected to vary (low if the signal varies little over time, high if it varies significantly), while $v_i$ is the observation noise, with variance $r_i$, associated with the observation of the state.

The prediction cycle is very simple and becomes:

\begin{displaymath}
\begin{array}{l}
x^{-}_{i} = x_{i-1} + u_{i}\\
p^{-}_{i} = p_{i-1} + q_i
\end{array}\end{displaymath} (2.107)

The Kalman gain $k$ becomes

\begin{displaymath}
k_i = \frac{p^{-}_{i}}{p^{-}_{i} + r_i}
\end{displaymath} (2.108)

and finally the observation phase becomes
\begin{displaymath}
\begin{array}{l}
x_{i} = x^{-}_{i} + k_i (z_i - x^{-}_i) = k_i z_i + (1 - k_i) x^{-}_i \\
p_{i} = (1 - k_i) p^{-}_{i}
\end{array}\end{displaymath} (2.109)

It is usually possible to estimate the value of $r$ a priori, while the value of $q$ must be determined through experiments.

As seen in the first of the equations (2.109), the factor $k$ is essentially a blending factor between the observation of the state and the previously estimated state.

In the one-dimensional case it is easy to see that the gain $k$ and the variance $p$ are independent of the state, of the observations, and of the error. If $r$ and $q$ do not vary over time, $k$ and $p$ are numerical sequences that converge to constant values determined solely by the characterization of the noise, regardless of their initial values. This result should be compared with what is obtained from equation (2.65).
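This convergence can be verified numerically; a sketch with arbitrary constant noise parameters, starting from very different initial variances:

\begin{verbatim}
# With constant q and r, the gain k and the variance p converge to the
# same steady-state values regardless of the initial variance p0.
q, r = 0.01, 1.0
for p0 in (1e-6, 1.0, 1e6):
    p = p0
    for _ in range(1000):
        p = p + q              # a priori variance (2.107)
        k = p / (p + r)        # gain (2.108)
        p = (1.0 - k) * p      # a posteriori variance (2.109)
    print(p0, k, p)            # k and p approach the same limit
\end{verbatim}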
