In this section we discuss the problem of statistical filtering, which refers to the class of problems in which data from one or more noise-affected sensors are available. These data are observations of the dynamic state of a system that is not directly observable but for which an estimate is required. The process of finding the best estimate of the internal state of a system is called "filtering", since it is a method for filtering out the various noise components. The evolution of the system (the evolution of its internal state) is assumed to follow known physical laws, perturbed by a noise component (the process noise). It is precisely the knowledge of the equations governing the evolution of the state that makes it possible to provide a better estimate of the internal state.
A physical process can be described, in its state-space representation (State Space Model), by a function that governs how the state evolves over time,

$$ \dot{x}(t) = f(x(t)) + w(t), \tag{2.83} $$

and by an observation function,

$$ y(t) = h(x(t)) + v(t), \tag{2.84} $$

with $w(t)$ representing the process noise, $v(t)$ the observation noise, and $h$ a function solely of the current state.
This formalism is expressed in the continuous-time domain. In practical applications signals are sampled at discrete times, and therefore a discrete-time version is typically used, of the form

$$ x_k = f(x_{k-1}) + w_{k-1}, \qquad y_k = h(x_k) + v_k. \tag{2.85} $$
In systems that satisfy equations (2.85), the evolution of the state is solely a function of the previous state, while the observation is solely a function of the current state (see figure 2.4). A system that meets these assumptions is said to be a Markov process: its evolution and its observation depend only on the current state and not on past states. Access to information about the state always occurs indirectly, through the observation; such a system is known as a Hidden Markov Model.
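As a concrete illustration, the following sketch simulates a discrete-time state-space model of the form (2.85). The specific transition function, observation function, and noise levels are arbitrary choices for illustration, not taken from the text:

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    # Hypothetical transition function of the previous state only.
    return 0.9 * x + 0.1 * np.sin(x)

def h(x):
    # Hypothetical observation function of the current state only.
    return 0.5 * x ** 2

q, r = 0.1, 0.5              # process / observation noise std (assumed)
x = 1.0                      # initial state
states, observations = [], []
for k in range(100):
    x = f(x) + rng.normal(0.0, q)      # x_k = f(x_{k-1}) + w_{k-1}
    y = h(x) + rng.normal(0.0, r)      # y_k = h(x_k) + v_k
    states.append(x)
    observations.append(y)
```

The filtering problem is then to recover the sequence `states` given only `observations`.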
Many approaches to estimating the unknown state of a system from a set of measurements do not account for the noisy nature of the observations. It is indeed possible to construct an algorithm that performs a nonlinear regression on the observations to obtain estimates of all the states of the problem, solving an optimization problem with a large number of unknowns.
Filters, unlike regression, aim to provide the best estimate of the variables (the state) as the observation data arrive. From a theoretical standpoint, regression represents the optimal case, while filtering converges to the correct result only after a sufficiently large number of samples.
Bayesian filters aim to estimate, at the discrete time instant $k$, the state of the random variable $x_k$ given an indirect observation of the system, $y_k$.
Filtering techniques allow one to obtain both the best estimate $\hat{x}_k$ of the unknown state and the multivariate probability distribution $p(x_k \mid y_k)$ that represents the knowledge of the state itself.
Given the observation $y_k$ of the system, it is possible to define a probability density a posteriori to the observation of the event $x_k$, which incorporates the additional knowledge gained from that observation:

$$ p(x_k \mid y_k). \tag{2.86} $$
Applying Bayes' theorem to equation (2.86) yields

$$ p(x_k \mid y_k) = \frac{p(y_k \mid x_k)\, p(x_k)}{p(y_k)}. \tag{2.87} $$
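For a small discrete toy problem (all numbers chosen arbitrarily for illustration), equation (2.87) can be evaluated directly:

```python
import numpy as np

# Hypothetical prior over three discrete states and the likelihood of one observed y_k.
prior = np.array([0.5, 0.3, 0.2])          # p(x_k)
likelihood = np.array([0.1, 0.7, 0.2])     # p(y_k | x_k)

evidence = likelihood @ prior               # p(y_k) = sum over x of p(y_k|x) p(x)
posterior = likelihood * prior / evidence   # p(x_k | y_k), equation (2.87)
# posterior is approximately [0.167, 0.700, 0.133]: the observation shifts
# the belief toward the second state.
```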
In addition to the a posteriori knowledge of the probability distribution, it is possible to leverage further information to improve the estimate: the knowledge available a priori, before the observation, obtained from the constraint that the state does not evolve in a completely unpredictable manner but can only evolve in certain ways, each with a specific probability. These ways in which the system can evolve are solely a function of the current state.
The Markov process hypothesis implies that the only past state influencing the evolution of the system is the state at time $k-1$, that is, $p(x_k \mid x_{0:k-1}) = p(x_k \mid x_{k-1})$.
It is therefore possible to perform the a priori prediction, thanks to the Chapman-Kolmogorov equation:

$$ p(x_k \mid y_{k-1}) = \int p(x_k \mid x_{k-1})\, p(x_{k-1} \mid y_{k-1})\, dx_{k-1}. \tag{2.88} $$
From the knowledge of the a priori density $p(x_k \mid y_{k-1})$ and the observation $y_k$, it is possible to rewrite equation (2.86) as the state update equation

$$ p(x_k \mid y_k) = \frac{p(y_k \mid x_k)\, p(x_k \mid y_{k-1})}{p(y_k \mid y_{k-1})}. \tag{2.89} $$
The state is estimated by alternating between a prediction phase (a priori estimation) and an observation phase (a posteriori estimation). This iterative process is known as Recursive Bayesian Estimation.
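A minimal sketch of this predict/update loop, with the continuous state discretized on a grid (a rudimentary grid-based filter in the spirit of section 2.12.1). The linear-Gaussian transition and observation models, and the observation values, are illustrative assumptions:

```python
import numpy as np

grid = np.linspace(-5.0, 5.0, 201)           # discretized state space
dx = grid[1] - grid[0]

def gauss(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def transition(x_next, x_prev):              # assumed p(x_k | x_{k-1})
    return gauss(x_next, 0.9 * x_prev, 0.3)

def likelihood(y, x):                        # assumed p(y_k | x_k)
    return gauss(y, x, 0.5)

belief = gauss(grid, 0.0, 1.0)               # initial knowledge of the state
belief /= belief.sum() * dx

for y in [0.8, 1.1, 0.9]:                    # made-up incoming observations
    # Prediction (a priori): Chapman-Kolmogorov integral (2.88) on the grid.
    T = transition(grid[:, None], grid[None, :])   # T[i, j] = p(grid_i | grid_j)
    prior = (T @ belief) * dx
    # Update (a posteriori): Bayes' rule (2.89); normalizing plays the role
    # of dividing by the evidence.
    belief = likelihood(y, grid) * prior
    belief /= belief.sum() * dx

estimate = (grid * belief).sum() * dx        # posterior mean as the best estimate
```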
The techniques described in this section will refer only to the most recent observation available for state estimation, for reasons of performance and simplicity. Formally, it is possible to extend the discussion to the case where all observations are utilized to obtain a more accurate estimate of the state. In this case, the filtering and prediction equations become
$$ p(x_k \mid y_{1:k}) = \frac{p(y_k \mid x_k)\, p(x_k \mid y_{1:k-1})}{p(y_k \mid y_{1:k-1})}, \qquad p(x_k \mid y_{1:k-1}) = \int p(x_k \mid x_{k-1})\, p(x_{k-1} \mid y_{1:k-1})\, dx_{k-1}, \tag{2.90} $$

where $y_{1:k} = \{y_1, \dots, y_k\}$ denotes the set of all observations up to time $k$.
For the estimation of continuous variables it is not possible to exploit Bayesian theory "directly", since the integrals involved rarely admit a closed-form solution; however, several approaches have been proposed in the literature to enable an estimation that is efficient both computationally and in terms of memory usage.
Depending on whether the problem is linear or nonlinear, and on whether the noise probability distribution is Gaussian or not, each of these filters comes more or less close to optimal behavior.
The Kalman Filter (section 2.12.2) is the optimal filter when the problem is linear and the noise distribution is Gaussian. The Extended Kalman Filter and the Unscented Kalman Filter, discussed in sections 2.12.4 and 2.12.5 respectively, are sub-optimal filters for nonlinear problems with Gaussian noise distribution (or slightly deviating from it). Finally, particle filters provide a sub-optimal solution for nonlinear problems with non-Gaussian noise distribution.
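In the linear-Gaussian case the recursive Bayesian equations reduce to closed-form updates of a mean and a covariance. A minimal scalar sketch of the resulting predict/update cycle, with illustrative (assumed) model parameters:

```python
# Assumed scalar linear-Gaussian model: x_k = a*x_{k-1} + w,  y_k = c*x_k + v.
a, c = 0.9, 1.0
q, r = 0.3 ** 2, 0.5 ** 2       # process / observation noise variances (assumed)

x_hat, p = 0.0, 1.0             # initial estimate mean and variance
for y in [0.8, 1.1, 0.9]:       # made-up observations
    # Prediction phase (a priori estimate).
    x_hat = a * x_hat
    p = a * a * p + q
    # Observation phase (a posteriori estimate).
    gain = p * c / (c * c * p + r)          # Kalman gain
    x_hat = x_hat + gain * (y - c * x_hat)
    p = (1.0 - gain * c) * p
```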
The grid-based filters (section 2.12.1) and particle filters (section 2.12.8) operate on a discrete representation of the state, while the Kalman, Extended, and Sigma-Point filters work on a continuous representation of the state.
Kalman, Extended Kalman, and Sigma-Point Kalman filters represent the uncertainty distributions (of the state, the process, and the observation) as a single Gaussian. Multimodal extensions exist, such as Multi-Hypothesis Tracking (MHT), that allow Kalman filters to be applied to distributions modeled as mixtures of Gaussians, while particle filters and grid-based methods are inherently multimodal.
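A minimal bootstrap particle filter sketch, which represents a possibly multimodal posterior with a set of samples; the dynamics, likelihood, and observations below are the same illustrative assumptions used in the previous sketches:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
particles = rng.normal(0.0, 1.0, n)          # samples from the initial belief

for y in [0.8, 1.1, 0.9]:                    # made-up observations
    # Prediction: propagate each particle through the (assumed) dynamics.
    particles = 0.9 * particles + rng.normal(0.0, 0.3, n)
    # Update: weight each particle by the (assumed) likelihood p(y_k | x_k).
    w = np.exp(-0.5 * ((y - particles) / 0.5) ** 2)
    w /= w.sum()
    # Resampling: draw an equally weighted particle set according to the weights.
    particles = rng.choice(particles, size=n, p=w)

estimate = particles.mean()                  # posterior mean estimate
```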
An excellent survey on Bayesian filtering is (Che03).