The Bayesian Classifier

Through the Bayesian approach, it would be possible to construct an optimal classifier if both the prior probabilities $p(y_i)$ and the class-conditional densities $p(x\vert y_i)$ were known exactly. In practice, such complete information is rarely available, and the usual approach is to build the classifier from a set of examples (the training set).

To model $p(x\vert y_i)$, a parametric approach is typically employed and, whenever possible, the distribution is approximated by a Gaussian or by spline functions.

The most commonly used estimation techniques are Maximum-Likelihood (ML) and Bayesian Estimation, which, although differing in their underlying logic, yield nearly identical results. The Gaussian distribution is typically an appropriate model for most pattern recognition problems.
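As an illustration of the ML approach, the following sketch (in Python with NumPy; the function name fit_gaussian_ml and the arrays X, y are hypothetical, not from the text) estimates the per-class mean, covariance and prior from a labelled training set, assuming each class is modelled by a multivariate Gaussian.

import numpy as np

def fit_gaussian_ml(X, y):
    """ML estimates of (mu_i, Sigma_i, pi_i) for each class.

    X : (n, d) array of feature vectors, y : (n,) array of integer class labels.
    """
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        mu = Xc.mean(axis=0)                          # ML estimate of the class mean
        Sigma = np.cov(Xc, rowvar=False, bias=True)   # ML (biased, divide-by-N) covariance
        pi = Xc.shape[0] / X.shape[0]                 # prior estimated from class frequency
        params[c] = (mu, Sigma, pi)
    return params

The biased covariance estimate (division by the number of samples rather than by the number of samples minus one) is the one that maximizes the likelihood; for large training sets the difference from the unbiased estimate is negligible.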

Let us examine the quite common case in which the class-conditional densities of the various classes follow a multivariate Gaussian distribution with mean $\boldsymbol\mu_i$ and covariance matrix $\boldsymbol\Sigma_i$. The optimal Bayesian classifier is given by


\begin{displaymath}
\begin{array}{rl}
\hat{y}(\mathbf{x}) = & \argmax_i \, p(\mathbf{x} \vert y_i) \, \pi_i \\
 = & \argmin_i \left( (\mathbf{x}-\boldsymbol\mu_i)^\top \boldsymbol\Sigma_i^{-1} (\mathbf{x}-\boldsymbol\mu_i) + \log \det \boldsymbol\Sigma_i - 2 \log \pi_i \right) \\
\end{array}
\end{displaymath} (4.11)

Equation (4.11) is obtained by taking the negative log-likelihood (section 2.8). In the case of equal prior probabilities $\pi_i$, equation (4.11) reduces to finding the minimum of the Mahalanobis distance (section 2.4) between the sample and the classes of the problem.
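A minimal sketch of the decision rule (4.11) is given below (in Python with NumPy; the helper name classify_bayes is hypothetical and the parameter dictionary is the one produced by the estimation sketch above). It simply evaluates the quadratic score of (4.11) for each class and returns the class that minimizes it.

import numpy as np

def classify_bayes(x, params):
    """Quadratic discriminant of eq. (4.11): argmin over classes of the
    Mahalanobis distance plus log-determinant and prior terms."""
    best, best_score = None, np.inf
    for c, (mu, Sigma, pi) in params.items():
        d = x - mu
        maha = d @ np.linalg.solve(Sigma, d)          # (x - mu)^T Sigma^{-1} (x - mu)
        score = maha + np.log(np.linalg.det(Sigma)) - 2.0 * np.log(pi)
        if score < best_score:
            best, best_score = c, score
    return best

With equal priors and identical covariance matrices, the log-determinant and prior terms are the same for every class and cancel out of the comparison, so the rule above selects the class with the smallest Mahalanobis distance, as stated in the text.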
