Bayesian Classifiers

Bayes' theorem, applied to computer vision, provides a fundamental technique for pattern classification based on experience (a training set).

To understand Bayes' theorem, it helps to consider a simple example. Suppose we want to classify fruit that is presented to an observer (or, in the limiting case, to a computer). For simplicity, let us assume that there are only two types of fruit (the categories of the classifier), for instance oranges and apples. For humans, as well as for machines, determining the type of fruit being observed involves examining specific characteristics (features) extracted from the observation of the fruit using appropriate techniques.

If the fruits are selected completely at random and no additional information can be extracted from them, the optimal approach for classifying them would be to provide a completely random response.

Bayesian decision theory becomes useful only when some a priori information about the objects is known.

As a first step, let us assume that nothing is known about the features of the fruits, but that 80% of the fruits are known to be apples and the remaining 20% oranges. If this is the only information available to make a decision, the optimal classifier labels every fruit as an apple: in the absence of other information, this is the only way to minimize the error, which is then 20% (the fraction of oranges). The a priori information in this case consists of the probabilities that the chosen fruit is an apple or an orange.
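The reasoning above can be checked with a few lines of Python. This is only a minimal sketch: the 80%/20% priors come from the text, while the function names `error_always` and `error_uniform_guess` are hypothetical and introduced here for illustration.

```python
# Priors taken from the text: 80% apples, 20% oranges.
priors = {"apple": 0.8, "orange": 0.2}


def error_always(label, priors):
    """Expected error of a classifier that always answers `label`."""
    return 1.0 - priors[label]


def error_uniform_guess(priors):
    """Expected error of guessing uniformly at random among the classes."""
    return 1.0 - 1.0 / len(priors)


# With priors only, the best decision is the most probable class.
best_label = max(priors, key=priors.get)

print("best constant answer:", best_label)
print("error, always", best_label, ":", error_always(best_label, priors))
print("error, uniform random guess:", error_uniform_guess(priors))
```

Under these assumptions, always answering with the most probable class gives a lower expected error than a uniform random guess, which is why the prior alone already improves the decision.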

Let us now examine the case where it is possible to extract additional information from the observed scene. From this perspective, the idea behind Bayes' theorem applied to classification is very intuitive: if I observe a particular measurable characteristic $x$ of the image (a feature), I can estimate the probability that the image represents a certain class $y_i$ after the observation (a posteriori). In this sense, Bayesian classifiers provide exactly the probability that the input data vector belongs to a given output class.
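In symbols, Bayes' theorem gives the posterior as $P(y_i \mid x) = P(x \mid y_i) P(y_i) / \sum_j P(x \mid y_j) P(y_j)$. The following Python sketch applies it to the fruit example with a single discrete feature, the observed hue. The priors are those used in the text; the likelihood values, the hue names, and the function name `posterior` are assumptions made up purely for illustration and would normally be estimated from a training set.

```python
def posterior(priors, likelihoods, x):
    """Return P(y | x) for every class y, using Bayes' theorem."""
    # Unnormalized scores P(x | y) * P(y) for each class.
    joint = {y: likelihoods[y][x] * priors[y] for y in priors}
    # Evidence P(x): the sum of the scores over all classes.
    evidence = sum(joint.values())
    return {y: score / evidence for y, score in joint.items()}


# Priors from the text.
priors = {"apple": 0.8, "orange": 0.2}

# Hypothetical likelihoods P(hue | class); illustrative values only.
likelihoods = {
    "apple": {"red": 0.70, "green": 0.25, "orange_hue": 0.05},
    "orange": {"red": 0.05, "green": 0.05, "orange_hue": 0.90},
}

post = posterior(priors, likelihoods, "orange_hue")
decision = max(post, key=post.get)

print("posterior:", post)
print("decision:", decision)
```

With these assumed numbers, observing an orange hue is enough to overturn the prior preference for apples: the feature carries more evidence than the 80%/20% class frequencies.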

Paolo Medici
2025-10-22