An SVM returns an objective function whose absolute value has no real meaning, since it is an uncalibrated output. The extension to the multiclass case is therefore challenging: the objective functions of the different classes are not directly comparable to one another.
The concept of hinge loss can, however, be extended to the multiclass case. In this scenario, an SVM loss of the form

$$ L_i = \sum_{j \neq y_i} \max\left(0,\ s_j - s_{y_i} + \Delta\right) \qquad (4.41) $$

is defined, where $s_j$ denotes, for simplicity, the objective function associated with class $j$ for the $i$-th sample, $y_i$ is the index of the correct class, and $\Delta$ is the required margin (commonly $\Delta = 1$).
Another similar metric is the squared hinge loss:
$$ L_i = \sum_{j \neq y_i} \max\left(0,\ s_j - s_{y_i} + \Delta\right)^2 \qquad (4.42) $$
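As a concrete illustration, the two per-sample losses (4.41) and (4.42) might be sketched in NumPy as follows; the function names and the default margin are illustrative assumptions, not part of the original text.

```python
import numpy as np

def svm_hinge_loss(s, y, delta=1.0):
    """Per-sample multiclass SVM loss, eq. (4.41).

    s     : 1-D array of class scores s_j (the per-class objective function)
    y     : index y_i of the correct class
    delta : required margin (assumed 1.0 here)
    """
    margins = np.maximum(0.0, s - s[y] + delta)
    margins[y] = 0.0  # the sum runs over j != y_i only
    return margins.sum()

def svm_squared_hinge_loss(s, y, delta=1.0):
    """Per-sample squared hinge loss, eq. (4.42)."""
    margins = np.maximum(0.0, s - s[y] + delta)
    margins[y] = 0.0
    return np.sum(margins ** 2)
```

For example, with illustrative scores `s = [3.2, 5.1, -1.7]` and correct class `y = 0`, the only positive margin is `5.1 - 3.2 + 1 = 2.9`, so the hinge loss is 2.9 and the squared hinge loss is 2.9² ≈ 8.41.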
Finally, a loss function is defined over the entire dataset as the average
$$ L = \frac{1}{N} \sum_{i=1}^{N} L_i + \lambda R(\mathbf{w}) \qquad (4.43) $$

where $\lambda R(\mathbf{w})$ is an optional regularization term on the weights.
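Continuing the sketch above (and reusing `svm_hinge_loss`), the dataset loss (4.43) could look as follows; the choice $R(\mathbf{w}) = \sum_k w_k^2$ (an L2 penalty) is one common assumption, not mandated by the text.

```python
def dataset_loss(scores, labels, w, lam=1e-3):
    """Average per-sample loss plus regularization, eq. (4.43).

    scores : (N, C) array, one row of class scores per sample
    labels : (N,) array of correct class indices
    w      : weight array; an L2 penalty R(w) = sum(w**2) is assumed here
    lam    : regularization strength lambda
    """
    data_loss = np.mean([svm_hinge_loss(s, y) for s, y in zip(scores, labels)])
    return data_loss + lam * np.sum(w ** 2)  # lambda * R(w)
```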
A different metric, which also extends to the multiclass case, is based on the normalized exponential function known as Softmax:

$$ P(y = j \mid \mathbf{x}_i) = \frac{e^{s_j}}{\sum_k e^{s_k}} \qquad (4.44) $$
The objective function $s_j$ can therefore be interpreted as an unnormalized log probability for each class, and the hinge loss can be replaced with the cross-entropy loss. A Softmax classifier minimizes the cross-entropy between the predicted class distribution and the true one, and since it minimizes the negative log likelihood of the correct class, it can be viewed as a maximum likelihood estimator. In the case of Softmax, the regularization term $R(\mathbf{w})$ can be seen, from a statistical perspective, as a prior on the weights: the resulting estimate is then a maximum a posteriori (MAP) estimate.
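A minimal sketch of the Softmax (4.44) and the associated cross-entropy loss, again with illustrative names; subtracting `max(s)` before exponentiating is a standard numerical-stability trick and does not change the result.

```python
def softmax(s):
    """Normalized exponential, eq. (4.44); the shift by max(s) avoids overflow."""
    e = np.exp(s - np.max(s))
    return e / e.sum()

def cross_entropy_loss(s, y):
    """Negative log likelihood of the correct class under the Softmax."""
    return -np.log(softmax(s)[y])
```

Averaging this loss over the dataset and adding the L2 term of (4.43) then matches the MAP reading above, with the L2 penalty playing the role of a Gaussian prior on the weights.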