In real-world applications a margin does not always exist: the classes are not always linearly separable in feature space by a hyperplane. The Soft Margin approach overcomes this limitation by introducing an additional variable $\xi_i$ for each sample, thereby relaxing the constraint on the margin:

$$ y_i (\mathbf{w} \cdot \mathbf{x}_i + b) \ge 1 - \xi_i, \qquad \xi_i \ge 0 \tag{4.26} $$

The parameter $\xi_i$ represents the slack associated with the sample $\mathbf{x}_i$. When $0 < \xi_i \le 1$, the sample is correctly classified but lies within the margin area. When $\xi_i > 1$, the sample enters the decision region of the opposing class and is therefore classified incorrectly.
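As a minimal sketch of how the slack can be read off for a given hyperplane, the snippet below computes for each sample the smallest value that satisfies the relaxed constraint, $\xi_i = \max(0, 1 - y_i(\mathbf{w} \cdot \mathbf{x}_i + b))$, and labels it according to the regimes just described. The function names, the hyperplane and the sample points are illustrative assumptions, not taken from the text.

```python
def dot(u, v):
    """Inner product of two equally sized sequences."""
    return sum(a * b for a, b in zip(u, v))


def slack(w, b, x, y):
    """Smallest xi >= 0 such that y * (w.x + b) >= 1 - xi."""
    return max(0.0, 1.0 - y * (dot(w, x) + b))


def margin_state(xi):
    """Map a slack value to the regimes discussed in the text."""
    if xi == 0.0:
        return "on or beyond the margin"
    if xi <= 1.0:
        return "inside the margin, correctly classified"
    return "misclassified"


# Hypothetical 1-D example: hyperplane w = 0.5, b = 0, all samples labelled +1.
w, b = [0.5], 0.0
samples = [([3.0], 1), ([0.5], 1), ([-0.5], 1)]
slacks = [slack(w, b, x, y) for x, y in samples]

for (x, y), xi in zip(samples, slacks):
    print(x, y, xi, margin_state(xi))
```

With these values the three samples receive slacks 0, 0.75 and 1.25, one for each regime.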
To find a better separating hyperplane, the cost function to be minimized must also take into account how far each sample lies beyond the margin:

$$ \min_{\mathbf{w}, b, \boldsymbol{\xi}} \; \frac{1}{2} \|\mathbf{w}\|^2 + C \sum_{i} \xi_i \tag{4.27} $$

subject to the constraints (4.26).
The parameter $C$ represents a degree of freedom in the problem, indicating how much a sample must "pay" for violating the margin constraint.
When $C$ is small, the margin is wide, whereas when $C$ approaches infinity, the problem reverts to the Hard Margin formulation of SVM discussed earlier.
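The trade-off controlled by $C$ can be illustrated with a small sketch that evaluates the usual soft-margin cost, $\frac{1}{2}\|\mathbf{w}\|^2 + C \sum_i \xi_i$, for two candidate hyperplanes on the same data. The data set, the two candidates and the function names are hypothetical choices made for this example: the "narrow" hyperplane separates all samples with no slack, while the "wide" one has a larger margin but leaves one sample inside it.

```python
def dot(u, v):
    """Inner product of two equally sized sequences."""
    return sum(a * b for a, b in zip(u, v))


def soft_margin_cost(w, b, X, y, C):
    """0.5 * ||w||^2 + C * sum of slacks, with slack = max(0, 1 - y (w.x + b))."""
    slacks = [max(0.0, 1.0 - yi * (dot(w, xi) + b)) for xi, yi in zip(X, y)]
    return 0.5 * dot(w, w) + C * sum(slacks)


# Hypothetical 1-D data: one negative sample and two positive ones.
X = [[-2.0], [0.5], [2.0]]
y = [-1, 1, 1]

narrow = ([0.8], 0.6)  # margin width 2 / 0.8 = 2.5, no sample violates it
wide = ([0.5], 0.0)    # margin width 2 / 0.5 = 4.0, sample x = 0.5 has slack 0.75

for C in (0.1, 10.0):
    cost_narrow = soft_margin_cost(*narrow, X, y, C)
    cost_wide = soft_margin_cost(*wide, X, y, C)
    best = "wide" if cost_wide < cost_narrow else "narrow"
    print(f"C={C}: narrow={cost_narrow:.4f} wide={cost_wide:.4f} -> {best}")
```

In this example the wide hyperplane is cheaper for the small value of $C$ and the narrow one for the large value, which matches the qualitative behaviour described above.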
Each sample $\mathbf{x}_i$ can fall into one of three possible states:
- it may lie beyond the margin, $y_i (\mathbf{w} \cdot \mathbf{x}_i + b) > 1$, and therefore not contribute to the cost function;
- it may lie on the margin, $y_i (\mathbf{w} \cdot \mathbf{x}_i + b) = 1$, not participating directly in the minimization but only as a support vector;
- finally, it may fall within the margin and be penalized according to how much it deviates from the hard constraints.
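The Lagrangian of the system (4.27), with the constraints introduced by the variables $\xi_i$, is

$$ L(\mathbf{w}, b, \boldsymbol{\xi}, \boldsymbol{\alpha}, \boldsymbol{\mu}) = \frac{1}{2} \|\mathbf{w}\|^2 + C \sum_i \xi_i - \sum_i \alpha_i \left[ y_i (\mathbf{w} \cdot \mathbf{x}_i + b) - 1 + \xi_i \right] - \sum_i \mu_i \xi_i \tag{4.28} $$

Since the number of constraints has increased, there are now two sets of dual variables, $\alpha_i \ge 0$ and $\mu_i \ge 0$.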
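The stationarity conditions can be written out explicitly. The sketch below assumes the standard soft-margin Lagrangian, with multipliers $\alpha_i \ge 0$ attached to the margin constraints and $\mu_i \ge 0$ attached to $\xi_i \ge 0$; the symbol names are a notational assumption made here.

```latex
\begin{aligned}
\frac{\partial L}{\partial \mathbf{w}} = 0
  &\;\Rightarrow\; \mathbf{w} = \sum_i \alpha_i y_i \mathbf{x}_i \\
\frac{\partial L}{\partial b} = 0
  &\;\Rightarrow\; \sum_i \alpha_i y_i = 0 \\
\frac{\partial L}{\partial \xi_i} = 0
  &\;\Rightarrow\; C - \alpha_i - \mu_i = 0
\end{aligned}
```

Under this assumption, every term in $\xi_i$ is multiplied by $C - \alpha_i - \mu_i$ and therefore vanishes after substitution, which leaves

```latex
\max_{\boldsymbol{\alpha}} \; \sum_i \alpha_i
  - \frac{1}{2} \sum_i \sum_j \alpha_i \alpha_j y_i y_j \, \mathbf{x}_i \cdot \mathbf{x}_j
\qquad \text{subject to} \quad \sum_i \alpha_i y_i = 0
```

while $\mu_i = C - \alpha_i \ge 0$ together with $\alpha_i \ge 0$ confines each $\alpha_i$ to the interval $[0, C]$.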
The remarkable result is that, once the derivatives are set to zero, the dual formulation of (4.28) has exactly the same form as the dual of the Hard Margin case: the variables $\xi_i$ do not appear in the dual formulation, and the only difference between the Hard Margin case and the Soft Margin case lies in the constraint on the parameters $\alpha_i$, which in this case are limited to

$$ 0 \le \alpha_i \le C \tag{4.29} $$

instead of being subject to the simple inequality $\alpha_i \ge 0$. The significant advantage of this formulation lies precisely in the simplicity of the constraints and in the fact that it reduces the Hard Margin case to a particular case ($C = \infty$) of the Soft Margin. The constant $C$ serves as an upper bound on the values that the $\alpha_i$ can assume.
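To make the role of the box constraint concrete, the following sketch solves the dual on a tiny data set with a pairwise coordinate update in the style of SMO. This solver is a technique swapped in for illustration and is not described in the text; the function name, the data, the tolerance and the rule used to recover $b$ (midpoint of the interval allowed by the optimality conditions) are all assumptions. It uses a linear kernel and expects both classes to be present.

```python
def dot(u, v):
    """Inner product of two equally sized sequences."""
    return sum(a * b for a, b in zip(u, v))


def solve_soft_margin_dual(X, y, C, tol=1e-9, max_iter=10_000):
    """Maximise the soft-margin dual with 0 <= alpha_i <= C, sum alpha_i y_i = 0."""
    n, dim = len(X), len(X[0])
    alpha = [0.0] * n
    for _ in range(max_iter):
        # Primal weights implied by the current multipliers.
        w = [sum(alpha[k] * y[k] * X[k][d] for k in range(n)) for d in range(dim)]
        # F[t] = y_t - w.x_t; at the optimum b lies between max(F[up]) and min(F[low]).
        F = [y[t] - dot(w, X[t]) for t in range(n)]
        up = [t for t in range(n)
              if (y[t] > 0 and alpha[t] < C) or (y[t] < 0 and alpha[t] > 0)]
        low = [t for t in range(n)
               if (y[t] < 0 and alpha[t] < C) or (y[t] > 0 and alpha[t] > 0)]
        i = max(up, key=lambda t: F[t])
        j = min(low, key=lambda t: F[t])
        if F[i] - F[j] < tol:  # no violating pair left
            break
        # Unconstrained optimal step along the direction (y_i, -y_j).
        curvature = dot(X[i], X[i]) + dot(X[j], X[j]) - 2.0 * dot(X[i], X[j])
        step = (F[i] - F[j]) / max(curvature, 1e-12)
        # Clip the step so that both multipliers stay inside the box [0, C].
        room_i = C - alpha[i] if y[i] > 0 else alpha[i]
        room_j = alpha[j] if y[j] > 0 else C - alpha[j]
        step = min(step, room_i, room_j)
        alpha[i] += y[i] * step
        alpha[j] -= y[j] * step
    else:
        raise RuntimeError("no convergence within max_iter iterations")
    b = (F[i] + F[j]) / 2.0
    return alpha, w, b


# Hypothetical 1-D data: one negative sample and two positive ones.
X = [[-2.0], [0.5], [2.0]]
y = [-1, 1, 1]

for C in (10.0, 0.1):
    alpha, w, b = solve_soft_margin_dual(X, y, C)
    print(f"C={C}: alpha={alpha} w={w} b={b}")
```

On this data a large $C$ leaves the multipliers below the bound and the result coincides with the Hard Margin hyperplane, whereas with $C = 0.1$ the two active multipliers are clipped at $C$ and the margin widens.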
Paolo Medici
2025-10-22