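L1 and L2 regularization add a term to the cost function that penalizes certain parameter configurations. Regularizing, for example, the cost function
\[ S(\boldsymbol{\beta}) \tag{4.96} \]
means adding a term, which is a function solely of $\boldsymbol{\beta}$, in order to obtain a new cost function of the form
\[ S'(\boldsymbol{\beta}) = S(\boldsymbol{\beta}) + \lambda R(\boldsymbol{\beta}) \tag{4.97} \]
with $R(\boldsymbol{\beta})$ being a regularizing function and $\lambda$ a weight that sets the strength of the penalty.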
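As a minimal sketch of this construction, the Python snippet below wraps an arbitrary cost into a regularized one by adding a weighted penalty that depends only on the parameters. The names `regularized_cost`, `data_cost`, `penalty` and `lam` are illustrative, and both the least-squares data cost and the squared-magnitude penalty are assumptions chosen only to make the example concrete.

```python
def regularized_cost(data_cost, penalty, lam):
    """Return a new cost: data_cost(beta) + lam * penalty(beta)."""
    def cost(beta):
        return data_cost(beta) + lam * penalty(beta)
    return cost


# Hypothetical data cost: least-squares fit of y = beta[0] * x + beta[1]
xs = [0.0, 1.0, 2.0]
ys = [1.0, 3.0, 5.0]

def data_cost(beta):
    return sum((y - (beta[0] * x + beta[1])) ** 2 for x, y in zip(xs, ys))

# Hypothetical penalty: it depends only on the parameters, not on the data
def penalty(beta):
    return sum(b * b for b in beta)

cost = regularized_cost(data_cost, penalty, lam=0.1)
beta = [2.0, 1.0]
unregularized = data_cost(beta)  # the fit is exact here, so the data term vanishes
regularized = cost(beta)         # only the penalty contributes: 0.1 * (4 + 1)
```

Keeping the penalty as a separate function of the parameters alone makes it easy to swap one regularizer for another without touching the data term.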
A widely used regularization function is
\[ R(\boldsymbol{\beta}) = \sum_i |\beta_i|^p \tag{4.98} \]
where $\beta_i$ are the individual parameters. Common values for $p$ are $1$ or $2$ (hence the names L1 and L2 regularization). When $p = 2$, this regularization is also referred to in the literature as weight decay. This type of regularization function penalizes parameters with excessively large magnitudes.
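The sketch below illustrates, under stated assumptions, why the L2 case is called weight decay. It assumes a penalty of the form "sum of $|\beta_i|^p$" and plain gradient descent; the function names, the step size `eta` and the regularization weight `lam` are hypothetical and are not taken from the text.

```python
def lp_penalty(beta, p):
    """Sum of |beta_i|**p: p=1 gives the L1 penalty, p=2 the L2 penalty."""
    return sum(abs(b) ** p for b in beta)


def l2_gradient_step(beta, grad_data, eta, lam):
    """One gradient-descent step on data_cost + lam * sum(beta_i**2).

    The gradient of the penalty is 2 * lam * beta_i, so each parameter is
    shrunk by the factor (1 - 2 * eta * lam) before the usual data-gradient
    update is applied: this shrinkage is the "decay".
    """
    return [(1.0 - 2.0 * eta * lam) * b - eta * g
            for b, g in zip(beta, grad_data)]


beta = [3.0, -4.0]
l1 = lp_penalty(beta, 1)  # |3| + |-4|
l2 = lp_penalty(beta, 2)  # 9 + 16

# With a zero data gradient the step only shrinks the weights toward zero.
shrunk = l2_gradient_step(beta, grad_data=[0.0, 0.0], eta=0.1, lam=0.5)
```

With these values every parameter is multiplied by 0.9 at each step, so large weights lose the most in absolute terms, which matches the stated effect of penalizing parameters with large values.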
Paolo Medici
2025-10-22