Optimization Methods

Let us now consider the generic problem of unconstrained optimization for model fitting, applicable, for example, to classification problems in computer vision. The considerations expressed in this section refer to the least-squares case but can be extended to a generic loss function.

Let $\mathbf{z}$ be the dataset involved in the fitting, consisting of pairs $(\mathbf{x}_i, y_i)$, each composed of an arbitrary input $\mathbf{x}_i$ and the corresponding output $y_i$. Let $\ell (\hat{y}, y)$ be the cost function (loss function) that measures the quality of the estimate $\hat{y}$ of $y$. The objective is to find the weights $\boldsymbol\beta$, which parameterize the function $f(\mathbf{x}; \boldsymbol\beta)$, that minimize a cost function $S(\boldsymbol\beta)$:

\begin{displaymath}
S( \boldsymbol\beta) = \int \ell(\mathbf{z} ; \boldsymbol\beta) \, d\mathbf{z} \qquad S( \boldsymbol\beta) = \sum_{i=1}^{n} \ell_i(\boldsymbol\beta)
\end{displaymath} (3.26)

in the continuous and in the discrete case respectively, having defined $\ell_i(\boldsymbol\beta) = \ell \left( f_i(\mathbf{x}_i; \boldsymbol\beta), y_i \right)$. For simplicity, we will always refer to the second, discrete case when describing the cost function.
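
As a minimal sketch of how the discrete form of (3.26) is evaluated in practice (the names \texttt{f}, \texttt{loss} and the arguments are illustrative placeholders, not taken from the text), in Python:

\begin{verbatim}
import numpy as np

def cost(beta, X, y, f, loss):
    # Discrete cost S(beta) = sum_i loss(f(x_i; beta), y_i), cf. (3.26)
    return sum(loss(f(x_i, beta), y_i) for x_i, y_i in zip(X, y))
\end{verbatim}

Any per-sample loss $\ell$ can be plugged in; the quadratic loss introduced next is the most common choice.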

In the case of additive Gaussian noise, the maximum likelihood estimate is obtained with the quadratic loss function of equation (3.6):

\begin{displaymath}
\ell_i(\boldsymbol\beta) = r_i^2 (\boldsymbol\beta) = \left( y_i - f_i(\mathbf{x}_i ; \boldsymbol\beta) \right)^2
\end{displaymath} (3.27)
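
As an illustration (assuming, purely for this example, a model linear in the parameters, $f_i(\mathbf{x}_i ; \boldsymbol\beta) = \mathbf{x}_i^\top \boldsymbol\beta$), the quadratic cost takes the linear least-squares form

\begin{displaymath}
S(\boldsymbol\beta) = \sum_{i=1}^{n} \left( y_i - \mathbf{x}_i^\top \boldsymbol\beta \right)^2 = \left\| \mathbf{y} - \mathbf{X} \boldsymbol\beta \right\|^2
\end{displaymath}

whose minimum is available in closed form through the normal equations $\mathbf{X}^\top \mathbf{X} \, \boldsymbol\beta = \mathbf{X}^\top \mathbf{y}$. For a general nonlinear $f_i$ no such closed form exists, which motivates what follows.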

In practical applications, it is almost never possible to obtain the minimum of the cost function in closed form; it is therefore necessary to resort to appropriate iterative methods which, starting from an initial estimate and moving in suitable directions $\boldsymbol\delta$, gradually approach the minimum of the objective function.
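
A minimal sketch of such an iterative scheme, assuming plain gradient descent on the quadratic cost (3.27) with a fixed step length $\alpha$ (the data, step length and iteration count are illustrative; the direction used is $\boldsymbol\delta = -\nabla S(\boldsymbol\beta)$):

\begin{verbatim}
import numpy as np

def gradient_descent(X, y, beta0, alpha=0.01, iters=1000):
    # Minimize S(beta) = sum_i (y_i - x_i^T beta)^2 by repeatedly
    # moving along the descent direction delta = -grad S(beta).
    beta = beta0.astype(float).copy()
    for _ in range(iters):
        r = y - X @ beta          # residuals r_i(beta)
        delta = 2.0 * X.T @ r     # -grad S(beta) = 2 X^T r
        beta += alpha * delta     # step in the direction delta
    return beta

# Illustrative data: y is approximately 1 + 2 x (bias column in X).
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, 3.1, 4.9, 7.0])
print(gradient_descent(X, y, beta0=np.zeros(2)))
\end{verbatim}

More refined methods (e.g. Gauss-Newton or Levenberg-Marquardt for least squares) differ only in how the direction $\boldsymbol\delta$ and the step length are chosen at each iteration.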


