Gauss-Newton
The methods discussed so far allow considerable freedom in the choice of one loss function over another. In the frequent practical case where the cost function is a sum of squared residuals, a further optimization can be made to the Newton method, avoiding the cumbersome computation of the Hessian. In this case, the loss function takes the form already seen previously,

$$f(\mathbf{x}) = \frac{1}{2}\left\|\mathbf{r}(\mathbf{x})\right\|^{2} = \frac{1}{2}\sum_{i} r_{i}(\mathbf{x})^{2} \tag{3.40}$$

The factor $\frac{1}{2}$ in the cost function serves only to provide a more compact expression of the derivatives.
With this cost function, the gradient and the Hessian are expressed as

$$\nabla f(\mathbf{x}) = \mathbf{J}(\mathbf{x})^{\top}\mathbf{r}(\mathbf{x}) \qquad \nabla^{2} f(\mathbf{x}) = \mathbf{J}(\mathbf{x})^{\top}\mathbf{J}(\mathbf{x}) + \sum_{i} r_{i}(\mathbf{x})\,\nabla^{2} r_{i}(\mathbf{x}) \tag{3.41}$$

where $\mathbf{J}(\mathbf{x})$ is the Jacobian of the residual vector $\mathbf{r}(\mathbf{x})$. When the parameters are close to the exact solution, the residuals are small, and the Hessian can be approximated by the first term of the expression alone, namely

$$\nabla^{2} f(\mathbf{x}) \approx \mathbf{J}(\mathbf{x})^{\top}\mathbf{J}(\mathbf{x}) \tag{3.42}$$

Under these conditions, both the gradient and the Hessian of the cost function $f$ can be expressed solely in terms of the Jacobian of the residuals $r_{i}$.
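The quality of the approximation (3.42) is easy to probe numerically. The following sketch (a hypothetical one-parameter exponential model, not taken from the text) compares $\mathbf{J}^{\top}\mathbf{J}$ with the full Hessian of (3.41), once near the solution and once far from it:

```python
import numpy as np

# Hypothetical model (assumption for illustration): r_i(x) = y_i - exp(-x t_i).
t = np.array([0.0, 0.5, 1.0, 2.0])
y = np.exp(-1.3 * t)          # data generated with the "true" parameter x = 1.3

def residuals(x):
    return y - np.exp(-x * t)

def jacobian(x):
    # dr_i/dx = t_i exp(-x t_i); a single column since there is one parameter
    return (t * np.exp(-x * t)).reshape(-1, 1)

def full_hessian(x):
    # Equation (3.41): J^T J plus the residual-weighted second derivatives
    J = jacobian(x)
    d2r = -(t ** 2) * np.exp(-x * t)      # d^2 r_i / dx^2
    return J.T @ J + np.sum(residuals(x) * d2r)

for x in (1.25, 0.3):                      # near and far from the solution
    JtJ = jacobian(x).T @ jacobian(x)
    print(x, JtJ.item(), full_hessian(x).item())
# Near x = 1.3 the two values almost coincide; far away they diverge.
```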
The approximated expression for the Hessian can be incorporated into equation (3.34), yielding the Gauss-Newton update

$$\mathbf{x}_{k+1} = \mathbf{x}_{k} - \left(\mathbf{J}^{\top}\mathbf{J}\right)^{-1}\mathbf{J}^{\top}\mathbf{r} \tag{3.43}$$

This, as in the Newton case, is a linear minimization problem, and the increment $\Delta\mathbf{x} = \mathbf{x}_{k+1} - \mathbf{x}_{k}$ can be obtained by solving the normal equations

$$\mathbf{J}^{\top}\mathbf{J}\,\Delta\mathbf{x} = -\mathbf{J}^{\top}\mathbf{r} \tag{3.44}$$
The significance of the normal equations is geometric: the minimum is achieved when the residual $\mathbf{r}$ becomes orthogonal to the column space of $\mathbf{J}$.
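As a concrete illustration, here is a minimal Gauss-Newton loop implementing the update (3.43) through the normal equations (3.44); the model, data, and function names are illustrative assumptions, not part of the text:

```python
import numpy as np

def gauss_newton(residual_fn, jacobian_fn, x0, iters=20, tol=1e-10):
    """Iterate the Gauss-Newton update (3.43), solving the normal equations (3.44)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        r = residual_fn(x)
        J = jacobian_fn(x)
        # Normal equations (3.44): (J^T J) dx = -J^T r
        dx = np.linalg.solve(J.T @ J, -J.T @ r)
        x = x + dx
        if np.linalg.norm(dx) < tol:
            break
    return x

# Hypothetical two-parameter model y = a exp(b t), fitted to noise-free data.
t = np.linspace(0.0, 1.0, 30)
y = 2.0 * np.exp(-1.5 * t)

def r(x):
    return y - x[0] * np.exp(x[1] * t)

def J(x):
    e = np.exp(x[1] * t)
    # dr/da = -exp(b t), dr/db = -a t exp(b t)
    return np.column_stack([-e, -x[0] * t * e])

print(gauss_newton(r, J, x0=[1.0, -1.0]))   # converges to about [2.0, -1.5]
```

In practice, when $\mathbf{J}^{\top}\mathbf{J}$ is ill-conditioned, the linear step is usually solved through a QR or Cholesky factorization of the least-squares problem $\mathbf{J}\,\Delta\mathbf{x} = -\mathbf{r}$ (e.g. with `np.linalg.lstsq`) rather than by forming the normal equations explicitly.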
In the particular case where the residual function is written as

$$\mathbf{r}(\mathbf{x}) = \mathbf{y} - \mathbf{h}(\mathbf{x}) \tag{3.45}$$

similar to the form of equation (3.6), it is possible to use $\mathbf{J}_{h}$, the Jacobian of $\mathbf{h}$, in place of the Jacobian of $\mathbf{r}$:

$$\mathbf{J}_{h}^{\top}\mathbf{J}_{h}\,\Delta\mathbf{x} = \mathbf{J}_{h}^{\top}\mathbf{r} \tag{3.46}$$

having observed that the derivatives of $\mathbf{r}$ and $\mathbf{h}$ are equal up to a sign (footnote 3.2).
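A minimal sketch, continuing the hypothetical exponential model above, confirms that the step computed from $\mathbf{J}_{h}$ through (3.46) matches the one computed from the Jacobian of $\mathbf{r}$ through (3.44):

```python
import numpy as np

# Same hypothetical model as above, now differentiating h instead of r.
t = np.linspace(0.0, 1.0, 30)
y = 2.0 * np.exp(-1.5 * t)

def h(x):
    return x[0] * np.exp(x[1] * t)

def J_h(x):
    e = np.exp(x[1] * t)
    return np.column_stack([e, x[0] * t * e])   # Jacobian of h, not of r

x = np.array([1.0, -1.0])
r = y - h(x)                                    # residual of equation (3.45)
Jh = J_h(x)
dx_h = np.linalg.solve(Jh.T @ Jh, Jh.T @ r)     # equation (3.46)
Jr = -Jh                                        # since r = y - h(x)
dx_r = np.linalg.solve(Jr.T @ Jr, -Jr.T @ r)    # equation (3.44)
print(np.allclose(dx_h, dx_r))                  # True: the signs cancel
```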
Footnotes
- 3.2
- Clearly, the derivatives coincide when a residual of the type $\mathbf{r}(\mathbf{x}) = \mathbf{h}(\mathbf{x}) - \mathbf{y}$ is chosen.