It is easy to assume that the notion of the average of a set of numbers is familiar to everyone, at least intuitively. This section gives a brief summary, states the definitions, and highlights some interesting aspects.
For $n$ samples $x_1, \ldots, x_n$ of an observed quantity $x$, the sample mean is denoted as $\bar{x}$ and is given by
$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i \tag{2.1}$$
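As a minimal sketch of the sample mean, the snippet below averages a short list of made-up measurements. The function name `sample_mean` and the data are illustrative assumptions, not part of the original text.

```python
def sample_mean(samples):
    """Arithmetic mean: sum of the samples divided by their count."""
    if not samples:
        raise ValueError("sample_mean requires at least one sample")
    return sum(samples) / len(samples)


# Made-up measurements of some observed quantity.
readings = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]
mean_value = sample_mean(readings)  # 30.0 / 6 = 5.0
print(mean_value)
```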
If an unlimited number of values of $x$ could be sampled, the sample mean $\bar{x}$ would converge to the theoretical expected value. This is known as the Law of Large Numbers.
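The convergence can be observed numerically. The sketch below simulates rolls of a fair six-sided die, whose theoretical expected value is 3.5, and prints the sample mean for increasing numbers of rolls. The die, the seed, and the function name `mean_of_die_rolls` are assumptions chosen only for illustration.

```python
import random


def mean_of_die_rolls(n_rolls, seed=0):
    """Sample mean of n_rolls simulated rolls of a fair six-sided die."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0
    for _ in range(n_rolls):
        total += rng.randint(1, 6)  # uniform integer in [1, 6]
    return total / n_rolls


# The sample mean should drift toward 3.5 as the number of rolls grows.
for n in (10, 1_000, 100_000):
    print(n, mean_of_die_rolls(n))
```

With only a few rolls the sample mean can be far from 3.5, while for large counts the deviation is typically small.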
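The expected value (expectation, mean) of a random variable $X$ is denoted by $E[X]$ or $\mu_X$. For discrete random variables with probability mass function $p_X$ it can be calculated using the formula
$$E[X] = \sum_{i} x_i \, p_X(x_i) \tag{2.2}$$
and for continuous random variables with probability density function $f_X$ using
$$E[X] = \int_{-\infty}^{+\infty} x \, f_X(x) \, dx \tag{2.3}$$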
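For the discrete case, the expected value is just a probability-weighted sum. The following sketch evaluates it exactly for a fair die using rational arithmetic. The dictionary representation of the probability mass function and the name `expected_value` are assumptions made for this example.

```python
from fractions import Fraction


def expected_value(pmf):
    """Expected value of a discrete random variable.

    pmf maps each possible value to its probability.
    """
    if sum(pmf.values()) != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(x * p for x, p in pmf.items())


# Fair six-sided die: each face has probability 1/6.
die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}
die_mean = expected_value(die_pmf)  # (1 + 2 + ... + 6) / 6 = 7/2
print(die_mean)
```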
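Now we introduce the concept of the mean of a function of a random variable. For a function $g$ applied to a continuous random variable $X$ with density $f_X$, the mean is
$$E[g(X)] = \int_{-\infty}^{+\infty} g(x) \, f_X(x) \, dx \tag{2.4}$$
There are certain functions whose mean has a significant meaning. When $g(X) = X$ we refer to first-order statistics (first statistical moment), and generally when $g(X) = X^k$ we discuss statistics of $k$-th order. The mean is therefore the first-order statistic, and another statistic of particular interest is the second-order moment:
$$E[X^2] = \int_{-\infty}^{+\infty} x^2 \, f_X(x) \, dx \tag{2.5}$$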
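The same idea carries over to discrete variables by replacing the integral with a sum. As a hedged sketch, the helper below (the name `raw_moment` is an assumption) computes the moment of a chosen order for a fair die, so the first-order and second-order moments can be compared directly.

```python
from fractions import Fraction


def raw_moment(pmf, k):
    """Moment of order k of a discrete random variable: E[X**k]."""
    return sum((x ** k) * p for x, p in pmf.items())


die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}

first_moment = raw_moment(die_pmf, 1)   # the mean, 21/6 = 7/2
second_moment = raw_moment(die_pmf, 2)  # (1 + 4 + 9 + 16 + 25 + 36) / 6 = 91/6
print(first_moment, second_moment)
```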
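The variance is defined as the expected value of the square of the random variable $X$ after subtracting its mean, which corresponds to the second moment of the function $g(X) = X - E[X]$:
$$\operatorname{var}(X) = \sigma_X^2 = E\left[(X - E[X])^2\right] \tag{2.6}$$
Expanding the square and using the linearity of the expectation gives the equivalent form
$$\operatorname{var}(X) = E[X^2] - E[X]^2 \tag{2.7}$$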
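The variance can be computed either directly from its definition, as the mean squared deviation from the mean, or as the second-order moment minus the squared mean. The sketch below checks on a fair die that both routes agree. The function names are assumptions for illustration, and exact fractions are used so the comparison is not affected by rounding.

```python
from fractions import Fraction


def mean(pmf):
    """Expected value of a discrete random variable."""
    return sum(x * p for x, p in pmf.items())


def variance_from_definition(pmf):
    """Expected squared deviation from the mean."""
    mu = mean(pmf)
    return sum(((x - mu) ** 2) * p for x, p in pmf.items())


def variance_from_moments(pmf):
    """Second-order moment minus the square of the mean."""
    second_moment = sum((x ** 2) * p for x, p in pmf.items())
    return second_moment - mean(pmf) ** 2


die_pmf = {face: Fraction(1, 6) for face in range(1, 7)}
v_def = variance_from_definition(die_pmf)
v_mom = variance_from_moments(die_pmf)  # 91/6 - 49/4 = 35/12
print(v_def, v_mom)
```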
The square root of the variance is known as the standard deviation and has the advantage of having the same unit of measurement as the observed quantity:
$$\sigma_X = \sqrt{\operatorname{var}(X)} \tag{2.8}$$
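To make the point about units concrete, the sketch below treats a few made-up lengths in metres: the variance comes out in square metres, while the standard deviation is again in metres. It uses `statistics.pvariance` from the Python standard library, which divides by the number of samples; the data are an assumption chosen to give round numbers.

```python
import math
import statistics

# Made-up lengths, in metres.
lengths_m = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

variance_m2 = statistics.pvariance(lengths_m)  # in square metres
std_dev_m = math.sqrt(variance_m2)             # back in metres

print(variance_m2, std_dev_m)
```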
Let's extend the concepts discussed so far to the multivariable case, which can be viewed as an extension to multiple dimensions, where each dimension is associated with a different variable.
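The covariance matrix is the extension to multiple dimensions (or multiple variables) of the concept of variance. For a random vector $X = (X_1, \ldots, X_d)^\top$ it is constructed element by element as
$$\Sigma_{ij} = \operatorname{cov}(X_i, X_j) = E\left[(X_i - \mu_i)(X_j - \mu_j)\right] \tag{2.9}$$
where $\mu_i = E[X_i]$.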
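In practice the covariance matrix is usually estimated from samples. The following sketch builds it entry by entry from a list of observation vectors, replacing each expectation with an average over the samples. The function name, the choice of dividing by the number of samples (rather than by one less), and the data are assumptions for illustration.

```python
def covariance_matrix(samples):
    """Covariance matrix estimated from a list of observation vectors.

    Each expectation is replaced by an average over the samples
    (division by the number of samples).
    """
    if not samples:
        raise ValueError("at least one sample is required")
    n = len(samples)
    dim = len(samples[0])
    means = [sum(row[j] for row in samples) / n for j in range(dim)]
    return [
        [
            sum((row[i] - means[i]) * (row[j] - means[j]) for row in samples) / n
            for j in range(dim)
        ]
        for i in range(dim)
    ]


# Four made-up observations of a two-dimensional quantity.
data = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]]
sigma = covariance_matrix(data)
print(sigma)
```

The diagonal entries are the variances of the individual variables, and the matrix is symmetric because the covariance does not depend on the order of its two arguments.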
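The possible ways to denote the covariance matrix are
$$\Sigma = E\left[(X - E[X])(X - E[X])^\top\right] = \operatorname{var}(X) = \operatorname{cov}(X) = \operatorname{cov}(X, X) \tag{2.10}$$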
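The notation for the cross-covariance between two random vectors $X$ and $Y$ is, instead, unambiguous:
$$\operatorname{cov}(X, Y) = E\left[(X - E[X])(Y - E[Y])^\top\right] \tag{2.11}$$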
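Unlike the covariance matrix, the cross-covariance between two different vectors need not be square: it has one row per component of the first vector and one column per component of the second. The sketch below estimates it from paired samples. The function names, the division by the number of samples, and the data are assumptions for illustration.

```python
def column_means(rows):
    """Mean of each column of a list of equal-length rows."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]


def cross_covariance(xs, ys):
    """Cross-covariance estimated from paired observations of two vectors.

    Entry (i, j) is the covariance between component i of the first
    vector and component j of the second.
    """
    if not xs or len(xs) != len(ys):
        raise ValueError("xs and ys must be non-empty and paired")
    n = len(xs)
    mx, my = column_means(xs), column_means(ys)
    return [
        [
            sum((x[i] - mx[i]) * (y[j] - my[j]) for x, y in zip(xs, ys)) / n
            for j in range(len(my))
        ]
        for i in range(len(mx))
    ]


# Made-up paired observations: a 2-dimensional vector and a 1-dimensional one.
xs = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]]
ys = [[2.0], [4.0], [6.0], [8.0]]
cross = cross_covariance(xs, ys)  # 2 rows, 1 column
print(cross)
```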
The covariance matrix, which describes the relationships between variables and consequently how correlated or uncorrelated they are with each other, is also referred to as the scatter matrix or dispersion matrix. The inverse of the covariance matrix is known as the concentration matrix or precision matrix.
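As a small illustration of the precision matrix, the sketch below inverts a made-up 2x2 covariance matrix in closed form and checks that the product with the original is the identity. The helper names and the numbers are assumptions; for larger matrices a linear algebra library would normally be used instead.

```python
def invert_2x2(m):
    """Closed-form inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular; the precision matrix does not exist")
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul_2x2(p, q):
    """Product of two 2x2 matrices."""
    return [
        [sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


# Made-up covariance matrix of two correlated variables.
covariance = [[1.25, 0.75], [0.75, 1.25]]
precision = invert_2x2(covariance)
identity_check = matmul_2x2(covariance, precision)
print(precision)
print(identity_check)
```

A singular covariance matrix, which arises for instance when one variable is an exact linear function of another, has no inverse, so the sketch raises an error in that case.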
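The correlation matrix $\operatorname{corr}(X, Y)$ is the cross-covariance matrix normalized with respect to the variances found on the diagonals of the two covariance matrices. Element by element:
$$\operatorname{corr}(X, Y)_{ij} = \frac{\operatorname{cov}(X_i, Y_j)}{\sqrt{\operatorname{var}(X_i) \, \operatorname{var}(Y_j)}} \tag{2.12}$$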
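The normalization can be sketched as follows: each entry of the cross-covariance is divided by the square root of the product of the two corresponding variances, which yields values between -1 and 1. The function name and the input numbers are assumptions made up for this example.

```python
import math


def correlation_matrix(cross_cov, var_x, var_y):
    """Normalize a cross-covariance matrix into a correlation matrix.

    var_x[i] and var_y[j] are the variances of the individual components.
    """
    return [
        [
            cross_cov[i][j] / math.sqrt(var_x[i] * var_y[j])
            for j in range(len(var_y))
        ]
        for i in range(len(var_x))
    ]


# Made-up inputs: a 2x1 cross-covariance and the matching variances.
cross_cov = [[2.5], [1.5]]
var_x = [1.25, 1.25]
var_y = [5.0]

corr = correlation_matrix(cross_cov, var_x, var_y)
print(corr)
```

In this example the first entry equals 1, meaning the first component is perfectly linearly related to the other variable, while the second entry shows a weaker positive correlation.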
Paolo Medici