Mean and Variance

The notion of the average of a set of numbers is familiar to everyone, at least intuitively. This section provides a brief summary, gives the formal definitions, and highlights some noteworthy aspects.

For $n$ samples of an observed quantity $x$, the sample mean is denoted as $\bar{x}$ and is given by

\begin{displaymath}
\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i
\end{displaymath} (2.1)

The sample mean, by definition, is an empirical quantity.
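
As a minimal numerical sketch (assuming Python with NumPy and purely illustrative sample values), Eq. (2.1) can be computed directly:

\begin{verbatim}
import numpy as np

# Hypothetical samples of the observed quantity x (illustrative values only).
x = np.array([2.3, 1.9, 2.7, 2.1, 2.5])

# Sample mean as in Eq. (2.1): (1/n) * sum of the samples.
x_bar = x.sum() / x.size
print(x_bar)        # 2.3
print(np.mean(x))   # same result with NumPy's built-in mean
\end{verbatim}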

If an infinite number of values of $x$ could be sampled, the sample mean $\bar{x}$ would converge to the theoretical expected value. This is known as the Law of Large Numbers.
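
A small simulation sketch (again assuming NumPy, with a fair six-sided die whose expected value is 3.5 as a hypothetical example) illustrates this convergence:

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)

# Roll a fair six-sided die n times and compare the sample mean
# with the theoretical expected value of 3.5.
for n in (10, 1_000, 100_000):
    rolls = rng.integers(1, 7, size=n)
    print(n, rolls.mean())
# The sample mean approaches 3.5 as n grows (Law of Large Numbers).
\end{verbatim}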

The expected value (expectation, mean) of a random variable $X$ is denoted by $\E[X]$ or $\mu$ and can be calculated for discrete random variables using the formula

\begin{displaymath}
\E[X] = \mu_x = \sum_{i=-\infty}^{+\infty} x_i p_X(x_i)
\end{displaymath} (2.2)

and for continuous variables using
\begin{displaymath}
\E[X] = \mu_x = \int_{-\infty}^{+\infty} x p_X(x) dx
\end{displaymath} (2.3)

given knowledge of the probability distribution $p_X(x)$ (a probability mass function in the discrete case, a probability density in the continuous case).
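
As a sketch of the discrete case (2.2), again using the fair die as a hypothetical example:

\begin{verbatim}
import numpy as np

# Fair six-sided die: values x_i and probabilities p_X(x_i).
x = np.arange(1, 7)
p = np.full(6, 1 / 6)

# Expected value as in Eq. (2.2): sum of x_i * p_X(x_i).
mu = np.sum(x * p)
print(mu)   # 3.5
\end{verbatim}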

We now introduce the concept of the mean of a function of a random variable.

Definition 4   Let $X$ be a random variable with probability function $p_X(x)$ and $g(x)$ a generic measurable function of $x$. If the sum (discrete case) or the integral (continuous case)
\begin{displaymath}
\E[g(X)] = \sum_{i=-\infty}^{+\infty} g(x_i) p_X(x_i) \qquad \E[g(X)] = \int_{-\infty}^{+\infty} g(x) p_X(x) dx
\end{displaymath} (2.4)

is absolutely convergent, it is referred to as the "mean value of the random variable $Y = g(X)$".
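
For instance, with the same hypothetical die and $g(x)=x^{2}$, Eq. (2.4) gives the mean of $Y = g(X)$ without ever deriving the distribution of $Y$; a minimal sketch:

\begin{verbatim}
import numpy as np

x = np.arange(1, 7)       # die outcomes
p = np.full(6, 1 / 6)     # uniform probabilities

# Discrete case of Eq. (2.4) with g(x) = x**2.
e_g = np.sum(x**2 * p)
print(e_g)   # 91/6, approximately 15.167
\end{verbatim}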

Certain functions have means with a particular significance. When $g(x)=x$ we speak of the first-order statistic (first statistical moment), and more generally, when $g(x)=x^{k}$, of the statistic (moment) of order $k$. The mean is therefore the first-order statistic; another statistic of particular interest is the second-order moment:

\begin{displaymath}
\E[X^{2}] = \int_{-\infty}^{+\infty} x^{2} p_X(x) dx
\end{displaymath} (2.5)

This statistic is important because, together with the mean, it allows us to compute the variance of $X$.

The variance is defined as the expected value of the square of the random variable $X$ after subtracting its mean, which corresponds to the second moment of the function $g(X)= X-\E[X]$:

\begin{displaymath}
\text{var}(X) = \sigma^{2}_X = \E[ (X - \E[X])^{2} ]
\end{displaymath} (2.6)

Expanding the square and using the linearity of the expectation ($\E[X]$ is a constant), $\E[(X-\E[X])^{2}] = \E[X^{2}] - 2\E[X]^{2} + \E[X]^{2}$, which gives the simplest and most widely used form of the variance:
\begin{displaymath}
\text{var}(X) = \sigma^{2}_X = \E[X^{2}] - \E[X]^{2}
\end{displaymath} (2.7)
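
A quick numerical check of Eqs. (2.6) and (2.7), with the same hypothetical die, confirms that the two expressions coincide:

\begin{verbatim}
import numpy as np

x = np.arange(1, 7)       # fair die outcomes
p = np.full(6, 1 / 6)

mu = np.sum(x * p)        # E[X]   = 3.5
m2 = np.sum(x**2 * p)     # E[X^2] = 91/6

var_def  = np.sum((x - mu)**2 * p)   # Eq. (2.6): E[(X - E[X])^2]
var_fast = m2 - mu**2                # Eq. (2.7): E[X^2] - E[X]^2
print(var_def, var_fast)             # both 35/12, approximately 2.9167
\end{verbatim}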

The square root of the variance is known as the standard deviation and has the advantage of having the same unit of measurement as the observed quantity:

\begin{displaymath}
\sigma_X = \sqrt{ \text{var}(X) }
\end{displaymath} (2.8)

Let us now extend the concepts discussed so far to the multivariable case, which can be viewed as an extension to multiple dimensions in which each dimension is associated with a different variable.

The covariance matrix $\Sigma$ is the extension to multiple dimensions (or multiple variables) of the concept of variance. It is constructed as

\begin{displaymath}
\Sigma_{ij} =\text{cov}(X_i,X_j)
\end{displaymath} (2.9)

where each element of the matrix contains the covariance between the various components of the random vector $X$. Covariance indicates how the different random variables that make up the vector $X$ are related to each other.

The possible ways to denote the covariance matrix are

\begin{displaymath}
\Sigma = \E \left[ (X - \E[X])(X - \E[X])^{\top} \right] = \text{var}(X) = \text{cov}(X) = \text{cov}(X,X)
\end{displaymath} (2.10)
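
A sketch with NumPy (assuming hypothetical data in which rows are observations and columns are the components of the random vector $X$):

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 3-dimensional random vector X, observed 1000 times
# (rows = observations, columns = components).
X = rng.normal(size=(1000, 3))

# Sample covariance matrix as in Eq. (2.10); rowvar=False because
# the components are stored along the columns.
Sigma = np.cov(X, rowvar=False)
print(Sigma.shape)                  # (3, 3)
print(np.allclose(Sigma, Sigma.T))  # True: cov(X, X) is symmetric
\end{verbatim}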

The notation for the cross-covariance, which generalizes the concept of the covariance matrix, is instead unambiguous:

\begin{displaymath}
\text{cov}(X,Y) = \E \left[ (X - \E[X])(Y - \E[Y])^{\top} \right]
\end{displaymath} (2.11)

The cross-covariance matrix $\boldsymbol\Sigma$ has as element in position $(i,j)$ the covariance between the random variable $X_i$ and the variable $Y_j$:

\begin{displaymath}
\text{cov}(X,Y)_{ij} = \text{cov}(X_i,Y_j) = \E \left[ (X_i - \E[X_i])(Y_j - \E[Y_j]) \right]
\end{displaymath}

The covariance matrix $\text{cov}(X,X)$ is consequently symmetric.
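
A sample version of Eq. (2.11) can be sketched by averaging the outer products of the centered observations (hypothetical data, NumPy assumed):

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical random vectors X (3 components) and Y (2 components),
# observed n times; Y is built to be correlated with X.
n = 1000
X = rng.normal(size=(n, 3))
Y = X[:, :2] + 0.1 * rng.normal(size=(n, 2))

# Center the observations and average the outer products.
Xc = X - X.mean(axis=0)
Yc = Y - Y.mean(axis=0)
cov_XY = Xc.T @ Yc / (n - 1)
print(cov_XY.shape)   # (3, 2): element (i, j) estimates cov(X_i, Y_j)
\end{verbatim}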

The covariance matrix, which describes how the variables are related to one another (and thus how correlated or uncorrelated they are), is also referred to as the dispersion matrix or scatter matrix. The inverse of the covariance matrix is known as the concentration matrix or precision matrix.

The correlation matrix $\mathbf{r}(X,Y)$ is the cross-covariance matrix normalized, element by element, by the standard deviations of the corresponding variables:

\begin{displaymath}
\mathbf{r}(X,Y) = \frac{\text{cov}(X,Y) } {\sqrt{\text{var}(X) \text{var}(Y) } }
\end{displaymath} (2.12)

This matrix has values that always lie within the range $[-1,1]$ or $[-100\%,100\%]$.
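
Continuing the previous sketch, normalizing each entry of the cross-covariance by the standard deviations of the corresponding components reproduces Eq. (2.12), with every entry in $[-1,1]$:

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)

n = 1000
X = rng.normal(size=(n, 3))
Y = X[:, :2] + 0.1 * rng.normal(size=(n, 2))

# Sample cross-covariance, then element-wise normalization (Eq. 2.12).
Xc = X - X.mean(axis=0)
Yc = Y - Y.mean(axis=0)
cov_XY = Xc.T @ Yc / (n - 1)
r = cov_XY / np.outer(X.std(axis=0, ddof=1), Y.std(axis=0, ddof=1))
print(r.min() >= -1, r.max() <= 1)   # True True: entries lie in [-1, 1]
\end{verbatim}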
