Summary statistics describe how a random variable behaves, essentially summarizing its distribution.

Info

The following assumes that we're operating over a continuous space. The equations are essentially the same for discrete random variables if we replace the integrals with summations.

Expected Value

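The expected value of a function $f(x)$ of a random variable $x \sim p(x)$ is given by

$$\mathbb{E}_x[f(x)] = \int f(x)\, p(x)\, dx.$$

If we choose $f(x) = x$, then the expected value is called the mean,

$$\mu = \mathbb{E}_x[x] = \int x\, p(x)\, dx.$$

If $x$ is multivariate, we compute the expectation over each dimension separately.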
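As a minimal sketch of the discrete analogue, where the integral becomes a sum, the snippet below computes expectations for a fair six-sided die. The die and the helper name `expectation` are illustrative assumptions, not part of the original note.

```python
from fractions import Fraction

# Hypothetical example: a fair six-sided die, so p(x) = 1/6 for x = 1..6.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}


def expectation(f, pmf):
    """Discrete E[f(x)]: the sum over x of f(x) * p(x)."""
    return sum(f(x) * p for x, p in pmf.items())


mean = expectation(lambda x: x, pmf)               # f(x) = x gives the mean
second_moment = expectation(lambda x: x * x, pmf)  # f(x) = x^2
print(mean, second_moment)
```

Exact fractions are used so the results can be compared against hand calculations without rounding error.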

Covariance

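If we have two random variables $x$ and $y$, their covariance

$$\text{Cov}[x, y] = \mathbb{E}\big[(x - \mathbb{E}[x])(y - \mathbb{E}[y])\big] = \mathbb{E}[xy] - \mathbb{E}[x]\,\mathbb{E}[y]$$

measures how the two variables jointly deviate from their respective means.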

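The covariance of a variable with itself is called the variance,

$$\mathbb{V}[x] = \text{Cov}[x, x] = \mathbb{E}\big[(x - \mathbb{E}[x])^2\big] = \mathbb{E}[x^2] - \mathbb{E}[x]^2.$$

The square root of the variance is the standard deviation $\sigma(x) = \sqrt{\mathbb{V}[x]}$.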
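To make the definitions concrete, here is a small sketch that computes a covariance, a variance, and a standard deviation from a joint probability table. The joint distribution and the helper names (`expect`, `cov`, `get_x`, `get_y`) are hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from math import sqrt

# Hypothetical joint pmf p(x, y) over four outcomes.
joint = {
    (0, 0): Fraction(2, 5),
    (0, 1): Fraction(1, 10),
    (1, 0): Fraction(1, 10),
    (1, 1): Fraction(2, 5),
}


def expect(f):
    """E[f(x, y)] under the joint pmf."""
    return sum(f(x, y) * p for (x, y), p in joint.items())


def cov(f, g):
    """Cov[f, g] = E[(f - E[f]) (g - E[g])]."""
    mean_f, mean_g = expect(f), expect(g)
    return expect(lambda x, y: (f(x, y) - mean_f) * (g(x, y) - mean_g))


def get_x(x, y):
    return x


def get_y(x, y):
    return y


cov_xy = cov(get_x, get_y)
var_x = cov(get_x, get_x)   # variance is the covariance of x with itself
std_x = sqrt(var_x)         # standard deviation
# Shortcut form: E[xy] - E[x] E[y]
shortcut = expect(lambda x, y: x * y) - expect(get_x) * expect(get_y)
print(cov_xy, var_x, std_x, shortcut)
```

Both forms of the covariance agree here because they are algebraically identical.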

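If $x$ and $y$ are multivariate, we have a slightly modified equation

$$\text{Cov}[x, y] = \mathbb{E}[x y^\top] - \mathbb{E}[x]\,\mathbb{E}[y]^\top.$$

For variance, we get the covariance matrix

$$\mathbb{V}[x] = \text{Cov}[x, x] = \mathbb{E}[x x^\top] - \mathbb{E}[x]\,\mathbb{E}[x]^\top.$$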

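Another formula for variance is given by the law of total variance: for two random variables $x$ and $y$,

$$\mathbb{V}[x] = \mathbb{E}_y\big[\mathbb{V}[x \mid y]\big] + \mathbb{V}_y\big[\mathbb{E}[x \mid y]\big].$$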
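The decomposition of a variance into an expected conditional variance plus the variance of a conditional mean can be checked exactly on a small discrete example. The sketch below assumes a hypothetical joint distribution; all names are illustrative.

```python
from fractions import Fraction

# Hypothetical joint pmf p(x, y); y takes the values 0 and 1.
joint = {
    (0, 0): Fraction(1, 4),
    (2, 0): Fraction(1, 4),
    (1, 1): Fraction(1, 8),
    (5, 1): Fraction(3, 8),
}


def mean_var(pmf):
    """Mean and variance of a pmf given as {value: probability}."""
    m = sum(v * p for v, p in pmf.items())
    return m, sum((v - m) ** 2 * p for v, p in pmf.items())


# Marginals p(x) and p(y).
p_x, p_y = {}, {}
for (x, y), p in joint.items():
    p_x[x] = p_x.get(x, 0) + p
    p_y[y] = p_y.get(y, 0) + p

_, total_var = mean_var(p_x)

# Conditional mean and variance of x for each value of y.
cond = {}
for y0, py in p_y.items():
    cond_pmf = {x: p / py for (x, y), p in joint.items() if y == y0}
    cond[y0] = mean_var(cond_pmf)

# E_y[V[x|y]]: average of the conditional variances.
expected_cond_var = sum(p_y[y] * cond[y][1] for y in p_y)
# V_y[E[x|y]]: variance of the conditional means.
mean_of_means = sum(p_y[y] * cond[y][0] for y in p_y)
var_cond_mean = sum(p_y[y] * (cond[y][0] - mean_of_means) ** 2 for y in p_y)

print(total_var, expected_cond_var + var_cond_mean)
```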

Correlation

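When comparing random variables $x$ and $y$, the scale of their variances affects the covariance. Normalizing by their standard deviations gives us the correlation,

$$\text{corr}[x, y] = \frac{\text{Cov}[x, y]}{\sqrt{\mathbb{V}[x]\,\mathbb{V}[y]}} \in [-1, 1].$$

Another interpretation is that correlation is the covariance of the standardized random variables $\frac{x - \mathbb{E}[x]}{\sigma(x)}$ and $\frac{y - \mathbb{E}[y]}{\sigma(y)}$.

Positive correlation means that if $x$ increases, $y$ is expected to increase. Negative correlation means that if $x$ increases, $y$ is expected to decrease.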
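A short sketch can illustrate why the normalization matters: rescaling one variable (for example, changing its units) changes the covariance but leaves the correlation untouched. The data values and helper names below are hypothetical, and the paired values are treated as an equally weighted distribution.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def cov(a, b):
    """Covariance of equally weighted paired values (divides by N)."""
    mean_a, mean_b = mean(a), mean(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / len(a)


def corr(a, b):
    """Correlation: covariance normalized by both standard deviations."""
    return cov(a, b) / sqrt(cov(a, a) * cov(b, b))


def standardize(a):
    """Subtract the mean and divide by the standard deviation."""
    m, s = mean(a), sqrt(cov(a, a))
    return [(ai - m) / s for ai in a]


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 6.0]
y_scaled = [100 * v for v in y]   # same quantity expressed in smaller units

print(cov(x, y), cov(x, y_scaled))     # covariance scales with the units
print(corr(x, y), corr(x, y_scaled))   # correlation does not
print(cov(standardize(x), standardize(y)))  # matches corr(x, y)
```

Negating `y` would flip the sign of the correlation while keeping its magnitude.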

Properties

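For random variables $x$ and $y$, the expected value and covariance have the properties below.

1. $\mathbb{E}[x + y] = \mathbb{E}[x] + \mathbb{E}[y]$
2. $\mathbb{E}[x - y] = \mathbb{E}[x] - \mathbb{E}[y]$
3. $\mathbb{V}[x + y] = \mathbb{V}[x] + \mathbb{V}[y] + \text{Cov}[x, y] + \text{Cov}[y, x]$
4. $\mathbb{V}[x - y] = \mathbb{V}[x] + \mathbb{V}[y] - \text{Cov}[x, y] - \text{Cov}[y, x]$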
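Standard identities of this kind, such as linearity of expectation and the variance of a sum or difference, can be verified exactly on a small discrete distribution. The joint distribution and helper names in this sketch are assumptions made for illustration.

```python
from fractions import Fraction

# Hypothetical joint pmf p(x, y) for scalar x and y.
joint = {
    (0, 1): Fraction(1, 2),
    (1, 3): Fraction(1, 3),
    (2, 0): Fraction(1, 6),
}


def expect(f):
    """E[f(x, y)] under the joint pmf."""
    return sum(f(x, y) * p for (x, y), p in joint.items())


def cov(f, g):
    """Cov[f, g] = E[f g] - E[f] E[g]."""
    return expect(lambda x, y: f(x, y) * g(x, y)) - expect(f) * expect(g)


def get_x(x, y):
    return x


def get_y(x, y):
    return y


def total(x, y):
    return x + y


def diff(x, y):
    return x - y


# Linearity of expectation.
print(expect(total), expect(get_x) + expect(get_y))
# Variance of a sum: V[x] + V[y] + 2 Cov[x, y] for scalar variables.
print(cov(total, total),
      cov(get_x, get_x) + cov(get_y, get_y) + 2 * cov(get_x, get_y))
```

For scalars, $\text{Cov}[x, y] = \text{Cov}[y, x]$, which is why the two cross terms collapse into a factor of 2.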

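Furthermore, for the affine transformation $y = Ax + b$, let $\mu$ and $\Sigma$ be the mean and covariance matrix of $x$. Then, we have the following.

1. $\mathbb{E}[y] = A\mu + b$
2. $\mathbb{V}[y] = A \Sigma A^\top$
3. $\text{Cov}[x, y] = \Sigma A^\top$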
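One well-known result of this type concerns an affine map $y = Ax + b$ (taken here as an assumption about the intended transformation): the mean becomes $A\mu + b$ and the covariance matrix becomes $A \Sigma A^\top$. The sketch below checks both exactly on a hypothetical discrete distribution over 2-D vectors; the matrix `A`, offset `b`, and helper functions are illustrative.

```python
from fractions import Fraction

# Hypothetical distribution: four equally likely 2-D points.
points = [
    ((0, 0), Fraction(1, 4)),
    ((1, 0), Fraction(1, 4)),
    ((0, 2), Fraction(1, 4)),
    ((1, 3), Fraction(1, 4)),
]
A = [[2, 1], [0, -1], [1, 1]]   # hypothetical 3x2 matrix
b = [1, 0, -1]                  # hypothetical offset


def matvec(M, v):
    return [sum(m * vi for m, vi in zip(row, v)) for row in M]


def matmul(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(len(N)))
             for j in range(len(N[0]))] for i in range(len(M))]


def transpose(M):
    return [list(col) for col in zip(*M)]


def mean_cov(pts):
    """Mean vector and covariance matrix of a weighted point set."""
    d = len(pts[0][0])
    mu = [sum(p * v[i] for v, p in pts) for i in range(d)]
    sigma = [[sum(p * (v[i] - mu[i]) * (v[j] - mu[j]) for v, p in pts)
              for j in range(d)] for i in range(d)]
    return mu, sigma


mu, sigma = mean_cov(points)

# Push every point through y = A x + b, keeping its probability.
y_points = [(tuple(yi + bi for yi, bi in zip(matvec(A, v), b)), p)
            for v, p in points]
mu_y, sigma_y = mean_cov(y_points)

predicted_mu = [m + bi for m, bi in zip(matvec(A, mu), b)]
predicted_sigma = matmul(matmul(A, sigma), transpose(A))
print(mu_y == predicted_mu, sigma_y == predicted_sigma)
```

Note that the offset `b` shifts the mean but has no effect on the covariance.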

Empirical Summaries

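If we have only a small empirical sample $x_1, \dots, x_N$ of the random variable $x$ instead of access to the whole population, we can only use these values to estimate the true summary statistics. Then, our sample mean, sample variance, and sample covariance matrix equations are as follows:

$$\bar{x} = \frac{1}{N} \sum_{n=1}^{N} x_n$$

$$s^2 = \frac{1}{N-1} \sum_{n=1}^{N} (x_n - \bar{x})^2$$

$$\hat{\Sigma} = \frac{1}{N-1} \sum_{n=1}^{N} (x_n - \bar{x})(x_n - \bar{x})^\top$$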
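As a rough sketch, the empirical estimates can be computed directly from a list of observations. The data below are made up, and the covariance matrix is assumed to use the same $N-1$ divisor as the sample variance. Python's `statistics.variance` (which also divides by $N-1$) is used as a cross-check.

```python
import statistics

# Hypothetical sample of N = 4 two-dimensional observations.
data = [(2.0, 1.0), (4.0, 3.0), (6.0, 2.0), (8.0, 6.0)]
N = len(data)
D = len(data[0])

# Sample mean, one entry per dimension.
xbar = [sum(row[d] for row in data) / N for d in range(D)]

# Sample covariance matrix with the N - 1 divisor (an assumption, see above).
S = [[sum((row[i] - xbar[i]) * (row[j] - xbar[j]) for row in data) / (N - 1)
      for j in range(D)] for i in range(D)]

# The diagonal holds the per-dimension sample variances.
var_first = statistics.variance([row[0] for row in data])
print(xbar, S, var_first)
```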

Note

Note that for our sample variance, we divide by $N - 1$ instead of $N$ to account for the variance of the sample mean $\bar{x}$. Since we're basing the sample variance on an empirical estimate of the mean instead of the true mean, we have to adjust for its inaccuracy accordingly. This correction can be derived mathematically, and we can show that $\mathbb{E}[s^2] = \sigma^2$, where $\sigma^2$ is the true variance.
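The effect of the divisor can be checked exactly by enumerating every possible sample from a tiny population. The sketch below assumes i.i.d. draws (sampling with replacement) from a hypothetical four-value population; the names `population`, `sample_var`, and `ddof` are illustrative.

```python
from fractions import Fraction
from itertools import product

# Hypothetical population; each value is equally likely on every draw.
population = [1, 2, 4, 7]
N = 3  # sample size

mu = Fraction(sum(population), len(population))
sigma2 = sum((v - mu) ** 2 for v in population) / len(population)  # true variance


def sample_var(sample, ddof):
    """Sum of squared deviations from the sample mean, divided by (n - ddof)."""
    n = len(sample)
    xbar = Fraction(sum(sample), n)
    return sum((v - xbar) ** 2 for v in sample) / (n - ddof)


# All equally likely i.i.d. samples of size N.
samples = list(product(population, repeat=N))

mean_unbiased = sum(sample_var(s, 1) for s in samples) / len(samples)  # divide by N - 1
mean_biased = sum(sample_var(s, 0) for s in samples) / len(samples)    # divide by N

print(sigma2, mean_unbiased, mean_biased)
```

Averaged over all samples, the $N-1$ version recovers the true variance, while dividing by $N$ underestimates it by a factor of $(N-1)/N$.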