Statistics is about the organization, modeling, analysis, and interpretation of data. Standard tasks include summarizing data and inferring conclusions from the data.
To model data, we commonly use probability. Within statistics, there are two schools of thought that differ in how they interpret probability:
- Classical (or Frequentist) statistics sees probability as a long-run frequency over many trials. To build a model, we assume the data are generated according to some unknown but fixed parameters $\theta$, and we use our data samples to make point estimates for $\theta$.
- Bayesian statistics sees probability as a degree of belief. A model combines our prior belief about the unknown parameter $\theta$ with the information provided by the data to make a more informed probabilistic estimate for $\theta$.
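Fundamentally, the difference between Classical and Bayesian statistics is the treatment of the unknown parameter $\theta$: Classical statistics treats it as a fixed value, whereas Bayesian statistics treats it as a random quantity with its own distribution.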
Classical Statistics
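To explain our observations, we assume a probability model that links the observed data $x$ to the unknown parameters $\theta$.
For example, we can have a normal distribution,
$$x \sim \mathcal{N}(\mu, \sigma^2),$$
where $\theta = (\mu, \sigma^2)$ consists of the mean and the variance.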
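In Classical statistics, we assume $\theta$ is fixed but unknown, and we estimate it with a single value $\hat{\theta}$ computed from the data, for instance by maximum likelihood.
For our normal model, this is
$$\hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i - \hat{\mu})^2.$$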
However, because our observations are random samples, there's some uncertainty in our estimate of $\theta$: a different sample would give a different value.
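
To make point estimation concrete, here is a minimal sketch in Python. It assumes a normal model whose parameters are the mean and variance, and it uses the maximum-likelihood estimates (the sample mean and the divide-by-n sample variance). The simulated data, the "true" parameter values, and the function name `normal_point_estimates` are illustrative assumptions, not part of the original text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "true" parameters: fixed, but unknown to the analyst.
true_mu, true_sigma = 5.0, 2.0
x = rng.normal(true_mu, true_sigma, size=200)


def normal_point_estimates(x):
    """Maximum-likelihood point estimates (mean, variance) for a normal model."""
    x = np.asarray(x, dtype=float)
    mu_hat = x.mean()
    sigma2_hat = x.var()  # ddof=0 by default, i.e. divide by n (the MLE)
    return mu_hat, sigma2_hat


mu_hat, sigma2_hat = normal_point_estimates(x)

# Standard error of the mean: a rough measure of how much mu_hat
# would vary if we drew a fresh sample of the same size.
se_mu = np.sqrt(sigma2_hat / len(x))

print(f"mu_hat={mu_hat:.3f}, sigma2_hat={sigma2_hat:.3f}, se(mu_hat)={se_mu:.3f}")
```

Rerunning with a different seed gives slightly different estimates, which is the sampling uncertainty described above.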
Bayesian Statistics
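In the Bayesian approach, we instead assume $\theta$ is itself a random quantity.
Like above, we assume our data follow a model given the parameters, $x \sim p(x \mid \theta)$.
However, we also assume our parameters come from their own distribution, $\theta \sim p(\theta)$,
called a prior distribution.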
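To estimate our distribution for $\theta$ after seeing the data, we compute the posterior distribution using Bayes' theorem,
$$p(\theta \mid x) = \frac{p(x \mid \theta)\, p(\theta)}{p(x)},$$
which combines both the data model $p(x \mid \theta)$ and the prior $p(\theta)$.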
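
As a concrete illustration of combining a data model with a prior, the sketch below assumes a normal data model with known variance and a normal prior on the mean. This conjugate pair is chosen here only because the posterior then has a closed form; the text above does not prescribe any particular prior. The function name `posterior_normal_mean` and all numeric values are hypothetical.

```python
import numpy as np


def posterior_normal_mean(x, sigma2, prior_mean, prior_var):
    """Posterior of mu when x_i ~ N(mu, sigma2) and mu ~ N(prior_mean, prior_var).

    Assumes sigma2 is known. Returns the posterior mean and variance of mu.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    # Precisions (inverse variances) add: prior precision plus data precision.
    post_precision = 1.0 / prior_var + n / sigma2
    post_var = 1.0 / post_precision
    # Posterior mean is a precision-weighted blend of the prior mean and the data.
    post_mean = post_var * (prior_mean / prior_var + x.sum() / sigma2)
    return post_mean, post_var


rng = np.random.default_rng(1)
x = rng.normal(5.0, 2.0, size=50)  # simulated data, for illustration only

post_mean, post_var = posterior_normal_mean(
    x, sigma2=4.0, prior_mean=0.0, prior_var=10.0
)
print(f"posterior mean={post_mean:.3f}, posterior variance={post_var:.4f}")
```

With more data the data term dominates and the posterior mean approaches the sample mean; with little data it stays closer to the prior mean.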
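With our posterior $p(\theta \mid x)$, we can predict a new observation $\tilde{x}$ by averaging the data model over the posterior,
$$p(\tilde{x} \mid x) = \int p(\tilde{x} \mid \theta)\, p(\theta \mid x)\, d\theta.$$
Note that this distribution accounts for all variability: both from the parameter values and from the randomness of the data given those parameters.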
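
One distribution that reflects both sources of variability is the posterior predictive distribution. The sketch below simulates it under stated assumptions: a normal data model with known variance and a normal posterior for the mean. The posterior values passed in and the function name `sample_posterior_predictive` are hypothetical, chosen only for illustration.

```python
import numpy as np


def sample_posterior_predictive(post_mean, post_var, sigma2, size, rng):
    """Draw new observations by first sampling mu, then sampling data given mu."""
    # Step 1: parameter uncertainty, draw mu from its (assumed normal) posterior.
    mu_draws = rng.normal(post_mean, np.sqrt(post_var), size=size)
    # Step 2: data variability, draw one observation for each sampled mu.
    return rng.normal(mu_draws, np.sqrt(sigma2))


rng = np.random.default_rng(2)
draws = sample_posterior_predictive(
    post_mean=4.8, post_var=0.08, sigma2=4.0, size=100_000, rng=rng
)

# In this normal setting the predictive variance should be close to
# post_var + sigma2, i.e. parameter uncertainty plus data noise.
print(f"predictive mean={draws.mean():.3f}, predictive variance={draws.var():.3f}")
```

Plugging in a single point estimate for the mean would give only the data noise term, understating the spread of future observations.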