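The binomial model counts the number of successes (and hence failures) in $n$ independent trials, where the probability of success in each trial is $\theta$. Then the number of successes $y$ is distributed as
$$y \mid \theta \sim \text{Binomial}(n, \theta),$$
with likelihood
$$p(y \mid \theta) = \binom{n}{y}\, \theta^{y} (1 - \theta)^{n - y}.$$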
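The conjugate prior distribution for this model is the beta distribution,
$$\theta \sim \text{Beta}(\alpha, \beta),$$
which has the probability density
$$p(\theta) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\, \theta^{\alpha - 1} (1 - \theta)^{\beta - 1}, \qquad 0 \le \theta \le 1.$$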
Note that if $\theta \sim \text{Beta}(\alpha, \beta)$, then $\mathbb{E}[\theta] = \alpha / (\alpha + \beta)$.
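With this prior, our posterior can be derived to be
$$\theta \mid y \sim \text{Beta}(\alpha + y,\; \beta + n - y),$$
which we can interpret as a balance between our observation $y$ and the prior parameters $\alpha$ and $\beta$. If we have a lot of data, $n \gg \alpha + \beta$, then the posterior mean $(\alpha + y)/(\alpha + \beta + n)$ approaches $y/n$, the maximum likelihood estimate (MLE). Similarly, if we set $\alpha$ and $\beta$ close to zero, we also have a mean that's close to $y/n$.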
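The posterior and its mean can be checked in a few lines. The sketch below assumes the usual notation, which may differ from the symbols used elsewhere: $y$ successes in $n$ trials, success probability $\theta$, and prior parameters $\alpha$ and $\beta$. It multiplies the binomial likelihood by the beta prior, drops factors that do not depend on $\theta$, and then rewrites the posterior mean as a weighted average of the prior mean and the MLE.

```latex
\begin{align*}
p(\theta \mid y)
  &\propto p(y \mid \theta)\, p(\theta) \\
  &\propto \theta^{y} (1 - \theta)^{n - y} \cdot \theta^{\alpha - 1} (1 - \theta)^{\beta - 1} \\
  &= \theta^{\alpha + y - 1} (1 - \theta)^{\beta + n - y - 1},
\end{align*}
% This is the kernel of a Beta(alpha + y, beta + n - y) density.
\begin{align*}
\mathbb{E}[\theta \mid y]
  &= \frac{\alpha + y}{\alpha + \beta + n} \\
  &= \frac{\alpha + \beta}{\alpha + \beta + n} \cdot \frac{\alpha}{\alpha + \beta}
   + \frac{n}{\alpha + \beta + n} \cdot \frac{y}{n}.
\end{align*}
```

The weight on the MLE $y/n$ is $n / (\alpha + \beta + n)$, so it tends to 1 either as $n$ grows or as $\alpha + \beta$ shrinks, which matches both limits described above.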
In this sense, $\alpha$ and $\beta$ can be seen as "pseudo-counts" of successful and failed trials. If we make them small (a so-called non-informative prior), for example $\alpha = \beta = 1$, they don't have a big effect on the posterior. The prior in this case is a uniform distribution from $0$ to $1$, which we can interpret as every value of $\theta$ being equally probable.
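To make the pseudo-count reading concrete, here is a minimal Python sketch of the conjugate update. The function names `beta_binomial_posterior` and `beta_mean` are hypothetical helpers introduced for illustration, and the data (7 successes in 10 trials) and the "strong" prior Beta(50, 50) are made-up values, not taken from the text.

```python
def beta_binomial_posterior(y, n, alpha=1.0, beta=1.0):
    """Return the Beta posterior parameters after y successes in n trials.

    alpha and beta are the prior pseudo-counts of successes and failures;
    the defaults correspond to the uniform prior Beta(1, 1).
    """
    if not 0 <= y <= n:
        raise ValueError("need 0 <= y <= n")
    if alpha <= 0 or beta <= 0:
        raise ValueError("Beta parameters must be positive")
    # Conjugate update: add observed successes and failures to the pseudo-counts.
    return alpha + y, beta + (n - y)


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Illustrative (assumed) data: 7 successes in 10 trials.
y, n = 7, 10
priors = {
    "uniform Beta(1, 1)": (1.0, 1.0),
    "strong Beta(50, 50)": (50.0, 50.0),
}

print(f"MLE y/n = {y / n:.3f}")
for label, (a0, b0) in priors.items():
    a1, b1 = beta_binomial_posterior(y, n, a0, b0)
    print(f"{label}: posterior Beta({a1:g}, {b1:g}), mean = {beta_mean(a1, b1):.3f}")
```

With the uniform prior the posterior mean stays near the MLE, while the prior with 100 pseudo-counts pulls it most of the way back toward the prior mean of 0.5, because ten real observations are outweighed by the pseudo-counts.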