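The binomial model describes the number of successes and failures in $n$ independent trials when the probability of success in each trial is $\theta$. The number of successes $y$ is then distributed as

$$y \mid \theta \sim \text{Binomial}(n, \theta)$$

with likelihood

$$p(y \mid \theta) = \binom{n}{y} \theta^{y} (1 - \theta)^{n - y}$$
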
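The conjugate prior distribution for this model is

$$\theta \sim \text{Beta}(\alpha, \beta)$$

which has the probability density

$$p(\theta) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\,\Gamma(\beta)} \theta^{\alpha - 1} (1 - \theta)^{\beta - 1}$$
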
Note that if $\alpha = \beta = 1$, then $p(\theta) = 1$ on $[0, 1]$, i.e. $\theta \sim \text{Uniform}(0, 1)$.

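With this prior, the posterior follows from multiplying the likelihood by the prior and dropping factors that do not depend on $\theta$:

$$p(\theta \mid y) \propto \theta^{y} (1 - \theta)^{n - y} \, \theta^{\alpha - 1} (1 - \theta)^{\beta - 1} = \theta^{\alpha + y - 1} (1 - \theta)^{\beta + n - y - 1}$$

which is the kernel of a Beta distribution, so

$$\theta \mid y \sim \text{Beta}(\alpha + y, \; \beta + n - y)$$
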
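The conjugate update can be checked numerically. The sketch below uses only the Python standard library; the function names and the example values (a Beta(2, 2) prior with 7 successes in 10 trials) are assumptions chosen for illustration. It normalizes likelihood times prior on a grid and compares the result with the closed-form Beta density with parameters `alpha + y` and `beta + n - y`.

```python
from math import comb, gamma


def beta_pdf(theta, a, b):
    """Density of Beta(a, b) evaluated at theta in (0, 1)."""
    norm = gamma(a + b) / (gamma(a) * gamma(b))
    return norm * theta ** (a - 1) * (1 - theta) ** (b - 1)


def binomial_likelihood(y, n, theta):
    """Probability of y successes in n trials with success probability theta."""
    return comb(n, y) * theta ** y * (1 - theta) ** (n - y)


def posterior_params(alpha, beta, y, n):
    """Conjugate update: Beta(alpha, beta) prior -> Beta(alpha + y, beta + n - y)."""
    return alpha + y, beta + n - y


# Hypothetical example values, not taken from the notes.
alpha, beta, y, n = 2.0, 2.0, 7, 10
a_post, b_post = posterior_params(alpha, beta, y, n)

# Midpoint grid on (0, 1); normalize likelihood * prior numerically.
grid = [(i + 0.5) / 1000 for i in range(1000)]
unnorm = [binomial_likelihood(y, n, t) * beta_pdf(t, alpha, beta) for t in grid]
evidence = sum(unnorm) / len(grid)  # midpoint-rule estimate of the normalizer

# Largest pointwise gap between the numerical and closed-form posterior.
max_err = max(
    abs(u / evidence - beta_pdf(t, a_post, b_post)) for u, t in zip(unnorm, grid)
)
print(f"posterior: Beta({a_post}, {b_post}), max grid error: {max_err:.2e}")
```

The grid check is only a sanity test; in practice the closed-form parameters are all that is needed.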
Interpretation

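Our posterior distribution has mean

$$E[\theta \mid y] = \frac{\alpha + y}{\alpha + \beta + n}$$

which we can interpret as a balance between the observed proportion $y / n$ and the prior mean $\alpha / (\alpha + \beta)$. If we have a lot of data, $n \to \infty$, the posterior mean approaches $y / n$, the maximum likelihood estimate (MLE). Similarly, if we set $\alpha$ and $\beta$ close to $0$, the posterior mean is also close to $y / n$.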
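The "balance" can be made explicit: the posterior mean `(alpha + y) / (alpha + beta + n)` equals a weighted average of the observed proportion `y / n` and the prior mean `alpha / (alpha + beta)`, with weight `n / (alpha + beta + n)` on the data. The following sketch, with hypothetical function names and example values, checks that identity and shows the mean approaching the MLE as `n` grows.

```python
def posterior_mean(alpha, beta, y, n):
    """Mean of the Beta(alpha + y, beta + n - y) posterior."""
    return (alpha + y) / (alpha + beta + n)


def weighted_form(alpha, beta, y, n):
    """Same mean, written as a weighted average of the MLE and the prior mean."""
    w = n / (alpha + beta + n)          # weight on the data
    prior_mean = alpha / (alpha + beta)
    mle = y / n
    return w * mle + (1 - w) * prior_mean


# Hypothetical prior and a fixed 70% observed success rate.
alpha, beta = 2.0, 2.0
for n in (10, 100, 10_000):
    y = 7 * n // 10
    mean = posterior_mean(alpha, beta, y, n)
    print(f"n={n:>6}  posterior mean={mean:.4f}  MLE={y / n:.4f}")
```

With more trials the data weight tends to 1, so the prior mean of 0.5 matters less and less.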

In this sense, $\alpha$ and $\beta$ can be seen as "pseudo-counts" of successful and failed trials. If we make them small (a so-called non-informative choice), for example $\alpha = \beta = 1$, they have little effect on the posterior. The prior in this case is a uniform distribution from $0$ to $1$, which we can interpret as every value of $\theta$ being equally probable a priori.

Special Priors

The improper non-informative prior for $\theta$ has $\alpha = \beta = 0$, so $p(\theta) \propto \theta^{-1} (1 - \theta)^{-1}$. Jeffreys prior for the binomial is $\text{Beta}(1/2, 1/2)$.
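To see how much these choices matter, here is a small sketch comparing posterior means under the uniform prior Beta(1, 1), Jeffreys prior Beta(1/2, 1/2), and the improper prior with both parameters set to 0. The data (3 successes in 10 trials) and all names are assumptions for illustration.

```python
def posterior_params(alpha, beta, y, n):
    """Conjugate update: Beta(alpha, beta) prior -> Beta(alpha + y, beta + n - y)."""
    return alpha + y, beta + n - y


# Prior pseudo-counts (alpha, beta) for each choice.
priors = {
    "uniform": (1.0, 1.0),
    "Jeffreys": (0.5, 0.5),
    "improper": (0.0, 0.0),
}

# Hypothetical data: 3 successes in 10 trials.
y, n = 3, 10

means = {}
for name, (a, b) in priors.items():
    a_post, b_post = posterior_params(a, b, y, n)
    means[name] = a_post / (a_post + b_post)
    print(f"{name:>8}: posterior Beta({a_post}, {b_post}), mean={means[name]:.4f}")
```

Under the improper prior the posterior mean coincides with the MLE `y / n`. That posterior is a proper distribution only when `0 < y < n`, since otherwise one of its parameters is zero.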