A regression model uses a distribution to model a response $y$ as a function of predictors $X$, where noise $\epsilon \sim \mathcal{N}(0, \sigma^2)$ accounts for scatter around that function.

Our model has the form

$$y \mid \beta, \sigma^2, X \sim \mathcal{N}(X\beta, \sigma^2 I),$$

which looks very much like the Normal Model except with the mean being determined by the linear predictor $X\beta$.
Non-Informative Prior
Our normal model has the improper non-informative prior

$$p(\beta, \sigma^2) \propto \frac{1}{\sigma^2}.$$

The joint posterior is

$$p(\beta, \sigma^2 \mid y) \propto \sigma^{-n-2} \exp\left(-\frac{1}{2\sigma^2}(y - X\beta)^\top (y - X\beta)\right).$$

Again breaking this up, we have the conditional posterior

$$\beta \mid \sigma^2, y \sim \mathcal{N}\left(\hat{\beta}, \sigma^2 (X^\top X)^{-1}\right),$$

where

$$\hat{\beta} = (X^\top X)^{-1} X^\top y.$$

The marginal posterior is

$$\sigma^2 \mid y \sim \text{Inv-}\chi^2(n - p, s^2), \qquad s^2 = \frac{1}{n - p}(y - X\hat{\beta})^\top (y - X\hat{\beta}).$$
Sampling
To sample estimates for $\beta$, $\sigma^2$, and a prediction $\tilde{y}$:

- Sample $\sigma^2 \sim \text{Inv-}\chi^2(n - p, s^2)$.
- Set $\hat{\beta} = (X^\top X)^{-1} X^\top y$.
- Sample $\beta \sim \mathcal{N}(\hat{\beta}, \sigma^2 (X^\top X)^{-1})$.
- Sample $\tilde{y} \sim \mathcal{N}(\tilde{X}\beta, \sigma^2)$.
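The steps above can be sketched in NumPy; the data, seed, and prediction point below are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: y = 1 + 2*x + noise
n, p = 50, 2
x = rng.uniform(-1, 1, n)
X = np.column_stack([np.ones(n), x])          # design matrix with intercept
y = 1.0 + 2.0 * x + rng.normal(0, 0.5, n)

# Least-squares estimate and residual scale
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta_hat
s2 = resid @ resid / (n - p)
XtX_inv = np.linalg.inv(X.T @ X)

def draw_posterior(rng):
    # Sample sigma^2 ~ Inv-chi^2(n - p, s^2) (scaled inverse chi-square)
    sigma2 = (n - p) * s2 / rng.chisquare(n - p)
    # Sample beta | sigma^2, y ~ N(beta_hat, sigma^2 (X^T X)^{-1})
    beta = rng.multivariate_normal(beta_hat, sigma2 * XtX_inv)
    # Sample the posterior predictive at a new point x_tilde = 0.5
    x_tilde = np.array([1.0, 0.5])
    y_tilde = rng.normal(x_tilde @ beta, np.sqrt(sigma2))
    return beta, sigma2, y_tilde

draws = [draw_posterior(rng) for _ in range(4000)]
betas = np.array([d[0] for d in draws])
print(betas.mean(axis=0))  # posterior means land near the true (1, 2)
```

The posterior mean of `betas` recovers the coefficients used to simulate the data, and the spread of the draws gives credible intervals for free.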
Conjugate Prior
We'll keep the non-informative prior for $\sigma^2$ and place a conjugate Normal prior on the coefficients,

$$\beta \sim \mathcal{N}(\mu_0, \Sigma_0).$$

The posterior is

$$p(\beta, \sigma^2 \mid y) \propto \frac{1}{\sigma^2}\, \mathcal{N}(\beta \mid \mu_0, \Sigma_0)\, \mathcal{N}(y \mid X\beta, \sigma^2 I).$$

The conditional posterior is

$$\beta \mid \sigma^2, y \sim \mathcal{N}(\tilde{\beta}, \tilde{\Sigma}),$$

where

$$\tilde{\Sigma} = \left(\Sigma_0^{-1} + \frac{1}{\sigma^2} X^\top X\right)^{-1}, \qquad \tilde{\beta} = \tilde{\Sigma}\left(\Sigma_0^{-1} \mu_0 + \frac{1}{\sigma^2} X^\top y\right),$$

and the marginal posterior for $\sigma^2$ no longer has a standard form; it can be evaluated pointwise via $p(\sigma^2 \mid y) = p(\beta, \sigma^2 \mid y) / p(\beta \mid \sigma^2, y)$ at any fixed $\beta$.
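A minimal sketch of computing this conditional posterior, with made-up data and prior hyperparameters, and $\sigma^2$ held fixed for the draw:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 40, 3
X = rng.normal(size=(n, p))
y = X @ np.array([1.0, -0.5, 0.0]) + rng.normal(0, 0.3, n)

# Hypothetical prior hyperparameters
mu0 = np.zeros(p)
Sigma0 = np.eye(p)     # prior covariance
sigma2 = 0.09          # sigma^2 treated as known for this draw

# Conditional posterior beta | sigma^2, y ~ N(beta_tilde, Sigma_tilde)
Sigma0_inv = np.linalg.inv(Sigma0)
Sigma_tilde = np.linalg.inv(Sigma0_inv + X.T @ X / sigma2)
beta_tilde = Sigma_tilde @ (Sigma0_inv @ mu0 + X.T @ y / sigma2)

beta_draw = rng.multivariate_normal(beta_tilde, Sigma_tilde)
print(beta_tilde)  # shrunk slightly from least squares toward mu0
```

With a weak prior (large `Sigma0`) the posterior mean approaches the least squares estimate; a tight prior pulls it toward `mu0`.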
Connection to Machine Learning
We often want sparsity in our model, for as many coefficients $\beta_j$ as possible to be $0$. One way to encourage this is the prior

$$\beta_j \sim \mathcal{N}(0, \tau^2),$$

which pulls each $\beta_j$ toward $0$.

Without this prior (using the non-informative prior), we get the least squares estimate. With the prior, maximizing the posterior adds a penalty term to the least squares objective:

- With the Normal prior above, the penalty becomes $\lambda \lVert \beta \rVert_2^2$ with $\lambda = \sigma^2 / \tau^2$, which is called ridge regression.
- With a Laplace prior $\beta_j \sim \text{Laplace}(0, b)$, we have the penalty $\lambda \lVert \beta \rVert_1$, called LASSO regression.
See Linear Regression and Regularization Penalties for more details.
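As a quick check of the ridge connection: the posterior mean under $\beta \sim \mathcal{N}(0, \tau^2 I)$ with $\sigma^2$ fixed matches the ridge estimate with $\lambda = \sigma^2 / \tau^2$. The data and variances here are made up:

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 60, 4
X = rng.normal(size=(n, p))
y = X @ np.array([2.0, 0.0, -1.0, 0.0]) + rng.normal(0, 0.5, n)

sigma2, tau2 = 0.25, 0.5   # hypothetical noise and prior variances
lam = sigma2 / tau2        # implied ridge penalty weight

# Posterior mean of beta under the N(0, tau^2 I) prior, sigma^2 fixed
post_mean = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Ridge solved independently as augmented least squares:
# minimize ||y - X b||^2 + lam * ||b||^2
X_aug = np.vstack([X, np.sqrt(lam) * np.eye(p)])
y_aug = np.concatenate([y, np.zeros(p)])
ridge, *_ = np.linalg.lstsq(X_aug, y_aug, rcond=None)

print(np.allclose(post_mean, ridge))  # the two estimates coincide
```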
Poisson Model
Regression can also be done with the Poisson Model. We have

$$y_i \mid \beta_0, \beta_1 \sim \text{Poisson}(\lambda_i), \qquad \lambda_i = e^{\beta_0 + \beta_1 x_i},$$

where the exponential link keeps each rate $\lambda_i$ positive.

Using a non-informative joint prior $p(\beta_0, \beta_1) \propto 1$, the posterior is proportional to the likelihood and has no closed form. We can use Grid Sampling to get parameter samples.
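A sketch of grid sampling for this Poisson regression; the simulated data, grid ranges, and resolution are chosen arbitrarily:

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulated Poisson-regression data: log-rate = b0 + b1 * x
x = rng.uniform(-1, 1, 100)
y = rng.poisson(np.exp(0.5 + 1.0 * x))   # true (b0, b1) = (0.5, 1.0)

# Grid over (b0, b1); with a flat prior, posterior ∝ likelihood
b0_grid = np.linspace(-1, 2, 200)
b1_grid = np.linspace(-1, 3, 200)
B0, B1 = np.meshgrid(b0_grid, b1_grid, indexing="ij")

# Poisson log-likelihood on the grid (constant log(y!) dropped)
eta = B0[..., None] + B1[..., None] * x          # shape (200, 200, 100)
log_lik = (y * eta - np.exp(eta)).sum(axis=-1)

# Normalize on the log scale for numerical stability, then exponentiate
post = np.exp(log_lik - log_lik.max())
post /= post.sum()

# Draw grid cells in proportion to their posterior mass
idx = rng.choice(post.size, size=1000, p=post.ravel())
b0_s, b1_s = B0.ravel()[idx], B1.ravel()[idx]
print(b0_s.mean(), b1_s.mean())  # posterior means near (0.5, 1.0)
```

Subtracting `log_lik.max()` before exponentiating avoids underflow; adding small jitter within each grid cell would smooth the discreteness of the draws.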