A hierarchical model captures grouped data with multiple levels of variation, organized in a hierarchy. The datapoint $y_{ij}$ is observation $i$ within group $j$. We assume that there are $J$ groups, with $n_j$ observations in group $j$.

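The key is that we want to allow variation between groups. Extending the 🛎️ Normal Model, we have

$$y_{ij} \mid \theta_j \sim N(\theta_j, \sigma^2),$$

where the mean $\theta_j$ comes from

$$\theta_j \mid \mu, \tau \sim N(\mu, \tau^2).$$

This accounts for variation within each group with $\sigma^2$ and variation between groups with $\tau^2$.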
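To make the two levels concrete, here is a minimal sketch that simulates data from such a model: group means are drawn around a global mean, then observations around each group mean. The symbol names (`mu`, `tau`, `sigma`, `theta`, `n`) and all numeric values are illustrative assumptions, not values from this note.

```python
import numpy as np

rng = np.random.default_rng(0)

J = 8                                  # number of groups (illustrative)
n = rng.integers(5, 15, size=J)        # observations per group, n_j
mu, tau, sigma = 10.0, 2.0, 3.0        # hypothetical global mean and the two scales

# Between-group level: each group mean is drawn around the global mean.
theta = rng.normal(mu, tau, size=J)

# Within-group level: each observation is drawn around its group mean.
y = [rng.normal(theta[j], sigma, size=n[j]) for j in range(J)]

# Group sample means, the summaries the posterior formulas work with.
ybar = np.array([group.mean() for group in y])
print(np.round(ybar, 2))
```

With a small `tau` the simulated group means cluster tightly around `mu`; with a large `tau` they spread out, regardless of the within-group noise `sigma`.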

Known Variance

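First, let's assume $\sigma^2$ is known. Our remaining parameters are

$$\theta_1, \ldots, \theta_J, \mu, \tau.$$

The prior for $\theta_j$ is already $N(\mu, \tau^2)$, so we just need priors for $\mu$ and $\tau$. We'll use a non-informative prior $p(\mu) \propto 1$ and a slightly different $p(\tau) \propto 1$, which is less informative than $p(\tau) \propto 1/\tau$. Note that this differs from the prior used in the simple Normal model!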

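With these priors, our joint posterior is

$$p(\theta, \mu, \tau \mid y) \propto p(\mu, \tau) \prod_{j=1}^{J} N(\theta_j \mid \mu, \tau^2) \prod_{j=1}^{J} N(\bar{y}_j \mid \theta_j, \sigma^2 / n_j).$$

Breaking this down, the group means' conditional posterior is

$$\theta_j \mid \mu, \tau, y \sim N(\hat{\theta}_j, V_j), \qquad \hat{\theta}_j = \frac{\frac{n_j}{\sigma^2} \bar{y}_j + \frac{1}{\tau^2} \mu}{\frac{n_j}{\sigma^2} + \frac{1}{\tau^2}}, \qquad V_j = \frac{1}{\frac{n_j}{\sigma^2} + \frac{1}{\tau^2}},$$

where $\bar{y}_j$ is the mean of group $j$.

The conditional posterior for the global mean $\mu$ is

$$\mu \mid \tau, y \sim N(\hat{\mu}, V_\mu), \qquad \hat{\mu} = \frac{\sum_{j=1}^{J} \frac{\bar{y}_j}{\sigma^2 / n_j + \tau^2}}{\sum_{j=1}^{J} \frac{1}{\sigma^2 / n_j + \tau^2}}, \qquad V_\mu^{-1} = \sum_{j=1}^{J} \frac{1}{\sigma^2 / n_j + \tau^2}.$$

Finally, the marginal posterior for $\tau$ is non-standard and simplifies to

$$p(\tau \mid y) \propto p(\tau) \, V_\mu^{1/2} \prod_{j=1}^{J} \left(\sigma^2 / n_j + \tau^2\right)^{-1/2} \exp\left(-\frac{(\bar{y}_j - \hat{\mu})^2}{2 (\sigma^2 / n_j + \tau^2)}\right),$$

which requires 🧱 Grid Sampling.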

Interpretation

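The group means $\theta_j$ are a compromise between the group's data mean $\bar{y}_j$ and the global prior mean $\mu$. The between-group variance $\tau^2$ is important here because if $\tau \to \infty$, the prior mean is ignored and we have

$$\theta_j \mid y \sim N(\bar{y}_j, \sigma^2 / n_j).$$

On the other hand, with $\tau = 0$, there is no variation between groups, so $\theta_j = \mu$ for every group $j$.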
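The compromise can be checked numerically. The sketch below computes the precision-weighted conditional posterior mean and variance of a single group mean for several values of the between-group scale. The function name `shrunk_mean` and the numbers (a group mean of 14 from 5 observations, a global mean of 10, a within-group standard deviation of 3) are hypothetical, chosen only to show the two limits.

```python
import numpy as np

def shrunk_mean(ybar_j, n_j, sigma, mu, tau):
    """Conditional posterior mean and variance of one group mean."""
    prec_data = n_j / sigma**2      # precision of the group's sample mean
    prec_prior = 1.0 / tau**2       # precision of the N(mu, tau^2) prior
    var = 1.0 / (prec_data + prec_prior)
    mean = var * (prec_data * ybar_j + prec_prior * mu)
    return mean, var

# Sweep the between-group scale from nearly zero to very large.
for tau in (0.01, 1.0, 3.0, 1000.0):
    m, v = shrunk_mean(ybar_j=14.0, n_j=5, sigma=3.0, mu=10.0, tau=tau)
    print(f"tau={tau:>7}: mean={m:.3f}, var={v:.3f}")
```

For a tiny between-group scale the posterior mean sits essentially at the global mean, and for a very large one it sits at the group's own sample mean with variance close to the within-group variance divided by the group size.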

Sampling

To sample from this hierarchical model, we perform the steps below:

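  1. Select a grid of values for $\tau$.
  2. Calculate the unnormalized marginal posterior $p(\tau \mid y)$ for each grid value.
  3. Sample a grid value $\tau$ with probability proportional to $p(\tau \mid y)$.
  4. Sample $\mu \sim p(\mu \mid \tau, y)$.
  5. For each group $j$, sample $\theta_j \sim p(\theta_j \mid \mu, \tau, y)$.
  6. If we want posterior predictions for group $j$, sample $\tilde{y} \sim N(\theta_j, \sigma^2)$.
  7. If we want a new group, sample $\tilde{\theta} \sim N(\mu, \tau^2)$ and $\tilde{y} \sim N(\tilde{\theta}, \sigma^2)$.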
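The steps above can be sketched in NumPy as follows. This is one possible vectorized implementation, assuming a flat prior on the between-group scale, a known within-group standard deviation `sigma`, and a grid that excludes zero (to avoid dividing by zero). The function name `sample_hierarchical` and the group summaries `ybar` and `n` are made-up illustrations.

```python
import numpy as np

def sample_hierarchical(ybar, n, sigma, tau_grid, n_draws, rng):
    """Draw (tau, mu, theta) for the known-variance hierarchical Normal model."""
    s2 = sigma**2 / n                                # variance of each group mean
    # Steps 1-2: unnormalized log marginal posterior of tau on the grid.
    tot = s2[None, :] + tau_grid[:, None] ** 2       # shape (grid, J)
    v_mu = 1.0 / np.sum(1.0 / tot, axis=1)
    mu_hat = v_mu * np.sum(ybar[None, :] / tot, axis=1)
    log_p = (0.5 * np.log(v_mu)
             - 0.5 * np.sum(np.log(tot), axis=1)
             - 0.5 * np.sum((ybar[None, :] - mu_hat[:, None]) ** 2 / tot, axis=1))
    p = np.exp(log_p - log_p.max())                  # subtract max for stability
    p /= p.sum()
    # Step 3: sample grid values with probability proportional to p(tau | y).
    idx = rng.choice(len(tau_grid), size=n_draws, p=p)
    tau = tau_grid[idx]
    # Step 4: sample the global mean given tau.
    mu = rng.normal(mu_hat[idx], np.sqrt(v_mu[idx]))
    # Step 5: sample every group mean given mu and tau.
    prec = 1.0 / s2[None, :] + 1.0 / tau[:, None] ** 2
    theta_hat = (ybar[None, :] / s2[None, :] + mu[:, None] / tau[:, None] ** 2) / prec
    theta = rng.normal(theta_hat, np.sqrt(1.0 / prec))
    return tau, mu, theta

rng = np.random.default_rng(1)
ybar = np.array([9.1, 11.4, 10.2, 13.0, 8.7])        # hypothetical group means
n = np.array([6, 10, 8, 5, 12])                      # hypothetical group sizes
sigma = 3.0                                          # assumed known
tau_grid = np.linspace(0.01, 10.0, 500)

tau, mu, theta = sample_hierarchical(ybar, n, sigma, tau_grid, 2000, rng)

# Step 6: posterior predictive draws for the first group.
y_rep = rng.normal(theta[:, 0], sigma)
# Step 7: a new group, then an observation from it.
theta_new = rng.normal(mu, tau)
y_new = rng.normal(theta_new, sigma)
```

Working in log space before normalizing keeps the grid weights from underflowing. The upper end of the grid is a judgment call and should be wide enough that the marginal posterior has decayed by the boundary.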