DAgger, short for Dataset Aggregation, is an enhanced 🐵 Behavioral Cloning algorithm that enriches the dataset with mistake-correction examples. Specifically, we loop the following:

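  1. Train the policy $\pi_\theta$ from human data $\mathcal{D}$.
  2. Run $\pi_\theta$ to get a dataset $\mathcal{D}_\pi$ of states.
  3. Ask an expert to label $\mathcal{D}_\pi$ with correct actions.
  4. Aggregate, $\mathcal{D} \leftarrow \mathcal{D} \cup \mathcal{D}_\pi$, repeat.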

By incorporating the policy's empirical states into our dataset, over many iterations, we'll have the data's distribution of observations converge to the policy's distribution,

$$p_{\text{data}}(o_t) \to p_{\pi_\theta}(o_t),$$

thus allowing our model to learn correct responses to the states it encounters.
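
The loop above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not a reference implementation: the 1-D environment, the clipped "expert" rule, the linear least-squares policy, and all function names (`expert_action`, `fit_policy`, `rollout`, `dagger`) are hypothetical and chosen just to make the four steps concrete.

```python
import numpy as np

def expert_action(obs):
    # Hypothetical expert: push the 1-D state back toward zero,
    # with the action magnitude capped at 1.
    return -np.clip(obs, -1.0, 1.0)

def fit_policy(obs, acts):
    # Behavioral cloning step: least-squares fit of a linear policy a = w * o + b.
    X = np.stack([obs, np.ones_like(obs)], axis=1)
    params, *_ = np.linalg.lstsq(X, acts, rcond=None)
    return params  # array [w, b]

def policy_action(params, obs):
    return params[0] * obs + params[1]

def rollout(params, rng, horizon):
    # Run the learned policy and record only the observations it visits.
    # Assumed toy dynamics: next_obs = obs + action + small noise.
    obs = rng.normal(0.0, 2.0)
    visited = []
    for _ in range(horizon):
        visited.append(obs)
        obs = obs + policy_action(params, obs) + rng.normal(0.0, 0.1)
    return np.array(visited)

def dagger(n_iters=5, n_human=20, horizon=50, seed=0):
    rng = np.random.default_rng(seed)
    # Initial "human" demonstrations: states near zero, labeled by the expert.
    data_obs = rng.normal(0.0, 0.5, size=n_human)
    data_act = expert_action(data_obs)
    for _ in range(n_iters):
        params = fit_policy(data_obs, data_act)          # 1. train on D
        new_obs = rollout(params, rng, horizon)          # 2. run policy to get D_pi
        new_act = expert_action(new_obs)                 # 3. expert labels D_pi
        data_obs = np.concatenate([data_obs, new_obs])   # 4. aggregate D with D_pi
        data_act = np.concatenate([data_act, new_act])
    # Final fit on the fully aggregated dataset.
    return fit_policy(data_obs, data_act), data_obs, data_act
```

Calling `params, obs, acts = dagger()` returns the fitted `[w, b]` together with the aggregated dataset, which grows by `horizon` expert-labeled observations per iteration. In a real setting the expert is a human or a stronger controller, so step 3 is usually the expensive part.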