🦄 Log Derivative Trick

The log derivative trick allows us to estimate the gradient

by rewriting it as a valid expectation.

Note that simply expanding out this expectation gives us the following:

However, isn’t a valid probability, so we can’t directly approximate the integral as an average via 🤔 Monte Carlo Sampling.

Instead, we observe that

which allows us to rewrite the expectation below:

where are samples drawn from .

Explorer