Semi-gradients are commonly used for function approximation updates. While most deep learning methods follow some variant of ⛰️ Gradient Descent, reinforcement learning updates sometimes require bootstrapping—that is, our function estimate is part of the target, and thus updating the function with a standard gradient descent update gives us a semi-gradient instead of a true gradient.

For example, let’s approximate the value function $v_\pi(s)$ with a parameterized function $\hat{v}(s, \mathbf{w})$. In standard gradient descent, since our target also uses our function $\hat{v}$, the update step would need to consider how the update affects the target as well. We usually don’t consider this and instead allow a slight bias in our update; hence, in bootstrapping methods that update toward a target estimate $U_t$ that depends on the parameters $\mathbf{w}$, we use the semi-gradient update:

$$\mathbf{w}_{t+1} = \mathbf{w}_t + \alpha \left[ U_t - \hat{v}(S_t, \mathbf{w}_t) \right] \nabla_{\mathbf{w}} \hat{v}(S_t, \mathbf{w}_t)$$

In TD(0), for instance, the bootstrapped target is $U_t = R_{t+1} + \gamma \hat{v}(S_{t+1}, \mathbf{w}_t)$, which plainly contains our own estimate $\hat{v}$.
Note that this update is exactly the standard gradient update for some fixed target $U_t$; the only reason it is a semi-gradient is that $U_t$ itself depends on $\mathbf{w}$.
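To make this concrete, here is a minimal sketch of semi-gradient TD(0) with a linear value function in JAX. The function names, feature representation, and step sizes are illustrative assumptions, not anything from the text above; the one load-bearing line is `jax.lax.stop_gradient`, which treats the bootstrapped target $U_t$ as a constant so the gradient flows only through $\hat{v}(S_t, \mathbf{w})$:

```python
import jax
import jax.numpy as jnp

# Hypothetical linear value function: v_hat(s, w) = w . phi(s),
# where states are assumed to already be feature vectors phi(s).
def v_hat(w, s):
    return jnp.dot(w, s)

def semi_gradient_loss(w, s_t, r_tp1, s_tp1, gamma=0.99):
    # Bootstrapped target U_t = R_{t+1} + gamma * v_hat(S_{t+1}, w).
    # stop_gradient treats U_t as a constant during differentiation,
    # which is exactly what turns the true gradient into a semi-gradient.
    u_t = jax.lax.stop_gradient(r_tp1 + gamma * v_hat(w, s_tp1))
    return 0.5 * (u_t - v_hat(w, s_t)) ** 2

@jax.jit
def td_update(w, s_t, r_tp1, s_tp1, alpha=0.1):
    # Gradient descent on the squared TD error; because the target is
    # detached, this equals w + alpha * [U_t - v_hat] * grad(v_hat).
    g = jax.grad(semi_gradient_loss)(w, s_t, r_tp1, s_tp1)
    return w - alpha * g

# Toy usage with 3-dimensional features.
w = jnp.zeros(3)
s_t = jnp.array([1.0, 0.0, 0.5])
s_tp1 = jnp.array([0.0, 1.0, 0.5])
w = td_update(w, s_t, 1.0, s_tp1)
print(w)  # weights nudged toward the bootstrapped target
```

Deleting the `stop_gradient` call would differentiate through the target as well, giving a true-gradient (residual-style) update rather than the semi-gradient one described here.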