Least-squares TD is a Linear Function Approximation solution to the standard temporal difference formula. This solution
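In steady state (expectation over all possible TD updates), with weights $w$, feature vectors $x_t$, step size $\alpha$ and discount $\gamma$, we expect

$$\mathbb{E}[w_{t+1} \mid w_t] = w_t + \alpha (b - A w_t)$$

where

$$b = \mathbb{E}[R_{t+1} x_t], \qquad A = \mathbb{E}\left[x_t (x_t - \gamma x_{t+1})^\top\right]$$

Solving for the weights at which the expected update vanishes ($b - A w = 0$), we derive the equation

$$w_{TD} = A^{-1} b$$

which is called the TD fixed point: the weights to which TD with linear function approximation converges.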
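As a rough illustration of the steady-state argument, the sketch below iterates the expected update $w \leftarrow w + \alpha (b - A w)$ until it stops moving. It assumes the usual linear-TD notation, $A = \mathbb{E}[x_t (x_t - \gamma x_{t+1})^\top]$ and $b = \mathbb{E}[R_{t+1} x_t]$. The 3-state cycle, its rewards, the one-hot features, and the step size are all made up for the example.

```python
gamma, alpha = 0.5, 1.0

# Hypothetical toy problem: a cycle 0 -> 1 -> 2 -> 0, each state visited
# 1/3 of the time, one-hot features, reward 1 when leaving state 0.
features = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
rewards = [1.0, 0.0, 0.0]
d = 3

# Build A = E[x (x - gamma x')^T] and b = E[R x] under the uniform distribution.
A = [[0.0] * d for _ in range(d)]
b = [0.0] * d
for s in range(3):
    x, x_next = features[s], features[(s + 1) % 3]
    for i in range(d):
        b[i] += rewards[s] * x[i] / 3
        for j in range(d):
            A[i][j] += x[i] * (x[j] - gamma * x_next[j]) / 3

# Iterate the expected TD update w <- w + alpha (b - A w).
w = [0.0] * d
for _ in range(500):
    Aw = [sum(A[i][j] * w[j] for j in range(d)) for i in range(d)]
    w = [w[i] + alpha * (b[i] - Aw[i]) for i in range(d)]

# At the fixed point the expected update is zero, i.e. b - A w = 0.
residual = max(abs(b[i] - sum(A[i][j] * w[j] for j in range(d))) for i in range(d))
print(w, residual)
```

With one-hot features the fixed point coincides with the true state values of this chain, $(8/7, 2/7, 4/7)$, so the iterate can be checked by hand.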
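It's possible to arrive at this solution both through gradient descent and by calculating the required expectations directly from $t$ observed transitions $(x_k, R_{k+1}, x_{k+1})$,

$$\hat{A}_t = \sum_{k=0}^{t-1} x_k (x_k - \gamma x_{k+1})^\top, \qquad \hat{b}_t = \sum_{k=0}^{t-1} R_{k+1} x_k$$

(the $1/t$ factors that would turn these sums into averages cancel), which gives us

$$w_t = \hat{A}_t^{-1} \hat{b}_t$$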
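A minimal sketch of the direct, sample-based route follows. It accumulates $\hat{A} = \sum_k x_k (x_k - \gamma x_{k+1})^\top$ and $\hat{b} = \sum_k R_{k+1} x_k$ from transitions and then solves $\hat{A} w = \hat{b}$. The function names `lstd` and `solve_linear`, the optional `eps` ridge term (a common way to keep $\hat{A}$ invertible), and the toy 3-state cycle are assumptions made for illustration, not part of the original note.

```python
def solve_linear(A, b):
    """Solve A w = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [A | b]; rows are copied so the inputs stay untouched.
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("A_hat is singular; add data or use eps > 0")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def lstd(transitions, gamma, eps=0.0):
    """Return w solving A_hat w = b_hat for (x, reward, x_next) transitions."""
    d = len(transitions[0][0])
    # A_hat starts at eps * I (assumed regularizer; eps = 0 disables it).
    A_hat = [[eps if i == j else 0.0 for j in range(d)] for i in range(d)]
    b_hat = [0.0] * d
    for x, reward, x_next in transitions:
        for i in range(d):
            b_hat[i] += reward * x[i]
            for j in range(d):
                A_hat[i][j] += x[i] * (x[j] - gamma * x_next[j])
    return solve_linear(A_hat, b_hat)


# Hypothetical data: a cycle 0 -> 1 -> 2 -> 0 with one-hot features,
# reward 1 when leaving state 0, traversed four times.
features = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
rewards = [1.0, 0.0, 0.0]
transitions = [(features[s], rewards[s], features[(s + 1) % 3]) for s in range(3)] * 4

w = lstd(transitions, gamma=0.5)
print(w)
```

Unlike the iterative update, this needs no step size, at the cost of storing and solving a $d \times d$ system. On this deterministic chain with one-hot features, the result should match the true values $(8/7, 2/7, 4/7)$ up to floating-point error.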