Response surface methods are a mix between ♟️ Reinforcement Learning and ✋ Active Learning that aim to find some $x^*$ that minimizes $f(x)$ (for an unknown function $f$). To do so, we repeatedly query $f$ and improve our guess for $x^*$.

Unlike active learning, our goal is to minimize $f$ instead of fitting it over the entire data. This makes the problem much more like reinforcement learning, specifically the 📖 Contextual Bandit, where $x$ corresponds to an action and $f(x)$ is the reward (or loss, in our minimization case).

Training

Given a set of datapoints $\{(x_i, y_i)\}$, fit a model $\hat{f} \approx f$, known as the response surface (analogous to a model of the world). Then, repeat the following.

  1. Pick the next $x$ using gradient descent on $\hat{f}$. This is similar to the exploitation step in standard reinforcement learning.
  2. Measure the corresponding $y = f(x)$, then use the pair $(x, y)$ to update $\hat{f}$ (see the sketch after this list).
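
A minimal sketch of this loop, assuming a noisy one-dimensional objective and a quadratic least-squares fit as the response surface; the example function `f` and all names here are illustrative, not part of any particular library.

```python
import numpy as np

# Unknown objective we can only query point-by-point (illustrative example).
def f(x):
    return (x - 2.0) ** 2 + 0.1 * np.random.randn()

# Fit a quadratic response surface f_hat(x) = a*x^2 + b*x + c by least squares.
def fit_surface(xs, ys):
    X = np.vstack([np.array(xs) ** 2, xs, np.ones(len(xs))]).T
    return np.linalg.lstsq(X, np.array(ys), rcond=None)[0]  # (a, b, c)

# Exploitation step: gradient descent on the fitted surface, not on f itself.
def descend(coeffs, x0, lr=0.1, steps=50):
    a, b, _ = coeffs
    x = x0
    for _ in range(steps):
        x -= lr * (2 * a * x + b)  # derivative of the surrogate w.r.t. x
    return x

# Initial design: a few random queries of the true objective.
rng = np.random.default_rng(0)
xs = list(rng.uniform(-5, 5, size=5))
ys = [f(x) for x in xs]

x = xs[-1]
for _ in range(20):
    coeffs = fit_surface(xs, ys)   # refit the response surface f_hat
    x = descend(coeffs, x)         # pick the next x by descending f_hat
    y = f(x)                       # measure the corresponding y = f(x)
    xs.append(x); ys.append(y)     # use (x, y) to update the model

print(f"best x ≈ {xs[np.argmin(ys)]:.3f}")
```

In practice the quadratic surrogate would be replaced by a richer differentiable model, and the pure exploitation step above is usually tempered with some exploration so the surface stays accurate away from the current minimum.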