The bolt-on fairness method converts a standard machine learning classifier $\hat{Y}$ into an adjusted classifier $\widetilde{Y}$ that we can control to satisfy fairness guarantees.
Specifically, for subgroups $a$ and $b$, $\widetilde{Y}$ is defined by four parameters:

$$p_{g,\hat{y}} = \Pr\left[\widetilde{Y} = 1 \mid A = g,\ \hat{Y} = \hat{y}\right] \quad \text{for } g \in \{a, b\},\ \hat{y} \in \{0, 1\}.$$
In other words, these parameters define the probability of outputting a positive prediction for each of the four groups defined by the subgroup $A$ and the prediction from $\hat{Y}$. For $\widetilde{Y} = \hat{Y}$, we would set $p_{a,1} = p_{b,1} = 1$ and $p_{a,0} = p_{b,0} = 0$. For random decisions (perfect fairness), we have $p_{a,0} = p_{a,1} = p_{b,0} = p_{b,1}$.
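As a minimal sketch of this randomized adjustment (the function and parameter names here are illustrative, not from any particular library):

```python
import random

def adjusted_predict(group, base_pred, p):
    """Bolt-on classifier: output 1 with probability p[(group, base_pred)],
    where group is the subgroup label and base_pred is the base model's prediction."""
    return 1 if random.random() < p[(group, base_pred)] else 0

# Recovering the base classifier: p_{g,1} = 1 and p_{g,0} = 0 for both groups,
# so the adjusted prediction always matches the base prediction.
identity = {("a", 0): 0.0, ("a", 1): 1.0, ("b", 0): 0.0, ("b", 1): 1.0}

# Perfect fairness by random decisions: all four parameters equal.
fair_coin = {(g, yh): 0.5 for g in ("a", "b") for yh in (0, 1)}
```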
These parameters thus range from perfect fairness to standard machine learning, giving us a wide range of valid models. Specifically, the set of all $(p_{a,0}, p_{a,1}, p_{b,0}, p_{b,1})$ gives us a Pareto frontier of classifiers defined on the difference in errors between subgroups and the overall error.
For a given maximum error difference $\epsilon$, we can solve a linear program to find the values of $p_{a,0}, p_{a,1}, p_{b,0}, p_{b,1}$ that minimize the overall error subject to the subgroup errors differing by at most $\epsilon$.
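To make this concrete, here is one way such a linear program could be set up and solved with `scipy.optimize.linprog`. The joint distributions and group weights below are made-up numbers for illustration, and the error decomposition assumes each subgroup's error is linear in its two parameters (which follows from the definition of $\widetilde{Y}$ above):

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical joint distributions P(Y=y, Yhat=yh | A=g) for the two subgroups,
# keyed by (true label, base prediction).
joint = {
    "a": {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4},
    "b": {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.2, (1, 1): 0.3},
}
weight = {"a": 0.5, "b": 0.5}   # P(A=g), assumed equal here
eps = 0.05                      # maximum allowed error difference

# Error of the adjusted classifier on group g is linear in (p_{g,0}, p_{g,1}):
#   err_g = P(Y=1|g) + sum_yh [P(Y=0,Yh=yh|g) - P(Y=1,Yh=yh|g)] * p_{g,yh}
def error_terms(g):
    const = joint[g][(1, 0)] + joint[g][(1, 1)]
    coefs = np.array([joint[g][(0, yh)] - joint[g][(1, yh)] for yh in (0, 1)])
    return const, coefs

ca, va = error_terms("a")   # err_a = ca + va @ [p_a0, p_a1]
cb, vb = error_terms("b")   # err_b = cb + vb @ [p_b0, p_b1]

# Decision variables x = [p_a0, p_a1, p_b0, p_b1]; minimize overall error.
c = np.concatenate([weight["a"] * va, weight["b"] * vb])
# |err_a - err_b| <= eps, written as two linear inequalities.
A_ub = np.array([np.concatenate([va, -vb]), np.concatenate([-va, vb])])
b_ub = np.array([eps - (ca - cb), eps - (cb - ca)])

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, 1)] * 4)
p = res.x
overall_error = weight["a"] * ca + weight["b"] * cb + res.fun
```

With these numbers the base model has error 0.2 on group $a$ and 0.4 on group $b$; the LP deliberately randomizes group $a$'s predictions to bring the gap within $\epsilon$, at the cost of a higher overall error.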
However, this bolt-on method restricts our possible models to only those that can be derived from our initial model $\hat{Y}$. There are possibly better models beyond our Pareto frontier above that we can't reach simply because our initial model wasn't as good.
The Pareto frontier formed from mixtures of models is thus better than the Pareto frontier from our bolt-on method. Finding this frontier is intractable in the worst case, but we do have heuristics like the 🎮 Oracle Fairness Approach.
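A mixture of models can be sketched as a classifier that picks one of its component models at random on each prediction; its expected error (and signed error difference) is then the weighted average of the components', which is why mixtures trace out the convex hull of the individual models' points. The function below is a hypothetical illustration, not code from any referenced approach:

```python
import random

def mixture_predict(x, models, weights):
    """Predict with models[i] chosen with probability weights[i]
    (weights are assumed non-negative and sum to 1)."""
    r, acc = random.random(), 0.0
    for model, w in zip(models, weights):
        acc += w
        if r < acc:
            return model(x)
    return models[-1](x)  # fallback for floating-point rounding in the weights
```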