The bolt-on fairness method converts a standard machine learning classifier $\hat{Y}$ into an adjusted classifier $\tilde{Y}$ that we can control to satisfy fairness guarantees.

Specifically, for subgroups $A = 0$ and $A = 1$, the adjusted classifier $\tilde{Y}$ is defined by four parameters:

$$p_{\hat{y}, a} = \Pr\big(\tilde{Y} = 1 \mid \hat{Y} = \hat{y},\, A = a\big), \qquad \hat{y} \in \{0, 1\},\ a \in \{0, 1\}$$

In other words, these parameters define the probability of outputting a positive prediction for each of the four groups defined by the subgroup $A$ and the prediction from $\hat{Y}$. To recover $\tilde{Y} = \hat{Y}$, we would set $p_{1,0} = p_{1,1} = 1$ and $p_{0,0} = p_{0,1} = 0$. For random decisions (perfect fairness), we have $p_{\hat{y}, a} = \frac{1}{2}$ for all four parameters.
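As a minimal sketch of how the adjusted classifier operates (the function and parameter names here are illustrative, not from the text): given the base prediction and the subgroup, it flips a biased coin with the corresponding parameter.

```python
# Sketch of the adjusted classifier: output 1 with probability
# p[(yhat, a)] given base prediction yhat and subgroup a.
import random

def adjusted_predict(base_pred, group, p, rng=random):
    """base_pred, group in {0, 1}; p[(yhat, a)] = P(output 1 | Yhat=yhat, A=a)."""
    return 1 if rng.random() < p[(base_pred, group)] else 0

# Recovering the base classifier: probability 1 when Yhat = 1, 0 when Yhat = 0.
identity = {(0, 0): 0.0, (0, 1): 0.0, (1, 0): 1.0, (1, 1): 1.0}

# Perfect fairness by coin flip: every parameter is 1/2.
coin_flip = {key: 0.5 for key in identity}
```

With the `identity` parameters, `adjusted_predict` always agrees with the base prediction; with `coin_flip`, its output is independent of both the base prediction and the subgroup.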

These parameters thus range from perfect fairness to standard machine learning, giving us a wide range of valid models. Specifically, the set of all parameter vectors $(p_{0,0}, p_{0,1}, p_{1,0}, p_{1,1}) \in [0, 1]^4$ gives us a Pareto frontier of models defined on the difference in group errors and the overall error.

For a certain maximum difference $\epsilon$ between the two groups' error rates, we can solve a linear program to find the values of $p_{\hat{y}, a}$ that minimize the overall error:

$$\min_{p \in [0, 1]^4} \Pr\big(\tilde{Y} \neq Y\big) \quad \text{subject to} \quad \big|\Pr(\tilde{Y} \neq Y \mid A = 0) - \Pr(\tilde{Y} \neq Y \mid A = 1)\big| \leq \epsilon$$

Both the objective and the constraint are linear in the four parameters, so this is a valid linear program.
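This LP can be sketched concretely. The numbers below are made up for illustration: `q0` and `q1` are hypothetical joint distributions of (base prediction, true label) within each subgroup, and `pi` holds the subgroup proportions. Each group's error is affine in its two parameters, so `scipy.optimize.linprog` solves the problem directly.

```python
# Sketch: bolt-on fairness as a linear program (illustrative numbers).
import numpy as np
from scipy.optimize import linprog

# Hypothetical q_a[yhat, y] = P(Yhat=yhat, Y=y | A=a); rows yhat, cols y.
q0 = np.array([[0.45, 0.10],
               [0.05, 0.40]])
q1 = np.array([[0.35, 0.20],
               [0.15, 0.30]])
pi = np.array([0.6, 0.4])   # subgroup proportions P(A=0), P(A=1)
eps = 0.05                  # maximum allowed difference in group errors

# err_a(p) = sum_yhat [ q_a[yhat,0]*p[yhat,a] + q_a[yhat,1]*(1 - p[yhat,a]) ]
#          = c_a . p_a + d_a,  with c_a[yhat] = q_a[yhat,0] - q_a[yhat,1].
c0, d0 = q0[:, 0] - q0[:, 1], q0[:, 1].sum()
c1, d1 = q1[:, 0] - q1[:, 1], q1[:, 1].sum()

# Variable order: [p(0, a=0), p(1, a=0), p(0, a=1), p(1, a=1)].
obj = np.concatenate([pi[0] * c0, pi[1] * c1])   # minimize overall error
A_ub = np.array([np.concatenate([c0, -c1]),      # err_0 - err_1 <= eps
                 np.concatenate([-c0, c1])])     # err_1 - err_0 <= eps
b_ub = np.array([eps - (d0 - d1), eps + (d0 - d1)])

res = linprog(obj, A_ub=A_ub, b_ub=b_ub, bounds=[(0, 1)] * 4)
p = res.x
err0, err1 = c0 @ p[:2] + d0, c1 @ p[2:] + d1
overall_error = pi[0] * err0 + pi[1] * err1
print(overall_error, err0, err1)
```

With these particular numbers, the base classifier alone has group errors 0.15 and 0.35 (a gap of 0.20), and the LP trades some accuracy in the better-served group to bring the gap down to $\epsilon$.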

Limitations

However, this bolt-on method restricts our possible models to only those that can be derived from our initial model $\hat{Y}$. There are possibly better models beyond our Pareto frontier above, and we can't reach them simply because our initial model wasn't good enough.

The Pareto frontier formed from mixtures of all possible models is thus better than the Pareto frontier from our bolt-on method. However, finding this frontier is intractable in the worst case, but we do have heuristics like the 🔮 Oracle Fairness Approach.