The No Free Lunch theorem states that every classification algorithm, averaged over all possible data-generating distributions, has the same error rate on unobserved data. In other words, no algorithm is universally better than any other.

Note

Intuitively, this is saying that inductive reasoning (inferring general rules from a limited set of examples) is not logically valid, since we have, by definition, never seen an example of an unobserved data point.

Empirical machine learning thrives because real-world data come from specific probability distributions. Empirical models are not assessed over all possible data-generating distributions, only over the specific ones they encounter, so the models we design can be tailored to perform well on those distributions.
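The theorem's averaging claim can be checked by brute force on a tiny domain. The sketch below (a toy illustration, not a proof; the learners and domain are invented for this example) enumerates every possible binary labeling of four points, trains two different deterministic learners on the first two points, and measures error on the two unobserved points. Averaged over all labelings, both learners land at exactly 0.5, no better than guessing:

```python
from itertools import product

domain = [0, 1, 2, 3]
train_points = [0, 1]   # observed examples
test_points = [2, 3]    # unobserved data

def majority_learner(train_labels):
    # Predict the majority training label everywhere (ties -> 0).
    guess = int(sum(train_labels) > len(train_labels) / 2)
    return lambda x: guess

def constant_zero_learner(train_labels):
    # Ignore the data entirely and always predict 0.
    return lambda x: 0

def avg_test_error(learner):
    # Average test-set error over all 2^4 possible "true" labelings.
    errors = []
    for truth in product([0, 1], repeat=len(domain)):
        h = learner([truth[x] for x in train_points])
        wrong = sum(h(x) != truth[x] for x in test_points)
        errors.append(wrong / len(test_points))
    return sum(errors) / len(errors)

print(avg_test_error(majority_learner))       # 0.5
print(avg_test_error(constant_zero_learner))  # 0.5
```

The key observation is that the learner's predictions depend only on the training labels, while the test labels vary uniformly over all possibilities, so any fixed algorithm averages out to chance-level error on the unobserved points.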