The Pokémon Theorem

If you have been near algorithmic fairness for the past decade, you already know the punchline: you cannot have all the things you want. Calibration, class-conditional balance, and predictive parity refuse to coexist whenever base rates differ across groups. This is the impossibility trinity of Kleinberg, Mullainathan and Raghavan (2017), Chouldechova (2017) and Pleiss et al. (2017). Barocas, Hardt and Narayanan collect the pairwise tensions among independence, separation and sufficiency into what they call the incompatibility triangle. Hutchinson and Mitchell (2019) survey the fifty years of educational-testing impossibility work that came before all of this. The 1970s knew most of the punchline. None of us listened.

There is an obvious question that the trinity leaves open. What if we only impose scalar criteria, rather than distributional ones? Will a few of them suffice to ensure fairness? After all, this is what individual scalar metrics such as college acceptance rates, recidivism rates, etc. aim to control for, only to be thwarted again by another study. With dozens of criteria, with hundreds, with every reasonable test anyone has ever proposed, can we close every gap? It turns out we cannot. The argument is a very simple piece of linear algebra.

A kernel refresher

Pick a characteristic kernel \(k\) on the feature space \(\mathcal{X}\) with feature map \(\phi\). For each group \(g \in \{a, b\}\), the conditional mean embedding is

\[\mu_g \;:=\; \mathbb{E}[\phi(X) \mid G = g] \;\in\; \mathcal{H}.\]

A linear mean-fairness criterion is a test direction \(v \in \mathcal{H}\), and the score (or classifier, or learned representation) passes the test exactly when the expected score on group \(a\) matches that on group \(b\):

\[\mathbb{E}[c(X) \mid G = a] - \mathbb{E}[c(X) \mid G = b] \;=\; \langle v, \delta \rangle \;=\; 0,\]

where \(\delta := \mu_a - \mu_b\) is the group-difference vector. Demographic parity gaps, calibration moments, equalized-odds residuals, every “is the expected something the same across groups” check fits this form for some \(v\). A fairness checklist of size \(m\) is a finite set of directions \(\{v_1, \ldots, v_m\}\), and the classifier passes the checklist iff \(\delta\) is orthogonal to every \(v_i\).

If the groups are distributionally distinct, \(P_a \neq P_b\), then \(\delta \neq 0\) by the characteristic property of \(k\). This is the only ingredient we need.

The geometric escape

Let \(V_m = \mathrm{span}\{v_1, \ldots, v_m\}\) be the subspace of the tests we apply. This is why the RKHS formulation helps: tests and distribution differences are vectors in the same inner-product space, so passing a mean-fairness criterion is exactly orthogonality to a test direction, and a whole checklist is just a subspace. If our classifier passes all \(m\) criteria, then \(\delta \in V_m^\perp\), the orthogonal complement of the audit subspace. Now consider the unit vector \(\delta / \|\delta\|_\mathcal{H}\). By construction it also lies in \(V_m^\perp\), and

\[\left\langle \frac{\delta}{\|\delta\|_\mathcal{H}}, \, \delta \right\rangle \;=\; \|\delta\|_\mathcal{H} \;=\; \mathrm{MMD}(P_a, P_b) \;>\; 0.\]

That direction is a fairness violation. We just constructed it. Whichever finite checklist you brought, the MMD witness is sitting in its orthogonal complement, perfectly visible, completely unaudited.

Gotta Catch ’Em All? Alas, the Pokémon theorem says that you can’t.

Why does this work?

Our argument is structural, not numerical. It does not depend on which criteria you chose, only on the fact that there are finitely many of them and the groups are distributionally distinct. The four-hundredth criterion buys you an audit subspace of dimension at most four hundred, and the orthogonal complement of a four-hundred-dimensional subspace of an infinite-dimensional Hilbert space is still infinite-dimensional. Some direction in it carries the group-difference signal. The MMD witness is always one such direction.

You can ask how large the residual is after \(m\) criteria, and the paper gives a quantitative answer in terms of the spectral regularity of \(\delta\) relative to the pooled data covariance: under polynomial eigendecay of the covariance operator and a source condition on \(\delta\), the minimax residual decays at the Kolmogorov \(m\)-width rate \(\Theta(m^{-2\alpha r})\). The minimax-optimal allocation of a size-\(m\) fairness budget is the top-\(m\) Mercer eigenspace of the pooled covariance. Spectral budgeting beats heuristic checklists. The paper has the details.

Provenance

I first posited this theorem (without proof) in a Stanford CS 329P lecture on applied machine learning. It sat there for a few years, in the form of a single slide and a feeling that it ought to be true. Then my son Daniel Matsui Smola and I sat down together and turned it into a proof, a quantitative version, and a handful of approximation results around it. I have written quite a few papers over the years. This is the first one I have written with Daniel, and I am incredibly proud of him. Watching him chase down the spectral arguments and add many more results has been one of the genuine joys this year.

Next post in this series: the impossibility of fair feature learning. A different and more brutal piece of orthogonality folklore from the same paper. Check out arXiv:2605.09221.