Mathematical Fairness Trade-offs
- Mathematical fairness trade-offs occur because several widely used fairness definitions are mutually incompatible, so a single model generally cannot satisfy them all at once.
- The "Impossibility Theorem" proves that calibration, predictive parity, and equalized odds cannot coexist if base rates differ across groups.
- Practitioners must explicitly choose which fairness metric to prioritize based on the specific social context and the cost of different types of errors.
- Achieving fairness often requires a deliberate sacrifice in overall model accuracy, creating a Pareto frontier between performance and equity.
Why It Matters
In the financial services industry, companies like Zest AI use fairness-aware modeling to ensure that credit scoring algorithms do not disproportionately exclude protected groups. Because credit data often contains historical biases, these firms must navigate the trade-off between maximizing profit (minimizing defaults) and ensuring equitable access to capital. They often prioritize calibration to ensure that a credit score represents the same risk level across all demographics.
In the criminal justice system, tools like COMPAS were designed to predict recidivism. Research has shown that these tools often exhibit different error rates for different racial groups, leading to significant public debate. The mathematical trade-off here is between the "False Positive" rate (wrongly predicting someone will re-offend) and the "False Negative" rate (wrongly predicting someone will not re-offend), where the social cost of these errors is vastly different for the individuals involved.
In the healthcare sector, diagnostic algorithms used for skin cancer detection or cardiovascular risk assessment must be calibrated across different ethnicities. If a model is trained primarily on lighter-skinned patients, it may be less accurate for darker-skinned patients, leading to delayed diagnoses. Developers must balance the overall accuracy of the model with the need to ensure that the error rates (equalized odds) do not create systemic health disparities.
How It Works
The Intuition of Conflict
In machine learning, we are often taught to optimize for a single objective: accuracy. However, when we introduce fairness, we are essentially adding a second, often competing, objective. Imagine you are building a loan approval system. You want to be accurate (predict who will pay back the loan), but you also want to be fair (not discriminate based on gender).
The core of the mathematical fairness trade-off is that these goals often pull in opposite directions. If one group has historically faced systemic barriers, their "base rate" of success might be lower in your training data. If you force the model to have equal success rates across groups (Demographic Parity), you might have to ignore certain features that correlate with the target, which lowers your overall accuracy. If you instead focus on calibration, you might end up with different success rates for different groups, which some might perceive as unfair.
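As a concrete illustration, here is a minimal synthetic sketch (all data, group effects, and thresholds are invented for this example) in which enforcing equal approval rates, i.e., demographic parity, pushes each group's threshold away from the single accuracy-optimal cutoff:
import numpy as np
rng = np.random.default_rng(0)
n = 20_000
group = rng.integers(0, 2, n)
# Creditworthiness signal; group 1 is shifted up, so base rates differ
x = rng.normal(size=n) + 0.8 * group
repaid = (x + rng.normal(0, 0.5, n) > 0).astype(int)
def report(approved, label):
    acc = (approved == repaid).mean()
    r0, r1 = approved[group == 0].mean(), approved[group == 1].mean()
    print(f"{label}: accuracy={acc:.3f}, approval rates {r0:.2f} vs {r1:.2f}")
# Accuracy-first: one shared threshold at the accuracy-optimal cut (x > 0)
report((x > 0).astype(int), "Shared threshold")
# Demographic parity: per-group thresholds matched to the pooled approval rate
pooled = (x > 0).mean()
t0 = np.quantile(x[group == 0], 1 - pooled)
t1 = np.quantile(x[group == 1], 1 - pooled)
report(np.where(group == 0, x > t0, x > t1).astype(int), "Demographic parity")
The parity-constrained rule equalizes approval rates, but its accuracy drops because both per-group thresholds have moved away from the optimal shared one.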
The Impossibility Theorem
The most significant finding in this field is the Kleinberg, Mullainathan, and Raghavan (2016) "Impossibility Theorem." It mathematically demonstrates that if two groups have different base rates for a target outcome, no imperfect predictor can satisfy calibration, equalized odds, and predictive parity simultaneously.
Think of it like a three-legged stool where the legs are of different lengths. You can make two of them level, but the third will inevitably be off-balance. This is not a failure of the algorithm; it is a mathematical property of the data. When we observe a disparity in outcomes, we are often seeing the reflection of societal inequalities captured in the data. Trying to "fix" this through model constraints forces the model to choose which type of inequality it is willing to tolerate.
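The stool metaphor can be made precise with a little algebra. For any classifier, the false positive rate is tied to the base rate p, the positive predictive value (PPV), and the true positive rate (TPR) by the identity FPR = p / (1 - p) * (1 - PPV) / PPV * TPR, a relationship highlighted by Chouldechova (2017). The sketch below fixes PPV (predictive parity) and TPR to be equal across two hypothetical groups; differing base rates then force differing false positive rates, breaking equalized odds:
def implied_fpr(base_rate, ppv, tpr):
    # FPR implied by the identity FPR = p/(1-p) * (1-PPV)/PPV * TPR
    return base_rate / (1 - base_rate) * (1 - ppv) / ppv * tpr
ppv, tpr = 0.8, 0.7            # held equal across both groups by assumption
for p in (0.3, 0.5):           # two groups with different base rates
    print(f"base rate {p:.1f} -> implied FPR {implied_fpr(p, ppv, tpr):.3f}")
# Prints 0.075 for base rate 0.3 and 0.175 for base rate 0.5:
# equal PPV and equal TPR force unequal FPRs when base rates differ.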
The Pareto Frontier of Fairness
When we train a model, we can visualize the trade-off using a Pareto frontier. On the x-axis, we plot a fairness metric (e.g., the difference in false positive rates), and on the y-axis, we plot accuracy. As you move along the curve, you are making a conscious choice. You can increase fairness, but you must accept a drop in accuracy.
This is the "Fairness-Accuracy Trade-off." It is crucial to understand that this is not a permanent state. Often, the trade-off exists because our feature set is incomplete or biased. However, in the short term, practitioners must decide whether the cost of a false negative is higher for one group than another. This is an ethical decision, not a technical one. The math simply reveals the cost of your ethical preference.
Common Pitfalls
- "Fairness can be achieved by removing the protected attribute." Many believe that deleting race or gender from the dataset solves the problem. In reality, other features (like zip code or education) act as proxies, and the model will still learn to discriminate based on these correlated variables.
- "There is one universal mathematical definition of fairness." Learners often search for the "correct" metric, but fairness is context-dependent. A metric that works for a hiring algorithm may be entirely inappropriate for a medical diagnostic tool.
- "Higher accuracy always implies a fairer model." This is false; a model can be highly accurate by simply reinforcing historical biases present in the training data. Accuracy measures how well the model fits the data, not how well it adheres to ethical standards.
- "Mathematical fairness is a purely technical problem." Fairness is a socio-technical challenge. While the constraints are mathematical, the decision of which constraint to prioritize is a value judgment that must involve stakeholders, legal experts, and the affected communities.
Sample Code
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
# Simulate data: 1000 samples, 2 groups (0 and 1)
# Group 1 has a higher base rate of success
X = np.random.randn(1000, 2)
groups = np.random.randint(0, 2, 1000)
y = (X[:, 0] + 0.5 * groups > 0).astype(int)
# Train a standard logistic regression; the model sees only X, never the
# group label, yet its error rates will still differ across groups
model = LogisticRegression().fit(X, y)
preds = model.predict(X)  # evaluated in-sample for simplicity
def get_metrics(y_true, y_pred, group_mask):
    # Confusion matrix restricted to one group; labels=[0, 1] guards
    # against a group in which only one class appears
    tn, fp, fn, tp = confusion_matrix(
        y_true[group_mask], y_pred[group_mask], labels=[0, 1]
    ).ravel()
    tpr = tp / (tp + fn)  # True Positive Rate
    fpr = fp / (fp + tn)  # False Positive Rate
    return tpr, fpr
# Compare TPR/FPR across groups
tpr0, fpr0 = get_metrics(y, preds, groups == 0)
tpr1, fpr1 = get_metrics(y, preds, groups == 1)
print(f"Group 0: TPR={tpr0:.2f}, FPR={fpr0:.2f}")
print(f"Group 1: TPR={tpr1:.2f}, FPR={fpr1:.2f}")
# Example output (exact values vary from run to run, since the data is random):
# Group 0: TPR=0.72, FPR=0.18
# Group 1: TPR=0.85, FPR=0.22
# Note: Disparity exists because the model reflects the base rate difference.
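To connect this output back to the third metric in the impossibility result, a short extension (not part of the original snippet) also compares predictive parity (PPV) across groups:
def get_ppv(y_true, y_pred, group_mask):
    # Positive Predictive Value: of those predicted positive, how many truly are?
    predicted_pos = group_mask & (y_pred == 1)
    return y_true[predicted_pos].mean()
print(f"Group 0: PPV={get_ppv(y, preds, groups == 0):.2f}")
print(f"Group 1: PPV={get_ppv(y, preds, groups == 1):.2f}")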