Discriminating Data and the Politics of Recognition
Wendy Hui Kyong Chun's Discriminating Data is a hard book for the age of machine learning because it refuses the easy story that biased systems merely fail. It asks what happens when recognition itself becomes a sorting machine.
The Book
Discriminating Data: Correlation, Neighborhoods, and the New Politics of Recognition was published by the MIT Press. The publisher lists Wendy Hui Kyong Chun as author, with the hardcover (ISBN 9780262046220) published November 2, 2021, and the paperback (ISBN 9780262548526) published March 5, 2024. Penguin Random House's distribution listing for the MIT Press gives the paperback as 344 pages. Amazon's paperback product page uses the paperback ISBN-10, 0262548526, as its product identifier.
The book belongs beside Chun's earlier Control and Freedom and Updating to Remain the Same, but it is more directly aimed at machine learning. Its subject is not simply data bias. It is the deeper political fantasy that a population can be made legible by finding patterns of likeness, grouping people by correlation, and then calling the resulting recognition objective.
Correlation Is Not Innocent
Chun's strongest move is to treat correlation as a historical and political instrument, not just a statistical technique. Predictive systems act through resemblance: people who look, act, click, buy, move, speak, or associate like others are treated as likely to share a future. The problem is not only that the resemblance can be wrong. The problem is that resemblance is already socially organized.
This matters for AI because machine learning often turns inherited relations into operational defaults. A model does not need an explicit racial category, class label, or political identity to reproduce social structure. Proxies, neighborhoods, embeddings, and interaction patterns can carry the work. The result is a system that appears to discover groups while helping to harden them.
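A toy illustration of the proxy point may help. The sketch below is not drawn from Chun's book: the records, the neighborhood names, the group labels "a" and "b", and the 0.5 threshold are all invented for illustration. It fits an approval rule on neighborhood alone, never reading the group label, and then shows that the rule's outputs still split along group lines because the neighborhoods were already sorted.

```python
from collections import defaultdict

# Hypothetical historical records: (neighborhood, group, approved).
# The counts are invented; group membership is unevenly spread across areas.
RECORDS = (
    [("north", "a", 1)] * 40 + [("north", "a", 0)] * 10
    + [("north", "b", 1)] * 8 + [("north", "b", 0)] * 2
    + [("south", "a", 1)] * 3 + [("south", "a", 0)] * 7
    + [("south", "b", 1)] * 12 + [("south", "b", 0)] * 28
)


def fit_neighborhood_rates(records):
    """Learn the past approval rate per neighborhood; group is never read."""
    totals = defaultdict(lambda: [0, 0])  # neighborhood -> [approved, seen]
    for neighborhood, _group, approved in records:
        totals[neighborhood][0] += approved
        totals[neighborhood][1] += 1
    return {n: a / t for n, (a, t) in totals.items()}


def predict(rates, neighborhood, threshold=0.5):
    """Approve anyone whose neighborhood had a high past approval rate."""
    return rates[neighborhood] >= threshold


def approval_rate_by_group(records, rates):
    """Audit step: share of each group the neighborhood-only rule approves."""
    totals = defaultdict(lambda: [0, 0])  # group -> [approved, seen]
    for neighborhood, group, _approved in records:
        totals[group][0] += predict(rates, neighborhood)
        totals[group][1] += 1
    return {g: a / t for g, (a, t) in totals.items()}


rates = fit_neighborhood_rates(RECORDS)
by_group = approval_rate_by_group(RECORDS, rates)
print(rates)
print(by_group)
```

In this invented data the rule approves most of group "a" and a small share of group "b" without ever seeing a group label. The group column is used only in the audit step, which is the point: removing a category from the inputs does not remove it from the outcome.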
For Spiralism, this is a theory of algorithmic belief formation. The system says: you are like these people, so you will want this, fear that, fail here, belong there. The user is then shown a world shaped by that classification. Prediction becomes training. Recognition becomes a loop.
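The loop can be made concrete with a deliberately simple simulation. Everything here is an assumption made for illustration, not a model of any real platform: two topics, a user whose true interest in both is identical, a small head start in logged clicks for one topic, and an exposure rule that weights topics by past clicks raised to a power (the hypothetical `sharpen` parameter). Clicks are tracked as expected values, so the run is deterministic.

```python
def simulate_loop(rounds=20, impressions=100.0, true_interest=(0.5, 0.5),
                  initial_clicks=(11.0, 10.0), sharpen=2.0):
    """Return the exposure share of topic 0 at each round.

    Exposure is proportional to logged clicks ** sharpen. Clicks can only
    accrue on what is shown, so the log the system learns from is shaped
    by the system's own earlier choices.
    """
    clicks = list(initial_clicks)
    exposure_history = []
    for _ in range(rounds):
        weights = [c ** sharpen for c in clicks]
        total = sum(weights)
        exposure = [w / total for w in weights]
        exposure_history.append(exposure[0])
        # Expected clicks this round: impressions shown times true interest.
        for i in range(len(clicks)):
            clicks[i] += impressions * exposure[i] * true_interest[i]
    return exposure_history


amplified = simulate_loop()
proportional = simulate_loop(sharpen=1.0)
print(amplified[0], amplified[-1])
print(proportional[0], proportional[-1])
```

Under these assumptions the user's interest never changes, yet with `sharpen=2.0` the first topic's share of exposure rises every round, while the purely proportional rule (`sharpen=1.0`) merely preserves the initial head start. The drift comes from the ranking rule feeding on its own logs, not from anything the user came to want.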
The Recognition Trap
The politics of recognition has usually been framed as a demand to be seen. Chun asks what happens when being seen by computational systems means being sorted into managed similarity. In recommender systems, personalization can narrow a field of encounter. In facial recognition, recognition can become an infrastructure of suspicion. In social platforms, homophily can be treated as natural affinity even when the platform architecture amplifies sameness.
The book is valuable because it does not stop at saying that datasets are unrepresentative. That claim is true but incomplete. A more representative dataset can still support a harmful classificatory regime if the goal remains prediction through social sorting. The question is not only whose data are missing. It is what kind of world the system is trying to make predictable.
NIST's face recognition work is useful context here because it treats demographic differentials as measurable system performance questions rather than vibes. That is necessary. Chun's contribution is to press the prior question: why are these systems being asked to recognize, cluster, and operationalize identity in the first place?
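What measuring a demographic differential looks like can be sketched in a few lines. The scores, the group labels "x" and "y", and the 0.6 threshold below are invented, and the function is a minimal illustration rather than NIST's evaluation code. It computes a false match rate (different people wrongly accepted) and a false non-match rate (the same person wrongly rejected) separately for each group at one shared threshold.

```python
from collections import defaultdict

# Hypothetical comparison records: (similarity_score, same_person, group).
COMPARISONS = [
    (0.1, False, "x"), (0.2, False, "x"), (0.3, False, "x"), (0.7, False, "x"),
    (0.9, True, "x"), (0.8, True, "x"), (0.7, True, "x"), (0.5, True, "x"),
    (0.1, False, "y"), (0.2, False, "y"), (0.3, False, "y"), (0.4, False, "y"),
    (0.9, True, "y"), (0.8, True, "y"), (0.7, True, "y"), (0.65, True, "y"),
]


def error_rates_by_group(comparisons, threshold):
    """Return per-group false match and false non-match rates."""
    counts = defaultdict(
        lambda: {"false_match": 0, "impostor": 0,
                 "false_non_match": 0, "genuine": 0}
    )
    for score, same_person, group in comparisons:
        c = counts[group]
        matched = score >= threshold
        if same_person:
            c["genuine"] += 1
            c["false_non_match"] += int(not matched)
        else:
            c["impostor"] += 1
            c["false_match"] += int(matched)
    return {
        group: {
            # None marks a group with no comparisons of that kind.
            "fmr": c["false_match"] / c["impostor"] if c["impostor"] else None,
            "fnmr": (c["false_non_match"] / c["genuine"]
                     if c["genuine"] else None),
        }
        for group, c in counts.items()
    }


group_rates = error_rates_by_group(COMPARISONS, threshold=0.6)
print(group_rates)
```

With these invented scores, one threshold produces errors for group "x" and none for group "y". A table like this answers how unevenly a system performs. It does not answer the prior question of whether the recognition task should be built at all.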
The Governance Reading
Read in 2026, Discriminating Data is a governance book even though it is not written in the idiom of compliance. NIST's AI Risk Management Framework addresses risk across the design, development, use, and evaluation of AI systems. The European Commission's AI Act page presents risk-based rules and identifies areas such as biometrics, education, employment, and law enforcement as high-risk contexts. Those frameworks make one thing clear: discriminatory data practices are not only research problems. They are institutional deployment problems.
Chun's book sharpens that lesson. Governance cannot be limited to model accuracy or post-deployment audits. It must ask about defaults, categories, optimization goals, data lineage, proxies, feedback loops, affected communities, and the right to refuse classification. A system can be accurate and still politically destructive if it accurately reproduces a segregated world.
Where the Book Needs Care
The book's language is dense, and that density matters. Chun is writing across media theory, statistics, race, platform studies, and political critique. The price is that some readers looking for a procurement checklist or technical audit method will have to do translation work. This is not a weakness exactly, but it affects how the book travels into policy rooms.
The other caution is that "correlation" can become too capacious if used as a universal villain. Some correlations are useful, some are dangerous, and many are only meaningful inside a specific decision setting. The book is at its best when it keeps the question concrete: what relation is being measured, who benefits from making it predictive, and what alternatives are foreclosed once the pattern becomes policy?
What This Changes
Discriminating Data gives this archive a vocabulary for the social life of machine learning. Ask not only whether a system is biased, but what model of likeness it builds. Ask how it defines neighborhoods, what it treats as normal, what it recognizes as signal, and what gets disciplined as noise. Ask whether the system opens new possibilities or makes old categories more durable.
The practical lesson is sober: desegregating AI is not achieved by adding a fairness metric to an unchanged architecture of recognition. It requires different defaults, different publics, different data practices, and different rights around refusal, contestation, and repair. Chun's book is a warning against mistaking visibility for justice. To be recognized by a machine is not the same as being understood.
Sources
- MIT Press, Discriminating Data: Correlation, Neighborhoods, and the New Politics of Recognition, publisher listing for title, author, eBook ISBN 9780262367257, hardcover ISBN 9780262046220, paperback ISBN 9780262548526, publication dates, publisher, and description, reviewed June 15, 2026.
- Penguin Random House, Discriminating Data by Wendy Hui Kyong Chun, distribution listing for title, author, subtitle, paperback ISBN 9780262548526, MIT Press imprint, publication date, page count, and Amazon vendor identifier, reviewed June 15, 2026.
- Amazon, Discriminating Data, retail listing and ASIN/ISBN-10 0262548526 for the paperback edition, reviewed June 15, 2026.
- NIST, Face Recognition Vendor Test Part 3: Demographic Effects, official publication page for NISTIR 8280 on demographic effects in face recognition evaluation, reviewed June 15, 2026.
- NIST AI Resource Center, AI Risk Management Framework, official AI RMF overview, voluntary-use statement, and design/development/use/evaluation framing, reviewed June 15, 2026.
- European Commission, AI Act, official page for Regulation (EU) 2024/1689, risk-based AI rules, high-risk use cases, GPAI rules, and implementation timeline, reviewed June 15, 2026.