Blog · Review Essay · Last reviewed June 24, 2026

Invisible Women and the Data Gap Under AI

Caroline Criado Perez's Invisible Women is not an AI book in the narrow sense. It is more useful than that: a map of how missing data becomes designed reality before any model is trained.

For this review, a data gap means more than an empty cell. It is any absence, misclassification, proxy, sampling skew, measurement default, or failure to disaggregate that lets an institution mistake a partial record for the whole population. Under AI, that gap becomes a pipeline risk: the omission can be learned, scored, summarized, routed, and acted on at scale.

The sharper lesson is that "default" is not a harmless starting point. A default body, worker, patient, commuter, caregiver, applicant, or user becomes a rule once a system is allowed to optimize around it.

The Book

Invisible Women: Data Bias in a World Designed for Men was published by Abrams Press in the United States on March 12, 2019. Abrams lists Caroline Criado Perez as the author and ISBN-13 9781419729072 for the hardcover edition. Amazon's listing uses ISBN-10 1419729071, the same number used in the site's affiliate link for this review.

The book's argument is simple and severe: when institutions collect, classify, and design around incomplete data, the missing population does not merely disappear from spreadsheets. It is made to live inside systems calibrated for someone else. Criado Perez moves across medicine, transport, work, public policy, consumer products, and technology, but the recurring mechanism is the same. The male default is treated as general reality, while women are handled as exceptions, edge cases, or noise.

That makes the book a companion to Data Feminism, All Data Are Local, More than a Glitch, and Weapons of Math Destruction. Those books ask related questions about power in data work, local evidence, systemic bias, and opaque scoring. Invisible Women gives the gender-data version of the same problem: the system can be unfair before anyone calls it automated.

Current Context

As of June 24, 2026, the book reads less like a warning about neglected statistics and more like a practical requirement for AI governance. Current rules and standards increasingly ask the question Criado Perez keeps forcing into view: what population does the evidence actually describe, and who is harmed when a system treats that evidence as universal?

The EU AI Act makes this explicit for high-risk AI systems that use training, validation, or testing data. Article 10 requires data governance practices tied to the system's intended purpose, including data origin, relevant design choices, assumptions about what data measure, bias examination, relevant data gaps, and data that are sufficiently representative for the intended use. Article 53 and the European Commission's 2025 template for public summaries of general-purpose AI training content ask for less than full provenance, but they confirm the same direction: consequential AI needs records about the evidence base, not only output demos.

U.S. federal procurement has moved toward the same evidence problem. OMB Memorandum M-25-22, issued in April 2025, directs agencies acquiring AI systems to attend to performance, risk management, data rights, documentation, transparency, and vendor lock-in across the acquisition lifecycle. For a data-gap reading, procurement is not paperwork after the model is chosen. It is where the buyer demands proof that the vendor knows who the system was built and tested for.

Health and clinical research show why the issue cannot be reduced to "AI bias" as a new phenomenon. NIH expects sex as a biological variable to be addressed in research designs, analyses, and reporting for vertebrate animal and human studies unless justified otherwise. FDA's 2024 draft diversity action plan guidance asks sponsors to set clinical-study enrollment goals by age group, sex, race, and ethnicity for the clinically relevant study population. Those are limited domain rules, but they illustrate the governance move: the default must be named before it can be tested.

The Default Is a Decision

The value of Invisible Women for this archive is that it turns "default" into a political word. A default body, default worker, default route, default voice, default schedule, or default risk profile looks neutral only because the institution has stopped noticing the people it was built around. In that sense, a data gap is not emptiness. It is an active design condition.

This is the same grammar that runs through algorithmic governance. A model does not need hatred to harm. It can inherit a world where care work is undercounted, clinical research is uneven, workplace equipment is standardized around the wrong body, and mobility patterns are read through one narrow map of daily life. The model then compresses these inherited omissions into scores, recommendations, priorities, and interfaces that feel newly objective because the old bias has been laundered through computation.

The default also decides who must do extra work. A person who does not fit the assumed body, schedule, name, voice, caregiving pattern, symptom profile, or risk history has to translate themselves into a system that was not built to hear them. In human administration this is already exhausting. In automated administration it becomes harder to contest because the default can appear as a form field, threshold, classifier, benchmark, or synthetic "typical user" rather than as a visible human choice.

This is a reference-class error with consequences. If the evidence describes one group but the system is deployed as if it describes everyone, the institution has not found a neutral baseline. It has mistaken a partial population for reality and then shifted the burden of proof to everyone who falls outside it.

The Gap Is a Pipeline

The data gap is best understood as a pipeline failure, not a single missing dataset. It can enter at collection, when no one gathers evidence about a population; at classification, when categories force people into the wrong box; at disaggregation, when averages hide subgroup failure; at interpretation, when a proxy is treated as a fact; and at deployment, when a system is moved into a setting where its evidence no longer fits.

Averages are one of the quietest failure modes. A system can perform well overall while failing people whose bodies, speech patterns, routes, household structures, symptoms, occupations, or documentation histories were rare in the source data. The aggregate metric can then become a shield against the very subgroup evidence needed to show harm.
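
A minimal sketch of that failure mode, using invented counts rather than data from the book or any real system. The group labels, the numbers, and the function name accuracy_by_group are all hypothetical; the only point is that a high overall figure can sit on top of a much lower subgroup figure.

```python
from collections import defaultdict

# Hypothetical evaluation records: (group, prediction_was_correct).
# The counts are invented for illustration only.
records = (
    [("majority", True)] * 900
    + [("majority", False)] * 50
    + [("minority", True)] * 30
    + [("minority", False)] * 20
)

def accuracy_by_group(rows):
    """Return overall accuracy and a dict of per-group accuracy."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for group, ok in rows:
        totals[group] += 1
        correct[group] += int(ok)
    overall = sum(correct.values()) / sum(totals.values())
    per_group = {g: correct[g] / totals[g] for g in totals}
    return overall, per_group

overall, per_group = accuracy_by_group(records)
print(f"overall: {overall:.2f}")
for group, acc in per_group.items():
    print(f"{group}: {acc:.2f}")
```

With these invented counts the overall accuracy is 0.93 while the smaller group sits at 0.60. A report that shows only the first number would describe the system as working.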

AI systems add more places for the gap to harden. Training data can underrepresent a group. Evaluation data can hide that underrepresentation by reporting aggregate accuracy. Retrieval systems can rank sources that repeat the old default. Synthetic data can reproduce the same assumptions with cleaner formatting. Agents can turn the inherited default into calendar choices, triage decisions, procurement recommendations, benefit workflows, hiring screens, and support replies.

This is why "add more data" is only the first answer. The stronger requirement is fit-for-purpose evidence: the institution must know which population the system will affect, which variables are missing or dangerous, which differences should be disaggregated, which categories should be changed, which uses are out of scope, and which affected people can correct the record before the output becomes policy.

The AI Reading

Read after the generative AI boom, the book becomes a warning about training data and evaluation. AI systems are often described through scale: more data, bigger models, broader coverage. Criado Perez pushes against the comfort of that scale. Bigger data can still be partial data. Broader coverage can still miss the questions nobody asked. Benchmark performance can still hide subgroup failure when the test set inherits the same social defaults as the training set.

For AI agents, the problem sharpens. An agent that schedules, triages, drafts, routes, or buys things acts on defaults. If its tools assume a generic worker, patient, applicant, commuter, or user, then the agent's action can turn omission into administration. The issue is not consciousness, intention, or machine will. It is authorized automation operating inside a biased description of the world.

The risk is not only representational. It is material. A medical summarizer can normalize symptoms around the wrong baseline. A workplace model can treat caregiving interruptions as low commitment. A benefits system can make household forms harder for people whose lives do not match the form. A hiring tool can learn from records shaped by earlier exclusion. A city-routing tool can optimize for commute patterns that miss care trips, part-time schedules, or safety constraints. None of those systems needs to mention gender explicitly to reproduce a gendered default.

Generative interfaces make the problem harder to see because they translate partial evidence into fluent prose. A chatbot can sound helpful while smoothing over uncertainty, missing subgroup evidence, or a category that should not have been used. The more conversational the system feels, the more important the audit trail becomes: what sources were available, what population was tested, what context was assumed, and what a harmed person can do when the machine-readable story is wrong.

Governance After the Gap

NIST's AI Risk Management Framework treats trustworthy AI as a matter of design, development, deployment, and evaluation, not as a public-relations label. NIST's 2022 publication on identifying and managing AI bias is even more direct: bias is sociotechnical, shaped by data, models, institutional practices, and human interpretation. The NIST Generative AI Profile adds technology-specific risks, including harmful bias, homogenization, privacy, information integrity, and value-chain issues. That is the governance bridge from Criado Perez to AI safety. The question is not only whether a model is accurate on average. It is for whom the model is accurate, under what conditions, with what missing variables, and with what path of appeal when the system fails.

The gender-data context is not only literary. WHO's gender-data work treats quality disaggregated data as necessary for health and well-being. The World Bank Gender Data Portal collects sex-disaggregated data and gender statistics across domains such as health, education, economic opportunity, public life, and agency. The NIH expectation on sex as a biological variable and FDA's 2024 draft diversity action plan guidance, both described above, apply the same logic to research design and clinical-study enrollment. None of these is a complete solution, but together they show the basic governance move: the default must be made visible before it can be tested.

The European Commission's AI Act overview describes a risk-based framework for AI systems, including stricter rules for high-risk uses and transparency obligations. Article 10 of the AI Act is especially relevant: high-risk systems that use training, validation, or testing data need data governance practices, attention to data origin, assumptions about what data measure, bias detection and mitigation, and data that are relevant, sufficiently representative, and as complete and error-free as possible for the intended purpose. Whatever one thinks of the legal details, the direction confirms the book's point: consequential systems need evidence about the people they affect.

For general-purpose AI, public training-content summaries are a floor, not an audit. A summary can help users, rightsholders, researchers, and deployers understand broad data sources, but it does not by itself prove representativeness, consent, lawful use, labor conditions, subgroup performance, or downstream suitability. A buyer still needs deployment-specific evidence before putting a general model into a hiring, health, benefits, education, credit, or public-service workflow.

Procurement is where this becomes concrete. A buyer should not accept a vendor's claim that a system is fair because it performs well on average. Contracts for high-impact AI should require documented data provenance, subgroup and intersectional evaluation where lawful and feasible, known gap statements, monitoring by deployment setting, records of overrides and complaints, human review with authority to change outcomes, data-minimization limits, and exit rights if the system cannot be audited. OMB's April 2025 AI acquisition memo for U.S. agencies points in this direction by tying acquisition to responsible use, performance tracking, documentation, data rights, and risk management.

Safety also includes restraint. Measuring gender, sex, race, disability, pregnancy, caregiving, or household structure can reveal hidden harm, but sensitive measurement can also become surveillance. A responsible program states the lawful basis for collection, limits access, uses aggregation or privacy-preserving methods where possible, deletes or segregates audit data when the purpose ends, and gives affected people notice and correction routes. The answer to invisibility is not total visibility. It is accountable visibility for a defined public-interest purpose.

Where the Book Needs Care

The book's evidentiary abundance is persuasive, but it can leave a reader with a tempting remedy: collect more data, and design will improve. That is necessary, but not sufficient. More complete data can expose a problem; it does not force an institution to care. Data can also become a new channel of surveillance, especially for people who already face medical, workplace, welfare, or platform scrutiny.

The stronger lesson is therefore not "include women in the dataset" alone. It is to ask who controls the category, who benefits from the measurement, who can refuse collection, who can correct the record, and who has power to change the design after harm appears. The missing-data problem is real. So is the danger of solving it by building more exhaustive systems of observation without building more democratic systems of control.

The book also needs careful handling around the words "women," "sex," and "gender." Some design failures concern bodies and physiology, some concern gendered social roles, some concern pregnancy or caregiving, and some concern administrative categories that do not fit many people cleanly. Treating all of those as one variable can reproduce the same category mistake the book is trying to expose. Good AI governance names the construct being measured and tests whether the category is necessary, proportionate, inclusive, and safe to collect.

Intersectionality matters for the same reason. A system can improve for women on average while still failing disabled women, Black women, trans women, migrant women, older women, pregnant people, low-income caregivers, or people using lower-resource languages and dialects. Average correction can become a new default if governance stops at the first subgroup slice.
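
A small sketch of how a single-attribute slice can look acceptable while an intersection fails. The attributes, the counts, and the helper slice_accuracy are hypothetical and not drawn from the book; whether such attributes may be collected at all is a separate question that the code does not answer.

```python
from collections import Counter

# Hypothetical rows: (gender, disability_status, model_was_correct).
rows = (
    [("woman", "nondisabled", True)] * 170
    + [("woman", "nondisabled", False)] * 10
    + [("woman", "disabled", True)] * 8
    + [("woman", "disabled", False)] * 12
    + [("man", "nondisabled", True)] * 180
    + [("man", "nondisabled", False)] * 20
)

def slice_accuracy(rows, key):
    """Map each slice to (accuracy, sample_size) for the given key function."""
    totals, correct = Counter(), Counter()
    for row in rows:
        k = key(row)
        totals[k] += 1
        correct[k] += int(row[2])
    return {k: (correct[k] / totals[k], totals[k]) for k in totals}

by_gender = slice_accuracy(rows, key=lambda r: r[0])
by_pair = slice_accuracy(rows, key=lambda r: (r[0], r[1]))

for name, table in (("gender", by_gender), ("gender x disability", by_pair)):
    print(name)
    for k, (acc, n) in table.items():
        print(f"  {k}: accuracy={acc:.2f} n={n}")
```

In this invented data the gender slice shows women at 0.89 and men at 0.90, which looks like parity. The pair slice shows disabled women at 0.40 on only 20 records. Returning the sample size with each slice is a deliberate choice: a small slice is both the place where harm hides and the place where the estimate is least certain.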

What This Changes

Invisible Women belongs in this catalog because it explains a basic ritual of machine society: what is uncounted becomes unreal, and what is counted badly becomes policy. It gives readers a concrete way to inspect AI claims without mystification. Ask what population is missing. Ask what default body or life pattern the system assumes. Ask whether performance is averaged over the very differences that matter.

The practical audit is direct. Name the affected population. Name the intended decision. Show the data source and original purpose. Identify who is missing or misclassified. Explain which variables are proxies. Report subgroup and intersectional results with uncertainty. State the limits of the evidence. Provide notice, correction, appeal, and a responsible human who can change the outcome. Keep a post-deployment record of complaints, overrides, incidents, and drift.
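
One item in that audit, reporting subgroup results with uncertainty, can be sketched with a Wilson score interval, a standard way to put a range around a proportion when the sample is small. This is one reasonable choice among several, not a method the book prescribes. The counts are invented, and z = 1.96 is the usual approximation for a 95 percent interval.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half

# Hypothetical subgroup: 8 correct outcomes out of 20 cases.
low, high = wilson_interval(8, 20)
# Same observed rate, hypothetical larger sample: 400 out of 1000.
low_big, high_big = wilson_interval(400, 1000)

print(f"n=20:   {low:.2f} to {high:.2f}")
print(f"n=1000: {low_big:.2f} to {high_big:.2f}")
```

Both samples have an observed rate of 0.40, but the interval for the 20-case subgroup is far wider. Reporting the range alongside the point estimate keeps a small subgroup from being dismissed as noise or overstated as proof.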

The book's lasting force is its refusal to treat exclusion as an accident at the margins. In an AI system, the margin can become the rule for everyone routed through it. If the record is incomplete, the machine does not repair reality by optimizing over it. It formalizes the gap and then asks affected people to live inside the formalization.

Source Discipline

This review separates book facts, book interpretation, and current governance claims. Abrams and Amazon support bibliographic and retail metadata. WHO, World Bank, NIH, FDA, and UN Women support the broader factual context about gender data, health data, clinical-study demographic planning, and unpaid care. NIST, OMB, EU AI Act, and European Commission training-content materials support current AI-governance claims. The AI-era reading is an application of Criado Perez's framework, not a claim that the book predicted every feature of generative AI, procurement, or agentic workflows.

Claims about data gaps must also stay scoped. A missing-data claim should name the population, setting, variable, source, decision context, time period, and harm. A fairness claim should say whether it concerns representation, quality of service, allocation, appeal, privacy, or downstream institutional behavior. Without that discipline, "bias" becomes a vague slogan and "more data" becomes a risky cure-all.
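
The scoping rule above can be written down as a record that refuses to hide its blanks. This is a sketch under stated assumptions: the class names DataGapClaim and FairnessConcern, the field names, and the example values are all hypothetical, chosen to mirror the lists in the paragraph above.

```python
from dataclasses import dataclass, fields
from enum import Enum

class FairnessConcern(Enum):
    REPRESENTATION = "representation"
    QUALITY_OF_SERVICE = "quality of service"
    ALLOCATION = "allocation"
    APPEAL = "appeal"
    PRIVACY = "privacy"
    DOWNSTREAM_BEHAVIOR = "downstream institutional behavior"

@dataclass(frozen=True)
class DataGapClaim:
    population: str
    setting: str
    variable: str
    source: str
    decision_context: str
    time_period: str
    harm: str
    concern: FairnessConcern

    def unscoped_fields(self):
        """Names of text fields left blank, where the claim is still vague."""
        return [
            f.name
            for f in fields(self)
            if isinstance(getattr(self, f.name), str)
            and not getattr(self, f.name).strip()
        ]

# Hypothetical claim with two parts deliberately left unscoped.
claim = DataGapClaim(
    population="part-time workers with caregiving duties",
    setting="hypothetical city transit planning",
    variable="trip purpose",
    source="",
    decision_context="bus route frequency",
    time_period="",
    harm="care trips undercounted in route optimization",
    concern=FairnessConcern.QUALITY_OF_SERVICE,
)
missing = claim.unscoped_fields()
print(missing)
```

Here the claim names a population, a variable, and a harm but not a source or a time period, so unscoped_fields returns those two names. Making the fairness concern an enumerated value forces the author of a claim to pick one kind of concern instead of gesturing at "bias" in general.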

Data Feminism, All Data Are Local, and The Data Sheet Becomes the Supply Chain extend the review's questions into power, absence, category, context, provenance, and documentation.
More than a Glitch and Weapons of Math Destruction show how biased assumptions become deployed systems with real consequences.
Algorithmic Bias, Automation Bias, and Human Oversight of AI Systems give the controls needed when defaults turn into decisions.
AI Governance, AI Procurement, Algorithmic Impact Assessments, and Algorithmic Recourse turn the review's argument into institutional duties.
AI in Healthcare, AI in Employment, Training Data, AI Data Provenance, Model Cards and System Cards, and Data Minimization mark the places where better evidence and restraint have to be designed together.

Sources

Abrams Books, Invisible Women, publisher listing for title, author, imprint, publication date, and ISBN-13 9781419729072, reviewed June 24, 2026.
Amazon, Invisible Women: Data Bias in a World Designed for Men, retail listing and ASIN/ISBN-10 1419729071, reviewed June 24, 2026.
Caroline Criado Perez, Invisible Women author page, author framing of the gender data gap and book summary, reviewed June 24, 2026.
World Health Organization, Closing data gaps in gender, official WHO activity page on disaggregated gender data and health, reviewed June 24, 2026.
World Bank, Gender Data Portal, official portal for sex-disaggregated data and gender statistics, reviewed June 24, 2026.
National Institutes of Health Office of Research on Women's Health, Sex as a Biological Variable, NIH policy expectation for research design, analysis, and reporting, reviewed June 24, 2026.
Food and Drug Administration, Diversity Action Plans to Improve Enrollment of Participants from Underrepresented Populations in Clinical Studies, draft guidance on enrollment goals by age group, sex, race, and ethnicity, reviewed June 24, 2026.
UN Women, Care: A critical investment for gender equality and the rights of women and girls, unpaid care work context, reviewed June 24, 2026.
National Institute of Standards and Technology, AI Risk Management Framework, official NIST page for AI risk management guidance, reviewed June 24, 2026.
National Institute of Standards and Technology, Towards a Standard for Identifying and Managing Bias in Artificial Intelligence, NIST Special Publication 1270, reviewed June 24, 2026.
National Institute of Standards and Technology, Artificial Intelligence Risk Management Framework: Generative Artificial Intelligence Profile, NIST AI 600-1, reviewed June 24, 2026.
European Commission, AI Act overview, official policy page for the EU AI Act, risk-based rules, transparency, and high-risk AI systems, reviewed June 24, 2026.
European Commission AI Act Service Desk, Article 10: Data and data governance, Regulation (EU) 2024/1689, official text and summary, reviewed June 24, 2026.
European Commission, Explanatory Notice and Template for the Public Summary of Training Content for general-purpose AI models, published July 24, 2025 and last updated March 26, 2026, reviewed June 24, 2026.
White House Office of Management and Budget, M-25-22: Driving Efficient Acquisition of Artificial Intelligence in Government, federal AI acquisition, performance, documentation, data-rights, and risk-management guidance, reviewed June 24, 2026.

Book links are paid affiliate links. As an Amazon Associate I earn from qualifying purchases.
