Blog · Review Essay · May 2026

The Digital Person and the Dossier Machine

Daniel J. Solove's The Digital Person is a privacy book that has aged into an AI governance book. Its core warning is that databases do not merely reveal people. They assemble administrative versions of people, circulate those versions through institutions, and make decisions around records the subject cannot fully see, correct, or contest.

The Book

The Digital Person: Technology and Privacy in the Information Age was published by NYU Press in December 2004 as a 283-page hardcover, and a paperback edition followed in 2006. NYU Press lists it as the first volume in the Ex Machina: Law, Technology, and Society series. Solove was then an associate professor at George Washington University Law School, and the GW Law repository makes the full text available as a faculty publication.

The book studies the social, political, and legal consequences of personal information held in computer databases. Its central term is the "digital dossier": the assembled record of transactions, identifiers, public records, browsing traces, background checks, credit information, and other fragments that businesses and government agencies use to make decisions about people.

That makes the book an important companion to The Black Box Society, Data and Goliath, Automating Inequality, Delete, and The Ordinal Society. Solove predates all of them and is less focused on machine learning, but his account of database personhood explains the administrative substrate that later systems inherit.

The Dossier as Person

The strongest idea in The Digital Person is that privacy harm is not limited to exposure. The old mental picture says privacy is invaded when a hidden room is entered or a secret is published. Solove argues that digital databases create a different kind of vulnerability. Much of the information may be mundane, semi-public, voluntarily submitted, or collected through ordinary participation. The harm comes from aggregation, circulation, interpretation, and use.

A dossier is not a mirror. It is an institutional working model. It gives lenders, employers, marketers, police, platforms, insurers, schools, and agencies a version of the person they can sort, search, flag, price, exclude, target, or approve. Once that model becomes operational, the flesh-and-blood person must live with the consequences of a record they did not design.

This is why the book remains useful after the first wave of internet privacy debates. Solove is not only worried that someone knows too much. He is worried that records become decision infrastructure. The database does not have to be malicious to be dangerous. It can be incomplete, stale, decontextualized, merged with other data, shared through routine business arrangements, or treated as more authoritative than the person standing in front of the institution.

That is a legibility problem. A society that runs through databases must translate people into fields, identifiers, categories, and risk signals. The translation can make administration easier, but it also creates a second self that is easier to govern than the original. The record becomes portable. The person drops out of the decisions made about them.

The Kafka Problem

Solove's most durable move is his shift away from Orwell as the master metaphor for privacy harm. Orwell helps describe centralized watching, fear, and political domination. But many database harms are more bureaucratic than theatrical. They are closer to Kafka: opaque procedure, inaccessible files, uncertain accusation, and a subject who cannot locate the point where the system can be answered.

This distinction matters because bad metaphors produce bad governance. If privacy is imagined only as secrecy, then any fact that is not fully secret may appear fair game. If privacy is imagined only as control, then a long notice, a consent screen, or an opt-out maze may look like a solution. Solove's book asks for a broader account: privacy as protection against institutional power that collects, processes, shares, and acts on personal information without meaningful participation from the people described.

The Kafka frame also explains why database power often feels banal. There may be no single villain, no dramatic disclosure, no room full of monitors. There are vendors, forms, data brokers, public records, background-screening firms, access controls, matching rules, security exceptions, credit files, and bureaucratic habits. Harm arrives as delay, denial, suspicion, misclassification, exposure to fraud, inability to correct a file, or a decision that seems to come from nowhere.

That is the world many AI systems now enter. Models do not replace the dossier machine. They plug into it. They summarize records, infer missing traits, score risk, personalize offers, route applicants, flag behavior, and generate explanations around data trails that were already unevenly visible and hard to contest.

The AI-Age Reading

Read in 2026, The Digital Person looks like a prehistory of automated personhood. Solove wrote before smartphones, social graphs, real-time bidding, large language models, and current AI agents became ordinary infrastructure. Yet the book identifies the condition that makes all of them politically serious: institutions increasingly act on data doubles.

AI intensifies this by making dossiers active. A database stores and retrieves. A model infers, ranks, summarizes, predicts, drafts, recommends, and sometimes acts. The old dossier said, "Here is what the record contains." The AI-era dossier says, "Here is what this pattern probably means, what should happen next, and how the decision can be justified in language."

This changes the stakes of privacy. A person's data trail can become training material, personalization memory, risk signal, customer-service context, fraud score, hiring feature, ad target, educational profile, law-enforcement lead, or chatbot prompt history. The issue is not just whether an individual fact was public or private. The issue is whether the institutional model built from those facts can be inspected, corrected, limited, forgotten, or refused.

Solove's later article "A Taxonomy of Privacy" helps make this explicit by separating privacy problems into information collection, processing, dissemination, and invasion. That structure is useful for AI because harms often move across stages. A system may collect innocuous traces, process them into sensitive inferences, disseminate outputs through vendors or agencies, and then turn the result into an intervention in housing, work, credit, policing, education, or care.

The book also helps avoid a common mistake in AI debates: treating the model as the entire problem. The model is important, but the surrounding dossier machine supplies the raw material, institutional authority, and operational route. A good model attached to an unjust record system can still produce unjust governance. A transparent model attached to unappealable records can still leave people powerless.

Where the Book Needs Updating

The Digital Person is a 2004 book, and it shows. Its examples center on spyware, web bugs, data mining, airline passenger profiling, public records, the USA PATRIOT Act, identity theft, and database sharing between business and government. Those examples are still relevant, but the information environment has become more intimate, mobile, social, biometric, and generative.

The book also works mostly through U.S. privacy law and legal reform. That focus gives it rigor, but it can understate how much privacy now depends on platform design, procurement rules, labor rights, competition policy, standards bodies, public infrastructure, and international regulation. The problem is not only what courts recognize as privacy harm. It is also who builds the systems that make people administratively real.

Still, the age of the book is part of its value. It reminds readers that AI did not invent the crisis of machine-readable personhood. AI inherits decades of database practice, data-broker economics, public-private information exchange, weak consent rituals, and bureaucratic deference to records. The novelty is not that institutions have started making data doubles. The novelty is that those doubles can now be scored, narrated, simulated, and acted on with much greater speed.

The Site Reading

The practical lesson of The Digital Person is that a person cannot be protected only at the moment of exposure. Protection has to cover the whole life of a record: collection, combination, inference, access, retention, sharing, decision, appeal, deletion, and reuse.

That lesson matters for AI agents, answer engines, automated welfare systems, hiring tools, companion memories, educational profiles, and workplace dashboards. These systems do not simply process information. They create institutional versions of people and then make those versions consequential.

A healthier system needs data minimization, real deletion, purpose limits, audit trails, human appeal, source visibility, correction rights, procurement discipline, and refusal paths that do not punish people for declining the dossier. It also needs a cultural shift: records should be treated as partial administrative artifacts, not as the person rendered in digital form.

Solove's book matters because it names the quiet danger before the interface becomes intelligent. Once the dossier is accepted as the person, every later automation inherits a category error. The system is not merely learning about someone. It is learning from an institutional shadow and then asking the person to answer for it.


