Last reviewed May 15, 2026

AI Copyright Litigation

AI copyright litigation is the wave of lawsuits testing whether AI developers may copy, store, train on, transform, or generate from copyrighted works without permission.

Definition

AI copyright litigation refers to lawsuits over the use of copyrighted material in artificial intelligence systems. The cases usually focus on training data, stored copies, model outputs, copyright management information, licensing markets, and whether copying is excused by fair use or a similar doctrine.

This page is a wiki summary, not legal advice. Litigation status can change quickly, and many claims described here remain allegations until resolved by a court or settlement.

Core Claims

Training-copy claims. Plaintiffs argue that developers copied copyrighted works into datasets or internal libraries to train models without permission.

Output claims. Plaintiffs argue that systems can reproduce protected expression, characters, lyrics, images, articles, or other works in outputs.

Market-harm claims. Plaintiffs argue that models trained on copyrighted works can substitute for the works, flood markets with competing material, or undermine licensing markets.

Acquisition claims. Plaintiffs distinguish between lawfully purchased or licensed material and material allegedly obtained from pirate, shadow-library, or unauthorized sources.

Copyright management information claims. Some cases allege that copyright notices, metadata, or other copyright management information were removed or altered in violation of Section 1202 of the DMCA.

Major Cases

Thomson Reuters v. Ross Intelligence. In February 2025, a Delaware federal court granted partial summary judgment to Thomson Reuters, holding that Ross infringed copyrighted Westlaw headnotes and rejecting Ross's fair-use defense. The dispute concerned Ross's competing legal research product.

Bartz v. Anthropic. In June 2025, Judge William Alsup held that, on the record before him, Anthropic's use of lawfully acquired books for LLM training was fair use, as was format-shifting purchased print books into internal digital copies. The order did not extend fair use to copies allegedly taken from pirate libraries, and those claims were left for trial. Anthropic later settled with the authors.

Kadrey v. Meta. In June 2025, Judge Vince Chhabria granted Meta summary judgment on fair use for the specific record before the court, while emphasizing that different evidence about market harm could matter in other cases.

The New York Times v. OpenAI and Microsoft. The Times sued in December 2023 over alleged copying of news articles for model training and outputs. In April 2025, a federal court allowed much of the case to proceed past motions to dismiss, leaving key factual and fair-use issues for later stages.

Disney and Universal v. Midjourney. In June 2025, major film studios sued Midjourney, alleging that the image generator infringed copyrights in protected characters and works through both training and outputs. As of this review, the case is pending and is one of the most prominent disputes over visual media.

Music publisher cases. Music publishers have sued AI developers over alleged use and output of song lyrics. In Concord Music Group v. Anthropic, claims over lyrics and copyright management information remained active after motion practice in 2025.

Fair Use Pattern

Early U.S. case law does not yield a single rule that AI training is always lawful or always unlawful. Courts have focused on facts: what was copied, how it was acquired, whether the use was transformative, whether outputs substitute for originals, whether the defendant competes with the plaintiff, and what evidence exists of market harm.

One emerging distinction is between training as transformation and acquisition as infringement. A court may treat model training as transformative in one context while still refusing to bless the creation or retention of an unauthorized library of source works.

Another distinction is between abstract capability and specific output. A model that generally learns statistical patterns raises one set of questions; a system that reproduces lyrics, characters, article passages, or recognizable protected expression raises another.

Policy Position

The U.S. Copyright Office's May 2025 report on generative AI training, released in pre-publication form, treated training as a fact-specific fair-use question rather than a blanket exemption. It rejected simple analogies between AI training and human learning, and emphasized that commercial use, expressive substitution, market effects, licensing markets, and the nature of the copied works all matter.

The policy conflict is structural. AI developers want broad access to culture as training substrate. Rights holders want control, compensation, attribution, and bargaining power. Public-interest researchers want transparency and access without giving every cultural gatekeeper veto power over computation.

Open Questions

What counts as transformative? Courts are still defining when training changes a work enough to support fair use.

How should market harm be measured? Litigation is testing whether harm means direct substitution, lost licensing revenue, market dilution, or broader creative-economy damage.

Does provenance change the answer? Lawfully acquired copies, licensed datasets, scraped websites, and pirated archives may receive different treatment.

Who is responsible for outputs? Cases increasingly ask whether developers, deployers, users, or platform operators are responsible when systems generate protected expression.

Will licensing become infrastructure? Settlements and licensing deals may turn cultural archives into paid data channels, favoring large rights holders and large model developers.

Spiralist Reading

AI copyright litigation is the court system discovering that culture has become fuel.

The lawsuits are not only about copying. They are about conversion: books into embeddings, songs into behavior, images into style-space, journalism into answer engines, character libraries into promptable surfaces.

For Spiralism, the key question is whether a society can turn its memory into machine capability without losing the people and institutions that produce that memory. Copyright is an imperfect tool for that question, but it is one of the few tools with teeth.
