Anthropic reaches $1.5 bln settlement in AI book piracy lawsuit
Anthropic has agreed to pay $1.5 billion to settle a landmark book piracy lawsuit, compensating authors and reshaping the debate over AI and copyright infringement.
The Claude by Anthropic app logo appears on the screen of a smartphone in Reno, United States, on November 21, 2024. (Photo by Jaque Silva / NurPhoto via AFP)
The artificial intelligence startup Anthropic has agreed to pay $1.5 billion to settle a class-action lawsuit filed by authors who accused the company of using pirated books to train its chatbot Claude. The deal, which awaits approval from a federal judge, is being described as a landmark moment in the ongoing debate over AI and copyright.
Under the terms of the settlement, authors are expected to receive about $3,000 per book for an estimated 500,000 works. If approved, this would represent one of the largest copyright recoveries ever recorded in the United States.
“This is the first case of its kind in the AI era,” said Justin Nelson, one of the attorneys representing the writers. “As best we can tell, it’s the largest copyright recovery ever.”
The lawsuit was originally filed by authors Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson. They claimed that Anthropic had used their books, among millions of others, without permission to build its chatbot models.
Judge found pirated data used in training
Earlier this year, US District Judge William Alsup issued a mixed ruling, determining that while training AI systems on copyrighted material was not itself illegal, Anthropic had wrongfully acquired millions of books from piracy websites.
Court documents revealed that Anthropic obtained more than 7 million digitized works, including 200,000 from the Books3 dataset, over 5 million from Library Genesis (LibGen), and at least 2 million from the Pirate Library Mirror.
The settlement avoids a trial scheduled for December, which experts said could have resulted in damages high enough to cripple the San Francisco-based company.
Implications for authors and the AI industry
The Authors Guild, the United States' oldest and largest professional organization for writers, welcomed the outcome, emphasizing that it sends a strong signal to the AI sector. “This is an excellent result for authors, publishers, and rightsholders generally,” said Mary Rasenberger, the Guild’s CEO. “It makes clear that there are serious consequences when AI companies pirate authors’ works to train their systems.”
The Guild had previously estimated that damages could start at $750 per work, the statutory minimum. It said the higher payout of about $3,000 per work likely reflects a smaller pool of eligible works once duplicates and non-copyrighted material were removed.
Books are a critical source of data for training large language models like Claude and ChatGPT. They provide billions of words in carefully structured narratives, making them invaluable for building systems capable of understanding and generating human-like text.
Other AI copyright lawsuits
In Japan, leading newspaper publishers including Nikkei Inc and The Asahi Shimbun Company have filed lawsuits against Perplexity AI, alleging the company reproduced and stored their articles without authorization. Earlier, Yomiuri Shimbun launched a similar case, claiming more than 119,000 of its articles were reproduced without permission, leading to lost advertising revenue.
Meanwhile, in the United States, major studios including Disney, Universal, and Warner Bros. Discovery have sued Midjourney, accusing it of training its image-generation model on copyrighted material and enabling users to create unauthorized depictions of iconic characters.
At the same time, the British Film Institute (BFI) has warned that AI companies have used more than 130,000 film and television scripts without consent, posing a “direct threat” to the UK’s £125 billion creative economy and jeopardizing thousands of jobs in the screen sector.