Anthropic Admits to Pirating Millions of Copyrighted Books for AI Training
Anthropic, the AI company behind models like Claude, has admitted to downloading millions of pirated, copyrighted books and using them to train its AI systems without the authors' permission. The admission comes as part of a landmark settlement finalized in fall 2023, reached after a judge ruled that training on the books constituted fair use but that acquiring them through piracy did not.
Settlement Details: $1.5 Billion Owed to Half a Million Authors
As part of the settlement, Anthropic agreed to pay a class of half a million authors a total of $1.5 billion. However, the individual payouts are modest: an estimated $3,000 or so per book, split 50-50 with publishers, for authors like Maureen Johnson, who has 28 published works. The settlement will face a fairness hearing in court on May 14, 2024.
Similar Lawsuits Pending Against Meta and OpenAI
The Anthropic case is one of a growing number of legal challenges against AI companies over the unauthorized use of copyrighted material. Similar lawsuits are pending against Meta and OpenAI.
The Claims Process: A Kafkaesque Nightmare for Authors
To distribute the settlement funds, Anthropic partnered with a claims administrator to create a website where authors could submit claims. However, the system has been plagued with technical issues, leaving many authors unable to access the money they are owed.
Maureen Johnson’s Struggle with the Claims Website
Maureen Johnson, author of 28 books (many of them bestsellers), described her experience with the claims website as a "Kafkaesque mess." She twice submitted claims for 14 eligible titles, spending 90 minutes on the forms each time. Even so, the system could not locate either submission, forcing her to navigate endless layers of unresponsive customer service.
Johnson recounted her frustration in a conversation with Vox:
“It was getting more and more surreal, how little this system worked.”
Eventually, Johnson reached an employee who acknowledged the system’s flaws. She recalled telling them:
“This system is really fluky. It’s just not well-programmed.”
The employee responded with a giggle, saying, “Coding is hard.”
Other Authors Face Similar Challenges
Johnson is not alone in her struggle. Other authors have reported similar issues, including lost submissions, unresponsive support, and technical glitches that prevent them from accessing their rightful compensation.
What’s Next for the Settlement?
The settlement will proceed to a fairness hearing on May 14, 2024. If the court approves it, the claims process is expected to continue, though authors remain skeptical about its reliability. The case highlights the broader challenge of holding AI companies accountable for copyright infringement while ensuring that creators are fairly compensated.