Meta and Mark Zuckerberg are defendants in a lawsuit filed by five major book publishers and author Scott Turow, who allege the tech giant infringed on copyrights by training its AI systems on pirated and ripped works. The lawsuit was filed on Tuesday in New York federal court.

The publishers—Hachette, Macmillan, McGraw Hill, Elsevier, and Cengage—along with Turow, claim that Zuckerberg directed Meta’s AI programs to be trained by copying millions of books, articles, and other written works obtained through pirate sites and unauthorized web scrapes.

"In their effort to win the AI ‘arms race’ and build a functional generative AI model, Defendants Meta and Zuckerberg followed their well-known motto: ‘move fast and break things.’ They first illegally torrented millions of copyrighted books and journal articles from notorious pirate sites and downloaded unauthorized web scrapes of virtually the entire internet. They then copied those stolen fruits many times over to train Meta’s multibillion-dollar generative AI system called Llama. In doing so, Defendants engaged in one of the most massive infringements of copyrighted materials in history."

The lawsuit further alleges that Meta, under Zuckerberg’s direction, copied millions of books and articles without authorization, including works owned or controlled by the plaintiffs. The company also allegedly stripped copyright management information from the stolen works to conceal its training sources and facilitate unauthorized use.

The plaintiffs are seeking unspecified damages and have requested a jury trial.

Meta Considered Licensing Before Opting for Piracy

According to the lawsuit, Meta briefly explored expanding its licensing deals with publishers after the release of its Llama 1 tool. The document mentions a proposed $200 million increase to the licensing budget before the matter was escalated to Zuckerberg for a decision.

"The question of whether to license or pirate moving forward was ‘escalated’ to Zuckerberg," the suit read. "After this escalation to Zuckerberg, Meta’s business development team received verbal instructions to stop licensing efforts. One Meta employee presciently described the rationale: ‘If we license [even] a single book, we won’t be able to lean into the fair-use strategy.’"

AI-Generated Substitutes Threaten Market, Lawsuit Claims

The lawsuit concludes by highlighting that Meta’s AI system can "readily generate, at speed and scale, substitutes for Plaintiffs’ and the Class’s works on which it was trained." It further alleges that Llama can "mimic the expressive elements and creative choices of specific authors."

"Users are touting AI’s ability to generate books with ease and Llama is flooding the market with AI-generated substitutes," the suit said. "The scale and speed at which Llama can create written works and compete with human writers is unprecedented, and it can only do that because Defendants copied Plaintiffs’ and the Class’s works to train their LLM."

A Meta spokesperson told Variety that similar lawsuits have been dismissed in court, stating that AI innovation relies on training with copyrighted material and that courts have recognized this as fair use in some cases.

Source: The Wrap