Published on 1/10/2025
Meta Platforms, led by CEO Mark Zuckerberg, is facing allegations of knowingly using pirated books to train its AI systems. These claims come from newly disclosed court documents in an ongoing lawsuit.
Authors, including Ta-Nehisi Coates and comedian Sarah Silverman, filed a lawsuit in 2023 accusing Meta of copyright infringement. The plaintiffs allege that Meta misused their works to train its large language model, Llama, by relying on pirated material.
Court filings made public in California on Wednesday suggest that Meta used the LibGen dataset, a repository notorious for hosting millions of pirated books. Internal communications at Meta reportedly show that Mark Zuckerberg approved the use of LibGen, despite legal concerns raised by the company’s AI executive team.
The plaintiffs are seeking to update their complaint with the newly surfaced evidence. They allege that Meta not only used the LibGen dataset but also distributed it via peer-to-peer torrents, which they argue strengthens their case for copyright infringement.
In 2024, U.S. District Judge Vince Chhabria dismissed some of the claims in the lawsuit.
The plaintiffs are now seeking to add fraud and copyright management information (CMI) claims to the case.
During a Thursday hearing, Judge Chhabria allowed the authors to file an amended complaint but expressed skepticism about the viability of the new fraud and CMI claims.
This lawsuit is part of a broader wave of legal challenges against companies accused of misusing copyrighted material to train AI models. While defendants like Meta argue their actions fall under fair use provisions, these cases could set crucial legal precedents for AI development and copyright law.