Senate Judiciary Subcommittee Hears Allegations That AI Firms Trained Models on Pirated Books

July 16, 2025 | Judiciary: Senate Committee, Standing Committees - House & Senate, Congressional Hearings Compilation


This article was created by AI summarizing key points discussed. AI makes mistakes, so for full details and context, please refer to the video of the full meeting.

At a hearing of the Senate Judiciary Committee’s Subcommittee on Crime and Counterterrorism, Chairman Hawley and Ranking Member Durbin presided as witnesses testified that major artificial-intelligence companies used pirated copies of books and scholarly works to train large language models.

The testimony, during the hearing titled “Too Big to Prosecute,” focused on allegations that companies including Meta and Anthropic obtained copyrighted works from illicit online repositories and peer-to-peer “torrent” networks rather than licensing material from copyright holders.

The witnesses said the scale of the alleged activity is substantial and legal accountability is unsettled. Max Pritt, a lawyer representing authors in litigation against Meta, told the panel that documents in that case show Meta acquired more than 200 terabytes of copyrighted books and articles and distributed over 40 terabytes of material via peer-to-peer networks. “This is likely the largest infringement of American intellectual property by U.S. companies in our nation’s history,” Pritt said.

Why it matters: witnesses and senators said the issue could undercut creators’ incomes, distort marketplaces for licensed works and prompt new litigation or legislation. Ranking Member Dick Durbin said the outcome will affect writers, musicians and other creators who “are rightfully concerned” about whether their work can be used without compensation.

Authors and scholars described both economic and legal concerns. Best-selling novelist David Baldacci said he and other authors have found evidence that dozens of their novels were taken and used in training models. “It felt like someone had backed up a truck to my imagination and stolen everything I’d ever created,” Baldacci said. He told senators he and other plaintiffs are seeking answers through ongoing litigation.

Economists and legal scholars who testified said prior research on digital piracy suggests creators suffer market harm when their works are copied without authorization. Professor Mike Smith of Carnegie Mellon University summarized peer-reviewed literature showing digital piracy can reduce legal sales and investment in creative output and argued enforcement and licensing can create a path for both creators and technology firms to thrive.

Experts disagreed about the legal status of copying for AI training. Professor Edward Lee of Santa Clara University School of Law described recent district-court rulings finding that model training can be a “transformative” use under the fair‑use doctrine and cautioned that the law remains unsettled. Professor Bhamati Viswanathan of New England Law School and others emphasized that many pirate sites have repeatedly lost in court and that using such repositories can be a “crime compounding a crime.” Viswanathan urged enforcement of licensing and existing copyright frameworks.

Senators pressed witnesses on specific evidence. Pritt pointed lawmakers to internal Meta messages from engineers who warned that using pirated repositories was unethical or “beyond our ethical threshold,” and to company discussions about routing downloads through external servers to avoid tracing. Senator Hawley displayed excerpts and said the documents show Meta employees recognized legal and ethical risks yet proceeded.

Legal standards under discussion included willfulness and commercial advantage for criminal copyright liability, and how courts should weigh fair use when defendants acquired materials from shadow libraries. Professor Lee noted judges in different cases have reached different conclusions and recommended letting the appellate process and possibly the Supreme Court resolve unsettled questions, while other witnesses urged congressional action if courts do not provide a remedy.

No committee vote or formal legislative action occurred at the hearing. Senators and witnesses framed the issue as both a legal and a moral question: whether creators’ rights will be enforced, or whether broad “transformative” uses, justified in the name of U.S. technological leadership, will be treated as fair use.

The hearing continued through rounds of questioning and adjourned after members thanked the witnesses. Several panelists and senators said litigation now pending across multiple courts would further shape legal outcomes and that Congress could consider statutory changes if courts do not curb the alleged conduct.

The hearing underscored two competing priorities, protecting creators and fostering AI development, while leaving unresolved the legal and policy paths for addressing alleged mass ingestion of copyrighted works by AI firms. The subcommittee signaled it may continue oversight and that plaintiffs’ lawsuits and further court rulings will play a central role in determining next steps.
