Bartz v. Anthropic: First Court Decision on Fair Use Defense in LLM Training

June 30, 2025

On June 23, 2025, U.S. District Judge William Alsup of the Northern District of California issued a decision in the high-profile lawsuit brought by several authors against Anthropic, the developer of the Claude AI chatbot. The authors accused Anthropic of copyright infringement, claiming that Anthropic used their books without permission to train its large language models (LLMs). Anthropic moved for summary judgment on its asserted defense of fair use.

Holding

After weighing all the fair use factors, the Court granted Anthropic’s motion for summary judgment on fair use as to the training of LLMs and the digitization (format change) of legally purchased works. The Court, however, denied summary judgment as to the pirated copies and ordered a trial on that issue and any related damages.

Background

To acquire the massive amounts of data needed to train its LLMs, Anthropic purchased millions of books, scanned and digitized them, and discarded the originals. Anthropic also downloaded over seven million pirated books, which it likewise used to train its LLMs.

The authors claimed that Anthropic infringed their federal copyrights by pirating copies of their books and using the purchased books and pirated copies to train its LLMs.

Court’s Analysis of Fair Use

Fair Use Factor 1 – The purpose and character of the Use

Judge Alsup found that the use of copies of the authors’ copyrighted works to train Anthropic’s LLMs was “exceedingly transformative,” so the first factor favored fair use for the training use.

As for the digitization of the legally purchased books, Judge Alsup found this to be fair use because Anthropic simply replaced the print copies with more convenient digital ones. However, use of the pirated works was not fair use.

Fair Use Factor 2 – Nature of the Copyrighted Works

The Court found that Anthropic had selected the authors’ works for use in training the LLMs specifically because of their expressive content. Because of this, the second factor weighed against fair use.

Fair Use Factor 3 – Amount and Substantiality of the Portion Used

The Court acknowledged that entire works were copied to train the LLMs. But because Anthropic would require billions of words to train an LLM, the Court considered that “using any one work for actually training LLMs was about as reasonable as the next.” Thus, the Court held that the quantity of copying was reasonably necessary, and the third factor favored fair use for the legally purchased books but weighed against fair use for the pirated copies.

Fair Use Factor 4 – Effect of the Use Upon the Market

The Court found that the copies used by Anthropic to train LLMs did not (and will not) displace demand for the authors’ works and that the fourth factor therefore favored fair use for the training use. In reaching this conclusion, the Court dismissed the authors’ concerns of market dilution as being analogous to complaining that “training schoolchildren to write well would result in an explosion of competing works.” The Court further found that the digitization and storage of the legally purchased printed works was merely a format change that did not usurp any of the authors’ rights. Thus, this factor was neutral for the digitized copies of the legally purchased books. On the other hand, the pirated copies did displace demand for the authors’ books, and thus the fourth factor weighed against fair use for the pirated copies.

Takeaways

This decision, the first in what is expected to be a series of cases involving the use of copyrighted material to train AI models, explicitly lays out the Court’s rationale for its finding of fair use.
Judge Alsup seems to have been greatly influenced by the transformative aspect of the use, characterizing the technology as “exceedingly transformative,” “spectacularly so,” “quintessentially transformative,” and “among the most transformative many of us will see in our lifetimes.”
The decision still leaves some unanswered questions. For example, the Court noted that Claude interposes a layer of filtering software between the LLM output and the user to ensure that Claude does not provide any infringing copy of a work to a user. The plaintiffs did not challenge the legality of that intermediate (pre-filtered) output and the ruling did not address it.
On June 25, 2025, in Kadrey v. Meta Platforms, a separate case challenging Meta’s use of copyrighted material to train its LLMs, Judge Chhabria of the Northern District of California similarly found that Meta’s copying of the plaintiffs’ books for LLM training was fair use. But Judge Chhabria noted that his summary judgment ruling was limited and hinted that the case would have gone to trial had the plaintiffs presented evidence of market dilution. The decision also included a subtle criticism of Judge Alsup’s focus on the transformative nature of generative AI and his lesser attention to the most important factor in fair use analysis – harm to the market for the copyrighted works.
As can be seen from the differing viewpoints of these two district judges, there is a tension between Factors 1 and 4 of the fair use analysis. In the Warhol v. Goldsmith case, the Supreme Court focused on the effect on the marketplace in determining how transformative the use was under Factor 1. It remains to be seen how the fair use analysis would be applied in a generative AI case where there is evidence of marketplace harm to content creators.