Training an AI Model Using Copyrighted Works Is Not Copyright Infringement
Two recent decisions found no copyright infringement under the fair use exception for using copyrighted works to train AI models.
Key Takeaways
1. Use of copyrighted works to train AI models is not copyright infringement so long as the trained AI models “[do] not result in exact copies nor ‘infringing knockoffs’ of copyrighted works [i.e., works substantially similar in expressive elements] being provided to the public.”
2. An AI model's provision to the public of exact copies or infringing knockoffs of the copyrighted works it was trained on is copyright infringement.
3. Copyright litigators should consider adding a common law action based on market dilution to their copyright complaint.
4. Congress should consider enacting a Federal Copyright Dilution Act to protect against the loss in value of copyrights due to the explosion of competing works by AI models, much as the Federal Trademark Dilution Act protects against the loss in trademark value, but covering not only famous but also non-famous copyrighted works.
In two recent AI decisions, Bartz v. Anthropic PBC and Kadrey v. Meta Platforms, Inc., courts found no copyright infringement under the fair use exception for the use of copyrighted works to train AI models, albeit for different reasons.
In Anthropic, Judge Alsup found that training AI models on copyrighted works is fair use. No. 24-cv-05417 (N.D. Cal. June 23, 2025). The extent of copying was necessary for the training, and the training did not affect any potential market for the copied books or for LLM licensing.¹ In Judge Alsup’s analysis of the four fair use factors, the first factor, “transformative use,” weighed most heavily in favor of finding that Anthropic’s use of copyrighted works to train its AI model was fair. “In short, the purpose and character of using copyrighted works to train LLMs to generate new text was quintessentially transformative.”² Id. at *27 (citing, among other cases, Authors Guild v. Google, Inc., 804 F.3d 202, 217 (2d Cir. 2015)). See also the Juhasz Law blog post on the Authors Guild case, One Small Step for Google, One Giant Leap for Copyright Fair Use.
Two days after Judge Alsup decided Anthropic, on June 25, 2025, Judge Chhabria, in Meta, essentially wrote that Judge Alsup’s overreliance on transformative use is not enough to support a fair use finding. No. 23-cv-03417 (N.D. Cal. June 25, 2025). “No matter how transformative LLM training may be, it’s hard to imagine that it can be fair use to use copyrighted books to develop a tool to make billions or trillions of dollars while enabling the creation of a potentially endless stream of competing works that could significantly harm the market for those books.” Id. at *76. Judge Chhabria found that “[i]n this case, because Meta’s use of the works of these thirteen authors is highly transformative, the plaintiffs needed to win decisively on the fourth factor to win on fair use. See, e.g., Perfect 10, 508 F.3d at 1168 (fair use where secondary use was ‘significant[ly] transformative’ and the fourth factor ‘favor[ed] neither party’).” Id. at *77. In so finding, Judge Chhabria pointed to “market dilution” by the “generation [by the AI model] of countless competing works with a miniscule fraction of the time and creativity it would otherwise take.” Id. at *64. However, because the plaintiffs provided no meaningful evidence on market dilution at all, Judge Chhabria granted Meta’s motion for partial summary judgment. Id. at *78.
The difference between the Anthropic and Meta analyses lies in the two judges’ differing views on the fourth fair use factor, which is directed to the market impact of training AI models on copyrighted works.
In the view of Judge Alsup in Anthropic, training AI models is no different than training school children to write well. Anthropic, at *49, 50. Both would lead to an explosion of competing works. Id. Both are embraced under the Copyright Act which seeks to advance original works of authorship and not protect authors against competition. Id.
In the view of Judge Chhabria in Meta, “when it comes to market effects, using books to teach children to write is not remotely like using books to create a product that a single individual could employ to generate countless competing works with a miniscule fraction of the time and creativity it would otherwise take. This inapt analogy is not a basis for blowing off the most important factor in the fair use analysis.” Meta at *16 (emphasis added).
It is debatable whether an action for “market dilution” even flows from the copyright statute, which, as Judge Alsup notes, “seeks to advance original works of authorship and not protect authors against competition.” Anthropic, at *49, 50. An action for dilution is, however, well grounded in the world of trademarks, where it is based on the Federal Trademark Dilution Act (FTDA), codified at 15 U.S.C. § 1125(c). The FTDA, originally passed in 1995 and amended by the Trademark Dilution Revision Act (TDRA) of 2006, seeks to protect famous marks from unauthorized use that could diminish their unique identity or harm their reputation. There is no counterpart to the FTDA that applies to copyrights. Maybe there should be. Still, a litigator could base an action for “market dilution” on common law tort principles, under which a wrongful act or omission that injures or harms the copyright owner amounts to a civil wrong for which courts can impose liability. But such an action may not fall under the current Copyright statute and so would need to be separately pleaded.
Providing Exact Copies or Infringing Knockoffs to the Public Is Copyright Infringement
In Anthropic, Judge Alsup made it crystal clear that an AI model’s provision to users of exact copies or knockoffs [i.e., works substantially similar in expressive elements] of the copyrighted works is copyright infringement. “Authors concede that training LLMs did not result in any exact copies nor even infringing knockoffs of their works being provided to the public. If that were not so, this would be a different case. Authors remain free to bring that case in the future should such facts develop.” Anthropic at *49.
In Meta, there do not appear to have been any exact copies or knockoffs [i.e., works substantially similar in expressive elements] of the copyrighted works in the AI model’s output that would have warranted discussion. Still, “market dilution,” as an “effect upon the potential market for or value of the copyrighted work [the fourth factor in determining fair use],” as suggested by Judge Chhabria, would appear to require some similarity between the expression of an AI model and the copyrighted work. That similarity would appear to be subsumed in the “infringing knockoffs” referred to by Judge Alsup in Anthropic. See also Andersen v. Stability AI Ltd., Case No. 23-cv-00201-WHO (N.D. Cal. Aug. 12, 2024) (“direct infringement depend[s] on whether plaintiffs’ protected works are contained, in some manner, in Stable Diffusion as distributed and operated.”). Stability AI Ltd. at 974. See also the FTDA, under which dilution is applied to trademarks.
A newly filed case brought by Disney and Universal is likely to provide significant guidance on whether the generation by an AI model of exact copies or knockoffs [i.e., works substantially similar in expressive elements] of copyrighted works is copyright infringement. Disney Enterprises, Inc., et al. v. Midjourney, Inc., Case No. 25-5275 (C.D. Cal. June 11, 2025). There, Disney and Universal have accused AI image-generation company Midjourney of copyright infringement by using AI models that are “generating endless unauthorized copies of Disney’s and Universal’s copyrighted works” and that “blatantly incorporate[s] and cop[ies] Disney’s and Universal’s famous characters” like Darth Vader, the Minions, Shrek, and others. See Complaint ¶¶ 1-11. Here, the trained AI models are alleged to be providing exact copies or infringing knockoffs to the public. As Judge Alsup noted in Anthropic, “this would be a different case.” Anthropic at *49.
As a final note on Midjourney, it does not appear to be a coincidence that prompts to Midjourney’s AI model such as “create me a Darth Vader image” are generating look-alikes (or, in the words of Judge Alsup, “infringing knockoffs” [i.e., works substantially similar in expressive elements]) of Disney’s Darth Vader image. The term Darth Vader has become so famously associated with the Darth Vader image and character from George Lucas’ Star Wars that most of the text and images used to train Midjourney’s AI model on the subject of “Darth Vader” are likely the very text and images of Darth Vader created by George Lucas and now owned by Disney. The same holds true for the Minions, Shrek, and other famous characters.
Thus, famous characters are likely to be strongly protected against infringement by AI-generated images under the current Copyright statute. This is because an AI model’s output in response to user queries that invoke a famous character is likely to be an “infringing knockoff” [i.e., a work substantially similar in expressive elements] of that character, since a substantial part of the text and images used to train the AI model as to that character is likely to be the very text and images of the character itself. As for a non-famous copyrighted work that may form but a small part of the training of an AI model, it appears the copyright holder is currently left to demonstrate that the AI model is generating an exact copy or infringing knockoff [i.e., a work substantially similar in expressive elements] of that work. That is, unless Congress enacts a Federal Copyright Dilution Act that protects not only famous but also non-famous copyrighted works (unlike the Federal Trademark Dilution Act (FTDA), which protects only famous marks against dilution). Without such additional protection, as Judge Alsup correctly points out in Anthropic, the Copyright Act seeks to advance original works of authorship and not to protect authors against competition. Anthropic, at *49, 50.
For copyright practitioners, the lessons from these decisions are these:
1. Use of copyrighted works to train AI models is not copyright infringement so long as the trained AI models “[do] not result in exact copies nor ‘infringing knockoffs’ of copyrighted works being provided to the public.”
2. An AI model's provision to the public of exact copies or infringing knockoffs of the copyrighted works it was trained on is copyright infringement.
3. Copyright litigators should consider adding a common law action based on market dilution.
4. Congress should consider enacting a Federal Copyright Dilution Act to protect against the loss in value of copyrights due to the explosion of competing works by AI models, much as the Federal Trademark Dilution Act protects against the loss in trademark value, but covering not only famous but also non-famous copyrighted works.
¹ Courts use four factors to determine fair use: (1) the purpose and character of the use, (2) the nature of the copyrighted work, (3) the amount and substantiality of the portion used in relation to the copyrighted work as a whole, and (4) the effect of the use upon the potential market for or value of the copyrighted work. Section 107 of the Copyright Act.
² Bartz v. Anthropic PBC also involved copyright infringement allegations based on the creation of libraries by (a) digitizing purchased copyrighted works and (b) downloading pirated copyrighted works. As to the first library, Judge Alsup granted summary judgment because digitizing hard copies for digital library storage is transformative and had little impact on the market for the copyrighted works. Anthropic at *57. Summary judgment was denied as to the second library because the downloads were not transformative and deprived the copyright holders of the market for their works. Id.