From ChatGPT to Getty v. Stability AI: A Running List of Key AI-Lawsuits

July 1, 2024 - By TFL

The rising adoption of artificial intelligence (“AI”) across industries (including fashion, retail, luxury, etc.) in recent years is bringing with it no shortage of lawsuits, as parties look to navigate the budding issues that these relatively new models raise for companies and creators alike. A growing number of lawsuits focus on generative AI, in particular, which refers to models that use neural networks to identify the patterns and structures within existing data to generate new content. Lawsuits are being waged against the developers behind some of the biggest generative AI chatbots and text-to-image generators, such as ChatGPT and Stability AI, and in many cases, they center on how the underlying models are trained, the data that is used to do so, and the nature of the user-prompted output (which is allegedly infringing in many cases), among other things.

In light of the onslaught of legal questions that have come about in connection with the rise of AI, we take a high-level look at some of the most striking lawsuits that are playing out in this space and corresponding developments. They are listed by filing date …

Jun. 27, 2024: Center for Investigative Reporting, Inc. v. OpenAI, Inc., et al.

The oldest nonprofit newsroom in the country has filed suit against OpenAI and Microsoft, accusing the ChatGPT and Copilot creators of engaging in copyright infringement and violating the Digital Millennium Copyright Act (“DMCA”). In the complaint that it lodged with the U.S. District Court for the Southern District of New York on June 27, the Center for Investigative Reporting, Inc. (“CIR”) alleges that OpenAI and Microsoft (the “defendants”) are offering up AI products that “are built on uncompensated and unauthorized use of the creative works of humans.” Specifically, CIR claims (citing data from “award-winning website Copyleaks”) that “nearly 60% of the responses provided by the defendants’ GPT-3.5 product contained some form of plagiarized content, and over 45% contained text that was identical to pre-existing content.”

According to CIR, the defendants “copied, used, abridged, and displayed [its] valuable content without [its] permission or authorization, and without any compensation to CIR,” thereby, “undermin[ing] and damag[ing] its relationship with potential readers, consumers, and partners, and deprive CIR of subscription, licensing, advertising, and affiliate revenue, as well as donations from readers.” 

Setting out claims of direct and contributory copyright infringement, CIR argues that the defendants infringed its exclusive rights in its registered works by: “(1) downloading those works from the internet; (2) encoding the Registered Works in computer memory; (3) regurgitating those works verbatim or nearly verbatim in response to prompts by ChatGPT users; (4) producing significant amounts of material from those works in response to prompts by ChatGPT users; and (5) producing significant amounts of material from those works in response to prompts by ChatGPT users.” And in furtherance of its DMCA claims, CIR contends that the defendants “created copies of [its] works of journalism with copyright notice information removed.”

May 16, 2024: Lehrman, et al. v. LOVO, Inc.

Voice-over actors Paul Lehrman and Linnea Sage have filed a right of publicity and false advertising lawsuit against LOVO, Inc., a startup in the business of selling “a text-to-speech subscription service that allows its clients – typically companies – to generate voice-over narrations at a fraction of the cost of the traditional model.” According to Lehrman and Sage’s complaint, LOVO enables “subscribing customers to upload a script into its AI-driven software … and generate a professional-quality voice-over based on certain criteria,” and it “promotes its service using barely-disguised images and names of celebrities and states on its website, ‘Clone any voice.’”

“Implicit in LOVO’s offerings to its customers is that each voice-over actor has agreed to LOVO’s terms and conditions for customers to be able to access that,” Lehrman and Sage assert. The problem with that, they claim, is that they (and other members of the class) “have not agreed to LOVO’s terms,” and that LOVO has “stolen and used” their “voices and/or identities to create millions of voice-over productions without permission or proper compensation, in violation of numerous state right of privacy laws, and the federal Lanham Act.”

Apr. 30, 2024: Daily News, LP, et al. v. Microsoft Corp., et al.

A group of eight news publications has filed a copyright infringement and trademark dilution lawsuit against Microsoft and OpenAI in a New York federal court, accusing the generative AI pioneer and its partner of “purloining millions of [their] copyrighted articles without permission and without payment to fuel the commercialization of their generative artificial intelligence products, including ChatGPT and Copilot.” The plaintiffs – which include Chicago Tribune Company, Orlando Sentinel Communications Company, and San Jose Mercury-News, among other newspapers – argue that while OpenAI and Microsoft pay for the other elements of their businesses, such as computers, specialized chips, electricity, and programmers and other technical employees, they have opted not to pay for the “high quality content” that they need “to make their GenAI products successful.”

“Despite admitting that they need copyrighted content to produce a commercially viable GenAI product,” the plaintiffs claim that OpenAI and Microsoft “contend that they can fuel the creation and operation of these products with the [plaintiffs]’ content without permission and without paying for the privilege.” But “they are wrong on both counts,” according to the plaintiffs, who set out claims of direct, vicarious, and contributory copyright infringement, violations of the DMCA, common law unfair competition by misappropriation, federal trademark dilution, and dilution and injury to business reputation under New York General Business Law.

Apr. 26, 2024: Zhang et al. v. Google LLC and Alphabet Inc.

A group of visual artists has filed suit against Google LLC and its owner Alphabet Inc. in a federal court in Northern California, alleging that the tech titans made unauthorized use of the artists’ copyright-protected artworks to train Google’s AI-powered image generator, Imagen. Neither the plaintiffs nor any of the proposed class members ever authorized Google to use their copyrighted works as training material, according to the complaint, which states that “these copyrighted training images were copied multiple times by Google during the training process for Imagen.” And because Imagen “contains weights that represent a transformation of the protected expression in the training dataset, Imagen is, itself, an infringing derivative work.”

Meanwhile, the plaintiffs – who set out claims of direct copyright infringement against Google and vicarious copyright infringement against Alphabet – further assert that Alphabet, “as the corporate parent of Google, also commercially benefits from these acts of massive copyright infringement.”

Mar. 8, 2024: Nazemian, et al. v. NVIDIA Corp.

NVIDIA Corp. has landed on the receiving end of a copyright infringement complaint filed with the N.D. Cal. on March 8, with author-plaintiffs Abdi Nazemian, Brian Keene, and Stewart O’Nan (collectively, the “plaintiffs”) alleging that their copyright-protected books “were included in the training dataset that NVIDIA has admitted copying to train its NeMo Megatron models.” In their brief complaint, in which they set out a single claim of direct copyright infringement, the plaintiffs assert that NVIDIA “has admitted training its NeMo Megatron models” on a copy of a dataset called The Pile, and therefore, “necessarily also trained its NeMo Megatron models on a copy of Books3, because Books3 is part of The Pile.”

Since “certain books written by the plaintiffs are part of Books3, including the infringed works and NVIDIA necessarily trained its NeMo Megatron models on one or more copies of the infringed works,” they claim that NVIDIA is directly infringing their copyrights.

Feb. 28, 2024: Raw Story Media, et al. v. OpenAI, Inc., et al.

The latest lawsuit to be filed against OpenAI comes by way of news outlets Raw Story Media, Inc. and AlterNet Media, Inc. (the “plaintiffs”), which accuse the generative AI giant of “repackag[ing]” their “copyrighted journalism work product” by way of the outputs from its popular ChatGPT platform. Setting the stage in their complaint, the plaintiffs claim that “at least some of the time, ChatGPT provides or has provided responses to users that regurgitate verbatim or nearly verbatim copyright-protected works of journalism without providing any author, title, or copyright information contained in those works,” while other times, it “provides or has provided responses to users that mimic significant amounts of material from copyright-protected works of journalism without providing any author, title, or copyright information contained in those works.”

Part of the problem here, according to the plaintiffs, stems from how OpenAI trains the models that power ChatGPT: “When they populated their training sets with works of journalism, [OpenAI] had a choice: they could train ChatGPT using works of journalism with the copyright management information protected by the Digital Millennium Copyright Act (‘DMCA’) intact, or they could strip it away.” OpenAI “chose the latter,” the plaintiffs assert, and “in the process, trained ChatGPT not to acknowledge or respect copyright, not to notify ChatGPT users when the responses they received were protected by journalists’ copyrights, and not to provide attribution when using the works of human journalists.”

As such, when ChatGPT provides outputs in response to user prompts, it “gives the impression that it is an all-knowing, ‘intelligent’ source of the information being provided, when in reality, the responses are frequently based on copyrighted works of journalism that ChatGPT simply mimics,” the plaintiffs maintain.

With the foregoing in mind, the plaintiffs set out a single claim under 17 U.S.C. § 1202(b)(1) of the DMCA, on the basis that OpenAI “created copies of [their] works of journalism with author information removed and included them in training sets used to train ChatGPT.”

Feb. 28, 2024: The Intercept Media, Inc. v. OpenAI, Inc.

The Intercept Media similarly waged DMCA claims against OpenAI and Microsoft, accusing the two companies, as well as a number of OpenAI affiliates, of violating the DMCA by creating and using copies of its works of journalism “with author information removed and included them in training sets used to train ChatGPT.” Among other things, the Intercept claims in the complaint that it lodged with the U.S. District Court for the Southern District of New York that OpenAI and co. “had reason to know that ChatGPT would be less popular and would generate less revenue if users believed that ChatGPT responses violated third-party copyrights or if users were otherwise concerned about further distributing ChatGPT responses.”

This is at least because the defendants “were aware that they derive revenue from user subscriptions, that at least some likely users of ChatGPT respect the copyrights of others or fear liability for copyright infringement, and that such users would not pay to use a product that might result in copyright liability or did not respect the copyrights of others.”

Like Raw Story Media and AlterNet Media, the Intercept accuses OpenAI and co. of violating 17 U.S.C. § 1202(b)(1) of the DMCA by “creat[ing] copies of [its] works of journalism with author information removed and included them in training sets used to train ChatGPT.” The Intercept goes further, though, and sets out claims under 17 U.S.C. § 1202(b)(3) on the basis that the defendants “shared copies of [its] works without author, title, copyright, and terms of use information” with each other “in connection with the development of ChatGPT.”

Jan. 25, 2024: Main Sequence, et al. v. Dudesy LLC, et al.

On the heels of Dudesy, a media company in the business of creating AI-generated works, releasing an hour-long special featuring an AI-generated imitation of George Carlin’s voice on the Dudesy podcast’s YouTube channel on January 9, the late comedian’s estate has lodged right of publicity and copyright infringement claims in a federal court in California. According to the complaint, dated January 25, more than 16 years after Carlin’s death, Dudesy and its founders, comedian Will Sasso and writer Chad Kultgen, “took it upon themselves to ‘resurrect’ Carlin with the aid of AI.”

“Using Carlin’s original copyrighted works,” Dudesy LLC, Sasso, and Kultgen (collectively, “Dudesy” and/or “defendants”) “created a script for a fake George Carlin comedy special and generated a sound-alike of George Carlin to ‘perform’ the generated script,” according to Main Sequence, Ltd., Jerold Hamza as executor for the Estate of George Carlin, and Jerold Hamza in his individual capacity (collectively, “Carlin’s estate” and/or the “plaintiffs”). The plaintiffs assert that “none of the defendants had permission to use Carlin’s likeness for the AI-generated ‘George Carlin Special,’ nor did they have a license to use any of the late comedian’s copyrighted materials.”

Against that background, they set out claims of violation of rights of publicity under California common law and deprivation of rights of publicity under Cal. Civ. Code § 3344.1; they are taking issue with Dudesy’s use of Carlin’s “name, reputation, and likeness,” namely, their use of “generated images of Carlin, Carlin’s voice, and images designed to evoke Carlin’s presence on a stage.” The plaintiffs also set out a claim of federal copyright infringement, arguing that the defendants have “unlawfully used [the] plaintiffs’ copyrighted works for building and training a dataset for purposes of generating an output intended to mimic the plaintiffs’ copyrighted work (i.e., Carlin’s stand-up comedy).”

With the foregoing in mind, the plaintiffs are seeking monetary damages, as well as preliminary and permanent injunctive relief to bar Dudesy and co. “from directly committing, aiding, encouraging, enabling, inducing, causing, materially contributing to, or otherwise facilitating use of George Carlin’s copyrighted works to generate Dudesy Specials and any other contents created or disseminated by Dudesy, LLC relating to those Dudesy Specials.” Additionally, they want the court to order Dudesy to “immediately remove, take down, and destroy any video or audio copies (including partial copies) of the ‘George Carlin Special,’ wherever they may be located.”

UPDATED (Apr. 2, 2024): Main Sequence and Dudesy have settled the suit, as indicated by their filing of a joint stipulation with the court consenting to a judgment and permanent injunction barring Dudesy and co. from “uploading, posting or broadcasting the [‘George Carlin: I’m Glad I’m Dead (2024) – Full Special’] on the Dudesy Podcast, or in any content posted to any website, account or platform (including, without limitation, YouTube and social media websites) controlled by [them].” The defendants are also barred from “using George Carlin’s image, voice or likeness on the Dudesy Podcast, or in any content posted to any website, account or platform … controlled by [them] without the express written approval of the plaintiffs.”

Jan. 5, 2024: Basbanes v. Microsoft Corp. and OpenAI, et al.

Journalists Nicholas Basbanes and Nicholas Ngagoyeanes (professionally known as “Nicholas Gage”) have filed a direct, vicarious, and contributory copyright infringement suit against Microsoft Corporation and OpenAI, Inc., along with an array of affiliated OpenAI entities, arguing that the defendants copied their work “to build a massive commercial enterprise that is now valued at billions of dollars.” In particular, the plaintiffs assert that Microsoft and OpenAI, “as sophisticated commercial entities, clearly decided upon a deliberate strategy to steal the plaintiffs’ copyrighted works to power their massive commercial enterprise … [without] paying for the inputs that make their LLMs, and which are thus plainly derivative works [that] result in an even higher profit margin for the defendants.”

Pointing to the case that The New York Times waged against Microsoft and OpenAI as the impetus for the case at hand, Basbanes and Ngagoyeanes assert that “shortly after The New York Times filed suit against these same defendants in this court, the defendants publicly acknowledged that copyright owners like the plaintiffs must be compensated for the defendants’ use of their work: ‘We respect the rights of content creators and owners and are committed to working with them to ensure they benefit from AI technology and new revenue models.’”

Against that background, the plaintiffs state that they “seek to represent a class of writers whose copyrighted work has been systematically pilfered by the defendants,” seeking damages for “copyright infringement, the lost opportunity to license their works, and for the destruction of the market the defendants have caused and continue to cause to writers.” The plaintiffs are also seeking a permanent injunction “to prevent these harms from recurring.”

2023

Dec. 27, 2023: New York Times Company v. Microsoft Corp. and OpenAI, et al.

The New York Times is accusing OpenAI and partner Microsoft of copyright infringement, violations of the Digital Millennium Copyright Act, unfair competition by misappropriation, and trademark dilution in a new lawsuit. According to the complaint that it filed with the U.S. District Court for the Southern District of New York on December 27, the Times alleges that the defendants are on the hook for making “unlawful use of The Times’s work to create artificial intelligence products that compete with it [and that] threatens The Times’s ability to provide [trustworthy information, news analysis, and commentary].” The defendants’ generative AI tools “rely on large-language models that were built by copying and using millions of The Times’s copyrighted news articles, in-depth investigations, opinion pieces, reviews, how-to guides, and more,” the paper claims.

The Times contends that the defendants “insist that their conduct is protected as ‘fair use’ because their unlicensed use of copyrighted content to train GenAI models serves a new ‘transformative’ purpose,” but the paper argues that “there is nothing ‘transformative’ about using The Times’s content without payment to create products that substitute for The Times and steal audiences away from it.” Moreover, it maintains that “because the outputs of the defendants’ GenAI models compete with and closely mimic the inputs used to train them, copying Times works for that purpose is not fair use.”

In addition to its copyright-centric claims, the Times sets out a trademark dilution cause of action, asserting that the defendants “have, in connection with the commerce of producing GenAI to users for profit throughout the United States, including in New York, engaged in the unauthorized use of The Times’s trademarks in outputs generated by [their] GPT-based products.” In particular, it alleges that the defendants’ “unauthorized use of The Times’s marks on lower quality and inaccurate writing dilutes the quality of The Times’s trademarks by tarnishment in violation of 15 U.S.C. § 1125(c).” On this front, the New York Times claims that “at the same time as the defendants’ models are copying, reproducing, and paraphrasing [its] content without consent or compensation, they are also causing The Times commercial and competitive injury by misattributing content to The Times that it did not publish,” thereby giving rise to “misinformation.”

While The Times “has attempted to reach a negotiated agreement with the defendants … to permit the use of its content in new digital products,” it claims that “the negotiations have not led to a resolution.”

Nov. 21, 2023: Sancton v. OpenAI Inc., Microsoft Corporation, et al.

Reporter Julian Sancton has filed suit against OpenAI and Microsoft, alleging that the tech titans “have built a business valued into the tens of billions of dollars by taking the combined works of humanity without permission.” In the complaint that he lodged with the U.S. District Court for the Southern District of New York on November 21, Sancton claims that “rather than pay for intellectual property,” the defendants “pretend as if the laws protecting copyright do not exist.” Among such IP? Sancton’s book, Madhouse at the End of the Earth, along with “thousands, maybe more, [of other] copyrighted works – including nonfiction books,” which the defendants allegedly used to train their AI models. The problem, per Sancton, is that “the U.S. Constitution protects the fundamental principle that creators” – including “nonfiction authors, [who] often spend years conceiving, researching, and writing their creations” – “deserve compensation for their works.”

The bottom line in the complaint, in which Sancton sets out claims of direct and contributory copyright infringement, is that “the basis of the OpenAI platform is nothing less than the rampant theft of copyrighted works.”

Oct. 18, 2023: Concord Music Group, Inc. v. Anthropic PBC

A pool of music publishers has lodged a complaint against AI startup Anthropic PBC for allegedly engaging in “systematic and widespread infringement of their copyrighted song lyrics.” According to the complaint, which was filed with the U.S. District Court for the Middle District of Tennessee, “In the process of building and operating AI models, Anthropic unlawfully copies and disseminates vast amounts of copyrighted works – including the lyrics to myriad musical compositions owned or controlled by the publisher plaintiffs.” The plaintiffs – which range from Concord Music Group to Universal Music – assert that they “embrace innovation and recognize the great promise of AI when used ethically and responsibly, but Anthropic violates these principles on a systematic and widespread basis.”

In particular, Universal and co. claim that Anthropic builds its AI models by using “lyrics to innumerable musical compositions for which [they] own or control the copyrights, among countless other copyrighted works harvested from the internet.” Such copyrighted material is “not free for the taking simply because it can be found on the internet,” the plaintiffs assert, alleging that Anthropic “has neither sought nor secured the publishers’ permission to use their valuable copyrighted works in this way.” Among the copyright-protected song lyrics, according to the plaintiffs, are those from artists like the Rolling Stones, Garth Brooks, Katy Perry, and Gloria Gaynor, which Anthropic’s model allegedly infringed by way of output that is “identical or substantially and strikingly similar to Publishers’ copyrighted lyrics for each of the compositions.”

In furtherance of their direct copyright infringement, vicarious copyright infringement, and contributory copyright infringement claims, their DMCA claims for removal of copyright management information, and their demands for relief, the plaintiffs argue that Anthropic “must abide by well-established copyright laws, just as countless other technology companies regularly do.”

Sept. 19, 2023: Authors Guild, et al. v. OpenAI, Inc.

The Authors Guild and more than a dozen authors, including John Grisham and George R.R. Martin, are suing an array of OpenAI entities for allegedly engaging in “a systematic course of mass-scale copyright infringement that violates the rights of all working fiction writers and their copyright holders equally, and threatens them with similar, if not identical, harm.” In the complaint that they filed with the U.S. District Court for the Southern District of New York on September 19, the plaintiffs, who are authors of “a broad array of works of fiction,” claim that they are “seeking redress for [OpenAI’s] flagrant and harmful infringements of [their] registered copyrights” by way of its “wholesale” copying of such works without permission or consideration.

Specifically, the plaintiffs claim that by way of datasets that include the texts of their books, OpenAI “fed [their] copyrighted works into its ‘large language models,’ [which are] algorithms designed to output human-seeming text responses to users’ prompts and queries,” and which are “at the heart of [its] massive commercial enterprise.” Because OpenAI’s models “can spit out derivative works: material that is based on, mimics, summarizes, or paraphrases the plaintiffs’ works, and harms the market for them,” it is endangering “fiction writers’ ability to make a living, in that the [models] allow anyone to generate – automatically and freely (or very cheaply) – texts that they would otherwise pay writers to create.”

With the foregoing in mind, the plaintiffs set out claims of direct copyright infringement, vicarious copyright infringement, and contributory copyright infringement.

Sept. 8, 2023: Chabon v. OpenAI, Inc.

Authors Michael Chabon, David Henry Hwang, Matthew Klam, Rachel Louise Snyder, and Ayelet Waldman are suing OpenAI on behalf of themselves and a class of fellow “authors holding copyrights in their published works arising from OpenAI’s clear infringement of their intellectual property.” In their September 8 complaint, which was filed with a federal court in Northern California, Chabon and co. claim that OpenAI incorporated their “copyrighted works in datasets used to train its GPT models powering its ChatGPT product.” Part of the issue, according to the plaintiffs, is that “when ChatGPT is prompted, it generates not only summaries, but in-depth analyses of the themes present in [their] copyrighted works, which is only possible if the underlying GPT model was trained using [their] works.”

The plaintiffs claim that they “did not consent to the use of their copyrighted works as training material for GPT models or for use with ChatGPT,” and that by way of its operation of ChatGPT, OpenAI “benefit[s] commercially and profit[s] handsomely from [its] unauthorized and illegal use of the plaintiffs’ copyrighted works.”

UPDATED (Nov. 8, 2023): Following a conference with N.D. Cal. Judge Araceli Martinez-Olguin (and a subsequently-issued pretrial order), this case has been consolidated with Tremblay v. OpenAI, Inc. to form In Re OpenAI ChatGPT Litigation.

Jul. 11, 2023: J.L., C.B., K.S., et al., v. Google LLC

Google is being sued over its alleged practice of “stealing” web-scraped data and “vast troves of private user data from [its] own products” in order to build commercial artificial intelligence (“AI”) products like its Bard chatbot. In the complaint that they filed with a California federal court on July 11, J.L., C.B., K.S., P.M., N.G., R.F., J.D., and G.R., who have opted to file anonymously, claim that “for years, Google harvested [our personal and professional information, our creative and copywritten works, our photographs, and even our emails] in secret, without notice or consent from anyone,” thereby engaging in unfair competition, negligence, invasion of privacy, and copyright infringement, among other causes of action.

UPDATED (Jun. 6, 2024): N.D. Cal. Judge Araceli Martinez-Olguín dismissed the plaintiffs’ amended complaint, citing “concerns expressed” by fellow N.D. Cal. Judge Vince Girdhari Chhabria, who dismissed the complaint in a similar scraping case in May 2024. According to Judge Martinez-Olguín’s June 6 order, “In light of the concerns expressed by Judge Chhabria in his order dismissing the complaint in the matter of Cousart v. OpenAI LP, and given the overlap in the plaintiffs named, the involved plaintiffs’ counsel, and the claims asserted in this case and Cousart, Google’s motion to dismiss Plaintiffs’ amended complaint is granted” – albeit without prejudice. The plaintiffs may file another amended complaint within 21 days of the court’s order.


This is a short excerpt from a Tracker that was published exclusively for TFL Pro+ subscribers.

Updated

April 2, 2024

This article was initially published on June 5, 2023, and has been updated to reflect newly-filed lawsuits and updates in previously-reported cases.
