In light of the onslaught of legal questions that have come about in connection with the rise of AI, we take a high-level look at some of the most striking lawsuits playing out in this space, along with corresponding developments. They are listed by filing date …
Oct. 21, 2024: Dow Jones & Co. v. Perplexity AI
Dow Jones and the New York Post have filed suit against Perplexity AI, which they claim is engaging in a “brazen scheme to compete for readers while simultaneously freeriding on the valuable content the publishers produce.” Specifically, the Wall Street Journal owner and the New York Post argue that Perplexity “claims to provide its users accurate and up-to-date news and information in a platform that, in [its] own words, allows users to ‘Skip the Links’ to original publishers’ websites,” which the AI platform “attempts to accomplish … by engaging in a massive amount of illegal copying of publishers’ copyrighted works and diverting customers and critical revenues away from those copyright holders.” With the foregoing in mind, the plaintiffs set out claims of copyright infringement (with regard to the input and output stages), false designation of origin, and trademark dilution.
Sept. 11, 2024: Gemini Data v. Google
Google is being sued for trademark infringement for allegedly hijacking a smaller but older company’s name for a rebrand of its Bard chatbot. According to the complaint that it filed with the U.S. District Court for the Northern District of California on September 11, Gemini Data claims that in February 2024, “without any authorization by Gemini Data, Google publicly announced a re-branding of its BARD AI chatbot tool to ‘GEMINI.’” Gemini Data claims that Google, as a sophisticated company, “undoubtedly conducted a trademark clearance search prior to publicly re-branding its entire line of AI products, and thus was unequivocally aware of Gemini Data’s registered and exclusive rights to the ‘GEMINI’ brand.” Yet, Google “made the calculated decision to bulldoze over Gemini Data’s exclusive rights without hesitation,” it claims.
While Gemini Data says that it “does not hold a monopoly over the development of generative AI tools, it does have exclusive rights to the ‘GEMINI’ brand for AI tools,” noting that it “took all the steps to ensure it created a unique brand to identify its AI tools and to subsequently protect that brand.” That did not stop Google from “unabashedly wield[ing] its power to rob Gemini Data of its cultivated brand … assuming a small company like Gemini Data would not be in a position to challenge a corporate giant wielding overwhelming power.”
With the foregoing in mind, Gemini Data sets out claims of federal and state law trademark infringement, false designation of origin, and unfair competition, and is seeking monetary damages, as well as injunctive relief.
Aug. 19, 2024: Andrea Bartz, et al. v. Anthropic PBC
Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson are suing Anthropic for copyright infringement in connection with its generative AI product, Claude. In the complaint that they lodged with the U.S. District Court for the Northern District of California on August 19, the author plaintiffs allege that Anthropic has built “a multibillion-dollar business by stealing hundreds of thousands of copyrighted books.” Rather than “obtaining permission and paying a fair price for the creations it exploits, Anthropic pirated them.” While the Constitution “recognizes the fundamental principle that creators deserve compensation for their work,” Anthropic ignored copyright protections, the plaintiffs argue, claiming that “an essential component of Anthropic’s business model and its flagship ‘Claude’ family of large language models is the large-scale theft of copyrighted works.”
Delving into the damage created by Anthropic’s alleged infringement, Bartz, Graeber, and Johnson assert that as AI-powered models “have become more advanced and enabled to train on more and more copyrighted material, they are able to generate more content and more sophisticated content.” The result of this, they maintain, is that it is “easier than ever to generate rip-offs of copyrighted books that compete with the original, or at a minimum dilute the market for the original copyrighted work.” Anthropic’s Claude platform has been used to “generate cheap book content,” according to the plaintiffs, who claim that Claude “could not generate this kind of long-form content if it were not trained on a large quantity of books, books for which Anthropic paid authors nothing.”
In short: “The success and profitability of Anthropic is predicated on mass copyright infringement without a word of permission from or a nickel of compensation to copyright owners, including Plaintiffs here.” With the foregoing in mind, the plaintiffs set out a single claim of copyright infringement and are seeking certification of their class action, as well as monetary damages and injunctive relief.
Jun. 27, 2024: Center for Investigative Reporting, Inc. v. OpenAI, Inc., et al.
The oldest nonprofit newsroom in the country has filed suit against OpenAI and Microsoft, accusing the ChatGPT and Copilot creators of engaging in copyright infringement and violating the Digital Millennium Copyright Act (“DMCA”). In the complaint that it lodged with the U.S. District Court for the Southern District of New York on June 27, the Center for Investigative Reporting, Inc. (“CIR”) alleges that OpenAI and Microsoft (the “defendants”) are offering up AI products that “are built on uncompensated and unauthorized use of the creative works of humans.” Specifically, CIR claims (citing data from “award-winning website Copyleaks”) that “nearly 60% of the responses provided by the defendants’ GPT-3.5 product contained some form of plagiarized content, and over 45% contained text that was identical to pre-existing content.”
According to CIR, the defendants “copied, used, abridged, and displayed [its] valuable content without [its] permission or authorization, and without any compensation to CIR,” thereby “undermin[ing] and damag[ing] its relationship with potential readers, consumers, and partners, and depriv[ing] CIR of subscription, licensing, advertising, and affiliate revenue, as well as donations from readers.”
Setting out claims of direct and contributory copyright infringement, CIR argues that the defendants infringed its exclusive rights in its registered works by: “(1) downloading those works from the internet; (2) encoding the Registered Works in computer memory; (3) regurgitating those works verbatim or nearly verbatim in response to prompts by ChatGPT users; and (4) producing significant amounts of material from those works in response to prompts by ChatGPT users.” And in furtherance of its DMCA claims, CIR contends that the defendants “created copies of [its] works of journalism with copyright notice information removed.”
May 16, 2024: Lehrman, et al. v. LOVO, Inc.
Voice-over actors Paul Lehrman and Linnea Sage have filed a right of publicity and false advertising lawsuit against LOVO, Inc., a startup in the business of selling “a text-to-speech subscription service that allows its clients – typically companies – to generate voice-over narrations at a fraction of the cost of the traditional model.” According to Lehrman and Sage’s complaint, LOVO enables “subscribing customers to upload a script into its AI-driven software … and generate a professional-quality voice-over based on certain criteria,” and that it “promotes its service using barely-disguised images and names of celebrities and states on its website, ‘Clone any voice.’”
“Implicit in LOVO’s offerings to its customers is that each voice-over actor has agreed to LOVO’s terms and conditions for customers to be able to access that,” Lehrman and Sage assert. The problem with that, they claim, is that they (and other members of the class) “have not agreed to LOVO’s terms,” and that LOVO has “stolen and used” their “voices and/or identities to create millions of voice-over productions without permission or proper compensation, in violation of numerous state right of privacy laws, and the federal Lanham Act.”
Apr. 30, 2024: Daily News, LP, et al. v. Microsoft Corp., et al.
A group of eight news publications has filed a copyright infringement and trademark dilution lawsuit against Microsoft and OpenAI in a New York federal court, accusing the generative AI pioneer and its partner of “purloining millions of [their] copyrighted articles without permission and without payment to fuel the commercialization of their generative artificial intelligence products, including ChatGPT and Copilot.” The plaintiffs – which include Chicago Tribune Company, Orlando Sentinel Communications Company, and San Jose Mercury-News, among other newspapers – argue that while OpenAI and Microsoft pay for the other elements of their businesses, such as computers, specialized chips, electricity, and programmers and other technical employees, they have opted not to pay for the “high quality content” that they need “to make their GenAI products successful.”
“Despite admitting that they need copyrighted content to produce a commercially viable GenAI product,” the plaintiffs claim that OpenAI and Microsoft “contend that they can fuel the creation and operation of these products with the [plaintiffs]’ content without permission and without paying for the privilege.” But “they are wrong on both counts,” according to the plaintiffs, who set out claims of direct, vicarious, and contributory copyright infringement, violations of the DMCA, common law unfair competition by misappropriation, federal trademark dilution, and dilution and injury to business reputation under New York General Business Law.
Apr. 26, 2024: Zhang et al. v. Google LLC and Alphabet Inc.
A group of visual artists has filed suit against Google LLC and its owner Alphabet Inc., alleging that the tech titans made unauthorized use of their copyright-protected artworks to train Google’s AI-powered image generator, Imagen. Neither the plaintiffs nor any of the proposed class members ever authorized Google to use their copyrighted works as training material, according to the complaint, which states that “these copyrighted training images were copied multiple times by Google during the training process for Imagen.” And because Imagen “contains weights that represent a transformation of the protected expression in the training dataset, Imagen is, itself, an infringing derivative work.”
Meanwhile, the plaintiffs – who set out claims of direct copyright infringement against Google and vicarious copyright infringement against Alphabet – further assert that Alphabet, “as the corporate parent of Google, also commercially benefits from these acts of massive copyright infringement.”
Mar. 8, 2024: Nazemian, et al. v. NVIDIA Corp.
NVIDIA Corp. has landed on the receiving end of a copyright infringement complaint filed with the N.D. Cal. on March 8, with author-plaintiffs Abdi Nazemian, Brian Keene, and Stewart O’Nan (collectively, the “plaintiffs”) alleging that their copyright-protected books “were included in the training dataset that NVIDIA has admitted copying to train its NeMo Megatron models.” In their brief complaint, in which they set out a single claim of direct copyright infringement, the plaintiffs assert that NVIDIA “has admitted training its NeMo Megatron models” on a copy of a dataset called The Pile, and therefore, “necessarily also trained its NeMo Megatron models on a copy of Books3, because Books3 is part of The Pile.”
Since “certain books written by the plaintiffs are part of Books3, including the infringed works and NVIDIA necessarily trained its NeMo Megatron models on one or more copies of the infringed works,” they claim that NVIDIA is directly infringing their copyrights.
Feb. 28, 2024: Raw Story Media, et al. v. OpenAI, Inc., et al.
The latest lawsuit to be filed against OpenAI comes by way of news outlets Raw Story Media, Inc. and AlterNet Media, Inc. (the “plaintiffs”), which accuse the generative AI giant of “repackag[ing]” their “copyrighted journalism work product” by way of the outputs from its popular ChatGPT platform. Setting the stage in their complaint, the plaintiffs claim that “at least some of the time, ChatGPT provides or has provided responses to users that regurgitate verbatim or nearly verbatim copyright-protected works of journalism without providing any author, title, or copyright information contained in those works,” while other times, it “provides or has provided responses to users that mimic significant amounts of material from copyright-protected works of journalism without providing any author, title, or copyright information contained in those works.”
Part of the problem here, according to the plaintiffs, stems from how OpenAI trains the models that power ChatGPT: “When they populated their training sets with works of journalism, [OpenAI] had a choice: they could train ChatGPT using works of journalism with the copyright management information protected by the Digital Millennium Copyright Act (‘DMCA’) intact, or they could strip it away.” OpenAI “chose the latter,” the plaintiffs assert, and “in the process, trained ChatGPT not to acknowledge or respect copyright, not to notify ChatGPT users when the responses they received were protected by journalists’ copyrights, and not to provide attribution when using the works of human journalists.”
As such, when ChatGPT provides outputs in response to user prompts, it “gives the impression that it is an all-knowing, ‘intelligent’ source of the information being provided, when in reality, the responses are frequently based on copyrighted works of journalism that ChatGPT simply mimics,” the plaintiffs maintain.
With the foregoing in mind, the plaintiffs set out a single claim under 17 U.S.C. § 1202(b)(1) of the DMCA, on the basis that OpenAI “created copies of [their] works of journalism with author information removed and included them in training sets used to train ChatGPT.”
Feb. 28, 2024: The Intercept Media, Inc. v. OpenAI, Inc.
The Intercept Media similarly waged DMCA claims against OpenAI and Microsoft, accusing the two companies, as well as a number of OpenAI affiliates, of violating the DMCA by creating and using copies of its works of journalism “with author information removed and included them in training sets used to train ChatGPT.” Among other things, the Intercept claims in the complaint that it lodged with the U.S. District Court for the Southern District of New York that OpenAI and co. “had reason to know that ChatGPT would be less popular and would generate less revenue if users believed that ChatGPT responses violated third-party copyrights or if users were otherwise concerned about further distributing ChatGPT responses.”
This is at least because the defendants “were aware that they derive revenue from user subscriptions, that at least some likely users of ChatGPT respect the copyrights of others or fear liability for copyright infringement, and that such users would not pay to use a product that might result in copyright liability or did not respect the copyrights of others.”
Like Raw Story Media and AlterNet Media, the Intercept accuses OpenAI and co. of violating 17 U.S.C. § 1202(b)(1) of the DMCA by “creat[ing] copies of [its] works of journalism with author information removed and includ[ing] them in training sets used to train ChatGPT.” The Intercept goes further, though, and sets out claims under 17 U.S.C. § 1202(b)(3) on the basis that the defendants “shared copies of [its] works without author, title, copyright, and terms of use information” with each other “in connection with the development of ChatGPT.”
Jan. 25, 2024: Main Sequence, et al. v. Dudesy LLC, et al.
After Dudesy, a media company in the business of creating AI-generated works, released an hour-long special featuring an AI-generated imitation of George Carlin’s voice on the Dudesy podcast’s YouTube channel on January 9, the late comedian’s estate lodged right of publicity and copyright infringement claims in a federal court in California. According to the complaint, dated January 25, more than 15 years after Carlin’s death, Dudesy and its founders, comedian Will Sasso and writer Chad Kultgen, “took it upon themselves to ‘resurrect’ Carlin with the aid of AI.”
“Using Carlin’s original copyrighted works,” Dudesy LLC, Sasso, and Kultgen (collectively, “Dudesy” and/or “defendants”) “created a script for a fake George Carlin comedy special and generated a sound-alike of George Carlin to ‘perform’ the generated script,” according to Main Sequence, Ltd., Jerold Hamza as executor for the Estate of George Carlin, and Jerold Hamza in his individual capacity (collectively, “Carlin’s estate” and/or the “plaintiffs”). The plaintiffs assert that “none of the defendants had permission to use Carlin’s likeness for the AI-generated ‘George Carlin Special,’ nor did they have a license to use any of the late comedian’s copyrighted materials.”
Against that background, they set out claims of violation of rights of publicity under California common law and deprivation of rights of publicity under Cal. Civ. Code § 3344.1; they are taking issue with Dudesy’s use of Carlin’s “name, reputation, and likeness,” namely, their use of “generated images of Carlin, Carlin’s voice, and images designed to evoke Carlin’s presence on a stage.” The plaintiffs also set out a claim of federal copyright infringement, arguing that the defendants have “unlawfully used [the] plaintiffs’ copyrighted works for building and training a dataset for purposes of generating an output intended to mimic the plaintiffs’ copyrighted work (i.e., Carlin’s stand-up comedy).”
With the foregoing in mind, the plaintiffs are seeking monetary damages, as well as preliminary and permanent injunctive relief to bar Dudesy and co. “from directly committing, aiding, encouraging, enabling, inducing, causing, materially contributing to, or otherwise facilitating use of George Carlin’s copyrighted works to generate Dudesy Specials and any other contents created or disseminated by Dudesy, LLC relating to those Dudesy Specials.” Additionally, they want the court to order Dudesy to “immediately remove, take down, and destroy any video or audio copies (including partial copies) of the ‘George Carlin Special,’ wherever they may be located.”
UPDATED (Apr. 2, 2024): Main Sequence and Dudesy have settled the suit, as indicated by their filing of a joint stipulation with the court consenting to a judgment and permanent injunction barring Dudesy and co. from “uploading, posting or broadcasting the [‘George Carlin: I’m Glad I’m Dead (2024) – Full Special’] on the Dudesy Podcast, or in any content posted to any website, account or platform (including, without limitation, YouTube and social media websites) controlled by [them].” The defendants are also barred from “using George Carlin’s image, voice or likeness on the Dudesy Podcast, or in any content posted to any website, account or platform … controlled by [them] without the express written approval of the plaintiffs.”
Jan. 5, 2024: Basbanes v. Microsoft Corp. and OpenAI, et al.
Journalists Nicholas Basbanes and Nicholas Ngagoyeanes (professionally known as “Nicholas Gage”) have filed a direct, vicarious, and contributory copyright infringement suit against Microsoft Corporation and OpenAI, Inc., along with an array of affiliated OpenAI entities, arguing that the defendants copied their work “to build a massive commercial enterprise that is now valued at billions of dollars.” In particular, the plaintiffs assert that Microsoft and OpenAI, “as sophisticated commercial entities, clearly decided upon a deliberate strategy to steal the plaintiffs’ copyrighted works to power their massive commercial enterprise … [without] paying for the inputs that make their LLMs, and which are thus plainly derivative works [that] result in an even higher profit margin for the defendants.”
Pointing to the case waged against Microsoft and OpenAI as impetus for the case at hand, Basbanes and Ngagoyeanes assert that “shortly after The New York Times filed suit against these same defendants in this court, the defendants publicly acknowledged that copyright owners like the plaintiffs must be compensated for the defendants’ use of their work: ‘We respect the rights of content creators and owners and are committed to working with them to ensure they benefit from AI technology and new revenue models.’”
Against that background, the plaintiffs state that they “seek to represent a class of writers whose copyrighted work has been systematically pilfered by the defendants,” seeking damages for “copyright infringement, the lost opportunity to license their works, and for the destruction of the market the defendants have caused and continue to cause to writers.” The plaintiffs are also seeking a permanent injunction “to prevent these harms from recurring.”
2023
Dec. 27, 2023: New York Times Company v. Microsoft Corp. and OpenAI, et al.
The New York Times is accusing OpenAI and partner Microsoft of copyright infringement, violations of the Digital Millennium Copyright Act, unfair competition by misappropriation, and trademark dilution in a new lawsuit. In the complaint that it filed with the U.S. District Court for the Southern District of New York on December 27, the Times alleges that the defendants are on the hook for making “unlawful use of The Times’s work to create artificial intelligence products that compete with it [and that] threatens The Times’s ability to provide [trustworthy information, news analysis, and commentary].” The defendants’ generative AI tools “rely on large-language models that were built by copying and using millions of The Times’s copyrighted news articles, in-depth investigations, opinion pieces, reviews, how-to guides, and more,” the paper claims.
The Times contends that the defendants “insist that their conduct is protected as ‘fair use’ because their unlicensed use of copyrighted content to train GenAI models serves a new ‘transformative’ purpose,” but the paper argues that “there is nothing ‘transformative’ about using The Times’s content without payment to create products that substitute for The Times and steal audiences away from it.” Moreover, it maintains that “because the outputs of the defendants’ GenAI models compete with and closely mimic the inputs used to train them, copying Times works for that purpose is not fair use.”
In addition to its copyright-centric claims, the Times sets out a trademark dilution cause of action, asserting that the defendants “have, in connection with the commerce of producing GenAI to users for profit throughout the United States, including in New York, engaged in the unauthorized use of The Times’s trademarks in outputs generated by [their] GPT-based products.” In particular, it alleges that the defendants’ “unauthorized use of The Times’s marks on lower quality and inaccurate writing dilutes the quality of The Times’s trademarks by tarnishment in violation of 15 U.S.C § 1125(c).” On this front, the New York Times claims that “at the same time as the defendants’ models are copying, reproducing, and paraphrasing [its] content without consent or compensation, they are also causing The Times commercial and competitive injury by misattributing content to The Times that it did not publish,” thereby, giving rise to “misinformation.”
While The Times “has attempted to reach a negotiated agreement with the defendants … to permit the use of its content in new digital products,” it claims that “the negotiations have not led to a resolution.”
Nov. 21, 2023: Sancton v. OpenAI Inc., Microsoft Corporation, et al.
Reporter Julian Sancton has filed suit against OpenAI and Microsoft, alleging that the tech titans “have built a business valued into the tens of billions of dollars by taking the combined works of humanity without permission.” In the complaint that he lodged with the U.S. District Court for the Southern District of New York on November 21, Sancton claims that “rather than pay for intellectual property,” the defendants “pretend as if the laws protecting copyright do not exist.” Among such IP? Sancton’s book, Madhouse at the End of the Earth, along with “thousands, maybe more, [of other] copyrighted works – including nonfiction books,” which the defendants allegedly used to train their AI models.
The problem, per Sancton, is that “the U.S. Constitution protects the fundamental principle that creators” – including “nonfiction authors, [who] often spend years conceiving, researching, and writing their creations” – “deserve compensation for their works.”
The bottom line in the complaint, in which Sancton sets out claims of direct and contributory copyright infringement, is that “the basis of the OpenAI platform is nothing less than the rampant theft of copyrighted works.”
Oct. 18, 2023: Concord Music Group, Inc. v. Anthropic PBC
A pool of music publishers has lodged a complaint against AI startup Anthropic PBC for allegedly engaging in “systematic and widespread infringement of their copyrighted song lyrics.” According to the complaint, which was filed with the U.S. District Court for the Middle District of Tennessee, “In the process of building and operating AI models, Anthropic unlawfully copies and disseminates vast amounts of copyrighted works – including the lyrics to myriad musical compositions owned or controlled by the publisher plaintiffs.” The plaintiffs – which range from Concord Music Group to Universal Music – assert that they “embrace innovation and recognize the great promise of AI when used ethically and responsibly, but Anthropic violates these principles on a systematic and widespread basis.”
In particular, Universal and co. claim that Anthropic builds its AI models by using “lyrics to innumerable musical compositions for which [they] own or control the copyrights, among countless other copyrighted works harvested from the internet.” Such copyrighted material is “not free for the taking simply because it can be found on the internet,” the plaintiffs assert, alleging that Anthropic “has neither sought nor secured the publishers’ permission to use their valuable copyrighted works in this way.” Among the copyright-protected song lyrics, according to the plaintiffs, are those from artists like the Rolling Stones, Garth Brooks, Katy Perry, and Gloria Gaynor, which Anthropic’s model allegedly infringed by generating output that is “identical or substantially and strikingly similar to Publishers’ copyrighted lyrics” for each of the compositions.
In furtherance of their direct copyright infringement, vicarious copyright infringement, and contributory copyright infringement claims, their DMCA claims for removal of copyright management information, and their demands for relief, the plaintiffs argue that Anthropic “must abide by well-established copyright laws, just as countless other technology companies regularly do.”
Sept. 19, 2023: Authors Guild, et al. v. OpenAI, Inc.
The Authors Guild and more than a dozen authors, including John Grisham and George R.R. Martin, are suing an array of OpenAI entities for allegedly engaging in “a systematic course of mass-scale copyright infringement that violates the rights of all working fiction writers and their copyright holders equally, and threatens them with similar, if not identical, harm.” In the complaint that they filed with the U.S. District Court for the Southern District of New York on September 19, the plaintiffs, who are authors of “a broad array of works of fiction,” claim that they are “seeking redress for [OpenAI’s] flagrant and harmful infringements of [their] registered copyrights” by way of its “wholesale” copying of such works without permission or consideration.
Specifically, the plaintiffs claim that by way of datasets that include the texts of their books, OpenAI “fed [their] copyrighted works into its ‘large language models,’ [which are] algorithms designed to output human-seeming text responses to users’ prompts and queries,” and which are “at the heart of [its] massive commercial enterprise.” Because OpenAI’s models “can spit out derivative works: material that is based on, mimics, summarizes, or paraphrases the plaintiffs’ works, and harms the market for them,” it is endangering “fiction writers’ ability to make a living, in that the [models] allow anyone to generate – automatically and freely (or very cheaply) – texts that they would otherwise pay writers to create.”
With the foregoing in mind, the plaintiffs set out claims of direct copyright infringement, vicarious copyright infringement, and contributory copyright infringement.
Sept. 8, 2023: Chabon v. OpenAI, Inc.
Authors Michael Chabon, David Henry Hwang, Matthew Klam, Rachel Louise Snyder, and Ayelet Waldman are suing OpenAI on behalf of themselves and a class of fellow “authors holding copyrights in their published works arising from OpenAI’s clear infringement of their intellectual property.” In their September 8 complaint, which was filed with a federal court in Northern California, Chabon and co. claim that OpenAI incorporated their “copyrighted works in datasets used to train its GPT models powering its ChatGPT product.” Part of the issue, according to the plaintiffs, is that “when ChatGPT is prompted, it generates not only summaries, but in-depth analyses of the themes present in [their] copyrighted works, which is only possible if the underlying GPT model was trained using [their] works.”
The plaintiffs claim that they “did not consent to the use of their copyrighted works as training material for GPT models or for use with ChatGPT,” and that by way of their operation of ChatGPT, OpenAI “benefit[s] commercially and profit handsomely from [its] unauthorized and illegal use of the plaintiffs’ copyrighted works.”
UPDATED (Nov. 8, 2023): Following a conference with N.D. Cal. Judge Araceli Martinez-Olguin (and a subsequently-issued pretrial order), this case has been consolidated with Tremblay v. OpenAI, Inc. to form In Re OpenAI ChatGPT Litigation.
Jul. 11, 2023: J.L., C.B., K.S., et al., v. Google LLC
Google is being sued over its alleged practice of “stealing” web-scraped data and “vast troves of private user data from [its] own products” in order to build commercial artificial intelligence (“AI”) products like its Bard chatbot. In the complaint that they filed with a California federal court on July 11, J.L., C.B., K.S., P.M., N.G., R.F., J.D., and G.R., who have opted to file anonymously, claim that “for years, Google harvested [our personal and professional information, our creative and copywritten works, our photographs, and even our emails] in secret, without notice or consent from anyone,” thereby, engaging in unfair competition, negligence, invasion of privacy, and copyright infringement, among other causes of action.
UPDATED (Jun. 6, 2024): N.D. Cal. Judge Araceli Martinez-Olguín dismissed the plaintiffs’ amended complaint, citing “concerns expressed” by fellow N.D. Cal. Judge Vince Girdhari Chhabria, who dismissed the complaint in a similar scraping case in May 2024. According to Judge Martinez-Olguín’s June 6 order, “In light of the concerns expressed by Judge Chhabria in his order dismissing the complaint in the matter of Cousart v. OpenAI LP, and given the overlap in the plaintiffs named, the involved plaintiffs’ counsel, and the claims asserted in this case and Cousart, Google’s motion to dismiss Plaintiffs’ amended complaint is granted” – albeit without prejudice. The plaintiffs may file another amended complaint within 21 days of the court’s order.
Jul. 7, 2023: Silverman, et al. v. OpenAI, Inc.
Mirroring the complaint that authors Paul Tremblay and Mona Awad filed against OpenAI on June 28, Sarah Silverman (yes, that Sarah Silverman), Christopher Golden, and Richard Kadrey (“Plaintiffs”) accuse the ChatGPT developer of direct and vicarious copyright infringement, violations of section 1202(b) of the Digital Millennium Copyright Act, unjust enrichment, violations of the California and common law unfair competition laws, and negligence in a new lawsuit. The basis of the lawsuit: “Plaintiffs and Class members are authors of books. Plaintiffs and Class members have registered copyrights in the books they published. Plaintiffs and Class members did not consent to the use of their copyrighted books as training material for ChatGPT. Nonetheless, their copyrighted materials were ingested and used to train ChatGPT.”
UPDATED (Nov. 8, 2023): Following a conference with N.D. Cal. Judge Araceli Martinez-Olguin (and a subsequently-issued pretrial order), this case has been consolidated with Tremblay v. OpenAI, Inc. to form In Re OpenAI ChatGPT Litigation.
Jul. 7, 2023: Kadrey, et al. v. Meta Platforms, Inc.
The same trio of plaintiffs as above – Sarah Silverman, Christopher Golden, and Richard Kadrey – lodged a separate but very similar complaint against Meta Platforms in federal court in Northern California on July 7, accusing the Facebook and Instagram owner of running afoul of copyright law by way of LLaMA, a set of large language models that it created and maintains. According to the plaintiffs’ suit, “many of [their] copyrighted books” were included in a dataset assembled by a research organization called EleutherAI, which was “copied and ingested as part of training LLaMA.”
UPDATED (Nov. 9, 2023): In a motion hearing, Judge Vince Chhabria of the U.S. District Court for the Northern District of California said that, in an impending written order, he will dismiss the bulk of the plaintiffs’ copyright claims, including the vicarious copyright liability claim, which the court called “clearly meritless,” and the claim of infringement in the form of the output from Meta’s generative AI model LLaMA, on the basis that the plaintiffs have not alleged that LLaMA’s output is “substantially similar” to their works. In what is not a total loss for Kadrey and co., the judge said he will grant them leave to amend their complaint. Meta did not seek dismissal of the authors’ core copyright infringement claim – that Meta engaged in infringement by using their copyright-protected books as training materials for its LLaMA model – and thus, the judge held that discovery can go forward on that claim.
Jun. 28, 2023: Tremblay v. OpenAI, Inc.
A couple of authors are the latest to file suit against ChatGPT developer OpenAI. According to the complaint that they filed with a federal court in Northern California on June 28, Paul Tremblay and Mona Awad assert that, in furtherance of the training of the large language model that powers the generative AI chatbot that is ChatGPT, OpenAI has made use of large amounts of data, including the text of books that they authored, without their authorization, thereby, engaging in direct copyright infringement, violations of the Digital Millennium Copyright Act, and unfair competition.
Among other things, the plaintiffs allege that OpenAI “knowingly designed ChatGPT to output portions or summaries of [their] copyrighted works without attribution,” and the company “unfairly profit[s] from and take[s] credit for developing a commercial product based on unattributed reproductions of those stolen writings and ideas.”
UPDATED (Nov. 8, 2023): Following a conference with N.D. Cal. Judge Araceli Martinez-Olguin (and a subsequently-issued pretrial order), Silverman, et al v. OpenAI, Inc., et al. and Chabon, et al. v. OpenAI, Inc, et al. were consolidated with this case to form In Re OpenAI ChatGPT Litigation.
Jun. 28, 2023: Plaintiffs P.M., K.S., et al. v. OpenAI LP, et al.
More than a dozen underage individuals have filed suit against OpenAI and its partner/investor Microsoft in connection with the development and marketing of generative AI products, which allegedly involves the scraping of “vast” amounts of personal data. According to the June 28 complaint, OpenAI and the other defendants have “stolen private information, including personally identifiable information, from hundreds of millions of internet users, including children of all ages, without their informed consent or knowledge” in furtherance of their creation and operation of the ChatGPT, Dall-E, and Vall-E programs. And they “continue to unlawfully collect and feed additional personal data from millions of unsuspecting consumers worldwide, far in excess of any reasonably authorized use, in order to continue developing and training the products.”
The plaintiffs accuse OpenAI of violating: the Electronic Communications Privacy Act; the Computer Fraud and Abuse Act; California’s Invasion of Privacy Act and Unfair Competition law; Illinois’s Biometric Information Privacy Act and Consumer Fraud and Deceptive Business Practices Act; and New York General Business Law s. 349, which declares deceptive acts and practices unlawful. Beyond that, the plaintiffs also set out negligence, invasion of privacy, intrusion upon seclusion, larceny/receipt of stolen property, conversion, unjust enrichment, and failure to warn causes of action.
UPDATED (Sept. 15, 2023): The unnamed plaintiffs moved to voluntarily dismiss their case against OpenAI and Microsoft without prejudice, which suggests that the parties reached an agreement out of court.
Jun. 5, 2023: Walters v. OpenAI LLC
And in yet another suit being waged against OpenAI, Plaintiff Mark Walters asserts that the company behind ChatGPT is on the hook for libel as a result of misinformation that it provided to a journalist in connection with his reporting on a federal civil rights lawsuit filed against Washington Attorney General Bob Ferguson and members of his staff. In particular, Walters claims that ChatGPT’s case summary (and journalist Fred Riehl’s article) stated that the lawsuit was filed against him for fraud and embezzlement. The problem with that, according to Walters’s lawsuit, is that he is “neither a plaintiff nor a defendant in the lawsuit,” and in fact, “every statement of fact” in the ChatGPT summary that pertains to him is false.
Apr. 3, 2023: Young v. NeoCortext, Inc.
“Deep fake” app Reface is at the center of a proposed class action complaint, with TV personality Kyland Young accusing the company of running afoul of California’s right of publicity law by enabling users to swap faces with famous figures – albeit without receiving authorization from those well-known individuals to use their likenesses. According to the complaint that he filed in a California federal court in April, Young asserts that Reface developer NeoCortext, Inc. has “commercially exploit[ed] his and thousands of other actors, musicians, athletes, celebrities, and other well-known individuals’ names, voices, photographs, or likenesses to sell paid subscriptions to its smartphone application, Reface, without their permission.”
NeoCortext has since argued that Young’s case should be tossed out on the basis that the reality TV personality not only fails to adequately plead a right of publicity claim, but even if he could, that claim is preempted by the Copyright Act and barred by the First Amendment.
Feb. 15, 2023: Flora, et al., v. Prisma Labs, Inc.
Prisma Labs – the company behind AI image-generating app, Lensa A.I. – was named in a proposed class action lawsuit in February, with the plaintiffs arguing that despite “collecting, possessing, storing, using, and profiting from” Lensa users’ biometric identifiers, namely scans of their “facial geometry,” in connection with its creation of custom avatars, Prisma has failed to properly alert users about the biometric data it collects and how it will be stored/destroyed, as required by the Illinois data privacy law.
UPDATED (Aug. 8, 2023): An N.D. Cal. judge sided with Prisma Labs, granting its motion to compel arbitration in the proposed class action, despite the plaintiffs’ arguments that the arbitration provision in Lensa’s terms is unconscionable and that “because some provisions in the arbitration agreement arguably fall below JAMS’ Consumer Arbitration Minimum Standards, the arbitration provision is illusory.”
Feb. 3, 2023: Getty Images (US), Inc. v. Stability AI, Inc.
In the wake of Getty announcing that it had “commenced legal proceedings” in the High Court of Justice in London against Stability AI, Getty Images (US), Inc. filed a stateside lawsuit, accusing Stability AI of “brazen infringement of [its] intellectual property on a staggering scale.” Specifically, the photo agency argues that Stability AI has copied millions of photographs from its collection “without permission from or compensation to Getty Images, as part of its efforts to build a competing business.”
In addition to setting out a copyright infringement cause of action and alleging that Stability AI has provided false copyright management information and/or removed or altered copyright management information, Getty accuses Stability AI of trademark infringement and dilution on the basis that “the Stable Diffusion model frequently generates output bearing a modified version of the Getty Images watermark,” thereby creating “confusion as to the source of the images and falsely implying an association with Getty Images,” per Getty. And beyond that, Getty asserts that “while some of the output generated through the use of Stable Diffusion is aesthetically pleasing, other output is of much lower quality and at times ranges from the bizarre to the grotesque,” giving rise to dilution.
In a motion to dismiss in May, Stability AI, Inc. argued that Getty has not even attempted to make a case for jurisdiction under Delaware’s long-arm statute, as it “does not allege that any of the purportedly infringing acts regarding training Stable Diffusion occurred within Delaware.” Instead (and “although the amended complaint is vague in this regard”), Stability AI claims that Getty “appears to allege that the training took place in England and Germany,” pointing to the following language from the plaintiff’s amended complaint: “Stable Diffusion was trained . . . from Datasets prepared by non-party LAION, a German entity…”. Getty also does not allege that Stability AI Ltd. “contracted to supply services or things in Delaware,” per Stability AI.
Jan. 13, 2023: Andersen, et al. v. Stability AI LTD., et al.
Stability AI was named in a copyright infringement, unfair competition, and right-of-publicity lawsuit in January 2023, along with fellow defendants DeviantArt and Midjourney. In furtherance of the lawsuit, a trio of artists is accusing Stability AI and co. of engaging in “blatant and enormous infringement” by using their artworks – without authorization – to enable AI-image generators, including Stable Diffusion, to create what are being characterized as “new” images but what are really “infringing derivative works.”
The defendants have pushed back against the suit, with Stability AI arguing this spring that while Stable Diffusion was “trained on billions of images that were publicly available on the Internet … training a model does not mean copying or memorizing images for later distribution. Indeed, Stable Diffusion does not ‘store’ any images.” Meanwhile, in a filing of its own in April, text-to-image generator DeviantArt urged the court to toss out the claims against it and to strike the right-of-publicity claims, as they “largely concern the potential for DreamUp to create art,” which it argues falls neatly within the bounds of free speech. As such, the Los Angeles-based online art (and AI) platform says that the plaintiffs’ claims should be barred by California’s anti-SLAPP statute.
Jan. 12, 2023: Getty Images (US), Inc. v. Stability AI Ltd.
Getty Images filed suit in the High Court of Justice in London against Stability AI, claiming that Stability AI infringed intellectual property rights, including copyright in content owned or represented by Getty Images. In a statement, Getty Images revealed that it is its “position that Stability AI unlawfully copied and processed millions of images protected by copyright and the associated metadata owned or represented by Getty Images absent a license to benefit Stability AI’s commercial interests and to the detriment of the content creators.”
UPDATED (Feb. 27, 2024): Stability AI filed its defense.
2022
Nov. 10, 2022: J. Doe v. Github, Inc., et al.
Microsoft, GitHub, and OpenAI landed on the receiving end of an interesting lawsuit, with a couple of plaintiffs accusing them of running afoul of the Digital Millennium Copyright Act (“DMCA”), and also engaging in breach of contract, tortious interference, fraud, false designation, unjust enrichment, and unfair competition in connection with Copilot, a subscription-based AI tool co-developed by GitHub and OpenAI. At the heart of the plaintiffs’ suit: Their claim that the defendants used their copyright-protected source code as training data for Copilot, which enables software developers to easily generate code by “turning natural language prompts into coding suggestions across dozens of languages.”
According to the plaintiffs, open-source code repository GitHub, OpenAI (which created the GPT-3 language model used to create Copilot), and GitHub owner and OpenAI investor Microsoft (the “defendants”) used data that they sourced from publicly accessible repositories on GitHub to train Copilot. The plaintiffs assert that they “posted such code or other works under certain open-source licenses on GitHub,” and that all of those licenses require attribution of the author’s name and copyright. The problem, they claim, is that in using their code, the defendants stripped the “attribution, copyright notice, and license terms from their code in violation of the licenses and the plaintiffs’ and the class’s rights.”
2020
May 6, 2020: Thomson Reuters Enterprise Centre GmbH et al v. ROSS Intelligence Inc.
In an early generative AI-centric case, Thomson Reuters alleges that ROSS copied the entirety of its Westlaw database (after having been denied a license) to use as training data for its competing generative AI-powered legal research platform. Thomson Reuters’ complaint survived a motion to dismiss in 2021. Fast forward to the summary judgment phase, and ROSS has argued, in part, that its unauthorized copying/use of the Westlaw database amounts to fair use. Specifically, ROSS claims that it took only “unprotected ideas and facts about the text” in order to train its model; that its “purpose” in doing so was to “write entirely original and new code” for its generative AI-powered search tool; and that there is no market for the allegedly infringed Westlaw content consisting of headnotes and key numbers.