
YouTuber launches first-of-its-kind class action lawsuit against OpenAI

The reckoning over companies like OpenAI using creators’ content without their permission might finally be at hand.

Since OpenAI dropped the first version of ChatGPT in November 2022, it’s faced scrutiny over what data it uses to make its generative AI products, with many creators concerned their videos have been scraped and dumped into the slush. Creators understandably want to keep control of the content they make, and while it’s too late to extract their videos from already-scraped datasets, it may be possible for them to receive compensation for the violation of their ownership rights–and set a precedent that could prevent other companies from taking what’s theirs in the future.

That’s the goal of David Millette, a Massachusetts man who has filed a class action lawsuit against OpenAI seeking $5 million in damages for himself and other creators.


Millette, who’s had a YouTube account since 2009, alleges OpenAI has engaged in the “surreptitious, non-consensual transcription of millions of YouTube users’ videos […] to train Defendants’ AI software products,” and that it’s “profited significantly” by doing so. The suit specifically refers to allegations that OpenAI created a speech recognition model, Whisper, to transcribe audio, then used Whisper to transcribe millions of hours of YouTube content. Those transcriptions were reportedly used to train GPT-4.

The lawsuit alleges that by scraping creators’ videos, OpenAI violated copyright law, since creators retain ownership rights to any videos they upload thanks to YouTube’s terms of service.

“Much of the material in OpenAI’s training datasets […] comes from works–including videos created and uploaded by Plaintiff–that were copied by OpenAI without consent, without credit, and without compensation,” the suit alleges.

As TechCrunch points out, makers of large language models (LLMs) like ChatGPT have turned to video transcriptions for training because they’ve already scraped everything they can from the rest of the internet, and because more and more text-based websites are installing blockers to prevent future scrapes. Over 35% of the world’s top 1,000 websites have those protections in place.

If you’re wondering whether YouTube is looking into solutions like that to prevent external scraping, we’re not sure. But there’s a bigger concern with YouTube: it talks a big game about keeping creators safe with the advent of genAI, but it’s allegedly also scraping transcriptions of creators’ videos and using them to train Google’s own AI products.

Millette’s lawsuit is a civil case, but it does ask the presiding judge to state that OpenAI violated copyright laws, something that could expose the company to future criminal charges. And, as mentioned above, Millette is also seeking $5 million in damages. Because this is a class action lawsuit, that sum would be split between him and any other affected creators if the case is decided in his favor.

If Millette wins his case, creators may receive some cash. But their data will still be part of potentially dozens of genAI products, because there are no established protections for creators against having their videos, writing, art, and more scraped and subsumed into training sets. Until those protections are in place, creators like Millette have to fight for themselves and hope that judgments in their favor will deter companies that want to use their content without permission.

Published by
James Hale
