Comedian Sarah Silverman recently joined forces with bestselling authors Christopher Golden and Richard Kadrey against two unlikely adversaries: ChatGPT’s creator OpenAI and Meta Platforms.
The authors filed a copyright-infringement lawsuit in early July, claiming their copyrighted books were used without permission as part of the training datasets for ChatGPT and similar AI models.
The outcome of this unfolding drama could be more than a simple courtroom win or loss; it could fundamentally redefine the boundaries of artificial intelligence and copyright law.
Silverman’s case asks us to look closely at the fair use doctrine — a cornerstone of U.S. copyright law that permits limited use of copyrighted material without acquiring permission from the rights holder.
Here’s where things get murky and interesting. To decide whether a use qualifies as fair use, courts weigh four factors: the purpose and character of the use; the nature of the copyrighted work; the amount and substantiality of the portion used; and the effect of the use on the market for the original work. The big question now is whether AI’s ingestion and processing of text for training purposes could be considered fair use. Furthermore, would AI’s utilization of the work be considered transformative, thereby providing a unique meaning or purpose to the original work?
Here’s a critical point to remember: AIs like ChatGPT don’t parrot books verbatim. They generate new content based on patterns learned from the training data. The specific words and sentences formed aren’t direct copies from copyrighted books, blurring the lines of traditional infringement. I have a hunch there’s a slim chance this case will hold up, but the final call is in the hands of the courts.
The cornerstone of human learning is imitation. The nature of intelligence — whether biological or artificial — rests on the ability to recognize patterns and apply them in innovative ways. AI, such as ChatGPT, operates similarly. It learns from its environment — in this case, the extensive datasets of text — and mimics patterns it finds therein. That’s how it can compose a strikingly human-like string of text despite having no consciousness or inherent creativity.
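To make the point above concrete, here is a toy sketch of pattern-based text generation: a simple bigram Markov chain, vastly cruder than a model like ChatGPT, but it illustrates the same idea that output is assembled from statistical patterns learned from training text rather than copied passages. The corpus and function names here are illustrative, not drawn from any real system.

```python
import random

def train_bigrams(text):
    """Build a bigram table: each word maps to the list of words
    observed to follow it in the training text."""
    words = text.split()
    table = {}
    for cur, nxt in zip(words, words[1:]):
        table.setdefault(cur, []).append(nxt)
    return table

def generate(table, start, length, seed=0):
    """Walk the table from a start word, repeatedly picking a
    learned follower, to produce a new word sequence."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        followers = table.get(out[-1])
        if not followers:
            break  # dead end: no observed follower for this word
        out.append(rng.choice(followers))
    return " ".join(out)

# A tiny made-up training corpus.
corpus = ("the cat sat on the mat the dog sat on the rug "
          "the cat chased the dog")
model = train_bigrams(corpus)
print(generate(model, "the", 8))
```

Every word the generator emits was seen in training, but the sequence it walks is its own; scale that idea up by many orders of magnitude (and replace word-following counts with learned neural representations) and you have, very roughly, the dynamic the lawsuit is contesting.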
The current lawsuit challenges this understanding. It claims that AI’s method of learning — reading, processing, and drawing from the patterns of countless texts — is a violation of copyright law. In essence, it suggests that by reading and integrating an author’s book into its broader dataset, an AI model infringes upon the author’s copyright.
It’s not just Silverman. Shutterstock, the royalty-free image provider, thinks along the same lines. Shutterstock’s business model uses a compensation strategy that acknowledges the value of copyrighted work in AI training: contributors can accrue compensation when their intellectual property is used to train Shutterstock’s own generative AI model, or when generative assets created using Shutterstock’s software are licensed.
Summing it up, proponents of this perspective argue that AI shouldn’t freely use copyrighted works without permission or compensation. The process of training AI on these texts, even if it doesn’t replicate the works verbatim, still leverages the creativity, skill, and labor invested by the authors into their creations. By this argument, AI isn’t merely “learning” like a human; it is using copyrighted works to develop and enhance its capabilities.
They see the Shutterstock model as an alternative that respects authors’ rights while still permitting AI training. This revenue-sharing system, they claim, could be an answer to the challenges posed by the intersection of AI and copyright law. Essentially, this model creates a new category of use where AI training and the resulting output isn’t considered fair use but is instead a form of derivative work for which authors should be compensated.
There are potential issues with this approach. First, its implementation and enforcement might pose enormous, even insurmountable, challenges. The sheer scale of data that AI models like ChatGPT ingest for training — thousands of books, millions of articles, and more — makes it impractical to track down every copyright holder and negotiate terms of use. This could result in prohibitive expenses for AI developers, potentially stifling innovation and decimating public models.
But what if there’s a third option? Perhaps the fact that AI learns from our combined wealth of knowledge and information makes it part of our global human heritage. If so, perhaps access to AI should be free, unhindered and provided to everyone, with service providers charging only for the computing power spent running their calculations.
Here we run into the problem of financial motivation, according to the naysayers: without it, they claim, companies have no monetary incentive to develop such models. To this I say “humbug” — thousands of open-source AI models have already been developed and are in circulation for free, a testament that humanity understands the value of AI and is ready to create a valuable tool we can all use.
Finally, let’s talk implications. If Silverman and her cohorts win, we could see a surge of litigation, with authors across the globe claiming copyright infringement against AI developers. OpenAI, Meta Platforms and other companies that have been developing and launching AI models could face a barrage of group litigation. This wave could stifle innovation and slow the progress of AI development, casting a long shadow over a field that has long been celebrated for its boundary-pushing capabilities.
“We are nearing a time where building AI models will be within the reach of not just tech corporations, but anyone with a computer and an internet connection.”
But here’s an interesting paradox to consider: The victory of authors against formal AI service providers may not end the issue. Far from it.
In the burgeoning era of AI and open-source software, we are nearing a time where building AI models will be within the reach of not just tech corporations, but anyone with a computer and an internet connection. It’s a profound democratization of technology, where powerful AI tools can be forged in garages and dorm rooms, not just Silicon Valley companies.
Anonymous developers and amateur users scattered across the globe could still build, train and publish powerful, capable AI models under the radar, unencumbered by the fear of litigation that could plague formal AI service providers. In this respect, open-source models, quietly churning in the background, could become a force that replaces the current movers and shakers of the mainstream AI world. By that point, while Silverman and others may have won huge sums in their battle against AI, the war would be lost.