AIAndCopyright

ThoughtStorms Wiki

The NewYorkTimes has sued OpenAI and MicroSoft

My comment :

I think there are real issues with the verbatim replication of text. And the fake NYT articles. It's a fair cop to consider the first as a copyright violation. And in an age of fakes and disinformation, we need stronger laws to stop people falsely attributing ideas or expression to anyone else. We shouldn't be allowed to publicly lie and claim a person said or did something they didn't. Or that the NYT published an article they didn't.

OpenAI and Microsoft should be held accountable and fined for each violation of these that they allow to slip through their quality control.

OTOH we absolutely DON'T WANT, and shouldn't have, any blanket laws preventing anyone, whether machine or human, from "learning from" the data or works of another. Or the idea of "copyrighted" datasets that a machine can't look at without payment.

Copyright is already waaaay too draconian and restrictive. The last thing we need is Disney being able to prevent anyone seeing and learning how to paint in the style of Disney animation. Or the NYT preventing people paraphrasing or summarizing their articles. Or any private company being able to use the law to prevent rivals accessing and working with the same raw ideas and facts that they were working with.

There would be no freedom of information or informed citizenship if we allow panic over AI to make laws that allow corporations to bottle up ideas as exclusive property.

Copyright should ALWAYS be limited to mere specific concrete expressions of ideas. Not "higher order" or more abstract representations of the same ideas (such as parameters in an LLM). And it is no more "copying" to train weights based on a Disney image or a NYT text than it is copying for a human to read an article and learn (by rewiring the neurons in their brain) to write about the same topic in a similar way. If there are to be constraints enforced, it should be downstream, on the output of generative AI, not the input to it.

We must protect the freedom to learn and share information and to spread and reuse and develop on ideas. We can't stop AI learning and sharing information: both because that's how we're going to get AIs that create massive value for us, and because the alternative is to clamp down even more on freedom to access and use ideas, creating new categories of oppressive restrictions.

Finally, as a practical thought. The internet, which should have been a tool for sharing ideas and helping the best knowledge spread, has already become a conduit for junk science and disinformation. Good academic research is already behind absurdly expensive paywalls and obscure UX, while cranks liberally spray over YouTube and social media. If quality newspapers (and other sources) demand to be removed from the biggest, most publicly used training sets, then AIs are just going to get worse. Because the cranks will be happy to fill the vacuum with their nonsense. And we'll have more powerful and convenient AIs spewing fake ideas and propaganda.

It's a terrible precedent for the NYT to set, that the paper "of record" will be less well represented in the knowledge base of the most widely accessed educators on the planet.

An open letter to ErykSalvaggio.

I've been getting into an argument with him on Twitter about whether artists should pursue IntellectualProperty as a strategy to defend themselves from predatory platforms.

My response was getting too long, so I've written it as an open letter.

Hi Eryk,

I know you think there's no point in continuing our argument because neither of us will convince the other.

But I'm not giving up quite yet, because I think this argument is important. And people like you should be on the right side of it.

Look at what has happened with music. 20 years ago we had a chance of a genuinely free music distribution network. With Napster and BitTorrent etc. Instead musicians were seduced by the promises from streamers like Spotify, that they'd get paid.

Now, the music industry is happily returning to its glory days, boasting that its revenues were $28b last year. (https://www.statista.com/chart/4713/global-recorded-music-industry-revenues/) But after Spotify pulled the rug from under its artists, 99% of musicians are explicitly getting paid NOTHING for their contribution. Listeners are paying. Artists aren't receiving. Instead the money in the music industry is being carved up by Spotify and Universal and TikTok etc. Corporations fight over it. The vast majority of artists are expected to accept that there is nothing for them unless they hit the jackpot of celebrity. And the only way to do that is to keep feeding the content mill with yet more of their free work. In a zero-sum competition for attention with all the other musicians trying to do the same thing.

This is PlatformCapitalism in action. It's "TechnoFeudalism" in action. Corporate platforms and landlords grab all the rent for the data generated by users, for themselves.

Right now, AI looks plausibly like the biggest economic disruption in history. And chances are it will go the same way as the web, social media etc. And become an even bigger, even more exploitative corporate platform than any we've seen so far.

I think you and I agree that that is the future we are desperate to avoid. And the future we have to fight.

Our disagreement is that I believe that any "intellectual property" or the institution of "ownership of data" is NOT the weapon to fight this. Whereas on Twitter, I regularly see you promoting some version of the idea that this IS the solution. That we fight the exploitative platforms by reclaiming ownership over our data.

But what will happen is that every platform that artists use to share their work, to communicate it to the world, will simply write a clause into their terms of service to say "if you use our platform to communicate, you grant us a license to use your data to train AIs with". And artists won't be able to say no.

I, as a musician, can't practically choose not to put my music on Spotify. Because every time I tell anyone about my music, and they show the slightest interest in listening to it, they immediately pull out the Spotify app and ask where they can find it. Spotify are very clear with me that I'm not getting paid by them. But if I opt out, the number of potential listeners I might have drops by over 90%.

No young visual artist or writer or aspiring film-maker, who needs to build an audience via social media, is going to not use those platforms to share their work. They will therefore accept those terms of service. If they did try to resist, they know they would languish in obscurity. But then whatever "training-licence" we try to wrap around the rights to train AIs with, the money won't go to the artists. It will simply go to the platforms that have inserted themselves as intermediaries between the artists and the world.

We have seen that movie several times now. And we can be pretty sure that "training licences", just like, say, patents, will largely be tools for protecting monopoly positions. They'll be traded and squabbled about between Microsoft and Adobe and Facebook and Google etc. But their main effect will be to keep the next StabilityAI or MistralAI or equivalent outsider startup that wants to create its own AI model, (and perhaps share it more widely) out of the market.

Maybe it will be impossible to fight the coming of gargantuan AI monopolies. Maybe we lost already, just as we've failed to fight the other platform monopolists. But if there is a chance to stop their rise, it's by cutting the supports from under their drive to monopoly. By insisting that training an AI on data is a kind of "fair use", open to anyone. Not another property right available to only those corporations rich enough to afford it.

If we can win that right, we still have a chance that there could be thousands of smaller AI model providers. A wider variety of models. And models we can download and run in the privacy of our own machines. If we don't win that right, we're much more likely to end up dependent on a handful of giant AI provider platforms running AIs in their clouds.

But either way, neither of these scenarios is likely to provide most artists with either a meaningful income or a meaningful veto on the use of their work to train AI.

Even if there were to be some kind of market, a viable art AI must aggregate the work of hundreds or thousands of artists to be useful. And unless it charges a fortune to use, the dividend for each individual artist will be a fraction of a cent.

And as for a veto or opt out, only those artists who are already famous and powerful enough can even hope to maintain a profile outside the social platforms which will demand training-rights as the toll they charge for access. To go back to the Spotify comparison again, both Taylor Swift and Neil Young, after high-profile departures from the platform, are back. And if Swift won anything from her power struggle with it, it was ultimately recouped off the backs of the small artists losing their meagre payments.

Spotify can happily live with a regime of property rights wrapped around music, while telling 99% of its musicians to expect nothing. What would actually hurt it would be a world where anyone who wanted to make playlists would be free to post the playlist and the music to their own server, and any new company could make a rival app to share and subscribe to playlists and music.

The same will be true in a world of training-licences. Adobe will sell subscriptions to its cloud software, to artists, with the promise that it will also pay them for using their work to train its AIs. But if it pulls a Darth Vader and alters the deal, most artists will have zero recourse. Adobe will own, and be free to resell or reuse the potential training data, but those artists won't have any way to break free.

Here's what I think would be a much better bet for those same artists. If they were free to take the latest StableDiffusion or equivalent base model, and fine-tune it around their own personal style. And then have new ways to resell its services. Maybe you think it's absurd to expect most artists to create their own customized AIs. But if there were no restrictions, then third parties could easily create services to help artists do this. Anyone who was developing a style and a body of work could quickly package that in the form of a bot, and hire it out. Or use it to make more of their work by themselves. Or, indeed, simply not do so and focus on trying to find customers who valorize and pay for physical objects.

That's how this looks to me, anyway. There's never been a propertarian solution that capitalism can't ultimately coopt. Now capitalism is also a genius at coopting the commons, as we've seen, but at least the commons is still somewhat alien to it. And it can only do so when the state steps in to aid it.

We will of course see how this plays out.

regards

Phil Jones

As I predicted, intellectual property will not be a tool to protect you.

See also :