DataWars

ThoughtStorms Wiki

JohnRobb points out that the current wave of ArtificialIntelligence means that data is suddenly highly valued and highly contested.

https://johnrobb.substack.com/p/data-wars

Quoted below (CategoryCopyrightRisk)

The data wars have begun. We can see the signs of it everywhere:

  • API restrictions (Twitter, Reddit, and many more) prevent external software from accessing the platform’s data. AI data licensing walls to protect against screen scraping (everywhere).
  • Data rate restrictions (Twitter recently did this with post limits) and AI response rate restrictions (OpenAI, Midjourney, and nearly every other AI developer).
  • New competition to soak up data (Meta’s Threads)

The Data Gold Rush

Here’s why there is a war:

  • AIs (particularly AI platforms) will likely become the most valuable technological artifacts ever built (worth tens of trillions, more valuable than anything but the largest economies).
  • AI development is a glutton for data. There’s a strong correlation between the amount of data used to train an AI and the quality of the AI you develop (large amounts of data even allow you to ignore many quality issues).
  • The early AI development efforts were able to strip mine data from the open Internet with abandon, but that was before AI was proven to work. Now that it has, everyone developing AIs is rushing in to gather as much data as possible before the barriers to doing so are fully erected.