I don't see what the problem is. AI is mostly irrelevant. Okay, they scrape your data... but then what? If the data isn't officially published and doesn't have a DOI, anything built on it won't be accepted.
Some people scrape charts in publications to extract data. This has been done for a while. Maybe AI could automate this step. That'd be useful.
I understand that publications are the currency of academics but they're largely irrelevant in business. Geological data are valuable and if an oil exploration company finds a nice dataset they can scrape, they're not going to publish it.
From a pure business perspective, AI is largely about copyright circumvention. The laws are lagging and people are making serious money from data theft.
Aren't you describing trade secrets? I don't see how AI makes that any better or worse. If your competitor gets his hands on your proprietary dataset you're sunk regardless of AI, right?
I don't see how copyright enters into it. I doubt that "oh hey I published this very valuable and proprietary dataset online but it's copyright me so pretty please don't use it to make money" was ever going to get you anywhere to begin with.
Am I understanding this correctly? If a company is internally using a competitor's stolen data directly and anyone finds out, they're in legal trouble. But if they train a model on it and then use the model, they're in the clear?
Yes I think there's evidence for that. Looking at recent precedents, even if the data are illegally downloaded, big tech has been getting away with using copyrighted data, for example: