- cross-posted to:
- [email protected]
- [email protected]
‘Impossible’ to create AI tools like ChatGPT without copyrighted material, OpenAI says
Pressure grows on artificial intelligence firms over the content used to train their products
This situation seems analogous to when air travel started to take off (pun intended) and existing legal notions of property rights had to be adjusted. IIRC, a farmer sued an airline for trespassing because they were flying over his land. The court ruled against the farmer because to do otherwise would have killed the airline industry.
I member
And we did so before then with ‘Mineral Rights’. You can drill for oil on your property, but if you find it - it ain’t yours, because in many places you only own what you can walk on.
Let’s wait until everyone is laid off and it’s ‘impossible’ to get by without mass looting then, shall we?
Then LLMs should be FOSS
All AI should be FOSS and public domain, owned by the people, and all gains from its use taxed at 100%. It’s only because of the public that AI exists, through the schools, universities, NSF, grants, etc and all the other places that taxes have been poured into that created the advances upon which AI stands, and the AI critical research as well.
That does nothing to solve the problem of data being used without consent to train the models. It doesn’t matter if the model is FOSS if it stole all the data it trained on.
The only way I can steal data from you is if I break into your office and walk off with your hard drive. Do you have access to something? It hasn’t been stolen.
Copying is not theft or stealing.
Copying copyright protected data is theft AND stealing
Edit: this also applies to my stance on piracy, which I don’t engage in for the same reason. It’s theft
It’s theft.
You can steal all you want, but it’s still theft. Piracy is theft, stealing data to be used as training data is theft.
Not everyone wants their creations to be infinitely shared beyond their control. If someone creates something, they’re entitled to absolute control over it.
You are only hurting yourself by adopting a rule like that.
By definition you’re wrong
Nah… it’s not too complicated
AI is basically just a bunch of if / else or case / switch statements in spaghetti code
Cool! Then don’t!
hijacking this comment
OpenAI was IMHO well within its rights to use copyrighted materials when it was just doing research. They were* doing research on how far large language models can be pushed and where the ceiling for that is. It’s genuinely good research, and if copyrighted works are used just for research and what gets published is the findings of the experiments, that’s perfectly okay in my book - and, I think, in the law as well. In this case, the LLM is an intermediate step, and the published research papers are the “product”.
The unacceptable turning point is when they took all the intermediate results of that research and flipped them into a product. That’s not the same, and most or all of us here can agree - this isn’t okay, and it’s probably illegal.
* disclaimer: I’m half-remembering things I’ve heard a long time ago, so even if I phrase things definitively I might be wrong
True, with the acknowledgement that this was their plan all along and the research part was always intended to be used as a basis for a product. They just used the term ‘research’ as a workaround that allowed them to do basically whatever to copyrighted materials, fully knowing that they were building a marketable product at every step of their research
That is how these people essentially function: they’re the tax loophole guys who make sure you and I pay more taxes than Amazon. They are scammers who have no regard for ethics, and they can and will use whatever they can to reach their goal. If that involves lying about doing research when in actuality you’re doing product development, they will do that without hesitation. The fact that this product now exists means lawmakers are faced with a reality where the crimes are kind of in the past, and all they can do is try to legislate around this thing that now exists. And they will do that poorly because they don’t understand AI.
And this only covers fraud with regard to research and copyright. Recently it came out that LAION-5B, an image dataset used to train Stable Diffusion, contained at least 1000 images of child pornography. We don’t know what OpenAI did to mitigate the risk of their seemingly indiscriminate web scrapers picking up harmful content.
AI is not a future, it’s a product that essentially functions to repeat garbled junk out of things we have already created, all the while creating a massive burden on society with its many, many drawbacks. There are little to no arguments FOR AI, and many, many, MANY to stop and think about what these fascist billionaire ghouls are burdening society with now. Looking at you, Peter Thiel. You absolute ghoul.
True, with the acknowledgement that this was their plan all along and the research part was always intended to be used as a basis for a product. They just used the term ‘research’ as a workaround that allowed them to do basically whatever to copyrighted materials, fully knowing that they were building a marketable product at every step of their research
I really don’t think so. I do believe OpenAI was founded with genuine good intentions. But around the time it transitioned from a non-profit to a for-profit, those good intentions were getting corrupted, culminating in the OpenAI of today.
The company’s unique structure, with a non-profit’s board of directors controlling the company, was supposed to subdue or prevent short-term gain interests from taking precedence over long-term AI safety and other such things. I don’t know any of the details beyond that. We all know it failed, but I still believe the whole thing was set up in good faith, way back when. Their corruption was a gradual process.
There are little to no arguments FOR AI
Outright not true. There’s so freaking many! Here’s some examples off the top of my head:
- Just today, my sister told me how ChatGPT (her first time using it) identified a song for her based on her vague description of it. She has been looking for this song for months with no success, even though she had pretty good key details: it was a duet, released around 2008-2012, and she even remembered a certain line from it. Other tools simply failed, and ChatGPT found it instantly. AI is just a great tool for these kinds of tasks.
- If you have a huge amount of data to sift through, looking for something specific but that isn’t presented in a specific format - e.g. find all arguments for and against assisted dying in this database of 200,000 articles with no useful tags - then AI is the perfect springboard. It can filter huge datasets down to just a tiny fragment, which is small enough to then be processed by humans.
- Using AI to identify potential problems and pitfalls in your work, which can’t realistically be caught by directly programmed QA tools. I have no particular example in mind right now, unfortunately, but this is a legitimate use case for AI.
- Also today, I stumbled upon Rapid, a map editing tool for OpenStreetMap which uses AI to predict and suggest things to add - with the expectation that the user would make sure the suggestions are good before accepting them. I haven’t formed a full opinion about it in particular (and I’m especially wary because it was made by Facebook), but these kinds of productivity boosters are another legitimate use case for AI. Also in this category is GitHub’s Copilot, which is its own can of worms, but if Copilot’s training data hadn’t been stolen the way it was, I don’t think I’d have many problems with it. It looks like a fantastic tool (I’ve never used it myself) with very few downsides for society as a whole. Again, other than the way it was trained.
As for generative AI and pictures especially, I can’t as easily offer non-creepy uses for it, but I recommend you see this video, which offers a very frank take on the matter: https://nebula.tv/videos/austinmcconnell-i-used-ai-in-a-video-there-was-backlash if you have access to Nebula, https://www.youtube.com/watch?v=iRSg6gjOOWA otherwise.
Personally I’m still undecided on this sub-topic. Deepfakes etc. are just plain horrifying, you won’t hear me give them any wiggle room.
Don’t get me wrong - I am not saying OpenAI isn’t today rotten at the core - it is! But that doesn’t mean ALL instances of AI that could ever be are evil.
Here is an alternative Piped link(s):
https://www.piped.video/watch?v=iRSg6gjOOWA
Piped is a privacy-respecting open-source alternative frontend to YouTube.
I’m open-source; check me out at GitHub.
Sounds like a fatal problem. That’s a shame.
If the copyright people had their way we wouldn’t be able to write a single word without paying them. This whole thing is clearly a fucking money grab. It is not struggling artists being wiped out, it is big corporations suing a well funded startup.
if it’s impossible for you to have something without breaking the law you have to do without it
if it’s impossible for the aristocrat class to have something without breaking the law, we change or ignore the law
Copyright law is mostly bullshit, though.
Oh sure. But why is it only the large companies behind the massive AI push, the ones owning models full of stolen material that make basic forgeries of the stolen items, that get to ignore the bullshit copyright laws?
It wouldn’t be because it is super profitable for multiple large industries, right?
Just because people are saying the law is bad doesn’t mean they are saying the lawbreakers are good. Those two are independent of each other.
I have never been against cannabis legalization. That doesn’t mean I think people who sold it on the streets are good people.
Well, strictly speaking you could have an AI that only knows about the world up to 1928 and talks like it’s 1928.
finally capitalism will notice how many times it has shot itself in the foot with its ridiculous, greedy infinite copyright scheme
As a musician, people not involved in the making of my music make all my money nowadays instead of me anyway. Burn it all down.
Pitchfork fest 2024
TBH I only use LLMs when traditional search fails, and even then I’m not sure if I’m getting something useful or a hallucination. I need better search engines, not fancy AI bullshitters.
Copyright protection only exists in the context of generating profit from someone else’s work. If you were to figure out cold fusion and I’d look at your research and say “That’s cool, but I am going to go do some woodworking,” I am not infringing any copyrights. It’s only ever an issue if the financial incentive to trace the profits back to their copyrighted source outweighs the cost of doing so. That’s why China has had free rein to steal any western technology; fighting them in their courts is not worth it. But with AI it’s way easier to trace the output back to its source (especially for art), so the incentive is there.
The main issue is the extraction of value from the original data. If I were to steal some bricks from your infinite brick pile and build a house out of them, do you have a right to my house? Technically I never stole a house from you.
You stole bricks. How rich I am does not impact what you did. Copying is not theft. You can keep stretching any shady analogy you want but you can’t change the fundamentals.
You’re conflating copyright and patents.
Shit, you’re right, I am.
Also conflating theft vs copying
Is this the point where we start UBI and start restructuring society for the future of AI?
“Impossible to build evil stronghold without walls made out of human skulls” claims necromancer.
If the rule is stupid or evil we should applaud people who break it.
we should use those who break it as a beacon to rally around and change the stupid rule
Except they pocket millions of dollars by breaking that rule and the original creators of their “essential data” don’t get a single cent while their creations indirectly show up in content generated by AI. If it really was about changing the rules they wouldn’t be so obvious in making it profitable, but rather use that money to make it available for the greater good AND pay the people that made their training data. Right now they’re hell-bent on commercialising their products as fast as possible.
If their statement is that stealing literally all the content on the internet is the only way to make AI work (instead of for example using their profits to pay for a selection of all that data and only using that) then the business model is wrong and illegal. It’s as simple as that.
I don’t get why people are so hell-bent on defending OpenAI in this case; if I were to launch a food-delivery service that’s affordable for everyone, but I shoplifted all my ingredients “because it’s the only way”, most would agree that’s wrong and my business is illegal. Why is this OpenAI case any different? Because AI is an essential development? Oh, and affordable food isn’t?
I am not defending OpenAI; I am attacking copyright. Do you have freedom of speech if you have nothing to say? Do you have it if you are a total asshole? Do you have it if you are the nicest human who ever lived? Do you have it and have no desire to use it?