OpenAI declares AI race “over” if training on copyrighted works isn’t fair use

@[email protected] · 4 months ago

@[email protected] · 4 months ago

If artificial intelligence can be trained on stolen information, then so should be “natural” intelligence.

Oh, wait. One is owned by oligarchs raking in billions, the other just serves the plebs.

@[email protected] · 4 months ago

couldn’t have said it better… the irony…

FlashMobOfOne · 4 months ago

Good.

Fuck Sam Altman’s greed. Pay the fucking artists you’re robbing.

@[email protected] · edit-2 4 months ago

deleted by creator

@[email protected] · 4 months ago

Whoever brings Aaron Swartz back gets to violate all the copyright laws

@[email protected] · 4 months ago

Aaron Swartz was 100% opposed to all copyright laws, you remember that yah?

@[email protected] · 4 months ago

I’m not just a copyright abolitionist, I also abhor all intellectual property. Yes, even trademarks.

@[email protected] · 4 months ago

Me too. I fundamentally oppose the idea that ideas can be owned, even by oneself.

But a weird cult has developed around copyright where people think they are on the side of the little guy by defending copyright.

@[email protected] · 4 months ago

It’s classic false consciousness of the temporarily embarrassed billionaire, except for the benefit of the blood 🐭 mouse in this case

@[email protected] · 4 months ago

This is fine🔥🐶☕🔥 · 4 months ago

And he also said “child pornography is not necessarily abuse.”

In the US, it is illegal to possess or distribute child pornography, apparently because doing so will encourage people to sexually abuse children.

This is absurd logic. Child pornography is not necessarily abuse. Even if it was, preventing the distribution or possession of the evidence won’t make the abuse go away. We don’t arrest everyone with videotapes of murders, or make it illegal for TV stations to show people being killed.

Wired has an article on how these laws destroy honest people’s lives.

https://web.archive.org/web/20130116210225/http://bits.are.notabug.com/

Big yikes from me whenever I see him venerated.

@[email protected] · 4 months ago

Yes, and he killed himself after the FBI was throwing the book at him for doing exactly what these AI assholes are doing without repercussion

@[email protected] · 4 months ago

And for some reason suddenly everyone leaps back to the side of the FBI and copyright because it’s a meme to hate on LLMs.

It’s almost like people don’t have real convictions.

You can’t be Team Aaron when it’s popular and then Team Copyright Maximalist when the winds change and it’s time to hate on LLMs or diffusion models.

@[email protected] · 4 months ago

deleted by creator

@[email protected] · edit-2 4 months ago

Sad to see you leave (not really, tho’), love to watch you go!

Edit: I bet if any AI developing company would stop acting and being so damned shady and would just ASK FOR PERMISSION, they’d receive a huge amount of data from all over. There are a lot of people who would like to see AGI become a real thing, but not if it’s being developed by greedy and unscrupulous shitheads. As it stands now, I think the only ones who are actually doing it for the R&D and not as eye-candy to glitz away people’s money for aesthetically believable nonsense are a handful of start-up-likes with (not in a condescending way) kids who’ve yet to have their dreams and idealism trampled.

@[email protected] · 4 months ago

In Spain we trained an AI using a mix of public resources made available for AI training and other public material (legislation, congress sessions, etc.). And the AI turned out quite good. Obviously not top of the line, but very good overall.

It was a public project not a private company.

@[email protected] · 4 months ago

But what data would it be?

Part of the “gobble all the data” perspective is that you need a broad corpus to be meaningfully useful. Not many people are going to give a $892 billion market cap when your model is a genius about a handful of narrow subjects that you could get deep volunteer support on.

OTOH there’s probably a sane business in narrow siloed (cheap and efficient and more bounded expectations) AI products: the reinvention of the “expert system” with clear guardrails, the image generator that only does seaside background landscapes but can’t generate a cat to save its life, the LLM that’s a prettified version of a knowledgebase search and NOTHING MORE

@[email protected] · edit-2 4 months ago

You’ve highlighted exactly why I also fundamentally disagree with the current trend of all things AI being for-profit. This should be 100% non-profit and driven purely by scientific goals, in which case using copyrighted data wouldn’t even be an issue in the first place… It’d be like literally giving someone access to a public library.

Edit: but to focus on this specific instance, where we have to deal with the here-and-now, I could see them receiving, say, 60-75% of what they have now, hassle-free. At the very least, and uniformly distributed. Again, AI development isn’t what irks most people, it’s calling plagiarism generators and search engine fuck-ups AI and selling them back to the people who generated the databases - or, worse, working toward replacing those people entirely with LLMs! - they used for those abhorrences.

Train the AI to be factually correct instead and sell it as an easy-to-use knowledge base? Aces! Train the AI to write better code and sell it as an on-board stackoverflow Jr.? Amazing! Even having it as a mini-assistant on your phone so that you have someone to pester you to get the damned laundry out of the washing machine before it starts to stink is a neat thing, but that would require less advertising and shoving down our throats, and more accepting the fact that you can still do that with five taps and a couple of alarm entries.

Edit 2: oh, and another thing which would require a buttload of humility, but would alleviate a lot of tension would be getting it to cite and link to its sources every time! Have it be transformative enough to give you the gist without shifting into plagiarism, then send you to the source for the details!

@[email protected] · edit-2 4 months ago

Oh no! How will I generate a picture of Sam Altman blowing himself now!?

Zier · 4 months ago

Photoshop, just like the rest of us.

This is fine🔥🐶☕🔥 · 4 months ago

Wdym? He removed his rib or something?

@[email protected] · 4 months ago

I was thinking more of a Sam 1 and Sam 2 type situation.

Ebby · 4 months ago

That’s a good litmus test. If asking/paying artists to train your AI destroys your business model, maybe you’re the arsehole. ;)

@[email protected] · 4 months ago

No, it means that copyrights should not exist in the first place.

@[email protected] · 4 months ago

Not only that, but their business model doesn’t hold up if they were required to provide their model weights for free because the material that went into it was “free”.

@[email protected] · 4 months ago

There’s also an argument that if the business was that reliant on free things to start with, then it shouldn’t be a business.

No-one would bat their eyes if the CEO of a real estate company was sobbing that it’s the end of the rental market, because the company is no longer allowed to get houses for free.

Tavi · 4 months ago

Agribusiness in shambles after draining the water table (it is still free)

@[email protected] · 4 months ago

The entire internet is built on free things.

Just saying.

SeekPie · 4 months ago

Doesn’t mean that businesses should be allowed to be.

Sandwich Artist · 4 months ago

deleted by creator

@[email protected] · 4 months ago

You misspelled capitalism.

@[email protected] · 4 months ago

Unregulated capitalism. That’s why people in dominant market positions want less regulation.

@[email protected] · 4 months ago

Entrenched companies often want more regulation to prevent startup competition. Pulling the ladder up behind them.

finder · edit-2 4 months ago

Extracting free resources of the land

Not to be contrarian, but there is a cost to extract those “free” resources; like labor, equipment, transportation, lobbying (AKA: bribes for the non-Americans), processing raw material into something useful, research and development, et cetera.

Sandwich Artist · edit-2 4 months ago

I love when apologists pop their head up so i can block them. Keeps my feed clean ya know?

finder · edit-2 4 months ago

If basic economics get you upset, then alright.

Bye o/

@[email protected] · 4 months ago

Was about to post the same thing

@[email protected] · 4 months ago

even the top phds can learn things off the amount of books that openai could easily purchase, assuming they can convince a judge that if the works aren’t pirated the “learning” is fair use. however, they’re all pirating and then regurgitating the works which wouldn’t really be legal even if a human did it.

also, they can’t really say how they need fair use and open standards and shit and in the next breath be begging trump to ban chinese models. the cool thing about allowing china to have global influence is that they will start to respect IP more… or the US can just copy their shit until they do.

imo that would have been the play against tik tok etc. just straight up we will not protect the IP of your company (as in technical IP not logo, etc.) until you do the same. even if it never happens, we could at least have a direct tik tok knock off and it could “compete” for american eyes rather than some blanket ban bullshit.

@[email protected] · 4 months ago

This particular vein of “pro-copyright” thought continuously baffles me. Copyright has not, was not intended to, and does not currently, pay artists.

It’s totally valid to hate these AI companies. But it’s absolutely just industry propaganda to think that copyright was protecting your data on your behalf.

Ebby · edit-2 4 months ago

Copyright has not, was not intended to, and does not currently, pay artists.

You are correct, copyright is ownership, not income. I own the copyright for all my work (but not work for hire) and what I do with it is my discretion.

What is income, is the content I sell for the price acceptable to the buyer. Copyright (as originally conceived) is my protection so someone doesn’t take my work and use it to undermine my skillset. One of the reasons why penalties for copyright infringement don’t need actual damages and why Facebook (and other AI companies) are starting to sweat bullets and hire lawyers.

That said, as a creative who relied on artistic income and pays other creatives appropriately, modern copyright law is far, far overreaching and in need of major overhaul. Gatekeeping was never the intent of early copyright and can fuck right off; if I paid for it, they don’t get to say no.

Snot Flickerman · edit-2 4 months ago

modern copyright law is far, far overreaching and in need of major overhaul.

https://rufuspollock.com/papers/optimal_copyright_term.pdf

This research paper from Rufus Pollock in 2009 suggests that the optimal timeframe for copyright is 15 years. I’ve been referencing this for, well, 16 years now, a year longer than the optimum copyright range. If I recall correctly I first saw this referenced by Mike Masnick of techdirt.

@[email protected] · 4 months ago

Copyright does not give the holder control over every “use”, especially something as vague as “using it to undermine their skillset”.

Copyright gives the rights holder a limited monopoly on three activities: to make and sell copies of their works, to create derivative works, and to perform or display their works publicly.

Not all uses involve making a copy, derivative, or performance.

Ebby · 4 months ago

Bingo. I was being more general in my response, but that is the more technical way of putting it.

@[email protected] · 4 months ago

Gatekeeping absolutely was the intention of copyright, not to provide artists with income.

Ebby · edit-2 4 months ago

By gatekeeping I mean the use of digital methods to verify or restrict use of purchased copyright material after a sale such as Digital rights management, encryption such as CSS/AACS/HDCP, or obfuscation.

The whole “you didn’t buy a copy, you bought a license” BS undermines what copyright was supposed to be IMO.

@[email protected] · edit-2 4 months ago

Copyright has not, was not intended to, and does not currently, pay artists.

Wrong in all points.

Copyright has paid artists (though maybe not enough). Copyright was intended to do that (though maybe not that alone). Copyright does currently pay artists (maybe not in your country, I don’t know that).

@[email protected] · 4 months ago

Wrong in all points.

No, actually, I’m not at all. In fact, I’m totally right:

https://www.youtube.com/watch?v=mhBpI13dxkI

Copyright originated to protect printers, not artists: to create a monopoly around a means of distribution.

How many artists do you know? You must know a few. How many of them have received any income through copyright? I dare you, in good faith, to try and identify even one individual you personally know, engaged in creative work, who makes any meaningful amount of money through copyright.

@[email protected] · 4 months ago

I know several artists living off of selling their copyrighted work, and no one in the history of the Internet has ever watched a 55 minute YouTube video someone linked to support their argument.

@[email protected] · edit-2 4 months ago

Cool. What artist?

Edit because I didn’t read the second half of your comment. If you are too up-your-own ass and anti-intellectual to educate yourself on this matter, maybe just don’t have an opinion.

@[email protected] · 4 months ago

I know quite a few people who rely on royalties for a good chunk of their income. That includes musicians, visual artists and film workers.

Saying it doesn’t exist seems very ignorant.

@[email protected] · 4 months ago

Cool. What artists?

@[email protected] · edit-2 4 months ago

Any experienced union film director, editor, DOP, writer, sound designer comes to mind (at least where I’m from)

@[email protected] · edit-2 4 months ago

Cool. Name one. A specific one that we can directly reference, where they themselves can make that claim. Not a secondary source, but a primary one. And specifically, not the production companies either, keeping in mind that the argument that I’m making is that copyright law was intended to protect those who control the means of production and the production system itself. Not the artists.

The artists I know, and I know several, make their money the way almost all people make money: by contracting for their time and services, or through selling tickets and merchandise, and through patreon subscriptions. In other words, the way artists and creatives have always made their money. The “product” in the sense of their music or art being a product, is given away practically for free. In fact, actually for free in the case of the most successful artists I know personally. If they didn’t give this “product” of their creativity away for free, they would not be able to survive.

There is practically 0 revenue through copyright. Production companies like Universal make money through copyright. Copyright was built for, historically intended for, and is currently used for the protection of production systems: not artists.

@[email protected] · 4 months ago

You forgot to link a legitimate source.

@[email protected] · edit-2 4 months ago

A lecture from a professional free software developer and activist whose focus is the legal history and relevance of copyright isn’t a legitimate source? His website: https://questioncopyright.org/promise/index.html

The anti-intellectualism of the modern era baffles me.

Also, he’s on the fediverse!

kfogel.org

@[email protected]

@[email protected] · 4 months ago

YouTube is not a legitimate source. The prof is fine but video only links are for the semi literate. It is frankly rude to post a minor comment and expect people to endure a video when a decent reader can absorb the main points from text in 20 seconds.

@[email protected] · 4 months ago

removed by mod

@[email protected] · 4 months ago

Interesting copyright question: if I own a copy of a book, can I feed it to a local AI installation for personal use?

Can a library train a local AI installation on everything it has and then allow use of that on their library computers? <— this one could breathe new life into libraries

Ebby · 4 months ago

First off, I’m by far no lawyer, but it was covered in a couple classes.

According to law as I know it, question 1 yes if there is no encryption, and question 2 no.

In reality, if you keep it for personal use, artists don’t care. A library however, isn’t personal use and they have to jump through more hoops than a circus especially when it comes to digital media.

But you raise a great point! I’d love to see a law library train AI for in-house use and test the system!

snooggums · edit-2 4 months ago

Good if AI fails because it can’t abuse copyright. Fuck AI.

*except the stuff used for science that isn’t trained on copyrighted scraped data, that use is fine

@[email protected] · 4 months ago

Yeah unfortunately we’ve started calling any LLM “AI”

Fushuan [he/him] · 4 months ago

In ye old notation ML was a subset of AI, and thus all LLMs would be considered AI. It’s why manual decision trees that codify NPC behaviour are also called AI, because they are.

Now people use AI to refer only to generative ML, but that’s wrong and I’m willing to complain every time.

𝘋𝘪𝘳𝘬 · 4 months ago

Okay, bye!

Zier · 4 months ago

If AI gets to use copyrighted material for free and makes a profit off of the results, that means piracy is 1000% Legal. Excuse me while I go and download a car!!

@[email protected] · 4 months ago

All you have to do is present credible evidence that these companies are distributing copyrighted works or a direct substitute for those copyrighted works. They have filters to specifically exclude matches though, so it doesn’t really happen.

@[email protected] · 4 months ago

No, stop! You wouldn’t!

Zier · 4 months ago

I would, and a house. I’m a menace!

@[email protected] · 4 months ago

DAMMIT ALL TO HELL!

…This must be DEI’s fault.

@[email protected] · 4 months ago

Thanks a lot, Obama

Kokesh · 4 months ago

Please, let it be over. Idiotic “ai”…

Lovable Sidekick · 4 months ago

Alright, I confess! Almost all of my training in computer programming came from copyrighted material. Put the cuffs on me!

snooggums · 4 months ago

You were trained and learned and are able to create new things.

AI poorly mimics things it has seen before.

Lovable Sidekick · edit-2 4 months ago

The issue being raised is copyright infringement, not the quality of the results. Writers “borrow” each other’s clever figures of speech all the time without payment or attribution. I’m sure I have often copypasted code without permission. AI does nothing on its own, it’s always a tool used by human beings. I think the copyright argument against AI is built on a false distinction between using one tool vs another.

My larger argument is that nobody has an inherent right to control what everybody else does with some idea they’ve created. For many thousands of years people saw stuff and freely imitated it. Oh look, there’s an “arch” - I think I’ll build a building like that. Oh look, that tribe uses that root to make medicine, let’s do the same thing. This process was known as “the spread of civilization” until somebody figured out that an authority structure could give people dibs on their ideas and force other people to pay to copy them. As we evolve more capabilities (like AI) I think it’s time to figure out another way to reward creators without getting in the way of improvement, instead of hanging onto a “Hey, that’s Mine!” mentality that does more to enrich copy producers than it does to enrich creators.

snooggums · 4 months ago

Yes, whether copyright should exist is a different discussion than how AI is violating it in a very different way than snippets being reused in different contexts as part of a new creative work.

Intentionally using a single line is very different than scooping up all the data and hitting a randomizer until it stumbles into some combination that happens to look usable. Kind of like how a single business jacking up prices is different than a monopoly jacking up all the prices.

Lovable Sidekick · 4 months ago

Stripping away your carefully crafted wording, the differences fade away. “Hitting a randomizer” until usable ideas come out is an equally inaccurate description of either human creativity or AI. And again, the contention is that using AI violates copyright, not how it allegedly does that.

snooggums · 4 months ago

So the other thing with AI is the companies are not just making money on the output like an artist would. They are making bank on investors and stock market speculation that exists only because they scooped up massive amounts of copyrighted materials to create their output. It really isn’t comparable to a single artist or even a collection of artists.

Lovable Sidekick · edit-2 4 months ago

Again, AI doesn’t do anything, any more than hammers and saws build houses. People use AI to do things. Anyway, profiting from investors and speculators without giving creators a piece of the action isn’t a consequence of AI, it’s how our whole system already works.

@[email protected] · 4 months ago

His personal race is over? Oooohhhh, so sorry for him.

AI is not over at all. Maybe he himself will not become the ruler of the world now. No loss.

IHeartBadCode · 4 months ago

Yeah, China sure as shit isn’t going to lose sleep over a US Copyright case.

@[email protected] · 4 months ago

I mean if they pay for it like everyone else does I don’t think it is a problem. Yes it will cost you billions and billions to do it correctly, but then you basically have the smartest creature on earth (that we know of) and you can replicate/improve on it in perpetuity. We still will have to pay you licensing fees to use it in our daily lives, so you will be making those billions back.

Now I would say let them use anything that is old and freeware, textbooks, etc. government owned stuff - we sponsored it with our learning, taxes - so we get a percentage in all AI companies. Humanity gets a 51% stake in any AI business using humanity’s knowledge, so we are then free to vote on how the tech is being used and we have a controlling share, also whatever price is set, we get half of it back in taxes at the end of the year. The more you use it the more you pay and the more you get back.

@[email protected] · 4 months ago

They’re unprofitable as it is already. They’re not going to be able to generate enough upfront capital to buy and then enclose all of humanity’s previous works to then sell it back to us. I also think it would be heinous that they could enclose and exploit our commons in this manner. It belongs to all of us. Sure train it and use it, but also release it open (or the gov can confiscate it, fine with that as well). Anything but allowing those rat-snakes to keep it all for themselves.

@[email protected] · 4 months ago

If it costs billions and billions, then only a handful of companies can afford to build an AI and they now have a monopoly on a technology that will eventually replace a chunk of the workforce. It would basically be giving our economy to Google.

@[email protected] · 4 months ago

The owners of the copyrighted works should be paid in perpetuity too though, since part of their work goes into everything the AI spits out.

@[email protected] · 4 months ago

I don’t see why I’m downvoted for this, but I don’t agree with this opinion - it’s like teaching a human being. If you buy everything once it’s still a hell of a bill - we are talking all books, all movies, all games, all software, all memes, all things - 1 of each is still trillions if you legally want to train your new thing on it.

@[email protected] · 4 months ago

But a human can’t look at a painting for a millisecond and spit out an exact replica in the next. A human can’t listen to the collected works of a musical artist and instantly improvise infinite sound-a-like songs based on complex prompts. A human can’t read every scientific article on the Internet in a few seconds and regurgitate any and every tiny trivial detail on demand in the literal blink of an eye. A human being has a soul. Most do anyway.

For the record, I didn’t downvote you, but I’m guessing others did because you don’t seem to see how AI so obviously devalues the beautiful and brilliant efforts of the human spirit to build and sustain our cultures, our societies, our civilizations, our species, our very world. In the capitalist hellscape that we currently suffer in, that kind of devaluing ought to be criminal.

@[email protected] · 4 months ago

Not targeting you specifically, but I guess AI is going to be a hard subject in the future.

Think of it as an expert in all other areas and you spend a year teaching it to be a better expert and so on. It’s just humanity’s digital baby that we are teaching based on our current knowledge, technology, art, values, morals, etc. - and it’s just much better than you or me at learning so it’s becoming an expert in everything, thus as you expect from an expert it’s able to draw, it’s able to replicate styles of music, it’s able to think through complex math/physics/chem/biology problems as a human expert might be able to. Yet it has fatal flaws that need fixing, thus needs better training methods and more time - they are saying 2029 for AGI which is the first step. At that point it won’t be up to you or me to decide as it will be a new living form that we will have to acknowledge and let it decide for itself what it wants or doesn’t want to do.

I guess my point is it seems like it’s devaluing stuff, but is in fact elevating everything that we were, we are and will be - that’s why I’m saying it should be owned by all of us, we should all get the benefits. If a painter wants to draw something, they can use AI to draw faster, with more variations at a speed impossible before, you can make new styles, you can make it use just your own style, you save time and can create more complex works because of that. Real world paintings made by humans the old school way will always have a place, my thoughts are that they will even gain an exclusive status and be worth even more with proof of creation.

Not saying things are not bad right now, but what if AI is the path forward, like technology always has been - what if it helps cure all diseases past and future, what if it figures out how to make us immortal, what if we can travel instantaneously from 1 place in the universe to another, imagine the possibilities that it will open to us. I think it’s inevitable really.

@[email protected] · 4 months ago

Fucking hellscape