Reddit says Microsoft’s Bing, Anthropic, and Perplexity have scraped its data without permission. “It has been a real pain in the ass to block these companies.”

  • @[email protected]
    link
    fedilink
    English
    32
    1 year ago

    lol. That’s not how any of this works.

    If you have content freely and publicly accessible, it will be read freely by humans and bots.

    • @[email protected]
      link
      fedilink
      English
      23
      1 year ago

      We can be so glad we left. That place’s quality has dropped so much.

      Reddit is for porn and perhaps age old posts now.

      • @[email protected]
        link
        fedilink
        English
        7
        1 year ago

        reddit is for right wing conservative astroturfing and misinformation, judging from the last time i looked at its front page.

      • @[email protected]
        link
        fedilink
        English
        17
        1 year ago

        I don’t know if it’s real or just nostalgia, but reddit is massively worse than I remember. I’ve gone back to nose a few times, and it feels like a ghost town, compared to what it used to be.

        • @[email protected]
          link
          fedilink
          English
          1
          11 months ago

          Unchecked mods, baby! Dipshit mods running everyone off.

          It is annoying that it is popular. It is the only place you can post up a picture of some piece of plastic you found and 4 people will comment 39 seconds later telling you that it is from the inside of a distributor cap of a ’98 Ford Focus, one guy will say it is from a Venusian mother ship, another guy will offer to buy it for $3, and another will DM you asking for another picture of it but held between your toes for $20… so in less than a minute I am a little more knowledgeable and $23 richer.

          I hate that it remains useful and that I still use it.

      • @[email protected]
        link
        fedilink
        English
        7
        1 year ago

        Reddit is not even good for porn anymore. Since they banned it from r/all. You gotta find it yourself now.

  • melroy
    link
    fedilink
    7
    1 year ago

    We shouldn’t accept this behavior or other companies will follow!

    • @[email protected]
      link
      fedilink
      English
      5
      1 year ago

      Yes, absolutely. Any time I need to buy a product I don’t know much about, I look for an enthusiast community with a FAQ. Most of the active, high-quality communities are on Reddit.

      I would like decentralized services to replace that, but that’s a slow process, if it happens at all.

    • @[email protected]
      link
      fedilink
      English
      13
      edit-2
      1 year ago

      Completely. Lemmy is far too small to have the value Reddit does.

      I left Reddit due to their API bullshit, but I so miss all of the hobby communities I was part of, which had like-minded members and a plethora of resources. It’s anywhere from not easy to impossible to start communities such as reeftanks, homesteading, literature, bookcirclejerk, etc. on a platform as small as Lemmy. And beyond starting one, the quality and quantity will never match Reddit’s because Lemmy just doesn’t have the same reach.

      Lemmy is great if you like Linux, like Star Trek, or are trans, but other than that, it’s missing so, so many demographics that make a holistic platform.

      • @[email protected]
        link
        fedilink
        English
        3
        edit-2
        1 year ago

        trans

        I feel this so hard, the sheer number of openly LGBTQ+ people here really skews the demographics of the site. I’m not saying it’s a problem, just saying that LGBTQ+ people are dramatically over-represented here. It’s an interesting contributor to lemmy culture, and I wonder how much that impacts homogeneity here (e.g. upvotes and downvotes for certain types of content).

        But yeah, it’s missing a lot of demographics.

        That said, I’m really into Linux (been using for >15 years), so that’s cool I guess.

        • @[email protected]
          link
          fedilink
          English
          1
          1 year ago

          As a cis straight man I’m taking this as a learning opportunity until the demographics level out. An inherently inclusive bias will be more helpful early on than more niche communities anyways.

          • @[email protected]
            link
            fedilink
            English
            2
            1 year ago

            Sure. Again, I’m not saying it’s bad, just that the bias seems to exist.

            There are certainly worse biases that exist, such as very little representation from people on the right side of the spectrum, so hate against half the population seems to get a pass and downvotes silence constructive comments/posts just due to political bias. That’s incredibly frustrating, and I think the high focus on supporting LGBTQ+ people goes along with that (i.e. the message that conservatives “hate” LGBTQ+ people, which is only true for the more extreme end of conservatism).

            That said, I do like the support LGBTQ+ people get, I just wish the demographics were a bit more diverse without sacrificing the culture. I live and work in a conservative area, but my company has built a pretty inclusive culture (at least for the area), so I think it’s totally possible.

            • @[email protected]
              link
              fedilink
              English
              1
              1 year ago

              Oh man I don’t miss that at all. Moderating out a pervasive delusion isn’t bias, any more than we’re biased in favor of a round Earth. On Reddit there were constant “enlightened centrists” who kept making appeals to moderation.

              There’s nothing of value to be gained from conservatives. The “good” ones who don’t say the homophobia out loud are still voting for politicians who do. If it was just the extreme end, then Trump wouldn’t be their nominee. Hate is their normal now.

              “If there’s a Nazi at the table and 10 other people sitting there talking to him, then you got a table with 11 Nazis.”

              • @[email protected]
                link
                fedilink
                English
                2
                11 months ago

                This is exactly what I’m talking about: casually dismissing half of the population based on little more than association. That drives division and pushes people into echo chambers.

                • @[email protected]
                  link
                  fedilink
                  English
                  1
                  11 months ago

                  Understood. I am disagreeing with you. If that wasn’t obvious, then I fear you may have missed my point.

                  Half of America supporting fascism is reason to create somewhere - anywhere - where that shit is shut down. You’re free to go associate with freeze peach Nazis on X, Facebook, Nostr, wherever. I don’t want any part of that and prefer a server that moderates them out. Paradox of tolerance and all that.

                  If you all believed the Earth was flat, then I would prefer the “echo chamber” of people saying “no, we checked, it’s round”. There simply being a lot of believers doesn’t imply an idea has merit, and we don’t have infinity time for BS.

    • @[email protected]
      link
      fedilink
      English
      16
      edit-2
      1 year ago

      An absolutely prodigious back catalog of high quality images, interviews, and explainers. A treasure trove of historical content that’s been heavily indexed and participant-weighted for relevancy. And the bulk of it predates the infestation of AI, so it’s valuable just as sampling data of original human content for further iterative development of ChatGPT and other LLMs.

      • RBG
        link
        fedilink
        English
        1
        1 year ago

        I don’t know about the AI part. The major companies had plenty of time scraping everything on the internet, or am I simplifying the effort too much in my head?

    • @[email protected]
      link
      fedilink
      English
      3
      1 year ago

      Reddit remains as valuable as ever. It’s amusing that you think it imploded a year ago just because a small number of users migrated here

      • desktop_user [they/them]
        link
        fedilink
        English
        1
        1 year ago

        It sort of did: thousands of useful comments were turned to gibberish, the mobile web site turned to shit, and the mobile app stopped properly working for communities with specific content warnings.

    • @[email protected]
      link
      fedilink
      English
      14
      1 year ago

      A lot of older posts are still relevant to specific hobbies. I will look up information on paper, some guitar information, but most posts from the last two years are not worth looking at.

      There is also so much regurgitated LLM shit.

  • paraphrand
    cake
    link
    fedilink
    English
    4
    1 year ago

    It would be interesting if any large companies got behind promoting and endorsing federated media to get around this sort of situation.

    I suspect paying money is easier. And they probably assume their need to do so will be temporary. AGI will fix everything, right? Feel it. Feel the AGI.

  • Hot PotatoOP
    link
    fedilink
    English
    89
    1 year ago

    “It has been a real pain in the ass to block these companies.” makes me regret ever using Reddit in my life. Get your profit whatever, but this is just beyond greed.

    • @[email protected]
      link
      fedilink
      English
      18
      1 year ago

      I deleted my account but that was before I learnt you could replace all your posts with random sentences.

      • @[email protected]
        link
        fedilink
        English
        3
        1 year ago

        You don’t want random because that’s easy to detect. You want to fuck up the ML so you need to be more subtle like scrambling a few words or replacing certain nouns or logical connections in ways that are hard to differentiate from regular edits.
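For anyone curious what "scrambling a few words" could look like in practice, here is a minimal sketch. It is only an illustration of the idea in the comment above: the function name `subtly_scramble` and its `swaps` and `seed` parameters are made up for this example, and whether edits like this have any effect on a training pipeline is an open question, not something this code demonstrates.

```python
import random


def subtly_scramble(text: str, swaps: int = 2, seed: int = 0) -> str:
    """Swap a few adjacent word pairs, leaving most of the comment intact.

    Hypothetical helper for illustration only. Runs of whitespace are
    collapsed to single spaces as a side effect of split/join.
    """
    words = text.split()
    if len(words) < 2:
        return text  # nothing to swap
    rng = random.Random(seed)  # seeded so the result is reproducible
    for _ in range(swaps):
        i = rng.randrange(len(words) - 1)  # pick a pair (i, i + 1)
        words[i], words[i + 1] = words[i + 1], words[i]
    return " ".join(words)


original = "The part is from the inside of a distributor cap"
edited = subtly_scramble(original, swaps=2, seed=42)
print(edited)
```

The edited comment keeps exactly the same words as the original, only in a slightly different order, which is the "hard to differentiate from regular edits" property the comment is after.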

      • @[email protected]
        link
        fedilink
        English
        12
        1 year ago

        Maybe check, b/c there’s a chance that they may have undeleted it all by now, so there’s a possibility that you could still do it?

  • @[email protected]
    link
    fedilink
    English
    554
    edit-2
    1 year ago

    “Without these agreements, we don’t have any say or knowledge of how our data is displayed and what it’s used for, which has put us in a position now of blocking folks who haven’t been willing to come to terms with how we’d like our data to be used or not used,” Huffman said in an interview this week.

    It’s not your data.

    Fuck off.

    • @[email protected]
      link
      fedilink
      English
      54
      1 year ago

      I mean it literally is. People post it there voluntarily knowing that. It’s what keeps the lights on.

      • @[email protected]
        link
        fedilink
        English
        11
        1 year ago

        They are not responsible for what people post, nor do they pay anyone to post, therefore I do not see how they can claim the data as “theirs”.

        They have their own self-regulated rules, but ultimately most anything is fair game for reddit to point at the user and say “we take no responsibility for what an individual may post on this public forum”.

        The only thing they will have a problem with is CSAM, but even then as long as the volunteer mods remain effective at removing it, reddit will not be responsible for anything users post.

      • @[email protected]
        link
        fedilink
        English
        90
        edit-2
        1 year ago

        Sort of, but not really. From the Reddit ToS (emphasis mine). Basically, you own your content but allow Reddit to use it however they want without crediting you. Only a corporate lawyer would call that arrangement “ownership”, but I digress…


        By submitting Your Content to the Services, you represent and warrant that you have all rights, power, and authority necessary to grant the rights to Your Content contained within these Terms. Because you alone are responsible for Your Content, you may expose yourself to liability if you post or share Content without all necessary rights.

        You retain any ownership rights you have in Your Content, but you grant Reddit the following license to use that Content:

        When Your Content is created with or submitted to the Services, you grant us a worldwide, royalty-free, perpetual, irrevocable, non-exclusive, transferable, and sublicensable license to use, copy, modify, adapt, prepare derivative works of, distribute, store, perform, and display Your Content and any name, username, voice, or likeness provided in connection with Your Content in all media formats and channels now known or later developed anywhere in the world. This license includes the right for us to make Your Content available for syndication, broadcast, distribution, or publication by other companies, organizations, or individuals who partner with Reddit. You also agree that we may remove metadata associated with Your Content, and you irrevocably waive any claims and assertions of moral rights or attribution with respect to Your Content.

        • @[email protected]
          link
          fedilink
          English
          9
          1 year ago

          It’s right there in the ToS: NON-EXCLUSIVE license. If they go to court, I would guess they lose.

        • @[email protected]
          link
          fedilink
          English
          29
          1 year ago

          Beyond that, if you are serving webpages with data on them, you don’t get to decide what people do with those pages. They can’t stop search engines from scraping.

          • @[email protected]
            link
            fedilink
            English
            21
            1 year ago

            Just to nitpick, they can stop scraping, anyone can. However, doing so would require implementing barriers that tend to also negatively affect sites that are dependent on being discovered and browsed.
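To make the trade-off in the comment above concrete, here is a small sketch of the most common "barrier", a robots.txt file, checked with Python's standard `urllib.robotparser`. The robots.txt content, the crawler names (`ExampleAIBot`, `ExampleSearchBot`), and the URLs are all invented for illustration and are not Reddit's actual rules. Keep in mind that robots.txt is advisory: it only stops crawlers that choose to honor it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: crawler names and paths are made up for this example.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text directly instead of fetching it over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# The named crawler is shut out of the whole site.
ai_bot_ok = parser.can_fetch("ExampleAIBot", "https://example.com/r/some_thread")
# Any other crawler falls through to the "*" rules and may read public threads...
search_bot_ok = parser.can_fetch("ExampleSearchBot", "https://example.com/r/some_thread")
# ...but not the disallowed path.
search_bot_private_ok = parser.can_fetch("ExampleSearchBot", "https://example.com/private/page")

print(ai_bot_ok, search_bot_ok, search_bot_private_ok)
```

Blocking by user agent like this is cheap, but widening the `Disallow` rules to cover every crawler is exactly what also removes a site from search results, which is the cost the comment points at.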

        • @[email protected]
          link
          fedilink
          English
          16
          1 year ago

          It’s actually a fascinating bind Steve/Reddit has put themselves in. Because it is a non-exclusive license, you can affirmatively declare your content is free for anyone to scrape or use.

          After that, if Reddit ever asserts rights over your content by, say, suing Microsoft for improperly using your content in training data, you now have a legal claim against Reddit for interference with either your ownership rights or with a contract via whatever license you have made your content available under.

          Now, maybe Reddit has a claim release in their TOS, but it wouldn’t prevent you from getting an injunction enjoining Reddit from restricting your data from being used by Microsoft.

          It’s kind of academic, because… it’s not really a victory that Microsoft is also training its AI on your data. But, hey, they’re probably doing it anyway and at least this way we get to screw over Huffman for being an ass.

          • @[email protected]
            link
            fedilink
            English
            2
            1 year ago

            MS couldn’t access that content without scraping the page itself, though, which of course belongs to Reddit. From a legal standpoint, it’s like a paywall.

          • @[email protected]
            link
            fedilink
            English
            3
            edit-2
            1 year ago

            The only issue I see with this is that it can be argued that this license doesn’t grant third parties access to data on Reddit’s platform.

        • @[email protected]
          link
          fedilink
          English
          3
          1 year ago

          Lol…really? So they can reuse, modify, and remove all association with your content, but somehow you think you still own it?

          I’ve got a bridge to sell you.

          • @[email protected]
            link
            fedilink
            English
            33
            1 year ago

            In essence, it means that you reserve the right to also use the content for your own purposes, without Reddit having any recourse to preventing you from doing that.

            • @[email protected]
              link
              fedilink
              English
              2
              1 year ago

              Except they published your work, all variants of said work, and completely eliminated you as the author of said work.

              I don’t know how else to explain to you that you don’t own that work anymore. You have rights to it. But you don’t own it.

      • @[email protected]
        link
        fedilink
        English
        11
        1 year ago

        Yup keeps the lights on and makes sure Spez gets his yearly 200 million bonus. It’s good that they are tightening the screw because 200 million is clearly not enough, he deserves double that at least.

      • @[email protected]
        link
        fedilink
        English
        36
        edit-2
        1 year ago

        It literally isn’t. Even their shitty EULA only claims a license to use it, not that it’s their data.

        And approximately 100% of the data on their servers was created while it was accessible to literally anyone who wanted it without restriction through a free API. Virtually none of the content was ever intended to be kept from fucking search engines so it could be sold for AI.

      • @[email protected]
        link
        fedilink
        English
        6
        1 year ago

        I deleted my top comments and left the trash, which was 15 years’ worth. AI can hallucinate off that trash all it wants.

      • @[email protected]
        link
        fedilink
        English
        74
        1 year ago

        Don’t regret too much. I wouldn’t be surprised if reddit’s “delete” function was really just “move to the ‘suckers-wanted-to-delete-this’ file.”

        • The Quuuuuill
          link
          fedilink
          English
          20
          1 year ago

          If you delete your content, do it in the form of a GDPR takedown request.

            • The Quuuuuill
              link
              fedilink
              English
              1
              11 months ago

              Good news, Reddit’s crap ass infra doesn’t differentiate if a GDPR request is legitimate or spurious. Or at least it didn’t back when I processed mine, they may have closed that loophole

        • @[email protected]
          link
          fedilink
          English
          29
          1 year ago

          I “deleted” all my posts, then randomly had someone reply to a 3 year old post that wasn’t showing up in my profile but still showed on the page.

          Don’t delete your comments, edit them to be useless.

            • @[email protected]
              link
              fedilink
              English
              2
              1 year ago

              Probably, but when someone is going through old posts they are going to see the edit, not the history. The main goal here is to make Reddit less useful so people go elsewhere. Let Google’s AI be trained on Bot posts.

          • @[email protected]
            link
            fedilink
            English
            6
            1 year ago

            And as a positive to editing rather than deleting, you may have your comment taken down by AutoMod anyway! I had AutoMod take down a ton of my comments because they were flagged as spam because I used a replacement text tool to mass fix a decade’s worth of comments on multiple accounts. So many messages from AutoMod…

      • FenrirIII
        link
        fedilink
        English
        28
        1 year ago

        Same. 13+ years of insanity and grouchy comments are gonna mess up the AIs

        • Transporter Room 3
          link
          fedilink
          English
          27
          1 year ago

          Same, plus or minus a year.

          It took me a week, but I scrambled every comment and post with lorem ipsum and bee movie scripts, deleted the comments, then after verifying I could no longer find any of my original content on any search engine outside archive sites, I deleted the account.

          It took so long because r*ddit started limiting API access when they realized people were automating their profile scrubbing.

          As I’ve said before about certain countries, if you’re doing everything you can to prevent people from leaving [THING/PLACE] then you might just be shit.

          • FenrirIII
            link
            fedilink
            English
            9
            1 year ago

            I was straight IP banned permanently for reporting the Israeli genocide fans and racists arguing for the eradication of Palestinians. I just deleted the account because I never imagined they would turn into such shitheels.

            • @[email protected]
              link
              fedilink
              English
              5
              1 year ago

              And here I got IP banned for saying we should murk the crown prince of Saudi Arabia. This was around the time the journalist got butchered in that Saudi Embassy. As an aside the Saudis have oil and are assholes how long till we start drone striking them?

              • @[email protected]
                link
                fedilink
                English
                5
                1 year ago

                As an aside the Saudis have oil and are assholes how long till we start drone striking them?

                They would either need to stop playing ball, or oil is no longer a staple for energy generation and transport needs.

                Until then, the 9/11 architects will be able to hang out and do whatever they want.

                Also, thanks for protecting your friends that day GW, here’s hoping you get “touched” by a friend from a long way away…

            • LustyArgonian
              link
              fedilink
              English
              1
              edit-2
              1 year ago

              I got IP banned for asking if there was any “good news” about Mitch McConnell after his strokes. I intentionally worded it ambiguously, but the mod on r/politics looked at my political history/not a conservative and decided I was celebrating violence and so I was IP banned. I guess only Mitch McConnell is allowed to salivate at violence openly and the rest of us are supposed to be worried for his health. It’s not a problem if women are the ones he’s directing violence towards, but God forbid a woman speak back to him. Pregnancy and birth cause strokes and clots, and that’s preferable to him vs an abortion… but God forbid he get strokes at the end of his life from being a horrible person and I find that preferable to his stupid harmful policies

              • @[email protected]
                link
                fedilink
                English
                2
                1 year ago

                The way they enforce that no violence rule is so fucking stupid. Even if you had said “I hope that stroke implodes his brain” you aren’t advocating violence. A medical issue isn’t violence.

    • yeehaw
      link
      fedilink
      English
      32
      1 year ago

      Part of the ToS. Whatever you put on there is effectively theirs. Same with Facebook and your photos etc.

      • @[email protected]
        link
        fedilink
        English
        10
        1 year ago

        Whatever you put on there is effectively theirs.

        I would so love if companies that had decided they own/can sell the data users published lost section 230 protection. Oh, this is your data? I guess you don’t need to be protected against the data users post if it’s your data now.

      • Dr. Moose
        link
        fedilink
        English
        11
        1 year ago

        No, you cannot transfer copyright with ToS agreements, just give Reddit a license to use your copyrighted content.

      • @[email protected]
        link
        fedilink
        English
        3
        1 year ago

        And whatever you put on a publicly accessible webpage is effectively anyone’s who makes a GET request.

  • @[email protected]
    link
    fedilink
    English
    125
    1 year ago

    Without these agreements, we don’t have any say or knowledge of how our data is displayed and what it’s used for, which has put us in a position now of blocking folks who haven’t been willing to come to terms with how we’d like our data to be used or not used

    It’s the users’ data, not yours, you rent seeking fuck

  • Brownian Motion
    link
    fedilink
    English
    15
    1 year ago

    Reddit says “blablabla” . Reddit is just trying to stop its communistic website from losing money.

    Reddit says “we will give you access” once you pay us and feather our pockets.

    Fuck spez, what a cunt. Delete all your comments etc, and let the AI rot in retarded posts.

  • Nougat
    link
    fedilink
    87
    1 year ago

    “Let’s see … how do we get more people to visit our site? I know! We’ll prevent search engines from sending people to it!”

    • @[email protected]
      link
      fedilink
      English
      28
      edit-2
      1 year ago

      It’s phase three of the enshittification cycle. In phase one, you attract users by providing a good service. In phase two, once the users are locked in, you squeeze them for all they’re worth by switching focus to business customers. In phase three, once the business customers are locked in too, you squeeze them by threatening to deny them access to the users on whom they now depend.

      • Nougat
        link
        fedilink
        15
        1 year ago

        I had also described it elsewhere as the “suck all the value out before it’s dead” phase. They’re clearly no longer interested in growing the site; they’re just getting as much money as possible from their traffic and engagement history now, because they know traffic and engagement are already declining.

        • Boozilla
          link
          fedilink
          English
          7
          1 year ago

          They won’t get the Nougat! The Nougat is here with us!

          • Nougat
            link
            fedilink
            7
            1 year ago

            Spez doesn’t get to profit from me anymore.

    • m-p{3}
      link
      fedilink
      English
      23
      edit-2
      1 year ago

      Big profit now is better than our long term image

      ~ Reddit shareholders

      • rhabarba
        link
        fedilink
        English
        9
        1 year ago

        As if there was any hint of an “image” left to lose.

        • @[email protected]
          link
          fedilink
          English
          2
          1 year ago

          It does make a certain amount of sense. Big profit now means you get a chunk of cash to invest in other quick profit schemes, and your wealth just keeps snowballing. It works as long as you don’t care that you never build anything that lasts.

  • @[email protected]
    link
    fedilink
    English
    12
    1 year ago

    I stopped appending reddit to my search terms. I won’t go back to Google search; especially after the GamersNexus video featuring Wendell from Level1 Techs.