Reddit says Microsoft’s Bing, Anthropic, and Perplexity have scraped its data without permission. “It has been a real pain in the ass to block these companies.”

  • Lvxferre [he/him]
    English
    12
    11 months ago

    To commemorate Steve “Greedy Pigboy” Huffman’s assertiveness, I’ve made some memes. Enjoy.

  • @mightyfoolish@lemmy.world
    English
    12
    11 months ago

    I stopped appending reddit to my search terms. I won’t go back to Google search, especially after the GamersNexus video featuring Wendell from Level1 Techs.

  • @Freefall@lemmy.world
    English
    15
    11 months ago

    I don’t think the content on Reddit is theirs to sell… unless redditors are getting a cut. That site is a dumpster and needs to die already.

  • @Vipsu@lemmy.world
    English
    50
    11 months ago

    Well, Reddit should just sue these companies and see if they are actually breaking any laws. Holding a sizeable chunk of the internet hostage also sounds like something the EU and US might want to look into, as it very much sounds like anti-competitive conduct or market manipulation.

    Also, if these companies want to have greater ownership over the content generated by their users, they should also be much more liable for the content posted to their sites. I mean, when something like Section 230 was written, they probably did not take this into account. If these companies want to start selling user-generated content, then they should simply lose the immunity from liability.

    • @commie@lemmy.dbzer0.com
      English
      6
      11 months ago

      they should also be much more liable for the content posted to their sites.

      why do people insist on making me defend reddit.

    • Dr. Moose
      English
      17
      11 months ago

      Reddit would lose badly, that’s why they don’t sue. The US 9th Circuit ruled that scraping LinkedIn is legal, and Bing is not even scraping but indexing the data. Easiest case ever.

      It’s almost impossible to block web scraping, especially by someone with Microsoft’s or Perplexity’s resources.

      It’s clearly an attempt to blackmail indexers into a license deal, as paying something to Reddit could actually be cheaper than battling anti-bot measures.

    • @mint_tamas@lemmy.world
      English
      6
      11 months ago

      While I don’t disagree with the general idea, losing Section 230 protection would introduce an uncontrollable risk into running any website with user-generated content and would essentially shut them down.

      • @Passerby6497@lemmy.world
        English
        7
        11 months ago

        If the site isn’t selling data, they wouldn’t lose 230 protection. So that would only be a risk for the companies selling their users’ data, not your regular forum or something.

        • @sugar_in_your_tea@sh.itjust.works
          English
          6
          11 months ago

          That gets really murky though. For example:

          • news sites w/ comment sections - they’re profiting from ads and subscriptions, so how much of that has to do with the comments?
          • ecommerce - reviews on Amazon and eBay could be considered advertising for the product. Who’s liable, the ecommerce site, the merchant, or the poster?
          • product websites - how much are posted “reviews” considered advertising for the product? There may not be direct sales on the website, but surely someone’s review would impact sales elsewhere
          • for-profit services with a discussion forum - these would be on a separate site from the revenue-generating service, but still associated with the brand and thus likely contributing to advertisements for the product

          It’s a lot more obvious for social media sites like Facebook since user-generated content is the service, but there are a lot of for-profit entities where user-generated content is highly relevant, but not the core service. Would those sites be essentially forced to either moderate or eliminate user interaction?

          There’s a lot of complexity here.

  • @Evotech@lemmy.world
    English
    27
    11 months ago

    Reddit only exists because of an open net and sharing content; now they’ve suddenly determined that an open net is bad.

    A common strategy, but it fucking sucks.

    Fuck you reddit.

    • @Zak@lemmy.world
      English
      5
      11 months ago

      Yes, absolutely. Any time I need to buy a product I don’t know much about, I look for an enthusiast community with a FAQ. Most of the active, high-quality communities are on Reddit.

      I would like decentralized services to replace that, but that’s a slow process, if it happens at all.

    • @UnderpantsWeevil@lemmy.world
      English
      16
      edit-2
      11 months ago

      An absolutely prodigious back catalog of high quality images, interviews, and explainers. A treasure trove of historical content that’s been heavily indexed and participant-weighted for relevancy. And the bulk of it predates the infestation of AI, so it’s valuable just as sampling data of original human content for further iterative development of ChatGPT and other LLMs.

      • RBG
        English
        1
        11 months ago

        I don’t know about the AI part. The major companies had plenty of time scraping everything on the internet, or am I simplifying the effort too much in my head?

    • @CaliforniaKove@lemmy.ca
      English
      3
      11 months ago

      Reddit remains as valuable as ever. It’s amusing that you think it imploded a year ago just because a small number of users migrated here.

      • desktop_user [they/them]
        English
        1
        11 months ago

        It sort of did: thousands of useful comments were turned to gibberish, the mobile web site turned to shit, and the mobile app stopped working properly for communities with specific content warnings.

    • @Sweetpeaches69@lemmy.world
      English
      13
      edit-2
      11 months ago

      Completely. Lemmy is far too small to have the value Reddit does.

      I left Reddit due to their API bullshit, but I so miss all of the hobby communities I was part of, which had like-minded members and a plethora of resources. It ranges from not easy to impossible to start communities such as reeftanks, homesteading, literature, bookcirclejerk, etc. on a platform as small as Lemmy. And beyond starting one, the quality and quantity will never match Reddit’s because Lemmy just doesn’t have the same reach.

      Lemmy is great if you like Linux, like Star Trek, or are trans, but other than that, it’s missing so, so many of the demographics that make a holistic platform.

      • @sugar_in_your_tea@sh.itjust.works
        English
        3
        edit-2
        11 months ago

        trans

        I feel this so hard, the sheer number of openly LGBTQ+ people here really skews the demographics of the site. I’m not saying it’s a problem, just saying that LGBTQ+ people are dramatically over-represented here. It’s an interesting contributor to lemmy culture, and I wonder how much that impacts homogeneity here (e.g. upvotes and downvotes for certain types of content).

        But yeah, it’s missing a lot of demographics.

        That said, I’m really into Linux (been using for >15 years), so that’s cool I guess.

        • @explodicle@sh.itjust.works
          English
          1
          11 months ago

          As a cis straight man I’m taking this as a learning opportunity until the demographics level out. An inherently inclusive bias will be more helpful early on than more niche communities anyways.

          • @sugar_in_your_tea@sh.itjust.works
            English
            2
            11 months ago

            Sure. Again, I’m not saying it’s bad, just that the bias seems to exist.

            There are certainly worse biases that exist, such as very little representation from people on the right side of the spectrum, so hate against half the population seems to get a pass and downvotes silence constructive comments/posts just due to political bias. That’s incredibly frustrating, and I think the high focus on supporting LGBTQ+ people goes along with that (i.e. the message that conservatives “hate” LGBTQ+ people, which is only true for the more extreme end of conservatism).

            That said, I do like the support LGBTQ+ people get, I just wish the demographics were a bit more diverse without sacrificing the culture. I live and work in a conservative area, but my company has built a pretty inclusive culture (at least for the area), so I think it’s totally possible.

            • @explodicle@sh.itjust.works
              English
              1
              11 months ago

              Oh man I don’t miss that at all. Moderating out a pervasive delusion isn’t bias, any more than we’re biased in favor of a round Earth. On Reddit there were constant “enlightened centrists” who kept making appeals to moderation.

              There’s nothing of value to be gained from conservatives. The “good” ones who don’t say the homophobia out loud are still voting for politicians who do. If it was just the extreme end, then Trump wouldn’t be their nominee. Hate is their normal now.

              “If there’s a Nazi at the table and 10 other people sitting there talking to him, then you got a table with 11 Nazis.”

              • @sugar_in_your_tea@sh.itjust.works
                English
                2
                11 months ago

                This is exactly what I’m talking about: casually dismissing half of the population based on little more than association. That drives division and pushes people into echo chambers.

                • @explodicle@sh.itjust.works
                  English
                  1
                  11 months ago

                  Understood. I am disagreeing with you. If that wasn’t obvious, then I fear you may have missed my point.

                  Half of America supporting fascism is reason to create somewhere - anywhere - where that shit is shut down. You’re free to go associate with freeze peach Nazis on X, Facebook, Nostr, wherever. I don’t want any part of that and prefer a server that moderates them out. Paradox of tolerance and all that.

                  If you all believed the Earth was flat, then I would prefer the “echo chamber” of people saying “no, we checked, it’s round”. There simply being a lot of believers doesn’t imply an idea has merit, and we don’t have infinite time for BS.

    • @Ragnarok314159@sopuli.xyz
      English
      14
      11 months ago

      A lot of older posts are still relevant to specific hobbies. I will look up information on paper, some guitar information, but most posts from the last two years are not worth looking at.

      There is also so much regurgitated LLM shit.

    • LustyArgonian
      English
      8
      edit-2
      11 months ago

      u/spez, the former moderator of r/jailbait? Who might have connections to Ghislaine Maxwell? Him?

      Oh, and PS: posting OC porn on Reddit is a very weird process and not transparent.

  • paraphrand
    English
    4
    11 months ago

    It would be interesting if any large companies got behind promoting and endorsing federated media to get around this sort of situation.

    I suspect paying money is easier. And they probably assume their need to do so will be temporary. AGI will fix everything, right? Feel it. Feel the AGI.

  • @Makeitstop@lemmy.world
    English
    6
    11 months ago

    Just what we need, more walled gardens and exclusivity deals. And of course, another way of monetizing your data, because we don’t have enough of that already.

    Search results are already fucked enough as it is. We don’t need to start carving up the internet and dividing it among different search engines.

  • melroy
    7
    11 months ago

    We shouldn’t accept this behavior or other companies will follow!

  • mechoman444
    English
    22
    11 months ago

    I’ve said it once and I’ll say it again. Either the information on your site is free to all or to none. You can’t have some people/entities pay and some not!

    • @PlexSheep@infosec.pub
      English
      8
      11 months ago

      You can. We don’t need to like it, but they can. Besides, isn’t that how many magazines work? Pay for articles and such.

      • @vinyl@lemmy.world
        English
        7
        11 months ago

        Not really. The people who write the articles are actually employed by those magazine companies, and everyone who wants to get one needs to pay for one.