LLM scrapers are taking down FOSS projects’ infrastructure, and it’s getting worse.

  • @[email protected]
    70
    2 months ago

    Wow, that was a frustrating read. I did not know it was quite that bad. Just to highlight one quote:

    they don’t just crawl a page once and then move on. Oh, no, they come back every 6 hours because lol why not. They also don’t give a single flying fuck about robots.txt, because why should they. […] If you try to rate-limit them, they’ll just switch to other IPs all the time. If you try to block them by User Agent string, they’ll just switch to a non-bot UA string (no, really). This is literally a DDoS on the entire internet.

    • @[email protected]
      English
      31
      2 months ago

      the solution here is to require logins. thems the breaks unfortunately. it’ll eventually pass as the novelty wears off.

        • @[email protected]
          English
          6
          2 months ago

          Signups on most platforms are quite hard. You either hand over your phone number outright and do SMS verification, or you at least give an email address, and to register that email you will have to provide a phone number anyway. Captchas have become so hard that even humans struggle with them, and it often takes multiple attempts to get one right.

        • @[email protected]
          English
          2
          2 months ago

          not really, just tie it to SMS-style 2FA and the hurdle is large enough that most companies won’t bother.

        • @[email protected]
          English
          0
          2 months ago

          Make them mine a BTC block in the browser!


          ^Sorry, I’m low in blood and full of mosquito vomit. That’s probably making me think weird stuff.^

        • @[email protected]
          English
          2
          2 months ago

          This is exactly what we need to do. You’d think a FOSS WAF that can do this would exist out there somewhere.

          • LiveLM
            English
            3
            2 months ago

            There is. The screenshot you see in the article is a picture of a brand new one, Anubis.

            • @[email protected]
              English
              3
              2 months ago

              Yeah, I realised that after posting. I think we need a better one that lets legitimate users in more easily, though.
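
For anyone wondering what the "make the client do some work first" idea in this thread looks like, here is a minimal sketch of a hash-based proof-of-work challenge. It is not Anubis's actual code or anything from the article. The function names (`make_challenge`, `solve`, `is_valid`) and the `difficulty` parameter are made up for illustration, and a real deployment would run the solver as JavaScript in the visitor's browser rather than in Python.

```python
import hashlib
import secrets


def make_challenge() -> str:
    """Server side (hypothetical): issue a random challenge string."""
    return secrets.token_hex(16)


def is_valid(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: a single cheap hash verifies the client's work."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
    # Assumed rule: the hex digest must start with `difficulty` zeros.
    return digest.startswith("0" * difficulty)


def solve(challenge: str, difficulty: int) -> int:
    """Client side: brute-force a nonce until the digest qualifies.

    Expected work grows roughly 16x for each extra leading hex zero.
    """
    nonce = 0
    while not is_valid(challenge, nonce, difficulty):
        nonce += 1
    return nonce


if __name__ == "__main__":
    challenge = make_challenge()
    nonce = solve(challenge, difficulty=3)
    print("accepted:", is_valid(challenge, nonce, difficulty=3))
```

The asymmetry is the point: verifying costs one hash, while solving costs thousands. That is barely noticeable for one human loading one page, but it adds up for a crawler re-fetching everything every few hours. Lowering `difficulty` is the obvious knob for letting legitimate users in more easily, at the price of making scraping cheaper too.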