Hey everyone. As I’m sure you’re all aware, Lemmy.World has experienced several outages over the last few days.

We have been investigating the root cause of these outages. We believe they are related to our current hosting provider (Hetzner) blocking access from Cloudflare: we think Hetzner believes our CDN is a DDoS source, and the resulting blocks cause disconnects to our backend server. Problematic, for sure.

We’ve opened support tickets with our current provider and are awaiting a response. We have no issue with being as transparent as possible about downtime. Anyone who is curious can check out https://status.lemmy.world and https://dash.lemmy.world for up-to-the-minute outage information. We are also looking into other fediverse-friendly methods of posting status and outage updates.

In the meantime, we are evaluating alternative hosting options and solutions to provide a high level of reliability to you, our users. Really, we want to say thanks to everyone for soldiering through all our technical growing pains.

Cheers,
LW Infra Team
  • @BellaDonna@mujico.org
    52
    2 years ago

    Let me be real. I never noticed outages stopping. It feels like it’s daily and I’m used to it, but I think it happens so often that lemmy.world has lost its growth opportunity and we’ve alienated the normies. I’m still going to stay on Lemmy, and I believe you’re doing the best you can, but for the time being we’ve lost: the migration from Reddit to Lemmy is stunted.

    • We shouldn’t be trying to grow a single instance. That defeats the whole point of Lemmy. I started on Lemmy world and switched once I got fed up with the constant connection issues. Plus, Lemmy world blocked piracy communities so fuck that. I’m happy that I am able to quickly create an account on another instance.

    • @danielton@lemmy.world
      69
      2 years ago

      It didn’t help that almost every other general purpose instance blocked sign-ups in June and early July either, or required an essay on the application. Lemmy.world was the only one that was even trying at all, and I will commend them for that.

      Hopefully things will get better by the next time spez screws up. Because there will be a next time.

      • HTTP_404_NotFound
        English
        9
        2 years ago

        That’s partly because Lemmy itself has basically no moderation or administration features at all…

        So the only way to help with that issue is stricter enforcement up front.

        Besides, if someone doesn’t want to take the time to have a verified email and literally type 49 when filling out a registration application… I really don’t want to spend my time worrying about whether they might be spammers, etc.

      • gabe [he/him]
        12
        2 years ago

        Next time, the Lemmy join page needs to be improved so people spread out instead of trying to centralize on a single instance and defeating the purpose of Lemmy in the first place.

        • @danielton@lemmy.world
          24
          2 years ago

          Sure, but to be fair, there weren’t really many general-purpose instances that were accepting sign-ups from anybody when the Reddit bullshit went down in June. That’s part of why lemmy.world got as big as it did.

          Most people who ended up on Squabblr and Discuit instead went there because they didn’t have to write an essay to join or try to find a server that was accepting sign-ups and wasn’t down a lot.

          • @Blaze@discuss.tchncs.de
            5
            2 years ago

            Squabblr and Discuit

            Discuit is a place where 4142 people get together to find cool stuff and discuss things.

            Squabblr doesn’t have a count of active users (33k registered users)

            I know we are low on the numbers, but still a bit higher than them.

            • @danielton@lemmy.world
              1
              2 years ago

              OK, but I didn’t say more people ended up there than here. I was just stating the main reason people chose them over Lemmy.

              • @Blaze@discuss.tchncs.de
                1
                2 years ago

                To be honest, with such low numbers, I guess after a while they just came to Lemmy once instances were a bit more stable, or went somewhere else altogether.

                • @danielton@lemmy.world
                  3
                  edit-2
                  2 years ago

                  Squabblr decided to become a “Free Speech” platform and remove rules against LGBT hate speech, so they shot themselves in the foot. Almost everybody who was active on Squabblr moved to Discuit, but even Discuit has nowhere near the activity of Lemmy and kbin.

    • HTTP_404_NotFound
      English
      2
      2 years ago

      I never noticed outages stopping

      I am in the same boat. If it were not for these posts, I’d have never noticed lemmy world was down.

      If I post to, or put a comment on something from lemmy world, it will just federate over when it’s back online.

      If someone posts to something on lemmy world, it will eventually federate over my way.

      But, hey, everyone is gonna fuss when they decided to put all of their eggs in one basket and now, that one basket gets targeted. (metaphor for lemmy world.)

  • The Picard Maneuver
    5
    edit-2
    2 years ago

    Thanks for keeping things running as Lemmy grows!

    For what it’s worth, I’m on all the time and have barely noticed the outages.

      • Eames
        English
        1
        2 years ago

        When one mentions “Hetzner”, it doesn’t immediately evoke cloud services in the same vein as AWS, GCP, or Azure. For instance, while SAP offers “cloud” solutions, it might equate to a single server in Hannover. I hope this clarifies the context of my question :)

  • Machefi
    81
    2 years ago

    I don’t blame you for this, but the uptime records are incomplete at best. I’ve experienced the site being down (and confirmed with Down for Everyone or Just Me), yet status.lemmy.world showed all systems operational. As I’m writing this, status.lemmy.world is missing most data up to yesterday and dash.lemmy.world shows 16 days uptime.

    I have lots of respect for you for even having these. I also remember status.lemmy.world working mostly fine some time ago. But as of right now, both uptime monitors fail to serve their purpose.

    • @lwadmin@lemmy.world OP M
      English
      81
      edit-2
      2 years ago

      You need to hover over the status bar to see if there is any downtime for that day. We can enable it to log an incident every time there is a burp, but we are still tuning alerts, as we currently only have it create an incident when we ACK it in PagerDuty. You can always check the dashboard for up-to-the-minute stats, as well as https://lemmy-status.org/endpoints/_lemmy-world. We’ll add this info to make things clearer <3

      EDIT: Added more info to our status page, thanks for the feedback Machefi!

      EDIT2: Also the missing data is due to us removing and adding more specific monitors for the different infra services.

      • Obinice
        26
        2 years ago

        Excuse me stop being so cool, you’re raising the bar too high for everyone else thank you

  • @Tolstoshev@lemmy.world
    English
    10
    2 years ago

    Is there anything we can do to help? Donations? Tech volunteers? Visit hosting company with a baseball bat?

    • HEISENBERG
      6
      2 years ago

      This is about what’s been happening over the last few days, I think? Lemmy World has been a lot more stable compared to even a week ago but still has some daily downtime. But not for hours every day anymore.

  • @perfectra1n@lemmy.world
    English
    42
    edit-2
    2 years ago

    On your Cloudflare account, if there was a change in the CNAME/A record being proxied vs. DNS only, that could cause an issue, as Cloudflare would then strip headers off the request that your Apache/Nginx would be looking for.

    If you enabled HTTP DDoS protection in your Security -> WAF tab (I think that’s where it is) that could do this too. Might be worth disabling.

    Also check for any headers your HTTP load balancer might be expecting, that Cloudflare could be stripping.

    Might be worth tailing the webserver logs to see what happens to requests coming in from Cloudflare.
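
For the log check above, here is a rough sketch of how one might summarize what happens to requests arriving from Cloudflare. It assumes the access log uses the common "combined" format and that the first field is the connecting peer's address (not a visitor IP restored from a header). The names `CLOUDFLARE_RANGES`, `is_cloudflare`, and `summarize` are made up for illustration, and the ranges listed are only a small subset, so pull the current list from Cloudflare before trusting the result.

```python
import ipaddress
import re
from collections import Counter

# Assumption: a small illustrative subset of Cloudflare's IPv4 ranges.
# Use the full, current list that Cloudflare publishes for a real check.
CLOUDFLARE_RANGES = [
    ipaddress.ip_network(cidr)
    for cidr in ("173.245.48.0/20", "104.16.0.0/13", "172.64.0.0/13", "162.158.0.0/15")
]

# Combined log format: ip ident user [time] "request" status size ...
LOG_LINE = re.compile(r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "[^"]*" (?P<status>\d{3})\b')


def is_cloudflare(ip_text):
    """Return True if the address falls inside one of the listed ranges."""
    try:
        ip = ipaddress.ip_address(ip_text)
    except ValueError:
        return False  # not an IP address at all (e.g. a hostname)
    return any(ip in network for network in CLOUDFLARE_RANGES)


def summarize(lines):
    """Count responses per (source, status); lines that do not parse are skipped."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue
        source = "cloudflare" if is_cloudflare(match["ip"]) else "other"
        counts[(source, match["status"])] += 1
    return counts
```

Feeding it the last few thousand log lines (for example `summarize(open("access.log"))`, with whatever path your server actually uses) should make it obvious whether 4xx/5xx responses cluster on the Cloudflare side.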