Hey everyone, as I’m sure you’re all aware, Lemmy.World has been experiencing several outages over the last few days.
We have been investigating the root cause of these outages and believe they are related to our current hosting provider (Hetzner) blocking access from Cloudflare. We think Hetzner believes our CDN is a DDoS’er, which is causing these disconnects to our backend server. Problematic for sure.
We’ve opened support tickets with our current provider and are awaiting a response. We have no issue with being as transparent as possible about downtime. Anyone who is curious can check out https://status.lemmy.world and https://dash.lemmy.world for up-to-the-minute outage information. We are also looking into other fediverse-friendly methods of posting status and outage updates.
In the meantime, we are evaluating alternative hosting options and solutions to provide a high level of reliability to you, our users. Really, we want to say thanks to everyone for soldiering through all our technical growing pains.
Cheers
- LW Infra Team
Thanks for the update!
So what you’re saying is we were the ddosers all along 😹
The hack is coming from inside the instance!
Ironically true lol
Let me be real. I never noticed outages stopping. It feels like it’s daily. I’m used to it, but I think it happens so often that lemmy.world has lost its growth opportunity, and we alienated the normies. I’m still going to stay on Lemmy, and I believe you’re doing the best you can, but we lost: for the time being, the migration to Lemmy from Reddit is stunted.
We shouldn’t be trying to grow a single instance. That defeats the whole point of Lemmy. I started on Lemmy world and switched once I got fed up with the constant connection issues. Plus, Lemmy world blocked piracy communities so fuck that. I’m happy that I am able to quickly create an account on another instance.
It didn’t help that almost every other general purpose instance blocked sign-ups in June and early July either, or required an essay on the application. Lemmy.world was the only one that was even trying at all, and I will commend them for that.
Hopefully things will get better by the next time spez screws up. Because there will be a next time.
Partly due to the fact… lemmy itself, basically has no moderation or administration features at all…
So, the only way to assist with that issue, is stricter enforcement up-front.
Besides, if someone doesn’t wanna take the time to have a verified email, and literally type 49 when registering an application… I really don’t wanna take the time to worry about them potentially being spammers/etc.
Next time the lemmy join page needs to be improved so people can spread and don’t try to centralize into a single instance and break the purpose of lemmy in the first place.
Sure, but to be fair, there weren’t really many general-purpose instances that were accepting sign-ups from anybody when the Reddit bullshit went down in June. That’s part of why lemmy.world got as big as it did.
Most people who ended up on Squabblr and Discuit instead went there because they didn’t have to write an essay to join or try to find a server that was accepting sign-ups and wasn’t down a lot.
Lemmy.world itself is bigger than Squabblr and Discuit combined
I didn’t say most people ended up on Squabblr or Discuit.
Squabblr and Discuit
Discuit is a place where 4142 people get together to find cool stuff and discuss things.
Squabblr doesn’t have a count of active users (33k registered users)
I know we are low on the numbers, but still a bit higher than them.
OK, but I didn’t say more people ended up there than here. I was just stating the main reason people chose them over Lemmy.
To be honest, with such low numbers, I guess after a while they just came to Lemmy once instances were a bit more stable, or went somewhere else altogether.
Squabblr decided to become a “Free Speech” platform and remove rules against LGBT hate speech, so they shot themselves in the foot. Almost everybody who was active on Squabblr moved to Discuit, but even Discuit has nowhere near the activity of Lemmy and kbin.
I never noticed outages stopping
I am in the same boat. If it were not for these posts, I’d have never noticed lemmy world was down.
If I post to, or put a comment on something from lemmy world, it will just federate over when it’s back online.
If someone posts to something on lemmy world, it will eventually federate over my way.
But, hey, everyone is gonna fuss when they decided to put all of their eggs in one basket and now, that one basket gets targeted. (metaphor for lemmy world.)
Thanks for keeping things running as Lemmy grows!
For what it’s worth, I’m on all the time and have barely noticed the outages.
I think it got A LOT better the last week or so
Thanks for the updates and transparency.
Thanks for sharing the dash link, and for your work!
Cloud is not an option?
That is the cloud.
When one mentions “Hetzner”, it doesn’t immediately evoke cloud services in the same vein as AWS, GCP, or Azure. For instance, while SAP offers “cloud” solutions, it might equate to a single server in Hannover. I hope this clarifies the context of my question :)
I don’t blame you for this, but the uptime records are incomplete at best. I’ve experienced the site being down (and confirmed with Down for Everyone or Just Me), yet status.lemmy.world showed all systems operational. As I’m writing this, status.lemmy.world is missing most data up to yesterday and dash.lemmy.world shows 16 days uptime.
I have lots of respect for you for even having these. I also remember status.lemmy.world working mostly fine some time ago. But as of right now, both uptime monitors fail to serve their purpose.
You need to hover over the status bar to see if there is any downtime for that day. We can enable it to log incidents every time there is a burp, but we are still tuning alerts, as we only have it create an incident when we ACK it in PagerDuty. You can always check the dashboard for up-to-the-minute stats, as well as https://lemmy-status.org/endpoints/_lemmy-world. We’ll add this info to make things clearer <3
EDIT: Added more info to our status page, thanks for the feedback Machefi!
EDIT2: Also the missing data is due to us removing and adding more specific monitors for the different infra services.
Excuse me stop being so cool, you’re raising the bar too high for everyone else thank you
You guys rule. Thanks for all the hard work.
Is there anything we can do to help? Donations? Tech volunteers? Visit hosting company with a baseball bat?
I imagine spreading out across instances, and having accounts in instances beyond the biggest one helps, as it reduces traffic and strain on lemmy.world and its servers.
lemmy.world still deserves support, like with donations, though helping reduce the strain is something anyone can do. It’s a small thing, but it’s free of charge.
For both recurring and one-time donations, follow this link: https://opencollective.com/mastodonworld
For recurring donations only, visit our Patreon page from here: https://www.patreon.com/mastodonworld
The transparency is appreciated.
So the recent reports about ddos attacks were false alarms?
False positives
This is about what’s happening the last few days I think? Lemmy World has been a lot more stable compared to even a week ago but still has some daily downtime. But not for hours every day anymore
On your Cloudflare account, if there was a change in the CNAME/A record being proxied vs. DNS only, that could cause an issue, as Cloudflare would then strip headers off the request that your Apache/Nginx would be looking for.
If you enabled HTTP DDoS protection in your Security -> WAF tab (I think that’s where it is) that could do this too. Might be worth disabling.
Also check for any headers your HTTP load balancer might be expecting, that Cloudflare could be stripping.
Might be worth tailing the webserver logs to see what happens to requests coming in from Cloudflare.
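If it helps, here’s a rough Python sketch of that log check: it groups access-log lines by whether the client IP falls in a Cloudflare range and by HTTP status, so a spike of errors on Cloudflare traffic stands out. Several things here are my own assumptions, not anything from the LW setup: the log is in the default Nginx/Apache "combined" format, the origin logs Cloudflare’s edge IP as the client (no real-IP rewriting configured), and the range list is only a small sample of Cloudflare’s published IPv4 ranges that may be out of date. The names `CLOUDFLARE_NETS`, `is_cloudflare` and `summarize` are made up for illustration.

```python
import ipaddress
import re
from collections import Counter

# Illustrative sample of Cloudflare's published IPv4 ranges.
# Assumption: this subset may be incomplete or stale; refresh it from
# Cloudflare's published IP list before relying on the results.
CLOUDFLARE_NETS = [
    ipaddress.ip_network(net)
    for net in ("173.245.48.0/20", "104.16.0.0/13", "172.64.0.0/13", "162.158.0.0/15")
]

# Assumes the "combined" log format:
#   ip ident user [time] "request" status size "referer" "user-agent"
LINE_RE = re.compile(r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" (?P<status>\d{3}) ')


def is_cloudflare(ip: str) -> bool:
    """Return True if ip parses and falls inside one of the sample ranges."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        return False
    return any(addr in net for net in CLOUDFLARE_NETS)


def summarize(lines):
    """Count requests per (source, status); lines that don't match are skipped."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue
        source = "cloudflare" if is_cloudflare(match["ip"]) else "other"
        counts[(source, match["status"])] += 1
    return counts


# Made-up sample lines, not real lemmy.world traffic.
sample = [
    '172.68.10.5 - - [01/Aug/2023:10:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '172.68.10.6 - - [01/Aug/2023:10:00:01 +0000] "GET /api/v3/post/list HTTP/1.1" 502 0 "-" "-"',
    '203.0.113.9 - - [01/Aug/2023:10:00:02 +0000] "GET / HTTP/1.1" 200 512 "-" "-"',
    'not a log line',
]
result = summarize(sample)
for key in sorted(result):
    print(key, result[key])
```

To run it against a real file you could pass an open file handle to `summarize`, e.g. `summarize(open(path))` with `path` pointing at wherever your webserver writes its access log. If the origin already rewrites the client IP from a Cloudflare header, this grouping won’t work as-is and you’d need to log the connecting IP separately.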