PSA: You can upload images to a Lemmy instance without anyone knowing

@[email protected] · edit-2 2 years ago

Dandroid · 2 years ago

I’m an instance administrator, what the fuck do I do?

Check your pictrs images (good luck) or nuke it. Disable pictrs, restrict sign ups, or watch your database like a hawk. You can also delete your instance.

How? I have checked, and there doesn’t seem to be any way to see the photos on my server.

I actually shut down pictrs entirely on my instance. Running pictrs in its current state is criminally negligent imo.

@[email protected] · 2 years ago

They are stored in the pictrs folder. They don’t have file extensions but are viewable with many image programs.
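
For anyone working only from the command line, here is a minimal Python sketch of how extension-less files can be identified as images by their leading bytes. The folder path passed to `scan_folder` is whatever your own pictrs volume is (an assumption you need to fill in), and the signature table only covers a few common formats.

```python
from pathlib import Path
from typing import Optional

# Leading "magic" bytes for a few common image formats.
MAGIC = {
    b"\x89PNG\r\n\x1a\n": "png",
    b"\xff\xd8\xff": "jpeg",
    b"GIF87a": "gif",
    b"GIF89a": "gif",
}


def sniff_image_type(header: bytes) -> Optional[str]:
    """Guess an image format from the first bytes of a file, or return None."""
    for magic, name in MAGIC.items():
        if header.startswith(magic):
            return name
    # WebP is a RIFF container with "WEBP" at byte offset 8.
    if header[:4] == b"RIFF" and header[8:12] == b"WEBP":
        return "webp"
    return None


def scan_folder(root: str) -> dict:
    """Map every file under root to its guessed image type (None if unknown)."""
    results = {}
    for path in Path(root).rglob("*"):
        if path.is_file():
            with path.open("rb") as fh:
                results[str(path)] = sniff_image_type(fh.read(16))
    return results
```

This only tells you that a file is an image and of which kind. It says nothing about the content, so it helps with counting and sorting files, not with reviewing them.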

Dandroid · 2 years ago

Oh, I see. I only use command line on my server, so I didn’t realize they were actual photos. Thanks!

@[email protected] · 2 years ago

I’m an instance administrator, what the fuck do I do?

You forgot: Don’t allow other users but yourself ;)

@[email protected] · 2 years ago

Good point but also consider disabling pictrs until they fix the caching problem!

@[email protected] · 2 years ago

Yeah, I kinda hate it as I can’t easily post pictures anymore, now, but I have been worrying recently that some idiot would bring the police to my door. Just disabled pict-rs for my instance :/

@[email protected] · 2 years ago

Is there a description on how to disable pictrs?

@[email protected] · 2 years ago

Remove it from docker compose.

@[email protected] · 2 years ago

I’m surprised and delighted that this simply works :D

Dandroid · 2 years ago

It didn’t work that easily for me. I had to redirect all pictrs traffic to 404 in my nginx config.

@[email protected] · 2 years ago

Hm, I use caddy, and it just worked, why would you need to 404 it? That should be the default behavior if there’s nothing there.

Dandroid · 2 years ago

I set up mine manually using docker, and it gave me a site-wide error message when I simply commented out the pictrs container and dependencies. I made a post on [email protected] and someone suggested that I 404 it. I made that change, and now it works.

@[email protected] · 2 years ago

Man, Lemmy devs have zero clue about best practices… What a crap show!

@[email protected] · 2 years ago

These kinds of issues are common on any large software project.

@[email protected] · 2 years ago

Mmm, not really.

@[email protected] · 2 years ago

Are you speaking from experience?

@[email protected] · 2 years ago

wouldn’t it be just as easy to whitelist DNS?

@[email protected] · 2 years ago

What would you whitelist? And what would that help?

@[email protected] · 2 years ago

An option to prevent users from uploading unless their DNS has been whitelisted. It would require explicit permission to upload, which could be handy for smaller instances.

@[email protected] · edit-2 2 years ago

You mean their IP? And restricting things to IPs is a) not that easy while having federation still work and b) IPs change.

Also, in that case you might as well close registration and only have limited accounts like most of us self-hosters have.

@[email protected] · 2 years ago

Not IP. DNS whitelist. This way if a geography or subnet is responsible for illegal material they are only allowed in if an instance granted +w.

@[email protected] · 2 years ago

But how is that related to DNS? That would be geoblocking.

@[email protected] · 2 years ago

Every person on the internet has a DNS record that loops back to them. The DNS has a topography so that various elements of a domain could be whitelisted, or not.

It would be trivial to queue a request to white list, where an administrator could decide if it is worth it, having it auto expire over time.

Instance admins could share sources of bad actors.

heuristics could help determine the risk of an approval action.

@[email protected] · 2 years ago

I really don’t understand what you are talking about. DNS is the Domain Name System, why would every person have a DNS entry?

@[email protected] · 2 years ago

Yes it would, if the problem had anything to do with the DNS.

@[email protected] · 2 years ago

The problem is people and people have DNS.

@[email protected] · 2 years ago

Try turning wifi off on your phone, getting the IP address, and then looking up the DNS entry for that and consider if you want to whitelist that? And then do this again tomorrow and check to see if it has a different value.

Once you get to the point of “whitelist everything in *.mobile.att.net” it becomes pointless to maintain that as a whitelist.

Likewise *.dhcp.dorm.college.edu is not useful to whitelist.

@[email protected] · edit-2 2 years ago

Yes. I am well aware and that would be by design.

remember - if someone on a major mobile network is uploading child photography, that device is radioactive and an instance admin is going to have options they may not have in other situations.

The idea is give instance admins control over who uploads content. Perhaps they don’t want mobile users to upload content, or perhaps they do but only major carriers, by their own definition of major.

Somewhere between “everyone” and “nobody” is an answer.

giving the instance administrator tools to help quarantine bad actors only helps, which will require layers. Reverse DNS is a cost, however; perhaps the tax is worth it when hosting images, where there is already a pause point in the end user experience, and the ramifications so severe.

Larger instances may dilligaf but a smaller instance may need to be very careful…

Just sayin…

@[email protected] · 2 years ago

Do you have a good and reasonable reverse DNS entry for the device you’re writing this from?

FWIW, my home network comes nat’ed out as {ip-addr}.res.provider.com.

Under your approach, I wouldn’t have any system that I’d be able to upload a photo from.

@[email protected] · 2 years ago

why do you say that, knowing full well DNS whitelists rely on wildcards?

@[email protected] · 2 years ago

If you’re whitelisting *.res.provider.com and *.mobile.att.com the whitelist is rather meaningless because you’ve whitelisted almost everything.

If you are not going to whitelist those, do you have any systems available to you (because I don’t) that would pass a theoretical whitelist that you set up?

@[email protected] · 2 years ago

Why does it matter? Read some of my other posts.

@[email protected] · 2 years ago

Would you be able to post an image if neither *.res.provider.com nor *.mobile.att.com were whitelisted and putting 10-11-23-45.res.provider.com (and whatever it will be tomorrow) was considered to be too onerous to put in the whitelist each time your address changed?
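
To make the wildcard problem concrete, here is a small Python sketch of what a reverse-DNS upload whitelist check might look like. Nothing like this exists in Lemmy as far as this thread shows; `reverse_dns` and `is_whitelisted` are hypothetical helper names, and the hostnames are the ones used in the comments above.

```python
import socket
from fnmatch import fnmatchcase
from typing import Iterable, Optional


def reverse_dns(ip: str) -> Optional[str]:
    """Return the PTR hostname for an IP address, or None if the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


def is_whitelisted(hostname: str, patterns: Iterable[str]) -> bool:
    """True if the hostname matches any shell-style wildcard pattern."""
    host = hostname.lower().rstrip(".")
    return any(fnmatchcase(host, pattern.lower()) for pattern in patterns)


# A broad wildcard admits every customer of the provider.
print(is_whitelisted("10-11-23-45.res.provider.com", ["*.res.provider.com"]))
# An exact entry stops matching as soon as the address changes.
print(is_whitelisted("10-11-23-46.res.provider.com", ["10-11-23-45.res.provider.com"]))
```

The first call prints `True` and the second prints `False`, which is the trade-off being argued about: the pattern is either too broad to filter anything or too narrow to survive an address change.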

@[email protected] · 2 years ago

Explain.

@[email protected] · 2 years ago

He probably means whitelisting domains when posting already uploaded images, clearly not having read the post

@[email protected] · 2 years ago

That’s another issue. Also a necessary feature.

@[email protected] · 2 years ago

No I mean the user’s DNS should be whitelisted to permit uploads. If DNS not on whitelist then no upload, period.

@[email protected] · 2 years ago

What do you mean by “the user’s DNS” exactly??

@[email protected] · 2 years ago

Pedo trolls will be the death of Lemmy, you heard it here first!

@[email protected] · edit-2 2 years ago

Part of the problem with having an illegal series of bits. Of course people are going to use that as a weapon.

I don’t think those images should be made fully legal, but maybe we should calm the fuck down about two notches. We should keep in mind that the real crime is creating the pictures. Being effectively legal bombed by them is kind of ridiculous. As is having to keep the detection tools secret.

If you’re on a grand jury for csam, maybe you should actually see the evidence (with limited censorship) before you indict someone.

Maybe I’m wrong, but I don’t think seeing a small number of pictures is going to scar you for life. I’ve seen goatse. I’ve seen people decapitated. It’s not pleasant, and I avoid those things, but it’s not scarring.

The Station Nightclub Fire is scarring. I’ve recommended that video to people because it’s scarring in a way that can save lives. Seeing that stuff every day would absolutely be scarring.

I don’t want to see that kind of stuff to become common, but I am disturbed that people are afraid of unused images hiding on their Lemmy server.

@[email protected] · 2 years ago

Fuck you, pedo.

@[email protected] · 2 years ago

Regardless of the debate of whether admins should be legally liable for not deleting unknown child abuse digital files,

Maybe I’m wrong, but I don’t think seeing a small number of pictures is going to scar you for life. I’ve seen goatse. I’ve seen people decapitated. It’s not pleasant, and I avoid those things, but it’s not scarring.

You shouldn’t use your own experiences to make this generalisation, given that people working at agencies prosecuting pederasts often have to receive therapy or even leave the job after continued exposure.

I am disturbed that people are afraid of unused images hiding on their Lemmy server.

Don’t you think it’s logical for someone to be worried about being vulnerable to being accused of what likely is, in many legal systems, a crime?

@[email protected] · 2 years ago

Yeah, I think continued exposure is different than a one off thing. It’s why I used the Grand Jury example.

And I do think it’s logical. That’s the problem. My entire point is that csam shouldn’t be so easy to weaponize.

Maybe seeking, selling, or intentionally distributing should be the crime.

@[email protected] · 2 years ago

Which is why we need to act now.

@[email protected] · 2 years ago

Wasn’t Facebook also found to store images that were uploaded but not posted? This is just a resource leak. I can’t believe no one has mentioned this phrase yet. I’m more concerned about DoS attacks that fill up the instance’s storage with unused images. I think the issue of illegal content is being blown out of proportion. As long as it’s removed promptly (I believe the standard is 1 hour) when the mods/admins learn about it, there should be no liabilities. Otherwise every site that allows users to post media would be dead by now.

@[email protected] · 2 years ago

I’m a pentester and security consultant. From my point of view, this vulnerability has more impact than just a resource leak or DoS. We all know how often CSAM or other illegal material is uploaded to communities here as actual posts (where hundreds of viewers run into it to report it). Now imagine them uploading it and spreading it like this, and only the admin can catch it if they go out of their way to check it?

I wouldn’t call this a high risk issue for sure. But a significant security risk regardless.

@[email protected] · 2 years ago

Whether it’s illegal content or storage-filling DoS attacks, the issue needs to be addressed.

@[email protected] · 2 years ago

A lot of web software does this (Github and Gmail for example). I like it but always thought it could be abused.

@[email protected] · 2 years ago

They probably have the tools to deal with it. Lemmy certainly doesn’t.

@[email protected] · 2 years ago

You mean Gmail drafts? I know of at least one case where criminals used this: they shared the Gmail account password and messaged each other only via the drafts function. So technically there was never a mail sent.

Nerd02 · 2 years ago

I’m an instance administrator, what the fuck do I do?

There’s one more option. The awesome @[email protected] has made this tool to detect and automatically remove CSAM content from a pict-rs object storage.

https://github.com/db0/lemmy-safety

Xylight (Photon dev) · 2 years ago

You need a GPU for that. Most $5 VPSs don’t have that.

Nerd02 · 2 years ago

Yeah I know. It’s supposed to be run from your computer, not the VPS.

Xylight (Photon dev) · edit-2 2 years ago

Would I mount the pictrs folders as a network folder locally?

Nerd02 · 2 years ago

No. Unfortunately it only works with object storage like S3 buckets, not with filesystem storage. Meaning it accesses the files remotely one at a time from the bucket, downloading them over the internet (I assume, I didn’t make this).

But the more important thing is that, as it states in the readme, no files get saved to your disk, they only stay in your RAM while they are being processed and everything is deleted right after. This is relevant because even having had CSAM on your disk at some point can put you in trouble in some countries, with this tool it never happens.

Which btw is the same reason why mounting the pict-rs folder to your local computer is probably not a good idea.

db0 · 2 years ago

Theoretically this tool could be adjusted to go via scp and read your filesystem pict-rs storage as well. Someone just has to code it.

Nerd02 · 2 years ago

Interesting. That would be a nice extension, I think most small admins are using the filesystem (I know I am lol).

@[email protected] · 2 years ago

This is a nice tool but orphaned images still need to be purged. Mentioned on the other thread that bad actors can upload spam to fill up object storage space.

Nerd02 · 2 years ago

That is also very true. I think better tooling for that might come with the next pict-rs version, which will move the storage to a database (right now it’s in an internal key-value storage). Hopefully that will make it easier to identify orphaned images.

Dandroid · 2 years ago

I tried getting this to run in a container, but I was unable to access my GPU in the container. Does anyone have any tips on doing that?

Nerd02 · 2 years ago

Sorry, I haven’t run this myself yet, nor do I have any experience with that kind of issue. But may I ask why you were concerned with running it inside of a container? Seems rather unnecessary to me.

Dandroid · 2 years ago

Running anything in a container isn’t necessary. It just makes it easier to run, as it comes with all the dependencies. And if you decide you don’t want it anymore, you can just remove the container and it and all its dependencies are gone, which is really clean. It also makes the environment extremely repeatable, so people on all distros can run it with the exact same steps. And you don’t need to worry about what version of python you have and if it’s compatible with the dependencies. For example, the dependencies for this script require python 3.10 exactly. You can’t use 3.9 or 3.11.

So really the only reason was I wanted to make it easier for everyone.
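
Outside a container, the same constraint can at least be made to fail early. A minimal sketch, assuming the "3.10 exactly" requirement stated above is accurate (it is taken from this comment, not checked against the project); `version_ok` is a hypothetical helper name.

```python
import sys

# Assumption from the comment above: the script's dependencies need exactly 3.10.
REQUIRED = (3, 10)


def version_ok(version_info=sys.version_info, required=REQUIRED) -> bool:
    """True if the interpreter's major.minor version equals the required one."""
    return tuple(version_info[:2]) == tuple(required)


if not version_ok():
    # Report the mismatch up front instead of failing deep inside an import.
    print(
        "Python %d.%d required, found %d.%d"
        % (REQUIRED + tuple(sys.version_info[:2])),
        file=sys.stderr,
    )
```

A container or conda env still solves the actual problem; this only turns a confusing dependency error into a clear message.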

Nerd02 · 2 years ago

I see. I considered the dependency problem but only thought of using a venv to fix that, however you are right, the python version is also often the cause of compatibility issues.

Dandroid · 2 years ago

Honestly, I’m a little sick of needing to make a venv for each python script, which is why I’m trying to put all my python scripts in containers. I already got db0’s project to the point where anyone can run it with one single command line that you can just copy/paste (assuming you have docker installed already). It is just running on the CPU, which is painfully slow.

Nerd02 · 2 years ago

Same, it’s the reason why I can’t stand working with python.

Thank you for doing this, btw. Once you have something working on your hands you could consider spreading the word, maybe to db0 himself. I sure would love a convenient way to run that script, and many other admins probably would too.

db0 · edit-2 2 years ago

You can go one step further and use a conda env, which would also include the proper python version. All you need then is the micromamba binary. I might develop that as all it would need is to run a shell script to start

@[email protected] · 2 years ago

Not sure how you’re trying to run it in a container, but the answer would depend on a bunch of different factors. Nvidia has a utility you can install that assists in exposing the GPU to the container, documentation found here.

If you’re using docker compose to run it as a service, there’s a doc page for that too. Note that it uses the previous page I mentioned as prerequisite.

There’s another way to get it working from within kubernetes that comes up every now and then on stackoverflow.

If it’s Intel or AMD, no idea if this still applies.

Dandroid · 2 years ago

Yes, this is exactly what I had trouble with. The Nvidia container runtime seems to not support my distro. But even when I tried running it on my Ubuntu machine, I was getting tons of dead links using Nvidia’s instructions. And even when I fixed the links, I was getting issues such as the apt repository throwing errors. IIRC, it was some kind of signature issue, and I’m not sure I want to ignore that, especially considering I had to fudge the URL.

I’m thinking the best option is to build from source, but I don’t think that’s easier than just running this in a non-container.

Rentlar · 2 years ago

How come this hasn’t been addressed before?

Because pictrs and most other components of Lemmy were designed for a much smaller use case by a very small development team. It was designed primarily by people volunteering their time and expertise. Most of the contributors have other things to do on a full-time basis. If you really want to see a change like this implemented NOW, then code it in yourself, file a new issue directly on their page with potential solutions, or donate to the people working on it.

Your post is good for the most part, but my patience is limited for the kind of entitled attitude you show under that heading specifically. Thanks for hearing me out.

@[email protected] · 2 years ago

Entitlement? The “Subtitles” are acting as a Question the reader may have, and below the answer, OP is not demanding anything

Rentlar · 2 years ago

Fair point. That question itself is what bothers me even if it is a valid one people have on their minds. The answer to that question should highlight more clearly what has been done, and if OP doesn’t know, then IMO it would best be to not include that question/answer.

I have no problems with OP’s post or with bringing up this issue and discussing it. Including that question with an incomplete answer bothers me like a clickbait headline for an article does, or how Tucker Carlson’s show asks questions. This serves little purpose but to put the people working on fixes in a bad light, acting like they haven’t been working on anything.

bermuda · 2 years ago

I’m glad to live in a world where concern about safety is considered entitlement somehow

@[email protected] · 2 years ago

Entitled attitude? I’m just bringing it up again. It was brought up some time ago but wasn’t given attention so I’m bringing it up again after the recent CSAM attacks.

I didn’t demand anything in the post. I brought up the issue, explained why it’s important, and what admins could do about it.

I don’t know how to code but that doesn’t mean I’m not allowed to bring this issue to light…

Rentlar · edit-2 2 years ago

I have no issue with your post itself or with discussing this issue; it is important to highlight things like this. Thank you for bringing it up, and sorry if I sound mad at you for doing that.

I will point out, the specific thing that bothers me is that the heading

How come this hasn’t been addressed before?

contains an incomplete answer that ignores work that is currently in progress by devs to address. I don’t blame you for not knowing the answer but for including and answering that question when you don’t know the answer. To me it’s reminiscent of Tucker Carlson-style questioning, where some issue is brought up, questions are asked but then the answer is sparsely researched and the viewer is expected to come to some conclusion of who to blame. This specifically is what gets on my nerves.

If you can include where work to rectify the issue had been discussed and is in progress like github issues, discussion throughout Lemmy and other things, I’ll edit my first reply to note my concern is assuaged.

E: Here are some of the relevant issues and discussion:

This sort of thing has been discussed since 2020
Adding admin purging of DB items and pictures. #904 #1331 #1809
#3504, #2277, these and related issues are still open.

@[email protected] · 2 years ago

I don’t care if you don’t like my English writing. I brought up the issue and if people don’t care about it then whatever. We’ll just have to wait until it’s abused then maybe people will be actually concerned.

@[email protected] · 2 years ago

OP is flagging a legitimate issue that can actually put instance owners at risk. Raising the issue that instance owners can unwittingly host illegal content and be liable for it - how is that entitled?

Totally understand that Lemmy devs are a small team, but the growth of use of the software is exploding now, and not being able to keep up is a problem of scale - gatekeeping others from raising issues does not help it get better and in fact discourages issue reports and promotes a head-in-the-sand culture.

Rentlar · edit-2 2 years ago

I understand and raising the issue and discussion is fine. With all due respect to OP, I take it personally when the discussion is framed with the implication that the developers should not have released a project with some bugs and they should have put more effort here or there. I’ve contributed to Lemmy both in coding, translation and small donations, but I’m not here for people to push blame on devs. This is why bringing up the question “Why hasn’t anything been done?”, while I recognize it is a question on some people’s minds, it gets on my nerves. It bothers me like a clickbait/ragebait title does for many.

I would rather the discussion focus on where efforts are made or will be made to mitigate and fix the problem.

AphoticDev · 2 years ago

This can be solved very easily by a cron job to clean out the folder periodically, if you’re worried about it.

@[email protected] · 2 years ago

Very easily you say? Maybe tell us what this cron job is so we can all add it?

AphoticDev · 2 years ago

Just make a cron that runs the rm command every day or whatever to clean out the files. Then run a SQL query at the same time to truncate any draft posts in the database. There’s no logic to this method, it just clears out the files and records related to draft posts, but it’s fast and effective.

There’s a small chance it might fuck somebody up if they were writing a post at that exact moment, but you can schedule the cron for when your instance is the quietest.
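
For those asking what such a job could look like, here is a rough Python sketch. It is not the exact method described above: instead of a blanket `rm`, it swaps in an age filter and a dry-run default, and it does not touch the database at all. The folder path and cron schedule in the comments are placeholders. Like the blanket approach, it cannot tell orphaned files from images that are in use, so everything older than the cutoff is treated the same.

```python
import time
from pathlib import Path

# Hypothetical crontab entry (paths are placeholders, adjust to your setup):
#   0 4 * * * /usr/bin/python3 /path/to/clean_uploads.py


def stale_files(root, max_age_days, now=None):
    """Return files under root whose modification time is older than the cutoff."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.stat().st_mtime < cutoff
    )


def clean(root, max_age_days, dry_run=True):
    """List stale files, and delete them only when dry_run is False."""
    targets = stale_files(root, max_age_days)
    if not dry_run:
        for path in targets:
            path.unlink()
    return targets
```

Running it with `dry_run=True` first shows what would be removed, which is a cheap way to find out how much of the folder the job would actually wipe.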

@[email protected] · 2 years ago

Because anyone can upload illegal images without the admin knowing and the admin will be liable for it.

The admin/company isn’t liable until it is reported to them and they don’t do anything about it… That’s how all social media sites work, Google isn’t immediately liable if you upload illegal materials to GDrive and share it anonymously.

@[email protected] · 2 years ago

Doesn’t change the fact that this is an issue. Besides, do you think American law applies everywhere?

@[email protected] · edit-2 2 years ago

FYI to all admins: with the next release of pict-rs, it should be much easier to detect orphaned images, as the pict-rs database will be moved to postgresql. I am planning to build a hashtable of “in-use” images by iterating through all posts and comments by lemm.ee users (+ avatars and banners of course), and then I will iterate through all images in the pict-rs database, and if they are not in the “in-use” hash table, I will purge them.

Of course, Lemmy can be improved to handle this case better as well!
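
The plan above can be sketched roughly as follows. This is an illustration only, not the actual lemm.ee code: the function names are hypothetical, the URL pattern `/pictrs/image/<alias>` is an assumption to verify against your own instance, and fetching the post and comment texts or the stored aliases from the databases is left out.

```python
import re

# Assumed URL shape for locally hosted images; verify against your instance.
PICTRS_URL = re.compile(r"https?://[^/\s]+/pictrs/image/([A-Za-z0-9._-]+)")


def referenced_aliases(texts):
    """Collect every image alias mentioned in posts, comments, avatars, banners."""
    in_use = set()
    for text in texts:
        for alias in PICTRS_URL.findall(text or ""):
            # Drop a trailing full stop picked up from sentence punctuation.
            in_use.add(alias.rstrip("."))
    return in_use


def find_orphans(stored_aliases, texts):
    """Return stored aliases that no known text refers to."""
    in_use = referenced_aliases(texts)
    return sorted(alias for alias in stored_aliases if alias not in in_use)
```

Anything returned by `find_orphans` is only a candidate for purging. An image uploaded a minute ago for a post still being written would show up too, so an age cutoff before purging seems prudent.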

@[email protected] · 2 years ago

Did you open an issue on GitHub?

You are wording this as a clickbait news article.

You find an issue, you report it to the right channel, you notify it. Good. This is how software development works, with an active community reporting issues.

But why use such a tone?

@[email protected] · 2 years ago

I’m not on GitHub. Nor are a lot of people here. I’m wording it this way so the issue gets the attention it deserves. Anyway, everybody already knows about this but nobody understood the consequences. Same reason why there’s no option to disable image caching. These issues should have been addressed the moment image uploading was made available in Lemmy. It was just overlooked because of how tiny the platform was then.

It’s funny because last month Mastodon CSAM was a hot topic in the Fediverse and people were being defensive about it. Look where we are now. Has Mastodon addressed the CSAM issue? Did they follow the recommendations made by that paper? I don’t think so. There wouldn’t be an open GitHub issue about it. Will Lemmy be like Mastodon or will it address the concerns of its users?

Fedora · edit-2 2 years ago

Are you aware of the consequences of your actions? You didn’t inform the people who can fix this issue of the potential impact, no. You informed the Lemmy community that they can upload whatever they want, and some of them are pedophiles. Not cool at all. Responsible disclosure ain’t a thing outside of cybersecurity I suppose, though irresponsible disclosure is prevalent everywhere. Very irresponsible.

@[email protected] · 2 years ago

Rogues are very keen in their profession, and know already much more than we can teach them.

@[email protected] · edit-2 2 years ago

Creating a product of any size is about planning.

If you notify here, your information will be lost in 2 days. People forget, and move on to the next hot topic. Relevant stakeholders might very well completely miss this post, because they are not 24/7 on lemmy.

The way to make it more relevant is to go to the place where the planning is done, i.e. GitHub for Lemmy. Open an issue there, explain the problem and describe a possible solution. Come back to Lemmy, link the issue and ask people to react to it (i.e. show it is relevant for them).

This is the best way to obtain what you ask. Social media platforms are too broad and fuzzy for tracking real issues.

This is also why you see a lot of work being done on the SQL performance of the Lemmy backend: most past issues on GitHub concerned that.

This is my suggestion. If you really care about this being implemented, open a ticket on GitHub and follow the discussion there. If you see there is not enough traction, ask fellow lemmings for help.

Suggestions for the github issue are:

be very specific
be polite
suggest solutions

If your solution is good, great. If not, people are more willing to think about a problem to show a stranger on the internet they are wrong.

@[email protected] · 2 years ago

Feel free to open the issue on my behalf. I am not a software developer. You seem to know more about this. I’m just reminding people something that I and many others have observed months ago.

@[email protected] · edit-2 2 years ago

I haven’t experienced the issue myself. I trust your experience, but I cannot completely reproduce/describe it, as I am not selfhosting. I couldn’t answer in case of questions from developers regarding this.

Best would be for you to report this. You can create an issue here:

https://github.com/LemmyNet/lemmy/issues

There is a simple template to fill, and you can copy and paste most text from this thread.

@[email protected] · 2 years ago

You don’t need to selfhost to reproduce this. Anyone can do this and that’s the problem.

@[email protected] · 2 years ago

Not sure why you’re getting downvoted, since you gave clear instructions that anyone can follow to verify what you said.

@[email protected] · 2 years ago

Sadly not everyone bothered to read the post and just jumped to the comments. Again it’s like the Mastodon CSAM issue last month. People don’t read the paper and act so defensively about it. Now Lemmy is experiencing the same problems, people suddenly act differently?? Crazy.

JackbyDev · 2 years ago

Signing up for GitHub and opening this issue would take about as long as making this post.

@[email protected] · 2 years ago

You could’ve just done it yourself if you felt so passionate about it. Badgering people into action rarely works.

JackbyDev · 2 years ago

I’m not badgering, I’m demystifying the process.

JackbyDev · 2 years ago

Here you go, two steps:

AphoticDev · 2 years ago

How would they address your concerns? The chances that one of the devs follows you is nonexistent, I would wager. Instead of using the proper channels to inform them, you did the exact opposite and posted it someplace they are almost guaranteed not to see it.

@[email protected] · 2 years ago

It’s on the GitHub issue tracker already. Did you not read the post?

WorseDoughnut 🍩 · 2 years ago

so the issue gets the attention it deserves.

The other best way to do this is to actually submit the issue in the appropriate location so the Lemmy devs can track and respond to it.

It’s been 7 hours, it can’t be that hard to make a github account and format this post into an actually helpful github issue.

@[email protected] · 2 years ago

Because there’s already an issue dated July 6: https://github.com/LemmyNet/lemmy/issues/3504

Like I said, people already know about this months ago.

30021190 · 2 years ago

Other than filling up storage, what is the actual issue? If the image is orphaned then surely nobody can actually access the content? Sure you could be blind hosting things but if nobody can get the content back out then the abuse is surely minimal apart from say a complex cyber and physical targeted campaign or simply filling up storage…

@[email protected] · 2 years ago

The issue is that you can share the image link to other people. People CAN get the content back out and admins or moderators WILL NOT KNOW about it.

So if someone uploads an illegal image in the comments, copies the link and does not post the comment, then they have a link of an illegal image hosted on someone’s Lemmy instance. They can share this image to other people or report it to the FBI. Admins won’t know about this UNLESS they look at their pictrs database. Nobody else can see it so nobody can report it.

2 years ago

The original uploader can access it via the hyperlink. Which was never posted publicly.

2 years ago

@bmygsbvur Pleroma is exactly the same and no one cared in six years.

@[email protected] · 2 years ago

Doesn’t change the fact that this is an issue.