The developers of Manjaro Linux, a distribution based on Arch Linux and aimed at beginners, have announced testing of a new service, MDD (Manjaro Data Donor), which collects statistics about the system and sends them to an external server run by the project. The author of MDD intended to enable telemetry by default (opt-out), but that decision has not yet been approved. Judging by the objections of some developers and users, telemetry will more likely be offered as an option requiring the user's prior consent (the proposal is to add a prompt to the welcome screen shown after the first boot).

The report includes data such as host name, kernel version, desktop component versions, detailed information about hardware and drivers involved, screen size and resolution information, network device MAC addresses, disk serial numbers, disk partition data, information about the number of running processes and installed packages, versions of basic packages such as systemd, gcc, bash and PipeWire.

The submitted data is stored on the project server in a ClickHouse database and visualized with Grafana. Users' IP addresses are not stored, and a hash of the /etc/machine-id file is used as the system identifier.
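The article does not say which hash MDD applies to /etc/machine-id. As a point of comparison, the systemd machine-id(5) man page recommends never exposing the raw ID and instead deriving an identifier with a keyed hash under a fixed, application-specific key. The following is a minimal sketch of that recommendation, not MDD's actual code; `APP_KEY`, `app_specific_id` and `read_machine_id` are hypothetical names chosen for illustration.

```python
import hashlib
import hmac

# Hypothetical application-specific key (an assumption for this sketch);
# any fixed byte string unique to the application would do.
APP_KEY = b"example-telemetry-app"


def app_specific_id(machine_id: str, key: bytes = APP_KEY) -> str:
    """Return HMAC-SHA256 of the machine ID under an app-specific key, as hex."""
    data = machine_id.strip().encode("ascii")
    return hmac.new(key, data, hashlib.sha256).hexdigest()


def read_machine_id(path: str = "/etc/machine-id") -> str:
    """Read the machine ID from disk (only meaningful on systemd-based systems)."""
    with open(path, encoding="ascii") as f:
        return f.read().strip()
```

With a keyed hash, two applications using different keys get unrelated identifiers for the same machine, so the values cannot be joined across services.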

According to the code (https://github.com/manjaro/mdd/blob/master/mdd.py#L40), it sends everything.

  • LiveLM
    English
    47
    8 months ago (edited)

    Opt-out? I see it’s time for the seasonal Manjaro fuck up.

    • Bezier
      27
      8 months ago

      Thought it’s probably fine after reading the title, but this shit isn’t fine. What the fuck.

    • @[email protected]
      2
      8 months ago

      The MAC address is anonymized with SHA-256, and IP addresses aren’t stored.
      So this seems to me to be perfectly anonymous.

      • @[email protected]
        16
        8 months ago (edited)

        MAC addresses are 48 bit, and half of that is just the manufacturer. So 24 bits really, and those bits aren’t random, I think manufacturers just assign these based on some scheme, like a serial number. Point is you could easily reverse the SHA by brute force.

        You can’t calculate any useful statistic from a hash so literally the only use this would have is some sort of tracking.


        Edit: I just looked up some data and I found someone using hashcat on an RTX 3090, which looks like it can do almost 10000 million SHA256 hashes per second of salted passwords (which are longer than 48 bit MACs, so MACs should be faster). 2²⁴ is 16.8 million, so it’ll take about 1.7 ms per vendor. I found a database with (all?) 53011 vendor ids:

        >>> 2**24 * 53011 / 10000 / 1000 / 1000
        88.93769973759998
        

        Yup, 89 seconds. You can calculate the SHA256 of every single MAC ever potentially issued in 89 seconds on a bog-standard 3090.
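The estimate above can be reproduced, and the attack itself sketched, in a few lines of Python. This is an illustration under stated assumptions, not a reconstruction of what mdd.py does: it assumes the MAC is hashed as a lowercase, colon-separated ASCII string with plain unsalted SHA-256, and the function names are made up for this example. Pure Python is far slower than hashcat on a GPU, so the demo restricts the search space.

```python
import hashlib


def sha256_mac(mac: str) -> str:
    """Hash a MAC string the way this sketch assumes it is hashed (unsalted)."""
    return hashlib.sha256(mac.encode("ascii")).hexdigest()


def recover_mac(target_hash: str, oui: str, search_space: int = 2**24):
    """Try every device suffix under one vendor prefix until the hash matches."""
    for n in range(search_space):
        candidate = "{}:{:02x}:{:02x}:{:02x}".format(
            oui, (n >> 16) & 0xFF, (n >> 8) & 0xFF, n & 0xFF
        )
        if sha256_mac(candidate) == target_hash:
            return candidate
    return None


def full_sweep_seconds(vendors: int = 53011,
                       hashes_per_second: int = 10_000 * 10**6) -> float:
    """Time to hash every suffix for every vendor, using the figures quoted above."""
    return vendors * 2**24 / hashes_per_second
```

For example, `recover_mac(sha256_mac("00:11:22:00:12:34"), "00:11:22", search_space=2**16)` returns the original address after a few thousand attempts, and `full_sweep_seconds()` gives the same roughly 89 seconds as the calculation above.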

        • @[email protected]
          2
          8 months ago

          this would have is some sort of tracking.

          It’s right at the top of the announcement, that it’s mainly for more accurate stats on unique users.
          It’s not that I think this is a good idea, because I don’t, but some people are blowing it out of proportion. Especially since this isn’t decided at all, and I seriously doubt it will be.

          • @[email protected]
            10
            8 months ago (edited)

            You don’t need this to count unique users. You could just assign a random number on install or whatever. Or even more simply, just run the thing once per month, should be accurate enough. Do they expect the software to just randomly spam duplicate reports? Don’t write it that way.

            Best case they don’t care about collecting minimal data and don’t understand that hashed MACs are easily reversible. So incompetent fools with no sensitivity to privacy.

            Maybe this should be Manjaro’s tagline: Not purposely malicious, just grossly negligent and ignorant.
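The "random number on install" alternative suggested in this comment could look roughly like the sketch below. It assumes the ID is generated once and cached in a file; the function name and the idea of passing the path in are illustrative choices, not anything taken from MDD.

```python
import uuid
from pathlib import Path


def get_install_id(path: Path) -> str:
    """Return a random per-install ID, creating and persisting it on first use.

    The ID comes from uuid4(), so it carries no information about the
    hardware, the hostname, or the user.
    """
    if path.exists():
        return path.read_text(encoding="ascii").strip()
    install_id = uuid.uuid4().hex
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(install_id + "\n", encoding="ascii")
    return install_id
```

Such an ID is enough to de-duplicate reports from the same installation, and deleting the file resets it.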

            • @[email protected]
              5
              8 months ago

              You could just assign a random number on install or whatever.

              Funny, I thought the exact same thing.

        • @[email protected]
          5
          8 months ago

          You can see the code of what is sent.
          I’m not aware that Google claims they collect data anonymously, on everything where you are logged in.
          So that’s a false equivalence.

  • @[email protected]
    72
    8 months ago (edited)

    enable telemetry by default … MAC addresses, disk serial numbers

    Another reason to not use Manjaro. Just use Endeavour instead.

    Edit: I’m not against telemetry per se. I have the KDE feedback enabled, for example, but that was opt-in and sends no unique data.

    • sovietknuckles [they/them]
      English
      9
      8 months ago

      Another reason to not use Manjaro. Just use Endeavour instead.

      Endeavour could be useful if it’s your first time running an Arch-based distro and you’re looking for software/configuration suggestions. Otherwise, Arch Linux is fine by itself and it doesn’t have telemetry

      • Handles
        English
        10
        8 months ago

        I don’t think anybody would say otherwise. Both Manjaro and Endeavour mean to make Arch more appealing to users who aren’t comfortable with command line configuration.

        Endeavour has arguably done better than Manjaro, but yeah. They’re just some configs on top of a system that does very well on its own.

      • @[email protected]
        10
        8 months ago

        Why?

        Let me put the question back to you. How do you think the uniquely identifiable information will help them improve Manjaro?

        Do you think they’ve got a Russian satellite and will track down your HDD serial number from space?

        No.

        There’s lots of benefits to telemetry.

        As I basically said, if you bothered to read my comment.

      • exu
        English
        5
        8 months ago (edited)

        When?

        Edit: I misread, thought it said “trust” instead of “distrust”

        • @[email protected]
          English
          19
          8 months ago

          They’ve let TLS certs expire on multiple occasions. They’ve made the decision to enable the AUR in the default installation, which can cause conflicts with out-of-date dependencies because of the delayed release schedule compared to Arch. They’ve shipped software on their stable branch that included unmerged upstream code. One of their developers temporarily broke Asahi Linux.

          I don’t hate the project, but I can’t trust the developers and management.

          • @[email protected]
            10
            8 months ago

            They’ve let TLS certs expire on multiple occasions.

            And they told their community to set their clocks back. As a workaround, it will work but all your created and modified data will have the wrong timestamps.

            • @[email protected]
              English
              5
              8 months ago (edited)

              He’s also a contributor to Asahi Linux. One of his MRs changed the build options that somehow caused it to (IIRC) use mainline Mesa instead of the branch that is specifically modified to work on ARM.

              (edit) Aussie linux man: https://www.youtube.com/watch?v=eDRiBbzzREw

              It’s not only his fault, but mostly.

    • @[email protected]
      English
      11
      8 months ago (edited)

      Ad firm money.

      Maybe I’m just cynical, but my first instinct when I see stuff like this is they have a secret contract with an advertiser and are selling this information.

  • @[email protected]
    3
    8 months ago

    I tried Manjaro last year and I hated it.

    Something about the distro would lock up my PC, it would freeze from time to time.

    I disabled the standby/sleep function, but allowed my monitors to go into standby. But if I left my PC for an hour or two my screens would not wake up, different types and brands. I had so many issues with Manjaro and while speaking with a friend I told him I had moved over to Nobara but he was still on Manjaro. But then a few weeks later he mentioned he was running Nobara. Seems he also ditched it.

  • @[email protected]
    English
    11
    8 months ago

    Why do they need half that data for a derivative of a distro? Fuck off. I don’t care if someone collects the model number of my GPU or whatever but that sounds like personally identifiable tracking data, not basic “telemetry” data to set development priorities or whatever.

  • @[email protected]
    30
    8 months ago

    The report includes data such as host name, kernel version, desktop component versions, detailed information about hardware and drivers involved, screen size and resolution information, network device MAC addresses, disk serial numbers, disk partition data, information about the number of running processes and installed packages, versions of basic packages such as systemd, gcc, bash and PipeWire.

    That’s insane

  • @[email protected]
    English
    16
    8 months ago

    Manjaro is already less stable than arch, now it collects your data involuntarily? Fucking wild how anyone can use it.

  • @[email protected]
    28
    8 months ago

    I get the usefulness of technical telemetry such as kernel version, RAM, disk space, processor type, etc… but NIC MAC? HDD serial? WTF?

      • r00ty
        5
        8 months ago

        I said elsewhere, I hope this is just some way to track changes over time per user.

        But they need to take an anonymous hash of some non-changing data, or create an install ID that is used for this and nothing else (e.g. it identifies a unique user but not the person or hardware behind the user).

        Too much identifying info is just pushed around like we shouldn’t care, it’s become a real problem.

      • The Doctor
        English
        3
        8 months ago

        The first three octets of a MAC specify the manufacturer of a NIC chipset. That could come in handy for driver debugging.

        Manufacturers and firmware versions of storage devices? You can make the argument; perhaps it would have helped figure out the SSD firmware bugs years ago.

        But stuff like whether or not you have a video capture card or your current system temperature stats? Nah… that’s getting into “identifiable information as toxic waste” territory.

        • @[email protected]
          1
          8 months ago (edited)

          Yeah, so take the vendor and device id and be done?

          Why should they need my unique ID/MAC?

          • The Doctor
            English
            1
            8 months ago

            A MAC address isn’t really unique. Each has six octets, of which three refer to the manufacturer. The other three octets have at most 16,777,216 possible values. That seems like a lot but it really isn’t; a MAC is supposed to be unique on a LAN, not globally. Rollovers during manufacturing happen, and collisions are rare but happen once in a while.
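The split described here, three octets for the manufacturer and three for the device, can be shown with a short helper. This is only a sketch: `split_mac` is a hypothetical name, and it accepts the common colon- or hyphen-separated notation. A telemetry client that only wants vendor information could send the first element and discard the second.

```python
def split_mac(mac: str) -> tuple[str, str]:
    """Split a MAC address into (vendor prefix, device-specific part)."""
    octets = mac.lower().replace("-", ":").split(":")
    if len(octets) != 6 or any(len(o) != 2 for o in octets):
        raise ValueError(f"not a 6-octet MAC address: {mac!r}")
    for o in octets:
        int(o, 16)  # raises ValueError if an octet is not valid hex
    return ":".join(octets[:3]), ":".join(octets[3:])


# Three octets of 8 bits each leave 256**3 values per vendor prefix.
DEVICE_VALUES_PER_VENDOR = 256 ** 3
```

`DEVICE_VALUES_PER_VENDOR` evaluates to 16,777,216, the figure quoted in the comment.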

            • @[email protected]
              2
              8 months ago

              Unique enough with the other hardware IDs

              And still, absolutely no reason to go further than the first octets, to have the vendor and device

              Or am I missing something?

              And I’ve been a happy user of Manjaro for years. But this stuff really isn’t what I want to have on my system …

              • The Doctor
                English
                2
                8 months ago

                Just defining the threat model of hardware addressing, as it stands.

                I don’t agree with them sending more than the first half either.

    • @[email protected]
      English
      12
      8 months ago

      Those are absolutely ways of covertly identifying your device while technically not counting as “personal information” under privacy laws.

        • @[email protected]
          English
          6
          8 months ago

          The point is that it’s a loophole in privacy laws so they don’t have to outright tell people that they collect personal or identifying information. So they can legally mislead people by claiming it’s anonymous telemetry in hopes that users don’t actually look into it or understand the implications.

  • calm.like.a.bomb
    English
    12
    8 months ago

    I don’t get why someone would use Manjaro after so many fuckups… If you don’t know what I’m talking about, you’re either too new to Linux or don’t care. Just look for “manjaro certificates” or “manjaro drama” and you’ll find out for yourself.