I’m writing a program that wraps around dd to try to warn you if you’re doing anything stupid. I have thus been giving the man page a good read. While doing this, I noticed that dd supports sizes all the way up to quettabytes, a unit orders of magnitude larger than all the data on the entire internet.

This has caused me to wonder: what’s the largest storage operation you guys have ever done? I’ve taken a couple of images of single-terabyte hard drives, but I was wondering if the sysadmins among you have had to do something with, e.g., a giant RAID 10 array.

  • @[email protected]
    3 points · 8 months ago

    I routinely do 1-4TB images of SSDs before making major changes to the disk. I run fstrim on all partitions and pipe the dd output through zstd before writing to disk, and the images shrink to roughly the actually-used size or a bit smaller. The largest backup ever was probably ~20T cloned from one array to another over 40/56GbE; the deltas after that were tiny by comparison.
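
    Roughly that workflow as a minimal sketch, assuming the SSD actually returns zeroes for trimmed blocks (device names, block size, and the output path are placeholders):

      # Trim unused blocks first so they read back as zeros and compress away
      sudo fstrim --all

      # Image the whole device in large blocks and compress on the fly
      sudo dd if=/dev/sdX bs=4M status=progress | zstd -T0 > sdX.img.zst

      # Restore by decompressing back onto a device of equal or larger size
      zstdcat sdX.img.zst | sudo dd of=/dev/sdX bs=4M status=progress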

  • @[email protected]
    4 points · 8 months ago

    ~340GB, more than a million small files (~10KB or less each). It took like a week to move because the files were stored on a hard drive, and it was struggling to read that many small files.

  • @[email protected]
    4 points · 8 months ago

    You should ping CERN or Fermilab about this. Or maybe the Event Horizon Telescope team, but I think they used sneakernet to image the M87 black hole.

    Anyway, my answer is probably just a SQL backup like everyone else.

  • @[email protected]
    3 points · 8 months ago

    Why would dd have a limit on the amount of data it can copy? Afaik dd doesn’t check, nor does it do anything fancy; if it can copy one bit, it can copy infinitely many.

    Even if it did any sort of validation, to handle anything larger than RAM it would need to work in chunks anyway.

    • @[email protected]
      2 points · 8 months ago

      Not looking at the man page, but I expect you can limit it if you want, and the parser for that parameter knows about these unit names. If it were me, it’d be one parser for byte-size values, and it’d work for chunk size, limit, sync interval, and whatever else dd does.

      It’s also probably limited by the size of the number used for tracking; I think dd reports the number of bytes copied at the end even in unlimited mode.
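
      For illustration, GNU dd feeds bs=, count= and friends through the same size parser, so the unit suffixes work on all of them (the file name here is made up):

        # bs= and count= both accept the same multiplicative suffixes
        dd if=/dev/zero of=test.img bs=1M count=1K status=progress
        # copies 1024 blocks of 1 MiB = 1 GiB, then reports the byte total at the end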

    • @[email protected]
      3 points · 8 months ago

      No, it can’t copy infinite bits, because it has to store the current address somewhere. Even if they implemented unbounded integers for this, they’d still be limited by your RAM, since that number can’t grow infinitely without infinite memory.
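
      For a sense of scale, even a plain signed 64-bit counter (no unbounded integers needed) already covers far more than any real device; a quick check with coreutils:

        # Largest value of a signed 64-bit byte counter, shown in IEC units
        numfmt --to=iec 9223372036854775807    # prints 8.0E, i.e. about 8 exbibytes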

    • Random Dent
      1 point · 8 months ago

      Well, they do nickname it “disk destroyer”, so if it were unlimited and someone messed it up, it could delete the entire simulation that we live in. So it’s for our own good, really.

    • data1701d (He/Him) (OP)
      1 point · 8 months ago

      It’s less about dd’s limits and more about laughing at the fact that it supports units so large it might take decades or more before we ever reach data of that size.

  • @[email protected]
    11 points · 8 months ago

    I worked at a niche factory some 20 years ago. We had a tape robot with 8 tapes at some 200GB each. It’d do a full backup of everyone’s home directories and mailboxes every week, and incremental backups nightly.

    We’d keep the weekly backups on-site in a safe. Once a month I’d do a run to another plant one town over with a full backup.

    I guess at most we’d need five tapes. If they still use it, and with modern tapes, it should scale nicely. Today’s LTO tapes are 18TB. Driving five tapes for half an hour would give a nice bandwidth of about 50GB/s. The bottleneck would be the write speed to tape at 400MB/s.
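
    The back-of-the-envelope math checks out (taking 18TB as the per-tape native capacity):

      # 5 tapes x 18 TB = 90,000 GB moved during a 30-minute (1800 s) drive
      echo $(( 5 * 18 * 1000 / 1800 ))    # 50 GB/s of sneakernet bandwidth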

  • Presi300
    5 points · 8 months ago

    I’ve imaged an entire 128GB SSD to my NAS…

  • TedvdB
    7 points · 8 months ago

    Today I migrated my data from my old ZFS pool to a new, bigger one; the rsync of 13.5TiB took roughly 18 hours. It’s slow spinning-disk storage, so that’s fine.

    The second and third runs of the same rsync took like 5 seconds, blazing fast.
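
    Something along these lines, as a sketch (pool mount points are placeholders, and the trailing slashes matter to rsync):

      # -a preserves permissions and times, -H hard links, -A/-X ACLs and xattrs
      rsync -aHAX --info=progress2 /mnt/oldpool/ /mnt/newpool/
      # later runs only re-check file metadata, so they finish almost instantly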

  • d00phy
    18 points · 8 months ago

    I’ve migrated petabytes from one GPFS file system to another. More than once, in fact. I’ve also migrated about 600TB of data from D3 tape format to 9940.

  • Larvitz :fedora: :redhat:
    12 points · 9 months ago

    @data1701d Downloading Forza Horizon 5 on Steam, at around 120GB, is the largest web download I can remember. On the LAN, I’ve migrated my old FreeBSD NAS to my new one, which was roughly a 35TB transfer over NFS.

  • @[email protected]
    2 points · 8 months ago

    I recently copied ~1.6T from my old file server to my new one. I think that may be my largest non-work-related transfer.