It appears to work fine (it contains my home partition for my main machine I daily drive) and I haven’t noticed signs of failure. Not noticeably slow either. I used to boot Windows off of it once upon a time which was incredibly slow to start up, but I haven’t noticed slowness since using it for my home partition for my personal files.

Articles online seem to suggest the life expectancy for an HDD is 5–7 years. Should I be worried? How do I know when to get a new drive?

  • Elise@beehaw.org
    1 day ago

    Do you raid? I just have one rn and am wondering if I could get a 2nd one and put it in raid without accidentally wiping the current one. I guess that would mitigate any failures

    • ragebutt@lemmy.dbzer0.com
      1 day ago

      Yeah I have a 15 drive array.

      You can do RAID 1, which is basically just keeping a constant copy of the drive. A lot of people don’t do this because they want to maximize storage space, but if you only have a 2 drive array it’s probably your safest option.

      It’s only when you get to 3 drives (2 data drives + parity) that you have some potential to maximize storage space. Note that here you’re still basically sacrificing the space of an entire drive, but now you get double the usable space of a mirror and the array can still survive any single drive dying. It costs more, though, because obviously you need more drives.
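To put numbers on the space trade-off described above, here is a rough sketch. It assumes all drives are the same size, and the 4 TB figure and the function names are made up for illustration.

```python
def mirror_capacity_tb(drive_tb: float) -> float:
    """RAID 1: every drive holds the same data, so usable space is one drive."""
    return drive_tb


def parity_capacity_tb(drive_tb: float, drives: int, parity_drives: int = 1) -> float:
    """Parity array: capacity equal to `parity_drives` drives goes to redundancy."""
    if drives <= parity_drives:
        raise ValueError("need at least one data drive besides parity")
    return drive_tb * (drives - parity_drives)


# Hypothetical 4 TB drives.
print(mirror_capacity_tb(4))        # 2-drive mirror: 4 TB usable out of 8 TB raw
print(parity_capacity_tb(4, 3))     # 2 data + 1 parity: 8 TB usable out of 12 TB raw
print(parity_capacity_tb(4, 15, 2)) # 15 drives with 2 parity: 52 TB usable
```

Either way one drive of capacity is "lost", but the fraction lost shrinks as the array grows.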

      Keep in mind none of these are backup solutions though. It’s true that when a drive dies in a RAID array you can rebuild the data from the other drives, but it is also true that this operation is extremely stressful and can lead to the death of the array. E.g. in RAID 1 a single drive dies, and while rebuilding onto the new drive, the surviving drive that held the copy of your data starts having sector corruption. Or in RAID 5 one of the 3+ drives dies, and while you rebuild from parity a second drive dies for similar reasons. These drives are normally only being accessed occasionally, whereas the rebuild reads basically every sector on the surviving drives and keeps them under heavy read load for a very long period of time (like days), especially if you have very large modern drives (18, 20, 24 TB).
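A toy sketch of why a rebuild is so read-heavy, assuming plain single parity computed as a byte-wise XOR (real RAID implementations differ in layout and detail). The 4-byte "drives" and the names are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Two data drives plus one parity drive.
drive_a = b"\x01\x02\x03\x04"
drive_b = b"\x10\x20\x30\x40"
parity = xor_blocks([drive_a, drive_b])

# Drive B dies. Rebuilding it needs every byte of every surviving drive.
rebuilt_b = xor_blocks([drive_a, parity])
print(rebuilt_b == drive_b)
```

If any byte on `drive_a` or `parity` is unreadable during the rebuild, the matching byte of the lost drive cannot be recovered, which is the failure mode described above.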

      So either be okay with your data going “poof” or back up your data as well. When I got started I was okay with certain things going “poof”, like pirated media, and would back up essential documents to cloud providers. This was really the only feasible solution because my array is huge (about 200 TB with about 100 TB used). But now I have tape backup, so I back everything up locally, although I still back up critical documents to Backblaze. Depends on your needs. I am very strict about not wanting to be integrated with Google, Apple, Dropbox, etc., and my media collection is not simply stuff I can retorrent, it’s a lot of custom media where I’ve put together the “best” version to my taste. But setting something up like this either takes a hefty investment or, if you’re like me, years of trawling e-waste/recycling centers and decommission auctions (and it’s still pricey then, but at least my data is on my server and not Google’s).

      • Elise@beehaw.org
        1 day ago

        Hmm. Yeah I’m thinking of keeping my operation lean and simple, with an online copy. One issue I’ve noticed is that sometimes files just get corrupted. Perhaps due to a radiation event? A parity drive could solve that, but I want something simpler. I’m thinking just a tar with hash and then store multiple copies. What do you think?

        • ragebutt@lemmy.dbzer0.com
          24 hours ago

          Bitrot sucks

          ZFS protects against this. It historically has been a pain to work with for home users, but the recent implementation of raidz expansion has made things a lot easier, as you can now expand vdevs and increase the size of arrays without doubling the number of disks.

          This is a potentially great option for someone like you who is just starting out, but it would still require a minimum of 3 disks and the associated hardware. Sucks for people like me though who built arrays lonnnnng before ZFS had this feature! It was literally upstreamed less than a year ago, so good timing on your part (or maybe bad, maybe it doesn’t work well? I haven’t read much about it tbf, but from the small amount I have read it seems to work fine. They worked on it for years).

          Btrfs is also an option for similar reasons, as it has built-in protections against bitrot. If you read up on this there can be a lot of debate about whether it’s actually useful or dangerous. FWIW the consensus seems to be that for single drives it’s fine.

          My server has a separate RAID 1 array of 2 TB NVMe drives, which are used as much higher speed cache/working storage for the services that run. E.g. if a torrent downloads it goes to the NVMe first, as this storage is much easier to work with than the slow rotational drives (which are even slower because they are in a massive array), then later the file is moved to the large array for storage in the middle of the night. Reading from the array is generally not an intensive operation but writing to it can be, and sometimes the array can’t keep up with a torrent that saturates my gigabit connection (or with other operations that aren’t internet dependent, like muxing or transcoding a video file).

          Anyway, this NVMe array has btrfs and has had 0 issues. That said, I personally wouldn’t recommend btrfs for RAID 5/6, and given the nature of this array I don’t care at all about the data on it.

          My main array has XFS. This doesn’t protect against bitrot. What you can do if you are in this scenario is what I do: once a week I run a plugin that checksums all new files and verifies the checksums of old files. If checksums don’t match it warns me. I can then restore the invalid file from backup and investigate for issues (SMART errors, bad SATA cable, ECC problem with RAM, etc). The upside of my XFS array is that I can expand it very easily and storage is maximized. I have 2 parity drives and at any point I can simply pop in another drive and extend the array to be bigger. This was not an option with ZFS until about 9 months ago. This is a relatively “dangerous” setup, but my array isn’t storing amazingly critical data, it’s fully backed up despite that, and despite all of that it’s been going for 6+ years and has survived at least 3 drive failures.
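The weekly "checksum new files, verify old ones" routine described above could be sketched roughly like this. This is not the plugin mentioned in the comment. The JSON manifest format and the names `sha256_of` and `scan` are assumptions for illustration, and the manifest is assumed to live outside the scanned folder (a USB stick, say).

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files never have to fit in RAM."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        while chunk := handle.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()


def scan(root: Path, manifest_path: Path) -> list[str]:
    """Record hashes for new files, re-check known files, return mismatches."""
    manifest = json.loads(manifest_path.read_text()) if manifest_path.exists() else {}
    mismatched = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        key = str(path.relative_to(root))
        current = sha256_of(path)
        if key not in manifest:
            manifest[key] = current      # new file: remember its hash
        elif manifest[key] != current:
            mismatched.append(key)       # known file changed: keep old hash, warn
    manifest_path.write_text(json.dumps(manifest, indent=2, sort_keys=True))
    return mismatched
```

Calling `scan(Path("/mnt/array"), Path("/mnt/usb/manifest.json"))` (both paths hypothetical) from a weekly job would return the files to restore from backup. Note that this sketch cannot tell bitrot from a deliberate edit, so it only suits data that is not supposed to change.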

          That said, my approach is inferior to btrfs and ZFS, because in this scenario they would catch the bad checksum themselves and, given redundancy, repair the file automatically (or let me roll back to a snapshot) rather than me needing to manually restore from backup. One day I will likely rebuild my array with ZFS, especially now that raidz expansion is complete. I was basically waiting for that.

          As always double check everything I say. It is very possible someone will reply and tell me I’m stupid and wrong for several reasons. People can be very passionate about filesystems

          • Elise@beehaw.org
            22 hours ago

            Where do you store the checksums? Is it for every file? I thought of just making a tar for each year and then storing it next to it, and storing a copy off-site.
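The "one tar per year with a hash stored next to it" idea could look something like the sketch below. The function names and the `.sha256` sidecar naming are assumptions for illustration. The sidecar uses the usual "hash, two spaces, filename" line so it should also be checkable with `sha256sum -c`.

```python
import hashlib
import tarfile
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so a big archive never has to fit in RAM."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        while chunk := handle.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()


def archive_with_hash(source_dir: Path, archive_path: Path) -> Path:
    """Tar up `source_dir` and write the archive's hash to a sidecar file."""
    with tarfile.open(archive_path, "w") as archive:
        archive.add(source_dir, arcname=source_dir.name)
    sidecar = archive_path.with_name(archive_path.name + ".sha256")
    sidecar.write_text(f"{file_sha256(archive_path)}  {archive_path.name}\n")
    return sidecar


def copy_is_intact(archive_path: Path) -> bool:
    """Re-hash an archive copy and compare it with its sidecar."""
    sidecar = archive_path.with_name(archive_path.name + ".sha256")
    expected = sidecar.read_text().split()[0]
    return file_sha256(archive_path) == expected
```

A hash only detects corruption, it cannot repair it. That is where the multiple copies come in: when `copy_is_intact` fails for one copy, you replace it with a copy that still verifies.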

            • ragebutt@lemmy.dbzer0.com
              20 hours ago

              I just have them on a USB stick, with a copy on the array as well so the checksums themselves can also be checked for bitrot. Even doing it for every file it’s not that much data, and it’s scripted so it runs on its own (I do it weekly).

              Actual file backups are what I store off site. 2 copies, one here and one off site. My data generally isn’t changed all that much, so I don’t bother continually backing up most directories. Like it doesn’t make sense to have 30 backups of my TV folder with my shows. They’re the same shows. I have some redundancy, I don’t just do one and done, but tape media is expensive so I don’t do like monthly backups either. Tape is wildly impractical for most home users though, and off site with tape means you need a trusted place to put it that’s reasonably safe and has a moderately decent climate/humidity. Though an advantage of tape is that basically no one but the biggest of tech dorks is going to be able to read that data (versus something like leaving an external hard drive or Blu-ray at a friend’s house. Even if you trust them a LOT they might plug it in. Although encryption exists).

              It’s home data so it’s about balancing what makes sense with what’s cost effective and your risk tolerance

              Some data is crucial of course. My personal documents are backed up far more regularly, like once an hour or so, and that’s where I use services like Backblaze. My business, which is healthcare oriented, is entirely different: that data is segregated and uses Backblaze as well as specialized software, since it handles PHI and has HIPAA concerns. That’s backed up pretty much every few minutes.