Safely storing digital files for a long time turns out to be an expensive and difficult problem.
Recent advances in filesystems, though, have made this a doable task.
The old rule: “3-2-1 backups” #
The “3-2-1” backup strategy is perhaps the most common strategy for keeping files safe.
This strategy suggests that you maintain:
- at least three different copies of every file, with
- at least two different storage formats, and
- at least one copy offsite.
Why different storage formats? #
If you store your data on different formats, hopefully the formats will have different lifespans. When one fails before the other, it gives you time to make a new copy.
There are several problems with this line of reasoning, though.
Storage formats available to consumers are typically only hard drives and optical storage media (CD-R, DVD-R, or BD-R). The latest optical storage technology (as of July 2020) is triple-layer Blu-ray (BDXL), which can store 100GB per disc. External hard drives are regularly in the 6-12TB range, which is 60-120x larger. Backing up a 12TB hard drive to optical media would be ridiculous: it would take 120 BD-R discs (and roughly $1,400 in media, using $12 M-Discs).
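The arithmetic behind that estimate is easy to check. Here is a minimal sketch in Python, using the capacities and prices quoted above and assuming decimal units (1TB = 1,000GB); the function name `discs_needed` is only illustrative.

```python
import math

# Figures taken from the paragraph above (2020 capacities and prices).
DRIVE_CAPACITY_GB = 12_000   # a 12TB external hard drive, in decimal gigabytes
DISC_CAPACITY_GB = 100       # one 100GB BD-R disc
DISC_PRICE_USD = 12          # one M-Disc


def discs_needed(data_gb: int, disc_gb: int) -> int:
    """Number of discs needed, rounding up for a partially filled last disc."""
    return math.ceil(data_gb / disc_gb)


discs = discs_needed(DRIVE_CAPACITY_GB, DISC_CAPACITY_GB)
print(f"{discs} discs, about ${discs * DISC_PRICE_USD:,} in media")
# prints: 120 discs, about $1,440 in media
```

And that is for a single copy, before you account for the time spent burning and labeling 120 discs.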
There’s a bigger problem with 3-2-1, though: how do you detect any storage problems in the first place?
Why 3-2-1 isn’t enough: the impact of “bit rot” #
All commonly available storage formats eventually suffer from “bit rot,” or data degradation.
Your files are made up of hundreds of thousands or even millions of bits of data.
If a handful of those bits are “flipped” due to storage defects or media degradation, your file can become unreadable.
Here’s an example of an image with only a couple of bits flipped:
Note that the contents of the rightmost image are 99.999% correct, yet the image isn’t viewable.
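You can get a feel for how small this kind of damage is with a few lines of Python. This is only a sketch: the "photo" is a stand-in buffer of one million zero bytes rather than a real image, and the helper names `flip_bits` and `fraction_identical` are made up for this example.

```python
import random


def flip_bits(data: bytes, count: int, seed: int = 0) -> bytes:
    """Return a copy of data with `count` distinct, randomly chosen bits inverted."""
    rng = random.Random(seed)
    damaged = bytearray(data)
    for position in rng.sample(range(len(data) * 8), count):
        damaged[position // 8] ^= 1 << (position % 8)
    return bytes(damaged)


def fraction_identical(a: bytes, b: bytes) -> float:
    """Fraction of bits that are the same in two equal-length byte strings."""
    differing = sum(bin(x ^ y).count("1") for x, y in zip(a, b))
    return 1 - differing / (len(a) * 8)


original = bytes(1_000_000)        # stand-in for a 1MB photo file
damaged = flip_bits(original, 3)   # simulate three flipped bits
print(f"{fraction_identical(original, damaged):.5%} of bits still match")
```

Three flipped bits out of eight million leaves well over 99.999% of the file intact, but if one of those bits lands in a header or a compressed data stream, an image decoder may refuse the whole file.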
Hard drives and optical discs that are five or more years old may have some amount of bit rot. Hard drives and SSDs do have some internal error correction, but they are still at risk when corruption exceeds what that correction can handle.
Many of our older family photos on old hard drives had succumbed to some amount of bit rot. After discovering this unpleasantness, we updated PhotoStructure to detect and skip over photos and videos that are corrupt, with the hope that you’ll have several copies of a given photo, and one of them won’t have bit rot.
But this feature just treats the symptom: it doesn’t fix the underlying problem.
Overcoming bit rot #
Several advanced filesystems, including Btrfs and ZFS, support data scrubbing, which detects and repairs bit rot automatically.
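To make the idea of scrubbing concrete, here is a toy sketch in Python. It is not how ZFS or Btrfs are actually implemented. It assumes each block of data is kept on two mirrored disks alongside a SHA-256 checksum, and the names `checksum`, `scrub`, and `mirrors` are invented for illustration.

```python
import hashlib


def checksum(block: bytes) -> str:
    """Checksum recorded when the block was first written."""
    return hashlib.sha256(block).hexdigest()


def scrub(mirrors: list[bytearray], expected: str) -> str:
    """Check every copy of a block against its stored checksum.

    Returns "ok", "repaired", or "unrecoverable". Damaged copies are
    overwritten in place with the contents of a copy that still verifies.
    """
    good = [m for m in mirrors if checksum(bytes(m)) == expected]
    if not good:
        return "unrecoverable"   # every copy is damaged: restore from backup
    if len(good) == len(mirrors):
        return "ok"
    for m in mirrors:
        if checksum(bytes(m)) != expected:
            m[:] = good[0]       # rewrite the bad copy from a good one
    return "repaired"


block = b"family photo bytes"
stored_checksum = checksum(block)
mirrors = [bytearray(block), bytearray(block)]
mirrors[1][0] ^= 0b0000_0001             # simulate one flipped bit on the second disk
print(scrub(mirrors, stored_checksum))   # prints: repaired
```

The sketch also shows why redundancy matters: a checksum alone can only tell you that a block is damaged. Repairing it requires a second copy that still verifies.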
Unfortunately, neither Windows nor macOS supports these filesystems (and their newer filesystems, like APFS, still don’t detect bit rot). Time Machine and Backup and Restore don’t detect bit rot either.
So: how do normal people use these fancy new filesystems?
NAS to the rescue #
Network-attached storage (NAS) devices hold several large hard drives and quietly do their work safely storing your files. You can keep using your favorite OS, but you don’t have to worry about bit rot anymore.
Most NAS devices also support running Docker containers. PhotoStructure runs under Docker, and is designed to run smoothly on a NAS.
That sounds great. Which NAS should I get? #
Synology is the most “plug and play,” but it requires Synology’s own proprietary hardware, which is more expensive than building your own.
FreeNAS is free, easy to install, runs well, supports ZFS, and works with a wide variety of hardware.
If building your own computer is intimidating and you’d like to try FreeNAS, the FreeNAS Mini comes pre-assembled.
unRAID, like FreeNAS, runs on hardware you already have, and supports XFS, but requires a license to run.
(PhotoStructure has no commercial affiliation with FreeNAS, unRAID, or Synology, but we have several of each, both for testing PhotoStructure, and to store our family’s photo libraries).
Getting started with your NAS #
- Consider getting a NAS that has 4 or more drives to offer redundancy and support data integrity checks.
- Buy 50% more storage than you need right now so you have room to grow in the near future.
- Enable weekly data scrubbing.
- Enable snapshotting, if available.
- Enable monthly S.M.A.R.T. self-tests.
- Set up your NAS to either apply security patches automatically or notify you to do so.
- Consider installing a virus scanner and malware detection package on your NAS. Synology has a “security audit” tool as well.
- Make sure your router has recently updated firmware.
- Use secure admin and user passwords. Using a password manager like Bitwarden makes this easy to do.
- Configure your NAS to tell you if it has any errors: you don’t want any disks dying or backups failing without you knowing it.
How to get “at least 3 copies” #
Please understand: RAID is not a backup. Please read that post before continuing.
Satisfy the 2nd or 3rd copy of the “at least 3 copies” rule by copying the entire contents of your NAS to an external hard drive that normally stays powered down and offline. Do this quarterly.
Consider storing this drive near your emergency kit so you can grab it as you leave your house in case of an emergency.
This external drive can also reduce your data loss if your NAS catastrophically fails, or if you get hit by malware like cryptolockers.
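After each quarterly copy, it is worth confirming that the external drive really holds what the NAS holds. The following Python sketch is one way to do that, not a feature of any particular NAS: it hashes every file under a source folder and compares it with the file at the same relative path in the backup. The function names are illustrative.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of a file, read in chunks so large videos don't fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        while chunk := handle.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(source: Path, backup: Path) -> list[str]:
    """Return relative paths that are missing from, or differ in, the backup."""
    problems = []
    for src_file in sorted(p for p in source.rglob("*") if p.is_file()):
        relative = src_file.relative_to(source)
        copy = backup / relative
        if not copy.is_file():
            problems.append(f"missing: {relative.as_posix()}")
        elif file_digest(src_file) != file_digest(copy):
            problems.append(f"differs: {relative.as_posix()}")
    return problems
```

You would call it with your own mount points, for example `verify_backup(Path("/mnt/nas/photos"), Path("/mnt/external/photos"))` (both paths are hypothetical). An empty list means every file on the NAS has a byte-identical copy on the external drive. A "differs" entry only tells you the two copies disagree, not which one is damaged.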
How to get “at least 1 copy offsite” #
Satisfy the “at least 1 offsite” rule by setting up your NAS to back up to a cloud service automatically. Backblaze and tarsnap are both well-regarded offsite storage solutions, and both have solutions that work with your new NAS.
If you don’t want to pay for cloud storage, you can set up another NAS somewhere else (at a friend’s or family member’s house, for example). Both FreeNAS and Synology support NAS-to-NAS replication.
Make sure you configure the replication job to run in the middle of the night, and throttle network bandwidth so you don’t make your friends or family grumpy.
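Throttling has a cost: the first full replication can take a long time. Here is a rough back-of-the-envelope estimate in Python. The library size, bandwidth cap, and nightly window below are made-up example numbers, and `nights_needed` is just an illustrative name.

```python
def nights_needed(data_gb: float, limit_mbit_per_s: float, hours_per_night: float) -> float:
    """Nights needed to send data_gb when capped at limit_mbit_per_s during a nightly window."""
    gb_per_hour = limit_mbit_per_s / 8 / 1000 * 3600   # megabits/s -> gigabytes/hour
    return data_gb / (gb_per_hour * hours_per_night)


# Hypothetical example: a 2TB library, capped at 20 Mbit/s, running 6 hours a night.
print(f"about {nights_needed(2000, 20, 6):.0f} nights")
# prints: about 37 nights
```

With numbers like these, the first sync takes over a month, so plan for it. Later runs only need to send what has changed.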
How do I back up files on my phone? #
Resilio Sync (for iOS and Android) and Syncthing (for Android) will automatically back up your phone to your NAS. You install the software on both your phone and your NAS, and then configure the phone to sync to the NAS.
And now that your files are safe… #
You might want to try PhotoStructure, which is a self-hosted photo management solution that runs on your NAS using Docker.