RAID is not a backup
Safely storing your most important files requires both
- using filesystems that can detect bit rot, as well as
- keeping copies of those files on different devices.
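Filesystems that detect bit rot generally do so by storing checksums alongside the data and re-checking them later. The sketch below illustrates the same idea at the file level, for files you never intend to change. It is only an illustration, not how any particular filesystem implements it, and the function names (`file_digest`, `build_manifest`, `find_changed`) are made up for this example.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its digest."""
    return {
        str(p.relative_to(root)): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_changed(root: Path, manifest: dict[str, str]) -> list[str]:
    """List files that are missing or no longer match the recorded digest."""
    changed = []
    for rel, expected in manifest.items():
        p = root / rel
        if not p.is_file() or file_digest(p) != expected:
            changed.append(rel)
    return changed
```

Build the manifest once while the files are known to be good, keep it somewhere other than the drive it describes, and run `find_changed` periodically. A mismatch on a file you did not edit suggests silent corruption, at which point you need a good copy on another device to restore from.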
What’s a “RAID”?
A Redundant Array of Inexpensive Disks, or “RAID,” lets you gather together several hard drives to form a single “volume” to host your files.
NAS devices make it easy to use RAID.
There are different flavors of RAID. The most common flavors offer better performance, data redundancy, or a combination of both.
RAID-1, for example, copies every bit to two different drives. If one drive fails, your data is still on the healthy drive.
Why isn’t RAID a backup?
It may be tempting to consider the data redundancy that a RAID-1 or RAID-5 array offers as a “second copy” of your files (as per 3-2-1 backups).
This data redundancy offers very different benefits from periodic, offline backups, however.
1. Data loss from human and software errors
If you, or software you run, accidentally delete files on a RAID device, those files are gone on both mirrored devices. Your disconnected backup will still have those files, though.
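To make the difference concrete, here is a toy model of a two-drive mirror. It is a deliberately simplified sketch, not how a real RAID controller works: each "drive" is just a Python dict, and `MirroredVolume` and the file names are invented for the illustration.

```python
class MirroredVolume:
    """Toy two-drive mirror: every write and delete hits all healthy drives."""

    def __init__(self):
        # Each "drive" maps file name -> contents; None marks a failed drive.
        self.drives = [{}, {}]

    def _healthy(self):
        return [d for d in self.drives if d is not None]

    def write(self, name, data):
        for drive in self._healthy():
            drive[name] = data

    def delete(self, name):
        for drive in self._healthy():
            drive.pop(name, None)

    def fail_drive(self, index):
        self.drives[index] = None

    def read(self, name):
        for drive in self._healthy():
            if name in drive:
                return drive[name]
        return None


volume = MirroredVolume()
volume.write("taxes.pdf", b"2023 return")
volume.write("photo.jpg", b"beach")
offline_backup = dict(volume.drives[0])  # stand-in for a disconnected backup

volume.delete("taxes.pdf")  # accidental deletion is mirrored faithfully
copies_left = sum("taxes.pdf" in d for d in volume.drives)  # 0

volume.fail_drive(0)  # a drive failure, by contrast, is survivable
still_readable = volume.read("photo.jpg")  # b"beach"

restored = offline_backup["taxes.pdf"]  # only the offline copy has it
```

The mirror does exactly what it was asked to do in both cases. That is why it helps with a dead drive and does nothing for a mistaken delete.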
2. Data loss from malware
Similarly, if a computer on your LAN is infected by malware such as a cryptolocker or another form of ransomware, the files on all drives in your RAID will be affected, and the RAID won’t help you recover your data.
3. Data loss from power, controller, and cache failures
Power surges, power supply failures, and internal component failures in the motherboard or disk controller attached to your RAID can be catastrophic, taking all (or most) of the drives in your RAID with them.
System crashes paired with write caches on drives can cause RAIDs to quietly fall out of sync, as well.
4. Data loss from correlated failures
Two or more drives produced in the same factory on the same day tend to have statistically correlated failures, so a RAID built from such drives is more likely to lose several of them at around the same time.
Unfortunately, if you pick drives from different manufacturers to form your RAID, it may suffer performance issues due to different cache sizes, different spindle and seek speeds, and slightly different responses to the same disk controller commands.
5. Data loss from array rebuild failures
Correlated drive failures don’t have to happen at precisely the same time, either: when one drive fails and you “rebuild” your RAID with a new drive, the remaining old drive(s) are put under heavy load and can fail during the rebuild. With 12 TB drives now in common use, rebuilds can take days, or even a week or more, depending on hardware. (It’s a long time to nervously watch the blinking lights in your NAS!)
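A rough back-of-the-envelope estimate shows why rebuild times stretch out. The sketch below assumes the whole drive must be rewritten at a constant sustained rate. The two rates used are illustrative assumptions, not measurements, and `rebuild_days` is a helper named for this example.

```python
def rebuild_days(capacity_tb: float, rebuild_mb_per_s: float) -> float:
    """Days needed to rewrite a whole drive at a sustained rate (decimal units)."""
    capacity_bytes = capacity_tb * 1e12
    seconds = capacity_bytes / (rebuild_mb_per_s * 1e6)
    return seconds / 86_400  # seconds per day


# Assumed best case: an otherwise idle array sustaining 150 MB/s.
idle_array = rebuild_days(12, 150)  # about 0.93 days

# Assumed busy case: the array keeps serving files, leaving 20 MB/s for the rebuild.
busy_array = rebuild_days(12, 20)   # about 6.94 days
```

Under these assumptions the same 12 TB drive takes anywhere from under a day to nearly a week, and the surviving drives are being read continuously for that entire time.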
Beware: snapshots are (also!) not backups
Filesystems that support “snapshots” can automatically provide views of historical filesystem states.
Snapshots are very convenient, in that they are “set and forget”: your NAS takes care of them automatically, after they are set up (as opposed to offline backups, which take some amount of manual effort).
Local snapshots can prevent data loss due to human and software errors and malware, but they do not protect against power and controller errors, correlated failures, or array rebuild failures.
(If you periodically copy your snapshots to another device, then they absolutely are “backups” that give you protection against hardware errors on the snapshot source.)
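As a minimal sketch of that last point, assume a snapshot is exposed as an ordinary read-only directory and the other device is mounted at some path. Both are assumptions, as is the `copy_snapshot` helper. Snapshot-capable filesystems usually ship their own replication tools, which are generally a better fit than a plain file copy.

```python
import shutil
from datetime import datetime, timezone
from pathlib import Path


def copy_snapshot(snapshot_dir: Path, backup_root: Path) -> Path:
    """Copy a snapshot directory into a new timestamped folder under backup_root."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    destination = backup_root / f"{snapshot_dir.name}-{stamp}"
    # copytree raises FileExistsError rather than overwrite an existing
    # destination, so an earlier copy is never silently clobbered.
    shutil.copytree(snapshot_dir, destination)
    return destination
```

Each run produces a separate dated copy on the second device, so a hardware failure on the snapshot source no longer takes the history with it.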