What Is RAID? A Practical 2023 Guide to Definitions, Use and Costs
Here at Cloudwards.net we’re all about backup: after all, you never know when disaster will strike and your primary file storage is wiped out. Though generally using any of our best online backup providers will do the trick, if you really want to be safe, we recommend a hybrid backup strategy that includes a hardware element as well. This is where RAID comes into play.
In short, RAID is a way to link up several hard drives so that if one of them fails, the others can take over the load, more or less seamlessly. Though it is usually found in big, commercial servers, plenty of people have a RAID array set up at home, as having that extra backup measure in place gives great peace of mind.
In this article, we’ll be going through what RAID is, the most common setups found in people’s homes or small businesses and also a few practical tips and tricks. However, one important caveat — and this cannot be emphasized enough — is that RAID is not an alternative to a proper backup strategy.
If more drives fail at the same time than your array is built to survive, you’ll find yourself up the proverbial creek unless you have a backup ready. With that warning out of the way, let’s go RAIDing.
Hard Drives, Failures and RAID
Computers in general are surprisingly resilient. Humble backend servers can run for decades without so much as a reboot. However, the same cannot be said for the ubiquitous hard disk drive — we call it a computer crash because that’s precisely what the read heads do: they crash into the actual drive.
Spinners may have gotten faster and physically smaller while at the same time offering far more storage, but they are still one of the last mechanical parts left in machines that are otherwise entirely solid-state.
Failure rates, even on modern HDDs, are sobering: it is not uncommon for a particularly unlucky model of HDD to suffer a failure rate of 27 percent during its warranty period, which is rarely more than two or three years.
Now, if a hard drive fails during the warranty, the manufacturer will probably replace the hardware but your data is under no warranty of any kind. Though you could use any of our best data recovery software solutions and get your files back, none of them come cheap and none are fully guaranteed to work.
This is where RAID can offer some peace of mind. RAID is a blanket term that covers a variety of setups in which you can pool storage drives so that they behave as a single volume. When done correctly, the drives combined display as a single massive volume that offers higher reliability and better speed than each individual drive.
In the right setup, RAID can be your first line of defense against data loss and will allow continued access to your files in situations where a hardware failure would have otherwise caused either delays due to reduced speed, or downtime until data is restored from a backup.
What is RAID?
RAID stands for Redundant Array of Independent Disks (the “I” used to stand for “Inexpensive” before succumbing to the black magic of marketing). The idea came about in the 80s as a way of overcoming problems with the size, speed and reliability of HDDs.
RAID is not a monolithic concept — far from it. The way in which disks interact within an array varies widely, as do the benefits, drawbacks and risks of each configuration. In addition to the standard RAID setups (called levels), there are a number of noteworthy additions, offshoots and proprietary implementations. Because those will often be described using a standard level as reference, we’ll start with those.
The first standard RAID level is “striping,” designated as RAID 0 (zero). It increases performance at the cost of reliability by splitting data into blocks and distributing — striping — it across all drives.
Reading and writing simultaneously to multiple drives removes the speed bottleneck imposed by a single drive’s maximum speed. Because each disk handles only a fraction of the full task, speed is substantially higher than in single-drive operation and all disk space is usable.
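To make the idea of striping concrete, here is a minimal sketch in Python. It treats each "drive" as a plain list of blocks and deals the data out round-robin; the function names and the tiny 4-byte block size are assumptions made for illustration, not how any particular RAID controller works.

```python
def stripe(data: bytes, num_drives: int, block_size: int = 4) -> list[list[bytes]]:
    """Split data into fixed-size blocks and deal them round-robin across drives."""
    drives: list[list[bytes]] = [[] for _ in range(num_drives)]
    for offset in range(0, len(data), block_size):
        block_index = offset // block_size
        drives[block_index % num_drives].append(data[offset:offset + block_size])
    return drives


def unstripe(drives: list[list[bytes]]) -> bytes:
    """Reassemble the original data by reading blocks back in round-robin order."""
    blocks = []
    longest = max(len(drive) for drive in drives)
    for row in range(longest):
        for drive in drives:
            if row < len(drive):
                blocks.append(drive[row])
    return b"".join(blocks)


data = b"ABCDEFGHIJKLMNOPQRST"
drives = stripe(data, num_drives=2)
for number, drive in enumerate(drives):
    print(f"drive {number}: {drive}")
print(unstripe(drives) == data)
```

Each drive ends up holding only every other block, which is why both drives can work at the same time, and also why neither drive on its own holds a complete file.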
RAID and Failure Rates
Statistics will tell you that having two drives roughly doubles the chance of a drive failure. In RAID 0, a single failure takes out the data on both drives, because every file is split across them. From the point of view of data security, this is not the way to go. At least in independent operation a drive failure won’t wipe other random drives nearby, although … well, let’s just say it pays to never overestimate your IT department.
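The arithmetic behind that claim can be sketched in a few lines of Python. This assumes drive failures are independent of each other, which is a simplification, and the 5 percent per-drive figure is a placeholder chosen for illustration rather than a measured failure rate.

```python
def any_drive_fails(p_single: float, num_drives: int) -> float:
    """Chance that at least one of num_drives fails, assuming independent failures."""
    return 1 - (1 - p_single) ** num_drives


p = 0.05  # placeholder per-drive failure probability, not a real-world statistic

single = any_drive_fails(p, 1)
raid0_two = any_drive_fails(p, 2)

print(f"single drive:       {single:.4f}")
print(f"RAID 0, two drives: {raid0_two:.4f}")
```

With these placeholder numbers the two-drive stripe comes out at 9.75 percent, just under double the single-drive figure, and in RAID 0 any one of those failures costs you the whole array.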
At the opposite end of the safety spectrum is a so-called mirror array, designated RAID 1. A mirror array writes full copies of the data to each of the drives in the array. Should one drive fail, there will still be one or more complete copies of the data available. It is a hardware backup at the lowest level: only the death of all disks in the array will result in data loss.
However, with great safety comes great expense: only one disk’s worth of capacity is available. Two, three or even five 1TB drives in this setup will all yield the same usable single terabyte.
As you can imagine, this is safe, but it limits maximum storage capacity to the size of the largest hard disk you can fit into your RAID array (which a quick look around the digital marketplace shows is 10TB). To increase storage space while keeping a level of security, you want to use something called parity.
RAID and Parity
Parity is where RAID gets really interesting, with RAID 5 being the most popular parity setup. In a parity array, part of each drive is reserved for backup. All HDs in the array pool the capacity of available disks (as in RAID 0), but part of each drive is reserved for a checksum calculation that, in the event of one drive failing, can be used in reverse to calculate the missing data.
Note: for the highly complicated math behind parity, check out this article.
In the event of a drive failure, the data and parity from the surviving drives can be used to rebuild the missing data onto a replacement drive. No matter the number or capacity of the drives in RAID 5, the equivalent of exactly one drive’s capacity is “lost” to parity, which makes it more scalable than RAID 1. A single extra disk protecting four or five disks’ worth of data is a sound investment.
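The "calculation in reverse" is easier to see with a toy example. The sketch below uses XOR, the operation commonly used for single parity, on three small data blocks. It is a simplification: a real RAID 5 array rotates the parity across all drives and works on much larger stripes, and the block contents and names here are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks, one per data drive (illustrative contents).
data_blocks = [b"RAID", b"five", b"demo"]

# The parity block is the XOR of all data blocks.
parity = xor_blocks(data_blocks)

# Simulate losing the second drive: only two data blocks and the parity survive.
survivors = [data_blocks[0], data_blocks[2], parity]

# XOR-ing everything that survived yields the missing block.
rebuilt = xor_blocks(survivors)
print(rebuilt == data_blocks[1])
```

The same trick works no matter which single block goes missing, including the parity block itself, but it cannot recover two missing blocks at once.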
Other Parity Setups
RAID levels 2 and 3 are also parity setups, though they have been made largely obsolete by RAID 5. RAID 4 differs from 5 in that it uses a dedicated parity drive, meaning that it stores all parity on one drive and uses the others for data. RAID 5’s distributed parity better balances the wear on the drives.
RAID 6 uses double parity but is otherwise identical to RAID 5. Two disks’ worth of space is used for parity in an array that can survive — yup — two disk failures without data loss.
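The capacity rules for the levels covered so far fit in one small function. This is a rough sketch rather than a sizing tool: the function name is hypothetical, the minimum drive counts are the commonly quoted ones, and mixed drive sizes are handled by treating every drive as if it were as small as the smallest one.

```python
def usable_capacity_tb(level: int, drive_sizes_tb: list[float]) -> float:
    """Approximate usable capacity of a RAID 0, 1, 5 or 6 array in terabytes."""
    count = len(drive_sizes_tb)
    smallest = min(drive_sizes_tb)

    # (minimum number of drives, number of drives' worth of usable space)
    rules = {
        0: (2, count),      # striping: every drive holds data
        1: (2, 1),          # mirroring: one drive's worth, however many copies
        5: (3, count - 1),  # single parity: one drive's worth lost
        6: (4, count - 2),  # double parity: two drives' worth lost
    }
    if level not in rules:
        raise ValueError(f"unsupported RAID level: {level}")

    minimum, data_drives = rules[level]
    if count < minimum:
        raise ValueError(f"RAID {level} needs at least {minimum} drives")
    return data_drives * smallest


print(usable_capacity_tb(0, [2, 2]))
print(usable_capacity_tb(1, [1, 1, 1]))
print(usable_capacity_tb(5, [2, 2, 2, 2, 2]))
print(usable_capacity_tb(6, [4, 4, 4, 4, 4, 4]))
```

Running it for a handful of layouts makes the trade-off visible: mirroring stays stuck at one drive's capacity, while the parity levels give up a fixed one or two drives however large the array grows.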
RAID levels can also be nested, which basically means putting a RAID array inside another RAID array. This is used mainly by companies that either want to store a lot of data or badly want to speed up their current RAID setup. We won’t go too much into nesting: if you want to set this up, you probably know enough to not have to read this article.
Setting up RAID: Hardware
Now that you know a little more about RAID, let’s see how we can set up an array. Though there are plenty of do-it-yourself options, we recommend that you buy a RAID enclosure, which is basically a handy little box you can slot HDDs into and which will also come with some ready-to-go software.
These enclosures are a real time saver and are generally affordable: below you’ll find the specs of the TerraMaster and ProBox.
|Brand|Model|Number of bays|MSRP|
|Noontec|TerraMaster|2 or 4|From $200|
As for the HDDs you’ll be slotting in, it pays to look around while keeping RAID’s “inexpensive” roots in mind. If you’re implementing a thorough backup strategy, you don’t need to buy top-of-the-line hardware with infinitesimal failure rates. After all, you’re building an array that can catch failures.
Also, don’t forget that you can save costs by doing some basic arithmetic. RAID 5 needs at least three drives and gives up one drive’s worth of capacity to parity, so to build a RAID 5 array with 10TB of usable space you can use either six 2TB disks or three 5TB drives: either way, you get the same result. However, three 5TB drives will generally be a lot cheaper than six 2TB ones. Do note that if the drives in an array are not all the same size, every drive is treated as if it were the size of the smallest drive in the array.
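If you want to run this kind of comparison yourself, a short Python sketch will do. It assumes a RAID 5 array needs at least three drives and loses one drive's worth of capacity to parity; the function name is hypothetical and the prices are made-up placeholders, so swap in whatever your retailer actually charges.

```python
import math


def raid5_drives_needed(usable_tb: float, drive_tb: float) -> int:
    """Number of equal-size drives needed for a RAID 5 array with usable_tb of space."""
    data_drives = math.ceil(usable_tb / drive_tb)
    # One extra drive's worth for parity, and never fewer than three drives.
    return max(3, data_drives + 1)


# Placeholder prices in dollars per drive, keyed by drive size in TB (not real quotes).
placeholder_prices = {2: 70, 5: 130}

for drive_tb, price in placeholder_prices.items():
    count = raid5_drives_needed(10, drive_tb)
    print(f"{count} x {drive_tb}TB drives, total ${count * price}")
```

Fewer, larger drives also mean you need fewer bays in your enclosure, which can matter as much as the price of the disks.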
The HDDs you’ll most likely want to use fall in the $60 to $200 range, which combined with the cost of a RAID enclosure should bring you up to a sizeable, yet manageable bill.
Another option is to use network-attached storage (NAS) in a RAID setup. We’re big fans of Synology here at Cloudwards.net thanks to its durability, reliability and wide variety of backup options (we have an article on best cloud backup for Synology if you’d like to know more). The downside to Synology is that it isn’t the cheapest solution out there.
|Brand|Model|Number of bays|MSRP|
|Synology|DS416 Play|4|From $415|
Setting up RAID: Software
The biggest benefit to using a NAS or enclosure in a RAID setup is that you get the software included: there are plenty of other options out there, but they are generally very unfriendly to even experienced users and require some serious know-how.
At the most basic level you can use an Intel motherboard to set up your array: this should work well enough and comes with some good software and drivers that will help the tech-savvy set up fairly quickly. Another reason to stick with Intel is that it’s easy to find new parts when the controller fails, a rather unpleasant occurrence that will lock you out of your array until you find a new one.
Other software options include Windows Storage Spaces as well as FreeNAS, though both require more than even intermediate knowledge of what you’re doing. Note that FreeBSD-based FreeNAS is almost entirely terminal-based and a great way for geeks to find out how much suffering they can endure before switching over to something else.
RAID and Backup
As was said all the way at the beginning of the article, a RAID array, even in multi-disk parity, is not an alternative to a proper backup. It’s a great way to avoid restoring backups when time is of the essence as well as simply adding more storage space to your computer. It also ensures data does not get lost too easily just because a single hard drive failed.
Keep in mind that when a drive fails, rebuilding data from parity takes time. The more data that needs to be rebuilt, the longer the array will be operating without redundancy. That’s nail-biting time during which another failure will mean a very bad day indeed; using one of our best cloud storage and backup services will, at the very least, keep your nails intact, and cloud online storage solutions start at around $5 a month.
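For a feel of how long that nail-biting window can be, here is a back-of-the-envelope estimate in Python. It assumes the rebuild has to write the entire replacement drive at a steady rate; both the function name and the 100MB/s figure are placeholders for illustration, and a real array that is still serving files during the rebuild will usually be slower.

```python
def rebuild_hours(drive_tb: float, rebuild_mb_per_s: float) -> float:
    """Rough hours needed to rewrite a whole drive at a constant rebuild speed."""
    total_bytes = drive_tb * 1e12          # decimal terabytes, as drive makers count them
    bytes_per_second = rebuild_mb_per_s * 1e6
    return total_bytes / bytes_per_second / 3600


for size_tb in (2, 4, 10):
    hours = rebuild_hours(size_tb, rebuild_mb_per_s=100)  # placeholder speed
    print(f"{size_tb}TB drive: about {hours:.1f} hours without redundancy")
```

The estimate scales linearly with drive size, which is one reason large single-parity arrays make people nervous.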
In our earlier article comparing HDDs, NAS and cloud backup we recommend Carbonite as one great option to back up your RAID array, as this provider has custom-built software that helps you back up RAID setups without too much fuss. For more info on what we think about it, check our Carbonite review.
Is RAID for Me?
With all the above in mind, let’s see if RAID is the way to go for you. If you store a lot of data on your computer and want to make sure none of it gets lost just because of a small HDD failure, or if you just want to speed up operation of multiple hard drives, then, yes, RAID is a smart option.
Though it isn’t the cheapest solution out there, RAID is a great alternative to doing everything on the cloud, especially for people that work with audio and video, as there is little on offer in the SaaS space for these groups beyond storage.
Understand, however, that RAID is not a panacea: a hardware controller can fail and you may be out of luck finding a compatible model to replace it. RAID is not a backup solution, so losing several HDDs at once means that your data is gone unless backed up.
Though RAID isn’t for everyone, it offers great benefits for certain groups of people. Are you one of them? How did setting up a RAID array work out for you? Please let us know in the comments below. Thank you for reading.