RAID - Waikato Linux Users Group

Acronym for Redundant Array (of) Inexpensive Disks.

The idea behind RAID is to combine an array of disks (usually inexpensive, although that doesn't stop people buying expensive disks for their arrays) into one logical disk. There are different types of RAID (listed below), each with its own advantages and disadvantages.

See RaidOnLinux and RaidNotes for some specific notes about RAID under Linux.

RAID 0: Striping

RAID 0 technically isn't RAID: It provides no redundancy or fault tolerance.

However, data is striped across the disks (they aren't just concatenated). This means that file I/O is faster than with a single disk, since each disk can read or write data independently of the others.
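As a rough illustration of how striping spreads data, the sketch below maps a logical block number to a disk and an offset on that disk. It assumes a simple round-robin layout with a stripe unit of one block; the function name `locate_block` is made up for this example, and real implementations use larger, configurable chunk sizes.

```python
def locate_block(logical_block, num_disks):
    """Map a logical block to (disk index, block offset on that disk).

    Assumes round-robin striping with a stripe unit of one block.
    """
    if num_disks < 1:
        raise ValueError("need at least one disk")
    disk = logical_block % num_disks
    offset = logical_block // num_disks
    return disk, offset


# Consecutive logical blocks land on different disks, so a large
# sequential read can keep every disk busy at once.
layout = [locate_block(b, num_disks=3) for b in range(6)]
print(layout)
```

With three disks, logical blocks 0, 1 and 2 go to disks 0, 1 and 2, and block 3 wraps around to disk 0 again.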

Advantages

No parity generation
Easy to implement in software and hardware
Cheap to implement
Utilises full disk capacity; no space is wasted storing redundant data

http://www.raidarray.eu.com/raid0.html or http://www.acnc.com/raid.html

Disadvantages

If any disk fails, you lose all your data
Not true RAID

Applications

Anything where you need fast I/O, particularly streaming I/O, for example video editing.

RAID 1: Mirroring

When writing data, write it to all disks in the array; when reading, read from any of the disks in the array. If any disk in the array fails, you can replace it easily and rebuild the array without losing data. If your disks are hot-swappable, you can do this with only minor performance losses.

Advantages

Can survive all but one disk in the array failing simultaneously
Easy to implement in software and in hardware.

Disadvantages

The cost per MB is high, since you need to buy at least twice as much disk space as you need.
Extremely wasteful of disk space (since at least 50% of your raw capacity holds duplicate copies)
Writes can be slowed down

Applications

When you just can't afford to have your data die on you.
When you need good read performance but don't care about write performance.

RAID 2: Striping + ECC

This uses striping with some disks holding ECC information. Apparently no one has ever implemented this spec, because it's so complicated and, really, the other RAID levels do it better. Be sure to prove me wrong by finding someone that does RAID level 2 :)

RAID 3: Parity Disk

RAID 3 has a parity disk which stores an XOR of all the other disks. If any one data disk fails, its contents can be recreated by XOR'ing all the surviving data disks together and then XOR'ing the result with the parity data. If the parity disk fails, it can be regenerated by XOR'ing all the data disks together.
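The XOR trick is easy to try out. The following is a minimal sketch using short byte strings as stand-ins for whole disks; `xor_blocks` and the sample data are invented for illustration and say nothing about how a real controller lays out bytes.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three hypothetical data disks, each holding one 4-byte block.
data_disks = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]
parity = xor_blocks(data_disks)

# Simulate losing disk 1: XOR the surviving data disks with the parity.
survivors = [data_disks[0], data_disks[2]]
rebuilt = xor_blocks(survivors + [parity])
assert rebuilt == data_disks[1]
```

This works because XOR'ing a value with itself gives zero, so every surviving disk cancels out of the parity and only the missing disk's data remains.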

Advantages

Efficient use of data storage
High read speed

Disadvantages

Requires at least 3 disks
Inefficient with small data transfers

http://www.raidarray.eu.com/raid3.html

RAID 4: Block level striping with Parity Disk

RAID 4 stripes in blocks instead of bytes, and stripes the data across all the disks except one, which stores the parity. Read performance is good, but every write must also update the single parity disk, which can become a bottleneck.

RAID 5: Parity shared across disks

Ah, RAID 5! RAID 5 combines the advantages of RAID 3 and RAID 0 by spreading the parity information across all drives. This is the most common type of RAID.
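To picture what "spreading the parity" means, here is a small sketch that prints which disk holds parity for each stripe. It assumes the parity block moves one disk to the left on each successive stripe, which is only one of several layouts used in practice; the function name `raid5_stripe_layout` is hypothetical.

```python
def raid5_stripe_layout(stripe, num_disks):
    """Return a list describing what each disk holds for one stripe.

    Assumes parity moves one disk to the left on each successive stripe;
    this is only one of several layouts used in practice.
    """
    if num_disks < 3:
        raise ValueError("RAID 5 needs at least 3 disks")
    parity_disk = (num_disks - 1 - stripe) % num_disks
    layout = []
    data_index = 0
    for disk in range(num_disks):
        if disk == parity_disk:
            layout.append("P")
        else:
            layout.append(f"D{data_index}")
            data_index += 1
    return layout


for stripe in range(4):
    print(stripe, raid5_stripe_layout(stripe, num_disks=4))
```

Over any run of `num_disks` stripes, each disk holds parity exactly once, so no single disk takes all the parity writes.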

Advantages

Optimum Cost/Performance/Fault Tolerance
Very efficient
Handles small writes better than RAID 3 or 4, since parity updates are spread across all disks
Handles multiple I/O requests

Disadvantages

Requires at least 3 disks

http://www.raidarray.eu.com/raid5.html

RAID 6: Dual parity disks

Striped array with two parity disks. Any two disks can fail simultaneously and the array keeps running. Good for when the data can't EVER stop!

The only cards I've found that implement RAID 6 are SATA cards made by Areca (also rebadged as Tekram).

RAID 7

RAID 7 isn't a standard: some company trademarked the name, came up with their own proprietary system, and called it RAID 7.

RAID 1+0 (or "10")

Data is mirrored and striped across multiple disks (a combination of RAID 1 and RAID 0).

Advantages

Good performance
Highly fault tolerant

Disadvantages

Very expensive
Drive spindles must be synchronised for good performance
Not very scalable

http://www.raidarray.eu.com/raid10.html

RAID 0+1

Two striped arrays mirrored.

Advantages

Simple
Tolerant
Fast

Disadvantages

Expensive
Lots of wasted disk space
If two disks on opposing arrays die, you lose the entire array, whereas 1+0 would require the two disks in the same position (that is, both halves of one mirrored pair) to die before you lose the array, which is far less probable.

http://www.raidarray.eu.com/raid0+1.html
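The comparison between 0+1 and 1+0 can be checked by brute force. The sketch below enumerates every possible two-disk failure under a simplified model: 0+1 is treated as two stripe sets of `n` disks that each die as soon as any member fails, and 1+0 as `n` mirrored pairs. The function names and the disk numbering are assumptions made for this example.

```python
from itertools import combinations


def survives_0_plus_1(failed, n):
    """Two stripe sets of n disks (0..n-1 and n..2n-1), mirrored.

    The array survives while at least one stripe set has no failed disk.
    """
    set_a_ok = all(d >= n for d in failed)
    set_b_ok = all(d < n for d in failed)
    return set_a_ok or set_b_ok


def survives_1_plus_0(failed, n):
    """n mirrored pairs (i, i + n), striped together.

    The array survives while every pair still has one working disk.
    """
    return all(not (i in failed and i + n in failed) for i in range(n))


def fatal_fraction(survives, n, failures=2):
    """Fraction of all `failures`-disk combinations that kill the array."""
    combos = list(combinations(range(2 * n), failures))
    fatal = sum(1 for failed in combos if not survives(set(failed), n))
    return fatal / len(combos)


n = 4  # 8 disks in total
print("0+1:", fatal_fraction(survives_0_plus_1, n))
print("1+0:", fatal_fraction(survives_1_plus_0, n))
```

With eight disks there are 28 possible two-disk failures; under this model 16 of them are fatal to 0+1 but only 4 are fatal to 1+0.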

Visual explanation of various RAID setups:

One suggested way of calculating the stripe size for RAID systems that do a lot of random I/O (machines serving multiple users, e.g. email or compute servers) is to figure out the maximum throughput you can get through your disks (including controllers, PCI bus bandwidth, etc.), then plug it into this formula:

stripesize = throughput / (drives * RPM/60)

Then round the stripe size down to the nearest multiple of your filesystem cluster size (usually 4k).
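The stripe size rule of thumb can be written as a small helper. This is only a sketch of that formula: the function name `suggested_stripe_size`, the choice of bytes per second as the throughput unit, and the sample figures are all assumptions made for illustration.

```python
def suggested_stripe_size(throughput_bytes_per_sec, drives, rpm, cluster_size=4096):
    """Apply stripesize = throughput / (drives * RPM/60), rounded down
    to a multiple of the filesystem cluster size."""
    if drives < 1 or rpm <= 0 or cluster_size < 1:
        raise ValueError("drives, rpm and cluster_size must be positive")
    raw = throughput_bytes_per_sec / (drives * rpm / 60)
    return int(raw // cluster_size) * cluster_size


# Hypothetical figures: 100 MB/s total throughput, 4 drives at 7200 RPM.
size = suggested_stripe_size(100 * 1000 * 1000, drives=4, rpm=7200)
print(size)
```

For these made-up figures the formula gives about 208333 bytes, which rounds down to 204800 bytes (50 clusters of 4k). If the raw value is smaller than one cluster, the helper returns 0, so the result needs a sanity check before use.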

Suggestions for improving this estimate of the optimal stripe size are welcome.