RAID Storage Redundancy

Linux, Storage
Published: August 27, 2013
Modified: September 14, 2023

Overview

RAID (Redundant Array of Inexpensive/Independent Disks)…

  • …dedicated hardware controller or entirely in software
  • A RAID system has several storage devices as the bottom layer
    • …these will be partitioned (another block device)…
    • …and combined into a RAID array (yet another block device)…see the lsblk sketch below
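
A quick way to visualise the layering, assuming a hypothetical two-disk mirror (/dev/md0 built from sda1 and sdb1):
# show the disk → partition → RAID array stack
lsblk -o NAME,TYPE,SIZE /dev/sda /dev/sdb
# NAME      TYPE   SIZE     (illustrative output)
# sda       disk   1.8T
# └─sda1    part   1.8T
#   └─md0   raid1  1.8T
# sdb       disk   1.8T
# └─sdb1    part   1.8T
#   └─md0   raid1  1.8T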

Reasons to use RAID…

  • Performance
    • …enhanced transfer speed
    • …enhanced number of transactions per second
    • …increased single block device capacity
  • Redundancy
    • …faster recovery from a device failure
    • Should never be used as a replacement for reliable backup

Types & Levels

Three possible types of RAID…

  • Firmware RAID…onboard integrated RAID controllers
  • Hardware RAID…dedicated RAID controller (typically PCIe)
  • Software RAID…completely hardware independent

Standard numbering system of RAID levels

RAID without redundancy…

  • Linear - Simple grouping of drives…
    • …creates a larger virtual drive…
    • …called append mode…I/O goes to one device until it is full
  • RAID 0 - Striping
    • …data is interleaved across all drives in the array
    • Bottlenecks associated with I/O to a single device are alleviated
  • JBOD (Just a Bunch Of Disks)
    • …circumvent RAID firmware…act like a normal disk controller
    • …configure devices without RAID support

RAID with redundancy…

  • RAID 1 - Mirroring
    • …exact replica of all data on all devices
    • Write performance hit…increases with number of devices
  • RAID 4 - Dedicated Parity (…obsolete)
    • …dedicated drive is used to store parity
    • Single device failure can be recovered
  • RAID 5 - Distributed Parity (block-level striping)
    • At least 3 devices…support one device failure
    • …parity information is spread across all devices
    • Reduces the bottleneck of writing all parity to a single dedicated device
  • RAID 6 - Double Distributed Parity
    • At least 4 devices…supports two device failures
    • Write speed is slower because of double parity…rebuilds take longer

Nested RAID levels…

  • RAID 10 - Striped Mirror
    • …hybrid array combining RAID-0 and RAID-1
    • Performance of striping…redundant properties of mirroring
    • …most expensive solution…lots of surplus disk hardware
  • RAID 50 - Striped Parity
    • …combines two (or more) RAID-5 arrays into a striped array
    • Performance is slightly lower than RAID 10
    • Each RAID-5 can survive a single disk failure

RAID storage capacities…

RAID level                  Realized capacity
Linear mode                 DiskSize0 + DiskSize1 + … + DiskSizeN
RAID-0 (striping)           TotalDisks * DiskSize
RAID-1 (mirroring)          DiskSize
RAID-4,5,6                  (TotalDisks - ParityDisks) * DiskSize
RAID-10 (striped mirror)    NumberOfMirrorSets * DiskSize
RAID-50 (striped parity)    (TotalDisks - ParityDisks) * DiskSize
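
A quick worked example, assuming a hypothetical set of six 4 TB disks:
echo "RAID-0 : $(( 6 * 4 )) TB"         # 24 TB, no redundancy
echo "RAID-6 : $(( (6 - 2) * 4 )) TB"   # 16 TB, two disks worth of parity
echo "RAID-10: $(( 6 / 2 * 4 )) TB"     # 12 TB, three 2-way mirror sets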

Terminology

  • Parity algorithms are an alternative to mirroring for redundancy…
  • Striping…data is spread across multiple disks
    • …improves read and write performance
    • …stripes also support redundancy through disk parity
  • Degraded…array supporting redundancy…with a failed device
  • Rebuild (aka recovery)…
    • …data is reconstructed using the parity information on the remaining disks
    • Usually puts an additional strain on system resources
  • Scrubbing…check for data corruption and errors…

Automating the use of a spare device…

  • Hot-Spare
    • Extra storage devices to act as spares when a drive failure occurs
    • …replace a failed drive with a new drive without intervention
    • Decreases the chance that a second drive will fail and cause data loss
  • Hot-Swap
    • Remove a failed drive from a running system…without a reboot
    • Needs hardware support…widely available today…

mdadm

Linux Software RAID…

  • mdraid subsystem was designed as a software RAID solution for Linux
    • Package mdadm*.{deb,rpm}…configured with the mdadm utility
    • Can consist of…
      • …physical devices
      • …partitions
      • …any Linux block device
  • dmraid (deprecated) used on a wide variety of firmware RAID implementations
  • Hardware RAID controllers have no specific RAID subsystem…
    • …come with their own drivers…
    • …allow the system to detect the RAID sets as regular disks

Terminology…

  • Superblock contains RAID metadata…
    • ~1 KB stored at the beginning or the end of each member disk
    • RAID information…same on all the disks
      • Number of devices…block size
      • RAID level and layout
    • Disk information…unique to each disk
      • Superblock number…
      • Disk role…column height
    • …allows the array to be reliably re-assembled after a shutdown
  • Assemble…rebuilds all RAID arrays
    • mdadm --assemble --scan…scan drives for superblocks
    • …an array may be left partially assembled in case of issues
    • …automatic during boot…create mdadm.conf
    • …requires support in initramfs if used for / (root-partition)
  • Scrubbing…
    • …regularly read blocks on devices to catch bad blocks early
    • …read-error handling…write back of correct data from other devices
    • …blocks read successfully…found to be inconsistent…counted as a mismatch (sysfs sketch below)
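
A minimal scrubbing sketch using the md sysfs interface (array name /dev/md0 assumed):
# trigger a read-check scrub of /dev/md0
echo check > /sys/block/md0/md/sync_action
# progress shows up in /proc/mdstat; inconsistent-but-readable blocks are counted here
cat /sys/block/md0/md/mismatch_cnt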

/dev/md*

md multiple device driver…man 4 md

  • RAID solution that is completely hardware independent…
    • …implements the various RAID levels in the kernel disk (block device) code
    • …dependent on the server CPU performance and load
  • /dev/md* virtual devices…created from one or more independent underlying devices

Basic command syntax…

mdadm [mode] <raid-device> [options] <component-devices>

mdadm.conf

Configuration file…

  • …collection of words separated by white space… # for comments
  • Keywords…
    • DEVICE list of devices/partitions to scan for components
    • ARRAY identify arrays for assembly
    • MAIL{ADDR,FROM} configure alert mail notification
    • CREATE default for array creation
    • …more cf. man 5 mdadm.conf
# add a configuration to the end
mdadm --detail --scan >> /etc/mdadm.conf

ARRAY example configuration…

ARRAY /dev/md0 metadata=1.2 name=node.fqdn:0 UUID=42650a5c:7eb06556:6db9f264:03ec67e8
  • …second word is the /dev/md* device to be assembled…or <ignore>
  • otherwise…various heuristics are used to determine an appropriate name
  • …subsequent words identify the array…
    • name identifier…typically the node name with device number as suffix
    • uuid 128 bit…stored in the superblock
    • devices…comma separated list of device names
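
A minimal mdadm.conf sketch tying the keywords together (the mail address is hypothetical; the ARRAY line is the one shown above):
# /etc/mdadm.conf
DEVICE partitions                 # consider every partition listed in /proc/partitions
MAILADDR root@localhost           # hypothetical alert recipient
ARRAY /dev/md0 metadata=1.2 name=node.fqdn:0 UUID=42650a5c:7eb06556:6db9f264:03ec67e8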

Create

Create a RAID configuration…mode --create

  • …writes the per-device superblocks…initialisation…
    • …making sure disks of a mirror are identical
    • or…parity array the parities are correct
  • --level= RAID level…
  • --raid-devices= number of active devices in the array
# mirror
mdadm --create /dev/md0 --level=raid1 --raid-devices=2 /dev/nvme[0-1]n1
# parity raid
mdadm --create /dev/md0 --level=raid5 --raid-devices=3 /dev/sd[abc]3
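
The hot-spare concept from above maps to --spare-devices= (-x) at creation time; a sketch with hypothetical devices:
# mirror with one standby hot-spare
mdadm --create /dev/md0 --level=raid1 --raid-devices=2 --spare-devices=1 /dev/sd[bcd]1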

Monitor

# inspect the configuration in detail
mdadm --detail /dev/md0
# detailed information about each RAID device
mdadm --examine /dev/sd[bc]1

/proc/mdstat snapshot of the kernel’s RAID/md state…

watch -n .1 cat /proc/mdstat
# ...more details...
mdadm --misc --detail /dev/md[012]

Manage

Remove device from array…

  • --fail if not already in a failed state (due to a defect)
  • --remove a device from an array
mdadm /dev/md0 --fail /dev/sda1 --remove /dev/sda1

--add a new device to the array (typically replacing a failed one)

mdadm --add /dev/md0 /dev/sdb1

Stop and delete array…

# unmount the file-systems...
umount /dev/md0
# stop the device...
mdadm --stop /dev/md0
# remove the device
mdadm --remove /dev/md0
# ...in case it errors with "No such file or directory"
mdadm --zero-superblock /dev/sda1 /dev/sdb1

dm-raid

Device-mapper RAID (dm-raid) target provides a bridge from DM to MD…

  • …allows the mdraid drivers to be accessed using a device-mapper interface
  • Supports…
    • …RAID device discovery
    • …RAID set activation, creation, removal, rebuild
    • …display of properties for ATARAID/DDF1 metadata
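
A small sketch to check for active device-mapper raid targets, using only generic dmsetup commands:
# list device-mapper tables and filter for raid targets
dmsetup table | grep -w raid
# status lines include the sync/health state of raid targets
dmsetup status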

lvm RAID

LVM supports RAID…

  • …created and managed by LVM using the mdraid kernel drivers
  • …levels 0, 1, 4, 5, 6, and 10
  • Supports snapshots…

Create a RAID logical volume using lvcreate

lvcreate --type raid1 -m 1 -L 1G -n my_lv my_vg
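
A parity-level variant for comparison (hypothetical VG/LV names); -i sets the number of stripes, i.e. data devices, so raid5 with -i 3 needs four PVs:
# RAID-5 logical volume: 3 data stripes plus distributed parity
lvcreate --type raid5 -i 3 -L 1G -n my_raid5_lv my_vg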

ZFS

Implements RAID-Z

  • …variation on standard RAID-5 that offers better distribution of parity
  • Eliminates RAID-5 write hole…inconsistency in case of power loss
  • Three levels…
    • …based on the number of parity devices
    • …number of disks that can fail while the pool remains operational

Pool types in order of performance…

  • mirror – More disks, more reliability, same capacity. (RAID 1)
  • raidz1 – Single parity, minimum 3 disks. Two disk failures result in data loss.
  • raidz2 – Dual parity, minimum 4 disks. Allows for two disk failures.
  • raidz3 – Triple parity, minimum 5 disks. Allows for three disk failures.
zfs list [<name>]                      # show file-systems
zfs set mountpoint=<path> <name>       # set target mount point for file-system 
zfs mount <name>                       # mount a file-system
zfs umount <path>                      # unmount a file-system
grep -i mount= /etc/default/zfs        # boot persistence
findmnt -t zfs                         # list mounted file-systems
zfs list -o quota <name>               # show quota for file-system
zfs set quota=<size> <name>            # set quota for file-system
zpool status [<name>]                  # show storage pools
zpool scrub <name>                     # scrub a pool to verify data integrity
zpool create <name> <type> <device> [<device>,...]
                                       # create a new storage pool
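
A concrete creation sketch (pool name and device names are hypothetical):
zpool create tank raidz2 sdb sdc sdd sde    # dual-parity pool from four disks
zpool status tank                           # verify layout and health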

Btrfs

…ability to combine and manage several disks as one filesystem

Supports the following RAID profiles (see the creation sketch after the list)…

  • RAID 1…
    • RAID 1c3…stores 3 copies on separate disks
    • RAID 1c4…stores 4 copies on separate disks
  • RAID 10…RAID1+RAID0 modes for increased performance and redundancy
  • …not yet stable or suitable for production use
    • RAID 5…striped mode with 1 disk as redundancy
    • RAID 6…striped mode with 2 disks as redundancy
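
A creation sketch with hypothetical devices; -d selects the data profile and -m the metadata profile:
# two-device Btrfs with mirrored data and metadata
mkfs.btrfs -d raid1 -m raid1 /dev/sdb /dev/sdc
# list the resulting multi-device filesystem
btrfs filesystem show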

storcli

lspci | grep -i raid           # find hardware RAID controllers

MegaRAID SAS is the current high-end RAID controller series by LSI

# list the included packages...
>>> find Unified_storcli_all_os -name '*.rpm'
Unified_storcli_all_os/ARM/Linux/storcli-007.2007.0000.0000-1.aarch64.rpm
Unified_storcli_all_os/Linux/storcli-007.2007.0000.0000-1.noarch.rpm
# ...install the command-line tools
>>> dnf install -y Unified_storcli_all_os/Linux/storcli-007.2007.0000.0000-1.noarch.rpm
# for simplicity...
>>> ln -s /opt/MegaRAID/storcli/storcli64 /sbin/storcli

storcli command interfaces with the RAID controller…following general format…

<[object identifier]> <verb> <[adverb | attributes | properties] > <[key=value]>
  • Object identifiers…
    • /cx controller x
    • /cx/vx virtual drive x on controller x
  • Verbs… add, del, set, show, etc.
    • <[adverb | attributes | properties]>
    • …specifies what the verb modifies or displays
    • <[key=value]> …if a value is required by the command
# summary of the drive and controller status
storcli show
# number of controllers detected
storcli show ctrlcount 
# show controller specifics...
storcli /c0 show | grep -e ^Product -e ^Serial -e Version
# ...controller & virtual devices configuration
storcli /c0/v0 show all

Devices

Show the list of devices on a specific controller…

  • EID:Slt enclosure ID and slot number
# ...first controller in this example
>>> storcli /c0 show
...
---------------------------------------------------------------------------------
EID:Slt DID State DG       Size Intf Med SED PI SeSz Model               Sp Type 
---------------------------------------------------------------------------------
8:0       9 Onln   0 446.625 GB SATA SSD N   N  512B INTEL SSDSC2KG480G8 U  -    
8:1      10 Onln   0 446.625 GB SATA SSD N   N  512B INTEL SSDSC2KG480G8 U  -    
8:2      16 Onln   1   5.457 TB SAS  HDD N   N  512B HUS726060AL5210     U  -    
8:3      11 Onln   1   5.457 TB SAS  HDD N   N  512B HUS726060AL5210     U  -    
8:4      12 Onln   1   5.457 TB SAS  HDD N   N  512B HUS726060AL5210     U  -    
8:5      13 Onln   1   5.457 TB SAS  HDD N   N  512B HUS726060AL5210     U  -    
8:6      14 Onln   1   5.457 TB SAS  HDD N   N  512B HUS726060AL5210     U  -    
8:7      15 Onln   1   5.457 TB SAS  HDD N   N  512B HUS726060AL5210     U  -    
---------------------------------------------------------------------------------
...

Configuration

Create a RAID configuration with add vd, aka a virtual drive…

  • r6 aka type=raid6
  • drives=e:s|e:s-x|e:s-x,y
    • e specifies the enclosure ID, s represents the slot
    • e:s-x range convention…slots s to x in the enclosure e
  • AWB (always write back) enables write-back caching even if no working cache module is present
  • ra read ahead should be helpful if the read access is fairly regular
  • Strip=64 is believed to be a potentially important parameter…
    • …amount of data written to one physical disk before moving to the next one…
    • …in practice the I/O pattern is rarely known, so pick a middle value here
# create a RAID-6 virtual drive
storcli /c0 add vd r6 drives=8:2-7 AWB ra cached Strip=64

Sanity Checks

Basic sanity checks are enabled for the RAID controller:

  • Patrol read checks all disks for bad blocks…
    • …if the system load is small
    • …should not have a noticeable performance impact
  • Consistency check for parity…time interval delay=2016 hours (roughly 3 months)
# schedule consistency checks (concurrent mode) every 2016 hours
storcli /c0 set cc=conc delay=2016 starttime=2022/04/28 11
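
To verify the schedule afterwards, one option is to filter the controller property dump (assuming these settings appear in the show all output):
# check the consistency-check and patrol-read settings
storcli /c0 show all | grep -i -e consistency -e patrol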