Rebuilding a Removed / Failed RAID 10 Array in CentOS / Rocky Linux
Replace Hard Drive in a RAID 10 Array and Sync the RAID 10 Array to the New Hard Drive
I had the hardest time rebuilding a RAID 10 array after replacing a hard drive. I didn't fail the old hard drive before removing it from the array, and sometimes that isn't an option. In my case, the data center swapped in a replacement drive that I had shipped to them directly from an eBay seller. I was hoping the RAID array would rebuild itself onto the new drive, as I have seen happen in some circumstances. However, that may not happen if the replacement drive still carries its old RAID array or partition information, and it can then be difficult to get the RAID array to sync to the new drive.
In my case, I run LVM (Logical Volume Manager) for my partitions. This complicates the RAID setup, and I found that mdadm commands didn't work as expected. If this situation occurs, it is best to boot Rocky Linux or CentOS in recovery mode using a Rocky Linux ISO or CentOS ISO. Once the recovery system loads, drop to a shell without mounting any file systems. Next, you will need to deactivate your LVM volume group:
vgdisplay
vgchange -a n my_volume_group # deactivate
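Before touching the array, it can help to confirm that nothing in the volume group is still active. The following is a minimal sketch and not part of the original procedure: active_lvs is a hypothetical helper name, and it assumes the standard lv_attr layout in which the fifth character is "a" for an active logical volume.

```shell
# Hypothetical helper: print VG/LV for every logical volume that is still active.
# Expects three whitespace-separated columns on stdin: vg_name, lv_name, lv_attr.
active_lvs() {
    awk 'substr($3, 5, 1) == "a" { print $1 "/" $2 }'
}

# Usage (as root in the rescue shell):
#   lvs --noheadings -o vg_name,lv_name,lv_attr | active_lvs
# No output means every logical volume is inactive.
```

If anything is printed, the volume group is probably still holding the md device open, and the vgchange step is worth repeating.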
Next, examine your md RAID array by running the following command:
cat /proc/mdstat
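On a box with several arrays, the degraded one can be hard to spot by eye. As a rough sketch (degraded_arrays is a made-up helper name, and it assumes the usual /proc/mdstat layout where member status appears as something like [UUU_]), this prints the name of every array that has a missing member:

```shell
# Hypothetical helper: list md arrays whose status field (e.g. [UUU_]) contains "_",
# which marks a missing or failed member. Reads /proc/mdstat unless a file is given.
degraded_arrays() {
    awk '
        /^md[0-9a-zA-Z_]* :/ { name = $1 }          # remember the current array name
        match($0, /\[[U_]+\]/) {                    # find the [UU_U]-style status field
            if (index(substr($0, RSTART, RLENGTH), "_")) print name
        }
    ' "${1:-/proc/mdstat}"
}

# Usage:
#   degraded_arrays              # inspect the live system
#   degraded_arrays saved.txt    # inspect a saved copy of /proc/mdstat
```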
After running that command, I identified my RAID devices as md126 and md127. /dev/md127 is considered the parent, even though /dev/md126 is the device that actually holds everything.
I can get more information about the RAID array by running the commands below:
mdadm --detail /dev/md126
mdadm --detail /dev/md127
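The --detail output is long, and only a few fields matter when deciding whether the array is degraded. A small sketch follows, assuming the usual "Key : value" layout of mdadm --detail; summarize_detail is a hypothetical name, not an mdadm feature.

```shell
# Hypothetical helper: keep only the state and device-count fields
# from "mdadm --detail" output supplied on stdin.
summarize_detail() {
    awk -F' : ' '
        {
            key = $1; val = $2
            gsub(/^[ \t]+|[ \t]+$/, "", key)   # trim padding around the field name
            gsub(/[ \t]+$/, "", val)           # trim trailing blanks from the value
            if (key == "State" || key ~ /^(Raid|Active|Failed|Spare) Devices$/)
                print key "=" val
        }
    '
}

# Usage:
#   mdadm --detail /dev/md126 | summarize_detail
```

A State value that includes "degraded", or an Active Devices count lower than Raid Devices, suggests the array is still waiting for a member.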
Let's remove any failed or detached (no longer existing) drives from both devices using these commands:
mdadm /dev/md126 --remove failed
mdadm /dev/md126 --remove detached
mdadm /dev/md127 --remove failed
mdadm /dev/md127 --remove detached
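Since the same two removals are repeated for each md device, they can be wrapped in a loop with a dry-run switch so the commands can be reviewed before anything changes. This is only a sketch: run, remove_stale_members, and the DRY_RUN variable are names I made up for illustration, and the default is to print rather than execute.

```shell
# Hypothetical wrapper: print the command when DRY_RUN=1 (the default), else execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Remove failed and detached members from every md device passed as an argument.
remove_stale_members() {
    for array in "$@"; do
        for keyword in failed detached; do
            run mdadm "$array" --remove "$keyword"
        done
    done
}

# Usage:
#   remove_stale_members /dev/md126 /dev/md127             # preview only
#   DRY_RUN=0 remove_stale_members /dev/md126 /dev/md127   # actually run mdadm
```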
Next, we need to identify the new hard drive that will replace the removed drive in the array:
lsblk
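From the output of the above command, I could see that the new drive was /dev/sde, so I needed to wipe its old RAID configuration (if there is any) and then add it to the RAID array. Note that wipefs only lists the signatures it finds unless it is run with -a (--all), so the flag is needed to actually erase them:
wipefs -a /dev/sde
mdadm --add /dev/md127 /dev/sde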
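Wiping the wrong disk would be painful, so a quick guard before erasing anything may be worthwhile. The sketch below is an assumption-laden illustration: is_md_member is a hypothetical helper that only checks whether a disk (or one of its partitions) is listed in /proc/mdstat, and the pattern assumes sdX-style device names.

```shell
# Hypothetical guard: succeed if the disk, or one of its numbered partitions,
# appears as a member in /proc/mdstat (or in the file given as the second argument).
is_md_member() {
    dev=${1#/dev/}    # accept both "sde" and "/dev/sde"
    grep -Eq "(^|[[:space:]])${dev}[0-9]*\[[0-9]+\]" "${2:-/proc/mdstat}"
}

# Usage: only wipe a disk that no array currently claims.
#   if is_md_member /dev/sde; then
#       echo "/dev/sde already belongs to an array, not wiping" >&2
#   else
#       wipefs --all /dev/sde
#   fi
```

This does not check for mounted file systems or LVM physical volumes, so it complements the lsblk check instead of replacing it.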
Check to see if the syncing process has started:
cat /proc/mdstat
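While a rebuild is running, /proc/mdstat includes a progress line for the affected array. As a hedged sketch (sync_progress is a hypothetical name, and it assumes the progress line contains text such as "recovery = 12.6%" or "resync = 5.0%"), this pulls out just the percentage per array:

```shell
# Hypothetical helper: print "<array>: recovery = N%" (or resync) for each array
# that is currently rebuilding. Reads /proc/mdstat unless a file is given.
sync_progress() {
    awk '
        /^md[0-9a-zA-Z_]* :/ { name = $1 }                  # remember the current array name
        match($0, /(recovery|resync) *= *[0-9.]+%/) {       # locate the progress field
            print name ": " substr($0, RSTART, RLENGTH)
        }
    ' "${1:-/proc/mdstat}"
}

# Usage:
#   sync_progress
```

No output means no array is rebuilding at the moment, which is the case where the next step may be needed. Running watch -n 5 cat /proc/mdstat is a simple way to keep an eye on the full status.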
If the new drive only shows up as a spare and no sync starts, you may need to run the command below to get the RAID device to start syncing to it:
mdadm --grow /dev/md126 --raid-devices=4
Helpful Links:
https://delightlylinux.wordpress.com/2020/12/22/how-to-remove-a-drive-from-a-raid-array/
https://serverfault.com/questions/554553/how-to-delete-removed-devices-from-a-mdadm-raid1
https://unix.stackexchange.com/questions/53129/dev-md127-refuses-to-stop-no-open-files
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/4/html/cluster_logical_volume_manager/vg_activate
https://serverfault.com/questions/676638/mdadm-drive-replacement-shows-up-as-spare-and-refuses-to-sync