RAID Synchronization CRON Job Affecting Performance

Thursday, October 5th, 2023

For some FakeRaid configurations, CentOS 7 and newer may run a RAID synchronization job configured in the /etc/cron.d directory, in a file named raid-check.

This job is responsible for making sure the RAID array is in sync across all drives.  It runs by default every week on Sunday at 1 AM.

# Run system wide raid-check once a week on Sunday at 1am by default
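
On a stock install, the entry beneath that comment looks like this (the exact line may vary slightly between versions):

0 1 * * Sun root /usr/sbin/raid-check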

However, this was not a convenient time for my users, who were usually gaming then, so rather than hurt server performance, I changed the cron job to:

0 5 1 * * root /usr/bin/test $(/usr/bin/date +\%u) -ne 6 && /usr/sbin/raid-check
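
The date +%u part is what makes the weekday check work: it prints the ISO day of the week as a number, and the test only lets raid-check run when that number is not 6 (Saturday).  For example:

date +%u # prints 1 (Monday) through 7 (Sunday); 6 is Saturday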

Thus, the sync job now runs once a month, on the 1st at 5 AM, and it is skipped entirely if the 1st falls on a Saturday.  This applies to several of my C1100 servers.

Rebuilding a Removed / Failed RAID 10 Array in CentOS / Rocky Linux

Tuesday, February 22nd, 2022

Replace Hard Drive in a RAID 10 Array and Sync the RAID 10 Array to the New Hard Drive

I had the hardest time rebuilding a RAID 10 array after replacing a hard drive.  I didn't fail the old hard drive before removing it from the array, and sometimes that may not be an option.  In my case, the data center swapped in a replacement drive that I had shipped to them directly from an eBay seller.  I was hoping that the RAID array would rebuild itself onto the new drive (as I have seen happen before in some circumstances).  However, that may not happen if the replacement drive still has old RAID array or partition information on it, and in that case it can be difficult to get the RAID array to sync to the new drive.

In my case, I run LVM (Logical Volume Manager) for my partitions.  This complicates the RAID setup, and I found that mdadm commands didn't work as expected.  If this situation occurs, it is best to boot Rocky Linux or CentOS in recovery mode using a Rocky Linux ISO or CentOS ISO.  Once the recovery system loads, drop to a shell without mounting any file systems.  Next, you will need to deactivate your LVM volume group:

vgdisplay
vgchange -a n my_volume_group # deactivate
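
When the repair work is done, the volume group can be reactivated the same way:

vgchange -a y my_volume_group # reactivate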

Next, examine your md RAID array by running the following command:

cat /proc/mdstat
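
For illustration only, the output on a setup like this looks roughly like the following (the drive names, sizes, and metadata type here are hypothetical):

md126 : active raid10 sda[3] sdb[2] sdc[1] sdd[0]
      976766976 blocks super external:/md127/0 64K chunks 2 near-copies [4/4] [UUUU]

md127 : inactive sdd[3](S) sdc[2](S) sdb[1](S) sda[0](S)
      20168 blocks super external:imsm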

After running that command, I identified my RAID devices as md126 and md127.  /dev/md127 is considered the parent container, even though /dev/md126 is where the data actually lives.

I can get more information about the RAID arrays by running the commands below:

mdadm --detail /dev/md126
mdadm --detail /dev/md127

Next, let's remove any failed or detached (no longer present) drives using these commands:

mdadm /dev/md126 --remove failed
mdadm /dev/md126 --remove detached
mdadm /dev/md127 --remove failed
mdadm /dev/md127 --remove detached
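
To confirm the stale entries are gone, you can re-run the detail command and check the device table at the bottom:

mdadm --detail /dev/md126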

Next, we need to identify the new hard drive that will replace the removed drive in the array:

lsblk

From running the above command, I could see that the new drive was /dev/sde, so I needed to wipe its old RAID configuration (if any) and then add it to the RAID array.

wipefs -a /dev/sde # -a erases the old signatures; without it, wipefs only lists them
mdadm --add /dev/md127 /dev/sde
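
If the drive still shows old mdadm metadata after wipefs, you can also zero its RAID superblock before re-running the --add command (be very careful to target only the new drive):

mdadm --zero-superblock /dev/sde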

Check to see if the syncing process has started:

cat /proc/mdstat

You may or may not need to run the command below to get the RAID device to start syncing to the new drive (this helps when the new drive shows up as a spare and refuses to sync):

mdadm --grow /dev/md126 --raid-devices=4
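
Once the rebuild starts, you can watch its progress refresh every few seconds with:

watch -n 5 cat /proc/mdstat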

Helpful Links:

https://delightlylinux.wordpress.com/2020/12/22/how-to-remove-a-drive-from-a-raid-array/
https://serverfault.com/questions/554553/how-to-delete-removed-devices-from-a-mdadm-raid1
https://unix.stackexchange.com/questions/53129/dev-md127-refuses-to-stop-no-open-files
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/4/html/cluster_logical_volume_manager/vg_activate
https://serverfault.com/questions/676638/mdadm-drive-replacement-shows-up-as-spare-and-refuses-to-sync

CentOS LVM and Software RAID Partitioning Instructions

Sunday, May 30th, 2021

Installing and Configuring CentOS to Host KVM Virtual Machines

GUI

When configuring a fresh install of CentOS for a KVM host machine (the main server that hosts all of the virtual machines), I like to run a GUI to make managing some of the virtual machines easier.  Thus, during install, choose the option for CentOS with a Minimal GUI.

RAID 10 LVM Partitions

When configuring the hard drive partitions, set them up to use LVM on top of RAID 10 software RAID:

Create a volume group called "vms" (without the quotes) that is set up as RAID 10 (set the volume group space to be as large as possible).

Set the "/" partition to 100GB XFS LVM (RAID10).

Set the "swap" partition to 32GB.

Only set up those two partitions.  The remaining space in the RAID 10 volume group "vms" will be used for KVM containers (and the remaining space does NOT need to be assigned to any mount points).
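
For reference, here is a rough command-line equivalent of that layout (a sketch only; it assumes four drives each contributing one large partition, /dev/sda2 through /dev/sdd2 are hypothetical names, and the installer GUI normally does all of this for you):

mdadm --create /dev/md0 --level=10 --raid-devices=4 /dev/sda2 /dev/sdb2 /dev/sdc2 /dev/sdd2
pvcreate /dev/md0 # make the array an LVM physical volume
vgcreate vms /dev/md0 # volume group "vms" spanning the array
lvcreate -L 100G -n root vms # 100GB logical volume for "/"
lvcreate -L 32G -n swap vms # 32GB logical volume for swap
mkfs.xfs /dev/vms/root # format "/" as XFS
mkswap /dev/vms/swap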

That's all.

Adding SAS RAID Drivers to CentOS 8 and Red Hat Linux During Installation

Friday, April 30th, 2021

CentOS 8 and Red Hat Linux 8 removed a lot of built-in RAID controller and SAS drivers.  As such, you'll need to identify your SAS RAID controller card's model number, and then, during the installation of CentOS 8 or Red Hat 8, follow these instructions (modifying them for your hardware):

https://gainanov.pro/eng-blog/linux/rhel8-install-to-dell-raid/

If for some reason the link above is no longer available, I saved and archived a copy which can be read here.

Add El Repo Permanently

As updates are released for CentOS 8 / Rocky Linux / Red Hat 8, the kernel will often be upgraded.  To make sure the SAS drivers are updated as well, you'll need to configure your system to pull updates from El Repo automatically by using the following commands:

sudo rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
sudo yum install https://www.elrepo.org/elrepo-release-8.el8.elrepo.noarch.rpm
sudo yum update -y

In case the above instructions no longer work, this guide should help.
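
With El Repo enabled, the controller driver itself comes from a kmod package.  The exact package name depends on your card; kmod-megaraid_sas below is just an example for LSI MegaRAID based controllers:

sudo yum install kmod-megaraid_sas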

Disable NetworkManager Wait Online Service

Prevent the boot process from being held up on startup by network connection checks by running the command below:

sudo systemctl mask NetworkManager-wait-online.service
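
You can verify that the service is masked by running:

systemctl status NetworkManager-wait-online.service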

Ubuntu Grub Fails to Install on RAID Array

Friday, February 6th, 2015

Ubuntu Grub RAID Issues

Grub Fails To Install on RAID Array

If grub fails to install on your RAID array in any version of Ubuntu, do NOT disable your BIOS RAID! The correct solution is at this blog entry. I'll summarize it below.

At the stage of the install where it is attempting to install GRUB, the installer will detect the install device as

/dev/mapper

This is incomplete! That's why the GRUB install fails.

You need the actual name of the RAID array to install to.  So during that step, press Ctrl+Alt+F2 to drop to a BusyBox terminal, then enter

ls -l /dev/mapper

Pick out the name of your array from the list shown, then press Ctrl+Alt+F1 to switch back to the install (you can switch back and forth as much as you like with no problems) and enter it in the field as

/dev/mapper/{your array name}  
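
For example, a dmraid-managed Intel fakeraid array typically shows up with a name like isw_cdjbdafhfg_Volume0 (a made-up name here), in which case you would enter:

/dev/mapper/isw_cdjbdafhfg_Volume0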

Then GRUB installs perfectly and you're ready to go, with a proper BIOS RAID array intact.

System Won't Boot After Grub Failed to Install

If your system will no longer boot because you skipped installing or updating grub, you need to download an Ubuntu version that does support RAID, boot from the live CD, drop to a terminal, and then run:

ls -l /dev/mapper
sudo grub-install /dev/mapper/{ARRAY_NAME_HERE}
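
If grub-install complains that it can't find the boot files, you may need to mount the installed system and chroot into it first.  A rough sketch, assuming the root filesystem is the first partition on the array (partition naming varies, so {ARRAY_NAME_HERE}p1 is a guess):

sudo mount /dev/mapper/{ARRAY_NAME_HERE}p1 /mnt
for d in /dev /proc /sys; do sudo mount --bind $d /mnt$d; done
sudo chroot /mnt grub-install /dev/mapper/{ARRAY_NAME_HERE}
sudo chroot /mnt update-grub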

Setting Up RAID Array During Ubuntu Install

If you are configuring a BIOS RAID array for the first time on Ubuntu, you should create a 1MB boot partition.  Its partition type is "boot".  If you do this, grub will always install there and will succeed every time when you upgrade or reinstall grub.