
Linux Raid - replacing a physical device

Right now I'm dealing with a broken Linux RAID 1 in which both physical drives are reporting lots of bad blocks.
I have chosen the drive that exhibited the fewest problems and am cloning it onto a new one with dd_rescue, run from a SysRescCD Live CD:
dd_rescue /dev/old-b0rk3d-drive /dev/new-clone-drive
It's a good idea to run the above in a screen session, especially if you're doing this over the internet, so that a dropped connection doesn't kill the clone.
Once the cloning is complete, I simply put the new drive in the original server and expect it to boot with a degraded but working RAID.
In the next step I add a new empty drive of similar size (500 GB in my case; it must not be smaller than the existing one) and clone the partition table with sfdisk:
sfdisk -d /dev/existing-drive | sfdisk /dev/new-empty-drive
Use `fdisk -l` before and after the partition cloning to be sure you're doing the right thing.
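To back up the visual `fdisk -l` check, the two partition tables can also be compared mechanically. The following is only a minimal sketch: it assumes MBR-style `sfdisk -d` dumps and sdX-style device names, and `normalize_dump` and `same_layout` are helper names made up for this example.

```shell
# normalize_dump: read an `sfdisk -d` dump on stdin and print one line per
# partition with the disk name stripped, so two disks can be compared.
# Assumes device names of the form /dev/sdXN.
normalize_dump() {
    sed -n 's|^/dev/[a-z]*\([0-9][0-9]*\) *: *|part\1 |p' | tr -s ' '
}

# same_layout DUMP_A DUMP_B: succeed if both dumps list identical partitions.
# An empty first dump counts as a failure rather than a trivial match.
same_layout() {
    a=$(normalize_dump < "$1") && [ -n "$a" ] &&
        [ "$a" = "$(normalize_dump < "$2")" ]
}

# Usage (as root, with sda as the cloned drive and sdb as the new one):
#   sfdisk -d /dev/sda > /tmp/sda.dump
#   sfdisk -d /dev/sdb > /tmp/sdb.dump
#   same_layout /tmp/sda.dump /tmp/sdb.dump && echo "partition tables match"
```

GPT dumps carry per-partition UUIDs that differ between disks, so this comparison would need adjusting there.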
Once we have an identical partition table on both drives we can start adding partitions from the new drive to our Linux RAID. Assuming the cloned drive is sda and the new drive is sdb, our md setup should look like this:
root@sysresccd /root % cat /proc/mdstat 
Personalities : [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md3 : active raid1 sda6[1]
      297780736 blocks [2/1] [U_]
      
md1 : active raid1 sda3[1]
      4192896 blocks [2/1] [U_]
      
md2 : active raid1 sda2[1]
      153597376 blocks [2/1] [U_]
      
md0 : active raid1 sda1[1]
      30716160 blocks [2/1] [U_]
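
In the listing above, `[2/1]` means one of two members is present and the `_` in `[U_]` marks the missing one. As a rough sketch (the function name `degraded_arrays` is made up here), the degraded arrays can be picked out of /proc/mdstat like this:

```shell
# degraded_arrays [FILE]: print the name of every md array whose status
# field (e.g. [U_]) shows a missing member. Reads /proc/mdstat by default.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }                   # remember the current array
        /blocks/ && /\[[U_]*_[U_]*\]/ { print name }  # "_" marks a missing member
    ' "${1:-/proc/mdstat}"
}

# Usage:
#   degraded_arrays
```

Once every array shows `[UU]` again, the function prints nothing.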

And now let's add partitions to our raid layout:
mdadm /dev/md0 --add /dev/sdb1
mdadm /dev/md1 --add /dev/sdb3
mdadm /dev/md2 --add /dev/sdb2
mdadm /dev/md3 --add /dev/sdb6
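Note that the md numbers do not follow the partition numbers here (md1 sits on partition 3, md2 on partition 2), which makes the commands easy to mistype. One way to avoid that, sketched under the assumption that both drives share the same partition numbering, is to generate the commands from /proc/mdstat. The helper name `add_commands` is made up for this example.

```shell
# add_commands OLD NEW [FILE]: for every active array that has a member on
# disk OLD, print the mdadm command adding the same-numbered partition of NEW.
# Reads /proc/mdstat by default. It only prints; nothing is changed.
add_commands() {
    awk -v old="$1" -v new="$2" '
        /^md[0-9]+ : active/ {
            # fields 1-4 are the array name, ":", the state and the RAID level
            for (i = 5; i <= NF; i++) {
                member = $i
                sub(/\[.*/, "", member)              # sda6[1] -> sda6
                if (index(member, old) == 1) {
                    num = substr(member, length(old) + 1)
                    printf "mdadm /dev/%s --add /dev/%s%s\n", $1, new, num
                }
            }
        }
    ' "${3:-/proc/mdstat}"
}

# Usage: review the output first, then feed it to a shell.
#   add_commands sda sdb
#   add_commands sda sdb | sh
```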
And that's that, now we can see the raid resync'ing:
cat /proc/mdstat
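While the arrays rebuild, /proc/mdstat shows a progress line containing `recovery` (or `resync`) for the array being worked on, and the others wait their turn. For interactive use, `watch cat /proc/mdstat` is enough. If you want a script to wait for the rebuild, a small sketch could look like this (`rebuild_pending` is a name invented for this example):

```shell
# rebuild_pending [FILE]: succeed while any array is still recovering or
# resyncing, including arrays that are queued behind another one.
# Reads /proc/mdstat by default.
rebuild_pending() {
    grep -Eq 'recovery|resync' "${1:-/proc/mdstat}"
}

# Usage: block until every array has finished, checking once a minute.
#   while rebuild_pending; do sleep 60; done; echo "all arrays in sync"
```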


We're not finished yet!
As this drive (and therefore its clone as well) was the secondary one (sdb) on the original system, I expect problems with grub.
By default, when installing onto a Linux RAID, CentOS/Anaconda only installs grub on the first drive (sda in this case), so my drive, having been sdb, will lack it in its MBR.
If this is the case, we won't be able to boot from the cloned HDD at all, so we need to boot from the Live CD again, mount the Linux RAID from it, chroot into the OS and do the grub magic from there.
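Getting the chroot ready could look roughly like the sketch below. It assumes the arrays are already assembled by the Live CD (if not, `mdadm --assemble --scan` usually takes care of that), that the root filesystem lives on /dev/md0, and that /mnt/clone is the mount point. The md0 part is a guess for this layout, and `run` and `prepare_chroot` are helper names made up here. The bind mount of /dev matters because grub needs to see the disk devices from inside the chroot.

```shell
# run: execute a command, or only print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "+ $*"; else "$@"; fi
}

# prepare_chroot ROOT_DEV TARGET: mount the root filesystem on TARGET and add
# the pseudo-filesystems needed inside the chroot. Stops at the first failure.
prepare_chroot() {
    run mkdir -p "$2" &&
    run mount "$1" "$2" &&
    run mount --bind /dev "$2/dev" &&
    run mount -t proc proc "$2/proc" &&
    run mount -t sysfs sysfs "$2/sys"
}

# Usage: preview the commands first, then run them for real.
#   DRY_RUN=1 prepare_chroot /dev/md0 /mnt/clone
#   prepare_chroot /dev/md0 /mnt/clone
```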
Assuming everything works nicely from the Live CD and the md devices are properly mounted under /mnt/clone, we can start:
export SHELL=/bin/bash
chroot /mnt/clone
# grub
grub> find /boot/grub/stage1
 (hd0,0)
 (hd1,0)
grub> root (hd0,0)
 Filesystem type is ext2fs, partition type 0xfd

grub> setup (hd0)
 Checking if "/boot/grub/stage1" exists... yes
 Checking if "/boot/grub/stage2" exists... yes
 Checking if "/boot/grub/e2fs_stage1_5" exists... yes
 Running "embed /boot/grub/e2fs_stage1_5 (hd0)"...  15 sectors are embedded.
succeeded
 Running "install /boot/grub/stage1 (hd0) (hd0)1+15 p (hd0,0)/boot/grub/stage2 /boot/grub/grub.conf"... succeeded
Done.

grub> root (hd1,0)
 Filesystem type is ext2fs, partition type 0xfd

grub> setup (hd1)
 Checking if "/boot/grub/stage1" exists... yes
 Checking if "/boot/grub/stage2" exists... yes
 Checking if "/boot/grub/e2fs_stage1_5" exists... yes
 Running "embed /boot/grub/e2fs_stage1_5 (hd1)"...  15 sectors are embedded.
succeeded
 Running "install /boot/grub/stage1 (hd1) (hd1)1+15 p (hd1,0)/boot/grub/stage2 /boot/grub/grub.conf"... succeeded
Done.

grub> quit
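The same session can also be fed to grub non-interactively. As a sketch, assuming GRUB legacy (which accepts `--batch`) and that /boot lives on the first partition of every disk as in the session above, the commands can be generated like this (`grub_script` is a made-up helper name):

```shell
# grub_script N...: print GRUB legacy shell commands that install the boot
# loader on each listed disk number (0 for hd0, 1 for hd1, ...), using that
# disk's first partition as the root device.
grub_script() {
    for n in "$@"; do
        printf 'root (hd%s,0)\nsetup (hd%s)\n' "$n" "$n"
    done
    echo quit
}

# Usage, inside the chroot: check the generated commands, then pipe them in.
#   grub_script 0 1
#   grub_script 0 1 | grub --batch
```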
And we're done: reboot.
! - Please pay extra attention when doing these kinds of operations; it's very easy to format the wrong HDD etc. :-)
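One cheap safeguard before writing to a disk is to check that nothing on it is mounted or part of a running array. This is only a sketch with a made-up function name (`disk_in_use`), and it assumes sdX-style names. It does not replace reading `fdisk -l` carefully.

```shell
# disk_in_use DISK [MOUNTS] [MDSTAT]: succeed if DISK (e.g. sdb) or one of
# its partitions is mounted or is a member of an md array.
# Reads /proc/mounts and /proc/mdstat by default.
disk_in_use() {
    grep -Eq "^/dev/${1}[0-9]* " "${2:-/proc/mounts}" ||
    grep -Eq "(^| )${1}[0-9]*\[" "${3:-/proc/mdstat}"
}

# Usage: stop and think if the target disk turns out to be busy.
#   disk_in_use sdb && echo "sdb is in use, double-check the device name" >&2
```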