[clug] Grub on software RAID1

Ben shadroth at gmail.com
Mon May 15 04:05:30 GMT 2006

Problem: Ensuring Grub will boot the system from either drive in a
software RAID1 array (in the event of either drive failing).

Hardware Fix (recabling drives):
From what I understand, if hda and hdc have an _identical_ MBR, then
hda will boot fine on its own (since its MBR says to boot off hda),
and if hdc's MBR is read, then it will say to boot off hda (which
won't work if hda has failed).

Q1a: Is this correct?
Q1b: If so, would swapping the drive at hdc (secondary master) to hda
(primary master) be all that I need to do to have the system working?
(I used hexdump to read off the first 512B of hda1, hdc1 and md0 and
they appeared identical, with the letters GRUB starting at byte 179,
indicating GRUB was installed on both drives. The first 512B of hda
and hdc did not give anything in English, so I'm a bit confused, as I
read this is where the MBR should be.)
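
The check described above can also be scripted. This is only a minimal
sketch, not a tested tool: the function names are made up for
illustration, and it assumes the classic PC layout where the boot code
occupies the first 446 bytes of the sector and the last two bytes are
the 0x55 0xAA signature. It reports where (if anywhere) the string
GRUB appears and whether two sectors carry the same boot code.

```python
import sys

BOOT_SIGNATURE = b"\x55\xaa"  # last two bytes of a valid PC boot sector
BOOT_CODE_LEN = 446           # bytes that precede the partition table in an MBR


def describe_sector(sector: bytes) -> dict:
    """Summarise one 512-byte boot sector."""
    if len(sector) != 512:
        raise ValueError("expected exactly 512 bytes")
    return {
        "has_boot_signature": sector[510:512] == BOOT_SIGNATURE,
        "grub_offset": sector.find(b"GRUB"),  # -1 if the marker is absent
    }


def same_boot_code(a: bytes, b: bytes) -> bool:
    """Compare only the area before the partition table."""
    return a[:BOOT_CODE_LEN] == b[:BOOT_CODE_LEN]


def read_first_sector(path: str) -> bytes:
    """Read the first 512 bytes of a device or image file (read-only)."""
    with open(path, "rb") as dev:
        return dev.read(512)


if __name__ == "__main__":
    # Example: python check_sector.py /dev/hda /dev/hdc  (needs root)
    for path in sys.argv[1:]:
        print(path, describe_sector(read_first_sector(path)))
```

It only opens the devices for reading, so it should not be able to
touch the RAID partitions. Running it against hda and hdc (the whole
disks) as well as hda1 and hdc1 would show which of them actually
contain the GRUB marker.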

Software Fix (reconfiguring GRUB/MBR):
The issues I've encountered are:
* There are several methods (none of which I fully understand) which
are variously reported as working and not working depending on what
you read.
* There are many warnings that even a slight error will corrupt the
RAID partitions.
* There are reports that if you mess with the MBR it will just be
rewritten when certain upgrades take place.

1: The CentOS (RHEL) way:
The only method (and it's unsupported) is given as:
(This is the link from the CentOS reference for installing GRUB

2: An easier looking way:
From what I understand this solution would cause hda to have an MBR
that led to a boot off hda (and then the mirror, as is the case now)
and hdc would have an MBR that would lead to a boot off hdc (and
then the mirror, rather than trying to boot off hda and then the
mirror, which I suspect is the case now). I read something similar on
the linuxSA mailing lists.

3: The too good to be true way:
This method remaps the GRUB references written on hdc so grub will use
hdc whenever hda is referred to.
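
For reference, the remapping in option 3 is usually described as a
short sequence of GRUB (legacy) shell commands: device, root, setup.
The sketch below only builds that sequence as text so it can be read
before anything is run; it does not call grub itself. The helper name
is made up, and it assumes /boot is on the first partition of the disk
(hdc1 here), which GRUB counts as partition 0. I have not verified
this on CentOS 4.3.

```python
def grub_remap_script(linux_disk: str, boot_partition: int = 0) -> str:
    """Build GRUB (legacy) shell commands that install GRUB on linux_disk
    while treating that disk as the first BIOS drive, (hd0)."""
    lines = [
        f"device (hd0) {linux_disk}",    # map (hd0) to this disk for the session
        f"root (hd0,{boot_partition})",  # partition holding /boot, counted from 0
        "setup (hd0)",                   # install GRUB on the disk mapped to (hd0)
        "quit",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the commands for the second drive; nothing is written to disk.
    print(grub_remap_script("/dev/hdc"), end="")
```

The idea is that when hda is dead, the BIOS presents hdc as the first
drive, so an MBR installed on hdc under the name (hd0) looks for its
files on whichever disk is first at boot time. The printed text could
be typed into the grub shell or fed to "grub --batch" once it has been
checked.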

Q2: Is option 3 above the best route?

Q3: Will any/all methods be overwritten by certain updates?

Q4: How do I work out if a specific method will break the RAID
partitions? - other than trying it out ;-)

OS: CentOS 4.3 with SELinux active on default settings.
hda: 200GB Seagate Barracuda 7200.9 (PATA)
hdb: CD-ROM
hdc: 200GB Seagate Barracuda 7200.9 (PATA)

NB. The BIOS only recognises the drives as 136GB since they're on
ATA66 channels. Much to my relief, CentOS 4.3 had no problem
recognising the full 200GB.

1. hda was partitioned in Anaconda using Disk Druid with a ~100MB, a
~512MB and a "fill available space" RAID partition. (I did not use
LVM as I have read this can cause problems with RAID1.)
2. hda was cloned to hdc (resulting in the same RAID partition setup).
3. RAID devices were created as
 * md0 (2*~100MiB) for /boot, ext3.
 * md1 (2*~512MiB) for swap
 * md2 (2*~189GiB) for /
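
One way to confirm that the three arrays above still have both members
before and after touching the MBR is to read /proc/mdstat. A minimal
sketch follows; the function name is made up, the SAMPLE text is
hypothetical (it is in /proc/mdstat format but is not output from my
machine), and only active arrays are handled.

```python
import re

ARRAY_RE = re.compile(r"^(md\d+) : (\S+) (\S+) (.+)$")  # e.g. "md0 : active raid1 ..."
COUNT_RE = re.compile(r"\[(\d+)/(\d+)\]")               # e.g. "[2/2]" = expected/active

# Hypothetical sample, for illustration only.
SAMPLE = """Personalities : [raid1]
md0 : active raid1 hdc1[1] hda1[0]
      104320 blocks [2/2] [UU]

md2 : active raid1 hda3[0]
      198000000 blocks [2/1] [U_]

unused devices: <none>
"""


def parse_mdstat(text: str) -> dict:
    """Map each active array to its level, member partitions and degraded flag."""
    arrays = {}
    current = None
    for line in text.splitlines():
        m = ARRAY_RE.match(line)
        if m:
            current = m.group(1)
            # Strip the "[0]" slot number and any "(F)" failure flag.
            members = [re.sub(r"\[\d+\](\(\w\))?$", "", tok)
                       for tok in m.group(4).split()]
            arrays[current] = {"level": m.group(3),
                               "members": sorted(members),
                               "degraded": None}
            continue
        c = COUNT_RE.search(line)
        if c and current:
            expected, active = int(c.group(1)), int(c.group(2))
            arrays[current]["degraded"] = active < expected
            current = None
    return arrays


if __name__ == "__main__":
    for name, info in parse_mdstat(SAMPLE).items():
        print(name, info)
```

On the real system the text would come from open("/proc/mdstat").read()
instead of SAMPLE. An array with fewer active members than expected is
running on one disk only.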

I don't think I have made any other changes.

I'm still getting used to Linux, so apologies if I've overlooked
something obvious.
