The expanses of WolfWings' land
scratched on the wall for all to see

April 3rd, 2010

12:19 am - Setting up a software-RAID server under Gentoo Linux w/ initramfs the simple way.

This can technically cover a NAS, a file server, a web server, or anything else. In my case, it's being used as a borderline NAS/file server/media server for a household.

All drives are hot-swap, the RAID is heavily growable (from 4-8TB usable w/ 5 drives up to 16+TB with more drives, all hot-swap), and this is based on the newest tech out there, so things are fast and stable. I'm focusing on the 'easy' side of the 80/20 equation: I'm putting in the 20% of the work that gets 80% of the benefit, so please don't make recommendations like "why didn't you use XYZ file system, or set ABC tunable based on I, J, K, and L aspects of your system?" My answer to all of those is: because I didn't need to eke out 100% performance.
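To make the growth numbers concrete, here's the back-of-the-envelope math (a sketch; the 2TB drive size is an example, and real formatted capacity comes out lower, like the 5.3TB I actually ended up with):

```shell
# Usable md RAID capacity: RAID5 spends one drive on parity, RAID6 spends two.
drives=5
size_tb=2            # example drive size in TB
raid5=$(( (drives - 1) * size_tb ))
raid6=$(( (drives - 2) * size_tb ))
echo "RAID5: ${raid5}TB usable  RAID6: ${raid6}TB usable"
```

Grow the array by adding drives and the same formula applies, which is where the 16+TB figure comes from.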

Even with sub-7200RPM drives, and only five of them, this RAID array built in about 5 hours, and reshapes to/from RAID 6 in around 8-12 hours. A rebuild takes about 6-8 hours, using these relatively slow but modern drives. With a 4K chunk size, aligned with the 4K sectors, I'm seeing write speeds in excess of 90MB/second and read speeds in excess of 200MB/second, so this thing can theoretically saturate a single Gig-ethernet link pretty handily as a file server.
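If you want to sanity-check numbers like those on your own array, a crude dd run gives ballpark figures. This is a sketch: point TESTFILE at something on the array (the /tmp default is only there so the commands run anywhere), and note conv=fdatasync, which makes dd's reported rate include flushing to disk instead of just filling the page cache:

```shell
# Crude sequential write test; raise count for a longer, steadier measurement.
TESTFILE=${TESTFILE:-/tmp/raid-speed-test.bin}
dd if=/dev/zero of="$TESTFILE" bs=1M count=64 conv=fdatasync
# Crude sequential read test of the same file.
dd if="$TESTFILE" of=/dev/null bs=1M
rm -f "$TESTFILE"
```

The read number will be flattered by the page cache unless you drop caches (or read a file much bigger than RAM) first, so treat it as an upper bound.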

First, this is based on Gentoo. I go with Gentoo because they have a rolling-release schedule, versus a backporting schedule. This server was based on the 29th March 2010 sync-point, and it varies very little from the normal Gentoo Handbook. I reconfigured the hard drives in fdisk to 32 sectors per track and 128 heads per cylinder, and adjusted cylinders-per-drive accordingly. This results in 2MB cylinders, so any partition you create on cylinder boundaries from that point on is also aligned to the 4K sector size on newer hard drives. All partitions are normal; none of them are the 'RAID automount' type.
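A quick way to double-check the alignment afterwards (a sketch, assuming /dev/sda and a sysfs-enabled kernel): a partition's start sector divisible by 8 means it begins on a 4K boundary, since sectors are 512 bytes.

```shell
# Check each sda partition's starting sector for 4K alignment.
for f in /sys/block/sda/sda*/start; do
    [ -e "$f" ] || continue    # no such device on this machine, skip
    start=$(cat "$f")
    if [ $(( start % 8 )) -eq 0 ]; then
        echo "$f: start sector $start is 4K-aligned"
    else
        echo "$f: start sector $start is NOT 4K-aligned"
    fi
done
```

Repeat for each drive in the array; with the 2MB-cylinder geometry above, every partition should pass.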

MDADM is included in the autobuilt install-CD, so I just used that to configure the RAID initially, then made a filesystem and installed normally on top of it. I specifically tuned the file system to reserve 1% of the resulting 5.3TB for root only, instead of making a separate 'operating system' partition. So there are two partitions on each drive: 1) a 40MB 'boot' partition, RAID1 across all drives, and 2) the remainder of the drive as the 'system' partition, in RAID5 or RAID6 (your choice, and you can change it later).
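Sketched out, that initial setup from the install CD looks something like this. The device names, drive count, and 4K chunk match my build; treat it as a template and double-check everything before running it, because mdadm --create destroys whatever is on those partitions:

```shell
# 40MB boot partitions, mirrored across all five drives:
mdadm --create /dev/md127 --level=1 --raid-devices=5 /dev/sd[abcde]1
# The rest of each drive as the big array; --level=6 works here too:
mdadm --create /dev/md0 --level=5 --chunk=4 --raid-devices=5 /dev/sd[abcde]2
# ext4 with only 1% reserved for root, instead of the default 5%:
mke2fs -t ext4 -m 1 /dev/md0
```

The md device numbers here match the init script further down, so keep them consistent if you change them.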

All of this is fairly simple, and the normal Gentoo Installation Handbook steps are taken with one exception: Manual Kernel Configuration, to include initramfs support and add mdadm support.

First, we need to set two packages to include the 'static' USE flag, which forces their binaries to be statically linked so they can be safely included in the initramfs.

1) Make a directory on your new system during the installer:

mkdir /etc/portage

2) Then run these commands:

echo sys-apps/busybox static >> /etc/portage/package.use
echo sys-fs/mdadm static >> /etc/portage/package.use
emerge busybox mdadm

That'll rebuild busybox, and install mdadm, as statically linked binaries. Next, a few options to enable in the kernel:

Enable initramfs support, but disable all forms of compression of the initramfs, because the entire kernel image (including the built-in initramfs) will be compressed at once instead. I pointed it at the /usr/src/initramfs directory for where to pull the initramfs from. Now, configure that directory:
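For reference, the relevant kernel options (from a 2.6.3x-era .config; option names can shift between versions, so treat this as a guide rather than gospel):

```
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE="/usr/src/initramfs"
# Compression of the initramfs itself stays off; the kernel image is compressed as a whole.
CONFIG_INITRAMFS_COMPRESSION_NONE=y
# Software RAID, compiled in (not modules) so the init script can assemble the arrays:
CONFIG_MD=y
CONFIG_BLK_DEV_MD=y
CONFIG_MD_RAID1=y
CONFIG_MD_RAID456=y
```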

mkdir -p /usr/src/initramfs/dev
mkdir -p /usr/src/initramfs/proc
mkdir -p /usr/src/initramfs/sys
mkdir -p /usr/src/initramfs/newroot
cp -a /dev/{console,null,tty,tty1} /usr/src/initramfs/dev
chown -R root:root /usr/src/initramfs

Copy your busybox and mdadm binaries into the initramfs's /bin directory, so the paths match the init script below:

mkdir -p /usr/src/initramfs/bin
cp `which busybox` `which mdadm` /usr/src/initramfs/bin

Now, you need to write your 'init' for the initramfs as a shell script. Yes, a shell script. It's tricky to get right, so here's a full copy of the one I use. :-)

#!/bin/busybox sh

# Oh-shit panic code: drop to a minimal busybox shell to examine the wreckage.
rescue_shell() {
    echo "$1 Dropping to a minimal busybox shell."
    exec /bin/busybox sh < /dev/tty1 > /dev/tty1 2>&1
}

/bin/busybox mount -t proc proc /proc
/bin/busybox mount -t sysfs sys /sys
/bin/busybox mdev -s

# We have a basic /dev tree configured now. We can manually assemble the md* devices:
/bin/mdadm -A /dev/md0 /dev/sd?2
/bin/mdadm -A /dev/md127 /dev/sd?1

# We need to 'rebuild' the /dev tree before the md* devices will show up though:
/bin/busybox mdev -s

# Now we can mount the root file system, and drop the temp file systems:
/bin/busybox mount -t ext4 -o ro /dev/md0 /newroot || rescue_shell "Mounting /dev/md0 failed!"
/bin/busybox umount /proc
/bin/busybox umount /sys

# And finally... switch to the new root filesystem:
exec /bin/busybox switch_root /newroot /sbin/init

# We only get here if the exec itself fails:
rescue_shell "switch_root failed!"

Make that init script owned by root:root, executable by root at least (chmod 0755 is typical), and you're set. You can now compile your kernel like normal, though it'll end up just shy of 1MB larger than normal due to the initramfs. This assumes you compile your hard-drive drivers into the kernel; I recommend leaving your USB drivers as modules, so a USB device left plugged in at boot won't error on you strangely. This is a first-pass init script, better can be done, and note this isn't using LVM, just raw MD devices for the RAID.
