Ubuntu 24.04 & RAID1 EFI: A Comedy of Errors

Sysadmin Horror Stories — Vol. 1

Platform: AMD AM5 / B650
Disks: 2× Samsung 960GB SATA SSD
Mood: Progressively deteriorating


A completely true account of what happens when you try to do something perfectly reasonable — installing Ubuntu Server 24.04 on a proper software RAID1 array with EFI redundancy — and Ubuntu decides that’s not how things work around here. Grab a coffee. Or something stronger.

ACT I The Hubris of Thinking It Would Just Work

It started innocently enough. Two Samsung 960GB SATA SSDs, a shiny new AM5 server board, and a perfectly reasonable goal: install Ubuntu 24.04 Server on a software RAID1 array. Nothing exotic. Nothing cutting-edge. Just good old “if one disk dies, the server keeps running” reliability.

Boot from USB. Start installer. Get to the storage screen. Click “Create software RAID (md)”. Assign both disks. Set up RAID1. Assign /. Feel briefly competent. Then stare at the screen.

“To continue you need to: Select a boot disk.”

Oh. OH. Ubuntu wants an EFI partition. Fair enough. But here’s the twist: there are TWO disks. Ubuntu will cheerfully let you put one EFI partition on one disk — the partition table equivalent of having one engine on a two-engine plane and calling it redundant.

The installer, in its infinite wisdom, doesn’t expose a way to assign both ESP partitions as /boot/efi. Only one gets the glory. The other just sits there, formatted as vfat, labeled “unused ESP,” silently judging you. This is a known, long-standing limitation of Ubuntu’s curtin installer that the community has been complaining about since EFI became mandatory. Canonical’s response, as best as anyone can tell, has been a polite shrug.

So: into the installer console we go, like it’s 2004 and we’re manually partitioning a Gentoo install.

ACT II Partitioning Like It’s the Before Times

Drop to a shell. The disks — sdb and sdc at the time — both report partition table: unknown. Blank slates. Fine. We’ll do it ourselves.

Partition both disks (GPT)

# sdb first
parted /dev/sdb
(parted) mklabel gpt
(parted) mkpart primary fat32 1MiB 513MiB    # ESP
(parted) set 1 esp on
(parted) mkpart primary 513MiB 100%           # RAID member
(parted) set 2 raid on
(parted) quit

# Repeat identically for sdc
parted /dev/sdc
# ... same commands ...
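A shortcut worth knowing instead of retyping every parted command: sgdisk (from the gdisk package, normally present in the Ubuntu live environment) can clone the whole GPT layout in one shot. A sketch — double-check which disk is source and which is target before running it, this is destructive:

```shell
# Replicate sdb's partition table onto sdc (positional arg = source,
# -R argument = destination), then randomize the GUIDs so the two
# disks don't share partition UUIDs — important later, when fstab
# entries are keyed by PARTUUID.
sgdisk -R /dev/sdc /dev/sdb
sgdisk -G /dev/sdc
```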

Format both ESPs

mkfs.fat -F32 /dev/sdb1
mkfs.fat -F32 /dev/sdc1

Create the RAID1 array

mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb2 /dev/sdc2
# Type 'y' when it asks. It always asks.

mkfs.ext4 /dev/md0
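Before going back into the installer, it's worth confirming the mirror is actually resyncing — standard mdadm checks, nothing exotic:

```shell
# Watch the initial resync; "[UU]" means both members are active
cat /proc/mdstat

# Full detail: array state, sync progress, member devices
mdadm --detail /dev/md0
```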
⚠ Installer note
Back in the installer, assign sdb1 as /boot/efi and md0 as /. The installer will only write the bootloader to sdb1. sdc1 will be ignored until you manually sync it post-install. More on that shortly.

ACT III The BIOS Escape Plan (That Wasn’t)

At this point, a reasonable person might think: “Why bother with any of this EFI nonsense? My server doesn’t need Secure Boot. It doesn’t need a GUI boot menu. Just let GRUB live on the MBR of both disks, everything in one RAID1 partition, done.”

That person would be right. It’s a much simpler setup. MBR/BIOS boot with mdadm RAID1 is battle-tested, elegant, and requires exactly zero EFI partition gymnastics.

Unfortunately, AMD shipped AM5 (Zen 4 and onwards) with CSM — Compatibility Support Module, the thing that enables legacy BIOS boot — completely removed. Not disabled. Not hidden in a menu. Gone. AMD officially dropped legacy BIOS mode from the AM5 platform at launch. It’s UEFI or nothing, forever, on this hardware.

So no, you cannot just slap an MBR label on those disks and call it a day. UEFI only. Two ESPs. Manual sync. Welcome to the future.

ACT IV Why EFI Refuses to RAID

This is the part where we explain why you can’t just throw both ESP partitions into md0 and call it a RAID1 EFI partition. The answer is architectural and annoying.

UEFI firmware reads the ESP directly from the raw disk, before any OS loads. Before the kernel loads. Before mdadm loads. Before anything. It talks to the disk controller, finds a partition flagged as ESP, expects FAT32, reads \EFI\ubuntu\shimx64.efi, and jumps to it. It has no concept of software RAID, no mdadm, no metadata, no arrays.

If you put those two partitions into a RAID1 array, the firmware sees mdadm superblock metadata on what should be a clean FAT32 partition, declares it unreadable, and moves on to the next boot entry — which might be PXE, which is not what you want at 3 AM when a disk fails.
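You can see exactly what the firmware would see by listing the signatures on a partition (wipefs here is from util-linux; run it with no flags and it only lists, it doesn't erase):

```shell
# A clean ESP should show only a vfat signature here —
# never linux_raid_member
wipefs /dev/sdb1

# The mdadm view of the same question: on a clean ESP this
# reports that no superblock is present
mdadm --examine /dev/sdb1
```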

Approach                            Works?    Why
Two ESPs in mdadm RAID1             ✗ No      UEFI reads the disk directly, sees mdadm metadata, refuses
One ESP on one disk                 ✗ Risky   If that disk dies, no boot, even though the RAID data is intact
Two independent ESPs + efibootmgr   ✓ Yes     UEFI knows about both, tries the secondary if the primary fails
Two ESPs synced by a dpkg hook      ✓ Yes     Both stay current; the sync window is seconds per update

ACT V The Post-Install Fix (The Real Guide)

System booted. First boot. Disks have been renumbered to sda and sdb because of course they have. The installer put /boot/efi on sdb1. sda1 is sitting there, formatted as vfat, completely empty, not registered in UEFI, contributing nothing.

Here’s how to make it a proper redundant setup:

Step 1 — Find the second ESP’s identifiers

blkid /dev/sda1
# Output example:
# /dev/sda1: UUID="0A5A-67E7" TYPE="vfat" PARTUUID="0a44aeed-819c-45d0-87c4-44b4c1a00a44"

Step 2 — Mount and sync the second ESP

mkdir /boot/efi2
mount /dev/sda1 /boot/efi2
rsync -a /boot/efi/ /boot/efi2/
ls /boot/efi2/EFI/ubuntu/
# Should show: shimx64.efi grubx64.efi mmx64.efi grub.cfg BOOTX64.CSV

Step 3 — Add the mirror ESP to fstab

Use PARTUUID, not UUID — PARTUUID is GPT-level and doesn’t change if you reformat the partition:

# Add to /etc/fstab:
PARTUUID=0a44aeed-819c-45d0-87c4-44b4c1a00a44  /boot/efi2  vfat  umask=0077,nofail  0  1
⚠ nofail is critical
If sda is the dead disk and you’re booting from the survivor, you absolutely do not want the boot sequence to hang waiting for a dead disk’s ESP to mount. nofail skips it gracefully.
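You can prove the fstab entry resolves without rebooting — safe to do while both disks are healthy:

```shell
umount /boot/efi2          # drop the manual mount from Step 2
systemctl daemon-reload    # let systemd re-read fstab
mount /boot/efi2           # must now resolve via the PARTUUID= entry
findmnt /boot/efi2         # confirm source device, vfat, mount options
```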

Step 4 — Register sda in the UEFI boot table

efibootmgr --create --disk /dev/sda --part 1 \
  --label "Ubuntu-mirror" \
  --loader "\EFI\ubuntu\shimx64.efi"

Step 5 — Fix the boot order

efibootmgr will helpfully put the new entry first in the boot order. This is backwards. Primary disk first, mirror second:

efibootmgr --bootorder 0000,0007,0002,0003,0004,0005,0006,0001
# Adjust numbers to match your efibootmgr -v output
# Ubuntu entry first, Ubuntu-mirror entry second

Step 6 — Create the automatic sync hook

This fires after every kernel install or upgrade and keeps the mirror ESP current. (One caveat: GRUB package updates rewrite /boot/efi through their own postinst, not through kernel hooks — after a grub-efi update, run the script once by hand.)

cat > /etc/kernel/postinst.d/zz-sync-efi << 'EOF'
#!/bin/bash
# Mount the mirror ESP if it isn't already mounted
mountpoint -q /boot/efi2 || mount /boot/efi2 2>/dev/null || true
# Only sync if the mount actually succeeded — otherwise, with the
# mirror disk dead, rsync would write into the empty mountpoint
# directory on the root filesystem instead of the ESP
if mountpoint -q /boot/efi2; then
    rsync -a --delete /boot/efi/ /boot/efi2/
fi
EOF
chmod +x /etc/kernel/postinst.d/zz-sync-efi

Step 7 — Reload systemd and verify

systemctl daemon-reload
efibootmgr -v
# Confirm both Ubuntu and Ubuntu-mirror entries exist
# Confirm BootOrder has Ubuntu first
✓ What you verify
Run ls /boot/efi2/EFI/ubuntu/ — you should see shimx64.efi, grubx64.efi, mmx64.efi, grub.cfg, and BOOTX64.CSV. If those are there, the mirror ESP is bootable right now.
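One optional but reassuring final test: boot from the mirror once, on purpose. BootNext overrides the boot order for exactly one reboot, so assuming the Ubuntu-mirror entry came out as Boot0007 (yours may differ — check efibootmgr -v):

```shell
# Boot the mirror entry exactly once, without touching BootOrder
efibootmgr --bootnext 0007
reboot

# After it comes back up, confirm you really booted via the mirror:
# efibootmgr -v should show "BootCurrent: 0007"
```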

FINAL LAYOUT

sda 894GB
├─ sda1  512MB  fat32  /boot/efi2   (UEFI fallback, Boot0007)
└─ sda2  893GB  raid ──┐
                       ├─ md0  ext4  /
sdb 894GB              │
├─ sdb1  512MB  fat32  /boot/efi    (UEFI primary, Boot0000)
└─ sdb2  893GB  raid ──┘
Component                              Status
md0 RAID1 (sda2 + sdb2)                ✓ Active, syncing
Primary ESP  sdb1 → /boot/efi          ✓ Boot0000, first in order
Mirror ESP   sda1 → /boot/efi2         ✓ Boot0007, UEFI fallback
fstab (both ESPs, nofail on mirror)    ✓ Configured
Kernel postinst sync hook              ✓ Fires on every kernel update
efibootmgr boot order                  ✓ Primary first, mirror second

EPILOGUE What We Learned

Ubuntu 24.04’s installer is perfectly fine for single-disk installs, cloud VMs, and people who don’t particularly care what happens when hardware fails. For everyone else — people running actual servers, people who want their machine to survive a disk failure while they’re asleep — it requires a manual detour that, frankly, should be unnecessary by now.

The underlying Linux stack — mdadm, GRUB, efibootmgr — handles all of this just fine. The gap is entirely in the installer’s unwillingness to assign two ESP partitions and register both with UEFI. Debian’s installer handles it. Arch lets you do it manually without drama. Ubuntu’s curtin installer just… doesn’t.

To be fair: once you know the workaround, it’s maybe 15 minutes of work and the result is solid. The sync hook is reliable, efibootmgr fallback genuinely works, and the RAID1 array underneath is as bulletproof as software RAID gets.

It’s just that those 15 minutes involve dropping to a console, manually partitioning, re-entering an installer, remembering to do post-install fixups before rebooting, and knowing that PARTUUID is better than UUID for ESP fstab entries.

Standard Tuesday, really.

If you’re on AM5 or any other platform that dropped CSM: you can’t escape UEFI. Legacy BIOS boot is gone. The dual-ESP sync approach described here is the correct solution, it works, and after setup it’s completely transparent. Kernel updates just work. Disk failure just works. Sleep soundly.

Just maybe don’t use the Ubuntu installer’s default storage layout if you care about any of this.


Filed under: sysadmin  ·  linux  ·  ubuntu  ·  raid  ·  uefi
Mood at conclusion: cautiously satisfied