Background Story

Last week, I experienced a major data loss on my daily driver computer due to an encryption failure. The last time I ran into this sort of situation was over a decade ago, with TrueCrypt on an external HDD, and unfortunately I lost everything that wasn’t backed up. This time I recovered everything from backup, so I decided to document it.

The entire storage on my daily driver is encrypted with LUKS, and it suddenly failed to decrypt at boot. I didn’t have the header backed up, and I stopped using Timeshift a long time ago because its snapshots can’t be scheduled, which led to unpredictable performance impact.
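For reference, backing up a LUKS header takes a single cryptsetup command. Below is a minimal sketch, assuming a hypothetical partition /dev/nvme0n1p3 and a placeholder output path (both are illustrations, not values from this machine). It prints the command by default and only executes it when DRY_RUN=0 is set:

```shell
#!/bin/sh
# Sketch: save a copy of the LUKS header so a damaged header area can be restored.
# LUKS_DEV and HEADER_FILE are hypothetical placeholders; adjust them to your layout.
LUKS_DEV=${LUKS_DEV:-/dev/nvme0n1p3}
HEADER_FILE=${HEADER_FILE:-/path/to/external/luks-header.img}

run() {
    # Print the command by default; execute it only when DRY_RUN=0.
    if [ "${DRY_RUN:-1}" = 0 ]; then "$@"; else echo "$@"; fi
}

backup_luks_header() {
    # $1 = LUKS device, $2 = file that receives the header copy
    run cryptsetup luksHeaderBackup "$1" --header-backup-file "$2"
}

backup_luks_header "$LUKS_DEV" "$HEADER_FILE"
```

Keep the resulting file somewhere other than the encrypted disk. cryptsetup luksHeaderRestore with the same --header-backup-file option writes it back onto the device.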

I believe the failure was caused by physical data corruption in the LUKS header area. The hardware is a very aged, low-end TLC NVMe drive (Toshiba BG3) that was pulled out of someone’s dead laptop. I bought a new replacement, taking advantage of the current drop in SSD prices. I got a Samsung 970 EVO Plus (1TB) for the same price as the SK hynix Gold S31 (1TB) I bought on Black Friday, and it’s NVMe instead of SATA.

I don’t like spending money on electronics, especially brand-new ones, but sometimes it’s literally cheaper than buying used. Buying “perishable hardware” such as SSDs in new condition is easier for me to justify when considering lifespan and reliability factors like TBW and MTBF.

Fortunately, I had a full system backup from within the last few days, so I had a safety net against critical data loss.

My backup is made locally with my favorite tool, restic:

restic --exclude={/dev,/media,/mnt,/proc,/run,/sys,/tmp,/var/tmp} -r /path/to/repository/ backup /
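The {a,b,c} syntax in that command is bash brace expansion: it turns the single argument into one --exclude=PATH option per path. Shells without brace expansion (dash, for example) pass the braces through literally. A portable sketch of the same exclude list, using the placeholder repository path from above; the function name is made up for illustration, and it only prints the command so it can be reviewed first:

```shell
#!/bin/sh
# Portable sketch: build the exclude list without bash brace expansion.
# REPO is the placeholder path from the post.
REPO=${REPO:-/path/to/repository/}

restic_backup_cmd() {
    # Replace every path argument with a separate --exclude=PATH option.
    for p in "$@"; do
        shift
        set -- "$@" "--exclude=$p"
    done
    # Print the command for review; drop the echo to actually run it.
    echo restic "$@" -r "$REPO" backup /
}

restic_backup_cmd /dev /media /mnt /proc /run /sys /tmp /var/tmp
```

The loop works because the list in a for statement is expanded once, before the first iteration, so the positional parameters can be rewritten safely inside it.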

Recovery Steps

Prepare the New Disk

Usually, manually partitioning the new disk, mounting it, and then restoring the backup is the way to go. However, encryption and EFI can make things far more complex. So I just did a fresh install using a Live CD of the same distro and installed restic in the live system while the installer was running.

Restore the Backup

After installation, restore right in the live system:

restic --exclude={/boot,/etc/fstab,/etc/crypttab} -r /path/to/repository restore latest -t /new/disk/root/

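The excludes keep the restore from overwriting the fresh install’s boot files, fstab, and crypttab, so the old backup doesn’t interfere with booting or with the new encryption setup.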
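One way to double-check that the fresh install’s fstab and crypttab survived the restore is to confirm that every UUID they mention still belongs to a real device. This is a sketch of such a check, not something that was part of the original recovery; the function names are made up, and it assumes entries are referenced as UUID=... and that grep supports -o:

```shell
#!/bin/sh
# Sketch: confirm that fstab/crypttab under the restore target point at existing devices.
# ROOT is the restore target used in the post.
ROOT=${ROOT:-/new/disk/root}

list_uuids() {
    # Print every UUID=... value found in the non-comment lines of the given file.
    grep -v '^[[:space:]]*#' "$1" | grep -o 'UUID=[^[:space:]]*' | cut -d= -f2
}

check_uuids() {
    # blkid -U prints the device carrying a UUID and fails if there is none.
    for uuid in $(list_uuids "$1"); do
        if dev=$(blkid -U "$uuid"); then
            echo "$uuid -> $dev"
        else
            echo "$uuid -> MISSING"
        fi
    done
}

for f in "$ROOT/etc/fstab" "$ROOT/etc/crypttab"; do
    if [ -f "$f" ]; then check_uuids "$f"; fi
done
```

Run it as root from the live system while the new disk is unlocked and mounted. A MISSING line would suggest that a file from the old disk slipped through the excludes.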

Cover Up

Usually, this step involves repairing or rebuilding GRUB 2 and fstab, but that was not the case here.

After rebooting into the restored system on the new disk, the graphics crashed. As a result, I was unable to switch to a TTY with Ctrl+Alt+F1 through F12.

I think it was caused by corruption in my graphics-related files, perhaps the driver files.

Solution:

  1. Reboot to the GRUB menu and press "e" to edit the boot options
  2. Remove quiet and add 3 at the end of the kernel command line
  3. Press Ctrl+X to boot into the system without crashing the graphics driver
  4. Switch to a TTY and reinstall the corrupted driver; in my case, remove mesa-dri-drivers and then install it again
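To make step 2 concrete, here is a small illustration of the edit applied to a kernel command line. The sample line is made up (real ones are longer and differ per system), and the function name is hypothetical. On systemd-based distros, a trailing 3 boots into the text-only multi-user target:

```shell
#!/bin/sh
# Illustration of step 2: drop "quiet" and append "3" to a kernel command line.
# The sample line passed at the bottom is hypothetical.
edit_cmdline() (
    set -f    # disable glob expansion while the line is split into words
    out=
    for word in $1; do
        # Keep every word except "quiet".
        [ "$word" = quiet ] || out="$out $word"
    done
    # Strip the leading space and append the runlevel.
    echo "${out# } 3"
)

edit_cmdline 'linux /vmlinuz root=/dev/mapper/luks-root ro rhgb quiet'
```

For the sample input this prints linux /vmlinuz root=/dev/mapper/luks-root ro rhgb 3. The edit made in the GRUB editor only applies to that one boot, so nothing needs to be undone afterwards.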

I believe excluding /usr/lib64/dri during the restore might have prevented it, but it won’t happen again.

I still need to repair my network drive entry in fstab, and that’s it. Everything is back to working, only faster, thanks to the new SSD.