systemd-homed: not enough space to start or free up space
TL;DR
If you're using systemd-homed, it may be a good idea to store a file in the home partition that can be deleted when your disk is full, so that you can mount your home to free up space.
dd bs=1M count=1024 if=/dev/random of=/home/EMERGENCY
Yesterday, I spent my evening at my first install party, debugging different setups and finding solutions for obscure bugs. To take it easy today, I start with a little routine development, until my laptop freezes, shuts down, and refuses to restart. Apparently, debugging weird bugs is not over.
It immediately reminds me of a problem I had when my disk was full on a previous Arch install. I know my disk was nearly full, and the last ISO I downloaded yesterday may have filled the remaining space.
Rescue
The strategy is simple: I boot a live system, decrypt and mount my user partition, remove some files, and reboot. I don't even need a Live USB, as Particle OS provides a Live system UKI entry (it boots a temporary system populated from the unencrypted, read-only usr partition).
But I get some errors when trying to mount my user home:
# losetup -fP --show /home/user.home
loop0
# cryptsetup open /dev/loop0p1 user
# mount /dev/mapper/user /user/home
No such file /dev/mapper/user # Reproduced from memory
The error was something like this; I'm not 100% sure. It came with this kernel log:
BTRFS: device label user devid 1 transid 141091 /dev/mapper/home-user (253:3) scanned by systemd-homewor (69355)
BTRFS info (device dm-3): first mount of filesystem 9d30717d-d824-47bf-be84-14018eaed1e4
BTRFS info (device dm-3): using crc32c (crc32c-lib) checksum algorithm
BTRFS info (device dm-3): start tree-log replay
critical space allocation error, dev loop0, sector 12110384 op 0x1:(WRITE) flags 0x1800 phys_seg 1 prio class 2
critical space allocation error, dev loop0, sector 12110416 op 0x1:(WRITE) flags 0x1800 phys_seg 1 prio class 2
critical space allocation error, dev loop0, sector 17177536 op 0x1:(WRITE) flags 0x1800 phys_seg 1 prio class 2
critical space allocation error, dev loop0, sector 17177568 op 0x1:(WRITE) flags 0x1800 phys_seg 1 prio class 2
critical space allocation error, dev loop0, sector 12110448 op 0x1:(WRITE) flags 0x1800 phys_seg 2 prio class 2
critical space allocation error, dev loop0, sector 17177600 op 0x1:(WRITE) flags 0x1800 phys_seg 1 prio class 2
critical space allocation error, dev loop0, sector 12110512 op 0x1:(WRITE) flags 0x1800 phys_seg 3 prio class 2
critical space allocation error, dev loop0, sector 17177632 op 0x1:(WRITE) flags 0x1800 phys_seg 4 prio class 2
critical space allocation error, dev loop0, sector 12110608 op 0x1:(WRITE) flags 0x1800 phys_seg 1 prio class 2
critical space allocation error, dev loop0, sector 12110640 op 0x1:(WRITE) flags 0x1800 phys_seg 1 prio class 2
BTRFS: error (device dm-3) in btrfs_commit_transaction:2538: errno=-5 IO failure (Error while writing out transaction)
BTRFS warning (device dm-3 state E): Skipping commit of aborted transaction.
BTRFS error (device dm-3 state EA): Transaction aborted (error -5)
BTRFS: error (device dm-3 state EA) in cleanup_transaction:2023: errno=-5 IO failure
BTRFS: error (device dm-3 state EA) in btrfs_replay_log:2091: errno=-5 IO failure (Failed to recover log tree)
BTRFS error (device dm-3 state EA): open_ctree failed: -5
So, if the issue is a full disk, I can't mount my partition to free some space because... it is full.
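The kernel log supports this reading: "critical space allocation" is the message the block layer prints when a write fails with ENOSPC, here on loop0, whose backing file is user.home. A quick way to confirm it from the live system is to look at the free space of the filesystem that stores the image. The helper below is only a sketch: `free_mib` is a name made up for this post, and the usage line assumes the partition holding user.home is mounted on /home.

```shell
# Hypothetical helper: print the free space, in MiB, of the filesystem holding a path.
free_mib() {
    # -P forces the POSIX output format (one line per filesystem),
    # -k reports sizes in KiB; column 4 is the available space.
    df -Pk -- "$1" | awk 'NR == 2 { print int($4 / 1024) }'
}

# Usage on the live system (assumption: the image's partition is mounted on /home):
#   free_mib /home
```

A value close to 0 means the loop device cannot allocate new blocks in user.home, which is consistent with the failed tree-log replay above.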
Since btrfs check turns up nothing, it is time to back up my current home before doing anything else. Fortunately, I have my external drive here, so let's copy my 150G user.home image to this drive 🥲
A few cups of coffee later, when the copy is finally finished, I try to mount (losetup+cryptsetup+mount) this image from the external drive, and it works. That's great: all my data is backed up, and it confirms that I couldn't boot or mount my home because my laptop drive was indeed full.
So I just have to remove some files from this local copy, unmount it (umount+losetup -d), copy the new user.home to my laptop, drink a few more cups and we're done.
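The steps above, written out as a sketch. Everything configurable here is an assumption: the image path on the external drive, the mapper name and the mount point are placeholders, and `run`, `open_home` and `close_home` are names made up for this post. The `cryptsetup close` step is not in the summary above; it is added here so the loop device is no longer in use when it is detached. By default the script only prints the commands (DRY_RUN=1), with /dev/loopN standing in for the real loop device.

```shell
# Hypothetical values: adjust them to your own setup.
IMAGE=${IMAGE:-/mnt/external/user.home}   # copy of the image on the external drive
NAME=${NAME:-user}                        # device-mapper name for the decrypted volume
MNT=${MNT:-/mnt/rescue}                   # where to mount the decrypted home
DRY_RUN=${DRY_RUN:-1}                     # 1 = print commands only, 0 = execute (needs root)

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

open_home() {
    # Attach the image to a free loop device and scan its partition table.
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ losetup -fP --show $IMAGE"
        loopdev=/dev/loopN   # placeholder, no device is attached in dry-run mode
    else
        loopdev=$(losetup -fP --show "$IMAGE") || return 1
    fi
    # Decrypt the first partition, then mount the filesystem inside it.
    run cryptsetup open "${loopdev}p1" "$NAME" &&
    run mkdir -p "$MNT" &&
    run mount "/dev/mapper/$NAME" "$MNT"
}

close_home() {
    # Tear everything down in reverse order.
    run umount "$MNT" &&
    run cryptsetup close "$NAME" &&
    run losetup -d "$loopdev"
}
```

With DRY_RUN=0 and root privileges, `open_home` mounts the copy, you delete what you can under $MNT, and `close_home` releases everything before the image is copied back. Both functions must run in the same shell, since `close_home` reuses the loop device name found by `open_home`.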
For the next time
I was lucky to be at home this time. How could I have rescued my laptop without my external hard drive? And how can I avoid these (excessively) long copies?
=> Simply with a file that I can delete in an emergency if I fill up my disk again.
# dd bs=1M count=1024 if=/dev/random of=/home/EMERGENCY
And with a blog post to remember why I have this EMERGENCY file in my /home.
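To avoid recreating the file by accident, or forgetting to recreate it after an emergency, the same dd command can be wrapped in a small idempotent helper. This is a sketch, not something I have battle-tested: `ensure_reserve` is a made-up name, it reads from /dev/urandom rather than /dev/random so the read never blocks on older kernels, and `status=none` assumes GNU or BusyBox dd.

```shell
# Hypothetical helper: create the reserve file only if it does not exist yet.
# $1 = path of the reserve file, $2 = size in MiB (defaults to 1024, as in the dd command above)
ensure_reserve() {
    reserve=$1
    size_mib=${2:-1024}
    if [ -e "$reserve" ]; then
        echo "reserve already present: $reserve"
        return 0
    fi
    # Random data does not compress, so the file really occupies its size on disk.
    dd bs=1M count="$size_mib" if=/dev/urandom of="$reserve" status=none &&
        echo "reserve created: $reserve"
}

# Usage, as root:
#   ensure_reserve /home/EMERGENCY
```

The file has to live on the filesystem that stores user.home, next to the image and not inside the encrypted home. The plan for the next emergency is then to boot the live system, run `rm /home/EMERGENCY`, and try mounting the home again, now that the loop device has room to write.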