Cross(trans)grading PopOS to Elive beta 3.8.30 [SOLVED]

I cannot boot the new Elive that replaced PopOS.
My boot config might be a little bit complicated for the transgrade.

I usually boot the laptop with rEFInd and systemd-boot on the laptop SSD. I have another USB SSD with PopOS that boots with grub from that disk. Both disks are encrypted with LUKS. The PopOS disk has 3 partitions: ESP, ISO image, and LUKS with LVM holding LVs for root, swap and backup.

Currently, if I boot the old PopOS disk (now Elive), it asks for the disk passphrase, then I wait for a long, long time and end up in emergency mode.

I could reinstall Elive on the disk, but I want to keep my data-backup LV, since it holds borg backups and other stuff that I want to keep.

Any idea how to proceed?

Regards,
BT

It's difficult to know what happened, but basically this is what the OS is meant to do:

  • have an extra /boot partition so it can load what is required for the decryption
  • ask you for the passphrase for the partition set in /etc/crypttab (by its UUID); this should be your encrypted root partition, while /etc/fstab should then point to the decrypted (mapped) device

Finally, it seems the name should follow this pattern: if the encrypted partition is sdb3, the crypttab file should contain a name like sdb3_crypt. More importantly, the partition needs to be opened under that same name if you chroot into the system to reinstall, otherwise the initrd generated from there will not recognize it.

On the other hand, I received the installer reports of your install :slight_smile: Tell me if you want me to post them somewhere. At a quick look everything seems to have been performed well, and your fstab and crypttab entries seem to be correct, except for these lines:

Device /dev/mapper/data-root is not a valid LUKS device.
umount: /mnt/target/mnt/backup: target is busy.

To me it sounds like the boot doesn't correctly find your encrypted partition (at least not by its name) and cannot proceed. It is very difficult to know from here what exactly happened. The only thing I can do is attempt an install of PopOS in a virtual machine and beta-test the migration to see if I can reproduce the issue (for that, you should explain the options you selected when installing PopOS). If I need to fix anything in the installer, you can later use the fixed installer to reinstall the system correctly. (You cannot do that with the current installer: since you never booted the new Elive successfully, it won't offer the migration/upgrade mode.)

I can chroot from my Arch into Elive and mount everything. There was a problem with /etc/crypttab, since the LUKS device is named cryptdata and not sda3_crypt. It also had no key file, which is needed to boot without entering the passphrase a second time. Also, at first it was not able to mount /dev/sda1, the ESP partition. In the chroot I reformatted sda1 as vfat, and now it mounts OK (it already was vfat and I really don't know why it would not mount). /etc/default/grub was also incorrect, since the LUKS UUID and the root were not specified. Now I can boot, but it doesn't seem to see the root LV. It's there and it's OK. I have to read about LUKS and LVM on Debian; it looks like the LVM is not available when it wants to mount it, as if the "vgchange -ay data" that activates the volume group had not been run yet. I think a module or a hook is missing in the initrd, or something like that. Still searching!

Update: after the last modification, the passphrase is no longer asked for, and it complains that it can't find the root LV, which is normal since the root is not decrypted yet. OK, I'll stop and sleep on it now ;=)
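
For anyone hitting the same crypttab mismatch, here is a minimal Python sketch that parses crypttab-style text and reports which mapping name belongs to a given LUKS UUID, so it can be compared with the name the device is actually opened under (cryptdata here). The sample entry is an assumption about what a matching line could look like, not a copy of the real file. The UUID is the one quoted in the grub settings in this thread, and the helper names are made up for illustration.

```python
from typing import List, NamedTuple, Optional


class CrypttabEntry(NamedTuple):
    name: str     # mapping name, shows up as /dev/mapper/<name>
    source: str   # encrypted device, e.g. UUID=... or /dev/sda3
    keyfile: str  # "none" means a passphrase is asked for
    options: str  # comma-separated options, e.g. "luks"


def parse_crypttab(text: str) -> List[CrypttabEntry]:
    """Parse crypttab-style text, skipping blank lines and comments."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[0].startswith("#"):
            continue
        # The key file and options columns are optional.
        keyfile = fields[2] if len(fields) > 2 else "none"
        options = fields[3] if len(fields) > 3 else ""
        entries.append(CrypttabEntry(fields[0], fields[1], keyfile, options))
    return entries


def mapping_for_uuid(entries: List[CrypttabEntry], luks_uuid: str) -> Optional[str]:
    """Return the mapping name whose source is UUID=<luks_uuid>, or None."""
    for entry in entries:
        if entry.source == "UUID=" + luks_uuid:
            return entry.name
    return None


# Hypothetical sample entry; only the UUID and the name come from this thread.
LUKS_UUID = "cd33bffd-2f26-45c1-ae93-e43e9e06fcdc"
SAMPLE_CRYPTTAB = """
# <name>    <source>                                   <keyfile>  <options>
cryptdata   UUID=cd33bffd-2f26-45c1-ae93-e43e9e06fcdc  none       luks
"""

entries = parse_crypttab(SAMPLE_CRYPTTAB)
found_name = mapping_for_uuid(entries, LUKS_UUID)
```

On a real system the text would come from /etc/crypttab inside the chroot. If the returned name differs from the name the device was opened under before regenerating the initrd, that is the mismatch described above.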

After much trial and error, I'm still stuck with errors.
If I check /boot/grub/grub.cfg, there is no cryptomount of my LUKS disk.
I have specified the following in /etc/default/grub:

GRUB_DISTRIBUTOR="Elive 64BIT 3.8.30 (beta)"
GRUB_ENABLE_CRYPTODISK=y
GRUB_PRELOAD_MODULES="part_gpt part_msdos cryptodisk luks lvm"
GRUB_CMDLINE_LINUX_DEFAULT="cryptdevice=UUID=cd33bffd-2f26-45c1-ae93-e43e9e06fcdc:cryptdata     root=/dev/data/root  splash resume=/dev/disk/by-uuid/69fd5f21-541a-4540-a2f9-b1bd7021a4cb cryptkey=rootfs:/etc/cryptdata.key rd.lvm.vg=data rd.luks.uuid=cd33bffd-2f26-45c1-ae93-e43e9e06fcdc"
GRUB_CMDLINE_LINUX=""

I ran update-grub and grub-install /dev/sda, but no luck: it doesn't want to decrypt the disk and start ...

Still searching,
BT :nauseated_face:
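
One thing that may be worth checking with a command line like the one above is which initramfs generator actually reads each parameter. The Python sketch below sorts the parameters by prefix. The mapping table is my own assumption based on the mkinitcpio and dracut documentation, not something confirmed in this thread: as far as I know, cryptdevice= and cryptkey= belong to the mkinitcpio "encrypt" hook used by Arch-based systems, rd.luks.* and rd.lvm.* belong to dracut/systemd-style initramfs images, and a Debian-style initrd built with cryptsetup-initramfs takes its information from /etc/crypttab instead.

```python
from typing import Dict

# Assumed mapping of parameter prefixes to the tool that reads them.
KNOWN_PREFIXES = {
    "cryptdevice=": "mkinitcpio 'encrypt' hook (Arch family)",
    "cryptkey=": "mkinitcpio 'encrypt' hook (Arch family)",
    "rd.luks.": "dracut / systemd-style initramfs",
    "rd.lvm.": "dracut",
}


def classify_cmdline(cmdline: str) -> Dict[str, str]:
    """Map each recognised kernel parameter name to the tool that reads it."""
    result = {}
    for param in cmdline.split():
        for prefix, reader in KNOWN_PREFIXES.items():
            if param.startswith(prefix):
                # Keep only the parameter name, not its value.
                result[param.split("=", 1)[0]] = reader
                break
    return result


# The GRUB_CMDLINE_LINUX_DEFAULT value quoted in this thread.
CMDLINE = (
    "cryptdevice=UUID=cd33bffd-2f26-45c1-ae93-e43e9e06fcdc:cryptdata "
    "root=/dev/data/root splash "
    "resume=/dev/disk/by-uuid/69fd5f21-541a-4540-a2f9-b1bd7021a4cb "
    "cryptkey=rootfs:/etc/cryptdata.key rd.lvm.vg=data "
    "rd.luks.uuid=cd33bffd-2f26-45c1-ae93-e43e9e06fcdc"
)

classified = classify_cmdline(CMDLINE)
for name, reader in classified.items():
    print(name, "->", reader)
```

Parameters that do not appear in the result (root=, resume=, splash) are generic ones. If the table is right, the crypto-related ones would simply be ignored by an initrd that was not built by the tool listed next to them.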

I just installed Pop_OS (same version as yours) in a virtual machine to beta-test the migration mode. The partitioning looked pretty much the same as yours. In my case it worked perfectly: Elive booted correctly and was set up correctly, so I'm sorry I cannot help much more here.

Some notes:

  • from what I know, there is no need to add anything to your grub config; the boot is meant to "search" for your root partition, and this configuration is already set in your initrd
  • have you tried disabling the "swap" line in your fstab ( /dev/mapper/data-swap )? I don't know how Pop_OS created this entry, but it doesn't show up like that on my installation (there, swap is /dev/sda3), and the Elive installation correctly replaced the first entry with the second one. I'm not sure this is your case, but if the boot can't find a partition listed in your fstab, that may be why it gets stuck
  • if you are unable to make the system boot, I think the easiest option is to back up your important data, do a clean installation, and copy the data back

Considering there's talk of:

  1. a fixed SSD as well as a USB-disk with the same content.
  2. Using Grub as well as systemd-boot

I've lost track of which one was causing trouble and which one is being described.

I do see similarities to the mess that was created when I mixed up EFI and MBR boot on my machine, where indeed my encrypted sda3 was not found afterwards.
The EFI partition got totally messed up, so after trying many options I cloned back the original EFI and /boot partitions and did another run of the installer (using UEFI boot), and everything was properly upgraded again.

I will try that. A reinstall of the systemd-boot could be the solution.

But on the Elive USB SSD I see that grub doesn't load the cryptodisk module and doesn't call it for decryption afterwards ... So it may as well be a problem with grub.

My original laptop disk is meant to boot via EFI only. I use systemd-boot, and no grub is installed on my laptop disk. What would be the correct way to add a Linux on an external disk with grub and EFI boot?

Well, if as you say, your laptop only has UEFI boot .... then IMO that doesn't leave you with any other option than to use UEFI on the USB as well.

My laptop supports both UEFI and legacy mode but definitely will not mix the two.

Just to be clear and to avoid misunderstandings:
Grub does not decrypt your disk, it merely makes sure the correct kernel with its initrd is loaded and booted.
It's the kernel that does the real heavy lifting when it comes to encryption and recognizing filesystems.

IIRC there's no need to add much to grub, because all of this is managed automatically by the initrd. It is required, though, that the system is correctly mounted and set up when re-creating the initrd, since it is generated dynamically according to what the system needs...

A recent issue on Elive was that the cryptsetup-initramfs package was not installed by default, and only because of this the initrd lacked the "decrypt root partition on boot" feature. Recent versions of Elive (and its installer) have a much better implementation that makes sure it is installed when needed. I also just verified (again) in @PerfMonk's installation logs that this package was correctly installed, so it's not the cause of the issue here (I just wanted to mention this example). If you depend on something similar, like a module to load or an initrd conf file to include, that may be the cause, but the specific setup is too difficult to know from here (and my Pop_OS test was successful).

On the Elive USB SSD, grub is used with EFI (not legacy).
Grub must help decrypt the initramfs by loading the cryptodisk module (for cryptomount) and a few others.
This is the part that is not done by the grub.cfg generated by "grub-mkconfig" or "update-grub".

It seems that this part does not work here (but why, I can't tell).

Here is an example extract for booting from a USB disk encrypted with LUKS
(part of the /boot/grub/grub.cfg of another Linux):

### BEGIN /etc/grub.d/10_linux ###
menuentry 'Mabox Linux' --class mabox --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-simple-bb613212-6e1d-400f-b972-124ae05d791d' {
        savedefault
        load_video
        set gfxpayload=keep
        insmod gzio
        insmod part_gpt
        insmod cryptodisk
        insmod luks
        insmod gcry_rijndael
        insmod gcry_rijndael
        insmod gcry_sha256
        insmod ext2
        cryptomount -u 52f55eefa3ca425b903cfdd9b6fbc5af
        set root='cryptouuid/52f55eefa3ca425b903cfdd9b6fbc5af'
        if [ x$feature_platform_search_hint = xy ]; then
          search --no-floppy --fs-uuid --set=root --hint='cryptouuid/52f55eefa3ca425b903cfdd9b6fbc5af'  bb613212-6e1d-400f-b972-124ae05d791d
        else
          search --no-floppy --fs-uuid --set=root bb613212-6e1d-400f-b972-124ae05d791d
        fi
        linux   /boot/vmlinuz-5.18-x86_64 root=UUID=bb613212-6e1d-400f-b972-124ae05d791d rw  quiet cryptdevice=UUID=52f55eef-a3ca-425b-903c-fdd9b6fbc5af:luks-52f55eef-a3ca-425b-903c-fdd9b6fbc5af root=/dev/mapper/luks-52f55eef-a3ca-425b-903c-fdd9b6fbc5af resume=/dev/mapper/luks-e47fa051-f712-42d1-a3e4-f5bff369d417 udev.log_priority=3 
        initrd  /boot/amd-ucode.img /boot/initramfs-5.18-x86_64.img
}

In the grub.cfg generated on the Pop-to-Elive transgrade, grub loads neither the cryptodisk module nor any of the associated modules, and it doesn't run "cryptomount -u <crypto UUID>" either.

/etc/default/grub does specify "GRUB_ENABLE_CRYPTODISK=y", and it gives the cryptdevice and the other info necessary to boot. It looks like no parameter I change in /etc/default/grub makes grub realize that the disk is encrypted.

Still searching a solution,
BT
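
To compare a working grub.cfg with a non-working one without reading them line by line, a small Python sketch like the following can list the crypto-related "insmod" lines and the "cryptomount -u" calls. The function name is hypothetical and the sample text is just a shortened copy of the Mabox menu entry quoted above.

```python
import re
from typing import Dict, List


def grub_crypto_summary(cfg_text: str) -> Dict[str, List[str]]:
    """Collect crypto-related insmod lines and cryptomount UUIDs from grub.cfg text."""
    modules = re.findall(r"^\s*insmod\s+(\S+)", cfg_text, flags=re.MULTILINE)
    uuids = re.findall(r"^\s*cryptomount\s+-u\s+(\S+)", cfg_text, flags=re.MULTILINE)
    crypto_modules = sorted(
        {m for m in modules
         if m in ("cryptodisk", "luks", "luks2") or m.startswith("gcry_")}
    )
    return {"crypto_modules": crypto_modules, "cryptomount_uuids": uuids}


# Shortened copy of the working menu entry quoted in this thread.
WORKING_CFG = """
        insmod gzio
        insmod part_gpt
        insmod cryptodisk
        insmod luks
        insmod gcry_rijndael
        insmod gcry_rijndael
        insmod gcry_sha256
        insmod ext2
        cryptomount -u 52f55eefa3ca425b903cfdd9b6fbc5af
"""

summary = grub_crypto_summary(WORKING_CFG)
print(summary)
```

Running the same function on the text of the transgraded disk's /boot/grub/grub.cfg should return two empty lists if the description above is accurate, which makes the difference between the two files easy to see.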

I'm still wondering why the heck grub doesn't want to recognize my LUKS-encrypted partition ....
I've been playing with different setups, grub parameters, ... My partition is encrypted with LUKS2, but in a format that grub2 can handle:

Keyslots:
  0: luks2
        Key:        512 bits
        Priority:   normal
        Cipher:     aes-xts-plain64
        Cipher key: 512 bits
        PBKDF:      pbkdf2

And it used to boot with popOS, so that shouldn't be an issue.

Still searching ...
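
The keyslot check above can be automated with a short Python sketch that reads the "Keyslots:" part of "cryptsetup luksDump" output and lists the PBKDF of each slot. It assumes, as far as I know correctly for the GRUB 2.06 generation, that GRUB's LUKS2 support can only open keyslots using pbkdf2 and not argon2i/argon2id. The function names are made up for illustration, and the sample is the keyslot block quoted in this thread.

```python
import re
from typing import Dict, List


def keyslot_pbkdfs(dump_text: str) -> Dict[int, str]:
    """Map each LUKS2 keyslot number to its PBKDF, from luksDump-style text."""
    slots = {}
    current = None
    for line in dump_text.splitlines():
        slot = re.match(r"\s*(\d+):\s*luks2\b", line)
        if slot:
            current = int(slot.group(1))
            continue
        kdf = re.match(r"\s*PBKDF:\s*(\S+)", line)
        if kdf and current is not None:
            slots[current] = kdf.group(1)
    return slots


def slots_grub_cannot_open(dump_text: str) -> List[int]:
    """Keyslots whose PBKDF is not pbkdf2 (assumed unsupported by GRUB)."""
    return [n for n, kdf in sorted(keyslot_pbkdfs(dump_text).items())
            if kdf != "pbkdf2"]


# Keyslot block as posted in this thread.
THREAD_DUMP = """
Keyslots:
  0: luks2
        Key:        512 bits
        Priority:   normal
        Cipher:     aes-xts-plain64
        Cipher key: 512 bits
        PBKDF:      pbkdf2
"""

pbkdfs = keyslot_pbkdfs(THREAD_DUMP)
problem_slots = slots_grub_cannot_open(THREAD_DUMP)
```

For the dump posted here the list of problem slots is empty, which agrees with the observation that the format itself should not be the issue.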

I never found a reason either but never looked that far knowing UEFI was the cause in my case.

That is to say, I never found anything wrong in any of the clear-text config files, which leaves the conclusion that the cause is probably in one of the generated binary files, like those in '/boot/efi/EFI' or the 'initrd'.

So if you're searching for specific LUKS/LVM wisdom, I'd advise searching in that direction.

A thing I also did (albeit not 100% sure it made any difference but it could be that the firmware is putting grub on the wrong foot) is clean up all old/stale entries in EFI with 'efibootmgr'. :madness:

Everything considered, it would've been wiser in my case to only have a separate '/home' partition encrypted instead of the whole OS.
Security-wise it wouldn't have made a difference. :thinking:

I think that having only home encrypted leaves room for hacking, since the swap file would not be encrypted, many temporary files may still be useful to an attacker, and the OS itself can be hacked. But nothing is hackproof in this world :laughing:

Well if you have the swap file (N.B not the partition) on your '/home' partition that issue would be solved.
It starts to become a real problem when, e.g., your RAM becomes accessible to anyone with enough tech savvy.... Naturally you could use 'ramcrypt', but it will have the same issues as an encrypted filesystem.

There is actually only one reasonably safe way to keep your data ultimately safe (including whether the whole OS and RAM or only /home is encrypted) and that is to shutdown the machine gracefully (meaning no plug pulling or holding down the on/off button)...or throw it out the window (except if it's very cold outside) in case of a SWAT team crashing down your door. :madness:
Also be very aware that a suspended or locked machine remains decrypted in every sense and is in no way really protected.

All in all, it's a bit contrary to the Linux habit of flaunting long up-times but for safety's sake (if safety indeed is really important) make a habit of shutting down the machine when not using it....and keep sensitive material in extra encrypted folders, files or devices. :typing:

I have dumped the transgraded PopOS/Elive.
A new install fixed everything. It boots and it decrypts the root partition.

I was getting mad at grub/EFI .... Now it boots, decrypts and works like a charm!

I will close this thread and mark as solved.