Pironman5 - NVMe won't boot (every 2-3 times)

I have installed Home Assistant OS on my Pi 5 and installed the Pironman5 case with an NVMe drive, following the instructions in the SunFounder docs.

Every second or third time I reboot the system it won’t start up. It seems like it doesn’t recognize the NVMe drive and cycles through retries to boot the system.

Oddly, this only happens every 2 or 3 times I reboot the system. If I power it down, and start it again - it will boot from the NVMe drive and into HASS OS. Then if I reboot it typically works again. But on the next or following reboot it gets stuck and won’t start.

I have tried several troubleshooting steps including:

  1. Replaced the FPC cable with both that were shipped (same behavior with both).
  2. Unplugged and re-seated both FPC cables multiple times.
  3. Tried putting in an SD card both when powering up and during the error screen, but it doesn’t appear to boot from the SD card either.

The error screen shows:

Boot mode: RESTART (0f) order 0
Boot mode: NVME (06) order f14
VID 0x1344 HN CT500P4PSSD8
NVME on 0
Trying partition: 0
type: 16 lba: 2048 'mkfs.fat' "    V      ^ " clusters 32695 (4)
Trying partition: 0
type: 16 lba: 2048 'mkfs.fat' "    V      ^ " clusters 32695 (4)
NVME off
Timeout 00000000 3c303010 00000000 00000000
nvme: error 8
Failed to open device: 'nvme'
Boot mode: USB-MSD (04) order f1

The same sequence then repeats with different cluster numbers, or with values such as "32 lba 8192", but it never succeeds.
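For anyone comparing several captures of this screen, here is a minimal Python sketch that summarizes which boot modes were attempted and which ones failed. It only relies on the line shapes visible in the log above ("Boot mode: ..." and "Failed to open device: ..."). The function name summarize_boot_log and the dictionary keys are made up for illustration, and the meaning of the "order" field is not interpreted, only recorded.

```python
import re

# Patterns based only on the line shapes seen in the pasted log.
BOOT_MODE_RE = re.compile(r"^Boot mode: (\S+) \(([0-9a-f]{2})\) order (\S+)$")
FAIL_RE = re.compile(r"^Failed to open device: '(\w+)'$")

SAMPLE_LOG = """\
Boot mode: RESTART (0f) order 0
Boot mode: NVME (06) order f14
NVME on 0
Trying partition: 0
NVME off
Timeout 00000000 3c303010 00000000 00000000
nvme: error 8
Failed to open device: 'nvme'
Boot mode: USB-MSD (04) order f1
"""


def summarize_boot_log(text):
    """Return one dict per 'Boot mode' line, flagged if a failure line follows it."""
    attempts = []
    for raw in text.splitlines():
        line = raw.strip()
        mode = BOOT_MODE_RE.match(line)
        if mode:
            attempts.append({
                "mode": mode.group(1),
                "code": int(mode.group(2), 16),
                "order_field": mode.group(3),  # recorded as-is, not interpreted
                "failed": False,
            })
        elif FAIL_RE.match(line) and attempts:
            # Attribute the failure to the most recent boot mode attempt.
            attempts[-1]["failed"] = True
    return attempts


for attempt in summarize_boot_log(SAMPLE_LOG):
    status = "FAILED" if attempt["failed"] else "no failure line seen"
    print(f'{attempt["mode"]} (0x{attempt["code"]:02x}): {status}')
```

Running this over a capture from a good boot and one from a stuck boot makes it easier to see whether the NVMe attempt is the only step that differs.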

The SSD is brand new and on the ‘good’ list: Crucial P3 Plus M.2 500GB.

I have recorded a short video and a couple of screenshots.

I would welcome any help or guidance on what steps I can take to correct this so it boots successfully 100% of the time. Could this be a bad SSD? Something else?

We recommend that you install the Raspberry Pi OS on the SD card and boot from it. Then, enter the command line and execute the following command:

sudo rpi-eeprom-config --edit

Configure the boot order to prioritize booting from the SSD
https://docs.sunfounder.com/projects/pironman5/en/latest/install/copy_sd_to_nvme_rpi.html

by changing the BOOT_ORDER line to:

BOOT_ORDER=0xf416

Explanation (the digits are read from right to left):
6: First try NVMe SSD
1: Then try SD card
4: Then try USB

If you instead want the system to boot from the SD card first, set the BOOT_ORDER line to:

BOOT_ORDER=0xf461

Once you have completed the configuration, restart the Pironman 5 for the changes to take effect.
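To double-check a BOOT_ORDER value before saving it, the digit-by-digit reading described above can be sketched in a few lines of Python. This is an illustration only, not part of any SunFounder or Raspberry Pi tool: the function name decode_boot_order is hypothetical, and the table covers only the codes that appear in this thread (1, 4, 6 and f, the last of which the boot screen labels RESTART).

```python
# Boot mode codes that appear in this thread; other codes are reported as unknown.
BOOT_MODES = {
    0x1: "SD card",
    0x4: "USB mass storage",
    0x6: "NVMe",
    0xF: "Restart",
}


def decode_boot_order(value):
    """List the boot modes in the order they are tried (lowest hex digit first)."""
    order = []
    while value:
        digit = value & 0xF  # take the rightmost hex digit
        order.append(BOOT_MODES.get(digit, f"unknown (0x{digit:x})"))
        value >>= 4  # move on to the next digit to the left
    return order


print(decode_boot_order(0xF416))  # NVMe first
print(decode_boot_order(0xF461))  # SD card first
```

With 0xf416 the list starts with NVMe, and with 0xf461 it starts with the SD card. In both cases the final entry is Restart, which sends the bootloader back to the start of the list.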

Try installing Home Assistant OS on your SSD and booting from it to see the results!

I followed the recommended steps above. The only change needed was changing the BOOT_ORDER value from 0xf461 to the specified 0xf416.

The Pi boots into Home Assistant OS on the SSD as before, and every 2-3 reboots it still gets stuck, unable to boot. This time it shows a friendlier screen showing that it is looping through NVMe, then SD, then USB (as specified in the BOOT_ORDER setting), but it never succeeds. I can power down the device and start it again, and it will boot into Home Assistant.

After further reboots, the second or third one lands in that boot loop again, where it doesn’t seem able to recognize a boot device.

I also just tried re-imaging my SSD from scratch on my PC, using Raspberry Pi Imager to write Home Assistant OS. Even after that it exhibits the same behavior: every 2-3 reboots it fails to recognize the SSD to boot.

Are there any other suggestions on what I might be able to do to fix this? I don’t want the machine to be in a position where a reboot might not be successful and my home automation/control will be completely offline.

I attempted another test using a different NVMe SSD, this time a Samsung 970 EVO. It exhibits exactly the same behavior: it boots fine 1 or 2 times, recognizes the NVMe drive and starts Home Assistant, and then cannot start up.

Is it possible there is something wrong with the NVMe PIP? I have now replaced the SSD and the cable and re-imaged the drive, with no change.

I think I may have identified a symptom, but I can’t tell whether it is actually related. I noticed that on many of the reboots that precede a failed startup, the reboot sequence shows the following message and stays there for 30+ seconds. The screen then goes black and the Pi enters the boot loop without ever picking up the NVMe drive, SD card, etc.

[ OK ] Reached target System Reboot.
[ 120.754317 ] watchdog: watchdog0: watchdog did not stop!

Is it possible this watchdog message is related, for example because the NVMe drive isn’t ready or hasn’t been released before the system reboots?

Note that when the reboot sequence finishes quickly (e.g. it doesn’t wait on that line), the reboot seems to succeed.
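One way to test this hunch is to save the console text from each reboot and check it for the warning, then note next to each capture whether the following boot succeeded. The Python sketch below only looks for the message shape quoted above; the function name find_watchdog_warnings is hypothetical, and how the console text gets captured (serial console, photo transcription, etc.) is left open.

```python
import re

# Matches lines shaped like the one quoted above, e.g.
# "[ 120.754317 ] watchdog: watchdog0: watchdog did not stop!"
WATCHDOG_RE = re.compile(
    r"\[\s*(\d+\.\d+)\s*\]\s+watchdog: (\w+): watchdog did not stop!"
)


def find_watchdog_warnings(console_text):
    """Return (timestamp_seconds, device_name) for each warning found."""
    return [
        (float(match.group(1)), match.group(2))
        for match in WATCHDOG_RE.finditer(console_text)
    ]


capture = """\
[ OK ] Reached target System Reboot.
[ 120.754317 ] watchdog: watchdog0: watchdog did not stop!
"""

warnings = find_watchdog_warnings(capture)
print(warnings if warnings else "no watchdog warning in this capture")
```

If the warning shows up only in captures that precede a failed boot, that supports the idea that the two are linked. If it also appears before successful boots, it is probably unrelated.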

If you can, remove the SD card extender, put your SD card directly into raspberry Pi 5 and reboot it many times…

Thank you @Weirdyguy. I’ll give that a try. To confirm/clarify: are you suggesting I boot with the SD card AND the NVMe installed (without the SD card extender) and do so many times (whereby I assume it will boot from NVMe)?

Or should I boot from SD card (without NVMe connected)?

You can try both. There are a lot of reported issues with the SD card extender. The point is just to isolate a possibly faulty SD card extender. On my side, I get the watchdog error on shutdown/reboot when I boot from NVMe without an SD card.

Thank you for clarifying. And are you suggesting that booting several times without the adapter will help me actually correct/fix the problem when I again reinstall the adapter? Or that it might help me isolate the adapter itself as the source of the issue?

No, the extender itself may have a problem. Rebooting many times is just to check whether the setup is stable.

Rysock, could you please try to reproduce the issue again?

We suggest recording a video of the process where the problem occurs, as it will help us analyze and resolve the issue more effectively.

If the video file is large, please upload it to OneDrive and share the link with us, ensuring you grant us access. Thank you!

I tried @Weirdyguy’s suggestion: I disassembled the case, removed the SD card extender, and rebooted several times. Unfortunately, the problem still occurs every 2-3 reboots.

I’ll take another set of videos and upload them to share with @SunFounder_Moderator.

Have you tried other OSes, such as Raspbian? On my side, I installed a lite desktop version onto the SD card and booted from it. From that OS, I installed a full desktop version onto the NVMe with Imager. Then I shut down, removed the SD card, and booted directly from the NVMe.

Similar issue to mine :(

You need to configure the boot order of the system.

If you want to use the SSD, you need to set it to prioritize booting from the SSD. Enter the command line and execute the following command:

sudo rpi-eeprom-config --edit

Configure the boot order to prioritize booting from the SSD:
https://docs.sunfounder.com/projects/pironman5/en/latest/install/copy_sd_to_nvme_rpi.html

Change the BOOT_ORDER line to:

BOOT_ORDER=0xf416

Explanation (the digits are read from right to left):

6: Try NVMe SSD first
1: Then try the SD card
4: Then try USB

If you instead want the system to boot from the SD card first, set the BOOT_ORDER line to:

BOOT_ORDER=0xf461

After completing the configuration, reboot the Pironman 5 for the changes to take effect.

Mine was already set to f461, and I then tried it the other way round. It should not matter, because once the SD card or USB device is not found it should move on to the next entry. The issue is that it tries to load a partition which it sometimes can’t find and other times can.

Based on your description, it seems you are experiencing issues with the recognition of the SD card and SSD.

Check Connections: Please ensure that the FPC cable is correctly connected.
Observe LED Indicators: When connecting the SSD and powering on the Pironman 5, check the status LED (STA) and power LED (PWR). If the status LED is not blinking, double-check the orientation of the FPC cable.
Assembly Video: For further assistance, please refer to our assembly video: Watch the Video.
If the FPC connections are correct but the power LED does not light up and the status LED does not blink, try shorting the metal contacts at FORCE EN (J4) on the SSD adapter, then power on the Pironman 5 to see whether this forces recognition of the SSD.