18 July 2024

Debugging Windows 11 crashes

Once or twice a week, I have been finding my system asleep when it should have been folding. After checking "Event Viewer" following each of these incidents, I noticed a trend: a Kernel-Power event preceded (but not immediately) by a volmgr one.

To view the events, open "Event Viewer", expand "Windows Logs", and click on "System".
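
The same events can also be pulled from a terminal with Windows' built-in wevtutil tool. Below is a minimal Python sketch that builds such a query; the function names are my own, and the event IDs are the ones from the example in this post. Note that it filters by event ID only, so events from other sources that share an ID could also show up.

```python
import subprocess

def build_wevtutil_query(event_ids, max_events=20):
    """Build a wevtutil command that lists recent System log events by ID."""
    # Produces an XPath filter such as: *[System[(EventID=41 or EventID=162)]]
    id_filter = " or ".join(f"EventID={int(i)}" for i in event_ids)
    xpath = f"*[System[({id_filter})]]"
    return [
        "wevtutil", "qe", "System",
        f"/q:{xpath}",
        f"/c:{max_events}",  # limit the number of events returned
        "/rd:true",          # read newest events first
        "/f:text",           # plain text output instead of XML
    ]

def recent_events(event_ids, max_events=20):
    """Run the query (Windows only) and return wevtutil's text output."""
    cmd = build_wevtutil_query(event_ids, max_events)
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout

# Example (Windows only): print(recent_events([41, 162, 4101]))
```

Building the command as a list avoids shell quoting problems with the brackets in the filter.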

Simplified Example:

Level     Date and Time        Source        Event ID  Task Category
Critical  5/7/2024 2:13:36 AM  Kernel-Power  41        (63)
    The system has rebooted without cleanly shutting down first. This error could be caused if the system stopped responding, crashed, or lost power unexpectedly.
Error     5/7/2024 2:13:36 AM  volmgr        162       None
    Dump file generation succeded.
Warning   5/7/2024 2:11:00 AM  Display       4101      None
    Display driver amduw23g stopped responding and has successfully recovered.
Warning   5/7/2024 2:06:07 AM  Display       4101      None
    Display driver amduw23g stopped responding and has successfully recovered.
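
To make the timing in the example above concrete, here is a small Python sketch that measures how long before the Kernel-Power event each display driver warning occurred. The rows are transcribed from the example; the function name and the 15-minute window are arbitrary choices of mine, and the timestamp format assumes month/day ordering.

```python
from datetime import datetime, timedelta

# Rows transcribed from the example above: (level, timestamp, source, event ID)
EVENTS = [
    ("Critical", "5/7/2024 2:13:36 AM", "Kernel-Power", 41),
    ("Error", "5/7/2024 2:13:36 AM", "volmgr", 162),
    ("Warning", "5/7/2024 2:11:00 AM", "Display", 4101),
    ("Warning", "5/7/2024 2:06:07 AM", "Display", 4101),
]

TIME_FORMAT = "%m/%d/%Y %I:%M:%S %p"  # assumption: month/day/year ordering

def parse_time(text):
    return datetime.strptime(text, TIME_FORMAT)

def driver_timeouts_before_crash(events, window=timedelta(minutes=15)):
    """Seconds between each Display 4101 warning and a later Kernel-Power 41 event."""
    crashes = [parse_time(t) for _, t, src, eid in events
               if src == "Kernel-Power" and eid == 41]
    gaps = []
    for _, t, src, eid in events:
        if src == "Display" and eid == 4101:
            when = parse_time(t)
            for crash in crashes:
                # Only count warnings that happened shortly before the crash
                if timedelta(0) <= crash - when <= window:
                    gaps.append(int((crash - when).total_seconds()))
    return sorted(gaps)

print(driver_timeouts_before_crash(EVENTS))  # [156, 449]
```

In this example the driver timed out roughly 2.5 and 7.5 minutes before the unclean reboot.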


Updating Drivers

As a first step to try and remedy this, I always update drivers. I updated the following:

  • chipset: 
    • 4.07.13.2243 -> 5.11.02.217 (from Asus)
    • 6.02.07.2300 (from AMD after above didn't fix it)
  • video: a version installed in 2024 -> 24.5.1


Updating the BIOS

I found some posts online that said it could be due to the motherboard, so I decided to try updating the BIOS to see if that would help. Unfortunately, it did not.

Before updating the BIOS/UEFI, make sure to write down your settings, as they will be reset.

BIOS/UEFI Settings

  • Expo -> Enabled
  • Fan Curves
    • CPU
      • 20C -> 20%
      • 60C -> 40%
      • 85C -> 70%
      • 90C -> 100%
    • System
      • 20C -> 30%
      • 50C -> 50%
      • 85C -> 80%
      • 90C -> 100%
  • Advanced Mode (F7)
    • Tool -> ASUS Armoury Crate
      •  Download & Install -> Disabled
    • AI Tweaker -> Precision Boost Override
      • Curve Optimizer
        • All Cores
        • Negative
        • 30
      • Precision Boost Override -> AMD Eco Mode
      • AMD Eco Mode -> cTDP 65W
    • AI Tweaker -> SOC Offset -> negative -> 0.03 (however, this caused the PC to hang on warm boots, so I use the settings below instead)
    • AI Tweaker -> CPU SOC Voltage -> Manual
      • VDD SOC Voltage Override -> 1.2
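
As a rough illustration of what the fan curves above mean in practice, here is a Python sketch that estimates the fan duty at a given temperature. It assumes the firmware draws straight lines between the configured points and clamps outside them; the actual firmware behavior may differ (for example, stepping or hysteresis), and the function name is my own.

```python
# Fan curve points from the settings above: (temperature in C, fan duty in %)
CPU_CURVE = [(20, 20), (60, 40), (85, 70), (90, 100)]
SYSTEM_CURVE = [(20, 30), (50, 50), (85, 80), (90, 100)]

def fan_duty(curve, temp_c):
    """Estimate fan duty at temp_c, assuming linear interpolation between points."""
    # Clamp below the first point and above the last point
    if temp_c <= curve[0][0]:
        return float(curve[0][1])
    if temp_c >= curve[-1][0]:
        return float(curve[-1][1])
    # Find the segment containing temp_c and interpolate along it
    for (t0, d0), (t1, d1) in zip(curve, curve[1:]):
        if t0 <= temp_c <= t1:
            return d0 + (d1 - d0) * (temp_c - t0) / (t1 - t0)

print(fan_duty(CPU_CURVE, 70))  # 52.0
```

Under that assumption, a CPU at 70C would run its fan at about 52%, while the system fans would be at roughly 67%.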

Update the BIOS

After updating the BIOS, you will need to reapply the settings.


Old video drivers

Since none of the updates worked, I tried rolling the video drivers back from 24.5.1 to 23.10.2, an older version that I knew worked. Luckily, this seems to have done the trick and my system is stable again with >27 days of uptime.


Slow folding

Unfortunately, it looks like my solution may be short-lived: to get the expected folding performance, I need to update to 24.6.1 (see https://foldingforum.org/viewtopic.php?t=41637&sid=d9b1c6f33f52801aaca8c31fc3fe52d1). So fingers crossed that this release is also stable.

Update 2024-07-29: I had a freeze after roughly 11 days of uptime.

Update 2024-07-31: My PC crashed overnight after only 1.5 days of uptime.


24.7.1

Update 2024-07-31: I updated the graphics drivers to 24.7.1. Here is the process that I used:

  • Downloaded the AMD driver cleanup utility (from here)
  • Downloaded the full updated AMD graphics driver (for 5600 XT)
  • Disabled Ethernet/Wi-Fi
    • I do this so that Windows doesn't try to "helpfully" download and install other graphics driver versions
  • Ran the AMD driver cleanup utility
    • Had it reboot me into Safe-Mode
    • It removed the drivers and prompted me to reboot, which I did
  • Installed the new drivers (had to approve the big red scary box because of no internet)
  • Rebooted
  • Enabled Ethernet/Wi-Fi
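
The Ethernet/Wi-Fi steps above can also be scripted with Windows' built-in netsh tool rather than done through the Settings UI. Here is a minimal Python sketch; the adapter names "Ethernet" and "Wi-Fi" are assumptions (check yours with `netsh interface show interface`), the function names are my own, and the commands need to run from an elevated (administrator) prompt.

```python
import subprocess

# Assumed adapter names; list yours with: netsh interface show interface
ADAPTERS = ["Ethernet", "Wi-Fi"]

def netsh_command(adapter, enabled):
    """Build the netsh command that enables or disables one network adapter."""
    state = "enabled" if enabled else "disabled"
    return ["netsh", "interface", "set", "interface", adapter, f"admin={state}"]

def set_adapters(enabled, adapters=ADAPTERS):
    """Enable or disable every listed adapter (Windows only, needs admin rights)."""
    for adapter in adapters:
        subprocess.run(netsh_command(adapter, enabled), check=True)

# Before the driver cleanup:   set_adapters(False)
# After the final reboot:      set_adapters(True)
```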


Update 2024-08-02: My computer has crashed twice in the past 2 days, so I decided to try DDU:

  • Downloaded Display Driver Uninstaller (aka DDU)
    • I downloaded from Guru3D
  • Disabled Ethernet/Wi-Fi
  • Rebooted into Safe Mode
  • Extracted the EXE from the ZIP file
  • Ran the EXE, which extracts even more files
  • Ran "Display Driver Uninstaller.exe"
  • Unchecked the final option (Disable Windows Update Driver Downloads)
    • Since we already disabled internet access, this is unnecessary, and DDU recommends re-enabling it afterwards anyway, so this saves a step
  • Closed the options
  • Selected Device Type -> GPU
  • Selected Device -> AMD
  • Clicked "Clean and restart"
  • Waited for it to do its thing and reboot
  • Installed the 24.7.1 drivers again
  • Rebooted
  • Enabled Ethernet/Wi-Fi

 

