Bug 219721

Summary: spd5118 11-0050: Failed to write b = 0: -6
Product: Power Management
Reporter: alexknoptech
Component: Other
Assignee: Rafael J. Wysocki (rjw)
Status: NEW
Severity: normal
CC: danmascandrade, roberto, skala.antonin, thiago.mogui, val.zapod.vz
Priority: P3
Hardware: Intel
OS: Linux
Kernel Version:
Subsystem:
Regression: No
Bisected commit-id:
Attachments: Strix - Dmesg & Info

Description alexknoptech 2025-01-25 00:17:53 UTC
Noticing these errors in recent kernel versions.

alex@fedora:~$ sudo inxi -SM
System:
  Host: fedora Kernel: 6.12.9-200.fc41.x86_64 arch: x86_64 bits: 64
  Console: pty pts/3 Distro: Fedora Linux 41 (Workstation Edition)
Machine:
  Type: Laptop System: Razer product: Blade 15 (2022) - RZ09-0421 v: 8.04
    serial: BY2222M73501760
  Mobo: Razer model: CH580 v: 4 serial: N/A UEFI: Razer v: 2.06
    date: 11/01/2023



[ 6979.278270] spd5118 11-0050: Failed to write b = 0: -6
[ 6979.278280] spd5118 11-0050: PM: dpm_run_callback(): spd5118_resume [spd5118] returns -6
[ 6979.278292] spd5118 11-0050: PM: failed to resume async: error -6
[ 6979.278559] spd5118 11-0052: Failed to write b = 0: -6
[ 6979.278569] spd5118 11-0052: PM: dpm_run_callback(): spd5118_resume [spd5118] returns -6
[ 6979.278581] spd5118 11-0052: PM: failed to resume async: error -6
[ 8291.455654] spd5118 11-0050: PM: dpm_run_callback(): spd5118_resume [spd5118] returns -6
[ 8291.455667] spd5118 11-0050: PM: failed to resume async: error -6
[ 8291.455954] spd5118 11-0052: PM: dpm_run_callback(): spd5118_resume [spd5118] returns -6
[ 8291.455966] spd5118 11-0052: PM: failed to resume async: error -6
[13891.273716] spd5118 11-0050: PM: dpm_run_callback(): spd5118_resume [spd5118] returns -6
[13891.273749] spd5118 11-0050: PM: failed to resume async: error -6
[13891.274165] spd5118 11-0052: PM: dpm_run_callback(): spd5118_resume [spd5118] returns -6
[13891.274187] spd5118 11-0052: PM: failed to resume async: error -6
[18379.491845] spd5118 11-0050: PM: dpm_run_callback(): spd5118_resume [spd5118] returns -6
[18379.491862] spd5118 11-0050: PM: failed to resume async: error -6
[18379.492109] spd5118 11-0052: PM: dpm_run_callback(): spd5118_resume [spd5118] returns -6
[18379.492136] spd5118 11-0052: PM: failed to resume async: error -6
[25062.995021] spd5118 11-0050: PM: dpm_run_callback(): spd5118_resume [spd5118] returns -6
[25062.995039] spd5118 11-0050: PM: failed to resume async: error -6
[25062.995313] spd5118 11-0052: PM: dpm_run_callback(): spd5118_resume [spd5118] returns -6
[25062.995327] spd5118 11-0052: PM: failed to resume async: error -6
[74408.948242] spd5118 11-0050: PM: dpm_run_callback(): spd5118_resume [spd5118] returns -6
[74408.948260] spd5118 11-0050: PM: failed to resume async: error -6
[74408.948740] spd5118 11-0052: PM: dpm_run_callback(): spd5118_resume [spd5118] returns -6
[74408.948754] spd5118 11-0052: PM: failed to resume async: error -6
[115286.672241] spd5118 11-0050: PM: dpm_run_callback(): spd5118_resume [spd5118] returns -6
[115286.672250] spd5118 11-0050: PM: failed to resume async: error -6
[115286.672624] spd5118 11-0052: PM: dpm_run_callback(): spd5118_resume [spd5118] returns -6
[115286.672637] spd5118 11-0052: PM: failed to resume async: error -6
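
For anyone triaging similar logs: the -6 here is a negated errno value, and errno 6 on Linux is ENXIO. On I2C/SMBus this usually indicates that the device did not acknowledge its address. Below is a minimal sketch for tallying failed resumes per device from saved dmesg text. The function name summarize_resume_failures and the regular expression are illustrative assumptions, not part of any kernel tool.

```python
import errno
import re
from collections import Counter

# Matches lines such as:
# [ 6979.278292] spd5118 11-0050: PM: failed to resume async: error -6
FAILED_RESUME = re.compile(
    r"spd5118 (?P<dev>\d+-[0-9a-f]{4}): PM: failed to resume async: "
    r"error (?P<err>-?\d+)"
)


def summarize_resume_failures(dmesg_text):
    """Count failed resumes per (device, errno name) in dmesg text."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        m = FAILED_RESUME.search(line)
        if m:
            code = abs(int(m.group("err")))
            # Fall back to the bare number if the errno is unknown here.
            name = errno.errorcode.get(code, str(code))
            counts[(m.group("dev"), name)] += 1
    return counts


sample = "\n".join([
    "[ 6979.278270] spd5118 11-0050: Failed to write b = 0: -6",
    "[ 6979.278292] spd5118 11-0050: PM: failed to resume async: error -6",
    "[ 6979.278581] spd5118 11-0052: PM: failed to resume async: error -6",
    "[ 8291.455667] spd5118 11-0050: PM: failed to resume async: error -6",
])

for (dev, name), n in sorted(summarize_resume_failures(sample).items()):
    print(dev, name, n)
```

Each "failed to resume async" line corresponds to one device failing in one resume cycle, so the per-device count approximates the number of affected resumes.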
Comment 1 Roberto Salomon 2025-02-16 13:30:08 UTC
Same messages observed with OpenSUSE Tumbleweed:
Linux treebeard 6.13.1-1-default #1 SMP PREEMPT_DYNAMIC Mon Feb  3 05:33:25 UTC 2025 (1918d13) x86_64 x86_64 x86_64 GNU/Linux


<From dmesg>
[   59.382718] [    T163] spd5118 11-0051: Failed to write b = 0: -6
[   59.382751] [    T163] spd5118 11-0051: PM: dpm_run_callback(): spd5118_resume [spd5118] returns -6
[   59.382769] [    T163] spd5118 11-0051: PM: failed to resume async: error -6

The system is functional after resume, but the messages indicate there may be something wrong with the spd5118 sensor.

# inxi -SMG
System:
  Host: treebeard Kernel: 6.13.1-1-default arch: x86_64 bits: 64
  Console: pty pts/1 Distro: openSUSE Tumbleweed 20250211
Machine:
  Type: Laptop System: LENOVO product: 21D6S16900 v: ThinkPad P16 Gen 1 serial: PF4LTYR0
  Mobo: LENOVO model: 21D6S16900 v: SDK0T76530 WIN serial: L1HF35J02K2 UEFI: LENOVO
    v: N3FET43W (1.28 ) date: 07/02/2024
Graphics:
  Device-1: Intel Alder Lake-HX GT1 [UHD Graphics 770] driver: i915 v: kernel
  Device-2: NVIDIA GA107GLM [RTX A2000 8GB Laptop GPU] driver: nvidia v: 570.86.16
  Device-3: Bison Integrated RGB Camera driver: uvcvideo type: USB
  Display: unspecified server: X.org v: 1.21.1.15 with: Xwayland v: 24.1.5 driver: X:
    loaded: modesetting,nvidia dri: iris gpu: i915,nvidia,nvidia-nvswitch tty: 163x50 resolution:
    1: 1920x1080 2: 2560x1600
  API: EGL v: 1.5 drivers: nvidia platforms: gbm,surfaceless,device
  API: OpenGL v: 4.6.0 vendor: nvidia v: 570.86.16 note: console (EGL sourced) renderer: NVIDIA
    RTX A2000 8GB Laptop GPU/PCIe/SSE2
  API: Vulkan v: 1.4.304 drivers: N/A surfaces: N/A
  Info: Tools: api: eglinfo, glxinfo, vulkaninfo de: kscreen-console,kscreen-doctor
    gpu: nvidia-settings wl: wayland-info x11: xdpyinfo, xprop, xrandr
Comment 2 Antonín Skala 2025-02-20 11:50:29 UTC
Created attachment 307689
Strix - Dmesg & Info

My Strix usually does not return to GNOME after waking from sleep. Only this dmesg output stays on the internal display.
Comment 3 dzero 2025-03-11 11:58:35 UTC
Same bug with Arch Linux:

My bug report:

Summary

The spd5118 driver fails to resume properly after system suspend, leading to error messages in dmesg, incomplete DDR5 temperature sensor readings, and, in some cases, complete system freezes upon resume, requiring a forced reboot.

Despite these failures, i2cdetect confirms that the driver still holds the I2C addresses (UU at 50 and 52), suggesting that the driver remains loaded but is in an invalid or non-functional state.

Steps to Reproduce

With the spd5118 module loaded, check the DDR5 temperature readings:

sensors | grep -A3 spd5118

spd5118-i2c-11-50
Adapter: SMBus I801 adapter at efa0
ERROR: Can't get value of subfeature temp1_max_alarm: Can't read
DDR5 DIMM Temp 1:  +54.2°C  (low  =  +0.0°C, high = +55.0°C)
                            (crit low =  +0.0°C, crit = +85.0°C)


List I2C devices before suspend:

sudo i2cdetect -y 11

     0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f
00:                         08 -- -- -- -- -- -- -- 
10: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- 
20: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- 
30: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- 
40: -- -- -- -- 44 -- -- -- 48 -- 4a -- -- -- -- -- 
50: UU -- UU -- -- -- -- -- -- -- -- -- -- -- -- -- 
60: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- 
70: -- -- -- -- -- -- -- --  


Put the system into suspend mode using:

systemctl suspend

After resuming from suspend, the system may freeze completely, requiring a manual restart. If the system does resume, the following error messages appear in dmesg:

spd5118 11-0050: Failed to write b = 0: -6
spd5118 11-0050: PM: dpm_run_callback(): spd5118_resume [spd5118] returns -6
spd5118 11-0050: PM: failed to resume async: error -6
spd5118 11-0052: Failed to write b = 0: -6
spd5118 11-0052: PM: dpm_run_callback(): spd5118_resume [spd5118] returns -6
spd5118 11-0052: PM: failed to resume async: error -6

Run i2cdetect again:

sudo i2cdetect -y 11

     0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f
00:                         08 -- -- -- -- -- -- -- 
10: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- 
20: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- 
30: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- 
40: -- -- -- -- 44 -- -- -- 48 -- 4a -- -- -- -- -- 
50: UU -- UU -- -- -- -- -- -- -- -- -- -- -- -- -- 
60: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- 
70: -- -- -- -- -- -- -- --  

Output remains the same (UU still appears at 50 and 52), but dmesg shows critical errors.
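
In i2cdetect output, UU means probing was skipped because a kernel driver has claimed the address, so an unchanged grid only shows that the driver is still bound, not that the device responds. A minimal sketch for comparing the two listings programmatically follows. It assumes the fixed-width layout that i2cdetect prints (a four-character row prefix, then three characters per cell); parse_i2cdetect and driver_bound are hypothetical helper names.

```python
import re

# Data rows start with "00: " through "70: "; the header row does not match.
ROW = re.compile(r"^([0-7])0: ")


def parse_i2cdetect(output):
    """Map bus address -> cell text ('UU', '--' or a hex address)."""
    cells = {}
    for line in output.splitlines():
        m = ROW.match(line)
        if not m:
            continue  # column header or blank line
        base = int(m.group(1)) * 16
        for col in range(16):
            cell = line[4 + 3 * col: 6 + 3 * col]
            if cell.strip():  # blank cells are addresses that were not scanned
                cells[base + col] = cell
    return cells


def driver_bound(output):
    """Addresses shown as UU, i.e. claimed by a kernel driver."""
    return sorted(a for a, c in parse_i2cdetect(output).items() if c == "UU")


sample = "\n".join([
    "     0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f",
    "00: " + "   " * 8 + "08 -- -- -- -- -- -- --",
    "40: -- -- -- -- 44 -- -- -- 48 -- 4a -- -- -- -- --",
    "50: UU -- UU -- -- -- -- -- -- -- -- -- -- -- -- --",
])

print([hex(a) for a in driver_bound(sample)])
```

Running driver_bound on the listings captured before and after suspend and comparing the two results confirms whether the driver kept both addresses.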

Expected Behavior

1 - The spd5118 driver should resume properly after suspend.
2 - Sensors should display complete DDR5 temperature readings without errors.
3 - The system should not freeze upon resume.

Workaround

The only way I found to avoid system freezes and sensor issues is to blacklist the spd5118 driver:

echo "blacklist spd5118" | sudo tee /etc/modprobe.d/spd5118_blacklist.conf
sudo mkinitcpio -P  # on Arch Linux; Debian/Ubuntu systems use: sudo update-initramfs -u
sudo reboot

However, this completely removes DDR5 temperature readings from sensors, making temperature monitoring impossible:

sensors | grep spd5118
(no output)
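
To double-check that the blacklist entry is actually in place, the modprobe.d files can be scanned for it. The following is a minimal sketch under the assumption that only plain "blacklist <module>" lines matter; it ignores other directives and does not normalise '-' versus '_' in module names. The helper names are hypothetical.

```python
from pathlib import Path


def blacklisted_modules(conf_text):
    """Return module names from `blacklist` lines in modprobe.d-style text."""
    modules = set()
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comment lines are ignored
        parts = line.split()
        if len(parts) == 2 and parts[0] == "blacklist":
            modules.add(parts[1])
    return modules


def blacklisted_on_system(conf_dir="/etc/modprobe.d"):
    """Collect blacklisted modules from every *.conf file in conf_dir."""
    found = set()
    root = Path(conf_dir)
    if not root.is_dir():
        return found
    for conf in sorted(root.glob("*.conf")):
        try:
            found |= blacklisted_modules(conf.read_text())
        except (OSError, UnicodeDecodeError):
            continue  # unreadable file: skip rather than fail
    return found


print("spd5118" in blacklisted_on_system())
```

Note that a blacklist entry only stops automatic loading through module aliases; an explicit modprobe spd5118 still loads the driver.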

System Information

Distro: Arch Linux
Kernel: 6.13.6-arch1-1
Desktop: KDE Plasma 6 (Wayland)
Hardware: Dell G15 5530
BIOS Version: 1.22.0
CPU: Intel Core i5-13450HX
GPU: Intel Raptor Lake UHD Graphics + NVIDIA RTX 3050
NVIDIA (OPEN) Driver: 570.124.04
RAM info:
          Size: 16 GB
          Manufacturer: SK Hynix
          Part Number: BC901
          Description: DDR5 SODIMM
          Size: 16 GiB
          Clock: 4800 MHz (JEDEC standard)
          XMP: not enabled (running at default JEDEC 4800MHz)
Other relevant messages from dmesg and journalctl include:

ahci 0000:00:0e.0: probe with driver ahci failed with error -12
pci 10000:e0:1a.0: can't derive routing for PCI INT A
nvme 10000:e1:00.0: PCI INT A: no GSI
NVRM: RmHandleDNotifierEvent: Failed to handle ACPI D-Notifier event, status=0x11
Comment 4 Daniel Mascarenhas 2025-03-11 13:47:22 UTC
I had exactly the same error here, on Ubuntu 24.04 with kernel version 6.11.0-19-generic.
As soon as I try to suspend my laptop, it does not suspend but briefly shows a screen with these error messages:
spd5118 1-0050: Failed to write b = 0: -6
spd5118 1-0050: PM: dpm_run_callback(): spd5118_resume [spd5118] returns -6
spd5118 1-0050: PM: failed to resume async: error -6

After that, it comes back to the login screen.

System:
  Host: avell Kernel: 6.11.0-19-generic arch: x86_64 bits: 64
  Desktop: GNOME v: 46.0 Distro: Ubuntu 24.04.2 LTS (Noble Numbat)
Machine:
  Type: Laptop System: Avell High Performance product: A70 HYB v: Standard
    serial: AVNB22320492
  Mobo: Avell High Performance model: Avell A70 HYB v: Standard
    serial: 10026 UEFI: American Megatrends LLC. v: N.1.10AVE04 date: 05/31/2022
Graphics:
  Device-1: Intel Alder Lake-P GT2 [Iris Xe Graphics] driver: i915 v: kernel
  Device-2: NVIDIA GA106M [GeForce RTX 3060 Mobile / Max-Q] driver: nvidia
    v: 550.144.03
  Device-3: Chicony HD Webcam driver: uvcvideo type: USB
  Display: server: X.Org v: 21.1.11 with: Xwayland v: 23.2.6 driver: X:
    loaded: modesetting,nvidia unloaded: fbdev,nouveau,vesa dri: iris
    gpu: i915,nvidia,nvidia-nvswitch resolution: 1: 2560x1080~60Hz
    2: 1920x1080~40Hz
  API: EGL v: 1.5 drivers: iris,nvidia,swrast
    platforms: gbm,x11,surfaceless,device
  API: OpenGL v: 4.6.0 compat-v: 4.5 vendor: intel mesa
    v: 24.2.8-1ubuntu1~24.04.1 renderer: Mesa Intel Graphics (ADL GT2)
Comment 5 val.zapod.vz 2025-04-09 15:33:22 UTC
Same bug; happens when resuming from S3 only. Does not happen on first boot.

6.13.9 kernel

[20663.347163] ACPI: PM: Waking up from system sleep state S3
[20663.401253] xhci_hcd 0000:00:0d.0: xHC error in resume, USBSTS 0x401, Reinit
[20663.401256] usb usb1: root hub lost power or was reset
[20663.401257] usb usb2: root hub lost power or was reset
[20663.401391] xhci_hcd 0000:80:14.0: xHC error in resume, USBSTS 0x411, Reinit
[20663.401393] usb usb5: root hub lost power or was reset
[20663.401394] usb usb6: root hub lost power or was reset
[20663.419067] spd5118 0-0051: PM: dpm_run_callback(): spd5118_resume [spd5118] returns -6
[20663.419073] spd5118 0-0051: PM: failed to resume async: error -6
[20663.419688] spd5118 0-0053: PM: dpm_run_callback(): spd5118_resume [spd5118] returns -6
[20663.419701] spd5118 0-0053: PM: failed to resume async: error -6
[20663.420184] i915 0000:00:02.0: [drm] GT0: GuC firmware i915/mtl_guc_70.bin version 70.36.0
[20663.422877] mei_me 0000:80:16.0: hbm: dma setup response: failure = 3 REJECTED


I do indeed have DDR5 memory and it has temperature sensors, which is what the driver is for.

Other reports of this issue: https://github.com/QubesOS/qubes-issues/issues/9720

The lm-sensors package still shows the temperature data, and the reading does change:

sensors

spd5118-i2c-0-51
Adapter: SMBus I801 adapter at 4000
temp1:        +32.0°C  (low  =  +0.0°C, high = +55.0°C)
                       (crit low =  +0.0°C, crit = +85.0°C)
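
The same reading can be taken without lm-sensors through the hwmon sysfs files, which is handy for checking right after a resume. This is a minimal sketch, assuming the driver registers its hwmon device under the name "spd5118" and exposes temp1_input in millidegrees Celsius as the hwmon sysfs convention specifies; read_spd5118_temps is a hypothetical helper.

```python
from pathlib import Path


def read_spd5118_temps(hwmon_root="/sys/class/hwmon"):
    """Return {hwmon dir name: temperature in deg C, or None if unreadable}."""
    temps = {}
    root = Path(hwmon_root)
    if not root.is_dir():
        return temps
    for hwmon in sorted(root.glob("hwmon*")):
        try:
            name = (hwmon / "name").read_text().strip()
        except OSError:
            continue
        if name != "spd5118":
            continue
        try:
            # temp1_input holds an integer number of millidegrees Celsius.
            raw = (hwmon / "temp1_input").read_text()
            temps[hwmon.name] = int(raw) / 1000.0
        except (OSError, ValueError):
            # A read error here is the sysfs-level form of "Can't read".
            temps[hwmon.name] = None
    return temps


print(read_spd5118_temps())
```

A None entry after resume, where a number was returned before suspend, would point at the sensor no longer answering on the bus.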

Also discussed here: https://lore.kernel.org/lkml/92a841f0-ab20-4243-9d95-54205790d616@roeck-us.net/T/