|Summary:||5.12 Regression -- Null pointer exception on resume from S3|
|Product:||Drivers||Reporter:||Samuel Clark (slc2015)|
|Component:||I2C||Assignee:||Drivers/I2C virtual user (drivers-i2c)|
|Severity:||normal||CC:||jarkko.nikula, pmenzel+bugzilla.kernel.org, regressions, slc2015|
|Kernel Version:||5.12 and later||Tree:||Mainline|
Description Samuel Clark 2022-04-27 20:55:03 UTC
Running Manjaro on a Gigabyte B660M DS3H DDR4 with a custom 5.17 kernel. Confirmed on other distributions and on kernels back to 5.12; the issue is not present on the most recent 5.11 kernel.

The dmesg traceback points to the I2C DesignWare driver, specifically drivers/i2c/busses/i2c-designware-master.c:369. It seems the msgs struct passed in to i2c_dw_xfer_msg is null. A similar issue seems to be reported here: https://lore.kernel.org/lkml/YY5BRrE8bLyvd3PB@smile.fi.intel.com/t/

lspci output: https://pastebin.com/MwFM2VBJ
dmesg from crashed kernel: https://pastebin.com/t6GsHjkq
kernel config: https://pastebin.com/awrSve5u
Comment 1 Samuel Clark 2022-04-27 20:57:31 UTC
CPU info

$ lscpu
Architecture: x86_64
  CPU op-mode(s): 32-bit, 64-bit
  Address sizes: 39 bits physical, 48 bits virtual
  Byte Order: Little Endian
CPU(s): 12
  On-line CPU(s) list: 0-11
Vendor ID: GenuineIntel
  Model name: 12th Gen Intel(R) Core(TM) i5-12400
    CPU family: 6
    Model: 151
    Thread(s) per core: 2
    Core(s) per socket: 6
    Socket(s): 1
    Stepping: 5
    CPU(s) scaling MHz: 34%
    CPU max MHz: 5600.0000
    CPU min MHz: 800.0000
    BogoMIPS: 4993.00
    Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap clflushopt clwb intel_pt sha_ni xsaveopt xsavec xgetbv1 xsaves split_lock_detect avx_vnni dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp hwp_pkg_req umip pku ospke waitpkg gfni vaes vpclmulqdq rdpid movdiri movdir64b fsrm md_clear serialize arch_lbr flush_l1d arch_capabilities
Virtualization features:
  Virtualization: VT-x
Caches (sum of all):
  L1d: 288 KiB (6 instances)
  L1i: 192 KiB (6 instances)
  L2: 7.5 MiB (6 instances)
  L3: 18 MiB (1 instance)
NUMA:
  NUMA node(s): 1
  NUMA node0 CPU(s): 0-11
Vulnerabilities:
  Itlb multihit: Not affected
  L1tf: Not affected
  Mds: Not affected
  Meltdown: Not affected
  Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
  Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
  Spectre v2: Mitigation; Enhanced IBRS, IBPB conditional, RSB filling
  Srbds: Not affected
  Tsx async abort: Not affected
Comment 2 Samuel Clark 2022-04-27 21:21:36 UTC
Testing shows that the issue happens when /sys/power/pm_test is set to "platform" or lower.
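For anyone trying to reproduce this, the pm_test procedure comes down to two sysfs writes: select a test level, then request suspend. The sketch below is a minimal C helper under several assumptions: that the kernel exposes /sys/power/pm_test (which, as far as I know, needs CONFIG_PM_DEBUG), that "mem" is the state written to request S3 on this machine, and that it runs as root. The function names write_sysfs_value and run_suspend_test are made up for illustration.

```c
#include <stdio.h>

/*
 * Write a short string to a sysfs-style file.
 * Returns 0 on success, -1 if the file cannot be opened or written.
 */
static int write_sysfs_value(const char *path, const char *value)
{
    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;

    int ok = fputs(value, f) >= 0;

    /* sysfs reports many errors only when the write is flushed on close */
    if (fclose(f) != 0)
        ok = 0;

    return ok ? 0 : -1;
}

/*
 * Hypothetical helper: select a pm_test level, then request suspend.
 * The paths are parameters so the logic can be exercised on ordinary files.
 * Writing "mem" is an assumption about how S3 is requested on this system.
 */
static int run_suspend_test(const char *pm_test_path,
                            const char *state_path,
                            const char *level)
{
    if (write_sysfs_value(pm_test_path, level) != 0)
        return -1;
    return write_sysfs_value(state_path, "mem");
}
```

On a real system the call would look like run_suspend_test("/sys/power/pm_test", "/sys/power/state", "platform"), run as root. Writing "none" back to pm_test afterwards should restore normal suspend behavior.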
Comment 3 Jarkko Nikula 2022-05-06 08:31:40 UTC
Hi,

The reason it doesn't occur on v5.11 and earlier is that I2C DesignWare support for Alder Lake-S arrived in v5.12 with commit c7b79a752871 ("mfd: intel-lpss: Add Intel Alder Lake PCH-S PCI IDs").

Can you attach a dump of the ACPI tables here? The tool below is typically available in the acpica-tools, acpi-tools or a similar package. Please run it as root.

acpidump -o acpi.dump
Comment 4 Samuel Clark 2022-05-06 13:35:00 UTC
Created attachment 300896 [details] ACPI dump

Here is the ACPI dump for this machine.
Comment 5 The Linux kernel's regression tracker (Thorsten Leemhuis) 2022-06-20 08:59:33 UTC
(In reply to Jarkko Nikula from comment #3)
> Can you attach here the dump of ACPI tables?

Did you have a chance to look into them? Samuel provided them some time ago.
Comment 6 Samuel Clark 2022-06-21 15:17:17 UTC
Created attachment 301247 [details] attachment-12555-0.html

Thanks. A recent UEFI update for the board completely resolved this issue. Prior to that, disabling IOAPIC 24-119 options in the BIOS worked as a temporary fix.
Comment 7 Jarkko Nikula 2022-06-22 11:20:04 UTC
(In reply to The Linux kernel's regression tracker (Thorsten Leemhuis) from comment #5)
> (In reply to Jarkko Nikula from comment #3)
> > Can you attach here the dump of ACPI tables?
>
> Did you have a chance to look into them? Samuel provided them some time ago.

Ah, sorry, I forgot to reply. I didn't find anything obvious in the dumps and was sidetracked by other tasks.

Glad to hear the UEFI update fixes the issue. Unfortunately that doesn't help those who are unable to upgrade or unaware of the update, so a workaround is good to have. Fortunately, I recently got a machine with a Gigabyte motherboard on loan, and it shows the issue, so I'll have a chance to debug it next week.
Comment 8 Jarkko Nikula 2022-09-22 13:12:55 UTC
Created attachment 301846 [details] Fix for handling unexpected real interrupt
Comment 9 Jarkko Nikula 2022-09-22 13:18:01 UTC
Sorry for the long delay, but I finally figured out a fix for this issue.

email@example.com: I believe you are not able to verify the fix after the UEFI update, but is it OK that I added your "Reported-by" tag to the patch? If not, I will remove it before sending the patch upstream.
Comment 10 Samuel Clark 2022-09-25 01:13:45 UTC
Created attachment 301869 [details] attachment-27488-0.html

I'm not able to test but glad there's a fix. You can include the tag.
Comment 11 Jarkko Nikula 2022-10-12 06:31:33 UTC
This is now merged into git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git as commit 301c8f5c32c8 ("i2c: designware: Fix handling of real but unexpected device interrupts").
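For readers landing here later, the commit title together with the NULL msgs pointer from the description suggests the general shape of the problem: an interrupt that really comes from the controller, but arrives when the driver has no transfer in flight. The following is a simplified user-space sketch of that guard pattern in C. It is not the merged patch and not the driver's real code; every struct, field and function name in it (sketch_dev, sketch_isr and so on) is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for driver state; not the kernel's structures. */
struct sketch_msg {
    uint16_t addr;
    uint16_t len;
    uint8_t *buf;
};

struct sketch_dev {
    bool transfer_active;      /* set by the transfer path before it enables IRQs */
    struct sketch_msg *msgs;   /* NULL whenever no transfer is in flight */
    uint32_t intr_mask;        /* models the controller's interrupt mask register */
    unsigned int bytes_handled;
};

enum sketch_irq_result { SKETCH_IRQ_NONE, SKETCH_IRQ_HANDLED };

#define SKETCH_INTR_TX_EMPTY 0x10u

/*
 * Sketch of an interrupt handler that tolerates real but unexpected
 * interrupts, for example ones left enabled by firmware across resume.
 */
static enum sketch_irq_result sketch_isr(struct sketch_dev *dev, uint32_t stat)
{
    /* No status bits set: the interrupt was not raised by this device. */
    if (stat == 0)
        return SKETCH_IRQ_NONE;

    /*
     * The device did raise an interrupt, but the driver started no
     * transfer, so its state (including msgs) is unset or stale.
     * Mask further interrupts instead of dereferencing msgs.
     */
    if (!dev->transfer_active || dev->msgs == NULL) {
        dev->intr_mask = 0;
        return SKETCH_IRQ_HANDLED;
    }

    /* Normal path: msgs is only touched while a transfer is active. */
    if (stat & SKETCH_INTR_TX_EMPTY)
        dev->bytes_handled += dev->msgs->len;

    return SKETCH_IRQ_HANDLED;
}
```

The point of the sketch is the ordering: the check for an active transfer comes before any access to msgs, and the unexpected interrupt is acknowledged and masked rather than ignored, so it cannot keep firing.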
Comment 12 Paul Menzel 2022-10-12 13:51:27 UTC
Awesome work. Thank you all.

@Samuel, just for the record, can you please comment on which firmware version you used when it was not working, and which version fixed it?

PS: Also, when replying via email, please remove the quote/citation, as the Bugzilla web interface does not hide it.