Bug 220083

Summary: [REGRESSION, BISECTED] x86 ASM changes make dispatch_hid_bpf_output_report access not-present page
Product: Platform Specific/Hardware
Reporter: Rong Zhang (i)
Component: x86-64
Assignee: platform_x86_64 (platform_x86_64)
Status: NEW
Severity: high
CC: bp
Priority: P3
Hardware: All
OS: Linux
Kernel Version:
Subsystem:
Regression: No
Bisected commit-id:
Attachments:
  Full demsg
  6.15-rc5_per-cpu-pf_bpftrace
  6.15-rc5_per-cpu-pf_bptrace_dmesg_decoded

Description Rong Zhang 2025-05-03 18:40:41 UTC
After upgrading from 6.14.x to 6.15-rc3, a not-present page fault (#PF) occurs each time I unplug any of my Logitech Unifying receivers.

Upgrading to 6.15-rc4 did not fix the issue.

dmesg:
```
[   48.726588] usb 7-1.4: USB disconnect, device number 7
[   48.856531] BUG: unable to handle page fault for address: ffff8a510ee72018
[   48.856543] #PF: supervisor write access in kernel mode
[   48.856547] #PF: error_code(0x0002) - not-present page
[   48.856550] PGD 365c01067 P4D 365c01067 PUD 0
[   48.856558] Oops: Oops: 0002 [#1] SMP NOPTI
[   48.856566] CPU: 0 UID: 0 PID: 7237 Comm: kworker/0:3 Tainted: G     U              6.15.0-rc4 #1 PREEMPT(lazy)  b3a8ad1950c71c15317e5ea614db6e274ecb0913
[   48.856574] Tainted: [U]=USER
[   48.856577] Hardware name: LENOVO 21Q4/LNVNB161216, BIOS PXCN24WW 03/11/2025
[   48.856579] Workqueue: events hidinput_led_worker
[   48.856589] RIP: 0010:__srcu_read_unlock+0x1a/0x30
[   48.856595] Code: c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 0f 1f 44 00 00 f0 83 44 24 fc 00 48 63 f6 48 c1 e6 04 48 03 77 08 <65> 48 ff 46 08 c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 90
[   48.856598] RSP: 0018:ffffd037cc29fd88 EFLAGS: 00010202
[   48.856602] RAX: 0000000000000000 RBX: ffff8a4c6b16fe08 RCX: 0000000000000000
[   48.856604] RDX: 0000000000000002 RSI: 0000000000000010 RDI: ffff8a4c6b16fe38
[   48.856606] RBP: ffffd037cc29fdf8 R08: 0000000000000000 R09: 00000000fffffffd
[   48.856607] R10: 0000000000000001 R11: 00000000ffffffff R12: 0000000000000000
[   48.856609] R13: ffff8a4ac182dbc0 R14: 0000000000000001 R15: 0000000000000000
[   48.856611] FS:  0000000000000000(0000) GS:ffff8a510ee72000(0000) knlGS:0000000000000000
[   48.856613] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   48.856614] CR2: ffff8a510ee72018 CR3: 0000000364c24000 CR4: 0000000000f50ef0
[   48.856617] PKRU: 55555554
[   48.856618] Call Trace:
[   48.856621]  <TASK>
[   48.856623]  dispatch_hid_bpf_output_report+0xc5/0x100
[   48.856631]  hid_hw_output_report+0x46/0x90
[   48.856635]  hidinput_led_worker+0xa9/0xe0
[   48.856640]  process_one_work+0x18f/0x350
[   48.856646]  worker_thread+0x2d3/0x400
[   48.856650]  ? rescuer_thread+0x550/0x550
[   48.856654]  kthread+0xf9/0x240
[   48.856657]  ? kthreads_online_cpu+0x120/0x120
[   48.856661]  ret_from_fork+0x31/0x50
[   48.856665]  ? kthreads_online_cpu+0x120/0x120
[   48.856668]  ret_from_fork_asm+0x11/0x20
[   48.856674]  </TASK>
[   48.856675] Modules linked in: xt_mark tcp_diag inet_diag snd_hrtimer snd_seq_dummy snd_seq_midi snd_seq_oss snd_seq_midi_event snd_seq uhid rfcomm cmac algif_hash algif_skcipher af_alg xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 tun snd_usb_audio snd_usbmidi_lib snd_ump snd_rawmidi snd_seq_device bridge stp llc nf_tables qrtr bnep overlay sunrpc vfat fat uvcvideo videobuf2_vmalloc uvc videobuf2_memops videobuf2_v4l2 videobuf2_common btusb videodev btrtl btintel mc btbcm btmtk bluetooth amd_atl intel_rapl_msr intel_rapl_common snd_acp_legacy_mach snd_acp_mach snd_soc_nau8821 snd_acp3x_rn snd_acp70 snd_acp_i2s snd_acp_pdm snd_soc_dmic snd_acp_pcm snd_sof_amd_acp70 snd_sof_amd_acp63 snd_sof_amd_vangogh snd_sof_amd_rembrandt snd_sof_amd_renoir snd_sof_amd_acp snd_sof_pci snd_sof_xtensa_dsp snd_sof snd_sof_utils snd_pci_ps snd_soc_acpi_amd_match snd_amd_sdw_acpi soundwire_amd snd_hda_codec_realtek
[   48.856732]  soundwire_generic_allocation soundwire_bus snd_hda_codec_generic snd_soc_sdca snd_hda_scodec_component snd_hda_codec_hdmi snd_soc_core mt7925e snd_compress mt7925_common snd_hda_intel ac97_bus mt792x_lib snd_intel_dspcfg snd_pcm_dmaengine mt76_connac_lib snd_intel_sdw_acpi snd_rpl_pci_acp6x kvm_amd mt76 snd_hda_codec snd_acp_pci think_lmi snd_amd_acpi_mach kvm snd_hda_core snd_acp_legacy_common snd_pci_acp6x snd_hwdep mac80211 snd_pcm_oss snd_mixer_oss irqbypass snd_pci_acp5x snd_ctl_led snd_pcm libarc4 rapl pcspkr firmware_attributes_class snd_timer lenovo_wmi_hotkey_utilities snd_rn_pci_acp3x wmi_bmof cfg80211 snd_acp_config snd snd_soc_acpi k10temp hid_sensor_als spd5118 amdxdna amd_pmf snd_pci_acp3x rfkill soundcore hid_sensor_trigger industrialio_triggered_buffer amdtee kfifo_buf joydev hid_sensor_iio_common ccp industrialio amd_pmc platform_profile mousedev mac_hid sch_fq_codel uinput i2c_dev parport_pc ppdev lp parport nvme_fabrics nfnetlink ip_tables x_tables dm_crypt encrypted_keys trusted
[   48.856786]  asn1_encoder tee dm_mod raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid1 raid0 linear md_mod igc ptp pps_core uas usb_storage hid_logitech_hidpp r8153_ecm cdc_ether usbnet hid_logitech_dj r8152 mii usbhid amdgpu i2c_algo_bit drm_ttm_helper ttm drm_panel_backlight_quirks polyval_clmulni polyval_generic drm_exec ghash_clmulni_intel drm_suballoc_helper amdxcp sha512_ssse3 sdhci_pci drm_buddy sha256_ssse3 thunderbolt hid_sensor_custom r8169 sha1_ssse3 serio_raw sp5100_tco sdhci_uhs2 gpu_sched nvme sdhci hid_multitouch realtek hid_sensor_hub aesni_intel atkbd ucsi_acpi drm_display_helper hid_generic nvme_core cqhci crypto_simd mdio_devres libps2 video typec_ucsi i2c_piix4 vivaldi_fmap cryptd nvme_keyring typec libphy mmc_core i2c_smbus i8042 cec i2c_hid_acpi amd_sfh nvme_auth roles wmi serio i2c_hid
[   48.856843] CR2: ffff8a510ee72018
[   48.856846] ---[ end trace 0000000000000000 ]---
[   50.304586] RIP: 0010:__srcu_read_unlock+0x1a/0x30
[   50.304601] Code: c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 0f 1f 44 00 00 f0 83 44 24 fc 00 48 63 f6 48 c1 e6 04 48 03 77 08 <65> 48 ff 46 08 c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 90
[   50.304603] RSP: 0018:ffffd037cc29fd88 EFLAGS: 00010202
[   50.304606] RAX: 0000000000000000 RBX: ffff8a4c6b16fe08 RCX: 0000000000000000
[   50.304607] RDX: 0000000000000002 RSI: 0000000000000010 RDI: ffff8a4c6b16fe38
[   50.304608] RBP: ffffd037cc29fdf8 R08: 0000000000000000 R09: 00000000fffffffd
[   50.304609] R10: 0000000000000001 R11: 00000000ffffffff R12: 0000000000000000
[   50.304610] R13: ffff8a4ac182dbc0 R14: 0000000000000001 R15: 0000000000000000
[   50.304611] FS:  0000000000000000(0000) GS:ffff8a510ee72000(0000) knlGS:0000000000000000
[   50.304612] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   50.304613] CR2: ffff8a510ee72018 CR3: 0000000121904000 CR4: 0000000000f50ef0
[   50.304615] PKRU: 55555554
[   50.304616] note: kworker/0:3[7237] exited with irqs disabled
```
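For readers unfamiliar with the `#PF: error_code(0x0002)` line in the log above, here is a minimal sketch of how the low three bits of the x86 page-fault error code map to the wording in the oops. The function name `decode_pf_error_code` and the table `PF_BITS` are made up for this illustration and are not kernel APIs; only the three lowest bits are handled.

```python
# Low bits of the x86 page-fault error code:
#   bit 0 (P):   0 = page not present, 1 = protection violation
#   bit 1 (W/R): 0 = read access,      1 = write access
#   bit 2 (U/S): 0 = supervisor mode,  1 = user mode
# PF_BITS and decode_pf_error_code are illustrative names, not kernel APIs.
PF_BITS = {
    0: ("not-present page", "protection violation"),
    1: ("read access", "write access"),
    2: ("supervisor mode", "user mode"),
}


def decode_pf_error_code(code: int) -> list[str]:
    """Describe the low three bits of a page-fault error code."""
    return [names[(code >> bit) & 1] for bit, names in PF_BITS.items()]


# Error code taken from the dmesg above.
print(decode_pf_error_code(0x0002))
# → ['not-present page', 'write access', 'supervisor mode']
```

This agrees with the two `#PF:` lines in the log: a supervisor write access to a not-present page.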

Bisect log:

```
# good: [38fec10eb60d687e30c8c6b5420d86e8149f7557] Linux 6.14
git bisect good 38fec10eb60d687e30c8c6b5420d86e8149f7557
# bad: [9c32cda43eb78f78c73aee4aa344b777714e259b] Linux 6.15-rc3
git bisect bad 9c32cda43eb78f78c73aee4aa344b777714e259b
# bad: [4a4b30ea80d8cb5e8c4c62bb86201f4ea0d9b030] Merge tag 'bcachefs-2025-03-24' of git://evilpiepirate.org/bcachefs
git bisect bad 4a4b30ea80d8cb5e8c4c62bb86201f4ea0d9b030
# bad: [1e1ba8d23dae91e6a9cfeb1236b02749e8a49ab3] Merge tag 'timers-clocksource-2025-03-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect bad 1e1ba8d23dae91e6a9cfeb1236b02749e8a49ab3
# skip: [21e0ff5b10ec1b61fda435d42db4ba80d0cdfded] Merge tag 'acpi-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
git bisect skip 21e0ff5b10ec1b61fda435d42db4ba80d0cdfded
# good: [47c4f9b1722fd883c9745d7877cb212e41dd2715] Tidy up ASoC control get and put handlers
git bisect good 47c4f9b1722fd883c9745d7877cb212e41dd2715
# bad: [2899aa3973efa3b0a7005cb7fb60475ea0c3b8a0] Merge tag 'x86_cache_for_v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect bad 2899aa3973efa3b0a7005cb7fb60475ea0c3b8a0
# good: [5a658afd468b0fb55bf5f45c9788ee8dc87ba463] Merge tag 'objtool-core-2025-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect good 5a658afd468b0fb55bf5f45c9788ee8dc87ba463
# bad: [a49a879f0ac19ed0a562e220019741857b261551] Merge tag 'x86-cleanups-2025-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect bad a49a879f0ac19ed0a562e220019741857b261551
# bad: [9a93e29f16bbba90a63faad0abbc6dea3b2f0c63] x86/syscall: Move sys_ni_syscall()
git bisect bad 9a93e29f16bbba90a63faad0abbc6dea3b2f0c63
# bad: [cfdaa618defc5ebe1ee6aa5bd40a7ccedffca6de] Merge branch 'x86/cpu' into x86/asm, to pick up dependent commits
git bisect bad cfdaa618defc5ebe1ee6aa5bd40a7ccedffca6de
# good: [c4a8b7116b9927f7b00bd68140e285662a03068e] perf/x86/intel: Use cache cpu-type for hybrid PMU selection
git bisect good c4a8b7116b9927f7b00bd68140e285662a03068e
# good: [4f2a0b765c9731d2fa94e209ee9ae0e96b280f17] <linux/sizes.h>: Cover all possible x86 CPU cache sizes
git bisect good 4f2a0b765c9731d2fa94e209ee9ae0e96b280f17
# bad: [95b0916118106054e1f3d5d7f8628ef3dc0b3c02] percpu: Remove PER_CPU_FIRST_SECTION
git bisect bad 95b0916118106054e1f3d5d7f8628ef3dc0b3c02
# skip: [78c4374ef8b842c6abf195d6f963853c7ec464d2] x86/module: Deal with GOT based stack cookie load on Clang < 17
git bisect skip 78c4374ef8b842c6abf195d6f963853c7ec464d2
# bad: [b5c4f95351a097a635c1a7fc8d9efa18308491b5] x86/percpu/64: Remove fixed_percpu_data
git bisect bad b5c4f95351a097a635c1a7fc8d9efa18308491b5
# skip: [cb7927fda002ca49ae62e2782c1692acc7b80c67] x86/relocs: Handle R_X86_64_REX_GOTPCRELX relocations
git bisect skip cb7927fda002ca49ae62e2782c1692acc7b80c67
# skip: [80d47defddc000271502057ebd7efa4fd6481542] x86/stackprotector/64: Convert to normal per-CPU variable
git bisect skip 80d47defddc000271502057ebd7efa4fd6481542
# skip: [f58b63857ae38b4484185b799a2759274b930c92] x86/pvh: Use fixed_percpu_data for early boot GSBASE
git bisect skip f58b63857ae38b4484185b799a2759274b930c92
# good: [0ee2689b9374d6fd5f43b703713a532278654749] x86/stackprotector: Remove stack protector test scripts
git bisect good 0ee2689b9374d6fd5f43b703713a532278654749
# bad: [9d7de2aa8b41407bc96d89a80dc1fd637d389d42] x86/percpu/64: Use relative percpu offsets
git bisect bad 9d7de2aa8b41407bc96d89a80dc1fd637d389d42
# good: [a9a76b38aaf577887103e3ebb41d70e6aa5a4b19] x86/boot: Disable stack protector for early boot code
git bisect good a9a76b38aaf577887103e3ebb41d70e6aa5a4b19
# only skipped commits left to test
# possible first bad commit: [9d7de2aa8b41407bc96d89a80dc1fd637d389d42] x86/percpu/64: Use relative percpu offsets
# possible first bad commit: [80d47defddc000271502057ebd7efa4fd6481542] x86/stackprotector/64: Convert to normal per-CPU variable
# possible first bad commit: [78c4374ef8b842c6abf195d6f963853c7ec464d2] x86/module: Deal with GOT based stack cookie load on Clang < 17
# possible first bad commit: [cb7927fda002ca49ae62e2782c1692acc7b80c67] x86/relocs: Handle R_X86_64_REX_GOTPCRELX relocations
# possible first bad commit: [f58b63857ae38b4484185b799a2759274b930c92] x86/pvh: Use fixed_percpu_data for early boot GSBASE
```
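The tail of the log above lists the commits that bisect could not tell apart because some of them were skipped. As a rough sketch, the candidates can be pulled out of a saved bisect log like this; the helper name `first_bad_candidates` is hypothetical, and the sample lines are copied from the log above.

```python
import re

# Matches the "# possible first bad commit" lines that git bisect writes
# to its log when only skipped commits are left to test.
CANDIDATE = re.compile(r"^# possible first bad commit: \[([0-9a-f]{40})\] (.+)$")


def first_bad_candidates(log: str) -> list[tuple[str, str]]:
    """Return (abbreviated hash, subject) pairs for every candidate commit."""
    candidates = []
    for line in log.splitlines():
        match = CANDIDATE.match(line.strip())
        if match:
            candidates.append((match.group(1)[:12], match.group(2)))
    return candidates


# Two lines copied from the bisect log above.
sample = """\
# only skipped commits left to test
# possible first bad commit: [9d7de2aa8b41407bc96d89a80dc1fd637d389d42] x86/percpu/64: Use relative percpu offsets
# possible first bad commit: [80d47defddc000271502057ebd7efa4fd6481542] x86/stackprotector/64: Convert to normal per-CPU variable
"""

for short_hash, subject in first_bad_candidates(sample):
    print(short_hash, subject)
```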

There is a typo in commit f58b63857ae3 ("x86/pvh: Use fixed_percpu_data for early boot GSBASE") that results in a compilation failure.
With the patch below applied, I bisected again:

```
diff --git a/arch/x86/platform/pvh/head.S b/arch/x86/platform/pvh/head.S
index 723f181b222a..f1a8392a4835 100644
--- a/arch/x86/platform/pvh/head.S
+++ b/arch/x86/platform/pvh/head.S
@@ -180,7 +180,7 @@ SYM_CODE_START(pvh_start_xen)
         */
        movl $MSR_GS_BASE,%ecx
        leaq INIT_PER_CPU_VAR(fixed_percpu_data)(%rip), %rdx
-       movq %edx, %eax
+       movl %edx, %eax
        shrq $32, %rdx
        wrmsr
```
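To spell out what the patched sequence does: `wrmsr` takes the MSR index in ECX and the 64-bit value split across EDX:EAX, so the code copies the low half into EAX and shifts the high half down into EDX. The `movq` form mixes a 64-bit suffix with 32-bit registers, which the assembler rejects. A minimal sketch of the split follows; the function name `split_for_wrmsr` and the sample address are hypothetical and chosen only for illustration.

```python
def split_for_wrmsr(value: int) -> tuple[int, int]:
    """Split a 64-bit value into the (EAX, EDX) halves that WRMSR consumes."""
    value &= 0xFFFF_FFFF_FFFF_FFFF
    eax = value & 0xFFFF_FFFF  # movl %edx, %eax  (low 32 bits)
    edx = value >> 32          # shrq $32, %rdx   (high 32 bits)
    return eax, edx


# Hypothetical address, not taken from the report.
eax, edx = split_for_wrmsr(0xFFFF_FFFF_8200_0000)
print(hex(eax), hex(edx))
# → 0x82000000 0xffffffff
```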

New bisect log:

```
[...]
# good: [a9a76b38aaf577887103e3ebb41d70e6aa5a4b19] x86/boot: Disable stack protector for early boot code
git bisect good a9a76b38aaf577887103e3ebb41d70e6aa5a4b19
# good: [78c4374ef8b842c6abf195d6f963853c7ec464d2] x86/module: Deal with GOT based stack cookie load on Clang < 17
git bisect good 78c4374ef8b842c6abf195d6f963853c7ec464d2
# good: [80d47defddc000271502057ebd7efa4fd6481542] x86/stackprotector/64: Convert to normal per-CPU variable
git bisect good 80d47defddc000271502057ebd7efa4fd6481542
# first bad commit: [9d7de2aa8b41407bc96d89a80dc1fd637d389d42] x86/percpu/64: Use relative percpu offsets
```

The bad commit 9d7de2aa8b41 ("x86/percpu/64: Use relative percpu offsets") first appeared in v6.15-rc1.

I got the dmesg below by building and booting the bad commit, then unplugging the receiver:

```
[  560.223095] BUG: unable to handle page fault for address: ffff9acf2b889008
[  560.223174] #PF: supervisor write access in kernel mode
[  560.223299] #PF: error_code(0x0002) - not-present page
[  560.223332] PGD 43e401067 P4D 43e401067 PUD 0
[  560.223353] Oops: Oops: 0002 [#1] PREEMPT SMP NOPTI
[  560.223359] CPU: 0 UID: 0 PID: 8212 Comm: kworker/0:3 Tainted: G     U             6.14.0-rc3+ #1 ab962f3b7921227b62db2503d8ec7411fa694628
[  560.223364] Tainted: [U]=USER
[  560.223369] Hardware name: LENOVO 21Q4/LNVNB161216, BIOS PXCN24WW 03/11/2025
[  560.223378] Workqueue: events hidinput_led_worker
[  560.223382] RIP: 0010:__srcu_read_lock+0x14/0x30
[  560.223387] Code: 0f 0b eb bc 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 0f 1f 44 00 00 8b 07 48 8b 57 08 83 e0 01 89 c1 <65> 48 ff 04 ca f0 83 44 24 fc 00 c3 cc cc cc cc 66 66 2e 0f 1f 84
[  560.223392] RSP: 0018:ffffb7df8d24fd88 EFLAGS: 00010202
[  560.223396] RAX: 0000000000000001 RBX: ffff9ac82f80de08 RCX: 0000000000000001
[  560.223401] RDX: 0000000000000000 RSI: ffff9ac8fd276f40 RDI: ffff9ac82f80de38
[  560.223407] RBP: ffffb7df8d24fdf8 R08: 0000000000000000 R09: 00000000fffffffd
[  560.223412] R10: 0000000000000001 R11: 00000000ffffffff R12: 0000000000000000
[  560.223417] R13: ffff9ac8fd276f40 R14: 000000000000000e R15: 0000000000000000
[  560.223421] FS:  0000000000000000(0000) GS:ffff9acf2b889000(0000) knlGS:0000000000000000
[  560.223426] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  560.223430] CR2: ffff9acf2b889008 CR3: 00000001e1c40000 CR4: 0000000000f50ef0
[  560.223434] PKRU: 55555554
[  560.223439] Call Trace:
[  560.223444]  <TASK>
[  560.223449]  ? __die_body.cold+0x19/0x29
[  560.223453]  ? page_fault_oops+0x15a/0x2e0
[  560.223458]  ? search_bpf_extables+0x5f/0x80
[  560.223462]  ? exc_page_fault+0x1a3/0x1b0
[  560.223466]  ? asm_exc_page_fault+0x26/0x30
[  560.223471]  ? __srcu_read_lock+0x14/0x30
[  560.223475]  ? psi_task_switch+0xb7/0x200
[  560.223480]  dispatch_hid_bpf_output_report+0x73/0x100
[  560.223485]  hid_hw_output_report+0x46/0x90
[  560.223490]  hidinput_led_worker+0xa9/0xe0
[  560.223494]  process_one_work+0x17b/0x330
[  560.223498]  worker_thread+0x2ce/0x3f0
[  560.223503]  ? rescuer_thread+0x530/0x530
[  560.223507]  kthread+0xeb/0x230
[  560.223512]  ? kthreads_online_cpu+0x120/0x120
[  560.223516]  ret_from_fork+0x31/0x50
[  560.223522]  ? kthreads_online_cpu+0x120/0x120
[  560.223528]  ret_from_fork_asm+0x11/0x20
[  560.223532]  </TASK>
[  560.223538] Modules linked in: tcp_diag inet_diag xt_mark snd_hrtimer snd_seq_dummy snd_seq_midi snd_seq_oss snd_seq_midi_event snd_seq uhid rfcomm cmac algif_hash algif_skcipher af_alg xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 tun bridge stp llc nf_tables snd_usb_audio snd_usbmidi_lib snd_ump snd_rawmidi snd_seq_device qrtr bnep overlay sunrpc vfat fat uvcvideo videobuf2_vmalloc uvc videobuf2_memops btusb videobuf2_v4l2 btrtl videobuf2_common btintel btbcm videodev btmtk mc bluetooth snd_acp_legacy_mach snd_acp_mach snd_soc_nau8821 snd_acp3x_rn snd_acp70 snd_acp_i2s snd_acp_pdm snd_soc_dmic snd_acp_pcm snd_sof_amd_acp70 snd_sof_amd_acp63 snd_sof_amd_vangogh snd_sof_amd_rembrandt snd_sof_amd_renoir snd_sof_amd_acp intel_rapl_msr amd_atl snd_sof_pci intel_rapl_common snd_sof_xtensa_dsp snd_sof snd_sof_utils snd_pci_ps snd_soc_acpi_amd_match snd_amd_sdw_acpi soundwire_amd soundwire_generic_allocation snd_ctl_led
[  560.223612]  soundwire_bus snd_soc_sdca snd_hda_codec_realtek snd_hda_codec_generic snd_soc_core mt7925e snd_hda_scodec_component mt7925_common snd_compress mt792x_lib snd_hda_codec_hdmi ac97_bus snd_hda_intel mt76_connac_lib snd_pcm_dmaengine snd_intel_dspcfg mt76 snd_rpl_pci_acp6x snd_intel_sdw_acpi snd_hda_codec kvm_amd snd_acp_pci think_lmi snd_hda_core snd_acp_legacy_common mac80211 kvm snd_pci_acp6x snd_hwdep snd_pcm_oss snd_mixer_oss snd_pci_acp5x libarc4 amd_pmf rapl pcspkr firmware_attributes_class wmi_bmof hid_sensor_als amdtee snd_pcm hid_sensor_trigger snd_rn_pci_acp3x cfg80211 industrialio_triggered_buffer snd_timer joydev snd_acp_config kfifo_buf spd5118 snd snd_soc_acpi hid_sensor_iio_common ccp soundcore snd_pci_acp3x rfkill platform_profile amdxdna k10temp industrialio amd_pmc mousedev mac_hid sch_fq_codel uinput i2c_dev parport_pc ppdev lp parport nvme_fabrics nvme_keyring nfnetlink ip_tables x_tables dm_crypt encrypted_keys trusted asn1_encoder tee dm_mod raid10 raid456 async_raid6_recov
[  560.223631]  async_memcpy async_pq async_xor async_tx raid1 raid0 linear md_mod igc ptp pps_core uas usb_storage hid_logitech_hidpp r8153_ecm cdc_ether usbnet hid_logitech_dj r8152 mii usbhid amdgpu i2c_algo_bit drm_ttm_helper ttm drm_panel_backlight_quirks polyval_clmulni drm_exec polyval_generic ghash_clmulni_intel drm_suballoc_helper sha512_ssse3 amdxcp hid_sensor_custom serio_raw sha256_ssse3 drm_buddy sdhci_pci ucsi_acpi atkbd nvme hid_multitouch r8169 sha1_ssse3 sp5100_tco hid_sensor_hub typec_ucsi libps2 gpu_sched sdhci_uhs2 vivaldi_fmap aesni_intel nvme_core sdhci hid_generic realtek typec drm_display_helper video i8042 crypto_simd i2c_piix4 mdio_devres cqhci cryptd thunderbolt mmc_core libphy cec amd_sfh nvme_auth roles i2c_smbus serio i2c_hid_acpi wmi i2c_hid
[  560.223646] CR2: ffff9acf2b889008
[  560.223650] ---[ end trace 0000000000000000 ]---
[  560.223655] RIP: 0010:__srcu_read_lock+0x14/0x30
[  560.223660] Code: 0f 0b eb bc 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 0f 1f 44 00 00 8b 07 48 8b 57 08 83 e0 01 89 c1 <65> 48 ff 04 ca f0 83 44 24 fc 00 c3 cc cc cc cc 66 66 2e 0f 1f 84
[  560.223664] RSP: 0018:ffffb7df8d24fd88 EFLAGS: 00010202
[  560.223670] RAX: 0000000000000001 RBX: ffff9ac82f80de08 RCX: 0000000000000001
[  560.223674] RDX: 0000000000000000 RSI: ffff9ac8fd276f40 RDI: ffff9ac82f80de38
[  560.223679] RBP: ffffb7df8d24fdf8 R08: 0000000000000000 R09: 00000000fffffffd
[  560.223683] R10: 0000000000000001 R11: 00000000ffffffff R12: 0000000000000000
[  560.223687] R13: ffff9ac8fd276f40 R14: 000000000000000e R15: 0000000000000000
[  560.223692] FS:  0000000000000000(0000) GS:ffff9acf2b889000(0000) knlGS:0000000000000000
[  560.223696] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  560.223700] CR2: ffff9acf2b889008 CR3: 00000001e1c40000 CR4: 0000000000f50ef0
[  560.223704] PKRU: 55555554
[  560.223709] note: kworker/0:3[8212] exited with irqs disabled
```
Comment 1 Borislav Petkov 2025-05-03 19:29:24 UTC
Does the issue go away if you revert the commit you found as the bad one? I.e. 9d7de2aa8b41
Comment 2 Rong Zhang 2025-05-03 19:58:40 UTC
(In reply to Borislav Petkov from comment #1)
> Does the issue go away if you revert the commit you found as the bad one?
> I.e. 9d7de2aa8b41

Thanks for your timely reply! I just tried to revert it, but unfortunately hit merge conflicts, and they seem to be non-trivial. Could you please provide a revert patch so that I can test it?
Comment 3 Borislav Petkov 2025-05-03 20:31:56 UTC
Before we go there, why does your kernel have "Tainted: [U]=USER"? Upload full dmesg from the failing and the good boot pls.
Comment 4 Rong Zhang 2025-05-04 03:28:28 UTC
(In reply to Borislav Petkov from comment #3)
> Before we go there, why does your kernel have "Tainted: [U]=USER"? Upload
> full dmesg from the failing and the good boot pls.

Oops, didn't notice the bit. Sorry for that!

The tainted bit was set by amdgpu on load because I specified amdgpu.gpu_recovery=1 in the cmdline. This behavior was just added in 6.14, and I have never noticed the change :-(

```
[   12.278425] Setting dangerous option gpu_recovery - tainting kernel
[   12.278428] [drm] amdgpu kernel modesetting enabled.
```

This is irrelevant to the regression here, as no GPU recovery was done before the PF. I booted 6.15-rc4 without the parameter and reproduced the identical PF:

```
[  161.939105] usb 7-1.4: USB disconnect, device number 7
[  162.041208] BUG: unable to handle page fault for address: ffff8ae08e072018
[  162.041219] #PF: supervisor write access in kernel mode
[  162.041222] #PF: error_code(0x0002) - not-present page
[  162.041224] PGD 42f401067 P4D 42f401067 PUD 0
[  162.041230] Oops: Oops: 0002 [#1] SMP NOPTI
[  162.041235] CPU: 0 UID: 0 PID: 9 Comm: kworker/0:0 Not tainted 6.15.0-rc4 #1 PREEMPT(lazy)  b3a8ad1950c71c15317e5ea614db6e274ecb0913
[  162.041240] Hardware name: LENOVO 21Q4/LNVNB161216, BIOS PXCN24WW 03/11/2025
[  162.041243] Workqueue: events hidinput_led_worker
[  162.041251] RIP: 0010:__srcu_read_unlock+0x1a/0x30
[  162.041257] Code: c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 0f 1f 44 00 00 f0 83 44 24 fc 00 48 63 f6 48 c1 e6 04 48 03 77 08 <65> 48 ff 46 08 c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 90
[  162.041259] RSP: 0018:ffffcc50c0197d88 EFLAGS: 00010202
[  162.041261] RAX: 0000000000000000 RBX: ffff8ad94c899e08 RCX: 0000000000000000
[  162.041263] RDX: 0000000000000002 RSI: 0000000000000010 RDI: ffff8ad94c899e38
[  162.041265] RBP: ffffcc50c0197df8 R08: 0000000000000000 R09: 00000000fffffffd
[  162.041266] R10: 0000000000000001 R11: 00000000ffffffff R12: 0000000000000000
[  162.041267] R13: ffff8ad9762fa850 R14: 0000000000000001 R15: 0000000000000000
[  162.041269] FS:  0000000000000000(0000) GS:ffff8ae08e072000(0000) knlGS:0000000000000000
[  162.041270] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  162.041272] CR2: ffff8ae08e072018 CR3: 000000042e424000 CR4: 0000000000f50ef0
[  162.041274] PKRU: 55555554
[  162.041275] Call Trace:
[  162.041278]  <TASK>
[  162.041279]  dispatch_hid_bpf_output_report+0xc5/0x100
[  162.041286]  hid_hw_output_report+0x46/0x90
[  162.041290]  hidinput_led_worker+0xa9/0xe0
[  162.041294]  process_one_work+0x18f/0x350
[  162.041298]  worker_thread+0x2d3/0x400
[  162.041302]  ? rescuer_thread+0x550/0x550
[  162.041305]  kthread+0xf9/0x240
[  162.041308]  ? kthreads_online_cpu+0x120/0x120
[  162.041310]  ret_from_fork+0x31/0x50
[  162.041314]  ? kthreads_online_cpu+0x120/0x120
[  162.041317]  ret_from_fork_asm+0x11/0x20
[  162.041321]  </TASK>
[  162.041322] Modules linked in: tcp_diag inet_diag xt_mark snd_hrtimer snd_seq_dummy snd_seq_midi snd_seq_oss snd_seq_midi_event snd_seq tun uhid rfcomm cmac algif_hash algif_skcipher af_alg xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 snd_usb_audio snd_usbmidi_lib snd_ump snd_rawmidi snd_seq_device xt_tcpudp nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables bridge stp llc qrtr bnep overlay sunrpc vfat fat uvcvideo videobuf2_vmalloc uvc videobuf2_memops videobuf2_v4l2 btusb videobuf2_common btrtl btintel videodev btbcm btmtk mc bluetooth intel_rapl_msr amd_atl intel_rapl_common snd_acp_legacy_mach snd_acp_mach snd_soc_nau8821 snd_acp3x_rn snd_acp70 snd_acp_i2s snd_acp_pdm snd_soc_dmic snd_acp_pcm snd_sof_amd_acp70 snd_sof_amd_acp63 snd_sof_amd_vangogh snd_sof_amd_rembrandt snd_sof_amd_renoir snd_sof_amd_acp snd_sof_pci snd_sof_xtensa_dsp snd_sof snd_sof_utils snd_pci_ps snd_soc_acpi_amd_match snd_amd_sdw_acpi soundwire_amd snd_hda_codec_realtek
[  162.041376]  soundwire_generic_allocation soundwire_bus snd_hda_codec_generic snd_hda_scodec_component snd_soc_sdca snd_hda_codec_hdmi snd_soc_core kvm_amd mt7925e snd_hda_intel snd_compress mt7925_common ac97_bus snd_intel_dspcfg snd_intel_sdw_acpi snd_pcm_dmaengine mt792x_lib kvm snd_rpl_pci_acp6x snd_hda_codec mt76_connac_lib snd_acp_pci snd_amd_acpi_mach snd_hda_core mt76 snd_acp_legacy_common snd_pci_acp6x irqbypass think_lmi snd_hwdep mac80211 hid_sensor_als snd_pcm_oss rapl snd_ctl_led libarc4 snd_pci_acp5x snd_mixer_oss amd_pmf hid_sensor_trigger pcspkr firmware_attributes_class lenovo_wmi_hotkey_utilities wmi_bmof amdtee snd_pcm snd_rn_pci_acp3x industrialio_triggered_buffer cfg80211 snd_acp_config kfifo_buf snd_timer snd_soc_acpi snd k10temp hid_sensor_iio_common amdxdna rfkill ccp snd_pci_acp3x soundcore platform_profile spd5118 amd_pmc industrialio joydev mousedev mac_hid sch_fq_codel uinput i2c_dev parport_pc ppdev lp parport nvme_fabrics nfnetlink ip_tables x_tables dm_crypt encrypted_keys trusted
[  162.041425]  asn1_encoder tee dm_mod raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid1 raid0 linear md_mod igc ptp pps_core uas usb_storage hid_logitech_hidpp hid_logitech_dj r8153_ecm cdc_ether usbnet r8152 mii usbhid amdgpu i2c_algo_bit drm_ttm_helper ttm drm_panel_backlight_quirks serio_raw polyval_clmulni drm_exec polyval_generic atkbd drm_suballoc_helper hid_sensor_custom libps2 amdxcp ghash_clmulni_intel thunderbolt nvme drm_buddy sha512_ssse3 vivaldi_fmap r8169 sp5100_tco sdhci_pci hid_sensor_hub hid_multitouch sha256_ssse3 gpu_sched nvme_core sha1_ssse3 sdhci_uhs2 ucsi_acpi realtek typec_ucsi aesni_intel sdhci hid_generic i8042 mdio_devres drm_display_helper i2c_piix4 crypto_simd nvme_keyring typec cqhci video cryptd libphy mmc_core i2c_smbus cec roles wmi amd_sfh serio nvme_auth i2c_hid_acpi i2c_hid
[  162.041479] CR2: ffff8ae08e072018
[  162.041483] ---[ end trace 0000000000000000 ]---
[  162.268980] pstore: backend (efi_pstore) writing error (-28)
[  162.268987] RIP: 0010:__srcu_read_unlock+0x1a/0x30
[  162.268992] Code: c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 0f 1f 44 00 00 f0 83 44 24 fc 00 48 63 f6 48 c1 e6 04 48 03 77 08 <65> 48 ff 46 08 c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 90
[  162.268994] RSP: 0018:ffffcc50c0197d88 EFLAGS: 00010202
[  162.268996] RAX: 0000000000000000 RBX: ffff8ad94c899e08 RCX: 0000000000000000
[  162.268998] RDX: 0000000000000002 RSI: 0000000000000010 RDI: ffff8ad94c899e38
[  162.268998] RBP: ffffcc50c0197df8 R08: 0000000000000000 R09: 00000000fffffffd
[  162.268999] R10: 0000000000000001 R11: 00000000ffffffff R12: 0000000000000000
[  162.269000] R13: ffff8ad9762fa850 R14: 0000000000000001 R15: 0000000000000000
[  162.269001] FS:  0000000000000000(0000) GS:ffff8ae08e072000(0000) knlGS:0000000000000000
[  162.269001] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  162.269002] CR2: ffff8ae08e072018 CR3: 0000000265d22000 CR4: 0000000000f50ef0
[  162.269003] PKRU: 55555554
[  162.269004] note: kworker/0:0[9] exited with irqs disabled
```
Comment 5 Rong Zhang 2025-05-04 03:36:12 UTC
Created attachment 308080
Full demsg
Comment 6 Borislav Petkov 2025-05-06 14:56:16 UTC
Switching to mail.

Hi Benjamin,

take a look at the below pls.

The RIP points to:

  22:   48 c1 e6 04             shl    $0x4,%rsi
  26:   48 03 77 08             add    0x8(%rdi),%rsi
  2a:*  65 48 ff 46 08          incq   %gs:0x8(%rsi)            <-- trapping instruction
  2f:   c3                      ret

which really is a %gs-based access and the reporter has bisected this to

  9d7de2aa8b41 ("x86/percpu/64: Use relative percpu offsets")

which looks related.
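
As a sanity check that the fault really comes from the %gs-relative operand, the faulting address in CR2 can be recomputed from the register dumps in the report. This is only a sketch: the helper `gs_relative_address` is a made-up name, and the operand form used for the bad-commit oops, `incq %gs:(%rdx,%rcx,8)`, is my reading of the `<65> 48 ff 04 ca` bytes in its `Code:` line rather than something stated in the report.

```python
MASK64 = 0xFFFF_FFFF_FFFF_FFFF


def gs_relative_address(gs_base: int, base: int, index: int = 0,
                        scale: int = 1, disp: int = 0) -> int:
    """Effective address of a %gs:disp(base,index,scale) memory operand."""
    return (gs_base + base + index * scale + disp) & MASK64


# 6.15-rc4 oops: incq %gs:0x8(%rsi), with GS base and RSI from the dump.
rc4 = gs_relative_address(0xFFFF8A510EE72000, base=0x10, disp=0x8)
print(hex(rc4))  # → 0xffff8a510ee72018, equal to CR2 in that oops

# Oops at the bad commit: assumed operand %gs:(%rdx,%rcx,8), RDX=0, RCX=1.
bad = gs_relative_address(0xFFFF9ACF2B889000, base=0x0, index=1, scale=8)
print(hex(bad))  # → 0xffff9acf2b889008, equal to CR2 in that oops
```

In both dumps the non-GS part of the address is tiny (0x18 and 0x8), so the access lands just past the GS base itself.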

My silly guess would be some bpf program does per-cpu accesses but it doesn't
know about this change so it tramples over registers. I mean, my fix would be
to disable BPF but you young kids love to play with that...

:-)

Thx.

On Sat, May 03, 2025 at 06:40:41PM +0000, bugzilla-daemon@kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=220083
> 
>             Bug ID: 220083
>            Summary: [REGRESSION, BISECTED] x86 ASM changes make
>                     dispatch_hid_bpf_output_report access not-present page
>            Product: Platform Specific/Hardware
>            Version: 2.5
>           Hardware: All
>                 OS: Linux
>             Status: NEW
>           Severity: high
>           Priority: P3
>          Component: x86-64
>           Assignee: platform_x86_64@kernel-bugs.osdl.org
>           Reporter: i@rong.moe
>         Regression: No
> 
> After upgrading from 6.14.x to 6.15-rc3, not-present page PF occurs each time
> I unplug any of my Logitech Unifying receivers.
> 
> Upgrading to 6.15-rc4 did not fix the issue.
> 
> dmesg:
> ```
> [   48.726588] usb 7-1.4: USB disconnect, device number 7
> [   48.856531] BUG: unable to handle page fault for address: ffff8a510ee72018
> [   48.856543] #PF: supervisor write access in kernel mode
> [   48.856547] #PF: error_code(0x0002) - not-present page
> [   48.856550] PGD 365c01067 P4D 365c01067 PUD 0
> [   48.856558] Oops: Oops: 0002 [#1] SMP NOPTI
> [   48.856566] CPU: 0 UID: 0 PID: 7237 Comm: kworker/0:3 Tainted: G     U     
>        6.15.0-rc4 #1 PREEMPT(lazy)  b3a8ad1950c71c15317e5ea614db6e274ecb0913
> [   48.856574] Tainted: [U]=USER
> [   48.856577] Hardware name: LENOVO 21Q4/LNVNB161216, BIOS PXCN24WW
> 03/11/2025
> [   48.856579] Workqueue: events hidinput_led_worker
> [   48.856589] RIP: 0010:__srcu_read_unlock+0x1a/0x30
> [   48.856595] Code: c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e
> fa 0f 1f 44 00 00 f0 83 44 24 fc 00 48 63 f6 48 c1 e6 04 48 03 77 08 <65> 48
> ff
> 46 08 c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 90
> [   48.856598] RSP: 0018:ffffd037cc29fd88 EFLAGS: 00010202
> [   48.856602] RAX: 0000000000000000 RBX: ffff8a4c6b16fe08 RCX:
> 0000000000000000
> [   48.856604] RDX: 0000000000000002 RSI: 0000000000000010 RDI:
> ffff8a4c6b16fe38
> [   48.856606] RBP: ffffd037cc29fdf8 R08: 0000000000000000 R09:
> 00000000fffffffd
> [   48.856607] R10: 0000000000000001 R11: 00000000ffffffff R12:
> 0000000000000000
> [   48.856609] R13: ffff8a4ac182dbc0 R14: 0000000000000001 R15:
> 0000000000000000
> [   48.856611] FS:  0000000000000000(0000) GS:ffff8a510ee72000(0000)
> knlGS:0000000000000000
> [   48.856613] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   48.856614] CR2: ffff8a510ee72018 CR3: 0000000364c24000 CR4:
> 0000000000f50ef0
> [   48.856617] PKRU: 55555554
> [   48.856618] Call Trace:
> [   48.856621]  <TASK>
> [   48.856623]  dispatch_hid_bpf_output_report+0xc5/0x100
> [   48.856631]  hid_hw_output_report+0x46/0x90
> [   48.856635]  hidinput_led_worker+0xa9/0xe0
> [   48.856640]  process_one_work+0x18f/0x350
> [   48.856646]  worker_thread+0x2d3/0x400
> [   48.856650]  ? rescuer_thread+0x550/0x550
> [   48.856654]  kthread+0xf9/0x240
> [   48.856657]  ? kthreads_online_cpu+0x120/0x120
> [   48.856661]  ret_from_fork+0x31/0x50
> [   48.856665]  ? kthreads_online_cpu+0x120/0x120
> [   48.856668]  ret_from_fork_asm+0x11/0x20
> [   48.856674]  </TASK>
> [   48.856675] Modules linked in: xt_mark tcp_diag inet_diag snd_hrtimer
> snd_seq_dummy snd_seq_midi snd_seq_oss snd_seq_midi_event snd_seq uhid rfcomm
> cmac algif_hash algif_skcipher af_alg xt_CHECKSUM xt_MASQUERADE xt_conntrack
> ipt_REJECT nf_reject_ipv4 xt_tcpudp nft_compat nft_chain_nat nf_nat
> nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 tun snd_usb_audio snd_usbmidi_lib
> snd_ump snd_rawmidi snd_seq_device bridge stp llc nf_tables qrtr bnep overlay
> sunrpc vfat fat uvcvideo videobuf2_vmalloc uvc videobuf2_memops
> videobuf2_v4l2
> videobuf2_common btusb videodev btrtl btintel mc btbcm btmtk bluetooth
> amd_atl
> intel_rapl_msr intel_rapl_common snd_acp_legacy_mach snd_acp_mach
> snd_soc_nau8821 snd_acp3x_rn snd_acp70 snd_acp_i2s snd_acp_pdm snd_soc_dmic
> snd_acp_pcm snd_sof_amd_acp70 snd_sof_amd_acp63 snd_sof_amd_vangogh
> snd_sof_amd_rembrandt snd_sof_amd_renoir snd_sof_amd_acp snd_sof_pci
> snd_sof_xtensa_dsp snd_sof snd_sof_utils snd_pci_ps snd_soc_acpi_amd_match
> snd_amd_sdw_acpi soundwire_amd snd_hda_codec_realtek
> [   48.856732]  soundwire_generic_allocation soundwire_bus
> snd_hda_codec_generic snd_soc_sdca snd_hda_scodec_component
> snd_hda_codec_hdmi
> snd_soc_core mt7925e snd_compress mt7925_common snd_hda_intel ac97_bus
> mt792x_lib snd_intel_dspcfg snd_pcm_dmaengine mt76_connac_lib
> snd_intel_sdw_acpi snd_rpl_pci_acp6x kvm_amd mt76 snd_hda_codec snd_acp_pci
> think_lmi snd_amd_acpi_mach kvm snd_hda_core snd_acp_legacy_common
> snd_pci_acp6x snd_hwdep mac80211 snd_pcm_oss snd_mixer_oss irqbypass
> snd_pci_acp5x snd_ctl_led snd_pcm libarc4 rapl pcspkr
> firmware_attributes_class
> snd_timer lenovo_wmi_hotkey_utilities snd_rn_pci_acp3x wmi_bmof cfg80211
> snd_acp_config snd snd_soc_acpi k10temp hid_sensor_als spd5118 amdxdna
> amd_pmf
> snd_pci_acp3x rfkill soundcore hid_sensor_trigger
> industrialio_triggered_buffer
> amdtee kfifo_buf joydev hid_sensor_iio_common ccp industrialio amd_pmc
> platform_profile mousedev mac_hid sch_fq_codel uinput i2c_dev parport_pc
> ppdev
> lp parport nvme_fabrics nfnetlink ip_tables x_tables dm_crypt encrypted_keys
> trusted
> [   48.856786]  asn1_encoder tee dm_mod raid10 raid456 async_raid6_recov
> async_memcpy async_pq async_xor async_tx raid1 raid0 linear md_mod igc ptp
> pps_core uas usb_storage hid_logitech_hidpp r8153_ecm cdc_ether usbnet
> hid_logitech_dj r8152 mii usbhid amdgpu i2c_algo_bit drm_ttm_helper ttm
> drm_panel_backlight_quirks polyval_clmulni polyval_generic drm_exec
> ghash_clmulni_intel drm_suballoc_helper amdxcp sha512_ssse3 sdhci_pci
> drm_buddy
> sha256_ssse3 thunderbolt hid_sensor_custom r8169 sha1_ssse3 serio_raw
> sp5100_tco sdhci_uhs2 gpu_sched nvme sdhci hid_multitouch realtek
> hid_sensor_hub aesni_intel atkbd ucsi_acpi drm_display_helper hid_generic
> nvme_core cqhci crypto_simd mdio_devres libps2 video typec_ucsi i2c_piix4
> vivaldi_fmap cryptd nvme_keyring typec libphy mmc_core i2c_smbus i8042 cec
> i2c_hid_acpi amd_sfh nvme_auth roles wmi serio i2c_hid
> [   48.856843] CR2: ffff8a510ee72018
> [   48.856846] ---[ end trace 0000000000000000 ]---
> [   50.304586] RIP: 0010:__srcu_read_unlock+0x1a/0x30
> [   50.304601] Code: c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e
> fa 0f 1f 44 00 00 f0 83 44 24 fc 00 48 63 f6 48 c1 e6 04 48 03 77 08 <65> 48
> ff
> 46 08 c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 90
> [   50.304603] RSP: 0018:ffffd037cc29fd88 EFLAGS: 00010202
> [   50.304606] RAX: 0000000000000000 RBX: ffff8a4c6b16fe08 RCX:
> 0000000000000000
> [   50.304607] RDX: 0000000000000002 RSI: 0000000000000010 RDI:
> ffff8a4c6b16fe38
> [   50.304608] RBP: ffffd037cc29fdf8 R08: 0000000000000000 R09:
> 00000000fffffffd
> [   50.304609] R10: 0000000000000001 R11: 00000000ffffffff R12:
> 0000000000000000
> [   50.304610] R13: ffff8a4ac182dbc0 R14: 0000000000000001 R15:
> 0000000000000000
> [   50.304611] FS:  0000000000000000(0000) GS:ffff8a510ee72000(0000)
> knlGS:0000000000000000
> [   50.304612] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   50.304613] CR2: ffff8a510ee72018 CR3: 0000000121904000 CR4:
> 0000000000f50ef0
> [   50.304615] PKRU: 55555554
> [   50.304616] note: kworker/0:3[7237] exited with irqs disabled
> ```
> 
> Bisect log:
> 
> ```
> # good: [38fec10eb60d687e30c8c6b5420d86e8149f7557] Linux 6.14
> git bisect good 38fec10eb60d687e30c8c6b5420d86e8149f7557
> # bad: [9c32cda43eb78f78c73aee4aa344b777714e259b] Linux 6.15-rc3
> git bisect bad 9c32cda43eb78f78c73aee4aa344b777714e259b
> # bad: [4a4b30ea80d8cb5e8c4c62bb86201f4ea0d9b030] Merge tag
> 'bcachefs-2025-03-24' of git://evilpiepirate.org/bcachefs
> git bisect bad 4a4b30ea80d8cb5e8c4c62bb86201f4ea0d9b030
> # bad: [1e1ba8d23dae91e6a9cfeb1236b02749e8a49ab3] Merge tag
> 'timers-clocksource-2025-03-26' of
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
> git bisect bad 1e1ba8d23dae91e6a9cfeb1236b02749e8a49ab3
> # skip: [21e0ff5b10ec1b61fda435d42db4ba80d0cdfded] Merge tag 'acpi-6.15-rc1'
> of
> git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
> git bisect skip 21e0ff5b10ec1b61fda435d42db4ba80d0cdfded
> # good: [47c4f9b1722fd883c9745d7877cb212e41dd2715] Tidy up ASoC control get
> and
> put handlers
> git bisect good 47c4f9b1722fd883c9745d7877cb212e41dd2715
> # bad: [2899aa3973efa3b0a7005cb7fb60475ea0c3b8a0] Merge tag
> 'x86_cache_for_v6.15' of
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
> git bisect bad 2899aa3973efa3b0a7005cb7fb60475ea0c3b8a0
> # good: [5a658afd468b0fb55bf5f45c9788ee8dc87ba463] Merge tag
> 'objtool-core-2025-03-22' of
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
> git bisect good 5a658afd468b0fb55bf5f45c9788ee8dc87ba463
> # bad: [a49a879f0ac19ed0a562e220019741857b261551] Merge tag
> 'x86-cleanups-2025-03-22' of
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
> git bisect bad a49a879f0ac19ed0a562e220019741857b261551
> # bad: [9a93e29f16bbba90a63faad0abbc6dea3b2f0c63] x86/syscall: Move
> sys_ni_syscall()
> git bisect bad 9a93e29f16bbba90a63faad0abbc6dea3b2f0c63
> # bad: [cfdaa618defc5ebe1ee6aa5bd40a7ccedffca6de] Merge branch 'x86/cpu' into
> x86/asm, to pick up dependent commits
> git bisect bad cfdaa618defc5ebe1ee6aa5bd40a7ccedffca6de
> # good: [c4a8b7116b9927f7b00bd68140e285662a03068e] perf/x86/intel: Use cache
> cpu-type for hybrid PMU selection
> git bisect good c4a8b7116b9927f7b00bd68140e285662a03068e
> # good: [4f2a0b765c9731d2fa94e209ee9ae0e96b280f17] <linux/sizes.h>: Cover all
> possible x86 CPU cache sizes
> git bisect good 4f2a0b765c9731d2fa94e209ee9ae0e96b280f17
> # bad: [95b0916118106054e1f3d5d7f8628ef3dc0b3c02] percpu: Remove
> PER_CPU_FIRST_SECTION
> git bisect bad 95b0916118106054e1f3d5d7f8628ef3dc0b3c02
> # skip: [78c4374ef8b842c6abf195d6f963853c7ec464d2] x86/module: Deal with GOT
> based stack cookie load on Clang < 17
> git bisect skip 78c4374ef8b842c6abf195d6f963853c7ec464d2
> # bad: [b5c4f95351a097a635c1a7fc8d9efa18308491b5] x86/percpu/64: Remove
> fixed_percpu_data
> git bisect bad b5c4f95351a097a635c1a7fc8d9efa18308491b5
> # skip: [cb7927fda002ca49ae62e2782c1692acc7b80c67] x86/relocs: Handle
> R_X86_64_REX_GOTPCRELX relocations
> git bisect skip cb7927fda002ca49ae62e2782c1692acc7b80c67
> # skip: [80d47defddc000271502057ebd7efa4fd6481542] x86/stackprotector/64:
> Convert to normal per-CPU variable
> git bisect skip 80d47defddc000271502057ebd7efa4fd6481542
> # skip: [f58b63857ae38b4484185b799a2759274b930c92] x86/pvh: Use
> fixed_percpu_data for early boot GSBASE
> git bisect skip f58b63857ae38b4484185b799a2759274b930c92
> # good: [0ee2689b9374d6fd5f43b703713a532278654749] x86/stackprotector: Remove
> stack protector test scripts
> git bisect good 0ee2689b9374d6fd5f43b703713a532278654749
> # bad: [9d7de2aa8b41407bc96d89a80dc1fd637d389d42] x86/percpu/64: Use relative
> percpu offsets
> git bisect bad 9d7de2aa8b41407bc96d89a80dc1fd637d389d42
> # good: [a9a76b38aaf577887103e3ebb41d70e6aa5a4b19] x86/boot: Disable stack
> protector for early boot code
> git bisect good a9a76b38aaf577887103e3ebb41d70e6aa5a4b19
> # only skipped commits left to test
> # possible first bad commit: [9d7de2aa8b41407bc96d89a80dc1fd637d389d42]
> x86/percpu/64: Use relative percpu offsets
> # possible first bad commit: [80d47defddc000271502057ebd7efa4fd6481542]
> x86/stackprotector/64: Convert to normal per-CPU variable
> # possible first bad commit: [78c4374ef8b842c6abf195d6f963853c7ec464d2]
> x86/module: Deal with GOT based stack cookie load on Clang < 17
> # possible first bad commit: [cb7927fda002ca49ae62e2782c1692acc7b80c67]
> x86/relocs: Handle R_X86_64_REX_GOTPCRELX relocations
> # possible first bad commit: [f58b63857ae38b4484185b799a2759274b930c92]
> x86/pvh: Use fixed_percpu_data for early boot GSBASE
> ```
> 
> There is a typo in commit f58b63857ae3 ("x86/pvh: Use fixed_percpu_data for
> early boot GSBASE"), resulting in compilation failure.
> With the patch below, I bisected again:
> 
> ```
> diff --git a/arch/x86/platform/pvh/head.S b/arch/x86/platform/pvh/head.S
> index 723f181b222a..f1a8392a4835 100644
> --- a/arch/x86/platform/pvh/head.S
> +++ b/arch/x86/platform/pvh/head.S
> @@ -180,7 +180,7 @@ SYM_CODE_START(pvh_start_xen)
>          */
>         movl $MSR_GS_BASE,%ecx
>         leaq INIT_PER_CPU_VAR(fixed_percpu_data)(%rip), %rdx
> -       movq %edx, %eax
> +       movl %edx, %eax
>         shrq $32, %rdx
>         wrmsr
> ```
> 
> New bisect log:
> 
> ```
> [...]
> # good: [a9a76b38aaf577887103e3ebb41d70e6aa5a4b19] x86/boot: Disable stack
> protector for early boot code
> git bisect good a9a76b38aaf577887103e3ebb41d70e6aa5a4b19
> # good: [78c4374ef8b842c6abf195d6f963853c7ec464d2] x86/module: Deal with GOT
> based stack cookie load on Clang < 17
> git bisect good 78c4374ef8b842c6abf195d6f963853c7ec464d2
> # good: [80d47defddc000271502057ebd7efa4fd6481542] x86/stackprotector/64:
> Convert to normal per-CPU variable
> git bisect good 80d47defddc000271502057ebd7efa4fd6481542
> # first bad commit: [9d7de2aa8b41407bc96d89a80dc1fd637d389d42] x86/percpu/64:
> Use relative percpu offsets
> ```
> 
> The bad commit 9d7de2aa8b41 ("x86/percpu/64: Use relative percpu offsets")
> first appeared in v6.15-rc1.
> 
> Got dmesg below by building and booting the bad commit, then unplugging the
> receiver:
> 
> ```
> [  560.223095] BUG: unable to handle page fault for address: ffff9acf2b889008
> [  560.223174] #PF: supervisor write access in kernel mode
> [  560.223299] #PF: error_code(0x0002) - not-present page
> [  560.223332] PGD 43e401067 P4D 43e401067 PUD 0
> [  560.223353] Oops: Oops: 0002 [#1] PREEMPT SMP NOPTI
> [  560.223359] CPU: 0 UID: 0 PID: 8212 Comm: kworker/0:3 Tainted: G     U     
>       6.14.0-rc3+ #1 ab962f3b7921227b62db2503d8ec7411fa694628
> [  560.223364] Tainted: [U]=USER
> [  560.223369] Hardware name: LENOVO 21Q4/LNVNB161216, BIOS PXCN24WW
> 03/11/2025
> [  560.223378] Workqueue: events hidinput_led_worker
> [  560.223382] RIP: 0010:__srcu_read_lock+0x14/0x30
> [  560.223387] Code: 0f 0b eb bc 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 84 00 00
> 00 00 00 f3 0f 1e fa 0f 1f 44 00 00 8b 07 48 8b 57 08 83 e0 01 89 c1 <65> 48
> ff
> 04 ca f0 83 44 24 fc 00 c3 cc cc cc cc 66 66 2e 0f 1f 84
> [  560.223392] RSP: 0018:ffffb7df8d24fd88 EFLAGS: 00010202
> [  560.223396] RAX: 0000000000000001 RBX: ffff9ac82f80de08 RCX:
> 0000000000000001
> [  560.223401] RDX: 0000000000000000 RSI: ffff9ac8fd276f40 RDI:
> ffff9ac82f80de38
> [  560.223407] RBP: ffffb7df8d24fdf8 R08: 0000000000000000 R09:
> 00000000fffffffd
> [  560.223412] R10: 0000000000000001 R11: 00000000ffffffff R12:
> 0000000000000000
> [  560.223417] R13: ffff9ac8fd276f40 R14: 000000000000000e R15:
> 0000000000000000
> [  560.223421] FS:  0000000000000000(0000) GS:ffff9acf2b889000(0000)
> knlGS:0000000000000000
> [  560.223426] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  560.223430] CR2: ffff9acf2b889008 CR3: 00000001e1c40000 CR4:
> 0000000000f50ef0
> [  560.223434] PKRU: 55555554
> [  560.223439] Call Trace:
> [  560.223444]  <TASK>
> [  560.223449]  ? __die_body.cold+0x19/0x29
> [  560.223453]  ? page_fault_oops+0x15a/0x2e0
> [  560.223458]  ? search_bpf_extables+0x5f/0x80
> [  560.223462]  ? exc_page_fault+0x1a3/0x1b0
> [  560.223466]  ? asm_exc_page_fault+0x26/0x30
> [  560.223471]  ? __srcu_read_lock+0x14/0x30
> [  560.223475]  ? psi_task_switch+0xb7/0x200
> [  560.223480]  dispatch_hid_bpf_output_report+0x73/0x100
> [  560.223485]  hid_hw_output_report+0x46/0x90
> [  560.223490]  hidinput_led_worker+0xa9/0xe0
> [  560.223494]  process_one_work+0x17b/0x330
> [  560.223498]  worker_thread+0x2ce/0x3f0
> [  560.223503]  ? rescuer_thread+0x530/0x530
> [  560.223507]  kthread+0xeb/0x230
> [  560.223512]  ? kthreads_online_cpu+0x120/0x120
> [  560.223516]  ret_from_fork+0x31/0x50
> [  560.223522]  ? kthreads_online_cpu+0x120/0x120
> [  560.223528]  ret_from_fork_asm+0x11/0x20
> [  560.223532]  </TASK>
> [  560.223538] Modules linked in: tcp_diag inet_diag xt_mark snd_hrtimer
> snd_seq_dummy snd_seq_midi snd_seq_oss snd_seq_midi_event snd_seq uhid rfcomm
> cmac algif_hash algif_skcipher af_alg xt_CHECKSUM xt_MASQUERADE xt_conntrack
> ipt_REJECT nf_reject_ipv4 xt_tcpudp nft_compat nft_chain_nat nf_nat
> nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 tun bridge stp llc nf_tables
> snd_usb_audio snd_usbmidi_lib snd_ump snd_rawmidi snd_seq_device qrtr bnep
> overlay sunrpc vfat fat uvcvideo videobuf2_vmalloc uvc videobuf2_memops btusb
> videobuf2_v4l2 btrtl videobuf2_common btintel btbcm videodev btmtk mc
> bluetooth
> snd_acp_legacy_mach snd_acp_mach snd_soc_nau8821 snd_acp3x_rn snd_acp70
> snd_acp_i2s snd_acp_pdm snd_soc_dmic snd_acp_pcm snd_sof_amd_acp70
> snd_sof_amd_acp63 snd_sof_amd_vangogh snd_sof_amd_rembrandt
> snd_sof_amd_renoir
> snd_sof_amd_acp intel_rapl_msr amd_atl snd_sof_pci intel_rapl_common
> snd_sof_xtensa_dsp snd_sof snd_sof_utils snd_pci_ps snd_soc_acpi_amd_match
> snd_amd_sdw_acpi soundwire_amd soundwire_generic_allocation snd_ctl_led
> [  560.223612]  soundwire_bus snd_soc_sdca snd_hda_codec_realtek
> snd_hda_codec_generic snd_soc_core mt7925e snd_hda_scodec_component
> mt7925_common snd_compress mt792x_lib snd_hda_codec_hdmi ac97_bus
> snd_hda_intel
> mt76_connac_lib snd_pcm_dmaengine snd_intel_dspcfg mt76 snd_rpl_pci_acp6x
> snd_intel_sdw_acpi snd_hda_codec kvm_amd snd_acp_pci think_lmi snd_hda_core
> snd_acp_legacy_common mac80211 kvm snd_pci_acp6x snd_hwdep snd_pcm_oss
> snd_mixer_oss snd_pci_acp5x libarc4 amd_pmf rapl pcspkr
> firmware_attributes_class wmi_bmof hid_sensor_als amdtee snd_pcm
> hid_sensor_trigger snd_rn_pci_acp3x cfg80211 industrialio_triggered_buffer
> snd_timer joydev snd_acp_config kfifo_buf spd5118 snd snd_soc_acpi
> hid_sensor_iio_common ccp soundcore snd_pci_acp3x rfkill platform_profile
> amdxdna k10temp industrialio amd_pmc mousedev mac_hid sch_fq_codel uinput
> i2c_dev parport_pc ppdev lp parport nvme_fabrics nvme_keyring nfnetlink
> ip_tables x_tables dm_crypt encrypted_keys trusted asn1_encoder tee dm_mod
> raid10 raid456 async_raid6_recov
> [  560.223631]  async_memcpy async_pq async_xor async_tx raid1 raid0 linear
> md_mod igc ptp pps_core uas usb_storage hid_logitech_hidpp r8153_ecm
> cdc_ether
> usbnet hid_logitech_dj r8152 mii usbhid amdgpu i2c_algo_bit drm_ttm_helper
> ttm
> drm_panel_backlight_quirks polyval_clmulni drm_exec polyval_generic
> ghash_clmulni_intel drm_suballoc_helper sha512_ssse3 amdxcp hid_sensor_custom
> serio_raw sha256_ssse3 drm_buddy sdhci_pci ucsi_acpi atkbd nvme
> hid_multitouch
> r8169 sha1_ssse3 sp5100_tco hid_sensor_hub typec_ucsi libps2 gpu_sched
> sdhci_uhs2 vivaldi_fmap aesni_intel nvme_core sdhci hid_generic realtek typec
> drm_display_helper video i8042 crypto_simd i2c_piix4 mdio_devres cqhci cryptd
> thunderbolt mmc_core libphy cec amd_sfh nvme_auth roles i2c_smbus serio
> i2c_hid_acpi wmi i2c_hid
> [  560.223646] CR2: ffff9acf2b889008
> [  560.223650] ---[ end trace 0000000000000000 ]---
> [  560.223655] RIP: 0010:__srcu_read_lock+0x14/0x30
> [  560.223660] Code: 0f 0b eb bc 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 84 00 00
> 00 00 00 f3 0f 1e fa 0f 1f 44 00 00 8b 07 48 8b 57 08 83 e0 01 89 c1 <65> 48
> ff
> 04 ca f0 83 44 24 fc 00 c3 cc cc cc cc 66 66 2e 0f 1f 84
> [  560.223664] RSP: 0018:ffffb7df8d24fd88 EFLAGS: 00010202
> [  560.223670] RAX: 0000000000000001 RBX: ffff9ac82f80de08 RCX:
> 0000000000000001
> [  560.223674] RDX: 0000000000000000 RSI: ffff9ac8fd276f40 RDI:
> ffff9ac82f80de38
> [  560.223679] RBP: ffffb7df8d24fdf8 R08: 0000000000000000 R09:
> 00000000fffffffd
> [  560.223683] R10: 0000000000000001 R11: 00000000ffffffff R12:
> 0000000000000000
> [  560.223687] R13: ffff9ac8fd276f40 R14: 000000000000000e R15:
> 0000000000000000
> [  560.223692] FS:  0000000000000000(0000) GS:ffff9acf2b889000(0000)
> knlGS:0000000000000000
> [  560.223696] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  560.223700] CR2: ffff9acf2b889008 CR3: 00000001e1c40000 CR4:
> 0000000000f50ef0
> [  560.223704] PKRU: 55555554
> [  560.223709] note: kworker/0:3[8212] exited with irqs disabled
> ```
> 
Comment 7 bentiss 2025-05-06 15:35:43 UTC
Hi Boris,

On May 06 2025, Borislav Petkov wrote:
> Switching to mail.
> 
> Hi Benjamin,
> 
> take a look at the below pls.
> 
> The RIP points to:
> 
>   22:   48 c1 e6 04             shl    $0x4,%rsi
>   26:   48 03 77 08             add    0x8(%rdi),%rsi
>   2a:*  65 48 ff 46 08          incq   %gs:0x8(%rsi)            <-- trapping instruction
>   2f:   c3                      ret
> 
> which really is a %gs-based access and the reporter has bisected this to
> 
>   9d7de2aa8b41 ("x86/percpu/64: Use relative percpu offsets")
> 
> which looks related.
> 
> My silly guess would be some bpf program does per-cpu accesses but it doesn't
> know about this change so it tramples over registers. I mean, my fix would be
> to disable BPF but you young kids love to play with that...

Heh. Well, first I would like to know whether any HID-BPF program is loaded.
These can be seen by running `sudo tree /sys/fs/bpf/hid/`.
`sudo bpftool prog` is another option in case udev-hid-bpf is not used.
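
For anyone who prefers a script to `tree`, the same check can be sketched in Python. This is only an illustration: the helper name `list_pinned_entries` is made up here, and it assumes bpffs is mounted at the usual place and that the script runs with enough privileges to read it.

```python
import os


def list_pinned_entries(root="/sys/fs/bpf/hid"):
    """Return sorted paths (relative to root) of everything below root.

    Returns an empty list when root does not exist or is not readable,
    which is also what you get when no HID-BPF program is pinned there.
    """
    if not os.path.isdir(root):
        return []
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            found.append(os.path.relpath(full, root))
    return sorted(found)


if __name__ == "__main__":
    entries = list_pinned_entries()
    if entries:
        print("\n".join(entries))
    else:
        print("nothing found under /sys/fs/bpf/hid "
              "(not mounted, not readable, or nothing pinned)")
```

An empty result only says that nothing is pinned under that directory; programs loaded by other means would still need `bpftool prog`.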

If there is no hid-bpf program loaded, then it seems the code path in
drivers/hid/bpf/hid_bpf_dispatch.c:133 is:

```
	idx = srcu_read_lock(&hdev->bpf.srcu);
	list_for_each_entry_srcu(e, &hdev->bpf.prog_list, list,
				 srcu_read_lock_held(&hdev->bpf.srcu)) {
		// nothing happens here because the list is empty
	}
	ret = 0;

out:
	srcu_read_unlock(&hdev->bpf.srcu, idx);
```

So we are just in srcu_read_lock()/srcu_read_unlock() which is unlikely
to fail...
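
As a sanity check on the two oopses, the faulting address can be recomputed from the register dumps. The sketch below uses only values printed in the logs. The operand of the second trapping instruction, `incq %gs:(%rdx,%rcx,8)`, is a hand decode of the `<65> 48 ff 04 ca` bytes in the `Code:` line, so treat it as an assumption; the helper name is made up for illustration.

```python
MASK64 = (1 << 64) - 1


def gs_effective_address(gs_base, base=0, index=0, scale=1, disp=0):
    """Effective address of a %gs-relative operand: gs + base + index*scale + disp."""
    return (gs_base + base + index * scale + disp) & MASK64


# 6.15-rc4 oops in __srcu_read_unlock: incq %gs:0x8(%rsi), RSI = 0x10
unlock_addr = gs_effective_address(0xFFFF8A510EE72000, base=0x10, disp=0x8)

# Oops on the bisected commit in __srcu_read_lock:
# incq %gs:(%rdx,%rcx,8), RDX = 0x0, RCX = 0x1 (hand-decoded operand)
lock_addr = gs_effective_address(0xFFFF9ACF2B889000, base=0x0, index=0x1, scale=8)

if __name__ == "__main__":
    print(hex(unlock_addr))  # CR2 in the first oops:  ffff8a510ee72018
    print(hex(lock_addr))    # CR2 in the second oops: ffff9acf2b889008
```

Both results match the reported CR2 values. In both dumps the part of the address that does not come from the GS base is tiny (0x18 and 0x8), so the per-CPU pointer loaded from `0x8(%rdi)` looks like zero or close to it. That is only an observation from the register dumps, not a diagnosis.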

However, the fact that this happens on an unplug event makes me think
that there may be a race at play here.

Another option is that I completely misused srcu, but it was
working fine previously, so I have no idea :)

Anyway, we first need to wait for the reporter to tell us whether any
HID-BPF programs were loaded, because this will likely give us a hint
as to where the issue is.

Cheers,
Benjamin

> 
> :-)
> 
> Thx.
> 
> On Sat, May 03, 2025 at 06:40:41PM +0000, bugzilla-daemon@kernel.org wrote:
> > https://bugzilla.kernel.org/show_bug.cgi?id=220083
> > 
> >             Bug ID: 220083
> >            Summary: [REGRESSION, BISECTED] x86 ASM changes make
> >                     dispatch_hid_bpf_output_report access not-present page
> >            Product: Platform Specific/Hardware
> >            Version: 2.5
> >           Hardware: All
> >                 OS: Linux
> >             Status: NEW
> >           Severity: high
> >           Priority: P3
> >          Component: x86-64
> >           Assignee: platform_x86_64@kernel-bugs.osdl.org
> >           Reporter: i@rong.moe
> >         Regression: No
> > 
> > After upgrading from 6.14.x to 6.15-rc3, not-present page PF occurs each
> time I
> > unplug any of my Logitech Unifying receivers.
> > 
> > Upgrading to 6.15-rc4 did not fix the issue.
> > 
> > dmesg:
> > ```
> > [   48.726588] usb 7-1.4: USB disconnect, device number 7
> > [   48.856531] BUG: unable to handle page fault for address:
> ffff8a510ee72018
> > [   48.856543] #PF: supervisor write access in kernel mode
> > [   48.856547] #PF: error_code(0x0002) - not-present page
> > [   48.856550] PGD 365c01067 P4D 365c01067 PUD 0
> > [   48.856558] Oops: Oops: 0002 [#1] SMP NOPTI
> > [   48.856566] CPU: 0 UID: 0 PID: 7237 Comm: kworker/0:3 Tainted: G     U   
> >        6.15.0-rc4 #1 PREEMPT(lazy) 
> b3a8ad1950c71c15317e5ea614db6e274ecb0913
> > [   48.856574] Tainted: [U]=USER
> > [   48.856577] Hardware name: LENOVO 21Q4/LNVNB161216, BIOS PXCN24WW
> 03/11/2025
> > [   48.856579] Workqueue: events hidinput_led_worker
> > [   48.856589] RIP: 0010:__srcu_read_unlock+0x1a/0x30
> > [   48.856595] Code: c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 f3 0f
> 1e
> > fa 0f 1f 44 00 00 f0 83 44 24 fc 00 48 63 f6 48 c1 e6 04 48 03 77 08 <65>
> 48 ff
> > 46 08 c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 90
> > [   48.856598] RSP: 0018:ffffd037cc29fd88 EFLAGS: 00010202
> > [   48.856602] RAX: 0000000000000000 RBX: ffff8a4c6b16fe08 RCX:
> > 0000000000000000
> > [   48.856604] RDX: 0000000000000002 RSI: 0000000000000010 RDI:
> > ffff8a4c6b16fe38
> > [   48.856606] RBP: ffffd037cc29fdf8 R08: 0000000000000000 R09:
> > 00000000fffffffd
> > [   48.856607] R10: 0000000000000001 R11: 00000000ffffffff R12:
> > 0000000000000000
> > [   48.856609] R13: ffff8a4ac182dbc0 R14: 0000000000000001 R15:
> > 0000000000000000
> > [   48.856611] FS:  0000000000000000(0000) GS:ffff8a510ee72000(0000)
> > knlGS:0000000000000000
> > [   48.856613] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [   48.856614] CR2: ffff8a510ee72018 CR3: 0000000364c24000 CR4:
> > 0000000000f50ef0
> > [   48.856617] PKRU: 55555554
> > [   48.856618] Call Trace:
> > [   48.856621]  <TASK>
> > [   48.856623]  dispatch_hid_bpf_output_report+0xc5/0x100
> > [   48.856631]  hid_hw_output_report+0x46/0x90
> > [   48.856635]  hidinput_led_worker+0xa9/0xe0
> > [   48.856640]  process_one_work+0x18f/0x350
> > [   48.856646]  worker_thread+0x2d3/0x400
> > [   48.856650]  ? rescuer_thread+0x550/0x550
> > [   48.856654]  kthread+0xf9/0x240
> > [   48.856657]  ? kthreads_online_cpu+0x120/0x120
> > [   48.856661]  ret_from_fork+0x31/0x50
> > [   48.856665]  ? kthreads_online_cpu+0x120/0x120
> > [   48.856668]  ret_from_fork_asm+0x11/0x20
> > [   48.856674]  </TASK>
> > [   48.856675] Modules linked in: xt_mark tcp_diag inet_diag snd_hrtimer
> > snd_seq_dummy snd_seq_midi snd_seq_oss snd_seq_midi_event snd_seq uhid
> rfcomm
> > cmac algif_hash algif_skcipher af_alg xt_CHECKSUM xt_MASQUERADE
> xt_conntrack
> > ipt_REJECT nf_reject_ipv4 xt_tcpudp nft_compat nft_chain_nat nf_nat
> > nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 tun snd_usb_audio
> snd_usbmidi_lib
> > snd_ump snd_rawmidi snd_seq_device bridge stp llc nf_tables qrtr bnep
> overlay
> > sunrpc vfat fat uvcvideo videobuf2_vmalloc uvc videobuf2_memops
> videobuf2_v4l2
> > videobuf2_common btusb videodev btrtl btintel mc btbcm btmtk bluetooth
> amd_atl
> > intel_rapl_msr intel_rapl_common snd_acp_legacy_mach snd_acp_mach
> > snd_soc_nau8821 snd_acp3x_rn snd_acp70 snd_acp_i2s snd_acp_pdm snd_soc_dmic
> > snd_acp_pcm snd_sof_amd_acp70 snd_sof_amd_acp63 snd_sof_amd_vangogh
> > snd_sof_amd_rembrandt snd_sof_amd_renoir snd_sof_amd_acp snd_sof_pci
> > snd_sof_xtensa_dsp snd_sof snd_sof_utils snd_pci_ps snd_soc_acpi_amd_match
> > snd_amd_sdw_acpi soundwire_amd snd_hda_codec_realtek
> > [   48.856732]  soundwire_generic_allocation soundwire_bus
> > snd_hda_codec_generic snd_soc_sdca snd_hda_scodec_component
> snd_hda_codec_hdmi
> > snd_soc_core mt7925e snd_compress mt7925_common snd_hda_intel ac97_bus
> > mt792x_lib snd_intel_dspcfg snd_pcm_dmaengine mt76_connac_lib
> > snd_intel_sdw_acpi snd_rpl_pci_acp6x kvm_amd mt76 snd_hda_codec snd_acp_pci
> > think_lmi snd_amd_acpi_mach kvm snd_hda_core snd_acp_legacy_common
> > snd_pci_acp6x snd_hwdep mac80211 snd_pcm_oss snd_mixer_oss irqbypass
> > snd_pci_acp5x snd_ctl_led snd_pcm libarc4 rapl pcspkr
> firmware_attributes_class
> > snd_timer lenovo_wmi_hotkey_utilities snd_rn_pci_acp3x wmi_bmof cfg80211
> > snd_acp_config snd snd_soc_acpi k10temp hid_sensor_als spd5118 amdxdna
> amd_pmf
> > snd_pci_acp3x rfkill soundcore hid_sensor_trigger
> industrialio_triggered_buffer
> > amdtee kfifo_buf joydev hid_sensor_iio_common ccp industrialio amd_pmc
> > platform_profile mousedev mac_hid sch_fq_codel uinput i2c_dev parport_pc
> ppdev
> > lp parport nvme_fabrics nfnetlink ip_tables x_tables dm_crypt
> encrypted_keys
> > trusted
> > [   48.856786]  asn1_encoder tee dm_mod raid10 raid456 async_raid6_recov
> > async_memcpy async_pq async_xor async_tx raid1 raid0 linear md_mod igc ptp
> > pps_core uas usb_storage hid_logitech_hidpp r8153_ecm cdc_ether usbnet
> > hid_logitech_dj r8152 mii usbhid amdgpu i2c_algo_bit drm_ttm_helper ttm
> > drm_panel_backlight_quirks polyval_clmulni polyval_generic drm_exec
> > ghash_clmulni_intel drm_suballoc_helper amdxcp sha512_ssse3 sdhci_pci
> drm_buddy
> > sha256_ssse3 thunderbolt hid_sensor_custom r8169 sha1_ssse3 serio_raw
> > sp5100_tco sdhci_uhs2 gpu_sched nvme sdhci hid_multitouch realtek
> > hid_sensor_hub aesni_intel atkbd ucsi_acpi drm_display_helper hid_generic
> > nvme_core cqhci crypto_simd mdio_devres libps2 video typec_ucsi i2c_piix4
> > vivaldi_fmap cryptd nvme_keyring typec libphy mmc_core i2c_smbus i8042 cec
> > i2c_hid_acpi amd_sfh nvme_auth roles wmi serio i2c_hid
> > [   48.856843] CR2: ffff8a510ee72018
> > [   48.856846] ---[ end trace 0000000000000000 ]---
> > [   50.304586] RIP: 0010:__srcu_read_unlock+0x1a/0x30
> > [   50.304601] Code: c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 f3 0f
> 1e
> > fa 0f 1f 44 00 00 f0 83 44 24 fc 00 48 63 f6 48 c1 e6 04 48 03 77 08 <65>
> 48 ff
> > 46 08 c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 90
> > [   50.304603] RSP: 0018:ffffd037cc29fd88 EFLAGS: 00010202
> > [   50.304606] RAX: 0000000000000000 RBX: ffff8a4c6b16fe08 RCX:
> > 0000000000000000
> > [   50.304607] RDX: 0000000000000002 RSI: 0000000000000010 RDI:
> > ffff8a4c6b16fe38
> > [   50.304608] RBP: ffffd037cc29fdf8 R08: 0000000000000000 R09:
> > 00000000fffffffd
> > [   50.304609] R10: 0000000000000001 R11: 00000000ffffffff R12:
> > 0000000000000000
> > [   50.304610] R13: ffff8a4ac182dbc0 R14: 0000000000000001 R15:
> > 0000000000000000
> > [   50.304611] FS:  0000000000000000(0000) GS:ffff8a510ee72000(0000)
> > knlGS:0000000000000000
> > [   50.304612] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [   50.304613] CR2: ffff8a510ee72018 CR3: 0000000121904000 CR4:
> > 0000000000f50ef0
> > [   50.304615] PKRU: 55555554
> > [   50.304616] note: kworker/0:3[7237] exited with irqs disabled
> > ```
> > 
> > Bisect log:
> > 
> > ```
> > # good: [38fec10eb60d687e30c8c6b5420d86e8149f7557] Linux 6.14
> > git bisect good 38fec10eb60d687e30c8c6b5420d86e8149f7557
> > # bad: [9c32cda43eb78f78c73aee4aa344b777714e259b] Linux 6.15-rc3
> > git bisect bad 9c32cda43eb78f78c73aee4aa344b777714e259b
> > # bad: [4a4b30ea80d8cb5e8c4c62bb86201f4ea0d9b030] Merge tag
> > 'bcachefs-2025-03-24' of git://evilpiepirate.org/bcachefs
> > git bisect bad 4a4b30ea80d8cb5e8c4c62bb86201f4ea0d9b030
> > # bad: [1e1ba8d23dae91e6a9cfeb1236b02749e8a49ab3] Merge tag
> > 'timers-clocksource-2025-03-26' of
> > git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
> > git bisect bad 1e1ba8d23dae91e6a9cfeb1236b02749e8a49ab3
> > # skip: [21e0ff5b10ec1b61fda435d42db4ba80d0cdfded] Merge tag
> 'acpi-6.15-rc1' of
> > git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
> > git bisect skip 21e0ff5b10ec1b61fda435d42db4ba80d0cdfded
> > # good: [47c4f9b1722fd883c9745d7877cb212e41dd2715] Tidy up ASoC control get
> and
> > put handlers
> > git bisect good 47c4f9b1722fd883c9745d7877cb212e41dd2715
> > # bad: [2899aa3973efa3b0a7005cb7fb60475ea0c3b8a0] Merge tag
> > 'x86_cache_for_v6.15' of
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
> > git bisect bad 2899aa3973efa3b0a7005cb7fb60475ea0c3b8a0
> > # good: [5a658afd468b0fb55bf5f45c9788ee8dc87ba463] Merge tag
> > 'objtool-core-2025-03-22' of
> > git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
> > git bisect good 5a658afd468b0fb55bf5f45c9788ee8dc87ba463
> > # bad: [a49a879f0ac19ed0a562e220019741857b261551] Merge tag
> > 'x86-cleanups-2025-03-22' of
> > git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
> > git bisect bad a49a879f0ac19ed0a562e220019741857b261551
> > # bad: [9a93e29f16bbba90a63faad0abbc6dea3b2f0c63] x86/syscall: Move
> > sys_ni_syscall()
> > git bisect bad 9a93e29f16bbba90a63faad0abbc6dea3b2f0c63
> > # bad: [cfdaa618defc5ebe1ee6aa5bd40a7ccedffca6de] Merge branch 'x86/cpu'
> into
> > x86/asm, to pick up dependent commits
> > git bisect bad cfdaa618defc5ebe1ee6aa5bd40a7ccedffca6de
> > # good: [c4a8b7116b9927f7b00bd68140e285662a03068e] perf/x86/intel: Use
> cache
> > cpu-type for hybrid PMU selection
> > git bisect good c4a8b7116b9927f7b00bd68140e285662a03068e
> > # good: [4f2a0b765c9731d2fa94e209ee9ae0e96b280f17] <linux/sizes.h>: Cover
> all
> > possible x86 CPU cache sizes
> > git bisect good 4f2a0b765c9731d2fa94e209ee9ae0e96b280f17
> > # bad: [95b0916118106054e1f3d5d7f8628ef3dc0b3c02] percpu: Remove
> > PER_CPU_FIRST_SECTION
> > git bisect bad 95b0916118106054e1f3d5d7f8628ef3dc0b3c02
> > # skip: [78c4374ef8b842c6abf195d6f963853c7ec464d2] x86/module: Deal with
> GOT
> > based stack cookie load on Clang < 17
> > git bisect skip 78c4374ef8b842c6abf195d6f963853c7ec464d2
> > # bad: [b5c4f95351a097a635c1a7fc8d9efa18308491b5] x86/percpu/64: Remove
> > fixed_percpu_data
> > git bisect bad b5c4f95351a097a635c1a7fc8d9efa18308491b5
> > # skip: [cb7927fda002ca49ae62e2782c1692acc7b80c67] x86/relocs: Handle
> > R_X86_64_REX_GOTPCRELX relocations
> > git bisect skip cb7927fda002ca49ae62e2782c1692acc7b80c67
> > # skip: [80d47defddc000271502057ebd7efa4fd6481542] x86/stackprotector/64:
> > Convert to normal per-CPU variable
> > git bisect skip 80d47defddc000271502057ebd7efa4fd6481542
> > # skip: [f58b63857ae38b4484185b799a2759274b930c92] x86/pvh: Use
> > fixed_percpu_data for early boot GSBASE
> > git bisect skip f58b63857ae38b4484185b799a2759274b930c92
> > # good: [0ee2689b9374d6fd5f43b703713a532278654749] x86/stackprotector:
> Remove
> > stack protector test scripts
> > git bisect good 0ee2689b9374d6fd5f43b703713a532278654749
> > # bad: [9d7de2aa8b41407bc96d89a80dc1fd637d389d42] x86/percpu/64: Use
> relative
> > percpu offsets
> > git bisect bad 9d7de2aa8b41407bc96d89a80dc1fd637d389d42
> > # good: [a9a76b38aaf577887103e3ebb41d70e6aa5a4b19] x86/boot: Disable stack
> > protector for early boot code
> > git bisect good a9a76b38aaf577887103e3ebb41d70e6aa5a4b19
> > # only skipped commits left to test
> > # possible first bad commit: [9d7de2aa8b41407bc96d89a80dc1fd637d389d42]
> > x86/percpu/64: Use relative percpu offsets
> > # possible first bad commit: [80d47defddc000271502057ebd7efa4fd6481542]
> > x86/stackprotector/64: Convert to normal per-CPU variable
> > # possible first bad commit: [78c4374ef8b842c6abf195d6f963853c7ec464d2]
> > x86/module: Deal with GOT based stack cookie load on Clang < 17
> > # possible first bad commit: [cb7927fda002ca49ae62e2782c1692acc7b80c67]
> > x86/relocs: Handle R_X86_64_REX_GOTPCRELX relocations
> > # possible first bad commit: [f58b63857ae38b4484185b799a2759274b930c92]
> > x86/pvh: Use fixed_percpu_data for early boot GSBASE
> > ```
> > 
> > There is a typo in commit f58b63857ae3 ("x86/pvh: Use fixed_percpu_data for
> > early boot GSBASE"), resulting in compilation failure.
> > With the patch below, I bisected again:
> > 
> > ```
> > diff --git a/arch/x86/platform/pvh/head.S b/arch/x86/platform/pvh/head.S
> > index 723f181b222a..f1a8392a4835 100644
> > --- a/arch/x86/platform/pvh/head.S
> > +++ b/arch/x86/platform/pvh/head.S
> > @@ -180,7 +180,7 @@ SYM_CODE_START(pvh_start_xen)
> >          */
> >         movl $MSR_GS_BASE,%ecx
> >         leaq INIT_PER_CPU_VAR(fixed_percpu_data)(%rip), %rdx
> > -       movq %edx, %eax
> > +       movl %edx, %eax
> >         shrq $32, %rdx
> >         wrmsr
> > ```
> > 
> > New bisect log:
> > 
> > ```
> > [...]
> > # good: [a9a76b38aaf577887103e3ebb41d70e6aa5a4b19] x86/boot: Disable stack
> > protector for early boot code
> > git bisect good a9a76b38aaf577887103e3ebb41d70e6aa5a4b19
> > # good: [78c4374ef8b842c6abf195d6f963853c7ec464d2] x86/module: Deal with
> GOT
> > based stack cookie load on Clang < 17
> > git bisect good 78c4374ef8b842c6abf195d6f963853c7ec464d2
> > # good: [80d47defddc000271502057ebd7efa4fd6481542] x86/stackprotector/64:
> > Convert to normal per-CPU variable
> > git bisect good 80d47defddc000271502057ebd7efa4fd6481542
> > # first bad commit: [9d7de2aa8b41407bc96d89a80dc1fd637d389d42]
> x86/percpu/64:
> > Use relative percpu offsets
> > ```
> > 
> > The bad commit 9d7de2aa8b41 ("x86/percpu/64: Use relative percpu offsets")
> > first appeared in v6.15-rc1.
> > 
> > Got dmesg below by building and booting the bad commit, then unplugging the
> > receiver:
> > 
> > ```
> > [  560.223095] BUG: unable to handle page fault for address: ffff9acf2b889008
> > [  560.223174] #PF: supervisor write access in kernel mode
> > [  560.223299] #PF: error_code(0x0002) - not-present page
> > [  560.223332] PGD 43e401067 P4D 43e401067 PUD 0
> > [  560.223353] Oops: Oops: 0002 [#1] PREEMPT SMP NOPTI
> > [  560.223359] CPU: 0 UID: 0 PID: 8212 Comm: kworker/0:3 Tainted: G     U              6.14.0-rc3+ #1 ab962f3b7921227b62db2503d8ec7411fa694628
> > [  560.223364] Tainted: [U]=USER
> > [  560.223369] Hardware name: LENOVO 21Q4/LNVNB161216, BIOS PXCN24WW 03/11/2025
> > [  560.223378] Workqueue: events hidinput_led_worker
> > [  560.223382] RIP: 0010:__srcu_read_lock+0x14/0x30
> > [  560.223387] Code: 0f 0b eb bc 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 0f 1f 44 00 00 8b 07 48 8b 57 08 83 e0 01 89 c1 <65> 48 ff 04 ca f0 83 44 24 fc 00 c3 cc cc cc cc 66 66 2e 0f 1f 84
> > [  560.223392] RSP: 0018:ffffb7df8d24fd88 EFLAGS: 00010202
> > [  560.223396] RAX: 0000000000000001 RBX: ffff9ac82f80de08 RCX: 0000000000000001
> > [  560.223401] RDX: 0000000000000000 RSI: ffff9ac8fd276f40 RDI: ffff9ac82f80de38
> > [  560.223407] RBP: ffffb7df8d24fdf8 R08: 0000000000000000 R09: 00000000fffffffd
> > [  560.223412] R10: 0000000000000001 R11: 00000000ffffffff R12: 0000000000000000
> > [  560.223417] R13: ffff9ac8fd276f40 R14: 000000000000000e R15: 0000000000000000
> > [  560.223421] FS:  0000000000000000(0000) GS:ffff9acf2b889000(0000) knlGS:0000000000000000
> > [  560.223426] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [  560.223430] CR2: ffff9acf2b889008 CR3: 00000001e1c40000 CR4: 0000000000f50ef0
> > [  560.223434] PKRU: 55555554
> > [  560.223439] Call Trace:
> > [  560.223444]  <TASK>
> > [  560.223449]  ? __die_body.cold+0x19/0x29
> > [  560.223453]  ? page_fault_oops+0x15a/0x2e0
> > [  560.223458]  ? search_bpf_extables+0x5f/0x80
> > [  560.223462]  ? exc_page_fault+0x1a3/0x1b0
> > [  560.223466]  ? asm_exc_page_fault+0x26/0x30
> > [  560.223471]  ? __srcu_read_lock+0x14/0x30
> > [  560.223475]  ? psi_task_switch+0xb7/0x200
> > [  560.223480]  dispatch_hid_bpf_output_report+0x73/0x100
> > [  560.223485]  hid_hw_output_report+0x46/0x90
> > [  560.223490]  hidinput_led_worker+0xa9/0xe0
> > [  560.223494]  process_one_work+0x17b/0x330
> > [  560.223498]  worker_thread+0x2ce/0x3f0
> > [  560.223503]  ? rescuer_thread+0x530/0x530
> > [  560.223507]  kthread+0xeb/0x230
> > [  560.223512]  ? kthreads_online_cpu+0x120/0x120
> > [  560.223516]  ret_from_fork+0x31/0x50
> > [  560.223522]  ? kthreads_online_cpu+0x120/0x120
> > [  560.223528]  ret_from_fork_asm+0x11/0x20
> > [  560.223532]  </TASK>
> > [  560.223538] Modules linked in: tcp_diag inet_diag xt_mark snd_hrtimer
> > snd_seq_dummy snd_seq_midi snd_seq_oss snd_seq_midi_event snd_seq uhid
> rfcomm
> > cmac algif_hash algif_skcipher af_alg xt_CHECKSUM xt_MASQUERADE
> xt_conntrack
> > ipt_REJECT nf_reject_ipv4 xt_tcpudp nft_compat nft_chain_nat nf_nat
> > nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 tun bridge stp llc nf_tables
> > snd_usb_audio snd_usbmidi_lib snd_ump snd_rawmidi snd_seq_device qrtr bnep
> > overlay sunrpc vfat fat uvcvideo videobuf2_vmalloc uvc videobuf2_memops
> btusb
> > videobuf2_v4l2 btrtl videobuf2_common btintel btbcm videodev btmtk mc
> bluetooth
> > snd_acp_legacy_mach snd_acp_mach snd_soc_nau8821 snd_acp3x_rn snd_acp70
> > snd_acp_i2s snd_acp_pdm snd_soc_dmic snd_acp_pcm snd_sof_amd_acp70
> > snd_sof_amd_acp63 snd_sof_amd_vangogh snd_sof_amd_rembrandt
> snd_sof_amd_renoir
> > snd_sof_amd_acp intel_rapl_msr amd_atl snd_sof_pci intel_rapl_common
> > snd_sof_xtensa_dsp snd_sof snd_sof_utils snd_pci_ps snd_soc_acpi_amd_match
> > snd_amd_sdw_acpi soundwire_amd soundwire_generic_allocation snd_ctl_led
> > [  560.223612]  soundwire_bus snd_soc_sdca snd_hda_codec_realtek
> > snd_hda_codec_generic snd_soc_core mt7925e snd_hda_scodec_component
> > mt7925_common snd_compress mt792x_lib snd_hda_codec_hdmi ac97_bus
> snd_hda_intel
> > mt76_connac_lib snd_pcm_dmaengine snd_intel_dspcfg mt76 snd_rpl_pci_acp6x
> > snd_intel_sdw_acpi snd_hda_codec kvm_amd snd_acp_pci think_lmi snd_hda_core
> > snd_acp_legacy_common mac80211 kvm snd_pci_acp6x snd_hwdep snd_pcm_oss
> > snd_mixer_oss snd_pci_acp5x libarc4 amd_pmf rapl pcspkr
> > firmware_attributes_class wmi_bmof hid_sensor_als amdtee snd_pcm
> > hid_sensor_trigger snd_rn_pci_acp3x cfg80211 industrialio_triggered_buffer
> > snd_timer joydev snd_acp_config kfifo_buf spd5118 snd snd_soc_acpi
> > hid_sensor_iio_common ccp soundcore snd_pci_acp3x rfkill platform_profile
> > amdxdna k10temp industrialio amd_pmc mousedev mac_hid sch_fq_codel uinput
> > i2c_dev parport_pc ppdev lp parport nvme_fabrics nvme_keyring nfnetlink
> > ip_tables x_tables dm_crypt encrypted_keys trusted asn1_encoder tee dm_mod
> > raid10 raid456 async_raid6_recov
> > [  560.223631]  async_memcpy async_pq async_xor async_tx raid1 raid0 linear
> > md_mod igc ptp pps_core uas usb_storage hid_logitech_hidpp r8153_ecm
> cdc_ether
> > usbnet hid_logitech_dj r8152 mii usbhid amdgpu i2c_algo_bit drm_ttm_helper
> ttm
> > drm_panel_backlight_quirks polyval_clmulni drm_exec polyval_generic
> > ghash_clmulni_intel drm_suballoc_helper sha512_ssse3 amdxcp
> hid_sensor_custom
> > serio_raw sha256_ssse3 drm_buddy sdhci_pci ucsi_acpi atkbd nvme
> hid_multitouch
> > r8169 sha1_ssse3 sp5100_tco hid_sensor_hub typec_ucsi libps2 gpu_sched
> > sdhci_uhs2 vivaldi_fmap aesni_intel nvme_core sdhci hid_generic realtek
> typec
> > drm_display_helper video i8042 crypto_simd i2c_piix4 mdio_devres cqhci
> cryptd
> > thunderbolt mmc_core libphy cec amd_sfh nvme_auth roles i2c_smbus serio
> > i2c_hid_acpi wmi i2c_hid
> > [  560.223646] CR2: ffff9acf2b889008
> > [  560.223650] ---[ end trace 0000000000000000 ]---
> > [  560.223655] RIP: 0010:__srcu_read_lock+0x14/0x30
> > [  560.223660] Code: 0f 0b eb bc 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 0f 1f 44 00 00 8b 07 48 8b 57 08 83 e0 01 89 c1 <65> 48 ff 04 ca f0 83 44 24 fc 00 c3 cc cc cc cc 66 66 2e 0f 1f 84
> > [  560.223664] RSP: 0018:ffffb7df8d24fd88 EFLAGS: 00010202
> > [  560.223670] RAX: 0000000000000001 RBX: ffff9ac82f80de08 RCX: 0000000000000001
> > [  560.223674] RDX: 0000000000000000 RSI: ffff9ac8fd276f40 RDI: ffff9ac82f80de38
> > [  560.223679] RBP: ffffb7df8d24fdf8 R08: 0000000000000000 R09: 00000000fffffffd
> > [  560.223683] R10: 0000000000000001 R11: 00000000ffffffff R12: 0000000000000000
> > [  560.223687] R13: ffff9ac8fd276f40 R14: 000000000000000e R15: 0000000000000000
> > [  560.223692] FS:  0000000000000000(0000) GS:ffff9acf2b889000(0000) knlGS:0000000000000000
> > [  560.223696] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [  560.223700] CR2: ffff9acf2b889008 CR3: 00000001e1c40000 CR4: 0000000000f50ef0
> > [  560.223704] PKRU: 55555554
> > [  560.223709] note: kworker/0:3[8212] exited with irqs disabled
> > ```
> > 
> 
> -- 
> Regards/Gruss,
>     Boris.
> 
> https://people.kernel.org/tglx/notes-about-netiquette
Comment 8 Rong Zhang 2025-05-07 11:04:28 UTC
Created attachment 308103
6.15-rc5_per-cpu-pf_bpftrace

Hi Benjamin,

On Tue, 2025-05-06 at 17:35 +0200, Benjamin Tissoires wrote:
> Hi Boris,
> 
> On May 06 2025, Borislav Petkov wrote:
> > Switching to mail.
> > 
> > Hi Benjamin,
> > 
> > take a look at the below pls.
> > 
> > The RIP points to:
> > 
> >   22:   48 c1 e6 04             shl    $0x4,%rsi
> >   26:   48 03 77 08             add    0x8(%rdi),%rsi
> >   2a:*  65 48 ff 46 08          incq   %gs:0x8(%rsi)            <-- trapping instruction
> >   2f:   c3                      ret
> > 
> > which really is a %gs-based access and the reporter has bisected this to
> > 
> >   9d7de2aa8b41 ("x86/percpu/64: Use relative percpu offsets")
> > 
> > which looks related.
> > 
> > My silly guess would be some bpf program does per-cpu accesses but it doesn't
> > know about this change so it tramples over registers. I mean, my fix would be
> > to disable BPF but you young kids love to play with that...
> 
> Heh. Well, I would like to know if any HID-BPF program is loaded first.
> These can be seen by running `sudo tree /sys/fs/bpf/hid/`.

Nothing is there.

$ sudo tree /sys/fs/bpf/hid/
/sys/fs/bpf/hid/  [error opening dir]

0 directories, 0 files
$ sudo tree /sys/fs/bpf/
/sys/fs/bpf/

0 directories, 0 files

> `sudo bpftool prog` is another option in case udev-hid-bpf is not used.

$ sudo bpftool prog
21: lsm  name restrict_filesystems  tag aae89fa01fe7ee91  gpl
        loaded_at 2025-05-07T10:23:09+0800  uid 0
        xlated 560B  jited 305B  memlock 4096B  map_ids 11
        btf_id 104
22: cgroup_device  name sd_devices  tag 40ddf486530245f5  gpl
        loaded_at 2025-05-07T10:23:10+0800  uid 0
        xlated 504B  jited 318B  memlock 4096B
23: cgroup_skb  name sd_fw_egress  tag 6deef7357e7b4530  gpl
        loaded_at 2025-05-07T10:23:10+0800  uid 0
        xlated 64B  jited 63B  memlock 4096B
24: cgroup_skb  name sd_fw_ingress  tag 6deef7357e7b4530  gpl
        loaded_at 2025-05-07T10:23:10+0800  uid 0
        xlated 64B  jited 63B  memlock 4096B
25: cgroup_device  name sd_devices  tag be31ae23198a0378  gpl
        loaded_at 2025-05-07T10:23:10+0800  uid 0
        xlated 464B  jited 297B  memlock 4096B
26: cgroup_skb  name sd_fw_egress  tag 6deef7357e7b4530  gpl
        loaded_at 2025-05-07T10:23:10+0800  uid 0
        xlated 64B  jited 63B  memlock 4096B
27: cgroup_skb  name sd_fw_ingress  tag 6deef7357e7b4530  gpl
        loaded_at 2025-05-07T10:23:10+0800  uid 0
        xlated 64B  jited 63B  memlock 4096B
28: cgroup_device  name sd_devices  tag ee0e253c78993a24  gpl
        loaded_at 2025-05-07T10:23:10+0800  uid 0
        xlated 416B  jited 267B  memlock 4096B
29: cgroup_skb  name sd_fw_egress  tag 6deef7357e7b4530  gpl
        loaded_at 2025-05-07T10:23:10+0800  uid 0
        xlated 64B  jited 63B  memlock 4096B
30: cgroup_skb  name sd_fw_ingress  tag 6deef7357e7b4530  gpl
        loaded_at 2025-05-07T10:23:10+0800  uid 0
        xlated 64B  jited 63B  memlock 4096B
31: cgroup_device  name sd_devices  tag ee0e253c78993a24  gpl
        loaded_at 2025-05-07T10:23:12+0800  uid 0
        xlated 416B  jited 267B  memlock 4096B
32: cgroup_device  name sd_devices  tag ee0e253c78993a24  gpl
        loaded_at 2025-05-07T10:23:12+0800  uid 0
        xlated 416B  jited 267B  memlock 4096B
33: cgroup_device  name sd_devices  tag b37200ab714f0e17  gpl
        loaded_at 2025-05-07T10:23:12+0800  uid 0
        xlated 184B  jited 110B  memlock 4096B
34: cgroup_device  name sd_devices  tag 738e6ebf4499a83a  gpl
        loaded_at 2025-05-07T10:23:12+0800  uid 0
        xlated 792B  jited 489B  memlock 4096B
35: cgroup_skb  name sd_fw_egress  tag 6deef7357e7b4530  gpl
        loaded_at 2025-05-07T10:23:12+0800  uid 0
        xlated 64B  jited 63B  memlock 4096B
36: cgroup_skb  name sd_fw_ingress  tag 6deef7357e7b4530  gpl
        loaded_at 2025-05-07T10:23:12+0800  uid 0
        xlated 64B  jited 63B  memlock 4096B
37: cgroup_skb  name sd_fw_egress  tag 6deef7357e7b4530  gpl
        loaded_at 2025-05-07T10:23:12+0800  uid 0
        xlated 64B  jited 63B  memlock 4096B
38: cgroup_skb  name sd_fw_ingress  tag 6deef7357e7b4530  gpl
        loaded_at 2025-05-07T10:23:12+0800  uid 0
        xlated 64B  jited 63B  memlock 4096B
39: cgroup_device  name sd_devices  tag ee0e253c78993a24  gpl
        loaded_at 2025-05-07T10:23:12+0800  uid 0
        xlated 416B  jited 267B  memlock 4096B
41: cgroup_skb  name sd_fw_egress  tag 6deef7357e7b4530  gpl
        loaded_at 2025-05-07T10:23:12+0800  uid 0
        xlated 64B  jited 63B  memlock 4096B
42: cgroup_skb  name sd_fw_ingress  tag 6deef7357e7b4530  gpl
        loaded_at 2025-05-07T10:23:12+0800  uid 0
        xlated 64B  jited 63B  memlock 4096B
43: cgroup_device  name sd_devices  tag be31ae23198a0378  gpl
        loaded_at 2025-05-07T10:23:13+0800  uid 0
        xlated 464B  jited 297B  memlock 4096B
44: cgroup_device  name sd_devices  tag be31ae23198a0378  gpl
        loaded_at 2025-05-07T10:23:13+0800  uid 0
        xlated 464B  jited 297B  memlock 4096B
45: cgroup_skb  name sd_fw_egress  tag 6deef7357e7b4530  gpl
        loaded_at 2025-05-07T10:23:13+0800  uid 0
        xlated 64B  jited 63B  memlock 4096B
46: cgroup_skb  name sd_fw_ingress  tag 6deef7357e7b4530  gpl
        loaded_at 2025-05-07T10:23:13+0800  uid 0
        xlated 64B  jited 63B  memlock 4096B
50: cgroup_skb  name sd_fw_egress  tag 6deef7357e7b4530  gpl
        loaded_at 2025-05-07T10:23:14+0800  uid 0
        xlated 64B  jited 63B  memlock 4096B
51: cgroup_skb  name sd_fw_ingress  tag 6deef7357e7b4530  gpl
        loaded_at 2025-05-07T10:23:14+0800  uid 0
        xlated 64B  jited 63B  memlock 4096B

Though I am not familiar with systemd's BPF programs, given that they
are lsm/cgroup-related, I guess they don't aim to handle raw HID
requests.
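
A quick way to double-check that guess is to tally the program types in the
`bpftool prog` listing. A minimal sketch in Python, assuming the plain-text
layout shown above (an unindented `ID: type  name ...` header line followed by
indented detail lines). `SAMPLE` holds only the first four entries of the
listing, and `tally_prog_types` is a made-up helper name:

```python
import re
from collections import Counter

# First four entries copied from the listing above; detail lines are indented.
SAMPLE = """\
21: lsm  name restrict_filesystems  tag aae89fa01fe7ee91  gpl
        loaded_at 2025-05-07T10:23:09+0800  uid 0
        xlated 560B  jited 305B  memlock 4096B  map_ids 11
        btf_id 104
22: cgroup_device  name sd_devices  tag 40ddf486530245f5  gpl
        loaded_at 2025-05-07T10:23:10+0800  uid 0
        xlated 504B  jited 318B  memlock 4096B
23: cgroup_skb  name sd_fw_egress  tag 6deef7357e7b4530  gpl
        loaded_at 2025-05-07T10:23:10+0800  uid 0
        xlated 64B  jited 63B  memlock 4096B
24: cgroup_skb  name sd_fw_ingress  tag 6deef7357e7b4530  gpl
        loaded_at 2025-05-07T10:23:10+0800  uid 0
        xlated 64B  jited 63B  memlock 4096B
"""

# Header lines start at column 0 with "<id>: <type>  name <name>".
HEADER = re.compile(r"^(\d+):\s+(\S+)\s+name\s+(\S+)")


def tally_prog_types(text):
    """Count programs per type; only unindented header lines are matched."""
    counts = Counter()
    for line in text.splitlines():
        match = HEADER.match(line)
        if match:
            counts[match.group(2)] += 1
    return counts


counts = tally_prog_types(SAMPLE)
print(dict(counts))
```

Every entry in the full listing above is one of these three types (lsm,
cgroup_device, cgroup_skb).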

> If there is no hid-bpf program loaded, then it seems the code path in
> drivers/hid/bpf/hid_bpf_dispatch.c:133 is:
> 
> ```
>       idx = srcu_read_lock(&hdev->bpf.srcu);
>       list_for_each_entry_srcu(e, &hdev->bpf.prog_list, list,
>                                srcu_read_lock_held(&hdev->bpf.srcu)) {
>               // nothing happens here because the list is empty
>       }
>       ret = 0;
> 
> out:
>       srcu_read_unlock(&hdev->bpf.srcu, idx);
> ```
> 
> So we are just in srcu_read_lock()/srcu_read_unlock() which is unlikely
> to fail...

In case you need it, I decoded a stacktrace (I've upgraded to 6.15-rc5
BTW):

[14591.438053] usb 7-1.4.4: USB disconnect, device number 7
[14591.541666] BUG: unable to handle page fault for address: ffff8efd88e65018
[14591.541674] #PF: supervisor write access in kernel mode
[14591.541676] #PF: error_code(0x0002) - not-present page
[14591.541677] PGD 220801067 P4D 220801067 PUD 0
[14591.541681] Oops: Oops: 0002 [#1] SMP NOPTI
[14591.541684] CPU: 0 UID: 0 PID: 56816 Comm: kworker/0:2 Not tainted 6.15.0-rc5 #1 PREEMPT(lazy)  0538d36f9cfa2dbc3c98efb2730490d8b2399dc4
[14591.541687] Hardware name: LENOVO 21Q4/LNVNB161216, BIOS PXCN24WW 03/11/2025
[14591.541689] Workqueue: events hidinput_led_worker
[14591.541693] RIP: 0010:__srcu_read_unlock (kernel/rcu/srcutree.c:768 (discriminator 1)) 
[14591.541697] Code: c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 0f 1f 44 00 00 f0 83 44 24 fc 00 48 63 f6 48 c1 e6 04 48 03 77 08 <65> 48 ff 46 08 c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 90
All code
========
   0:	c3                   	ret
   1:	cc                   	int3
   2:	cc                   	int3
   3:	cc                   	int3
   4:	cc                   	int3
   5:	66 66 2e 0f 1f 84 00 	data16 cs nopw 0x0(%rax,%rax,1)
   c:	00 00 00 00 
  10:	f3 0f 1e fa          	endbr64
  14:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)
  19:	f0 83 44 24 fc 00    	lock addl $0x0,-0x4(%rsp)
  1f:	48 63 f6             	movslq %esi,%rsi
  22:	48 c1 e6 04          	shl    $0x4,%rsi
  26:	48 03 77 08          	add    0x8(%rdi),%rsi
  2a:*	65 48 ff 46 08       	incq   %gs:0x8(%rsi)		<-- trapping instruction
  2f:	c3                   	ret
  30:	cc                   	int3
  31:	cc                   	int3
  32:	cc                   	int3
  33:	cc                   	int3
  34:	66 66 2e 0f 1f 84 00 	data16 cs nopw 0x0(%rax,%rax,1)
  3b:	00 00 00 00 
  3f:	90                   	nop

Code starting with the faulting instruction
===========================================
   0:	65 48 ff 46 08       	incq   %gs:0x8(%rsi)
   5:	c3                   	ret
   6:	cc                   	int3
   7:	cc                   	int3
   8:	cc                   	int3
   9:	cc                   	int3
   a:	66 66 2e 0f 1f 84 00 	data16 cs nopw 0x0(%rax,%rax,1)
  11:	00 00 00 00 
  15:	90                   	nop
[14591.541698] RSP: 0018:ffffd0c6094f7d88 EFLAGS: 00010202
[14591.541700] RAX: 0000000000000000 RBX: ffff8ef67492be08 RCX: 0000000000000000
[14591.541701] RDX: 0000000000000002 RSI: 0000000000000010 RDI: ffff8ef67492be38
[14591.541702] RBP: ffffd0c6094f7df8 R08: 0000000000000000 R09: 00000000fffffffd
[14591.541703] R10: 0000000000000001 R11: 00000000ffffffff R12: 0000000000000000
[14591.541703] R13: ffff8ef70d8143d0 R14: 0000000000000001 R15: 0000000000000000
[14591.541704] FS:  0000000000000000(0000) GS:ffff8efd88e65000(0000) knlGS:0000000000000000
[14591.541705] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[14591.541706] CR2: ffff8efd88e65018 CR3: 00000001d0184000 CR4: 0000000000f50ef0
[14591.541707] PKRU: 55555554
[14591.541708] Call Trace:
[14591.541710]  <TASK>
[14591.541711] dispatch_hid_bpf_output_report (drivers/hid/bpf/hid_bpf_dispatch.c:148) 
[14591.541716] hid_hw_output_report (drivers/hid/hid-core.c:2500 drivers/hid/hid-core.c:2520) 
[14591.541717] hidinput_led_worker (drivers/hid/hid-input.c:1838) 
[14591.541719] process_one_work (kernel/workqueue.c:3238) 
[14591.541721] worker_thread (kernel/workqueue.c:3313 (discriminator 2) kernel/workqueue.c:3400 (discriminator 2)) 
[14591.541723] ? rescuer_thread (kernel/workqueue.c:3346) 
[14591.541724] kthread (kernel/kthread.c:464) 
[14591.541727] ? kthreads_online_cpu (kernel/kthread.c:413) 
[14591.541729] ret_from_fork (arch/x86/kernel/process.c:153) 
[14591.541731] ? kthreads_online_cpu (kernel/kthread.c:413) 
[14591.541733] ret_from_fork_asm (arch/x86/entry/entry_64.S:255) 
[14591.541737]  </TASK>
[14591.541738] Modules linked in: mmc_block rpmb_core udp_diag tcp_diag inet_diag xt_mark ccm snd_hrtimer snd_seq_dummy snd_seq_midi snd_seq_oss snd_seq_midi_event snd_rawmidi snd_seq snd_seq_device xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 tun bridge stp llc nf_tables qrtr uhid rfcomm cmac algif_hash algif_skcipher af_alg overlay bnep sunrpc vfat fat btusb uvcvideo btrtl videobuf2_vmalloc btintel uvc videobuf2_memops btbcm videobuf2_v4l2 btmtk videobuf2_common bluetooth videodev mc intel_rapl_msr amd_atl intel_rapl_common snd_acp_legacy_mach snd_acp_mach snd_soc_nau8821 snd_acp3x_rn snd_acp70 snd_acp_i2s snd_acp_pdm snd_soc_dmic snd_acp_pcm snd_sof_amd_acp70 snd_sof_amd_acp63 snd_sof_amd_vangogh snd_sof_amd_rembrandt snd_sof_amd_renoir snd_sof_amd_acp snd_sof_pci snd_sof_xtensa_dsp snd_sof snd_sof_utils snd_pci_ps snd_soc_acpi_amd_match snd_hda_codec_realtek snd_amd_sdw_acpi soundwire_amd snd_hda_codec_generic
[14591.541775]  soundwire_generic_allocation snd_hda_scodec_component soundwire_bus snd_soc_sdca snd_hda_codec_hdmi snd_soc_core mt7925e snd_hda_intel snd_compress mt7925_common ac97_bus snd_intel_dspcfg kvm_amd mt792x_lib snd_pcm_dmaengine snd_intel_sdw_acpi snd_rpl_pci_acp6x mt76_connac_lib snd_hda_codec snd_acp_pci kvm mt76 snd_amd_acpi_mach snd_hda_core snd_acp_legacy_common irqbypass think_lmi snd_pci_acp6x snd_hwdep rapl snd_ctl_led pcspkr mac80211 snd_pcm_oss firmware_attributes_class lenovo_wmi_hotkey_utilities snd_mixer_oss snd_pci_acp5x libarc4 snd_pcm wmi_bmof snd_rn_pci_acp3x spd5118 snd_timer snd_acp_config cfg80211 snd snd_soc_acpi hid_sensor_als soundcore amdxdna amd_pmf hid_sensor_trigger snd_pci_acp3x k10temp rfkill industrialio_triggered_buffer amdtee kfifo_buf joydev hid_sensor_iio_common ccp industrialio mousedev platform_profile amd_pmc mac_hid sch_fq_codel uinput i2c_dev parport_pc ppdev lp parport nvme_fabrics nfnetlink ip_tables x_tables hid_logitech_hidpp hid_logitech_dj usbhid dm_crypt
[14591.541811]  encrypted_keys trusted asn1_encoder tee dm_mod raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid1 raid0 linear md_mod igc ptp pps_core r8153_ecm r8152 cdc_ether usbnet mii amdgpu i2c_algo_bit drm_ttm_helper ttm drm_panel_backlight_quirks polyval_clmulni polyval_generic drm_exec drm_suballoc_helper ghash_clmulni_intel amdxcp sha512_ssse3 drm_buddy sdhci_pci sha256_ssse3 sp5100_tco r8169 nvme sdhci_uhs2 gpu_sched sha1_ssse3 serio_raw hid_sensor_custom sdhci nvme_core realtek aesni_intel ucsi_acpi atkbd drm_display_helper cqhci libps2 crypto_simd typec_ucsi hid_multitouch i2c_piix4 nvme_keyring mdio_devres hid_sensor_hub hid_generic thunderbolt vivaldi_fmap cryptd typec cec libphy mmc_core amd_sfh video i8042 nvme_auth i2c_smbus roles i2c_hid_acpi serio wmi i2c_hid
[14591.541846] CR2: ffff8efd88e65018
[14591.541848] ---[ end trace 0000000000000000 ]---
[14591.733025] pstore: backend (efi_pstore) writing error (-28)
[14591.733031] RIP: 0010:__srcu_read_unlock (kernel/rcu/srcutree.c:768 (discriminator 1)) 
[14591.733037] Code: c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 0f 1f 44 00 00 f0 83 44 24 fc 00 48 63 f6 48 c1 e6 04 48 03 77 08 <65> 48 ff 46 08 c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 90
All code
========
   0:	c3                   	ret
   1:	cc                   	int3
   2:	cc                   	int3
   3:	cc                   	int3
   4:	cc                   	int3
   5:	66 66 2e 0f 1f 84 00 	data16 cs nopw 0x0(%rax,%rax,1)
   c:	00 00 00 00 
  10:	f3 0f 1e fa          	endbr64
  14:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)
  19:	f0 83 44 24 fc 00    	lock addl $0x0,-0x4(%rsp)
  1f:	48 63 f6             	movslq %esi,%rsi
  22:	48 c1 e6 04          	shl    $0x4,%rsi
  26:	48 03 77 08          	add    0x8(%rdi),%rsi
  2a:*	65 48 ff 46 08       	incq   %gs:0x8(%rsi)		<-- trapping instruction
  2f:	c3                   	ret
  30:	cc                   	int3
  31:	cc                   	int3
  32:	cc                   	int3
  33:	cc                   	int3
  34:	66 66 2e 0f 1f 84 00 	data16 cs nopw 0x0(%rax,%rax,1)
  3b:	00 00 00 00 
  3f:	90                   	nop

Code starting with the faulting instruction
===========================================
   0:	65 48 ff 46 08       	incq   %gs:0x8(%rsi)
   5:	c3                   	ret
   6:	cc                   	int3
   7:	cc                   	int3
   8:	cc                   	int3
   9:	cc                   	int3
   a:	66 66 2e 0f 1f 84 00 	data16 cs nopw 0x0(%rax,%rax,1)
  11:	00 00 00 00 
  15:	90                   	nop
[14591.733039] RSP: 0018:ffffd0c6094f7d88 EFLAGS: 00010202
[14591.733041] RAX: 0000000000000000 RBX: ffff8ef67492be08 RCX: 0000000000000000
[14591.733043] RDX: 0000000000000002 RSI: 0000000000000010 RDI: ffff8ef67492be38
[14591.733043] RBP: ffffd0c6094f7df8 R08: 0000000000000000 R09: 00000000fffffffd
[14591.733044] R10: 0000000000000001 R11: 00000000ffffffff R12: 0000000000000000
[14591.733045] R13: ffff8ef70d8143d0 R14: 0000000000000001 R15: 0000000000000000
[14591.733046] FS:  0000000000000000(0000) GS:ffff8efd88e65000(0000) knlGS:0000000000000000
[14591.733047] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[14591.733047] CR2: ffff8efd88e65018 CR3: 00000001d0184000 CR4: 0000000000f50ef0
[14591.733048] PKRU: 55555554
[14591.733049] note: kworker/0:2[56816] exited with irqs disabled
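
The `2a:*` offset in the decoded listing can be cross-checked against the raw
`Code:` line, where the byte at RIP is wrapped in `<..>`. A small sketch in
Python; `rip_offset` is a made-up helper name and the byte string is copied
from the oops above:

```python
# "Code:" bytes copied from the 6.15-rc5 oops above; <..> marks the byte at RIP.
CODE = (
    "c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa "
    "0f 1f 44 00 00 f0 83 44 24 fc 00 48 63 f6 48 c1 e6 04 48 03 "
    "77 08 <65> 48 ff 46 08 c3 cc cc cc cc 66 66 2e 0f 1f 84 00 "
    "00 00 00 00 90"
)


def rip_offset(code_line):
    """Return the index of the <..>-marked byte within the dumped bytes."""
    for index, token in enumerate(code_line.split()):
        if token.startswith("<") and token.endswith(">"):
            return index
    raise ValueError("no <..> marker found")


offset = rip_offset(CODE)
print(hex(offset))
```

The same byte sequence appears in the 6.15-rc4 oops, where RIP is reported as
`__srcu_read_unlock+0x1a`. Assuming the function begins at the `endbr64` at
dump offset 0x10, that is 0x10 + 0x1a = 0x2a as well.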

> However, the fact that this happens in an unplug event makes me think
> that there may be a race here at play.
> 
> Another option is that I completely missed the use of srcu, but it was
> working fine previously, so I have no ideas :)

Yes, this is weird.
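
For reference, the fault addresses in both oopses can be reproduced from the
register dumps alone. A minimal sketch in Python, with all register values
copied from the dumps in this thread. `effective_address` is a made-up helper,
and the `__srcu_read_lock` operand `%gs:(%rdx,%rcx,8)` is decoded here by hand
from the bytes `65 48 ff 04 ca`, so treat that part as an assumption:

```python
MASK64 = (1 << 64) - 1


def effective_address(gs_base, base, index=0, scale=1, disp=0):
    """x86-64 effective address for %gs:disp(base,index,scale)."""
    return (gs_base + base + index * scale + disp) & MASK64


# 6.15-rc5, __srcu_read_unlock: incq %gs:0x8(%rsi), with RSI = 0x10
unlock_addr = effective_address(0xFFFF8EFD88E65000, base=0x10, disp=0x8)

# 6.14.0-rc3+ (first bad commit), __srcu_read_lock:
# incq %gs:(%rdx,%rcx,8), with RDX = 0 and RCX = 1
lock_addr = effective_address(0xFFFF9ACF2B889000, base=0x0, index=0x1, scale=8)

print(hex(unlock_addr), hex(lock_addr))
```

Both results match the reported CR2 values (ffff8efd88e65018 and
ffff9acf2b889008). In the lock case RDX, loaded from `0x8(%rdi)` just before
the increment, is 0. In the unlock case RSI is 0x10, which is what `idx << 4`
gives for idx 1 if the value added from `0x8(%rdi)` is also 0. That reading is
an inference from the dumps, not something confirmed in this thread.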

I also tried uinput and some other HID devices (randomly borrowed from
my friends). They all worked fine.

I have a Logitech Bolt receiver, too. Will find and try it out.

> Anyway, we need to wait for the reporter to tell us if there were any
> HID-BPF program first because this will likely give us a hint on where
> the issue is.

In another clean boot, I triggered the bug and dumped the hdev struct
at fentry (fexit is never reached because of the PF) via bpftrace.

#!/usr/bin/env bpftrace

f:dispatch_hid_bpf_output_report
{
	$US = (uint64) 1000000; $ts = nsecs / 1000;
	printf(
		"[%lld.%06lld - %s(%d)@CPU#%u]: %s:\n",
		$ts / $US, $ts % $US, comm, pid, cpu, probe
	);
	print(*args);
	print(kstack);
	printf("*hdev:\n"); print(*args.hdev);
}

See attachments for its output (warning: contains an extremely long
line) and the decoded dmesg while tracing.

In another clean boot (again), I played around with retsnoop to capture the
Last Branch Records (type: any_return, ind_call) from
dispatch_hid_bpf_output_report. This time I didn't trigger the issue,
or else nothing would be captured due to the PF as mentioned above.
Instead, I pressed Caps Lock on a keyboard under the same receiver
several times to trigger hidinput_led_worker. I always got:

[#15] kprobe_multi_link_handler+0x5d      (kernel/trace/bpf_trace.c:2843)           ->  fprobe_entry+0xe6                   (kernel/trace/fprobe.c:321)
                                                                                        . __fprobe_handler                  (kernel/trace/fprobe.c:224)
[#14] fprobe_entry+0x21c                  (kernel/trace/fprobe.c:336)               ->  function_graph_enter_regs+0x15d     (kernel/trace/fgraph.c:676)
[#13] function_graph_enter_regs+0x1cd     (kernel/trace/fgraph.c:718)               ->  ftrace_graph_func+0x3c              (arch/x86/kernel/ftrace.c:659)
[#12] ftrace_graph_func+0x4c              (arch/x86/kernel/ftrace.c:661)            ->  ftrace_trampoline+0x83
[#11] ftrace_trampoline+0xc2                                                        ->  dispatch_hid_bpf_output_report+0x9  (drivers/hid/bpf/hid_bpf_dispatch.c:120)
[#10] __srcu_read_lock+0x20               (kernel/rcu/srcutree.c:757)               ->  dispatch_hid_bpf_output_report+0x73 (drivers/hid/bpf/hid_bpf_dispatch.c:133)
                                                                                        . srcu_read_lock                    (include/linux/srcu.h:252)
[#09] __srcu_read_unlock+0x1f             (kernel/rcu/srcutree.c:769)               ->  dispatch_hid_bpf_output_report+0xc5 (drivers/hid/bpf/hid_bpf_dispatch.c:148)
[#08] dispatch_hid_bpf_output_report+0xe6 (drivers/hid/bpf/hid_bpf_dispatch.c:148)  ->  return_to_handler+0x0               (arch/x86/kernel/ftrace_64.S:358)

!    6us [0]  dispatch_hid_bpf_output_report

Thus, there is indeed no BPF program being called.

Feel free to ask for more experiments :)

> Cheers,
> Benjamin

Thanks,
Rong

> > 
> > :-)
> > 
> > Thx.
> > 
> > On Sat, May 03, 2025 at 06:40:41PM +0000, bugzilla-daemon@kernel.org wrote:
> > > https://bugzilla.kernel.org/show_bug.cgi?id=220083
> > > 
> > >             Bug ID: 220083
> > >            Summary: [REGRESSION, BISECTED] x86 ASM changes make
> > >                     dispatch_hid_bpf_output_report access not-present
> page
> > >            Product: Platform Specific/Hardware
> > >            Version: 2.5
> > >           Hardware: All
> > >                 OS: Linux
> > >             Status: NEW
> > >           Severity: high
> > >           Priority: P3
> > >          Component: x86-64
> > >           Assignee: platform_x86_64@kernel-bugs.osdl.org
> > >           Reporter: i@rong.moe
> > >         Regression: No
> > > 
> > > After upgrading from 6.14.x to 6.15-rc3, not-present page PF occurs each
> time I
> > > unplug any of my Logitech Unifying receivers.
> > > 
> > > Upgrading to 6.15-rc4 did not fix the issue.
> > > 
> > > dmesg:
> > > ```
> > > [   48.726588] usb 7-1.4: USB disconnect, device number 7
> > > [   48.856531] BUG: unable to handle page fault for address:
> ffff8a510ee72018
> > > [   48.856543] #PF: supervisor write access in kernel mode
> > > [   48.856547] #PF: error_code(0x0002) - not-present page
> > > [   48.856550] PGD 365c01067 P4D 365c01067 PUD 0
> > > [   48.856558] Oops: Oops: 0002 [#1] SMP NOPTI
> > > [   48.856566] CPU: 0 UID: 0 PID: 7237 Comm: kworker/0:3 Tainted: G     U 
> > >        6.15.0-rc4 #1 PREEMPT(lazy) 
> b3a8ad1950c71c15317e5ea614db6e274ecb0913
> > > [   48.856574] Tainted: [U]=USER
> > > [   48.856577] Hardware name: LENOVO 21Q4/LNVNB161216, BIOS PXCN24WW
> 03/11/2025
> > > [   48.856579] Workqueue: events hidinput_led_worker
> > > [   48.856589] RIP: 0010:__srcu_read_unlock+0x1a/0x30
> > > [   48.856595] Code: c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 f3
> 0f 1e
> > > fa 0f 1f 44 00 00 f0 83 44 24 fc 00 48 63 f6 48 c1 e6 04 48 03 77 08 <65>
> 48 ff
> > > 46 08 c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 90
> > > [   48.856598] RSP: 0018:ffffd037cc29fd88 EFLAGS: 00010202
> > > [   48.856602] RAX: 0000000000000000 RBX: ffff8a4c6b16fe08 RCX:
> > > 0000000000000000
> > > [   48.856604] RDX: 0000000000000002 RSI: 0000000000000010 RDI:
> > > ffff8a4c6b16fe38
> > > [   48.856606] RBP: ffffd037cc29fdf8 R08: 0000000000000000 R09:
> > > 00000000fffffffd
> > > [   48.856607] R10: 0000000000000001 R11: 00000000ffffffff R12:
> > > 0000000000000000
> > > [   48.856609] R13: ffff8a4ac182dbc0 R14: 0000000000000001 R15:
> > > 0000000000000000
> > > [   48.856611] FS:  0000000000000000(0000) GS:ffff8a510ee72000(0000)
> > > knlGS:0000000000000000
> > > [   48.856613] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > [   48.856614] CR2: ffff8a510ee72018 CR3: 0000000364c24000 CR4:
> > > 0000000000f50ef0
> > > [   48.856617] PKRU: 55555554
> > > [   48.856618] Call Trace:
> > > [   48.856621]  <TASK>
> > > [   48.856623]  dispatch_hid_bpf_output_report+0xc5/0x100
> > > [   48.856631]  hid_hw_output_report+0x46/0x90
> > > [   48.856635]  hidinput_led_worker+0xa9/0xe0
> > > [   48.856640]  process_one_work+0x18f/0x350
> > > [   48.856646]  worker_thread+0x2d3/0x400
> > > [   48.856650]  ? rescuer_thread+0x550/0x550
> > > [   48.856654]  kthread+0xf9/0x240
> > > [   48.856657]  ? kthreads_online_cpu+0x120/0x120
> > > [   48.856661]  ret_from_fork+0x31/0x50
> > > [   48.856665]  ? kthreads_online_cpu+0x120/0x120
> > > [   48.856668]  ret_from_fork_asm+0x11/0x20
> > > [   48.856674]  </TASK>
> > > [   48.856675] Modules linked in: xt_mark tcp_diag inet_diag snd_hrtimer
> > > snd_seq_dummy snd_seq_midi snd_seq_oss snd_seq_midi_event snd_seq uhid
> rfcomm
> > > cmac algif_hash algif_skcipher af_alg xt_CHECKSUM xt_MASQUERADE
> xt_conntrack
> > > ipt_REJECT nf_reject_ipv4 xt_tcpudp nft_compat nft_chain_nat nf_nat
> > > nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 tun snd_usb_audio
> snd_usbmidi_lib
> > > snd_ump snd_rawmidi snd_seq_device bridge stp llc nf_tables qrtr bnep
> overlay
> > > sunrpc vfat fat uvcvideo videobuf2_vmalloc uvc videobuf2_memops
> videobuf2_v4l2
> > > videobuf2_common btusb videodev btrtl btintel mc btbcm btmtk bluetooth
> amd_atl
> > > intel_rapl_msr intel_rapl_common snd_acp_legacy_mach snd_acp_mach
> > > snd_soc_nau8821 snd_acp3x_rn snd_acp70 snd_acp_i2s snd_acp_pdm
> snd_soc_dmic
> > > snd_acp_pcm snd_sof_amd_acp70 snd_sof_amd_acp63 snd_sof_amd_vangogh
> > > snd_sof_amd_rembrandt snd_sof_amd_renoir snd_sof_amd_acp snd_sof_pci
> > > snd_sof_xtensa_dsp snd_sof snd_sof_utils snd_pci_ps
> snd_soc_acpi_amd_match
> > > snd_amd_sdw_acpi soundwire_amd snd_hda_codec_realtek
> > > [   48.856732]  soundwire_generic_allocation soundwire_bus
> > > snd_hda_codec_generic snd_soc_sdca snd_hda_scodec_component
> snd_hda_codec_hdmi
> > > snd_soc_core mt7925e snd_compress mt7925_common snd_hda_intel ac97_bus
> > > mt792x_lib snd_intel_dspcfg snd_pcm_dmaengine mt76_connac_lib
> > > snd_intel_sdw_acpi snd_rpl_pci_acp6x kvm_amd mt76 snd_hda_codec
> snd_acp_pci
> > > think_lmi snd_amd_acpi_mach kvm snd_hda_core snd_acp_legacy_common
> > > snd_pci_acp6x snd_hwdep mac80211 snd_pcm_oss snd_mixer_oss irqbypass
> > > snd_pci_acp5x snd_ctl_led snd_pcm libarc4 rapl pcspkr
> firmware_attributes_class
> > > snd_timer lenovo_wmi_hotkey_utilities snd_rn_pci_acp3x wmi_bmof cfg80211
> > > snd_acp_config snd snd_soc_acpi k10temp hid_sensor_als spd5118 amdxdna
> amd_pmf
> > > snd_pci_acp3x rfkill soundcore hid_sensor_trigger
> industrialio_triggered_buffer
> > > amdtee kfifo_buf joydev hid_sensor_iio_common ccp industrialio amd_pmc
> > > platform_profile mousedev mac_hid sch_fq_codel uinput i2c_dev parport_pc
> ppdev
> > > lp parport nvme_fabrics nfnetlink ip_tables x_tables dm_crypt
> encrypted_keys
> > > trusted
> > > [   48.856786]  asn1_encoder tee dm_mod raid10 raid456 async_raid6_recov
> > > async_memcpy async_pq async_xor async_tx raid1 raid0 linear md_mod igc
> ptp
> > > pps_core uas usb_storage hid_logitech_hidpp r8153_ecm cdc_ether usbnet
> > > hid_logitech_dj r8152 mii usbhid amdgpu i2c_algo_bit drm_ttm_helper ttm
> > > drm_panel_backlight_quirks polyval_clmulni polyval_generic drm_exec
> > > ghash_clmulni_intel drm_suballoc_helper amdxcp sha512_ssse3 sdhci_pci
> drm_buddy
> > > sha256_ssse3 thunderbolt hid_sensor_custom r8169 sha1_ssse3 serio_raw
> > > sp5100_tco sdhci_uhs2 gpu_sched nvme sdhci hid_multitouch realtek
> > > hid_sensor_hub aesni_intel atkbd ucsi_acpi drm_display_helper hid_generic
> > > nvme_core cqhci crypto_simd mdio_devres libps2 video typec_ucsi i2c_piix4
> > > vivaldi_fmap cryptd nvme_keyring typec libphy mmc_core i2c_smbus i8042
> cec
> > > i2c_hid_acpi amd_sfh nvme_auth roles wmi serio i2c_hid
> > > [   48.856843] CR2: ffff8a510ee72018
> > > [   48.856846] ---[ end trace 0000000000000000 ]---
> > > [   50.304586] RIP: 0010:__srcu_read_unlock+0x1a/0x30
> > > [   50.304601] Code: c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 f3
> 0f 1e
> > > fa 0f 1f 44 00 00 f0 83 44 24 fc 00 48 63 f6 48 c1 e6 04 48 03 77 08 <65>
> 48 ff
> > > 46 08 c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 90
> > > [   50.304603] RSP: 0018:ffffd037cc29fd88 EFLAGS: 00010202
> > > [   50.304606] RAX: 0000000000000000 RBX: ffff8a4c6b16fe08 RCX:
> > > 0000000000000000
> > > [   50.304607] RDX: 0000000000000002 RSI: 0000000000000010 RDI:
> > > ffff8a4c6b16fe38
> > > [   50.304608] RBP: ffffd037cc29fdf8 R08: 0000000000000000 R09:
> > > 00000000fffffffd
> > > [   50.304609] R10: 0000000000000001 R11: 00000000ffffffff R12:
> > > 0000000000000000
> > > [   50.304610] R13: ffff8a4ac182dbc0 R14: 0000000000000001 R15:
> > > 0000000000000000
> > > [   50.304611] FS:  0000000000000000(0000) GS:ffff8a510ee72000(0000)
> > > knlGS:0000000000000000
> > > [   50.304612] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > [   50.304613] CR2: ffff8a510ee72018 CR3: 0000000121904000 CR4:
> > > 0000000000f50ef0
> > > [   50.304615] PKRU: 55555554
> > > [   50.304616] note: kworker/0:3[7237] exited with irqs disabled
> > > ```
> > > 
> > > Bisect log:
> > > 
> > > ```
> > > # good: [38fec10eb60d687e30c8c6b5420d86e8149f7557] Linux 6.14
> > > git bisect good 38fec10eb60d687e30c8c6b5420d86e8149f7557
> > > # bad: [9c32cda43eb78f78c73aee4aa344b777714e259b] Linux 6.15-rc3
> > > git bisect bad 9c32cda43eb78f78c73aee4aa344b777714e259b
> > > # bad: [4a4b30ea80d8cb5e8c4c62bb86201f4ea0d9b030] Merge tag
> > > 'bcachefs-2025-03-24' of git://evilpiepirate.org/bcachefs
> > > git bisect bad 4a4b30ea80d8cb5e8c4c62bb86201f4ea0d9b030
> > > # bad: [1e1ba8d23dae91e6a9cfeb1236b02749e8a49ab3] Merge tag
> > > 'timers-clocksource-2025-03-26' of
> > > git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
> > > git bisect bad 1e1ba8d23dae91e6a9cfeb1236b02749e8a49ab3
> > > # skip: [21e0ff5b10ec1b61fda435d42db4ba80d0cdfded] Merge tag
> 'acpi-6.15-rc1' of
> > > git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
> > > git bisect skip 21e0ff5b10ec1b61fda435d42db4ba80d0cdfded
> > > # good: [47c4f9b1722fd883c9745d7877cb212e41dd2715] Tidy up ASoC control
> get and
> > > put handlers
> > > git bisect good 47c4f9b1722fd883c9745d7877cb212e41dd2715
> > > # bad: [2899aa3973efa3b0a7005cb7fb60475ea0c3b8a0] Merge tag
> > > 'x86_cache_for_v6.15' of
> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
> > > git bisect bad 2899aa3973efa3b0a7005cb7fb60475ea0c3b8a0
> > > # good: [5a658afd468b0fb55bf5f45c9788ee8dc87ba463] Merge tag
> > > 'objtool-core-2025-03-22' of
> > > git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
> > > git bisect good 5a658afd468b0fb55bf5f45c9788ee8dc87ba463
> > > # bad: [a49a879f0ac19ed0a562e220019741857b261551] Merge tag
> > > 'x86-cleanups-2025-03-22' of
> > > git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
> > > git bisect bad a49a879f0ac19ed0a562e220019741857b261551
> > > # bad: [9a93e29f16bbba90a63faad0abbc6dea3b2f0c63] x86/syscall: Move
> > > sys_ni_syscall()
> > > git bisect bad 9a93e29f16bbba90a63faad0abbc6dea3b2f0c63
> > > # bad: [cfdaa618defc5ebe1ee6aa5bd40a7ccedffca6de] Merge branch 'x86/cpu'
> into
> > > x86/asm, to pick up dependent commits
> > > git bisect bad cfdaa618defc5ebe1ee6aa5bd40a7ccedffca6de
> > > # good: [c4a8b7116b9927f7b00bd68140e285662a03068e] perf/x86/intel: Use
> cache
> > > cpu-type for hybrid PMU selection
> > > git bisect good c4a8b7116b9927f7b00bd68140e285662a03068e
> > > # good: [4f2a0b765c9731d2fa94e209ee9ae0e96b280f17] <linux/sizes.h>: Cover
> all
> > > possible x86 CPU cache sizes
> > > git bisect good 4f2a0b765c9731d2fa94e209ee9ae0e96b280f17
> > > # bad: [95b0916118106054e1f3d5d7f8628ef3dc0b3c02] percpu: Remove
> > > PER_CPU_FIRST_SECTION
> > > git bisect bad 95b0916118106054e1f3d5d7f8628ef3dc0b3c02
> > > # skip: [78c4374ef8b842c6abf195d6f963853c7ec464d2] x86/module: Deal with
> GOT
> > > based stack cookie load on Clang < 17
> > > git bisect skip 78c4374ef8b842c6abf195d6f963853c7ec464d2
> > > # bad: [b5c4f95351a097a635c1a7fc8d9efa18308491b5] x86/percpu/64: Remove
> > > fixed_percpu_data
> > > git bisect bad b5c4f95351a097a635c1a7fc8d9efa18308491b5
> > > # skip: [cb7927fda002ca49ae62e2782c1692acc7b80c67] x86/relocs: Handle
> > > R_X86_64_REX_GOTPCRELX relocations
> > > git bisect skip cb7927fda002ca49ae62e2782c1692acc7b80c67
> > > # skip: [80d47defddc000271502057ebd7efa4fd6481542] x86/stackprotector/64:
> > > Convert to normal per-CPU variable
> > > git bisect skip 80d47defddc000271502057ebd7efa4fd6481542
> > > # skip: [f58b63857ae38b4484185b799a2759274b930c92] x86/pvh: Use
> > > fixed_percpu_data for early boot GSBASE
> > > git bisect skip f58b63857ae38b4484185b799a2759274b930c92
> > > # good: [0ee2689b9374d6fd5f43b703713a532278654749] x86/stackprotector:
> Remove
> > > stack protector test scripts
> > > git bisect good 0ee2689b9374d6fd5f43b703713a532278654749
> > > # bad: [9d7de2aa8b41407bc96d89a80dc1fd637d389d42] x86/percpu/64: Use
> relative
> > > percpu offsets
> > > git bisect bad 9d7de2aa8b41407bc96d89a80dc1fd637d389d42
> > > # good: [a9a76b38aaf577887103e3ebb41d70e6aa5a4b19] x86/boot: Disable
> stack
> > > protector for early boot code
> > > git bisect good a9a76b38aaf577887103e3ebb41d70e6aa5a4b19
> > > # only skipped commits left to test
> > > # possible first bad commit: [9d7de2aa8b41407bc96d89a80dc1fd637d389d42]
> > > x86/percpu/64: Use relative percpu offsets
> > > # possible first bad commit: [80d47defddc000271502057ebd7efa4fd6481542]
> > > x86/stackprotector/64: Convert to normal per-CPU variable
> > > # possible first bad commit: [78c4374ef8b842c6abf195d6f963853c7ec464d2]
> > > x86/module: Deal with GOT based stack cookie load on Clang < 17
> > > # possible first bad commit: [cb7927fda002ca49ae62e2782c1692acc7b80c67]
> > > x86/relocs: Handle R_X86_64_REX_GOTPCRELX relocations
> > > # possible first bad commit: [f58b63857ae38b4484185b799a2759274b930c92]
> > > x86/pvh: Use fixed_percpu_data for early boot GSBASE
> > > ```
> > > 
> > > There is a typo in commit f58b63857ae3 ("x86/pvh: Use fixed_percpu_data
> for
> > > early boot GSBASE"), resulting in compilation failure.
> > > With the patch below, I bisected again:
> > > 
> > > ```
> > > diff --git a/arch/x86/platform/pvh/head.S b/arch/x86/platform/pvh/head.S
> > > index 723f181b222a..f1a8392a4835 100644
> > > --- a/arch/x86/platform/pvh/head.S
> > > +++ b/arch/x86/platform/pvh/head.S
> > > @@ -180,7 +180,7 @@ SYM_CODE_START(pvh_start_xen)
> > >          */
> > >         movl $MSR_GS_BASE,%ecx
> > >         leaq INIT_PER_CPU_VAR(fixed_percpu_data)(%rip), %rdx
> > > -       movq %edx, %eax
> > > +       movl %edx, %eax
> > >         shrq $32, %rdx
> > >         wrmsr
> > > ```
> > > 
> > > New bisect log:
> > > 
> > > ```
> > > [...]
> > > # good: [a9a76b38aaf577887103e3ebb41d70e6aa5a4b19] x86/boot: Disable
> stack
> > > protector for early boot code
> > > git bisect good a9a76b38aaf577887103e3ebb41d70e6aa5a4b19
> > > # good: [78c4374ef8b842c6abf195d6f963853c7ec464d2] x86/module: Deal with
> GOT
> > > based stack cookie load on Clang < 17
> > > git bisect good 78c4374ef8b842c6abf195d6f963853c7ec464d2
> > > # good: [80d47defddc000271502057ebd7efa4fd6481542] x86/stackprotector/64:
> > > Convert to normal per-CPU variable
> > > git bisect good 80d47defddc000271502057ebd7efa4fd6481542
> > > # first bad commit: [9d7de2aa8b41407bc96d89a80dc1fd637d389d42]
> x86/percpu/64:
> > > Use relative percpu offsets
> > > ```
> > > 
> > > The bad commit 9d7de2aa8b41 ("x86/percpu/64: Use relative percpu
> offsets")
> > > first appeared in v6.15-rc1.
> > > 
> > > Got dmesg below by building and booting the bad commit, then unplugging
> the
> > > receiver:
> > > 
> > > ```
> > > [  560.223095] BUG: unable to handle page fault for address:
> ffff9acf2b889008
> > > [  560.223174] #PF: supervisor write access in kernel mode
> > > [  560.223299] #PF: error_code(0x0002) - not-present page
> > > [  560.223332] PGD 43e401067 P4D 43e401067 PUD 0
> > > [  560.223353] Oops: Oops: 0002 [#1] PREEMPT SMP NOPTI
> > > [  560.223359] CPU: 0 UID: 0 PID: 8212 Comm: kworker/0:3 Tainted: G     U 
> > >       6.14.0-rc3+ #1 ab962f3b7921227b62db2503d8ec7411fa694628
> > > [  560.223364] Tainted: [U]=USER
> > > [  560.223369] Hardware name: LENOVO 21Q4/LNVNB161216, BIOS PXCN24WW
> 03/11/2025
> > > [  560.223378] Workqueue: events hidinput_led_worker
> > > [  560.223382] RIP: 0010:__srcu_read_lock+0x14/0x30
> > > [  560.223387] Code: 0f 0b eb bc 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 84
> 00 00
> > > 00 00 00 f3 0f 1e fa 0f 1f 44 00 00 8b 07 48 8b 57 08 83 e0 01 89 c1 <65>
> 48 ff
> > > 04 ca f0 83 44 24 fc 00 c3 cc cc cc cc 66 66 2e 0f 1f 84
> > > [  560.223392] RSP: 0018:ffffb7df8d24fd88 EFLAGS: 00010202
> > > [  560.223396] RAX: 0000000000000001 RBX: ffff9ac82f80de08 RCX:
> > > 0000000000000001
> > > [  560.223401] RDX: 0000000000000000 RSI: ffff9ac8fd276f40 RDI:
> > > ffff9ac82f80de38
> > > [  560.223407] RBP: ffffb7df8d24fdf8 R08: 0000000000000000 R09:
> > > 00000000fffffffd
> > > [  560.223412] R10: 0000000000000001 R11: 00000000ffffffff R12:
> > > 0000000000000000
> > > [  560.223417] R13: ffff9ac8fd276f40 R14: 000000000000000e R15:
> > > 0000000000000000
> > > [  560.223421] FS:  0000000000000000(0000) GS:ffff9acf2b889000(0000)
> > > knlGS:0000000000000000
> > > [  560.223426] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > [  560.223430] CR2: ffff9acf2b889008 CR3: 00000001e1c40000 CR4:
> > > 0000000000f50ef0
> > > [  560.223434] PKRU: 55555554
> > > [  560.223439] Call Trace:
> > > [  560.223444]  <TASK>
> > > [  560.223449]  ? __die_body.cold+0x19/0x29
> > > [  560.223453]  ? page_fault_oops+0x15a/0x2e0
> > > [  560.223458]  ? search_bpf_extables+0x5f/0x80
> > > [  560.223462]  ? exc_page_fault+0x1a3/0x1b0
> > > [  560.223466]  ? asm_exc_page_fault+0x26/0x30
> > > [  560.223471]  ? __srcu_read_lock+0x14/0x30
> > > [  560.223475]  ? psi_task_switch+0xb7/0x200
> > > [  560.223480]  dispatch_hid_bpf_output_report+0x73/0x100
> > > [  560.223485]  hid_hw_output_report+0x46/0x90
> > > [  560.223490]  hidinput_led_worker+0xa9/0xe0
> > > [  560.223494]  process_one_work+0x17b/0x330
> > > [  560.223498]  worker_thread+0x2ce/0x3f0
> > > [  560.223503]  ? rescuer_thread+0x530/0x530
> > > [  560.223507]  kthread+0xeb/0x230
> > > [  560.223512]  ? kthreads_online_cpu+0x120/0x120
> > > [  560.223516]  ret_from_fork+0x31/0x50
> > > [  560.223522]  ? kthreads_online_cpu+0x120/0x120
> > > [  560.223528]  ret_from_fork_asm+0x11/0x20
> > > [  560.223532]  </TASK>
> > > [  560.223538] Modules linked in: tcp_diag inet_diag xt_mark snd_hrtimer
> > > snd_seq_dummy snd_seq_midi snd_seq_oss snd_seq_midi_event snd_seq uhid
> rfcomm
> > > cmac algif_hash algif_skcipher af_alg xt_CHECKSUM xt_MASQUERADE
> xt_conntrack
> > > ipt_REJECT nf_reject_ipv4 xt_tcpudp nft_compat nft_chain_nat nf_nat
> > > nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 tun bridge stp llc nf_tables
> > > snd_usb_audio snd_usbmidi_lib snd_ump snd_rawmidi snd_seq_device qrtr
> bnep
> > > overlay sunrpc vfat fat uvcvideo videobuf2_vmalloc uvc videobuf2_memops
> btusb
> > > videobuf2_v4l2 btrtl videobuf2_common btintel btbcm videodev btmtk mc
> bluetooth
> > > snd_acp_legacy_mach snd_acp_mach snd_soc_nau8821 snd_acp3x_rn snd_acp70
> > > snd_acp_i2s snd_acp_pdm snd_soc_dmic snd_acp_pcm snd_sof_amd_acp70
> > > snd_sof_amd_acp63 snd_sof_amd_vangogh snd_sof_amd_rembrandt
> snd_sof_amd_renoir
> > > snd_sof_amd_acp intel_rapl_msr amd_atl snd_sof_pci intel_rapl_common
> > > snd_sof_xtensa_dsp snd_sof snd_sof_utils snd_pci_ps
> snd_soc_acpi_amd_match
> > > snd_amd_sdw_acpi soundwire_amd soundwire_generic_allocation snd_ctl_led
> > > [  560.223612]  soundwire_bus snd_soc_sdca snd_hda_codec_realtek
> > > snd_hda_codec_generic snd_soc_core mt7925e snd_hda_scodec_component
> > > mt7925_common snd_compress mt792x_lib snd_hda_codec_hdmi ac97_bus
> snd_hda_intel
> > > mt76_connac_lib snd_pcm_dmaengine snd_intel_dspcfg mt76 snd_rpl_pci_acp6x
> > > snd_intel_sdw_acpi snd_hda_codec kvm_amd snd_acp_pci think_lmi
> snd_hda_core
> > > snd_acp_legacy_common mac80211 kvm snd_pci_acp6x snd_hwdep snd_pcm_oss
> > > snd_mixer_oss snd_pci_acp5x libarc4 amd_pmf rapl pcspkr
> > > firmware_attributes_class wmi_bmof hid_sensor_als amdtee snd_pcm
> > > hid_sensor_trigger snd_rn_pci_acp3x cfg80211
> industrialio_triggered_buffer
> > > snd_timer joydev snd_acp_config kfifo_buf spd5118 snd snd_soc_acpi
> > > hid_sensor_iio_common ccp soundcore snd_pci_acp3x rfkill platform_profile
> > > amdxdna k10temp industrialio amd_pmc mousedev mac_hid sch_fq_codel uinput
> > > i2c_dev parport_pc ppdev lp parport nvme_fabrics nvme_keyring nfnetlink
> > > ip_tables x_tables dm_crypt encrypted_keys trusted asn1_encoder tee
> dm_mod
> > > raid10 raid456 async_raid6_recov
> > > [  560.223631]  async_memcpy async_pq async_xor async_tx raid1 raid0
> linear
> > > md_mod igc ptp pps_core uas usb_storage hid_logitech_hidpp r8153_ecm
> cdc_ether
> > > usbnet hid_logitech_dj r8152 mii usbhid amdgpu i2c_algo_bit
> drm_ttm_helper ttm
> > > drm_panel_backlight_quirks polyval_clmulni drm_exec polyval_generic
> > > ghash_clmulni_intel drm_suballoc_helper sha512_ssse3 amdxcp
> hid_sensor_custom
> > > serio_raw sha256_ssse3 drm_buddy sdhci_pci ucsi_acpi atkbd nvme
> hid_multitouch
> > > r8169 sha1_ssse3 sp5100_tco hid_sensor_hub typec_ucsi libps2 gpu_sched
> > > sdhci_uhs2 vivaldi_fmap aesni_intel nvme_core sdhci hid_generic realtek
> typec
> > > drm_display_helper video i8042 crypto_simd i2c_piix4 mdio_devres cqhci
> cryptd
> > > thunderbolt mmc_core libphy cec amd_sfh nvme_auth roles i2c_smbus serio
> > > i2c_hid_acpi wmi i2c_hid
> > > [  560.223646] CR2: ffff9acf2b889008
> > > [  560.223650] ---[ end trace 0000000000000000 ]---
> > > [  560.223655] RIP: 0010:__srcu_read_lock+0x14/0x30
> > > [  560.223660] Code: 0f 0b eb bc 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 84
> 00 00
> > > 00 00 00 f3 0f 1e fa 0f 1f 44 00 00 8b 07 48 8b 57 08 83 e0 01 89 c1 <65>
> 48 ff
> > > 04 ca f0 83 44 24 fc 00 c3 cc cc cc cc 66 66 2e 0f 1f 84
> > > [  560.223664] RSP: 0018:ffffb7df8d24fd88 EFLAGS: 00010202
> > > [  560.223670] RAX: 0000000000000001 RBX: ffff9ac82f80de08 RCX:
> > > 0000000000000001
> > > [  560.223674] RDX: 0000000000000000 RSI: ffff9ac8fd276f40 RDI:
> > > ffff9ac82f80de38
> > > [  560.223679] RBP: ffffb7df8d24fdf8 R08: 0000000000000000 R09:
> > > 00000000fffffffd
> > > [  560.223683] R10: 0000000000000001 R11: 00000000ffffffff R12:
> > > 0000000000000000
> > > [  560.223687] R13: ffff9ac8fd276f40 R14: 000000000000000e R15:
> > > 0000000000000000
> > > [  560.223692] FS:  0000000000000000(0000) GS:ffff9acf2b889000(0000)
> > > knlGS:0000000000000000
> > > [  560.223696] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > [  560.223700] CR2: ffff9acf2b889008 CR3: 00000001e1c40000 CR4:
> > > 0000000000f50ef0
> > > [  560.223704] PKRU: 55555554
> > > [  560.223709] note: kworker/0:3[8212] exited with irqs disabled
> > > ```
> > > 
> > 
> > -- 
> > Regards/Gruss,
> >     Boris.
> > 
> > https://people.kernel.org/tglx/notes-about-netiquette
Comment 9 Rong Zhang 2025-05-07 11:04:28 UTC
Created attachment 308104 [details]
6.15-rc5_per-cpu-pf_bptrace_dmesg_decoded
Comment 10 Rong Zhang 2025-05-07 17:34:19 UTC
Hi Benjamin,

On Wed, 2025-05-07 at 19:03 +0800, Rong Zhang wrote:
> Hi Benjamin,
> 
> On Tue, 2025-05-06 at 17:35 +0200, Benjamin Tissoires wrote:
> > Hi Boris,
> > 
> > On May 06 2025, Borislav Petkov wrote:
> > > Switching to mail.
> > > 
> > > Hi Benjamin,
> > > 
> > > take a look at the below pls.
> > > 
> > > The RIP points to:
> > > 
> > >   22:   48 c1 e6 04             shl    $0x4,%rsi
> > >   26:   48 03 77 08             add    0x8(%rdi),%rsi
> > >   2a:*  65 48 ff 46 08          incq   %gs:0x8(%rsi)            <-- trapping instruction
> > >   2f:   c3                      ret
> > > 
> > > which really is a %gs-based access and the reporter has bisected this to
> > > 
> > >   9d7de2aa8b41 ("x86/percpu/64: Use relative percpu offsets")
> > > 
> > > which looks related.
> > > 
> > > My silly guess would be some bpf program does per-cpu accesses but it
> > > doesn't know about this change so it tramples over registers. I mean, my
> > > fix would be to disable BPF but you young kids love to play with that...

[...]

> > However, the fact that this happens in an unplug event makes me think
> > that there may be a race here at play.
> > 
> > Another option is that I completely missed the use of srcu, but it was
> > working fine previously, so I have no ideas :)
> 
> Yes, this is weird.
> 
> I also tried uinput and some other HID devices (randomly borrowed from
> my friends). They all worked fine.
> 
> I have a Logitech Bolt receiver, too. Will find and try it out.

The good news is it always works fine.

> > Anyway, we need to wait for the reporter to tell us if there were any
> > HID-BPF program first because this will likely give us a hint on where
> > the issue is.
> 
> In another clean boot, I triggered the bug and dumped the hdev struct
> at fentry (fexit will never hit because of the PF) via bpftrace.

[...]

> See attachments for its output (warning: contains an extremely long
> line) and the decoded dmesg while tracing.
> 
> In another clean boot (again), I played around retsnoop to capture the
> Last Branch Records (type: any_return, ind_call) from
> dispatch_hid_bpf_output_report. This time I didn't trigger the issue,
> or else nothing would be captured due to the PF as mentioned above.
> Instead, I pressed Caps Lock on a keyboard under the same receiver
> several times to trigger hidinput_led_worker. I always got:
> 
> [#15] kprobe_multi_link_handler+0x5d      (kernel/trace/bpf_trace.c:2843)
>         -> fprobe_entry+0xe6                   (kernel/trace/fprobe.c:321)
>          . __fprobe_handler                    (kernel/trace/fprobe.c:224)
> [#14] fprobe_entry+0x21c                  (kernel/trace/fprobe.c:336)
>         -> function_graph_enter_regs+0x15d     (kernel/trace/fgraph.c:676)
> [#13] function_graph_enter_regs+0x1cd     (kernel/trace/fgraph.c:718)
>         -> ftrace_graph_func+0x3c              (arch/x86/kernel/ftrace.c:659)
> [#12] ftrace_graph_func+0x4c              (arch/x86/kernel/ftrace.c:661)
>         -> ftrace_trampoline+0x83
> [#11] ftrace_trampoline+0xc2
>         -> dispatch_hid_bpf_output_report+0x9  (drivers/hid/bpf/hid_bpf_dispatch.c:120)
> [#10] __srcu_read_lock+0x20               (kernel/rcu/srcutree.c:757)
>         -> dispatch_hid_bpf_output_report+0x73 (drivers/hid/bpf/hid_bpf_dispatch.c:133)
>          . srcu_read_lock                      (include/linux/srcu.h:252)
> [#09] __srcu_read_unlock+0x1f             (kernel/rcu/srcutree.c:769)
>         -> dispatch_hid_bpf_output_report+0xc5 (drivers/hid/bpf/hid_bpf_dispatch.c:148)
> [#08] dispatch_hid_bpf_output_report+0xe6 (drivers/hid/bpf/hid_bpf_dispatch.c:148)
>         -> return_to_handler+0x0               (arch/x86/kernel/ftrace_64.S:358)
> 
> !    6us [0]  dispatch_hid_bpf_output_report
> 
> Thus, there is indeed no BPF program being called.

Since the Bolt receiver is mine, I also played around with bpftrace and
retsnoop. Surprisingly, dispatch_hid_bpf_output_report was never called
when I unplugged the Bolt receiver or pressed Caps Lock on a keyboard
under it, but hidinput_led_worker was indeed called in both cases.

I am completely unfamiliar with HID stuff, so I traced
hidinput_get_led_field and found that hid->ll_driver->request
pointed to usbhid_request, so hidinput_led_worker returned via a
happy path. The divergence is therefore merely due to the different
driver implementations.

This may explain why the bug only affects certain types of devices.
Besides, most Logitech Unifying receivers stay plugged into users' PCs
and are never unplugged, so the bug can hardly be triggered in those
setups. In my case, I have a Unifying receiver plugged into a dock, so
the first time I encountered the bug was when I used the dock for a
while and then unplugged it (ouch!).

My rough guess is: the bug already existed somewhere but had no
destructive effect. The per-cpu change isn't the bug itself; it just
made the bug destructive.

> Feel free to ask for more experiments :)
> 
> > Cheers,
> > Benjamin
> 
> Thanks,
> Rong
> 
> > > 
> > > :-)
> > > 
> > > Thx.
> > > 
> > > On Sat, May 03, 2025 at 06:40:41PM +0000, bugzilla-daemon@kernel.org wrote:
> > > 
> > > 

[...]

Thanks,
Rong
Comment 11 brgerst 2025-05-07 22:14:35 UTC
On Wed, May 7, 2025 at 7:06 AM Rong Zhang <i@rong.moe> wrote:
>
> Hi Benjamin,
>
> On Tue, 2025-05-06 at 17:35 +0200, Benjamin Tissoires wrote:
> > Hi Boris,
> >
> > On May 06 2025, Borislav Petkov wrote:
> > > Switching to mail.
> > >
> > > Hi Benjamin,
> > >
> > > take a look at the below pls.
> > >
> > > The RIP points to:
> > >
> > >   22:   48 c1 e6 04             shl    $0x4,%rsi
> > >   26:   48 03 77 08             add    0x8(%rdi),%rsi
> > >   2a:*  65 48 ff 46 08          incq   %gs:0x8(%rsi)            <-- trapping instruction
> > >   2f:   c3                      ret
> > >
> > > which really is a %gs-based access and the reporter has bisected this to
> > >
> > >   9d7de2aa8b41 ("x86/percpu/64: Use relative percpu offsets")
> > >
> > > which looks related.
> > >
> > > My silly guess would be some bpf program does per-cpu accesses but it
> > > doesn't know about this change so it tramples over registers. I mean, my
> > > fix would be to disable BPF but you young kids love to play with that...
> >
> > Heh. Well, I would like to know if any HID-BPF program is loaded first.
> > These can be seen by running `sudo tree /sys/fs/bpf/hid/`.
>
> Nothing is there.

[snip]

>
> In case you need it, I decoded a stacktrace (I've upgraded to 6.15-rc5
> BTW):
>
> [14591.438053] usb 7-1.4.4: USB disconnect, device number 7
> [14591.541666] BUG: unable to handle page fault for address: ffff8efd88e65018
> [14591.541674] #PF: supervisor write access in kernel mode
> [14591.541676] #PF: error_code(0x0002) - not-present page
> [14591.541677] PGD 220801067 P4D 220801067 PUD 0
> [14591.541681] Oops: Oops: 0002 [#1] SMP NOPTI
> [14591.541684] CPU: 0 UID: 0 PID: 56816 Comm: kworker/0:2 Not tainted
> 6.15.0-rc5 #1 PREEMPT(lazy)  0538d36f9cfa2dbc3c98efb2730490d8b2399dc4
> [14591.541687] Hardware name: LENOVO 21Q4/LNVNB161216, BIOS PXCN24WW
> 03/11/2025
> [14591.541689] Workqueue: events hidinput_led_worker
> [14591.541693] RIP: 0010:__srcu_read_unlock (kernel/rcu/srcutree.c:768
> (discriminator 1))
> [14591.541697] Code: c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e
> fa 0f 1f 44 00 00 f0 83 44 24 fc 00 48 63 f6 48 c1 e6 04 48 03 77 08 <65> 48
> ff 46 08 c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 90
> All code
> ========
>    0:   c3                      ret
>    1:   cc                      int3
>    2:   cc                      int3
>    3:   cc                      int3
>    4:   cc                      int3
>    5:   66 66 2e 0f 1f 84 00    data16 cs nopw 0x0(%rax,%rax,1)
>    c:   00 00 00 00
>   10:   f3 0f 1e fa             endbr64
>   14:   0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
>   19:   f0 83 44 24 fc 00       lock addl $0x0,-0x4(%rsp)
>   1f:   48 63 f6                movslq %esi,%rsi
>   22:   48 c1 e6 04             shl    $0x4,%rsi
>   26:   48 03 77 08             add    0x8(%rdi),%rsi
>   2a:*  65 48 ff 46 08          incq   %gs:0x8(%rsi)            <-- trapping instruction
>   2f:   c3                      ret
>   30:   cc                      int3
>   31:   cc                      int3
>   32:   cc                      int3
>   33:   cc                      int3
>   34:   66 66 2e 0f 1f 84 00    data16 cs nopw 0x0(%rax,%rax,1)
>   3b:   00 00 00 00
>   3f:   90                      nop
>
> Code starting with the faulting instruction
> ===========================================
>    0:   65 48 ff 46 08          incq   %gs:0x8(%rsi)
>    5:   c3                      ret
>    6:   cc                      int3
>    7:   cc                      int3
>    8:   cc                      int3
>    9:   cc                      int3
>    a:   66 66 2e 0f 1f 84 00    data16 cs nopw 0x0(%rax,%rax,1)
>   11:   00 00 00 00
>   15:   90                      nop
> [14591.541698] RSP: 0018:ffffd0c6094f7d88 EFLAGS: 00010202
> [14591.541700] RAX: 0000000000000000 RBX: ffff8ef67492be08 RCX:
> 0000000000000000
> [14591.541701] RDX: 0000000000000002 RSI: 0000000000000010 RDI:
> ffff8ef67492be38
> [14591.541702] RBP: ffffd0c6094f7df8 R08: 0000000000000000 R09:
> 00000000fffffffd
> [14591.541703] R10: 0000000000000001 R11: 00000000ffffffff R12:
> 0000000000000000
> [14591.541703] R13: ffff8ef70d8143d0 R14: 0000000000000001 R15:
> 0000000000000000
> [14591.541704] FS:  0000000000000000(0000) GS:ffff8efd88e65000(0000)
> knlGS:0000000000000000
> [14591.541705] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [14591.541706] CR2: ffff8efd88e65018 CR3: 00000001d0184000 CR4:
> 0000000000f50ef0
> [14591.541707] PKRU: 55555554
> [14591.541708] Call Trace:
> [14591.541710]  <TASK>
> [14591.541711] dispatch_hid_bpf_output_report
> (drivers/hid/bpf/hid_bpf_dispatch.c:148)
> [14591.541716] hid_hw_output_report (drivers/hid/hid-core.c:2500
> drivers/hid/hid-core.c:2520)
> [14591.541717] hidinput_led_worker (drivers/hid/hid-input.c:1838)
> [14591.541719] process_one_work (kernel/workqueue.c:3238)
> [14591.541721] worker_thread (kernel/workqueue.c:3313 (discriminator 2)
> kernel/workqueue.c:3400 (discriminator 2))
> [14591.541723] ? rescuer_thread (kernel/workqueue.c:3346)
> [14591.541724] kthread (kernel/kthread.c:464)
> [14591.541727] ? kthreads_online_cpu (kernel/kthread.c:413)
> [14591.541729] ret_from_fork (arch/x86/kernel/process.c:153)
> [14591.541731] ? kthreads_online_cpu (kernel/kthread.c:413)
> [14591.541733] ret_from_fork_asm (arch/x86/entry/entry_64.S:255)
> [14591.541737]  </TASK>
> [14591.541738] Modules linked in: mmc_block rpmb_core udp_diag tcp_diag
> inet_diag xt_mark ccm snd_hrtimer snd_seq_dummy snd_seq_midi snd_seq_oss
> snd_seq_midi_event snd_rawmidi snd_seq snd_seq_device xt_CHECKSUM
> xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp nft_compat
> nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 tun bridge
> stp llc nf_tables qrtr uhid rfcomm cmac algif_hash algif_skcipher af_alg
> overlay bnep sunrpc vfat fat btusb uvcvideo btrtl videobuf2_vmalloc btintel
> uvc videobuf2_memops btbcm videobuf2_v4l2 btmtk videobuf2_common bluetooth
> videodev mc intel_rapl_msr amd_atl intel_rapl_common snd_acp_legacy_mach
> snd_acp_mach snd_soc_nau8821 snd_acp3x_rn snd_acp70 snd_acp_i2s snd_acp_pdm
> snd_soc_dmic snd_acp_pcm snd_sof_amd_acp70 snd_sof_amd_acp63
> snd_sof_amd_vangogh snd_sof_amd_rembrandt snd_sof_amd_renoir snd_sof_amd_acp
> snd_sof_pci snd_sof_xtensa_dsp snd_sof snd_sof_utils snd_pci_ps
> snd_soc_acpi_amd_match snd_hda_codec_realtek snd_amd_sdw_acpi soundwire_amd
> snd_hda_codec_generic
> [14591.541775]  soundwire_generic_allocation snd_hda_scodec_component
> soundwire_bus snd_soc_sdca snd_hda_codec_hdmi snd_soc_core mt7925e
> snd_hda_intel snd_compress mt7925_common ac97_bus snd_intel_dspcfg kvm_amd
> mt792x_lib snd_pcm_dmaengine snd_intel_sdw_acpi snd_rpl_pci_acp6x
> mt76_connac_lib snd_hda_codec snd_acp_pci kvm mt76 snd_amd_acpi_mach
> snd_hda_core snd_acp_legacy_common irqbypass think_lmi snd_pci_acp6x
> snd_hwdep rapl snd_ctl_led pcspkr mac80211 snd_pcm_oss
> firmware_attributes_class lenovo_wmi_hotkey_utilities snd_mixer_oss
> snd_pci_acp5x libarc4 snd_pcm wmi_bmof snd_rn_pci_acp3x spd5118 snd_timer
> snd_acp_config cfg80211 snd snd_soc_acpi hid_sensor_als soundcore amdxdna
> amd_pmf hid_sensor_trigger snd_pci_acp3x k10temp rfkill
> industrialio_triggered_buffer amdtee kfifo_buf joydev hid_sensor_iio_common
> ccp industrialio mousedev platform_profile amd_pmc mac_hid sch_fq_codel
> uinput i2c_dev parport_pc ppdev lp parport nvme_fabrics nfnetlink ip_tables
> x_tables hid_logitech_hidpp hid_logitech_dj usbhid dm_crypt
> [14591.541811]  encrypted_keys trusted asn1_encoder tee dm_mod raid10 raid456
> async_raid6_recov async_memcpy async_pq async_xor async_tx raid1 raid0 linear
> md_mod igc ptp pps_core r8153_ecm r8152 cdc_ether usbnet mii amdgpu
> i2c_algo_bit drm_ttm_helper ttm drm_panel_backlight_quirks polyval_clmulni
> polyval_generic drm_exec drm_suballoc_helper ghash_clmulni_intel amdxcp
> sha512_ssse3 drm_buddy sdhci_pci sha256_ssse3 sp5100_tco r8169 nvme
> sdhci_uhs2 gpu_sched sha1_ssse3 serio_raw hid_sensor_custom sdhci nvme_core
> realtek aesni_intel ucsi_acpi atkbd drm_display_helper cqhci libps2
> crypto_simd typec_ucsi hid_multitouch i2c_piix4 nvme_keyring mdio_devres
> hid_sensor_hub hid_generic thunderbolt vivaldi_fmap cryptd typec cec libphy
> mmc_core amd_sfh video i8042 nvme_auth i2c_smbus roles i2c_hid_acpi serio wmi
> i2c_hid
> [14591.541846] CR2: ffff8efd88e65018
> [14591.541848] ---[ end trace 0000000000000000 ]---

So what we have here is a function that takes two parameters: what
looks like an index in RSI, and a pointer to a structure in RDI.
Looking at the register dump, RSI ends up with the value
0000000000000010, which could very likely be an index of 1 shifted and
added to a NULL pointer from the structure.  On the old zero-based
percpu, that would be a valid address, but would corrupt whatever was
there.  So I don't think this is a problem with the bisected commit,
it just exposed an existing bug.
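
The address arithmetic above can be checked against the register dump.
What follows is a minimal sketch in plain userspace C, not kernel code;
the helper name percpu_fault_address and its parameters are made up for
illustration. It replays the instructions leading up to the fault,
assuming the pointer loaded from 0x8(%rdi) was NULL and the index was 1:

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of the kernel): recompute the address
 * touched by the trapping instruction from the decoded disassembly.
 *
 *   movslq %esi,%rsi       sign-extend the index
 *   shl    $0x4,%rsi       index * 16
 *   add    0x8(%rdi),%rsi  add the pointer loaded from the structure
 *   incq   %gs:0x8(%rsi)   access GS base + that value + 8
 */
static inline uint64_t percpu_fault_address(uint64_t gs_base,
					    uint64_t ptr_from_struct,
					    int idx)
{
	uint64_t rsi = (uint64_t)(int64_t)idx;	/* movslq */

	rsi <<= 4;				/* shl $0x4 */
	rsi += ptr_from_struct;			/* add 0x8(%rdi) */

	return gs_base + rsi + 0x8;		/* %gs:0x8(%rsi) */
}
```

With the GS base from the 6.15-rc5 oops (ffff8efd88e65000), a NULL
pointer and an index of 1, RSI becomes 0x10 and the result is
ffff8efd88e65018, which matches both the RSI and CR2 values in the dump.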


Brian Gerst
Comment 12 Rong Zhang 2025-05-08 12:19:06 UTC
Hi all,

On Wed, 2025-05-07 at 18:14 -0400, Brian Gerst wrote:
> On Wed, May 7, 2025 at 7:06 AM Rong Zhang <i@rong.moe> wrote:
> > 
> > Hi Benjamin,
> > 
> > On Tue, 2025-05-06 at 17:35 +0200, Benjamin Tissoires wrote:
> > > Hi Boris,
> > > 
> > > On May 06 2025, Borislav Petkov wrote:
> > > > Switching to mail.
> > > > 
> > > > Hi Benjamin,
> > > > 
> > > > take a look at the below pls.
> > > > 
> > > > The RIP points to:
> > > > 
> > > >   22:   48 c1 e6 04             shl    $0x4,%rsi
> > > >   26:   48 03 77 08             add    0x8(%rdi),%rsi
> > > > >   2a:*  65 48 ff 46 08          incq   %gs:0x8(%rsi)            <-- trapping instruction
> > > >   2f:   c3                      ret
> > > > 
> > > > > which really is a %gs-based access and the reporter has bisected this to
> > > > 
> > > >   9d7de2aa8b41 ("x86/percpu/64: Use relative percpu offsets")
> > > > 
> > > > which looks related.
> > > > 
> > > > > My silly guess would be some bpf program does per-cpu accesses but it doesn't
> > > > > know about this change so it tramples over registers. I mean, my fix would be
> > > > > to disable BPF but you young kids love to play with that...
> > > 
> > > Heh. Well, I would like to know if any HID-BPF program is loaded first.
> > > These can be seen by running `sudo tree /sys/fs/bpf/hid/`.
> > 
> > Nothing is there.
> 
> [snip]
> 
> > 
> > In case you need it, I decoded a stacktrace (I've upgraded to 6.15-rc5
> > BTW):
> > 
> > [14591.438053] usb 7-1.4.4: USB disconnect, device number 7
> > [14591.541666] BUG: unable to handle page fault for address: ffff8efd88e65018
> > [14591.541674] #PF: supervisor write access in kernel mode
> > [14591.541676] #PF: error_code(0x0002) - not-present page
> > [14591.541677] PGD 220801067 P4D 220801067 PUD 0
> > [14591.541681] Oops: Oops: 0002 [#1] SMP NOPTI
> > [14591.541684] CPU: 0 UID: 0 PID: 56816 Comm: kworker/0:2 Not tainted 6.15.0-rc5 #1 PREEMPT(lazy)  0538d36f9cfa2dbc3c98efb2730490d8b2399dc4
> > [14591.541687] Hardware name: LENOVO 21Q4/LNVNB161216, BIOS PXCN24WW 03/11/2025
> > [14591.541689] Workqueue: events hidinput_led_worker
> > [14591.541693] RIP: 0010:__srcu_read_unlock (kernel/rcu/srcutree.c:768 (discriminator 1))
> > [14591.541697] Code: c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 0f 1f 44 00 00 f0 83 44 24 fc 00 48 63 f6 48 c1 e6 04 48 03 77 08 <65> 48 ff 46 08 c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 90
> > All code
> > ========
> >    0:   c3                      ret
> >    1:   cc                      int3
> >    2:   cc                      int3
> >    3:   cc                      int3
> >    4:   cc                      int3
> >    5:   66 66 2e 0f 1f 84 00    data16 cs nopw 0x0(%rax,%rax,1)
> >    c:   00 00 00 00
> >   10:   f3 0f 1e fa             endbr64
> >   14:   0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
> >   19:   f0 83 44 24 fc 00       lock addl $0x0,-0x4(%rsp)
> >   1f:   48 63 f6                movslq %esi,%rsi
> >   22:   48 c1 e6 04             shl    $0x4,%rsi
> >   26:   48 03 77 08             add    0x8(%rdi),%rsi
> >   2a:*  65 48 ff 46 08          incq   %gs:0x8(%rsi)            <-- trapping instruction
> >   2f:   c3                      ret
> >   30:   cc                      int3
> >   31:   cc                      int3
> >   32:   cc                      int3
> >   33:   cc                      int3
> >   34:   66 66 2e 0f 1f 84 00    data16 cs nopw 0x0(%rax,%rax,1)
> >   3b:   00 00 00 00
> >   3f:   90                      nop
> > 
> > Code starting with the faulting instruction
> > ===========================================
> >    0:   65 48 ff 46 08          incq   %gs:0x8(%rsi)
> >    5:   c3                      ret
> >    6:   cc                      int3
> >    7:   cc                      int3
> >    8:   cc                      int3
> >    9:   cc                      int3
> >    a:   66 66 2e 0f 1f 84 00    data16 cs nopw 0x0(%rax,%rax,1)
> >   11:   00 00 00 00
> >   15:   90                      nop
> > [14591.541698] RSP: 0018:ffffd0c6094f7d88 EFLAGS: 00010202
> > [14591.541700] RAX: 0000000000000000 RBX: ffff8ef67492be08 RCX: 0000000000000000
> > [14591.541701] RDX: 0000000000000002 RSI: 0000000000000010 RDI: ffff8ef67492be38
> > [14591.541702] RBP: ffffd0c6094f7df8 R08: 0000000000000000 R09: 00000000fffffffd
> > [14591.541703] R10: 0000000000000001 R11: 00000000ffffffff R12: 0000000000000000
> > [14591.541703] R13: ffff8ef70d8143d0 R14: 0000000000000001 R15: 0000000000000000
> > [14591.541704] FS:  0000000000000000(0000) GS:ffff8efd88e65000(0000) knlGS:0000000000000000
> > [14591.541705] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [14591.541706] CR2: ffff8efd88e65018 CR3: 00000001d0184000 CR4: 0000000000f50ef0
> > [14591.541707] PKRU: 55555554
> > [14591.541708] Call Trace:
> > [14591.541710]  <TASK>
> > [14591.541711] dispatch_hid_bpf_output_report (drivers/hid/bpf/hid_bpf_dispatch.c:148)
> > [14591.541716] hid_hw_output_report (drivers/hid/hid-core.c:2500 drivers/hid/hid-core.c:2520)
> > [14591.541717] hidinput_led_worker (drivers/hid/hid-input.c:1838)
> > [14591.541719] process_one_work (kernel/workqueue.c:3238)
> > [14591.541721] worker_thread (kernel/workqueue.c:3313 (discriminator 2) kernel/workqueue.c:3400 (discriminator 2))
> > [14591.541723] ? rescuer_thread (kernel/workqueue.c:3346)
> > [14591.541724] kthread (kernel/kthread.c:464)
> > [14591.541727] ? kthreads_online_cpu (kernel/kthread.c:413)
> > [14591.541729] ret_from_fork (arch/x86/kernel/process.c:153)
> > [14591.541731] ? kthreads_online_cpu (kernel/kthread.c:413)
> > [14591.541733] ret_from_fork_asm (arch/x86/entry/entry_64.S:255)
> > [14591.541737]  </TASK>
> > [14591.541738] Modules linked in: mmc_block rpmb_core udp_diag tcp_diag
> inet_diag xt_mark ccm snd_hrtimer snd_seq_dummy snd_seq_midi snd_seq_oss
> snd_seq_midi_event snd_rawmidi snd_seq snd_seq_device xt_CHECKSUM
> xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp nft_compat
> nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 tun bridge
> stp llc nf_tables qrtr uhid rfcomm cmac algif_hash algif_skcipher af_alg
> overlay bnep sunrpc vfat fat btusb uvcvideo btrtl videobuf2_vmalloc btintel
> uvc videobuf2_memops btbcm videobuf2_v4l2 btmtk videobuf2_common bluetooth
> videodev mc intel_rapl_msr amd_atl intel_rapl_common snd_acp_legacy_mach
> snd_acp_mach snd_soc_nau8821 snd_acp3x_rn snd_acp70 snd_acp_i2s snd_acp_pdm
> snd_soc_dmic snd_acp_pcm snd_sof_amd_acp70 snd_sof_amd_acp63
> snd_sof_amd_vangogh snd_sof_amd_rembrandt snd_sof_amd_renoir snd_sof_amd_acp
> snd_sof_pci snd_sof_xtensa_dsp snd_sof snd_sof_utils snd_pci_ps
> snd_soc_acpi_amd_match snd_hda_codec_realtek snd_amd_sdw_acpi soundwire_amd
> snd_hda_codec_generic
> > [14591.541775]  soundwire_generic_allocation snd_hda_scodec_component
> soundwire_bus snd_soc_sdca snd_hda_codec_hdmi snd_soc_core mt7925e
> snd_hda_intel snd_compress mt7925_common ac97_bus snd_intel_dspcfg kvm_amd
> mt792x_lib snd_pcm_dmaengine snd_intel_sdw_acpi snd_rpl_pci_acp6x
> mt76_connac_lib snd_hda_codec snd_acp_pci kvm mt76 snd_amd_acpi_mach
> snd_hda_core snd_acp_legacy_common irqbypass think_lmi snd_pci_acp6x
> snd_hwdep rapl snd_ctl_led pcspkr mac80211 snd_pcm_oss
> firmware_attributes_class lenovo_wmi_hotkey_utilities snd_mixer_oss
> snd_pci_acp5x libarc4 snd_pcm wmi_bmof snd_rn_pci_acp3x spd5118 snd_timer
> snd_acp_config cfg80211 snd snd_soc_acpi hid_sensor_als soundcore amdxdna
> amd_pmf hid_sensor_trigger snd_pci_acp3x k10temp rfkill
> industrialio_triggered_buffer amdtee kfifo_buf joydev hid_sensor_iio_common
> ccp industrialio mousedev platform_profile amd_pmc mac_hid sch_fq_codel
> uinput i2c_dev parport_pc ppdev lp parport nvme_fabrics nfnetlink ip_tables
> x_tables hid_logitech_hidpp hid_logitech_dj usbhid dm_crypt
> > [14591.541811]  encrypted_keys trusted asn1_encoder tee dm_mod raid10
> raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx raid1
> raid0 linear md_mod igc ptp pps_core r8153_ecm r8152 cdc_ether usbnet mii
> amdgpu i2c_algo_bit drm_ttm_helper ttm drm_panel_backlight_quirks
> polyval_clmulni polyval_generic drm_exec drm_suballoc_helper
> ghash_clmulni_intel amdxcp sha512_ssse3 drm_buddy sdhci_pci sha256_ssse3
> sp5100_tco r8169 nvme sdhci_uhs2 gpu_sched sha1_ssse3 serio_raw
> hid_sensor_custom sdhci nvme_core realtek aesni_intel ucsi_acpi atkbd
> drm_display_helper cqhci libps2 crypto_simd typec_ucsi hid_multitouch
> i2c_piix4 nvme_keyring mdio_devres hid_sensor_hub hid_generic thunderbolt
> vivaldi_fmap cryptd typec cec libphy mmc_core amd_sfh video i8042 nvme_auth
> i2c_smbus roles i2c_hid_acpi serio wmi i2c_hid
> > [14591.541846] CR2: ffff8efd88e65018
> > [14591.541848] ---[ end trace 0000000000000000 ]---
> 
> So what we have here is a function that takes two parameters: what
> looks like an index in RSI, and a pointer to a structure in RDI.
> Looking at the register dump, RSI ends up with the value
> 0000000000000010, which could very likely be an index of 1 shifted and
> added to a NULL pointer from the structure.  On the old zero-based
> percpu, that would be a valid address, but would corrupt whatever was
> there.  So I don't think this is a problem with the bisected commit,
> it just exposed an existing bug.

Brian,

Thanks for your explanation!

Given your explanation, I think there is a race on unplug, as Benjamin
said. So I double-checked the bpftrace output (from my previous reply)
and found:

   .bpf = {
      .device_data = 0x0, .allocated_data = 0, .destroyed = 1, 
      [...]
      .srcu = {
         .srcu_ctrp = 0x61896b5b0290, .sda = 0x0, .dep_map = {  },
         .srcu_sup = 0x0
      }
   }

So yes, the device was destroyed while an LED work item was still dangling.

Benjamin,

I applied the patch below, and my setup is now usable without repeated
reboots to make USB recover. The patch doesn't fix the bug itself, but
it reports such races gently (a WARN plus an error return) whenever they
occur. That prevents the situation we are dealing with now: an unrelated
low-level change exposing an existing bug in a destructive way.

Is the patch worth submitting? If so, do you have any opinion on it?

Will keep debugging to reveal more details and try to fix the root
cause...

> Brian Gerst

Thanks,
Rong

---
 drivers/hid/bpf/hid_bpf_dispatch.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/hid/bpf/hid_bpf_dispatch.c b/drivers/hid/bpf/hid_bpf_dispatch.c
index 2e96ec6a3073..b9d19416c243 100644
--- a/drivers/hid/bpf/hid_bpf_dispatch.c
+++ b/drivers/hid/bpf/hid_bpf_dispatch.c
@@ -10,6 +10,7 @@
 #include <linux/bitops.h>
 #include <linux/btf.h>
 #include <linux/btf_ids.h>
+#include <linux/bug.h>
 #include <linux/filter.h>
 #include <linux/hid.h>
 #include <linux/hid_bpf.h>
@@ -38,6 +39,9 @@ dispatch_hid_bpf_device_event(struct hid_device *hdev, enum hid_report_type type
 	struct hid_bpf_ops *e;
 	int ret;
 
+	if (WARN_ON(hdev->bpf.destroyed))
+		return ERR_PTR(-ENODEV);
+
 	if (type >= HID_REPORT_TYPES)
 		return ERR_PTR(-EINVAL);
 
@@ -93,6 +97,9 @@ int dispatch_hid_bpf_raw_requests(struct hid_device *hdev,
 	struct hid_bpf_ops *e;
 	int ret, idx;
 
+	if (WARN_ON(hdev->bpf.destroyed))
+		return -ENODEV;
+
 	if (rtype >= HID_REPORT_TYPES)
 		return -EINVAL;
 
@@ -130,6 +137,9 @@ int dispatch_hid_bpf_output_report(struct hid_device *hdev,
 	struct hid_bpf_ops *e;
 	int ret, idx;
 
+	if (WARN_ON(hdev->bpf.destroyed))
+		return -ENODEV;
+
 	idx = srcu_read_lock(&hdev->bpf.srcu);
 	list_for_each_entry_srcu(e, &hdev->bpf.prog_list, list,
 				 srcu_read_lock_held(&hdev->bpf.srcu)) {