Bug 208097

Summary: [Bisected][Regression] kernels greater than 4.14.x fail to boot with Hyperthreading enabled (Shuttle XPC SS51G w. Pentium 4 HT)
Product: Platform Specific/Hardware Reporter: Erhard F. (erhard_f)
Component: i386Assignee: platform_i386
Status: NEW ---    
Severity: normal CC: tglx
Priority: P1    
Hardware: i386   
OS: Linux   
Kernel Version: 5.7.0 Subsystem:
Regression: Yes Bisected commit-id:
Attachments: kernel .config (kernel 4.15-rc8+, Shuttle XPC FS51, Pentium 4 HT)
kernel dmesg (kernel 4.15-rc8+, Shuttle XPC FS51, Pentium 4 HT)
kernel dmesg (kernel 4.14.178, Shuttle XPC FS51, Pentium 4 HT)
bisect.log
dmesg (kernel 5.11-rc7, Shuttle XPC FS51, Pentium 4)
kernel .config (kernel 5.11-rc7, Shuttle XPC FS51, Pentium 4)

Description Erhard F. 2020-06-07 16:34:33 UTC
Created attachment 289551 [details]
kernel .config (kernel 4.15-rc8+, Shuttle XPC FS51, Pentium 4 HT)

Kernels >4.14.x show weird sorts of problems (swap partition fails to start, USB errors, WLAN errors) when Hyperthreading is enabled. Most of the time the machine won't finish booting and crashes with stacktraces like this:

[...]
? usb_probe_interface+0x192/0x192 [usbcore]
usb_runtime_suspend +0x21/0x52 [usbcore]
__rpm_callback+0xb4/0x141
? usb_probe_interface+0x192/0x192 [usbcore]
? usb_probe_interface+0x192/0x192 [usbcore]
rpm_callbackk+0x46/0x63
? usb_probe_interface+0x192/0x192 [usbcore]
rpm_suspend+0x25f/0x39b
? _raw_spin_lock_irqsave+0x2f/0x38
__pm_runtime_suspend+0x4b/0x5f
usb_runtime_idle+0x1e/0x26 [usbcore]
__rpm_callback+0xb4/0x141
? usb_runtime_resume+0x12/0x12 [usbcore]
rpm_idle+0x12e/0x187
pm_runtime_work+0x43/0x77
process_one_work+0x215/0x36c
worker_thread+0x164/0x1fb
kthread+0xce/0xd0
? rescuer_thread+0x1e5/0x1e5
? kthread_destroy_worker+0x4e/0x4e
ret_from_fork+0x19/0x24
Kernel Offset: 0x1e000000 from 0xc1000000 (reloca ?? 000000-0xf7b7dfff)
Rebooting in 120 seconds..

With Hyperthreading disabled in BIOS, the very same kernel just works fine. I was curious and did a git-bisect which lead to the following offending commit:

 # git bisect bad | tee -a ~/bisect02.log
a0c9259dc4e1923a98356967ce8b732da1979df8 is the first bad commit
commit a0c9259dc4e1923a98356967ce8b732da1979df8
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Wed Jan 17 16:01:47 2018 +0100

    irq/matrix: Spread interrupts on allocation
    
    Keith reported an issue with vector space exhaustion on a server machine
    which is caused by the i40e driver allocating 168 MSI interrupts when the
    driver is initialized, even when most of these interrupts are not used at
    all.
    
    The x86 vector allocation code tries to avoid the immediate allocation with
    the reservation mode, but the card uses MSI and does not support MSI entry
    masking, which prevents reservation mode and requires immediate vector
    allocation.
    
    The matrix allocator is a bit naive and prefers the first CPU in the
    cpumask which describes the possible target CPUs for an allocation. That
    results in allocating all 168 vectors on CPU0 which later causes vector
    space exhaustion when the NVMe driver tries to allocate managed interrupts
    on each CPU for the per CPU queues.
    
    Avoid this by finding the CPU which has the lowest vector allocation count
    to spread out the non managed interrupt accross the possible target CPUs.
    
    Fixes: 2f75d9e1c905 ("genirq: Implement bitmap matrix allocator")
    Reported-by: Keith Busch <keith.busch@intel.com>
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Tested-by: Keith Busch <keith.busch@intel.com>
    Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801171557330.1777@nanos

 kernel/irq/matrix.c | 20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)


Some data about the machine:

 # inxi -b
System:    Kernel: 4.14.178-gentoo-Pentium4 i686 bits: 32 Console: tty 0 
           Distro: Gentoo Base System release 2.6 
Machine:   Type: Desktop Mobo: Shuttle model: FS51 serial: N/A BIOS: Phoenix v: 6.00 PG 
           date: 12/02/2003 
CPU:       Single Core: Intel Pentium 4 type: MT speed: 3063 MHz 
Graphics:  Device-1: AMD RV350 [Radeon 9550/9600/X1050 Series] driver: N/A 
           Display: server: X.org 1.20.7 driver: ati,radeon unloaded: fbdev,modesetting tty: 104x53 
Network:   Device-1: Ralink RT2500 Wireless 802.11bg driver: rt2500pci 
           Device-2: Realtek RTL-8100/8101L/8139 PCI Fast Ethernet Adapter driver: 8139too 
Drives:    Local Storage: total: 76.69 GiB used: 4.43 GiB (5.8%) 
Info:      Processes: 79 Uptime: 3h 29m Memory: 1.97 GiB used: 329.1 MiB (16.3%) Init: systemd 
           Shell: bash inxi: 3.0.38 

 # lscpu 
Architecture:        i686
CPU op-mode(s):      32-bit
Byte Order:          Little Endian
Address sizes:       36 bits physical, 32 bits virtual
CPU(s):              2
On-line CPU(s) list: 0,1
Thread(s) per core:  2
Core(s) per socket:  1
Socket(s):           1
Vendor ID:           GenuineIntel
CPU family:          15
Model:               2
Model name:          Intel(R) Pentium(R) 4 CPU 3.06GHz
Stepping:            7
CPU MHz:             3062.686
BogoMIPS:            6127.59
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe pebs bts cpuid cid xtpr
Comment 1 Erhard F. 2020-06-07 16:36:51 UTC
Created attachment 289553 [details]
kernel dmesg (kernel 4.15-rc8+, Shuttle XPC FS51, Pentium 4 HT)
Comment 2 Erhard F. 2020-06-07 16:38:54 UTC
Created attachment 289555 [details]
kernel dmesg (kernel 4.14.178, Shuttle XPC FS51, Pentium 4 HT)
Comment 3 Erhard F. 2020-06-07 16:43:10 UTC
Created attachment 289557 [details]
bisect.log
Comment 4 Erhard F. 2021-02-09 22:53:00 UTC
Created attachment 295173 [details]
dmesg (kernel 5.11-rc7, Shuttle XPC FS51, Pentium 4)

Now with a proper stacktrace made with netconsole.

The "inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage." which shows up at the beginning is bug #211575 and shows up seperate from HyperThreadine enabled.
Comment 5 Erhard F. 2021-02-09 22:54:28 UTC
Created attachment 295175 [details]
kernel .config (kernel 5.11-rc7, Shuttle XPC FS51, Pentium 4)
Comment 6 Thomas Gleixner 2021-02-10 10:35:50 UTC
On Tue, Feb 09 2021 at 22:53, bugzilla-daemon wrote:
> dmesg (kernel 5.11-rc7, Shuttle XPC FS51, Pentium 4)
>
> Now with a proper stacktrace made with netconsole.
>
> The "inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage." which shows up at
> the beginning is bug #211575 and shows up seperate from HyperThreadine
> enabled.

Yes, that's unrelated and an issue with the 8139 netconsole.

That's an odd failure. It's stuck waiting for an interrupt to
synchronize. Can you try to boot with:

 irqaffinity=0

on the kernel command line?

If that makes it boot then you might try to move e.g. the sound
interrupt to CPU1 via /proc/irq/$IRQ/smp_affinity[_list].

Thanks,

        tglx
Comment 7 Erhard F. 2021-02-11 15:43:09 UTC
(In reply to Thomas Gleixner from comment #6)
> That's an odd failure. It's stuck waiting for an interrupt to
> synchronize. Can you try to boot with:
> 
>  irqaffinity=0
Yes, that did the trick! The machine runs fine so far.

> If that makes it boot then you might try to move e.g. the sound
> interrupt to CPU1 via /proc/irq/$IRQ/smp_affinity[_list].
How to do that at boot time? Most of the time the machine won't finish booting, being able ro run user scripts afterwards.
Comment 8 Thomas Gleixner 2021-02-11 23:42:22 UTC
On Thu, Feb 11 2021 at 15:43, bugzilla-daemon wrote:
>> That's an odd failure. It's stuck waiting for an interrupt to
>> synchronize. Can you try to boot with:
>> 
>>  irqaffinity=0
> Yes, that did the trick! The machine runs fine so far.
>
>> If that makes it boot then you might try to move e.g. the sound
>> interrupt to CPU1 via /proc/irq/$IRQ/smp_affinity[_list].
> How to do that at boot time? Most of the time the machine won't finish
> booting,
> being able ro run user scripts afterwards.

Not at boot time. With that command line option the machine seems to
run. So you have a working system, right?

If so , can you please provide the output of

 # cat /proc/interrupts

Thanks,

        tglx
Comment 9 Erhard F. 2021-02-23 12:35:55 UTC
(In reply to Thomas Gleixner from comment #8)
> If so , can you please provide the output of
> 
>  # cat /proc/interrupts
> 
> Thanks,
> 
>         tglx
 # cat /proc/interrupts
           CPU0       CPU1       
  0:         40          0   IO-APIC   2-edge      timer
  8:          0          0   IO-APIC   8-edge      rtc0
  9:          0          0   IO-APIC   9-fasteoi   acpi
 14:       6672          0   IO-APIC  14-edge      pata_sis
 15:          0          0   IO-APIC  15-edge      pata_sis
 16:         67          0   IO-APIC  16-fasteoi   radeon
 17:       2775          0   IO-APIC  17-fasteoi   rt2500pci
 18:       1789          0   IO-APIC  18-fasteoi   eth0, snd_intel8x0
 19:          1          0   IO-APIC  19-fasteoi   firewire_ohci
 20:          0          0   IO-APIC  20-fasteoi   ohci_hcd:usb2
 21:          0          0   IO-APIC  21-fasteoi   ohci_hcd:usb3
 22:          0          0   IO-APIC  22-fasteoi   ohci_hcd:usb4
 23:          0          0   IO-APIC  23-fasteoi   ehci_hcd:usb1
NMI:          3          3   Non-maskable interrupts
LOC:      16611      15587   Local timer interrupts
SPU:          0          0   Spurious interrupts
PMI:          2          2   Performance monitoring interrupts
IWI:          0          0   IRQ work interrupts
RTR:          0          0   APIC ICR read retries
RES:       5782      12575   Rescheduling interrupts
CAL:        794        871   Function call interrupts
TLB:        805        549   TLB shootdowns
TRM:          0          0   Thermal event interrupts
THR:          0          0   Threshold APIC interrupts
MCE:          0          0   Machine check exceptions
MCP:          1          1   Machine check polls
ERR:          0
MIS:          0
PIN:          0          0   Posted-interrupt notification event
NPI:          0          0   Nested posted-interrupt event
PIW:          0          0   Posted-interrupt wakeup event
Comment 10 Thomas Gleixner 2021-02-23 18:25:39 UTC
On Tue, Feb 23 2021 at 12:35, bugzilla-daemon wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=208097
>
> --- Comment #9 from Erhard F. (erhard_f@mailbox.org) ---
> (In reply to Thomas Gleixner from comment #8)
>> If so , can you please provide the output of
>> 
>>  # cat /proc/interrupts
>> 
>> Thanks,
>> 
>>         tglx
>  # cat /proc/interrupts
>            CPU0       CPU1       
>   0:         40          0   IO-APIC   2-edge      timer
>   8:          0          0   IO-APIC   8-edge      rtc0
>   9:          0          0   IO-APIC   9-fasteoi   acpi
>  14:       6672          0   IO-APIC  14-edge      pata_sis
>  15:          0          0   IO-APIC  15-edge      pata_sis
>  16:         67          0   IO-APIC  16-fasteoi   radeon
>  17:       2775          0   IO-APIC  17-fasteoi   rt2500pci
>  18:       1789          0   IO-APIC  18-fasteoi   eth0, snd_intel8x0
>  19:          1          0   IO-APIC  19-fasteoi   firewire_ohci
>  20:          0          0   IO-APIC  20-fasteoi   ohci_hcd:usb2
>  21:          0          0   IO-APIC  21-fasteoi   ohci_hcd:usb3
>  22:          0          0   IO-APIC  22-fasteoi   ohci_hcd:usb4
>  23:          0          0   IO-APIC  23-fasteoi   ehci_hcd:usb1

Ok. So all device interrupts end up on CPU 0 which is what we told the
kernel to do with that commandline option.

What happens if you do the following:

 1) Boot with irqaffinity=0 on the command line

 2) run: echo 1 >/proc/irq/18/smp_affinity_list

 3) Shut down the network interface and/or use sound on and off

In theory this should lead to a similar situation as you saw without
that command line option.

Thanks,

        tglx
Comment 11 Erhard F. 2021-02-25 02:44:24 UTC
(In reply to Thomas Gleixner from comment #10)
> What happens if you do the following:
> 
>  1) Boot with irqaffinity=0 on the command line
> 
>  2) run: echo 1 >/proc/irq/18/smp_affinity_list
> 
>  3) Shut down the network interface and/or use sound on and off
A few monents I run "echo 1 >/proc/irq/18/smp_affinity_list" I get this:

------------[ cut here ]------------
NETDEV WATCHDOG: eth0 (8139too): transmit queue 0 timed out
WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:442 dev_watchdog+0x15f/0x1ad
Modules linked in: auth_rpcgss nfsv4 dns_resolver nfs lockd grace sunrpc nfs_ssc rt2500pci eeprom_93cx6 rt2x00pci rt2x00mmio radeon rt2x00lib led_class mac80211 hwmon i2c_algo_bit drm_ttm_helper cfg80211 ttm drm_kms_helper cfbfillrect snd_intel8x0 syscopyarea snd_ac97_codec ohci_pci cfbimgblt ehci_pci ohci_hcd sysfillrect ehci_hcd ac97_bus sysimgblt fb_sys_fops snd_pcm cfbcopyarea usbcore evdev fb firewire_ohci firewire_core fan snd_timer thermal snd 8250 font sr_mod fbdev cdrom 8250_base serial_core rfkill soundcore button usb_common libarc4 crc_itu_t i2c_sis96x drm configfs fuse drm_panel_orientation_quirks backlight
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.11.1-gentoo-Pentium4 #2
Hardware name:  /FS51, BIOS 6.00 PG 12/02/2003
EIP: dev_watchdog+0x15f/0x1ad
Code: 3d 79 ad a9 c4 00 75 34 c6 05 79 ad a9 c4 01 8b 45 f0 e8 cd 32 fd ff 56 50 8d 83 fc fc ff ff 50 68 fc 2f 9a c4 e8 e5 b6 0b 00 <0f> 0b 83 c4 10 eb 0b 46 05 00 02 00 00 e9 02 ff ff ff 8b 83 18 fe
EAX: 0000003b EBX: c108c304 ECX: f63c41fc EDX: fffff000
ESI: 00000000 EDI: ffff1d40 EBP: c1083f44 ESP: c1083f24
DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00210282
CR0: 80050033 CR2: 0164e02c CR3: 05fd5000 CR4: 000006d0
Call Trace:
 <SOFTIRQ>
 ? pfifo_fast_change_tx_queue_len+0x193/0x193
 call_timer_fn+0xfe/0x201
 __run_timers+0x134/0x159
 ? pfifo_fast_change_tx_queue_len+0x193/0x193
 ? rt2500pci_rxdone_tasklet+0x4e/0x52 [rt2500pci]
 ? tasklet_action_common.constprop.0+0x67/0xa3
 run_timer_softirq+0x14/0x27
 __do_softirq+0x15f/0x307
 ? __entry_text_end+0x5/0x5
 call_on_stack+0x40/0x46
 </SOFTIRQ>
 ? __irq_exit_rcu+0x4f/0x85
 ? irq_exit_rcu+0x8/0x11
 ? sysvec_apic_timer_interrupt+0x3e/0x4b
 ? handle_exception+0x10e/0x10e
 ? do_idle+0xb7/0x1c3
 ? ldsem_down_write+0x1f/0x1f
 ? rebalance_domains+0x125/0x292
 ? default_idle+0xa/0xc
 ? open_ctree+0xe03/0x134f
 ? sysvec_call_function_single+0x49/0x49
 ? default_idle+0xa/0xc
 ? arch_cpu_idle+0xd/0xf
 ? default_idle_call+0x48/0x74
 ? do_idle+0xb7/0x1c3
 ? cpu_startup_entry+0x19/0x1b
 ? rest_init+0x11d/0x120
 ? arch_call_rest_init+0x8/0xb
 ? start_kernel+0x40d/0x41b
 ? i386_start_kernel+0x43/0x45
 ? startup_32_smp+0x164/0x168
irq event stamp: 14810
hardirqs last  enabled at (14810): [<c4680e4a>] net_rx_action+0x75/0x250
hardirqs last disabled at (14809): [<c4680e27>] net_rx_action+0x52/0x250
softirqs last  enabled at (14794): [<c477b29f>] __do_softirq+0x2d7/0x307
softirqs last disabled at (14807): [<c420fc81>] call_on_stack+0x40/0x46
---[ end trace 0035e9e10037fdfe ]---

This is also the last netconsole message shown, the machine is not usable via ssh after that.
Comment 12 Erhard F. 2021-03-04 00:23:27 UTC
Only guessing around, but could bug #207353 be relevant too? It's about lockdep warnings regarding netdev on the same machine.
Comment 13 Thomas Gleixner 2021-03-04 07:39:14 UTC
On Thu, Mar 04 2021 at 00:23, bugzilla-daemon wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=208097
>
> --- Comment #12 from Erhard F. (erhard_f@mailbox.org) ---
> Only guessing around, but could bug #207353 be relevant too? It's about
> lockdep
> warnings regarding netdev on the same machine.

No, that's an independent issue which has nothing to do with those P4
oddities.