Bug 206723 - Regression: T420 not accepting inputs - usb device not accepting address, pci=noacpi
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: ACPI
Classification: Unclassified
Component: Other
Hardware: All Linux
Importance: P1 blocking
Assignee: acpi_other
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-02-29 21:01 UTC by Carli* Freudenberg
Modified: 2020-11-19 03:42 UTC

See Also:
Kernel Version: >4.19.76
Subsystem:
Regression: No
Bisected commit-id:


Attachments
reportbug (debian) Information + lsmod output (18.79 KB, text/plain)
2020-02-29 21:01 UTC, Carli* Freudenberg
Kernel Config (138.23 KB, text/plain)
2020-03-29 13:57 UTC, Carli* Freudenberg

Description Carli* Freudenberg 2020-02-29 21:01:45 UTC
Created attachment 287721
reportbug (debian) Information + lsmod output

I found a regression bug.
Starting with kernel version 4.19.76, my ThinkPad T420 "freezes" on boot (no input is accepted, so I can't enter the decryption password and can't boot the device).

Even though I can't enter anything, it continues to show (with changing n):
usb 1-1: device not accepting address n, error -110
usb 2-1: new high-speed USB device number n using ehci-pci
usb usb2-port1: attempt power cycle

It shows these errors even though no [external] USB devices are connected.
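
For reference, error -110 in these messages corresponds to ETIMEDOUT on Linux, i.e. the request timed out. Below is a minimal Python sketch for pulling such lines out of a saved log. The function name and the sample address numbers (5 and 6, standing in for n) are illustrative assumptions and are not taken from the report.

```python
import re

# 110 is ETIMEDOUT in the generic Linux errno table; hard-coded here so the
# sketch gives the same answer on non-Linux machines.
LINUX_ERRNO_NAMES = {110: "ETIMEDOUT"}

# Matches lines such as "usb 1-1: device not accepting address 5, error -110".
ADDRESS_ERROR = re.compile(
    r"usb (?P<port>[\d.-]+): device not accepting address "
    r"(?P<address>\d+), error -(?P<errno>\d+)"
)


def find_address_errors(log_text):
    """Return (port, address, errno name) for each address failure in the log."""
    results = []
    for line in log_text.splitlines():
        match = ADDRESS_ERROR.search(line)
        if match:
            code = int(match.group("errno"))
            name = LINUX_ERRNO_NAMES.get(code, "errno %d" % code)
            results.append((match.group("port"), int(match.group("address")), name))
    return results


if __name__ == "__main__":
    # Sample lines modelled on the messages above; 5 and 6 are made-up values for n.
    sample = "\n".join([
        "usb 1-1: device not accepting address 5, error -110",
        "usb 2-1: new high-speed USB device number 6 using ehci-pci",
        "usb usb2-port1: attempt power cycle",
    ])
    for port, address, name in find_address_errors(sample):
        print(port, address, name)
```

Counting these failures per port in logs from a good and a bad kernel might help show whether only the internal EHCI root hub ports are affected.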

With kernels older than 4.19.76 I can boot normally with the kernel parameter:
pci=noacpi
Without this parameter the screen stays black, so I have to use it.
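
When comparing test boots, it can help to confirm that pci=noacpi was really in effect for a given boot. The running kernel's parameters are exposed in /proc/cmdline. The following Python sketch checks for the option; the helper name has_kernel_param is hypothetical, and the handling of comma-separated values (e.g. several pci= options in one token) is an assumption about how the option might be written on a given system.

```python
def has_kernel_param(cmdline, key, value):
    """Return True if `key=` on the command line lists `value` among its
    comma-separated options."""
    for token in cmdline.split():
        name, sep, options = token.partition("=")
        if sep and name == key and value in options.split(","):
            return True
    return False


if __name__ == "__main__":
    # /proc/cmdline holds the parameters the running kernel was booted with.
    try:
        with open("/proc/cmdline") as handle:
            cmdline = handle.read()
    except OSError:
        cmdline = ""  # not on Linux, or /proc is not mounted
    print(has_kernel_param(cmdline, "pci", "noacpi"))
```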

I ran a git bisect (stable branch) and found:
b40c15c20e42491303202ae1368841704be0c3b9 is the first bad commit
commit b40c15c20e42491303202ae1368841704be0c3b9
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Mon Jul 22 20:47:08 2019 +0200

    x86/apic: Soft disable APIC before initializing it
    
    [ Upstream commit 2640da4cccf5cc613bf26f0998b9e340f4b5f69c ]
    
    If the APIC was already enabled on entry of setup_local_APIC() then
    disabling it soft via the SPIV register makes a lot of sense.
    
    That masks all LVT entries and brings it into a well defined state.
    
    Otherwise previously enabled LVTs which are not touched in the setup
    function stay unmasked and might surprise the just booting kernel.
    
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Link: https://lkml.kernel.org/r/20190722105219.068290579@linutronix.de
    Signed-off-by: Sasha Levin <sashal@kernel.org>

:040000 040000 db502796eb6fdf24be8a4b6442531cd027b4d459 8a9beff72957993cab7366eaa36c2d25700e1676 M      arch

This is why I sorted this bug into ACPI bugs.
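
As background on how much weight the bisect result can carry: on a linear history, bisection is essentially a binary search for the first failing commit. The Python sketch below illustrates only that idea and is not git's actual implementation; the commit labels and the is_bad test are hypothetical. It relies on the assumption that every commit after the first bad one also tests bad. If the failure is intermittent, that assumption does not hold and the search can end on an unrelated commit.

```python
def first_bad(commits, is_bad):
    """Binary search for the first bad commit.

    Assumes commits[0] is good, commits[-1] is bad, and every commit after
    the first bad one also tests bad.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid   # first bad commit is at mid or earlier
        else:
            good = mid  # first bad commit is after mid
    return commits[bad]


if __name__ == "__main__":
    # Hypothetical linear history "a".."h" in which "f" introduced the problem.
    history = list("abcdefgh")
    print(first_bad(history, lambda commit: commit >= "f"))
```

Reverting the reported commit on top of a failing kernel and retesting is an independent check of whatever the search returns.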

I added some system information. If you need more, let me know.

I can't use any newer kernel version while this bug remains, because my system either "freezes" (doesn't accept input) with pci=noacpi set or doesn't display anything without it.

In older kernel versions (3.xx) everything worked fine out of the box.
How can I help to get back to that state?
Comment 1 Thomas Gleixner 2020-03-01 15:17:44 UTC
bugzilla-daemon@bugzilla.kernel.org writes:

> With Kernel <4.19.76 I can boot normally with kernel settings:
> pci=noacpi.
> Without this setting the screen stays black, so I have to use it.

Have you tried w/o that command line option? I have no idea why you
have to use it at all. My T420 just works w/o any magic tweaks.

What kind of kernel config are you using? Home-brewed or some standard
distro config?

> I made a git bisect (stable branch) and found:
> b40c15c20e42491303202ae1368841704be0c3b9 is the first bad commit

This does not make any sense as this operation is undone 20 lines later
and there is absolutely no reason why this might affect your EHCI.

Did you verify by reverting that commit on top of 4.19.76?

Also please verify that the problem persists with the latest 4.19.107
kernel and the upstream 5.6-rc3/4 kernel.

Thanks,

        tglx
Comment 2 Carli* Freudenberg 2020-03-29 13:57:07 UTC
Created attachment 288117
Kernel Config

Kernel config used for the git bisect, derived from the original Debian kernel config with make oldconfig.
Comment 3 Carli* Freudenberg 2020-05-30 14:45:36 UTC
I found some time and tested 4.19.113 and 5.5.9, both patched (with the commit reverted) and unpatched.

The only versions that actually worked were the patched ones with the parameter pci=noacpi set.
I also noticed that after about 5 minutes the black screen disappears and I can actually boot the system even without pci=noacpi set.
Unfortunately, it freezes very soon after this.

I checked memtest and the logs again and found that the memory is corrupt and that the CPU had overheated a number of times.

Therefore I cannot rule out hardware failures.
It would be nice if the kernel showed more meaningful messages in case this really is a hardware issue, but not a lot of work should be spent on solving it.
I bought a new laptop, so I would only be interested in spending time on debugging if it would really help others or rule out a bigger underlying bug that could affect others as well.
Comment 4 Zhang Rui 2020-06-29 13:36:53 UTC
(In reply to Carli* Freudenberg from comment #3)
> I found some time and tested with 4.19.113 and 5.5.9 both patched (reverted
> the commit) and unpatched.
> 
> The only version that actually worked were the patched ones with the
> parameter pci=noacpi set.

> I also recognized that after ca. 5min the black screen disappears and I can
> actually boot the system even without pci=noacpi set.
> Unfortunately it freezes very soon after this.
> 
> I checked memtest and the logs again and found that the memory is corrupt
> and also that the CPU was overheated a number of times.
> 
> Therefore I can not rule out Hardware failures.

But the previous kernel versions, earlier than 4.19.113, still work smoothly with pci=noacpi on this laptop, right?

If the previous kernels that used to work still work perfectly, then this is still probably a kernel issue.
Comment 5 Zhang Rui 2020-11-19 03:42:45 UTC
Bug closed as there is no response from the bug reporter.
Please feel free to re-open it if this is a kernel issue.
