Description (from the Arch Linux bug tracker: https://bugs.archlinux.org/task/69810)
"I have a laptop with an AMD 3550H CPU and since Kernel 5.11.x it doesn't boot at all with 'amd_iommu=off' kernel parameter.
To give you more info, when I select the boot entry in systemd-boot nothing happens.
There are no error messages or anything, just a blank screen (an external screen doesn't detect any signal from the laptop either), and I need to shut down with the power button.
Linux 5.10 kernels and older work correctly."
We did a bisect and got the following results:
git bisect log
git bisect start
# bad: [f40ddce88593482919761f74910f42f4b84c004b] Linux 5.11
git bisect bad f40ddce88593482919761f74910f42f4b84c004b
# good: [2c85ebc57b3e1817b6ce1a6b703928e113a90442] Linux 5.10
git bisect good 2c85ebc57b3e1817b6ce1a6b703928e113a90442
# bad: [538fcf57aaee6ad78a05f52b69a99baa22b33418] Merge branches 'acpi-scan', 'acpi-pnp' and 'acpi-sleep'
git bisect bad 538fcf57aaee6ad78a05f52b69a99baa22b33418
# bad: [d635a69dd4981cc51f90293f5f64268620ed1565] Merge tag 'net-next-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
git bisect bad d635a69dd4981cc51f90293f5f64268620ed1565
# good: [a1dd1d86973182458da7798a95f26cfcbea599b4] Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
git bisect good a1dd1d86973182458da7798a95f26cfcbea599b4
# good: [e5795aacd71b697c739f2d193b0e275993d93187] Merge tag 'wireless-drivers-next-2020-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next
git bisect good e5795aacd71b697c739f2d193b0e275993d93187
# good: [dfefd226b0bf7c435a58d75a0ce2f9273b9825f6] mm: cleanup kstrto*() usage
git bisect good dfefd226b0bf7c435a58d75a0ce2f9273b9825f6
# good: [eb0ea74120e0f14a6d6454109153d1b4ccf210fc] Merge tag 'x86-fpu-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect good eb0ea74120e0f14a6d6454109153d1b4ccf210fc
# good: [22f07b86d4e580424cbeb0ce232ed30d4b5ecb95] Merge branch 'bnxt_en-improve-firmware-flashing'
git bisect good 22f07b86d4e580424cbeb0ce232ed30d4b5ecb95
# bad: [26ab12bb9d96133b7880141d68b5e01a8783de9d] iommu/hyper-v: Remove I/O-APIC ID check from hyperv_irq_remapping_select()
git bisect bad 26ab12bb9d96133b7880141d68b5e01a8783de9d
# good: [341b4a7211b6ba3a7089e1dc09ac4bd576dfb05f] x86/ioapic: Cleanup IO/APIC route entry structs
git bisect good 341b4a7211b6ba3a7089e1dc09ac4bd576dfb05f
# good: [79eb3581bcaae9b5677629d945e14da212aa76e2] iommu/vt-d: Simplify intel_irq_remapping_select()
git bisect good 79eb3581bcaae9b5677629d945e14da212aa76e2
# good: [d981059e13ffa9ed03a73472e932d070323bd057] x86/hyperv: Enable 15-bit APIC ID if the hypervisor supports it
git bisect good d981059e13ffa9ed03a73472e932d070323bd057
# bad: [2fb6acf3edfeb904505f9ba3fd01166866062591] iommu/amd: Fix union of bitfields in intcapxt support
git bisect bad 2fb6acf3edfeb904505f9ba3fd01166866062591
# bad: [aec8da04e4d71afdd4ab3025ea34a6517435f363] x86/ioapic: Correct the PCI/ISA trigger type selection
git bisect bad aec8da04e4d71afdd4ab3025ea34a6517435f363
# bad: [f36a74b9345aebaf5d325380df87a54720229d18] x86/ioapic: Use I/O-APIC ID for finding irqdomain, not index
git bisect bad f36a74b9345aebaf5d325380df87a54720229d18
# first bad commit: [f36a74b9345aebaf5d325380df87a54720229d18] x86/ioapic: Use I/O-APIC ID for finding irqdomain, not index
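For anyone who wants to double-check a saved bisect log like the one above, here is a minimal sketch in Python that pulls out the summary line and the recorded verdicts. It assumes the log is in the plain format that `git bisect log` prints. The helper names `first_bad_commit` and `verdicts` are made up for this example, and the sample input is just the tail of the log above.

```python
import re

# Summary line git appends once a bisect has converged:
#   "# first bad commit: [<40-hex sha>] <subject>"
FIRST_BAD_RE = re.compile(r"^# first bad commit: \[([0-9a-f]{40})\] (.+)$")
# Replayable verdict lines: "git bisect good <sha>" / "git bisect bad <sha>"
VERDICT_RE = re.compile(r"^git bisect (good|bad) ([0-9a-f]{40})$")


def first_bad_commit(log_text):
    """Return (sha, subject) of the first bad commit, or None if absent."""
    for line in log_text.splitlines():
        match = FIRST_BAD_RE.match(line.strip())
        if match:
            return match.group(1), match.group(2)
    return None


def verdicts(log_text):
    """Return the (verdict, sha) pairs in the order they were recorded."""
    found = []
    for line in log_text.splitlines():
        match = VERDICT_RE.match(line.strip())
        if match:
            found.append((match.group(1), match.group(2)))
    return found


# Excerpt of the log above (last two steps only), used as sample input.
SAMPLE_LOG = """\
git bisect start
# bad: [aec8da04e4d71afdd4ab3025ea34a6517435f363] x86/ioapic: Correct the PCI/ISA trigger type selection
git bisect bad aec8da04e4d71afdd4ab3025ea34a6517435f363
# bad: [f36a74b9345aebaf5d325380df87a54720229d18] x86/ioapic: Use I/O-APIC ID for finding irqdomain, not index
git bisect bad f36a74b9345aebaf5d325380df87a54720229d18
# first bad commit: [f36a74b9345aebaf5d325380df87a54720229d18] x86/ioapic: Use I/O-APIC ID for finding irqdomain, not index
"""

if __name__ == "__main__":
    print(first_bad_commit(SAMPLE_LOG))
    print(len(verdicts(SAMPLE_LOG)), "verdicts recorded")
```

The same log can also be fed back to git with `git bisect replay <file>` in a kernel tree to repeat the bisection on another machine.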
This points to the following commit:
git bisect bad
f36a74b9345aebaf5d325380df87a54720229d18 is the first bad commit
Author: David Woodhouse <firstname.lastname@example.org>
Date: Tue Nov 3 16:36:22 2020 +0000
x86/ioapic: Use I/O-APIC ID for finding irqdomain, not index
In commit b643128b917 ("x86/ioapic: Use irq_find_matching_fwspec() to
find remapping irqdomain") the I/O-APIC code was changed to find its
parent irqdomain using irq_find_matching_fwspec(), but the key used
for the lookup was wrong. It shouldn't use 'ioapic' which is the index
into its own ioapics array. It should use the actual arbitration
ID of the I/O-APIC in question, which is mpc_ioapic_id(ioapic).
Fixes: b643128b917 ("x86/ioapic: Use irq_find_matching_fwspec() to find remapping irqdomain")
Reported-by: lkp <email@example.com>
Signed-off-by: David Woodhouse <firstname.lastname@example.org>
Signed-off-by: Thomas Gleixner <email@example.com>
arch/x86/kernel/apic/io_apic.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
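To make the distinction in that commit message concrete, here is a toy model in Python of "array index" versus "I/O-APIC ID" as a lookup key. It is only an illustration and not kernel code: the IDs 32 and 33, the domain names and the dictionary registry are all invented, and the Python `mpc_ioapic_id` is merely a stand-in for the kernel helper of the same name. It says nothing about why booting fails with amd_iommu=off.

```python
# Toy model only; none of this is kernel code.
# List position = the kernel's internal index ('ioapic' in the commit message).
# The stored value = the hardware ID, i.e. what mpc_ioapic_id(ioapic) returns.
# The IDs 32 and 33 are made up for illustration.
IOAPIC_IDS = [32, 33]

# Hypothetical registry of parent irqdomains, keyed by hardware I/O-APIC ID.
DOMAINS_BY_ID = {32: "parent-domain-A", 33: "parent-domain-B"}


def mpc_ioapic_id(index):
    """Stand-in for the kernel helper of the same name: index -> hardware ID."""
    return IOAPIC_IDS[index]


def find_domain_by_index(index):
    """Lookup keyed by the array index (what the commit message calls wrong)."""
    return DOMAINS_BY_ID.get(index)


def find_domain_by_id(index):
    """Lookup keyed by the hardware ID (what the commit switches to)."""
    return DOMAINS_BY_ID.get(mpc_ioapic_id(index))


if __name__ == "__main__":
    for index in range(len(IOAPIC_IDS)):
        print(index, find_domain_by_index(index), find_domain_by_id(index))
```

In this model the two keys only give the same result when an I/O-APIC's ID happens to equal its position in the array, which is why the choice of key can change which parent domain is found, or whether one is found at all.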
So, it would seem there is something wrong with the mentioned commit. Could someone look at it? If this report belongs somewhere else (Product/Component), I apologize; it's the first time I am reporting a bug here.
Reverting this commit fixes the issue, and I haven't seen any side effects so far.
I tried reverting f36a74b9 in the Zen kernel, and that made my own system unbootable.
System configuration affected:
Machine: Type: Desktop System: ASUS product: All Series v: N/A serial: <superuser required>
Mobo: ASUSTeK model: X99-DELUXE II v: Rev 1.xx serial: <superuser required>
UEFI [Legacy]: American Megatrends v: 2101 date: 07/10/2019
CPU: Info: 8-Core Intel Core i7-5960X [MT MCP] speed: 3787 MHz min/max: 1200/3001 MHz
Maybe this commit's change is mutually exclusive between AMD and Intel systems?
(In reply to Steven Barrett from comment #2)
> Maybe this commit's change is mutually exclusive between AMD and Intel
Might be, I don't have any other hardware to test it with though.
(In reply to matejm98mthw from comment #3)
> Might be, I don't have any other hardware to test it with though.
I can confirm a ThinkPad T495s (AMD Zen+) will not boot with "amd_iommu=off" in 5.11.1 and 5.11.2 as delivered with Manjaro Linux (and will sometimes fail to recover from suspend if I do not pass this parameter).
(In reply to Mav from comment #4)
> (In reply to matejm98mthw from comment #3)
> > Might be, I don't have any other hardware to test it with though.
> I can confirm a ThinkPad T495s (AMD Zen+) will not boot with "amd_iommu=off"
> in 5.11.1 and 5.11.2 as delivered with Manjaro Linux (and will sometimes
> fail to recover from suspend if I do not pass this parameter).
Have you tried 5.11.3 or 5.11.4? The issue with waking up from suspend seems to be fixed now.
I have tried the latest kernel from Arch, 5.11.6, and it still doesn't boot.
I'm one of the reporters from the Arch thread https://bugs.archlinux.org/task/69810, and a kernel compiled with this commit reverted worked.
Have the changes been pushed to the main branch?
I have a Ryzen 1600X (Zen 1), and I can confirm that it boots once "amd_iommu=off" is removed.
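Since the reports here hinge on whether "amd_iommu=off" was actually passed, a quick way to check on a system that did boot might look like the sketch below. It assumes a Linux system that exposes the running kernel's command line at /proc/cmdline. The helper names `has_kernel_param` and `read_cmdline` are hypothetical, and the token split is deliberately simple (it does not handle quoted values containing spaces).

```python
def has_kernel_param(cmdline, param):
    """Return True if `param` is present as a whole whitespace-separated token."""
    return param in cmdline.split()


def read_cmdline(path="/proc/cmdline"):
    """Read the command line of the running kernel (Linux only)."""
    with open(path, encoding="utf-8") as handle:
        return handle.read().strip()


if __name__ == "__main__":
    try:
        current = read_cmdline()
    except OSError:
        current = ""  # not on Linux, or /proc is not mounted
    print("amd_iommu=off present:", has_kernel_param(current, "amd_iommu=off"))
```

Matching whole tokens avoids a false positive from a different setting such as amd_iommu=on.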
(In reply to jordicoma from comment #6)
> I have tried the last kernel from arch 5.11.6 and it still doesn't boot.
> I'm one of the reporters from the arch thread
> https://bugs.archlinux.org/task/69810 and the kernel compiled with this
> commit rollback worked.
> Are the changes pushed to the main branch?
> I have ryzen 1600x (zen1) and it's confirmed that removing "amd_iommu=off"
Yes, it's still broken. Someone has to make the change, which would mean either reverting the commit or figuring out a better way to fix the original issue.
Please test https://firstname.lastname@example.org
I just built it on 5.12-rc3 and it fixes the issue.
(In reply to David Woodhouse from comment #8)
> Please test
Is this patch still needed after 36013e9ffc0a17eee8d3e4d92aea0dc37687760d (9f81ca8d1fd68f5697c201f26632ed622e9e462f upstream, bug 212133)?
I am currently on 5.11.11 and I haven't had this issue for a while now.