Bug 199045 - Kernel doesn't boot unless acpi=off
Summary: Kernel doesn't boot unless acpi=off
Status: RESOLVED WILL_NOT_FIX
Alias: None
Product: ACPI
Classification: Unclassified
Component: ACPICA-Core
Hardware: Intel Linux
Importance: P1 normal
Assignee: Zhang Rui
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-03-07 13:18 UTC by Fedor Koshel
Modified: 2018-07-08 13:32 UTC
CC List: 3 users

See Also:
Kernel Version: elder 4.8.7
Subsystem:
Regression: No
Bisected commit-id:


Attachments
boot log (98.11 KB, text/plain)
2018-05-10 17:55 UTC, Fedor Koshel
boot log verbose (619.33 KB, text/plain)
2018-05-10 17:56 UTC, Fedor Koshel
Boot screen photo (148.38 KB, image/jpeg)
2018-06-06 09:21 UTC, Fedor Koshel
idle=poll boot log (117.21 KB, text/plain)
2018-06-25 22:09 UTC, Fedor Koshel
idle=poll verbose log (737.30 KB, text/plain)
2018-06-25 22:09 UTC, Fedor Koshel
nox2apic boot log (99.93 KB, text/plain)
2018-06-25 22:10 UTC, Fedor Koshel
nox2apic verbose log (628.92 KB, text/plain)
2018-06-25 22:10 UTC, Fedor Koshel
maxcpus=1 boot log (100.14 KB, text/plain)
2018-06-25 22:10 UTC, Fedor Koshel
maxcpus=1 verbose log (627.29 KB, text/plain)
2018-06-25 22:11 UTC, Fedor Koshel

Description Fedor Koshel 2018-03-07 13:18:09 UTC
This bug was first reported in the Red Hat Bugzilla because I thought it was specific to Fedora.
https://bugzilla.redhat.com/show_bug.cgi?id=1492349
It reproduces on my ASUS GL553VE; I tested kernels 4.8.9 through 4.15.6.

The system boots correctly with the GRUB option acpi=off, but without it the boot fails with these errors:
tpm_crb MSFT010:00: [Firmware Bug] ACPI region does not cover the entire command/response buffer. [mem 0xfed40000-0xfed4087f flags 0x200] vs fed40080 f80
NMI: watchdog: Watchdog detected hard lockup on cpu1
Watchdog: BUG: Soft lockup - CPU#2 stuck for 23s!

Logs for Fedora 26, kernel 4.15.6:
https://paste.fedoraproject.org/paste/tz0-1GBgWNw6dIKIAsqn6Q

As I found out, this is not only a Fedora bug; it also reproduces on other ASUS notebooks. The problem looks like a BIOS bug, but ASUS doesn't support Linux and won't fix it.
Comment 1 Zhang Rui 2018-05-07 07:08:32 UTC
I cannot open any of the logs you attached.
Can you please attach the logs here?
Comment 2 Fedor Koshel 2018-05-10 17:55:10 UTC
Created attachment 275903 [details]
boot log
Comment 3 Fedor Koshel 2018-05-10 17:56:04 UTC
Created attachment 275905 [details]
boot log verbose
Comment 4 Fedor Koshel 2018-05-10 18:02:40 UTC
(In reply to Zhang Rui from comment #1)
> I cannot open any of the logs you attached.
> Can you please attach the logs here?

I have reproduced it on kernel 4.16.6 and collected logs with "journalctl -b -1 --no-pager --utc --no-hostname" (and again with -o verbose). But the logs don't contain the lines from the failed boot. It gets stuck on:
> tpm_crb MSFT010:00: [Firmware Bug] ACPI region does not cover the entire
> command/response buffer. [mem 0xfed40000-0xfed4087f flags 0x200] vs fed40080
> f80
> NMI: watchdog: Watchdog detected hard lockup on cpu1
> Watchdog: BUG: Soft lockup - CPU#2 stuck for 23s!

If you need other information, just tell me how to get it. I'm not very good with this new journalctl.
Comment 5 Zhang Rui 2018-05-30 06:04:30 UTC
(In reply to Fedor Koshel from comment #4)
> (In reply to Zhang Rui from comment #1)
> > I cannot open any of the logs you attached.
> > Can you please attach the logs here?
> 
> I have reproduced it on kernel 4.16.6 and collected logs with "journalctl
> -b -1 --no-pager --utc --no-hostname" (and again with -o verbose). But the
> logs don't contain the lines from the failed boot. It gets stuck on:

What do you mean? Why can't we get the strings you mentioned?
Comment 6 Chen Yu 2018-05-30 06:11:27 UTC
@Fedor, do you have the full log after
 NMI: watchdog: Watchdog detected hard lockup on cpu1?
Comment 7 Fedor Koshel 2018-05-30 06:32:39 UTC
(In reply to Zhang Rui from comment #5)
> (In reply to Fedor Koshel from comment #4)
> > (In reply to Zhang Rui from comment #1)
> > > I cannot open any of the logs you attached.
> > > Can you please attach the logs here?
> > 
> > I have reproduced it on kernel 4.16.6 and collected logs with "journalctl
> > -b -1 --no-pager --utc --no-hostname" (and again with -o verbose). But the
> > logs don't contain the lines from the failed boot. It gets stuck on:
> 
> What do you mean? Why can't we get the strings you mentioned?

I see these strings on the screen when I try to boot the system. But at that moment the system is completely stuck, so I can't switch to another tty and collect logs. And I can't find them in the journalctl log after booting with the previous kernel. There is no boot.log anymore; maybe I should use some special journalctl options, but I haven't found the right combination.

I can only copy this from the screen by hand or take a photo. Or I can try any commands you recommend.
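
A possible way to pull only the kernel messages of the previous boot out of the journal is sketched below. It reuses the journalctl options quoted above and adds -k (kernel messages only). Two assumptions apply: the journal must be persistent, and a hard lockup may prevent the last messages from ever reaching the disk, so the output can still be incomplete. The function name save_prev_boot_klog and the JOURNALCTL override variable are hypothetical and exist only for this sketch.

```shell
# save_prev_boot_klog OUTFILE
# Writes the kernel messages of the previous boot (-b -1) to OUTFILE.
# JOURNALCTL is a hypothetical override for the journalctl binary,
# used here only so the sketch can be exercised without a journal.
save_prev_boot_klog() {
    outfile=$1
    "${JOURNALCTL:-journalctl}" -k -b -1 --no-pager --utc --no-hostname > "$outfile"
}
```

Calling save_prev_boot_klog prev-boot.log after rebooting with acpi=off would produce a file that can be attached to the bug, provided the failed boot was recorded at all.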
Comment 8 Fedor Koshel 2018-06-06 09:21:40 UTC
Created attachment 276343 [details]
Boot screen photo

In text form:
NMI watchdog: Watchdog detected hard LOCKUP on cpu 2
INFO: rcu_sched detected stalls on CPUs/tasks:
2-...0: (8 GPs behind) idle=44e/1/4611686018427387904 softirq=3072/3072 fqs=15000
(detected by 6, t=60002 jiffies, g=1882, c=18811=209)
Comment 9 Chen Yu 2018-06-13 02:57:41 UTC
Looks like there's a tpm_crb issue before this hang. How about disabling the Trusted Platform Module in the BIOS and trying again?
Comment 10 Fedor Koshel 2018-06-15 09:43:21 UTC
(In reply to Chen Yu from comment #9)
> Looks like there's a tpm_crb issue before this hang. How about disabling
> the Trusted Platform Module in the BIOS and trying again?

There is no TPM entry in the BIOS. I have already tried disabling Secure Boot, fast boot, and other boot parameters, but it didn't help.
Comment 11 Zhang Rui 2018-06-25 07:32:37 UTC
First of all, for all the kernel versions that you have tried, acpi=off is mandatory for the kernel to boot, right?

Please try the following kernel command-line parameters one at a time and see if the kernel boots:
1. idle=poll
2. nox2apic
3. maxcpus=1
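
When testing parameters one at a time, it can help to confirm which of them a given boot actually received. The sketch below checks a kernel command line for the three parameters listed above. By default it reads /proc/cmdline of the running system; a string copied from the "Command line:" line of a saved boot log can be passed instead. The function names has_kparam and report_kparams are hypothetical helpers, not part of any existing tool.

```shell
# has_kparam PARAM [CMDLINE]
# Succeeds if PARAM appears as a whole word in the kernel command line.
# CMDLINE defaults to the contents of /proc/cmdline.
has_kparam() {
    param=$1
    cmdline=${2-$(cat /proc/cmdline)}
    for word in $cmdline; do
        if [ "$word" = "$param" ]; then
            return 0
        fi
    done
    return 1
}

# report_kparams [CMDLINE]
# Prints, for each of the three suggested parameters, whether it is present.
report_kparams() {
    cmdline=${1-$(cat /proc/cmdline)}
    for p in idle=poll nox2apic maxcpus=1; do
        if has_kparam "$p" "$cmdline"; then
            echo "$p: present"
        else
            echo "$p: absent"
        fi
    done
}
```

The comparison is done per word, so maxcpus=16 is not mistaken for maxcpus=1.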
Comment 12 Fedor Koshel 2018-06-25 22:09:13 UTC
Created attachment 276841 [details]
idle=poll boot log
Comment 13 Fedor Koshel 2018-06-25 22:09:37 UTC
Created attachment 276843 [details]
idle=poll verbose log
Comment 14 Fedor Koshel 2018-06-25 22:10:02 UTC
Created attachment 276845 [details]
nox2apic boot log
Comment 15 Fedor Koshel 2018-06-25 22:10:35 UTC
Created attachment 276847 [details]
nox2apic verbose log
Comment 16 Fedor Koshel 2018-06-25 22:10:59 UTC
Created attachment 276849 [details]
maxcpus=1 boot log
Comment 17 Fedor Koshel 2018-06-25 22:11:31 UTC
Created attachment 276851 [details]
maxcpus=1 verbose log
Comment 18 Fedor Koshel 2018-06-25 22:11:59 UTC
(In reply to Zhang Rui from comment #11)
> First of all, for all the kernel versions that you have tried, acpi=off is
> mandatory for the kernel to boot, right?
> 
> Please try the following kernel command-line parameters one at a time and
> see if the kernel boots:
> 1. idle=poll
> 2. nox2apic
> 3. maxcpus=1

Yes, all of them boot with ACPI disabled.
I've tried each of these parameters and none of them works. The logs are attached.
Comment 19 Chen Yu 2018-06-25 23:48:19 UTC
According to the idle=poll log, there seems to be an issue in nouveau, which hangs at ioread32 in nv04_timer_read().
Please blacklist the nouveau driver in GRUB and try again.
Comment 20 Fedor Koshel 2018-07-08 13:32:15 UTC
(In reply to Chen Yu from comment #19)
> According to the idle=poll log, there seems to be an issue in nouveau,
> which hangs at ioread32 in nv04_timer_read().
> Please blacklist the nouveau driver in GRUB and try again.

Yes, that helps, thank you. There are still a lot of problems with the proprietary NVIDIA driver, but I can boot the system.
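
The exact parameters used for the blacklist are not recorded in this thread. A common approach, assumed here, is to add modprobe.blacklist=nouveau (and, on dracut-based systems such as Fedora, rd.driver.blacklist=nouveau) to the kernel command line. The sketch below builds such a string without duplicating parameters that are already present. The helper name add_kparams and the starting value "rhgb quiet" are illustrative assumptions.

```shell
# add_kparams CMDLINE PARAM...
# Prints CMDLINE with each PARAM appended unless it is already present.
add_kparams() {
    result=$1
    shift
    for param in "$@"; do
        case " $result " in
            *" $param "*) ;;                          # already present, skip
            *) result="${result:+$result }$param" ;;  # append with a space
        esac
    done
    printf '%s\n' "$result"
}

# Example: a value one might place in GRUB_CMDLINE_LINUX.
add_kparams "rhgb quiet" modprobe.blacklist=nouveau rd.driver.blacklist=nouveau
```

The resulting string would go into GRUB_CMDLINE_LINUX in /etc/default/grub, after which the GRUB configuration has to be regenerated for the change to take effect.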
