Bug 203431 - Dell Inspiron 5485/5585, poor ACPI flags; requiring ACPI=off.
Status: NEW
Alias: None
Product: ACPI
Classification: Unclassified
Component: Config-Processors
Hardware: x86-64 Linux
Importance: P1 normal
Assignee: acpi_config-processors
Reported: 2019-04-26 10:32 UTC by Adam Grigolato
Modified: 2019-07-26 12:08 UTC

Kernel Version: 5.0.9
Tree: Mainline
Regression: No


Attachments
Kernel Config (135.22 KB, text/plain)
2019-04-26 10:34 UTC, Adam Grigolato
Backtrace (1.56 MB, image/jpeg)
2019-04-26 11:02 UTC, Adam Grigolato
ACPI TABLE: APIC (312 bytes, application/octet-stream)
2019-04-26 11:07 UTC, Adam Grigolato
Modified FACP with reduced hardware unset (268 bytes, application/octet-stream)
2019-06-12 00:56 UTC, justin
Modified FACP based on bios 1.1.2 with reduced hardware unset (268 bytes, application/octet-stream)
2019-07-16 16:58 UTC, justin

Description Adam Grigolato 2019-04-26 10:32:06 UTC
This is a very new laptop.
The same issue shows up with every kernel tried.
Kernel versions tested: 4.19, 4.20, 5.0.7, 5.0.8, 5.0.9, 5.1-rc4.
All result in a null pointer dereference, with the traceback pointing at setup_boot_APIC_clock.

If booted with nolapic, it gets a little further but still dies, in a different way (still apparently related to the CPUs, by the looks of it). I'll get some further information on the actual error and post it here.

A little bit of digging through ACPI has shown some oddities.
The APIC table describes 16 LAPICs and 16 NMIs for only 8 CPUs.
(The LAPIC entries after id 7 have enabled = 0.)
In addition, the CPU is an entirely new model (Family 23 / Model 24).
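
For anyone who wants to check their own APIC table for the same oddity, here is a minimal sketch that counts the Local APIC and Local APIC NMI entries in a raw table dump. It assumes the standard MADT layout (a 44-byte header followed by type/length-prefixed entries, with type 0 being a Processor Local APIC whose flags bit 0 means "enabled", and type 4 being a Local APIC NMI). The function name `summarize_madt` is made up for this illustration.

```python
import struct

# 36-byte ACPI table header + 4-byte local APIC address + 4-byte MADT flags
MADT_HEADER_LEN = 44


def summarize_madt(table: bytes) -> dict:
    """Count Local APIC and Local APIC NMI entries in a raw APIC (MADT) table."""
    if table[:4] != b"APIC":
        raise ValueError("not an APIC (MADT) table")
    # The header's Length field says how many bytes of the buffer are table data.
    length = struct.unpack_from("<I", table, 4)[0]
    end = min(length, len(table))
    summary = {"lapic_total": 0, "lapic_enabled": 0, "lapic_nmi": 0}
    offset = MADT_HEADER_LEN
    while offset + 2 <= end:
        entry_type, entry_len = table[offset], table[offset + 1]
        if entry_len < 2 or offset + entry_len > end:
            break  # malformed entry; stop rather than loop forever
        if entry_type == 0 and entry_len >= 8:  # Processor Local APIC
            flags = struct.unpack_from("<I", table, offset + 4)[0]
            summary["lapic_total"] += 1
            summary["lapic_enabled"] += flags & 1  # bit 0 = Enabled
        elif entry_type == 4:  # Local APIC NMI
            summary["lapic_nmi"] += 1
        offset += entry_len
    return summary
```

On a running system the raw table can usually be read (as root) from /sys/firmware/acpi/tables/APIC and passed to `summarize_madt` as bytes. For a table like the one described above, the expectation would be 16 LAPIC entries of which 8 are enabled, plus 16 NMI entries.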



In addition, even with ACPI disabled, some instability occurs, especially in the ath10k_usb module, which seems to become a little flaky without "nospec_store_bypass_disable" in the kernel parameters.
Also, nomodeset is required, but this may be because the ACPI tables aren't loaded (an assumption based on tables in the dumps that look GPU-related).
Comment 1 Adam Grigolato 2019-04-26 10:34:34 UTC
Created attachment 282543
Kernel Config

Most Recent kernel configuration used.
Comment 2 Adam Grigolato 2019-04-26 11:02:27 UTC
Created attachment 282545
Backtrace

Backtrace via efi console.
Comment 3 Adam Grigolato 2019-04-26 11:07:29 UTC
Created attachment 282547
ACPI TABLE: APIC

Odd looking APIC table.
Comment 4 nanericwang 2019-04-29 09:20:42 UTC
Should be the same issue as https://bugzilla.kernel.org/show_bug.cgi?id=200087
Comment 5 Adam Grigolato 2019-04-29 11:35:14 UTC
I've been looking at that bug; it does seem very similar.

The IVRS parameters didn't work. Having had a look, this may be because my IVRS table doesn't seem to be broken (at least not in the same way; the IOAPIC and HPET entries both look good after disassembling it).

I'll investigate the same leads as that ticket though, at least to find any commonality.
(It may be that issue plus more; who knows... I'm only thinking that because it does get past this point and dies horribly later with nolapic. I can't remember whether noapic was the same or not; I will test and get back.)
Comment 6 Adam Grigolato 2019-04-29 13:23:03 UTC
So, to update, noapic doesn't allow boot.
noacpi or nolapic are the only things that allow me to pass "Calibrating APIC timer..."

Both result in only a single CPU detected, noacpi boots, nolapic locks up.

Also after nolapic it spits a ton of
"[Firmware Bug]: cpu 0, try to use APIC510 (LVT offset 1) for vector 0xf9, but the register is already in use for vector 0x0 on this cpu"
and
"[Firmware Bug]: cpu 0, failed to setup threshold interrupt for bank X, block 0 (MSRC0002003=0xd0100000000000000)"
(With X incrementing)
, then finally
"[Firmware Bug]: cpu 0, try to use APIC520 (LVT offset 2) for vector 0xf4, but the register is already in use for vector 0x0 on this cpu"

before finally locking up after 'cryptd: max_cpu_qlen set to 1000'.

As to the ACPI tables,

Cleaning up the extra NMI/LAPIC entries in the APIC table (via override) tidied up the boot messages, but didn't help booting in the slightest.


Sadly, even with a lot of debugging enabled, I can't seem to get any kernel messages that indicate where to look.
(Other than those with nolapic, I haven't seen any [Firmware Bug] messages to help :/)
Comment 7 Adam Grigolato 2019-05-02 04:54:04 UTC
Another update, after much, much debugging.

There appears to be something up with ACPI reduced hardware support.
This laptop's FACP has its reduced hardware bit set.
For whatever reason, if this is set, none of the clocksources get set up properly.
So when LAPIC calibration occurs, specifically at this line in apic.c [calibrate_APIC_clock]:
"real_handler = global_clock_event->event_handler;"
the null pointer dereference occurs.

Interestingly, when the bit is set, HPET init also looks broken (possibly tsc_sync too). I'm judging this only from the printk messages, which are oddly short when reduced hardware is enabled and contain no calibration information; when it is not enabled, they appear to be fine.

With a modified FACP, I have now booted successfully with all cores.
There are still issues, but at least there is that now :).
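
To check whether a given machine's FACP has the same bit set without disassembling it, something along these lines should work. This is a sketch that assumes the FADT layout from the ACPI specification (a 32-bit little-endian Flags field at byte offset 112, with HW_REDUCED_ACPI as bit 20 and LOW_POWER_S0_IDLE_CAPABLE as bit 21). The function name `fadt_flags` is made up for this illustration.

```python
import struct

FADT_FLAGS_OFFSET = 112        # "Flags" field of the FADT
HW_REDUCED_ACPI = 1 << 20      # shown by iasl as "Hardware Reduced (V5)"
LOW_POWER_S0_IDLE = 1 << 21    # shown by iasl as "Low Power S0 Idle (V5)"


def fadt_flags(table: bytes) -> dict:
    """Report the reduced-hardware and low-power-S0 bits of a raw FACP table."""
    if table[:4] != b"FACP":
        raise ValueError("not a FACP (FADT) table")
    if len(table) < FADT_FLAGS_OFFSET + 4:
        raise ValueError("table too short to contain the Flags field")
    flags = struct.unpack_from("<I", table, FADT_FLAGS_OFFSET)[0]
    return {
        "hw_reduced": bool(flags & HW_REDUCED_ACPI),
        "low_power_s0_idle": bool(flags & LOW_POWER_S0_IDLE),
    }
```

On a system that does boot, the raw table can usually be read (as root) from /sys/firmware/acpi/tables/FACP. If `hw_reduced` comes back True on one of these laptops, that would match the condition described in this comment.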
Comment 8 justin 2019-05-13 22:35:27 UTC
Testing on an Inspiron 5585 (same CPU and shared bios update package) shows exactly the same panic on load.

Booting with the APIC disabled and nomodeset generates the same errors Adam Grigolato observed.

Decompiling the ACPI tables shows the "reduced hardware" bit is set, and overriding the tables to unset the bit has similarly resolved the issues.
Comment 9 Tim N 2019-06-11 23:40:21 UTC
Same problem.  Dell Inspiron 5485, Ryzen 7 3700U.  I haven't had a chance to test the suggested fix, but we have discovered that Mint Linux with the 4.15.0 kernel (but not any later ones) seems to boot with ACPI off (but can't use some of the hardware features).  

I have questions though:
- Does anyone know the basic outline of the long-term fix for this?  
- Will everyone with one of these laptops have to compile their own kernel?  (Is that what you're doing to override the ACPI tables?)
- Would the long-term solution somehow involve detecting these chips, and then overriding the ACPI tables programmatically?  

Thanks in advance for any help that can be offered :).
Comment 10 justin 2019-06-12 00:56:32 UTC
Created attachment 283209
Modified FACP with reduced hardware unset
Comment 11 justin 2019-06-12 01:11:06 UTC
My understanding of the long term is that Linux is improperly handling the reduced hardware bit and Adam is working on a kernel patch to address this generally.

Dell however should not be setting the reduced hardware bit so it is possible this could also be corrected in a future BIOS update. 

For the short term, Linux can override the Dell-provided tables without kernel changes (or compiling your own kernel), as long as a particular kernel option is enabled (it seems to be commonly set). A modified table file (with the reduced hardware bit unset and a higher version number than the BIOS-provided table) can be added to the initramfs instead.

I've attached an appropriate table to this bug. 

You can use the modified table by placing it in an appropriately structured archive and concatenating that with your existing initramfs file. There are plenty of guides for this, such as https://blog.vortigaunt.net/decompile-recompile-load-custom-acpi-table-linux/.

However, initramfs is regenerated when you update your kernel, undoing your changes. The regeneration process can usually be extended to include the modified table, but that is distro-specific. For example Fedora uses dracut and can automatically include the table by adding a config file with the options acpi_override="yes" and acpi_table_dir="path to folder containing modified file". You will have to find instructions for your particular distro.
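
As an illustration of what "appropriately structured archive" means, here is a rough sketch that builds the early archive in memory instead of using the cpio tool. It assumes the commonly documented layout for initrd table overrides: an uncompressed cpio archive in "newc" format containing the table under kernel/firmware/acpi/, placed in front of the existing initramfs. The helper names and the default file name facp.aml are made up for this example, and the result has not been tested on these laptops.

```python
def _newc_entry(name: str, data: bytes, mode: int, ino: int) -> bytes:
    """Encode one cpio 'newc' entry: 110-byte ASCII header, name, data."""
    name_bytes = name.encode() + b"\0"
    # Field order: ino, mode, uid, gid, nlink, mtime, filesize,
    # devmajor, devminor, rdevmajor, rdevminor, namesize, check
    fields = [ino, mode, 0, 0, 1, 0, len(data), 0, 0, 0, 0, len(name_bytes), 0]
    entry = b"070701" + b"".join(b"%08X" % f for f in fields) + name_bytes
    entry += b"\0" * (-len(entry) % 4)   # header + name padded to 4 bytes
    entry += data
    entry += b"\0" * (-len(entry) % 4)   # file data padded to 4 bytes
    return entry


def build_acpi_override_cpio(table: bytes, filename: str = "facp.aml") -> bytes:
    """Build an uncompressed cpio archive holding one ACPI table override."""
    archive = b""
    ino = 1
    # Directory entries, mirroring what `find kernel | cpio -H newc -o` emits.
    for directory in ("kernel", "kernel/firmware", "kernel/firmware/acpi"):
        archive += _newc_entry(directory, b"", 0o040755, ino)
        ino += 1
    archive += _newc_entry("kernel/firmware/acpi/" + filename, table, 0o100644, ino)
    archive += _newc_entry("TRAILER!!!", b"", 0, 0)  # end-of-archive marker
    return archive


def prepend_to_initramfs(early_cpio: bytes, initramfs: bytes) -> bytes:
    """The uncompressed override archive goes first, the original image after."""
    return early_cpio + initramfs
```

In practice you would read the modified table and your current initramfs from disk, call `prepend_to_initramfs(build_acpi_override_cpio(table), initramfs)`, and write the result to a new image that the bootloader points at. Keeping the original initramfs untouched makes it easy to fall back if the new image does not boot.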
Comment 12 Adam Grigolato 2019-06-12 01:17:33 UTC
It's not required to build a new kernel, as long as the one you are using allows ACPI tables to be overridden from the initramfs; in that case you can place the new ACPI table into the initramfs and it should work.

The long term is complicated. This is a BIOS bug and Dell should fix it, but I'd say that without much pressure they will probably take a long time to do so.

Secondly, this could be fixed in the kernel; frankly, the behavior of the kernel when the ACPI hardware reduced bit is set doesn't make much sense. It probably shouldn't be failing to init clocksources just because of it.

I've been working on a patch in my spare time, but it's slow going as I have very little spare time at the moment.
Comment 13 Tim N 2019-06-12 01:37:08 UTC
Great!  Thanks for the update, guys!  I'll see if I can't make something out of the info I've been given, and hopefully it will also help anyone else who happens across this bug.
Comment 14 brian 2019-07-16 15:08:44 UTC
Thanks for the great work figuring this out. I have an Inspiron 5485 with possibly a newer bios version (2.2.3), and am unable to make even basic headway on booting. I first tried to load the patched FACP provided by Justin, but that didn't work. I've also tried noacpi, that doesn't work either. I'm using the Ubuntu 19.04 installer.

My first thought is the newer bios (if that is the case) is preventing this from working, maybe the revision number in the bios is greater than the patched version?

I would like to help further, however I've been unable to produce any debug output. My machine hangs with no output beyond "EFI stub: UEFI Secure Boot is enabled.", regardless of the flags I pass. I've searched around for pointers or a howto on debugging kernel boot issues or ACPI issues, but haven't found the right trailhead. From reading the above I get the sense there is a way to launch the kernel from the EFI console and see messages. I know this is slightly off topic, but can someone point out how to do that?
Comment 15 brian 2019-07-16 15:41:34 UTC
Quick update: after reverting my BIOS to 1.0.0 and using the FACP provided by Justin, my installation proceeded normally.

I'd still like to learn how to debug this issue when it is occurring, so I may update my BIOS back to 2.2.3 and create a new patched table. Pointers appreciated.
Comment 16 justin 2019-07-16 16:58:43 UTC
Created attachment 283747
Modified FACP based on bios 1.1.2 with reduced hardware unset

The revision number has been set much higher to keep the BIOS-provided table from taking precedence.
Comment 17 justin 2019-07-16 17:25:17 UTC
I've updated the previous table with a much higher version number. If your issues are caused by the new version having a higher revision and thus the modified table not being loaded this should fix that. The BIOS 1.0 table was set at revision 3 and the previous modified table was at 1000 so this is unlikely to be the cause.

The modified table is based on BIOS 1.0. Dell may have made changes to the table in the new BIOS versions that should be accounted for.

In order to make a new modified table, you will first have to manage to boot under the new BIOS to acquire the tables. The good news is that the required tools work under both Windows and Linux, and Windows should be bootable.

You can get the windows versions at https://acpica.org/downloads/binary-tools and under linux your distro probably has a package for them. 

1. Run "acpidump -o tables" to create a file with the currently running tables.
2. Run "acpixtract -a tables" to extract the individual tables (we only care about the FACP table).
3. Run "iasl -d facp.dat" to disassemble the FACP table.
4. Edit facp.dsl with your preferred text editor:
   - set the "Oem Revision" field at the top to a larger number
   - set the "Hardware Reduced (V5)" field to 0
5. Run "iasl -sa facp.dsl" to assemble your modified table.
6. Install the produced facp.aml file like you did before.
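
For reference, the two edits in step 4 can also be made directly on the binary table instead of going through the disassemble/reassemble round trip. The sketch below is an alternative to steps 3 to 5, not the method used to produce the attached files. It assumes the standard ACPI table header (Length at offset 4, Checksum at offset 9, OEM Revision at offset 24) and the FADT Flags field at offset 112 with HW_REDUCED_ACPI as bit 20. The function name `patch_fadt` is made up for this illustration.

```python
import struct

LENGTH_OFFSET = 4          # 32-bit table length in the ACPI header
CHECKSUM_OFFSET = 9        # 8-bit checksum in the ACPI header
OEM_REVISION_OFFSET = 24   # 32-bit "Oem Revision" in the ACPI header
FADT_FLAGS_OFFSET = 112    # 32-bit "Flags" field of the FADT
HW_REDUCED_ACPI = 1 << 20  # "Hardware Reduced (V5)"


def patch_fadt(table: bytes, oem_revision: int) -> bytes:
    """Clear the reduced-hardware bit, set a new OEM revision, fix the checksum."""
    if table[:4] != b"FACP":
        raise ValueError("not a FACP (FADT) table")
    if struct.unpack_from("<I", table, LENGTH_OFFSET)[0] != len(table):
        raise ValueError("header length does not match the data supplied")
    patched = bytearray(table)
    flags = struct.unpack_from("<I", patched, FADT_FLAGS_OFFSET)[0]
    struct.pack_into("<I", patched, FADT_FLAGS_OFFSET, flags & ~HW_REDUCED_ACPI)
    struct.pack_into("<I", patched, OEM_REVISION_OFFSET, oem_revision)
    # All bytes of a valid ACPI table must sum to zero modulo 256.
    patched[CHECKSUM_OFFSET] = 0
    patched[CHECKSUM_OFFSET] = (-sum(patched)) & 0xFF
    return bytes(patched)
```

The input would be the facp.dat produced by step 2, and the output can be saved as facp.aml and installed as in step 6. Pick an OEM revision larger than the one your BIOS reports, as discussed above; all other flags are left untouched.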
Comment 18 brian 2019-07-17 15:23:22 UTC
Thanks for the acpi tools pointers, super helpful. I'm going through the various versions of the bios from the Dell site and looking at the flags they include now.

I'm currently running Bios version 1.0, and interestingly it has the "Hardware Reduced (V5)" flag set to 0. My experience with this bios revision is everything just works out of the box. I'm going to take a look at the other versions posted online and see when they change, but a quick workaround for all of this, for me at least, is to just downgrade the bios to 1.0. Posting in case others running into this want to give it a try.
Comment 19 brian 2019-07-17 15:34:57 UTC
To clarify, I'm using Bios 1.0 from https://www.dell.com/support/home/us/en/04/Drivers/DriversDetails?driverId=K02DY&osCode=wt64a&productCode=inspiron-14-5485-laptop

The version of BIOS 1.0 I downloaded and installed had an OEM Revision of 2. Comparing the modified_facp.aml with the table I'm extracting from the BIOS 1.0 I'm using, I see that mine also has "Low Power S0 Idle (V5)" set to 0, "PCIEXP_WAK Bits Supported (V4)" set to 1, and "Control Method Power Button (V1)" set to 0. I wonder if you have a different version of 1.0, since your OEM Revision was 3?
Comment 20 justin 2019-07-17 15:43:28 UTC
Seems you are correct. I thought I was running 1.0 but checking now I am actually on 1.1.2
Comment 21 brian 2019-07-17 15:49:09 UTC
After noticing the other differences in the decompiled tables, I looked at my BIOS settings and saw I had set Bios -> Advanced -> Sleep Mode = "Force S3". I tried setting this back to "Let OS configure" and I was again unable to boot. After resetting to "Force S3", things worked "normally". So I'm pretty confident now that BIOS 1.0.0 with Bios -> Advanced -> Sleep Mode = "Force S3" provides a working configuration (it doesn't require ACPI table patching).
Comment 22 brian 2019-07-17 16:46:18 UTC
After updating my bios back to 2.2.3, I booted into Windows and extracted the tables using the tools you linked to above (thanks!). I found that the "Sleep Mode" bios setting had the same effect on the same set of FACP flags. So, instead of patching the ACPI table, I simply set "Sleep Mode" to "Force S3", and was able to cleanly boot into Ubuntu 19.04 with the 5.0.0-20-generic kernel. 

Looking at the set of flags that get flipped by setting "Sleep Mode", they all look related to setting up the machine to support connected standby. As this isn't a mode I'm interested in for my laptop, I have no issues with setting it to "Force S3" for now.

So, in summary, setting "Force S3" worked for me with all BIOS revisions for installing Ubuntu 19.04 on my Dell 5485. (Yeah, I wrote that in a painfully terse way for SEO purposes.)
Comment 23 boldos 2019-07-17 22:36:58 UTC
Sorry to interrupt the discussion, but I confirm I have exactly the same problem on the very new HP ENVY x360 15-ds0005 (Ryzen 7 3700U machine).

I was able to boot with only one kernel, 4.15.0 (the stock kernel from Ubuntu 18.04), using "nolapic acpi=off".

Other kernels [from Ubuntu 16.04, 17.10, 19.04, 19.10 nightly] with those same parameters fail further along in the boot process.

Please let me know if I should provide any further info/details.

Note: I will most probably be returning the machine for a refund within approx. 1.5 weeks :-/
Comment 24 brian 2019-07-17 22:56:28 UTC
On that HP, do you have an option in the Advanced section of the BIOS to change the Sleep Mode to something like "Force S3" like on the Dell? If so, can you try changing that?
Comment 25 boldos 2019-07-18 11:38:40 UTC
(In reply to brian from comment #24)
> On that HP, do you have an option in the Advanced section of the BIOS to
> change the Sleep Mode to something like "Force S3" like on the Dell? If so,
> can you try changing that?

Hi Brian, unfortunately there is no such thing as an "Advanced" section in these BIOSes. (I've read "on the internets" that there might be some possibilities of unlocking the advanced mode - probably by flashing a modded BIOS etc. - but I'm not sure I'm really willing to walk this path...).

I'll check again, but I believe it is not possible for me to test this :(
Comment 26 boldos 2019-07-18 13:04:43 UTC
(In reply to boldos from comment #25)

> Hi Brian, unfortunately there is no such thing as an "Advanced" section in
> these BIOSes. (I've read "on the internets" that there might be some
> possibilities of unlocking the advanced mode - probably by flashing a
> modded BIOS etc. - but I'm not sure I'm really willing to walk this path...).
> 
> I'll check again, but I believe it is not possible for me to test this
> :(
Yep, no Advanced options on basically all HP "consumer" notebooks; HP support widely refuses all requests for Advanced settings unlocking.

So this also means that it is not (and will not be) possible for users to fix this bug by changing BIOS settings on newer HP devices, since these HP Insyde BIOSes are locked down to a bare minimum of settings.
Comment 27 boldos 2019-07-21 10:12:40 UTC
Ok, tried the updated ACPI tables magic trick (modified Reduced hardware support to 0 in FACP tables) and can confirm that kernel now boots fine with LAPIC and ACPI. Also managed to boot kernel 5.1.16 without "nolapic acpi=off".

Now there are other shenanigans to be fixed on this HP ENVY x360 15-ds0005nc:
- non-functioning keyboard (works 100% under PIC mode)
- non-functioning trackpad (worked "sometimes" under PIC mode)
Comment 28 boldos 2019-07-23 20:32:36 UTC
(In reply to boldos from comment #27)
> Ok, tried the updated ACPI tables magic trick (modified Reduced hardware
> support to 0 in FACP tables) and can confirm that kernel now boots fine with
> LAPIC and ACPI. Also managed to boot kernel 5.1.16 without "nolapic
> acpi=off".
> 
> Now there are other shenanigans to be fixed on this HP ENVY x360 15-ds0005nc:
> - non-functioning keyboard (works 100% under PIC mode)
> - non-functioning trackpad (worked "sometimes" under PIC mode)

Ok, to follow up: the kernel requires i8042.nopnp in order to make the keyboard and touchpad work.

The majority of the notebook's devices work fine; nevertheless, ACPI is not recognizing any of these:
- AC Adapter [AC]
- Sleep Button [SLPB]
- Lid Switch [LID]
- Power Button [PWRF]
This means no battery and no suspend&resume.

I guess this will require a new bug report, correct?
Comment 29 goffyara 2019-07-25 21:12:19 UTC
Same issue with HP ENVY x360 15-ds0003ur (Ryzen 7 3700U).


I tried modifying the FACP table and loading it as a CPIO archive from the bootloader (maybe I did something wrong), but it had no effect, either with the 5.0 kernel on the Ubuntu live ISO or with 5.1.15 on the Arch ISO.
Comment 30 boldos 2019-07-26 08:45:48 UTC
You have directly modified the ISOs? And tried to boot those modified ISOs?

First I installed Ubuntu 18.04 with kernel params "nolapic acpi=off". Then I added the modified FACP tables (as cpio archive) to the installed initrd and then tried to boot the installed system without the kernel parameters above.
Comment 31 goffyara 2019-07-26 10:18:46 UTC
> You have directly modified the isos?

I wrote the ISO to a USB stick. Then I added the cpio file as described at https://wiki.archlinux.org/index.php/DSDT#Using_a_CPIO_archive and added the string to the grub config. I can boot only with the parameters 'nospec_store_bypass_disable acpi=off'.

Boldos, must I install Ubuntu first?
Comment 32 boldos 2019-07-26 12:08:52 UTC
(In reply to goffyara from comment #31)
> > You have directly modified the isos?
> 
> I wrote the ISO to a USB stick. Then I added the cpio file as described at
> https://wiki.archlinux.org/index.php/DSDT#Using_a_CPIO_archive and added
> the string to the grub config. I can boot only with the parameters
> 'nospec_store_bypass_disable acpi=off'.
> 
> Boldos, must I install Ubuntu first?

Just sent you an email; let's take the detailed discussion elsewhere, outside of this forum...
