Bug 206343
Summary: | Hanging on boot: error parsing RSDP address - Intel(R) Core(TM) i9-9820X | ||
---|---|---|---|
Product: | ACPI | Reporter: | Steven Clarkson (sc) |
Component: | BIOS | Assignee: | acpi_bios |
Status: | RESOLVED CODE_FIX | ||
Severity: | normal | CC: | bp, rui.zhang |
Priority: | P1 | ||
Hardware: | Intel | ||
OS: | Linux | ||
Kernel Version: | 5.3 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: | dmesg output from workaround |
Turns out this causes the kernel to hang in the while loop parsing the SRAT table in count_immovable_mem_regions. After dumping the SRAT table, it looks like there's 320 bytes of zeros in the middle of it.

Sure enough, dmesg complains

[ 0.007413] ACPI: [SRAT:0x00] Invalid zero length
[ 0.007415] ACPI: [SRAT:0x01] Invalid zero length

Proposed patch below.

diff --git a/arch/x86/boot/compressed/acpi.c b/arch/x86/boot/compressed/acpi.c
index 25019d42ae93..7369de333eda 100644
--- a/arch/x86/boot/compressed/acpi.c
+++ b/arch/x86/boot/compressed/acpi.c
@@ -394,6 +394,12 @@ int count_immovable_mem_regions(void)
 
 	while (table + sizeof(struct acpi_subtable_header) < table_end) {
 		sub_table = (struct acpi_subtable_header *)table;
+
+		if (!sub_table->length) {
+			debug_putstr("Invalid zero length SRAT subtable.\n");
+			break;
+		}
+
 		if (sub_table->type == ACPI_SRAT_TYPE_MEMORY_AFFINITY) {
 			struct acpi_srat_mem_affinity *ma;

(In reply to Steven Clarkson from comment #1)
> Turns out this causes the kernel to hang in the while loop parsing the SRAT
> table in count_immovable_mem_regions. After dumping the SRAT table, it looks
> like there's 320 bytes of zeros in the middle of it.

Of course. Qwalitee BIOS. ;-\

> Sure enough, dmesg complains
>
> [ 0.007413] ACPI: [SRAT:0x00] Invalid zero length
> [ 0.007415] ACPI: [SRAT:0x01] Invalid zero length
>
> Proposed patch below.
>
>
> diff --git a/arch/x86/boot/compressed/acpi.c
> b/arch/x86/boot/compressed/acpi.c
> index 25019d42ae93..7369de333eda 100644
> --- a/arch/x86/boot/compressed/acpi.c
> +++ b/arch/x86/boot/compressed/acpi.c
> @@ -394,6 +394,12 @@ int count_immovable_mem_regions(void)
>
> 	while (table + sizeof(struct acpi_subtable_header) < table_end) {
> 		sub_table = (struct acpi_subtable_header *)table;
> +
> +		if (!sub_table->length) {
> +			debug_putstr("Invalid zero length SRAT subtable.\n");
> +			break;
> +		}
> +
> 		if (sub_table->type == ACPI_SRAT_TYPE_MEMORY_AFFINITY) {
> 			struct acpi_srat_mem_affinity *ma;

Yah, makes a lot of sense to me. Especially if this has been already encountered with other BIOSes. Sounds like the qwalitee work has been spread around.

Please submit a proper patch to LKML documenting which BIOS version it is and CC me. If you need help with creating the patch, just ask.

Thx.

Fixed in 2b73ea379624 ("x86/boot: Handle malformed SRAT tables during early ACPI parsing")
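For anyone reading along who wants to see why a zero-length subtable hangs the loop: the walker advances by each subtable's own length field, so a length of 0 means the pointer never moves. Below is a minimal userspace sketch of that walk with the guard from the proposed patch. It is not the kernel code; the function name count_memory_affinity, the malformed out-parameter, and the toy 4-byte subtables are assumptions made for illustration (real SRAT memory affinity entries are larger).

```c
#include <stddef.h>
#include <stdint.h>

/* SRAT subtable type for memory affinity (value 1 in the ACPI spec). */
#define SRAT_TYPE_MEMORY_AFFINITY 1

/* Every ACPI subtable starts with a one-byte type and a one-byte length. */
#define SUBTABLE_HEADER_SIZE 2

/*
 * Hypothetical helper: walk the subtables in buf[0..len) and count the
 * memory affinity entries. If a zero-length subtable is found, stop
 * walking, set *malformed to 1, and return the count gathered so far.
 * Without that check, "off += length" would add 0 forever.
 */
static int count_memory_affinity(const uint8_t *buf, size_t len, int *malformed)
{
	size_t off = 0;
	int count = 0;

	*malformed = 0;

	/* Same shape as the loop condition in the patch context above. */
	while (off + SUBTABLE_HEADER_SIZE < len) {
		uint8_t type = buf[off];
		uint8_t length = buf[off + 1];

		if (length == 0) {
			/* Guard from the proposed patch: bail out instead of spinning. */
			*malformed = 1;
			break;
		}

		if (type == SRAT_TYPE_MEMORY_AFFINITY)
			count++;

		off += length;
	}

	return count;
}
```

Feeding this a buffer with one valid memory affinity entry followed by a run of zeros returns 1 and sets the malformed flag, which mirrors what the patched loop does on the affected firmware: it stops at the first zeroed entry and keeps whatever was counted before it.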
Created attachment 287015
dmesg output from workaround

After upgrading my kernel to 5.3, my machine hangs at boot. GRUB reports that it is booting the kernel, then the machine hangs indefinitely.

The motherboard is an ASUS WS X299 SAGE, with firmware version 1201. Anecdotally, I believe this affects most recent ASUS motherboards that are not running the latest firmware.

Last known working kernel was 5.2. I was able to bisect the issue to commit

8e44c7840 Revert "x86/boot: Disable RSDP parsing temporarily"

I was able to boot my system by applying the patch below to the most recent kernel, 5.5. The output of dmesg is attached.

diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
index 9652d5c2afda..5df966201abd 100644
--- a/arch/x86/boot/compressed/misc.c
+++ b/arch/x86/boot/compressed/misc.c
@@ -373,7 +373,7 @@ asmlinkage __visible void *extract_kernel(void *rmode, memptr heap,
 	 * so that early debugging output from the RSDP parsing code can be
 	 * collected.
 	 */
-	boot_params->acpi_rsdp_addr = get_rsdp_addr();
+	// boot_params->acpi_rsdp_addr = get_rsdp_addr();
 
 	debug_putstr("early console in extract_kernel\n");