Most recent kernel where this bug did not occur: -
Distribution: Slackware
Hardware Environment: Sony Vaio VGN-SZ340 - Core 2 Duo T7200 (2 GHz), 1 GB RAM, 100 GB HD, ipw3945 wireless, 2 video cards (Intel and NVIDIA Go 7400)
Software Environment: Slackware defaults + KDE
Problem Description: With ACPI enabled in the kernel, `cat /proc/cpuinfo` shows only one CPU core. Booting with "acpi=off" shows both cores.
Steps to reproduce: Build the kernel with both ACPI and SMP enabled, then boot and run `cat /proc/cpuinfo`.
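For anyone reproducing this, the check in the description can be scripted. The following is a minimal sketch, not part of the original report: the function name and the sample text are made up for illustration, and it assumes the x86 layout of /proc/cpuinfo, where each logical CPU has one "processor : N" line.

```python
def count_logical_cpus(cpuinfo_text):
    """Count the 'processor : N' entries in /proc/cpuinfo-style text."""
    count = 0
    for line in cpuinfo_text.splitlines():
        key, sep, _value = line.partition(":")
        if sep and key.strip() == "processor":
            count += 1
    return count


# Abbreviated, made-up samples (not taken from the reporter's machine).
SAMPLE_ONE_CORE = "processor\t: 0\nmodel name\t: Example CPU\n"
SAMPLE_TWO_CORES = SAMPLE_ONE_CORE + "\nprocessor\t: 1\nmodel name\t: Example CPU\n"

print(count_logical_cpus(SAMPLE_ONE_CORE))
print(count_logical_cpus(SAMPLE_TWO_CORES))
```

On a live system the same function could be fed the contents of /proc/cpuinfo, once after a normal boot and once after booting with "acpi=off".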
Please attach the dmesg and acpidump output. :)
Created attachment 9636 [details] Dmesg
Created attachment 9637 [details] Dmesg
From the dmesg, you can see:
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
Processor #0 6:15 APIC version 20
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] disabled)
The ACPI processor driver failed to start the second CPU because the data read from the MADT shows that the second CPU is disabled.
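The LAPIC lines quoted above can be picked out of a dmesg log mechanically. This is only a sketch: the regular expression is written against the exact line format shown in this comment, the helper names are hypothetical, and other kernel versions may print the entries differently.

```python
import re

# Matches lines such as:
#   ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] disabled)
LAPIC_RE = re.compile(
    r"ACPI: LAPIC \(acpi_id\[(0x[0-9a-fA-F]+)\] "
    r"lapic_id\[(0x[0-9a-fA-F]+)\] (enabled|disabled)\)"
)


def lapic_entries(dmesg_text):
    """Return (acpi_id, lapic_id, enabled) for each LAPIC line found."""
    return [
        (int(acpi_id, 16), int(lapic_id, 16), state == "enabled")
        for acpi_id, lapic_id, state in LAPIC_RE.findall(dmesg_text)
    ]


def disabled_lapics(dmesg_text):
    """Return the ACPI ids of the LAPIC entries reported as disabled."""
    return [acpi_id for acpi_id, _lapic_id, enabled in lapic_entries(dmesg_text)
            if not enabled]


SAMPLE = """ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
Processor #0 6:15 APIC version 20
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] disabled)
"""

print(disabled_lapics(SAMPLE))
```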
Created attachment 9638 [details] DSDT I got this file by running cat /proc/acpi/dsdt and decompiling the result with iasl -d.
The whole acpidump output is needed; the acpidump tool is available here: http://ftp.kernel.org/pub/linux/kernel/people/lenb/acpi/utils IMO, there should be some BIOS options related to the LAPIC or multi-core support, and you may have disabled them. It would be nice if you could check them. :)
>ACPI: 2 duplicate APIC table ignored. The BIOS has two MADT tables. Rui, please check which MADT table is valid.
Created attachment 9650 [details] ACPI Dump ACPI dump made with the tool you linked. I checked the BIOS and there is nothing regarding multiple cores or the LAPIC. It's a Phoenix BIOS and it has only a few options (none of them regarding multiple cores or the LAPIC =( ). Note that if I boot with acpi=off it detects both cores. Thanks
Did you check if there is a BIOS update? Or did you try WinXP? This looks like a BIOS bug to me.
To Artur Souza: It's really strange that your BIOS has two MADTs. The first one disables the CPU1 LAPIC, and this causes the problem you describe. The kernel only parses the first MADT and therefore disables CPU1.
Well, under WinXP it works perfectly, and I couldn't find a BIOS upgrade on Sony's website (this model is pretty new... VGN-SZ340). Do you think there is any way to make the kernel parse the second MADT? Thanks!
Created attachment 9702 [details] workaround to parse the second MADT This is a workaround to make the kernel parse the second MADT on your laptop. Please test it and attach the _DMESG_ output with this patch applied. :)
Created attachment 9703 [details] Dmesg with patch dmesg with patch
Comment on attachment 9703 [details] Dmesg with patch The patch didn't work. Here is the dmesg. It still shows only one core. Thanks for your help! =)
Created attachment 9706 [details] workaround to ignore the first MADT Sorry, the last one is wrong. How about this one? I think it should work. :)
Created attachment 9707 [details] Dmesg using the second patch Now it worked! I'm using ACPI and both cores. I noticed that the fan is still "working hard", as it was when I ran without ACPI... but now I can use ACPI to control the brightness of the monitor and so on, and it shows both cores. What is the next step? =) Thanks!
The fan was working "hard" because I hadn't set up the correct scaling governor. But I was only able to set up the scaling governor for cpu0 and not for cpu1. Was that expected? Thanks
>But I was able just to setup the scaling governor for cpu0 and not for cpu1.
What do you mean by "can't set up the scaling governor for cpu1"? Is only "cpu0" shown under /sys/devices/system/cpu/? Or is "cpufreq" missing under "cpu1"? I'm not quite clear. :)
> What do you mean by "can't setup the scaling governor for cpu1"?
> Only "cpu0" is shown under /sys/devices/system/cpu/?
> Or "cpufreq" is lost under "cpu1"? I'm not quite clear. :)
Oh, sorry. I mean that there is no "cpufreq" under cpu1.... =)
OK. Please add the boot option cpufreq.debug=7 and attach the _DMESG_ output. :)
With the option you suggested I got the message "Unknown boot option `cpufreq.debug=7': ignoring"...
Check whether CONFIG_CPU_FREQ_DEBUG is set in your kernel config. If not, set it, recompile the kernel and try again. And please attach the kernel config file. :) IMO, it will be more meaningful if you can do this test on the latest kernel release, i.e. 2.6.19. :)
Created attachment 9747 [details] Dmesg with cpufreq.debug=7 Using the last patch you gave me (so both cores are detected). Using kernel 2.6.19 and cpufreq.debug=7
Created attachment 9748 [details] Config used with my kernel 2.6.19
This is some information from your BIOS:
Name (SSDT, Package (0x0C)
{
    "CPU0IST ", 0x3F68BCF8, 0x000001EA,
    "CPU1IST ", 0x00000000, 0xF000FF53,
    "CPU0CST ", 0x3F68BAE1, 0x00000217,
    "CPU1CST ", 0x00000000, 0xF000FF53
})
"CPU1IST" and "CPU1CST" are dynamic tables loaded by the _PDC method, and these tables contain the critical methods for P-state support. But the base address and length given for these two tables are totally wrong. It's really strange that it works well in Windows.... :(
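To make the point about the CPU1 entries concrete, here is a small sketch that walks the package as (name, base address, length) triples and flags the implausible ones. The plausibility rule (zero address, or length above an arbitrary 1 MiB cap) is an assumption chosen for illustration, not a check the kernel is known to perform, and the function names are hypothetical.

```python
# Values copied from the SSDT package quoted above.
SSDT_PACKAGE = [
    "CPU0IST ", 0x3F68BCF8, 0x000001EA,
    "CPU1IST ", 0x00000000, 0xF000FF53,
    "CPU0CST ", 0x3F68BAE1, 0x00000217,
    "CPU1CST ", 0x00000000, 0xF000FF53,
]


def triples(package):
    """Group the flat package into (name, base_addr, length) tuples."""
    if len(package) % 3 != 0:
        raise ValueError("package length must be a multiple of 3")
    return [
        (package[i].strip(), package[i + 1], package[i + 2])
        for i in range(0, len(package), 3)
    ]


def suspicious_tables(package, max_len=0x100000):
    """Names whose address is zero or whose length exceeds max_len.

    Both conditions are illustrative heuristics (assumptions), not
    anything defined by the ACPI specification.
    """
    return [name for name, addr, length in triples(package)
            if addr == 0 or length > max_len]


print(suspicious_tables(SSDT_PACKAGE))
```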
Is there any way to fix these errors?
Is it possible that the BIOS installed on this board does not support the processor that is installed -- or did the board+processor+BIOS all come together from Sony? In any case, the MADT issue clearly moves this into the BIOS bug category. A long while back, we ran into a Dell system with multiple MADTs that used to fail. So we chose the 1st MADT and ignored the other(s). It is possible that we could just as well have ignored the initial one and used the last, which would also work for this Sony. We'll have to dig up the Dell to find out. Re: P-states on 2 CPUs. Setting the governors is handled from user space. It is possible that due to the MADT issue above, the distro installed a script for 1 processor and not the other. If the kernel is exporting /sys/devices/system/cpu/cpu0 and cpu1 and cpufreq files appear beneath both, and you can set the governor in each, and it works, then I don't think there is a kernel issue related to P-states on this machine.
> or did the board+processor+BIOS all come together from Sony?
Yes, they all came together from Sony.
> If the kernel is exporting /sys/devices/system/cpu/cpu0 and cpu1
> and cpufreq files appear beneath both, and you can set the governor
> in each, and it works, then I don't think there is a kernel issue
> related to P-states on this machine.
That's the problem: a cpu1 directory appears inside /sys/devices/system/cpu, but cpufreq appears only inside cpu0 and not inside cpu1.
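The symptom described above (a cpuN directory without a cpufreq subdirectory) can be checked with a short script. This is a sketch only: the function name is made up, and the demo runs against a temporary directory tree that imitates the reported layout rather than against the real sysfs.

```python
import os
import re
import tempfile


def cpus_missing_cpufreq(cpu_root="/sys/devices/system/cpu"):
    """List cpuN directories under cpu_root that have no cpufreq subdirectory."""
    names = [n for n in os.listdir(cpu_root) if re.fullmatch(r"cpu\d+", n)]
    names.sort(key=lambda n: int(n[3:]))  # numeric order: cpu2 before cpu10
    return [n for n in names
            if not os.path.isdir(os.path.join(cpu_root, n, "cpufreq"))]


# Demo on a fake tree that mimics the situation in this report:
# cpu0 has cpufreq, cpu1 does not.
with tempfile.TemporaryDirectory() as fake_root:
    os.makedirs(os.path.join(fake_root, "cpu0", "cpufreq"))
    os.makedirs(os.path.join(fake_root, "cpu1"))
    os.makedirs(os.path.join(fake_root, "cpuidle"))  # ignored: not a cpuN entry
    demo_missing = cpus_missing_cpufreq(fake_root)

print(demo_missing)
```

On the affected machine, calling `cpus_missing_cpufreq()` with its default argument would be expected to report cpu1, assuming sysfs is mounted in the usual place.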
>That's the problem: cpu1 file appear inside /sys/devices/system/cpu but
>cpufreq appears just inside cpu0 and not inside cpu1.
This is a BIOS problem. CPU0 is successfully initialized by the cpufreq driver, while the methods that support P-states for CPU1 are not found in the BIOS. I think they are contained in the dynamically loaded tables "CPU1IST" and "CPU1CST". But as shown in comment #25, the base address and length that the BIOS gives for these two tables are meaningless, so they are not loaded. That's why CPU1 cannot be initialized by the cpufreq driver. I think we have found the root cause of the bug. The only way to fix the P-state problem is to update the BIOS. :(
Created attachment 10014 [details] 2.6.20-rc3 patch adding "acpi_parse_multi_table=1" parameter
Re: duplicate MADT
Please test this patch. When booted with no parameters, it should print a message alerting you to the duplicate tables and asking you to boot with "acpi_parse_multi_table=1". Please boot that way too. Please confirm that the original boot failed to bring up the 2nd core and that the 2nd boot succeeded in bringing up the 2nd core. Please attach the output from dmesg -s 64000 from both.
Created attachment 10027 [details] Dmesg of 2.6.20-rc4 with patch and acpi=off I couldn't even boot with ACPI turned on. The screen remains black and the computer just stops after LILO loads the kernel. With acpi=off I could boot. I used the same .config as with kernel 2.6.19 (above). Thanks
I have a similar problem. Boot failed with "acpi_parse_multi_table=1", with these error messages:
ERROR: Unable to locate IOAPIC for GSI 9
ERROR: Unable to locate IOAPIC for GSI 0
I'll debug further. :)
Created attachment 10044 [details] patch-ACPI-parse-multi-table To Artur Souza: Can you do the same test with this patch, please? Thanks. :)
Len: Only one change is made in the new patch. It makes my laptop boot successfully with "acpi_parse_multi_table=1", even though there is only one MADT here. :) In acpi_table_parse_entries:
- table_end = (unsigned long)table_start + sdt_entry[i].size;
+ table_end = (unsigned long)table_start + sdt_entry[index].size;
If acpi_parse_multi_table is greater than the number of MADTs in the BIOS, "i" equals sdt_count and sdt_entry[i].size doesn't make sense. "index" is the actual table index in sdt_entry[] that should be used here.
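The difference between "i" and "index" can be shown with a simplified model. This is not the kernel code: it is a Python sketch that assumes the scan stops at the requested instance and otherwise runs to the end while remembering the last match, which is one way to arrive at the situation described above. All names and table sizes here are made up.

```python
def find_table(entries, wanted_id, instance):
    """Scan entries for the instance-th table with wanted_id.

    Returns (i, index): i is the loop counter after the scan, index is
    the position of the chosen entry (the last match seen if there are
    fewer than `instance` matches), or None if nothing matched.
    """
    count = 0
    index = None
    i = 0
    while i < len(entries):
        if entries[i]["id"] == wanted_id:
            count += 1
            index = i
            if count == instance:
                break
        i += 1
    return i, index


def table_end(entries, table_start, index):
    """Compute the end address from the entry that was actually chosen."""
    return table_start + entries[index]["size"]


# Made-up table list with a single "APIC" (MADT) entry.
ENTRIES = [
    {"id": "DSDT", "size": 0x100},
    {"id": "APIC", "size": 0x68},
    {"id": "FACP", "size": 0x74},
]

# Asking for a second instance that does not exist: the scan runs off the end.
i, index = find_table(ENTRIES, "APIC", 2)
print(i, index, i == len(ENTRIES))
```

In this model `ENTRIES[i]` is out of range after the failed search (an IndexError in Python, a read past the array in C), while `ENTRIES[index]` still refers to the one MADT that was found.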
Hi, I just tested with the last patch you sent me, but got the same error described above: the screen remains blank and nothing happens when it starts booting. Thanks again
Rui, thanks for fixing the patch. Artur, are you sure that you're running the fixed patch from comment #33? It works for me on several machines with and without duplicate APIC tables.
Yeah, I'm sure about that. Today I'll try again. What happens is that the screen turns black (off) and nothing else happens (the HD LED doesn't even blink). If I turn ACPI off, then the machine can boot. But I'll try again today! Thanks
Created attachment 10061 [details] updated patch w/o BUG_ON in acpi_table_parse() Here's an updated patch that prints a message if a NULL handler is seen in acpi_table_parse() instead of hitting a BUG_ON. Perhaps the BUG_ON was firing before your screen was on. Please try it out, per above.
Created attachment 10084 [details] Dmesg of 2.6.20-rc4 and last patch Now I could boot, both cores were detected (used "acpi_parse_multi_table=1") but frequency scaling is not enabled for the second core yet. Thanks
> ACPI: acpi_table_parse(17, 00000000) HPET NULL handler! Ah, that explains it. boot.c defines acpi_hpet_timer as NULL when CONFIG_HPET_TIMER=n Who'd a thunk it...
Created attachment 10685 [details] 2.6.21-rc3 patch This patch vs. 2.6.21-rc3 adds cmdline "acpi_apic_instance=" and changes the default for that parameter to 2 in order to parse the 2nd APIC/MADT by default. Please test.
It worked well, just like the other patches. But I still don't have cpufreq for the second core. Any ideas? Thanks
The patch fixing the primary issue, the duplicate MADT issue and missing cpu1 in ACPI mode, shipped in Linux-2.6.21-rc4-git7. So that issue and this bug report should be closed. Please open a new bug report against the cpufreq on cpu1 issue.