Kernel Bug Tracker – Bug 14129
2.6.31 regression - pci_get_slot oops, udev boot hang - toshiba X200
Last modified: 2009-10-26 19:22:38 UTC
My laptop is a Toshiba X200 with an Intel Core 2 Duo 7500 (Centrino) CPU and an NVIDIA 8700M GT card.
I want to test Fedora Rawhide (the future F12), but Fedora does not boot with kernel 2.6.31 (it works with kernel 2.6.30).
When I install a 2.6.31 kernel (the latest is 2.6.31-0.199.rc8.git2.fc12.x86_64) and try to boot it, I get a black screen and the laptop freezes: the keyboard does not respond, the fan runs at full speed, and I have to do a hard reset.
As a test, I compiled a 2.6.31 kernel myself and installed it, with the same result.
So I think it is not a Fedora problem but a kernel bug affecting my laptop.
I already had a problem with kernel 2.6.29 on this laptop (see this bug: http://bugzilla.kernel.org/show_bug.cgi?id=12735 ), and you resolved it with a patch.
please attach lspci -vvv and dmesg of a working boot and describe the failing boot in more detail. what messages do you see? if you see nothing, can you remove params like "quiet" or "splash" and add "vga=normal" to the bootparams?
I removed quiet and added vga=normal, and now I see all the boot messages, but they scroll too fast for me to write them down.
I see the progress bar, but when it finishes the boot seems to stop.
I waited 15 minutes and then rebooted with CTRL+ALT+BACKSPACE.
I looked in the logs, but there is no dmesg for this boot.
I attach lspci -vvv and dmesg for a boot with the 2.6.30 kernel.
Created attachment 23020 [details]
Created attachment 23021 [details]
>nvidia: module license 'NVIDIA' taints kernel.
>Disabling lock debugging due to kernel taint
can you please retry without the nvidia module?
(either temporarily rename it or uninstall the appropriate package)
most kernel developers won't look at your problem if closed source modules are loaded, as they are often the source of problems which can't be solved here.
it's just to make sure that the nvidia module is NOT the source of the problem.
what progress bar do you see?
please boot into text-mode only and check if that works without problems.
if that is the case, try starting X11 manually.
Sorry, on F12 I have no such kernel module installed...
But I made a mistake and attached the dmesg from my F11.
I am creating a new attachment with the correct dmesg (for F12).
Created attachment 23024 [details]
I also booted in text mode, with the same result.
The boot hangs after the progress bar, and I am not able to run startx.
The progress bar is the Fedora one (a blue and white line) that advances during boot. Normally the login screen comes after it, but not with the 2.6.31 kernel.
Command line: ro root=UUID=215e72f1-9f2d-4940-96d8-f0f28c3854a3 rhgb vga=0x365 quiet LANG=fr_FR.UTF-8 SYSFONT=latarcyrheb-sun16 KEYBOARDTYPE=pc KEYTABLE=fr-latin9 rd_plytheme=charge
>and describe more details
>about the failing boot. what messages do you see? if you see nothing, can you
>remove params like "quiet" or "splash" and add "vga=normal" to the bootparams?
apparently, you did not remove "quiet" and did not add "vga=normal" to the bootparams.
please do that.
furthermore remove "rhgb", as the rhgb param enables the progress bar, afaik. you can also try hitting ESC during boot, which may disable the progress bar as well. instead of vga=normal you can also use vga=ask and choose a console with enough lines to show the whole kernel trace/oops (if there is one)
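To make the requested change concrete, here is a minimal sketch in Python that rewrites a kernel command line the way described above: it drops "quiet", "rhgb" and "splash", removes any existing vga= setting, and appends vga=ask. The function name debug_cmdline is made up for this illustration; in practice the line is edited by hand in the boot loader, and the sample input is the command line from the attached dmesg.

```python
def debug_cmdline(cmdline, drop=("quiet", "rhgb", "splash"), vga="ask"):
    """Return a copy of a kernel command line suited to reading boot messages.

    Hypothetical helper for illustration only: it removes the parameters
    that hide console output and replaces any vga= mode with the given one.
    """
    kept = []
    for param in cmdline.split():
        if param in drop:
            continue  # these hide or cover the kernel messages
        if param.startswith("vga="):
            continue  # replaced by the single vga= appended below
        kept.append(param)
    kept.append(f"vga={vga}")
    return " ".join(kept)


# Command line taken from the reporter's dmesg.
original = (
    "ro root=UUID=215e72f1-9f2d-4940-96d8-f0f28c3854a3 rhgb vga=0x365 quiet "
    "LANG=fr_FR.UTF-8 SYSFONT=latarcyrheb-sun16 KEYBOARDTYPE=pc "
    "KEYTABLE=fr-latin9 rd_plytheme=charge"
)
print(debug_cmdline(original))
```

All other parameters (root=, LANG=, KEYTABLE= and so on) are left untouched and in their original order.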
I did remove quiet and add vga=normal...
Now I have removed "rhgb" and used vga=ask.
I see an error message, and I copied it by hand (see attachment 2.6.31.boot.txt).
Created attachment 23026 [details]
can you set udev_log="debug" in /etc/udev/udev.conf and post the logs of a good boot with the working kernel and a bad boot with 2.6.31?
I set udev_log="debug" in /etc/udev/udev.conf, and with that even my 2.6.30 kernel won't boot. After pressing ESC I see many lines beginning with "udev". I waited 10 minutes and then rebooted my laptop.
But I have a boot.log for this boot (see attachment boot.log.2.6.30.txt).
With the 2.6.31 kernel the same thing happens, but I have no boot.log for that boot.
Created attachment 23061 [details]
I took 2 photos of the 2.6.31 boot:
Photo1:photo of boot:
Photo2:photo after CTRL+ALT+DEL:
If that helps you...
I can boot a 2.6.31 kernel on my Toshiba laptop by adding the acpi=off option at boot.
But that is not an acceptable solution, because with it I have no temperature monitoring for the CPU and GPU...
this could be related: http://lkml.org/lkml/2009/7/20/426
As a test, I compiled and installed a 2.6.31 kernel with the patch from http://lkml.org/lkml/2009/7/20/426 and removed the acpi=off boot option, but that doesn't work...
Same result as before...
As a test, I tried another thing.
I use Fedora 11, and to check whether the Fedora patches are the origin of the problem, I took a 2.6.31 kernel (here: http://www.kernel.org/pub/linux/kernel/v2.6/linux-2.6.31.tar.bz2) and compiled and installed it on my Fedora 11.
This kernel doesn't boot unless I add acpi=off at boot (same result as before...).
I am not a programmer, and I cannot write a patch.
But if I understand the log correctly, the problem is with LNXVIDEO and LNXSYSTM...
I just want to add a comment:
I think this is a regression, because ACPI works on my laptop (Toshiba X200) with a 2.6.30 kernel and does not work with a 2.6.31 kernel.
If you want, I can do a regression test, if you explain to me how to do it.
Please, can you move this bug from "Other" to "ACPI"?
sorry, i don't have the proper permission for this. don't know who's looking at this, either.
>If you want, I can do a regression test, if you explain me how I can do it.
yes, that would be great
you can systematically and quite efficiently search for the "offending" patch (i.e. git commit) which introduced the problem.
the magic word to google for is "git bisect".
if you have difficulties finding a good tutorial on how to git bisect, please let me know.
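For anyone unfamiliar with the idea: bisection is a binary search over the commit history between a known good and a known bad kernel. With git itself, a session is started with "git bisect start", and each tested kernel is then marked with "git bisect good" or "git bisect bad" until git names the first bad commit. The Python sketch below only illustrates the search logic. The commit names are invented, and the is_bad callback is a stand-in for "build this kernel, boot it, and see whether it hangs".

```python
def first_bad(commits, is_bad):
    """Binary search for the first bad commit.

    commits is ordered oldest to newest, commits[0] is known good and
    commits[-1] is known bad. is_bad(commit) stands in for the manual
    build-and-boot test. Returns (first bad commit, number of tests run).
    """
    good, bad = 0, len(commits) - 1
    steps = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        steps += 1
        if is_bad(commits[mid]):
            bad = mid   # breakage is at mid or earlier
        else:
            good = mid  # breakage is after mid
    return commits[bad], steps


# Hypothetical history: 1000 commits, breakage introduced at "c0640".
history = [f"c{i:04d}" for i in range(1000)]
culprit, tested = first_bad(history, lambda c: c >= "c0640")
print(culprit, tested)
```

Each test halves the remaining range, so even a thousand commits need only about ten kernel builds, which is why bisecting is practical for a regression between two releases.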
After searching the web, I am not able to do a bisect. My English is too poor and I did not understand how to do it...
But I found http://bugzilla.kernel.org/show_bug.cgi?id=14211 and the problem is the same as mine.
Unfortunately, I tried the patch and it doesn't work with my laptop (I tried both ways: patch and patch -R).
If that can help you...
After much searching and testing, I was able to do a git bisect.
The result is:
80ffdedf6020a77adcd06c01cfe6c488312b28f8 is the first bad commit
Author: Alexander Chiang <firstname.lastname@example.org>
Date: Wed Jun 10 19:55:55 2009 +0000
ACPI: kill acpi_get_pci_id
acpi_get_pci_dev() is better, and all callers have been converted, so
Signed-off-by: Alex Chiang <email@example.com>
Acked-by: Bjorn Helgaas <firstname.lastname@example.org>
Signed-off-by: Len Brown <email@example.com>
:040000 040000 d4df802ef1782e3ec795be4fb015f1b797613c4e 0499fac9c0a9b479379f42d120ed72d75b9c2174 M drivers
:040000 040000 a86418d0e1e49735be64671e6802010cb960d6da 66a3d0b724af6f89d064f9d026e6f68a02a2517d M include
If that can help you...
Created attachment 23269 [details]
pci_root: fix NULL pointer deref after resume from suspend
Can you please try this patch that Rafael wrote?
I tried your patch with the latest kernel of Fedora 11 (220.127.116.11-58.fc12.x86_64), and with this patch I can boot without the acpi=off option...
Thanks a lot, and I hope that this patch will be included in future kernels.
Thanks for your responsiveness and your knowledge.
Patch : http://bugzilla.kernel.org/attachment.cgi?id=23269
Handled-By : Alex Chiang <firstname.lastname@example.org>
Handled-By : Rafael J. Wysocki <email@example.com>
*** Bug 14317 has been marked as a duplicate of this bug. ***
Ignore-Patch : http://bugzilla.kernel.org/attachment.cgi?id=23269
Patch : http://patchwork.kernel.org/patch/51834/
Fixed by commit 497fb54f578efd2b479727bc88d5ef942c0a1e2d .