Latest working kernel version: 2.6.18
Earliest failing kernel version: 2.6.19
Distribution: 2.6.26.5 from kernel.org
Hardware Environment: AMD Duron 800. VIA KT133A motherboard
Software Environment: Debian testing
Problem Description: The CD-RW unit at /dev/hdc is not properly detected, which makes udev wait for 180 seconds at boot.

Steps to reproduce: Boot using 2.6.26.5. The kernel boots, shows a few messages, and then seems to stop. If I wait for 180 seconds, a message like this is shown and the boot process ends:

After the udevadm settle timeout, the events queue contains: 1180: /block/hdc

At this moment, the eject button of the CD-RW unit at /dev/hdc does not work. Surprisingly, trying to access the unit via "less -f /dev/hdc" seems to fix it.

The problem was introduced somewhere between 2.6.18 and 2.6.19. This is the result of git-bisect:

4aff5e2333c9a1609662f2091f55c3f6fffdad36 is first bad commit
commit 4aff5e2333c9a1609662f2091f55c3f6fffdad36
Author: Jens Axboe <axboe@suse.de>
Date: Thu Aug 10 08:44:47 2006 +0200

    [PATCH] Split struct request ->flags into two parts

    Right now ->flags is a bit of a mess: some are request types, and others are just modifiers. Clean this up by splitting it into ->cmd_type and ->cmd_flags. This allows introduction of generic Linux block message types, useful for sending generic Linux commands to block devices.

    Signed-off-by: Jens Axboe <axboe@suse.de>

:040000 040000 ff931af25471578be78885d8e27e9e0df829b49d f5edcbd2a9424828cfb4f1579d672c95bba7a4a0 M block
:040000 040000 5e4d7235fa7d0a48cb3b23399905dc3d472d738e 05a44d13e66ce6e6bdc6e9d697675c32799c70e3 M drivers
:040000 040000 887303b2f4077cc43bd23e42d0b104cab05655b1 50c82dbe8394b6b8e5bd169c182e0b4cc3d71963 M include

Will include dmesg and lspci output as soon as I find the attach button.
Created attachment 17834 [details] output of lspci -v
Created attachment 17835 [details] output of dmesg
Marked as a regression, reassigned to IDE, cc'ed Jens. Jens, please note that this was bisected down to a block layer change.
Hmm, interesting. So the drive is actually detected, but commands issued later by udev are timing out. Could you double check whether 2.6.27-rc6 is broken or not? CC'ing Bart and Borislav.
2.6.27-rc6 is also broken. Note: This time I've had to use the .config from the Debian package for 2.6.26, as the one provided by "make defconfig" didn't detect the hard disk (!). The behaviour is the same: a long wait while udev is trying to detect hdc, the "udevadm settle" timeout, the eject button does not work, and a simple "less -f /dev/hdc" makes the eject actually happen.
Hi, can you please try the attached patch and send me the dmesg output? Thanks.
Created attachment 17848 [details] spit failing command patch
(In reply to comment #7)
> Created an attachment (id=17848) [details]
> spit failing command patch

Could you tick the "patch" checkbox on this attachment?
Created attachment 17854 [details] dmesg after patching ide-io.c
More info, though I don't know whether it's relevant. If I leave the system alone and don't try to wake up the cdrom by doing "less -f /dev/hdc", then the following messages are appended to dmesg (see next attachment).
Created attachment 17855 [details] More dmesg output
Ok, those are follow-up traces from the soft lockup detector code showing that we're stuck trying to revalidate the disk after reading the TOC. There are also some ioctls which come from somewhere else, so we'll have to enable full debugging output in order to see exactly what happens. Here's a debugging patch. It is pretty big; please recompile with it and send me the whole boot log. The dmesg might not be complete, since the debug output is going to be a lot more verbose and will overflow the ring buffer, so try to copy it from /var/log/syslog or similar. Thanks.
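As a minimal sketch of pulling the kernel messages out of the syslog file mentioned above, something like the following could be used. It assumes the traditional syslog line layout ("timestamp host kernel: message"); the function name `kernel_messages` is made up for this illustration, and the actual layout depends on the syslog daemon's configuration.

```python
def kernel_messages(lines):
    """Yield the message part of each syslog line that was logged by the kernel.

    Assumption: lines look like "<timestamp> <host> kernel: <message>".
    Lines from other sources (udevd, cron, ...) are skipped.
    """
    marker = " kernel: "
    for line in lines:
        _prefix, found, message = line.partition(marker)
        if found:
            yield message.rstrip("\n")


# Example use (path taken from the comment above; adjust as needed):
#
#     with open("/var/log/syslog", errors="replace") as log:
#         for message in kernel_messages(log):
#             print(message)
```

Because the file is read line by line, this should cope with a boot log that is much larger than the kernel ring buffer.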
Created attachment 17856 [details] enable full debugging info
Here it is. Notes:

* This is 2.6.27-rc6, as the patch didn't apply cleanly to 2.6.26.5.
* I've modified the file /usr/share/initramfs-tools/scripts/init-premount/udev, which is used on my system to build the initramfs, so that "udevadm settle" takes only 30 seconds instead of the default 180.
Created attachment 17858 [details] dmesg with full debug patch
Using 2.6.27-rc6, "cat /dev/hdc" produces a kernel panic and kdb is started. [ Would love to cut and paste but there is no bash anymore ].
You can catch the output with a serial console or netconsole.
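For the netconsole route, the receiving machine only needs something that listens for UDP datagrams and saves them. Below is a rough sketch of such a receiver, not a reference tool. The helper names `open_listener` and `capture` are invented for this example, and port 6666 is assumed to be the target port configured on the sending side.

```python
import socket


def open_listener(port, host="0.0.0.0"):
    """Return a UDP socket bound to host:port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def capture(sock, out, max_packets=None, timeout=None):
    """Write each received datagram to the binary file object `out`.

    Stops after `max_packets` datagrams, or when `timeout` seconds pass
    without traffic. With both left as None it runs until interrupted.
    Returns the number of datagrams written.
    """
    sock.settimeout(timeout)
    received = 0
    while max_packets is None or received < max_packets:
        try:
            data, _sender = sock.recvfrom(65535)
        except socket.timeout:
            break
        out.write(data)
        out.flush()  # keep the log usable even if the capture is killed
        received += 1
    return received


# Example use (port 6666 is an assumption, match it to the sender):
#
#     with open_listener(6666) as sock, open("netconsole.log", "ab") as log:
#         capture(sock, log)
```

Flushing after every datagram is deliberate: when the sending machine oopses, the last few lines are usually the interesting ones.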
Ok. This is what netconsole was able to catch. It's an oops.
Created attachment 17879 [details] oops after cat /dev/hdc
Can you now do

objdump -d drivers/ide/ide-cd.o > drivers/ide/ide-cd.dsm

and

make drivers/ide/ide-cd.s

and send me the .s and .dsm files? Thanks.
Created attachment 17885 [details] ide-cd.s
Created attachment 17886 [details] ide-cd.dsm
This is a NULL ptr access in the debugging printk. Here's a fix.
Created attachment 17887 [details] unrelated NULL ptr fix
Hi, does the above patch fix the oops you get? I'm still working on the main problem but it looks pretty hairy... Thanks.
[ Sorry for the delay, the computer is the one I use at work ]. The patch seems to fix the previous oops, but now there is a new one. The netconsole output for this one follows.
Created attachment 17937 [details] new oops after cat /dev/hdc
I opened a similar (possibly the same) bug at http://bugzilla.kernel.org/show_bug.cgi?id=10216
drivers/ide is now obsolete