Latest working kernel version: 2.6.18
Earliest failing kernel version: 2.6.19
Distribution: 126.96.36.199 from kernel.org
Hardware Environment: AMD Duron 800. VIA KT133A motherboard
Software Environment: Debian testing
Problem Description: CD-RW unit at /dev/hdc is not properly detected, which makes udev wait for 180 seconds at boot
Steps to reproduce:
Boot using 188.8.131.52. Kernel boots, shows a few messages, and then it seems to stop. If I wait for 180 seconds a message like this is shown and the boot process ends:
After the udevadm settle timeout, the events queue contains:
At this point, the eject button of the CD-RW unit at /dev/hdc does not work.
Surprisingly, trying to access the unit via "less -f /dev/hdc" seems to fix it.
The problem was introduced somewhere between 2.6.18 and 2.6.19.
This is the result of git-bisect:
4aff5e2333c9a1609662f2091f55c3f6fffdad36 is first bad commit
Author: Jens Axboe <email@example.com>
Date: Thu Aug 10 08:44:47 2006 +0200
[PATCH] Split struct request ->flags into two parts
Right now ->flags is a bit of a mess: some are request types, and
others are just modifiers. Clean this up by splitting it into
->cmd_type and ->cmd_flags. This allows introduction of generic
Linux block message types, useful for sending generic Linux commands
to block devices.
Signed-off-by: Jens Axboe <firstname.lastname@example.org>
:040000 040000 ff931af25471578be78885d8e27e9e0df829b49d f5edcbd2a9424828cfb4f1579d672c95bba7a4a0 M block
:040000 040000 5e4d7235fa7d0a48cb3b23399905dc3d472d738e 05a44d13e66ce6e6bdc6e9d697675c32799c70e3 M drivers
:040000 040000 887303b2f4077cc43bd23e42d0b104cab05655b1 50c82dbe8394b6b8e5bd169c182e0b4cc3d71963 M include
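For readers unfamiliar with the bisected commit, the idea described in its message (one ->flags word split into a request type plus a set of modifier bits) can be sketched roughly as below. This is only an illustration: every name prefixed with sketch_ is hypothetical and simplified, none of it is copied from the kernel sources, and it makes no claim about where the actual regression lies.

```c
#include <stdio.h>

/* Hypothetical, simplified names; NOT the kernel's actual definitions. */

/* What kind of request this is: exactly one value at a time. */
enum sketch_cmd_type {
    SKETCH_TYPE_FS = 1,   /* ordinary filesystem read/write */
    SKETCH_TYPE_PACKET,   /* pass-through packet command */
    SKETCH_TYPE_SPECIAL   /* driver-private request */
};

/* Modifiers: independent bits that can be combined freely. */
enum sketch_cmd_flags {
    SKETCH_FLAG_WRITE  = 1u << 0,
    SKETCH_FLAG_QUIET  = 1u << 1,
    SKETCH_FLAG_FAILED = 1u << 2
};

struct sketch_request {
    enum sketch_cmd_type cmd_type;  /* compared with ==, never masked */
    unsigned int cmd_flags;         /* tested with &, set with |= */
};

/* A type check is an equality test on cmd_type. */
static int sketch_is_packet(const struct sketch_request *rq)
{
    return rq->cmd_type == SKETCH_TYPE_PACKET;
}

/* A modifier check is a bit test on cmd_flags. */
static int sketch_is_quiet(const struct sketch_request *rq)
{
    return (rq->cmd_flags & SKETCH_FLAG_QUIET) != 0;
}

/* Print a short description of a request for debugging. */
static void sketch_describe(const struct sketch_request *rq)
{
    printf("type=%d packet=%d quiet=%d\n",
           (int)rq->cmd_type, sketch_is_packet(rq), sketch_is_quiet(rq));
}
```

With a layout like this, a driver check that used to mask a type bit out of the single flags word has to become an equality test on the type field, so every such check in every driver needs converting when the split is made.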
Will include dmesg and lspci output as soon as I find the attach button.
Created attachment 17834 [details]
output of lspci -v
Created attachment 17835 [details]
output of dmesg
Marked as a regression, reassigned to IDE, cc'ed Jens.
Jens, please note that this was bisected down to a block layer change.
Hmm, interesting. So the drive is actually detected, but commands issued later by udev are timing out. Could you double-check whether 2.6.27-rc6 is broken or not?
CC'ing Bart and Borislav.
2.6.27-rc6 is also broken.
Note: This time I've had to use the .config from the Debian package for 2.6.26, as the one provided by "make defconfig" didn't detect the hard disk (!).
The behaviour is the same: udev waits while trying to detect hdc,
"udevadm settle" times out, the eject button does not work, and a simple
"less -f /dev/hdc" makes the eject actually happen.
Can you please try the attached patch and send me the dmesg output?
Created attachment 17848 [details]
spit failing command patch
(In reply to comment #7)
> Created an attachment (id=17848) [details]
> spit failing command patch
Could you tick the "patch" checkbox on this attachment?
Created attachment 17854 [details]
dmesg after patching ide-io.c
More info; I don't know whether it's relevant. If I leave the system alone and don't try to wake up the cdrom by doing "less -f /dev/hdc", then the following messages are appended to dmesg (see next attachment).
Created attachment 17855 [details]
More dmesg output
Ok, those are follow-up traces from the soft lockup detector code showing that we're stuck trying to revalidate the disk after reading the TOC. There are also some ioctls which come from somewhere else, so we'll have to enable full debugging output in order to see exactly what happens. Here's a debugging patch. It is pretty big; please recompile with it and send me the whole boot log. The dmesg might not be complete, since the debug output is going to be a lot more verbose and will overflow the ring buffer, so try to copy it from /var/log/syslog or similar. Thanks.
Created attachment 17856 [details]
enable full debugging info
Here it is. Notes:
* This is 2.6.27-rc6, as the patch didn't apply cleanly to 184.108.40.206.
* I've modified the file
which is used in my system to build the initramfs so that "udevadm settle" takes
only 30 seconds instead of the default 180.
Created attachment 17858 [details]
dmesg with full debug patch
Using 2.6.27-rc6, "cat /dev/hdc" produces a kernel panic and kdb is started.
[ Would love to cut and paste but there is no bash anymore ].
You can catch the output with a serial console or a netconsole.
Ok. This is what netconsole was able to catch. It's an oops.
Created attachment 17879 [details]
oops after cat /dev/hdc
Can you now do
objdump -d drivers/ide/ide-cd.o > drivers/ide/ide-cd.dsm
and send me the .s and .dsm files?
Created attachment 17885 [details]
Created attachment 17886 [details]
This is a NULL ptr access in the debugging printk, here's a fix.
Created attachment 17887 [details]
unrelated NULL ptr fix
Does the above patch fix the oops you get? I'm still working on the main problem but it looks pretty hairy...
[ Sorry for the delay, the computer is the one I use at work ].
The patch seems to fix the previous oops, but now there is a new one.
Follows netconsole output for this one.
Created attachment 17937 [details]
new oops after cat /dev/hdc
I opened a similar (possibly the same) bug at http://bugzilla.kernel.org/show_bug.cgi?id=10216
drivers/ide is now obsolete