|Summary:|CDRW not detected on boot. 2.6.18 worked ok.|
|Component:|IDE|Assignee:|Borislav Petkov (bp)|
|Severity:|normal|CC:|akpm, alan, axboe, bp, lars.winterfeld|
output of lspci -v
output of dmesg
spit failing command patch
dmesg after patching ide-io.c
More dmesg output
enable full debugging info
dmesg with full debug patch
oops after cat /dev/hdc
unrelated NULL ptr fix
new oops after cat /dev/hdc
Description sanvila 2008-09-17 04:46:44 UTC
Latest working kernel version: 2.6.18
Earliest failing kernel version: 2.6.19
Distribution: 184.108.40.206 from kernel.org
Hardware Environment: AMD Duron 800. VIA KT133A motherboard
Software Environment: Debian testing

Problem Description: The CD-RW unit at /dev/hdc is not properly detected, which makes udev wait for 180 seconds at boot.

Steps to reproduce: Boot using 220.127.116.11. The kernel boots, shows a few messages, and then seems to stop. If I wait for 180 seconds, a message like this is shown and the boot process finishes:

    After the udevadm settle timeout, the events queue contains:
    1180: /block/hdc

At this point, the eject button of the CD-RW unit at /dev/hdc does not work. Surprisingly, trying to access the unit via "less -f /dev/hdc" seems to fix it.

The problem was introduced somewhere between 2.6.18 and 2.6.19. This is the result of git-bisect:

    4aff5e2333c9a1609662f2091f55c3f6fffdad36 is first bad commit
    commit 4aff5e2333c9a1609662f2091f55c3f6fffdad36
    Author: Jens Axboe <firstname.lastname@example.org>
    Date: Thu Aug 10 08:44:47 2006 +0200

    [PATCH] Split struct request ->flags into two parts

    Right now ->flags is a bit of a mess: some are request types, and others are just modifiers. Clean this up by splitting it into ->cmd_type and ->cmd_flags. This allows introduction of generic Linux block message types, useful for sending generic Linux commands to block devices.

    Signed-off-by: Jens Axboe <email@example.com>

    :040000 040000 ff931af25471578be78885d8e27e9e0df829b49d f5edcbd2a9424828cfb4f1579d672c95bba7a4a0 M block
    :040000 040000 5e4d7235fa7d0a48cb3b23399905dc3d472d738e 05a44d13e66ce6e6bdc6e9d697675c32799c70e3 M drivers
    :040000 040000 887303b2f4077cc43bd23e42d0b104cab05655b1 50c82dbe8394b6b8e5bd169c182e0b4cc3d71963 M include

Will include dmesg and lspci output as soon as I find the attach button.
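To make the bisected commit's description more concrete, here is a minimal userspace sketch in C of what splitting a mixed flags word into a request kind plus modifier flags can look like. Every name in it (OLD_*, TYPE_*, FLAG_*, split_flags, and the two structs) is a hypothetical stand-in invented for illustration; none of it is taken from the kernel's actual definitions.

```c
#include <stdbool.h>

/* All names below are hypothetical stand-ins, not the kernel's definitions. */

/* Old layout: request kinds and modifiers share one bitmask. */
enum {
    OLD_RW       = 1u << 0,  /* modifier: data direction is write */
    OLD_FAILFAST = 1u << 1,  /* modifier: do not retry on error   */
    OLD_FS       = 1u << 2,  /* kind: ordinary filesystem I/O     */
    OLD_BLOCK_PC = 1u << 3,  /* kind: packet command              */
    OLD_SENSE    = 1u << 4   /* kind: sense request               */
};

struct old_request {
    unsigned int flags;
};

/* New layout: exactly one kind, plus a separate modifier bitmask. */
enum cmd_type { TYPE_NONE = 0, TYPE_FS, TYPE_BLOCK_PC, TYPE_SENSE };

enum {
    FLAG_RW       = 1u << 0,
    FLAG_FAILFAST = 1u << 1
};

struct new_request {
    enum cmd_type cmd_type;
    unsigned int  cmd_flags;
};

/* Convert old to new. If several kind bits are set, only one survives. */
static struct new_request split_flags(const struct old_request *old)
{
    struct new_request rq = { TYPE_NONE, 0 };

    if (old->flags & OLD_SENSE)
        rq.cmd_type = TYPE_SENSE;
    else if (old->flags & OLD_BLOCK_PC)
        rq.cmd_type = TYPE_BLOCK_PC;
    else if (old->flags & OLD_FS)
        rq.cmd_type = TYPE_FS;

    if (old->flags & OLD_RW)
        rq.cmd_flags |= FLAG_RW;
    if (old->flags & OLD_FAILFAST)
        rq.cmd_flags |= FLAG_FAILFAST;

    return rq;
}

/* Old-style check: a bit test. */
static bool old_is_packet(const struct old_request *rq)
{
    return (rq->flags & OLD_BLOCK_PC) != 0;
}

/* New-style check: an equality test on the single kind field. */
static bool new_is_packet(const struct new_request *rq)
{
    return rq->cmd_type == TYPE_BLOCK_PC;
}
```

In this sketch a bit test becomes an equality test, so a request that carried two kind bits under the old layout can answer the same question differently after conversion. That is only one generic way such a refactoring can change behaviour; nothing in this report establishes that it is what affects /dev/hdc.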
Comment 3 Andrew Morton 2008-09-17 09:37:38 UTC
Marked as a regression, reassigned to IDE, cc'ed Jens. Jens, please note that this was bisected down to a block layer change.
Comment 4 Jens Axboe 2008-09-17 11:28:00 UTC
Hmm interesting. So the drive is actually detected, but later issued commands by udev are timing out. Could you double check if 2.6.27-rc6 is broken or not? CC'ing Bart and Borislav.
Comment 5 sanvila 2008-09-18 00:51:53 UTC
2.6.27-rc6 is also broken. Note: This time I've had to use the .config from the Debian package for 2.6.26, as the one provided by "make defconfig" didn't detect the hard disk (!). The behaviour is the same: a wait while udev tries to detect hdc, the "udevadm settle" timeout, the eject button does not work, and a simple "less -f /dev/hdc" makes the eject actually happen.
Comment 6 Borislav Petkov 2008-09-18 01:58:15 UTC
Hi, can you please try the attached patch and send me the dmesg output? Thanks.
Comment 7 Borislav Petkov 2008-09-18 02:00:24 UTC
Created attachment 17848: spit failing command patch
Comment 8 Sergei Shtylyov 2008-09-18 02:17:44 UTC
(In reply to comment #7)
> Created an attachment (id=17848) [details]
> spit failing command patch

Could you tick the "patch" checkbox on this attachment?
Comment 9 sanvila 2008-09-18 03:25:10 UTC
Created attachment 17854: dmesg after patching ide-io.c
Comment 10 sanvila 2008-09-18 03:59:22 UTC
More info; I don't know whether it's relevant. If I leave the system alone and don't try to wake up the cdrom by doing "less -f /dev/hdc", then the following messages are appended to dmesg (see the next attachment).
Comment 12 Borislav Petkov 2008-09-18 04:38:25 UTC
Ok, those are follow-up traces from the soft lockup detector code showing that we're stuck trying to revalidate the disk after reading the toc. There are also some ioctls which come from somewhere else so we'll have to enable full debugging output in order to see exactly what happens. Here's a debugging patch, it is pretty big, please recompile with it and send me the whole boot log - the dmesg might not be complete since the debug output is going to be a lot more verbose and overflow the ring buffer so try to copy it from /var/log/syslog or similar, thanks.
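The remark about verbose debug output overflowing the ring buffer can be pictured with a toy fixed-size log ring in C. This is only a sketch of the wraparound idea: LOG_BUF_LEN, struct log_ring, log_write and log_read are hypothetical names, the buffer is deliberately tiny, and the kernel's real log buffer is considerably more elaborate.

```c
#include <stddef.h>

#define LOG_BUF_LEN 8  /* hypothetical, deliberately tiny capacity */

struct log_ring {
    char   buf[LOG_BUF_LEN];
    size_t head;  /* total number of bytes ever written */
};

/* Append a message; once the ring is full, the oldest bytes are overwritten. */
static void log_write(struct log_ring *r, const char *s)
{
    for (; *s; s++)
        r->buf[r->head++ % LOG_BUF_LEN] = *s;
}

/*
 * Copy the surviving bytes, oldest first, into out and NUL-terminate it.
 * out must have room for LOG_BUF_LEN + 1 bytes. Returns the byte count.
 */
static size_t log_read(const struct log_ring *r, char *out)
{
    size_t n = r->head < LOG_BUF_LEN ? r->head : LOG_BUF_LEN;
    size_t start = r->head - n;

    for (size_t i = 0; i < n; i++)
        out[i] = r->buf[(start + i) % LOG_BUF_LEN];
    out[n] = '\0';
    return n;
}
```

Writing more than LOG_BUF_LEN bytes in total leaves only the most recent LOG_BUF_LEN bytes readable. A log file written continuously during boot does not lose its early messages in this way, which fits the request to copy the log from /var/log/syslog.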
Comment 13 Borislav Petkov 2008-09-18 04:39:06 UTC
Created attachment 17856: enable full debugging info
Comment 14 sanvila 2008-09-18 06:38:27 UTC
Here it is. Notes:
* This is 2.6.27-rc6, as the patch didn't apply cleanly to 18.104.22.168.
* I've modified the file /usr/share/initramfs-tools/scripts/init-premount/udev, which is used on my system to build the initramfs, so that "udevadm settle" takes only 30 seconds instead of the default 180.
Comment 15 sanvila 2008-09-18 06:40:09 UTC
Created attachment 17858: dmesg with full debug patch
Comment 16 sanvila 2008-09-18 10:51:40 UTC
Using 2.6.27-rc6, "cat /dev/hdc" produces a kernel panic and kdb is started. [ Would love to cut and paste but there is no bash anymore ].
Comment 17 Borislav Petkov 2008-09-18 16:53:30 UTC
You can catch the output with a serial console or a netconsole.
Comment 18 sanvila 2008-09-19 05:18:17 UTC
Ok. This is what netconsole was able to catch. It's an oops.
Comment 19 sanvila 2008-09-19 05:19:41 UTC
Created attachment 17879: oops after cat /dev/hdc
Comment 20 Borislav Petkov 2008-09-19 08:23:16 UTC
Can you now do

    objdump -d drivers/ide/ide-cd.o > drivers/ide/ide-cd.dsm

and

    make drivers/ide/ide-cd.s

and send me the .s and .dsm files? Thanks.
Comment 23 Borislav Petkov 2008-09-19 10:33:47 UTC
This is a NULL ptr access in the debugging printk, here's a fix.
Comment 24 Borislav Petkov 2008-09-19 10:34:20 UTC
Created attachment 17887: unrelated NULL ptr fix
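The class of bug named in comment 23, a debugging print that dereferences a pointer which may be NULL, can be sketched in userspace C as follows. The attached patch itself is not reproduced in this report, so this is not the actual fix: struct request here is a hypothetical one-field stand-in and format_debug is an invented helper.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a request carrying a command buffer. */
struct request {
    unsigned char cmd[16];
};

/*
 * Format a debug line describing rq into buf.
 * rq may be NULL, so it is checked before any field is read.
 * Returns what snprintf returns.
 */
static int format_debug(char *buf, size_t len, const struct request *rq)
{
    if (rq == NULL)
        return snprintf(buf, len, "rq: (null)");

    return snprintf(buf, len, "rq: cmd[0]=0x%02x", (unsigned int)rq->cmd[0]);
}
```

Without the NULL check, the second snprintf call would read rq->cmd[0] through a NULL pointer. In kernel code a fault of that kind typically shows up as an oops, which is consistent with the oops being labelled unrelated to the detection problem.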
Comment 25 Borislav Petkov 2008-09-21 22:22:18 UTC
Hi, does the above patch fix the oops you get? I'm still working on the main problem but it looks pretty hairy... Thanks.
Comment 26 sanvila 2008-09-22 03:10:13 UTC
[ Sorry for the delay, the computer is the one I use at work ]. The patch seems to fix the previous oops, but now there is a new one. The netconsole output for this one follows.
Comment 27 sanvila 2008-09-22 03:11:13 UTC
Created attachment 17937: new oops after cat /dev/hdc
Comment 28 lars.winterfeld 2008-11-23 15:32:36 UTC
I opened a similar (possibly the same) bug at http://bugzilla.kernel.org/show_bug.cgi?id=10216
Comment 29 Alan 2012-05-22 14:08:07 UTC
drivers/ide is now obsolete