Most recent kernel where this bug did *NOT* occur: NA
Distribution: SuSE 10
Hardware Environment: P4 2.4GHz, 1GiB RAM, hda HDD, hdb DVD-ROM, hdc CD-R/RW, hdd plextor CD-R/RW 40/12/40A with latest firmware
Software Environment:
Problem Description:
Every 5 minutes approximately

hdd: drive_cmd: status=0x51 { DriveReady SeekComplete Error }
hdd: drive_cmd: error=0x04 { AbortedCommand }
ide: failed opcode was: 0xec

appears in the system log. The plextor CD-R/RW drive works fine otherwise; Windows works fine with it, plextools as well. It's just a minor nuisance, but I would like to know what causes it and it fills my system log.

Steps to reproduce:
It seems that some brain-damaged user-space program sends the ATA IDENTIFY command every 5 minutes (no wonder the command fails on an ATAPI device). Could you try incrementally disabling various system services (start with haldaemon) and see if the issue goes away? Jens, any ideas?
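For readers unfamiliar with these log lines, here is a minimal sketch of how the reported values decode. The bit positions are the standard ATA status and error register bits, and the labels follow the names the kernel prints in the log above. The helper names (`decode_status`, `is_aborted_command`, `opcode_name`) are made up for illustration and are not taken from the kernel or from any tool mentioned in this report.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* ATA status register bits, labelled the way the kernel log prints them. */
static const struct { unsigned char bit; const char *name; } status_bits[] = {
    { 0x80, "Busy" },
    { 0x40, "DriveReady" },
    { 0x20, "DeviceFault" },
    { 0x10, "SeekComplete" },
    { 0x08, "DataRequest" },
    { 0x04, "CorrectedError" },
    { 0x02, "Index" },
    { 0x01, "Error" },
};

#define ATA_STAT_ERR 0x01   /* status register: an error occurred */
#define ATA_ERR_ABRT 0x04   /* error register: command aborted */

/* Hypothetical helper: write the names of all set status bits into out. */
static void decode_status(unsigned char status, char *out, size_t len)
{
    assert(out != NULL && len > 0);
    out[0] = '\0';
    for (size_t i = 0; i < sizeof status_bits / sizeof status_bits[0]; i++) {
        if (status & status_bits[i].bit) {
            size_t used = strlen(out);
            /* snprintf truncates safely if the buffer is too small */
            snprintf(out + used, len - used, "%s%s",
                     used ? " " : "", status_bits[i].name);
        }
    }
}

/* True when the drive flagged an error and the error is "command aborted". */
static int is_aborted_command(unsigned char status, unsigned char error)
{
    return (status & ATA_STAT_ERR) && (error & ATA_ERR_ABRT);
}

/* Only the two opcodes discussed in this report are named here. */
static const char *opcode_name(unsigned char opcode)
{
    switch (opcode) {
    case 0xec: return "IDENTIFY DEVICE";
    case 0xa1: return "IDENTIFY PACKET DEVICE";
    default:   return "unknown";
    }
}
```

Calling `decode_status(0x51, buf, sizeof buf)` yields the three flags shown in the log, `is_aborted_command(0x51, 0x04)` is true, and `opcode_name(0xec)` names the ATA IDENTIFY command that the drive rejected.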
Created attachment 9929 [details]
draft patch for printing name and pid of process if HDIO_DRIVE_CMD ioctl fails

Update: please try this (untested) patch instead - it should give us the name of the "guilty" process.
I found the culprit, it is munin network monitor. I cannot find a reason though for it to interfere with CD-R drives and the weird thing is that the other CD-R drive (TEAC) doesn't cause any such error messages. Anyway, thanks for your prompt response even for such a minor issue.
munin uses smartctl and hddtemp without checking whether the device is ATA or ATAPI. However, I cannot reproduce the error using smartctl/hddtemp on my ATAPI device. It could be a bug that was present in some older versions of smartctl/hddtemp.

On FC6 with all updates:

[bzolnier@trik ~]$ rpm -q munin smartmontools hddtemp
munin-1.2.5-1.fc6
smartmontools-5.36-3
hddtemp-0.3-0.9.beta15.fc6

Does running "smartctl -A /dev/hdb" or "hddtemp /dev/hdb" give you the same error in the system log?
smartctl -A /dev/hdd causes the error to display, smartctl -A /dev/hdb doesn't. Also smartctl is version 5.36.
OK, thanks for testing. From smartmontools source (smartmontools-5.36/os_linux.c):

#if 1
  // Note to people doing ports to other OSes -- don't worry about
  // this block -- you can safely ignore it. I have put it here
  // because under linux when you do IDENTIFY DEVICE to a packet
  // device, it generates an ugly kernel syslog error message. This
  // is harmless but frightens users. So this block detects packet
  // devices and make IDENTIFY DEVICE fail "nicely" without a syslog
  // error message.
  //
  // If you read only the ATA specs, it appears as if a packet device
  // *might* respond to the IDENTIFY DEVICE command. This is
  // misleading - it's because around the time that SFF-8020 was
  // incorporated into the ATA-3/4 standard, the ATA authors were
  // sloppy. See SFF-8020 and you will see that ATAPI devices have
  // *always* had IDENTIFY PACKET DEVICE as a mandatory part of their
  // command set, and return 'Command Aborted' to IDENTIFY DEVICE.
  if (command==IDENTIFY || command==PIDENTIFY){
    unsigned short deviceid[256];
    // check the device identity, as seen when the system was booted
    // or the device was FIRST registered. This will not be current
    // if the user has subsequently changed some of the parameters. If
    // device is a packet device, swap the command interpretations.
    if (!ioctl(device, HDIO_GET_IDENTITY, deviceid) && (deviceid[0] & 0x8000))
      buff[0]=(command==IDENTIFY)?ATA_IDENTIFY_PACKET_DEVICE:ATA_IDENTIFY_DEVICE;
  }
#endif

(deviceid[0] & 0x8000) [ device is ATA ] may not be enough to distinguish ATA and ATAPI devices. Could you send the content of the /proc/ide/hdd/identify file to verify this?

I wonder why user-space apps are not using the readily available info from sysfs (/sys/bus/ide/devices/*/media files), as the kernel already does the hard job of finding out whether a device is ATA or ATAPI...
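As a rough illustration of the sysfs idea, a user-space tool could read the device's media attribute before deciding whether to send IDENTIFY DEVICE. This is only a sketch under assumptions: the function names are hypothetical, the caller has to supply the actual path of the media file, and the code assumes the file holds "disk" for ATA hard disks and some other string (for example "cdrom") for packet devices.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Read the first line of a small text file (e.g. a sysfs attribute) into
 * buf, dropping the trailing newline. Returns 0 on success, -1 on failure. */
static int read_first_line(const char *path, char *buf, size_t len)
{
    FILE *f;

    assert(path != NULL && buf != NULL && len > 0);
    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    if (fgets(buf, (int)len, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}

/* Assumption: "disk" means an ATA hard disk; anything else is treated as
 * a device that should not receive IDENTIFY DEVICE. */
static int media_is_ata_disk(const char *media)
{
    return strcmp(media, "disk") == 0;
}

/* Hypothetical policy: only send IDENTIFY DEVICE (0xec) to ATA disks. */
static int should_send_ata_identify(const char *media_path)
{
    char media[32];

    if (read_first_line(media_path, media, sizeof media) != 0)
        return 0;   /* unknown device type: stay on the safe side */
    return media_is_ata_disk(media);
}
```

With a check like this, a monitoring script would skip the ATA-only query for the CD-R/RW drives instead of triggering the aborted command.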
Created attachment 9932 [details] contents of /proc/ide/hdd/identify
OK, my theory proven wrong. This is getting interesting... Please send output of "strace smartctl -A /dev/hdd". Thanks!
Created attachment 9933 [details] strace output
One thing that may help is that the CD-R that doesn't give the error (TEAC) is on the same IDE channel as the HDD (hda is the HDD, hdb the TEAC CD-R/RW, hdc the DVD-ROM, hdd the plextor CD-R/RW; the initial report has an error concerning the drive assignments to the device files).
Created attachment 9935 [details]
small program to get HDIO_GET_IDENTITY data for "/dev/hdd"

Thanks for the info, but what do you mean by "the initial report has an error concerning the drive assignments to the device files"?

From strace done locally (probably the same for your system and /dev/hdb):

ioctl(3, 0x30d, 0xbfcd132c) = 0
ioctl(3, 0x31f, 0xbfcd152c) = 0

From strace for /dev/hdd:

ioctl(3, 0x30d, 0xaff6d670) = 0
ioctl(3, 0x31f, 0xaff6daa0) = -1 EIO (Input/output error)

So the call to the HDIO_DRIVE_CMD ioctl (0x31f) fails while it shouldn't.

We know from /proc/ide/hdd/identify that (deviceid[0] & 0x8000) should be true (== not an ATA device - I made a mistake in an earlier comment) and that the ATA IDENTIFY PACKET DEVICE command (0xa1) should be used and not ATA IDENTIFY DEVICE (0xec). However, ATA IDENTIFY DEVICE is used and therefore it fails:

hdd: drive_cmd: status=0x51 { DriveReady SeekComplete Error }
hdd: drive_cmd: error=0x04 { AbortedCommand }
ide: failed opcode was: 0xec

Maybe the HDIO_GET_IDENTITY ioctl (0x30d) returns wrong data (this would be a kernel bug and should be fixed). It could also be that smartmontools does some magic and fails.

Please compile the attached small tool with "gcc hddid.c -ohddid", run it and send the result, which is the output of the ioctl for "/dev/hdd".
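The decision described above can be sketched roughly as follows. This is not the attached hddid.c and not smartmontools code; the function names are hypothetical. The request number 0x030d is the HDIO_GET_IDENTITY value seen in the strace output and is defined locally here (normally it comes from <linux/hdreg.h>). The test on bit 15 of word 0 mirrors the check quoted from os_linux.c.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Request number as seen in the strace output (0x30d). */
#ifndef HDIO_GET_IDENTITY
#define HDIO_GET_IDENTITY 0x030d
#endif

#define ATA_IDENTIFY_DEVICE        0xec
#define ATA_IDENTIFY_PACKET_DEVICE 0xa1

/* Pick the identify opcode from word 0 of the cached identify data:
 * bit 15 set means a packet (ATAPI) device, as in the quoted check. */
static unsigned char choose_identify_opcode(unsigned short word0)
{
    return (word0 & 0x8000) ? ATA_IDENTIFY_PACKET_DEVICE
                            : ATA_IDENTIFY_DEVICE;
}

/* Hypothetical helper: fetch word 0 of the identify data the kernel
 * cached for the device. Returns 0 on success, -1 on any failure. */
static int get_identify_word0(const char *dev, unsigned short *word0)
{
    unsigned short deviceid[256];   /* 512 bytes of identify data */
    int fd, rc;

    assert(dev != NULL && word0 != NULL);
    fd = open(dev, O_RDONLY | O_NONBLOCK);
    if (fd < 0)
        return -1;
    rc = ioctl(fd, HDIO_GET_IDENTITY, deviceid);
    close(fd);
    if (rc != 0)
        return -1;
    *word0 = deviceid[0];
    return 0;
}
```

If `get_identify_word0("/dev/hdd", &w)` succeeds and `choose_identify_opcode(w)` returns 0xa1, yet the kernel log still reports opcode 0xec, then the wrong opcode is being selected somewhere after this check.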
Created attachment 9936 [details]
hddid output

This is what you requested; by "assignment" I meant association, i.e. the device each device file points to. I'm not a native speaker, sorry.
Thanks, "assignment" is fine too, I didn't get the context. :) Anyway it seems that HDIO_GET_IDENTITY works just fine. Could you get smartmontools-5.37 source, compile and test it? [ http://sourceforge.net/project/showfiles.php?group_id=64297 ] Please also send dmesg command output.
Running the updated smartctl -A /dev/hdd gives the same error. Should I refer to the smartmontools developers?
updated to 5.37?
Yes: smartctl version 5.37. I just ran it without installation (the smartctl binary is autonomous i.e. doesn't depend on any special smartmontools library that needs to be updated as well, right?).
Right. I'm out of ideas for the moment so please contact smartmontools developers. Also please send dmesg command output.
Created attachment 9938 [details]
dmesg output

dmesg output per your request. Contacting smartmontools developers.
Thanks. Is this a SuSE kernel or a vanilla one? Could you also try some newer kernel (2.6.19 or 2.6.20-rc1) to eliminate the possibility of some strange kernel bug which somehow got fixed already (I should have asked this before)...
All the testing was done on 2.6.16 vanilla. I also tried 2.6.19 and the error message remained (by the way, 2.6.19 introduced two bugs for me, one with ohci1394 displaying "irq 21 no one cared" and one with the system timer in that compiling the kernel displayed something about skewed clock). As Christmas approaches, I think it's not worth dealing with this; maybe we should postpone further investigation until next year?
Yep, this IDE issue is not dangerous anyway (just very strange) and it can wait. so Merry Christmas and Happy New Year :)
Merry Christmas and a Happy New Year to you too.
Any update on this problem? Thanks.
Just tested 2.6.22, smartctl -a /dev/hdb yields:

hdb: drive_cmd: status=0x51 { DriveReady SeekComplete Error }
hdb: drive_cmd: error=0x04 { AbortedCommand }
ide: failed opcode was: 0xec

in the system log, so no progress here.
There were multiple updates for the subsystem, but I'm not sure if this issue has been fixed by any of them. Have you tested with latest kernel? Bartolomiej, do you think this problem has been identified/fixed?
Tested 2.6.25, smartctl -a /dev/hdb output in the syslog:

kernel: hdb: task_in_intr: status=0x51 { DriveReady SeekComplete Error }
kernel: hdb: task_in_intr: error=0x04 { AbortedCommand }
kernel: ide: failed opcode was: 0xec
Is this still an issue with a recent kernel?
I traced it to smartctl misbehavior (it tries IDENTIFY on an ATAPI device when it shouldn't). This can be closed as 'not a kernel bug'.