Bug 14157
Summary: | end_request: I/O error, dev cciss/cXdX, sector 0 | ||
---|---|---|---|
Product: | IO/Storage | Reporter: | jiri.harcarik |
Component: | Block Layer | Assignee: | Jens Axboe (axboe) |
Status: | CLOSED CODE_FIX | ||
Severity: | normal | CC: | akpm, birrachiara, florian, markus, rjw, spike, tj, tmhikaru, wschlich |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 2.6.31 | Subsystem: | |
Regression: | Yes | Bisected commit-id: | |
Bug Depends on: | |||
Bug Blocks: | 13615 | ||
Attachments: | dmesg output |
Description
jiri.harcarik
2009-09-11 07:42:17 UTC
I don't think this is a SCSI bug. I get the same flood of errors with an IDE disk: a SiS5513 EIDE controller with this drive:

Model Family: Seagate Barracuda ATA IV family
Device Model: ST320011A

Sep 12 17:08:02 ololo XFS mounting filesystem dm-0
Sep 12 17:08:02 ololo Ending clean XFS mount for filesystem: dm-0
Sep 12 17:08:02 ololo end_request: I/O error, dev hda, sector 0
Sep 12 17:08:02 ololo end_request: I/O error, dev hda, sector 0
...

and more; 50 messages per second are logged. This is a regression: 2.6.30.6 is the last unaffected version, and 2.6.31 is the first affected one.

Marked as a regression.

Randomly reassigned to block layer.

Weird.

Reply-To: James.Bottomley@suse.de

> --- Comment #2 from Andrew Morton <akpm@linux-foundation.org> 2009-09-13 22:40:04 ---
> Marked as a regression.
>
> Randomly reassigned to block layer.
>
> Weird.

It's theoretically possible, I suppose, but no-one else is seeing this. It sounds like some local config setup issue, which a full dmesg might shed some light on ... say some protected area on sector 0 or something.

James

Created attachment 23091
dmesg output

Using XFS?

Yes, XFS. The problem occurs only with the IDE disk. XFS on SATA (without a Host Protected Area) works. Also, "end_request: I/O error, dev hda, sector 0" is received only after the file system is changed.

2.6.30.6------------------------------
ide-gd driver 1.18
hda: max request size: 128KiB
hda: Host Protected Area detected.
current capacity is 39100223 sectors (20019 MB)
native capacity is 39102336 sectors (20020 MB)
hda: Host Protected Area disabled.
hda: 39102336 sectors (20020 MB) w/2048KiB Cache, CHS=38792/16/63
hda: cache flushes not supported
hda: hda1

2.6.31------------------------------
ide-gd driver 1.18
hda: max request size: 128KiB
hda: Host Protected Area detected.
current capacity is 39100223 sectors (20019 MB)
native capacity is 39102336 sectors (20020 MB)
hda: 39100223 sectors (20019 MB) w/2048KiB Cache, CHS=38789/16/63
hda: cache flushes not supported
hda: hda1
hda: p1 size 39102147 exceeds device capacity, enabling native capacity
hda: detected capacity change from 20019314176 to 20020396032

Why didn't 2.6.31 disable the Host Protected Area? A capacity change?!

I thought so, then this is just the empty barrier being failed and warned. It can be safely ignored, I'll make sure it goes away.

This is still present in 2.6.31.5. Adding "nobarrier" to the xfs mount options works around this issue. What's the current status of fixing this?

I'm having the same problem with a USB IDE hard disk on 2.6.31.5. Previous versions of 2.6.31.x used to make the filesystem remount read-only; for some reason this does not happen with this version of the kernel. The workaround for xfs does not apply to me, as I'm using ext4, and disabling barriers (using barrier=0) in the mount options for ext4 did *not* work around the problem for me.
This is the sort of output I'm getting (from the test with barriers disabled):

Nov 9 11:59:31 roll kernel: usb 1-4: new high speed USB device using ehci_hcd and address 6
Nov 9 11:59:31 roll kernel: usb 1-4: New USB device found, idVendor=067b, idProduct=2506
Nov 9 11:59:31 roll kernel: usb 1-4: New USB device strings: Mfr=1, Product=2, SerialNumber=3
Nov 9 11:59:31 roll kernel: usb 1-4: Product: Mass Storage Device
Nov 9 11:59:31 roll kernel: usb 1-4: Manufacturer: Prolific Technology Inc.
Nov 9 11:59:31 roll kernel: usb 1-4: SerialNumber: 0
Nov 9 11:59:31 roll kernel: usb 1-4: configuration #1 chosen from 1 choice
Nov 9 11:59:31 roll kernel: scsi3 : SCSI emulation for USB Mass Storage devices
Nov 9 11:59:36 roll kernel: scsi 3:0:0:0: Direct-Access ST325082 3A 3.06 PQ: 0 ANSI: 0
Nov 9 11:59:36 roll kernel: sd 3:0:0:0: Attached scsi generic sg0 type 0
Nov 9 11:59:36 roll kernel: sd 3:0:0:0: [sda] 488397168 512-byte logical blocks: (250 GB/232 GiB)
Nov 9 11:59:36 roll kernel: sd 3:0:0:0: [sda] Write Protect is off
Nov 9 11:59:36 roll kernel: sda: sda1
Nov 9 11:59:37 roll kernel: sd 3:0:0:0: [sda] Attached SCSI disk

==> /var/log/syslog <==
Nov 9 11:59:36 roll kernel: sd 3:0:0:0: [sda] Assuming drive cache: write through

==> /var/log/messages <==
Nov 9 12:03:48 roll kernel: EXT4-fs (sda1): barriers disabled
Nov 9 12:03:48 roll kernel: kjournald2 starting: pid 7298, dev sda1:8, commit interval 5 seconds
Nov 9 12:03:48 roll kernel: EXT4-fs (sda1): internal journal on sda1:8
Nov 9 12:03:48 roll kernel: EXT4-fs (sda1): delayed allocation enabled
Nov 9 12:03:48 roll kernel: EXT4-fs: file extents enabled
Nov 9 12:03:48 roll kernel: EXT4-fs: mballoc enabled
Nov 9 12:03:48 roll kernel: EXT4-fs (sda1): mounted filesystem with ordered data mode
Nov 9 12:06:35 roll kernel: sd 3:0:0:0: [sda] Unhandled error code
Nov 9 12:06:35 roll kernel: sd 3:0:0:0: [sda] Result: hostbyte=0x07 driverbyte=0x00

==> /var/log/syslog <==
Nov 9 11:59:37 roll last message repeated 2 times
Nov 9 12:06:35 roll kernel: end_request: I/O error, dev sda, sector 4202703

Wolfram, tmhikaru, can you guys please attach the output of dmesg after such a failure? Thanks.

Would you please ignore or otherwise delete my comment? I apologize, but it turned out to be failing hardware (the USB enclosure was damaged somehow and was giving random I/O errors), so it's entirely unrelated to this bug. Sorry!

I can confirm this problem on 2.6.31.5 (openSUSE 11.2 distro kernel). I use ext3 only and get lots of

end_request: I/O error, dev cciss/c0d1, sector 0

(every 15-20 seconds).

Markus, can you reproduce the problem on 2.6.32?

This is a production system, but I might be able to try the openSUSE factory kernel (2.6.32) during the holidays.

Yeap, that will be great. In case you don't know, 2.6.32 kernel packages are available at the following URL:

http://ftp.suse.com/pub/projects/kernel/kotd/HEAD/x86_64/

I can confirm that the latest 2.6.32 kernel is free of that error on my HP server.

I can confirm the bug with openSUSE kernel 2.6.31.8-0.1.1 (x86_64). openSUSE kernel 2.6.32.5-0.0.3.f89b2ba-default works correctly.

BTW, it doesn't seem to be just a cosmetic problem. On the test system (HP ML350 g6, 4-core Xeon, Smart Array 641, ext3) the faulty kernel sometimes seems to stall disk operations while outputting those messages.

Umm... upstream is already working fine. At this point, I think it'll probably be best to report these problems to the distros so that they can backport the fix to their kernels. Jens, can you please point out which commit fixed this one? Thanks.

In mainline it should be fixed by commit 6cafb12d. It is just a cosmetic issue; if you see stalls, it must be some other problem.
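
For anyone triaging similar logs, the distinction drawn in this thread can be sketched in a few lines of Python: errors reported at sector 0 match the failed empty barrier described above, while errors at other sectors (such as the sda report that turned out to be a damaged USB enclosure) are unlikely to be this bug. This is only a rough triage aid under that assumption, not a diagnosis. The function names `tally_io_errors` and `split_by_sector` are hypothetical, and the sample lines are copied from the comments above.

```python
import re
from collections import Counter

# Matches the kernel message quoted in this report, for example:
#   end_request: I/O error, dev cciss/c0d1, sector 0
PATTERN = re.compile(r"end_request: I/O error, dev (\S+), sector (\d+)")


def tally_io_errors(lines):
    """Count end_request I/O errors per (device, sector) pair."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts


def split_by_sector(counts):
    """Separate sector-0 errors (the symptom in this bug) from all others."""
    zero = {dev: n for (dev, sector), n in counts.items() if sector == 0}
    nonzero = {key: n for key, n in counts.items() if key[1] != 0}
    return zero, nonzero


# Sample lines taken from the comments in this report.
SAMPLE = [
    "Sep 12 17:08:02 ololo XFS mounting filesystem dm-0",
    "Sep 12 17:08:02 ololo end_request: I/O error, dev hda, sector 0",
    "Sep 12 17:08:02 ololo end_request: I/O error, dev hda, sector 0",
    "Nov 9 12:06:35 roll kernel: end_request: I/O error, dev sda, sector 4202703",
    "end_request: I/O error, dev cciss/c0d1, sector 0",
]

if __name__ == "__main__":
    zero, nonzero = split_by_sector(tally_io_errors(SAMPLE))
    print("sector 0 (possibly the failed empty barrier):", zero)
    print("other sectors (look for a different cause):", nonzero)
```

To run it against a real log, pass an open file object instead of `SAMPLE`, for example `tally_io_errors(open("/var/log/messages"))`; the path is whatever your syslog configuration uses.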