Bug 13252 - "bio too big device md0" and possible data corruption
Summary: "bio too big device md0" and possible data corruption
Status: CLOSED DUPLICATE of bug 9401
Alias: None
Product: IO/Storage
Classification: Unclassified
Component: LVM2/DM
Hardware: All Linux
Importance: P1 high
Assignee: Alasdair G Kergon
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-05-06 00:13 UTC by Tim Connors
Modified: 2011-02-24 15:23 UTC

See Also:
Kernel Version: 2.6.29
Subsystem:
Regression: No
Bisected commit-id:



Description Tim Connors 2009-05-06 00:13:21 UTC
I had a RAID1 device consisting of two 1TB eSATA drives with
/sys/block/sdc/queue/max_sectors_kb=512
and
/sys/block/sdc/queue/max_hw_sectors_kb=32767
I then have LVM on top of the md0 device.

I removed one device and, because of a bug (http://marc.info/?l=linux-ide&m=124153367903104&w=2), had to add it back as a USB device instead.
There, both max_hw_sectors_kb and max_sectors_kb became 120.

Thereafter, I get a whole lot of:
[55125.228693] bio too big device md0 (248 > 240)
although the resync and subsequent usage goes ahead without a glitch.
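
The numbers in that message line up with the queue limits quoted above if the kernel is counting 512-byte sectors: 120 KB is 240 sectors, and a 248-sector bio is 124 KB. A minimal sketch of that arithmetic follows. The helper names and the "usb_disk" label are made up for illustration; this is not the kernel's actual check.

```python
SECTOR_BYTES = 512  # assumption: the message counts 512-byte sectors


def kb_to_sectors(kb: int) -> int:
    """Convert a queue limit in KB (as shown in sysfs) to 512-byte sectors."""
    return kb * 1024 // SECTOR_BYTES


def bio_fits(bio_sectors: int, max_sectors_kb: int) -> bool:
    """True if a bio of bio_sectors sectors is within a device's limit."""
    return bio_sectors <= kb_to_sectors(max_sectors_kb)


def smallest_limit_kb(member_limits_kb: dict) -> int:
    """The limit a stacked device would need to honour for every member."""
    return min(member_limits_kb.values())


# Values from this report; "usb_disk" is a hypothetical label for the
# drive that was re-added over USB.
limits_kb = {"sdc": 512, "usb_disk": 120}

print(kb_to_sectors(120))                           # sectors allowed by the USB disk
print(bio_fits(248, limits_kb["sdc"]))              # bio sized for the eSATA limit
print(bio_fits(248, limits_kb["usb_disk"]))         # same bio against the USB limit
print(bio_fits(248, smallest_limit_kb(limits_kb)))  # against the array-wide minimum
```

Read this way, the bios were sized for a limit larger than the 120 KB that applied once the USB member joined the array.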

This is a long-standing bug that, according to reports such as https://lists.ubuntu.com/archives/kernel-bugs/2009-January/thread.html#46929, eats data. I lost some data when I was first dealing with RAID on these disks a month ago and didn't know at the time what caused it, so I can easily imagine I ran into the same circumstances then too.

The bitmap usage had been hovering at about 60/233 572KB pages for quite some time, even though not much was being written to the disk at the time. I initially took that to mean that some data was not being written out to the USB disk, but after some time it dropped to 0. Since I no longer trust the integrity of the RAID device, I have removed the USB device from the array, and will probably zero its superblock so that it has to do a full resync rather than relying on the bitmap, which may not reflect that parts of the USB drive weren't written properly. Is that a possibility?
Comment 1 Alasdair G Kergon 2009-10-18 18:42:33 UTC

*** This bug has been marked as a duplicate of bug 9401 ***
