Bug 42800
| Summary: | (mdadm) md raid 1 volume with ram disk and loop/physical device fails during fsyncs | | |
|---|---|---|---|
| Product: | IO/Storage | Reporter: | Petros Koutoupis (petros) |
| Component: | MD | Assignee: | io_md |
| Status: | NEW --- | | |
| Severity: | normal | CC: | neilb |
| Priority: | P1 | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Kernel Version: | 3.0 | Subsystem: | |
| Regression: | No | Bisected commit-id: | |
Description
Petros Koutoupis
2012-02-20 19:26:04 UTC
Oops. A typo was just brought to my attention. The second command reads:

    # dd in=/dev/ram0 od=/mnt/ram0.dat

and should read:

    # dd if=/dev/ram0 of=/mnt/ram0.dat

The core of the problem can be seen when a RAID 1 configured MD array sends a 0-byte I/O to the ram disk. It is found in drivers/block/brd.c, in the function brd_make_request():

---------------------
int err = -EIO;
[...]
bio_for_each_segment(bvec, bio, i) {
        unsigned int len = bvec->bv_len;
        err = brd_do_bvec(brd, bvec->bv_page, len,
                          bvec->bv_offset, rw, sector);
        if (err)
                break;
        sector += len >> SECTOR_SHIFT;
}
[...]
bio_endio(bio, err);
---------------------

err is initialized to -EIO, and when an I/O request of 0 bytes is sent, bio_for_each_segment() iterates zero times, so the bio falls through to bio_endio() with err still set to -EIO. Here is a GDB dump taken right after the array failed on the last bio:

---------------------
((bio)->bi_idx < (bio)->bi_vcnt) == (0xcd5d < 0x5000) == Empty loop!!!!
(gdb) p /x *bio
$3 = {
  bi_sector = 0x2,
  bi_next = 0x0,
  bi_bdev = 0x0,
  bi_flags = 0x0,
  bi_rw = 0x10,
  bi_vcnt = 0x5000,
  bi_idx = 0xcd5d,
  bi_phys_segments = 0xccf8d240,
  bi_size = 0x0,
  bi_seg_front_size = 0x0,
  bi_seg_back_size = 0x0,
  bi_max_vecs = 0x0,
  bi_comp_cpu = 0xccf8d7c0,
  bi_cnt = {
    counter = 0x0
  },
  bi_io_vec = 0x0,
  bi_end_io = 0x0,
  bi_private = 0xccdc3c40,
  bi_fs_private = 0x0,
  bi_destructor = 0x0,
  bi_inline_vecs = 0xcb965d4c
}
---------------------

The question is: is it by design that MD sends 0-byte commands to the underlying block device? If so, then this may need to be addressed in brd.c and not in MD.

Reply from neilb:

I just tried your recipe on the current mainline kernel and it works smoothly - no failure.

The bio that you have displayed above looks to be corrupted:

- bi_end_io is NULL, so the bio_endio() call would crash.
- bi_idx == 0xcd5d. It should nearly always be 0, occasionally 1 or 2.
- bi_vcnt == 0x5000 - it should never exceed 256 (BIO_MAX_PAGES).
- bi_bdev == NULL - this is not possible. It *must* point to the brd device, or else brd_make_request could not have been called.

So I don't really trust it.

The only time that md should send a zero-length request down is when REQ_FLUSH is set. In that case generic_make_request should notice that q->flush_flags is zero and so will complete the request early, without passing it down to brd.

So something is clearly wrong, but I cannot see what. I would suggest modifying the code in brd_make_request to print out bi_flags whenever bi_size is zero. Maybe add a WARN() too so we can see the stack trace.