Bug 42614
| Summary: | NULL pointer dereference in handle_stripe after disk failure | | |
|---|---|---|---|
| Product: | IO/Storage | Reporter: | Kevin Shanahan (kmshanah) |
| Component: | MD | Assignee: | io_md |
| Status: | NEW --- | | |
| Severity: | normal | CC: | bastienphilbert, c.r1, xerofoify |
| Priority: | P1 | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Kernel Version: | 3.1.9-2-ARCH | Subsystem: | |
| Regression: | No | Bisected commit-id: | |
Description
Kevin Shanahan
2012-01-20 02:12:07 UTC
Please test against a newer kernel to see if this is fixed.
Cheers Nick

This was not readily reproducible and still isn't. If the backtrace is not useful in pointing out a bug, then there is nothing else to do and the bug can be closed.

The backtrace is not good enough for me to trace it. I would recommend closing it then.
Thanks Nick

This definitely still happens - current backtrace:

Aug 06 23:33:45 james kernel: md/raid:md0: Disk failure on sde1, disabling device. md/raid:md0: Operation continuing on 4 devices.
Aug 06 23:33:46 james kernel: md: md0: recovery interrupted.
Aug 06 23:33:47 james kernel: RAID conf printout:
Aug 06 23:33:47 james kernel: --- level:6 rd:6 wd:4
Aug 06 23:33:47 james kernel: disk 0, o:1, dev:sdd1
Aug 06 23:33:47 james kernel: disk 1, o:0, dev:sde1
Aug 06 23:33:47 james kernel: disk 2, o:1, dev:sdc1
Aug 06 23:33:47 james kernel: disk 3, o:1, dev:sdj1
Aug 06 23:33:47 james kernel: disk 4, o:1, dev:sdb1
Aug 06 23:33:47 james kernel: disk 5, o:1, dev:sdk1
Aug 06 23:33:47 james kernel: RAID conf printout:
Aug 06 23:33:47 james kernel: --- level:6 rd:6 wd:4
Aug 06 23:33:47 james kernel: disk 0, o:1, dev:sdd1
Aug 06 23:33:47 james kernel: disk 2, o:1, dev:sdc1
Aug 06 23:33:47 james kernel: disk 3, o:1, dev:sdj1
Aug 06 23:33:47 james kernel: disk 4, o:1, dev:sdb1
Aug 06 23:33:47 james kernel: disk 5, o:1, dev:sdk1
Aug 06 23:33:47 james kernel: RAID conf printout:
Aug 06 23:33:47 james kernel: --- level:6 rd:6 wd:4
Aug 06 23:33:47 james kernel: disk 0, o:1, dev:sdd1
Aug 06 23:33:47 james kernel: disk 2, o:1, dev:sdc1
Aug 06 23:33:47 james kernel: disk 3, o:1, dev:sdj1
Aug 06 23:33:47 james kernel: disk 4, o:1, dev:sdb1
Aug 06 23:33:47 james kernel: disk 5, o:1, dev:sdk1
Aug 06 23:33:47 james kernel: RAID conf printout:
Aug 06 23:33:47 james kernel: --- level:6 rd:6 wd:4
Aug 06 23:33:47 james kernel: disk 0, o:1, dev:sdd1
Aug 06 23:33:47 james kernel: disk 2, o:1, dev:sdc1
Aug 06 23:33:47 james kernel: disk 3, o:1, dev:sdj1
Aug 06 23:33:47 james kernel: disk 4, o:1, dev:sdb1
Aug 06 23:33:47 james kernel: disk 5, o:1, dev:sdk1
Aug 06 23:33:47 james kernel: md: recovery of RAID array md0
Aug 06 23:33:47 james kernel: md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
Aug 06 23:33:47 james kernel: md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery.
Aug 06 23:33:47 james kernel: md: using 128k window, over a total of 1953278720k.
Aug 06 23:33:47 james kernel: md: resuming recovery of md0 from checkpoint.
Aug 06 23:34:05 james kernel: ata8.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Aug 06 23:34:05 james kernel: ata8.00: irq_stat 0x00060002, device error via D2H FIS
Aug 06 23:34:05 james kernel: ata8.00: failed command: READ DMA EXT
Aug 06 23:34:05 james kernel: ata8.00: cmd 25/00:e8:c0:60:7d/00:07:d0:00:00/e0 tag 0 dma 1036288 in res 51/40:97:10:63:7d/00:05:d0:00:00/e0 Emask 0x9 (media error)
Aug 06 23:34:05 james kernel: ata8.00: status: { DRDY ERR }
Aug 06 23:34:05 james kernel: ata8.00: error: { UNC }
Aug 06 23:34:05 james kernel: ata8.00: configured for UDMA/33
Aug 06 23:34:05 james kernel: scsi_io_completion: 40 callbacks suppressed
Aug 06 23:34:05 james kernel: sd 9:0:0:0: [sdj] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Aug 06 23:34:05 james kernel: sd 9:0:0:0: [sdj] tag#0 Sense Key : Medium Error [current] [descriptor]
Aug 06 23:34:05 james kernel: sd 9:0:0:0: [sdj] tag#0 Add. Sense: Unrecovered read error - auto reallocate failed
Aug 06 23:34:05 james kernel: sd 9:0:0:0: [sdj] tag#0 CDB: Read(10) 28 00 d0 7d 60 c0 00 07 e8 00
Aug 06 23:34:05 james kernel: blk_update_request: 42 callbacks suppressed
Aug 06 23:34:05 james kernel: blk_update_request: I/O error, dev sdj, sector 3497878288
Aug 06 23:34:05 james kernel: md/raid:md0: read error not correctable (sector 3497876240 on sdj1).
Aug 06 23:34:05 james kernel: md/raid:md0: read error not correctable (sector 3497876248 on sdj1).
Aug 06 23:34:05 james kernel: md/raid:md0: read error not correctable (sector 3497876256 on sdj1).
Aug 06 23:34:05 james kernel: md/raid:md0: read error not correctable (sector 3497876264 on sdj1).
Aug 06 23:34:05 james kernel: md/raid:md0: read error not correctable (sector 3497876272 on sdj1).
Aug 06 23:34:05 james kernel: md/raid:md0: read error not correctable (sector 3497876280 on sdj1).
Aug 06 23:34:05 james kernel: md/raid:md0: read error not correctable (sector 3497876288 on sdj1).
Aug 06 23:34:05 james kernel: md/raid:md0: read error not correctable (sector 3497876296 on sdj1).
Aug 06 23:34:05 james kernel: md/raid:md0: read error not correctable (sector 3497876304 on sdj1).
Aug 06 23:34:05 james kernel: md/raid:md0: read error not correctable (sector 3497876312 on sdj1).
Aug 06 23:34:05 james kernel: ata8: EH complete
Aug 06 23:34:10 james kernel: ata8.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Aug 06 23:34:10 james kernel: ata8.00: irq_stat 0x00060002, device error via D2H FIS
Aug 06 23:34:10 james kernel: ata8.00: failed command: READ DMA EXT
Aug 06 23:34:10 james kernel: ata8.00: cmd 25/00:e8:78:78:7d/00:07:d0:00:00/e0 tag 1 dma 1036288 in res 51/40:d7:88:79:7d/00:06:d0:00:00/e0 Emask 0x9 (media error)
Aug 06 23:34:10 james kernel: ata8.00: status: { DRDY ERR }
Aug 06 23:34:10 james kernel: ata8.00: error: { UNC }
Aug 06 23:34:10 james kernel: ata8.00: configured for UDMA/33
Aug 06 23:34:10 james kernel: sd 9:0:0:0: [sdj] tag#1 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Aug 06 23:34:10 james kernel: sd 9:0:0:0: [sdj] tag#1 Sense Key : Medium Error [current] [descriptor]
Aug 06 23:34:10 james kernel: sd 9:0:0:0: [sdj] tag#1 Add. Sense: Unrecovered read error - auto reallocate failed
Aug 06 23:34:10 james kernel: sd 9:0:0:0: [sdj] tag#1 CDB: Read(10) 28 00 d0 7d 78 78 00 07 e8 00
Aug 06 23:34:10 james kernel: blk_update_request: I/O error, dev sdj, sector 3497884040
Aug 06 23:34:10 james kernel: ata8: EH complete
Aug 06 23:34:28 james kernel: ata8.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Aug 06 23:34:28 james kernel: ata8.00: irq_stat 0x00060002, device error via D2H FIS
Aug 06 23:34:28 james kernel: ata8.00: failed command: READ DMA EXT
Aug 06 23:34:28 james kernel: ata8.00: cmd 25/00:d8:88:79:7d/00:06:d0:00:00/e0 tag 0 dma 897024 in res 51/40:a7:b0:7e:7d/00:01:d0:00:00/e0 Emask 0x9 (media error)
Aug 06 23:34:28 james kernel: ata8.00: status: { DRDY ERR }
Aug 06 23:34:28 james kernel: ata8.00: error: { UNC }
Aug 06 23:34:28 james kernel: ata8.00: configured for UDMA/33
Aug 06 23:34:28 james kernel: sd 9:0:0:0: [sdj] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Aug 06 23:34:28 james kernel: sd 9:0:0:0: [sdj] tag#0 Sense Key : Medium Error [current] [descriptor]
Aug 06 23:34:28 james kernel: sd 9:0:0:0: [sdj] tag#0 Add. Sense: Unrecovered read error - auto reallocate failed
Aug 06 23:34:28 james kernel: sd 9:0:0:0: [sdj] tag#0 CDB: Read(10) 28 00 d0 7d 79 88 00 06 d8 00
Aug 06 23:34:28 james kernel: blk_update_request: I/O error, dev sdj, sector 3497885360
Aug 06 23:34:28 james kernel: raid5_end_read_request: 388 callbacks suppressed
Aug 06 23:34:28 james kernel: md/raid:md0: read error not correctable (sector 3497883312 on sdj1).
Aug 06 23:34:28 james kernel: md/raid:md0: Disk failure on sdj1, disabling device. md/raid:md0: Operation continuing on 3 devices.
Aug 06 23:34:28 james kernel: md/raid:md0: read error not correctable (sector 3497883320 on sdj1).
Aug 06 23:34:28 james kernel: md/raid:md0: read error not correctable (sector 3497883328 on sdj1).
Aug 06 23:34:28 james kernel: md/raid:md0: read error not correctable (sector 3497883336 on sdj1).
Aug 06 23:34:28 james kernel: md/raid:md0: read error not correctable (sector 3497883344 on sdj1).
Aug 06 23:34:28 james kernel: md/raid:md0: read error not correctable (sector 3497883352 on sdj1).
Aug 06 23:34:28 james kernel: md/raid:md0: read error not correctable (sector 3497883360 on sdj1).
Aug 06 23:34:28 james kernel: md/raid:md0: read error not correctable (sector 3497883368 on sdj1).
Aug 06 23:34:28 james kernel: md/raid:md0: read error not correctable (sector 3497883376 on sdj1).
Aug 06 23:34:28 james kernel: md/raid:md0: read error not correctable (sector 3497883384 on sdj1).
Aug 06 23:34:28 james kernel: ata8: EH complete
Aug 06 23:34:30 james kernel: md: md0: recovery interrupted.
Aug 06 23:34:30 james kernel: RAID conf printout:
Aug 06 23:34:30 james kernel: --- level:6 rd:6 wd:3
Aug 06 23:34:30 james kernel: disk 0, o:1, dev:sdd1
Aug 06 23:34:30 james kernel: disk 2, o:1, dev:sdc1
Aug 06 23:34:30 james kernel: disk 3, o:0, dev:sdj1
Aug 06 23:34:30 james kernel: disk 4, o:1, dev:sdb1
Aug 06 23:34:30 james kernel: disk 5, o:1, dev:sdk1
Aug 06 23:34:30 james kernel: RAID conf printout:
Aug 06 23:34:30 james kernel: --- level:6 rd:6 wd:3
Aug 06 23:34:30 james kernel: disk 0, o:1, dev:sdd1
Aug 06 23:34:30 james kernel: disk 2, o:1, dev:sdc1
Aug 06 23:34:30 james kernel: disk 4, o:1, dev:sdb1
Aug 06 23:34:30 james kernel: disk 5, o:1, dev:sdk1
Aug 06 23:34:30 james kernel: RAID conf printout:
Aug 06 23:34:30 james kernel: --- level:6 rd:6 wd:3
Aug 06 23:34:30 james kernel: disk 0, o:1, dev:sdd1
Aug 06 23:34:30 james kernel: disk 2, o:1, dev:sdc1
Aug 06 23:34:30 james kernel: disk 4, o:1, dev:sdb1
Aug 06 23:34:30 james kernel: disk 5, o:1, dev:sdk1
Aug 06 23:34:30 james kernel: RAID conf printout:
Aug 06 23:34:30 james kernel: --- level:6 rd:6 wd:3
Aug 06 23:34:30 james kernel: disk 0, o:1, dev:sdd1
Aug 06 23:34:30 james kernel: disk 2, o:1, dev:sdc1
Aug 06 23:34:30 james kernel: disk 4, o:1, dev:sdb1
Aug 06 23:34:31 james kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000150
Aug 06 23:34:31 james kernel: IP: [<ffffffffa0590d36>] handle_stripe+0x1596/0x2660 [raid456]
Aug 06 23:34:31 james kernel: PGD 0
Aug 06 23:34:31 james kernel: Oops: 0000 [#1] SMP
Aug 06 23:34:31 james kernel: Modules linked in: vhost_net vhost macvtap macvlan ebtable_filter ebtables ip6table_filter ip6_tables xt_conntrack ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 xt_addrtype br_netfilter nf_nat nf_conntrack dm_thin_pool dm_persistent_data dm_bio_prison libcrc32c loop tun bridge stp llc cfg80211 rfkill vfat fat intel_rapl iosf_mbi x86_pkg_temp_thermal raid456 coretemp kvm_intel async_raid6_recov async_memcpy async_pq async_xor kvm xor async_tx iTCO_wdt iTCO_vendor_support snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic raid6_pq snd_hda_intel snd_hda_controller snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm i2c_i801 lpc_ich snd_timer mfd_core nuvoton_cir snd rc_core mei_me mei soundcore tpm_tis shpchp tpm
Aug 06 23:34:31 james kernel: nfsd auth_rpcgss nfs_acl lockd grace sunrpc dm_crypt i915 crct10dif_pclmul crc32_pclmul crc32c_intel i2c_algo_bit drm_kms_helper uas e1000e drm ghash_clmulni_intel usb_storage serio_raw r8169 mii ptp mvsas pps_core sata_sil24 libsas scsi_transport_sas video
Aug 06 23:34:31 james kernel: CPU: 1 PID: 759 Comm: md0_raid6 Not tainted 4.1.3-200.fc22.x86_64 #1
Aug 06 23:34:31 james kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Z68 Pro3, BIOS P2.10 05/07/2012
Aug 06 23:34:31 james kernel: task: ffff8807fa274520 ti: ffff8800d16a0000 task.ti: ffff8800d16a0000
Aug 06 23:34:31 james kernel: RIP: 0010:[<ffffffffa0590d36>] [<ffffffffa0590d36>] handle_stripe+0x1596/0x2660 [raid456]
Aug 06 23:34:31 james kernel: RSP: 0018:ffff8800d16a3b48 EFLAGS: 00010202
Aug 06 23:34:31 james kernel: RAX: 0000000000000003 RBX: 0000000000000005 RCX: 0000000000000005
Aug 06 23:34:31 james kernel: RDX: 0000000000000002 RSI: 0000000000000003 RDI: 0000000000000000
Aug 06 23:34:31 james kernel: RBP: ffff8800d16a3c78 R08: 0000000000000000 R09: 0000000000000000
Aug 06 23:34:31 james kernel: R10: 0000000000000005 R11: 0000000000000004 R12: 0000000000000005
Aug 06 23:34:31 james kernel: R13: ffff880653790948 R14: 0000000000000005 R15: ffff8807f9be9000
Aug 06 23:34:31 james kernel: FS: 0000000000000000(0000) GS:ffff88081f280000(0000) knlGS:0000000000000000
Aug 06 23:34:31 james kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 06 23:34:31 james kernel: CR2: 0000000000000150 CR3: 0000000002c0b000 CR4: 00000000000427e0
Aug 06 23:34:31 james kernel: Stack:
Aug 06 23:34:31 james kernel: 0000000000000001 0000000000000000 0000000000000000 00000000c8882968
Aug 06 23:34:31 james kernel: ffff8807f9be9230 000000010199d4f8 0000000000000001 ffff8806537909b0
Aug 06 23:34:31 james kernel: 0000000000000000 ffff880653790b48 ffff880000000001 00000001ffffffff
Aug 06 23:34:31 james kernel: Call Trace:
Aug 06 23:34:31 james kernel: [<ffffffffa0591f9e>] handle_active_stripes.isra.45+0x19e/0x4e0 [raid456]
Aug 06 23:34:31 james kernel: [<ffffffffa0592798>] raid5d+0x4b8/0x680 [raid456]
Aug 06 23:34:31 james kernel: [<ffffffff8179d581>] ? __schedule+0x241/0x720
Aug 06 23:34:31 james kernel: [<ffffffff815fd2c4>] md_thread+0x144/0x150
Aug 06 23:34:31 james kernel: [<ffffffff810e4d10>] ? wake_atomic_t_function+0x70/0x70
Aug 06 23:34:31 james kernel: [<ffffffff815fd180>] ? find_pers+0x80/0x80
Aug 06 23:34:31 james kernel: [<ffffffff810c0b88>] kthread+0xd8/0xf0
Aug 06 23:34:31 james kernel: [<ffffffff810c0ab0>] ? kthread_worker_fn+0x180/0x180
Aug 06 23:34:31 james kernel: [<ffffffff817a1e62>] ret_from_fork+0x42/0x70
Aug 06 23:34:31 james kernel: [<ffffffff810c0ab0>] ? kthread_worker_fn+0x180/0x180
Aug 06 23:34:31 james kernel: Code: 69 d2 70 01 00 00 4c 01 ea 48 8b 92 10 02 00 00 80 e2 10 0f 85 c4 10 00 00 31 d2 85 c0 0f 8e f2 f7 ff ff 48 8b bc d5 48 ff ff ff <48> 83 bf 50 01 00 00 00 74 1d 4c 8b 87 68 01 00 00 41 83 e0 01
Aug 06 23:34:31 james kernel: RIP [<ffffffffa0590d36>] handle_stripe+0x1596/0x2660 [raid456]
Aug 06 23:34:31 james kernel: RSP <ffff8800d16a3b48>
Aug 06 23:34:31 james kernel: CR2: 0000000000000150
Aug 06 23:34:31 james kernel: [drm:intel_crtc_set_config [i915]] *ERROR* failed to restore config after modeset failure
Aug 06 23:34:31 james kernel: ---[ end trace 5971c15150d886db ]---

Try a newer kernel, as it seems this may be fixed now.