Hi guys,

Is there anything I can do to debug this further which would help?

Sep 8 18:27:21 london-0-1 kernel: [ 330.954789] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
Sep 8 18:27:21 london-0-1 kernel: [ 330.954867] IP: blk_queue_split+0x190/0x5f0
Sep 8 18:27:21 london-0-1 kernel: [ 330.954901] PGD 0 P4D 0
Sep 8 18:27:21 london-0-1 kernel: [ 330.954926] Oops: 0000 [#1] SMP PTI
Sep 8 18:27:21 london-0-1 kernel: [ 330.954955] Modules linked in: ebt_log ebt_ip6 ebt_ip ebt_arp vhost_net vhost tap tun iptable_nat nf_nat_ipv4 ipt_REJECT nf_reject_ipv4 iptable_mangle iptable_raw nf_conntrack_ipv4 nf_defrag_ipv4 xt_recent ip6table_nat nf_nat_ipv6 xt_hashlimit xt_comment ip6t_REJECT nf_reject_ipv6 xt_addrtype xt_mark ip6table_mangle xt_tcpudp xt_CT ip6table_raw xt_multiport nf_log_ipv4 nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_tftp nf_nat_snmp_basic nf_conntrack_snmp nf_nat_sip xt_conntrack nf_nat_pptp nf_nat_proto_gre nf_nat_irc nf_nat_h323 xt_NFLOG nfnetlink_log xt_LOG nf_log_ipv6 nf_nat_ftp nf_nat_amanda nf_nat nf_log_common nf_conntrack_tftp nf_conntrack_sip nf_conntrack_sane nf_conntrack_pptp nf_conntrack_proto_gre nf_conntrack_netlink nfnetlink nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_irc nf_conntrack_h323
Sep 8 18:27:21 london-0-1 kernel: [ 330.955506] ts_kmp nf_conntrack_amanda nf_conntrack_ftp nf_conntrack ipmi_watchdog ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter bridge stp llc fuse dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm bcache irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel ast intel_cstate joydev intel_uncore ttm drm_kms_helper drm efi_pstore intel_rapl_perf evdev efivars sg mei_me mei pcspkr lpc_ich mfd_core ipmi_si wmi ipmi_devintf ipmi_msghandler shpchp pcc_cpufreq ioatdma acpi_power_meter acpi_pad button efivarfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 fscrypto ecb raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor dm_mod hid_generic usbhid hid sd_mod raid6_pq
Sep 8 18:27:21 london-0-1 kernel: [ 330.956067] libcrc32c crc32c_generic raid1 raid0 multipath linear md_mod crc32c_intel aesni_intel ahci xhci_pci aes_x86_64 crypto_simd libahci ehci_pci xhci_hcd ehci_hcd cryptd glue_helper libata usbcore igb i2c_i801 i2c_algo_bit scsi_mod usb_common dca nvme ptp pps_core nvme_core
Sep 8 18:27:21 london-0-1 kernel: [ 330.956273] CPU: 4 PID: 2672 Comm: worker Not tainted 4.14.0-68.idms.0-amd64 #1 IDMS Linux 4.14.68-1+idms1
Sep 8 18:27:21 london-0-1 kernel: [ 330.956342] Hardware name: Supermicro Super Server/X10SRi-F, BIOS 2.0 12/17/2015
Sep 8 18:27:21 london-0-1 kernel: [ 330.956397] task: ffff95ad2462a100 task.stack: ffffb4a987bb0000
Sep 8 18:27:21 london-0-1 kernel: [ 330.958952] RIP: 0010:blk_queue_split+0x190/0x5f0
Sep 8 18:27:21 london-0-1 kernel: [ 330.961448] RSP: 0018:ffffb4a987bb3b98 EFLAGS: 00010246
Sep 8 18:27:21 london-0-1 kernel: [ 330.963953] RAX: 00000000ffffffff RBX: 000000000007a000 RCX: 0000000000000000
Sep 8 18:27:21 london-0-1 kernel: [ 330.966451] RDX: 00000000000000d0 RSI: 0000000000785830 RDI: ffff95ad305ef200
Sep 8 18:27:21 london-0-1 kernel: [ 330.968909] RBP: ffffb4a987bb3c28 R08: ffff95ad31532d20 R09: 0000000000000000
Sep 8 18:27:21 london-0-1 kernel: [ 330.971422] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
Sep 8 18:27:21 london-0-1 kernel: [ 330.973911] R13: 0000000000000000 R14: ffff95ad305ef200 R15: 0000000000000000
Sep 8 18:27:21 london-0-1 kernel: [ 330.976357] FS: 00007fa1f4991700(0000) GS:ffff95ad3f300000(0000) knlGS:0000000000000000
Sep 8 18:27:21 london-0-1 kernel: [ 330.978821] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Sep 8 18:27:21 london-0-1 kernel: [ 330.981291] CR2: 0000000000000008 CR3: 0000001ff307a001 CR4: 00000000003626e0
Sep 8 18:27:21 london-0-1 kernel: [ 330.983809] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Sep 8 18:27:21 london-0-1 kernel: [ 330.991322] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Sep 8 18:27:21 london-0-1 kernel: [ 330.993839] Call Trace:
Sep 8 18:27:21 london-0-1 kernel: [ 330.996339] md_make_request+0x23/0x160 [md_mod]
Sep 8 18:27:21 london-0-1 kernel: [ 330.998847] generic_make_request+0x123/0x2f0
Sep 8 18:27:21 london-0-1 kernel: [ 331.001309] ? submit_bio+0x6c/0x140
Sep 8 18:27:21 london-0-1 kernel: [ 331.003774] submit_bio+0x6c/0x140
Sep 8 18:27:21 london-0-1 kernel: [ 331.006211] ? next_bio+0x18/0x40
Sep 8 18:27:21 london-0-1 kernel: [ 331.008588] ? __blkdev_issue_discard+0x185/0x1e0
Sep 8 18:27:21 london-0-1 kernel: [ 331.010993] submit_bio_wait+0x57/0x80
Sep 8 18:27:21 london-0-1 kernel: [ 331.013328] blkdev_issue_discard+0x80/0xd0
Sep 8 18:27:21 london-0-1 kernel: [ 331.015705] ? blk_ioctl_discard+0x8b/0xc0
Sep 8 18:27:21 london-0-1 kernel: [ 331.017981] blk_ioctl_discard+0x8b/0xc0
Sep 8 18:27:21 london-0-1 kernel: [ 331.020252] blkdev_ioctl+0x893/0x970
Sep 8 18:27:21 london-0-1 kernel: [ 331.022499] block_ioctl+0x39/0x40
Sep 8 18:27:21 london-0-1 kernel: [ 331.029313] do_vfs_ioctl+0xa2/0x620
Sep 8 18:27:21 london-0-1 kernel: [ 331.031566] ? SyS_futex+0x7a/0x170
Sep 8 18:27:21 london-0-1 kernel: [ 331.033739] SyS_ioctl+0x74/0x80
Sep 8 18:27:21 london-0-1 kernel: [ 331.035862] do_syscall_64+0x6e/0x100
Sep 8 18:27:21 london-0-1 kernel: [ 331.037932] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
Sep 8 18:27:21 london-0-1 kernel: [ 331.039980] RIP: 0033:0x7fbaa4e8bdd7
Sep 8 18:27:21 london-0-1 kernel: [ 331.041948] RSP: 002b:00007fa1f4990978 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Sep 8 18:27:21 london-0-1 kernel: [ 331.043961] RAX: ffffffffffffffda RBX: 0000000000002000 RCX: 00007fbaa4e8bdd7
Sep 8 18:27:21 london-0-1 kernel: [ 331.045890] RDX: 00007fa1f4990980 RSI: 0000000000001277 RDI: 000000000000002a
Sep 8 18:27:21 london-0-1 kernel: [ 331.047769] RBP: 0000560cc04e7280 R08: 0000000000000000 R09: 00000000ffffffff
Sep 8 18:27:21 london-0-1 kernel: [ 331.049617] R10: 00007fa1f49909a0 R11: 0000000000000246 R12: 0000560cbbd66b70
Sep 8 18:27:21 london-0-1 kernel: [ 331.051394] R13: 00007fa1f4990980 R14: 00007fa1f4991700 R15: 0000560cbe690bc0
Sep 8 18:27:21 london-0-1 kernel: [ 331.053123] Code: 60 3c 89 55 98 45 31 db 45 31 ff c7 45 a4 00 00 00 00 45 31 d2 85 db 0f 84 92 03 00 00 45 89 cd 4c 89 e9 48 c1 e1 04 49 03 48 78 <8b> 41 08 8b 71 0c 48 8b 11 44 29 e0 39 d8 48 89 55 a8 0f 47 c3
Sep 8 18:27:21 london-0-1 kernel: [ 331.056631] RIP: blk_queue_split+0x190/0x5f0 RSP: ffffb4a987bb3b98
Sep 8 18:27:21 london-0-1 kernel: [ 331.058308] CR2: 0000000000000008
Sep 8 18:27:21 london-0-1 kernel: [ 331.059958] ---[ end trace 4f9c8b67a1e8cf2c ]---
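On the question of debugging this further: one common next step is to map the `symbol+offset/size` entries from the oops back to source lines, for example with the kernel tree's `scripts/faddr2line` run against a `vmlinux` (or module `.ko`) built with debug info for this exact kernel. The following is only a rough sketch of pulling those entries out of the log. The names `Frame`, `parse_frames` and `faddr2line_args` are made up for illustration, and the `<module>.ko` file naming is an assumption about how the modules are laid out on disk.

```python
import re
from typing import List, NamedTuple, Tuple


class Frame(NamedTuple):
    symbol: str
    offset: int     # byte offset into the function
    size: int       # total size of the function in bytes
    module: str     # "" when no "[module]" tag follows the entry
    reliable: bool  # False when the kernel prefixed the entry with "?"


# Matches entries such as "md_make_request+0x23/0x160 [md_mod]"
# or "? submit_bio+0x6c/0x140" anywhere in the log text.
FRAME_RE = re.compile(
    r"(?P<guess>\?[ \t]+)?"
    r"(?P<sym>[A-Za-z_][\w.]*)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:[ \t]+\[(?P<mod>\w+)\])?"
)


def parse_frames(log_text: str) -> List[Frame]:
    """Return every symbol+offset/size reference, in log order."""
    return [
        Frame(
            symbol=m.group("sym"),
            offset=int(m.group("off"), 16),
            size=int(m.group("size"), 16),
            module=m.group("mod") or "",
            reliable=m.group("guess") is None,
        )
        for m in FRAME_RE.finditer(log_text)
    ]


def faddr2line_args(frame: Frame) -> Tuple[str, str]:
    """Build (object file, func+offset/size) for scripts/faddr2line.

    Assumption: built-in code lives in "vmlinux" and a module named
    "md_mod" lives in "md_mod.ko".
    """
    obj = f"{frame.module}.ko" if frame.module else "vmlinux"
    return obj, f"{frame.symbol}+{frame.offset:#x}/{frame.size:#x}"


# A few lines copied from the oops above (syslog prefix trimmed).
SAMPLE = (
    "[ 330.954867] IP: blk_queue_split+0x190/0x5f0\n"
    "[ 330.996339] md_make_request+0x23/0x160 [md_mod]\n"
    "[ 330.998847] generic_make_request+0x123/0x2f0\n"
    "[ 331.001309] ? submit_bio+0x6c/0x140\n"
)

if __name__ == "__main__":
    for frame in parse_frames(SAMPLE):
        if frame.reliable:
            print(*faddr2line_args(frame))
```

The pair printed for the faulting entry (`vmlinux` and `blk_queue_split+0x190/0x5f0`) is what would be passed to `scripts/faddr2line`. Entries marked with `?` are skipped because the kernel itself flags them as unreliable.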

Hi,

I have a patch which I believe fixes your issue:
https://www.spinics.net/lists/linux-bcache/msg06997.html

It looks like it will go into the 5.1 kernel.

Regards,
Daniel

Hi there Daniel,

I should have included more details in my bug report. My setup was two NVMe drives in RAID 1 with LVM on top. I was running mkfs.ext4 when the issue occurred, IIRC.

Is it safe for me to backport the patch to 4.14 and re-test?

-N
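For re-testing, it may help to hit the same code path directly. The call trace goes through `blk_ioctl_discard`, and the user-space `RSI` value of `0x1277` in the oops is the `BLKDISCARD` ioctl number. mkfs.ext4 normally attempts a discard when it creates a filesystem, which would be consistent with that, although the log alone does not show which program issued the call. Below is a minimal sketch of issuing the same ioctl from Python. The helper names are made up for illustration and the device path in the comment is hypothetical. Note that this is destructive: it discards data on the target range, so it should only ever be pointed at a scratch volume.

```python
import fcntl
import os
import struct

# BLKDISCARD is _IO(0x12, 119) in <linux/fs.h>, i.e. 0x1277.
BLKDISCARD = (0x12 << 8) | 119


def discard_range_arg(start: int, length: int) -> bytes:
    """Pack the ioctl argument: uint64_t range[2] = {start, length} in bytes."""
    return struct.pack("=QQ", start, length)


def discard(device_path: str, start: int, length: int) -> None:
    """Issue BLKDISCARD on [start, start + length) of a block device.

    DESTRUCTIVE: the discarded range loses its contents. Needs root.
    """
    fd = os.open(device_path, os.O_WRONLY)
    try:
        fcntl.ioctl(fd, BLKDISCARD, discard_range_arg(start, length))
    finally:
        os.close(fd)


# Example (hypothetical scratch LV, not taken from the report):
#   discard("/dev/vg_scratch/lv_test", 0, 1 << 30)
```

Running this against a throwaway logical volume on the same RAID 1 + LVM stack, before and after applying the backported patch, would give a quicker yes/no signal than a full mkfs.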

Hi Nigel,

I'm not in any position to guarantee or make promises about the safety of anything, but:

 - I do not believe it is likely to cause you any issues, and
 - it is queued up for 5.1, so the maintainer seems to agree.

So YMMV, and make backups etc., but I think it's worth trying.

Regards,
Daniel

Does this problem still show up in Linux v5.5 or v5.6-rc?

Thanks.

Coly Li