Bug 215824 - bfq null pointer dereference in bfq_idle_extract -> __list_del_entry_valid
Status: NEW
Alias: None
Product: IO/Storage
Classification: Unclassified
Component: Block Layer
Hardware: All Linux
Importance: P1 normal
Assignee: Jens Axboe
Reported: 2022-04-09 13:13 UTC by ValdikSS
Modified: 2022-04-28 05:50 UTC

Kernel Version: 5.15.32
Regression: Yes

Description ValdikSS 2022-04-09 13:13:32 UTC
Since around kernel 5.10.52 or 5.10.56, a system configured with the bfq I/O scheduler hangs completely after some time, usually during heavy I/O. All CPU cores lock up, and Alt+SysRq+B does not reboot the system. It either hangs indefinitely or is rebooted by the watchdog, if one is configured.

I managed to capture the following kernel oops on 5.15.32 with netconsole, after which the platform hung. loop1 is an LXD file system stored on the system disk.

[42522.578543] BUG: kernel NULL pointer dereference, address: 0000000000000000
[42522.578555] #PF: supervisor read access in kernel mode
[42522.578559] #PF: error_code(0x0000) - not-present page
[42522.578562] PGD 0 P4D 0 
[42522.578567] Oops: 0000 [#1] SMP PTI
[42522.578571] CPU: 13 PID: 213350 Comm: kworker/u32:7 Tainted: G S                5.15.32-1-lts #1 bb8765a1c0d822a5d87cc236b26af488e39e88db
[42522.578577] Hardware name: HUANANZHI X99  /X99-8M-F , BIOS 5.11 04/12/2021
[42522.578580] Workqueue: loop1 loop_workfn [loop]
[42522.578590] RIP: 0010:__list_del_entry_valid+0x25/0x90
[42522.578597] Code: c3 0f 1f 40 00 48 8b 17 4c 8b 47 08 48 b8 00 01 00 00 00 00 ad de 48 39 c2 74 26 48 b8 22 01 00 00 00 00 ad de 49 39 c0 74 2b <49> 8b 30 48 39 fe 75 3a 48 8b 52 08 48 39 f2 75 48 b8 01 00 00 00
[42522.578603] RSP: 0018:ffffaac5469a7890 EFLAGS: 00010017
[42522.578607] RAX: dead000000000122 RBX: ffff9aabfbb0b158 RCX: 0000000000000000
[42522.578610] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9aac70380140
[42522.578613] RBP: ffff9aac70380098 R08: 0000000000000000 R09: 0000000000000000
[42522.578617] R10: 0000000000000001 R11: ffff9aac4a139c00 R12: ffff9aac70380010
[42522.578620] R13: ffff9aac4a139c70 R14: 0000000000000000 R15: ffff9aabc6443c00
[42522.578623] FS:  0000000000000000(0000) GS:ffff9aaf2ff40000(0000) knlGS:0000000000000000
[42522.578627] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[42522.578631] CR2: 0000000000000000 CR3: 000000006ae10005 CR4: 00000000001706e0
[42522.578634] Call Trace:
[42522.578638]  <TASK>
[42522.578641]  bfq_idle_extract+0x52/0xb0
[42522.578648]  bfq_put_idle_entity+0x12/0x60
[42522.578652]  bfq_bfqq_served+0xc1/0x1a0
[42522.578657]  bfq_dispatch_request+0x2d3/0x12a0
[42522.578661]  ? __sbitmap_get_word+0x30/0x80
[42522.578668]  __blk_mq_do_dispatch_sched+0x219/0x320
[42522.578674]  ? recalibrate_cpu_khz+0x10/0x10
[42522.578681]  ? ktime_get+0x38/0x90
[42522.578686]  ? bfq_insert_requests+0x778/0x16e0
[42522.578690]  __blk_mq_sched_dispatch_requests+0x109/0x160
[42522.578696]  blk_mq_sched_dispatch_requests+0x30/0x60
[42522.578701]  __blk_mq_run_hw_queue+0x2b/0x90
[42522.578707]  __blk_mq_delay_run_hw_queue+0x144/0x150
[42522.578711]  blk_mq_sched_insert_requests+0x63/0xe0
[42522.578717]  blk_mq_flush_plug_list+0x10f/0x1a0
[42522.578722]  blk_finish_plug+0x21/0x30
[42522.578728]  __iomap_dio_rw+0x59e/0x7c0
[42522.578737]  iomap_dio_rw+0xa/0x30
[42522.578741]  ext4_file_read_iter+0x101/0x160 [ext4 dd6da0888b8148498814602f34d1bf7d7eae8148]
[42522.578793]  lo_rw_aio.isra.0+0x2c3/0x2e0 [loop bc975935d69a92a419a6272704b0dab0b0464574]
[42522.578801]  loop_process_work+0x6e4/0xcb0 [loop bc975935d69a92a419a6272704b0dab0b0464574]
[42522.578808]  ? raw_spin_rq_lock_nested+0xa/0x10
[42522.578814]  ? newidle_balance+0x2ef/0x400
[42522.578822]  ? __switch_to_asm+0x42/0x70
[42522.578829]  ? __switch_to+0x11b/0x420
[42522.578835]  process_one_work+0x1f1/0x390
[42522.578841]  worker_thread+0x53/0x3e0
[42522.578846]  ? process_one_work+0x390/0x390
[42522.578850]  kthread+0x127/0x150
[42522.578857]  ? set_kthread_struct+0x40/0x40
[42522.578862]  ret_from_fork+0x22/0x30
[42522.578868]  </TASK>
[42522.578871] Modules linked in: nf_conntrack_netlink netconsole xt_conntrack nft_chain_nat xt_addrtype nft_counter xt_owner nft_compat nf_tables overlay ip6table_raw ip6t_rpfilter iptable_raw ipt_rpfilter veth xt_CHECKSUM xt_tcpudp xt_comment xt_MASQUERADE ip6table_nat ip6table_mangle ip6table_filter ip6_tables bridge stp llc btrfs blake2b_generic xor raid6_pq loop vhost_vsock vmw_vsock_virtio_transport_common vhost vhost_iotlb vsock dm_crypt cbc encrypted_keys trusted asn1_encoder tee tpm rng_core rfkill lzo_rle zram nfnetlink_queue nfnetlink iptable_mangle iptable_filter iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c nct6775 hwmon_vid intel_rapl_msr intel_rapl_common vfat fat x86_pkg_temp_thermal intel_powerclamp coretemp snd_hda_codec_realtek snd_hda_codec_generic kvm_intel ledtrig_audio kvm snd_hda_intel irqbypass crct10dif_pclmul snd_intel_dspcfg snd_intel_sdw_acpi crc32_pclmul snd_hda_codec intel_spi_platform ghash_clmulni_intel aesni_intel iTCO_wdt psmouse
[42522.578926]  serio_raw intel_spi snd_hda_core crypto_simd intel_pmc_bxt spi_nor atkbd mtd cryptd iTCO_vendor_support snd_hwdep mxm_wmi gpio_ich rapl r8169 libps2 intel_cstate snd_pcm snd_timer snd i2c_i801 realtek mdio_devres intel_uncore i2c_smbus soundcore wmi libphy lpc_ich mac_hid i8042 serio sch_fq tcp_bbr dm_multipath dm_mod sg crypto_user fuse bpf_preload ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 xhci_pci crc32c_intel xhci_pci_renesas
[42522.578987] CR2: 0000000000000000
[42522.578991] ---[ end trace 9588af7b4567ad40 ]---
[42522.578995] RIP: 0010:__list_del_entry_valid+0x25/0x90
[42522.579000] Code: c3 0f 1f 40 00 48 8b 17 4c 8b 47 08 48 b8 00 01 00 00 00 00 ad de 48 39 c2 74 26 48 b8 22 01 00 00 00 00 ad de 49 39 c0 74 2b <49> 8b 30 48 39 fe 75 3a 48 8b 52 08 48 39 f2 75 48 b8 01 00 00 00
[42522.579005] RSP: 0018:ffffaac5469a7890 EFLAGS: 00010017
[42522.579009] RAX: dead000000000122 RBX: ffff9aabfbb0b158 RCX: 0000000000000000
[42522.579013] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9aac70380140
[42522.579016] RBP: ffff9aac70380098 R08: 0000000000000000 R09: 0000000000000000
[42522.579019] R10: 0000000000000001 R11: ffff9aac4a139c00 R12: ffff9aac70380010
[42522.579023] R13: ffff9aac4a139c70 R14: 0000000000000000 R15: ffff9aabc6443c00
[42522.579027] FS:  0000000000000000(0000) GS:ffff9aaf2ff40000(0000) knlGS:0000000000000000
[42522.579031] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[42522.579034] CR2: 0000000000000000 CR3: 000000006ae10005 CR4: 00000000001706e0
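
For readers decoding the oops: the faulting instruction (the <49> 8b 30 bytes in the Code: line) appears to load through R08, which is 0, after the list entry's two pointers were compared against the list poison values (RAX still holds dead000000000122). RDX is also 0, so the entry being removed in bfq_idle_extract looks like it had both pointers zeroed rather than poisoned. The following is a minimal userspace sketch of that sequence of checks, not the kernel source. The enum, the function name check_del_entry and the explicit NULL test are hypothetical additions for illustration; the kernel check has no NULL test, which is why it faults at this point.

```c
#include <stddef.h>
#include <stdint.h>

/* Same two-pointer layout as the kernel's doubly linked list node. */
struct list_head {
	struct list_head *next, *prev;
};

/* Poison values as seen in the oops (x86_64); RAX held the second one. */
#define LIST_POISON1 ((struct list_head *)(uintptr_t)0xdead000000000100ULL)
#define LIST_POISON2 ((struct list_head *)(uintptr_t)0xdead000000000122ULL)

/* Hypothetical result codes, used only by this sketch. */
enum del_check { DEL_OK, DEL_POISONED, DEL_CORRUPT, DEL_WOULD_FAULT };

/*
 * Mirrors the order of checks visible in the disassembly:
 * compare next and prev against the poison values, then follow
 * prev and next to verify they point back at the entry.
 */
static enum del_check check_del_entry(const struct list_head *entry)
{
	const struct list_head *next = entry->next;	/* RDX in the oops */
	const struct list_head *prev = entry->prev;	/* R08 in the oops */

	if (next == LIST_POISON1 || prev == LIST_POISON2)
		return DEL_POISONED;

	/*
	 * Assumption for illustration: this guard is not in the kernel.
	 * Without it, the prev->next load below is the read that faults
	 * at address 0 when prev is NULL.
	 */
	if (prev == NULL || next == NULL)
		return DEL_WOULD_FAULT;

	if (prev->next != entry || next->prev != entry)
		return DEL_CORRUPT;

	return DEL_OK;
}
```

Calling check_del_entry on a node whose next and prev are both NULL returns DEL_WOULD_FAULT, which corresponds to the state suggested by the registers above. A node that was already removed with poisoning would instead return DEL_POISONED.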
