Bug 206181
Summary: | [PATCH] x86_32: Panic caused by systemd-udevd on Hyper-V (triggered by memory hot-add) | | |
---|---|---|---|
Product: | Memory Management | Reporter: | Taketo Kabe (kkabe) |
Component: | Other | Assignee: | Andrew Morton (akpm) |
Status: | NEW --- | | |
Severity: | normal | | |
Priority: | P1 | | |
Hardware: | i386 | | |
OS: | Linux | | |
Kernel Version: | 4.19.95 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: | add printk's to see hv_balloon memory hot-add function; do not __online_page_free() the hot-added memory in hv_balloon | | |
Description
Taketo Kabe
2020-01-13 08:38:02 UTC
I've tracked this down to "udevadm trigger --type=devices --action=add" invoked from /usr/lib/systemd/system/systemd-udev-trigger.service.

To control when the panic occurs:
- Boot the kernel with the "init=/bin/sh" kernel option.
- In the init shell, type
  sh# chmod -x /usr/bin/udevadm
  to make the systemd-udev service fail.
- Continue booting by typing
  sh# exec /usr/lib/systemd/systemd
The system doesn't panic when udevadm is disabled.

To cause the panic:
- sh# chmod +x /usr/bin/udevadm
- sh# udevadm trigger --type=devices --action=add
After 46 seconds, the kernel will panic.

Invoking "udevadm monitor &" beforehand does not log anything suspicious; udevd seems quiescent.
Since systemd-udevd is involved, this bug may not be a memory management issue but something related to the systemd-udevd<=>kernel interface (inotify?).

FYI: kernel-4.19.94 also panics, but at a different location:

[ 202.109493] BUG: unable to handle kernel paging request at eb800000
[ 202.114991] *pde = 00000000
[ 202.115635] Oops: 0002 [#1] SMP
[ 202.116538] CPU: 0 PID: 184 Comm: kworker/0:4 Not tainted 4.19.94-1.el8.i586 #1
[ 202.116882] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090006 05/23/2012
[ 202.116882] Workqueue: events hot_add_req [hv_balloon]
[ 202.116882] EIP: sparse_add_one_section+0xcb/0x12e
[ 202.116882] Code: 45 ec e8 ce 5b 00 00 8b 55 e4 89 45 e8 89 d0 c1 e0 04 8d b0 c0 f7 dc c4 f6 80 c0 f7 dc c4 01 75 44 b0 ff b9 00 00 0a 00 89 df <f3> aa 89 f0 2d c0 f7 dc c4 c1 f8 04 3b 05 80 f7 dc c4 7e 05 a3 80
[ 202.116882] EAX: 000000ff EBX: eb800000 ECX: 000a0000 EDX: 0000000d
[ 202.116882] ESI: c4dcf890 EDI: eb800000 EBP: df8bbe48 ESP: df8bbe2c
[ 202.116882] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010046
[ 202.116882] CR0: 80050033 CR2: eb800000 CR3: 014fb000 CR4: 003406d0
[ 202.116882] Call Trace:
[ 202.116882] __add_pages+0x89/0x100
[ 202.116882] arch_add_memory+0x3c/0x50
[ 202.116882] add_memory_resource+0x125/0x180
[ 202.116882] __add_memory+0xad/0x130
[ 202.116882] add_memory+0x2c/0x3a
[ 202.116882] hot_add_req+0x3de/0x60b [hv_balloon]
[ 202.116882] process_one_work+0x17b/0x340
[ 202.116882] worker_thread+0x39/0x3d0
[ 202.116882] kthread+0xf0/0x110
[ 202.116882] ? pwq_unbound_release_workfn+0xc0/0xc0
[ 202.116882] ? kthread_bind+0x30/0x30
[ 202.116882] ret_from_fork+0x2e/0x40
[ 202.116882] Modules linked in: intel_rapl_perf pcspkr i2c_piix4 sg hv_balloon hv_utils joydev rfkill zram ext4 mbcache jbd2 loop nls_utf8 isofs sr_mod cdrom sd_mod ata_generic 8021q garp mrp stp llc hv_netvsc hv_storvsc scsi_transport_fc hyperv_keyboard hid_hyperv ata_piix libata hyperv_fb crc32_pclmul hv_vmbus serio_raw sunrpc xts lrw dm_crypt dm_round_robin dm_multipath dm_snapshot dm_bufio dm_mirror dm_region_hash dm_log dm_zero dm_mod linear raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_intel raid1 raid0 iscsi_ibft squashfs cramfs be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb4i cxgb4 libcxgbi libcxgb iscsi_tcp libiscsi_tcp libiscsi edd scsi_transport_iscsi
[ 202.116882] CR2: 00000000eb800000
[ 202.116882] ---[ end trace 8cf405307c8a67b5 ]---
[ 202.116882] EIP: sparse_add_one_section+0xcb/0x12e
[ 202.116882] Code: 45 ec e8 ce 5b 00 00 8b 55 e4 89 45 e8 89 d0 c1 e0 04 8d b0 c0 f7 dc c4 f6 80 c0 f7 dc c4 01 75 44 b0 ff b9 00 00 0a 00 89 df <f3> aa 89 f0 2d c0 f7 dc c4 c1 f8 04 3b 05 80 f7 dc c4 7e 05 a3 80
[ 202.116882] EAX: 000000ff EBX: eb800000 ECX: 000a0000 EDX: 0000000d
[ 202.116882] ESI: c4dcf890 EDI: eb800000 EBP: df8bbe48 ESP: c4c633bc
[ 202.116882] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010046
[ 202.116882] CR0: 80050033 CR2: eb800000 CR3: 014fb000 CR4: 003406d0
[ 202.116882] Kernel panic - not syncing: Fatal exception
[ 202.116882] Kernel Offset: 0x3000000 from 0xc1000000 (relocation range: 0xc0000000-0xe07effff)
[ 202.116882] ---[ end Kernel panic - not syncing: Fatal exception ]---

After experimenting: disabling the hv_balloon driver with the "module_blacklist=hv_balloon" kernel option avoids the panic.
Something seems wrong in the 32-bit drivers/hv/hv_balloon.c driver.

When the module parameter "hv_balloon.pressure_report_delay=60" is passed on the kernel command line, the time between "udevadm trigger --type=devices --action=add" and the panic changes accordingly, so something that runs after pressure_report_delay expires (default: static uint pressure_report_delay=45) seems to be wrong, in drivers/hv/hv_balloon.c:post_status(). Nothing there looks wrong to my eyes, though...

Adding "hv_balloon.hot_add=0" to the kernel command line seems to make it not panic.
Something seems wrong with the 32-bit memory hot-add code path.

I've also seen a panic going through hot_add_req() [hv_balloon] several times.

I have succeeded in capturing a panic involving drivers/hv/hv_balloon.c:hot_add_req(). Reproducibility unknown.

[ 73.268544] BUG: unable to handle kernel paging request at eb800000
[ 73.270157] *pde = 00000000
[ 73.271192] Oops: 0002 [#1] SMP
[ 73.271897] CPU: 0 PID: 5 Comm: kworker/0:0 Not tainted 4.19.95-1.el8.i586 #1
[ 73.273535] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090006 05/23/2012
[ 73.275487] Workqueue: events hot_add_req [hv_balloon]
[ 73.276610] EIP: sparse_add_one_section+0xcb/0x12e
[ 73.277715] Code: 45 ec e8 ce 5b 00 00 8b 55 e4 89 45 e8 89 d0 c1 e0 04 8d b0 c0 f7 1c c6 f6 80 c0 f7 1c c6 01 75 44 b0 ff b9 00 00 0a 00 89 df <f3> aa 89 f0 2d c0 f7 1c c6 c1 f8 04 3b 05 80 f7 1c c6 7e 05 a3 80
[ 73.277753] EAX: 000000ff EBX: eb800000 ECX: 000a0000 EDX: 0000000d
[ 73.277753] ESI: c61cf890 EDI: eb800000 EBP: dbd2be48 ESP: dbd2be2c
[ 73.277753] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010046
[ 73.277753] CR0: 80050033 CR2: eb800000 CR3: 06228000 CR4: 003406d0
[ 73.277753] Call Trace:
[ 73.277753] __add_pages+0x89/0x100
[ 73.277753] arch_add_memory+0x3c/0x50
[ 73.277753] add_memory_resource+0x125/0x180
[ 73.277753] __add_memory+0xad/0x130
[ 73.277753] add_memory+0x2c/0x3a
[ 73.277753] hot_add_req+0x3de/0x60b [hv_balloon]
[ 73.277753] process_one_work+0x17b/0x340
[ 73.277753] worker_thread+0x39/0x3d0
[ 73.277753] kthread+0xf0/0x110
[ 73.277753] ? pwq_unbound_release_workfn+0xc0/0xc0
[ 73.277753] ? kthread_bind+0x30/0x30
[ 73.277753] ret_from_fork+0x2e/0x40
[ 73.277753] Modules linked in: xfs libfc rfkill zram intel_rapl_perf i2c_piix4 pcspkr sg hv_utils hv_balloon joydev ext4 mbcache jbd2 loop nls_utf8 isofs sr_mod cdrom sd_mod 8021q garp ata_generic mrp stp llc hv_netvsc hv_storvsc scsi_transport_fc hyperv_keyboard hid_hyperv ata_piix libata hyperv_fb crc32_pclmul hv_vmbus serio_raw sunrpc xts lrw dm_crypt dm_round_robin dm_multipath dm_snapshot dm_bufio dm_mirror dm_region_hash dm_log dm_zero dm_mod linear raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_intel raid1 raid0 iscsi_ibft squashfs cramfs be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb4i cxgb4 libcxgbi libcxgb iscsi_tcp libiscsi_tcp libiscsi edd scsi_transport_iscsi
[ 73.277753] CR2: 00000000eb800000
[ 73.277753] ---[ end trace 6d1ee839ab38607d ]---
[ 73.277753] EIP: sparse_add_one_section+0xcb/0x12e
[ 73.277753] Code: 45 ec e8 ce 5b 00 00 8b 55 e4 89 45 e8 89 d0 c1 e0 04 8d b0 c0 f7 1c c6 f6 80 c0 f7 1c c6 01 75 44 b0 ff b9 00 00 0a 00 89 df <f3> aa 89 f0 2d c0 f7 1c c6 c1 f8 04 3b 05 80 f7 1c c6 7e 05 a3 80
[ 73.277753] EAX: 000000ff EBX: eb800000 ECX: 000a0000 EDX: 0000000d
[ 73.277753] ESI: c61cf890 EDI: eb800000 EBP: dbd2be48 ESP: c60633bc
[ 73.277753] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010046
[ 73.277753] CR0: 80050033 CR2: eb800000 CR3: 06228000 CR4: 003406d0
[ 73.277753] Kernel panic - not syncing: Fatal exception
[ 73.277753] Kernel Offset: 0x4400000 from 0xc1000000 (relocation range: 0xc0000000-0xe07effff)
[ 73.277753] ---[ end Kernel panic - not syncing: Fatal exception ]---

Created attachment 286859 [details]
add printk's to see hv_balloon memory hot-add function
Not for production
Patched hv_balloon.c with lots of printk's (pr_info()).
I first thought hv_balloon was adding memory outside the MAXMEM range, but that wasn't the case; it was failing after the first add_memory() of 128MB.
Is hot-add add_memory() on X86_32 functional?

[ 56.719004] hv_balloon: Max. dynamic memory size: 1048576 MB
[ 57.275031] hv_balloon: Received DM_MEM_HOT_ADD_REQUEST size 24
[ 57.277162] hv_balloon: DM_MEM_HOT_ADD_REQUEST received size 24, should be 16
[ 57.280537] hv_balloon: Received partial hot-add request
[ 57.282442] hv_balloon: ha_wrk.ha_page_range{.finfo.start_page=65536, .finfo.page_cnt=16896}
[ 57.284524] hv_balloon: ha_wrk.ha_region_range{.finfo.start_page=65536, .finfo.page_cnt=229888}
[ 57.287729] hv_balloon: Invoking process_hot_add(pg_start=65536, pfn_cnt=16896, tg_start=65536, rg_sz=229888
[ 57.290013] hv_balloon: process_hot_add: rg_size=229888 ha_region={ .start_pfn=65536, .ha_end_pfn=65536, .covered_start_pfn=65536, .covered_end_pfn=65536, .end_pfn=295424 }
[ 57.293727] hv_balloon: add_memory(nid=0 PFN_PHYS(start_pfn=65536)=0x10000000, HA_CHUNK<<PAGE_SHIFT=134217728)
[ 57.557961] BUG: unable to handle kernel paging request at d3fff000
[ 57.564008] *pde = 00000000
[ 57.566879] Oops: 0002 [#1] SMP
[ 57.567597] CPU: 0 PID: 405 Comm: systemd-udevd Tainted: G E 4.19.95-1.el8.i586 #1
[ 57.567597] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090006 05/23/2012
[ 57.567597] EIP: wp_page_copy+0x9c/0x780
[ 57.567597] Code: 75 e8 85 f6 0f 84 9c 05 00 00 8b 45 e8 e8 1c 5d e7 ff 89 45 d4 8b 45 e4 e8 11 5d e7 ff 8b 55 d4 8d 78 04 8b 0a 83 e7 fc 89 d6 <89> 08 8b 8a fc 0f 00 00 89 88 fc 0f 00 00 89 c1 29 f9 89 55 d4 29
[ 57.567597] EAX: d3fff000 EBX: c566df04 ECX: 00000000 EDX: c3868000
[ 57.567597] ESI: c3868000 EDI: d3fff004 EBP: c566dec8 ESP: c566de98
[ 57.567597] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00210282
[ 57.567597] CR0: 80050033 CR2: d3fff000 CR3: 051bc000 CR4: 003406d0
[ 57.567597] Call Trace:
[ 57.567597] do_wp_page+0x8a/0x600
[ 57.567597] handle_mm_fault+0x8b0/0xfb0
[ 57.567597] __do_page_fault+0x1c3/0x480
[ 57.567597] do_page_fault+0x25/0xf0
[ 57.567597] ? __do_page_fault+0x480/0x480
[ 57.567597] common_exception+0x11d/0x12e
[ 57.567597] EIP: 0xb7b3f104
[ 57.567597] Code: 29 f9 89 4c 24 10 83 f9 0f 0f 86 92 00 00 00 8b 45 40 8d 14 3e 8b 4c 24 0c 39 48 0c 75 74 8b 4c 24 0c 81 7c 24 10 ef 03 00 00 <89> 42 08 89 4a 0c 89 55 40 89 50 0c 76 0e c7 42 10 00 00 00 00 c7
[ 57.567597] EAX: b7c657d8 EBX: 00001190 ECX: b7c657d8 EDX: 0169a108
[ 57.567597] ESI: 016990f8 EDI: 00001010 EBP: b7c657a0 ESP: bfefcc80
[ 57.567597] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00210293
[ 57.567597] Modules linked in: rfkill snd_pcm snd_timer crc32_pclmul snd soundcore intel_rapl_perf pcspkr hv_balloon(E) hv_netvsc i2c_piix4 hyperv_fb hv_utils sg joydev ip_tables ext4 mbcache jbd2 sr_mod cdrom sd_mod ata_generic ata_piix hyperv_keyboard hid_hyperv hv_storvsc scsi_transport_fc libata crc32c_intel serio_raw hv_vmbus
[ 57.567597] CR2: 00000000d3fff000
[ 57.567597] ---[ end trace 08a505f0f046453d ]---
[ 57.567597] EIP: wp_page_copy+0x9c/0x780
[ 57.567597] Code: 75 e8 85 f6 0f 84 9c 05 00 00 8b 45 e8 e8 1c 5d e7 ff 89 45 d4 8b 45 e4 e8 11 5d e7 ff 8b 55 d4 8d 78 04 8b 0a 83 e7 fc 89 d6 <89> 08 8b 8a fc 0f 00 00 89 88 fc 0f 00 00 89 c1 29 f9 89 55 d4 29
[ 57.567597] EAX: d3fff000 EBX: c566df04 ECX: 00000000 EDX: c3868000
[ 57.567597] ESI: c3868000 EDI: d3fff004 EBP: c566dec8 ESP: c32633bc
[ 57.567597] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00210282
[ 57.567597] CR0: 80050033 CR2: d3fff000 CR3: 051bc000 CR4: 003406d0
[ 57.567597] Kernel panic - not syncing: Fatal exception
[ 57.567597] Kernel Offset: 0x1600000 from 0xc1000000 (relocation range: 0xc0000000-0xc87effff)
[ 57.567597] ---[ end Kernel panic - not syncing: Fatal exception ]---

When the call to register_memory() from drivers/base/memory.c:init_memory_block() is squelched, it doesn't panic.
drivers/base/memory.c:register_memory(struct memory_block *memory) initializes the struct device memory->dev and passes it down to device_register(&memory->dev), but my wild guess is that the memory->dev initialization is not enough. I have no idea what is missing.

Current register_memory():

/*
 * register_memory - Setup a sysfs device for a memory block
 */
static int register_memory(struct memory_block *memory)
{
        int ret;

        memory->dev.bus = &memory_subsys;
        memory->dev.id = memory->start_section_nr / sections_per_block;
        memory->dev.release = memory_block_release;
        memory->dev.groups = memory_memblk_attr_groups;
        memory->dev.offline = memory->state == MEM_OFFLINE;

        ret = device_register(&memory->dev);
        if (ret)
                put_device(&memory->dev);

        return ret;
}

drivers/base/memory.c:hotplug_memory_register() looks like this:

/*
 * need an interface for the VM to add new memory regions,
 * but without onlining it.
 */
int hotplug_memory_register(int nid, struct mem_section *section)
{
        int ret = 0;
        struct memory_block *mem;

        mutex_lock(&mem_sysfs_mutex);

        mem = find_memory_block(section);
        if (mem) {
                mem->section_count++;
                put_device(&mem->dev);
        } else {
                ret = init_memory_block(&mem, section, MEM_OFFLINE);
                if (ret)
                        goto out;
                mem->section_count++;
        }

out:
        mutex_unlock(&mem_sysfs_mutex);
        return ret;
}

There's nothing suspicious here, but the difference from the unproblematic memory add during boot is that the boot path calls init_memory_block(,,MEM_ONLINE); if I change the call in hotplug_memory_register() to init_memory_block(,,MEM_ONLINE), it doesn't panic.
Hot-adding offline memory (and the callback for onlining it?) seems to have some problem.

Created attachment 286941 [details]
do not __online_page_free() the hot-added memory in hv_balloon
Patch which makes it not panic, and the hot-added memory properly recognized by the Hyper-V guest
The patch posted as Comment 11 (https://bugzilla.kernel.org/show_bug.cgi?id=206181#c11) seems to keep the kernel from panicking, and the guest recognizes the memory that the hv_balloon module hot-adds in response to memory pressure.
I guess you shouldn't __online_page_free(pg) a page you have just hot-added and brought online. __online_page_free() should be called when memory hot-remove is requested, which the current hv_balloon driver doesn't implement.
Any thoughts?
I don't know why memory hot-add was working for others before the patch.