Bug 219154

Summary: Crash on receiving large data over virtio_net under memory and IO load
Product: Drivers
Reporter: Takero Funaki (flintglass)
Component: Network
Assignee: drivers_network (drivers_network)
Status: RESOLVED CODE_FIX
Severity: normal
CC: regressions
Priority: P3
Hardware: All
OS: Linux
Kernel Version: 6.10
Subsystem:
Regression: No
Bisected commit-id:
Attachments: dmesg from a crash
custom build config
patch for v6.11-rc3
patch for v6.11-rc3
patch for v6.11-rc3
patch1 for v6.11-rc3

Description Takero Funaki 2024-08-13 10:02:53 UTC
Hello,

I've encountered repeated crashes or freezes when a KVM VM receives large amounts of data over the network while the system is under memory load and performing I/O operations. The crashes sometimes occur in the filesystem code (ext4 and btrfs, at least), but they also happen in other locations.

This issue occurs on my custom builds using kernel versions v6.10 to v6.11-rc2, with virtio network and disk drivers, and either Ubuntu 22.04 or Debian 12 user space.

The same kernel build did not crash on an Azure VM, which does not use the virtio network driver. Since this issue only appears when receiving data, I suspect there could be an issue related to the virtio interface or receive buffer handling.

This issue did not occur on the Debian backport kernel 6.9.7-1~bpo12+1 amd64.

Steps to Reproduce:
1. Set up a small VM on a KVM host.
   I tested this on an x86_64 KVM VM with 1 CPU, 512 MB RAM, and 2 GB swap (the smallest configuration from Vultr), using a Debian 12 user space, virtio disk, and virtio net.
2. Induce high memory and I/O load. Run the following command:
   stress --vm 2 --hdd 1
   (Adjust --vm to occupy all the RAM.)
   This slows down the system but does not cause a crash.
3. Send large data to the VM.
   I used `iperf3 -s` on the VM and sent data using `iperf3 -c` from another host. The system crashes within a few seconds to a few minutes. (The reverse direction `iperf3 -c -R` did not cause a crash.)
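
The steps above can be sketched as a small helper that only assembles the command lines. This is an illustration, not a script used in the original report: the helper names and the host name `vm.example` are hypothetical, and only the flags quoted in the steps (`stress --vm/--hdd`, `iperf3 -s/-c/-R`) are used.

```python
import shlex


def stress_cmd(vm_workers=2, hdd_workers=1):
    """Memory and I/O load from step 2; raise vm_workers until RAM is full."""
    return ["stress", "--vm", str(vm_workers), "--hdd", str(hdd_workers)]


def iperf3_server_cmd():
    """Receiver side from step 3, run on the VM under test."""
    return ["iperf3", "-s"]


def iperf3_client_cmd(server, reverse=False):
    """Sender side from step 3, run on another host.

    reverse=True adds -R (the VM sends instead of receives), which the
    report says did not trigger the crash.
    """
    cmd = ["iperf3", "-c", server]
    if reverse:
        cmd.append("-R")
    return cmd


if __name__ == "__main__":
    # "vm.example" is a placeholder host name, not taken from the report.
    print("on the VM:     ", shlex.join(stress_cmd()), "&")
    print("on the VM:     ", shlex.join(iperf3_server_cmd()))
    print("on another host:", shlex.join(iperf3_client_cmd("vm.example")))
```

The sketch prints the commands instead of running them, since running them on an affected kernel is expected to crash the VM.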


The Oops messages are mostly general protection faults, but sometimes I see "Bad pagetable" or other errors, such as:
Oops: general protection fault, probably for non-canonical address 0x2f9b7fa5e2bde696: 0000 [#1] PREEMPT SMP PTI
Oops: Oops: 0000 [#1] PREEMPT SMP PTI
Oops: Bad pagetable: 000d [#1] PREEMPT SMP PTI

In some cases, dmesg contains something like:
UBSAN: shift-out-of-bounds in lib/xarray.c:158:34

When the system freezes without crashing, I sometimes found BUG messages in dmesg, such as:
get_swap_device: Bad swap file entry 3403b0f5b2584992
BUG: Bad page map in process stress  pte:c42f93fac0299e1d pmd:0d9b2047
BUG: Bad rss-counter-state mm:000000004df3dd9a type:MM_ANONPAGES val:2
BUG: Bad rss-counter-state mm:000000004df3dd9a type:MM_SWAPENTS val:-1
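
To check whether a given dmesg capture shows the same symptoms, the messages quoted above can be matched with a few regular expressions. The following is a minimal sketch: the signature names and the `classify` helper are made up for illustration, and the patterns cover only the strings quoted in this report, so other corruption symptoms would go unnoticed.

```python
import re

# Patterns are taken from the messages quoted in this report only.
SIGNATURES = {
    "general protection fault": re.compile(r"Oops: general protection fault"),
    "bad pagetable": re.compile(r"Oops: Bad pagetable"),
    "ubsan shift": re.compile(r"UBSAN: shift-out-of-bounds"),
    "bad swap entry": re.compile(r"get_swap_device: Bad swap file entry"),
    "bad page map": re.compile(r"BUG: Bad page map"),
    "bad rss counter": re.compile(r"BUG: Bad rss-counter-state"),
}

# Sample lines copied from the report.
SAMPLE = """\
Oops: general protection fault, probably for non-canonical address 0x2f9b7fa5e2bde696: 0000 [#1] PREEMPT SMP PTI
Oops: Oops: 0000 [#1] PREEMPT SMP PTI
Oops: Bad pagetable: 000d [#1] PREEMPT SMP PTI
UBSAN: shift-out-of-bounds in lib/xarray.c:158:34
get_swap_device: Bad swap file entry 3403b0f5b2584992
BUG: Bad page map in process stress  pte:c42f93fac0299e1d pmd:0d9b2047
BUG: Bad rss-counter-state mm:000000004df3dd9a type:MM_ANONPAGES val:2
BUG: Bad rss-counter-state mm:000000004df3dd9a type:MM_SWAPENTS val:-1
"""


def classify(lines):
    """Count how many lines match each known signature."""
    counts = {}
    for line in lines:
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[name] = counts.get(name, 0) + 1
    return counts


if __name__ == "__main__":
    for name, count in sorted(classify(SAMPLE.splitlines()).items()):
        print(f"{count:3d}  {name}")
```

For a real capture, `classify` can be given the lines of a saved dmesg file instead of `SAMPLE`.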

Thanks.
Comment 1 Takero Funaki 2024-08-13 10:04:50 UTC
Created attachment 306714
dmesg from a crash
Comment 2 Takero Funaki 2024-08-13 10:05:33 UTC
Created attachment 306715
custom build config
Comment 4 Takero Funaki 2024-08-15 13:18:28 UTC
I bisected the issue to commit f9dac92ba908 (virtio_ring: enable premapped mode regardless of use_dma_api).

It appears to be the same issue as discussed in the first thread:
https://lore.kernel.org/all/8b20cc28-45a9-4643-8e87-ba164a540c0a@oracle.com/

f9dac92ba9081062a6477ee015bd3b8c5914efc4: BAD
6e62702feb6d474e969b52f0379de93e9729e457: OK

However, reverting f9dac92ba908 on v6.11-rc2 did not boot up.
Comment 5 The Linux kernel's regression tracker (Thorsten Leemhuis) 2024-08-15 13:30:22 UTC
(In reply to Takero Funaki from comment #4)
> I bisected the issue 

Great!

> However, reverting f9dac92ba908 on v6.11-rc2 did not boot up.

Then you likely need to apply two more reverts, e.g. everything from this thread:
https://lore.kernel.org/all/7774ac707743ad8ce3afeacbd4bee63ac96dd927.1723617902.git.mst@redhat.com/

Can I CC you in a reply to https://lore.kernel.org/all/m2r0aqrsq6.fsf@oracle.com/ once you have tried that and posted the results here? This would expose your name and email address to the public.
Comment 6 Takero Funaki 2024-08-16 03:19:57 UTC
Thanks for the suggestion. I couldn’t initially determine which commits needed to be reverted.

I tested applying all three reverts from the thread on v6.11-rc3 (with some modifications to resolve conflicts) and confirmed that the issue no longer reproduces. So far, I’ve only tested this on one VM, but I plan to check it in other environments as well.
I will attach the modified patches I applied on tag:v6.11-rc3.

In my case, receiving data with the Debian default setting net.core.high_order_alloc_disable=0, combined with memory and I/O load, triggered the crash.
I hope this information is helpful for further investigation.
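
For anyone comparing environments, the setting mentioned above can be read from procfs. This is a minimal sketch with hypothetical helper names; it assumes the sysctl is exposed at `/proc/sys/net/core/high_order_alloc_disable`, following the usual mapping of sysctl names to procfs paths. It only reports the value: the crash was observed with the value 0, and whether a different value avoids it was not tested here.

```python
from pathlib import Path

# Assumed procfs location of net.core.high_order_alloc_disable.
SYSCTL_PATH = Path("/proc/sys/net/core/high_order_alloc_disable")


def parse_high_order_enabled(text):
    """Return True if the sysctl text is 0 (high-order allocations allowed),
    False for any other integer, and None if the text is not an integer."""
    try:
        return int(text.strip()) == 0
    except ValueError:
        return None


def high_order_allocs_enabled(path=SYSCTL_PATH):
    """Read the sysctl file; return None if it is missing or unreadable."""
    try:
        return parse_high_order_enabled(Path(path).read_text())
    except OSError:
        return None


if __name__ == "__main__":
    state = high_order_allocs_enabled()
    if state is None:
        print("net.core.high_order_alloc_disable: not available")
    elif state:
        print("net.core.high_order_alloc_disable=0 (the setting used in this report)")
    else:
        print("net.core.high_order_alloc_disable is non-zero")
```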

For the mailing list, please feel free to CC me:
Cc: Takero Funaki <flintglass@gmail.com>

Thanks.
Comment 7 Takero Funaki 2024-08-16 03:22:15 UTC
Created attachment 306744
patch for v6.11-rc3
Comment 8 Takero Funaki 2024-08-16 03:22:29 UTC
Created attachment 306745
patch for v6.11-rc3
Comment 9 Takero Funaki 2024-08-16 03:22:45 UTC
Created attachment 306746
patch for v6.11-rc3
Comment 10 Takero Funaki 2024-08-16 03:23:49 UTC
Created attachment 306747
patch1 for v6.11-rc3
Comment 11 Takero Funaki 2024-09-16 09:51:08 UTC
Thanks to everyone involved. Xuan's revert commits have been merged and released in the v6.11 kernel. Although Xuan's proposed fix to reimplement the disabled feature is still in progress, I am closing this issue because the visible problem has been resolved.