Bug 219154
| Summary: | Crash on receiving large data over virtio_net under memory and IO load | | |
|---|---|---|---|
| Product: | Drivers | Reporter: | Takero Funaki (flintglass) |
| Component: | Network | Assignee: | drivers_network (drivers_network) |
| Status: | RESOLVED CODE_FIX | | |
| Severity: | normal | CC: | regressions |
| Priority: | P3 | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Kernel Version: | 6.10 | Subsystem: | |
| Regression: | No | Bisected commit-id: | |
| Attachments: | dmesg from a crash, custom build config, patch for v6.11-rc3, patch for v6.11-rc3, patch for v6.11-rc3, patch1 for v6.11-rc3 | | |
Description
Takero Funaki
2024-08-13 10:02:53 UTC
Created attachment 306714 [details]
dmesg from a crash
Created attachment 306715 [details]
custom build config
Wonder if it's the same problem discussed here:
https://lore.kernel.org/all/8b20cc28-45a9-4643-8e87-ba164a540c0a@oracle.com/
https://lore.kernel.org/all/20240814065914.bpnFIoTXhqGpEiCvOuj0e9Kmx0tngb1NFUPxs378JDU@z/raw
https://lore.kernel.org/all/7774ac707743ad8ce3afeacbd4bee63ac96dd927.1723617902.git.mst@redhat.com/
I brought that up there as well.

I bisected the issue to commit f9dac92ba908 (virtio_ring: enable premapped mode regardless of use_dma_api). It appears to be the same issue as discussed in the first thread:
https://lore.kernel.org/all/8b20cc28-45a9-4643-8e87-ba164a540c0a@oracle.com/
f9dac92ba9081062a6477ee015bd3b8c5914efc4: BAD
6e62702feb6d474e969b52f0379de93e9729e457: OK
However, reverting f9dac92ba908 on v6.11-rc2 did not boot up.

(In reply to Takero Funaki from comment #4)
> I bisected the issue
Great!
> However, reverting f9dac92ba908 on v6.11-rc2 did not boot up.
Then you likely need to apply two more reverts, e.g. everything from this thread:
https://lore.kernel.org/all/7774ac707743ad8ce3afeacbd4bee63ac96dd927.1723617902.git.mst@redhat.com/
Can I CC you in a reply to https://lore.kernel.org/all/m2r0aqrsq6.fsf@oracle.com/ once you tried that and posted the results here? This would expose your name and email address to the public.

Thanks for the suggestion. I couldn’t initially determine which commits needed to be reverted.
I tested applying all three reverts from the thread on v6.11-rc3 (with some modifications to resolve conflicts) and confirmed that the issue no longer reproduces. So far, I’ve only tested this on one VM, but I plan to check it in other environments as well. I will attach the modified patches I applied on tag:v6.11-rc3.
In my case, receiving data with Debian default setting net.core.high_order_alloc_disable=0, combined with memory and I/O load, triggered the crash. I hope this information is helpful for further investigation.
For the mailing list, please feel free to CC me:
Cc: Takero Funaki <flintglass@gmail.com>
Thanks.
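For anyone trying to reproduce this, it may help to confirm that the guest is actually running with net.core.high_order_alloc_disable=0, as mentioned above. The following is only a minimal sketch: the helper name `read_sysctl` is hypothetical, and it assumes the sysctl is exposed under /proc/sys with the dots in its name mapped to path separators.

```python
from pathlib import Path
from typing import Optional


def read_sysctl(name: str, proc_sys: str = "/proc/sys") -> Optional[str]:
    """Return a sysctl value as a stripped string, or None if unreadable.

    Hypothetical helper: assumes every dot in `name` is a path separator
    under `proc_sys`, which holds for net.core.high_order_alloc_disable.
    """
    path = Path(proc_sys).joinpath(*name.split("."))
    try:
        return path.read_text().strip()
    except OSError:
        # Missing file, wrong permissions, or a kernel without this sysctl.
        return None


if __name__ == "__main__":
    value = read_sysctl("net.core.high_order_alloc_disable")
    if value is None:
        print("net.core.high_order_alloc_disable is not readable here")
    else:
        print(f"net.core.high_order_alloc_disable={value}")
```

A value of 0 matches the configuration reported in this bug. The sketch only reads the setting and says nothing about whether the crash itself will reproduce.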
Created attachment 306744 [details]
patch for v6.11-rc3
Created attachment 306745 [details]
patch for v6.11-rc3
Created attachment 306746 [details]
patch for v6.11-rc3
Created attachment 306747 [details]
patch1 for v6.11-rc3
Thanks to everyone involved, Xuan's revert commits have been merged and released in the v6.11 kernel. Although Xuan's proposed fix to reimplement the disabled feature is still in progress, I am closing this issue as the visible problem has been resolved.