Bug 217620
Summary: | RCU stalls with wireguard over bonding over igb on Linux 6.3.0+ | ||
---|---|---|---|
Product: | Linux | Reporter: | Manuel 'satmd' Leiner (manuel.leiner) |
Component: | Kernel | Assignee: | Virtual assignee for kernel bugs (linux-kernel) |
Status: | RESOLVED PATCH_ALREADY_AVAILABLE | ||
Severity: | normal | CC: | bp, Jason, sam |
Priority: | P3 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | | Subsystem: | |
Regression: | No | Bisected commit-id: |
Description
Manuel 'satmd' Leiner
2023-07-01 12:40:19 UTC
I've spent the last week debugging a problem with my attempt to upgrade my kernel from 6.2.8 to 6.3.8 (now with 6.4.0 too).

The lengthy and detailed bug reports with all aspects of git bisect are at https://bugs.gentoo.org/909066

A summary:
- if I do not configure wg0, the kernel does not hang
- if I use a kernel older than commit fed8d8773b8ea68ad99d9eee8c8343bef9da2c2c, it does not hang

The commit refers to code that seems unrelated to the problem to my naive eye.

The hardware is a Dell PowerEdge R620 running Gentoo ~amd64.

I have so far excluded:
- dracut for generating the initramfs is the same version over all kernels
- linux-firmware has been the same
- CPU microcode has been the same

It's been a long time since I was seriously involved with software development, and I have been even less involved with kernel development.

Gentoo maintainers recommended that I open a bug with upstream, so here I am.

I currently have no idea how to make progress, but I'm willing to try things.

(In reply to Manuel 'satmd' Leiner from comment #0)
I've just successfully built v6.4 with fed8d8773b8ea68ad99d9eee8c8343bef9da2c2c reverted.

(In reply to Manuel 'satmd' Leiner from comment #1)
> I've just successfully built v6.4 with
> fed8d8773b8ea68ad99d9eee8c8343bef9da2c2c reverted.

... and which seems to be running stable.

Can you boot once plain 6.4 and once with the patch reverted, adding "debug ignore_loglevel log_buf_len=16M" to the kernel command line in both cases, and upload the full dmesg from both? Thx.

I will
- add those cmdline arguments permanently until the bug is resolved
- test plain v6.4 again
- test v6.4 with reverted fed8d8773b8ea68ad99d9eee8c8343bef9da2c2c
- test v6.4 with patch 54d5e4329efe0d1dba8b4a58720d29493926bed0

I will have to test those during European night time and when my health allows for it. This may take a day or two. I did a lot of tests during daytime and have to give my users a bit of rest too.

Small change of plans: After talking to Jason, I will do things in this order:
- try v6.4 with patch 54d5e4329efe0d1dba8b4a58720d29493926bed0
- test plain v6.4 again
- test v6.4 with reverted fed8d8773b8ea68ad99d9eee8c8343bef9da2c2c
all with the adjusted kernel arguments.

The patch 54d5e4329efe0d1dba8b4a58720d29493926bed0 allows me to successfully boot v6.4. I'd prefer to skip the other tests if we're able to agree that we don't need them anymore. :)

Your patch works for me.
Tested-by: Manuel Leiner <manuel.leiner@gmx.de>

https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/commit/?id=7387943fa35516f6f8017a3b0e9ce48a3bef9faa
The fix hit the net tree. Will be in the next stable.
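
For anyone repeating the dmesg capture requested in this thread, it can help to confirm that the booted kernel really received the three parameters ("debug ignore_loglevel log_buf_len=16M") before uploading logs. The following is a minimal sketch, not part of the original report: the helper name `missing_params` is hypothetical, and it assumes a plain whitespace-separated command line as found in /proc/cmdline on Linux.

```python
from pathlib import Path

# Parameters requested in the thread for capturing a full dmesg.
REQUESTED = ("debug", "ignore_loglevel", "log_buf_len=16M")


def missing_params(cmdline, requested=REQUESTED):
    """Return the requested parameters that do not appear in cmdline.

    Assumption: parameters are separated by whitespace and none of the
    requested ones contain quoted spaces, so a simple split is enough.
    """
    present = set(cmdline.split())
    return [p for p in requested if p not in present]


def read_cmdline(path="/proc/cmdline"):
    """Read the command line of the running kernel (Linux only)."""
    return Path(path).read_text()
```

On the test machine, `missing_params(read_cmdline())` returning an empty list would suggest that all three parameters were passed at boot; any names it returns still need to be added in the bootloader configuration.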