I have since reverted to the working LTS kernel image offered by Arch Linux (4.19.13), but am willing to re-test / gather additional data during a couple of lower-use time periods during the week.

After updating to Linux 4.20.0 (along with a full system update otherwise), my BRIDGED network connections to some LXC containers ceased working. Attempting to troubleshoot this issue also produced extremely odd results, which I think offhand MIGHT have caused network packets to fill up some kind of memory buffer instead of being relayed or dropped. There are some additional details in the serverfault question and LXC bug that I filed, as it was initially (and still is) unclear where the actual issue is. At this time I am unsure if it is related to netdev (bridge, veth), cgroups, or some changed default that should now be configured differently from previous defaults.

https://serverfault.com/questions/947848/linux-bridge-broken-after-upgrade-out-of-ideas-places-to-look-now-4-20-0-arc
https://github.com/lxc/lxc/issues/2769

* It is NOT related to IP forwarding, as this is a BRIDGED connection, not a routed one, and it works on older kernels without that enabled.
* Physical network to bridge works (and will stay connected for a few minutes after later troubleshooting steps, even if ARP caches / ping flake out and stop responding).
* VETH (within LXC) can ping the host IP on the bridge (but not the gateway, which the host can reach before this step) if manually assigned a static address. Doing this seems to cause general instability and a timed-out SSH session.

This led me to reboot between each round of testing to ensure I had a clean slate to start with.

I went over the major settings that I did check in the other two bug reports, but I'm open to checking other values and/or performing different kinds of tests occasionally over a given week. Responses won't be immediate, but I'll try to check on this frequently over the next two weeks.
Try applying this patch:
https://marc.info/?l=linux-netdev&m=154696956604748&w=2

It solved it for me. What qdisc do you use?
(tc qdisc will list them; I was using fq, which is why it hit me)
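For anyone who wants to check many interfaces at once, here is a minimal sketch that parses the text printed by `tc qdisc show` and reports which interfaces use a given qdisc. The helper names (`parse_qdiscs`, `interfaces_using`, `read_tc_output`) and the `SAMPLE` text are made up for illustration, and the regular expression assumes the usual `qdisc <kind> <handle> dev <name> ...` line layout.

```python
import re
import subprocess

# Assumed line layout: "qdisc <kind> <handle> dev <name> ..."
QDISC_LINE = re.compile(r"^qdisc\s+(?P<kind>\S+)\s+\S+\s+dev\s+(?P<dev>\S+)")


def parse_qdiscs(tc_output):
    """Map each interface name to the list of qdisc kinds attached to it."""
    result = {}
    for line in tc_output.splitlines():
        match = QDISC_LINE.match(line.strip())
        if match:
            result.setdefault(match.group("dev"), []).append(match.group("kind"))
    return result


def interfaces_using(qdiscs, kind="fq"):
    """Return the sorted interface names that have the given qdisc kind."""
    return sorted(dev for dev, kinds in qdiscs.items() if kind in kinds)


def read_tc_output():
    """Run `tc qdisc show` on the local machine and return its text output."""
    completed = subprocess.run(
        ["tc", "qdisc", "show"], capture_output=True, text=True, check=True
    )
    return completed.stdout


# Illustrative sample only, not output captured from the reporter's system.
SAMPLE = """\
qdisc noqueue 0: dev lo root refcnt 2
qdisc fq 0: dev eth0 root refcnt 2 limit 10000p flow_limit 100p
qdisc noqueue 0: dev br0 root refcnt 2
"""

print(interfaces_using(parse_qdiscs(SAMPLE)))
```

On a live system, `interfaces_using(parse_qdiscs(read_tc_output()))` would give the same kind of list for the real configuration, provided `tc` is installed and on the PATH.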
(In reply to Ian Kumlien from comment #1)
> Try applying this patch:
> https://marc.info/?l=linux-netdev&m=154696956604748&w=2
>
> It solved it for me, what qdisc do you use?
> (tc qdisc will list them - I was using fq which is why it hit me)

Thank you, I can confirm that applying that single-line patch DOES make the difference and resolves the issue (for me). However, as the currently published kernel versions still need this patch back-ported, this bug shouldn't be closed yet.

Arch Linux had a package that made testing a custom kernel build easier, but it was based on 4.20.2, so I re-tested without the patch (failed, as expected) and with it (appears to be working, as hoped).
Yeah, it didn't make 4.20.2. It has been picked up and marked for -stable, so hopefully it will be in 4.20.3 :)
FYI, it's in the current pull set posted to Linus. It is patch 15 in:
https://marc.info/?l=linux-netdev&m=154741526902566&w=2
I've had a similar issue with bridged networking in QEMU (TAP networking to a bridge with an enslaved host interface), and the patch mentioned above did solve it. In my case both the VM and the host lost internet connectivity; setting the host interface down, then up again, brought networking back for the host.

For anyone looking for this: it's not yet included in 4.20.3 either.
It's included in 4.20.5-rc1, so it should be in 4.20.5 final ;)
Released and confirmed working. IMHO this bug report can be closed as fixed in 4.20.5.
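To summarize the version history from this thread, here is a rough sketch that checks whether a kernel release string falls in the range that lacks the fix. It assumes, based only on the comments above, that mainline releases 4.20.0 up to (but not including) 4.20.5 are affected. Distribution kernels may carry the patch earlier as a backport, so treat the result as a hint rather than a verdict. The function names and example release strings are illustrative.

```python
import platform
import re

# Assumption taken from this thread: breakage appeared in 4.20.0 and the
# fix was released in 4.20.5 (4.19.13 LTS was reported as working).
FIRST_AFFECTED = (4, 20, 0)
FIXED_IN = (4, 20, 5)


def parse_release(release):
    """Extract (major, minor, patch) from a string like '4.20.2-arch1-1-ARCH'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return tuple(int(part) if part else 0 for part in match.groups())


def lacks_fix(release):
    """True if the release is in the range this thread reports as unfixed."""
    return FIRST_AFFECTED <= parse_release(release) < FIXED_IN


# Illustrative release strings; the distribution suffixes are examples only.
for example in ("4.19.13-1-lts", "4.20.2-arch1-1-ARCH", "4.20.5-arch1-1-ARCH"):
    print(example, lacks_fix(example))
```

To test the running kernel, pass `platform.release()` to `lacks_fix`.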