Created attachment 304350 [details]
config used for the kernel
After building 6.4.0rc4 for my VM running on a Windows 10 Hyper-V host, I see the following
[ 756.697753] net_ratelimit: 34 callbacks suppressed
[ 756.697806] hv_netvsc cd9dd876-2fa9-4764-baa7-b44482f85f9f eth0: nvsp_rndis_pkt_complete error status: 2
(snipped repeated messages)
*but* I'm only able to reproduce this reliably if I'm generating garbage on another terminal, e.g. sudo strings /dev/sda
This doesn't appear to affect latency or bandwidth much; I ran an iperf3 test between the guest and the host while trying to trigger these messages.
If you take 17-18 gigabit as the "base" speed, you can see it drop slightly to ~16 gigabit while the errors happen, then "catch up" once I stop spamming the console.
[ 5] 99.00-100.00 sec 1.89 GBytes 16.2 Gbits/sec
[ 5] 100.00-101.00 sec 1.91 GBytes 16.4 Gbits/sec
[ 5] 101.00-102.00 sec 1.91 GBytes 16.4 Gbits/sec
[ 5] 102.00-103.00 sec 1.91 GBytes 16.4 Gbits/sec
[ 5] 103.00-104.00 sec 1.92 GBytes 16.5 Gbits/sec
[ 5] 104.00-105.00 sec 1.94 GBytes 16.6 Gbits/sec
[ 5] 105.00-106.00 sec 1.89 GBytes 16.2 Gbits/sec
[ 5] 106.00-107.00 sec 1.90 GBytes 16.3 Gbits/sec
[ 5] 107.00-108.00 sec 2.23 GBytes 19.2 Gbits/sec
[ 5] 108.00-109.00 sec 2.57 GBytes 22.0 Gbits/sec
[ 5] 109.00-110.00 sec 2.66 GBytes 22.9 Gbits/sec
[ 5] 110.00-111.00 sec 2.64 GBytes 22.7 Gbits/sec
[ 5] 111.00-112.00 sec 2.65 GBytes 22.7 Gbits/sec
[ 5] 112.00-113.00 sec 2.65 GBytes 22.8 Gbits/sec
[ 5] 113.00-114.00 sec 2.65 GBytes 22.8 Gbits/sec
[ 5] 114.00-115.00 sec 2.65 GBytes 22.8 Gbits/sec
[ 5] 115.00-116.00 sec 2.66 GBytes 22.9 Gbits/sec
[ 5] 116.00-117.00 sec 2.63 GBytes 22.6 Gbits/sec
[ 5] 117.00-118.00 sec 2.69 GBytes 23.1 Gbits/sec
[ 5] 118.00-119.00 sec 2.66 GBytes 22.9 Gbits/sec
[ 5] 119.00-120.00 sec 2.67 GBytes 22.9 Gbits/sec
[ 5] 120.00-121.00 sec 2.66 GBytes 22.9 Gbits/sec
[ 5] 121.00-122.00 sec 2.49 GBytes 21.4 Gbits/sec
[ 5] 122.00-123.00 sec 2.15 GBytes 18.5 Gbits/sec
[ 5] 123.00-124.00 sec 2.16 GBytes 18.6 Gbits/sec
[ 5] 124.00-125.00 sec 2.16 GBytes 18.6 Gbits/sec
(In reply to Adam Baxter from comment #0)
> After building 6.4.0rc4 for my VM running on a Windows 10 Hyper-V host, I
> see the following
> [ 756.697753] net_ratelimit: 34 callbacks suppressed
> [ 756.697806] hv_netvsc cd9dd876-2fa9-4764-baa7-b44482f85f9f eth0:
> nvsp_rndis_pkt_complete error status: 2
> (snipped repeated messages)
What kernel version did you use before you upgraded? Does this issue
occur on that version?
Previously I used Debian's 6.1 package and no, I don't see the error there.
(In reply to Adam Baxter from comment #2)
> Previously I used Debian's 6.1 package and no, I don't see the error there.
Can you then bisect between v6.1 and v6.4.0-rc4?
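For reference, the mechanics of such a bisect look like the sketch below. The repository, file names, and "bug" in it are fabricated stand-ins so the run can be automated; a real bisect between v6.1 and v6.4-rc4 needs a kernel build and boot test at every step.

```shell
#!/bin/sh
# Sketch of the bisect workflow. Against the real tree it would be roughly:
#   git bisect start
#   git bisect bad v6.4-rc4
#   git bisect good v6.1
#   ... build/boot each candidate kernel, then "git bisect good" or
#   "git bisect bad" until the first bad commit is found ...
# Here the same mechanics run on a throwaway repository in which commit 5
# "introduces the bug" by adding a marker file, so "git bisect run" can
# classify commits automatically.
demo_bisect() {
    repo=$(mktemp -d)
    (
        cd "$repo" || exit 1
        git init -q 2>/dev/null
        git config user.email demo@example.invalid
        git config user.name demo
        for i in 1 2 3 4 5 6 7 8; do
            if [ "$i" -eq 5 ]; then touch bug-marker; fi  # the "regression"
            echo "$i" >file
            git add -A
            git commit -q -m "commit $i"
        done
        # bad = HEAD, good = root commit
        git bisect start HEAD "$(git rev-list --max-parents=0 HEAD)" >/dev/null 2>&1
        # exit 0 = good (marker absent), non-zero = bad
        git bisect run sh -c '! test -e bug-marker' >/dev/null 2>&1
        git show -s --format=%s refs/bisect/bad  # subject of the first bad commit
    )
    rm -rf "$repo"
}
demo_bisect
```

With roughly 2000 commits between two kernel releases, bisecting takes about 11 build-and-test rounds (log2 of the range), which is why "git bisect run" with a scripted test is worth setting up when the failure is machine-detectable.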
does not occur in 991cbd4f34b1d2d4e4cc41aed6eb4799186c3887
occurs in dca5161f9bd052e9e73be90716ffd57e8762c697
Can you please cc firstname.lastname@example.org? I wasn't able to.
(In reply to Adam Baxter from comment #4)
> does not occur in 991cbd4f34b1d2d4e4cc41aed6eb4799186c3887
> occurs in dca5161f9bd052e9e73be90716ffd57e8762c697
> Can you please cc email@example.com? I wasn't able to.
I have forwarded this BZ report at .
Next time, if you want to add someone to the CC list, click Edit on the Cc list.
The commit in question does not change the behavior of the networking code. It only checks for an error and outputs a message where previously any errors were silently ignored. The error status "2" indicates that the network packet send failed at the Hyper-V level. I've asked the Hyper-V team for more specifics.
Question: When you run the same scenario (including the "sudo strings /dev/sda") with the older kernel version, do you also see the slight drop in network speed from ~17-18 Gbps to ~16 Gbps? I'm guessing that you do, because the Linux kernel patch just outputs a message about an error that previously was silently ignored. Hopefully outputting the error message is not the cause of the performance drop.
The goal of the patch is to bring more visibility to errors if they *are* happening, rather than having network performance degradation with no indication of what might be causing it.
Adam, have you tried doing what Michael asked for?
Hi Thorsten & Michael. I'll hopefully get back to testing this this weekend.
Confirming I see similar behaviour without the error messages in 991cbd4f34b1d2d4e4cc41aed6eb4799186c3887 ~6.2rc7
(In reply to Adam Baxter from comment #9)
> Confirming I see similar behaviour without the error messages in
> 991cbd4f34b1d2d4e4cc41aed6eb4799186c3887 ~6.2rc7
Thanks. My interpretation of this result is that there is no regression. The error has been happening all along, and what's new is that the error is now reported. Do you agree?
I still want to learn more about why the error is occurring and the implications. I think one of the implications is that outgoing packets are being dropped, which could affect performance but isn't fatal. I'm in conversations with the Hyper-V team about the meaning of the error. One possible cause is that the "send packet" request from the Linux guest is malformed, but that doesn't seem likely to be the case here. There are other cases that the Hyper-V team described in terms of internal Hyper-V behaviors, and I'm following up to clarify the full implications.
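On the "outgoing packets are being dropped" point: one low-effort thing to watch from the guest is the per-device TX drop counter the kernel exports under /sys/class/net/<dev>/statistics/. Whether these Hyper-V-level send failures are accounted there is an assumption to verify, not a given, so treat this as a way to check rather than a diagnosis:

```shell
#!/bin/sh
# Read a network device's tx_dropped counter from sysfs. Whether the
# nvsp_rndis_pkt_complete failures increment this particular counter is
# an assumption to verify, not an established fact.
tx_dropped() {
    # $1 = statistics directory, e.g. /sys/class/net/eth0/statistics
    cat "$1/tx_dropped"
}
# Example: run the iperf3 + "strings /dev/sda" scenario while polling:
#   watch -n1 'cat /sys/class/net/eth0/statistics/tx_dropped'
```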
FWIW, on 6.4.0rc4 I am now seeing that message without putting the system under significant load (e.g. tar -cf - . | ssh otherhost "tar xf -")
I'm also getting those messages, as well as significantly degraded SMB performance, but I don't believe this is a kernel regression either, since it only happens if the kernel was built using make localmodconfig. "Full" configs that distributions use don't have this issue, so there has to be some configuration option that isn't enabled by localmodconfig, except I have no clue which one it might be.
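If it helps narrow that down: the kernel tree ships scripts/diffconfig for comparing two .config files, and a minimal stand-alone version of the same idea is sketched below. The file paths in the example are hypothetical.

```shell
#!/bin/sh
# List CONFIG_ symbols that are set in a full distro config but absent
# from a localmodconfig-trimmed one. The kernel tree's scripts/diffconfig
# does a more complete job (it also shows changed values); this is just
# a minimal sketch of the comparison.
missing_options() {
    # $1 = full config (e.g. the distro's /boot/config-*)
    # $2 = trimmed config (e.g. .config after "make localmodconfig")
    a=$(mktemp)
    b=$(mktemp)
    grep '^CONFIG_' "$1" | cut -d= -f1 | sort >"$a"
    grep '^CONFIG_' "$2" | cut -d= -f1 | sort >"$b"
    comm -23 "$a" "$b"   # symbols present only in the full config
    rm -f "$a" "$b"
}
# Example (paths are illustrative):
#   missing_options /boot/config-6.1.0-9-amd64 .config
```

Starting from the distro config and running "make olddefconfig", then re-disabling options in batches, would be one way to bisect the config space down to the responsible option.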