Bug 120461
| Summary: | Drop on SFC interface around 30% | | |
|---|---|---|---|
| Product: | Drivers | Reporter: | Otto Sabart (seberm) |
| Component: | Network | Assignee: | drivers_network (drivers_network) |
| Status: | RESOLVED PATCH_ALREADY_AVAILABLE | | |
| Severity: | normal | CC: | bkenward, hladky.jiri, regressions, rstonehouse |
| Priority: | P1 | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Kernel Version: | v4.6-10530-g28165ec | Subsystem: | |
| Regression: | No | Bisected commit-id: | |

Attachments:

- ipv4
- ipv4 over vlan
- ipv6
- ipv6 over vlan
- All the results pack (v4.6 vs. 4.7-rc{0,1,3})
- Set interrupt affinities
Created attachment 220251 [details]
ipv4

We see a performance drop (about ~30%) on the sfc driver (SFC9020) when performing the netperf TCP maerts test. It seems to have started from 4.7-rc0. You can find all the results in the attachments. Which commit/change could cause this regression? Any hints?

Created attachment 220261 [details]
ipv4 over vlan

Created attachment 220271 [details]
ipv6

Created attachment 220281 [details]
ipv6 over vlan
From the labels on your plots I'm guessing this was v4.7-rc3, not -rc0? Do you have a v4.7-rc1 kernel available? There were no driver changes between v4.6 and v4.7-rc1, and only three driver changes after that up to v4.7-rc3.

(In reply to Otto Sabart from comment #0)
> We see a performance drop (about ~30%) on sfc driver (SFC9020) when
> performing netperf TCP maerts test. It seems it started from 4.7-rc0.

Thanks for the report (and for running the tests). Was a Solarflare network adapter used on both sides? I ask because the report is for netperf TCP maerts. Was a netperf TCP stream test also run, and did it show any performance regression? This helps determine whether it is a TX or an RX regression.

Hi Bert,

(In reply to Bert Kenward from comment #4)
> From the labels on your plots I'm guessing this was v4.7-rc3, not -rc0?

Unfortunately the regression shows up on all RCs (starting from rc0).

> Do you have a v4.7-rc1 kernel available? There were no driver changes
> between v4.6 and v4.7-rc1, and only three driver changes after that up to
> v4.7-rc3.

In the 'results-4.7-rc0-rc1-rc3.tar.gz' archive you can find all the comparisons between v4.6 and v4.7-rc{0,1,3} (unfortunately we do not have results for rc2).

Created attachment 220431 [details]
All the results pack (v4.6 vs. 4.7-rc{0,1,3})
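
(A note on the test names: TCP_MAERTS is "STREAM" spelled backwards - netserver sends and netperf receives, so it exercises the receive path on the netperf side, while TCP_STREAM exercises its transmit path. A minimal pair of runs to tell the two apart - <peer> here is only a placeholder for the remote host:)

$ netperf -t TCP_STREAM -H <peer>   # data flows local -> remote (tests local TX)
$ netperf -t TCP_MAERTS -H <peer>   # data flows remote -> local (tests local RX)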
Hi Robert,

(In reply to Robert Stonehouse from comment #5)
> Was a Solarflare network adapter used on both sides?

Yes, we have exactly the same configuration (cards, machines, ...) on both sides. If you want more info about the machines, just let me know.

> I ask as the report is for netperf TCP maerts. Was a netperf TCP stream test
> also run? and did it show any performance regression? This helps determine
> if it is a TX or RX regression.

Yes, TCP stream tests are also run. I am able to see the regression only on maerts tests. You can find all the results in the attachment (in comment 7).

Our team found another regression, in the CPU scheduler: BZ 120481 [0]. Could it be related? The strange thing here is that it affects only the Solarflare card.

[0] https://bugzilla.kernel.org/show_bug.cgi?id=120481

I don't think there's a v4.7-rc0 tag on the mainline kernel. Do you have git commit IDs for the various things you've tested? As mentioned before, there were no driver changes between v4.6 and v4.7-rc1, so it's apparently something elsewhere in the kernel. It could well relate to the scheduler issue you've spotted.

(In reply to Bert Kenward from comment #9)
> I don't think there's a v4.7-rc0 tag on the mainline kernel. Do you have git
> commit IDs for the various things you've tested? As mentioned before, there
> were no driver changes between v4.6 and v4.7-rc1, so it's apparently
> something elsewhere in the kernel. It could well relate to the scheduler
> issue you've spotted.

Hi Bert, yes, you are right, there is no v4.7-rc0 tag on the mainline kernel. That tag was created because we build the Fedora upstream kernel more often than once per rc. Sorry for the confusion. Here are the commit IDs:

    Fedora upstream tag      Related mainline kernel tag
    ===========================================================
    4.7.0-0.rc0.git8.2  ---- v4.6-10530-g28165ec
    4.7.0-0.rc1.git1.2  ---- v4.7-rc1-12-g852f42a
    4.7.0-0.rc3.git0.1  ---- v4.7-rc3

Thanks for the commit IDs Otto. Do you have a commit ID for the latest "good" revision? I hope to try to recreate and bisect this.

(In reply to Bert Kenward from comment #11)
> Do you have a commit ID for the latest "good" revision?

The latest good revision is 'v4.6'.

Unfortunately, I have found a small bug in our test suite. The results I have attached so far are _not_ TCP_MAERTS tests but TCP_STREAM tests with various -M sizes [0].

The reproducer:

$ ip a show dev sfc0
3: sfc0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
    link/ether 00:0f:53:08:29:10 brd ff:ff:ff:ff:ff:ff
    inet 172.20.20.10/24 brd 172.20.20.255 scope global sfc0
       valid_lft forever preferred_lft forever
    inet6 fd60::10/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::20f:53ff:fe08:2910/64 scope link
       valid_lft forever preferred_lft forever

$ netperf -cC -t TCP_STREAM -l 30 -L 172.20.20.10 -H 172.20.20.20 -T ,0 -T 0, -- -M $SIZE

I have re-tested everything manually to make sure there is a regression.

TCP_STREAM test with various -M sizes:
======================================

v4.6:
+--------+------+------+------+
|   -M   |  512 | 1024 | 8192 |
+--------+------+------+------+
| 1. run | 4322 | 6499 | 9329 |
| 2. run | 4463 | 6516 | 9329 |
| 3. run | 4385 | 6471 | 9326 |
+--------+------+------+------+

v4.6-10530-g28165ec:
+--------+------+------+------+
|   -M   |  512 | 1024 | 8192 |
+--------+------+------+------+
| 1. run | 3649 | 4667 | 6315 |
| 2. run | 3684 | 4801 | 6313 |
| 3. run | 3693 | 4790 | 6283 |
+--------+------+------+------+

The regression is also visible on the TCP_MAERTS test.
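
(For reference, a sketch of the kind of loop that produces per-run numbers like the tables above - the addresses and CPU pinning follow the reproducer, and three runs per -M size match the tables; treat it as an illustration, not the exact test-suite code:)

for SIZE in 512 1024 8192; do
    for RUN in 1 2 3; do
        echo -n "$RUN. run, -M $SIZE: "
        # with -P0 netperf prints a single data line; field 5 is the
        # throughput in 10^6 bits/s for TCP_STREAM with -c -C
        netperf -P0 -cC -t TCP_STREAM -l 30 -L 172.20.20.10 -H 172.20.20.20 \
                -T ,0 -T 0, -- -M $SIZE | awk '{print $5}'
    done
done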
The reproducer:

$ netperf -cC -t TCP_MAERTS -l 30 -L 172.20.20.10 -H 172.20.20.20 -T ,0 -T 0, -- -M $SIZE

TCP_MAERTS test with various -M sizes:
======================================

v4.6:
+--------+------+------+------+
|   -M   |  512 | 1024 | 8192 |
+--------+------+------+------+
| 1. run | 6975 | 6967 | 6966 |
| 2. run | 6958 | 6978 | 6974 |
| 3. run | 6942 | 6933 | 6958 |
+--------+------+------+------+

v4.6-10530-g28165ec:
+--------+------+------+------+
|   -M   |  512 | 1024 | 8192 |
+--------+------+------+------+
| 1. run | 6756 | 6753 | 6762 |
| 2. run | 6757 | 6741 | 6769 |
| 3. run | 6760 | 6766 | 6759 |
+--------+------+------+------+

HW information:
===============

$ lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                4
On-line CPU(s) list:   0-3
Thread(s) per core:    1
Core(s) per socket:    4
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 62
Model name:            Intel(R) Xeon(R) CPU E5-2403 v2 @ 1.80GHz
Stepping:              4
CPU MHz:               1591.040
BogoMIPS:              3599.14
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              10240K
NUMA node0 CPU(s):     0-3

ethtool info:
=============

$ ethtool -i sfc0
driver: sfc
version: 4.0
firmware-version: 3.3.0.6298
bus-info: 0000:1b:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: yes
supports-priv-flags: no

[0] http://www.netperf.org/svn/netperf2/tags/netperf-2.6.0/doc/netperf.html

Still valid for kernel v4.7-rc5-227-ge7bdea7.

Bert, this issue is listed in my regression reports for 4.7 and I wonder what the status is. It seems nothing much has happened for more than a week now, which is a bad sign as 4.7 final seems only a week or two away.

I'm hoping to bisect it in the next day or so.

I've attempted to bisect this today. Initially I thought I'd successfully reproduced the regression, but what I'm actually seeing is occasional (approx 1 run in 5) lower performance, regardless of the kernel. I'm seeing performance very close to line rate normally. Two differences:

- I'm using a pair of machines with older but higher clocked CPUs - E3-1230 @ 3.2 GHz. Netperf is reporting quite low CPU utilisation though. I'll see if I can find a pair of machines that are more similar.
- The NICs I'm using have newer firmware - I'll try with the older firmware shortly.

A further update - with the same firmware version (3.3.0.6298) I still see performance very close to line rate with tag v4.7-rc1.

Hi Bert,

(In reply to Bert Kenward from comment #16)
> I've attempted to bisect this today. Initially I thought I'd successfully
> reproduced the regression, but what I'm actually seeing is occasional
> (approx 1 run in 5) lower performance, regardless of the kernel. I'm seeing
> performance very close to line rate normally.

I provisioned our machines to test it a little bit more.
For me the higher performance is the occasional case; more often I see the lower performance:

Wed Jul  6 04:00:16 CEST 2016
1: measured throughput: 6270.64
2: measured throughput: 6274.19
3: measured throughput: 6244.32
4: measured throughput: 6252.82
5: measured throughput: 6256.34
6: measured throughput: 6244.57
7: measured throughput: 6231.55

Wed Jul  6 04:08:24 CEST 2016
1: measured throughput: 9327.26
2: measured throughput: 9326.68
3: measured throughput: 6255.88
4: measured throughput: 9326.81
5: measured throughput: 6248.88
6: measured throughput: 6251.87
7: measured throughput: 9326.16

Wed Jul  6 04:11:59 CEST 2016
1: measured throughput: 9325.41
2: measured throughput: 9327.90
3: measured throughput: 6240.21
4: measured throughput: 6245.22
5: measured throughput: 6241.08
6: measured throughput: 6254.99
7: measured throughput: 6251.38

Wed Jul  6 04:17:15 CEST 2016
1: measured throughput: 9322.18
2: measured throughput: 6239.87
3: measured throughput: 9324.77
4: measured throughput: 6250.38
5: measured throughput: 9327.27
6: measured throughput: 9328.37
7: measured throughput: 9329.61

> Two differences:
> - I'm using a pair of machines with older but higher clocked CPUs - E3-1230
>   @ 3.2 GHz. Netperf is reporting quite low CPU utilisation though. I'll see
>   if I can find a pair of machines that are more similar.
> - The NICs I'm using have newer firmware - I'll try with the older firmware
>   shortly.

I think the problem is CPU affinity. I do not know the exact kernel implementation, but it seems that by default the kernel pins all the sfc interrupts mostly to core 0:

$ cat /proc/interrupts
            CPU0      CPU1   CPU2   CPU3
......
 46:         166         0      0      2   IR-PCI-MSI 14155776-edge  sfc0-0
 47:     1060610         0      0      0   IR-PCI-MSI 14155777-edge  sfc0-1
 49:     6576243         0      0      0   IR-PCI-MSI 14155778-edge  sfc0-2
 50:      772157         0      0      0   IR-PCI-MSI 14155779-edge  sfc0-3
......

But we bind netperf and netserver to core 0 at the same time (the -T options):

$ netperf -P0 -cC -t TCP_STREAM -l 30 -L 172.20.20.10 -H 172.20.20.20 -T ,0 -T 0, -- -M 8192

- If I bind netperf+netserver to a different core, I cannot reproduce the lower performance anymore:

$ netperf -P0 -cC -t TCP_STREAM -l 30 -L 172.20.20.10 -H 172.20.20.20 -T ,3 -T 3, -- -M 8192

- If I change the affinity so that sfc's interrupts are handled on a different core and I keep netperf+netserver running on core 0, I cannot reproduce the lower performance anymore either:

$ echo 8 > /proc/irq/46/smp_affinity
$ echo 8 > /proc/irq/47/smp_affinity
$ echo 8 > /proc/irq/49/smp_affinity
$ echo 8 > /proc/irq/50/smp_affinity
$ netperf -P0 -cC -t TCP_STREAM -l 30 -L 172.20.20.10 -H 172.20.20.20 -T ,0 -T 0, -- -M 8192
$ cat /proc/interrupts
            CPU0      CPU1   CPU2   CPU3
 46:         166         0      0      0   IR-PCI-MSI 14155776-edge  sfc0-0
 47:     1060610         0      0      0   IR-PCI-MSI 14155777-edge  sfc0-1
 49:     6576243         0      0      0   IR-PCI-MSI 14155778-edge  sfc0-2
 50:      772157         0      0 534538   IR-PCI-MSI 14155779-edge  sfc0-3

What do you think? Thank you!
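
(A sketch of the bitmask logic behind the echo commands above, in case it helps with scripting: /proc/irq/<n>/smp_affinity takes a hex CPU bitmask, so 1 = CPU0, 2 = CPU1, 4 = CPU2, 8 = CPU3. Spreading the four sfc0 queue interrupts one per core could look like this; the IRQ numbers are the ones from the listing above, which differ between boots, and it must run as root:)

CPU=0
for IRQ in 46 47 49 50; do
    # write the hex mask for one CPU, then advance to the next core
    printf '%x\n' $((1 << CPU)) > /proc/irq/$IRQ/smp_affinity
    CPU=$(( (CPU + 1) % 4 ))
done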
Created attachment 222161 [details]
Set interrupt affinities

Thanks Otto. We sent a patch to net-next back in May that included affinity hints as part of a wider change, but I don't believe it ever got merged. The attached patch only includes the affinity hints - can you try it on your system?
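
(When a driver sets affinity hints, each IRQ exports its hint via procfs, and irqbalance can take the hints into account when placing interrupts. A quick way to inspect them from userspace - the IRQ numbers here are assumed to be the sfc0 ones from the listing in the next comment:)

for IRQ in 37 38 39 40; do
    echo -n "irq $IRQ affinity_hint: "
    cat /proc/irq/$IRQ/affinity_hint
done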
I see the same bimodal performance with older kernels (4.6 or earlier), so I don't think this is a regression with 4.7-rc1. Otto, do you agree?

(In reply to Bert Kenward from comment #19)
> Created attachment 222161 [details]
> Set interrupt affinities
>
> Thanks Otto. We sent a patch to net-next back in May that included affinity
> hints as part of a wider change, but I don't believe it ever got merged. The
> attached patch only includes the affinity hints - can you try it on your
> system?

With this patch applied I can reproduce the lower rate only occasionally (1 run in ~10). The interrupts are spread between all cores:

            CPU0      CPU1      CPU2      CPU3
 37:    36130760         0         0         0   IR-PCI-MSI 14155776-edge  sfc0-0
 38:           0   8439395         0         0   IR-PCI-MSI 14155777-edge  sfc0-1
 39:           0         0   8611333         0   IR-PCI-MSI 14155778-edge  sfc0-2
 40:           0         0         0  12262519   IR-PCI-MSI 14155779-edge  sfc0-3

(In reply to Bert Kenward from comment #20)
> I see the same bimodal performance with older kernels (4.6 or earlier), so I
> don't think this is a regression with 4.7-rc1. Otto, do you agree?

Yes, I agree. I think we can close this bug. Thank you for the collaboration!