Bug 61441
Summary: | Network stop working ( (bnx2): transmit queue 0 timed out) | ||
---|---|---|---|
Product: | Networking | Reporter: | Javier Barroso (javibarroso) |
Component: | IPV4 | Assignee: | Stephen Hemminger (stephen) |
Status: | CLOSED OBSOLETE | ||
Severity: | normal | CC: | alan, javibarroso |
Priority: | P1 | ||
Hardware: | x86-64 | ||
OS: | Linux | ||
Kernel Version: | 2.6.32.61 | Subsystem: | |
Regression: | No | Bisected commit-id: |
Description
Javier Barroso
2013-09-16 10:58:36 UTC
Hello,

For reference, the same bug was found in earlier 2.6.32 kernels from openfiler. See https://forums.openfiler.com/index.php?/topic/6566-bnx2-driver-issues-bl/page__gopid__27342#entry27342

The network interfaces are:

03:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5708S Gigabit Ethernet (rev 12)
07:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5708S Gigabit Ethernet (rev 12)

They are configured with bonding.

Thank you

Hello,

I have collected what I think is interesting data from sar. Please note that although it is not the same kernel as the one reported here, it is the same issue. I know when that crash happened. If you think it is better to post the output for the latest kernel and the latest crash, I can rerun these commands.

# sar -n NFS -I SUM -n EDEV -b -w -f /var/log/sa/sa12 -s 19:00:00 -e 19:45:00 ; \
  sar -n NFSD -B -n SOCK -f /var/log/sa/sa12 -s 18:00:00 -e 19:40:00 ; \
  echo "07:51:11 PM LINUX RESTART"

Linux 2.6.32-71.18.1.el6-0.20.smp.gcc4.1.x86_64 (rhnas01) 09/12/2013

07:00:01 PM    cswch/s
07:10:01 PM     816.95
07:20:01 PM     124.78
07:30:01 PM  151901.63 ************************* [1]
07:40:01 PM    5386.04
Average:      41660.65

07:00:01 PM  INTR   intr/s
07:10:01 PM  sum   1182.60
07:20:01 PM  sum   2099.52
07:30:01 PM  sum   5587.73
07:40:01 PM  sum   6378.32 ************************* [2]
Average:     sum   3651.24

07:00:01 PM         tps        rtps        wtps     bread/s     bwrtn/s
07:10:01 PM        6.91        0.26        6.65        2.10       73.17
07:20:01 PM  7154205.83  7154303.81  7154245.95  7153372.94  7153219.18 ************************* [3]
07:30:01 PM     1989.93        0.02     1989.90        0.19    15965.17
07:40:01 PM     2643.06        0.03     2643.04        0.52    21146.24
Average:        1030.48  1899971.86     1041.05  1899725.28     8255.77

07:00:01 PM  IFACE      rxerr/s  txerr/s  coll/s     rxdrop/s  txdrop/s  txcarr/s  rxfram/s  rxfifo/s  txfifo/s
07:10:01 PM  lo            0.00     0.00    0.00         0.00      0.00      0.00      0.00      0.00      0.00
07:10:01 PM  eth2          0.00     0.00    0.00         0.00      0.00      0.00      0.00      0.00      0.00
07:10:01 PM  eth3          0.00     0.00    0.00         0.00      0.00      0.00      0.00      0.00      0.00
07:10:01 PM  eth0          0.00     0.00    0.00   7148628.18      0.00      0.00      0.00      0.00      0.00
07:10:01 PM  eth1          0.00     0.00    0.00   7148628.18      0.00      0.00      0.00      0.00      0.00
07:10:01 PM  bond0         0.00     0.00    0.00  14297256.35      0.00      0.00      0.00      0.00      0.00 ************************* [4]
07:10:01 PM  bond0.200     0.00     0.00    0.00         0.00      0.00      0.00      0.00      0.00      0.00
07:10:01 PM  bond0.710     0.00     0.00    0.00         0.00      0.00      0.00      0.00      0.00      0.00
07:10:01 PM  bond0.711     0.00     0.00    0.00         0.00      0.00      0.00      0.00      0.00      0.00
07:20:01 PM  lo            0.00     0.00    0.00         0.00      0.00      0.00      0.00      0.00      0.00
07:20:01 PM  eth2          0.00     0.00    0.00         0.00      0.00      0.00      0.00      0.00      0.00
07:20:01 PM  eth3          0.00     0.00    0.00         0.00      0.00      0.00      0.00      0.00      0.00
07:20:01 PM  eth0          0.00     0.00    0.00         0.00      0.00      0.00      0.00      0.00      0.00
07:20:01 PM  eth1          0.00     0.00    0.00         0.00      0.00      0.00      0.00      0.00      0.00
07:20:01 PM  bond0         0.00     0.00    0.00         0.00      0.00      0.00      0.00      0.00      0.00
07:20:01 PM  bond0.200     0.00     0.00    0.00         0.00      0.00      0.00      0.00      0.00      0.00
07:20:01 PM  bond0.710     0.00     0.00    0.00         0.00      0.00      0.00      0.00      0.00      0.00
07:20:01 PM  bond0.711     0.00     0.00    0.00         0.00      0.00      0.00      0.00      0.00      0.00
07:30:01 PM  lo            0.00     0.00    0.00         0.00      0.00      0.00      0.00      0.00      0.00
07:30:01 PM  eth2          0.00     0.00    0.00         0.00      0.00      0.00      0.00      0.00      0.00
07:30:01 PM  eth3          0.00     0.00    0.00         0.00      0.00      0.00      0.00      0.00      0.00
07:30:01 PM  eth0          0.00     0.00    0.00         0.00      0.00      0.00      0.00      0.00      0.00
07:30:01 PM  eth1          0.00     0.00    0.00         0.00      0.00      0.00      0.00      0.00      0.00
07:30:01 PM  bond0         0.00     0.00    0.00         0.00      0.00      0.00      0.00      0.00      0.00
07:30:01 PM  bond0.200     0.00     0.00    0.00         0.00      0.00      0.00      0.00      0.00      0.00
07:30:01 PM  bond0.710     0.00     0.00    0.00         0.00      0.00      0.00      0.00      0.00      0.00
07:30:01 PM  bond0.711     0.00     0.00    0.00         0.00      0.00      0.00      0.00      0.00      0.00
07:40:01 PM  lo            0.00     0.00    0.00         0.00      0.00      0.00      0.00      0.00      0.00
07:40:01 PM  eth2          0.00     0.00    0.00         0.00      0.00      0.00      0.00      0.00      0.00
07:40:01 PM  eth3          0.00     0.00    0.00         0.00      0.00      0.00      0.00      0.00      0.00
07:40:01 PM  eth0          0.00     0.00    0.00         0.00      0.00      0.00      0.00      0.00      0.00
07:40:01 PM  eth1          0.00     0.00    0.00         0.00      0.00      0.00      0.00      0.00      0.00
07:40:01 PM  bond0         0.00     0.00    0.00         0.00      0.00      0.00      0.00      0.00      0.00
07:40:01 PM  bond0.200     0.00     0.00    0.00         0.00      0.00      0.00      0.00      0.00      0.00
07:40:01 PM  bond0.710     0.00     0.00    0.00         0.00      0.00      0.00      0.00      0.00      0.00
07:40:01 PM  bond0.711     0.00     0.00    0.00         0.00      0.00      0.00      0.00      0.00      0.00
Average:     lo            0.00     0.00    0.00         0.00      0.00      0.00      0.00      0.00      0.00
Average:     eth2          0.00     0.00    0.00         0.00      0.00      0.00      0.00      0.00      0.00
Average:     eth3          0.00     0.00    0.00         0.00      0.00      0.00      0.00      0.00      0.00
Average:     eth0          0.00     0.00    0.00   1899982.44      0.00      0.00      0.00      0.00      0.00
Average:     eth1          0.00     0.00    0.00   1899982.44      0.00      0.00      0.00      0.00      0.00
Average:     bond0         0.00     0.00    0.00   3799964.87      0.00      0.00      0.00      0.00      0.00
Average:     bond0.200     0.00     0.00    0.00         0.00      0.00      0.00      0.00      0.00      0.00
Average:     bond0.710     0.00     0.00    0.00         0.00      0.00      0.00      0.00      0.00      0.00
Average:     bond0.711     0.00     0.00    0.00         0.00      0.00      0.00      0.00      0.00      0.00

07:00:01 PM  call/s  retrans/s  read/s  write/s  access/s  getatt/s
07:10:01 PM    0.00       0.00    0.00     0.00      0.00      0.00
07:20:01 PM    0.00       0.00    0.00     0.00      0.00      0.00
07:30:01 PM    0.00       0.00    0.00     0.00      0.00      0.00
07:40:01 PM    0.00       0.00    0.00     0.00      0.00      0.00
Average:       0.00       0.00    0.00     0.00      0.00      0.00

Linux 2.6.32-71.18.1.el6-0.20.smp.gcc4.1.x86_64 (rhnas01) 09/12/2013

06:00:01 PM  scall/s  badcall/s  packet/s  udp/s   tcp/s  hit/s  miss/s  sread/s  swrite/s  saccess/s  sgetatt/s
06:10:01 PM   247.40       0.00    247.32   0.00  247.31   0.00    0.29     0.00      0.28     123.50     123.03
06:20:01 PM   247.36       0.00    247.30   0.00  247.30   0.00    0.27     0.00      0.26     123.55     122.98
06:30:01 PM   249.72       0.00    249.67   0.00  249.65   0.00    0.24     0.93      0.24     124.32     123.67
06:40:01 PM   246.90       0.00    246.87   0.00  246.86   0.00    0.24     0.00      0.24     123.43     122.68
06:50:01 PM   243.95       0.00    243.83   0.00  243.84   0.00    0.24     0.02      0.22     121.82     121.32
07:00:01 PM   248.13       0.00    248.11   0.00  248.11   0.00    0.26     0.00      0.26     123.87     123.46
07:10:01 PM   205.87       0.00    205.84   0.00  205.82   0.00    0.18     0.00      0.18     103.02     102.24
07:20:01 PM     0.00       0.00      0.00   0.00    0.00   0.00    0.00     0.00      0.00       0.00       0.00 ************************* [5]
07:30:01 PM     0.00       0.00      0.00   0.00    0.00   0.00    0.00     0.00      0.00       0.00       0.00
Average:      187.74       0.00    187.70   0.00  187.69   0.00    0.19     0.11      0.19      93.74      93.28

06:00:01 PM  pgpgin/s  pgpgout/s  fault/s  majflt/s
06:10:01 PM      1.71      19.59    41.19      0.00
06:20:01 PM      7.33      15.52    13.88      0.00
06:30:01 PM      8.77      17.55    15.22      0.00
06:40:01 PM      5.29      16.55    12.87      0.00
06:50:01 PM      9.90      16.92    16.35      0.00
07:00:01 PM      1.20      20.02    20.90      0.00
07:10:01 PM      1.10      18.30    48.91      0.00
07:20:01 PM      0.05       7.08    27.77      0.00
07:30:01 PM      0.05    3991.31    17.66      0.00 ************************* [6]
Average:         3.93     457.66    23.86      0.00

06:00:01 PM  totsck  tcpsck  udpsck  rawsck  ip-frag
06:10:01 PM     267      17      26       0        0
06:20:01 PM     267      17      26       0        0
06:30:01 PM     267      17      26       0        0
06:40:01 PM     267      17      26       0        0
06:50:01 PM     267      17      26       0        0
07:00:01 PM     267      17      26       0        0
07:10:01 PM     269      17      26       0        0
07:20:01 PM     261       9      26       0        0 ************************* [7]
07:30:01 PM     255       9      26       0        0
Average:        265      15      26       0        0

07:51:11 PM LINUX RESTART

Chronologically, I can see from the sar output:

At 07:10 PM, a huge number of received packets per second (14297256.35 rxdrop/s on bond0) were dropped [4].
At 07:20 PM, a huge number of transfers per second (7154205 tps) were issued to physical devices [3], and NFS stopped working [5], [7].
At 07:30 PM, there was a huge number of context switches per second (151901) [1], and a large amount of data (3991 pgpgout/s) was paged out to disk [6].
At 07:40 PM, interrupts per second peaked (6378) [2].

Maybe somebody with more experience than me can tell what happened at that moment, and whether the kernel can be configured in some way to avoid it.

Thank you very much

We have moved the disks of that server to another, more modern blade (Blade640 G6), so I will probably not be able to test any suggestion if the issue does not happen in the new environment. If you feel this bug must be closed, I will understand.

Thank you

2.6.32 is obsolete, so yes.
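
The chronological reading of the sar output in this report picks out intervals by eye. As a rough aid for anyone repeating that analysis on another sar capture, here is a minimal Python sketch that flags samples far above the median of their series. The function name `flag_spikes` and the 10x threshold are assumptions made for illustration and are not part of sar or of this report; the sample values are copied from the cswch/s and pgpgout/s columns above.

```python
from statistics import median

# Context switches per second, copied from the sar -w output in this report.
cswch = {
    "07:10:01 PM": 816.95,
    "07:20:01 PM": 124.78,
    "07:30:01 PM": 151901.63,
    "07:40:01 PM": 5386.04,
}

# Pages written out per second, copied from the sar -B output in this report.
pgpgout = {
    "06:10:01 PM": 19.59,
    "06:20:01 PM": 15.52,
    "06:30:01 PM": 17.55,
    "06:40:01 PM": 16.55,
    "06:50:01 PM": 16.92,
    "07:00:01 PM": 20.02,
    "07:10:01 PM": 18.30,
    "07:20:01 PM": 7.08,
    "07:30:01 PM": 3991.31,
}


def flag_spikes(samples, factor=10.0):
    """Return timestamps whose value exceeds factor * median of the series.

    The threshold is a hypothetical heuristic, not a sar feature.
    """
    if not samples:
        return []
    base = median(samples.values())
    if base <= 0:
        return []
    return [ts for ts, value in samples.items() if value > factor * base]


if __name__ == "__main__":
    print("cswch/s spikes:  ", flag_spikes(cswch))
    print("pgpgout/s spikes:", flag_spikes(pgpgout))
```

With these inputs both series flag only the 07:30:01 PM sample, which matches markers [1] and [6]. A median-based threshold is used here because a single extreme sample distorts the mean (compare the Average rows above), but it remains a heuristic and says nothing about the cause of the spike.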