Distribution: Gentoo

Hardware Environment:
Motherboard: Intel E7210 with onboard e1000
cpu family: 15
model: 2
model name: Intel(R) Pentium(R) 4 CPU 2.80GHz
4x 512MB DIMM DDR Synchronous 333 MHz

Software Environment:
Linux amnesia 2.6.13-rc3-mm1 #3 SMP Sun Aug 14 16:38:51 CEST 2005 i686 Intel(R) Pentium(R) 4 CPU 2.80GHz GenuineIntel GNU/Linux

Gnu C                  3.3.5-20050130
Gnu make               3.80
binutils               2.15.92.0.2
util-linux             2.12q
mount                  2.12q
module-init-tools      3.0
e2fsprogs              1.37
reiserfsprogs          3.6.19
reiser4progs           1.0.4
xfsprogs               2.6.25
nfs-utils              1.0.7
Linux C Library        2.3.4
Dynamic linker (ldd)   2.3.4
Procps                 3.2.5
Net-tools              1.60
Kbd                    1.12
Sh-utils               5.2.1
udev                   058
Modules Loaded         ip_gre e1000

Problem Description:
First I must say this is a very odd problem. It happened the first time a couple of weeks ago. My ISP's router became unreachable, and when it came back online all computers were up except this one. All the others are running a 2.6.12 version. Back then I thought it was just some weird random occurrence, but it has happened twice since. Every time my ISP's router becomes unreachable for a few seconds, the computer hangs completely. It takes no input, magic SysRq doesn't work either, and the screen is black. The problem did not occur when I tried 2.6.12.1.

I'm connected to the ISP at 1 Gbit through an onboard e1000 card, so I thought it might be a driver problem where the NIC gets corrupted packets and somehow causes a panic. Right now I'm going to try 2.6.13-rc7 to see if the problem occurs there too.

Steps to reproduce:
Run the same kernel version and wait for the ISP's router (my default gw) to go down. Reproduced 3 times.
bugme-daemon@kernel-bugs.osdl.org wrote:
>
> http://bugzilla.kernel.org/show_bug.cgi?id=5131
>
>            Summary: Computer hangs when default-gw becomes unreachable
>     Kernel Version: 2.6.13-rc3-mm1
>
> ...
>
> Right now I'm going to try 2.6.13-rc7 to see if the problem occurs there too.
>

Thanks. It would be useful if you could also test 2.6.13-rc6-mm2.

I assume there was nothing interesting in the kernel logs?
bugme-daemon@kernel-bugs.osdl.org wrote:
>http://bugzilla.kernel.org/show_bug.cgi?id=5131
>
>------- Additional Comments From akpm@osdl.org 2005-08-26 04:44 -------
>bugme-daemon@kernel-bugs.osdl.org wrote:
>
>>http://bugzilla.kernel.org/show_bug.cgi?id=5131
>>
>>           Summary: Computer hangs when default-gw becomes unreachable
>>    Kernel Version: 2.6.13-rc3-mm1
>>
>>...
>>
>>Right now I'm going to try 2.6.13-rc7 to see if the problem occurs there too.
>>
>
>Thanks. It would be useful if you could also test 2.6.13-rc6-mm2.
>
>I assume there was nothing interesting in the kernel logs?
>

I may be able to test 2.6.13-rc6-mm2 in a while. Unfortunately this system is in production, so I cannot just take it down. But if the problem happens again I will change to 2.6.13-rc6-mm2.

No, nothing interesting at all in the kernel logs.

I can however add that a friend with similar specs and the same kernel version had a similar problem, except that his computer didn't crash right away. After the problem happened he could log in as root on the console, and everything seemed to be working except that the e1000 card could not communicate with the gateway. The gateway did answer the ARP requests from the e1000 card, but it didn't respond to ICMP or route any traffic. He could, however, communicate with all other computers in the same subnet, and those computers could also communicate with the gateway properly.

He also tried changing the MAC address on the card and the IP address. After that the gateway would answer a few ICMP packets and then stop responding as before. He also tried unloading and reloading the e1000 module, without success. A few minutes later the computer hung like mine does. After a power reset everything worked fine again.
Andrew Morton wrote:
>bugme-daemon@kernel-bugs.osdl.org wrote:
>
>>http://bugzilla.kernel.org/show_bug.cgi?id=5131
>>
>>           Summary: Computer hangs when default-gw becomes unreachable
>>    Kernel Version: 2.6.13-rc3-mm1
>>
>>...
>>
>>Right now I'm going to try 2.6.13-rc7 to see if the problem occurs there too.
>>
>
>Thanks. It would be useful if you could also test 2.6.13-rc6-mm2.
>
>I assume there was nothing interesting in the kernel logs?
>

I don't know if this is any help, but I was able to reproduce the problem in another way. I also have a second e1000 NIC in the box, a 64-bit one sitting in a PCI slot. The problem occurs every time if I do something like this:

Both NICs are connected to the same switch and are not separated by VLANs or anything like that. The first NIC (eth0) has address 192.168.0.2, the second NIC (eth1) has 192.168.0.2. I then added a static route to a second box so that traffic to it would use the eth1 NIC:

    ip route add 192.168.0.3 dev eth1

The second box then connected to 192.168.0.2 (eth0) via FTP and downloaded a file, which was sent with the 192.168.0.2 source IP but transferred via eth1:

    192.168.0.2(eth0) --> eth1 -> 192.168.0.3

Packets are returned coming in on eth0:

    192.168.0.3 -> 192.168.0.2(eth0)

Immediately when I transferred the file, the NICs stopped transferring data and I was back at the problem my friend had. For some reason the box could not communicate with the gateway (192.168.0.1), but it could with any other box in the subnet. I have no idea why this problem occurred now, when the gateway had nothing to do with any part of the test. I solved the problem by unloading the e1000 module and loading it again.

And this now happened in 2.6.13-rc7.
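For anyone trying to reproduce this, the steps in the comment above could be scripted roughly as follows. This is only a sketch under assumptions: the addresses and interface name are the ones given in the comment, while the helper names (`run`, `setup_route`, `check_gateway`, `reload_driver`), the `DRY_RUN` switch, the `ping` check, and the use of `rmmod`/`modprobe` for "unloading and loading" the module are illustrative choices, not something the reporter posted. By default nothing is executed; the commands are only printed.

```shell
#!/bin/sh
# Sketch of the two-NIC reproduction described above.
# Dry run by default: commands are printed, not executed.
# Set DRY_RUN=0 (as root, on a test box) to really run them.
DRY_RUN="${DRY_RUN:-1}"
PEER="${PEER:-192.168.0.3}"        # second box, from the report
OUT_IF="${OUT_IF:-eth1}"           # NIC forced as the outgoing interface
GATEWAY="${GATEWAY:-192.168.0.1}"  # default gateway, from the report

# Print a command and, unless in dry-run mode, execute it.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# Step 1: send traffic for the peer out through eth1,
# while replies keep arriving on eth0.
setup_route() {
    run ip route add "$PEER" dev "$OUT_IF"
}

# Step 2 (manual): from the peer, download a large file
# from 192.168.0.2 via FTP.

# Step 3: check whether the gateway still answers.
check_gateway() {
    run ping -c 3 "$GATEWAY"
}

# Workaround from the report: reload the driver.
reload_driver() {
    run rmmod e1000
    run modprobe e1000
}
```

Calling `setup_route`, starting the FTP download by hand, and then calling `check_gateway` follows the order of events in the report; `reload_driver` corresponds to the workaround that restored connectivity.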
Any update on this problem, please? Thanks.
Closing the bug since there has been no recent activity. Please reopen if confirmed with the newest kernel.