Bug 9648 - linux-image-2.6.23-1-amd64: kernel Oops, (segfault) when downloading from net (sky2 driver)
Status: RESOLVED INSUFFICIENT_DATA
Alias: None
Product: Drivers
Classification: Unclassified
Component: Network
Hardware: All Linux
Importance: P1 high
Assignee: Stephen Hemminger
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2007-12-27 14:38 UTC by Ondrej Certik
Modified: 2009-04-08 10:06 UTC
CC List: 3 users

See Also:
Kernel Version: 2.6.24-rc6
Subsystem:
Regression: No
Bisected commit-id:



Description Ondrej Certik 2007-12-27 14:38:11 UTC
Most recent kernel where this bug did not occur: 2.6.23 i386   
Distribution: Debian
Hardware Environment: amd64
Software Environment: Debian unstable
Problem Description: kernel Oops, (segfault) when downloading from net

Steps to reproduce:

I was running 2.6.23 i386 on an Intel quad-core with the sky2 driver without problems. After switching to
2.6.23 amd64, downloading any larger file (50MB+) almost always causes the kernel to oops (still with the sky2 driver).

More information, including logs and config files, is in this bug report:

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=457967

The maintainer of the Debian kernel decided it's an upstream bug, so I reported it here.

Feel free to ask for more information.

Ondrej
Comment 1 Ondrej Certik 2007-12-27 15:40:13 UTC
I should stress that I tried 2.6.24-rc6 and it still fails. More info
in the Debian bug.
Comment 2 Stephen Hemminger 2007-12-27 16:07:56 UTC
How much memory do you have?  My guess is that the problem is memory mapping above 4G.
Some hardware doesn't work; it may not even be a device driver issue but more of a DMA mapping problem.
Comment 3 Ondrej Certik 2007-12-27 16:14:35 UTC
Good guess, I have 4GB:

$ free -om
             total       used       free     shared    buffers     cached
Mem:          3962       1849       2113          0         39       1660
Swap:         3820          0       3820

I think I have exactly 4GB, but I am no expert in this.
Comment 4 Ondrej Certik 2007-12-27 16:18:55 UTC
Actually very nice guess.

The i386 kernel worked because it wasn't compiled with big memory support (the -bigmem Debian package). And actually, I recently installed the -bigmem i386 Debian kernel package and experienced sudden hangs, but there was nothing in the logs, so I thought it was because of running i386 on 64-bit hardware. So I installed Debian amd64 on the box with a true amd64 kernel, and the issue just became more apparent.
Comment 5 Stephen Hemminger 2007-12-27 16:51:22 UTC
With 4G the BIOS has to "make a hole" and move some of the memory above
4G.  In order to reach that memory, some translation is needed, or
the system has to avoid that memory for DMA access. What motherboard
is this?
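
Whether any RAM has actually been relocated above the 4G boundary can be checked from the kernel's physical memory map. What follows is only a minimal sketch, assuming the usual `/proc/iomem` line layout (`start-end : label`, hex addresses, inclusive ranges). The function name `ram_above_4g` and the sample text are hypothetical and are not taken from the reporter's machine.

```python
import re

FOUR_GIB = 1 << 32

# Matches lines such as "100000000-12fffffff : System RAM" (hex ranges).
_RANGE_RE = re.compile(r"^\s*([0-9a-fA-F]+)-([0-9a-fA-F]+)\s*:\s*(.+?)\s*$")


def ram_above_4g(iomem_text):
    """Return the number of System RAM bytes mapped at or above 4 GiB."""
    total = 0
    for line in iomem_text.splitlines():
        m = _RANGE_RE.match(line)
        if not m or m.group(3) != "System RAM":
            continue
        start, end = int(m.group(1), 16), int(m.group(2), 16)
        # Ranges are inclusive; count only the part at or above 4 GiB.
        lo = max(start, FOUR_GIB)
        if end >= lo:
            total += end - lo + 1
    return total


# Hypothetical /proc/iomem excerpt, NOT from the machine in this report.
SAMPLE = """\
00001000-0009efff : System RAM
00100000-cfedffff : System RAM
d0000000-dfffffff : PCI Bus 0000:01
100000000-12fffffff : System RAM
"""

if __name__ == "__main__":
    print(ram_above_4g(SAMPLE) // (1 << 20), "MiB of RAM above 4 GiB")
```

On a real system the text would come from reading `/proc/iomem` (typically as root, so that real addresses are shown). A non-zero result means DMA to that memory needs either 64-bit addressing support in the device or some form of translation.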
Comment 6 Ondrej Certik 2007-12-27 17:14:32 UTC
I see. The motherboard is:

Gigabyte GA-965P-DS3 - Intel P965
Comment 7 Stephen Hemminger 2007-12-27 19:04:32 UTC
I have that motherboard, and was using 4G of memory without problem.
Check your BIOS version, they are up to F11.
http://america.giga-byte.com/FileList/BIOS/motherboard_bios_ga-965p-ds3_f11.exe
Comment 8 Ondrej Certik 2007-12-29 09:37:58 UTC
Is there a way to get this information remotely over ssh?

But currently it doesn't matter, I cannot log in anymore

$ ssh -vvv august
[...]
debug1: SSH2_MSG_KEX_DH_GEX_REQUEST(1024<1024<8192) sent
debug1: expecting SSH2_MSG_KEX_DH_GEX_GROUP
Connection closed by 147.32.50.195

The kernel probably shot itself in the head again. I'll be sitting at this computer again on Wednesday, so I'll tell you then. While I'm there, is there something else I can check?
Comment 9 Ondrej Certik 2007-12-30 06:31:40 UTC
I managed to log in:

$ sudo dmidecode 
# dmidecode 2.9
SMBIOS 2.4 present.
39 structures occupying 1198 bytes.
Table at 0x000F0100.

Handle 0x0000, DMI type 0, 24 bytes
BIOS Information
	Vendor: Award Software International, Inc.
	Version: F4
	Release Date: 01/08/2007
[...]

So it seems the version is F4, while the newest one seems to be F11. But this computer is new, and the Release Date is recent, so maybe "dmidecode" is lying.
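
For checking the BIOS version on a box that is only reachable over ssh, the relevant fields can be pulled out of the `dmidecode` output with a few lines of script. This is a rough sketch, assuming the "BIOS Information" block layout shown above; the helper name `parse_bios_info` is hypothetical, and the sample text is copied from the output in this comment.

```python
def parse_bios_info(dmidecode_text):
    """Extract the fields of the 'BIOS Information' block into a dict."""
    info = {}
    in_bios = False
    for line in dmidecode_text.splitlines():
        stripped = line.strip()
        if stripped == "BIOS Information":
            in_bios = True
            continue
        if in_bios:
            # The block ends at a blank line or at the next "Handle" record.
            if not stripped or stripped.startswith("Handle "):
                break
            if ":" in stripped:
                key, _, value = stripped.partition(":")
                info[key.strip()] = value.strip()
    return info


# Sample taken from the dmidecode output quoted in this comment.
SAMPLE = """\
Handle 0x0000, DMI type 0, 24 bytes
BIOS Information
\tVendor: Award Software International, Inc.
\tVersion: F4
\tRelease Date: 01/08/2007
"""

if __name__ == "__main__":
    bios = parse_bios_info(SAMPLE)
    print(bios.get("Version"), bios.get("Release Date"))
```

In practice the input would be the captured text of `sudo dmidecode` run on the remote machine; the sketch only parses text and does not invoke the tool itself.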
Comment 10 Ondrej Certik 2008-02-03 10:16:58 UTC
So the problem was in the sky2 driver or hardware. I installed the latest released 2.6.24 kernel, which is in Debian, and tried both 64-bit and 32-bit with low mem. I can now reproduce the problem 100% of the time (no matter whether 64-bit, 32-bit, high mem, or low mem): I just need to download 50MB+ from the net and it hangs for sure.

I got fed up with this, so today I took an old RealTek RTL8139 network card, put it in, switched the ethernet cable, and it works like a charm. I stressed the computer a lot today: I downloaded 700MB+ from the net, upgraded hundreds of packages in Debian, and built 200MB of C++ sources in parallel (on all four processors) several times, and I haven't experienced a single hang. And this is on the 64-bit kernel + Debian.

So my problem is fixed now, but the real problem is imho either in the sky2 driver or in the hardware. I am not an expert in either, so unless someone tells me what to do, I cannot help much.

As for the BIOS: I didn't see its version at boot or in the BIOS setup, so all the information I can get is given in my previous posts.

Ondrej
Comment 11 Alan 2009-04-08 10:06:11 UTC
Closing old stale bugs, please re-open if still present in 2.6.29+
