Kernel Bug Tracker – Bug 6780
System freezes after 2 weeks when using SysKonnect SK-98xx based network card.
Last modified: 2006-07-22 14:03:59 UTC
Distribution: Debian sarge 3.1
Hardware Environment: not significant for this bug; tested on tens of different
Software Environment: Standard router with iptables firewall.
Problem Description: When the system has any network card (tested with
Marvell Technology Group Ltd. Yukon Gigabit Ethernet 10/100/1000Base-T Adapter
(rev 13) and 3Com Corporation 3c940 10/100/1000Base-T [Marvell] (rev 10)
network cards) that uses the SysKonnect SK-98xx driver (CONFIG_SK98LIN=y), and
that card has a network cable attached and an IP address set up (and there's
activity), the machine freezes completely after 14-15 days of uptime.
When I connect a keyboard to the machine, it doesn't react; even the NumLock LED
does not turn on/off. The only solution is to press the reset button.
I've noticed that each machine hangs very predictably: after 14-16 days.
I am also sure this is related to the SK-98xx driver, because when I remove the
mentioned network card and install another one, the problem goes away.
Steps to reproduce: Compile a kernel supporting SK-98xx, boot it, set up the
interface and just wait for about 2 weeks--the system should freeze.
Note: this might be a duplicate of #6277 (sorry, I'm tired of driving miles
away from my home in order to reboot a couple of machines; besides, those
machines need stability: they're network routers).
Some more notes: those systems are not using any kind of shaping system
(neither HTB nor CBQ), but the problem still occurs.
Strange. If it was 47 days then I'd say it's due to a jiffies
rollover. But nothing much happens after 14 days. Possibly a packet
count rollover?
Do you have the NMI watchdog enabled? Add `nmi_watchdog=1' to the
kernel boot command line. That'll get us a trace if the machine
has any life at all left in it. You'd need a serial console
or a digital camera to record it though.
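To put the jiffies remark in numbers, here is a minimal sketch of how long a
tick counter takes to wrap. It assumes jiffies behaves as an unsigned 32-bit
counter incremented HZ times per second; the HZ values are illustrative and
are not taken from the reporter's kernel configuration. At HZ=1000 the wrap
comes after roughly 49.7 days, which is presumably the figure meant above.
None of the listed values lands near 14-15 days.

```python
# Wrap period of a tick counter for a few illustrative HZ values.
# Assumption: the counter is unsigned, 32 bits wide, and ticks HZ times/second.
SECONDS_PER_DAY = 86400


def wrap_days(hz, bits=32):
    """Days until a `bits`-wide counter ticking at `hz` per second wraps."""
    return (2 ** bits) / hz / SECONDS_PER_DAY


for hz in (100, 250, 1000):
    print(f"HZ={hz}: wraps after {wrap_days(hz):.1f} days")
```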
I don't think this is a packet count rollover, because I use these network
cards on differently loaded machines (one transfers about 800 Gbs per day while
another only about 10) and the problem still occurs on both.
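The reasoning here can be checked with a little arithmetic. The sketch below
assumes a 32-bit byte counter and reads the two traffic figures as gigabytes
per day; both are assumptions, since the report does not state the counter
width or the exact unit. Whatever the width, the two machines would reach a
traffic counter's limit at times that differ by a factor of 80, which does not
fit a freeze that arrives after the same uptime on both.

```python
# Time for a traffic counter to wrap at two different daily loads.
# Assumptions: 32-bit byte counter, decimal gigabytes, steady traffic.
SECONDS_PER_DAY = 86400


def seconds_to_wrap(bytes_per_day, bits=32):
    """Seconds until a `bits`-wide byte counter wraps at a steady rate."""
    bytes_per_second = bytes_per_day / SECONDS_PER_DAY
    return (2 ** bits) / bytes_per_second


busy = seconds_to_wrap(800e9)   # heavily loaded router
quiet = seconds_to_wrap(10e9)   # lightly loaded router

print(f"busy router wraps after  {busy / 60:.1f} minutes")
print(f"quiet router wraps after {quiet / 3600:.1f} hours")
print(f"ratio quiet/busy: {quiet / busy:.0f}x")
```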
I don't have NMI watchdog enabled. Sorry, but I have neither a serial
console nor a digital camera.
Maybe there is some way or tool to reset device statistics manually without a
reboot? I mean something like resetting /proc/interrupts, transmitted bytes and
packets counters? Any ideas?
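This thread does not establish whether these counters can be reset without a
reboot. For watching them as uptime approaches two weeks, a minimal sketch
that reads the per-interface byte and packet counters from /proc/net/dev may
help. The function names are hypothetical, and the column positions assume
the usual layout (8 receive fields followed by 8 transmit fields).

```python
# Read per-interface byte/packet counters from /proc/net/dev.
# Assumption: two header lines, then "iface: <8 rx fields> <8 tx fields>".


def parse_net_dev(text):
    """Return {interface: counters} parsed from /proc/net/dev contents."""
    stats = {}
    for line in text.splitlines()[2:]:      # skip the two header lines
        if ":" not in line:
            continue
        name, data = line.split(":", 1)     # the number may directly follow ':'
        fields = data.split()
        if len(fields) < 10:
            continue
        stats[name.strip()] = {
            "rx_bytes": int(fields[0]),
            "rx_packets": int(fields[1]),
            "tx_bytes": int(fields[8]),
            "tx_packets": int(fields[9]),
        }
    return stats


def read_net_dev(path="/proc/net/dev"):
    """Parse the live counters (Linux only)."""
    with open(path) as f:
        return parse_net_dev(f.read())
```

Calling read_net_dev() from a periodic job and logging the result to disk
would leave a record of the counter values shortly before a freeze.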
Does the problem happen with the skge driver? The sk98lin driver
has a lot of messy private management code that is unnecessary.
The skge driver supports the same hardware, and is supported.
The sk98lin driver is not supported by the kernel community and
is planned to be obsoleted.
I'll try to test with the skge driver and then I'll post the results.
With this driver my systems no longer freeze at all; the problem seems
to be gone.
Since skge supersedes sk98lin now, and the problem is probably
in the vendor MIB portion of the driver, let's just accelerate removal
of the old driver.