Bug 7215

Summary: PCMCIA network card causes either X or kernel to freeze
Product: Drivers Reporter: Eric (ericabel)
Component: PCMCIA Assignee: Dominik Brodowski (linux)
Status: RESOLVED INSUFFICIENT_DATA    
Severity: normal CC: akpm, linux, protasnb, stefan
Priority: P2    
Hardware: i386   
OS: Linux   
Kernel Version: 2.6.18 Subsystem:
Regression: Yes Bisected commit-id:

Description Eric 2006-09-27 10:17:16 UTC
Most recent kernel where this bug did not occur:  2.6.8
Distribution:  Debian
Hardware Environment: Dell Inspiron 8500
Software Environment: 
Problem Description: I am using a Dell Inspiron 8500 with a PCMCIA wireless
network card.  With kernel 2.6.8 I could run the computer for months at a time
(as should be the case with a linux system, right?).  Upon upgrading from 2.6.8
to 2.6.15 (and subsequently from pcmcia-cs to pcmciautils) the system will
become unresponsive when browsing the internet.  At that time (when 2.6.15 was
first released) I browsed the lists, hoping for some solution, but I found no
reports of these same symptoms.  I then figured I would wait it out, hoping a
fix would come in future kernel releases, but I am up to 2.6.18, and the problem
still persists.  The unfortunate aspect of this problem is that it's not
predictable, though I can cause the system freeze quickly by browsing sites with
lots of javascript.  I've checked all of the log files that I can think of
(/var/log/messages, /var/log/syslog, /var/log/XFree86.log, etc.), and see no
indication of an error...it's almost as if the system freezes before it gets a
chance to log the problem.  I do believe it's related to the integration of
PCMCIA from pcmcia-cs to pcmciautils for the following reasons:
1.  It worked fine with kernel 2.6.8 and less.
2.  The problem seems to be browser independent (I can kill it just as quickly
with konqueror, or Mozilla)
3.  The problem is network card independent (I've killed the system with 3
different wireless network cards, one using ndiswrapper, 2 using linux drivers)
4.  Though I can't be certain, my feeling is that X is becoming unresponsive (to
either mouse or keyboard), as I tried booting runlevel 3 (text console), and
browsing with links while uploading/downloading large files, and after still not
freezing the system for 10 minutes, I gave up (normally I can manage to cause
the freeze in less than a minute if I'm trying...I just browse warez or porn
sites...lots of javascript there).

Like I mentioned earlier, I've been patiently awaiting a solution to this
problem to appear in the mailing lists for more than a year now, but to no
avail.  If anyone has any ideas, it would be much appreciated.

Thanks,

Eric

Steps to reproduce:
Comment 1 Andrew Morton 2006-09-27 10:45:29 UTC
It could be that the machine has oopsed, only we don't know about
it because you're stuck in X.

Are you able to set up a serial console?

netconsole is easier, but if it's a networking problem then that
might not give us any info either.

Comment 2 Eric 2006-09-27 12:56:38 UTC
Quoting bugme-daemon@bugzilla.kernel.org:

> http://bugzilla.kernel.org/show_bug.cgi?id=7215
>
>
>
>
>
> ------- Additional Comments From akpm@osdl.org  2006-09-27 10:45 -------
> It could be that the machine has oopsed, only we don't know about
> it because you're stuck in X.
>
> Are you able to set up a serial console?
>
> netconsole is easier, but if it's a networking problem then that
> might not give us any info either.
>

I can still connect through the ethernet port, which is unaffected by this
problem.  I will try that, and see if I can get some more clues.  Anything
specific I should look for when the problem occurs?

>
>
> ------- You are receiving this mail because: -------
> You reported the bug, or are watching the reporter.
>


Comment 3 Andrew Morton 2006-09-27 13:04:06 UTC
On Wed, 27 Sep 2006 13:07:00 -0700
bugme-daemon@bugzilla.kernel.org wrote:

> I can still connect through the ethernet port, which is unaffected by this
> problem.

You can?  That's interesting.  When you do so, is there nothing interesting
in the `dmesg' output?

This means that there's no point in setting up a serial console or anything.

If you can still connect to the machine, then what do you mean by "it
freezes"?  Just that the X interface is wedged up?

If so, run `top' and `ps aux', see if some process is stuck spinning in a
loop or something.

Could it just be that the keyboard/mouse have gone bad?

What happens if you do `sudo killall X' when logged in over the network? 
Does the X server terminate?  Can it be restarted?

etcetera....
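
The `ps aux' check suggested above could be automated along these lines.  This is only a rough sketch: it assumes the usual 11-column `ps aux' layout (USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND), and the function name and the 90% threshold are made up for illustration.

```python
def busy_processes(ps_output, cpu_threshold=90.0):
    """Return (pid, cpu, stat, command) for processes that look stuck.

    A process is reported if its %CPU is at or above cpu_threshold
    (possibly spinning in a loop) or if its state starts with "D"
    (uninterruptible sleep, often waiting on a driver or I/O).
    """
    rows = []
    for line in ps_output.splitlines()[1:]:  # skip the header line
        fields = line.split(None, 10)        # COMMAND may contain spaces
        if len(fields) < 11:
            continue                         # not a full ps row
        pid, cpu, stat, command = fields[1], float(fields[2]), fields[7], fields[10]
        if cpu >= cpu_threshold or stat.startswith("D"):
            rows.append((int(pid), cpu, stat, command))
    return rows
```

Feed it the captured text of `ps aux' taken over the network login while the X display is wedged.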

Comment 4 Eric 2006-09-27 15:36:35 UTC
Quoting bugme-daemon@bugzilla.kernel.org:

> http://bugzilla.kernel.org/show_bug.cgi?id=7215
>
>
>
>
>
> ------- Additional Comments From akpm@osdl.org  2006-09-27 13:04 -------
> On Wed, 27 Sep 2006 13:07:00 -0700
> bugme-daemon@bugzilla.kernel.org wrote:
>
>> I can still connect through the ethernet port, which is unaffected by this
>> problem.
>
> You can?  That's interesting.

Scratch that...I can't.  I meant, in theory, I can connect through the ethernet
port, but I just tried, and the netconsole I was working on went dead when the
computer froze.

> When you do so, is there nothing interesting in the `dmesg' output?

Even without doing this, I thought dmesg just printed the contents of
/var/log/messages, which has a time stamp.  I have tried looking at the log
file right before it died (to see the last time stamp); the next series of
messages is associated with the new boot after restarting (starting with the
message "restart").  Basically I can see no messages which seem to occur
around the time the system freezes up, much less anything which indicates a
problem (in syslog either).  This is why the problem is so perplexing to me.
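
The log check described above (find the last entry before the freeze and the first entry of the next boot) can be sketched as a search for the largest jump between consecutive time stamps.  This is a minimal illustration, assuming the traditional syslog "Mon DD HH:MM:SS" prefix, which carries no year, so the caller supplies one; the function name is hypothetical.

```python
from datetime import datetime

def largest_gap(lines, year=2006):
    """Return (gap_seconds, line_before, line_after) for the biggest jump
    between consecutive syslog time stamps, or None if fewer than two
    lines could be parsed."""
    stamped = []
    for line in lines:
        try:
            # Traditional syslog prefix, e.g. "Sep 27 10:17:16" (15 chars).
            ts = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # skip lines without a parsable time stamp
        stamped.append((ts, line.rstrip("\n")))
    best = None
    for (t0, l0), (t1, l1) in zip(stamped, stamped[1:]):
        gap = (t1 - t0).total_seconds()
        if best is None or gap > best[0]:
            best = (gap, l0, l1)
    return best
```

Called on the lines of /var/log/messages, the second element of the result is the last line logged before the silence, which is the place to look for any hint of the hang.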

>
> This means that there's no point in setting up a serial console or anything.

I guess there is...but it'll be a while.  I'll need to get a serial
cable...haven't had need for one of those in years.

>
> If you can still connect to the machine, then what do you mean by "it
> freezes"?  Just that the X interface is wedged up?

At this point, I can say that I have no interaction with the machine, either
with the mouse or keyboard.

>
> If so, run `top' and `ps aux', see if some process is stuck spinning in a
> loop or something.
>
> Could it just be that the keyboard/mouse have gone bad?

No, the computer will run forever if I use only the ethernet connection, it's
only when I connect wirelessly that the problem manifests itself.

>
> What happens if you do `sudo killall X' when logged in over the network?
> Does the X server terminate?  Can it be restarted?

Don't know, haven't been able to interact with the computer once it's frozen
up yet.

>
> etcetera....


Comment 5 Eric 2006-10-16 08:36:02 UTC
OK, I've ascertained now that it is the kernel which seizes up...nothing works.
The serial terminal freezes up when the computer freezes...and I still can't
find any trace of error messages.

Eric



Comment 6 Eric 2006-10-25 13:09:32 UTC
Hello,  I hope you haven't closed out this bug report because of the time lag
since my last email, but I just wanted to let you know that the problem still
persists, and I am out of ideas as to what to check.  It seems to be a kernel
problem, since when it freezes, the serial terminal becomes unresponsive, and
any running subprocesses also hang.  I do think that the error occurs before
any error data can be logged, because I still can't find any trace of an error
in any log file.  Unfortunately going back to the 2.6.8 kernel isn't so easy
for me at this point, so I am dealing with the repeated system shutdowns every
time this happens.  Any ideas?

Eric



Comment 7 Natalie Protasevich 2007-07-04 19:55:44 UTC
Eric,
As I understand it, your laptop freezes, and after reboot you don't see anything logged in /var/log/messages or /var/log/Xorg.log, and without X, in text mode, your keyboard and mouse and the whole system work fine?
You also mentioned that you can't go back to 2.6.8 because of system shutdowns. Can you explain more about this?
For more information, please run "dmesg -n 7" to increase the verbosity of system messages, and after the next freeze/reboot collect the log files and attach them to the bugzilla. You can also increase the verbosity of Xorg: see the Xorg man page for the logverbose and verbose parameters.
Thanks.
Comment 8 Eric 2007-07-09 08:26:19 UTC
Subject: Re: PCMCIA network card causes either X or kernel to freeze

Quoting bugme-daemon@bugzilla.kernel.org:

> http://bugzilla.kernel.org/show_bug.cgi?id=7215
>
>
> protasnb@gmail.com changed:
>
>           What    |Removed                     |Added
> ----------------------------------------------------------------------------
>                 CC|                            |protasnb@gmail.com
>
>
>
>
> ------- Comment #7 from protasnb@gmail.com  2007-07-04 19:55 -------
> Eric,
> As I understand, your laptop freezes, and after reboot you don't see anything
> logged in /var/log/messages, /var/log/Xorg.log, and without X in text mode your
> keyboard and mouse and the whole system work fine?

Basically, I have not been able to reproduce the problem without booting X.  The
system has not yet frozen in console mode.  I have since reverted to 2.6.8,
and have not upgraded kernels since, due to this issue.  One workaround would
be to use the pcmcia-cs packages with a later kernel, but for now the system is
working fine with the 2.6.8 configuration.

However, on very rare occasions, when I am using a lot of bandwidth on my
wireless network card, I have experienced the same sort of system freeze-up
even in 2.6.8.  I only mention it because this may be evidence that the problem
has nothing to do with the pcmcia driver compiled into the later kernel
versions, but with something more fundamental at the hardware level?

Anyway, I'm still not sure if X is the problem or not.  I will try the
dmesg -n 7 and reproduce the system freeze and see if the log catches anything.

Thanks for the response,

Eric

Comment 9 Andrew Morton 2007-08-02 17:26:30 UTC
We can't make much progress without seeing the kernel printks.  There's probably something there, only we're not seeing it.  Options would be:

- When X is running, do the ctrl-alt-F1 thing to get back to the VGA console,
  and try to get X (which is still running) to freeze in this state.  Maybe you'll
  see some messages.

- Get that serial console working!

- Have you tried getting netconsole working over the wired ethernet?

If we _can_ get those messages coming out then we can go further and investigate the NMI watchdog, which can be useful in locating random hangs.  But until we can see those messages there isn't much point in setting that up.
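
For the netconsole option, the receiving end can be very simple, because netconsole sends kernel messages as plain UDP datagrams.  Below is a minimal listener sketch to run on a second machine on the wired network.  It assumes the sender targets UDP port 6666 (believed to be the netconsole default; adjust to match your setup), and the log file name and function names are invented for illustration.

```python
import socket
import time

def format_record(payload: bytes, sender: str) -> str:
    """Prefix a received kernel message with local time and sender address."""
    text = payload.decode("utf-8", errors="replace").rstrip("\n")
    return f"{time.strftime('%Y-%m-%d %H:%M:%S')} {sender} {text}"

def listen(port: int = 6666, logfile: str = "netconsole.log") -> None:
    """Receive UDP datagrams forever and append them to logfile."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("", port))  # all interfaces
    with open(logfile, "a", buffering=1) as out:  # line buffered
        while True:
            payload, (host, _port) = sock.recvfrom(4096)
            out.write(format_record(payload, host) + "\n")

# To start capturing, call listen() and leave it running until the hang.
```

Because every record is stamped on the receiving side, the last line in the file shows what the kernel managed to send before it went quiet, even if the frozen machine logged nothing locally.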
Comment 10 Eric 2007-08-03 08:29:18 UTC
Quoting bugme-daemon@bugzilla.kernel.org:

> http://bugzilla.kernel.org/show_bug.cgi?id=7215
>
>
> akpm@osdl.org changed:
>
>           What    |Removed                     |Added
> ----------------------------------------------------------------------------
>         Regression|0                           |1
>
>
>
>
> ------- Comment #9 from akpm@osdl.org  2007-08-02 17:26 -------
> We can't make much progress without seeing the kernel printks.  There's
> probably something there, only we're not seeing it.  Options would be:
>
> - When X is running, do the ctrl-alt-F1 thing to get back to the VGA console,
>   and try to get X (which is still running) to freeze in this state.  Maybe
>   you'll see some messages

I'll give this a try.

>
> - Get that serial console working!
>
> - have you tried getting netconsole working over the wired ethernet?

Both of these freeze up along with X (if X is indeed causing the problem)

>
> If we _can_ get those messages coming out then we can go further and
> investigate the NMI watchdog, which can be useful in locating random hangs.
> But until we can see those messages there isn't much point in setting that up.

I understand.

>
>
> --
> Configure bugmail: http://bugzilla.kernel.org/userprefs.cgi?tab=email
> ------- You are receiving this mail because: -------
> You reported the bug, or are watching the reporter.
>
Comment 11 Natalie Protasevich 2007-11-07 21:34:13 UTC
Eric, any progress on debugging? Do you need any assistance with debug procedures?
Comment 12 Eric 2007-11-08 08:27:50 UTC
If you have any ideas that haven't been already suggested, they would be greatly
appreciated.  So far I have been unable to recover any error messages
associated with the system freeze-up...a remote terminal (ssh) freezes up, as
does a serial terminal.  I tried writing a perl script which prints the time to
a text file; then I freeze the system and check the text file, and indeed,
the script stops with the system freeze.  This is the most information I have
been able to accumulate.

Eric
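
A heartbeat script of the kind described above might look like the following.  This is a Python sketch rather than the original perl (which was not posted); the function name and parameters are assumptions.  The one detail that matters is forcing each time stamp to disk, so that the last line in the file is close to the moment of the freeze rather than the moment of the last buffer flush.

```python
import os
import time

def heartbeat(path, interval=1.0, beats=None):
    """Append a time stamp to path every interval seconds.

    Runs forever when beats is None; otherwise stops after that many
    lines (handy for trying it out).
    """
    count = 0
    with open(path, "a") as f:
        while beats is None or count < beats:
            f.write(time.strftime("%Y-%m-%d %H:%M:%S") + "\n")
            f.flush()             # push Python's buffer to the OS
            os.fsync(f.fileno())  # ask the OS to write it to disk now
            count += 1
            time.sleep(interval)
```

After the reboot, the final time stamp in the file gives the freeze time to within roughly one interval, which narrows down where to look in the other logs.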

Comment 13 Natalie Protasevich 2007-11-08 08:52:21 UTC
How about the NMI watchdog?
You can enable it with "nmi_watchdog=<1 or 2>" on the boot line.

1 selects the IO-APIC based NMI and 2 the local APIC based one; which to use depends on your hardware. I can tell more precisely if you attach /proc/interrupts and dmesg.
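
One way to check whether the chosen watchdog mode is actually ticking is to watch the NMI count in /proc/interrupts: it should grow over time, and a count that stays at zero suggests that mode is not working on this hardware.  A small sketch that extracts the count, assuming an "NMI:" row with one numeric column per CPU (the function name is made up):

```python
def nmi_count(interrupts_text):
    """Sum the per-CPU NMI counters from /proc/interrupts text.

    Returns None if there is no NMI row at all.
    """
    for line in interrupts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "NMI:":
            total = 0
            for field in fields[1:]:
                if not field.isdigit():
                    break  # reached the trailing description, if any
                total += int(field)
            return total
    return None
```

Read /proc/interrupts twice, a few seconds apart, and compare the two results; an increasing value means the watchdog is armed.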
Comment 14 Natalie Protasevich 2008-03-30 12:22:03 UTC
Eric, any updates on this? How is it working with a recent kernel?
Comment 15 Stefan Hegny 2008-05-06 13:22:30 UTC
Seems I have exactly the same problem on a TP23 with Debian 4.0, 2.6.18.
The description matches; I can provoke the error by pushing more network traffic through the card.
If debugging help is still needed, I can jump in.

Regards,
Stefan

> Eric, any updates on this? How is it working with recent kernel?
> 
Comment 16 Natalie Protasevich 2008-05-06 13:44:54 UTC
I can suggest git bisect, since you have a known good working version. You can try to identify what breaks it going from, say, 2.6.8 to 2.6.9. See
http://www.kernel.org/doc/local/git-quick.html

It is a somewhat elaborate process, but it should show the kernel change that broke things on your laptop.
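
The idea behind bisection is an ordinary binary search between a known good and a known bad version, which is why it needs only about log2(N) test boots for N candidate changes.  The sketch below shows the principle only; it is not how git implements it.  The version list and the is_bad test are hypothetical, and it assumes every version before the breaking change is good and every one after it is bad.

```python
def first_bad(versions, is_bad):
    """Find the first bad entry in versions (ordered oldest to newest).

    versions[0] must be known good and versions[-1] known bad;
    is_bad(v) stands for "build v, boot it, and try to provoke the freeze".
    """
    lo, hi = 0, len(versions) - 1  # lo is known good, hi is known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(versions[mid]):
            hi = mid               # breakage is at mid or earlier
        else:
            lo = mid               # breakage is after mid
    return versions[hi]
```

With eight candidate releases this takes three test boots; git bisect applies the same halving to individual commits.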
Comment 17 Dominik Brodowski 2009-10-17 12:19:58 UTC
Could you post the output of "lspcmcia -vvv" please?
Comment 18 Eric 2009-10-18 18:56:33 UTC
You can probably close this out.  The computer this applies to is dead.

Eric

Comment 19 Dominik Brodowski 2010-02-19 18:34:16 UTC
Closing this bug, as no more data is available.