Bug 27792

Summary: Fan speed freezes, and hotkeys stop working. Lenovo Thinkpad L512
Product: Drivers Reporter: amnesia
Component: Platform_x86 Assignee: drivers_platform_x86 (drivers_platform_x86)
Status: ASSIGNED ---    
Severity: blocking CC: aaron.lu, acpi-bugzilla, alan, amnesia, evol.ig, jdelvare, jp-bug-report, jrnieder, lenb, mjg59-kernel, rjw, rostedt, rui.zhang, szg00000
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 3.6.2 Subsystem:
Regression: Yes Bisected commit-id:
Bug Depends on:    
Bug Blocks: 16444    
Attachments: 2.6.36-4 while everything's still working normally
2.6.39-rc9 while everything's still working normally
acpidump of 2.6.39-rc6 when the bug is occuring
acpidump of 2.6.36-4 when the bug is occuring
acpidump of 3.0-rc5 when the bug is occuring
acpidump 3.2.0-2-amd64 when the bug is occuring
dmesg output after the problem occurs on 3.3.0
dmesg
dmesg2
acpidump bios 1.37
acpidump bios 1.37

Description amnesia 2011-01-29 11:24:01 UTC
Hi, I'm not sure whether I'm submitting this in the right category; my apologies if it isn't.

I noticed a strange issue a couple of weeks ago. On kernel revisions lower than 2.6.37 I was hitting another bug (Bug 25542), which was fixed in 2.6.37.
So here is the strange problem.

When I run Linux for 1 to 3 hours (the time varies), the fan gets stuck at a fixed speed no matter what the temperature is, and the Fn keys also stop working. Until the problem appears, everything runs fine.

A reboot, kexec, or power-down/power-up with the plug kept in doesn't help; the problem persists after those actions. Once it has occurred, the only solution is to turn the machine off, pull out the plug (I usually use it without the battery in), put it back in, and boot up. After that everything works normally again, but the problem comes back after another 1 to 3 hours of usage or idling.

When the problem is present and I reboot (with the AC cable kept in) into a live CD (Ubuntu 10.10, 2.6.35-22), the problem persists.

Unloading and reloading thinkpad_acpi also had no effect, and the kernel doesn't log anything about it.

Everything works fine on Windows.
Comment 1 amnesia 2011-01-30 21:55:58 UTC
I'm not having this issue on 2.6.35-22, so I will try the latest git tree tomorrow.
Comment 2 amnesia 2011-02-01 17:40:08 UTC
Problem also exists in 2.6.38-rc3
Comment 3 Rafael J. Wysocki 2011-04-26 19:48:48 UTC
We need to figure out how the fan is really controlled.

Please see what's under /sys/class/thermal/ on your system (preferably with
the current mainline kernel).
Comment 4 Len Brown 2011-04-26 21:02:15 UTC
please attach the output from acpidump,
available here:
http://userweb.kernel.org/~lenb/acpi/utils/pmtools/acpidump/
Comment 5 amnesia 2011-05-06 12:45:49 UTC
Created attachment 56832 [details]
2.6.36-4 while everything's still working normally
Comment 6 amnesia 2011-05-06 12:52:02 UTC
Rafael J. Wysocki:

/sys/class/thermal/:
total 0
lrwxrwxrwx 1 root root 0 May  6 14:04 cooling_device0 -> ../../devices/virtual/thermal/cooling_device0
lrwxrwxrwx 1 root root 0 May  6 14:04 cooling_device1 -> ../../devices/virtual/thermal/cooling_device1
lrwxrwxrwx 1 root root 0 May  6 14:04 cooling_device2 -> ../../devices/virtual/thermal/cooling_device2
lrwxrwxrwx 1 root root 0 May  6 14:04 cooling_device3 -> ../../devices/virtual/thermal/cooling_device3
lrwxrwxrwx 1 root root 0 May  6 14:04 cooling_device4 -> ../../devices/virtual/thermal/cooling_device4
lrwxrwxrwx 1 root root 0 May  6 14:04 thermal_zone0 -> ../../devices/virtual/thermal/thermal_zone0

That's on 2.6.36-4. I'm compiling the current mainline at the moment and will post its output later.

Len Brown:

I've attached the acpidump made on 2.6.36-4, and will soon provide one made on the current mainline. I should note that I took acpidump from the Debian repositories, since I couldn't compile the source due to wrong header files; not sure whether that is an issue?

NOTE:
In the comments above I said the problem didn't occur in 2.6.35-22, but that wasn't true; it just takes longer for the problem to occur on that kernel. So basically the problem exists in all recent kernels, but it takes longer to occur in 2.6.35* and 2.6.36*.
Comment 7 amnesia 2011-05-06 13:02:24 UTC
Rafael J. Wysocki:

/sys/class/thermal/:
total 0
lrwxrwxrwx 1 root root 0 May  6 14:59 cooling_device0 -> ../../devices/virtual/thermal/cooling_device0
lrwxrwxrwx 1 root root 0 May  6 14:59 cooling_device1 -> ../../devices/virtual/thermal/cooling_device1
lrwxrwxrwx 1 root root 0 May  6 14:59 cooling_device2 -> ../../devices/virtual/thermal/cooling_device2
lrwxrwxrwx 1 root root 0 May  6 14:59 cooling_device3 -> ../../devices/virtual/thermal/cooling_device3
lrwxrwxrwx 1 root root 0 May  6 14:59 cooling_device4 -> ../../devices/virtual/thermal/cooling_device4
lrwxrwxrwx 1 root root 0 May  6 14:59 thermal_zone0 -> ../../devices/virtual/thermal/thermal_zone0

That's on 2.6.39-rc6.
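
The listings above show only the symlink names. A minimal sketch for reading what each entry reports, assuming the generic sysfs thermal attributes (`type`, `temp` in millidegrees Celsius, `cur_state`, `max_state`). The `read_thermal` helper and its `base` parameter are hypothetical names for illustration, not something from this report:

```python
from pathlib import Path


def _read(path):
    """Return the stripped contents of a sysfs attribute, or None if unreadable."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def read_thermal(base="/sys/class/thermal"):
    """Collect type and state for every thermal zone and cooling device.

    `base` is a parameter so the function can be pointed at a copy of the
    directory; an empty dict is returned if the directory does not exist.
    """
    base = Path(base)
    if not base.is_dir():
        return {}
    report = {}
    for entry in sorted(base.iterdir()):
        if entry.name.startswith("thermal_zone"):
            temp = _read(entry / "temp")
            report[entry.name] = {
                "type": _read(entry / "type"),
                # temp is exposed in millidegrees Celsius
                "temp_c": int(temp) / 1000 if temp is not None else None,
            }
        elif entry.name.startswith("cooling_device"):
            report[entry.name] = {
                "type": _read(entry / "type"),
                "cur_state": _read(entry / "cur_state"),
                "max_state": _read(entry / "max_state"),
            }
    return report
```

Calling `read_thermal()` once while everything works and once while the bug is occurring would show whether any `cur_state` or `temp` value differs, which the directory listing alone cannot.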
Comment 8 amnesia 2011-05-06 13:04:21 UTC
Created attachment 56842 [details]
2.6.39-rc9 while everything's still working normally
Comment 9 amnesia 2011-05-06 16:32:39 UTC
Acpidump seems to be 3 times faster on 2.6.36-4 than on 2.6.39-rc6; I don't know how that's possible.
Comment 10 amnesia 2011-05-06 17:02:09 UTC
Created attachment 56872 [details]
acpidump of 2.6.39-rc6 when the bug is occuring

The contents of /sys/class/thermal/ are the same when the bug is occurring and when everything is fine.
Comment 11 amnesia 2011-05-11 17:11:13 UTC
Created attachment 57382 [details]
acpidump of 2.6.36-4 when the bug is occuring
Comment 12 amnesia 2011-06-29 17:28:36 UTC
Created attachment 63892 [details]
acpidump of 3.0-rc5 when the bug is occuring

So the problem also persists in 3.0-rc5. Exactly the same behavior.
Comment 13 Zhang Rui 2012-01-18 02:42:03 UTC
It's great that kernel bugzilla is back.

can you please verify if the problem still exists in the latest upstream
kernel?
Comment 14 Amnesia 2012-03-19 14:00:12 UTC
The problem also occurs in 3.1.0, 3.2.0 and 3.3.0. I'm still running 2.6.35.14, but I'd like to upgrade.
Comment 15 Amnesia 2012-03-19 14:01:33 UTC
(The reason I changed email addresses is that I lost my previous email address/password.)
Comment 16 Amnesia 2012-03-28 22:08:34 UTC
Could anyone by any chance look at this ticket?
Comment 17 evol 2012-04-03 15:14:10 UTC
ThinkPad L512 (2597AB2)
Debian GNU/Linux kernel 3.2.0-2-amd64
similar problem.
Comment 18 evol 2012-04-03 19:04:40 UTC
Created attachment 72801 [details]
acpidump 3.2.0-2-amd64 when the bug is occuring
Comment 19 Amnesia 2012-04-04 12:53:33 UTC
Is there some way to increase the severity of this ticket (now that someone else has also experienced it)?
Comment 20 evol 2012-04-05 04:06:16 UTC
I'd also like to know.
Comment 21 Zhang Rui 2012-04-06 02:07:12 UTC
so this is a regression, right?

please attach the dmesg output after the problem occurs.
Comment 22 Amnesia 2012-04-06 12:39:48 UTC
Yes that's right.
Comment 23 Amnesia 2012-04-06 12:40:19 UTC
Created attachment 72829 [details]
dmesg output after the problem occurs on 3.3.0
Comment 24 Amnesia 2012-04-06 13:02:57 UTC
Note: when the problem occurs, xev also doesn't pick up any keystrokes. The following is reported when the problem does NOT occur:

Volume up:

KeymapNotify event, serial 27, synthetic NO, window 0x0,
    keys:  2   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   
           0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0 

Volume down:

KeymapNotify event, serial 27, synthetic NO, window 0x0,
    keys:  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   
           0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0

KeymapNotify event, serial 27, synthetic NO, window 0x0,
    keys:  2   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   
           0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0 

KeymapNotify event, serial 27, synthetic NO, window 0x0,
    keys:  2   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   
           0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0  

Another issue, which might not be related: before the problem occurs, pressing the mute button works fine. But when I press volume up/down after using the mute button twice (so I turned mute on and off again), the sound also mutes; this can be undone by pressing the mute button again.
Comment 25 evol 2012-04-07 19:27:10 UTC
Created attachment 72847 [details]
dmesg

The keys often stop working after hibernation, but it also happens after a simple power-off.
Of the Fn combinations, only these always work: Fn+F10 = NumLk; Fn+Insert = PrtSc; Fn+Down/Up/Left/Right = play, stop, ...
Also, the fan reacts poorly to the temperature, so it is necessary to use thinkfan.

3.2.0-2-amd64
Comment 26 evol 2012-04-08 09:04:17 UTC
Created attachment 72850 [details]
dmesg2

Here is the dmesg after 2 hours of normal operation, after which the keys stopped working again.
Comment 27 Amnesia 2012-04-17 16:55:54 UTC
Zhang Rui: Is there anything else I can do to help get this bug resolved?
Comment 28 evol 2012-04-21 13:26:22 UTC
I will also help however I can!
Comment 29 evol 2012-04-21 14:31:53 UTC
lenovo.com has a new BIOS version: 81ET61WW (1.37), 03/28/2012.
I have updated. We'll see what happens.
Comment 30 evol 2012-04-21 15:56:48 UTC
unchanged
Comment 31 Amnesia 2012-04-25 16:45:42 UTC
Same over here, same behavior on 1.37. Would you by any chance want an acpidump on 1.37?
Comment 32 evol 2012-04-26 15:05:16 UTC
Created attachment 73098 [details]
acpidump
bios 1.37

All keys are
Comment 33 evol 2012-04-26 15:07:01 UTC
Created attachment 73099 [details]
acpidump bios 1.37

keys do not work
Comment 34 evol 2012-06-05 11:50:10 UTC
tell me the problem kontognibut do?
Comment 35 Amnesia 2012-06-06 13:31:38 UTC
evol what did you mean with the above? What does "kontognibut" stand for?
Comment 36 evol 2012-06-06 15:05:06 UTC
Excuse my bad English =) (google.translate)

What I wanted to ask:
is anyone working on this problem?
Is it worth waiting at all?
Comment 37 Amnesia 2012-07-03 07:51:47 UTC
Is there some way to draw more attention to this case? I really need to upgrade my kernel but am not able to because of this bug.
Comment 38 Alan 2012-07-03 08:57:33 UTC
Bugzilla is not a support forum; it's a bug tracking system.

If you need to get something fixed to a timescale then talk to your distribution or whoever your provider is.
Comment 39 Jonathan Nieder 2012-09-30 01:09:28 UTC
Hi!

A few quick questions:

 1. You mentioned that everything works fine on Windows.  When you
    experience this problem, does rebooting into Windows and then
    back to Linux help?

 2. I think you mentioned that some Ubuntu versions do not have
    this problem.  Which ones?  Ideally a list summarizing the kernel
    versions you have tried and what happened with each would be useful.

 3. Have you tried a 3.5.y or newer kernel (like the one from Debian
    experimental)?  If so, how did it behave?

That should do for now.
Comment 40 Amnesia 2012-10-02 09:26:39 UTC
------------------------------------------------------------------------
 1. You mentioned that everything works fine on Windows.  When you
    experience this problem, does rebooting into Windows and then
    back to Linux help?

No it does not, a reboot is needed to resolve the problem.

 2. I think you mentioned that some Ubuntu versions do not have
    this problem.  Which ones?  Ideally a list summarizing the kernel
    versions you have tried and what happened with each would be useful.

2.6.35-22: "The problem" occurs occasionally.
2.6.36-4: "The problem" occurs often.
2.6.38-rc3: "The problem" occurs often.
2.6.39-rc6: "The problem" occurs often.
2.6.39-rc9: "The problem" occurs often.
3.0-rc5: "The problem" occurs often.
3.1.0: "The problem" occurs often.
3.2.0: "The problem" occurs often.
3.3.0: "The problem" occurs often.

 3. Have you tried a 3.5.y or newer kernel (like the one from Debian
    experimental)?  If so, how did it behave?

Yes I did, I tried 3.6, and it had exactly the same behaviour.

------------------------------------------------------------------------

Is there any more information I can gather for you in order to debug the problem?
Comment 41 Amnesia 2012-10-02 09:28:07 UTC
Sorry, I answered too quickly.

 1. You mentioned that everything works fine on Windows.  When you
    experience this problem, does rebooting into Windows and then
    back to Linux help?

Rebooting does not solve this problem. The only thing that solves this is shutting down completely.
Comment 42 jayp 2012-10-04 09:55:28 UTC
I have the same issue.
For me, just shutting down and restarting does not help. I also have to disconnect the AC and remove the Battery for a few seconds. Then the FN keys and Fan work again until the problem reappears after a while.

Everything works fine on Windows.
Comment 43 Amnesia 2012-10-04 17:03:53 UTC
Same over here. I don't need to remove the battery because I never use it. So the problem only gets fixed when the machine is disconnected from any power source.
Comment 44 evol 2012-10-11 18:37:42 UTC
>  1. You mentioned that everything works fine on Windows.  When you
>     experience this problem, does rebooting into Windows and then
>     back to Linux help?

Rebooting does not solve this problem. The problem disappears only after disconnecting both AC power and the battery.

>  2. I think you mentioned that some Ubuntu versions do not have
>     this problem.  Which ones?  Ideally a list summarizing the kernel
>     versions you have tried and what happened with each would be useful.

3 days of uptime; the problem has not appeared.

evol@evol-ThinkPad-L512:~$ lsb_release -d
Description:	Ubuntu 12.04.1 LTS

evol@evol-ThinkPad-L512:~$ uname -a
Linux evol-ThinkPad-L512 3.2.0-31-generic #50-Ubuntu SMP Fri Sep 7 16:16:45 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

evol@evol-ThinkPad-L512:~$ lsmod 
Module                  Size  Used by
nfsd                  277809  2 
nfs                   356410  1 
lockd                  86161  2 nfsd,nfs
fscache                61529  1 nfs
auth_rpcgss            53380  2 nfsd,nfs
nfs_acl                12883  2 nfsd,nfs
sunrpc                245464  17 nfsd,nfs,lockd,auth_rpcgss,nfs_acl
snd_hrtimer            12744  1 
nls_utf8               12557  0 
isofs                  40257  0 
nls_iso8859_1          12713  0 
nls_cp437              16991  0 
vfat                   17585  0 
fat                    61512  1 vfat
uas                    18180  0 
usb_storage            49198  2 
snd_usb_audio         122982  2 
snd_usbmidi_lib        25395  1 snd_usb_audio
joydev                 17693  0 
snd_hda_codec_hdmi     32474  1 
parport_pc             32866  0 
bnep                   18281  2 
rfcomm                 47604  0 
ppdev                  17113  0 
bluetooth             180104  10 bnep,rfcomm
arc4                   12529  2 
snd_hda_codec_realtek   224173  1 
rtl8192se              99989  0 
psmouse                97362  0 
serio_raw              13211  0 
rtlwifi               111202  1 rtl8192se
mac80211              506816  2 rtl8192se,rtlwifi
uvcvideo               72627  0 
videodev               98259  1 uvcvideo
v4l2_compat_ioctl32    17128  1 videodev
cfg80211              205544  2 rtlwifi,mac80211
jmb38x_ms              17646  0 
memstick               16569  1 jmb38x_ms
snd_hda_intel          33773  5 
snd_hda_codec         127706  3 snd_hda_codec_hdmi,snd_hda_codec_realtek,snd_hda_intel
snd_hwdep              13668  2 snd_usb_audio,snd_hda_codec
thinkpad_acpi          81819  0 
snd_pcm                97188  5 snd_usb_audio,snd_hda_codec_hdmi,snd_hda_intel,snd_hda_codec
snd_seq_midi           13324  0 
radeon                804426  3 
snd_rawmidi            30748  2 snd_usbmidi_lib,snd_seq_midi
mei                    41616  0 
snd_page_alloc         18529  2 snd_hda_intel,snd_pcm
ttm                    76949  1 radeon
drm_kms_helper         46978  1 radeon
drm                   242038  5 radeon,ttm,drm_kms_helper
i2c_algo_bit           13423  1 radeon
snd_seq_midi_event     14899  1 snd_seq_midi
wmi                    19256  0 
snd_seq                61896  3 snd_seq_midi,snd_seq_midi_event
snd_timer              29990  4 snd_hrtimer,snd_pcm,snd_seq
snd_seq_device         14540  3 snd_seq_midi,snd_rawmidi,snd_seq
snd                    78855  26 snd_usb_audio,snd_usbmidi_lib,snd_hda_codec_hdmi,snd_hda_codec_realtek,snd_hda_intel,snd_hda_codec,snd_hwdep,thinkpad_acpi,snd_pcm,snd_rawmidi,snd_seq,snd_timer,snd_seq_device
soundcore              15091  1 snd
nvram                  14413  1 thinkpad_acpi
video                  19596  0 
coretemp               13525  0 
mac_hid                13253  0 
lp                     17799  0 
parport                46562  3 parport_pc,ppdev,lp
usbhid                 47199  0 
hid                    99559  1 usbhid
r8169                  62099  0 
sdhci_pci              18826  0 
sdhci                  33205  1 sdhci_pci

>  3. Have you tried a 3.5.y or newer kernel (like the one from Debian
>     experimental)?  If so, how did it behave?

I will try
Comment 45 evol 2012-10-14 19:03:10 UTC
>  3. Have you tried a 3.5.y or newer kernel (like the one from Debian
>     experimental)?  If so, how did it behave?

evol@evolaptop:~$ uname -a
Linux evolaptop 3.5-trunk-amd64 #1 SMP Debian 3.5.5-1~experimental.1 x86_64 GNU/Linux

same problem
Comment 46 evol 2012-10-19 04:05:02 UTC
Built kernel 3.6.2.
same problem
Comment 47 Amnesia 2012-10-29 19:52:28 UTC
Just found the following references to this case:

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=687853
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1048544

Is there any way to raise attention to this case? I really need to upgrade but am unable to due to this bug.
Comment 48 Jonathan Nieder 2012-10-29 20:27:38 UTC
(In reply to comment #47)
> Is there anyway to raise attention towards this case, I really need to
> upgrade
> but am unable to due to this bug.

Can you bisect?
Comment 49 Jonathan Nieder 2012-10-29 20:31:50 UTC
(In reply to comment #6)
> In upper comments I mentioned the problem didnt occur in 2.6.35-22, but that
> wasnt true, it just takes longer for the problem to occur on that kernel. So
> basically the problem exists in all latests kernels, but it occurs after a
> longer amount of time in 2.6.35* and 2.6.36*

Can you be more precise about this?  E.g. does it happen within one week 80% of the time on 2.6.36 but within one hour 80% of the time on 3.6.2?  (Completely made up examples of what a concrete description would look like.)
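
A figure of that form can be computed from a simple log of how long each boot ran before the fan and hotkeys froze. The following is a minimal sketch; the `fraction_within` helper is a hypothetical name and the sample numbers are made up for illustration, not measurements from this report:

```python
def fraction_within(hours_to_failure, limit_hours):
    """Fraction of observed runs in which the bug appeared within limit_hours."""
    if not hours_to_failure:
        raise ValueError("need at least one observation")
    hits = sum(1 for h in hours_to_failure if h <= limit_hours)
    return hits / len(hours_to_failure)


# Made-up observations: hours of uptime before the fan and hotkeys froze.
sample_runs = [0.5, 1.0, 2.0, 3.0, 30.0]
print(f"{fraction_within(sample_runs, 3):.0%} of runs failed within 3 hours")
```

Recording one such list per kernel version would make the versions directly comparable.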

Another way to help would be to clean up the attachments by marking some as obsolete, so the signal is easier to find amid the noise.  Most of them are the same --- the output of "acpidump" on a given machine does not change from one kernel version to another.
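
One way to find which saved dumps are byte-for-byte duplicates is to group them by content hash. This is a rough sketch: the `group_identical` helper is a hypothetical name, and it assumes the dumps were produced by the same acpidump version, since a different tool version could format identical tables differently:

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def group_identical(paths):
    """Group files by SHA-256 of their contents; return only groups with duplicates."""
    groups = defaultdict(list)
    for p in paths:
        digest = hashlib.sha256(Path(p).read_bytes()).hexdigest()
        groups[digest].append(str(p))
    return [sorted(names) for names in groups.values() if len(names) > 1]
```

Each returned group lists files with identical contents, so all but one per group are candidates for marking obsolete.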
Comment 50 amnesia 2012-10-29 20:43:13 UTC
Thanks for your response.

> Can you bisect?

No but I'm going to try. I'll keep you up to date. It's going to take a while since the bug occurs sporadically.
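
To get a feel for how long each bisect step has to run, here is a rough estimate. It assumes the time until the bug appears is exponentially distributed, which is a modelling assumption and not something established in this report; the `hours_to_call_good` helper is a hypothetical name:

```python
import math


def hours_to_call_good(mean_hours_to_failure, max_false_good=0.05):
    """Hours a kernel must run cleanly before it is marked good.

    Assumes exponentially distributed time to failure: the chance that a bad
    kernel survives t hours is exp(-t / mean), so solve that for t.
    """
    if mean_hours_to_failure <= 0 or not 0 < max_false_good < 1:
        raise ValueError("mean must be positive and probability in (0, 1)")
    return mean_hours_to_failure * math.log(1 / max_false_good)
```

Under that assumption, a kernel whose bad builds fail after 2 hours on average would need roughly 6 clean hours per step to keep the chance of a wrong "good" verdict below 5%.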

> Can you be more precise about this?  E.g. does it happen within one week 80%
> of the time on 2.6.36 but within one hour 80% of the time on 3.6.2? 
> (Completely
> made up examples of what a concrete description would look like.)

From 2.6.36 -> * it happens in 100% of the time, the duration until it appears is variable though, that's why it's this hard to troubleshoot.

> Another way to help would be to clean up the attachments by marking some as
> obsolete, so the signal is easier to find amid the noise.  Most of them are
> the same --- the output of "acpidump" on a given machine does not change from
> one kernel version to another.

Done.
Comment 51 Jonathan Nieder 2012-10-29 21:19:04 UTC
(In reply to comment #50)

> From 2.6.36 -> * it happens in 100% of the time, the duration until it
> appears
> is variable though, that's why it's this hard to troubleshoot.

What's a typical duration?  Most often does it happen within a week?

What happens with 2.6.35.14?  Do I understand correctly that that's the kernel you have been stuck on for everyday use?
Comment 52 amnesia 2012-10-29 21:44:35 UTC
It happens within hours at > 2.6.36. That's correct, 2.6.35.14 is the kernel I'm stuck on for everyday use. It almost never happens on that. So to summarize:

2.6.35.14 -> once in a week/month
> 2.6.37 -> once every half an hour ish.

I'm trying to bisect atm.
Comment 53 amnesia 2012-11-02 15:54:20 UTC
I've tried bisecting, but since the problem occurs randomly, it's almost impossible to get useful data. There is quite a lot of improvement since 3.6.4. Has anything notable changed in the ACPI codebase of 3.6.4 compared to 3.6.0? (Unloading thinkpad_acpi doesn't have any effect, btw.)
Comment 54 amnesia 2012-11-02 16:00:25 UTC
The problem seems to occur more frequently when there's a high CPU utilization.
Comment 55 evol 2012-11-02 17:14:38 UTC
(In reply to comment #54)
> The problem seems to occur more frequently when there's a high CPU
> utilization.

I have also noticed this pattern.
Comment 56 amnesia 2012-11-08 16:18:47 UTC
Jonathan, is there any other way I can debug this? Perhaps turn on some debugging output for a specific kernel component?
Comment 57 evol 2013-02-09 16:59:00 UTC
exact same problem ubuntu 12.10
Comment 58 Zhang Rui 2013-06-14 08:19:22 UTC
According to the attached acpidump, there is no ACPI fan in this machine, so I assume it is the thinkpad_acpi driver that controls the fan and hotkeys.

Matthew,
can you help look at this issue?
Comment 59 Jean Delvare 2013-06-21 09:58:53 UTC
We had a related report on the lm-sensors list:
http://lists.lm-sensors.org/pipermail/lm-sensors/2013-June/039143.html

Anyone working on this bug?
Comment 60 evol 2013-07-28 15:55:36 UTC
exact same problem ubuntu 13.04
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1048544