On many Dell notebooks, resuming from suspend-to-RAM without a battery inserted leaves the temperature sensors and the fan non-functional, causing the system to overheat. This looks like a variant of bug 14667, limited to the case where no battery is inserted. Inserting the battery afterwards makes everything work properly again, but that is a workaround rather than a solution.
Created attachment 52042 [details] dmesg after boot start dmesg after booting 2.6.38.1
Created attachment 52052 [details] acpidump 2.6.38.1
Created attachment 52062 [details] dmesg after suspend
Created attachment 52072 [details] dmesg after inserting the battery After battery insertion everything works fine.
I was wondering: shouldn't something show up in dmesg when I put the battery back in? And what about the "unknown key pressed" message? I didn't press any key, by the way...
The same event also occurs on my notebook. It seems to be triggered by inserting or removing the battery.
Created attachment 66682 [details] DSDT.hex
Please override the DSDT, see http://www.lesswatts.org/projects/acpi/overridingDSDT.php
Run "echo 1 > /sys/module/acpi/parameters/aml_debug_output".
Run sensors both with the battery inserted and with it removed.
Attach the output of dmesg.
It's great that the kernel bugzilla is back. Can you please verify if the problem still exists in the latest upstream kernel?
It certainly still exists in Debian's current testing release (3.1.x). I'm just compiling a vanilla 3.2.1 with the custom DSDT provided. Hang on.
I compiled the kernel with the attached DSDT. With it I no longer get all temperatures, only those provided by the coretemp module. The section
------
acpitz-virtual-0
Adapter: Virtual device
temp1: +0.0°C (crit = +100.0°C)
temp2: +0.0°C (crit = +100.0°C)
temp3: +0.0°C (crit = +100.0°C)
------
does not appear anymore. The GNOME sensors applet still seems to detect it, but reports an error when reading it (using libsensors).
Note: the output above was taken without the DSDT override, while the bug was occurring.
Besides, GNOME no longer offers me the option to suspend. Forcing a suspend with s2ram --force returns "s2ram_do: No such device".
Was the DSDT meant for me at all, or for Alessandro's ACPI dump? I will attach mine, just in case.
Created attachment 72114 [details] Another ACPI-Dump of an affected machine In case the attached DSDT wasn't meant for my machine (and is therefore causing the errors) and mine is needed.
One more thing to add: Sensors and suspend are working without that custom DSDT. But the bug still exists in the current 3.2.1
I also am still affected by this on my Dell Studio 1537. acpitz-virtual-0 reports 0°C three times upon resuming. Plugging in the battery fixes it and both acpitz-virtual-0 and coretemp-isa-0000 report ~30 degrees Celsius. If you need any more system data or things to test, please ping.
Sorry for the late response. Does this bug still exist in the latest upstream kernel?
I can still reproduce this bug in the latest Arch Linux kernel (v. 3.8.6). With that persistence, I doubt it is fixed in the latest stable vanilla kernel. I didn't try the current release candidate, though.
Hi, sorry for the late response. I checked the ACPI tables. All temperatures come from the EC (Embedded Controller), and temperature sensors are normally connected to the EC. Linux doesn't do anything with the EC during system suspend and resume, so this may be a BIOS issue. Does this also happen on Windows?
ping...
Sorry for the delay. Unfortunately, I can't test this right now as I don't have a Windows installation any more. I'll try and find the Vista DVD it came with to give it a quick reinstall. As I don't have a battery anymore for my Studio 1535 and solely rely on the wall plug, it may prove difficult for me to successfully test patches and the like on a before-after basis - but oh well.
Oops, sorry, I didn't notice the first ping. I'll test it later once I get to reboot into Windows, but I doubt it. It is still strangely similar to Bug 14667, but I'm not sure what was done over there to fix it. Maybe a BIOS bug was worked around? I don't know. Anyway, I will check Windows' behaviour later once I am able to reboot.
Okay, I finally got around to starting Windows: pulled the battery, selected sleep (suspend to RAM), let it wake up and tested whether the fan still works. It did, when I put some load on the CPU and GPU after waking up. I didn't read the temperatures, but I highly doubt they would fail while the fan still works afterwards (unlike on Linux). Also, I can still confirm this issue on kernel 3.10.
(In reply to Simon Gebler from comment #21)
> Okay; finally got around to start Windows, pulled the battery, selected
> sleep (suspend 2 ram), let it wake up and tested if the fan still works.
> In fact it did when putting some load onto the cpu and gpu after waking up.
> Didn't read temperatures, but I highly doubt those wouldn't work, when the
> fan works afterwards (unlike on linux)
>
Could you check whether the temperature is correct on Windows after resume? After all, Linux relies on the temperature to decide the fan state, so if the temperature reading doesn't work, the fan is affected as well.
> Also, I can still confirm this issue on Kernel 3.10
> Could you check whether the temperature is correct on Windows after resume?
> At last, Linux depends on the temperature to decide the fan status. So if
> temperature doesn't work and it also will affect fan.
Yes, the temperatures still work afterwards. And is it really Linux that actively controls the fan, or is it some ACPI code that stops working just like the temperatures do? In any case, this does not affect the notebook(s) on Windows.
Created attachment 107182 [details] debug.patch Hi, could you try this patch?
(In reply to Lan Tianyu from comment #24)
> Created attachment 107182 [details]
> debug.patch
>
> Hi, could you try this patch?
Sure! I diffed /var/log/messages on tag v3.10 with and without the patch. With it, I get these messages on resume that don't appear on the unmodified kernel:
ACPI Error: No handler for Region [ECRM] (ffff88013b027048) [EmbeddedControl] (20130328/evregion-161)
ACPI Error: Region EmbeddedControl (ID=3) has no handler (20130328/exfldio-305)
ACPI Error: Method parse/execution failed [\_SB_.LID0._PSW] (Node ffff88013b035028), AE_NOT_EXIST (20130328/psparse-537)
ACPI: _PSW execution failed
Hope that helps.
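A with/without comparison like the one described above can be approximated with a small shell sketch. It assumes syslog-style lines in which the message follows "kernel: ", and it only looks at lines containing "ACPI". The function names acpi_lines and new_acpi_messages are made up for illustration and are not part of any existing tool.

```shell
# Reduce a log file to its unique ACPI messages. Everything up to and
# including "kernel: " is dropped (an assumption about the log format),
# so differing timestamps do not count as differences.
acpi_lines() {
    { grep 'ACPI' "$1" || true; } | sed 's/^.*kernel: //' | sort -u
}

# Print the ACPI messages that appear only in the second log.
# $1 = log from the unmodified kernel, $2 = log from the patched kernel.
new_acpi_messages() {
    cmp_tmp=$(mktemp -d)
    acpi_lines "$1" > "$cmp_tmp/before"
    acpi_lines "$2" > "$cmp_tmp/after"
    # comm -13 keeps only the lines unique to the second (sorted) input.
    comm -13 "$cmp_tmp/before" "$cmp_tmp/after"
    rm -rf "$cmp_tmp"
}
```

It could be called as "new_acpi_messages messages.plain messages.patched", with both file names being placeholders. Messages that embed pointers, such as the node addresses in the errors above, may differ between boots and would then always be listed as new.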
ping Tianyu.
Sorry for the late response. Could you check whether the issue still occurs in the latest upstream kernel, v3.16?
Current standard Arch Linux kernel:
$ uname -a
Linux Mr-Radar 3.15.4-1-ARCH #1 SMP PREEMPT Mon Jul 7 07:42:54 CEST 2014 x86_64 GNU/Linux
After suspend and resume the temperatures finally seem to work. However, when I stress-tested the CPU a little, the fan still didn't turn on, even at 80°C, where it was running properly before suspend and resume. Inserting the battery made it kick back to life immediately.
Freshly compiled Linux mainline (3.16-rc4):
$ uname -a
Linux Mr-Radar 3.16.0-1-mainline #1 SMP PREEMPT Thu Jul 10 17:57:35 CEST 2014 x86_64 GNU/Linux
Temperatures work. The fan does not; it jumps into overdrive when the battery is reinserted at 80°C core temperature.
So it has been fixed partially: temperature reporting works, but the fan still does not work after suspend-to-RAM and resume without a battery.
This seems to be a thermal bug. Could you provide the output of the following commands when the bug occurs?
grep . /sys/class/thermal/cooling_device*/*
grep . /sys/class/thermal/thermal_zone*/*
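A minimal sketch for spotting the failure at a glance, in addition to the full grep output requested above. The thermal_zone temp attribute in sysfs is given in millidegrees Celsius, so a healthy zone at about 30°C reads roughly 30000, while a zone hit by this bug reads 0. The function name report_zones and the optional directory argument are assumptions made for illustration.

```shell
# Print every thermal zone's raw temperature and flag zones that read 0.
# $1 = sysfs thermal directory; defaults to /sys/class/thermal.
# Passing another directory is only meant for dry runs.
report_zones() {
    zone_root="${1:-/sys/class/thermal}"
    for zone_dir in "$zone_root"/thermal_zone*; do
        # Skip the unexpanded glob when no zones exist.
        [ -r "$zone_dir/temp" ] || continue
        # Reading may fail on some zones; skip those instead of aborting.
        millideg=$(cat "$zone_dir/temp" 2>/dev/null) || continue
        if [ "$millideg" -eq 0 ]; then
            echo "$(basename "$zone_dir"): 0 (reads zero)"
        else
            echo "$(basename "$zone_dir"): $millideg"
        fi
    done
    return 0
}
```

Running "report_zones" after a resume without battery should then mark all three zones on an affected machine, if the behaviour matches what is described in this report.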
Please try the patches at https://bugzilla.kernel.org/show_bug.cgi?id=78201#c10 and https://bugzilla.kernel.org/show_bug.cgi?id=78201#c20 and see if the problem still exists.
Created attachment 143631 [details] Commands from comment 30
Looking at the output, I checked my "sensors" command again. I think I missed that it doesn't output any ACPI temperatures, and according to my previous attachment they are still zero. So I overlooked that some temperatures were missing and looked at the CPU core and Radeon temperatures instead. I can't see "acpitz-virtual-0" even after running sensors-detect, which is kind of odd, but I suppose every part of the bug is still valid and not fixed. I will try those patches, though, and report back.
Tried the patch and no change
Created attachment 147961 [details] dsdt.hex
Could you try this dsdt.hex, following the instructions at https://01.org/linux-acpi/documentation/overriding-dsdt?langredirect=1
Put it where the kernel build can include it:
$ cp DSDT.hex $SRC/include/
Add this to the kernel .config:
CONFIG_STANDALONE=n
CONFIG_ACPI_CUSTOM_DSDT=y
CONFIG_ACPI_CUSTOM_DSDT_FILE="DSDT.hex"
Make the kernel and off you go!
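Before spending time on a full build, the resulting .config can be checked with a short sketch like the following. One assumption to be aware of: kconfig normally rewrites a disabled option as "# CONFIG_STANDALONE is not set" instead of keeping "=n", so the sketch only checks that CONFIG_STANDALONE=y is absent. The function name check_dsdt_config is made up for illustration.

```shell
# Check a kernel .config for the DSDT override settings listed above.
# $1 = path to the .config (defaults to ./.config).
# Prints one line per problem and returns 1 if any problem was found.
check_dsdt_config() {
    cfg_file="${1:-.config}"
    cfg_status=0
    # A disabled option usually shows up as "# ... is not set", so only
    # an explicit "=y" is treated as a problem here.
    if grep -qxF 'CONFIG_STANDALONE=y' "$cfg_file"; then
        echo "CONFIG_STANDALONE is still enabled"
        cfg_status=1
    fi
    for cfg_want in 'CONFIG_ACPI_CUSTOM_DSDT=y' \
                    'CONFIG_ACPI_CUSTOM_DSDT_FILE="DSDT.hex"'; do
        if ! grep -qxF "$cfg_want" "$cfg_file"; then
            echo "missing: $cfg_want"
            cfg_status=1
        fi
    done
    return "$cfg_status"
}
```

For example, "check_dsdt_config $SRC/.config && make" would only start the build when all three settings look right.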
I can report no change with the other DSDT. I compiled a kernel with said options and the linked DSDT: temperatures are 0 when resuming, and no fan. It still works as before with the battery. Aside from the confirmation of the override at boot:
Aug 25 16:24:35 Mr-Radar kernel: ACPI: Override [DSDT-CANTIGA ], this is unsafe: tainting kernel
Aug 25 16:24:35 Mr-Radar kernel: Disabling lock debugging due to kernel taint
Aug 25 16:24:35 Mr-Radar kernel: ACPI: DSDT 0x00000000BDFE8000 Logical table override, new table: 0xFFFFFFFF8188F380
Aug 25 16:24:35 Mr-Radar kernel: ACPI: DSDT 0xFFFFFFFF8188F380 007268 (v02 Intel CANTIGA 06040000 INTL 20140724)
I can't see any additional log output about the problem when suspending and resuming. Or is there anything else I should look for?
For the fan issue: there is no ACPI fan device on this laptop, which means the fan is not controlled via ACPI.
For the temperature sensor issue, please attach the output of "grep . /sys/class/thermal/thermal_zone*/*" in each of these cases:
1. after resuming, with battery.
2. after resuming, without battery.
3. before suspending, with battery.
4. before suspending, without battery.
BTW, please check whether there are any BIOS options related to thermal control. Please also try upgrading to the latest BIOS, if available.
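To keep the four cases apart, the requested output can be saved under one label per case, roughly as sketched here. The function name snapshot_zones, the label names and the output directory are all made up for illustration; only the grep command itself comes from the request above.

```shell
# Save the thermal_zone attributes under a label such as
# "before-suspend-with-battery" so the cases can be compared later.
# $1 = label, $2 = output directory,
# $3 = sysfs thermal directory (optional, defaults to /sys/class/thermal).
snapshot_zones() {
    snap_label="$1"
    snap_dir="$2"
    snap_root="${3:-/sys/class/thermal}"
    mkdir -p "$snap_dir"
    # grep complains about subdirectories and unreadable attributes;
    # those messages are dropped and its exit status is ignored.
    grep . "$snap_root"/thermal_zone*/* > "$snap_dir/$snap_label.txt" 2>/dev/null || true
}
```

A run could look like "snapshot_zones before-suspend-without-battery ~/thermal-logs" before suspending and "snapshot_zones after-resume-without-battery ~/thermal-logs" after resuming, followed by a plain diff of the two files.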
I'm seeing a similar issue here on a Toshiba Satellite P50-B-108. If I resume the system without the AC adapter plugged in, then the fan stops working and the machine quickly overheats, leading to thermal throttling and a very hot case (to the point where you can't touch it anymore). Sensors still work here though, it's just the fan control that 'disappears'. @Zhang: Should I open a separate bug for this, or should I post further information in this bug?
Created attachment 160731 [details] Output for some cases This is the output (there might be sane temperature differences) for: before suspend with and without battery, and after suspend with battery.
Created attachment 160741 [details] Last case output And this is the output for the remaining case: after suspend without battery.
There are no options related to thermal control in the BIOS, and it is already the latest one available. I'm not sure how the fan is controlled; it might be some Dell-specific way. It is a combined fan for GPU and CPU, so it might be some off-platform code in the device's firmware, what do I know. Either way, I figured it is probably related, especially looking at the other, fixed bug mentioned in my initial comment here. Maybe the fan will be okay again once the temperatures work properly after resuming from a battery-less suspend. Anyway, the only case that really differs in the output of that command is after resuming without battery: all the temperatures read 0. And as mentioned before, that is fixed shortly after reinserting the battery.
Created attachment 166731 [details] customized DSDT to see why _TMP returns 0 First, please check if the problem still exists in the latest upstream kernel. If yes, please apply the attached customized DSDT, and after boot run "echo 1 > /sys/module/acpi/parameters/aml_debug_output". Then run "grep . /sys/class/thermal/thermal*/temp" and attach the resulting dmesg output, both before and after resume, and with and without the battery inserted.
Created attachment 166741 [details] customized DSDT to see why _TMP returns 0
Created attachment 166751 [details] customized DSDT to see why _TMP returns 0
note: please test using the dsdt attached in comment #45
Sorry, life is very busy lately. Currently building a kernel with that DSDT, will test once I don't have anything productive up and am able to reboot and test
Created attachment 168711 [details] Debug output as per comment #43 There we go. I did one test with the battery in and one with the battery removed. Additionally, I added the values that I get after 'fixing' the problem by reinserting the battery.
Created attachment 168721 [details] Dmesg output during the resume Just in case of interest, some output that was generated during a resume procedure, before I was typing anything new on the terminal.
Bug still applies to the current 3.19.2 kernel. But now I can see fans as well! I'm attaching the output of "grep . /sys/class/thermal/cooling_device*/*" separately. It always seems to be the same, though, no matter what. The output of /sys/class/thermal/thermal*/temp is unchanged. In other, more or less related news, I get some more sensors output from the i8k module, which now shows 4 temperature zones and 2 fans (despite the notebook having only one). That output seems to be equally affected by this problem, showing 0 RPM and 0°C while the bug is active. (Querying sensors at any time with that module enabled also freezes the notebook for a bit, possibly leaving it 'holding' Ctrl afterwards, but that's mostly unrelated here, I suppose.) I'm currently just wondering why sensors-detect doesn't pick up the ACPI temperatures so that sensors shows them. The ACPI temperatures are visible in the sensors plugin for the Xfce panel.
Created attachment 172541 [details] Output, grep for cooling_device*/*
[root@Mr-Radar sige]# grep . /sys/class/thermal/thermal*/temp
/sys/class/thermal/thermal_zone0/temp:0
/sys/class/thermal/thermal_zone1/temp:0
/sys/class/thermal/thermal_zone2/temp:0
[ 463.628391] [ACPI Debug 34131653] String [0x07] "In _TMP"
[ 463.628429] [ACPI Debug 34131699] Integer 0x0000000000000001
[ 463.628440] [ACPI Debug 34131710] Integer 0x0000000000000000
[ 463.629067] [ACPI Debug 34132335] Integer 0x0000000000000000
[ 463.630099] [ACPI Debug 34133367] String [0x07] "In _TMP"
[ 463.630118] [ACPI Debug 34133388] Integer 0x0000000000000001
[ 463.630127] [ACPI Debug 34133397] Integer 0x0000000000000000
[ 463.635051] [ACPI Debug 34138316] Integer 0x0000000000000000
[ 463.636052] [ACPI Debug 34139318] String [0x07] "In _TMP"
[ 463.636071] [ACPI Debug 34139341] Integer 0x0000000000000001
[ 463.636078] [ACPI Debug 34139348] Integer 0x0000000000000000
[ 463.638229] [ACPI Debug 34141492] Integer 0x0000000000000000
We can see that, when the bug occurs, some EC operation region fields return 0 instead of a meaningful value, including
\_SB.PCI0.LPCB.EC0.CPUT
\_SB.PCI0.LPCB.EC0.SYST
\_SB.PCI0.LPCB.EC0.VGAT
Thus, IMO, this is a BIOS/firmware problem. Anyway, to double-check: Lv, can you please ask Simon to do some tests to make sure the EC is working well when the problem occurs?
Checked with Lv: the EC driver has no impact on the EC operation region field values. So I'm closing this as a BIOS bug. Please check whether a BIOS upgrade is available, and feel free to re-open if it can be proven not to be a BIOS bug, e.g. if the problem cannot be reproduced in Windows.
Yes, EC transactions are serial and should have nothing to do with the return value of the RD_EC command. All known EC state machine issues after 3.19 are related to the QR_EC command, not the RD_EC command, so this is not an EC-related issue. We can confirm that after 4.2-rcX is released; I'll ping here when the fixes are ready upstream. The problem might be a thermal driver issue or an ACPICA core issue. Let's also confirm this once some tracing facilities are upstreamed. Thanks -Lv