Bug 12846

Summary: Regression issue with kernel 2.6.29-rc6-git1: high power consumption during sleep
Product: Drivers
Reporter: Raymond Wooninck (tittiatcoke)
Component: Video(Other)
Assignee: Benjamin Herrenschmidt (benh)
Status: CLOSED CODE_FIX
Severity: high
CC: benh, rjw
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.29-rc6-git1
Subsystem:
Regression: Yes
Bisected commit-id:
Bug Depends on:
Bug Blocks: 12398, 56331
Attachments: Logfile that shows power consumption (kernel 2.6.27-13)
Logfile that shows power consumption (kernel 2.6.29-rc6-git1)
dmesg output for kernel 2.6.27-13
dmesg output for kernel 2.6.29-rc6-git1
Script from thinkwiki.org which is used to test how much power is consumed
ACPIDUMP file
debug pci power stuff
Dmesg output with additional printk
Here's the patch that I sent privately that fixes it

Description Raymond Wooninck 2009-03-09 08:18:16 UTC
Latest working kernel version:   2.6.27
Earliest failing kernel version: 2.6.29
Distribution:  openSUSE Factory (11.2)
Hardware Environment:   IBM ThinkPad T42
Software Environment:   
Problem Description:  
When I suspend my notebook to sleep (state S3), then the power consumption
remains high and drains the battery in a couple of hours. When I boot with the
older kernel version (2.6.27.13) then I have a normal power drainage during
sleep.

I have tested this with the acpi-sleep script from thinkwiki.org, which is used
to test if an IBM laptop is suffering from the radeon bug.

This is exactly the same system, and I am just booting with either one of the kernels. So nothing in the hardware or the distribution has changed apart from the kernel used.


Steps to reproduce: Execute the attached script and validate the values in the /var/log/battery.log file.
Comment 1 Raymond Wooninck 2009-03-09 08:18:55 UTC
Created attachment 20466 [details]
Logfile that shows power consumption (kernel 2.6.27-13)
Comment 2 Raymond Wooninck 2009-03-09 08:19:13 UTC
Created attachment 20467 [details]
Logfile that shows power consumption (kernel 2.6.29-rc6-git1)
Comment 3 Raymond Wooninck 2009-03-09 08:19:32 UTC
Created attachment 20468 [details]
dmesg output for kernel 2.6.27-13
Comment 4 Raymond Wooninck 2009-03-09 08:19:55 UTC
Created attachment 20469 [details]
dmesg output for kernel 2.6.29-rc6-git1
Comment 5 Raymond Wooninck 2009-03-09 08:20:17 UTC
Created attachment 20470 [details]
Script from thinkwiki.org which is used to test how much power is consumed
Comment 6 Andrew Morton 2009-03-09 10:30:21 UTC
Reassigned to acpi, marked as a regression.
Comment 7 Rafael J. Wysocki 2009-03-09 14:10:23 UTC
Raymond, can you also test 2.6.28, please?
Comment 8 ykzhao 2009-03-09 17:57:02 UTC
Hi, Raymond
    Will you please attach the output of acpidump?
    From the description it seems that this is a regression. Will you please use git-bisect to identify the commit which causes the regression?
    Thanks.
Comment 9 Raymond Wooninck 2009-03-10 10:38:42 UTC
Created attachment 20486 [details]
ACPIDUMP file
Comment 10 Raymond Wooninck 2009-03-10 10:42:21 UTC
Hi Rafael,

Unfortunately openSUSE does not offer the 2.6.28 kernel, so I am currently trying to build it. As far as I can remember, I ran this kernel for a couple of weeks without any complaints. So I assume that this issue was not in 2.6.28; however, I will test and attach the results here.

Hi Yakui, 

I have attached the acpidump of my system, but unfortunately I can not use the git-bisect command. I have just a small laptop and I do not have enough space for it. I am really sorry for this. 

Regards

Raymond
Comment 11 Zhang Rui 2009-03-12 20:05:20 UTC
please verify if the problem still exists with commit 1fb25cb8b83e85f5bf1a4adb3c9a254c4ce92405 reverted.
Comment 12 Zhang Rui 2009-03-12 20:25:43 UTC
this seems like a duplicate of bug #3022, but I wonder why this is a regression.
Comment 13 Raymond Wooninck 2009-03-13 03:07:44 UTC
Hi, 

I am still working on getting the kernel 2.6.28.7 compiled through the openSUSE build service. That is unfortunately the only place where I would be able to build kernels. 

I will check again among the openSUSE community to see whether others have the same issue. Also, once I have the 2.6.28.7 kernel built, I will try to pull the kernel sources with git and then revert the commit indicated above.

Thanks 

Raymond
Comment 14 Raymond Wooninck 2009-03-13 03:13:00 UTC
With regard to this being a duplicate of bug #3022, I am not sure about that. The logfiles show the radeon chip reporting that it goes into the D2 state. The same lines are reported when suspending on the working 2.6.27 kernel.
Comment 15 Raymond Wooninck 2009-03-15 05:40:07 UTC
Hi Rui,

>please verify if the problem still exists with commit
>1fb25cb8b83e85f5bf1a4adb3c9a254c4ce92405 reverted.

I finally managed to build the 2.6.28-rc8 kernel with this commit reverted. I have tested again using the same method, and the correct behavior is back.

The logfile /var/log/battery.log shows the following :

Sun Mar 15 12:51:19 CET 2009
before: 29860 mWh
after: 29530 mWh
diff: -330 mWh
seconds: 2613 sec
result: -454 mW
Congratulations, your model seems NOT to be affected.

This result is comparable with the kernel 2.6.27, so this commit causes the issue for the radeon chips. 

Regards

Raymond
Comment 16 Rafael J. Wysocki 2009-03-15 09:43:22 UTC
Caused by:

commit 1fb25cb8b83e85f5bf1a4adb3c9a254c4ce92405
Author: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Date:   Thu Feb 5 12:06:52 2009 +1100

    radeonfb: Fix resume from D3Cold on some platforms


First-Bad-Commit : 1fb25cb8b83e85f5bf1a4adb3c9a254c4ce92405

In fact, 2.6.29-rc8 contains patches that are supposed to fix the problems exposed by the above commit.  Is the problem still present in vanilla 2.6.29-rc8?
Comment 17 Raymond Wooninck 2009-03-15 09:49:51 UTC
Hi Rafael,

I tried 2.6.29-rc8 first, but it has exactly the same issue as rc6 and rc7. Only reverting the indicated commit solved the problem.

Maybe the patches in rc8 were more focused on resolving the issues with resuming and not on power drainage during sleep?
Comment 18 Rafael J. Wysocki 2009-03-15 09:56:03 UTC
Could be.  CCing Ben.
Comment 19 Benjamin Herrenschmidt 2009-03-15 16:52:26 UTC
The difference here seems to be that before that commit, we would manually stick the chip into D2 state by whacking its config space, while after that commit we use pci_set_power_state(...,PCI_D2)

Raymond, can you try editing radeon_set_suspend() in radeon_pm.c and basically replace the pci_set_power_state(rinfo->pdev, PCI_D2) call with what was there before, that is:

		for (;;) {
			pci_read_config_word(
				rinfo->pdev, rinfo->pm_reg+PCI_PM_CTRL,
				&pwr_cmd);
			if (pwr_cmd & 2)
				break;			
			pci_write_config_word(
				rinfo->pdev, rinfo->pm_reg+PCI_PM_CTRL,
				(pwr_cmd & ~PCI_PM_CTRL_STATE_MASK) | 2);
			mdelay(500);
		}

You'll also need to add back

    u16 pwr_cmd;

at the top of the function.
Comment 20 Zhang Rui 2009-03-15 18:28:30 UTC
re-assign to benjamin. :)
Comment 21 Raymond Wooninck 2009-03-16 05:05:58 UTC
Hi Benjamin,

I have applied the patch to the standard 2.6.29-rc8 kernel and this is working. During sleep the system is having the expected battery consumption.  

Please find the output of /var/log/battery.log

Mon Mar 16 12:32:25 CET 2009
before: 29260 mWh
after: 29050 mWh
diff: -210 mWh
seconds: 1521 sec
result: -497 mW
Congratulations, your model seems NOT to be affected.

Regards

Raymond
Comment 22 Benjamin Herrenschmidt 2009-03-16 19:26:27 UTC
Now that's interesting. So something isn't working in pci_set_power_state(). Any chance you can sprinkle some printk's inside that function and the functions it calls to see what it's doing? Is it failing with an error? Or is the problem purely due to the fact that we loop, trying multiple times to set the D2 state until it "sticks", while the core doesn't?
Comment 23 Raymond Wooninck 2009-03-17 01:29:25 UTC
Hi Benjamin,

Unfortunately I am not a developer, but I know how to change files and create patchfiles :-)  If you tell me what printk's to put and where to find the pci_set_power_state function, then I can do this and I can deliver you the output. 

Another thing is that this might also have to do with the old bug around the Radeon chip and the difficulty of getting it into sleep mode. I have such a "faulty" radeon chip, and therefore it might be that the standard pci_set_power_state does not work. As also mentioned in comment #12, it seems as if the old bug is back again. I know that I definitely require the Radeon Workaround for Thinkpad notebooks in order to have it "sleeping" correctly.

But please let me know what to do and I will do it :-)

Regards

Raymond
Comment 24 Benjamin Herrenschmidt 2009-03-18 20:33:35 UTC
Created attachment 20589 [details]
debug pci power stuff

This patch adds various printk's to the PCI set power state code,
please try with that (and without the previous change that fixes the problem), and send me the output in dmesg
Comment 25 Raymond Wooninck 2009-03-19 01:27:14 UTC
Hi Benjamin,

Thanks for the patch. Will rebuild the kernel and send you the dmesg as soon as possible. 

Regards

Raymond
Comment 26 Raymond Wooninck 2009-03-19 09:14:29 UTC
Hi Benjamin,

Please find the dmesg attached. Hope this helps.

Regards

Raymond
Comment 27 Raymond Wooninck 2009-03-19 09:15:20 UTC
Created attachment 20597 [details]
Dmesg output with additional printk
Comment 28 Benjamin Herrenschmidt 2009-03-20 15:01:35 UTC
Created attachment 20613 [details]
Here's the patch that I sent privately that fixes it