Bug 29412 - fans running at full-speed after resume from suspend with radeon and KMS
Status: CLOSED CODE_FIX
Alias: None
Product: Drivers
Classification: Unclassified
Component: Video(DRI - non Intel)
Hardware: All Linux
Importance: P1 normal
Assignee: Alex Deucher
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-02-18 22:46 UTC by Jon Dowland
Modified: 2012-04-16 23:36 UTC
CC List: 4 users

See Also:
Kernel Version: 2.6.38-rc5+
Subsystem:
Regression: No
Bisected commit-id:


Attachments
dmesg from v2.6.32-rc1^ (91.92 KB, text/plain)
2012-03-17 11:49 UTC, Jon Dowland
possible fix (1.67 KB, patch)
2012-03-19 16:42 UTC, Alex Deucher
alternate fix (1.96 KB, patch)
2012-03-27 13:41 UTC, Alex Deucher

Description Jon Dowland 2011-02-18 22:46:30 UTC
Hi,

Originally reported against the wrong product, here:
https://bugzilla.kernel.org/show_bug.cgi?id=18552

I have an ATI Radeon X850XT PCIE card plugged into an Intel DG35EC motherboard.

With kernels since 2.6.32-rc1, most recently tested with mainline commit 2a324ce7b79a3a90cc2d4ade5d5f960a99000caa (2.6.38-rc5+), if I attempt a suspend-to-RAM, then upon resume the fan on my ATI card remains at full speed (loud) and does not throttle back to quiet (the normal behaviour).

Booting with 'radeon.modeset=0' prevents this.

Please let me know what further information is needed.

Thanks!
Comment 1 Alex Deucher 2011-02-18 23:14:28 UTC
Can you bisect?
Comment 2 Jon Dowland 2011-03-12 13:43:17 UTC
I did bisect and identified the first bad commit as 17d857be649a21ca90008c6dc425d849fa83db5c, which is merely tagging 2.6.32-rc1.  I suspect some user space component only enables KMS if the kernel version equals or exceeds this tag, and that's why the problem is only triggered from that commit.

Can you give me any hints on how to do a more useful bisect? Perhaps limiting the scope to a particular source path?
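
One way to narrow a bisect like this is to pass pathspecs to `git bisect start`, so that only commits touching those paths are considered. The following is a minimal sketch, not a tested recipe for this bug: the helper name `bisect_start_cmd` is hypothetical, `v2.6.31` as the good revision is an assumption that this report does not confirm, and the helper only prints the command instead of running it.

```shell
#!/bin/sh
# Hypothetical helper: print a path-limited "git bisect start" command.
# Usage: bisect_start_cmd <bad-rev> <good-rev> <path>...
bisect_start_cmd() {
    bad=$1
    good=$2
    shift 2
    # Everything after "--" is a pathspec; git bisect then only
    # considers commits that touch those paths.
    printf 'git bisect start %s %s --' "$bad" "$good"
    for path in "$@"; do
        printf ' %s' "$path"
    done
    printf '\n'
}

# Assumption: v2.6.31 is a known-good kernel (not stated in this report).
bisect_start_cmd v2.6.32-rc1 v2.6.31 drivers/gpu/drm
```

Running the printed command inside a kernel tree would start a bisect restricted to the DRM drivers, which is where the radeon KMS code lives.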
Comment 3 Jonathan Nieder 2012-03-16 19:05:06 UTC
(In reply to comment #2)

> I did bisect to identify the first bad commit as
> 17d857be649a21ca90008c6dc425d849fa83db5c, which is merely tagging 2.6.32-rc1.
> I suspect some user space component is only enabling KMS if the kernel version
> equals or exceeds this tag and that's why it is only triggered from that commit.

Thanks.  Odd.  The test the radeon driver uses is

    if (info->dri->pKernelDRMVersion->version_minor >= 5)
      ginfo.request = RADEON_INFO_ACCEL_WORKING2;
    else
      ginfo.request = RADEON_INFO_ACCEL_WORKING;

which wouldn't be affected by utsversion.  So, questions:

1) Is that bisection result reproducible?  I.e., if you do:

 cd linux
 git checkout v2.6.32-rc1
 cp /boot/config-$(uname -r) .config; # current configuration
 make localmodconfig; # optional: minimize configuration
 make deb-pkg; # optionally with -j<num> for parallel build
 dpkg -i ../<name of package>; # as root
 reboot
 ... test test test ...

 cd linux
 git checkout HEAD^
 make deb-pkg; # maybe with -j4
 dpkg -i ../<name of package>
 reboot
 ... test some more ...

does the first package produced reproduce the misbehavior and
the second one not?

2) Can you reproduce this without X?  What happens if you boot
in recovery mode, run "modprobe radeon" to make sure the driver
is loaded, and suspend?

3) An attachment with "dmesg" output (and /var/log/Xorg.0.log if
X seems to be involved) from a non-working kernel would also be
interesting.
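
The no-X check in question 2 could be scripted roughly as below. This is a sketch under assumptions: the helper name `suspend_with_radeon` and the `DRY_RUN` switch are hypothetical, and writing `mem` to `/sys/power/state` is used here as one way to suspend to RAM (it needs root and really suspends the machine when not in dry-run mode).

```shell
#!/bin/sh
# Hypothetical helper: load the radeon module, then suspend to RAM.
# With DRY_RUN=1 it only prints the commands it would run.
suspend_with_radeon() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "modprobe radeon"
        echo "echo mem > /sys/power/state"
        return 0
    fi
    # Real run (as root): suspend only if the module loaded successfully.
    modprobe radeon && echo mem > /sys/power/state
}

# Dry run: show the steps without touching the system.
DRY_RUN=1 suspend_with_radeon
```

After resume, listening to the fan with and without the module loaded gives the comparison asked for above.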
Comment 4 Jon Dowland 2012-03-17 11:27:26 UTC
OK, retesting v2.6.32-rc1 first: I had trouble building it (gcc 4.6 won't work), trouble getting the initramfs to work, and trouble getting X to start (some kind of race condition or bad failure mode in the gdm3 init script launched ~300 Xorg instances before I killed it).

However, I could reproduce the issue without X. In single-user mode with no radeon module loaded, the fans slowed down after resume as they should. In single-user mode with the radeon driver loaded, they did not.

I'll check v2.6.32-rc1^ next, then I'll collect dmesg/Xorg.0.logs as appropriate.

(I confirmed this with 3.3~rc6-1~experimental.1 in the meantime)
Comment 5 Jon Dowland 2012-03-17 11:49:13 UTC
Created attachment 72633
dmesg from v2.6.32-rc1^

Frustratingly, the issue reared its head with v2.6.32-rc1^.  This dmesg is from a single user mode session, two suspends with "modprobe radeon" in the middle.

It's getting hard to run kernels this old (even though they aren't that old!).  Notice the error loading the radeon firmware: no idea why it failed, since other kernels manage it (and the firmware is there).  Also, udev refuses to start with kernels < 2.6.32.

I may have to set up a test root with an older userspace (squeeze perhaps) to go further :(

Does this basically invalidate my previous bisect?
Comment 6 Jonathan Nieder 2012-03-19 08:44:40 UTC
udev should cope fine with kernels < 2.6.32 iirc as long as CONFIG_SYSFS_DEPRECATED is not set.
Comment 7 Jonathan Nieder 2012-03-19 08:54:19 UTC
The painful memories are coming back to me: udev also requires v2.6.28-rc6~45 ("reintroduce accept4") and dup3 and friends from 2.6.27.
Comment 8 Jérôme Glisse 2012-03-19 16:22:32 UTC
It's just a KMS issue; no need to bisect. Something is not restored properly.
Comment 9 Alex Deucher 2012-03-19 16:42:50 UTC
Created attachment 72654
possible fix

Does this patch fix the issue?
Comment 10 Jon Dowland 2012-03-20 15:40:16 UTC
I'll have access to the machine again this Friday; I'll try it this weekend. Thanks!
Comment 11 Jon Dowland 2012-03-27 09:21:24 UTC
Sadly, this patch doesn't seem to solve it.  I'll add a printk and re-try just to be doubly sure I'm not messing things up and loading the wrong module.
Comment 12 Alex Deucher 2012-03-27 13:41:08 UTC
Created attachment 72731
alternate fix

After you verify the first patch does not fix the issue, you can try this one.  Can you also attach a copy of your vbios?  To get a copy of your vbios:

(as root)
(use lspci to get the bus id)
cd /sys/bus/pci/devices/<pci bus id>
echo 1 > rom
cat rom > /tmp/vbios.rom
echo 0 > rom
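
The same steps can be wrapped in a small shell function. This is only a sketch: the name `dump_vbios` and its arguments are hypothetical, and the bus id in the usage comment is an example, not the reporter's actual device.

```shell
#!/bin/sh
# Hypothetical helper: dump the video BIOS of one PCI device via sysfs.
# Usage (as root): dump_vbios <sysfs device dir> <output file>
# e.g. dump_vbios /sys/bus/pci/devices/0000:01:00.0 /tmp/vbios.rom
#      (0000:01:00.0 is an example bus id; use lspci to find yours)
dump_vbios() {
    dev=$1
    out=$2
    if [ ! -e "$dev/rom" ]; then
        echo "no rom file in $dev" >&2
        return 1
    fi
    echo 1 > "$dev/rom" || return 1   # enable reading the ROM
    cat "$dev/rom" > "$out"
    status=$?
    echo 0 > "$dev/rom"               # disable again, even if cat failed
    return "$status"
}
```

Taking the device directory as an argument keeps the bus id out of the function, and the ROM is switched off again whether or not the copy succeeded.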
Comment 13 Alex Deucher 2012-03-27 13:42:05 UTC
Ignore the last hunk of that patch (rv770.c).
Comment 14 Jon Dowland 2012-03-29 18:08:59 UTC
That second patch sorts it! Thanks! I booted to single-user; modprobe radeon; pm-suspend; resumed: fine. Moved to multi-user; logged into GNOME; suspended via GNOME menu; resumed: fine again.

Would you still like the vbios?

Cheers!
Comment 15 Alex Deucher 2012-03-29 23:01:52 UTC
Perfect.  No need for the vbios.  I'll send the patch to Dave.
Comment 16 Florian Mickler 2012-04-16 21:17:12 UTC
A patch referencing this bug report has been merged in Linux v3.4-rc2:

commit 402976fe51b2d1a58a29ba06fa1ca5ace3a4cdcd
Author: Alex Deucher <alexander.deucher@amd.com>
Date:   Thu Mar 29 19:04:08 2012 -0400

    drm/radeon/kms: fix fans after resume
Comment 17 Alex Deucher 2012-04-16 23:34:23 UTC
This bug can be closed.
