After loading the radeon driver with modprobe and runpm=1, the notebook's display panel immediately goes black (probably turned off), and after one or two seconds the kernel freezes or crashes (SysRq does not work either). I was able to dump and sync the syslog kernel output before the freeze by running the sync command in an infinite loop in the background. See the attachment for the syslog daemon's log after loading the radeon kernel module (with runpm=1). I cannot provide any other debug output because the display is off and the kernel crashes.
The problem is always reproducible on a Dell Latitude E6440 notebook, which has a muxless AMD Radeon HD 8690M graphics card.
The black screen is probably caused by the intel driver at this line:
[ 171.913779] i915: switched off
And the kernel crash by a NULL pointer dereference:
[ 173.442690] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
lspci -nn identifies my AMD graphics card as:
01:00.0 Display controller [0380]: Advanced Micro Devices, Inc. [AMD/ATI] Sun XT [Radeon HD 8670A/8670M/8690M] [1002:6660]
Created attachment 131981 [details] syslog output
What kernel are you using? Can you attach your full dmesg output without radeon.runpm=1?
Created attachment 132211 [details] dmesg output
I'm using version 3.14 (as specified in Bugzilla). The dmesg output from a kernel booted without any radeon parameters is attached.
Your system does not appear to have the ATPX ACPI methods that runtime PM needs to work properly (they are required to power off the dGPU). You should see something like this in your dmesg output:
ATPX version X, functions 0xXXXXXXXX
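For anyone who wants to check a saved dmesg log for that message programmatically, here is a minimal sketch in C. It assumes the message looks exactly like the template quoted above (a decimal version followed by a hexadecimal function mask); the exact wording in a given kernel version may differ. The function name parse_atpx_line is hypothetical and not part of any kernel or library API.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Look for an "ATPX version X, functions 0xXXXXXXXX" message in one log line.
 * Assumption: the format follows the template quoted in the comment above.
 * Returns true and fills *version and *functions when the message is found.
 */
bool parse_atpx_line(const char *line, unsigned *version, unsigned *functions)
{
    /* Skip any timestamp or prefix that precedes the message. */
    const char *p = strstr(line, "ATPX version ");
    if (p == NULL)
        return false;

    /* Both fields must convert for the line to count as a match. */
    return sscanf(p, "ATPX version %u, functions 0x%x", version, functions) == 2;
}
```

Feeding each line of the dmesg output to this function and finding no match would be consistent with the driver not having detected ATPX.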
So I cannot turn off the dGPU when it is not in use? Also, I think the kernel should not crash when booted with a (possibly incorrect) runpm parameter.
Btw, I looked into the DSDT/SSDT ACPI tables and there is an ATPX method (in SSDT7, scope \_SB.PCI0.GFX0).
(In reply to Pali Rohár from comment #5)
> So I cannot turn off dGPU when it is not used?
Correct. The driver requires that method to power on/off the dGPU.
> Also I think that kernel should not crash when booting with (maybe
> incorrect?) param runpm.
Yes, that should probably be fixed.
(In reply to Pali Rohár from comment #6)
> Btw, I looked into DSDT/SSDT acpi tables and there is ATPX method (in SSDT7,
> scope \_SB.PCI0.GFX0).
Did you enable vgaswitcheroo support in your kernel config?
(In reply to Alex Deucher from comment #7)
> (In reply to Pali Rohár from comment #6)
> > Btw, I looked into DSDT/SSDT acpi tables and there is ATPX method (in SSDT7,
> > scope \_SB.PCI0.GFX0).
>
> Did you enable vgaswitcheroo support in your kernel config?
The kernel is from http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.14-trusty/
In the file /boot/config-3.14.0-031400-generic I see:
CONFIG_VGA_SWITCHEROO=y
I looked into the radeon_atpx_handler.c code and found the reason why the radeon kernel driver does not detect ATPX.
First, here is the lspci output:
00:02.0 VGA compatible controller [0300]: Intel Corporation 4th Gen Core Processor Integrated Graphics Controller [8086:0416] (rev 06)
01:00.0 Display controller [0380]: Advanced Micro Devices, Inc. [AMD/ATI] Sun XT [Radeon HD 8670A/8670M/8690M] [1002:6660]
Second, here is the relevant code of the function radeon_atpx_detect(void) from the file radeon_atpx_handler.c:
    int vga_count = 0;
    while ((pdev = pci_get_class(PCI_CLASS_DISPLAY_VGA << 8, pdev)) != NULL) {
        vga_count++;
        has_atpx |= (radeon_atpx_pci_probe_handle(pdev) == true);
    }
    if (has_atpx && vga_count == 2) {
        ... ATPX was detected ...
    }
And some defines (from pci_ids.h):
    #define PCI_CLASS_DISPLAY_VGA 0x0300
    #define PCI_CLASS_DISPLAY_OTHER 0x0380
Because my Radeon card has PCI class 0380 and not 0300, the while loop never checks it for ATPX, so vga_count never reaches 2 and vgaswitcheroo is not enabled.
I created this quick and dirty patch, and with it runpm=1 works without any crash:
--- radeon_atpx_handler.c.orig 2014-04-14 17:36:36.583744668 +0200
+++ radeon_atpx_handler.c 2014-04-14 23:50:53.354492060 +0200
@@ -528,6 +528,12 @@ static bool radeon_atpx_detect(void)
         has_atpx |= (radeon_atpx_pci_probe_handle(pdev) == true);
     }
 
+    while ((pdev = pci_get_class(PCI_CLASS_DISPLAY_OTHER << 8, pdev)) != NULL) {
+        vga_count++;
+
+        has_atpx |= (radeon_atpx_pci_probe_handle(pdev) == true);
+    }
+
     if (has_atpx && vga_count == 2) {
         acpi_get_name(radeon_atpx_priv.atpx.handle, ACPI_FULL_PATHNAME, &buffer);
         printk(KERN_INFO "VGA switcheroo: detected switching method %s handle\n",
Now the vgaswitcheroo debugfs file has also appeared:
$ sudo cat /sys/kernel/debug/vgaswitcheroo/switch
0:IGD:+:Pwr:0000:00:02.0
1:DIS: :DynPwr:0000:01:00.0
Alex, I think you now have everything needed to implement a proper fix for this bug.
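To make the class-matching problem concrete, here is a small userspace sketch in C that models the detection loop. It is not kernel code: struct fake_pci_dev, scan_class and detect_atpx are hypothetical names, the has_atpx flag merely stands in for radeon_atpx_pci_probe_handle(), and the model assumes that pci_get_class() matches the full 24-bit class code exactly. Which of the two devices actually exposes the ATPX handle is also an assumption here; the outcome is the same either way, because the count check fails without the second scan.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Class codes as quoted from pci_ids.h in the comment above. */
#define PCI_CLASS_DISPLAY_VGA   0x0300
#define PCI_CLASS_DISPLAY_OTHER 0x0380

/* Hypothetical stand-in for a PCI device; not the kernel's struct pci_dev. */
struct fake_pci_dev {
    const char *slot;     /* e.g. "01:00.0" */
    uint32_t class_code;  /* 24-bit value: base class, subclass, prog-if */
    bool has_atpx;        /* models the result of the ATPX handle probe */
};

/* Count devices of one class and accumulate whether any of them has ATPX. */
static int scan_class(const struct fake_pci_dev *devs, size_t n,
                      uint16_t class16, bool *has_atpx)
{
    int count = 0;

    for (size_t i = 0; i < n; i++) {
        /* Assumed exact match on the class shifted left by 8 bits. */
        if (devs[i].class_code == ((uint32_t)class16 << 8)) {
            count++;
            *has_atpx |= devs[i].has_atpx;
        }
    }
    return count;
}

/* scan_other = false models the 3.14 loop; true models the patched loop. */
bool detect_atpx(const struct fake_pci_dev *devs, size_t n, bool scan_other)
{
    bool has_atpx = false;
    int vga_count = scan_class(devs, n, PCI_CLASS_DISPLAY_VGA, &has_atpx);

    if (scan_other)
        vga_count += scan_class(devs, n, PCI_CLASS_DISPLAY_OTHER, &has_atpx);

    return has_atpx && vga_count == 2;
}
```

With the two devices from the lspci output (class codes 0x030000 and 0x038000), the unpatched variant counts only the Intel GPU, so the vga_count == 2 check cannot succeed; scanning the second class as well brings the count to 2.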
Created attachment 132301 [details] fix ATPX detection on non-VGA dGPUs
Thanks for sorting this out.
Created attachment 132311 [details] avoid a possible crash when runpm is forced on non-ATPX systems
Fix runpm=1 handling on non-PX systems.
(In reply to Alex Deucher from comment #10)
> Created attachment 132301 [details]
> fix ATPX detection on non-VGA dGPUs
>
> Thanks for sorting this out.
This patch is the same as mine; it is already tested and works.
(In reply to Alex Deucher from comment #11)
> Created attachment 132311 [details]
> avoid a possible crash when runpm is forced on non-ATPX systems
>
> Fix runpm=1 handling on non-PX systems.
This patch does not apply on top of 3.14 or on top of linus master (55101e2d6ce1c780f6ee8fee5f37306971aac6cd):
linux/drivers/gpu/drm/radeon$ patch -p5 -i 0002-drm-radeon-don-t-allow-runpm-1-on-systems-with-out-A.patch
patching file radeon_kms.c
Hunk #1 FAILED at 107.
1 out of 1 hunk FAILED -- saving rejects to file radeon_kms.c.rej
(In reply to Pali Rohár from comment #13)
> (In reply to Alex Deucher from comment #11)
> > Created attachment 132311 [details]
> > avoid a possible crash when runpm is forced on non-ATPX systems
> >
> > Fix runpm=1 handling on non-PX systems.
>
> It is not possible to apply this patch on top of 3.14 nor on top of linus
> master (55101e2d6ce1c780f6ee8fee5f37306971aac6cd)
>
> linux/drivers/gpu/drm/radeon$ patch -p5 -i
> 0002-drm-radeon-don-t-allow-runpm-1-on-systems-with-out-A.patch
> patching file radeon_kms.c
> Hunk #1 FAILED at 107.
> 1 out of 1 hunk FAILED -- saving rejects to file radeon_kms.c.rej
It relies on other patches in the radeon -fixes tree. It should apply against:
http://cgit.freedesktop.org/~deathsimple/linux/log/?h=drm-fixes-3.15-wip
I have now tested this patch with a 3.15-rc2 kernel, and the kernel no longer crashes with runpm=1. But there is another problem: runpm=1 does not seem to work correctly. It does not power off the radeon card when it is not in use.
My bad, I'm using tlp, which calls:
$ echo on > /sys/bus/pci/devices/0000:01:00.0/power/control
when the notebook is running on AC. This prevents runpm from working correctly. After I blacklisted the radeon card in tlp, runpm started working correctly.
OK, when power control is set to auto via
$ echo auto > /sys/bus/pci/devices/0000:01:00.0/power/control
the card is automatically turned off when it is not in use. When I set it to on via
$ echo on > /sys/bus/pci/devices/0000:01:00.0/power/control
it is always on. So it is working as expected, and I am closing this bug as fixed.
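For reference, the same power/control switch can be flipped or inspected from a small C helper instead of echo. This is only a sketch: set_power_control and get_power_control are hypothetical helper names, the path is passed in by the caller (for this machine it would be the /sys/bus/pci/devices/0000:01:00.0/power/control file used above), and only the two values seen in this thread, "auto" and "on", are accepted. Writing the real sysfs file needs root.

```c
#include <stdio.h>
#include <string.h>

/*
 * Write "auto" or "on" to a runtime PM control file.
 * Returns 0 on success, -1 on an unsupported mode or an I/O error.
 */
int set_power_control(const char *path, const char *mode)
{
    /* Only the two values used in this thread are accepted. */
    if (strcmp(mode, "auto") != 0 && strcmp(mode, "on") != 0)
        return -1;

    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;

    int ok = (fputs(mode, f) >= 0);
    /* fclose flushes the buffer, so a failed write can surface here. */
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/*
 * Read the current setting into buf, with any trailing newline removed.
 * Returns 0 on success, -1 on error.
 */
int get_power_control(const char *path, char *buf, size_t len)
{
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    if (fgets(buf, (int)len, f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);
    buf[strcspn(buf, "\n")] = '\0';
    return 0;
}
```

A tool such as tlp that forces the value to on would undo an earlier auto setting, which matches the behaviour described in the previous comment; reading the file back with get_power_control is a quick way to see which value is currently in effect.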