Bug 51381
The error messages are just a symptom. They are generated because the driver is trying to access hardware that has been powered down. vgaswitcheroo probably needs some logic to track what state the hardware is in so that the drivers know whether it's there or not.

Since this is a regression (everything worked like a charm in 3.6.6), I would guess the logic is in place and that this is just a side effect of some other fix. But that's only a guess, as I'm not a kernel developer.

Can you bisect?

There are a lot of commits between 3.6.6 and 3.6.9, but I will see what I can do.

I now ran the same 3.6.6 kernel as I did before, but now it didn't work there either, so something else must be wrong. I will do some more testing, but it seems the kernel is not to blame after all.

intel + radeon + vgaswitcheroo seems to have been problematic for quite some time:
https://bugzilla.kernel.org/show_bug.cgi?id=23592
I'm using 3.10rc5. While I have no problems with suspending or quitting X (I didn't have problems with some earlier kernels either), I have had yet another problem for some time now. If I start X with the radeon card enabled, then disable the radeon card with vgaswitcheroo, then switch to a tty, then switch back to X, X hangs and I get many messages like

[ 236.688466] [drm:atom_op_jump] *ERROR* atombios stuck in loop for more than 5secs aborting
[ 236.688470] [drm:atom_execute_table_locked] *ERROR* atombios stuck executing D46E (len 62, WS 0, PS 0) @ 0xD48A

I assume the problem is still the same. The most pressing issue is that X should not get stuck, no matter what. It's much better to reset the driver somehow, even if you lose 3D acceleration or so. As it is now, X eats all keyboard input and you have to use sysrq keys or reboot over ssh, which is not really ideal.

See comment #1.

I have the same dual graphics configuration (Intel graphics and ATI HD 5650) and also have problems with resume.
It takes 40 seconds, and I get the following error messages:

[drm:atom_op_jump] *ERROR* atombios stuck in loop for more than 5secs aborting
[drm:atom_execute_table_locked] *ERROR* atombios stuck executing CD0C (len 62, WS 0, PS 0) @ 0xCD28
[drm:atom_execute_table_locked] *ERROR* atombios stuck executing BA84 (len 937, WS 4, PS 0) @ 0xBB94
[drm:atom_execute_table_locked] *ERROR* atombios stuck executing BA1A (len 76, WS 0, PS 8) @ 0xBA22
[drm:radeon_pm_resume_dpm] *ERROR* radeon: dpm resume failed

I use the 3.14.2 kernel; the init system is systemd. The problems began when I switched from 3.12 to 3.13. I use vgaswitcheroo to disable the discrete GPU at boot time. Beginning with the 3.13 kernel, radeon uses runtime power management (runpm), and I have to pass "runpm=0" to the radeon module, because runpm automatically re-enables the discrete GPU and causes other problems (https://bugs.gentoo.org/show_bug.cgi?id=506188). Without the "runpm=0" parameter my system resumes immediately. Please let me know if you need additional information. Thank you!

Created attachment 135131 [details]
kernel log
Kernel log includes suspend-resume info and error messages.
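
For anyone checking which runtime PM setting the loaded driver actually ended up with, here is a minimal sketch; it was not posted in this report. It assumes the radeon module exposes the parameter read-only at /sys/module/radeon/parameters/runpm and that the values mean 0 = disabled, 1 = forced on, -1 = driver default for PX systems. radeon_runpm_state and PARAM_FILE are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Sketch: report the runpm value the loaded radeon module is using.
# PARAM_FILE is a hypothetical override so the function can be tried on a
# plain file; the default path is an assumption about the sysfs layout.
radeon_runpm_state() {
    param_file="${PARAM_FILE:-/sys/module/radeon/parameters/runpm}"
    if [ ! -r "$param_file" ]; then
        # Module not loaded, or the parameter is not exported.
        echo "unknown"
        return 1
    fi
    value=$(cat "$param_file")
    case "$value" in
        0)  echo "disabled" ;;
        1)  echo "enabled" ;;
        -1) echo "auto" ;;
        *)  echo "unknown" ;;
    esac
}
```

Calling radeon_runpm_state after boot should print "disabled" when radeon.runpm=0 took effect; "unknown" only means the file could not be read.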
(In reply to newgarry from comment #8)
> Without parameter "runpm=0" my system resumes immediately. Please let me
> know if you need additional information. Thank you!

Are you saying that everything is working properly without runpm=0?

(In reply to Alex Deucher from comment #10)
> Are you saying that everything is working properly without runpm=0?

Yes, it is. System resumes immediately and without error messages.

Hello,

I'm not sure if the problem I have is the same as reported in the original bug report on 2012-12-07, but I have a problem very similar to the one described by newgarry on 2014-05-04.

I have an Asus K73TA laptop with AMD A6-3400M APU and Radeon 6550 GPU. I get the following errors on boot with kernel versions 3.13 and above:

[ 53.720848] [drm:atom_op_jump] *ERROR* atombios stuck in loop for more than 5secs aborting
[ 53.720975] [drm:atom_execute_table_locked] *ERROR* atombios stuck executing CE56 (len 62, WS 0, PS 0) @ 0xCE72
[ 53.721107] [drm:atom_execute_table_locked] *ERROR* atombios stuck executing BB62 (len 1036, WS 4, PS 0) @ 0xBC5F
[ 53.721240] [drm:atom_execute_table_locked] *ERROR* atombios stuck executing BAF8 (len 76, WS 0, PS 8) @ 0xBB00
[ 55.775951] [drm:r600_ring_test] *ERROR* radeon: ring 0 test failed (scratch(0x8504)=0xFFFFFFFF)
[ 55.776083] [drm:evergreen_resume] *ERROR* evergreen startup failed on resume
[ 55.776364] [drm:radeon_pm_resume_dpm] *ERROR* radeon: dpm resume failed

Initially I thought this was Debian-specific, so I reported it on the Debian BTS; it has more details here:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=737684

I bisected kernel versions between 3.12 and 3.13 and determined that this issue was introduced in the following git commit:

commit 10ebc0bc09344ab6310309169efc73dfe6c23d72
Author: Dave Airlie <airlied@redhat.com>
Date: Mon Sep 17 14:40:31 2012 +1000

drm/radeon: add runtime PM support (v2)

This hooks radeon up to the runtime PM system to enable dynamic power management for secondary GPUs in switchable and
powerxpress laptops.

v2: agd5f: clean up, add module parameter

Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Newgarry, can you please try kernel v3.12 and see if it works correctly for you?

Because of this issue I cannot upgrade my kernel to anything above 3.12. I spent a week bisecting the kernel, so I would really appreciate someone looking into this. I tried reviewing the changes introduced in this commit, but I know too little about the radeon drivers to understand the impact they had. If you need me to do any additional testing or provide you with extra information, please don't hesitate to contact me; I'll do what I can.

Sincerely,
Teofilis Martisius

There have been a lot of PX fixes in 3.15. Can you try 3.15? Additionally, you can disable the PX runtime pm support by appending radeon.runpm=0 on the kernel command line in grub.

Hi,

Thank you for the very quick response. I have just tried v3.15rc7 from Debian Experimental; it still has the same problem. I have attached an excerpt from dmesg at the end of this message. I'll attach a full dmesg log as well.

I have tried the same kernel with radeon.runpm=0, and it works correctly. I can run glxgears on both my primary and my secondary GPU with "xrandr --setprovideroffloadsink xx yy" and "DRI_PRIME=1 glxgears". Both work correctly. So disabling power management works as a workaround. However, it's just a workaround, and it would be interesting to get the underlying issue fixed.

I'll try 3.15rc8 next and see if that has any improvements.

P.S. My current kernel boot-time options are:
quiet radeon.audio=0 modeset=1 radeon.dpm=1 radeon.no_wb=1 radeon.runpm=0

I'm running Debian/Sid. Could it be something broken in userspace interfering?
Sincerely,
Teofilis Martisius

======
[ 55.886107] [drm:atom_op_jump] *ERROR* atombios stuck in loop for more than 5secs aborting
[ 55.886234] [drm:atom_execute_table_locked] *ERROR* atombios stuck executing CE56 (len 62, WS 0, PS 0) @ 0xCE72
[ 55.889662] [drm:atom_execute_table_locked] *ERROR* atombios stuck executing BB62 (len 1036, WS 4, PS 0) @ 0xBC5F
[ 55.892979] [drm:atom_execute_table_locked] *ERROR* atombios stuck executing BAF8 (len 76, WS 0, PS 8) @ 0xBB00
[ 55.896345] [drm:radeon_pm_resume_dpm] *ERROR* radeon: dpm resume failed
[ 57.418996] radeon 0000:01:00.0: Wait for MC idle timedout !
[ 57.609356] radeon 0000:01:00.0: Wait for MC idle timedout !
[ 57.628060] [drm] PCIE GART of 1024M enabled (table at 0x0000000000273000).
[ 57.628181] radeon 0000:01:00.0: WB enabled
[ 57.628189] radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000040000c00 and cpu addr 0xffff88014876ac00
[ 57.628194] radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000040000c0c and cpu addr 0xffff88014876ac0c
[ 57.637229] radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x0000000000072118 and cpu addr 0xffffc90011cb2118
[ 57.844594] [drm:r600_ring_test] *ERROR* radeon: ring 0 test failed (scratch(0x8504)=0xFFFFFFFF)
[ 57.844724] [drm:evergreen_resume] *ERROR* evergreen startup failed on resume
[ 57.847954] [drm:radeon_pm_resume_dpm] *ERROR* radeon: dpm resume failed

Created attachment 138161 [details]
Dmesg from 3.15rc7 from Debian/Experimental
Asus K73TA laptop with AMD A6-3400M APU and Radeon 6550 GPU
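
When comparing kernels or boot options, it can help to count how often each failure signature from the excerpts above shows up in a log. This is only a rough sketch; radeon_error_summary is a hypothetical helper name, and the patterns are simply the message fragments quoted in this report.

```shell
#!/bin/sh
# Sketch: count the radeon failure signatures quoted in this report.
# Reads kernel log text on stdin, for example:  dmesg | radeon_error_summary
radeon_error_summary() {
    awk '
        /atombios stuck in loop/   { loop++ }
        /atombios stuck executing/ { tbl++ }
        /ring [0-9]+ test failed/  { ring++ }
        /dpm resume failed/        { dpm++ }
        END {
            # Uninitialized counters print as 0 with %d.
            printf "stuck_in_loop=%d stuck_executing=%d ring_test_failed=%d dpm_resume_failed=%d\n", loop, tbl, ring, dpm
        }'
}
```

A clean boot should report zero for every counter; the affected boots in this thread would show non-zero values in most of them.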
Hello,

I can reproduce this problem on v3.15rc8 with power management enabled (radeon.runpm=1), so this hasn't been fixed in v3.15 yet. v3.15rc8 works OK with the radeon.runpm=0 flag.

Sincerely,
Teofilis Martisius

Does removing radeon.dpm=1 from the kernel command line fix the issue? It's enabled by default on asics where it is stable.

Hello,

I tried removing radeon.dpm=1; it did NOT fix the issue. I think my GPUs are considered "stable" by now; it's not a new laptop. Let me know if you need anything else.

Sincerely,
Teofilis Martisius

[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-3.15.0-rc8 root=UUID=6b66e960-0da6-4a99-abfe-f23614b50db9 ro quiet radeon.audio=0 modeset=1 radeon.no_wb=1
.....
[ 59.122053] [drm:atom_op_jump] *ERROR* atombios stuck in loop for more than 5secs aborting
[ 59.122183] [drm:atom_execute_table_locked] *ERROR* atombios stuck executing CE56 (len 62, WS 0, PS 0) @ 0xCE72
[ 59.122315] [drm:atom_execute_table_locked] *ERROR* atombios stuck executing BB62 (len 1036, WS 4, PS 0) @ 0xBC5F
[ 59.122448] [drm:atom_execute_table_locked] *ERROR* atombios stuck executing BAF8 (len 76, WS 0, PS 8) @ 0xBB00
[ 60.722717] radeon 0000:01:00.0: Wait for MC idle timedout !
[ 60.923224] radeon 0000:01:00.0: Wait for MC idle timedout !
[ 60.942019] [drm] PCIE GART of 1024M enabled (table at 0x0000000000273000).
[ 60.942140] radeon 0000:01:00.0: WB enabled
[ 60.942148] radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000040000c00 and cpu addr 0xffff880089552c00
[ 60.942153] radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000040000c0c and cpu addr 0xffff880089552c0c
[ 60.951189] radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x0000000000072118 and cpu addr 0xffffc90011cb2118
[ 61.178505] [drm:r600_ring_test] *ERROR* radeon: ring 0 test failed (scratch(0x8504)=0xFFFFFFFF)
[ 61.178637] [drm:evergreen_resume] *ERROR* evergreen startup failed on resume
[ 66.180845] [drm:atom_op_jump] *ERROR* atombios stuck in loop for more than 5secs aborting
[ 66.180974] [drm:atom_execute_table_locked] *ERROR* atombios stuck executing C50C (len 1136, WS 0, PS 0) @ 0xC536
[ 66.219989] vgaarb: device changed decodes: PCI:0000:01:00.0,olddecodes=io+mem,decodes=none:owns=none
[ 66.219995] vgaarb: device changed decodes: PCI:0000:00:01.0,olddecodes=io+mem,decodes=none:owns=none

(In reply to Teofilis Martisius from comment #18)
> Hello,
>
> I tried removing radeon.dpm=1, it did NOT fix the issue. I think my GPUs are
> considered "stable" by now- it's not a new laptop.

Please attach your full dmesg output with radeon.dpm=1 removed. Does disabling the dGPU manually via vgaswitcheroo with runpm=0 or on older kernels prior to 10ebc0bc09344ab6310309169efc73dfe6c23d72 actually work or does it have similar problems?

Created attachment 138351 [details]
Dmesg output from 3.15rc8 without radeon.dpm=1 switch
Dmesg output from 3.15rc8 without radeon.dpm=1 switch and without radeon.runpm=0 switch.
Command line: BOOT_IMAGE=/boot/vmlinuz-3.15.0-rc8 root=UUID=6b66e960-0da6-4a99-abfe-f23614b50db9 ro quiet radeon.audio=0 modeset=1 radeon.no_wb=1
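
Since several comments hinge on exactly which radeon options were on the command line, here is a small sketch for listing them. radeon_cmdline_options is a hypothetical helper name; with no argument it reads /proc/cmdline, otherwise it parses the string it is given. It only keeps tokens prefixed with "radeon.", so the bare modeset=1 seen above is skipped.

```shell
#!/bin/sh
# Sketch: list the radeon.* options present on a kernel command line.
# With no argument it reads the running kernel's command line.
radeon_cmdline_options() {
    if [ "$#" -gt 0 ]; then
        cmdline="$1"
    else
        cmdline=$(cat /proc/cmdline)
    fi
    # Put each whitespace-separated token on its own line, then keep
    # only the ones that start with "radeon."
    printf '%s\n' "$cmdline" | tr -s '[:space:]' '\n' | grep '^radeon\.'
}
```

For the command line in the attachment description above, this would print radeon.audio=0 and radeon.no_wb=1, which makes it easy to confirm that radeon.dpm and radeon.runpm were left at their defaults for that run.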
Ok, I tried playing around with Linux 3.12; that's the last stable version before that commit.

Booted with parameters:
[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-3.12-1-amd64 root=UUID=... ro quiet radeon.audio=0 modeset=1 radeon.dpm=1 radeon.no_wb=1 radeon.runpm=0

Executed:
echo OFF >/sys/kernel/debug/vgaswitcheroo/switch

The dGPU got switched off and everything works fine. Got a message in dmesg:
[ 1454.396648] radeon: switched off

I don't want to spam attachments on this bug report; the full dmesg output, if needed, is at:
http://menulis.org/kernel/dmesg_3.12_off1.log

I tried it with radeon.runpm=1 as well: same results, everything works fine. Dmesg:
http://menulis.org/kernel/dmesg_3.12_off2.log

Let me know if you want these two dmesg logs attached here in bugzilla. Let me know if you need me to try something else next.

Sincerely,
Teofilis Martisius

Sorry, I should have been more clear. On kernel 3.12, does the dGPU switch on again properly via debugfs after you've disabled it via debugfs? The problem you are seeing is that the GPU turns off ok, but seems to have problems turning back on:

[ 52.340438] radeon 0000:01:00.0: Refused to change power state, currently in D3
[ 52.416464] radeon 0000:01:00.0: Refused to change power state, currently in D3
[ 52.432469] radeon 0000:01:00.0: Refused to change power state, currently in D3

and the registers are all reading back 0xffffffff, which usually means the device is still powered off.

Also, why are you using radeon.no_wb=1? That may cause problems.

Hi,

Thank you once again for the quick response.

Ok, a while ago I added radeon.no_wb=1 because without it I was getting display corruption. That problem seems to be gone now, so there's no reason to keep that option any more; I've taken it off.

I think you nailed the problem. On 3.12, it fails to turn ON the dGPU after it has been turned OFF.
I have tried this by booting 3.12 with the following boot parameters:
[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-3.12-1-amd64 root=UUID=6b66e960-0da6-4a99-abfe-f23614b50db9 ro quiet radeon.audio=0 modeset=

Then I did:
echo OFF >/sys/kernel/debug/vgaswitcheroo/switch
echo ON >/sys/kernel/debug/vgaswitcheroo/switch

After "echo ON" I got similar "atombios stuck in loop" errors in dmesg, and errors that the dGPU failed to come back on. "DRI_PRIME=1 glxgears" of course fails to work after that as well. I've attached the dmesg from 3.12.

Ok, what next?

P.S. Sorry for the slow responses. I only have time to do this in the evenings, and I'm in London; I guess you are in a different timezone.

Sincerely,
Teofilis Martisius

Created attachment 138411 [details]
Dmesg output from 3.12, fails to turn on dGPU
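
The OFF/ON test above can be wrapped in a small script so other reporters run the same steps. This is a sketch only. It assumes the switch file lists one client per line in the form id:name:active flag:power state:PCI address, as in the "1:DIS: :Off:0000:01:00.0" sample quoted elsewhere in this report. dgpu_power_state, dgpu_power_cycle and SWITCH_FILE are hypothetical names.

```shell
#!/bin/sh
# Sketch: helpers around the vgaswitcheroo debugfs switch file.
# SWITCH_FILE is a hypothetical override for trying this on a saved copy.

# Print the power state field of the discrete GPU ("DIS") entry.
dgpu_power_state() {
    switch_file="${SWITCH_FILE:-/sys/kernel/debug/vgaswitcheroo/switch}"
    awk -F: '$2 == "DIS" { print $4 }' "$switch_file"
}

# Power the dGPU off and on again, then show recent radeon/drm messages.
# Needs root and a mounted debugfs. On affected machines the ON step may
# stall for a long time, so only run it where a reboot is acceptable.
dgpu_power_cycle() {
    switch_file="${SWITCH_FILE:-/sys/kernel/debug/vgaswitcheroo/switch}"
    echo OFF > "$switch_file"
    sleep 2
    echo ON > "$switch_file"
    dmesg | grep -E 'radeon|drm' | tail -n 20
}
```

Running dgpu_power_state before and after dgpu_power_cycle shows whether the state reported by the switch file matches what dmesg says happened.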
Ok, so it appears your dGPU never powered up properly. You just see the problem now because prior to the runpm patch (which dynamically turns the dGPU on/off) it was always left on.

Created attachment 138421 [details]
possible fix
Does this patch help? If not, can you try increasing the size of the delay and see if that helps?
Hello,

Sorry for the delay, I had other plans for the weekend.

The patch did not help. I tried it with the default delay of 20, and then with the delay set to 200 (200 what? milliseconds?). I tried both the default delay and the 200 delay on both 3.12.21 and 3.15rc8, with no luck. I changed the patch to increase the delay and to print it out; you can see it in dmesg. I have attached the dmesg output for the 200 delay runs for 3.12.21 and 3.15rc8.

I ran the kernels with the following boot parameters:
3.12.21: BOOT_IMAGE=/boot/vmlinuz-3.12.21d200 root=UUID=xxx ro quiet radeon.audio=0 modeset=1
3.15.0-rc8: BOOT_IMAGE=/boot/vmlinuz-3.15.0-rc8teo root=xxx ro quiet radeon.audio=0 modeset=1 radeon.runpm=0

Sincerely,
Teofilis Martisius

diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c
index b512c00..0574d56 100644
--- a/drivers/gpu/drm/radeon/radeon_device.c
+++ b/drivers/gpu/drm/radeon/radeon_device.c
@@ -1093,8 +1093,10 @@ static void radeon_switcheroo_set_state(struct pci_dev *pdev, enum vga_switchero
 /* don't suspend or resume card normally */
 dev->switch_power_state = DRM_SWITCH_POWER_CHANGING;
- if (d3_delay < 20 && radeon_switcheroo_quirk_long_wakeup(pdev))
- dev->pdev->d3_delay = 20;
+ if (d3_delay < 200 /*&& radeon_switcheroo_quirk_long_wakeup(pdev)*/) {
+ dev->pdev->d3_delay = 200;
+ printk(KERN_INFO "radeon: d3 delay set to 200\n");
+ }
 radeon_resume_kms(dev, true, true);

Created attachment 138621 [details]
dmesg output for 3.12.21 patched with delay of 200
Created attachment 138631 [details]
dmesg output for 3.15-rc8 patched with delay of 200
Ok, it appears powering up the dGPU has never worked properly on your system. As a workaround for now you can disable runpm by adding radeon.runpm=0. I can add a quirk to the driver to disable runpm on your system by default until someone figures out how to fix it.

Created attachment 138641 [details]
testing patch
Does this patch help? Note, this will probably break regular suspend/resume, so just test it for switcheroo.
Created attachment 138791 [details]
dmesg output for 3.15-rc8, delay of 200, patched with testing patch 2
Hello,

I have applied patch2 to 3.15-rc8 along with patch1 and the delay of 200, ran with "radeon.runpm=0", and tried to turn the dGPU off and on; the problem is still there. Dmesg attached.

Sincerely,
Teofilis Martisius

Hello,

Kernel 3.15.2 solved my problem, described in comment 8. Many thanks!

Created attachment 143411 [details]
disable runpm by default on problematic systems
This patch disables runpm by default on the Asus K73TA laptop so that the system will be usable out of the box until we fix the deeper issue.
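
A patch like this matches a laptop by PCI IDs. Judging by the quirk entry and the dmesg excerpt posted later in this report, the four values are the ones radeon prints in its "initializing kernel modesetting" line: vendor:device, then subsystem vendor:subsystem device. The following is a rough sketch for pulling them out of a log so they can be included in a request for a quirk. px_quirk_ids is a hypothetical helper name and the output field labels are my own.

```shell
#!/bin/sh
# Sketch: extract the PCI IDs from radeon's "initializing kernel modesetting"
# log lines. Reads log text on stdin, for example:  dmesg | px_quirk_ids
# In the sed expression, plain ( and ) match literal parentheses, while
# \( and \) capture the four hexadecimal IDs.
px_quirk_ids() {
    sed -n 's/.*initializing kernel modesetting ([A-Z0-9_]* 0x\([0-9A-Fa-f]*\):0x\([0-9A-Fa-f]*\) 0x\([0-9A-Fa-f]*\):0x\([0-9A-Fa-f]*\)).*/vendor=0x\1 device=0x\2 subsys_vendor=0x\3 subsys_device=0x\4/p'
}
```

On a dual-GPU machine this prints one line per GPU, so the discrete GPU's line still has to be picked by hand.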
Hi,

Please let me know if I can do anything to help "fix the deeper issue". I'd like to have runpm working properly on my machine, and I do have some free time. Unfortunately I'm not familiar with Radeon GPU internals or kernel driver internals, so I cannot do it myself. I tried going through the .c files affected by the patches you sent to understand what's going on there, and failed. But I can build and test kernels, do some experiments, and feed you the information. On the other hand, your time is probably better spent on R9 support and new OpenGL features...

Sincerely,
Teofilis

I'm also getting this with kernel 3.16.3 on Arch. I've also fixed the problem by adding radeon.runpm=0 to my boot parameters.

CPU=A6-3420m with Ati 6520G and 7670m -> Asus K53TK

[ 1.056775] checking generic (b0000000 300000) vs hw (b0000000 10000000)
[ 1.056777] fb: switching to radeondrmfb from EFI VGA
[ 1.056802] Console: switching to colour dummy device 80x25
[ 1.057757] [drm] initializing kernel modesetting (SUMO 0x1002:0x9647 0x1043:0x2122).
[ 1.057777] [drm] register mmio base: 0xFEB00000
[ 1.057778] [drm] register mmio size: 262144
.......................................................................
[ 1.487137] [drm] Initialized radeon 2.39.0 20080528 for 0000:00:01.0 on minor 0
[ 1.487250] radeon 0000:01:00.0: enabling device (0000 -> 0003)
[ 1.487940] [drm] initializing kernel modesetting (TURKS 0x1002:0x6840 0x1043:0x2122).
[ 1.487962] [drm] register mmio base: 0xFEA20000
[ 1.487964] [drm] register mmio size: 131072
---
/* Asus K53TK laptop with AMD A6-3420M APU and Radeon 7670m GPU
 * https://bugzilla.kernel.org/show_bug.cgi?id=51381
 */
{ PCI_VENDOR_ID_ATI, 0x6840, 0x1043, 0x2122, RADEON_PX_QUIRK_DISABLE_PX },

Thanks. I've added a quirk for your system.

I'm also experiencing this issue on kernel 4.4 on Debian, where it happens at boot time and the system is basically unable to boot because it locks up completely. runpm=0 also solves the issue.
My hardware:
CPU: AMD Phenom II X4 965
Motherboard: Gigabyte 990FXA-UD3
GPU: AMD Radeon HD 5670

If there is any way I can help, let me know.

I've started seeing this on a Dell Latitude E6540 with a Radeon 8790M and Intel hybrid graphics. I'm using vgaswitcheroo to disable the Radeon in Linux. I'm on Arch Linux; I wasn't seeing the problem in kernel 4.5.0, but I'm seeing it now in 4.5.1. The kernel spits out errors about the atombios being stuck in a loop and then fills dmesg with messages about "ring 3 stalled". After about a full minute of this, X starts on the Intel GPU and the system seems to run normally. The kernel continues to spit out error messages during operation. Adding radeon.runpm=0 to the kernel parameters results in a normal boot, where the driver starts up normally and X starts up much more quickly. I do see that a patch related to runtime PM (e64c952efb8e0c15ae82cec8e455ab4910690ef1) went into the kernel recently.

I guess you probably have a different issue. Alex, what info do I need to provide to add my laptop to the quirk list? Acer 7560g, 6620G+6650M.

Is this still an issue with kernel 3.8? A bunch of PX fixes went into that kernel.

Hi,

The last one I tried was 4.7. I removed my laptop from the quirks list and the "ATOMBIOS stuck in loop" message still happened. I'll test this with 4.8-rc4 over the weekend.

Sincerely,
Teofilis

Hello,

I was hoping the new changes would fix this, but that's not the case. The system freezes (and logs the "atombios stuck" message) for about 15 seconds when starting X and when resuming from suspend. Also, running DRI_PRIME=1 glxgears just gets a blank window with a print loop of this message: "radeon: The kernel rejected CS, see dmesg for more information." And of course there is the "*ERROR* atombios stuck" stuff. I'm using an Asus K53TA laptop (6520G + 6650M) and a git kernel recompiled just yesterday.

Created attachment 232761 [details]
dmesg output from 4.8-rc4 when turning OFF dGPU via vgaswitcheroo
Created attachment 232771 [details]
dmesg output from 4.8-rc4 with RADEON_PX_QUIRK_DISABLE_PX quirks removed
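
A reply further down reads the ATPX function mask out of a log like the ones attached here ("ATPX version 1, functions 0x00000181") and notes that bit 1 should be set if powering down the dGPU is supported. As a worked check, assuming bits are numbered from zero so that bit 1 has the value 0x2: 0x181 is binary 1 1000 0001, so bits 0, 7 and 8 are set and bit 1 is clear. The same test as a sketch, where atpx_bit_set is a hypothetical helper name:

```shell
#!/bin/sh
# Sketch: test whether a given bit is set in an ATPX "functions" mask.
# Usage: atpx_bit_set MASK BIT
# Exit status is 0 if the bit is set and 1 if it is clear.
# Bits are counted from zero (an assumption), so bit 1 is the value 0x2.
atpx_bit_set() {
    mask=$1
    bit=$2
    [ $(( ($mask >> $bit) & 1 )) -eq 1 ]
}

# Example with the mask from this report:
#   atpx_bit_set 0x00000181 1 || echo "bit 1 is clear"
```

With the mask quoted in this report the check fails for bit 1, which agrees with the reply that the system does not appear to support powering down the dGPU.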
Hi,

I ran two tests over the weekend.

First, I tried booting stock 4.8-rc4. glxgears runs fine on both the APU and the dGPU, but it fails when I try to turn OFF my dGPU by doing:
echo OFF >/sys/kernel/debug/vgaswitcheroo/switch
dmesg output attached.

Second, I modified 4.8-rc4 to remove my computer from the radeon_device.c quirks list (it has the RADEON_PX_QUIRK_DISABLE_PX quirk assigned) and rebooted. dmesg attached. It does boot up, but "DRI_PRIME=1 glxgears" displays just a blank window. dmesg still shows the "atombios stuck in loop" error.

I won't be able to run more tests soon, as I bricked the laptop trying to upgrade the BIOS. I plan to repair/recover it, but it will take a while.

The OS being used is Debian/Sid. I used the stock kernel from kernel.org, with a one-line change in the second test. Same laptop as before, described in previous comments.

I hope any of this helps.

Sincerely,
Teofilis Martisius

Hi,

Ok, I got my laptop fixed and now my BIOS is v2.14 (was 2.06). Nothing changed. I have tested with kernel v4.8 and get the same errors as in 4.8-rc4. Please let me know what else I can do to help get this solved. I'll try to reproduce this with v4.9 when it's closer to release.

Sincerely,
Teofilis Martisius

Your system does not appear to support powerdown of the dGPU. From your log:
[ 9.611183] ATPX version 1, functions 0x00000181
Bit 1 of the functions should be set if it does.

System is kernel 4.9.0-RC8, Fedora 25, "[AMD/ATI] Venus PRO [Radeon HD 8850M / R9 M265X]".

Some other bug pushed me to boot the kernel with radeon.runpm=0. I noticed that I had this atombios stuck error when trying to turn ON the graphics card, due to laptop power usage being too high.

Turn off is:
echo OFF | sudo tee /sys/kernel/debug/vgaswitcheroo/switch
Turn on is:
echo ON | sudo tee /sys/kernel/debug/vgaswitcheroo/switch

My switch looks like this:
0:IGD:+:Pwr:0000:00:02.0
1:DIS: :Off:0000:01:00.0

Power usage is reported by powertop. Low means ~12 W and high means ~17 W.

So, the test cases are:

1) Boot, turn off AMD.
Power usage is LOW. Suspend and resume. Power usage is now high. Switch reports OFF.

2) Boot, turn off AMD. Power usage is LOW. Suspend and resume. Power usage is now high. Switch reports OFF. Turning on the GPU takes a long time and throws the atombios stuck error. Turning it off again reduces usage almost to low, but my laptop fans are a bit crazy.

3) Boot, turn off AMD. Power usage is LOW. Before suspending, turn ON via systemd hook. Suspend and resume. After suspend, turn OFF via systemd hook. No errors in dmesg. Power usage is low. No delays.

So 3) is the perfect solution, where everything works as expected with just a quick workaround. Probably my GPU doesn't support being turned ON after suspending, and my laptop should suspend with the GPU turned on.

I have this same issue, with an Asus K73TK laptop. It has an A6-3420M APU and a 7670M dGPU. On boot these appear in dmesg:

[ 27.436081] [drm:atom_op_jump [radeon]] *ERROR* atombios stuck in loop for more than 5secs aborting
[ 27.436195] [drm:atom_execute_table_locked [radeon]] *ERROR* atombios stuck executing CE9C (len 62, WS 0, PS 0) @ 0xCEB8
[ 27.436307] [drm:atom_execute_table_locked [radeon]] *ERROR* atombios stuck executing BB9C (len 1036, WS 4, PS 0) @ 0xBC99
[ 27.436420] [drm:atom_execute_table_locked [radeon]] *ERROR* atombios stuck executing BB32 (len 76, WS 0, PS 8) @ 0xBB3A
[ 27.436740] [drm:radeon_pm_resume [radeon]] *ERROR* radeon: dpm resume failed
[ 29.148572] [drm] PCIE GART of 1024M enabled (table at 0x0000000000162000).
[ 29.370027] [drm:r600_ring_test [radeon]] *ERROR* radeon: ring 0 test failed (scratch(0x8504)=0xFFFFFFFF)
[ 29.370149] [drm:evergreen_resume [radeon]] *ERROR* evergreen startup failed on resume
[ 29.370321] [drm:radeon_pm_resume [radeon]] *ERROR* radeon: dpm resume failed
[ 34.372053] [drm:atom_op_jump [radeon]] *ERROR* atombios stuck in loop for more than 5secs aborting
[ 34.372115] [drm:atom_execute_table_locked [radeon]] *ERROR* atombios stuck executing C546 (len 1136, WS 0, PS 0) @ 0xC570

Adding a PX quirk for this device worked, and booting happens without errors now. I'll send the patch to amd-gfx.

Dear all,

I'm using Linux Mint; before that I used Arch Linux and also Red Hat, but my laptop suffers from the same problem because of this kernel bug where Radeon gets stuck in a loop:

Feb 06 17:57:17 oldlaptop kernel: [drm:atom_op_jump [radeon]] ERROR atombios stuck in loop for more than 5secs aborting
Feb 06 17:57:17 oldlaptop kernel: [drm:atom_execute_table_locked [radeon]] ERROR atombios stuck executing CB56 (len 62, WS 0, PS 0) @ 0xCB72
Feb 06 17:57:17 oldlaptop kernel: [drm:atom_execute_table_locked [radeon]] ERROR atombios stuck executing B716 (len 236, WS 4, PS 0) @ 0xB7E3
Feb 06 17:57:17 oldlaptop kernel: [drm:atom_execute_table_locked [radeon]] ERROR atombios stuck executing B674 (len 74, WS 0, PS 8) @ 0xB67C
Feb 06 17:57:17 oldlaptop kernel: [drm:si_dpm_enable [radeon]] ERROR si_init_smc_table failed
Feb 06 17:57:17 oldlaptop kernel: [drm:radeon_pm_resume [radeon]] ERROR radeon: dpm resume failed

This happens every time I switch from one VT to another, or any time I need to log in.

Is there any progress or resolution of this issue?

Thanks,
Luca

(In reply to Luca T.
from comment #53)
> Dear all,
> I'm using Linux Mint, before I used Arch Linux and also Red Hat, but my
> laptop suffer of the same problem because of this kernel bug where Radeon
> gets stuck in a loop:
>
> Feb 06 17:57:17 oldlaptop kernel: [drm:atom_op_jump [radeon]] ERROR atombios
> stuck in loop for more than 5secs aborting
> Feb 06 17:57:17 oldlaptop kernel: [drm:atom_execute_table_locked [radeon]]
> ERROR atombios stuck executing CB56 (len 62, WS 0, PS 0) @ 0xCB72
> Feb 06 17:57:17 oldlaptop kernel: [drm:atom_execute_table_locked [radeon]]
> ERROR atombios stuck executing B716 (len 236, WS 4, PS 0) @ 0xB7E3
> Feb 06 17:57:17 oldlaptop kernel: [drm:atom_execute_table_locked [radeon]]
> ERROR atombios stuck executing B674 (len 74, WS 0, PS 8) @ 0xB67C
> Feb 06 17:57:17 oldlaptop kernel: [drm:si_dpm_enable [radeon]] ERROR
> si_init_smc_table failed
> Feb 06 17:57:17 oldlaptop kernel: [drm:radeon_pm_resume [radeon]] ERROR
> radeon: dpm resume failed
>
> This is happening every time I'm switching from a VT to another or any time
> I need to login.
>
> Is there any progress or resolution of this issue?
>
> Thanks,
>
> Luca

Ok sorry, I found out myself that adding the flag "radeon.runpm=0" to the kernel parameters resolves the issue.

Thanks anyway,
Luca

Dear all,

To fix this issue permanently, so that I can quickly switch between multiple accounts and also resume my laptop from suspend, I made the following changes:
- added radeon.dpm=0 radeon.runpm=0 to grub so that the kernel loads radeon without enabling the power management features
- created the file /etc/modprobe.d/radeon-pm.conf with the following content:
options radeon runpm=0
options radeon dpm=0
* this makes the kernel driver read the same options after resume from suspend

Regards,
Luca

(In reply to Alex Deucher from comment #32)
> Created attachment 138641 [details]
> testing patch
>
> Does this patch help? Note, this will probably break regular
> suspend/resume, so just test it for switcheroo.
Hello Alex,

This bug is still present in kernel 5.12.1 as well. Can you please help me understand how to fix this issue?

Thanks in advance,
Luca

For users still facing this issue, I workaround it with this:
https://github.com/aelveborn/vgaswitcheroo-systemd

Basically before suspending it restores the GPU powerstate and resumes it once coming from a suspend state.

Never had problems again.

(In reply to luminoso from comment #57)
> For users still facing this issue, I workaround it with this:
> https://github.com/aelveborn/vgaswitcheroo-systemd
>
> Basically before suspending it restores the GPU powerstate and resumes it
> once coming from a suspend state.
>
> Never had problems again.

Hi Luminoso,

Thanks a lot, I'll try it out and let you know.

Thanks,
Luca

(In reply to luminoso from comment #57)
> For users still facing this issue, I workaround it with this:
> https://github.com/aelveborn/vgaswitcheroo-systemd
>
> Basically before suspending it restores the GPU powerstate and resumes it
> once coming from a suspend state.
>
> Never had problems again.

Hi Luminoso,

It works great, thanks a lot!!! I had read about this switch but lost the patience to try and retry; you saved my life! :)

Thank you so much,
Luca

(In reply to luminoso from comment #57)
> For users still facing this issue, I workaround it with this:
> https://github.com/aelveborn/vgaswitcheroo-systemd
>
> Basically before suspending it restores the GPU powerstate and resumes it
> once coming from a suspend state.
>
> Never had problems again.

Hi Luminoso,

I have a problem with my graphics card: it frequently happens that the card does not refresh the screen (especially when I'm performing "switch user"). Is it happening to you too?

Thanks,
Luca
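
For readers who want to see what a hook of this kind might look like, here is a minimal sketch along the lines of test case 3 earlier in this report: power the dGPU ON before sleeping and OFF again after resume. It is not taken from the linked project. It assumes systemd runs executables placed in /usr/lib/systemd/system-sleep/ with "pre" or "post" as the first argument and the sleep type as the second. vgaswitcheroo_sleep_hook and SWITCH_FILE are hypothetical names.

```shell
#!/bin/sh
# Sketch of a systemd system-sleep hook for vgaswitcheroo.
# $1 is "pre" (about to sleep) or "post" (just resumed); $2 is the sleep type.
# SWITCH_FILE is a hypothetical override used only for dry runs.
vgaswitcheroo_sleep_hook() {
    switch_file="${SWITCH_FILE:-/sys/kernel/debug/vgaswitcheroo/switch}"
    case "$1" in
        pre)  echo ON  > "$switch_file" ;;  # power the dGPU up before sleeping
        post) echo OFF > "$switch_file" ;;  # power it back down after resume
    esac
}

# When installed as a hook, systemd supplies the arguments; with no
# arguments (for example when sourced for a dry run) nothing is written.
if [ "$#" -gt 0 ]; then
    vgaswitcheroo_sleep_hook "$@"
fi
```

The file has to be executable and run as root with debugfs mounted. Whether leaving the dGPU powered during suspend is acceptable for battery life is a separate trade-off.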
Created attachment 88591 [details]
journald log

After updating from 3.6.6 to 3.6.9, my laptop with Intel graphics and ATI HD 5650 will not resume from suspend. I use vgaswitcheroo to disable the ATI card at boot. On resume the computer almost hangs (I can press the power button and wait 5 minutes for a proper shutdown, but no other interaction is possible). It logs a lot of messages saying:

[drm:atom_op_jump] *ERROR* atombios stuck in loop for more than 5secs aborting
[drm:atom_execute_table_locked] *ERROR* atombios stuck executing D098 (len 72, WS 0, PS 0) @ 0xD0C7

Steps to reproduce:
echo "OFF" > /sys/kernel/debug/vgaswitcheroo/switch
[suspend and resume]

Actual results:
Almost freeze.

Expected results:
Resume and work as normal.

Log is attached, but if you need anything else just ask.