Bug 15096
Summary: | Resume lock up -- bisected, commit 3a1151e3f124fd1a2c54b8153f510f1a7c715369 - ATI Technologies Inc Mobility Radeon HD 3400 | ||
---|---|---|---|
Product: | ACPI | Reporter: | Rafał Miłecki (zajec5) |
Component: | Power-Video | Assignee: | Zhang Rui (rui.zhang) |
Status: | CLOSED PATCH_ALREADY_AVAILABLE | ||
Severity: | high | CC: | acpi-bugzilla, glisse, lenb, mjg59-kernel, rjw, rui.zhang, stefan.bader |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 2.6.32 | Subsystem: | |
Regression: | Yes | Bisected commit-id: | |
Bug Depends on: | |||
Bug Blocks: | 7216, 14230 | ||
Attachments: |
Patch that workarounds resume issue
dmesg Patch that workadounds resume issue and adds more debugging dmesg with patch applied patch: init video->backlight->props.brightness before use it lspci -vvxxx ACPI: Restore backlight state at the end of resume ACPI / PM: Move video resume to a PM notifier |
Description
Rafał Miłecki
2010-01-20 23:15:14 UTC
so s2ram works well in 2.6.32 if you blacklist the acpi video driver, right? Zhang: even more, if I let it load and just remove it (rmmod) before s2ram, it works fine. Of course that commit didn't cause real regression in code, it just enabled code that doesn't work for my machine. Diff: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=3a1151e3f124fd1a2c54b8153f510f1a7c715369 Well, it mostly does as I can control backlight, just suspend&resume does not :) please attach the acpidump output of your laptop. please comment this line in driver/acpi/video.c .resume = acpi_video_resume, and see if it helps. @Zhang can you use acpi dump from bug #11682 comment #3 ? It's exactly the same machine, without any BIOS update or any hardware change. Created attachment 24701 [details]
Patch that workarounds resume issue
@Zhang you were right, acpi_video_resume is guilty one. This patch workarounds problem and add some (I believe that very interesting) debugging messages.
Created attachment 24702 [details]
dmesg
The most important part that comes from my debugging patch:
[ 154.681802] ACPI: set_brightness with brightness -268374449 should be called
[ 154.681803] ACPI: set_brightness with brightness 0 should be called
[ 154.681805] ACPI: set_brightness with brightness -268374449 should be called
[ 154.681806] ACPI: set_brightness with brightness -268374449 should be called
So we have 4 (four!) devices, 3 with uninitialized properties and left 1 wants backlight level 0... That's weird for me.
(In reply to comment #8) > Created an attachment (id=24702) [details] > dmesg > > The most important part that comes from my debugging patch: > [ 154.681802] ACPI: set_brightness with brightness -268374449 should be > called > [ 154.681803] ACPI: set_brightness with brightness 0 should be called > [ 154.681805] ACPI: set_brightness with brightness -268374449 should be > called > [ 154.681806] ACPI: set_brightness with brightness -268374449 should be > called > > So we have 4 (four!) devices, 3 with uninitialized properties and left 1 > wants > backlight level 0... That's weird for me. the 3 devices with uninitialized properties stands for CRT/TV/DVI devices that don't support backlight switching. we only need to take care of the LCD device which has backlight support. > [ 154.681803] ACPI: set_brightness with brightness 0 should be called 0 is also a weird value to me. can you please check your backlight values by "cat /sys/class/backlight/acpi_video0/actual_brightness", and check if the video driver wants to restore the same value during resume. Created attachment 24703 [details]
Patch that workadounds resume issue and adds more debugging
Created attachment 24704 [details]
dmesg with patch applied
The most important parts:
[ 29.819204] ACPI: [suspend] have video device without brightness
[ 29.819348] ACPI: [suspend] have video device with brightness (get:8 props:0)
[ 29.819352] ACPI: [suspend] have video device without brightness
[ 29.819356] ACPI: [suspend] have video device without brightness
...
[ 30.262803] ACPI: [resume] set_brightness should not be called
[ 30.262805] ACPI: [resume] set_brightness with brightness 0 should be called
[ 30.262806] ACPI: [resume] set_brightness should not be called
[ 30.262807] ACPI: [resume] set_brightness should not be called
@Zhang: ah sure, I dropper important condition in my first workaround, sorry. Instead of testing "cat..." I created patch that should give more info. Attached patch and dmesg. Do you know what we have brightness property not set correctly? I added more debugging messages and have following findings: On booting, acpi_video_device_lcd_set_level is called twice: acpi_video_device_lcd_set_level(f6775840, 100) acpi_video_device_lcd_set_level(f6775840, 100) Each time it loops from state 2 to state 10 (inclusive). Condition: > if (level == device->brightness->levels[state]) is true for state==10 only Condition: > if (device->backlight) is never true. And so we never set props.brightness at boot time. To understand reason you've to track acpi_video_device_find_cap. This function calls "acpi_video_init_brightness" which calls "acpi_video_device_lcd_set_level(device, max_level);" which does not change props.brightness (because we don't have device->backlight). It's later when function acpi_video_device_find_cap does: > device->backlight = backlight_device_register(name, > NULL, device, &acpi_backlight_ops); We should somehow modify this code to call acpi_video_device_lcd_set_level(device, max_level); AFTER registering backlight (class) device. I've tried following: echo 7 > /sys/class/backlight/acpi_video0/brightness && sleep 5s && s2ram It gave me following: Jan 25 12:39:30 linux-g0th kernel: [ 153.313254] ACPI handle has no context! (...) Jan 25 12:39:30 linux-g0th kernel: [ 153.570206] ACPI: [suspend] have video device without brightness Jan 25 12:39:30 linux-g0th kernel: [ 153.570346] ACPI: [suspend] have video device with brightness (get:7 props:7) Jan 25 12:39:30 linux-g0th kernel: [ 153.570350] ACPI: [suspend] have video device without brightness Jan 25 12:39:30 linux-g0th kernel: [ 153.570354] ACPI: [suspend] have video device without brightness [suspended here] Jan 25 12:39:30 linux-g0th kernel: [ 154.014039] ACPI: [resume] set_brightness should not be called Jan 25 12:39:30 linux-g0th kernel: [ 154.014041] ACPI: [resume] set_brightness with brightness 7 should be called Jan 25 12:39:30 linux-g0th kernel: [ 154.014042] ACPI: [resume] set_brightness should not be called Jan 25 12:39:30 linux-g0th kernel: [ 154.014043] ACPI: [resume] set_brightness should not be called but it still locks up... Should I care about context error? It's also visible in attached dmesg log. Created attachment 24711 [details]
patch: init video->backlight->props.brightness before use it
this is the patch that fixes the video->backlight->props.brightness uninitialized issue.
Thanks for find this issue! :)
so the problem is: system locks up if changing backlight during resume, right? please attach the output of "grep . /sys/class/backlight/acpi_video*/device/*". First-Bad-Commit : 3a1151e3f124fd1a2c54b8153f510f1a7c715369 Rafael before this patch backlight didn't worked right ? I am wondering if the issue is not with kms deinit the gpu and acpi trying to bang the gpu backlight register after resume before radeon kms resume the gpu. (In reply to comment #16) > so the problem is: > system locks up if changing backlight during resume, right? That's right. I had idea that maybe we lack some ACPI initialization on resume, decided to check it. I used my workaround to do not restore backlight on resume and after successful resuming I tried: # echo 5 > /sys/class/backlight/acpi_video0/brightness It worked fine. So this lock up happens when we change brightness while whole resume only. If we change brightness before suspending or after resume is done, it works fine. I suspect that maybe there is some other driver doing it's resume which affects us. Maybe it does something just before or just after video.c resume? Maybe we don't have enough delay between out resume and other driver resume? Can this be the case? (In reply to comment #17) > please attach the output of "grep . > /sys/class/backlight/acpi_video*/device/*". /sys/class/backlight/acpi_video0/device/hid:device /sys/class/backlight/acpi_video0/device/modalias:acpi:device: /sys/class/backlight/acpi_video0/device/path:\_SB_.PCI0.PEGP.VGA_.LCD_ /sys/class/backlight/acpi_video0/device/uevent:MODALIAS=acpi:device: (In reply to comment #19) > Rafael before this patch backlight didn't worked right ? I am wondering if > the > issue is not with kms deinit the gpu and acpi trying to bang the gpu > backlight > register after resume before radeon kms resume the gpu. Jérôme: I wrote "... I test s2ram from init 3...", but I agree I was not clear enough. I test suspend/resume in init 3, without radeon even loaded (blacklisted it for testing time). I don't have any FB (radeonfb) even compiled, I boot with "vga=normal". (In reply to comment #15) > Created an attachment (id=24711) [details] > patch: init video->backlight->props.brightness before use it > > this is the patch that fixes the video->backlight->props.brightness > uninitialized issue. > Thanks for find this issue! :) Tested-by: Rafał Miłecki <zajec5@gmail.com> But as I already wrote, having props.brightness properly filled doesn't fix real issue for me. Zhand's patch didn't apply cleanly to my git (drm-radeon-testing) so I updated Linus's tree, patched and tried again. Bug still exists in current torvalds/linux-2.6. I've also tried pm-suspend instead of s2ram, same result. Default params for my s2ram are: -p -m. I need them to have display resume working correctly (in vga=0, init 3, no radeon mode). So I though it may be conflict between s2ram which restores GPU mode and video which restores brightness. So I've used "s2ram -f". This caused my display didn't come back (expected, because I need -p -m for that) but Caps Lock didn't work as well. So I locked up again, even without restoring GPU mode by s2ram. No more ideas :| OK, I have one more idea. In case of my machine brightness in controlled by GPU (even using ACPI causes modification of GPU register). Maybe we try to restore brightness (by indirectly touching GPU register) before we wake up GPU? Or maybe before GPU has enough time to wake up? Could you drop "NEEDINFO" state? Are you using the external graphics or integrated Intel graphics? please attach the acpidump output. please attach the output of "cat /sys/class/backlight/acpi_video0/devices/path". (In reply to comment #27) > Are you using the external graphics or integrated Intel graphics? It's non-IGP GPU: ATI Mobility Radeon 34xx. You can find more details about how controlling brightness really happens in bug #11682 > please attach the acpidump output. You already asked for this in comment #5 and I replied in comment #6 . Is linked-attached dump somehow incorrect? > please attach the output of "cat > /sys/class/backlight/acpi_video0/devices/path". \_SB_.PCI0.PEGP.VGA_.LCD_ Reporter should really be allowed to change status to "ASSIGNED" :/ Zhang: do you think there may be something correct in my comment #26 ? Could we somehow make video resume happen after other resume operations? (In reply to comment #28) > (In reply to comment #27) > > Are you using the external graphics or integrated Intel graphics? > > It's non-IGP GPU: ATI Mobility Radeon 34xx. You can find more details about > how > controlling brightness really happens in bug #11682 > > > > please attach the acpidump output. > > You already asked for this in comment #5 and I replied in comment #6 . Is > linked-attached dump somehow incorrect? > Sorry, my bad, I mean lspci -vvxxx output. I'm wondering if we're poking the _BCM for Intel IGD while using an ATI graphics. If yes, this doesn't sound right. what do you think? Created attachment 24905 [details]
lspci -vvxxx
(In reply to comment #30) > I'm wondering if we're poking the _BCM for Intel IGD while using an ATI > graphics. > If yes, this doesn't sound right. what do you think? Sorry, but I have no understanding of how ACPI actually works :| Can not say if this is correct or not. I can test it and I see it working great with just exception for resume. As you can see in bug #11682 I had big problems with brightness on this machine. (In reply to comment #19) > Rafael before this patch backlight didn't worked right ? I am wondering if > the > issue is not with kms deinit the gpu and acpi trying to bang the gpu > backlight > register after resume before radeon kms resume the gpu. That sounds plausible. (In reply to comment #33) > (In reply to comment #19) > > Rafael before this patch backlight didn't worked right ? I am wondering if > the > > issue is not with kms deinit the gpu and acpi trying to bang the gpu > backlight > > register after resume before radeon kms resume the gpu. > > That sounds plausible. It does not. See comment #22 It doesn't matter in your test case, but in my opinion it may be an issue in general. I wonder what happens if you comment out acpi_video_resume() entirely? (In reply to comment #35) > I wonder what happens if you comment out acpi_video_resume() entirely? I *suspect* that: 1) Resume will work fine 2) Brightness level won't be resumed (it will be max) 3) Driver will report bad brightness level info, until changing it (without lock up). Let me recompile and confirm. Suspicions from comment #36 confirmed. Can I somehow change order of resume, to move ACPI resume happen later than currently it does? Maybe if we do ACPI resume after GPU resume, it will solve problem? Keep in mind, that after applying workaround (for example by commenting out acpi_video_resume()) I can resume *and* change brightness without problem. well, the devices resume order is the same as the order they are initialized. it means: 1. ACPI video device is initialized before the ATI graphics device. 2. the ACPI video driver changes the backlight levels for several times during initialization. so I'm wondering why the problem doesn't exist during system boot phase. Zhang: is this NEEDINFO for me? Not sure what else info I can provide :( If order is the same on boot and on resume, maybe that's some timing issue? Maybe in case of booting we are slower and there is some bigger delay between loading video and loading radeon? On Sunday 21 February 2010, Rafał Miłecki wrote:
> 2010/2/21 Rafael J. Wysocki <rjw@sisk.pl>:
> > This message has been generated automatically as a part of a report
> > of regressions introduced between 2.6.31 and 2.6.32.
> >
> > The following bug entry is on the current list of known regressions
> > introduced between 2.6.31 and 2.6.32. Please verify if it still should
> > be listed and let the tracking team know (either way).
> >
> >
> > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=15096
> > Subject : Resume lock up -- bisected, commit
> 3a1151e3f124fd1a2c54b8153f510f1a7c715369
> > Submitter : Rafał Miłecki <zajec5@gmail.com>
> > Date : 2010-01-20 23:15 (33 days old)
>
> Should be, ready for providing info if requested (current NEEDINFO
> probably incorrect).
(In reply to comment #37) > Suspicions from comment #36 confirmed. > > Can I somehow change order of resume, to move ACPI resume happen later than > currently it does? Maybe if we do ACPI resume after GPU resume, it will solve > problem? No, we only need to move the ACPI video driver's resume past the radeon resume. > Keep in mind, that after applying workaround (for example by commenting out > acpi_video_resume()) I can resume *and* change brightness without problem. Hmm. Perhaps we can call acpi_video_resume() directly from acpi_pm_end(). Can you try that yourself or do you want me to prepare a patch? Rafael: sorry, but I don't know relation between sleep.c and video.c. How can I call acpi_video_resume(struct acpi_device *device) from acpi_pm_end? Is that acpi_device something global, available in one instance only? Should I take it in sleep.c using acpi_bus_get_device? I've to remove acpi_driver.ops.resume to avoid lock up, so we probably have to make acpi_video_resume not static? Generally: can you help me with this? (In reply to comment #41) > (In reply to comment #37) > > Suspicions from comment #36 confirmed. > > > > Can I somehow change order of resume, to move ACPI resume happen later than > > currently it does? Maybe if we do ACPI resume after GPU resume, it will > solve > > problem? > > No, we only need to move the ACPI video driver's resume past the radeon > resume. > does this mean setting backlight in ACPI video driver before radeon driver loaded is also dangerous...? If you're reinitialising the GPU through userspace, there's no way that this can be expected to work - the gpu won't be reinitialised until after userspace has started. Does it work if you're using kms? (In reply to comment #43) > (In reply to comment #41) > > (In reply to comment #37) > > > Suspicions from comment #36 confirmed. > > > > > > Can I somehow change order of resume, to move ACPI resume happen later > than > > > currently it does? Maybe if we do ACPI resume after GPU resume, it will > > > solve problem? > > > > No, we only need to move the ACPI video driver's resume past the radeon > > resume. > > > does this mean setting backlight in ACPI video driver before radeon driver > loaded is also dangerous...? In principle it's not the right thing to do, because we have no idea what the AML involved in that is going to do. Created attachment 25276 [details]
ACPI: Restore backlight state at the end of resume
For completeness, this patch moves the ACPI video resume to the end of resume.
If the problem is reproducible with the Radeon KMS, please check if this patch helps.
Created attachment 25280 [details]
ACPI / PM: Move video resume to a PM notifier
Actually, scratch that.
Please check if this patch helps without the KMS.
(In reply to comment #47) > Created an attachment (id=25280) [details] > ACPI / PM: Move video resume to a PM notifier > > Actually, scratch that. > > Please check if this patch helps without the KMS. OK, now that's for sure. We should not try to touch video (ACPI) before GPU is initialized. With this patch applied it works great when using KMS. That's what happens on resume when we use KMS: 1) radeon KMS initializes GPU 2) end of resume, video (APCI) restores backlight (with Rafael's patch) and it works great! Now, resume without KMS 1) no kernel modules initializes GPU 2) video (ACPI) restores backlight 3) userspace (like pm-utils) initializes GPU as a result we touch GPU before initialization, it locks up and we do not even hit step 3. The solution would be to have info about KMS state (is is used?) or about GPU maybe (is is initialized?). We should restore backlight level when GPU is initialized only (like KMS case). Thanks a lot Rafael for your patch. Maybe I'll start KMS status thread on dri-devel. From ac7729da880e742613129ee6dea0045328670d2d Mon Sep 17 00:00:00 2001 From: Rafael J. Wysocki <rjw@sisk.pl> Date: Mon, 5 Apr 2010 01:43:51 +0200 Subject: [PATCH] ACPI / PM: Move ACPI video resume to a PM notifier shipped in linux-2.6.34-rc4 closed |