Bug 13148
Summary: | resume after suspend-to-ram broken on Sony Vaio VGN-SR19VN when sony-laptop driver present | ||
---|---|---|---|
Product: | Drivers | Reporter: | fanderay (fanderay4) |
Component: | Platform | Assignee: | Rafael J. Wysocki (rjw) |
Status: | CLOSED UNREPRODUCIBLE | ||
Severity: | normal | CC: | florian, jnm11, krummas, lenb, loppituu, malattia, rjw, rui.zhang, saintiss, vyacheslavovich |
Priority: | P1 | ||
Hardware: | x86-64 | ||
OS: | Linux | ||
Kernel Version: | 2.6.30rc2 | Subsystem: | |
Regression: | Yes | Bisected commit-id: | |
Bug Depends on: | |||
Bug Blocks: | 7216, 12398, 56331 | ||
Attachments: |
dmesg output
lspci -vvv -xxx output
dsdt decode
resume failure bisection log
acpidump
customized DSDT: using fake _DOS
dmesg-acpi-debug
dmesg-acpi-debug after s2ram and resume
PCI: Clear saved_state after the state has been restored
ACPI: Always try to get control of the PCIe capability structure from the BIOS
ACPI / sony-laptop: Suspend late and resume early
do not initialize SNC
disable hot keys setup
do sony resume in pm notifier chain |
Description
fanderay
2009-04-22 14:39:30 UTC
2.6.29 didn't have this problem, did it?

re-assign to Mattia

This problem does also occur in 2.6.29.

Could you attach a boot log?

Sorry, also lspci and DSDT. Thanks

Created attachment 21094 [details]
dmesg output
Created attachment 21095 [details]
lspci -vvv -xxx output
Created attachment 21096 [details]
dsdt decode
Handled-By : Mattia Dongili <malattia@linux.it>

is CONFIG_SONY_LAPTOP=y important, or do you see the failure also with CONFIG_SONY_LAPTOP=m ?
(I assume you see no failure with CONFIG_SONY_LAPTOP=n)

Can you git-bisect what between 2.6.29 and the 2.6.30-rc caused the regression?

"m" behaves the same as "y" when the module is loaded. As mentioned above, this problem does occur also in 2.6.29 (but not in 2.6.26). I'm not really familiar with git, I'm afraid.

fanderay, can you also test 2.6.30-rc4 (there is some new code that might help) and 2.6.28 (to try to narrow down when the breakage started).
thanks

Hi Mattia,

The same problem still occurs in rc4. It does not occur in 2.6.28, so it seems to have broken between .28 and .29.

Really, not much happened on sony-laptop between .28 and .29:

$ git log v2.6.28..v2.6.29 -- drivers/misc/sony-laptop.c drivers/platform/x86/sony-laptop.c
commit d97c0defba25a959a990f6d4759f43075540832e
Merge: ec9f168 b4f9fe1
Author: Len Brown <len.brown@intel.com>
Date:   Fri Jan 9 04:01:26 2009 -0500

    Merge branch 'drivers-platform' into release

    Conflicts:
        drivers/misc/Kconfig

    Signed-off-by: Len Brown <len.brown@intel.com>

commit 30823736162ff91512965e3c730557e34fa71d6d
Author: Lin Ming <ming.m.lin@intel.com>
Date:   Tue Dec 16 16:59:35 2008 +0800

    ACPI: sony-laptop.c: call acpi_get_object_info to get node info

    Avoid using internal acpica structures acpi_namespace_node and
    acpi_operand_object
    Call acpi_get_object_info to get node ascii name and method arg count

    Signed-off-by: Lin Ming <ming.m.lin@intel.com>
    Signed-off-by: Len Brown <len.brown@intel.com>

commit 41b16dce390510f550a4d2b12b98e0258bbed6e2
Author: Len Brown <len.brown@intel.com>
Date:   Mon Dec 1 00:09:47 2008 -0500

    create drivers/platform/x86/ from drivers/misc/

    Move x86 platform specific drivers from drivers/misc/
    to a new home under drivers/platform/x86/.

    The community has been maintaining x86 vendor-specific
    platform specific drivers under /drivers/misc/ for a few years.
    The oldest ones started life under drivers/acpi.
    They moved out of drivers/acpi/ because they don't actually
    implement the ACPI specification, but either simply
    use ACPI, or implement vendor-specific ACPI extensions.

    In the future we anticipate...
    drivers/misc/ will go away.
    other architectures will create drivers/platform/<arch>

    Signed-off-by: Len Brown <len.brown@intel.com>
(END)

So the only real change that touched the driver was 30823736162ff91512965e3c730557e34fa71d6d which just modifies sony_walk_callback that is not even called on resume.

I learned how to use bisect. The first bad commit, according to this exercise, was:

----------------------------------------------------------------
commit b424e8d3b438e841cd1700f6433a100a5d611e4a
Merge: 7c7758f f6dc1e5
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Wed Jan 7 15:41:01 2009 -0800

    Merge branch 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci
----------------------------------------------------------------

The bisection process was however complicated by the presence of another suspend problem which appeared intermittently between .28 and .29. I'll call this Problem 1 since it seems to have appeared first chronologically, and the original resume problem Problem 2.

The symptoms of Problem 1 are a system hang during the suspend process, immediately after the line "Suspending console(s) (use no_console_suspend to debug)" is printed. The screen stays alive but in all other respects the system is hard-frozen, e.g. Caps Lock does not cycle the keyboard LED. The system never actually suspends and must be physically powered off.

The first thing I did was to bisect Problem 1. That yielded this first bad commit:

----------------------------------------------------------------
commit 9ea09af3bd3090e8349ca2899ca2011bd94cda85
Author: Heiko Carstens <heiko.carstens@de.ibm.com>
Date:   Mon Dec 22 12:36:30 2008 +0100

    stop_machine: introduce stop_machine_create/destroy.
----------------------------------------------------------------

After that I repeated the original bisection but this time marked Problem 1 points as good. That yielded the first bad commit for Problem 2 mentioned at the beginning.

It's interesting to note that at the second to last bisection point (7c7758f99d39d529a64d4f60d22129bbf2f16d74) both suspend and resume were successful. Then at the last bisection point (f6dc1e5e3d4b523e1616b43beddb04e4fb1d376a) Problem 1 reappeared. Finally at b424e8d3b438e841cd1700f6433a100a5d611e4a Problem 2 took over.

The bisection log is attached for reference.

Created attachment 21173 [details]
resume failure bisection log
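Note on the bisection above: the reporter marked Problem 1 commits as good. A possible alternative, sketched here and not something that was tried in this report, is to mark such commits with git bisect skip, since a commit that hangs during suspend says nothing about resume either way. The helper name bisect_verdict and the three outcome labels are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: translate what was observed on a test boot into the
# git-bisect subcommand to use for that commit.
#   ok            suspend and resume both worked             -> good
#   resume-hang   Problem 2: hard lock on resume             -> bad
#   suspend-hang  Problem 1: hang before suspend completed,
#                 so resume could not be tested at all       -> skip
bisect_verdict() {
    case "$1" in
        ok)           echo good ;;
        resume-hang)  echo bad ;;
        suspend-hang) echo skip ;;
        *)            echo "unknown outcome: $1" >&2; return 1 ;;
    esac
}

# Typical session in a kernel tree (commands shown only, not run here):
#   git bisect start v2.6.29 v2.6.28          # bad revision first, then good
#   ...build, boot, attempt suspend-to-ram...
#   git bisect "$(bisect_verdict suspend-hang)"
#   git bisect log > bisect.log
```

With skipped commits git may only be able to narrow the result down to a range rather than a single commit, so the outcome is best read as a hint.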
Not-Handled-By : Mattia Dongili <malattia@linux.it>
Notify-Also : Mattia Dongili <malattia@linux.it>

Fanderay, can you please check if the kernel where commit e8c331e963c58b83db24b7d0e39e8c07f687dbc6 is the head works for you correctly?

Hi Rafael,

The kernel with HEAD e8c331e... exhibits Problem 1, i.e. the suspend itself fails.

Thanks for testing!

Can you also test the kernel where the head is commit 9eff02e2042f96fb2aedd02e032eca1c5333d767, please?

HEAD 9eff02e... also yields a Problem 1 kernel.

I'm out of ideas. Could you carry out a bisection between 9eff02e and the last known good kernel (presumably 2.6.28) ?

Will that actually be of any help? It would seem that at best it would give more information about Problem 1, but that was fixed long ago anyway.

Ah, sorry, my bad. In that case the bisection isn't really going to help.

We'll need to figure out what happens to the sony-laptop driver during suspend in your box, then.

Mattia, do you have any ideas how to debug this?

Not really actually, I suspect some bad interaction with something else.
What I'd do though is commenting out parts of the resume function and see which bit causes the problem.

Fanderay,
There are four blocks in the sony_nc_resume function in 2.6.29, try to comment them out. All of them first and then uncomment them one by one from top to bottom.
If you build sony-laptop as module the process will be faster.

thanks
Mattia

Hi Mattia,

Good call. I did as you suggested and the results seem to point to this block as the problem:

    /* set the last requested brightness level */
    if (sony_backlight_device &&
        !sony_backlight_update_status(sony_backlight_device))
            printk(KERN_WARNING DRV_PFX "unable to restore brightness level\n");

Hah! and now why has this stopped working?
Let's ask on linux-acpi and see if anyone has any clue.

what if you comment these two lines? can the laptop come back?

Yes, without those lines the resume succeeds; with them it fails (tested again with 30rc6).
Zhang,
Those two lines have been there since the dawn of times, could it be that some other change in ec.c (?) had a bad effect on restoring the brightness?
Is it worth maybe bisecting limiting the scope to drivers/acpi ?

Thanks,
Mattia

please attach the acpidump output.

(In reply to comment #29)
> Is it worth maybe bisecting limiting the scope to drivers/acpi ?
>
sounds reasonable.
Fanderay, can you please git bisect drivers/acpi to see which commit introduces this regression?

Created attachment 21426 [details]
acpidump
The problem with the bisections is the interference from Problem 1 between .28 and .29. Since it's a suspend failure, it is impossible to tell for any bisection point that hits it whether resume is also failing. Can you think of any way around this?

Another note: unfortunately it appears that resume failures are caused by more than just the backlight restore. I'm currently running a 30rc6 kernel with the backlight lines removed; this seems to allow the resume to succeed when booting with init=/bin/bash, but today when I tried to resume under normal conditions (after suspend from vesafb console, X running on another vt) the resume produced a hardlock just as before. I haven't yet looked into this further to see whether X, fbcon, etc. make any difference - one thing at a time...

> if (sony_backlight_device &&
>     !sony_backlight_update_status(sony_backlight_device))
ACPI video backlight control methods are available on this laptop, which means that sony_backlight_device is NULL. So commenting the above two lines should have no effect...
please attach the output of "grep . /sys/class/backlight/*/*"
Plus, would you please change the above two lines to:
    if (sony_backlight_device)
        if (!sony_backlight_update_status(sony_backlight_device))
and see if this helps.
I have run into this kind of problem before.
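For anyone repeating the "grep . /sys/class/backlight/*/*" check requested above, a slightly more compact view can be had with a small shell function. This is only a sketch: the function name list_backlights and its optional directory argument are made up here (the argument exists so the function can be tried against a scratch directory). The device name it prints shows which driver registered the backlight, e.g. "sony" for the sony-laptop one.

```shell
#!/bin/sh
# Print one line per backlight device: "<name> <brightness>/<max_brightness>".
# The directory argument is an assumption added for testing; by default the
# real sysfs location is used.
list_backlights() {
    base="${1:-/sys/class/backlight}"
    for dev in "$base"/*; do
        # Skip anything without a readable brightness attribute
        # (this also covers an empty directory, where the glob stays literal).
        [ -r "$dev/brightness" ] || continue
        printf '%s %s/%s\n' "$(basename "$dev")" \
            "$(cat "$dev/brightness")" "$(cat "$dev/max_brightness")"
    done
}
```

Running list_backlights with no argument reads the live system; comparing its output before suspend and after a successful resume shows whether the brightness level was restored.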
% grep . /sys/class/backlight/*/*
/sys/class/backlight/sony/actual_brightness:6
/sys/class/backlight/sony/bl_power:0
/sys/class/backlight/sony/brightness:6
/sys/class/backlight/sony/max_brightness:7
%

In 30rc6 the lines are slightly different:

    if (sony_backlight_device &&
        sony_backlight_update_status(sony_backlight_device) < 0)
            printk(KERN_WARNING DRV_PFX "unable to restore brightness level\n");

If I change this to

    if (sony_backlight_device)
        if (sony_backlight_update_status(sony_backlight_device) < 0)
            printk(KERN_WARNING DRV_PFX "unable to restore brightness level\n");

then the resulting object file is byte-for-byte identical with the original one according to cmp(1), so I guess it doesn't make a difference...

There are two sets of ACPI video backlight control methods available.
One is for the integrated Intel graphics, which is not available on this laptop.
The other is for the external ATI graphics, but unfortunately the _DOS method is not implemented, which results in no ACPI backlight control.

So a simple solution is enabling the ACPI backlight control on this laptop instead of Sony platform specific methods, like the other sony laptops, so that we will not invoke sony_backlight_update_status any more.

But as you said, this is a regression, so we'd better root-cause the problem.
Could you please run git-bisect to find out which commit introduces this regression?

As I explained in Comments # 15 and 32, interference from another problem makes bisection unreliable. If you have a way around this then I'll try again.

I'm also concerned that removing the backlight_update_status call in sony_nc_resume is not solving the problem completely (Comment #32); is there a patch for "enabling the ACPI backlight control on this laptop instead of Sony platform specific methods" that I can try to see if it helps?

(In reply to comment #37)
> As I explained in Comments # 15 and 32, interference from another problem makes
> bisection unreliable. If you have a way around this then I'll try again.
>
sorry I forgot this.

> I'm also concerned that removing the backlight_update_status call in
> sony_nc_resume is not solving the problem completely (Comment #32); is there a
> patch for "enabling the ACPI backlight control on this laptop
> instead of Sony platform specific methods" that I can try to see if it helps?

no, there is no such kind of patch, the way to fix it is using a customized DSDT.
I'll attach the DSDT later.

Created attachment 21479 [details]
customized DSDT: using fake _DOS
hmm, can you make sure sony_backlight_update_status is also invoked with the same parameter in 2.6.28?

(In reply to comment #40)
> hmm, can you make sure sony_backlight_update_status is also invoked with the
> same parameter in 2.6.28?

Zhang, see comment #14, there has been almost no change to sony-laptop between .28 and .29, the only change is related to code that is run at load time when the driver lists the available methods in the SNC device definition.

Results with Zhang's custom DSDT:

1. Boot with init=/bin/bash: resume succeeds.
2. Boot normally: resume fails with a hardlock.

These are the same results as those obtained by commenting out the call to sony_backlight_update_status in sony_nc_resume.

Note: The "normal boot" case for me means booting to standard multiuser mode with no X and an ordinary VGA console (no framebuffer). This is a Debian system.

One minor additional piece of data about the failure case with the custom DSDT: after the hardlock occurs and the power is physically cycled, the system comes up to the BIOS boot screen with the LCD brightness set to 0. This doesn't happen in other cases.

could you please set CONFIG_ACPI_DEBUG,
rebuilt with the custom DSDT,
boot with init=/bin/bash, acpi.debug_layer=0xffffffff, acpi.debug_level=0x07
and attach the dmesg output after boot.

Created attachment 21532 [details]
dmesg-acpi-debug
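Before collecting a log like the one attached above, it may be worth confirming that the kernel was really booted with the two ACPI debug parameters from the request, since a typo on the boot line otherwise goes unnoticed. A minimal sketch follows; the function name has_acpi_debug_params and its optional string argument are made up here (the argument allows checking a boot line without rebooting), and the parameter values are simply the ones quoted in the request.

```shell
#!/bin/sh
# Succeed only if the given kernel command line (default: /proc/cmdline)
# contains both ACPI debug parameters as whole, space-separated words.
has_acpi_debug_params() {
    cmdline="${1:-$(cat /proc/cmdline)}"
    for want in acpi.debug_layer=0xffffffff acpi.debug_level=0x07; do
        case " $cmdline " in
            *" $want "*) ;;                              # present, keep checking
            *) echo "missing: $want" >&2; return 1 ;;
        esac
    done
}
```

For example, "has_acpi_debug_params && dmesg > dmesg-acpi-debug" only writes the log when both parameters are present on the running kernel's command line.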
On Monday 25 May 2009, Mattia Dongili wrote:
> On Sun, May 24, 2009 at 09:11:50PM +0200, Rafael J. Wysocki wrote:
> > This message has been generated automatically as a part of a report
> > of recent regressions.
> >
> > The following bug entry is on the current list of known regressions
> > from 2.6.29. Please verify if it still should be listed and let me know
> > (either way).
> >
> >
> > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=13148
> > Subject : resume after suspend-to-ram broken on Sony Vaio VGN-SR19VN when sony-laptop driver present
> > Submitter : fanderay <fanderay4@googlemail.com>
> > Date : 2009-04-22 14:39 (33 days old)
>
> sorry for not mentioning this before, but it looks like this regression
> was introduced between .28 and .29
(In reply to comment #43)
> could you please set CONFIG_ACPI_DEBUG,
> rebuilt with the custom DSDT,
> boot with init=/bin/bash, acpi.debug_layer=0xffffffff, acpi.debug_level=0x07
> and attach the dmesg output after boot.

oops, what I really want is the dmesg output after resume, can you reattach that info please?

Created attachment 21557 [details]
dmesg-acpi-debug after s2ram and resume
sorry, the DSDT in comment #39 is way off the target.

According to the comment #15, you don't have any suspend/resume problem before f6dc1e5e3d4b523e1616b43beddb04e4fb1d376a is applied, is that true?

I can't really say anything more beyond that comment and the attached bisection log. Also since all of those tests were carried out only with init=/bin/bash, even the "working" cases should be viewed with a grain of salt in light of Comment #42.

I just moved to .30.1 and something significant seems to have changed: now resume succeeds when booting with init=/bin/bash, although it still hardlocks on a normal boot ("normal" as described in Comment #42). Perhaps some module is causing problems.

Could it be some resume scriptlet in the suspend utilities from Debian?
Also, out of curiosity (I don't think it has been tried before), does booting with "acpi_backlight=vendor" help?

Thanks
--
mattia

Hi Mattia,

Moved to .30.4 and results are the same. It doesn't seem like it would be a script, since things appear to hardlock immediately before even hardware-level functions are restored (after pressing the resume button, the hard drive light comes on for a moment, then goes off, and the screen stays black, keyboard input is not recognized and caps lock does not even cycle the keyboard LED), but anyway I killed acpid which is the driver for such scripts and it didn't change anything. Nor does booting with acpi_backlight=vendor (wouldn't this mean to use the sony rather than generic ACPI backlight control? if so, that's what happens on this system anyway even without this option.)

(Er, just realized I was being daft in Comment #50 - I'd simply built with sony-laptop as a module, so naturally resume worked with init=/bin/bash since it wasn't loaded. The real state of things is still as summarized in Comment #42.)

Any progress on this? Do you need any testing done? I have a vgn-sr19vn

Hi Marcus,

I think it would be hugely useful if you have a SR19VN that you can help test with! The most basic question is, does resume after s2ram work for you on that system?

Hi, Mattia, what's the status of this bug?

(In reply to comment #55)
> Hi Marcus,
>
> I think it would be hugely useful if you have a SR19VN that you can help test
> with! The most basic question is, does resume after s2ram work for you on that
> system?

It is not working - blank screen and I have to hard reboot

*** Bug 13225 has been marked as a duplicate of this bug. ***

Hi,

Suspend to ram worked prior to 2.6.27.12 on my VGN-SR19vn laptop.

Git bisect tells me that 5f94bb6eda87dcd0136ceb52b62c03ebbb651443 is the first bad commit:

commit 5f94bb6eda87dcd0136ceb52b62c03ebbb651443
Author: Rafael J. Wysocki <rjw@sisk.pl>
Date:   Wed Jan 14 00:38:27 2009 +0100

    PCI: Suspend and resume PCI Express ports with interrupts disabled

    commit 90d25f246ddefbb743764f8d45ae97e545a6ee86 upstream

SELinux breaks sony-laptop-module in 2.6.27.11 kernel with Fedora 11.
I tested using the instructions from the description and I can test patches.

I'm not sure at all if this is the same bug and I wouldn't expect this particular commit to cause a problem.

I have an idea to test, but you'll need to test 2.6.31 before that. So, please try 2.6.31 and let me know if that works.

Hi,

2.6.31 didn't work.

Created attachment 23116 [details]
PCI: Clear saved_state after the state has been restored
Please try this patch on top of 2.6.31 and see if it changes anything.
(In reply to comment #62)
> Created an attachment (id=23116) [details]
> PCI: Clear saved_state after the state has been restored
>
> Please try this patch on top of 2.6.31 and see if it changes anything.

Nope, didn't work. No changes.

Created attachment 23134 [details]
ACPI: Always try to get control of the PCIe capability structure from the BIOS
Please try this patch too (preferably along with the previous one).
(In reply to comment #64)
> Created an attachment (id=23134) [details]
> ACPI: Always try to get control of the PCIe capability structure from the
> BIOS
>
> Please try this patch too (preferably along with the previous one).

I tried with and without #62 on 2.6.31. Didn't work.
I also tested both, #62 and #64, on 2.6.31.1 and there was no change.

Hmm. Mattia, what's Sony PIC?

Created attachment 23193 [details]
ACPI / sony-laptop: Suspend late and resume early
If it is what I think it is, it likely should be resumed before the PCIe ports.
Please try this patch (on top of 2.6.31) and report back (I couldn't test it).
(In reply to comment #67)
> Created an attachment (id=23193) [details]
> ACPI / sony-laptop: Suspend late and resume early
>
> If it is what I think it is, it likely should be resumed before the PCIe
> ports.
>
> Please try this patch (on top of 2.6.31) and report back (I couldn't test
> it).

Tested patch on top of clean 2.6.31 and didn't work.

Too bad.

I'm out of ideas for now. At least I can't figure out in what way sony-laptop may depend on PCIe root ports and vice versa. If anyone can explain that to me, I'll appreciate it very much.

Ari, can you please check if the problem is present in the kernel where mainline commit c70e0d9dfef3d826c8ae4f7544acc53887cb161d is the head (ie. download the Linus' git tree, do 'git checkout c70e0d9dfef3d826c8ae4f7544acc53887cb161d', compile and install the resulting kernel and see if resume works)?

(In reply to comment #70)
> Ari, can you please check if the problem is present in the kernel where
> mainline commit c70e0d9dfef3d826c8ae4f7544acc53887cb161d is the head (ie.
> download the Linus' git tree, do 'git checkout
> c70e0d9dfef3d826c8ae4f7544acc53887cb161d', compile and install the resulting
> kernel and see if resume works)?

Hi,

suspending halts/crashes somewhere between the "Suspending console(s)" text and shutting off the display; the caps lock LED doesn't work. I tested with and without the sony-laptop module.

So, I'm not sure the results of -stable bisection can be applied to the mainline. Let's go back to the top of the git, then.

Can you confirm that suspend/resume works with 2.6.32-rc3 without sony-laptop and doesn't work with sony-laptop?

(In reply to comment #72)
> So, I'm not sure the results of -stable bisection can be applied to the
> mainline. Let's go back to the top of the git, then.
>
> Can you confirm that suspend/resume works with 2.6.32-rc3 without sony-laptop
> and doesn't work with sony-laptop?

Tested and confirmed.

Thanks.

So, it looks like we need to focus on sony-laptop and find out why it causes resume to fail. I'm not really familiar with this driver, so I need some time to understand how it works.

Rafael, sony-laptop doesn't do much on resume, really.

My understanding was that resuming succeeds with init=/bin/bash (is the case for single user mode too?). Is it still true?

Could it be that it's not really the resume process that triggers the bug but some script poking at the sysfs files from sony-laptop?

I sent a couple of patches that are now in the acpi-test tree, they are on the linux-acpi list, this one specifically could help:
http://www.spinics.net/lists/linux-acpi/msg24620.html
Could you give it a try?

Thanks

(In reply to comment #75)
> Rafael, sony-laptop doesn't do much on resume, really.

I know, but apparently something it does affects suspend/resume. To be precise, it probably affects the BIOS which then triggers the issue.

Of course, it also is possible that one of the sony-laptop scripts does something wrong. To rule this out, can you please advise Ari how to disable those scripts without unloading the sony-laptop driver?

Ari, did you try hibernation on this box?

I don't know of any sony-laptop specific scripts, I was more thinking of the distribution specific resume callbacks that restore brightness or rfkill statuses.
But I just re-read the whole thread and it looks like booting with init=/bin/bash actually never succeeded.
What I was going to suggest if it was working was to resume and then poke at the files manually to see if anything there is causing the failure.

Rafael, apologies for missing your previous question at #66, Sony PIC is a PCI device that was used to control most of the special features on sony laptops. The SR series is not equipped with it so only the SNC (SNY5001 in the DSDT) part of the sony-laptop driver is active.

Something else we could start with is commenting out the SNC driver registration in sony-laptop (see attached patch) and see how that goes with a suspend/resume cycle... Sounds dumb but I have no real clues about what's wrong, maybe one of the initializations we do in sony_nc_function_setup needs to be undone when suspending.

Created attachment 23306 [details]
do not initialize SNC
(In reply to comment #76)
> (In reply to comment #75)
> > Rafael, sony-laptop doesn't do much on resume, really.
>
> I know, but apparently something it does affects suspend/resume. To be
> precise, it probably affects the BIOS which then triggers the issue.
>
> Of course, it also is possible that one of the sony-laptop scripts does
> something wrong. To rule this out, can you please advise Ari how to disable
> those scripts without unloading the sony-laptop driver?
>
> Ari, did you try hibernation on this box?

Hibernation works

(In reply to comment #78)
> Created an attachment (id=23306) [details]
> do not initialize SNC

I tested patch on top of clean 2.6.31.3 and suspend/resume to ram works, but there is no control of brightness.

(In reply to comment #75)
> Rafael, sony-laptop doesn't do much on resume, really.
>
> My understanding was that resuming succeeds with init=/bin/bash (is the case
> for single user mode too?). Is it still true?
>
> Could it be that it's not really the resume process that triggers the bug but
> some script poking at the sysfs files from sony-laptop?
>
> I sent a couple of patches that are now in the acpi-test tree, they are on the
> linux-acpi list, this one specifically could help:
> http://www.spinics.net/lists/linux-acpi/msg24620.html
> Could you give it a try?

Tested on top of clean 2.6.31.3 and resume didn't work.

(In reply to comment #80)
> (In reply to comment #78)
> > Created an attachment (id=23306) [details]
> > do not initialize SNC
>
> I tested patch on top of clean 2.6.31.3 and suspend/resume to ram works, but
> there is no control of brightness.

ok, this gives us a starting point. Could you revert the patch and try the one I am about to post?
Thanks

Created attachment 23384 [details]
disable hot keys setup
(In reply to comment #83)
> Created an attachment (id=23384) [details]
> disable hot keys setup

Resume didn't work.

Mattia,
what's the status of this bug?

Ari,
the problem still exists in the latest kernel, i.e. 2.6.32, right?

(In reply to comment #85)
> Mattia,
> what's the status of this bug?

Still open as far as I know. I started basically a binary search to figure out what is breaking suspend in sony-laptop but never had time to follow up. I'll try to do so after Ari confirms.

I have a Sony PCG-6122M (VAIO Z51X) running Fedora 12

Linux localhost.localdomain 2.6.31.6-166.fc12.x86_64 #1 SMP Wed Dec 9 10:46:22 EST 2009 x86_64 x86_64 x86_64 GNU/Linux

The suspend works fine if I unload the sony_laptop module but fails to shutdown if the module is loaded

(In reply to comment #85)
> Ari,
> the problem still exists in the latest kernel, i.e. 2.6.32, right?

ping ari...

please re-open it if the problem still exists in the latest git kernel.

Reopening this bug as the problem continues to exist as of 2.6.33.1.

Booting with init=/bin/bash and sony-laptop driver loaded, resume succeeds. Booting normally (multiuser state but without X), resume fails with a hard lockup.

Created attachment 25563 [details]
do sony resume in pm notifier chain
please apply this patch on top of 2.6.34-rc1 and see if it helps.
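When testing a patch like this, the result only means something if it is known whether sony-laptop was loaded at suspend time; one earlier init=/bin/bash success in this report turned out to be a boot where the module simply was not loaded. A minimal sketch for recording that follows. The function name module_loaded and its optional file argument are made up here (the argument lets the check run against a saved copy of /proc/modules), and a driver built in with CONFIG_SONY_LAPTOP=y will not show up this way.

```shell
#!/bin/sh
# Succeed if the named module appears in a /proc/modules-style listing.
# /proc/modules lists names with underscores, so "sony-laptop" is
# normalised to "sony_laptop" before matching the first column.
module_loaded() {
    name=$(printf '%s' "$1" | sed 's/-/_/g')
    file="${2:-/proc/modules}"
    awk -v m="$name" '$1 == m { found = 1 } END { exit !found }' "$file"
}
```

A test run could then start with something like "module_loaded sony-laptop && echo 'sony-laptop loaded'" before suspending, so that each report states which case was actually exercised.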
ping fanderay...

I've been running fedora 13 for a while now and the problem is the same. I've tried compiling the git kernel several times but my machine won't boot. I've spent a lot of time trying to do this but can't get a working .config. If anyone can give me a pointer on how to do this I'd be grateful. I've compiled plenty of kernels in the past and never had this problem.

Hi Zhang Rui, thanks for the patch. Unfortunately it does not change the result: resume fails with the same hard lockup. Following the failed resume, the system starts with the LCD backlight brightness level set to 0 and has to be manually corrected. [I believe this symptom is coincident with a change that was made some time ago that results in the ACPI rather than the platform-specific backlight driver being used on this system (i.e. /sys/class/backlight for some time now has acpi_video0 instead of sony).]

Just wanted to emphasize an important point about this problem in case it got lost in the noise:

Originally the hard lockup on resume occurred even when booting with init=/bin/bash provided the sony-laptop driver was loaded. As we saw in Comments #25 - #28, commenting out the backlight restore seemed to fix the problem *when booting with init=/bin/bash*. Also, using the ACPI rather than sony-specific backlight routines (which now happens by default on this system in current kernels) yields a successful resume *when booting with init=/bin/bash*.

However, on a normal system boot (but without X/fbcon), the resume always fails. I have never seen it succeed under these conditions, except possibly as far back as .26 or so.

It therefore seems possible that there are multiple causes for the resume failure, and that the sony backlight restore was just one of them (which is now "resolved" since the ACPI backlight control routines are now used instead). This raises the question of whether the current resume failure has anything to do with the sony-laptop driver at all; it may not.

This suggests a couple of other tests, like verifying whether resume still fails in current kernels even without the sony-laptop driver loaded. I don't have much time to do this kind of testing at the moment, so it would be helpful if others with this laptop can assist. It would also be good for the ACPI developers to start looking beyond sony-laptop for possible causes.

Still a problem in kernels newer than 2.6.36?

This issue appears to have been partially fixed in 2.6.35.x and fully fixed in 2.6.36.x. In .35 it did not hard-lock on resume but resume caused a continuous stream of radeon-related errors; it was possible to shut down afterward. In .36 resume seemed to work correctly for the first time. All my tests were with KMS enabled.

Ok, I'm closing it as unreproducible, the fix obviously made it already to the stable trees, so nothing to worry about.

Thanks for testing!