s2disk worked in 2.6.28 and no longer works in 2.6.30.2 (s2disk here means: echo disk > /sys/power/state). The screen blanks and the system hangs without writing anything to disk. A hard restart is required afterwards. Bisection identifies the problem commit as 295f00042aaf6b553b5f37348f89bab463d4a469.
marked as regression, cc'ed s2d developers.
Caused by:

commit 295f00042aaf6b553b5f37348f89bab463d4a469
Author: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Date: Fri Jan 2 16:12:48 2009 +0100

    ide: don't execute the next queued command from the hard-IRQ context (v2)

First-Bad-Commit : 295f00042aaf6b553b5f37348f89bab463d4a469
On Saturday 01 August 2009 12:52:08 bugzilla-daemon@bugzilla.kernel.org wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=13886
>
> --- Comment #2 from Rafael J. Wysocki <rjw@sisk.pl> 2009-08-01 10:52:07 ---
> Caused by:
>
> commit 295f00042aaf6b553b5f37348f89bab463d4a469
> Author: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
> Date: Fri Jan 2 16:12:48 2009 +0100
>
>     ide: don't execute the next queued command from the hard-IRQ context (v2)
>
> First-Bad-Commit : 295f00042aaf6b553b5f37348f89bab463d4a469

Hi Rafal,

We have been through one similar case already, please see:

http://bugzilla.kernel.org/show_bug.cgi?id=13371

s2d bisection will always point to this commit because it really broke s2d for the "whole" 12 days during early -rc phase.

The fix:

commit 2ea5521022ac8f4f528dcbae02668e02a3501a5a
Author: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Date: Wed Jan 14 19:19:04 2009 +0100

    ide: fix suspend regression

Last time I've spent like a week working with the reporter to find the real bugger (in e100 networking driver, it stays unfixed after over 2 months). I'm not going to do it again.

If the real bugger turns out to be my patch or patch passed through me then please ping me again, till then I'm off cc:.

Also a small reminder -- I'm no longer maintainer of drivers/ide/ so please pass everything through Dave first as he should be able to verify whether the issue was fixed already, i.e.: there are still some outstanding bugfixes from June needed for -stable:

http://marc.info/?l=linux-ide&m=124910557313722&w=2

[ yes, those are fixes for a Sparc Ultra 10 specific problems reported on one sunny Saturday's evening and fully debugged/fixed after 2 days.. ]
The fix mentioned above (commit 2ea5521022ac8f4f528dcbae02668e02a3501a5a) does not address the problem I am encountering (which is present in 2.6.30.2). There is no e100 hardware present here. I am more than willing to help with the debugging process from this end. Might we test by reverting his commit on 2.6.30.2?
(In reply to comment #4)
> The fix mentioned above (commit 2ea5521022ac8f4f528dcbae02668e02a3501a5a)
> does not address the problem I am encountering (which is present in 2.6.30.2).

Bart's point is that the problem introduced by commit 295f00042aaf6b553b5f37348f89bab463d4a469 was fixed by commit 2ea5521022ac8f4f528dcbae02668e02a3501a5a, so if you have a bisection point between the two commits, the bisection may lead to commit 295f00042aaf6b553b5f37348f89bab463d4a469 instead of the real culprit.

> There is no e100 hardware present here.

How is e100 relevant here?

> I am more than willing to help with the debugging process from this end.
>
> Might we test by reverting his commit on 2.6.30.2?

That might be difficult, because I guess commit 295f00042aaf6b553b5f37348f89bab463d4a469 doesn't revert cleanly.

Could you instead test the kernel where commit 2ea5521022ac8f4f528dcbae02668e02a3501a5a is the head and see if the problem is present there?

[Bart, I know you're not maintaining IDE any more, but your commit was pointed out by the bisection. Thanks for the information that it was a false positive.]

First-Bad-Commit: unknown
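[Illustration of the false-positive argument above. This is a toy Python model, not kernel or git code. It assumes a linear history, and the commit positions 6, 10, 13 and 16 are invented for the example; they are not real positions in the kernel tree. The search only mimics the basic idea of git bisect, which assumes a single good-to-bad transition.]

```python
# Toy model: commits are numbered 0..HEAD on a linear history.
# All positions below are hypothetical, chosen only for illustration.
BREAK = 6    # stands in for 295f0004... (temporarily broke s2disk)
FIX = 10     # stands in for 2ea55210... (fixed that breakage)
REAL = 13    # stands in for the unrelated change that breaks s2disk later
HEAD = 16    # stands in for the failing kernel being reported


def is_bad(commit):
    """s2disk fails inside the early broken window, or after the real culprit."""
    return BREAK <= commit < FIX or commit >= REAL


def first_bad(good, bad):
    """Binary search between a known-good and a known-bad commit."""
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(mid):
            bad = mid
        else:
            good = mid
    return bad


# Starting from the old good kernel, a midpoint falls into the broken
# window, so the search converges on the commit that opened the window.
false_positive = first_bad(0, HEAD)

# Starting from the fix as the known-good point skips the window entirely.
real_culprit = first_bad(FIX, HEAD)

print(false_positive, real_culprit)
```

In this model the first search reports the BREAK commit even though it was already fixed at FIX, which is why testing with the fix commit as head, and then bisecting from there, is the useful next step.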
By way of an update:

(1) You were right, git zeroed in on a false positive. Compiling with 2ea5521022ac8f4f528dcbae02668e02a3501a5a as HEAD fixed s2disk. I apologize for not having understood what was meant the first time around.

(2) Bisecting between that and 2.6.30 narrowed down the "problem" commit to: 2f0d0fd2a605666d38e290c5c0d2907484352dc4. This is not really a problem (see below).

(3) Between 2.6.26 and 2.6.28 things broke for me (CPU ran about 17 degrees hotter while idle). The fix was noapic, nolapic, and pci=noacpi. This issue was resolved between 2.6.28 and 2.6.30.2 but pci=noacpi was needed to prevent IRQ problems (because IOAPIC support had been removed from my kernel). It turns out pci=noacpi, 2f0d0fd2a605666d38e290c5c0d2907484352dc4, and s2disk don't get along.

(4) I am happy to confirm that 2.6.30.4 works great with the re-insertion of IOAPIC uniprocessor support in the kernel and removing all APIC and ACPI boot parameters.

Sorry for the false alarm.

~Andy
Thanks for the update, closing.