I tested kernel 2.6.36 and found that hibernation failed while the process was attempting to suspend CPUs 1, 2 and 3. I verified that hibernation proceeds if CPUs 1, 2 and 3 are manually suspended before requesting that part of the hibernation (freezing) test. Tested 2.6.35: same result. The test works for 2.6.34, so the regression appears to be in linux-source-2.6.35. Here 'test' refers to the procedures set out in Documentation/power/basic-pm-debugging.txt in the kernel source.
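For anyone reproducing the report, the test procedure from that document can be sketched as below. This is a minimal sketch, not the reporter's actual script: the helper name `run_pm_test` and the `power_dir` parameter are illustrative assumptions, it needs root, and it assumes a kernel built with CONFIG_PM_DEBUG so that /sys/power/pm_test exists.

```python
from pathlib import Path

# Test levels accepted by /sys/power/pm_test, shallowest first.
PM_TEST_LEVELS = ("freezer", "devices", "platform", "processors", "core")


def run_pm_test(level, power_dir=Path("/sys/power")):
    """Arm one pm_test level, then request hibernation in test mode.

    Hypothetical helper: power_dir is a parameter only so the function
    can be exercised against a scratch directory instead of sysfs.
    Returns (ok, error_text).
    """
    if level not in PM_TEST_LEVELS:
        raise ValueError(f"unknown pm_test level: {level!r}")
    power_dir = Path(power_dir)
    try:
        # Same effect as: echo <level> > /sys/power/pm_test
        (power_dir / "pm_test").write_text(level)
        # Same effect as: echo disk > /sys/power/state
        (power_dir / "state").write_text("disk")
    except OSError as exc:
        # A failed attempt surfaces here, e.g. as an Input/output error.
        return False, str(exc)
    return True, ""
```

Calling `run_pm_test("processors")` corresponds to the step that fails in this report. Writing `none` to /sys/power/pm_test afterwards returns the system to normal hibernation behaviour.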
Are you able to switch the CPUs 1, 2, 3 on and off using the /sys/devices/system/cpu/cpu[1-3]/online interfaces?
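The check being asked for could be scripted roughly as follows. This is a sketch under assumptions: `set_cpu_online` and its `cpu_root` parameter are hypothetical names, and it must run as root on a kernel with CPU hotplug support.

```python
from pathlib import Path


def set_cpu_online(cpu, online, cpu_root=Path("/sys/devices/system/cpu")):
    """Write 1 or 0 to cpuN/online and confirm the value reads back.

    Hypothetical helper: cpu_root is a parameter only so the function
    can be tried against a scratch directory instead of sysfs.
    """
    wanted = "1" if online else "0"
    control = Path(cpu_root) / f"cpu{cpu}" / "online"
    # Same effect as: echo 0 > /sys/devices/system/cpu/cpuN/online
    control.write_text(wanted)
    return control.read_text().strip() == wanted
```

For example, `all(set_cpu_online(n, False) for n in (1, 2, 3))` takes the three secondary CPUs offline, and passing `True` brings them back.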
This is filed against "2.6.35-1" -- what is that? How is it related to 2.6.35.7 -- the current stable release of Linux? The description mentions 2.6.36, which is not yet released, so I assume you tested some 2.6.36-rc? You mentioned that 2.6.34 works. Does the latest version of 2.6.34.stable work? -- currently 2.6.34.7
Hi Len,

I'm running Debian testing/experimental on a 2x2 CPU Opteron machine. The source code packages tested are:

linux-source-2.6.34_2.6.34-1~experimental.2_all.deb
linux-source-2.6.35_2.6.35-1~experimental.3_all.deb
linux-source-2.6.36_2.6.36~rc5-1~experimental.1_all.deb

I will check the Debian package for 2.6.34 and test 2.6.34.7 during the weekend.
100%

Test results 9 October 2010, file tmp.pm_test.txt

linux-source-2.6.34_2.6.34-1~experimental.2_all.deb

Devices    = pm_test fail/pass 0/10
Platform   = pm_test fail/pass 0/10
Processors = pm_test fail/pass 2*/10 [1][2]
Core       = pm_test fail/pass 3/2 [3][4][5]

[1] Long delay around 112 sec, system came back apparently fully operational. On next test, instant error message, I *think* related to access to disks, then returned to the gnome screen and terminal with terminal message "bash: echo: write error: Input/output error". Reset (reboot) needed.
[2] Long delay around 112 sec, system came back apparently fully operational. Checked and found one mdadm raid5 disc degraded. Rebooted. Told mdadm to add the partition back into the array ... rebuilding etc.
* This may underestimate the number of failures. If in doubt about success, I rebooted the system. Even so, sometime in this process one of my mdadm partitions got kicked from its raid5 array.
[3] Long delay 120 sec, system restored screen, no keyboard or mouse activity, flashing caps lock. Reset (reboot).
[4] Long delay around 112 sec, system came back apparently fully operational. Checked and found one mdadm raid5 partition degraded. Rebooted. Told mdadm to add the partition back into the array ... rebuilding etc.
[5] Long delay around 120 sec, black screen, no keyboard, no mouse, reset (reboot). Checked and found two mdadm raid5 partitions degraded. Told mdadm to add the partitions back into the arrays ... rebuilding etc.

2.6.34.7

Processors = pm_test fail/pass 4/10 [1][2][3][4][5]
Devices    = pm_test fail/pass 0/10
Platform   = pm_test fail/pass 0/10

[1] Long delay around 60 sec, partial restore, keyboard active, then system locked up, reset (reboot) needed.
[2] Long delay around 60 sec, screen restored, locked up, reset (reboot) needed.
[3] Instant failure to hibernate, system operational, terminal message "bash: echo: write error: Input/output error".
[4] Long delay around 60 sec, black screen, locked up, flashing caps lock, reset (reboot) needed.
[5] 15 sec delay (normal) with message "ata1 SRST failed (errno=-16)", then lockup, reset (reboot) needed.

"Are you able to switch the CPUs 1, 2, 3 on and off using the /sys/devices/system/cpu/cpu[1-3]/online interfaces?"

Yes.

With CPUs 1, 2, 3 = 0 and running kernel 2.6.24.7:

Processors = pm_test fail/pass 0/10
On Monday, October 11, 2010, zz zzzzzzzzzzzzz wrote:
> Test results 9-11 October 2010, file tmp.pm_test.txt
>
> 2.6.35.1
>
> Processors = pm_test fail/pass 3/0 [1][2][3]
> Platform   = pm_test fail/pass 0/3
>
> [1][2][3] Hard lockup after about 12 secs. Caps lock flashing, black
> screen, no keyboard response. Reset (reboot).
Tested kernel 2.6.33.7 for SMP CPU hibernation: seems good. Tested kernel 2.6.34.1 for SMP CPU hibernation: some failures, some passes.
Thus it appears to be a regression from 2.6.33 rather than from 2.6.34.
Compiling POWERNOW_K8 into the Linux kernel, rather than building it as a module, triggers the bug. Testing SMP CPU hibernation with POWERNOW_K8 built in gives the following results:

Kernels 2.6.35 and 2.6.36: black screen, flashing caps lock, no response from the keyboard.

Kernel 2.6.34: failure rates were between 20% and 50%. In perhaps half of the failed tests, one or more software raid5 partitions had been marked as faulty and removed from its array; the system appeared to be able to operate normally, though degraded. The other failures gave a black screen and no response from the keyboard.

Kernel 2.6.33 was not tested with POWERNOW_K8 in the kernel.
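To tell which of the two builds a given kernel uses, its config file can be inspected. The sketch below assumes the option appears as CONFIG_X86_POWERNOW_K8, the Kconfig symbol for this driver; the function name `powernow_k8_mode` is a hypothetical helper, not part of any tool.

```python
def powernow_k8_mode(config_text):
    """Classify how POWERNOW_K8 is built, given the text of a kernel .config.

    Hypothetical helper. Returns "built-in" for =y, "module" for =m,
    and "disabled" when the option is unset or absent.
    """
    prefix = "CONFIG_X86_POWERNOW_K8="
    for line in config_text.splitlines():
        line = line.strip()
        # "# CONFIG_X86_POWERNOW_K8 is not set" does not match the prefix.
        if line.startswith(prefix):
            value = line[len(prefix):]
            return {"y": "built-in", "m": "module"}.get(value, value)
    return "disabled"
```

On Debian the config of the running kernel is normally installed as /boot/config-<version>, so the text of that file can be passed in directly. A result of "built-in" corresponds to the failing configuration described above.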
Is the problem still present in 2.6.37?
Bug closed as there is no response from the bug reporter. Please feel free to reopen it if the problem still exists in the latest upstream kernel.