Bug 37712 - [PATCH] reboot / "shutdown -r now" hangs ; works fine on 2.6.38.7-1
Status: RESOLVED OBSOLETE
Alias: None
Product: Platform Specific/Hardware
Classification: Unclassified
Component: i386
Hardware: All Linux
Importance: P1 normal
Assignee: platform_i386
URL:
Keywords:
Depends on:
Blocks: 32012
Reported: 2011-06-16 20:50 UTC by Dave Hooper
Modified: 2013-12-23 13:53 UTC

See Also:
Kernel Version: 2.6.39.1,3.0.3
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
kernel config (125.80 KB, text/plain)
2011-06-26 16:25 UTC, Dave Hooper
dmesg (32.74 KB, text/plain)
2011-06-26 16:26 UTC, Dave Hooper
dmesg (40.62 KB, text/plain)
2011-07-22 14:54 UTC, Henrik Hämäläinen
completely removes the quirk for fitpc reboot, since default (acpi) reboot works (569 bytes, patch)
2012-02-05 16:35 UTC, Dave Hooper

Description Dave Hooper 2011-06-16 20:50:51 UTC
Not sure where to file this, as I don't know which component handles "rebooting". My hardware uses the BIOS for reboot, but the BIOS section seems to be reserved for "BIOS bugs not fixed by Linux OS".

My symptoms seem to be exactly identical to the following (closed/resolved) bug:
https://bugzilla.kernel.org/show_bug.cgi?id=33302

shutdown -h works fine (and powers off).  shutdown -r hangs rather than rebooting.  Both work as expected on the previous kernel (I used 2.6.38.7-1).

Hardware is x86 (not x86_64).


Linux Xxxx 2.6.39-ARCH #1 SMP PREEMPT Tue Jun 7 05:49:02 UTC 2011 i686 Intel(R) Atom(TM) CPU Z510 @ 1.10GHz GenuineIntel GNU/Linux


Jun 16 21:51:32 Xxxx kernel: [    0.088715] CompuLab SBC-FITPC2 series board detected. Selecting BIOS-method for reboots.
Comment 1 Dave Hooper 2011-06-22 20:28:35 UTC
I have confirmed that I have the same issue on 2.6.39-1-vanilla (i.e. a kernel built from vanilla sources without ArchLinux patches).
Comment 2 Dave Hooper 2011-06-26 16:25:59 UTC
Created attachment 63552
kernel config
Comment 3 Dave Hooper 2011-06-26 16:26:32 UTC
Created attachment 63562
dmesg
Comment 4 H. Peter Anvin 2011-06-26 19:49:56 UTC
On 06/22/2011 01:28 PM, bugzilla-daemon@bugzilla.kernel.org wrote:
>
> --- Comment #1 from Dave Hooper<dave@beermex.com>   2011-06-22 20:28:35 ---
> I have confirmed I have the same issue on 2.6.39-1-vanilla (i.e. kernel build
> from vanilla sources without ArchLinux patches)
>

This bug seems to imply it is related to reboot=bios, although it 
doesn't actually give enough information to make that determination.

reboot=bios was believed to have been fixed in 2.6.39-rc6 so this is a 
bit of a surprise.  We need more system information.

One of the important things is to find out if reboot really *is* BIOS, 
since the default ordering has been changed.

	-hpa
Comment 5 Dave Hooper 2011-06-26 22:08:53 UTC
I attached my dmesg log earlier.  From my dmesg, I see the following (heavily edited for brevity):

[...]
[    0.000000] DMI present.
[    0.000000] DMI: CompuLab SBC-FITPC2/SBC-FITPC2, BIOS NAPA0001.86C.0000.D.1005121012 05/12/2010
[...]
[    0.000000] Base memory trampoline at [c008c000] 8c000 size 16384
[...]
[    0.085194] PM: Registering ACPI NVS region at 3f6bd000 (12288 bytes)
[    0.085325] CompuLab SBC-FITPC2 series board detected. Selecting BIOS-method for reboots.
[...]


Let me know what additional system information you need.
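
A minimal sketch of how the quirk line above could be picked out of a saved dmesg log automatically. The message wording is copied from this report; the function name `find_reboot_quirk` and the assumption that other boards or methods would follow the same "Selecting <method>-method for reboots" pattern are illustrative only.

```python
import re

# Wording taken from the dmesg line quoted in this report. Treating the
# method name as variable is an assumption made for illustration.
QUIRK_RE = re.compile(
    r"(?P<board>.+?) series board detected\. "
    r"Selecting (?P<method>\S+)-method for reboots\."
)

# Strips an optional syslog prefix plus the "[    0.085325] " timestamp.
PREFIX_RE = re.compile(r"^.*?\[\s*\d+\.\d+\]\s*")


def find_reboot_quirk(dmesg_text):
    """Return (board, method) for the first reboot-quirk line, or None."""
    for line in dmesg_text.splitlines():
        message = PREFIX_RE.sub("", line.strip())
        match = QUIRK_RE.search(message)
        if match:
            return match.group("board"), match.group("method")
    return None


sample = (
    "[    0.085325] CompuLab SBC-FITPC2 series board detected. "
    "Selecting BIOS-method for reboots."
)
print(find_reboot_quirk(sample))
```

A result of `None` would suggest that no reboot quirk of this form was logged, in which case the kernel's default reboot ordering applies.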
Comment 6 Dave Hooper 2011-06-29 21:17:27 UTC
The regression is still apparent in kernel 2.6.39.2
Comment 7 Dave Hooper 2011-07-04 20:48:41 UTC
(In reply to comment #4)
> On 06/22/2011 01:28 PM, bugzilla-daemon@bugzilla.kernel.org wrote:
> This bug seems to imply it is related to reboot=bios, although it 
> doesn't actually give enough information to make that determination.

Please indicate what additional information you need.
Comment 8 Dave Hooper 2011-07-19 22:08:08 UTC
The regression is still apparent in kernel 2.6.39.3

Can someone please state what additional information I need to provide, or what additional steps I need to perform, to help you investigate and fix this regression?  There doesn't seem to have been much response to this bug report.
Comment 9 Henrik Hämäläinen 2011-07-22 14:53:48 UTC
Confirming that I'm experiencing this problem as well on 2.6.39.3.

Linux XX 2.6.39-ARCH #1 SMP PREEMPT Sat Jul 9 15:31:04 CEST 2011 i686 Intel(R) Atom(TM) CPU Z550 @ 2.00GHz GenuineIntel GNU/Linux
Comment 10 Henrik Hämäläinen 2011-07-22 14:54:22 UTC
Created attachment 66372
dmesg
Comment 11 rztrzt 2011-07-29 13:03:58 UTC
Confirming that I'm experiencing this problem as well on 2.6.39.3-1 (2.6.39-ARCH #1 SMP PREEMPT)

This is on an HP nx6110 laptop and seems related to closed bug id 33302, https://bugzilla.kernel.org/show_bug.cgi?id=33302
I also know of another user with an HP nx8220 who has exactly the same issue.
Comment 12 rztrzt 2011-07-30 07:41:51 UTC
Just enabled the [testing] repo and upgraded to the linux 3.0-2 kernel and the problem is still present.
Comment 13 Dave Hooper 2011-08-28 13:43:47 UTC
I can also confirm that the problem is still present with the linux 3.0.3 kernel. My dmesg still states the following, suggesting that the problem is still due to the BIOS reboot code change:


[    0.088536] CompuLab SBC-FITPC2 series board detected. Selecting BIOS-method for reboots.
Comment 14 H. Peter Anvin 2011-08-28 19:00:25 UTC
Can you try the various reboot methods and see if any of them work at all?
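
A rough sketch of a checklist for that test, one boot per method. The method names are the ones accepted by the x86 `reboot=` boot parameter as far as I know for kernels of this era; treat the list as an assumption and check the kernel-parameters documentation shipped with the kernel under test. The helper names `boot_parameters` and `working_methods` are hypothetical. `efi` only makes sense on EFI systems.

```python
# Method names for the x86 reboot= boot parameter.
# Assumption: this list matches 2.6.39/3.x-era kernels; verify it against
# the kernel-parameters documentation for the kernel actually in use.
REBOOT_METHODS = ("acpi", "kbd", "triple", "pci", "efi", "bios")


def boot_parameters(methods=REBOOT_METHODS):
    """Return one reboot=<method> boot parameter per method to try."""
    return ["reboot=" + method for method in methods]


def working_methods(results):
    """Given {method: rebooted_cleanly}, list the methods that worked."""
    return [method for method, ok in results.items() if ok]


# Print the parameters to append to the kernel command line, one per boot.
for parameter in boot_parameters():
    print(parameter)
```

Each parameter would be appended to the kernel command line for a single boot, followed by "shutdown -r now", with the outcome noted by hand. A board-specific reboot quirk may override the parameter, as the next comment in this report describes.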
Comment 15 Dave Hooper 2012-02-05 16:28:17 UTC
Now that bugzilla is back up:
What I did not realise was that the reboot quirk prevented me from trying the other reboot methods via kernel parameters.

The fix seems to be to remove the quirk entirely for this device.  This appears to enable the default reboot method (=acpi), which works fine for this device.

What I don't understand is how/why the BIOS reboot method was broken by the kernel changes.  From my point of view, I don't care, since my device reboots fine with acpi.  But other folks may want a fix for the actual regression.

Would a kernel developer be able to commit the fix for SBC-FITPC2, i.e. remove the reboot quirk entirely, please?

Discussion at archlinux forum:
https://bbs.archlinux.org/viewtopic.php?id=124136&p=2
Comment 16 Dave Hooper 2012-02-05 16:35:21 UTC
Created attachment 72299
completely removes the quirk for fitpc reboot, since default (acpi) reboot works
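
To illustrate what removing the quirk changes, here is a simplified userspace model. It is not the kernel source. It assumes a quirk is a table entry whose strings are matched as substrings against the DMI vendor and product fields and which forces a reboot method, with "acpi" as the fallback default as stated in this thread. The table layout and the name `select_reboot_method` are hypothetical.

```python
# Default named in this thread for kernels after the ordering change.
DEFAULT_REBOOT_METHOD = "acpi"

# Hypothetical quirk table: substrings to match against the DMI fields,
# plus the reboot method the entry forces.
QUIRKS = [
    {
        "ident": "CompuLab SBC-FITPC2",
        "vendor": "CompuLab",
        "product": "SBC-FITPC2",
        "method": "bios",
    },
]


def select_reboot_method(vendor, product, quirks):
    """Return the method forced by the first matching quirk, else the default."""
    for quirk in quirks:
        if quirk["vendor"] in vendor and quirk["product"] in product:
            return quirk["method"]
    return DEFAULT_REBOOT_METHOD


# With the quirk present, the board is forced onto the BIOS method.
print(select_reboot_method("CompuLab", "SBC-FITPC2", QUIRKS))
# With the quirk removed (empty table), the default applies.
print(select_reboot_method("CompuLab", "SBC-FITPC2", []))
```

In this model, deleting the table entry is the whole fix: nothing else has to change for the board to fall back to the default method.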
Comment 17 Alan 2012-06-27 13:56:54 UTC
Needs to go with a Signed-off-by: line to <x86@kernel.org> to be applied.

(only just noticed this bug has a patch with it)

Alan
Comment 18 Dave Hooper 2012-06-30 09:44:29 UTC
Good to know that one year later people are still interested in fixing this kernel bug.
Comment 19 Dave Hooper 2012-06-30 09:46:39 UTC
Note that my patch addresses the issue on only ONE of the platforms mentioned above by people experiencing this issue.  So, given that at least two other users reported problems above, it probably makes sense to fix the general issue if possible, OR to modify my proposed patch to fix the other platforms that were broken at the same time as mine by those kernel changes.
Comment 20 Dave Hooper 2012-09-30 08:28:46 UTC
Could somebody clearly state what I must do next in order to get this applied? Many thanks.
Comment 21 Alan 2012-09-30 11:19:34 UTC
See Documentation/SubmittingPatches

Then email it, with a description and a Signed-off-by: line, to linux-kernel@vger.kernel.org, cc x86@kernel.org. Or, if you want, you can just add the Signed-off-by: here and I'll take it from there for you.
Comment 22 Dave Hooper 2012-09-30 17:54:54 UTC
Many thanks Alan.  I am happy to sign off on the removal of the existing broken workaround, if that's what you mean.

Signed-off-by: David Hooper <dave@beermex.com>

Description: Remove the quirk for the SBC FITPC.  It seems to have been required when the default was kbd reboot, but is no longer required now that the default is acpi reboot.  Furthermore, BIOS reboot no longer works for this board as of 2.6.39 or any of the 3.x kernels.
Comment 23 Dave Hooper 2012-09-30 17:56:06 UTC
(again, noting of course that my patch doesn't fix any other boards affected by the 2.6.39 regression, such as the two HP users mentioned above in the comments for this bug)
Comment 24 Alan 2012-10-02 14:01:58 UTC
Done and queued
Comment 25 Dave Hooper 2012-12-03 23:04:16 UTC
When do you think my patch will likely be committed?

Do you think the actual linux change that broke the bios reboot method above will be investigated at all (i.e. the linux change that introduced the reboot regression for this device and others)?  I ask especially because my patch only addresses the device I happen to have, and not the other devices mentioned in this thread.
Comment 26 Alan 2012-12-03 23:11:16 UTC
Umm, I thought it had been. I'll go double-check that it didn't get lost.
Comment 27 Dave Hooper 2012-12-04 21:10:50 UTC
Apologies, it does appear to have been committed, but hasn't yet been pulled into any kernel releases (at least according to the changelogs published at http://www.kernel.org/pub/linux/kernel/v3.0/ )

Any thoughts from anyone about investigating the cause of what actually broke, i.e. why bios reboot worked in 2.6.38 and didn't in 2.6.39 or any of the 3.x kernels?
Comment 28 H. Peter Anvin 2012-12-04 21:57:43 UTC
I strongly suspect that the thing that broke it on your box was change 3d35ac346e981162eeba391e496faceed4753e7b, which *should* have been fixed by change 7806a49ab625ebeb1709e5e87299b64932b807a7.

I'm not super-inclined to spend time tracking down a problem on this machine since the removal of the quirk is The Right Thing, but it might be worth keeping an eye on.
Comment 29 Dave Hooper 2012-12-04 22:32:52 UTC
Sure - I'm referring to any/all other machines that are still affected by the same bug and for which my patch is no use (there are two HP machines mentioned earlier by other users in the comments of just this bug report, for example).

Generally speaking, are kernel developers super-inclined to track down a problem that prevents an unknown number of different machines of different makes and models from rebooting? I keep asking if there is any more information that can be provided to help the developers who actually made the change investigate why their change caused a regression. Obviously there are diminishing returns as time moves on, as savvy users who discover that their machines hang silently instead of rebooting find this thread and end up just setting kernel reboot options instead.
Comment 30 H. Peter Anvin 2012-12-04 22:35:42 UTC
If there are machines which need a reboot option then we need to know about it, obviously, as we should put in (or remove) quirks as necessary.

Depending on the specifics of the machines we may also need to run tests; it might also be useful to do a "git bisect", although at this point it is probably not all that informative.

The thing about reboot=bios is that it is really not a very good reboot method, as it doesn't guarantee a full system reset, and it would be better if we could find other reboot methods that work for as many systems as possible.  I consider reboot=bios to be a last resort.
