Subject : Re: Sysrq+B doesn't work on my box
Submitter : "Zdenek Kabelac" <zdenek.kabelac@gmail.com>
Date : 2008-08-01 20:25
References : http://marc.info/?l=linux-kernel&m=121762241105336&w=4

This entry is being used for tracking a regression from 2.6.26. Please don't close it until the problem is fixed in the mainline.
In response to Rafael's email asking to recheck the status of this bug: it still applies to my kernel built from commit 10fec20ef5eec1c91913baec1225400f0d02df40. Sysrq+B still ends in the int3 deadlock when the kvm modules are loaded.
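Since the hang is reported only with the kvm modules loaded, it may help to confirm that precondition before each test. A minimal sketch, assuming the usual /proc/modules format; the function name and the optional file argument are illustrative, not something from this thread:

```shell
#!/bin/sh
# Print the names of loaded modules that start with "kvm" (e.g. kvm, kvm_intel).
# Reads /proc/modules by default; a different file can be passed for testing.
# NOTE: loaded_kvm_modules is a hypothetical helper name.
loaded_kvm_modules() {
    awk '$1 ~ /^kvm/ { print $1 }' "${1:-/proc/modules}"
}
```

Empty output from loaded_kvm_modules would mean no kvm module is loaded, so Sysrq+B would be expected to behave normally.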
(copying mingo)

(context: sysrq-B with kvm-intel.ko loaded doesn't work. on my machine, it kills the sata interface, but the processor and network keeps working)

Strangely, the specs say:
* Avi Kivity <avi@qumranet.com> wrote:

> (copying mingo)
>
> (context: sysrq-B with kvm-intel.ko loaded doesn't work. on my machine,
> it kills the sata interface, but the processor and network keeps working)
>
> Strangely, the specs say:
>
>> • The INIT signal is blocked whenever a logical processor is in VMX
>> root operation.
>> It is not blocked in VMX non-root operation. Instead, INITs cause VM
>> exits (see
>> Section 21.3, “Other Causes of VM Exits”).
>
> So INIT (which is wired to the triple-fault processor output, it seems,
> rather than RESET) is blocked and the machine is not reset completely.
>
> So we need to disable vmx during native_machine_emergency_restart().
> There are at least three ways of doing this:
>
> - add a vmxoff sequence (with an exception handler) to
> native_machine_emergency_restart(). while simplest, this will not
> unblock INIT for other cpus
>
> - add an emergency_restart notifier_block, and have kvm subscribe. This
> has the disadvantage of being slightly complex, opening a tiny race
> (emergency restart during kvm module initialization), and requiring IPIs
> during emergency restart.
>
> - move vmxon/vmxoff management out of the kvm module and into x86 core.
> Bloats the core but reduces complexity. IPIs still required.
>
> I think the notifier block is the way to go. Ingo, let me know what you
> prefer.

notifier should be OK i think - sysrq-b is an emergency mechanism after
all.

btw., "echo b > /proc/sysrq-trigger" never worked reliably for me with
KVM also loaded.

Ingo
Simple workaround: boot with the 'reboot=a' kernel parameter. Proposed as a patch for 2.6.1[78].
btw, this isn't a recent regression. the problem has been present since 2.6.20.
Removed from the list, thanks.
Well, I have no good news: reboot=a doesn't solve my problem. The machine doesn't emergency-reboot with this flag. I forgot to mention in the initial post that I had already tried, I think, all of the reboot parameters. So for the T61 with kvm modules loaded, ACPI reboot does not fix the problem. Currently tested with kernel 1941246dd98089dd637f44d3bd4f6cc1c61aa9e4.
Created attachment 17477
disable vmx on reboot

Please test the attached patch. Watch out for not all processors coming back online after the reboot.
Handled-By : Avi Kivity <avi@qumranet.com>
There is some progress. Usually I first check whether the emergency reboot works in runlevel 1 with SysRq+SUB. With your attached patch the kernel finally reboots, with ACPI or with reboot=kbd.

The trouble is that if I start my usual runlevel 5, the emergency reboot again turns into a plain deadlock: I can see 'Resetting' written on the console with a blinking cursor, both with ACPI and KBD.

Hopefully I haven't made any mistakes during the tests, as I tried to double-check them, but I have no explanation for this behavior. Any idea what I should check as a potential source of trouble here? Maybe the vmx switch has to be applied to both CPUs?
Probably the difference was which cpu executed the reset. Try (from runlevel 3):

  taskset 1 emergency-reboot
  taskset 2 emergency-reboot

(where emergency-reboot is a script that does 'echo b > /proc/sysrq-trigger')
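For reference, taskset reads a bare numeric argument as a hexadecimal CPU bitmask, so mask 1 selects CPU 0 and mask 2 selects CPU 1. A minimal sketch of such a helper follows. The function names and the DRY_RUN switch are assumptions added here for illustration; DRY_RUN defaults to on so the command can be inspected without resetting the machine.

```shell
#!/bin/sh
# Sketch of the emergency-reboot helper. cpu_mask, reboot_on_cpu and DRY_RUN
# are hypothetical names, not part of the thread.

# Print the taskset hex mask that selects one CPU (CPU 0 -> 1, CPU 1 -> 2).
cpu_mask() {
    printf '%x\n' $((1 << $1))
}

# Trigger SysRq 'b' (immediate reboot, no sync or unmount) from the given CPU.
# With DRY_RUN=1 (the default) the command is only printed, not executed.
reboot_on_cpu() {
    mask=$(cpu_mask "$1")
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "taskset $mask sh -c 'echo b > /proc/sysrq-trigger'"
    else
        taskset "$mask" sh -c 'echo b > /proc/sysrq-trigger'
    fi
}
```

Running `DRY_RUN=0 reboot_on_cpu 0` as root would correspond to the 'taskset 1' case, and `DRY_RUN=0 reboot_on_cpu 1` to the 'taskset 2' case.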
Great, this finally made it work; now it really reboots. So hopefully it will now be possible to make a workable patch for this.
Which one really worked? taskset 1 or taskset 2?
Hmm, I thought I should run them both at the same time, so I actually put them both into a script file.
I've made an extra test, with each case double-checked. When I run only taskset 1 or only taskset 2, the reboot does not happen. There is a minor difference: with taskset 2 the machine still looks somewhat 'alive' for a while, i.e. I can switch consoles for some time. Only when I execute the shell script with both taskset commands does the reboot succeed.
Just a minor ping for this bug: anything new?
A fix is queued for 2.6.29.