Distribution: Fedora Linux 3
Hardware Environment: IBM ThinkPad X24
Software Environment: Fedora kernel RPM build "kernel-2.6.11-1.14_FC3"

Problem Description:
With ACPI enabled, I can put my ThinkPad X24 laptop to sleep (S3) and wake it back up again. However, if I use the very simple "echo 3 >/proc/acpi/sleep" technique to sleep, my root shell is closed by the time the laptop resumes. It appears that several newlines and an EOF are arriving on my pseudo-TTY from ... well ... from I have no idea where.

Steps to reproduce:
1. Log in to the text console as root. Or if you prefer, start a terminal window and "su" to root.
2. Type "echo 3 >/proc/acpi/sleep".
3. Wake the computer back up.

Actual Results:
If you had logged in to the text console as root, you are no longer logged in. You're simply back at the "login:" prompt. Furthermore, a NUL control character ("^@") is visible as though it had been typed in at the "login:" prompt.
If you had used "su" to become root, your root shell is no longer running. There are several blank lines below the "echo" command you typed, and then several blank prompts in the shell from which you typed "su". It appears as though the terminal window received (as input) a number of newlines with an EOF stuck in the middle somewhere.

Expected Results:
Sleeping and waking up should not generate any "phantom" input. The shell that did the write should come back in exactly the state we left it: still logged in, or still su'ed to root.
Created attachment 4974 [details] additional messages in "dmesg" after waking up
I've attached a record of the new messages in "dmesg" after waking up that were not there before. Perhaps there's a clue here as to the cause. Initially I was suspicious of the fact that this output includes a "sleeping function called from invalid context" warning containing syscall_call() which in turn calls sys_write(). Could this be the write system call being run by the "echo" command? Since "echo" is a shell built-in, if the kernel decided to kill the process doing the write, that would actually kill the entire shell. However, Warren Togami reports that he sees the same "sleeping function called from invalid context" on his ThinkPad T41 but that he does not see the shell-closing behavior I report here. So that may be a red herring. (See <https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=140257> for the record of that exchange.)
No, the oops message isn't related to your issue. I somewhat suspect the keyboard driver is broken. Can you try a different kernel, like 2.6.12-rc3?
Unfortunately I have since upgraded. I no longer have the ThinkPad X24 on which I originally reported the problem. I won't object if someone wants to mark this report as UNREPRODUCIBLE.
If someone can reproduce the issue, please reopen.
I am seeing the problem once again using Fedora Core 4 with the latest Fedora kernel (2.6.12-1.1398_FC4) on my ThinkPad X40. I do have a minor additional clue, though: The shell only closes when using "zsh" as root's shell. If I change to "sh" first, the shell does not close. However, the shell does react as though an additional newline had been typed. I.e.:

    sh-3.00# echo 3 >/proc/acpi/sleep
    sh-3.00#
    sh-3.00#

The second blank prompt there suggests that the shell thinks someone pressed <Enter> again after the echo command.

At least once while testing this, though, the shell got into a bad loop where it would wake up and almost immediately go back to sleep again. I could even see extra "echo 3 >/proc/acpi/sleep" commands at the shell prompt each time. So this time it was as though the shell were repeatedly seeing <Up> followed by <Enter>, where <Up> was bringing the echo command back from the shell's command history and <Enter> was running the command again. The only way I could break out of this sleep loop was to race against the system to close root's shell (<Control>-D) before the echo command could be received yet again.

Also, on just one occasion, when I pressed <Control> in order to start typing <Control>-D, the terminal window displayed its popup menu as though I'd pressed the right mouse button or perhaps some keycode which is mapped to <Menu> by the X server.

This is all consistent with the general idea that junk is appearing in the input stream. I guess different shells with different command line editing setups react differently to that junk.
Same thing if you echo mem >/sys/power/state? Same in 2.6.13? What if you put a command after it to collect the input, e.g.:

    # echo mem >/sys/power/state; cat > /tmp/input.log << EOF

Interesting that zsh and sh behave differently. Do you see input garbage if you suspend from a VGA console? This may be an issue in the input sub-system or in X.
Created attachment 5660 [details] raw input logging script
Len, I like your idea about logging the input after the "echo" command. Rather than using "cat", though, I wrote a small script that logs input in the most raw way possible, character by character, until stopped via SIGINT. I'm attaching that script to this report for future reference. So now I'm running the following command line:

    # echo 3 >/proc/acpi/sleep; ./logger

If I run this in a gnome-terminal under X, the logger script records a single carriage return (\r) appearing on stdin. If I run this under a VGA console, the logger script records no input at all.

If I run this under zsh, after I manually send SIGINT to the logger script, the zsh process terminates. This is true whether I'm using X or a VGA console. I haven't looked more closely into why it terminates, but this is an interesting question. Is it reading EOF from stdin? Is it being killed by some signal? These are all open questions. Perhaps I should attach a debugger to that zsh process, set a breakpoint in _exit(), and see if I can tell why it's getting there. Does that sound worth checking out?

If I run this under sh, the sh process remains alive and running normally after I manually send SIGINT to the logger script. Again, this is true under both X and a VGA console.

If I change the echo command to "echo mem >/sys/power/state", nothing changes. Behavior is identical in all cases.

So we've now split into two subquestions: (1) why is there an extra carriage return under X but not under a VGA console, and (2) why does zsh exit while sh keeps running? Curiouser and curiouser!
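For readers without access to the attachment, here is a minimal sketch of what a raw, character-by-character input logger could look like. It is not the attached script: the function name `log_raw_input` and the choice of Python are assumptions made purely for illustration. It turns off line buffering, echo, and CR/NL translation on the terminal, so a stray carriage return shows up as `b'\r'` instead of being silently converted to a newline, and it leaves signal generation on so that Ctrl-C (SIGINT) still stops it.

```python
import os
import termios


def log_raw_input(fd, out):
    """Log every byte read from fd to out, one repr() per line.

    Hypothetical helper, not the script attached to this report.
    Stops at EOF or on SIGINT (Ctrl-C) and returns the number of bytes logged.
    """
    saved = None
    if os.isatty(fd):
        saved = termios.tcgetattr(fd)
        attrs = termios.tcgetattr(fd)
        # iflag: do not translate or drop carriage returns and newlines
        attrs[0] &= ~(termios.ICRNL | termios.INLCR | termios.IGNCR)
        # lflag: no line buffering, no echo; ISIG stays on so Ctrl-C works
        attrs[3] &= ~(termios.ICANON | termios.ECHO)
        # return from read() as soon as a single byte is available
        attrs[6][termios.VMIN] = 1
        attrs[6][termios.VTIME] = 0
        termios.tcsetattr(fd, termios.TCSADRAIN, attrs)
    count = 0
    try:
        while True:
            byte = os.read(fd, 1)
            if not byte:  # EOF
                break
            out.write(f"{byte!r}\n")
            out.flush()
            count += 1
    except KeyboardInterrupt:
        pass  # SIGINT is the normal way to stop logging
    finally:
        if saved is not None:
            termios.tcsetattr(fd, termios.TCSADRAIN, saved)
    return count
```

Calling `log_raw_input(0, sys.stderr)` from a small wrapper script (with `import sys`) would log whatever arrives on stdin after the resume, in the same position that "./logger" occupies in the command line above.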
Since I have no idea how to pursue the spurious carriage return problem, I thought I'd look closer at why zsh exits while sh keeps running. Turns out zsh is getting SIGFPE in a call to difftime(). In the source code, difftime() is called as:

    difftime(time(NULL), lastwatch)

difftime() is a tiny function. Here's the complete disassembly:

    0x005a5b10 <__difftime+0>:  push   %ebp
    0x005a5b11 <__difftime+1>:  mov    %esp,%ebp
    0x005a5b13 <__difftime+3>:  fildl  0xc(%ebp)
    0x005a5b16 <__difftime+6>:  fisubrl 0x8(%ebp)
    0x005a5b19 <__difftime+9>:  pop    %ebp
    0x005a5b1a <__difftime+10>: ret

That's the whole function. The SIGFPE arises at the "fisubrl" instruction. Strangely, I can confirm that both arguments to difftime() look perfectly reasonable. For example:

    difftime(1124313430, 1124313394)

If I write a small program that directly calls difftime() with these two values, it returns the expected result, with no SIGFPE. So the failure of difftime() is due to some other environmental state, not just the two arguments. I don't know a lot about the floating point arithmetic environment on x86, but there must be *something* different to explain why that "fisubrl" instruction succeeds in one process and fails in another.
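The standalone check described above can be sketched roughly as follows. The original test program is not shown in this report and was presumably written in C; this version calls the C library's difftime() through Python's ctypes instead. The wrapper name `seconds_between` is made up for illustration, and the sketch assumes a POSIX system on which time_t has the same width as a C long.

```python
import ctypes

# dlopen(NULL): look up symbols in the C library already loaded into this process
libc = ctypes.CDLL(None)
libc.difftime.restype = ctypes.c_double
# Assumption: time_t is a C long on this platform
libc.difftime.argtypes = [ctypes.c_long, ctypes.c_long]


def seconds_between(later, earlier):
    """Call libc's difftime(later, earlier) and return the result as a float."""
    return libc.difftime(later, earlier)


# The two values seen in the failing zsh process
print(seconds_between(1124313430, 1124313394))
```

The two timestamps are 36 seconds apart, so in a process with a sane floating-point state the call simply returns that difference. That matches the observation that the arguments alone do not explain the SIGFPE.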
I logged in to let you know that I saw the same zsh exit vs bash extra characters symptom as you, but using 2.6.13-rc6 on my D600. Wow, a SIGFPE -- who would have thunk it?
Thanks, Len! It's good to know I'm not crazy. :-) Just to clarify, though, the "extra characters" symptom depends on running under X and is independent of zsh/bash. Either shell gave me extra characters under X; neither shell gave me extra characters under VGA. The zsh/bash distinction affects whether the shell exits due to SIGFPE (zsh) or keeps running without any obvious problems (bash). This symptom is independent of X/VGA.
Seems to be an X11 issue rather than a kernel one.
I did some tests. It seems the 'echo' command is a built-in command in zsh. If I write a simple program which does something like 'echo 3 >/proc/acpi/sleep' and invoke the program in zsh, zsh doesn't crash. If I run the echo command directly in zsh, zsh gets an FPE signal. The real cause is still under investigation.
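A rough sketch of such a standalone writer is shown below. The program actually used in the test is not included in this report, so the function name `request_sleep` and its parameters are assumptions for illustration only. The point is that the write to /proc/acpi/sleep happens in a separate child process rather than inside the zsh process itself, which is what the built-in echo does.

```python
def request_sleep(state="3", path="/proc/acpi/sleep"):
    """Write state plus a newline to path, like `echo 3 >/proc/acpi/sleep`.

    Hypothetical helper; the default path is the one used in this report.
    Returns the number of bytes written.
    """
    data = f"{state}\n".encode("ascii")
    # Unbuffered, so the data reaches the kernel in a single write() call
    with open(path, "wb", buffering=0) as f:
        return f.write(data)
```

Calling `request_sleep()` as root with the default path would suspend the machine, so for a dry run it is safer to point `path` at a scratch file first.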
Created attachment 6002 [details] patch to correctly restore FPU registers
*** Bug 3919 has been marked as a duplicate of this bug. ***
The patch has shipped in Linus's git tree. Closed.
Good news. Thanks for hunting this down, David!