Created attachment 123941 [details]
logs and configs

This crash is also discussed here:
https://bugzilla.kernel.org/show_bug.cgi?id=69521
https://bugzilla.kernel.org/show_bug.cgi?id=69581
(for the i915 and saa7134 warnings)

[1.] One line summary of the problem:
Kernel 3.12.8 gives a warning during resume from S3 sleep

[2.] Full description of the problem/report:
It also happens with 3.12.7, 3.12.6, 3.12.9, 3.13 and with drm-intel-nightly. It doesn't happen with 2.6.24.7.

The OS is Debian Lenny with a vanilla kernel. The same thing happens after an upgrade to Squeeze.

I suspend the machine with s2ram and it powers off. During resume it prints that warning, then it seems to work normally, except for the serial redirection of the console.

I redirect the console to the serial port (with the lilo directive: append=" console=ttyS0 console=tty0") and I read the messages on another machine. This stops working as soon as I suspend: it starts sending mangled lines and does not work again until the next full restart. Only the serial redirection has this problem; the local console works.
Created attachment 123951 [details] logs 3.12.9 on Lenny
Created attachment 123961 [details] logs drm-intel-nightly on lenny
It still happens with 3.12.10.
It still happens with 3.13.2.
Is the scrambled serial console output a regression against some previously working kernel, and if so, what version was that? Please attach an excerpt of the "mangled lines" on the serial console.
Created attachment 125531 [details]
mangled text on serial console

As I wrote in the first message, it worked on 2.6.24.7.

1) I start minicom on the remote machine
2) I boot the local machine, and all messages are shown on the remote one
3) I suspend the local machine, and the remote one still receives messages during the suspend

As soon as I resume the local machine:
- On the local machine the oops text appears, followed by the regular messages (after the crash, the local console continues to work).
- On the remote machine mangled characters start to appear. It seems to mangle previous text as well, because inside the mess I can see pieces of the correct messages.
I tried with 2.6.34.15: the v4l crash is not present. To tell the truth, no crash happens at all. But the serial output is already mangled.
So it seems that the regression is somewhere between 2.6.24.7 and 2.6.34.15.
2.6.25 fails.
I'd like to try other 2.6.24.x releases (against my working 2.6.24.7), but I cannot find them on www.kernel.org.
Why do some versions have an "extraversion" and others not? For example, 2.6.24, 2.6.25 and 2.6.26 are there without extraversion, while 2.6.27 has extraversions up to 57.
(In reply to Valerio Vanni from comment #6)
> As I wrote in the first message, it worked on 2.6.24.7.

Sorry, missed that.

Please supply the output of 'setserial -a /dev/ttyS0' for 2.6.24.7 and verify that the output is the same for 2.6.25, both before and after suspend/resume (or the output if different).

The other "extraversion"s are in the stable git tree which you can browse here:
https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/refs/?id=refs/tags/v3.13.5

If the 'setserial' output provides no clue as to the cause of this regression, the only realistic solution is to perform a 'git bisect' to discover what change(s) caused the regression, which will help narrow down the actual problem (e.g., mis-identifying the UART, mis-programming the divisors, or something else).

I can provide some instructions on how to perform a bisect, if necessary.
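For readers unfamiliar with the idea behind 'git bisect', here is a minimal sketch in Python of the underlying binary search. It is not the git implementation, only an illustration: the commit labels and the `is_bad` predicate are hypothetical, and the sketch assumes a linear history in which the first entry is known good, the last is known bad, and the bug never disappears once introduced.

```python
def first_bad(commits, is_bad):
    """Return the first commit for which is_bad(commit) is True.

    Assumptions (not taken from the bug report):
    - commits is ordered oldest to newest,
    - commits[0] is known good and commits[-1] is known bad,
    - once the bug appears it stays in every later commit.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid      # bug already present: look earlier
        else:
            good = mid     # still fine: look later
    return commits[bad]


# Hypothetical history of eight commits; the bug is introduced in "c5".
commits = ["c0", "c1", "c2", "c3", "c4", "c5", "c6", "c7"]
culprit = first_bad(commits, lambda c: int(c[1:]) >= 5)
print(culprit)
```

In a real bisect, each call to `is_bad` corresponds to building and booting that kernel, doing a suspend/resume cycle, and checking the serial console by hand, which is why halving the range at each step matters.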
Today I tried many kernels, and I found that 2.6.24.7 was failing too. It looked like a partial failure, because it mangled only the resume events (the normal events that followed were printed correctly).

I went on testing and narrowed down the boundary between partial failure and total failure (where all events are mangled until the next shutdown): partial failure up to 3.2.0, total failure from 3.2.1.

Now it fails everywhere, and the failure is total. I don't understand the reason for this sudden change, but at the moment I can no longer call this a regression.

What could I look at? The serial state has always been the same with every kernel I tried, with partial or total failure, before or after suspension.

newton:~# setserial -a /dev/ttyS0
/dev/ttyS0, Line 0, UART: 16550A, Port: 0x03f8, IRQ: 4
        Baud_base: 115200, close_delay: 50, divisor: 0
        closing_wait: 3000
        Flags: spd_normal skip_test
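Comparing the 'setserial -a' output by eye across many kernels is error-prone, so a small helper can do the diff. The following is only a sketch: it assumes the output is split over lines as setserial normally prints it (one group of comma-separated "key: value" pairs per line), and the function names are made up for illustration. The "after" sample is hypothetical and simply drops the skip_test flag.

```python
def parse_setserial(text):
    """Parse 'setserial -a' output into a dict of key -> value strings.

    Assumes each line holds comma-separated "key: value" pairs; pieces
    without a colon (such as the device name and "Line 0") are skipped.
    """
    fields = {}
    for line in text.splitlines():
        for piece in line.split(","):
            if ":" in piece:
                key, value = piece.split(":", 1)
                fields[key.strip()] = value.strip()
    return fields


def changed_fields(before, after):
    """Return {key: (old, new)} for every field that differs."""
    keys = sorted(set(before) | set(after))
    return {k: (before.get(k), after.get(k))
            for k in keys if before.get(k) != after.get(k)}


# Output quoted in this report.
before = parse_setserial("""\
/dev/ttyS0, Line 0, UART: 16550A, Port: 0x03f8, IRQ: 4
        Baud_base: 115200, close_delay: 50, divisor: 0
        closing_wait: 3000
        Flags: spd_normal skip_test
""")

# Hypothetical second capture with the skip_test flag cleared.
after = parse_setserial("""\
/dev/ttyS0, Line 0, UART: 16550A, Port: 0x03f8, IRQ: 4
        Baud_base: 115200, close_delay: 50, divisor: 0
        closing_wait: 3000
        Flags: spd_normal
""")

print(changed_fields(before, after))
```

An empty result from `changed_fields` would mean the two captures report the same serial state, which is what was observed here before and after suspend.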
What is the output of:

setserial /dev/ttyS0 autoconfig ^skip_test

Did this problem begin with a distro upgrade?
Was the other machine changed?
The command itself gives no output, but if I look again with "setserial -" I find the same values without "skip_test".

No, the problem began with Lenny. I noticed it, together with two other warnings during resume from S3, after a kernel upgrade. The first kernel was 3.12.6, but later I tried all the following kernels as they became available. The i915 issue has been fixed in the development branch; the v4l one has still received no attention.

The distro upgrades have so far been only tests (and they didn't fix anything). All of today's tests were on the original Lenny restored from a disk image. I have disk images of the upgraded distro (I also did some clean installs), so I can do further tests (even dangerous ones).

The other machine (at the other end of the serial cable) was not changed. To avoid problems with stale state, I followed the same sequence for each reboot of this machine:
- Close minicom on the other machine
- Turn off this machine
- Open minicom on the other machine
- Turn on this machine
- Suspend this machine
- Resume this machine
- Trigger some kernel event to see whether it reaches the other machine
I just found this old message:
http://osdir.com/ml/linux.serial/2005-02/msg00000.html

I see a common point: if I write something to the serial port myself (while the output is mangled), then the kernel is able to write again as well. A "foo" is not enough; I have to "cat" a small file.
Created attachment 127721 [details]
serial console log (note: garbage characters have been converted to spaces loading the txt file in bugzilla)

I did a trial on a virtual machine. VMware Player is installed on a Windows XP host. Inside it I installed Debian Wheezy with the Debian kernel (Wheezy 3.2.0-4-686-pae #1 SMP Debian 3.2.54-2 i686 GNU/Linux).

I activated the serial console in lilo, and in VMware Player I redirected /dev/ttyS1 to a txt file on the host (in append mode).

The result is still the same: as soon as the machine is suspended to RAM, the serial output becomes mangled, and only a manual write is able to stop it.

1) Boot the machine: events are written to the txt file
2) Suspend the machine (not the "vmware suspend virtual machine", but a "suspend to ram" called from the Debian guest as if it were a physical machine): events are written up to "suspending console(s)".
3) Resume the machine: garbage is written to the txt file
4) Trigger some kernel event: garbage is written to the txt file
5) echo "ciao" > /dev/ttyS1: "ciao" is added at the end of the garbage
6) Trigger some event and shut down the machine: events are written to the txt file
To make it clear: the last test was on a totally different setup. The physical machine was neither of the usual two, but another machine. I think this rules out hardware problems.

Note: at the start I filed a bug under power management/hibernation-suspend, but they told me to open separate bug reports in the driver sections.
https://bugzilla.kernel.org/show_bug.cgi?id=69351

Could this issue belong to that section rather than to this one?
I plan to debug this issue later this week, so no need to file another bug in a different section. The information you've provided should be enough to discover the root cause -- thanks!
Is there any news on this bug?
I just tried on a test machine with 3.14-rc8 and also with linux-next to see if the issue had disappeared: no, it's still there.
It's still present in 3.14.1, 3.15-rc1 and in -next.
Is there any news on this bug?
If there's some other test I can do, I'll do it. But I've already tried on different hardware and distributions, always with the same result: the output is mangled after resume. I think this should be easy to reproduce.
It still fails on 3.14.4, 3.15-rc7 and today's -next.
The fix from Peter Hurley is this:
https://lkml.org/lkml/2014/7/9/357
commit ae84db9661cafc63d179e1d985a2c5b841ff0ac4 upstream

It's now in 3.16.2, 3.14.18, 3.12.28, 3.10.54, 3.2.63.
I'm closing the bug report.
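As a convenience, the list of stable releases above can be turned into a quick check of whether a given kernel version should contain the fix. This is a rough sketch, not an authoritative tool: the function name is made up, it only understands plain numeric versions such as "3.14.4" (not "-rc" or "-next" names), and for any series not listed above it returns None rather than guessing.

```python
# First stable release per series that carries the fix, as listed above.
FIXED_IN = {
    (3, 16): 2,
    (3, 14): 18,
    (3, 12): 28,
    (3, 10): 54,
    (3, 2): 63,
}


def has_fix(version):
    """Return True/False for a listed series, None if the series is unknown.

    Expects a plain numeric version string like "3.14.4" or "3.16".
    """
    parts = [int(p) for p in version.split(".")]
    series = tuple(parts[:2])
    patch = parts[2] if len(parts) > 2 else 0
    if series not in FIXED_IN:
        return None  # series not covered by the list in this report
    return patch >= FIXED_IN[series]


for v in ("3.14.4", "3.14.18", "3.12.10", "3.2.63", "3.13.2"):
    print(v, has_fix(v))
```

The None result for 3.13.2 reflects only that the 3.13 series is not in the list above; it says nothing about whether that series ever received the patch.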