Bug 23962 - EeePC 1005HAG freezes momentarily when returning from sleep mode
Summary: EeePC 1005HAG freezes momentarily when returning from sleep mode
Status: RESOLVED INVALID
Alias: None
Product: Drivers
Classification: Unclassified
Component: USB (show other bugs)
Hardware: All Linux
: P1 normal
Assignee: Greg Kroah-Hartman
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-11-29 00:15 UTC by Maik Zumstrull
Modified: 2012-02-22 21:53 UTC (History)
1 user (show)

See Also:
Kernel Version: 2.6.36
Subsystem:
Regression: No
Bisected commit-id:


Attachments
strace of modem-manager (359.67 KB, text/plain)
2011-08-08 17:33 UTC, Maik Zumstrull
Details
sysrq trace (207.65 KB, text/plain)
2011-08-08 19:14 UTC, Maik Zumstrull
Details

Description Maik Zumstrull 2010-11-29 00:15:05 UTC
While debugging a different (but possibly related) problem, the netbook immediately waking up after entering sleep mode, which I traced to the built-in 3G modem, I noticed that on this kernel version, the system freezes for a few second four separate times after waking up from sleep mode. It appears to work fine after that.

I suspect that the issue is somehow related to USB reinitialization, because a USB device caused my original problem, the system freezes four times and there are four USB devices, and the usual suspect WiFi has been ruled out.

Tests were performed on Debian 686-bigmem binaries of their 2.6.36 tree. The OS is Debian squeeze on dm-crypt/LUKS btrfs root with the deadline elevator. I'm mentioning the latter because there is probably a small chance these are I/O freezes, though I don't think so. Advice on how to test this is appreciated.

See the previous exchange with the Debian maintainers at <http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=605275>. This has been split off the earlier bug, <http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=605246>.
Comment 1 Jonathan Nieder 2011-08-05 21:05:34 UTC
Hi Maik,

Thanks for reporting it. I'm not aware of any relevant recent upstream fixes, but it still seems like a good time to check: does the freezes still happen with version 3.0 (or more recent snapshots)? If so, could you send dmesg output from when it happens?
Comment 2 Maik Zumstrull 2011-08-05 21:21:56 UTC
Yes, it still happens with 3.0. And I know more specifically what causes it, I can even trigger it without going to sleep mode. It's modem-manager probing the 3G device. SIGHUP that process, four freezes coming right up. It doesn't cause messages in kern.log / dmesg. modem-manager says:

Aug  5 23:16:43 errai modem-manager[3845]: <info>  (ttyUSB0) opening serial port...
Aug  5 23:16:46 errai modem-manager[3845]: <info>  (ttyUSB0) closing serial port...
Aug  5 23:16:46 errai modem-manager[3845]: <info>  (ttyUSB0) serial port closed
Aug  5 23:16:46 errai modem-manager[3845]: <info>  (ttyUSB0) opening serial port...
Aug  5 23:16:46 errai modem-manager[3845]: <info>  (Huawei): GSM modem /sys/devices/pci0000:00/0000:00:1d.7/usb1/1-8 claimed port ttyUSB0
Aug  5 23:16:47 errai modem-manager[3845]: <info>  (ttyUSB0) closing serial port...
Aug  5 23:16:47 errai modem-manager[3845]: <info>  (ttyUSB0) serial port closed
Aug  5 23:16:49 errai modem-manager[3845]: <info>  (ttyUSB1) opening serial port...
Aug  5 23:17:09 errai modem-manager[3845]: <info>  (ttyUSB2) opening serial port...
Aug  5 23:17:40 errai modem-manager[3845]: <info>  (ttyUSB2) closing serial port...
Aug  5 23:17:45 errai modem-manager[3845]: <info>  (ttyUSB2) serial port closed
Aug  5 23:17:45 errai modem-manager[3845]: <info>  (Huawei): GSM modem /sys/devices/pci0000:00/0000:00:1d.7/usb1/1-8 claimed port ttyUSB2
Aug  5 23:17:53 errai modem-manager[3845]: <info>  (ttyUSB1) closing serial port...
Aug  5 23:17:58 errai modem-manager[3845]: <info>  (ttyUSB1) serial port closed
Aug  5 23:17:58 errai modem-manager[3845]: <info>  (ttyUSB1) opening serial port...
Aug  5 23:18:16 errai modem-manager[3845]: <info>  (ttyUSB1) closing serial port...
Aug  5 23:18:21 errai modem-manager[3845]: <info>  (ttyUSB1) serial port closed
Aug  5 23:18:21 errai modem-manager[3845]: <info>  (Huawei): GSM modem /sys/devices/pci0000:00/0000:00:1d.7/usb1/1-8 claimed port ttyUSB1
Comment 3 Jonathan Nieder 2011-08-08 15:51:01 UTC
bugzilla-daemon@bugzilla.kernel.org wrote:

> Yes, it still happens with 3.0. And I know more specifically what causes it,
> I
> can even trigger it without going to sleep mode. It's modem-manager probing
> the
> 3G device. SIGHUP that process, four freezes coming right up. It doesn't
> cause
> messages in kern.log / dmesg. modem-manager says:
>
> Aug  5 23:16:43 errai modem-manager[3845]: <info>  (ttyUSB0) opening serial
> port...
> Aug  5 23:16:46 errai modem-manager[3845]: <info>  (ttyUSB0) closing serial
> port...
[etc]

Perfect.  Could you use "strace -t -p" to learn what system calls the
hangs are in?
Comment 4 Maik Zumstrull 2011-08-08 17:33:05 UTC
Created attachment 68102 [details]
strace of modem-manager

I'll make it -t -T if you don't mind.

It looks like there are three kinds of calls that take suspiciously long:

19:20:08 open("/dev/ttyUSB1", O_RDWR|O_EXCL|O_NOCTTY|O_NONBLOCK) = 27 <9.997075>

19:20:18 ioctl(27, SNDCTL_TMR_START or TCSETS, {B57600 -opost -isig -icanon -echo ...}) = 0 <4.998198>

19:21:13 close(27)                      = 0 <4.998585>

Full log attached.
Comment 5 Jonathan Nieder 2011-08-08 18:01:10 UTC
bugzilla-daemon@bugzilla.kernel.org wrote:

> 19:20:08 open("/dev/ttyUSB1", O_RDWR|O_EXCL|O_NOCTTY|O_NONBLOCK) = 27
> <9.997075>

Yeah, I don't think that's supposed to take 10 seconds.  Does a
simple program like the following behave the same way?

By the way, what is the freeze like?  Can you toggle caps lock, move
the mouse, and so on?

-- >8 --
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#include <fcntl.h>

int main(void)
{
	int fd = open("/dev/ttyUSB1", O_RDWR | O_EXCL | O_NOCTTY | O_NONBLOCK);
	close(fd);
	return 0;
}
Comment 6 Maik Zumstrull 2011-08-08 18:18:53 UTC
The screen stops updating, cursors stop blinking. The mouse cursor cannot be moved. If I'm typing at the time, some of the letters will usually show up once the freeze is over.

Your testcase takes about 15s to run and freezes the system as described while running. According to strace, roughly 10s of this are open() and roughly 5s are close(), as with modem-manager.
Comment 7 Jonathan Nieder 2011-08-08 18:58:38 UTC
Maik Zumstrull wrote:

> The screen stops updating, cursors stop blinking. The mouse cursor cannot be
> moved. If I'm typing at the time, some of the letters will usually show up
> once
> the freeze is over.
>
> Your testcase takes about 15s to run and freezes the system as described
> while
> running. According to strace, roughly 10s of this are open() and roughly 5s
> are
> close(), as with modem-manager.

All right, one last test.  It would be nice to find out what the
kernel thinks it's doing when it is hanging.  So:

1. Permit dumps by running "echo 1 >/proc/sys/kernel/sysrq"
   (or "echo 446 >...").

2. Trigger the hang and then press 'L', 'W', and 'T' in sequence while
   holding Alt+SysRQ.  Or perhaps running a script like

	sleep 3
	echo l >/proc/sysrq-trigger

   in the background and then triggering the hang would work better
   --- whatever works.  Documentation/sysrq.txt has details.
Comment 8 Maik Zumstrull 2011-08-08 19:14:24 UTC
Created attachment 68122 [details]
sysrq trace

HTH, I can't read anything out of this.
Comment 9 Jonathan Nieder 2011-08-08 19:37:40 UTC
Maik Zumstrull wrote:

> SysRq : Show Blocked State
>   task                PC stack   pid father
> rs:main Q:Reg   D 00001000     0  1309      1 0x00000000
>  f4d47b60 00000082 00000000 00001000 00000000 00000000 f7667120 c1487980
>  f4d47d0c 00000000 c1487980 f4d47b60 c1487980 c1487980 f596f420 f7960120
>  c10e8493 00000b26 f596f420 0002aa3d 00000000 f7960120 c10e84c4 c10e1ac3
> Call Trace:
>  [<c10e8493>] ? block_write_end+0x50/0x58
>  [<c10e84c4>] ? generic_write_end+0x29/0x96
>  [<c10e1ac3>] ? __mark_inode_dirty+0x20/0x15a
>  [<c10e8526>] ? generic_write_end+0x8b/0x96
>  [<c12ad027>] ? __mutex_lock_common.clone.5+0xe2/0x131
>  [<c12acf37>] ? mutex_lock+0x17/0x25
>  [<c12aca2c>] ? _cond_resched+0x5/0x18
>  [<c12ada31>] ? tty_lock+0x29/0x31
>  [<c11c0553>] ? tty_open+0x52/0x3d1

Looks like open() is waiting for the 'big tty mutex' to be released.
Presumably fixing that would involve:

 1. Understanding what shared state tty_lock() protects in tty_open

 2. Finding finer-grained locks that protect just that, introducing
    them if necessary

 3. Acquiring the big tty mutex in functions that tty_open() calls,
    to avoid regressions

 4. Testing --- if the behavior is pleasant enough, remaining instances
    can be Somebody Else's Problem.

In other words, this is the usual and well known task of eliminating
a big lock[1]: push it down into subsystems and replace it with
subsystem locks, then split those and eliminate those.

I don't know if anyone is working on this.

[1] http://lwn.net/Articles/282319/
Comment 10 Greg Kroah-Hartman 2012-02-22 21:53:24 UTC
All USB bugs should be sent to the linux-usb@vger.kernel.org mailing 
list, and not entered into bugzilla.  Please bring this issue up there,
if it is still a problem in the latest kernel release.

Note You need to log in before you can comment on or make changes to this bug.