|Summary:||2.6.37-rc1: Disk takes 10 seconds to resume - MacBook2,1|
|Component:||Serial ATA||Assignee:||Rafael J. Wysocki (rjw)|
|Severity:||normal||CC:||acpi-bugzilla, devnull, error27, florian, rjw, stern, tj|
|Bug Depends on:|
|Bug Blocks:||7216, 21782|
Description Tobias 2010-11-10 19:33:18 UTC
New with 2.6.37-rc1 (from the ubuntu ppa): [108572.434303] ------------[ cut here ]------------ [108572.434311] WARNING: at /home/kernel-ppa/COD/linux/kernel/power/suspend_test.c:53 suspend_test_finish+0x89/0x90() [108572.434313] Hardware name: MacBook2,1 [108572.434315] Component: resume devices, time: 17260 [108572.434316] Modules linked in: nfs lockd fscache nfs_acl auth_rpcgss sunrpc nls_utf8 isofs nls_iso8859_1 nls_cp437 vfat fat usb_storage binfmt_misc rfcomm sco bnep l2cap parport_pc ppdev arc4 snd_hda_codec_idt ath9k snd_hda_intel mac80211 snd_hda_codec snd_hwdep snd_pcm snd_seq_midi snd_rawmidi ath9k_common ath9k_hw snd_seq_midi_event snd_seq snd_timer ath snd_seq_device applesmc cfg80211 snd soundcore joydev led_class btusb bluetooth snd_page_alloc isight_firmware input_polldev appletouch lp mbp_nvidia_bl parport sha256_generic cryptd aes_x86_64 aes_generic hid_apple dm_crypt usbhid hid i915 drm_kms_helper drm i2c_algo_bit firewire_ohci sky2 firewire_core intel_agp intel_gtt video crc_itu_t output [108572.434369] Pid: 15747, comm: pm-suspend Tainted: G W 2.6.37-020637rc1-generic #201011020905 [108572.434371] Call Trace: [108572.434376] [<ffffffff810a44c9>] ? suspend_test_finish+0x89/0x90 [108572.434382] [<ffffffff810646e0>] warn_slowpath_common+0x90/0xc0 [108572.434385] [<ffffffff8106480e>] warn_slowpath_fmt+0x6e/0x70 [108572.434390] [<ffffffff81038649>] ? default_spin_lock_flags+0x9/0x10 [108572.434394] [<ffffffff812cb607>] ? kobject_put+0x27/0x60 [108572.434398] [<ffffffff81399ae7>] ? put_device+0x17/0x20 [108572.434402] [<ffffffff813a3b68>] ? dpm_complete+0x128/0x140 [108572.434406] [<ffffffff810a44c9>] suspend_test_finish+0x89/0x90 [108572.434409] [<ffffffff810a4282>] suspend_devices_and_enter+0xc2/0x100 [108572.434412] [<ffffffff810a43a1>] enter_state+0xe1/0xf0 [108572.434416] [<ffffffff810a3804>] state_store+0xb4/0xf0 [108572.434419] [<ffffffff812cb3a7>] kobj_attr_store+0x17/0x20 [108572.434423] [<ffffffff811ca1a2>] flush_write_buffer+0x62/0x90 [108572.434426] [<ffffffff811ca2e6>] sysfs_write_file+0x66/0xa0 [108572.434431] [<ffffffff8115e76c>] vfs_write+0xcc/0x190 [108572.434434] [<ffffffff8115f195>] sys_write+0x55/0x90 [108572.434437] [<ffffffff8100c132>] system_call_fastpath+0x16/0x1b [108572.434440] ---[ end trace bdcb9cd79c701bb3 ]---
Comment 1 Rafael J. Wysocki 2010-11-10 20:18:15 UTC
Apparently, the resume of devices took too much time. Is this reproducible?
Comment 2 Dan Carpenter 2010-11-14 19:23:15 UTC
I can reproduce it on my eee. How do I debug this?
Comment 3 Rafael J. Wysocki 2010-11-14 23:18:04 UTC
Please try to boot with initcall_debug in the kernel command line and see what devices take long time to resume.
Comment 4 Dan Carpenter 2010-11-18 20:06:10 UTC
It's my harddrive that takes 10 seconds. [ 1.565105] scsi 0:0:0:0: Direct-Access ATA ST9160310AS 0303 PQ: 0 ANSI: 5 [ 69.523147] call 0:0:0:0+ returned 0 after 10428995 usecs I'll look into it some more, but I doubt there is anything that will speed it up. I suspect this warning may lead to a lot of bogus bug reports where we just have to tell people their hardware sucks.
Comment 5 Rafael J. Wysocki 2010-11-18 21:57:10 UTC
Well, there have not been too many of them and I wouldn't like to overlook a valid one. This particular one may indicate a driver problem as well, so I'm reassigning to Storage/SATA.
Comment 6 Dan Carpenter 2010-11-19 08:15:52 UTC
It get's delayed in ata_wait_ready(). That function has a printk "link is slow to respond, please be patient" if you go over 5 seconds. I added a printk to see exactly how long it was taking. It first waits for 6 seconds and triggers the message and it waits for 4 seconds which is where my 10 second delay comes from. This doesn't look like a kernel bug to me. Tobias, you may be experiencing a different bug. I sort of hijacked your bug report. Could you boot with initcall_debug and post your dmesg?
Comment 7 Tejun Heo 2010-11-19 13:25:22 UTC
Yeah, around 10 seconds is pretty normal duration for disk spinup. Nothing wrong with Dan's case. Tobias?
Comment 9 Tobias 2010-11-22 20:05:10 UTC
Sorry for the delay, I was on an internet-less vacation. I have not been able to reproduce this. If it was somewhat storage-related, could a slow nfsmount or a cdrom drive that is starting to fail trigger this? I have now switched to rc3 and will see what happens. Kernel command line twisting is not my strong point, but from my noobish look at the attached dmesg initcall_debug seems to work. I'll report back if this happens again. Please tell me if there is more I can do.
Comment 10 Dan Carpenter 2010-11-23 05:29:45 UTC
We need a suspend resume cycle in your dmesg. :P
Comment 12 Tobias 2010-11-23 19:41:13 UTC
Doh, I blame the bugzilla breakage yesterday. Is the new dmesg of any use?
Comment 13 Dan Carpenter 2010-11-23 20:23:17 UTC
[ 9287.020921] ata3.01: configured for UDMA/133 [ 9287.035898] call 2:0:1:0+ returned 0 after 2379249 usecs (that's your /dev/sda taking 2 seconds) [ 9289.840112] usb 1-4: reset high speed USB device using ehci_hcd and address 3 [ 9289.990593] call 1-4+ returned 0 after 5205424 usecs (that's your usb taking 5 seconds) The other things take an additional 2 seconds for a total of 9 seconds. [ 9294.004368] PM: resume of devices complete after 9412.372 msecs You're on the border. The warning and the stack dump only get printed if you hit 10 seconds. I doubt this is a problem that's fixable with software. Another bad thing about the stack dump is that it taints your kernel with the 'W' for warning.
Comment 14 Rafael J. Wysocki 2010-11-23 22:18:08 UTC
Well, we can drop the stack dump, but then people won't even report the problem, I guess.
Comment 15 Florian Mickler 2011-02-20 10:57:01 UTC
And about ehci_hcd taking 5 seconds to reset... theres nothing to fix there?
Comment 16 Alan Stern 2011-02-20 22:15:32 UTC
It was 1-4 that took 5 seconds, not usb1 or 0000:00:1d.7 (either of which might reasonably be called ehci_hcd). What is the 1-4 device, anyway? Dan's timing calculations in comment #13 aren't right. The ATA and USB resumes ran in parallel, taking 5 seconds overall. As for why the USB resume took so long, I can't tell from this log. Enabling CONFIG_USB_DEBUG might help. By the way, the real bottleneck is the PNP0C0A:00 device, whatever that is: [ 9294.000258] call PNP0C0A:00+ returned 0 after 9187140 usecs
Comment 17 Rafael J. Wysocki 2011-02-21 19:54:16 UTC
PNP0C0A is ACPI battery.
Comment 18 Florian Mickler 2011-02-21 22:05:18 UTC
could this be something like https://bugzilla.kernel.org/show_bug.cgi?id=8246 ? does the kernel boot option "acpi=rsdt" help?