Description Mark Korondi 2007-07-12 00:10:53 UTC
Most recent kernel where this bug did not occur: None found. I tried almost every release from 2.6.15 on and none of them works.
Distribution: Ubuntu (all versions), Arch Linux (all versions)
Hardware Environment: Clevo M55xN laptop with an Intel ICH7 motherboard and a "Serial ATA Storage Controller IDE (rev 02) and IDE Controller" installed on it. (Devices 00:1f.0 to 00:1f.3 are exactly the same as in the MacBook Pro 15" Core 2 Duo.)
Software Environment: Independent.
Problem Description: After one sufficiently long suspend/resume cycle the OS cannot switch off the CD drive, so the laptop cannot suspend a second time (it hangs) and also cannot power off (it reboots). During the second suspend attempt the hard disk spins down, the LCD switches off, and USB mostly switches off (the mouse indicator LED goes dark), but the fan keeps running. The CD drive also makes some noise, similar to the noise it makes when the laptop POSTs after power-on and the drive is being recognized. When suspend goes fine I cannot open the CD tray, but when suspend fails I can. That is why I think the failure comes from the CD drive. Apart from this bug, suspend works fine on this computer. More information about the hardware will be added in comments.
Steps to reproduce:
1. Suspend the laptop
2. Wait more than about 30 seconds to 1 minute
3. Wake it up
4. Wait a bit
5.a. Try to suspend again (hangs)
5.b. Try to power off (reboots)
Comment 3 Mark Korondi 2007-07-12 00:17:26 UTC
Created attachment 12003 [details] dmesg output after a successful suspend/resume cycle
Comment 4 Mark Korondi 2007-07-12 00:17:54 UTC
Created attachment 12004 [details] su -c 'dmidecode' output
Comment 5 Mark Korondi 2007-07-12 00:18:30 UTC
Created attachment 12005 [details] The latest default Arch Linux 2.6.22 kernel
Comment 6 Adrian Bunk 2007-07-12 11:22:30 UTC
After booting, dmesg says:
ata1: PATA max UDMA/133 cmd 0x000101f0 ctl 0x000103f6 bmdma 0x00011810 irq 14
ata2: PATA max UDMA/133 cmd 0x00010170 ctl 0x00010376 bmdma 0x00011818 irq 15
ata1.00: ATAPI: MATSHITAUJ-840D, 1.00, max UDMA/33
ata1.00: configured for UDMA/33
After resuming, dmesg says:
ata1.00: model number mismatch 'MATSHITAUJ-840D' != '<CD>A<D4>S<C8>I<D4>A<D5>J <AD>8<B4>0<C4> <A0> <A0> <A0> <A0> <A0> <A0> <A0> <A0> <A0> <A0> <A0> <A0>'
ata1.00: revalidation failed (errno=-19)
ata1.00: limiting speed to UDMA/33:PIO3
ata1: failed to recover some devices, retrying in 5 secs
Alan?
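One thing that can be checked directly from the resume-time dmesg line above: each escaped byte in the garbled model string appears to be the expected character with bit 7 set. The sketch below only tests that observation. The byte values are copied from the log, the helper name `clear_high_bit` is made up for illustration, and nothing here says why the bit would be set.

```python
# Escaped bytes from the resume-time model string, in order of appearance.
# The long run of trailing <A0> bytes is represented by a single entry.
garbled = [0xCD, 0xD4, 0xC8, 0xD4, 0xD5, 0xAD, 0xB4, 0xC4, 0xA0]

# Characters of the boot-time model string at even positions (0, 2, 4, ...),
# followed by one padding space to pair with the <A0> entry.
expected = "MATSHITAUJ-840D"[0::2] + " "


def clear_high_bit(values):
    """Return the text obtained by masking bit 7 off every byte value."""
    return "".join(chr(v & 0x7F) for v in values)


recovered = clear_high_bit(garbled)

# True if every garbled byte has bit 7 set and otherwise matches.
all_high = all(v & 0x80 for v in garbled)
print(all_high and recovered == expected)
```

If the comparison holds, the unescaped characters (A, S, I, A, J, 8, 0) are the ones that came through intact, which is consistent with one bit being stuck high in every 16-bit word read from the drive. That is a reading of the log, not a diagnosis.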
Comment 7 Mark Korondi 2007-07-16 02:17:11 UTC
Uhh... As far as I can see, something went wrong with my DVD drive. Now it doesn't work. I will try a hard reset later, because right now neither the BIOS nor Linux recognizes the drive (I also tried with libata.atapi_enabled=libata and combined_mode=libata). Sorry for the inconvenience; I have now disabled the drive in the BIOS. Here is the new dmesg, taken after a long suspend period, after which suspend didn't work the second time (so the problem is still there even without the DVD drive).
Comment 8 Mark Korondi 2007-07-16 02:17:50 UTC
Created attachment 12049 [details] dmesg after disabling the IDE-SATA controller
Comment 9 Rafael J. Wysocki 2007-08-29 10:13:36 UTC
(In reply to comment #8)
> Created an attachment (id=12049) [details]
> dmesg after disabling the IDE-SATA controller

The attachment looks like lspci output. Anyway, your hardware configuration seems to be similar to some known working ones. I'm afraid there's a hardware problem somewhere in your box.
Comment 10 Mark Korondi 2007-08-30 02:24:01 UTC
I can post data from two, and next week even three, other Intel ICH7-based Clevo machines on which suspend doesn't work. Each of them shows the same problem. And remember that it works in Windows. So sadly it isn't a hardware-caused problem; it comes from the Linux kernel. I know that Clevo doesn't build good computers and that their BIOS is barely usable, but apart from that, suspend must work in Linux just as it does in Windows. (I wanted to test it on PC-(free)BSD as well, but there it won't come back from suspend at all, so I didn't learn anything new about this ever-growing problem.)
Comment 11 Rafael J. Wysocki 2007-09-11 13:59:29 UTC
Hm. Have you tried to suspend in the minimal configuration (ie. after booting with init=/bin/bash)?
Comment 12 Mark Korondi 2007-09-11 23:15:00 UTC
Of course I tried it. I also tried a custom kernel with all drivers built as modules (in that case I generated an initrd image so that it could boot, because of the filesystem and SATA/IDE interface modules), and I experimented with everything built in, too. The symptom was the same in both cases. That is why I am sure that the main problem is somewhere in the suspend code.
Comment 13 Mark Korondi 2007-09-11 23:21:17 UTC
Oh, one additional thing: I think it's a very hardware-specific problem (as you said, it works on machines similar to this one). I also removed the whole drive from the laptop, thinking that if it cannot be powered off, I'll help it along... Suspending still didn't work. I hope this fact helps to locate the source of my problem. Probably the misbehaving device is not the drive but the IDE-SATA interface on the motherboard.
Comment 14 Rafael J. Wysocki 2007-09-23 05:39:15 UTC
Please test the 2.6.23-rc7 kernel with the patches from: http://www.sisk.pl/kernel/hibernation_and_suspend/2.6.23-rc7/patches/ applied and CONFIG_BLK_DEV_IDEACPI set in .config .
Comment 15 Mark Korondi 2007-09-23 12:25:39 UTC
Created attachment 12911 [details] 2.6.23-rc7 with the suggested 38 suspend&hibernate patches after an unsuccessful suspend.
Comment 16 Mark Korondi 2007-09-23 12:27:22 UTC
I tried your suggestions without any success; above is a new dmesg taken after that suspend. (Of course, it could suspend once and wake up 3 minutes later, but the second time it wouldn't work.)
Comment 17 Rafael J. Wysocki 2007-09-23 12:45:43 UTC
Well, thanks. Why is there "Linux version 2.6.20-16-generic" in the dmesg?
Comment 18 Mark Korondi 2007-09-23 13:01:51 UTC
OMG, sorry for this silly mistake. Of course I tried to suspend with the right kernel... But afterwards I rebooted into the standard generic Ubuntu kernel to look at the magic number and the line below it (I hope one day it will say something that I also understand :) ). So now I will post the right dmesg, taken after one suspend/resume cycle.
Comment 19 Mark Korondi 2007-09-23 13:02:27 UTC
Created attachment 12912 [details] 12911: 2.6.23-rc7 with the suggested 38 suspend&hibernate patches
Comment 20 Rafael J. Wysocki 2007-10-06 07:30:48 UTC
Hm, can you try to use the libata-based PATA driver instead of the old IDE one and see if that changes anything (in that case please add libata.noacpi=0 to the kernel command line)?
Comment 21 Mark Korondi 2007-10-07 11:23:00 UTC
It didn't help. The dmesgs are attached below.
Comment 22 Mark Korondi 2007-10-07 11:24:38 UTC
Created attachment 13069 [details] If it stays in suspend it can survive a second or even third one, too
Comment 23 Mark Korondi 2007-10-07 11:25:36 UTC
Created attachment 13070 [details] After being suspended for a long time, it cannot survive the second suspend, so no more lines could be posted...
Comment 24 Rafael J. Wysocki 2007-10-07 11:54:46 UTC
Can you attach your current .config, please?
Comment 25 Mark Korondi 2007-10-07 13:27:59 UTC
Created attachment 13071 [details] actually used .config
Of course I can; here it is. Remember that this kernel is patched and compiled as suggested here: http://bugzilla.kernel.org/show_bug.cgi?id=8737#c14
Comment 26 Rafael J. Wysocki 2007-10-07 13:55:56 UTC
Please try with CONFIG_IDE unset and with only one PATA driver selected (not that I expect to see much difference, but I'd like to reduce the noise level as much as reasonably possible :-)). Also (for the same reason), please try with CONFIG_CPU_FREQ unset and compile the kernel without SMP support and see if that has any effect on the symptoms.
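Before rebuilding, it may help to confirm that the options mentioned above are really off in the .config being used. The following is a minimal sketch, assuming the usual .config layout where enabled options appear as `CONFIG_FOO=y` or `CONFIG_FOO=m` and disabled ones as `# CONFIG_FOO is not set`. The function name `enabled_options` and the sample text are made up for illustration; `CONFIG_SMP` is assumed to be the option behind "SMP support".

```python
def enabled_options(config_text, names):
    """Return the sorted subset of `names` that are set to y or m."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments such as "# CONFIG_IDE is not set".
        if line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key in names and value in ("y", "m"):
            enabled.add(key)
    return sorted(enabled)


# Hypothetical excerpt, not taken from any attachment in this report.
sample = """\
CONFIG_SMP=y
# CONFIG_IDE is not set
CONFIG_CPU_FREQ=y
CONFIG_ATA_PIIX=y
"""

still_on = enabled_options(sample, {"CONFIG_IDE", "CONFIG_CPU_FREQ", "CONFIG_SMP"})
print(still_on)
```

An empty result would mean none of the listed options is enabled in the text that was checked. Reading the real file with `open(".config").read()` in place of `sample` is the obvious next step.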
Comment 27 Mark Korondi 2007-10-08 17:02:50 UTC
Created attachment 13086 [details] .config built as suggested
Tried with no SMP, no CONFIG_IDE, and only one PATA driver (ata_piix), without success. Booted both with libata.noacpi=0 and without it.
Comment 28 Rafael J. Wysocki 2007-10-12 12:40:04 UTC
Please apply the patch from: http://bugzilla.kernel.org/attachment.cgi?id=12991&action=view on top of your kernel. Then, please follow the instructions at: http://bugzilla.kernel.org/show_bug.cgi?id=7499#c44 and see if you are able to reproduce the problem and in which step.
Comment 29 Mark Korondi 2007-12-13 03:23:21 UTC
Ah, I'm very sorry! I missed your last post for two months! As soon as I have a little time for it, I will try the things you suggested two months ago.
Comment 30 Mark Korondi 2008-01-02 02:35:43 UTC
I tried with the latest rc kernel; it doesn't work. I also tried what you suggested in comment #28 (http://bugzilla.kernel.org/show_bug.cgi?id=8737#c28). What should happen? From debugging level 8 down to 1 everything went fine: I waited for 3 seconds and it came back. But I was listening to the DVD drive, and in this debugging mode the drive, to which our problem is related, was never switched off. At level 0 it suspended and of course came back (the noise came from the drive while it was being recognized), but the second time it failed to suspend, as always. P.S.: Sorry again for not responding during the last two or three months; from now on I will be online again and prepared for testing ;-)
Comment 31 Rafael J. Wysocki 2008-01-08 13:39:40 UTC
Please test 2.6.24-rc7 with the libata driver. There are some important libata suspend fixes in it that might help.
Comment 32 Mark Korondi 2008-01-11 10:12:39 UTC
Nothing different. It doesn't work
Comment 33 Rafael J. Wysocki 2008-01-11 10:51:22 UTC
Hmm, this may be ACPI-related, but I'm not sure in what way. It seems to be similar to Bug #9673, solved by blacklisting the machine in question.
Comment 34 Mark Korondi 2008-01-11 11:00:11 UTC
OK, at first glance I don't understand what I should do or try after following the link to bug 9673... :\
Comment 35 Rafael J. Wysocki 2008-01-11 13:00:22 UTC
Please attach the output of acpidump, to begin with.
Comment 36 Mark Korondi 2008-01-11 16:11:10 UTC
Created attachment 14421 [details] acpidump (2.6.22-14 Ubuntu 7.10 default kernel)
OK, let's go...
Comment 37 Mark Korondi 2008-01-11 16:13:34 UTC
Created attachment 14423 [details] A dsdt (the same kernel)
I am also posting the output of `cat /proc/acpi/dsdt`... The two dumps are from a freshly installed Ubuntu 7.10 (because I'm using Arch Linux now and its repos don't contain the acpidump application).
Comment 38 Rafael J. Wysocki 2008-01-12 13:43:16 UTC
Created attachment 14427 [details] DSDT disassembled
Comment 39 Rafael J. Wysocki 2008-01-12 13:50:25 UTC
Well, I don't see anything suspicious in the DSDT, but I'm not very experienced in reading these things. Someone with more ACPI experience should look at it.
Comment 40 Zhang Rui 2009-03-18 18:55:34 UTC
hi, Mark, does the problem still exist in the latest kernel release?
Comment 41 Mark Korondi 2009-03-19 00:27:25 UTC
Hi Zhang, actually I gave that laptop to my mother, and at the moment I cannot test it. It now has Ubuntu 8.10 installed. I have occasionally tried to suspend it, but the symptom was the same. Having read the changelog of the new kernel, I know about the improvements to the suspend/resume architecture, so I plan to test it with the new releases. Next weekend (03/27-03/29) I will be able to test the newest kernel; please be patient until then! Thank you for still taking care of this bug!
Comment 42 Mark Korondi 2009-03-27 23:27:33 UTC
Hi Zhang,
Well, unfortunately it doesn't work with 2.6.29. The play is the same: the first time it works flawlessly, the second time the noise comes from the CD drive, and then comes the black nothing. :( Now that I think it over, it's a really, _really_ annoying bug, as it occurs only on the second suspend cycle. Argh!
Comment 43 Zhang Rui 2009-03-30 02:58:04 UTC
Please do this test:
1. Set CONFIG_PM_DEBUG and rebuild your kernel
2. echo core > /sys/power/pm_test
3. echo mem > /sys/power/state
Does the problem still exist when you run step 3 for the second time?
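The test above could be scripted roughly as follows, so that the second run of step 3 happens the same way every time. This is only a sketch: it assumes a kernel built with CONFIG_PM_DEBUG, root privileges, and that the write to `state` returns after resume. The function name `suspend_with_pm_test` and its `power_dir` and `cycles` parameters are made up for illustration.

```python
import os


def suspend_with_pm_test(level, power_dir="/sys/power", cycles=2):
    """Select a pm_test level, then request suspend-to-RAM `cycles` times."""
    # Step 2: choose how far the suspend sequence should go (e.g. "core").
    with open(os.path.join(power_dir, "pm_test"), "w") as f:
        f.write(level)

    # Step 3, repeated: each write asks the kernel for a suspend cycle.
    for _ in range(cycles):
        with open(os.path.join(power_dir, "state"), "w") as f:
            f.write("mem")


# Example call (not executed here, since it would suspend the machine):
# suspend_with_pm_test("core")
```

The `power_dir` argument exists only so that the function can be pointed at a scratch directory when trying it out away from the affected laptop.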
Comment 44 Zhang Rui 2009-06-19 03:49:03 UTC
no response from the bug reporter. please re-open it if the problem still exists in the latest git kernel.
Comment 45 Mark Korondi 2009-06-22 11:14:23 UTC
Hello,
I tried the latest (git18) kernel, and the symptoms are the same. Interestingly, when I select the [core] test mode, it always succeeds (a blinking cursor on the screen for ~10 seconds, and then it comes back), but if I select [none], it suspends the first time and fails the second time. Even if I run [core] between two [none] attempts, [core] does its job, but the second [none] still fails. So the [core] mode _always_ "works"; of course, with this mode enabled the laptop doesn't power down the way it does in a real suspend.
> please re-open it if the problem still exists in the latest git kernel.
That's not necessary now. It is a two-year-old bug, and I have been investigating it for almost three years. During that time I tried several distributions and a dozen different quirks, including playing with vbesave and recompiling the DSDT. I tried adjusting the SATA-related settings in the BIOS, testing on FreeBSD, and so on. Now I am tired of it. If anybody is interested in this bug and wants me to test further, I can do so within one or two weeks (the machine is now at my mother's, 150 km away from me), but I suggest closing the bug with a WONTFIX resolution.
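When switching between [core] and [none] as described above, it is easy to lose track of which mode is active. A small sketch for reading it back, assuming /sys/power/pm_test lists the available modes separated by spaces with the selected one in square brackets (the notation used in this comment). The function name `active_pm_test_mode` is made up for illustration.

```python
import re


def active_pm_test_mode(contents):
    """Return the bracketed mode name from pm_test contents, or None."""
    match = re.search(r"\[(\w+)\]", contents)
    return match.group(1) if match else None


# Hypothetical file contents; on a real system read /sys/power/pm_test.
print(active_pm_test_mode("none [core] processors platform devices freezer"))
```

Logging this value before each suspend attempt would make a report like "first [none] works, second [none] fails" easier to reproduce exactly.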
Comment 46 Mark Korondi 2009-11-21 23:17:34 UTC
Hi, I've just been playing around with suspending this notebook under Ubuntu 9.10, and it seems to work fine for the first several (more than two) times. It is an unmodified system, upgraded continuously since Hardy (04/2008). Unfortunately, after several successful suspend-resume cycles the computer hung once, but since then I have had more than ten working suspend-resume cycles without seeing that again. I think the current description of this bug has become obsolete thanks to the latest kernels, so this bug could be resolved as FIXED. If the problem occurs after many working cycles, that would be a different bug.