Bug 8737
Description
Mark Korondi
2007-07-12 00:10:53 UTC
Created attachment 12001 [details]
lspci output
Created attachment 12002 [details]
lspci -vv output
Created attachment 12003 [details]
dmesg output after a successful suspend/resume cycle
Created attachment 12004 [details]
su -c 'dmidecode' output
Created attachment 12005 [details]
The latest default Arch Linux 2.6.22 kernel
After booting, dmesg says:

    ata1: PATA max UDMA/133 cmd 0x000101f0 ctl 0x000103f6 bmdma 0x00011810 irq 14
    ata2: PATA max UDMA/133 cmd 0x00010170 ctl 0x00010376 bmdma 0x00011818 irq 15
    ata1.00: ATAPI: MATSHITAUJ-840D, 1.00, max UDMA/33
    ata1.00: configured for UDMA/33

After resuming, dmesg says:

    ata1.00: model number mismatch 'MATSHITAUJ-840D' != '<CD>A<D4>S<C8>I<D4>A<D5>J <AD>8<B4>0<C4> <A0> <A0> <A0> <A0> <A0> <A0> <A0> <A0> <A0> <A0> <A0> <A0>'
    ata1.00: revalidation failed (errno=-19)
    ata1.00: limiting speed to UDMA/33:PIO3
    ata1: failed to recover some devices, retrying in 5 secs

Alan?

Uhh... As far as I can see, something went wrong with my DVD drive. Now it doesn't work. I will try a hard reset later, because now neither the BIOS nor Linux recognizes the drive (I also tried with libata.atapi_enabled=libata and combined_mode=libata).

Sorry for the inconvenience; I have now disabled the drive in the BIOS. Here is the new dmesg, taken after a long suspend period, after which suspend did not work the second time (so the problem is still there even without the DVD drive).

Created attachment 12049 [details]
dmesg after disabling the IDE-SATA controller
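
A possible reading of the garbled model string in the resume log quoted earlier, not a diagnosis confirmed anywhere in this report: every byte that was printed as a hex escape looks like the expected character with bit 7 set. The Python sketch below clears that bit and compares the result with the expected model string, ignoring padding spaces. The helper names `clear_high_bit` and `same_model` are made up for this example and are not kernel or libata functions.

```python
import re

# Model strings copied from the resume dmesg in this report.
EXPECTED = "MATSHITAUJ-840D"
GARBLED = ("<CD>A<D4>S<C8>I<D4>A<D5>J <AD>8<B4>0<C4> "
           "<A0> <A0> <A0> <A0> <A0> <A0> <A0> <A0> <A0> <A0> <A0> <A0>")


def clear_high_bit(escaped: str) -> str:
    """Replace each <XX> hex escape with the character for XX & 0x7F.

    Hypothetical helper for this sketch only.
    """
    return re.sub(
        r"<([0-9A-Fa-f]{2})>",
        lambda m: chr(int(m.group(1), 16) & 0x7F),
        escaped,
    )


def same_model(expected: str, escaped: str) -> bool:
    """Compare after clearing bit 7 and dropping spaces (ID string padding)."""
    cleaned = clear_high_bit(escaped).replace(" ", "")
    return cleaned == expected.replace(" ", "")


if __name__ == "__main__":
    print(repr(clear_high_bit(GARBLED)))
    print("match after clearing bit 7:", same_model(EXPECTED, GARBLED))
```

If the comparison holds, the identify data came back with one bit read as 1 on some bytes rather than as random noise. What causes that is not established in this report.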
(In reply to comment #8)
> Created an attachment (id=12049) [details]
> dmesg after disabling the IDE-SATA controller

The attachment looks like lspci output. Anyway, your hardware configuration seems to be similar to some known working ones. I'm afraid there's a hardware problem somewhere in your box.

I can already post data from two other Intel ICH7-based Clevo machines (next week even three) on which suspend doesn't work. Each of them shows the same problem. And remember that it works in Windows. So sadly it isn't a hardware problem; it comes from the Linux kernel. I know that Clevo doesn't build good computers and that their BIOS is barely usable, but apart from that, suspend must work in Linux just as it does in Windows. (I wanted to test it on PC-(free)BSD too, but there it doesn't come back from suspend at all, so I learned nothing new about this ever-growing problem.)

Hm. Have you tried to suspend in the minimal configuration (i.e. after booting with init=/bin/bash)?

Of course I tried it. I also tried a custom kernel with all drivers built as modules (in that case I generated an initrd image so that it could boot, because of the filesystem and SATA/IDE interface modules), and I experimented with everything built in, too. The symptom was the same in both cases. That is why I am sure that the main problem is somewhere in the suspend functions.

Oh, one additional thing: I think it's an extremely hardware-related problem (as you said, it works on machines similar to this one). I also removed the whole drive from the laptop, thinking that if it cannot be powered off, I'll help it along... Suspend still didn't work. I hope this fact helps to locate the source of my problem. Probably the misbehaving device is not the drive but the IDE-SATA interface on the motherboard.

Please test the 2.6.23-rc7 kernel with the patches from:
http://www.sisk.pl/kernel/hibernation_and_suspend/2.6.23-rc7/patches/
applied and CONFIG_BLK_DEV_IDEACPI set in .config.

Created attachment 12911 [details]
2.6.23-rc7 with the suggested 38 suspend&hibernate patches after an unsuccessful suspend.
I tried your suggestions without any success; above is a new dmesg taken after that suspend. (Of course, it could suspend and wake up 3 minutes later, but the second time it wouldn't work.)

Well, thanks. Why is there "Linux version 2.6.20-16-generic" in the dmesg?

OMG, sorry for this silly mistake. Of course I tried to suspend with the right kernel... but afterwards I rebooted into the standard generic Ubuntu kernel to look at the magic number and the row below it (I hope one day it says something I also understand :) ). So now I will post a correct dmesg, after one suspend/resume cycle.

Created attachment 12912 [details]
12911: 2.6.23-rc7 with the suggested 38 suspend&hibernate patches
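
A mix-up like the one above (a dmesg taken under a different kernel than the one being tested) can be caught before a log is attached. This is a minimal Python sketch, assuming the log still contains the usual `Linux version ...` banner line; the function names are made up for this example.

```python
import re
from typing import Optional


def kernel_version_in_dmesg(dmesg_text: str) -> Optional[str]:
    """Return the version token after 'Linux version', or None if absent."""
    match = re.search(r"Linux version (\S+)", dmesg_text)
    return match.group(1) if match else None


def is_expected_kernel(dmesg_text: str, expected_prefix: str) -> bool:
    """True if the banner's version starts with expected_prefix."""
    version = kernel_version_in_dmesg(dmesg_text)
    return version is not None and version.startswith(expected_prefix)


if __name__ == "__main__":
    # Banner text quoted in this report; the rest of the line is omitted.
    sample = "Linux version 2.6.20-16-generic"
    print(kernel_version_in_dmesg(sample))
    print(is_expected_kernel(sample, "2.6.23-rc7"))
```

The banner can be missing if the kernel ring buffer has wrapped; in that case the sketch returns `None` and the check reports a mismatch.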
Hm, can you try to use the libata-based PATA driver instead of the old IDE one and see if that changes anything (in that case please add libata.noacpi=0 to the kernel command line)?

It didn't help. The dmesgs are attached below.

Created attachment 13069 [details]
If it stays in suspend it can survive a second or even third one, too
Created attachment 13070 [details]
After being suspended for a long time, it cannot survive the second suspend, so no more lines could be posted...
Can you attach your current .config, please?

Created attachment 13071 [details]
actually used .config

Of course I can, here it is. Remember that this kernel is patched and compiled as suggested here: http://bugzilla.kernel.org/show_bug.cgi?id=8737#c14

Please try with CONFIG_IDE unset and with only one PATA driver selected (not that I expect to see much difference, but I'd like to reduce the noise level as much as reasonably possible :-)). Also (for the same reason), please try with CONFIG_CPU_FREQ unset and compile the kernel without SMP support and see if that has any effect on the symptoms.

Created attachment 13086 [details]
.config built as suggested
Tried with no SMP, no CONFIG_IDE, and only one PATA driver (ata_piix), without success. Booted both with and without libata.noacpi=0.
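
Whether a .config really has the requested options unset can be checked mechanically. This is a minimal Python sketch, assuming the standard .config syntax in which set options appear as `CONFIG_X=value` and unset ones as `# CONFIG_X is not set`; the function names and the sample fragment are illustrative and are not taken from the attached .config.

```python
import re

# Options this report asks to have unset for the test build.
MUST_BE_UNSET = ("CONFIG_IDE", "CONFIG_CPU_FREQ", "CONFIG_SMP")


def parse_kconfig(text):
    """Return {option: value} for options that are set in a .config."""
    options = {}
    for line in text.splitlines():
        match = re.match(r"^(CONFIG_\w+)=(.*)$", line.strip())
        if match:
            options[match.group(1)] = match.group(2)
    return options


def still_set(text, names=MUST_BE_UNSET):
    """Return the names from `names` that are still set in the .config."""
    options = parse_kconfig(text)
    return [name for name in names if name in options]


if __name__ == "__main__":
    # Illustrative fragment only.
    sample = "# CONFIG_IDE is not set\nCONFIG_SMP=y\nCONFIG_ATA_PIIX=y\n"
    print(still_set(sample))
```

The sketch only covers the "unset" requests; checking that exactly one PATA driver is selected would need a list of the relevant driver options, which is not attempted here.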
Please apply the patch from:
http://bugzilla.kernel.org/attachment.cgi?id=12991&action=view
on top of your kernel. Then, please follow the instructions at:
http://bugzilla.kernel.org/show_bug.cgi?id=7499#c44
and see if you are able to reproduce the problem and in which step.

Ah, very sorry! I missed your last post, for two months! As soon as I have a little time for it, I will try the tricks you described two months ago.

I tried with the latest rc kernel; it doesn't work. I also tried what you suggested in comment #28 (http://bugzilla.kernel.org/show_bug.cgi?id=8737#c28). What should happen? At debugging levels 8 down to 1 everything went fine: I waited for 3 seconds and it came back. But I was listening to the DVD drive, which our problem is related to, and in this debugging mode it was never switched off. At level 0 it suspended and of course came back (the noise came from the drive while it was being recognized), but the second time it failed to suspend, as always.

P.S.: Sorry again for not responding during the last two or three months; from now on I will be online again and ready for testing ;-)

Please test 2.6.24-rc7 with the libata driver. There are some important libata suspend fixes in it that might help.

Nothing different. It doesn't work.

Hmm, this may be ACPI-related, but I'm not sure in what way. It seems to be similar to Bug #9673, solved by blacklisting the machine in question.

OK, at first look I don't understand what to do or try now after following the link to bug 9673... :\

Please attach the output of acpidump, to begin with.

Created attachment 14421 [details]
acpidump (2.6.22-14 Ubuntu 7.10 default kernel)
OK, let's go...
Created attachment 14423 [details]
A dsdt (the same kernel)
I am also posting the output of `cat /proc/acpi/dsdt`...
The two dumps are from a freshly installed Ubuntu 7.10 (because I'm now using Arch Linux and its repos don't contain the acpidump application).
Created attachment 14427 [details]
DSDT disassembled
Well, I don't see anything suspicious in the DSDT, but I'm not that experienced in reading these things. Someone with more ACPI experience should look at it.

Hi Mark, does the problem still exist in the latest kernel release?

Hi Zhang, actually I gave that laptop to my mother, and at the moment I cannot test it. It now has Ubuntu 8.10 installed. I occasionally tried to suspend it, but the symptom was the same. Having read the changelog of the new kernel, I know about the improvements to the suspend/resume architecture, so I plan to test it with the new releases. Next weekend (03/27-03/29) I will be able to test the newest kernel; please be patient until then! Thank you for still taking care of this bug!

Hi Zhang, well, unfortunately it doesn't work with 2.6.29. The story is the same: the first time it works flawlessly, the second time the noise comes from the CD drive, and then comes the black nothing. :( Now that I think it over, it's a really, _really_ annoying bug, as it occurs only on the second suspend cycle. Argh!

Please do this test:
1. set CONFIG_PM_DEBUG and rebuild your kernel
2. echo core > /sys/power/pm_test
3. echo mem > /sys/power/state
Does the problem still exist when you run step 3 for the second time?

No response from the bug reporter. Please re-open it if the problem still exists in the latest git kernel.

Hello,
I tried the latest (git18) kernel, and the symptoms are the same.
Interestingly, with the [core] test mode selected it always succeeds (blinking cursor on the screen for ~10 seconds, then it comes back), but with [none] selected it suspends the first time and fails the second time. Even with a [core] run between two [none] runs, [core] does its job, but the second [none] fails.
So the [core] method _always_ "works". Of course, with this mode enabled the laptop doesn't switch off as it does on a real suspend.
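
The sequence described above (a real suspend, a [core] test run, then another real suspend) can be scripted so that it is repeated the same way each time. This is a minimal Python sketch, assuming a kernel built with CONFIG_PM_DEBUG that exposes /sys/power/pm_test in the usual format, with the selected level in square brackets; the function names are made up for this example, and writing to these files requires root.

```python
from pathlib import Path

SYS_POWER = Path("/sys/power")  # sysfs location used in this report


def pm_test_levels(power_dir=SYS_POWER):
    """Return (available levels, currently selected level) from pm_test."""
    tokens = (Path(power_dir) / "pm_test").read_text().split()
    current = next((t.strip("[]") for t in tokens if t.startswith("[")), None)
    return [t.strip("[]") for t in tokens], current


def suspend_once(level, power_dir=SYS_POWER, state="mem"):
    """Select a pm_test level, then request the given sleep state."""
    power_dir = Path(power_dir)
    (power_dir / "pm_test").write_text(level)
    # On a real system this write does not return until the machine resumes
    # (or until the test delay expires for levels other than "none").
    (power_dir / "state").write_text(state)


def run_sequence(levels, power_dir=SYS_POWER):
    """Run one suspend attempt per level, announcing each before it starts."""
    for level in levels:
        print(f"pm_test={level}: writing mem to state", flush=True)
        suspend_once(level, power_dir)
```

Calling `run_sequence(["none", "core", "none"])` as root would correspond to the pattern reported here; the last message printed before a hang shows which attempt failed.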
> please re-open it if the problem still exists in the latest git kernel.
It's not necessary now. It is a two-year-old bug, and I have been investigating it for almost three years. During this time I tried several distributions and a dozen different quirks, including playing with vbesave and recompiling the DSDT table. I tried adjusting the SATA-related settings in the BIOS, testing on FreeBSD, and so on.
Now I am tired of it. If anybody is interested in this bug and wants me to test it further, I would do it within one or two weeks (as the machine is now at my mother's, 150 km from me), but I suggest closing the bug with a WONTFIX resolution.
Hi, I've just been playing around with suspending this notebook under Ubuntu 9.10, and it seems to work fine for the first few (more than two) times. It is an unmodified system, upgraded since Hardy (04/2008). Unfortunately, after several successful suspend/resume cycles the computer hung once, but since then I have had more than ten working suspend/resume cycles without experiencing that again. I think that the current description of this bug has become obsolete thanks to the latest kernels, so this bug could be resolved as FIXED. If the problem occurs after many working cycles, it would be a different bug.