Distribution: SUSE Linux 10.1 Beta8 with vanilla 2.6.16 kernel

Hardware Environment: Toshiba Satellite P10-554 Notebook. Hyperthreaded Pentium 4.

/proc/cpuinfo:

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 15
model           : 2
model name      : Intel(R) Pentium(R) 4 CPU 2.80GHz
stepping        : 9
cpu MHz         : 2793.509
cache size      : 512 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 1
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe cid xtpr
bogomips        : 5595.70

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 15
model           : 2
model name      : Intel(R) Pentium(R) 4 CPU 2.80GHz
stepping        : 9
cpu MHz         : 2793.509
cache size      : 512 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 1
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe cid xtpr
bogomips        : 5586.31

Problem Description: On both suspend to disk and suspend to RAM, after waking up the system and reloading the image, the kernel panics with the line "kernel panic - not syncing: Not enough cpus". I will attach an image of the complete resume process. Both suspend to disk and suspend to RAM work flawlessly when giving 'noapic' as a boot parameter. Manually disabling and enabling CPU1 by writing values to /sys/devices/system/cpu/cpu1/online works, too.

Steps to reproduce: Boot with init=/bin/bash, mount /sys, swapon -a, echo disk/mem > /sys/power/state, wake up system
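The manual CPU1 toggle and the final reproduction step can be sketched as two small shell helpers. This is only a sketch under stated assumptions, not the exact commands that were typed: the names SYSFS_ROOT, set_cpu_online and enter_sleep are hypothetical, and SYSFS_ROOT exists only so the helpers can be pointed at a scratch directory. On a real system it defaults to /sys and the writes need root.

```shell
#!/bin/sh
# Sketch only: SYSFS_ROOT, set_cpu_online and enter_sleep are illustrative names.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# set_cpu_online CPU STATE
# Write 0 (offline) or 1 (online) to the CPU's hotplug control file.
set_cpu_online() {
    cpu="$1"
    state="$2"
    case "$state" in
        0|1) ;;
        *) echo "state must be 0 or 1" >&2; return 2 ;;
    esac
    echo "$state" > "$SYSFS_ROOT/devices/system/cpu/cpu$cpu/online"
}

# enter_sleep MODE
# Request suspend to disk ("disk") or suspend to RAM ("mem").
enter_sleep() {
    case "$1" in
        disk|mem) echo "$1" > "$SYSFS_ROOT/power/state" ;;
        *) echo "mode must be disk or mem" >&2; return 2 ;;
    esac
}
```

With these, `set_cpu_online 1 0` followed by `set_cpu_online 1 1` is the manual toggle described above, and `enter_sleep disk` or `enter_sleep mem` is the last step before waking the machine.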
Created attachment 7637 [details] Image of kernel panic
Please do: manually offline a CPU and then do a suspend/resume cycle. Let's see if you can online the CPU after resume. Do you use genapic? (It would be better if you could provide the config file and dmesg before resume.)
Could you please take a photo of the system hang with the boot option 'apic=debug'? This will give us more info to root cause this issue.
> Please do: manually offline a CPU and then do a suspend/resume cycle. Let's
> see if you can online the CPU after resume.

System resumes if cpu1 is offlined before suspend. But I can't enable it afterwards. Every time I try to enable it, I get the same messages as those shortly before the line "Error taking cpu 1 up: -22". But there is no kernel panic.

> Do you use genapic? (better if you can provide the config file and dmesg
> before resume)

/var/log/messages tells me "Mar 23 14:48:40 (none) kernel: Unknown genapic `apic=debug' specified.". Nevertheless, I will attach dmesg and config.gz.

> Could you please take the photo of system hang with boot option 'apic=debug'?
> This will give us more info to root cause this issue.

Will attach it, but it only contains one additional line.
Created attachment 7649 [details] dmesg before suspend
Created attachment 7650 [details] config.gz
Created attachment 7651 [details] kernel panic with apic=debug
OK, you are using genapic. Please try 'CONFIG_X86_PC' instead of 'CONFIG_X86_GENERICARCH'.

>System resumes if cpu1 is offlined before suspend. But I can't enable it
>afterwards. I get the same message like short before the Line "Error taking
>cpu 1 up: -22" everytime I try to enable it. But now kernel panic.

To me, this could only happen when there are two CPUs. What I'd like you to try is to offline cpu1 manually and then do a suspend/resume. After resume, manually online cpu1. Let's see if it works. And please give me the dmesg.

>acpic=debug

It appears you spelled it wrong. It's 'apic=debug'.
>Ok, you are using genapic. please try 'CONFIG_X86_PC' instead
>of 'CONFIG_X86_GENERICARCH'.

All tests are done with the new kernel now.

>This only could happen when there are two cpus to me. What I'd like you to try
>is offline cpu1 manually. and then do suspend/resume. After resume, manually
>online cpu1. let's see if it works. and please give me the dmesg.

That's exactly what I did. Unfortunately, nothing is written to disk after resume. So I will attach the dmesg from before setting cpu1 offline, and an image showing cpu1 being disabled, the suspend, the resume, and cpu1 being set online again.

>It appears you spelled it wrong. it's 'apic=debug'.

Yes, I noticed that already but attached the 'old' dmesg.
Created attachment 7657 [details] dmesg before disabling cpu1 and suspending
Created attachment 7658 [details] image of disabling cpu1, suspending, resuming and enabling cpu1
How about boot option 'lpj=11172736'?
After resume and before onlining cpu1, please check if the time is correct. Thanks!
Also, how about boot option 'clock=tsc' or 'clock=pit'?
Let's try not to panic when the 2nd CPU fails to start up, and see if the system can come up with 1 CPU...
Well, the time is indeed not correct after resume. The time varies within a specific range. Hours and minutes always stay the same.

lpj=11172736 --> 1 to 4 seconds (same as without a boot param)
clock=tsc --> 1 to 2 seconds
clock=pit --> it always stays the same

The ranges may be random, though. For example: `date` shows 14:01:01; as soon as it reaches 14:01:05, it switches back to 14:01:01 and the same game starts again.
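The "clock jumps back" behaviour can be checked mechanically rather than by watching `date`. A minimal sketch, assuming timestamps are sampled as integer seconds (for example with `date +%s`), one per line; the function name first_backward_step is hypothetical:

```shell
#!/bin/sh
# first_backward_step
# Read one integer timestamp per line on stdin. Print "PREV -> CUR" for the
# first sample that is smaller than its predecessor and return 0.
# Return 1 if the samples never go backwards.
first_backward_step() {
    awk 'NR > 1 && $1 < prev { print prev " -> " $1; found = 1; exit }
         { prev = $1 }
         END { exit (found ? 0 : 1) }'
}
```

One way to feed it is a plain counting loop around `date +%s` (deliberately without a delay between samples, so the check does not itself depend on a working timer), piped into `first_backward_step`.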
Looks like we are approaching the root cause ;). Please deselect 'CONFIG_X86_PM_TIMER' and let's see if resume works. I guess the PM timer is the root cause, so don't use it.
Do interrupts work after resume? I.e. if you do time sleep 1, does it actually return after one second, or does it hang forever?
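A complementary check to `time sleep 1` is to look at whether the timer interrupt count in /proc/interrupts still advances. The following is a sketch only: it assumes the timer is reported on the "0:" line, as is usual on an i386 PC, and the function name timer_irq_count is hypothetical.

```shell
#!/bin/sh
# timer_irq_count [FILE]
# Print the sum of the per-CPU counters on the IRQ 0 line of FILE
# (default: /proc/interrupts). Assumes IRQ 0 is the timer.
timer_irq_count() {
    awk '$1 == "0:" {
             # Per-CPU counters are the purely numeric fields after "0:".
             for (i = 2; i <= NF && $i ~ /^[0-9]+$/; i++)
                 sum += $i
             print sum + 0
             exit
         }' "${1:-/proc/interrupts}"
}
```

Calling it twice a few seconds apart (by wall clock) and comparing the two numbers shows whether timer interrupts are still being delivered; identical values would be consistent with `sleep` never returning.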
Sorry for the delay, I was on vacation for some time and had no access to the machine...

I can't disable 'CONFIG_X86_PM_TIMER' because the option to deselect it got removed some time ago, IIRC. At least it gets automatically re-added to .config or set to CONFIG_X86_PM_TIMER=y if I try.

Interrupts do not work. time sleep 1 hangs forever.
OK, let's use a rude method :) You can delete the line containing '&timer_pmtmr_init,' in arch/i386/kernel/timers/timer.c.

I now know the PM timer doesn't work. I guess the LPC's config space isn't completely restored on resume, so the LPC maps the PM timer's I/O port to a different position, which means the I/O port can't be decoded.
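If editing by hand is inconvenient, the line can also be filtered out with grep. This is a sketch, not a tested patch: the function name without_pmtmr_init is hypothetical, and it assumes the string '&timer_pmtmr_init,' appears only on the line to be dropped. It writes the result to stdout so the output can be reviewed before replacing the original file.

```shell
#!/bin/sh
# without_pmtmr_init FILE
# Print FILE with every line containing the literal string
# "&timer_pmtmr_init," removed. The input file is left untouched.
without_pmtmr_init() {
    grep -v -F -e '&timer_pmtmr_init,' "$1"
}
```

For example, `without_pmtmr_init arch/i386/kernel/timers/timer.c > timer.c.new`, then compare the two files before copying the new one into place and rebuilding.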
Can you attach the lspci -xxx output before/after resume?
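One way to collect and compare the two dumps is sketched below. The function names are hypothetical. `lspci -xxx` prints a hex dump of the whole PCI configuration space; it normally needs root, and the pciutils documentation warns that reading some regions can upset certain devices.

```shell
#!/bin/sh
# save_pci_dump FILE
# Store the full config-space hex dump of all PCI devices in FILE.
save_pci_dump() {
    lspci -xxx > "$1"
}

# compare_pci_dumps BEFORE AFTER
# Print a unified diff of two saved dumps.
# Exit status follows diff: 0 if identical, 1 if they differ.
compare_pci_dumps() {
    diff -u "$1" "$2"
}
```

Run `save_pci_dump before.txt` before suspending and `save_pci_dump after.txt` after resume; `compare_pci_dumps before.txt after.txt` then shows any registers that were not restored. Since nothing is written to disk after resume on this machine, the second dump may need to go to a tmpfs or be photographed.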
Please reopen this bug if:
- it is still present in kernel 2.6.17 and
- you can provide the requested data.