Bug 77201
Created attachment 137871 [details]
CPU info
Created attachment 137881 [details]
dmidecode
Created attachment 137891 [details]
lspci
Created attachment 137901 [details]
lsmod
Make that 3.15-rc8 (3.15-rc5 was a typo). Does CPU offline/online (via sysfs) work on this machine with 3.15-rc8? I assume you wanted me to try it with powernow-k8 loaded. When doing 'echo 0 > /sys/devices/system/cpu/cpu1/online' the machine keeps working and I can see that cpu1 is gone from /proc/interrupts (as stated in cpu-hotplug.txt). After trying to get the cpu online again with 'echo 1 > /sys/devices/system/cpu/cpu1/online' the machine hangs. CPU0 can't be put offline as there is no /sys/devices/system/cpu/cpu0/online even when booting with cpu0_hotplug, but I suppose that's expected. > After trying to get the cpu online again ... the machine hangs.
Thanks for testing.
So the answer is "no", cpu online/offline does not work.
so the cpu offline/online works when powernow-k8 is UN loaded? If powernow-k8 is UN loaded then cpu offline/online works fine. Viresh, Srivatsa, this looks like fallout of some recent core cpufreq changes related to CPU offline/online. Can you please have a look at this bug entry? Hi, I took a quick look at the CPU hotplug path in both cpufreq.c and powernow-k8.c, but nothing really appeared to stand out. Does reverting commit c3274763bfc (cpufreq: powernow-k8: Initialize per-cpu data-structures properly) help? I just want to know if that is a suspect or not, since that commit went in around 3.14 and is the only relevant commit in powernow-k8 driver during that time. I'll attach the reverted patch with this bugzilla for your reference. Also, please give the output of: cat /sys/devices/system/cpu/cpuNum/cpufreq/related_cpus cat /sys/devices/system/cpu/cpuNum/cpufreq/affected_cpus for both CPUs 0 and 1 (just replace cpuNum with 0 and 1). One other thing: what is the exact version of the earliest failing kernel? Is it vanilla 3.14 or a stable version of 3.14 (eg: 3.14.x)? (For pass/fail testing, please use the sysfs method with powernow-k8 loaded, like you did in comment 7. Once we get this fixed, you can try the full suspend/resume tests). Thank you! Regards, Srivatsa S. Bhat Created attachment 138661 [details]
Revert-of-commit-c3274763bfc.patch
Patch to revert commit c3274763bfc (cpufreq: powernow-k8: Initialize per-cpu data-structures properly).
I have tested v3.15-7257-g963649d and applied the patch you attached and it still hangs when trying to bring the cpu online. The earliest 3.14 I could quickly test right now was 3.14.1 (provided by the distro I'm using) and it also fails when trying to bring the cpu online. If it helps, 3.13.8 worked fine (suspend and hibernate, didn't actually test cpu offline/online). Here is the other information you requested: cat /sys/devices/system/cpu/cpu0/cpufreq/related_cpus 0 1 cat /sys/devices/system/cpu/cpu0/cpufreq/affected_cpus 0 1 cat /sys/devices/system/cpu/cpu1/cpufreq/related_cpus 0 1 cat /sys/devices/system/cpu/cpu1/cpufreq/affected_cpus 0 1 Hi, Can you please add these to top of your drivers/cpufreq/Makefile ccflags-y := -DDEBUG ccflags-y += -DVERBOSE_DEBUG16:34 and then give us bootlogs & offline+online logs? Hi, Here are a few more suggestions you can try: 1. Use either performance or powersave governor: echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor echo performance > /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor (You can use either performance or powersave; you don't have to try both). The ondemand governor (which is usually the default) is complex and uses timers, workqueues etc and interacts with the cpufreq CPU hotplug code, whereas the above 2 governors are extremely simple and don't have any complex interactions with hotplug. So trying them would help isolate the problem and understand whether it is a cpufreq core issue or a cpufreq governor issue. 2. Enable lockdep debugging, specifically the config options mentioned below, so that we'll know if we have a locking issue: CONFIG_LOCKDEP_SUPPORT=y CONFIG_DEBUG_RT_MUTEXES=y CONFIG_DEBUG_PI_LIST=y CONFIG_DEBUG_SPINLOCK=y CONFIG_DEBUG_MUTEXES=y CONFIG_DEBUG_LOCK_ALLOC=y CONFIG_PROVE_LOCKING=y CONFIG_LOCKDEP=y CONFIG_DEBUG_ATOMIC_SLEEP=y You can try the above 2 things along with the suggestions Viresh gave, in a single run, and let us know how it goes. Regards, Srivatsa S. Bhat I have done as asked and added the ccflags-y to the top of drivers/cpufreq/Makefile and enabled the specified config options. I have separated the logs in 3 files: boot log, governor change (ondemand->performance) log and cpu offline/online log. When using the performance governor, cpu offline/online works fine, but if afterterwards I try to change the governor back to ondemand the machine hangs, it doesn't matter for which cpu I try to change the governor first. I'm also attaching the config I used, just in case anyone wants to check which config options were used. Created attachment 138871 [details]
Boot log
Created attachment 138881 [details]
Governor change (ondemand -> performance)
Created attachment 138891 [details]
CPU offline/online
Created attachment 138901 [details]
config file
Hi,
Have you missed the most important log? Offline/online with
Hang..
In your case both CPUs share clock line and so it doesn't matter
which cpu you play with.
Sent from my iPad
> On 10-Jun-2014, at 6:14 pm, bugzilla-daemon@bugzilla.kernel.org wrote:
>
> https://bugzilla.kernel.org/show_bug.cgi?id=77201
>
> --- Comment #17 from registosites@hotmail.com ---
> I have done as asked and added the ccflags-y to the top of
> drivers/cpufreq/Makefile and enabled the specified config options.
>
> I have separated the logs in 3 files: boot log, governor change
> (ondemand->performance) log and cpu offline/online log.
>
> When using the performance governor, cpu offline/online works fine, but if
> afterterwards I try to change the governor back to ondemand the machine
> hangs,
> it doesn't matter for which cpu I try to change the governor first.
>
> I'm also attaching the config I used, just in case anyone wants to check
> which
> config options were used.
>
> --
> You are receiving this mail because:
> You are on the CC list for the bug.
> Hi,
>
> Have you missed the most important log? Offline/online with
> Hang..
>
> In your case both CPUs share clock line and so it doesn't matter
> which cpu you play with.
I don't know how to get the log from when the machine hangs, I've tried:
echo 1 > /sys/devices/system/cpu/cpu1/online ; dmesg > hang.log
When trying to execute that the machine hangs and I have to do a hard reset. After booting and fsck I don't have any traces of hang.log in the filesystem.
If there is any other way to get the logs I'll need guidance or pointers on how to do it. In any case I'm attaching the log of cpu offline with the ondemand cpu governor and powernow-k8 loaded.
Created attachment 138911 [details]
CPU offline, powenow-k8 loaded, ondemand cpu gov.
Probably just do cat /proc/kmsg & offline then online cpu and simply copy logs from minicom ? Another (although crude) method: On most setups, you can go to the console by pressing Ctrl+Alt+F3 or any of the function keys. Go to the console this way, login and run the cpu offline/online command. The kernel logs will get printed directly in front of you on the console. Then take a picture and attach it with the bugzilla :-) Of course, choose whatever method is convenient for you. Regards, Srivatsa S. Bhat I've tried using the crude log method, with cat /proc/kmsg & and then blindly typing the commands to offline/online the cpu. The result is attached as a picture. Created attachment 138931 [details]
CPU online hang log
Did 3.13 work for you correctly? I have just tested 3.13.8* and I can confirm that cpu offline/online works without problems. * This is the latest 3.13 I have quick and easy access to (it is the last 3.13 provided by my distro). Hi, Thanks for your efforts !! We couldn't get the complete picture as there are just too many prints wanting to fit on screen :) Can you try this to get the right set of prints for us, that way we might be able to focus in the right direction: - Remove all pr_debug() lines from powernow-k8 driver - And apply attached patch for cpufreq core, only reducing number of print messages. (Rebased over 3.15) And then give us another screenshot.. Also some queries about your current logs: - I hope this was the last message you saw on screen and it just became unresponsive? freq_table: target index is 0, freq is:2200000 kHz and the expected ones after this are: powernow_k8: targ: cpu 0, 2200000 kHz, min 800000, max 2200000 powernow_k8: targ: curr fid 0x8, vid 0x15 powernow_k8: cpu 0 transition to index 0 powernow_k8: table matched fid 0xe, giving vid 0x12 powernow_k8: cpu 0, changing to fid 0xe, vid 0x12 So, it looks like the CPU did came back and something happened while changing freq to max. Somehow this problem is related to something special being done in your driver. We don't see this problem otherwise for other platforms. One thing i could figure out is scheduling a *work* for changing frequencies but I am not sure if the problem is related to that.. I tried to have a look at what changed between 3.13.8 and 3.14, and couldn't figure out anything special that might end up in this issue :( If you couldn't get anything conclusive with above tests then there might be some chances that it *isn't* related to cpufreq and some other changes in kernel are responsible. The best we can try is: get only cpufreq back to the old state, i.e. 3.13.8, by reverting commits and try again.. few reverts were required for this and to simplify your work I have created a branch with all reverts required. git://git.linaro.org/people/viresh.kumar/mylinux.git powernow-k8-debugging NOTE: We can still see some diff this way: git diff v3.13.8..powernow-k8-debugging -- drivers/cpufreq/ -- include/linux/cpufreq.h But there shouldn't be any *functional* change for cpufreq.c/governors/freq-table.c/powernow-k8 drivers and all cpufreq files do compile.. (haven't tried compiling powernow-k8).. Let me know if this solves your problem or make it worst :) Created attachment 139091 [details]
light-weight debug prints for cpufreq core: for debugging this problem
(In reply to Viresh Kumar from comment #31) > Also some queries about your current logs: > - I hope this was the last message you saw on screen and it just became > unresponsive? > > freq_table: target index is 0, freq is:2200000 kHz > > and the expected ones after this are: > > powernow_k8: targ: cpu 0, 2200000 kHz, min 800000, max 2200000 > powernow_k8: targ: curr fid 0x8, vid 0x15 > powernow_k8: cpu 0 transition to index 0 > powernow_k8: table matched fid 0xe, giving vid 0x12 > powernow_k8: cpu 0, changing to fid 0xe, vid 0x12 > > So, it looks like the CPU did came back and something happened while changing > freq to max. > > Somehow this problem is related to something special being done in your > driver. We don't see this problem otherwise for other platforms. > Right.. I spent all night yesterday trying to figure out what could be the bug, but I didn't find any leads :-( > One thing i could figure out is scheduling a *work* for changing frequencies > but I am not sure if the problem is related to that.. > Yeah, that looked odd to me too! > I tried to have a look at what changed between 3.13.8 and 3.14, and couldn't > figure out anything special that might end up in this issue :( > Not sure if any workqueue changes between those versions have any effect on the CPU hotplug path for powernow-k8 cpufreq driver. > > If you couldn't get anything conclusive with above tests then there might be > some chances that it *isn't* related to cpufreq and some other changes in > kernel are responsible. The best we can try is: get only cpufreq back to the > old state, i.e. 3.13.8, by reverting commits and try again.. > I agree. I was about to suggest something similar. > few reverts were required for this and to simplify your work I have created > a branch with all reverts required. > > git://git.linaro.org/people/viresh.kumar/mylinux.git powernow-k8-debugging > Since Viresh has already set this up, you can try this. If that doesn't help, I'll give you a list of possible suspect commits in cpufreq between the 2 versions to try out. The ones at the top of my list would be: 12478cf0c55e (cpufreq: Make sure frequency transitions are serialized) e0b3165ba521 (cpufreq: add 'freq_table' in struct cpufreq_policy) 4e97b631f24 (cpufreq: Initialize governor for a new policy under policy->rwsem) 1c0ca90207 (cpufreq: don't call cpufreq_update_policy() on CPU addition) 6f1e4efd882 (cpufreq: Fix timer/workqueue corruption by protecting reading governor_enabled) Regards, Srivatsa S. Bhat (In reply to Srivatsa S. Bhat from comment #33) > Since Viresh has already set this up, you can try this. If that doesn't > help, I'll give you a list of possible suspect commits in cpufreq between What else can be there once I have almost got it back to 3.13.8 ? > the 2 versions to try out. The ones at the top of my list would be: > 12478cf0c55e (cpufreq: Make sure frequency transitions are serialized) > > e0b3165ba521 (cpufreq: add 'freq_table' in struct cpufreq_policy) Remember, he is getting the problem on 3.14 itself and so no fun playing with patches that went into 3.15, right? And if I am not wrong these went recently only. (In reply to Viresh Kumar from comment #34) > (In reply to Srivatsa S. Bhat from comment #33) > > > Since Viresh has already set this up, you can try this. If that doesn't > > help, I'll give you a list of possible suspect commits in cpufreq between > > What else can be there once I have almost got it back to 3.13.8 ? > No, you misunderstood my intention. Of course your branch is a much stricter one with all new cpufreq commits reverted. I was just trying to point out commits which came after 3.13.8, in case even your branch (3.13.8 along with cpufreq reverts) ends up passing his tests. In that case we'll have to narrow down *within* the commits between v3.13.8 and v3.14.1, no? My list was to help with that. > > the 2 versions to try out. The ones at the top of my list would be: > > > 12478cf0c55e (cpufreq: Make sure frequency transitions are serialized) > > > > e0b3165ba521 (cpufreq: add 'freq_table' in struct cpufreq_policy) > > Remember, he is getting the problem on 3.14 itself and so no fun playing > with patches that went into 3.15, right? And if I am not wrong these went > recently only. Right, sorry, I was doing a diff between 3.13.8 and 3.15-rc8 (based on his first report) by mistake. Ignore these 2 commits, they are recent. But the other ones I listed came in between v3.13.8 and v3.14, so they might be worth trying. Regards, Srivatsa S. Bhat (In reply to Srivatsa S. Bhat from comment #35) > No, you misunderstood my intention. Of course your branch is a much > stricter one with all new cpufreq commits reverted. > > I was just trying to point out commits which came after 3.13.8, in case even > your branch (3.13.8 along with cpufreq reverts) ends up passing his tests. > In that case we'll have to narrow down *within* the commits between v3.13.8 > and v3.14.1, no? My list was to help with that. Ahh, sorry.. But in that case, wouldn't a git bisect between v3.13.8..my-branch would be easier? (In reply to Viresh Kumar from comment #36) > (In reply to Srivatsa S. Bhat from comment #35) > > No, you misunderstood my intention. Of course your branch is a much > > stricter one with all new cpufreq commits reverted. > > > > I was just trying to point out commits which came after 3.13.8, in case > even > > your branch (3.13.8 along with cpufreq reverts) ends up passing his tests. > > In that case we'll have to narrow down *within* the commits between v3.13.8 > > and v3.14.1, no? My list was to help with that. > > Ahh, sorry.. > > But in that case, wouldn't a git bisect between v3.13.8..my-branch would be > easier? Yes, a git bisect would have been ideal, but apparently it is failing due to build errors in intermediate commits (see his first report and bisect log) :-( So we might have to manually pick suspect commits to test... Regards, Srivatsa S. Bhat (In reply to Srivatsa S. Bhat from comment #37) > (In reply to Viresh Kumar from comment #36) > > But in that case, wouldn't a git bisect between v3.13.8..my-branch would be s/3.13.8/3.14 > > easier? > > Yes, a git bisect would have been ideal, but apparently it is failing due to > build errors in intermediate commits (see his first report and bisect log) > :-( > So we might have to manually pick suspect commits to test... Does the above change make it any better ? We only have few revert patches (GOOD) on top of 3.14 (BAD) and can easily make out which one is making 3.14 good :) (In reply to Viresh Kumar from comment #38) > (In reply to Srivatsa S. Bhat from comment #37) > > (In reply to Viresh Kumar from comment #36) > > > > But in that case, wouldn't a git bisect between v3.13.8..my-branch would > be > > s/3.13.8/3.14 > > > > easier? > > > > Yes, a git bisect would have been ideal, but apparently it is failing due > to > > build errors in intermediate commits (see his first report and bisect log) > > :-( > > So we might have to manually pick suspect commits to test... > > Does the above change make it any better ? > > We only have few revert patches (GOOD) on top of 3.14 (BAD) and can easily > make out which one is making 3.14 good :) I hope so :-) Regards, Srivatsa S. Bhat > Hi, > > Thanks for your efforts !! > > We couldn't get the complete picture as there are just too many prints > wanting to fit on screen :) > > Can you try this to get the right set of prints for us, that way we might be > able to focus in the right direction: > > - Remove all pr_debug() lines from powernow-k8 driver > - And apply attached patch for cpufreq core, only reducing number of print > messages. (Rebased over 3.15) I have deleted all pr_debug() lines and recompiled with all the patches, modifications and config options from previous posts. > > And then give us another screenshot.. The new screenshot is attached. After bringing the cpu online I got back to a prompt but the machine hangs. > > Also some queries about your current logs: > - I hope this was the last message you saw on screen and it just became > unresponsive? > > freq_table: target index is 0, freq is:2200000 kHz > That was the last output before the machine became unresponsive. > and the expected ones after this are: > > powernow_k8: targ: cpu 0, 2200000 kHz, min 800000, max 2200000 > powernow_k8: targ: curr fid 0x8, vid 0x15 > powernow_k8: cpu 0 transition to index 0 > powernow_k8: table matched fid 0xe, giving vid 0x12 > powernow_k8: cpu 0, changing to fid 0xe, vid 0x12 > > So, it looks like the CPU did came back and something happened while changing > freq to max. > > Somehow this problem is related to something special being done in your > driver. We don't see this problem otherwise for other platforms. > > One thing i could figure out is scheduling a *work* for changing frequencies > but I am not sure if the problem is related to that.. > > I tried to have a look at what changed between 3.13.8 and 3.14, and couldn't > figure out anything special that might end up in this issue :( > > > If you couldn't get anything conclusive with above tests then there might be > some chances that it *isn't* related to cpufreq and some other changes in > kernel are responsible. The best we can try is: get only cpufreq back to the > old state, i.e. 3.13.8, by reverting commits and try again.. > > few reverts were required for this and to simplify your work I have created > a branch with all reverts required. > > git://git.linaro.org/people/viresh.kumar/mylinux.git powernow-k8-debugging > > NOTE: We can still see some diff this way: > git diff v3.13.8..powernow-k8-debugging -- drivers/cpufreq/ -- > include/linux/cpufreq.h > > But there shouldn't be any *functional* change for > cpufreq.c/governors/freq-table.c/powernow-k8 drivers and all cpufreq files > do compile.. (haven't tried compiling powernow-k8).. > > Let me know if this solves your problem or make it worst :) I didn't manage to compile this one, it fails with: CC drivers/cpuidle/sysfs.o CC lib/dynamic_debug.o CC [M] drivers/cpufreq/acpi-cpufreq.o CC drivers/cpuidle/governors/ladder.o drivers/cpufreq/acpi-cpufreq.c: In function ‘store_boost’: drivers/cpufreq/acpi-cpufreq.c:160:26: error: ‘struct cpufreq_driver’ has no member named ‘boost_supported’ if (!acpi_cpufreq_driver.boost_supported) ^ drivers/cpufreq/acpi-cpufreq.c: In function ‘show_cpb’: drivers/cpufreq/acpi-cpufreq.c:180:49: error: ‘struct cpufreq_driver’ has no member named ‘boost_enabled’ return sprintf(buf, "%u\n", acpi_cpufreq_driver.boost_enabled); ^ drivers/cpufreq/acpi-cpufreq.c: In function ‘boost_notify’: drivers/cpufreq/acpi-cpufreq.c:540:37: error: ‘struct cpufreq_driver’ has no member named ‘boost_enabled’ boost_set_msrs(acpi_cpufreq_driver.boost_enabled, cpumask); ^ drivers/cpufreq/acpi-cpufreq.c: At top level: drivers/cpufreq/acpi-cpufreq.c:897:2: error: unknown field ‘set_boost’ specified in initializer .set_boost = _store_boost, ^ drivers/cpufreq/acpi-cpufreq.c:897:2: warning: excess elements in struct initializer drivers/cpufreq/acpi-cpufreq.c:897:2: warning: (near initialization for ‘acpi_cpufreq_driver’) drivers/cpufreq/acpi-cpufreq.c: In function ‘acpi_cpufreq_boost_init’: drivers/cpufreq/acpi-cpufreq.c:908:22: error: ‘struct cpufreq_driver’ has no member named ‘boost_supported’ acpi_cpufreq_driver.boost_supported = true; ^ drivers/cpufreq/acpi-cpufreq.c:909:22: error: ‘struct cpufreq_driver’ has no member named ‘boost_enabled’ acpi_cpufreq_driver.boost_enabled = boost_state(0); ^ drivers/cpufreq/acpi-cpufreq.c:913:37: error: ‘struct cpufreq_driver’ has no member named ‘boost_enabled’ boost_set_msrs(acpi_cpufreq_driver.boost_enabled, ^ drivers/cpufreq/acpi-cpufreq.c: In function ‘show_cpb’: drivers/cpufreq/acpi-cpufreq.c:181:1: warning: control reaches end of non-void function [-Wreturn-type] } ^ scripts/Makefile.build:314: recipe for target 'drivers/cpufreq/acpi-cpufreq.o' failed make[2]: *** [drivers/cpufreq/acpi-cpufreq.o] Error 1 scripts/Makefile.build:455: recipe for target 'drivers/cpufreq' failed make[1]: *** [drivers/cpufreq] Error 2 make[1]: *** Waiting for unfinished jobs.... CC drivers/cpuidle/governors/menu.o CC lib/nlattr.o CC net/core/rtnetlink.o LD drivers/cpuidle/governors/built-in.o LD drivers/cpuidle/built-in.o Makefile:841: recipe for target 'drivers' failed make: *** [drivers] Error 2 make: *** Waiting for unfinished jobs.... CC net/core/utils.o CC lib/average.o CC lib/cpu_rmap.o CC lib/dynamic_queue_limits.o Created attachment 139111 [details]
Cleaner hang log
(In reply to registosites from comment #40) > That was the last output before the machine became unresponsive. Hmm, so it looks like something happened while changing frequency and core is doing things correctly otherwise. > I didn't manage to compile this one, it fails with: > > CC drivers/cpuidle/sysfs.o > CC lib/dynamic_debug.o > CC [M] drivers/cpufreq/acpi-cpufreq.o > CC drivers/cpuidle/governors/ladder.o > drivers/cpufreq/acpi-cpufreq.c: In function ‘store_boost’: > drivers/cpufreq/acpi-cpufreq.c:160:26: error: ‘struct cpufreq_driver’ has no I intentionally didn't revert unrelated drivers completely and so this. But can't you just disable acpi-cpufreq driver and test? I thought yes, and so did this. Otherwise I would update my branch for you. Ahh ok, I'll disable acpi-cpufreq and give it another try. I think acpi-cpufreq is needed, unless I'm not seeing this correctly (which is quite likely). Using nconfig and asking for SymInfo on AMD Opteron/Athlon64 PowerNow! (which is required to test the problem at hand) I get: CONFIG_X86_POWERNOW_K8: (...) Depends on: CPU_FREQ [=y] && X86 [=y] && ACPI [=y] && ACPI_PROCESSOR [=m] && X86_ACPI_CPUFREQ [=m] Doesn't this mean acpi-cpufreq is needed? (In reply to registosites from comment #44) > Depends on: CPU_FREQ [=y] && X86 [=y] && ACPI [=y] && ACPI_PROCESSOR [=m] && > X86_ACPI_CPUFREQ [=m] I don't have a clue why does it depend on that :( Anyway, I have updated my branch with two more commits (at top) and it does compile now. In case these revert fixes it for you and you try to identify which revert fixed it, you might also need to apply top revert to get things compiled. No luck with these also, machine still hangs after bringing the cpu online :( (In reply to registosites from comment #46) > No luck with these also, machine still hangs after bringing the cpu online :( I expected this as as I couldn't find anything significant in cpufreq side. @Srivatsa: Please have a look at diff between v3.13.8..my-branch -- drivers/cpufreq to see if I have missed something that could break stuff here. Now what? I don't think something can be wrong with workqueues and it might be some ACPI problem. Not very sure though. Can you try bisecting between 3.13 and 3.14 again? And try fixing/skipping commits with compile failures ?? Its very difficult to narrow it down otherwise. @Rafael: any guesses? Hmm, this just became even more hard :-( Viresh, sure, I'll check the diff tomorrow to see if there is any obvious breakage. Meanwhile, can the bug reporter please test these commits that I mentioned earlier? (6f1e4efd882 is my first suspect commit after 3.13.8) 6f1e4efd882 (cpufreq: Fix timer/workqueue corruption by protecting reading governor_enabled) 1c0ca90207 (cpufreq: don't call cpufreq_update_policy() on CPU addition) 4e97b631f24 (cpufreq: Initialize governor for a new policy under policy->rwsem) I know that the issue might not be in cpufreq (especially after your test results from Viresh's branch). But I still I have some tiny hope that we might have missed something in cpufreq... Besides, bisecting within the cpufreq commits is a lot faster than bisecting everything between 3.13.8 and 3.14, in case the bug is really in cpufreq. By the way, testing the above commits means that you need to test 1 commit before the commits mentioned above and see if it works, and then go to the mentioned commit and see if it fails. Regards, Srivatsa S. Bhat By the way, another approach I have in mind is to remove the usage of work_on_cpu() from the powernow-k8 driver (since that's the only noticeably different thing about this driver), and check if it solves the problem. That way, powernow-k8 would become mostly similar to all other cpufreq drivers and this might make it easier to isolate the issue. (Of course work_on_cpu() must be remove with care, such that the correctness of the code is still maintained). Regards, Srivatsa S. Bhat (In reply to Srivatsa S. Bhat from comment #48) > Meanwhile, can the bug reporter please test these commits that I mentioned > earlier? (6f1e4efd882 is my first suspect commit after 3.13.8) > > 6f1e4efd882 (cpufreq: Fix timer/workqueue corruption by protecting reading > governor_enabled) > > 1c0ca90207 (cpufreq: don't call cpufreq_update_policy() on CPU addition) > > 4e97b631f24 (cpufreq: Initialize governor for a new policy under > policy->rwsem) I'll test the commits before these and these commits and post back when I have the results. Should I also test e0b3165ba521 (cpufreq: add 'freq_table' in struct cpufreq_policy) 12478cf0c55e (cpufreq: Make sure frequency transitions are serialized) in case the previous ones work/don't work? (In reply to Srivatsa S. Bhat from comment #49) > By the way, another approach I have in mind is to remove the usage of > work_on_cpu() from the powernow-k8 driver (since that's the only noticeably > different thing about this driver), and check if it solves the problem. That is a good idea. At least we could narrow down the issue this way. (In reply to registosites from comment #50) > (In reply to Srivatsa S. Bhat from comment #48) > > Should I also test > > e0b3165ba521 (cpufreq: add 'freq_table' in struct cpufreq_policy) > 12478cf0c55e (cpufreq: Make sure frequency transitions are serialized) > > in case the previous ones work/don't work? Yes, please. (In reply to Srivatsa S. Bhat from comment #48) > Meanwhile, can the bug reporter please test these commits that I mentioned > earlier? (6f1e4efd882 is my first suspect commit after 3.13.8) > > 6f1e4efd882 (cpufreq: Fix timer/workqueue corruption by protecting reading > governor_enabled) It was already reverted in my branch > 1c0ca90207 (cpufreq: don't call cpufreq_update_policy() on CPU addition) Went in 3.15 > 4e97b631f24 (cpufreq: Initialize governor for a new policy under > policy->rwsem) Also reverted in my branch > I know that the issue might not be in cpufreq (especially after your test > results from Viresh's branch). But I still I have some tiny hope that we > might have missed something in cpufreq... Besides, bisecting within the > cpufreq commits is a lot faster than bisecting everything between 3.13.8 and > 3.14, in case the bug is really in cpufreq. > > By the way, testing the above commits means that you need to test 1 commit > before the commits mentioned above and see if it works, and then go to the > mentioned commit and see if it fails. Actually I couldn't understand how this may help? Sorry for missing what you had in mind :( (In reply to Srivatsa S. Bhat from comment #49) > By the way, another approach I have in mind is to remove the usage of > work_on_cpu() from the powernow-k8 driver (since that's the only noticeably > different thing about this driver), and check if it solves the problem. I wouldn't bet my 10 cents on that :). Also I am not sure if people added a work there just for fun, there must be some constraints and we might see crashes or hangs.. But yeah, maybe we can try that as well.. I was thinking about this earlier but never shared as I thought it might not be worth. Logs shared by $reporter of this problem clears that cpufreq and governors did all good, more or less. And it was the target_index() which never came back.. I will try having a closer look again though. $reporter also reported that offline/online works fine with performance gov but switching back to ondemand again crashes it. Which makes me believe that cpufreq core is doing everything correctly and there is some problem while switching freq after offline/online. @Reporter (sorry we don't know your name :( ), Can you please try this? Keep the prints as you shared in last hang pic. Set governor to userspace, set any cpu to min freq, offline/online cpu1, change freq to any freq (and later max freq if earlier one works) and see how things behave.. I am quite sure it wouldn't last. If you still see a hang you might want to do this: - Add many debug prints in your target_index() path (as gov is userspace you wouldn't see them again and again). - And see where exactly you hang .. - I am quite sure work will get scheduled and so try adding as many debug prints as possible on the first go only. HTH :) (In reply to Rafael J. Wysocki from comment #52) > (In reply to registosites from comment #50) > > (In reply to Srivatsa S. Bhat from comment #48) > > > > Should I also test > > > > e0b3165ba521 (cpufreq: add 'freq_table' in struct cpufreq_policy) > > 12478cf0c55e (cpufreq: Make sure frequency transitions are serialized) > > > > in case the previous ones work/don't work? > > Yes, please. Actually no. These must have went into 3.15 and we are trying to debug 3.14 here. Did I miss anything ? (In reply to Viresh Kumar from comment #55) > (In reply to Rafael J. Wysocki from comment #52) > > (In reply to registosites from comment #50) > > > (In reply to Srivatsa S. Bhat from comment #48) > > > > > > Should I also test > > > > > > e0b3165ba521 (cpufreq: add 'freq_table' in struct cpufreq_policy) > > > 12478cf0c55e (cpufreq: Make sure frequency transitions are serialized) > > > > > > in case the previous ones work/don't work? > > > > Yes, please. > > Actually no. These must have went into 3.15 and we are trying to debug 3.14 > here. Did I miss anything ? I have finished testing the following: 6f1e4efd882 (cpufreq: Fix timer/workqueue corruption by protecting reading governor_enabled) 87ae97f10c0 (cpufreq: s3c24xx: Staticize local variable) result: both ok 1c0ca90207 (cpufreq: don't call cpufreq_update_policy() on CPU addition) d9a789c7a0 (cpufreq: Refactor cpufreq_set_policy()) result: both bad 4e97b631f24 (cpufreq: Initialize governor for a new policy under policy->rwsem) 5a7e56a5d29 (cpufreq: Initialize policy before making it available for others to use) result: both bad I have also tested setting the governor to userspace, setting the cpu to the minimum speed, cpu offline/online, set an intermediate cpu speed and then maximum speed. The machine becomes unresponsive only when trying to set the cpu to the maximum speed. If I understand correctly, the suggestion to try next is to keep the patch from comment 32, keep the changes I've done to powernow-k8 (remove all pr_debug()), keep the extra compile flags in drivers/cpufreq/Makefile and also add plenty of pr_debug() to drivers/cpufreq/cpufreq.c. I assume it is here I should add more debug output since target_index() can be found only here. If I'm correct, in which places would it make more sense to add debug output and which info should I try to output? I'm ok with doing this as long as it doesn't get too complicated, the reason is my programming skills are not that good and I have zero knowledge of the kernel's internals. (In reply to Mauro from comment #56) Hi Mauro, we finally know your name :) > I have finished testing the following: > > 6f1e4efd882 (cpufreq: Fix timer/workqueue corruption by protecting reading > governor_enabled) > 87ae97f10c0 (cpufreq: s3c24xx: Staticize local variable) > result: both ok So you tested your stuff on these two commits (i.e. no reverts hacks)? > 1c0ca90207 (cpufreq: don't call cpufreq_update_policy() on CPU addition) > d9a789c7a0 (cpufreq: Refactor cpufreq_set_policy()) > result: both bad > > > 4e97b631f24 (cpufreq: Initialize governor for a new policy under > policy->rwsem) > 5a7e56a5d29 (cpufreq: Initialize policy before making it available for > others to use) > result: both bad I don't know what we can make out of these tests :(.. @Srivatsa: ? > I have also tested setting the governor to userspace, setting the cpu to the > minimum speed, cpu offline/online, set an intermediate cpu speed and then > maximum speed. > > The machine becomes unresponsive only when trying to set the cpu to the > maximum speed. Great. I am more convinced now that it is not a cpufreq problem. It can still be something about your driver though. > If I understand correctly, the suggestion to try next is to keep the patch > from > comment 32, keep the changes I've done to powernow-k8 (remove all > pr_debug()), keep the extra compile flags in drivers/cpufreq/Makefile and > also add plenty of pr_debug() to drivers/cpufreq/cpufreq.c. Correct!! But wait, continue reading ... > I assume it is here I should add more debug output since target_index() can > be found only here. I couldn't understand that, sorry :( > If I'm correct, in which places would it make more sense > to add debug output and which info should I try to output? I wouldn't be interested in any data here in prints. Just want to know what's the last thing executed before hang.. > I'm ok with doing this as long as it doesn't get too complicated, the reason > is my programming skills are not that good and I have zero knowledge of the > kernel's internals. It doesn't look like, you have managed very well :) Okay, lets simplify things a bit.. I have updated my branch again with all changes you need.. Don't add any other change and just test it. I have added this at many locations: + pr_info("%s: %d\n", __func__, __LINE__); Will print function name and line number in file. You *will* get lots of prints now for every frequency transition.. To control these, enable only *userspace* governor from menuconfig and leave all other governors disabled. That way there wouldn't be any transition happening.. Now, repeat your earlier test with userspace governor and see which line/routine is the last one... You will come to know broadly about where you have reached.. Now to go to the exact statement, remove all prints I have added in powernow-k8 and add them only in your hotpath, and add more of them to the last set of functions executed. Lets see if we can get to the last instruction which crashed :) Probably it will be the one changing freq/voltage.. > > 6f1e4efd882 (cpufreq: Fix timer/workqueue corruption by protecting reading > > governor_enabled) > > 87ae97f10c0 (cpufreq: s3c24xx: Staticize local variable) > > result: both ok > > So you tested your stuff on these two commits (i.e. no reverts hacks)? Yes, I have tested with no revert hacks, and as suggested I have tested the commit preceding the suspect commit and the suspect commit. > Okay, lets simplify things a bit.. I have updated my branch again with all > changes you need.. Don't add any other change and just test it. > > I have added this at many locations: > > + pr_info("%s: %d\n", __func__, __LINE__); > > Will print function name and line number in file. > > You *will* get lots of prints now for every frequency transition.. To > control these, enable only *userspace* governor from menuconfig and leave > all other governors disabled. That way there wouldn't be any transition > happening.. > > Now, repeat your earlier test with userspace governor and see which > line/routine is the last one... > > You will come to know broadly about where you have reached.. Now to go to > the exact statement, remove all prints I have added in powernow-k8 and add > them only in your hotpath, and add more of them to the last set of functions > executed. > > Lets see if we can get to the last instruction which crashed :) > > Probably it will be the one changing freq/voltage.. Not much luck with this approach I'm afraid. I have compiled your updated branch with no reverts and changed the config to include only the userspace governor. Then proceeded to test as before: cat /proc/kmsg & echo 800000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed echo 0 > /sys/devices/system/cpu/cpu1/online echo 1 > /sys/devices/system/cpu/cpu1/online echo 1800000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed echo 2200000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed When trying to change the frequency to 2.2GHz the machine becomes unresponsive and there is no output :( I have noticed something that may or may not be important, it doesn't matter if I do 800M - cpu offline/online - 1.8G - 2.2G or 800M - cpu offline/online - 2.2G - 1.8G The machine always becomes unresponsive when trying to do the second frequency change after cpu offline/online. I have pictures of the process, I can attach them if you think they are important, but like I said when the machine hangs there is zero output :( (In reply to Mauro from comment #58) > Not much luck with this approach I'm afraid. I have compiled your updated > branch with no reverts and changed the config to include only the userspace > governor. > > Then proceeded to test as before: > cat /proc/kmsg & > echo 800000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed > echo 0 > /sys/devices/system/cpu/cpu1/online > echo 1 > /sys/devices/system/cpu/cpu1/online > echo 1800000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed > echo 2200000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed > > When trying to change the frequency to 2.2GHz the machine becomes > unresponsive and there is no output :( Were you getting a lot of prints during earlier frequency transitions? Its awkward as we were getting enough prints earlier, what happened now ? > I have noticed something that may or may not be important, it doesn't matter > if I do > 800M - cpu offline/online - 1.8G - 2.2G > or > 800M - cpu offline/online - 2.2G - 1.8G > > The machine always becomes unresponsive when trying to do the second > frequency change after cpu offline/online. This surely is a important observation. > I have pictures of the process, I > can attach them if you think they are important, but like I said when the > machine hangs there is zero output :( Still I want to see how bad it is now, just attach them :) (In reply to Viresh Kumar from comment #59) > > When trying to change the frequency to 2.2GHz the machine becomes > > unresponsive and there is no output :( > > Were you getting a lot of prints during earlier frequency transitions? > Its awkward as we were getting enough prints earlier, what happened now ? Yes, lots of prints, too many to fit the screen, I was referring to the exact point/step when the machine hangs. > Still I want to see how bad it is now, just attach them :) Will do. Created attachment 139461 [details]
CPU offline/online - OK
Frequency transition to 800MHz and cpu offline/online worked fine.
Created attachment 139471 [details]
Frequency transition hang
This one is what happens when I try to do the second frequency transition, that is, after:
boot -> 800M -> cpu offline/online -> 2.2G ->(1) 1.8G
(1) machine hangs here.
(In reply to Mauro from comment #62) > Created attachment 139471 [details] > Frequency transition hang > > This one is what happens when I try to do the second frequency transition, > that is, after: > > boot -> 800M -> cpu offline/online -> 2.2G ->(1) 1.8G > > (1) machine hangs here. Strange that we didn't got any prints at all :( Let me try some more hacks :) (In reply to Viresh Kumar from comment #63) > Let me try some more hacks :) Okay, I can understand that its taking more time than it should have but its probably more tricky that we actually thought.. Don't loose hope, we can still crack this :) I have added one more commit now. Idea is to stop ongoing frequency transition at any point of time we want. By default everything will work as it was working. So, continue upto the last step, where you crashed, 1.8G .. Before changing last freq do this: echo 1 > /sys/kernel/debug/stop_transition Follow this in code: + /* HACK THIS */ + if (stop_transition) + return 0; So, we will return from whichever part of powernowk8_target_fn() we want. I have placed it at some point now, you can drag it in whichever direction you want. If it works, then drag it later into the code. (In reply to Viresh Kumar from comment #64) > (In reply to Viresh Kumar from comment #63) > > Let me try some more hacks :) > > Okay, I can understand that its taking more time than it should have but its > probably more tricky that we actually thought.. > > Don't loose hope, we can still crack this :) No worries, the bug will be nuked from orbit sooner or later :D > > I have added one more commit now. Idea is to stop ongoing frequency > transition at any point of time we want. > > By default everything will work as it was working. So, continue upto the > last step, where you crashed, 1.8G .. > > Before changing last freq do this: > > echo 1 > /sys/kernel/debug/stop_transition > > > Follow this in code: > > + /* HACK THIS */ > + if (stop_transition) > + return 0; > > So, we will return from whichever part of powernowk8_target_fn() we want. > I have placed it at some point now, you can drag it in whichever direction > you want. > > If it works, then drag it later into the code. I'll move that code around and see if I can find where it crashes. Should I move it around only inside the function where you put it now (transition_frequency_fidvid()) or also try other functions? On that note, since I'm going to be changing only one file and it is built as a module, is it ok if I do only: make drivers/cpufreq/powernow-k8.ko? Also, what is the best and fastest way, if possible, to compile only what changed? Hi Mauro, You can continue debugging with Viresh suggestions, since that's one of the ways to debug this issue. When time permits, can you please try this commit: commit 5bdfff96c69a4d5ab9c49e60abf9e070ecd2acbb Author: Lai Jiangshan <laijs@cn.fujitsu.com> Date: Sat Feb 15 22:02:28 2014 +0800 workqueue: ensure @task is valid across kthread_stop() Of course, try one commit before it and see if it works, and then try the above mentioned commit and see if it fails. The reason I chose this commit is because there are only 2 commits in the workqueue code between 3.13.8 and 3.14.1, and the other commit is not relevant to your case because it is active only if CONFIG_DEBUG_OBJECTS_WORK is set (and this option is not set in your .config). Thank you! Regards, Srivatsa S. Bhat For completeness, the commit immediately preceding the one chosen by Srivatsa is fada94ee64e6 workqueue: add args to workqueue lockdep name so Mauro, please check if fada94ee64e6 works and then if 5bdfff96c69a fails. (In reply to Rafael J. Wysocki from comment #67) > For completeness, the commit immediately preceding the one chosen by > Srivatsa is > > fada94ee64e6 workqueue: add args to workqueue lockdep name > > so Mauro, please check if fada94ee64e6 works and then if 5bdfff96c69a fails. I have tested both, and both hang. I have used the latest test method of using the userspace governor and both hang on the second frequency change attempt. What I'm trying to do now is get the full text output (I've seen something scroll by that might be interesting). I've made a serial cable and testing if it is working. Most text is ok but there are a few glitches here and there, which is expected I guess. If there are any "optimal" settings do tell :) I have started what Viresh requested and the first try worked, but I haven't started moving the debug code to try to find where things go wrong. I'd say my initial bisect might be very close to the source of the problem, the reason I say this is because the machine goes into suspend just fine, but when trying to resume it hangs (that was my initial testing procedure), knowing what I know now I'd say a good guess is it hangs when trying to bring the cpu online. As a last note, I won't have much time to look into this until the weekend, so radio silence until then doesn't mean I quit ;) (In reply to Mauro from comment #68) > (In reply to Rafael J. Wysocki from comment #67) > > For completeness, the commit immediately preceding the one chosen by > > Srivatsa is > > > > fada94ee64e6 workqueue: add args to workqueue lockdep name > > > > so Mauro, please check if fada94ee64e6 works and then if 5bdfff96c69a > fails. > > I have tested both, and both hang. I have used the latest test method of > using the userspace governor and both hang on the second frequency change > attempt. Well, the scheduler changes are left then. > I'd say my initial bisect might be very close to the source of the problem, > the reason I say this is because the machine goes into suspend just fine, > but when trying to resume it hangs (that was my initial testing procedure), > knowing what I know now I'd say a good guess is it hangs when trying to > bring the cpu online. Oh, we know that for sure. :-) That's why I asked whether or not CPU online worked via sysfs. (In reply to Rafael J. Wysocki from comment #69) > Well, the scheduler changes are left then. Maybe some ACPI stuff as well ? (In reply to Mauro from comment #68) > I have started what Viresh requested and the first try worked, but I haven't > started moving the debug code to try to find where things go wrong. Great, and we reach here only after work is scheduled. So. Srivatsa lost his bet :) and its not scheduler as well.. I have updated my tree again to get to the real target, transition_fid_vid(). This is calling few routines to change freq/volt and we can check them one by one. I am quite hopeful this will work as well and next few may break . Give us a good news this time :) > I'd say my initial bisect might be very close to the source of the problem, You meant the new hack ? > the reason I say this is because the machine goes into suspend just fine, > but when trying to resume it hangs (that was my initial testing procedure), > knowing what I know now I'd say a good guess is it hangs when trying to > bring the cpu online. All thanks to Rafael for his guess. It was perfect!! (In reply to Viresh Kumar from comment #45) > I don't have a clue why does it depend on that :( Now, I know why. This can very much be some ACPI issue considering so much dependency it has on it and I am not good at ACPI :( 631e8ac powernow-k8: Add a kconfig dependency on acpi-cpufreq (In reply to Viresh Kumar from comment #71) > (In reply to Mauro from comment #68) > > I have started what Viresh requested and the first try worked, but I > haven't > > started moving the debug code to try to find where things go wrong. > > Great, and we reach here only after work is scheduled. > > So. Srivatsa lost his bet :) and its not scheduler as well.. > Haha :-) Good to know we have made some more progress :-) Regards, Srivatsa S. Bhat (In reply to Viresh Kumar from comment #72) > (In reply to Viresh Kumar from comment #45) > > I don't have a clue why does it depend on that :( > > Now, I know why. This can very much be some ACPI issue considering so much > dependency it has on it and I am not good at ACPI :( > > 631e8ac powernow-k8: Add a kconfig dependency on acpi-cpufreq It looks like the relevant routine doesn't do anything ACPI-related at all. It accesses MSRs directly it seems. (In reply to Viresh Kumar from comment #71) > > I'd say my initial bisect might be very close to the source of the problem, > > You meant the new hack ? What I meant was, when I was doing the bisect (the one I attached with the initial bug report) I was testing if the machine suspended and resumed well. At the time I had no idea of what the problem was, but now I see that it was hanging when trying to bring the cpu online (during resume). I've tested your latest changes and now the machine hangs, I'll be attaching the logs from this and last test. Created attachment 139721 [details]
From Viresh's tree, commit 0f92fd8
This one does not hang when using stop_transition.
Created attachment 139731 [details]
From Viresh's tree, commit 3610fd8
This one hangs even when using stop_transition.
Created attachment 139741 [details]
Log from commit 5bdfff96c69a
This one hangs, might not have anything interesting but there seems to be some "fireworks" going on.
Created attachment 139751 [details]
Log from commit fada94ee64e6
Same as the one from commit 5bdfff96c69a.
Created attachment 139761 [details]
Hang test script
I've made a small script to automate the test and reduce the chance of me making some error.
I'm attaching this as a reference in case later I see some debug output on screen that does not make it across the serial cable to the text log. Obviously my echos with information do not get across the cable but help in checking where things hang.
I forgot to mention before, the test with commit 3610fd8 from Viresh's tree hangs at "Changing frequency to 1.8GHz" I don't see more output, the log matches what I see on screen.
(In reply to Mauro from comment #77) > Created attachment 139731 [details] > From Viresh's tree, commit 3610fd8 > > This one hangs even when using stop_transition. Oh, this went against my expectation. We crashed before changing freq :( So this is what we have verified: - transition_frequency_fidvid() gets called and after this only we do something that crashes system. - And that isn't a call to transition_fid_vid() as we are crashing before that only. So, you need to move around: /* HACK THIS */ if (stop_transition) return 0; in transition_frequency_fidvid() and find which is the last line before which it is working. If that's a function call, go into that function and check again. Give us the last instruction that causes it.. As an sample of what you need to do, check my latest branch. But you need to do some more changes manually too. Lets crack it soon.. I have managed to narrow this down to a function, specifically: cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE); Now this is from cpufreq and not powenow-k8, I haven't tried doing anything inside cpufreq.c because I don't know if it would work just (moving the if-return) or if that would be a mistake. I'm attaching logs of stop_transition before and after cpufreq_notify_transition() as they seem to be slightly different. When the machine hangs I only see my message about changing freq to 1.8GHz, there is no more output to screen before the machine hangs. Created attachment 139791 [details]
stop_transition before cpufreq_notify_transition()
All is good, machine doesn't hang.
Created attachment 139801 [details]
stop_transition after cpufreq_notify_transition()
Machine hangs, no debug output when trying to change freq.
It looks like one of the notifiers crashes in srcu_notifier_call_chain() called from __cpufreq_notify_transition(). We need to figure out which one. (In reply to Mauro from comment #82) > I have managed to narrow this down to a function, specifically: > cpufreq_notify_transition(policy, &freqs, CPUFREQ_POSTCHANGE); I guessed this yesterday, but thought this might not be the problem as there aren't too many drivers notified for these transitions. And those rarely get modified. As Rafael also mentioned, the problem is obviously with some of the recipients of these notifiers failing for the POSTCHANGE notification. In case you don't know how this works, this is how: Some kernel drivers can register for notifiers, so that before and after every frequency tranistions notifiers would be called and those drivers can act accordingly. One of them is broken here. List of all files which are registering for these notifiers. arch/arm/kernel/smp.c arch/arm/kernel/smp_twd.c arch/arm/mach-pxa/viper.c arch/arm/mach-s3c24xx/mach-osiris-dvs.c arch/blackfin/mach-common/dpmc.c arch/cris/arch-v32/kernel/time.c arch/powerpc/oprofile/op_model_cell.c arch/sparc/kernel/time_64.c arch/x86/kernel/tsc.c arch/x86/kvm/x86.c drivers/acpi/processor_perflib.c drivers/acpi/processor_thermal.c drivers/cpufreq/cpufreq.c drivers/cpufreq/cpufreq_governor.c drivers/cpufreq/cpufreq_stats.c drivers/cpufreq/cris-artpec3-cpufreq.c drivers/cpufreq/cris-etraxfs-cpufreq.c drivers/cpufreq/loongson2_cpufreq.c drivers/cpufreq/ppc_cbe_cpufreq_pmi.c drivers/gpu/drm/tilcdc/tilcdc_drv.c drivers/i2c/busses/i2c-davinci.c drivers/i2c/busses/i2c-s3c2410.c drivers/macintosh/windfarm_cpufreq_clamp.c drivers/mmc/host/davinci_mmc.c drivers/mmc/host/s3cmci.c drivers/mtd/nand/s3c2410.c drivers/pcmcia/soc_common.c drivers/staging/tidspbridge/rmgr/drv_interface.c drivers/thermal/cpu_cooling.c drivers/tty/serial/samsung.c drivers/tty/serial/sh-sci.c drivers/video/da8xx-fb.c drivers/video/nuc900fb.c drivers/video/pxafb.c drivers/video/s3c2410fb.c drivers/video/sa1100fb.c drivers/watchdog/s3c2410_wdt.c include/linux/cpufreq.h Those that look relevant to me for your configuration: arch/x86/kernel/tsc.c arch/x86/kvm/x86.c drivers/acpi/processor_perflib.c drivers/acpi/processor_thermal.c Only first two had significant changes and there are some big changes in 3.14 around these. I am not good at that code and we need to get few more people into this discussions. Though Rafael/Srivatsa might have seen these files.. But before that we must narrow down on this.. Believe me, we are very very close :) I have created two more branches: powernow-k8-post1 and powernow-k8-post2 (Compile tested only for x86_64_defconfig) .. All debug messages/stop_transition/etc are dropped as we don't need them anymore and so drop them from your script as well.. powernow-k8-post1 has reverted all changes between 3.13.8 and 3.14 for arch/x86/kernel/tsc.c file and powernow-k8-post2 have done it for arch/x86/kvm/x86.c.. Only one of these two will work for you. Check which one, I would put my 10 cents on *-post1 branch :) Once, any one of the branches work for you, try doing a git bisect for that branch and 3.14 (Only few revert patches actually, should be easy).. And you can tell me which revert patch has *fixed* the issue for you and that's the culprit. Then we can follow on LKML, will put you in CC, on the original commit threads.. Thanks for your dedication towards fixing this up, it was really a nightmare to get down to this :) -- viresh (In reply to Viresh Kumar from comment #86) > Only one of these two will work for you. Check which one, I would put my 10 > cents on *-post1 branch :) You should have bet more, *-post2 hangs, *-post1 works. > Once, any one of the branches work for you, try doing a git bisect for that > branch and 3.14 (Only few revert patches actually, should be easy).. > > And you can tell me which revert patch has *fixed* the issue for you and > that's the culprit. I had to skip the last step of the bisect as it fails to build, I'm attaching the bisect log. It fails with: arch/x86/kernel/cpu/perf_event.c: In function ‘arch_perf_update_userpage’: arch/x86/kernel/cpu/perf_event.c:1903:23: error: ‘CYC2NS_SCALE_FACTOR’ undeclared (first use in this function) userpg->time_shift = CYC2NS_SCALE_FACTOR; ^ arch/x86/kernel/cpu/perf_event.c:1903:23: note: each undeclared identifier is reported only once for each function it appears in scripts/Makefile.build:308: recipe for target 'arch/x86/kernel/cpu/perf_event.o' failed make[3]: *** [arch/x86/kernel/cpu/perf_event.o] Error 1 make[3]: *** Waiting for unfinished jobs.... CC kernel/printk/printk.o scripts/Makefile.build:455: recipe for target 'arch/x86/kernel/cpu' failed make[2]: *** [arch/x86/kernel/cpu] Error 2 scripts/Makefile.build:455: recipe for target 'arch/x86/kernel' failed make[1]: *** [arch/x86/kernel] Error 2 Makefile:841: recipe for target 'arch/x86' failed make: *** [arch/x86] Error 2 > > Then we can follow on LKML, will put you in CC, on the original commit > threads.. > > Thanks for your dedication towards fixing this up, it was really a nightmare > to get down to this :) I'm just scratching my itch ;) I also have to thank you, and everyone else that looked into this, it looks like this is not affecting many people and yet you took the time and effort to track this down. Created attachment 139901 [details]
powernow-k8-post1 bisect log
(In reply to Mauro from comment #87) > (In reply to Viresh Kumar from comment #86) > > Only one of these two will work for you. Check which one, I would put my 10 > > cents on *-post1 branch :) > > You should have bet more, *-post2 hangs, *-post1 works. Thanks for your great results :) Last round of debugging I want you to do :) One of these commits is the culprit: 57c67da sched/clock, x86: Move some cyc2ns() code around 20d1c86 sched/clock, x86: Rewrite cyc2ns() to avoid the need to disable IRQs Can you test these commits and the ones just before them? And we will know which commit broke this for you.. As both are written by Peter Zijlstra, we can ping him back and lets see what he says. I've tried (from your tree, but I suppose that doesn't matter): 20d1c86 sched/clock, x86: Rewrite cyc2ns() to avoid the need to disable IRQs 57c67da sched/clock, x86: Move some cyc2ns() code around 5dd12c21 sched/clock, x86: Use mul_u64_u32_shr() for native_sched_clock() 62b94a0 sched/preempt: Take away preempt_enable_no_resched() from modules and all of them fail to compile. Looking at the bisect log I attached when submitting the bug report, I see that all of them are the same I had to skip because they would not compile. They all fail with: CC arch/x86/kernel/asm-offsets.s In file included from include/linux/spinlock.h:271:0, from include/linux/mmzone.h:7, from include/linux/gfp.h:4, from include/linux/slab.h:14, from include/linux/crypto.h:24, from arch/x86/kernel/asm-offsets.c:8: include/linux/spinlock_api_smp.h: In function ‘__raw_spin_lock_bh’: include/linux/spinlock_api_smp.h:134:2: error: implicit declaration of function ‘__local_bh_disable_ip’ [-Werror=implicit-function-declaration] __local_bh_disable_ip(_RET_IP_, SOFTIRQ_LOCK_OFFSET); ^ include/linux/spinlock_api_smp.h:134:34: error: ‘SOFTIRQ_LOCK_OFFSET’ undeclared (first use in this function) __local_bh_disable_ip(_RET_IP_, SOFTIRQ_LOCK_OFFSET); ^ include/linux/spinlock_api_smp.h:134:34: note: each undeclared identifier is reported only once for each function it appears in include/linux/spinlock_api_smp.h: In function ‘__raw_spin_unlock_bh’: include/linux/spinlock_api_smp.h:176:2: error: implicit declaration of function ‘__local_bh_enable_ip’ [-Werror=implicit-function-declaration] __local_bh_enable_ip(_RET_IP_, SOFTIRQ_LOCK_OFFSET); ^ include/linux/spinlock_api_smp.h:176:33: error: ‘SOFTIRQ_LOCK_OFFSET’ undeclared (first use in this function) __local_bh_enable_ip(_RET_IP_, SOFTIRQ_LOCK_OFFSET); ^ include/linux/spinlock_api_smp.h: In function ‘__raw_spin_trylock_bh’: include/linux/spinlock_api_smp.h:181:34: error: ‘SOFTIRQ_LOCK_OFFSET’ undeclared (first use in this function) __local_bh_disable_ip(_RET_IP_, SOFTIRQ_LOCK_OFFSET); ^ In file included from include/linux/spinlock_api_smp.h:190:0, from include/linux/spinlock.h:271, from include/linux/mmzone.h:7, from include/linux/gfp.h:4, from include/linux/slab.h:14, from include/linux/crypto.h:24, from arch/x86/kernel/asm-offsets.c:8: include/linux/rwlock_api_smp.h: In function ‘__raw_read_lock_bh’: include/linux/rwlock_api_smp.h:175:34: error: ‘SOFTIRQ_LOCK_OFFSET’ undeclared (first use in this function) __local_bh_disable_ip(_RET_IP_, SOFTIRQ_LOCK_OFFSET); ^ include/linux/rwlock_api_smp.h: In function ‘__raw_write_lock_bh’: include/linux/rwlock_api_smp.h:202:34: error: ‘SOFTIRQ_LOCK_OFFSET’ undeclared (first use in this function) __local_bh_disable_ip(_RET_IP_, SOFTIRQ_LOCK_OFFSET); ^ include/linux/rwlock_api_smp.h: In function ‘__raw_read_unlock_bh’: include/linux/rwlock_api_smp.h:251:33: error: ‘SOFTIRQ_LOCK_OFFSET’ undeclared (first use in this function) __local_bh_enable_ip(_RET_IP_, SOFTIRQ_LOCK_OFFSET); ^ include/linux/rwlock_api_smp.h: In function ‘__raw_write_unlock_bh’: include/linux/rwlock_api_smp.h:275:33: error: ‘SOFTIRQ_LOCK_OFFSET’ undeclared (first use in this function) __local_bh_enable_ip(_RET_IP_, SOFTIRQ_LOCK_OFFSET); ^ cc1: some warnings being treated as errors /home/msantos/build/temp/cpufreq_revert/src/linux/./Kbuild:81: recipe for target 'arch/x86/kernel/asm-offsets.s' failed make[1]: *** [arch/x86/kernel/asm-offsets.s] Error 1 Makefile:859: recipe for target 'prepare0' failed make: *** [prepare0] Error 2 (In reply to Mauro from comment #87) > (In reply to Viresh Kumar from comment #86) > > Only one of these two will work for you. Check which one, I would put my 10 > > cents on *-post1 branch :) > > You should have bet more, *-post2 hangs, *-post1 works. > Very impressive debugging, both of you! :-) Regards, Srivatsa S. Bhat (In reply to Mauro from comment #90) > I've tried (from your tree, but I suppose that doesn't matter): > 20d1c86 sched/clock, x86: Rewrite cyc2ns() to avoid the need to disable IRQs > 57c67da sched/clock, x86: Move some cyc2ns() code around > 5dd12c21 sched/clock, x86: Use mul_u64_u32_shr() for native_sched_clock() > 62b94a0 sched/preempt: Take away preempt_enable_no_resched() from modules > > and all of them fail to compile. Looking at the bisect log I attached when > submitting the bug report, I see that all of them are the same I had to skip > because they would not compile. > Can we use the same "1 commit before, 1 commit after" method here? That is, for each commit that you want to test, try one commit before it and see if it passes, and one commit after it and see if it fails. That should give us a good indication of whether the commit in question is problematic or not. Of course you'll need to use the mainline tree for this, which doesn't include any of the reverts. Regards, Srivatsa S. Bhat (In reply to Srivatsa S. Bhat from comment #92) > Can we use the same "1 commit before, 1 commit after" method here? > That is, for each commit that you want to test, try one commit before it and > see if it passes, and one commit after it and see if it fails. That should > give us a good indication of whether the commit in question is problematic > or not. Of course you'll need to use the mainline tree for this, which > doesn't include any of the reverts. Well, the bisect log I attached when I opened the bug report was done with mainline, the conclusion was: # only skipped commits left to test # possible first bad commit: [0bd3a173d711857fc9f583eb5825386cc08f3948] sched/preempt, locking: Rework local_bh_{dis,en}able() # possible first bad commit: [10b033d434c25a6c9e0f4f4dc2418af1b8236c63] sched/clock, x86: Avoid a runtime condition in native_sched_clock() # possible first bad commit: [6577e42a3e1633afe762f47da9e00061ee4b9a5e] sched/clock: Fix up clear_sched_clock_stable() # possible first bad commit: [35af99e646c7f7ea46dc2977601e9e71a51dadd5] sched/clock, x86: Use a static_key for sched_clock_stable # possible first bad commit: [ef08f0fff87630d4f67ceb09514d8b444df833f8] sched/clock: Remove local_irq_disable() from the clocks # possible first bad commit: [20d1c86a57762f0a33a78988e3fc8818316badd4] sched/clock, x86: Rewrite cyc2ns() to avoid the need to disable IRQs # possible first bad commit: [57c67da274f3fab38e08d2c9edf08b89e1d9c71d] sched/clock, x86: Move some cyc2ns() code around # possible first bad commit: [5dd12c2152743747ca9f50ef80281e54cc416dc0] sched/clock, x86: Use mul_u64_u32_shr() for native_sched_clock() # possible first bad commit: [62b94a08da1bae9d187d49dfcd6665af393750f8] sched/preempt: Take away preempt_enable_no_resched() from modules # possible first bad commit: [9ea4c380066fbe23fe0da7f4abfabc444f2467f4] locking: Optimize lock_bh functions Unless I'm seeing this wrong, the commits Viresh wanted me to try are included there, as well as several commits before and after, the result I'm seeing now is the same as the one I saw before, those commits fail to compile. (In reply to Mauro from comment #93) > (In reply to Srivatsa S. Bhat from comment #92) > > > Can we use the same "1 commit before, 1 commit after" method here? > > That is, for each commit that you want to test, try one commit before it > and > > see if it passes, and one commit after it and see if it fails. That should > > give us a good indication of whether the commit in question is problematic > > or not. Of course you'll need to use the mainline tree for this, which > > doesn't include any of the reverts. > > Well, the bisect log I attached when I opened the bug report was done with > mainline, the conclusion was: > > # only skipped commits left to test > # possible first bad commit: [0bd3a173d711857fc9f583eb5825386cc08f3948] > sched/preempt, locking: Rework local_bh_{dis,en}able() > # possible first bad commit: [10b033d434c25a6c9e0f4f4dc2418af1b8236c63] > sched/clock, x86: Avoid a runtime condition in native_sched_clock() > # possible first bad commit: [6577e42a3e1633afe762f47da9e00061ee4b9a5e] > sched/clock: Fix up clear_sched_clock_stable() > # possible first bad commit: [35af99e646c7f7ea46dc2977601e9e71a51dadd5] > sched/clock, x86: Use a static_key for sched_clock_stable > # possible first bad commit: [ef08f0fff87630d4f67ceb09514d8b444df833f8] > sched/clock: Remove local_irq_disable() from the clocks > # possible first bad commit: [20d1c86a57762f0a33a78988e3fc8818316badd4] > sched/clock, x86: Rewrite cyc2ns() to avoid the need to disable IRQs > # possible first bad commit: [57c67da274f3fab38e08d2c9edf08b89e1d9c71d] > sched/clock, x86: Move some cyc2ns() code around > # possible first bad commit: [5dd12c2152743747ca9f50ef80281e54cc416dc0] > sched/clock, x86: Use mul_u64_u32_shr() for native_sched_clock() > # possible first bad commit: [62b94a08da1bae9d187d49dfcd6665af393750f8] > sched/preempt: Take away preempt_enable_no_resched() from modules > # possible first bad commit: [9ea4c380066fbe23fe0da7f4abfabc444f2467f4] > locking: Optimize lock_bh functions > > Unless I'm seeing this wrong, the commits Viresh wanted me to try are > included there, as well as several commits before and after, the result I'm > seeing now is the same as the one I saw before, those commits fail to > compile. Hmm, right. Ok, then I guess we'll just have to try and fix the build errors in these commits and try your tests on them. Viresh might already be working on doing that, but if not, I'll try to fix them for you. Regards, Srivatsa S. Bhat (In reply to Srivatsa S. Bhat from comment #94) > Hmm, right. Ok, then I guess we'll just have to try and fix the build errors > in these commits and try your tests on them. Viresh might already be working > on doing that, but if not, I'll try to fix them for you. I haven't done anything yet and don't know how to get these fixed.. I have gone through the series which played with this stuff and looks there were way too many issues reported by multiple people. https://lkml.org/lkml/2014/1/21/512 Anyway Mauro has already verified that reverting these two commits fixes our issues over 3.14.. > 20d1c86 sched/clock, x86: Rewrite cyc2ns() to avoid the need to disable IRQs > 57c67da sched/clock, x86: Move some cyc2ns() code around So, can we get peter in the discussion here now ? We can tell him that things aren't bisectable and so don't know which of these has broken things.. -- viresh (In reply to Viresh Kumar from comment #95) > (In reply to Srivatsa S. Bhat from comment #94) > > Hmm, right. Ok, then I guess we'll just have to try and fix the build > errors > > in these commits and try your tests on them. Viresh might already be > working > > on doing that, but if not, I'll try to fix them for you. > > I haven't done anything yet and don't know how to get these fixed.. I have > gone through the series which played with this stuff and looks there were > way too many issues reported by multiple people. > > https://lkml.org/lkml/2014/1/21/512 > > Anyway Mauro has already verified that reverting these two commits fixes our > issues over 3.14.. > > > 20d1c86 sched/clock, x86: Rewrite cyc2ns() to avoid the need to disable > IRQs > > 57c67da sched/clock, x86: Move some cyc2ns() code around > > So, can we get peter in the discussion here now ? We can tell him that > things aren't bisectable and so don't know which of these has broken things.. > Sure, that sounds better. No point trying to debug this further when he can probably tell us precisely what is going wrong, and fix it immediately. Regards, Srivatsa S. Bhat Hi Mauro, I think we can do one more/final test to make sure we have hit the right file, before Peter shouts at us:) Though I am pretty sure that our analysis points at what I have mentioned in my mail: https://lkml.org/lkml/2014/6/16/477 Can you please modify: arch/x86/kernel/tsc.c to simply return 0 from beginning of this function: time_cpufreq_notifier() ? This should fix your issues, if our analysis was right. This function is the one getting called from our POSTCHANGE notifications .. These changes must be done over 3.14 without any other modifications.. (In reply to Viresh Kumar from comment #97) > Can you please modify: arch/x86/kernel/tsc.c to simply return 0 from > beginning of this function: time_cpufreq_notifier() ? This should fix your > issues, if our analysis was right. > > This function is the one getting called from our POSTCHANGE notifications .. > > These changes must be done over 3.14 without any other modifications.. We are in the clear, we're not getting shouted at :) With no patches or other changes (besides adding return 0;) and using the same .config and test script I've been using so far it seems to work fine, the machine does not hang. (In reply to Mauro from comment #98) > We are in the clear, we're not getting shouted at :) > > With no patches or other changes (besides adding return 0;) and using the > same .config and test script I've been using so far it seems to work fine, > the machine does not hang. Ahh, nice. Thanks for your tests. I suspect that what fails is the set_cyc2ns_scale(tsc_khz, freq->cpu); at the end of time_cpufreq_notifier(), so I wonder if commenting this single line out (instead of making the whole function return 0 unconditionally) make the problem go away too? (In reply to Rafael J. Wysocki from comment #100) > I suspect that what fails is the > > set_cyc2ns_scale(tsc_khz, freq->cpu); > > at the end of time_cpufreq_notifier(), so I wonder if commenting this single > line out (instead of making the whole function return 0 unconditionally) > make the > problem go away too? That also seems to work, the machine stays responsive. So my bet is that the while () loop in cyc2ns_write_begin() becomes infinite in the CPU online case on this machine. (In reply to Rafael J. Wysocki from comment #102) > So my bet is that the while () loop in cyc2ns_write_begin() becomes infinite > in the CPU online case on this machine. I don't understand this code well and so wouldn't bet this time :) But in that case, why isn't the other CPU responding? Only one should be busy in the loop, right ? Because that is a CPU online and the other CPU is waiting for this one to complete. That's because the powernow-k8 driver does the work_on_cpu() in powernowk8_target() and the CPU that called it blocks on the flush_work() in there while the other one is running the loop in cyc2ns_write_begin() in the hope to be kicked out of it. At least that's my theory. :-) So this looks like a quite interesting deadlock to me. (In reply to Rafael J. Wysocki from comment #104) > Because that is a CPU online and the other CPU is waiting for this one to > complete. > > That's because the powernow-k8 driver does the work_on_cpu() in > powernowk8_target() and the CPU that called it blocks on the flush_work() in > there while the other one is running the loop in cyc2ns_write_begin() in the > hope to be kicked out of it. At least that's my theory. :-) Hmm, but why are you so sure that work_on_cpu() isn't executed on CPU0 ? And we are guaranteed that the work will be called on CPU0. It should atleast sometimes get called from CPU0. Maybe Mauro can confirm this after another test but I don't want to waste his time for proving our theories :) > So this looks like a quite interesting deadlock to me. Indeed. And it was a nightmare to solve really. It would have been much easier if one of us could have reproduced it locally. As debugging it remotely is really tough :) .. You don't want to play around code too much as somebody else is working on your behalf :) If you want me to do a few more tests to help kill this bug I'm ok with that. The laptop didn't die of heatstroke so far, so a few more tests should be ok :) (In reply to Mauro from comment #106) > If you want me to do a few more tests to help kill this bug I'm ok with that. > > The laptop didn't die of heatstroke so far, so a few more tests should be ok > :) Thanks, but I dont' have any tests in my mind which might help debugging here. Lets wait for PeterZ. On Wed, Jun 18, 2014 at 11:04:18AM +0000, bugzilla-daemon@bugzilla.kernel.org wrote: > https://bugzilla.kernel.org/show_bug.cgi?id=77201 > > --- Comment #104 from Rafael J. Wysocki <rjw@rjwysocki.net> --- > Because that is a CPU online and the other CPU is waiting for this one to > complete. > > That's because the powernow-k8 driver does the work_on_cpu() in > powernowk8_target() and the CPU that called it blocks on the flush_work() in > there while the other one is running the loop in cyc2ns_write_begin() in the > hope to be kicked out of it. At least that's my theory. :-) > > So this looks like a quite interesting deadlock to me. So two things; it looks like we only need to call set_cyc2ns_scale() (in time_cpufreq_notifier) inside the condition that actually changes tsc_khz, not unconditionally. Since if the tsc_khz value doesn't change, there's nothing to update. This avoids it getting called twice for every change (PRECHANGE and POSTCHANGE) and exhaust the struct cyc2ns data slots (2). --- a/arch/x86/kernel/tsc.c +++ b/arch/x86/kernel/tsc.c @@ -918,9 +918,9 @@ static int time_cpufreq_notifier(struct tsc_khz = cpufreq_scale(tsc_khz_ref, ref_freq, freq->new); if (!(freq->flags & CPUFREQ_CONST_LOOPS)) mark_tsc_unstable("cpufreq changes"); - } - set_cyc2ns_scale(tsc_khz, freq->cpu); + set_cyc2ns_scale(tsc_khz, freq->cpu); + } return 0; } Secondly, it looks like (at least for the k8 case) we always call time_cpufreq_notifier(.cpu == smp_processor_id()), so if the above doesn't cure things, could you try this: --- a/arch/x86/kernel/tsc.c +++ b/arch/x86/kernel/tsc.c @@ -127,8 +127,6 @@ static struct cyc2ns_data *cyc2ns_write_ if (data == c2n->head) data++; - /* XXX send an IPI to @cpu in order to guarantee a read? */ - /* * When we observe the tail write from cyc2ns_read_end(), * the cpu must be done with that entry and its safe @@ -151,6 +149,15 @@ static void cyc2ns_write_end(int cpu, st smp_wmb(); ACCESS_ONCE(c2n->head) = data; + + if (cpu == smp_processor_id()) { + /* + * If we're on the current CPU, force consume the data + * so that we free up the data tail. + */ + struct cyc2ns_data *data = cyc2ns_read_begin(); + cyc2ns_read_end(data); + } } /* (In reply to Peter Zijlstra from comment #108) > On Wed, Jun 18, 2014 at 11:04:18AM +0000, > bugzilla-daemon@bugzilla.kernel.org wrote: > > https://bugzilla.kernel.org/show_bug.cgi?id=77201 > > > > --- Comment #104 from Rafael J. Wysocki <rjw@rjwysocki.net> --- > > Because that is a CPU online and the other CPU is waiting for this one to > > complete. > > > > That's because the powernow-k8 driver does the work_on_cpu() in > > powernowk8_target() and the CPU that called it blocks on the flush_work() > in > > there while the other one is running the loop in cyc2ns_write_begin() in > the > > hope to be kicked out of it. At least that's my theory. :-) > > > > So this looks like a quite interesting deadlock to me. > > So two things; it looks like we only need to call set_cyc2ns_scale() > (in time_cpufreq_notifier) inside the condition that actually changes > tsc_khz, not unconditionally. > > Since if the tsc_khz value doesn't change, there's nothing to update. > > This avoids it getting called twice for every change (PRECHANGE and > POSTCHANGE) and exhaust the struct cyc2ns data slots (2). > > > --- a/arch/x86/kernel/tsc.c > +++ b/arch/x86/kernel/tsc.c > @@ -918,9 +918,9 @@ static int time_cpufreq_notifier(struct > tsc_khz = cpufreq_scale(tsc_khz_ref, ref_freq, freq->new); > if (!(freq->flags & CPUFREQ_CONST_LOOPS)) > mark_tsc_unstable("cpufreq changes"); > - } > > - set_cyc2ns_scale(tsc_khz, freq->cpu); > + set_cyc2ns_scale(tsc_khz, freq->cpu); > + } > > return 0; > } > I have tested this patch with the userspace governor (same procedure of previous tests) and with the ondemand governor with kernels 3.14 and 3.16-rc2 and it seems to solve the problem. I couldn't get the machine to hang. I have tried cpu offline/online, frequency changes, unloading/loading powernow-k8, suspend/hibernate. > > Secondly, it looks like (at least for the k8 case) we always call > time_cpufreq_notifier(.cpu == smp_processor_id()), so if the above > doesn't cure things, could you try this: > > > --- a/arch/x86/kernel/tsc.c > +++ b/arch/x86/kernel/tsc.c > @@ -127,8 +127,6 @@ static struct cyc2ns_data *cyc2ns_write_ > if (data == c2n->head) > data++; > > - /* XXX send an IPI to @cpu in order to guarantee a read? */ > - > /* > * When we observe the tail write from cyc2ns_read_end(), > * the cpu must be done with that entry and its safe > @@ -151,6 +149,15 @@ static void cyc2ns_write_end(int cpu, st > smp_wmb(); > > ACCESS_ONCE(c2n->head) = data; > + > + if (cpu == smp_processor_id()) { > + /* > + * If we're on the current CPU, force consume the data > + * so that we free up the data tail. > + */ > + struct cyc2ns_data *data = cyc2ns_read_begin(); > + cyc2ns_read_end(data); > + } > } > > /* I have also tried this patch in addition to the previous one (both patches applied) and it doesn't seem to break anything. Let me know if you want me to perform any other tests or if you need me to attach any logs. On Mon, Jun 23, 2014 at 05:16:47PM +0000, bugzilla-daemon@bugzilla.kernel.org wrote: > https://bugzilla.kernel.org/show_bug.cgi?id=77201 > > --- Comment #109 from Mauro <registosites@hotmail.com> --- > (In reply to Peter Zijlstra from comment #108) > > On Wed, Jun 18, 2014 at 11:04:18AM +0000, > > bugzilla-daemon@bugzilla.kernel.org wrote: > > > https://bugzilla.kernel.org/show_bug.cgi?id=77201 > > > > > > --- Comment #104 from Rafael J. Wysocki <rjw@rjwysocki.net> --- > > > Because that is a CPU online and the other CPU is waiting for this one to > > > complete. > > > > > > That's because the powernow-k8 driver does the work_on_cpu() in > > > powernowk8_target() and the CPU that called it blocks on the flush_work() > in > > > there while the other one is running the loop in cyc2ns_write_begin() in > the > > > hope to be kicked out of it. At least that's my theory. :-) > > > > > > So this looks like a quite interesting deadlock to me. > > > > So two things; it looks like we only need to call set_cyc2ns_scale() > > (in time_cpufreq_notifier) inside the condition that actually changes > > tsc_khz, not unconditionally. > > > > Since if the tsc_khz value doesn't change, there's nothing to update. > > > > This avoids it getting called twice for every change (PRECHANGE and > > POSTCHANGE) and exhaust the struct cyc2ns data slots (2). > > > > > > --- a/arch/x86/kernel/tsc.c > > +++ b/arch/x86/kernel/tsc.c > > @@ -918,9 +918,9 @@ static int time_cpufreq_notifier(struct > > tsc_khz = cpufreq_scale(tsc_khz_ref, ref_freq, freq->new); > > if (!(freq->flags & CPUFREQ_CONST_LOOPS)) > > mark_tsc_unstable("cpufreq changes"); > > - } > > > > - set_cyc2ns_scale(tsc_khz, freq->cpu); > > + set_cyc2ns_scale(tsc_khz, freq->cpu); > > + } > > > > return 0; > > } > > > > I have tested this patch with the userspace governor (same procedure of > previous tests) and with the ondemand governor with kernels 3.14 and 3.16-rc2 > and it seems to solve the problem. I couldn't get the machine to hang. > > I have tried cpu offline/online, frequency changes, unloading/loading > powernow-k8, suspend/hibernate. OK, if this one is sufficient, I'll leave it at this. I'll go write a pretty changelog and post proper (probably tomorrow somewhere). Thanks for testing! No problem. This was affecting me so to get this fixed I had to do my part of the work :) Thanks everyone for your help in debugging this and for fixing it :) |
Created attachment 137861 [details] Bisect log Like the description says, I am unable to get my laptop to hibernate or resume from suspend. I have noticed this started after kernel 3.14 hit the repos of my distribution. The symptom is that the machine suspends successfully but completely hangs when resuming (ctrl-alt-del or sysrq don't work at all). The same happens when trying to hibernate, that is, the machine hangs before completing the hibernation process. I have tried to find where things go wrong by doing echoing freezer, devices, platform, processors to /sys/power/pm_test and then echoing mem to /sys/power/state. The tests work successfully until I get to processors. I have found that if powernow-k8 is unloaded before suspending/hibernating things work as expected. I have tried the latest 3.15-rc5 from git and I can still reproduce the problem. I have also tried to bisect to find the commit that causes the regression but I have ended at commits that I had to skip (they would not compile). The bisect log and some other info that might be useful can be found in the attachments.