Most recent kernel where this bug did *NOT* occur:
Distribution: Ubuntu
Hardware Environment: ASUSTeK Computer INC. A8V-MX, dual-core AMD 64
Software Environment:
Problem Description: The powernow-k8 module reports "Hardware error - pending bit very stuck - no further pstate changes possible" when switching from a high power state to a low power state, and power management fails after this error.

Steps to reproduce:
Loaded the powernow-k8 module and governors (cpufreq-ondemand or cpufreq-conservative) in /etc/modules.
Loaded each core to 100% using the following commands:

ssh-keygen -G moduli.1 -b 1024 &
ssh-keygen -G moduli.2 -b 1024 &

Upon killing the load processes, the module reports these errors (dmesg) and power management does not work anymore (it gets stuck at the high frequency, 2 GHz, but at the low core voltage, 1.17 V):

[ 454.266700] powernow-k8: Hardware error - pending bit very stuck - no further pstate changes possible
[ 454.266709] powernow-k8: transition frequency failed
[ 455.504471] powernow-k8: failing targ, change pending bit set
After some more testing, it happens going from low power to high power as well. The consistent behavior is that frequency switching fails after the voltage has been switched successfully.

Ex.: ramping up from 1 GHz, 1.17 V to 2 GHz, 1.41 V, it changes to 1.41 V but the frequency gets stuck at 1 GHz (it never reaches the desired 2 GHz). dmesg shows again:

[13496.636380] powernow-k8: Hardware error - pending bit very stuck - no further pstate changes possible
[13496.636388] powernow-k8: transition frequency failed
> ------- Additional Comments From babassu@postmaster.co.uk 2007-03-26 13:48 -------
> After some more testing, it happens going from low power to
> high power as well.
> Ex. ramp-up from 1Ghz 1.17 V to 2GHz 1.41 V, it changes to
> 1.41V but frequency
> gets stuck at 1GHz (never reaches the desired 2GHz). dmesg
> shows again:
>
> [13496.636380] powernow-k8: Hardware error - pending bit very
> stuck - no further pstate changes possible
> [13496.636388] powernow-k8: transition frequency failed

What chipset are you using? Could you send me the pstate tables the driver prints when it first loads?

At first glance, it looks like you have a hardware error (hence the error message saying that). The processor attempted to perform the transition 1000 times and the chipset never signaled that it was finished.

-Mark Langsdorf
Operating Systems Research Center
AMD, Inc.
The powernow-k8 module reports this when starting up:

[ 33.429966] powernow-k8: Found 2 AMD Athlon(tm) 64 X2 Dual Core Processor 3800+ processors (version 2.00.00)
[ 33.430014] powernow-k8: 0 : fid 0xc (2000 MHz), vid 0x8
[ 33.430018] powernow-k8: 1 : fid 0xa (1800 MHz), vid 0xa
[ 33.430021] powernow-k8: 2 : fid 0x2 (1000 MHz), vid 0x12

The MB (ASUS A8V-MX) chipset is VIA K8M800 + VT8251.
The BIOS is American Megatrends Inc. version 0405 from 11/09/2005.

I did not look at the current code, but I have seen a version with a loop incrementing a counter while waiting for the bit. Maybe that loop (if it is still there) is too short for a fast CPU.
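For readers following the table above, here is a minimal sketch of how the fid and vid codes can be decoded. The frequency rule (800 MHz base plus 100 MHz per fid step) reproduces all three MHz values the driver printed. The voltage rule (1550 mV minus 25 mV per vid step) is an assumption about the K8 VID encoding; it agrees with the 1.35 V and 1.30 V figures quoted for vid 0x8 and vid 0xa elsewhere in this thread. The function names are illustrative, not the driver's.

```c
#include <stdio.h>

/* Core frequency in MHz for a K8 frequency ID:
   800 MHz base plus 100 MHz per fid step. */
unsigned int freq_mhz_from_fid(unsigned int fid)
{
    return 800 + fid * 100;
}

/* Assumed VID encoding: 1550 mV at vid 0, minus 25 mV per step. */
unsigned int millivolts_from_vid(unsigned int vid)
{
    return 1550 - vid * 25;
}

/* Print one p-state row in a format close to the driver's startup log,
   with the decoded voltage appended. */
void print_pstate(unsigned int index, unsigned int fid, unsigned int vid)
{
    printf("%u : fid 0x%x (%u MHz), vid 0x%x (%u mV)\n",
           index, fid, freq_mhz_from_fid(fid),
           vid, millivolts_from_vid(vid));
}
```

Calling `print_pstate(0, 0xc, 0x8)` decodes the first row of the table. Voltages read back from board sensors may differ from the nominal values this conversion gives.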
> ------- Additional Comments From babassu@postmaster.co.uk 2007-03-27 08:40 -------
> The powernow-k8 module reports this when starting up:
> [ 33.429966] powernow-k8: Found 2 AMD Athlon(tm) 64 X2 Dual
> Core Processor 3800+ processors (version 2.00.00)
> [ 33.430014] powernow-k8: 0 : fid 0xc (2000 MHz), vid 0x8
> [ 33.430018] powernow-k8: 1 : fid 0xa (1800 MHz), vid 0xa
> [ 33.430021] powernow-k8: 2 : fid 0x2 (1000 MHz), vid 0x12

Looks pretty normal.

> The MB (ASUS A8V-MX) chipset is VIA K8M800 + VT8251 The BIOS
> is American Megatrends Inc. version 0405 from 11/09/2005

Thanks. I haven't seen this problem on VIA before.

> I did not look at the current code but I have seen a version
> with a loop incrementing a counter waiting for the bit.
> Maybe that loop (if it is still there) is too short for a fast CPU.

It reads the MSR 10000 times before retrying, and retries 100 times before giving up. If an operation that is supposed to take 10s of microseconds hasn't completed in a few hundred milliseconds, I don't know that it will. I can send you a patch if you want to try longer timeouts.

Also, is there a more recent BIOS? November 2005 is a bit old.

-Mark Langsdorf
AMD, Inc.
I took the liberty to glance at the code for powernow-k8.c (/usr/src/linux-source-2.6.20/arch/i386/kernel/cpu/cpufreq). I may be totally out to lunch here, but bear with me.

1. The function "query_current_values_with_pending_wait" has a loop incrementing a counter to 10000 (not totally properly initialized) while trying to read the fid/vid ctl MSR register. Can this be too short (10000 CPU register reads on a 2 GHz CPU)?

2. If the call to query_current_values_with_pending_wait fails, we try to set the fid/vid control register again in a loop (up to 100 times):
    wrmsr(MSR_FIDVID_CTL, lo, data->pllock * PLL_LOCK_CONVERSION);
Can this be potentially bad? (It would set the pending bit again, but if the old fid == new fid the bit may never be reset, since no change in frequency is required.)

3. The loop in "query_current_values_with_pending_wait" may ramp the CPU up to 100%, triggering the "governor" mechanism. Are we protected against this kind of conflict?

I will dig more and come back.
> ------- Additional Comments From babassu@postmaster.co.uk 2007-03-27 11:38 -------
> I took the liberty to glance at the code for powernow-k8.c
> (/usr/src/linux-source-2.6.20/arch/i386/kernel/cpu/cpufreq).
> I may be totally out to lunch here but bear with me.
> 1. the function "query_current_values_with_pending_wait" has
> a loop incrementing a counter to 10000 (not totally properly
> initialized)

Your issue with
    u32 i = 0;
is what, exactly?

> trying to read the fid/vid ctl MSR register. Can this be
> too short (10000 CPU register reads on a 2GHz CPU)

I don't believe it is, based on my testing. We used to have a longer loop, but with the retry it locked up the system for too long if the operation failed. If your testing demonstrates the time out needs to be longer, I will ack a patch to increase it.

> 2. If the call to query_current_values_with_pending_wait
> fails we try to set the fid/vid control register again
> in a loop (up to 100 times). Can this be potentially bad
> (it would set the pending bit again but if the old fid==new fid
> the bit may not be reset ever since no change in frequency is
> required)?

No, this is a good thing. Some chip sets have inadequate buffering for pstate/cstate changes, and can get into a degenerate state where a cstate change following a cstate change causes the chipset to overflow the buffer and lose the pstate change. The chipset completes both changes but doesn't know to signal the clear for the pstate change; the processor never clears the pending bit and the pstate change is assumed to never complete.

Solution is to resend the pstate change command, which the chipset acknowledges quickly (because it is already in the pstate from the previous change command) and clears the pending bit. AMD observed this behavior in our hardware debug labs and I implemented the recommendation of our silicon engineers.

If it bothers you, try taking out that loop, increase the query_current_values_with_pending_wait loop to something much longer, and see if your system starts behaving properly.

> 3. The loop in "query_current_values_with_pending_wait" may
> ramp up the CPU to 100% triggering the "governor" mechanism.
> Are we protected to this kind of conflicts ?

No, but it's never come up in my experience.
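The poll-and-resend behavior described above can be modeled in user space. The following is only a sketch under stated assumptions: `struct fake_chipset` and its `ack_lost` and `hung` flags are hypothetical stand-ins for the hardware, and the two limits are the counts quoted in this thread (10000 status reads per attempt, 100 attempts). It is not the driver's code.

```c
#include <stdbool.h>

#define POLL_LIMIT   10000  /* status reads per attempt, as quoted in the thread */
#define RESEND_LIMIT 100    /* attempts before the bit is declared "very stuck" */

/* Hypothetical stand-in for the chipset side of the handshake. */
struct fake_chipset {
    bool pending;        /* mirrors the change-pending status bit */
    bool ack_lost;       /* completion signal for the first command was dropped */
    bool hung;           /* chipset never acknowledges anything again */
    unsigned int writes; /* number of change commands received */
};

/* Model of writing the change command to the control register. */
void send_change(struct fake_chipset *c)
{
    c->writes++;
    c->pending = true;
    /* A repeated command targets the state the chipset is already in,
       so in this model it is acknowledged right away. */
    if (c->writes > 1)
        c->ack_lost = false;
}

/* Model of one status read: the bit clears unless the acknowledgement
   was lost or the chipset is hung. */
bool read_pending(struct fake_chipset *c)
{
    if (!c->ack_lost && !c->hung)
        c->pending = false;
    return c->pending;
}

/* Returns the number of commands sent on success, or -1 if the pending
   bit was still set after every resend. */
int transition_with_resend(struct fake_chipset *c)
{
    for (int attempt = 1; attempt <= RESEND_LIMIT; attempt++) {
        send_change(c);
        for (int i = 0; i < POLL_LIMIT; i++) {
            if (!read_pending(c))
                return attempt;
        }
    }
    return -1;
}
```

In this model a healthy chipset completes on the first command, a chipset that lost one acknowledgement completes on the second, and a hung chipset exhausts all 100 attempts, which corresponds to the "pending bit very stuck" message.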
Fair enough. I will try to compile some variants and try them.

>>Your issue with
>> u32 i = 0;
>>is what, exactly?

I was just glancing at the line
    i = lo & HW_PSTATE_MASK;
just before the loop, trying to figure out what the starting value of i is (fortunately the mask is just 0x07, making for a small number).

I also noticed this function, which I think is trying to do a delay:
    count_off_irt(data);
Can we put it inside the do loop after setting the MSR?

Thank you.
> ------- Additional Comments From babassu@postmaster.co.uk 2007-03-27 12:40 -------
> >>Your issue with
> >> u32 i = 0;
> >>is what, exactly?
>
> I was just glancing at the line
> i = lo & HW_PSTATE_MASK;
> just before the loop trying to figure out what is the
> starting value of i(fortunately the mask is just 0x07
> making for a small number)

Look at the code again:
    if (cpu_family == CPU_HW_PSTATE) {
        ...
        i = lo & HW_PSTATE_MASK;
        ...
        return 0;
    }
If i is set to the hardware_pstate index, it isn't used as a counter.

> I noticed this function too trying to do a delay I think:
> count_off_irt(data);
>
> Can we put it inside the do loop after setting the MSR ?

See p. 275 of the public BKDG. IRT is counted off after the pending bit clears.
I am out to lunch ! I'll do some recompiling in the next while and let you know how it goes. Thank you.
I finally built the new kernel (what an adventure when you have kernel tools from Ubuntu edgy with kernel source from feisty - make-kpkg would not add vmlinuz* to the pkg).

I have increased the loop in "query_current_values_with_pending_wait" by a factor of 10 (to 100000). The problem is gone.

Now it is time for experimenting. This is likely going to take some time. I will keep you posted on my experiments. Unfortunately I have only one machine available and cannot mess with it too much.

I believe that the loop may be replaced by a timed wait that does not depend on the CPU frequency (something like udelay/mdelay/sleep with a time value). This reminds me of the Win9x boot problem with the AMD K6, I think, where some counter was running too fast.

Also, it would help if the module could somehow recover from the "pending bit stuck" state without the need to reboot the kernel (I hate reboots).
Created attachment 10988
dmesg output with some debug prior to the problem

I spoke too soon. Here is some debug info I captured prior to the bit getting stuck.
> Also it would help if the module would recover from the
> "pending bit stuck state" somehow without the need to
> reboot the kernel (I hate reboots).

It's the hardware that's stuck, not the module.
You mentioned this before:

>Some chip sets have inadequate
>buffering for pstate/cstate changes, and can get into a
>degenerate state where a cstate change following a cstate
>change causes the chipset to overflow the buffer and lose
>the pstate change. The chipset completes both changes but
>doesn't know to signal the clear for the pstate change; the
>processor never clears the pending bit and the pstate change
>is assumed to never complete. Solution is to resend the
>pstate change command, which the chipset acknowledges quickly
>(because it is already in the pstate from the previous change
>command) and clears the pending bit.

How does the chipset detect the resend of the pstate change command if the data in the control register has not changed (the pending bit is still set and the rest of the bits are the same, so essentially we are writing <n> over <n> into a register)?

I am asking this because, in my case at least, the resend does not work.

Thank you.
After some more tests:

1. My chipset does not like to be flooded with requests. With the loop set to 10000 in "query_current_values_with_pending_wait", the resend loop gets executed at least 3 times for each change request. Very quickly the "pending bit very stuck" error shows up. I timed the loop at approximately 1.365 ms.

2. With the same loop set to 100000 (a little over 10 ms of waiting), the resend does not occur most of the time. Once it did occur, the bit got stuck and there was no recovering from there.

3. You are right: once the pending bit is very stuck, I cannot reset it even if I force it. Basically, once my chipset is flooded it stops responding.

4. I suspect that the resend does not work and does more harm than good on my system.

5. I modified "query_current_values_with_pending_wait" like this (I gave it max 20 ms in steps of 1 ms) and I am running problem-free for now (the resend does not occur anymore):

    do {
        if (i++ > 20) {
            dprintk("detected change pending stuck\n");
            return 1;
        }
        msleep(1);
        rdmsr(MSR_FIDVID_STATUS, lo, hi);
    } while (lo & MSR_S_LO_CHANGE_PENDING);

6. Note that msleep may let the CPU do other things while it is waiting. It could be replaced with usleep with a lower time-out, along with a higher loop limit for better granularity (if I can figure out how).

7. I plan to take the resend loop out completely and gather some stats on how long it takes, on average, for a change to be acknowledged (the chipset seems to be quite slow at this).
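The timed wait from item 5 can be tried outside the kernel. What follows is a user-space sketch, not the driver change itself: it uses POSIX `nanosleep` in place of the kernel's `msleep`, and the `pending_fn` callback and the `countdown_pending` demo source are hypothetical stand-ins for reading the status MSR. The point is only that the timeout is expressed in milliseconds rather than in loop iterations, so it does not scale with CPU speed.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* for nanosleep */
#endif
#include <stdbool.h>
#include <time.h>

/* Hypothetical callback: returns true while the change is still pending. */
typedef bool (*pending_fn)(void *ctx);

/* Poll once per millisecond for at most max_ms polls.
   Returns 0 when the pending condition clears, 1 on timeout. */
int wait_pending_clear(pending_fn still_pending, void *ctx, unsigned int max_ms)
{
    const struct timespec one_ms = { .tv_sec = 0, .tv_nsec = 1000000L };

    for (unsigned int waited = 0; waited < max_ms; waited++) {
        nanosleep(&one_ms, NULL);  /* yield the CPU instead of spinning */
        if (!still_pending(ctx))
            return 0;
    }
    return 1;
}

/* Demo source: reports "pending" for a fixed number of polls, then clears. */
bool countdown_pending(void *ctx)
{
    unsigned int *remaining = ctx;

    if (*remaining == 0)
        return false;
    (*remaining)--;
    return true;
}
```

With a counter of 3 and a 20 ms budget the wait succeeds on the fourth poll; with a counter larger than the budget it times out and returns 1, which is where the driver would fall back to its resend logic.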
[Bug 8264] powernow-k8 module gets stuck switching powerlevels on dualcore AMD64

Thanks for the testing. I like the msleep approach and would appreciate a patch.

I will not accept a patch that removes the resend. I know that certain ATI chipsets will break without that patch.
Created attachment 10998
dmesg output with p-state timings

Unfortunately the problem is still there. It shows up after a while. It looks more and more like a hardware issue.

I have attached a dmesg output with timings for every p-state transition. This is with the resend completely disabled. Under the ondemand governor, with no load, it decides to change p-states from time to time. After a while it hangs. On average a p-state change takes about 6 ms, I would say. When it got stuck, it waited for 800 ms with no success. Take a look to see if you spot anything wrong in the state transitions. It seems as if it accepts only a limited number of transitions.

Would you happen to know if the BIOS may have anything to do with this (the way it initializes some hardware, maybe)? I will try to get an updated BIOS and see if it makes any difference.

I agree with you that the resend should stay. From my tests, increasing the check loop (from 50 to 1000 in my case) with an msleep(1) included would reduce the need to try resending 100 times. A maximum of 10 resends would probably suffice to conclude that the bit is very stuck. The check loop seems to exit after at most 10 iterations when working properly. 1000 is my extreme and may detect the stuck situation better.

Thank you.
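To put the proposed limits in perspective, here is a small worked calculation of how long the driver would wait before declaring the bit very stuck. It is a sketch only: the function name is illustrative, and it assumes every poll sleeps exactly its nominal time, whereas short kernel sleeps can overshoot, so the results should be read as lower bounds.

```c
/* Nominal worst-case wait, in milliseconds, before a transition is declared
   stuck: each of `resends` attempts polls `checks` times, sleeping
   `sleep_ms` between polls. */
unsigned long worst_case_wait_ms(unsigned long resends,
                                 unsigned long checks,
                                 unsigned long sleep_ms)
{
    return resends * checks * sleep_ms;
}
```

With the numbers from this comment, 10 resends of 1000 checks at 1 ms each come to 10000 ms, that is 10 seconds. Keeping 100 resends with the same check loop would come to 100 seconds, which may be why lowering the resend count is suggested alongside the longer check loop.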
Created attachment 11001
please ignore this post. driver is behaving properly

I think I caught a problem in the situation that happens most often. Going down from 2000 MHz, 1.35 V to a lower state (in this case 1800 MHz, 1.30 V) does not respect the guide (Chapter 9.5.6.2.2 - Changing the FID):

"Note: Software must hold the VID constant when changing the FID."

This is what I captured:

============current state=================
[ 887.141864] targ: curr fid 0xc, vid 0x8
[ 887.141867] cpu 0 transition to index 1
[ 887.141869] table matched fid 0xa, giving vid 0xa
[ 887.149844] p-state change confirmed LO=0x60c0c0c HI=0x12060808
============illegal request ?=============
[ 887.149849] cpu 0, changing to fid 0xa, vid 0xa

The bit got stuck after that. I will investigate the code to see why it happened.
Comment on attachment 11001
please ignore this post. driver is behaving properly

Please ignore this post. I was not reading the log properly. The first call is to transition the fid to 0xa, leaving the vid the same (0x8), which is correct behaviour.
After some more testing I figured out the scenario in which it happens. There seems to be a hardware issue. Sometimes the change frequency command is not executed, period. It is not that the command is executed and the chipset forgets to reset the bit. The pending bit is not reset because the command did not work, and nothing works after that.

I am curious whether there is anything else that the chipset reports when it cannot execute a command (some sort of error in some register somewhere). Would you happen to know which component executes the command (the northbridge)?

Thank you.
> ------- Additional Comments From babassu@postmaster.co.uk 2007-04-03 06:56 -------
> After some more testing I figured the scenario when it
> happens. There seems to be a hardware issue.

That is what the original error condition stated. =)

> Sometimes the change frequency command is not executed,
> period. It is not that the command is executed but it forgets
> to reset the bit. The pending bit is not reset because the
> command did not work and nothing would after that. I am
> curious if there is anything else that the chipset
> reports when it cannot execute a command (some sort of an
> error somewhere in some register).

The chipset might have its own error registers - you would need to check the documentation for the chipset.

> Would you happen to know what component executes the command
> (the northbridge) ?

The northbridge, the processor cores, the crossbar switch, and the southbridge all have things they do in response to a pstate change command. It doesn't look like there are northbridge error registers in the case of failure, though.
Can you please confirm that the problem is still there with the new 2.6.22-rc5+? Thanks.
I have exactly the same problem "powernow-k8: failing targ, change pending bit set" with a VIA K8M890 + VT8237A Chipset and the kernel 2.6.22.5. Is there a chance that the problem is fixed in 2.6.23.X ?
If you get a message that there is a hardware error, it is unlikely that upgrading your kernel will fix your hardware. The problem is probably not fixed in 2.6.23.X because it is a hardware error.
Is this a hardware bug in all VIA K8M890 chips? It seems that many users with K8M8xx chipsets have the problem. Is it possible to write a workaround for the buggy hardware, or is it better to dispose of this board?
I guess it's a fair question. The faulty chipset was identified, but I'm not sure a workaround has been found. Mark, do you think it is possible to quirk this board? For example, could we downgrade or restrict the governor to something like performance, or to userspace, and have the frequency changed "gently" by a careful user process?
It might be possible to quirk the board, but I don't know that cpufreq has the infrastructure to handle it. I certainly don't have time to write the patch, nor hardware to test it on, but I will support/answer questions for anyone who wants to give it a try.
Hi everybody:

I have added comments in bugs 6382 and 8547 because I think they are duplicates of this problem. I also proposed following the solution in this bug, as it seems the most active/explanatory.

Mark and Natalie: I'm thinking of setting the governor to userspace and writing a program to do smooth transitions (for example: no more than one per second). Do you think that would suffice to bypass the problem? Also, do you think this can be achieved by using powerfreqd, or should a custom script/daemon be written?

Thanks for your help.
Ivan
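The "no more than one transition per second" idea can be sketched as a small rate limiter that a user-space daemon would consult before requesting a frequency change. This is an illustration only: `struct transition_limiter` and `limiter_allow` are hypothetical names, time is passed in as seconds from whatever monotonic clock the daemon uses, and nothing here shows that spacing transitions out actually avoids the stuck pending bit on the affected chipsets.

```c
#include <stdbool.h>

/* Hypothetical limiter state: required spacing between changes and the
   time of the last change that was let through. */
struct transition_limiter {
    double min_interval_s;  /* e.g. 1.0 for "no more than one per second" */
    double last_change_s;   /* time of the last accepted change */
    bool   has_changed;     /* false until the first change is accepted */
};

/* Returns true, and records now_s, if a frequency change may be issued
   at time now_s; returns false if the last change was too recent. */
bool limiter_allow(struct transition_limiter *l, double now_s)
{
    if (l->has_changed && now_s - l->last_change_s < l->min_interval_s)
        return false;
    l->last_change_s = now_s;
    l->has_changed = true;
    return true;
}
```

With the userspace governor selected, such a daemon would only write a new target frequency to the cpufreq `scaling_setspeed` file when `limiter_allow` returns true, and would otherwise hold the current p-state until the interval has passed.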
I just add here some links for - bug #6382 (Random lockups, no kernel panic just a compleate hang) and - bug #8547 ("Hardware error - pending bit very stuck" on Mobile AMD Sempron) for easier reference.
Hello all,

I did coreboot (Free Software BIOS) support for the K8M890. I had exactly the same issue. I never succeeded in getting an answer from VIA about this, but I know of at least two things which helped:

1) change the LDTSTOP assert time in the southbridge
2) manually pull the LDTSTOP line instead of writing to the MSR change bit

For the "cold" start of the CPU in the BIOS I implemented method #2, and to my surprise powernow-k8 started to work. Maybe it will work with just 1). I don't know.

The code here shows how to toggle LDTSTOP#:
http://tracker.coreboot.org/trac/coreboot/changeset/3796/trunk
The VT8237R_ACPI_IO_BASE is PMIO.

The code at
http://tracker.coreboot.org/trac/coreboot/changeset/3795/trunk
shows how to change the LDTSTOP# duration to 100 us.

I work with the K8M890/VT8237S, but I have datasheets for some other chipsets. Please post the full lspci -xxx output so I can check how it is set up. I will supply the setpci commands for a test.

Rudolf
Hi again, It seems that the VGA BIOS is doing something to IGP engine which makes this error go away. Don't know what. Maybe the folks with problems have too old VGA BIOS. Rudolf
Got the same problem here, but I don't really notice any problems with the computer other than seeing the messages. It's been happening for so long that I don't know which kernel used to not print those messages. For me, the messages are intermittent - they don't always get printed out.

# dmesg|grep powernow
powernow-k8: Found 1 AMD Sempron(tm) Processor 3000+ processors (1 cpu cores) (version 2.20.00)
powernow-k8: 0 : fid 0xa (1800 MHz), vid 0x6
powernow-k8: 1 : fid 0x2 (1000 MHz), vid 0x12
powernow-k8: Hardware error - pending bit very stuck - no further pstate changes possible
powernow-k8: transition frequency failed
powernow-k8: error - out of sync, fix 0x2 0xa, vid 0x4 0x4
powernow-k8: ph2 null fid transition 0xa
powernow-k8: Hardware error - pending bit very stuck - no further pstate changes possible
powernow-k8: transition frequency failed
powernow-k8: failing targ, change pending bit set

# lspci
00:00.0 Host bridge: VIA Technologies, Inc. K8M800 Host Bridge
00:00.1 Host bridge: VIA Technologies, Inc. K8M800 Host Bridge
00:00.2 Host bridge: VIA Technologies, Inc. K8M800 Host Bridge
00:00.3 Host bridge: VIA Technologies, Inc. K8M800 Host Bridge
00:00.4 Host bridge: VIA Technologies, Inc. K8M800 Host Bridge
00:00.7 Host bridge: VIA Technologies, Inc. K8M800 Host Bridge
00:01.0 PCI bridge: VIA Technologies, Inc. VT8237 PCI bridge [K8T800/K8T890 South]
00:08.0 Multimedia audio controller: C-Media Electronics Inc CM8738 (rev 10)
00:09.0 Multimedia video controller: Internext Compression Inc iTVC15 MPEG-2 Encoder (rev 01)
00:0a.0 Multimedia video controller: Internext Compression Inc iTVC15 MPEG-2 Encoder (rev 01)
00:0f.0 RAID bus controller: VIA Technologies, Inc. VIA VT6420 SATA RAID Controller (rev 80)
00:0f.1 IDE interface: VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06)
00:10.0 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81)
00:10.1 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81)
00:10.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81)
00:10.3 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81)
00:10.4 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 86)
00:11.0 ISA bridge: VIA Technologies, Inc. VT8237 ISA bridge [KT600/K8T800/K8T890 South]
00:12.0 Ethernet controller: VIA Technologies, Inc. VT6102 [Rhine-II] (rev 78)
00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
01:00.0 VGA compatible controller: VIA Technologies, Inc. K8M800/K8N800/K8N800A [S3 UniChrome Pro] (rev 01)
Will you please try the latest upstream kernel and see whether the issue still exists? thanks.
*** Bug 8547 has been marked as a duplicate of this bug. ***
Rudolf - http://tracker.coreboot.org/trac/coreboot/changeset/3795/trunk shows several different changes. Which specific change sets the LDTSTOP# duration to 100 us? I'm interested in writing a fix that changes the LDTSTOP# duration on problematic systems.
Hi,

In the meanwhile, I think I know what is happening. The LDTSTOP is somehow blocked by the internal VGA. Ensure that the internal VGA power management registers are correctly programmed. We did some VGA reset programming here; the VGA regs seem to match the open VX800 documentation (chrome9). I think only the power management regs need to be programmed.

http://tracker.coreboot.org/trac/coreboot/browser/trunk/src/southbridge/via/k8t890/k8m890_chrome.c

You can also try to use a different SMAF message which will change the FID/VID (this one does not seem to be blocked by the NB):

1) prepare everything as you would for the MSR (FID/VID)
2) instead of writing the "go" bit to the MSR (sorry, I forgot the bit name), toggle the LDTSTOP from the SB with the following sequence:

http://tracker.coreboot.org/trac/coreboot/browser/trunk/src/mainboard/asus/m2v-mx_se/cache_as_ram_auto.c#L112

The IO base is stored in the PCI space register 0x88 (mask &~0x1).

I will go to FOSDEM tomorrow, so I can provide more info on Monday. Hope this helps.

Rudolf
we're shipping 2.6.37 now... did a workaround for this get shipped, or is this still a problem?
2.6.38 released. please re-open this bug if the problem still exists in the latest upstream kernel.