Most recent kernel where this bug did not occur: 2.6.12
Distribution: Gentoo
Hardware Environment: Athlon 64 3500+
Problem Description: The document at http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/25759.pdf?info=EXLINK says in section 78:
Created attachment 6125 [details]
Lost ticks

For reference, acpi_processor_idle+0x143 is instruction 14aa in the following snippet:

    14a5: ed       in  (%dx),%eax
    14a6: ed       in  (%dx),%eax
    14a7: fb       sti
    14a8: 39 c8    cmp %ecx,%eax
    14aa: 72 04    jb  14b0 <acpi_processor_idle+0x149>

Here, instruction 14a7 is local_irq_enable() at drivers/acpi/processor_idle.c, line 294 (where the processor has just left C2).
> time.c: Lost 1 timer tick(s)! rip acpi_processor_idle+0x143/0x31d [processor])

Is this an SMP system? It may be that the 2.6.13 support for SMP C2 has triggered a regression in going from 2.6.12 to 2.6.13. Please confirm whether or not this happened in 2.6.12. If it did not, please attach the output of /proc/acpi/processor/*/power under 2.6.12 on a failing system, and please verify that

    # echo 1 > /sys/module/processor/parameters/max_cstate

makes the issue go away. Alternatively, you can boot with "processor.max_cstate=1".
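To compare runs with and without the max_cstate limit, it can help to count the lost-tick messages instead of eyeballing the log. The following is a minimal sketch in Python; the regular expression is modeled only on the single message quoted in this report, so other kernels may print a different format, and the function name is made up for illustration.

```python
import re
from collections import Counter

# Pattern modeled on the message quoted in this report:
#   time.c: Lost 1 timer tick(s)! rip acpi_processor_idle+0x143/0x31d [processor])
LOST_TICK_RE = re.compile(r"Lost (\d+) timer tick\(s\)! rip (\S+)")


def summarize_lost_ticks(lines):
    """Return (total lost ticks, Counter mapping rip location -> lost ticks)."""
    total = 0
    per_site = Counter()
    for line in lines:
        match = LOST_TICK_RE.search(line)
        if match is None:
            continue  # not a lost-tick message
        count = int(match.group(1))
        total += count
        per_site[match.group(2)] += count
    return total, per_site
```

It accepts any iterable of log lines, for example an open file holding saved dmesg output.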
No, this is not an SMP system. I
I complained about this to AMD some time ago (ran into it with my no idle tick work) and they said it was a BIOS bug that happens only on some BIOSes and is even supposed to be fixed. So first I would suggest a BIOS update. C2 state does execute SMM code and that probably does something wrong. Possibly it just needs a DMI blacklist. It's not a generic

Can people seeing this please start attaching dmidecode output?

Also it's a big sledgehammer to disable C2 because it may make the system much hotter. If the delays are not too bad it might be better to disable the message and eat the latency. Do you still see the problem with CONFIG_HZ_250? With CONFIG_HZ_100?

I guess we need some test for gettimeofday accuracy to evaluate this. Perhaps John has something handy.
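A gettimeofday accuracy test along those lines could look roughly like the sketch below: sample the clock under test and a reference clock, wait, sample again, and report the difference. This is only an illustration with a made-up helper name. Note the assumption in the usage comment: time.time and time.monotonic both come from the same kernel timekeeping, so an independent reference (RTC, NTP, another machine) would be needed for a real verdict.

```python
def measure_drift(clock, reference, wait, duration):
    """Return seconds by which `clock` fell behind `reference` (negative: ran ahead).

    clock, reference -- callables returning the current time in seconds
    wait             -- callable(duration) that blocks, e.g. time.sleep
    duration         -- how long to observe, in seconds
    """
    clock_start, ref_start = clock(), reference()
    wait(duration)
    clock_end, ref_end = clock(), reference()
    return (ref_end - ref_start) - (clock_end - clock_start)


# Possible usage (not run here because it blocks for 10 seconds):
#   import time
#   drift = measure_drift(time.time, time.monotonic, time.sleep, 10.0)
#   print(f"drift over 10 s: {drift * 1e3:.3f} ms")
```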
Created attachment 6189 [details] dmidecode The machine is a HP Pavilion k755.de, and the mainboard has MS-7124 Ver. 1.0 printed on it. I couldn
Created attachment 6212 [details] Lost ticks with different HZ values As promised here are the lost ticks messages with different values for HZ, namely 100, 250, and 1000. The kernel was 2.6.12. I
I would like to ask about the status of this bug. Is it possible to get some information from AMD about this, i.e., whether it
The bug report is bogus because the Athlon 64 3500 doesn't even have C2 - only the mobile parts support it. It's probably something else. Venkatesh - feel free to assign it to me since it is clearly not your problem.
Uhm, what? You mean the BIOS is so broken that it reports C2 capabilities to the OS without checking that the processor supports it? Maybe I should open the case and see if there
Ah, never mind - you really seem to have a mobile CPU in a laptop. Those of course have C2. Just the desktop parts don't. Perhaps Mark Langsdorf can comment on the problem if he's still reading email. I would be reluctant to always disable C2 on these CPUs. If not I can ping people at AMD next week and figure out what to do. Again, Venki please assign to me. I can't do that myself unfortunately.
Does the Linux kernel have an interrupt pending routine to handle pending interrupts when going into C2? I've asked some AMD hardware engineers to look at this and see if they can ask more pertinent questions.
The idle function starts with interrupts enabled. Anything pending should be processed then. During the actual C2 and before that (reading bus master activity, reading start time) interrupts are disabled.
I would like to make it explicit that, for testing purposes, I enabled interrupts directly before the code that does an inb to enter C2 and I still lost ticks. So the problem seems to be with the BIOS or the hardware, not with Linux.
Good that you tested that already because I was about to suggest it :) The inl() should cause an SMI calling into the SMM BIOS and that code is known to often do strange and broken things. I guess we have to wait for AMD people to comment.
When the processor sees a C2 Stop Clock, it generates a Stop Grant message to the southbridge (SB). In such a situation, an interrupt that is pending can only be serviced after the processor sees a stop clock deassertion message. This might add latency to the interrupt service and hence cause the warning messages reported by Linux. There are 2 ways to get around this:

1. Interrupt pending message - The AMD64 architecture has implemented an MSR which, when set up by the BIOS, enables the processor to generate an interrupt pending HyperTransport message to the SB when an interrupt is pending and a stop clock is received by the processor. This message is followed by the stop grant message from the processor. The SB is then supposed to generate a stop clock deassert immediately, allowing the processor to service this pending interrupt.

2. Some new SBs do not require the interrupt pending message, as they perceive that an interrupt is pending when a stop clock was sent and hence generate a wakeup event and/or stop clock deassert, allowing the processor to handle this pending interrupt.

I would suggest getting in touch with your hardware manufacturer (HP?) and finding out if any of the above are supported.
Thanks for the information. Unfortunately in practice it's common that it's not possible to do anything about this from the platform side (no new BIOS for old systems etc.) Can you recommend a less intrusive software workaround other than disabling C2? I guess one could implement some logic to disable C2/C3 if a few ticks get lost and see if that changes things.
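The "disable C2/C3 if a few ticks get lost" idea could be sketched as a small policy object like the one below. This is a toy model in Python, not kernel code: the class name, the threshold of 3 and the choice to fall straight back to C1 are assumptions made for illustration.

```python
class CStateLimiter:
    """Toy policy: fall back to C1 once enough timer ticks have been lost."""

    def __init__(self, max_cstate=2, threshold=3):
        self.max_cstate = max_cstate    # deepest C-state currently allowed
        self.threshold = threshold      # lost ticks tolerated before capping (assumed value)
        self.lost_before_cap = 0        # lost ticks seen while deep C-states were allowed
        self.lost_after_cap = 0         # lost ticks seen after capping, to judge the effect

    def report_lost_ticks(self, count):
        """Record `count` lost ticks and return the C-state limit now in force."""
        if self.max_cstate == 1:
            self.lost_after_cap += count
            return self.max_cstate
        self.lost_before_cap += count
        if self.lost_before_cap >= self.threshold:
            self.max_cstate = 1         # stop using C2/C3 from here on
        return self.max_cstate
```

If lost_after_cap keeps growing after the cap, the deeper C-states were probably not the cause on that machine.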
Rahul, the MSR you mentioned is c001_0055h, right? Its value (read through /dev/cpu/0/msr) is 0x33300b0, which means that on a pending interrupt the processor will write 0x33 to I/O port 0xb0; this is indeed this BIOS
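For anyone who wants to repeat the check above on their own box, here is a minimal Python sketch that reads the MSR through /dev/cpu/N/msr and splits the value the way it is interpreted above. The field layout is an assumption inferred from that interpretation (I/O address in bits 15:0, I/O data in bits 23:16), not taken from the BKDG; the remaining upper bits are returned undecoded because their exact positions are not stated in this thread.

```python
import os
import struct

MSR_INT_PENDING = 0xC0010055  # MSR number named in this thread


def read_msr(msr, cpu=0):
    """Read a 64-bit MSR via the Linux msr driver (needs root and the msr module)."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        data = os.pread(fd, 8, msr)  # the file offset selects the MSR number
    finally:
        os.close(fd)
    return struct.unpack("<Q", data)[0]


def decode_int_pending(value):
    """Split the MSR value into address, data and undecoded control bits.

    Assumed layout (inferred from the value discussed in this report):
    bits 15:0 = I/O address, bits 23:16 = I/O data, bits 24 and up = control flags.
    """
    return {
        "io_addr": value & 0xFFFF,
        "io_data": (value >> 16) & 0xFF,
        "control": value >> 24,
    }
```

With the value reported above, decode_int_pending(0x33300b0) gives I/O address 0xb0, data 0x33 and control bits 0x3.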
Keep it open for now - we do workarounds for common BIOS bugs (and I know these lost ticks happen on various machines) Also I would like to understand what your change actually does. I thought the MSR Rahul mentioned was supposed to be written by hardware, not software, but maybe I misunderstood things. Rahul?
The Int Pending MSR is indeed C001_0055h. Anyone with access to the AMD BKDG should be able to get the field definitions as well.

This register enables the processor to send a message to the I/O hub that results in the pending interrupt being serviced. The two message types are the IO space message and the HyperTransport INT_PENDING message. The HyperTransport INT_PENDING message is defined by the HyperTransport 1.05c specification.

If the processor or the I/O hub does not support the INT_PENDING HyperTransport message, the IO space message should be selected by IntPndMsg. A check for a pending interrupt is performed at the end of an IO instruction. If there is a pending interrupt and STPCLK is asserted, the processor executes a byte-size read or write to IO space defined with IORd, IOMsgAddr, and IOMsgData (used only for IO writes) to generate an SMI. The SMI wakes up the processor so the original pending interrupt can be serviced. The SMI handler should not take any action if the SMI is generated by this mechanism. In order to prevent SMI generation with this mechanism in the SMI handler, the IntrPndMsgDis bit should be set in the SMI handler before the first IO instruction is executed, and it should be cleared prior to resuming from SMM.

If the processor and the I/O hub support the INT_PENDING HyperTransport message, it should be selected by the IntPndMsg bit. The check for a pending interrupt is performed when entering the stop grant state.

This MSR is writable by software and can be used to enable both of the above defined methods. The SMI method needs a BIOS SMI handler implementation that supports it, and the Int_Pending HT message needs an SB that is HT spec 1.05c compliant.

Andi, if you want to apply an OS workaround for such problems, one implementation could be to enable int_pending message support in the OS. One way I can think of doing this is to force C2/C3 transitions, enable int_pending messages (SMI method or HT message method), and check if there are lost ticks. The method that does not yield lost ticks can be implemented on that platform. I know I make it sound very simplistic, whereas the implementation might not be that simple.
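The selection procedure suggested above (try each int_pending method while forcing C2/C3 transitions, keep the one that loses no ticks) can be sketched as follows. This is Python pseudologic only: the method labels and both callables are hypothetical stand-ins for platform code that is not shown in this thread.

```python
def pick_int_pending_method(methods, enable, count_lost_ticks):
    """Enable each candidate method in turn; return the first that loses no ticks.

    methods          -- labels to try, e.g. ["ht_int_pending", "smi_io_message"]
                        (hypothetical names for the two methods described above)
    enable           -- callable(label) that programs the platform for that method
                        (hypothetical)
    count_lost_ticks -- callable() that forces C2/C3 transitions for a while and
                        returns the number of lost ticks observed (hypothetical)
    """
    for label in methods:
        enable(label)
        if count_lost_ticks() == 0:
            return label
    return None  # no method helped; the caller needs another fallback
```

A None result would correspond to the case where the only remaining option is limiting the C-states.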
Subject: Out of Office AutoReply: [Bug 5303] AMD64 Erratum: Should not enable C2 when using APIC

I will be on a 2 month sabbatical starting February 2nd, 2006 and not returning until April 12th. I will not be reachable by email or phone while on sabbatical.

If you have technical concerns on general Linux issues, please direct them to David Keck and Jacob Shin (david.keck@amd.com and jacob.shin@amd.com).

If you have an issue dealing with Red Hat, please address it to Bhavana Nagendra (bhavana.nagendra@amd.com).

If you have an issue dealing with Xen, please address it to Tom Woller (thomas.woller@amd.com).

For questions about AMD's strategic relationship with Linux, please email Rich Brunner (richard.brunner@amd.com).

I will deal with selected emails when I return.

-Mark Langsdorf
Linux Validation Tools and Support
AMD, Inc.
Created attachment 7301 [details] Workaround for buggy BIOS (against 2.6.15.4) First take at a workaround. It is enabled by passing an option to the ACPI processor module. This patch tests for the setup of MSR C001_0055h at entering acpi_processor_idle(). I could not make it part of acpi_processor_start() because I don
Revisiting this. I still would prefer to fix this in the BIOS instead of applying this ugly (sorry, Bertro) patch. Mark, does AMD have an opinion on this issue?
Fix in BIOS if at all possible continues to be AMD's position.
I saw this bug is still open. The erratum has been fixed in Opteron processors of revision C3 and later. I take it Bertro had a bad BIOS and a CPU with this erratum. So the fix should be that the BIOS is always perfectly programmed (keep dreaming), and if we want to blacklist then we could even do it on CPU revision to limit max_cstate=1.
Joachim, I'd like to implement the mentioned blacklist. Could you be more specific about the revision? The "Revision Guide for AMD Athlon..." does not mention anything like C3. Could you give family/model/stepping numbers for these "Opteron revision C3"?

(In reply to comment #24)
> I saw this bug is still open.
> The errata has been fixed in processors Opteron revision C3 and later. I take
> it Bertro had a bad BIOS and a CPU with this errata. So fix should be that the
> BIOS is always perfectly programmed (keep dreaming) and if we want to blacklist
> then we could even do it on CPU revision to limit max_cstate=1.
Alexey, sorry for the confusion. I meant revision CG, and the erratum is number 78. The Revision Guide can be found here: http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/25759.pdf Thanks for following up on this.
Created attachment 11748 [details] Set max_cstate for early Opterons Here is a proposed patch to automatically set max_cstate for early Opterons, please check.
The recent timer code disables the APIC timer in some cases. The C-state should only be limited when the APIC timer is actually used. This code is a lot different now than it was in the kernel where this was originally reported. I think the better way is to prefer irq 0 instead of the APIC timer for the single core case on these systems and disable dyntick if C > 1 is available. That's ok because there are no dual core systems with such early steppings (only E+ is dual core) and the multi socket systems don't support anything deeper than C1 anyway.
Subject: Out of Office AutoReply: AMD64 Erratum: Should not enable C2 when using APIC

I will be out of office till 05/18 and will have intermittent email access. Email responses will be delayed.
(In reply to comment #27)
> Created an attachment (id=11748) [details]
> Set max_cstate for early Opterons
>
> Here is a proposed patch to automatically set max_cstate for early Opterons,
> please check.

Alex, any update for this patch?
I will be out of office from 11/02 to 11/09 and will have intermittent email access. Email responses will be delayed. For urgent matters please call me at 9972093530. In my absence, the following people will provide coverage -

CSS - Minesh Parekh - Minesh.Parekh@amd.com

DTE - Neel Subramani - Neelamegam.Subramani@amd.com

Management - Jay Hiremath - Jay.Hiremath@amd.com
I'm going to add this to the local APIC timer disable quirks in 2.6.24-rc.

Thanks,
tglx
Thomas Gleixner wrote:
> I'm going to add this to the local APIC timer disable quirks in 2.6.24-rc

I’m not sure I understand correctly what you intend to do, but what does this mean for the other interrupts? Is erratum 78 specific to the local APIC timer or does it apply to the interrupts coming from the IOAPIC, too (this is how I understand the erratum)?
On Fri, 16 Nov 2007, bugme-daemon@bugzilla.kernel.org wrote:
> ------- Comment #33 from bertro_simul@yahoo.com 2007-11-16 01:43 -------
> Thomas Gleixner wrote:
>
> > I'm going to add this to the local APIC timer disable quirks in 2.6.24-rc
>
> I’m not sure I understand correctly what you intend to do, but what does this
> mean for the other interrupts? Is erratum 78 specific to the local APIC timer
> or does it apply to the interrupts coming from the IOAPIC, too (this is how I
> understand the erratum)?

Oops. Right, I confused this with some other problem. I read the erratum again and I think your patch is correct. I will try to figure out whether we can avoid the extra check in the ACPI code and figure this out right at boot time. I will get this into mainline asap.

Thanks,
tglx
I will be on vacation from November 17th until November 22nd. I will respond to all emails when I return.

In my absence, refer Linux issues to the OSRC at osrc@elbe.amd.com. Personal email should be sent to mlangsdo@io.com.

-Mark Langsdorf
Operating System Research Center
AMD
Thomas, I’ve noticed that a patch made it into 2.6.24-rc4 that cures the disease by killing the patient. Is this the last word on this problem or will we see a patch that provides a workaround for erratum 78 that makes C2 available again?
I will be out of office from 12/03 to 12/10 and will have intermittent email and no cell phone access. Email responses will be delayed. In my absence, the following people will provide coverage -

CSS - Minesh Parekh - Minesh.Parekh@amd.com

DTE - Neel Subramani - Neelamegam.Subramani@amd.com

Management - Jay Hiremath - Jay.Hiremath@amd.com
> I’ve noticed that a patch made it into 2.6.24-rc4
> that cures the disease by killing the patient.
>
> Is this the last word on this problem or will we
> see a patch that provides a workaround for erratum 78
> that makes C2 available again?

There is no known workaround. Sorry.

tglx
a re-worked version of Alexey's patch in comment #27 shipped in linux-2.6.24-rc4. c1c306344669ca40255e36192b101060ffbb1271 (ACPI: Set max_cstate to 1 for early Opterons.) yes, Bertro, this always disables C2 on your box, in favor of fixing the interrupt issue. no, i don't understand Andi's comment #28 either, as i've seen no indication that timers different than the lapic timer are immune from the issue. (if they were, you could have used "nolapic" for a workaround) closed.
I will be on vacation from December 15th 2007 until January 3rd 2008. I will respond to all emails when I return.

In my absence, refer Linux issues to the OSRC at osrc@elbe.amd.com. Personal email should be sent to mlangsdo@io.com.

-Mark Langsdorf
Operating System Research Center
AMD
I don't think that patch was the correct solution. The problem is essentially equivalent to the APIC timer not working in C2, and that is handled by broadcasting, e.g. on Intel. No need to really disable the C states completely. If there are other APIC interrupts they can probably be delayed a bit until the next broadcast.

Alexey, can you expand on why you did it exactly this way?

I think it should be REOPENED, although I unfortunately don't have enough bugzilla rights to do that.
Andi, I referred to #24 as a guide.
I think it would have been better to try broadcasting first and only if that didn't help go to such drastic measures.
Andi, was it you who dismissed the "ugly" broadcast patch at #22?
Yes, but the timer broadcast as currently implemented for Intel platforms in the tree is quite different from that. Still not pretty, but bearable. I also don't like disabling power saving modes automatically in general -- imho power saving is very important. And I hope the broadcast should really already help for this.
AMD recommends disabling power saving to avoid the erratum. The bug is two years old and nobody seems to care to implement anything better than this.
We use timer broadcasting in C2 anyway, but this does not change the problem at all. We cannot avoid a situation like this:

1. CPU goes idle
2. Timer interrupt happens while interrupts are disabled right before we go into C2
3. We wake up only when some other interrupt (keyboard, network, whatever) comes in

This is completely independent of local APIC timer or HPET/PIT broadcast mode. To verify this we can simply back out the quirk (git commit c1c306344669ca40255e36192b101060ffbb1271) and add "noapictimer" to the kernel command line.
You're right, since it applies to all APIC interrupts. I somehow assumed it only applied to APIC timer interrupts, but that was wrong. Objection withdrawn.
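For reference, the failure scenario discussed in this thread (a timer interrupt that becomes pending just before C2 entry and is only serviced at the next wakeup) can be put into numbers with a very simplified model. The Python sketch below assumes the timer keeps firing at the configured HZ while only one interrupt can stay pending, so every further tick during the stall is dropped. The 5 ms stall is an arbitrary example value, not a measurement from this report.

```python
def lost_ticks(stall_us, hz):
    """Ticks lost when one pending timer interrupt is serviced stall_us late.

    Simplified model (assumption): ticks arrive every 1/hz seconds, only one
    can be pending, and the rest that arrive during the stall are lost.
    """
    period_us = 1_000_000 // hz
    return stall_us // period_us


# Same hypothetical 5 ms stall at the three HZ values tested earlier in the thread.
for hz in (100, 250, 1000):
    print(hz, lost_ticks(5000, hz))
```

Under this model the same stall loses 0 ticks at HZ=100, 1 at HZ=250 and 5 at HZ=1000, which is why a lower HZ can hide the messages without removing the added interrupt latency.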