Bug 3927

Summary: AMD64/ATI : timer is running twice as fast as it should
Product: Platform Specific/Hardware
Reporter: Enrico Scholz (enrico.scholz+bugzilla.kernel)
Component: Other
Assignee: /dev/null (devnull)
Status: CLOSED PATCH_ALREADY_AVAILABLE
Severity: normal
CC: 76306.1226, abirjandi, acpi-bugzilla, akpm, bertro_simul, bunk, bwp, cryos, dagarlas, djacobs, g.horvath, greg, hno, hugh, jg, john.byrd, john.stultz, kernel, kernelpanic, kess, khellman, kontakt, leahcim.seddeg, M.Schouten, macer, mariomastro, me, merlin, miguel.martin, mjg59-kernel, raphael, rohlfing, sandeen, simong99, t8m, the3dfxdude, wingc, zwane
Priority: P2
Hardware: i386
OS: Linux
Kernel Version: 2.6.15
Subsystem:
Regression: ---
Bisected commit-id:
Bug Depends on:    
Bug Blocks: 5252    
Attachments: disable_check_timer.patch vs 2.6.13-rc6
dmesg output from patch (debug off)
dmesg from nx6125 with patch applied
patch, /proc/interrupts, and dmesg
Patch that works around the problem
Lost ticks and APIC errors
kernel .config and boot log, /proc/interrupts
Extend existing quirk code to apply fix only to ATI chipsets
PIC, local APIC, I/O APIC, and IDT on Windows XP SP2
New patch
Use PIT based APIC calibration
dmesg (HP nx6125, kernel 2.6.15-rc3 vanilla)
dmesg 2.6.15 rc3 with patch from #73
dmesg 2.6.15-rc3 without timer-routing-1 patch
Boot output from 2.6.15-rc3+timer-routing-1 patch
dmesg 2.6.15-rc3 with patches from #73, #81 and acpi_skip_timer_override
dmesg dual core X2 ATI disable_timer_pin_1 no_timer_check hp a1250n 2.6.15-rc3-git1
2.6.16-rc5 boot.log without disable_timer_pin_1
dmesg output from system described in #149
Full boot log of 2.6.16 and timer running at double speed
dmidecode output for system in #157

Description Enrico Scholz 2004-12-21 14:55:06 UTC
Distribution:
  Fedora Core 3

Hardware Environment:
  AMD Athlon(tm) 64 Processor 3200+
  ATI IXP chipset
  see http://www.tu-chemnitz.de/~ensc/hw/amd64/dmesg.txt
  or dig http://www.tu-chemnitz.de/~ensc/hw/amd64/

Software Environment:
  Fedora Core 3

Problem Description:
  the timer runs twice as fast as expected. E.g. 'sleep 10' exits
  after exactly 5 seconds or 'date && ... wait 10 seconds external
  time ... && date' shows a jump of 20 seconds.
Comment 1 john stultz 2005-01-14 16:13:01 UTC
Does this occur when cpufreq is disabled?
Comment 2 Enrico Scholz 2005-01-17 02:42:42 UTC
yes, it still happens when CONFIG_CPU_FREQ is turned off and/or when I boot into
runlevel 1 (which does not start any cpufreq daemon).
Comment 3 Enrico Scholz 2005-01-17 03:26:20 UTC
'Kernel Version' field seems to be limited in length... so: 

still with 2.6.11-rc1
Comment 4 john stultz 2005-02-02 11:34:13 UTC
Could you check /proc/interrupts to see how frequently you're getting timer
ticks? You should see the timer interrupt count increase 1000 times a second
(measured against your wrist-watch).

CC'ing Andi to make sure he's aware of this.
Comment 5 Enrico Scholz 2005-02-02 11:46:53 UTC
they increase twice as fast (2000 times/sec)
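
For anyone who wants to repeat this measurement, here is a minimal Python sketch of the check requested in comment 4. It assumes the usual /proc/interrupts layout with a line for IRQ 0 ending in "timer"; the function names are made up for illustration. The elapsed time has to come from an external clock (a wrist-watch), because the system clock is the thing under suspicion.

```python
def timer_irq_count(proc_interrupts_text):
    """Return the IRQ 0 'timer' count, summed over all CPU columns."""
    for line in proc_interrupts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "0:" and fields[-1] == "timer":
            # Only the per-CPU counters are purely numeric.
            return sum(int(f) for f in fields[1:] if f.isdigit())
    raise ValueError("no IRQ 0 timer line found")


def ticks_per_second(count_before, count_after, elapsed_seconds):
    """Tick rate over an interval timed with an external clock."""
    return (count_after - count_before) / elapsed_seconds


# Sample line in the format shown elsewhere in this report.
SAMPLE = """           CPU0
  0:    5273574    IO-APIC-edge  timer
  1:      26496    IO-APIC-edge  i8042
"""
print(timer_irq_count(SAMPLE))
print(ticks_per_second(0, 20000, 10))
```

Read /proc/interrupts once, wait ten seconds by the watch, read it again, and pass both counts in. With HZ=1000, a result near 2000 matches the symptom reported here.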
Comment 6 Andi Kleen 2005-02-03 00:14:23 UTC
I once had a similar bug on an i386 multiprocessor machine. The problem in
that case was that the timer interrupt was misconfigured in the APIC and
broadcast to all CPUs, and each CPU did timer processing, which made the
time run NRCPUS times as fast.

But this one must be different since the machine has only a single CPU. 
What does grep time /var/log/boot.log say?

If it says something like

time.c: Using 14.318180 MHz HPET timer.

you are using HPET. If so, try "nohpet".
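
A small sketch of the same check in Python, for people with long boot logs. It only assumes the "time.c: Using ... MHz ... timer" message format quoted above; the function name is hypothetical.

```python
import re

# Matches e.g. "time.c: Using 14.318180 MHz HPET timer."
TIME_SOURCE_RE = re.compile(r"time\.c: Using ([0-9.]+) MHz (\w+) timer")


def time_source(boot_log_text):
    """Return (name, mhz) of the reported time source, or None if absent."""
    match = TIME_SOURCE_RE.search(boot_log_text)
    if match is None:
        return None
    return match.group(2), float(match.group(1))


print(time_source("time.c: Using 14.318180 MHz HPET timer."))
print(time_source("time.c: Using 1.193182 MHz PIT timer."))
```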
Comment 7 Enrico Scholz 2005-02-03 00:27:10 UTC
see http://www.tu-chemnitz.de/~ensc/hw/amd64/dmesg.txt

| time.c: Using 1.193182 MHz PIT timer.


Playing with ACPI and APIC is difficult, as both are required to boot the
machine. The i386 kernel does not work either, as it generates a lot of interrupts.

fwiw, from time to time (especially on CPU or IO intensive tasks??) I see

| APIC error on CPU0: 40(40)

messages.


For now, I worked around the timer problem by ignoring every second timer
interrupt. But this is probably not a very portable solution ;)
Comment 8 Andi Kleen 2005-02-03 00:53:02 UTC
Can you post the full boot log?
Comment 9 Enrico Scholz 2005-02-03 01:23:07 UTC
Is http://www.tu-chemnitz.de/~ensc/hw/amd64/dmesg.txt not enough? If not, how
can I create a better "boot log"?

I will not have access to the machine before the weekend, so I cannot provide new logs atm.
Comment 10 Andi Kleen 2005-02-03 01:29:05 UTC
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 21 low level)
ACPI: IRQ0 used by override.
ACPI: IRQ2 used by override.

Most likely one of these overrides is somehow handled wrong. 
Does "noapic" help? 

Could be an ACPI issue.
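
To compare overrides across the affected machines, the INT_SRC_OVR lines can be pulled out of a dmesg dump with something like the sketch below. It assumes the message format quoted above and reads the last two words as polarity and trigger (a guess based on the "low level" example); the helper names are illustrative only.

```python
import re

OVERRIDE_RE = re.compile(
    r"ACPI: INT_SRC_OVR \(bus (\d+) bus_irq (\d+) global_irq (\d+) (\w+) (\w+)\)"
)


def parse_overrides(dmesg_text):
    """Return (bus_irq, global_irq, polarity, trigger) for each override."""
    return [
        (int(m.group(2)), int(m.group(3)), m.group(4), m.group(5))
        for m in OVERRIDE_RE.finditer(dmesg_text)
    ]


def has_timer_override(overrides):
    """True if IRQ 0 is redirected to a different global interrupt."""
    return any(bus_irq == 0 and gsi != 0 for bus_irq, gsi, _, _ in overrides)


SAMPLE = """ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 21 low level)
"""
print(parse_overrides(SAMPLE))
print(has_timer_override(parse_overrides(SAMPLE)))
```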
Comment 11 Enrico Scholz 2005-02-03 01:40:24 UTC
Unfortunately, machine does not boot with 'noapic' or 'acpi=off' :(


But this kind of message seems to be common on AMD64; e.g. first hit in google
shows http://lists.suse.com/archive/suse-amd64/2004-Jul/0104.html


The only unique message seems to be

| ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 21 low level)

Comment 12 john stultz 2005-04-18 18:23:36 UTC
This looks similar to bugme bug #4442 as well as the lkml thread:
http://www.ussg.iu.edu/hypermail/linux/kernel/0504.0/0270.html

Booting with "noapic" might help. Could the submitter try that out?
Comment 13 john stultz 2005-04-18 18:25:53 UTC
Crud, scratch the noapic suggestion. I forgot that had already been tried.
Comment 14 john stultz 2005-04-18 18:29:09 UTC
Could the submitter try the patch found here:
http://www.ussg.iu.edu/hypermail/linux/kernel/0504.0/1625.html
Comment 15 Enrico Scholz 2005-04-23 09:24:38 UTC
yes, this patch or the alternative in
http://www.ussg.iu.edu/hypermail/linux/kernel/0504.0/1862.html fixes the double
timer frequency. But it disables the NMI watchdog.

For reference, my board is an ATI RX480-SB400 found in an HP Pavilion k737.de
Comment 16 john stultz 2005-07-27 16:05:20 UTC
Is this bug currently reproducible w/ 2.6.12? Does booting with
"no_timer_check" change anything?
Comment 17 john stultz 2005-08-02 13:09:51 UTC
Another similar issue is in bug #3341
Comment 18 john stultz 2005-08-09 11:39:30 UTC
*** Bug 4651 has been marked as a duplicate of this bug. ***
Comment 19 john stultz 2005-08-09 11:43:34 UTC
Also looks similar to bug #4092 (i386) and bug #4442 (x86-64)
Comment 20 john stultz 2005-08-10 11:16:49 UTC
*** Bug 5031 has been marked as a duplicate of this bug. ***
Comment 21 Len Brown 2005-08-10 20:52:23 UTC
this appears to be an interrupt configuration bug
rather than a "timers" bug.
Comment 22 Len Brown 2005-08-10 21:40:49 UTC
Created attachment 5596 [details]
disable_check_timer.patch vs 2.6.13-rc6

Please test this patch to see if it has any effect.
The patch disables the questionable call to check_timer()
when in ACPI+IOAPIC mode.  This patch is not
production ready b/c some NMI gunk is also
(erroneously) in check_timer().
Comment 23 Keith Hellman 2005-08-11 16:36:39 UTC
This patch is working AOK for me so far.

processor	: 0
vendor_id	: AuthenticAMD
cpu family	: 15
model		: 28
model name	: Mobile AMD Sempron(tm) Processor 2800+
stepping	: 0
cpu MHz		: 800.129
cache size	: 256 KB
fdiv_bug	: no
hlt_bug		: no
f00f_bug	: no
coma_bug	: no
fpu		: yes
fpu_exception	: yes
cpuid level	: 1
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext fxsr_opt 3dnowext 3dnow lahf_lm
bogomips	: 1602.80

0000:00:00.0 Host bridge: ATI Technologies Inc: Unknown device 5950 (rev 01)
0000:00:01.0 PCI bridge: ATI Technologies Inc: Unknown device 5a3f
0000:00:13.0 USB Controller: ATI Technologies Inc: Unknown device 4374
0000:00:13.1 USB Controller: ATI Technologies Inc: Unknown device 4375
0000:00:13.2 USB Controller: ATI Technologies Inc: Unknown device 4373
0000:00:14.0 SMBus: ATI Technologies Inc: Unknown device 4372 (rev 11)
0000:00:14.1 IDE interface: ATI Technologies Inc: Unknown device 4376
0000:00:14.3 ISA bridge: ATI Technologies Inc: Unknown device 4377
0000:00:14.4 PCI bridge: ATI Technologies Inc: Unknown device 4371
0000:00:14.5 Multimedia audio controller: ATI Technologies Inc: Unknown device 4370 (rev 02)
0000:00:14.6 Modem: ATI Technologies Inc: Unknown device 4378 (rev 02)
0000:00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 NorthBridge
0000:00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 NorthBridge
0000:00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 NorthBridge
0000:00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 NorthBridge
0000:01:05.0 VGA compatible controller: ATI Technologies Inc: Unknown device 5955
0000:05:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+ (rev 10)
0000:05:09.0 CardBus bridge: Texas Instruments: Unknown device 8031
Comment 24 Andi Kleen 2005-08-11 17:28:01 UTC
Can you show dmesg output with the patch?
Comment 25 Keith Hellman 2005-08-12 06:47:10 UTC
Created attachment 5614 [details]
dmesg output from patch (debug off)

As requested.
Comment 26 Matthew Garrett 2005-08-12 06:55:15 UTC
This patch doesn't seem to work on an HP nx6125 which exhibits these symptoms
(AMD64, ATI chipset). If I apply it, the last line I get from the kernel is 

ACPI: Embedded Controller [C110] (gpe 17)

and then the machine hangs. I'll see if I can sort out a serial port on it and
get a dump of the entire boot. Presumably something in check_timer() is
necessary for this machine to work.

(Attempting to boot with acpi=off or noapic oopses the kernel - I'll try to
track that down at some later stage)
Comment 27 Matthew Garrett 2005-08-12 07:02:16 UTC
After further checking - the patch in
http://www.ussg.iu.edu/hypermail/linux/kernel/0504.0/1625.html works fine.
Skipping check_timer() entirely doesn't.
Comment 28 Andi Kleen 2005-08-12 07:33:13 UTC
Full bootlog for failure with the patch please.
Comment 29 Matthew Garrett 2005-08-15 08:00:24 UTC
Created attachment 5643 [details]
dmesg from nx6125 with patch applied

dmesg output from the failure case supplied. This is with the patch from
comment 22. Boot freezes at this point. The patch from comment 14 works.
Comment 30 Len Brown 2005-08-16 15:04:59 UTC
*** Bug 4092 has been marked as a duplicate of this bug. ***
Comment 31 Len Brown 2005-08-16 15:15:34 UTC
*** Bug 4442 has been marked as a duplicate of this bug. ***
Comment 32 Bertro Simul 2005-08-20 07:35:02 UTC
Keith, your dmesg shows that you turned off the APIC
(with the 
Comment 33 Bertro Simul 2005-08-21 05:08:48 UTC
Created attachment 5705 [details]
patch, /proc/interrupts, and dmesg

I made some random changes and made the following observation.
If I don
Comment 34 Matthew Garrett 2005-08-24 17:10:03 UTC
Created attachment 5755 [details]
Patch that works around the problem

This patch fixes things on my nx6125, and doesn't seem to interfere with any
other code. This surely can't be the correct solution?
Comment 35 john stultz 2005-08-25 12:40:01 UTC
Andi: Any comments on the last patch?
Matthew: Would you consider RFC'ing that patch to lkml to get wider testing and
feedback?
Comment 36 Andi Kleen 2005-08-25 12:44:19 UTC
Hmm, maybe. It looks a bit dubious, but could be it. 
Would need a lot of testing, also on non ATI chipsets. 
 
Comment 37 Bertro Simul 2005-08-27 23:52:17 UTC
Created attachment 5788 [details]
Lost ticks and APIC errors

The patch from comment #34 stops the double timer interrupts
on my machine. I still get messages about lost ticks, although less
than with my patch but from more sources. Furthermore I still receive
occasional APIC errors (APIC error on CPU0: 40(40)), which my patch
stopped.

The lost ticks and the APIC errors may be related because a lost
tick is often (though not always) immediately preceded by an APIC error.
Comment 38 Bertro Simul 2005-08-30 11:20:13 UTC
From the results so far I would guess the following: the timer
is connected to pin 0 of the I/O APIC and the output of the
PIC is connected to pin 2 of the I/O APIC (so that the timer
override is bogus) and to LINT0 of the local APIC; however, somehow
masking LINT0 in check_timer() takes no effect. This setup would
explain all effects I encountered on my machine, namely why:

	- booting without any kernel parameters causes double
	  IRQ 0;
	- disabling IRQ 0 in the PIC stops IRQ 0 altogether;
	- enabling IRQ 1 in the PIC increases IRQ 0 when hitting
	  the keyboard;
	- booting with acpi_skip_timer_override still causes double
	  IRQ 0;
	- and booting with acpi_skip_timer_override and disabling
	  IRQ 0 in the PIC results in normal IRQ 0.

Any idea how to verify or dismiss this hypothesis?
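
One way to at least check the hypothesis for internal consistency is a toy model. The Python sketch below encodes the wiring guessed above (timer on I/O APIC pin 0; PIC output on pin 2 and on LINT0; LINT0 effectively unmaskable) and counts IRQ 0 deliveries per PIT tick. It is purely illustrative: the function and its parameters are invented here, it says nothing about the real hardware, and it leaves out the keyboard (IRQ 1) observation.

```python
def irq0_per_pit_tick(use_override, pic_irq0_enabled):
    """Count IRQ 0 deliveries per PIT tick under the hypothesised wiring.

    use_override: kernel honours the ACPI timer override and listens on
        I/O APIC pin 2; False models booting with acpi_skip_timer_override,
        where it listens on pin 0 instead.
    pic_irq0_enabled: IRQ 0 is unmasked in the legacy PIC.
    """
    timer_fires = 1
    # The PIC only forwards the timer if IRQ 0 is enabled in it.
    pic_output = timer_fires if pic_irq0_enabled else 0
    # Pin 2 carries the PIC output; pin 0 carries the timer directly.
    via_ioapic = pic_output if use_override else timer_fires
    # Masking LINT0 is assumed to have no effect, so it always passes.
    via_lint0 = pic_output
    return via_ioapic + via_lint0


for override in (True, False):
    for pic_on in (True, False):
        print(override, pic_on, irq0_per_pit_tick(override, pic_on))
```

The four combinations give 2, 0, 2 and 1 deliveries per tick, which lines up with the first, second, fourth and fifth observations in the list.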
Comment 39 Erik Andr 2005-09-05 11:03:32 UTC
What about opening the case and track the traces on the mainboard?
Comment 40 Chuck Ebbert 2005-09-06 03:11:22 UTC
I am hitting this problem running an i386 2.6.13 kernel on a Compaq Presario
V2312US.  Adapting the patch from comment #34 to i386 solves it.
Comment 41 Andrew Morton 2005-09-06 03:20:49 UTC
ooh, progress.   Can other reporters please test 
http://bugzilla.kernel.org/attachment.cgi?id=5755&action=view ?
Comment 42 Astrid Ke 2005-09-06 04:13:21 UTC
I'm running an Acer Aspire 5024 with the same problem. The patch
http://bugzilla.kernel.org/attachment.cgi?id=5755&action=view solves it.
Comment 43 Andrew Morton 2005-09-07 02:00:52 UTC
I tried the patch from comment #34 and it caused my EM64T box to hang
partway through booting.
Comment 44 Bertro Simul 2005-09-07 04:32:57 UTC
Hmm, isn
Comment 45 kwall 2005-09-08 16:41:51 UTC
This patch (http://bugme.osdl.org/attachment.cgi?id=5755&action=view) seems to
solve the double-speed system clock over here. I still get the annoying lost
ticks message:

    Losing some ticks... checking if CPU frequency changed.

And the "standard" APIC error (27 lines worth at the moment):

    APIC error on CPU0: 40(40)

I'm not using frequency scaling. dmesg output or whatever else available upon
request. Kernel is vanilla 2.6.13 other than the applied patch.
Comment 46 Aaron M Dulles-Coelho 2005-09-08 17:57:55 UTC
Created attachment 5940 [details]
kernel .config and boot log, /proc/interrupts

I've tried this patch on a Compaq V2000 laptop with AMD Turion 64 / ATI, but to
no avail. I patched Debian's 2.6.12-6 and tried booting this kernel both with
and without acpi_skip_timer_override.

I'm running a Compaq V2000 laptop with AMD Turion 64 and ATI. With this patch,
my timer is still running at double-time. I tried acpi_skip_timer_override as
well, but to no avail.

The attachment is my latest /var/log/boot, /proc/interrupts, and the kernel's
.config
Comment 47 Zwane Mwaikambo 2005-09-08 18:02:38 UTC
How about 'noapic' ?
Comment 48 Aaron M Dulles-Coelho 2005-09-08 18:34:05 UTC
Looks like I didn't do enough testing. It works fine with noapic even 
without that latest of posted patches.

Comment 49 Andi Kleen 2005-09-08 18:41:09 UTC
noapic doesn't work for everyone with this problem - sometimes
it causes the machine not to boot (see also
https://bugzilla.novell.com/show_bug.cgi?id=113323)

I suspect the best course of action is to extend check_ioapic to check
for ATI bridges and then enable the change from comment #34.
Still, the APIC errors that are caused by this are bad, so it's probably
still not a very good workaround.

Best would be to figure out how Windows programs the hardware;
that tends to be the most tested (= reliable) way.
Comment 50 Matthew Garrett 2005-09-08 18:46:31 UTC
Right. The basic problem seems to be that we're getting two timer ticks when we
should be getting one. My patch disables one of these, but produces APIC errors.
Presumably we actually want to be disabling the other one, if someone could work
out where it was coming from.

I have a copy of Windows installed on the test machine I have - is there any
easy way to dump the APIC and legacy PIC state under Windows?
Comment 51 Zwane Mwaikambo 2005-09-08 18:48:29 UTC
I have reports that this behaviour has also been observed in Windows XP. Has
anyone from this bug report observed it on their systems with Windows?
Comment 52 Chuck Ebbert 2005-09-08 20:23:16 UTC
Created attachment 5943 [details]
Extend existing quirk code to apply fix only to ATI chipsets

I attached a patch that selectively applies the only known fix to ATI
chipsets.

And I've only gotten the APIC errors others are seeing one time, before I added


  /sbin/hdparm -a 8 -m 8 -u 1 -d 1 -c 1 /dev/hda

in /etc/rc.local
Comment 53 Andi Kleen 2005-09-08 20:39:50 UTC
It's wrong to make this dependent on CONFIG_ACPI - it should
be independent of ACPI

Also for i386 it would need to be in a code path outside acpi/
Comment 54 Bertro Simul 2005-09-09 05:16:50 UTC
FWIW:
	- Windows XP SP2 on my machine doesn
Comment 55 Bertro Simul 2005-09-09 13:33:28 UTC
Created attachment 5959 [details]
PIC, local APIC, I/O APIC, and IDT on Windows XP SP2

It has been suggested to find out what Windows does on
a double timer machine. You can use Microsoft
Comment 56 Bertro Simul 2005-09-10 08:45:32 UTC
I have to correct my statement in comment #54
about the APIC errors: they go away only if I boot
with acpi_skip_timer_override and disable the
PIC; booting with acpi_skip_timer_override and
disabling the I/O APIC pin does not stop them.
Comment 57 Chuck Ebbert 2005-09-16 11:08:58 UTC
If you are hitting this bug, please go to Bugzilla #3927 and post the output of

   lspci -n -s 00:00.0

If your vendor:product have already been reported, please don't report again.
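
For collecting the replies, a sketch along these lines can extract vendor:product from the lspci -n line and flag ATI host bridges. It assumes the line formats seen in this report (with or without the "Class" word and the "(rev ..)" suffix). 0x1002 is ATI's PCI vendor ID; the function names are hypothetical.

```python
import re

PCI_VENDOR_ID_ATI = 0x1002

# <slot> [Class] <class>: <vendor>:<product> [(rev xx)]
LSPCI_RE = re.compile(
    r"^\S+\s+(?:Class\s+)?[0-9a-fA-F]{4}:\s+([0-9a-fA-F]{4}):([0-9a-fA-F]{4})"
)


def host_bridge_ids(lspci_line):
    """Return (vendor, product) as integers from one 'lspci -n' line."""
    match = LSPCI_RE.match(lspci_line.strip())
    if match is None:
        raise ValueError("unrecognised lspci -n line: %r" % lspci_line)
    return int(match.group(1), 16), int(match.group(2), 16)


def is_ati(lspci_line):
    """True if the device on this line has the ATI vendor ID."""
    return host_bridge_ids(lspci_line)[0] == PCI_VENDOR_ID_ATI


print(host_bridge_ids("00:00.0 Class 0600: 1002:5950"))
print(is_ati("0000:00:00.0 0600: 1002:5950 (rev 10)"))
```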


Comment 58 kwall 2005-09-16 11:30:06 UTC
$ lspci -n -s 00:00.0
00:00.0 Class 0600: 1002:5950
Comment 59 Marcus D. Hanwell 2005-09-16 14:47:36 UTC
If it helps here is my output,  
  
cryos-lap ~ # lspci -n -s 00:00.0  
0000:00:00.0 Class 0600: 1002:5951 (rev 01)  
  
Using no_timer_check fixes this issue, I am also getting APIC error on CPU0: 40
(40) errors. This is an Acer Ferrari 4005 laptop using the turion/ATI chipset 
combination. 
Comment 60 Pascal Bolduc 2005-09-17 05:42:45 UTC
lspci -n -s 00:00.0 --> 00:00.0 Class 0600: 1002:5950

--
[Compaq Presario R4000 series (R4035)]
It has already been reported in AC#58. I just wanted to add that no_timer_check
partially fixes the problem. The timer is now ticking all right but then I get
this message at boot :

..MP-BIOS bug: 8254 timer not connected to IO-APIC
 failed.
timer doesn't work through the IO-APIC - disabling NMI Watchdog!
Uhhuh. NMI received for unknown reason 31.
Dazed and confused, but trying to continue
Do you have a strange power saving mode enabled?
 works.
Using local APIC timer interrupts.
Detected 12.464 MHz APIC timer.
testing NMI watchdog ... CPU#0: NMI appears to be stuck (1->1)!
--

With nmi_watchdog=0 as kernel param, it says :

..MP-BIOS bug: 8254 timer not connected to IO-APIC
 failed.
 works.
Using local APIC timer interrupts.
Detected 12.464 MHz APIC timer.
testing NMI watchdog ... CPU#0: NMI appears to be stuck (0->0)!
--

Finally, I just wanted to report that the DSDT has many errors too
(disassembling - recompiling with iasl gives 25 errors and 3 warnings). Perhaps
the bug feeds on this. I'm trying to work around them now. I can post my DSDT
and/or my recompile output if anyone asks.
Comment 61 Stan T 2005-09-17 07:24:28 UTC
[root@]# lspci -s 00:00.0 -n  
00:00.0 Class 0600: 1002:5950 (rev 01)  
  
[root@]# lspci -s 00:00.0 -vvx  
00:00.0 Host bridge: ATI Technologies Inc: Unknown device 5950 (rev 01)  
        Subsystem: Hewlett-Packard Company: Unknown device 2a20  
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-  
Stepping- SERR- FastB2B-  
        Status: Cap- 66Mhz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort-  
<TAbort- <MAbort+ >SERR- <PERR-  
        Latency: 64  
        Region 2: I/O ports at 4100 [disabled] [size=32]  
        Region 3: Memory at <ignored> (64-bit, non-prefetchable) [size=512M]  
00: 02 10 50 59 06 00 20 22 01 00 00 06 00 40 00 00  
10: 00 00 00 00 00 00 00 00 01 41 00 00 04 00 00 e0  
20: 00 00 00 00 00 00 00 00 00 00 00 00 3c 10 20 2a  
30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
  
Yesterday I could not even get an audio cd to play, but when I did before, I 
think it was playing at high speed. Could the timer problem be causing other 
problems too? 
HP Pavilion a1130n - 3500+          Mandriva LE2005 kernel 2.6.11-6mdk 
Stan 
Comment 62 Daniel Drake 2005-09-17 10:14:02 UTC
Downstream bug report: http://bugs.gentoo.org/show_bug.cgi?id=104789
Comment 63 Miguel Martin Mateo 2005-09-18 04:54:05 UTC
# lspci -n -s 00:00.0
0000:00:00.0 Class 0600: 1002:5950 (rev 01)

# lspci -s 00:00.0 -vvx
0000:00:00.0 Host bridge: ATI Technologies Inc: Unknown device 5950 (rev 01)
        Subsystem: Hewlett-Packard Company: Unknown device 2a20
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
Stepping- SERR- FastB2B-
        Status: Cap- 66Mhz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort-
<TAbort- <MAbort+ >SERR- <PERR-
        Latency: 64
        Region 2: I/O ports at 4100 [disabled] [size=32]
        Region 3: Memory at <ignored> (64-bit, non-prefetchable)
00: 02 10 50 59 06 00 20 22 01 00 00 06 00 40 00 00
10: 00 00 00 00 00 00 00 00 01 41 00 00 04 00 00 e0
20: 00 00 00 00 00 00 00 00 00 00 00 00 3c 10 20 2a
30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

Comment 64 Miguel Martin Mateo 2005-09-18 04:58:01 UTC
*** Bug 5252 has been marked as a duplicate of this bug. ***
Comment 65 Matthew Garrett 2005-09-20 14:36:09 UTC
Created attachment 6061 [details]
New patch

How about this? It only works in the amd64 case, but should only trigger on the
affected machines.

(I've removed the timer override, too - it seems to be bogus, and on the nx6125
has the side effect that the ACPI thermal trip values are all 16 degrees C.
Thanks, HP)
Comment 66 Miguel Martin Mateo 2005-10-02 11:18:37 UTC
I am testing kernel 2.6.13.2 from SUSE Kernel of the day in SUSE 9.3

altair:~ # uname -a
Linux altair 2.6.13.2-20050928132226-default #1 Wed Sep 28 13:22:26 UTC 2005
x86_64 x86_64 x86_64 GNU/Linux

The problem with the clock has disappeared.

altair:~ # cat /proc/interrupts
           CPU0
  0:    5273574    IO-APIC-edge  timer
  1:      26496    IO-APIC-edge  i8042
  8:      37612    IO-APIC-edge  rtc
 12:     314691    IO-APIC-edge  i8042
 15:     378468    IO-APIC-edge  ide1
169:          3   IO-APIC-level  acpi, ohci1394
193:      52556   IO-APIC-level  libata
209:    1499114   IO-APIC-level  eth0
217:    1692938   IO-APIC-level  ehci_hcd:usb1, ohci_hcd:usb2, ohci_hcd:usb3
225:    1848465   IO-APIC-level  nvidia
233:       6510   IO-APIC-level  ATI IXP
NMI:       1369
LOC:    5273779
ERR:         37
MIS:          0

But there are still "Lost ticks and APIC errors"

Oct  2 14:38:55 altair kernel: APIC error on CPU0: 00(40)
Oct  2 14:42:38 altair kernel: APIC error on CPU0: 40(40)
Oct  2 14:46:50 altair kernel: APIC error on CPU0: 40(40)
Oct  2 14:46:50 altair kernel: Losing some ticks... checking if CPU frequency
changed.
Oct  2 14:50:02 altair kernel: APIC error on CPU0: 40(40)
Oct  2 14:53:16 altair kernel: APIC error on CPU0: 40(40)

(powersave is disabled)
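
To quantify how often these messages show up, and whether a lost-ticks message directly follows an APIC error (as suggested in comment #37), a log can be summarised with a sketch like the one below. It assumes the two message formats quoted in this report and that each message sits on one line; the function name is made up.

```python
import re

APIC_ERROR_RE = re.compile(r"APIC error on CPU\d+: [0-9a-f]{2}\([0-9a-f]{2}\)")
LOST_TICKS = "Losing some ticks"


def summarize(log_text):
    """Return (apic_errors, lost_ticks, lost_ticks_right_after_error)."""
    apic_errors = lost_ticks = lost_after_error = 0
    previous_was_error = False
    for line in log_text.splitlines():
        if APIC_ERROR_RE.search(line):
            apic_errors += 1
            previous_was_error = True
        elif LOST_TICKS in line:
            lost_ticks += 1
            if previous_was_error:
                lost_after_error += 1
            previous_was_error = False
        else:
            previous_was_error = False
    return apic_errors, lost_ticks, lost_after_error


SAMPLE = """Oct  2 14:38:55 altair kernel: APIC error on CPU0: 00(40)
Oct  2 14:42:38 altair kernel: APIC error on CPU0: 40(40)
Oct  2 14:46:50 altair kernel: APIC error on CPU0: 40(40)
Oct  2 14:46:50 altair kernel: Losing some ticks... checking if CPU frequency changed.
Oct  2 14:50:02 altair kernel: APIC error on CPU0: 40(40)
Oct  2 14:53:16 altair kernel: APIC error on CPU0: 40(40)
"""
print(summarize(SAMPLE))
```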
Comment 67 JG 2005-10-09 04:09:01 UTC
hi,    
    
i've been trying to get this clock too fast prob solved with my acer ferrari 
4005 for quite some time now 
(http://marc.theaimsgroup.com/?l=linux-kernel&m=112835858711786&w=2). 
 
it seems to be working fine with this patch without any further kernel 
parameter (kernel 2.6.13.3): 
http://bugzilla.kernel.org/attachment.cgi?id=5943&action=view 
i also tried patch id=6061 and it didn't work here, because it seems only to 
apply to x86_64. 
 
JG 
 
Comment 68 Simon Goble 2005-10-11 14:04:53 UTC
The patch in attachment 6061 [details] (msg #65) applied to linux-2.6.13-1.1526_FC4 fixes
the double-speed clock problem for a Gateway MX7515 running in x86_64.
It also fixes a tapping speed problem with the Synaptics touchpad which made it
difficult to select text, etc. Many thanks Matthew.
Comment 69 James Pattinson 2005-10-18 07:24:18 UTC
Hi All

I get the same error. The patch in attachment 6061 doesn't work for me, but if I
initialize disable_timer_pin_1 to 1, it fixes the problem. I guess this means my
chipset isn't being detected as an ATI in the code.

I really don't understand why, as my chipset's PCI ID is:

0000:00:00.0 Class 0600: 1002:5950 (rev 01)

This does map to PCI_VENDOR_ID_ATI.

FYI I'm using 2.6.13 (amd64) on an Optronix K9A200G-MLF board and an Athlon 64.

Is this issue still being looked at for inclusion into the kernel?

James
Comment 70 James Pattinson 2005-10-18 07:47:25 UTC
Additional update: The patch now works for me. I had apic=debug in my kernel
command line, which sets ioapic_force = 1 in setup.c and therefore
bypasses the PCI bus walk.

James
Comment 71 Richard Mace 2005-11-14 01:20:13 UTC
Just some extra information for 2.6.14 kernels. The boot parameter,
disable_timer_pin_1, seems to work around this problem nicely in 2.6.14 series
kernels.
Comment 72 Malte Marwedel 2005-11-20 02:54:31 UTC
I agree with what Richard Mace wrote.
On my machine

$ lspci -n -s 00:00.0
0000:00:00.0 0600: 1002:5950 (rev 10)

the double timer problem is solved by adding "disable_timer_pin_1" as boot
parameter when using the 2.6.14.1 kernel.
In contrast, the "disable_timer_pin_1" parameter does not help when using the
2.6.12 kernel (on the same machine).
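
When reporting results, it helps to confirm which parameters the running kernel was actually booted with. A minimal sketch, assuming the command line is read from /proc/cmdline; the helper name is hypothetical.

```python
def has_boot_param(cmdline, name):
    """True if 'name' or 'name=value' appears as a whole word in cmdline."""
    for token in cmdline.split():
        if token == name or token.startswith(name + "="):
            return True
    return False


# On a live system: cmdline = open("/proc/cmdline").read()
cmdline = "ro root=/dev/hda1 disable_timer_pin_1 nmi_watchdog=0"
print(has_boot_param(cmdline, "disable_timer_pin_1"))
print(has_boot_param(cmdline, "no_timer_check"))
```

Matching whole tokens avoids false positives such as "noapic" matching "noapictimer".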
Comment 73 Andi Kleen 2005-11-26 06:25:03 UTC
The patch at http://www.firstfloor.org/~andi/timer-routing-1
for 2.6.15rc2 should fix it. Please test.

Comment 74 Matthew Garrett 2005-11-26 06:34:52 UTC
This chipset can also be found on 32-bit systems (some Semprons ship with it),
and the problem is also exhibited there. Your patch only seems to touch the
x86_64 code?
Comment 75 Jason Dewayne Clinton 2005-11-26 09:19:22 UTC
I also have a 32-bit Sempron affected by this bug. Please reopen. 
Comment 76 Andi Kleen 2005-11-26 10:11:50 UTC
On 32bit just don't use APIC. You don't need it there.
Comment 77 Matthew Garrett 2005-11-26 10:33:02 UTC
There are machines shipping which seem to require apic support on amd64, and
there are vendors shipping basically identical hardware with either an opteron
or a sempron on-board. Shipping distribution kernels without apic support isn't
a terribly appealing option, since there's a fairly good chance at least some of
these machines will be broken.
Comment 78 Jason Dewayne Clinton 2005-11-26 12:12:33 UTC
Hum... I find that hard to believe because the Wireless doesn't work and the 
computer will not reboot or powerdown when APIC is disabled. (Yes, I'm not 
getting ACPI and APIC confused.) 
Comment 79 Andi Kleen 2005-11-26 12:43:48 UTC
The major distros (RH, SUSE) ship the default kernel with apic off.
This has always been the case. I would be surprised if any distro
did it differently, because many older 32bit systems and laptops are extremely
unhappy with APIC on.

Anyways - if you read my l-k email it would be possible to port this
fix over to i386 by trying to detect at runtime if the machine
is ACPI compliant (e.g. by looking for ACPI tables) and if yes
do the changes I did with runtime switches.

It's somewhere on my todo list but very low because 64bit is my priority.
But it doesn't belong in this bug because it's marked "Other architectures"
and that doesn't cover i386.

I would appreciate if 64bit users (for which this bug really is,
you others are just piggybackers ;-) could report success or failure
with this patch though.
Comment 80 Matthew Garrett 2005-11-26 15:03:46 UTC
The problems with apics and 32-bit systems were largely resolved around 2.6.9
when the kernel stopped trying to enable apics even if the BIOS didn't flag
them. We (Ubuntu) have had no significant problems shipping with it enabled.
Comment 81 Andi Kleen 2005-11-26 16:51:27 UTC
Created attachment 6689 [details]
Use PIT based APIC calibration

Does the following patch (against 2.6.15rc2) help?

It changes the APIC calibration to use the PIT instead of the TSC
as reference.
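
For readers unfamiliar with the idea, the arithmetic behind calibrating one timer against the PIT can be sketched as below. This is not the attached patch, only the general formula: count how far the APIC timer advances while the PIT advances a known number of counts, then scale by the PIT input clock (1.193182 MHz, as printed by time.c earlier in this report). The function name and the example numbers are invented for illustration.

```python
PIT_HZ = 1193182  # PIT input clock, 1.193182 MHz


def apic_timer_hz(apic_ticks, pit_counts):
    """Estimate the APIC timer frequency from a PIT-timed interval.

    apic_ticks: APIC timer ticks observed during the interval.
    pit_counts: PIT counts elapsed during the same interval.
    """
    if pit_counts <= 0:
        raise ValueError("pit_counts must be positive")
    return apic_ticks * PIT_HZ / pit_counts


# Hypothetical reading: 124360 APIC ticks over 11932 PIT counts (about 10 ms).
print(apic_timer_hz(124360, 11932) / 1e6, "MHz")
```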
Comment 82 Edgar Villanueva 2005-11-27 20:05:49 UTC
I'm having the same issues with a dual core x2 box I've acquired. I'm having
trouble with the box and 2.6.15-rc2. IDE hangs.

Are these patches applicable to 2.6.14?  Specifically the fedora core 4 2.6.14
version of the kernel.

I can give them a shot with that kernel. 

Comment 83 Gregor Horvath 2005-11-27 22:43:17 UTC
I am having the same problem with an HP nx6152, AMD64/ATI, with a Debian stock
Sarge 2.6.8-2-386 kernel. I also have "APIC error on CPU0: 40(40)" errors. The
problem does not exist with a Debian Sarge stock 2.4.27-2-386 kernel. But with
the 2.4, ACPI is not working at all although it is activated in .config (no
/proc/acpi dir). Also with the 2.4 kernel there are sometimes "spurious 8259A
interrupt: IRQ7." messages in dmesg.
no_timer_check did not help. With noapic or acpi=off the machine is not booting.

Hope this helps a bit.
Comment 84 Andi Kleen 2005-11-28 03:51:06 UTC
No, it doesn't help to know what does or doesn't happen with ancient non-standard
kernels. Please test the patches I am posting, otherwise you're just wasting time
on this bugzilla.

Comment 85 Richard Mace 2005-11-29 06:05:25 UTC
RESULTS OF PATCH (at http://www.firstfloor.org/~andi/timer-routing-1) APPLIED TO
KERNEL 2.6.15-rc3

HARDWARE: HP nx6125, AMD Turion 64 ML-34, ATI Radeon express 200M chipset

OUTCOME: Failure. Machine hangs on boot at ==> Floppy drive(s): fd0 is 1.44M

(I don't know if this is as a result of using 2.6.15-rc3 or the above mentioned
patch. All patches applied cleanly.) 
Comment 86 kwall 2005-11-29 19:02:11 UTC
The patch in http://bugzilla.kernel.org/show_bug.cgi?id=3927#c73, cleanly
applied to 2.6.15-rc2, hung my machine. The final line of output (all I had the
willingness to hand-copy), was 

ohci_hcd 0000:00:13.0: irq 177, io mem 0xfe02d000

As it is 2.6.14.3 with the no_timer_check boots and the system clock runs at the
proper speed. I get these messages, though:

CPU: AMD Athlon(tm) 64 Processor 3200+ stepping 00
..MP-BIOS bug: 8254 timer not connected to IO-APIC
 failed.
timer doesn't work through the IO-APIC - disabling NMI Watchdog!
Uhhuh. NMI received for unknown reason 2d.
Dazed and confused, but trying to continue
Do you have a strange power saving mode enabled?
 works.
Using local APIC timer interrupts.
Detected 12.436 MHz APIC timer.
testing NMI watchdog ... CPU#0: NMI appears to be stuck (1->1)!
softlockup thread 0 started up.
NET: Registered protocol family 16
ACPI: bus type pci registered
PCI: Using configuration type 1
PCI: Using MMCONFIG at e0000000
ACPI: Subsystem revision 20050902
    ACPI-0339: *** Error: Looking up [\_SB_.PCI0.LPC0.LNK0] in namespace,
AE_NOT_FOUND
search_node ffff81001fec5480 start_node ffff81001fec5480 return_node 00000000000
00000
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (0000:00)
PCI: Probing PCI hardware (bus 00)
ACPI: Assume root bridge [\_SB_.PCI0] bus is 0
PCI: Ignoring BAR0-3 of IDE controller 0000:00:14.1
Comment 87 Andi Kleen 2005-11-29 19:23:15 UTC
Re #85 - can you please test if vanilla rc3 works (iirc rc3 has some USB
problem that causes breakage). If it doesn't then it's not my patch.
There is a USB patch floating around on linux-kernel that might help,
or wait for tomorrow and use an up-to-date -git.

#86 - same please. Test if the kernel without the patch works as a 
baseline.

And did you guys test the patch from #81 in addition to the timer routing
patch?

Also I wonder if you guys have the chipset with the miscalibrated
PIT - if yes the patches from
http://bugzilla.kernel.org/show_bug.cgi?id=3341
might help (they're only experimental, and may require backing the other
patches out)

All - if none of that helps and you still have problems, could you try to shrink
the console font and take a digital photo after the failure? It's hard to debug
this without a fuller log. Logs please with timer-routing and the #81 patch
applied.

Thanks for your support.
Comment 88 Bertro Simul 2005-11-29 23:53:53 UTC
If the machine hangs during boot, try the acpi_skip_timer_override kernel
command line parameter.
Comment 89 Richard Mace 2005-11-30 00:33:15 UTC
Quick report on Bertro's suggestion #88 (quickest to implement ;-).

BOOT param: acpi_skip_timer_override

HARDWARE: HP nx6125, Turion ML-34, ATI chipset

GOOD NEWS: Got 2.6.15-rc3 to boot (glacially slowly)

BAD NEWS: All thermal trips set to 16C. CPU got frequency scaled to 800MHz.
(Made a cup of tea during boot,... seriously!). Fans blew like the solar wind...

My naive interpretation: Perhaps the timer override defeats what Andi's patch is
trying to do, but I'll leave that for Andi to comment on. For reference, I have
been able to work around this timer problem with the boot parameter
"disable_timer_pin_1". That seems to be the best option for the nx6125, i.e., the
one with the smallest number of side effects.

I'll try some of the other suggestions later, when time permits....
Comment 90 Richard Mace 2005-11-30 02:49:41 UTC
Created attachment 6719 [details]
dmesg (HP nx6125, kernel 2.6.15-rc3 vanilla)

Andi, as requested, a boot with vanilla kernel 2.6.15-rc3. I have used the boot
parameter "disable_timer_pin_1", which works well for me on all kernels >=
2.6.14. Machine boots perfectly, fans working correctly, thermal trips set
correctly. I've attached a dmesg, as requested.
Comment 91 Matthew Garrett 2005-11-30 06:52:38 UTC
Thermal trips getting set to 16 degrees is a BIOS bug on the nx6125.
Comment 92 Richard Mace 2005-11-30 06:58:06 UTC
Well, booting with "disable_timer_pin_1" sets my thermal trips correctly *and*
it keeps my timer running correctly, so it does something right. The HP nx6125
also suffers from bug #5534, which (I'll stick my neck out here) is possibly
related to this one.
Comment 93 Matthew Garrett 2005-11-30 07:03:24 UTC
I'm not convinced - you see similar bugs to 5534 (to varying extents) on most
recent HPs.
Comment 94 Bertro Simul 2005-11-30 11:47:51 UTC
I tried the patch from comment #73 with 2.6.15-rc3 (but not
the PIT-based APIC calibration) and everything worked smoothly
for me: the machine boots normally and the timer runs at normal
speed. No there were no 
Comment 95 Bertro Simul 2005-11-30 11:50:21 UTC
Created attachment 6724 [details]
dmesg 2.6.15 rc3 with patch from #73
Comment 96 kwall 2005-11-30 19:52:09 UTC
Per Andi's request in #87, vanilla 2.6.15-rc2 and 2.6.15-rc3 *without* the
timer-routing-1 patch (#73) both boot. Without no_timer_check, the clock runs
twice as fast. With no_timer_check, the clock runs normally. Booting either -rc2
or -rc3 *with* the timer-routing-1 patch (#73) hangs the box.

I've attached a boot log from 2.6.15-rc3 without the no_timer_check option and
without timer-routing-1.

I've also attached a digital image (sorry if that's bad manners) of the last
screenful of output from booting 2.6.15-rc3 _with_ timer-routing-1 applied.

I haven't tried the patch in #81 yet. That's next. Does it apply on top of #73
or in lieu of it?
Comment 97 kwall 2005-11-30 19:53:57 UTC
Created attachment 6728 [details]
dmesg 2.6.15-rc3 without timer-routing-1 patch
Comment 98 kwall 2005-11-30 19:56:08 UTC
Created attachment 6729 [details]
Boot output from 2.6.15-rc3+timer-routing-1 patch

This is only the final screenful. Screen wouldn't scroll back.
Comment 99 kwall 2005-11-30 20:41:20 UTC
An update to http://bugzilla.kernel.org/show_bug.cgi?id=3927#c96. 2.6.15-rc3
with the patches from http://bugzilla.kernel.org/show_bug.cgi?id=3927#c73 and
http://bugzilla.kernel.org/show_bug.cgi?id=3927#c81 still hangs the machine at
the same point. However, with acpi_skip_timer_override, the same kernel
_doesn't_ hang and the clock runs correctly. I've attached the boot log.
Comment 100 kwall 2005-11-30 20:43:11 UTC
Created attachment 6730 [details]
dmesg 2.6.15-rc3 with patches from #73, #81 and acpi_skip_timer_override
Comment 101 Edgar Villanueva 2005-12-01 07:11:39 UTC
Created attachment 6736 [details]
dmesg dual core X2 ATI disable_timer_pin_1 no_timer_check hp a1250n 2.6.15-rc3-git1

This is a dmesg file for a HP a1250n dual core AMD X2. kernel 2.6.15-rc3-git1
with minimal APIC error messages.
Comment 102 Edgar Villanueva 2005-12-01 07:22:29 UTC
I've just finished doing some overnight testing with 2.6.15-rc3-git1 and the
results after 12 hours are pretty good.  I did not include any of the patches in
this bug.

Running with disable_timer_pin_1 and no_timer_check seems to be working
for me.  It also cleared up the constant APIC errors I was having, specifically
the following:
APIC error on CPU0: 00(40)
APIC error on CPU1: 00(40)
APIC error on CPU0: 40(40)
APIC error on CPU1: 40(40)
APIC error on CPU1: 40(40)
APIC error on CPU1: 40(40)
APIC error on CPU1: 40(40)
APIC error on CPU0: 40(40)

Without no_timer_check (still with disable_timer_pin_1) I get thousands of these
errors. Adding no_timer_check to disable_timer_pin_1, I only get a few APIC
errors: fewer than 50 in over 12 hours.

I have attached my dmesg here
http://bugzilla.kernel.org/attachment.cgi?id=6736&action=view

Does anyone have a recommended way of verifying the timer is running correctly?
Sorry for the noise. This is my first bug report.
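
One rough way to check, sketched here and not taken from any patch in this bug, is to compare the system clock against a clock that does not depend on the timer interrupt (the hardware clock, a wristwatch, another machine) over a fixed interval. The function and class names below are made up for illustration, and the demo uses simulated clocks so it runs without any hardware.

```python
import time


def clock_rate_ratio(read_reference, interval=10.0,
                     read_system=time.time, sleep=time.sleep):
    """Return system-clock seconds that elapse per reference-clock second.

    read_reference: callable returning seconds from an independent clock.
    A result near 1.0 means the two clocks agree; near 2.0 means the
    system clock is running at double speed relative to the reference.
    """
    sys_start, ref_start = read_system(), read_reference()
    sleep(interval)
    sys_end, ref_end = read_system(), read_reference()
    return (sys_end - sys_start) / (ref_end - ref_start)


# Demo with simulated clocks (no hardware needed): the fake system clock
# advances `speedup` seconds for every reference second.
class FakeClocks:
    def __init__(self, speedup):
        self.ref = 0.0
        self.sys = 0.0
        self.speedup = speedup

    def sleep(self, seconds):
        self.ref += seconds
        self.sys += seconds * self.speedup


fake = FakeClocks(speedup=2.0)
ratio = clock_rate_ratio(lambda: fake.ref, interval=5.0,
                         read_system=lambda: fake.sys, sleep=fake.sleep)
print(f"system/reference ratio: {ratio:.2f}")
```

On a real machine, read_reference would have to come from something independent of the kernel tick, and a coarse reference (one-second resolution, say) needs a long interval to give a meaningful ratio.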
Comment 103 Greg Kroah-Hartman 2006-01-04 14:58:14 UTC
Andi, what's the status of this bug?

I have this same symptom and am running an AMD64 chip in 32-bit mode, and 2.6.15
(and all older kernels: 2.6.13, 2.6.14, etc.) also shows this.

Which patch of the many attached to this bug should I try?
Comment 104 Jo 2006-01-05 02:28:56 UTC
I thought it was going to be fixed in 2.6.15 ?

At the moment (Acer 4005WLMi in 2.6.14 32-bit), I have "disable_timer_pin_1" in
the kernel command line, which resolves all problems and works beautifully so far.

The only special thing I did, to suppress the occasional (and harmless)
warnings shown in comment #102, was to add a line to apic.c that makes this
printk dependent on the following condition:

if ((v & ~0x40) || (v1 != 0x40)) // ignore: 00(40), 40(40)

.. and that's about it :-)
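
To make the effect of that condition concrete, here is a small Python model of it. This is only a sketch: the real change is the C line above, and the function name is made up for illustration. `v` and `v1` are assumed to be the two hex values printed as `v(v1)` in the "APIC error on CPUn" message.

```python
def should_log_apic_error(v: int, v1: int) -> bool:
    """Model of the C condition `(v & ~0x40) || (v1 != 0x40)`.

    Returns True when the message would still be printed.
    """
    return bool(v & ~0x40) or v1 != 0x40


# The two messages quoted in this bug are filtered out ...
print(should_log_apic_error(0x00, 0x40))  # 00(40)
print(should_log_apic_error(0x40, 0x40))  # 40(40)
# ... while any other bit set in either value is still logged.
print(should_log_apic_error(0x02, 0x40))
print(should_log_apic_error(0x40, 0x42))
```

So the filter hides only the two bit-0x40 patterns seen here and leaves every other APIC error visible.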
Comment 105 Bertro Simul 2006-01-09 06:17:45 UTC
> The only special thing I did, for suppressing the occasional (and
> harmless)...

What makes you think this is harmless?
Comment 106 Jo 2006-01-13 06:54:12 UTC
Right, I should have said: seems harmless __so far__, and caused no visible
problems on my laptop. I have read it's displayed when a misconfigured/spurious
IRQ triggers -- so of course, if there's a patch to try, I'm willing to test it
and give feedback :-)
Comment 107 artimess 2006-01-13 07:07:59 UTC
Could I please ask one of our kernel experts to explain what the problem really
is, and why it occurs only with certain ATI cards? I posted this error in
Feb/March of 2005 and we are still having this problem. No offense intended: I
do not want any of the good people who put so much effort into helping us to be
offended. I just want to understand the complication that has us still waiting
for a fix. It seems to me we are doing lots of trial and error rather than
finding the root cause of the issue and solving it. Perhaps there must be a bug
fix from ATI and it has nothing to do with the kernel?
Again, can someone explain to me why this happens and why it is so difficult to
resolve? I have a computer science degree and I can digest the technical
explanation if you wish.

Thanks,
Artimess
Comment 108 Jesse Allen 2006-01-13 18:58:01 UTC
Hi,
I have discovered a BIOS update for my laptop:
http://h10025.www1.hp.com/ewfrf/wc/softwareDownloadIndex?lc=en&lang=en&cc=us&os=228&product=1130607&dlc=en&softwareitem=ob-37062-1
It specifically mentions fixing the clock to not run two times faster than it
should with APIC under linux.  I have tried it and it does fix it.  This
indicates that this can be fixed by a BIOS release from the manufacturer.  So this
is not a linux bug.  If anyone wants to try to find out the real fix so we can
add a quirk for this hardware for everyone else, I can try to help.  Though ATI
or whoever provides BIOS builds for this hardware should know how to do it already.
Comment 109 Jesse Allen 2006-01-13 19:06:09 UTC
Warning:  If you appended "disable_timer_pin_1" as a kernel parameter and can
flash your BIOS with a fix like one from HP/Compaq, remove it before flashing. 
If you boot with the option after flashing, the kernel will always hang after
loading the scheduler.  Luckily, you don't need it anymore, anyways.

Note:  Compaq only provides support for BIOS flashes for my model through
windows .....
Comment 110 artimess 2006-01-14 00:08:49 UTC
No, the patch that HP/Compaq provided does not do the job completely!  You
still get the famous APIC 40 error, although much less often than before.  I
noticed it has some other side effects, so I ended up returning to the older
BIOS.  I am sorry I did not make a note of them; if I remember, I will post them
later.
My observation is that the error occurs much less often under SUSE than under
other Linux distributions.  By others I mean Gentoo and Kanotix, which I tried;
I am waiting for the next release of Fedora to test it on that too.

With Kanotix (Debian) without the BIOS patch, not only do you get all the errors
and skewed time, but the keyboard also functions badly and you get repetition of
the keys, which is very annoying.
I am wondering if there might be another way of fixing this issue by correcting
its DSDT rather than messing with the kernel, but then again I am not an expert
in either of them.
Comment 111 Jesse Allen 2006-01-14 09:51:41 UTC
True, it doesn't fix "APIC error on CPU 0:(40)"; maybe that's another bug?  But I
haven't even seen any side effects.  What does it mean?  I didn't have any bad
side effects with the BIOS flash either.
Comment 112 Anthony Durity 2006-01-17 11:29:07 UTC
Hi, not sure if I should post here or open a new bug report. I decided here
might be best. I have an HP Compaq r4000 (r4218ea to be exact). I have flashed
my BIOS to version F1.B (using win expee, which I installed especially for
doing that and nuked right after). It is an AMD64 ATI IXP system. I am using
the Ubuntu Dapper amd-k8 2.6.15.12 kernel at the mo.

So - when I boot with no parameters the machine gets as far as the acpi _STA
_INI ......... stuff and hangs. If I pass apic=debug it boots (no idea why this
should affect the acpi scanning stuff). So far so weird. So the machine now
boots, and if I pass report_lost_ticks I get loads of messages saying I am
losing ticks. This may be normal, so I do not pass that option (I only tried it
because I saw a post saying this might help things). Besides that, I am getting
these lovely messages at quite frequent intervals in dmesg
APIC error on CPU0: 00(40)
APIC error on CPU0: 40(40)
which I see are at KERN_DEBUG level. Should this not be KERN_ERR, seeing as how
it's an error? If it truly is a debug statement, should it not be printk'd only
when in apic debug mode?
Any way I can help, let me know! If you need any data at all let me know!

Let me know what to create attachments of. bye!
ps: my laptop does not run at double speed as per the title of this message but 
I feel this is where my issue should go... If this bug is getting too long 
maybe another one relating to this issue _should_ be created. It took me a 
while to get here :)
Comment 113 Greg Kroah-Hartman 2006-01-17 12:26:36 UTC
Please create a new bug for different issues, it makes it way too difficult
otherwise.
Comment 114 kwall 2006-01-21 06:46:12 UTC
Would it be possible to get the timer-routing-1 patch introduced in #73
(http://www.firstfloor.org/~andi/timer-routing-1) rediffed against 2.6.16-rc1?
It has given me the best results when used with acpi_skip_timer_override but no
longer applies to 2.6.16-rc1. Thanks.

More generally, what is the status of this bug?
Comment 115 Andi Kleen 2006-01-22 14:23:22 UTC
The patchkit in ftp://ftp.firstfloor.org/pub/ak/x86_64/quilt/
has a different solution that should work without any command line arguments
and doesn't need the new timer routing.
Comment 116 kwall 2006-01-24 05:20:43 UTC
Andi's patches work for me. Not being a quilt user, I just grabbed
ftp://ftp.firstfloor.org/pub/ak/x86_64/quilt/x86_64-2.6.16-rc1-060118-2.bz2.
Comment 117 Eric Sandeen 2006-01-27 19:33:04 UTC
Andi, 2.6.16-rc1-git4 + apic-main-timer + apic-main-timer-ati solves the clock
problem for me on a Compaq SR1710NX, with the Asus A8AE-LE motherboard, thanks!

I can't tell, is this a workaround of a mobo/bios/acpi bug, or is it fixing things
to properly interact with these boxes?

Thanks,

-Eric
Comment 118 Andi Kleen 2006-01-27 21:08:13 UTC
Kind of both. It uses a different timer to avoid the IRQ 0 issues.
The local APIC timer is almost completely implemented by the CPU itself, so
the chipset vendor has much less opportunity to screw things up.
But that timer is known to have some limitations on some other platforms, so it
can't just be used everywhere. For now it's only enabled on ATI.

Closing the bug now. I will try to push that patch still into 2.6.16.

 
Comment 119 Andi Kleen 2006-01-27 21:13:09 UTC
I should add that the actual timer routing problems are still unresolved.
We actually got a description of the problem from ATI itself with
an analysis, but as far as I can figure out it didn't fully explain the problems
(or rather it should have worked with the timer-routing applied at least)

It's possible the BIOS had all broken timer overrides too (ATI wouldn't
be the first vendor where this happened)

It was also complicated by the fact that ATI implemented a workaround
into many BIOS, but some of the fixes implemented didn't work
anymore when the workaround was enabled in the BIOS.

But with the use of the APIC timer it doesn't really matter anymore
because it doesn't require IRQ0 routing at all.

And of course people kept mixing in other unrelated timer problems in
here, which also complicated this bug.
Comment 120 Andi Kleen 2006-01-27 22:13:09 UTC
Undo mistaken reopen.
Comment 121 Bertro Simul 2006-01-28 09:22:09 UTC
Andi, if the information provided by ATI is not
covered by an NDA, I would be delighted if you
could provide some details. Thanks a lot.
Comment 122 Andi Kleen 2006-01-28 21:19:07 UTC
Well, it had some NDA notices on it. Although I didn't sign anything, I
don't want to distribute it.   But basically it just described in a long-winded
way that if the interrupt is enabled both on the PIC and on the APIC then it
will be delivered twice.  I don't quite believe it's that simple because
timer-routing disabled the PIC completely and it still didn't help for
everybody. [In short I think the analysis was incomplete at best]

Also it doesn't explain why the timer fallback code chose to unmask both
PIC and APIC in the first place - it only does that when the sane
options (PIC only or APIC only) don't work.

It also had a recommendation for a BIOS level workaround where you set a magic
bit in the Northbridge and then the CPU will ignore the messages from the PIC.
If the BIOS incorporated that fix and you then disabled the APIC pins like
the earlier workaround patches did, nothing would be delivered and
the system wouldn't boot (there was another document that described
this problem). At some point I tried to code a patch that checks for that
bit and does nothing if it is set, and otherwise disables the pins, but it also
didn't work in all cases and ended up quite hackish.

So in the end I chose to use the APIC timer. I had actually written that
code before for some different purpose, so it wasn't that much of a redevelopment.
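
The behaviour described above can be pictured with a toy model. This is a sketch only, not kernel code: the function, its parameters, and the HZ value are invented for illustration. Each unmasked path delivers one interrupt per timer tick, and the BIOS workaround bit makes the CPU drop whatever arrives via the PIC.

```python
def delivered_timer_interrupts(ticks, pic_enabled, apic_pin_enabled,
                               bios_ignores_pic=False):
    """Toy model: count timer interrupts the CPU sees for `ticks` PIT ticks.

    Each enabled path delivers one interrupt per tick. `bios_ignores_pic`
    stands in for the northbridge bit that makes the CPU drop PIC messages.
    """
    per_tick = 0
    if pic_enabled and not bios_ignores_pic:
        per_tick += 1
    if apic_pin_enabled:
        per_tick += 1
    return ticks * per_tick


HZ = 1000  # assumed tick rate, for illustration only

for label, kwargs in [
    ("PIC only", dict(pic_enabled=True, apic_pin_enabled=False)),
    ("APIC only", dict(pic_enabled=False, apic_pin_enabled=True)),
    ("both unmasked", dict(pic_enabled=True, apic_pin_enabled=True)),
    ("BIOS fix + APIC pin disabled",
     dict(pic_enabled=True, apic_pin_enabled=False, bios_ignores_pic=True)),
]:
    seen = delivered_timer_interrupts(HZ, **kwargs)
    print(f"{label}: {seen} interrupts per real second ({seen / HZ:.0f}x clock)")
```

In this model, the "both unmasked" case gives twice the expected interrupts (the double-speed clock), and the last case gives none at all, which matches the hang described for a fixed BIOS combined with the pin-disabling workaround. It does not capture whatever else went wrong on the machines where timer-routing did not help.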

Comment 123 Jo 2006-01-30 06:43:34 UTC
I have two questions concerning the new fix Andi Kleen mentions:

- does it apply also to 32-bit compiled kernels ? (currently works fine for me
since disable_timer_pin_1 was added)

- is the solution compatible with dynamic ticks ? It currently works fine
(32-bit mode), and displays this: "dyn-tick: Disabling APIC timer, using PIT
reprogramming"

Thanks !
Comment 124 Andi Kleen 2006-01-30 07:06:49 UTC
No, the patch is for 64bit kernels. Dynamic ticks isn't a standard feature,
if you want support for non standard patches you have to ask the patch author.
On 32bit you can just run with noapic.

Comment 125 JG 2006-01-30 12:46:41 UTC
noapic will hang my system (acer ferrari 4005) during boot phase.  
  
i've described the only thing that worked for me with a 32bit kernel (didn't  
test 2.6.15* yet) under comment #67.  
  
JG 
 
Comment 126 Bertro Simul 2006-01-30 23:51:33 UTC
I
Comment 127 Andi Kleen 2006-01-31 00:10:58 UTC
Your comment doesn't make sense because noapic doesn't use any interrupt
routing tables in the DSDT. Did you perhaps confuse it with acpi=off?
I'm not a windows expert, but my understanding is that XP doesn't use
the APIC in many circumstances either.
Comment 128 Jason Dewayne Clinton 2006-01-31 00:12:55 UTC
As more anecdotal evidence that this bug was closed prematurely, I know two
people in addition to myself with AMD Sempron laptops with ATI motherboards for
whom this patch does not fix the problem. The Sempron is a 32-bit chip running a
motherboard which was seemingly designed to accept both Semprons and Athlon
64's. Applying the fix to the 64-bit-only side leaves the Sempron users in the dark.

Please reopen.
Comment 129 Bertro Simul 2006-01-31 04:05:18 UTC
AK> ... noapic doesn't use any interrupt routing tables in the DSDT.

Well, it does; have a look at my DSDT that I posted here:
http://bugzilla.kernel.org/attachment.cgi?id=4448&action=view.
When the APIC is not used, ACPI uses the LNK[A-H] devices
in the DSDT to determine to which interrupt pins of the PIC
the PCI devices are connected. If, as in my case, the SATA
controller is (claimed to be) connected to LNK0, but this
link is not declared as a device (only LNKA -- LNKH are),
ACPI does not enable the controller
Comment 130 Andi Kleen 2006-01-31 04:08:44 UTC
Ok that sounds more like an ACPI bug. Perhaps you complain to them?
I would recommend you open a new bug for that, as far as I'm concerned this
one is done.
Comment 131 Henrik Nordstrom 2006-02-02 19:48:14 UTC
For what it's worth apic-main-timer + apic-main-timer-ati +
apic-timer-only-with-cx does not seem to help here. It reports finding the
affected ATI chipset but I still have to specify no_timer_check to get the clock
to run at normal rate.

Fedora Core kernel-2.6.15-1.1884_FC5

might be due to the ACPI updates in that kernel?
(acpi-release-20060113-2.6.16-rc1.diff.bz2)

MSI RS482M4 board. ATI Radeon XPress 200 based.
Comment 132 Henrik Nordstrom 2006-02-02 19:54:52 UTC
Forgot to mention that Fedora Core kernel-2.6.15-1.1884_FC5 is based on
2.6.16-rc1-git4 despite its apparent 2.6.15 version number..
Comment 133 Andi Kleen 2006-02-03 00:02:26 UTC
What happens when you apply the full patch and just specify "apicpmtimer" ?
Comment 134 Henrik Nordstrom 2006-02-04 15:33:48 UTC
Unfortunately there are some overlapping changes between your patch and the
Fedora kernel, and I am a little too lazy to wedge in all the rejects, but I have
now added apic-pmtimer-calibrate to the mix, adding the requested option, and the
symptom is still there..

Of your patches I now have the following applied:

apic-main-timer
apic-pmtimer-calibrate
apic-main-timer-ati
apic-timer-only-with-cx
pmtimer-dont-touch-pit

The Fedora kernel additionally has these, which may be relevant:

acpi-release-20060113-2.6.16-rc1.diff.bz2
linux-2.6-x86-apic-off-by-default.patch

maybe some more but nothing obvious which stands out.

I very much suspect the acpi update.

There are a lot of IRQ0 (timer) interrupts routed to each CPU according to
/proc/interrupts (most to CPU0, 30% to CPU1).

Possibly relevant boot messages:

ATI board detected. Using APIC/PM timer.
ACPI: PM-Timer IO Port: 0x808
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
Processor #0 15:11 APIC version 16
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
Processor #1 15:11 APIC version 16
ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 2, version 33, address 0xfec00000, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 21 low level)
ACPI: IRQ0 used by override.
ACPI: IRQ2 used by override.
Comment 135 Henrik Nordstrom 2006-02-04 15:55:47 UTC
Some more maybe interesting boot messages:

time.c: Using 3.579545 MHz PM timer.
time.c: Detected 2193.722 MHz processor.
Calibrating delay using timer specific routine.. 4396.40 BogoMIPS (lpj=8792803)
Security Framework v1.0.0 initialized
SELinux:  Initializing.
SELinux:  Starting in permissive mode
selinux_register_security:  Registering secondary module capability
Capability LSM initialized as secondary
Mount-cache hash table entries: 256
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 512K (64 bytes/line)
CPU 0(2) -> Node 0 -> Core 0
Using local APIC timer interrupts.
Detected 12.464 MHz APIC timer.
Booting processor 1/2 APIC 0x1
Initializing CPU#1
Calibrating delay using timer specific routine.. 2201.27 BogoMIPS (lpj=4402549)
Disabling vsyscall due to use of PM timer
time.c: Using PM based timekeeping.


Note the BogoMIPS of the second CPU.. this is a dual core CPU with both cores
running on the same clock, and with no_timer_check they both report the higher
value..
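
As a quick arithmetic check of the two figures above: the printed BogoMIPS values follow from the lpj values via BogoMIPS = lpj * HZ / 500000 if one assumes HZ=250. That HZ value is an assumption, inferred only because it reproduces the printed numbers; the .config for this kernel is not shown here.

```python
HZ = 250  # assumed; inferred because it reproduces the printed values


def bogomips(lpj: int, hz: int = HZ) -> float:
    """BogoMIPS from loops-per-jiffy: lpj * hz / 500000."""
    return lpj * hz / 500000


cpu0 = bogomips(8792803)
cpu1 = bogomips(4402549)
print(f"CPU0: {cpu0:.2f}  CPU1: {cpu1:.2f}  ratio: {cpu0 / cpu1:.3f}")
```

The ratio comes out very close to 2, the same factor as the clock-speed symptom in this bug.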
Comment 136 Chuck Ebbert 2006-02-11 15:58:58 UTC
I am still having this problem in 2.6.16-rc2-mm1, unless I specify
disable_timer_pin_1.  Boot messages:

ATI board detected. Using APIC/PM timer.
...
time.c: Using 1.193182 MHz PIT timer.
time.c: Detected 1600.110 MHZ processor.
time.c: Using PIT/TSC based timekeeping.

System is Compaq v2312us notebook, host bridge ATI RS480,
PCI ID 1002:5950

There is a BIOS update available but I'd have to reinstall XP to apply it.
Comment 137 Andi Kleen 2006-02-11 16:11:17 UTC
-mm* has some completely different rewritten timecode that's completely
unsupported from my side and probably has all kinds of old bugs already fixed. I
would recommend you check linus mainline.
Comment 138 Bertro Simul 2006-02-13 21:56:40 UTC
I installed yesterday
Comment 139 Andi Kleen 2006-02-14 00:20:32 UTC
It's a bug - i have a partial fix but it needs a little more work still.
Comment 140 Eric Sandeen 2006-02-24 19:01:50 UTC
perhaps this is a known issue, but 2.6.16-rc4 still runs at 2x for me.
Comment 141 Andi Kleen 2006-02-28 06:01:19 UTC
Can people who still have problems with the timer test 2.6.16-rc5 please?
Comment 142 Richard Mace 2006-02-28 12:25:32 UTC
Andi, 

Tested 2.6.16-rc5 on the hp nx6125 and it fixes the double timer issue for me.
In fact, this is the first patch that seems to work on this machine. No more
need to boot with disable_timer_pin_1. Great work!
Comment 143 Eric Sandeen 2006-02-28 17:26:20 UTC
Andi, works for me on my Compaq SR1710NX.  Thanks!
Comment 144 Henrik Nordstrom 2006-03-01 06:33:37 UTC
Fedora Core kernel-2.6.15-1.1996_FC5 (2.6.16rc5-git3) works fine here

   MSI RS482M4-IDL board  (ATI Radeon Xpress 200 / RS482 + SB450)
   Athlon 64 X2 4200+ cpu (dual core).
Comment 145 kwall 2006-03-01 19:58:52 UTC
Created attachment 7493 [details]
2.6.16-rc5 boot.log without disable_timer_pin_1
Comment 146 kwall 2006-03-01 20:01:33 UTC
Andi:

2.6.16-rc5 still runs double-time without disable_timer_pin_1. I've attached the
boot log as http://bugme.osdl.org/attachment.cgi?id=7493&action=view. Here's the
output of trtc.c in this case:

1141270963:61429: rtc 256 int 0 (=0)
1141270963:813422: rtc 448 int 752 (=752)
1141270964:333099: rtc 16 int 520 (=520)
1141270964:813333: rtc 448 int 480 (=480)
1141270965:813244: rtc 448 int 1000 (=1000)
1141270966:332794: rtc 16 int 520 (=520)
1141270966:813157: rtc 448 int 480 (=480)
1141270967:813066: rtc 448 int 1000 (=1000)
1141270968:332490: rtc 16 int 520 (=520)
1141270968:812977: rtc 448 int 480 (=480)
1141270969:812889: rtc 448 int 1000 (=1000)
1141270970:332185: rtc 16 int 520 (=520)
1141270970:812800: rtc 448 int 480 (=480)
1141270971:812712: rtc 448 int 1000 (=1000)
1141270972:331881: rtc 16 int 520 (=520)
1141270972:812622: rtc 448 int 480 (=480)
1141270973:812533: rtc 448 int 1000 (=1000)
1141270974:331577: rtc 16 int 520 (=520)
1141270974:812444: rtc 448 int 480 (=480)


Comment 147 Andi Kleen 2006-03-01 20:15:57 UTC
Thanks. But that looks a bit unstable (normally
the sums at the end should be roughly the same), but if it works, OK.
And you're using HZ=1000 I guess.
Comment 148 kwall 2006-03-01 20:22:38 UTC
Yes. HZ=1000:

[~]$ grep HZ /archive/kernel/linux-2.6/.config
# CONFIG_HZ_100 is not set
# CONFIG_HZ_250 is not set
CONFIG_HZ_1000=y
CONFIG_HZ=1000
Comment 149 D. Hugh Redelmeier 2006-03-02 00:18:53 UTC
This bug looks to be related to http://bugme.osdl.org/show_bug.cgi?id=5573 .
It has also generated an Ubuntu HowTo:
http://ubuntuforums.org/showthread.php?s=bb5681d829b8bfd25862caab2a63db20&t=75281

I observe this bug on my HP Pavilion a1250n.  It has a dual core Athlon 64 3800+
x2 and uses the ATI Radeon Xpress 200 chipset (RS482 northbridge, SB400
southbridge).  It is running Fedora Core 4's kernel-smp-2.6.15-1.1831_FC4 for
x86-64 (no, not a kernel.org kernel).

I tried the following things to make the clock behave:
1. in BIOS config, tried to disable "spread spectrum" as per #41 in Ubuntu
HowTo.  There was no such setting to change.
2. updated to HP BIOS 3.40 [no change]
3. booted with notsc [no change]
4. booted with acpi_skip_timer_override [no change]
5. booted with disable_timer_pin_1 [worked!]

Even though the clock problem is gone, there are a couple of symptoms that
suggest related APIC or interrupt routing problems to me.  After 30 hours of
uptime (assuming the clock isn't lying)

- I see 29 errors like this: APIC error on CPU0: 40(40)
  All are detected on CPU0 for some reason.
  The first such error, and only the first, is slightly different: APIC error on
CPU0: 00(40)
  Googling shows that this APIC error shows in dmesg output on systems with the
RS480 or RS482.

- /proc/interrupts shows a lot of parport0 interrupts, even though there is
nothing connected to the parallel port:
  7:     374934   26822359    IO-APIC-edge  parport0

- /proc/interrupts shows a lot of USB interrupts.  The only thing connected to a
USB port is the built-in flash card reader (without any cards loaded):
225:    8589443          0   IO-APIC-level  ehci_hcd:usb1, ohci_hcd:usb2, ohci_hcd:usb3

- the total number of interrupts fielded by each CPU is extremely close, and I
see no reason for this to be the case.  Could the parport0 interrupts be
invented somehow to balance the two CPUs???  Notice that the number of CPU1
timer interrupts equals the number of CPU0 parport0 interrupts?  And that the
reverse is close to true?

           CPU0       CPU1       
  0:   26822411     374934    IO-APIC-edge  timer
  1:      20918          0    IO-APIC-edge  i8042
  7:     374934   26822359    IO-APIC-edge  parport0
  8:          0          0    IO-APIC-edge  rtc
 12:     359851          0    IO-APIC-edge  i8042
 14:    1948929          0    IO-APIC-edge  ide0
169:          2          0   IO-APIC-level  acpi, ohci1394
201:     114362          0   IO-APIC-level  libata
209:     159867          0   IO-APIC-level  eth0
217:          1          0   IO-APIC-level  ATI IXP
225:    8589443          0   IO-APIC-level  ehci_hcd:usb1, ohci_hcd:usb2, ohci_hcd:usb3
NMI:       1877       1118 
LOC:   27198396   27198373 
ERR:         29
MIS:          0
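
The mirror-image pattern between the timer and parport0 rows can be checked mechanically. The sketch below uses the two rows copied from the table above; the parsing helper is made up for illustration and assumes exactly two CPU columns.

```python
SAMPLE = """\
  0:   26822411     374934    IO-APIC-edge  timer
  7:     374934   26822359    IO-APIC-edge  parport0
"""


def parse_counts(text):
    """Map the trailing device name to its (CPU0, CPU1) interrupt counts."""
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue
        rows[fields[-1]] = (int(fields[1]), int(fields[2]))
    return rows


rows = parse_counts(SAMPLE)
timer, parport = rows["timer"], rows["parport0"]
print("CPU1 timer - CPU0 parport0:", timer[1] - parport[0])
print("CPU0 timer - CPU1 parport0:", timer[0] - parport[1])
```

The first difference is exactly zero and the second is only a few dozen out of roughly 26.8 million, which is the near-symmetry described above.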
Comment 150 D. Hugh Redelmeier 2006-03-02 01:27:20 UTC
Created attachment 7495 [details]
dmesg output from system described in #149
Comment 151 Ole Bj 2006-03-04 06:10:12 UTC
Hm. I tried 2.6.16-rc5 on MSI RS482M-IL (a Radeon Xpress 200) with an
AMD Sempron 3000+ in i386 mode. The kernel was compiled to use the K8.

The timer runs twice as fast.

I am still confused. Is the patch working only for x86_64 ?

[17179638.956000] powernow-k8: Found 1 AMD Athlon 64 / Opteron processors
(version 1.60.0)
[17179638.972000] powernow-k8: BIOS error - no PSB or ACPI _PSS objects

0000:00:00.0 Host bridge: ATI Technologies Inc: Unknown device 5950 (rev 10)
0000:00:01.0 PCI bridge: ATI Technologies Inc: Unknown device 5a3f
Comment 152 Jesse Allen 2006-03-10 06:59:00 UTC
An x86 patch was merged into 2.6.16-rc, see "[PATCH] i386: port ATI timer fix
from x86_64 to i386 II".  I dunno if there was a separate bug number for x86 but
a lot of those people were watching this one.  Also, is there a bug report for
the ACPI problems with this chipset yet?
Comment 153 Ole Bj 2006-03-14 12:26:34 UTC
I tried 2.6.16-rc on MSI RS482M-IL (a Radeon Xpress 200) with an
AMD Sempron 3000+ in i386 mode. The kernel was compiled to use the K8.
The clock is now normal and this is a success for me: Thanks Andi Kleen!

kern.log says:
Mar 14 20:50:54 alunheim kernel: ATI board detected. Disabling timer routing
over 8254.

Mar 14 20:50:54 alunheim kernel: ENABLING IO-APIC IRQs
Mar 14 20:50:54 alunheim kernel: ..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1
pin2=-1
Mar 14 20:50:54 alunheim kernel: ..MP-BIOS bug: 8254 timer not connected to IO-APIC
Mar 14 20:50:54 alunheim kernel: ...trying to set up timer (IRQ0) through the
8259A ...  failed.
Mar 14 20:50:54 alunheim kernel: ...trying to set up timer as Virtual Wire
IRQ... works.


I still sometimes get:
Mar 14 21:00:54 alunheim kernel: APIC error on CPU0: 00(40)
Mar 14 21:01:12 alunheim kernel: APIC error on CPU0: 40(40)
Mar 14 21:02:41 alunheim kernel: APIC error on CPU0: 40(40)
Mar 14 21:06:35 alunheim kernel: APIC error on CPU0: 40(40)
Mar 14 21:09:30 alunheim kernel: APIC error on CPU0: 40(40)
Comment 154 kwall 2006-03-15 23:47:20 UTC
Unless I'm doing something wrong, 2.6.16-rc6 still does the double-speed boogie
unless I pass disable_timer_pin_1 to the kernel.
Comment 155 Chuck Ebbert 2006-03-16 07:39:23 UTC
Kurt, there is a bug in the i386 patch that went into 2.6.16-rc6.
Are you running i386 without ACPI?  If so, try the _first_ patch in this message:

http://marc.theaimsgroup.com/?l=linux-kernel&m=114238639009970&q=raw
Comment 156 Andi Kleen 2006-03-22 07:45:30 UTC
Hmm it works for everybody else now afaik.

Full boot log?
Comment 157 kwall 2006-03-28 20:09:04 UTC
Created attachment 7696 [details]
Full boot log of 2.6.16 and timer running at double speed

On stock 2.6.16, disable_timer_pin_1 is required to keep the clock from running
at double speed. This boot log is from a boot without disable_timer_pin_1.
Comment 158 Andi Kleen 2006-03-28 20:36:10 UTC
Can you add dmidecode output too please?

Comment 159 kwall 2006-03-28 21:07:29 UTC
Created attachment 7697 [details]
dmidecode output for system in #157
Comment 160 kwall 2006-04-04 19:52:56 UTC
As of 2.6.17-rc1, the double speed clock problem has disappeared on my system --
it is no longer necessary to pass disable_timer_pin_1. I still see "APIC error
on CPU0: 0(40)" messages in syslog (well, /var/log/messages and dmesg). So,
between 2.6.16 and 2.6.17-rc1, the problem was fixed.
Comment 161 Alex Dubov 2006-04-05 17:36:53 UTC
I'm getting "APIC error on CPU0: 0(40)" on my 2.6.16-gentoo (Turion64 / ATI
X200) all the time (less than before and I don't have any kernel parameters
passed as of now). I think that #c160 is not entirely correct in this respect.

Double clock problem is fixed, though.
Comment 162 Adrian Bunk 2006-04-22 05:49:56 UTC
*** Bug 5211 has been marked as a duplicate of this bug. ***
Comment 163 Marijn Schouten 2006-04-28 04:39:55 UTC
Nothing has changed for me regarding this bug. I still have to use noapic in 
kernel version 2.6.17_rc4. What I did notice during a boot without noapic was 
these lines from the kernel log. This is from a 2.6.16 kernel btw.

Apr 12 11:25:36 [kernel] ATI board detected. Disabling timer routing over 8254.
Apr 12 11:25:36 [kernel] OEM ID: ATI      Product ID: RS480        APIC at: 0xFEE00000
Apr 12 11:25:36 [kernel] Losing some ticks... checking if CPU frequency changed.
Apr 12 11:26:08 [kernel] spurious 8259A interrupt: IRQ7.

My motherboard is an MSI RS482M4 though and not an RS480.
Comment 164 Martin 2006-05-02 21:21:24 UTC
I have the same problem and I am using kernel 2.6.15. I have an MSI RS482M4 with
ATI chipset and an Athlon X2 3800+.
Comment 165 Henrik Nordstrom 2006-05-03 02:39:37 UTC
You need 2.6.16 or later. 2.6.15 has the old timer code and will produce double
clock rate on the MSI RS482M4 boards.

Fedora Core x86_64 kernel 2.6.16-1.2096_FC5 works just fine here on MSI
RS482M4-IDL Athlon64 X2 4200+.

Have not verified 32-bit i686 mode kernels and it's possible there are still
problems in the i386 arch on these boards, but x86_64 should work fine on
RS482M4 with 2.6.16 and later.
Comment 166 Marijn Schouten 2006-05-05 05:13:45 UTC
No it does not work fine. I have an athlon64-3200+ on my msi rs482m4 and use 
x86_64-pc-linux-gnu-4.1.0 to compile my kernels. Today I tried 2.6.16-gentoo-r6 
and 2.6.17_rc3 and both still suffer from this problem. I still have to use 
noapic to not get double clock rate.
Comment 167 Marijn Schouten 2006-05-05 05:52:22 UTC
Maybe this is a BIOS version issue? I'm on 1.1.
See:
http://www.msi.com.tw/program/support/bios/bos/spt_bos_detail.php?UID=693&kind=1
Comment 168 Henrik Nordstrom 2006-05-05 06:39:10 UTC
Maybe. I am using the 1.2 BIOS. But I do not think it is BIOS related.

It could also be slight differences in ACPI usage/patches I suppose. The Fedora
kernels do have a bit of ACPI patches in them, but nothing sticks out as
obviously relevant to the timers..

Might also be SMP/UP related. The Fedora x86_64 kernels are all SMP enabled and
I suppose this could make a bit of difference in how the timer is handled.

I am sorry but I am too lazy to compile a pristine 2.6.16 or 2.6.17 kernel
myself from kernel.org sources just to verify now that the Fedora kernel does
work just fine.  Maybe you could try the Fedora x86_64 kernel 2.6.16-1.2096_FC5
on your gentoo (should just be to unpack the RPM with rpm2cpio and install it as
any other kernel for a quick test). If that works then you have something to
compare with to try to isolate why your kernel isn't working. If that fails then
there is some other difference outside the kernel.
Comment 170 Marijn Schouten 2006-05-05 08:20:24 UTC
I've rpm2tar'ed the rpm for the fedora kernel, and make oldconfig'ed it with a 
config from my current 2.6.16-gentoo-r6. Unfortunately when I tried to make, I 
got:

  CHK     include/linux/version.h
  SPLIT   include/linux/autoconf.h -> include/config/*
  CC      scripts/mod/empty.o
  HOSTCC  scripts/mod/mk_elfconfig
  MKELF   scripts/mod/elfconfig.h
  HOSTCC  scripts/mod/file2alias.o
  HOSTCC  scripts/mod/modpost.o
  HOSTCC  scripts/mod/sumversion.o
  HOSTLD  scripts/mod/modpost
  HOSTCC  scripts/kallsyms
  HOSTCC  scripts/pnmtologo
  HOSTCC  scripts/conmakehash
  HOSTCC  scripts/bin2c
make[1]: *** No rule to make target `init/main.o', needed by `init/built-in.o'.  Stop.
make: *** [init] Error 2

I'm not sure why this would not work, but perhaps it would be easier for you to 
try a vanilla kernel.
Comment 171 Henrik Nordstrom 2006-05-05 08:30:44 UTC
It's a binary kernel already compiled. Just drop it into the proper places, and
get your boot loader in shape to boot it.

Note: The xen kernel is for xen, not real hardware. The Fedora RPM you should be
looking for is kernel-2.6.16-...

As I am not having any issues with my kernel I am not too motivated to mess
around with this.
Comment 172 Marijn Schouten 2006-05-05 08:43:51 UTC
right, I missed the xen part :(
not xen kernel:
http://download.fedora.redhat.com/pub/fedora/linux/core/updates/5/x86_64/kernel-2.6.16-1.2096_FC5.x86_64.rpm
no need to compile, ok, will try again.
Comment 173 Andi Kleen 2006-05-05 08:54:22 UTC
Folks, this is not the forum for random distribution user land handholding.

If someone still has a genuinely new problem with 2.6.17 & ATI timers please
open a new bug. I'm not interested in distribution kernels which are off topic
here.
  
Comment 174 Marijn Schouten 2006-05-05 09:02:55 UTC
I still have a problem with 2.6.17 & ATI timers, and the problem is exactly 
that described by the summary of this bug, so I don't think I should open a new 
bug.
Comment 175 Andi Kleen 2006-05-05 09:04:57 UTC
I think you should because I'm going to ignore that one. It is far too chaotic
to let anything good come out of it.
Comment 176 Marijn Schouten 2006-05-05 09:59:08 UTC
okay, you're the boss :)
new bug is:
http://bugzilla.kernel.org/show_bug.cgi?id=6497
Comment 177 Adrian Bunk 2006-11-18 16:35:18 UTC
Is any issue discussed in this bug still present in kernel 2.6.19-rc6?
Comment 178 Timo Jyrinki 2007-01-07 12:43:33 UTC
As said, new bugs should probably be opened. Off-topic I know, but this basically
worked with Ubuntu kernels 2.6.15, 2.6.17 and 2.6.19, and is now broken with 2.6.20.
I don't see anything in the Ubuntu changelogs indicating that this would have
been caused by Ubuntu, so it might be that in the 2.6.20 kernel this problem has
resurfaced on at least some boards that have already worked for a year or so.

Anyway, this is just FYI. If I have time to compile from kernel.org sources,
I'll open a new bug about this, and meanwhile this probably really should be
closed to make space for bugs about the current situation.
Comment 179 Timo Jyrinki 2007-01-08 05:15:19 UTC
Thanks for closing. The continuation bug is bug 7789.
Comment 180 Fu Michael 2007-11-07 00:30:20 UTC
*** Bug 6099 has been marked as a duplicate of this bug. ***