Bug 10011

Summary: X hangs if DRI enabled - Acer Travelmate 4001 Lmi - Radeon
Product: Drivers
Reporter: François Valenduc (francoisvalenduc)
Component: Video(DRI - non Intel)
Assignee: Dave Airlie (airlied)
Status: CLOSED OBSOLETE
Severity: high
CC: airlied, alan, bunk, francoisvalenduc, glisse, lenb, rossi.f, venki
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.29-rc
Subsystem:
Regression: Yes
Bisected commit-id:
Attachments: kernel configuration file
output of dmesg
output of apcidump
output of dmesg after triggering sysrq commands
output of lspci
output of dmidecode
workaround patch
Set some bus state so that cpu c3 state doesn't lead to CP trouble
Set some bus state so that cpu c3 state doesn't lead to CP trouble
output of dmesg | grep drm
Set some bus state so that cpu c3 state doesn't lead to CP trouble
output of dmesg | grep drm
Print more debug information
output of dmesg | grep drm
Log CP status and remove useless debug output.
output of dmesg | grep drm
Keep CPU busy for sometimes after ring commit
Keep CPU busy for sometimes after ring commit
Keep CPU busy for sometimes after ring commit
Keep CPU busy for sometimes after ring commit
Keep CPU busy for sometimes after ring commit
Keep CPU busy for sometimes after ring commit
output of radeondump
Set wptr delay (necessary on some AGP chipset) and disable host path pretech to avoid some hang conditions
Tweak AGP (cripple & others workaround features)
Xorg.log obtained with your last patch Tweak AGP. ..
Another AGP tweak (cripple & others workaround features)
Tweak AGP (cripple & others workaround features)
Xorg.log obtained with C-State limited to C2

Description François Valenduc 2008-02-17 06:28:29 UTC
Latest working kernel version: 2.6.24
Earliest failing kernel version: 2.6.25-rc2
Distribution: Gentoo
Hardware Environment: Intel Centrino, Acer Travelmate 4001 Lmi, ATI Radeon Mobility 9700
Software Environment:
Problem Description:
When X is started, the graphical interface is completely blocked. It's impossible to switch to a text console. It's impossible to type my password in the KDM login window (or in GDM). However, the computer is not totally blocked since I can still make a connection with SSH. What is strange is that there are no new messages in dmesg when the problem occurs. I am using the radeon driver for my graphics card.

After having tried git-bisect, it seems that the first bad commit is the following:

x86: remove special NUMAQ support in io_32.h
author	Andi Kleen <ak@suse.de>
	Mon, 4 Feb 2008 15:48:03 +0000 (16:48 +0100)
committer	Ingo Molnar <mingo@elte.hu>
	Mon, 4 Feb 2008 15:48:03 +0000 (16:48 +0100)
commit	1fba38703d0ce8a5ff0fad9df3eccc6b55cf2cfb
tree	8a64e30f37d5a5ff84ce462d0f20ae518c6e299d
parent	c7e844f0415252c7e1a2153a97e7a0c511d61ada
x86: remove special NUMAQ support in io_32.h

Now that the only user does it on its own remove the NUMAQ support macros
in io_32.h

The next step would be to convert the preprocessor mess to actually readable
standard inlines.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

If I revert this commit, things work much better. 

Steps to reproduce: boot the computer, wait for X to start KDM and for the login window to be frozen.
Comment 1 François Valenduc 2008-02-17 06:39:41 UTC
Created attachment 14876 [details]
kernel configuration file
Comment 2 Thomas Gleixner 2008-02-17 07:12:08 UTC
Thanks for doing the bisect.

You reverted the commit on top of rc2 ? It looks harmless, but it seems to introduce subtle wreckage. I'm going to revert it.

Thanks,
    tglx
Comment 3 Thomas Gleixner 2008-02-17 07:13:22 UTC
Hmm, I compiled your config with the patch applied and with it reverted. There is no difference in the binary image.
Comment 4 François Valenduc 2008-02-17 07:17:04 UTC
Indeed, I don't really understand what's happening. I have successfully started the computer with this patch applied 5 times in a row without any problem. However, this morning, it failed every time with this patch. So, this bug seems to occur randomly, which is probably not good news.
Comment 5 Thomas Gleixner 2008-02-17 07:19:52 UTC
Yeah, it seems to be something else. Please check your bisect log again and maybe restart at some point before the last steps.
Comment 6 François Valenduc 2008-02-17 07:33:37 UTC
It seems that the problem occurs only if I patch the kernel with tuxonice. In fact, with kernel 2.6.24 and tuxonice 3.0-rc5, the problem already occurs. I thought that some parts of the tuxonice patch had now been merged into the mainline kernel and thus that the bug also occurred with the release candidates of 2.6.25. That doesn't look correct.
What I find extremely strange is that the problem doesn't occur if I plug in a USB mouse. However, if I remove usbhid support, the problem still occurs. For me, this is totally incomprehensible!
Comment 7 Thomas Gleixner 2008-02-17 08:15:22 UTC
> It seems that the problem occurs only if I patch the kernel with tuxonice. In
> fact, with kernel 2.6.24 and tuxonice 3.0-rc5, the problem already occurs. I
> thought that some parts of the tuxonice patch had now been merged in the
> mainline kernel and thus that the bug also occurs with release candidate of
> 2.6.25. That doesn't look correct.
> What I found extremely strange is that the problem doesn't occur if I plug an
> USB mouse. However, if I remove usbhid support, the problem still occurs. For
> me, this is totally incomprehensible !

So with plain 2.6.25-rc2 it does not happen, right. Only the
combination with tuxonice shows that ?

If yes, then please poke the tuxonice folks.

Thanks,

	tglx
Comment 8 François Valenduc 2008-02-17 08:21:29 UTC
For the moment, I would say that the problem doesn't happen anymore. But since it seems that it occurs randomly, I don't know what to think about it.
Comment 9 Rafael J. Wysocki 2008-02-17 09:43:14 UTC
FWIW, I saw a similar thing a couple of times on an ASUS L5D with plain 2.6.25-rc1.  That is, the box hung solid as soon as X was started (openSUSE 10.3 userland, 32-bit).

Strangely enough, this doesn't seem to happen on the same box with a 64-bit kernel and 64-bit userland (openSUSE 10.2).
Comment 10 Rafael J. Wysocki 2008-02-17 09:54:00 UTC
BTW, Thomas,

There is 1.5 GB of RAM in this box and highmem is used, so that _might_ be related to the kmap_atomic() warning that you observed.
Comment 11 Arjan van de Ven 2008-02-17 21:45:53 UTC
The original report is resolved (as a tuxonice issue)
Closing this bug since any followups are very very unlikely to be the same issue;
they need a separate bug.

I'll try to close as "INVALID" since that's the closest to "outside patch caused" that we have.
Comment 12 Anonymous Emailer 2008-02-18 04:59:09 UTC
Reply-To: akpm@linux-foundation.org

On Sun, 17 Feb 2008 06:28:29 -0800 (PST) bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=10011
> 

Rafael, one for the post-2.6.24 regression list, please.
Comment 13 François Valenduc 2008-02-22 13:04:18 UTC
The problem now occurs again with the official 2.6.25-rc2 kernel. But it occurs randomly. What is extremely strange is that, like I said in comment #6, there is no problem if I plug in a USB mouse. However my Xorg config file doesn't contain any reference to an external mouse. Only the touchpad is listed as an input device.
Comment 14 François Valenduc 2008-02-22 13:05:30 UTC
Created attachment 14954 [details]
output of dmesg
Comment 15 François Valenduc 2008-02-24 08:06:18 UTC
I have removed some options from the kernel and now, it seems to work correctly. These options are:
 Kernel->user space relay support
Comment 16 François Valenduc 2008-02-24 08:08:26 UTC
I have removed some options from the kernel and now, it seems to work correctly. These options are:
 Kernel->user space relay support
 CPU idle PM support

Let's hope it keeps working correctly. I have also removed these options from the 2.6.24 kernel patched with tuxonice and it also works correctly. So for sure, one of these options must be the cause of the problem.
Comment 17 Rafael J. Wysocki 2008-02-24 16:21:20 UTC
Regressions list annotation:
Handled-By : Thomas Gleixner <tglx@linutronix.de>
Comment 18 François Valenduc 2008-02-25 10:05:26 UTC
I finally found that the problematic option is "Kernel->user space relay support". I have built 2.6.25-rc3 with CPU idle enabled and "user space relay support" disabled. This kernel works correctly. Then I built a kernel with these 2 options enabled and it crashes when X is started. Maybe Xorg is not the direct cause of the problem but the crash occurs when X is started. It's even impossible to connect via SSH to the PC. So, not only X has crashed but the kernel as well.
Comment 19 François Valenduc 2008-02-25 13:25:45 UTC
(In reply to comment #18)
> I finally found that the problematic option is "Kernel->user space relay
> support". I have built 2.6.25-rc3 with CPU idle enabled and "user space relay
> support" disabled. This kernel works correctly. Then I built a kernel with
> these 2 options enabled and it crashes when X is started. Maybe Xorg is not
> the direct cause of the problem but the crash occurs when X is started. It's
> even impossible to connect via SSH to the PC. So, not only X has crashed but
> the kernel as well.
> 

I hadn't waited long enough for the problem to occur, but "CPU idle PM support" also causes a problem. If I enable it and wait a bit more than 30 seconds before typing my password in the KDM login window, the screen is also blocked. The computer has not crashed since I can still make an SSH connection, but it's impossible to type the password. The cursor stops blinking and I can't use the mouse to click on the buttons of this window. The only ways to power down my computer are via SSH or using the power button (which shows that ACPI events are still working and caught by acpid).
Comment 20 Arseny Solokha 2008-02-26 01:16:05 UTC
I have a problem that looks similar to the submitter's problem, but I'm not sure that it's the same, and I'm not sure this is the correct place for my posting.
When I play something using xine-lib (first noticed with xine-lib 1.1.8) while X is running (7.3), my machine can stop responding to any of my actions (this first appeared with kernel 2.6.22). I didn't test it with an ssh connection, but I can get control back only with a reset. It's noticeable that the file played with xine-lib plays to its end normally. Now I'm running 2.6.24 with Ingo Molnar's RT patch. The system can freeze... for some time, but it usually returns control back to me.

Latest working kernel version: 2.6.20 (I didn't test 2.6.21)
Earliest failing kernel version: 2.6.22 PREEMPT
Current failing kernel: 2.6.24 (I didn't test 2.6.23), PREEMPT, with RT patch by Ingo Molnar
Distribution: Gentoo
Hardware Environment: AMD Athlon XP 2600+, nForce 2, ATI Radeon 9200SE
Comment 21 Thomas Gleixner 2008-03-12 00:04:04 UTC
Francois,

is this problem still there in 2.6.25-rc5 ?

Thanks,
        tglx
Comment 22 François Valenduc 2008-03-12 10:50:54 UTC
The problem is still present in 2.6.25-rc5. But, if I remove CPU idle support, things are working well. So maybe my computer doesn't support this option.
Comment 23 Venkatesh Pallipadi 2008-03-12 11:06:44 UTC
Hmmmm. No. CPU Idle is not some special feature supported by the computer. It is just specific to the Linux kernel, just a cleaner way to handle CPU C-states.

Did 2.6.24 also fail when you have CPU IDLE enabled?

Can you get the SYSRQ-t output? (You can possibly use /proc/sysrq-trigger, as you can ssh into the system in the hung state.)

I am concerned that this may be some timing related issue that only happens once in a while, irrespective of above config options. I mean, you said that even with CPU idle enabled, it does boot fine sometimes...
Comment 24 Venkatesh Pallipadi 2008-03-12 11:27:29 UTC
Francois,

What does the output of 
# grep . /sys/devices/system/cpu/cpu*/cpuidle/*/*
look like with RC5 and CPU_IDLE configured?

Also, can you attach your acpidump output.

Thanks,
Venki
Comment 25 François Valenduc 2008-03-12 11:52:53 UTC
No, with CPU-idle, the problem always happens. The output of grep . /sys/devices/system/cpu/cpu*/cpuidle/*/* is the following when the computer is blocked, which happens around 20 seconds after the start of KDM:

/sys/devices/system/cpu/cpu0/cpuidle/state0/desc:CPUIDLE CORE POLL IDLE
/sys/devices/system/cpu/cpu0/cpuidle/state0/latency:0
/sys/devices/system/cpu/cpu0/cpuidle/state0/name:C0
/sys/devices/system/cpu/cpu0/cpuidle/state0/power:4294967295
/sys/devices/system/cpu/cpu0/cpuidle/state0/time:0
/sys/devices/system/cpu/cpu0/cpuidle/state0/usage:0
/sys/devices/system/cpu/cpu0/cpuidle/state1/desc:<null>
/sys/devices/system/cpu/cpu0/cpuidle/state1/latency:0
/sys/devices/system/cpu/cpu0/cpuidle/state1/name:C1
/sys/devices/system/cpu/cpu0/cpuidle/state1/power:0
/sys/devices/system/cpu/cpu0/cpuidle/state1/time:2
/sys/devices/system/cpu/cpu0/cpuidle/state1/usage:1
/sys/devices/system/cpu/cpu0/cpuidle/state2/desc:<null>
/sys/devices/system/cpu/cpu0/cpuidle/state2/latency:1
/sys/devices/system/cpu/cpu0/cpuidle/state2/name:C2
/sys/devices/system/cpu/cpu0/cpuidle/state2/power:0
/sys/devices/system/cpu/cpu0/cpuidle/state2/time:17684
/sys/devices/system/cpu/cpu0/cpuidle/state2/usage:29
/sys/devices/system/cpu/cpu0/cpuidle/state3/desc:<null>
/sys/devices/system/cpu/cpu0/cpuidle/state3/latency:85
/sys/devices/system/cpu/cpu0/cpuidle/state3/name:C3
/sys/devices/system/cpu/cpu0/cpuidle/state3/power:0
/sys/devices/system/cpu/cpu0/cpuidle/state3/time:53347975
/sys/devices/system/cpu/cpu0/cpuidle/state3/usage:8187
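
The `grep .` output above is easier to compare across boots once it is grouped per state. Below is a minimal Python sketch, assuming the `path:value` layout shown above. The names `parse_cpuidle` and `residency_share` are illustrative and not part of any kernel tool, and the `time` field is simply treated as a cumulative counter per state.

```python
import re
from collections import defaultdict

# Matches lines such as
# /sys/devices/system/cpu/cpu0/cpuidle/state3/usage:8187
LINE_RE = re.compile(
    r"/sys/devices/system/cpu/(cpu\d+)/cpuidle/(state\d+)/(\w+):(.*)$"
)


def parse_cpuidle(text):
    """Group `grep . .../cpuidle/*/*` output as {cpu: {state: {attr: value}}}."""
    result = defaultdict(lambda: defaultdict(dict))
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            cpu, state, attr, value = match.groups()
            result[cpu][state][attr] = value
    return result


def residency_share(states):
    """Return {state name: fraction of the summed `time` counters}."""
    total = sum(int(s["time"]) for s in states.values())
    if total == 0:
        return {}
    return {s["name"]: int(s["time"]) / total for s in states.values()}


# A few lines copied from the output above (C2 and C3 only).
SAMPLE = """\
/sys/devices/system/cpu/cpu0/cpuidle/state2/name:C2
/sys/devices/system/cpu/cpu0/cpuidle/state2/time:17684
/sys/devices/system/cpu/cpu0/cpuidle/state2/usage:29
/sys/devices/system/cpu/cpu0/cpuidle/state3/name:C3
/sys/devices/system/cpu/cpu0/cpuidle/state3/time:53347975
/sys/devices/system/cpu/cpu0/cpuidle/state3/usage:8187
"""

if __name__ == "__main__":
    shares = residency_share(parse_cpuidle(SAMPLE)["cpu0"])
    for name, share in sorted(shares.items()):
        print(f"{name}: {share:.2%}")
```

On the sample lines, nearly all of the recorded idle time falls in C3, which is consistent with the later suspicion in this thread that frequent C3 entry is what triggers the hang.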
Comment 26 Venkatesh Pallipadi 2008-03-12 11:59:32 UTC
Can you change the ACPI PROCESSOR config to built in (*) from module (m), keep CPU_IDLE configured, and try booting with
- processor.max_cstate=1
- processor.max_cstate=2

And let me know whether any of them works?
Comment 27 François Valenduc 2008-03-12 12:15:00 UTC
Created attachment 15228 [details]
output of apcidump

The problem also occurs with kernel 2.6.24. Can you explain how to obtain the SYSRQ-t output ?
I also made the tests concerning the processor.max_cstate parameter and neither of the two solves the problem.

Thanks for your help.
Comment 28 Venkatesh Pallipadi 2008-03-12 12:27:26 UTC
To make sure that processor.max_cstate setting worked: with the setting you should not see C3 (or C2 and C3 for max_cstate=1) in the above cpuidle output.


For sysrq, enable MAGIC_SYSRQ in your config and enable sysrq by
# echo 1 > /proc/sys/kernel/sysrq

Then you can use the commands in Documentation/sysrq.txt by pressing the magic key stroke or by
echo t > /proc/sysrq-trigger
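
For repeated collection over SSH, the two `echo` steps above can be wrapped in a small script. This is only a sketch: the two `/proc` paths are the ones given above, while the helper names (`write_value`, `dump_tasks`, `dry_run`) are made up for illustration. Writing to the real files needs root, so `dry_run` exercises the same code against scratch files.

```python
import os
import tempfile

SYSRQ_ENABLE = "/proc/sys/kernel/sysrq"
SYSRQ_TRIGGER = "/proc/sysrq-trigger"


def write_value(path, value):
    """Write a short string to a kernel control file (needs root for /proc)."""
    with open(path, "w") as handle:
        handle.write(value)


def dump_tasks(enable_path=SYSRQ_ENABLE, trigger_path=SYSRQ_TRIGGER):
    """Enable SysRq, then request the task dump ('t'); the result lands in dmesg."""
    write_value(enable_path, "1")
    write_value(trigger_path, "t")


def dry_run():
    """Exercise dump_tasks() against scratch files instead of /proc."""
    with tempfile.TemporaryDirectory() as scratch:
        enable = os.path.join(scratch, "sysrq")
        trigger = os.path.join(scratch, "sysrq-trigger")
        dump_tasks(enable, trigger)
        with open(enable) as e, open(trigger) as t:
            return e.read(), t.read()
```

Calling `dump_tasks()` as root on the hung machine and then saving `dmesg` gives the same result as the manual commands.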
Comment 29 François Valenduc 2008-03-12 12:55:16 UTC
It's strange, I have set processor.max_cstate to 2 and there are still lines related to state3 in the cpuidle output.
Comment 30 François Valenduc 2008-03-12 12:58:44 UTC
Created attachment 15229 [details]
output of dmesg after triggering sysrq commands
Comment 31 Venkatesh Pallipadi 2008-03-12 12:59:56 UTC
Probably the max_cstate parameter did not work. Did you change ACPI PROCESSOR in
config to "y" from "m"? Your original config had it as "m".
Comment 32 François Valenduc 2008-03-12 13:14:39 UTC
Sorry, I had understood it the other way round and had left the processor driver configured as a module. If I compile it into the kernel instead, it works with processor.max_cstate set to 2 or 1. So it seems that the C3 state is problematic when CPU idle is enabled.
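
The check described in comment 28 (no state deeper than the limit should be listed) can be written down as a small helper. This is a sketch under one assumption: the cpuidle `name` values map one-to-one onto C-state numbers, as they do in the output in comment 25. The function name `states_beyond_limit` is hypothetical.

```python
def states_beyond_limit(state_names, max_cstate):
    """Return the cpuidle state names deeper than C<max_cstate>."""
    deeper = []
    for name in state_names:
        # Only names of the form "C<number>" are considered.
        if name.startswith("C") and name[1:].isdigit():
            if int(name[1:]) > max_cstate:
                deeper.append(name)
    return deeper


# State names as listed in the cpuidle output earlier in this report.
listed = ["C0", "C1", "C2", "C3"]
print(states_beyond_limit(listed, 2))
```

A non-empty result means that the processor.max_cstate parameter was not honoured, which is what happened here while the processor driver was still built as a module.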
Comment 33 Len Brown 2008-03-12 14:09:19 UTC
does the behaviour change with and without CONFIG_NO_HZ?
Comment 34 Len Brown 2008-03-12 18:44:12 UTC
> No, with CPU-idle, the problem always happens.

So it fails in 2.6.24 just as much as it fails in 2.6.25?
 
> ACPI: HPET 1FEEBFA0, 0038 (r1 ACER   Kestrel  20020909 PTL         0)
> ACPI: BOOT 1FEEBFD8, 0028 (r1 ACER   Kestrel  20020909  LTP        1)
> ACPI: DMI detected: Acer
> ACPI: PM-Timer IO Port: 0x1008
> ACPI: HPET id: 0x8086a201 base: 0x0 is invalid

please try booting with "hpet=force" to see if that helps

when the system is "hung", please collect the output from
"cat /proc/timer_list" and "cat /proc/interrupts" and paste it here.

when the system is "hung", if you type "sleep 1" does it return?

please attach the output from lspci
Comment 35 François Valenduc 2008-03-13 11:02:50 UTC
Without CONFIG_NO_HZ, it's even worse. The computer is directly blocked when X starts and a few seconds later, the screen becomes black. It's also impossible to connect via SSH to the computer.
With CONFIG_NO_HZ enabled and hpet=force as boot parameter, it also hangs when X is started, but not completely. I can make an SSH connection. The output of cat /proc/interrupts is the following:

           CPU0
  0:      46976    XT-PIC-XT        timer
  1:         14    XT-PIC-XT        i8042
  2:          0    XT-PIC-XT        cascade
  3:          1    XT-PIC-XT
  4:          1    XT-PIC-XT
  5:          1    XT-PIC-XT
  6:       1604    XT-PIC-XT        uhci_hcd:usb1, uhci_hcd:usb2, uhci_hcd:usb3, eth0, radeon@pci:0000:01:00.0
  7:          1    XT-PIC-XT
  8:         32    XT-PIC-XT        rtc
  9:      52040    XT-PIC-XT        acpi
 10:       1765    XT-PIC-XT        yenta, ohci1394, ehci_hcd:usb4, Intel 82801DB-ICH4, ipw2200
 12:        752    XT-PIC-XT        i8042
 14:       4859    XT-PIC-XT        ide0
 15:        325    XT-PIC-XT        ide1
NMI:          0   Non-maskable interrupts
TRM:          0   Thermal event interrupts
SPU:          0   Spurious interrupts
ERR:          0

The output of "cat /proc/timer_list" is the following:

Timer List Version: v0.3
HRTIMER_MAX_CLOCK_BASES: 2
now at 210040366375 nsecs

cpu: 0
 clock 0:
  .index:      0
  .resolution: 1 nsecs
  .get_time:   ktime_get_real
  .offset:     1205431075401541167 nsecs
active timers:
 clock 1:
  .index:      1
  .resolution: 1 nsecs
  .get_time:   ktime_get
  .offset:     0 nsecs
active timers:
 #0: <dfaedee8>, tick_sched_timer, S:01
 # expires at 210041000000 nsecs [in 633625 nsecs]
 #1: <dfaedee8>, it_real_fn, S:01
 # expires at 210042483543 nsecs [in 2117168 nsecs]
  .expires_next   : 210041000000 nsecs
  .hres_active    : 1
  .nr_events      : 169713
  .nohz_mode      : 2
  .idle_tick      : 75726000000 nsecs
  .tick_stopped   : 0
  .idle_jiffies   : 4294743022
  .idle_calls     : 10360
  .idle_sleeps    : 7607
  .idle_entrytime : 75773696308 nsecs
  .idle_waketime  : 75781004081 nsecs
  .idle_exittime  : 75781012671 nsecs
  .idle_sleeptime : 47747011259 nsecs
  .last_jiffies   : 4294743069
  .next_jiffies   : 4294743077
  .idle_expires   : 75781000000 nsecs
jiffies: 4294877336


Tick Device: mode:     1
Clock Event Device: hpet
 max_delta_ns:   2147483647
 min_delta_ns:   3352
 mult:           61496110
 shift:          32
 mode:           3
 next_event:     210041000000 nsecs
 set_next_event: hpet_legacy_next_event
 set_mode:       hpet_legacy_set_mode
 event_handler:  hrtimer_interrupt


Also, "sleep 1" doesn't "unfreeze" X.
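
The /proc/interrupts output above shows the radeon device sharing IRQ 6 with the USB controllers and eth0. A minimal Python sketch for listing shared lines follows; it assumes a single-CPU layout (one count column) as in the output above, and the function name `shared_irqs` is illustrative.

```python
def shared_irqs(interrupts_text):
    """Return {irq number: [device names]} for lines with more than one device."""
    shared = {}
    for line in interrupts_text.splitlines():
        head, sep, rest = line.partition(":")
        if not sep or not head.strip().isdigit():
            continue  # skip the CPU header and the NMI/TRM/SPU/ERR rows
        fields = rest.split(None, 2)  # count, controller, device list
        if len(fields) < 3:
            continue  # no device registered on this line
        devices = [d.strip() for d in fields[2].split(",")]
        if len(devices) > 1:
            shared[int(head)] = devices
    return shared


# Lines copied from the /proc/interrupts output above.
SAMPLE = """\
           CPU0
  0:      46976    XT-PIC-XT        timer
  3:          1    XT-PIC-XT
  6:       1604    XT-PIC-XT        uhci_hcd:usb1, uhci_hcd:usb2, uhci_hcd:usb3, eth0, radeon@pci:0000:01:00.0
 10:       1765    XT-PIC-XT        yenta, ohci1394, ehci_hcd:usb4, Intel 82801DB-ICH4, ipw2200
ERR:          0
"""

if __name__ == "__main__":
    for irq, devices in sorted(shared_irqs(SAMPLE).items()):
        print(irq, devices)
```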
Comment 36 François Valenduc 2008-03-13 11:04:17 UTC
Created attachment 15251 [details]
output of lspci
Comment 37 Thomas Gleixner 2008-03-28 01:49:46 UTC
Is the problem still there with 2.6.25-rc7 ?
Comment 38 François Valenduc 2008-03-29 23:41:31 UTC
Unfortunately, the problem still occurs in the same way with 2.6.25-rc7.
Comment 39 François Valenduc 2008-03-30 00:23:26 UTC
And it also occurs with the current git version (2.6.25-rc7-git5).
Comment 40 Venkatesh Pallipadi 2008-03-31 09:51:35 UTC
Can you please try the patch here on rc7-git5 and see whether that helps?

Patch : http://marc.info/?l=linux-kernel&m=120674502201007&w=4
Comment 41 François Valenduc 2008-03-31 10:36:59 UTC
Unfortunately, the patch doesn't change anything. The graphical interface is still frozen around 10 seconds after the start of X.
Comment 42 Venkatesh Pallipadi 2008-03-31 14:30:24 UTC
Running out of ideas on this one.

Does the problem happen every time with the git kernel or is it only once in several reboots? If it is happening every time, it will help if you can try git bisect to narrow down on a specific set of patches...
Comment 43 François Valenduc 2008-03-31 21:49:36 UTC
CPU idle has never worked on my computer, with any kernel version. It doesn't work with 2.6.24, where it was first introduced if I am not wrong. It doesn't work in any of the newer kernels or git versions either. So, I think git-bisect won't be useful.
Is it possible that my computer doesn't support CPU Idle? Or maybe it doesn't support the C3 state (since it works if I set C2 as max_cstate).
Comment 44 Venkatesh Pallipadi 2008-04-01 08:50:40 UTC
OK. At least this is not a regression bug since 2.6.24.

Having said that, if you see all 3 C-states in /proc/acpi/processor/*/power with the kernel not having CPU_IDLE configured and with CPU_IDLE configured it can only work with max_cstate=2, then there is some bug in CPU_IDLE.

I think we can reduce the priority of this one as it is not a regression. I will send some debug patch soon that should help us identify what is going wrong with CPU_IDLE. In the meantime, can you attach the output of
# cat /proc/acpi/processor/*/power
with CPU_IDLE not configured.
Comment 45 Venkatesh Pallipadi 2008-04-01 09:00:09 UTC
unmarking regression flag.
Comment 46 François Valenduc 2008-04-01 09:58:21 UTC
Here is the output of cat /proc/acpi/processor/*/power: 

active state:            C2
max_cstate:              C8
bus master activity:     40020001
maximum allowed latency: 2000 usec
states:
    C1:                  type[C1] promotion[C2] demotion[--] latency[000] usage[00000010] duration[00000000000000000000]
   *C2:                  type[C2] promotion[C3] demotion[C1] latency[001] usage[00013445] duration[00000000000266258279]
    C3:                  type[C3] promotion[--] demotion[C2] latency[085] usage[00000225] duration[00000000000002117141]
Comment 47 François Valenduc 2008-04-01 09:59:36 UTC
(In reply to comment #46)
> Here is the output of cat /proc/acpi/processor/*/power: 
> 
> active state:            C2
> max_cstate:              C8
> bus master activity:     40020001
> maximum allowed latency: 2000 usec
> states:
>     C1:                  type[C1] promotion[C2] demotion[--] latency[000]
> usage[00000010] duration[00000000000000000000]
>    *C2:                  type[C2] promotion[C3] demotion[C1] latency[001]
> usage[00013445] duration[00000000000266258279]
>     C3:                  type[C3] promotion[--] demotion[C2] latency[085]
> usage[00000225] duration[00000000000002117141]
> 

The command was run with kernel 2.6.24.4 with CPU_IDLE disabled.
Comment 48 Len Brown 2008-04-01 18:57:26 UTC
It is important to find out for sure if this failure is
specific to CPU_IDLE, or if the root cause existed
before CPU_IDLE, but perhaps was just hidden.

please boot a CONFIG_CPU_IDLE=n kernel
with processor.bm_history=0
to see if the non-cpuidle kernel can also fail.
This will make it more aggressive about entering C3,
and that should be reflected in
/proc/acpi/processor/*/power 

Or, if you have USB devices plugged in, you might try
unplugging them to reduce the bus master history interference
and enter C3 more to stress it more.
Comment 49 François Valenduc 2008-04-01 22:11:01 UTC
The failure might not be in CPU_IDLE. If I boot with a kernel without cpu idle and with  processor.bm_history=0 like you suggested, I also encounter the problem.

The output of /proc/acpi/processor/*/power is the following.

active state:            C2
max_cstate:              C8
bus master activity:     24402041
maximum allowed latency: 2000 usec
states:
    C1:                  type[C1] promotion[C2] demotion[--] latency[000] usage[00000010] duration[00000000000000000000]
   *C2:                  type[C2] promotion[C3] demotion[C1] latency[001] usage[00008880] duration[00000000000099016541]
    C3:                  type[C3] promotion[--] demotion[C2] latency[085] usage[00003717] duration[00000000000026832004]

So, the current C-state is C2 but C3 has been used more often.
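
To put a number on "more often", the two /proc/acpi/processor/*/power snapshots (comment 46 and this comment) can be compared by the share of C-state entries that went to C3. This is a sketch in Python, assuming the `usage[...]` field counts entries into each state; the names `parse_usage` and `usage_share` are illustrative.

```python
import re

# Matches state rows such as
#    *C2:   type[C2] promotion[C3] ... usage[00013445] duration[...]
STATE_RE = re.compile(r"^\s*\*?(C\d+):.*?usage\[(\d+)\]")


def parse_usage(text):
    """Return {state name: usage count} from /proc/acpi/processor/*/power text."""
    usage = {}
    for line in text.splitlines():
        match = STATE_RE.match(line)
        if match:
            usage[match.group(1)] = int(match.group(2))
    return usage


def usage_share(usage, state):
    """Fraction of all recorded state entries that went to `state`."""
    total = sum(usage.values())
    return usage.get(state, 0) / total if total else 0.0


# State rows copied from comment 46 (default bm_history).
DEFAULT_RUN = """\
    C1:                  type[C1] promotion[C2] demotion[--] latency[000] usage[00000010] duration[00000000000000000000]
   *C2:                  type[C2] promotion[C3] demotion[C1] latency[001] usage[00013445] duration[00000000000266258279]
    C3:                  type[C3] promotion[--] demotion[C2] latency[085] usage[00000225] duration[00000000000002117141]
"""

# State rows copied from this comment (processor.bm_history=0).
BM_HISTORY_0_RUN = """\
    C1:                  type[C1] promotion[C2] demotion[--] latency[000] usage[00000010] duration[00000000000000000000]
   *C2:                  type[C2] promotion[C3] demotion[C1] latency[001] usage[00008880] duration[00000000000099016541]
    C3:                  type[C3] promotion[--] demotion[C2] latency[085] usage[00003717] duration[00000000000026832004]
"""

if __name__ == "__main__":
    for label, text in (("default", DEFAULT_RUN), ("bm_history=0", BM_HISTORY_0_RUN)):
        print(label, f"{usage_share(parse_usage(text), 'C3'):.1%}")
```

With these figures the C3 share of entries rises from roughly 1.6% to roughly 29%, even though C2 remains the most used state in both snapshots.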
Comment 50 François Valenduc 2008-04-02 00:17:26 UTC
Your remark about USB devices also reminds me that if I use a kernel with CPU-idle enabled, it works well (without setting the max cstate) if I use a USB mouse. Maybe this prevents the computer from switching too often to the C3 state.
So, if I understand correctly, the C3 state seems to be problematic for my computer and CPU idle is not the real cause of the problem. It simply triggers another problem.
Comment 51 Venkatesh Pallipadi 2008-04-02 05:32:55 UTC
Yes. C3 being problematic was our suspicion as well.
There is some interaction between C3 state and X driver. If we frequently use C3 during X startup it can hang the driver. CPU_IDLE is aggressive with going into C3 state (to save more power) and thus exposes the problem. Your observation with USB mouse reinforces this theory. 
Comment 52 Len Brown 2008-04-09 19:01:16 UTC
Did this problem exist in 2.6.24?
Comment 53 Len Brown 2008-04-09 19:02:55 UTC
also, can you clarify what graphics drivers you are using
to talk to the Radeon and if different drivers have any effect?
Comment 54 François Valenduc 2008-04-10 09:41:17 UTC
I have already said that this problem occurs with kernel 2.6.24 and all release candidates or git snapshots of 2.6.25 when CPU_IDLE is enabled (see comment #43 for example). But, if CPU_IDLE is disabled, the problem also occurs if I set processor.bm_history=0 like you suggested (see comment #48).

I am using the radeon DRM driver included in the kernel. I have also tried  the proprietary fglrx driver and the problem occurs with this driver as well.
Comment 55 Venkatesh Pallipadi 2008-04-10 10:09:55 UTC
As the problem happens with the in-tree radeon DRM driver, I am thinking of adding a DMI check for this platform to disable the C3 state, unless someone who understands the radeon DRM driver can figure out where exactly and why it is hanging.

Can you attach the dmidecode info from this laptop? I can send a patch to disable C3 based on that.
Comment 56 François Valenduc 2008-04-10 10:19:20 UTC
Created attachment 15721 [details]
output of dmidecode
Comment 57 James Ettle 2008-04-21 01:29:58 UTC
This is interesting. I have a T8100-based notebook that does something similar. It doesn't lock solid, but if I let it idle, and keep interrupt chatter to an absolute minimum (no mouse, unplug network, disable chatty USB devices) it enters a "daydream" state where the machine pauses. Touch the mouse or keyboard and it comes back to life as if nothing happened, and I can't see any other ill effect. I'm wondering if it's related to this; booting with nohz=off prevents the problem.
Comment 58 François Valenduc 2008-05-12 04:18:38 UTC
The same problem occurs again with kernel 2.6.26-rc1, even when CPU_IDLE is not enabled. So, this becomes extremely annoying!
Comment 59 Venkatesh Pallipadi 2008-05-13 17:21:44 UTC
Created attachment 16129 [details]
workaround patch

Patch to add this laptop to max_cstate blacklist. This auto-disables C3 on this model.
Comment 60 François Valenduc 2008-05-14 00:07:26 UTC
So, with this patch, my computer is not blocked anymore when X starts. It seems rather a way to get rid of the problem than to really solve it. What I don't understand is that the problem now occurs with or without CPU IDLE.
Comment 61 Venkatesh Pallipadi 2008-05-14 10:06:14 UTC
Yes. The problem is with C3 state and X driver. It is just some timing and frequency of C3 invocation that made this not happen without CPU_IDLE earlier.

Yes. The patch is just a workaround/bandaid for now. We need to know more about X driver on this platform to narrow this down. Copying Dave who may be able to help us with radeon drm driver...
Comment 62 François Valenduc 2008-05-14 12:43:40 UTC
After further investigation, I think the problem appears somewhere between 2.6.25-git7 and -git8. I have tried git-bisect but I end up with a kernel panic at each step, so it's difficult to be more precise.
I also find strange that the problem also appears with max_cstate=2 as kernel parameter and ACPI processor support compiled in the kernel. Previously, setting the max_cstate parameter was a way to avoid the problem.
Comment 63 Dave Airlie 2008-05-14 16:07:27 UTC
can you try with Option "DRI" "Off" in your xorg.conf

this will rule out the drm kernel driver and at least place the issue up with X itself.

Is this an AGP system?
Comment 64 François Valenduc 2008-05-15 10:18:05 UTC
When I disable DRI, X works correctly and doesn't freeze. So, there is probably a problem in the drm or radeon driver.
This is an AGP system.
Comment 65 François Valenduc 2008-05-20 02:44:28 UTC
X works correctly again with the latest git version of 2.6.26-rc3, without the workaround patch. It seems that the commit 860da5e578c25d1ab4528c0d1ad13f9969e3490f (Merge branch 'drm-patches' of git://git./linux/kernel/git/airlied/drm-2.6) solves the problem.
Even if CPU_IDLE is still problematic, X doesn't freeze anymore if I disable it (like with kernel 2.6.25).
Comment 66 François Valenduc 2008-07-16 06:52:25 UTC
This problem occurs again with the final release of 2.6.26 (and maybe already with the release candidates). The only way to avoid the problem for sure for the moment is to apply the workaround patch. The commit I mentioned in comment #65 helps a bit. However, with kernel 2.6.26 and without the workaround, the screen becomes black immediately after X startup and it seems impossible to unfreeze X.
Is this problem impossible to solve ?
Comment 67 François Valenduc 2008-07-16 07:01:41 UTC
I forgot to add that if I disable DPMS in xorg.conf, the screen doesn't become black after X startup, but X is blocked almost immediately after it starts.
Comment 68 François Valenduc 2008-07-17 11:02:36 UTC
With the latest git (2.6.26-git5), the output of cat /proc/acpi/processor/CPU0/power is the following:

active state:            C3
max_cstate:              C8
bus master activity:     00000000
maximum allowed latency: 2000000000 usec
states:
    C1:                  type[C1] promotion[C2] demotion[--] latency[000] usage[00000010] duration[00000000000000000000]
    C2:                  type[C2] promotion[C3] demotion[C1] latency[001] usage[00035816] duration[00000000000328423975]
   *C3:                  type[C3] promotion[--] demotion[C2] latency[085] usage[00000183] duration[00000000000001543805]

If I compare with the stats quoted in comment #49, the maximum allowed latency is 1,000,000 times higher (2000000000 usec instead of 2000 usec). Is it really normal?
Comment 69 François Valenduc 2008-07-23 03:10:01 UTC
Does anybody still care about this very annoying problem? Or do you plan to submit the workaround patch for inclusion in the official kernel if no other solution can be found?
Comment 70 François Valenduc 2008-07-23 04:26:05 UTC
I have retried with kernel 2.6.24.7 and the same problem still occurs. Maybe this problem has been there for a very long time. Previously, I didn't notice the problem because at that time, I used a USB mouse.
It's still true that X doesn't hang with 2.6.26 if I plug in a USB mouse, as explained in comment #51. However, without a USB mouse plugged in, X hangs every time it starts.
Comment 71 Jérôme Glisse 2008-09-16 07:11:02 UTC
Does this still happen with 2.6.27 rc or git ? If so output of dmesg after doing :

echo 1 > /sys/module/drm/parameters/debug &&
echo 0 > /sys/module/drm/parameters/debug

might be helpful (radeon might flood your log with the same message over and over;
this is why activating debug for a short time is enough).
Comment 72 François Valenduc 2008-09-16 10:14:05 UTC
The problem still occurs with 2.6.27-rc6. Doing echo 1 > /sys/module/drm/parameters/debug when the problem has occurred gives these messages in dmesg:

irq 6: nobody cared (try booting with the "irqpoll" option)
Pid: 6726, comm: X Not tainted 2.6.27-rc6 #8
 [<c044fd04>] __report_bad_irq+0x24/0x90
 [<e125f9af>] b44_interrupt+0x3f/0x100 [b44]
 [<c044ff9f>] note_interrupt+0x22f/0x260
 [<c044f2e5>] handle_IRQ_event+0x25/0x60
 [<c04508d0>] handle_level_irq+0x0/0xa0
 [<c045094c>] handle_level_irq+0x7c/0xa0
 [<c0405b7f>] do_IRQ+0x6f/0xc0
 [<c0403bf7>] common_interrupt+0x23/0x28
 [<c041ef0e>] __do_softirq+0x2e/0x90
 [<c041eee0>] __do_softirq+0x0/0x90
 [<c0405812>] call_on_stack+0x12/0x20
 [<c041eeb5>] irq_exit+0x45/0x70
 [<c0405b86>] do_IRQ+0x76/0xc0
 [<c0403bf7>] common_interrupt+0x23/0x28
 [<c04d0000>] kobject_release+0x30/0x80
 [<e17463c6>] radeon_do_wait_for_idle+0x86/0x160 [radeon]
 [<e1746d10>] radeon_cp_idle+0x0/0xc0 [radeon]
 [<e1746d10>] radeon_cp_idle+0x0/0xc0 [radeon]
 [<e171031a>] drm_ioctl+0x1ba/0x2f0 [drm]
 [<e1710160>] drm_ioctl+0x0/0x2f0 [drm]
 [<c047f669>] vfs_ioctl+0x69/0x70
 [<c047f6cc>] do_vfs_ioctl+0x5c/0x250
 [<c05bb0c2>] schedule+0x172/0x2b0
 [<c047f8fd>] sys_ioctl+0x3d/0x70
 [<c0403a29>] sysenter_do_call+0x12/0x25
 =======================
handlers:
[<e12b62e0>] (usb_hcd_irq+0x0/0x70 [usbcore])
[<e12b62e0>] (usb_hcd_irq+0x0/0x70 [usbcore])
[<e12b62e0>] (usb_hcd_irq+0x0/0x70 [usbcore])
[<e125f970>] (b44_interrupt+0x0/0x100 [b44])
[<e1753d70>] (radeon_driver_irq_handler+0x0/0x170 [radeon])
Disabling IRQ #6
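
For context on the "nobody cared" message above: IRQ 6 is shared by the USB controllers, b44 and radeon, and the kernel asks each registered handler in turn whether its device raised the interrupt. If too many interrupts go unclaimed, the kernel disables the line. Below is a minimal userspace sketch of that bookkeeping in C. All names, the fixed-size table and the threshold are assumptions made for illustration; this is not kernel code.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's IRQ_NONE / IRQ_HANDLED results. */
enum irq_result { RESULT_NONE = 0, RESULT_HANDLED = 1 };

/* A fake device: `pending` is set when it has raised the shared line. */
struct fake_dev {
    int pending;
};

typedef enum irq_result (*handler_fn)(struct fake_dev *dev);

#define MAX_HANDLERS 8

struct shared_line {
    handler_fn handlers[MAX_HANDLERS];
    struct fake_dev *devs[MAX_HANDLERS];
    size_t count;
    unsigned unhandled;       /* interrupts that no handler claimed */
    unsigned unhandled_limit; /* illustrative threshold, not the kernel's */
    int disabled;
};

/* Typical handler shape: claim the interrupt only if our device raised it. */
enum irq_result fake_handler(struct fake_dev *dev)
{
    if (!dev->pending)
        return RESULT_NONE;
    dev->pending = 0; /* acknowledge the device */
    return RESULT_HANDLED;
}

/* Register a handler on the line; returns 0 on success, -1 if the table is full. */
int line_add(struct shared_line *line, handler_fn fn, struct fake_dev *dev)
{
    if (line->count >= MAX_HANDLERS)
        return -1;
    line->handlers[line->count] = fn;
    line->devs[line->count] = dev;
    line->count++;
    return 0;
}

/* Deliver one interrupt: every handler on the line is asked in turn.
 * Returns 1 if some handler claimed it, 0 otherwise. */
int line_deliver(struct shared_line *line)
{
    int handled = 0;
    size_t k;

    if (line->disabled)
        return 0;
    for (k = 0; k < line->count; k++)
        if (line->handlers[k](line->devs[k]) == RESULT_HANDLED)
            handled = 1;
    if (!handled && ++line->unhandled >= line->unhandled_limit)
        line->disabled = 1; /* corresponds to the "Disabling IRQ" step */
    return handled;
}
```

Once the line is disabled in this model, none of the devices sharing it receives interrupts any more, which is why a single misbehaving device on a shared line can affect the others.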
Comment 73 Jérôme Glisse 2008-09-16 13:29:12 UTC
Created attachment 17814 [details]
Set some bus state so that cpu c3 state doesn't lead to CP trouble

Attached is a patch which might help to fix this issue. If it doesn't, it will at
least provide some more debugging information in your kernel log (no need to
enable drm debug).
Comment 74 François Valenduc 2008-09-16 22:01:02 UTC
I have tried your patch and unfortunately, it produces the following compile error:

 CC [M]  drivers/gpu/drm/radeon/radeon_cp.o
  CC [M]  drivers/gpu/drm/radeon/radeon_irq.o
drivers/gpu/drm/radeon/radeon_irq.c: In function 'radeon_acknowledge_irqs':
drivers/gpu/drm/radeon/radeon_irq.c:42: error: expected expression before '^' token
drivers/gpu/drm/radeon/radeon_irq.c:43: error: expected expression before '^' token
distcc[10266] ERROR: compile drivers/gpu/drm/radeon/radeon_irq.c on pc-francois failed
make[4]: *** [drivers/gpu/drm/radeon/radeon_irq.o] Error 1
make[3]: *** [drivers/gpu/drm/radeon] Error 2
make[2]: *** [drivers/gpu/drm] Error 2
make[1]: *** [drivers/gpu] Error 2
make[1]: *** Waiting for unfinished jobs....
make: *** [drivers] Error 2
Comment 75 Jérôme Glisse 2008-09-17 02:10:28 UTC
Created attachment 17831 [details]
Set some bus state so that cpu c3 state doesn't lead to CP trouble

Sorry, once again I used the wrong operator. The attached patch should compile.
Comment 76 François Valenduc 2008-09-17 10:10:44 UTC
Unfortunately, your patch doesn't solve the problem. Here is what I get in dmesg:

[drm] Initialized drm 1.1.0 20060810
pci 0000:01:00.0: power state changed by ACPI to D0
pci 0000:01:00.0: PCI INT A -> Link[LNKA] -> GSI 6 (level, low) -> IRQ 6
[drm] Initialized radeon 1.29.0 20080528 on minor 0
agpgart-intel 0000:00:00.0: AGP 2.0 bridge
agpgart-intel 0000:00:00.0: putting AGP V2 device into 4x mode
pci 0000:01:00.0: putting AGP V2 device into 4x mode
[drm] Setting GART location based on new memory map
[drm] Loading R300 Microcode
[drm] initial BUS_CNTL  : 0x5133A2A0
[drm] initial BUS_CNTL1 : 0x00004090
[drm] set BUS_CNTL  : 0x5133A2A0
[drm] set BUS_CNTL1 : 0x00004090
[drm] Num pipes: 1
[drm] writeback test succeeded in 1 usecs
[drm] irq not acking : 0x00080026
evdev.c(EVIOCGBIT): Suspicious buffer size 511, limiting output to 64 bytes. See http://userweb.kernel.org/~dtor/eviocgbit-bug.html


I have never seen the message related to evdev before. Do you think it's related to the problem ?
Comment 77 Jérôme Glisse 2008-09-17 12:12:04 UTC
When you activated drm debug, was there any message beginning with:

wait idle failed status

in any of your log files?
Comment 78 François Valenduc 2008-09-17 12:38:16 UTC
Created attachment 17841 [details]
output of dmesg | grep drm

I didn't find any line beginning with "wait idle failed status". You can find the dmesg output containing the drm related messages with debug enabled in this log.
Comment 79 Jérôme Glisse 2008-09-18 03:17:57 UTC
Created attachment 17852 [details]
Set some bus state so that cpu c3 state doesn't lead to CP trouble

Okay, here is another patch which sets some more bus state and adds some more
debugging information. Once the lockup happens, could you do the echo 1 > /sys/./drm
stuff, grep drm in your log and attach the result?
Comment 80 François Valenduc 2008-09-18 04:34:54 UTC
The log I sent yesterday was obtained after the problem had occurred, with the drm module loaded with debug enabled.
Do you want another log obtained with the last patch?
Comment 81 Jérôme Glisse 2008-09-18 05:20:38 UTC
Yes please, a log with the latest patch, as this patch adds some more debug
information which might be insightful. This patch might also help to fix
the problem, but I don't have much faith in that.
Comment 82 François Valenduc 2008-09-18 09:55:16 UTC
Created attachment 17863 [details]
output of dmesg | grep drm

As you expected, this patch doesn't solve the problem. Furthermore, there are now several lines indicating "wait idle failed status" in dmesg.
Comment 83 Jérôme Glisse 2008-09-18 11:18:23 UTC
Created attachment 17864 [details]
Print more debug information

Sorry, I misplaced the debug information. Could you run again and attach the log output
with this patch?
Comment 84 François Valenduc 2008-09-23 13:39:50 UTC
Created attachment 17976 [details]
output of dmesg | grep drm

So, after some delay, here is the output of dmesg | grep drm with your last patch.
Comment 85 Jérôme Glisse 2008-09-24 02:17:58 UTC
Created attachment 18000 [details]
Log CP status and remove useless debug output.

Unfortunately this output is not interesting at all. It shows that it fails
to get a free buffer because the CP is stuck. It doesn't include the debug
output I was looking for. Attached is a patch which removes the useless debug output.
The debug output can also be verbose, so maybe the message I was looking for
was cut off. For reference, I am looking for:

wait idle failed status : 

followed by 3 values. These are the values I am interested in.
Comment 86 François Valenduc 2008-09-24 10:01:02 UTC
Created attachment 18006 [details]
output of dmesg | grep drm

So here is another dmesg output obtained with your last patch.

There are a lot of lines like the following:
[drm:radeon_do_cp_idle]
[drm:radeon_do_wait_for_idle] wait idle failed status : 0x80010140 0x00000000 0xC0002804

Is this what you were looking for ?
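
Since the log contains many such lines, the three values can be pulled out mechanically rather than by eye. Here is a minimal sketch in C, assuming the line format shown above; parse_wait_idle_status is a hypothetical helper written for this report and is not part of the driver or of any tool mentioned here.

```c
#include <stdio.h>
#include <string.h>

/* Marker text taken from the log line quoted above. */
#define WAIT_IDLE_MARKER "wait idle failed status :"

/*
 * Extract the three hex values that follow the marker.
 * Returns 1 and fills out[0..2] on success, 0 if the line does not match.
 * (Hypothetical helper, for illustration only.)
 */
int parse_wait_idle_status(const char *line, unsigned int out[3])
{
    const char *p = strstr(line, WAIT_IDLE_MARKER);

    if (p == NULL)
        return 0;
    p += strlen(WAIT_IDLE_MARKER);
    /* %x skips leading whitespace and accepts an optional 0x prefix. */
    return sscanf(p, "%x %x %x", &out[0], &out[1], &out[2]) == 3;
}
```

Feeding it each line of the dmesg output and printing only the lines where it returns 1 would show whether the three values change from one failure to the next.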
Comment 87 Jérôme Glisse 2008-09-24 13:41:11 UTC
Created attachment 18015 [details]
Keep CPU busy for sometimes after ring commit

This was not the state I expected to see. Anyway, attached is a hack that might
help: basically it keeps the CPU busy a bit longer after committing the ring.
Comment 88 François Valenduc 2008-09-24 13:54:17 UTC
Unfortunately, this patch gives the following compile error:

  CC [M]  drivers/gpu/drm/radeon/radeon_state.o
drivers/gpu/drm/radeon/radeon_cp.c: In function 'radeon_do_cp_idle':
drivers/gpu/drm/radeon/radeon_cp.c:414: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_cp.c:421: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_cp.c: In function 'radeon_do_cp_start':
drivers/gpu/drm/radeon/radeon_cp.c:439: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_cp.c:450: error: 'for' loop initial declaration used outside C99 mode
distcc[12998] ERROR: compile drivers/gpu/drm/radeon/radeon_cp.c on localhost failed
make[4]: *** [drivers/gpu/drm/radeon/radeon_cp.o] Error 1
make[4]: *** Waiting for unfinished jobs....
drivers/gpu/drm/radeon/radeon_state.c: In function 'radeon_check_and_fixup_packets':
drivers/gpu/drm/radeon/radeon_state.c:174: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c: In function 'radeon_emit_clip_rect':
drivers/gpu/drm/radeon/radeon_state.c:434: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c: In function 'radeon_emit_state':
drivers/gpu/drm/radeon/radeon_state.c:466: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c:485: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c:492: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c:502: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c:512: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c:521: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c:533: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c:542: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c:555: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c:575: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c:595: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c: In function 'radeon_emit_state2':
drivers/gpu/drm/radeon/radeon_state.c:620: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c: In function 'radeon_clear_box':
drivers/gpu/drm/radeon/radeon_state.c:764: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c:770: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c: In function 'radeon_cp_dispatch_clear':
drivers/gpu/drm/radeon/radeon_state.c:879: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c:905: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c:927: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c:1003: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c:1035: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c:1058: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c:1086: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c:1109: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c:1197: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c:1226: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c:1273: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c:1297: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c:1333: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c: In function 'radeon_cp_dispatch_swap':
drivers/gpu/drm/radeon/radeon_state.c:1359: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c:1373: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c:1410: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c: In function 'radeon_cp_dispatch_flip':
drivers/gpu/drm/radeon/radeon_state.c:1437: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c:1457: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c: In function 'radeon_cp_dispatch_vertex':
drivers/gpu/drm/radeon/radeon_state.c:1525: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c: In function 'radeon_cp_discard_buffer':
drivers/gpu/drm/radeon/radeon_state.c:1551: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c: In function 'radeon_cp_dispatch_indirect':
drivers/gpu/drm/radeon/radeon_state.c:1583: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c: In function 'radeon_cp_dispatch_texture':
drivers/gpu/drm/radeon/radeon_state.c:1679: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c:1854: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c:1871: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c:1885: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c:1889: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c: In function 'radeon_cp_dispatch_stipple':
drivers/gpu/drm/radeon/radeon_state.c:1901: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c: In function 'radeon_cp_clear':
drivers/gpu/drm/radeon/radeon_state.c:2131: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c: In function 'radeon_do_init_pageflip':
drivers/gpu/drm/radeon/radeon_state.c:2144: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c: In function 'radeon_cp_flip':
drivers/gpu/drm/radeon/radeon_state.c:2179: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c: In function 'radeon_cp_swap':
drivers/gpu/drm/radeon/radeon_state.c:2199: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c: In function 'radeon_cp_vertex':
drivers/gpu/drm/radeon/radeon_state.c:2275: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c: In function 'radeon_cp_indices':
drivers/gpu/drm/radeon/radeon_state.c:2363: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c: In function 'radeon_cp_stipple':
drivers/gpu/drm/radeon/radeon_state.c:2409: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c: In function 'radeon_cp_indirect':
drivers/gpu/drm/radeon/radeon_state.c:2459: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c:2474: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c: In function 'radeon_cp_vertex2':
drivers/gpu/drm/radeon/radeon_state.c:2566: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c: In function 'radeon_emit_packets':
drivers/gpu/drm/radeon/radeon_state.c:2596: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c: In function 'radeon_emit_scalars':
drivers/gpu/drm/radeon/radeon_state.c:2615: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c: In function 'radeon_emit_scalars2':
drivers/gpu/drm/radeon/radeon_state.c:2637: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c: In function 'radeon_emit_vectors':
drivers/gpu/drm/radeon/radeon_state.c:2657: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c: In function 'radeon_emit_veclinear':
drivers/gpu/drm/radeon/radeon_state.c:2683: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c: In function 'radeon_emit_packet3':
drivers/gpu/drm/radeon/radeon_state.c:2713: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c: In function 'radeon_emit_packet3_cliprect':
drivers/gpu/drm/radeon/radeon_state.c:2763: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c:2770: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c: In function 'radeon_emit_wait':
drivers/gpu/drm/radeon/radeon_state.c:2792: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c:2797: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c:2802: error: 'for' loop initial declaration used outside C99 mode
drivers/gpu/drm/radeon/radeon_state.c: In function 'radeon_cp_cmdbuf':
drivers/gpu/drm/radeon/radeon_state.c:2967: error: 'for' loop initial declaration used outside C99 mode
distcc[13031] ERROR: compile drivers/gpu/drm/radeon/radeon_state.c on pc-francois failed
Comment 89 Jérôme Glisse 2008-09-25 02:11:50 UTC
Created attachment 18025 [details]
Keep CPU busy for sometimes after ring commit

Sorry, I only thought of it afterwards: "i" is way too common a name to be used in a macro :)
Here is another hack.
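
To illustrate the two pitfalls behind the errors in comment 88: a loop counter declared inside the "for" statement is rejected when the build is not in C99 mode, and a counter named "i" inside a macro can capture the caller's own "i". The sketch below is a minimal, hypothetical example in C; the REPEAT_* macros and count_* functions are invented for illustration and are not taken from the attached patches.

```c
/*
 * Fragile: the counter is declared before the loop (so it builds outside
 * C99 mode), but it is named `i`. If the caller's argument mentions its
 * own `i`, the inner declaration captures it.
 */
#define REPEAT_FRAGILE(n, stmt) \
    do { int i; for (i = 0; i < (n); i++) { stmt; } } while (0)

/* Safer: same structure, but with a name unlikely to clash with callers. */
#define REPEAT_SAFER(n, stmt) \
    do { int repeat_idx_; \
         for (repeat_idx_ = 0; repeat_idx_ < (n); repeat_idx_++) { stmt; } \
    } while (0)

/* The parameter is deliberately called `i` to trigger the capture. */
int count_fragile(int i)
{
    int hits = 0;

    /* Expands to `i < (i)` on the macro's own `i`, so the body never runs. */
    REPEAT_FRAGILE(i, hits++);
    return hits;
}

int count_safer(int i)
{
    int hits = 0;

    /* Here `(i)` still refers to the caller's parameter. */
    REPEAT_SAFER(i, hits++);
    return hits;
}
```

With these definitions, count_fragile(3) runs the body zero times while count_safer(3) runs it three times. Picking an unusual counter name, as the later patches do with zzi, avoids the clash, provided the variable is actually declared in every function that uses the macro.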
Comment 90 Jérôme Glisse 2008-09-25 02:13:55 UTC
Created attachment 18026 [details]
Keep CPU busy for sometimes after ring commit

My bad once again, sorry, I am always in a hurry.
Comment 91 François Valenduc 2008-09-25 05:12:57 UTC
This time, it gives the following error:

  CC [M]  drivers/gpu/drm/radeon/radeon_state.o
drivers/gpu/drm/radeon/radeon_state.c: In function ‘radeon_cp_clear’:
drivers/gpu/drm/radeon/radeon_state.c:2131: error: ‘zzi’ undeclared (first use in this function)
drivers/gpu/drm/radeon/radeon_state.c:2131: error: (Each undeclared identifier is reported only once
drivers/gpu/drm/radeon/radeon_state.c:2131: error: for each function it appears in.)
drivers/gpu/drm/radeon/radeon_state.c: In function ‘radeon_cp_flip’:
drivers/gpu/drm/radeon/radeon_state.c:2179: error: ‘zzi’ undeclared (first use in this function)
drivers/gpu/drm/radeon/radeon_state.c: In function ‘radeon_cp_swap’:
drivers/gpu/drm/radeon/radeon_state.c:2199: error: ‘zzi’ undeclared (first use in this function)
drivers/gpu/drm/radeon/radeon_state.c: In function ‘radeon_cp_vertex’:
drivers/gpu/drm/radeon/radeon_state.c:2275: error: ‘zzi’ undeclared (first use in this function)
drivers/gpu/drm/radeon/radeon_state.c: In function ‘radeon_cp_indices’:
drivers/gpu/drm/radeon/radeon_state.c:2363: error: ‘zzi’ undeclared (first use in this function)
drivers/gpu/drm/radeon/radeon_state.c: In function ‘radeon_cp_stipple’:
drivers/gpu/drm/radeon/radeon_state.c:2409: error: ‘zzi’ undeclared (first use in this function)
drivers/gpu/drm/radeon/radeon_state.c: In function ‘radeon_cp_vertex2’:
drivers/gpu/drm/radeon/radeon_state.c:2566: error: ‘zzi’ undeclared (first use in this function)
drivers/gpu/drm/radeon/radeon_state.c: In function ‘radeon_cp_cmdbuf’:
drivers/gpu/drm/radeon/radeon_state.c:2967: error: ‘zzi’ undeclared (first use in this function)
distcc[9973] ERROR: compile drivers/gpu/drm/radeon/radeon_state.c on localhost failed
make[4]: *** [drivers/gpu/drm/radeon/radeon_state.o] Error 1
make[4]: *** Waiting for unfinished jobs....
make[3]: *** [drivers/gpu/drm/radeon] Error 2
make[2]: *** [drivers/gpu/drm] Error 2
make[1]: *** [drivers/gpu] Error 2
make: *** [drivers] Error 2
make: *** Waiting for unfinished jobs....
zsh: exit 2     make "CC=distcc i686-pc-linux-gnu-gcc" -j3
Comment 92 Jérôme Glisse 2008-09-25 06:29:12 UTC
Created attachment 18031 [details]
Keep CPU busy for sometimes after ring commit

I am very sorry. I don't have the infrastructure to build a kernel right now, so
I am writing these patches blind. Hopefully this one will compile.
Comment 93 François Valenduc 2008-09-25 07:29:34 UTC
Unfortunately, this produces yet another error:

  CC [M]  drivers/gpu/drm/radeon/radeon_state.o
drivers/gpu/drm/radeon/radeon_state.c: In function ‘radeon_cp_clear’:
drivers/gpu/drm/radeon/radeon_state.c:2116: warning: unused variable ‘ring’
drivers/gpu/drm/radeon/radeon_state.c:2116: warning: unused variable ‘mask’
drivers/gpu/drm/radeon/radeon_state.c:2116: warning: unused variable ‘_nr’
drivers/gpu/drm/radeon/radeon_state.c:2116: warning: unused variable ‘write’
drivers/gpu/drm/radeon/radeon_state.c: In function ‘radeon_cp_flip’:
drivers/gpu/drm/radeon/radeon_state.c:2169: warning: unused variable ‘ring’
drivers/gpu/drm/radeon/radeon_state.c:2169: warning: unused variable ‘mask’
drivers/gpu/drm/radeon/radeon_state.c:2169: warning: unused variable ‘_nr’
drivers/gpu/drm/radeon/radeon_state.c:2169: warning: unused variable ‘write’
drivers/gpu/drm/radeon/radeon_state.c: In function ‘radeon_cp_swap’:
drivers/gpu/drm/radeon/radeon_state.c:2189: warning: unused variable ‘ring’
drivers/gpu/drm/radeon/radeon_state.c:2189: warning: unused variable ‘mask’
drivers/gpu/drm/radeon/radeon_state.c:2189: warning: unused variable ‘_nr’
drivers/gpu/drm/radeon/radeon_state.c:2189: warning: unused variable ‘write’
drivers/gpu/drm/radeon/radeon_state.c: In function ‘radeon_cp_vertex’:
drivers/gpu/drm/radeon/radeon_state.c:2214: warning: unused variable ‘ring’
drivers/gpu/drm/radeon/radeon_state.c:2214: warning: unused variable ‘mask’
drivers/gpu/drm/radeon/radeon_state.c:2214: warning: unused variable ‘_nr’
drivers/gpu/drm/radeon/radeon_state.c:2214: warning: unused variable ‘write’
drivers/gpu/drm/radeon/radeon_state.c: In function ‘radeon_cp_indices’:
drivers/gpu/drm/radeon/radeon_state.c:2292: warning: unused variable ‘ring’
drivers/gpu/drm/radeon/radeon_state.c:2292: warning: unused variable ‘mask’
drivers/gpu/drm/radeon/radeon_state.c:2292: warning: unused variable ‘_nr’
drivers/gpu/drm/radeon/radeon_state.c:2292: warning: unused variable ‘write’
drivers/gpu/drm/radeon/radeon_state.c: In function ‘radeon_cp_stipple’:
drivers/gpu/drm/radeon/radeon_state.c:2404: error: conflicting types for ‘mask’
drivers/gpu/drm/radeon/radeon_state.c:2403: error: previous declaration of ‘mask’ was here
drivers/gpu/drm/radeon/radeon_state.c:2413: warning: passing argument 2 of ‘radeon_cp_dispatch_stipple’ makes pointer from integer without a cast
drivers/gpu/drm/radeon/radeon_state.c:2404: warning: unused variable ‘ring’
drivers/gpu/drm/radeon/radeon_state.c:2404: warning: unused variable ‘_nr’
drivers/gpu/drm/radeon/radeon_state.c:2404: warning: unused variable ‘write’
drivers/gpu/drm/radeon/radeon_state.c: In function ‘radeon_cp_vertex2’:
drivers/gpu/drm/radeon/radeon_state.c:2493: warning: unused variable ‘ring’
drivers/gpu/drm/radeon/radeon_state.c:2493: warning: unused variable ‘mask’
drivers/gpu/drm/radeon/radeon_state.c:2493: warning: unused variable ‘_nr’
drivers/gpu/drm/radeon/radeon_state.c:2493: warning: unused variable ‘write’
drivers/gpu/drm/radeon/radeon_state.c: In function ‘radeon_cp_cmdbuf’:
drivers/gpu/drm/radeon/radeon_state.c:2830: warning: unused variable ‘ring’
drivers/gpu/drm/radeon/radeon_state.c:2830: warning: unused variable ‘mask’
drivers/gpu/drm/radeon/radeon_state.c:2830: warning: unused variable ‘_nr’
drivers/gpu/drm/radeon/radeon_state.c:2830: warning: unused variable ‘write’
distcc[9601] ERROR: compile drivers/gpu/drm/radeon/radeon_state.c on localhost failed
make[4]: *** [drivers/gpu/drm/radeon/radeon_state.o] Error 1
make[4]: *** Waiting for unfinished jobs....
make[3]: *** [drivers/gpu/drm/radeon] Error 2
make[2]: *** [drivers/gpu/drm] Error 2
make[1]: *** [drivers/gpu] Error 2
make[1]: *** Waiting for unfinished jobs....
make: *** [drivers] Error 2
zsh: exit 2     make "CC=distcc i686-pc-linux-gnu-gcc" -j3
Comment 94 Jérôme Glisse 2008-09-25 08:08:29 UTC
Created attachment 18035 [details]
Keep CPU busy for sometimes after ring commit

Is this one better ?
Comment 95 François Valenduc 2008-09-25 09:45:12 UTC
Your patch generates a reject in r300_cmdbuf.c. If I understood correctly, I simply needed to move the declaration of zzi (int zzi;). After I moved that line, no compilation errors occurred. Unfortunately, the problem still occurs.

There are plenty of lines like this in dmesg when X hangs:

[drm:radeon_cp_idle] 
[drm:radeon_do_cp_idle] 
[drm:radeon_do_wait_for_fifo] wait for fifo failed status : 0x80036100 0x00000000
[drm:drm_ioctl] ret = fffffff0
[drm:drm_ioctl] pid=7276, cmd=0x6444, nr=0x44, dev 0xe200, auth=1
Comment 96 François Valenduc 2008-09-25 09:47:02 UTC
Created attachment 18040 [details]
Keep CPU busy for sometimes after ring commit

Here is the patch I applied.
Comment 97 Jérôme Glisse 2008-09-25 14:09:20 UTC
Does it help ?
Comment 98 Jérôme Glisse 2008-09-25 14:38:32 UTC
Okay let's try a different approach here. I need you to git clone this:

git clone git://people.freedesktop.org/~glisse/radeondump
cd radeondump
cmake .
make

Then, once the lockup happens, log in through ssh as root and run
./radeondump -d lockup
Reboot a few times (3 to 5 dumps should do it) and take a dump
each time it locks up. You should end up with several lockup-*-
files. (Note that you don't need any of the previous patches.)

Then install fglrx and run
./radeondump -d fglrx
Do the same thing a few times, and do stuff between dumps (launch an application,
browse the web). 3 to 5 dumps should do it. Make a tar of all
these dumps and attach it to this bug.

Basically, radeondump dumps several radeon config registers.
Dumping these registers both with fglrx and in the lockup case will help
to find out which part of the configuration we get wrong, if any.
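
The comparison itself amounts to diffing register values between the two sets of dumps. Below is a minimal sketch in C of that idea. The radeondump output format is not shown in this report, so the reg_sample layout and the diff_dumps helper are assumptions made for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

/* One register reading; this offset/value layout is an assumption. */
struct reg_sample {
    uint32_t offset;
    uint32_t value;
};

/*
 * Compare two dumps that list the same registers in the same order.
 * Stores up to out_cap differing offsets in out and returns the total
 * number of registers whose values differ.
 */
size_t diff_dumps(const struct reg_sample *a, const struct reg_sample *b,
                  size_t n, uint32_t *out, size_t out_cap)
{
    size_t found = 0;
    size_t k;

    for (k = 0; k < n; k++) {
        if (a[k].offset != b[k].offset)
            continue; /* layouts disagree here; skip rather than guess */
        if (a[k].value != b[k].value) {
            if (found < out_cap)
                out[found] = a[k].offset;
            found++;
        }
    }
    return found;
}
```

Registers that differ between every fglrx dump and every lockup dump, but stay the same within each group, would be the first candidates to look at.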


Also, does adding the option:
 Option "BusType" "PCI"

to the device section for your card in xorg.conf help?
Comment 99 Jérôme Glisse 2008-09-25 14:39:08 UTC
I forgot you must run radeondump as root
Comment 100 François Valenduc 2008-09-26 11:21:19 UTC
As you suggested, I have added the Option "BusType" "PCI" line to xorg.conf. I also removed the line setting the BusID, which was the following:
BusID       "PCI:1:0:0"

Now, it works perfectly well with CPU idle enabled and the workaround patch limiting C-State to C2 reverted !
So, this bug was in fact a problem of configuration.
Comment 101 Jérôme Glisse 2008-09-27 03:48:55 UTC
This is still a bug; setting the bus type to PCI just hides it.
Does it also work without this option?

If it still fails, please do the series of dumps.
Comment 102 François Valenduc 2008-09-27 05:27:33 UTC
Created attachment 18064 [details]
output of radeondump

So I am reopening the bug. Adding the "bustype" option is one way to avoid it. Another way is to set the max C-state to C2.
I have added the output of 3 dumps.
Comment 103 Jérôme Glisse 2008-09-28 05:04:52 UTC
What would also be really useful is 3-4 dumps with fglrx (sadly it's
still the easiest way to find out how to set up some of the regs).
Comment 104 François Valenduc 2008-09-28 05:20:48 UTC
I can't compile fglrx with the current version of the 2.6.27-rcX kernel. As I have said in comment #54, the problem occurs as well. Should I try a dump with fglrx and kernel 2.6.26 ?
Comment 105 Jérôme Glisse 2008-09-28 05:34:07 UTC
Created attachment 18093 [details]
Set wptr delay (necessary on some AGP chipset) and disable host path pretech to avoid  some hang conditions

If you can give this patch a try.
Comment 106 Jérôme Glisse 2008-09-28 05:35:40 UTC
A dump with fglrx on an older kernel would also be useful, even with a stock distribution
kernel. I am just really interested to know how fglrx sets up your card, so
I can compare it with what we do and try to spot a difference which might
help to fix your situation.
Comment 107 François Valenduc 2008-09-28 05:59:03 UTC
I cannot compile fglrx with kernel 2.6.26 either. I tried to find some patches but none of them worked. Should I go back to 2.6.25? Will the output still be relevant? As I have previously said, I am not sure fglrx will work without problems.
I have also tried your patch (set wptr delay...) and it doesn't solve the problem.
Why is it not enough to use the BusType option?
I am getting tired of investigating this problem.
Comment 108 François Valenduc 2008-09-28 06:59:30 UTC
I have finally managed to compile fglrx with kernel 2.6.26. It also required enabling unused symbols (because it needs init_mm). Unfortunately, it also fails when CPU IDLE is enabled, so I don't think we will get valuable info from it.
It fails even harder, since it's not possible to initiate an SSH connection when X hangs.
Comment 109 Jérôme Glisse 2008-09-28 07:05:50 UTC
Option bustype pci downgrades your AGP to PCI, so you will experience a severe
performance loss. I know debugging is painful and you have been very helpful.
The thing is, we have very few good testers like you with problematic hardware, while we
have lots of users with problematic hardware. So by helping to track this down you
are helping more people than just yourself, and making a very valuable contribution towards
fixing other people's problems :).

I will dig further into some AGP docs to see what might be useful to test and
come back to you with another simple patch.
Comment 110 Jérôme Glisse 2008-09-28 07:06:48 UTC
Also, could you attach the Xorg.0.log file from a lockup case?
Comment 111 François Valenduc 2008-09-28 07:13:45 UTC
So I have removed the BusType option and I also removed the line:
BusID       "PCI:1:0:0"

After several minutes, the GDM login window is not yet blocked. Maybe the problematic line was the one indicating the BusID. I don't remember where it came from. Maybe it's really a configuration problem. CPU Idle is enabled, the workaround patch to set the max C-State is not applied, and it has not yet failed.
Comment 112 Jérôme Glisse 2008-09-28 07:32:39 UTC
Created attachment 18096 [details]
Tweak AGP (cripple & others workaround features)

Strange. According to your lspci, removing the line:
BusID       "PCI:1:0:0"

shouldn't change anything. If you could attach your xorg log, maybe there is
useful information in it. I am still attaching a patch which tweaks some
AGP features.
Comment 113 François Valenduc 2008-09-28 09:32:44 UTC
Created attachment 18097 [details]
Xorg.log obtained with your last patch Tweak AGP. ..

I don't understand anything now. I have retried the 2.6.27-rc7 kernel without any special patch and without setting the bus type to PCI, and the problem occurs again. I don't understand why it worked only once.

I have attached the xorg.log file you asked for. Your last patch doesn't help and makes things worse. With this one, the screen remains black when X starts and I never see the GDM login window. I can't even reboot the PC cleanly. After some time, I have to use SysRq keys to force a reboot.
Comment 114 Jérôme Glisse 2008-09-30 11:43:05 UTC
Created attachment 18120 [details]
Another AGP tweak (cripple & others workaround features)

Okay, at least now we know that this is AGP related. I am attaching another attempt at
tweaking some of the AGP configuration.

Also, could you git pull the changes to radeondump and provide a new dump?
I want to look at the values of some more config registers.
Comment 115 François Valenduc 2008-10-01 09:42:14 UTC
Unfortunately, your last patch doesn't compile and gives the following error:

drivers/gpu/drm/radeon/radeon_cp.c:1773:1: error: unterminated argument list invoking macro "RADEON_WRITE"
drivers/gpu/drm/radeon/radeon_cp.c: In function 'radeon_cp_init_ring_buffer':
drivers/gpu/drm/radeon/radeon_cp.c:588: error: 'RADEON_WRITE' undeclared (first use in this function)
drivers/gpu/drm/radeon/radeon_cp.c:588: error: (Each undeclared identifier is reported only once
drivers/gpu/drm/radeon/radeon_cp.c:588: error: for each function it appears in.)
drivers/gpu/drm/radeon/radeon_cp.c:588: error: expected ';' at end of input
drivers/gpu/drm/radeon/radeon_cp.c:588: error: expected declaration or statement at end of input
drivers/gpu/drm/radeon/radeon_cp.c:553: warning: unused variable 'tmp'
drivers/gpu/drm/radeon/radeon_cp.c:552: warning: unused variable 'cur_read_ptr'
distcc[14464] ERROR: compile drivers/gpu/drm/radeon/radeon_cp.c on localhost failed
make[4]: *** [drivers/gpu/drm/radeon/radeon_cp.o] Error 1
make[3]: *** [drivers/gpu/drm/radeon] Error 2
make[2]: *** [drivers/gpu/drm] Error 2
make[1]: *** [drivers/gpu] Error 2
Comment 116 Jérôme Glisse 2008-10-02 02:10:51 UTC
Created attachment 18137 [details]
Tweak AGP (cripple & others workaround features)

Again, I am very sorry. This one should compile (a closing parenthesis was
missing). Could you also update radeondump and provide a new dump?
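
For reference, the errors in comment 115 are what gcc reports when a function-like macro call is missing its closing parenthesis: the preprocessor keeps collecting arguments until the end of the file, the macro is never expanded, and its name is then flagged as undeclared. Below is a minimal sketch of that failure mode, not the actual patch; `FAKE_WRITE`, `fake_regs` and `write_example` are hypothetical stand-ins, and the real `RADEON_WRITE` writes to hardware registers rather than to an array.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the driver's register file and write macro. */
static uint32_t fake_regs[16];
#define FAKE_WRITE(reg, val) (fake_regs[(reg)] = (val))

/*
 * Broken form (does not compile): the final ')' is missing, so gcc
 * reports "unterminated argument list invoking macro":
 *
 *     FAKE_WRITE(3, (1u << 4) | (1u << 0);
 */

static uint32_t write_example(void)
{
    /* Fixed form: every '(' has a matching ')'. */
    FAKE_WRITE(3, (1u << 4) | (1u << 0));
    return fake_regs[3];
}
```

Counting parentheses on the line that gcc names (588 in the log from comment 115) is usually enough to find this kind of mistake.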
Comment 117 François Valenduc 2008-10-04 03:07:52 UTC
So I tried your last patch. It doesn't solve the problem. With this one, the screen remains black when X starts. I never see the clock shown when GDM starts, and I never see the GDM window either.
I also tried radeondump but it freezes my computer when I use it. It also locks the computer when I use it with kernel 2.6.26.
Comment 118 François Valenduc 2008-10-04 03:22:37 UTC
Created attachment 18152 [details]
Xorg.log obtained with C-State limited to C2

I have found something strange in the Xorg log file. When I don't use your patch and instead apply the workaround patch limiting the C-state, I see more lines after "(II) RADEON(0): no multimedia table present, disabling Rage Theatre.".

The following lines appear, which is not the case when I apply your patch (Tweak AGP features...):

(II) RADEON(0): RandR 1.2 enabled, ignore the following RandR disabled message.
(WW) RADEON(0): Option "AddARGBGLXVisuals" is not used
(--) RandR disabled
(II) Initializing built-in extension MIT-SHM
(II) Initializing built-in extension XInputExtension
(II) Initializing built-in extension XTEST
(II) Initializing built-in extension XKEYBOARD
(II) Initializing built-in extension XC-APPGROUP
(II) Initializing built-in extension XAccessControlExtension
(II) Initializing built-in extension SECURITY
(II) Initializing built-in extension XINERAMA
(II) Initializing built-in extension XFIXES
(II) Initializing built-in extension XFree86-Bigfont
(II) Initializing built-in extension RENDER
(II) Initializing built-in extension RANDR
(II) Initializing built-in extension COMPOSITE
(II) Initializing built-in extension DAMAGE
(II) Initializing built-in extension XEVIE
drmOpenDevice: node name is /dev/dri/card0
drmOpenDevice: open result is 8, (OK)
drmOpenByBusid: Searching for BusID pci:0000:01:00.0
drmOpenDevice: node name is /dev/dri/card0
drmOpenDevice: open result is 8, (OK)
drmOpenByBusid: drmOpenMinor returns 8
drmOpenByBusid: drmGetBusid reports pci:0000:01:00.0
(WW) AIGLX: 3D driver claims to not support visual 0x23
(WW) AIGLX: 3D driver claims to not support visual 0x24
(WW) AIGLX: 3D driver claims to not support visual 0x25
(WW) AIGLX: 3D driver claims to not support visual 0x26
(WW) AIGLX: 3D driver claims to not support visual 0x27
(WW) AIGLX: 3D driver claims to not support visual 0x28
(WW) AIGLX: 3D driver claims to not support visual 0x29
(WW) AIGLX: 3D driver claims to not support visual 0x2a
(WW) AIGLX: 3D driver claims to not support visual 0x2b
(WW) AIGLX: 3D driver claims to not support visual 0x2c
(WW) AIGLX: 3D driver claims to not support visual 0x2d
(WW) AIGLX: 3D driver claims to not support visual 0x2e
(WW) AIGLX: 3D driver claims to not support visual 0x2f
(WW) AIGLX: 3D driver claims to not support visual 0x30
(WW) AIGLX: 3D driver claims to not support visual 0x31
(WW) AIGLX: 3D driver claims to not support visual 0x32
(II) AIGLX: Loaded and initialized /usr/lib/dri/r300_dri.so
(II) GLX: Initialized DRI GL provider for screen 0
(II) RADEON(0): Setting screen physical size to 270 x 203
Comment 119 François Valenduc 2008-11-16 08:30:38 UTC
It seems nobody is interested in this bug or nobody has an idea of how to solve it. I retried the last patch (Tweak AGP (cripple & others workaround features)) on kernel 2.6.27.6 and the same problem occurs: the screen remains black when X starts. Furthermore, X takes 99% of the CPU resources and thus the load of the system increases constantly (from 2.87, 0.93, 0.33 on login via SSH to 3.79, 1.42, 0.52 three minutes later).
Am I forced to use the workaround limiting the C-state to C2 forever?
Comment 120 François Valenduc 2008-12-13 12:38:20 UTC
Is this problem ever going to be solved? It still occurs with the current 2.6.28-rc8. X still hangs at startup, using 99.8% of CPU resources.
Comment 121 Jérôme Glisse 2008-12-17 04:53:09 UTC
Sorry, I forgot about this one. Given that the bus PCI option fixed it, it might
just be one of those broken AGP hw. Unfortunately we don't have a reliable list
of hw bugs for either AGP chipsets or GPU chipsets. AGP is one of the worst
things ever invented in computing, with too many hw bugs in it. Debugging this
mostly needs someone working full time on the hw to track down what the problem
is. So I would be curious to know whether fglrx enables AGP on your card (given
that I assume fglrx has a more reliable list of broken hw chipsets or GPUs). In
the meantime I will fix my radeondump stuff so you can provide me with some
useful dumps.
Comment 122 Fabio Rossi 2009-01-31 04:07:47 UTC
I have the same problem on my (very) old notebook, which has a Mobility M6 graphics controller. In this case too, the problem is the DRI option used with the Xorg radeon driver.

The problem doesn't exist on another PC, a desktop with an ATI Technologies Inc Radeon R100 QD [Radeon 7200].

I'm using the latest kernel, v2.6.29-rc3-12726-gf917b45 (wireless-testing)
Comment 123 François Valenduc 2009-11-13 18:30:11 UTC
Today I tried to re-enable the C3 state on my computer and was happily surprised to see that the problem doesn't occur anymore. I now use kernel 2.6.31.6 and KDE 4.3. In the meantime, there were also upgrades to Xorg. So I don't know what solved the problem, but it seems to be gone.