Most recent kernel where this bug did not occur:
Distribution: Kanotix
Hardware Environment: IBM Thinkpad X31, ATI Mobility M6
Software Environment: Debian unstable
Problem Description: First of all: I
Please attach the output of dmesg -s 1000000
Created attachment 8227 [details] kern.log from the time of the last crash
Created attachment 8228 [details] syslog from the time of the last crash
Created attachment 8229 [details] /var/log/messages from the time of the last crash
Created attachment 8230 [details] dmesg -s 1000000 (however, only starting after the last crash)
Dmesg is a bit late for the last freeze as the ring buffer has run out already. I still attach the output, but the first message is already after the restart of the system. I also attach /var/log/syslog and /var/log/messages and kern.log from the time of the crash. However, as I have written above, there is (at least to my eyes) no relevant message in it. I always check syslog, messages and kern.log directly after such a crash and it always seems that the crash is instantaneous. No time for the kernel to write any relevant message. That
This is most likely an issue with the OpenGL driver, or possibly the X driver. Please check the Mesa and xorg products at http://bugs.freedesktop.org and file a new bug there in the unlikely event that there is none about this yet. Attach full X config and log files there, and try running in depth 16 and not enabling Option "DynamicClocks" if you haven't yet.
Michel: Thanks for your help. :) I started with the bugtracker mentioned (strangely, searching for "bugtracker xorg" with google did not give any links to it). There are quite a lot of bugs in there that report freezes or lockups on Radeon HW. However, I am still not sure if one of them resembles my case or not. Before filing a new bug, I thought it best to try a newer version of the DRI driver. I was still using the one from debian unstable (050528 !), but didn't know how to get a newer one. I now found the download address in one of the bug reports. I tried to install the binary builds from 060403. However, the install script does not find the correct directories on a debian system. Fortunately, a newer version of libgl1-mesa-dri featuring the 060327 driver appeared just yesterday in debian alioth (experimental). So now I have installed this DRI driver. Nearly everything works so far (only the game chromium refuses to run). I will now see whether the freeze still occurs (may take some time...). BTW, I am always running at 16 bit. But I do use DynamicClocks in order to save energy on my laptop. If the newer driver does not prevent another freeze, I will disable DynamicClocks next. If this doesn't help either, I will file a new bug report. I have to admit I was quite disappointed to see that nowhere is a method described to get more debug info in order to find the culprit of such freezes. It's awful to see how many people have some kind of freeze and no way to find out what
Well, o.k. The new driver does not help. Just tried Enemy Territory and after 5 minutes the system froze again. Now I will try disabling dynamicclocks (after already having disabled FastWrites and going back to AGP Mode 1x). Michael Btw. there is no message related to the crash in either dmesg (only starts with the reboot), syslog, messages, kernlog or xorg.0.log
o.k. next freeze (again under enemy territory), this time without DynamicClocks. I will file a bug at freedesktop.org, but I
Is this still happening in current kernels? Some people have found that X is stable if you do NOT have the radeon fbdev driver loaded. Did you have it loaded? If so, can you try that? Also, increasing the AGP window size in BIOS might help, if that's possible.
Sorry, I needed some time to check it. In the last months I tried not to use any OpenGL app, so I changed the screensaver to something simple and so on. Now, I first upgraded to kernel 2.6.19. The X system is version 7.1.1. The ati driver is module version 6.6.3. The radeon driver is submodule version 4.2.0. The OpenGL driver is Mesa DRI Radeon 20060327 AGP 1x NO-TCL. Testing is never that easy, as these freezes are really, really infrequent. So I tried a few games and used google earth. All worked well. I then started beryl and wow, it seems some errors have been remedied in the driver. Earlier, I only saw a quarter of the screen and everything was very slow. Now everything works fine and fast!!! What nice eye-candy!!!! After 4 hours, unfortunately, the system froze again completely :-( :-( :-( So, it's a pity, but the problem does still exist. Concerning the frame buffer: I think the kanotix kernel does NOT use radeonfb but the vesa frame buffer (at least the kernel config says that vesafb is "y" and radeonfb is "m", and the latter is not loaded). I haven't found anything in the BIOS where I could change the AGP window size. What number should I look for and what should it be? Thanks and kind regards Michael
I've lately been having exactly the same problem, and I'm fairly sure I can pinpoint it to the in-kernel DRM. I'm an Intel (i810 driver) user, so this rules out the ATI drivers causing it. Also, I've been using Beryl for a while (via AIGLX) and I have never had it lock up on me during regular usage. Lockups have occurred for me while playing Quake 3, Project 64 (via Wine, with the glN64 graphics plugin which uses openGL), and Jedi Academy (via Wine, which I believe uses DirectX). Here's a bug I filed on freedesktop.org about it, in which I explain in detail what happens to me: https://bugs.freedesktop.org/show_bug.cgi?id=10330 The problem started around the time I upgraded from the 2.6.18 kernel to 2.6.19. I downgraded to .18 and the problem persisted. However, I recently realized that when I switched to 2.6.19, x11-drm (a non-kernel DRM module) broke, and I switched then to the in-kernel DRM. Therefore I'm fairly sure the problem is caused by the in-kernel DRM driver. Regards, Ben Blum
Whoops, scratch that. After a bit of testing, it appears I still get this when using x11-drm and 2.6.18 kernel. I have no idea what's causing this, but I'd love to see it solved shortly.
I'd had my motherboard overclocked to 108%, and moving it down to 100% fixed the problem. To whomever posted this: Is your motherboard/graphics card overclocked? If so, try it at regular speed.
Please test with recent kernel and confirm if the problem is still there. Thanks.
Hi Peter: No, this is not an overclocked system. This is an IBM laptop. Natalie: What do you consider a recent kernel? A released one (2.6.21), a release candidate or a bleeding edge one (e.g. GIT)? At the moment I am on 2.6.21. But I have turned off any OpenGL app as I use this system as a production one and don't like to have severe crashes now and then. A test is not so simple as these crashes really are infrequent. When I posted the bug, it took anything between half an hour and 2 weeks between the crashes. The only similarity was that an OpenGL app was always running (e.g. OpenGL screensaver). Before I started using OpenGL (and after I disabled OpenGL apps) the system never crashed. Secondly, such a crash often led to data loss on my system. It seemed to help to switch on the sync option on the disk driver, but that slowed down the system noticeably. So, do you have any specific reason to believe that the problem could be cured in a recent kernel? Thanks and best regards Michael
This is most likely the video driver, not the kernel, so it should probably be in freedesktop.org's bug tracker. I can't think of anything else to help fix it. Running in PCI mode might be the only thing, but there were certain bugs on those chips we haven't tracked down due to the 3D driver.
Michael, I was thinking about 2.6.22-rc5, because it has a huge update for usb and video etc., and refreshing this issue against this release in particular would be good. However, this problem is striking to me, because I ran into a similar one myself. I've looked on the web and found that Xorg plus fairly recent kernels produce hangs just like yours, and they range from the keyboard and/or mouse up to hanging the whole system. I am working on building the latest kernel and instrumenting it. What makes this harder is, just like you said, that hangs are very random and rare, and there is no way to reproduce them at will. Look into bug 6645 - a similar problem; check if it is the same as yours and whether the workaround applies to your case.
Any new updates on this problem? Michael, have you tried newer kernel/X? If so the problem should probably be reported on xorg or dri lists: xorg@lists.freedesktop.org, dri-devel@lists.sourceforge.net
Just to let you know: I am now on 2.6.27-rc6, xserver-core 1.4.2, xorg 7.3, and the problem is still there. I needed to install some OpenGL apps (I did not use any before because of the crashes), and it took three days until I got my freeze :-( Again, nothing in the syslog or anywhere else. With the bug being more than two years old now, I guess its cause will never be found. The only solution seems to be to upgrade to other hardware or just not to use any OpenGL apps. Unfortunately, I am otherwise still very happy with my Thinkpad X31. So I guess I will go without OpenGL for another two to three years, until there is a stronger reason to upgrade the hardware. Pity, I like compiz... Thanks anyway and kind regards Michael
Michael, did you ever try turning off AGP? Wrong AGPMode setting, or AGP at all, is known to cause system freezes. Add this to your Device section: Option "BusType" "PCI" If that works out well, you can remove that line and instead try Option "AGPMode" "2" or try 4 instead of 2.
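For reference, a minimal sketch of where the options suggested above would sit in xorg.conf. The Identifier value and the Driver line are assumptions for illustration only; keep whatever your existing Device section already has and add just the Option line.

```
Section "Device"
    # Hypothetical identifier; keep the one already in your config.
    Identifier "Radeon"
    # Assumed driver name; leave your existing Driver line unchanged.
    Driver     "radeon"
    # First test: bypass AGP transfers entirely.
    Option     "BusType" "PCI"
    # Second test, used instead of the BusType line: try 2, then 4.
    # Option   "AGPMode" "2"
EndSection
```

Only one of the two options should be active at a time, so that a freeze (or its absence) can be attributed to a single setting.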
Tormod, well, I am pretty sure that I have tried many xorg options when I first reported the bug (two years ago) including different AGPmodes, but I will try BusType PCI again. Due to the infrequent nature of these freezes it may, however, take days or weeks until I can give feedback. Thanks for your help, Michael
Hmm, one question: Is AGP only used in opengl apps? I guess not. It rather seems to be the basic hardware communication protocol, isn't it? Because, without opengl I don't have any freezes of the machine, whatsoever. Only with opengl apps (e.g. compiz, googleearth) I occasionally experience these crashes. Wouldn't this rule out an effect of the option "BusType" "PCI"?
AFAIK, the DRI is the only component that uses AGP transfers with the ATI cards. If you don't enable DRI, AGP is not used, only basic PCI over the AGP slot.
Using the PCI option often helps in lockup cases; AGP is wacky. And having freezes only with GL apps doesn't rule out this option.
PCI vs. AGP only affects the GART setup for GPU access to buffers in system memory (command buffers, vertex buffers, etc.). AGP tends to be problematic. The radeon PCI GART interface is usually pretty stable.