Most recent kernel where this bug did not occur: 2.6.13
Distribution: Fedora 4
Hardware Environment: IBM T42p
Software Environment: Vanilla kernel compiled with gcc (GCC) 4.1.0 20051222 (Red Hat 4.1.0-0.12). Nothing running, idle system at the login prompt.
Problem Description: A few minutes after boot, the kernel freezes. No OOPS message on the console. The system does not react to SysRq. Reproducible 100%. Vanilla 2.6.13 (Linux version 2.6.13 (root@vkondra-mobl) (gcc version 4.0.1 20050822 (Red Hat 4.0.1-10))) runs without problems on the same system.
Steps to reproduce: Boot and wait for several minutes.
I am ready to provide any additional information and do any experiments to investigate this problem.
Created attachment 6932: my config for 2.6.15
I built gcc 4.0.2 (official stable release) from sources and recompiled the 2.6.15 kernel with it. I also removed the ieee80211* and ipw2200 modules, since they were not present in my 2.6.13. The result is the same: complete freeze. Any ideas what to do next?
Can you enable the NMI watchdog, see if that catches anything?
With NMI, no luck. On boot, the kernel says: Local APIC disabled by BIOS -- you can enable it with "lapic" mapped APIC to ffffd000 (01806000); but if I append "lapic", the laptop won't boot. I did try nmi_watchdog=[12]; the NMI counter does not run, and in dmesg I see the message: Testing NMI watchdog ... CPU#0: NMI appears to be stuck (0->0)! I continued to play with modules, and I found the culprit: if I remove the drm modules (drm.ko and radeon.ko), it works! I am writing this on 2.6.15, which has survived 6 hours. I hope it will work much longer. Thus, either "radeon.ko" or "drm.ko" requires attention.
cc'ing David...
can you give me an lspci? my guess is this is caused by a newer kernel enabling some feature in X and X crashing out ... you are running X? can you attach an Xorg.0.log... thanks.
Created attachment 6937: My lspci
Yes, you are using an M10. Support for r300 chips was recently added to the kernel, and X.org may not be the most stable on these, even for 2D, with a DRM loaded. Try removing the Load "dri" option from your xorg.conf and see if that helps.
Dave, I doubt it is X crashing. With 2.6.13, I ran X for a month with DRM enabled without any problems. What is indeed interesting is that drm with 2.6.15 reports 2 devices:
Jan 4 17:53:47 vkondra-mobl kernel: [drm] Initialized drm 1.0.0 20040925
Jan 4 17:53:47 vkondra-mobl kernel: ACPI: PCI Interrupt 0000:01:00.0[A] -> Link [LNKA] -> GSI 11 (level, low) -> IRQ 11
Jan 4 17:53:47 vkondra-mobl kernel: [drm] Initialized radeon 1.19.0 20050911 on minor 0:
Jan 4 17:53:47 vkondra-mobl kernel: agpgart: Found an AGP 2.0 compliant device at 0000:00:00.0.
Jan 4 17:53:47 vkondra-mobl kernel: agpgart: Putting AGP V2 device at 0000:00:00.0 into 1x mode
Jan 4 17:53:47 vkondra-mobl kernel: agpgart: Putting AGP V2 device at 0000:01:00.0 into 1x mode
Jan 4 17:53:47 vkondra-mobl kernel: [drm] Loading R300 Microcode
while on 2.6.13 it prints just
Jan 4 18:05:07 vkondra-mobl kernel: [drm] Initialized drm 1.0.0 20040925
Devices in question are:
00:00.0 Host bridge: Intel Corporation 82855PM Processor to I/O Controller (rev 03)
    Subsystem: IBM Unknown device 0529
    Flags: bus master, fast devsel, latency 0
    Memory at d0000000 (32-bit, prefetchable) [size=256M]
    Capabilities: [e4] Vendor Specific Information
    Capabilities: [a0] AGP version 2.0
01:00.0 VGA compatible controller: ATI Technologies Inc M10 NT [FireGL Mobility T2] (rev 80) (prog-if 00 [VGA])
    Subsystem: IBM Unknown device 054f
    Flags: bus master, fast Back2Back, 66MHz, medium devsel, latency 66, IRQ 11
    Memory at e0000000 (32-bit, prefetchable) [size=128M]
    I/O ports at 3000 [size=256]
    Memory at c0100000 (32-bit, non-prefetchable) [size=64K]
    [virtual] Expansion ROM at c0120000 [disabled] [size=128K]
    Capabilities: [58] AGP version 2.0
    Capabilities: [50] Power Management version 2
It may be that drm is mixing it up with the first one.
You are correct, this is an M10. I'll try without "dri" in xorg.conf, but it will be somewhat later. I have to take a break to do my main job.
The 2.6.13 kernel didn't support the Radeon M10 chip, so there the drm module loads and does nothing, and the radeon module doesn't load. You'll notice that in the 2.6.15 case the radeon module actually reports some info. So it is X that is crashing, and it is because X is now using the DRM.
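For anyone comparing the two kernels, a minimal sketch of how to tell from a saved kernel log whether the radeon DRM driver actually initialized, as opposed to only the drm core. It assumes the "[drm] Initialized <name> <version> <date>" line format seen in the log lines quoted earlier in this thread; the function name drm_initialized is hypothetical.

```python
import re

# Assumed line format, taken from the log excerpts quoted in this thread:
#   [drm] Initialized radeon 1.19.0 20050911 on minor 0:
_DRM_INIT = re.compile(r"\[drm\] Initialized (\S+) (\S+) (\d+)")


def drm_initialized(log_text):
    """Return a list of (component, version) pairs that logged an init line."""
    return [(m.group(1), m.group(2)) for m in _DRM_INIT.finditer(log_text)]


def radeon_drm_active(log_text):
    """True if the radeon DRM driver (not just the drm core) initialized."""
    return any(name == "radeon" for name, _ in drm_initialized(log_text))
```

On the 2.6.13 log above this would report only the drm core, while on the 2.6.15 log it would also report radeon, which matches the explanation that X only started using the DRM on the newer kernel.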
Disabling DRI in xorg.conf works. I changed the title to reflect the root cause. Also, severity is no longer "blocker", since a simple workaround exists (disable DRI in xorg.conf).
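A minimal sketch of the workaround, for reference: it comments out the Load "dri" line in the text of an xorg.conf. The helper name disable_dri is hypothetical, and it assumes the option is written in the usual Load "dri" form on a line of its own; it works on the file contents as a string and does not touch the file itself.

```python
import re

# An active (uncommented) Load "dri" line, with optional indentation.
# Assumption: the option is written on a line of its own in this form.
_DRI_LINE = re.compile(
    r'^([ \t]*)(Load[ \t]+"dri"[ \t]*)$',
    re.IGNORECASE | re.MULTILINE,
)


def disable_dri(xorg_conf_text):
    """Return the config text with every active Load "dri" line commented out."""
    return _DRI_LINE.sub(r"\1#\2", xorg_conf_text)
```

Lines that are already commented out are left alone, so running it twice gives the same result as running it once.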
This problem is not resolved by the simple no-DRI option. I'm running an ATI mobile Radeon X600. This is a PCIe card, and I get a complete freeze with x86 kernel 2.6.15.4. This occurs both with the open source radeon driver in Xorg and with the proprietary ATI drivers. Strangely, this does not occur on Fedora 4 64-bit with their shipped kernel; of course, they have DRI enabled in the kernel, which you need to disable to get 3D acceleration. When I figure out how to recompile a kernel in Fedora I'll check this. BTW, I can compile a kernel in Slackware and on my own LFS system, go figure :(
The system I'm running on is fairly new as well. It's a Gateway MX7525; the specs are on their page. I also have the timer issue that is documented here, with time running at 2x, and I suspect this may be why I get a total freeze with X. Running with noapictimer as a kernel option disables hardware, notably the PCMCIA device.
NOTE: Compiling the kernel from www.kernel.org on Fedora 4 64-bit causes a freeze with the proprietary ATI drivers and with the open source radeon driver. This is a total crash with no logs generated in the /var/log/Xorg.log files, and the system is unresponsive to anything other than a reboot. AKA kernel panic or something else. Another strange thing is that the mouse still moves even though the system responds to no commands and cannot be logged into remotely either.
Is there something that needs to be disabled when going through menuconfig, or is this a real bug? I'm clueless, and after 2 weeks and 4 distros I'm no closer to a resolution, other than trying to fix the other issues in the shipped Fedora 4 64-bit kernel. Thanks.
UPDATE
------
To resolve this issue, do the following: go into menuconfig when building the kernel, disable the new Radeon framebuffer driver, and use the old VESA one. With this framebuffer driver on, using either radeon or fglrx in X will cause the system to halt when starting X. Simply recompiling with the ATI Radeon framebuffer driver off resolves the issue.
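A minimal sketch for checking an existing kernel .config before rebuilding, to see whether radeonfb is enabled. It relies on CONFIG_FB_RADEON, which is the symbol quoted later in this thread, and on the usual .config convention that a disabled option is recorded as "# CONFIG_FOO is not set". The function names are hypothetical.

```python
def parse_kconfig(text):
    """Parse kernel .config text into {symbol: value}; disabled symbols map to 'n'."""
    suffix = " is not set"
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(suffix):
            # "# CONFIG_FOO is not set" is how .config records a disabled option.
            options[line[2:-len(suffix)]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def radeonfb_enabled(text):
    """True if radeonfb is built in ('y') or built as a module ('m')."""
    return parse_kconfig(text).get("CONFIG_FB_RADEON", "n") in ("y", "m")
```

Typical use would be something like radeonfb_enabled(open(".config").read()) from the top of the kernel source tree; if it returns True, the workaround above has not been applied to that configuration.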
+benh Ben, do you still look after the radeon driver? It's being bad. Steven, is this bug still present in 2.6.20-rc7?
Wow, you know, I am not sure, as I have bought a new laptop that uses a non-ATI chip; I am tired of fighting with Mesa for control of 3D. I do still have the original lappy that had this problem, so I could load up a distro and try with a new kernel if you want to know. I submitted this almost a year ago.
I must admit to slacking a bit there... too much stuff to do. I do have some work-in-progress updates, bits from Solomon Peachy and some bits from myself, and I also need to merge in some updates from X (we found workarounds for various issues in X that I never had the time to move to radeonfb). I'm hoping to have some time in February or March to do some serious work on it, but I can't promise. I would happily hand over the maintainership if we could find somebody capable of taking it over, though.
Is OK - I'm not aware of many people hurting from this - one, or perhaps two (I'm waiting to hear back from #2). And no, I'm afraid fbdev developers aren't growing on trees, especially after Tony's mysterious disappearance :(
I am currently running the same hardware as in the original report, with kernel 2.6.19 and Xorg 7.1.1; I have DRI enabled. Precisely, in .config:
CONFIG_FB_RADEON=m
CONFIG_FB_RADEON_I2C=y
and Load "dri" in /etc/X11/xorg.conf.
No hangs; however, I must note that GoogleEarth runs very slowly.
Any updates with this problem? Vladimir, how does it work for you now, have you tried latest kernels? Thanks.
I am changing the severity of this bug to low, as it does not seem to impact very many users. There also seems to be a lack of understanding of where exactly this is occurring, or even whether there are multiple things that cause this issue and on what type of platform they occur. I must admit that I myself have provided little information to help with this, and I apologize. One thing I do know is that it does occur when the radeonfb driver is used along with the ATI drivers in X.org. I am not 100% sure this is what the OP created this bug for, and it's possible that my experience should have been filed as a separate bug. I have not tested this on recent kernels, and it's possible it is not a kernel bug but an X.org bug, or the other way around. I will leave this open in case it is of any interest in the future, or I may close it if I file a more generic bug regarding this issue, which may be better, as it could close this one and some others that have the same issue. If you have any thoughts, please shoot me an email.
Yes, the bugs that appear in the OS and Xorg and their interaction are so subtle, hard to reproduce and debug. Is this freeze still happening with 2.6.23+?
At this stage, I'm tempted to say radeonfb is a lost cause, and fixing it isn't a very productive use of anybody's time. The mode setting is moving from X.org into the DRM, and we'll probably end up merging the useful bits of radeonfb, such as power management (along with other improvements to those bits), with that new DRM mode setting and deprecating radeonfb altogether soon. That should get rid of all those nasty interaction issues. Note also AMD/ATI's recent announcement about providing specs/docs and support to X.org developers, which means that we -might- get some help tracking down that sort of problem in the future. Unfortunately, neither solution is very short term, but that's the best I can say at this stage, unless somebody with time to waste can try to port back some of the stuff in X.org to match the memory maps with radeonfb -and- debug all the regressions that such a thing would cause.
I am not sure, Natalie, but I would guess that it is still there, as there has been little to no change in this driver. If you really wanna know, I can load up something on that laptop this weekend and give you some feedback. I agree, Benjamin: there is a workaround, and I would rather see work being done on the DRM than on anything else. I still find ATI drivers rather frustrating when dealing with systems that do not have them already set up, mostly because of the Mesa driver wanting to render 3D instead of the hardware.