Created attachment 24936 [details]
lspci -v

Description of problem:
With 2.6.32 and 2.6.33-rc6 I noticed that a system with Intel 855GM no longer boots. After GRUB and the kernel start, it hangs with a completely black screen; no keystroke works (not even Magic SysRq) and I have to power it off.

When I add 'nomodeset' it boots just fine until X starts; then the non-KMS + X problem shows up. That is actually bug https://bugzilla.redhat.com/show_bug.cgi?id=522551 (it is not expected to ever be fixed), and /var/log/Xorg.log is empty.

When I add just "acpi=off" (no nomodeset) it boots fine too, and crashes in X too. I am not sure what acpi=off does regarding KMS, but I guess it somehow turns it off.

Note: Kernel 2.6.33-rc6 (+ recent intel xorg driver 2.10) is necessary for this box to enable Xv when KMS is enabled.

Just to stress that boxes with Intel 855 might be totally unusable with kernels 2.6.32+: KMS is broken in the kernel and non-KMS usage is broken at the X level. 2.6.31.x is fine.

Version-Release number of selected component (if applicable):
kernel-2.6.32.7-37.fc12.i686
kernel-2.6.33-0.27.rc6.git1.fc13.i686
xorg-x11-drv-intel-2.9.1-1.fc12.i686
xorg-x11-drv-intel-2.10.0-1.fc12.i686
Created attachment 24937 [details] dmesg
Created attachment 24938 [details] lspci -v
Created attachment 24939 [details] dmesg
Will you please boot the system with the "nomodeset" option and attach the following output?
1. cat /proc/acpi/button/lid/LID/state
2. dmidecode

Thanks.
Yakui.
If it's a LID issue (likely) this patch should work around the issue. If it works, can you attach the output of 'dmidecode' to this bug?

diff --git a/drivers/gpu/drm/i915/intel_lvds.c b/drivers/gpu/drm/i915/intel_lvds
index 75a9772..345e3b0 100644
--- a/drivers/gpu/drm/i915/intel_lvds.c
+++ b/drivers/gpu/drm/i915/intel_lvds.c
@@ -643,6 +643,8 @@ static enum drm_connector_status intel_lvds_detect(struct dr
 {
        enum drm_connector_status status = connector_status_connected;
 
+       return status;
+
        if (!acpi_lid_open() && !dmi_check_system(bad_lid_status))
                status = connector_status_disconnected;
Booted "nomodeset" to init 3 and gathered all data. Will try the patch later.
Created attachment 24994 [details] dmidecode
Created attachment 24995 [details]
lid state

The state file says "Closed", but the lid was open in reality.
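For anyone following the lid theory: the check in the proposed patch marks the LVDS panel as disconnected when ACPI reports a closed lid and the machine is not on the bad_lid_status list, so a lid wrongly reported as closed would plausibly leave the panel dark. The Python below is only a toy model of that decision, not kernel code. The names `parse_lid_state`, `lvds_detect`, `ConnectorStatus` and the `ignore_lid` flag are invented for illustration, and the "state: closed" file format is an assumption.

```python
from enum import Enum


class ConnectorStatus(Enum):
    CONNECTED = "connected"
    DISCONNECTED = "disconnected"


def parse_lid_state(text: str) -> bool:
    """Return True if the ACPI lid state text reports an open lid.

    Assumes content such as "state:      open" or "state:      closed";
    a bare "open"/"closed" is accepted too.
    """
    value = text.split(":", 1)[-1].strip().lower()
    if value not in ("open", "closed"):
        raise ValueError(f"unrecognised lid state: {text!r}")
    return value == "open"


def lvds_detect(lid_open: bool, on_bad_lid_list: bool,
                ignore_lid: bool = False) -> ConnectorStatus:
    """Toy model of the lid check touched by the patch (hypothetical names)."""
    if ignore_lid:
        # Workaround: report the panel as connected regardless of the lid.
        return ConnectorStatus.CONNECTED
    if not lid_open and not on_bad_lid_list:
        # A lid reported as closed hides the panel from the driver.
        return ConnectorStatus.DISCONNECTED
    return ConnectorStatus.CONNECTED


if __name__ == "__main__":
    reported_open = parse_lid_state("state:      closed")
    print(lvds_detect(reported_open, on_bad_lid_list=False))
    print(lvds_detect(reported_open, on_bad_lid_list=False, ignore_lid=True))
```

With the lid reported as closed, the model returns a disconnected panel unless the lid is ignored, which matches the behaviour the early-return hack is meant to work around.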
Created attachment 25000 [details] New dmidecode 2.6.33-rc6 KMS on
It works! Now I can boot 2.6.33-rc6 with that patch, KMS is enabled.
Great, just sent a patch, http://patchwork.kernel.org/patch/78947/, to ignore lid status on 8xx machines. It should fix the issue for you.
Created attachment 25017 [details] Jesse Barnes' patch for latest linus tree. This is the same patch as J. Barnes', except that it has been applied manually so the line offsets are now correct. I am now recompiling and will report back, since I have the same hardware as mnowak.
Created attachment 25018 [details] Jesse Barnes' patch for latest linus tree. This is the same patch as J. Barnes', except that it has been applied manually so the line offsets are now correct. I am now recompiling and will report back, since I have the same hardware as mnowak.
Created attachment 25019 [details] Jesse Barnes' patch for latest linus tree. This is the same patch as J. Barnes', except that it has been applied manually so the line offsets are now correct. I am now recompiling and will report back, since I have the same hardware as mnowak.
Created attachment 25020 [details]
lspci -v of my box

I have attached 'lspci -v' of my box; the patch does nothing for me, exactly the same behaviour is observed. The system is booted with 'nomodeset nofb' and I get an OFF screen right after the i915 module is auto-loaded. Then I successfully log in to Xorg (blindly, since the screen is OFF) and it hangs. The last lines of Xorg.0.log are not helpful:

(II) intel(0): [drm] removed 1 reserved context for kernel
(II) intel(0): [drm] unmapping 8192 bytes of SAREA 0xdff08000 at 0xb7331000
(II) intel(0): [drm] Closed DRM master.

messages.log does not help either, since its tail contains a wild memory dump from just before my hard reboot!
Created attachment 25021 [details] 'sudo lspci -v' on my box
Daniele, I think you have a different problem. Judging by your X log snippet, you have a really old X driver that won't work with KMS and recent kernels. If after updating you still have issues, please file a new bug with your kernel .config attached, your X log, and your dmesg.
(In reply to comment #15)
> System is booted with 'nomodeset nofb' and I get an OFF screen right after the

I can boot without this. This bug is about KMS enabled boot.
Handled-By : Jesse Barnes <jbarnes@virtuousgeek.org>
Patch : http://patchwork.kernel.org/patch/78947/
@jbarnes: the problem is that the kernel crash is so hard that no buffers are written to disk and Xorg.0.log is not overwritten, so the snippet I provided came from the previous Xorg.0.log, which was created with an old version because I *must* use that version to use this box normally. I have now started Xorg from ssh and captured the output manually by copy/paste, since not even 'tee' can output anything. The box crashes when launching Xorg (startx) and I need to cut the power to restart it. I am attaching the files you requested; please note that I am using the latest linus tree (with the patch I attached before) and all the latest packages (Arch Linux here).
Created attachment 25047 [details]
dmesg output retrieved via ssh

The machine is booted without 'nomodeset' and without 'nofb' (even if 'nofb' is not relevant).
Created attachment 25048 [details] stdout+stderr produced when starting Xorg (no Xorg.0.log file created because of early crash)
Created attachment 25049 [details]
legolas558's kernel .config

I should add that my display goes black (turned OFF) right after module loading (I suppose when the i915 module is loaded); then I can log in and trigger startx (all operations done blindly or via ssh) and the system crashes hard, with no magic keys working either. The Xorg output was also collected via ssh. I am fully available to work this bug out.

Thanks
(In reply to comment #18)
> (In reply to comment #15)
> > System is booted with 'nomodeset nofb' and I get an OFF screen right after the
>
> I can boot without this. This bug is about KMS enabled boot.

Sorry, my mistake: I get the OFF screen when *not* using the 'nomodeset' option, i.e. with KMS enabled.
Using 2.6.33-rc8 and the "return status" hack, I can boot with KMS enabled, but starting Xorg with either the vesa or the intel driver (all latest) results in a hard freeze (no SysRq). As before, with KMS disabled I can use the vesa driver without a crash. This is an improvement, as I can now have ACPI enabled. Just like Daniele, I can't have accelerated drivers via KMS on my 855GM. The original poster's Fedora kernels must have patches we do not have.
I have just tried FC12, as mnowak suggested to me on another bug tracker.

1) The stock FC12/FC13 kernels already have some patch which fixes the kernel for this hardware, since I can boot successfully with 'nomodeset' and start Xorg, while on my vanilla 2.6.32/2.6.33 (linus' tree) I can't boot at all without 'nomodeset'.
2) The FC13 kernel (2.6.33-rc8) + 855nolid.patch (J. Barnes' small patch about LID on 855GM) can boot, but Xorg crashes before the loading screen finishes loading, so I never reach the login manager.

So now I think it is necessary to identify which Red Hat patches are fixing the kernel's DRM so that recent Xorg can work; perhaps I'll need another bug tracker. I will try to incrementally apply some FC13 patches regarding DRM, but I don't like shooting in the dark with this stuff...

Please change the status of this bug since it is *NOT RESOLVED*. J. Barnes' patch is surely an improvement, but it has brought the situation back to early 2.6.32 development, when I could boot exactly as FC12+FC13_kernel+nolid.patch does now (without any KMS whatsoever).
Created attachment 25064 [details]
.config stripped down from FC13

I have attached a .config which works with 2.6.33-rc8 (linus tree); can some dev tell me why my .config (attachment 25049 [details]) causes the crashes instead?
One problem with the broken config is that AGP is modular. If the modules are loaded in the wrong order, it can break the DRM and/or X drivers. It's best to keep AGP as builtin along with the AGP drivers (CONFIG_AGP=y and CONFIG_AGP_INTEL=y in this case).
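The advice about built-in AGP can be checked mechanically against a kernel .config. The following is a minimal Python sketch, assuming the usual `CONFIG_FOO=y|m` line format; `modular_options` is a hypothetical helper, and including `CONFIG_DRM` and `CONFIG_DRM_I915` in the list to check is my assumption rather than something stated above.

```python
from typing import Dict, Iterable, List


def modular_options(config_text: str, names: Iterable[str]) -> List[str]:
    """Return the options from `names` that are built as modules (=m)."""
    values: Dict[str, str] = {}
    for line in config_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            # Skip blank lines and "# CONFIG_FOO is not set" comments.
            continue
        key, sep, value = line.partition("=")
        if sep:
            values[key] = value
    return [name for name in names if values.get(name) == "m"]


if __name__ == "__main__":
    sample = "CONFIG_AGP=m\nCONFIG_AGP_INTEL=m\nCONFIG_DRM_I915=y\n"
    wanted = ["CONFIG_AGP", "CONFIG_AGP_INTEL", "CONFIG_DRM", "CONFIG_DRM_I915"]
    for name in modular_options(sample, wanted):
        print(f"{name} is modular; consider building it in")
```

To check a real tree, read the file first, e.g. `modular_options(open(".config").read(), wanted)`; an empty result means none of the listed options is modular.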
Created attachment 25070 [details] garbled fonts with the current 2.6.33-rc8 + nolid patch, after some minutes
Created attachment 25071 [details]
Xorg.0.log with Xorg 1.7.4, kernel 2.6.33-rc8+855_nolid.patch

@jbarnes: yes, I am now rebuilding the kernel with AGP and DRM built-in; that will most likely fix the issue. Thanks for pointing it out.

So now I am at the same spot as mnowak and the other i855GM users: Xorg crashes after some time, and prints a lot of lines like:

(EE) intel(0): Failed to submit batch buffer, expect rendering corruption or even a frozen display: Input/output error.

until I terminate the Xorg process (luckily I still have control with the intel driver, while with the vesa driver there is a hard kernel crash). When I try to start Xorg again it fails with the same lines as above, so the hardware memory or GPU state must be somehow corrupted. I have also tried to suspend and resume (just in case some quirk would work) but nothing changes.
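When comparing kernels or patches, it may help to count how often the batch buffer error shows up in a captured Xorg log or in redirected startx output. This is a small Python sketch; `count_batch_failures` is a hypothetical helper, and the pattern is taken from the single log line quoted in this report, so other driver versions may word the message differently.

```python
import re

# Pattern based on the "(EE) intel(0): Failed to submit batch buffer" line
# quoted in this report; the screen number is allowed to vary.
BATCH_FAILURE = re.compile(r"\(EE\) intel\(\d+\): Failed to submit batch buffer")


def count_batch_failures(log_text: str) -> int:
    """Count log lines that report a failed batch buffer submission."""
    return sum(1 for line in log_text.splitlines() if BATCH_FAILURE.search(line))


if __name__ == "__main__":
    import sys

    # Usage: python count_failures.py /var/log/Xorg.0.log
    with open(sys.argv[1], errors="replace") as handle:
        print(count_batch_failures(handle.read()))
```

A count that stays at zero over a long session with one kernel but grows with another gives a rough, comparable signal for a later bisect.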
Created attachment 25072 [details] latest .config with built-in AGP and DRM
Ok good you're on the same page as everyone else now. The crashes seem like a different issue than what was solved in this bug (bad lid detection), can you file a new bug at bugs.freedesktop.org for that, following the instructions at http://intellinuxgraphics.org/how_to_report_bug.html? You can file it against the drm/intel component. Would be extra cool if you could bisect the kernel and see if 8xx started failing for you at a particular commit.
@jbarnes: this bug is not about bad lid detection, this bug is about KMS being broken (as far as I can read), and KMS is still broken since it crashes after a few seconds/minutes (mnowak can surely confirm this). However, the patch is a huge improvement since the crash was instantaneous previously. I have verified that my issue was not due to any Fedora Core patch, but to the .config (built-in AGP and DRM fixed the issue). If you want I can proceed as you suggested, but I am sure that mnowak can confirm that the patch only delays the crash. I should also add that with the latest linus tree and my new .config I am not getting the crash; instead I am getting a weird glitch each time the screen is refreshed (mouse move, character typing, pretty much anything except the caret moving up/down). Also, the fonts are no longer garbled.
I have to admit I was wrong with the statement that Xorg crashes after some time: it's rock solid for me, though I am not on 2.6.33-rc8 (just git6rc7). I have no idea what might be the issue for Daniele, since I believe we now have the same SW stack (F12+updates+updates_testing + F13's xorg-drv and kernel+Jesse's patch). For *me* the 2.6.31 -> 2.6.3{2,3} KMS regression is fixed. I don't mind if you change the title to mention "problem on boot" and file a new bug at freedesktop's side.
Created attachment 25084 [details]
FC13's DRM patch for i8xx devices

@mnowak: FC13's kernel uses the attached patch (drm-intel-big-hammer.patch); this appears to have fixed the vertical refresh flicker and the crash issue for me as well. So now I am running the latest linus tree + 855nolid.patch + drm-intel-big-hammer.patch; Linux is running with KMS on and the latest Xorg, and everything runs smoothly. The only remaining issue is the garbled fonts, as if some bytes of the font bitmaps were "eaten" (pretty much like the attached screenshot, just a bit less visible).

@jbarnes: if I get no more crashes with this kernel combination, would it make sense to also submit drm-intel-big-hammer.patch? Sorry, but I have trouble understanding whether this is an Xorg bug or a kernel bug, and it seems to have been fixed by the attached kernel patch for me now.
(In reply to comment #34)
> I have to admit I was wrong with statement that XOrg crashes after some time -
> it's rock solid for me. Thought I am not on 2.6.33-rc8 (just git6rc7). Having
> no idea what might be the issue for Daniele since I believe we now have the
> same SW stack (F12+updates+updates_testing + F13's xorg-drv and kernel+Jesse's
> patch).

The weird bit is that when using exactly your SW stack I get a crash before the desktop appears, while now I am using (latest linus tree + 855nolid.patch + drm-intel-big-hammer.patch) and everything works fine and smoothly (rock solid, as you said), except for the fonts, which are missing some pixels. I assume also that your fonts are displaying perfectly.

> For *me* is the 2.6.31 -> 2.6.3{2,3} KMS regression fixed. I don't mind if you
> change the title to mention "problem on boot" and file a new bug at
> freedesktop's side.

I think there is a misunderstanding here: the "problem on boot" was due to AGP being modular, as jbarnes said, and I filed a kernel documentation bug here (http://bugzilla.kernel.org/show_bug.cgi?id=15340). Now I am using it built-in (like FC12/FC13 do), so if I had to submit a new bug to freedesktop it would be for the Xorg crash bug, which anyway is not happening anymore with drm-intel-big-hammer.patch (perhaps that patch needs to be submitted at freedesktop's drm/intel, if jbarnes confirms).
(In reply to comment #36)
> I assume also that your fonts are displaying perfectly.

Yes. It's perfect. The only package which probably differs between our setups is my cairo, built with XCB enabled, and I am using the awesome WM (not in the Fedora repos). But that's unlikely to cause anything with fonts, I believe.

> I think there is a misunderstanding here: the "problem on boot" was due to AGP
> being modular, as jbarnes said, and I filed a kernel documentation bug here
> (http://bugzilla.kernel.org/show_bug.cgi?id=15340). Now I am using it built-in
> (like FC12/FC13 do) so if I had to submit a new bug to freedesktop it would be
> for the Xorg crash bug, which anyway is not happening anymore with
> drm-intel-big-hammer.patch (perhaps that patch needs to be submitted at
> freedesktop's drm/intel, if jbarnes confirms)

I am a bit puzzled by all the recent 855GM issues, but since the setup as of now works 100% for me, I don't have any preferences regarding new bugs elsewhere.
@mnowak: my recent findings are not about FC kernel tests but about a vanilla kernel with those 2 patches. It would be great if you could confirm whether that mix also causes the garbled fonts and the freeze while playing videos on your box. Your setup uses Fedora patches, so if you are not having any issue at all now (not even the video freeze) we should dig a bit more and find out which other patches we need in the mainline kernel, so that everybody can benefit from them.
While waiting for developer feedback about the inclusion of drm-intel-big-hammer.patch, I am digging into the crashes, which now happen more rarely. The crash is exactly the same as those happening without drm-intel-big-hammer.patch and seems related to GPU "hiccups"; see also: http://bugs.gentoo.org/301282 However, I have no idea which other patches to try to fix this. Playing videos or intensive CPU/GPU usage is likely to trigger the bug; mnowak reports that with his patched kernel this never happens. @mnowak: I have put the tools and patches I am using (and also the kernel sources) here: http://www.iragan.com/linux/i855GM/
Can you try using libdrm 2.4.18? It has some fixes for corruption, and I'm hoping it makes the "big hammer" patch unnecessary as well.
(In reply to comment #40)
> Can you try using libdrm 2.4.18? It has some fixes for corruption, and I'm
> hoping it makes the "big hammer" patch unnecessary as well.

I just tried libdrm from git:
1) Xorg crashes after more time, like 60 seconds instead of 2-3 seconds
2) the garbled fonts issue is still there

I had to re-apply the big hammer patch to use Xorg+KMS; furthermore, KMS was recently made mandatory on Arch Linux, so I'll really need this patch to run Xorg. Anyway, it's not a total fix, since it has been shown that the batch buffer crashes can still happen under particular stress situations. Perhaps these i8xx devices have a small pipe and need more time/less flooding to work correctly?
Just to be precise: for me the fonts are OK at first; after some time they start to get garbled (especially after using OpenGL, I think). Once this starts, play with the system a bit more and everything freezes except the mouse (no ssh in).
For me, instead, they are garbled from the start of the session; I have verified that XvMC and tiling are not responsible for this issue, nor for the crash (which always happens while playing a video, for example).

@jbarnes: do we have a built-in kernel stress test for DRM?

When redirecting the output of the 'startx' command I get the batch buffer error lines, but not this line:

intel_bufmgr_gem.c:901: Error setting to CPU domain 626: Input/output error

So I assume this one is emitted directly by libdrm on the first available tty. These downstream links contain interesting pointers:

http://bugs.gentoo.org/295777
https://bugs.freedesktop.org/show_bug.cgi?id=25510

However, I don't think there is any relevant patch mentioned there, because this bug could now be triggered from elsewhere. Can't we do anything to prevent the GPU from hanging (or the kernel from thinking it hung)?
acpi=off doesn't change anything about how the bug triggers. The bug also triggers while using the Firefox or Opera browsers. Garbled fonts are often accompanied by artifacts in icon rendering or in the restoring of background slices. I have applied some other FC13 kernel patches, but they don't fix the sudden crash, which each time requires a graceful system restart (VTs are still working, and garbled fonts appear there too). I also have a full git freedesktop development stack with the latest Xorg and libraries, just in case I need to run tests. I am now also using the BFS CPU scheduler and it doesn't affect how often the bug triggers, while in the past it greatly increased the trigger rate.