Bug 13819
| Summary: | system freeze when switching to console | | |
|---|---|---|---|
| Product: | Drivers | Reporter: | Reinette Chatre (reinette.chatre) |
| Component: | Video(DRI - non Intel) | Assignee: | drivers_video-dri |
| Status: | CLOSED CODE_FIX | | |
| Severity: | normal | CC: | akpm, rjw |
| Priority: | P1 | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Kernel Version: | 2.6.31-rc3 | Subsystem: | |
| Regression: | Yes | Bisected commit-id: | |
| Bug Depends on: | | | |
| Bug Blocks: | 13615 | | |
Description
Reinette Chatre
2009-07-23 17:57:33 UTC
The commit result seems unlikely to be correct. Are you actually using UBIFS?

I thought the same thing and that is why I did that sanity check (which failed). My kernel is not compiled with CONFIG_UBIFS_FS. I am starting with a fresh bisect now. Will report back results when this is complete.

I think I know why my previous bisect result was wrong. During the bisect I had to test a revision that did not boot on my system, so I guessed it as "bad" and proceeded. With the next bisect the kernel revision changed to 2.6.31-rc1, and I knew that this worked on 2.6.31-rc2, so I changed my guess to "good" to get bisect to go back to 2.6.31-rc2. This was wrong, as I now see the new bisect has a kernel version of 2.6.31-rc1 even though 2.6.31-rc2 works fine. It must have something to do with how the trees are merged.

Anyway, I reran bisect with a different good commit and was able to get a reliable first bad commit. This commit makes more sense as I am using i915.

commit 6ff4fd05676bc5b5c930bef25901e489f7843660
Author: ling.ma@intel.com <ling.ma@intel.com>
Date:   Thu Jun 25 10:59:22 2009 +0800

    drm/i915: Set SSC frequency for 8xx chips correctly

    All 8xx class chips have the 66/48 split, not just 855.

    Signed-off-by: Ma Ling <ling.ma@intel.com>
    Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
    Signed-off-by: Eric Anholt <eric@anholt.net>

My device:

00:02.0 0300: 8086:2a42 (rev 07)
    Subsystem: 104d:9025
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0
    Interrupt: pin A routed to IRQ 31
    Region 0: Memory at e8400000 (64-bit, non-prefetchable) [size=4M]
    Region 2: Memory at d0000000 (64-bit, prefetchable) [size=256M]
    Region 4: I/O ports at 8130 [size=8]
    Capabilities: <access denied>

OK, thanks, I'll reassign it to DRI.

This is a post-2.6.31-rc2 regression. I am now trying to use 2.6.31-rc4 but would like to use X also.
I thus reverted this commit from 2.6.31-rc4 but the freeze problem still exists. The above commit was identified clearly in a git-bisect that ran without problems, but reverting it does not fix the issue. This is weird. Is there a way in which I can obtain any logs to help debug this?

This problem has me very confused. I have tried bisecting it three times now and every time I end up with a patch from a merge of the 'drm-intel-next' branch, but I do not always get the same commit from that branch as the "first bad commit". I went ahead and "rolled my own" bisect by using rc4 and reverting all the patches from that branch merge. As expected, that gave me a working setup again. I then did a manual bisect and found that "drm/i915: enable error detection & state collection" was the bad commit. I wanted to confirm this with a sanity check, but could not revert it on its own; I had to revert the following to get a working setup based off the current linux-2.6 (4733fd328f14280900435d9dbae1487d110a4d56):

    drm/i915: Don't update display FIFO watermark on IGDNG
    drm/i915: add FIFO watermark support
    drm/i915: enable error detection & state collection

Same problem in 2.6.31-rc6. Unfortunately the patches I previously reverted to get a working system do not revert cleanly anymore.

On Thursday 20 August 2009, reinette chatre wrote:
> On Wed, 2009-08-19 at 13:26 -0700, Rafael J. Wysocki wrote:
> > This message has been generated automatically as a part of a report
> > of recent regressions.
> >
> > The following bug entry is on the current list of known regressions
> > from 2.6.30. Please verify if it still should be listed and let me know
> > (either way).
> >
> >
> > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=13819
> > Subject : system freeze when switching to console
> > Submitter : Reinette Chatre <reinette.chatre@intel.com>
> > Date : 2009-07-23 17:57 (28 days old)
>
> This issue is still present in 2.6.31-rc6. Unfortunately the patches I
> reverted to get a working system do not revert cleanly anymore.
This is fixed by:

commit e6890f6f3dc2d9024a08b1a149d9bd5208eea350
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Tue Sep 8 17:09:24 2009 -0700

    i915: disable interrupts before tearing down GEM state

    Reinette Chatre reports a frozen system (with blinking keyboard LEDs)
    when switching from graphics mode to the text console, or when
    suspending (which does the same thing). With netconsole, the oops
    turned out to be

        BUG: unable to handle kernel NULL pointer dereference at 0000000000000084
        IP: [<ffffffffa03ecaab>] i915_driver_irq_handler+0x26b/0xd20 [i915]

    and it's due to the i915_gem.c code doing drm_irq_uninstall() after
    having done i915_gem_idle(). And the i915_gem_idle() path will do

        i915_gem_idle() ->
          i915_gem_cleanup_ringbuffer() ->
            i915_gem_cleanup_hws() ->
              dev_priv->hw_status_page = NULL;

    but if an i915 interrupt comes in after this stage, it may want to
    access that hw_status_page, and gets the above NULL pointer
    dereference.

    And since the NULL pointer dereference happens from within an
    interrupt, and with the screen still in graphics mode, the common end
    result is simply a silently hung machine.

    Fix it by simply uninstalling the irq handler before idling rather
    than after.

    Fixes http://bugzilla.kernel.org/show_bug.cgi?id=13819

    Reported-and-tested-by: Reinette Chatre <reinette.chatre@intel.com>
    Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
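
The ordering problem described in the commit message can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `fake_dev`, `fake_irq_fire()`, `fake_gem_idle()`, `fake_irq_uninstall()` and the two `teardown_*` functions are hypothetical stand-ins invented for illustration, not the i915 driver code, and the "interrupt" is simulated by an ordinary function call that reports the NULL case instead of crashing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the driver state; NOT the real i915 structures. */
struct fake_dev {
    int *hw_status_page;  /* stands in for dev_priv->hw_status_page */
    bool irq_installed;   /* true while the interrupt handler is registered */
    int status_word;      /* backing storage for the status page */
};

enum irq_result { IRQ_IGNORED, IRQ_HANDLED, IRQ_WOULD_OOPS };

void fake_dev_init(struct fake_dev *dev)
{
    assert(dev != NULL);
    dev->status_word = 0;
    dev->hw_status_page = &dev->status_word;
    dev->irq_installed = true;
}

/* Simulated interrupt. A real handler would dereference the page
 * unconditionally; here the NULL case is reported rather than triggered. */
enum irq_result fake_irq_fire(const struct fake_dev *dev)
{
    if (!dev->irq_installed)
        return IRQ_IGNORED;       /* no handler registered, nothing runs */
    if (dev->hw_status_page == NULL)
        return IRQ_WOULD_OOPS;    /* handler would hit the NULL pointer */
    int seen = *dev->hw_status_page;
    (void)seen;
    return IRQ_HANDLED;
}

/* Models the end of the idle path: the status page pointer is cleared. */
void fake_gem_idle(struct fake_dev *dev)
{
    dev->hw_status_page = NULL;
}

void fake_irq_uninstall(struct fake_dev *dev)
{
    dev->irq_installed = false;
}

/* Ordering before the fix: idle first, uninstall afterwards. */
enum irq_result teardown_old_order(void)
{
    struct fake_dev dev;
    fake_dev_init(&dev);
    fake_gem_idle(&dev);
    enum irq_result r = fake_irq_fire(&dev);  /* interrupt lands in the window */
    fake_irq_uninstall(&dev);
    return r;
}

/* Ordering after the fix: uninstall the handler, then idle. */
enum irq_result teardown_fixed_order(void)
{
    struct fake_dev dev;
    fake_dev_init(&dev);
    fake_irq_uninstall(&dev);
    fake_gem_idle(&dev);
    return fake_irq_fire(&dev);               /* late interrupt finds no handler */
}
```

In this model, `teardown_old_order()` returns `IRQ_WOULD_OOPS` because the simulated interrupt arrives after the status page was cleared but while the handler is still installed, whereas `teardown_fixed_order()` returns `IRQ_IGNORED`. The sketch only captures the ordering argument; it does not model real interrupt delivery, locking, or hardware state.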