Bug 16388
Summary: | i915 drm BUG: unable to handle kernel paging request at a5e89046 | | |
---|---|---|---|
Product: | Drivers | Reporter: | lists |
Component: | Video(DRI - Intel) | Assignee: | drivers_video-dri-intel (drivers_video-dri-intel) |
Status: | CLOSED CODE_FIX | | |
Severity: | high | CC: | akpm, cebbert, chris, maciej.rutecki, rjw |
Priority: | P1 | | |
Hardware: | All | | |
OS: | Linux | | |
Kernel Version: | 2.6.34.1 | Subsystem: | |
Regression: | Yes | Bisected commit-id: | |
Bug Depends on: | | | |
Bug Blocks: | 15310 | | |
Attachments: | Config per jbarnes | | |
Description
lists
2010-07-14 16:59:13 UTC
Thanks. Was 2.6.33 OK? 2.6.32?

Here's the honest answer.

Short version: I only ran 2.6.33 for a couple of days after the fix for KBZ 13811 was out, and I did see it there. It's really hard to tell because KBZ 13811 (which really was a regression but was neither marked nor treated like one) masked this problem. I was running a 2.6.32.8 from F11 on F12 before that because that one particular kernel got me ~5 working hibernate/thaw cycles, and I didn't notice THIS issue.

Long version: (Sorry if it is ranty -- I know it's not your fault!)

Before I upgraded to F13 and the 2.6.33 series, I applied the patch for KBZ 13811 to 2.6.32.14 and 2.6.32.16 and did see the problem there. I had, however, been running 2.6.32.8 for the last 6 months and did not notice it. But then 13811 would usually strike first, and the 2.6.32.8 I was running would usually let me get ~5 hibernate/thaw cycles before dying, so I can't really say for sure.

Ever since Fedora required KMS in Fedora 11, the kernel has been in a regression, since at least 2.6.29. I have been using in-kernel hibernate/thaw (pmdisk/swsusp) since I think about 2.6.9 or 2.6.10, or about the time that the pmdisk and swsusp "stuff" was big. I used to build my own kernels to configure that in, as Fedora's kernels at the time didn't include it -- and generally (with a few hiccups here and there) it worked until the 2.6.29.4 that Fedora shipped in F-11. The last kernel that just plain worked was 2.6.27.44, as shipped in the last update of Fedora 10.

The entire KMS/GEM project has been, at least for me, nothing but a regression: the mode-switch blink when switching to X didn't bother me, and I've lost the ability to hibernate/thaw my laptop reliably for the past year plus. KBZ 13811, regardless of how it was marked, was a regression: before KMS as Fedora had it in 2.6.29, swsusp worked; after, it didn't. This may be more of a virgin bug in KMS/GEM, but the overall impact is a regression.

I'd try anything from 2.6.35, but does that work with the libdrm/mesa/xorg-intel driver that Fedora is shipping for F13? That's rhetorical, but reflective of the uncertainty and doubt about whether or not you can use Fedora as a base and have a workable system, at least with Intel graphics.

Thanks!

On Friday, July 23, 2010, Jesse Barnes wrote:
> On Fri, 23 Jul 2010 14:15:55 +0200 (CEST)
> "Rafael J. Wysocki" <rjw@sisk.pl> wrote:
>
> > This message has been generated automatically as a part of a report
> > of regressions introduced between 2.6.33 and 2.6.34.
> >
> > The following bug entry is on the current list of known regressions
> > introduced between 2.6.33 and 2.6.34. Please verify if it still should
> > be listed and let the tracking team know (either way).
> >
> >
> > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=16388
> > Subject : i915 drm BUG: unable to handle kernel paging request at a5e89046
> > Submitter : <lists@clanduggan.org>
> > Date : 2010-07-14 16:59 (10 days old)
>
> Looks like some potential memory corruption? At resume we try to get
> connector info but panic due to a bad pointer, maybe in one of the
> lists. Can you gdb your drm_kms_helper module and do "list
> *drm_mode_getconnector+0x295" to see what line this is?
>
> Also, what chipset do you have? Maybe I can reproduce it here with
> your kernel config.
This is probably due to the i915 hibernation memory corruption bug, and should be fixed by:

commit 985b823b919273fe1327d56d2196b4f92e5d0fae
drm/i915: fix hibernation since i915 self-reclaim fixes

commit cd9f040df6ce46573760a507cb88192d05d27d86
drm/i915: add 'reclaimable' to i915 self-reclaimable page allocations

And yes, those are in Fedora now.

And it looks like those two are needed in 2.6.32-stable, since the patch that caused the bug went in 2.6.32.8 as drm-i915-selectively-enable-self-reclaim.patch

Created attachment 27261
Config per jbarnes

Sorry for the delay.

On Fri, 23 Jul 2010 10:37:12 -0700, Jesse Barnes <jbarnes@virtuousgeek.org> wrote:
>
> Looks like some potential memory corruption? At resume we try to get
> connector info but panic due to a bad pointer, maybe in one of the
> lists. Can you gdb your drm_kms_helper module and do "list
> *drm_mode_getconnector+0x295" to see what line this is?
>

(gdb) list *drm_mode_getconnector+0x295
0x20f3 is in drm_mode_getconnector (drivers/gpu/drm/drm_crtc.c:1417).
1412 }
1413 copied++;
1414 }
1415 }
1416 }
1417 out_resp->count_encoders = encoders_count;
1418
1419 out:
1420 mutex_unlock(&dev->mode_config.mutex);
1421 return ret;

> Also, what chipset do you have? Maybe I can reproduce it here with
> your kernel config.
0:00:02.0 VGA compatible controller: Intel Corporation Mobile 945GM/GMS, 943/940GML Express Integrated Graphics Controller (rev 03) (prog-if 00 [VGA controller])
    Subsystem: Dell Device 01bd
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0
    Interrupt: pin A routed to IRQ 16
    Region 0: Memory at eff00000 (32-bit, non-prefetchable) [size=512K]
    Region 1: I/O ports at eff8 [size=8]
    Region 2: Memory at d0000000 (32-bit, prefetchable) [size=256M]
    Region 3: Memory at efec0000 (32-bit, non-prefetchable) [size=256K]
    Expansion ROM at <unassigned> [disabled]
    Capabilities: [90] MSI: Enable- Count=1/1 Maskable- 64bit-
        Address: 00000000 Data: 0000
    Capabilities: [d0] Power Management version 2
        Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
    Kernel driver in use: i915
    Kernel modules: i915

0000:00:02.1 Display controller: Intel Corporation Mobile 945GM/GMS/GME, 943/940GML Express Integrated Graphics Controller (rev 03)
    Subsystem: Dell Device 01bd
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0
    Region 0: Memory at eff80000 (32-bit, non-prefetchable) [size=512K]
    Capabilities: [d0] Power Management version 2
        Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-

Weird death following resume. It's either the write to a bit of memory we have just allocated for the ioctl, or the connector is corrupt. Definitely fits the pattern for the i915 hibernation bug.
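
For readers following the gdb step in this thread: the request was to resolve "drm_mode_getconnector+0x295", and gdb reported the result as 0x20f3 in drivers/gpu/drm/drm_crtc.c:1417. The arithmetic behind that lookup can be sketched roughly as below. This is an illustrative sketch only, not anything posted to the bug; the helper names `parse_symbol_offset` and `function_start` are hypothetical, and the only inputs taken from the thread are the symbol, the offset 0x295, and the resolved address 0x20f3.

```python
import re
from typing import Tuple


def parse_symbol_offset(ref: str) -> Tuple[str, int]:
    """Split a reference such as 'drm_mode_getconnector+0x295' into (symbol, offset).

    Accepts an optional leading '*' (gdb syntax) and an optional trailing
    '/0x<size>' part, as printed in kernel oops traces.
    """
    pattern = r"\*?(\w+)\+0x([0-9a-fA-F]+)(?:/0x[0-9a-fA-F]+)?"
    match = re.fullmatch(pattern, ref.strip())
    if match is None:
        raise ValueError(f"not a symbol+offset reference: {ref!r}")
    return match.group(1), int(match.group(2), 16)


def function_start(resolved_address: int, offset: int) -> int:
    """Return the start of the function, given the address gdb resolved."""
    return resolved_address - offset


# Values from the thread: the reference Jesse asked about, and gdb's answer.
symbol, offset = parse_symbol_offset("*drm_mode_getconnector+0x295")
start = function_start(0x20F3, offset)
print(f"{symbol} starts at {start:#x}; {symbol}+{offset:#x} is {0x20F3:#x}")
```

Under these assumptions the function would begin at 0x1e5e inside the unrelocated module object, which is why gdb can map the offset to a source line without knowing where the module was loaded at runtime.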