Bug 37752
Summary: | Kernel Panic in drm_vblank_put+0x13/0x50 on P4 HT machine with 82915G/GV/910GL Integrated Graphics Controller | ||
---|---|---|---|
Product: | Drivers | Reporter: | Martin Rogge (marogge) |
Component: | Video(DRI - Intel) | Assignee: | drivers_video-dri-intel (drivers_video-dri-intel) |
Status: | RESOLVED CODE_FIX | ||
Severity: | high | CC: | chris, daniel, florian, maciej.rutecki, marogge, rbyshko, rjw, samuel-kbugs |
Priority: | P1 | ||
Hardware: | i386 | ||
OS: | Linux | ||
URL: | https://bugs.freedesktop.org/show_bug.cgi?id=34211 | ||
Kernel Version: | 2.6.39.3, 3.0 | Subsystem: | |
Regression: | Yes | Bisected commit-id: | |
Bug Depends on: | |||
Bug Blocks: | 32012 | ||
Attachments: | kernel config, output of lspci -vv, Screenshots of two different panics |
Description
Martin Rogge
2011-06-17 13:02:11 UTC
Created attachment 62522 [details]
output of lspci -vv
Created attachment 62532 [details]
Screenshots of two different panics
On Tuesday, June 28, 2011, Martin wrote:
> On Monday 27 June 2011 00:35:16 Rafael J. Wysocki wrote:
> > This message has been generated automatically as a part of a report
> > of regressions introduced between 2.6.38 and 2.6.39.
> >
> > The following bug entry is on the current list of known regressions
> > introduced between 2.6.38 and 2.6.39. Please verify if it still should
> > be listed and let the tracking team know (either way).
> >
> >
> > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=37752
> > Subject : Kernel Panic in drm_vblank_put+0x13/0x50 on P4 HT machine with 82915G/GV/910GL Integrated Graphics Controller
> > Submitter : Martin Rogge <marogge@onlinehome.de>
> > Date : 2011-06-17 13:02 (10 days old)
>
> As far as I can see the bug is still present in the 2.6.39 line. However, I
> have good news. On a whim I've been trying 3.0-rc4 and the panic has not
> occurred for a few days.
>
> I can try and bisect which commit fixed the issue. I guess I have to reverse
> the semantics of good and bad for this one. Don't tell the clergy.
>
> Anyway, since it takes a while to establish the absence of a panic, don't
> expect any results soon.
>
> Martin
I have good news and bad news: the bad news is, kernel 3.0-rc4 did freeze after 5 days of uptime. The good news is, it didn't panic but threw a kernel BUG with EIP the same as the panics I was getting before. This is what I caught in the syslog:

Jun 29 10:08:38 darkstar kernel: ------------[ cut here ]------------
Jun 29 10:08:38 darkstar kernel: kernel BUG at drivers/gpu/drm/drm_irq.c:924!
Jun 29 10:08:38 darkstar kernel: invalid opcode: 0000 [#1] PREEMPT SMP
Jun 29 10:08:38 darkstar kernel:
Jun 29 10:08:38 darkstar kernel: Pid: 11234, comm: git Not tainted 3.0.0-rc4 #1 IBM 8143WZG/IBM
Jun 29 10:08:38 darkstar kernel: EIP: 0060:[<c1192a22>] EFLAGS: 00010046 CPU: 0
Jun 29 10:08:38 darkstar kernel: EIP is at drm_vblank_put+0x13/0x50
Jun 29 10:08:38 darkstar kernel: EAX: 00000000 EBX: f726f800 ECX: f7104c00 EDX: f7246dc0
Jun 29 10:08:38 darkstar kernel: ESI: 00000000 EDI: 00ac1e80 EBP: 00000000 ESP: f7009f08
Jun 29 10:08:38 darkstar kernel: DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Jun 29 10:08:38 darkstar kernel: Process git (pid: 11234, ti=f7008000 task=f696b4e0 task.ti=dfc28000)
Jun 29 10:08:38 darkstar kernel: Stack:
Jun 29 10:08:38 darkstar kernel: f726f800 00000000 c11b10ba 00000001 00000002 c0d40180 cecae6d8 122bb12b
Jun 29 10:08:38 darkstar kernel: f7104c00 f7104db0 00000000 00000082 f71f6000 4e0add86 0005fcb7 4e0add86
Jun 29 10:08:38 darkstar kernel: 0006011f f71f6000 f7104c00 00000000 00000800 c11a267b 00000001 00000000
Jun 29 10:08:38 darkstar kernel: Call Trace:
Jun 29 10:08:38 darkstar kernel: [<c11b10ba>] ? do_intel_finish_page_flip+0x187/0x1e5
Jun 29 10:08:38 darkstar kernel: [<c11a267b>] ? i915_driver_irq_handler+0x2dc/0x530
Jun 29 10:08:38 darkstar kernel: [<c11f3300>] ? ata_bmdma_port_intr+0x75/0xcb
Jun 29 10:08:38 darkstar kernel: [<c11ffc1a>] ? tg3_interrupt_tagged+0x2d/0xa5
Jun 29 10:08:38 darkstar kernel: [<c104869e>] ? handle_irq_event_percpu+0x1d/0xf9
Jun 29 10:08:38 darkstar kernel: [<c10487a3>] ? handle_irq_event+0x29/0x42
Jun 29 10:08:38 darkstar kernel: [<c1049e5d>] ? handle_level_irq+0x91/0x91
Jun 29 10:08:38 darkstar kernel: [<c1049ec0>] ? handle_fasteoi_irq+0x63/0x7f
Jun 29 10:08:38 darkstar kernel: <IRQ>
Jun 29 10:08:38 darkstar kernel: [<c100372e>] ? do_IRQ+0x2e/0x84
Jun 29 10:08:38 darkstar kernel: [<c1311329>] ? common_interrupt+0x29/0x30
Jun 29 10:08:38 darkstar kernel: Code: ff 8b 54 24 14 8b 44 24 0c e8 58 d9 17 00 89 f8 83 c4 28 5b 5e 5f 5d c3 56 53 89 c1 c1 e2 02 03 90 74 01 00 00 8b 02 85 c0 75 02 <0f> 0b f0 ff 0a 0f 94 c0 84 c0 74 2e a1 e8 91 42 c1 85 c0 74 25
Jun 29 10:08:38 darkstar kernel: EIP: [<c1192a22>] drm_vblank_put+0x13/0x50 SS:ESP 0068:f7009f08
Jun 29 10:08:38 darkstar kernel: BUG: scheduling while atomic: git/11234/0x00010002
Jun 29 10:08:38 darkstar kernel:
Jun 29 10:08:38 darkstar kernel: Pid: 11234, comm: git Not tainted 3.0.0-rc4 #1 IBM 8143WZG/IBM
Jun 29 10:08:38 darkstar kernel: EIP: 0073:[<080b137a>] EFLAGS: 00000206 CPU: 0
Jun 29 10:08:38 darkstar kernel: EIP is at 0x80b137a
Jun 29 10:08:38 darkstar kernel: EAX: 087be190 EBX: 087388f8 ECX: 08c1ea68 EDX: 086a0108
Jun 29 10:08:38 darkstar kernel: ESI: 08981870 EDI: 000000f1 EBP: bfe98e28 ESP: bfe98de0
Jun 29 10:08:38 darkstar kernel: DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b
Jun 29 10:08:38 darkstar kernel: Process git (pid: 11234, ti=f7008000 task=f696b4e0 task.ti=dfc28000)
Jun 29 10:08:38 darkstar kernel:
Jun 29 10:08:38 darkstar kernel: Call Trace:
Jun 29 10:09:38 darkstar kernel: INFO: rcu_preempt_state detected stalls on CPUs/tasks: { 1} (detected by 0, t=18002 jiffies)
Jun 29 10:12:38 darkstar kernel: INFO: rcu_preempt_state detected stalls on CPUs/tasks: { 1} (detected by 0, t=72034 jiffies)

Looks similar to the races found in https://bugs.freedesktop.org/show_bug.cgi?id=34211

It is very likely a race condition because it seems to trigger as soon as I put the system under load (while the 3D screensaver is running).
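For illustration, here is a minimal user-space sketch of how a race of this kind could end in the failure logged above. It assumes that the BUG at drivers/gpu/drm/drm_irq.c:924 is a check that the vblank reference count is non-zero before drm_vblank_put decrements it, which is what the trace suggests but is not confirmed in this report. Every name in the sketch (crtc_model, model_vblank_get, model_vblank_put, queue_flip_racy, finish_flip) is hypothetical and is not a kernel API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct crtc_model {
    int  vblank_refcount;  /* models the per-CRTC vblank reference count */
    bool vblank_enabled;   /* a get can only succeed while this is true */
    bool flip_pending;     /* models "a page flip is waiting to be finished" */
};

/* Take a reference; returns -1 when vblank interrupts are unavailable. */
static int model_vblank_get(struct crtc_model *c)
{
    if (!c->vblank_enabled)
        return -1;
    c->vblank_refcount++;
    return 0;
}

/* Drop a reference. Returns false where the assumed BUG check would fire. */
static bool model_vblank_put(struct crtc_model *c)
{
    if (c->vblank_refcount == 0)
        return false;
    c->vblank_refcount--;
    return true;
}

/* Racy ordering: the flip is marked pending before a reference is secured. */
static int queue_flip_racy(struct crtc_model *c)
{
    c->flip_pending = true;
    return model_vblank_get(c);  /* on failure, flip_pending stays set */
}

/* Interrupt-side completion: drops the reference held for a pending flip. */
static bool finish_flip(struct crtc_model *c)
{
    if (!c->flip_pending)
        return true;             /* nothing pending, nothing to drop */
    c->flip_pending = false;
    return model_vblank_put(c);
}

/* Returns true when the sequence ends in an unbalanced put. */
static bool racy_sequence_trips_check(void)
{
    struct crtc_model c = { 0, false, false };

    (void)queue_flip_racy(&c);       /* get fails, so no reference is taken */
    assert(c.vblank_refcount == 0);
    return !finish_flip(&c);         /* a spurious flip interrupt arrives */
}
```

In this model, racy_sequence_trips_check() returns true: the completion path sees a pending flip and calls the put even though the matching get never succeeded. That would be consistent with the trace going through do_intel_finish_page_flip into drm_vblank_put.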
NB: I've just had another kernel BUG ("EIP is at drm_vblank_put+0x13/0x50"), followed by a panic ("fatal exception in interrupt") when I went through the Alt-SysRq sequence.

On Monday, July 11, 2011, Martin wrote:
> On Sunday 10 July 2011 12:58:54 Rafael J. Wysocki wrote:
> > This message has been generated automatically as a part of a report
> > of regressions introduced between 2.6.38 and 2.6.39.
> >
> > The following bug entry is on the current list of known regressions
> > introduced between 2.6.38 and 2.6.39. Please verify if it still should
> > be listed and let the tracking team know (either way).
> >
> >
> > Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=37752
> > Subject : Kernel Panic in drm_vblank_put+0x13/0x50 on P4 HT machine with 82915G/GV/910GL Integrated Graphics Controller
> > Submitter : Martin Rogge <marogge@onlinehome.de>
> > Date : 2011-06-17 13:02 (24 days old)
>
> I have verified today that both 2.6.39.3 and 3.0-rc6 still show the problem.
> As before, 3.0-rc6 seems to have a longer uptime than 2.6.39.3. Both times I
> did not catch a proper kernel panic. The machine simply froze.
Just for info, I tested kernel v3.0 today. After a few hours of running the Atlantis screen saver while simultaneously compiling a kernel, the BUG was triggered again.

Can you please retest with at least 3.2? That kernel contains the fix for https://bugs.freedesktop.org/show_bug.cgi?id=34211

I presume this is it. If I'm wrong, please reopen this bug (and hit me with the cluestick ;-).

Relevant commit:

commit 7317c75e66fce0c9f82fbe6f72f7e5256b315422
Author: Jesse Barnes <jbarnes@virtuousgeek.org>
Date: Mon Aug 29 09:45:28 2011 -0700

drm/i915: don't set unpin_work if vblank_get fails

*** Bug 35092 has been marked as a duplicate of this bug. ***
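The subject of the commit cited above ("don't set unpin_work if vblank_get fails") suggests an ordering fix: secure the vblank reference first, and record the pending flip only if that succeeded. The following user-space sketch shows that ordering under the same assumption about the reference-count check in drm_vblank_put. It is not the actual patch, and all names in it (crtc_model, model_vblank_get, model_vblank_put, queue_flip_checked, finish_flip) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct crtc_model {
    int  vblank_refcount;  /* models the per-CRTC vblank reference count */
    bool vblank_enabled;   /* a get can only succeed while this is true */
    bool flip_pending;     /* stands in for "unpin_work is set" */
};

/* Take a reference; returns -1 when vblank interrupts are unavailable. */
static int model_vblank_get(struct crtc_model *c)
{
    if (!c->vblank_enabled)
        return -1;
    c->vblank_refcount++;
    return 0;
}

/* Drop a reference. Returns false where the assumed BUG check would fire. */
static bool model_vblank_put(struct crtc_model *c)
{
    if (c->vblank_refcount == 0)
        return false;
    c->vblank_refcount--;
    return true;
}

/* Checked ordering: a flip becomes pending only once a reference is held. */
static int queue_flip_checked(struct crtc_model *c)
{
    int ret = model_vblank_get(c);

    if (ret != 0)
        return ret;          /* bail out early; flip_pending is never set */
    c->flip_pending = true;
    return 0;
}

/* Interrupt-side completion: drops the reference held for a pending flip. */
static bool finish_flip(struct crtc_model *c)
{
    if (!c->flip_pending)
        return true;         /* spurious interrupt: nothing to drop */
    c->flip_pending = false;
    return model_vblank_put(c);
}

/* Returns true when a failed get followed by a flip interrupt stays balanced. */
static bool checked_sequence_stays_balanced(void)
{
    struct crtc_model c = { 0, false, false };
    int ret = queue_flip_checked(&c);

    assert(ret != 0);        /* the get failed in this scenario */
    return finish_flip(&c) && c.vblank_refcount == 0;
}
```

In this model, checked_sequence_stays_balanced() returns true: with no pending flip recorded, a spurious flip interrupt has nothing to release, so the put is never called against a zero count.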