Created attachment 158101
When trying to start the X server through the login manager, the radeon module seems to crash, leaving the system without graphical output (but otherwise working). This happens every time, as soon as the PC boots.
Going back to v3.17.2 fixes the issue.
This seems to be the relevant part of dmesg:
[ 2.640374] BUG: unable to handle kernel paging request at ffffec2000000900
[ 2.640419] IP: [<ffffffff811ac4e6>] kfree+0x56/0x1a0
[ 2.640449] PGD 0
[ 2.640462] Oops: 0000 [#1] PREEMPT SMP
[ 2.640488] Modules linked in: [...]
[ 2.641120] CPU: 0 PID: 287 Comm: Xorg.bin Not tainted 3.17.3-1-ARCH #1
[ 2.641152] Hardware name: Gigabyte Technology Co., Ltd. To be filled by O.E.M./Q77M-D2H, BIOS F2 12/20/2012
[ 2.641197] task: ffff8800dc131420 ti: ffff8800dc168000 task.ti: ffff8800dc168000
[ 2.641232] RIP: 0010:[<ffffffff811ac4e6>] [<ffffffff811ac4e6>] kfree+0x56/0x1a0
[ 2.641269] RSP: 0018:ffff8800dc16ba30 EFLAGS: 00010286
[ 2.641294] RAX: 0000022000000900 RBX: 0000100000024414 RCX: 0000000000010005
[ 2.641328] RDX: 000077ff80000000 RSI: 0000000000000005 RDI: 0000100000024414
[ 2.641361] RBP: ffff8800dc16ba48 R08: 0000000000000005 R09: ffffec2000000900
[ 2.641393] R10: 0000000000000010 R11: 0000000000000000 R12: ffff8800dd430800
[ 2.641426] R13: ffffffffa0736e8c R14: 00000000000120f0 R15: 0000000000001800
[ 2.641460] FS: 00007f3efe93e8c0(0000) GS:ffff88021dc00000(0000) knlGS:0000000000000000
[ 2.641497] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2.641524] CR2: ffffec2000000900 CR3: 000000020e2e2000 CR4: 00000000001407f0
[ 2.641556] Stack:
[ 2.641567] ffff8800dd9d4000 ffff8800dd430800 ffff8800dd9d4000 ffff8800dc16bb30
[ 2.641607] ffffffffa0736e8c 000000000000004c 000120480001212c ffff880000000008
[ 2.641648] 0000ffff00012044 000000000001212c 0000000000012048 0000000000012044
[ 2.641688] Call Trace:
[ 2.641721] [<ffffffffa0736e8c>] evergreen_hdmi_setmode+0xd8c/0x1970 [radeon]
[ 2.641763] [<ffffffffa034c8d4>] ? drm_detect_hdmi_monitor+0x74/0xc0 [drm]
[ 2.641810] [<ffffffffa073f388>] radeon_atom_encoder_mode_set+0x178/0x3c0 [radeon]
[ 2.641849] [<ffffffffa04a7986>] drm_crtc_helper_set_mode+0x356/0x530 [drm_kms_helper]
[ 2.641897] [<ffffffffa06d989c>] radeon_property_change_mode.isra.1+0x3c/0x40 [radeon]
[ 2.641943] [<ffffffffa06d9a4e>] radeon_connector_set_property+0x1ae/0x3f0 [radeon]
[ 2.641986] [<ffffffffa03494e2>] drm_mode_obj_set_property_ioctl+0x1b2/0x3a0 [drm]
[ 2.642027] [<ffffffffa034970f>] drm_mode_connector_property_set_ioctl+0x3f/0x60 [drm]
[ 2.642069] [<ffffffffa0339fef>] drm_ioctl+0x1df/0x680 [drm]
[ 2.642100] [<ffffffff8105e9ac>] ? __do_page_fault+0x2ec/0x600
[ 2.642134] [<ffffffffa06b204c>] radeon_drm_ioctl+0x4c/0x80 [radeon]
[ 2.642166] [<ffffffff811da3c0>] do_vfs_ioctl+0x2d0/0x4b0
[ 2.642194] [<ffffffff811c9f31>] ? __sb_end_write+0x31/0x60
[ 2.642222] [<ffffffff811da621>] SyS_ioctl+0x81/0xa0
[ 2.642249] [<ffffffff8153d8e9>] system_call_fastpath+0x16/0x1b
[ 2.642277] Code: 00 00 00 80 ff 77 00 00 49 b9 00 00 00 00 00 ea ff ff 48 01 d8 48 0f 42 15 38 bb 66 00 48 01 d0 48 c1 e8 0c 48 c1 e0 06 49 01 c1 <49> 8b 01 f6 c4 80 0f 85 0e 01 00 00 49 8b 01 a8 80 0f 84 83 00
[ 2.644387] RIP [<ffffffff811ac4e6>] kfree+0x56/0x1a0
[ 2.646397] RSP <ffff8800dc16ba30>
[ 2.648359] CR2: ffffec2000000900
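For what it's worth, the faulting address can be reproduced from the register dump above. This is only a minimal Python sketch: it assumes the usual x86-64 layout constants for kernels of this era (`__START_KERNEL_map`, `PAGE_OFFSET`, the vmemmap base seen in the Code: bytes) and a 64-byte struct page as implied by the shift by 6 there. The function names are made up for illustration. It mimics the virt_to_page arithmetic that kfree applies to its argument (RDI/RBX = 0x100000024414):

```python
MASK64 = (1 << 64) - 1

# Assumed x86-64 layout constants (3.17-era kernel, no KASLR offset).
START_KERNEL_MAP = 0xFFFFFFFF80000000
PAGE_OFFSET = 0xFFFF880000000000
VMEMMAP_START = 0xFFFFEA0000000000
PAGE_SHIFT = 12
STRUCT_PAGE_SHIFT = 6  # sizeof(struct page) == 64, assumed


def phys_addr(ptr, phys_base=0):
    """Unchecked virtual-to-physical translation, as done inline in kfree."""
    y = (ptr - START_KERNEL_MAP) & MASK64
    if ptr > y:
        # Address was inside the kernel image mapping.
        return (y + phys_base) & MASK64
    # Otherwise treat it as a direct-mapping address.
    return (y + (START_KERNEL_MAP - PAGE_OFFSET)) & MASK64


def struct_page_addr(ptr):
    """Address of the struct page that kfree would read for ptr."""
    pfn = phys_addr(ptr) >> PAGE_SHIFT
    return (VMEMMAP_START + (pfn << STRUCT_PAGE_SHIFT)) & MASK64


# Pointer passed to kfree according to RDI/RBX in the oops.
print(hex(struct_page_addr(0x100000024414)))  # 0xffffec2000000900
```

The result equals CR2, and the offset into vmemmap (0x22000000900) matches RAX. That suggests the pointer handed to kfree() from evergreen_hdmi_setmode was itself bogus: it does not look like the ffff88... addresses seen elsewhere in the dump.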
I can confirm this.
My laptop can boot with 3.17.3 with 2 additional monitors connected (DVI + HDMI) and all 3 screens are working.
But when X starts, the kernel oopses if HDMI is connected. I can still reboot via ssh after that.
If HDMI is connected AFTER X has been started, the system enters a hard lockup.
I will attach my current .config and the relevant part of dmesg.
Created attachment 158131
Created attachment 158141
complete dmesg when X starts and HDMI (with audio) is connected, dpm enabled
Created attachment 158151
Does reverting commit ffe0245532b98efc4bc0e06f29c51d3f0e471152 help? If not, can you bisect?
Reverting commit ffe0245532b98efc4bc0e06f29c51d3f0e471152 on top of the v3.17.3 tag does resolve the issue.
Thanks for your time.
Should be fixed by http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=83d04c39f9048807a8500e575ae3f1718a3f45bb
which needs to go to stable.
Applying the patch from Alex's commit solved the issue.
Thanks to everyone involved. Report and fix within 20 hours. Awesome!
I'm not sure it's right to mark it as fixed until the patch goes to 3.17 - it's fixed in mainline, sure, but not stable. The fix seems to have missed the boat for 3.17.4. (I got burned by this with OpenELEC 5.0 beta 3, which happened to get kernel 3.17.3, and is commonly used on systems with Radeon graphics adapters and HDMI displays as it's an HTPC appliance distro...)
The bug is pretty nasty and the fix is quite trivial, so I don't know why it wasn't included in the 3.17.4 release. I assume it's because it didn't seem so urgent, but it actually leaves my system unusable.
I don't know the procedure for getting a fix into the current stable branch, but if there is going to be a 3.17.5 release, this fix should be in it. Should I mail Greg KH about the bug?
Anyway, I don't think leaving it open will make any difference.
The patch was sent to stable last week. It should show up any time now.
Does this bug still persist in 3.17.6-1 or is it patched there?
It should be fixed in 3.17.5 and 3.17.6.