Bug 63251

Summary: Kernel panic on X startup on Acer Aspire V3-772G
Product: Drivers
Component: Console/Framebuffers
Status: RESOLVED INSUFFICIENT_DATA
Severity: normal
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 3.11.5
Subsystem:
Regression: No
Bisected commit-id:
Reporter: Bernhard Rosenkränzer (bero)
Assignee: Alan (alan)
CC: alan, intel-gfx-bugs

Description Bernhard Rosenkränzer 2013-10-18 23:43:19 UTC
On an Acer Aspire V3-772G configured to use the Intel GPU (the box has dual Intel/Nvidia GPUs), this error occasionally occurs late in the boot process, on X startup. It is probably a timing-related issue: it happens only occasionally, and if X does come up, everything is fine afterwards.
[The W I taints are from a buggy BIOS that reports the DMAR address as 0 -- should be unrelated to this issue]

BUG: unable to handle kernel paging request at 0000000fffffffe0
IP: [<ffffffff81193323>] __kmalloc_node_track_caller+0xf3/0x230
PGD 831b83067 PUD 0
Oops: 0000 [#1] PREEMPT SMP
Modules linked in: arc4 ath9k ath9k_common ath9k_hw snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep ath snd_pcm mac80211 cfg80211 snd_page_alloc snd_timer snd rtsx_pci_ms broadcom memstick rtsx_pci_sdmmc soundcore tg3 ptp pps_core libphy mmc_core rtsx_pci shpchp rfcomm bnep ath3k btusb bluetooth acer_wmi sparse_keymap rfkill uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core videodev iTCO_wdt iTCO_vendor_support x86_pkg_temp_thermal i2c_i801 media coretemp kvm_intel kvm microcode joydev serio_raw lpc_ich thermal acpi_cpufreq mperf ac processor battery evdev binfmt_misc ipv6 autofs4 crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd sr_mod ehci_pci xhci_hcd nouveau usbcore mxm_wmi usb_common ttm wmi i915 video button i2c_algo_bit drm_kms_helper drm i2c_core dm_mirror dm_region_hash dm_log dm_mod
CPU: 3 PID: 1 Comm: systemd Tainted: G        W I  3.11.5 #1
Hardware name: Acer Aspire V3-772/VA70_HW, BIOS V1.08 07/19/2013
task: ffff88083b8f8000 ti:ffff88083b8f4000 task.ti: ffff88083b8f4000
RIP: 0010:[<ffffffff81193323>] [<ffffffff81193323>] __kmalloc_node_track_caller+0xf3/0x230
RSP: 0018:ffff88083b8f5978 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff880838fcf900 RCX: 000000000001ca43
RDX: 000000000001ca23 RSI: 0000000000000000 RDI: 00000000000167a0
RBP: ffff88083b8f59e8 R08: ffff88085f2d67a0 R09: ffff88083ec03500
R10: ffff88083b8f5fd8 R11: 0000000000000246 R12: 0000000fffffffe0
R13: 00000000000106d0 R14: 0000000000000300 R15: 00000000ffffffff
FS:  00007f778624d7c0(0000) GS:ffff88085f2c0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000fffffffe0 CR3: 0000000834d5e000 CR4: 00000000001407e0
Stack:
 0000000000000001 0000000000000005 0000000000000001 ffff880837bd0000
 ffff88083b8f59e8 ffffffff810955f9 ffff88085f354240 ffffffff81573b97
 ffff88085f5f9d80 ffff880838fcf900 ffff88083b8f5a67 00000000000004d0
Call Trace:
 [<ffffffff810955f9>] ? try_to_wake_up+0xb9/0x1d0
 [<ffffffff81573b97>] ? __alloc_skb+0x87/0x2a0
 [<ffffffff815707fc>] __kmalloc_reserve.isra.52+0x3c/0xa0
 [<ffffffff81573b97>] __alloc_skb+0x87/0x2a0
 [<ffffffff8156cef1>] sock_alloc_send_pskb+0x1d1/0x350
 [<ffffffff81141caa>] ? __alloc_pages_nodemask+0x14a/0x950
 [<ffffffff8156d085>] sock_alloc_send_skb+0x15/0x20
 [<ffffffff81619cfb>] unix_stream_sendmsg+0x27b/0x3e0
 [<ffffffff815698d6>] sock_sendmsg+0xa6/0xd0
 [<ffffffff81388eb1>] ? cpumask_any_but+0x31/0x50
 [<ffffffff81146cee>] ? lru_cache_add+0xe/0x10
 [<ffffffff8116bd64>] ? page_add_new_anon_rmap+0x94/0x110
 [<ffffffff81569c7c>] __sys_sendmsg+0x37c/0x390
 [<ffffffff81161c29>] ? handle_mm_fault+0x149/0x210
 [<ffffffff811b1688>] ? __d_free+0x48/0x70
 [<ffffffff816486dc>] ? __do_page_fault+0x23c/0x4b0
 [<ffffffff811b2d9e>] ? d_kill+0xee/0x150
 [<ffffffff811bb7a9>] ? mntput_no_expire+0x49/0x160
 [<ffffffff811bb8e6>] ? mntput+0x26/0x40
 [<ffffffff8119e3b6>] ? __fput+0x166/0x230
 [<ffffffff8156aab9>] __sys_sendmsg+0x49/0x90
 [<ffffffff8156ab12>] SyS_sendmsg+0x12/0x20
 [<ffffffff8164c86d>] system_call_fastpath+0x1a/0x1f
Code: 51 18 0f 1f 44 00 00 48 83 c4 48 4c 89 e0 5b 41 5c 41 5d 41 5e 41 5f 5d c3 0f 1f 80 00 00 00 00 49 63 41 20 48 8d 4a 20 49 8b 39 <49> 8b 1c 04 4c 89 e0 65 48 0f c7 0f 0f 94 c0 84 c0 0f 84 41 ff
RIP  [<ffffffff81193323>] __kmalloc_node_track_caller+0xf3/0x230
 RSP <ffff88083b8f5978>
CR2: 0000000fffffffe0
Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009

drm_kms_helper: panic occurred, switching back to text console


Userland components that might be involved in triggering this:
systemd 208
xorg-server 1.14.3
plymouth 0.8.8
Comment 1 Jani Nikula 2013-11-12 15:04:25 UTC
Hmm, nothing in the backtrace implicates i915; the drm_kms_helper message is just from the panic notifier.
Comment 2 Bernhard Rosenkränzer 2013-11-12 15:29:59 UTC
We've identified a userland bug that was responsible for this (kdm and systemd were apparently trying to grab the same tty at the same time), but there's probably still something wrong in the kernel: userland shouldn't be able to cause kernel BUG()s, even when it is doing something stupid.
Comment 3 Alan 2013-11-12 16:49:42 UTC
Can you give me a description of roughly what was going on, and how I might try to reproduce it? The console/tty code is certainly a little fragile in spots, and that sounds like we've not nailed all the hangup/session races.
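
For triage, the register dump in the description can be cross-checked against CR2 to see which register held the bad address at the time of the fault. The following is only a minimal sketch in Python: the helper names (parse_registers, registers_matching_fault) are hypothetical and not part of any kernel tooling, and the excerpt is copied from the oops above.

```python
import re

# Excerpt copied from the oops in the description (registers and CR2 only).
OOPS_EXCERPT = """\
RAX: 0000000000000000 RBX: ffff880838fcf900 RCX: 000000000001ca43
RDX: 000000000001ca23 RSI: 0000000000000000 RDI: 00000000000167a0
RBP: ffff88083b8f59e8 R08: ffff88085f2d67a0 R09: ffff88083ec03500
R10: ffff88083b8f5fd8 R11: 0000000000000246 R12: 0000000fffffffe0
R13: 00000000000106d0 R14: 0000000000000300 R15: 00000000ffffffff
CR2: 0000000fffffffe0 CR3: 0000000834d5e000 CR4: 00000000001407e0
"""

# Matches "NAME: <16 hex digits>" pairs such as "R12: 0000000fffffffe0".
REG_RE = re.compile(r"\b([A-Z][A-Z0-9]{1,2}): ([0-9a-f]{16})\b")


def parse_registers(text):
    """Return a dict mapping register name to its integer value."""
    return {name: int(value, 16) for name, value in REG_RE.findall(text)}


def registers_matching_fault(text):
    """List non-control registers whose value equals the fault address in CR2."""
    regs = parse_registers(text)
    fault = regs["CR2"]
    return sorted(
        name
        for name, value in regs.items()
        if value == fault and not name.startswith("CR")
    )


if __name__ == "__main__":
    print(registers_matching_fault(OOPS_EXCERPT))
```

On this excerpt the only match is R12, so the faulting address was sitting in a register inside __kmalloc_node_track_caller. That alone does not identify which code corrupted the pointer.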