Bug 151311 - [x2APIC] BUG: unable to handle kernel NULL pointer dereference; IP: [<ffffffff8105b035>] x2apic_cluster_probe+0x35/0x70
Status: RESOLVED CODE_FIX
Alias: None
Product: Platform Specific/Hardware
Classification: Unclassified
Component: x86-64
Hardware: x86-64 Linux
Importance: P1 high
Assignee: drivers_other
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-08-03 01:46 UTC by JianhongYin
Modified: 2016-10-26 03:35 UTC

See Also:
Kernel Version: 4.7.0
Subsystem:
Regression: No
Bisected commit-id:


Attachments
panic.log (17.85 KB, application/pgp-keys)
2016-08-10 09:20 UTC, Otto Sabart
lscpu (689 bytes, text/plain)
2016-08-10 09:20 UTC, Otto Sabart
lspci (8.73 KB, application/pgp-keys)
2016-08-10 09:20 UTC, Otto Sabart

Description JianhongYin 2016-08-03 01:46:33 UTC
Got this call trace on many machines.

Part of the console log:
-----------------------------------
[    0.205347] x2apic enabled 
[    0.208375] BUG: unable to handle kernel NULL pointer dereference at           (null) 
[    0.217128] IP: [<ffffffff8105b035>] x2apic_cluster_probe+0x35/0x70 
[    0.224130] PGD 0  
[    0.226379] Oops: 0002 [#1] SMP 
[    0.229878] Modules linked in: 
[    0.233291] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.7.0 #1 
[    0.239797] Hardware name: IBM IBM System X3250 M4 -[2583AC1]-/00D3729, BIOS -[JQE164AUS-1.07]- 12/09/2013 
[    0.250568] task: ffff88017edc8000 task.stack: ffff88017edc4000 
[    0.257170] RIP: 0010:[<ffffffff8105b035>]  [<ffffffff8105b035>] x2apic_cluster_probe+0x35/0x70 
[    0.266885] RSP: 0000:ffff88017edc7e28  EFLAGS: 00010202 
[    0.272809] RAX: 0000000000000000 RBX: ffffffff81f3f960 RCX: ffff88023fc00000 
[    0.280767] RDX: 0000000000000000 RSI: 0000000000000002 RDI: 0000000000000246 
[    0.288726] RBP: ffff88017edc7e28 R08: 00000000fffffffe R09: 0000000000000000 
[    0.296685] R10: 0000000000000005 R11: 00000000000000cc R12: 0000000000002000 
[    0.304644] R13: 000000000000a118 R14: 0000000000000007 R15: 0000000000000008 
[    0.312603] FS:  0000000000000000(0000) GS:ffff88023fc00000(0000) knlGS:0000000000000000 
[    0.321629] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033 
[    0.328037] CR2: 0000000000000000 CR3: 0000000001c06000 CR4: 00000000001406f0 
[    0.335996] Stack: 
[    0.338236]  ffff88017edc7e40 ffffffff81da6718 000000000000a110 ffff88017edc7e78 
[    0.346520]  ffffffff81d9fff1 ffffffff81f1b020 ffff88017edc8b18 ffffffff81c95ac0 
[    0.354808]  ffffffff81823300 ffff88017edc8000 ffff88017edc7f38 ffffffff81d8a293 
[    0.363095] Call Trace: 
[    0.365822]  [<ffffffff81da6718>] default_setup_apic_routing+0x28/0x69 
[    0.373103]  [<ffffffff81d9fff1>] native_smp_prepare_cpus+0x223/0x2d2 
[    0.380288]  [<ffffffff81d8a293>] kernel_init_freeable+0xd8/0x249 
[    0.387087]  [<ffffffff816f274e>] kernel_init+0xe/0x110 
[    0.392914]  [<ffffffff816ff8ff>] ret_from_fork+0x1f/0x40 
[    0.398935]  [<ffffffff816f2740>] ? rest_init+0x80/0x80 
[    0.404762] Code: 00 31 c0 65 8b 15 1c f1 fa 7e 85 c9 75 01 c3 48 63 ca 55 48 c7 c0 10 d7 00 00 48 8b 0c cd 60 c4 d4 81 89 d2 48 89 e5 48 8b 04 08 <f0> 48 0f ab 10 49 c7 c0 70 b0 05 81 48 c7 c1 b0 ae 05 81 ba 01  
[    0.426408] RIP  [<ffffffff8105b035>] x2apic_cluster_probe+0x35/0x70 
[    0.433505]  RSP <ffff88017edc7e28> 
[    0.437392] CR2: 0000000000000000 
[    0.441088] ---[ end trace 4515b29a27d62395 ]--- 
[    0.446235] Kernel panic - not syncing: Fatal exception 
[    0.452074] ---[ end Kernel panic - not syncing: Fatal exception 
[    0.665165] random: fast init done 
[   13.785549] random: crng init done 
[-- MARK -- Tue Aug  2 09:20:00 2016] 
[  111.119523] BUG: unable to handle kernel NULL pointer dereference at 0000000000000102 
[  111.128277] IP: [<ffffffff810a4692>] __queue_work+0x32/0x420 
[  111.134598] PGD 0  
[  111.136848] Oops: 0000 [#2] SMP 
[  111.140347] Modules linked in: 
[  111.143760] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G      D         4.7.0 #1 
[  111.151622] Hardware name: IBM IBM System X3250 M4 -[2583AC1]-/00D3729, BIOS -[JQE164AUS-1.07]- 12/09/2013 
[  111.162393] task: ffff88017edc8000 task.stack: ffff88017edc4000 
[  111.168995] RIP: 0010:[<ffffffff810a4692>]  [<ffffffff810a4692>] __queue_work+0x32/0x420 
[  111.178030] RSP: 0000:ffff88023fc03de8  EFLAGS: 00010046 
[  111.183953] RAX: 0000000000000086 RBX: 0000000000000087 RCX: ffffffff81ceeca8 
[  111.191912] RDX: ffffffff81ceeba0 RSI: 0000000000000000 RDI: 0000000000002000 
[  111.199871] RBP: ffff88023fc03e20 R08: 0000000000000000 R09: 0000000000004000 
[  111.207830] R10: 0000000000000001 R11: 0000000000007ffe R12: ffffffff81ceeba0 
[  111.215788] R13: 0000000000002000 R14: 0000000000000000 R15: ffffffff81a817be 
[  111.223747] FS:  0000000000000000(0000) GS:ffff88023fc00000(0000) knlGS:0000000000000000 
[  111.232772] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033 
[  111.239180] CR2: 0000000000000102 CR3: 0000000001c06000 CR4: 00000000001406f0 
[  111.247140] Stack: 
[  111.249380]  0000000000000001 0000200000000001 0000000000000087 000000000000063d 
[  111.257666]  0000000000000381 0000000000008000 ffffffff81a817be ffff88023fc03e38 
[  111.265951]  ffffffff810a4aa7 ffffffff81ceec00 ffff88023fc03e88 ffffffff81468f77 
[  111.274236] Call Trace: 
[  111.276960]  <IRQ>  
[  111.279104]  [<ffffffff810a4aa7>] queue_work_on+0x27/0x40 
[  111.285329]  [<ffffffff81468f77>] credit_entropy_bits+0x1d7/0x2a0 
[  111.292126]  [<ffffffff814696d9>] ? add_interrupt_randomness+0x1b9/0x210 
[  111.299601]  [<ffffffff814696d9>] add_interrupt_randomness+0x1b9/0x210 
[  111.306883]  [<ffffffff810ea250>] handle_irq_event_percpu+0x40/0x80 
[  111.313874]  [<ffffffff810ea2cb>] handle_irq_event+0x3b/0x60 
[  111.320187]  [<ffffffff810ed9f8>] handle_level_irq+0x88/0x100 
[  111.326597]  [<ffffffff8103006b>] handle_irq+0xab/0x130 
[  111.332425]  [<ffffffff817021ad>] do_IRQ+0x4d/0xd0 
[  111.337768]  [<ffffffff8170004c>] common_interrupt+0x8c/0x8c 
[  111.344078]  <EOI>  
[  111.346223]  [<ffffffff81365a13>] ? delay_tsc+0x33/0x60 
[  111.352252]  [<ffffffff81365977>] __const_udelay+0x27/0x30 
[  111.358371]  [<ffffffff81196ea2>] panic+0x22e/0x236 
[  111.363811]  [<ffffffff81030bc8>] oops_end+0xb8/0xd0 
[  111.369349]  [<ffffffff8106a707>] no_context+0x137/0x390 
[  111.375274]  [<ffffffff8106aa4e>] __bad_area_nosemaphore+0xee/0x1d0 
[  111.382266]  [<ffffffff8145aac1>] ? univ8250_console_write+0x21/0x30 
[  111.389354]  [<ffffffff8106ab44>] bad_area_nosemaphore+0x14/0x20 
[  111.396053]  [<ffffffff8106b1f9>] __do_page_fault+0x89/0x4a0 
[  111.402365]  [<ffffffff8106b640>] do_page_fault+0x30/0x80 
[  111.408387]  [<ffffffff81701948>] page_fault+0x28/0x30 
[  111.414118]  [<ffffffff8105b035>] ? x2apic_cluster_probe+0x35/0x70 
[  111.421012]  [<ffffffff81da6718>] default_setup_apic_routing+0x28/0x69 
[  111.428293]  [<ffffffff81d9fff1>] native_smp_prepare_cpus+0x223/0x2d2 
[  111.435477]  [<ffffffff81d8a293>] kernel_init_freeable+0xd8/0x249 
[  111.442274]  [<ffffffff816f274e>] kernel_init+0xe/0x110 
[  111.448101]  [<ffffffff816ff8ff>] ret_from_fork+0x1f/0x40 
[  111.454121]  [<ffffffff816f2740>] ? rest_init+0x80/0x80 
[  111.459947] Code: 89 e5 41 57 41 56 49 89 f6 41 55 41 89 fd 41 54 49 89 d4 53 48 83 ec 10 89 7d d4 9c 58 0f 1f 44 00 00 f6 c4 02 0f 85 0d 03 00 00 <41> f6 86 02 01 00 00 01 0f 85 ae 02 00 00 49 c7 c7 18 41 01 00  
[  111.481601] RIP  [<ffffffff810a4692>] __queue_work+0x32/0x420 
[  111.488018]  RSP <ffff88023fc03de8> 
[  111.491906] CR2: 0000000000000102 
[  111.495599] ---[ end trace 4515b29a27d62396 ]--- 
[  111.500746] Kernel panic - not syncing: Fatal exception in interrupt 
[  111.507846] ---[ end Kernel panic - not syncing: Fatal exception in interrupt 
[-- MARK -- Tue Aug  2 09:25:00 2016] 
[-- MARK -- Tue Aug  2 09:30:00 2016] 
[-- MARK -- Tue Aug  2 09:35:01 2016]
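
Editor's note on reading the two oopses above. The following is a minimal sketch, not part of the original report, for decoding the `Oops: NNNN` error code and pulling the faulting instruction bytes out of the `Code:` line. The function names `decode_oops_code` and `faulting_bytes` are hypothetical helpers; the bit meanings are the standard x86 page-fault error code bits. Under that reading, `Oops: 0002` is a kernel-mode write to a not-present page, which agrees with CR2 and RAX both being 0 in the first trace. The bytes starting at the `<f0>` marker (`f0 48 0f ab 10`) appear to decode to `lock bts %rdx,(%rax)`, i.e. an atomic bit set through a NULL pointer. The second oops (`0000`, a read at 0x102) occurs in interrupt context while the first panic is still spinning in its delay loop, so it looks like a follow-on fault rather than a separate bug.

```python
import re

# x86 page-fault error code bits as printed in the "Oops: NNNN" line.
# Each entry: (bit mask, meaning if set, meaning if clear or None).
PF_BITS = [
    (0x01, "protection violation", "page not present"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "kernel mode"),
    (0x08, "reserved bit set", None),
    (0x10, "instruction fetch", None),
]


def decode_oops_code(line):
    """Return human-readable facts for a console line containing 'Oops: NNNN'."""
    m = re.search(r"Oops: ([0-9a-fA-F]{4})", line)
    if m is None:
        raise ValueError("no Oops error code in line")
    code = int(m.group(1), 16)
    facts = []
    for bit, if_set, if_clear in PF_BITS:
        if code & bit:
            facts.append(if_set)
        elif if_clear is not None:
            facts.append(if_clear)
    return facts


def faulting_bytes(code_line, count=5):
    """Return `count` hex bytes starting at the <..> marker of a 'Code:' line.

    The marked byte is the first byte at the faulting RIP.
    """
    _, sep, rest = code_line.partition("Code:")
    if not sep:
        raise ValueError("not a Code: line")
    tokens = rest.split()
    for i, tok in enumerate(tokens):
        if tok.startswith("<") and tok.endswith(">"):
            return [tok.strip("<>")] + tokens[i + 1:i + count]
    raise ValueError("no <..> marker in Code: line")
```

Feeding the `Oops:` and `Code:` lines of each trace through these helpers gives a quick first classification before reaching for a disassembler; the instruction decoding itself still needs a tool such as objdump against the matching vmlinux.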
Comment 1 JianhongYin 2016-08-03 01:51:18 UTC
Hardware info:

1.---------------------------------------------------------
## CPU
Vendor 	GenuineIntel
Model Name 	Intel(R) Xeon(R) CPU E3-1220 V2 @ 3.10GHz
Family 	6
Model 	58
Stepping 	9
Speed 	1600.0
Processors 	4
Cores 	4
Sockets 	1
Hyper 	False
Flags 	fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm arat xsaveopt pln pts dts tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms
Arch(s) 	i386 x86_64 

## System
Host Hypervisor 	(not virtualized)
Vendor 	ibm
Model 	x3250 m4
Serial Number 	KQ2LB50
MAC Address 	40:F2:E9:32:7B:0C
Memory 	7861 MB
NUMA Nodes 	1

2.-------------------------------------------------
## CPU
Vendor 	GenuineIntel
Model Name 	Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz
Family 	6
Model 	63
Stepping 	2
Speed 	1200.0
Processors 	72
Cores 	36
Sockets 	2
Hyper 	True
Flags 	fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm ida arat epb xsaveopt pln pts dts tpr_shadow vnmi flexpriority ept vpid fsgsbase bmi1 avx2 smep bmi2 erms invpcid
Arch(s) 	x86_64 

## System
Host Hypervisor 	(not virtualized)
Vendor 	HP
Model 	HP Proliant XL190R Gen9
Serial Number 	2M25270ZD1
MAC Address 	50:65:F3:66:97:04
Memory 	31978 MB
NUMA Nodes 	2

3. ------------------------------------------------
## CPU
Vendor 	GenuineIntel
Model Name 	Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz
Family 	6
Model 	45
Stepping 	7
Speed 	1200.0
Processors 	12
Cores 	6
Sockets 	1
Hyper 	True
Flags 	fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm ida arat epb xsaveopt pln pts dts tpr_shadow vnmi flexpriority ept vpid
Arch(s) 	i386 x86_64 

## System
Host Hypervisor 	(not virtualized)
Vendor 	IBM
Model 	IBM System x3550 M4 Server -[7914I4Q]- 09
Serial Number 	
MAC Address 	6C:AE:8B:51:44:AA
Memory 	15893 MB
NUMA Nodes 	1
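
Editor's note: all three machines listed above advertise the `x2apic` CPU flag. A minimal sketch, assuming input in the `/proc/cpuinfo` format (a `flags : ...` line), for checking whether another machine falls into the same group. The helper names `cpu_flags` and `has_x2apic` are hypothetical and not part of the original report.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "flags":
            return set(value.split())
    return set()


def has_x2apic(cpuinfo_text):
    """True if the CPU advertises x2APIC support in its flags line."""
    return "x2apic" in cpu_flags(cpuinfo_text)
```

On a live system this could be called as `has_x2apic(open("/proc/cpuinfo").read())`. The flag only shows that the CPU supports x2APIC; whether the kernel actually enabled it is indicated by the `x2apic enabled` line in the boot log.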
Comment 2 Otto Sabart 2016-08-10 09:17:32 UTC
I am facing exactly the same problem:

$ uname -a 
Linux blesk-01.----.com 3.10.0-327.el7.x86_64 #1 SMP Thu Oct 29 17:29:29 EDT 2015 x86_64 x86_64 x86_64 GNU/Linux

Attaching my logs.
Comment 3 Otto Sabart 2016-08-10 09:20:08 UTC
Created attachment 228131
panic.log

My panic log.
Comment 4 Otto Sabart 2016-08-10 09:20:28 UTC
Created attachment 228141
lscpu
Comment 5 Otto Sabart 2016-08-10 09:20:50 UTC
Created attachment 228151
lspci
Comment 6 Sebastian A. Siewior 2016-08-11 14:18:20 UTC
This should fix it.
http://marc.info/?l=linux-kernel&m=147092454024041
Comment 7 Otto Sabart 2016-08-16 08:25:45 UTC
Cannot reproduce it on v4.8-rc2 anymore.
Comment 8 JianhongYin 2016-10-26 03:35:14 UTC
(In reply to Otto Sabart from comment #7)
> Cannot reproduce it on v4.8-rc2 anymore.

Yes, thanks Otto. Moving status to RESOLVED.
