Bug 189851

Summary: icmp6_send nullpointer panic
Product: Networking
Reporter: Florian Pritz (bluewind)
Component: IPV6
Assignee: Hideaki YOSHIFUJI (yoshfuji)
Status: RESOLVED PATCH_ALREADY_AVAILABLE
Severity: normal
CC: hannes, jan.steffens
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 4.8.12
Subsystem:
Regression: No
Bisected commit-id:

Description Florian Pritz 2016-12-08 09:26:02 UTC
I don't know what caused it, but I happened to catch this panic a few minutes after upgrading a machine to kernel 4.8.12. I don't know if this is a regression from previous kernels, but I haven't seen this bug previously. 

I'm running Arch Linux with the 4.8.12-2 linux package.

If you need any more information, please tell me.

Florian


[  912.759191] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
[  912.853169] IP: [<ffffffff815b47a5>] icmp6_send+0x1e5/0xa20
[  912.919981] PGD 4f6be5067 PUD 411657067 PMD 0
[  912.973484] Oops: 0000 [#1] PREEMPT SMP
[  913.019371] Modules linked in: ipmi_ssif joydev mousedev input_leds mac_hid iTCO_wdt iTCO_vendor_support hid_generic ppdev ipmi_devintf intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp mgag200 kvm_intel ttm kvm irqbypass crct10dif_pclmul drm_kms_helper crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw drm gf128mul igb glue_helper ablk_helper cryptd intel_cstate intel_rapl_perf syscopyarea sysfillrect ptp sysimgblt fb_sys_fops pps_core dca i2c_algo_bit i2c_i801 usbhid winbond_cir i2c_smbus ie31200_edac hid acpi_als shpchp edac_core lpc_ich kfifo_buf rc_core thermal led_class industrialio battery parport_pc fan dptf_power int3403_thermal parport serio ipmi_si ipmi_msghandler int3406_thermal video fjes int3402_thermal processor_thermal_device int3400_thermal int340x_thermal_zone intel_soc_dts_iosf ac acpi_thermal_rel tpm_tis tpm_tis_core tpm sch_fq_codel ip_tables x_tables ext4 crc16 jbd2 fscrypto mbcache dm_mod sd_mod ahci crc32c_intel libahci libata xhci_pci ehci_pci xhci_hcd ehci_hcd scsi_mod usbcore usb_common raid1 md_mod
[  914.136748] CPU: 5 PID: 0 Comm: swapper/5 Not tainted 4.8.12-2-ARCH #1
[  914.214887] Hardware name: Intel Corporation S1200RP/S1200RP, BIOS S1200RP.86B.01.04.0002.011020141517 01/10/2014
[  914.337767] task: ffff88081bb6aac0 task.stack: ffff88081bb70000
[  914.408626] RIP: 0010:[<ffffffff815b47a5>]  [<ffffffff815b47a5>] icmp6_send+0x1e5/0xa20
[  914.504568] RSP: 0018:ffff88081f343c90  EFLAGS: 00010246
[  914.568145] RAX: 0000000000000000 RBX: ffff8804f916c800 RCX: 0000000000000020
[  914.653570] RDX: 0000000000000001 RSI: ffff8803b56028e6 RDI: ffff8803b56028d6
[  914.738992] RBP: ffff88081f343dc8 R08: 0000000000000000 R09: ffff880818584000
[  914.824417] R10: 0000000000000001 R11: 0000000000000000 R12: ffff8803b56028ce
[  914.909841] R13: ffffffff81abc440 R14: 0000000000000000 R15: 0000000000000003
[  914.995267] FS:  0000000000000000(0000) GS:ffff88081f340000(0000) knlGS:0000000000000000
[  915.092133] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  915.160911] CR2: 0000000000000018 CR3: 000000068a225000 CR4: 00000000001406e0
[  915.246331] Stack:
[  915.270377]  0000000000000000 0000000000000000 0000000000000000 0000000000000000
[  915.359356]  0000000000000000 0000000000000000 0000000000000000 ffff8803b56028e6
[  915.448331]  ffff8803b56028d6 0000000000000001 0000000000000001 0000000000000001
[  915.537301] Call Trace:
[  915.566541]  <IRQ>
[  915.589541]  [<ffffffff810b9a53>] ? load_balance+0x193/0xa10
[  915.659451]  [<ffffffff81036e29>] ? sched_clock+0x9/0x10
[  915.723032]  [<ffffffff815d09ae>] icmpv6_send+0x3e/0x50
[  915.785572]  [<ffffffff815baf20>] ? ip6_expire_frag_queue+0x100/0x100
[  915.862678]  [<ffffffff815baf1b>] ip6_expire_frag_queue+0xfb/0x100
[  915.936662]  [<ffffffff815baf46>] ip6_frag_expire+0x26/0x30
[  916.003361]  [<ffffffff810eb675>] call_timer_fn+0x35/0x150
[  916.069018]  [<ffffffff815baf20>] ? ip6_expire_frag_queue+0x100/0x100
[  916.146121]  [<ffffffff810eb843>] expire_timers+0xb3/0x140
[  916.211783]  [<ffffffff810eba29>] run_timer_softirq+0x89/0xe0
[  916.280569]  [<ffffffff81052356>] ? lapic_next_deadline+0x26/0x30
[  916.353515]  [<ffffffff810fa24f>] ? clockevents_program_event+0x7f/0x120
[  916.433740]  [<ffffffff815fab8d>] __do_softirq+0x10d/0x2cd
[  916.500420]  [<ffffffff81081e73>] irq_exit+0xa3/0xb0
[  916.560850]  [<ffffffff815fa962>] smp_apic_timer_interrupt+0x42/0x50
[  916.637941]  [<ffffffff815f9c72>] apic_timer_interrupt+0x82/0x90
[  916.710850]  <EOI>
[  916.733858]  [<ffffffff814a50c4>] ? cpuidle_enter_state+0x134/0x2e0
[  916.813050]  [<ffffffff814a52a7>] cpuidle_enter+0x17/0x20
[  916.878664]  [<ffffffff810c09ba>] call_cpuidle+0x2a/0x50
[  916.942903]  [<ffffffff810c0dc5>] cpu_startup_entry+0x2c5/0x380
[  917.014252]  [<ffffffff81050e88>] start_secondary+0x158/0x1a0
[  917.083507] Code: 8b 44 24 48 75 60 f6 c2 02 74 05 f6 c2 30 75 56 48 8b 43 58 4c 89 44 24 28 44 89 5c 24 30 44 89 54 24 48 89 54 24 50 48 83 e0 fe <48> 8b 78 18 48 89 7c 24 58 e8 0d ce b2 ff 48 8b 7c 24 58 e8 f3
[  917.317902] RIP  [<ffffffff815b47a5>] icmp6_send+0x1e5/0xa20
[  917.386220]  RSP <ffff88081f343c90>
[  917.428387] CR2: 0000000000000018
[  917.470370] ---[ end trace bbf64fb62667b164 ]---
[  917.560296] Kernel panic - not syncing: Fatal exception in interrupt
[  917.637043] Kernel Offset: disabled
[  917.713689] ---[ end Kernel panic - not syncing: Fatal exception in interrupt
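
For readers following the trace: the `Code:` line marks the faulting instruction with `<48>`, and the bytes `48 8b 78 18` can be decoded by hand. The sketch below is only an illustration, not a general disassembler. It handles just the one encoding seen here (REX.W `8B /r` with an 8-bit displacement), and the helper names are made up for this example.

```python
def faulting_bytes(code_line, count=4):
    """Return `count` bytes starting at the byte the kernel marked with <..>."""
    tokens = code_line.split()
    start = next(i for i, t in enumerate(tokens) if t.startswith("<"))
    return [int(t.strip("<>"), 16) for t in tokens[start:start + count]]


def decode_mov_load_disp8(b):
    """Decode 'mov r64, [r64 + disp8]' (REX.W 8B /r, ModRM mod=01, no SIB)."""
    rex, opcode, modrm, disp = b
    if rex != 0x48 or opcode != 0x8B:
        raise ValueError("only REX.W + 8B is handled in this sketch")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 1 or rm == 4:
        raise ValueError("only the disp8 form without a SIB byte is handled")
    regs = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]
    if disp >= 0x80:          # disp8 is a signed byte
        disp -= 0x100
    return regs[reg], regs[rm], disp


# Tail of the "Code:" line from the oops above.
CODE_TAIL = "48 83 e0 fe <48> 8b 78 18 48 89 7c 24 58"

dst, base, disp = decode_mov_load_disp8(faulting_bytes(CODE_TAIL))
print(f"mov {dst}, [{base}+{disp:#x}]")   # prints: mov rdi, [rax+0x18]
```

With RAX = 0 in the register dump, this load reads address 0x18, which matches CR2 and the "NULL pointer dereference at 0000000000000018" message.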

Comment 1 hannes 2016-12-08 13:15:43 UTC
Hello, and thanks for your report!

Can you check if your kernel config has CONFIG_NET_L3_MASTER_DEV enabled? Do you require it? If not, can you disable it and check if the problem is solved?
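
One way to do the check asked for above is a small script that looks the option up in the kernel config text. This is a minimal sketch: the function name and the sample text are made up for illustration, and it assumes the usual `CONFIG_FOO=y` / `# CONFIG_FOO is not set` file format.

```python
import re


def kconfig_state(config_text, option):
    """Return the value assigned to `option` ('y', 'm', ...) or None if it is unset or absent."""
    pattern = re.compile(r"^%s=(.*)$" % re.escape(option), re.MULTILINE)
    match = pattern.search(config_text)
    return match.group(1) if match else None


# Hypothetical excerpt; a real config file has thousands of lines.
SAMPLE_CONFIG = """\
CONFIG_NET_L3_MASTER_DEV=y
# CONFIG_EXAMPLE_OPTION is not set
"""

print(kconfig_state(SAMPLE_CONFIG, "CONFIG_NET_L3_MASTER_DEV"))  # prints: y
print(kconfig_state(SAMPLE_CONFIG, "CONFIG_EXAMPLE_OPTION"))     # prints: None
```

On a running system the config text is commonly available as /proc/config.gz (when the kernel was built with CONFIG_IKCONFIG_PROC, readable with `gzip.open`) or as a config file under /boot; which of these exists depends on the distribution.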

Comment 2 hannes 2016-12-08 13:37:19 UTC
Probably fixed by:

commit 79dc7e3f1cd323be4c81aa1a94faa1b3ed987fb2
Author: David Ahern <dsa@cumulusnetworks.com>
Date:   Sun Nov 27 18:52:53 2016 -0800

    net: handle no dst on skb in icmp6_send

$ git describe --contains 79dc7e3f1cd323be4c81aa1a94faa1b3ed987fb2
v4.9-rc8~5^2~44

Comment 3 Jan Steffens 2016-12-08 13:40:16 UTC
Seems I can reliably crash said machines running 4.8.12-2 by sending them incomplete fragmented IPv6 packets. The kernels do indeed have NET_L3_MASTER_DEV enabled.

Comment 4 Florian Pritz 2016-12-08 14:50:46 UTC
Thanks for the quick reply. That commit (79dc7e3f1cd323be4c81aa1a94faa1b3ed987fb2) does indeed fix the issue.