Bug 78121 - WARNING at inet_csk_destroy_sock
Summary: WARNING at inet_csk_destroy_sock
Status: NEW
Alias: None
Product: Networking
Classification: Unclassified
Component: Other
Hardware: All Linux
Importance: P1 normal
Assignee: Stephen Hemminger
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-06-16 15:34 UTC by Andrey Rahmatullin
Modified: 2016-02-15 20:36 UTC
CC List: 3 users

See Also:
Kernel Version: 3.14.5
Subsystem:
Regression: No
Bisected commit-id:


Attachments
Patch Test Verison (868 bytes, patch)
2014-06-17 19:07 UTC, xerofoify

Description Andrey Rahmatullin 2014-06-16 15:34:12 UTC
I can consistently reproduce a WARNING when downloading a torrent:

[37642.054384] WARNING: CPU: 0 PID: 0 at /build/linux-Lu66Tp/linux-3.14.5/net/core/stream.c:201 inet_csk_destroy_sock+0x4d/0x130()
[37642.054395] Modules linked in: cpufreq_stats cpufreq_userspace cpufreq_powersave cpufreq_conservative nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc bridge stp llc usb_storage snd_hda_codec_hdmi joydev hid_generic ext4 crc16 mbcache jbd2 x86_pkg_temp_thermal intel_powerclamp nls_utf8 nls_cp437 vfat fat snd_hda_codec_realtek snd_hda_codec_generic arc4 intel_rapl coretemp kvm_intel rt61pci kvm rt2x00pci rt2x00mmio rt2x00lib eeprom_93cx6 mac80211 crc32_pclmul crc32c_intel cfg80211 rfkill crc_itu_t ghash_clmulni_intel mei_me mei snd_hda_intel ehci_pci snd_hda_codec snd_hwdep snd_pcm_oss aesni_intel aes_x86_64 snd_mixer_oss lrw snd_pcm gf128mul glue_helper ablk_helper cryptd efi_pstore pcspkr efivars serio_raw lpc_ich r8169 mfd_core mii snd_timer snd battery soundcore evdev xhci_hcd processor nouveau mxm_wmi wmi video button ttm drm_kms_helper drm i2c_algo_bit i2c_core dm_snapshot dm_bufio nct6775 hwmon_vid loop fuse autofs4 xfs crc32c libcrc32c dm_mod usbhid hid ehci_hcd usbcore usb_common sd_mod crc_t10dif crct10dif_generic ahci libahci crct10dif_pclmul crct10dif_common libata scsi_mod thermal fan thermal_sys
[37642.054489] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.14-1-amd64 #1 Debian 3.14.5-1
[37642.054490] Hardware name: System manufacturer System Product Name/P8Z77-V LE, BIOS 1104 03/18/2014
[37642.054493]  0000000000000009 ffffffff814ba272 0000000000000000 ffffffff8105f0b2
[37642.054508]  ffff8800ac62d140 ffff88029b740680 ffff8801c4483f22 0000000000000000
[37642.054516]  ffff8801c4483f0e ffffffff81414f0d ffff8800ac62d140 ffffffff81421baf
[37642.054519] Call Trace:
[37642.054526]  <IRQ>  [<ffffffff814ba272>] ? dump_stack+0x41/0x51
[37642.054531]  [<ffffffff8105f0b2>] ? warn_slowpath_common+0x72/0x90
[37642.054535]  [<ffffffff81414f0d>] ? inet_csk_destroy_sock+0x4d/0x130
[37642.054545]  [<ffffffff81421baf>] ? tcp_rcv_state_process+0x23f/0xdd0
[37642.054552]  [<ffffffff8109e2e8>] ? __wake_up_sync_key+0x38/0x60
[37642.054563]  [<ffffffff8142aaa4>] ? tcp_v4_do_rcv+0x224/0x480
[37642.054571]  [<ffffffff8142cb7f>] ? tcp_v4_rcv+0x75f/0x780
[37642.054575]  [<ffffffff814121fa>] ? __inet_lookup_established+0x3a/0x120
[37642.054581]  [<ffffffff81409246>] ? ip_local_deliver_finish+0x96/0x1f0
[37642.054586]  [<ffffffff813d58d7>] ? __netif_receive_skb_core+0x627/0x800
[37642.054591]  [<ffffffff813d5b1a>] ? netif_receive_skb_internal+0x1a/0x80
[37642.054599]  [<ffffffffa07bf51a>] ? br_handle_frame_finish+0x1ba/0x3b0 [bridge]
[37642.054606]  [<ffffffffa07c5c49>] ? br_nf_pre_routing_finish+0x179/0x390 [bridge]
[37642.054612]  [<ffffffffa07c60eb>] ? br_nf_pre_routing+0x28b/0x630 [bridge]
[37642.054625]  [<ffffffffa07bf360>] ? br_handle_local_finish+0x60/0x60 [bridge]
[37642.054629]  [<ffffffff81402f7d>] ? nf_iterate+0x5d/0x90
[37642.054635]  [<ffffffffa07bf360>] ? br_handle_local_finish+0x60/0x60 [bridge]
[37642.054639]  [<ffffffff8140301e>] ? nf_hook_slow+0x6e/0x130
[37642.054648]  [<ffffffffa07bf360>] ? br_handle_local_finish+0x60/0x60 [bridge]
[37642.054654]  [<ffffffffa07bf878>] ? br_handle_frame+0x168/0x220 [bridge]
[37642.054660]  [<ffffffff813d54fc>] ? __netif_receive_skb_core+0x24c/0x800
[37642.054666]  [<ffffffff8101a9f5>] ? read_tsc+0x5/0x20
[37642.054670]  [<ffffffff813d5b1a>] ? netif_receive_skb_internal+0x1a/0x80
[37642.054673]  [<ffffffff813d656d>] ? napi_gro_receive+0x6d/0xd0
[37642.054682]  [<ffffffffa045d5ca>] ? rtl8169_poll+0x17a/0x690 [r8169]
[37642.054690]  [<ffffffff813d5e88>] ? net_rx_action+0x138/0x240
[37642.054695]  [<ffffffff8106401a>] ? __do_softirq+0xfa/0x2a0
[37642.054700]  [<ffffffff810643e5>] ? irq_exit+0x95/0xa0
[37642.054710]  [<ffffffff81014fcd>] ? do_IRQ+0x4d/0xe0
[37642.054714]  [<ffffffff814bfded>] ? common_interrupt+0x6d/0x6d
[37642.054719]  <EOI>  [<ffffffff8139b5fd>] ? cpuidle_enter_state+0x4d/0xc0
[37642.054721]  [<ffffffff8139b5f3>] ? cpuidle_enter_state+0x43/0xc0
[37642.054725]  [<ffffffff8139b719>] ? cpuidle_idle_call+0xa9/0x1d0
[37642.054729]  [<ffffffff8101c695>] ? arch_cpu_idle+0x5/0x30
[37642.054731]  [<ffffffff810af425>] ? cpu_startup_entry+0x95/0x230
[37642.054735]  [<ffffffff818c8efc>] ? start_kernel+0x41d/0x428
[37642.054739]  [<ffffffff818c8904>] ? repair_env_string+0x58/0x58
[37642.054743]  [<ffffffff818c8120>] ? early_idt_handlers+0x120/0x120
[37642.054747]  [<ffffffff818c871f>] ? x86_64_start_kernel+0x14d/0x15c
[37642.054748] ---[ end trace 4dd3c5527de09178 ]---
[37642.054775] ------------[ cut here ]------------
[37642.054779] WARNING: CPU: 0 PID: 0 at /build/linux-Lu66Tp/linux-3.14.5/net/ipv4/af_inet.c:153 inet_sock_destruct+0x1c5/0x1d0()
[37642.054781] Modules linked in: cpufreq_stats cpufreq_userspace cpufreq_powersave cpufreq_conservative nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc bridge stp llc usb_storage snd_hda_codec_hdmi joydev hid_generic ext4 crc16 mbcache jbd2 x86_pkg_temp_thermal intel_powerclamp nls_utf8 nls_cp437 vfat fat snd_hda_codec_realtek snd_hda_codec_generic arc4 intel_rapl coretemp kvm_intel rt61pci kvm rt2x00pci rt2x00mmio rt2x00lib eeprom_93cx6 mac80211 crc32_pclmul crc32c_intel cfg80211 rfkill crc_itu_t ghash_clmulni_intel mei_me mei snd_hda_intel ehci_pci snd_hda_codec snd_hwdep snd_pcm_oss aesni_intel aes_x86_64 snd_mixer_oss lrw snd_pcm gf128mul glue_helper ablk_helper cryptd efi_pstore pcspkr efivars serio_raw lpc_ich r8169 mfd_core mii snd_timer snd battery soundcore evdev xhci_hcd processor nouveau mxm_wmi wmi video button ttm drm_kms_helper drm i2c_algo_bit i2c_core dm_snapshot dm_bufio nct6775 hwmon_vid loop fuse autofs4 xfs crc32c libcrc32c dm_mod usbhid hid ehci_hcd usbcore usb_common sd_mod crc_t10dif crct10dif_generic ahci libahci crct10dif_pclmul crct10dif_common libata scsi_mod thermal fan thermal_sys
[37642.054862] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W    3.14-1-amd64 #1 Debian 3.14.5-1
[37642.054863] Hardware name: System manufacturer System Product Name/P8Z77-V LE, BIOS 1104 03/18/2014
[37642.054865]  0000000000000009 ffffffff814ba272 0000000000000000 ffffffff8105f0b2
[37642.054871]  ffff8800ac62d140 ffff8800ac62d2b8 ffff8800ac62d140 ffff8800ac62d1b0
[37642.054873]  ffff8801c4483f0e ffffffff8143f0d5 ffff8800ac62d140 0000000000000000
[37642.054877] Call Trace:
[37642.054882]  <IRQ>  [<ffffffff814ba272>] ? dump_stack+0x41/0x51
[37642.054885]  [<ffffffff8105f0b2>] ? warn_slowpath_common+0x72/0x90
[37642.054888]  [<ffffffff8143f0d5>] ? inet_sock_destruct+0x1c5/0x1d0
[37642.054893]  [<ffffffff813c1635>] ? __sk_free+0x15/0x140
[37642.054897]  [<ffffffff8142ca64>] ? tcp_v4_rcv+0x644/0x780
[37642.054900]  [<ffffffff814121fa>] ? __inet_lookup_established+0x3a/0x120
[37642.054904]  [<ffffffff81409246>] ? ip_local_deliver_finish+0x96/0x1f0
[37642.054912]  [<ffffffff813d58d7>] ? __netif_receive_skb_core+0x627/0x800
[37642.054915]  [<ffffffff813d5b1a>] ? netif_receive_skb_internal+0x1a/0x80
[37642.054921]  [<ffffffffa07bf51a>] ? br_handle_frame_finish+0x1ba/0x3b0 [bridge]
[37642.054925]  [<ffffffffa07c5c49>] ? br_nf_pre_routing_finish+0x179/0x390 [bridge]
[37642.054933]  [<ffffffffa07c60eb>] ? br_nf_pre_routing+0x28b/0x630 [bridge]
[37642.054940]  [<ffffffffa07bf360>] ? br_handle_local_finish+0x60/0x60 [bridge]
[37642.054949]  [<ffffffff81402f7d>] ? nf_iterate+0x5d/0x90
[37642.054955]  [<ffffffffa07bf360>] ? br_handle_local_finish+0x60/0x60 [bridge]
[37642.054964]  [<ffffffff8140301e>] ? nf_hook_slow+0x6e/0x130
[37642.054970]  [<ffffffffa07bf360>] ? br_handle_local_finish+0x60/0x60 [bridge]
[37642.054976]  [<ffffffffa07bf878>] ? br_handle_frame+0x168/0x220 [bridge]
[37642.054986]  [<ffffffff813d54fc>] ? __netif_receive_skb_core+0x24c/0x800
[37642.054991]  [<ffffffff8101a9f5>] ? read_tsc+0x5/0x20
[37642.054999]  [<ffffffff813d5b1a>] ? netif_receive_skb_internal+0x1a/0x80
[37642.055001]  [<ffffffff813d656d>] ? napi_gro_receive+0x6d/0xd0
[37642.055007]  [<ffffffffa045d5ca>] ? rtl8169_poll+0x17a/0x690 [r8169]
[37642.055011]  [<ffffffff813d5e88>] ? net_rx_action+0x138/0x240
[37642.055015]  [<ffffffff8106401a>] ? __do_softirq+0xfa/0x2a0
[37642.055018]  [<ffffffff810643e5>] ? irq_exit+0x95/0xa0
[37642.055024]  [<ffffffff81014fcd>] ? do_IRQ+0x4d/0xe0
[37642.055027]  [<ffffffff814bfded>] ? common_interrupt+0x6d/0x6d
[37642.055034]  <EOI>  [<ffffffff8139b5fd>] ? cpuidle_enter_state+0x4d/0xc0
[37642.055037]  [<ffffffff8139b5f3>] ? cpuidle_enter_state+0x43/0xc0
[37642.055042]  [<ffffffff8139b719>] ? cpuidle_idle_call+0xa9/0x1d0
[37642.055046]  [<ffffffff8101c695>] ? arch_cpu_idle+0x5/0x30
[37642.055050]  [<ffffffff810af425>] ? cpu_startup_entry+0x95/0x230
[37642.055052]  [<ffffffff818c8efc>] ? start_kernel+0x41d/0x428
[37642.055056]  [<ffffffff818c8904>] ? repair_env_string+0x58/0x58
[37642.055062]  [<ffffffff818c8120>] ? early_idt_handlers+0x120/0x120
[37642.055067]  [<ffffffff818c871f>] ? x86_64_start_kernel+0x14d/0x15c
[37642.055070] ---[ end trace 4dd3c5527de09179 ]---
Comment 1 xerofoify 2014-06-17 19:07:58 UTC
Created attachment 140091
Patch Test Verison
Comment 2 xerofoify 2014-06-17 19:08:41 UTC
This is the patch that I think may fix it. Please test it and let
me know if it fixes it.
Cheers, Nick
Comment 3 Andrey Rahmatullin 2014-06-17 19:24:43 UTC
(In reply to xerofoify from comment #2)
> This is the patch that I think may fix it. Please test it and let
> me know if it fixes it.
> Cheers, Nick

Doesn't look like a "fix".
Comment 4 xerofoify 2014-06-18 14:38:39 UTC
	WARN_ON(sk->sk_wmem_queued);
	WARN_ON(sk->sk_forward_alloc);
These are the two lines you are hitting, based on your trace. Both
check for non-zero values, which is most likely what gives you the
above warnings: WARN_ON prints the data shown in your log. Again,
please try the patch; it may work.
Thanks, Nick
Comment 5 Ben Hutchings 2014-06-18 16:35:33 UTC
Nick, you're really not helping.  The assertions are valid; the bug is probably either (1) sk_forward_alloc is not always being decremented when the associated memory is freed, or (2) some associated memory is not being freed.
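The invariant behind those assertions can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: struct model_sock, MODEL_QUANTUM, MODEL_WARN_ON and every model_* function are hypothetical stand-ins that mimic the idea of charging queued buffers against a forward allocation and returning whole quanta on reclaim. When every charged buffer is uncharged, both counters reach zero at destroy time; when one buffer is never uncharged, both checks fire.

```c
#include <assert.h>
#include <stdio.h>

/* Userspace stand-in for WARN_ON(): report the condition and keep running.
 * Evaluates to 1 when the condition was true, 0 otherwise. */
#define MODEL_WARN_ON(cond) \
	((cond) ? (fprintf(stderr, "WARNING: %s\n", #cond), 1) : 0)

/* Hypothetical accounting granularity (the kernel accounts in pages). */
#define MODEL_QUANTUM 4096

/* Heavily simplified socket: only the two counters under discussion. */
struct model_sock {
	int wmem_queued;   /* bytes charged to queued write buffers */
	int forward_alloc; /* bytes reserved in advance but not yet used */
};

/* Charge a buffer: reserve whole quanta as needed, then consume from them. */
static void model_charge(struct model_sock *sk, int size)
{
	while (sk->forward_alloc < size)
		sk->forward_alloc += MODEL_QUANTUM;
	sk->forward_alloc -= size;
	sk->wmem_queued += size;
}

/* Free a buffer: its bytes go back into the forward allocation. */
static void model_uncharge(struct model_sock *sk, int size)
{
	assert(size <= sk->wmem_queued);
	sk->wmem_queued -= size;
	sk->forward_alloc += size;
}

/* Return whole quanta to the global pool, keeping only the remainder. */
static void model_reclaim(struct model_sock *sk)
{
	sk->forward_alloc %= MODEL_QUANTUM;
}

/* Destroy path: reclaim, then check that nothing is still accounted.
 * Returns the number of checks that fired. */
static int model_destroy(struct model_sock *sk)
{
	int warnings = 0;

	model_reclaim(sk);
	warnings += MODEL_WARN_ON(sk->wmem_queued);
	warnings += MODEL_WARN_ON(sk->forward_alloc);
	return warnings;
}

/* Run one scenario; leak != 0 skips the uncharge of the second buffer. */
static int model_run(int leak)
{
	struct model_sock sk = { 0, 0 };

	model_charge(&sk, 1500);
	model_charge(&sk, 3000);
	model_uncharge(&sk, 1500);
	if (!leak)
		model_uncharge(&sk, 3000);
	return model_destroy(&sk);
}
```

In this model, model_run(0) returns 0 because the counters balance, while model_run(1) returns 2: the leaked 3000 bytes leave wmem_queued non-zero and leave a partial quantum in forward_alloc that reclaim cannot return. The checks themselves only report the imbalance; the cause is in whichever path skipped the uncharge or the free.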
Comment 6 xerofoify 2014-06-18 16:39:27 UTC
Fair enough. Now that I look at my patch, you are correct. I will see if I can find out where your memory errors are coming from.
Thanks, Nick
Comment 7 xerofoify 2014-06-18 21:53:48 UTC
Weird, the allocations seem to be correct in the file.
	sk_mem_reclaim(sk);
Can you add a printk after this line? It seems it may be reclaiming
memory in error, as the WARN_ONs before it are not producing any
warnings based on your traceback.
Nick
