Bug 161841 - BUG: soft lockup - CPU#3 stuck for 22s! we are facing this issue repeatedly
Summary: BUG: soft lockup - CPU#3 stuck for 22s! we are facing this issue repeatedly
Status: NEW
Alias: None
Product: Platform Specific/Hardware
Classification: Unclassified
Component: x86-64
Hardware: All Linux
Importance: P1 high
Assignee: platform_x86_64@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-09-17 07:18 UTC by mahesh
Modified: 2017-05-12 09:43 UTC

See Also:
Kernel Version: 3.10.0-123.el7.x86_64
Subsystem:
Regression: No
Bisected commit-id:



Description mahesh 2016-09-17 07:18:07 UTC
Hello All,

We are running a number of VMs on an ESXi host, managed with the vSphere Client, and we hit this issue repeatedly: BUG: soft lockup - CPU#3 stuck for 22s! When it occurs, the server becomes very slow and we cannot even connect over SSH. Could someone please help us resolve this issue?


Sep 14 08:05:45  kernel: BUG: soft lockup - CPU#3 stuck for 27s! [scanner:20203]
Sep 14 08:05:45  kernel: Modules linked in: fuse btrfs zlib_deflate raid6_pq xor vfat msdos fat xfs libcrc32c bridge stp llc nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache binfmt_misc sg ppdev vmw_balloon coretemp crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd serio_raw pcspkr vmw_vmci parport_pc shpchp parport mperf i2c_piix4 nfsd auth_rpcgss nfs_acl lockd sunrpc ext4 mbcache jbd2 sd_mod sr_mod cdrom crc_t10dif ata_generic crct10dif_common pata_acpi vmwgfx ttm drm vmxnet3 ata_piix libata vmw_pvscsi i2c_core floppy dm_mirror dm_region_hash dm_log dm_mod
Sep 14 08:05:45  kernel: CPU: 3 PID: 20203 Comm: scanner Not tainted 3.10.0-123.el7.x86_64 #1
Sep 14 08:05:45  kernel: Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 10/22/2013
Sep 14 08:05:45  kernel: task: ffff88041f11db00 ti: ffff8802fc7ba000 task.ti: ffff8802fc7ba000
Sep 14 08:05:45  kernel: RIP: 0033:[<000000000041dd80>]  [<000000000041dd80>] 0x41dd7f
Sep 14 08:05:46  kernel: RSP: 002b:00007fffd2c72eb0  EFLAGS: 00010202
Sep 14 08:05:46  kernel: RAX: 0000000000000000 RBX: 000000000000fe2e RCX: 0000000000000001
Sep 14 08:05:46  kernel: RDX: 000000000006d95e RSI: 0000000000000000 RDI: 0000000000c74130
Sep 14 08:05:46  kernel: RBP: 0000000000c74188 R08: 0000000000000000 R09: 0000000057d8e6e9
Sep 14 08:05:46  kernel: R10: 0000000000000002 R11: 0000000000000002 R12: 0000000000000000
Sep 14 08:05:46  kernel: R13: ffffffff815e833e R14: ffff8802fc7bbf70 R15: 0000000000000000
Sep 14 08:05:46  kernel: FS:  00007f4731c75740(0000) GS:ffff88043fd80000(0000) knlGS:0000000000000000
Sep 14 08:05:46  kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Sep 14 08:05:46  kernel: CR2: 00007fb845c10000 CR3: 0000000347dd0000 CR4: 00000000000007e0
Sep 14 08:05:46  kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Sep 14 08:05:46  kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Sep 14 08:05:46  kernel:
Sep 14 08:05:46  kernel: BUG: soft lockup - CPU#2 stuck for 24s! 
Sep 14 08:05:46  kernel: Modules linked in: fuse btrfs zlib_deflate raid6_pq xor vfat msdos fat xfs libcrc32c bridge stp llc nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache binfmt_misc sg ppdev vmw_balloon coretemp crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd serio_raw pcspkr vmw_vmci parport_pc shpchp parport mperf i2c_piix4 nfsd auth_rpcgss nfs_acl lockd sunrpc ext4 mbcache jbd2 sd_mod sr_mod cdrom crc_t10dif ata_generic crct10dif_common pata_acpi vmwgfx ttm drm vmxnet3 ata_piix libata vmw_pvscsi i2c_core floppy dm_mirror dm_region_hash dm_log dm_mod
Sep 14 08:05:46  kernel: CPU: 2 PID: 11742 Comm: SERVER Not tainted 3.10.0-123.el7.x86_64 #1
Sep 14 08:05:46  kernel: Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 10/22/2013
Sep 14 08:05:46  kernel: task: ffff8802fd296660 ti: ffff8802fc0fa000 task.ti: ffff8802fc0fa000
Sep 14 08:05:46  kernel: RIP: 0010:[<ffffffff814bed71>]  [<ffffffff814bed71>] __netdev_alloc_frag+0xe1/0x140
Sep 14 08:05:46  kernel: RSP: 0000:ffff88043fd03d80  EFLAGS: 00000282
Sep 14 08:05:47  kernel: RAX: ffff880000000000 RBX: ffffffff81944f80 RCX: 0000000000000780
Sep 14 08:05:47  kernel: RDX: ffff88033c088000 RSI: 0000000000000000 RDI: 0000000000000282
Sep 14 08:05:47  kernel: RBP: ffff88043fd03db0 R08: 0000000000000000 R09: 00000000ffffffd8
Sep 14 08:05:47  kernel: R10: 0000000000000080 R11: ffffea00109b5380 R12: ffff88043fd03cf8
Sep 14 08:05:47  kernel: R13: ffffffff815f2d9d R14: ffff88043fd03db0 R15: ffff88043fd112c0
Sep 14 08:05:47  kernel: FS:  00007f9501e7d700(0000) GS:ffff88043fd00000(0000) knlGS:0000000000000000
Sep 14 08:05:47  kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Sep 14 08:05:47  kernel: CR2: 00007f46f3aff000 CR3: 000000007e295000 CR4: 00000000000007e0
Sep 14 08:05:47  kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Sep 14 08:05:47  kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Sep 14 08:05:47  kernel: Stack:
Sep 14 08:05:47  kernel: 000042201ff072d0 0000000000000780 ffff880427b6c000 ffff88041eb128c0
Sep 14 08:05:47 kernel: ffff880427b6d0c0 ffff88041ff072e0 ffff88043fd03dd8 ffffffff814c0db7
Sep 14 08:05:47  kernel: ffff880427b6d150 ffff88041ff0cd20 ffff88041eb128c0 ffff88043fd03e48
Sep 14 08:05:47  kernel: Call Trace:
Sep 14 08:05:47  kernel: <IRQ>
Sep 14 08:05:47  kernel:
Sep 14 08:05:47  kernel: [<ffffffff814c0db7>] __netdev_alloc_skb+0x77/0xc0
Sep 14 08:05:47  kernel: [<ffffffffa00a4174>] vmxnet3_rq_rx_complete+0x194/0x810 [vmxnet3]
Sep 14 08:05:47  kernel: [<ffffffffa00a4d5a>] vmxnet3_poll_rx_only+0x3a/0xb0 [vmxnet3]
Sep 14 08:05:47  kernel: [<ffffffff814d041a>] net_rx_action+0x15a/0x250
Sep 14 08:05:47  kernel: [<ffffffff81067047>] __do_softirq+0xf7/0x290
Sep 14 08:05:47  kernel: [<ffffffff815f3a5c>] call_softirq+0x1c/0x30
Sep 14 08:05:47  kernel: [<ffffffff81014d25>] do_softirq+0x55/0x90
Sep 14 08:05:47  kernel: [<ffffffff810673e5>] irq_exit+0x115/0x120
Sep 14 08:05:47  kernel: [<ffffffff815f4435>] smp_apic_timer_interrupt+0x45/0x60
Sep 14 08:05:47  kernel: [<ffffffff815f2d9d>] apic_timer_interrupt+0x6d/0x80
Sep 14 08:05:48  kernel: <EOI>
Sep 14 08:05:48  kernel: [<ffffffff815e954d>] ? retint_careful+0xb/0x32
Sep 14 08:05:48 kernel: Code: 00 16 00 00 83 6b 10 01 89 4b 08 48 01 c2 48 b8 00 00 00 00 00 88 ff ff 48 c1 fa 06 48 c1 e2 0c 48 01 c2 4c 01 c2 4c 89 e7 57 9d <66> 66 90 66 90 48 83 c4 08 48 89 d0 5b 41 5c 41 5d 41 5e 41 5f
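
For triage, the watchdog header lines in a log like the one above can be summarized with a short script. This is only a sketch in Python: the regular expression is inferred from the two header lines in this report, and the names LOCKUP_RE and parse_lockups are made up for illustration; they are not part of any kernel tooling.

```python
import re
from collections import Counter

# Matches the header line printed for each soft lockup, e.g.
# "kernel: BUG: soft lockup - CPU#3 stuck for 27s! [scanner:20203]".
# The "[comm:pid]" suffix is optional because the CPU#2 report lacks it.
# (Pattern inferred from this report only; an assumption, not a spec.)
LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s!"
    r"(?:\s*\[(?P<comm>[^:\]]+):(?P<pid>\d+)\])?"
)


def parse_lockups(lines):
    """Return one dict per soft-lockup header found in `lines`."""
    events = []
    for line in lines:
        m = LOCKUP_RE.search(line)
        if m:
            events.append({
                "cpu": int(m.group("cpu")),
                "secs": int(m.group("secs")),
                "comm": m.group("comm"),  # None when the suffix is missing
                "pid": int(m.group("pid")) if m.group("pid") else None,
            })
    return events


# Three lines copied from the log in this report.
sample = [
    "Sep 14 08:05:45  kernel: BUG: soft lockup - CPU#3 stuck for 27s! [scanner:20203]",
    "Sep 14 08:05:46  kernel: BUG: soft lockup - CPU#2 stuck for 24s! ",
    "Sep 14 08:05:46  kernel: CPU: 2 PID: 11742 Comm: SERVER Not tainted 3.10.0-123.el7.x86_64 #1",
]
events = parse_lockups(sample)
per_cpu = Counter(e["cpu"] for e in events)  # lockup count per CPU
```

Run over a full /var/log/messages, per_cpu shows whether the lockups cluster on one vCPU or are spread across all of them, which is useful context when reporting to the hypervisor side.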
Comment 1 Toni Ballesta 2017-05-12 09:10:10 UTC
I see a similar lockup on a newer kernel, 3.16 from Debian Jessie. The bug appeared just after I inserted a LOG rule in iptables. This is a production server!

Linux mimaquina 3.16.0-4-amd64 #1 SMP Debian 3.16.39-1+deb8u2 (2017-03-07) x86_64 GNU/Linux
Comment 2 Toni Ballesta 2017-05-12 09:43:06 UTC
Well, in my case the cause is probably that the LOG rule produced 575 lines in a single second, because I had forgotten to put the rules allowing MySQL and HTTP traffic before it.
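
To check whether a LOG rule is flooding the kernel log, one could count matching lines per second. A minimal Python sketch, assuming traditional syslog timestamps (the first 15 characters of each line) and assuming the rule was given a log prefix such as "FW-DROP: "; that prefix, the function names, and the sample lines are all hypothetical and not taken from my server.

```python
from collections import Counter


def lines_per_second(lines, marker):
    """Count lines containing `marker`, keyed by the 15-character
    traditional syslog timestamp ("Mmm dd hh:mm:ss") at the line start."""
    counts = Counter()
    for line in lines:
        if marker in line:
            counts[line[:15]] += 1
    return counts


def bursts(counts, threshold):
    """Return (timestamp, count) pairs with at least `threshold` lines."""
    return sorted((ts, n) for ts, n in counts.items() if n >= threshold)


# Made-up sample data: three logged packets in one second, one in the next.
sample = [
    "May 12 09:00:01 host kernel: FW-DROP: IN=eth0 PROTO=TCP DPT=3306",
    "May 12 09:00:01 host kernel: FW-DROP: IN=eth0 PROTO=TCP DPT=80",
    "May 12 09:00:01 host kernel: FW-DROP: IN=eth0 PROTO=TCP DPT=80",
    "May 12 09:00:02 host kernel: FW-DROP: IN=eth0 PROTO=TCP DPT=3306",
    "May 12 09:00:02 host sshd[123]: unrelated line",
]
counts = lines_per_second(sample, "FW-DROP: ")
busy = bursts(counts, threshold=3)
```

With a realistic threshold (a few hundred lines per second), any timestamp in busy is a candidate for the moment the lockup was triggered.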
