Bug 62751 - INFO: rcu_sched detected stalls on CPUs/tasks: { 3} (detected by 2, t=15107 jiffies, g=270766, c=270765, q=0)
Status: NEEDINFO
Alias: None
Product: File System
Classification: Unclassified
Component: ext4
Hardware: x86-64 Linux
Importance: P1 normal
Assignee: fs_ext4@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-10-09 13:55 UTC by Loris Luise
Modified: 2013-12-20 16:16 UTC
CC List: 5 users

See Also:
Kernel Version: 3.12
Subsystem:
Regression: No
Bisected commit-id:


Attachments
Kernel.org format filed infos (772.03 KB, text/plain)
2013-10-09 13:59 UTC, Loris Luise
Details
Infos about bug evidence. (203.35 KB, text/plain)
2013-10-10 06:56 UTC, Loris Luise
Details
Additional evidence, kernel 3.11.6 (337.47 KB, text/plain)
2013-10-24 08:18 UTC, Loris Luise
Details
Additional evidence, kernel 3.12 (220.14 KB, text/plain)
2013-11-18 07:53 UTC, Loris Luise
Details

Description Loris Luise 2013-10-09 13:55:53 UTC
Oct  9 09:51:19 h3oserver2 kernel: [87982.718328] NMI backtrace for cpu 3
Oct  9 09:51:19 h3oserver2 kernel: [87982.718351] CPU: 3 PID: 71 Comm: kswapd0 Not tainted 3.11.3-031103-generic #201310011421
Oct  9 09:51:19 h3oserver2 kernel: [87982.718355] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 01/15/2013
Oct  9 09:51:19 h3oserver2 kernel: [87982.718361] task: ffff8801b5d3c650 ti: ffff8801b2026000 task.ti: ffff8801b2026000
Oct  9 09:51:19 h3oserver2 kernel: [87982.718363] RIP: 0010:[<ffffffff8128cfa3>]  [<ffffffff8128cfa3>] __ext4_es_shrink+0xb3/0x200
Oct  9 09:51:19 h3oserver2 kernel: [87982.869348] RSP: 0018:ffff8801b2027b78  EFLAGS: 00010293
Oct  9 09:51:19 h3oserver2 kernel: [87982.869352] RAX: ffff8801b4a89f30 RBX: ffff880002b29c00 RCX: ffff8801b2027ba8
Oct  9 09:51:19 h3oserver2 kernel: [87982.869353] RDX: ffff880002b29f30 RSI: ffff880002b2af30 RDI: 000000010129c39a
Oct  9 09:51:19 h3oserver2 kernel: [87982.869357] RBP: ffff8801b2027be8 R08: ffff880002b2af30 R09: ffffea0006933480
Oct  9 09:51:19 h3oserver2 kernel: [87982.869359] R10: ffffffff8128cdfd R11: 0000000000000002 R12: 000000000000003a
Oct  9 09:51:19 h3oserver2 kernel: [87982.869360] R13: 0000000000000000 R14: ffff8801b0060800 R15: ffff8801b0060c10
Oct  9 09:51:19 h3oserver2 kernel: [87982.869385] FS:  0000000000000000(0000) GS:ffff8801bfcc0000(0000) knlGS:0000000000000000
Oct  9 09:51:19 h3oserver2 kernel: [87982.869387] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Oct  9 09:51:19 h3oserver2 kernel: [87982.869388] CR2: 0000000002236028 CR3: 0000000001c0d000 CR4: 00000000000006e0
Oct  9 09:51:19 h3oserver2 kernel: [87982.869403] Stack:
Oct  9 09:51:19 h3oserver2 kernel: [87982.869406]  ffff8801b1c1bb80 ffff88006019d330 ffff88006027e730 ffff8801b2027ba8
Oct  9 09:51:19 h3oserver2 kernel: [87982.869409]  ffff8801b0060c80 00000046012f3080 ffff8801b4b2db30 ffff880002b28730
Oct  9 09:51:19 h3oserver2 kernel: [87982.869412]  ffff8801b2027fd8 ffff8801b0060be0 ffff8801b2027d18 0000000000010fa0
Oct  9 09:51:19 h3oserver2 kernel: [87982.910096] Call Trace:
Oct  9 09:51:19 h3oserver2 kernel: [87982.910141]  [<ffffffff8128d1b8>] ext4_es_shrink+0xc8/0x140
Oct  9 09:51:19 h3oserver2 kernel: [87982.910158]  [<ffffffff8115b6b4>] shrink_slab+0x154/0x300
Oct  9 09:51:19 h3oserver2 kernel: [87982.910179]  [<ffffffff8172ea0e>] ? _raw_spin_lock+0xe/0x20
Oct  9 09:51:19 h3oserver2 kernel: [87982.910191]  [<ffffffff811ad776>] ? vmpressure+0x76/0xa0
Oct  9 09:51:19 h3oserver2 kernel: [87982.910194]  [<ffffffff8115de8b>] kswapd_shrink_zone+0xfb/0x1a0
Oct  9 09:51:19 h3oserver2 kernel: [87982.910197]  [<ffffffff8115ec5c>] balance_pgdat+0x32c/0x470
Oct  9 09:51:19 h3oserver2 kernel: [87982.910200]  [<ffffffff81166d60>] ? vm_events_fold_cpu+0x50/0x50
Oct  9 09:51:19 h3oserver2 kernel: [87982.910203]  [<ffffffff8115eec3>] kswapd+0x123/0x240
Oct  9 09:51:19 h3oserver2 kernel: [87982.910205]  [<ffffffff8115eda0>] ? balance_pgdat+0x470/0x470
Oct  9 09:51:19 h3oserver2 kernel: [87982.910521]  [<ffffffff81089040>] kthread+0xc0/0xd0
Oct  9 09:51:19 h3oserver2 kernel: [87982.910526]  [<ffffffff81088f80>] ? flush_kthread_worker+0xb0/0xb0
Oct  9 09:51:19 h3oserver2 kernel: [87982.910532]  [<ffffffff81737dac>] ret_from_fork+0x7c/0xb0
Oct  9 09:51:19 h3oserver2 kernel: [87982.910535]  [<ffffffff81088f80>] ? flush_kthread_worker+0xb0/0xb0
Oct  9 09:51:19 h3oserver2 kernel: [87982.910536] Code: 8c da 00 00 00 0f 84 d4 00 00 00 0f 1f 44 00 00 48 8b 7a 18 49 39 be 20 04 00 00 48 8d 9a d0 fc ff ff 73 44 48 8b 42 08 48 8b 32 <48> 89 46 08 48 89 30 48 8b 45 c8 48 89 55 c8 48 89 0a 48 89 42
Comment 1 Loris Luise 2013-10-09 13:59:07 UTC
Created attachment 110491 [details]
Kernel.org format filed infos
Comment 2 Loris Luise 2013-10-09 15:10:37 UTC
Sorry, but I have no idea which part of the kernel this bug relates to,

so I chose ext4 (because the call trace shows ext4_es_shrink+0xc8/0x140).

I logged several of these stalls almost every day.

The kernel is running on an HP ML350 G5 server under a fully patched VMware ESXi 4.1 host.
Comment 3 Jan Kara 2013-10-09 20:53:49 UTC
Thanks. This looks like a problem in the shrinker for the ext4 extent cache. Apparently we burn too much time (the longest stall is over 2 seconds) searching and freeing extents without giving the CPU a break. Ideally we should make the shrinker more efficient (since I don't believe it was called with so many entries to free), but if that won't work out easily, we should at least sprinkle some cond_resched() calls into the shrinker code as a band-aid. I'll see what we can do.
Comment 4 Loris Luise 2013-10-10 06:49:29 UTC
Updated to kernel 3.11.4-generic; the problem still persists. I added another attachment (bug-20131010084533.txt).
Comment 5 Loris Luise 2013-10-10 06:56:26 UTC
Created attachment 110571 [details]
Infos about bug evidence.
Comment 6 Loris Luise 2013-10-24 08:18:09 UTC
Created attachment 112191 [details]
Additional evidence, kernel 3.11.6
Comment 7 Alan 2013-11-04 11:53:01 UTC
Jan - it's on a VMware virtual platform, so it could just be that VMware ran off with the CPU. Can you duplicate this on a native platform? And if you disable the NMI watchdog, does the kernel hang or merely stall for a moment in this situation?
Comment 8 Loris Luise 2013-11-18 07:51:29 UTC
Hello,

what can I do on my side?

I've done the following modifications:

1) Installed latest mainline kernel 3.12

2) Replaced the SCSI driver from vmw_pvscsi to LSI Logic SAS (mptsas)

3) reduced number of CPUs from 8 to 4

4) enabled the "CPU Hot Add" feature for this machine on ESXi (4.1)

The issue still affects this machine. Two other nearly identical machines
running on this host (same kernel), but with XFS on root, are unaffected and very
stable.

I reattached another info file (bug-20131118084007.txt) which contains a lot of information.

Thanks.
Comment 9 Loris Luise 2013-11-18 07:53:46 UTC
Created attachment 115021 [details]
Additional evidence, kernel 3.12
Comment 10 Zheng Liu 2013-12-07 10:08:16 UTC
Hi Loris,

Sorry for my late reply.  I haven't had a chance to take a closer look at this problem until now.

(In reply to Loris Luise from comment #8)
> Hello,
> 
> what can I do on my side?

I am not sure whether you use 'delalloc' on your system. I can't see these details in the log you pasted, so for now all I can suggest is that you try turning off delalloc. The step is as below:

 % sudo mount -t ext4 -o remount,nodelalloc ${DEV} ${MNT}

After this, ext4 won't track delayed entries in the extent status tree, which keeps the es shrinker from spending too much time scanning delayed entries that cannot be reclaimed.

Then I will try to reproduce this problem in my sandbox and think about how to improve the es shrinker.

Thanks,
- Zheng

Comment 11 Loris Luise 2013-12-20 15:45:25 UTC
Hello,

I cloned the root partition to an XFS one and currently have no more problems.

Previously I used a mount line in fstab like this:

/dev/mapper/h30server2-root /               ext4    errors=remount-ro,noatime,nodiratime,data=writeback 0       1

and later (after your suggestion)

/dev/mapper/h30server2-root /               ext4    errors=remount-ro,noatime,nodiratime,nodelalloc,data=writeback 0       1


but I did not run it with this new option long enough for the test to be considered valid, so I cannot draw a conclusion.

(I had to stabilize the system so I was forced to switch to XFS).

I collected more bug info; if it can be useful I can post it here.

Thanks a lot.
Comment 12 Theodore Tso 2013-12-20 16:16:22 UTC
Hi Loris,

Can you talk a bit about your workload?   What sort of applications are you running on the file system which is having the problems?   A database?  Squid cache, something else?   Is it a lot of random writes?    What sort of files does it create?  Does it create and delete a lot of files?   Etc.

It looks like your system has 8GB of memory --- is that right?   How much free memory does your machine normally have?   Also, if you still have your ext4 file system image, could you run "e2freefrag -v /dev/mapper/XXX"?

Many thanks!!
