In 2.6.36.3, I can quite quickly (~1 hour) get the following under some workload: Feb 12 00:17:12 10.10.10.102 kernel: BUG: soft lockup - CPU#3 stuck for 67s! [postmaster:27813] Feb 12 00:17:12 10.10.10.102 kernel: Modules linked in: dm_mod mlx4_ib mlx4_core sata_nv 3w_9xxx Feb 12 00:17:12 10.10.10.102 kernel: CPU 3 Feb 12 00:17:12 10.10.10.102 kernel: Modules linked in: dm_mod mlx4_ib mlx4_core sata_nv 3w_9xxx Feb 12 00:17:12 10.10.10.102 kernel: Feb 12 00:17:12 10.10.10.102 kernel: Pid: 27813, comm: postmaster Not tainted 2.6.36.3.xdi #6 S2892/empty Feb 12 00:17:12 10.10.10.102 kernel: RIP: 0010:[<ffffffff81055033>] [<ffffffff81055033>] bit_waitqueue+0xc3/0xd0 Feb 12 00:17:12 10.10.10.102 kernel: RSP: 0018:ffff880039e7ba48 EFLAGS: 00000292 Feb 12 00:17:12 10.10.10.102 kernel: RAX: ffff880001877b30 RBX: 0000000000000d12 RCX: 0000000000000034 Feb 12 00:17:12 10.10.10.102 kernel: RDX: ffff880100000e00 RSI: 0000000000000000 RDI: 8038000000000000 Feb 12 00:17:12 10.10.10.102 kernel: RBP: ffffffff8100354e R08: 023ee891c01c0000 R09: fda317fdfa087007 Feb 12 00:17:12 10.10.10.102 kernel: R10: 0000000000000001 R11: 0000000000000000 R12: ffff880001877b30 Feb 12 00:17:12 10.10.10.102 kernel: R13: fda317fdfa087007 R14: 0000000000000001 R15: 0000000000000000 Feb 12 00:17:12 10.10.10.102 kernel: FS: 000000004087c940(0063) GS:ffff880001b80000(0000) knlGS:0000000000000000 Feb 12 00:17:12 10.10.10.102 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b Feb 12 00:17:12 10.10.10.102 kernel: CR2: 000000000042b000 CR3: 000000003b87e000 CR4: 00000000000006e0 Feb 12 00:17:12 10.10.10.102 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Feb 12 00:17:12 10.10.10.102 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Feb 12 00:17:12 10.10.10.102 kernel: Process postmaster (pid: 27813, threadinfo ffff880039e7a000, task ffff880133f91880) Feb 12 00:17:12 10.10.10.102 kernel: Stack: Feb 12 00:17:12 10.10.10.102 kernel: ffff88023ee891c0 ffffffff813b98e8 ffff88023ee891c0 0000000000000007 Feb 12 00:17:12 10.10.10.102 kernel: <0> 0000000000000001 ffff880133f91880 ffffffff810550e0 ffff880039e7ba80 Feb 12 00:17:12 10.10.10.102 kernel: <0> ffff880039e7ba80 ffffffffffffff10 ffffffff8100372e ffff88023ee89130 Feb 12 00:17:12 10.10.10.102 kernel: Call Trace: Feb 12 00:17:12 10.10.10.102 kernel: [<ffffffff813b98e8>] ? out_of_line_wait_on_bit_lock+0x28/0x90 Feb 12 00:17:12 10.10.10.102 kernel: [<ffffffff810550e0>] ? wake_bit_function+0x0/0x40 Feb 12 00:17:12 10.10.10.102 kernel: [<ffffffff8100372e>] ? reschedule_interrupt+0xe/0x20 Feb 12 00:17:12 10.10.10.102 kernel: [<ffffffff81133489>] ? nfs_commit_inode+0xb9/0x1c0 Feb 12 00:17:12 10.10.10.102 kernel: [<ffffffff811337a9>] ? nfs_wb_page+0x69/0xc0 Feb 12 00:17:12 10.10.10.102 kernel: [<ffffffff81133ce1>] ? nfs_flush_incompatible+0x41/0x90 Feb 12 00:17:12 10.10.10.102 kernel: [<ffffffff81124c9f>] ? nfs_write_begin+0x8f/0x220 Feb 12 00:17:12 10.10.10.102 kernel: [<ffffffff81081638>] ? generic_file_buffered_write+0x118/0x280 Feb 12 00:17:12 10.10.10.102 kernel: [<ffffffff813b8c63>] ? schedule+0x273/0x900 Feb 12 00:17:12 10.10.10.102 kernel: [<ffffffff81083419>] ? __generic_file_aio_write+0x229/0x420 Feb 12 00:17:12 10.10.10.102 kernel: [<ffffffff81035aa3>] ? load_balance+0x133/0x700 Feb 12 00:17:12 10.10.10.102 kernel: [<ffffffff81083677>] ? generic_file_aio_write+0x67/0xe0 Feb 12 00:17:12 10.10.10.102 kernel: [<ffffffff81125ad1>] ? nfs_file_write+0xe1/0x230 Feb 12 00:17:12 10.10.10.102 kernel: [<ffffffff8104a472>] ? get_signal_to_deliver+0x92/0x350 Feb 12 00:17:12 10.10.10.102 kernel: [<ffffffff810c2abf>] ? do_sync_write+0xbf/0x100 Feb 12 00:17:12 10.10.10.102 kernel: [<ffffffff813b8c63>] ? schedule+0x273/0x900 Feb 12 00:17:12 10.10.10.102 kernel: [<ffffffff810c31f6>] ? vfs_write+0xc6/0x180 Feb 12 00:17:12 10.10.10.102 kernel: [<ffffffff810c350e>] ? sys_write+0x4e/0x90 Feb 12 00:17:12 10.10.10.102 kernel: [<ffffffff81040276>] ? sys_gettimeofday+0x36/0x90 Feb 12 00:17:12 10.10.10.102 kernel: [<ffffffff81002c6b>] ? system_call_fastpath+0x16/0x1b Feb 12 00:17:12 10.10.10.102 kernel: Code: 00 fa 6c 81 48 8d 2c 29 b9 40 00 00 00 2b 8a 90 06 00 00 48 d3 ed 48 8d 44 6d 00 48 c1 e0 03 48 03 82 80 06 00 00 48 83 c4 08 5b <5d> c3 66 66 2e 0f 1f 84 00 00 00 00 00 48 83 ec 18 48 8d 47 08 Feb 12 00:17:12 10.10.10.102 kernel: Call Trace: Feb 12 00:17:12 10.10.10.102 kernel: [<ffffffff813b98e8>] ? out_of_line_wait_on_bit_lock+0x28/0x90 Feb 12 00:17:12 10.10.10.102 kernel: [<ffffffff810550e0>] ? wake_bit_function+0x0/0x40 Feb 12 00:17:12 10.10.10.102 kernel: [<ffffffff8100372e>] ? reschedule_interrupt+0xe/0x20 Feb 12 00:17:12 10.10.10.102 kernel: [<ffffffff81133489>] ? nfs_commit_inode+0xb9/0x1c0 Feb 12 00:17:12 10.10.10.102 kernel: [<ffffffff811337a9>] ? nfs_wb_page+0x69/0xc0 Feb 12 00:17:12 10.10.10.102 kernel: [<ffffffff81133ce1>] ? nfs_flush_incompatible+0x41/0x90 Feb 12 00:17:12 10.10.10.102 kernel: [<ffffffff81124c9f>] ? nfs_write_begin+0x8f/0x220 Feb 12 00:17:12 10.10.10.102 kernel: [<ffffffff81081638>] ? generic_file_buffered_write+0x118/0x280 Feb 12 00:17:12 10.10.10.102 kernel: [<ffffffff813b8c63>] ? schedule+0x273/0x900 Feb 12 00:17:12 10.10.10.102 kernel: [<ffffffff81083419>] ? __generic_file_aio_write+0x229/0x420 Feb 12 00:17:12 10.10.10.102 kernel: [<ffffffff81035aa3>] ? load_balance+0x133/0x700 Feb 12 00:17:12 10.10.10.102 kernel: [<ffffffff81083677>] ? generic_file_aio_write+0x67/0xe0 Feb 12 00:17:12 10.10.10.102 kernel: [<ffffffff81125ad1>] ? nfs_file_write+0xe1/0x230 Feb 12 00:17:12 10.10.10.102 kernel: [<ffffffff8104a472>] ? get_signal_to_deliver+0x92/0x350 Feb 12 00:17:12 10.10.10.102 kernel: [<ffffffff810c2abf>] ? do_sync_write+0xbf/0x100 Feb 12 00:17:12 10.10.10.102 kernel: [<ffffffff813b8c63>] ? schedule+0x273/0x900 Feb 12 00:17:12 10.10.10.102 kernel: [<ffffffff810c31f6>] ? vfs_write+0xc6/0x180 Feb 12 00:17:12 10.10.10.102 kernel: [<ffffffff810c350e>] ? sys_write+0x4e/0x90 Feb 12 00:17:12 10.10.10.102 kernel: [<ffffffff81040276>] ? sys_gettimeofday+0x36/0x90 Feb 12 00:17:12 10.10.10.102 kernel: [<ffffffff81002c6b>] ? system_call_fastpath+0x16/0x1b I cannot reproduce it in 2.6.35.11.
Created attachment 47652 [details] more call traces in 2.6.37 2.6.37 locks up as well
2.6.36 locks up as well
2.6.38-rc4 is likewise
Can you use 'git bisect' to figure out where the problem started between 2.6.35 and 2.6.36?
$ git bisect log # bad: [f6f94e2ab1b33f0082ac22d71f66385a60d8157f] Linux 2.6.36 # good: [9fe6206f400646a2322096b56c59891d530e8d51] Linux 2.6.35 git bisect start 'v2.6.36' 'v2.6.35' '--' 'fs/nfs' # skip: [452e93523d9433f83670e7b42cbe75319c208762] NFSv4: Clean up the process of renewing the NFSv4 lease git bisect skip 452e93523d9433f83670e7b42cbe75319c208762 # skip: [a17c2153d2e271b0cbacae9bed83b0eaa41db7e1] SUNRPC: Move the bound cred to struct rpc_rqst git bisect skip a17c2153d2e271b0cbacae9bed83b0eaa41db7e1 # bad: [df486a25900f4dba9cdc3886c4ac871951c6aef3] NFS: Fix the selection of security flavours in Kconfig git bisect bad df486a25900f4dba9cdc3886c4ac871951c6aef3 # bad: [77041ed9b49a9e10f374bfa6e482d30ee7a3d46e] NFSv4: Ensure the lockowners are labelled using the fl_owner and/or fl_pid git bisect bad 77041ed9b49a9e10f374bfa6e482d30ee7a3d46e # skip: [c48f4f3541e67881c9eb7c46e052f5ece48ef530] NFSv41: Convert the various reboot recovery ops etc to minor version ops git bisect skip c48f4f3541e67881c9eb7c46e052f5ece48ef530 # good: [d185a334c748b3ca9de1f3a293fd8a9cf68378ab] NFSv4.1: Simplify nfs41_sequence_done() git bisect good d185a334c748b3ca9de1f3a293fd8a9cf68378ab # skip: [a4432345352c2be157ed844603147ac2c82f209c] NFSv41: Deprecate nfs_client->cl_minorversion git bisect skip a4432345352c2be157ed844603147ac2c82f209c # good: [035168ab39f66e4946d493f9ee20d11e154f332a] NFSv4.1: Make nfs4_setup_sequence take a nfs_server argument git bisect good 035168ab39f66e4946d493f9ee20d11e154f332a # good: [d77d76ffb638bd013782138cca6d8f4918c5afd6] NFSv41: Clean up exclusive create git bisect good d77d76ffb638bd013782138cca6d8f4918c5afd6 # good: [1f0e890dba5b0f543fea47732116b1c65d55614e] NFSv4: Clean up struct nfs4_state_owner git bisect good 1f0e890dba5b0f543fea47732116b1c65d55614e # skip: [daccbded7f153ec84a3baf3136052e41d0eab555] NFSv4: Clean up for lockowner XDR encoding git bisect skip daccbded7f153ec84a3baf3136052e41d0eab555 # bad: [d3c7b7ccc199ee564177ee914c04771d6bc00295] NFSv4: Add support for the RELEASE_LOCKOWNER operation git bisect bad d3c7b7ccc199ee564177ee914c04771d6bc00295 # bad: [f11ac8db5d07b6e99d41ff4aa39d878ee5cef1c5] NFSv4: Ensure that we track the NFSv4 lock state in read/write requests. git bisect bad f11ac8db5d07b6e99d41ff4aa39d878ee5cef1c5 It says that "f11ac8db5d07b6e99d41ff4aa39d878ee5cef1c5 is the first bad commit". Does it make sense? I skipped some revisions because they caused some seemingly unrelated failures and I could neither prove nor disprove the lockup.
Thanks! That certainly helps a lot! The patch in question is indeed known to trigger a problem in the writeback code. A fix was merged into kernel 2.6.37, but does not appear to have been sent to the stable kernel. I'll attach the fix patch to this bug report, but you can find it as commit 2df485a774ba59c3f43bfe84107672c1d9b731a0 (nfs: remove extraneous and problematic calls to nfs_clear_request)
Created attachment 48482 [details] nfs: remove extraneous and problematic calls to nfs_clear_request
I am confused. Are you saying that the 2df485a7 patch should fix the lockup? Earlier, I tried 2.6.37, which has the patch, and it locked up.
Created attachment 51572 [details] NFS: Fix a hang/infinite loop in nfs_wb_page() When one of the two waits in nfs_commit_inode() is interrupted, it returns a non-negative value, which causes nfs_wb_page() to think that the operation was successful causing it to busy-loop rather than exiting. It also causes nfs_file_fsync() to incorrectly report the file as being successfully committed to disk. This patch fixes both problems by ensuring that we return an error if the attempts to wait fail.
In my setup, v2.6.37 with the patches has worked for 15 hours. Without the second patch it fails withing an hour. Thanks.
Closing. The patch has been submitted to mainline and stable@kernel.org. Please reopen this report if the bug reoccurs in a future kernel...