Bug 199435
| Summary: | HPSA + P420i resetting logical Direct-Access never complete | | |
|---|---|---|---|
| Product: | IO/Storage | Reporter: | Anthony Hausman (anthonyhaussmann) |
| Component: | SCSI | Assignee: | linux-scsi (linux-scsi) |
| Status: | NEW --- | | |
| Severity: | normal | CC: | anthonyhaussmann, don.brace, gaetan.trellu, loberman, voronkovaa |
| Priority: | P1 | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Kernel Version: | 4.11.0-14-generic | Subsystem: | |
| Regression: | No | Bisected commit-id: | |
Attachments:
Patch to use local work-queue instead of system work-queue
Latest out of box hpsa driver
Load on server during reset problem
Patch to correct resets
Description
Anthony Hausman
2018-04-18 08:19:04 UTC
Do you see any lockup messages in the console logs ("Controller lockup detected")? The driver you used is from the 4.16 kernel, running on a 4.11 kernel? I have not tested this configuration. I notice that the driver is still using the kernel work-queue for monitoring. I will be sending up a patch to change this to local work-queues soon. Perhaps you can test this patch? It may help to discover more information on what is happening. Also, after you rebooted, were there any lockup entries in the iLO IML log?

I don't have any "Controller lockup detected" message in the syslog, unfortunately. In the iLO IML log, the last message was about the cache module: "CAUTION: POST Messages - POST Error: 1792-Slot X Drive Array - Valid Data Found in Cache Module. Data will automatically be written to drive array." There is nothing about lockup entries. Indeed, we took the driver from the latest kernel and compiled it for 4.11. I am ready to test the patch you are proposing. Where can I retrieve it?

Created attachment 275437 [details]
Patch to use local work-queue instead of system work-queue
If the driver initiates a re-scan from a system work-queue, the kernel can hang.
This patch has not been submitted to linux-scsi yet; I will be sending it out soon.
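To make the intent of the attached patch easier to follow: the pattern its title describes is to allocate a driver-private workqueue and queue the periodic rescan/monitor work there, instead of scheduling it on the shared system workqueue. The sketch below only illustrates that general pattern; the names (my_ctlr, my_rescan_worker) and the 30-second interval are placeholders, not the actual hpsa change.

```c
/*
 * Illustrative sketch only -- not the hpsa patch.  Periodic rescan work
 * runs on a driver-private workqueue so that a stuck rescan cannot block
 * unrelated work items queued on the shared system workqueue.
 */
#include <linux/workqueue.h>

struct my_ctlr {
	struct workqueue_struct *rescan_wq;	/* driver-private queue */
	struct delayed_work rescan_work;
};

static void my_rescan_worker(struct work_struct *work)
{
	struct my_ctlr *h = container_of(to_delayed_work(work),
					 struct my_ctlr, rescan_work);

	/* ... rescan logical/physical devices here ... */

	/* re-arm on the private queue, not with schedule_delayed_work() */
	queue_delayed_work(h->rescan_wq, &h->rescan_work, 30 * HZ);
}

static int my_ctlr_init(struct my_ctlr *h)
{
	h->rescan_wq = alloc_ordered_workqueue("my_rescan", 0);
	if (!h->rescan_wq)
		return -ENOMEM;

	INIT_DELAYED_WORK(&h->rescan_work, my_rescan_worker);
	queue_delayed_work(h->rescan_wq, &h->rescan_work, 30 * HZ);
	return 0;
}

static void my_ctlr_exit(struct my_ctlr *h)
{
	cancel_delayed_work_sync(&h->rescan_work);
	destroy_workqueue(h->rescan_wq);
}
```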
Your stack trace does not show any hpsa driver components, but I do see the reset being issued and never completing. I'm hoping that the attached patch helps diagnose the issue a little better.

Don, I have applied the patch; it is running and I am trying to reproduce the problem. I'll keep you informed of the diagnosis. I have a stack trace involving the workqueue:

Apr 19 11:22:52 kernel: INFO: task kworker/u129:28:428 blocked for more than 120 seconds. Apr 19 11:22:52 kernel: Tainted: G OE 4.11.0-14-generic #20~16.04.1-Ubuntu Apr 19 11:22:52 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Apr 19 11:22:52 kernel: kworker/u129:28 D 0 428 2 0x00000000 Apr 19 11:22:52 kernel: Workqueue: writeback wb_workfn (flush-67:80) Apr 19 11:22:52 kernel: Call Trace: Apr 19 11:22:52 kernel: __schedule+0x3b9/0x8f0 Apr 19 11:22:52 kernel: schedule+0x36/0x80 Apr 19 11:22:52 kernel: wait_transaction_locked+0x8a/0xd0 Apr 19 11:22:52 kernel: ? wake_atomic_t_function+0x60/0x60 Apr 19 11:22:52 kernel: add_transaction_credits+0x1c1/0x2a0 Apr 19 11:22:52 kernel: start_this_handle+0x103/0x3f0 Apr 19 11:22:52 kernel: ? find_get_pages_tag+0x19f/0x2b0 Apr 19 11:22:52 kernel: ? kmem_cache_alloc+0xd7/0x1b0 Apr 19 11:22:52 kernel: jbd2__journal_start+0xdb/0x1f0 Apr 19 11:22:52 kernel: ? ext4_writepages+0x4e6/0xe20 Apr 19 11:22:52 kernel: __ext4_journal_start_sb+0x6d/0x120 Apr 19 11:22:52 kernel: ext4_writepages+0x4e6/0xe20 Apr 19 11:22:52 kernel: ? generic_writepages+0x67/0x90 Apr 19 11:22:52 kernel: ? sd_init_command+0x30/0xb0 Apr 19 11:22:52 kernel: do_writepages+0x1e/0x30 Apr 19 11:22:52 kernel: ? do_writepages+0x1e/0x30 Apr 19 11:22:52 kernel: __writeback_single_inode+0x45/0x330 Apr 19 11:22:52 kernel: writeback_sb_inodes+0x26a/0x5f0 Apr 19 11:22:52 kernel: __writeback_inodes_wb+0x92/0xc0 Apr 19 11:22:52 kernel: wb_writeback+0x26e/0x320 Apr 19 11:22:52 kernel: wb_workfn+0x2cf/0x3a0 Apr 19 11:22:52 kernel: ? wb_workfn+0x2cf/0x3a0 Apr 19 11:22:52 kernel: process_one_work+0x16b/0x4a0 Apr 19 11:22:52 kernel: worker_thread+0x4b/0x500 Apr 19 11:22:52 kernel: kthread+0x109/0x140 Apr 19 11:22:52 kernel: ? process_one_work+0x4a0/0x4a0 Apr 19 11:22:52 kernel: ? kthread_create_on_node+0x70/0x70 Apr 19 11:22:52 kernel: ret_from_fork+0x25/0x30 Apr 19 11:22:52 kernel: INFO: task jbd2/sdbb-8:10556 blocked for more than 120 seconds. Apr 19 11:22:52 kernel: Tainted: G OE 4.11.0-14-generic #20~16.04.1-Ubuntu Apr 19 11:22:52 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Apr 19 11:22:52 kernel: jbd2/sdbb-8 D 0 10556 2 0x00000000 Apr 19 11:22:52 kernel: Call Trace: Apr 19 11:22:52 kernel: __schedule+0x3b9/0x8f0 Apr 19 11:22:52 kernel: ? update_cfs_rq_load_avg.constprop.91+0x227/0x4e0 Apr 19 11:22:52 kernel: schedule+0x36/0x80 Apr 19 11:22:52 kernel: jbd2_journal_commit_transaction+0x241/0x1830 Apr 19 11:22:52 kernel: ? update_load_avg+0x84/0x560 Apr 19 11:22:52 kernel: ? wake_atomic_t_function+0x60/0x60 Apr 19 11:22:52 kernel: ? lock_timer_base+0x7d/0xa0 Apr 19 11:22:52 kernel: kjournald2+0xca/0x250 Apr 19 11:22:52 kernel: ? kjournald2+0xca/0x250 Apr 19 11:22:52 kernel: ? wake_atomic_t_function+0x60/0x60 Apr 19 11:22:52 kernel: kthread+0x109/0x140 Apr 19 11:22:52 kernel: ? commit_timeout+0x10/0x10 Apr 19 11:22:52 kernel: ? kthread_create_on_node+0x70/0x70 Apr 19 11:22:52 kernel: ret_from_fork+0x25/0x30 Apr 19 11:22:52 kernel: INFO: task task:14138 blocked for more than 120 seconds.
Apr 19 11:22:52 kernel: Tainted: G OE 4.11.0-14-generic #20~16.04.1-Ubuntu Apr 19 11:22:52 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Apr 19 11:22:52 kernel: task D 0 14138 14058 0x00000000 Apr 19 11:22:52 kernel: Call Trace: Apr 19 11:22:52 kernel: __schedule+0x3b9/0x8f0 Apr 19 11:22:52 kernel: schedule+0x36/0x80 Apr 19 11:22:52 kernel: wait_transaction_locked+0x8a/0xd0 Apr 19 11:22:52 kernel: ? wake_atomic_t_function+0x60/0x60 Apr 19 11:22:52 kernel: add_transaction_credits+0x1c1/0x2a0 Apr 19 11:22:52 kernel: ? autoremove_wake_function+0x40/0x40 Apr 19 11:22:52 kernel: start_this_handle+0x103/0x3f0 Apr 19 11:22:52 kernel: ? dquot_file_open+0x3d/0x50 Apr 19 11:22:52 kernel: ? kmem_cache_alloc+0xd7/0x1b0 Apr 19 11:22:52 kernel: jbd2__journal_start+0xdb/0x1f0 Apr 19 11:22:52 kernel: ? ext4_dirty_inode+0x32/0x70 Apr 19 11:22:52 kernel: __ext4_journal_start_sb+0x6d/0x120 Apr 19 11:22:52 kernel: ext4_dirty_inode+0x32/0x70 Apr 19 11:22:52 kernel: __mark_inode_dirty+0x176/0x370 Apr 19 11:22:52 kernel: generic_update_time+0x7b/0xd0 Apr 19 11:22:52 kernel: ? current_time+0x38/0x80 Apr 19 11:22:52 kernel: ? ext4_xattr_security_set+0x30/0x30 Apr 19 11:22:52 kernel: file_update_time+0xb7/0x110 Apr 19 11:22:52 kernel: ? ext4_xattr_security_set+0x30/0x30 Apr 19 11:22:52 kernel: __generic_file_write_iter+0x9d/0x1f0 Apr 19 11:22:52 kernel: ext4_file_write_iter+0x21a/0x3c0 Apr 19 11:22:52 kernel: ? __slab_free+0x9e/0x2e0 Apr 19 11:22:52 kernel: new_sync_write+0xd3/0x130 Apr 19 11:22:52 kernel: __vfs_write+0x26/0x40 Apr 19 11:22:52 kernel: vfs_write+0xb8/0x1b0 Apr 19 11:22:52 kernel: ? do_sys_open+0x1b4/0x280 Apr 19 11:22:52 kernel: SyS_pwrite64+0x95/0xb0 Apr 19 11:22:52 kernel: entry_SYSCALL_64_fastpath+0x1e/0xad Apr 19 11:22:52 kernel: RIP: 0033:0x7f9a9968ed23 Apr 19 11:22:52 kernel: RSP: 002b:00007f9a95aabb90 EFLAGS: 00000293 ORIG_RAX: 0000000000000012 Apr 19 11:22:52 kernel: RAX: ffffffffffffffda RBX: 00007f9a6001c7b8 RCX: 00007f9a9968ed23 Apr 19 11:22:52 kernel: RDX: 0000000000000018 RSI: 00007f9a95aabcc0 RDI: 000000000000001a Apr 19 11:22:52 kernel: RBP: 00007f9a20000f30 R08: 00007f9a95aabc18 R09: 00007f9a95aabb30 Apr 19 11:22:52 kernel: R10: 00000000000000c0 R11: 0000000000000293 R12: 00007f9a20000ef0 Apr 19 11:22:52 kernel: R13: 00007f9a6001c7d0 R14: 00007f9a9ab0e3a0 R15: 00007f9a20000a30 Apr 19 11:22:52 kernel: INFO: task task:14159 blocked for more than 120 seconds. Apr 19 11:22:52 kernel: Tainted: G OE 4.11.0-14-generic #20~16.04.1-Ubuntu Apr 19 11:22:52 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Apr 19 11:22:52 kernel: task D 0 14159 14058 0x00000000 Apr 19 11:22:52 kernel: Call Trace: Apr 19 11:22:52 kernel: __schedule+0x3b9/0x8f0 Apr 19 11:22:52 kernel: schedule+0x36/0x80 Apr 19 11:22:52 kernel: wait_transaction_locked+0x8a/0xd0 Apr 19 11:22:52 kernel: ? wake_atomic_t_function+0x60/0x60 Apr 19 11:22:52 kernel: add_transaction_credits+0x1c1/0x2a0 Apr 19 11:22:52 kernel: ? schedule+0x36/0x80 Apr 19 11:22:52 kernel: start_this_handle+0x103/0x3f0 Apr 19 11:22:52 kernel: ? out_of_line_wait_on_bit+0x82/0xb0 Apr 19 11:22:52 kernel: ? kmem_cache_alloc+0xd7/0x1b0 Apr 19 11:22:52 kernel: jbd2__journal_start+0xdb/0x1f0 Apr 19 11:22:52 kernel: ? 
__ext4_new_inode+0x7b0/0x1420 Apr 19 11:22:52 kernel: __ext4_journal_start_sb+0x6d/0x120 Apr 19 11:22:52 kernel: __ext4_new_inode+0x7b0/0x1420 Apr 19 11:22:52 kernel: ext4_create+0x110/0x1b0 Apr 19 11:22:52 kernel: path_openat+0x133b/0x1450 Apr 19 11:22:52 kernel: do_filp_open+0x99/0x110 Apr 19 11:22:52 kernel: ? __check_object_size+0x108/0x19e Apr 19 11:22:52 kernel: ? __alloc_fd+0x46/0x170 Apr 19 11:22:52 kernel: do_sys_open+0x12d/0x280 Apr 19 11:22:52 kernel: ? do_sys_open+0x12d/0x280 Apr 19 11:22:52 kernel: SyS_open+0x1e/0x20 Apr 19 11:22:52 kernel: entry_SYSCALL_64_fastpath+0x1e/0xad Apr 19 11:22:52 kernel: RIP: 0033:0x7f9a9968ebfd Apr 19 11:22:52 kernel: RSP: 002b:00007f9a912a2bf0 EFLAGS: 00000293 ORIG_RAX: 0000000000000002 Apr 19 11:22:52 kernel: RAX: ffffffffffffffda RBX: 00007f9a6001c7b8 RCX: 00007f9a9968ebfd Apr 19 11:22:52 kernel: RDX: 0000000000000180 RSI: 0000000000000041 RDI: 00007f9a912a2d50 Apr 19 11:22:52 kernel: RBP: 00007f99f4000f30 R08: 0000000000000000 R09: 0000000000000001 Apr 19 11:22:52 kernel: R10: 00000000000c58da R11: 0000000000000293 R12: 00007f99f4000ef0 Apr 19 11:22:52 kernel: R13: 00007f9a6001c7d0 R14: 00007f9a9ab44360 R15: 00007f99f4000a30 Apr 19 11:22:52 kernel: INFO: task task:14163 blocked for more than 120 seconds. Apr 19 11:22:52 kernel: Tainted: G OE 4.11.0-14-generic #20~16.04.1-Ubuntu Apr 19 11:22:52 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Apr 19 11:22:52 kernel: task D 0 14163 14058 0x00000000 Apr 19 11:22:52 kernel: Call Trace: Apr 19 11:22:52 kernel: __schedule+0x3b9/0x8f0 Apr 19 11:22:52 kernel: schedule+0x36/0x80 Apr 19 11:22:52 kernel: wait_transaction_locked+0x8a/0xd0 Apr 19 11:22:52 kernel: ? wake_atomic_t_function+0x60/0x60 Apr 19 11:22:52 kernel: add_transaction_credits+0x1c1/0x2a0 Apr 19 11:22:52 kernel: ? autoremove_wake_function+0x40/0x40 Apr 19 11:22:52 kernel: start_this_handle+0x103/0x3f0 Apr 19 11:22:52 kernel: ? dquot_file_open+0x3d/0x50 Apr 19 11:22:52 kernel: ? kmem_cache_alloc+0xd7/0x1b0 Apr 19 11:22:52 kernel: jbd2__journal_start+0xdb/0x1f0 Apr 19 11:22:52 kernel: ? ext4_dirty_inode+0x32/0x70 Apr 19 11:22:52 kernel: __ext4_journal_start_sb+0x6d/0x120 Apr 19 11:22:52 kernel: ext4_dirty_inode+0x32/0x70 Apr 19 11:22:52 kernel: __mark_inode_dirty+0x176/0x370 Apr 19 11:22:52 kernel: generic_update_time+0x7b/0xd0 Apr 19 11:22:52 kernel: ? current_time+0x38/0x80 Apr 19 11:22:52 kernel: ? ext4_xattr_security_set+0x30/0x30 Apr 19 11:22:52 kernel: file_update_time+0xb7/0x110 Apr 19 11:22:52 kernel: ? ext4_xattr_security_set+0x30/0x30 Apr 19 11:22:52 kernel: __generic_file_write_iter+0x9d/0x1f0 Apr 19 11:22:52 kernel: ext4_file_write_iter+0x21a/0x3c0 Apr 19 11:22:52 kernel: ? __slab_free+0x178/0x2e0 Apr 19 11:22:52 kernel: new_sync_write+0xd3/0x130 Apr 19 11:22:52 kernel: __vfs_write+0x26/0x40 Apr 19 11:22:52 kernel: vfs_write+0xb8/0x1b0 Apr 19 11:22:52 kernel: ? 
do_sys_open+0x1b4/0x280 Apr 19 11:22:52 kernel: SyS_pwrite64+0x95/0xb0 Apr 19 11:22:52 kernel: entry_SYSCALL_64_fastpath+0x1e/0xad Apr 19 11:22:52 kernel: RIP: 0033:0x7f9a9968ed23 Apr 19 11:22:52 kernel: RSP: 002b:00007f9a902a0b90 EFLAGS: 00000293 ORIG_RAX: 0000000000000012 Apr 19 11:22:52 kernel: RAX: ffffffffffffffda RBX: 00007f9a6001c7b8 RCX: 00007f9a9968ed23 Apr 19 11:22:52 kernel: RDX: 0000000000000018 RSI: 00007f9a902a0cc0 RDI: 0000000000000015 Apr 19 11:22:52 kernel: RBP: 00007f99ec000f30 R08: 00007f9a902a0c18 R09: 00007f9a902a0b30 Apr 19 11:22:52 kernel: R10: 00000000000000f0 R11: 0000000000000293 R12: 00007f99ec000ef0 Apr 19 11:22:52 kernel: R13: 00007f9a6001c7d0 R14: 00007f9a9ab14ad0 R15: 00007f99ec000a30 Apr 19 11:22:52 kernel: INFO: task task:14190 blocked for more than 120 seconds. Apr 19 11:22:52 kernel: Tainted: G OE 4.11.0-14-generic #20~16.04.1-Ubuntu Apr 19 11:22:52 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Apr 19 11:22:52 kernel: task D 0 14190 14058 0x00000000 Apr 19 11:22:52 kernel: Call Trace: Apr 19 11:22:52 kernel: __schedule+0x3b9/0x8f0 Apr 19 11:22:52 kernel: schedule+0x36/0x80 Apr 19 11:22:52 kernel: wait_transaction_locked+0x8a/0xd0 Apr 19 11:22:52 kernel: ? wake_atomic_t_function+0x60/0x60 Apr 19 11:22:52 kernel: add_transaction_credits+0x1c1/0x2a0 Apr 19 11:22:52 kernel: ? autoremove_wake_function+0x40/0x40 Apr 19 11:22:52 kernel: start_this_handle+0x103/0x3f0 Apr 19 11:22:52 kernel: ? dquot_file_open+0x3d/0x50 Apr 19 11:22:52 kernel: ? kmem_cache_alloc+0xd7/0x1b0 Apr 19 11:22:52 kernel: jbd2__journal_start+0xdb/0x1f0 Apr 19 11:22:52 kernel: ? ext4_dirty_inode+0x32/0x70 Apr 19 11:22:52 kernel: __ext4_journal_start_sb+0x6d/0x120 Apr 19 11:22:52 kernel: ext4_dirty_inode+0x32/0x70 Apr 19 11:22:52 kernel: __mark_inode_dirty+0x176/0x370 Apr 19 11:22:52 kernel: generic_update_time+0x7b/0xd0 Apr 19 11:22:52 kernel: ? current_time+0x38/0x80 Apr 19 11:22:52 kernel: ? ext4_xattr_security_set+0x30/0x30 Apr 19 11:22:52 kernel: file_update_time+0xb7/0x110 Apr 19 11:22:52 kernel: ? ext4_xattr_security_set+0x30/0x30 Apr 19 11:22:52 kernel: __generic_file_write_iter+0x9d/0x1f0 Apr 19 11:22:52 kernel: ext4_file_write_iter+0x21a/0x3c0 Apr 19 11:22:52 kernel: ? __slab_free+0x9e/0x2e0 Apr 19 11:22:52 kernel: new_sync_write+0xd3/0x130 Apr 19 11:22:52 kernel: __vfs_write+0x26/0x40 Apr 19 11:22:52 kernel: vfs_write+0xb8/0x1b0 Apr 19 11:22:52 kernel: ? do_sys_open+0x1b4/0x280 Apr 19 11:22:52 kernel: SyS_pwrite64+0x95/0xb0 Apr 19 11:22:52 kernel: entry_SYSCALL_64_fastpath+0x1e/0xad Apr 19 11:22:52 kernel: RIP: 0033:0x7f9a9968ed23 Apr 19 11:22:52 kernel: RSP: 002b:00007f9a8b296b90 EFLAGS: 00000293 ORIG_RAX: 0000000000000012 Apr 19 11:22:52 kernel: RAX: ffffffffffffffda RBX: 00007f9a6001c7b8 RCX: 00007f9a9968ed23 Apr 19 11:22:52 kernel: RDX: 0000000000000018 RSI: 00007f9a8b296cc0 RDI: 0000000000000012 Apr 19 11:22:52 kernel: RBP: 00007f99c4000f30 R08: 00007f9a8b296c18 R09: 00007f9a8b296b30 Apr 19 11:22:52 kernel: R10: 0000000000000120 R11: 0000000000000293 R12: 00007f99c4000ef0 Apr 19 11:22:52 kernel: R13: 00007f9a6001c7d0 R14: 00007f9a9ab218f0 R15: 00007f99c4000a30 Apr 19 11:22:52 kernel: INFO: task task:14203 blocked for more than 120 seconds. Apr 19 11:22:52 kernel: Tainted: G OE 4.11.0-14-generic #20~16.04.1-Ubuntu Apr 19 11:22:52 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
Apr 19 11:22:52 kernel: task D 0 14203 14058 0x00000000 Apr 19 11:22:52 kernel: Call Trace: Apr 19 11:22:52 kernel: __schedule+0x3b9/0x8f0 Apr 19 11:22:52 kernel: schedule+0x36/0x80 Apr 19 11:22:52 kernel: wait_transaction_locked+0x8a/0xd0 Apr 19 11:22:52 kernel: ? wake_atomic_t_function+0x60/0x60 Apr 19 11:22:52 kernel: add_transaction_credits+0x1c1/0x2a0 Apr 19 11:22:52 kernel: ? schedule+0x36/0x80 Apr 19 11:22:52 kernel: start_this_handle+0x103/0x3f0 Apr 19 11:22:52 kernel: ? out_of_line_wait_on_bit+0x82/0xb0 Apr 19 11:22:52 kernel: ? kmem_cache_alloc+0xd7/0x1b0 Apr 19 11:22:52 kernel: jbd2__journal_start+0xdb/0x1f0 Apr 19 11:22:52 kernel: ? __ext4_new_inode+0x7b0/0x1420 Apr 19 11:22:52 kernel: __ext4_journal_start_sb+0x6d/0x120 Apr 19 11:22:52 kernel: __ext4_new_inode+0x7b0/0x1420 Apr 19 11:22:52 kernel: ext4_create+0x110/0x1b0 Apr 19 11:22:52 kernel: path_openat+0x133b/0x1450 Apr 19 11:22:52 kernel: do_filp_open+0x99/0x110 Apr 19 11:22:52 kernel: ? __check_object_size+0x108/0x19e Apr 19 11:22:52 kernel: ? __alloc_fd+0x46/0x170 Apr 19 11:22:52 kernel: do_sys_open+0x12d/0x280 Apr 19 11:22:52 kernel: ? do_sys_open+0x12d/0x280 Apr 19 11:22:52 kernel: SyS_open+0x1e/0x20 Apr 19 11:22:52 kernel: entry_SYSCALL_64_fastpath+0x1e/0xad Apr 19 11:22:52 kernel: RIP: 0033:0x7f9a9968ebfd Apr 19 11:22:52 kernel: RSP: 002b:00007f9a89a93bf0 EFLAGS: 00000293 ORIG_RAX: 0000000000000002 Apr 19 11:22:52 kernel: RAX: ffffffffffffffda RBX: 00007f9a6001c7b8 RCX: 00007f9a9968ebfd Apr 19 11:22:52 kernel: RDX: 0000000000000180 RSI: 0000000000000041 RDI: 00007f9a89a93d50 Apr 19 11:22:52 kernel: RBP: 00007f99c0001010 R08: 0000000000000000 R09: 0000000000000001 Apr 19 11:22:52 kernel: R10: 00000000000c9c34 R11: 0000000000000293 R12: 00007f99c0000e80 Apr 19 11:22:52 kernel: R13: 00007f9a6001c7d0 R14: 00007f9a9ab0d8c0 R15: 00007f99c0000a30 Apr 19 11:22:52 kernel: INFO: task task:14207 blocked for more than 120 seconds. Apr 19 11:22:52 kernel: Tainted: G OE 4.11.0-14-generic #20~16.04.1-Ubuntu Apr 19 11:22:52 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Apr 19 11:22:52 kernel: task D 0 14207 14058 0x00000000 Apr 19 11:22:52 kernel: Call Trace: Apr 19 11:22:52 kernel: __schedule+0x3b9/0x8f0 Apr 19 11:22:52 kernel: schedule+0x36/0x80 Apr 19 11:22:52 kernel: wait_transaction_locked+0x8a/0xd0 Apr 19 11:22:52 kernel: ? wake_atomic_t_function+0x60/0x60 Apr 19 11:22:52 kernel: add_transaction_credits+0x1c1/0x2a0 Apr 19 11:22:52 kernel: ? schedule+0x36/0x80 Apr 19 11:22:52 kernel: start_this_handle+0x103/0x3f0 Apr 19 11:22:52 kernel: ? out_of_line_wait_on_bit+0x82/0xb0 Apr 19 11:22:52 kernel: ? kmem_cache_alloc+0xd7/0x1b0 Apr 19 11:22:52 kernel: jbd2__journal_start+0xdb/0x1f0 Apr 19 11:22:52 kernel: ? __ext4_new_inode+0x7b0/0x1420 Apr 19 11:22:52 kernel: __ext4_journal_start_sb+0x6d/0x120 Apr 19 11:22:52 kernel: __ext4_new_inode+0x7b0/0x1420 Apr 19 11:22:52 kernel: ext4_create+0x110/0x1b0 Apr 19 11:22:52 kernel: path_openat+0x133b/0x1450 Apr 19 11:22:52 kernel: do_filp_open+0x99/0x110 Apr 19 11:22:52 kernel: ? __check_object_size+0x108/0x19e Apr 19 11:22:52 kernel: ? __alloc_fd+0x46/0x170 Apr 19 11:22:52 kernel: do_sys_open+0x12d/0x280 Apr 19 11:22:52 kernel: ? 
do_sys_open+0x12d/0x280 Apr 19 11:22:52 kernel: SyS_open+0x1e/0x20 Apr 19 11:22:52 kernel: entry_SYSCALL_64_fastpath+0x1e/0xad Apr 19 11:22:52 kernel: RIP: 0033:0x7f9a9968ebfd Apr 19 11:22:52 kernel: RSP: 002b:00007f9a89292bf0 EFLAGS: 00000293 ORIG_RAX: 0000000000000002 Apr 19 11:22:52 kernel: RAX: ffffffffffffffda RBX: 00007f9a6001c7b8 RCX: 00007f9a9968ebfd Apr 19 11:22:52 kernel: RDX: 0000000000000180 RSI: 0000000000000041 RDI: 00007f9a89292d50 Apr 19 11:22:52 kernel: RBP: 00007f99b4000f40 R08: 0000000000000000 R09: 0000000000000001 Apr 19 11:22:52 kernel: R10: 00000000000c2428 R11: 0000000000000293 R12: 00007f99b4000f00 Apr 19 11:22:52 kernel: R13: 00007f9a6001c7d0 R14: 00007f9a9ab07c48 R15: 00007f99b4000a60 Apr 19 11:22:52 kernel: INFO: task task:14213 blocked for more than 120 seconds. Apr 19 11:22:52 kernel: Tainted: G OE 4.11.0-14-generic #20~16.04.1-Ubuntu Apr 19 11:22:52 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Apr 19 11:22:52 kernel: task D 0 14213 14058 0x00000000 Apr 19 11:22:52 kernel: Call Trace: Apr 19 11:22:52 kernel: __schedule+0x3b9/0x8f0 Apr 19 11:22:52 kernel: schedule+0x36/0x80 Apr 19 11:22:52 kernel: wait_transaction_locked+0x8a/0xd0 Apr 19 11:22:52 kernel: ? wake_atomic_t_function+0x60/0x60 Apr 19 11:22:52 kernel: add_transaction_credits+0x1c1/0x2a0 Apr 19 11:22:52 kernel: ? schedule+0x36/0x80 Apr 19 11:22:52 kernel: start_this_handle+0x103/0x3f0 Apr 19 11:22:52 kernel: ? out_of_line_wait_on_bit+0x82/0xb0 Apr 19 11:22:52 kernel: ? kmem_cache_alloc+0xd7/0x1b0 Apr 19 11:22:52 kernel: jbd2__journal_start+0xdb/0x1f0 Apr 19 11:22:52 kernel: ? __ext4_new_inode+0x7b0/0x1420 Apr 19 11:22:52 kernel: __ext4_journal_start_sb+0x6d/0x120 Apr 19 11:22:52 kernel: __ext4_new_inode+0x7b0/0x1420 Apr 19 11:22:52 kernel: ext4_create+0x110/0x1b0 Apr 19 11:22:52 kernel: path_openat+0x133b/0x1450 Apr 19 11:22:52 kernel: do_filp_open+0x99/0x110 Apr 19 11:22:52 kernel: ? __check_object_size+0x108/0x19e Apr 19 11:22:52 kernel: ? __alloc_fd+0x46/0x170 Apr 19 11:22:52 kernel: do_sys_open+0x12d/0x280 Apr 19 11:22:52 kernel: ?
do_sys_open+0x12d/0x280 Apr 19 11:22:52 kernel: SyS_open+0x1e/0x20 Apr 19 11:22:52 kernel: entry_SYSCALL_64_fastpath+0x1e/0xad Apr 19 11:22:52 kernel: RIP: 0033:0x7f9a9968ebfd Apr 19 11:22:52 kernel: RSP: 002b:00007f9a86a8dbf0 EFLAGS: 00000293 ORIG_RAX: 0000000000000002 Apr 19 11:22:52 kernel: RAX: ffffffffffffffda RBX: 00007f9a6001c7b8 RCX: 00007f9a9968ebfd Apr 19 11:22:52 kernel: RDX: 0000000000000180 RSI: 0000000000000041 RDI: 00007f9a86a8dd50 Apr 19 11:22:52 kernel: RBP: 00007f99a8000f30 R08: 0000000000000000 R09: 0000000000000001 Apr 19 11:22:52 kernel: R10: 00000000000c595e R11: 0000000000000293 R12: 00007f99a8000ef0 Apr 19 11:22:52 kernel: R13: 00007f9a6001c7d0 R14: 00007f9a9ab2a780 R15: 00007f99a8000a30 Apr 19 11:22:52 kernel: INFO: task task:14238 blocked for more than 120 seconds. Apr 19 11:22:52 kernel: Tainted: G OE 4.11.0-14-generic #20~16.04.1-Ubuntu Apr 19 11:22:52 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Apr 19 11:22:52 kernel: task D 0 14238 14058 0x00000000 Apr 19 11:22:52 kernel: Call Trace: Apr 19 11:22:52 kernel: __schedule+0x3b9/0x8f0 Apr 19 11:22:52 kernel: schedule+0x36/0x80 Apr 19 11:22:52 kernel: wait_transaction_locked+0x8a/0xd0 Apr 19 11:22:52 kernel: ? wake_atomic_t_function+0x60/0x60 Apr 19 11:22:52 kernel: add_transaction_credits+0x1c1/0x2a0 Apr 19 11:22:52 kernel: ? __getblk_gfp+0x2f/0x350 Apr 19 11:22:52 kernel: start_this_handle+0x103/0x3f0 Apr 19 11:22:52 kernel: ? dquot_file_open+0x3d/0x50 Apr 19 11:22:52 kernel: ? kmem_cache_alloc+0xd7/0x1b0 Apr 19 11:22:52 kernel: jbd2__journal_start+0xdb/0x1f0 Apr 19 11:22:52 kernel: ? ext4_dirty_inode+0x32/0x70 Apr 19 11:22:52 kernel: __ext4_journal_start_sb+0x6d/0x120 Apr 19 11:22:52 kernel: ext4_dirty_inode+0x32/0x70 Apr 19 11:22:52 kernel: __mark_inode_dirty+0x176/0x370 Apr 19 11:22:52 kernel: generic_update_time+0x7b/0xd0 Apr 19 11:22:52 kernel: ? current_time+0x38/0x80 Apr 19 11:22:52 kernel: ? ext4_xattr_security_set+0x30/0x30 Apr 19 11:22:52 kernel: file_update_time+0xb7/0x110 Apr 19 11:22:52 kernel: ? ext4_xattr_security_set+0x30/0x30 Apr 19 11:22:52 kernel: __generic_file_write_iter+0x9d/0x1f0 Apr 19 11:22:52 kernel: ext4_file_write_iter+0x21a/0x3c0 Apr 19 11:22:52 kernel: ? __slab_free+0x178/0x2e0 Apr 19 11:22:52 kernel: new_sync_write+0xd3/0x130 Apr 19 11:22:52 kernel: __vfs_write+0x26/0x40 Apr 19 11:22:52 kernel: vfs_write+0xb8/0x1b0 Apr 19 11:22:52 kernel: ? 
do_sys_open+0x1b4/0x280 Apr 19 11:22:52 kernel: SyS_pwrite64+0x95/0xb0 Apr 19 11:22:52 kernel: entry_SYSCALL_64_fastpath+0x1e/0xad Apr 19 11:22:52 kernel: RIP: 0033:0x7f9a9968ed23 Apr 19 11:22:52 kernel: RSP: 002b:00007f9a7da7bb90 EFLAGS: 00000293 ORIG_RAX: 0000000000000012 Apr 19 11:22:52 kernel: RAX: ffffffffffffffda RBX: 00007f9a6001c7b8 RCX: 00007f9a9968ed23 Apr 19 11:22:52 kernel: RDX: 0000000000000018 RSI: 00007f9a7da7bcc0 RDI: 000000000000000c Apr 19 11:22:52 kernel: RBP: 00007f9964001060 R08: 00007f9a7da7bc18 R09: 00007f9a7da7bb30 Apr 19 11:22:52 kernel: R10: 00000000000000c0 R11: 0000000000000293 R12: 00007f9964001020 Apr 19 11:22:52 kernel: R13: 00007f9a6001c7d0 R14: 00007f9a9aafeb08 R15: 00007f9964000b60

Nothing was reported by hpsa. This stack trace had no visible consequence on the server: no offline or critical disk was reported by the hpsa utilities, and there was no heavy load. I had a similar stack trace:

Apr 20 14:57:18 kernel: INFO: task jbd2/sdt-8:10890 blocked for more than 120 seconds. Apr 20 14:57:18 kernel: Tainted: G OE 4.11.0-14-generic #20~16.04.1-Ubuntu Apr 20 14:57:18 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Apr 20 14:57:18 kernel: jbd2/sdt-8 D 0 10890 2 0x00000000 Apr 20 14:57:18 kernel: Call Trace: Apr 20 14:57:18 kernel: __schedule+0x3b9/0x8f0 Apr 20 14:57:18 kernel: schedule+0x36/0x80 Apr 20 14:57:18 kernel: jbd2_journal_commit_transaction+0x241/0x1830 Apr 20 14:57:18 kernel: ? update_load_avg+0x84/0x560 Apr 20 14:57:18 kernel: ? update_load_avg+0x84/0x560 Apr 20 14:57:18 kernel: ? dequeue_entity+0xed/0x4c0 Apr 20 14:57:18 kernel: ? wake_atomic_t_function+0x60/0x60 Apr 20 14:57:18 kernel: ? lock_timer_base+0x7d/0xa0 Apr 20 14:57:18 kernel: kjournald2+0xca/0x250 Apr 20 14:57:18 kernel: ? kjournald2+0xca/0x250 Apr 20 14:57:18 kernel: ? wake_atomic_t_function+0x60/0x60 Apr 20 14:57:18 kernel: kthread+0x109/0x140 Apr 20 14:57:18 kernel: ? commit_timeout+0x10/0x10 Apr 20 14:57:18 kernel: ? kthread_create_on_node+0x70/0x70 Apr 20 14:57:18 kernel: ret_from_fork+0x25/0x30 Apr 20 14:57:18 kernel: INFO: task task:13497 blocked for more than 120 seconds. Apr 20 14:57:18 kernel: Tainted: G OE 4.11.0-14-generic #20~16.04.1-Ubuntu Apr 20 14:57:18 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Apr 20 14:57:18 kernel: task D 0 13497 13196 0x00000000 Apr 20 14:57:18 kernel: Call Trace: Apr 20 14:57:18 kernel: __schedule+0x3b9/0x8f0 Apr 20 14:57:18 kernel: schedule+0x36/0x80 Apr 20 14:57:18 kernel: rwsem_down_write_failed+0x237/0x3b0 Apr 20 14:57:18 kernel: ? copy_page_to_iter_iovec+0x97/0x170 Apr 20 14:57:18 kernel: call_rwsem_down_write_failed+0x17/0x30 Apr 20 14:57:18 kernel: ? call_rwsem_down_write_failed+0x17/0x30 Apr 20 14:57:18 kernel: down_write+0x2d/0x40 Apr 20 14:57:18 kernel: ext4_file_write_iter+0x70/0x3c0 Apr 20 14:57:18 kernel: ?
futex_wake+0x90/0x170 Apr 20 14:57:18 kernel: new_sync_write+0xd3/0x130 Apr 20 14:57:18 kernel: __vfs_write+0x26/0x40 Apr 20 14:57:18 kernel: vfs_write+0xb8/0x1b0 Apr 20 14:57:18 kernel: SyS_pwrite64+0x95/0xb0 Apr 20 14:57:18 kernel: entry_SYSCALL_64_fastpath+0x1e/0xad Apr 20 14:57:18 kernel: RIP: 0033:0x7fa085d92d23 Apr 20 14:57:18 kernel: RSP: 002b:00007fa0801acc90 EFLAGS: 00000293 ORIG_RAX: 0000000000000012 Apr 20 14:57:18 kernel: RAX: ffffffffffffffda RBX: 00007fa0480009d0 RCX: 00007fa085d92d23 Apr 20 14:57:18 kernel: RDX: 0000000000000200 RSI: 00007fa004000b30 RDI: 000000000000000f Apr 20 14:57:18 kernel: RBP: 00007fa0801ad060 R08: 00007fa0801acd2c R09: 0000000000000001 Apr 20 14:57:18 kernel: R10: 00000001f86be000 R11: 0000000000000293 R12: 00007fa0040014c0 Apr 20 14:57:18 kernel: R13: 00007fa004000d80 R14: 000000000000002e R15: 00007fa0480009d0 Apr 20 14:57:18 kernel: INFO: task task:13499 blocked for more than 120 seconds. Apr 20 14:57:18 kernel: Tainted: G OE 4.11.0-14-generic #20~16.04.1-Ubuntu Apr 20 14:57:18 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Apr 20 14:57:18 kernel: task D 0 13499 13196 0x00000000 Apr 20 14:57:18 kernel: Call Trace: Apr 20 14:57:18 kernel: __schedule+0x3b9/0x8f0 Apr 20 14:57:18 kernel: schedule+0x36/0x80 Apr 20 14:57:18 kernel: rwsem_down_write_failed+0x237/0x3b0 Apr 20 14:57:18 kernel: ? copy_page_to_iter_iovec+0x97/0x170 Apr 20 14:57:18 kernel: call_rwsem_down_write_failed+0x17/0x30 Apr 20 14:57:18 kernel: ? call_rwsem_down_write_failed+0x17/0x30 Apr 20 14:57:18 kernel: down_write+0x2d/0x40 Apr 20 14:57:18 kernel: ext4_file_write_iter+0x70/0x3c0 Apr 20 14:57:18 kernel: ? futex_wake+0x90/0x170 Apr 20 14:57:18 kernel: new_sync_write+0xd3/0x130 Apr 20 14:57:18 kernel: __vfs_write+0x26/0x40 Apr 20 14:57:18 kernel: vfs_write+0xb8/0x1b0 Apr 20 14:57:18 kernel: SyS_pwrite64+0x95/0xb0 Apr 20 14:57:18 kernel: entry_SYSCALL_64_fastpath+0x1e/0xad Apr 20 14:57:18 kernel: RIP: 0033:0x7fa085d92d23 Apr 20 14:57:18 kernel: RSP: 002b:00007fa07f9abc90 EFLAGS: 00000293 ORIG_RAX: 0000000000000012 Apr 20 14:57:18 kernel: RAX: ffffffffffffffda RBX: 00007f9fac008d00 RCX: 00007fa085d92d23 Apr 20 14:57:18 kernel: RDX: 0000000000000200 RSI: 00007fa0080013b0 RDI: 000000000000000f Apr 20 14:57:18 kernel: RBP: 00007fa07f9ac060 R08: 00007fa07f9abd2c R09: 0000000000000001 Apr 20 14:57:18 kernel: R10: 0000000219541000 R11: 0000000000000293 R12: 00007fa008001140 Apr 20 14:57:18 kernel: R13: 00007fa0080008c0 R14: 000000000000002e R15: 00007f9fac008d00 Apr 20 14:57:18 kernel: INFO: task task:13510 blocked for more than 120 seconds. Apr 20 14:57:18 kernel: Tainted: G OE 4.11.0-14-generic #20~16.04.1-Ubuntu Apr 20 14:57:18 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Apr 20 14:57:18 kernel: task D 0 13510 13196 0x00000000 Apr 20 14:57:18 kernel: Call Trace: Apr 20 14:57:18 kernel: __schedule+0x3b9/0x8f0 Apr 20 14:57:18 kernel: schedule+0x36/0x80 Apr 20 14:57:18 kernel: rwsem_down_write_failed+0x237/0x3b0 Apr 20 14:57:18 kernel: ? copy_page_to_iter_iovec+0x97/0x170 Apr 20 14:57:18 kernel: call_rwsem_down_write_failed+0x17/0x30 Apr 20 14:57:18 kernel: ? call_rwsem_down_write_failed+0x17/0x30 Apr 20 14:57:18 kernel: down_write+0x2d/0x40 Apr 20 14:57:18 kernel: ext4_file_write_iter+0x70/0x3c0 Apr 20 14:57:18 kernel: ? 
futex_wake+0x90/0x170 Apr 20 14:57:18 kernel: new_sync_write+0xd3/0x130 Apr 20 14:57:18 kernel: __vfs_write+0x26/0x40 Apr 20 14:57:18 kernel: vfs_write+0xb8/0x1b0 Apr 20 14:57:18 kernel: SyS_pwrite64+0x95/0xb0 Apr 20 14:57:18 kernel: entry_SYSCALL_64_fastpath+0x1e/0xad Apr 20 14:57:18 kernel: RIP: 0033:0x7fa085d92d23 Apr 20 14:57:18 kernel: RSP: 002b:00007fa07a9a1c90 EFLAGS: 00000293 ORIG_RAX: 0000000000000012 Apr 20 14:57:18 kernel: RAX: ffffffffffffffda RBX: 00007f9fe0007700 RCX: 00007fa085d92d23 Apr 20 14:57:18 kernel: RDX: 0000000000000200 RSI: 00007f9fe0000b30 RDI: 000000000000000f Apr 20 14:57:18 kernel: RBP: 00007f9fe0007970 R08: 00007fa07a9a1d2c R09: 0000000000000001 Apr 20 14:57:18 kernel: R10: 00000001b890c000 R11: 0000000000000293 R12: 0000000000000270 Apr 20 14:57:18 kernel: R13: 0000000000000060 R14: 00007f9fe0007ec0 R15: 00007f9fe0000020 Apr 20 14:57:18 kernel: INFO: task task:13585 blocked for more than 120 seconds. Apr 20 14:57:18 kernel: Tainted: G OE 4.11.0-14-generic #20~16.04.1-Ubuntu Apr 20 14:57:18 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Apr 20 14:57:18 kernel: task D 0 13585 13196 0x00000000 Apr 20 14:57:18 kernel: Call Trace: Apr 20 14:57:18 kernel: __schedule+0x3b9/0x8f0 Apr 20 14:57:18 kernel: ? __dev_queue_xmit+0x268/0x680 Apr 20 14:57:18 kernel: schedule+0x36/0x80 Apr 20 14:57:18 kernel: wait_transaction_locked+0x8a/0xd0 Apr 20 14:57:18 kernel: ? wake_atomic_t_function+0x60/0x60 Apr 20 14:57:18 kernel: add_transaction_credits+0x1c1/0x2a0 Apr 20 14:57:18 kernel: start_this_handle+0x103/0x3f0 Apr 20 14:57:18 kernel: ? kmem_cache_alloc+0xd7/0x1b0 Apr 20 14:57:18 kernel: jbd2__journal_start+0xdb/0x1f0 Apr 20 14:57:18 kernel: ? ext4_dirty_inode+0x32/0x70 Apr 20 14:57:18 kernel: __ext4_journal_start_sb+0x6d/0x120 Apr 20 14:57:18 kernel: ext4_dirty_inode+0x32/0x70 Apr 20 14:57:18 kernel: __mark_inode_dirty+0x176/0x370 Apr 20 14:57:18 kernel: generic_update_time+0x7b/0xd0 Apr 20 14:57:18 kernel: ? current_time+0x38/0x80 Apr 20 14:57:18 kernel: file_update_time+0xb7/0x110 Apr 20 14:57:18 kernel: __generic_file_write_iter+0x9d/0x1f0 Apr 20 14:57:18 kernel: ext4_file_write_iter+0x21a/0x3c0 Apr 20 14:57:18 kernel: ? futex_wake+0x90/0x170 Apr 20 14:57:18 kernel: new_sync_write+0xd3/0x130 Apr 20 14:57:18 kernel: __vfs_write+0x26/0x40 Apr 20 14:57:18 kernel: vfs_write+0xb8/0x1b0 Apr 20 14:57:18 kernel: ? SyS_futex+0x7f/0x180 Apr 20 14:57:18 kernel: SyS_pwrite64+0x95/0xb0 Apr 20 14:57:18 kernel: entry_SYSCALL_64_fastpath+0x1e/0xad Apr 20 14:57:18 kernel: RIP: 0033:0x7fa085d92d23 Apr 20 14:57:18 kernel: RSP: 002b:00007fa06c184d70 EFLAGS: 00000293 ORIG_RAX: 0000000000000012 Apr 20 14:57:18 kernel: RAX: ffffffffffffffda RBX: 00007f9f640014e0 RCX: 00007fa085d92d23 Apr 20 14:57:18 kernel: RDX: 0000000000000200 RSI: 00007f9f64001750 RDI: 000000000000000f Apr 20 14:57:18 kernel: RBP: 00007f9f640017b8 R08: 00007fa06c184e0c R09: 0000000000000001 Apr 20 14:57:18 kernel: R10: 0000000166f7b000 R11: 0000000000000293 R12: 0000000000000024 Apr 20 14:57:18 kernel: R13: 0000005d6f4d6d18 R14: 000000000000001c R15: 00007f9f8c005d60 Apr 20 14:57:18 kernel: INFO: task statsync:17472 blocked for more than 120 seconds. Apr 20 14:57:18 kernel: Tainted: G OE 4.11.0-14-generic #20~16.04.1-Ubuntu Apr 20 14:57:18 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
Apr 20 14:57:18 kernel: statsync D 0 17472 13196 0x00000000 Apr 20 14:57:18 kernel: Call Trace: Apr 20 14:57:18 kernel: __schedule+0x3b9/0x8f0 Apr 20 14:57:18 kernel: schedule+0x36/0x80 Apr 20 14:57:18 kernel: rwsem_down_write_failed+0x237/0x3b0 Apr 20 14:57:18 kernel: call_rwsem_down_write_failed+0x17/0x30 Apr 20 14:57:18 kernel: ? call_rwsem_down_write_failed+0x17/0x30 Apr 20 14:57:18 kernel: down_write+0x2d/0x40 Apr 20 14:57:18 kernel: ext4_file_write_iter+0x70/0x3c0 Apr 20 14:57:18 kernel: new_sync_write+0xd3/0x130 Apr 20 14:57:18 kernel: __vfs_write+0x26/0x40 Apr 20 14:57:18 kernel: vfs_write+0xb8/0x1b0 Apr 20 14:57:18 kernel: ? SyS_futex+0x7f/0x180 Apr 20 14:57:18 kernel: SyS_pwrite64+0x95/0xb0 Apr 20 14:57:18 kernel: entry_SYSCALL_64_fastpath+0x1e/0xad Apr 20 14:57:18 kernel: RIP: 0033:0x7fa085d92d23 Apr 20 14:57:18 kernel: RSP: 002b:00007fa050ff8020 EFLAGS: 00000293 ORIG_RAX: 0000000000000012 Apr 20 14:57:18 kernel: RAX: ffffffffffffffda RBX: 00007fa050ff80c0 RCX: 00007fa085d92d23 Apr 20 14:57:18 kernel: RDX: 0000000000000038 RSI: 00007fa050ff8130 RDI: 000000000000000f Apr 20 14:57:18 kernel: RBP: 00007fa04c00fcd0 R08: 00007fa050ff80bc R09: 0000000000000001 Apr 20 14:57:18 kernel: R10: 0000000000000000 R11: 0000000000000293 R12: 00007fa04c0008c0 Apr 20 14:57:18 kernel: R13: 00007fa050ff80c0 R14: 00007fa04c01b560 R15: 00007fa04c000998 Apr 20 14:57:18 kernel: INFO: task kworker/u130:6:47634 blocked for more than 120 seconds. Apr 20 14:57:18 kernel: Tainted: G OE 4.11.0-14-generic #20~16.04.1-Ubuntu Apr 20 14:57:18 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Apr 20 14:57:18 kernel: kworker/u130:6 D 0 47634 2 0x00000000 Apr 20 14:57:18 kernel: Workqueue: writeback wb_workfn (flush-65:48) Apr 20 14:57:18 kernel: Call Trace: Apr 20 14:57:18 kernel: __schedule+0x3b9/0x8f0 Apr 20 14:57:18 kernel: ? blk_queue_bio+0x1df/0x430 Apr 20 14:57:18 kernel: schedule+0x36/0x80 Apr 20 14:57:18 kernel: wait_transaction_locked+0x8a/0xd0 Apr 20 14:57:18 kernel: ? wake_atomic_t_function+0x60/0x60 Apr 20 14:57:18 kernel: add_transaction_credits+0x1c1/0x2a0 Apr 20 14:57:18 kernel: ? __slab_alloc+0x20/0x40 Apr 20 14:57:18 kernel: start_this_handle+0x103/0x3f0 Apr 20 14:57:18 kernel: ? mempool_alloc+0x6e/0x170 Apr 20 14:57:18 kernel: ? kmem_cache_alloc+0xd7/0x1b0 Apr 20 14:57:18 kernel: jbd2__journal_start+0xdb/0x1f0 Apr 20 14:57:18 kernel: ? ext4_writepages+0x4e6/0xe20 Apr 20 14:57:18 kernel: __ext4_journal_start_sb+0x6d/0x120 Apr 20 14:57:18 kernel: ext4_writepages+0x4e6/0xe20 Apr 20 14:57:18 kernel: ? swiotlb_unmap_sg_attrs+0x40/0x70 Apr 20 14:57:18 kernel: ? scsi_dma_map+0x98/0xc0 Apr 20 14:57:18 kernel: ? __enqueue_cmd_and_start_io.isra.41+0x77/0x150 [hpsa] Apr 20 14:57:18 kernel: ? enqueue_cmd_and_start_io+0x18/0x80 [hpsa] Apr 20 14:57:18 kernel: ? hpsa_ciss_submit+0x31b/0x400 [hpsa] Apr 20 14:57:18 kernel: do_writepages+0x1e/0x30 Apr 20 14:57:18 kernel: ? do_writepages+0x1e/0x30 Apr 20 14:57:18 kernel: __writeback_single_inode+0x45/0x330 Apr 20 14:57:18 kernel: writeback_sb_inodes+0x26a/0x5f0 Apr 20 14:57:18 kernel: __writeback_inodes_wb+0x92/0xc0 Apr 20 14:57:18 kernel: wb_writeback+0x26e/0x320 Apr 20 14:57:18 kernel: wb_workfn+0x2cf/0x3a0 Apr 20 14:57:18 kernel: ? wb_workfn+0x2cf/0x3a0 Apr 20 14:57:18 kernel: process_one_work+0x16b/0x4a0 Apr 20 14:57:18 kernel: worker_thread+0x4b/0x500 Apr 20 14:57:18 kernel: kthread+0x109/0x140 Apr 20 14:57:18 kernel: ? process_one_work+0x4a0/0x4a0 Apr 20 14:57:18 kernel: ? 
kthread_create_on_node+0x70/0x70 Apr 20 14:57:18 kernel: ret_from_fork+0x25/0x30 And once again, nothing visible happened on the server: the hpsa utilities reported no offline or critical disk, and there was no heavy load. But this trace does directly involve the hpsa module and the workqueue. So I have reproduced the problem with the patched driver. At the beginning, one disk returned a lot of blk_update_request: critical medium error / Unrecovered read error messages, and afterwards the driver triggered a logical reset on all disks. On the first pass, every reset completed successfully, but on the third reset of the problematic disk the system hung and the reset never completed. The load on the server is lower this time, but applications still seem to have their I/O stuck. And the faulty disk is still reported as healthy by the HP utilities (ssacli). Here is the kernel log: [Fri Apr 20 20:56:58 2018] sd 0:1:0:15: [sdp] Unaligned partial completion (resid=32, sector_sz=512) [Fri Apr 20 20:56:58 2018] sd 0:1:0:15: [sdp] tag#50 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE [Fri Apr 20 20:56:58 2018] sd 0:1:0:15: [sdp] tag#50 Sense Key : Medium Error [current] [Fri Apr 20 20:56:58 2018] sd 0:1:0:15: [sdp] tag#50 Add. Sense: Unrecovered read error [Fri Apr 20 20:56:58 2018] sd 0:1:0:15: [sdp] tag#50 CDB: Read(16) 88 00 00 00 00 02 36 46 b5 a8 00 00 04 00 00 00 [Fri Apr 20 20:56:58 2018] blk_update_request: critical medium error, dev sdp, sector 9500538280 [Fri Apr 20 20:57:30 2018] hpsa 0000:08:00.0: scsi 0:1:0:15: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 20:59:06 2018] hpsa 0000:08:00.0: device is ready. [Fri Apr 20 20:59:06 2018] hpsa 0000:08:00.0: scsi 0:1:0:15: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 21:00:05 2018] sd 0:1:0:15: [sdp] Unaligned partial completion (resid=198, sector_sz=512) [Fri Apr 20 21:00:05 2018] sd 0:1:0:15: [sdp] tag#7 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE [Fri Apr 20 21:00:05 2018] sd 0:1:0:15: [sdp] tag#7 Sense Key : Medium Error [current] [Fri Apr 20 21:00:05 2018] sd 0:1:0:15: [sdp] tag#7 Add. Sense: Unrecovered read error [Fri Apr 20 21:00:05 2018] sd 0:1:0:15: [sdp] tag#7 CDB: Read(16) 88 00 00 00 00 02 36 46 b9 a8 00 00 04 00 00 00 [Fri Apr 20 21:00:05 2018] blk_update_request: critical medium error, dev sdp, sector 9500539304 [Fri Apr 20 21:00:56 2018] sd 0:1:0:15: [sdp] Unaligned partial completion (resid=48, sector_sz=512) [Fri Apr 20 21:00:56 2018] sd 0:1:0:15: [sdp] tag#2 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE [Fri Apr 20 21:00:56 2018] sd 0:1:0:15: [sdp] tag#2 Sense Key : Medium Error [current] [Fri Apr 20 21:00:56 2018] sd 0:1:0:15: [sdp] tag#2 Add. Sense: Unrecovered read error [Fri Apr 20 21:00:56 2018] sd 0:1:0:15: [sdp] tag#2 CDB: Read(16) 88 00 00 00 00 02 36 46 a9 a8 00 00 04 00 00 00 [Fri Apr 20 21:00:56 2018] blk_update_request: critical medium error, dev sdp, sector 9500535208 [Fri Apr 20 21:09:59 2018] hpsa 0000:08:00.0: scsi 0:1:0:15: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 21:48:43 2018] hpsa 0000:08:00.0: device is ready.
[Fri Apr 20 21:48:43 2018] hpsa 0000:08:00.0: scsi 0:1:0:15: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 21:51:44 2018] hpsa 0000:08:00.0: scsi 0:1:0:0: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:05 2018] hpsa 0000:08:00.0: device is ready. [Fri Apr 20 22:14:05 2018] hpsa 0000:08:00.0: scsi 0:1:0:0: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:05 2018] hpsa 0000:08:00.0: scsi 0:1:0:1: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:06 2018] hpsa 0000:08:00.0: device is ready. [Fri Apr 20 22:14:06 2018] hpsa 0000:08:00.0: scsi 0:1:0:1: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:06 2018] hpsa 0000:08:00.0: scsi 0:1:0:2: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:07 2018] hpsa 0000:08:00.0: device is ready. [Fri Apr 20 22:14:07 2018] hpsa 0000:08:00.0: scsi 0:1:0:2: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:07 2018] hpsa 0000:08:00.0: scsi 0:1:0:3: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:08 2018] hpsa 0000:08:00.0: device is ready. [Fri Apr 20 22:14:08 2018] hpsa 0000:08:00.0: scsi 0:1:0:3: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:08 2018] hpsa 0000:08:00.0: scsi 0:1:0:4: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:09 2018] hpsa 0000:08:00.0: device is ready. [Fri Apr 20 22:14:09 2018] hpsa 0000:08:00.0: scsi 0:1:0:4: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:09 2018] hpsa 0000:08:00.0: scsi 0:1:0:6: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:10 2018] hpsa 0000:08:00.0: device is ready. [Fri Apr 20 22:14:10 2018] hpsa 0000:08:00.0: scsi 0:1:0:6: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:10 2018] hpsa 0000:08:00.0: scsi 0:1:0:7: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:11 2018] hpsa 0000:08:00.0: device is ready. [Fri Apr 20 22:14:11 2018] hpsa 0000:08:00.0: scsi 0:1:0:7: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:11 2018] hpsa 0000:08:00.0: scsi 0:1:0:8: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:12 2018] hpsa 0000:08:00.0: device is ready. [Fri Apr 20 22:14:12 2018] hpsa 0000:08:00.0: scsi 0:1:0:8: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:12 2018] hpsa 0000:08:00.0: scsi 0:1:0:9: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:13 2018] hpsa 0000:08:00.0: device is ready. 
[Fri Apr 20 22:14:13 2018] hpsa 0000:08:00.0: scsi 0:1:0:9: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:13 2018] hpsa 0000:08:00.0: scsi 0:1:0:10: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:14 2018] hpsa 0000:08:00.0: device is ready. [Fri Apr 20 22:14:14 2018] hpsa 0000:08:00.0: scsi 0:1:0:10: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:14 2018] hpsa 0000:08:00.0: scsi 0:1:0:11: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:15 2018] hpsa 0000:08:00.0: device is ready. [Fri Apr 20 22:14:15 2018] hpsa 0000:08:00.0: scsi 0:1:0:11: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:15 2018] hpsa 0000:08:00.0: scsi 0:1:0:12: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:16 2018] hpsa 0000:08:00.0: device is ready. [Fri Apr 20 22:14:16 2018] hpsa 0000:08:00.0: scsi 0:1:0:12: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:16 2018] hpsa 0000:08:00.0: scsi 0:1:0:13: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:17 2018] hpsa 0000:08:00.0: device is ready. [Fri Apr 20 22:14:17 2018] hpsa 0000:08:00.0: scsi 0:1:0:13: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:17 2018] hpsa 0000:08:00.0: scsi 0:1:0:15: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:18 2018] hpsa 0000:08:00.0: device is ready. [Fri Apr 20 22:14:18 2018] hpsa 0000:08:00.0: scsi 0:1:0:15: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:18 2018] hpsa 0000:08:00.0: scsi 0:1:0:16: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:19 2018] hpsa 0000:08:00.0: device is ready. [Fri Apr 20 22:14:19 2018] hpsa 0000:08:00.0: scsi 0:1:0:16: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:19 2018] hpsa 0000:08:00.0: scsi 0:1:0:17: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:20 2018] hpsa 0000:08:00.0: device is ready. [Fri Apr 20 22:14:20 2018] hpsa 0000:08:00.0: scsi 0:1:0:17: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:20 2018] hpsa 0000:08:00.0: scsi 0:1:0:18: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:21 2018] hpsa 0000:08:00.0: device is ready. [Fri Apr 20 22:14:21 2018] hpsa 0000:08:00.0: scsi 0:1:0:18: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:21 2018] hpsa 0000:08:00.0: scsi 0:1:0:19: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:22 2018] hpsa 0000:08:00.0: device is ready. 
[Fri Apr 20 22:14:22 2018] hpsa 0000:08:00.0: scsi 0:1:0:19: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:22 2018] hpsa 0000:08:00.0: scsi 0:1:0:20: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:23 2018] hpsa 0000:08:00.0: device is ready. [Fri Apr 20 22:14:23 2018] hpsa 0000:08:00.0: scsi 0:1:0:20: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:23 2018] hpsa 0000:08:00.0: scsi 0:1:0:21: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:24 2018] hpsa 0000:08:00.0: device is ready. [Fri Apr 20 22:14:24 2018] hpsa 0000:08:00.0: scsi 0:1:0:21: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:24 2018] hpsa 0000:08:00.0: scsi 0:1:0:22: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:25 2018] hpsa 0000:08:00.0: device is ready. [Fri Apr 20 22:14:25 2018] hpsa 0000:08:00.0: scsi 0:1:0:22: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:25 2018] hpsa 0000:08:00.0: scsi 0:1:0:23: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:26 2018] hpsa 0000:08:00.0: device is ready. [Fri Apr 20 22:14:26 2018] hpsa 0000:08:00.0: scsi 0:1:0:23: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:26 2018] hpsa 0000:08:00.0: scsi 0:1:0:25: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:27 2018] hpsa 0000:08:00.0: device is ready. [Fri Apr 20 22:14:27 2018] hpsa 0000:08:00.0: scsi 0:1:0:25: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:27 2018] hpsa 0000:08:00.0: scsi 0:1:0:26: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:28 2018] hpsa 0000:08:00.0: device is ready. [Fri Apr 20 22:14:28 2018] hpsa 0000:08:00.0: scsi 0:1:0:26: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:28 2018] hpsa 0000:08:00.0: scsi 0:1:0:28: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:29 2018] hpsa 0000:08:00.0: device is ready. [Fri Apr 20 22:14:29 2018] hpsa 0000:08:00.0: scsi 0:1:0:28: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:29 2018] hpsa 0000:08:00.0: scsi 0:1:0:29: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:30 2018] hpsa 0000:08:00.0: device is ready. [Fri Apr 20 22:14:30 2018] hpsa 0000:08:00.0: scsi 0:1:0:29: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:30 2018] hpsa 0000:08:00.0: scsi 0:1:0:30: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:31 2018] hpsa 0000:08:00.0: device is ready. 
[Fri Apr 20 22:14:31 2018] hpsa 0000:08:00.0: scsi 0:1:0:30: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:31 2018] hpsa 0000:08:00.0: scsi 0:1:0:31: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:32 2018] hpsa 0000:08:00.0: device is ready. [Fri Apr 20 22:14:32 2018] hpsa 0000:08:00.0: scsi 0:1:0:31: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:32 2018] hpsa 0000:08:00.0: scsi 0:1:0:32: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:33 2018] hpsa 0000:08:00.0: device is ready. [Fri Apr 20 22:14:33 2018] hpsa 0000:08:00.0: scsi 0:1:0:32: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:33 2018] hpsa 0000:08:00.0: scsi 0:1:0:33: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:34 2018] hpsa 0000:08:00.0: device is ready. [Fri Apr 20 22:14:34 2018] hpsa 0000:08:00.0: scsi 0:1:0:33: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:34 2018] hpsa 0000:08:00.0: scsi 0:1:0:34: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:35 2018] hpsa 0000:08:00.0: device is ready. [Fri Apr 20 22:14:35 2018] hpsa 0000:08:00.0: scsi 0:1:0:34: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:35 2018] hpsa 0000:08:00.0: scsi 0:1:0:35: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:36 2018] hpsa 0000:08:00.0: device is ready. [Fri Apr 20 22:14:36 2018] hpsa 0000:08:00.0: scsi 0:1:0:35: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:36 2018] hpsa 0000:08:00.0: scsi 0:1:0:36: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:37 2018] hpsa 0000:08:00.0: device is ready. [Fri Apr 20 22:14:37 2018] hpsa 0000:08:00.0: scsi 0:1:0:36: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:37 2018] hpsa 0000:08:00.0: scsi 0:1:0:37: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:39 2018] hpsa 0000:08:00.0: device is ready. [Fri Apr 20 22:14:39 2018] hpsa 0000:08:00.0: scsi 0:1:0:37: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:39 2018] hpsa 0000:08:00.0: scsi 0:1:0:39: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:40 2018] hpsa 0000:08:00.0: device is ready. [Fri Apr 20 22:14:40 2018] hpsa 0000:08:00.0: scsi 0:1:0:39: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:40 2018] hpsa 0000:08:00.0: scsi 0:1:0:40: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:41 2018] hpsa 0000:08:00.0: device is ready. 
[Fri Apr 20 22:14:41 2018] hpsa 0000:08:00.0: scsi 0:1:0:40: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:41 2018] hpsa 0000:08:00.0: scsi 0:1:0:41: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:42 2018] hpsa 0000:08:00.0: device is ready. [Fri Apr 20 22:14:42 2018] hpsa 0000:08:00.0: scsi 0:1:0:41: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:42 2018] hpsa 0000:08:00.0: scsi 0:1:0:42: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:43 2018] hpsa 0000:08:00.0: device is ready. [Fri Apr 20 22:14:43 2018] hpsa 0000:08:00.0: scsi 0:1:0:42: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:43 2018] hpsa 0000:08:00.0: scsi 0:1:0:43: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:44 2018] hpsa 0000:08:00.0: device is ready. [Fri Apr 20 22:14:44 2018] hpsa 0000:08:00.0: scsi 0:1:0:43: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:44 2018] hpsa 0000:08:00.0: scsi 0:1:0:44: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:45 2018] hpsa 0000:08:00.0: device is ready. [Fri Apr 20 22:14:45 2018] hpsa 0000:08:00.0: scsi 0:1:0:44: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:45 2018] hpsa 0000:08:00.0: scsi 0:1:0:45: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:46 2018] hpsa 0000:08:00.0: device is ready. [Fri Apr 20 22:14:46 2018] hpsa 0000:08:00.0: scsi 0:1:0:45: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:46 2018] hpsa 0000:08:00.0: scsi 0:1:0:47: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:47 2018] hpsa 0000:08:00.0: device is ready. [Fri Apr 20 22:14:47 2018] hpsa 0000:08:00.0: scsi 0:1:0:47: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:47 2018] hpsa 0000:08:00.0: scsi 0:1:0:48: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:48 2018] hpsa 0000:08:00.0: device is ready. [Fri Apr 20 22:14:48 2018] hpsa 0000:08:00.0: scsi 0:1:0:48: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:48 2018] hpsa 0000:08:00.0: scsi 0:1:0:49: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:49 2018] hpsa 0000:08:00.0: device is ready. [Fri Apr 20 22:14:49 2018] hpsa 0000:08:00.0: scsi 0:1:0:49: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:49 2018] hpsa 0000:08:00.0: scsi 0:1:0:50: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:50 2018] hpsa 0000:08:00.0: device is ready. 
[Fri Apr 20 22:14:50 2018] hpsa 0000:08:00.0: scsi 0:1:0:50: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:50 2018] hpsa 0000:08:00.0: scsi 0:1:0:51: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:51 2018] hpsa 0000:08:00.0: device is ready. [Fri Apr 20 22:14:51 2018] hpsa 0000:08:00.0: scsi 0:1:0:51: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:51 2018] hpsa 0000:08:00.0: scsi 0:1:0:52: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:52 2018] hpsa 0000:08:00.0: device is ready. [Fri Apr 20 22:14:52 2018] hpsa 0000:08:00.0: scsi 0:1:0:52: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:52 2018] hpsa 0000:08:00.0: scsi 0:1:0:54: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:53 2018] hpsa 0000:08:00.0: device is ready. [Fri Apr 20 22:14:53 2018] hpsa 0000:08:00.0: scsi 0:1:0:54: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:53 2018] hpsa 0000:08:00.0: scsi 0:1:0:55: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:54 2018] hpsa 0000:08:00.0: device is ready. [Fri Apr 20 22:14:54 2018] hpsa 0000:08:00.0: scsi 0:1:0:55: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:54 2018] hpsa 0000:08:00.0: scsi 0:1:0:56: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:55 2018] hpsa 0000:08:00.0: device is ready. [Fri Apr 20 22:14:55 2018] hpsa 0000:08:00.0: scsi 0:1:0:56: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:55 2018] hpsa 0000:08:00.0: scsi 0:1:0:57: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:56 2018] hpsa 0000:08:00.0: device is ready. [Fri Apr 20 22:14:56 2018] hpsa 0000:08:00.0: scsi 0:1:0:57: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:56 2018] hpsa 0000:08:00.0: scsi 0:1:0:58: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:57 2018] hpsa 0000:08:00.0: device is ready. [Fri Apr 20 22:14:57 2018] hpsa 0000:08:00.0: scsi 0:1:0:58: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:57 2018] hpsa 0000:08:00.0: scsi 0:1:0:59: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:14:58 2018] hpsa 0000:08:00.0: device is ready. [Fri Apr 20 22:14:58 2018] hpsa 0000:08:00.0: scsi 0:1:0:59: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:18:05 2018] hpsa 0000:08:00.0: scsi 0:1:0:15: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:51:00 2018] hpsa 0000:08:00.0: device is ready. 
[Fri Apr 20 22:51:00 2018] hpsa 0000:08:00.0: scsi 0:1:0:15: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 22:53:25 2018] sd 0:1:0:15: [sdp] tag#164 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE [Fri Apr 20 22:53:25 2018] sd 0:1:0:15: [sdp] tag#164 Sense Key : Medium Error [current] [Fri Apr 20 22:53:25 2018] sd 0:1:0:15: [sdp] tag#164 Add. Sense: Unrecovered read error [Fri Apr 20 22:53:25 2018] sd 0:1:0:15: [sdp] tag#164 CDB: Read(16) 88 00 00 00 00 02 87 8d b4 e0 00 00 04 00 00 00 [Fri Apr 20 22:53:25 2018] blk_update_request: critical medium error, dev sdp, sector 10864145632 [Fri Apr 20 22:55:11 2018] hpsa 0000:08:00.0: scsi 0:1:0:15: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 23:12:20 2018] hpsa 0000:08:00.0: device is ready. [Fri Apr 20 23:12:20 2018] hpsa 0000:08:00.0: scsi 0:1:0:15: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [Fri Apr 20 23:14:28 2018] sd 0:1:0:15: [sdp] Unaligned partial completion (resid=48, sector_sz=512) [Fri Apr 20 23:14:28 2018] sd 0:1:0:15: [sdp] tag#25 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE [Fri Apr 20 23:14:28 2018] sd 0:1:0:15: [sdp] tag#25 Sense Key : Medium Error [current] [Fri Apr 20 23:14:28 2018] sd 0:1:0:15: [sdp] tag#25 Add. Sense: Unrecovered read error [Fri Apr 20 23:14:28 2018] sd 0:1:0:15: [sdp] tag#25 CDB: Read(16) 88 00 00 00 00 02 87 8d b8 e0 00 00 04 00 00 00 [Fri Apr 20 23:14:28 2018] blk_update_request: critical medium error, dev sdp, sector 10864146656 [Fri Apr 20 23:16:15 2018] hpsa 0000:08:00.0: scsi 0:1:0:15: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 When you applied the 4.16 hpsa driver patches, was this patch also applied? commit 84676c1f21e8ff54befe985f4f14dc1edc10046b Author: Christoph Hellwig <hch@lst.de> Date: Fri Jan 12 10:53:05 2018 +0800 genirq/affinity: assign vectors to all possible CPUs Currently we assign managed interrupt vectors to all present CPUs. This works fine for systems were we only online/offline CPUs. But in case of systems that support physical CPU hotplug (or the virtualized version of it) this means the additional CPUs covered for in the ACPI tables or on the command line are not catered for. To fix this we'd either need to introduce new hotplug CPU states just for this case, or we can start assining vectors to possible but not present CPUs. Reported-by: Christian Borntraeger <borntraeger@de.ibm.com> Tested-by: Christian Borntraeger <borntraeger@de.ibm.com> Tested-by: Stefan Haberland <sth@linux.vnet.ibm.com> Fixes: 4b855ad37194 ("blk-mq: Create hctx for each present CPU") Cc: linux-kernel@vger.kernel.org Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> The above patch is why the hpsa-fix-selection-of-reply-queue patch was needed. If not, I would redact that patch because it may be causing your issues. There was another patch required for the hpsa-fix-selection-of-reply-queue patch: scsi-introduce-force-blk-mq. The errors shown in your logs indicate issues with DMA transfers of your data. Unaligned partial completion errors are usually issues with the scatter/gather buffers that represent your data buffers. I would like to eliminate using the 4.16 hpsa driver in a 4.11 kernel. Can you try our out-of-box driver? I'll attach this to the BZ. 
You compile it with make -f Makefile.alt The name is hpsa-3.4.20-136.tar.bz2 -------- commit 8b834bff1b73dce46f4e9f5e84af6f73fed8b0ef Author: Ming Lei <ming.lei@redhat.com> Date: Tue Mar 13 17:42:39 2018 +0800 scsi: hpsa: fix selection of reply queue Since commit 84676c1f21e8 ("genirq/affinity: assign vectors to all possible CPUs") we could end up with an MSI-X vector that did not have any online CPUs mapped. This would lead to I/O hangs since there was no CPU to receive the completion. Retrieve IRQ affinity information using pci_irq_get_affinity() and use this mapping to choose a reply queue. [mkp: tweaked commit desc] Cc: Hannes Reinecke <hare@suse.de> Cc: "Martin K. Petersen" <martin.petersen@oracle.com>, Cc: James Bottomley <james.bottomley@hansenpartnership.com>, Cc: Christoph Hellwig <hch@lst.de>, Cc: Don Brace <don.brace@microsemi.com> Cc: Kashyap Desai <kashyap.desai@broadcom.com> Cc: Laurence Oberman <loberman@redhat.com> Cc: Meelis Roos <mroos@linux.ee> Cc: Artem Bityutskiy <artem.bityutskiy@intel.com> Cc: Mike Snitzer <snitzer@redhat.com> Fixes: 84676c1f21e8 ("genirq/affinity: assign vectors to all possible CPUs") Signed-off-by: Ming Lei <ming.lei@redhat.com> Tested-by: Laurence Oberman <loberman@redhat.com> Tested-by: Don Brace <don.brace@microsemi.com> Tested-by: Artem Bityutskiy <artem.bityutskiy@intel.com> Acked-by: Don Brace <don.brace@microsemi.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> I believe this patch is also required. commit cf2a0ce8d1c25c8cc4509874d270be8fc6026cc3 Author: Ming Lei <ming.lei@redhat.com> Date: Tue Mar 13 17:42:41 2018 +0800 scsi: introduce force_blk_mq From scsi driver view, it is a bit troublesome to support both blk-mq and non-blk-mq at the same time, especially when drivers need to support multi hw-queue. This patch introduces 'force_blk_mq' to scsi_host_template so that drivers can provide blk-mq only support, so driver code can avoid the trouble for supporting both. Cc: Omar Sandoval <osandov@fb.com>, Cc: "Martin K. Petersen" <martin.petersen@oracle.com>, Cc: James Bottomley <james.bottomley@hansenpartnership.com>, Cc: Christoph Hellwig <hch@lst.de>, Cc: Don Brace <don.brace@microsemi.com> Cc: Kashyap Desai <kashyap.desai@broadcom.com> Cc: Mike Snitzer <snitzer@redhat.com> Cc: Laurence Oberman <loberman@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Ming Lei <ming.lei@redhat.com> Created attachment 275473 [details]
Latest out of box hpsa driver.
This tar file contains our latest out-of-box driver.
1. tar xf hpsa-3.4.20-136.tar.bz2
2. cd hpsa-3.4.20/drivers/scsi
3. make -f Makefile.alt
If you are booted from hpsa, you will need to update your initrd and reboot.
If you are using hpsa for non-boot drives, you can
1. rmmod hpsa
2. insmod ./hpsa.ko
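For reference, the whole sequence on an Ubuntu host such as this one might look like the sketch below. It is only a sketch: the module install path and the update-initramfs step are assumptions that apply when the boot drive sits behind hpsa, and paths can differ by distribution.
# minimal sketch, assuming an Ubuntu host and the attached hpsa-3.4.20-136 tarball
tar xf hpsa-3.4.20-136.tar.bz2
cd hpsa-3.4.20/drivers/scsi
make -f Makefile.alt                      # step 3 from above; builds hpsa.ko
# non-boot drives: swap the module in place
rmmod hpsa && insmod ./hpsa.ko
# boot drive behind hpsa (assumption): install the module, refresh the initramfs, then reboot
cp hpsa.ko /lib/modules/$(uname -r)/kernel/drivers/scsi/ && depmod -a && update-initramfs -u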
The only patch that I'm sure that I have is the "scsi: hpsa: fix selection of reply queue" one. For the I'm using an out of the box 4.11 kernel. So I'm really not sure that the other patches are present. Unfortunately, the module does not compile using 4.11.0-14-generic headers. # make -C /lib/modules/4.11.0-14-generic/build M=$(pwd) --makefile="/root/hpsa-3.4.20-136/hpsa-3.4.20/drivers/scsi/Makefile.alt" make: Entering directory '/usr/src/linux-headers-4.11.0-14-generic' make -C /lib/modules/4.4.0-96-generic/build M=/usr/src/linux-headers-4.11.0-14-generic EXTRA_CFLAGS+=-DKCLASS4A modules make[1]: Entering directory '/usr/src/linux-headers-4.4.0-96-generic' make[2]: *** No rule to make target 'kernel/bounds.c', needed by 'kernel/bounds.s'. Stop. Makefile:1423: recipe for target '_module_/usr/src/linux-headers-4.11.0-14-generic' failed make[1]: *** [_module_/usr/src/linux-headers-4.11.0-14-generic] Error 2 make[1]: Leaving directory '/usr/src/linux-headers-4.4.0-96-generic' /root/hpsa-3.4.20-136/hpsa-3.4.20/drivers/scsi/Makefile.alt:96: recipe for target 'default' failed make: *** [default] Error 2 make: Leaving directory '/usr/src/linux-headers-4.11.0-14-generic' But if you tell me the principal problem is using the 4.11 kernel, I can upgrade it to use the 4.16.3 kernel. If I use it, must I use the out of box 3.4.20-136 hpsa driver or use your precedent patch on the last 3.4.20-125? We had a bunch of issues with the HPSA as already mentioned above. The specific issue that we had to revert was this commit 8b834bff1b73dce46f4e9f5e84af6f73fed8b0ef I assume your array has a charged battery (capacitor) and the writeback-cache is enabled on the 420i Are you only seeing this wen you have cmaeventd running, because hat can use pass through commands and has been known to cause issues. I am not running any of the HPE Proliant SPP daemons on my system. I have not seen this load related issue (without those daemons running) that you are seeing on my DL380G7 or Dl380G8 here so I will work on trying to reproduce and assist. Thanks Laurence Apr 18 01:29:16 kernel: cmaidad D 0 3442 1 0x00000000 Apr 18 01:29:16 kernel: Call Trace: Apr 18 01:29:16 kernel: __schedule+0x3b9/0x8f0 Apr 18 01:29:16 kernel: schedule+0x36/0x80 Apr 18 01:29:16 kernel: scsi_block_when_processing_errors+0xd5/0x110 Apr 18 01:29:16 kernel: ? wake_atomic_t_function+0x60/0x60 Apr 18 01:29:16 kernel: sg_open+0x14a/0x5c0 ***** Likely a pass though from the cma* management daemons Can you try reproduce with all the HP Health daemons disabled Indeed I have a charged battery (capacitor) and the writeback-cache enabled. I run the hp-health component too, I have already try to disable it on the 4.11 kernel and have reproduced the problem of load without it. The cma related call trace up after the logical drive reset is called. Right now, I test on a server the kernel 4.16.3-041603-generic with the hpsa module with the patch to use local work-queue insead of system work-queue. Right now I didn't reproduce the problem. I had a disk with bad blocks (before launching a read-only test badblocks returned a lot of block error) but since I have upgraded the kernel with the patch hpsa module I have no more error. I'm still trying to reproduce the problem by launching a badblocks read-only test on the "ex-faulty" disk. I'll tell you the result of the test. (In reply to Anthony Hausman from comment #11) > The only patch that I'm sure that I have is the "scsi: hpsa: fix selection > of reply queue" one. > For the I'm using an out of the box 4.11 kernel. 
So I'm really not sure that > the other patches are present. > > > Unfortunately, the module does not compile using 4.11.0-14-generic headers. > > # make -C /lib/modules/4.11.0-14-generic/build M=$(pwd) > --makefile="/root/hpsa-3.4.20-136/hpsa-3.4.20/drivers/scsi/Makefile.alt" > make: Entering directory '/usr/src/linux-headers-4.11.0-14-generic' > make -C /lib/modules/4.4.0-96-generic/build > M=/usr/src/linux-headers-4.11.0-14-generic EXTRA_CFLAGS+=-DKCLASS4A modules > make[1]: Entering directory '/usr/src/linux-headers-4.4.0-96-generic' > make[2]: *** No rule to make target 'kernel/bounds.c', needed by > 'kernel/bounds.s'. Stop. > Makefile:1423: recipe for target > '_module_/usr/src/linux-headers-4.11.0-14-generic' failed > make[1]: *** [_module_/usr/src/linux-headers-4.11.0-14-generic] Error 2 > make[1]: Leaving directory '/usr/src/linux-headers-4.4.0-96-generic' > /root/hpsa-3.4.20-136/hpsa-3.4.20/drivers/scsi/Makefile.alt:96: recipe for > target 'default' failed > make: *** [default] Error 2 > make: Leaving directory '/usr/src/linux-headers-4.11.0-14-generic' > > But if you tell me the principal problem is using the 4.11 kernel, I can > upgrade it to use the 4.16.3 kernel. > > If I use it, must I use the out of box 3.4.20-136 hpsa driver or use your > precedent patch on the last 3.4.20-125? (In reply to Anthony Hausman from comment #11) > The only patch that I'm sure that I have is the "scsi: hpsa: fix selection > of reply queue" one. > For the I'm using an out of the box 4.11 kernel. So I'm really not sure that > the other patches are present. > > > Unfortunately, the module does not compile using 4.11.0-14-generic headers. > > # make -C /lib/modules/4.11.0-14-generic/build M=$(pwd) > --makefile="/root/hpsa-3.4.20-136/hpsa-3.4.20/drivers/scsi/Makefile.alt" > make: Entering directory '/usr/src/linux-headers-4.11.0-14-generic' > make -C /lib/modules/4.4.0-96-generic/build > M=/usr/src/linux-headers-4.11.0-14-generic EXTRA_CFLAGS+=-DKCLASS4A modules > make[1]: Entering directory '/usr/src/linux-headers-4.4.0-96-generic' > make[2]: *** No rule to make target 'kernel/bounds.c', needed by > 'kernel/bounds.s'. Stop. > Makefile:1423: recipe for target > '_module_/usr/src/linux-headers-4.11.0-14-generic' failed > make[1]: *** [_module_/usr/src/linux-headers-4.11.0-14-generic] Error 2 > make[1]: Leaving directory '/usr/src/linux-headers-4.4.0-96-generic' > /root/hpsa-3.4.20-136/hpsa-3.4.20/drivers/scsi/Makefile.alt:96: recipe for > target 'default' failed > make: *** [default] Error 2 > make: Leaving directory '/usr/src/linux-headers-4.11.0-14-generic' > > But if you tell me the principal problem is using the 4.11 kernel, I can > upgrade it to use the 4.16.3 kernel. > > If I use it, must I use the out of box 3.4.20-136 hpsa driver or use your > precedent patch on the last 3.4.20-125? The 4.16.3 driver should be OK to use. You could not untar the sources I gave you in /tmp and build with make -f Makefile.alt? If you copy the source code into the kernel tree, you should be able to do make modules SUBDIRS=drivers/scsi hpsa.ko Don, So I'm actually running the kernel 4.16.3 (build 18-04-19) with the hpsa modules patch to use local work-queue insead of system work-queue. I have a reproduce a reset with no stack trace (which is a good news). 
The only thing is that between the "resetting logical" message and the completion, 2 hours passed, and this caused a heavy load on the server during that time: Apr 25 01:31:09 kernel: hpsa 0000:08:00.0: scsi 0:1:0:0: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 Apr 25 03:31:00 kernel: hpsa 0000:08:00.0: device is ready. Apr 25 03:31:00 kernel: hpsa 0000:08:00.0: scsi 0:1:0:0: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 The good thing is that after the reset completed, the device was removed: Apr 25 03:31:45 kernel: hpsa 0000:08:00.0: scsi 0:1:0:0: removed Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 Apr 25 03:31:48 kernel: scsi 0:1:0:0: rejecting I/O to dead device So the question is whether it is normal for the logical reset to take such a long time (and to cause trouble on the server). (In reply to Anthony Hausman from comment #16) > Don, > > So I'm actually running the kernel 4.16.3 (build 18-04-19) with the hpsa > modules patch to use local work-queue insead of system work-queue. > > I have a reproduce a reset with no stack trace (which is a good news). > The only thing is between the resetting logical and the completation, 2 > hours passed and caused an heavy load on the server during this time: > > Apr 25 01:31:09 kernel: hpsa 0000:08:00.0: scsi 0:1:0:0: resetting logical > Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 > Apr 25 03:31:00 kernel: hpsa 0000:08:00.0: device is ready. > Apr 25 03:31:00 kernel: hpsa 0000:08:00.0: scsi 0:1:0:0: reset logical > completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 > SSDSmartPathCap- En- Exp=1 > > The good thing after the reset has completed, this one is removed: > > Apr 25 03:31:45 kernel: hpsa 0000:08:00.0: scsi 0:1:0:0: removed > Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 The driver was notified by the P420i that the volume went offline, so the driver removed it from SML. > Apr 25 03:31:48 kernel: scsi 0:1:0:0: rejecting I/O to dead device There were I/O requests for the device, but the SML detected that it was deleted. > > So the question is if it's normal than the reset logical take such a long > time (and causing trouble on the server)? It is not normal. For a Logical Volume reset, the P420i flushes out any outstanding I/O requests then returns. The SML should block any new requests from coming down while the reset is in progress. Do you know what process was consuming the CPU cycles? ps -deo psr,pid,cls,cmd:50,pmem,size,vsz,nice,psr,pcpu,wchan:30,comm:30 | sort -nk1 | head -20 Are you using sg_reset to test LV resets? Or, does the device have some intermittent issues which is causing the SML to issue the reset operation? If you turn off the agents, do the resets complete more quickly? I am wondering if the agents are frequently probing the P420i for changes when the reset is active and the agents are consuming the CPU cycles. Unfortunately I don't know what process was consuming the CPU cycles at that point. I'll try to reproduce the problem to get that information. I'm not using sg_reset to test the LV reset; I am actually launching a badblocks command on a problematic disk, and the reset is invoked when it begins to fail. I'll use sg_reset to reproduce the problem and test with/without the agent. I invoke the agent every 5 minutes to check the controller and disk states. I'll keep you informed on my tests. By the way, thank you for your help.
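For reference, a logical volume reset can also be exercised by hand with sg_reset from sg3_utils instead of waiting for a medium error to trigger one. A minimal sketch, assuming sg3_utils is installed and using /dev/sdp only as a stand-in for the volume under test:
sg_reset -d /dev/sdp      # send a SCSI device (LUN) reset to the logical volume
dmesg | tail -n 20        # check for the matching "resetting logical ... reset logical completed" pair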
I was concerned about the agents but Anthony disabled them and still saw this.I have seen this timeout sometimes when the agents probe via passthrough. I did just bump into this reset on a 7.5 RHEL kernel with no agents but it recovered almost immediately. I need to chase that down So here are all my test. With the agent enable, using hp check command disk (hpacucli/ssacli and hpssacli) and launching a sg_reset, the reset has no problem on the problematic disk: Apr 26 14:31:20 kernel: hpsa 0000:08:00.0: scsi 0:1:0:0: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 Apr 26 14:31:21 kernel: hpsa 0000:08:00.0: device is ready. Apr 26 14:31:21 kernel: hpsa 0000:08:00.0: scsi 0:1:0:0: reset logical completed successfully Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 The reset only took 1 second. The "bug" seems to appear only when the disk returns errors concerning Unrecovered read error (when using badblocks read-only test by example). I try to reproduce it. I have reproduced the problem. Here the condition that I have done: Kernel: 4.16.3-041603-generic hpsa: 3.4.20-125 with patch to use local work-queue instead of system work-queue. I needed to execute a badblocks in a read-only test on a disk who has failed before: ~# while :; do badblocks -v -b 4096 -s /dev/sdt; done And several days after, the bug raised. You'll find a graph of the load in an attachment. Before the reset, I have a hpsa_update_device_info: inquiry failed and a stack trace on badblocks (this one seems to be logical) Load: 850 [Tue May 1 06:27:37 2018] hpsa 0000:08:00.0: aborted: LUN:000000c000003901 CDB:12000000310000000000000000000000 [Tue May 1 06:27:37 2018] hpsa 0000:08:00.0: hpsa_update_device_info: inquiry failed, device will be skipped. [Tue May 1 06:27:37 2018] hpsa 0000:08:00.0: scsi 0:0:50:0: removed Direct-Access ATA MB4000GCWDC PHYS DRV SSDSmartPathCap- En- Exp=0 [Tue May 1 06:28:24 2018] hpsa 0000:08:00.0: aborted: LUN:000000c000003901 CDB:12000000310000000000000000000000 [Tue May 1 06:28:24 2018] hpsa 0000:08:00.0: hpsa_update_device_info: inquiry failed, device will be skipped. [Tue May 1 06:29:51 2018] INFO: task badblocks:46824 blocked for more than 120 seconds. [Tue May 1 06:29:51 2018] Tainted: G OE 4.16.3-041603-generic #201804190730 [Tue May 1 06:29:51 2018] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [Tue May 1 06:29:51 2018] badblocks D 0 46824 48728 0x00000004 [Tue May 1 06:29:51 2018] Call Trace: [Tue May 1 06:29:51 2018] __schedule+0x297/0x880 [Tue May 1 06:29:51 2018] ? iov_iter_get_pages+0xc0/0x2c0 [Tue May 1 06:29:51 2018] schedule+0x2c/0x80 [Tue May 1 06:29:51 2018] io_schedule+0x16/0x40 [Tue May 1 06:29:51 2018] __blkdev_direct_IO_simple+0x1ff/0x360 [Tue May 1 06:29:51 2018] ? bdget+0x120/0x120 [Tue May 1 06:29:51 2018] blkdev_direct_IO+0x3a2/0x3f0 [Tue May 1 06:29:51 2018] ? blkdev_direct_IO+0x3a2/0x3f0 [Tue May 1 06:29:51 2018] ? current_time+0x32/0x70 [Tue May 1 06:29:51 2018] ? __atime_needs_update+0x7f/0x190 [Tue May 1 06:29:51 2018] generic_file_read_iter+0xc6/0xc10 [Tue May 1 06:29:51 2018] ? __blkdev_direct_IO_simple+0x360/0x360 [Tue May 1 06:29:51 2018] ? generic_file_read_iter+0xc6/0xc10 [Tue May 1 06:29:51 2018] ? __wake_up+0x13/0x20 [Tue May 1 06:29:51 2018] ? tty_ldisc_deref+0x16/0x20 [Tue May 1 06:29:51 2018] ? 
tty_write+0x1fb/0x320 [Tue May 1 06:29:51 2018] blkdev_read_iter+0x35/0x40 [Tue May 1 06:29:51 2018] __vfs_read+0xfb/0x170 [Tue May 1 06:29:51 2018] vfs_read+0x8e/0x130 [Tue May 1 06:29:51 2018] SyS_read+0x55/0xc0 [Tue May 1 06:29:51 2018] do_syscall_64+0x73/0x130 [Tue May 1 06:29:51 2018] entry_SYSCALL_64_after_hwframe+0x3d/0xa2 [Tue May 1 06:29:51 2018] RIP: 0033:0x7fe31b97c330 [Tue May 1 06:29:51 2018] RSP: 002b:00007fffcea10258 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 [Tue May 1 06:29:51 2018] RAX: ffffffffffffffda RBX: 0000026e19800000 RCX: 00007fe31b97c330 [Tue May 1 06:29:51 2018] RDX: 0000000000040000 RSI: 00007fe31c26e000 RDI: 0000000000000003 [Tue May 1 06:29:51 2018] RBP: 0000000000001000 R08: 0000000026e19800 R09: 00007fffcea10008 [Tue May 1 06:29:51 2018] R10: 00007fffcea10020 R11: 0000000000000246 R12: 0000000000000003 [Tue May 1 06:29:51 2018] R13: 00007fe31c26e000 R14: 0000000000000040 R15: 0000000000040000 [Tue May 1 06:31:52 2018] INFO: task badblocks:46824 blocked for more than 120 seconds. [Tue May 1 06:31:52 2018] Tainted: G OE 4.16.3-041603-generic #201804190730 [Tue May 1 06:31:52 2018] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [Tue May 1 06:31:52 2018] badblocks D 0 46824 48728 0x00000004 [Tue May 1 06:31:52 2018] Call Trace: [Tue May 1 06:31:52 2018] __schedule+0x297/0x880 [Tue May 1 06:31:52 2018] ? iov_iter_get_pages+0xc0/0x2c0 [Tue May 1 06:31:52 2018] schedule+0x2c/0x80 [Tue May 1 06:31:52 2018] io_schedule+0x16/0x40 [Tue May 1 06:31:52 2018] __blkdev_direct_IO_simple+0x1ff/0x360 [Tue May 1 06:31:52 2018] ? bdget+0x120/0x120 [Tue May 1 06:31:52 2018] blkdev_direct_IO+0x3a2/0x3f0 [Tue May 1 06:31:52 2018] ? blkdev_direct_IO+0x3a2/0x3f0 [Tue May 1 06:31:52 2018] ? current_time+0x32/0x70 [Tue May 1 06:31:52 2018] ? __atime_needs_update+0x7f/0x190 [Tue May 1 06:31:52 2018] generic_file_read_iter+0xc6/0xc10 [Tue May 1 06:31:52 2018] ? __blkdev_direct_IO_simple+0x360/0x360 [Tue May 1 06:31:52 2018] ? generic_file_read_iter+0xc6/0xc10 [Tue May 1 06:31:52 2018] ? __wake_up+0x13/0x20 [Tue May 1 06:31:52 2018] ? tty_ldisc_deref+0x16/0x20 [Tue May 1 06:31:52 2018] ? 
tty_write+0x1fb/0x320 [Tue May 1 06:31:52 2018] blkdev_read_iter+0x35/0x40 [Tue May 1 06:31:52 2018] __vfs_read+0xfb/0x170 [Tue May 1 06:31:52 2018] vfs_read+0x8e/0x130 [Tue May 1 06:31:52 2018] SyS_read+0x55/0xc0 [Tue May 1 06:31:52 2018] do_syscall_64+0x73/0x130 [Tue May 1 06:31:52 2018] entry_SYSCALL_64_after_hwframe+0x3d/0xa2 [Tue May 1 06:31:52 2018] RIP: 0033:0x7fe31b97c330 [Tue May 1 06:31:52 2018] RSP: 002b:00007fffcea10258 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 [Tue May 1 06:31:52 2018] RAX: ffffffffffffffda RBX: 0000026e19800000 RCX: 00007fe31b97c330 [Tue May 1 06:31:52 2018] RDX: 0000000000040000 RSI: 00007fe31c26e000 RDI: 0000000000000003 [Tue May 1 06:31:52 2018] RBP: 0000000000001000 R08: 0000000026e19800 R09: 00007fffcea10008 [Tue May 1 06:31:52 2018] R10: 00007fffcea10020 R11: 0000000000000246 R12: 0000000000000003 [Tue May 1 06:31:52 2018] R13: 00007fe31c26e000 R14: 0000000000000040 R15: 0000000000040000 [Tue May 1 06:32:55 2018] hpsa 0000:08:00.0: scsi 0:1:0:19: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- E n- Exp=1 I have done a ps like you said before this time, every 30 seconds: ps -deo psr,pid,cls,cmd:50,pmem,size,vsz,nice,psr,pcpu,wchan:30,comm:30 | sort -nk1 | head -20 0 1 TS /sbin/init 0.0 3680 101792 0 0 0.0 poll_schedule_timeout init 0 3 TS [kworker/0:0] 0.0 0 0 0 0 0.0 worker_thread kworker/0:0 0 4 TS [kworker/0:0H] 0.0 0 0 -20 0 0.0 worker_thread kworker/0:0H 0 7 TS [mm_percpu_wq] 0.0 0 0 -20 0 0.0 rescuer_thread mm_percpu_wq 0 8 TS [ksoftirqd/0] 0.0 0 0 0 0 0.0 smpboot_thread_fn ksoftirqd/0 0 9 TS [rcu_sched] 0.0 0 0 0 0 0.0 rcu_gp_kthread rcu_sched 0 10 TS [rcu_bh] 0.0 0 0 0 0 0.0 rcu_gp_kthread rcu_bh 0 11 FF [migration/0] 0.0 0 0 - 0 0.0 smpboot_thread_fn migration/0 0 12 FF [watchdog/0] 0.0 0 0 - 0 0.0 smpboot_thread_fn watchdog/0 0 13 TS [cpuhp/0] 0.0 0 0 0 0 0.0 smpboot_thread_fn cpuhp/0 0 71 TS [kblockd] 0.0 0 0 -20 0 0.0 rescuer_thread kblockd 0 76 FF [watchdogd] 0.0 0 0 - 0 0.0 kthread_worker_fn watchdogd 0 128 TS [nvme-delete-wq] 0.0 0 0 -20 0 0.0 rescuer_thread nvme-delete-wq 0 245 TS [kworker/0:2] 0.0 0 0 0 0 0.0 worker_thread kworker/0:2 0 271 TS [raid5wq] 0.0 0 0 -20 0 0.0 rescuer_thread raid5wq 0 477 TS [kworker/0:1H] 0.0 0 0 -20 0 0.0 worker_thread kworker/0:1H 0 1462 TS lldpd: monitor 0.0 2160 48672 0 0 0.0 skb_wait_for_more_packets lldpd 0 2034 TS /usr/sbin/syslog-ng --process-mode=background -f / 0.0 56936 637344 0 0 1.3 ep_poll syslog-ng 0 2080 TS logger -p daemon.info -t docker_daemon_events 0.0 328 4360 0 0 0.0 pipe_wait logger 0 2248 TS /sbin/getty -8 38400 tty6 0.0 356 15836 0 0 0.0 wait_woken getty ps -deo psr,pid,cls,cmd:50,pmem,size,vsz,nice,psr,pcpu,wchan:30,comm:30 | sort -nk1 | head -20 0 3 TS [kworker/0:0] 0.0 0 0 0 0 0.0 worker_thread kworker/0:0 0 4 TS [kworker/0:0H] 0.0 0 0 -20 0 0.0 worker_thread kworker/0:0H 0 7 TS [mm_percpu_wq] 0.0 0 0 -20 0 0.0 rescuer_thread mm_percpu_wq 0 8 TS [ksoftirqd/0] 0.0 0 0 0 0 0.0 smpboot_thread_fn ksoftirqd/0 0 10 TS [rcu_bh] 0.0 0 0 0 0 0.0 rcu_gp_kthread rcu_bh 0 11 FF [migration/0] 0.0 0 0 - 0 0.0 smpboot_thread_fn migration/0 0 12 FF [watchdog/0] 0.0 0 0 - 0 0.0 smpboot_thread_fn watchdog/0 0 13 TS [cpuhp/0] 0.0 0 0 0 0 0.0 smpboot_thread_fn cpuhp/0 0 71 TS [kblockd] 0.0 0 0 -20 0 0.0 rescuer_thread kblockd 0 76 FF [watchdogd] 0.0 0 0 - 0 0.0 kthread_worker_fn watchdogd 0 128 TS [nvme-delete-wq] 0.0 0 0 -20 0 0.0 rescuer_thread nvme-delete-wq 0 245 TS [kworker/0:2] 0.0 0 0 0 0 0.0 worker_thread kworker/0:2 0 271 TS [raid5wq] 0.0 0 0 -20 0 0.0 
rescuer_thread raid5wq 0 477 TS [kworker/0:1H] 0.0 0 0 -20 0 0.0 worker_thread kworker/0:1H 0 1462 TS lldpd: monitor 0.0 2160 48672 0 0 0.0 skb_wait_for_more_packets lldpd 0 2080 TS logger -p daemon.info -t docker_daemon_events 0.0 328 4360 0 0 0.0 pipe_wait logger 0 2248 TS /sbin/getty -8 38400 tty6 0.0 356 15836 0 0 0.0 wait_woken getty 0 2333 TS cat filer-01-24-1.keys 0.0 324 4384 0 0 0.0 pipe_wait cat 0 2334 TS /usr/bin/python /usr/local/scality-walker/scality- 0.0 11208 131444 0 0 1.2 wait_woken scality-walker. 0 2740 TS /opt/datadog-agent/embedded/bin/python /opt/datado 0.0 42160 289140 0 0 0.6 poll_schedule_timeout python ps -deo psr,pid,cls,cmd:50,pmem,size,vsz,nice,psr,pcpu,wchan:30,comm:30 | sort -nk1 | head -20 0 3 TS [kworker/0:0] 0.0 0 0 0 0 0.0 worker_thread kworker/0:0 0 4 TS [kworker/0:0H] 0.0 0 0 -20 0 0.0 worker_thread kworker/0:0H 0 7 TS [mm_percpu_wq] 0.0 0 0 -20 0 0.0 rescuer_thread mm_percpu_wq 0 8 TS [ksoftirqd/0] 0.0 0 0 0 0 0.0 smpboot_thread_fn ksoftirqd/0 0 10 TS [rcu_bh] 0.0 0 0 0 0 0.0 rcu_gp_kthread rcu_bh 0 11 FF [migration/0] 0.0 0 0 - 0 0.0 smpboot_thread_fn migration/0 0 12 FF [watchdog/0] 0.0 0 0 - 0 0.0 smpboot_thread_fn watchdog/0 0 13 TS [cpuhp/0] 0.0 0 0 0 0 0.0 smpboot_thread_fn cpuhp/0 0 71 TS [kblockd] 0.0 0 0 -20 0 0.0 rescuer_thread kblockd 0 76 FF [watchdogd] 0.0 0 0 - 0 0.0 kthread_worker_fn watchdogd 0 128 TS [nvme-delete-wq] 0.0 0 0 -20 0 0.0 rescuer_thread nvme-delete-wq 0 245 TS [kworker/0:2] 0.0 0 0 0 0 0.0 worker_thread kworker/0:2 0 271 TS [raid5wq] 0.0 0 0 -20 0 0.0 rescuer_thread raid5wq 0 477 TS [kworker/0:1H] 0.0 0 0 -20 0 0.0 worker_thread kworker/0:1H 0 1462 TS lldpd: monitor 0.0 2160 48672 0 0 0.0 skb_wait_for_more_packets lldpd 0 2080 TS logger -p daemon.info -t docker_daemon_events 0.0 328 4360 0 0 0.0 pipe_wait logger 0 2248 TS /sbin/getty -8 38400 tty6 0.0 356 15836 0 0 0.0 wait_woken getty 0 2333 TS cat filer-01-24-1.keys 0.0 324 4384 0 0 0.0 pipe_wait cat 0 2334 TS /usr/bin/python /usr/local/scality-walker/scality- 0.0 11208 131444 0 0 1.2 wait_woken scality-walker. 0 2740 TS /opt/datadog-agent/embedded/bin/python /opt/datado 0.0 42160 289140 0 0 0.6 poll_schedule_timeout python 1 minute later, I had a task trace about cmaeventd (logical) and jbd2 tasks: Load: 2000 [Tue May 1 06:33:53 2018] INFO: task cmaeventd:3405 blocked for more than 120 seconds. [Tue May 1 06:33:53 2018] Tainted: G OE 4.16.3-041603-generic #201804190730 [Tue May 1 06:33:53 2018] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [Tue May 1 06:33:53 2018] cmaeventd D 0 3405 1 0x00000000 [Tue May 1 06:33:53 2018] Call Trace: [Tue May 1 06:33:53 2018] __schedule+0x297/0x880 [Tue May 1 06:33:53 2018] schedule+0x2c/0x80 [Tue May 1 06:33:53 2018] scsi_block_when_processing_errors+0xd4/0x110 [Tue May 1 06:33:53 2018] ? wait_woken+0x80/0x80 [Tue May 1 06:33:53 2018] sg_open+0x14c/0x5d0 [Tue May 1 06:33:53 2018] chrdev_open+0xc4/0x1b0 [Tue May 1 06:33:53 2018] do_dentry_open+0x1c2/0x310 [Tue May 1 06:33:53 2018] ? cdev_put.part.3+0x20/0x20 [Tue May 1 06:33:53 2018] vfs_open+0x4f/0x80 [Tue May 1 06:33:53 2018] path_openat+0x66e/0x1770 [Tue May 1 06:33:53 2018] ? unlazy_walk+0x3b/0xb0 [Tue May 1 06:33:53 2018] ? terminate_walk+0x8e/0xf0 [Tue May 1 06:33:53 2018] do_filp_open+0x9b/0x110 [Tue May 1 06:33:53 2018] ? __check_object_size+0xac/0x1a0 [Tue May 1 06:33:53 2018] ? __check_object_size+0xac/0x1a0 [Tue May 1 06:33:53 2018] ? __alloc_fd+0x46/0x170 [Tue May 1 06:33:53 2018] do_sys_open+0x1ba/0x250 [Tue May 1 06:33:53 2018] ? 
do_sys_open+0x1ba/0x250 [Tue May 1 06:33:53 2018] ? SyS_access+0x13d/0x230 [Tue May 1 06:33:53 2018] SyS_open+0x1e/0x20 [Tue May 1 06:33:53 2018] do_syscall_64+0x73/0x130 [Tue May 1 06:33:53 2018] entry_SYSCALL_64_after_hwframe+0x3d/0xa2 [Tue May 1 06:33:53 2018] RIP: 0033:0x7fdfbc6b0be0 [Tue May 1 06:33:53 2018] RSP: 002b:00007ffe1f418728 EFLAGS: 00000246 ORIG_RAX: 0000000000000002 [Tue May 1 06:33:53 2018] RAX: ffffffffffffffda RBX: 00000000018b8640 RCX: 00007fdfbc6b0be0 [Tue May 1 06:33:53 2018] RDX: 0000000000000008 RSI: 0000000000000002 RDI: 00007ffe1f418760 [Tue May 1 06:33:53 2018] RBP: 00007ffe1f418760 R08: 0000000000000001 R09: 0000000000000000 [Tue May 1 06:33:53 2018] R10: 00007fdfbc699760 R11: 0000000000000246 R12: 0000000000000002 [Tue May 1 06:33:53 2018] R13: 0000000000000001 R14: 00007ffe1f418870 R15: 00007ffe1f4189a0 [Tue May 1 06:33:53 2018] INFO: task cmaidad:3507 blocked for more than 120 seconds. [Tue May 1 06:33:53 2018] Tainted: G OE 4.16.3-041603-generic #201804190730 [Tue May 1 06:33:53 2018] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [Tue May 1 06:33:53 2018] cmaidad D 0 3507 1 0x00000000 [Tue May 1 06:33:53 2018] Call Trace: [Tue May 1 06:33:53 2018] __schedule+0x297/0x880 [Tue May 1 06:33:53 2018] ? __find_get_block+0xb6/0x2f0 [Tue May 1 06:33:53 2018] schedule+0x2c/0x80 [Tue May 1 06:33:53 2018] scsi_block_when_processing_errors+0xd4/0x110 [Tue May 1 06:33:53 2018] ? wait_woken+0x80/0x80 [Tue May 1 06:33:53 2018] sg_open+0x14c/0x5d0 [Tue May 1 06:33:53 2018] chrdev_open+0xc4/0x1b0 [Tue May 1 06:33:53 2018] do_dentry_open+0x1c2/0x310 [Tue May 1 06:33:53 2018] ? cdev_put.part.3+0x20/0x20 [Tue May 1 06:33:53 2018] vfs_open+0x4f/0x80 [Tue May 1 06:33:53 2018] path_openat+0x66e/0x1770 [Tue May 1 06:33:53 2018] ? unlazy_walk+0x3b/0xb0 [Tue May 1 06:33:53 2018] ? terminate_walk+0x8e/0xf0 [Tue May 1 06:33:53 2018] do_filp_open+0x9b/0x110 [Tue May 1 06:33:53 2018] ? __check_object_size+0xac/0x1a0 [Tue May 1 06:33:53 2018] ? __check_object_size+0xac/0x1a0 [Tue May 1 06:33:53 2018] ? __alloc_fd+0x46/0x170 [Tue May 1 06:33:53 2018] do_sys_open+0x1ba/0x250 [Tue May 1 06:33:53 2018] ? do_sys_open+0x1ba/0x250 [Tue May 1 06:33:53 2018] ? SyS_access+0x13d/0x230 [Tue May 1 06:33:53 2018] SyS_open+0x1e/0x20 [Tue May 1 06:33:53 2018] do_syscall_64+0x73/0x130 [Tue May 1 06:33:53 2018] entry_SYSCALL_64_after_hwframe+0x3d/0xa2 [Tue May 1 06:33:53 2018] RIP: 0033:0x7f5dec322be0 [Tue May 1 06:33:53 2018] RSP: 002b:00007ffee82dccc8 EFLAGS: 00000246 ORIG_RAX: 0000000000000002 [Tue May 1 06:33:53 2018] RAX: ffffffffffffffda RBX: 00000000021c4c60 RCX: 00007f5dec322be0 [Tue May 1 06:33:53 2018] RDX: 0000000000000008 RSI: 0000000000000002 RDI: 00007ffee82dcd00 [Tue May 1 06:33:53 2018] RBP: 00007ffee82dcd00 R08: 0000000000000001 R09: 0000000000000003 [Tue May 1 06:33:53 2018] R10: 00007f5dec30b760 R11: 0000000000000246 R12: 0000000000000002 [Tue May 1 06:33:53 2018] R13: 0000000000000001 R14: 00007ffee82dce10 R15: 00007ffee82dcf40 [Tue May 1 06:33:53 2018] INFO: task jbd2/sdas-8:9924 blocked for more than 120 seconds. [Tue May 1 06:33:53 2018] Tainted: G OE 4.16.3-041603-generic #201804190730 [Tue May 1 06:33:53 2018] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [Tue May 1 06:33:53 2018] jbd2/sdas-8 D 0 9924 2 0x80000000 [Tue May 1 06:33:53 2018] Call Trace: [Tue May 1 06:33:53 2018] __schedule+0x297/0x880 [Tue May 1 06:33:53 2018] ? 
wait_woken+0x80/0x80 [Tue May 1 06:33:53 2018] schedule+0x2c/0x80 [Tue May 1 06:33:53 2018] jbd2_journal_commit_transaction+0x244/0x1740 [Tue May 1 06:33:53 2018] ? update_curr+0xf5/0x1d0 [Tue May 1 06:33:53 2018] ? wait_woken+0x80/0x80 [Tue May 1 06:33:53 2018] ? lock_timer_base+0x6b/0x90 [Tue May 1 06:33:53 2018] kjournald2+0xc8/0x270 [Tue May 1 06:33:53 2018] ? kjournald2+0xc8/0x270 [Tue May 1 06:33:53 2018] ? wait_woken+0x80/0x80 [Tue May 1 06:33:53 2018] kthread+0x121/0x140 [Tue May 1 06:33:53 2018] ? commit_timeout+0x20/0x20 [Tue May 1 06:33:53 2018] ? kthread_create_worker_on_cpu+0x70/0x70 [Tue May 1 06:33:53 2018] ret_from_fork+0x35/0x40 [Tue May 1 06:33:53 2018] INFO: task jbd2/sdan-8:9955 blocked for more than 120 seconds. [Tue May 1 06:33:53 2018] Tainted: G OE 4.16.3-041603-generic #201804190730 [Tue May 1 06:33:53 2018] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [Tue May 1 06:33:53 2018] jbd2/sdan-8 D 0 9955 2 0x80000000 [Tue May 1 06:33:53 2018] Call Trace: [Tue May 1 06:33:53 2018] __schedule+0x297/0x880 [Tue May 1 06:33:53 2018] ? wait_woken+0x80/0x80 [Tue May 1 06:33:53 2018] schedule+0x2c/0x80 [Tue May 1 06:33:53 2018] jbd2_journal_commit_transaction+0x244/0x1740 [Tue May 1 06:33:53 2018] ? update_curr+0xf5/0x1d0 [Tue May 1 06:33:53 2018] ? wait_woken+0x80/0x80 [Tue May 1 06:33:53 2018] ? lock_timer_base+0x6b/0x90 [Tue May 1 06:33:53 2018] kjournald2+0xc8/0x270 [Tue May 1 06:33:53 2018] ? kjournald2+0xc8/0x270 [Tue May 1 06:33:53 2018] ? wait_woken+0x80/0x80 [Tue May 1 06:33:53 2018] kthread+0x121/0x140 [Tue May 1 06:33:53 2018] ? commit_timeout+0x20/0x20 [Tue May 1 06:33:53 2018] ? kthread_create_worker_on_cpu+0x70/0x70 [Tue May 1 06:33:53 2018] ? do_syscall_64+0x73/0x130 [Tue May 1 06:33:53 2018] ? SyS_exit_group+0x14/0x20 [Tue May 1 06:33:53 2018] ret_from_fork+0x35/0x40 [Tue May 1 06:33:53 2018] INFO: task jbd2/sdaq-8:9965 blocked for more than 120 seconds. [Tue May 1 06:33:53 2018] Tainted: G OE 4.16.3-041603-generic #201804190730 [Tue May 1 06:33:53 2018] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [Tue May 1 06:33:53 2018] jbd2/sdaq-8 D 0 9965 2 0x80000000 [Tue May 1 06:33:53 2018] Call Trace: [Tue May 1 06:33:53 2018] __schedule+0x297/0x880 [Tue May 1 06:33:53 2018] ? wait_woken+0x80/0x80 [Tue May 1 06:33:53 2018] schedule+0x2c/0x80 [Tue May 1 06:33:53 2018] jbd2_journal_commit_transaction+0x244/0x1740 [Tue May 1 06:33:53 2018] ? update_curr+0xf5/0x1d0 [Tue May 1 06:33:53 2018] ? wait_woken+0x80/0x80 [Tue May 1 06:33:53 2018] ? lock_timer_base+0x6b/0x90 [Tue May 1 06:33:53 2018] kjournald2+0xc8/0x270 [Tue May 1 06:33:53 2018] ? kjournald2+0xc8/0x270 [Tue May 1 06:33:53 2018] ? wait_woken+0x80/0x80 [Tue May 1 06:33:53 2018] kthread+0x121/0x140 [Tue May 1 06:33:53 2018] ? commit_timeout+0x20/0x20 [Tue May 1 06:33:53 2018] ? kthread_create_worker_on_cpu+0x70/0x70 [Tue May 1 06:33:53 2018] ret_from_fork+0x35/0x40 [Tue May 1 06:33:53 2018] INFO: task jbd2/sdaj-8:10082 blocked for more than 120 seconds. [Tue May 1 06:33:53 2018] Tainted: G OE 4.16.3-041603-generic #201804190730 [Tue May 1 06:33:53 2018] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [Tue May 1 06:33:53 2018] jbd2/sdaj-8 D 0 10082 2 0x80000000 [Tue May 1 06:33:53 2018] Call Trace: [Tue May 1 06:33:53 2018] __schedule+0x297/0x880 [Tue May 1 06:33:53 2018] ? wait_woken+0x80/0x80 [Tue May 1 06:33:53 2018] schedule+0x2c/0x80 [Tue May 1 06:33:53 2018] jbd2_journal_commit_transaction+0x244/0x1740 [Tue May 1 06:33:53 2018] ? 
update_curr+0xf5/0x1d0 [Tue May 1 06:33:53 2018] ? wait_woken+0x80/0x80 [Tue May 1 06:33:53 2018] ? lock_timer_base+0x6b/0x90 [Tue May 1 06:33:53 2018] kjournald2+0xc8/0x270 [Tue May 1 06:33:53 2018] ? kjournald2+0xc8/0x270 [Tue May 1 06:33:53 2018] ? wait_woken+0x80/0x80 [Tue May 1 06:33:53 2018] kthread+0x121/0x140 [Tue May 1 06:33:53 2018] ? commit_timeout+0x20/0x20 [Tue May 1 06:33:53 2018] ? kthread_create_worker_on_cpu+0x70/0x70 [Tue May 1 06:33:53 2018] ? do_syscall_64+0x73/0x130 [Tue May 1 06:33:53 2018] ? SyS_exit_group+0x14/0x20 [Tue May 1 06:33:53 2018] ret_from_fork+0x35/0x40 [Tue May 1 06:33:53 2018] INFO: task jbd2/sdao-8:10109 blocked for more than 120 seconds. [Tue May 1 06:33:53 2018] Tainted: G OE 4.16.3-041603-generic #201804190730 [Tue May 1 06:33:53 2018] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [Tue May 1 06:33:53 2018] jbd2/sdao-8 D 0 10109 2 0x80000000 [Tue May 1 06:33:53 2018] Call Trace: [Tue May 1 06:33:53 2018] __schedule+0x297/0x880 [Tue May 1 06:33:53 2018] ? bit_wait+0x60/0x60 [Tue May 1 06:33:53 2018] schedule+0x2c/0x80 [Tue May 1 06:33:53 2018] io_schedule+0x16/0x40 [Tue May 1 06:33:53 2018] bit_wait_io+0x11/0x60 [Tue May 1 06:33:53 2018] __wait_on_bit+0x4c/0x90 [Tue May 1 06:33:53 2018] out_of_line_wait_on_bit+0x90/0xb0 [Tue May 1 06:33:53 2018] ? bit_waitqueue+0x40/0x40 [Tue May 1 06:33:53 2018] __wait_on_buffer+0x32/0x40 [Tue May 1 06:33:53 2018] jbd2_journal_commit_transaction+0xf59/0x1740 [Tue May 1 06:33:53 2018] kjournald2+0xc8/0x270 [Tue May 1 06:33:53 2018] ? kjournald2+0xc8/0x270 [Tue May 1 06:33:53 2018] ? wait_woken+0x80/0x80 [Tue May 1 06:33:53 2018] kthread+0x121/0x140 [Tue May 1 06:33:53 2018] ? commit_timeout+0x20/0x20 [Tue May 1 06:33:53 2018] ? kthread_create_worker_on_cpu+0x70/0x70 [Tue May 1 06:33:53 2018] ret_from_fork+0x35/0x40 [Tue May 1 06:33:53 2018] INFO: task jbd2/sdag-8:10135 blocked for more than 120 seconds. [Tue May 1 06:33:53 2018] Tainted: G OE 4.16.3-041603-generic #201804190730 [Tue May 1 06:33:53 2018] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [Tue May 1 06:33:53 2018] jbd2/sdag-8 D 0 10135 2 0x80000000 [Tue May 1 06:33:53 2018] Call Trace: [Tue May 1 06:33:53 2018] __schedule+0x297/0x880 [Tue May 1 06:33:53 2018] ? wait_woken+0x80/0x80 [Tue May 1 06:33:53 2018] schedule+0x2c/0x80 [Tue May 1 06:33:53 2018] jbd2_journal_commit_transaction+0x244/0x1740 [Tue May 1 06:33:53 2018] ? update_curr+0xf5/0x1d0 [Tue May 1 06:33:53 2018] ? wait_woken+0x80/0x80 [Tue May 1 06:33:53 2018] ? lock_timer_base+0x6b/0x90 [Tue May 1 06:33:53 2018] kjournald2+0xc8/0x270 [Tue May 1 06:33:53 2018] ? kjournald2+0xc8/0x270 [Tue May 1 06:33:53 2018] ? wait_woken+0x80/0x80 [Tue May 1 06:33:53 2018] kthread+0x121/0x140 [Tue May 1 06:33:53 2018] ? commit_timeout+0x20/0x20 [Tue May 1 06:33:53 2018] ? 
kthread_create_worker_on_cpu+0x70/0x70 [Tue May 1 06:33:53 2018] ret_from_fork+0x35/0x40 Some other ps after that message: ps -deo psr,pid,cls,cmd:50,pmem,size,vsz,nice,psr,pcpu,wchan:30,comm:30 | sort -nk1 | head -20 0 3 TS [kworker/0:0] 0.0 0 0 0 0 0.0 worker_thread kworker/0:0 0 4 TS [kworker/0:0H] 0.0 0 0 -20 0 0.0 worker_thread kworker/0:0H 0 7 TS [mm_percpu_wq] 0.0 0 0 -20 0 0.0 rescuer_thread mm_percpu_wq 0 8 TS [ksoftirqd/0] 0.0 0 0 0 0 0.0 smpboot_thread_fn ksoftirqd/0 0 10 TS [rcu_bh] 0.0 0 0 0 0 0.0 rcu_gp_kthread rcu_bh 0 11 FF [migration/0] 0.0 0 0 - 0 0.0 smpboot_thread_fn migration/0 0 12 FF [watchdog/0] 0.0 0 0 - 0 0.0 smpboot_thread_fn watchdog/0 0 13 TS [cpuhp/0] 0.0 0 0 0 0 0.0 smpboot_thread_fn cpuhp/0 0 71 TS [kblockd] 0.0 0 0 -20 0 0.0 rescuer_thread kblockd 0 76 FF [watchdogd] 0.0 0 0 - 0 0.0 kthread_worker_fn watchdogd 0 128 TS [nvme-delete-wq] 0.0 0 0 -20 0 0.0 rescuer_thread nvme-delete-wq 0 245 TS [kworker/0:2] 0.0 0 0 0 0 0.0 worker_thread kworker/0:2 0 271 TS [raid5wq] 0.0 0 0 -20 0 0.0 rescuer_thread raid5wq 0 477 TS [kworker/0:1H] 0.0 0 0 -20 0 0.0 worker_thread kworker/0:1H 0 1462 TS lldpd: monitor 0.0 2160 48672 0 0 0.0 skb_wait_for_more_packets lldpd 0 2080 TS logger -p daemon.info -t docker_daemon_events 0.0 328 4360 0 0 0.0 pipe_wait logger 0 2248 TS /sbin/getty -8 38400 tty6 0.0 356 15836 0 0 0.0 wait_woken getty 0 2333 TS cat filer-01-24-1.keys 0.0 324 4384 0 0 0.0 pipe_wait cat 0 2427 TS /usr/bin/python /usr/bin/salt-minion KeepAlive Mul 0.0 109868 719424 0 0 0.1 poll_schedule_timeout /usr/bin/python 0 3555 TS cmascsid -p 15 -s OK -l /var/log/hp-snmp-agents/cm 0.0 396 12880 0 0 0.0 msgrcv cmascsid ps -deo psr,pid,cls,cmd:50,pmem,size,vsz,nice,psr,pcpu,wchan:30,comm:30 | sort -nk1 | head -20 0 3 TS [kworker/0:0] 0.0 0 0 0 0 0.0 worker_thread kworker/0:0 0 4 TS [kworker/0:0H] 0.0 0 0 -20 0 0.0 worker_thread kworker/0:0H 0 7 TS [mm_percpu_wq] 0.0 0 0 -20 0 0.0 rescuer_thread mm_percpu_wq 0 8 TS [ksoftirqd/0] 0.0 0 0 0 0 0.0 smpboot_thread_fn ksoftirqd/0 0 10 TS [rcu_bh] 0.0 0 0 0 0 0.0 rcu_gp_kthread rcu_bh 0 11 FF [migration/0] 0.0 0 0 - 0 0.0 smpboot_thread_fn migration/0 0 12 FF [watchdog/0] 0.0 0 0 - 0 0.0 smpboot_thread_fn watchdog/0 0 13 TS [cpuhp/0] 0.0 0 0 0 0 0.0 smpboot_thread_fn cpuhp/0 0 71 TS [kblockd] 0.0 0 0 -20 0 0.0 rescuer_thread kblockd 0 76 FF [watchdogd] 0.0 0 0 - 0 0.0 kthread_worker_fn watchdogd 0 128 TS [nvme-delete-wq] 0.0 0 0 -20 0 0.0 rescuer_thread nvme-delete-wq 0 245 TS [kworker/0:2] 0.0 0 0 0 0 0.0 worker_thread kworker/0:2 0 271 TS [raid5wq] 0.0 0 0 -20 0 0.0 rescuer_thread raid5wq 0 477 TS [kworker/0:1H] 0.0 0 0 -20 0 0.0 worker_thread kworker/0:1H 0 1462 TS lldpd: monitor 0.0 2160 48672 0 0 0.0 skb_wait_for_more_packets lldpd 0 1865 TS /usr/sbin/sshd -D 0.0 836 61392 0 0 0.0 poll_schedule_timeout sshd 0 2080 TS logger -p daemon.info -t docker_daemon_events 0.0 328 4360 0 0 0.0 pipe_wait logger 0 2248 TS /sbin/getty -8 38400 tty6 0.0 356 15836 0 0 0.0 wait_woken getty 0 2333 TS cat filer-01-24-1.keys 0.0 324 4384 0 0 0.0 pipe_wait cat 0 2399 TS /usr/bin/python /usr/sbin/exabgp /etc/exabgp/exabg 0.0 9552 50888 0 0 0.0 poll_schedule_timeout exabgp ps -deo psr,pid,cls,cmd:50,pmem,size,vsz,nice,psr,pcpu,wchan:30,comm:30 | sort -nk1 | head -20 0 3 TS [kworker/0:0] 0.0 0 0 0 0 0.0 worker_thread kworker/0:0 0 4 TS [kworker/0:0H] 0.0 0 0 -20 0 0.0 worker_thread kworker/0:0H 0 7 TS [mm_percpu_wq] 0.0 0 0 -20 0 0.0 rescuer_thread mm_percpu_wq 0 8 TS [ksoftirqd/0] 0.0 0 0 0 0 0.0 smpboot_thread_fn ksoftirqd/0 0 10 TS [rcu_bh] 0.0 0 
0 0 0 0.0 rcu_gp_kthread rcu_bh 0 11 FF [migration/0] 0.0 0 0 - 0 0.0 smpboot_thread_fn migration/0 0 12 FF [watchdog/0] 0.0 0 0 - 0 0.0 smpboot_thread_fn watchdog/0 0 13 TS [cpuhp/0] 0.0 0 0 0 0 0.0 smpboot_thread_fn cpuhp/0 0 71 TS [kblockd] 0.0 0 0 -20 0 0.0 rescuer_thread kblockd 0 76 FF [watchdogd] 0.0 0 0 - 0 0.0 kthread_worker_fn watchdogd 0 128 TS [nvme-delete-wq] 0.0 0 0 -20 0 0.0 rescuer_thread nvme-delete-wq 0 245 TS [kworker/0:2] 0.0 0 0 0 0 0.0 worker_thread kworker/0:2 0 271 TS [raid5wq] 0.0 0 0 -20 0 0.0 rescuer_thread raid5wq 0 436 TS [md0_raid1] 0.0 0 0 0 0 0.0 md_thread md0_raid1 0 477 TS [kworker/0:1H] 0.0 0 0 -20 0 0.0 worker_thread kworker/0:1H 0 1462 TS lldpd: monitor 0.0 2160 48672 0 0 0.0 skb_wait_for_more_packets lldpd 0 2080 TS logger -p daemon.info -t docker_daemon_events 0.0 328 4360 0 0 0.0 pipe_wait logger 0 2248 TS /sbin/getty -8 38400 tty6 0.0 356 15836 0 0 0.0 wait_woken getty 0 2333 TS cat filer-01-24-1.keys 0.0 324 4384 0 0 0.0 pipe_wait cat 0 2662 TS asynctask-worker [disable] : 1 0.0 14856 135372 0 0 0.0 poll_schedule_timeout asynctask-worke ps -deo psr,pid,cls,cmd:50,pmem,size,vsz,nice,psr,pcpu,wchan:30,comm:30 | sort -nk1 | head -20 0 3 TS [kworker/0:0] 0.0 0 0 0 0 0.0 worker_thread kworker/0:0 0 4 TS [kworker/0:0H] 0.0 0 0 -20 0 0.0 worker_thread kworker/0:0H 0 7 TS [mm_percpu_wq] 0.0 0 0 -20 0 0.0 rescuer_thread mm_percpu_wq 0 8 TS [ksoftirqd/0] 0.0 0 0 0 0 0.0 smpboot_thread_fn ksoftirqd/0 0 10 TS [rcu_bh] 0.0 0 0 0 0 0.0 rcu_gp_kthread rcu_bh 0 11 FF [migration/0] 0.0 0 0 - 0 0.0 smpboot_thread_fn migration/0 0 12 FF [watchdog/0] 0.0 0 0 - 0 0.0 smpboot_thread_fn watchdog/0 0 13 TS [cpuhp/0] 0.0 0 0 0 0 0.0 smpboot_thread_fn cpuhp/0 0 71 TS [kblockd] 0.0 0 0 -20 0 0.0 rescuer_thread kblockd 0 76 FF [watchdogd] 0.0 0 0 - 0 0.0 kthread_worker_fn watchdogd 0 128 TS [nvme-delete-wq] 0.0 0 0 -20 0 0.0 rescuer_thread nvme-delete-wq 0 245 TS [kworker/0:2] 0.0 0 0 0 0 0.0 worker_thread kworker/0:2 0 271 TS [raid5wq] 0.0 0 0 -20 0 0.0 rescuer_thread raid5wq 0 477 TS [kworker/0:1H] 0.0 0 0 -20 0 0.0 worker_thread kworker/0:1H 0 1462 TS lldpd: monitor 0.0 2160 48672 0 0 0.0 skb_wait_for_more_packets lldpd 0 2034 TS /usr/sbin/syslog-ng --process-mode=background -f / 0.0 56936 637344 0 0 1.3 ep_poll syslog-ng 0 2080 TS logger -p daemon.info -t docker_daemon_events 0.0 328 4360 0 0 0.0 pipe_wait logger 0 2248 TS /sbin/getty -8 38400 tty6 0.0 356 15836 0 0 0.0 wait_woken getty 0 2333 TS cat filer-01-24-1.keys 0.0 324 4384 0 0 0.0 pipe_wait cat 0 2471 TS python /etc/exabgp/processes/exasrv.py /etc/exabgp 0.0 6316 35120 0 0 0.0 poll_schedule_timeout python ps -deo psr,pid,cls,cmd:50,pmem,size,vsz,nice,psr,pcpu,wchan:30,comm:30 | sort -nk1 | head -20 0 3 TS [kworker/0:0] 0.0 0 0 0 0 0.0 worker_thread kworker/0:0 0 4 TS [kworker/0:0H] 0.0 0 0 -20 0 0.0 worker_thread kworker/0:0H 0 7 TS [mm_percpu_wq] 0.0 0 0 -20 0 0.0 rescuer_thread mm_percpu_wq 0 8 TS [ksoftirqd/0] 0.0 0 0 0 0 0.0 smpboot_thread_fn ksoftirqd/0 0 10 TS [rcu_bh] 0.0 0 0 0 0 0.0 rcu_gp_kthread rcu_bh 0 11 FF [migration/0] 0.0 0 0 - 0 0.0 smpboot_thread_fn migration/0 0 12 FF [watchdog/0] 0.0 0 0 - 0 0.0 smpboot_thread_fn watchdog/0 0 13 TS [cpuhp/0] 0.0 0 0 0 0 0.0 smpboot_thread_fn cpuhp/0 0 71 TS [kblockd] 0.0 0 0 -20 0 0.0 rescuer_thread kblockd 0 76 FF [watchdogd] 0.0 0 0 - 0 0.0 kthread_worker_fn watchdogd 0 128 TS [nvme-delete-wq] 0.0 0 0 -20 0 0.0 rescuer_thread nvme-delete-wq 0 245 TS [kworker/0:2] 0.0 0 0 0 0 0.0 worker_thread kworker/0:2 0 271 TS [raid5wq] 0.0 0 0 -20 0 0.0 rescuer_thread 
raid5wq 0 477 TS [kworker/0:1H] 0.0 0 0 -20 0 0.0 worker_thread kworker/0:1H 0 1462 TS lldpd: monitor 0.0 2160 48672 0 0 0.0 skb_wait_for_more_packets lldpd 0 2080 TS logger -p daemon.info -t docker_daemon_events 0.0 328 4360 0 0 0.0 pipe_wait logger 0 2248 TS /sbin/getty -8 38400 tty6 0.0 356 15836 0 0 0.0 wait_woken getty 0 2471 TS python /etc/exabgp/processes/exasrv.py /etc/exabgp 0.0 6316 35120 0 0 0.0 poll_schedule_timeout python 0 3275 TS cmahealthd -p 30 -s OK -t OK -i -l /var/log/hp-snm 0.0 972 22236 0 0 0.0 msgrcv cmahealthd 0 3487 TS cmasasd -p 15 -s OK -l /var/log/hp-snmp-agents/cma 0.0 388 10820 0 0 0.0 msgrcv cmasasd ps -deo psr,pid,cls,cmd:50,pmem,size,vsz,nice,psr,pcpu,wchan:30,comm:30 | sort -nk1 | head -20 0 3 TS [kworker/0:0] 0.0 0 0 0 0 0.0 worker_thread kworker/0:0 0 4 TS [kworker/0:0H] 0.0 0 0 -20 0 0.0 worker_thread kworker/0:0H 0 7 TS [mm_percpu_wq] 0.0 0 0 -20 0 0.0 rescuer_thread mm_percpu_wq 0 8 TS [ksoftirqd/0] 0.0 0 0 0 0 0.0 smpboot_thread_fn ksoftirqd/0 0 10 TS [rcu_bh] 0.0 0 0 0 0 0.0 rcu_gp_kthread rcu_bh 0 11 FF [migration/0] 0.0 0 0 - 0 0.0 smpboot_thread_fn migration/0 0 12 FF [watchdog/0] 0.0 0 0 - 0 0.0 smpboot_thread_fn watchdog/0 0 13 TS [cpuhp/0] 0.0 0 0 0 0 0.0 smpboot_thread_fn cpuhp/0 0 71 TS [kblockd] 0.0 0 0 -20 0 0.0 rescuer_thread kblockd 0 76 FF [watchdogd] 0.0 0 0 - 0 0.0 kthread_worker_fn watchdogd 0 128 TS [nvme-delete-wq] 0.0 0 0 -20 0 0.0 rescuer_thread nvme-delete-wq 0 245 TS [kworker/0:2] 0.0 0 0 0 0 0.0 worker_thread kworker/0:2 0 271 TS [raid5wq] 0.0 0 0 -20 0 0.0 rescuer_thread raid5wq 0 458 TS [jbd2/md0-8] 0.0 0 0 0 0 0.0 kjournald2 jbd2/md0-8 0 477 TS [kworker/0:1H] 0.0 0 0 -20 0 0.0 worker_thread kworker/0:1H 0 1462 TS lldpd: monitor 0.0 2160 48672 0 0 0.0 skb_wait_for_more_packets lldpd 0 2080 TS logger -p daemon.info -t docker_daemon_events 0.0 328 4360 0 0 0.0 pipe_wait logger 0 2248 TS /sbin/getty -8 38400 tty6 0.0 356 15836 0 0 0.0 wait_woken getty 0 3275 TS cmahealthd -p 30 -s OK -t OK -i -l /var/log/hp-snm 0.0 972 22236 0 0 0.0 msgrcv cmahealthd 0 3487 TS cmasasd -p 15 -s OK -l /var/log/hp-snmp-agents/cma 0.0 388 10820 0 0 0.0 msgrcv cmasasd ps -deo psr,pid,cls,cmd:50,pmem,size,vsz,nice,psr,pcpu,wchan:30,comm:30 | sort -nk1 | head -20 0 3 TS [kworker/0:0] 0.0 0 0 0 0 0.0 worker_thread kwor ker/0:0 0 4 TS [kworker/0:0H] 0.0 0 0 -20 0 0.0 worker_thread kwor ker/0:0H 0 7 TS [mm_percpu_wq] 0.0 0 0 -20 0 0.0 rescuer_thread mm_percpu_wq 0 8 TS [ksoftirqd/0] 0.0 0 0 0 0 0.0 smpboot_thread_fn ksoftirqd/0 0 10 TS [rcu_bh] 0.0 0 0 0 0 0.0 rcu_gp_kthread rcu_bh 0 11 FF [migration/0] 0.0 0 0 - 0 0.0 smpboot_thread_fn migration/0 0 12 FF [watchdog/0] 0.0 0 0 - 0 0.0 smpboot_thread_fn watchdog/0 0 13 TS [cpuhp/0] 0.0 0 0 0 0 0.0 smpboot_thread_fn cpuhp/0 0 71 TS [kblockd] 0.0 0 0 -20 0 0.0 rescuer_thread kblockd 0 76 FF [watchdogd] 0.0 0 0 - 0 0.0 kthread_worker_fn watchdogd 0 128 TS [nvme-delete-wq] 0.0 0 0 -20 0 0.0 rescuer_thread nvme-delete-wq 0 245 TS [kworker/0:2] 0.0 0 0 0 0 0.0 worker_thread kworker/0:2 0 271 TS [raid5wq] 0.0 0 0 -20 0 0.0 rescuer_thread raid5wq 0 436 TS [md0_raid1] 0.0 0 0 0 0 0.0 md_thread md0_raid1 0 477 TS [kworker/0:1H] 0.0 0 0 -20 0 0.0 worker_thread kworker/0:1H 0 1462 TS lldpd: monitor 0.0 2160 48672 0 0 0.0 skb_wait_for_more_packets lldpd 0 1865 TS /usr/sbin/sshd -D 0.0 836 61392 0 0 0.0 poll_schedule_timeout sshd 0 2033 TS lldpd: 2 neighbors 0.0 2488 49000 0 0 0.0 ep_poll lldpd 0 2080 TS logger -p daemon.info -t docker_daemon_events 0.0 328 4360 0 0 0.0 pipe_wait logger 0 2248 TS /sbin/getty -8 38400 
tty6 0.0 356 15836 0 0 0.0 wait_woken getty Few minutes later before reboot: ps -deo psr,pid,cls,cmd:50,pmem,size,vsz,nice,psr,pcpu,wchan:30,comm:30 | sort -nk1 | head -20 0 3 TS [kworker/0:0] 0.0 0 0 0 0 0.0 worker_thread kworker/0:0 0 4 TS [kworker/0:0H] 0.0 0 0 -20 0 0.0 worker_thread kworker/0:0H 0 7 TS [mm_percpu_wq] 0.0 0 0 -20 0 0.0 rescuer_thread mm_percpu_wq 0 8 TS [ksoftirqd/0] 0.0 0 0 0 0 0.0 smpboot_thread_fn ksoftirqd/0 0 10 TS [rcu_bh] 0.0 0 0 0 0 0.0 rcu_gp_kthread rcu_bh 0 11 FF [migration/0] 0.0 0 0 - 0 0.0 smpboot_thread_fn migration/0 0 12 FF [watchdog/0] 0.0 0 0 - 0 0.0 smpboot_thread_fn watchdog/0 0 13 TS [cpuhp/0] 0.0 0 0 0 0 0.0 smpboot_thread_fn cpuhp/0 0 71 TS [kblockd] 0.0 0 0 -20 0 0.0 rescuer_thread kblockd 0 76 FF [watchdogd] 0.0 0 0 - 0 0.0 kthread_worker_fn watchdogd 0 128 TS [nvme-delete-wq] 0.0 0 0 -20 0 0.0 rescuer_thread nvme-delete-wq 0 245 TS [kworker/0:2] 0.0 0 0 0 0 0.0 worker_thread kworker/0:2 0 271 TS [raid5wq] 0.0 0 0 -20 0 0.0 rescuer_thread raid5wq 0 427 TS [kworker/u129:0] 0.0 0 0 0 0 0.0 worker_thread kworker/u129:0 0 477 TS [kworker/0:1H] 0.0 0 0 -20 0 0.0 worker_thread kworker/0:1H 0 2080 TS logger -p daemon.info -t docker_daemon_events 0.0 328 4360 0 0 0.0 pipe_wait logger 0 2248 TS /sbin/getty -8 38400 tty6 0.0 356 15836 0 0 0.0 wait_woken getty 0 2427 TS /usr/bin/python /usr/bin/salt-minion KeepAlive Mul 0.0 109868 719424 0 0 0.1 poll_schedule_timeout /usr/bin/python 0 3326 TS cmasm2d -p 30 -l /var/log/hp-snmp-agents/cma.log 0.0 948 24176 0 0 0.0 msgrcv cmasm2d 0 3364 TS cmaperfd -p 30 -s OK -l /var/log/hp-snmp-agents/cm 0.0 1628 22724 0 0 0.0 msgrcv cmaperfd So here it is, I hope we have now enough thing to track down this weird behavior. If you need some other informations or more thing, I can make a little to script to pass some commands if the reset raise without returns. Created attachment 275723 [details]
Load on server during reset problem
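Regarding the offer above to script some extra diagnostics if the reset hangs again: here is a minimal sketch of such a collection loop (this is only an illustrative example, not something from the driver team; it assumes root access, a kernel with magic SysRq enabled via kernel.sysrq=1, and the output path and 60-second interval are arbitrary placeholders):

```
#!/bin/bash
# Rough diagnostics collector (sketch only): start it just before triggering
# the reset and let it keep looping if the reset never returns.
OUT=/var/tmp/hpsa-hang-$(date +%Y%m%d-%H%M%S)
mkdir -p "$OUT"

for i in $(seq 1 30); do
    # Tasks stuck in uninterruptible sleep (D state), with their wait channel.
    ps -deo state,pid,psr,wchan:30,comm:30 | awk '$1 ~ /^D/' >> "$OUT/dstate.$i.txt"
    # Ask the kernel to dump the stacks of all blocked tasks into the ring buffer.
    echo w > /proc/sysrq-trigger
    dmesg -T | tail -n 300 > "$OUT/dmesg.$i.txt"
    # Load average: a quick indicator of how many tasks are piling up in D state.
    cat /proc/loadavg >> "$OUT/loadavg.txt"
    sleep 60
done
```

The sysrq-w dump in particular should show exactly where the scsi_eh and jbd2 threads are blocked, which is more detail than the periodic hung-task messages alone.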
Oh, I forgot to mention that before hpsa took any action, I had several errors on the disk that badblocks was running against:
...
[Mon Apr 30 22:21:18 2018] sd 0:1:0:19: [sdt] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[Mon Apr 30 22:21:18 2018] sd 0:1:0:19: [sdt] tag#0 Sense Key : Medium Error [current]
[Mon Apr 30 22:21:18 2018] sd 0:1:0:19: [sdt] tag#0 Add. Sense: Unrecovered read error
[Mon Apr 30 22:21:18 2018] sd 0:1:0:19: [sdt] tag#0 CDB: Read(16) 88 00 00 00 00 01 37 0c c5 b0 00 00 00 08 00 00
[Mon Apr 30 22:21:18 2018] print_req_error: critical medium error, dev sdt, sector 5218551216
[Mon Apr 30 22:21:18 2018] sd 0:1:0:19: [sdt] Unaligned partial completion (resid=242, sector_sz=512)
[Mon Apr 30 22:21:18 2018] sd 0:1:0:19: [sdt] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[Mon Apr 30 22:21:18 2018] sd 0:1:0:19: [sdt] tag#0 Sense Key : Medium Error [current]
[Mon Apr 30 22:21:18 2018] sd 0:1:0:19: [sdt] tag#0 Add. Sense: Unrecovered read error
[Mon Apr 30 22:21:18 2018] sd 0:1:0:19: [sdt] tag#0 CDB: Read(16) 88 00 00 00 00 01 37 0c c5 b8 00 00 00 08 00 00
[Mon Apr 30 22:21:18 2018] print_req_error: critical medium error, dev sdt, sector 5218551224
[Tue May 1 06:27:37 2018] hpsa 0000:08:00.0: aborted: LUN:000000c000003901 CDB:12000000310000000000000000000000
[Tue May 1 06:27:37 2018] hpsa 0000:08:00.0: hpsa_update_device_info: inquiry failed, device will be skipped.
[Tue May 1 06:27:37 2018] hpsa 0000:08:00.0: scsi 0:0:50:0: removed Direct-Access ATA MB4000GCWDC PHYS DRV SSDSmartPathCap- En- Exp=0
[Tue May 1 06:28:24 2018] hpsa 0000:08:00.0: aborted: LUN:000000c000003901 CDB:12000000310000000000000000000000
[Tue May 1 06:28:24 2018] hpsa 0000:08:00.0: hpsa_update_device_info: inquiry failed, device will be skipped.
...

Same behavior here with controllers P440ar and P420i on DL480 G8 and DL480p G8.

Firmware:
- P440ar: 6.60
- P420i: 8.32

[128958.979859] hpsa 0000:03:00.0: scsi 0:1:0:9: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [129170.663840] INFO: task scsi_eh_0:446 blocked for more than 120 seconds. [129170.671251] Not tainted 4.15.0-33-generic #36~16.04.1-Ubuntu [129170.678176] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [129170.686930] scsi_eh_0 D 0 446 2 0x80000000 [129170.686934] Call Trace: [129170.686945] __schedule+0x3d6/0x8b0 [129170.686947] schedule+0x36/0x80 [129170.686950] schedule_timeout+0x1db/0x370 [129170.686954] ? __dev_printk+0x3c/0x80 [129170.686956] ? dev_printk+0x56/0x80 [129170.686959] io_schedule_timeout+0x1e/0x50 [129170.686961] wait_for_completion_io+0xb4/0x140 [129170.686965] ? wake_up_q+0x70/0x70 [129170.686972] hpsa_scsi_do_simple_cmd.isra.56+0xc7/0xf0 [hpsa] [129170.686975] hpsa_eh_device_reset_handler+0x3bb/0x790 [hpsa] [129170.686978] ? sched_clock_cpu+0x11/0xb0 [129170.686983] ? scsi_device_put+0x2b/0x30 [129170.686987] scsi_eh_ready_devs+0x368/0xc10 [129170.686993] ? __pm_runtime_resume+0x5b/0x80 [129170.686995] scsi_error_handler+0x4c3/0x5c0 [129170.687000] kthread+0x105/0x140 [129170.687003] ? scsi_eh_get_sense+0x240/0x240 [129170.687005] ? kthread_destroy_worker+0x50/0x50 [129170.687012] ? do_syscall_64+0x73/0x130 [129170.687015] ? SyS_exit_group+0x14/0x20 [129170.687017] ret_from_fork+0x35/0x40 [129170.687021] INFO: task jbd2/sda1-8:636 blocked for more than 120 seconds. [129170.694649] Not tainted 4.15.0-33-generic #36~16.04.1-Ubuntu [129170.701598] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[129170.710343] jbd2/sda1-8 D 0 636 2 0x80000000 [129170.710346] Call Trace: [129170.710349] __schedule+0x3d6/0x8b0 [129170.710351] ? bit_wait+0x60/0x60 [129170.710352] schedule+0x36/0x80 [129170.710354] io_schedule+0x16/0x40 [129170.710359] bit_wait_io+0x11/0x60 [129170.710362] __wait_on_bit+0x63/0x90 [129170.710367] out_of_line_wait_on_bit+0x8e/0xb0 [129170.710373] ? bit_waitqueue+0x40/0x40 [129170.710377] __wait_on_buffer+0x32/0x40 [129170.710381] jbd2_journal_commit_transaction+0xdf6/0x1760 [129170.710387] kjournald2+0xc8/0x250 [129170.710392] ? kjournald2+0xc8/0x250 [129170.710395] ? wait_woken+0x80/0x80 [129170.710398] kthread+0x105/0x140 [129170.710399] ? commit_timeout+0x20/0x20 [129170.710402] ? kthread_destroy_worker+0x50/0x50 [129170.710404] ? do_syscall_64+0x73/0x130 [129170.710407] ? SyS_exit_group+0x14/0x20 [129170.710412] ret_from_fork+0x35/0x40 [129170.710423] INFO: task rs:main Q:Reg:2907 blocked for more than 120 seconds. [129170.718358] Not tainted 4.15.0-33-generic #36~16.04.1-Ubuntu [129170.725305] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [129170.734076] rs:main Q:Reg D 0 2907 1 0x00000000 [129170.734079] Call Trace: [129170.734082] __schedule+0x3d6/0x8b0 [129170.734086] ? bit_waitqueue+0x40/0x40 [129170.734087] ? bit_wait+0x60/0x60 [129170.734089] schedule+0x36/0x80 [129170.734091] io_schedule+0x16/0x40 [129170.734092] bit_wait_io+0x11/0x60 [129170.734094] __wait_on_bit+0x63/0x90 [129170.734096] out_of_line_wait_on_bit+0x8e/0xb0 [129170.734098] ? bit_waitqueue+0x40/0x40 [129170.734100] do_get_write_access+0x202/0x410 [129170.734102] jbd2_journal_get_write_access+0x51/0x70 [129170.734107] __ext4_journal_get_write_access+0x3b/0x80 [129170.734111] ext4_reserve_inode_write+0x95/0xc0 [129170.734115] ? ext4_dirty_inode+0x48/0x70 [129170.734117] ext4_mark_inode_dirty+0x53/0x1d0 [129170.734119] ? __ext4_journal_start_sb+0x6d/0x120 [129170.734121] ext4_dirty_inode+0x48/0x70 [129170.734125] __mark_inode_dirty+0x184/0x3b0 [129170.734129] generic_update_time+0x7b/0xd0 [129170.734132] ? current_time+0x32/0x70 [129170.734134] file_update_time+0xbe/0x110 [129170.734140] __generic_file_write_iter+0x9d/0x1f0 [129170.734142] ext4_file_write_iter+0xc4/0x3f0 [129170.734147] ? futex_wake+0x90/0x170 [129170.734151] new_sync_write+0xe5/0x140 [129170.734155] __vfs_write+0x29/0x40 [129170.734156] vfs_write+0xb8/0x1b0 [129170.734158] SyS_write+0x55/0xc0 [129170.734160] do_syscall_64+0x73/0x130 [129170.734163] entry_SYSCALL_64_after_hwframe+0x3d/0xa2 [129170.734165] RIP: 0033:0x7feefa9394bd [129170.734166] RSP: 002b:00007feef7ce8600 EFLAGS: 00000293 ORIG_RAX: 0000000000000001 [129170.734168] RAX: ffffffffffffffda RBX: 00007feeec00d120 RCX: 00007feefa9394bd [129170.734169] RDX: 0000000000000078 RSI: 00007feeec00d120 RDI: 0000000000000006 [129170.734171] RBP: 0000000000000000 R08: 00000000010d0030 R09: 00007feef7ce8890 [129170.734173] R10: 00007feef7ce8890 R11: 0000000000000293 R12: 00007feeec0027c0 [129170.734174] R13: 00007feef7ce8620 R14: 000000000046a8b4 R15: 0000000000000078 [129170.734194] INFO: task dockerd:10374 blocked for more than 120 seconds. [129170.741596] Not tainted 4.15.0-33-generic #36~16.04.1-Ubuntu [129170.748540] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [129170.757284] dockerd D 0 10374 1 0x00000000 [129170.757286] Call Trace: [129170.757289] __schedule+0x3d6/0x8b0 [129170.757295] ? 
bit_wait+0x60/0x60 [129170.757298] schedule+0x36/0x80 [129170.757300] io_schedule+0x16/0x40 [129170.757302] bit_wait_io+0x11/0x60 [129170.757303] __wait_on_bit+0x63/0x90 [129170.757306] ? select_idle_sibling+0x1db/0x410 [129170.757307] out_of_line_wait_on_bit+0x8e/0xb0 [129170.757311] ? bit_waitqueue+0x40/0x40 [129170.757319] do_get_write_access+0x202/0x410 [129170.757323] ? __wake_up_common_lock+0x8e/0xc0 [129170.757327] jbd2_journal_get_write_access+0x51/0x70 [129170.757331] __ext4_journal_get_write_access+0x3b/0x80 [129170.757334] ext4_reserve_inode_write+0x95/0xc0 [129170.757338] ? ext4_dirty_inode+0x48/0x70 [129170.757340] ext4_mark_inode_dirty+0x53/0x1d0 [129170.757343] ? __ext4_journal_start_sb+0x6d/0x120 [129170.757345] ext4_dirty_inode+0x48/0x70 [129170.757350] __mark_inode_dirty+0x184/0x3b0 [129170.757358] generic_update_time+0x7b/0xd0 [129170.757362] ? current_time+0x32/0x70 [129170.757365] file_update_time+0xbe/0x110 [129170.757368] __generic_file_write_iter+0x9d/0x1f0 [129170.757371] ext4_file_write_iter+0xc4/0x3f0 [129170.757374] new_sync_write+0xe5/0x140 [129170.757376] __vfs_write+0x29/0x40 [129170.757378] vfs_write+0xb8/0x1b0 [129170.757379] SyS_write+0x55/0xc0 [129170.757382] do_syscall_64+0x73/0x130 [129170.757385] entry_SYSCALL_64_after_hwframe+0x3d/0xa2 [129170.757393] RIP: 0033:0x5632bfc51d40 [129170.757396] RSP: 002b:000000c420edaaf0 EFLAGS: 00000206 ORIG_RAX: 0000000000000001 [129170.757403] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00005632bfc51d40 [129170.757405] RDX: 000000000000012d RSI: 000000c420bb0800 RDI: 000000000000001e [129170.757407] RBP: 000000c420edab48 R08: 0000000000000000 R09: 0000000000000000 [129170.757409] R10: 0000000000000000 R11: 0000000000000206 R12: ffffffffffffffff [129170.757410] R13: 0000000000000083 R14: 0000000000000082 R15: 0000000000000100 [129170.757448] INFO: task log:6889 blocked for more than 120 seconds. [129170.764622] Not tainted 4.15.0-33-generic #36~16.04.1-Ubuntu [129170.771584] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [129170.780349] log D 0 6889 1 0x00000000 [129170.780352] Call Trace: [129170.780356] __schedule+0x3d6/0x8b0 [129170.780360] ? bit_wait+0x60/0x60 [129170.780361] schedule+0x36/0x80 [129170.780363] io_schedule+0x16/0x40 [129170.780365] bit_wait_io+0x11/0x60 [129170.780366] __wait_on_bit+0x63/0x90 [129170.780368] out_of_line_wait_on_bit+0x8e/0xb0 [129170.780373] ? bit_waitqueue+0x40/0x40 [129170.780377] do_get_write_access+0x202/0x410 [129170.780380] jbd2_journal_get_write_access+0x51/0x70 [129170.780385] __ext4_journal_get_write_access+0x3b/0x80 [129170.780387] ext4_reserve_inode_write+0x95/0xc0 [129170.780389] ? ext4_dirty_inode+0x48/0x70 [129170.780391] ext4_mark_inode_dirty+0x53/0x1d0 [129170.780393] ? __ext4_journal_start_sb+0x6d/0x120 [129170.780395] ext4_dirty_inode+0x48/0x70 [129170.780397] __mark_inode_dirty+0x184/0x3b0 [129170.780401] generic_update_time+0x7b/0xd0 [129170.780405] ? current_time+0x32/0x70 [129170.780409] file_update_time+0xbe/0x110 [129170.780413] __generic_file_write_iter+0x9d/0x1f0 [129170.780417] ext4_file_write_iter+0xc4/0x3f0 [129170.780421] ? 
futex_wake+0x90/0x170 [129170.780423] new_sync_write+0xe5/0x140 [129170.780425] __vfs_write+0x29/0x40 [129170.780426] vfs_write+0xb8/0x1b0 [129170.780428] SyS_write+0x55/0xc0 [129170.780430] do_syscall_64+0x73/0x130 [129170.780433] entry_SYSCALL_64_after_hwframe+0x3d/0xa2 [129170.780434] RIP: 0033:0x7f5c82b984bd [129170.780435] RSP: 002b:00007f5c80a615c0 EFLAGS: 00000293 ORIG_RAX: 0000000000000001 [129170.780436] RAX: ffffffffffffffda RBX: 0000000000000082 RCX: 00007f5c82b984bd [129170.780437] RDX: 0000000000000082 RSI: 00007f5c80a615f0 RDI: 0000000000000003 [129170.780438] RBP: 00007f5c80a615f0 R08: 0000000000000000 R09: 0000000000000011 [129170.780439] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000003 [129170.780440] R13: 00007f5c80a617e0 R14: 0000000000000056 R15: 0000000000000081 [129170.780453] INFO: task log:6964 blocked for more than 120 seconds. [129170.787377] Not tainted 4.15.0-33-generic #36~16.04.1-Ubuntu [129170.794341] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [129170.803204] log D 0 6964 1 0x00000000 [129170.803207] Call Trace: [129170.803210] __schedule+0x3d6/0x8b0 [129170.803216] ? find_next_bit+0xb/0x10 [129170.803218] ? bit_wait+0x60/0x60 [129170.803219] schedule+0x36/0x80 [129170.803221] io_schedule+0x16/0x40 [129170.803223] bit_wait_io+0x11/0x60 [129170.803224] __wait_on_bit+0x63/0x90 [129170.803226] out_of_line_wait_on_bit+0x8e/0xb0 [129170.803228] ? bit_waitqueue+0x40/0x40 [129170.803230] do_get_write_access+0x202/0x410 [129170.803234] jbd2_journal_get_write_access+0x51/0x70 [129170.803237] __ext4_journal_get_write_access+0x3b/0x80 [129170.803239] ext4_reserve_inode_write+0x95/0xc0 [129170.803241] ? ext4_dirty_inode+0x48/0x70 [129170.803243] ext4_mark_inode_dirty+0x53/0x1d0 [129170.803244] ? _cond_resched+0x1a/0x50 [129170.803247] ? __ext4_journal_start_sb+0x6d/0x120 [129170.803250] ext4_dirty_inode+0x48/0x70 [129170.803252] __mark_inode_dirty+0x184/0x3b0 [129170.803254] ? block_write_end+0x33/0x80 [129170.803256] generic_write_end+0x87/0xe0 [129170.803258] ext4_da_write_end+0x117/0x290 [129170.803260] ? copyin+0x29/0x30 [129170.803263] generic_perform_write+0xff/0x1b0 [129170.803266] __generic_file_write_iter+0x1a6/0x1f0 [129170.803269] ext4_file_write_iter+0xc4/0x3f0 [129170.803271] ? futex_wake+0x90/0x170 [129170.803273] new_sync_write+0xe5/0x140 [129170.803275] __vfs_write+0x29/0x40 [129170.803277] vfs_write+0xb8/0x1b0 [129170.803279] SyS_write+0x55/0xc0 [129170.803281] do_syscall_64+0x73/0x130 [129170.803284] entry_SYSCALL_64_after_hwframe+0x3d/0xa2 [129170.803286] RIP: 0033:0x7f875f3864bd [129170.803287] RSP: 002b:00007f875d24f540 EFLAGS: 00000293 ORIG_RAX: 0000000000000001 [129170.803289] RAX: ffffffffffffffda RBX: 000000000000010a RCX: 00007f875f3864bd [129170.803290] RDX: 000000000000010a RSI: 00007f875d24f570 RDI: 0000000000000003 [129170.803291] RBP: 00007f875d24f570 R08: 0000000000000000 R09: 0000000000000011 [129170.803292] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000003 [129170.803293] R13: 00007f875d24f7e0 R14: 00000000000000de R15: 0000000000000109 [129170.803304] INFO: task log:6976 blocked for more than 120 seconds. [129170.810258] Not tainted 4.15.0-33-generic #36~16.04.1-Ubuntu [129170.817202] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [129170.826044] log D 0 6976 1 0x00000000 [129170.826047] Call Trace: [129170.826050] __schedule+0x3d6/0x8b0 [129170.826057] ? 
bit_wait+0x60/0x60 [129170.826060] schedule+0x36/0x80 [129170.826062] io_schedule+0x16/0x40 [129170.826063] bit_wait_io+0x11/0x60 [129170.826065] __wait_on_bit+0x63/0x90 [129170.826066] ? ttwu_do_wakeup+0x1e/0x150 [129170.826071] out_of_line_wait_on_bit+0x8e/0xb0 [129170.826079] ? bit_waitqueue+0x40/0x40 [129170.826082] do_get_write_access+0x202/0x410 [129170.826084] jbd2_journal_get_write_access+0x51/0x70 [129170.826087] __ext4_journal_get_write_access+0x3b/0x80 [129170.826089] ext4_reserve_inode_write+0x95/0xc0 [129170.826094] ? ext4_dirty_inode+0x48/0x70 [129170.826100] ext4_mark_inode_dirty+0x53/0x1d0 [129170.826104] ? _cond_resched+0x1a/0x50 [129170.826107] ? __ext4_journal_start_sb+0x6d/0x120 [129170.826109] ext4_dirty_inode+0x48/0x70 [129170.826112] __mark_inode_dirty+0x184/0x3b0 [129170.826115] ? block_write_end+0x33/0x80 [129170.826116] generic_write_end+0x87/0xe0 [129170.826120] ext4_da_write_end+0x117/0x290 [129170.826125] ? copyin+0x29/0x30 [129170.826133] generic_perform_write+0xff/0x1b0 [129170.826135] __generic_file_write_iter+0x1a6/0x1f0 [129170.826137] ext4_file_write_iter+0xc4/0x3f0 [129170.826139] ? futex_wake+0x90/0x170 [129170.826142] new_sync_write+0xe5/0x140 [129170.826150] __vfs_write+0x29/0x40 [129170.826153] vfs_write+0xb8/0x1b0 [129170.826155] SyS_write+0x55/0xc0 [129170.826157] do_syscall_64+0x73/0x130 [129170.826159] entry_SYSCALL_64_after_hwframe+0x3d/0xa2 [129170.826161] RIP: 0033:0x7fea6f6044bd [129170.826168] RSP: 002b:00007fea6d4cd530 EFLAGS: 00000293 ORIG_RAX: 0000000000000001 [129170.826173] RAX: ffffffffffffffda RBX: 000000000000010f RCX: 00007fea6f6044bd [129170.826174] RDX: 000000000000010f RSI: 00007fea6d4cd560 RDI: 0000000000000003 [129170.826175] RBP: 00007fea6d4cd560 R08: 0000000000000000 R09: 0000000000000011 [129170.826176] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000003 [129170.826177] R13: 00007fea6d4cd7e0 R14: 00000000000000e3 R15: 000000000000010e [129170.826188] INFO: task log:6997 blocked for more than 120 seconds. [129170.833188] Not tainted 4.15.0-33-generic #36~16.04.1-Ubuntu [129170.840109] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [129170.848851] log D 0 6997 1 0x00000000 [129170.848853] Call Trace: [129170.848857] __schedule+0x3d6/0x8b0 [129170.848859] ? bit_wait+0x60/0x60 [129170.848860] schedule+0x36/0x80 [129170.848862] io_schedule+0x16/0x40 [129170.848864] bit_wait_io+0x11/0x60 [129170.848865] __wait_on_bit+0x63/0x90 [129170.848867] out_of_line_wait_on_bit+0x8e/0xb0 [129170.848869] ? bit_waitqueue+0x40/0x40 [129170.848871] do_get_write_access+0x202/0x410 [129170.848875] jbd2_journal_get_write_access+0x51/0x70 [129170.848877] __ext4_journal_get_write_access+0x3b/0x80 [129170.848879] ext4_reserve_inode_write+0x95/0xc0 [129170.848882] ? ext4_dirty_inode+0x48/0x70 [129170.848884] ext4_mark_inode_dirty+0x53/0x1d0 [129170.848886] ? __ext4_journal_start_sb+0x6d/0x120 [129170.848889] ext4_dirty_inode+0x48/0x70 [129170.848892] __mark_inode_dirty+0x184/0x3b0 [129170.848894] generic_update_time+0x7b/0xd0 [129170.848896] ? current_time+0x32/0x70 [129170.848898] file_update_time+0xbe/0x110 [129170.848901] __generic_file_write_iter+0x9d/0x1f0 [129170.848903] ext4_file_write_iter+0xc4/0x3f0 [129170.848905] ? 
futex_wake+0x90/0x170 [129170.848908] new_sync_write+0xe5/0x140 [129170.848910] __vfs_write+0x29/0x40 [129170.848912] vfs_write+0xb8/0x1b0 [129170.848914] SyS_write+0x55/0xc0 [129170.848916] do_syscall_64+0x73/0x130 [129170.848918] entry_SYSCALL_64_after_hwframe+0x3d/0xa2 [129170.848919] RIP: 0033:0x7fcaf66d04bd [129170.848921] RSP: 002b:00007fcaf4599560 EFLAGS: 00000293 ORIG_RAX: 0000000000000001 [129170.848924] RAX: ffffffffffffffda RBX: 00000000000000de RCX: 00007fcaf66d04bd [129170.848925] RDX: 00000000000000de RSI: 00007fcaf4599590 RDI: 0000000000000003 [129170.848926] RBP: 00007fcaf4599590 R08: 0000000000000000 R09: 0000000000000011 [129170.848927] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000003 [129170.848928] R13: 00007fcaf45997e0 R14: 00000000000000b2 R15: 00000000000000dd [129170.848940] INFO: task log:7127 blocked for more than 120 seconds. [129170.855864] Not tainted 4.15.0-33-generic #36~16.04.1-Ubuntu [129170.862786] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [129170.871674] log D 0 7127 1 0x00000000 [129170.871677] Call Trace: [129170.871679] __schedule+0x3d6/0x8b0 [129170.871681] ? bit_wait+0x60/0x60 [129170.871683] schedule+0x36/0x80 [129170.871685] io_schedule+0x16/0x40 [129170.871686] bit_wait_io+0x11/0x60 [129170.871691] __wait_on_bit+0x63/0x90 [129170.871695] out_of_line_wait_on_bit+0x8e/0xb0 [129170.871699] ? bit_waitqueue+0x40/0x40 [129170.871703] do_get_write_access+0x202/0x410 [129170.871706] jbd2_journal_get_write_access+0x51/0x70 [129170.871709] __ext4_journal_get_write_access+0x3b/0x80 [129170.871711] ext4_reserve_inode_write+0x95/0xc0 [129170.871713] ? ext4_dirty_inode+0x48/0x70 [129170.871715] ext4_mark_inode_dirty+0x53/0x1d0 [129170.871717] ? __ext4_journal_start_sb+0x6d/0x120 [129170.871720] ext4_dirty_inode+0x48/0x70 [129170.871721] __mark_inode_dirty+0x184/0x3b0 [129170.871725] generic_update_time+0x7b/0xd0 [129170.871729] ? current_time+0x32/0x70 [129170.871734] file_update_time+0xbe/0x110 [129170.871740] __generic_file_write_iter+0x9d/0x1f0 [129170.871744] ext4_file_write_iter+0xc4/0x3f0 [129170.871746] ? futex_wake+0x90/0x170 [129170.871748] new_sync_write+0xe5/0x140 [129170.871750] __vfs_write+0x29/0x40 [129170.871751] vfs_write+0xb8/0x1b0 [129170.871753] SyS_write+0x55/0xc0 [129170.871755] do_syscall_64+0x73/0x130 [129170.871758] entry_SYSCALL_64_after_hwframe+0x3d/0xa2 [129170.871760] RIP: 0033:0x7f10989804bd [129170.871763] RSP: 002b:00007f1096849560 EFLAGS: 00000293 ORIG_RAX: 0000000000000001 [129170.871769] RAX: ffffffffffffffda RBX: 00000000000000de RCX: 00007f10989804bd [129170.871772] RDX: 00000000000000de RSI: 00007f1096849590 RDI: 0000000000000003 [129170.871775] RBP: 00007f1096849590 R08: 0000000000000000 R09: 0000000000000011 [129170.871778] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000003 [129170.871781] R13: 00007f10968497e0 R14: 00000000000000b2 R15: 00000000000000dd [129170.871792] INFO: task log:7150 blocked for more than 120 seconds. [129170.878715] Not tainted 4.15.0-33-generic #36~16.04.1-Ubuntu [129170.885639] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [129170.894451] log D 0 7150 1 0x00000000 [129170.894455] Call Trace: [129170.894460] __schedule+0x3d6/0x8b0 [129170.894462] ? bit_wait+0x60/0x60 [129170.894463] schedule+0x36/0x80 [129170.894465] io_schedule+0x16/0x40 [129170.894467] bit_wait_io+0x11/0x60 [129170.894468] __wait_on_bit+0x63/0x90 [129170.894470] out_of_line_wait_on_bit+0x8e/0xb0 [129170.894473] ? 
bit_waitqueue+0x40/0x40 [129170.894475] do_get_write_access+0x202/0x410 [129170.894477] jbd2_journal_get_write_access+0x51/0x70 [129170.894481] __ext4_journal_get_write_access+0x3b/0x80 [129170.894484] ext4_reserve_inode_write+0x95/0xc0 [129170.894485] ? ext4_dirty_inode+0x48/0x70 [129170.894487] ext4_mark_inode_dirty+0x53/0x1d0 [129170.894490] ? __ext4_journal_start_sb+0x6d/0x120 [129170.894492] ext4_dirty_inode+0x48/0x70 [129170.894495] __mark_inode_dirty+0x184/0x3b0 [129170.894498] generic_update_time+0x7b/0xd0 [129170.894500] ? current_time+0x32/0x70 [129170.894502] file_update_time+0xbe/0x110 [129170.894505] __generic_file_write_iter+0x9d/0x1f0 [129170.894507] ext4_file_write_iter+0xc4/0x3f0 [129170.894509] ? futex_wake+0x90/0x170 [129170.894513] new_sync_write+0xe5/0x140 [129170.894515] __vfs_write+0x29/0x40 [129170.894517] vfs_write+0xb8/0x1b0 [129170.894518] SyS_write+0x55/0xc0 [129170.894521] do_syscall_64+0x73/0x130 [129170.894523] entry_SYSCALL_64_after_hwframe+0x3d/0xa2 [129170.894524] RIP: 0033:0x7f5174f514bd [129170.894526] RSP: 002b:00007f5172e1a560 EFLAGS: 00000293 ORIG_RAX: 0000000000000001 [129170.894530] RAX: ffffffffffffffda RBX: 00000000000000de RCX: 00007f5174f514bd [129170.894531] RDX: 00000000000000de RSI: 00007f5172e1a590 RDI: 0000000000000003 [129170.894532] RBP: 00007f5172e1a590 R08: 0000000000000000 R09: 0000000000000011 [129170.894533] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000003 [129170.894534] R13: 00007f5172e1a7e0 R14: 00000000000000b2 R15: 00000000000000dd More logs. [ 5.272077] HP HPSA Driver (v 3.4.14-0) [ 5.340589] hpsa 0000:03:00.0: can't disable ASPM; OS doesn't have ASPM control [ 5.352372] hpsa 0000:03:00.0: MSI-X capable controller [ 5.358775] hpsa 0000:03:00.0: Logical aborts not supported [ 5.410577] scsi host6: hpsa [ 5.620173] hpsa 0000:03:00.0: scsi 6:3:0:0: added RAID HP P440ar controller SSDSmartPathCap- En- Exp=1 [ 5.633345] hpsa 0000:03:00.0: scsi 6:0:0:0: masked Direct-Access ATA TK0120GDJXT PHYS DRV SSDSmartPathCap- En- Exp=0 [ 5.651921] hpsa 0000:03:00.0: scsi 6:0:1:0: masked Direct-Access ATA TK0120GDJXT PHYS DRV SSDSmartPathCap- En- Exp=0 [ 5.682879] ata6.00: ATA-9: VR0120GEJXL, 4IWTHPG0, max UDMA/100 [ 5.682891] ata5.00: ATA-9: VR0120GEJXL, 4IWTHPG0, max UDMA/100 [ 5.800257] hpsa 0000:03:00.0: scsi 6:0:2:0: masked Direct-Access ATA MB3000GCWDB PHYS DRV SSDSmartPathCap- En- Exp=0 [ 5.813417] hpsa 0000:03:00.0: scsi 6:0:3:0: masked Direct-Access ATA MB3000GCWDB PHYS DRV SSDSmartPathCap- En- Exp=0 [ 5.826488] hpsa 0000:03:00.0: scsi 6:0:4:0: masked Direct-Access ATA MB3000GCWDB PHYS DRV SSDSmartPathCap- En- Exp=0 [ 5.839558] hpsa 0000:03:00.0: scsi 6:0:5:0: masked Direct-Access ATA MB3000GCWDB PHYS DRV SSDSmartPathCap- En- Exp=0 [ 5.852628] hpsa 0000:03:00.0: scsi 6:0:6:0: masked Direct-Access ATA MB3000GCWDB PHYS DRV SSDSmartPathCap- En- Exp=0 [ 5.865698] hpsa 0000:03:00.0: scsi 6:0:7:0: masked Direct-Access ATA MB3000GCWDB PHYS DRV SSDSmartPathCap- En- Exp=0 [ 5.878769] hpsa 0000:03:00.0: scsi 6:0:8:0: masked Direct-Access ATA MB3000GCWDB PHYS DRV SSDSmartPathCap- En- Exp=0 [ 5.891839] hpsa 0000:03:00.0: scsi 6:0:9:0: masked Direct-Access ATA MB3000GCWDB PHYS DRV SSDSmartPathCap- En- Exp=0 [ 5.904910] hpsa 0000:03:00.0: scsi 6:0:10:0: masked Direct-Access ATA MB3000GCWDB PHYS DRV SSDSmartPathCap- En- Exp=0 [ 5.918076] hpsa 0000:03:00.0: scsi 6:0:11:0: masked Direct-Access ATA MB3000GCWDB PHYS DRV SSDSmartPathCap- En- Exp=0 [ 5.931242] hpsa 0000:03:00.0: scsi 6:0:12:0: masked Direct-Access ATA TK0120GDJXT PHYS DRV 
SSDSmartPathCap- En- Exp=0 [ 5.944442] hpsa 0000:03:00.0: scsi 6:0:13:0: masked Direct-Access ATA TK0120GDJXT PHYS DRV SSDSmartPathCap- En- Exp=0 [ 5.957609] hpsa 0000:03:00.0: scsi 6:0:14:0: masked Enclosure HPE 12G SAS Exp Card enclosure SSDSmartPathCap- En- Exp=0 [ 5.970871] hpsa 0000:03:00.0: scsi 6:1:0:0: added Direct-Access HP LOGICAL VOLUME RAID-1(+0) SSDSmartPathCap+ En+ Exp=1 [ 5.984038] hpsa 0000:03:00.0: scsi 6:1:0:1: added Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap+ En+ Exp=1 [ 5.996822] hpsa 0000:03:00.0: scsi 6:1:0:2: added Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap+ En+ Exp=1 [ 6.009606] hpsa 0000:03:00.0: scsi 6:1:0:3: added Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [ 6.022391] hpsa 0000:03:00.0: scsi 6:1:0:4: added Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [ 6.035176] hpsa 0000:03:00.0: scsi 6:1:0:5: added Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [ 6.047960] hpsa 0000:03:00.0: scsi 6:1:0:6: added Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [ 6.060759] hpsa 0000:03:00.0: scsi 6:1:0:7: added Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [ 6.073545] hpsa 0000:03:00.0: scsi 6:1:0:8: added Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [ 6.086329] hpsa 0000:03:00.0: scsi 6:1:0:9: added Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [ 6.099113] hpsa 0000:03:00.0: scsi 6:1:0:10: added Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [ 6.111991] hpsa 0000:03:00.0: scsi 6:1:0:11: added Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [ 6.124869] hpsa 0000:03:00.0: scsi 6:1:0:12: added Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 [ 6.138251] scsi 6:0:0:0: RAID HP P440ar 6.60 PQ: 0 ANSI: 5 [ 6.147610] scsi 6:1:0:0: Direct-Access HP LOGICAL VOLUME 6.60 PQ: 0 ANSI: 5 [ 6.156967] scsi 6:1:0:1: Direct-Access HP LOGICAL VOLUME 6.60 PQ: 0 ANSI: 5 [ 6.171837] scsi 6:1:0:2: Direct-Access HP LOGICAL VOLUME 6.60 PQ: 0 ANSI: 5 [ 6.181197] scsi 6:1:0:3: Direct-Access HP LOGICAL VOLUME 6.60 PQ: 0 ANSI: 5 [ 6.190653] scsi 6:1:0:4: Direct-Access HP LOGICAL VOLUME 6.60 PQ: 0 ANSI: 5 [ 6.200015] scsi 6:1:0:5: Direct-Access HP LOGICAL VOLUME 6.60 PQ: 0 ANSI: 5 [ 6.200420] scsi 6:1:0:6: Direct-Access HP LOGICAL VOLUME 6.60 PQ: 0 ANSI: 5 [ 6.200813] scsi 6:1:0:7: Direct-Access HP LOGICAL VOLUME 6.60 PQ: 0 ANSI: 5 [ 6.201205] scsi 6:1:0:8: Direct-Access HP LOGICAL VOLUME 6.60 PQ: 0 ANSI: 5 [ 6.201599] scsi 6:1:0:9: Direct-Access HP LOGICAL VOLUME 6.60 PQ: 0 ANSI: 5 [ 6.201999] scsi 6:1:0:10: Direct-Access HP LOGICAL VOLUME 6.60 PQ: 0 ANSI: 5 [ 6.202395] scsi 6:1:0:11: Direct-Access HP LOGICAL VOLUME 6.60 PQ: 0 ANSI: 5 [ 6.202789] scsi 6:1:0:12: Direct-Access HP LOGICAL VOLUME 6.60 PQ: 0 ANSI: 5 [ 6.205267] scsi 4:0:0:0: Direct-Access ATA VR0120GEJXL HPG0 PQ: 0 ANSI: 5 [ 6.205610] scsi 5:0:0:0: Direct-Access ATA VR0120GEJXL HPG0 PQ: 0 ANSI: 5 [ 15.324913] shpchp: Standard Hot Plug PCI Controller Driver version: 0.4 [ 16.681743] hpwdt 0000:01:00.0: HP Watchdog Timer Driver: NMI decoding initialized, allow kernel dump: ON (default = 1/ON) [ 16.681825] hpwdt 0000:01:00.0: HP Watchdog Timer Driver: 1.3.3, timer margin: 30 seconds (nowayout=0). 
[ 35.467951] hpsa 0000:03:00.0: Acknowledging event: 0xc0000000 (HP SSD Smart Path configuration change) [ 35.636446] hpsa 0000:03:00.0: scsi 6:1:0:0: updated Direct-Access HP LOGICAL VOLUME RAID-1(+0) SSDSmartPathCap+ En+ Exp=1 [ 35.636452] hpsa 0000:03:00.0: scsi 6:1:0:1: updated Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap+ En+ Exp=1 [ 35.636457] hpsa 0000:03:00.0: scsi 6:1:0:2: updated Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap+ En+ Exp=1

Moved from Ubuntu 16.04.5 to CentOS 7.5 with the hpsa kernel module (kmod-hpsa-3.4.20-141.rhel7u5.x86_64.rpm) from the HPE website. It has been running without a kernel panic for more than a week.

Compiling the hpsa kernel module from SourceForge on Ubuntu 16.04 with kernel 4.4 solved the issue for us. Steps:

# apt-get install dkms build-essential
# tar xjvf hpsa-3.4.20-141.tar.bz2
# cd hpsa-3.4.20/drivers/
# sudo cp -a scsi /usr/src/hpsa-3.4.20.141
# dkms add -m hpsa -v 3.4.20.141
# dkms build -m hpsa -v 3.4.20.141
# dkms install -m hpsa -v 3.4.20.141

Link: https://sourceforge.net/projects/cciss/

I have now compiled the hpsa driver 3.4.20.141 into kernel 4.19.13. I still see the same behavior: a heavy load (3000) and all disks of the controller unavailable. But this time it is not the reset that triggers the bug; here are the logs I have.

First of all, one disk returns a lot of critical medium errors:

```
[Wed Jan 23 15:55:34 2019] print_req_error: critical medium error, dev sdt, sector 13836632
[Wed Jan 23 15:55:34 2019] sd 3:1:0:19: [sdt] Unaligned partial completion (resid=52, sector_sz=512)
[Wed Jan 23 15:55:35 2019] sd 3:1:0:19: [sdt] Unaligned partial completion (resid=48, sector_sz=512)
[Wed Jan 23 15:55:35 2019] sd 3:1:0:19: [sdt] Unaligned partial completion (resid=32, sector_sz=512)
[Wed Jan 23 15:55:35 2019] sd 3:1:0:19: [sdt] Unaligned partial completion (resid=52, sector_sz=512)
[Wed Jan 23 15:55:52 2019] sd 3:1:0:19: [sdt] Unaligned partial completion (resid=32, sector_sz=512)
[Wed Jan 23 15:55:52 2019] scsi_io_completion_action: 5 callbacks suppressed
[Wed Jan 23 15:55:52 2019] sd 3:1:0:19: [sdt] tag#23 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[Wed Jan 23 15:55:52 2019] sd 3:1:0:19: [sdt] tag#23 Sense Key : Medium Error [current]
[Wed Jan 23 15:55:52 2019] sd 3:1:0:19: [sdt] tag#23 Add. Sense: Unrecovered read error
[Wed Jan 23 15:55:52 2019] sd 3:1:0:19: [sdt] tag#23 CDB: Read(16) 88 00 00 00 00 00 00 d3 21 58 00 00 00 08 00 00
[Wed Jan 23 15:55:52 2019] print_req_error: 5 callbacks suppressed
[Wed Jan 23 15:55:52 2019] print_req_error: critical medium error, dev sdt, sector 13836632
```

After this, hpsa reports some failed inquiries:

```
[Wed Jan 23 15:57:07 2019] hpsa 0000:08:00.0: aborted: NULL_SDEV_PTR TAG:0x00000000:00000770 LUN:000000c000003901 CDB:12000000310000000000000000000000 [Wed Jan 23 15:57:07 2019] hpsa 0000:08:00.0: hpsa_update_device_info: inquiry failed, device will be skipped. [Wed Jan 23 15:57:08 2019] hpsa 0000:08:00.0: removed scsi 3:0:50:0: Direct-Access ATA MB4000GCWDC PHYS DRV SSDSmartPathCap- En- Exp=0 qd=14 [Wed Jan 23 15:57:31 2019] hpsa 0000:08:00.0: aborted: NULL_SDEV_PTR TAG:0x00000000:000015c0 LUN:000000c000003901 CDB:12000000310000000000000000000000 [Wed Jan 23 15:57:31 2019] hpsa 0000:08:00.0: hpsa_update_device_info: inquiry failed, device will be skipped.
[Wed Jan 23 15:57:54 2019] hpsa 0000:08:00.0: aborted: NULL_SDEV_PTR TAG:0x00000000:00000e70 LUN:000000c000003901 CDB:12000000310000000000000000000000 [Wed Jan 23 15:57:54 2019] hpsa 0000:08:00.0: hpsa_update_device_info: inquiry failed, device will be skipped. [Wed Jan 23 15:59:04 2019] hpsa 0000:08:00.0: aborted: NULL_SDEV_PTR TAG:0x00000000:00002650 LUN:000000c000003901 CDB:12000000310000000000000000000000 [Wed Jan 23 15:59:04 2019] hpsa 0000:08:00.0: hpsa_update_device_info: inquiry failed, device will be skipped. [Wed Jan 23 15:59:28 2019] hpsa 0000:08:00.0: aborted: NULL_SDEV_PTR TAG:0x00000000:00001400 LUN:000000c000003901 CDB:12000000310000000000000000000000 [Wed Jan 23 15:59:28 2019] hpsa 0000:08:00.0: hpsa_update_device_info: inquiry failed, device will be skipped. [Wed Jan 23 15:59:51 2019] hpsa 0000:08:00.0: aborted: NULL_SDEV_PTR TAG:0x00000000:00001400 LUN:000000c000003901 CDB:12000000310000000000000000000000 [Wed Jan 23 15:59:51 2019] hpsa 0000:08:00.0: hpsa_update_device_info: inquiry failed, device will be skipped. ``` And following this Call Trace: ``` Wed Jan 23 16:00:19 2019] INFO: task task:12406 blocked for more than 120 seconds. [Wed Jan 23 16:00:19 2019] Not tainted 4.19.13-dailymotion #1 [Wed Jan 23 16:00:19 2019] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [Wed Jan 23 16:00:19 2019] task D 0 12406 12384 0x00000000 [Wed Jan 23 16:00:19 2019] Call Trace: [Wed Jan 23 16:00:19 2019] ? __schedule+0x2b7/0x880 [Wed Jan 23 16:00:19 2019] ? bit_wait+0x50/0x50 [Wed Jan 23 16:00:19 2019] schedule+0x28/0x80 [Wed Jan 23 16:00:19 2019] io_schedule+0x12/0x40 [Wed Jan 23 16:00:19 2019] bit_wait_io+0xd/0x50 [Wed Jan 23 16:00:19 2019] __wait_on_bit+0x44/0x80 [Wed Jan 23 16:00:19 2019] out_of_line_wait_on_bit+0x91/0xb0 [Wed Jan 23 16:00:19 2019] ? init_wait_var_entry+0x40/0x40 [Wed Jan 23 16:00:19 2019] __ext4_get_inode_loc+0x1a4/0x3f0 [Wed Jan 23 16:00:19 2019] ext4_iget+0x8f/0xbb0 [Wed Jan 23 16:00:19 2019] ? d_alloc_parallel+0x9d/0x4a0 [Wed Jan 23 16:00:19 2019] ext4_lookup+0xda/0x200 [Wed Jan 23 16:00:19 2019] __lookup_slow+0x97/0x150 [Wed Jan 23 16:00:19 2019] lookup_slow+0x35/0x50 [Wed Jan 23 16:00:19 2019] walk_component+0x1c4/0x340 [Wed Jan 23 16:00:19 2019] link_path_walk.part.33+0x2a6/0x510 [Wed Jan 23 16:00:19 2019] ? path_init+0x190/0x310 [Wed Jan 23 16:00:19 2019] path_openat+0xdd/0x1540 [Wed Jan 23 16:00:19 2019] ? get_futex_key+0x2ed/0x3d0 [Wed Jan 23 16:00:19 2019] do_filp_open+0x9b/0x110 [Wed Jan 23 16:00:19 2019] ? __check_object_size+0xb1/0x1a0 [Wed Jan 23 16:00:19 2019] ? do_sys_open+0x1bd/0x250 [Wed Jan 23 16:00:19 2019] do_sys_open+0x1bd/0x250 [Wed Jan 23 16:00:19 2019] do_syscall_64+0x55/0x110 [Wed Jan 23 16:00:19 2019] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [Wed Jan 23 16:00:19 2019] RIP: 0033:0x7f1c109a5bfd [Wed Jan 23 16:00:19 2019] Code: Bad RIP value. 
[Wed Jan 23 16:00:19 2019] RSP: 002b:00007f1c075b7d90 EFLAGS: 00000293 ORIG_RAX: 0000000000000002 [Wed Jan 23 16:00:19 2019] RAX: ffffffffffffffda RBX: 00007f1bdc000af0 RCX: 00007f1c109a5bfd [Wed Jan 23 16:00:19 2019] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00007f1c075b7dc0 [Wed Jan 23 16:00:19 2019] RBP: 0000000000000000 R08: 0000564865cae4aa R09: 0000000000000000 [Wed Jan 23 16:00:19 2019] R10: 0000000000000004 R11: 0000000000000293 R12: 00007f1bdc0008c0 [Wed Jan 23 16:00:19 2019] R13: 00007f1c075b7dc0 R14: 00007f1c075b8f50 R15: 00007f1bd0000c78 [Wed Jan 23 16:01:41 2019] hpsa 0000:08:00.0: logical_reset scsi 3:1:0:19: Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1 qd=0 [Wed Jan 23 16:02:20 2019] INFO: task jbd2/sdac-8:9669 blocked for more than 120 seconds. [Wed Jan 23 16:02:20 2019] Not tainted 4.19.13-dailymotion #1 [Wed Jan 23 16:02:20 2019] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [Wed Jan 23 16:02:20 2019] jbd2/sdac-8 D 0 9669 2 0x80000000 [Wed Jan 23 16:02:20 2019] Call Trace: [Wed Jan 23 16:02:20 2019] ? __schedule+0x2b7/0x880 [Wed Jan 23 16:02:20 2019] ? __wake_up_common_lock+0x89/0xc0 [Wed Jan 23 16:02:20 2019] ? wait_woken+0x80/0x80 [Wed Jan 23 16:02:20 2019] schedule+0x28/0x80 [Wed Jan 23 16:02:20 2019] jbd2_journal_commit_transaction+0x246/0x1740 [Wed Jan 23 16:02:20 2019] ? __switch_to_asm+0x40/0x70 [Wed Jan 23 16:02:20 2019] ? __switch_to_asm+0x34/0x70 [Wed Jan 23 16:02:20 2019] ? __switch_to_asm+0x40/0x70 [Wed Jan 23 16:02:20 2019] ? __switch_to_asm+0x34/0x70 [Wed Jan 23 16:02:20 2019] ? __switch_to_asm+0x40/0x70 [Wed Jan 23 16:02:20 2019] ? __switch_to_asm+0x34/0x70 [Wed Jan 23 16:02:20 2019] ? __switch_to_asm+0x40/0x70 [Wed Jan 23 16:02:20 2019] ? __switch_to_asm+0x40/0x70 [Wed Jan 23 16:02:20 2019] ? __switch_to_asm+0x40/0x70 [Wed Jan 23 16:02:20 2019] ? __switch_to_asm+0x34/0x70 [Wed Jan 23 16:02:20 2019] ? __switch_to_asm+0x34/0x70 [Wed Jan 23 16:02:20 2019] ? wait_woken+0x80/0x80 [Wed Jan 23 16:02:20 2019] ? __switch_to_asm+0x34/0x70 [Wed Jan 23 16:02:20 2019] ? __switch_to_asm+0x40/0x70 [Wed Jan 23 16:02:20 2019] ? lock_timer_base+0x67/0x80 [Wed Jan 23 16:02:20 2019] ? kjournald2+0xbd/0x270 [Wed Jan 23 16:02:20 2019] kjournald2+0xbd/0x270 [Wed Jan 23 16:02:20 2019] ? __wake_up_common+0x74/0x120 [Wed Jan 23 16:02:20 2019] ? wait_woken+0x80/0x80 [Wed Jan 23 16:02:20 2019] ? commit_timeout+0x10/0x10 [Wed Jan 23 16:02:20 2019] kthread+0x113/0x130 [Wed Jan 23 16:02:20 2019] ? kthread_create_worker_on_cpu+0x70/0x70 [Wed Jan 23 16:02:20 2019] ret_from_fork+0x35/0x40 [Wed Jan 23 16:02:20 2019] INFO: task jbd2/sdab-8:9684 blocked for more than 120 seconds. [Wed Jan 23 16:02:20 2019] Not tainted 4.19.13-dailymotion #1 [Wed Jan 23 16:02:20 2019] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [Wed Jan 23 16:02:20 2019] jbd2/sdab-8 D 0 9684 2 0x80000000 [Wed Jan 23 16:02:20 2019] Call Trace: [Wed Jan 23 16:02:20 2019] ? __schedule+0x2b7/0x880 [Wed Jan 23 16:02:20 2019] ? bit_wait+0x50/0x50 [Wed Jan 23 16:02:20 2019] schedule+0x28/0x80 [Wed Jan 23 16:02:20 2019] io_schedule+0x12/0x40 [Wed Jan 23 16:02:20 2019] bit_wait_io+0xd/0x50 [Wed Jan 23 16:02:20 2019] __wait_on_bit+0x44/0x80 [Wed Jan 23 16:02:20 2019] out_of_line_wait_on_bit+0x91/0xb0 [Wed Jan 23 16:02:20 2019] ? init_wait_var_entry+0x40/0x40 [Wed Jan 23 16:02:20 2019] jbd2_journal_commit_transaction+0xd0d/0x1740 [Wed Jan 23 16:02:20 2019] ? __switch_to_asm+0x40/0x70 [Wed Jan 23 16:02:20 2019] ? 
__switch_to_asm+0x40/0x70
[Wed Jan 23 16:02:20 2019] ? kjournald2+0xbd/0x270
[Wed Jan 23 16:02:20 2019] kjournald2+0xbd/0x270
[Wed Jan 23 16:02:20 2019] ? wait_woken+0x80/0x80
[Wed Jan 23 16:02:20 2019] ? commit_timeout+0x10/0x10
[Wed Jan 23 16:02:20 2019] kthread+0x113/0x130
[Wed Jan 23 16:02:20 2019] ? kthread_create_worker_on_cpu+0x70/0x70
[Wed Jan 23 16:02:20 2019] ret_from_fork+0x35/0x40
[Wed Jan 23 16:02:20 2019] INFO: task jbd2/sdaa-8:9792 blocked for more than 120 seconds.
[Wed Jan 23 16:02:20 2019] Not tainted 4.19.13-dailymotion #1
[Wed Jan 23 16:02:20 2019] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Wed Jan 23 16:02:20 2019] jbd2/sdaa-8 D 0 9792 2 0x80000000
[Wed Jan 23 16:02:20 2019] Call Trace:
[Wed Jan 23 16:02:20 2019] ? __schedule+0x2b7/0x880
[Wed Jan 23 16:02:20 2019] ? bit_wait+0x50/0x50
[Wed Jan 23 16:02:20 2019] schedule+0x28/0x80
[Wed Jan 23 16:02:20 2019] io_schedule+0x12/0x40
[Wed Jan 23 16:02:20 2019] bit_wait_io+0xd/0x50
[Wed Jan 23 16:02:20 2019] __wait_on_bit+0x44/0x80
[Wed Jan 23 16:02:20 2019] out_of_line_wait_on_bit+0x91/0xb0
[Wed Jan 23 16:02:20 2019] ? init_wait_var_entry+0x40/0x40
[Wed Jan 23 16:02:20 2019] jbd2_journal_commit_transaction+0xd0d/0x1740
[Wed Jan 23 16:02:20 2019] ? __switch_to_asm+0x40/0x70
[Wed Jan 23 16:02:20 2019] ? __switch_to_asm+0x40/0x70
[Wed Jan 23 16:02:20 2019] ? kjournald2+0xbd/0x270
[Wed Jan 23 16:02:20 2019] kjournald2+0xbd/0x270
[Wed Jan 23 16:02:20 2019] ? __wake_up_common+0x74/0x120
[Wed Jan 23 16:02:20 2019] ? wait_woken+0x80/0x80
[Wed Jan 23 16:02:20 2019] ? commit_timeout+0x10/0x10
[Wed Jan 23 16:02:20 2019] kthread+0x113/0x130
[Wed Jan 23 16:02:20 2019] ? kthread_create_worker_on_cpu+0x70/0x70
[Wed Jan 23 16:02:20 2019] ret_from_fork+0x35/0x40
[Wed Jan 23 16:02:20 2019] INFO: task jbd2/sdb-8:9796 blocked for more than 120 seconds.
[Wed Jan 23 16:02:20 2019] Not tainted 4.19.13-dailymotion #1
[Wed Jan 23 16:02:20 2019] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Wed Jan 23 16:02:20 2019] jbd2/sdb-8 D 0 9796 2 0x80000000
[Wed Jan 23 16:02:20 2019] Call Trace:
[Wed Jan 23 16:02:20 2019] ? __schedule+0x2b7/0x880
[Wed Jan 23 16:02:20 2019] ? bit_wait+0x50/0x50
[Wed Jan 23 16:02:20 2019] schedule+0x28/0x80
[Wed Jan 23 16:02:20 2019] io_schedule+0x12/0x40
[Wed Jan 23 16:02:20 2019] bit_wait_io+0xd/0x50
[Wed Jan 23 16:02:20 2019] __wait_on_bit+0x44/0x80
[Wed Jan 23 16:02:20 2019] out_of_line_wait_on_bit+0x91/0xb0
[Wed Jan 23 16:02:20 2019] ? init_wait_var_entry+0x40/0x40
[Wed Jan 23 16:02:20 2019] jbd2_journal_commit_transaction+0xd0d/0x1740
[Wed Jan 23 16:02:20 2019] ? __switch_to_asm+0x40/0x70
[Wed Jan 23 16:02:20 2019] ? __switch_to_asm+0x40/0x70
[Wed Jan 23 16:02:20 2019] ? kjournald2+0xbd/0x270
[Wed Jan 23 16:02:20 2019] kjournald2+0xbd/0x270
[Wed Jan 23 16:02:20 2019] ? __wake_up_common+0x74/0x120
[Wed Jan 23 16:02:20 2019] ? wait_woken+0x80/0x80
[Wed Jan 23 16:02:20 2019] ? commit_timeout+0x10/0x10
[Wed Jan 23 16:02:20 2019] kthread+0x113/0x130
[Wed Jan 23 16:02:20 2019] ? kthread_create_worker_on_cpu+0x70/0x70
[Wed Jan 23 16:02:20 2019] ret_from_fork+0x35/0x40
[Wed Jan 23 16:02:20 2019] INFO: task jbd2/sdf-8:9939 blocked for more than 120 seconds.
[Wed Jan 23 16:02:20 2019] Not tainted 4.19.13-dailymotion #1
[Wed Jan 23 16:02:20 2019] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Wed Jan 23 16:02:20 2019] jbd2/sdf-8 D 0 9939 2 0x80000000
[Wed Jan 23 16:02:20 2019] Call Trace:
[Wed Jan 23 16:02:20 2019] ? __schedule+0x2b7/0x880
[Wed Jan 23 16:02:20 2019] ? __wake_up_common_lock+0x89/0xc0
[Wed Jan 23 16:02:20 2019] ? wait_woken+0x80/0x80
[Wed Jan 23 16:02:20 2019] schedule+0x28/0x80
[Wed Jan 23 16:02:20 2019] jbd2_journal_commit_transaction+0x246/0x1740
[Wed Jan 23 16:02:20 2019] ? __switch_to_asm+0x40/0x70
[Wed Jan 23 16:02:20 2019] ? __switch_to_asm+0x34/0x70
[Wed Jan 23 16:02:20 2019] ? __switch_to_asm+0x40/0x70
[Wed Jan 23 16:02:20 2019] ? __switch_to_asm+0x34/0x70
[Wed Jan 23 16:02:20 2019] ? __switch_to_asm+0x40/0x70
[Wed Jan 23 16:02:20 2019] ? __switch_to_asm+0x34/0x70
[Wed Jan 23 16:02:20 2019] ? __switch_to_asm+0x40/0x70
[Wed Jan 23 16:02:20 2019] ? __switch_to_asm+0x40/0x70
[Wed Jan 23 16:02:20 2019] ? __switch_to_asm+0x40/0x70
[Wed Jan 23 16:02:20 2019] ? __switch_to_asm+0x34/0x70
[Wed Jan 23 16:02:20 2019] ? __switch_to_asm+0x34/0x70
[Wed Jan 23 16:02:20 2019] ? wait_woken+0x80/0x80
[Wed Jan 23 16:02:20 2019] ? __switch_to_asm+0x34/0x70
[Wed Jan 23 16:02:20 2019] ? __switch_to_asm+0x40/0x70
[Wed Jan 23 16:02:20 2019] ? lock_timer_base+0x67/0x80
[Wed Jan 23 16:02:20 2019] ? kjournald2+0xbd/0x270
[Wed Jan 23 16:02:20 2019] kjournald2+0xbd/0x270
[Wed Jan 23 16:02:20 2019] ? __wake_up_common+0x74/0x120
[Wed Jan 23 16:02:20 2019] ? wait_woken+0x80/0x80
[Wed Jan 23 16:02:20 2019] ? commit_timeout+0x10/0x10
[Wed Jan 23 16:02:20 2019] kthread+0x113/0x130
[Wed Jan 23 16:02:20 2019] ? kthread_create_worker_on_cpu+0x70/0x70
[Wed Jan 23 16:02:20 2019] ret_from_fork+0x35/0x40
[Wed Jan 23 16:02:20 2019] INFO: task jbd2/sdc-8:9947 blocked for more than 120 seconds.
[Wed Jan 23 16:02:20 2019] Not tainted 4.19.13-dailymotion #1
[Wed Jan 23 16:02:20 2019] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Wed Jan 23 16:02:20 2019] jbd2/sdc-8 D 0 9947 2 0x80000000
[Wed Jan 23 16:02:20 2019] Call Trace:
[Wed Jan 23 16:02:20 2019] ? __schedule+0x2b7/0x880
[Wed Jan 23 16:02:20 2019] ? bit_wait+0x50/0x50
[Wed Jan 23 16:02:20 2019] schedule+0x28/0x80
[Wed Jan 23 16:02:20 2019] io_schedule+0x12/0x40
[Wed Jan 23 16:02:20 2019] bit_wait_io+0xd/0x50
[Wed Jan 23 16:02:20 2019] __wait_on_bit+0x44/0x80
[Wed Jan 23 16:02:20 2019] out_of_line_wait_on_bit+0x91/0xb0
[Wed Jan 23 16:02:20 2019] ? init_wait_var_entry+0x40/0x40
[Wed Jan 23 16:02:20 2019] jbd2_journal_commit_transaction+0xd0d/0x1740
[Wed Jan 23 16:02:20 2019] ? __switch_to_asm+0x40/0x70
[Wed Jan 23 16:02:20 2019] ? __switch_to_asm+0x40/0x70
[Wed Jan 23 16:02:20 2019] ? kjournald2+0xbd/0x270
[Wed Jan 23 16:02:20 2019] kjournald2+0xbd/0x270
[Wed Jan 23 16:02:20 2019] ? __wake_up_common+0x74/0x120
[Wed Jan 23 16:02:20 2019] ? wait_woken+0x80/0x80
[Wed Jan 23 16:02:20 2019] ? commit_timeout+0x10/0x10
[Wed Jan 23 16:02:20 2019] kthread+0x113/0x130
[Wed Jan 23 16:02:20 2019] ? kthread_create_worker_on_cpu+0x70/0x70
[Wed Jan 23 16:02:20 2019] ret_from_fork+0x35/0x40
[Wed Jan 23 16:02:20 2019] INFO: task task:10018 blocked for more than 120 seconds.
[Wed Jan 23 16:02:20 2019] Not tainted 4.19.13-dailymotion #1
[Wed Jan 23 16:02:20 2019] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Wed Jan 23 16:02:20 2019] task D 0 10018 10007 0x00000000
[Wed Jan 23 16:02:20 2019] Call Trace:
[Wed Jan 23 16:02:20 2019] ? __schedule+0x2b7/0x880
[Wed Jan 23 16:02:20 2019] ? futex_wait_queue_me+0xd3/0x120
[Wed Jan 23 16:02:20 2019] schedule+0x28/0x80
[Wed Jan 23 16:02:20 2019] rwsem_down_write_failed+0x15e/0x350
[Wed Jan 23 16:02:20 2019] ? call_rwsem_down_write_failed+0x13/0x20
[Wed Jan 23 16:02:20 2019] call_rwsem_down_write_failed+0x13/0x20
[Wed Jan 23 16:02:20 2019] down_write+0x29/0x40
[Wed Jan 23 16:02:20 2019] ext4_file_write_iter+0x96/0x3e0
[Wed Jan 23 16:02:20 2019] ? __sys_sendto+0xac/0x140
[Wed Jan 23 16:02:20 2019] __vfs_write+0x112/0x1a0
[Wed Jan 23 16:02:20 2019] vfs_write+0xad/0x1a0
[Wed Jan 23 16:02:20 2019] ksys_pwrite64+0x71/0x90
[Wed Jan 23 16:02:20 2019] do_syscall_64+0x55/0x110
[Wed Jan 23 16:02:20 2019] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[Wed Jan 23 16:02:20 2019] RIP: 0033:0x7ffa3666ad23
[Wed Jan 23 16:02:20 2019] Code: Bad RIP value.
[Wed Jan 23 16:02:20 2019] RSP: 002b:00007ffa32a88a50 EFLAGS: 00000293 ORIG_RAX: 0000000000000012
[Wed Jan 23 16:02:20 2019] RAX: ffffffffffffffda RBX: 0000000000000200 RCX: 00007ffa3666ad23
[Wed Jan 23 16:02:20 2019] RDX: 0000000000000200 RSI: 00007ffa32a88b30 RDI: 0000000000000013
[Wed Jan 23 16:02:20 2019] RBP: 0000000000000200 R08: 00007ffa32a88b08 R09: 00007ffa32a889f0
[Wed Jan 23 16:02:20 2019] R10: 00000003c2275000 R11: 0000000000000293 R12: 0000000000000000
[Wed Jan 23 16:02:20 2019] R13: 00000003c2275000 R14: 0000000000000013 R15: 00007ffa32a88b30
[Wed Jan 23 16:02:20 2019] INFO: task task:10021 blocked for more than 120 seconds.
[Wed Jan 23 16:02:20 2019] Not tainted 4.19.13-dailymotion #1
[Wed Jan 23 16:02:20 2019] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[Wed Jan 23 16:02:20 2019] task D 0 10021 10007 0x00000000
[Wed Jan 23 16:02:20 2019] Call Trace:
[Wed Jan 23 16:02:20 2019] ? __schedule+0x2b7/0x880
[Wed Jan 23 16:02:20 2019] ? futex_wait_queue_me+0xd3/0x120
[Wed Jan 23 16:02:20 2019] schedule+0x28/0x80
[Wed Jan 23 16:02:20 2019] rwsem_down_write_failed+0x15e/0x350
[Wed Jan 23 16:02:20 2019] ? call_rwsem_down_write_failed+0x13/0x20
[Wed Jan 23 16:02:20 2019] call_rwsem_down_write_failed+0x13/0x20
[Wed Jan 23 16:02:20 2019] down_write+0x29/0x40
[Wed Jan 23 16:02:20 2019] ext4_file_write_iter+0x96/0x3e0
[Wed Jan 23 16:02:20 2019] ? __sys_sendto+0xac/0x140
[Wed Jan 23 16:02:20 2019] __vfs_write+0x112/0x1a0
[Wed Jan 23 16:02:20 2019] vfs_write+0xad/0x1a0
[Wed Jan 23 16:02:20 2019] ksys_pwrite64+0x71/0x90
[Wed Jan 23 16:02:20 2019] do_syscall_64+0x55/0x110
[Wed Jan 23 16:02:20 2019] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[Wed Jan 23 16:02:20 2019] RIP: 0033:0x7ffa3666ad23
[Wed Jan 23 16:02:20 2019] Code: Bad RIP value.
[Wed Jan 23 16:02:20 2019] RSP: 002b:00007ffa31285a50 EFLAGS: 00000293 ORIG_RAX: 0000000000000012
[Wed Jan 23 16:02:20 2019] RAX: ffffffffffffffda RBX: 0000000000000200 RCX: 00007ffa3666ad23
[Wed Jan 23 16:02:20 2019] RDX: 0000000000000200 RSI: 00007ffa31285b30 RDI: 0000000000000013
[Wed Jan 23 16:02:20 2019] RBP: 0000000000000200 R08: 00007ffa31285b08 R09: 00007ffa312859f0
[Wed Jan 23 16:02:20 2019] R10: 00000003c2276000 R11: 0000000000000293 R12: 0000000000000000
[Wed Jan 23 16:02:20 2019] R13: 00000003c2276000 R14: 0000000000000013 R15: 00007ffa31285b30
```

The issue came back too... The only way so far to avoid the crashes has been to switch the card from RAID to HBA mode, but at a performance cost.

I have been re-writing the eh_reset path recently.

I found a race condition between the completion handler and the reset handler.

I have an update that I have been aggressively testing that seems to be holding up.

I am testing using 7 volumes: 2 SAS HBAs, 2 SATA HBAs, 1 Smart Path enabled R5, and 2 LVs.

My tests consist of issuing resets (sg_reset -d) to all volumes repeatedly while doing:
1. mke2fs to all 7 in parallel
2. mount all volumes in parallel
3. rsync data to all volumes in parallel
4. umount all volumes in parallel
5. fsck all volumes in parallel.

Before my update, I was having the above issue.

I intend to run the above tests over the weekend, then submit the update internally for code review/testing before I send the update to kernel.org.

Thanks,
Don Brace
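[Editor's note] For readers who want to reproduce a reset loop like the one described above: `sg_reset -d` ultimately issues the SG_SCSI_RESET ioctl with SG_SCSI_RESET_DEVICE. A minimal C sketch of such a loop follows; the device paths and pacing are made-up test values, not part of the reporter's or Don's setup, and it must be run as root against scratch volumes only while the mkfs/mount/rsync/umount/fsck load runs in parallel.

```c
/*
 * Illustrative only: a minimal reset loop approximating what
 * "sg_reset -d /dev/sdX" does, via the SG_SCSI_RESET ioctl.
 * The device paths and pacing below are hypothetical test values.
 */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <scsi/sg.h>

int main(void)
{
    const char *devs[] = { "/dev/sdb", "/dev/sdc" };  /* hypothetical test volumes */

    for (int round = 0; round < 100; round++) {
        for (unsigned i = 0; i < sizeof(devs) / sizeof(devs[0]); i++) {
            int fd = open(devs[i], O_RDWR | O_NONBLOCK);
            if (fd < 0) {
                perror(devs[i]);
                continue;
            }
            int op = SG_SCSI_RESET_DEVICE;  /* logical-unit reset, like sg_reset -d */
            if (ioctl(fd, SG_SCSI_RESET, &op) < 0)
                perror("SG_SCSI_RESET");
            close(fd);
        }
        sleep(5);  /* pause between reset rounds while the filesystem load runs */
    }
    return 0;
}
```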
(In reply to Don from comment #30)
> I have been re-writing the eh_reset path recently.
>
> I found a race condition between the completion handler and the reset
> handler.
>
> I have an update that I have been aggressively testing that seems to be
> holding up.
>
> I am testing using 7 volumes: 2 SAS HBAs, 2 SATA HBAs, 1 Smart Path enabled
> R5, and 2 LVs.
>
> My tests consist of issuing resets (sg_reset -d) to all volumes repeatedly
> while doing:
> 1. mke2fs to all 7 in parallel
> 2. mount all volumes in parallel
> 3. rsync data to all volumes in parallel
> 4. umount all volumes in parallel
> 5. fsck all volumes in parallel.
>
> Before my update, I was having the above issue.
>
> I intend to run the above tests over the weekend, then submit the update
> internally for code review/testing before I send the update to kernel.org.
>
> Thanks,
> Don Brace

Super news, Don. Please let me know when the patch/update is available; I'm interested in testing it.
Thanks and regards,

Created attachment 282479 [details]
Patch to correct resets

hpsa: correct device resets

- Correct a race condition that occurs between the reset handler and the completion handler. There are times when the wait_event condition is never met due to this race condition and the reset never completes.

The reset_pending field is NULL initially.

 t   Reset Handler Thread           Completion Thread
 --  --------------------           -----------------
 t1                                 if (c->reset_pending)
 t2  c->reset_pending = dev;        if (atomic_dec_and_test(counter))
 t3  atomic_inc(counter)            wake_up_all(event_sync_wait_queue)
 t4
 t5  wait_event(...counter == 0)

Kernel.org Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199435
Bug 199435 - HPSA + P420i resetting logical Direct-Access never complete

Here is the patch I am preparing to send up to kernel.org. I have been testing this patch for some time now and I feel it is ready.
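[Editor's note] To make the interleaving above concrete, here is a simplified sketch of the handshake the commit message describes. The names (reset_pending, counter, event_sync_wait_queue) follow the timeline; the structures and helper functions are invented for illustration and are not the literal hpsa code.

```c
/*
 * Simplified sketch of the reset/completion handshake in the timeline above.
 * Names follow the commit message; the types and helpers are illustrative
 * assumptions, not the actual hpsa driver code.
 */
#include <linux/wait.h>
#include <linux/atomic.h>

static DECLARE_WAIT_QUEUE_HEAD(event_sync_wait_queue);

struct dev_ctx {
    atomic_t counter;                  /* commands the reset still waits for */
};

struct cmd_ctx {
    struct dev_ctx *reset_pending;     /* non-NULL while a reset waits on this command */
};

/* Completion handler: only accounts for the command if the reset handler
 * has already marked it.  If the t1 check sees NULL, this command is never
 * counted down. */
static void complete_cmd(struct cmd_ctx *c)
{
    struct dev_ctx *dev = c->reset_pending;            /* t1 */

    if (dev) {
        c->reset_pending = NULL;
        if (atomic_dec_and_test(&dev->counter))         /* t2 */
            wake_up_all(&event_sync_wait_queue);         /* t3 */
    }
}

/* Reset handler: marks an outstanding command and bumps the counter, then
 * sleeps until every marked command has completed.  The real driver does
 * this for each command outstanding on the device. */
static void reset_and_wait(struct dev_ctx *dev, struct cmd_ctx *c)
{
    c->reset_pending = dev;                              /* t2 */
    atomic_inc(&dev->counter);                           /* t3 */

    wait_event(event_sync_wait_queue,                    /* t5 */
               atomic_read(&dev->counter) == 0);
}
```

If complete_cmd() runs its t1 check just before reset_and_wait() marks the command at t2, the increment at t3 is never balanced by a decrement, so the wait_event() at t5 never returns, which matches the hung resets reported in this bug.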
Hi Don,

Did you send it to kernel.org?
Any idea when the patch will be available in the kernel?

Is the patch compatible with the latest longterm kernel release, 4.19?

Thanks in advance for your response.

(In reply to Anthony Hausman from comment #33)
> Hi Don,
>
> Did you send it to kernel.org?
> Any idea when the patch will be available in the kernel?
>
> Is the patch compatible with the latest longterm kernel release, 4.19?
>
> Thanks in advance for your response.

I will be sending the patch up today. It will apply if all of the patches I am sending up are applied. Otherwise it will have to have some minor porting done to have it applied.

The patches I am sending up are:
hpsa-correct-simple-mode
hpsa-use-local-workqueues-instead-of-system-workqueues
hpsa-check-for-tag-collision
hpsa-wait-longer-for-ptraid-commands
hpsa-do-not-complete-cmds-for-deleted-devices
hpsa-correct-device-resets
hpsa-update-driver-version

Hi guys, I'm from the future. It's 2020 already! I have a similar (exactly the same?) problem with an HPSA P410, already on two of my nodes, with kernel 5.7.1-1.el7.elrepo.x86_64.

Here are the logs:

2020-06-16T14:59:00.8117 warning kern kernel [679613.058375] hpsa 0000:06:00.0: scsi 0:1:0:0: resetting logical Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1
2020-06-16T14:59:23.3999 info kern kernel [679635.647794] libceph: osd0 down
2020-06-16T14:59:23.3999 info kern kernel [679635.648599] libceph: osd6 down
2020-06-16T14:59:24.4468 warning kern kernel [679636.694762] rbd: rbd1: encountered watch error: -107
2020-06-16T14:59:24.4886 warning kern kernel [679636.736747] rbd: rbd2: encountered watch error: -107
2020-06-16T14:59:28.4377 info kern kernel [679640.685700] libceph: osd5 down
2020-06-16T14:59:36.6272 warning kern kernel [679648.874179] hpsa 0000:06:00.0: Controller lockup detected: 0x0015002f after 30
2020-06-16T14:59:36.6272 warning kern kernel [679648.875554] hpsa 0000:06:00.0: controller lockup detected: LUN:0000004000000000 CDB:01040000000000000000000000000000
2020-06-16T14:59:36.6272 warning kern kernel [679648.875591] hpsa 0000:06:00.0: failed 15 commands in fail_all
2020-06-16T14:59:36.6272 warning kern kernel [679648.876650] hpsa 0000:06:00.0: Controller lockup detected during reset wait
2020-06-16T14:59:36.6272 warning kern kernel [679648.876655] hpsa 0000:06:00.0: scsi 0:1:0:0: reset logical failed Direct-Access HP LOGICAL VOLUME RAID-0 SSDSmartPathCap- En- Exp=1
2020-06-16T14:59:36.6272 info kern kernel [679648.876667] sd 0:1:0:2: Device offlined - not ready after error recovery
2020-06-16T14:59:36.6272 info kern kernel [679648.876672] sd 0:1:0:0: Device offlined - not ready after error recovery
2020-06-16T14:59:36.6348 info kern kernel [679648.883168] sd 0:1:0:0: Device offlined - not ready after error recovery
2020-06-16T14:59:36.6348 info kern kernel [679648.884214] sd 0:1:0:0: Device offlined - not ready after error recovery
2020-06-16T14:59:36.6357 info kern kernel [679648.885286] sd 0:1:0:1: Device offlined - not ready after error recovery
2020-06-16T14:59:36.6367 info kern kernel [679648.886297] sd 0:1:0:0: Device offlined - not ready after error recovery
2020-06-16T14:59:36.6377 info kern kernel [679648.887301] sd 0:1:0:2: Device offlined - not ready after error recovery
2020-06-16T14:59:36.6395 info kern kernel [679648.888269] sd 0:1:0:2: Device offlined - not ready after error recovery
2020-06-16T14:59:36.6395 info kern kernel [679648.889193] sd 0:1:0:3: Device offlined - not ready after error recovery
2020-06-16T14:59:36.6419 info kern kernel [679648.890076] sd 0:1:0:0: Device offlined - not ready after error recovery
2020-06-16T14:59:36.6419 info kern kernel [679648.891496] sd 0:1:0:0: Device offlined - not ready after error recovery
2020-06-16T14:59:36.6447 info kern kernel [679648.893012] sd 0:1:0:0: Device offlined - not ready after error recovery
2020-06-16T14:59:36.6447 info kern kernel [679648.894114] sd 0:1:0:0: Device offlined - not ready after error recovery
2020-06-16T14:59:36.6466 info kern kernel [679648.895182] sd 0:1:0:0: Device offlined - not ready after error recovery
2020-06-16T14:59:36.6466 info kern kernel [679648.896204] sd 0:1:0:2: [sdc] tag#477 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=66s
2020-06-16T14:59:36.6489 info kern kernel [679648.897223] sd 0:1:0:2: [sdc] tag#477 CDB: Read(10) 28 00 00 ed 13 90 00 00 08 00
2020-06-16T14:59:36.6489 err kern kernel [679648.898309] blk_update_request: I/O error, dev sdc, sector 15537040 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
2020-06-16T14:59:36.6523 info kern kernel [679648.899489] sd 0:1:0:0: [sda] tag#469 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=70s
2020-06-16T14:59:36.6523 err kern kernel [679648.899956] sd 0:1:0:0: rejecting I/O to offline device
2020-06-16T14:59:36.6523 info kern kernel [679648.900659] sd 0:1:0:0: [sda] tag#469 CDB: Read(10) 28 00 14 78 2f e0 00 00 10 00
2020-06-16T14:59:36.6524 err kern kernel [679648.901820] blk_update_request: I/O error, dev sda, sector 929142240 op 0x0:(READ) flags 0x3000 phys_seg 1 prio class 0
2020-06-16T14:59:36.6537 err kern kernel [679648.903092] blk_update_request: I/O error, dev sda, sector 343420896 op 0x0:(READ) flags 0x80700 phys_seg 2 prio class 0
2020-06-16T14:59:36.6537 info kern kernel [679648.903138] sd 0:1:0:0: [sda] tag#470 FAILED Result: hostbyte=DID_NO_CONNECT driverbyte=DRIVER_OK cmd_age=70s

/usr/sbin/hpssacli ctrl all show detail

Smart Array P410 in Slot 1
   Bus Interface: PCI
   Slot: 1
   Serial Number: PACCRID122902CV
   Cache Serial Number: PBCDH0CRH2K24K
   Controller Status: OK
   Hardware Revision: C
   Firmware Version: 6.64
   Rebuild Priority: Medium
   Expand Priority: Medium
   Surface Scan Delay: 3 secs
   Surface Scan Mode: Idle
   Parallel Surface Scan Supported: No
   Queue Depth: Automatic
   Monitor and Performance Delay: 60 min
   Elevator Sort: Enabled
   Degraded Performance Optimization: Disabled
   Inconsistency Repair Policy: Disabled
   Wait for Cache Room: Disabled
   Surface Analysis Inconsistency Notification: Disabled
   Post Prompt Timeout: 15 secs
   Cache Board Present: True
   Cache Status: OK
   Cache Status Details: The current array controller had valid data stored in its battery/capacitor backed write cache the last time it was reset or was powered up. This indicates that the system may not have been shut down gracefully. The array controller has automatically written, or has attempted to write, this data to the drives. This message will continue to be displayed until the next reset or power-cycle of the array controller.
   Cache Ratio: 25% Read / 75% Write
   Drive Write Cache: Disabled
   Total Cache Size: 512 MB
   Total Cache Memory Available: 400 MB
   No-Battery Write Cache: Disabled
   Cache Backup Power Source: Capacitors
   Battery/Capacitor Count: 1
   Battery/Capacitor Status: OK
   SATA NCQ Supported: True
   Number of Ports: 2 Internal only
   Driver Name: hpsa
   Driver Version: 3.4.20
   Driver Supports HPE SSD Smart Path: True
   PCI Address (Domain:Bus:Device.Function): 0000:06:00.0
   Sanitize Erase Supported: False
   Primary Boot Volume: logicaldrive 1 (600508B1001C3DAA9705279AD5D8DABA)
   Secondary Boot Volume: None

Seems like related: https://bugzilla.kernel.org/show_bug.cgi?id=199435

Sry ^ - wrong thread. Created new bug as the controller differs in my case: https://bugzilla.kernel.org/show_bug.cgi?id=208215