Bug 40552 - DRBD synchronization stalls
Summary: DRBD synchronization stalls
Status: RESOLVED CODE_FIX
Alias: None
Product: IO/Storage
Classification: Unclassified
Component: Other
Hardware: All
OS: Linux
Importance: P1 normal
Assignee: Lars Ellenberg
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-08-04 18:56 UTC by Bart Van Assche
Modified: 2012-08-30 09:38 UTC
CC List: 3 users

See Also:
Kernel Version: 3.1-rc0
Subsystem:
Regression: No
Bisected commit-id:


Attachments

Description Bart Van Assche 2011-08-04 18:56:48 UTC
Kernel version: v3.0-7347-g288d5ab (most recent commit 288d5abec8314ae50fe6692f324b0444acae8486, dated August 3, 2011).

While syncing a DRBD volume (on top of an LVM logical volume) for the first time, the following complaint appeared in the kernel log:

block drbd0: Implicitly set pdsk from Inconsistent to UpToDate
block drbd0: conn( Connected -> VerifyS ) disk( Inconsistent -> UpToDate ) pdsk( Inconsistent -> UpToDate ) 
block drbd0: Starting Online Verify from sector 0

=========================================================
[ INFO: possible irq lock inversion dependency detected ]
3.0.0-scst-debug+ #6
---------------------------------------------------------
drbd0_receiver/5871 just changed the state of lock:
(&(&mdev->req_lock)->rlock){..-...}, at: [<ffffffffa023e072>] drbd_endio_read_sec_final+0x42/0x330 [drbd]
but this lock took another, SOFTIRQ-unsafe lock in the past:
(&(&mdev->peer_seq_lock)->rlock){+.+...}

and interrupts could create inverse lock ordering between them.


other info that might help us debug this:
Possible interrupt unsafe locking scenario:

      CPU0                    CPU1
      ----                    ----
 lock(&(&mdev->peer_seq_lock)->rlock);
                              local_irq_disable();
                              lock(&(&mdev->req_lock)->rlock);
                              lock(&(&mdev->peer_seq_lock)->rlock);
 <Interrupt>
   lock(&(&mdev->req_lock)->rlock);

*** DEADLOCK ***

no locks held by drbd0_receiver/5871.

the shortest dependencies between 2nd lock and 1st lock:
-> (&(&mdev->peer_seq_lock)->rlock){+.+...} ops: 3 {
   HARDIRQ-ON-W at:
                         [<ffffffff81088b14>] __lock_acquire+0x754/0x1d70
                         [<ffffffff8108a725>] lock_acquire+0x85/0x130
                         [<ffffffff8140438b>] _raw_spin_lock+0x3b/0x50
                         [<ffffffffa0263604>] drbd_nl_syncer_conf+0x524/0xbe0 [drbd]
                         [<ffffffffa02689e9>] drbd_connector_callback+0x139/0x750 [drbd]
                         [<ffffffff81298a7b>] cn_rx_skb+0xdb/0x100
                         [<ffffffff81357603>] netlink_unicast+0x2a3/0x2f0
                         [<ffffffff8135787b>] netlink_sendmsg+0x22b/0x360
                         [<ffffffff8131f971>] sock_sendmsg+0xe1/0x120
                         [<ffffffff813227bd>] sys_sendto+0x10d/0x140
                         [<ffffffff814059ab>] system_call_fastpath+0x16/0x1b
   SOFTIRQ-ON-W at:
                         [<ffffffff81088b4d>] __lock_acquire+0x78d/0x1d70
                         [<ffffffff8108a725>] lock_acquire+0x85/0x130
                         [<ffffffff8140438b>] _raw_spin_lock+0x3b/0x50
                         [<ffffffffa0263604>] drbd_nl_syncer_conf+0x524/0xbe0 [drbd]
                         [<ffffffffa02689e9>] drbd_connector_callback+0x139/0x750 [drbd]
                         [<ffffffff81298a7b>] cn_rx_skb+0xdb/0x100
                         [<ffffffff81357603>] netlink_unicast+0x2a3/0x2f0
                         [<ffffffff8135787b>] netlink_sendmsg+0x22b/0x360
                         [<ffffffff8131f971>] sock_sendmsg+0xe1/0x120
                         [<ffffffff813227bd>] sys_sendto+0x10d/0x140
                         [<ffffffff814059ab>] system_call_fastpath+0x16/0x1b
   INITIAL USE at:
                        [<ffffffff810887a4>] __lock_acquire+0x3e4/0x1d70
                        [<ffffffff8108a725>] lock_acquire+0x85/0x130
                        [<ffffffff8140438b>] _raw_spin_lock+0x3b/0x50
                        [<ffffffffa0263604>] drbd_nl_syncer_conf+0x524/0xbe0 [drbd]
                        [<ffffffffa02689e9>] drbd_connector_callback+0x139/0x750 [drbd]
                        [<ffffffff81298a7b>] cn_rx_skb+0xdb/0x100
                        [<ffffffff81357603>] netlink_unicast+0x2a3/0x2f0
                        [<ffffffff8135787b>] netlink_sendmsg+0x22b/0x360
                        [<ffffffff8131f971>] sock_sendmsg+0xe1/0x120
                        [<ffffffff813227bd>] sys_sendto+0x10d/0x140
                        [<ffffffff814059ab>] system_call_fastpath+0x16/0x1b
 }
 ... key      at: [<ffffffffa0275a5c>] __key.46241+0x0/0xffffffffffff35a4 [drbd]
 ... acquired at:
  [<ffffffff8108a725>] lock_acquire+0x85/0x130
  [<ffffffff814044a6>] _raw_spin_lock_irqsave+0x46/0x60
  [<ffffffff81048429>] try_to_wake_up+0x39/0x300
  [<ffffffff81048702>] default_wake_function+0x12/0x20
  [<ffffffff810721cd>] autoremove_wake_function+0x1d/0x50
  [<ffffffff81034b49>] __wake_up_common+0x59/0x90
  [<ffffffff81037d98>] __wake_up+0x48/0x70
  [<ffffffffa02421b2>] got_RqSReply+0x42/0x90 [drbd]
  [<ffffffffa024d957>] drbd_asender+0x3b7/0x5f0 [drbd]
  [<ffffffffa0255262>] drbd_thread_setup+0x72/0x100 [drbd]
  [<ffffffff81071c06>] kthread+0x96/0xa0
  [<ffffffff81407ad4>] kernel_thread_helper+0x4/0x10

-> (&(&mdev->req_lock)->rlock){..-...} ops: 145 {
  IN-SOFTIRQ-W at:
                       [<ffffffff81088ad5>] __lock_acquire+0x715/0x1d70
                       [<ffffffff8108a725>] lock_acquire+0x85/0x130
                       [<ffffffff814044a6>] _raw_spin_lock_irqsave+0x46/0x60
                       [<ffffffffa023e072>] drbd_endio_read_sec_final+0x42/0x330 [drbd]
                       [<ffffffffa023e3d9>] drbd_endio_sec+0x79/0x4d0 [drbd]
                       [<ffffffff811611bd>] bio_endio+0x1d/0x40
                       [<ffffffffa02232c4>] dec_pending+0x1f4/0x3c0 [dm_mod]
                       [<ffffffffa0223533>] clone_endio+0xa3/0xe0 [dm_mod]
                       [<ffffffff811611bd>] bio_endio+0x1d/0x40
                       [<ffffffff811c147b>] req_bio_endio.clone.41+0xab/0xf0
                       [<ffffffff811c3640>] blk_update_request+0x100/0x570
                       [<ffffffff811c3ae4>] blk_update_bidi_request+0x34/0xa0
                       [<ffffffff811c3ccf>] blk_end_bidi_request+0x2f/0x80
                       [<ffffffff811c3d60>] blk_end_request+0x10/0x20
                       [<ffffffff812b3f7f>] scsi_io_completion+0xaf/0x630
                       [<ffffffff812aace8>] scsi_finish_command+0xc8/0x130
                       [<ffffffff812b3dcf>] scsi_softirq_done+0x13f/0x160
                       [<ffffffff811ca4a0>] blk_done_softirq+0x80/0xa0
                       [<ffffffff81055839>] __do_softirq+0xc9/0x230
                       [<ffffffff81407bcc>] call_softirq+0x1c/0x30
                       [<ffffffff81004c0d>] do_softirq+0x8d/0xc0
                       [<ffffffff81055d0e>] irq_exit+0x9e/0xc0
                       [<ffffffff81408176>] do_IRQ+0x66/0xe0
                       [<ffffffff81405070>] ret_from_intr+0x0/0x13
                       [<ffffffff812d945c>] mptspi_qcmd+0x8c/0xd0
                       [<ffffffff812ac05e>] scsi_dispatch_cmd+0x13e/0x340
                       [<ffffffff812b30e7>] scsi_request_fn+0x397/0x490
                       [<ffffffff811c545f>] __make_request+0x32f/0x350
                       [<ffffffff811c2ea9>] generic_make_request+0x1f9/0x5c0
                       [<ffffffffa024943a>] drbd_submit_ee+0x13a/0x230 [drbd]
                       [<ffffffffa024ab96>] receive_DataRequest+0x4b6/0x850 [drbd]
                       [<ffffffffa024d366>] drbdd_init+0xf16/0x1150 [drbd]
                       [<ffffffffa0255262>] drbd_thread_setup+0x72/0x100 [drbd]
                       [<ffffffff81071c06>] kthread+0x96/0xa0
                       [<ffffffff81407ad4>] kernel_thread_helper+0x4/0x10
  INITIAL USE at:
                      [<ffffffff810887a4>] __lock_acquire+0x3e4/0x1d70
                      [<ffffffff8108a725>] lock_acquire+0x85/0x130
                      [<ffffffff814044a6>] _raw_spin_lock_irqsave+0x46/0x60
                      [<ffffffffa025d554>] drbd_req_state+0x74/0x4a0 [drbd]
                      [<ffffffffa025d9b2>] _drbd_request_state+0x32/0xe0 [drbd]
                      [<ffffffffa026674c>] drbd_nl_disk_conf+0x72c/0x11c0 [drbd]
                      [<ffffffffa02689e9>] drbd_connector_callback+0x139/0x750 [drbd]
                      [<ffffffff81298a7b>] cn_rx_skb+0xdb/0x100
                      [<ffffffff81357603>] netlink_unicast+0x2a3/0x2f0
                      [<ffffffff8135787b>] netlink_sendmsg+0x22b/0x360
                      [<ffffffff8131f971>] sock_sendmsg+0xe1/0x120
                      [<ffffffff813227bd>] sys_sendto+0x10d/0x140
                      [<ffffffff814059ab>] system_call_fastpath+0x16/0x1b
}
... key      at: [<ffffffffa0275a64>] __key.46240+0x0/0xffffffffffff359c [drbd]
... acquired at:
  [<ffffffff81087564>] check_usage_forwards+0x94/0x110
  [<ffffffff8108816b>] mark_lock+0x19b/0x3f0
  [<ffffffff81088ad5>] __lock_acquire+0x715/0x1d70
  [<ffffffff8108a725>] lock_acquire+0x85/0x130
  [<ffffffff814044a6>] _raw_spin_lock_irqsave+0x46/0x60
  [<ffffffffa023e072>] drbd_endio_read_sec_final+0x42/0x330 [drbd]
  [<ffffffffa023e3d9>] drbd_endio_sec+0x79/0x4d0 [drbd]
  [<ffffffff811611bd>] bio_endio+0x1d/0x40
  [<ffffffffa02232c4>] dec_pending+0x1f4/0x3c0 [dm_mod]
  [<ffffffffa0223533>] clone_endio+0xa3/0xe0 [dm_mod]
  [<ffffffff811611bd>] bio_endio+0x1d/0x40
  [<ffffffff811c147b>] req_bio_endio.clone.41+0xab/0xf0
  [<ffffffff811c3640>] blk_update_request+0x100/0x570
  [<ffffffff811c3ae4>] blk_update_bidi_request+0x34/0xa0
  [<ffffffff811c3ccf>] blk_end_bidi_request+0x2f/0x80
  [<ffffffff811c3d60>] blk_end_request+0x10/0x20
  [<ffffffff812b3f7f>] scsi_io_completion+0xaf/0x630
  [<ffffffff812aace8>] scsi_finish_command+0xc8/0x130
  [<ffffffff812b3dcf>] scsi_softirq_done+0x13f/0x160
  [<ffffffff811ca4a0>] blk_done_softirq+0x80/0xa0
  [<ffffffff81055839>] __do_softirq+0xc9/0x230
  [<ffffffff81407bcc>] call_softirq+0x1c/0x30
  [<ffffffff81004c0d>] do_softirq+0x8d/0xc0
  [<ffffffff81055d0e>] irq_exit+0x9e/0xc0
  [<ffffffff81408176>] do_IRQ+0x66/0xe0
  [<ffffffff81405070>] ret_from_intr+0x0/0x13
  [<ffffffff812d945c>] mptspi_qcmd+0x8c/0xd0
  [<ffffffff812ac05e>] scsi_dispatch_cmd+0x13e/0x340
  [<ffffffff812b30e7>] scsi_request_fn+0x397/0x490
  [<ffffffff811c545f>] __make_request+0x32f/0x350
  [<ffffffff811c2ea9>] generic_make_request+0x1f9/0x5c0
  [<ffffffffa024943a>] drbd_submit_ee+0x13a/0x230 [drbd]
  [<ffffffffa024ab96>] receive_DataRequest+0x4b6/0x850 [drbd]
  [<ffffffffa024d366>] drbdd_init+0xf16/0x1150 [drbd]
  [<ffffffffa0255262>] drbd_thread_setup+0x72/0x100 [drbd]
  [<ffffffff81071c06>] kthread+0x96/0xa0
  [<ffffffff81407ad4>] kernel_thread_helper+0x4/0x10


stack backtrace:
Pid: 5871, comm: drbd0_receiver Not tainted 3.0.0-scst-debug+ #6
Call Trace:
<IRQ>  [<ffffffff81087486>] print_irq_inversion_bug+0x1b6/0x200
[<ffffffff810874d0>] ? print_irq_inversion_bug+0x200/0x200
[<ffffffff81087564>] check_usage_forwards+0x94/0x110
[<ffffffff8108816b>] mark_lock+0x19b/0x3f0
[<ffffffff81404d25>] ? _raw_spin_unlock_irqrestore+0x65/0x80
[<ffffffff81088ad5>] __lock_acquire+0x715/0x1d70
[<ffffffff81125b97>] ? __slab_free+0x377/0x440
[<ffffffff8112500a>] ? free_debug_processing+0x1ca/0x350
[<ffffffff810e4f00>] ? mempool_free_pages+0x10/0x10
[<ffffffff8108b1be>] ? trace_hardirqs_on_caller+0xde/0x1a0
[<ffffffff8108a725>] lock_acquire+0x85/0x130
[<ffffffffa023e072>] ? drbd_endio_read_sec_final+0x42/0x330 [drbd]
[<ffffffff8140447c>] ? _raw_spin_lock_irqsave+0x1c/0x60
[<ffffffff814044a6>] _raw_spin_lock_irqsave+0x46/0x60
[<ffffffffa023e072>] ? drbd_endio_read_sec_final+0x42/0x330 [drbd]
[<ffffffffa023e072>] drbd_endio_read_sec_final+0x42/0x330 [drbd]
[<ffffffffa023e3d9>] drbd_endio_sec+0x79/0x4d0 [drbd]
[<ffffffff811611bd>] bio_endio+0x1d/0x40
[<ffffffffa02232c4>] dec_pending+0x1f4/0x3c0 [dm_mod]
[<ffffffffa022311c>] ? dec_pending+0x4c/0x3c0 [dm_mod]
[<ffffffffa0223533>] clone_endio+0xa3/0xe0 [dm_mod]
[<ffffffff811611bd>] bio_endio+0x1d/0x40
[<ffffffff811c147b>] req_bio_endio.clone.41+0xab/0xf0
[<ffffffff811c3640>] blk_update_request+0x100/0x570
[<ffffffff811c38c2>] ? blk_update_request+0x382/0x570
[<ffffffff811c3ae4>] blk_update_bidi_request+0x34/0xa0
[<ffffffff811c3ccf>] blk_end_bidi_request+0x2f/0x80
[<ffffffff811c3d60>] blk_end_request+0x10/0x20
[<ffffffff812b3f7f>] scsi_io_completion+0xaf/0x630
[<ffffffff812aace8>] scsi_finish_command+0xc8/0x130
[<ffffffff812b3dcf>] scsi_softirq_done+0x13f/0x160
[<ffffffff811ca4a0>] blk_done_softirq+0x80/0xa0
[<ffffffff81055839>] __do_softirq+0xc9/0x230
[<ffffffff81407bcc>] call_softirq+0x1c/0x30
[<ffffffff81004c0d>] do_softirq+0x8d/0xc0
[<ffffffff81055d0e>] irq_exit+0x9e/0xc0
[<ffffffff81408176>] do_IRQ+0x66/0xe0
[<ffffffff812ab860>] ? perf_trace_scsi_cmd_done_timeout_template+0x1b0/0x1b0
[<ffffffff81405070>] common_interrupt+0x70/0x70
<EOI>  [<ffffffff81404d27>] ? _raw_spin_unlock_irqrestore+0x67/0x80
[<ffffffff812d945c>] mptspi_qcmd+0x8c/0xd0
[<ffffffff812ac05e>] scsi_dispatch_cmd+0x13e/0x340
[<ffffffff812b30e7>] scsi_request_fn+0x397/0x490
[<ffffffff811c545f>] __make_request+0x32f/0x350
[<ffffffffa0224193>] ? dm_request+0x33/0x280 [dm_mod]
[<ffffffff811c2ea9>] generic_make_request+0x1f9/0x5c0
[<ffffffff81161a64>] ? __bio_add_page+0x114/0x220
[<ffffffffa024943a>] drbd_submit_ee+0x13a/0x230 [drbd]
[<ffffffff81043609>] ? sub_preempt_count+0xa9/0xe0
[<ffffffffa024ab96>] receive_DataRequest+0x4b6/0x850 [drbd]
[<ffffffffa024d366>] drbdd_init+0xf16/0x1150 [drbd]
[<ffffffffa02551f0>] ? we_should_drop_the_connection+0xd0/0xd0 [drbd]
[<ffffffffa0255262>] drbd_thread_setup+0x72/0x100 [drbd]
[<ffffffffa02551f0>] ? we_should_drop_the_connection+0xd0/0xd0 [drbd]
[<ffffffff81071c06>] kthread+0x96/0xa0
[<ffffffff81407ad4>] kernel_thread_helper+0x4/0x10
[<ffffffff81038383>] ? finish_task_switch+0x83/0xe0
[<ffffffff81404c9b>] ? _raw_spin_unlock_irq+0x3b/0x60
[<ffffffff81405121>] ? retint_restore_args+0xe/0xe
[<ffffffff81071b70>] ? __init_kthread_worker+0x70/0x70
[<ffffffff81407ad0>] ? gs_change+0xb/0xb
block drbd0: Out of sync: start=8, size=712 (sectors)
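
The lockdep report above describes a classic softirq lock inversion: mdev->peer_seq_lock is taken with a plain spin_lock() in process context (the drbd_nl_syncer_conf() netlink path in the HARDIRQ-ON-W/SOFTIRQ-ON-W traces), while mdev->req_lock is also taken from the block-layer completion softirq (drbd_endio_read_sec_final()), and lockdep has seen the two locks nested in the order req_lock -> peer_seq_lock, as the CPU0/CPU1 scenario shows. Below is a minimal illustrative sketch of that pattern only, not the actual DRBD code; the function names are hypothetical and merely mirror the roles of the paths in the traces:

/*
 * Illustrative sketch of the inversion lockdep reports above.
 * Not DRBD code; only the lock names are borrowed for clarity.
 */
#include <linux/spinlock.h>

static DEFINE_SPINLOCK(peer_seq_lock);  /* taken without disabling softirqs */
static DEFINE_SPINLOCK(req_lock);       /* also taken from the completion softirq */

/* Process context ("CPU1" in the scenario): establishes the ordering
 * req_lock -> peer_seq_lock with interrupts disabled. */
static void config_path(void)
{
        unsigned long flags;

        spin_lock_irqsave(&req_lock, flags);
        spin_lock(&peer_seq_lock);
        /* ... update shared state ... */
        spin_unlock(&peer_seq_lock);
        spin_unlock_irqrestore(&req_lock, flags);
}

/* Another process-context path ("CPU0"): takes peer_seq_lock with a plain
 * spin_lock(), so softirqs remain enabled while it is held. */
static void receiver_path(void)
{
        spin_lock(&peer_seq_lock);
        /*
         * If a block-layer completion softirq fires here and spins on
         * req_lock while CPU1 already holds req_lock and is waiting for
         * peer_seq_lock, neither side can make progress: deadlock.
         */
        spin_unlock(&peer_seq_lock);
}

/* Completion (softirq) context, the role played by
 * drbd_endio_read_sec_final() in the traces. */
static void completion_softirq(void)
{
        unsigned long flags;

        spin_lock_irqsave(&req_lock, flags);
        /* ... complete the request ... */
        spin_unlock_irqrestore(&req_lock, flags);
}

The usual cures are to take peer_seq_lock with spin_lock_bh()/spin_lock_irqsave() everywhere, or to remove the lock from the reader side entirely, which is the direction the fix mentioned in comment #5 takes.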
Comment 1 Bart Van Assche 2011-08-04 19:06:06 UTC
A similar complaint was logged in the kernel log of the second system (B).

Synchronization stalled after some time. The contents of /proc/drbd on system A remained stuck in this state:

 0: cs:VerifyS ro:Secondary/Secondary ds:UpToDate/UpToDate C r-----
    ns:0 nr:0 dw:0 dr:414660 al:0 bm:0 lo:1 pe:9610 ua:2048 ap:0 ep:1 wo:f oos:2092956
        [==>.................] verified: 19.6% (1686488/2092956)K
        finish: 4:50:49 speed: 96 (180) want: 40,960 K/sec
Comment 2 Bart Van Assche 2011-08-04 19:06:38 UTC
A previous and similar report of this issue can be found here: http://lists.linbit.com/pipermail/drbd-user/2011-February/015725.html.
Comment 3 Philipp Reisner 2011-08-08 08:52:00 UTC
Hi Bart,

We are aware of this issue. We have a fix for that in the drbd-8.4
code. We expect that 8.4 will be merged in the 3.2 merge window.

The quickest way to a solution is to use the out-of-tree code.

Best,
 Phil
Comment 4 Bart Van Assche 2011-08-08 17:17:22 UTC
(In reply to comment #3)
> We are aware of this issue. We have a fix for that in the drbd-8.4
> code. We expect that 8.4 will be merged in the 3.2 merge window.
> 
> The quickest way to a solution is to use the out-of-tree code.

Thanks for the quick reply. If you can tell me the commit ID(s) of the patches that fix this issue in the DRBD git repo, I'll try to port these myself to Linus' kernel tree and retest.
Comment 5 Lars Ellenberg 2011-08-24 14:52:58 UTC
We moved to RCU lockless updates of the protected object.

This would be the relevant commit:
http://git.drbd.org/?p=linux-2.6-drbd.git;a=commitdiff;h=1dedcbdda6b89a1b82650b4c65b8ee8b506b2ef0
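
For reference, the general pattern comment #5 describes is the standard kernel RCU read-copy-update scheme: readers access the object under rcu_read_lock(), so no softirq-unsafe spinlock is left to invert against mdev->req_lock, and updaters publish a new copy with rcu_assign_pointer() and free the old one after synchronize_rcu(). A rough sketch of that pattern follows; the struct, field, and function names are hypothetical and do not reproduce the drbd-8.4 commit:

/*
 * Sketch of RCU-protected configuration updates (hypothetical names).
 * Assumes 'conf' is published once at initialization and is never NULL
 * afterwards.
 */
#include <linux/mutex.h>
#include <linux/rcupdate.h>
#include <linux/slab.h>

struct syncer_conf {
        unsigned int rate;
        /* ... */
};

static struct syncer_conf __rcu *conf;
static DEFINE_MUTEX(conf_mutex);        /* serializes updaters only */

/* Reader: callable from any non-sleeping context; no spinlock involved. */
static unsigned int get_rate(void)
{
        unsigned int rate;

        rcu_read_lock();
        rate = rcu_dereference(conf)->rate;
        rcu_read_unlock();
        return rate;
}

/* Updater (process context): copy, modify, publish, then free the old
 * object once all pre-existing readers have finished. */
static int set_rate(unsigned int new_rate)
{
        struct syncer_conf *new, *old;

        new = kmalloc(sizeof(*new), GFP_KERNEL);
        if (!new)
                return -ENOMEM;

        mutex_lock(&conf_mutex);
        old = rcu_dereference_protected(conf, lockdep_is_held(&conf_mutex));
        *new = *old;
        new->rate = new_rate;
        rcu_assign_pointer(conf, new);
        mutex_unlock(&conf_mutex);

        synchronize_rcu();
        kfree(old);
        return 0;
}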
