My primary server started blocking all I/O for no apparent reason. The connection to the (long-distance) secondary server was working normally. drbdadm disconnect didn't work either (it blocked too), so I had to reboot the server. DRBD sits on top of mdadm, and both of my disks were working properly. I never had any issue with 2.6.39 (or, before that, with earlier versions, although Ahead mode was not available then).

Configuration:

common {
        syncer {
                rate 100M;
                al-extents 3833;
                verify-alg md5;
                use-rle;
        }
        handlers {
                pri-on-incon-degr "echo O > /proc/sysrq-trigger ; halt -f";
                pri-lost-after-sb "echo O > /proc/sysrq-trigger ; halt -f";
        }
        disk {
                on-io-error detach;
        }
        net {
                ping-timeout 50;
                after-sb-0pri disconnect;
                after-sb-1pri discard-secondary;
                after-sb-2pri disconnect;
                rr-conflict disconnect;
                sndbuf-size 10M;
                rcvbuf-size 10M;
                max-buffers 2048;
                cram-hmac-alg "sha1";
                shared-secret "foobar";
                on-congestion pull-ahead;
                congestion-fill 10M;
        }
        startup {
                wfc-timeout 15;
                outdated-wfc-timeout 5;
                degr-wfc-timeout 5;
        }
        protocol A;
}

dmesg on the primary:

Aug 23 04:30:44 kernel: [741635.718352] block drbd0: Congestion-fill threshold reached
Aug 23 04:30:44 kernel: [741635.718588] block drbd0: conn( Connected -> Ahead ) pdsk( UpToDate -> Consistent )
Aug 23 04:30:46 kernel: [741637.892460] block drbd0: helper command: /sbin/drbdadm before-resync-source minor-0
Aug 23 04:30:46 kernel: [741637.893614] block drbd0: helper command: /sbin/drbdadm before-resync-source minor-0 exit code 0 (0x0)
Aug 23 04:30:46 kernel: [741637.893666] block drbd0: conn( Ahead -> SyncSource ) pdsk( Consistent -> Inconsistent )
Aug 23 04:30:46 kernel: [741637.893714] block drbd0: Began resync as SyncSource (will sync 92388 KB [23097 bits set]).
Aug 23 04:30:46 kernel: [741637.895396] block drbd0: updated sync UUID 31D068DFA6A5A469:0001000000000000:18EFD27FBECBDF53:18EED27FBECBDF53
Aug 23 04:30:55 kernel: [741646.209057] block drbd0: Resync done (total 8 sec; paused 0 sec; 11548 K/sec)
Aug 23 04:30:55 kernel: [741646.209113] block drbd0: updated UUIDs 31D068DFA6A5A469:0000000000000000:0001000000000000:18EFD27FBECBDF53
Aug 23 04:30:55 kernel: [741646.209164] block drbd0: conn( SyncSource -> Connected ) pdsk( Inconsistent -> UpToDate )
Aug 23 04:30:55 kernel: [741646.210922] block drbd0: bitmap WRITE of 0 pages took 0 jiffies
Aug 23 04:30:55 kernel: [741646.211049] block drbd0: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
Aug 23 04:31:06 kernel: [741657.464005] block drbd0: Congestion-fill threshold reached
Aug 23 04:31:06 kernel: [741657.464045] block drbd0: conn( Connected -> Ahead ) pdsk( UpToDate -> Consistent )
Aug 23 04:31:26 kernel: [741676.975599] block drbd0: [drbd0_worker/4150] sock_sendmsg time expired, ko = 4294967295
[... the same message repeats every 6 seconds, with ko decrementing by one each time ...]
Aug 23 04:33:50 kernel: [741820.692987] block drbd0: [drbd0_worker/4150] sock_sendmsg time expired, ko = 4294967271
Aug 23 04:33:52 kernel: [741823.148165] INFO: task jbd2/drbd0-8:4296 blocked for more than 120 seconds.
Aug 23 04:33:52 kernel: [741823.148202] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 23 04:33:52 kernel: [741823.148253] jbd2/drbd0-8    D 0000000000000001     0  4296      2 0x00000000
Aug 23 04:33:52 kernel: [741823.148289]  ffff88023ecc0730 0000000000000046 ffff88022f28a010 ffff880233ad12e0
Aug 23 04:33:52 kernel: [741823.148342]  0000000000012480 ffff88022f28bfd8 ffff88022f28bfd8 0000000000012480
Aug 23 04:33:52 kernel: [741823.148395]  0000000000000001 ffff880233ad12e0 ffff880231d54f00 ffffffff8103d37b
Aug 23 04:33:52 kernel: [741823.148448] Call Trace:
 [<ffffffff8103d37b>] ? __wake_up_common+0x5b/0x90
 [<ffffffff81010a62>] ? read_tsc+0x12/0x40
 [<ffffffff81071a5b>] ? ktime_get_ts+0x6b/0xd0
 [<ffffffff810a8e50>] ? __lock_page+0x70/0x70
 [<ffffffff8150eae7>] ? io_schedule+0x97/0xf0
 [<ffffffff810a8e59>] ? sleep_on_page+0x9/0x10
 [<ffffffff8150f3e0>] ? __wait_on_bit+0x50/0x80
 [<ffffffff810a90dc>] ? wait_on_page_bit+0x6c/0x80
 [<ffffffff81068690>] ? autoremove_wake_function+0x30/0x30
 [<ffffffff810b2b2a>] ? pagevec_lookup_tag+0x1a/0x30
 [<ffffffff810a9d1b>] ? filemap_fdatawait_range+0xdb/0x150
 [<ffffffff811aa92f>] ? jbd2_journal_commit_transaction+0x85f/0x1340
 [<ffffffff811ae6c3>] ? kjournald2+0xb3/0x210
 [<ffffffff81068660>] ? wake_up_bit+0x40/0x40
 [<ffffffff811ae610>] ? commit_timeout+0x10/0x10
 [<ffffffff810681e6>] ? kthread+0x96/0xb0
 [<ffffffff81517de4>] ? kernel_thread_helper+0x4/0x10
 [<ffffffff81068150>] ? kthread_worker_fn+0x190/0x190
 [<ffffffff81517de0>] ? gs_change+0x13/0x13

Aug 23 04:33:52 kernel: [741823.149031] INFO: task mysqld:25472 blocked for more than 120 seconds.
Aug 23 04:33:52 kernel: [741823.149059] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 23 04:33:52 kernel: [741823.149104] mysqld          D 0000000000000001     0 25472   4633 0x00000000
Aug 23 04:33:52 kernel: [741823.149139]  ffff88023ecc0730 0000000000000086 ffff8801d600a010 ffff88023397af60
Aug 23 04:33:52 kernel: [741823.149192]  0000000000012480 ffff8801d600bfd8 ffff8801d600bfd8 0000000000012480
Aug 23 04:33:52 kernel: [741823.149245]  0000000000000001 ffff88023397af60 0000000000000000 0000000000000000
Aug 23 04:33:52 kernel: [741823.149297] Call Trace:
 [<ffffffff81009729>] ? __switch_to+0xe9/0x350
 [<ffffffff810a8e50>] ? __lock_page+0x70/0x70
 [<ffffffff8150eae7>] ? io_schedule+0x97/0xf0
 [<ffffffff810a8e59>] ? sleep_on_page+0x9/0x10
 [<ffffffff8150f3e0>] ? __wait_on_bit+0x50/0x80
 [<ffffffff810a90dc>] ? wait_on_page_bit+0x6c/0x80
 [<ffffffff81068690>] ? autoremove_wake_function+0x30/0x30
 [<ffffffff810a949d>] ? grab_cache_page_write_begin+0xad/0xf0
 [<ffffffff8117ab2a>] ? ext4_da_write_begin+0x14a/0x200
 [<ffffffff811a8b98>] ? jbd2_journal_stop+0x1b8/0x260
 [<ffffffff810a8124>] ? generic_file_buffered_write+0x104/0x240
 [<ffffffff810aa2b8>] ? __generic_file_aio_write+0x278/0x460
 [<ffffffff810aa4f8>] ? generic_file_aio_write+0x58/0xd0
 [<ffffffff8116f821>] ? ext4_file_write+0x71/0x2e0
 [<ffffffff810e9a20>] ? do_sync_write+0xc0/0x100
 [<ffffffff8128d30c>] ? security_file_permission+0x1c/0xb0
 [<ffffffff810ea66b>] ? vfs_write+0xcb/0x180
 [<ffffffff810ea7c1>] ? sys_pwrite64+0xa1/0xb0
 [<ffffffff81516d52>] ? system_call_fastpath+0x16/0x1b

Aug 23 04:33:52 kernel: [741823.149864] INFO: task flush-147:0:4858 blocked for more than 120 seconds.
Aug 23 04:33:52 kernel: [741823.149892] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 23 04:33:52 kernel: [741823.149937] flush-147:0     D 0000000000000001     0  4858      2 0x00000000
Aug 23 04:33:52 kernel: [741823.149971]  ffff88023ecc0730 0000000000000046 ffff88022891a010 ffff8802320f7450
Aug 23 04:33:52 kernel: [741823.150024]  0000000000012480 ffff88022891bfd8 ffff88022891bfd8 0000000000012480
Aug 23 04:33:52 kernel: [741823.150077]  0000000000000001 ffff8802320f7450 ffff880231d45000 ffff880231d45958
Aug 23 04:33:52 kernel: [741823.150130] Call Trace:
 [<ffffffff810ab1d4>] ? mempool_alloc+0x64/0x160
 [<ffffffffa006cf55>] ? bm_change_bits_to+0x85/0x150 [drbd]
 [<ffffffff8150f1bd>] ? schedule_timeout+0x1ed/0x220
 [<ffffffff8103da33>] ? __wake_up+0x43/0x70
 [<ffffffff8104019a>] ? select_task_rq_fair+0x47a/0x9c0
 [<ffffffff8150ee1a>] ? wait_for_common+0xfa/0x1a0
 [<ffffffff81045c90>] ? try_to_wake_up+0x260/0x260
 [<ffffffffa00271b1>] ? lc_get+0x41/0x250 [lru_cache]
 [<ffffffffa0086b84>] ? drbd_al_begin_io+0xf4/0x220 [drbd]
 [<ffffffffa00875f0>] ? drbd_al_read_log+0x3a0/0x3a0 [drbd]
 [<ffffffff81119884>] ? bio_alloc_bioset+0x54/0xf0
 [<ffffffff811186d1>] ? __bio_clone+0x21/0x70
 [<ffffffffa0082a6e>] ? drbd_make_request_common+0xc6e/0x1660 [drbd]
 [<ffffffff8103da33>] ? __wake_up+0x43/0x70
 [<ffffffffa008378d>] ? drbd_make_request+0x32d/0x10c0 [drbd]
 [<ffffffff8117afab>] ? ext4_io_submit+0x2b/0x50
 [<ffffffff8117735d>] ? mpage_da_map_and_submit+0x59d/0x9f0
 [<ffffffff812b1f30>] ? throtl_find_tg+0x40/0x60
 [<ffffffff812b2526>] ? blk_throtl_bio+0x396/0x4d0
 [<ffffffff812a361b>] ? generic_make_request+0x2db/0x3d0
 [<ffffffff810ab1d4>] ? mempool_alloc+0x64/0x160
 [<ffffffff812a3781>] ? submit_bio+0x71/0xf0
 [<ffffffff81113bb8>] ? submit_bh+0xe8/0x150
 [<ffffffff81116da0>] ? __block_write_full_page+0x1e0/0x3d0
 [<ffffffff81116540>] ? end_buffer_async_read+0x160/0x160
 [<ffffffff81119ac0>] ? I_BDEV+0x10/0x10
 [<ffffffff810b088a>] ? __writepage+0xa/0x40
 [<ffffffff810b1231>] ? write_cache_pages+0x1e1/0x410
 [<ffffffff810b0880>] ? get_zeroed_page+0x10/0x10
 [<ffffffff810b14a8>] ? generic_writepages+0x48/0x70
 [<ffffffff8110d220>] ? writeback_single_inode+0xe0/0x200
 [<ffffffff8110d88e>] ? writeback_sb_inodes+0xae/0x190
 [<ffffffff8110dfff>] ? writeback_inodes_wb+0xbf/0x140
 [<ffffffff8110e33d>] ? wb_writeback+0x2bd/0x310
 [<ffffffff8110e58b>] ? wb_do_writeback+0x1fb/0x210
 [<ffffffff8110e62a>] ? bdi_writeback_thread+0x8a/0x170
 [<ffffffff8110e5a0>] ? wb_do_writeback+0x210/0x210
 [<ffffffff810681e6>] ? kthread+0x96/0xb0
 [<ffffffff81517de4>] ? kernel_thread_helper+0x4/0x10
 [<ffffffff81068150>] ? kthread_worker_fn+0x190/0x190
 [<ffffffff81517de0>] ? gs_change+0x13/0x13

(and so on, until the reboot)

Nothing suspicious in the secondary's dmesg. Both nodes were still marked as connected.
This looks more like a misconfiguration or misunderstanding, though I have to admit the call trace does not make much sense to me.

DRBD uses two TCP connections. For some reason, sendmsg on the bulk data connection makes no progress, even though full round trips on what we call the meta connection still work. Do you have an MTU problem?

To have DRBD detect and automatically recover from situations like this, you can use the "knock out counter" feature, e.g. add

        ko-count 4;

to your drbd.conf. More on this in the DRBD User's Guide:

http://www.drbd.org/users-guide/
http://www.drbd.org/users-guide/re-drbdconf.html
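In context, a sketch of how that would look in the net section of the config posted above (the value 4 is the example from this thread, not a recommendation; tune it together with ping-timeout for your link):

```
net {
        # ko-count: if the peer fails to complete a single write request
        # for ko-count * timeout intervals, the primary drops the
        # connection (peer is considered dead) instead of letting
        # sendmsg retry forever with application I/O blocked behind it.
        ko-count 4;
        ping-timeout 50;
        on-congestion pull-ahead;
        congestion-fill 10M;
        # ... remaining net options unchanged ...
}
```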
I can't rule out a misconfiguration, but I've been using DRBD since 8.0 on many servers and I've never seen such a thing happen. It only started very recently, after I began using Linux 3.0 on a few servers a few days ago.

Even if this was a temporary network failure of a kind I've never had before:

- how come even a drbdadm disconnect doesn't work (i.e. it hangs), and I have to reboot the primary server to be able to write to my filesystem again?

- isn't Ahead mode supposed to keep I/O from blocking when the buffer is full and/or the secondary server is unreachable?
On Wed, Aug 24, 2011 at 03:14:47PM +0000, bugzilla-daemon@bugzilla.kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=41602
>
> --- Comment #2 from Cyril B. <cbay@excellency.fr> 2011-08-24 15:14:47 ---
> I can't rule out a misconfiguration, but I've been using DRBD since 8.0 on many
> servers and I've never seen such a thing happen. It's only happened very
> recently since I began using Linux 3.0 on a few servers, a few days ago.
>
> Even if it's a temporary network failure that I've never had before:
>
> - how comes even a drbdadm disconnect doesn't work (i.e. hangs), and I have to
> reboot the primary server to be able to write on my filesystem again?

Arguably a design bug: the default disconnect waits for all current requests to be handled. You can try to --force the disconnect, though.

> - isn't the Ahead mode supposed to not block IO if the buffer is full and/or
> the secondary server is not accessible?

Well, yes, but it still needs to send some bytes (as opposed to the full data block) to the peer. And if sendmsg blocks, times out, or otherwise makes no progress, we cannot do anything about that.
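For the archives, a sketch of what a forced disconnect might look like (resource name r0 and device /dev/drbd0 are placeholders; exact drbdsetup syntax varies between DRBD versions, so check drbdsetup(8) for your release):

```
# A plain "drbdadm disconnect r0" waits for in-flight requests to
# complete, which is exactly what hangs when the sender is stuck.
# A forced disconnect at the drbdsetup level abandons them instead:
drbdsetup /dev/drbd0 disconnect --force

# The resource should then go StandAlone; once the network issue is
# resolved, reconnect and let DRBD resync the out-of-date blocks:
drbdadm connect r0
```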
Thanks, that makes sense.