Created attachment 301539 [details]
stack

1. Cannot log in with ssh; the system hung and nothing could be done.
2. dmesg reports 'audit: audit_backlog=41349 > audit_backlog_limit=8192'.
3. I sent sysrq-crash and got a vmcore file. I don't know how to reproduce it.

Following is my analysis of the vmcore:

The reason the tty cannot log in is that pid 2021571 holds the acct_process
mutex, and 2021571 cannot release the mutex because it is waiting for the
xlog to release space. See the stack info in the attached stack.txt.

So I tried to figure out what happened to the xlog:

crash> struct xfs_ail.ail_target_prev,ail_target,ail_head 0xffff00ff884f1000
ail_target_prev = 0xe9200058600
ail_target = 0xe9200058600
ail_head = {
  next = 0xffff0340999a0a80,
  prev = 0xffff020013c66b40
}

There are 112 log items in the AIL:

crash> list 0xffff0340999a0a80 | wc -l
112

79 of them are inode log items and 30 are buf log items.

crash> xfs_log_item.li_flags,li_lsn 0xffff0340999a0a80 -x
li_flags = 0x1
li_lsn = 0xe910005cc00   ===> first item lsn

crash> xfs_log_item.li_flags,li_lsn ffff020013c66b40 -x
li_flags = 0x1
li_lsn = 0xe9200058600   ===> last item lsn

crash> xfs_log_item.li_buf 0xffff0340999a0a80
li_buf = 0xffff0200125b7180

crash> xfs_buf.b_flags 0xffff0200125b7180 -x
b_flags = 0x110032 (XBF_WRITE|XBF_ASYNC|XBF_DONE|_XBF_INODES|_XBF_PAGES)

crash> xfs_buf.b_state 0xffff0200125b7180 -x
b_state = 0x2 (XFS_BSTATE_IN_FLIGHT)

crash> xfs_buf.b_last_error,b_retries,b_first_retry_time 0xffff0200125b7180 -x
b_last_error = 0x0
b_retries = 0x0
b_first_retry_time = 0x0

The buffer flags say the I/O has completed (XBF_DONE is set). Reviewing
xfs_buf_ioend: on a successful write completion of an inode buffer,
xfs_buf_inode_iodone is called, which removes the attached log items from
the AIL and then releases xlog space by moving the tail_lsn forward.

But this item is still in the AIL, with b_last_error = 0 and XBF_WRITE set.

The buf log items show the same picture as the inode log items:

crash> list -s xfs_log_item.li_buf 0xffff0340999a0a80
ffff033f8d7c9de8
  li_buf = 0x0
crash> xfs_buf_log_item.bli_buf ffff033f8d7c9de8
bli_buf = 0xffff0200125b4a80
crash> xfs_buf.b_flags 0xffff0200125b4a80 -x
b_flags = 0x100032 (XBF_WRITE|XBF_ASYNC|XBF_DONE|_XBF_PAGES)

I think it should be impossible for an item to still be in the AIL while its
buffer has XBF_DONE set and b_last_error = 0.

Is my analysis correct?
Why can't the xlog release space?
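For reference, the write-completion path this analysis refers to looks
roughly like the following in 5.10-era fs/xfs/xfs_buf.c. This is a trimmed
sketch from memory, not a verbatim copy, so check the exact tree:

/* simplified: write side of buffer I/O completion */
static void
xfs_buf_ioend(
	struct xfs_buf	*bp)
{
	if (!bp->b_error && bp->b_io_error)
		xfs_buf_ioerror(bp, bp->b_io_error);

	if (!(bp->b_flags & XBF_READ)) {
		if (!bp->b_error) {
			bp->b_flags &= ~XBF_WRITE_FAIL;
			bp->b_flags |= XBF_DONE;
		}

		/* (on error, the retry logic may resubmit and return early) */

		/* success: clear the per-buffer write retry state */
		bp->b_last_error = 0;
		bp->b_retries = 0;
		bp->b_first_retry_time = 0;

		/* inode cluster buffers: pull the attached items off the AIL */
		if (bp->b_flags & _XBF_INODES)
			xfs_buf_inode_iodone(bp);
	}

	bp->b_flags &= ~(XBF_READ | XBF_WRITE | XBF_READ_AHEAD);

	if (bp->b_flags & XBF_ASYNC)
		xfs_buf_relse(bp);	/* unlocks b_sema */
	else
		complete(&bp->b_iowait);
}

If 5.10 matches this shape, two details are worth holding against the
vmcore: the completion path clears XBF_WRITE before it returns, and for
XBF_ASYNC buffers it is xfs_buf_relse() that unlocks the buffer. A buffer
with XBF_DONE and XBF_WRITE both still set, and b_state still
XFS_BSTATE_IN_FLIGHT, is therefore consistent with a write that was
submitted but whose completion never ran, rather than with a completed
write that somehow failed to clean up the AIL.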
[cc Amir, the 5.10 stable XFS maintainer]

On Tue, Aug 09, 2022 at 11:46:23AM +0000, bugzilla-daemon@kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=216343
>
>            Bug ID: 216343
>           Summary: XFS: no space left in xlog cause system hang
>    Kernel Version: 5.10.38
>          Hardware: ARM
>         Component: XFS
>          Reporter: zhoukete@126.com
>
> Created attachment 301539 [details]
>   --> https://bugzilla.kernel.org/attachment.cgi?id=301539&action=edit
> stack
>
[...]
On Mon, Aug 15, 2022 at 2:54 AM Dave Chinner <david@fromorbit.com> wrote:
>
> [cc Amir, the 5.10 stable XFS maintainer]
>
> On Tue, Aug 09, 2022 at 11:46:23AM +0000, bugzilla-daemon@kernel.org wrote:
> > https://bugzilla.kernel.org/show_bug.cgi?id=216343
> >
[...]
> >
> > Is my analysis correct?

I don't think so.
I think this buffer write is in-flight.

> > Why can't the xlog release space?

Not sure if space cannot be released or just takes a lot of time.
There are several AIL/CIL improvements in the upstream kernel and none of
them are going to land in 5.10.y. The reported kernel version 5.10.38 has
almost no upstream fixes at all, but I don't think that any of the fixes
in 5.10.y are relevant for this case anyway.

If this hang happens often with your workload, I suggest using a newer
kernel and/or formatting xfs with a larger log to meet the demands of
your workload.

Thanks,
Amir.
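Two reference points for the discussion above. First, the b_state =
XFS_BSTATE_IN_FLIGHT value from the dump supports the in-flight reading:
in 5.10-era fs/xfs/xfs_buf.c the flag is set when an async buffer is
submitted, and is only cleared again on the release side once completion
has run. Roughly (a sketch from memory):

/* called from __xfs_buf_submit() */
static inline void
xfs_buf_ioacct_inc(
	struct xfs_buf	*bp)
{
	if (bp->b_flags & XBF_NO_IOACCT)
		return;

	ASSERT(bp->b_flags & XBF_ASYNC);
	spin_lock(&bp->b_lock);
	if (!(bp->b_state & XFS_BSTATE_IN_FLIGHT)) {
		bp->b_state |= XFS_BSTATE_IN_FLIGHT;
		percpu_counter_inc(&bp->b_target->bt_io_count);
	}
	spin_unlock(&bp->b_lock);
}

So a buffer still marked in-flight hours later suggests that
xfs_buf_ioend() never ran for that submission.

Second, on the larger-log suggestion: the log size is fixed at mkfs time,
so growing it means re-making the filesystem with something along the
lines of

  mkfs.xfs -l size=512m <device>

which is an illustrative invocation only; the right size depends on the
workload.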
(In reply to Amir Goldstein from comment #2)
>
> I don't think so.
> I think this buffer write is in-flight.

But then how do we explain XBF_DONE being set? Doesn't XBF_DONE mean the
I/O has completed?

>
> Not sure if space cannot be released or just takes a lot of time.

crash> ps -m 2006840
[0 14:35:38.418] [UN]  PID: 2006840  TASK: ffff035f80de8f80  CPU: 51  COMMAND: "onestor-peon"

pid 2006840 has hung for 14 hours. It is waiting for an xfs buffer I/O
to finish, so I think it is 'cannot be released', not 'takes a lot of
time'.

crash> bt 2006840
PID: 2006840  TASK: ffff035f80de8f80  CPU: 51  COMMAND: "onestor-peon"
 #0 [ffff80002a873580] __switch_to at ffff800010016dec
 #1 [ffff80002a8735b0] __schedule at ffff800010a99d60
 #2 [ffff80002a873650] schedule at ffff800010a9a1cc
 #3 [ffff80002a873670] schedule_timeout at ffff800010a9ee00
 #4 [ffff80002a873710] __down at ffff800010a9d2ec
 #5 [ffff80002a8737a0] down at ffff8000100ea30c
 #6 [ffff80002a8737c0] xfs_buf_lock at ffff800008f5bd4c [xfs]
 #7 [ffff80002a8737f0] xfs_buf_find at ffff800008f5c3f0 [xfs]
 #8 [ffff80002a8738b0] xfs_buf_get_map at ffff800008f5c7d4 [xfs]
 #9 [ffff80002a873930] xfs_buf_read_map at ffff800008f5cf10 [xfs]
#10 [ffff80002a8739a0] xfs_trans_read_buf_map at ffff800008f9c198 [xfs]
#11 [ffff80002a873a20] xfs_da_read_buf at ffff800008f2b51c [xfs]
#12 [ffff80002a873aa0] xfs_da3_node_read at ffff800008f2b5a4 [xfs]
#13 [ffff80002a873ac0] xfs_da3_node_lookup_int at ffff800008f2c778 [xfs]
#14 [ffff80002a873b60] xfs_dir2_node_removename at ffff800008f38964 [xfs]
#15 [ffff80002a873c30] xfs_dir_removename at ffff800008f2fc6c [xfs]
#16 [ffff80002a873c90] xfs_remove at ffff800008f77f70 [xfs]
#17 [ffff80002a873d00] xfs_vn_unlink at ffff800008f72744 [xfs]
#18 [ffff80002a873d50] vfs_unlink at ffff8000103264e8
#19 [ffff80002a873d90] do_unlinkat at ffff80001032ad44
#20 [ffff80002a873e20] __arm64_sys_unlinkat at ffff80001032ae30

> There are several AIL/CIL improvements in the upstream kernel and none of
> them are going to land in 5.10.y.
[...]
> If this hang happens often with your workload, I suggest using a newer
> kernel and/or formatting xfs with a larger log to meet the demands of
> your workload.

It has happened only once.
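The down() in frames #4/#5 of the backtrace above is the buffer lock
semaphore. In 5.10, xfs_buf_lock() is essentially the following
(reproduced from memory):

void
xfs_buf_lock(
	struct xfs_buf	*bp)
{
	trace_xfs_buf_lock(bp, _RET_IP_);

	/* a pinned stale buffer needs a log force before it can unlock */
	if (atomic_read(&bp->b_pin_count) && (bp->b_flags & XBF_STALE))
		xfs_log_force(bp->b_mount, 0);
	down(&bp->b_sema);

	trace_xfs_buf_lock_done(bp, _RET_IP_);
}

For an XBF_ASYNC write, b_sema is only released by the xfs_buf_relse() at
the end of I/O completion, so a task blocked here for 14 hours again
points at a completion that never arrived.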
On Tue, Aug 16, 2022 at 11:57 AM <bugzilla-daemon@kernel.org> wrote:
>
> https://bugzilla.kernel.org/show_bug.cgi?id=216343
>
> --- Comment #3 from zhoukete@126.com ---
> (In reply to Amir Goldstein from comment #2)
> >
> > I don't think so.
> > I think this buffer write is in-flight.
>
> But then how do we explain XBF_DONE being set? Doesn't XBF_DONE mean the
> I/O has completed?

According to xfs_trans_dirty_buf() I think it could mean an uptodate and
dirty buffer.

[...]
> pid 2006840 has hung for 14 hours. It is waiting for an xfs buffer I/O
> to finish, so I think it is 'cannot be released', not 'takes a lot of
> time'.

Maybe the hardware never returned with a response?
Hard to say. Maybe someone else has ideas.

Thanks,
Amir.
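For context, xfs_trans_dirty_buf() sets XBF_DONE at the moment a buffer
is dirtied in a transaction, long before any write I/O is issued. In
5.10 it looks roughly like this (abbreviated, from memory -- check the
exact tree):

void
xfs_trans_dirty_buf(
	struct xfs_trans	*tp,
	struct xfs_buf		*bp)
{
	struct xfs_buf_log_item	*bip = bp->b_log_item;

	ASSERT(bp->b_transp == tp);
	ASSERT(bip != NULL);

	/* mark the buffer uptodate; it will be written out eventually */
	bp->b_flags |= XBF_DONE;

	bip->bli_flags |= XFS_BLI_DIRTY | XFS_BLI_LOGGED;
	tp->t_flags |= XFS_TRANS_DIRTY;
	set_bit(XFS_LI_DIRTY, &bip->bli_item.li_flags);
}

This is consistent with the bli_flags = XFS_BLI_DIRTY seen in the dump,
but note that it only covers buffers that have a buf log item attached.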
(In reply to Amir Goldstein from comment #4)
>
> According to xfs_trans_dirty_buf() I think it could mean an uptodate and
> dirty buffer.

When I reviewed the xfs_trans_dirty_buf() code, I found that the inode
log item's buffer has a null b_log_item:

crash> xfs_log_item.li_buf,li_ops 0xffff0340999a0a80 -x
li_buf = 0xffff0200125b7180
li_ops = 0xffff800008faec60 <xfs_inode_item_ops>
crash> xfs_buf.b_log_item 0xffff0200125b7180
b_log_item = 0x0

and only the buf log item's buffer has a b_log_item:

crash> xfs_log_item.li_buf,li_ops ffff033f8d7c9de8 -x
li_buf = 0x0
li_ops = 0xffff800008fae8d8 <xfs_buf_item_ops>
crash> xfs_buf_log_item.bli_buf ffff033f8d7c9de8
bli_buf = 0xffff0200125b4a80
crash> xfs_buf.b_log_item 0xffff0200125b4a80
b_log_item = 0xffff033f8d7c9de8
crash> xfs_buf_log_item.bli_flags 0xffff033f8d7c9de8
bli_flags = 2 (XFS_BLI_DIRTY)
crash> xfs_buf_log_item.bli_item.li_flags ffff033f8d7c9de8
bli_item.li_flags = 1, (XFS_LI_IN_AIL)

So XBF_DONE on the buf log item's buffer can be explained by
xfs_trans_dirty_buf(), but the inode buffer can never have gone through
xfs_trans_dirty_buf(), because its b_log_item is 0x0.

Do you know why XBF_DONE is set on the inode buffer?

>
> Maybe the hardware never returned with a response?
> Hard to say. Maybe someone else has ideas.

If we can establish that XBF_DONE does not mean the I/O completed, I
think this issue may be caused by a hardware error.

I found this error message in dmesg:
[ 9824.111366] mpt3sas_cm0: issue target reset: handle = (0x0034)

Maybe it tells us mpt3sas lost I/O requests earlier.
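A possible answer to the b_log_item puzzle: inode log items never attach
through b_log_item at all; that field only ever points at a buf log item.
Since the 5.9/5.10 reworks, a dirty inode is attached to its cluster
buffer through li_buf and the buffer's b_li_list at logging time, roughly
like this (heavily trimmed from 5.10-era xfs_trans_log_inode(), from
memory -- check the exact tree):

	if (!iip->ili_item.li_buf) {
		struct xfs_buf	*bp;

		/* look up the inode cluster buffer backing this inode */
		xfs_imap_to_bp(ip->i_mount, tp, &ip->i_imap, &dip, &bp, 0);

		/*
		 * Inode items hang off the buffer via b_li_list/li_buf,
		 * not via b_log_item.
		 */
		bp->b_flags |= _XBF_INODES;
		list_add_tail(&iip->ili_item.li_bio_list, &bp->b_li_list);
		iip->ili_item.li_buf = bp;
	}

If that matches the running kernel, XBF_DONE on the inode cluster buffer
was simply set when the buffer was first read in (read completion marks
all pages uptodate), and says nothing about whether the pending write
completed.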
On Wed, Aug 17, 2022 at 1:19 PM <bugzilla-daemon@kernel.org> wrote:
>
> https://bugzilla.kernel.org/show_bug.cgi?id=216343
>
> --- Comment #5 from zhoukete@126.com ---
[...]
> Do you know why XBF_DONE is set on the inode buffer?

#define XBF_DONE	 (1u << 5) /* all pages in the buffer uptodate */

Buffer uptodate does not mean that it is not dirty.
I am not sure about the rest of your analysis.

[...]
> I found this error message in dmesg:
> [ 9824.111366] mpt3sas_cm0: issue target reset: handle = (0x0034)
>
> Maybe it tells us mpt3sas lost I/O requests earlier.

Yes, maybe it does.

Anyway, if your hardware had errors, could it be that your filesystem is
shutting down?

If it does, you may be hit by the bug fixed by 84d8949e7707 ("xfs: hold
buffer across unpin and potential shutdown processing"), but I am not
sure if all the conditions in this bug match your case.

If you did get hit by this bug, you may consider upgrading to v5.10.135,
which has the fix.

Thanks,
Amir.
(In reply to Amir Goldstein from comment #6)
>
> Anyway, if your hardware had errors, could it be that your filesystem is
> shutting down?
>
> If it does, you may be hit by the bug fixed by 84d8949e7707 ("xfs: hold
> buffer across unpin and potential shutdown processing"), but I am not
> sure if all the conditions in this bug match your case.

I don't think xfs is in the shutdown state:

crash> xfs_mount.m_flags 0xffff00ff85169000 -x
m_flags = 0x100 (XFS_MOUNT_ATTR2)

About 84d8949e7707, I will try to backport it to 5.10.38.

Thanks
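For reference, the shutdown check in 5.10 is a plain mount-flag test
(macro reproduced from memory; later kernels replaced it with
xfs_is_shutdown()):

/* fs/xfs/xfs_mount.h, 5.10-era */
#define XFS_FORCED_SHUTDOWN(mp)	((mp)->m_flags & XFS_MOUNT_FS_SHUTDOWN)

So m_flags = 0x100 decoding to just XFS_MOUNT_ATTR2, with the shutdown
bit clear, does support the conclusion that the filesystem never shut
down.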
I have found the cause of the problem. Applying patch
d43aaf1685aa471f0593685c9f54d53e3af3cf3f solves it.