Bug 217522

| Summary: | xfs_attr3_leaf_add_work produces a warning | | |
|---|---|---|---|
| Product: | File System | Reporter: | Vladimir Lomov (lomov.vl) |
| Version: | 2.5 | | |
| Component: | XFS | Assignee: | FileSystem/XFS Default Virtual Assignee (filesystem_xfs) |
| Status: | NEW | Resolution: | --- |
| Severity: | normal | | |
| Priority: | P3 | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Kernel Version: | | Subsystem: | |
| Regression: | No | Bisected commit-id: | |
| Attachments: | signature.asc (attachment 304370), signature.asc (attachment 304371) | | |
Description
Vladimir Lomov (lomov.vl@bkoty.ru)
2023-06-03 03:58:25 UTC

Hi.

While running linux-next (6.4.0-rc4-next-20230602-1-next-git-06849-gbc708bbd8260) on one of my hosts, I see the following message in the kernel log (`dmesg`):

```
Jun 02 20:01:19 smoon.bkoty.ru kernel: ------------[ cut here ]------------
Jun 02 20:01:19 smoon.bkoty.ru kernel: memcpy: detected field-spanning write (size 12) of single field "(char *)name_loc->nameval" at fs/xfs/libxfs/xfs_attr_leaf.c:1559 (size 1)
Jun 02 20:01:19 smoon.bkoty.ru kernel: WARNING: CPU: 2 PID: 1161 at fs/xfs/libxfs/xfs_attr_leaf.c:1559 xfs_attr3_leaf_add_work+0x4f5/0x540 [xfs]
Jun 02 20:01:19 smoon.bkoty.ru kernel: Modules linked in: nft_fib_ipv6 nft_nat overlay rpcrdma rdma_cm iw_cm ib_cm ib_core wireguard curve25519_x86_64 libchacha20poly1305 chacha_x86_64 poly1305_x86_64 libcurve25519_generic libchacha ip6_udp_tunnel udp_tunnel nft_fib_ipv4 n>
Jun 02 20:01:19 smoon.bkoty.ru kernel: crct10dif_pclmul snd_pcm_dmaengine crc32_pclmul snd_hda_intel polyval_clmulni polyval_generic gf128mul snd_intel_dspcfg ghash_clmulni_intel snd_intel_sdw_acpi sha512_ssse3 snd_hda_codec aesni_intel ppdev snd_hda_core crypto_simd cryp>
Jun 02 20:01:19 smoon.bkoty.ru kernel: CPU: 2 PID: 1161 Comm: systemd-coredum Tainted: G U 6.4.0-rc4-next-20230602-1-next-git-06849-gbc708bbd8260 #1 e2bc2c7c17ec9449d00023ecb23f332188dc6bfc
Jun 02 20:01:19 smoon.bkoty.ru kernel: Hardware name: Gigabyte Technology Co., Ltd. B460HD3/B460 HD3, BIOS F1 04/15/2020
Jun 02 20:01:19 smoon.bkoty.ru kernel: RIP: 0010:xfs_attr3_leaf_add_work+0x4f5/0x540 [xfs]
Jun 02 20:01:19 smoon.bkoty.ru kernel: Code: fe ff ff b9 01 00 00 00 4c 89 fe 48 c7 c2 f8 95 87 c0 48 c7 c7 40 96 87 c0 48 89 44 24 08 c6 05 e5 35 11 00 01 e8 5b cf 91 c7 <0f> 0b 48 8b 44 24 08 e9 88 fe ff ff 80 3d cc 35 11 00 00 0f 85 bd
Jun 02 20:01:19 smoon.bkoty.ru kernel: RSP: 0018:ffffb6050254b7f8 EFLAGS: 00010282
Jun 02 20:01:19 smoon.bkoty.ru kernel: RAX: 0000000000000000 RBX: ffffb6050254b8c8 RCX: 0000000000000027
Jun 02 20:01:19 smoon.bkoty.ru kernel: RDX: ffff9ce0ff2a1688 RSI: 0000000000000001 RDI: ffff9ce0ff2a1680
Jun 02 20:01:19 smoon.bkoty.ru kernel: RBP: ffffb6050254b85c R08: 0000000000000000 R09: ffffb6050254b688
Jun 02 20:01:19 smoon.bkoty.ru kernel: R10: 0000000000000003 R11: ffffffff89aca028 R12: ffff9cd9f2fb6050
Jun 02 20:01:19 smoon.bkoty.ru kernel: R13: ffff9cd9f2fb6000 R14: ffff9cd9f2fb6fb0 R15: 000000000000000c
Jun 02 20:01:19 smoon.bkoty.ru kernel: FS: 00007f75cad39200(0000) GS:ffff9ce0ff280000(0000) knlGS:0000000000000000
Jun 02 20:01:19 smoon.bkoty.ru kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 02 20:01:19 smoon.bkoty.ru kernel: CR2: 00007f75cb7a1000 CR3: 0000000155a3a002 CR4: 00000000003706e0
Jun 02 20:01:19 smoon.bkoty.ru kernel: Call Trace:
Jun 02 20:01:19 smoon.bkoty.ru kernel: <TASK>
Jun 02 20:01:19 smoon.bkoty.ru kernel: ? xfs_attr3_leaf_add_work+0x4f5/0x540 [xfs ecac3a792ff4924c3e2601105ba002d1f7178133]
Jun 02 20:01:19 smoon.bkoty.ru kernel: ? __warn+0x81/0x130
Jun 02 20:01:19 smoon.bkoty.ru kernel: ? xfs_attr3_leaf_add_work+0x4f5/0x540 [xfs ecac3a792ff4924c3e2601105ba002d1f7178133]
Jun 02 20:01:19 smoon.bkoty.ru kernel: ? report_bug+0x171/0x1a0
Jun 02 20:01:19 smoon.bkoty.ru kernel: ? prb_read_valid+0x1b/0x30
Jun 02 20:01:19 smoon.bkoty.ru kernel: ? handle_bug+0x3c/0x80
Jun 02 20:01:19 smoon.bkoty.ru kernel: ? exc_invalid_op+0x17/0x70
Jun 02 20:01:19 smoon.bkoty.ru kernel: ? asm_exc_invalid_op+0x1a/0x20
Jun 02 20:01:19 smoon.bkoty.ru kernel: ? xfs_attr3_leaf_add_work+0x4f5/0x540 [xfs ecac3a792ff4924c3e2601105ba002d1f7178133]
Jun 02 20:01:19 smoon.bkoty.ru kernel: ? xfs_attr3_leaf_add_work+0x4f5/0x540 [xfs ecac3a792ff4924c3e2601105ba002d1f7178133]
Jun 02 20:01:19 smoon.bkoty.ru kernel: xfs_attr3_leaf_add+0x1a3/0x210 [xfs ecac3a792ff4924c3e2601105ba002d1f7178133]
Jun 02 20:01:19 smoon.bkoty.ru kernel: xfs_attr_shortform_to_leaf+0x23f/0x250 [xfs ecac3a792ff4924c3e2601105ba002d1f7178133]
Jun 02 20:01:19 smoon.bkoty.ru kernel: xfs_attr_set_iter+0x772/0x910 [xfs ecac3a792ff4924c3e2601105ba002d1f7178133]
Jun 02 20:01:19 smoon.bkoty.ru kernel: xfs_xattri_finish_update+0x18/0x50 [xfs ecac3a792ff4924c3e2601105ba002d1f7178133]
Jun 02 20:01:19 smoon.bkoty.ru kernel: xfs_attr_finish_item+0x1e/0xb0 [xfs ecac3a792ff4924c3e2601105ba002d1f7178133]
Jun 02 20:01:19 smoon.bkoty.ru kernel: xfs_defer_finish_noroll+0x193/0x6e0 [xfs ecac3a792ff4924c3e2601105ba002d1f7178133]
Jun 02 20:01:19 smoon.bkoty.ru kernel: __xfs_trans_commit+0x2d8/0x3e0 [xfs ecac3a792ff4924c3e2601105ba002d1f7178133]
Jun 02 20:01:19 smoon.bkoty.ru kernel: xfs_attr_set+0x48a/0x6a0 [xfs ecac3a792ff4924c3e2601105ba002d1f7178133]
Jun 02 20:01:19 smoon.bkoty.ru kernel: xfs_xattr_set+0x8d/0xe0 [xfs ecac3a792ff4924c3e2601105ba002d1f7178133]
Jun 02 20:01:19 smoon.bkoty.ru kernel: __vfs_setxattr+0x96/0xd0
Jun 02 20:01:19 smoon.bkoty.ru kernel: __vfs_setxattr_noperm+0x77/0x1d0
Jun 02 20:01:19 smoon.bkoty.ru kernel: vfs_setxattr+0x9f/0x180
Jun 02 20:01:19 smoon.bkoty.ru kernel: setxattr+0x9e/0xc0
Jun 02 20:01:19 smoon.bkoty.ru kernel: __x64_sys_fsetxattr+0xbf/0xf0
Jun 02 20:01:19 smoon.bkoty.ru kernel: do_syscall_64+0x5d/0x90
Jun 02 20:01:19 smoon.bkoty.ru kernel: ? syscall_exit_to_user_mode+0x1b/0x40
Jun 02 20:01:19 smoon.bkoty.ru kernel: ? do_syscall_64+0x6c/0x90
Jun 02 20:01:19 smoon.bkoty.ru kernel: ? syscall_exit_to_user_mode+0x1b/0x40
Jun 02 20:01:19 smoon.bkoty.ru kernel: ? do_syscall_64+0x6c/0x90
Jun 02 20:01:19 smoon.bkoty.ru kernel: ? syscall_exit_to_user_mode+0x1b/0x40
Jun 02 20:01:19 smoon.bkoty.ru kernel: ? do_syscall_64+0x6c/0x90
Jun 02 20:01:19 smoon.bkoty.ru kernel: ? exc_page_fault+0x7f/0x180
Jun 02 20:01:19 smoon.bkoty.ru kernel: entry_SYSCALL_64_after_hwframe+0x6e/0xd8
Jun 02 20:01:19 smoon.bkoty.ru kernel: RIP: 0033:0x7f75cb2023be
Jun 02 20:01:19 smoon.bkoty.ru kernel: Code: 48 8b 0d 9d 49 0d 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 be 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 6a 49 0d 00 f7 d8 64 89 01 48
Jun 02 20:01:19 smoon.bkoty.ru kernel: RSP: 002b:00007ffd172d1a68 EFLAGS: 00000202 ORIG_RAX: 00000000000000be
Jun 02 20:01:19 smoon.bkoty.ru kernel: RAX: ffffffffffffffda RBX: 00007ffd172d2188 RCX: 00007f75cb2023be
Jun 02 20:01:19 smoon.bkoty.ru kernel: RDX: 000055d5735653ae RSI: 000055d571d48a5f RDI: 0000000000000007
Jun 02 20:01:19 smoon.bkoty.ru kernel: RBP: 000055d571d4b618 R08: 0000000000000001 R09: 0000000000000001
Jun 02 20:01:19 smoon.bkoty.ru kernel: R10: 000000000000000f R11: 0000000000000202 R12: 000055d5735653ae
Jun 02 20:01:19 smoon.bkoty.ru kernel: R13: 0000000000000007 R14: 000055d571d48a5f R15: 000055d571d4b638
Jun 02 20:01:19 smoon.bkoty.ru kernel: </TASK>
Jun 02 20:01:19 smoon.bkoty.ru kernel: ---[ end trace 0000000000000000 ]---
```

On another host running the same kernel with an almost identical environment (CPU and FS on hard disks), I don't see this message.

The flags used to mount the FS:

```
$ grep 'xfs' /etc/fstab
PARTUUID=7c9a5053-216d-2b4e-8c73-22d16a87ae6b /              xfs rw,relatime,attr2,inode64,noquota 0 1
PARTUUID=88b4e2db-862b-8b41-a331-66c483237a23 /var           xfs rw,relatime,attr2,inode64,noquota 0 2
PARTUUID=d0099f96-70d9-3846-835c-e7d7da363048 /usr/local     xfs rw,relatime,attr2,inode64,noquota 0 2
PARTUUID=ffde9d45-2275-c446-b54c-fcf96bd93a5f /home          xfs rw,relatime,attr2,inode64,noquota 0 2
PARTUUID=8cec7c90-441a-1d49-94af-a5176a9fd973 /srv/nfs/cache xfs rw,relatime,attr2,inode64,noquota 0 2
PARTUUID=39dd3664-0a48-d144-8e65-414d5d549c2f /mnt/aux       xfs rw,relatime,attr2,inode64,noquota 0 2
PARTUUID=a5480aca-273b-4d4c-8520-f782293ed878 /mnt/storage   xfs rw,relatime,attr2,inode64,noquota 0 2
PARTUUID=231b1235-c9a1-e249-8332-fd9141c89ae7 /mnt/data      xfs rw,relatime,attr2,inode64,noquota 0 2
PARTUUID=bab3d4b7-2b1e-492d-9298-de6170d2098f /mnt/archive   xfs rw,relatime,attr2,inode64,noquota 0 2
PARTUUID=55fd9e2f-605e-4a01-b0c2-f6a9df302301 /media/storage xfs auto,x-systemd.automount,x-systemd.device-timeout=20,nofail 0 2
```

Comment 1
Darrick J. Wong (djwong@kernel.org)
2023-06-03 14:50:24 UTC

> memcpy: detected field-spanning write (size 12) of single field
> "(char *)name_loc->nameval" at fs/xfs/libxfs/xfs_attr_leaf.c:1559 (size 1)

Yes, this bug is a collision between the bad old ways of doing flex arrays:

```
typedef struct xfs_attr_leaf_name_local {
        __be16  valuelen;       /* number of bytes in value */
        __u8    namelen;        /* length of name bytes */
        __u8    nameval[1];     /* name/value bytes */
} xfs_attr_leaf_name_local_t;
```

and the static checking that gcc/llvm purport to be able to do properly.

This is encoded into the ondisk structures, which means that someone needs to perform a deep audit to change each array[1] into an array[], and then ensure that every sizeof() performed on those structure definitions has been adjusted. Then they would need to run the full QA test suite to ensure that no regressions have been introduced. Then someone will need to track down any code using /usr/include/xfs/xfs_da_format.h to let them know about the silent compiler bomb heading their way.

I prefer we leave it as-is, since this code has been running for years with no problems.

--D
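To make the audit described above concrete, here is a minimal user-space sketch. The struct mirrors the xfs_attr_leaf_name_local layout quoted in Comment 1 but uses plain stdint types, and the entsize_* helpers are made-up stand-ins for this sketch, not the real XFS macros:

```
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct attr_local_old {                 /* old style: trailing array[1] */
        uint16_t valuelen;
        uint8_t  namelen;
        uint8_t  nameval[1];            /* counted by sizeof() */
};

struct attr_local_new {                 /* modern flexible array member */
        uint16_t valuelen;
        uint8_t  namelen;
        uint8_t  nameval[];             /* not counted by sizeof() */
};

/* Old-style entry size: the "- 1" cancels the byte that sizeof() counted. */
static size_t entsize_old(size_t namelen, size_t valuelen)
{
        return sizeof(struct attr_local_old) - 1 + namelen + valuelen;
}

/* Naive conversion: drop the "- 1" and keep sizeof().  Because of tail
 * padding this can silently compute a different entry size than before. */
static size_t entsize_naive(size_t namelen, size_t valuelen)
{
        return sizeof(struct attr_local_new) + namelen + valuelen;
}

/* Robust conversion: anchor the arithmetic on the field offset instead. */
static size_t entsize_offsetof(size_t namelen, size_t valuelen)
{
        return offsetof(struct attr_local_new, nameval) + namelen + valuelen;
}

int main(void)
{
        printf("sizeof old=%zu new=%zu\n",
               sizeof(struct attr_local_old), sizeof(struct attr_local_new));
        /* Typically prints 15, 16, 15: the naive conversion is off by one. */
        printf("entsize old=%zu naive=%zu offsetof=%zu\n",
               entsize_old(9, 3), entsize_naive(9, 3), entsize_offsetof(9, 3));
        return 0;
}
```

Whether a given expression needs the offsetof() form, keeps its "- 1", or happens to be padding-safe already depends on the particular structure, which is why each sizeof() site has to be checked by hand rather than converted mechanically.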
Comment 2
Vladimir Lomov (lomov.vl@bkoty.ru)
2023-06-04 03:31:20 UTC

Created attachment 304370 [details]
signature.asc

Hello

** bugzilla-daemon@kernel.org <bugzilla-daemon@kernel.org> [2023-06-03 14:50:24 +0000]:

> Yes, this bug is a collision between the bad old ways of doing flex
> arrays [...] and the static checking that gcc/llvm purport to be able
> to do properly.

Something similar has caused problems with kernel compilation before: https://lkml.org/lkml/2023/5/24/576 (I'm not 100% sure if the origin is the same, though).

> This is encoded into the ondisk structures, which means that someone
> needs to perform a deep audit [...]
>
> I prefer we leave it as-is, since this code has been running for years
> with no problems.

Should I assume that this problem is not significant, that it won't have any effect on the FS, and that it won't cause the FS to misbehave or become corrupted? If so, why does the problem only show up on one host but not on the other? Or is this a runtime check, and it somehow happens on the first system (even rebooted twice) but not on the second one?

[...]

---
Vladimir Lomov

Comment 3
Darrick J. Wong (djwong@kernel.org)
2023-06-04 18:32:00 UTC

On Sun, Jun 04, 2023 at 03:31:20AM +0000, bugzilla-daemon@kernel.org wrote:
> Something similar has caused problems with kernel compilation before:
> https://lkml.org/lkml/2023/5/24/576 (I'm not 100% sure if the origin is the
> same, though).

Yup.

> Should I assume that this problem is not significant, that it won't have any
> effect on the FS, and that it won't cause the FS to misbehave or become
> corrupted? If so, why does the problem only show up on one host but not on
> the other? Or is this a runtime check, and it somehow happens on the first
> system (even rebooted twice) but not on the second one?

AFAICT, there's no real memory corruption problem here; it's just that the compiler treats array[1] as a single-element array instead of turning on whatever magic enables it to handle flexarrays (aka array[] or array[0]).
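As an aside, the "magic" referred to here can be observed from user space: the kernel's fortified memcpy() decides whether a copy spans a field based on the compiler's notion of the destination field's size, and __builtin_object_size() exposes that notion. This is a hedged sketch, assuming a compiler that supports -fstrict-flex-arrays (GCC 13+ / Clang 16+); the structs are stand-ins, not kernel code, and exact output may vary by compiler:

```
/* Build with:  cc -O2 -fstrict-flex-arrays=3 flexdemo.c */
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>

struct rec_old {
        uint16_t valuelen;
        uint8_t  namelen;
        uint8_t  nameval[1];    /* treated as a real 1-byte field */
};

struct rec_new {
        uint16_t valuelen;
        uint8_t  namelen;
        uint8_t  nameval[];     /* flexible array: size unknown to the compiler */
};

/* noinline so the compiler only sees the declared types, not the allocation. */
__attribute__((noinline))
static void report(struct rec_old *o, struct rec_new *n)
{
        /* Mode 1 asks for the size of the enclosing field.  Expect 1 for the
         * [1] member and -1 ("unknown") for the flexible array member; a
         * 12-byte copy into a field the compiler believes is 1 byte is what
         * the fortify check reports as a field-spanning write. */
        printf("nameval[1] field size: %zd\n",
               (ssize_t)__builtin_object_size(o->nameval, 1));
        printf("nameval[]  field size: %zd\n",
               (ssize_t)__builtin_object_size(n->nameval, 1));
}

int main(void)
{
        static unsigned char buf[64];   /* stand-in for an attr leaf block */

        report((struct rec_old *)buf, (struct rec_new *)buf);
        return 0;
}
```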
I don't know why you'd ever want a real single-element array, but legacy C is fun like that. :/

--D

Comment 4
Vladimir Lomov (lomov.vl@bkoty.ru)

Created attachment 304371 [details]
signature.asc

Hello.

** bugzilla-daemon@kernel.org <bugzilla-daemon@kernel.org> [2023-06-04 18:32:00 +0000]:

>> Something similar has caused problems with kernel compilation before:
>> https://lkml.org/lkml/2023/5/24/576 (I'm not 100% sure if the origin is the
>> same, though).
>
> Yup.

Ok, I see. The "proper" way to get rid of the warning requires too much effort, so there are doubts as to whether it is worth it.

> AFAICT, there's no real memory corruption problem here; it's just that
> the compiler treats array[1] as a single-element array instead of
> turning on whatever magic enables it to handle flexarrays (aka array[]
> or array[0]). I don't know why you'd ever want a real single-element
> array, but legacy C is fun like that. :/

Ok, I get it, but what bothers me is why I see this message on one system and not on the other. At first I thought it had to do with the fact that I explicitly set the "read-only" attribute (chattr +i) on one file (/etc/resolv.conf), but I checked that both systems have the same settings on that file. Then I thought it might be a problem with XFS, so I configured fsck to run on every boot, expecting the problem to be revealed at boot time and the message to disappear after the next reboot. But the message remains even after a reboot. So I must conclude that the warning has nothing to do with the FS and that the problem lies somewhere else.

I'm puzzled why I don't see this message on the second system, especially since I didn't see it with kernel 5.15 or with the previous linux-next (I have a different problem with these systems, so I don't run 6.0+ kernels, but I do run linux-next to see whether that problem persists).

Let me stress what worries me: why am I seeing this message on one system and not on the other? Why didn't I see it with the previous linux-next (compiled with the same compiler)? It might be related to the disks used (HDD, SATA SSD and NVMe), because on the system in question systemd gives a warning like "invalid GPT table" (or something like that, not the exact wording), even though I have repartitioned the disk.

[...]

---
Vladimir Lomov
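For reference, the call trace in the description enters the filesystem through the fsetxattr() system call (the Comm in the trace is systemd-coredum), so the code path in question is exercised whenever something sets an extended attribute on a file that lives on XFS. Below is a hedged user-space sketch of that entry point; the file path and attribute name are arbitrary choices for the example, and whether the kernel warning actually fires also depends on the kernel configuration and the inode's existing attribute format:

```
#include <fcntl.h>
#include <stdio.h>
#include <sys/xattr.h>
#include <unistd.h>

int main(int argc, char **argv)
{
        /* Path should point at a file on an XFS mount. */
        const char *path = argc > 1 ? argv[1] : "xattr-test";
        int fd = open(path, O_CREAT | O_RDWR, 0644);

        if (fd < 0) {
                perror("open");
                return 1;
        }
        /* Same syscall path the trace shows:
         * fsetxattr -> xfs_attr_set -> ... -> xfs_attr3_leaf_add_work */
        if (fsetxattr(fd, "user.demo", "value", 5, 0) < 0)
                perror("fsetxattr");
        close(fd);
        return 0;
}
```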