Bug 150711
Summary: | kernel BUG at fs/attr.c:238! - when installing Slackware packages on overlayfs | ||
---|---|---|---|
Product: | File System | Reporter: | Eric Hameleers (alien) |
Component: | Other | Assignee: | fs_other |
Status: | NEW --- | ||
Severity: | high | CC: | eric, kjhambrick |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 4.4.16 | Subsystem: | |
Regression: | Yes | Bisected commit-id: | |
Attachments: | Patch fixes the reported issue. |
Description
Eric Hameleers
2016-07-29 20:22:02 UTC
More information: the same bug is triggered in a VMWare Player instance as well as on real hardware. There are several - but not all - packages in Slackware that trigger the bug. Two of them are the 'dcron' and the 'floppy' package. I tested the 'floppy' package installation by running "sh -x /sbin/installpkg /path/to/floppy*.t?z". The bug is triggered by this line in Slackware's "/sbin/installpkg" program which is meant to extract the Slackware package (a simple compressed tarball) and add a listing of the package's content to the package database (in Slackware the package database entry is a simple text file in the directory /var/log/packages): ( cd $ROOT/ ; $packagecompression -dc | $TAR -xlUpvf - | sort ) < $package >> $TMP/$shortname 2> /dev/null Looking at the package database file, the last file to be written to the file before the kernel bug triggered, is "usr/bin/fdmount" which is setuid root: -rwsr-x--- 1 root floppy 23200 Aug 2 2013 /usr/bin/fdmount* This is also the last file that was actually extracted to the filesystem, and it does have the setuid bit set. Judging by the mention of "ovl_setattr" in the trace, it is possible that the setuid bit is the troublesome bit. The other package I tested, 'dcron', also contains a setuid program: -rws--x--x 1 root root 14712 May 10 20:02 /usr/bin/crontab And in this case as well, it is the last file from that package which is written to the filesystem when the kernel bug is triggered. I verified this by comparing an unsorted listing of the tarball with the (sorted) package listing in /var/log/packages/ . Here is a post on the Slackware forum at LQ from someone who thinks he may have found the culprit : http://www.linuxquestions.org/questions/slackware-14/slackware64-live-14-2-a-4175585764/#post5583302 It's this modification: https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/diff/fs/overlayfs/inode.c?id=v4.4.16&id2=v4.4.15 Thanks Eric. I just this morning created a login here to post my $0.02 here. It looks like the mode &= S_IFMT line needs to be modified and a tad and then put it back where it was in fs/overlayfs/inode.c, upstream of the call to get_next_ino() at line 415 ... ? 'modified a tad' == maybe `or-in` another mode flag or two ? HTH. -- kjh Confirmed here on 4.4.16, and luckily, there were only three ovl commits between 4.4.15 and 4.4.16, so this was pretty easy to narrow down the offending commit. It's this one: commit c12dada5f28a4894b81df2666c060f5cecc02cf9 Author: Vivek Goyal <vgoyal@redhat.com> Date: Fri Jul 1 16:34:25 2016 -0400 ovl: Copy up underlying inode's ->i_mode to overlay inode commit 07a2daab49c549a37b5b744cbebb6e3f445f12bc upstream. I don't see why though :/ If you wish, once this is solved, a bisected-by tag is fine - use rworkman@slackware.com for the mail. We'll be happy to test potential solutions. Confirmed here on 4.4.16. I found the root cause though which has nothing to do with modifying any of the three ovl commits that went in to 4.4.16. The offending commit in 4.4.16 (c12dada5f28a4894b81df2666c060f5cecc02cf9) requires another commit that *wasn't* ported in to 4.4.16: commit b99c2d913810e56682a538c9f2394d76fca808f8 ovl: handle ATTR_KILL* Before 4bacc9c9234c ("overlayfs: Make f_path...") file->f_path pointed to the underlying file, hence suid/sgid removal on write worked fine. After that patch file->f_path pointed to the overlay file, and the file mode bits weren't copied to overlay_inode->i_mode. So the suid/sgid removal simply stopped working. The fix is to copy the mode bits, but then ovl_setattr() needs to clear ATTR_MODE to avoid the BUG() in notify_change(). So do this first, then in the next patch copy the mode. Kernel maintainer (greg) will have to port this upstream commit to 4.4.y branch. I manually did and it fixed this issue. I can confirm that this patch which I backported from the 4.6 kernel (the b99c2d913810e56682a538c9f2394d76fca808f8 commit above) fixes my issue with the 4.4.16 kernel: # --- cut here --- --- a/fs/overlayfs/inode.c 2016-07-31 15:41:55.613057608 +0200 +++ b/fs/overlayfs/inode.c 2016-07-31 15:46:24.498060788 +0200 @@ -63,6 +63,9 @@ if (!err) { upperdentry = ovl_dentry_upper(dentry); + if (attr->ia_valid & (ATTR_KILL_SUID|ATTR_KILL_SGID)) + attr->ia_valid &= ~ATTR_MODE; + mutex_lock(&upperdentry->d_inode->i_mutex); err = notify_change(upperdentry, attr, NULL); if (!err) # --- cut here --- Created attachment 227011 [details]
Patch fixes the reported issue.
This is the same patch I referred to in my earlier comment. This patch fixes my issue.
Eric Schultz -- As I said over on my LQ post, I don't understand the Kernel, but being curious ... Do the two commits from 4.6.y ( below ) appear to also belong in the 4.4.y tree along with commit: b99c2d913810e56682a538c9f2394d76fca808f8 / ovl: handle ATTR_KILL* ? The changes are both isolated to fs/overlayfs/super.c function ovl_fill_super(). And they would seem to be applicable to 4.4.y ... Thanks for your time ! -- kjh commit 429cc19dbdce4568f4d3f0da06596c1d4d559f24 Author: Vivek Goyal <vgoyal@redhat.com> Date: Fri Jul 1 10:02:44 2016 -0400 ovl: warn instead of error if d_type is not supported commit e7c0b5991dd1be7b6f6dc2b54a15a0f47b64b007 upstream. overlay needs underlying fs to support d_type. Recently I put in a patch in to detect this condition and started failing mount if underlying fs did not support d_type. But this breaks existing configurations over kernel upgrade. Those who are running docker (partially broken configuration) with xfs not supporting d_type, are surprised that after kernel upgrade docker does not run anymore. https://github.com/docker/docker/issues/22937#issuecomment-229881315 So instead of erroring out, detect broken configuration and warn about it. This should allow existing docker setups to continue working after kernel upgrade. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Fixes: 45aebeaf4f67 ("ovl: Ensure upper filesystem supports d_type") Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> commit 1cc1665689afab53364915003a945779589356b8 Author: Vivek Goyal <vgoyal@redhat.com> Date: Fri May 20 09:04:26 2016 -0400 ovl: Do d_type check only if work dir creation was successful commit 21765194cecf2e4514ad75244df459f188140a0f upstream. d_type check requires successful creation of workdir as iterates through work dir and expects work dir to be present in it. If that's not the case, this check will always return d_type not supported even if underlying filesystem might be supporting it. So don't do this check if work dir creation failed in previous step. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Konrad Hambrick -- Those two additional commits you mentioned in 4.6.y are not applicable to 4.4.y -- that is, the d_type check(s) in overlayfs were just introduced in 4.6. Funny thing is I tried to upgrade to 4.6 over a month ago and couldn't mount a overlayfs path with a reiserfs filesystem with the error that d_type wasn't supported. That commit 429cc19dbdce4568f4d3f0da06596c1d4d559f24 (ovl: warn instead of error if d_type is not supported) would have probably helped me if it existed back then. I'm still on 4.4 today but have switched to BTRFS and will jump to 4.7 soon. Eric Hameleers -- Thanks for attaching that patch, that's exactly what I modified locally and verified working before I came here to post on my findings. I shot a email to Greg before I posted here and he has accepted this patch and merged it in to 4.4.y this morning. |