1. Apply diff and compile kernel 2. useradd freg 3. ./test.sh [ 334.663558] overlayfs: get inode, wait drop cache, dir [ 334.688095] destroy inode [ 335.664586] overlayfs: wait done [ 335.664932] overlayfs: XXXX inodestate 60 unhash 1 upper dentry ffff88817b64c600 dentry->inode 0000000000000000 [ 335.664979] BUG: kernel NULL pointer dereference, address: 0000000000000002 [ 335.666508] #PF: supervisor read access in kernel mode [ 335.666965] #PF: error_code(0x0000) - not-present page [ 335.667429] PGD 0 P4D 0 [ 335.667673] Oops: 0000 [#1] PREEMPT SMP [ 335.668032] CPU: 0 PID: 2592 Comm: ls Not tainted 6.3.0-12065-g3fafea7f3ad0-dirty #1221 [ 335.668752] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproje4 [ 335.669956] RIP: 0010:inode_permission+0x33/0x2c0 [ 335.670392] Code: 41 56 41 55 41 89 d5 41 54 41 89 d4 55 48 89 fd 53 48 89 f3 48 83 05 8b 1f f2 0c 01 41 83 e4 02 75 6c 48 8a [ 335.672101] RSP: 0018:ffffc9000190bba8 EFLAGS: 00010206 [ 335.672578] RAX: ffff888103e32520 RBX: 0000000000000000 RCX: 0000000000000027 [ 335.673216] RDX: 0000000000000081 RSI: 0000000000000000 RDI: ffffffff8a4aff50 [ 335.673868] RBP: ffffffff8a4aff50 R08: ffff88817a97cf00 R09: ffffc9000190ba78 [ 335.674509] R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000 [ 335.675145] R13: 0000000000000081 R14: ffff88817b73e0f8 R15: ffffffff8a4aff50 [ 335.675789] FS: 00007fa2bf908540(0000) GS:ffff88842fc00000(0000) knlGS:0000000000000000 [ 335.676513] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 335.677028] CR2: 0000000000000002 CR3: 0000000173508000 CR4: 00000000000006f0 [ 335.677675] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 335.678308] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 335.678939] Call Trace: [ 335.679165] <TASK> [ 335.679371] ovl_permission+0xde/0x320 [ 335.679723] inode_permission+0x15e/0x2c0 [ 335.680090] link_path_walk+0x115/0x550 [ 335.680443] ? path_init+0x3b1/0x560 [ 335.680771] path_lookupat.isra.0+0xb2/0x200 [ 335.681170] filename_lookup+0xda/0x240 [ 335.681540] ? __create_object+0x2b3/0x5c0 [ 335.681922] vfs_statx+0xa6/0x1f0 [ 335.682233] vfs_fstatat+0x7b/0xb0 [ 335.682556] __do_sys_newlstat+0x3b/0x90 [ 335.682923] ? exit_to_user_mode_prepare+0x130/0x340 [ 335.683380] __x64_sys_newlstat+0x1e/0x30 [ 335.683758] do_syscall_64+0x39/0x80 [ 335.684103] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 335.684568] RIP: 0033:0x7fa2beaffa75 [ 335.684910] Code: 19 f4 2c 00 64 c7 00 16 00 00 00 b8 ff ff ff ff c3 0f 1f 40 00 83 ff 01 48 89 f0 77 30 48 89 c7 48 89 d6 b9 [ 335.686562] RSP: 002b:00007ffedf0503e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000006 [ 335.687243] RAX: ffffffffffffffda RBX: 00007ffedf0524d0 RCX: 00007fa2beaffa75 [ 335.687887] RDX: 0000559a4ec6b150 RSI: 0000559a4ec6b150 RDI: 00007ffedf0524d0 [ 335.688529] RBP: 00007ffedf050790 R08: 0000000000000004 R09: 00007ffedf0524e7 [ 335.689176] R10: 00000000df052400 R11: 0000000000000246 R12: 00007ffedf0524d0 [ 335.689828] R13: 0000000000000000 R14: 0000559a4ec6b140 R15: 0000000000000000 [ 335.690471] </TASK> [ 335.690680] Modules linked in: [ 335.690975] CR2: 0000000000000002 [ 335.691304] ---[ end trace 0000000000000000 ]--- [ 335.691743] RIP: 0010:inode_permission+0x33/0x2c0 [ 335.692175] Code: 41 56 41 55 41 89 d5 41 54 41 89 d4 55 48 89 fd 53 48 89 f3 48 83 05 8b 1f f2 0c 01 41 83 e4 02 75 6c 48 8a [ 335.693858] RSP: 0018:ffffc9000190bba8 EFLAGS: 00010206 [ 335.694330] RAX: ffff888103e32520 RBX: 0000000000000000 RCX: 0000000000000027 [ 335.694992] RDX: 0000000000000081 RSI: 0000000000000000 RDI: ffffffff8a4aff50 [ 335.695649] RBP: ffffffff8a4aff50 R08: ffff88817a97cf00 R09: ffffc9000190ba78 [ 335.696287] R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000 [ 335.696944] R13: 0000000000000081 R14: ffff88817b73e0f8 R15: ffffffff8a4aff50 [ 335.697619] FS: 00007fa2bf908540(0000) GS:ffff88842fc00000(0000) knlGS:0000000000000000 [ 335.698341] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 335.698881] CR2: 0000000000000002 CR3: 0000000173508000 CR4: 00000000000006f0 [ 335.699524] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 335.700173] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 335.700829] Kernel panic - not syncing: Fatal exception [ 335.701374] Kernel Offset: disabled [ 335.701720] ---[ end Kernel panic - not syncing: Fatal exception ]---
Created attachment 304220 [details] test.sh
Created attachment 304221 [details] diff
I am also facing this issue and I am checking if dentry is in KILLED state to fix the issue. I see it makes sense to check this instead of inode NULL check, comment?
(In reply to Bhaskar Valaboju from comment #3) > I am also facing this issue and I am checking if dentry is in KILLED state > to fix the issue. I see it makes sense to check this instead of inode NULL > check, comment? Do you judge the KILLED state by checking d_unhashed(dentry)? When do the check? Could you please sendout the fix code? Thanks.
Please find below the change I am testing: diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c index 541cf3717fc2..69c481cc820f 100644 --- a/fs/overlayfs/inode.c +++ b/fs/overlayfs/inode.c @@ -289,7 +289,7 @@ int ovl_permission(struct mnt_idmap *idmap, /* Careful in RCU walk mode */ ovl_i_path_real(inode, &realpath); - if (!realpath.dentry) { + if (!realpath.dentry || realpath.dentry->d_flags & DCACHE_DENTRY_KILLED) { WARN_ON(!(mask & MAY_NOT_BLOCK)); return -ECHILD; } @@ -402,6 +402,9 @@ int ovl_xattr_get(struct dentry *dentry, struct inode *inode, const char *name, struct path realpath; ovl_i_path_real(inode, &realpath); + if (!realpath.dentry || realpath.dentry->d_flags & DCACHE_DENTRY_KILLED) + return -ECHILD; + old_cred = ovl_override_creds(dentry->d_sb); res = vfs_getxattr(mnt_idmap(realpath.mnt), realpath.dentry, name, value, size); revert_creds(old_cred); @@ -568,7 +571,7 @@ struct posix_acl *do_ovl_get_acl(struct mnt_idmap *idmap, /* Careful in RCU walk mode */ ovl_i_path_real(inode, &realpath); - if (!realpath.dentry) { + if (!realpath.dentry || realpath.dentry->d_flags & DCACHE_DENTRY_KILLED) { WARN_ON(!rcu); return ERR_PTR(-ECHILD); }
(In reply to Bhaskar Valaboju from comment #5) > Please find below the change I am testing: > > diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c > index 541cf3717fc2..69c481cc820f 100644 > --- a/fs/overlayfs/inode.c > +++ b/fs/overlayfs/inode.c > @@ -289,7 +289,7 @@ int ovl_permission(struct mnt_idmap *idmap, > > /* Careful in RCU walk mode */ > ovl_i_path_real(inode, &realpath); > - if (!realpath.dentry) { > + if (!realpath.dentry || realpath.dentry->d_flags & > DCACHE_DENTRY_KILLED) { > WARN_ON(!(mask & MAY_NOT_BLOCK)); > return -ECHILD; > } We'd better hold realpath.dentry->d_lock to check 'd_flags' and get 'd_inode', otherwise: P1 P2 ovl_permission if (realpath.dentry->d_flags & DCACHE_DENTRY_KILLED) // false drop_cache __dentry_kill dentry_unlist dentry->d_flags |= DCACHE_DENTRY_KILLED dentry_unlink_inode dentry->d_inode = NULL realinode = d_inode(realpath.dentry) // NULL Am I understanding right? > @@ -402,6 +402,9 @@ int ovl_xattr_get(struct dentry *dentry, struct inode > *inode, const char *name, > struct path realpath; > > ovl_i_path_real(inode, &realpath); > + if (!realpath.dentry || realpath.dentry->d_flags & > DCACHE_DENTRY_KILLED) > + return -ECHILD; > + > old_cred = ovl_override_creds(dentry->d_sb); > res = vfs_getxattr(mnt_idmap(realpath.mnt), realpath.dentry, name, > value, size); > revert_creds(old_cred); BTW, path_getxattr user_path_at(path) // hold refcnt for path.dentry getxattr(path.dentry) vfs_getxattr ovl_xattr_get // realpath.dentry cannot be killed path_put(&path) Are there any other paths to trigger null-ptr-deference in ovl_xattr_get()? > @@ -568,7 +571,7 @@ struct posix_acl *do_ovl_get_acl(struct mnt_idmap *idmap, > > /* Careful in RCU walk mode */ > ovl_i_path_real(inode, &realpath); > - if (!realpath.dentry) { > + if (!realpath.dentry || realpath.dentry->d_flags & > DCACHE_DENTRY_KILLED) { > WARN_ON(!rcu); > return ERR_PTR(-ECHILD); > }
(In reply to Zhihao Cheng from comment #6) > (In reply to Bhaskar Valaboju from comment #5) > > Please find below the change I am testing: > > > > diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c > > index 541cf3717fc2..69c481cc820f 100644 > > --- a/fs/overlayfs/inode.c > > +++ b/fs/overlayfs/inode.c > > @@ -289,7 +289,7 @@ int ovl_permission(struct mnt_idmap *idmap, > > > > /* Careful in RCU walk mode */ > > ovl_i_path_real(inode, &realpath); > > - if (!realpath.dentry) { > > + if (!realpath.dentry || realpath.dentry->d_flags & > > DCACHE_DENTRY_KILLED) { > > WARN_ON(!(mask & MAY_NOT_BLOCK)); > > return -ECHILD; > > } > > We'd better hold realpath.dentry->d_lock to check 'd_flags' and get > 'd_inode', otherwise: > P1 P2 > ovl_permission > if (realpath.dentry->d_flags & DCACHE_DENTRY_KILLED) // false > drop_cache > __dentry_kill > dentry_unlist > dentry->d_flags |= DCACHE_DENTRY_KILLED > dentry_unlink_inode > dentry->d_inode = NULL > realinode = d_inode(realpath.dentry) // NULL > > Am I understanding right? Makes sense. > > > @@ -402,6 +402,9 @@ int ovl_xattr_get(struct dentry *dentry, struct inode > > *inode, const char *name, > > struct path realpath; > > > > ovl_i_path_real(inode, &realpath); > > + if (!realpath.dentry || realpath.dentry->d_flags & > > DCACHE_DENTRY_KILLED) > > + return -ECHILD; > > + > > old_cred = ovl_override_creds(dentry->d_sb); > > res = vfs_getxattr(mnt_idmap(realpath.mnt), realpath.dentry, name, > > value, size); > > revert_creds(old_cred); > > BTW, > path_getxattr > user_path_at(path) // hold refcnt for path.dentry > getxattr(path.dentry) > vfs_getxattr > ovl_xattr_get // realpath.dentry cannot be killed > path_put(&path) > > Are there any other paths to trigger null-ptr-deference in ovl_xattr_get()? > There are no other parts, we can ignore this change. > > @@ -568,7 +571,7 @@ struct posix_acl *do_ovl_get_acl(struct mnt_idmap > *idmap, > > > > /* Careful in RCU walk mode */ > > ovl_i_path_real(inode, &realpath); > > - if (!realpath.dentry) { > > + if (!realpath.dentry || realpath.dentry->d_flags & > > DCACHE_DENTRY_KILLED) { > > WARN_ON(!rcu); > > return ERR_PTR(-ECHILD); > > }