Bug 213273
| Summary: | BUG: kernel NULL pointer dereference - nfs4_proc_lock | | |
| --- | --- | --- | --- |
| Product: | File System | Reporter: | Daire Byrne (daire) |
| Component: | NFS | Assignee: | Trond Myklebust (trondmy) |
| Status: | NEW | | |
| Severity: | normal | CC: | bfields |
| Priority: | P1 | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Kernel Version: | 5.13 | Subsystem: | |
| Regression: | No | Bisected commit-id: | |
| Attachments: | lockd: lockd server-side shouldn't set fl_ops | | |
Description
Daire Byrne
2021-05-30 10:51:58 UTC
I can reproduce this just by running connectathon tests over a 4.2 re-export of a v3 mount. I can't explain it yet. The v4.2/v4.2 case crashes too (in a different place); I can post a patch for that. In general, file locking on re-exports looks to be in pretty poor shape, but fixable....

No... file locking on re-exports is terminally broken and is not trivially fixable.

Consider the case where the re-exporting server reboots, but neither the NFS client nor the 'original' server change. How do you recover locks?

(In reply to Trond Myklebust from comment #2)
> No... file locking on re-exports is terminally broken and is not trivially
> fixable.
>
> Consider the case where the re-exporting server reboots, but neither the NFS
> client nor the 'original' server change. How do you recover locks?

I was thinking it would be worthwhile just to fix up the crashes. If we don't think "works if the re-export server never restarts" is really a supportable state, though, then maybe the safest thing for now would be to add an export_op flag or otherwise give nfsd a way to fail any locking attempts. I think we can handle recovery of the export server eventually, but it's definitely not trivial.

In our particular case, the reboot recovery limitation is acceptable as long as the re-export server doesn't crash or need to be rebooted. Our production NFS servers (RHEL) have uptimes on the order of years, after all. I do appreciate that it would be better to have a general (reboot recovery) solution, so that users didn't have to worry about caveats that may or may not be written down somewhere (the re-export wiki page only helps if you know where to look). But now that we are seeing crashes with some (rare) workloads, we can no longer even rely on our hopeful "don't reboot or crash" strategy. So, on a purely selfish level, fixing the crashes would help us out and ensure we never have to deal with the reboot recovery problem. Or at least, if we do reboot, it's part of a scheduled outage and we understand that the clients will also need rebooting.

Created attachment 297343 [details]
lockd: lockd server-side shouldn't set fl_ops
Does this (applied to kernel on the re-export server) fix the crash?
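For readers unfamiliar with the setup being discussed, here is a minimal sketch of an NFSv4.2 re-export of an NFSv3 mount of the kind described above. The hostnames (origin, reexport), paths, and the fsid value are placeholders, not details taken from this report:

```sh
# On the re-export server: mount the original server over NFSv3.
mount -t nfs -o vers=3 origin:/export /srv/reexport

# Export that mount point to the LAN clients. Re-exporting an NFS mount
# generally needs an explicit fsid= in /etc/exports, e.g.:
#   /srv/reexport  *(rw,no_subtree_check,fsid=1000)
exportfs -ra

# On a LAN client: mount the re-export over NFSv4.2.
mount -t nfs -o vers=4.2 reexport:/srv/reexport /mnt
```

The NFSv3 re-export of an NFSv4.2 mount mentioned elsewhere in the thread is the same arrangement with the version options swapped, and the crashes were triggered by taking file locks on the client-side mount.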
Thanks for that! I've put it into production now, but I should say that I have not seen the crash in a couple of days; it was at least twice a day all last week. So it may take a little while to say anything for sure, as this particular workload seems to come and go and I don't know what it is exactly.

I'm assuming that this patch fixes your reproducible connectathon crashes (NFSv4.2 re-export of an NFSv3 mount)? I was sure this was actually happening for us with an NFSv3 re-export of an NFSv4.2 client mount (the other way around), but I have just realised that the same re-export server was also re-exporting a single NFSv3 mount over NFSv4.2 too, which confused matters somewhat (well, me at least).

I've got two easily reproducible crashes: one a NULL dereference that looks like yours, whenever I take a lock on a v3 re-export of a v4.2 mount; the other a 4.2 re-export of a 4.2 mount, which I have a separate patch for. I've also got a third patch that forces reclaims after a reboot to fail in the re-export case; that seems like the safer way to handle the lack of reboot recovery support.

Thanks Bruce, these patches sound great for us. I haven't seen any issues over the weekend's production loads, but I'll continue to run with it throughout the week. I'm actually a little surprised that, after almost 2 years of using a re-export server, I've only started seeing this lock crash on a v3 re-export of a v4.2 mount recently. I know our workloads are not lock heavy (lots of unique output files per process), but there should certainly be lots of overlapping read locks as we pull common (cacheable) files from lots of clients.

I actually stopped using an NFSv4.2 re-export of an NFSv4.2 mount because I was also hitting a rare hang there. I didn't spend too much time digging into it because the NFSv3 re-export of NFSv4.2 worked fine for our purposes: we get the benefits of NFSv4.2 over the WAN (less chatter), and then NFSv3 to the local LAN clients is performant. For NFSv4.2 + v4.2 there wasn't a crash exactly; it's just that all the nfsds would get stuck (on something) and then I was left with slowpath messages:

Oct 3 19:37:56 loncloudnfscache1 kernel: INFO: task nfsd:2864 blocked for more than 122 seconds.
Oct 3 19:37:56 loncloudnfscache1 kernel: Not tainted 5.7.10-1.dneg.x86_64 #1
Oct 3 19:37:56 loncloudnfscache1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 3 19:37:56 loncloudnfscache1 kernel: nfsd D 0 2864 2 0x80004000
Oct 3 19:37:56 loncloudnfscache1 kernel: Call Trace:
Oct 3 19:37:56 loncloudnfscache1 kernel: __schedule+0x3a1/0x6f0
Oct 3 19:37:56 loncloudnfscache1 kernel: ? nfs_check_cache_invalid+0x38/0xa0 [nfs]
Oct 3 19:37:56 loncloudnfscache1 kernel: schedule+0x4f/0xc0
Oct 3 19:37:56 loncloudnfscache1 kernel: rwsem_down_write_slowpath+0x2bb/0x472
Oct 3 19:37:56 loncloudnfscache1 kernel: down_write+0x42/0x50
Oct 3 19:37:56 loncloudnfscache1 kernel: nfsd_lookup_dentry+0xba/0x420 [nfsd]
Oct 3 19:37:56 loncloudnfscache1 kernel: ? fh_verify+0x341/0x6e0 [nfsd]
Oct 3 19:37:56 loncloudnfscache1 kernel: nfsd_lookup+0x82/0x140 [nfsd]
Oct 3 19:37:56 loncloudnfscache1 kernel: nfsd4_lookup+0x1a/0x20 [nfsd]
Oct 3 19:37:56 loncloudnfscache1 kernel: nfsd4_proc_compound+0x646/0x830 [nfsd]
Oct 3 19:37:56 loncloudnfscache1 kernel: ? svc_reserve+0x40/0x50 [sunrpc]
Oct 3 19:37:56 loncloudnfscache1 kernel: nfsd_dispatch+0xc1/0x260 [nfsd]
Oct 3 19:37:56 loncloudnfscache1 kernel: svc_process_common+0x323/0x760 [sunrpc]
Oct 3 19:37:56 loncloudnfscache1 kernel: ? svc_sock_secure_port+0x16/0x40 [sunrpc]
Oct 3 19:37:56 loncloudnfscache1 kernel: ? nfsd_svc+0x360/0x360 [nfsd]
Oct 3 19:37:56 loncloudnfscache1 kernel: svc_process+0xfc/0x110 [sunrpc]
Oct 3 19:37:56 loncloudnfscache1 kernel: nfsd+0xe9/0x160 [nfsd]
Oct 3 19:37:56 loncloudnfscache1 kernel: kthread+0x105/0x140
Oct 3 19:37:56 loncloudnfscache1 kernel: ? nfsd_destroy+0x60/0x60 [nfsd]
Oct 3 19:37:56 loncloudnfscache1 kernel: ? kthread_bind+0x20/0x20
Oct 3 19:37:56 loncloudnfscache1 kernel: ret_from_fork+0x35/0x40

All the nfsd threads and clients were completely stalled until a reboot (of everything). I'd be interested to re-test with your v4.2 + v4.2 patch to see if it makes any difference.

Many thanks for your time. I know we are a bit of an oddball case, but it mostly works for us. I'd say it has even helped us win Oscars... ;)

Daire

Just a small update to say that I've not seen this crash at all this week, or any other issues for that matter. I'm running with it applied to 5.13-rc6. I hope to be able to test an NFSv4.2 re-export of an NFSv4.2 server later next week to see if that other patch helps us out there too.

Thanks again,
Daire

Thanks for the report.
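As a footnote, for anyone without the connectathon suite to hand, a single advisory lock taken on the re-exported mount should exercise the same locking path discussed above. This is only a sketch with placeholder paths, not the test the participants actually ran:

```sh
# Placeholder path on a client that has the re-export mounted at /mnt.
touch /mnt/lock-test-file

# Take and release an exclusive lock via flock(1); on a Linux NFS client
# this is typically forwarded to the (re-export) server as a lock request.
flock -x /mnt/lock-test-file -c 'echo lock acquired; sleep 1'
```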