Bug 203827
Summary: | NFS v4.1 and v4.2 still unstable | ||
---|---|---|---|
Product: | File System | Reporter: | Slawomir Pryczek (slawek1211) |
Component: | NFS | Assignee: | bfields |
Status: | RESOLVED UNREPRODUCIBLE | ||
Severity: | high | CC: | bfields, chuck.lever, chucklever, friedrich.beckmann, jlayton, trondmy |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 5.1.6 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: | Stack trace for kernel 5.1.6 |
Description
Slawomir Pryczek
2019-06-05 22:19:56 UTC
The lock callback encoding crashed, so this is probably related to that work which went in after 4.10 (I think) and is a NFSv4.1 only feature.

If you have some debugging-fu, it would be nice to nail down the line where it crashed. In my case:

$ gdb fs/nfsd/nfsd.ko
...
(gdb) list *(nfs4_xdr_enc_cb_notify_lock+0x9a)
0x2c84a is in nfs4_xdr_enc_cb_notify_lock (fs/nfsd/nfs4callback.c:648).
643         encode_cb_sequence4args(xdr, cb, &hdr);
644
645         p = xdr_reserve_space(xdr, 4);
646         *p = cpu_to_be32(OP_CB_NOTIFY_LOCK);
647         encode_nfs_fh4(xdr, &nbl->nbl_fh);
648         encode_stateowner(xdr, &lo->lo_owner);
649         hdr.nops++;
650
651         encode_cb_nops(&hdr);
652 }

...but the offsets in your kernel may be different. My guess is that "lo" turned out to be NULL here for some reason, and that led to the crash. Maybe a refcounting bug of some sort?

If this happens again, it might be nice to get a vmcore via kdump. That might help track down the cause.

Thanks for the update, this is production unfortunately and it crashed twice very shortly after enabling it. Reproducer no longer works... will try to figure out something to be able to reproduce it some other way because unfortunately it's too dangerous and always results in downtime. Will see about adding some other operations to reproducer...

I filed a bug in debian for kernel package 4.19.0-18. The system crashes after writing to an nfs mounted filesystem.

See: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1004070

(In reply to Friedrich Beckmann from comment #4)
> I filed a bug in debian for kernel package 4.19.0-18. The system crashes
> after writing to an nfs mounted filesystem.
>
> See: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1004070

That debian bug seems to indicate a problem on the client, where FSCACHE operates. This bug (203827) is a server bug. These two issues appear to be unrelated.

(In reply to Chuck Lever from comment #5)
> That debian bug seems to indicate a problem on the client, where FSCACHE
> operates. This bug (203827) is a server bug. These two issues appear to be
> unrelated.

Thanks for looking at the bug. It is indeed on the client side. Sorry for the confusion.

No activity on the original report in a couple years, so I'm assuming this is no longer an issue. Reopen if it is.
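
For anyone revisiting the guess earlier in this thread that "lo" was NULL at nfs4callback.c:648, here is a minimal userspace sketch of that reasoning. It is not kernel code: the struct layouts, the `_sketch` type names, and both function names are hypothetical stand-ins, and placing lo_owner first in the struct is an assumption made for illustration. It shows that taking the address of a member through a NULL base yields a very low address (the member offset), so the fault would appear in whatever reads through that pointer, and what a defensive NULL check in front of the encode step might look like.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins, NOT the kernel's real definitions. */
struct stateowner_sketch {
    unsigned int so_len;
};

struct lockowner_sketch {
    struct stateowner_sketch lo_owner; /* assumed to be the first member */
    int lo_other;
};

/*
 * Address that "&lo->lo_owner" would denote for a given base address.
 * Computed with offsetof, so no invalid pointer is formed or dereferenced.
 */
uintptr_t owner_field_address(uintptr_t lo_base)
{
    return lo_base + offsetof(struct lockowner_sketch, lo_owner);
}

/*
 * Defensive variant of the encode step: refuse a NULL lock owner
 * instead of reading through it. Returns 0 on success, -1 if lo is NULL.
 */
int encode_owner_checked(const struct lockowner_sketch *lo,
                         unsigned int *out_len)
{
    if (lo == NULL)
        return -1;
    *out_len = lo->lo_owner.so_len;
    return 0;
}
```

Calling owner_field_address(0) gives the address a reader of the member would fault on under these assumed layouts. A check like the one in encode_owner_checked would only hide the symptom; if the underlying cause is a refcounting bug, as suggested above, a vmcore would still be needed to find it.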