This may be related to bug #208157. From 5.7.0 through 5.7.4, nfs-server would not start upon boot on one of my servers.

With 5.7.7 this was resolved BUT now when I reboot one of the NFS clients, or unmount and remount an NFS partition on the client, the NFS server will sometimes spontaneously reboot.

I get these messages in /var/log/dmesg.0 (I assume the old dmesg log is the relevant one, since it would have been the active log before the last boot):

[   40.302192] systemd[1]: Mounting NFSD configuration filesystem...
[   40.688313] kernel: RPC: Registered tcp NFSv4.1 backchannel transport module.
[   69.899630] kernel: NFSD: Using UMH upcall client tracking operations.
[   69.899635] kernel: NFSD: starting 90-second grace period (net f00000a8)

After the NFS server reboots I see these NFS-related messages in dmesg:

[   53.810062] systemd[1]: Mounting NFSD configuration filesystem...
[   54.254326] RPC: Registered tcp NFSv4.1 backchannel transport module.
[  106.468779] NFSD: Using UMH upcall client tracking operations.
[  106.468781] NFSD: starting 90-second grace period (net f00000a8)
[  107.631713] NFS: Registering the id_resolver key type
[  110.815312] NFS4: Couldn't follow remote path
[  113.935404] NFS4: Couldn't follow remote path
[  117.055421] NFS4: Couldn't follow remote path
[  120.175488] NFS4: Couldn't follow remote path
[  123.295611] NFS4: Couldn't follow remote path
[  126.415625] NFS4: Couldn't follow remote path
[  129.545752] NFS4: Couldn't follow remote path
[  132.655844] NFS4: Couldn't follow remote path

So pretty much the same thing, except for the "NFS4: Couldn't follow remote path" messages, which I've read are caused by an old nfs-utils not using the new system calls, so they are probably not relevant.
(In reply to Robert Dinse from comment #0)
> This may be related to bug #208157. From 5.7.0 through 5.7.4, nfs-server
> would not start upon boot on one of my servers.
>
> With 5.7.7 this was resolved BUT now when I reboot one of the NFS clients
> or unmount and remount an NFS partition on the client, the NFS server will
> sometimes spontaneously reboot.

Well, that's not good. Too bad the dmesg has nothing interesting in it. Can you capture console output to see if there are messages that aren't making it to disk before the reboot?

Do you have CONFIG_PANIC_ON_OOPS set?

Is it possible for you to build kernels between 5.7.4 and 5.7.7 to figure out where exactly the server started crashing?
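For reference, one way to check the CONFIG_PANIC_ON_OOPS setting and capture console output over the network is sketched below. This is a hedged example, not something from the report: the log-server IP address (192.0.2.10) and network interface name (eth0) are placeholders that would need to be replaced with real values, and netconsole requires the module to be built for the running kernel.

```shell
# Check whether the running kernel was built with CONFIG_PANIC_ON_OOPS
grep CONFIG_PANIC_ON_OOPS /boot/config-"$(uname -r)"

# The runtime equivalent can also be inspected/toggled via sysctl
sysctl kernel.panic_on_oops

# Send kernel console messages to a remote machine via netconsole,
# so messages that never reach disk before a crash are still captured.
# Placeholders: eth0 = local NIC, 192.0.2.10 = machine running a listener.
modprobe netconsole netconsole=@/eth0,6666@192.0.2.10/

# On the receiving machine, listen for the UDP log stream:
#   nc -u -l 6666
```

Raising the console log level (e.g. `dmesg -n 8`) on the server can help ensure oops messages are emitted before the box goes down.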
On Tue, 7 Jul 2020, bugzilla-daemon@bugzilla.kernel.org wrote:
>
> Well, that's not good. Too bad the dmesg has nothing interesting in it. Can
> you capture console output to see if there are messages that aren't making it
> to disk before the reboot?
>
> Do you have CONFIG_PANIC_ON_OOPS set?
>
> Is it possible for you to build kernels between 5.7.4 and 5.7.7 to figure out
> where exactly the server started crashing?

It started crashing at 5.7.6, and I'm not sure how to get 5.7.5 now. I don't have access to the console because the machine is 21 miles from me, and when it reboots the console is overwritten by the login screen once it comes back up.

However, I discovered another problem that may be related: nouveau is allowing the NVIDIA 210 video card to DMA without having allocated the memory, so who knows what it is randomly overwriting. So I don't know that this isn't related, but the crashes always occur when I try to mount a file system on a client. Presently I've reverted to a stock Ubuntu 5.4.0 kernel just to make sure I haven't got a hardware issue.
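For anyone else trying to reproduce the intermediate version: old stable point releases remain available even after they drop off the kernel.org front page. A sketch of fetching 5.7.5 either as a tarball or as a tag from the linux-stable git tree (paths and config workflow are illustrative; adapt to your distro's build process):

```shell
# Option 1: fetch the 5.7.5 release tarball from the kernel.org archive
wget https://cdn.kernel.org/pub/linux/kernel/v5.x/linux-5.7.5.tar.xz
tar -xf linux-5.7.5.tar.xz
cd linux-5.7.5

# Option 2: check out the v5.7.5 tag from the linux-stable git tree,
# which also allows bisecting between v5.7.4 and v5.7.7
git clone https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
cd linux-stable
git checkout v5.7.5

# Reuse the running kernel's configuration, then build
cp /boot/config-"$(uname -r)" .config
make olddefconfig
make -j"$(nproc)"
```

With the git tree, `git bisect start v5.7.7 v5.7.4` would narrow the offending commit further than testing whole point releases.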
With a recent patch applied, this appears to be completely fixed in 5.8, so I am closing this ticket.