Note: Sorry for any mistakes, but the "Bug Filing FAQ" produced a 404.

Latest working kernel version:
Earliest failing kernel version: earliest known: 2.6.18-92.1.1.el5
Distribution: Red Hat (also: Ubuntu)
Hardware Environment: Dell PowerEdge 2950 (dual Xeon, 8GB RAM), Dell PV220S disk storage
Software Environment: RHEL 5.2

Problem Description:

Intermittently, all NFS clients hang. The only remedy is a server reboot. This seems to happen across different distros and different kernel versions, hence the post here instead of to Red Hat's bugzilla.

Client- and server-side /var/log/messages both show:

  lockd: server <ip> not responding, timed out

Restarting nfs on the server fails at "starting nfs daemon", giving this error:

  lockd_down: lockd failed to exit, clearing pid

Doing this also leaves a second instance of [lockd] running, where before there was one. Pinging and ssh'ing to the server continue to work throughout.

The bug seems to be a kernel issue, as it has appeared across different distributions and kernel versions.

This seems to be the same problem, in Ubuntu 2.6.22:
https://bugs.launchpad.net/ubuntu/+source/linux-source-2.6.22/+bug/181996

That page contains all necessary system messages and significant debugging output, which I'm not going to re-post here.

Other, perhaps related, problems:
http://www.mail-archive.com/linux-nfs@vger.kernel.org/msg01373.html
https://bugzilla.redhat.com/show_bug.cgi?id=430160

Steps to reproduce:

According to "the.jxc" on the Launchpad bug linked above, "The failure is very regular. It happens whenever the garbage collection is performed as a result of a lock request." I can't be much more helpful than that.

Let me know if more information is needed, or if this is a duplicate of another submission (my search produced no results).
*** This bug has been marked as a duplicate of bug 10939 ***