The other day, the NFS clients at the pound stopped working correctly. Programs that use an NFS share for caching data or locking files (such as Firefox) failed without any explanation. My doggs were also unable to compile any programs, which led to a lot of barking and growling from all of them!

Looking through the logs on the client didn’t reveal anything significant, but the logs on the NFS server were filled with these:

kernel: statd: server localhost not responding, timed out
kernel: lockd: cannot monitor client
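To get a feel for how often the server is logging these messages, a small helper along these lines can count them. This is only a sketch: the function name count_lock_errors is made up for illustration, and the default log path /var/log/messages is an assumption that varies between distros.

```shell
# Count statd/lockd failure messages in a syslog-style file.
# count_lock_errors is a hypothetical helper; the default path
# /var/log/messages is an assumption, so pass your own log file if needed.
count_lock_errors() {
    logfile="${1:-/var/log/messages}"
    # -c prints the number of matching lines, -E enables alternation.
    grep -c -E 'statd: server .* not responding|lockd: cannot monitor' "$logfile"
}
```

Calling `count_lock_errors` with no argument reads the assumed default log; pass a path to check a rotated or differently named file.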

At first it seemed that the statd daemon was not functioning, but the problem persisted after restarting lockd and statd. Even restarting the server didn’t fix it. The next thought was that something was blocking communication over the loopback interface, since the localhost server wasn’t responding. After running some network tests and checking the firewall and TCP wrapper rules, I found nothing that was keeping the server from communicating with itself.
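One check that fits this stage is asking the local portmapper whether statd and lockd are registered at all. The sketch below is not from my original troubleshooting notes: it assumes the standard `rpcinfo -p` listing, where the service name is the last column (statd appears as "status", lockd as "nlockmgr"), and rpc_registered is a made-up helper name.

```shell
# Read `rpcinfo -p` output on stdin and succeed if the named RPC service
# appears in the last column. rpc_registered is a hypothetical helper.
rpc_registered() {
    awk -v svc="$1" '$NF == svc { found = 1 } END { exit !found }'
}

# Example use on the NFS server (commented out so that loading this
# snippet has no side effects):
# rpcinfo -p localhost | rpc_registered status   && echo "statd is registered"
# rpcinfo -p localhost | rpc_registered nlockmgr && echo "lockd is registered"
```

If both services show up as registered, the daemons are at least running and reachable over loopback, which points away from a network problem.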

After reading through the man page for statd and conversing with some of my doggs, I decided to try removing the statd monitor and notify lists on the NFS server. This was the key! The files in these lists had somehow become locked or corrupted. The lists are located in the directories below:

/var/lib/nfs/statd/sm/ - directory containing statd monitor list
/var/lib/nfs/statd/sm.bak/ - directory containing statd notify list

Before removing these files, you should stop the rpcbind, statd, and lockd services (the nfslock service covers statd and lockd). Below is a list of commands to run to fix this issue on an RPM-based distro.

service rpcbind stop
service nfslock stop
rm -rf /var/lib/nfs/statd/sm/*
rm -rf /var/lib/nfs/statd/sm.bak/*
service rpcbind start
service nfslock start

After running these commands, it may be best to restart your NFS server.
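If you would rather keep a copy of the lists than delete them outright, the removal step can be sketched as a function that moves the files aside. Treat this as a cautious variant rather than the exact fix I ran: stash_statd_lists and the default backup location are assumptions, and it should only be run while rpcbind and nfslock are stopped.

```shell
# Move the contents of sm/ and sm.bak/ into a backup directory instead of
# deleting them. stash_statd_lists and the default backup path are
# illustrative names, not part of nfs-utils.
stash_statd_lists() {
    statd_dir="${1:-/var/lib/nfs/statd}"
    backup_dir="${2:-/root/statd-backup-$(date +%Y%m%d%H%M%S)}"
    for sub in sm sm.bak; do
        [ -d "$statd_dir/$sub" ] || continue
        mkdir -p "$backup_dir/$sub" || return 1
        # Move every entry; the sm and sm.bak directories themselves stay in place.
        find "$statd_dir/$sub" -mindepth 1 -maxdepth 1 \
            -exec mv {} "$backup_dir/$sub/" \; || return 1
    done
    # Print where the old lists went so they can be inspected or removed later.
    echo "$backup_dir"
}
```

Run it between the two stop commands and the two start commands in place of the rm lines. Once locking works again, the backup directory can be deleted.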

Also check the permissions on these files and directories to make sure that the NFS service can access them. Here are the permissions from my NFS server:

drwx------ 4 rpcuser rpcuser 4.0K Aug  1 15:00 .
drwxr-xr-x 5 root    root    4.0K Aug  1 15:00 ..
drwx------ 2 rpcuser rpcuser 4.0K Aug  1 15:00 sm
drwx------ 2 rpcuser rpcuser 4.0K Aug  1 15:00 sm.bak
-rw-r--r-- 1 root    root       4 Aug  1 15:00 state
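Comparing a directory against the listing above can be scripted. The helper below is a rough sketch that assumes GNU stat from coreutils; check_dir is a made-up name, and the rpcuser owner and 700 mode in the usage lines are simply what my server shows, so yours may differ.

```shell
# Report whether a directory has the expected owner and octal mode.
# Relies on GNU stat (coreutils); check_dir is a hypothetical helper name.
check_dir() {
    dir="$1"; want_owner="$2"; want_mode="$3"
    # %U is the owner's user name, %a the permission bits in octal.
    actual="$(stat -c '%U %a' "$dir")" || return 2
    if [ "$actual" = "$want_owner $want_mode" ]; then
        echo "ok: $dir"
    else
        echo "mismatch: $dir is '$actual', expected '$want_owner $want_mode'"
        return 1
    fi
}

# Example use (commented out; values taken from the listing above):
# check_dir /var/lib/nfs/statd/sm     rpcuser 700
# check_dir /var/lib/nfs/statd/sm.bak rpcuser 700
```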

An NFS FAQ can also be found here: http://www.sunhelp.org/faq/nfs.html