UPDATE: rpcbind-0.2.0-3 was just released, and it solves all the rpcbind issues.
UPDATE2: THE BUG WAS SOLVED! Alexander Wirt patched nfs-kernel-server: by removing the linkage to libtirpc, it now uses the previous method of IPv4 binding, and thus no longer triggers the problem. When, or whether, rpcbind will replace portmap, I don't know.
[ This article describes my analysis of a problem found in Debian Unstable (sid) ]
1. The bug (link)
Since the end of December, a change in the nfs-kernel-server package caused a change of behavior in some crucial NFSv3 services: rpc.statd and rpc.mountd. NFS is RPC-based, and as such, it uses an RPC-to-UDP/TCP address translation service, aka the port mapper. These services try to connect to the port mapper when they need address translation, and since the recent change they first try to do it over IPv6.
portmap, the current widely-used RPC port mapper service, does not support IPv6. This causes these crucial services to die, and NFSv3 fails to start (actually, with the default configuration it even prevents starting an NFSv4 server).
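To see which side of the problem a given machine is on, a quick probe of the port mapper port over both address families can help. The following is only a minimal sketch: the function names are mine, it assumes the port mapper listens on the well-known port 111 on the loopback addresses, and it checks plain TCP reachability rather than speaking the RPC protocol.

```python
import socket

PORTMAPPER_PORT = 111  # well-known port used by portmap and rpcbind


def can_connect(host, port, family, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds over `family`."""
    try:
        sock = socket.socket(family, socket.SOCK_STREAM)
    except OSError:
        # The address family is not available on this machine.
        return False
    try:
        sock.settimeout(timeout)
        sock.connect((host, port))
        return True
    except OSError:
        # Refused, timed out, unreachable, etc.
        return False
    finally:
        sock.close()


def probe_port_mapper(port=PORTMAPPER_PORT):
    """Report TCP reachability of the local port mapper over IPv4 and IPv6."""
    return {
        "ipv4": can_connect("127.0.0.1", port, socket.AF_INET),
        "ipv6": can_connect("::1", port, socket.AF_INET6),
    }


if __name__ == "__main__":
    print(probe_port_mapper())
```

If the description above is right, a box running portmap would be expected to answer on IPv4 only, which is exactly the case the rpc.* services stumble over.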
2. So... rpcbind?
While quickly researching the problem, I've learned that rpcbind is a portmap alternative which does provide IPv6 support. While this is not necessarily the solution to the bug (rpc.* services could simply try IPv4 if IPv6 fails), it seems that a transition from portmap to rpcbind should be done anyway:
- rpcbind adds IPv6 support, which is apparently the future (or is it..).
- Fedora 12 (and thus the upcoming RHEL 6) has already dropped portmap in favor of rpcbind.
- rpcbind is a fork from Sun, so I assume that modern Solaris also uses rpcbind.
- rpcbind's last commit took place 4 months ago, while portmap's last commit took place 10 months ago.
These reasons might not be strong enough. I just get the feeling that the wind is blowing in the rpcbind direction, but I have no idea which code is actually better. It is, of course, possible to improve portmap instead of moving to rpcbind.
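The other way out mentioned in this section, having the rpc.* services try IPv4 when IPv6 fails, is easy to picture. Below is a rough sketch of that fallback logic, not the actual code of any of these services: the function name is made up, and it uses plain TCP sockets on the loopback addresses rather than the real RPC libraries.

```python
import socket


def connect_prefer_ipv6(port, timeout=2.0):
    """Try the IPv6 loopback first, then fall back to IPv4.

    Returns a connected socket, or raises OSError if both attempts fail.
    """
    candidates = [(socket.AF_INET6, "::1"), (socket.AF_INET, "127.0.0.1")]
    last_error = None
    for family, host in candidates:
        try:
            sock = socket.socket(family, socket.SOCK_STREAM)
        except OSError as exc:
            # Address family not supported here; move on to the next one.
            last_error = exc
            continue
        try:
            sock.settimeout(timeout)
            sock.connect((host, port))
            return sock
        except OSError as exc:
            # This family failed; remember why and try the next candidate.
            last_error = exc
            sock.close()
    raise OSError(f"port mapper unreachable on port {port}") from last_error
```

With an IPv4-only port mapper such as portmap, the first attempt fails and the second one succeeds, so the caller never dies just because IPv6 is unavailable.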
3. The current problems of rpcbind transition
- [ UPDATE: FIXED ] The package doesn't install an init script
- [ UPDATE: FIXED ] The package conflicts with the portmap package (e.g. they both listen on the same port), but isn't defined as conflicting
- Filenames conflict as well (/usr/bin/rpcinfo), but the package maintainer solved it by renaming it to /usr/bin/rpcbindinfo
- Many packages depend on portmap only. They should allow rpcbind as well, and should be tested with it.
- Actually.. for packages which cannot work with portmap anymore (such as nfs-kernel-server), if there's no plan to fix the portmap coupling, they should depend on rpcbind only now.
4. Why am I blogging this?
As always, to help people research their problem. According to popcon there are more than 10,000 Debian sid users, and that's a lot of people potentially having NFS server problems.
And, because I'm a bit worried. Well, worried is a big word, it's just a distro after all. Well, an unstable distro, even. Still, this is quite a major issue, and in the last 10 days I've hardly noticed any activity in the Debian community regarding this one. Only today was it mentioned on the mailing lists (yes, I know, I should have done this myself), and there are no replies yet. So, I hope that the smart Debian guys will check it out and find the right recipe.
5. Oh, and a quick, dirty solution until they fix it 🙂
a. aptitude remove portmap (or just stop it and remove it from the boot sequence)
b. aptitude install rpcbind (version 0.2.0-2)
c. Put an init script in /etc/init.d/rpcbind (here's my recommended script)
d. Add rpcbind to /etc/insserv.conf, in the "portmap" line (so that init scripts that depend on $portmap would also depend on rpcbind)
e. Run insserv /etc/init.d/rpcbind to reorder the init scripts.
f. Reboot (or start rpcbind and then all the rpc services).
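For step d, the edit to /etc/insserv.conf can be sketched as a small text transformation. This is only an illustration under assumptions: the helper name is hypothetical, and the sample line `$portmap portmap` is my guess at what the "portmap" line looks like, so check your own file before changing anything.

```python
def add_rpcbind_to_portmap_line(conf_text):
    """Return conf_text with 'rpcbind' appended to the $portmap line, if missing."""
    out = []
    for line in conf_text.split("\n"):
        fields = line.split()
        is_portmap_line = bool(fields) and fields[0] == "$portmap"
        # Entries may carry a leading '+' (optional dependency), so strip it
        # before comparing names.
        already_listed = any(f.lstrip("+") == "rpcbind" for f in fields[1:])
        if is_portmap_line and not already_listed:
            line = line.rstrip() + " rpcbind"
        out.append(line)
    return "\n".join(out)


# Assumed sample content, not copied from a real system.
sample = "$local_fs\t+mountall\n$portmap\tportmap\n"
print(add_rpcbind_to_portmap_line(sample), end="")
```

The function works on a string and leaves every other line untouched, so it can be tried on a copy of the file first. Running it twice gives the same result as running it once.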