[Beowulf] automount on high ports

Perry E. Metzger perry at piermont.com
Wed Jul 2 07:26:13 PDT 2008


Skip to the bottom for advice on how to make NFS use only
non-privileged ports. My guess is still that it isn't privileged ports
that are causing the trouble, but at the bottom I describe what you
need to do to get rid of that issue entirely. I'd advise reading the
rest, but the part about how to disable the behavior is after the ---
near the bottom.

Carsten Aulbert <carsten.aulbert at aei.mpg.de> writes:
> Well, I understand your reasoning, but that's contradicted by what we do see
>
> netstat -an|awk '/2049/ {print $4}'|sed 's/10.10.13.41://'|sort -n
>
> shows us the following:

Are those all mounts to ONE HOST? Because if they are, you're going to
run out of ports. If you're connecting to multiple hosts you should be
okay, but you certainly could run out of ports between two hosts -- a
given host has only 1023 privileged ports, so at most 1023 connections
to a single port on another box.
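
If you want to check, something along these lines should work (a
sketch; the field numbers assume Linux "netstat -an" output like the
pipeline you posted, with the foreign address in column 5). It groups
the established port-2049 connections by server address:

  netstat -an | awk '$5 ~ /:2049$/ { sub(/:2049$/, "", $5); print $5 }' |
      sort | uniq -c | sort -rn

If all 358 connections pile up under a single server address, that's
your answer.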

Of course, one might validly ask why the other 650-odd ports aren't
usable -- clearly they should be, right? The limit is 1023, not
358. It may be that there is some Linux oddness here.

Anyway, this shouldn't be a problem if you're connecting to MANY
servers. See below.

> Which corresponds exactly to the maximum achievable mounts of 358 right
> now. Besides, I'm far from being an expert on TCP/IP, but is it possible
> for a local process to bind to a port which is already in use but to
> another host?

Of course! You can use the same local port number with connections to
different remote hosts. You can even use the same local port number
with multiple connections to the same remote host provided the remote
host is using different port numbers on its end.

Every open connection is identified by a 4-tuple:
localip:localport:remoteip:remoteport. Provided two sockets don't
share that entire 4-tuple, you can have both.

Now, a given OS may screw up how it handles this, but the *protocol*
certainly permits it. Perhaps you're right and Linux isn't dealing
with this gracefully. We can check that.
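
One quick way to check (again a sketch assuming Linux "netstat -an"
column layout): list the local ports used by the port-2049 connections
and see whether any local port is being reused against different
servers:

  netstat -an | awk '$5 ~ /:2049$/ { sub(/.*:/, "", $4); print $4 }' |
      sort -n | uniq -c | awk '$1 > 1'

If nothing comes back, every mount is burning its own local port even
across different servers, which would point at the kernel rather than
the protocol.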

> I don't think so, but may be wrong.

Then how does an SMTP server handle thousands of simultaneous
connections all coming to port 25? :)

In any case, this is what the NFS FAQ says. It does mention the priv
port problem, but only in a context that makes me think it is talking
about two given hosts and not one client and many hosts. However, I
might be wrong. See below:

From http://nfs.sourceforge.net/

  B3. Why can't I mount more than 255 NFS file systems on my client?
  Why is it sometimes even less than 255?

    A. On Linux, each mounted file system is assigned a major number,
    which indicates what file system type it is (eg. ext3, nfs,
    isofs); and a minor number, which makes it unique among the file
    systems of the same type. In kernels prior to 2.6, Linux major and
    minor numbers have only 8 bits, so they may range numerically from
    zero to 255. Because a minor number has only 8 bits, a system can
    mount only 255 file systems of the same type. So a system can
    mount up to 255 NFS file systems, another 255 ext3 file systems,
    255 more isofs file systems, and so on. Kernels after 2.6 have
    20-bit wide minor numbers, which alleviate this restriction.

    For the Linux NFS client, however, the problem is somewhat worse
    because it is an anonymous file system. Local disk-based file
    systems have a block device associated with them, but anonymous
    file systems do not. /proc, for example, is an anonymous file
    system, and so are other network file systems like AFS. All
    anonymous file systems share the same major number, so there can
    be a maximum of only 255 anonymous file systems mounted on a
    single host.

    Usually you won't need more than ten or twenty total NFS mounts on
    any given client. In some large enterprises, though, your work and
    users might be spread across hundreds of NFS file servers. To work
    around the limitation on the number of NFS file systems you can
    mount on a single host, we recommend that you set up and run one
    of the automounter daemons for Linux. An automounter finds and
    mounts file systems as they are needed, and unmounts any that it
    finds are inactive. You can find more information on Linux
    automounters here.

    You may also run into a limit on the number of privileged network
    ports on your system. The NFS client uses a unique socket with its
    own port number for each NFS mount point. Using an automounter
    helps address the limited number of available ports by
    automatically unmounting file systems that are not in use, thus
    freeing their network ports. NFS version 4 support in the Linux
    NFS client uses a single socket per client-server pair, which also
    helps increase the allowable number of NFS mount points on a
    client.

Now, until you brought this up, I would have guessed that this meant
you could run out of priv ports between host A and host B -- i.e. host
B is the client, connecting to one port on host A, and it fails trying
to mount more than 1023 file systems from host A because it runs out
of priv ports. However, if your test is not between two hosts but
rather between multiple hosts, perhaps for whatever reason Linux is
braindead and is not allowing you to re-use the same local socket
ports. We can diagnose that later.
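
Since the FAQ recommends an automounter: a minimal autofs setup looks
something like this (a sketch; the map file names, mount point, server
names, and paths here are made up for illustration):

  # /etc/auto.master -- hand /data over to the indirect map below,
  # unmounting anything idle for 60 seconds
  /data  /etc/auto.data  --timeout=60

  # /etc/auto.data -- one key per automounted file system
  node41  nfsserver41:/export/data
  node42  nfsserver42:/export/data

With a short timeout, idle mounts get dropped, and their sockets --
and thus their privileged ports -- go back into the pool.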

---

So, here are the things you need to do to take privileged ports out of
the picture entirely:

1) On the server, in your exports file you have to put the "insecure"
   option on every exported file system; otherwise mountd will demand
   that the remote side use a "secure" (privileged) source port. You've
   already done this according to the initial mail message. However,
   that only tells the server not to care if the client comes in from a
   port at or above 1024 (see the exports sketch after this list).
2) The client side is where the action is -- the client picks the port
   it opens, after all. Unfortunately, Linux DOES NOT have a mount
   option for this. BSD, Solaris, etc. do, but not Linux. You need to
   hack the source to make it happen.

   On a reasonably current source tree, go to:
     /usr/src/linux/fs/nfs/mount_clnt.c
   and look for the argument structure being built for rpc_create. You
   need to OR RPC_CLNT_CREATE_NONPRIVPORT into the .flags member, as
   in (for example; depending on your version -- this is 2.6.24):
         .flags          = RPC_CLNT_CREATE_INTR,
   to
         .flags          = RPC_CLNT_CREATE_INTR | RPC_CLNT_CREATE_NONPRIVPORT,

   This is a bloody ugly hack that will make ALL connections use
   non-privileged ports, so you might have trouble with "normal"
   mounts. This can be done more cleanly, but it would require more
   than a one-line patch. However, it would get you through testing. If
   it works for you and you really need it, a clean mount option could
   be added.
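
For reference, the server side of step 1 looks something like this (a
sketch; the export path and network here are made up for illustration):

  # /etc/exports -- "insecure" allows client source ports >= 1024
  /export/data  10.10.0.0/16(rw,insecure,no_subtree_check)

Once the client is patched and remounted, the same sort of netstat
pipeline as before will confirm it is really using high ports:

  netstat -an | awk '$5 ~ /:2049$/ { sub(/.*:/, "", $4); print $4 }' | sort -n

If the hack took, the local ports listed should all be 1024 or above.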

My guess is that this is not your problem! However, you can check and
see if I'm wrong, and if I am, then we can move on to fixing it
properly.

Perry
-- 
Perry E. Metzger		perry at piermont.com


