[Beowulf] running out of rsh ports

Geoff Jacobs gdjacobs at gmail.com
Wed May 3 16:44:46 PDT 2006


Joe Landman wrote:
> Dan Stromberg wrote:
>> On Wed, 2006-05-03 at 12:16 -0500, Bruce Allen wrote:
>>> rsh typically uses ports in the 513-1023 range.  With a 640 node
>>> cluster we are running out of ports.  This leads to messages such as:
>>> "rcmd: socket: All ports in use"
>>>
>>> Are there any standard solutions to this other than 'use ssh'?  We
>>> already have net.ipv4.tcp_tw_recycle = 1 in /etc/sysctl.conf
>>
>> http://dcs.nac.uci.edu/~strombrg/loop.html
> 
> Hmmm.... we use pdsh for our customers' clusters.  Works quite well.
> Your example
> 
>     pdsh uname -a
> 
> does the same thing if it is set up right.
> 
> 
I'm presuming PDSH opens up a bank of connections to nodes, starts up
remote processes, then closes the connections. It ripples along until
all processes are started.
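
To make that presumption concrete, here is a rough Python sketch of a
bounded "ripple" launcher. It is only my guess at the general pattern, not
pdsh's actual code. The names fan_out, run_remote and window are made up
for illustration, and it assumes passwordless ssh to each node.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_remote(host, command):
    """Run one command on one host over ssh (assumes passwordless login)."""
    result = subprocess.run(
        ["ssh", host, command],
        capture_output=True,
        text=True,
    )
    return host, result.returncode, result.stdout


def fan_out(hosts, command, window=32, runner=run_remote):
    """Run command on every host with at most `window` connections open.

    The pool never has more than `window` workers, so a new connection is
    only opened once an earlier one has finished and been closed.
    Results come back in the same order as `hosts`.
    """
    with ThreadPoolExecutor(max_workers=window) as pool:
        return list(pool.map(lambda host: runner(host, command), hosts))
```

Something like fan_out(["node001", "node002"], "uname -a") would then walk
the (hypothetical) node list a window at a time, so the number of
connections open at once, and the local ports tied up by them, is capped
by the window rather than by the node count.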

With my handy, optimized infinite loop, the process is forked as a child
of my remote shell.

sshd: odin@notty
          \_ ./loop

This connection is persistent unless I choose to detach loop.
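
The attached/detached distinction can be shown with a local analogue in
Python. This is only an illustration on a single POSIX machine, not what
sshd or pdsh do internally, and launch_attached and launch_detached are
names I made up.

```python
import os
import subprocess
import sys


def launch_attached(argv):
    """Start a child in our own session and keep the handle for reaping."""
    return subprocess.Popen(argv)


def launch_detached(argv):
    """Start a child in a new session (POSIX only).

    The child becomes its own session leader with no controlling terminal,
    so a hangup on the launcher's terminal is not delivered to it. Its
    standard streams point at /dev/null so it holds nothing of ours open.
    """
    return subprocess.Popen(
        argv,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,
    )


if __name__ == "__main__":
    # Attached: the launcher waits and collects the exit status.
    child = launch_attached([sys.executable, "-c", "import sys; sys.exit(3)"])
    print("reaped, exit status", child.wait())

    # Detached: the child leads its own session.
    loose = launch_detached([sys.executable, "-c", "import time; time.sleep(1)"])
    print("own session:", os.getsid(loose.pid) == loose.pid)
    loose.wait()
```

In the attached case the launcher has to stay around to collect the exit
status, which is the persistent connection in the process tree above.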

Does PDSH leave listeners in order to reconnect and reap child processes
properly upon termination, or are they detached by necessity?

-- 
Geoffrey D. Jacobs
MORE CORE AVAILABLE, BUT NONE FOR YOU.



