RSH scaling problems
mathog at mendel.bio.caltech.edu
Wed Dec 18 09:26:19 PST 2002
Greg Lindahl wrote:
> Low ports can't be reused until TIME_WAIT time has passed.
True. Let's see what kind of a limit that imposes on rsh
command rates on a typical system - RedHat 7.3 with a few
servers and SGE running, over 100baseT. I put 100 copies
of a target node's name in a file and then did:
time rsh -zf manycopies.txt hostname
That blew up, but initially only because of the cps setting:
the rsh entry in xinetd.d had none of its own, so it picked up
the default
cps = 25 30
So I added
cps = 250 10
to /etc/xinetd.d/rsh, restarted xinetd, and tried it again,
whereupon it completed in 2.196 seconds real time. Running it
3 times in quick succession failed on the third run, and netstat
on the target showed all the ports used up. On the node running
rsh, netstat showed no TIME_WAIT connections. I think that means
the target was closing the connection before the source. After a
while (TIME_WAIT, presumably) these all dropped out of netstat and
rsh to the target started working again. Then I changed the target
file so that it listed 50 copies of target1 and 50 copies of
target2. That variation failed in the 6th iteration: each target
took half as many connections per run and it took twice as many
runs to fail, further supporting the conjecture that the limit is
on the target end. So the rate for outgoing rsh from a given node
seems not to be limited (at least by this effect) but the incoming
rate to a node is limited.
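For anyone who wants to watch the same thing on their own target
node, here is a minimal Python sketch that counts TIME_WAIT entries
in netstat output. It assumes the usual Linux "netstat -tan" layout
with the state in the sixth column. The function name and the sample
lines are made up for illustration; they are not output from the
test above.

```python
def count_time_wait(netstat_text):
    """Count TCP entries whose state column reads TIME_WAIT."""
    count = 0
    for line in netstat_text.splitlines():
        fields = line.split()
        # Assumed layout: proto recv-q send-q local remote state
        if (len(fields) >= 6
                and fields[0].startswith("tcp")
                and fields[5] == "TIME_WAIT"):
            count += 1
    return count


# Made-up sample in the assumed format, not real output.
SAMPLE = """\
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:514             0.0.0.0:*               LISTEN
tcp        0      0 10.0.0.2:514            10.0.0.1:1023           TIME_WAIT
tcp        0      0 10.0.0.2:1022           10.0.0.1:1021           TIME_WAIT
tcp        0      0 10.0.0.2:22             10.0.0.1:40000          ESTABLISHED
"""

if __name__ == "__main__":
    print(count_time_wait(SAMPLE))
```

On a live node the text would come from running netstat on the
target between bursts of rsh commands, to see the count climb and
then drain.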
It jams up when about 290 ports are stuck in TIME_WAIT. TIME_WAIT
on Linux is 60 seconds (I think). So the average sustainable
rate of incoming rsh (or rlogin, or rcp) commands is about 290/60,
or just less than 5 per second. Setting cps to 250 is overly
optimistic as well if all the rsh connections come from one source,
since the fastest that rsh can send them (my modified version, which
basically runs rcmd() in a loop) is only about 50/second.
This was over 100baseT; maybe you can go higher with Myrinet.
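The arithmetic can be written out as a short Python sketch. The
inputs are the figures quoted above (290 ports, a 60 second
TIME_WAIT that is itself a guess, and about 50 commands/second);
the function names are only illustrative. The second function
assumes no port is released during the burst, which holds here
because the burst is much shorter than TIME_WAIT.

```python
def sustainable_rate(ports_available, time_wait_seconds):
    """Long-run average of incoming commands per second."""
    return ports_available / time_wait_seconds


def seconds_until_jammed(ports_available, send_rate_per_second):
    """How long a flat-out burst lasts before the ports run out."""
    return ports_available / send_rate_per_second


PORTS = 290      # ports stuck in TIME_WAIT when it jammed
TIME_WAIT = 60   # seconds, the author's estimate for Linux
SEND_RATE = 50   # commands/second from the modified rsh

if __name__ == "__main__":
    print(f"{sustainable_rate(PORTS, TIME_WAIT):.2f} per second")
    print(f"{seconds_until_jammed(PORTS, SEND_RATE):.1f} seconds")
```

So a sender running flat out would use up the target's ports in
under 6 seconds and then has to wait for TIME_WAIT to expire.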
Which means, I suppose, that if you want to fire a lot of commands
from one machine to another, putting rsh inside a loop is a bad idea.
Better to start up one rsh, leave it running, and pipe the commands
through it to some target process which runs them on the other end
without dropping the connection between commands.
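A minimal Python sketch of that idea follows. It assumes the
"target process" is simply a shell on the far end reading commands
from its standard input, started by something like
rsh target1 sh (target1 being a hypothetical host name). The
function name is made up, and a local sh stands in for the remote
one so that the example runs without any rsh setup.

```python
import subprocess


def run_over_one_connection(commands, remote_argv):
    """Send every command down the stdin of a single process.

    remote_argv is the command that opens the one connection, for
    example ["rsh", "target1", "sh"] (hypothetical host name).
    """
    script = "".join(cmd + "\n" for cmd in commands)
    result = subprocess.run(
        remote_argv,
        input=script,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    # A local "sh" stands in for the remote shell here.
    print(run_over_one_connection(["echo one", "echo two"], ["sh"]))
```

However many commands are in the list, only one connection is
opened on the target. Note that this sketch sends the commands as
one batch and only sees the exit status of the shell itself, so
per-command error handling would need something more elaborate.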
ANYWAY, going back to the original post by Mike Galicki, he should
check that the xinetd cps value (or equivalent, if it isn't Linux)
isn't setting the upper limit. Possibly he can get more
throughput by raising it. Failing that, perhaps one of the other
MPI devices keeps a line open all the time and so bypasses
this limit entirely?
mathog at caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech