[Beowulf] question about enforcement of scheduler use
Larry Felton Johnson
larryj at gsu.edu
Wed May 24 10:36:02 PDT 2006
On Tue, May 23, 2006 at 10:18:45AM -0400, Matt Allen wrote:
> The prologue script only runs on the mother superior node, so it can
> only alter the other nodes in the job via (probably) ssh. I think Dr.
> Weisz's script does this, although the version I've seen has "rsh"
> hard-coded. I'd check to see that the prologue script is actually
> altering limits.conf on all of the nodes, since it looks like that could
> be why you're seeing connection failures.
> I think the way you're going about this is fine; we've tried the same
> thing here at IU. In the end, we just didn't have that many problem
> users connecting directly to the compute nodes, so we abandoned the
> restriction enforcement. Our problems have more to do with orphaned MPI
> processes hanging around on nodes, so we use a script to periodically
> clean out processes owned by users who shouldn't be on the node.
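A periodic cleanup like the one Matt describes might look roughly like the sketch below. It is only an illustration, not the script used at IU: the function names, the hard-coded SYSTEM_USERS set, and the idea of passing in the set of users who currently have a job on the node are all assumptions. It reports stray processes rather than killing them.

```python
import subprocess

# Hypothetical accounts that are always allowed on a compute node.
# A real site would more likely exempt accounts by UID range.
SYSTEM_USERS = {"root", "daemon", "nobody"}


def parse_ps(output):
    """Parse the text of `ps -eo pid=,user=` into (pid, user) pairs."""
    procs = []
    for line in output.splitlines():
        fields = line.split()
        # Skip blank or malformed lines rather than failing on them.
        if len(fields) == 2 and fields[0].isdigit():
            procs.append((int(fields[0]), fields[1]))
    return procs


def stray_pids(procs, allowed_users):
    """Return PIDs whose owner is neither a system user nor in allowed_users."""
    permitted = set(allowed_users) | SYSTEM_USERS
    return [pid for pid, user in procs if user not in permitted]


def report_strays(allowed_users):
    """Dry run: list stray PIDs on this node without signalling them."""
    # Assumption: allowed_users comes from the scheduler (users with a job
    # on this node); how to obtain it is site-specific and not shown here.
    out = subprocess.run(
        ["ps", "-eo", "pid=,user="],
        capture_output=True, text=True, check=True,
    ).stdout
    return stray_pids(parse_ps(out), allowed_users)
```

Run from cron on each compute node, report_strays({"alice"}) would return the PIDs of processes owned by anyone other than alice and the exempt system accounts. Keeping the filter in stray_pids separate from the ps call makes it easy to check the logic before turning the report into an actual kill.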
I want to thank all of you for answering this question. Each of the
responses gave me a useful possible approach. Once I've finished working
through the problem, I'll summarize how I actually resolved it. In the
meantime, I just wanted to acknowledge your replies and thank you for the
help.
"I learned long ago, never to wrestle with a pig. You
get dirty, and besides, the pig likes it."
George Bernard Shaw