infrastructure and clusters

Wed Mar 12 09:04:46 PST 2003

Hi all,

I am curious about how people solve these problems within their
organizations: we have a number of independent machines serving as
bioinformatics software servers and development boxes. They are multi-CPU
machines running Tru64 and Linux and I have been more or less ;)
successful in making them transparent to the user in terms of a unified
environment (executables are on shared NFS-mounted media, the login
scripts set up paths correctly to include the architecture-dependent
binaries etc.). Recently we got a cluster to play with and I am thinking
of using something like the Sun Grid Engine to tie everything together.
My question is whether I should leave the cluster as a separate pool
of boxes, or whether it can somehow be tied in with the rest of the
existing machines under one SGE solution. I would prefer not to allow
outgoing connections from the compute nodes (nor incoming into the nodes,
unless initiated from the master) - how would this play with SGE, since
the nodes will not have globally recognized IP addresses within the
intranet? Would a stateful IP filter help here? Or am I way off?

How do people usually solve these problems? Do they completely separate
the cluster from the rest of the production machines? I do not foresee a
lot of power users doing PVM or MPI programming or in-house
parallelization; for now, I mostly see the cluster as a process farm...

Thank you,
Ognen