julien.leduc at lri.fr
Thu Jul 26 02:44:31 PDT 2007
andrew holway wrote:
> Would this mean that a users environment could never exceed the
> resources of a single node?
You can deploy your own environment on as many nodes of the cluster as
you want: you just reserve the number of nodes you want to use, deploy
your environment (meeting the needs of your experiments) on all your
nodes (or one environment on some nodes and another one on other
nodes), and then run the experiment.
This way users can use the complete cluster with their custom environment.
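
The reserve / deploy / run sequence can be sketched as a toy in-memory
model. This is only an illustration: the Cluster class, its methods and
the node and environment names are all made up here, and no real batch
scheduler or deployment tool is being called.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy in-memory model of a cluster; every name here is hypothetical."""
    free_nodes: list
    deployed: dict = field(default_factory=dict)  # node -> environment name

    def reserve(self, count):
        # Take `count` nodes out of the free pool, or fail if too few remain.
        if count > len(self.free_nodes):
            raise ValueError("not enough free nodes")
        reserved = self.free_nodes[:count]
        self.free_nodes = self.free_nodes[count:]
        return reserved

    def deploy(self, nodes, environment):
        # Record which custom environment each reserved node now runs.
        for node in nodes:
            self.deployed[node] = environment

    def release(self, nodes):
        # Forget the deployed environment and return the nodes to the pool.
        for node in nodes:
            self.deployed.pop(node, None)
            self.free_nodes.append(node)


cluster = Cluster(free_nodes=[f"node{i}" for i in range(1, 9)])
nodes = cluster.reserve(6)           # one user spans several nodes
cluster.deploy(nodes[:4], "env-a")   # one environment on some nodes
cluster.deploy(nodes[4:], "env-b")   # another one on the other nodes
```

The point of the sketch is that the environment is attached per node, so
a reservation of N nodes gives the user N nodes' worth of resources.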
> On 26/07/07, Julien Leduc <julien.leduc at lri.fr> wrote:
>> >> I'm interested in utilising the hardware to create something akin to
>> >> the sun grid or the amazon elastic computing cloud whereby the
>> >> resources available to the environment are automatically expanded and
>> >> contracted. Maybe I have the wrong end of the stick on how these
>> >> services operate.
>> > no, I think you're right on, and there's not much to it. why do you
>> > think Sun or Amazon have any special magic? beowulf clusters running
>> > multi-user queueing systems are precisely such an "elastic", "compute-
>> > on-demand" thingy, just without paying for the isolation, because such
>> > clusters are mainly motivated by performance.
>> Running a multi-user queueing system, you can have a cluster that
>> behaves like the Sun or Amazon projects: you just choose the nodes that
>> can fulfill the user's needs and requirements, fetch a VM onto those
>> chosen nodes (during the 'prolog' section of the batch scheduler), start
>> the VMs on the physical nodes, and ensure the user can log on to them or
>> fetch his data / run a passive job. Then, once finished, clean up all
>> that mess by destroying the VM, and let another user reserve the node.
>> More isolation can be achieved, if the user needs to be root on the
>> node, to run a modified version of the kernel, or run several VMs on top
>> of his environment. For that, you have to let him deploy his own
>> environment on the node.
>> This last technique ensures reproducible experiments and better
>> performance; the drawbacks are: more work on the middleware that makes
>> all that magic come
>> Combining the 2 previous techniques could help users to test their
>> OS+experimentation program in a VM and then deploy it at larger scale
>> for a true run on all the cluster(s ;) ).
>> This is a very interesting approach (at least for computer scientists),
>> and the second approach gives quite good results for the moment. The
>> combination of the 2 techniques still has to be implemented to make more
>> resources available, so that users can test their environments on many
>> virtual nodes while consuming fewer physical nodes.
>> The main problem is to be able to control the nodes remotely, with
>> hardware supporting remote reboots, remote console management...
>> Julien Leduc
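
The prolog / epilog flow from the quoted message (fetch a VM, start it,
run the job, destroy the VM) can be sketched as follows. This is a
minimal illustration under stated assumptions: vm_on_node, run_job and
the log list are hypothetical names, and the steps are only recorded as
strings rather than driving a real hypervisor or batch scheduler.

```python
from contextlib import contextmanager

log = []  # records the order of lifecycle steps for inspection


@contextmanager
def vm_on_node(node, image):
    # Prolog: fetch the VM image onto the chosen node and start it.
    log.append(f"fetch {image} on {node}")
    log.append(f"start vm on {node}")
    try:
        yield f"vm@{node}"
    finally:
        # Epilog: destroy the VM so that another user can reserve the node.
        # The finally clause runs even if the job itself fails.
        log.append(f"destroy vm on {node}")


def run_job(node, image, job):
    # The job only sees the VM between the prolog and the epilog.
    with vm_on_node(node, image) as vm:
        log.append(f"run {job} in {vm}")


run_job("node1", "user-image", "experiment")
```

Putting the cleanup in a finally clause mirrors the "clean up all that
mess" step: the node goes back to the pool whether or not the job
succeeded.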