[Beowulf] Parallel Development Tools

Bogdan Costescu Bogdan.Costescu at iwr.uni-heidelberg.de
Thu Oct 18 07:26:38 PDT 2007


On Thu, 18 Oct 2007, Robert G. Brown wrote:

> but on highly INhomogeneous hardware, slow and undependable 
> networks, and the like -- if it dies

Since this discussion is on the Beowulf list, I can agree with kickstart 
being used on INhomogeneous hardware, but not on slow and undependable 
networks; how do you intend to run the cluster later with a crappy 
network?

> and checkpointing of some sort on the script(s) that finish off the 
> system, so that if a particular package crashes the install one can 
> just remove it from the list and restart the package list install to 
> pick up where it left off and deal with the missing piece later

I think that this is a limitation in how the RPM database is updated: 
you make a transaction with a set of packages whose dependencies all 
have to be satisfied. If you want to eliminate one package you need 
to recompute the package set, as removing that one might remove many 
others pulled in through dependency trees, so it's not so easily 
checkpointable.
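
To illustrate why dropping one package forces the whole set to be 
recomputed, here is a minimal sketch in Python. It does not use RPM 
at all: the REQUIRES map, the package names in it, and the functions 
packages_lost and recompute_install_set are all hypothetical, and a 
real transaction resolves far more than simple name-to-name 
requirements.

```python
from collections import deque

# Hypothetical toy dependency map (not real RPM data):
# package name -> set of packages it requires.
REQUIRES = {
    "glibc": set(),
    "openssl": {"glibc"},
    "python": {"glibc", "openssl"},
    "mpi-lib": {"glibc"},
    "mpi-app": {"mpi-lib", "python"},
    "editor": {"glibc"},
}


def packages_lost(requires, wanted, broken):
    """Return every package that drops out of `wanted` once `broken`
    is removed, including `broken` itself."""
    # Build reverse edges: package -> wanted packages that require it.
    dependents = {name: set() for name in wanted}
    for name in wanted:
        for dep in requires[name]:
            if dep in dependents:
                dependents[dep].add(name)

    # Breadth-first walk up the reverse edges from the broken package.
    lost = {broken}
    queue = deque([broken])
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, ()):
            if parent not in lost:
                lost.add(parent)
                queue.append(parent)
    return lost


def recompute_install_set(requires, wanted, broken):
    """Return the packages that can still go into a new transaction."""
    return set(wanted) - packages_lost(requires, wanted, broken)
```

In this toy map, removing "editor" costs only that one package, while 
removing "openssl" also takes "python" and "mpi-app" with it, which 
is why the remaining list cannot simply be resumed where it left off.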

> Of course this requires a binary and configurational standard at 
> LEAST through the base install (the kernel, glibc, /etc layout, more 
> base-class libraries).

... and packaging. Each of these pieces comes from a package, and 
whatever further-install program runs later will need to deal with 
other packages and should know what is already installed. And 
that's where a big problem lies...
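
As a rough sketch of what "knowing what is installed already" could 
look like on an RPM-based system, the Python below asks the rpm 
command for installed package names and subtracts them from a wanted 
list. The function names and the wanted list are hypothetical; only 
the rpm -qa query with a --queryformat of %{NAME} is assumed to be 
available, and comparing bare names ignores versions, architectures 
and virtual provides.

```python
import subprocess


def parse_names(rpm_output):
    """Turn one-name-per-line output into a set of package names."""
    return {line.strip() for line in rpm_output.splitlines() if line.strip()}


def installed_package_names():
    """Query the local RPM database for installed package names.

    Assumes the `rpm` command is on PATH; raises CalledProcessError
    if the query fails.
    """
    result = subprocess.run(
        ["rpm", "-qa", "--queryformat", "%{NAME}\n"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_names(result.stdout)


def still_to_install(wanted, installed):
    """Return the wanted packages that are not installed yet, sorted."""
    return sorted(set(wanted) - set(installed))
```

A later install step could then call 
still_to_install(wanted, installed_package_names()) before building 
its own package set.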

-- 
Bogdan Costescu

IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: Bogdan.Costescu at IWR.Uni-Heidelberg.De


