[Beowulf] Most common cluster management software, job schedulers, etc?

Kenneth Hoste kenneth.hoste at ugent.be
Mon Mar 7 23:46:21 PST 2016



On 08/03/16 05:43, Jeff Friedman wrote:
> Hello all. I am just entering the HPC Sales Engineering role, and 
> would like to focus my learning on the most relevant stuff. I have 
> searched near and far for a current survey of some sort listing the 
> top used “stacks”, but cannot seem to find one that is free.

https://sites.google.com/site/smallhpc/home, see "Files & Documents" section

> I was breaking things down similar to this:
>
> _OS distro_:  CentOS, Debian, TOSS, etc?  I know some come trimmed 
> down, and also include specific HPC libraries, like CNL, CNK, INK?
>
> _MPI options_: MPICH2, MVAPICH2, Open MPI, Intel MPI, ?
>
> _Provisioning software_: Cobbler, Warewulf, xCAT, Openstack, Platform 
> HPC, ?
>
> _Configuration management_: Warewulf, Puppet, Chef, Ansible, ?

xCAT

Some sites (incl. the one I'm at) use Quattor (http://www.quattor.org/)
>
> _Resource and job schedulers_: I think these are basically the same thing?

Not really, although there's some overlap.

Resource managers like Torque (PBS) are in charge of ... managing the 
resources.

Job schedulers (e.g. Maui, Moab) decide which job gets to start next, 
and talk to the resource manager (i.e. they poll it for available 
resources, tell it which job to start next, etc.).
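
To make the split concrete, here's a tiny, purely illustrative Python 
sketch of that division of labour (the class names and the trivial 
"first job that fits" policy are made up for this example; Torque, 
Maui/Moab, Slurm, etc. are of course far more involved):

# Illustrative sketch only, not how any real RM/scheduler is implemented.

class ResourceManager:
    """Tracks nodes and starts whatever job it is told to start."""
    def __init__(self, total_nodes):
        self.free_nodes = total_nodes

    def available_nodes(self):
        return self.free_nodes

    def start_job(self, job):
        self.free_nodes -= job["nodes"]
        print(f"starting {job['name']} on {job['nodes']} node(s)")

class Scheduler:
    """Owns the policy: decides which queued job goes next."""
    def __init__(self, rm):
        self.rm = rm
        self.queue = []

    def submit(self, job):
        self.queue.append(job)

    def schedule(self):
        # Trivial policy: start the first queued job that fits; real
        # schedulers add priorities, fairshare, backfill, reservations, ...
        for job in list(self.queue):
            if job["nodes"] <= self.rm.available_nodes():  # poll the RM
                self.queue.remove(job)
                self.rm.start_job(job)                     # instruct the RM

rm = ResourceManager(total_nodes=4)
sched = Scheduler(rm)
sched.submit({"name": "job1", "nodes": 2})
sched.submit({"name": "job2", "nodes": 4})
sched.submit({"name": "job3", "nodes": 1})
sched.schedule()  # job1 and job3 start; job2 waits until nodes free up

The point is simply that the resource manager only knows about nodes and 
running jobs, while all the "which job next" policy lives in the scheduler 
sitting on top of it.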

> Torque, Lava, Maui, Moab, SLURM, Grid Engine, Son of Grid Engine, 
> Univa, Platform LSF, etc… others?
>
> _Shared filesystems_: NFS, pNFS, Lustre, GPFS, PVFS2, GlusterFS, ?
>
> _Library management_: Lmod, ?

EasyBuild

(disclaimer: I'm the lead developer)

>
> _Performance monitoring_: Ganglia, Nagios, ?
>
> _Cluster management toolkits_: I believe these perform many of the 
> functions above, all wrapped up in one tool?  Rocks, Oscar, Scyld, 
> Bright, ?
>
>
> Does anyone have any observations as to which of the above are the 
> most common?  Or is that too broad?  I believe most of the clusters I 
> will be involved with will be in the 128 - 2000 core range, all on 
> commodity hardware.

See the "Small HPC Centers" survey I pointed to above.


K.
>
> Thank you!
>
> - Jeff
>
