[Beowulf] 2 starting questions on how I should proceed for a correct first micro-cluster (2-nodes) building

Sun Mar 3 08:22:33 PST 2019

One thing to remember: a cluster is often defined by its environment
and goals.

1. HPC-type clusters, as discussed on this list, operate in
a particular way (cluster nodes are provisioned in a reproducible
way, a job scheduler provides dynamic resources
to users, and clusters are optimized for processing and communication
speed).

2. Things like Kafka operate differently, and I assume you are talking
about creating Kafka consumers in HPX for some kind of workflow.
In general, something like Kafka manages its own work and
job distribution (even fault tolerance) and can span multiple
data centers.

I assume that you are more interested in situation #2. In this
case, much of the machinery used in case #1 is not needed.
However, running parallel HPX jobs will take some resource management,
and I don't know enough about Kafka to comment on how to do this.
I assume you are setting up a static, single-application
cluster environment.

Provisioning is another issue. Some of the HPC cluster provisioning
tools may help get nodes set up easily and reproducibly
(Google the Warewulf Toolkit). In the "analytics" world, provisioning
is very different: tools like Ambari are used, but that may be overkill
for what you are trying to do. And, of course, many of the distributed
Apache analytics-type tools have their own cluster install options
and recipes (e.g. setting up a standalone Spark cluster).
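
As a rough illustration of the "reproducible" part (a sketch only, not
taken from Warewulf or any other toolkit): once each node has reported
its OS release string, for example the PRETTY_NAME value from
/etc/os-release, a few lines of Python can flag nodes that differ from
the head node. The host names node01/node02, the release strings, and
the helper find_drifted_nodes are made up for this example, and how the
strings get collected (ssh or otherwise) is left out.

```python
def find_drifted_nodes(reports, reference_host):
    """Return {host: release} for nodes whose release differs from the reference node.

    `reports` maps a host name to the OS release string that node reported.
    `reference_host` is the node every other node is compared against.
    """
    reference = reports[reference_host]
    return {host: rel for host, rel in reports.items() if rel != reference}


# Hypothetical two-node cluster; host names and release strings are illustrative.
reports = {
    "node01": "Ubuntu 18.04.1 LTS",
    "node02": "Ubuntu 18.04.2 LTS",
}
drifted = find_drifted_nodes(reports, "node01")
for host, release in drifted.items():
    print(f"{host} reports {release!r}, expected {reports['node01']!r}")
```

An empty result means every node matches the head node, which is the
state a provisioning tool is supposed to keep you in.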

Hope that helps a bit.

--
Doug

> Hi all,
>
> I'm developing an application which needs to use tools and other
> applications that excel in a distributed environment:
> - HPX ( https://github.com/STEllAR-GROUP/hpx ) ,
> - Kafka ( http://kafka.apache.org/ )
> - a blockchain tool.
> This is why I'm eager to learn how to deploy a beowulf cluster.
>
> I've read some info here:
> - https://en.wikibooks.org/wiki/Building_a_Beowulf_Cluster
> - https://www.linux.com/blog/building-beowulf-cluster-just-13-steps
> - https://www-users.cs.york.ac.uk/~mjf/pi_cluster/src/Building_a_simple_Beowulf_cluster.html
>
> And I have 2 starting questions in order to clarify how I should proceed
> for a correct cluster building:
>
> 1) My starting point is the PC I'm working with at the moment, with these
> features:
>   - Corsair SIMM RAM memory, DDR3, PC1600, 32GB, CL10 Ven k
>   - Intel Ci7 Box CPU processor 1150 i7-4790K, 4.00 GHz
>   - Samsung MZ-76E500B internal SSD 860 EVO, 500 GB, 2.5" SATA III,
> black/grey
>   - MB ASUS H97-PLUS
>   - DVD-RW drive
>
>   The OS I'm using is Ubuntu 18.04.01 Server Edition.
>
> On one side I read that it is better to put the same type of HW in the
> same cluster: PCs of the same type,
> but on the other side heterogeneous HW (servers or PCs) can also be
> deployed.
> So... which HW should I take into consideration for the second node, if the
> features of the very first "node" are the ones above?
>
> 2) I read that some software (Rocks, OSCAR) would make the cluster
> configuration easier and smoother. But I also read that
> using the same OS, with exactly the same version, for all nodes
> (in my case Ubuntu 18.04.01 Server Edition) could be a safe start.
> So... is it strictly necessary to use Rocks or OSCAR to correctly
> configure the nodes' network?
>
> Looking forward to your kind hints and suggestions.
> Marco
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
>
