[Beowulf] Build Recommendations - Private Cluster

Richard Edwards ejb at fastmail.fm
Tue Aug 20 23:13:22 PDT 2019


Hi Dmitri,

I have no specific application. I have done some CUDA-enabled OpenCV work for real-time video stitching, pattern recognition, etc. in the past. I was planning on spending some time learning more about CUDA and getting into MPICH. I think the K20x's might still be OK for TensorFlow. However, this exercise is for me more about infrastructure build, management and learning than any given application.
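For what it's worth, TensorFlow's prebuilt GPU binaries have a minimum CUDA compute capability, and the K20x (compute 3.5, Kepler) should just clear that bar while the older cards won't. A quick sketch of that check in Python; the capability numbers and the 3.5 cut-off are from memory, so worth verifying against NVIDIA's and TensorFlow's docs:

```python
# Rough check of which of these cards clear TensorFlow's minimum
# CUDA compute capability (3.5 for the prebuilt GPU wheels, IIRC).
# Capability numbers below are from memory -- verify against NVIDIA docs.
COMPUTE_CAPABILITY = {
    "C1060": (1, 3),   # Tesla (GT200)
    "M2050": (2, 0),   # Fermi
    "M2070": (2, 0),   # Fermi
    "M2090": (2, 0),   # Fermi
    "K20x":  (3, 5),   # Kepler
}

TF_MINIMUM = (3, 5)  # assumed minimum for prebuilt TensorFlow GPU builds

def tf_capable(card):
    """True if the card meets the assumed TensorFlow minimum capability."""
    return COMPUTE_CAPABILITY[card] >= TF_MINIMUM

for card, cc in sorted(COMPUTE_CAPABILITY.items()):
    status = "OK for TF" if tf_capable(card) else "too old for TF"
    print(f"{card}: compute {cc[0]}.{cc[1]} -> {status}")
```

By that reckoning only the K20x's are candidates for TensorFlow work; everything Fermi and older is CUDA-only territory.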

Would certainly be interested in insights into the s/w stack.
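One constraint that will shape any such stack: each GPU generation in this mix tops out at a different final CUDA toolkit release, and a node mixing generations is held back to the oldest cut-off. A small Python sketch of that reasoning; the cut-off versions are from memory (as of mid-2019) and should be double-checked against NVIDIA's release notes before pinning anything:

```python
# Rough map of GPU generation -> last CUDA toolkit believed to support it.
# Cut-offs are from memory (mid-2019) -- double-check NVIDIA release notes.
LAST_CUDA_TOOLKIT = {
    "Tesla (C1060/S1070)": (6, 5),   # compute 1.x dropped in CUDA 7.0
    "Fermi (M20xx)": (8, 0),         # Fermi dropped in CUDA 9.0
    "Kepler (K20x)": (10, 1),        # still supported in current CUDA 10.x
}

def newest_shared_toolkit(generations):
    """Newest toolkit a node mixing these generations could share:
    the minimum of the per-generation cut-offs."""
    return min(LAST_CUDA_TOOLKIT[g] for g in generations)

# e.g. a box mixing Fermi and Kepler cards is held back to CUDA 8.0.
print(newest_shared_toolkit(["Fermi (M20xx)", "Kepler (K20x)"]))
```

Which suggests keeping the generations on separate nodes where possible, so the K20x boxes aren't dragged down to a Fermi-era toolkit.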

Cheers

Richard

> On 21 Aug 2019, at 4:06 pm, Dmitri Chubarov <dmitri.chubarov at gmail.com> wrote:
> 
> Hi Richard,
> 
> I am speaking from experience of keeping up a small cluster of 4 Supermicro boxes with a total of 16 C2070 cards. We had to freeze NVIDIA driver updates, as Fermi cards are likewise not supported by the latest drivers. This means you can use CUDA, but not the latest versions of the NVIDIA SDK. Machine learning applications are out, since later versions of TensorFlow require later versions of the NVIDIA drivers and SDK. So this cluster runs some computational chemistry codes that are less demanding in terms of CUDA features. I can probably give you details of the software stack off the list.
> 
> What would be good to keep in the list thread is information on the types of applications that you intend to use the cluster for.
> 
> 
> 
> On Wed, 21 Aug 2019 at 12:47, Richard Edwards <ejb at fastmail.fm> wrote:
> Hi Dmitri
> 
> Thanks for the response.
> 
> Yes, it's old hardware, but as I said it is for a personal cluster. I have also put M2070's in one of the 1070 chassis, as they are basically 4-slot PCIe expansion units. I have various other M2050/M2070/M2090/K20x cards around, so depending on time I can certainly get more bang than the C1060's that are in there now. I am prepared to live with the pain of older drivers, potentially having to use older Linux distributions, and not being able to support much beyond CUDA 2.0...
> 
> Yes, I could probably go out and purchase a couple of newer cards and get the same performance or better, but this is more about the exercise and the learning.
> 
> So maybe the hardware list was a distraction. What are people using as the predominant distro and management tools? 
> 
> cheers
> 
> Richard
> 
> 
> 
>> On 21 Aug 2019, at 3:08 pm, Dmitri Chubarov <dmitri.chubarov at gmail.com> wrote:
>> 
>> Hi,
>> this is very old hardware, and you would have to stay with a very outdated software stack: the 1070 cards are not supported by recent versions of the NVIDIA drivers, and old versions of the NVIDIA drivers do not play well with modern kernels and modern system libraries. Unless you are doing this for digital preservation, consider dropping the 1070s out of the equation.
>> 
>> Dmitri
>> 
>> 
>> On Wed, 21 Aug 2019 at 06:46, Richard Edwards <ejb at fastmail.fm> wrote:
>> Hi Folks
>> 
>> So, about to build a new personal GPU-enabled cluster, and am looking for people's thoughts on distribution and management tools.
>> 
>> Hardware that I have available for the build:
>> - HP ProLiant DL380/360 - mix of G5/G6
>> - HP ProLiant SL6500 with 8 GPUs
>> - HP ProLiant DL580 G7 + 2x K20x GPUs
>> - 3x Nvidia Tesla 1070 (4 GPUs per unit)
>> 
>> Appreciate people's insights/thoughts.
>> 
>> Regards
>> 
>> Richard
>> _______________________________________________
>> Beowulf mailing list, Beowulf at beowulf.org, sponsored by Penguin Computing
>> To change your subscription (digest mode or unsubscribe) visit https://beowulf.org/cgi-bin/mailman/listinfo/beowulf
> 
