[Beowulf] Mobos for portable use
Lux, Jim (337C)
james.p.lux at jpl.nasa.gov
Thu Jan 19 14:09:01 PST 2017
From: Beowulf [mailto:beowulf-bounces at beowulf.org] On Behalf Of Andrew M.A. Cater
Sent: Thursday, January 19, 2017 12:49 PM
To: beowulf at beowulf.org
Subject: Re: [Beowulf] Mobos for portable use
10 x Beaglebone Black would be good for maximum power/minimum space/minimum volume and cost.
10 x Beaglebone Green might also provide WiFi networking
64 x Raspberry Pi model B - original, not Pi B+ / 2 / 3 - was tried. Not necessarily running off one power socket - PoE, network switches etc.
Search for 64 node Raspberry Pi Bramble cluster
As it happens, I have a bunch of BBGs, BBBs, RPis, and the Teensy flavor of Arduino.
And I've thought more than once about building a cluster of Teensys, perhaps using something like PAPERS as an interconnect, or wiring the built-in serial ports into a hypercube. I even started looking at implementing interprocessor communication on the Arduino with an API that replicates a subset of MPI.
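To make the hypercube idea concrete, here is a minimal sketch of how node addressing would work, assuming each board gets an integer ID and one serial link per dimension. The function names are made up for illustration, and whether a given Teensy has enough serial ports for the chosen dimension is not checked here.

```python
def hypercube_neighbors(node: int, dim: int) -> list[int]:
    """Return the nodes directly linked to `node` in a dim-dimensional hypercube."""
    if not 0 <= node < (1 << dim):
        raise ValueError("node id out of range for this dimension")
    # Flipping bit k of the node ID gives the neighbor across dimension k,
    # so each node needs exactly `dim` links (serial ports, in this scheme).
    return [node ^ (1 << k) for k in range(dim)]


def hypercube_route(src: int, dst: int, dim: int) -> list[int]:
    """Dimension-order route from src to dst, both endpoints included."""
    path = [src]
    cur = src
    for k in range(dim):
        # Correct one differing address bit per hop, lowest bit first.
        if (cur ^ dst) & (1 << k):
            cur ^= 1 << k
            path.append(cur)
    return path


if __name__ == "__main__":
    print(hypercube_neighbors(5, 3))
    print(hypercube_route(0, 7, 3))
```

With 8 boards (dim = 3), every node has 3 neighbors and any message needs at most 3 hops.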
So here are my thoughts:
Teensy is cool as a toy, but if your goal is to "do parallel processing", then why not use a bunch of VMs spun up on a desktop to do the algorithm development? There's no display on the Arduino, other than an LED. I could easily write some sort of little message passing demo that drives a tricolor LED or something, but to what end?
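As an illustration of how little hardware the algorithm development needs, here is a small message passing demo that runs entirely on a desktop. It is only a sketch: threads and queues stand in for the nodes and links (not VMs, and not a real MPI), and `chain_demo` is a made-up name.

```python
import queue
import threading


def chain_demo(n_nodes: int, token: int = 0) -> int:
    """Pass a token down a chain of n_nodes; each node adds its own rank."""
    if n_nodes < 1:
        raise ValueError("need at least one node")
    inbox = [queue.Queue() for _ in range(n_nodes)]  # one mailbox per node
    result = queue.Queue()                           # where the last node reports

    def node(rank: int) -> None:
        value = inbox[rank].get()        # blocking receive
        value += rank                    # the "work" done at this node
        if rank == n_nodes - 1:
            result.put(value)            # last node hands back the answer
        else:
            inbox[rank + 1].put(value)   # send to the next rank

    threads = [threading.Thread(target=node, args=(r,)) for r in range(n_nodes)]
    for t in threads:
        t.start()
    inbox[0].put(token)                  # inject the token at rank 0
    for t in threads:
        t.join()
    return result.get()


if __name__ == "__main__":
    print(chain_demo(4))
```

The same send/receive structure could later be mapped onto real links, but the logic can be debugged here first.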
So we look at the ARM Cortex-A8 based boards. I love the BBB and BBG, and yes, the BBG comes in a wireless version. And they're reasonably fast - a 1024 point FFT on a BBG runs in <80 microseconds. But again, I'm not sure that the MIPS/Watt is favorable. And software development for the ARM is a bit trickier than for x86 - if nothing else, you'll need to recompile for a new target.
Our lunchtime discussion was basically about making a "compute appliance" that could provide significant computational crunch as an adjunct to a laptop, running non-open, non-free software like Matlab or various and sundry design tools that are demanding of resources (FPGA place and route, for example). To be honest, we weren't thinking of a cluster here, just the equivalent of a big fast desktop machine in a different package.
And there is a scale issue - sure, the ARM is good on a MIPS/Watt basis, but its peak MIPS is lower than the peak MIPS for a single x86 (maybe; I've not actually looked).
Even from the earliest Beowulf days, there's been this whole "do I use 4 cheap but slow processors or 1 fast and expensive processor?" question - if your task can be done acceptably fast by a single (fast) processor, then that's what you should be doing. It's when that single fast processor can't make the grade that you go to clusters, and then all the scalability issues come into play.
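The 4-slow-versus-1-fast tradeoff can be put in rough numbers with Amdahl's law. This is a back-of-the-envelope sketch, not a benchmark: the parallel fraction and relative speeds are made-up inputs, and communication overhead is ignored, which flatters the cluster.

```python
def amdahl_speedup(parallel_fraction: float, n_procs: int) -> float:
    """Amdahl's law: speedup of n_procs processors over one of the same kind."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if n_procs < 1:
        raise ValueError("n_procs must be at least 1")
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / n_procs)


def cluster_vs_single(parallel_fraction: float, n_slow: int,
                      slow_speed: float, fast_speed: float) -> tuple[float, float]:
    """Return (cluster, single) effective speeds in the same arbitrary units.

    slow_speed and fast_speed are per-processor speeds; the cluster figure
    scales slow_speed by the Amdahl speedup and ignores interconnect cost.
    """
    cluster = slow_speed * amdahl_speedup(parallel_fraction, n_slow)
    return cluster, fast_speed


if __name__ == "__main__":
    # Hypothetical case: 90% parallel work, 4 slow nodes vs. one 3x-faster node.
    cluster, single = cluster_vs_single(0.9, 4, 1.0, 3.0)
    print(f"cluster: {cluster:.2f}  single: {single:.2f}")
```

In that hypothetical case the four slow processors barely edge out the single fast one even before any communication cost is counted, which is the point of the paragraph above.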
(I just found that, at least a while ago, Xilinx supported clusters for some of their design tools. Since the design I'm working with right now takes an hour to synthesize (on a single machine), I'm going to look further - it has been a real rate limiter in the lab, because it makes the test, new design, load, test cycle a lot longer.)