[Beowulf] A cluster of Arduinos
Lux, Jim (337C)
james.p.lux at jpl.nasa.gov
Fri Jan 13 08:26:29 PST 2012
On 1/13/12 7:18 AM, "Douglas Eadline" <deadline at eadline.org> wrote:
>> And Doug, your small systems have a lot of the same issues, perhaps
>> because that small Limulus might be operated in environments other than
>> what the underlying hardware was designed for. I know people who have
>> been rudely surprised to find that the design environment for a
>> laptop is a pretty narrow temperature range (e.g. office desktop): when
>> they put one in a car, subject to 0C or 40C temperatures, if not wider,
>> things don't work quite as well as expected.
>I will be curious to see where these things show up since
>all you really need is a power plug. (a little nervous actually).
Yes.. That *will* be interesting... And wait til someone has a cluster of
Limuluses (Not sure of the proper alliterative collective noun, nor the
plural form.. A litany of limuli? A school? A murder?)
>I agree. Four nodes is really small. BTW, the most fun in designing
>this system is a set of tighter constraints than are found on the typical
>cluster. Noise, power, space, cabling, low cost packaging, etc. I have
>been asked about a rack mount version, we'll see.
>One thing I find interesting is the core/node efficiency.
>(what I call "effective cores") In general *on some codes*, I found that
>fewer cores (1P micro-atx 4-cores) are more efficient than many
>cores (2P server 12-core). Seems obvious, but I like to test things.
Yes, because we're using, in general, commodity components/assemblies,
we're subject to the results of optimizations and market/business forces
in other user spaces. Someone designing a media PC for home use might not
care about electrical efficiency (there are no big yellow energy tags on
computers, yet), but would care about noise. Someone designing a rack
mounted server cares not a whit about noise, but really cares about a 10%
change in power consumption.
And, drop on top of that the non-synchronized differences in
development/manufacturing/fabrication generations for the underlying
parts. Consumer stuff comes out for the winter selling season. Commercial
stuff probably is on a different cycle. It's not like everyone uses the
same "model year changeover".
>> (oddly, simulated fault injection is one of the trickier parts)
>I would assume, because in a sense, the black swan* is
>by definition hard to predict.
Not so much that, as the actual mechanics of fault injection. Think about
testing error detection and recovery for Flash memory. The underlying
specification error rate is something like 1E-9 or 1E-10/read, and that's
a worst case kind of spec, so errors aren't too common (i.e. you can't
just run and wait for them to occur). So how do you cause errors to occur
(without perturbing the system)?
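To put a rough number on "you can't just run and wait", here is a back-of-envelope sketch in Python. It assumes errors are independent from read to read and uses the worst-case 1E-9/read figure quoted above. The function names and the 1000 reads/second rate are made up for illustration, not measurements from any real part.

```python
import math

def expected_reads_to_first_error(p_error_per_read):
    """Mean number of reads until the first error (geometric distribution)."""
    return 1.0 / p_error_per_read

def prob_at_least_one_error(p_error_per_read, n_reads):
    """P(at least one error in n_reads) = 1 - (1 - p)^n, computed stably."""
    return -math.expm1(n_reads * math.log1p(-p_error_per_read))

def hours_to_first_error(p_error_per_read, reads_per_second):
    """Expected wall-clock wait for the first error, in hours."""
    reads = expected_reads_to_first_error(p_error_per_read)
    return reads / reads_per_second / 3600.0

if __name__ == "__main__":
    p = 1e-9        # worst-case spec figure from the text
    rate = 1000.0   # hypothetical read rate, reads per second
    print("expected reads to first error:", expected_reads_to_first_error(p))
    print("P(>=1 error in 1e6 reads):", prob_at_least_one_error(p, 10**6))
    print("expected hours of waiting:", hours_to_first_error(p, rate))
```

At the assumed read rate that works out to days of waiting per error, and real parts usually beat the worst-case spec, so passive waiting is not a practical test plan.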
In the flash case, because we developed our own flash controller logic in
an FPGA, we can add "error injection logic" to the design, but that's not
always the case. How would you simulate upsets in a CPU core (short of
blasting it with radiation, which is difficult and expensive)? I wish it
were as easy as getting a little Co60 gamma source and putting it on top of
the chip. Instead, we hike to somewhere that has an accelerator (UC Davis,
Brookhaven, etc.) and shoot protons and heavy ions at it.
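As a software analogue of the "error injection logic" idea, here is a minimal Python sketch. It is not the FPGA design described above. All of the names (FlashModel, FaultInjector, checked_read) are hypothetical, and CRC32 stands in for whatever error detection the real controller uses. The point is only the shape of the technique: the injector sits between the storage and the code under test and flips a bit on a chosen fraction of reads, so the detection path gets exercised at a rate you control.

```python
import random
import zlib

class FlashModel:
    """Toy page store that keeps a CRC32 alongside each page."""
    def __init__(self):
        self._pages = {}

    def write(self, addr, data):
        self._pages[addr] = (bytes(data), zlib.crc32(data))

    def raw_read(self, addr):
        return self._pages[addr]

class FaultInjector:
    """Wraps a device and flips one random bit on a fraction of reads."""
    def __init__(self, device, flip_probability, seed=0):
        self.device = device
        self.flip_probability = flip_probability
        self.rng = random.Random(seed)  # seeded so test runs are repeatable
        self.injected = 0

    def raw_read(self, addr):
        data, crc = self.device.raw_read(addr)
        if data and self.rng.random() < self.flip_probability:
            corrupted = bytearray(data)
            bit = self.rng.randrange(len(corrupted) * 8)
            corrupted[bit // 8] ^= 1 << (bit % 8)  # single-bit upset
            data = bytes(corrupted)
            self.injected += 1
        return data, crc

def checked_read(device, addr):
    """Return (data, ok); ok is False when the stored CRC does not match."""
    data, crc = device.raw_read(addr)
    return data, zlib.crc32(data) == crc

if __name__ == "__main__":
    flash = FlashModel()
    flash.write(0, b"telemetry frame")
    faulty = FaultInjector(flash, flip_probability=0.25)
    detected = sum(1 for _ in range(1000) if not checked_read(faulty, 0)[1])
    print("injected:", faulty.injected, "detected:", detected)
```

Because the stored copy is never modified, the injection does not perturb the system beyond the single read it corrupts, and comparing the injected count to the detected count gives a direct check on the detection logic.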
>(* the book by Nick Taleb, not the movie)
Black swans in this case would be things like the Pentium divide bug.
Yes.. That *would* be a challenge, but hey, we've got folks in our JPL
Laboratory for Reliable Software (LARS) who sit around thinking of how to
do that, among other things. (http://lars-lab.jpl.nasa.gov/) Hmm.. I'll
have to go talk to those guys about clusters of Pi or Arduinos... They're
big into formal verification, too, and model based verification. So you
could have a modeled system in SysML or UML and compare its behavior with
that on your prototype.