[Beowulf] A cluster of Arduinos

Lux, Jim (337C) james.p.lux at jpl.nasa.gov
Wed Jan 11 17:09:53 PST 2012



-----Original Message-----
From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Vincent Diepeveen
Sent: Wednesday, January 11, 2012 4:37 PM
To: Beowulf Mailing List
Subject: Re: [Beowulf] A cluster of Arduinos

Yes, this was impossible to explain to a bunch of MIT folks as well, some of whom wrote your book, I bet. Yet the slower the processor, the more of a true SMP system it is.

It's obvious that you missed that point.

Writing code for a multicore is tougher, from an SMP-constraints viewpoint, than for a bunch of 70 MHz CPUs that have a millisecond latency to the other CPUs.

-> Yes, that's true... but that's also what I would think of as more advanced than understanding basic message passing or non-tightly-coupled multiprocessing systems.  And there are lots of applications for the latter.  Some might not be as sexy as others, but they exist.

So it's far from demonstrating cluster programming. Light-years away.
Emulation on a simple quad-core is in fact more representative than this.
If you want to get closer to cluster programming than this, just buy yourself a 4-socket Barcelona-core SMP system off eBay. Say with energy-efficient 1.8 GHz CPUs.
That gives you one of the first incarnations of HyperTransport, which of course improved dramatically later on.
Latency from CPU to CPU is some 300+ ns if you look up randomly.
Even good programmers in game tree search have big problems working with those latencies.

-> but that's an entirely different sort of problem space and instructional area.   


Clusters have latencies that are far worse than that. Yet as CPU speeds no longer increase much and the number of cores doesn't double that quickly, clusters are the way to go if you're CPU hungry.
Setting up small clusters is cheap as well. If I put the name 'mellanox' into eBay I see bunches of cheap cards out there and also switches.

-> Oh, I'm sure the surplus market is full of things one could potentially use. But I suspect that by the time you lash together your $40 cards and $20 cables and a several-hundred-dollar switch, the total system price is over $1k.  And you're using surplus, so there's a support issue.  If you're tinkering for yourself in the garage or as a one-off, then surplus is a fine way to go.  If you want to be able to give a list of "go buy this" to a teacher, it needs to be off-the-shelf stuff that is currently being manufactured.

-> Say you want to set up 10 demo systems with 8 nodes each, so that each student in a small class has their own to work with.  There's a big difference between $30 Arduinos and $200 netbooks. 
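
-> To put rough numbers on that difference, here is a minimal Python sketch using only the figures above (10 systems, 8 nodes each, about $30 per Arduino versus about $200 per netbook). It counts node hardware only and leaves out interconnect, cables and power; the function name `node_cost` is made up for illustration.

```python
# Figures from the message above: 10 demo systems, 8 nodes each,
# roughly $30 per Arduino versus roughly $200 per netbook.
SYSTEMS = 10
NODES_PER_SYSTEM = 8


def node_cost(price_per_node, systems=SYSTEMS, nodes=NODES_PER_SYSTEM):
    """Return (cost per system, cost for the whole class), node hardware only."""
    per_system = price_per_node * nodes
    return per_system, per_system * systems


arduino_system, arduino_class = node_cost(30)
netbook_system, netbook_class = node_cost(200)

print(f"Arduino: ${arduino_system} per system, ${arduino_class} per class")
print(f"Netbook: ${netbook_system} per system, ${netbook_class} per class")
print(f"Difference per class: ${netbook_class - arduino_class}")
```

With these assumed prices the nodes come to $240 versus $1,600 per student system, before any interconnect hardware is added.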

With a single switch you can teach half a dozen students. You can just connect the machines you already have there to a few switches and write MPI code like that.

-> The whole point is to give a student exclusive access to the system, without needing to share.  Sure, we've all done the shared "computer lab" resource thing and managed to learn (in the late 1970s, I would have done quite a lot to have on-demand access to an 029 keypunch).  But that's part of what *personal* computers are all about: my program doesn't work right, I just hit the reset button and start over.

-> I confess, too, that there is an aspect of the "mass of boards on the desktop with cables strewn around", which is a learning experience in itself.  On the other hand, the Arduino experience is a lot less hassle than, say, a mass of PC mobos, network cards, and power supplies and trying to get them to boot off the net or a USB drive. 


Average cost per student also will be a couple of hundred dollars.
-> that's the "total cost of several thousand dollars divided by N students who share it" I suspect.  We could get into a little BOM battle, and I'd venture that I can keep the off the shelf parts cost under $500, and give each student a dedicated system to play with.  The only part that I don't know right off the top of my head is the actual interconnect hardware.  I think you'd want to design some sort of board with a bunch of connectors that connects to the Arduinos with ribbon cables.   But even there, that could be "here's your PCBExpress file.. order the board and you get 3 for $50"

-> over the years I've been involved in several of these "what can we set up for a demonstration", and I've converged to the realization that what you need is a parts list (preferably preloaded at Newark or DigiKey or Mouser or similar) and an explicit set of instructions.   A setup that starts out with:
1) Find 8 motherboards on eBay or newegg with these sorts of specs
2) Find 8 power supplies that match the motherboards

is doomed to failure.  You need "buy 3 of those and 6 of these, and hook them up this way".

This is the beauty of the whole Arduino culture. In fact, it's a bit too much of that.. there's not a lot of good overview tutorial material.. but lots of "here's how to do specific task X"... I got started looking at Arduinos because I want to build a multichannel temperature controller to smoke/cure sausage.

But I've used just about every small single board computer out there: Rabbit, Basic Stamp, various PIC boards, etc. not to mention various MiniITX and PC schemes.   So far, the Arduino is the winner on dirt cheap and simple combined.  Spend $30, plug in USB cable, load java environment, done.  Now I know why all those projects at the science fair are using them.  You get to focus on what you want to do, rather than getting a computer working.

Vincent



On Jan 12, 2012, at 12:24 AM, Lux, Jim (337C) wrote:

>
>
> -----Original Message-----
> From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Vincent Diepeveen
> Sent: Wednesday, January 11, 2012 2:47 PM
> To: Beowulf Mailing List
> Subject: Re: [Beowulf] A cluster of Arduinos
>
> Jim, your microcontroller cluster is not a very good idea.
>
> Latency didn't keep up with the CPU speeds...
>
> --- You're missing the point of the cluster.  It's not for performance 
> (where I can't imagine that the slowest single CPU PC out there 
> wouldn't blow the figurative doors off).  It's to provide a very 
> inexpensive way to experiment/play/demonstrate loosely coupled 
> multiprocessor systems.
>
> --> for example, you could experiment with redundant message
> routing across a fabric of nodes.  The algorithms are fairly simple, 
> and this gives you a testbed which is qualitatively
> different than just simulating a bunch of nodes on a single PC.   
> There is pedagogical value in a system where you can force a link 
> error by just disconnecting the cable, and your blinky lights on each 
> node show what's going on.
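
A minimal sketch of that experiment, simulated in Python rather than run on real nodes: the fabric is a set of links, a route is found by breadth-first search, and "disconnecting the cable" is just removing a link and routing again. The 4-node ring and the name `shortest_route` are assumptions for illustration, not anything specified in the thread.

```python
from collections import deque


def shortest_route(links, src, dst):
    """Breadth-first search over undirected links; return a node list or None."""
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    previous = {src: None}          # also serves as the visited set
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:  # walk back to the source
                path.append(node)
                node = previous[node]
            return path[::-1]
        for nxt in sorted(neighbours.get(node, ())):
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    return None                      # destination is unreachable


# Hypothetical fabric: four nodes wired in a ring, 0-1-2-3-0.
links = {(0, 1), (1, 2), (2, 3), (3, 0)}
print(shortest_route(links, 0, 1))   # direct link is up

links.discard((0, 1))                # "pull the cable" between nodes 0 and 1
print(shortest_route(links, 0, 1))   # traffic goes the long way around the ring
```

On hardware each node would hold only its own view of the fabric and would detect the dead link by a timeout; the central search here stands in for that so the rerouting step is easy to see.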
>
>
> There is still too much 1980s and 1990s software out there, 
> written by the guys who wrote books about how to parallelize, which 
> simply doesn't scale at all on modern hardware.
>
> -->  I think that a lot of the theory of parallel processes is
> speed independent, and while some historical approaches might not be 
> used in a modern system for good implementation reasons, students and 
> others still need to learn about them, if only as the
> canonical approach.    Sure, you could do a simulation on a single  
> PC (and I've seen them, in Simulink, and in other more specialized 
> tools), but there's a lot of appeal to a hands-on-the-cheap-hardware 
> approach to learning.
>
> --> To take an example, if you set a student a problem of lighting
> a LED on each node in a specified node order at  specified intervals, 
> and where the node interconnects are not specified in advance, that's 
> a fairly interesting homework problem.  You have to discover the 
> network connectivity graph, then figure out how to
> pass the message to the appropriate node at the appropriate time.   
> This is a classic "hot plug network discovery" kind of problem, and in 
> the face of intermittent links, it's of great interest.
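
The two steps of that homework problem (discover the connectivity graph, then deliver each "light your LED" message on schedule) can be sketched in Python as a simulation. Everything named here is hypothetical: the `WIRING` table stands in for cabling the coordinator does not know in advance, and `ask_neighbours` stands in for a query message a real node would answer after probing its own ports. Message transit time and intermittent links are ignored.

```python
from collections import deque

# Hypothetical wiring, hidden from the coordinator. On hardware each node
# would learn its own row by probing its ports.
WIRING = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}


def ask_neighbours(node):
    """Stand-in for a 'who is attached to you?' query sent to one node."""
    return WIRING[node]


def discover(root):
    """Explore outward from root; return {node: hop-by-hop path from root}."""
    paths = {root: [root]}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for nxt in ask_neighbours(node):
            if nxt not in paths:             # first time this node is seen
                paths[nxt] = paths[node] + [nxt]
                queue.append(nxt)
    return paths


def led_schedule(root, order, interval_ms):
    """Return (time_ms, target, route) tuples for lighting LEDs in the given order."""
    paths = discover(root)
    return [(i * interval_ms, node, paths[node]) for i, node in enumerate(order)]


for when, node, route in led_schedule(0, [4, 2, 1], 500):
    print(f"t={when} ms: light LED on node {node} via {route}")
```

A natural extension for students is to rerun `discover` whenever a message goes unacknowledged, which is where the intermittent-link case becomes interesting.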
>
> --> While that particular problem isn't exactly HPC, it DOES relate
> to HPC in a world where you cannot assume perfect processor nodes and 
> perfect communications links.  And that gets right to the whole 
> "scalability" thing in HPC.  It wasn't til the implementation of Error 
> Correcting Codes in logic that something like the Q7A computer was 
> even possible, because it was so large that you couldn't guarantee 
> that all the tubes would be working all the time.  Likewise with many 
> other aspects of modern computing.
>
> --> And, of course, in the spaceflight world, this kind of thing is
> even more important.  A concept of growing importance is the 
> "fractionated spacecraft" where all of the functions that would have 
> been all in one physical vehicle are now spread across many smaller 
> pieces.  And one might reallocate spacecraft fractional pieces between 
> different virtual spacecraft.  Maybe right now, you need a lot of 
> processing power to do image compression and analysis, so you want to 
> allocate a lot of "processing pieces" to the job, with an ad hoc 
> network connection among them.  Later,  you don't need them, so you 
> can release them to other uses.  The pieces might be in the immediate 
> vicinity, or they might be some distance away, which affects the data 
> rate in the link and its error rates.
>
> --> You can legitimately ask whether this sort of thing (the
> fractionated spacecraft) is a Beowulf (defined as a cluster 
> supercomputer built of commodity components) and I would say it shares 
> many of the same properties, especially in the early Beowulf days 
> before multicores and fancy interconnects were fashionable for 
> multi-thousand processor clusters.  It's that idea of building a large 
> complex device out of many basically identical subunits, using open 
> source/simple software to manage it.
>
>
> -->> in summary, it's not about performance.. it's about a teaching
> tool for networking in the context of cluster computing.  You claim we 
> need to cast off the shackles of old programming styles and get some 
> new blood and ideas.  Well, you need to get people interested in 
> parallel computing and learning the basics (so at least they don't 
> reinvent the square wheel).  One way might be challenges such as 
> parallelization of game play; another might be working with 
> parallelized database; the way I propose is with experimenting with 
> message passing parallelization using dirt cheap hardware.
>
>
>
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin 
> Computing To change your subscription (digest mode or unsubscribe) 
> visit http://www.beowulf.org/mailman/listinfo/beowulf
>
