[Beowulf] A cluster of Arduinos

Vincent Diepeveen diep at xs4all.nl
Wed Jan 11 18:03:21 PST 2012


The whole purpose of PCs is that they are generic to use. I remember
how in the past the decision makers bought low-clocked junk for a big
price, much against the wish of the sysadmins, who wanted a PC for
every student exclusively. Outdated slow junk is not interesting
to students. Now you and I might like that CPU as it's under $1, but
to them it's just 70 MHz, a factor 500 slower than a single core of
their home PC. What impresses them is something that can beat their
own machine at home.

In the end, in science we basically learn a lot more easily if we can
take a look into the future, and being faster than a single PC is a
good example of that.

So let them do that. If you take care to launch 1 process on each
machine, then with quadcore machines, not to mention i7's with
hyperthreading, you can have 24 computers on 1 switch that serve 24
students, each using 12 logical cores.

And for demonstration purposes you can also run successful
applications on all 24 computers at the same time.

Hey, there are switches with even more ports.

Average price per student is gonna beat the crap out of any junk
solution you show up with - besides, how many are you gonna buy?

Those computers are already there, one for each student i suspect.
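
As a rough sketch of that per-student arithmetic: the $30 Arduino price and
the 8 nodes per student are the figures from the quoted reply below, while the
$300 switch price (a stand-in for "several hundred $"), the 24 students per
switch, and the function names are assumptions made only for illustration.
Cables, interconnect boards and the PCs themselves are left out.

```python
# Rough per-student cost comparison. All prices are illustrative assumptions.
SWITCH_PRICE = 300.0       # assumed: one "several hundred $" switch
STUDENTS_PER_SWITCH = 24   # one already-present PC per student on one switch
ARDUINO_PRICE = 30.0       # per board, figure quoted later in the thread
NODES_PER_STUDENT = 8      # size of one per-student demo system in the thread


def existing_pc_cost_per_student(switch_price, students):
    """Only the switch is new hardware; the PCs are assumed to exist already."""
    return switch_price / students


def arduino_cost_per_student(board_price, nodes):
    """Boards only; interconnect hardware and cables are not counted."""
    return board_price * nodes


pc = existing_pc_cost_per_student(SWITCH_PRICE, STUDENTS_PER_SWITCH)
ard = arduino_cost_per_student(ARDUINO_PRICE, NODES_PER_STUDENT)
print(f"existing PCs + shared switch: ${pc:.2f} per student")   # $12.50
print(f"dedicated 8-Arduino system:   ${ard:.2f} per student")  # $240.00
```

The comparison only holds if the PCs really are already there; with $200
netbooks bought new, the Arduino side wins instead.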

So they can exclusively toy and toy - for the switch it's not a real  
problem except if they really mess up.

But most important, they learn something. By toying with 70 MHz
hardware that's not representative and only interesting to experts like
you and me, who are really good at embedded programming, they don't
learn much.

There is no replacement for the real thing to test upon.

Besides, if you go program on embedded processors, my good fast
single-CPU code is probably gonna kick the hell out of your version of
the same program on 8 CPUs. On a single core it'll probably be faster
than yours on 8 by a factor of 10+.

p.s. Not that it's disturbing, Jim, but your replies are always typed
within my original message, so it's sometimes tough to read what you
typed into the message I posted here. Maybe this Apple MacBook Pro's
mail program doesn't know how to handle it. FYI I want to reformat
it to Linux anyway, as I'm getting sick of being hacked silly each
time by about every other consultant, but well, this is all off
topic, hence the postscript.

On Jan 12, 2012, at 2:09 AM, Lux, Jim (337C) wrote:

>
>
> -----Original Message-----
> From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org]
> On Behalf Of Vincent Diepeveen
> Sent: Wednesday, January 11, 2012 4:37 PM
> To: Beowulf Mailing List
> Subject: Re: [Beowulf] A cluster of Arduinos
>
> Yes, this was impossible to explain to a bunch of MIT folks as well,
> some of whom wrote your book I bet - yet the slower the processor,
> the more of a true SMP system it is.
>
> It's obvious that you missed that point.
>
> Writing code for a multicore is tougher, from an SMP-constraints
> viewpoint, than for a bunch of 70 MHz CPUs that have a millisecond
> latency to the other CPUs.
>
> -> Yes, that's true... but that's also what I would think of as
> more advanced than understanding basic message passing or
> non-tightly-coupled multiprocessing systems.  And there are lots of
> applications for the latter.  Some might not be as sexy as others,
> but they exist.
>
> So it's far from demonstrating cluster programming. Light-years away.
> Emulation on a simple quadcore is in fact more representative
> than this.
> If you want to get closer to cluster programming than this, just buy
> yourself off eBay some Barcelona-core SMP system with 4 sockets,
> say with energy-efficient 1.8 GHz CPUs.
> So with one of the first incarnations of HyperTransport, as of
> course later on it dramatically improved.
> Latency from CPU to CPU is some 300+ ns if you look up randomly.
> Even good programmers in game tree search have big problems working
> with those latencies.
>
> -> but that's an entirely different sort of problem space and  
> instructional area.
>
>
> Clusters have latencies that are far worse than that. Yet as
> CPU speeds no longer increase much and the number of cores doesn't
> double that quickly, clusters are the way to go if you're CPU hungry.
> Setting up small clusters is cheap as well. If I put the name
> 'mellanox' into eBay I see bunches of cheap cards out there and also
> switches.
>
> -> Oh, I'm sure the surplus market is full of things one could
> potentially use. But I suspect that by the time you lash together
> your $40 cards and $20 cables and several-hundred-dollar switch,
> you're up to a total system price >$1k.  And you're using surplus, so
> there's a support issue.  If you're tinkering for yourself in the
> garage or as a one-off, then surplus is a fine way to go.  If you
> want to be able to give a list of "go buy this" to a teacher, it
> needs to be off-the-shelf, currently manufactured stuff.
>
> -> Say you want to set up 10 demo systems with 8 nodes each, so  
> that each student in a small class has their own to work with.   
> There's a big difference between $30 Arduinos and $200 netbooks.
>
> With a single switch you can teach half a dozen students. You can  
> just connect the machines you already got there onto a few switches  
> and write MPI code like that.
>
> -> The whole point is to give a student exclusive access to the
> system, without needing to share.  Sure, we've all done the shared
> "computer lab" resource thing and managed to learn (in the late
> 1970s, I would have done quite a lot to have on-demand access to an
> 029 keypunch).  That's part of what *personal* computers are all
> about.  My program doesn't work right, I just hit the reset
> button and start over.
>
> -> I confess, too, that there is an aspect of the "mass of boards  
> on the desktop with cables strewn around", which is a learning  
> experience in itself.  On the other hand, the Arduino experience is  
> a lot less hassle than, say, a mass of PC mobos, network cards, and  
> power supplies and trying to get them to boot off the net or a USB  
> drive.
>
>
> Average cost per student also will be a couple of hundred dollars.
> -> that's the "total cost of several thousand dollars divided by N  
> students who share it" I suspect.  We could get into a little BOM  
> battle, and I'd venture that I can keep the off the shelf parts  
> cost under $500, and give each student a dedicated system to play  
> with.  The only part that I don't know right off the top of my head  
> is the actual interconnect hardware.  I think you'd want to design  
> some sort of board with a bunch of connectors that connects to the  
> Arduinos with ribbon cables.   But even there, that could be  
> "here's your PCBExpress file.. order the board and you get 3 for $50"
>
> -> over the years I've been involved in several of these "what can  
> we set up for a demonstration", and I've converged to the  
> realization that what you need is a parts list (preferably  
> preloaded at Newark or DigiKey or Mouser or similar) and an  
> explicit set of instructions.   A setup that starts out with:
> 1) Find 8 motherboards on eBay or newegg with these sorts of specs
> 2) Find 8 power supplies that match the mother boards
>
> is doomed to failure.  You need "buy 3 of those and 6 of these, and
> hook them up this way".
>
> This is the beauty of the whole Arduino culture. In fact, it's a  
> bit too much of that.. there's not a lot of good overview tutorial  
> material.. but lots of "here's how to do specific task X"... I got  
> started looking at Arduinos because I want to build a multichannel  
> temperature controller to smoke/cure sausage.
>
> But I've used just about every small single board computer out  
> there: Rabbit, Basic Stamp, various PIC boards, etc. not to mention  
> various MiniITX and PC schemes.   So far, the Arduino is the winner  
> on dirt cheap and simple combined.  Spend $30, plug in USB cable,  
> load java environment, done.  Now I know why all those projects at  
> the science fair are using them.  You get to focus on what you want  
> to do, rather than getting a computer working.
>
> Vincent
>
>
>
> On Jan 12, 2012, at 12:24 AM, Lux, Jim (337C) wrote:
>
>>
>>
>> -----Original Message-----
>> From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org]
>> On Behalf Of Vincent Diepeveen
>> Sent: Wednesday, January 11, 2012 2:47 PM
>> To: Beowulf Mailing List
>> Subject: Re: [Beowulf] A cluster of Arduinos
>>
>> Jim, your microcontroller cluster is not a very good idea.
>>
>> Latency didn't keep up with the CPU speeds...
>>
>> --- You're missing the point of the cluster.  It's not for  
>> performance (where I can't imagine that the slowest single CPU PC out there
>> wouldn't blow the figurative doors off).  It's to provide a very
>> inexpensive way to experiment/play/demonstrate loosely coupled
>> multiprocessor systems.
>>
>> --> for example, you could experiment with redundant message
>> routing across a fabric of nodes.  The algorithms are fairly simple,
>> and this gives you a testbed which is qualitatively
>> different than just simulating a bunch of nodes on a single PC.
>> There is pedagogical value in a system where you can force a link
>> error by just disconnecting the cable, and your blinky lights on each
>> node show what's going on.
>>
>>
>> There is still too much 80s and 90s software out there,
>> written by the guys who wrote books about how to parallelize, which
>> simply doesn't scale at all on modern hardware.
>>
>> -->  I think that a lot of the theory of parallel processes is
>> speed independent, and while some historical approaches might not be
>> used in a modern system for good implementation reasons, students and
>> others still need to learn about them, if only as the
>> canonical approach.    Sure, you could do a simulation on a single
>> PC (and I've seen them, in Simulink, and in other more specialized
>> tools), but there's a lot of appeal to a hands-on-the-cheap-hardware
>> approach to learning.
>>
>> --> To take an example, if you set a student a problem of lighting
>> a LED on each node in a specified node order at  specified intervals,
>> and where the node interconnects are not specified in advance, that's
>> a fairly interesting homework problem.  You have to discover the
>> network connectivity graph, then figure out how to
>> pass the message to the appropriate node at the appropriate time.
>> This is a classic "hot plug network discovery" kind of problem,  
>> and in the face of intermittent links, it's of great interest.
>>
>> --> While that particular problem isn't exactly HPC, it DOES relate
>> to HPC in a world where you cannot assume perfect processor nodes and
>> perfect communications links.  And that gets right to the whole
>> "scalability" thing in HPC.  It wasn't til the implementation of  
>> Error Correcting Codes in logic that something like the Q7A computer was
>> even possible, because it was so large that you couldn't guarantee
>> that all the tubes would be working all the time.  Likewise with many
>> other aspects of modern computing.
>>
>> --> And, of course, in the spaceflight world, this kind of thing is
>> even more important.  A concept of growing importance is the
>> "fractionated spacecraft" where all of the functions that would have
>> been all in one physical vehicle are now spread across many smaller
>> pieces.  And one might reallocate spacecraft fractional pieces  
>> between different virtual spacecraft.  Maybe right now, you need a lot of
>> processing power to do image compression and analysis, so you want to
>> allocate a lot of "processing pieces" to the job, with an ad hoc
>> network connection among them.  Later,  you don't need them, so you
>> can release them to other uses.  The pieces might be in the immediate
>> vicinity, or they might be some distance away, which affects the data
>> rate in the link and its error rates.
>>
>> --> You can legitimately ask whether this sort of thing (the
>> fractionated spacecraft) is a Beowulf (defined as a cluster
>> supercomputer built of commodity components) and I would say it  
>> shares many of the same properties, especially in the early Beowulf days
>> before multicores and fancy interconnects were fashionable for
>> multi-thousand processor clusters.  It's that idea of building a  
>> large complex device out of many basically identical subunits, using open
>> source/simple software to manage it.
>>
>>
>> -->> in summary, it's not about performance.. it's about a teaching
>> tool for networking in the context of cluster computing.  You  
>> claim we need to cast off the shackles of old programming styles and get some
>> new blood and ideas.  Well, you need to get people interested in
>> parallel computing and learning the basics (so at least they don't
>> reinvent the square wheel).  One way might be challenges such as
>> parallelization of game play; another might be working with
>> parallelized database; the way I propose is with experimenting with
>> message passing parallelization using dirt cheap hardware.
>>
>>
>>
>>
>> _______________________________________________
>> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin
>> Computing To change your subscription (digest mode or unsubscribe)
>> visit http://www.beowulf.org/mailman/listinfo/beowulf
>>
>
>
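
For reference, the "hot plug network discovery" homework problem quoted above
(discover the connectivity graph, then pass a message to the right node, even
when a cable is pulled) can be sketched on a PC before trying it on real
boards. This is only a minimal sketch under stated assumptions: the 4-node
ring, the `links` set and the `neighbours`, `discover` and `route` names are
hypothetical, and breadth-first search stands in for whatever discovery
protocol the nodes would actually run.

```python
from collections import deque


def discover(start, neighbours):
    """Breadth-first discovery from `start`.

    Returns {node: parent} for every node reachable over working links.
    """
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in sorted(neighbours(node)):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return parent


def route(start, target, neighbours):
    """Return a shortest hop-by-hop path, or None if target is unreachable."""
    parent = discover(start, neighbours)
    if target not in parent:
        return None
    path = []
    node = target
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


# Hypothetical fabric: four nodes cabled in a ring 0-1-2-3-0.
links = {frozenset(pair) for pair in [(0, 1), (1, 2), (2, 3), (3, 0)]}


def neighbours(node):
    """Nodes directly cabled to `node` (looked up at call time)."""
    return {n for link in links if node in link for n in link if n != node}


print(route(0, 3, neighbours))     # [0, 3]: direct cable
links.discard(frozenset((3, 0)))   # "disconnect the cable" between 0 and 3
print(route(0, 3, neighbours))     # [0, 1, 2, 3]: rerouted the long way round
```

Rerunning the discovery after every link change is the simplest possible
policy; on real nodes each board would only learn about its neighbours by
exchanging messages, which is where the interesting part of the exercise is.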


