[Beowulf] New HPCC results and the Myri viewpoint
tbole1 at umbc.edu
Wed Jul 20 21:55:05 PDT 2005
I must side with Patrick on this issue.
A GigE network works just fine for me, running embarrassingly parallel
Monte Carlo simulations. I haven't seen RGB weigh in on this, so I'll try
to make the point which I think would be his. A Beowulf is engineered to
solve a problem, not the other way around.
I don't know how much time parallelized chess programs spend passing
messages, but I can tell you that it seems to be a small market in the
grand scheme of Beowulfery. Most clusters that I have seen are devoted to
large scale simulations or numerical analysis. For these types of uses,
it seems a good bet that most time is spent on computation, not message
passing.
just my US$0.02
On Wed, 20 Jul 2005, Patrick Geoffray wrote:
> Vincent Diepeveen wrote:
> >>>Tests at all processors at the same time make major sense.
> >>Yes and no. Most networking people believe the job of a node is to send
> >>messages. Actually, it's mainly to compute, and sometimes to send
> >>messages. So, would running a pingpong test on multiple processors at
> >>the same time sharing a NIC be an interesting benchmark ? Not really, it
> >>won't happen much on real codes that compute most of the time. I prefer
> >>to optimize other things that help the host compute faster.
> > If most of the time they are 'just computing', then it just doesn't make
> > sense to have a high-end network. A $10 gigabit network will do in that case.
> And it does for many people. What is the most used interconnect in the
> cluster market ? GigE.
> > Reality, however, is different. Reality is that you simply stress the network
> > until it wastes, say, 10-20% of your system time, up to a maximum of 50%.
> What do you know about my reality ? Your reality is an 8x8 chessboard.
> Have you looked at a trace of one of the 10 ISV codes that are the
> majority of applications running on real world clusters ? Yes they do
> communicate, but they compute most of the time.
> Your reality is very unusual: your problem size is tiny, you add nodes
> to go faster, not bigger. If you added nodes to go bigger, then you
> would realize that your compute/communicate ratio (usually) increases.
> You have rambled on this list about parallel machines not being suited
> to your usage. Maybe it's the other way around, maybe nobody thinks about
> chess when they buy a cluster.
> > In short, if you deliver high-end NICs, ASSUME they get used.
> Of course they will get used, that's not the question ! It's about what
> is important. Tuning for a pattern that is not common has little return.
> An example for your curious and open mind: many interconnect people
> advertise the streamed bandwidth curve, where the sender just keeps
> sending messages as fast as possible. How often does this communication
> pattern happen in my reality ? Never. I have never seen an application
> sending enough messages back to back to fill up the pipeline. So why
> optimize for this case ? Because the curve looks good and people like
> to think they have a bigger pipe than their friends.
> > At least *i* understand that principle.
> Good for you. It must be lonely up there, so many stupid people around.
> > Weirdly enough the manufacturer of a product assumes his stuff isn't going
> > to get used.
> > Why make it then for your users?
> > You try to sell a product without your users using it?
> What was that procmail filter again ? I just remember the "idiot" part.
> Got to look in the archives...
> Patrick Geoffray
> Myricom, Inc.
Timothy W. Bole a.k.a valencequark
Department of Physics