[Beowulf] Register article on Cray Cascade
diep at xs4all.nl
Fri Nov 9 12:40:52 PST 2012
That's not how fast you can get the data at each core.
The benchmark I wrote is actually a reflection of how a hashtable
works for game tree search in general.
The speedup of it is exponential, so by doing it a different way we
can PROVE (as in mathematical proof)
that you will have trouble getting the same exponent.
So practical testing of what you can achieve from core to core is what counts.
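In the spirit of that benchmark, here is a minimal sketch (mine, not Vincent's actual code; all names and sizes are made up) of measuring memory latency with every core busy. It uses dependent reads through a shuffled table, so each read determines the next index and the loads cannot be overlapped or prefetched:

```python
import multiprocessing as mp
import random
import time

# Kept tiny for the sketch; a real run needs a table far larger than all caches.
TABLE_SIZE = 1 << 16

def worker(table, n_reads, out, idx):
    # Pointer-chasing: each read decides the next index, so reads serialize.
    pos = idx
    t0 = time.perf_counter()
    for _ in range(n_reads):
        pos = table[pos]
    out[idx] = (time.perf_counter() - t0) / n_reads  # avg seconds per read

if __name__ == "__main__":
    # A random permutation makes the table one big cycle of dependent loads.
    perm = list(range(TABLE_SIZE))
    random.shuffle(perm)
    table = mp.Array("i", perm, lock=False)
    n_procs = mp.cpu_count()  # the whole point: measure with ALL cores busy
    out = mp.Array("d", n_procs, lock=False)
    procs = [mp.Process(target=worker, args=(table, 10_000, out, i))
             for i in range(n_procs)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print("avg ns/read per process:", [round(v * 1e9, 1) for v in out])
```

Note that Python interpreter overhead dominates the absolute numbers here; the sketch only illustrates the structure (dependent reads, all cores loaded at once), which is what separates this kind of measurement from a single-core lab figure.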
The first disappointment then comes with the new Opteron cores:
AMD has designed
a memory controller which just doesn't scale if you use all cores.
Joel Hruska ran some tests there (not sure where he posted them).
We see that the Bulldozer-type architecture still scales OK if
you run benchmarks on a single core.
Sure, no really good latency, but still...
Yet if you move from measuring with 4 processes to measuring with 8
processes on a chip, we already land at nearly 200 ns, which is really slow.
The same effect happens when you run a big supercomputer at full
throttle with all cores.
Manufacturers can claim whatever they want, but it is always paper math.
If they ever release something it's some sort of single-core number,
whereas that box didn't get ordered to run single core in the first place.
You don't want the performance of a single core in a lab at
temperatures near 0 Kelvin;
you want to see that the box you got performs like that with all
cores running :)
And on the numbers posted you already start losing at Cray, starting
with the actual CPUs, which suck when you use all cores.
On Nov 9, 2012, at 8:38 PM, atchley tds.net wrote:
> Vincent, it is easy to measure.
> 1. Connect two NICs back-to-back.
> 2. Measure latency
> 3. Connect machines to switch
> 4. Measure latency
> 5. Subtract (2) from (4)
> That is how we did it at Myricom and that is how we do it at ORNL.
> Try it sometime.
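That procedure boils down to one subtraction; a trivial sketch with made-up placeholder numbers (purely illustrative, not real measurements from Myricom or ORNL):

```python
def switch_latency(back_to_back_us, through_switch_us):
    """Per-switch latency: (machine-switch-machine) minus (NIC-to-NIC)."""
    return through_switch_us - back_to_back_us

# Hypothetical example values:
nic_to_nic = 1.20   # step 2: two NICs cabled back-to-back (us)
via_switch = 1.35   # step 4: the same pair through one switch (us)
print(switch_latency(nic_to_nic, via_switch))  # the switch's contribution, in us
```

The appeal of the method is that everything except the switch (NIC firmware, PCIe traversal, software stack) appears in both measurements and cancels out in the subtraction.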
> On Fri, Nov 9, 2012 at 2:36 PM, Vincent Diepeveen <diep at xs4all.nl> wrote:
> On Nov 9, 2012, at 7:31 PM, atchley tds.net wrote:
> Modern switches need 100-150 ns per hop.
> Yeah, that's BS when you have software that measures it with
> all cores busy.
> I wrote a benchmark to measure that with all cores busy.
> The SGI box back then had 50 ns switches, which would
> 'in theory' give a latency of 480 ns at 500 CPUs,
> so 960 ns for a blocked read; yet I couldn't get it down to less
> than 5.8 us on average.
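For reference, the paper math versus the measurement in that anecdote, using only the figures given in the message:

```python
one_way_ns = 480                  # theoretical one-way latency at 500 CPUs
blocked_read_ns = 2 * one_way_ns  # a blocked read is a full round trip
measured_ns = 5.8 * 1000          # what the all-cores-busy benchmark saw (5.8 us)

print(blocked_read_ns)                # 960 ns on paper
print(measured_ns / blocked_read_ns)  # roughly 6x worse than paper math
```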
> There are some things that do not scale per hop, such as traversing
> the PCIe link from socket to NIC and back. So, I see it as 1.2 us to
> go to the router and back, and 100 ns per hop.
> On Fri, Nov 9, 2012 at 11:17 AM, Vincent Diepeveen <diep at xs4all.nl> wrote:
> The latency estimate of five hops seems a tad optimistic to me,
> unless I read the English wrong and they mean 1.7 microseconds per
> hop, making it 5 * 1.7 = 8.5 microseconds in total for five hops.
> "Not every node is only one hop away, of course. On a fully
> configured system, you are five hops away maximum from any socket,
> so there is some latency. But the delta is pretty small with
> Dragonfly, with a minimum of about 1.2 microseconds for a short hop,
> an average of 1.5 microseconds, and a maximum of 1.7 microseconds
> for the five-hop jump, according to Bolding."
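Taking the quote at face value (the first reading, per-path rather than per-hop), the per-hop delta is actually small and lines up with the 100-150 ns per hop figure mentioned earlier in the thread:

```python
short_hop_us = 1.2   # quoted minimum: one hop
five_hop_us = 1.7    # quoted maximum: five hops
extra_hops = 5 - 1   # the four additional hops beyond the first

per_hop_ns = (five_hop_us - short_hop_us) / extra_hops * 1000
print(per_hop_ns)  # ~125 ns per additional hop, within the 100-150 ns range
```

On the second reading (1.7 us per hop) the total would be 8.5 us, so the two interpretations differ by a factor of five; the per-hop arithmetic above favors the first one.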
> On Nov 8, 2012, at 7:13 PM, Hearns, John wrote:
> > Well worth a read:
> > http://www.theregister.co.uk/2012/11/08/
> > cray_cascade_xc30_supercomputer/
> > John Hearns | CFD Hardware Specialist | McLaren Racing Limited
> > McLaren Technology Centre, Chertsey Road, Woking, Surrey GU21
> 4YH, UK
> > T: +44 (0) 1483 262000
> > D: +44 (0) 1483 262352
> > F: +44 (0) 1483 261928
> > E: john.hearns at mclaren.com
> > W: www.mclaren.com
> > _______________________________________________
> > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin
> > Computing
> > To change your subscription (digest mode or unsubscribe) visit
> > http://www.beowulf.org/mailman/listinfo/beowulf