[Beowulf] Stroustrup regarding multicore
larry.stewart at sicortex.com
Wed Sep 3 11:42:26 PDT 2008
Lux, James P wrote:
> On 9/3/08 10:34 AM, "Peter St. John" <peter.st.john at gmail.com> wrote:
> I'm thinking that multicore will make topology interesting again,
> because of the difference between intercore on a common chip vs
> going through a nic to even the fastest fabric.
It is probably worth putting numbers on statements like this. For example, a main memory reference on a fast processor these days is around 80 nanoseconds. Sending a message to a process on another node
on a fast IB network is getting to 1.2 microseconds. Communicating
to another thread on the same socket is probably not much faster than
a memory reference since you have to thrash a cache-line or two back and
forth between cores.
The numbers for SiCortex stuff are similar: 80 ns for memory, 1 microsecond for MPI nearest-neighbor, 1.3 microseconds for max-diameter.
Core to core via shared memory is about 300 ns, IIRC.
We think of messaging to other nodes as taking a long time, but it isn't
really so. It is perfectly reasonable to think of programs that
communicate every 1000 flops or so, in the same way we think of 15-50
flops per cache miss as "reasonable".
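One rough way to check that comparison is to scale the 15-50 flops-per-cache-miss budget by the ratio of message latency to memory latency. The short Python sketch below does that with the latencies quoted above. The linear scaling rule and all of the names in it are assumptions made for illustration, not measurements or a model taken from any vendor.

```python
# Latencies quoted in this post, in nanoseconds.
MEMORY_NS = 80                   # main memory reference
IB_MESSAGE_NS = 1200             # message to another node on fast IB
SICORTEX_NEAREST_NS = 1000       # SiCortex MPI nearest-neighbor
SICORTEX_MAX_DIAMETER_NS = 1300  # SiCortex MPI max-diameter
SHARED_MEMORY_NS = 300           # core to core via shared memory (IIRC)


def scaled_flops(flops_per_miss, latency_ns, memory_ns=MEMORY_NS):
    """Scale a flops-per-cache-miss budget by latency / memory latency.

    Assumption: the acceptable amount of work between communication
    events grows linearly with the cost of one event.
    """
    return flops_per_miss * latency_ns / memory_ns


# How many memory references one IB message costs.
ratio = IB_MESSAGE_NS / MEMORY_NS

# The "15-50 flops per cache miss" budget, rescaled to one IB message.
low = scaled_flops(15, IB_MESSAGE_NS)
high = scaled_flops(50, IB_MESSAGE_NS)

if __name__ == "__main__":
    print(f"IB message = {ratio:.1f} memory references")
    print(f"equivalent budget: {low:.0f} to {high:.0f} flops per message")
    for name, ns in [
        ("shared memory", SHARED_MEMORY_NS),
        ("SiCortex nearest-neighbor", SICORTEX_NEAREST_NS),
        ("SiCortex max-diameter", SICORTEX_MAX_DIAMETER_NS),
    ]:
        print(f"{name}: {ns / MEMORY_NS:.2f} memory references")
```

Under that assumption one IB message costs about 15 memory references, so the equivalent budget comes out at a few hundred flops per message, the same order of magnitude as "every 1000 flops or so".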
So I am deeply skeptical of the current furor about how we need new
programming models for "multicore chips". We have models that work
perfectly well for 100-1000 core clusters; let's use them.
-Larry / Sector IX