[Beowulf] fast interconnects, HT 3.0 ...

Jim Lux James.P.Lux at jpl.nasa.gov
Tue May 23 18:00:43 PDT 2006


At 09:35 AM 5/23/2006, Eugen Leitl wrote:
>On Tue, May 23, 2006 at 11:04:52AM -0500, Richard Walsh wrote:
>
>If I have some 10^3 nodes,

or if your nodes are spread out over some volume of space..



>and the context is not read-only
>I always have to wait to make sure nobody is trying to write to
>the same location. It's a worst case, but in a relativistic universe
>maintaining the illusion of coherence over many copies is an
>expensive one. Lots of signalling back and forth, until you
>know the state is settled for sure. This might work for 8, 16, maybe 32
>systems in a close enough location -- but with 10^3 or 10^6 nodes it
>has to give.
>
> >    That is where the pGAS programming models become more efficient.  Remote
> >    memory references expressed in the syntax and compiled to instructions for
> >    direct puts and gets without management or translation by a NIC.  It
>
>We're talking lunatic fringe interconnects

lunatic fringe today, regular order of business in 3-4 or 10 years, so we'd 
better figure out how to deal with "non-simultaneity" pretty soon.


>where the wire or the fibre
>is your FIFO, and the switch makes a routing decision after a few bits
>of the headers have streamed past -- which is reasonably close to c.
>With 10 GBit data rates and above that's a quick decision to take.
>At 10 GBit/s your serial bit is just ~3 cm or 100 ps short -- in vacuum.
>Shorter in glass, and much shorter in copper. So a very short message
>can arrive within a few ns, which is order of magnitude RAM access.
Glass (silica) has an index of about 1.5, so 5 ns/meter.  Coax with velocity
factor 0.66 (solid PE dielectric) would be about the same.  Foam dielectric
(e.g. cable TV coax) has a faster propagation speed (VF=0.70 to
VF=0.80).  Twisted pairs, depending on the twist rate and dielectrics
involved, could be anywhere from 0.5 (6.7 ns/meter) to 0.9 (3.7 ns/meter).
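
As a quick check on those per-meter figures, here is a minimal Python sketch
that assumes the delay is simply 1/(VF * c), with VF = 1/n for an optical
medium of index n.  The function names and the table of media are my own,
picked for illustration; the VF values are the ones quoted above.

```python
# Propagation delay per meter from velocity factor (VF) or refractive index.
# Assumption: delay = 1 / (VF * c); dispersion and connector delays are ignored.

C_M_PER_NS = 0.299792458  # speed of light in vacuum, meters per nanosecond


def vf_from_index(n):
    """Velocity factor of a medium with refractive index n."""
    return 1.0 / n


def delay_ns_per_m(velocity_factor):
    """One-way propagation delay in ns per meter for the given VF."""
    return 1.0 / (C_M_PER_NS * velocity_factor)


# Illustrative table using the VF values mentioned in this message.
media = {
    "vacuum": 1.0,
    "silica glass (n=1.5)": vf_from_index(1.5),
    "coax, solid PE": 0.66,
    "coax, foam (low end)": 0.70,
    "coax, foam (high end)": 0.80,
    "twisted pair (slow)": 0.5,
    "twisted pair (fast)": 0.9,
}

for name, vf in media.items():
    print(f"{name:24s} VF={vf:.2f}  {delay_ns_per_m(vf):.2f} ns/m")
```

A handy rule of thumb falls out of this: vacuum is about 3.3 ns/meter, and
everything else is that number divided by the VF.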


> >    would seem that HT 3.0 supports this model across chassis as long as the
> >    programmer manages memory synchronization.
>
>You have to bite the bullet and manage synchronization by higher-order
>protocols. The physical world at the bottom is fundamentally message-passing.
>You might not notice it very much if you're working on the microsecond
>scale, but at ns and below you can't ignore it.
Exactly... when bit time (or message length) starts to get comparable to 
light time between endpoints, you've got to start thinking about it.
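
To put rough numbers on "comparable", here is a small Python sketch that
compares bit time and message time against one-way light time on a link.
The 10 m fibre run and the 64-byte message are hypothetical values chosen
only for illustration, and the helper names are mine.

```python
# Compare serialization time with propagation (light) time on a link.
# Assumptions: ideal serial link, no switch or SerDes latency, VF = 1/n in glass.

C_M_PER_S = 299_792_458.0  # speed of light in vacuum, m/s


def bit_time_s(rate_bps):
    """Duration of one bit on the wire, in seconds."""
    return 1.0 / rate_bps


def bit_length_m(rate_bps, velocity_factor=1.0):
    """Physical length of one bit in the medium, in meters."""
    return C_M_PER_S * velocity_factor / rate_bps


def flight_time_s(distance_m, velocity_factor=1.0):
    """One-way propagation time over distance_m, in seconds."""
    return distance_m / (C_M_PER_S * velocity_factor)


def bits_in_flight(distance_m, rate_bps, velocity_factor=1.0):
    """Number of bits 'stored' in the link at any instant."""
    return flight_time_s(distance_m, velocity_factor) * rate_bps


rate = 10e9           # 10 Gbit/s, as in the quoted text
vf_glass = 1.0 / 1.5  # silica fibre
link_m = 10.0         # hypothetical cable run
msg_bits = 64 * 8     # hypothetical short message

print(f"bit time:        {bit_time_s(rate) * 1e12:.0f} ps")
print(f"bit length:      {bit_length_m(rate) * 100:.1f} cm in vacuum, "
      f"{bit_length_m(rate, vf_glass) * 100:.1f} cm in glass")
print(f"flight time:     {flight_time_s(link_m, vf_glass) * 1e9:.1f} ns over {link_m:.0f} m")
print(f"message time:    {msg_bits * bit_time_s(rate) * 1e9:.1f} ns for {msg_bits} bits")
print(f"bits in flight:  {bits_in_flight(link_m, rate, vf_glass):.0f}")
```

With these assumed values the message time and the one-way flight time come
out about equal, i.e. the whole message is on the fibre at once, which is
the regime where you can no longer pretend the link is instantaneous.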


Jim. 




