[Beowulf] fast interconnects, HT 3.0 ...

Richard Walsh rbw at ahpcrc.org
Wed May 24 12:54:24 PDT 2006

Eugen Leitl wrote:
> On Wed, May 24, 2006 at 09:09:23AM -0500, Richard Walsh wrote:
>>    Jim, I meant cache coherence.  As we know, HT provides cache 
>> coherent and non-cache coherent
>>    memory management.  Typically within the board complex on an SMP 
>> device we want cache coherency.
> You cannot have cache coherency over a large amount of systems *and*
> have temporally unconstrained execution. There is no free lunch.
> There are already coherency issues in distributing such a simple
> thing as clock over such a small area as a single die. (Which
> is why global clocks will go away one day).
    Yes, yes ... ;-) ... ccNUMA gets heavy-weight as the processor count
    rises, but again what I was asking is:  If HT 3.0, by providing a
    chassis-to-chassis connection, defines a single, globally addressable
    address space across enough processors (thus my question about
    switches and scalability), then it could become a standard protocol on
    top of which pGAS languages like UPC and CAF can run.  The HT 3.0
    layer would play a role similar to the (non-coherent) global address
    space that the Cray X1 provides UPC and CAF across its node boards,
    while allowing on-board applications a cache-coherent alternative ...
    or it could function, as a standard, as an important alternative to
    GASnet in a COTS/cluster regime.

>>    The HT 3.0 standard, as I understand it, offers off-chassis memory 
>> access at lower bit rates using AC power,
>>    but without cache coherence.  This is quite similar to the approach 
>> taken on the Cray X1 with cache coherent
>>    on-board images and non-coherent access off-board.  The Cray X1 
> I think cache coherency on 4-16 CPUs on-board does make some sense.
     Yes, again ... as I said above, as the Cray X1 design shows, etc.
>> support the partitioned Global Address
>>    Space (pGAS) programming models of UPC and CAF.   The question here 
> pGAS assumes shared memory. There is no such thing as a shared memory,
> beyond of multiport memory where "crossbars do not scale" thing applies.
    pGAS only assumes a >>somehow globally addressable memory<<; that's
    why you can still run UPC and CAF on a cluster.  The >>global
    addressability<< is currently provided through the GASnet API written
    for your particular interconnect, although it can even run "upside
    down" on top of MPI!

    HT 3.0 would seem to offer a more uniform and efficient way of
    providing the >>somehow global addressability<< pGAS languages need in
    a COTS/cluster regime, with better latency and bandwidth.  I was
    hoping someone here who knows HT 3.0 well would be able to comment,
    but it seems folks are not too familiar with it yet, or with UPC and
    CAF in a cluster context.  People should download Berkeley UPC and
    give it a try ... ;-) ...

    Perhaps you are asserting that with latencies of ~1.5 microseconds
    for a 1-byte message and an n_1/2 bandwidth of 400 Mbytes/sec at
    400-byte messages, we can't hope to do better at the scale we would
    like to take cluster systems.  This problem is exacerbated by the
    absence of the vector memory operations that are available on the
    Cray X1 ... although there is a message-vector-like set of UPC
    memory-copy library routines that can be used.  The Cray delivers
    better latency and bandwidth in UPC and CAF than it does in MPI,
    which argues that there is still some room for improvement.

    It may be that little more can be extracted from the interconnect
    physics with HT 3.0 ... but pGAS language performance should at least
    equal MPI's, and it would have a programming-elegance advantage.


>> was: What do those who understand
>> HT 3.0 better than I do think about its ability to similarly 
>> support the pGAS programming style
>>    efficiently?  The follow up question was:  What might be the 
>> implications for commodity parallel programming
>>    in MPI.  I want to get a feel for HT 3.0's scalability in this 
>> context, the need/density of potential HT switches,
>>    etc. 
>>    The discussion on signal coherence was of course interesting ... ;-) ...
> ------------------------------------------------------------------------
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf


Richard B. Walsh

Project Manager
Network Computing Services, Inc.
Army High Performance Computing Research Center (AHPCRC)
rbw at ahpcrc.org  |  612.337.3467

This message (including any attachments) may contain proprietary or
privileged information, the use and disclosure of which is legally
restricted.  If you have received this message in error please notify
the sender by reply message, do not otherwise distribute it, and delete
this message, with all of its contents, from your files.
