Kidger's comments on Quadric's design and performance

Wed Apr 24 06:26:53 PDT 2002

Sorry if you get something like this message twice. I submitted it
once and nothing has come back, although my correction to one of the
www addresses went through :-(

Joachim Worringen <joachim at lfbs.RWTH-Aachen.DE> wrote:

> > This message also reminded me to ask if a long-held opinion is valid - and
> > that opinion is "that a cache coherent interconnect would offer performance
> > enhancement when applications are at the 'more tightly coupled' end of the
> > spectrum."  I know that present PCI based interfaces can't do that without
> > invoking software overhead and latencies.  Anyone have data - or an argument
> > for invalidating this opinion?
> 
> You would need another programming model than MPI for that (see below),
> maybe OpenMP as you basically have the characteristics of a SMP system
> with cc-NUMA architecture.

No, you are confusing two completely different issues. To support
OpenMP you need a single address space which spans the processors. 

You can have cache coherent communication interfaces which do not
implement such a thing. (If it's still the same as it was at Meiko,
the Quadrics interconnect is an example of such an interface.)

What Quadrics provides is an explicit remote store access model. You
can perform reads or writes cache coherently to a remote process'
address space, but you have to know that you're doing a remote access
and do something different to achieve it. You can't just indirect
through some random pointer and have that fetch data.
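
To make the distinction concrete, here is a minimal sketch of an
explicit remote store model. It is only a toy simulation in Python:
the names AddressSpace, Network, remote_put and remote_get are
hypothetical and are not the Quadrics API. Each process has its own
memory, an offset means something only inside one address space, and
touching another process's memory takes a distinct, explicit call.

```python
class AddressSpace:
    """One process's private memory; offsets are meaningful only within it."""

    def __init__(self, rank, size):
        self.rank = rank
        self.mem = [0] * size


class Network:
    """Hypothetical stand-in for the interconnect (not a real API)."""

    def __init__(self, nprocs, size):
        self.spaces = [AddressSpace(r, size) for r in range(nprocs)]

    def remote_put(self, target_rank, offset, value):
        # Explicit remote write: the caller names the target process.
        self.spaces[target_rank].mem[offset] = value

    def remote_get(self, target_rank, offset):
        # Explicit remote read: again the target process is named.
        return self.spaces[target_rank].mem[offset]


net = Network(nprocs=2, size=4)
local = net.spaces[0]
local.mem[1] = 7           # ordinary local store on rank 0
net.remote_put(1, 1, 42)   # explicit store into rank 1's memory
```

The same offset (1) refers to two different locations here, which is
why a bare pointer cannot simply be handed to another process and
dereferenced there.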

OpenMP assumes a single address space within which pointers can be
passed around freely, so it cannot easily be implemented on top of an
interface like Quadrics, even though that is (I believe) cache
coherent at both ends.

Languages which are built on an explicit remote store access model
include

Co-Array Fortran        http://www.co-array.org
UPC                     http://hpc.gwu.edu/~upc
Titanium                http://www.cs.berkeley.edu/Research/Projects/titanium/

In these languages the compiler always knows which accesses may be
remote.
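
As a rough illustration of why the compiler can know this: in
Co-Array Fortran a reference to another image's data carries a
co-subscript in square brackets, so x(i) is local while x(i)[p] may be
remote. The sketch below classifies references by that marker alone.
It is a deliberately simplistic text match (a real compiler works on
the parsed program), and the function name may_be_remote is made up
for this example.

```python
import re

# Assumption: a trailing [...] is a co-subscript, as in Co-Array Fortran.
CO_SUBSCRIPT = re.compile(r"\[[^\]]+\]\s*$")


def may_be_remote(ref):
    """Return True if the reference text ends in a co-subscript."""
    return bool(CO_SUBSCRIPT.search(ref.strip()))


refs = ["x(i)", "x(i)[p]", "a(1:n)[me+1]", "y"]
remote = [r for r in refs if may_be_remote(r)]
```

UPC carries the equivalent information in the type system instead,
through the "shared" qualifier, but the effect is the same: possible
remoteness is visible at compile time.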

Of course, such languages can also run on SMP boxes and use a
"genuinely" shared memory (and indeed, one might hope that the extra
information available in such languages allows the compiler to
generate better code for such a machine than it can generate from
OpenMP, since it should be able to avoid much false sharing).
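
A small worked example of the false sharing point, under assumed
numbers (64-byte cache lines, 8-byte elements, 4 processors; none of
these come from any particular machine). The helper shared_lines is
hypothetical. It counts cache lines that are written by more than one
processor for a given data distribution, which is the kind of
information a compiler has when ownership is explicit in the language.

```python
LINE_BYTES = 64   # assumed cache line size
ELEM_BYTES = 8    # assumed element size (e.g. a double)


def shared_lines(owner_of, n_elems):
    """Count cache lines written by more than one processor.

    owner_of(i) returns the processor that writes element i.
    """
    per_line = LINE_BYTES // ELEM_BYTES
    writers = {}
    for i in range(n_elems):
        writers.setdefault(i // per_line, set()).add(owner_of(i))
    return sum(1 for w in writers.values() if len(w) > 1)


N, P = 64, 4
# Cyclic distribution: neighbouring elements belong to different processors.
cyclic = shared_lines(lambda i: i % P, N)
# Blocked distribution: each processor owns one contiguous run of N // P.
blocked = shared_lines(lambda i: i // (N // P), N)
```

With these numbers the cyclic layout has every one of its 8 lines
written by several processors, while the blocked layout has none.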

-- Jim 

James Cownie	<jcownie at etnus.com>
Etnus, LLC.     +44 117 9071438
http://www.etnus.com