Unisys

Tue Feb 27 08:24:06 PST 2001

> On Mon, 26 Feb 2001 kragen at pobox.com wrote:
> 
> > FWIW, big SMPs will tend to do shared-memory things more efficiently
> 
> Shared memory is unphysical. Don't use it for code with expected long
> lifetime. You're facing relativistic lag problems, signal fanout problems
> (multiport memory is expensive to do), and the nightmare of cache
> coherency. Most of it is hidden in current overhead/inefficiencies,
> but it's there.

I'm not sure you are entirely in tune with this conversation.  The nomenclature
"shared-memory" is, in this case, used in opposition to a message-passing
paradigm.  The implementations of shared memory machines can certainly vary,
and the concerns you mentioned are present in various designs.  However,
you would be EXTREMELY hard pressed to make a case that a message passing
machine implements a shared memory paradigm as well as a large SMP box
of the same scale.  With the SMP box you simply have to pay more, and live
with a lower upper limit on scaling.  :)

> 
> > than machines that actually have to pass messages to simulate shared
> > memory, and some people think writing a threaded program that scales
> 
> If you want to access bits stored in a remote piece of hardware (such
> as a memory chip), you apply a bit pattern, selecting a group of bits, and
> receive said bit pattern. If this is not message passing, I don't know
> what message passing is.

You don't know what message passing is.  j/k :) The term message passing,
as used here, means logical communication between interacting processes.
Obviously the term "process" is handled differently in different scenarios
(ranging from processors to processes to applications, etc), but here we
are specifically describing those entities that contribute directly to a
user-level application.  In other words it is a form of RPC (such as MPI).
Anyway, what you have described is a mechanism that an SMP box (and maybe
a few others) might use internally to interconnect processors.  This is
not the typical handling of the term on this mailing list.
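
To make the distinction concrete, here is a minimal sketch of the two
programming models at the user level. It is only an illustration: the
function names are made up for this example, and both variants run as
threads in one Python process, so it shows the style of communication
rather than the performance of any real SMP or cluster.

```python
import queue
import threading


def sum_shared_memory(chunks):
    """Shared-memory style: every worker updates one shared total."""
    total = 0
    lock = threading.Lock()

    def worker(chunk):
        nonlocal total
        partial = sum(chunk)
        with lock:  # all workers contend for the same memory location
            total += partial

    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total


def sum_message_passing(chunks):
    """Message-passing style: workers send results explicitly."""
    mailbox = queue.Queue()  # stands in for the interconnect (assumption)

    def worker(chunk):
        mailbox.put(sum(chunk))  # explicit send; no shared variable written

    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # The collector receives exactly one message per worker.
    return sum(mailbox.get() for _ in chunks)
```

In the first function the communication is implicit (a write to a shared
variable); in the second it is an explicit send and receive, which is the
sense in which MPI-style codes are called message passing.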

> 
> It is handled in hardware, but there is no reason why distributed memory
> could not be emulated by hardware on an efficient message-passing
> infrastructure.

Except that distributing memory causes network issues to govern the throughput
more directly, and that scaling your memory locations causes a higher work-function
for each cache miss, and the cost of creating a "pipe" between processors
that is fat enough and fast enough for large clusters can be VERY significant,
etc...  SGI did some really cool work on this with their ccNUMA architecture...
It has a new name now, but I can't remember it off hand... NUMAPlex or something...
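
The cache-miss point can be put in rough numbers with the usual average
access time formula. This is a back-of-the-envelope sketch only: the
function name and the latencies in the usage note are hypothetical, not
measurements of any real machine.

```python
def effective_access_ns(hit_rate, local_ns, remote_ns):
    """Average access time when hits are served locally and misses
    must be fetched across the interconnect."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * local_ns + (1.0 - hit_rate) * remote_ns
```

With an assumed 10 ns local hit and a 90% hit rate, a 100 ns miss averages
about 19 ns per access, while a 1000 ns miss over a network averages about
109 ns, so the miss penalty quickly dominates as memory moves farther away.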

Just my 2 cents.

Eric
