[Beowulf] Parallel memory

Wed Oct 19 03:29:41 PDT 2005

On Tue, 2005-10-18 at 18:28 -0400, Robert G. Brown wrote:

> There was once also an online list discussion about swapping to an NFS
> mounted remote exported ramdisk, IIRC.  In principle all of that should
> work although I have NOT tried it and cannot say for certain.  Making a
> big ramdisk is fairly straightforward.  Once made and mounted, it can be
> NFS exported.  The trick then is whether it is indeed possible to swap to
> an NFS mounted swapspace.  On Suns this used to be routine (otherwise
> e.g. the SLC and ELC, fully diskless sparc nodes from the 90's, wouldn't
> have worked).  On linux back in the 2.0 and maybe 2.2 series, I think
> that there was a problem -- maybe with page sizes? -- and it wouldn't
> work.  But I THINK that I recall hearing that remote linux swap now
> works, and if so that might be a way to go to get to a largish VM.
> 
> I'd really be interested (I'm sure the whole list would be) if you try
> the latter and it works, especially if it works pretty "well" (e.g.
> orders of magnitude faster than disk swap if not as fast as real
> local memory).  It would also give a great idea to all those cs grad
> students looking for a useful project, as very large VM systems are
> almost as advantageous to certain kinds of research as very FAST local
> memory systems are to others.

You are almost right: there *used* to be a kernel module that allowed
swap over the network.  I remember playing with it in the 2.2 days (a
long time ago).  I say "used to" because I honestly don't know whether
it's still there; somehow I doubt it.  It worked over sockets, and the
remote "server" was a simple user-space affair like any other process
running on a node, none of this NFS/ramdisk stuff.  What you got on the
client end was a network block device, which you could then configure as
swap if you wanted.  I used it to create logical devices larger than any
physical device I had and then put a filesystem on them: user-space
remote LVM, if you like.
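
To give a feel for the "simple user-space affair", here is a minimal
sketch of a block server that keeps its blocks in ordinary process
memory and answers read/write requests over a socket.  It is a toy
under stated assumptions, not the protocol the kernel module used: the
request layout, the names (serve, block_read, block_write,
block_roundtrip_demo) and the sizes are all made up for illustration,
and a socketpair plus fork() stands in for a TCP connection between two
nodes.

```c
/* Toy block server over a socket. Illustrative only: this is NOT the
 * real network block device protocol; all names here are hypothetical. */
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum { BLOCK_SIZE = 4096, NUM_BLOCKS = 16 };
enum { OP_READ = 0, OP_WRITE = 1, OP_QUIT = 2 };

/* Sent raw; a real wire protocol would also fix the byte order. */
struct request { uint32_t op; uint32_t block; };

/* Read or write exactly n bytes, looping over short transfers. */
static int xfer(int fd, void *buf, size_t n, int writing)
{
    char *p = buf;
    while (n > 0) {
        ssize_t r = writing ? write(fd, p, n) : read(fd, p, n);
        if (r <= 0) return -1;
        p += r;
        n -= (size_t)r;
    }
    return 0;
}

/* Server: an ordinary process holding the blocks in its own RAM. */
static void serve(int fd)
{
    static char store[NUM_BLOCKS][BLOCK_SIZE];
    struct request rq;
    while (xfer(fd, &rq, sizeof rq, 0) == 0 && rq.op != OP_QUIT) {
        if (rq.block >= NUM_BLOCKS) break;
        if (rq.op == OP_WRITE) {
            if (xfer(fd, store[rq.block], BLOCK_SIZE, 0)) break;
        } else if (rq.op == OP_READ) {
            if (xfer(fd, store[rq.block], BLOCK_SIZE, 1)) break;
        } else {
            break;
        }
    }
}

/* Client: send a header, then send or receive one block of data. */
static int block_write(int fd, uint32_t block, const void *data)
{
    struct request rq = { OP_WRITE, block };
    if (xfer(fd, &rq, sizeof rq, 1)) return -1;
    return xfer(fd, (void *)data, BLOCK_SIZE, 1);
}

static int block_read(int fd, uint32_t block, void *data)
{
    struct request rq = { OP_READ, block };
    if (xfer(fd, &rq, sizeof rq, 1)) return -1;
    return xfer(fd, data, BLOCK_SIZE, 0);
}

/* Fork a server, write one block, read it back. Returns 0 on a match. */
int block_roundtrip_demo(void)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) return -1;
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) { close(sv[0]); serve(sv[1]); _exit(0); }
    close(sv[1]);

    char out[BLOCK_SIZE], in[BLOCK_SIZE];
    memset(out, 0xAB, sizeof out);
    memset(in, 0, sizeof in);
    int rc = block_write(sv[0], 3, out) || block_read(sv[0], 3, in);

    struct request quit = { OP_QUIT, 0 };
    xfer(sv[0], &quit, sizeof quit, 1);
    close(sv[0]);
    waitpid(pid, NULL, 0);
    return (rc == 0 && memcmp(in, out, BLOCK_SIZE) == 0) ? 0 : -1;
}

#ifdef BLOCK_DEMO_MAIN
int main(void)
{
    assert(block_roundtrip_demo() == 0);
    return 0;
}
#endif
```

Compiling with -DBLOCK_DEMO_MAIN gives a standalone program.  Note that
every request here is a synchronous round trip, which is one reason a
scheme like this tends to be bound by latency rather than bandwidth.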

On a good day it worked, but it was flaky and, although I'm sure it
served a purpose, wrong by design for any serious use.  The chances of
it out-performing disc are small unless you buy a proper network, and
even then it will only deliver a small proportion of what the network
is capable of.

> > Also, are there any tools to help implement mpi in an older code?
> 
> I've decided that I'm too ignorant to help on MPI questions, so I'll
> leave this to maybe Jeff Squyers and other people who are real experts;-)

Well, if all you want is access to memory over the network, maybe MPI
isn't exactly the right choice either, although it is close.  Something
like Cray SHMEM would be easier to program, as it has simple put()/get()
calls.  Combine this with an RDMA-capable network and application-layer
pipelining of memory access and you should be able to get access to
terabytes of RAM for almost zero CPU usage.
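
As a rough illustration of why the pipelining matters, here is a
back-of-envelope model.  The symbols and numbers are my own assumptions
for the sake of the example, not measurements: L is the one-way request
latency, B the link bandwidth, S the transfer size and k the number of
requests kept in flight.

```latex
% Illustrative model only; L, B, S and k are assumed symbols.
\[
  T(k) = \min\!\left( B,\; \frac{k\,S}{L + S/B} \right)
\]
% Assumed example values: L = 50 \mu s, B = 125 MB/s, S = 4096 bytes.
\[
  \frac{S}{B} \approx 32.8\,\mu\mathrm{s}, \qquad
  T(1) \approx \frac{4096\ \mathrm{B}}{82.8\,\mu\mathrm{s}}
       \approx 49.5\ \mathrm{MB/s} \approx 0.40\,B
\]
% The link saturates once k >= 1 + L B / S.
\[
  k \ge 1 + \frac{L\,B}{S} \approx 2.5
  \quad\Longrightarrow\quad k = 3 \text{ requests in flight}
\]
```

With those assumed figures a single outstanding request reaches only
about 40% of the link, whereas three requests in flight are enough to
hide the latency.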

Then again, MPI *is* ubiquitous whilst SHMEM isn't, and it wouldn't be
that much harder or slower to code in MPI, so maybe it is the right
solution after all.

Ashley,