[Beowulf] CSharifi Next generation of HPC
Mark Hahn hahn at mcmaster.ca
Wed Dec 5 08:22:55 PST 2007
- Previous message: [Beowulf] CSharifi Next generation of HPC
- Next message: [Beowulf] Using Autoparallel compilers or Multi-Threaded libraries with MPI
> DSM / Distributed Shared Memory (which I prefer to call NVM, Network
> Virtual Memory) is a perfect example of this. It certainly doesn't help

I think the 'N' is a valuable change, but would suggest NSM is even better.
to me, the V hints too much of paging-type VM, and doesn't hint at the
main point (sharing).

> the end user. The only aspect an end user or system administrator sees is
> that NVM causes cascading system failures when one machine drops out of
> the cluster.

a really good NSM implementation might well provide some kind of persistence,
even replication of the space. it would be tricky to do without introducing
some sort of transactional support, though, and that seriously complicates
the user-level interface. of course, people who do this sort of thing often
worry about different consistency models, which require transaction-like
directives anyway. again, the programmer's interface becomes not so simple.

> The programmer doesn't benefit either. They initially
> think that NVM gives them an easy to use shared memory model. They
> quickly find that it only appears to be normal memory. To get even barely
> acceptable performance they have to treat the shared memory very
> differently than regular memory. Variables written by different processes
> have to be segregated into different pages. Writes have to be grouped. You
> have to think about when to manually cache structures to avoid a re-read
> that might trigger a network page fault, but refresh that structure when
> you need potentially updated values.

well put. I was pondering how to say this while also pointing out that
even within a single machine, programmers really cannot think of memory as
flat. that is, you have to program for your caches.

level      latency    size     concurrency
register   <.5 ns     8B       1-10? (renaming)
L1         1-2 ns     64B      ~2
L2/3       4-20 ns    64B      ~1
ram        50-80 ns   64B      1-4
remote     5+ us      4KB      1
swap       10 ms      >=4KB    1

the 'remote' row is for a reference to an NSM page that has to be brought
over the net, and assumes a fast interconnect. it's effectively the same
as an MPI send and receive. notice that you can't really express just a send
with NSM (it would be a blind write).

I think NSM is attractive mainly at a shallow level: either for very simple,
limited applications which just want to replicate a chunk of read-only
shared memory across machines, or for cases where details like locking and
locality haven't been thought out yet.
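
As a rough illustration of the quoted points about segregating variables into different pages and grouping writes, here is a toy Python sketch. It is not from the original post: the single-owner page protocol, the function names (count_page_transfers, interleaved, grouped) and the addresses are all assumptions made for illustration. Only the 4KB page size and the 5 us lower bound on a remote reference come from the table above.

```python
PAGE_SIZE = 4096        # bytes; the 4KB 'remote' granularity from the table
REMOTE_FAULT_US = 5.0   # microseconds; lower bound for 'remote' in the table


def count_page_transfers(writes, page_size=PAGE_SIZE):
    """Count page ownership transfers under a toy single-writer protocol.

    writes: sequence of (process_id, byte_address) in global order.
    A transfer is counted whenever a process writes to a page that is
    currently owned by a different process. The first touch of a page
    is free in this model (an assumption, to keep the counts simple).
    """
    owner = {}
    transfers = 0
    for pid, addr in writes:
        page = addr // page_size
        if page in owner and owner[page] != pid:
            transfers += 1
        owner[page] = pid
    return transfers


def interleaved(addr_a, addr_b, n):
    """Processes 0 and 1 alternate single writes, n writes each."""
    writes = []
    for _ in range(n):
        writes.append((0, addr_a))
        writes.append((1, addr_b))
    return writes


def grouped(addr_a, addr_b, n):
    """Process 0 does all n of its writes, then process 1 does its n."""
    return [(0, addr_a)] * n + [(1, addr_b)] * n


n = 1000
# Two different variables that happen to live on the same page.
same_page = count_page_transfers(interleaved(0, 8, n))
# The same two variables segregated onto different pages.
separate_pages = count_page_transfers(interleaved(0, PAGE_SIZE, n))
# Same page again, but each process batches its writes.
grouped_same_page = count_page_transfers(grouped(0, 8, n))

for label, t in [("same page, interleaved", same_page),
                 ("separate pages, interleaved", separate_pages),
                 ("same page, grouped", grouped_same_page)]:
    print(f"{label}: {t} transfers, at least {t * REMOTE_FAULT_US:.0f} us")
```

In this model the interleaved same-page case bounces the page on nearly every write, while either segregating the variables or grouping the writes removes almost all transfers. A real NSM/DSM system would have a more elaborate coherence protocol, so the numbers are only meant to show the shape of the problem.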
