charmm scalability on 2.4 kernels

Bogdan Costescu bogdan.costescu at iwr.uni-heidelberg.de
Tue Jan 8 10:32:18 PST 2002


On Mon, 7 Jan 2002, Greg Lindahl wrote:

> On Mon, Jan 07, 2002 at 07:22:39PM +0100, Tru wrote:
> 
> > I think the bad speedup comes from dual vs. single CPU nodes
> > with regard to the parallel behaviour of CHARMM.
> 
> If so, that's easy enough to check: You can run only one process on a
> dual cpu node, for benchmarking purposes.

That is actually what I have observed over the last 3 years of running 
different versions of kernels, MPI libraries and CHARMM. Running with 
only one transport (TCP or shared memory) is always better than mixing 
them, e.g. (using LAM 6.5.6):

CPUs  nodes  real time (min)  transports
4     4      5.95             TCP
4     2      7.08             TCP+USYSV

As you can see, the difference is quite significant.
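
For reference, a minimal sketch of how one can force a single process per
node with LAM, by advertising only one CPU per host in the boot schema
(the hostnames below are placeholders; with LAM 6.5.x the transport
itself is chosen at build time, IIRC via the --with-rpi configure
option):

  # lamhosts: declare each dual node with cpu=1 so that LAM
  # schedules only one process per node
  node01 cpu=1
  node02 cpu=1
  node03 cpu=1
  node04 cpu=1

  $ lamboot -v lamhosts
  $ mpirun C charmm < input.inp > output.out

With only one process per node, all message passing then goes over TCP
even if the library was built with the usysv RPI.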

With 8 single-CPU nodes over Fast Ethernet, the parallel efficiency of 
PME drops to 50% (i.e. only a 4x speedup on 8 nodes); using Myrinet 
(with SCore), it's around 75% - so the algorithm is not quite 
Beowulf-friendly. However, I haven't noticed any significant change in 
scalability between runs with 2.2.x and 2.4.x kernels.

I obtained behaviour similar to that shown in the graphs when I used TCP 
instead of shared memory for IPC on the same node. Incidentally, could 
the zero-copy kernel infrastructure be used to improve this situation?

-- 
Bogdan Costescu

IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: Bogdan.Costescu at IWR.Uni-Heidelberg.De



