[Beowulf] Any Gaussian users out there?

Mikhail Kuzminsky kus at free.net
Mon Jan 8 08:30:31 PST 2007


In message from Joe Landman <landman at scalableinformatics.com> (Sun, 07 
Jan 2007 22:49:55 -0500):
>I found a neat ... feature ... of Linux while getting g03 running in 
>SMP on cluster nodes.  Long story, but the folks I am doing this for 
>don't have/want to use Linda.  They asked us to help them get g03 
>operational in SMP parallel.  This wasn't painful.  Have it 
>integrated into SGE and our SICE interface now as well.
>
>Basic idea is that we are getting a kernel exception in the VFS layer 
>only when running with 2 or more CPUs on an SMP node.  Shows up only 
>on SuSE 9.3 nodes.  The other nodes are RHEL 3 based (2.4 kernel, but 
>hey, it's really stable).
   We have g03 C02 working with SMP parallelization under SuSE 9.0 for 
x86-64 (a 2.6 kernel, but older than the one in 9.3). In particular, xfs 
works OK.

Yours
Mikhail


>
>I don't want to post a nasty-looking trap here.
>
>The problem occurs with both xfs and jfs.  Haven't had the chance to 
>try ext3 yet, though if the issue is in the vfs layer, I can't see 
>how changing the underlying block device is going to alter the layers 
>(VFS) above it.
>
>The net effect of this is that it runs great on the 2.4 based 
>machines, but gets SIGKILLs when running on the 2.6 based SuSE 9.3 
>machines. Looks like the app is tickling the OS bug.  I can 
>repeatably cause this trap, though it seems to occur at "random" 
>places, well, not really. The way Gaussian runs, it has "links" which 
>are binary modules which execute a particular portion of the 
>calculation (it's pretty neat really).  Each link is read in from the 
>disk.  This VFS bug gets triggered regardless of local or remote FS.
>
>Any Gaussian users out there see that?  Does a kernel upgrade fix it? 
>Inquiring minds want to know ...
>
>-- 
>
>Joseph Landman, Ph.D
>Founder and CEO
>Scalable Informatics LLC,
>email: landman at scalableinformatics.com
>web  : http://www.scalableinformatics.com
>phone: +1 734 786 8423
>fax  : +1 734 786 8452 or +1 866 888 3112
>cell : +1 734 612 4615



