From ocoskun1 at jhu.edu Thu Mar 11 18:46:02 2004
From: ocoskun1 at jhu.edu (ORKID COSKUNER)
Date: Tue Nov 9 01:14:28 2010
Subject: [scyld-users] jaguar
Message-ID:

Hi,

I have no experience with beowulf and have started working with a
biowulf cluster for my postdoc.

My question is: we bought the program Jaguar, which is good for some
ab initio calculations. The program is already in parallel mode, and
there is no need to install PBS or DQS, as the technical support
workers at the Schroedinger company said.

However, the program Jaguar does not run on our cluster biowulf scyld.
The name of our administrator is amit paliwali, and maybe you know him
already.

Could you please tell us what we have to change in the batch queue to
get Jaguar to run, and to run on 15 nodes?

thanks,
oc

From becker at scyld.com Thu Mar 11 18:52:01 2004
From: becker at scyld.com (Donald Becker)
Date: Tue Nov 9 01:14:28 2010
Subject: [scyld-users] jaguar
In-Reply-To:
Message-ID:

On Thu, 11 Mar 2004, ORKID COSKUNER wrote:

> My question is: we bought the program Jaguar which
> is good for some ab initio calculations. The program
> is already in parallel mode and there is no need to
> install pbs or dqs as the technical support workers
> at the schroedinger company said.

Is it an MPI program? If so, is it compiled with the Scyld BeoMPI, or
pre-linked with another MPI implementation?

If it uses BeoMPI, you may use all CPUs (e.g. 2 processes per dual-SMP
node) with
    export ALL_CPUS=1
and just run the program as you would any other.

Otherwise you likely need to use 'mpprun' or 'mpirun', perhaps with
additional command-line parameters.

This is described in the Scyld Beowulf User Guide. Additional details
are in the Administrator Guide and the Programmers Guide.

> However, the program Jaguar does not run on our
> cluster biowulf scyld. The name of our
> administrator is amit paliwali and maybe you know
> him already.
> Could you please help us what we have
> to change in the batch queue to get the jaguar run and
> this on 15 nodes?

-- 
Donald Becker                       becker@scyld.com
Scyld Computing Corporation         http://www.scyld.com
914 Bay Ridge Road, Suite 220       Scyld Beowulf cluster systems
Annapolis MD 21403                  410-990-9993

From bioinformaticist at mn.rr.com Wed Mar 17 21:53:01 2004
From: bioinformaticist at mn.rr.com (Eric R Johnson)
Date: Tue Nov 9 01:14:28 2010
Subject: [scyld-users] Scyld system mysteriously locks up
Message-ID: <405608D0.60501@mn.rr.com>

Hello,

I purchased a 4 node, 8 processor Scyld (version 28) cluster
approximately 6 months ago. About 5 days ago, it started mysteriously
locking up on me. Once it is locked up, I can't do anything except
physically reboot the machine.

Unfortunately, I am rather new to Linux clusters and, since it worked
"right out of the box", I have had no experience in troubleshooting.
Can someone give me an idea of where I should start?

I have the BIOS on all machines set to do a full memory check on
startup, and the /var/log/messages file shows nothing.

Thanks,
Eric

-- 
********************************************************************
Eric R A Johnson
University Of Minnesota                     tel: (612) 626 5115
Dept. of Laboratory Medicine & Pathology    fax: (612) 625 1121
7-230 BSBE                                  e-mail: john4482@umn.edu
312 Church Street                           web: www.eric-r-johnson.com
Minneapolis, MN 55455
USA

From twhitcomb at apl.washington.edu Thu Mar 18 12:20:01 2004
From: twhitcomb at apl.washington.edu (Tim Whitcomb)
Date: Tue Nov 9 01:14:28 2010
Subject: [scyld-users] Re: Scyld system mysteriously locks up
In-Reply-To: <200403181704.i2IH4sA06839@NewBlue.scyld.com>
References: <200403181704.i2IH4sA06839@NewBlue.scyld.com>
Message-ID: <4059DDEB.7000705@apl.washington.edu>

> I purchased a 4 node, 8 processor Scyld (version 28) cluster
> approximately 6 months ago. About 5 days ago, it started mysteriously
> locking up on me. Once it is locked up, I can't do anything except
> physically reboot the machine.
> Unfortunately, I am rather new to Linux clusters and, since it worked
> "right out of the box", I have had no experience in troubleshooting.
> Can someone give me an idea of where I should start?
> I have the BIOS on all machines set to do a full memory check on startup
> and the /var/log/messages file shows nothing.
> Thanks,
> Eric

This sounds suspiciously like a problem we've been fighting for the
past year at least. Are the machines actively running a job when they
lock up, or are they sitting idle? I've done some tests that seem to
suggest that our system does not like the same job being run on both
processors of the same machine.

Where did you purchase your equipment from, what kind of processors
are in it, what kind of interconnect are you using, and what is the
motherboard in the machines?

TRW

Timothy R. Whitcomb
===================
Meteorologist
Applied Physics Lab
University of Washington
mail: twhitcomb at apl dot washington dot edu
voice: (206) 543-2663

From kristen at cgcmail.cpmc.columbia.edu Mon Mar 22 14:38:54 2004
From: kristen at cgcmail.cpmc.columbia.edu (Kristen J. McFadden)
Date: Tue Nov 9 01:14:28 2010
Subject: [scyld-users] Re: Scyld system mysteriously locks up
Message-ID:

Hi,

We experienced the same sort of thing until we added a "sleep 5" or
"sleep 10" automatically between the dispatching of jobs. Perhaps you
could try that?

KM

-----Original Message-----
From: scyld-users-admin@scyld.com [mailto:scyld-users-admin@scyld.com]
On Behalf Of Tim Whitcomb
Sent: Thursday, March 18, 2004 12:36 PM
To: scyld-users@scyld.com
Subject: [scyld-users] Re: Scyld system mysteriously locks up

> I purchased a 4 node, 8 processor Scyld (version 28) cluster
> approximately 6 months ago. About 5 days ago, it started mysteriously
> locking up on me. Once it is locked up, I can't do anything except
> physically reboot the machine.
> Unfortunately, I am rather new to Linux clusters and, since it worked
> "right out of the box", I have had no experience in troubleshooting.
> Can someone give me an idea of where I should start?
> I have the BIOS on all machines set to do a full memory check on startup
> and the /var/log/messages file shows nothing.
> Thanks,
> Eric

This sounds suspiciously like a problem we've been fighting for the
past year at least. Are the machines actively running a job when they
lock up, or are they sitting idle? I've done some tests that seem to
suggest that our system does not like the same job being run on both
processors of the same machine.

Where did you purchase your equipment from, what kind of processors
are in it, what kind of interconnect are you using, and what is the
motherboard in the machines?

TRW

Timothy R. Whitcomb
===================
Meteorologist
Applied Physics Lab
University of Washington
mail: twhitcomb at apl dot washington dot edu
voice: (206) 543-2663

_______________________________________________
Scyld-users mailing list, Scyld-users@beowulf.org
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/scyld-users
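
The workaround Kristen describes above, pausing between job dispatches,
could be sketched roughly as follows. This is only a minimal
illustration in plain sh, not a Scyld tool: the function name
dispatch_with_delay and the job commands passed to it are hypothetical,
and whether a 5 or 10 second pause avoids the lockups on any given
cluster is not something this sketch can confirm.

```shell
#!/bin/sh
# Hypothetical helper (not part of Scyld): start each command in the
# background, sleeping DELAY seconds between consecutive launches.
# Usage: dispatch_with_delay DELAY "command 1" "command 2" ...
dispatch_with_delay() {
    delay=$1
    shift
    first=1
    for job in "$@"; do
        # Pause before every launch except the first one.
        if [ "$first" -eq 0 ]; then
            sleep "$delay"
        fi
        first=0
        sh -c "$job" &
        echo "dispatched: $job"
    done
    # Block until all background jobs have exited.
    wait
}
```

For example, dispatch_with_delay 5 ./job1.sh ./job2.sh would start the
two (hypothetical) scripts five seconds apart and then wait for both to
finish.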