From thehotdoggod at hotmail.com  Mon Feb  9 13:11:09 2004
From: thehotdoggod at hotmail.com (Chris Brake)
Date: Tue Nov  9 01:14:28 2010
Subject: [scyld-users] hello
Message-ID:

Hello, my name is Chris. I'm a student in an information technology
program at UCCB in Nova Scotia, Canada. I'm very interested in Beowulf
technology, and as an assignment for class I was asked to post a question
to a listserv; I've chosen this one. I was wondering what, at this point,
is the most stable and flexible version of the Beowulf software. Any
response is welcome. Thank you.

From becker at scyld.com  Mon Feb  9 17:14:02 2004
From: becker at scyld.com (Donald Becker)
Date: Tue Nov  9 01:14:28 2010
Subject: [scyld-users] hello
In-Reply-To:
Message-ID:

On Mon, 9 Feb 2004, Chris Brake wrote:

> my name is Chris. I'm a student in an information technology program at
> UCCB in Nova Scotia, Canada. I'm very interested in Beowulf technology,
> and as an assignment for class I was asked to post a question to a
> listserv; I've chosen this one. I was wondering what, at this point, is
> the most stable and flexible version of the Beowulf software.

"Stable" and "flexible" are very subjective terms.

Most Beowulf distributions are based on an underlying Linux system, and
thus their stability is roughly comparable. Linux can be a very stable
system, running for years without rebooting. However, few Beowulf
distributions besides Scyld are tested before release, and thus Scyld
Beowulf distributions are more likely to be stable than other cluster
systems.

A clear example came during the first year of the Linux 2.4 kernel.
Scyld's commercial release continued to use the Linux 2.2 kernel, while
other cluster distributions shipped 2.4 kernels with the claim that
"newer is better". Scyld's first commercial release with the 2.4 kernel
used 2.4.17, the first 2.4 kernel in which the VM subsystem was stable
for large-memory jobs. It is now widely acknowledged that earlier 2.4
kernels were suitable only for light workloads.

The Scyld Beowulf distribution is also more flexible than other Beowulf
distributions, by many criteria. It takes only a single configuration
change to build disk or diskless nodes, with no change to the
administrative model. It is easy to create specialized single-purpose
nodes, or to leave all machines as general-purpose compute nodes. There
are so many other examples of flexibility that the term really needs to
be narrowed.

-- 
Donald Becker                           becker@scyld.com
Scyld Computing Corporation             http://www.scyld.com
914 Bay Ridge Road, Suite 220           Scyld Beowulf cluster systems
Annapolis MD 21403                      410-990-9993

From becker at scyld.com  Mon Feb  9 19:19:01 2004
From: becker at scyld.com (Donald Becker)
Date: Tue Nov  9 01:14:28 2010
Subject: [scyld-users] Re: [Support] Setting up ntpd on compute nodes
In-Reply-To:
Message-ID:

On Fri, 6 Feb 2004, Tony Stocker wrote:

> How do we go about setting up ntpd on our compute nodes, with the ntp
> server being the host node?

What aspect of NTP do you need?

This same topic came up during my meeting with Panasas this past Friday:
if you need time synchronization only for the filesystem, our current
approach will work.

We prefer not to run the standard NTP daemon, or any daemon, on compute
nodes.
Running daemons on compute nodes results in unpredictable scheduling.
This becomes a significant issue with lock-step computation and larger
node counts, as the slowest node sets the step rate.

Instead, Scyld provides 'bdate', which explicitly sets the time
(settimeofday(), including microseconds) on compute nodes from the
master's clock. This is called at node boot time, and optionally
periodically with 'cron'. In both cases it follows the Scyld approach of
cluster operation being controlled by a master machine, rather than
compute nodes having independent operations or relying on distributed,
persistent configuration files.

If the exact behavior of 'ntp' is required, it's simple to configure
'ntpd' to start automatically on node boot. Create a start-up script
   /etc/beowulf/init.d/ntp
that calls
   bpsh -n $NODE /usr/sbin/ntpd -m -g
(or the appropriate options for your needs).

Please let us know what your time sync requirements are -- we can likely
provide the functionality you need efficiently, but we are reluctant to
include the 'ntpd' approach in our default node configuration. It is more
intrusive, complex, and configuration-intensive than is needed for a
tightly coupled cluster.

-- 
Donald Becker                           becker@scyld.com
Scyld Computing Corporation             http://www.scyld.com
914 Bay Ridge Road, Suite 220           Scyld Beowulf cluster systems
Annapolis MD 21403                      410-990-9993
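[A minimal sketch of the start-up script described above. It assumes, as
the message implies, that scripts under /etc/beowulf/init.d are run on
the master as each compute node boots, with $NODE set to that node's
number; check your Scyld release's documentation before relying on this.]

   #!/bin/sh
   # Hypothetical /etc/beowulf/init.d/ntp: start ntpd on the booting
   # compute node from the master, per the suggestion in the message
   # above. The ntpd options are the ones given there; adjust them for
   # your site's needs.
   bpsh -n $NODE /usr/sbin/ntpd -m -g

[Remember to make the script executable, e.g. chmod +x
/etc/beowulf/init.d/ntp, so the node start-up sequence will run it.]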
From astocker at tsdis02.nascom.nasa.gov  Tue Feb 10 12:10:01 2004
From: astocker at tsdis02.nascom.nasa.gov (Tony Stocker)
Date: Tue Nov  9 01:14:28 2010
Subject: [scyld-users] Re: [Support] Setting up ntpd on compute nodes
In-Reply-To:
Message-ID:

Don,

Well, the main reason behind this is the odd date/time stamps that I'm
seeing on processes on the compute nodes. For instance, here is the
output of three commands: an uptime, a date, and a ps -ef. Note that the
compute node's time (via the date command) is about 18 seconds off the
master node's time. Also note that it has a process (the ps -ef) that
started running 10 DAYS in the FUTURE!

# bpsh 3 uptime; bpsh 3 date; bpsh 3 ps -ef
  2:48pm  up 14 days, 19:42,  0 users,  load average: 0.00, 0.00, 0.00
Tue Feb 10 14:48:36 UTC 2004
UID        PID  PPID  C STIME TTY          TIME CMD
root     15212 15211  0 Feb06 ?        00:12:20 /usr/bin/sendstats 3
root     15265     1  0 Jan26 ?        00:00:01 syslogd -m 0
root     27389 27388  0 Feb20 ?        00:00:00 ps -ef
root@hrunting.gsfc.nasa.gov (bash) Tue Feb 10 14:48:18 /root #

This behavior varies by node. For instance, here is the same set of
commands run on node 8. Notice that in this case the ps command is only
running 8 hours and 17 minutes in the future, even though the node's
clock (via date) appears to be almost a minute (49 seconds) faster than
the host node's:

# bpsh 8 uptime; bpsh 8 date; bpsh 8 ps -ef
  2:52pm  up 24 days, 16:49,  0 users,  load average: 0.00, 0.00, 0.00
Tue Feb 10 14:52:38 UTC 2004
UID        PID  PPID  C STIME TTY          TIME CMD
root     12075 12073  0 Jan17 ?        00:24:55 /usr/bin/sendstats 8
root     12140     1  0 Jan16 ?        00:00:02 syslogd -m 0
root     27408 27407  0 23:09 ?        00:00:00 ps -ef
root@hrunting.gsfc.nasa.gov (bash) Tue Feb 10 14:51:49

I don't care whether we use ntpd or bdate via cron, so long as the deltas
in time are eliminated. I'm also concerned about the future STIMEs
listed, since this has caused some confusion when diagnosing issues --
not to mention the fact that it's disconcerting to see something so
obviously wrong.

Tony

+-----------------------------------+
| Tony Stocker                      |
| Systems Administrator             |
| TSDIS/TRMM Code 902               |
| 301-614-5738 (office)             |
| 301-614-5269 (fax)                |
| Anton.K.Stocker.1@gsfc.nasa.gov   |
+-----------------------------------+

On Mon, 9 Feb 2004, Donald Becker wrote:

> On Fri, 6 Feb 2004, Tony Stocker wrote:
>
> > How do we go about setting up ntpd on our compute nodes, with the ntp
> > server being the host node?
>
> What aspect of NTP do you need?
>
> This same topic came up during my meeting with Panasas this past Friday:
> if you need time synchronization only for the filesystem, our current
> approach will work.
>
> We prefer not to run the standard NTP daemon, or any daemon, on compute
> nodes. Running daemons on compute nodes results in unpredictable
> scheduling. This becomes a significant issue with lock-step computation
> and larger node counts, as the slowest node sets the step rate.
>
> Instead, Scyld provides 'bdate', which explicitly sets the time
> (settimeofday(), including microseconds) on compute nodes from the
> master's clock. This is called at node boot time, and optionally
> periodically with 'cron'. In both cases it follows the Scyld approach
> of cluster operation being controlled by a master machine, rather than
> compute nodes having independent operations or relying on distributed,
> persistent configuration files.
>
> If the exact behavior of 'ntp' is required, it's simple to configure
> 'ntpd' to start automatically on node boot. Create a start-up script
>    /etc/beowulf/init.d/ntp
> that calls
>    bpsh -n $NODE /usr/sbin/ntpd -m -g
> (or the appropriate options for your needs).
>
> Please let us know what your time sync requirements are -- we can likely
> provide the functionality you need efficiently, but we are reluctant to
> include the 'ntpd' approach in our default node configuration. It is
> more intrusive, complex, and configuration-intensive than is needed for
> a tightly coupled cluster.
>
> --
> Donald Becker                           becker@scyld.com
> Scyld Computing Corporation             http://www.scyld.com
> 914 Bay Ridge Road, Suite 220           Scyld Beowulf cluster systems
> Annapolis MD 21403                      410-990-9993

From robertjmunro at yahoo.co.uk  Sun Feb 15 07:00:01 2004
From: robertjmunro at yahoo.co.uk (Robert Jamie Munro)
Date: Tue Nov  9 01:14:28 2010
Subject: [scyld-users] An old version of a page on your site
Message-ID: <20040214195946.91867.qmail@web25006.mail.ukl.yahoo.com>

While looking for monte on Google, I came across the following URL:

   http://www.scyld.com/products/beowulf/software/monte.html

which implies that monte doesn't work on the 2.4 kernel. After some more
searching, I found the newer page:

   http://www.scyld.com/software/monte.html

which states that it does. Could you please put a redirect, or at least a
link, on the old pages to point to the new ones?

Thanks,

Robert Munro

=====
Robert (Jamie) Munro
Viva Network Technical Support
Freelance web developer
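[If the scyld.com pages are served by Apache -- an assumption, since the
message does not say what web server is in use -- a single mod_alias rule
attached to the old location would satisfy this request, for example:]

   # Hypothetical Apache configuration: permanently redirect the outdated
   # monte page to its current location.
   Redirect permanent /products/beowulf/software/monte.html http://www.scyld.com/software/monte.html

[A plain link on the old page, as the message also suggests, would work
as well.]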
From jwnash at buffalo.edu  Mon Feb 23 18:37:01 2004
From: jwnash at buffalo.edu (Jim Nash)
Date: Tue Nov  9 01:14:28 2010
Subject: [scyld-users] VMAD_LIB_CLEAR
Message-ID: <5.1.1.6.2.20040217132328.02251ff8@mail.piensaclara.com>

Hi,

I'm looking for a little help, please. When I start the Beowulf process,
the message "VMAD_LIB_CLEAR. function not implemented" appears. I think
it has something to do with the libraries, but I'm not sure.

Thanks in advance,

Jim Nash