From s.hogg@ic.ac.uk Mon, 31 May 1999 16:48:52 -0400
Date: Mon, 31 May 1999 16:48:52 -0400
From: Simon Hogg s.hogg@ic.ac.uk
Subject: Two stupid ip questions;

Just two quick questions for now - I only have access to email at the
moment (well for a couple of days, so ...)

1) what are the reserved ip numbers for the network 'internal' to the
beowulf? (I *thought* it was 192.x.x.x but I just want to check?)

2) what's with these ip addresses with no dots? Is this in an rfc or
something? (or is it straight decimal to hex conversion?) e.g.
http://3626046468/ maps to www.angelfire.com (216.33.20.4)

Thanks.

-- Simon Hogg, Research Assistant,
RCA/V&A Conservation Course,
Victoria and Albert Museum,
London, SW7 2RL, UK
Tel. +44 (0)171 938 8685
Fax. +44 (0)171 938 8661
Mobile: +44 (0)7788 870 550
Email: s.hogg@vam.ac.uk s.hogg@ic.ac.uk

From chris.corney@safeway.com Mon, 31 May 1999 16:58:15 -0400
Date: Mon, 31 May 1999 16:58:15 -0400
From: Chris Corney chris.corney@safeway.com
Subject: seti @ home on clusters?

Isn't that what the Distributed.net client and proxy do? A "normal"
beowulf would be isolated from the rest of the network world, while the
head runs the proxy, fetching keys as needed to distribute to nodes as
they complete their previous key. I think this is a perfect example of a
coarse-grained task. It does not, however, test network performance at
all, as the keys are small and transmission times are minimal compared to
the time it takes to process a key.

chris.corney@safeway.com
Computer Analyst
Canada Safeway Ltd.
The opinions expressed are my own and don't necessarily reflect those of
the company, blah, blah, blah.

Mark E Drummond wrote:

> Felix Rauch wrote:
> >
> > It's easy to use the SETI@home client on the cluster: Just start one
> > process per CPU on each machine...
>
> I was going to do the same for my Distributed.net code cracking efforts.
> I currently employ many workstations and servers but when I build my
> Beowulf in the fall I'll run the rc5des client on it. The only thing is,
> this is not a demonstration of the "network-parallel" features of
> Beowulf processing. Running a copy of the client on each machine has
> nothing to do with the Beowulf itself and could be done on any set of
> workstations.
>
> What would be interesting would be to have Distributed.net/SETI@Home
> clients that were rewritten using PVM/MPI or whatever so that the
> console node would (in the Distributed.net case) grab some keys and
> distribute them among the clients.
>
> --
> _________________________________________________________________
> Mark E Drummond                  Royal Military College of Canada
> drummond-m@rmc.ca                              Computing Services
> Linux Uber Alles                                      perl || die

From billf@inxpress.net Mon, 31 May 1999 18:01:27 -0400
Date: Mon, 31 May 1999 18:01:27 -0400
From: Bill Fredrickson billf@inxpress.net
Subject: SCSI as a network interface

Thanks all for the replies.

Perhaps I should have been a little more specific about my intentions.
I'm looking for a fast, easy way to network a cluster [Beowulf style] of
PCs, each of which already has a SCSI controller in it.
When I was reading through the mail messages I saw what I thought was a
reference to using the SCSI controller as a means of interconnecting the
nodes. So, not being a SCSI expert, I posted the message in hopes that
maybe this might be a possible way of doing it. I was hoping to avoid the
additional cost of NIC cards, switches, etc.

Any thoughts and/or suggestions would be most appreciated.

Thanks in advance.

Bill

From morrone@wen.capsl.udel.edu Mon, 31 May 1999 18:54:30 -0400
Date: Mon, 31 May 1999 18:54:30 -0400
From: Christopher J. Morrone morrone@wen.capsl.udel.edu
Subject: Two stupid ip questions;

On Mon, 31 May 1999, Simon Hogg wrote:

> Just two quick questions for now - I only have access to email at the
> moment (well for a couple of days, so ...)
>
> 1) what are the reserved ip numbers for the network 'internal' to the
> beowulf? (I *thought* it was 192.x.x.x but I just want to check?)

The 192 one is 192.168.0.0, and is the most common. Other possibilities
are 10.0.0.0 and 172.16.0.0.

> 2) what's with these ip addresses with no dots? Is this in an rfc or
> something? (or is it straight decimal to hex conversion?) e.g.
> http://3626046468/ maps to www.angelfire.com (216.33.20.4)

Looks like it is the decimal representation of the address, but I'm just
guessing...

From mdavis@kieser.net Mon, 31 May 1999 19:25:27 -0400
Date: Mon, 31 May 1999 19:25:27 -0400
From: mdavis@kieser.net mdavis@kieser.net
Subject: Two stupid ip questions;

Hi,

As far as I know, it's 10.x.x.x; 192.x.x.x 224.x.x.x ; I could be wrong
about the 224.

Anyway, as for the second question, I haven't really seen it being used
that way, except in textbooks! Basically an IP address is divided into 4
numbers, separated by the dots... if you convert these numbers to hex
individually, concatenate them and then convert the result back from hex,
you'll get your number! Obviously you can reverse this to get back to the
original IP address.
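
The hex-and-concatenate conversion described above can be sketched in a
few lines. This is only an illustration, not anything posted to the list:
Python is an arbitrary choice, and the function names dotted_to_decimal
and decimal_to_dotted are made up for the example.

```python
def dotted_to_decimal(address):
    """Convert a dotted-quad IPv4 address to its single-number form."""
    octets = [int(part) for part in address.split(".")]
    if len(octets) != 4 or any(not 0 <= o <= 255 for o in octets):
        raise ValueError("not a dotted-quad IPv4 address: %r" % address)
    # Two hex digits per octet, concatenated, then read back as one number.
    hex_string = "".join("%02X" % o for o in octets)
    return int(hex_string, 16)


def decimal_to_dotted(number):
    """Reverse the conversion: one 32-bit number back to a dotted quad."""
    if not 0 <= number <= 0xFFFFFFFF:
        raise ValueError("not a 32-bit value: %r" % number)
    hex_string = "%08X" % number
    # Split the 8 hex digits into 4 pairs and convert each pair to decimal.
    octets = [int(hex_string[i:i + 2], 16) for i in range(0, 8, 2)]
    return ".".join(str(o) for o in octets)


if __name__ == "__main__":
    print(dotted_to_decimal("216.33.20.4"))
    print(decimal_to_dotted(3626046468))
```

Going through a hex string mirrors the calculator method; shifting each
octet left by 8 bits and adding would give the same number.
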
So, 216.33.20.4 = D8 21 14 04 (in hex) = 3626046468

I used the windoze calculator.

Mike Davis
mdavis@kieser.net

Date sent: Mon, 31 May 1999 21:47:15 +0100
To: beowulf@beowulf.gsfc.nasa.gov
From: Simon Hogg
Subject: Two stupid ip questions;

> Just two quick questions for now - I only have access to email at the
> moment (well for a couple of days, so ...)
>
> 1) what are the reserved ip numbers for the network 'internal' to the
> beowulf? (I *thought* it was 192.x.x.x but I just want to check?)
>
> 2) what's with these ip addresses with no dots? Is this in an rfc or
> something? (or is it straight decimal to hex conversion?) e.g.
> http://3626046468/ maps to www.angelfire.com (216.33.20.4)
>
> Thanks.
>
>
>
> -- Simon Hogg, Research Assistant,
> RCA/V&A Conservation Course,
> Victoria and Albert Museum,
> London, SW7 2RL, UK
> Tel. +44 (0)171 938 8685
> Fax. +44 (0)171 938 8661
> Mobile: +44 (0)7788 870 550
> Email: s.hogg@vam.ac.uk s.hogg@ic.ac.uk

From admin@cersa.admu.edu.ph Mon, 31 May 1999 21:45:51 -0400
Date: Mon, 31 May 1999 21:45:51 -0400
From: William Emmanuel S. Yu admin@cersa.admu.edu.ph
Subject: were can i find pvmpovray samples

do you know where i can find pvmpovray samples?

i would like to try out some for the demo. i also have a problem with
access because i only have email access for now. if anyone is kind enough
to email the samples to me i would highly appreciate it.

william.s.yu@ieee.org

From rob@varesearch.com Mon, 31 May 1999 22:47:02 -0400
Date: Mon, 31 May 1999 22:47:02 -0400
From: Rob Walker rob@varesearch.com
Subject: were can i find pvmpovray samples

there are a ton of povray sample traces which come with the povray
sources. look for an examples directory, iirc.

also, you will love to look at the stuff found at the internet ray
tracing competition.

rob

>>>>> On Tue, 1 Jun 1999 08:38:49 +0800 (PHT), "William Emmanuel S. Yu" said:

William> do you know where i can find pvmpovray samples?

William> i would like to try out some for the demo.
i also have a problem with
William> access because i only have email access for now. if anyone is kind enough
William> to email the samples to me i would highly appreciate it.

William> william.s.yu@ieee.org

--
Cisco's Internetwork Operating System (IOS) technology, which is used
by over 20 companies in over 50 different products, is the de facto
worldwide standard for data transmission across both public and
private networks. -cisco Marketing, 1994

From sct@lanl.gov Mon, 31 May 1999 22:57:27 -0400
Date: Mon, 31 May 1999 22:57:27 -0400
From: sct sct@lanl.gov
Subject: I thought this was an extreme linux list

Nate Downes wrote:

> William Emmanuel S. Yu wrote:
> >
> > > > How possible would it be to set up a Linux hypercube over a firewire
> > > > network?
> > >
> > i am a newbie when it comes to other networking protocol aside from atm
> > and ethernet. i thought that firewire was ieee640 or something that is the
> > bus for the gameboy. are there nics that support firewire?
>
> No, Firewire is IEEE1394, and it was originally designed as a
> replacement for SCSI by Apple. It, like SCSI, can be used as a network
> interface. And no, the gameboy isn't firewire.
>
> No NIC cards support IEEE1394, since it is designed to be built into a
> motherboard.
>
>

I wouldn't call SCSI a network interface. It's really a channel (I/O)
interface. Calling it a network interface implies that there are SCSI
switches and a more robust addressing scheme instead of the 7 addresses
that SCSI supports. Network interfaces are such things as Ethernet, FDDI,
ATM, etc.

From ulairi@ecs.csun.edu Mon, 31 May 1999 22:57:39 -0400
Date: Mon, 31 May 1999 22:57:39 -0400
From: Ulairi ulairi@ecs.csun.edu
Subject: Two stupid ip questions;

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

1: Check out RFC 1918
2: It's what is known as the LONG IP notation; your browser can accept it.

|
| Just two quick questions for now - I only have access to email at the
| moment (well for a couple of days, so ...)
|
| 1) what are the reserved ip numbers for the network 'internal' to the
| beowulf? (I *thought* it was 192.x.x.x but I just want to check?)
|
| 2) what's with these ip addresses with no dots? Is this in an rfc or
| something? (or is it straight decimal to hex conversion?) e.g.
| http://3626046468/ maps to www.angelfire.com (216.33.20.4)
|

-----BEGIN PGP SIGNATURE-----
Version: PGPfreeware 6.0.2i

iQA/AwUBN1NLYFR8Yh25VFLEEQLSgwCfakNBktSDY2Qm0ByG7VKIsXMUkH0AnRbe
r75E99ft+jftZXtK6z0gyRzb
=6rku
-----END PGP SIGNATURE-----

From bobby@fapenet.org Mon, 31 May 1999 23:21:23 -0400
Date: Mon, 31 May 1999 23:21:23 -0400
From: Robert S. Raagas bobby@fapenet.org
Subject: Beowulf w/o Extreme Linux CD from Redhat

Hi,

Would it be possible to create a Beowulf Cluster without the separate
package of Extreme Linux from Redhat, using only Redhat 5.2 or 6.0? :) I am
new to Linux and Beowulf clusters, but i have a Linux box at the office
and want to have a demo Beowulf Cluster; i've been reading a lot of
websites about Beowulf so far. Btw, i'm from the Philippines, a province
away from Manila (our Capital, which will hold our first Linux Conference
this coming June 9-11), coz it's hard to find an Extreme Linux CD, i only
have the RH CD Distribution 5.2 and 6.0 which i have downloaded from the
net.

Thanks.

Bobby

From casioqv@mail.geocities.com Mon, 31 May 1999 23:30:27 -0400
Date: Mon, 31 May 1999 23:30:27 -0400
From: Casioqv casioqv@mail.geocities.com
Subject: standard apps in paralell

If I run a non-beowulf optimized program on one computer on a cluster I
know that it will not be run across the entire cluster and that it will
only run on one node. Will it always run on the node it was loaded on?
Can I still take advantage of a cluster if I have 8 nodes and a programs
open?

From ronelson@vt.edu Mon, 31 May 1999 23:31:23 -0400
Date: Mon, 31 May 1999 23:31:23 -0400
From: Rob Nelson ronelson@vt.edu
Subject: Two stupid ip questions;

>> 2) what's with these ip addresses with no dots?
Is this in an rfc or
>> something? (or is it straight decimal to hex conversion?) e.g.
>> http://3626046468/ maps to www.angelfire.com (216.33.20.4)

Good thing I learned about radix stuff in college... 216.33.20.4 is a
number in radix 256, if I have my terminology right. The periods are
separators between the digits. Take each digit, starting at the right, and
multiply it by 256^(number of digits from the right), starting with 0.
So, you get

216 * 256^3 + 33 * 256^2 + 20 * 256^1 + 4 * 256^0 = 3626046468

Why, you ask, would anyone give their IP like that? Certain versions of
IE4 would treat any IP given in decimal form as being in the intranet
zone, therefore bypassing most security.

Rob Nelson
ronelson@vt.edu

From admin@cersa.admu.edu.ph Tue, 1 Jun 1999 01:56:33 -0400
Date: Tue, 1 Jun 1999 01:56:33 -0400
From: William Emmanuel S. Yu admin@cersa.admu.edu.ph
Subject: were can i find pvmpovray samples

On Mon, 31 May 1999, Rob Walker wrote:

>
> there are a ton of povray sample traces which come with the povray
> sources. look for an examples directory, iirc.
>

in my distribution there is no examples directory.

> also, you will love to look at the stuff found at the internet ray
> tracing competition.
>

i have no web access right now so could you just mail me a sample? one
that is cool in your opinion.

william.s.yu@ieee.org

> rob
>
> >>>>> On Tue, 1 Jun 1999 08:38:49 +0800 (PHT), "William Emmanuel S. Yu" said:
>
> William> do you know where i can find pvmpovray samples?
>
> William> i would like to try out some for the demo. i also have a problem with
> William> access because i only have email access for now. if anyone is kind enough
> William> to email the samples to me i would highly appreciate it.
>
> William> william.s.yu@ieee.org
>
>
> --
> Cisco's Internetwork Operating System (IOS) technology, which is used
> by over 20 companies in over 50 different products, is the de facto
> worldwide standard for data transmission across both public and
> private networks.
-cisco Marketing, 1994
>

From rauch@inf.ethz.ch Tue, 1 Jun 1999 02:48:54 -0400
Date: Tue, 1 Jun 1999 02:48:54 -0400
From: Felix Rauch rauch@inf.ethz.ch
Subject: Two stupid ip questions;

On Mon, 31 May 1999 mdavis@kieser.net wrote:

> As far as I know, it's 10.x.x.x; 192.x.x.x 224.x.x.x ; I could be
> wrong about the 224.

224.x.x.x are multicast addresses, AFAIK.

- Felix

--
Felix Rauch                     | Email: rauch@inf.ethz.ch
Institute for Computer Systems  | Homepage: http://www.cs.inf.ethz.ch/~rauch/
ETH Zentrum / RZ H15            | Phone: ++41 1 632 7489
CH - 8092 Zuerich / Switzerland | Fax: ++41 1 632 1307

From revans@e-z.net Tue, 1 Jun 1999 03:46:36 -0400
Date: Tue, 1 Jun 1999 03:46:36 -0400
From: Russell Evans revans@e-z.net
Subject: Beowulf w/o Extreme Linux CD from Redhat

You may wish to look at the SuSE distribution.
ftp.suse.com/pub/suse/i386/6.1/suse/beo1/ contains the Beowulf packages
that are part of the standard 6.1 distribution.

Thank you
Russell

> coz it's hard to find an Extreme Linux CD, i only have the RH CD
> Distribution 5.2 and 6.0 which i have downloaded from the net.
>
> Thanks.
>
> Bobby
>
>

From admin@cersa.admu.edu.ph Tue, 1 Jun 1999 03:51:10 -0400
Date: Tue, 1 Jun 1999 03:51:10 -0400
From: William Emmanuel S. Yu admin@cersa.admu.edu.ph
Subject: can someone email me six more povray files for the demo

can someone email me the povray files for the beowulf demo? please.. i
need them and i do not have web access. please. and if you guys have some
urls can you include them as well. tnx for all your help.

william.s.yu@ieee.org

From JesseP@europe.stortek.com Tue, 1 Jun 1999 04:51:53 -0400
Date: Tue, 1 Jun 1999 04:51:53 -0400
From: Jessen, Per JesseP@europe.stortek.com
Subject: Beowulf w/o Extreme Linux CD from Redhat

> -----Original Message-----
> From: Robert S. Raagas [mailto:bobby@fapenet.org]
> Sent: 13 April 1999 04:33
[snip]
> Would it be possible to create a Beowulf Cluster without the separate
> package of Extreme Linux from Redhat, using only Redhat 5.2
> or 6.0?
:) I am

Certainly. In fact, don't use the Extreme-Linux software. There is little
or no need for it. If you do need some of, e.g., the kernel extensions it
provides, it is far better to get the most recent versions off the 'net.

> new to Linux and Beowulf clusters, but i have a Linux box at the office
> and want to have a demo Beowulf Cluster, i've been reading a lot of websites
> about Beowulf so far. Btw, i'm from the Philippines, a province away from
> Manila (our Capital, which will hold our first Linux Conference this coming
> June 9-11), coz it's hard to find an Extreme Linux CD, i only have the RH CD
> Distribution 5.2 and 6.0 which i have downloaded from the net.

Good luck with your Beowulf - what you have should do just fine.

regards,
Per Jessen, ENIDAN Technologies, London

From a.saha@acm.org Tue, 1 Jun 1999 05:21:41 -0400
Date: Tue, 1 Jun 1999 05:21:41 -0400
From: a.saha@acm.org a.saha@acm.org
Subject: Two stupid ip questions;

Felix Rauch writes:
> On Mon, 31 May 1999 mdavis@kieser.net wrote:
> > As far as I know, it's 10.x.x.x; 192.x.x.x 224.x.x.x ; I could be
> > wrong about the 224.
>
> 224.x.x.x are multicast addresses, AFAIK.
>
> - Felix

and it is 192.168.x.x, since all other 192.x.x.x addresses are valid IPs.
Also, 172.16.x.x ought to be included in the non-routable IP list.

-amlan

--
Amlan Saha as@cwc.nus.edu.sg
Mobile Computing and Protocols Group
Center for Wireless Comms, 20 Science Park Rd, Singapore 117674
+65.8709.265 (Tel) +65.779.5441 (Fax)
**I speak only for myself**

From JesseP@europe.stortek.com Tue, 1 Jun 1999 05:52:20 -0400
Date: Tue, 1 Jun 1999 05:52:20 -0400
From: Jessen, Per JesseP@europe.stortek.com
Subject: two books with the same ISBN# ? (was RE: Second Puzzle Piece?
)

> -----Original Message-----
> From: Putchong Uthayopas [mailto:uthayopa@mcs.anl.gov]
> Sent: 28 May 1999 16:28
[snip]
> BARRY WILKINSON AND MICHAEL ALLEN, Parallel
> Programming Techniques and Applications Using
> Networked Workstations and Parallel Computers, Prentice
> Hall, Upper Saddle River, 1999, ISBN 0-13-671710-1.
>
[snip]

Hmmm, I just did a search at www.waterstones.co.uk, and found a title by
the same authors called 'Parallel Computing' - with the quoted ISBN
number. At Waterstones, it is listed as having been published in Aug 1998.
At amazon.co.uk it says Sep 1998, but the same ISBN number. If we are
talking about a NEW book, but with a misquoted ISBN, I'd be interested. A
search on the title as quoted returns nothing.

--- pause

Hmmm, at amazon.com it is listed with the title as above, but still
published Aug 1998. Same ISBN.

Ah, and if you're now thinking of buying it, you might want to know that
amazon.com is asking USD52 while amazon.co.uk is merely asking GBP19.95.

Apparently the same book, but with two different titles. Odd.

regards,
Per Jessen, ENIDAN Technologies, London

From s.hogg@ic.ac.uk Tue, 1 Jun 1999 06:58:08 -0400
Date: Tue, 1 Jun 1999 06:58:08 -0400
From: Simon Hogg s.hogg@ic.ac.uk
Subject: Two stupid ip questions;

>Why, you ask, would anyone give their IP like that? Certain versions of IE4
>would treat any IP given in decimal form as being in the intranet zone,
>therefore bypassing most security.

That explains why the only place I've seen it is in spam messages.

-- Simon Hogg, Research Assistant,
RCA/V&A Conservation Course,
Victoria and Albert Museum,
London, SW7 2RL, UK
Tel. +44 (0)171 938 8685
Fax. +44 (0)171 938 8661
Mobile: +44 (0)7808 587 647
Email: s.hogg@vam.ac.uk s.hogg@ic.ac.uk

From deadline@plogic.com Tue, 1 Jun 1999 07:43:34 -0400
Date: Tue, 1 Jun 1999 07:43:34 -0400
From: Douglas Eadline deadline@plogic.com
Subject: Beowulf mailing list FAQ v2

Can someone tell me the URL of the FAQ?

I just searched my email records and cannot find it.

Thanks

Doug

-------------------------------------------------------------------
Paralogic, Inc.         | PEAK        | Voice:+610.861.6960
115 Research Drive      | PARALLEL    | Fax:+610.861.8247
Bethlehem, PA 18017 USA | PERFORMANCE | http://www.plogic.com
-------------------------------------------------------------------

From alvin@iplink.net Tue, 1 Jun 1999 07:44:44 -0400
Date: Tue, 1 Jun 1999 07:44:44 -0400
From: Alvin Starr alvin@iplink.net
Subject: I thought this was an extreme linux list

On Mon, 31 May 1999, sct wrote:

> I wouldn't call SCSI a network interface. It's really a channel (I/O)
> interface. Calling it a network interface implies that there are SCSI switches
> and a more robust addressing scheme instead of the 7 addresses that SCSI
> supports. Network interfaces are such things as Ethernet, FDDI, ATM, etc.

A network can be thought of as more than one computer connected together
over some medium. That medium could be SCSI. The newer versions of SCSI
now support 16 devices, and there are some devices that will break LUNs
into separate SCSI addresses, so there is to some extent the equivalent of
switches. At 80 Mbytes/sec, SCSI can make for a fast link between a small
number of systems, and with a low-overhead protocol it could help solve
some of the problems involved in trying to share memory across a network.

Alvin Starr            || voice: (416)585-9971
Interlink Connectivity || fax: (416)585-9974
alvin@iplink.net       ||

From ajl4@eecs.lehigh.edu Tue, 1 Jun 1999 08:41:29 -0400
Date: Tue, 1 Jun 1999 08:41:29 -0400
From: Adam Lazur ajl4@eecs.lehigh.edu
Subject: standard apps in paralell

Casioqv (casioqv@mail.geocities.com) said:
> If I run a non-beowulf optimized program on one computer on a cluster I
> know that it will not be run across the entire cluster and that it will
> only run on one node. Will it always run on the node it was loaded on?
> Can I still take advantage of a cluster if I have 8 nodes and a programs
> open?

You could take a look at MOSIX.
It is a set of kernel mods and utilities that will transparently balance
the load across all of the nodes on your cluster. Check out more info
about MOSIX at http://www.cs.huji.ac.il/mosix/

HTH

.adam

--
Adam Lazur - Computer Engineering Undergrad - Lehigh University
icq# 3354423 - http://www.lehigh.edu/~ajl4
"Samba is a huge win for these beleaguered techies; it enables
open-source fans to stealth their Linux boxes so they look like Microsoft
servers that somehow miraculously fail to suck."

From deadline@plogic.com Tue, 1 Jun 1999 09:24:03 -0400
Date: Tue, 1 Jun 1999 09:24:03 -0400
From: Douglas Eadline deadline@plogic.com
Subject: Beowulf w/o Extreme Linux CD from Redhat

On Tue, 13 Apr 1999, Robert S. Raagas wrote:

> Hi,
>
> Would it be possible to create a Beowulf Cluster without the separate
> package of Extreme Linux from Redhat, using only Redhat 5.2 or 6.0? :) I am

Yes. DO NOT USE THE EXTREME LINUX CD!

> new to Linux and Beowulf clusters, but i have a Linux box at the office
> and want to have a demo Beowulf Cluster, i've been reading a lot of websites
> about Beowulf so far. Btw, i'm from the Philippines, a province away from
> Manila (our Capital, which will hold our first Linux Conference this coming
> June 9-11), coz it's hard to find an Extreme Linux CD, i only have the RH CD
> Distribution 5.2 and 6.0 which i have downloaded from the net.

See:

http://www.xtreme-machines.com/x-cluster-qs.html
http://pobox.com/~kragen/beowulf-faq.txt

Doug

-------------------------------------------------------------------
Paralogic, Inc.
| PEAK | Voice:+610.861.6960
115 Research Drive | PARALLEL | Fax:+610.861.8247
Bethlehem, PA 18017 USA | PERFORMANCE | http://www.plogic.com
-------------------------------------------------------------------

From Christopher.Bohn@sn.wpafb.af.mil Tue, 1 Jun 1999 09:23:44 -0400
Date: Tue, 1 Jun 1999 09:23:44 -0400
From: Bohn Christopher A Capt AFRL/IFSD Christopher.Bohn@sn.wpafb.af.mil
Subject: Cox report on Chinese spy activities and Beowulf

> off-topic, so I'll make it short - I am under the impression that neither
> Pakistan nor India have a delivery-mechanism for those nuclear bombs ?
> So perhaps they're not quite on the brink of a nuclear exchange. One can
> always hope (or pray, as rgb suggests).

The Indians, at least, have been testing sounding rockets for scientific
research. The leap from sounding rockets to SRBMs is only azimuth.

Take care,
cb

> -----Original Message-----
> From: Jessen, Per [mailto:JesseP@europe.stortek.com]
> Sent: Friday, May 28, 1999 4:04 AM
> To: 'Robert G. Brown'
> Cc: 'beowulf'
> Subject: RE: Cox report on Chinese spy activities and Beowulf
>
>
> > -----Original Message-----
> > From: Robert G. Brown [mailto:rgb@phy.duke.edu]
> > Sent: 28 May 1999 06:16
> [snip]
> >
> > (Speaking of which, let us pray for the Indians and Pakistanis who are
> > about to die in the world's second nuclear war...and who didn't steal
> > any codes or misuse any controlled computers to build the devices that
> > they will use in it).
> > rgb
> > Robert G. Brown
>
> off-topic, so I'll make it short - I am under the impression that neither
> Pakistan nor India have a delivery-mechanism for those nuclear bombs ?
> So perhaps they're not quite on the brink of a nuclear exchange. One can
> always hope (or pray, as rgb suggests).
>
>
> regards,
> Per Jessen
> ENIDAN Technologies, London
>

From roche@ibs.fr Tue, 1 Jun 1999 09:51:53 -0400
Date: Tue, 1 Jun 1999 09:51:53 -0400
From: Olivier Roche roche@ibs.fr
Subject: Superstck 2 switch and channel bonding

Hi,

I want to use channel bonding in my cluster, but I couldn't find in the
user guide how to configure my switch (a 3Com SuperStack II Switch 3300)
for VLANs. Maybe someone who has the same switch as mine can give me the
solution?

Thanks,
Olivier

--
----------------------------------------------------------------
Roche Olivier                        | E-Mail: roche@ibs.fr
Laboratoire de Dynamique Moleculaire | Tel.: +33-4.76.88.99.28
Institut de Biologie Structurale     | Fax: +33-4.76.88.54.94
41, rue Jules Horowitz               | Francais/English
38027 Grenoble Cedex 1, France
----------------------------------------------------------------

From newt@hq.nasa.gov Tue, 1 Jun 1999 10:08:04 -0400
Date: Tue, 1 Jun 1999 10:08:04 -0400
From: Daniel Ridge newt@hq.nasa.gov
Subject: SCSI as a network interface

On Mon, 31 May 1999, Bill Fredrickson wrote:

> Perhaps I should have been a little more specific about my intentions.
> I'm looking for a fast, easy way to network a cluster [Beowulf style] of
> PCs, each of which already has a SCSI controller in it.

> I was hoping to avoid the additional
> cost of NIC cards, switches, etc.

Three observations:

You can't really broadcast on the SCSI bus.

For normal ultra/wide SCSI, the max spec cable length is 1.5m.

Nice SCSI _cables_ can cost more per port than NICs.

Cheers,
DSKR

-------------------------------------+---------------------------------
Daniel Ridge |
Computer Crime Division | N A S A
email: dridge@hq.nasa.gov | Office of Inspector General
tel: 202-358-4308 | 300 E Street SW
fax: 202-358-3439 | Washington, D.C.
20546
NexTel: 301-440-9153 |
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

From uthayopa@mcs.anl.gov Tue, 1 Jun 1999 10:15:44 -0400
Date: Tue, 1 Jun 1999 10:15:44 -0400
From: Putchong Uthayopas uthayopa@mcs.anl.gov
Subject: Beowulf mailing list FAQ v2

I put the FAQ on my web page at http://www.mcs.anl.gov/~uthayopa

On Tue, 1 Jun 1999, Douglas Eadline wrote:

>
> Can someone tell me the URL of the FAQ?
>
> I just searched my email records and cannot find it.
>
> Thanks
>
> Doug
>
> -------------------------------------------------------------------
> Paralogic, Inc.         | PEAK        | Voice:+610.861.6960
> 115 Research Drive      | PARALLEL    | Fax:+610.861.8247
> Bethlehem, PA 18017 USA | PERFORMANCE | http://www.plogic.com
> -------------------------------------------------------------------
>
>

From uthayopa@mcs.anl.gov Tue, 1 Jun 1999 10:19:20 -0400
Date: Tue, 1 Jun 1999 10:19:20 -0400
From: Putchong Uthayopas uthayopa@mcs.anl.gov
Subject: Beowulf w/o Extreme Linux CD from Redhat

Hi,

Kragen has written a good FAQ for all these questions. I put that Beowulf
FAQ V2 on my web page at http://www.mcs.anl.gov/~uthayopa. Please read it;
it will save you a lot of time.

Putchong.

On Tue, 1 Jun 1999, Jessen, Per wrote:

> > -----Original Message-----
> > From: Robert S. Raagas [mailto:bobby@fapenet.org]
> > Sent: 13 April 1999 04:33
> [snip]
> > Would it be possible to create a Beowulf Cluster without the separate
> > package of Extreme Linux from Redhat, using only Redhat 5.2
> > or 6.0? :) I am
>
> Certainly. In fact, don't use the Extreme-Linux software. There is little
> or no need for it.
> If you do need some of, e.g., the kernel extensions it provides, it is
> far better to get the most recent versions off the 'net.
>
> > new to Linux and Beowulf clusters, but i have a Linux box at the office
> > and want to have a demo Beowulf Cluster, i've been reading a lot of websites
> > about Beowulf so far.
Btw, i'm from the Philippines, a province away from
> > Manila (our Capital, which will hold our first Linux Conference this coming
> > June 9-11), coz it's hard to find an Extreme Linux CD, i only have the RH CD
> > Distribution 5.2 and 6.0 which i have downloaded from the net.
>
> Good luck with your Beowulf - what you have should do just fine.
>
>
> regards,
> Per Jessen, ENIDAN Technologies, London
>

From cworley@altatech.com Tue, 1 Jun 1999 10:43:13 -0400
Date: Tue, 1 Jun 1999 10:43:13 -0400
From: Chris Worley cworley@altatech.com
Subject: Hard Disk or not Hard Disk

Benjamin Forgeau wrote:

> I'm wondering if it's reasonable to build a Linux Cluster without
> HardDisk but with NFS. Does anyone have ideas about it??
>

I find diskless booting:

1) is the only way to debug device drivers (kernel crashes don't lead to
   corrupt disks).

2) makes it easier to keep a cluster homogeneous (if all nodes share the
   same root, with the exception of etc, var, dev, and tmp, which I keep
   as ram disks).

3) makes programs load a lot slower (but if everything you're doing runs
   as daemons, then only your startup time is affected).

4) is less prone to disk failure (if your disks have a 2-year MTBF and
   you're running 12 nodes with a disk each, you can expect a disk
   failure somewhere in the cluster about every 2 months).

5) is the quickest way to bring new hardware on line (a new node doesn't
   need to be loaded with software if it boots disklessly).

So, yes, it's very reasonable. But most don't.

Chris

From Scott_Palmer@Dell.com Tue, 1 Jun 1999 11:52:08 -0400
Date: Tue, 1 Jun 1999 11:52:08 -0400
From: Scott_Palmer@Dell.com Scott_Palmer@Dell.com
Subject: SCSI as a network interface

The pain of developing such a solution, were it even possible, can't be
worth the money unless you really have a personal interest in making this
work. There are some SAN-type ideas already out there that might be able
to be modified, but, AFAIK, not without some serious mods.

Cheap 10/100 NICs can be found for $30, and 100Mbps switches are running
around $50/port.
Cat 5 cabling is cheap; assume $5 per box covers the cable and
connectors, and you're running well under $100 per node. Good SCSI cable
and active terminators won't be cheap, and I wouldn't skimp on that part,
esp. considering the likely cable length that you're going to have. Don't
forget that SCSI stands for (S)ystem (C)an't (S)ee (I)t; imagine the joy
of troubleshooting this system with a bad terminator....!

Just my two cents.

- Scott

-----Original Message-----
From: Bill Fredrickson [mailto:billf@inxpress.net]
Sent: Monday, May 31, 1999 5:01 PM
To: extreme-linux@acl.lanl.gov; beowulf@beowulf.gsfc.nasa.gov
Subject: SCSI as a network interface

Thanks all for the replies.

Perhaps I should have been a little more specific about my intentions.
I'm looking for a fast, easy way to network a cluster [Beowulf style] of
PCs, each of which already has a SCSI controller in it. When I was
reading through the mail messages I saw what I thought was a reference
to using the SCSI controller as a means of interconnecting the nodes. So,
not being a SCSI expert, I posted the message in hopes that maybe this
might be a possible way of doing it. I was hoping to avoid the additional
cost of NIC cards, switches, etc.

Any thoughts and/or suggestions would be most appreciated.

Thanks in advance.

Bill

From joelja@darkwing.uoregon.edu Tue, 1 Jun 1999 11:54:25 -0400
Date: Tue, 1 Jun 1999 11:54:25 -0400
From: Joel Jaeggli joelja@darkwing.uoregon.edu
Subject: SCSI as a network interface

On Mon, 31 May 1999, Bill Fredrickson wrote:

> Thanks all for the replies.
>
> Perhaps I should have been a little more specific about my intentions.
> I'm looking for a fast, easy way to network a cluster [Beowulf style] of
> PCs, each of which already has a SCSI controller in it. When I was
> reading through the mail messages I saw what I thought was a reference
> to using the SCSI controller as a means of interconnecting the nodes.
So,
> not being a SCSI expert, I posted the message in hopes that maybe this
> might be a possible way of doing it.
> I was hoping to avoid the additional
> cost of NIC cards, switches, etc.

SCSI controllers, or motherboards with them built in, are more expensive than Fast Ethernet. Cable lengths are a real problem with FW/UW SCSI: 1.5-3 meters at most. A single-CPU motherboard with built-in SCSI is ~$360 (ASUS P2B-S) vs. $120 for a regular P2 board, $30 for a Fast Ethernet card, and ~$50 per port for the switch, and that doesn't include the cost of the SCSI interconnect, which will likely be significant...

> Any thoughts, and/or suggestions would be most appreciated.
>
> Thanks in advance.
>
> Bill
>

--------------------------------------------------------------------------
Joel Jaeggli                               joelja@darkwing.uoregon.edu
Academic User Services                   consult@gladstone.uoregon.edu
     PGP Key Fingerprint: 1DE9 8FCA 51FB 4195 B42A 9C32 A30D 121E
--------------------------------------------------------------------------
It is clear that the arm of criticism cannot replace the criticism of arms.
Karl Marx -- Introduction to the critique of Hegel's Philosophy of the right, 1843.

From rriendeau@netquotient.com Tue, 1 Jun 1999 12:02:12 -0400
Date: Tue, 1 Jun 1999 12:02:12 -0400
From: Richard Riendeau rriendeau@netquotient.com
Subject: SCSI as a network interface

For hooking together 4 or fewer nodes IN ADDITION to Ethernet, it could allow for a much closer relationship between those nodes, e.g. a parallel virtual database. Keeping it part of a network stack also makes the code applicable to other high-bandwidth connections such as ATM or Gigabit Ethernet, instead of having to deal with file-based I/O. If you think about the implementations of a socket versus a file handle, they are very similar.
-Rich Riendeau
Netquotient Consulting Group

-----Original Message-----
From: Daniel Ridge [SMTP:newt@hq.nasa.gov]
Sent: Tuesday, June 01, 1999 10:13 AM
To: Bill Fredrickson
Cc: extreme-linux@acl.lanl.gov; beowulf@beowulf.gsfc.nasa.gov
Subject: Re: SCSI as a network interface

On Mon, 31 May 1999, Bill Fredrickson wrote:
> Perhaps I should have been a little more specific about my intentions.
> I'm looking for a fast, easy way to network a cluster [Beowulf style] of
> PC's each of which already have SCSI controllers in them.
> I was hoping to avoid the additional
> cost of NIC cards, switches, etc.

Three observations:

1) You can't really broadcast on the SCSI bus.
2) For normal ultra/wide SCSI, the max spec cable length is 1.5m.
3) Nice SCSI _cables_ can cost more per port than NICs.

Cheers,
DSKR

-------------------------------------+---------------------------------
Daniel Ridge                         | Computer Crime Division
                                     | N A S A
email: dridge@hq.nasa.gov            | Office of Inspector General
tel: 202-358-4308                    | 300 E Street SW
fax: 202-358-3439                    | Washington, D.C. 20546
NexTel: 301-440-9153                 |
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

From daye@ultramip.com Tue, 1 Jun 1999 13:09:53 -0400
Date: Tue, 1 Jun 1999 13:09:53 -0400
From: Melchior daye@ultramip.com
Subject: SCSI as a network interface

On May 31, 1999 Bill Fredrickson said:
>Thanks all for the replies.
>
>Perhaps I should have been a little more specific about my intentions.
>I'm looking for a fast, easy way to network a cluster [Beowulf style] of
>PC's each of which already have SCSI controllers in them. When I was
>reading through the mail messages I saw what I thought was a reference
>to using the SCSI controller as a means of interconnecting the nodes. So,
>not being a SCSI expert, I posted the message in hopes that maybe this
>might be a possible way of doing it. I was hoping to avoid the additional
>cost of NIC cards, switches, etc.
>
>Any thoughts, and/or suggestions would be most appreciated.
>
>Thanks in advance.
>
>Bill

SCSI is not really suitable for direct network connection. Each SCSI bus needs to be terminated at both ends within the same machine. Given that network cards are cheap nowadays (hubs too), the path of least resistance is to put in some thirty-dollar NE2000 compatibles. If you don't want to spend (as little as) a hundred dollars per hub, you can use 10BASE2 (coaxial network cable and tees). The downside is that your network will only run at 10 Mbit/s, but it's very cheap, and performance will suffice to test many of the concepts in a smaller array. If you decide in the future that you need more bandwidth inside the cluster, the cards can be replaced with 10/100 cards and you only have to throw out a couple of hundred dollars' worth of stuff.

D. D. (Daye) Dancer
daye@ultramip.com

From darie@cab.cnea.gov.ar Tue, 1 Jun 1999 13:43:07 -0400
Date: Tue, 1 Jun 1999 13:43:07 -0400
From: Enzo A. Dari darie@cab.cnea.gov.ar
Subject: SCSI as a network interface

Bill Fredrickson wrote:
> ...
> I'm looking for a fast, easy way to network a cluster [Beowulf style] of
> PC's each of which already have SCSI controllers in them. When I was
> reading through the mail messages I saw what I thought was a reference
> to using the SCSI controller as a means of interconnecting the nodes. So,
> not being a SCSI expert, I posted the message in hopes that maybe this
> might be a possible way of doing it. I was hoping to avoid the additional
> cost of NIC cards, switches, etc.
> ...

Take a look at:
http://www.msoe.edu/~sebern/courses/cs400/team1/final/

--
Regards,                                                 O__
                    Enzo.                                ,>/
========================================================()=\()====
Enzo A.
Dari | Instituto Balseiro / Centro Atomico Bariloche
8400-San Carlos de Bariloche, Argentina | email: darie@cab.cnea.gov.ar
Phone: 54-2944-445208, 54-2944-445100 Fax: 54-2944-445299
Web page: http://cabmec1.cnea.gov.ar/darie/darie.htm

From eugene.leitl@lrz.uni-muenchen.de Tue, 1 Jun 1999 13:46:37 -0400
Date: Tue, 1 Jun 1999 13:46:37 -0400
From: Eugene Leitl eugene.leitl@lrz.uni-muenchen.de
Subject: Two stupid ip questions;

Rob Nelson writes:
> >> 2) what's with these ip addresses with no dots? Is this in an rfc or
> >> something? (or is it straight decimal to hex conversion?) e.g.
> >> http://3626046468/ maps to www.angelfire.com (216.33.20.4)

Any ideas why people decided on a decimal IP address notation

  http://3626046468/
  http://d8211404/

instead of a much more natural hexadecimal one?

And won't it be just horrible typing monstrosities like

  http://1.2.4.8.16.32.64.128.1.2.4.8.16.32.64.128/

in IPv6? Hexadecimal notation would seem definitely more compact here...

  http://d8211404d8211404d8211404d8211404/

Still, quite difficult to memorize, unless you're an idiot savant, that is.

-- Eugene

From alan@groucho.med.jhmi.edu Tue, 1 Jun 1999 14:03:16 -0400
Date: Tue, 1 Jun 1999 14:03:16 -0400
From: Alan Grossfield alan@groucho.med.jhmi.edu
Subject: SCSI as a network interface

:On Mon, 31 May 1999, Bill Fredrickson wrote:
:> PC's each of which already have SCSI controllers in them. When I was
:> reading through the mail messages I saw what I thought was a reference
:> to using the SCSI controller as a means of interconnecting the nodes. So,
:> not being a SCSI expert, I posted the message in hopes that maybe this
:> might be a possible way of doing it.

There was an article about this in _Linux Journal_ last year: "Encapsulating IP using SCSI" by Ben Elliston, August 1998, pp. 60-62. He mentions the existence of RFC 2143. From the article, I get the impression that it's still in the early development stages, but that was almost a year ago.
In any event, if you're interested in pursuing this, you might want to talk to the author (bje@cygnus.com).

Alan Grossfield
--------------------------------------------------------------------------
|"In theory, there is no difference between theory and practice.  In     |
|practice, there is."                        Jan L.A. van de Snepscheut  |
--------------------------------------------------------------------------

From alvin@iplink.net Tue, 1 Jun 1999 15:32:28 -0400
Date: Tue, 1 Jun 1999 15:32:28 -0400
From: Alvin Starr alvin@iplink.net
Subject: SCSI as a network interface

On Tue, 1 Jun 1999, Melchior wrote:
> On May 31, 1999 Bill Fredrickson said:
>
> >Thanks all for the replies.
> >
> >Perhaps I should have been a little more specific about my intentions.
> >I'm looking for a fast, easy way to network a cluster [Beowulf style] of
> >PC's each of which already have SCSI controllers in them. When I was
> >reading through the mail messages I saw what I thought was a reference
> >to using the SCSI controller as a means of interconnecting the nodes. So,
> >not being a SCSI expert, I posted the message in hopes that maybe this
> >might be a possible way of doing it. I was hoping to avoid the additional
> >cost of NIC cards, switches, etc.
> >
> >Any thoughts, and/or suggestions would be most appreciated.
> >
> >Thanks in advance.
> >
> >Bill
>
>
> SCSI is not really suitable for direct network connection. Each SCSI bus
> needs to be terminated at both ends within the same machine. Given that
> network cards are cheap nowadays (hubs too), the path of least resistance
> is to put in some thirty dollar NE 2000 compatibles. If you don't want to
> spend (as little as) a hundred dollars per hub, you can use 10 base 2
> (coaxial network cable and tees). The downside is that your network will
> only run at 10 Mbit/s, but it's very cheap and performance will suffice to
> test many of the concepts
> in a smaller array.
If you decide in the future that you need more bandwidth
> inside the cluster the cards can be replaced with 10/100 cards and you only
> have to throw out a couple of hundred dollars worth of stuff.

SCSI is distance limited, but it does not have to be in a single enclosure. With LVD the limiting distance is 12 meters, not unreasonable for a small cluster. With a transfer speed of 80 Mbytes/sec you could get the equivalent of 640 Mbits/sec. The SCSI protocol even supports host-to-host communication, so the idea of using it as an interconnect system was thought of way back.

This is not to say that using SCSI is easy. But it can be used for a high-speed link between systems.

Alvin Starr                   ||   voice: (416)585-9971
Interlink Connectivity        ||   fax:   (416)585-9974
alvin@iplink.net              ||

From enano@ceu.fi.udc.es Tue, 1 Jun 1999 17:04:21 -0400
Date: Tue, 1 Jun 1999 17:04:21 -0400
From: Miguel Barreiro Paz enano@ceu.fi.udc.es
Subject: Superstck 2 switch and channel bonding

> I want to use channel bonding in my cluster but I didn't
> find how to configure my switch (a 3com superstack 2
> switch 3300) for VLAN in the user guide?
>
> Maybe someone who has the same switch as mine can give
> me the solution?

At least the early SuperStack II 3300s were shipped with a flash version that didn't allow VLANs, and obviously with no information regarding VLANs in the documentation (other than "to be provided in future releases" or something similar). Probably the manuals weren't changed for later releases; I don't know (all our 3300s had old flash revisions).

Newer flash versions are available at 3com.com.

Regards,

Miguel

From mcking@cajunbro.com Tue, 1 Jun 1999 17:20:48 -0400
Date: Tue, 1 Jun 1999 17:20:48 -0400
From: Mark McCoy mcking@cajunbro.com
Subject: Two stupid ip questions;

Eugene Leitl wrote:
>
> Rob Nelson writes:
> > >> 2) what's with these ip addresses with no dots? Is this in an rfc or
> > >> something? (or is it straight decimal to hex conversion?) e.g.
> > >> http://3626046468/ maps to www.angelfire.com (216.33.20.4)
>
> Any ideas why people decided on a decimal IP address notation
>   http://3626046468/
>   http://d8211404/
> instead of a much more natural hexadecimal one?
>
> And won't it be just horrible typing monstrosities like
>   http://1.2.4.8.16.32.64.128.1.2.4.8.16.32.64.128/
> in IPv6? Hexadecimal notation would seem definitely more compact here...
>   http://d8211404d8211404d8211404d8211404/
>
> Still, quite difficult to memorize, unless you're an idiot savant,
> that is.
>
> -- Eugene

Ahh, but that's what DNS is for!!

--
Mark McCoy -- Proud to run Linux since February 1996
Systems Administrator - Cajun Brothers Technology, llc
The views in this message do not necessarily reflect the views of my employer
This message posted from snowdog, a 100% MS-free machine.

From bob.cat@juno.com Tue, 1 Jun 1999 17:34:40 -0400
Date: Tue, 1 Jun 1999 17:34:40 -0400
From: bob.cat@juno.com bob.cat@juno.com
Subject: Two stupid ip questions;

The IP addresses of the cluster are irrelevant so long as the head system does not route packets. If you do need to route packets, you will need to use network address translation anyway. The 192.168.x.x and other reserved addresses (the RFC 1918 blocks 10.x.x.x, 172.16.x.x-172.31.x.x, and 192.168.x.x) are there to protect against someone putting a previously private network onto the Internet. Since the net is not supposed to route reserved addresses (SUPPOSED not to), when you plug your 192.168.x.x network in, it just doesn't work for you, rather than possibly screwing up the network whose addresses you used.

> http://3626046468/ maps to www.angelfire.com (216.33.20.4)

>Any ideas why people decided on a decimal IP address notation
>http://3626046468/
>http://d8211404/
>instead of a much more natural hexadecimal one?

There IS no *natural* way to express a number.

That notation doesn't work everywhere: WIN98 IE5.0 needs:

  http://0xd8211404/ or
  http://0xd8.0x21.0x14.0x04/

And it does like octal (leading 0 indicating octal):

  http://033010212004 etc., etc...
There are many ways to express an IP address, and this is not OS dependent.

>And won't it be just horrible typing monstrosities like
>http://1.2.4.8.16.32.64.128.1.2.4.8.16.32.64.128/
>in IPv6? Hexadecimal notation would seem definitely more compact
>here...
>http://d8211404d8211404d8211404d8211404/
>
>Still, quite difficult to memorize, unless you're an idiot savant,
>that is.

Simply convert the address to base(2^128) and you'll just have one number to remember!

:ßobÇat.Bat 1.0  >^^<  Stop me before I hack again!
Echo f b800:0000 fff 32 00 e1 09 6f 0f 62 0f 80 04 61 0f 74 0f 32 00 > Bob.Cat
Echo q >> Bob.Cat
DeBug < Bob.Cat > Nul
@Erase Bob.Cat > Nul

From ronelson@vt.edu Tue, 1 Jun 1999 17:49:30 -0400
Date: Tue, 1 Jun 1999 17:49:30 -0400
From: Rob Nelson ronelson@vt.edu
Subject: Two stupid ip questions;

> Any ideas why people decided on a decimal IP address notation
>   http://3626046468/
>   http://d8211404/
> instead of a much more natural hexadecimal one?
>
> Still, quite difficult to memorize, unless you're an idiot savant,
> that is.

I think you answered your own question :) 2^2 sections of 2^8 possibilities is easier to remember than 4 sections of 8^2 possibilities. Besides, even in base 256, the digits are represented by decimal numbers rather than a mix of letters and digits.

If you want to be fancy, memorize your MAC #'s instead of IPs!

Rob Nelson
ronelson@vt.edu

From eugene.leitl@lrz.uni-muenchen.de Tue, 1 Jun 1999 20:04:33 -0400
Date: Tue, 1 Jun 1999 20:04:33 -0400
From: Eugene Leitl eugene.leitl@lrz.uni-muenchen.de
Subject: Two stupid ip questions;

bob.cat@juno.com writes:
> There IS no *natural* way to express a number.

There is: use the smallest possible base. Binary is thus special. Unfortunately, it's very unwieldy for humans, so one has to use a widely used base that needs as few digits as possible: hex.
> That notation doesn't work everywhere: WIN98 IE5.0 needs:
>
> http://0xd8211404/ or
>
> http://0xd8.0x21.0x14.0x04/
>
> And it does like octal (leading 0 indicating octal):
>
> http://033010212004 etc., etc...

Octal is nice. I don't like the noise characters that denote the base, though.

  http://d8211404/

is way shorter than

  http://0xd8.0x21.0x14.0x04/

> There are many ways to express an IP address, and this is not OS
> dependent.

The OS doesn't know anything about syntax: applications do. There ought to be an RFC for that.

> Simply convert the address to base(2^128) and you'll just have one number
> to remember!

Unfortunately, the amount of neural tissue required for a base-2^128 representational system would undergo instant gravitational collapse, and create one hell of a singularity.

From mcking@cajunbro.com Tue, 1 Jun 1999 21:09:41 -0400
Date: Tue, 1 Jun 1999 21:09:41 -0400
From: Mark McCoy mcking@cajunbro.com
Subject: Network cards availability (again)

Hi,

Does anyone on the list know where I can get a list of network cards that work under RedHat 6?

I know about the 3com and tulip-based cards, but what about any others? RedHat's support site has the hardware list for Intel, but not for Alpha (I've already sent them an e-mail about it). I need this to find out what ISA cards are supported (10Mbit is fine for this card) since we will use up all of our PCI slots in the master machine.

Ideally, we want to use all tulip-based cards. If anyone knows where we can get 16 DEC-based cards, _pleeeaaasse_ let me know. I have a lead on some cards made by TrendNet. Does anyone know if these are any good? I have not even heard of them.

Thanks in advance!

--
Mark McCoy -- Proud to run Linux since February 1996
Systems Administrator - Cajun Brothers Technology, llc
The views in this message do not necessarily reflect the views of my employer
This message posted from snowdog, a 100% MS-free machine.
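
[Worked example for the "Two stupid ip questions" thread] The dotless addresses discussed in that thread are all the same 32-bit number written in different bases. A minimal Python sketch of the arithmetic, assuming only the standard-library ipaddress module; the function name dotless_forms is made up for illustration and is not something anyone in the thread used:

```python
import ipaddress


def dotless_forms(dotted):
    """Return the 32-bit value of a dotted-quad IPv4 address in several notations.

    Hypothetical helper for illustration only.
    """
    # int() on an IPv4Address gives a*2**24 + b*2**16 + c*2**8 + d
    value = int(ipaddress.IPv4Address(dotted))
    return {
        "decimal": str(value),
        "hex": "0x%08x" % value,
        # a leading 0 marks octal, as in the browser example quoted in the thread
        "octal": "0%o" % value,
        "dotted_hex": ".".join("0x%02x" % int(part) for part in dotted.split(".")),
    }


if __name__ == "__main__":
    for name, text in dotless_forms("216.33.20.4").items():
        print(name, text)
```

For 216.33.20.4 this gives 3626046468 (decimal), 0xd8211404 (hex), 033010212004 (octal) and 0xd8.0x21.0x14.0x04 (dotted hex), matching the forms quoted in the thread. Whether a given browser or resolver accepts each form is application dependent, as several posters point out.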
From hanzl@noel.feld.cvut.cz Wed, 2 Jun 1999 04:33:37 -0400
Date: Wed, 2 Jun 1999 04:33:37 -0400
From: Vaclav Hanzl hanzl@noel.feld.cvut.cz
Subject: Superstck 2 switch and channel bonding

Upgrading to firmware 2.0 solved this. (However, the web interface is still messy, and some functions are available only through the telnet interface. But VLANs can be set using the web interface.)

Vaclav Hanzl

> > I want to use channel bonding in my cluster but I didn't
> > find how to configure my switch (a 3com superstack 2
> > switch 3300) for VLAN in the user guide?
> >
> > Maybe someone who has the same switch as mine can give
> > me the solution?
>
> At least early SuperStack II 3300 were shipped with a flash
> version that didn't allow VLANs, and obviously no information regarding
> VLANs in the documentation (other than "to be provided in future releases"
> or something alike). Probably manuals weren't changed for later releases,
> I don't know (all our 3300's had old flash revisions).
>
> Newer flash versions are available at 3com.com.

From Christopher.Bohn@sn.wpafb.af.mil Wed, 2 Jun 1999 08:17:41 -0400
Date: Wed, 2 Jun 1999 08:17:41 -0400
From: Bohn Christopher A Capt AFRL/IFSD Christopher.Bohn@sn.wpafb.af.mil
Subject: Two stupid ip questions;

Good day,

> Any ideas why people decided on a decimal IP address notation

The IP address we're used to seeing is radix-256, which probably was originally represented as hex, e.g., 00.00.00.00 .. FF.FF.FF.FF, which, of course, is representable in exactly 32 bits (hence, 32-bit IP).

So, our 216.33.20.4 maps to D8.21.14.04

Take care,
cb

*-*-*-*-* *-*-*-*-*
Christopher A.
Bohn, Capt, USAF                      || christopher.bohn@sn.wpafb.af.mil
Digital Simulation Systems Engineer   || cbohn@computer.org
Collaborative Simulation Technology   ||
   and Applications Branch            || v (937)255-4429x3576 (DSN785)
Information Directorate               || f (937)255-4511 (DSN785)
Wright Research Site                  ||
Air Force Research Laboratory         || http://members.aol.com/EngrBohn/
http://www.if.afrl.af.mil/div/IFS/IFSD/IFSD_home.html
*-*-*-*-* *-*-*-*-*

> -----Original Message-----
> From: Eugene Leitl [mailto:eugene.leitl@lrz.uni-muenchen.de]
> Sent: Tuesday, June 01, 1999 1:42 PM
> To: Rob Nelson
> Cc: mdavis@kieser.net; Simon Hogg; beowulf@beowulf.gsfc.nasa.gov
> Subject: Re: Two stupid ip questions;
>
>
> Rob Nelson writes:
> > >> 2) what's with these ip addresses with no dots? Is this in an rfc or
> > >> something? (or is it straight decimal to hex conversion?) e.g.
> > >> http://3626046468/ maps to www.angelfire.com (216.33.20.4)
>
> Any ideas why people decided on a decimal IP address notation
>   http://3626046468/
>   http://d8211404/
> instead of a much more natural hexadecimal one?
>
> And won't it be just horrible typing monstrosities like
>   http://1.2.4.8.16.32.64.128.1.2.4.8.16.32.64.128/
> in IPv6? Hexadecimal notation would seem definitely more compact here...
>   http://d8211404d8211404d8211404d8211404/
>
> Still, quite difficult to memorize, unless you're an idiot savant,
> that is.
>
> -- Eugene
>

From hmm@patmos-international.com Wed, 2 Jun 1999 08:36:39 -0400
Date: Wed, 2 Jun 1999 08:36:39 -0400
From: Howard Miller hmm@patmos-international.com
Subject: Network cards availability (again)

On Tue, 1 Jun 1999, Mark McCoy wrote:
> I know about the 3com and tulip-based cards, but what about any others.
> RedHat's support site has the hardware list for Intel, but not for Alpha (I've
> already sent them an e-mail about it).
I need this to find out what ISA cards

While I've never seen a good list, it is certainly easy enough to get the kernel-src rpm, go into the /usr/src/linux directory, run 'make menuconfig' or 'make xconfig', and browse through until you find the list of network adapters. There are quite a few (under menuconfig you have to turn on some general-looking options before you can see them all). Also, the make command has to be run as root.

> Ideally, we want to use all tulip-based cards. If anyone knows where we

May I ask why you feel the tulips are ideal?

Hope this helps,
Howard Miller

From wasshub@spdc.ti.com Wed, 2 Jun 1999 09:08:32 -0400
Date: Wed, 2 Jun 1999 09:08:32 -0400
From: Christoph Wasshuber wasshub@spdc.ti.com
Subject: two NICs - channel bonding - tradeoff

Assuming one has two 100 Mbit NICs per node, is it always best to channel bond? Or are there applications which run better with two separate communication channels?

I could imagine an application where one has two kinds of messages. One kind is a short message but highly 'urgent': only after the message has been delivered can the calculation continue. A synchronization message, for example. The other kind of message is higher in volume but is not time critical. So one channel would stay empty most of the time to provide the lowest latency for the urgent messages, while the other channel is stuffed with non-urgent data.

Do I gain something by running two separate communication channels? Or do I fool myself?

Chris....

From uthayopa@mcs.anl.gov Wed, 2 Jun 1999 10:42:18 -0400
Date: Wed, 2 Jun 1999 10:42:18 -0400
From: Putchong Uthayopas uthayopa@mcs.anl.gov
Subject: were can i find pvmpovray samples

Hi,

There is a povray benchmark called POVBENCH. This is the one that IBM used for their cluster at the Linux World Conference in March. The site is

http://www.haveland.com/povbench

I attached the source povray file to this mail for you. All the results can be found on that web site. We also have test-run data for PVM povray on our SMILE Beowulf Cluster. I will put the gif file of the graph at

http://www.mcs.anl.gov/~uthayopa

Putchong.

On Tue, 1 Jun 1999, William Emmanuel S. Yu wrote:
>
> do you know where I can find pvmpovray samples?
>
> I would like to try out some for the demo. I also have a problem with
> access because I only have email access for now. If anyone is kind enough
> to email the samples to me I would highly appreciate it.
>
> william.s.yu@ieee.org
>
>

[Attachment: skyvase.pov (PovBench code)]

// Persistence Of Vision raytracer version 2.0 sample file.

// By Dan Farmer
//    Minneapolis, mn

//   skyvase.pov
//   Vase made with Hyperboloid and sphere {, sitting on a hexagonal
//   marble column.  Take note of the color and surface characteristics
//   of the gold band around the vase.  It seems to be a successful
//   combination for gold or brass.
//
// Contains a Disk_Y object which may have changed in shapes.dat


#include "shapes.inc"
#include "shapes2.inc"
#include "colors.inc"
#include "textures.inc"

#declare DMF_Hyperboloid = quadric {  /* Like Hyperboloid_Y, but more curvy */
   <1.0, -1.0,  1.0>,
   <0.0,  0.0,  0.0>,
   <0.0,  0.0,  0.0>,
   -0.5
}

camera {
   location <0.0, 28.0, -200.0>
   direction <0.0, 0.0, 2.0>
   up  <0.0, 1.0, 0.0>
   right <4/3, 0.0, 0.0>
   look_at <0.0, -12.0, 0.0>
}

/* Light behind viewer postion (pseudo-ambient light) */
light_source { <100.0, 500.0, -500.0> colour White }

union {
   union {
      intersection {
         plane { y, 0.7 }
         object { DMF_Hyperboloid scale <0.75, 1.25, 0.75> }
         object { DMF_Hyperboloid scale <0.70, 1.25, 0.70> inverse }
         plane { y, -1.0 inverse }
      }
      sphere { <0, 0, 0>, 1 scale <1.6, 0.75, 1.6 > translate <0, -1.15, 0> }

      scale <20, 25, 20>

      pigment {
         Bright_Blue_Sky
         turbulence 0.3
         quick_color Blue
         scale <8.0, 4.0, 4.0>
         rotate 15*z
      }
      finish {
         ambient 0.1
         diffuse 0.75
         phong 1
         phong_size 100
         reflection 0.35
      }
   }

   sphere {  /* Gold ridge around sphere portion of vase*/
      <0, 0, 0>, 1
      scale <1.6, 0.75, 1.6>
      translate -7*y
      scale <20.5, 4.0, 20.5>

      finish { Metal }
      pigment { OldGold }
   }

   bounded_by {
      object {
         Disk_Y
         translate -0.5*y  // Remove for new Disk_Y definition
         scale <34, 100, 34>
      }
   }
}

/* Stand for the vase */
object { Hexagon
   rotate -90.0*z             /* Stand it on end (vertical)*/
   rotate -45*y               /* Turn it to a pleasing angle */
   scale <40, 25, 40>
   translate -70*y

   pigment {
      Sapphire_Agate
      quick_color Red
      scale 10.0
   }
   finish {
      ambient 0.2
      diffuse 0.75
      reflection 0.85
   }
}

union {
   plane { z, 50  rotate -45*y }
   plane { z, 50  rotate +45*y }

   pigment { DimGray }
   finish {
      ambient 0.2
      diffuse 0.75
      reflection 0.5
   }
}

[End of attachment]

From ulairi@ecs.csun.edu Wed, 2 Jun 1999 10:48:57 -0400
Date: Wed, 2 Jun 1999 10:48:57 -0400
From: Ulairi ulairi@ecs.csun.edu
Subject: Two stupid IP questions;

Imagine the poor sysadmins (us) who will have to maintain the DNS and memorize the IPs by heart. UGH! :)

From uthayopa@mcs.anl.gov Wed, 2 Jun 1999 10:53:37 -0400
Date: Wed, 2 Jun 1999 10:53:37 -0400
From: Putchong Uthayopas uthayopa@mcs.anl.gov
Subject: Network cards availability (again)

Hi,

This is related and not quite related. I have just bought a LinkSys Etherfast 10/100 + 56K modem PCMCIA card. It works well with Linux, but for Redhat 5.2 you must load PCMCIA 3.0.9, since RH5.2 uses PCMCIA 3.0.5. I installed Redhat 6 yesterday, and it worked right away. The price is also good: the one I have was about 192 US$ (tax included).

Putchong.

PS: Sorry if it is not very related, I am just in the mood to post.

On Tue, 1 Jun 1999, Mark McCoy wrote:
> Hi,
> Does anyone on the list know where I can get a list of network cards that work
> under RedHat 6?
> I know about the 3com and tulip-based cards, but what about any others.
> RedHat's support site has the hardware list for Intel, but not for Alpha (I've
> already sent them an e-mail about it). I need this to find out what ISA cards
> are supported (10Mbit is fine for this card) since we will use up all of our PCI
> slots in the master machine.
>
> Ideally, we want to use all tulip-based cards. If anyone knows where we can get
> 16 DEC-based cards, _pleeeaaasse_ let me know. I have a lead on some cards made
> by TrendNet. Does anyone know if these are any good? I have not even heard of
> them.
>
> Thanks in advance!
> --
> Mark McCoy -- Proud to run Linux since February 1996
> Systems Administrator - Cajun Brothers Technology, llc
> The views in this message do not necessarily reflect the views of my employer
> This message posted from snowdog, a 100% MS-free machine.
>

From walt@parl.ces.clemson.edu Wed, 2 Jun 1999 11:18:34 -0400
Date: Wed, 2 Jun 1999 11:18:34 -0400
From: Walter B. Ligon III walt@parl.ces.clemson.edu
Subject: two NICs - channel bonding - tradeoff

--------

Now THIS is a really good question. The answer is: no one can answer that for you. Someone should explore that idea to see.

Generally I think there are a number of interesting issues in using multiple networks. We have experimented with using one bus and one switched network (back when switches were still pricey).

As you have correctly pointed out, it will depend a lot on your application. First off, you need to find such an application. Then do some experimenting to see how best to use the network resources.

Of course, an alternative to using a dedicated network would be to modify the networking code in the kernel to provide priority access to your special traffic, though getting switches to do that would be much more difficult. This would make a great project for an MS student - mucking around in the kernel to implement this.
I wonder if anyone knows if the "out of band" feature of TCP/IP (or is it UDP/IP?) actually gives priority to packets at the network device queue, or if it simply provides a separate buffer for the socket? One idea would be for the kernel to automatically route "out of band" data via a different device.

Anyway, lots of issues to explore. Please do, and then let us know what you find.

Walt

> Assuming one has two 100MB NICs per node, is it
> always the best to channel bond? Or are there
> applications which run better with two separate
> communication channels.
> I could imagine an application where one has two
> kinds of messages. One kind is a short message but
> highly 'urgent'. Only after the message was delivered
> can the calculation continue. For example a
> synchronization message. And then another type of
> message which is higher in volume but is not
> time critical. So one channel would stay most of
> the time empty to provide the lowest latency for
> the urgent messages. Where the other channel is
> stuffed to transmit non-urgent data.
>
> Do I gain something with running two separate
> communication channels? Or do I fool myself?
>
> Chris....

--
Dr. Walter B. Ligon III
Associate Professor
ECE Department
Clemson University

From mcking@cajunbro.com Wed, 2 Jun 1999 12:09:23 -0400
Date: Wed, 2 Jun 1999 12:09:23 -0400
From: Mark McCoy mcking@cajunbro.com
Subject: Network cards availability (again)

Mike Menefee wrote:
>
> > Hi,
> > Does anyone on the list know where I can get a list of network cards that work
> > under RedHat 6?
> > I know about the 3com and tulip-based cards, but what about any others.
> > RedHat's support site has the hardware list for Intel, but not for Alpha (I've
> > already sent them an e-mail about it). I need this to find out what ISA cards
> > are supported (10Mbit is fine for this card) since we will use up all of our PCI
> > slots in the master machine.
>
> Might do a search on their site..
they have a thing about burying stuff > sometimes.. also, might look for a general compat HOW-TO... (Never tried > sticking Linux on an Alpha system, but I would expect the hardware list to > be close to an Intel..) In general, it is, except for network cards and video cards > > > > > Ideally, we want to use all tulip-based cards. If anyone knows where we > can get > > 16 DEC-based cards, _pleeeaaasse_ let me know. I have a lead on some > cards made > > by TrendNet. Does anyone know if these are any good? I have not even > heard of > > them. > > Hmm.. this might be a bad memory on my part, but I think NetGear's 10BT card > is DEC based... might look into it... They're prolly around $20 a pop now, > and NetGear is Bay Networks, so... > > > Thanks in advance! > > Np. > > MikeM The Netgear cards used to be DEC, now they're some other (FPIC??) manufacturer. -- Mark McCoy -- Proud to run Linux since February 1996 Systems Administrator - Cajun Brothers Technology, llc The views in this message do not necessarily reflect the views of my employer This message posted from snowdog, a 100% MS-free machine. From dhart@indiana.edu Wed, 2 Jun 1999 12:51:57 -0400 Date: Wed, 2 Jun 1999 12:51:57 -0400 From: Dave Hart dhart@indiana.edu Subject: PVM or MPI essential to run parallel applications on a Beowulf? I have 32 dual-processor nodes, with MPICH and Portland Group compilers. Is it possible to take advantage of the dual processors, for a 32-process MPI & pgf90 job? -- David Hart http://php.indiana.edu/~dhart Research Computing Support 812-855-2632 University Information Technology Services Indiana University From kragen@pobox.com Wed, 2 Jun 1999 13:14:03 -0400 Date: Wed, 2 Jun 1999 13:14:03 -0400 From: Kragen Sitaker kragen@pobox.com Subject: two NICs -- channel bonding -- tradeoff Someone asked: > I wonder if anyone know if the "out of band" feature of TCP/IP (or is > it UDP/IP?) 
actually gives priority to packets at the network device > queue, or if it simply provides a seperate buffer for the socket? One > idea would be for the kernel to automatically route "out of band" data > via a different device. The "out of band" feature is a feature of the BSD socket interface. On TCP it translates into "urgent data", which is just data in the normal data stream with an 'urgent pointer' pointing to it. Accordingly, 'out of band' data cannot usefully be routed through a different device, because it will not be received until all previous packets on the same TCP connection are received. You could use a different TCP connection (with different IP QoS flags) though. -- Kragen Sitaker TurboLinux is outselling NT in Japan's retail software market 10 to 1, so I hear. -- http://www.performancecomputing.com/opinions/unixriot/981218.shtml From tull@pgroup.com Wed, 2 Jun 1999 13:53:48 -0400 Date: Wed, 2 Jun 1999 13:53:48 -0400 From: Dave Borer tull@pgroup.com Subject: PVM or MPI essential to run parallel applications on a Beowulf? David, I assume you have the 3.0-4 release installed. The problem with MPI-type programs is that the code which is loaded on each node does not propagate the environment variables needed, so the program must set them. We recommend that the program loaded on each node use the OpenMP routine OMP_SET_NUM_THREADS() to set the number of threads (2 in this case); the setting can be verified with OMP_GET_NUM_THREADS(). So use these routines in your codes that are to be compiled and linked using -mp. If you need to verify that some representative example of your coding is appropriate for your intended use, feel free to send it, and we will see if we can help. regards, dave > > > I have 32 dual-processor nodes, with MPICH and Portland Group > compilers. Is it possible to take advantage of the dual processors, > for a 32-process MPI & pgf90 job?
> > -- > David Hart http://php.indiana.edu/~dhart > Research Computing Support 812-855-2632 > University Information Technology Services Indiana University > From jferg@2boot.com Wed, 2 Jun 1999 14:43:04 -0400 Date: Wed, 2 Jun 1999 14:43:04 -0400 From: jferg jferg@2boot.com Subject: Two stupid ip questions; Bohn Christopher A Capt AFRL/IFSD wrote: > Good day, > > > Any ideas why people decided on a decimal IP address notation > > The IP address we're used to seeing is 256-radix, which probably was > originally represented as hex. e.g., 00.00.00.00 .. FF.FF.FF.FF -- which, > of course, is representable in exactly 32 bits (hence, 32-bit IP) > > So, our 216.33.20.4 maps to D8.21.14.04 > > Take care, > cb > *-*-*-*-* *-*-*-*-* > Christopher A. Bohn, Capt, USAF || christopher.bohn@sn.wpafb.af.mil > Digital Simulation Systems Engineer || cbohn@computer.org > Collaborative Simulation Technology || > and Applications Branch || v (937)255-4429x3576 (DSN785) > Information Directorate || f (937)255-4511 (DSN785) > Wright Research Site || > Air Force Research Laboratory || http://members.aol.com/EngrBohn/ > http://www.if.afrl.af.mil/div/IFS/IFSD/IFSD_home.html > *-*-*-*-* *-*-*-*-* > > > -----Original Message----- > > From: Eugene Leitl [mailto:eugene.leitl@lrz.uni-muenchen.de] > > Sent: Tuesday, June 01, 1999 1:42 PM > > To: Rob Nelson > > Cc: mdavis@kieser.net; Simon Hogg; beowulf@beowulf.gsfc.nasa.gov > > Subject: Re: Two stupid ip questions; > > > > > > Rob Nelson writes: > > > >> 2) what's with these ip addresses with no dots? Is > > this in an rfc or > > > >> something? (or is it straight decimal to hex conversion?) e.g.
> > > >> http://3626046468/ maps to www.angelfire.com (216.33.20.4) > > > > Any ideas why people decided on a decimal IP address notation > > http://3626046468/ > > http://d8211404/ > > instead of a much more natural hexadecimal one? > > > > And won't it be just horrible typing monstrosities like > > http://1.2.4.8.16.32.64.128.1.2.4.8.16.32.64.128/ > > in IPv6? Hexadecimal notation would seem definitely more > > compact here... > > http://d8211404d8211404d8211404d8211404/ > > > > Still, quite difficult to memorize, unless you're an idiot savant, > > that is. > > > > -- Eugene > > The library routines derived from the BSD work will normally accept IP addresses in any of the long integer string specifications acceptable to C compilers, e.g., leading "0" for octal, leading "0x" for Hexadecimal, a straight decimal number, and the "dotted decimal" notation which is most familiar. In setting net masks, I frequently use hexadecimal. While "0xFFFFFF00" has little to offer over "255.255.255.0", when I want, say, a 10-bit host field, I prefer "0xFFFFFC00" to "255.255.252.0"; it just seems a bit more intuitive to me. I make no assertion that this works for independently derived libraries. -- Joe Ferguson, ApeX Systems Integration Corp. Voice: 919.468.8150 FAX: 919.468.5288 email: jferg@2boot.com
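The equivalence of the notations discussed in this thread (dotted decimal, a single decimal integer, and hexadecimal) can be checked with a minimal sketch. It assumes Python 3 and its standard ipaddress module; the function names are illustrative only and do not come from any post in the thread, and the sketch says nothing about which forms a particular resolver library will accept.

```python
import ipaddress


def to_forms(dotted):
    """Return (decimal integer, 8-digit hex string) for a dotted-quad IPv4 address."""
    n = int(ipaddress.IPv4Address(dotted))
    return n, format(n, "08x")


def from_int(n):
    """Return the dotted-quad form of a 32-bit integer address."""
    return str(ipaddress.IPv4Address(n))


def netmask_for_host_bits(host_bits):
    """Return (hex, dotted) netmask that leaves `host_bits` low bits for hosts."""
    mask = (0xFFFFFFFF << host_bits) & 0xFFFFFFFF
    return "0x" + format(mask, "08X"), str(ipaddress.IPv4Address(mask))


if __name__ == "__main__":
    print(to_forms("216.33.20.4"))      # the angelfire example from the thread
    print(from_int(3626046468))
    print(netmask_for_host_bits(10))    # the 10-bit host field example
```

Each octet is simply one base-256 digit of the 32-bit value, which is why 216.33.20.4, 3626046468 and d8211404 all name the same address.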
From Christopher.Bohn@sn.wpafb.af.mil Wed, 2 Jun 1999 14:54:58 -0400 Date: Wed, 2 Jun 1999 14:54:58 -0400 From: Bohn Christopher A Capt AFRL/IFSD Christopher.Bohn@sn.wpafb.af.mil Subject: Sun community-sources its HPC software Good day, Saw this on Slashdot: Sun is releasing their HPC 2.0 under the Sun Community Source License http://slashdot.org/article.pl?sid=99/06/02/1735235 http://www.sun.com/servers/hpc/software/ Interestingly, their HPF compiler is listed as one of the "highlights" -- does this mean a freely-available HPF compiler? HPC 2.0's highlights (http://www.sun.com/servers/hpc/software/overview.html) HPC 2.0 supports single symmetric multiprocessors (SMPs) and clusters of SMPs with up to 256 processors. Sun High-Performance Fortran (HPF) produces optimized parallel codes that run on either a Sun HPC cluster or a single SMP system. Prism graphical programming environment allows developers to execute, debug, visualize, analyze, and tune both serial and parallel programs. Sun Scientific Subroutine Library (S3L) provides scalable parallel functions and tools for scientific applications. Load Sharing Facility (LSF) optimizes load balancing, job execution, and distributed batch scheduling. Run-Time Environment (RTE) provides tools for optimizing parallel application workload management, parallel resource sharing, and parallel load balancing. Sun MPI delivers thread-safe message passing designed for communication among multiple nodes in clusters as well as among processes on the same symmetric multiprocessor (SMP). Sun MPI I/O maximizes parallel I/O capabilities for message-passing among multiple nodes in clusters. Parallel File System (PFS) allows parallel applications to perform high-performance, scalable I/O across multiple storage systems in parallel.
Parallel Virtual Machine (PVM) is a public-domain message-passing library. PETSc is a portable, extensible toolkit for scientific computation, providing support for sparse iterative solvers. Cluster Console Manager (CCM)enables administrators to open windows to each node in a cluster and to initiate operations either across all nodes or across subsets of nodes. Switch Management Agent (SMA) helps the administrator configure and monitor the SCI switch. Take care, cb *-*-*-*-* *-*-*-*-* Christopher A. Bohn, Capt, USAF || christopher.bohn@sn.wpafb.af.mil Digital Simulation Systems Engineer || cbohn@computer.org Collaborative Simulation || Technology Branch || v (937)255-4429x3576 (DSN785) Information Directorate || f (937)255-4511 (DSN785) Wright Research Site || Air Force Research Laboratory || http://members.aol.com/EngrBohn/ http://www.if.afrl.af.mil/div/IFS/IFSD/IFSD_home.html *-*-*-*-* *-*-*-*-* From walt@parl.ces.clemson.edu Wed, 2 Jun 1999 15:11:42 -0400 Date: Wed, 2 Jun 1999 15:11:42 -0400 From: Walter B. Ligon III walt@parl.ces.clemson.edu Subject: two NICs -- channel bonding -- tradeoff -------- > Someone asked: > > I wonder if anyone know if the "out of band" feature of TCP/IP (or is > > it UDP/IP?) actually gives priority to packets at the network device > > queue, or if it simply provides a seperate buffer for the socket? One > > idea would be for the kernel to automatically route "out of band" data > > via a different device. > > The "out of band" feature is a feature of the BSD socket interface. On > TCP it translates into "urgent data", which is just data in the normal > data stream with an 'urgent pointer' pointing to it. > > Accordingly, 'out of band' data cannot usefully be routed through a > different device, because it will not be received until all previous > packets on the same TCP connection are received. > > You could use a different TCP connection (with different IP QoS flags) > though. > Right. 
So it sounds like one implementation of that socket feature over TCP/IP does not actually give priority to that data. So the one idea is to implement a mechanism that does. Anyway, I don't know if it would really be worth it - depends on what applications could do with it. But it IS an idea. Walt -- Dr. Walter B. Ligon III Associate Professor ECE Department Clemson University From ctimmer@gci.net Wed, 2 Jun 1999 15:19:29 -0400 Date: Wed, 2 Jun 1999 15:19:29 -0400 From: Curt Timmerman ctimmer@gci.net Subject: two NICs - channel bonding - tradeoff I took the easy way out. I have 2 NICs per node and dedicate cluster messaging (PVM) to one card and everything else to the other card. I have no benchmarks, but the activity lights indicate a pretty good load balance. Again - depends on your application. "Walter B. Ligon III" wrote: > > -------- > Now THIS is a really good question. The answer is: no one can answer that > for you. Someone should explore that idea to see. > > Generally I think there are a number of interesting issues in using multiple > networks. We have experimented with using one bus and one switched network > (back when switches were still pricey). > > As you have correctly pointed out it will depend a lot on your application. > First off you need to find such an application. Then do some experimenting > to see how best to use the network resources. > > Of course, an alternative to using a dedicated network would be to modify > the networking code in the kernel to provide priority access to your special > traffic - of course getting switches to do that would be much more difficult. > This would make a great project for an MS student - mucking around in the > kernel to implement this. I wonder if anyone know if the "out of band" > feature of TCP/IP (or is it UDP/IP?)
actually gives priority to packets > at the network device queue, or if it simply provides a seperate buffer for > the socket? One idea would be for the kernel to automatically route "out > of band" data via a different device. > > Anyway, lots of issues to explore. Please do, and then let us know what you > find. > > Walt > > > Assuming one has two 100MB NICs per node, is it ... > > > > Do I gain something with running two separate > > communication channels? Or do I fool myself? > > > > Chris.... > > -- > Dr. Walter B. Ligon III > Associate Professor > ECE Department > Clemson University -- -------------------------------------------- Curt Timmerman PO Box 520153 Big Lake, Alaska 99652 (907)892-7460 -------------------------------------------- From johansen@publicistech.com Wed, 2 Jun 1999 15:22:31 -0400 Date: Wed, 2 Jun 1999 15:22:31 -0400 From: Jon Johansen johansen@publicistech.com Subject: greetings please excuse what is probably an oft-asked question, but I'm new to the list: do apps (commercial/shareware/etc) currently exist to run graphical environments (OpenGl, etc/multi-object, with complex behaviors, and such) on a beowulf cube, or is the state of the tech still at the home-grown level? thnx in advance From philip_juels@harvard.edu Wed, 2 Jun 1999 16:39:15 -0400 Date: Wed, 2 Jun 1999 16:39:15 -0400 From: Philip Juels philip_juels@harvard.edu Subject: SSH and clusters Sometimes my users simply want to run a batch process on any given node within our cluster as opposed to true parallel processing.
So they use ssh to access the master node of our cluster and then rlogin or telnet to access the clients from the master (the client nodes are on an isolated intranet with the master acting as gatekeeper). Is this insecure? Should we run ssh for connections within the cluster? My understanding of ssh is that it's like a secure pipe...anything on top of it should be encrypted. Thanks, Philip Juels philip_juels@harvard.edu From a_mccabe@worldnet.att.net Wed, 2 Jun 1999 16:53:52 -0400 Date: Wed, 2 Jun 1999 16:53:52 -0400 From: Andrew McCabe a_mccabe@worldnet.att.net Subject: NFS mounts i read somewhere about mounting a common directory with NFS to move files around (ftp is getting annoying) how would i go about doing this? my server is RH6.0, and the client is RH5.2 any help would be greatly appreciated thanx --andrew mccabe From ctimmer@gci.net Wed, 2 Jun 1999 17:24:42 -0400 Date: Wed, 2 Jun 1999 17:24:42 -0400 From: Curt Timmerman ctimmer@gci.net Subject: PVM or MPI essential to run parallel applications on a Beowulf? The simplest, but not necessarily optimal, way to handle this is to treat a dual-processor node as if it were 2 nodes. No special programming is required and the benefits are available immediately. Curt Dave Hart wrote: > > I have 32 dual-processor nodes, with MPICH and Portland Group > compilers. Is it possible to take advantage of the dual processors, > for a 32-process MPI & pgf90 job?
> > -- > David Hart http://php.indiana.edu/~dhart > Research Computing Support 812-855-2632 > University Information Technology Services Indiana University -- -------------------------------------------- Curt Timmerman PO Box 520153 Big Lake, Alaska 99652 (907)892-7460 -------------------------------------------- From rgb@phy.duke.edu Wed, 2 Jun 1999 17:57:24 -0400 Date: Wed, 2 Jun 1999 17:57:24 -0400 From: Robert G. Brown rgb@phy.duke.edu Subject: Cox report on Chinese spy activities and Beowulf On Fri, 28 May 1999, texelsoft wrote: > Whoa RGB you're getting carried away by a perfectly good rant. > > > e) That nuclear bombs basically Are Not That Hard To Build. You > > cannot keep the laws of nature secret from anybody with even a paltry > > research budget (and the Chinese are far from mingy on research and > > defense, I'm sure). Once the laws are known (and they've been > > generally known for decades, now) all that is left is engineering, and > > folks need to get a grip on their seats to hear this, but engineering > > is not all that difficult either. What do they think, Americans have > > some sort of a monopoly on clever ideas so that the world has to steal > > them? I think not... > > Then why the efforts to steal it? Surely you're not sufficiently racist to > suggest the chinese people, in general, are stupid. Equally unlikely is the I'm suggesting from direct personal experience that they are far from stupid.
In fact, they're more than smart enough that they didn't need to steal anything to build advanced design bombs. > notion that the engineering effort done in the last N years in american labs > is valueless. Or perhaps you'd like to make the code open source? Kinda has Not valueless, just easily duplicated by other smart people in non-American labs given a decade or more and the resources to work on it. If you check out the website the Chinese government set up to prove that they didn't steal any "secrets" to build their bombs, you will see that the "code" already is "open source". You might look up the nuclear weapons FAQ site (it moves or I'd give you a URL, but a web browser should easily find it) you can browse a fraction of what is known and in the public domain about bombs. The NWFAQ stops (just) short of providing "engineering details" but, as the Chinese noted, most of those details (physical dimensions, materials, specs) are in the open literature or moderately unimportant (many ways to design a house or car or bomb). Curiously, the Chinese government stated clearly that they think that America's claim that they (needed to steal because they are "ignorant Chinese" and hence) "stole" the bomb "secrets" is what is really racist. I agree. I will make a Pronouncement. In my professional opinion, as a physicist of reasonably good standing, a team made up of just the physicists from China that >>I personally<< have taught in graduate school at Duke over the last decade (most of whom were at or near the top of their graduate classes grade-wise) could easily build any kind of nuclear bomb you like from the descriptions of bombs available on the NWFAQ and throughout the open literature. 
Sure, they'd need a decent budget, some readily available computers, and the support of a corps of explosives (and other) engineers, but again, I feel confident that the Chinese are up to the challenge of building explosive lenses out of well-documented explosive materials with well-documented differential burn rates. Especially given a decade or more to experiment and perfect. They manage to build lasers, ballistic missiles, integrated circuits, and all sorts of other "high-tech" devices; their best engineers are certainly competent and creative enough. I'm not a nuclear theorist, but I'm totally confident that I could do it all by myself in a decade (given a budget of a a few hundred million $$ and a cast of thousands, of course;-). The physics just isn't all that difficult any more, and the physics is the hardest part. The only thing at all difficult about building a bomb is acquiring bomb-grade fissionable materials. This a country the size of China obviously has no problem with. The place where the Chinese obtained the "secrets" required to build the bomb is in the many top-level graduate physics departments across the country that have been eagerly accepting Chinese students for ten years or so now. Note that those students represent the heavily selected "cream" of students from a population pool of approximately 1 billion people, in a culture that has valued education as a means of personal advancement (and hence to some extent selected for intelligence) for some 3000 years. We have trained them in nuclear physics, laser physics, condensed matter physics, computational physics, pretty much any kind of physics they wanted to learn, all sorts of engineering and mathematics -- there have been no "restrictions" on the kind of knowledge foreign students, including the Chinese, are permitted to acquire in American universities. 
This is the same place, by the way, that both the Pakistanis and the Indians (the latter with an equally large population pool of their own) learned the requisite physics. Sure, both countries have some excellent universities of their own, but face it -- the best graduate physics institutions in the world are arguably in USA (with some equals in Western Europe) right now. We accept and train the best students from all over the world. It's hardly surprising to me that a few Fermis or Tellers or Oppenheimers are among them, especially when the REALLY hard part, the conception and proof of the Idea, is long since accomplished. The last observation to make about the whole issue is that the flow of this sort of information is inevitably from "unknown" to "secret" to "generally and openly known". In the early 1940's nuclear bomb design stopped being "unknown". It remained approximately "secret" up to perhaps the late sixties, with occasional (very brief) bursts of new, ever "smaller" secrets as new bomb designs were obtained. Once it is known that something (like a neutron bomb, or a very small implosion warhead) is possible, however, it is just a matter of time before its "secrets" become public domain and generally known. Inside the next decade, every detail of building any kind of bomb you like will become publically available, instantly retrievable knowledge; most of the details are already available if you look carefully for them. We might as well get used to the idea. The USA is trying to stop the bleeding of an amputation with a band-aid, or building an armed bunker around the barn long after the horse has departed, or passing water into the face of a force seven hurricane, or trying to reverse the course of the second law of thermodynamics (literally, as this dictates the direction of the flow of information from isolated to disseminated). Perhaps we'll buy a few years before any Iraqi armed with a web-browser can get the details. 
Perhaps not; personally I think that the Iraqi's have done just fine building bombs lacking only the "pits" to make them functional even without web browsers. Either way their response of the moment is redolent with jingoistic hysteria. We don't really need some sort of Nuclear McCarthyism introduced into American politics to restrict free trade in computers and distract us from real issues -- that it is occuring anyway strikes me as blatant election-year issue-creation. > > (Speaking of which, let us pray for the Indians and Pakistanis who are > > about to die in the world's second nuclear war...and who didn't steal > > any codes or misuse any controlled computers to build the devices that > > they will use in it). > Good sentiment if a bit overwrought with nuclear hysteria. Pray for the > young indian and pakistani men and women who really are dying right now from > conventional weapons too. Good point, although they are dying by the tens or even the hundreds in a very confined locale. In almost any kind of nuclear exchange in the crowded Indian sub-continent, I would expect casualties four to five orders of magnitude greater. A hundred million dead or wounded or poisoned by fallout would not be out of the question. I actually lived in India from 1959 to 1967, during one of the earlier India-Pakistani wars (and not so long after their splitup, Nehru was Prime Minister when I first arrived). I had occasion to visit India again fairly recently. Based on this experience, I'm not at all optimistic about the current situation there. Perhaps I'm being "hysterical"; certainly I'm being cynical. Very, very cynical. rgb Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu From rgb@phy.duke.edu Wed, 2 Jun 1999 18:56:15 -0400 Date: Wed, 2 Jun 1999 18:56:15 -0400 From: Robert G. 
Brown rgb@phy.duke.edu Subject: NFS mounts On Wed, 2 Jun 1999, Andrew McCabe wrote: > i read somewhere about mounting a common directory with NFS to move files > around (ftp is getting annoying) how would i go about doing this? my > server is RH6.0, and the client is RH5.2 > > any help would be greatly appreciated There are a number of documents describing how to do this in the HOWTOs, usually in /usr/doc somewhere. It's also described in almost any book on systems administration, especially in Linux-specific ones. In a nutshell: a) pick a filesystem or directory (e.g. /whatever) to export; b) add it to /etc/exports. Read the man page for the format, and be careful not to export it promiscuously (that is, to everybody in the known universe). Export the directory rw if you want the client to be able to make changes (e.g. delete or add files); c) restart rpc.nfsd and rpc.mountd (kill -1 their pids). On the client, add a line to /etc/fstab and run mount -a, or just enter "mount server:/whatever /whatever" by hand. Set ownership and permissions appropriately -- both server and client need to agree about the uids of files in /whatever. A second (perhaps better) alternative is to install and use ssh. scp is a secure alternative to ftp, provided that you have accounts on both systems. Note that between SOME systems, especially those widely separated by routers, one may encounter a router set to not pass port 2049. In this (sensible!) case one cannot use nfs at all. Since nfs is fundamentally insecure over a WAN (IMHO, at least), if your connection passes through one or more routers you should consider the ssh alternative even if the routers do pass 2049. rgb > > thanx > --andrew mccabe > Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu From rgb@phy.duke.edu Wed, 2 Jun 1999 18:59:40 -0400 Date: Wed, 2 Jun 1999 18:59:40 -0400 From: Robert G.
Brown rgb@phy.duke.edu Subject: Cox report on Chinese spy activities and Beowulf On Fri, 28 May 1999, texelsoft wrote: > Whoa RGB you're getting carried away by a perfectly good rant. Oops, sorry list folks, especially Alan Cox and others who complained. I meant to hit the other "r"; I agree that it is time to take the discussion offline (or let it die a natural death:-) but was catching up on a bunch of mail fast and mis-hit. rgb Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu From jstan@trendcmhs.org Wed, 2 Jun 1999 19:03:12 -0400 Date: Wed, 2 Jun 1999 19:03:12 -0400 From: Stanley, Jeremy jstan@trendcmhs.org Subject: Cox report on Chinese spy activities and Beowulf > ---------- > From: Robert G. Brown[SMTP:rgb@phy.duke.edu] > Sent: Wednesday, June 02, 1999 5:57 PM > To: texelsoft > Cc: Matt Welsh; extreme-linux@acl.lanl.gov; > beowulf@beowulf.gsfc.nasa.gov > Subject: RE: Cox report on Chinese spy activities and Beowulf > > Not valueless, just easily duplicated by other smart people in > non-American labs given a decade or more and the resources to work on > it. At the risk of continuing a potentially off-topic thread, I feel obligated to point out the obvious Ameri-centrism prevalent in the press's coverage of this unfortunate but inevitable situation. Forget not that the pioneers of nuclear technology in the States were, for the most part, foreign nationals themselves. They brought with them much of the information that the United States claims for itself. This strikes me as nothing more than a case of inflated ego, but one with the potential to do much damage to the scientific world as a whole... -- Jeremy Stanley Trend CMHS I.S.Network Engineer http://www.trendcmhs.org The opinions expressed herein do not necessarily represent those of Trend CMHS or Trend Foundation. "I program my homecomputer; beam myself into the future." 
--Kraftwerk, 1981 > Sometimes my users simply want to run a batch process on any given node > within our cluster as opposed to true parallel processing. So they use > ssh to access the master node of our cluster and then rlogin or telnet > to access the clients from the master (the client nodes are on an > isolated intranet with the master acting as gatekeeper). Is this > insecure? Should we run ssh for connections withing the cluster? My > understanding of ssh is that it's like a secure pipe...anything on top > of it should be encrypted. I believe that this is as secure as your gatekeeper.
As you say, the traffic over open links from your originating host to the gatekeeper should be encrypted and non-snoopable. Furthermore, your INTERNAL traffic on the cluster probably is no-passwd-needed stuff enabled with .rhosts. However, I don't think there is any really good reason not to run ssh inside the cluster anyway. I suppose it is a religious view, but I'd like to see rsh go away permanently, and the best way for this to eventually occur is if everybody everywhere starts to use ssh/sshd exclusively wherever they once used rsh. IIRC this is the last year that RSA is patented; next year ssh will be truly public domain, and if the num-nums who think that people in the USA should be prohibited from exporting an encryption/software package originally distributed from Finland and universally available anyway could just be persuaded to lighten up and rent a brain, we might see all Linux distributions adopt it as standard fare. rgb P.S. to forestall possible arguments that rsh is a lighter-weight protocol, I agree, but if one is using the shell itself for IPC in an application where speed is critical, well... P.P.S. - and if one is really doing this, one CAN still use rsh, but very few people are, I'm sure.... Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu From nvertes@intellicorp.com.au Wed, 2 Jun 1999 20:00:01 -0400 Date: Wed, 2 Jun 1999 20:00:01 -0400 From: nvertes nvertes@intellicorp.com.au Subject: Fw: Network cards availability (again) To: Mark McCoy Sent: Wednesday, June 02, 1999 9:39 PM Subject: Re: Network cards availability (again) Compex (a US-based Taiwanese/Singapore manufacturer) sells a PCI 10/100 Network card which uses the Digital 21143-PD chip.
If this is the chip you're referring to then the Name & part number is: Compex Freedomline 100 FL100TX-PCI Their web address is www.cpx.com Nick Vertes / Sales & Mkting Dir Intellicorp Australia Pty Ltd Level 5, 107 Walker St North Sydney NSW 2060 Australia Tel: 61 2 9922 6466 Fax: 61 2 9922 6465 Mobile 0413 059 959 nvertes@intellicorp.com.au > ----- Original Message ----- > From: Mark McCoy > To: beowulf mailing list > Sent: Wednesday, June 02, 1999 11:12 AM > Subject: Network cards availability (again) > > > > Hi, > > Does anyone on the list know where I can get a list of network cards that > work > > under RedHat 6? > > I know about the 3com and tulip-based cards, but what about any others. > > RedHat's support site has the hardware list for Intel, but not for Alpha > (I've > > already sent them an e-mail about it). I need this to find out what ISA > cards > > are supported (10Mbit is fine for this card) since we will use up all of > our PCI > > slots in the master machine. > > > > Ideally, we want to use all tulip-based cards. If anyone knows where we > can get > > 16 DEC-based cards, _pleeeaaasse_ let me know. I have a lead on some > cards made > > by TrendNet. Does anyone know if these are any good? I have not even > heard of > > them. > > > > Thanks in advance! > > -- > > Mark McCoy -- Proud to run Linux since February 1996 > > Systems Administrator - Cajun Brothers Technology, llc > > The views in this message do not necessarily reflect the views of my > employer > > This message posted from snowdog, a 100% MS-free machine. > > > From bob@drzyzgula.org Wed, 2 Jun 1999 21:44:55 -0400 Date: Wed, 2 Jun 1999 21:44:55 -0400 From: Bob Drzyzgula bob@drzyzgula.org Subject: Sun HPC software goes "Community Source" FYI, in case y'all haven't seen it... 
http://www.sun.com/servers/hpc/communitysource/ -- ============================================================ Bob Drzyzgula It's not a problem bob@drzyzgula.org until something bad happens ============================================================ From dylan@aero.org Wed, 2 Jun 1999 21:45:17 -0400 Date: Wed, 2 Jun 1999 21:45:17 -0400 From: Dylan A. Loomis dylan@aero.org Subject: SSH and clusters --Fig2xvG2VGoz8o/s Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: quoted-printable On Wed, Jun 02, 1999 at 01:42:33AM -0700, Philip Juels wrote: > Sometimes my users simply want to run a batch process on any given node > within our cluster as opposed to true parallel processing. So they use > ssh to access the master node of our cluster and then rlogin or telnet > to access the clients from the master (the client nodes are on an > isolated intranet with the master acting as gatekeeper). Is this > insecure? Should we run ssh for connections withing the cluster? My > understanding of ssh is that it's like a secure pipe...anything on top > of it should be encrypted. > > Thanks, > > Philip Juels > philip_juels@harvard.edu Philip, the short answer to your questions is yes this is secure, the long answer is that it depends on where you want your security. If you are primarily worried about people sniffing traffic destined from the outside, passing through your Gatekeeper to the clustered machines, then using ssh to connect to the Gatekeeper and using rsh from there (Gatekeeper to Compute node) will be fine. In this case data is encrypted until the Gatekeeper, then within the private network it is sent cleartext, so as long as you trust your compute nodes this is fine.
So the person connects:

             -Encrypted-
Outside Host --- SSH --- Gatekeeper
             -Encrypted-

Then once they have ssh'd to the Gatekeeper they rsh to Compute node:

             -Encrypted-            -Cleartext-
Outside Host --- SSH --- Gatekeeper --- rsh --- Compute node
             -Encrypted-            -Cleartext-

Only the Gatekeeper to Compute node traffic is vulnerable; the traffic from the Outside host to the Gatekeeper stays encrypted. So unless someone is sniffing on either the Gatekeeper, or one of the Compute nodes, you're fine. hope that helps -DAL- -- Dylan A. Loomis Computer Systems Research Department The Aerospace Corporation e-mail: dylan@aero.org phone: (310) 336-2449 PGP Key fingerprint = 55 DE BB DD 34 10 CD 20 72 79 88 FE 02 0E 21 3A PGP 2.6.2 key available upon request --Fig2xvG2VGoz8o/s Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: 2.6.2 iQCVAwUBN1XeEezCrQVfDVWRAQH9vAP/bIxHJiXF7PM4dmNfELVRTnR/21xqCUqE kCHwI5uLptgDmOyOudurMWsg7wO855rGqyjrGDiJO32MNcNEikePQAKPVmu3r4ht JI7uHcpwAHVsCu+XclKc9t1++ZHgr42pPXfOCC9ICiE553H0wVqwHVEKdMfeYwyc XF4Hr9Qm/6I= =xqVg -----END PGP SIGNATURE----- --Fig2xvG2VGoz8o/s-- From dhart@indiana.edu Thu, 3 Jun 1999 00:33:30 -0400 Date: Thu, 3 Jun 1999 00:33:30 -0400 From: Dave Hart dhart@indiana.edu Subject: SSH and clusters At 06:47 PM 6/2/1999 -0400, Robert G. Brown wrote: >. . . I don't think there is any really good reason not to run ssh >inside the cluster anyway. . . . I ran NAS Parallel Benchmarks with rsh and with ssh and found less than 1% difference, IIRC. And since ssh is SOP . . . -- David Hart http://php.indiana.edu/~dhart Research Computing Support 812-855-2632 University Information Technology Services Indiana University From dhart@indiana.edu Thu, 3 Jun 1999 00:33:29 -0400 Date: Thu, 3 Jun 1999 00:33:29 -0400 From: Dave Hart dhart@indiana.edu Subject: PVM or MPI essential to run parallel applications on a Beowulf?
At 01:24 PM 6/2/1999 -0800, Curt wrote: >The simplest, but not necessarily most optimal, way to handle this is to >treat a dual processor node as if it were 2 nodes. No special >programming is required and the benefits are available immediately. Thanks, I did try that [2 MPI processes on same system]. Since the OS shifts a lot of work to second processor when the first is pegged, bringing in a second MPI process is not so great [I don't expect it to be]. I have had some suggestions from the Portland Group [to try another time]. - Dave From pesch@ibm.net Thu, 3 Jun 1999 00:51:05 -0400 Date: Thu, 3 Jun 1999 00:51:05 -0400 From: Paul Eduard Schenker pesch@ibm.net Subject: Cox report on Chinese spy activities and Beowulf DOn't be sorry, it was interesting anyhow... At 06:59 PM 6/2/99 -0400, Robert G. Brown wrote: >On Fri, 28 May 1999, texelsoft wrote: > >> Whoa RGB you're getting carried away by a perfectly good rant. > >Oops, sorry list folks, especially Alan Cox and others who complained. >I meant to hit the other "r"; I agree that it is time to take the >discussion offline (or let it die a natural death:-) but was catching up >on a bunch of mail fast and mis-hit. > > rgb > >Robert G. Brown http://www.phy.duke.edu/~rgb/ >Duke University Dept. of Physics, Box 90305 >Durham, N.C. 27708-0305 >Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu > > > > > Paul Eduard Schenker 1 Peirce Hill Singapore 248558 Phone: 476 2245 Fax: 472 6480 email: pesch@ibm.net From daniel.pfenniger@obs.unige.ch Thu, 3 Jun 1999 04:38:13 -0400 Date: Thu, 3 Jun 1999 04:38:13 -0400 From: Daniel Pfenniger daniel.pfenniger@obs.unige.ch Subject: keeping nodes in synch Jacek Radajewski wrote: > > rsync is good as well. rsync will only copy the changes. Yes, and also on option delete the deleted files. One then gets a true copy of a disk/directory with a minimum of traffic. In our Beo cluster we use rsync to backup the two master nodes on *spare* hard disks. 
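As a rough illustration of the rsync mirroring described here (copy only what changed and, with the delete option, also remove files that no longer exist on the source), a minimal Python sketch might build the command as follows. The paths and the helper name `mirror_command` are hypothetical; `-a`, `--delete` and `--dry-run` are standard rsync options, but the exact invocation used on the cluster is not given in the message.

```python
import shlex


def mirror_command(source_dir, spare_dir, dry_run=False):
    """Build an rsync argument list that mirrors source_dir onto spare_dir.

    Hypothetical helper for illustration only; it builds the command
    and does not run it.
    """
    # A trailing slash on the source tells rsync to copy the directory's
    # contents rather than the directory itself.
    src = source_dir.rstrip("/") + "/"
    argv = ["rsync", "-a", "--delete"]  # archive mode, delete extraneous files
    if dry_run:
        argv.append("--dry-run")  # only report what would be transferred
    argv += [src, spare_dir]
    return argv


if __name__ == "__main__":
    # Assumed example paths: a master node's /home mirrored to a spare disk.
    print(shlex.join(mirror_command("/home", "/mnt/spare/home", dry_run=True)))
```

Running with the dry-run option first is a cheap way to check what `--delete` would remove before letting it touch the spare disk.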
This is much cheaper than a tape backup. Daniel Pfenniger From john.hearns@framestore.co.uk Thu, 3 Jun 1999 06:03:30 -0400 Date: Thu, 3 Jun 1999 06:03:30 -0400 From: John Hearns john.hearns@framestore.co.uk Subject: two NICs - channel bonding - tradeoff > Of course, an alternative to using a dedicated network would be to modify > the networking code in the kernel to provide priority access to your special > traffic - of course getting switches to do that would be much more difficult. > This would make a great project for an MS student - mucking around in the > kernel to implement this. > Anyway, lots of issues to explore. Please do, and then let us know what you > find. > > Walt > > I could imagine an application where one has two > > kinds of messages. One kind is a short message but > > highly 'urgent'. Only after the message was delivered > > can the calculation continue. For example a > > synchronization message. And then another type of > > message which is higher in volume but is not > > time critical. Perhaps I shouldn't raise myself above the parapet here, but as an ATM person, that's the sort of thing an ATM network might be able to do - QoS for different traffic types. Some thoughts off the top of my head would be using PVCs for this 'priority' traffic, and having normal IP traffic via Classical IP or LANE. Also could investigate multicasting for moving bulk data to all machines. (Quickly ducks head back below parapet before the flames start leaping). -------------------OOOOOOOOOOOOOO----------------- John Hearns Systems Engineer FrameStore http://www.framestore.co.uk Tel 0171-344-8910 0171-208-2626 Fax -------------------OOOOOOOOOOOOOO----------------- From rgb@phy.duke.edu Thu, 3 Jun 1999 06:23:50 -0400 Date: Thu, 3 Jun 1999 06:23:50 -0400 From: Robert G. Brown rgb@phy.duke.edu Subject: SSH and clusters On Wed, 2 Jun 1999, Dave Hart wrote: > At 06:47 PM 6/2/1999 -0400, Robert G. Brown wrote: > > >. . . 
I don't think there is any really good reason not to run ssh > >inside the cluster anyway. . . . > > I ran NAS Parallel Benchmarks with rsh and with ssh and found less > than 1% difference, IIRC. And since ssh is SOP . . . Well, there you go then...real numbers and not my opinion. Death to the Infidel rsh! Long live ssh! Seriously, our University is being portscanned and probed literally two or three times a week. Most of the documented breakins that have succeeded in our well-managed department have occurred because of offsite passwd traps -- grad students or faculty telnetting or rlogin'ing (rlogging in?:-) back to the department from an insecure site, perhaps while on summer break or at a conference. We're trying to figure out how to make ssh use MANDATORY, and the best way is to simply stop, cease, desist in using either telnetd or rshd or ftpd in favor of sshd. The only obstacle we face is a lack of universally available ssh clients, partly due to US export restrictions (and those restrictions themselves); hopefully at least the first problem will evaporate in 6 more months when RSA becomes public domain for real. rgb Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu From kodym@mit.jyu.fi Thu, 3 Jun 1999 06:29:34 -0400 Date: Thu, 3 Jun 1999 06:29:34 -0400 From: Petr Ladislav Kodym kodym@mit.jyu.fi Subject: New class of Beowulf clusters ? (Re: two NICs - channel bonding - tradeoff) On Wed, 2 Jun 1999, Walter B. Ligon III wrote: > >Generally I think there are a number of interesting issues in using multiple >networks. We have experimented with using one bus and one switched network >(back when switches were still pricey). Hi All, What I'm really interested in is coupling of fast ethernet with PAPERS "network". 
-------------------------------------------------------------------- PAPERS, Purdue's Adapter for Parallel Execution and Rapid Synchronization, is custom hardware that allows a cluster of unmodified PCs and/or workstations to function as a fine-grain parallel computer capable of MIMD, SIMD, and VLIW execution. The total time taken to perform a typical barrier synchronization using PAPERS is about 3 microseconds, including all hardware and software overhead; this is several orders of magnitude faster than using conventional networks, and is even faster than most commercial parallel supercomputers. A wide range of aggregate communication operations are also supported with comparable efficiency. Despite this performance, the public-domain PAPERS designs are less expensive than most conventional networks and are scalable to very large clusters. http://garage.ecn.purdue.edu/~papers/ ------------------------------------------------------------------- Folks in CSC, Helsinki, Finland have measured 312 barriers/sec for 32 processes running on 16 dual Pentium II 400 MHz machines connected by switched fast ethernet (under MPI over TCP/IP). It clearly limits usability of such a cluster only to (very) coarse grain parallel programs. PAPERS can handle 330 000 barriers/sec, that's 3 orders of magnitude more. They can also transfer short messages with very low latency. (PAPERS is a small chunk of basic TTL logic circuits which is connected to the parallel port of a PC.) It would push Beowulf clusters to completely new application areas. Has any of you ever tried this PAPERS thing? Can you give any comments? ** Does someone know how many barriers can be done over MYRINET or ** ** using U-Net ? ** Is anyone running code that is bottlenecked by the need for frequent synchronization, and therefore possibly interested in trying PAPERS?
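To put the two barrier rates quoted above side by side, here is a small Python sketch of the arithmetic. The function names and the 1 ms compute grain in the usage example are illustrative assumptions, not measurements from either system; only the two rates come from the message.

```python
import math

ETHERNET_RATE = 312.0      # barriers/sec, MPI over TCP/IP on fast ethernet (from the message)
PAPERS_RATE = 330_000.0    # barriers/sec quoted for PAPERS (from the message)


def barrier_cost_seconds(barriers_per_second):
    """Average time consumed by one barrier at the given sustained rate."""
    return 1.0 / barriers_per_second


def sync_fraction(barriers_per_second, compute_seconds_between_barriers):
    """Fraction of wall-clock time spent in barriers.

    Assumes a simple loop: compute for a fixed time, then do one barrier.
    """
    cost = barrier_cost_seconds(barriers_per_second)
    return cost / (cost + compute_seconds_between_barriers)


if __name__ == "__main__":
    grain = 1e-3  # assumed compute time between barriers: 1 ms (hypothetical)
    for name, rate in (("fast ethernet", ETHERNET_RATE), ("PAPERS", PAPERS_RATE)):
        print(f"{name}: {barrier_cost_seconds(rate) * 1e6:.1f} us per barrier, "
              f"{sync_fraction(rate, grain):.1%} of time in barriers")
    ratio = PAPERS_RATE / ETHERNET_RATE
    print(f"rate ratio: {ratio:.0f}x, about {math.log10(ratio):.1f} orders of magnitude")
```

With a fine-grained loop like the assumed one, the ethernet cluster would spend most of its time synchronizing, which is the sense in which the measured rate limits such a cluster to coarse-grain programs.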
Petr From wasshub@spdc.ti.com Thu, 3 Jun 1999 08:21:01 -0400 Date: Thu, 3 Jun 1999 08:21:01 -0400 From: Christoph Wasshuber wasshub@spdc.ti.com Subject: hard disk reliability Some days ago someone mentioned that one of the big benefits of running a diskless cluster is the increased reliability. Hard disks are the most unreliable part in PCs. Does anybody have manufacturer numbers like MTBF (mean time between failure)? I would also be interested in comments from people running beowulfs with 100 or more nodes, where every node has a hard disk. Do you guys exchange a hard disk every month? Or even every week? How serious is the hard disk reliability issue in reality? Chris.... From rbross@parl.ces.clemson.edu Thu, 3 Jun 1999 10:05:16 -0400 Date: Thu, 3 Jun 1999 10:05:16 -0400 From: Rob Ross rbross@parl.ces.clemson.edu Subject: hard disk reliability Actually, I have found that power supplies have been the least reliable components of our systems. Rob Ross Parallel Architecture Research Lab, Clemson University On Thu, 3 Jun 1999, Christoph Wasshuber wrote: > Some days ago someone mentioned that one of > the big benefits of running a diskless cluster > is the increased reliability. Hard disks are > the most unreliable part in PCs. Does anybody > have manufacturer numbers like MTBF (mean time > between failure)? > > I would also be interested in comments from > people running beowulfs with 100 or more > nodes, where every node has a hard disk. Do > you guys exchange a hard disk every month? > Or even every week? > > How serious is the hard disk reliability issue > in reality? > > Chris.... 
From scm@tcdi.com Thu, 3 Jun 1999 10:49:19 -0400 Date: Thu, 3 Jun 1999 10:49:19 -0400 From: Shawn Masters scm@tcdi.com Subject: hard disk reliability -----Original Message----- From: Christoph Wasshuber To: beowulf Date: Thu, 3 Jun 1999 10:49:19 -0400 Subject: hard disk reliability >Some days ago someone mentioned that one of >the big benefits of running a diskless cluster >is the increased reliability. Hard disks are >the most unreliable part in PCs. Does anybody >have manufacturer numbers like MTBF (mean time >between failure)? Seagate posts all the MTBFs for their drives last I checked (about 6 months ago). Quantum gave MTBF for some, and another number that could be used to derive the MTBF for others. The Seagate drives range from 300,000 hours MTBF on some of the older drives to 1,000,000 on some of the newer ones, with quite a few in the 800,000 range. My measured numbers on the Quantums are about 250,000 for the Bigfoot 6.4 gig, and 320,000 on the same-sized Fireball. This is with sample sizes of 120 and 24 respectively. >I would also be interested in comments from >people running beowulfs with 100 or more >nodes, where every node has a hard disk. Do >you guys exchange a hard disk every month? >Or even every week? With the low end drives and that number you will be replacing a drive every few months. We have experienced a MTBF of about 60 days when cooling wasn't adequate (note there were eight drives in each system), but some simple fixes brought it up to about 90 before we finished with the array. >How serious is the hard disk reliability issue >in reality? With a hundred drives you will notice the MTBF, even under perfect conditions. Choose your drives based on the cost of losing one at the calculated frequency, and that will tell you what you can afford. If 90 days between node losses is acceptable then you can buy cheap. If runs need to be over 180 days then you need to look at higher MTBFs (in the 600,000+ range).
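The replacement intervals discussed above follow from simple arithmetic if one assumes, as a simplification, that drives fail independently at a constant rate, so that a population of N drives sees a failure roughly every MTBF/N hours. A minimal Python sketch under that assumption, with hypothetical function names and MTBF taken to be in hours:

```python
HOURS_PER_DAY = 24.0


def days_between_failures(mtbf_hours, drive_count):
    """Expected days between drive failures across the whole population.

    Assumes independent drives with a constant failure rate, which
    ignores infant mortality, wear-out and shared causes such as
    poor cooling.
    """
    if drive_count <= 0:
        raise ValueError("drive_count must be positive")
    return mtbf_hours / drive_count / HOURS_PER_DAY


def mtbf_needed(drive_count, target_days):
    """Per-drive MTBF (hours) needed to average target_days between failures."""
    return target_days * HOURS_PER_DAY * drive_count


if __name__ == "__main__":
    drives = 100  # cluster size raised in the question
    for mtbf in (250_000, 300_000, 600_000, 1_000_000):
        print(f"MTBF {mtbf:>9,} h, {drives} drives: "
              f"one failure every {days_between_failures(mtbf, drives):.0f} days")
    print(f"MTBF needed for 180 days between failures: {mtbf_needed(drives, 180):,.0f} h")
```

These are averages over many drives, so any single interval between failures can be much shorter or longer than the figure printed.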
Overall the price difference isn't as much as when I did these arrays. 73, From deadline@plogic.com Thu, 3 Jun 1999 11:04:00 -0400 Date: Thu, 3 Jun 1999 11:04:00 -0400 From: Douglas Eadline deadline@plogic.com Subject: hard disk reliability On Thu, 3 Jun 1999, Rob Ross wrote: I would agree. Here is the order of failures/problems after a system is burned in (this is what we have seen):
1. power supplies (general failures)
2. hard drives (due to shipping)
3. hard drives (general failures)
4. motherboards failing
5. NICs going haywire
6. Cable problems
7. Switch problems
Building systems we have seen (in order of occurrence):
1. bad SDRAM (far more than we care to think about)
2. bad IDE drives
3. bad Motherboards
4. bad SCSI cables, floppies, NICs
BTW: we have found that early PII-400s had problems with Linux SMP. After eliminating everything else, we found that replacing the CPUs (with PIII-450) solved the problem. The problem included random crashes, wrong answers, and stalled MPI/PVM runs. It only happened when the system was under high load with lots of interrupts. Goes away if the FSB is set to 66MHz. Doug > Actually, I have found that power supplies have been the least reliable > components of our systems. > > Rob Ross > Parallel Architecture Research Lab, Clemson University > > On Thu, 3 Jun 1999, Christoph Wasshuber wrote: > > > Some days ago someone mentioned that one of > > the big benefits of running a diskless cluster > > is the increased reliability. Hard disks are > > the most unreliable part in PCs. Does anybody > > have manufacturer numbers like MTBF (mean time > > between failure)? > > > > I would also be interested in comments from > > people running beowulfs with 100 or more > > nodes, where every node has a hard disk. Do > > you guys exchange a hard disk every month? > > Or even every week? > > > > How serious is the hard disk reliability issue > > in reality? > > > > Chris....
> ------------------------------------------------------------------- Paralogic, Inc. | PEAK | Voice:+610.861.6960 115 Research Drive | PARALLEL | Fax:+610.861.8247 Bethlehem, PA 18017 USA | PERFORMANCE | http://www.plogic.com ------------------------------------------------------------------- From JesseP@europe.stortek.com Thu, 3 Jun 1999 11:02:54 -0400 Date: Thu, 3 Jun 1999 11:02:54 -0400 From: Jessen, Per JesseP@europe.stortek.com Subject: hard disk reliability > -----Original Message----- > From: Christoph Wasshuber [mailto:wasshub@spdc.ti.com] > Sent: 03 June 1999 13:16 [snip] > is the increased reliability. Hard disks are > the most unreliable part in PCs. Does anybody > have manufacturer numbers like MTBF (mean time > between failure)? [snip] checkout www.quantum.com, www.seagate.com etc. Last time I looked, they qouted MTBF, MTTR etc on the individual harddrive datasheets. regards, Per Jessen, ENIDAN Technologies, London From kragen@pobox.com Thu, 3 Jun 1999 11:06:02 -0400 Date: Thu, 3 Jun 1999 11:06:02 -0400 From: Kragen Sitaker kragen@pobox.com Subject: New class of Beowulf clusters ? Someone writes: > What I'm really interested in is coupling of fast ethernet with PAPERS > "network". The easiest way to couple FE with PAPERS is probably to put some PCs on the FE with their parallel ports attached to the PAPERS device. But since every PC has a parallel port anyway, and since most PCs in clusters don't use them, you should just use the ordinary parallel port instead. -- Kragen Sitaker TurboLinux is outselling NT in Japan's retail software market 10 to 1, so I hear. -- http://www.performancecomputing.com/opinions/unixriot/981218.shtml From lindahl@cs.virginia.edu Thu, 3 Jun 1999 11:15:22 -0400 Date: Thu, 3 Jun 1999 11:15:22 -0400 From: Greg Lindahl lindahl@cs.virginia.edu Subject: hard disk reliability > I would also be interested in comments from > people running beowulfs with 100 or more > nodes, where every node has a hard disk. 
Do > you guys exchange a hard disk every month? > Or even every week? In my previous life, I had a steady failure rate of 1 per 400 disks per month. In my current life I have 400 disks and I've only had 1 failure in a year. 0 power supplies, 4 case fans, but 3 from the same batch. -- g From joelja@darkwing.uoregon.edu Thu, 3 Jun 1999 11:45:33 -0400 Date: Thu, 3 Jun 1999 11:45:33 -0400 From: Joel Jaeggli joelja@darkwing.uoregon.edu Subject: hard disk reliability On a 13 node cluster 2 bad power supplies one dead disk(western digital 2gb ide) in a year and a half. On Thu, 3 Jun 1999, Christoph Wasshuber wrote: > Some days ago someone mentioned that one of > the big benefits of running a diskless cluster > is the increased reliability. Hard disks are > the most unreliable part in PCs. Does anybody > have manufacturer numbers like MTBF (mean time > between failure)? > > I would also be interested in comments from > people running beowulfs with 100 or more > nodes, where every node has a hard disk. Do > you guys exchange a hard disk every month? > Or even every week? > > How serious is the hard disk reliability issue > in reality? Probably not a serious as the the cheapo chinese power supply reliability issue. > Chris.... > -------------------------------------------------------------------------- Joel Jaeggli joelja@darkwing.uoregon.edu Academic User Services consult@gladstone.uoregon.edu PGP Key Fingerprint: 1DE9 8FCA 51FB 4195 B42A 9C32 A30D 121E -------------------------------------------------------------------------- It is clear that the arm of criticism cannot replace the criticism of arms. Karl Marx -- Introduction to the critique of Hegel's Philosophy of the right, 1843. From mccsnrw@dirac.phy.umist.ac.uk Thu, 3 Jun 1999 12:12:53 -0400 Date: Thu, 3 Jun 1999 12:12:53 -0400 From: Niels R. Walet mccsnrw@dirac.phy.umist.ac.uk Subject: hard disk reliability I concur: on an eight node dual cluster 2 motherboard failures, no other problems.... Niels -- Dr Niels R. 
Walet http://www.phy.umist.ac.uk/Theory/people/walet.html Dept. of Physics, UMIST, P.O. Box 88, Manchester, M60 1QD, U.K. Phone: +44(0)161-2003693 Fax: +44(0)161-2004303 Niels.Walet@umist.ac.uk From lindahl@cs.virginia.edu Thu, 3 Jun 1999 12:19:24 -0400 Date: Thu, 3 Jun 1999 12:19:24 -0400 From: Greg Lindahl lindahl@cs.virginia.edu Subject: New class of Beowulf clusters ? (Re: two NICs - channel bonding - > Folks in CSC, Helsinki, Finland have measured 312 barriers/sec for 32 > processes running on 16 dual PentiumsII 400Mhz connected by switched > fast ethernet (under MPI over TCP/IP). It clearly limits usability of > such a cluster only to (very) coarse grain parallel programs. Tweet! The overgeneralization police have given you a ticket. PAPERS does one thing really well: global barriers and broadcasts. That's it. The benchmarks I've been doing all involve algorithms which would not be sped up by PAPERS. Most of them involve either nearest neighbor communication, or all-to-all, where each process receives different information. It would be nice if someone who did have an algorithm which is helped by PAPERS integrated it with MPICH. That would make it easier for the rest of us to experiment with it. But it isn't a magic bullet. > ** Does someone know, how many barriers can by done over MYRINET or ** > ** using U-Net ? ** Probably about only 5x as many possible with TCP/IP. However, you could hack some features into your myrinet driver to directly support broadcasts, and that would probably speed it up by quite a bit. The fastest possible would probably be around 10,000 per second, which is far slower than PAPERS. -- g From jdm2d@cs.virginia.edu Thu, 3 Jun 1999 14:03:36 -0400 Date: Thu, 3 Jun 1999 14:03:36 -0400 From: Justin Moore jdm2d@cs.virginia.edu Subject: hard disk reliability Hello, I'd be interested to know if anyone has any of the newer, high-end SCSI drives in the 36 - 47 GB range, and what kind of reliability they've seen in of those. 
The kind of application(s) I have in mind will be very very heavy on I/O, so these drives definitely will be pounded around the clock. Also what kind of experience have people had with different RAID levels, mainly RAID 5 and 10? I've gotten different definitions of what level 10 is exactly. The definition from the Quantum page is disk striping with mirroring, so the failure of two disks is unlikely to kill you unless they're the correct (or incorrect, I guess) two. Given the reliability of the SCSI drives, would there be a financial advantage in using this if the data was absolutely irreplaceable? Thanks. -Justin Moore From mack.joseph@epa.gov Thu, 3 Jun 1999 14:16:43 -0400 Date: Thu, 3 Jun 1999 14:16:43 -0400 From: Joseph Mack mack.joseph@epa.gov Subject: hard disk reliability Christoph Wasshuber wrote: > Hard disks are > the most unreliable part in PCs. Does anybody > have manufacturer numbers like MTBF (mean time > between failure)? I was at a talk by a systems integrator, where he said that if the same procedures were used for determining MTBF for humans as for disks, the human lifetime would be about 2000 yrs. The young people who die from infectious diseases, and car accidents are clearly not representative of a normal working adult and the old people who've retired and suffering from degenerative diseases can be similarly discounted. Thus an expected lifespan of about 2000yrs for healthy adults. Joe -- Joseph Mack PhD, Senior Systems Engineer, Lockheed Martin contractor to the National Environmental Supercomputer Center, mailto:mack.joseph@epa.gov ph# 919-541-0007, RTP, NC, USA From enano@ceu.fi.udc.es Thu, 3 Jun 1999 14:46:47 -0400 Date: Thu, 3 Jun 1999 14:46:47 -0400 From: Miguel Barreiro Paz enano@ceu.fi.udc.es Subject: hard disk reliability > > How serious is the hard disk reliability issue > > in reality? > > Probably not a serious as the the cheapo chinese power supply reliability > issue. 
24 node cluster running for almost one year, 3 fans dead (1 CPU fan and 2 power supply ones) and no disks dead. Regards, From lindahl@cs.virginia.edu Thu, 3 Jun 1999 15:13:45 -0400 Date: Thu, 3 Jun 1999 15:13:45 -0400 From: Greg Lindahl lindahl@cs.virginia.edu Subject: hard disk reliability > Also what kind of experience have people had with different RAID > levels, mainly RAID 5 and 10? I've gotten different definitions of what > level 10 is exactly. Quantum's definition of "level 10" (actually 1+0) is correct. I've never seen any other definition, but I've seen some really poor explanations. > Given the > reliability of the SCSI drives, would there be a financial advantage in > using this if the data was absolutely irreplaceable? That's not necessarily why you use RAID. You can use it to decrease your downtime (redundancy), and to up your performance (striping). If you have a shop which uses 1,000 disks, and you're willing to spend money to avoid downtime, it doesn't matter if the data is "absolutely irreplaceable"... -- g From jdm2d@cs.virginia.edu Thu, 3 Jun 1999 15:20:28 -0400 Date: Thu, 3 Jun 1999 15:20:28 -0400 From: Justin Moore jdm2d@cs.virginia.edu Subject: hard disk reliability > > Given the > > reliability of the SCSI drives, would there be a financial advantage in > > using this if the data was absolutely irreplaceable? > > That's not necessarily why you use RAID. You can use it to decrease > your downtime (redundancy), and to up your performance (striping). If > you have a shop which uses 1,000 disks, and you're willing to spend > money to avoid downtime, it doesn't matter if the data is "absolutely > irreplaceable"... Oops, forgot to clarify. By 'this', I meant using RAID 10 vs using RAID 5. If only English classes hadn't been so conducive to sleeping ... -Justin Moore From carlo@carlo.org Thu, 3 Jun 1999 15:33:17 -0400 Date: Thu, 3 Jun 1999 15:33:17 -0400 From: Carlo Perassi carlo@carlo.org Subject: Proxy Server? Hi all.
I've a friend: he is an engineer and he works for an Italian ISP. One day I was chatting with him about the Beowulf. Suddenly he said: "Can I use a Beowulf to run a proxy server? Why don't you search for something about it?" Can you help me to find the answers to the following questions? a) Is it possible to use a Beowulf to run a proxy server? On the other hand, what's the best way to realize a "Super" proxy server? b) If the answer to the previous question is positive, where could I find more information? Thank you. -- Carlo Perassi -- From lindahl@cs.virginia.edu Thu, 3 Jun 1999 15:34:28 -0400 Date: Thu, 3 Jun 1999 15:34:28 -0400 From: Greg Lindahl lindahl@cs.virginia.edu Subject: hard disk reliability > > > Given the > > > reliability of the SCSI drives, would there be a financial advantage in > > > using this if the data was absolutely irreplaceable? [...] > Oops, forgot to clarify. By 'this', I meant using RAID 10 vs using RAID > 5. If only english classes hadn't been so conducive to sleeping ... The reason to use RAID 10 over RAID 5 is usually performance. The two have reasonably similar redundancy characteristics -- the odds of losing a second drive before the first can be repaired are slim, unless the problem is really a power supply or cable etc. -- g From scm@tcdi.com Thu, 3 Jun 1999 16:12:41 -0400 Date: Thu, 3 Jun 1999 16:12:41 -0400 From: Shawn Masters scm@tcdi.com Subject: hard disk reliability -----Original Message----- From: Joseph Mack To: Christoph Wasshuber Cc: beowulf Date: Thu, 3 Jun 1999 16:12:41 -0400 Subject: Re: hard disk reliability >Christoph Wasshuber wrote: > >> Hard disks are >> the most unreliable part in PCs. Does anybody >> have manufacturer numbers like MTBF (mean time >> between failure)? I would be surprised if hard drives are the most unreliable part. Modern hard drives (in the last year) seem to be more reliable than all the stock CPU fans I've seen.
> >I was at a talk by a systems integrator, where he >said that if the same procedures were used for >determining MTBF for humans as for disks, the >human lifetime would be about 2000 yrs. The young >people who die from infectious diseases, and >car accidents are clearly not representative >of a normal working adult and the old people who've >retired and suffering from degenerative diseases >can be similarly discounted. Thus an expected >lifespan of about 2000yrs for healthy adults. > I've actually worked with large samples of hard drives in both PCs and Suns, and have found the MTBFs to be accurate for prediction of failures for large sample sizes. Granted I don't expect a 1,000,000 MTBF drive to run for 114 years, but I do expect to replace one in a 1000 drive installation every 41 days (give or take a few days). I can think of a few of my clients that have that enough drives on premises to see something like this. 73, Shawn From dmerchan@hiwaay.net Thu, 3 Jun 1999 17:05:10 -0400 Date: Thu, 3 Jun 1999 17:05:10 -0400 From: dmanddmer dmerchan@hiwaay.net Subject: hard disk reliability Todays hard drives are rated with MTTR's and MTBF's in the 5000+ hours. In my career of 21 years in computers, the number 1 failure item that can have devastating consequences has been memory, #2 is hard drives. My .02 David Rob Ross wrote: > > Actually, I have found that power supplies have been the least reliable > components of our systems. > > Rob Ross > Parallel Architecture Research Lab, Clemson University > > On Thu, 3 Jun 1999, Christoph Wasshuber wrote: > > > Some days ago someone mentioned that one of > > the big benefits of running a diskless cluster > > is the increased reliability. Hard disks are > > the most unreliable part in PCs. Does anybody > > have manufacturer numbers like MTBF (mean time > > between failure)? > > > > I would also be interested in comments from > > people running beowulfs with 100 or more > > nodes, where every node has a hard disk. 
Do > > you guys exchange a hard disk every month? > > Or even every week? > > > > How serious is the hard disk reliability issue > > in reality? > > > > Chris.... From dmerchan@hiwaay.net Thu, 3 Jun 1999 17:07:55 -0400 Date: Thu, 3 Jun 1999 17:07:55 -0400 From: dmanddmer dmerchan@hiwaay.net Subject: [Fwd: [Fwd: FW: Fw: proposed email surcharges]] This is a multi-part message in MIME format. --------------DA3AEC61B1C8959BF1763236 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit --------------DA3AEC61B1C8959BF1763236 Content-Type: message/rfc822 Content-Transfer-Encoding: 7bit Content-Disposition: inline Received: from arnet.arn.net (arnet.arn.net [204.177.232.11]) by mail.HiWAAY.net (8.9.1a/8.9.0) with ESMTP id JAA02681; Thu, 3 Jun 1999 09:37:06 -0500 (CDT) Received: from arn.net (amarillo.bentleysauction.com [204.177.232.137]) by arnet.arn.net (8.9.3/8.9.3) with ESMTP id JAA00948; Thu, 3 Jun 1999 09:33:33 -0500 (CDT) Message-ID: <37569312.B4143CFA@arn.net> Date: Thu, 3 Jun 1999 17:07:55 -0400 From: "Neil C. Bentley, CAI" Organization: Bentley's & Associates, LLC X-Mailer: Mozilla 4.5 [en]C-DIAL (WinNT; I) X-Accept-Language: en MIME-Version: 1.0 Subject: [Fwd: FW: Fw: proposed email surcharges] Content-Type: multipart/mixed; boundary="------------425DE7C1917B31F4F257A674" X-Mozilla-Status2: 00000000 This is a multi-part message in MIME format. --------------425DE7C1917B31F4F257A674 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit -- Neil C. Bentley, CAI (800).841.4087 President (806).376.1121 Bentley's & Associates, L.L.C. 
(806).622.0085 fax --------------425DE7C1917B31F4F257A674 Content-Type: message/rfc822 Content-Transfer-Encoding: 7bit Content-Disposition: inline >From drewcat@fone.net Wed Jun 2 02:07:59 1999 Return-Path: Received: from www.fone.net (www.fone.net [206.168.68.2]) by arnet.arn.net (8.9.3/8.9.3) with ESMTP id CAA07706 for ; Wed, 2 Jun 1999 02:07:58 -0500 (CDT) Received: from drewcat (bijou16.fone.net [206.168.248.177]) by www.fone.net (8.8.4/8.7.1) with SMTP id BAA26811; Wed, 2 Jun 1999 01:07:42 -0600 (MDT) Received: from drewcat (bijou16.fone.net [206.168.248.177]) by www.fone.net (8.8.4/8.7.1) with SMTP id BAA26811; Wed, 2 Jun 1999 01:07:42 -0600 (MDT) From: "Lori" To: Subject: FW: Fw: proposed email surcharges Date: Thu, 3 Jun 1999 17:07:55 -0400 Message-ID: <000201beacc6$908cc800$b1f8a8ce@drewcat> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 (Normal) X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook 8.5, Build 4.71.2377.0 X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2014.211 Importance: Normal X-Mozilla-Status2: 00000000 > ><< > Read the following and please pass it on to everyone you know: > > > > Dear Internet Subscriber: > > > > Please read the following carefully if you intend to stay > > online and continue using email: > > > > The last few months have revealed an alarming trend in the > > Government of the United States attempting to quietly push through > > legislation that will affect your use of the Internet. Under proposed > > legislation the U.S.Postal Service will be attempting to bilk email > > users out of "alternate postage fees". Bill 602P will permit the > > Federal Govt to charge a 5 cent surcharge on every email delivered, by > > billing Internet Service Providers at source. > > > > The consumer would then be billed in turn by the ISP. Washington D.C. > > lawyer Richard Stepp is working without pay to prevent this > > legislation from becoming law. The U.S. 
Postal Service is claiming > > that lost revenue due to the proliferation of email is costing nearly > > $230,000,000 in revenue > > per year. You may have noticed their recent ad campaign "There is > > nothing like a letter". Since the average citizen received about 10 > > pieces of email per day in 1998, the cost to the typical individual > > would be an additional 50 cents per day, or over $180 dollars per > > year, above and beyond their regular Internet costs. Note that this > > would > > be money paid directly to the U.S. Postal Service for a service they > > do not even provide. The whole point of the Internet is democracy and > > non-interference. If the federal government is permitted to tamper > > with our liberties by adding a surcharge to email,who knows where it > > will end. You are already paying an exorbitant price for snail mail > > because of bureaucratic efficiency. It currently takes up to 6 days > > for a letter to be delivered from New York to Buffalo. If the U.S. > > Postal Service is allowed to tinker with email, it will mark the end > > of the "free" Internet in the United States. One congressman, Tony > > Schnell (R) has even suggested a "twenty to forty dollar per month > > surcharge on all Internet service" above and beyond the government's > > proposed email charges. Note that most of the major newspapers have > > ignored the story, the only exception being the Washingtonian, which > > called the idea of email surcharge "a useful concept whose time has > > come" (March 6, 1999 Editorial) Don't sit by and watch your freedoms > > erode away! > > > > > > Send this email to all Americans on your list and tell your friends > > and relatives to write to their congressman and say "No!" 
to Bill
> > 602P >>
>
--------------425DE7C1917B31F4F257A674--
--------------DA3AEC61B1C8959BF1763236--

From fabio.perroni@roma1.infn.it Thu, 3 Jun 1999 17:34:38 -0400
Date: Thu, 3 Jun 1999 17:34:38 -0400
From: Fabio Perroni fabio.perroni@roma1.infn.it
Subject: fh_verify: permission failure

I've just upgraded my small cluster (1 master (s1) + 4 nodes (s2--s5)) to RH6.0. My nodes are diskless: they boot from floppy and NFS-mount the master's disks (as usual).

When I turn a node on, after a little network activity, these messages appear on the master's console:

...
...
Jun 3 22:37:22 s1 kernel: fh_verify: log/wtmp permission failure, acc=2, error=30
Jun 3 22:37:22 s1 kernel: fh_verify: run/utmp permission failure, acc=2, error=30
Jun 3 22:37:22 s1 kernel: fh_verify: dev/tty6 permission failure, acc=8, error=30
Jun 3 22:37:22 s1 kernel: fh_verify: log/wtmp permission failure, acc=2, error=30
Jun 3 22:37:22 s1 kernel: fh_verify: log/wtmp permission failure, acc=a, error=30
Jun 3 22:37:22 s1 kernel: fh_verify: run/utmp permission failure, acc=2, error=30
Jun 3 22:37:22 s1 kernel: fh_verify: log/wtmp permission failure, acc=2, error=30
Jun 3 22:37:22 s1 kernel: fh_verify: log/wtmp permission failure, acc=a, error=30
...
...

After that the nodes respond to ping but not to telnet. Has anyone encountered this problem yet? Does anyone know what this message means?

Thanks in advance,
Fabio.

From abate@brahma.ticam.utexas.edu Thu, 3 Jun 1999 17:34:45 -0400
Date: Thu, 3 Jun 1999 17:34:45 -0400
From: Jason Abate abate@brahma.ticam.utexas.edu
Subject: PVM or MPI essential to run parallel applications on aBeowulf?

> >The simplest, but not necessarily most optimal, way to handle this is to
> >treat a dual processor node as if it were 2 nodes. No special
> >programming is required and the benefits are available immediately.
>
> Thanks, I did try that [2 MPI processes on same system].
Since the > OS shifts a lot of work to second processor when the first is pegged, > bringing in a second MPI process is not so great [I don't expect it > to be]. If you're using MPICH, you can make use of both shared-memory for passing messages between processes on a machine, and ethernet for messages to remote machines if you configure and compile with -comm=shared. Note that you need to setup a procgroup file, not just list each dual-processor machine twice. It took me a while to discover this, but it is crucial for good performance. Without specifying the procgroup, using both processors on a node was actually _slower_ than only using a single processor - not a good thing. If you need more details on how to setup the procgroup file, check the MPICH docs or email me. -jason ==================================================================== Jason Abate abate@ticam.utexas.edu www.ticam.utexas.edu/~abate Texas Institute for Computational and Applied Mathematics 304 SHC, University of Texas at Austin, Austin, TX 78712 Work: 512-471-6947 Home: 512-912-1012 Fax: 512-471-8694 From joelja@darkwing.uoregon.edu Thu, 3 Jun 1999 17:38:18 -0400 Date: Thu, 3 Jun 1999 17:38:18 -0400 From: Joel Jaeggli joelja@darkwing.uoregon.edu Subject: Proxy Server? On Thu, 3 Jun 1999, Carlo Perassi wrote: > Hi all. > I've a friend: he is an engineer and he works for an italian ISP. > One day I was chatting with him about the Beowulf. > Suddently he said: "Can I use a Beowulf to run a proxy server? Why don't you > search someting about it?" > Can you help me to find the answers to the following questions? > > a) Is it possible to use a Beowulf to run a proxy server? In other hand, what's > the best way to realize a "Super" proxy server? proxy services are quite ameniable to clustering, but don't require special libraries or programming techniques. 
Doing round-robin in DNS, or load balancing on your routers if you do transparent caching, is generally sufficient to distribute caching among multiple hosts. You can also do cache replication via multicast and all sorts of other tricks to divide the load between multiple machines. For the most part, though, caching is about having lots of RAM and disk; CPU is rarely a bottleneck.

> b) If the answer to the previous question is positive, where could I find more
> informations?

Cache hierarchies like NLANR are essentially large distributed computing projects, except that network I/O and storage space are more important than raw CPU.

> Thank you.
>
> --
> Carlo Perassi
> --
>

--------------------------------------------------------------------------
Joel Jaeggli joelja@darkwing.uoregon.edu
Academic User Services consult@gladstone.uoregon.edu
PGP Key Fingerprint: 1DE9 8FCA 51FB 4195 B42A 9C32 A30D 121E
--------------------------------------------------------------------------
It is clear that the arm of criticism cannot replace the criticism of
arms. Karl Marx -- Introduction to the critique of Hegel's Philosophy of
the right, 1843.

From cendsley@varesearch.com Thu, 3 Jun 1999 17:48:02 -0400
Date: Thu, 3 Jun 1999 17:48:02 -0400
From: cendsley@varesearch.com cendsley@varesearch.com
Subject: Channel-bonding and kernel 2.2.7

Has anyone ported the channel bonding patch for 2.0.30 to Linux kernel version 2.2.7?

---Christopher
cendsley@varesearch.com

From joelja@darkwing.uoregon.edu Thu, 3 Jun 1999 18:08:53 -0400
Date: Thu, 3 Jun 1999 18:08:53 -0400
From: Joel Jaeggli joelja@darkwing.uoregon.edu
Subject: [Fwd: [Fwd: FW: Fw: proposed email surcharges]]

This is an inappropriate forum for chain letters. In fact, I can't think of an appropriate one...
joelja -------------------------------------------------------------------------- Joel Jaeggli joelja@darkwing.uoregon.edu Academic User Services consult@gladstone.uoregon.edu PGP Key Fingerprint: 1DE9 8FCA 51FB 4195 B42A 9C32 A30D 121E -------------------------------------------------------------------------- It is clear that the arm of criticism cannot replace the criticism of arms. Karl Marx -- Introduction to the critique of Hegel's Philosophy of the right, 1843. From billm@troikanetworks.com Thu, 3 Jun 1999 18:28:40 -0400 Date: Thu, 3 Jun 1999 18:28:40 -0400 From: Bill Moshier billm@troikanetworks.com Subject: [Fwd: [Fwd: FW: Fw: proposed email surcharges]] I have searched the congressional site regarding pending bills, and other reports. I have found nothing on this - so unless you have specific information regarding the sponsors, and bill number, we should all consider this junk mail, and disregard it! Bill -----Original Message----- From: dmanddmer [mailto:dmerchan@hiwaay.net] Sent: Thursday, June 03, 1999 2:12 PM To: Andrew McCabe; Benjamin D. Steele; beowulflist; Donna Merchant; Eugene A Smith; F. Marc de Piolenc; James A Cox; Jason Smith; Jon Winters; margie bradley; P.R. Kunz; roger merchant; Thuan Nguyen Subject: [Fwd: [Fwd: FW: Fw: proposed email surcharges]] From tadavis@lbl.gov Thu, 3 Jun 1999 21:15:59 -0400 Date: Thu, 3 Jun 1999 21:15:59 -0400 From: Thomas Davis tadavis@lbl.gov Subject: Channel-bonding and kernel 2.2.7 cendsley@varesearch.com wrote: > > Has anyone ported the channel bonding patch for 2.0.30 > for Linux kernel version 2.2.7? yup. check the mailing list archive. -- ------------------------+-------------------------------------------------- Thomas Davis | PDSF Project Leader tadavis@lbl.gov | (510) 486-4524 | "Only a petabyte of data this year?" From dmerchan@hiwaay.net Thu, 3 Jun 1999 23:35:51 -0400 Date: Thu, 3 Jun 1999 23:35:51 -0400 From: dmanddmer dmerchan@hiwaay.net Subject: proposed email surcharges Oops! Sorry, folks. 
I did not intend to send it to the list. David Joel Jaeggli wrote: > > This is an inapropriate forum for chain letters. In fact, I can't think of > an appropriate one... > > joelja > > -------------------------------------------------------------------------- > Joel Jaeggli joelja@darkwing.uoregon.edu > Academic User Services consult@gladstone.uoregon.edu > PGP Key Fingerprint: 1DE9 8FCA 51FB 4195 B42A 9C32 A30D 121E > -------------------------------------------------------------------------- > It is clear that the arm of criticism cannot replace the criticism of > arms. Karl Marx -- Introduction to the critique of Hegel's Philosophy of > the right, 1843. From ClusterHack@snet.net Fri, 4 Jun 1999 02:22:24 -0400 Date: Fri, 4 Jun 1999 02:22:24 -0400 From: Bob Cat ClusterHack@snet.net Subject: hard disk reliability > I've actually worked with large samples of hard drives in both PCs and > Suns, and have found the MTBFs to be accurate for prediction of failures for > large sample sizes. Granted I don't expect a 1,000,000 MTBF drive to run > for 114 years, but I do expect to replace one in a 1000 drive installation > every 41 days (give or take a few days). Shawn Excellent answer, Shawn. And so: 1) MTBF or MTTR is calculated using the assumed probability of failure of the individual parts used in assembling the unit. They *do* tend to disregard infant mortality. This used to be known as a SWAG. Actually, it DOES come out pretty close to the true number. 2) You must divide MTBF by the number of units in a group to get the MTBF of *any* unit in a group. 3) PLAN FOR FAILURE!!!! It ALWAYS happens. You already knew that, didn't you? 4) Statistics are fun. 5) I think we'll find that power supply and CPU fans are the most common failure points. Why? Because they are usually cheaply made. Are there roller bearings in YOUR fans? I thought not. Real world, we save maybe 10-20 US$ per node by using cheap fans. Only you can determine if this is false economy for your purposes. 
6) There should be significant cost savings in both hardware and electricity using properly sized, well engineered, and well constructed power supplies and cooling systems to service multiple nodes. Does anyone have figures on the actual power/cooling reqs of a typical node? 7) This tends to get us away from commodity hardware, but what are we trying to accomplish, anyway? More bang for the buck, I say. Any other ideas on increasing bang/buck? :ßobÇat.Bat 1.0 >^^< In base(one half) an infinite number approaches unity. Echo f b800:0000 fff 32 00 e1 09 6f 0f 62 0f 80 04 61 0f 74 0f 32 00 > Bob.Cat Echo q >> Bob.Cat DeBug < Bob.Cat > Nul @Erase Bob.Cat > Nul From wasshub@spdc.ti.com Fri, 4 Jun 1999 08:17:32 -0400 Date: Fri, 4 Jun 1999 08:17:32 -0400 From: Christoph Wasshuber wasshub@spdc.ti.com Subject: barriers What are 'barriers'? Please enlighten me. Chris.... From scm@tcdi.com Fri, 4 Jun 1999 08:34:21 -0400 Date: Fri, 4 Jun 1999 08:34:21 -0400 From: Shawn Masters scm@tcdi.com Subject: hard disk reliability -----Original Message----- From: Bob Cat To: Shawn Masters Cc: beowulf Date: Fri, 4 Jun 1999 08:34:21 -0400 Subject: Re: hard disk reliability >> I've actually worked with large samples of hard drives in both PCs and >> Suns, and have found the MTBFs to be accurate for prediction of failures >for >> large sample sizes. Granted I don't expect a 1,000,000 MTBF drive to run >> for 114 years, but I do expect to replace one in a 1000 drive installation >> every 41 days (give or take a few days). Shawn > >Excellent answer, Shawn. And so: > >1) MTBF or MTTR is calculated using the assumed probability of failure of >the individual parts used in assembling the unit. They *do* tend to >disregard infant mortality. This used to be known as a SWAG. Actually, it >DOES come out pretty close to the true number. MTBF is a long term estimate, but can still be used to predict early failure on the bath tub curve. 
I don't remember the exact method from school, but I do remember that the conversion presented was nothing better than a good estimate. One of these estimating techniques will give a decent expectation of how many units will fail early on, given a certain level of quality (yes, subjective, but with experience apparently not hard to fudge).

73,
Shawn

From wasshub@spdc.ti.com Fri, 4 Jun 1999 08:39:10 -0400
Date: Fri, 4 Jun 1999 08:39:10 -0400
From: Christoph Wasshuber wasshub@spdc.ti.com
Subject: PAPERS

Where can I find more information on
PAPERS?

Chris....

From rssr@iota.lncc.br Fri, 4 Jun 1999 09:46:34 -0400
Date: Fri, 4 Jun 1999 09:46:34 -0400
From: rssr@iota.lncc.br rssr@iota.lncc.br
Subject: two NICs - channel bonding - tradeoff

Hi

We have a small cluster with 8 nodes and 2 FastEthernet NICs per node. Here are our numbers for the NAS benchmark, with and without channel bonding (CHB), using two hubs. From my point of view they are GOOD!!!! If anyone has any comments, please let me know. We are running Class B and C now.

----------------CG Class=A ---------------------
             without CHB           with CHB
          Time (s)   Mop/s     Time (s)   Mop/s
NP = 2     185.29     8.08      47.23     31.69
NP = 4     111.82    13.38      33.13     45.17
NP = 8     108.59    13.78      34.00     44.01

Renato

Renato Simoes Silva
Laboratorio Nacional de Computacao Cientifica LNCC-CNPq
------------------------------------------------------------------
Avenida Getulio Vargas, 333 Quitandinha- Petropolis - RJ
Brazil 25651-070
tel: + 55-24-233-6148 fax: + 55-24-233-6165
------------------------------------------------------------------
e-mail: rssr@lncc.br http://www.lncc.br/~rssr/

From jferg@2boot.com Fri, 4 Jun 1999 09:49:50 -0400
Date: Fri, 4 Jun 1999 09:49:50 -0400
From: jferg jferg@2boot.com
Subject: hard disk reliability

This is a multi-part message in MIME format.
--------------35501BDF87716B9287151DA3 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Justin Moore wrote: > Hello, > I'd be interested to know if anyone has any of the newer, high-end SCSI > drives in the 36 - 47 GB range, and what kind of reliability they've > seen in of those. The kind of application(s) I have in mind will be very > very heavy on I/O, so these drives definitely will be pounded around the > clock. > > Also what kind of experience have people had with different RAID > levels, mainly RAID 5 and 10? I've gotten different definitions of what > level 10 is exactly. The definition from the Quantum page is disk > striping with mirroring, so the failure of two disks is unlikely to kill > you unless they're the correct (or incorrect, I guess) two. Given the > reliability of the SCSI drives, would there be a financial advantage in > using this if the data was absolutely irreplaceable? > > Thanks. > -Justin Moore If the criterion is "absolutely irreplaceable" the answer is not in RAID, but in a robust backup policy (which is really observed). But RAID (especially RAID0) can help between backup passes. -- Joe Ferguson, ApeX Systems Integration Corp. Voice: 919.468.8150 FAX: 919.468.5288 email: jferg@2boot.com --------------35501BDF87716B9287151DA3 Content-Type: text/x-vcard; charset=us-ascii; name="jferg.vcf" Content-Transfer-Encoding: 7bit Content-Description: Card for jferg Content-Disposition: attachment; filename="jferg.vcf" begin:vcard n:Ferguson;Joe x-mozilla-html:FALSE org:ApeX Systems Integration Corp. adr:;;;;;; version:2.1 email;internet:jferg@2boot.com title:Tech Director x-mozilla-cpt:;0 fn:Joe Ferguson end:vcard --------------35501BDF87716B9287151DA3-- From jferg@2boot.com Fri, 4 Jun 1999 10:00:35 -0400 Date: Fri, 4 Jun 1999 10:00:35 -0400 From: jferg jferg@2boot.com Subject: [Fwd: [Fwd: FW: Fw: proposed email surcharges]] This is a multi-part message in MIME format. 
--------------D26EA0709EBA7FCC3AAFA6F7 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit dmanddmer wrote: > > > ------------------------------------------------------------------------ > > Subject: [Fwd: FW: Fw: proposed email surcharges] > Date: Thu, 03 Jun 1999 09:37:06 -0500 > From: "Neil C. Bentley, CAI" > Organization: Bentley's & Associates, LLC > > -- > Neil C. Bentley, CAI (800).841.4087 > President (806).376.1121 > Bentley's & Associates, L.L.C. (806).622.0085 fax > > ------------------------------------------------------------------------ > > Subject: FW: Fw: proposed email surcharges > Date: Wed, 2 Jun 1999 01:07:23 -0600 > From: "Lori" > To: > > -->SNIP<-- > > > > > > Send this email to all Americans on your list and tell your friends > > > and relatives to write to their congressman and say "No!" to Bill > > > 602P >> > > Ironic, ain't it? If this were to become law, the kind of spam represented by this junk email would become a bit expensive to send. If it REALLY cut down on the junk email, it might be a fair price to pay! -- Joe Ferguson, ApeX Systems Integration Corp. Voice: 919.468.8150 FAX: 919.468.5288 email: jferg@2boot.com --------------D26EA0709EBA7FCC3AAFA6F7 Content-Type: text/x-vcard; charset=us-ascii; name="jferg.vcf" Content-Transfer-Encoding: 7bit Content-Description: Card for jferg Content-Disposition: attachment; filename="jferg.vcf" begin:vcard n:Ferguson;Joe x-mozilla-html:FALSE org:ApeX Systems Integration Corp. adr:;;;;;; version:2.1 email;internet:jferg@2boot.com title:Tech Director x-mozilla-cpt:;0 fn:Joe Ferguson end:vcard --------------D26EA0709EBA7FCC3AAFA6F7-- From josip@icase.edu Fri, 4 Jun 1999 10:40:05 -0400 Date: Fri, 4 Jun 1999 10:40:05 -0400 From: Josip Loncaric josip@icase.edu Subject: hard disk reliability > How serious is the hard disk reliability issue > in reality? 
Our 32 node cluster has been running for 6 months, with these hard drive problems:

On arrival: 3 defective disks
  - 1 fixed by updating its table of bad blocks
  - 2 replaced, but
    + 1 replacement needed updating its table of bad blocks

In operation: 2 disks developed new bad blocks
  - 1 fixed by updating its table of bad blocks
  - 1 had to be replaced

Other problems: one network card was 20% slower in transmission than the rest (now replaced), several performance problems related to the network card driver and the Linux implementation of TCP, one noisy case fan, and occasional BIOS refusal to recognize ECC memory as ECC.

BTW, is there some easy way to verify from Linux that ECC is actually in use? For our purposes, we need it, but the only place I've seen this reported is on the BIOS startup screen, which nobody looks at because nodes normally do not have monitors attached...

Sincerely,
Josip

--
Dr. Josip Loncaric, Senior Staff Scientist mailto:josip@icase.edu
ICASE, Mail Stop 132C http://www.icase.edu/~josip/
NASA Langley Research Center mailto:j.loncaric@larc.nasa.gov
Hampton, VA 23681-2199, USA Tel. +1 757 864-2192 Fax +1 757 864-6134

From akc@mip.sdu.dk Fri, 4 Jun 1999 10:52:07 -0400
Date: Fri, 4 Jun 1999 10:52:07 -0400
From: Arnold K. Christensen akc@mip.sdu.dk
Subject: PAPERS

Christoph Wasshuber wrote:
> Where can I find more information on
> PAPERS?
>
> Chris....

http://garage.ecn.purdue.edu/~papers/

From eilers@linux-buero.de Fri, 4 Jun 1999 11:21:11 -0400
Date: Fri, 4 Jun 1999 11:21:11 -0400
From: Michael Eilers eilers@linux-buero.de
Subject: PAPERS

Christoph Wasshuber wrote:
> Where can I find more information on
> PAPERS?
http://garage.ecn.purdue.edu/~papers/

Michael

From konold@alpha.tat.physik.uni-tuebingen.de Fri, 4 Jun 1999 11:30:59 -0400
Date: Fri, 4 Jun 1999 11:30:59 -0400
From: Martin Konold konold@alpha.tat.physik.uni-tuebingen.de
Subject: I thought this was an extreme linux list

On Tue, 1 Jun 1999, Alvin Starr wrote:

> At 80Mbytes/sec SCSI can make for a fast link between a small number of
> systems and with a low overhead protocol it could help solve some of the
> problems involved in trying to share memory across a network.

No, unfortunately the overhead of SCSI is tremendous compared to SCI and Myrinet (latencies in the ms range).

Regards,
-- martin

// Martin Konold, Herrenbergerstr. 14, 72070 Tuebingen, Germany //
// Email: konold@kde.org //
KDE: A stable GUI for a reliable OS.
UNIX: Everything including a device is a file.
KDE: Everything including a file is a URL.

From konold@alpha.tat.physik.uni-tuebingen.de Fri, 4 Jun 1999 11:31:09 -0400
Date: Fri, 4 Jun 1999 11:31:09 -0400
From: Martin Konold konold@alpha.tat.physik.uni-tuebingen.de
Subject: SCSI as network interfaces??

On Mon, 31 May 1999, Bill Fredrickson wrote:

Hi Bill,

> The following messages were posted with a reference to using SCSI as an network
> interface. This sounds like a possible solution to a cluster [network] problem
> I'm having and was wondering where I might find out more about using SCSI on
> Intel and AMD based systems as a network interface. I would appreciate it if
> some one could direct me to sources of info as to how it's done and what
> obsticales may be in the way. Thanks in advance.

Well, it is feasible but not very interesting for people doing stuff like MPI. This is mainly due to the big overhead and extremely high latencies. SCSI was designed and optimized for rather high-latency peripherals like disks, scanners etc.

Regards,
-- martin

// Martin Konold, Herrenbergerstr. 14, 72070 Tuebingen, Germany //
// Email: konold@kde.org //
KDE: A stable GUI for a reliable OS.
UNIX: Everything including a device is a file. KDE: Everything including a file is a URL. From lindahl@cs.virginia.edu Fri, 4 Jun 1999 12:09:48 -0400 Date: Fri, 4 Jun 1999 12:09:48 -0400 From: Greg Lindahl lindahl@cs.virginia.edu Subject: hard disk reliability > 1) MTBF or MTTR is calculated using the assumed probability of failure of > the individual parts used in assembling the unit. They *do* tend to > disregard infant mortality. These measures intentionally disregard infant mortality. Read what the definition is. MTBF also ignores old age. It represents the failure rate after infant mortality and before old age. > 5) I think we'll find that power supply and CPU fans are the most common > failure points. But if you buy good ones, I have proof that they rarely fail. As I said before, out of my 290 machines (1/3 2 years old, most of the rest 4 months old), I haven't had a single power supply or cpu fan failure. Deal with clueful vendors and don't scrimp. You'll then see that disks are the next most common failure. > 6) There should be significant cost savings in both hardware and electricity > using properly sized, well engineered, and well constructed power supplies > and cooling systems to service multiple nodes. Where's your proof? I dislike multi-node anything because it reduces reliability. -- g From rgb@phy.duke.edu Fri, 4 Jun 1999 12:09:06 -0400 Date: Fri, 4 Jun 1999 12:09:06 -0400 From: Robert G. Brown rgb@phy.duke.edu Subject: barriers On Fri, 4 Jun 1999, Christoph Wasshuber wrote: > What are 'barriers'? Please enlighten me. Points where parallel code has to exchange information between (some-to-all) nodes and resynchronize. In between barriers, work proceeds in parallel on the nodes. rgb > > Chris.... > Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 
27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu From paul@pxh.oac.ucla.edu Fri, 4 Jun 1999 12:29:49 -0400 Date: Fri, 4 Jun 1999 12:29:49 -0400 From: Paul Hoffman paul@pxh.oac.ucla.edu Subject: PVM or MPI essential to run parallel applications on aBeowulf? According to the MPICH V1.1.2 Installation Guide: "Using shared memory with '-comm=shared' is not supported on LINUX as it is on other variations of Unix becaue LINUX does not support the standard features of 'mmap.'" If you have a work around for this, I'd be happy to know... paul / > > > >The simplest, but not necessarily most optimal, way to handle this is to > > >treat a dual processor node as if it were 2 nodes. No special > > >programming is required and the benefits are available immediately. > > > > Thanks, I did try that [2 MPI processes on same system]. Since the > > OS shifts a lot of work to second processor when the first is pegged, > > bringing in a second MPI process is not so great [I don't expect it > > to be]. > > If you're using MPICH, you can make use of both shared-memory for > passing messages between processes on a machine, and ethernet for > messages to remote machines if you configure and > compile with -comm=shared. Note that you need to setup a procgroup > file, not just list each dual-processor machine twice. It took me a > while to discover this, but it is crucial for good performance. > Without specifying the procgroup, using both processors on a node was > actually _slower_ than only using a single processor - not a good > thing. > > If you need more details on how to setup the procgroup file, check > the MPICH docs or email me. 
> > -jason > > > ==================================================================== > Jason Abate abate@ticam.utexas.edu www.ticam.utexas.edu/~abate > Texas Institute for Computational and Applied Mathematics > 304 SHC, University of Texas at Austin, Austin, TX 78712 > Work: 512-471-6947 Home: 512-912-1012 Fax: 512-471-8694 > From Christopher.Bohn@sn.wpafb.af.mil Fri, 4 Jun 1999 12:48:09 -0400 Date: Fri, 4 Jun 1999 12:48:09 -0400 From: Bohn Christopher A Capt AFRL/IFSD Christopher.Bohn@sn.wpafb.af.mil Subject: PAPERS & barrier def'n http://garage.ecn.purdue.edu/~papers/ <-- PAPERS home page A barrier is used to synchronize processes. If you need multiple processes to all be at a certain point in their execution at the same time (e.g., to prevent race conditions), barrier synchronization is generally what you'll want. Compare/contrast this with the implicit synchronization that happens when you block for communication. Take care, cb *-*-*-*-* *-*-*-*-* Christopher A. Bohn, Capt, USAF || christopher.bohn@sn.wpafb.af.mil Digital Simulation Systems Engineer || cbohn@computer.org Collaborative Simulation Technology || and Applications Branch || v (937)255-4429x3576 (DSN785) Information Directorate || f (937)255-4511 (DSN785) Wright Research Site || Air Force Research Laboratory || http://members.aol.com/EngrBohn/ http://www.if.afrl.af.mil/div/IFS/IFSD/IFSD_home.html *-*-*-*-* *-*-*-*-* > -----Original Message----- > From: Christoph Wasshuber [mailto:wasshub@spdc.ti.com] > Sent: Friday, June 04, 1999 8:34 AM > To: beowulf > Subject: PAPERS > > > Where can I find more information on > PAPERS? > > Chris.... > > -----Original Message----- > From: Christoph Wasshuber [mailto:wasshub@spdc.ti.com] > Sent: Friday, June 04, 1999 8:13 AM > To: beowulf > Subject: barriers > > > What are 'barriers'? Please enlighten me. > > Chris.... 
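
[Editor's sketch] The barrier described in the replies above can be illustrated with a small example. This is only a sketch under stated assumptions: it uses threads on one machine and Python's standard `threading.Barrier` as a stand-in, not MPI, PVM, or the PAPERS hardware discussed on the list. The function name `run_with_barrier` and the "phase1"/"phase2" labels are made up for illustration.

```python
import threading


def run_with_barrier(num_workers=4):
    """Run workers that all finish phase 1 before any of them starts phase 2."""
    barrier = threading.Barrier(num_workers)  # releases once num_workers threads wait
    lock = threading.Lock()                   # protects the shared event log
    events = []

    def worker(rank):
        # Phase 1: independent work, done in parallel between barriers.
        with lock:
            events.append(("phase1", rank))
        # Synchronization point: block here until every worker has arrived.
        barrier.wait()
        # Phase 2: safe to rely on every worker's phase 1 results.
        with lock:
            events.append(("phase2", rank))

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return events


if __name__ == "__main__":
    for phase, rank in run_with_barrier():
        print(phase, rank)
```

The order of ranks within a phase may vary from run to run, but no "phase2" entry can appear before the last "phase1" entry, which is the property the replies describe.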
> From qobi@research.nj.nec.com Fri, 4 Jun 1999 13:28:17 -0400 Date: Fri, 4 Jun 1999 13:28:17 -0400 From: Jeffrey Mark Siskind qobi@research.nj.nec.com Subject: Checking ECC under Linux [was: hard disk reliability] BTW, is there some easy way to verify from Linux that ECC is actually in use? For our purposes, we need it, but the only place I've seen this reported is on the BIOS startup screen, which nobody looks at because nodes normally do not have monitors attached... Henry Cejtin henry@clairv.com has such a program. Jeff (http://www.neci.nj.nec.com/homepages/qobi) From abate@brahma.ticam.utexas.edu Fri, 4 Jun 1999 13:41:57 -0400 Date: Fri, 4 Jun 1999 13:41:57 -0400 From: Jason Abate abate@brahma.ticam.utexas.edu Subject: PVM or MPI essential to run parallel applications on aBeowulf? > According to the MPICH V1.1.2 Installation Guide: > "Using shared memory with '-comm=shared' is not supported on LINUX as it > is on other variations of Unix becaue LINUX does not support the > standard features of 'mmap.'" That's true, but given the poor performance I was seeing on dual-processor Xeon boxes with the usual p4 device, I decided to go ahead and try it. From the tests that I've run (not exhaustive, but not trivial either), it seems to work fine. I'd be interested to see if anyone else has had similar results. -jason ==================================================================== Jason Abate abate@ticam.utexas.edu www.ticam.utexas.edu/~abate Texas Institute for Computational and Applied Mathematics 304 SHC, University of Texas at Austin, Austin, TX 78712 Work: 512-471-6947 Home: 512-912-1012 Fax: 512-471-8694 From jhbobrink@earthlink.net Sat, 5 Jun 1999 01:49:04 -0400 Date: Sat, 5 Jun 1999 01:49:04 -0400 From: JHBobrink jhbobrink@earthlink.net Subject: [Fwd: responsibility is ours to take...] This is a multi-part message in MIME format. 
--------------F955CDE76717C653D526F3D7 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit --------------F955CDE76717C653D526F3D7 Content-Type: message/rfc822 Content-Transfer-Encoding: 7bit Content-Disposition: inline X-Mozilla-Status2: 00000000 Message-ID: <3758BA65.BE1DA789@earthlink.net> Date: Sat, 5 Jun 1999 01:49:04 -0400 From: JHBobrink X-Mailer: Mozilla 4.6 [en] (Win98; I) X-Accept-Language: en MIME-Version: 1.0 To: Joel Jaeggli Subject: responsibility is ours to take... Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Dear Joel, This is in response to your retort to the mail posted to "our" list regarding the federal governments' attempt to place a charge of "postage" on each and every email transaction on the web. Sir, you MUST realize that the VERY existence of our lists' communication is at stake. The first wedge driven in to the free exchange of ideas on the web is like the tolling of the funeral bells of free speech! First the odd tax, then the regulation of transmition (authorized by the fact that it IS taxed) then monitoring, then access limitation to the net by government proxie.... on and on how you gonna yak about thisa SHIT THEN?????? HuH duh>>>>>>>>>>>*~ So.... give a MOMENT of thought before piping up with a retort that by any estimation is COUNTER to your own aims. Feel free to respond I am waiting Friend JHB --------------F955CDE76717C653D526F3D7-- From ClusterHack@snet.net Sat, 5 Jun 1999 04:44:59 -0400 Date: Sat, 5 Jun 1999 04:44:59 -0400 From: Bob Cat ClusterHack@snet.net Subject: hard disk reliability > > 5) I think we'll find that power supply and CPU fans are the most common > > failure points. > > But if you buy good ones, I have proof that they rarely fail. As I > said before, out of my 290 machines (1/3 2 years old, most of the rest > 4 months old), I haven't had a single power supply or cpu fan failure. I have seen Micron, HP, and other "name-brand PC" fans and supplies fail. 
Out of 3 Microns installed, I had 1 bad fan and 2 bad sectors on 1 HD - I submit personal experience is not a sufficient indicator of reliability. (I would still recommend Micron) Perhaps it was because they were in an office environment, not a nice clean lab? I'm curious about the MTBFs for fans and power supplies - are they greater than those for disks? Have you looked inside one of the "commodity" supplies? Ecchh! BTW, dying supplies often take other components with them. > > service multiple nodes. > Where's your proof? I dislike multi-node anything because it reduces > reliability. If a single node fails while working on a sufficiently fine-grained parallel problem, won't that stop the entire run anyway? Coarse-grained is different, of course. But: if I have X components with the same MTBF, I divide that MTBF by X. Halve the components and you double the system MTBF. I *said* statistics are fun. :ßobÇat.Bat 1.0 >^^< In base(one half) an infinite number approaches unity. Echo f b800:0000 fff 32 00 e1 09 6f 0f 62 0f 80 04 61 0f 74 0f 32 00 > Bob.Cat Echo q >> Bob.Cat DeBug < Bob.Cat > Nul @Erase Bob.Cat > Nul From wmilas@rarcoa.com Sat, 5 Jun 1999 12:04:16 -0400 Date: Sat, 5 Jun 1999 12:04:16 -0400 From: Wayde Milas wmilas@rarcoa.com Subject: hard disk reliability > I have seen Micron, HP, and other "name-brand PC" fans and supplies fail. > Out of 3 Microns installed, I had 1 bad fan and 2 bad sectors on 1 HD - I submit > personal experience is not a sufficient indicator of reliability. > (I would still recommend Micron) > Perhaps it was because they were in an office environment, not a nice clean lab? > I'm curious about the MTBFs for fans and power supplies - are they greater than those for disks? It's been my experience that high-rpm SCSI disks (the ones that run hot) fail about as often as your average ATX power supply. The power supplies might actually fail MORE often than low-rpm, cool-running drives. This is just my personal experience, btw.... 
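The divide-by-X rule of thumb from earlier in this thread can be worked through numerically. The sketch below assumes independent components with constant failure rates, where any single failure stops the run; the 100,000-hour component MTBF and the function name system_mtbf are purely hypothetical, not measured values or anything posted to the list.

```python
def system_mtbf(component_mtbf_hours, num_components):
    """MTBF of a system that fails as soon as any one component fails.

    Assumes identical, independent components with constant failure
    rates, so the failure rates simply add up.
    """
    if num_components < 1:
        raise ValueError("need at least one component")
    return component_mtbf_hours / num_components


if __name__ == "__main__":
    # Hypothetical figure, chosen only to make the arithmetic easy to follow.
    mtbf = 100_000.0
    for count in (64, 32):
        print(count, "components:", system_mtbf(mtbf, count), "hours")
```

Under these assumptions, going from 64 components to 32 doubles the expected time between failures, which is the "halve the components" point. Real fans, supplies, and disks do not share one MTBF, so this is only a first-order estimate.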
Wayde Milas From jonathanclements@hotmail.com Sat, 5 Jun 1999 15:57:40 -0400 Date: Sat, 5 Jun 1999 15:57:40 -0400 From: Jonathan Clements jonathanclements@hotmail.com Subject: all this political wapping Guys, I have been watching this list for about 6 months now. I do so to learn about clusters. But recently there has been an increase in the "political" discussions that have been going on. Very little of what has been said is constructive, well thought out, or well informed. I am by no means a cluster expert, but I am sick of reading this "big brother is out to keep the man down" bull shit. SHUT UP! Half the emails I get are this kind of crap. If any of you out there are in any way qualified to discuss these things you certainly aren't showing it. On this list you (for the most part) treat "newbies" reasonably well and try to help them. But when someone says something that wouldn't even be a "newbie" political question/statement we spend a week and twenty emails discussing it when it doesn't even warrant one. Any resemblance between what was laid out and fact is generally purely coincidental. So if you don't stop "sharing" your "opinions", then I am going to personally take it upon myself to answer every single ridiculous email (off the list of course). Shut up and stop wasting my mail box space. And your momma too! jonathan clements From tibbs@math.uh.edu Sat, 5 Jun 1999 16:33:19 -0400 Date: Sat, 5 Jun 1999 16:33:19 -0400 From: Jason L Tibbitts III tibbs@math.uh.edu Subject: Dataless nodes using Coda? I've seen instructions for building diskless nodes using an NFS root and it seems relatively easy to do full installs with NFS or Coda-shared /usr/local but what I'd like to investigate is a shared /usr configuration with Coda as the transport. 
One issue I see so far is that RedHat doesn't make it really easy to have a shared /usr configuration (or at least they don't advertise the fact) and most packages don't really advertise what they put in /usr and what they put elsewhere (/etc, /sbin). I don't see an easy way to use kickstart to set it up. If shared /usr is possible then the transport should be irrelevant, and Coda seems to make more sense than NFS because of the caching aspect. (There's no need to melt the server sharing unchanging binaries.) Thanks, -- Jason L Tibbitts III - tibbs@uh.edu - 713/743-3486 - 660PGH - 94 PC800 System Manager: University of Houston Department of Mathematics "You'll see the blood as we roll in it together..." From pesch@ibm.net Sat, 5 Jun 1999 16:38:16 -0400 Date: Sat, 5 Jun 1999 16:38:16 -0400 From: Paul Eduard Schenker pesch@ibm.net Subject: Cables We looked at SCSI and dropped it; Myrinet looks good if you really need the bandwidth. About FastEthernet: we've experimented with cables with only 4 wires, and they seem to work quite well (in some cases we observed a performance improvement for which I have absolutely no explanation). Any experience in this field? Paul At 04:29 PM 6/1/99 +0200, Martin Konold wrote: >On Tue, 1 Jun 1999, Alvin Starr wrote: > >> At 80Mbytes/sec SCSI can make for a fast link between a small number of >> systems and with a low overhead protocol it could help solve some of the >> problems involved in trying to share memory across a network. > >No, unfortunately the overhead of SCSI is compared to SCI and Myrinet >tremendous. (Latencies in the ms range) > >Regards, >-- martin > >// Martin Konold, Herrenbergerstr. 14, 72070 Tuebingen, Germany // >// Email: konold@kde.org // >KDE: A stable GUI for a reliable OS. >UNIX: Everything including a device is a file. >KDE: Everything including a file is a URL. 
> > > Paul Eduard Schenker 1 Peirce Hill Singapore 248558 Phone: 476 2245 Fax: 472 6480 email: pesch@ibm.net From pesch@ibm.net Sat, 5 Jun 1999 16:39:24 -0400 Date: Sat, 5 Jun 1999 16:39:24 -0400 From: Paul Eduard Schenker pesch@ibm.net Subject: hard disk reliability What brand were the disks? Paul At 10:40 AM 6/4/99 -0400, Josip Loncaric wrote: >> How serious is the hard disk reliability issue >> in reality? > >Our 32 node cluster has been running for 6 months, with hard drive >problems: > >On arrival: > 3 defective disks > - 1 fixed by updating its table of bad blocks > - 2 replaced, but > + 1 replacement needed updating its table of bad blocks > >In operation: > 2 disks developed new bad blocks > - 1 fixed by updating its table of bad blocks > - 1 had to be replaced > >Sincerely, >Josip > > >-- >Dr. Josip Loncaric, Senior Staff Scientist mailto:josip@icase.edu >ICASE, Mail Stop 132C http://www.icase.edu/~josip/ >NASA Langley Research Center mailto:j.loncaric@larc.nasa.gov >Hampton, VA 23681-2199, USA Tel. +1 757 864-2192 Fax +1 757 864-6134 > > Paul Eduard Schenker 1 Peirce Hill Singapore 248558 Phone: 476 2245 Fax: 472 6480 email: pesch@ibm.net From pesch@ibm.net Sat, 5 Jun 1999 17:05:02 -0400 Date: Sat, 5 Jun 1999 17:05:02 -0400 From: Paul Eduard Schenker pesch@ibm.net Subject: hard disk reliability Have you measured the temperatures of the drives? - or do "hot" and "cool" represent subjective values... Paul At 11:01 AM 6/5/99 -0500, Wayde Milas wrote: >> I have seen Micron, HP, and other "name-brand PC" fans and supplies fail. >> Out of 3 Microns installed, I had 1 bad fan and 2 bad sectors on 1 HD - I submit >> personal experience is not a sufficient indicator of reliability. >> (I would still recommend Micron) >> Perhaps it was because they were in an office environment, not a nice clean lab? >> I'm curious about the MTBFs for fans and power supplies - are they greater than those for disks? 
> >Its been my experience, that High rpm scsi disks (that run hot) fail >about as often as your average ATX Power supply). The power supplies >might actually fail MORE often then low rpm cool running drives. This is >jsut my personal experience, btw.... > >Wayde Milas > > Paul Eduard Schenker 1 Peirce Hill Singapore 248558 Phone: 476 2245 Fax: 472 6480 email: pesch@ibm.net From judson@mcs.anl.gov Sat, 5 Jun 1999 17:45:29 -0400 Date: Sat, 5 Jun 1999 17:45:29 -0400 From: Ivan R. Judson judson@mcs.anl.gov Subject: netmem/DSM? This is a multi-part message in MIME format. ------=_NextPart_000_0006_01BEAF72.AFFA46A0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Has there been any changes or better yet, releases, docs, or examples of netmem (from cesdis), working on any machine? I'd like to peer at some working code, and utilize it for some experiments. We're tracking down treadmarks for some high-level comparisons, but I'd like something that I can peer at. --Ivan .......................................................................... Math/Computer Science | judson@anl.gov Argonne National Laboratory | judson@mcs.anl.gov http://www.mcs.anl.gov/people/judson | ------=_NextPart_000_0006_01BEAF72.AFFA46A0 Content-Type: application/vcard; name="Ivan R. Judson.vcf" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="Ivan R. Judson.vcf" BEGIN:VCARD VERSION:2.1 N:Judson;Ivan;R.;; FN:Ivan R. Judson ORG:Argonne National Laboratory; TITLE: EMAIL;PREF;INTERNET:judson@mcs.anl.gov REV:19981026T010356Z END:VCARD ------=_NextPart_000_0006_01BEAF72.AFFA46A0-- From ok_murphy@email.msn.com Sat, 5 Jun 1999 19:18:29 -0400 Date: Sat, 5 Jun 1999 19:18:29 -0400 From: Keith Murphy ok_murphy@email.msn.com Subject: Cables If you are looking for (800MBytes) aggregate performance try SCI (Scaleable Coherent Interface) which also has an extremely low latency (better than Myrinet). 
-----Original Message----- From: Paul Eduard Schenker To: Martin Konold ; Alvin Starr Cc: sct ; extreme-linux@acl.lanl.gov ; beowulf mail list Date: Sat, 5 Jun 1999 19:18:29 -0400 Subject: Cables >We looked at SCSI and dropped it; Myrinet looks good if you really need the >bandwith. > >About FastEthernet: we've exxperimented with cables with only 4 wires, and >they seem to work quite well (in some cases we observed a perfomance >improvement for which I have absolutely no explanation). Any experience in >this field? > >Paul > >At 04:29 PM 6/1/99 +0200, Martin Konold wrote: >>On Tue, 1 Jun 1999, Alvin Starr wrote: >> >>> At 80Mbytes/sec SCSI can make for a fast link between a small number of >>> systems and with a low overhead protocol it could help solve some of the >>> problems involved in trying to share memory across a network. >> >>No, unfortunately the overhead of SCSI is compared to SCI and Myrinet >>tremendeous. (Latencies in the ms range) >> >>Regards, >>-- martin >> >>// Martin Konold, Herrenbergerstr. 14, 72070 Tuebingen, Germany // >>// Email: konold@kde.org // >>KDE: A stable GUI for a reliable OS. >>UNIX: Everything including a device is a file. >>KDE: Everything including a file is a URL. >> >> >> >Paul Eduard Schenker >1 Peirce Hill >Singapore 248558 > >Phone: 476 2245 >Fax: 472 6480 >email: pesch@ibm.net > > From alan@lxorguk.ukuu.org.uk Sat, 5 Jun 1999 19:44:31 -0400 Date: Sat, 5 Jun 1999 19:44:31 -0400 From: Alan Cox alan@lxorguk.ukuu.org.uk Subject: Cables > If you are looking for (800MBytes) aggregate performance try SCI (Scaleable > Coherent Interface) which also has an extremely low latency (better than > Myrinet). Is there are source for Open Source SCI yet or do you still have to pray your cluster doesnt get obsoleted by some random third party ? 
From ok_murphy@email.msn.com Sat, 5 Jun 1999 20:15:29 -0400 Date: Sat, 5 Jun 1999 20:15:29 -0400 From: Keith Murphy ok_murphy@email.msn.com Subject: Cables SCI is an ANSI/IEEE standard, so it is well documented. As for being obsoleted by a third party, is any technology free from that? In the meantime it is the fastest interface available today and makes an ideal Beowulf interface. -----Original Message----- From: Alan Cox To: Keith Murphy Cc: konold@alpha.tat.physik.uni-tuebingen.de ; alvin@iplink.net ; pesch@ibm.net ; sct@lanl.gov ; extreme-linux@acl.lanl.gov ; beowulf@beowulf.gsfc.nasa.gov Date: Sat, 5 Jun 1999 20:15:29 -0400 Subject: Re: Cables >> If you are looking for (800MBytes) aggregate performance try SCI (Scaleable >> Coherent Interface) which also has an extremely low latency (better than >> Myrinet). > >Is there are source for Open Source SCI yet or do you still have to pray >your cluster doesnt get obsoleted by some random third party ? > From alan@lxorguk.ukuu.org.uk Sat, 5 Jun 1999 20:24:59 -0400 Date: Sat, 5 Jun 1999 20:24:59 -0400 From: Alan Cox alan@lxorguk.ukuu.org.uk Subject: Cables > SCI is an ANSI/IEEE standard so well documented. As for being obsolete by a The upper layer > third party is any technology free from that? In the meantime it is the > fastest interface available today and makes an ideal Beowulf interface. What I meant was: is anyone providing open source drivers or documentation for their SCI cards? SCI itself may be a standard, but so is Ethernet. It still doesn't save you if the vendor provides nothing but binary-only drivers for old kernels.. 
Alan From ok_murphy@email.msn.com Sat, 5 Jun 1999 22:47:28 -0400 Date: Sat, 5 Jun 1999 22:47:28 -0400 From: Keith Murphy ok_murphy@email.msn.com Subject: Cables Check our Dolphin Interconnect and SCALI sites, www.dolphinics.com and www.scali.com; they are interesting regarding SCI/Linux support. Keith -----Original Message----- From: Alan Cox To: Keith Murphy Cc: alan@lxorguk.ukuu.org.uk ; konold@alpha.tat.physik.uni-tuebingen.de ; alvin@iplink.net ; pesch@ibm.net ; sct@lanl.gov ; extreme-linux@acl.lanl.gov ; beowulf@beowulf.gsfc.nasa.gov Date: Sat, 5 Jun 1999 22:47:28 -0400 Subject: Re: Cables >> SCI is an ANSI/IEEE standard so well documented. As for being obsolete by a > >The upper layer > >> third party is any technology free from that? In the meantime it is the >> fastest interface available today and makes an ideal Beowulf interface. > >What I meant was is anyone providing open source drivers or documentation >to their SCI cards. SCI itself may be a standard, but so is ethernet. It still >doesn't save you if the vendor provides nothing but binary only drivers for >old kernels.. > >Alan > > From csnyder1@cwix.com Sat, 5 Jun 1999 23:02:33 -0400 Date: Sat, 5 Jun 1999 23:02:33 -0400 From: Christopher Snyder csnyder1@cwix.com Subject: Smallest Linux PC On Earth? Good Beowulf node? This is a multi-part message in MIME format. ------=_NextPart_000_0014_01BEAFA7.C5210F70 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Hey there, A few weeks ago I saw some stuff on the web about the smallest Linux computer on Earth. It's a German-made PC, about 4 inches square, 2 inches thick, with a tiny solid state hard drive or IBM drive and all the works. (Original development was for embedded systems) I think the story was that for about 920 bucks $US$ these guys put together a node with CPU, RAM, and disk in a box about the size of a VCR cassette. I thought it might make a great tool for a Beowulf system, but maybe not? 
(Consider having 64 mini nodes in a box a little bigger than the size of one PC - full sized tower, sitting by your desk, just crank up the air conditioning...) I read that they even are using this little PC as a web server serving up their site on the web. Where, I do not know. Anyway, If anyone has read about this too, please let me know. I seem to have lost track of the site, the manufacturer, and the original story! But I remember it was basically low cost, small and they even had instructions on putting a node all together... Regards, C.S. ------=_NextPart_000_0014_01BEAFA7.C5210F70 Content-Type: text/html; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable
------=_NextPart_000_0014_01BEAFA7.C5210F70-- From goebel@his.com Sat, 5 Jun 1999 23:03:31 -0400 Date: Sat, 5 Jun 1999 23:03:31 -0400 From: goebel goebel@his.com Subject: Cables On Sun, 6 Jun 1999, Alan Cox wrote: > What I meant was is anyone providing open source drivers or documentation > to their SCI cards. SCI itself may be a standard, but so is ethernet. It still > doesn't save you if the vendor provides nothing but binary only drivers for > old kernels.. > Alan -- I have an SCI machine at work that I am testing and offering to let others try out via the Net, but you're right, it is a proprietary driver (we're testing the same company's MPI too). We're talking to both companies, hardware and software, to release the hardware spec and the driver code. Right now it's running on 2.2.7 RH 6.0. This is the unfortunate part of closed source development; the more interest in the product, the more development it gets, and organizations aren't as fast on their collective feet as the open source community (typically). In open source, the more it is open, the more it lives. I think both companies will come around, but at this point that is my hope, not the reality. SCI does have low latency, and that's the reason why I am interested in getting it into open source. John Goebel VA Linux Systems From mwd@sgi.com Sun, 6 Jun 1999 00:30:05 -0400 Date: Sun, 6 Jun 1999 00:30:05 -0400 From: Mark Dalton mwd@sgi.com Subject: Question on network performance.. Tulip.c I have been helping a friend (or trying to) with his cluster. One thing that is interesting is that if I send larger packets I start losing some of the data. It seems to always miss one message (It is always the first message, for each machine).. So if I: ping -c 1 -s 8000 d03 - This will fail - if I try again, it will work. ping -c 4 -s 8000 d04 - This will get 25% loss (It seems to be the first message) As an FYI, the pings take about 2.1 ms between boxes. 
I believe the cards are Linksys Ethernet cards (I know they have the DEC chip and use the tulip driver), and we are using 2 ExtremeNetworks 48 port switches (2 Gigabit uplinks on each). (Not that it is necessarily a driver issue; I just thought I should mention it.) This seems to occur with the older driver: tulip.c:v0.83 10/19/97 And with the newer driver also: tulip.c:v0.91e 5/27/99 The OS is Red Hat Linux 5.2, with Linux 2.1.125 #5 SMP, i686 (We plan to move on to Linux 2.2.*, after we are confident we have the performance/stability we are happy with at this point, and verify Linux 2.2.* works better, which I am sure it will since these are DUAL CPU boxes). Does anyone have any comments/ideas? Also what type of simple ping performance/loss for larger packets are people seeing? Also are there any 'special' mods to the kernel made to help improve performance for beowulf clusters? (Mike Warren?? have you released those changes). (I know, I know.. it is not an SGI machine.. (^8 ). Mark -- Mark Dalton CH3-S-CH2 H H O H Silicon Graphics, Inc. | | | \ | Eagan, MN 55121 CH2-C-COO //\ ---C--CH2-C-COO C-CH2-C-COO mwd@sgi.com | | || || | // | NH3 \\/ \ / CH NH3 O NH3 NH My home page: http://www.cbc.umn.edu/~mwd/mwd.html Cell Biology: http://www.cbc.umn.edu/~mwd/cell.html BEAM Robotics pages: Beam-Online: http://www.beam-online.com/ Tek FAQ: http://people.ne.mediaone.net/bushbo/beam/FAQ.html Chiu-Yuan's: http://www.geocities.com/SouthBeach/6897/ From torben@net.Hawaii.Edu Sun, 6 Jun 1999 03:09:54 -0400 Date: Sun, 6 Jun 1999 03:09:54 -0400 From: Torben Noerup Nielsen torben@net.Hawaii.Edu Subject: Smallest Linux PC On Earth? Good Beowulf node? This is a multi-part message in MIME format. ------=_NextPart_000_0004_01BEAF97.BE8285A0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Try looking at http://www.jumptec.de; that might be what you are looking for.... 
Torben -----Original Message----- From: owner-beowulf@beowulf.gsfc.nasa.gov [mailto:owner-beowulf@beowulf.gsfc.nasa.gov]On Behalf Of Christopher Snyder Sent: Saturday, June 05, 1999 5:05 PM To: beowulf@beowulf.gsfc.nasa.gov Subject: Smallest Linux PC On Earth? Good Beowulf node? Hey there, A few weeks ago I saw some stuff on the web about the smallest Linux computer on Earth. It's a German made PC, about 4 inches square, 2 inches thick, with a tiny solid state hard drive or IBM drive and all the works. (Original development was for embedded systems) I think the story was that for about 920 bucks $US$ these guys put together a node with CPU, RAM, and disk in a box about the size of a VCR cassette, I thought it might make a great tool for a Beowulf system, but maybe not? (Consider having 64 mini nodes in a box a little bigger than the size of one PC - full sized tower, sitting by your desk, just crank up the air conditioning...) I read that they even are using this little PC as a web server serving up their site on the web. Where, I do not know. Anyway, If anyone has read about this too, please let me know. I seem to have lost track of the site, the manufacturer, and the original story! But I remember it was basically low cost, small and they even had instructions on putting a node all together... Regards, C.S. ------=_NextPart_000_0004_01BEAF97.BE8285A0 Content-Type: text/html; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable
------=_NextPart_000_0004_01BEAF97.BE8285A0-- From torben@net.Hawaii.Edu Sun, 6 Jun 1999 03:15:09 -0400 Date: Sun, 6 Jun 1999 03:15:09 -0400 From: Torben Noerup Nielsen torben@net.Hawaii.Edu Subject: Smallest Linux PC On Earth? Good Beowulf node? This is a multi-part message in MIME format. ------=_NextPart_000_0009_01BEAF98.7AB93340 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit P.S. For complete instructions on how to build one, check http://wearables.stanford.edu; they give you the instructions for setting it all up. Note that this just a 33-66 MHz 486 so it probably won't do all that well as part of a cluster.... -----Original Message----- From: owner-beowulf@beowulf.gsfc.nasa.gov [mailto:owner-beowulf@beowulf.gsfc.nasa.gov]On Behalf Of Christopher Snyder Sent: Saturday, June 05, 1999 5:05 PM To: beowulf@beowulf.gsfc.nasa.gov Subject: Smallest Linux PC On Earth? Good Beowulf node? Hey there, A few weeks ago I saw some stuff on the web about the smallest Linux computer on Earth. It's a German made PC, about 4 inches square, 2 inches thick, with a tiny solid state hard drive or IBM drive and all the works. (Original development was for embedded systems) I think the story was that for about 920 bucks $US$ these guys put together a node with CPU, RAM, and disk in a box about the size of a VCR cassette, I thought it might make a great tool for a Beowulf system, but maybe not? (Consider having 64 mini nodes in a box a little bigger than the size of one PC - full sized tower, sitting by your desk, just crank up the air conditioning...) I read that they even are using this little PC as a web server serving up their site on the web. Where, I do not know. Anyway, If anyone has read about this too, please let me know. I seem to have lost track of the site, the manufacturer, and the original story! But I remember it was basically low cost, small and they even had instructions on putting a node all together... Regards, C.S. 
------=_NextPart_000_0009_01BEAF98.7AB93340 Content-Type: text/html; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable
------=_NextPart_000_0009_01BEAF98.7AB93340-- From pesch@ibm.net Sun, 6 Jun 1999 03:45:41 -0400 Date: Sun, 6 Jun 1999 03:45:41 -0400 From: Paul Eduard Schenker pesch@ibm.net Subject: Question on network performance.. Tulip.c What speed of ethernet are you using and how far are the boxes apart? At 11:29 PM 6/5/99 -0500, Mark Dalton wrote: > >I have been helping a friend (or trying to) with his cluster. > >One thing that is interesting is that if I send larger packets >I start loosing some of the data. It seems to always miss one >message (It is always the first message, for each machine).. > >So if I: > ping -c 1 -s 8000 d03 > - This will fail > - if I try again, it will work. > ping -c 4 -s 8000 d04 > - This will get 25% loss (It seems to be the first message) > >As a FYI, the pings take about, 2.1ms between boxes. I believe the >cards are Lynksys Ethernet cards (I know the have the dec chip/tulip driver), >and we are using 2 ExtremeNetworks 48 port switches (2 Gigabit uplinks >on each). > >(Not that it is a driver issue, I thought I should just mention it) > This seems to occur with the older driver: > tulip.c:v0.83 10/19/97 > And with the newer driver also: > tulip.c:v0.91e 5/27/99 > >The OS is Red Hat linux 5.2, with Linux 2.1.125 #5 SMP, i686 >(We plan to move on to Linux 2.2.*, after we are confident we > have the performance/stability we are happy with at this point, > and verify Linux 2.2.* works better, which I am sure it will > since these are DUAL CPU boxes). > >Does anyone have any comments/ideas? Also what type of simple ping >performance/loss for larger packets are people seeing? > >Also is there any 'special' mods to the kernel made to help improve >performance for beowulf clusters? (Mike Warren?? have you released >those changes). (I know, I know.. it is not a SGI machine.. (^8 ). > >Mark >-- >Mark Dalton CH3-S-CH2 H H O H >Silicon Graphics, Inc. 
| | | \ | >Eagan, MN 55121 CH2-C-COO //\ ---C--CH2-C-COO C-CH2-C-COO >mwd@sgi.com | | || || | // | > NH3 \\/ \ / CH NH3 O NH3 > NH >My home page: http://www.cbc.umn.edu/~mwd/mwd.html >Cell Biology: http://www.cbc.umn.edu/~mwd/cell.html >BEAM Robotics pages: > Beam-Online: http://www.beam-online.com/ > Tek FAQ: http://people.ne.mediaone.net/bushbo/beam/FAQ.html > Chiu-Yuan's: http://www.geocities.com/SouthBeach/6897/ > > Paul Eduard Schenker 1 Peirce Hill Singapore 248558 Phone: 476 2245 Fax: 472 6480 email: pesch@ibm.net From mwd@sgi.com Sun, 6 Jun 1999 09:45:03 -0400 Date: Sun, 6 Jun 1999 09:45:03 -0400 From: Mark Dalton mwd@sgi.com Subject: Question on network performance.. Tulip.c > What speed of ethernet are you using and how far are the boxes apart? > 100bT ethernet.. The distance between the boxes varies, but the furthest points would be for: The layout is.. d01 d13 d25 d33 d42 d55 .. .. .. .. .. .. .. .. switch1 switch2 .. .. .. .. .. .. .. .. d12 d24 d32 d41 d54 d66 However, the distance does not seem to matter. The shortest cable lengths are between machines like d27 and d28. Also I did further narrowing down.. Perhaps there is an option in the tulip driver or a kernel change that I need to learn about... A size of 4432 seems to be the limit for the size of the data for the first message. When I get to the: ping -s 4433 [machine] - I can get the first message to fail, as does anything larger. However, every ping afterwards to the same machine works. ping -s 4432 [machine] - repeatedly works. And once the connection to a given machine is established (either with a multiple count ping, or by multiple ping commands), the pings succeed. (We did try the BayStack 16 port switch also, which was a little slower and we still got the same behaviour as above). Examples below... (FYI on ping) -s packetsize Specifies the number of data bytes to be sent. The default is 56, which translates into 64 ICMP data bytes when combined with the 8 bytes of ICMP header data. 
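One observation about the 4432-byte threshold, offered as arithmetic rather than a diagnosis: assuming a standard 1500-byte Ethernet MTU and a 20-byte IPv4 header with no options (neither is stated in the thread), each IP fragment carries at most 1480 bytes. So 4432 data bytes plus the 8-byte ICMP header is exactly three full fragments, and 4433 is the first size that needs a fourth. The function name icmp_fragments is made up for this sketch.

```python
import math


def icmp_fragments(data_bytes, mtu=1500, ip_header=20, icmp_header=8):
    """Number of IP fragments needed for one ping carrying `data_bytes`.

    The defaults (1500-byte MTU, 20-byte IPv4 header, no options) are
    assumptions for illustration, not values reported in the thread.
    """
    # Every fragment except the last must carry a multiple of 8 bytes,
    # because the IP fragment offset is counted in 8-byte units.
    per_fragment = (mtu - ip_header) // 8 * 8
    # What gets fragmented is the ICMP header plus the ping data.
    total = data_bytes + icmp_header
    return math.ceil(total / per_fragment)


if __name__ == "__main__":
    for size in (56, 4432, 4433, 8000):
        print("ping -s", size, "->", icmp_fragments(size), "fragment(s)")
```

If the assumptions hold, the failing first ping lines up with the point where a datagram grows from three fragments to four, which may be worth keeping in mind when looking at the driver or switch. It does not by itself say where the fragment is being lost.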
Mark > At 11:29 PM 6/5/99 -0500, Mark Dalton wrote: > > > >I have been helping a friend (or trying to) with his cluster. > > > >One thing that is interesting is that if I send larger packets > >I start loosing some of the data. It seems to always miss one > >message (It is always the first message, for each machine).. > > > >So if I: > > ping -c 1 -s 8000 d03 > > - This will fail > > - if I try again, it will work. > > ping -c 4 -s 8000 d04 > > - This will get 25% loss (It seems to be the first message) > > > >As a FYI, the pings take about, 2.1ms between boxes. I believe the > >cards are Lynksys Ethernet cards (I know the have the dec chip/tulip driver), > >and we are using 2 ExtremeNetworks 48 port switches (2 Gigabit uplinks > >on each). > > > >(Not that it is a driver issue, I thought I should just mention it) > > This seems to occur with the older driver: > > tulip.c:v0.83 10/19/97 > > And with the newer driver also: > > tulip.c:v0.91e 5/27/99 > > > >The OS is Red Hat linux 5.2, with Linux 2.1.125 #5 SMP, i686 > >(We plan to move on to Linux 2.2.*, after we are confident we > > have the performance/stability we are happy with at this point, > > and verify Linux 2.2.* works better, which I am sure it will > > since these are DUAL CPU boxes). > > > >Does anyone have any comments/ideas? Also what type of simple ping > >performance/loss for larger packets are people seeing? > > > >Also is there any 'special' mods to the kernel made to help improve > >performance for beowulf clusters? (Mike Warren?? have you released > >those changes). (I know, I know.. it is not a SGI machine.. (^8 ). > > > >Mark > >-- > >Mark Dalton CH3-S-CH2 H H O H > >Silicon Graphics, Inc. 
| | | \ | > >Eagan, MN 55121 CH2-C-COO //\ ---C--CH2-C-COO C-CH2-C-COO > >mwd@sgi.com | | || || | // | > > NH3 \\/ \ / CH NH3 O NH3 > > NH > >My home page: http://www.cbc.umn.edu/~mwd/mwd.html > >Cell Biology: http://www.cbc.umn.edu/~mwd/cell.html > >BEAM Robotics pages: > > Beam-Online: http://www.beam-online.com/ > > Tek FAQ: http://people.ne.mediaone.net/bushbo/beam/FAQ.html > > Chiu-Yuan's: http://www.geocities.com/SouthBeach/6897/ > > > > > Paul Eduard Schenker > 1 Peirce Hill > Singapore 248558 > > Phone: 476 2245 > Fax: 472 6480 > email: pesch@ibm.net > -- Mark Dalton CH3-S-CH2 H H O H Silicon Graphics, Inc. | | | \ | Eagan, MN 55121 CH2-C-COO //\ ---C--CH2-C-COO C-CH2-C-COO mwd@sgi.com | | || || | // | NH3 \\/ \ / CH NH3 O NH3 NH My home page: http://www.cbc.umn.edu/~mwd/mwd.html Cell Biology: http://www.cbc.umn.edu/~mwd/cell.html BEAM Robotics pages: Beam-Online: http://www.beam-online.com/ Tek FAQ: http://people.ne.mediaone.net/bushbo/beam/FAQ.html Chiu-Yuan's: http://www.geocities.com/SouthBeach/6897/ From jav@blazenet.net Sun, 6 Jun 1999 10:47:30 -0400 Date: Sun, 6 Jun 1999 10:47:30 -0400 From: jav jav@blazenet.net Subject: Smallest Linux PC On Earth? Good Beowulf node? Christopher, I am actually involved with a project similar to the one that you have pointed out called the Linux Router Project (http://www.linuxrouter.org), the concept is the same, except our total image fits on 1.44 meg and expands into a RAMdisk. The problem that begins is that when you start getting involved with the addidition of either enough memory, or a physical hardware storage device, you begin to increase costs significantly. If cost is no object, then using a kernel such as the LRP (based on Debian), becomes a wonderful concept. The machines can be built very, very small, very few overhead daemons running and you can fit about 8 of them into the same space as a mid-tower... The positives, you're talking about a very small footprint, with low overhead, and substancial power. 
The negatives, when you're talking about the systems such as these, you are talking about systems which are designed with purpose built-modified distributions... If you think that you're going to get involved with the notion of building a cluster out of one of these devices, please feel free to contact me for questions. John > -----Original Message----- > From: Christopher Snyder [SMTP:csnyder1@cwix.com] > Sent: Saturday, 05 June, 1999 23:05 > To: beowulf@beowulf.gsfc.nasa.gov > Subject: Smallest Linux PC On Earth? Good Beowulf node? > > Hey there, > > A few weeks ago I saw some stuff on the web about the smallest Linux > computer on Earth. > It's a German made PC, about 4 inches square, 2 inches thick, with a > tiny solid state hard drive or IBM drive > and all the works. (Original development was for embedded systems) > > I think the story was that for about 920 bucks $US$ these guys put > together a node with CPU, RAM, and disk > in a box about the size of a VCR cassette, I thought it might make a > great tool for a Beowulf system, but maybe not? > (Consider having 64 mini nodes in a box a little bigger than the size > of one PC - full sized tower, sitting by your desk, just crank up the > air conditioning...) > > I read that they even are using this little PC as a web server serving > up their site on the web. Where, I do not know. > > Anyway, If anyone has read about this too, please let me know. I seem > to have lost track of the site, the manufacturer, and the original > story! But I remember it was basically low cost, small and they even > had instructions on putting a node all together... > > Regards, > > C.S. << File: ATT00006.html >> From jteneyck@xyos.net Sun, 6 Jun 1999 11:06:46 -0400 Date: Sun, 6 Jun 1999 11:06:46 -0400 From: John M. TenEyck jteneyck@xyos.net Subject: beowulf cluster hello, I have the option of buying machines with 2mb L2 cache. 
while this is very nice, I was wondering if anyone had any input on if this is worth the extra money, when being considered in a cluster configuration. thanks, John TenEyck _________________________________________________________________________ John TenEyck jteneyck@xyos.net http://jteneyck.xyos.net 409.229.8954 .-. __ _____ ____ ___ __ /v\ / / / _/ | / / / / / |/ / / \ / / / // |/ / / / /| / /( )\ / /____/ // /| / /_/ // | ^^-^^ /_____/___/_/ |_/_____//_/|_| >Phear The Penguin< If you refuse to accept anything but the best you very often get it. _________________________________________________________________________ From wmilas@rarcoa.com Sun, 6 Jun 1999 12:34:22 -0400 Date: Sun, 6 Jun 1999 12:34:22 -0400 From: Wayde Milas wmilas@rarcoa.com Subject: hard disk reliability Paul Eduard Schenker wrote: > > Have you measured the temperatures of the drives? - or do "hot" and "cool" > represent subjective values... > It is subjective. Id classify the Seagate Baracudas(sp)? 10000 rpm as HOT. a plain old vanilla 5400 rpm lvd as cool... If you can touch the drive witholut hurting yourself after its been active for 2 hours, its cool. otherwise its hot. :P Ibms some where inbetween... Never said it was scientific, Just personal experience. Hot drives tend to fail more often. Wayde From voda@vision.doa.org Sun, 6 Jun 1999 13:13:30 -0400 Date: Sun, 6 Jun 1999 13:13:30 -0400 From: Leif Hardison voda@vision.doa.org Subject: Cables The reason you may notice improvement by using less pairs is because you cut down on factors like interferance. Also if you wish to get the best performance you will want to crimp wires from loosely packed spools and make sure there are no tight bends. If you bend a wire at to sharp of an angle you will loose performance. I can snag some diagrams if anyone wishes. -Leif Hardison hardware.doa.org On Sun, 6 Jun 1999, Paul Eduard Schenker wrote: > We looked at SCSI and dropped it; Myrinet looks good if you really need the > bandwith. 
> > About FastEthernet: we've exxperimented with cables with only 4 wires, and > they seem to work quite well (in some cases we observed a perfomance > improvement for which I have absolutely no explanation). Any experience in > this field? > > Paul > > At 04:29 PM 6/1/99 +0200, Martin Konold wrote: > >On Tue, 1 Jun 1999, Alvin Starr wrote: > > > >> At 80Mbytes/sec SCSI can make for a fast link between a small number of > >> systems and with a low overhead protocol it could help solve some of the > >> problems involved in trying to share memory across a network. > > > >No, unfortunately the overhead of SCSI is compared to SCI and Myrinet > >tremendeous. (Latencies in the ms range) > > > >Regards, > >-- martin > > > >// Martin Konold, Herrenbergerstr. 14, 72070 Tuebingen, Germany // > >// Email: konold@kde.org // > >KDE: A stable GUI for a reliable OS. > >UNIX: Everything including a device is a file. > >KDE: Everything including a file is a URL. > > > > > > > Paul Eduard Schenker > 1 Peirce Hill > Singapore 248558 > > Phone: 476 2245 > Fax: 472 6480 > email: pesch@ibm.net > From konold@alpha.tat.physik.uni-tuebingen.de Sun, 6 Jun 1999 15:14:37 -0400 Date: Sun, 6 Jun 1999 15:14:37 -0400 From: Martin Konold konold@alpha.tat.physik.uni-tuebingen.de Subject: Cables On Sun, 6 Jun 1999, Alan Cox wrote: > > If you are looking for (800MBytes) aggregate performance try SCI (Scaleable > > Coherent Interface) which also has an extremely low latency (better than > > Myrinet). > > Is there are source for Open Source SCI yet or do you still have to pray > your cluster doesnt get obsoleted by some random third party ? Last time I talk to the relevant people they are indeed considering to open up at least the kernel drivers. I would not expect to get an OpenSourced SCAMPI anytime soon though. Siemens is right now actively marketing Linux and Solaris SCI Clusters (hpc-line) together with Dolphin and of course Scali. Regards, -- martin // Martin Konold, Herrenbergerstr. 
14, 72070 Tuebingen, Germany // // Email: konold@kde.org // KDE: A stable GUI for a reliable OS. UNIX: Everything including a device is a file. KDE: Everything including a file is a URL. From hmm@patmos-international.com Sun, 6 Jun 1999 15:35:26 -0400 Date: Sun, 6 Jun 1999 15:35:26 -0400 From: Howard Miller hmm@patmos-international.com Subject: Cables On Sat, 5 Jun 1999, Keith Murphy wrote: > Check our Dolphin Interconnect and SCALI sites www.dolphinics.com and I thought that someone wrote an open-source driver specifically for some of the Dolphin equipment, though, for the life of me, I can't remember or find who or where. I do remember that the driver lacked some backwards-compatibility that the software from Dolphin offers, but supposedly, for the purposes of the university that used it, it worked flawlessly. If anyone is particularly interested in this, I'm sure I could track down the source again. -- Howard Miller From beckman@acl.lanl.gov Sun, 6 Jun 1999 16:20:18 -0400 Date: Sun, 6 Jun 1999 16:20:18 -0400 From: Pete Beckman beckman@acl.lanl.gov Subject: Cables At 04:20 PM 6/5/99 -0700, Keith Murphy wrote: >If you are looking for (800MBytes) aggregate performance try SCI (Scaleable >Coherent Interface) which also has an extremely low latency (better than >Myrinet). I don't believe that is true. Measuring latency is a place where poor benchmarks and clever tricks abound. Unfortunately, there is no standard definition of "latency", just as some people say a flop is a multiply and add, and some people count a multiply and an add as two flops. SCI publishes that latency could be as low as 2.3 microseconds. The thing to remember is that all of these high-performance cards (HiPPI, Myrinet, SCI) are generally limited first by the PCI bus. It often takes nearly a microsecond to get anything from memory to the PCI bus and out the interface of the card.
Then, ignoring time in flight, it takes about another microsecond to get the data from the interface card over the PCI bus to the system memory. That's nearly 2 microseconds right there. Now here comes the tricky part... measuring latency. How do you time it? How do you know when the data arrived? Do you have the remote CPU spin wait on a memory location? Does the card raise an interrupt (slow)? What do you call latency? The time it takes for the sender to initiate the transfer until the data has arrived, or until the data has been detected to arrive? I've seen people measure it both ways. Does the data have to be page aligned? Word aligned? Some interfaces have problems with unaligned data. Does the benchmark include a switch, or for the benchmark have the two machines been plugged together with the equivalent of a cross-over cable? Does the latency benchmark time an unreliable send (no checksum to detect message corruption) or does the interface card buffer and then check the integrity of the data before putting it in memory? Anyway, as you can see, head-to-head latency tests rarely happen. The only fair number is run a standard MPI latency ping/pong test, and report those latencies. Unfortunately, people rarely post their MPI latency measuring code, so we can't even do head to head comparisons there. The SCALI web pages say: "ScaMPI's message latency, measured as half the round-trip delay of a zero length MPI message, is less than 10 µsec". The BIP folks in France have an MPI over BIP that they report has a latency of 12 µsec, and a bandwidth of 1 Gb/sec for a 8MB message. As you can see, apples to apples is hard. -Pete --- ======================================================================== | Peter H. 
Beckman | Advanced Computing Laboratory | | Los Alamos National Laboratory | Phone: 505-665-0800 | | CIC/ACL MS-B287 | Fax: 505-665-4939 | | Los Alamos, NM 87545 | email: beckman@acl.lanl.gov | ======================================================================== From lindahl@cs.virginia.edu Sun, 6 Jun 1999 16:26:55 -0400 Date: Sun, 6 Jun 1999 16:26:55 -0400 From: Greg Lindahl lindahl@cs.virginia.edu Subject: Cables > In the meantime it is the > fastest interface available today and makes an ideal Beowulf interface. Please avoid generalizations. It isn't ideal if much cheaper hardware can do the same job -- maybe your application doesn't need that much network, or isn't sensitive to latency? And you still can't buy huge SCI switches, which makes it inferior to Myrinet for large systems. Linux drivers, or better yet, open-source drivers will be a big step forward for SCI and cluster computing. But you still have to look at price/performance. -- g From dhart@indiana.edu Sun, 6 Jun 1999 16:32:50 -0400 Date: Sun, 6 Jun 1999 16:32:50 -0400 From: Dave Hart dhart@indiana.edu Subject: Question on network performance.. Tulip.c At 11:29 PM 6/5/1999 -0500, Mark Dalton wrote: > >I have been helping a friend (or trying to) with his cluster. > >One thing that is interesting is that if I send larger packets >I start loosing some of the data. ... >The OS is Red Hat linux 5.2, with Linux 2.1.125 #5 SMP, i686 >(We plan to move on to Linux 2.2.*, after we are confident we > have the performance/stability we are happy with at this point, > and verify Linux 2.2.* works better, which I am sure it will > since these are DUAL CPU boxes). There were some TCP problems in kernel versions before about 2.2.5, much discussed here about three months ago. For more information, see http://www.icase.edu/coral/LinuxTCP.html Suggestion: upgrade immediately. We're at 2.2.7 now and things are much better.
-- David Hart http://php.indiana.edu/~dhart Research Computing Support 812-855-2632 University Information Technology Services Indiana University From lindahl@cs.virginia.edu Sun, 6 Jun 1999 17:12:22 -0400 Date: Sun, 6 Jun 1999 17:12:22 -0400 From: Greg Lindahl lindahl@cs.virginia.edu Subject: Cables > Anyway, as you can see, head-to-head latency tests rarely happen. The only > fair number is run a standard MPI latency ping/pong test, and report those > latencies. Unfortunately, people rarely post their MPI latency measuring > code, so we can't even do head to head comparisons there. Sounds like a call for consumer action: We should nag vendors to use the "mpptest" program distributed with mpich as their test program. If they don't do that, we should collect the information ourselves. -- greg From spiffy@tamu.edu Sun, 6 Jun 1999 20:01:58 -0400 Date: Sun, 6 Jun 1999 20:01:58 -0400 From: Scott Patrick Faasse spiffy@tamu.edu Subject: some newbie questions Okay I am new to the cluster world. My boss gave me permission to play with some old hardware (P5-75's with 32 Meg RAM, 700 Meg Hd's and Ne2000 10BaseT) So i installed extreme linux 5.0 and got the "render-farm" going (i call it that cause all we have done on it is run pvmpov :) ) now for some questions. what are the advantages of using PVM or LAM? are there any good benchmarks out there that report how many FLOPS or MIPS a cluster runs at ie bogomips for beowulf.? -spiffy ---------------------------------------------------- Scott "spiffy" Faasse web: temporarily unavailable email: spiffy@tamu.edu ---------------------------------------------------- From admin@cersa.admu.edu.ph Sun, 6 Jun 1999 21:16:59 -0400 Date: Sun, 6 Jun 1999 21:16:59 -0400 From: William Emmanuel S. Yu admin@cersa.admu.edu.ph Subject: cannot spawn node when using comppi i downloaded a pvm program called comppi that solves for pi. the problem is that there is an error because the program terminate after i do a pvm_send() in the code. 
What could the possible errors be? I doubt that the code is bad; it could be the PVM connection to the node. The sample PVM programs such as hello and hello_other do not work either: they cannot spawn a task. pvmpov does run, but I think it did not farm out the work, because when I ran top there was no x-pvmpov process on the nodes. There is nothing in the pvml.501 log file. What could be wrong? ------------------------------ William Emmanuel S. Yu william.s.yu@ieee.org ------------------------------ From wiseowl@accessgate.net Sun, 6 Jun 1999 21:40:38 -0400 Date: Sun, 6 Jun 1999 21:40:38 -0400 From: Doug Shubert wiseowl@accessgate.net Subject: Cables Pete Beckman wrote: > At 04:20 PM 6/5/99 -0700, Keith Murphy wrote: > >If you are looking for (800MBytes) aggregate performance try SCI > (Scaleable > >Coherent Interface) which also has an extremely low latency (better > than > >Myrinet). > > It often takes nearly a > microsec to get anything from memory to the PCI bus and out the > interface > of the card. Then, ignoring time in flight, it takes about another > microsecond to get the data from the interface card over the PCI bus > to the > system memory. That's nearly 2 microseconds right there. > I was wondering if the i810 Accelerated Hub Architecture (PCI at 266 MB/s) will provide lower latencies than the current PCI at 133 MB/s? The memory controller and system memory, running at 100 MHz, may still be a bottleneck, though. Doug From pesch@ibm.net Sun, 6 Jun 1999 22:29:52 -0400 Date: Sun, 6 Jun 1999 22:29:52 -0400 From: Paul Eduard Schenker pesch@ibm.net Subject: Cables Agreed, a latency figure on its own doesn't say a lot. It looks like some standards for measuring overall cluster performance are needed - probably best if they came from an authoritative source like NASA. However, I'm told that with SCI you can have a sustainable data transfer of 60 MB/s, which compares well with Myrinet...
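On measuring latency: the "half the round-trip delay" definition quoted in this thread can be sketched as a ping-pong loop. The sketch below is Python over a local socket pair rather than MPI over SCI or Myrinet, so it only illustrates the method. The function names, the 1-byte payload (a stream socket cannot carry a zero-length message) and the round count are all my own assumptions, and the numbers it prints say nothing about any interconnect.

```python
import socket
import threading
import time


def recv_exact(sock, size):
    """Read exactly `size` bytes from a stream socket."""
    buf = bytearray()
    while len(buf) < size:
        chunk = sock.recv(size - len(buf))
        if not chunk:
            raise ConnectionError("peer closed early")
        buf.extend(chunk)
    return bytes(buf)


def echo(sock, rounds, size):
    """The 'pong' side: send every message straight back."""
    for _ in range(rounds):
        sock.sendall(recv_exact(sock, size))


def half_round_trip(rounds=1000, size=1):
    """Average one-way latency in seconds, taken as half the round trip."""
    a, b = socket.socketpair()
    worker = threading.Thread(target=echo, args=(b, rounds, size))
    worker.start()
    payload = b"x" * size
    start = time.perf_counter()
    for _ in range(rounds):
        a.sendall(payload)       # ping
        recv_exact(a, size)      # wait for the pong
    elapsed = time.perf_counter() - start
    worker.join()
    a.close()
    b.close()
    return elapsed / rounds / 2


if __name__ == "__main__":
    print("half round trip: %.1f microseconds" % (half_round_trip() * 1e6))
```

Even in this toy form, the open questions from the thread show up: the result depends on how the receiver waits for data, on the message size, and on whether a switch sits in the path, which is why a shared, published test program matters.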
Paul At 02:21 PM 6/6/99 -0600, Pete Beckman wrote: >At 04:20 PM 6/5/99 -0700, Keith Murphy wrote: >>If you are looking for (800MBytes) aggregate performance try SCI (Scaleable >>Coherent Interface) which also has an extremely low latency (better than >>Myrinet). > >I don't believe that is true. Measuring latency is a place where poor >benchmarks and clever tricks abound. Unfortunately, there is no standard >definition of "latency", just like some people say a Flop is a multiply and >add, and some people count a multiple and an add as two flops. SCI >publishes that latency could be as low as 2.3 microseconds. The thing to >remember is that all of these high-performance cards (HiPPI, Myrinet, SCI) >are generally limited first by the PCI bus. It often takes nearly a >microsec to get anything from memory to the PCI bus and out the interface >of the card. Then, ignoring time in flight, it takes about another >microsecond to get the data from the interface card over the PCI bus to the >system memory. That's nearly 2 microseconds right there. > >Now here comes the tricky part... measuring latency. How do you time it? >How do you know when the data arrived? Do you have the remote CPU spin >wait on a memory location? Does the card raise an interrupt (slow)? What >do you call latency? The time it takes for the sender to initiate the >transfer until the data has arrived, or until the data has been detected to >arrive? I've seen people measure it both ways. Does the data have to be >page aligned? Word aligned? Some interfaces have problems with unaligned >data. Does the benchmark include a switch, or for the benchmark have the >two machines been plugged together with the equivalent of a cross-over >cable? Does the latency benchmark time an unreliable send (no checksum to >detect message corruption) or does the interface card buffer and then check >the integrity of the data before putting it in memory? > >Anyway, as you can see, head-to-head latency tests rarely happen. 
The only >fair number is run a standard MPI latency ping/pong test, and report those >latencies. Unfortunately, people rarely post their MPI latency measuring >code, so we can't even do head to head comparisons there. The SCALI web >pages say: "ScaMPI's message latency, measured as half the round-trip delay >of a zero length MPI message, is less than 10 µsec". The BIP folks in >France have an MPI over BIP that they report has a latency of 12 µsec, and >a bandwidth of 1 Gb/sec for a 8MB message. As you can see, apples to >apples is hard. > >-Pete > > > >--- >======================================================================== >| Peter H. Beckman | Advanced Computing Laboratory | >| Los Alamos National Laboratory | Phone: 505-665-0800 | >| CIC/ACL MS-B287 | Fax: 505-665-4939 | >| Los Alamos, NM 87545 | email: beckman@acl.lanl.gov | >======================================================================== > > > Paul Eduard Schenker 1 Peirce Hill Singapore 248558 Phone: 476 2245 Fax: 472 6480 email: pesch@ibm.net From ulairi@ecs.csun.edu Sun, 6 Jun 1999 22:34:52 -0400 Date: Sun, 6 Jun 1999 22:34:52 -0400 From: Ulairi ulairi@ecs.csun.edu Subject: L2 cache (was: beowulf cluster) I would say yes. Especially if your CPU will be doing a lot of number crunching with laaaaarge amounts of data fed to/from it. (mostly to) | | hello, | | I have the option of buying machines with 2mb L2 cache. while this is | very nice, I was wondering if anyone had any input on if | this is worth | the extra money, when being considered in a cluster configuration. | | | thanks, | John TenEyck From pesch@ibm.net Sun, 6 Jun 1999 22:38:33 -0400 Date: Sun, 6 Jun 1999 22:38:33 -0400 From: Paul Eduard Schenker pesch@ibm.net Subject: Cables Do you have diagrams that show the decrease of performance due to tight bends? (I would assume the tighter the bend the less performance); I would be very interested to see them. 
Paul At 01:01 PM 6/6/99 -0400, Leif Hardison wrote: > > The reason you may notice improvement by using less pairs > is because you cut down on factors like interferance. Also > if you wish to get the best performance you will want to > crimp wires from loosely packed spools and make sure there > are no tight bends. If you bend a wire at to sharp of an > angle you will loose performance. > > I can snag some diagrams if anyone wishes. > > -Leif Hardison > hardware.doa.org > >On Sun, 6 Jun 1999, Paul Eduard Schenker wrote: > >> We looked at SCSI and dropped it; Myrinet looks good if you really need the >> bandwith. >> >> About FastEthernet: we've exxperimented with cables with only 4 wires, and >> they seem to work quite well (in some cases we observed a perfomance >> improvement for which I have absolutely no explanation). Any experience in >> this field? >> >> Paul >> >> At 04:29 PM 6/1/99 +0200, Martin Konold wrote: >> >On Tue, 1 Jun 1999, Alvin Starr wrote: >> > >> >> At 80Mbytes/sec SCSI can make for a fast link between a small number of >> >> systems and with a low overhead protocol it could help solve some of the >> >> problems involved in trying to share memory across a network. >> > >> >No, unfortunately the overhead of SCSI is compared to SCI and Myrinet >> >tremendeous. (Latencies in the ms range) >> > >> >Regards, >> >-- martin >> > >> >// Martin Konold, Herrenbergerstr. 14, 72070 Tuebingen, Germany // >> >// Email: konold@kde.org // >> >KDE: A stable GUI for a reliable OS. >> >UNIX: Everything including a device is a file. >> >KDE: Everything including a file is a URL. 
>> > >> > >> > >> Paul Eduard Schenker >> 1 Peirce Hill >> Singapore 248558 >> >> Phone: 476 2245 >> Fax: 472 6480 >> email: pesch@ibm.net >> > > > Paul Eduard Schenker 1 Peirce Hill Singapore 248558 Phone: 476 2245 Fax: 472 6480 email: pesch@ibm.net From pesch@ibm.net Sun, 6 Jun 1999 22:45:51 -0400 Date: Sun, 6 Jun 1999 22:45:51 -0400 From: Paul Eduard Schenker pesch@ibm.net Subject: hard disk reliability The reason I asked is that I'veheard that drives - before they fail - tend to run warmer. At 11:34 AM 6/6/99 -0500, Wayde Milas wrote: >Paul Eduard Schenker wrote: >> >> Have you measured the temperatures of the drives? - or do "hot" and "cool" >> represent subjective values... >> > >It is subjective. Id classify the Seagate Baracudas(sp)? 10000 rpm as >HOT. a plain old vanilla 5400 rpm lvd as cool... If you can touch the >drive witholut hurting yourself after its been active for 2 hours, its >cool. otherwise its hot. :P > >Ibms some where inbetween... > >Never said it was scientific, Just personal experience. Hot drives tend >to fail more often. > >Wayde > > Paul Eduard Schenker 1 Peirce Hill Singapore 248558 Phone: 476 2245 Fax: 472 6480 email: pesch@ibm.net From jav@blazenet.net Mon, 7 Jun 1999 01:42:44 -0400 Date: Mon, 7 Jun 1999 01:42:44 -0400 From: jav jav@blazenet.net Subject: hard disk reliability Two cents on this end. Whenever a mechanical device spins, a certain amount of friction will be passed through the bearings, shaft, and any other conductive surface attached to either of those. Spinning two devices built similarly at different speeds will show a heat difference, with the faster spinning device running with more heat. The heat produced is not so significant that it will (relatively) quickly deteriorate the metals involved, but over a certain amount of time, this heat will cause failure. Therefore, if the drives are built with similar materials, in a similar manner, the faster spinning drive will fail sooner. 
The heat produced *should not* effect the magnetic fields involved with the storage of the data. Although, with the crap we put drives through while crunching on a 'wulf, the mechanical failure due to thermal deterioration will come into play and the faster spinning drives will fail sooner. BTW, I've had a number of failures, but I like two drives for my systems, Seagate Cheetahs and the Compaq approved variations of the Seagates (10k RPM 9.1GB hot swap). They both have done well and both manufacturers have been pretty good about getting drives replaced quickly. I must say, however that I use RAID0 (2 drives) on each node and therefore I don't have that much of a problem even my systems are just for pure Beowulf R&D and I don't run mission critical apps on them at all. John > -----Original Message----- > From: Paul Eduard Schenker [SMTP:pesch@ibm.net] > Sent: Sunday, 06 June, 1999 22:51 > To: wmilas@rarcoa.com > Cc: Bob Cat; Greg Lindahl; beowulf@beowulf.gsfc.nasa.gov > Subject: Re: hard disk reliability > > The reason I asked is that I'veheard that drives - before they fail - > tend > to run warmer. > > At 11:34 AM 6/6/99 -0500, Wayde Milas wrote: > >Paul Eduard Schenker wrote: > >> > >> Have you measured the temperatures of the drives? - or do "hot" and > "cool" > >> represent subjective values... > >> > > > >It is subjective. Id classify the Seagate Baracudas(sp)? 10000 rpm as > >HOT. a plain old vanilla 5400 rpm lvd as cool... If you can touch the > >drive witholut hurting yourself after its been active for 2 hours, > its > >cool. otherwise its hot. :P > > > >Ibms some where inbetween... > > > >Never said it was scientific, Just personal experience. Hot drives > tend > >to fail more often. > > > >Wayde > > > > > Paul Eduard Schenker > 1 Peirce Hill > Singapore 248558 > > Phone: 476 2245 > Fax: 472 6480 > email: pesch@ibm.net > > From bkuhn@ebb.org Mon, 7 Jun 1999 03:02:33 -0400 Date: Mon, 7 Jun 1999 03:02:33 -0400 From: Bradley M. 
Kuhn bkuhn@ebb.org Subject: Computer Science research done on Beowulf class systems I am posting to ask what (if any) types of Computer Science research is being done on Beowulf-class systems. Our Computer Science department is considering building one. However, there is some concern that this computer will be more helpful to the rest of the science departments than to the Computer Science department. I realize that "navel-gazing" research into making Beowulf systems better, faster, and more reliable is certainly possible, and projects like the one at NASA and the Mosix project are doing this type of research. I also know that work to make automatically parallelizing compilers (an active area of research in the compiler design community) is very possible. However, what I am looking for is information about *real* projects using Beowulf-class computers for Computer Science research. I have found lots of information on various aerospace, geological, and other scientific problems being solved with Beowulf class systems. However, I don't see lots of Computer Science projects using these systems. If anyone could tell me about such projects, I would much appreciate it. -- - bkuhn@ebb.org - Bradley M. Kuhn - bkuhn@gnu.org - http://www.ebb.org/bkuhn From eugene.leitl@lrz.uni-muenchen.de Mon, 7 Jun 1999 03:28:23 -0400 Date: Mon, 7 Jun 1999 03:28:23 -0400 From: Eugene Leitl eugene.leitl@lrz.uni-muenchen.de Subject: the need for speed Just saw it on /. Overclocking might be anathema, yet cryo-overclocking perhaps not... http://www.wizard.com/users/scfoster/public_html/ IT'S COOL'N TIME The Project. I had read just about everything you could imagine on cooling systems for computers, looked at all the "projects" posted on the web sites, and looked at both KryoTech® and Asetek® and found nothing for dual processor systems that would go really low temp. Were talking sub -50C temps. 
The Kryo Tech® and Asetek® both claim "potential" -40C but because of our environmentally friendly R-134a freon, this is pushing the outer limits of the efficiency of this freon. These systems might achieve -40C under ideal, no load conditions, but could never maintain it under an actual max'd out processor load, and it is definitely not the temperature within the core of the processor. I had three obstacles to overcome, space constraints, two processors and commercially available components. The system had to be compact, it had to cool two PIII processors and everything had to be available over the counter or on the internet. Because of the desire to achieve sub -50C temps, I determined that this was a two stage approach. I could achieve exchanger temperature of -30C to -35C with a freon based cooling system but would have to resort to Thermal Electric's (peltiers) to bring it down to the target temps. [...] From knuto@scali.no Mon, 7 Jun 1999 04:42:31 -0400 Date: Mon, 7 Jun 1999 04:42:31 -0400 From: Knut Omang knuto@scali.no Subject: Cables Howard Miller wrote: > > On Sat, 5 Jun 1999, Keith Murphy wrote: > > > Check our Dolphin Interconnect and SCALI sites www.dolphinics.com and > > I thought that someone wrote an open-source driver specifically for some > of the Dolphin equipment, though, for the life of me, I can't remeber or > find who or where. I do remeber that the driver lacked some > backwards-compatability that the software from Dolphin offers, but > supposedly, for the purposes of the university that used it, it worked > flawlessly. That's correct. I wrote that driver together with two colleagues at the University of Oslo, but as you can see from my signature, Scali gave us an offer we could not refuse :-) so it ended up as Scali property. Since then it has of course been subject to major development, particularly in the direction of scalability. As for backwards compatibility, you are probably referring to the older SBus cards.
Anyway these cards are not really interoperable with newer PCI cards since they use an older SCI chipset with a lower SCI signal rate. Wrt. our API, one of our reasons for designing a new API was our experiences with the existing one, so our API should really (hopefully..) be considered a progress. We are continuously evaluating how to deal with the source question and the driver which has the major part with full rights owned by Scali, but a small part - closest to the hardware - which is under a non-disclosure with Dolphin. In any case we are not religious about the driver source and have provided it under NDA to a number of customers already under Dolphin NDA. Driver and user API source is indeed designed from the start to be easy to compile and port to new architectures, and uses the GNU autoconfiguration utilities etc. Greg Lindahl wrote: > > > > In the meantime it is the > > fastest interface available today and makes an ideal Beowulf interface. > > Please avoid generalizations. It isn't ideal if much cheaper hardware > can do the same job -- maybe your application doesn't need that much > network, or isn't sensitive to latency? And you still can't buy huge > SCI switches, which makes it inferior to Myrinet for large systems. Our Scali systems are delivered with two SCI interfaces in each node, currently this allows (hw only) switching of packets in a 2D-mesh. We are going to get a 3D-mesh in the coming next generation hw. With our system you avoid the cost of expensive switches, since all nodes are their own switch. Today we have installed a system with a 12x8 mesh (96 nodes) (in Paderborn, Germany), with 3D in theory at least a 10x10x10 system should be doable. Knut Omang, Ph.D. 
Senior Software Architect, Scali AS Computer Systems e-mail: knuto@scali.com Voice: +47 63 84 67 09 / +47 22 50 14 11 http://www.scali.com Fax: +47 63 84 40 05 From JesseP@europe.stortek.com Mon, 7 Jun 1999 05:08:25 -0400 Date: Mon, 7 Jun 1999 05:08:25 -0400 From: Jessen, Per JesseP@europe.stortek.com Subject: hard disk reliability > -----Original Message----- > From: Greg Lindahl [mailto:lindahl@cs.virginia.edu] > Sent: 04 June 1999 17:10 [snip] > > 5) I think we'll find that power supply and CPU fans are > the most common > > failure points. > > But if you buy good ones, I have proof that they rarely fail. As I And even good CPU fans aren't that expensive. As you will find out, when you call up and ask to buy e.g. 40 or 50 CPU fans, they suddenly become very affordable. (not surprisingly) I bought 35 'Pabst' fans with heatsinks last year, at about 50% of their listed retail price. For fans (CPU and others) of varying sorts, see e.g. http://www.pabst.de. (they were quite happy to ship to the UK, so I'm assuming that goes for the rest of Europe too). regards, Per Jessen ENIDAN Technologies, London From hanzl@noel.feld.cvut.cz Mon, 7 Jun 1999 05:29:13 -0400 Date: Mon, 7 Jun 1999 05:29:13 -0400 From: Vaclav Hanzl hanzl@noel.feld.cvut.cz Subject: Dataless nodes using Coda? > One issue I see so far is that RedHat doesn't make it really easy to have a > shared /usr configuration (or at least they don't advertise the fact) and > most packages don't really advertise what they put in /usr and what they > put elsewhere (/etc, /sbin). I am using /usr shared readonly via NFS with RedHat 5.2. It is possible but not quite clean. The problems I encountered was as follows: 1) Cannot use amd to automount /usr. Had to put it to /etc/fstab. Sorry, no server redundance. (Wanted to use two identical servers.) 2) Linuxconf complains during startup. I am not using it on client nodes, so I do not care. No harm detected so far. 3) root login impossible when client fails to mount /usr. 
This is a nuisance. Login prompt invites you, but then there probably is some library located on /usr missing. I created my server by normal RH install, then copied everything but /usr to every client node and then used some scripts to change some config files. Cluster is usable, with the above mentioned problems (all of which could be solved if I had time to look after it). Regards Vaclav Hanzl From mack.joseph@epa.gov Mon, 7 Jun 1999 07:10:35 -0400 Date: Mon, 7 Jun 1999 07:10:35 -0400 From: Joseph Mack mack.joseph@epa.gov Subject: Proxy Server? Carlo Perassi wrote: > > what's > the best way to realize a "Super" proxy server? Have a look at the Linux Virtual Server Project http://proxy.iinchina.net/~wensong/ippfvs Joe -- Joseph Mack PhD, Senior Systems Engineer, Lockheed Martin contractor to the National Environmental Supercomputer Center, mailto:mack.joseph@epa.gov ph# 919-541-0007, RTP, NC, USA From deadline@plogic.com Mon, 7 Jun 1999 08:00:04 -0400 Date: Mon, 7 Jun 1999 08:00:04 -0400 From: Douglas Eadline deadline@plogic.com Subject: some newbie questions On Sun, 6 Jun 1999, Scott Patrick Faasse wrote: > > Okay I am new to the cluster world. My boss gave me permission to play > with some old hardware (P5-75's with 32 Meg RAM, 700 Meg Hd's and Ne2000 > 10BaseT) So i installed extreme linux 5.0 and got the "render-farm" going > (i call it that cause all we have done on it is run pvmpov :) ) > > now for some questions. what are the advantages of using PVM or LAM? > are there any good benchmarks out there that report how many FLOPS or MIPS > a cluster runs at ie bogomips for beowulf.? Good start. Now get rid of the EL Disk. See: http://www.dnaco.net/~kragen/beowulf-faq.txt Doug ------------------------------------------------------------------- Paralogic, Inc. 
| PEAK | Voice:+610.861.6960 115 Research Drive | PARALLEL | Fax:+610.861.8247 Bethlehem, PA 18017 USA | PERFORMANCE | http://www.plogic.com ------------------------------------------------------------------- From josip@icase.edu Mon, 7 Jun 1999 08:20:48 -0400 Date: Mon, 7 Jun 1999 08:20:48 -0400 From: Josip Loncaric josip@icase.edu Subject: hard disk reliability Paul Eduard Schenker wrote: > > What brand were the disks? > > Paul I should have mentioned that. The disks were Seagate Medalist Pro ST36530A (6.5GB, 7200 RPM, Ultra ATA). See: http://www.seagate.com:80/cda/disc/tech/detail/0,1248,89,00.shtml Supposedly, they have 400,000 hour MTBF rating, but our experience has been 5 failures (3 recoverable and 2 nonrecoverable) out of 33 units during the first six months. Josip -- Dr. Josip Loncaric, Senior Staff Scientist mailto:josip@icase.edu ICASE, Mail Stop 132C http://www.icase.edu/~josip/ NASA Langley Research Center mailto:j.loncaric@larc.nasa.gov Hampton, VA 23681-2199, USA Tel. +1 757 864-2192 Fax +1 757 864-6134 From P.Andersen@coe.ttu.edu Mon, 7 Jun 1999 08:36:34 -0400 Date: Mon, 7 Jun 1999 08:36:34 -0400 From: Andersen, Per P.Andersen@coe.ttu.edu Subject: Computer Science research done on Beowulf class systems We use our cluster for teaching and research. Like you said a lot of the research is in other disciplines. The CS research on our modest cluster is in the area of load balancing, scheduling and new parallel programming languages. By mixing various systems with different CPUs in terms of speed we create some interesting load balancing challenges. Cluster scheduling has always been a problem particularly when a large number of students are using the cluster as part of their course work in this area we have investigated scheduling packages developed by others. The parallel languages development is a new research project we are just getting under way. 
Specifically we are developing SequenceL, a declarative language Dan Cooke our CS chair has developed, into a cluster programming language. SequenceL has some very important implicit parallelisms that will make programming parallel systems easier, or at least that's our hope. Per Andersen, MS, P.E. Director Advanced Computing Facility TEXAS TECH UNIVERSITY Dept. of Computer Science Ph: 806.742.3527 -----Original Message----- From: Bradley M. Kuhn [mailto:bkuhn@ebb.org] Sent: Monday, June 07, 1999 2:02 AM To: beowulf@beowulf.gsfc.nasa.gov Subject: Computer Science research done on Beowulf class systems I am posting to ask what (if any) types of Computer Science research is being done on Beowulf-class systems. Our Computer Science department is considering building one. However, there is some concern that this computer will be more helpful to the rest of the science departments than to the Computer Science department. I realize that "navel-gazing" research into making Beowulf systems better, faster, and more reliable is certainly possible, and projects like the one at NASA and the Mosix project are doing this type of research. I also know that work to make automatically parallelizing compilers (an active area of research in the compiler design community) is very possible. However, what I am looking for is information about *real* projects using Beowulf-class computers for Computer Science research. I have found lots of information on various aerospace, geological, and other scientific problems being solved with Beowulf class systems. However, I don't see lots of Computer Science projects using these systems. If anyone could tell me about such projects, I would much appreciate it. -- - bkuhn@ebb.org - Bradley M. 
Kuhn - bkuhn@gnu.org - http://www.ebb.org/bkuhn From oz@machenry.chem.soton.ac.uk Mon, 7 Jun 1999 08:44:46 -0400 Date: Mon, 7 Jun 1999 08:44:46 -0400 From: oz oz@machenry.chem.soton.ac.uk Subject: TCP patch for 2.2.2 We recently upgraded our Linux kernel from 2.0.36 to 2.2.2 and have seen a significant drop in communication performance in our MPI application. This is after applying Andrea Arcangeli's TCP patch. Some numbers for two-node performance, all post-patch (Josip Loncaric's patch was used for the 2.0.36 kernel):
Kernel   CPU(mins)  WALL(mins)
2.2.2         20.5          27
2.0.36        20.5          21
Has anyone seen anything similar, or does anyone have any idea what may be happening? Any help would be appreciated. Thanks OZ -- --------------------------------------------------------------------------- Dr O.Parchment | Email: oz@soton.ac.uk Research Fellow | Department of Chemistry | Tel: (01703) 594138 (UK) University of Southampton | Highfield | Fax: (01703) 593781 Southampton | SO17 1BJ | ----------------------------------------------------------------------------- From al@scali.no Mon, 7 Jun 1999 09:23:58 -0400 Date: Mon, 7 Jun 1999 09:23:58 -0400 From: Anders Liverud al@scali.no Subject: Cables Greg Lindahl wrote: > > > Anyway, as you can see, head-to-head latency tests rarely happen. The only > > fair number is run a standard MPI latency ping/pong test, and report those > > latencies. Unfortunately, people rarely post their MPI latency measuring > > code, so we can't even do head to head comparisons there. > > Sounds like a call for consumer action: We should nag vendors to use > the "mpptest" program distributed with mpich as their test program. If > they don't do that, we should collect the information ourselves. > > -- greg Here are the results of the "mpptest" program run between two dual 450 MHz Pentium II machines interconnected with SCI, running the Scali SCI driver (ScaSCI), Scali MPI (ScaMPI) and Red Hat Linux 6.0. To show both latency and bandwidth, we have run the test for small and large messages.
#p0  p1  dist      len  ave time (us)  rate (bytes/s)
  0   1     1        0      13.535491            0.00
  0   1     1       32      19.818096      1614685.91
  0   1     1       64      33.240926      1925337.44
  0   1     1       96      34.281767      2800322.42
  0   1     1      128      34.461944      3714241.99
  0   1     1      160      37.024158      4321502.75
  0   1     1      192      37.095119      5175883.13
  0   1     1      224      37.883568      5912853.81
  0   1     1      256      37.705628      6789437.35
  0   1     1      288      38.219046      7535509.92
  0   1     1      320      38.252558      8365453.65
  0   1     1      352      38.818683      9067798.64
  0   1     1      384      38.856088      9882621.33
  0   1     1      416      39.542255     10520391.34
  0   1     1      448      39.538833     11330632.85
  0   1     1      480      41.984336     11432835.23
  0   1     1      512      42.035924     12180058.22
  0   1     1      544      42.470909     12808767.52
  0   1     1      576      42.477866     13560003.20
  0   1     1      608      42.810078     14202263.27
  0   1     1      640      42.998063     14884391.41
  0   1     1      672      44.846502     14984446.13
  0   1     1      704      44.869813     15689835.81
  0   1     1      736      45.303992     16245808.83
  0   1     1      768      45.399115     16916629.40
  0   1     1      800      45.842869     17450914.78
  0   1     1      832      45.936604     18111917.84
  0   1     1      864      46.505212     18578562.86
  0   1     1      896      46.391499     19313883.28
  0   1     1      928      47.165273     19675492.99
  0   1     1      960      47.001292     20424970.24
  0   1     1      992      49.484357     20046739.22
  0   1     1     1024      49.643729     20626975.85
# Model complexity is (2.962975e-05 + n * 2.093157e-08)
# startup = 29.63 usec and transfer rate = 47.77 Mbytes/sec
# Variance in fit = 0.000022 (smaller is better)

#p0  p1  dist      len  ave time (us)  rate (bytes/s)
  0   1     1    65536    1260.435422     51994730.45
  0   1     1   131072    2187.997582     59905002.23
  0   1     1   196608    3080.313285     63827273.98
  0   1     1   262144    3977.117113     65913070.33
  0   1     1   327680    4837.693801     67734754.10
  0   1     1   393216    5725.836702     68673980.85
  0   1     1   458752    6614.742548     69352963.72
  0   1     1   524288    7478.801777     70103208.46
  0   1     1   589824    8326.160241     70839856.90
  0   1     1   655360    9206.776611     71182350.53
  0   1     1   720896   10052.445403     71713495.68
  0   1     1   786432   10918.792285     72025548.20
  0   1     1   851968   11784.037722     72298478.68
  0   1     1   917504   12640.646056     72583631.87
  0   1     1   983040   13491.218204     72865176.82
  0   1     1  1048576   14362.990204     73005410.79
# Model complexity is (4.749962e-04 + n * 1.327807e-08)
# startup = 475.00 usec and transfer rate = 75.31 Mbytes/sec
# Variance in fit = 0.015855 (smaller is better)
_____________________________________ Anders Liverud (Project manager) Scali AS http://www.scali.com Hvamstubben 17, 2013 SKJETTEN, NORWAY Phone : +47 6384 6715 Fax : +47 6384 4005 e-mail : al@scali.no From philip_juels@harvard.edu Mon, 7 Jun 1999 09:40:44 -0400 Date: Mon, 7 Jun 1999 09:40:44 -0400 From: Philip Juels philip_juels@harvard.edu Subject: MKINTRD I recompiled my kernel and then ran mkinitrd to rebuild the ramdisk, but I get this crazy error... kernel does not recognize /dev/loop0 as a block device can't get loopback device. However, /dev/loop0 exists and is listed as a block device... brw-rw---- 1 root disk 7, 0 May 5 1998 /dev/loop0 Any ideas? Thanks, --Philip Juels philip_juels@harvard.edu From dan@cfdws10.concordia.ca Mon, 7 Jun 1999 09:50:03 -0400 Date: Mon, 7 Jun 1999 09:50:03 -0400 From: dan stanescu dan@cfdws10.concordia.ca Subject: Question about SMP within clusters Hi, We're trying to put together our first beowulf, and are considering buying dual-processor P-II or P-III machines. However, I have seen lots of discussions, including on this list, saying that when running e.g. a code parallelized with MPI on a dual-processor machine, one doesn't get the expected performance. I'm wondering if there is anyone out there who has monitored this in detail and can tell me whether it's worth it or not. Thanks a lot, ------------------------------------------------------------------ Dan Stanescu, PhD | | CFD Laboratory, ER-301 | Tel.: (514)848-3138 | Concordia University | FAX : (514)848-8601 | 1455 de Maisonneuve Blvd. West | E-mail: dan@cfdlab.concordia.ca | Montreal, CANADA H3G 1M8 | | ------------------------------------------------------------------ From jferg@2boot.com Mon, 7 Jun 1999 09:50:07 -0400 Date: Mon, 7 Jun 1999 09:50:07 -0400 From: jferg jferg@2boot.com Subject: hard disk reliability This is a multi-part message in MIME format.
--------------EB161891AD81BD4D96A00162 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Wayde Milas wrote: > Paul Eduard Schenker wrote: > > > > Have you measured the temperatures of the drives? - or do "hot" and "cool" > > represent subjective values... > > > > It is subjective. I'd classify the Seagate Barracudas 10000 rpm as > HOT, a plain old vanilla 5400 rpm lvd as cool... If you can touch the > drive without hurting yourself after it's been active for 2 hours, it's > cool. Otherwise it's hot. :P > > IBMs are somewhere in between... > > Never said it was scientific, just personal experience. Hot drives tend > to fail more often. > > Wayde The major power consumption component in a hard drive is the power required to overcome aerodynamic losses due to the spinning disk. (All other things being equal) the power requirement due to aerodynamic losses goes as the cube of the rotational speed. Thus, motor power in the 10KRPM drive vs the 5.4KRPM drive goes as 10^3 / 5.4^3, or about a factor of 6.35. Fast drives are hotter in more than one way. It is also well known that failure rate rises rapidly with temperature. That's one reason ovens are used in system stress testing. -- Joe Ferguson, ApeX Systems Integration Corp. Voice: 919.468.8150 FAX: 919.468.5288 email: jferg@2boot.com --------------EB161891AD81BD4D96A00162 Content-Type: text/x-vcard; charset=us-ascii; name="jferg.vcf" Content-Transfer-Encoding: 7bit Content-Description: Card for jferg Content-Disposition: attachment; filename="jferg.vcf" begin:vcard n:Ferguson;Joe x-mozilla-html:FALSE org:ApeX Systems Integration Corp. adr:;;;;;; version:2.1 email;internet:jferg@2boot.com title:Tech Director x-mozilla-cpt:;0 fn:Joe Ferguson end:vcard --------------EB161891AD81BD4D96A00162-- From hanzl@noel.feld.cvut.cz Mon, 7 Jun 1999 10:47:14 -0400 Date: Mon, 7 Jun 1999 10:47:14 -0400 From: Vaclav Hanzl hanzl@noel.feld.cvut.cz Subject: Shared /usr and RedHat (Was: Dataless nodes using Coda?)
Oh thanks, that's cool. Probably good easy way to go if one wants to stay close to the normal distribution install (to keep future upgrade simple). Gerry Creager wrote: > Vaclav Hanzl wrote: > > > > I am using /usr shared readonly via NFS with RedHat 5.2. It is > > possible but not quite clean. The problems I encountered was as > > follows: > > > > 1) Cannot use amd to automount /usr. Had to put it to /etc/fstab. > > Sorry, no server redundance. (Wanted to use two identical servers.) > > Same result here. > > > 2) Linuxconf complains during startup. I am not using it on client > > nodes, so I do not care. No harm detected so far. > > > > 3) root login impossible when client fails to mount /usr. This is a > > nuisance. Login prompt invites you, but then there probably is some > > library located on /usr missing. > > I created a minimal /usr on the local nodes and THEN still used the > mount-point for the NFS mount. When NFS was unhappy, which was rare, I > could still log in and do what I needed. I kept a more "current" copy > of /usr on the NFS server, and still only had to update it, and only it. From wrankin@ee.duke.edu Mon, 7 Jun 1999 11:00:56 -0400 Date: Mon, 7 Jun 1999 11:00:56 -0400 From: William T. Rankin wrankin@ee.duke.edu Subject: Computer Science research done on Beowulf class systems On Mon, 7 Jun 1999, Bradley M. Kuhn wrote: > I am posting to ask what (if any) types of Computer Science research is > being done on Beowulf-class systems. Our Computer Science department is > considering building one. However, there is some concern that this > computer will be more helpful to the rest of the science departments than to > the Computer Science department. Our CS department here at Duke is doing a variety of OS research on their Myrinet connected cluster. See http://www.cs.duke.edu/ari/ for more info. -b dr. bill rankin ............................... philosopher/coffee-drinker wrankin@ee.duke.edu ............................ 
writer of little programs duke university dept. of electrical engr ...... scientific computing group From ok_murphy@email.msn.com Mon, 7 Jun 1999 11:09:44 -0400 Date: Mon, 7 Jun 1999 11:09:44 -0400 From: Keith Murphy ok_murphy@email.msn.com Subject: Cables Like you I do not want to generalize. Of course if cheaper hardware can do the job it should be used. However many Beowulf projects could certainly use a faster interconnect and SCI (or Myrinet) will improve their performance. If you use 2D Torus you will not need any switches, there is a 96 node 192 server SCI system running in Paderborn Germany with no switches rated at 86.4 GigaFlops. -----Original Message----- From: Greg Lindahl To: Keith Murphy Cc: extreme-linux@acl.lanl.gov ; beowulf@beowulf.gsfc.nasa.gov Date: Mon, 7 Jun 1999 11:09:44 -0400 Subject: Re: Cables >> In the meantime it is the >> fastest interface available today and makes an ideal Beowulf interface. > >Please avoid generalizations. It isn't ideal if much cheaper hardware >can do the same job -- maybe your application doesn't need that much >network, or isn't sensitive to latency? And you still can't buy huge >SCI switches, which makes it inferior to Myrinet for large systems. > >Linux drivers, or better yet, open-source drivers will be a big step >forward for SCI and cluster computing. But you still have to look at >price/performance. > >-- g > From ok_murphy@email.msn.com Mon, 7 Jun 1999 11:24:42 -0400 Date: Mon, 7 Jun 1999 11:24:42 -0400 From: Keith Murphy ok_murphy@email.msn.com Subject: Cables The latency is hard to believe. You suggested that they are not comparing apples to apples and you are right since nobody has any apples like SCI. The 2 microseconds latency is for a direct store from an application into application memory in a remote node. For ring configurations the latency added for each node is minimal and for switches about .1 microseconds. 
You are right that this corresponds to the delay through the PC I/O systems, as the delay in the SCI network is small. This latency is measured by a CPU spin waiting and storing back to the sending node when the store is detected by the receiver. So it is a simple ping pong benchmark and the number is for one way communication. Also, the time is the time until the store is detected, and no interrupt is involved. Our data does not have to be aligned, but the number is for aligned data. Mis-aligning will generally make you see higher delays. On two machines connected in an SCI "ring" we have measured latency down to 1.9 microseconds. The ring architecture of SCI ensures that you need no crossover cables. They use one type of cable for two nodes (small ring), for rings and for switches. You can also combine rings and switches by connecting rings to a switch. Larger topologies will of course affect the latency. SCI has data checking built into hardware so this latency includes the checking. You are right that if you include any kind of SW protocol on top of the hardware the latency will get higher. This is necessary in HA applications when you want to detect cables being removed from the system or when you have an application that runs a message passing API like MPI. -----Original Message----- From: Pete Beckman To: Keith Murphy ; Martin Konold ; Alvin Starr ; Paul Eduard Schenker Cc: sct@lanl.gov ; extreme-linux@acl.lanl.gov ; beowulf@beowulf.gsfc.nasa.gov Date: Mon, 7 Jun 1999 11:24:42 -0400 Subject: Re: Cables At 04:20 PM 6/5/99 -0700, Keith Murphy wrote: >If you are looking for (800MBytes) aggregate performance try SCI (Scalable >Coherent Interface) which also has an extremely low latency (better than >Myrinet). I don't believe that is true. Measuring latency is a place where poor benchmarks and clever tricks abound.
Unfortunately, there is no standard definition of "latency", just like some people say a Flop is a multiply and add, and some people count a multiply and an add as two flops. SCI publishes that latency could be as low as 2.3 microseconds. The thing to remember is that all of these high-performance cards (HiPPI, Myrinet, SCI) are generally limited first by the PCI bus. It often takes nearly a microsecond to get anything from memory to the PCI bus and out the interface of the card. Then, ignoring time in flight, it takes about another microsecond to get the data from the interface card over the PCI bus to the system memory. That's nearly 2 microseconds right there. Now here comes the tricky part... measuring latency. How do you time it? How do you know when the data arrived? Do you have the remote CPU spin wait on a memory location? Does the card raise an interrupt (slow)? What do you call latency? The time it takes for the sender to initiate the transfer until the data has arrived, or until the data has been detected to arrive? I've seen people measure it both ways. Does the data have to be page aligned? Word aligned? Some interfaces have problems with unaligned data. Does the benchmark include a switch, or for the benchmark have the two machines been plugged together with the equivalent of a cross-over cable? Does the latency benchmark time an unreliable send (no checksum to detect message corruption) or does the interface card buffer and then check the integrity of the data before putting it in memory? Anyway, as you can see, head-to-head latency tests rarely happen. The only fair approach is to run a standard MPI latency ping/pong test and report those latencies. Unfortunately, people rarely post their MPI latency measuring code, so we can't even do head to head comparisons there. The SCALI web pages say: "ScaMPI's message latency, measured as half the round-trip delay of a zero length MPI message, is less than 10 µsec".
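The "half the round-trip delay" definition quoted above can be sketched in a few lines. This is only an illustration under stated assumptions: it bounces a one-byte message over a local Python socket pair instead of a zero-length MPI message over SCI or Myrinet, the function names are hypothetical, and the numbers it produces say nothing about any real interconnect.

```python
import socket
import threading
import time


def echo_server(conn: socket.socket, rounds: int) -> None:
    """Bounce each 1-byte message straight back to the sender (the 'pong' side)."""
    for _ in range(rounds):
        conn.sendall(conn.recv(1))


def half_round_trip_us(rounds: int = 2000) -> float:
    """Average one-way latency in microseconds, taken as round-trip time / 2."""
    ping, pong = socket.socketpair()
    worker = threading.Thread(target=echo_server, args=(pong, rounds))
    worker.start()

    start = time.perf_counter()
    for _ in range(rounds):
        ping.sendall(b"x")   # send the 'ping'
        ping.recv(1)         # block until the 'pong' comes back
    elapsed = time.perf_counter() - start

    worker.join()
    ping.close()
    pong.close()
    # elapsed covers `rounds` full round trips; halve it for the one-way figure.
    return elapsed / rounds / 2 * 1e6


if __name__ == "__main__":
    print(f"one-way latency estimate: {half_round_trip_us():.2f} usec")
```

A stream socket cannot carry a zero-length message, so one byte stands in for it here. Averaging over many round trips hides timer resolution, but it also hides the questions raised above (spin wait vs. interrupt, alignment, checksumming), which is why published figures are hard to compare.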
The BIP folks in France have an MPI over BIP that they report has a latency of 12 µsec, and a bandwidth of 1 Gb/sec for an 8MB message. As you can see, apples to apples is hard. -Pete --- ======================================================================== | Peter H. Beckman | Advanced Computing Laboratory | | Los Alamos National Laboratory | Phone: 505-665-0800 | | CIC/ACL MS-B287 | Fax: 505-665-4939 | | Los Alamos, NM 87545 | email: beckman@acl.lanl.gov | ======================================================================== From tibbs@math.uh.edu Mon, 7 Jun 1999 11:26:07 -0400 Date: Mon, 7 Jun 1999 11:26:07 -0400 From: Jason L Tibbitts III tibbs@math.uh.edu Subject: Dataless nodes using Coda? >>>>> "VH" == Vaclav Hanzl writes: VH> I am using /usr shared readonly via NFS with RedHat 5.2. It is possible VH> but not quite clean. This is unfortunate; a shared /usr configuration is rather useful and most major UNIX vendors make it easy (or at least they used to; I recall that AIX was trivial to set up like this). VH> 1) Cannot use amd to automount /usr. Had to put it to VH> /etc/fstab. Sorry, no server redundance. (Wanted to use two identical VH> servers.) I think Coda would take care of this. VH> 3) root login impossible when client fails to mount /usr. This is a VH> nuisance. Login prompt invites you, but then there probably is some VH> library located on /usr missing. That's bad, but Coda (with its disconnected operation) could help here as well. What about single user mode? If root can't log in in single-user mode without /usr mounted, then RedHat is really, really hosed. VH> I created my server by normal RH install, then copied everything but VH> /usr to every client node and then used some scripts to change some VH> config files. That could explain at least the linuxconf problem, but it would be nice to know if this can be done with packages because of the ease of using kickstart and the simplicity of upgrades.
I'll pop over to the RH6.0 list and see if anyone there has any pointers. - J< From josip@icase.edu Mon, 7 Jun 1999 11:34:17 -0400 Date: Mon, 7 Jun 1999 11:34:17 -0400 From: Josip Loncaric josip@icase.edu Subject: TCP patch for 2.2.2 oz wrote: > > We have recently upgraded our linux kernel > to 2.2.2 from 2.0.36 and have seen a significant > downgrading of comms performance in our MPI app. > This is after using the andrea arcangelli's tcp > patch. > Some numbers for 2 node performance all, post-patch > Josip Loncarics patch was used for the 2.0.36 kernel. > > CPU(mins) WALL(mins) > 2.2.2 20.5 27 > 2.0.36 20.5 21 > > Has anyone seen anything similar?, or does anyone > have any ideas what maybe happening?. Any help would > be appreciated. We still use 2.0.36 and the patch helps our codes despite the cost of ACKing every packet. Linux TCP seems to have a problem when there are not enough ACKs coming back from the receiving node. My best guess about the root cause of this problem is as follows: The main idea of the slow start and congestion avoidance algorithms in TCP is that packets_sent should equal packets_received. Serious violation of this equality is interpreted as a signal to the sender to throttle back its data stream. Unfortunately, Linux compares packets_sent not to packets_received but to the number of ACKs it gets. For small packets, each delayed ACK may represent numerous packets_received, so packets_sent>>ACKs_back, which may be mistaken for network congestion by the Linux TCP protocol. Andrea has continued his work on this for the 2.2.x series of kernels. His latest approach is to ACK at least every other packet (see his messages on linux-kernel and linux-net mailing lists dated May 1999). BTW, you should really upgrade to 2.2.9 (your 2.2.2 was plagued with TCP bugs). Sincerely, Josip -- Dr. 
Josip Loncaric, Senior Staff Scientist mailto:josip@icase.edu ICASE, Mail Stop 132C http://www.icase.edu/~josip/ NASA Langley Research Center mailto:j.loncaric@larc.nasa.gov Hampton, VA 23681-2199, USA Tel. +1 757 864-2192 Fax +1 757 864-6134 From brua@paralline.com Mon, 7 Jun 1999 11:38:39 -0400 Date: Mon, 7 Jun 1999 11:38:39 -0400 From: Pierre Brua brua@paralline.com Subject: BIP (was Re: Cables) Pete Beckman wrote: > The SCALI web > pages say: "ScaMPI's message latency, measured as half the round-trip delay > of a zero length MPI message, is less than 10 µsec". The BIP folks in > France have an MPI over BIP that they report has a latency of 12 µsec, and > a bandwidth of 1 Gb/sec for a 8MB message. As you can see, apples to > apples is hard. BIP is not open source either. Closed, in fact. The source is owned by the French company Matra. See http://lhpca.univ-lyon1.fr/software/distrib.html for more. I don't think it will ever be released under the GPL. But who knows. Pierre -- Pierre Brua Parallélisme & Solutions Linux PARALLINE Sarl mail: brua@paralline.com 71, avenue des Vosges 67000 STRASBOURG http://www.paralline.com Tél:+33 3 88 14 17 40 Fax:+33 3 88 14 17 41 GSM : 06 16 01 46 65 From brua@paralline.com Mon, 7 Jun 1999 11:39:12 -0400 Date: Mon, 7 Jun 1999 11:39:12 -0400 From: Pierre Brua brua@paralline.com Subject: Oracle Parallel Server for Linux ? Hi, I'd like to build Oracle database clusters, but the Oracle Parallel Server part of the Oracle database doesn't seem to have been adapted for Linux (yet). Does it need kernel changes to work or be efficient, for cluster-wide locks and such, for example? Is someone working on this subject?
-- Pierre Brua Parallélisme & Solutions Linux PARALLINE Sarl mail: brua@paralline.com 71, avenue des Vosges 67000 STRASBOURG http://www.paralline.com Tél:+33 3 88 14 17 40 Fax:+33 3 88 14 17 41 GSM : 06 16 01 46 65 From josip@icase.edu Mon, 7 Jun 1999 12:02:38 -0400 Date: Mon, 7 Jun 1999 12:02:38 -0400 From: Josip Loncaric josip@icase.edu Subject: I/O node: single or dual? We'd like to add two I/O servers with 100GB+ of scratch disk space to our cluster, with multiple disks forming at most two large filesystems. The idea is to stick a couple of fast 36GB drives per Linux box and call it an I/O server. The bottleneck will probably be NFS and/or our network (2x Fast Ethernet). My questions are these: Does the NFS server in Linux 2.2.x benefit from SMP? How much? Any benchmarks? Other suggestions? Personally, I'd prefer some form of a distributed file system (does GFS fit into this category?), but we cannot do this because distributed file serving would interfere with performance measurements. A job on nodes 1+2 could return variable timings depending on what someone on nodes 3+4 was doing (e.g. accessing files from 1+2). Thanks in advance, Josip -- Dr. Josip Loncaric, Senior Staff Scientist mailto:josip@icase.edu ICASE, Mail Stop 132C http://www.icase.edu/~josip/ NASA Langley Research Center mailto:j.loncaric@larc.nasa.gov Hampton, VA 23681-2199, USA Tel. +1 757 864-2192 Fax +1 757 864-6134 From crowleyk@archmil.org Mon, 7 Jun 1999 12:11:32 -0400 Date: Mon, 7 Jun 1999 12:11:32 -0400 From: kevin Crowley crowleyk@archmil.org Subject: Smallest Linux PC On Earth? Good Beowulf node? --------------445240A79EF86DE648DAE7D7 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit A friend of mine and myself are looking in to making something along those lines but using a 4"x6"x4" format with two case fans for cooling (one intake. one exhaust.), one pci slot for an ethernet card, one comm port for telnetting if needed and one IDE controller. 
No video, keyboard, mouse or other slots. External redundant power supply for up to a dozen nodes. Hoping to market it at well under $600 with 128 MB RAM, a 8+GB Western Digital drive and a 3com 905b TX card. Won't know for a while if it is doable by us. Kevin Crowley Christopher Snyder wrote: > Hey there, A few weeks ago I saw some stuff on the web about the > smallest Linux computer on Earth.It's a German made PC, about 4 inches > square, 2 inches thick, with a tiny solid state hard drive or IBM > driveand all the works. (Original development was for embedded > systems) I think the story was that for about 920 bucks $US$ these > guys put together a node with CPU, RAM, and diskin a box about the > size of a VCR cassette, I thought it might make a great tool for a > Beowulf system, but maybe not?(Consider having 64 mini nodes in a box > a little bigger than the size of one PC - full sized tower, sitting by > your desk, just crank up the air conditioning...) I read that they > even are using this little PC as a web server serving up their site on > the web. Where, I do not know. Anyway, If anyone has read about this > too, please let me know. I seem to have lost track of the site, the > manufacturer, and the original story! But I remember it was > basically low cost, small and they even had instructions on putting a > node all together... Regards, C.S.
--------------445240A79EF86DE648DAE7D7-- From pesch@ibm.net Mon, 7 Jun 1999 12:36:29 -0400 Date: Mon, 7 Jun 1999 12:36:29 -0400 From: Paul Eduard Schenker pesch@ibm.net Subject: hard disk reliability We're running a lot of disks at 5400 rpm without problems; what appears to surface here is that the technology for 10k RPM is not mature yet - which makes me wonder why IBM for one wants to phase out the 5400 rpm drives (replaced by 7200 rpm). We're in the final design stages of a little A/D converter which you plug into the parallel or keyboard port and hook up to up to 8 thermo-sensors to get temperature measurements of whichever parts you'd like to supervise on your motherboard. The idea is obvious: a problematic part like a fan-cooled cpu or hdd doesn't fail point blank but gives advance warning by slowly increasing its temperature. A program on each node will transmit the measurements periodically or when a triggerpoint is reached to the master. What do you think of the idea? (If you're interested we'll send you one.) Paul At 09:46 AM 6/7/99 -0400, jferg wrote: >Wayde Milas wrote: > >> Paul Eduard Schenker wrote: >> > >> > Have you measured the temperatures of the drives? - or do "hot" and "cool" >> > represent subjective values... >> > >> >> It is subjective. Id classify the Seagate Baracudas(sp)? 10000 rpm as >> HOT. a plain old vanilla 5400 rpm lvd as cool... If you can touch the >> drive witholut hurting yourself after its been active for 2 hours, its >> cool. otherwise its hot. :P >> >> Ibms some where inbetween... >> >> Never said it was scientific, Just personal experience. Hot drives tend >> to fail more often. >> >> Wayde > >The major power consumption component in a hard drive is the power required to >overcome aerodynamic losses due to the spinning disk. (All other things being >equal) the power requirement due to aerodynamic losses goes as the cube of the >rotational speed. 
Thus, motor power in the 10KRPM drive vs the 5.4KRPM drive >goes as 10^3 / 5.4^3, or about a factor of 6.3. Fast drives are hotter in >more than way. > >It is also well known that failure rate rises rapidly with temperature. That's >one reason ovens are used in system stress testing. > >-- >Joe Ferguson, ApeX Systems Integration Corp. >Voice: 919.468.8150 >FAX: 919.468.5288 >email: jferg@2boot.com > > > >Attachment Converted: "c:\ace\eudora\attach\jferg6.vcf" > Paul Eduard Schenker 1 Peirce Hill Singapore 248558 Phone: 476 2245 Fax: 472 6480 email: pesch@ibm.net From JHindman@dhs.ca.gov Mon, 7 Jun 1999 12:45:13 -0400 Date: Mon, 7 Jun 1999 12:45:13 -0400 From: Hindman, John (DHS - ITSD) JHindman@dhs.ca.gov Subject: Computer Science research done on Beowulf class systems Brad, et. al., here is an idea/question from an applications guy without the technical know-how to answer it. Everything on Beowulf seems to be related to large scientific/engineering applications. Is Beowulf suitable for a more business oriented architecture of transaction processing against large relational databases? How about a TCP/IP network connecting remote users to a Beowulf system with the database on a storage area network? I have posed this question via fax and email to Red Hat and a professor whose web page seemed oriented toward more general problems. I have had no responses so far, so either the topic is potentially so commercially lucrative that they don't want to talk about it, or so off the wall that it isn't worth a reply. Thoughts, anyone? > -----Original Message----- > From: Bradley M. Kuhn [SMTP:bkuhn@ebb.org] > Sent: Monday, June 07, 1999 12:02 AM > To: beowulf@beowulf.gsfc.nasa.gov > Subject: Computer Science research done on Beowulf class systems > > > I am posting to ask what (if any) types of Computer Science research is > being done on Beowulf-class systems. Our Computer Science department is > considering building one. 
However, there is some concern that this > computer will be more helpful to the rest of the science departments than > to > the Computer Science department. > > I realize that "navel-gazing" research into making Beowulf systems better, > faster, and more reliable is certainly possible, and projects like the one > at NASA and the Mosix project are doing this type of research. > > I also know that work to make automatically parallelizing compilers (an > active area of research in the compiler design community) is very > possible. > > However, what I am looking for is information about *real* projects using > Beowulf-class computers for Computer Science research. I have found lots > of > information on various aerospace, geological, and other scientific > problems > being solved with Beowulf class systems. However, I don't see lots of > Computer Science projects using these systems. > > If anyone could tell me about such projects, I would much appreciate it. > > -- > - bkuhn@ebb.org - Bradley M. Kuhn - bkuhn@gnu.org - > http://www.ebb.org/bkuhn From ronelson@vt.edu Mon, 7 Jun 1999 12:55:57 -0400 Date: Mon, 7 Jun 1999 12:55:57 -0400 From: Rob Nelson ronelson@vt.edu Subject: MKINTRD > kernel does not recognize /dev/loop0 as a block device > can't get loopback device. > > However, /dev/loop0 exists and is listed as a block device... There's an option in the kernel configuration to use a loopback block device. If that wasn't checked, I suspect even if it exists, the kernel doesn't know what to do with it. Rob Nelson ronelson@vt.edu From fcalvay@aviion.univ-lemans.fr Mon, 7 Jun 1999 13:07:02 -0400 Date: Mon, 7 Jun 1999 13:07:02 -0400 From: Florent Calvayrac fcalvay@aviion.univ-lemans.fr Subject: Cables Keith Murphy wrote: > Like you I do not want to generalize. Of course if cheaper hardware can do > the job it should be used. However many Beowulf projects could certainly > use a faster interconnect and SCI (or Myrinet) will improve their > performance. 
> > If you use 2D Torus you will not need any switches, there is a 96 node 192 > server SCI system running in Paderborn Germany with no switches rated at > 86.4 GigaFlops. > > >> In the meantime it is the > >> fastest interface available today and makes an ideal Beowulf interface. > > > >Please avoid generalizations. It isn't ideal if much cheaper hardware > >can do the same job -- maybe your application doesn't need that much > >network, or isn't sensitive to latency? And you still can't buy huge > >SCI switches, which makes it inferior to Myrinet for large systems. > > > >Linux drivers, or better yet, open-source drivers will be a big step > >forward for SCI and cluster computing. But you still have to look at > >price/performance. > > > I had the occasion thanks to our German colleagues to test the 32 processors SCI cluster in Paderborn, and I had a very sobering experience. I have developed a parallel Density Functional program under MPI where the wavefunctions are distributed among the processors, and to a good approximation the parallel work amounts to repetitively summing up the density on the discretization grids. It seems (but it might be wrong) that the corresponding MPI_ALLREDUCE are very defavorable under Scampi, and indeed I get a better performance with a TCP/IP Fast Ethernet network, because it seems that the reductions/distributions are way better with such a communication network. Any comments ? What is BIP by the way ? -- Florent Calvayrac | Tel : 02 43 83 32 72 Laboratoire de Physique de l'Etat Condense | Fax : 02 43 83 35 18 UPRESA-CNRS 6087 | Universite du Maine-Faculte des Sciences | 72085 Le Mans Cedex 9 From pesch@ibm.net Mon, 7 Jun 1999 13:14:40 -0400 Date: Mon, 7 Jun 1999 13:14:40 -0400 From: Paul Eduard Schenker pesch@ibm.net Subject: the need for speed I have used Peltier elements (expensive) for an astronomical ccd camera and got to about -35C by watercooling the "hot" side of the element to about +18C. 
I read somewhere that you can go much lower with two Peltiers in series. What you have to look at, i.e. calculate, is the amount of heat you have to remove and then build the cooling system accordingly. Paul At 12:25 AM 6/7/99 -0700, Eugene Leitl wrote: > >Just saw it on /. > >Overclocking might be anathema, yet cryo-overclocking perhaps not... > >http://www.wizard.com/users/scfoster/public_html/ > > IT'S COOL'N TIME > > The Project. > >I had read just about everything you could imagine on cooling systems >for computers, looked at all the "projects" posted on the web sites, and >looked at both KryoTech® and Asetek® and found nothing for dual >processor systems that would go really low temp. Were talking sub -50C >temps. > >The Kryo Tech® and Asetek® both claim "potential" -40C but >because of our environmentally friendly R-134a freon, this is pushing >the outer limits of the efficiency of this freon. These systems might >achieve -40C under ideal, no load conditions, but could never maintain >it under an actual max'd out processor load, and it is definitely not the >temperature within the core of the processor. > >I had three obstacles to overcome, space constraints, two processors and >commercially available components. The system had to be compact, it >had to cool two PIII processors and everything had to be available over >the counter or on the internet. > >Because of the desire to achieve sub -50C temps, I determined that this >was a two stage approach. I could achieve exchanger temperature of >-30C to -35C with a freon based cooling system but would have to >resort to Thermal Electric's (peltiers) to bring it down to the target >temps. > >[...] > > Paul Eduard Schenker 1 Peirce Hill Singapore 248558 Phone: 476 2245 Fax: 472 6480 email: pesch@ibm.net From jferg@2boot.com Mon, 7 Jun 1999 13:18:59 -0400 Date: Mon, 7 Jun 1999 13:18:59 -0400 From: jferg jferg@2boot.com Subject: hard disk reliability This is a multi-part message in MIME format. 
--------------E6AF5261A3D884D47EBC231D Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Paul Eduard Schenker wrote: > We're running a lot of disks at 5400 rpm without problems; what appears to > surface here is that the technology for 10k RPM is not mature yet - which > makes me wonder why IBM for one wants to phase out the 5400 rpm drives > (replaced by 7200 rpm). > > We're in the final design stages of a little A/D converter which you plug > into the parallel or keyboard port and hook up to up to 8 thermo-sensors to > get temperature measurements of whichever parts you'd like to supervise on > your motherboard. The idea is obvious: a problematic part like a fan-cooled > cpu or hdd doesn't fail point blank but gives advance warning by slowly > increasing its temperature. A program on each node will transmit the > measurements periodically or when a triggerpoint is reached to the master. > What do you think of the idea? (If you're interested we'll send you one.) > > Paul > > At 09:46 AM 6/7/99 -0400, jferg wrote: > >Wayde Milas wrote: > > > >> Paul Eduard Schenker wrote: > >> > > >> > Have you measured the temperatures of the drives? - or do "hot" and > "cool" > >> > represent subjective values... > >> > > >> > >> It is subjective. Id classify the Seagate Baracudas(sp)? 10000 rpm as > >> HOT. a plain old vanilla 5400 rpm lvd as cool... If you can touch the > >> drive witholut hurting yourself after its been active for 2 hours, its > >> cool. otherwise its hot. :P > >> > >> Ibms some where inbetween... > >> > >> Never said it was scientific, Just personal experience. Hot drives tend > >> to fail more often. > >> > >> Wayde > > > >The major power consumption component in a hard drive is the power > required to > >overcome aerodynamic losses due to the spinning disk. (All other things > being > >equal) the power requirement due to aerodynamic losses goes as the cube of > the > >rotational speed. 
Thus, motor power in the 10KRPM drive vs the 5.4KRPM drive > >goes as 10^3 / 5.4^3, or about a factor of 6.3. Fast drives are hotter in > >more than way. > > > >It is also well known that failure rate rises rapidly with temperature. > That's > >one reason ovens are used in system stress testing. > > > >-- > >Joe Ferguson, ApeX Systems Integration Corp. > >Voice: 919.468.8150 > >FAX: 919.468.5288 > >email: jferg@2boot.com > > > > > > > >Attachment Converted: "c:\ace\eudora\attach\jferg6.vcf" > > > Paul Eduard Schenker > 1 Peirce Hill > Singapore 248558 > > Phone: 476 2245 > Fax: 472 6480 > email: pesch@ibm.net We need the capability to get the temperature for the reasons you suggest. I believe the best place to do it is on the processor die itself. The temperature coefficient of silicon is a nice place to start designing for such an integrated capability. But until that comes to pass, external sensors are important. Some recent BIOSes support temperature measurement at the CPU heatsink. -- Joe Ferguson, ApeX Systems Integration Corp. Voice: 919.468.8150 FAX: 919.468.5288 email: jferg@2boot.com --------------E6AF5261A3D884D47EBC231D Content-Type: text/x-vcard; charset=us-ascii; name="jferg.vcf" Content-Transfer-Encoding: 7bit Content-Description: Card for jferg Content-Disposition: attachment; filename="jferg.vcf" begin:vcard n:Ferguson;Joe x-mozilla-html:FALSE org:ApeX Systems Integration Corp. adr:;;;;;; version:2.1 email;internet:jferg@2boot.com title:Tech Director x-mozilla-cpt:;0 fn:Joe Ferguson end:vcard --------------E6AF5261A3D884D47EBC231D-- From lindahl@cs.virginia.edu Mon, 7 Jun 1999 13:35:27 -0400 Date: Mon, 7 Jun 1999 13:35:27 -0400 From: Greg Lindahl lindahl@cs.virginia.edu Subject: Computer Science research done on Beowulf class systems > I am posting to ask what (if any) types of Computer Science research is > being done on Beowulf-class systems.
The Legion group at the University of Virginia uses our cluster for development of the Legion metacomputing system, and also as a testbed. You can't beat the price, if you're developing software. We have more CPU power for our dozen developers than the rest of the department combined. -- g From lindahl@cs.virginia.edu Mon, 7 Jun 1999 13:36:43 -0400 Date: Mon, 7 Jun 1999 13:36:43 -0400 From: Greg Lindahl lindahl@cs.virginia.edu Subject: Computer Science research done on Beowulf class systems > Everything on Beowulf seems to be related to large scientific/engineering > applications. Is Beowulf suitable for a more business oriented architecture > of transaction processing against large relational databases? Yes, although the commercial databases have not released their "MPP" versions. But many websites consist of clusters which serve as front-ends for a database. > I have posed this question via fax and email to Red Hat and a professor > whose web page seemed oriented toward more general problems. I have had no > responses so far, so either the topic is potentially so commercially > lucrative that they don't want to talk about it, or so off the wall that it > isn't worth a reply. Been done for many years. -- g From lindahl@cs.virginia.edu Mon, 7 Jun 1999 13:40:04 -0400 Date: Mon, 7 Jun 1999 13:40:04 -0400 From: Greg Lindahl lindahl@cs.virginia.edu Subject: Cables > I have developed a parallel Density Functional program under MPI where > the wavefunctions are distributed among the processors, and to > a good approximation the parallel work amounts to repetitively > summing up the density on the discretization grids. It seems (but it might be > wrong) > that the corresponding MPI_ALLREDUCE are very defavorable > under Scampi, and indeed I get a better performance with a TCP/IP > Fast Ethernet network, because it seems that the reductions/distributions > are way better with such a communication network. Sounds like a mistake in their MPI implementation. 
There are a lot of details to get right in order to get good performance on a wide variety of MPI codes. -- g From kat.kirk@thelinuxstore.com Mon, 7 Jun 1999 13:51:12 -0400 Date: Mon, 7 Jun 1999 13:51:12 -0400 From: Kat kirk kat.kirk@thelinuxstore.com Subject: all this political wapping Jonathan Clements wrote: > > Guys, > > I have been watching this list for about 6 months now. I do so to learn > about clusters. But recently there has been an increase in the "political" > discussions that have been going on. Very little of what has been said is > constructive, well thought out, or well informed. > > I am by no means a cluster expert, but I am sick of reading this "big > brother is out to keep the man" down bull shit. SHUT UP! Half the emails I > get are this kind of crap. If any of you out there are in any way qualified > to discuss these things you certainly aren't showing it. > > On this list you (for the most part) treat "newbies" resonably well and try > to help them. But when someone says something that wouldn't even be a > "newbie" political question/statement we spend a week and twenty email > discussing it when it doesn't even warrant one. Any resenblence to fact by > what was laid out is generally purely coincidental. > > So if you stop "sharing" your "opinions", then I am going to personally take > it upon myself to answer every single riduculous email (off the list of > course). Shut up and stop wasting my mail box space. > > And your momma too! > jonathan clements > > _______________________________________________________________ > Get Free Email and Do More On The Web. Visit http://www.msn.com hmm, everyone has a right to say what they want, i guess if it offends you in such a way then ignore. (which you arent doing to well ). It is a mail list and people are open to their own topics of conversation i guess. 
If it is so offending then deal with it in your own way on your own time because..you may be wasting someone's mail box space also by complaining. From kat.kirk@thelinuxstore.com Mon, 7 Jun 1999 14:05:27 -0400 Date: Mon, 7 Jun 1999 14:05:27 -0400 From: Kat kirk kat.kirk@thelinuxstore.com Subject: 2.2.9 Anyone upgrade to 2.2.9 ? If so how goes? has anyone had any kaka with nfs? From lindahl@cs.virginia.edu Mon, 7 Jun 1999 14:25:35 -0400 Date: Mon, 7 Jun 1999 14:25:35 -0400 From: Greg Lindahl lindahl@cs.virginia.edu Subject: Cables > If you use 2D Torus you will not need any switches, there is a 96 node 192 > server SCI system running in Paderborn Germany with no switches rated at > 86.4 GigaFlops. But then I pay a huge price in terms of lowered bisection bandwidth, and I have to teach my queue system and my MPI implementation about the topology of the network, and I have to accept the fact that 2 unrelated jobs on the same system will interfere with each other. I could imagine that a program which only wanted to send a few small packets with low latency might not be bothered by this, but I can imagine other bandwidth-hog programs which would run better on fast ethernet than SCI. -- g From lindahl@cs.virginia.edu Mon, 7 Jun 1999 14:30:31 -0400 Date: Mon, 7 Jun 1999 14:30:31 -0400 From: Greg Lindahl lindahl@cs.virginia.edu Subject: Cables > Here are the results of the "mpptest" program run between two 450 MHz > Dual Pentium II interconnected with SCI, running Scali SCI driver > (ScaSCI), Scali MPI (ScaMPI) and Redhat Linux 6.0 Thank you -- this is excellent information. > #p0 p1 dist len ave time (us) rate > 0 1 1 0 13.535491 0.00 It shouldn't surprise anyone that a 2 us ping-pong turns into 13.5 us when you are using MPI. Overall, these numbers are actually quite good. 
The bandwidth rises much more steeply than Myrinet and peaks out faster than Myrinet, although it sounds like most SCI systems are not fully switched, so a typical SCI application will see less bandwidth than this case. If anyone would like to see the graphs, they are temporarily at: http://www.cs.virginia.edu/~lindahl/bandwidth.gif I'm looking forward to benchmarking SCI on some real problems. -- g From walt@parl.ces.clemson.edu Mon, 7 Jun 1999 14:31:00 -0400 Date: Mon, 7 Jun 1999 14:31:00 -0400 From: Walter B. Ligon III walt@parl.ces.clemson.edu Subject: Computer Science research done on Beowulf class systems -------- The reason you hear more about scientific/engineering apps on parallel systems is that they have been working on them for well over 30 years (jeez, I guess its more like 40 years or more ...) so when one wants to implement parallel codes, there is a lot of experience in doing that with large numeric codes. There is a lot less experience in doing it with business apps, and much fewer business app programmers that want to learn to program a parallel machine and not too many companies are willing to take the risk of going with a parallel system given they will have to port a lot of code and re-train programmers. A lot less, but not none. I know for example that back in the 80's some of the big credit rating houses were working with parallel machines. There are also quite a few distributed systems out there. Smaller companies are sometimes willing to experiment with it in hopes of getting a leg up. One big reason that businesses are reluctant (and the programmers who work for them) is that it is not easy to quantify what (if anything) going parallel will give back. It isn't a situation of just plug in your parallel machine, doing some porting, and BOOM! improved performance (or reliability or whatever). You really have to know your system to know how and if a parallel implementation will work for your app. 
Also every problem can be parallelized in several different ways. So how do you know if you have done it the best way, or even a decent way? As an example, people are always posting to this group asking about using a beowulf for a great-big web server. Problem is, for most web servers it isn't the computation that is the bottleneck - it's the network. A few really large sites might need to spread the load a bit, but would probably be better served by multiple servers that work together than a beowulf. A beowulf really doesn't have facilities to support really really large external network traffic. Of course, I am sure there ARE some web applications that COULD make use of a beowulf. For example, if my web page allowed users to browse a large database of imagery and make image transformations - and finally download the results then having a beowulf to do the file I/O and processing might be a good idea. Or, in the business arena, if my site does a lot of complex database searches (more than just looking up a couple of records in a table) then having those searches distributed by a beowulf on the back side might make sense. Most people in parallel processing research would like to see parallel processing become more ubiquitous. We tend to believe that eventually all computers and all programs will be parallel. A major step in doing that is convincing our most conservative users this is a good thing, and frankly we just aren't quite there yet. Boy, it sure is a lot easier than it was 10 years ago - but it's still not the truly mainstream thing we'd like it to be. Anyway, those are my thoughts Walt > Brad, et. al., here is an idea/question from an applications guy without the > technical know-how to answer it. > > Everything on Beowulf seems to be related to large scientific/engineering > applications. Is Beowulf suitable for a more business oriented architecture > of transaction processing against large relational databases?
How about a > TCP/IP network connecting remote users to a Beowulf system with the database > on a storage area network? > > I have posed this question via fax and email to Red Hat and a professor > whose web page seemed oriented toward more general problems. I have had no > responses so far, so either the topic is potentially so commercially > lucrative that they don't want to talk about it, or so off the wall that it > isn't worth a reply. > > Thoughts, anyone? > -- Dr. Walter B. Ligon III Associate Professor ECE Department Clemson University From kragen@pobox.com Mon, 7 Jun 1999 14:41:40 -0400 Date: Mon, 7 Jun 1999 14:41:40 -0400 From: Kragen Sitaker kragen@pobox.com Subject: latency (was Re: Cables) Some SCI guy said: > you are right since nobody has any apples like SCI. > > The 2 microseconds latency is for a direct store from an application into > application memory in a remote node. For ring configurations the latency > added for each node is minimal and for switches about .1 microseconds. You > are right that this corresponds to the delay through the PC I/O systems, as > the delay in the SCI network is small. > > This latency is measured by a CPU spin waiting and storing back to the > sending node when the store is detected by the receiver. In other words, the benchmark you're talking about is for a CC-NUMA system, which can't be built from standard PCs -- not for a cluster. Am I mistaken? A CC-NUMA system is not a Beowulf. Hey, I can show you some really impressively small latencies between the CPUs on my dual-Pentium-Pro SMP, or the 16-processor Sun Enterprise 10000 down the hall[0]. But that is not really relevant to the Beowulf list. (Well, maybe one day SCI-based CC-NUMAs will eat Beowulf's lunch. Not this week though.) [0] I don't really have a Sun Enterprise 10000. -- Kragen Sitaker TurboLinux is outselling NT in Japan's retail software market 10 to 1, so I hear. 
-- http://www.performancecomputing.com/opinions/unixriot/981218.shtml From tibbs@math.uh.edu Mon, 7 Jun 1999 15:28:39 -0400 Date: Mon, 7 Jun 1999 15:28:39 -0400 From: Jason L Tibbitts III tibbs@math.uh.edu Subject: Home directories and non-worldly nodes Is it absolutely necessary that every node be able to see each user's home directory (or, for that matter, allow logins)? Each user here only ever has one directory no matter what OS and host they're on. Of course, the worldly node/server of the cluster will be able to see the directories so the user can log in compile and such, but if the nodes themselves actually need to mount the directories then an interesting problem develops because they can't actually see the fileservers. If it is true then how does everyone work around this deficiency? I could just make every node worldly (its just a matter of the IP address and an extra cable to the switch stack), but then that's a whole other pile of machines I have to monitor for network intrusion and such. I could also give the users special home directories, but that breaks what so far has been a nice seamless environment. Ideas for using some type of packet forwarding or NFS proxy on the server make me queasy. Thanks, -- Jason L Tibbitts III - tibbs@uh.edu - 713/743-3486 - 660PGH - 94 PC800 System Manager: University of Houston Department of Mathematics "You'll see the blood as we roll in it together..." From lindahl@cs.virginia.edu Mon, 7 Jun 1999 15:33:44 -0400 Date: Mon, 7 Jun 1999 15:33:44 -0400 From: Greg Lindahl lindahl@cs.virginia.edu Subject: Computer Science research done on Beowulf class systems > A few really > large sites might need to spread the load a bit, but would probably be better > served my multiple servers that work together than a beowulf. Obviously we are back to "What's a beowulf?" Multiple servers that work together is a traditional cluster. Businesses, btw, have used clusters for as long as scientific programmers have used clusters. 
The Wall Street firm I used to work for didn't have any machine with more than 2 CPUs, nor did they do any parallel programming, but they had a large cluster. They built it for availability and throughput reasons. > A beowulf really doesn't have facilities to support really really > large external network traffic. But my cluster does. And, actually, even the strict definition of "beowulf" doesn't outlaw big gateways. So watch out for folks who use "beowulf" interchangeably with "cluster". I don't, but most of the new people asking questions on this mailing list do. -- g From florin@bamberg.baynet.de Mon, 7 Jun 1999 16:40:45 -0400 Date: Mon, 7 Jun 1999 16:40:45 -0400 From: Florin Boariu florin@bamberg.baynet.de Subject: the need for speed Paul Eduard Schenker wrote: > > I have used Peltier elements (expensive) for an astronomical ccd camera and > got to about -35C by watercooling the "hot" side of the element to about > +18C. I read somewhere that you can go much lower with two Peltiers in > series. What you have to look at, i.e. calculate, is the amount of heat you > have to remove and then build the cooling system accordingly. > Please excuse my naive question, but why do you need temperatures below zero to cool down a CPU? Is it to turn down the resistance of the material, so that it doesn't melt when you clock it 2 or 3 times higher? (I think this is it, but I want to know for sure) How high can you really clock a CPU that way (supposing you really get it as cool as you want)? florin. -- " [...] Linux source code would have to be shielded from young eyes, lest they get the impression that "fuck" is a valid engineering term." ----------------------------------------------------- Blue Ribbon -- supporting free speech on the net From walt@parl.ces.clemson.edu Mon, 7 Jun 1999 16:41:35 -0400 Date: Mon, 7 Jun 1999 16:41:35 -0400 From: Walter B.
Ligon III walt@parl.ces.clemson.edu Subject: Computer Science research done on Beowulf class systems -------- > > A few really > > large sites might need to spread the load a bit, but would probably be better > > served my multiple servers that work together than a beowulf. > > Obviously we are back to "What's a beowulf?" Multiple servers that > work together is a traditional cluster. Businesses, btw, have used > clusters for as long as scientific programmers have used clusters. The > wall street firm I used to work for didn't have any machine with more > than 2 CPUs, nor did they do any parallel programing, but they had a > large cluster. They built it for availability and throughput reasons. Yeah, well, I really don't want to debate that. What you have said here is exactly my point. Beowulf isn't really the approach for these problems, but there are other good approaches. "Clusters" have been around for a long time in many different forms, and certainly they are a technique for improving throughput and capacity. Beowulf is a parallel computer architecture. Most of the resources in a beowulf are for internal use. A lot of what I said in that posting was actually generic to parallel computers - which HAVE been around longer than computers clustered for business use. Anyway, people tend to think that parallel systems are going to magically fix anything and that's not the case. I think it is important they know that and why. For an extreme example there was a professor where I went to school that worked in AI (quite a good one, too) who told a class that the implementation complexity of some search algorithm didn't matter because soon parallel computers would be able to run them very quickly. The algorithm was NP-complete. Parallel computers AREN'T going to fix that. > > A beowulf really doesn't have facilities to support really really > > large external network traffic. > > But my cluster does. 
And, actually, even the strict definition of > "beowulf" doesn't outlaw big gateways. Well, I didn't really mean that you can't put a big ol' NIC in your beowulf. I mean that all of the network bandwidth and CPU power isn't easily adapted to a problem without a fairly large ratio of computation and local I/O to data being shipped into or out of the site. To re-iterate what you have said, there ARE other clustering approaches that DO provide this. > So watch out for folks who use "beowulf" interchangably with "cluster". > I don't, but most of the new people asking questions on this mailing > list do. Well, I feel I should work to educate them, not support their misconceptions. Walt -- Dr. Walter B. Ligon III Associate Professor ECE Department Clemson University From wrankin@ee.duke.edu Mon, 7 Jun 1999 16:44:23 -0400 Date: Mon, 7 Jun 1999 16:44:23 -0400 From: William T. Rankin wrankin@ee.duke.edu Subject: latency (was Re: Cables) On Mon, 7 Jun 1999, Kragen Sitaker wrote: > Some SCI guy said: > > you are right since nobody has any apples like SCI. > > > > The 2 microseconds latency is for a direct store from an application into > > application memory in a remote node. For ring configurations the latency > > added for each node is minimal and for switches about .1 microseconds. You > > are right that this corresponds to the delay through the PC I/O systems, as > > the delay in the SCI network is small. > > > > This latency is measured by a CPU spin waiting and storing back to the > > sending node when the store is detected by the receiver. > > In other words, the benchmark you're talking about is for a CC-NUMA > system, which can't be built from standard PCs -- not for a cluster. > Am I mistaken? Maybe. It sounds like they have the remote CPU constantly reading a single memory word and respinding when that word changes. 
While it's not shared-memory (like CC-NUMA), there *are* (as others have pointed out) problems in using this type of measurement as an indication of the actual performance of real-world applications. > [0] I don't really have a Sun Enterprise 10000. It's like that 64-node O2K of mine ;-) -b From rauch@inf.ethz.ch Mon, 7 Jun 1999 17:30:49 -0400 Date: Mon, 7 Jun 1999 17:30:49 -0400 From: Felix Rauch rauch@inf.ethz.ch Subject: Cables On Mon, 7 Jun 1999, Greg Lindahl wrote: [...] > The bandwidth rises much more steeply than Myrinet and peaks out > faster than Myrinet, although it sounds like most SCI systems are not > fully switched, so a typical SCI application will see less bandwidth > than this case. > > If anyone would like to see the graphs, they are temporarily at: > > http://www.cs.virginia.edu/~lindahl/bandwidth.gif > > I'm looking forward to benchmarking SCI on some real problems. For a comparison of SCI and Myrinet you might want to look at a paper published in last year's "SCI Europe" conference, written by colleagues from our research group: http://www.cs.inf.ethz.ch/CoPs/publications/ (the second paper) I just thought this might interest some people following this discussion. - Felix -- Felix Rauch | Email: rauch@inf.ethz.ch Institute for Computer Systems | Homepage: http://www.cs.inf.ethz.ch/~rauch/ ETH Zentrum / RZ H15 | Phone: ++41 1 632 7489 CH - 8092 Zuerich / Switzerland | Fax: ++41 1 632 1307 From Peter.Szwedyk@gs.com Mon, 7 Jun 1999 17:45:07 -0400 Date: Mon, 7 Jun 1999 17:45:07 -0400 From: Szwedyk, Peter Peter.Szwedyk@gs.com Subject: Beowulf vs. MOSIX It seems to me that for business applications, MOSIX might be a better way to go as a quick and easy way to take advantage of clusters. With its load balancing and transparent process migration, even existing serial applications should be able to take advantage of the power of clusters. With Beowulf, on the other hand, one must parallelize the code in order to see any improvement in performance.
Is this assessment accurate? Any comments? --- Peter Szwedyk Goldman, Sachs & Co. Securities Lending Technology One New York Plaza, 48th Floor New York, NY 10004 Phone: 212-357-8105 | Fax: 212-428-1405 From rossini@biostat.washington.edu Mon, 7 Jun 1999 19:33:01 -0400 Date: Mon, 7 Jun 1999 19:33:01 -0400 From: A.J. Rossini rossini@biostat.washington.edu Subject: Beowulf vs. MOSIX >>>>> "SP" == Szwedyk, Peter writes: SP> It seems to me that for business applications, MOSIX might be SP> a better way to go as a quick and easy way to take advantage SP> of clusters. With its load balancing and transparent process SP> migration, even existing serial applications should be able to SP> take advantage of the power of clusters. With Beowulf, on the SP> other hand, one must parallelize the code in order to see any SP> improvement in performance. I wish people would stop commenting on "beowulf" vs. "mosix". Mosix, since it is now "commodity", ought to be included as a possible tool for any (intel-based, at this time), beowulf. Unless you'd rather split beowulf into "beowulf-pvm" and "beowulf-mpi"... See the mosix docs to realize that the mosix people have considered (and evaluated) pvm/mpi on mosix-enabled clusters... best, -tony -- A.J. Rossini Research Assistant Professor of Biostatistics Center for AIDS Research UW Biostatistics 206-720-4282 (4209=fax) 206-543-1044 (xxxx=fax) rossini@u.washington.edu rossini@biostat.washington.edu http://www.biostat.washington.edu/~rossini/ From jav@blazenet.net Mon, 7 Jun 1999 19:41:19 -0400 Date: Mon, 7 Jun 1999 19:41:19 -0400 From: jav jav@blazenet.net Subject: the need for speed Well, obviously there are physical limitations, but as the temperature of the system is lowered, the theoretical resistance is lowered allowing for inherent greater throughput. On the otherhand, if a CPU is overclocked, more heat than the CPU was engineered for will be produced and therefore the processor will literally melt. 
john > -----Original Message----- > From: Florin Boariu [SMTP:florin@bamberg.baynet.de] > Sent: Monday, 07 June, 1999 16:45 > To: Beowulf > Subject: Re: the need for speed > > Paul Eduard Schenker wrote: > > > > I have used Peltier elements (expensive) for an astronomical ccd > camera and > > got to about -35C by watercooling the "hot" side of the element to > about > > +18C. I read somewhere that you can go much lower with two Peltiers > in > > series. What you have to look at, i.e. calculate, is the amount of > heat you > > have to remove and then build the cooling system accordingly. > > > > Please excuse my naive question, but why do you need temperatures > below > zero to cool down a CPU? Is it to turn down the resistance of the > material, so that it doesn't melt when you clock it 2 or 3 times > higher? > (i think this is it, but i want to know for sure) > > How high can you really clock a CPU that way (supposing you really get > it as cool as you want)? > > florin. > -- > " [...] Linux source code would have to be shielded > from young eyes, lest they get the impression that > "fuck" is a valid engineering term." > ----------------------------------------------------- > > Blue Ribbon -- supporting free speech on the net > From brian@loki.chpc.utah.edu Mon, 7 Jun 1999 20:29:53 -0400 Date: Mon, 7 Jun 1999 20:29:53 -0400 From: Brian D. Haymore brian@loki.chpc.utah.edu Subject: Question about SMP within clusters A dual-processor machine can be used if it is treated like two separate machines, i.e. you can fire off two separate MPI processes on the same node. This can be a problem if memory I/O or network I/O is high, since either one can cause contention when more than a single process accesses that device. You could also look into (although I have not tested these ideas yet) the OpenMP support of the Portland Group compiler, or the auto-parallelization of the same compiler: compile for 2 CPUs and see how that works. It might work, it might not.
We usually run as if the dual processor is two machines on our system and the code we run doesn't seem to see much contention on either the nic or memory bus. -- ================================================= Brian D. Haymore, Systems Administrator Center for High Performance Computing, U of Utah Email: brian@chpc.utah.edu, Phone: (801) 585-1755 ================================================= On Mon, 7 Jun 1999, dan stanescu wrote: > > Hi, > > We're trying to put together our first beowulf, and consider > buying dual-processor P-II or P-III machines. However, I saw lots > of discussions, including on this list, that when running > i.e. a code parallelized with MPI on a dual-processor machine, > one doesn't get the expected performance. I'm wondering if there > there is anyone out there who has monitored this in detail and > can tell me if it's worth or not. > > Thanks a lot, > > ------------------------------------------------------------------ > Dan Stanescu, PhD | | > CFD Laboratory, ER-301 | Tel.: (514)848-3138 | > Concordia University | FAX : (514)848-8601 | > 1455 de Maisonneuve Blvd. West | E-mail: dan@cfdlab.concordia.ca | > Montreal, CANADA H3G 1M8 | | > ------------------------------------------------------------------ > From admin@cersa.admu.edu.ph Mon, 7 Jun 1999 20:47:14 -0400 Date: Mon, 7 Jun 1999 20:47:14 -0400 From: William Emmanuel S. Yu admin@cersa.admu.edu.ph Subject: 2.2.9 > Anyone upgrade to 2.2.9 ? If so how goes? has anyone had any kaka with > nfs? > yup. i did. nfs had some problems before with an error. but obvious is the problem for i have seen that i forgot to include in the compilation the nfs server code. ha. that might be the answer!!! besides does anybody know how to change the System.map after compiling a kernel because i get some pesky errors but they are just pesky as startup. something about invalid System.map version problem. 
william.s.yu@ieee.org From konold@alpha.tat.physik.uni-tuebingen.de Mon, 7 Jun 1999 21:38:56 -0400 Date: Mon, 7 Jun 1999 21:38:56 -0400 From: Martin Konold konold@alpha.tat.physik.uni-tuebingen.de Subject: Cables On Mon, 7 Jun 1999, Florent Calvayrac wrote: > that the corresponding MPI_ALLREDUCE are very defavorable > under Scampi, and indeed I get a better performance with a TCP/IP > Fast Ethernet network, because it seems that the reductions/distributions > are way better with such a communication network. > > Any comments ? Well, ScaMPI is not optimized in every respect, but you could have wrapped up your MPI_ALLREDUCE in simple optimized MPI calls, or you could have used the MPICH MPI_ALLREDUCE over SCI primitives in order to gain much better performance. > What is BIP by the way ? It is a French effort which also includes an MPI implementation. Unfortunately I could never reproduce their published results. They claim a one/zero-copy implementation, though. Regards, -- martin // Martin Konold, Herrenbergerstr. 14, 72070 Tuebingen, Germany // // Email: konold@kde.org // KDE: A stable GUI for a reliable OS. From mlucas@imagelinks.com Mon, 7 Jun 1999 23:00:13 -0400 Date: Mon, 7 Jun 1999 23:00:13 -0400 From: Mark Lucas mlucas@imagelinks.com Subject: Beowulf vs. MOSIX Peter, That is our opinion at AGIS/ImageLinks. We process satellite and aerial imagery which stresses CPU, bandwidth and storage. We have just concluded porting all of our code to run on Linux and are in the process of removing all of the SGIs and Suns and replacing them with dual CPU Pentium boxes. More horsepower and substantially lower cost. We have been looking at MPI and PVM in our code thinking of Beowulf clusters, but after attending LinuxExpo we have decided that Mosix is a better fit - especially given our workflow. We were impressed by how quickly the clusters at the show installed it and tried it out. Wiring PVM and MPI into the code is definitely desired so that there is more to spread around.
Mark >It seems to me that for business applications, MOSIX might be a better way >to go as a quick and easy way to take advantage of clusters. With its load >balancing and transparent process migration, even existing serial >applications should be able to take advantage of the power of clusters. >With Beowulf, on the other hand, one must parallelize the code in order to >see any improvement in performance. > >Is this assessment accurate? Any comments? > >--- >Peter Szwedyk >Goldman, Sachs & Co. >Securities Lending Technology >One New York Plaza, 48th Floor >New York, NY 10004 >Phone: 212-357-8105 | Fax: 212-428-1405 ****************** Mark R. Lucas Chief Technical Officer AGIS/ ImageLinks, Inc. 4450 West Eau Gallie Blvd. Suite 164, Perimeter Center Melbourne, Florida 32934 (407) 253-0011 (work) (407) 253-5559 (fax) (407) 725-6842 (home) (407) 693-6842 (cellular) email: mlucas@imagelinks.com http://www.imagelinks.com ******************* From mas@ucla.edu Tue, 8 Jun 1999 00:41:24 -0400 Date: Tue, 8 Jun 1999 00:41:24 -0400 From: Michael Stein mas@ucla.edu Subject: Question about SMP within clusters >A dual processor can be used if it is treated in a way like two seperate >machines. i.e. you can fire off two seperate mpi threads to the same >node. If the program uses a fixed IP port for each process then running two on the same machine will result in a conflict (port in use)... How do you deal with that? [perhaps MPI doesn't have this problem?] From mas@ucla.edu Tue, 8 Jun 1999 00:41:26 -0400 Date: Tue, 8 Jun 1999 00:41:26 -0400 From: Michael Stein mas@ucla.edu Subject: the need for speed >Well, obviously there are physical limitations, but as the temperature >of the system is lowered, the theoretical resistance is lowered allowing >for inherent greater throughput. On the otherhand, if a CPU is >overclocked, more heat than the CPU was engineered for will be produced >and therefore the processor will literally melt. 
I would think that at some higher clock rate it would be necessary to cool the motherboard chipset and memory too. Then I'd worry that the electrolytic capacitors would freeze (I don't believe they work when frozen). In addition there is the thermal stress from the temperature changes -- would this cause parts to just pop off the circuit board after a few cycles (say room temp to -80 C?). From jteneyck@xyos.net Tue, 8 Jun 1999 02:09:31 -0400 Date: Tue, 8 Jun 1999 02:09:31 -0400 From: John TenEyck jteneyck@xyos.net Subject: Beowulf vs. MOSIX My question (which might be very naive) would be can you use PVM, MPI (LAM or some other implementation) along with MOSIX? _________________________________________________________________________ John TenEyck jteneyck@xyos.net http://jteneyck.xyos.net 409.229.8954 .-. __ _____ ____ ___ __ /v\ / / / _/ | / / / / / |/ / / \ / / / // |/ / / / /| / /( )\ / /____/ // /| / /_/ // | ^^-^^ /_____/___/_/ |_/\____//_/|_| >Phear The Penguin< If you refuse to accept anything but the best you very often get it. _________________________________________________________________________ ----- Original Message ----- From: Mark Lucas To: Szwedyk, Peter Cc: Sent: Monday, June 07, 1999 9:59 PM Subject: Re: Beowulf vs. MOSIX > Peter, > > That is our opinion at AGIS/ImageLinks. We process satellite and aerial > imagery which stresses CPU, bandwidth and storage. We have just concluded > porting all of our code to run on LInux and are in the process of removing > all of the SGIs and Suns replacing them with dual CPU Pentium boxes. More > horsepower and substantially lower cost. We have been looking at MPI and > PVM in our code thinking of BeoWulf clusters, but after attending LinuxExpo > we have decided that Mosix is a better fit - especially given our workflow. > We were impressed by how quickly the clusters at the show installed it and > tried it out. Wiring PVM and MPI into the code is definitely desired so > that there is more to spread around. 
> > Mark > > >It seems to me that for business applications, MOSIX might be a better way > >to go as a quick and easy way to take advantage of clusters. With its load > >balancing and transparent process migration, even existing serial > >applications should be able to take advantage of the power of clusters. > >With Beowulf, on the other hand, one must parallelize the code in order to > >see any improvement in performance. > > > >Is this assessment accurate? Any comments? > > > >--- > >Peter Szwedyk > >Goldman, Sachs & Co. > >Securities Lending Technology > >One New York Plaza, 48th Floor > >New York, NY 10004 > >Phone: 212-357-8105 | Fax: 212-428-1405 > > ****************** > Mark R. Lucas > Chief Technical Officer > > AGIS/ ImageLinks, Inc. > 4450 West Eau Gallie Blvd. > Suite 164, Perimeter Center > Melbourne, Florida 32934 > > (407) 253-0011 (work) > (407) 253-5559 (fax) > (407) 725-6842 (home) > (407) 693-6842 (cellular) > > email: mlucas@imagelinks.com > http://www.imagelinks.com > ******************* > From boklund@linux.nu Tue, 8 Jun 1999 04:24:51 -0400 Date: Tue, 8 Jun 1999 04:24:51 -0400 From: Andreas Boklund boklund@linux.nu Subject: 2.2.9, system map I didn't notice this "problem" until I installed RH6.0, and I don't know if it's Red Hat related or kernel related (my guess is RH). I just replaced /boot/System.map with /usr/src/linux/System.map and after that my systems have worked fine (I used kernel 2.2.7). So until someone smarter than me tells us why it's behaving as it does, this is how I made it work. //Andreas >besides does anybody know how to change the System.map after compiling a >kernel because i get some pesky errors but they are just pesky as startup. >something about invalid System.map version problem. > >william.s.yu@ieee.org ************************************************** * UNS, linux and (Mud) * * * * Voice: 070-555 55 34 * * ICQ: 12030399 * * Email: boklund@linux.nu * * * * That is how you find me, How do -I- find you ?
* ************************************************** From pesch@ibm.net Tue, 8 Jun 1999 04:24:17 -0400 Date: Tue, 8 Jun 1999 04:24:17 -0400 From: Paul Eduard Schenker pesch@ibm.net Subject: Computer Science research done on Beowulf class systems In my humble opinion Beowulf-like Clusters are very "commercial", i.e. you will see a steep rise in the quantity and quality of commercial software and hardware developed for them. The reason for my thinking is that: - the price /performance ratio of the hardware is far more attractive than with other solutions - you have the scalability you need in a commercial environment I therefore believe that you will soon see business-like packaging of clusters with a wide spectrum of applications ranging from making cartoons to weapons simulations to database- and web servers etc. etc. Paul At 09:43 AM 6/7/99 -0700, Hindman, John (DHS - ITSD) wrote: >Brad, et. al., here is an idea/question from an applications guy without the >technical know-how to answer it. > >Everything on Beowulf seems to be related to large scientific/engineering >applications. Is Beowulf suitable for a more business oriented architecture >of transaction processing against large relational databases? How about a >TCP/IP network connecting remote users to a Beowulf system with the database >on a storage area network? > >I have posed this question via fax and email to Red Hat and a professor >whose web page seemed oriented toward more general problems. I have had no >responses so far, so either the topic is potentially so commercially >lucrative that they don't want to talk about it, or so off the wall that it >isn't worth a reply. > >Thoughts, anyone? > >> -----Original Message----- >> From: Bradley M. 
Kuhn [SMTP:bkuhn@ebb.org] >> Sent: Monday, June 07, 1999 12:02 AM >> To: beowulf@beowulf.gsfc.nasa.gov >> Subject: Computer Science research done on Beowulf class systems >> >> >> I am posting to ask what (if any) types of Computer Science research is >> being done on Beowulf-class systems. Our Computer Science department is >> considering building one. However, there is some concern that this >> computer will be more helpful to the rest of the science departments than >> to >> the Computer Science department. >> >> I realize that "navel-gazing" research into making Beowulf systems better, >> faster, and more reliable is certainly possible, and projects like the one >> at NASA and the Mosix project are doing this type of research. >> >> I also know that work to make automatically parallelizing compilers (an >> active area of research in the compiler design community) is very >> possible. >> >> However, what I am looking for is information about *real* projects using >> Beowulf-class computers for Computer Science research. I have found lots >> of >> information on various aerospace, geological, and other scientific >> problems >> being solved with Beowulf class systems. However, I don't see lots of >> Computer Science projects using these systems. >> >> If anyone could tell me about such projects, I would much appreciate it. >> >> -- >> - bkuhn@ebb.org - Bradley M. Kuhn - bkuhn@gnu.org - >> http://www.ebb.org/bkuhn > > Paul Eduard Schenker 1 Peirce Hill Singapore 248558 Phone: 476 2245 Fax: 472 6480 email: pesch@ibm.net From pesch@ibm.net Tue, 8 Jun 1999 04:58:18 -0400 Date: Tue, 8 Jun 1999 04:58:18 -0400 From: Paul Eduard Schenker pesch@ibm.net Subject: hard disk reliability Joe, what is the best place to tap the temperetaure on a HDD - where does it get hottest - near the spindle? The CPU heatsink we're covering, of course. Paul > >We need the capability to get the temperature for the reasons you suggest. 
I >believe the best place to do it is on the processor die itself. The temp. >coefficient of silicon is a nice place to start designing for such an integrated >capability. But untill that comes to pass, external sensors are important. Som >recent BIOSs support temperature measurement at the CPU heatsink. > >-- >Joe Ferguson, ApeX Systems Integration Corp. >Voice: 919.468.8150 >FAX: 919.468.5288 >email: jferg@2boot.com > > > >Attachment Converted: "c:\ace\eudora\attach\jferg8.vcf" > Paul Eduard Schenker Phone: +65 - 476 2245 1 Peirce Hill Fax: +65 - 472 6480 Singapore 248558 email: pesch@ibm.net From lph@scali.com Tue, 8 Jun 1999 05:06:07 -0400 Date: Tue, 8 Jun 1999 05:06:07 -0400 From: L.P.Huse lph@scali.com Subject: Cables Florent Calvayrac wrote: >Keith Murphy wrote: >> Like you I do not want to generalize. Of course if cheaper hardware can do >> the job it should be used. However many Beowulf projects could certainly >> use a faster interconnect and SCI (or Myrinet) will improve their >> performance. >> >> If you use 2D Torus you will not need any switches, there is a 96 node 192 >> server SCI system running in Paderborn Germany with no switches rated at >> 86.4 GigaFlops. >> >> >> In the meantime it is the >> >> fastest interface available today and makes an ideal Beowulf interface. >> > >> >Please avoid generalizations. It isn't ideal if much cheaper hardware >> >can do the same job -- maybe your application doesn't need that much >> >network, or isn't sensitive to latency? And you still can't buy huge >> >SCI switches, which makes it inferior to Myrinet for large systems. >> > >> >Linux drivers, or better yet, open-source drivers will be a big step >> >forward for SCI and cluster computing. But you still have to look at >> >price/performance. >> > >> > >I had the occasion thanks to our German colleagues to test >the 32 processors SCI cluster in Paderborn, and I had a very sobering >experience. 
>I have developed a parallel Density Functional program under MPI where >the wavefunctions are distributed among the processors, and to >a good approximation the parallel work amounts to repeatedly >summing up the density on the discretization grids. It seems (but it might be >wrong) >that the corresponding MPI_ALLREDUCE calls are very unfavorable >under ScaMPI, and indeed I get a better performance with a TCP/IP >Fast Ethernet network, because it seems that the reductions/distributions >are way better with such a communication network. > >Any comments ? > >What is BIP by the way ? > >-- >Florent Calvayrac | Tel : 02 43 83 32 72 >Laboratoire de Physique de l'Etat Condense | Fax : 02 43 83 35 18 >UPRESA-CNRS 6087 | >Universite du Maine-Faculte des Sciences | >72085 Le Mans Cedex 9 Hi, Scali's initial MPI_Allreduce was based on the MPICH 1.0 implementation (linear reduce + broadcast). The current implementation is based on the work of Rolf Rabenseifner and uses binomial trees and overlapped calculation and communication, improving performance on the 96-node cluster from 73 MB/s to 1450 MB/s for 64k buffers (MPI_SUM with MPI_DOUBLE). Feel free to contact us for another try! /Lars Paul \\_// Lars Paul Huse; Parallisator & Doctor Scientarum Student (o-o) mailto:lph@scali.no http://www.ifi.uio.no/~larspaul ---oOOO-(_)-OOOo----------------------------------------------------- * .oooO Institutt for Informatikk - UiO (rom 3343) PO Box 1080, * ( ) Oooo. N-0316 OSLO Voice +47 22 85 24 34 Fax +47 22 85 24 01 ----\ (----( )------------------------------------------------------ \_) ) / Scali AS, Hvamstubben 17, n-2013 Skjetten. (_/ Voice +47 63 84 67 04 Fax +47 63 84 59 22 From pesch@ibm.net Tue, 8 Jun 1999 05:09:15 -0400 Date: Tue, 8 Jun 1999 05:09:15 -0400 From: Paul Eduard Schenker pesch@ibm.net Subject: hard disk reliability Joe, what is the best place to tap the temperature on a HDD - where does it get hottest - near the spindle? The CPU heatsink we're covering, of course.
Paul > >We need the capability to get the temperature for the reasons you suggest. I >believe the best place to do it is on the processor die itself. The temp. >coefficient of silicon is a nice place to start designing for such an integrated >capability. But untill that comes to pass, external sensors are important. Som >recent BIOSs support temperature measurement at the CPU heatsink. > >-- >Joe Ferguson, ApeX Systems Integration Corp. >Voice: 919.468.8150 >FAX: 919.468.5288 >email: jferg@2boot.com > > > >Attachment Converted: "c:\ace\eudora\attach\jferg8.vcf" > Paul Eduard Schenker Phone: +65 - 476 2245 1 Peirce Hill Fax: +65 - 472 6480 Singapore 248558 email: pesch@ibm.net From einar@scali.com Tue, 8 Jun 1999 05:54:49 -0400 Date: Tue, 8 Jun 1999 05:54:49 -0400 From: Einar Rustad einar@scali.com Subject: Cables >> If you use 2D Torus you will not need any switches, there is a 96 node 192 >> server SCI system running in Paderborn Germany with no switches rated at >> 86.4 GigaFlops. > >But then I pay a huge price in terms of lowered bisection bandwidth, In terms of bisection bandwidth for the SCI-based 2D Torus, the number for a 96 node system exceeds 5GigaBytes/s measured with an MPI program at user level so what is this supposed to be lower than? >and I have to teach my queue system and my MPI implementation about >the topology of the network, Like any other MPP system, overall performance may be affected by physical location of the communicating processes. In the Scali 2D Topology there is sufficient interconnect bandwidth to sustain a rich mixture of network traffic. This means less need for sophisticated algorithms to determine where to place processes. The limiting factor for this topology is related to the PCI bus, not to the SCI interconnect. (Refer to the prsentation slide at: http://www.scali.com/Presentation/sld016.htm) >and I have to accept the fact that 2 >unrelated jobs on the same system will interfere with each other. 
I >could imagine that a program which only wanted to send a few small >packets with low latency might not be bothered by this, but I can >imagine other bandwidth-hog programs which would run better on fast >ethernet than SCI. > >-- g The SCI interconnect guarantees fair arbitration and forward progress. This is in contrast to the Ethernet protocol where high traffic generates network thrashing. We would like to point out that a "bandwidth-hog" MPI program with an all-to-all communication pattern measures 30MBytes/s per node for 64 nodes. Compare this to a peak (hardware level) bandwidth of 12,5MBytes (100Mbit/s) Ethernet, and this is without the communication protocol overheads needed to handle the not-to-be-trusted Ethernet. Einar Rustad, Scali Einar Rustad, Vice President Marketing and Operations Scali AS Computer Systems Voice: +47 63 84 67 07; FAX: +47 63 84 40 05 Cell. phone.: +47 92 48 45 10; email: einar@scali.com; Visiting/mail Addr: Hvamstubben 17, 2013 Skjetten, Norway http://www.scali.com From dmerchan@hiwaay.net Tue, 8 Jun 1999 06:05:45 -0400 Date: Tue, 8 Jun 1999 06:05:45 -0400 From: dmanddmer dmerchan@hiwaay.net Subject: Beowulf vs. MOSIX Mark, Are you sacrificing the visualization capabilties for faster backend computing? Obviously, the cpu power is your primary consideration. Are you keeping the SGI's as simply display stations? Or is 3D/4D even a consideration in your imagery? David Mark Lucas wrote: > > Peter, > > That is our opinion at AGIS/ImageLinks. We process satellite and aerial > imagery which stresses CPU, bandwidth and storage. We have just concluded > porting all of our code to run on LInux and are in the process of removing > all of the SGIs and Suns replacing them with dual CPU Pentium boxes. More > horsepower and substantially lower cost. We have been looking at MPI and > PVM in our code thinking of BeoWulf clusters, but after attending LinuxExpo > we have decided that Mosix is a better fit - especially given our workflow. 
> We were impressed by how quickly the clusters at the show installed it and > tried it out. Wiring PVM and MPI into the code is definitely desired so > that there is more to spread around. > > Mark > > >It seems to me that for business applications, MOSIX might be a better way > >to go as a quick and easy way to take advantage of clusters. With its load > >balancing and transparent process migration, even existing serial > >applications should be able to take advantage of the power of clusters. > >With Beowulf, on the other hand, one must parallelize the code in order to > >see any improvement in performance. > > > >Is this assessment accurate? Any comments? > > > >--- > >Peter Szwedyk > >Goldman, Sachs & Co. > >Securities Lending Technology > >One New York Plaza, 48th Floor > >New York, NY 10004 > >Phone: 212-357-8105 | Fax: 212-428-1405 > > ****************** > Mark R. Lucas > Chief Technical Officer > > AGIS/ ImageLinks, Inc. > 4450 West Eau Gallie Blvd. > Suite 164, Perimeter Center > Melbourne, Florida 32934 > > (407) 253-0011 (work) > (407) 253-5559 (fax) > (407) 725-6842 (home) > (407) 693-6842 (cellular) > > email: mlucas@imagelinks.com > http://www.imagelinks.com > ******************* From g.t.h.roest@wbmt.tudelft.nl Tue, 8 Jun 1999 06:33:35 -0400 Date: Tue, 8 Jun 1999 06:33:35 -0400 From: Gerben Roest g.t.h.roest@wbmt.tudelft.nl Subject: 2.2.9 At 06:39 PM 6/7/99 +0800, William Emmanuel S. Yu wrote: >besides does anybody know how to change the System.map after compiling a >kernel because i get some pesky errors but they are just pesky as startup. >something about invalid System.map version problem. After compiling a new kernel, copy the file "System.map" from /usr/src/linux to /boot (if that's the place where your kernel resides.). I always copy the newest System.map to /boot as e.g. System.map-2.2.9-1 and then make a soft link to System.map. Greetings, Gerben Roest. 
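The copy-and-link routine described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not anyone's actual install script: the function name install_system_map and its arguments are hypothetical, and the example runs against temporary directories rather than the real /usr/src/linux and /boot.

```python
import os
import shutil
import tempfile


def install_system_map(src_dir, boot_dir, version):
    """Copy src_dir/System.map to boot_dir/System.map-<version> and make
    boot_dir/System.map a relative symlink to that copy.

    The function name and arguments are hypothetical, for illustration only.
    """
    versioned_name = "System.map-" + version
    shutil.copy2(os.path.join(src_dir, "System.map"),
                 os.path.join(boot_dir, versioned_name))

    link = os.path.join(boot_dir, "System.map")
    # os.symlink will not overwrite, so drop any existing file or link first.
    if os.path.lexists(link):
        os.remove(link)
    os.symlink(versioned_name, link)
    return link


if __name__ == "__main__":
    # Dry run against throwaway directories instead of the real /boot.
    with tempfile.TemporaryDirectory() as src, \
            tempfile.TemporaryDirectory() as boot:
        with open(os.path.join(src, "System.map"), "w") as f:
            f.write("c0100000 T _stext\n")
        link = install_system_map(src, boot, "2.2.9-1")
        print(link, "->", os.readlink(link))
```

On a real system the two directories would presumably be /usr/src/linux and /boot, and the call would need root. Keeping a versioned copy behind a link makes it easy to switch back if an older kernel is booted again.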
Linvision VoF Delft, The Netherlands From stefan@physc.su.se Tue, 8 Jun 1999 07:50:56 -0400 Date: Tue, 8 Jun 1999 07:50:56 -0400 From: Stefan Lindberg stefan@physc.su.se Subject: 2.2.9, system map It's a fact that as soon as you recompile the kernel you have to replace /boot/System.map with the one in /usr/src/linux. System.map is a text file which describes all kernel functions and where they are. If you use a static kernel it's OK to skip /boot/System.map; one might get some errors at startup, but things compiled into the kernel will work. Things might also break when compiling programs that require some information about the kernel, such as what is implemented. /S Andreas Boklund wrote: > I didn't notice this "problem" until I installed RH6.0, and I don't > know if it's RedHat related or kernel related (my guess is RH). > I just replaced /boot/System.map with /usr/src/linux/System.map and > after that my systems have worked fine (I used kernel 2.2.7). So until > someone smarter than me tells us why it's behaving as it does, this is > how I made it work. > > //Andreas > > >besides does anybody know how to change the System.map after compiling a > >kernel because i get some pesky errors but they are just pesky as startup. > >something about invalid System.map version problem. > > > >william.s.yu@ieee.org > > ************************************************** > * UNS, linux and (Mud) * > * * > * Voice: 070-555 55 34 * > * ICQ: 12030399 * > * Email: boklund@linux.nu * > * * > * That is how you find me, How do -I- find you ?
* > ************************************************** -- ============= FOOS - Chemistry ==================== Stockholm University URL: www.fos.su.se/~stefan/ FOOS - Chemistry Phone: +46 8 674 7481 Stefan Lindberg Cell: +46 70 491 0223 S-106 91 Stockholm Office: Arrhenius,c454 BEOWULF: http://www.fos.su.se/~aatto/helge/ E-Mail: stefan@fos.su.se Get PGP public key at my homepage =================================================== From gerry@cs.tamu.edu Tue, 8 Jun 1999 07:52:58 -0400 Date: Tue, 8 Jun 1999 07:52:58 -0400 From: Gerry Creager gerry@cs.tamu.edu Subject: hard disk reliability jferg wrote: > > Wayde Milas wrote: > > > It is subjective. Id classify the Seagate Baracudas(sp)? 10000 rpm as > > HOT. a plain old vanilla 5400 rpm lvd as cool... If you can touch the > > drive witholut hurting yourself after its been active for 2 hours, its > > cool. otherwise its hot. :P > > > > Ibms some where inbetween... > > > > Never said it was scientific, Just personal experience. Hot drives tend > > to fail more often. But it is quantifiable. At NASA in the Manned Spaceflight program, there is a safety check for touch-temperature. That has, in fact, been quantified to 113 deg.F. If you can place your hand (finger/cheek/toe) on it and remain in place for 15 sec. without a burn or uncomfortable heat, it's "OK." > The major power consumption component in a hard drive is the power required to > overcome aerodynamic losses due to the spinning disk. (All other things being > equal) the power requirement due to aerodynamic losses goes as the cube of the > rotational speed. Thus, motor power in the 10KRPM drive vs the 5.4KRPM drive > goes as 10^3 / 5.4^3, or about a factor of 6.3. Fast drives are hotter in > more than way. > > It is also well known that failure rate rises rapidly with temperature. That's > one reason ovens are used in system stress testing. The term here is "accellerated life testing." 
However, if the hardware's designed to operate at an elevated temperature, the temp for accelerated testing has to be recalculated and ramped up. I do not recall the parameters of this calculation. -- Gerry Creager Mapping Sciences Laboratory 409.845.7201 Office Texas Agricultural Experiment Station 409.845.2273 Faz Texas A&M University System 409.228.7686 Pager (preferred) College Station, Texas 77843-2120 gerry@page4.tamu.edu Pager: 4092287686@mobile.att.net "Opinions expressed are mine and do not necessarily represent those of Texas A&M University." From tony.albers@danotech.dk Tue, 8 Jun 1999 08:26:27 -0400 Date: Tue, 8 Jun 1999 08:26:27 -0400 From: Tony Albers tony.albers@danotech.dk Subject: RH 6.0 Boot Booting RH 6.0 I get as far as "Turning on user and group quotas for local filesystems" and then the system just stops booting.. No error messages or anything, it just sits there.. Anybody got an idea? I'm running on a HP Netserver 5/100 LS2 with wide SCSI discs.. Thanks, Tony From mlucas@imagelinks.com Tue, 8 Jun 1999 08:31:02 -0400 Date: Tue, 8 Jun 1999 08:31:02 -0400 From: Mark Lucas mlucas@imagelinks.com Subject: Beowulf vs. MOSIX David, Our algorithms are very similar to ray tracing and are very CPU intensive. For visualization the 3D pipeline provided by the SGIs is great once you have the elevation surface and the texture map all lined up. Our work consists of building the texture map correctly (radiometrically and geometrically) and the 3D pipeline doesn't help us. Fortunately, the problem is very parallel in nature and is well suited for a cluster implementation. Mark >Mark, > >Are you sacrificing the visualization capabilties for faster backend >computing? Obviously, the cpu power is your primary consideration. Are >you keeping the SGI's as simply display stations? Or is 3D/4D even a >consideration in your imagery? > >David > >Mark Lucas wrote: >> >> Peter, >> >> That is our opinion at AGIS/ImageLinks. 
We process satellite and aerial >> imagery which stresses CPU, bandwidth and storage. We have just concluded >> porting all of our code to run on LInux and are in the process of removing >> all of the SGIs and Suns replacing them with dual CPU Pentium boxes. More >> horsepower and substantially lower cost. We have been looking at MPI and >> PVM in our code thinking of BeoWulf clusters, but after attending LinuxExpo >> we have decided that Mosix is a better fit - especially given our workflow. >> We were impressed by how quickly the clusters at the show installed it and >> tried it out. Wiring PVM and MPI into the code is definitely desired so >> that there is more to spread around. >> >> Mark >> >> >It seems to me that for business applications, MOSIX might be a better way >> >to go as a quick and easy way to take advantage of clusters. With its load >> >balancing and transparent process migration, even existing serial >> >applications should be able to take advantage of the power of clusters. >> >With Beowulf, on the other hand, one must parallelize the code in order to >> >see any improvement in performance. >> > >> >Is this assessment accurate? Any comments? >> > >> >--- >> >Peter Szwedyk >> >Goldman, Sachs & Co. >> >Securities Lending Technology >> >One New York Plaza, 48th Floor >> >New York, NY 10004 >> >Phone: 212-357-8105 | Fax: 212-428-1405 >> >> ****************** >> Mark R. Lucas >> Chief Technical Officer >> >> AGIS/ ImageLinks, Inc. >> 4450 West Eau Gallie Blvd. >> Suite 164, Perimeter Center >> Melbourne, Florida 32934 >> >> (407) 253-0011 (work) >> (407) 253-5559 (fax) >> (407) 725-6842 (home) >> (407) 693-6842 (cellular) >> >> email: mlucas@imagelinks.com >> http://www.imagelinks.com >> ******************* ****************** Mark R. Lucas Chief Technical Officer ImageLinks, Inc. 4450 West Eau Gallie Blvd. 
Suite 164, Perimeter Center Melbourne, Florida 32934 (407) 253-0011 (work) (407) 253-5559 (fax) (407) 725-6842 (home) (407) 693-6842 (cellular) email: mlucas@imagelinks.com http://www.imagelinks.com ******************* From ajl4@eecs.lehigh.edu Tue, 8 Jun 1999 08:37:45 -0400 Date: Tue, 8 Jun 1999 08:37:45 -0400 From: Adam Lazur ajl4@eecs.lehigh.edu Subject: Beowulf vs. MOSIX John TenEyck (jteneyck@xyos.net) said: > My question (which might be very naive) would be can you use PVM, MPI (LAM > or some other implementation) along with MOSIX? In short, yes you can. At Linux Expo we actually did a demo where the pvmpovray image rendered faster when spawning all the processes on one node and allowing MOSIX to balance the load than just using vanilla pvm to hand out the povray pieces. I also believe there may be a paper on that at the MOSIX website. .adam -- Adam Lazur - Computer Engineering Undergrad - Lehigh University icq# 3354423 - http://www.lehigh.edu/~ajl4 Windows 98 packs with solitare, Linux packs with DOOM. You can have your deck of cards. I'll take a chainsaw. From jferg@2boot.com Tue, 8 Jun 1999 09:12:27 -0400 Date: Tue, 8 Jun 1999 09:12:27 -0400 From: jferg jferg@2boot.com Subject: hard disk reliability This is a multi-part message in MIME format. --------------6D200278B50048D662C2B691 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Paul Eduard Schenker wrote: > Joe, what is the best place to tap the temperetaure on a HDD - where does > it get hottest - near the spindle? > > The CPU heatsink we're covering, of course. > > Paul > > > > >We need the capability to get the temperature for the reasons you suggest. I > >believe the best place to do it is on the processor die itself. The temp. > >coefficient of silicon is a nice place to start designing for such an > integrated > >capability. But untill that comes to pass, external sensors are > important. Som > >recent BIOSs support temperature measurement at the CPU heatsink. 
> > > >-- > >Joe Ferguson, ApeX Systems Integration Corp. > >Voice: 919.468.8150 > >FAX: 919.468.5288 > >email: jferg@2boot.com > > > > > > > >Attachment Converted: "c:\ace\eudora\attach\jferg8.vcf" > > > Paul Eduard Schenker Phone: +65 - 476 2245 > 1 Peirce Hill Fax: +65 - 472 6480 > Singapore 248558 email: pesch@ibm.net This is pure conjecture: since the primary loss mechanism is viscous friction, and there is vigorous pumping action with a large volume of air circulating within the housing, I suspect the housing will have a reasonably uniform temperature. The heating effect is substantial. Consider a vacuum sweeper with a motor drain of about a kilowatt. Almost all of the power goes into heating the air due to pumping and friction losses. Most of us have observed that the exhaust temperature is several degrees above ambient even with a substantial flow through the system. In the closed environment of a modern hard drive, there is no through flow (the enclosed airspace is captive), so the heat has to get out through the walls of the cavity by conduction across two air-to-metal interfaces. The temperature builds up enough to enable this conduction. BTW, in electric motor engineering jargon, the air friction losses are called "windage losses", and they can be a substantial portion of the overall losses, comparable to the electrical losses of the system. Back to your question: the case should have a fairly uniform temperature because of the internal circulation. To get good cooling, I would ensure that there is good external airflow across the exposed parts of the housing. -- Joe Ferguson, ApeX Systems Integration Corp.
Voice: 919.468.8150 FAX: 919.468.5288 email: jferg@2boot.com --------------6D200278B50048D662C2B691-- From Fred_deBros@etherdome.mgh.harvard.edu Tue, 8 Jun 1999 09:53:29 -0400 Date: Tue, 8 Jun 1999 09:53:29 -0400 From: Fred_deBros@etherdome.mgh.harvard.edu Fred_deBros@etherdome.mgh.harvard.edu Subject: 2.2.9 > Anyone upgrade to 2.2.9 ? If so how goes? Well, I still try to: make, make dep, make bzImage, make modules, make modules_install (I wanted the 4-cdrom changer to run) and I get on bootup: after sending BOOTP and RARP requests......unable to handle kernel paging request at virtual address 00d0030.....lots of computerese ....Aiee, killing interrupt handler. I don't think that is 2.2.9 per se. 2.2.5 runs fine. Am I doin sumpin wrong in make menuconfig? I am not knowledgeable enough to fix that. Somebody? fred From jav@blazenet.net Tue, 8 Jun 1999 09:55:11 -0400 Date: Tue, 8 Jun 1999 09:55:11 -0400 From: jav jav@blazenet.net Subject: the need for speed > -----Original Message----- > From: Michael Stein [SMTP:mas@ucla.edu] > Sent: Tuesday, 08 June, 1999 01:38 > To: Beowulf > Subject: RE: the need for speed > > >Well, obviously there are physical limitations, but as the temperature > >of the system is lowered, the theoretical resistance is lowered allowing > >for inherent greater throughput. On the other hand, if a CPU is > >overclocked, more heat than the CPU was engineered for will be produced > >and therefore the processor will literally melt.
> > I would think that at some higher clock rate it would be necessary to > cool the motherboard chipset and memory too. Then I'd worry that the > electrolytic capacitors would freeze (I don't believe they work when > frozen). In addition there is the thermal stress from the temperature > changes -- would this cause parts to just pop off the circuit board > after a few cycles (say room temp to -80 C?). > [>>] True, the capacitors are definitely not made for those temperatures, nor is the motherboard, which is why manufacturers of motherboards set a temperature range in which the systems may be operated. The manufacturer's limit can probably be stretched about 10% below their suggestion for long durations, but going below that will cause ugly problems. I believe that I have seen, however, units which pass a refrigerant coil tube just over the CPU and apply the cooling effect mainly to the processor. This would be an ideal situation for many reasons: (1) The CPU could be theoretically cooled to about -120 C safely (2) The cooling will dissipate itself across the components and cool the mainboard down somewhat, but not to the point of exhaustion (3) The moving parts of the cooling system could be located away from noise sensitive devices, etc. From pesch@ibm.net Tue, 8 Jun 1999 10:43:35 -0400 Date: Tue, 8 Jun 1999 10:43:35 -0400 From: Paul Eduard Schenker pesch@ibm.net Subject: Cables Dear friend in speed and Parallisator, your figures below successfully paralized my mind - 1450 MB/s -- now that will let us all go ballistic. Is the little troll left of your name involved in any way? Paul > >Scali's initial MPI_Allreduce was based on the MPICH 1.0 implementation >(linearly reduce + broadcast). The current implementation is based on the work of >Rolf Rabenseifner and uses binomial trees and overlaps calculation and >communication, improving performance on the 96 node cluster from 73 MB/s >to 1450 MB/s for 64k buffers (MPI_SUM with MPI_DOUBLE).
>Feel free to contact us for another try ! > >/Lars Paul > > \\_// Lars Paul Huse; Parallisator & Doctor Scientarum Student > (o-o) mailto:lph@scali.no http://www.ifi.uio.no/~larspaul >---oOOO-(_)-OOOo----------------------------------------------------- >* .oooO Institutt for Informatikk - UiO (rom 3343) PO Box 1080, >* ( ) Oooo. N-0316 OSLO Voice +47 22 85 24 34 Fax +47 22 85 24 01 >----\ (----( )------------------------------------------------------ > \_) ) / Scali AS, Hvamstubben 17, n-2013 Skjetten. > (_/ Voice +47 63 84 67 04 Fax +47 63 84 59 22 > > Paul Eduard Schenker Phone: +65 - 476 2245 1 Peirce Hill Fax: +65 - 472 6480 Singapore 248558 email: pesch@ibm.net From Andy.Hencke@wcom.com Tue, 8 Jun 1999 10:44:39 -0400 Date: Tue, 8 Jun 1999 10:44:39 -0400 From: Andrew Hencke Andy.Hencke@wcom.com Subject: Hello List; I'm looking for HELP Hello Beowulf folks, We are going to set up a Beowulf cluster here in Denver this summer, and I have a few initial questions. In the Installation Guide (http://www.beowulf-underground.org/doc_project/index.html) written by Jacek Radajewski and Douglas Eadline, the authors refer to a method of installing the clients: "The second method is the one I used in the first stage of our topcat system, that is installing the operating system on each client separately and then running a configuration script on the server which performs the rest of the setup." This is the way we would like to configure our clients, but the authors left those instructions out of the installation guide. Does anyone else have those instructions?????? Secondly, if there is anyone out there who would be willing to send us your email address for questions during setup, that would be greatly appreciated. Thirdly, we are debating running this installation from RedHat 5.2 or 6.0. Any thoughts about problems that might exist using the newer version of RedHat (and therefore newer version of Linux)????? 
Thanks, Andy Hencke University of Colorado, Denver From joelja@darkwing.uoregon.edu Tue, 8 Jun 1999 11:14:30 -0400 Date: Tue, 8 Jun 1999 11:14:30 -0400 From: Joel Jaeggli joelja@darkwing.uoregon.edu Subject: hard disk reliability Probably on the top in the center, assuming of course that you're mounting it right-side up. The temperature you're actually concerned about is that of the bearing races and the motors. It would of course be hard to test those directly... It's been a while since I had a drive that got hot enough to fry something on the PCB on the bottom (Quantum Atlas)... joelja On Tue, 8 Jun 1999, Paul Eduard Schenker wrote: > Joe, what is the best place to tap the temperature on a HDD - where does > it get hottest - near the spindle? > > The CPU heatsink we're covering, of course. > > Paul > > > > >We need the capability to get the temperature for the reasons you suggest. I > >believe the best place to do it is on the processor die itself. The temp. > >coefficient of silicon is a nice place to start designing for such an integrated > >capability. But until that comes to pass, external sensors are important. Some > >recent BIOSs support temperature measurement at the CPU heatsink. > > > >-- > >Joe Ferguson, ApeX Systems Integration Corp. > >Voice: 919.468.8150 > >FAX: 919.468.5288 > >email: jferg@2boot.com > > > > > > > >Attachment Converted: "c:\ace\eudora\attach\jferg8.vcf" > > > Paul Eduard Schenker Phone: +65 - 476 2245 > 1 Peirce Hill Fax: +65 - 472 6480 > Singapore 248558 email: pesch@ibm.net > > > > -------------------------------------------------------------------------- Joel Jaeggli joelja@darkwing.uoregon.edu Academic User Services consult@gladstone.uoregon.edu PGP Key Fingerprint: 1DE9 8FCA 51FB 4195 B42A 9C32 A30D 121E -------------------------------------------------------------------------- It is clear that the arm of criticism cannot replace the criticism of arms.
Karl Marx -- Introduction to the critique of Hegel's Philosophy of the right, 1843. From pesch@ibm.net Tue, 8 Jun 1999 11:24:04 -0400 Date: Tue, 8 Jun 1999 11:24:04 -0400 From: Paul Eduard Schenker pesch@ibm.net Subject: the need for speed When you increase the cpu clock speed do you increase the bus speed proportionally? if so you might need to cool some isolated parts, but you'd do it very localized; no need to cool your condensers... >I would think that at some higher clock rate it would be necessary to >cool the motherboard chipset and memory too. Then I'd worry that the >electrolytic capacitors would freeze (I don't believe they work when >frozen). In addition there is the thermal stress from the temperature >changes -- would this cause parts to just pop off the circuit board >after a few cycles (say room temp to -80 C?). > > > > Paul Eduard Schenker Phone: +65 - 476 2245 1 Peirce Hill Fax: +65 - 472 6480 Singapore 248558 email: pesch@ibm.net From rgb@phy.duke.edu Tue, 8 Jun 1999 12:06:33 -0400 Date: Tue, 8 Jun 1999 12:06:33 -0400 From: Robert G. Brown rgb@phy.duke.edu Subject: Computer Science research done on Beowulf class systems On Mon, 7 Jun 1999, Walter B. Ligon III wrote: (original thread?) > > > A few really > > > large sites might need to spread the load a bit, but would probably be better > > > served my multiple servers that work together than a beowulf. (from Greg) > > > > Obviously we are back to "What's a beowulf?" Multiple servers that > > work together is a traditional cluster. Businesses, btw, have used > > clusters for as long as scientific programmers have used clusters. The > > wall street firm I used to work for didn't have any machine with more > > than 2 CPUs, nor did they do any parallel programing, but they had a > > large cluster. They built it for availability and throughput reasons. (Walter) > Yeah, well, I really don't want to debate that. What you have said here is > exactly my point. 
Beowulf isn't really the approach for these problems, but > there are other good approaches. "Clusters" have been around for a long time > in many different forms, and certainly they are a technique for improving > throughput and capacity. Beowulf is a parallel computer architecture. Most > of the resources in a beowulf are for internal use. A lot of what I said > in that posting was actually generic to parallel computers - which HAVE > been around longer than computers clustered for business use. An interesting note on this subject (which I'm cc'ing to the Extreme Linux list as well, as it is as relevant to the EL folks as it is to just beowulfers): Network Computing magazine just came out with Linux for its cover story. It's claim: Linux is a mature and rapidly developing environment well able to compete head to head with NT or Netware, already in place in many organizations (although typically for dedicated "speciality" purposes more than as a general purpose solution), and with (obviously) superior price/performance for any purpose for which it competes head to head performance wise. They place it somewhat above NT but still below "commercial Unices" primarily because of (don't get angry at me, I didn't write this stuff, but the author, Greg Shipley, gshipley@neohapsis.com would undoubtedly LOVE your input:-): a) A lack of "robust SMP support". b) An "unpolished clustering technology". c) A lack of a "robust 64 bit journalized file system". d) A lack of "advanced options for high availability" (see Greg's remarks above). All of these latter elements are presumably present in "market leaders" like Compaq's Tru64 and Sun's Solaris on the high end. On the low(er) end, the article complains that Linux is still weak on the commercial database front: "Moving a company's financial system onto an early beta of Oracle 8 for Linux is a bad idea..." although he praises it for FTP and Web server farms. 
Similarly, the author sings the praises of Lotus Notes, Microsoft Exchange, and Novell Groupwise and bemoans the lack of a similar tool in Linux (while acknowledging its superiority as an SMTP, POP or IMAP mail relay or server platform, complains that there are still missing certain things like an enterprise level backup tool, worries about the "expert friendly" nature of Linux (as basically a Unix), and finally does a comparison of Linux documentation and support -- the "Linux Certified Engineer" (LCE, obnoxious as such a thing might be to you or me) gives businesses the warm fuzzies and is most definitely in the near future. (Parenthetically, I wonder how they will grant LCE's. Will I have to pay somebody hundreds of dollars and take a "course" that I could have given as an instructor instead? Will Don Becker or Alan Cox have an LCE? Or will there (more reasonably) be an open certification process that is either dirt cheap or free that doesn't require one to pay for or take a course at all if one can pass the exam without it. Enquiring minds want to know...;-) All of this strikes me as being a pretty fair treatment. It is certainly one of the best treatments Linux has received in a major computing mag -- NWC ends up basically endorsing it as very, very nearly "enterprise ready" (which I interpret as being ready to completely replace WinXX products and other Unices from top to bottom in an enterprise) and an overwhelming price/performance win whereever it is already deployed in the enterprise. My biggest bitch about the article is in its treatment of "robust SMP" and clustering, the topic of this thread. Of course we all know that SMP under linux is quite robust indeed and in 2.2.x becomes both robust and sophisticated. At the time the article was written, 2.2.x was not autodeployed in commercial Linux distributions and now is, so perhaps the author would remove this as an objection/obstacle. 
Still, there are quotes that annoy me: "Clustering is another area in which Linux lags for mainstream corporate needs. The Beowulf project hit the mainstream this spring by matching world record holder, a Cray T3t-900-AC64, in the PovRay benchmark test" ... (run by IBM on a 17 Netfinity cluster running Red Hat)..." Linux clusters have been popping up in education and aerospace research facilities for some time now. A few production Linux clusters even rate among the Top 500 most powerful computers in the world (www.top500.org). Organizations seeking raw, high-end computational power won't find a more cost-effective solution. >>But Linux clustering is little more than academic. Web services. databases and general high-availability services that would benefit from Linux clustering haven't matured yet.<<" (>>emphasis mine<<). This is a curious remark, since earlier he describes the widespread use of Linux in FTP and Web server farms (are not "farms" "clusters"?). I do think that the remarks concerning database "clusters" are apropos, but not Linux specific. The real problem (as I understand it) is that "database cluster" technlogy itself is fairly immature on any platform. Is this incorrect? Does NT or Solaris support some sort of superior "clustering"? Also, what "advanced options for high-availability" clustering are missing? > > So watch out for folks who use "beowulf" interchangably with "cluster". > > I don't, but most of the new people asking questions on this mailing > > list do. > > Well, I feel I should work to educate them, not support their misconceptions. 
This ongoing education thread (which reaches back over many iterations) is actually a very important one that both of you have contributed tremendously to over many years -- clearly the author of the NWC article recognizes a key element of the distinction, that beowulfs are powerful parallel numerical engines while business "clusters" are more amorphously defined and (whether or not the actual component tools exist for Linux) are not being widely >>marketed<< as turnkey solutions at this time. It is my own belief (based on reading this list for many years) that there is both some truth and some error in the author's statements concerning Linux clustering technology. I would say that in many cases it does in fact exist, but I would also agree that it isn't yet properly organized and packaged and resold, although there are a few vendors (VAR, Paralogics, others?) that are working on it. This represents a huge opportunity to entrepreneurs, as we were working hard on pointing out at the EL booth at Linux Expo last month. I'd say that anyone who identifies key business "clustering" technologies and develops them agressively over the next six to twelve months has an excellent chance of riding a wave as Linux surges into the Enterprise. They'll undoubtedly make a few bushel baskets of well-deserved moola in the meantime. In a lot of cases this will consist of identifying and integrating existing tools, in others porting existing tools, and in still others designing and building key components that are still missing. Alas, there isn't a lot that can be done with the database side of things except work on a clustered version of mysql (a possibility recently discussed on this list) -- the commercial products are being made openly available to non-corporate (discorporate? ;-) Linux humans which is good, but they are still not open source which makes it hard to tinker with them. 
This is a key time for developing partnerships and business alliances to speed the development process, as well. If disparate groups in possession of distinct pieces of the pie get together, they can build the pie a lot faster and there is plenty of pie-starved market to go around. The last thing that would be very useful (that is apropos to the beowulf list in particular) would be the integration of the capabilities of the classic "beowulf" with those of the "business cluster", which may well be a lot more amorphous or which may optimize completely different parts of the information processing stream (like PVFS optimizes disk access, or Web farms optimize availability). The most powerful and general purpose cluster information entity that I can imagine would be one with a "beowulf" component optimized for very fast parallel computation on problems with a variety of grain sizes, a (journalized, 64 bit?) PVFS component that provides very fast distributed access to a very large file structure, a parallel network component that provides load-balanced high-availability access to all of this data and compute power, and probably several other "clustered" components I haven't thought of. Such a Linux/COTS construct could serve as the core of a true enterprise level compute facility -- instant and balanced access to both data and processing power. Just musings, I know, but I thought they might be of interest on this list. I learned at Linux Expo that there is a rather large group of "lurkers on the lists"; people trying to understand the technology being developed and discussed to see how to apply it in their own niches. It would be really very interesting to develop some sort of list of what clustering technologies and products already exist in Linux (at what stages of development) and what is still "missing" that would earn Linux the NWC seal of approval as "Enterprise Ready" as a clustering foundation. Oh, one last interesting note. 
NWC repeated the infamous Samba/SMB comparison done by Mindcraft a few months ago. Their conclusion: Linux performs almost identically to NT as an SMB server -- either one can be a bit faster on any particular component of their benchmark depending on configuration details. Their auxiliary conclusion: If one is building a large operation, Linux (with fixed costs of maybe $100 for a distribution CD or two) is an overwhelming cost/benefit win, saving thousands of dollars. NWC also had a rather scathing editorial condemning the publication of the Mindcraft result as "independent" when in fact it was bought and paid for by Microsoft and run by Microsoft personnel on Microsoft systems. Their point is that Mindcraft has now destroyed any credibility they might have ever had. Not that they had a great deal to begin with. rgb Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu From rgb@phy.duke.edu Tue, 8 Jun 1999 12:20:45 -0400 Date: Tue, 8 Jun 1999 12:20:45 -0400 From: Robert G. Brown rgb@phy.duke.edu Subject: Beowulf vs. MOSIX On Mon, 7 Jun 1999, Szwedyk, Peter wrote: > It seems to me that for business applications, MOSIX might be a better way > to go as a quick and easy way to take advantage of clusters. With its load > balancing and transparent process migration, even existing serial > applications should be able to take advantage of the power of clusters. > With Beowulf, on the other hand, one must parallelize the code in order to > see any improvement in performance. > > Is this assessment accurate? Any comments? For the appropriate class of problems, this is both true and intelligent. 
Mosix turns a cluster into a virtual SMP machine and brings "parallel clustering" to embarassingly coarse grained (basically multiple serial) applications without any need to write parallel code or write the shell wrappers that one needed to manage these applications beforehand. It is going to be a godsend for many, many classes of problems -- for the first time, the network really >>is<< the computer, to borrow a really very fine line from Sun. However, there are still many other classes of problems for which MOSIX is not the answer. For some of these, real parallel computation is key. For others, parallelized access to data is key, and MOSIX doesn't necessarily eliminate a server bottleneck in the data stream. MOSIX will certainly offer instant and cost-beneficial gratification to many, many organizations seeking to utilize wasted compute resources transparently, but it is only one piece in a bigger puzzle. I think that the ultimate compute environment in medium to large businesses will evolve into something that has one or more "true beowulf" cores, a large and amorphous cluster (which will include most desktop workstations) running MOSIX as you describe, a parallelized filesystem and server construct to provide load-balanced, parallelized access to a large data warehouse, and tools to facilitate using all of these various components transparently (with MOSIX being just one of those tools). A user might seek to run a set of single threaded accounting processes that are MOSIX distributed but gets data in parallel with other applications accessing the same data space. Another user might run a complex SQL command to build a dataset, with parts of the command run (transparently) in parallel. 
Still another might be building a presentation that involves complex rendering and visualization of data landscapes, where the data is accessed in parallel from the parallelized filesystem, processed and rendered in parallel on a beowulf core, and displayed on a particular workstation (or even a collection of distributed workstations), again totally transparently. rgb > > --- > Peter Szwedyk > Goldman, Sachs & Co. > Securities Lending Technology > One New York Plaza, 48th Floor > New York, NY 10004 > Phone: 212-357-8105 | Fax: 212-428-1405 > > Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu From alan@lxorguk.ukuu.org.uk Tue, 8 Jun 1999 12:52:42 -0400 Date: Tue, 8 Jun 1999 12:52:42 -0400 From: Alan Cox alan@lxorguk.ukuu.org.uk Subject: Computer Science research done on Beowulf class systems > They place it somewhat above NT but still below "commercial Unices" > primarily because of (don't get angry at me, I didn't write this stuff, > a) A lack of "robust SMP support". So-so. Compared to a 64cpu ultrasparc (which is after all the benchmark here) > b) An "unpolished clustering technology". > c) A lack of a "robust 64 bit journalized file system". > d) A lack of "advanced options for high availability" (see Greg's > remarks above). Wouldnt argue with those. In fact I'd say someone did their research. > given as an instructor instead? Will Don Becker or Alan Cox have an > LCE? Or will there (more reasonably) be an open certification process > that is either dirt cheap or free that doesn't require one to pay for or > take a course at all if one can pass the exam without it. Enquiring > minds want to know...;-) One thing I hope is we will see multiple sources of such things and perhaps an official body with some respect (preferably an existing one) that oversees quality of testing. 
It's a multisource OS, it should be a multiexaminable OS too 8) > "Clustering is another area in which Linux lags for mainstream > high-availability services that would benefit from Linux clustering > haven't matured yet.<<" > > (>>emphasis mine<<). This is a curious remark, since earlier he > describes the widespread use of Linux in FTP and Web server farms (are Think about stuff like failover cases. Linux has clustering for performance not reliability. > problem (as I understand it) is that "database cluster" technology itself > is fairly immature on any platform. Is this incorrect? Does NT or See VMS. VMS is about 10 years ahead of all of us 8) There is a linux-ha list btw which discusses a lot of work on highly available linux (there are now several commercial options but not yet a good free one) Alan From dhart@indiana.edu Tue, 8 Jun 1999 12:57:02 -0400 Date: Tue, 8 Jun 1999 12:57:02 -0400 From: Dave Hart dhart@indiana.edu Subject: zombies I've been setting up a cluster, and have had a shockingly large number of MPI jobs fail [ps shows them as zombies]. They typically leave p4 error messages such as "Timeout in establishing connection" [or "net_recv read: probable EOF on socket: 1" or "Trying to receive a message when there are no connections" but these may be left from when I kill all the processes]. Has anyone had such an experience? Any advice? -- David Hart http://php.indiana.edu/~dhart Research Computing Support 812-855-2632 University Information Technology Services Indiana University From meisterj@acm.org Tue, 8 Jun 1999 13:52:21 -0400 Date: Tue, 8 Jun 1999 13:52:21 -0400 From: JackM meisterj@acm.org Subject: Computer Science research done on Beowulf class systems See www.linuxcertification.org. As I understand it, they are trying not to be distribution specific. You don't have to pay for classes, but you do have to pay the testing fees.
There was some talk of free tests over the Web, but until there exists some way to verify identity existing test centers will have to do. ---------- > On Mon, 7 Jun 1999, Walter B. Ligon III wrote: > > (Parenthetically, I wonder how they will grant LCE's. Will I have to > pay somebody hundreds of dollars and take a "course" that I could have > given as an instructor instead? Will Don Becker or Alan Cox have an > LCE? Or will there (more reasonably) be an open certification process > that is either dirt cheap or free that doesn't require one to pay for or > take a course at all if one can pass the exam without it. Enquiring > minds want to know...;-) > > All of this strikes me as being a pretty fair treatment. It is > certainly one of the best treatments Linux has received in a major > computing mag -- NWC ends up basically endorsing it as very, very nearly > "enterprise ready" (which I interpret as being ready to completely > replace WinXX products and other Unices from top to bottom in an > enterprise) and an overwhelming price/performance win whereever it is > already deployed in the enterprise. > > > rgb > > Robert G. Brown http://www.phy.duke.edu/~rgb/ > Duke University Dept. of Physics, Box 90305 > Durham, N.C. 27708-0305 > Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu > > > From deadline@plogic.com Tue, 8 Jun 1999 14:32:11 -0400 Date: Tue, 8 Jun 1999 14:32:11 -0400 From: Douglas Eadline deadline@plogic.com Subject: Computer Science research done on Beowulf class systems On Tue, 8 Jun 1999, Robert G. Brown wrote: > > a) A lack of "robust SMP support". > b) An "unpolished clustering technology". > c) A lack of a "robust 64 bit journalized file system". > d) A lack of "advanced options for high availability" (see Greg's > remarks above). So now we have "Road Map To the Future" which seems to be an often cited deficiency of Linux. 
Of course my stock reply to the road map question is "if it is important it will get done, remember Linux is a relative newcomer to the computer party." > > All of these latter elements are presumably present in "market leaders" > like Compaq's Tru64 and Sun's Solaris on the high end. On the low(er) > end, the article complains that Linux is still weak on the commercial > database front: > > "Moving a company's financial system onto an early beta of Oracle 8 > for Linux is a bad idea..." > Agree. Moving a company's financial system to anything is very risky. A prudent manager will take a wait and see approach. After all, their job and career are at stake. > > "Clustering is another area in which Linux lags for mainstream > corporate needs. The Beowulf project hit the mainstream this spring by > matching world record holder, a Cray T3t-900-AC64, in the PovRay > benchmark test" ... (run by IBM on a 17 Netfinity cluster running Red > Hat)..." Linux clusters have been popping up in education and aerospace > research facilities for some time now. A few production Linux clusters > even rate among the Top 500 most powerful computers in the world > (www.top500.org). Organizations seeking raw, high-end computational > power won't find a more cost-effective solution. >>But Linux clustering > is little more than academic. Web services, databases and general > high-availability services that would benefit from Linux clustering > haven't matured yet.<<" I would agree with this and change "haven't matured yet" to "maturing with the rest of the Linux market" (which IMO is quite rapid). It is becoming much more than an academic market. Indeed, we have delivered machines into several production environments. By the end of the year the numbers will be much higher because people in the vertical markets will talk to each other and the success stories will spread. > > (>>emphasis mine<<).
This is a curious remark, since earlier he > describes the widespread use of Linux in FTP and Web server farms (are > not "farms" "clusters"?). I do think that the remarks concerning > database "clusters" are apropos, but not Linux specific. The real > problem (as I understand it) is that "database cluster" technology itself > is fairly immature on any platform. Is this incorrect? Does NT or > Solaris support some sort of superior "clustering"? Also, what > "advanced options for high-availability" clustering are missing? > > > It is my own belief (based on reading this list for many years) that > there is both some truth and some error in the author's statements > concerning Linux clustering technology. I would say that in many cases > it does in fact exist, but I would also agree that it isn't yet properly > organized and packaged and resold, although there are a few vendors > (VAR, Paralogics, others?) that are working on it. This represents a > huge opportunity to entrepreneurs, as we were working hard on pointing > out at the EL booth at Linux Expo last month. I'd say that anyone who > identifies key business "clustering" technologies and develops them > aggressively over the next six to twelve months has an excellent chance > of riding a wave as Linux surges into the Enterprise. They'll > undoubtedly make a few bushel baskets of well-deserved moola in the > meantime. A few points here. The "turn-key" cluster does exist, but it is not the same as punching out desktop PCs. There are a lot of variables that go into the specification, design, and configuration of a "cluster" that are not present in the "desktop" sales model. The net effect is a slower growth rate of this market. (e.g. the market is learning) Much of the "raw" cluster technology exists and has been employed by the people on this list. There is, however, IMO, a big leap from these efforts to a supportable cluster product that can be used in the "mainstream". 
I am not saying this won't happen (indeed it will), but "getting your cluster to work" is very much different than selling a productized version on which ABC company is going to run its mission critical database. Technologies have to be tested and qualified as to availability, stability, performance, support, etc. This does slow things down a bit, but progress is moving forward. Quite frankly, I am amazed at how rapidly Linux and clusters have made inroads into the market place. I mean the mere fact that Linux is being considered a serious alternative to "whatever" is quite astounding. When the criticism stops so does the interest. Doug ------------------------------------------------------------------- Paralogic, Inc. | PEAK | Voice:+610.861.6960 115 Research Drive | PARALLEL | Fax:+610.861.8247 Bethlehem, PA 18017 USA | PERFORMANCE | http://www.plogic.com ------------------------------------------------------------------- From jim_pendergraft@dg-rtp.dg.com Tue, 8 Jun 1999 15:19:18 -0400 Date: Tue, 8 Jun 1999 15:19:18 -0400 From: Jim Pendergraft jim_pendergraft@dg-rtp.dg.com Subject: Computer Science research done on Beowulf class systems "Robert G. Brown" wrote: > > Network Computing magazine just came out with Linux for its cover story. > It's claim: Linux is a mature and rapidly developing environment well > able to compete head to head with NT or Netware, already in place in > many organizations (although typically for dedicated "speciality" > purposes more than as a general purpose solution), and with (obviously) > superior price/performance for any purpose for which it competes head to > head performance wise. > > They place it somewhat above NT but still below "commercial Unices" > primarily because of (don't get angry at me, I didn't write this stuff, > but the author, Greg Shipley, gshipley@neohapsis.com would undoubtedly > LOVE your input:-): > > a) A lack of "robust SMP support". > b) An "unpolished clustering technology". 
> c) A lack of a "robust 64 bit journalized file system". > d) A lack of "advanced options for high availability" (see Greg's > remarks above). > All of this strikes me as being a pretty fair treatment. It is > certainly one of the best treatments Linux has received in a major > computing mag -- NWC ends up basically endorsing it as very, very nearly > "enterprise ready" (which I interpret as being ready to completely > replace WinXX products and other Unices from top to bottom in an > enterprise) and an overwhelming price/performance win wherever it is > already deployed in the enterprise. Sounds to me like their specific shortcomings and general conclusions are both exactly on target. > My biggest bitch about the article is in its treatment of "robust SMP" > and clustering, the topic of this thread. Of course we all know that > SMP under linux is quite robust indeed and in 2.2.x becomes both robust 'Robust' means more than 'doesn't crash'. If you are comparing to NT, then Linux SMP is great. Compared to the mature Unix variants, though, it doesn't look as good. Scaling is also critical. There is a lot of work to be done in the kernel and lib/cmd space to really implement SMP well. It takes lots of time & tuning to get all the locking down to appropriate granularity - on systems that very few linux users have even seen - 4, 8, 16 way, or larger. It also requires scheduling improvements, and processor affinity enhancements. These last are seeing some work now though. > and sophisticated. At the time the article was written, 2.2.x was not > autodeployed in commercial Linux distributions and now is, so perhaps > the author would remove this as an objection/obstacle. Still, there are > quotes that annoy me: > > "Clustering is another area in which Linux lags for mainstream > corporate needs. The Beowulf project hit the mainstream this spring by > matching world record holder, a Cray T3t-900-AC64, in the PovRay > benchmark test" ...
(run by IBM on a 17 Netfinity cluster running Red > Hat)..." Linux clusters have been popping up in education and aerospace > research facilities for some time now. A few production Linux clusters > even rate among the Top 500 most powerful computers in the world > (www.top500.org). Organizations seeking raw, high-end computational > power won't find a more cost-effective solution. >>But Linux clustering > is little more than academic. Web services. databases and general > high-availability services that would benefit from Linux clustering > haven't matured yet.<<" > > (>>emphasis mine<<). This is a curious remark, since earlier he > describes the widespread use of Linux in FTP and Web server farms (are > not "farms" "clusters"?). I do think that the remarks concerning No, they aren't. Business clusters are HA clusters. A master server dispatching http requests to a gaggle of other servers is not an HA cluster. I wouldn't even call it a cluster at all. If it is, then every proxy server (and everything it talks to) is a cluster :-) Beowulf is not HA - and business clusters are about HA (pretty much exclusively). It can be as simple as failover - but that failover has to be fast, transparent to clients, and reliable. > database "clusters" are apropos, but not Linux specific. The real > problem (as I understand it) is that "database cluster" technlogy itself > is fairly immature on any platform. Is this incorrect? Does NT or > Solaris support some sort of superior "clustering"? Also, what > "advanced options for high-availability" clustering are missing? Yes. Shared filesystems (and the underlying devices of course) are one key. Every node in an HA cluster should mount all the filesystems, and all should be able to read/write them (without major performance penalties - so NFS won't go - they all need to be on a shared SCSI or Fiber bus). 
If one node goes down, that node's filesystem locks and in progress transactions need to be released and rolled back within a few minutes at most, and other nodes take over its services transparently. fsck (which takes hours on a large fs of many gigs) is not sufficient. SGI's xfs is a step in that direction, but it needs layers above and below to make it complete. Other interesting things like rolling upgrades - you can upgrade each node on the cluster while the rest is still running and the services never go away. Decent cluster administration tools. Support for huge files and huge filesystems. There are more, but those are some major ones. Read the linux-ha mailing list for more info. > This ongoing education thread (which reaches back over many iterations) > is actually a very important one that both of you have contributed > tremendously to over many years -- clearly the author of the NWC article > recognizes a key element of the distinction, that beowulfs are powerful > parallel numerical engines while business "clusters" are more > amorphously defined and (whether or not the actual component tools exist They are HA clusters. I guess he thought it went without saying :-) > for Linux) are not being widely >>marketed<< as turnkey solutions at > this time. At this point it would have to be mostly marketing - there are no good solutions yet that I know of for real enterprise level HA linux clusters. There are a lot of missing pieces, and those pieces are complex, and expensive to test & tune. It takes millions of $ in equipment to run a big DB benchmark - and you have to be able to do it over and over and over to tune the hardware and kernel locking, and scheduling, and memory management, and ... - one bottleneck at a time. Not many people have both the motivation and resources to work on these sorts of problems (except for those in the business of selling those systems). 
And then there is testing all the HA features - more hardware and time and $ (but not as much since you don't need boatloads of disk). There are both free and proprietary solutions out there now in this space, but none are close to that level of capability yet. And if someone did manage to do a great job of tuning SMP and cluster performance, would the kernel changes (sure to be pervasive and major) ever be integrated into the root sources? Not unless Linus changes his story (workstation performance & simplicity/maintainability are paramount). > The last thing that would be very useful (that is apropos to the beowulf > list in particular) would be the integration of the capabilities of the > classic "beowulf" with those of the "business cluster", which may well > be a lot more amorphous or which may optimize completely different parts > of the information processing stream (like PVFS optimizes disk access, > or Web farms optimize availability). The most powerful and general > purpose cluster information entity that I can imagine would be one with > a "beowulf" component optimized for very fast parallel computation on > problems with a variety of grain sizes, a (journalized, 64 bit?) PVFS > component that provides very fast distributed access to a very large > file structure, a parallel network component that provides load-balanced > high-availability access to all of this data and compute power, and > probably several other "clustered" components I haven't thought of. > Such a Linux/COTS construct could serve as the core of a true enterprise > level compute facility -- instant and balanced access to both data and > processing power. See above - high performance and HA aren't mutually exclusive but the tradeoffs to do both would make it hard to do either well, much less both. But I'd love to see someone succeed. Maybe RedHat and VA Linux Systems will announce it tomorrow :-) Or IBM, or SGI...
Jim -- Jim Pendergraft (jim_pendergraft@dg.com) (919)248-6136 Data General, 62 Alexander Drive, Research Triangle Park, NC 27709 From brian@loki.chpc.utah.edu Tue, 8 Jun 1999 15:24:57 -0400 Date: Tue, 8 Jun 1999 15:24:57 -0400 From: Brian D. Haymore brian@loki.chpc.utah.edu Subject: Question about SMP within clusters You can, depending on the code, see a conflict from having both processes on the same, dual processor, hit either the memory bus or NIC at the same time and contend for use of that device. The trick is to know if your code is prone to do that or not. If it is there are some software tricks that you could do to reduce the effects of this. -- ================================================= Brian D. Haymore, Systems Administrator Center for High Performance Computing, U of Utah Email: brian@chpc.utah.edu, Phone: (801) 585-1755 ================================================= On Mon, 7 Jun 1999, Michael Stein wrote: > >A dual processor can be used if it is treated in a way like two seperate > >machines. i.e. you can fire off two seperate mpi threads to the same > >node. > > If the program uses a fixed IP port for each process then running two > on the same machine will result in a conflict (port in use)... > > How do you deal with that? [perhaps MPI doesn't have this problem?] > > From rgb@phy.duke.edu Tue, 8 Jun 1999 15:42:48 -0400 Date: Tue, 8 Jun 1999 15:42:48 -0400 From: Robert G. Brown rgb@phy.duke.edu Subject: Computer Science research done on Beowulf class systems On Tue, 8 Jun 1999, Jim Pendergraft wrote: > "Robert G. Brown" wrote: > > (text deleted, up to) > See above - high performance and HA aren't mutually exclusive but the > tradeoffs to do both would make it hard to do either well, much less > both. But I'd love to see someone succeed. Maybe RedHat and VA Linux > Systems will announce it tomorrow :-) Or IBM, or SGI... A most informative response. Thank you! rgb Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. 
of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu From dek@cgl.ucsf.edu Tue, 8 Jun 1999 18:19:02 -0400 Date: Tue, 8 Jun 1999 18:19:02 -0400 From: dek@cgl.ucsf.edu dek@cgl.ucsf.edu Subject: Question about SMP within clusters "Brian D. Haymore" writes: >You can, depending on the code, see a conflict from having both processes >on the same, dual processor, hit either the memory bus or NIC at the same >time and contend for use of that device. The trick is to know if your >code is prone to do that or not. If it is there are some software tricks >that you could do to reduce the effects of this. I've written a tool to watch this sort of stuff in realtime. I haven't worked on this program for a while. I'm releasing it to everybody to enjoy and possibly polish up. It's based on the PPro (and PII/Celery/PIII) performance counter patches. This lets you watch them in realtime (using a vt100 terminal such as xterm) in a textual plot. You can use this to very quickly determine the bottleneck of your program (microcode, data cache, instruction cache, memory, CPU, FPU, bus, and other resources) by running your program and this one at the same time (while no other active processes are running). Multiple processors are supported. If you have more than 2 processors then you need to increase the NR_CPUS variable. Dave /* * Linux performance counter viewer * * * Copyright (c) 1999 by David Konerding and the Regents of the * University of California * * Permission to use, copy, modify, and distribute this software and its * documentation for any purpose and without fee is hereby granted, * provided that the above copyright notice appear in all copies and that * both that copyright notice and this permission notice appear in * supporting documentation. * * This file is provided AS IS with no warranties of any kind.
The author * shall have no liability with respect to the infringement of copyrights, * trade secrets or any patents by this file or any part thereof. In no * event will the author be liable for any lost revenue or profits or * other special, indirect and consequential damages. * * * I compiled this with: * gcc -O6 -L/usr/local/src/perf -Wall -g -I/usr/local/src/perf -o Experiment Experiment.o -lperf * * This program has to be run as root. * * You need to install the PPro(or PII/PIII/Celeron) perf counter patches for * the 2.2.1 kernel * http://beowulf.gsfc.nasa.gov/software/perf-0.6.tar.gz * I used the directory /usr/local/src/perf to install the perf package * * You can comment out (or uncomment) counter types in the counterList struct below. * Having a large terminal (60 or more lines tall) is great because you can watch all the counters * */ /* standard headers needed by the calls below: printf/fprintf, malloc/exit, strlen, usleep, struct timeval */ #include <stdio.h> #include <stdlib.h> #include <string.h> #include <unistd.h> #include <sys/time.h> /* also include the header installed by the perf package, which supplies the PERF_* constants and the perf_sys_* calls */ /* macro to get the number of items in an array */ #define countof(x) (sizeof((x))/sizeof((*x))) /* define the number of CPUs to be the number in the system */ /* is determined at run-time from /proc/cpuinfo */ /* only defined here for static struct sizing */ #define NR_CPUS 2 /* define the total MHz of the chip for normalization */ /* is determined at run-time from /proc/cpuinfo */ /* #define TOTAL_MHZ (200.*1.e6) */ /* define the delay time during which a counter is sampled */ #define DELAY 1e4 /* define the number of measurements to perform before exiting */ #define MEASUREMENT_MAX 10000 /* define the "width" of the data bar */ #define SCREEN_WIDTH 40 /* basic structure to hold counters to be observed */ struct counter { char *name; int config; int num_counter; }; /* figure out how to use the INST_RETIRED to get percentage of insts which are FLOPS, microops/inst, insts (decoded/retired)/cycle, %fp divides, % resource stalls */ /* list of counters to monitor */ struct counter counterList[] = {
{"PERF_CPU_CLK_UNHALTED", PERF_CPU_CLK_UNHALTED, 0}, {"PERF_INST_RETIRED", PERF_INST_RETIRED, 0}, {"PERF_DATA_MEM_REFS", PERF_DATA_MEM_REFS, 0}, {"PERF_UOPS_RETIRED", PERF_UOPS_RETIRED, 0}, {"PERF_INST_DECODER", PERF_INST_DECODER, 0}, {"PERF_RESOURCE_STALLS", PERF_RESOURCE_STALLS, 0}, {"PERF_IFU_IFETCH", PERF_IFU_IFETCH, 0}, {"PERF_BUS_REQ_OUTSTANDING", PERF_BUS_REQ_OUTSTANDING, 0}, {"PERF_BUS_TRANS_IO", PERF_BUS_TRANS_IO | PERF_SELF, 0}, {"PERF_BUS_TRAN_MEM", PERF_BUS_TRAN_MEM | PERF_SELF, 0}, {"PERF_BUS_DATA_RCV", PERF_BUS_DATA_RCV, 0}, {"PERF_FLOPS", PERF_FLOPS, 0}, {"PERF_FP_COMP_OPS_EXE", PERF_FP_COMP_OPS_EXE, 0}, {"PERF_FP_ASSIST", PERF_FP_ASSIST, 1}, {"PERF_MUL", PERF_MUL, 1}, {"PERF_DIV", PERF_DIV, 1}, {"PERF_CYCLES_DIV_BUSY", PERF_CYCLES_DIV_BUSY, 0}, /* {"PERF_BR_INST_RETIRED", PERF_BR_INST_RETIRED, 0}, */ /* {"PERF_BR_TAKEN_RETIRED", PERF_BR_TAKEN_RETIRED, 0}, */ /* {"PERF_BR_MISS_PRED_RETIRED", PERF_BR_MISS_PRED_RETIRED, 0}, */ /* {"PERF_BR_MISS_PRED_TAKEN_RET", PERF_BR_MISS_PRED_TAKEN_RET, 0}, */ /* {"PERF_BR_INST_DECODED", PERF_BR_INST_DECODED, 0}, */ {"PERF_L2_LINES_IN", PERF_L2_LINES_IN, 0}, {"PERF_L2_LINES_OUT", PERF_L2_LINES_OUT, 0}, {"PERF_L2_DBUS_BUSY", PERF_L2_DBUS_BUSY, 0}, {"PERF_L2_DBUS_BUSY_RD", PERF_L2_DBUS_BUSY_RD, 0}, /* {"PERF_BR_BTB_MISSES", PERF_BR_BTB_MISSES, 0}, */ /* {"PERF_BR_BOGUS", PERF_BR_BOGUS, 0}, */ /* {"PERF_BACLEARS", PERF_BACLEARS, 0}, */ {"PERF_DCU_LINES_IN", PERF_DCU_LINES_IN, 0}, {"PERF_DCU_M_LINES_IN", PERF_DCU_M_LINES_IN, 0}, {"PERF_DCU_M_LINES_OUT", PERF_DCU_M_LINES_OUT, 0}, {"PERF_DCU_MISS_STANDING", PERF_DCU_MISS_STANDING, 0}, {"PERF_IFU_IFETCH_MISS", PERF_IFU_IFETCH_MISS, 0}, {"PERF_ITLB_MISS", PERF_ITLB_MISS, 0}, {"PERF_IFU_MEM_STALL", PERF_IFU_MEM_STALL, 0}, {"PERF_ILD_STALL", PERF_ILD_STALL, 0}, {"PERF_L2_IFETCH", PERF_L2_IFETCH | PERF_CACHE_ALL, 0}, {"PERF_L2_LD", PERF_L2_LD | PERF_CACHE_ALL, 0}, {"PERF_L2_ST", PERF_L2_ST | PERF_CACHE_ALL, 0}, {"PERF_L2_LINES_INM", PERF_L2_LINES_INM, 0}, 
{"PERF_L2_LINES_OUTM", PERF_L2_LINES_OUTM, 0}, {"PERF_L2_RQSTS", PERF_L2_RQSTS | PERF_CACHE_ALL, 0}, {"PERF_L2_ADS", PERF_L2_ADS, 0}, /* {"PERF_BUS_DRDY_CLOCKS", PERF_BUS_DRDY_CLOCKS | PERF_SELF, 0}, */ /* {"PERF_BUS_LOCK_CLOCKS", PERF_BUS_LOCK_CLOCKS | PERF_SELF, 0}, */ /* {"PERF_BUS_TRAN_BRD", PERF_BUS_TRAN_BRD | PERF_SELF, 0}, */ /* {"PERF_BUS_TRAN_RFO", PERF_BUS_TRAN_RFO | PERF_SELF, 0}, */ /* {"PERF_BUS_TRANS_WB", PERF_BUS_TRANS_WB | PERF_SELF, 0}, */ /* {"PERF_BUS_TRAN_IFETCH", PERF_BUS_TRAN_IFETCH | PERF_SELF, 0}, */ /* {"PERF_BUS_TRAN_INVAL", PERF_BUS_TRAN_INVAL | PERF_SELF, 0}, */ /* {"PERF_BUS_TRAN_PWR", PERF_BUS_TRAN_PWR | PERF_SELF, 0}, */ /* {"PERF_BUS_TRAN_P", PERF_BUS_TRAN_P | PERF_SELF, 0}, */ /* {"PERF_BUS_TRAN_DEF", PERF_BUS_TRAN_DEF | PERF_SELF, 0}, */ /* {"PERF_BUS_TRAN_BURST", PERF_BUS_TRAN_BURST | PERF_SELF, 0}, */ /* {"PERF_BUS_TRAN_ANY", PERF_BUS_TRAN_ANY | PERF_SELF, 0}, */ /* {"PERF_BUS_BNR_DRV", PERF_BUS_BNR_DRV, 0}, */ /* {"PERF_BUS_HIT_DRV", PERF_BUS_HIT_DRV, 0}, */ /* {"PERF_BUS_HITM_DRV", PERF_BUS_HITM_DRV, 0}, */ /* {"PERF_BUS_SNOOP_STALL", PERF_BUS_SNOOP_STALL, 0}, */ {"PERF_LD_BLOCK", PERF_LD_BLOCK, 0}, {"PERF_SB_DRAINS", PERF_SB_DRAINS, 0}, {"PERF_MISALIGN_MEM_REF", PERF_MISALIGN_MEM_REF, 0}, /* {"PERF_HW_INT_RX", PERF_HW_INT_RX, 0}, */ /* {"PERF_CYCLES_INST_MASKED", PERF_CYCLES_INST_MASKED, 0}, */ /* {"PERF_CYCLES_INT_PENDING_AND_MASKED", PERF_CYCLES_INT_PENDING_AND_MASKED, 0}, */ {"PERF_PARTIAL_RAT_STALLS", PERF_PARTIAL_RAT_STALLS, 0}, {"PERF_SEGMENT_REG_LOADS", PERF_SEGMENT_REG_LOADS, 0}, }; /* return the index of the entry whose config value matches, or -1 if none does */ int counterListLookupByNum(int count, struct counter *cl) { int len = countof(counterList); int i; for(i=0; i < len; i++) { if (count == cl[i].config) return i; } return -1; } /* return the index of the entry whose name matches exactly; exits if the name is not in the list */ int counterListLookupByName(char *name, struct counter *cl) { int len = countof(counterList); int i; for(i=0; i < len; i++) { if (strcmp(name, cl[i].name) == 0) return i; } /* report the name that was searched for; cl[i] is past the end of the list here */ fprintf(stderr, "LookupByName of '%s' not found\n", name);
exit(1); } /* simple data structure to hold measurements */ struct measurement { struct timeval tv[NR_CPUS]; unsigned long long tsc[NR_CPUS]; unsigned long long value[NR_CPUS]; }; struct cpus { int num_cpus; double *cpu_mhz; }; #define MAX_LINE_LEN 80 int readln(file, string) FILE *file; char *string; { int i, j; if (file == NULL) { fprintf(stderr, "readln error: fileptr is NULL!\n"); return 0; } for (i=0; inum_cpus=num_cpus; tmp_cpus->cpu_mhz = (double *)malloc(sizeof(double)*num_cpus); for(i=0; i < num_cpus; i++) { tmp_cpus->cpu_mhz[i] = tmp_mhz[i]; } return tmp_cpus; } inline unsigned long long TscCounter(void) { unsigned long high, low; __asm__ __volatile__(".byte 0x0f,0x31" /* can use rdtsc now / */ : "=a" (low), "=d" (high)); return ((unsigned long long) high << 32) + low; } /* convenience function for setting up the counter to monitor */ void SetupPerf(int counter, int config, int num_cpus) { int r, cpu; /* OS: count ring 0 */ /* USR: count ring 3 */ int real_config = config | PERF_OS | PERF_USR; for(cpu = 0; cpu < num_cpus; cpu++) { r = perf_sys_set_config(cpu, counter, real_config); if(r != 0) { perror("perf_sys_config proc=0 ctr=0"); exit(1); } } } /* convenience function for starting monitor */ void inline StartPerf(void) { int r; r = perf_sys_start(); if(r != 0) {perror("perf_sys_start"); exit(1);} } /* convenience function for stopping monitor */ void inline StopPerf(void) { int r; r = perf_sys_stop(); if(r != 0) {perror("perf_sys_stop");exit(1);} } /* convenience function for resetting counter */ void inline ResetPerf(void) { int r; r = perf_sys_reset(); if(r != 0) {perror("perf_sys_reset");exit(1);} } /* measure performance counters and stamp it with the "current" time */ int inline MeasurePerf(struct measurement *measure, int counter, int num_cpus) { int r, cpu; unsigned long long ct; for(cpu=0; cpu < num_cpus; cpu++) { /* r = gettimeofday(&measure->tv[cpu], (struct timezone *)NULL); */ /* if (r != 0) {perror("gettimeofday"); exit(1);} */ 
measure->tsc[cpu] = TscCounter(); r = perf_sys_read(cpu, counter, &ct); if(r != 0) {perror("perf_sys_read"); exit(1);} measure->value[cpu] = ct; } return 0; } #define getByNum(i,config) (i*2+counterListLookupByNum(config, counter)) #define getByName(i,name) (i*2+counterListLookupByName(name, counter)) /* Print the performance counter value as a data bar, normalized to the total # of clocks (note: not all the counters are meaningful when normalized to the total # of clocks!! */ void PrintAnalyzedPerf(struct cpus *cpu_data, struct counter *counter, struct measurement **measure1, struct measurement **measure2, int i, int len) { int cpu; unsigned long long dm, dtsc; double dt; double val; double t0, t1; unsigned long long m0, m1, tsc0, tsc1; int n, v; int N; int width; double total_mhz; unsigned long long dclk, dinst_ret, ddata_mem_refs, duops_retired, difu_ifetch, dbr_inst_ret, dinst_dec, dfops, dflops, ddiv_busy; /* move to the top of the screen */ printf("\33[H"); printf("%d cpus:\t", cpu_data->num_cpus); for (n=0; n < cpu_data->num_cpus; n++) { printf("%f\t", cpu_data->cpu_mhz[n]); } printf("\n"); /* analysis */ /* oops- currently not "correct"- we're dividing delta counter values by delta counter values- but over difference time intervals!! 
    */
#if 0
    for(cpu = 0; cpu < cpu_data->num_cpus; cpu++) {
        dclk = measure2[getByNum(i,PERF_CPU_CLK_UNHALTED)]->value[cpu]-
            measure1[getByNum(i,PERF_CPU_CLK_UNHALTED)]->value[cpu];
        dinst_ret = measure2[getByNum(i,PERF_INST_RETIRED)]->value[cpu]-
            measure1[getByNum(i,PERF_INST_RETIRED)]->value[cpu];
        ddata_mem_refs = measure2[getByNum(i,PERF_DATA_MEM_REFS)]->value[cpu]-
            measure1[getByNum(i,PERF_DATA_MEM_REFS)]->value[cpu];
        duops_retired = measure2[getByNum(i,PERF_UOPS_RETIRED)]->value[cpu]-
            measure1[getByNum(i,PERF_UOPS_RETIRED)]->value[cpu];
        dinst_dec = measure2[getByNum(i,PERF_INST_DECODER)]->value[cpu]-
            measure1[getByNum(i,PERF_INST_DECODER)]->value[cpu];
        difu_ifetch = measure2[getByNum(i,PERF_IFU_IFETCH)]->value[cpu]-
            measure1[getByNum(i,PERF_IFU_IFETCH)]->value[cpu];
        dflops = measure2[getByNum(i,PERF_FLOPS)]->value[cpu]-
            measure1[getByNum(i,PERF_FLOPS)]->value[cpu];
        dfops = measure2[getByNum(i,PERF_FP_COMP_OPS_EXE)]->value[cpu]-
            measure1[getByNum(i,PERF_FP_COMP_OPS_EXE)]->value[cpu];
        dbr_inst_ret = measure2[getByNum(i,PERF_BR_INST_RETIRED)]->value[cpu]-
            measure1[getByNum(i,PERF_BR_INST_RETIRED)]->value[cpu];
        ddiv_busy = measure2[getByNum(i,PERF_CYCLES_DIV_BUSY)]->value[cpu]-
            measure1[getByNum(i,PERF_CYCLES_DIV_BUSY)]->value[cpu];

        printf("(cpu %d) microops retired/inst retired: %10.3f\n",
               cpu,(double)duops_retired/dinst_ret);
        printf("(cpu %d) inst fetch/cycle: %10.3f\n",
               cpu,(double)difu_ifetch/dclk);
        printf("(cpu %d) inst decoded/cycle: %10.3f\n",
               cpu,(double)dinst_dec/dclk);
        printf("(cpu %d) inst retired/cycle: %10.3f\n",
               cpu,(double)dinst_ret/dclk);
        printf("(cpu %d) %% br retired: %10.3f\n",
               cpu,(double)dbr_inst_ret/dinst_ret);
        printf("(cpu %d) fops: %10.3f\n",
               cpu, (double)dfops);
        printf("(cpu %d) flops retired/inst retired: %10.3f\n",
               cpu,(double)dflops/dinst_ret);
        printf("(cpu %d) cycles divider busy/cycle: %10.3f\n",
               cpu,(double)ddiv_busy/dclk);
        printf("\n");
    }
#endif

    /* counter data bars */
    /* for(n=0;nnum_cpus; cpu++) {
            N = (i*2+v);
            /* get the counter
               values for the two measurements (old and new) */
            m0 = measure1[N]->value[cpu];
            m1 = measure2[N]->value[cpu];

            /* get the time interval between the two measurements
               (old and new) */
            t0 = measure1[N]->tv[cpu].tv_sec +
                (measure1[N]->tv[cpu].tv_usec/1.e6);
            t1 = measure2[N]->tv[cpu].tv_sec +
                (measure2[N]->tv[cpu].tv_usec/1.e6);

            /* time stamp counters occur every clock cycle
               (even if CPU is HLTed!) */
            tsc0 = measure1[N]->tsc[cpu];
            tsc1 = measure2[N]->tsc[cpu];
            /* tscs occur every 1/(TOTAL_MHZ*1e6) */

            /* get the difference in the time and values */
            dm = m1-m0;
            dt = t1-t0;
            dtsc = tsc1-tsc0;
            total_mhz = cpu_data->cpu_mhz[cpu];

            /* use the timestamp counter, not the time */
            width = (double)dm/dtsc * SCREEN_WIDTH;
            val = (double)dm/dtsc * total_mhz;
            /* printf("%f\n", (double)dm/dtsc*total_mhz); */

            /* then generate the "per seconds", normalized to TOTAL_MHZ */
            printf("%-30s(%d) |", counter[v].name, cpu);
            /* print millions of normalized (to total MHz) counts per sec*/
            printf("%7.3fM/s|", val);
            /* print normalized (to total MHz) counts per sec*/
            /* printf("%9.0f|",val*1e6); */
            /* print count */
            /* printf("%10lld|",measure2[N]->value[cpu]-
                      measure1[N]->value[cpu]); */
            for(n=0; n<(int)SCREEN_WIDTH; n++) (nnum_cpus);
            measure1 = (struct measurement *)malloc(sizeof(struct measurement));
            measure2 = (struct measurement *)malloc(sizeof(struct measurement));

            /* core measurement routine */
            StartPerf();
            MeasurePerf(measure1, counterList[v].num_counter, cpu_data->num_cpus);
            usleep(DELAY);
            MeasurePerf(measure2, counterList[v].num_counter, cpu_data->num_cpus);
            StopPerf();
            ResetPerf();

            /* store the measurements away to be printed later */
            n = (i*2+v);
            keep1[n] = measure1;
            keep2[n] = measure2;
        }
        PrintAnalyzedPerf(cpu_data, counterList, keep1, keep2, i, len);
    }
    ResetPerf();
    return 0;
}

From bcomisky@endgate.com Tue, 8 Jun 1999 18:46:38 -0400 Date: Tue, 8 Jun 1999 18:46:38 -0400 From: William Comisky bcomisky@endgate.com Subject: Hello List; I'm looking for HELP I would also like to see what kinds
of scripts people use to configure their client systems. Maybe we could make a repository at beowulf.underground with a link in the FAQ? Undoubtedly many configurations are very system specific, but it would be helpful to see what other people do and have something to start with to tailor to our own needs. Does anyone using the Redhat kickstart installation have a good example of a client configuration file? I've been debating whether or not to install locally or to boot from the server. Does anyone have any idea what kind of network traffic is generated by not having the OS installed locally? I would think that once what you need gets cached, it would be minimal. Booting from the server seems like the easiest to manage in the long run, though perhaps with some good scripts the local installation is relatively painless. Bill -- Bill Comisky bcomisky@endgate.com ---------- From: Andy.Hencke Sent: Tuesday, June 08, 1999 1:47 PM To: beowulf Cc: 'Chris Giem' Subject: Hello List; I'm looking for HELP Hello Beowulf folks, We are going to set up a Beowulf cluster here in Denver this summer, and I have a few initial questions. In the Installation Guide (http://www.beowulf-underground.org/doc_project/index.html) written by Jacek Radajewski and Douglas Eadline, the authors refer to a method of installing the clients: "The second method is the one I used in the first stage of our topcat system, that is installing the operating system on each client separately and then running a configuration script on the server which performs the rest of the setup." This is the way we would like to configure our clients, but the authors left those instructions out of the installation guide. Does anyone else have those instructions?????? Secondly, if there is anyone out there who would be willing to send us your email address for questions during setup, that would be greatly appreciated. Thirdly, we are debating running this installation from RedHat 5.2 or 6.0. 
Any thoughts about problems that might exist using the newer version of RedHat (and therefore newer version of Linux)????? Thanks, Andy Hencke University of Colorado, Denver From armadilo@daft.com Tue, 8 Jun 1999 19:22:25 -0400 Date: Tue, 8 Jun 1999 19:22:25 -0400 From: The Armadillo with the Mask armadilo@daft.com Subject: Computer Science research done on Beowulf class systems On Tue, 8 Jun 1999, Douglas Eadline wrote: > On Tue, 8 Jun 1999, Robert G. Brown wrote: > > > > > a) A lack of "robust SMP support". > > b) An "unpolished clustering technology". > > c) A lack of a "robust 64 bit journalized file system". ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ?? Hasn't this been taken care of in some part by SGI's recent XFS announcements? ---Steve From alan@lxorguk.ukuu.org.uk Tue, 8 Jun 1999 19:46:57 -0400 Date: Tue, 8 Jun 1999 19:46:57 -0400 From: Alan Cox alan@lxorguk.ukuu.org.uk Subject: Computer Science research done on Beowulf class systems > > > b) An "unpolished clustering technology". > > > c) A lack of a "robust 64 bit journalized file system". > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > ?? Hasn't this been taken care of in some part by SGI's recent XFS > announcements? Where do I find it on my Red Hat/Debian/Slackware CD. Thats the question that matters for such a review - fair ? From armadilo@daft.com Tue, 8 Jun 1999 19:50:20 -0400 Date: Tue, 8 Jun 1999 19:50:20 -0400 From: The Armadillo with the Mask armadilo@daft.com Subject: Computer Science research done on Beowulf class systems On Wed, 9 Jun 1999, Alan Cox wrote: > > > > b) An "unpolished clustering technology". > > > > c) A lack of a "robust 64 bit journalized file system". > > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > ?? Hasn't this been taken care of in some part by SGI's recent XFS > > announcements? > > Where do I find it on my Red Hat/Debian/Slackware CD. > > Thats the question that matters for such a review - fair ? Fair enough..... 
---Steve From root@itso.jccc.net Tue, 8 Jun 1999 20:25:15 -0400 Date: Tue, 8 Jun 1999 20:25:15 -0400 From: root root@itso.jccc.net Subject: Smallest Linux PC On Earth? Good Beowulf node? Keep me posted ... I'd be really interested in that - James jmontign@jccc.net From mwd@sgi.com Tue, 8 Jun 1999 21:42:29 -0400 Date: Tue, 8 Jun 1999 21:42:29 -0400 From: Mark Dalton mwd@sgi.com Subject: Computer Science research done on Beowulf class systems > > > > > b) An "unpolished clustering technology". > > > > c) A lack of a "robust 64 bit journalized file system". > > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > ?? Hasn't this been taken care of in some part by SGI's recent XFS > > announcements? > > Where do I find it on my Red Hat/Debian/Slackware CD. > > Thats the question that matters for such a review - fair ? > Exactly, or where can I download it from.. Actually the press release was the start of the project with SGI and Veritas. Because some people really wanted the Veritas software also. SGI/Veritas: http://biz.yahoo.com/bw/990520/ca_veritas_1.html SGI announcement: http://biz.yahoo.com/prnews/990517/ca_sgi_lin_2.html http://www.news.com/News/Item/0,4,84-36807,00.html?tt.yfin..txt.ni Perhaps some of the linux community will offer to help get this to the world faster also. Mark -- Mark Dalton CH3-S-CH2 H H O H Silicon Graphics, Inc. | | | \ | Eagan, MN 55121 CH2-C-COO //\ ---C--CH2-C-COO C-CH2-C-COO mwd@sgi.com | | || || | // | NH3 \\/ \ / CH NH3 O NH3 NH My home page: http://www.cbc.umn.edu/~mwd/mwd.html Cell Biology: http://www.cbc.umn.edu/~mwd/cell.html From kodym@mit.jyu.fi Wed, 9 Jun 1999 04:16:56 -0400 Date: Wed, 9 Jun 1999 04:16:56 -0400 From: Petr Ladislav Kodym kodym@mit.jyu.fi Subject: 2.2.9, system map Hi, >It's a fact that as soon as you recompile the kernel you have to replace >/boot/System.map with the one in /usr/src/linux. 
If you want to save some work check http://users.dhp.com/~whisper/buildkernel It's very nice script that has saved me a lot of typing during my last 20+ recompilations of 2.2.9 kernel :-(((. It handles things like copying System.map, updating lilo.conf and running lilo automatically. Petr From JesseP@europe.stortek.com Wed, 9 Jun 1999 04:23:58 -0400 Date: Wed, 9 Jun 1999 04:23:58 -0400 From: Jessen, Per JesseP@europe.stortek.com Subject: Computer Science research done on Beowulf class systems > -----Original Message----- > From: Robert G. Brown [mailto:rgb@phy.duke.edu] > Sent: 08 June 1999 17:06 [snip] > a) A lack of "robust SMP support". Define "robust" ? We have a couple of SMP machines running here, and are quite happy with it. Not sure what "robust" really means in this context. > b) An "unpolished clustering technology". > c) A lack of a "robust 64 bit journalized file system". > d) A lack of "advanced options for high availability" (see Greg's > remarks above). Yeah. Agreed. [snip] > "Moving a company's financial system onto an early beta of Oracle 8 > for Linux is a bad idea..." Probably - but what about IBMs DB2 ? It's been out in beta for a while (I got a CD in the mail a couple of weeks ago), and is now shipping with the TurboLinux distribution. (see http://www.ibm.com ) [snip] > My biggest bitch about the article is in its treatment of "robust SMP" > and clustering, the topic of this thread. Of course we all know that > SMP under linux is quite robust indeed and in 2.2.x becomes both robust > and sophisticated. At the time the article was written, 2.2.x was not > autodeployed in commercial Linux distributions and now is, so perhaps Ah, when was this article written ? SuSE Linux was shipped as "2.2.x-ready", and the latest version (the UK english 6.1) from about 4 weeks ago, came with the recommended option of installing with the 2.2.5 kernel. 
[rgb snipped, sorry - very pleasant reading] regards Per Jessen ENIDAN Technologies Ltd, London From down@ici.net Wed, 9 Jun 1999 05:02:22 -0400 Date: Wed, 9 Jun 1999 05:02:22 -0400 From: Nate Downes down@ici.net Subject: using busses for networking SOmeone once told me that you could use a bus to add multiple CPU's to a system. I've seen one case of someone having done it (an Amiga, which had no less than 5 CPU's in it's small case) and I was wondering if it would still be possible to do such a thing with modern day PCI backplanes? Nate Downes From lph@scali.com Wed, 9 Jun 1999 07:33:54 -0400 Date: Wed, 9 Jun 1999 07:33:54 -0400 From: L.P.Huse lph@scali.com Subject: Cables Hi' Even being Norwegian I am only distant related to trolls ,-) Aggregated application bandwidth of 1450 Mbyte/s (actual network traffic is somewhat higher) for the allreduce collective operation on a machine with bisection bandwidth of more than 4 Gbyte/s isn't unrealistic high - is it ? /lars paul At 10:48 PM 6/8/99 +0800, Paul Eduard Schenker wrote: >Dear friend in speed and Parallisator, your figures below successfully >paralized my mind - 1450 MB/s -- now that will let us all go ballistic. Is >the little troll left of your name involved in any way? > >Paul > >> >>Scalis initial MPI_Alreduce was based on the MPICH 1.0 implementation >>(linearly >>reduce + broadcast). The current implementation is based on the work of >>Rolf Rabenseifner and use binominal trees and overlap calculation and >>communication, improving performance on the 96 node cluster from 73 MB/s >>to 1450 MB/s for 64k buffers (MPI_SUM with MPI_DOUBLE). >>Feel free to contact us for another try ! >> >>/Lars Paul >> >> \\_// Lars Paul Huse; Parallisator & Doctor Scientarum Student >> (o-o) mailto:lph@scali.no http://www.ifi.uio.no/~larspaul >>---oOOO-(_)-OOOo----------------------------------------------------- >>* .oooO Institutt for Informatikk - UiO (rom 3343) PO Box 1080, >>* ( ) Oooo. 
N-0316 OSLO Voice +47 22 85 24 34 Fax +47 22 85 24 01 >>----\ (----( )------------------------------------------------------ >> \_) ) / Scali AS, Hvamstubben 17, n-2013 Skjetten. >> (_/ Voice +47 63 84 67 04 Fax +47 63 84 59 22 >> >> >Paul Eduard Schenker Phone: +65 - 476 2245 >1 Peirce Hill Fax: +65 - 472 6480 >Singapore 248558 email: pesch@ibm.net From rriendeau@net-quotient.com Wed, 9 Jun 1999 12:37:43 -0400 Date: Wed, 9 Jun 1999 12:37:43 -0400 From: Richard Riendeau rriendeau@net-quotient.com Subject: Question about Channel Bonding... I have a quick question about channel bonding... I have a system ( Message Based QUERY/RESPONSE scenario ) where the message sizes for the query are much smaller than the RESPONSE. In this ansymetric scenario- my back route's bandwidth fills up much faster than my incoming ( They are on two different network seqments ) If I added more NIC cards to the back route and channel bonded them ( Adding more network segments to increase bandwidth )... (1) Would my receiving clients (The machines getting the responses) need any special configuration to "Hear" a message sent from a channel bonded node. Here is a small diagram in ASCII : CLIENTS REQUEST BROKER WORKER NODE /----------------------------- NODE A -------------------------- NODE B NODE C ---\ /---- 192.0.3.0 NODE D ----\_____ CLOUD OF VARIOUS ROUTES_____/------ CHANNEL BONDED TO 192.0.3.0 NODE E ----/ ASSUME INFINITE BANDWIDTH \------ CHANNEL BONDED TO 192.0.3.0 NODE F ---/ \----- CHANNEL BONDED TO 192.0.3.0 So as I understand today- this would allow for NODE B (NODE B is the focus of this question) to receive n Mb/s of requests and respond with approximately 4n Mb/s of responses. Is this accurate? -Rich NetQuotient Consulting Group From ronelson@vt.edu Wed, 9 Jun 1999 17:48:53 -0400 Date: Wed, 9 Jun 1999 17:48:53 -0400 From: Rob Nelson ronelson@vt.edu Subject: Computer Science research done on Beowulf class systems > Probably - but what about IBMs DB2 ? 
It's been out in beta for a while > (I got a CD in the mail a couple of weeks ago), and is now shipping with > the TurboLinux distribution. > (see http://www.ibm.com ) TurboLinux ships with DB2 configured in its upper end package, right? I see an ad in the June LJ for a variety of TurboLinux setups, but they include Oracle 8 when shipping with SQL...seems like an old ad. >> My biggest bitch about the article is in its treatment of "robust SMP" >> and clustering, the topic of this thread. Of course we all know that >> SMP under linux is quite robust indeed and in 2.2.x becomes both robust I think it was also being compared relative to how other systems perform SMP-wise. How does Linux compare to Solaris on a SPARC system with 16 processors? Rob Nelson ronelson@vt.edu From bcomisky@endgate.com Wed, 9 Jun 1999 18:07:39 -0400 Date: Wed, 9 Jun 1999 18:07:39 -0400 From: William Comisky bcomisky@endgate.com Subject: Hello List; I'm looking for HELP I would also like to see what kinds of scripts people use to configure their client systems. Maybe we could make a repository at beowulf-underground with a link in the FAQ? Undoubtedly many configurations are very system specific, but it would be helpful to see what other people do and have something to start with to tailor to our own needs. Does anyone using the Redhat kickstart installation have a good example of a client configuration file? I've been debating whether or not to install locally or to boot from the server. Does anyone have any idea what kind of network traffic is generated by not having the OS installed locally? I would think that once what you need gets cached, it would be minimal. Booting from the server seems like the easiest to manage in the long run, though perhaps with some good scripts the local installation (or cloning, etc.) is relatively painless. 
Bill -- Bill Comisky bcomisky@endgate.com ---------- From: Andy.Hencke Sent: Tuesday, June 08, 1999 1:47 PM To: beowulf Cc: 'Chris Giem' Subject: Hello List; I'm looking for HELP Hello Beowulf folks, We are going to set up a Beowulf cluster here in Denver this summer, and I have a few initial questions. In the Installation Guide (http://www.beowulf-underground.org/doc_project/index.html) written by Jacek Radajewski and Douglas Eadline, the authors refer to a method of installing the clients: "The second method is the one I used in the first stage of our topcat system, that is installing the operating system on each client separately and then running a configuration script on the server which performs the rest of the setup." This is the way we would like to configure our clients, but the authors left those instructions out of the installation guide. Does anyone else have those instructions?????? Secondly, if there is anyone out there who would be willing to send us your email address for questions during setup, that would be greatly appreciated. Thirdly, we are debating running this installation from RedHat 5.2 or 6.0. Any thoughts about problems that might exist using the newer version of RedHat (and therefore newer version of Linux)????? Thanks, Andy Hencke University of Colorado, Denver From admin@cersa.admu.edu.ph Wed, 9 Jun 1999 19:25:58 -0400 Date: Wed, 9 Jun 1999 19:25:58 -0400 From: William Emmanuel S. Yu admin@cersa.admu.edu.ph Subject: Computer Science research done on Beowulf class systems On Tue, 8 Jun 1999, The Armadillo with the Mask wrote: > > > On Wed, 9 Jun 1999, Alan Cox wrote: > > > > > > b) An "unpolished clustering technology". > > > > > c) A lack of a "robust 64 bit journalized file system". > > > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > > ?? Hasn't this been taken care of in some part by SGI's recent XFS > > > announcements? > > > > Where do I find it on my Red Hat/Debian/Slackware CD. 
> > > > Thats the question that matters for such a review - fair ? > > Fair enough..... hey. where can i find a 64bit journalized file system for linux? did sgi release their xfs modifications already? william.s.yu@ieee.org From khalid@ctms.com.my Wed, 9 Jun 1999 21:35:14 -0400 Date: Wed, 9 Jun 1999 21:35:14 -0400 From: Khalid Amanullah khalid@ctms.com.my Subject: Parallel OpenGL Hi! Is there a Parallel OpenGL library or anybody working to parallelize it? Regards. Khalid. From markgw@sandpit.melbourne.sgi.com Wed, 9 Jun 1999 23:48:19 -0400 Date: Wed, 9 Jun 1999 23:48:19 -0400 From: Mark Goodwin markgw@sandpit.melbourne.sgi.com Subject: Computer Science research done on Beowulf class systems On Jun 9, 17:49, William Emmanuel S. Yu wrote: > Subject: Re: Computer Science research done on Beowulf class systems > > > hey. where can i find a 64bit journalized file system for linux? did sgi > release their xfs modifications already? > We're working on it .. expect to see something around August timeframe. The delay is due to our legal requirement for an encumberance (sp?) review of the XFS code before we can release it to the linux community as open source. Mark Goodwin SGI Engineering From shahin@labf.org Thu, 10 Jun 1999 01:00:04 -0400 Date: Thu, 10 Jun 1999 01:00:04 -0400 From: Mofeed Shahin shahin@labf.org Subject: Parallel OpenGL G'day all, Try looking at this : http://www.lri.fr/~alex/PMesa/ Cheers Mof. On Wed, 9 Jun 1999, Khalid Amanullah wrote: > Hi! > Is there a Parallel OpenGL library or anybody working to parallelize it? > > Regards. > Khalid. > From philip_juels@harvard.edu Thu, 10 Jun 1999 09:41:30 -0400 Date: Thu, 10 Jun 1999 09:41:30 -0400 From: Philip Juels philip_juels@harvard.edu Subject: User logins I don't know if this has been talked about. but how do you all administer login accounts in your cluster? Identical but independent login accounts on each node? NIS? With large clusters, administering user login accounts must be a nightmare. 
Philip Juels philip_juels@harvard.edu From rgb@phy.duke.edu Thu, 10 Jun 1999 10:53:02 -0400 Date: Thu, 10 Jun 1999 10:53:02 -0400 From: Robert G. Brown rgb@phy.duke.edu Subject: User logins On Wed, 9 Jun 1999, Philip Juels wrote: > I don't know if this has been talked about. but how do you all > administer login accounts in your cluster? Identical but independent > login accounts on each node? NIS? With large clusters, administering > user login accounts must be a nightmare. Since we have a "cluster", rather than a "beowulf" per se, we use NIS, but one could also use e.g. rdist to keep things synchronized. That is, even on a large cluster (or large LAN) there are a number of tools and approaches designed to make adminstration of accounts scale decently. NIS is expensive and a bit clunky, but login/authentication is a one-shot serial expense and irrelevant compared to parallel runtime a long calculation. If your cluster is a "true beowulf" with a head/gateway/firewall node, there are other solutions, e.g. -- logging into just the head node (which requires NIS or external accounts to be set up) and then su-ing to a predefined account defined on all the nodes with no password required and little or no authentication internally. Jobs are then run using this account. This is "convenient" in some ways because this one account would be the one which owns pvmd, for example, which can simplify the management of pvmd and its associated locks. It does presuppose both a lot of trust between users of the 'wulf and a certain homogeneity of purpose -- if lots of users and groups use it, you will probably want either several of these accounts or to use NIS or rdist to propagate your usual accounts for logging and accountability purposes. rgb Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 
27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu From morrone@capsl.udel.edu Thu, 10 Jun 1999 12:07:07 -0400 Date: Thu, 10 Jun 1999 12:07:07 -0400 From: Christopher J. Morrone morrone@capsl.udel.edu Subject: User logins On Wed, 9 Jun 1999, Philip Juels wrote: > I don't know if this has been talked about. but how do you all > administer login accounts in your cluster? Identical but independent > login accounts on each node? NIS? With large clusters, administering > user login accounts must be a nightmare. I just rdist the shadow, passwd, and group files to the slave nodes. NIS proved to be too costly in the cluster. Its not really bad to manage the accounts. I just use the normal user administration tools on the head node (adduser, deluser, chsh, etc.) and then run an rdist to distribute all of the passwords. From joelja@darkwing.uoregon.edu Thu, 10 Jun 1999 12:35:23 -0400 Date: Thu, 10 Jun 1999 12:35:23 -0400 From: Joel Jaeggli joelja@darkwing.uoregon.edu Subject: User logins We use rdist for password, group, host, files and other housekeeping tasks... joelja On Wed, 9 Jun 1999, Philip Juels wrote: > I don't know if this has been talked about. but how do you all > administer login accounts in your cluster? Identical but independent > login accounts on each node? NIS? With large clusters, administering > user login accounts must be a nightmare. > > Philip Juels > philip_juels@harvard.edu > -------------------------------------------------------------------------- Joel Jaeggli joelja@darkwing.uoregon.edu Academic User Services consult@gladstone.uoregon.edu PGP Key Fingerprint: 1DE9 8FCA 51FB 4195 B42A 9C32 A30D 121E -------------------------------------------------------------------------- It is clear that the arm of criticism cannot replace the criticism of arms. Karl Marx -- Introduction to the critique of Hegel's Philosophy of the right, 1843. 
From mariog@matna2.dma.unina.it Thu, 10 Jun 1999 12:36:42 -0400 Date: Thu, 10 Jun 1999 12:36:42 -0400 From: Dr. Mario Rosario Guarracino mariog@matna2.dma.unina.it Subject: Parallel OpenGL A parallel version (distributed memory) is at http://www.cs.sandia.gov/VIS/pmesa.html Mario Guarracino On Thu, 10 Jun 1999, Mofeed Shahin wrote: > > G'day all, > > Try looking at this : > http://www.lri.fr/~alex/PMesa/ > > Cheers Mof. > > On Wed, 9 Jun 1999, Khalid Amanullah wrote: > > > Hi! > > Is there a Parallel OpenGL library or anybody working to parallelize it? > > > > Regards. > > Khalid. > > > > From dhart@indiana.edu Thu, 10 Jun 1999 12:58:51 -0400 Date: Thu, 10 Jun 1999 12:58:51 -0400 From: Dave Hart dhart@indiana.edu Subject: configuring MPICH for shared and distributed memory I'm setting up a cluster of dual P2's, and now trying to get SMP compiling to work. [The install doc says shared comm is not supported on LINUX, but some folk have reported success, and there are internal reasons to push for this]. I'm using RH5.2, 2.2.7 kernel, Portland Group F77/F90 compilers v3.0-4 and gcc 2.7.2.3. I was using pentium group's pgcc-2.91.66, based on egcs, but I couldn't get mpich-1.1.2 to compile with -comm=shared; same with Portland's pgcc. Now -comm=shared works, but I ran the NAS benchmarks with it, and the statistics are _much_ worse. I did conservative compiles both before and after [also aggressive ones, before]. Here's the command I used to configure mpich, the other compiles were similar:

configure -device=ch_p4 -comm=shared -file_system=nfs+ufs \
    -mpe -nodevdebug -rsh=ssh \
    -cc=gcc -c++=g++ -cflags="-O -m486" \
    -fc=pgf77 -f90=pgf90 \
    -fflags="-tp p6 -Msignextend -fast -L/usr/mpich/build/LINUX/ch_p4/lib/ -lmpich" \
    -f90flags="-tp p6 -L/usr/mpich/build/LINUX/ch_p4/lib/ -lmpich"

Also, I'm getting a huge [huger?] number of mpi failures due to connection timeouts [the job just hangs, ps aux shows them as zombies]. Anybody got a spare clue?
- Dave -- David Hart http://php.indiana.edu/~dhart Research Computing Support 812-855-2632 University Information Technology Services Indiana University From armadilo@daft.com Thu, 10 Jun 1999 13:14:45 -0400 Date: Thu, 10 Jun 1999 13:14:45 -0400 From: The Armadillo with the Mask armadilo@daft.com Subject: Computer Science research done on Beowulf class systems On Sun, 9 Jun 1996, William Emmanuel S. Yu wrote(with the help of his time machine): > [...] > hey. where can i find a 64bit journalized file system for linux? did sgi > release their xfs modifications already? Not officially released yet, although my Linux-friendly contacts inside SGI tell me things are going quicker than expected. A From morrone@capsl.udel.edu Thu, 10 Jun 1999 14:29:57 -0400 Date: Thu, 10 Jun 1999 14:29:57 -0400 From: Christopher J. Morrone morrone@capsl.udel.edu Subject: User logins On Thu, 10 Jun 1999, Robert G. Brown wrote: > On Wed, 9 Jun 1999, Philip Juels wrote: > > > I don't know if this has been talked about. but how do you all > > administer login accounts in your cluster? Identical but independent > > login accounts on each node? NIS? With large clusters, administering > > user login accounts must be a nightmare. > > Since we have a "cluster", rather than a "beowulf" per se, we use NIS, > but one could also use e.g. rdist to keep things synchronized. That is, > even on a large cluster (or large LAN) there are a number of tools and > approaches designed to make adminstration of accounts scale decently. > NIS is expensive and a bit clunky, but login/authentication is a > one-shot serial expense and irrelevant compared to parallel runtime a > long calculation. Thats not necessarily true. On our cluster, the slave nodes were doing NIS lookups all though the computation, and average CPU usage on the head node was %80...probably related to the NFS usage. But maybe the slaves can be set to cache the NIS info? 
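
[Editorial sketch, not part of the original thread.] On the question of whether slave nodes can cache NIS information: system-wide caching is usually the job of a name service cache daemon (glibc ships one called nscd, where it is installed). To illustrate the idea only, the C fragment below memoizes getpwuid() results inside a single long-running process, so that repeated lookups of the same uid are answered locally instead of going back to the name service. The names cached_user_name, CACHE_SLOTS, NAME_LEN and backend_lookups are hypothetical and chosen for this example; nobody in the thread describes doing this.

```c
#include <pwd.h>
#include <string.h>
#include <sys/types.h>

#define CACHE_SLOTS 64   /* hypothetical size; one slot per uid bucket */
#define NAME_LEN    64   /* hypothetical limit on stored login names  */

struct uid_entry {
    int   used;              /* non-zero once the slot holds a result */
    uid_t uid;               /* uid the slot was filled for           */
    char  name[NAME_LEN];    /* private copy of the login name        */
};

static struct uid_entry cache[CACHE_SLOTS];

/* Counts calls that actually reached getpwuid(), i.e. the name service
   (local files, NIS, ...). Exposed only so the effect can be observed. */
unsigned long backend_lookups = 0;

/* Return the login name for uid, or NULL if the lookup fails.
   Successful lookups are remembered; a later call for the same uid is
   served from the table without touching the name service. A different
   uid that maps to the same slot simply overwrites the old entry. */
const char *cached_user_name(uid_t uid)
{
    struct uid_entry *e = &cache[(unsigned long)uid % CACHE_SLOTS];
    struct passwd *pw;

    if (e->used && e->uid == uid)
        return e->name;              /* cache hit: no backend traffic */

    backend_lookups++;
    pw = getpwuid(uid);              /* may go out to NIS */
    if (pw == NULL)
        return NULL;                 /* failures are not cached */

    e->used = 1;
    e->uid = uid;
    strncpy(e->name, pw->pw_name, NAME_LEN - 1);
    e->name[NAME_LEN - 1] = '\0';
    return e->name;
}
```

A compute process would call cached_user_name(getuid()) wherever it would otherwise call getpwuid() directly. Entries never expire in this sketch, so an account change on the NIS server is not seen until the process restarts; that trade-off is one reason a system-wide cache with timeouts is normally preferred.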
From aclose72@yahoo.com Thu, 10 Jun 1999 14:41:05 -0400 Date: Thu, 10 Jun 1999 14:41:05 -0400 From: andrew close aclose72@yahoo.com Subject: Computer Science research done on Beowulf class systems i apologize if this is an extremely ignorant question, but since there has been a lot of talk about it lately... what is a journaling file system, and why is it so sought after? thanks andy _________________________________________________________ Do You Yahoo!? Get your free @yahoo.com address at http://mail.yahoo.com From uthayopa@mcs.anl.gov Thu, 10 Jun 1999 14:50:11 -0400 Date: Thu, 10 Jun 1999 14:50:11 -0400 From: Putchong Uthayopas uthayopa@mcs.anl.gov Subject: User logins Hi, My researcher wrote our own script. If you want it , we can give it to you. Our group strongly support opensource movement. Putchong. On Thu, 10 Jun 1999, Joel Jaeggli wrote: > We use rdist for password, group, host, files and other housekeeping > tasks... > > joelja > > On Wed, 9 Jun 1999, Philip Juels wrote: > > > I don't know if this has been talked about. but how do you all > > administer login accounts in your cluster? Identical but independent > > login accounts on each node? NIS? With large clusters, administering > > user login accounts must be a nightmare. > > > > Philip Juels > > philip_juels@harvard.edu > > > > -------------------------------------------------------------------------- > Joel Jaeggli joelja@darkwing.uoregon.edu > Academic User Services consult@gladstone.uoregon.edu > PGP Key Fingerprint: 1DE9 8FCA 51FB 4195 B42A 9C32 A30D 121E > -------------------------------------------------------------------------- > It is clear that the arm of criticism cannot replace the criticism of > arms. Karl Marx -- Introduction to the critique of Hegel's Philosophy of > the right, 1843. 
> > > From spiffy@tamu.edu Thu, 10 Jun 1999 15:05:17 -0400 Date: Thu, 10 Jun 1999 15:05:17 -0400 From: Scott Patrick Faasse spiffy@tamu.edu Subject: channel bonding I have heard the phrase "channel bonding" in my reading of materials about beowulf class systems. right now i am piddling around and have 5 p5-75's clustered on a 10Mbit ethernet LAN. now i have some more ne2000 cards laying around and another hub. can i channel bond with that. and how exaclty do you go about "bonding" your cards together? -spiffy ---------------------------------------------------- Scott "spiffy" Faasse web: temporarily unavailable email: spiffy@tamu.edu ---------------------------------------------------- From bryan@cog-tech.com Thu, 10 Jun 1999 15:24:32 -0400 Date: Thu, 10 Jun 1999 15:24:32 -0400 From: Bryan Thompson bryan@cog-tech.com Subject: User logins On Wed, 9 Jun 1999, Philip Juels wrote: > I don't know if this has been talked about. but how do you all > administer login accounts in your cluster? Identical but independent > login accounts on each node? NIS? With large clusters, administering > user login accounts must be a nightmare. Since we have a "cluster", rather than a "beowulf" per se, we use NIS, but one could also use e.g. rdist to keep things synchronized. That is, even on a large cluster (or large LAN) there are a number of tools and approaches designed to make adminstration of accounts scale decently. NIS is expensive and a bit clunky, but login/authentication is a one-shot serial expense and irrelevant compared to parallel runtime a long calculation. If your cluster is a "true beowulf" with a head/gateway/firewall node, there are other solutions, e.g. -- logging into just the head node (which requires NIS or external accounts to be set up) and then su-ing to a predefined account defined on all the nodes with no password required and little or no authentication internally. Jobs are then run using this account. 
This is "convenient" in some ways because this one account would be the one which owns pvmd, for example, which can simplify the management of pvmd and its associated locks. It does presuppose both a lot of trust between users of the 'wulf and a certain homogeneity of purpose -- if lots of users and groups use it, you will probably want either several of these accounts or to use NIS or rdist to propagate your usual accounts for logging and accountability purposes. rgb Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu From Mark.Butler-contractor@jntf.osd.mil Thu, 10 Jun 1999 16:06:48 -0400 Date: Thu, 10 Jun 1999 16:06:48 -0400 From: Butler, Mark, CTR Mark.Butler-contractor@jntf.osd.mil Subject: Journaling File Systems I second this request (question). What's a journaling file system and what's all the fuss about? > -----Original Message----- > From: andrew close [SMTP:aclose72@yahoo.com] > Sent: Thursday, June 10, 1999 12:41 PM > To: beowulf@beowulf.gsfc.nasa.gov; extreme-linux@acl.lanl.gov > Subject: Re: Computer Science research done on Beowulf class systems > > i apologize if this is an extremely ignorant question, > but since there has been a lot of talk about it > lately... > > what is a journaling file system, and why is it so > sought after? > > thanks > > andy > > > _________________________________________________________ > Do You Yahoo!? > Get your free @yahoo.com address at http://mail.yahoo.com From rgb@phy.duke.edu Thu, 10 Jun 1999 16:22:51 -0400 Date: Thu, 10 Jun 1999 16:22:51 -0400 From: Robert G. Brown rgb@phy.duke.edu Subject: User logins On Thu, 10 Jun 1999, Christopher J. Morrone wrote: > Thats not necessarily true. On our cluster, the slave nodes were doing > NIS lookups all though the computation, and average CPU usage on > the head node was %80...probably related to the NFS > usage. 
But maybe the slaves can be set to cache the NIS info? I think that this depends on what kind of job you are running. Anything that is spawning a lot of node connections or opening and closing a lot of communication channels will require a lot of authentication or host lookups. Something that spawns slave tasks once and maintains open communication channels only requires them once, at the beginning. I wouldn't argue that the rdist strategy is likely to be the more efficient of the two and that sometimes it will matter, but it is probably very slightly more of a hassle to set up also. If one is already running NIS and your parallel job strategies aren't NIS-use intensive then it may make more sense to just go ahead and use it. rgb Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu From rgb@phy.duke.edu Thu, 10 Jun 1999 16:28:21 -0400 Date: Thu, 10 Jun 1999 16:28:21 -0400 From: Robert G. Brown rgb@phy.duke.edu Subject: Computer Science research done on Beowulf class systems On Thu, 10 Jun 1999, andrew close wrote: > i apologize if this is an extremely ignorant question, > but since there has been a lot of talk about it > lately... > > what is a journaling file system, and why is it so > sought after? I'm not absolutely certain, but I believe that it is a filesystem that (sort of, I'm not actually sure how they accomplish the journaling) only stores and indexes diffs; nothing is ever actually deleted and all changes are always reversible. They are very useful to companies that require a full, auditible trail of all informational transactions. They are thus like CVS or RCS, sort of, but for a whole filesystem. Obviously very space inefficient but just the ticket for certain kinds of applications and operations. rgb Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 
27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu From morrone@capsl.udel.edu Thu, 10 Jun 1999 16:43:31 -0400 Date: Thu, 10 Jun 1999 16:43:31 -0400 From: Christopher J. Morrone morrone@capsl.udel.edu Subject: User logins On Thu, 10 Jun 1999, Robert G. Brown wrote: > On Thu, 10 Jun 1999, Christopher J. Morrone wrote: > > > Thats not necessarily true. On our cluster, the slave nodes were doing > > NIS lookups all though the computation, and average CPU usage on > > the head node was %80...probably related to the NFS > > usage. But maybe the slaves can be set to cache the NIS info? > > I think that this depends on what kind of job you are running. Anything > that is spawning a lot of node connections or opening and closing a lot > of communication channels will require a lot of authentication or host > lookups. Something that spawns slave tasks once and maintains open > communication channels only requires them once, at the beginning. Like I said, that is not strictly true. Our software opens a file and makes TCP connections to the other nodes only once at the beginning of the run. Yet NIS requests happen all through execution. I think that it's making NIS requests almost every time file I/O is performed (on ALREADY open files and sockets). It would be nice if NIS performed as you described, but for us, that was not the case. From tim@santafe.edu Thu, 10 Jun 1999 16:50:46 -0400 Date: Thu, 10 Jun 1999 16:50:46 -0400 From: Tim Carlson tim@santafe.edu Subject: Journaling File Systems On Thu, 10 Jun 1999, Butler, Mark, CTR wrote: > I second this request (question). What's a journaling file system and > what's all the fuss about? I wasn't going to reply to the list, but after two requests I figured why not.
http://www.veritas.com/library/ai/whit-00002.html From ulairi@ecs.csun.edu Thu, 10 Jun 1999 17:14:21 -0400 Date: Thu, 10 Jun 1999 17:14:21 -0400 From: Ulairi ulairi@ecs.csun.edu Subject: User logins How about only a single machine (front end) that allows direct access to it, then all the jobs are spooled via rsh? Then you only maintain the accounts on the frontend, run ssh (kill telnet to it) and the internal cluster does not talk to the world at all (many ways of doing that) From dominique.chabord@bluedjinn.com Thu, 10 Jun 1999 17:15:12 -0400 Date: Thu, 10 Jun 1999 17:15:12 -0400 From: Dominique Chabord dominique.chabord@bluedjinn.com Subject: Computer Science research done on Beowulf class systems Hi Andrew, My focus is high availability and reducing downtime and data loss after a crash. A journalled file system is supposed to restart faster after a crash because it doesn't need to perform the llllooooonnnnnggggg fsck that never ends on vanilla file systems. There are also good features which can be built on such new technology, but my interest is fast restart first data integrity. regards Dominique -----Original Message----- From: andrew close To: beowulf@beowulf.gsfc.nasa.gov ; extreme-linux@acl.lanl.gov Date: Thursday, 10 June 1999 22:34 Subject: Re: Computer Science research done on Beowulf class systems >i apologize if this is an extremely ignorant question, >but since there has been a lot of talk about it >lately... > >what is a journaling file system, and why is it so >sought after? > >thanks > >andy > > From janl@linpro.no Thu, 10 Jun 1999 19:28:33 -0400 Date: Thu, 10 Jun 1999 19:28:33 -0400 From: janl@linpro.no janl@linpro.no Subject: User logins "Christopher J. Morrone" wrote: > On Thu, 10 Jun 1999, Robert G. Brown wrote: > > > On Thu, 10 Jun 1999, Christopher J. Morrone wrote: > > > > > Thats not necessarily true.
On our cluster, the slave nodes were doing > > > NIS lookups all though the computation, and average CPU usage on > > > the head node was %80...probably related to the NFS > > > usage. But maybe the slaves can be set to cache the NIS info? Libc 6.1, which comes with Red Hat 6.0 among others, has a NIS/name cache daemon to replace ypbind, which should help a lot. > > I think that this depends on what kind of job you are running. Anything > > that is spawning a lot of node connections or opening and closing a lot > > of communication channels will require a lot of authentication or host > > lookups. Something that spawns slave tasks once and maintains open > > communication channels only requires them once, at the beginning. > > Like I said, that is not strictly true. Our software opens a file makes > TCP connections to the other nodes only once at the beginning of the run. > Yet NIS requests happen all through execution. I think that its making > NIS requests almost every time file I/O is performed (on ALREADY open > files and sockets). NIS requests will normally only be made when opening rsh/rlogin/ssh connections or when logging in in any other way --- all of which require user authentication, host lookups and similar things. NIS requests will normally _not_ be made whenever you write to a file, read from a file, open or close a file, socket or pipe, or manipulate any other IPC mechanism. NIS requests can be made when you use the getserv*, getproto*, getpw*, gethost* and similar library calls, which require lookups in some kind of table or database. These tables and databases are enumerated in /etc/nsswitch.conf and you can disable NIS for specific kinds of queries and enable it for others. The kernel itself does all its checking of permissions based on UIDs and GIDs; for the most part, this does not require NIS lookups. That being said, the NIS code in libc5 is not fast in any way, and libc6 is even worse.
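[Editor's sketch, not part of the original post.] The /etc/nsswitch.conf point above can be illustrated with a few lines of Python that read nsswitch.conf-style text and list which databases would be allowed to go to NIS. The sample file contents and the function names parse_nsswitch and nis_databases are assumptions made up for this illustration; a real node's file will differ, and this only inspects the configuration, it does not trace what libc actually does.

```python
# Sketch only: parse nsswitch.conf-style text and report which databases
# list NIS as a lookup source. SAMPLE_CONF is an invented example, not
# taken from any real cluster node.

SAMPLE_CONF = """\
# /etc/nsswitch.conf (hypothetical example)
passwd:     files nis
group:      files nis
hosts:      files dns
services:   files
protocols:  files
"""


def parse_nsswitch(text):
    """Return {database: [sources]} for each 'database: source ...' line."""
    table = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line or ":" not in line:
            continue
        database, sources = line.split(":", 1)
        # Action items such as [NOTFOUND=return] always contain '=';
        # skip them and keep only the source names.
        table[database.strip()] = [s for s in sources.split() if "=" not in s]
    return table


def nis_databases(table):
    """Names of databases whose lookups may be sent to NIS or NIS+."""
    return sorted(db for db, srcs in table.items()
                  if any(s in ("nis", "nisplus") for s in srcs))


if __name__ == "__main__":
    table = parse_nsswitch(SAMPLE_CONF)
    print("may query NIS:", nis_databases(table))
```

In this made-up file only passwd and group list nis, so getpw* and getgr* style calls could reach the NIS server while gethost*, getserv* and getproto* would not. Removing "nis" from a line is the per-database switch the post describes.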
It would not surprise me a lot if libc6 makes un-needed NIS calls. The NIS code in libc4 was pretty good. But how hard you're hit by this badness depends on your application. strace and ltrace (I think the library trace program was called ltrace) are your friends if you want to find out what _really_ happens. Nicolai From eugene.leitl@lrz.uni-muenchen.de Thu, 10 Jun 1999 20:22:37 -0400 Date: Thu, 10 Jun 1999 20:22:37 -0400 From: Eugene Leitl eugene.leitl@lrz.uni-muenchen.de Subject: G3 goes Beo http://www.blacklablinux.com/products/ BLACK LAB LINUX PRODUCTS Commodity Parallel Computing Systems "Commodity" refers to the construction of high-performance parallel computing systems from off-the-shelf hardware and software. Black Lab Linux systems offer a relatively low cost of entry, a valuable price-to-performance ratio, and a current market-based upgrade path. The following material provides not only an introduction to standard configurations, but also an opportunity to submit a request for a Price Quotation whereby we will contact you and discuss your requirements further.
Standard Cluster Configurations

Nodes | Fast Ethernet switch | *Gigabit Ethernet switch | **Rack Chassis
4 | BayStack 350-12T | 9 port Alteon ACEswitch 180 | 1 half-height
8 | BayStack 350-12T | 9 port Alteon ACEswitch 180 | 2 racks
16 | BayStack 350-24T | Alteon 708 switch, 16 Gigabit ports on 4 modules | 3 racks
32 | to be determined | Alteon 714 switch, 32 ports on 8 modules | 6 racks

* Options
Networking Fabric: Fast Ethernet Only; Gigabit Ethernet with Jumbo Frames support, NetGear NICs; Gigabit Ethernet with Jumbo Frames support, Alteon NICs
iMac as System Console
G3 rack mount equipment
** Preliminary Specs

BL Computation Node
Processor: PowerPC 750 (G3): 350 400 450, 1 MB L2 cache, 100 Mbit/sec fast ethernet networking (built in)
RAM: 256 MB SDRAM
RAM: 512 MB SDRAM
RAM: 1 GB SDRAM
6 GB Ultra ATA hard disk
12 GB Ultra ATA hard disk
CDROM drive for software installation/upgrades
ATI RAGE 128 graphics card
Note that 'Computation nodes' can be recycled as desktop machines when cluster is upgraded.

Standard Software Configurations
The following are currently in development; specifications and software availability are subject to change.
Black Lab Linux installed and configured
EGCS (gcc) compiler
SSH (secure, encrypted connections to cluster--available in US only due to export restrictions)
development libraries
MPICH and PVM parallel computation libraries
Experimental software
CODA distributed filesystem
Kerberos 5 authentication system (available in US only due to export restrictions)
GNU Queue job scheduling system

Optional Software Configurations
MPICH and MPI-Pro are Message Passing Interfaces. MPICH is a freely available, portable implementation of MPI, the standard for message-passing libraries. MPI-Pro is a fully optimized, high performance implementation of a message-passing library. MPI-Pro should be available end of 3rd quarter 1999.
MPI-Pro by MPI Software Technology Optimized VSI/Pro (Vector, Signal, and Image Processing Library) by MPI Software Technology From admin@cersa.admu.edu.ph Thu, 10 Jun 1999 20:32:59 -0400 Date: Thu, 10 Jun 1999 20:32:59 -0400 From: William Emmanuel S. Yu admin@cersa.admu.edu.ph Subject: Computer Science research done on Beowulf class systems On Thu, 10 Jun 1999, Mark Goodwin wrote: > On Jun 9, 17:49, William Emmanuel S. Yu wrote: > > Subject: Re: Computer Science research done on Beowulf class systems > > > > > > hey. where can i find a 64bit journalized file system for linux? did sgi > > release their xfs modifications already? > > > > We're working on it .. expect to see something around August timeframe. > The delay is due to our legal requirement for an encumberance (sp?) review > of the XFS code before we can release it to the linux community as open > source. > where can i read updates on this? any site i can check out regularly to monitor dev? william.s.yu@ieee.org From markgw@sandpit.melbourne.sgi.com Thu, 10 Jun 1999 20:36:45 -0400 Date: Thu, 10 Jun 1999 20:36:45 -0400 From: Mark Goodwin markgw@sandpit.melbourne.sgi.com Subject: Computer Science research done on Beowulf class systems On Jun 10, 18:41, William Emmanuel S. Yu wrote: > Subject: Re: Computer Science research done on Beowulf class systems > > On Thu, 10 Jun 1999, Mark Goodwin wrote: > > > On Jun 9, 17:49, William Emmanuel S. Yu wrote: > > > Subject: Re: Computer Science research done on Beowulf class systems > > > > > > > > > hey. where can i find a 64bit journalized file system for linux? did sgi > > > release their xfs modifications already? > > > > > > > We're working on it .. expect to see something around August timeframe. > > The delay is due to our legal requirement for an encumberance (sp?) review > > of the XFS code before we can release it to the linux community as open > > source. > > > where can i read updates on this? any site i can check out regularly to > monitor dev? 
> For those who have asked, the press release about SGI's open source intentions for XFS is available at http://www.sgi.com/newsroom/press_releases/1999/may/xfs.html For more detailed technical info, see the XFS "white paper" http://www.sgi.com/Technology/xfs-whitepaper.html There has also been quite a bit of discussion about XFS in the linux kernel lists recently. For a summary, see http://www.kt.opensrc.org/kt19990603_21.html#2 For regular updates on the progress of this project, you'll have to watch out for SGI press releases for the time being. I'll ping our marketing folks to see if we can set up a web page or a mailing list. The SGI press room web page is http://www.sgi.com/newsroom/ Hope this helps -- Mark Goodwin SGI Engineering From pesch@ibm.net Thu, 10 Jun 1999 21:12:34 -0400 Date: Thu, 10 Jun 1999 21:12:34 -0400 From: Paul Eduard Schenker pesch@ibm.net Subject: Computer Science research done on Beowulf class systems Mark, as you're apparently doing a journalized fs - what do you think about an indexed filesystem? I've been playing with the idea for a while, never got going on it but still find it attractive. Paul At 01:46 PM 6/10/99 -0500, Mark Goodwin wrote: >On Jun 9, 17:49, William Emmanuel S. Yu wrote: >> Subject: Re: Computer Science research done on Beowulf class systems >> >> >> hey. where can i find a 64bit journalized file system for linux? did sgi >> release their xfs modifications already? >> > >We're working on it .. expect to see something around August timeframe. >The delay is due to our legal requirement for an encumberance (sp?) review >of the XFS code before we can release it to the linux community as open >source.
> > Mark Goodwin > SGI Engineering > > Paul Eduard Schenker Phone: +65 - 476 2245 1 Peirce Hill Fax: +65 - 472 6480 Singapore 248558 email: pesch@ibm.net From eugene.leitl@lrz.uni-muenchen.de Fri, 11 Jun 1999 02:17:23 -0400 Date: Fri, 11 Jun 1999 02:17:23 -0400 From: Eugene Leitl eugene.leitl@lrz.uni-muenchen.de Subject: CPU: K7 Curtain call Very informative in-depth site on the forthcoming AMD K7. http://www.jc-news.com/pc/article.cgi?AMD/Curtain_Call_K7 Interesting cost speculations: The price of the K7 is a big unknown blot in a sea of misinformation. So far, we've had estimates of the 500MHz part hitting $800 at startup (courtesy Sharky's, of course), and of the same part hitting around $250 at intro (this one's from zdnet, I recall). In reality, I think we can expect the K7 to be priced reasonably close to parity with similarly L2-cached P6 parts, perhaps somewhat above. PIII-500 currently runs for as low as $450. I'm told that Intel's most recent price drop just occurred, so I think it would be logical to assume that the K7-500 will be above $400 when announced (at least if it were on pricewatch). One bit that seems to be too oft-forgotten is the motherboards. I've heard guesses as high as north of $300 for Slot-A boards. Surely, Slot-A is packed with some pretty advanced (for x86, that is) technology, and probably costs more to produce than the Slot-1 boards, especially given that it is using an untried-for-consumer-market EV-6 architecture. I presume that prices may start around $200, just as the Slot-1 boards started. I get the impression, however, that the current boards are built with cutting costs in mind, so I can accept some boards starting perhaps $150. Then again, I could be totally wrong and we could see $110 Slot-A boards popping up right away! -- Eugene P.S. Traffic has picked up recently, and S/N is turning sour. How about using Subject: keywords?
From pdgtech@wantree.com.au Fri, 11 Jun 1999 05:08:59 -0400 Date: Fri, 11 Jun 1999 05:08:59 -0400 From: Peter de Groot pdgtech@wantree.com.au Subject: nfs performance probs This is probably the wrong list, but this seems to be where all the gurus are ;-) I use nfs A LOT, mainly in a sort of COW configuration. Problem. How do I stop the kernel(?) from renicing the nfsd daemons to a lower priority? When the machine acting as an nfs server is also doing a CPU intensive job, nfs practically stops. This is a major hassle. Our nfs server daemons easily clock up 10 mins CPU, and from what I have been told, the kernel automatically drops the priority. Our compute intensive jobs can easily take a day or so, and when that is happening, you might as well write off using the data on the local drive. I did sort of fiddle around with cron jobs and so on, but that is obviously not terribly elegant. Didn't work real good either. Maybe I have got it all wrong too.... I have noticed this behavior on both SGs and old DEC ULTRIX boxes. Regards Peter ______________________________________________________________ PDG Technical Services Pty Ltd Peter de Groot P.O. Box 10349 08) 90916817 Kalgoorlie 6430 Western Australia ______________________________________________________________ From bahnsen@theo-physik.uni-kiel.de Fri, 11 Jun 1999 06:41:21 -0400 Date: Fri, 11 Jun 1999 06:41:21 -0400 From: Robert Bahnsen bahnsen@theo-physik.uni-kiel.de Subject: Hardware for memory intensive task Hi, we intend to set up a compute cluster for tasks needing a lot of memory (1GB) per node for operations with big matrices. We are familiar with SUN WS, but they seem to fall back. We doubt whether a PC can perform well with so much memory per node. So, are there alternatives to EV6 in some DS10 boxes? Thanks for comments -- Robert Bahnsen Institut fuer Theoretische Physik und Astrophysik Universitaet Kiel, Leibnizstr.
15, D-24098 Kiel, Germany Tel: +49 431 8804112 Fax: +49 431 8804094 bahnsen@tp.cau.de http://www.theo-physik.uni-kiel.de/schattke/ From rajkumar@dgs.monash.edu.au Fri, 11 Jun 1999 08:03:36 -0400 Date: Fri, 11 Jun 1999 08:03:36 -0400 From: Rajkumar Buyya rajkumar@dgs.monash.edu.au Subject: Cluster Computing Workshop Call for Papers 1st IEEE International Workshop on Cluster Computing (IWCC'99) http://www.dgs.monash.edu.au/~rajkumar/tfcc/IWCC99/ Melbourne, Australia, Dec. 2, 1999 (In conjunction with PART '99 - Nov. 29 - Dec. 1, 1999) Sponsored by the IEEE Computer Society, through the Task Force on Cluster Computing (TFCC) Co-sponsored by: MPI Software Technology, Inc., USA Asian Technology Information Program (ATIP), Japan Genias Software, Germany ------------------------------------------------------------------------ Call For Participation The International Workshop on Cluster Computing (IWCC'99) will be held in Melbourne, Australia on Dec. 2, 1999. IWCC'99 is associated with PART '99, the 6th Australasian Conference on Parallel and Real-Time Systems. IWCC'99 is sponsored by the IEEE Computer Society, through the Task Force on Cluster Computing (TFCC). The availability of high-speed networks and increasingly powerful commodity microprocessors is making the use of clusters, or networks, of computers an appealing vehicle for cost effective parallel computing. Clusters, built using commodity-off-the-shelf (COTS) hardware components as well as free, or commonly used, software, are playing a major role in redefining the concept of supercomputing. Cluster computing systems range from diverse elements within a single computer to co-ordinated, geographically distributed machines with different architectures. A cluster computing system provides a variety of capabilities that can be orchestrated to execute multiple tasks with varied computational requirements.
Applications in these environments achieve performance by exploiting the affinity of different tasks to different computational platforms or paradigms, while considering the overhead of inter-task communication and the co-ordination of distinct data sources and/or administrative domains. IWCC is an international meeting on Cluster Computing and will serve as a forum to present the latest work by international researchers and developers as well as highlight activities in this area around the Asia Pacific rim. Topics of interest include, but are not limited to: * Cluster Hardware (Cluster of PCs, Workstations, or SMPs) * High Performance communication networks and interfaces * Light Weight Communication Protocols * Cluster Middleware/Underware o Single System Image Infrastructure o System Availability Infrastructure * Issues in Building Scalable Services * File Systems and Parallel I/O * Job and Resource Management * Data Distribution and Load Balancing * Programming Paradigms/Environment for Clusters * Message Passing Systems such as MPI and PVM for Clusters * Problem Solving Environments for Clusters * Tools for Operating and Managing Clusters * Java for High Performance Computing * Algorithms for Solving Problems on Clusters * Scientific, Engineering, and Commercial Applications on Clusters It is planned that workshop papers will be published through the IEEE Computer Society Press. The proceedings will also be available on the Web. See http://www.dcs.port.ac.uk/~mab/tfcc/IWCC99/ or http://www.dgs.monash.edu.au/~rajkumar/tfcc/IWCC99/ for the latest information concerning the workshop. Paper Submission Authors are invited to submit papers consisting of original unpublished research in all areas of cluster computing to the Programme Chair. All submissions will be reviewed by the Program Committee and outside international referees. Papers should not exceed ten (10) single-spaced pages of text using 12 point size type on A4 pages.
References, figures, tables, etc. may be included in addition to the ten pages of text. The paper should include an abstract of approximately 100 words. Authors must submit their papers electronically through the link at http://dhpc.adelaide.edu.au/conferences/IWCC99/ Authors should submit a PostScript (level 2) file and make sure that it will print on a PostScript printer that uses A4 sized paper. Manuscripts must be received on or before the deadline for submission. General/Organising Chairs: * Mark Baker (Portsmouth University, UK) * Rajkumar Buyya (Monash University, Australia) Program Chair: * Ken Hawick (Adelaide University, Australia) Program Committee: * David Abramson (Monash University, Melbourne, Australia) * Hamid Arabnia (University of Georgia, USA) * David Bader (University of New Mexico, USA) * Mark Baker (Portsmouth University, UK) * Ricardo Bianchini (Federal University of Rio de Janeiro, Brazil) * Suchendra Bhandarkar (University of Georgia, USA) * Luc Bouge (LIP, ENS Lyon, France) * Marian Bubak (Institute of Computer Science, Poland) * Rajkumar Buyya (Monash University, Australia) * Giovanni Chiola (University of Genoa, Italy) * Paul Coddington (University of Adelaide, Australia) * Toni Cortes (Universitat Politecnica de Catalunya, Spain) * Dave DeRoure (University of Southampton, UK) * Joao Gabriel Silva (Coimbra University, Portugal) * Al Geist (Oakridge National Lab. USA) * Andrezj Goscinski (Deakin University, Australia) * Wolfgang Gentzsch (Genias GmbH, Germany) * Bill Gropp (Argonne National Lab., USA) * Salim Hariri (Arizona University, USA) * Dan Hyde (Bucknell University, USA) * Yutaka Ishikawa (Real World Computing Partnership, Japan) * Heath James (University of Adelaide, Australia) * Hai Jin (University of Hong Kong, China) * Daniel S.
Katz (Jet Propulsion Lab., California Institute of Technology, USA) * Chung-Ta King (National Tsing Hua University, Taiwan) * Kevin Maciunas (University of Adelaide, Australia) * Piyush Maheshwari (University of New South Wales, Sydney) * Chris McDonald (University of Western Australia, Perth, Australia) * John Morris (University of Western Australia, Perth, Australia) * Marcin Paprzycki (University of Southern Mississippi, USA) * Robert Pennington (NCSA, USA) * Ira Pramanick (Sun Microsystems, USA) * Radharamanan Radhakrishnan (University of Cincinnati, USA) * Rajeev Raje (Purdue University, USA) * Mohan Ram (Centre for Development of Advanced Computing, India) * Wolfgang Rehm (TU Chemnitz, Germany) * Paul Roe (Queensland University of Technology, Brisbane) * Harjinder Sandhu (York University, Toronto, Canada) * Danial Saverese (California Institute of Technology, USA) * Hong Shen (Griffith University, Brisbane, Australia) * R. K. Shyamasundar (Tata Institute of Fundamental Research, India) * Tony Skjellum (MPI Software Technology, USA) * Thomas Sterling (California Institute of Technology, USA) * Peter Strazdins (Australian National University, Canberra) * Chengzheng Sun (Griffith University, Brisbane, Australia) * Yong-Meng Teo ( National University of Singapore, Singapore) * Putchong Uthayopas (Kasetsart University, Bangkok, Thailand) * David W Walker (Cardiff University, UK) * Barry Wilkinson (University of North Carolina, USA) * Albert Zomaya (University of Western Australia, Perth, Australia) Publicity Chairs: * David Bader (University of New Mexico, USA) * Albert Zomaya (University of Western Australia) Finance and Local Arrangements Chair: * Mahbub Hassan (Monash University, Australia) Registration Chair: * Jahan Hassan (Monash University, Australia) Important Dates: Call For Papers : May 5, 1999 Paper Submission : August 5, 1999 Notification of Acceptance : August 25, 1999 Camera Ready Papers and Pre-registration due on : September 17, 1999 IWCC'99 
Workshop : December 2, 1999 -------------------------------------------------------------------------- From jferg@2boot.com Fri, 11 Jun 1999 09:26:05 -0400 Date: Fri, 11 Jun 1999 09:26:05 -0400 From: jferg jferg@2boot.com Subject: Journaling File Systems "Butler, Mark, CTR" wrote: > I second this request (question). What's a journaling file system and > what's all the fuss about? > > > -----Original Message----- > > From: andrew close [SMTP:aclose72@yahoo.com] > > Sent: Thursday, June 10, 1999 12:41 PM > > To: beowulf@beowulf.gsfc.nasa.gov; extreme-linux@acl.lanl.gov > > Subject: Re: Computer Science research done on Beowulf class systems > > > > i apologize if this is an extremely ignorant question, > > but since there has been a lot of talk about it > > lately... > > > > what is a journaling file system, and why is it so > > sought after? > > > > thanks > > > > andy Another approach with similar advantages is the "Log Structured Filesystem", LFS, which is optimized for write operations. It is covered in some detail in McKusick et al., "The Design and Implementation of the 4.4BSD Operating System", Addison-Wesley, 1996, ISBN 0-201-45979-4. -- Joe Ferguson, ApeX Systems Integration Corp. Voice: 919.468.8150 FAX: 919.468.5288 email: jferg@2boot.com
From deadline@plogic.com Fri, 11 Jun 1999 10:32:16 -0400 Date: Fri, 11 Jun 1999 10:32:16 -0400 From: Douglas Eadline deadline@plogic.com Subject: CPU: K7 Curtain call On Thu, 10 Jun 1999, Eugene Leitl wrote: This may also be of interest: http://www.aceshardware.com/ BTW: I'm at USENIX and the Extreme Linux Session. Great stuff. Doug > > Very informative in-depth site on the forthcoming AMD K7. > > http://www.jc-news.com/pc/article.cgi?AMD/Curtain_Call_K7 > > Interesting cost speculations: > > The price of the K7 is a big unknown blot in a > sea of misinformation. So far, we've had estimates > of the 500MHz part hitting $800 at startup > (courtesy Sharky's, of course), and of the same part > hitting around $250 at intro (this one's from zdnet, > I recall). In reality, I think we can expect the K7 to > be priced reasonably close to parity with similarly > L2-cached P6 parts, perhaps somewhat above. > PIII-500 currently runs for as low as $450. I'm told > that Intel's most recent price drop just occurred, so > I think it would be logical to assume that the > K7-500 will be above $400 when announced (at > least if it were on pricewatch). > > One bit that seems to be too oft-forgotten is > the motherboards. I've heard guesses as high as > north of $300 for Slot-A boards. Surely, Slot-A is > packed with some pretty advanced (for x86, that is) > technology, and probably costs more to produce > than the Slot-1 boards, especially given that it is > using an untried-for-consumer-market EV-6 > architecture. I presume that prices may start around > $200, just as the Slot-1 boards started. I get the > impression, however, that the current boards are > built with cutting costs in mind, so I can accept > some boards starting perhaps $150.
Then again, I > could be totally wrong and we could see $110 > Slot-A boards popping up right away! > > -- Eugene > > P.S. Traffic has picked up recently, and S/N is turning sour. How > about using Subject: keywords? > ------------------------------------------------------------------- Paralogic, Inc. | PEAK | Voice:+610.861.6960 115 Research Drive | PARALLEL | Fax:+610.861.8247 Bethlehem, PA 18017 USA | PERFORMANCE | http://www.plogic.com ------------------------------------------------------------------- From Giovanni@averell.umh.ac.be Fri, 11 Jun 1999 11:50:09 -0400 Date: Fri, 11 Jun 1999 11:50:09 -0400 From: Giovanni Scalmani Giovanni@averell.umh.ac.be Subject: channel bonding problem under 2.3.0 and RH6 Hi! what I'm missing here? 1- got two (actually more than two) dual PII 400MHz and 2 100BT NICs ( 3com Fast EtherLink PCI, i.e. 'boomerang' cards); the two NICs are attached to two different switches D-Link DES 1008 2- installed RH6 and configured both cards eth0 : 192.168.1.x ( 255.255.255.0 ) eth1 : 192.168.2.x ( 255.255.255.0 ) 3- got linux-2.3.0 4- got and applied the channel bonding patch for 2.3.0 from http://www.glasscity.net/users/gsaraber/gerard/linux.html 5- got and compiled the ifenslave.c program from the same site (actually it seems identical to the original ifenslave.c from CESDIS) 6- issued (as root) the following commands (first on the 'receiver' and then on the 'transmitter') ifconfig eth1 down ifenslave -v eth0 eth1 ifenslave.c:v0.07 9/9/97 Donald Becker (becker@cesdis.gsfc.nasa.gov) The hardware address (SIOCGIFHWADDR) of eth0 is type 1 00:10:5a:5a:0e:24. The interface eth1 is up, shutting it down it to enslave it. Set the slave's hardware address to 00:10:5a:5a:0e:24. Set the slave's IP address to 0.0.192.168. IS THIS OK ?! Set the slave's MTU to 1500. Set the slave's destination address to 0.0.192.168. AND THIS ? Set the slave's broadcast address to 0.0.192.168. AND THIS ? Set the slave's netmask to 0.0.255.255. AND THIS ? 
Set the slave's flags 1043. 7- ifconfig gives something looking better ...
eth0  Link encap:Ethernet  HWaddr 00:10:5A:5A:0E:24
      inet addr:192.168.1.4  Bcast:192.168.1.255  Mask:255.255.255.0
      UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
      RX packets:5734329 errors:0 dropped:0 overruns:0 frame:0
      TX packets:5719315 errors:0 dropped:0 overruns:0 carrier:0
      collisions:0 txqueuelen:100
      Interrupt:5 Base address:0xa800
eth1  Link encap:Ethernet  HWaddr 00:10:5A:5A:0E:24
      inet addr:192.168.1.4  Bcast:192.168.1.255  Mask:255.255.255.0
      UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
      RX packets:0 errors:0 dropped:0 overruns:0 frame:0
      TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
      collisions:0 txqueuelen:100
      Interrupt:10 Base address:0xa400
However NetPIPE between the two machines does not show anything beyond 89 Mbps and the LEDs on the 'slave' network switch do not blink! So ... what I'm missing? Should I bring down eth1 on the remaining nodes and restart the 'slave' network switch ? Thank you for your help. Regards, Giovanni ---------------------------------------------------------------------- Scalmani Giovanni Giovanni@averell.umh.ac.be Service de Chimie des Materiaux Nouveaux Centre de Recherche en Electronique et Photonique Moleculaires Universite' de Mons-Hainaut Place du Parc, 20 Phone: ++32-(0)65-373361 B-7000 Mons (Belgium) Fax: ++32-(0)65-373366 ---------------------------------------------------------------------- From dan@cfdws10.concordia.ca Fri, 11 Jun 1999 13:37:37 -0400 Date: Fri, 11 Jun 1999 13:37:37 -0400 From: dan stanescu dan@cfdws10.concordia.ca Subject: 2-machine cluster setup Hi, we're wondering if anyone can give us an advice re. the SAN for a 2-node beowulf. Since no switch/hub is necessary, I thought only one cross-over Ethernet cable would be enough. I saw however that some vendors use two cross-over cables, with 2 NICs per node only for the SAN. I wonder what the benefits of that would be?
Thanks for any advice, ------------------------------------------------------------------ Dan Stanescu, PhD | | CFD Laboratory, ER-301 | Tel.: (514)848-3138 | Concordia University | FAX : (514)848-8601 | 1455 de Maisonneuve Blvd. West | E-mail: dan@cfdlab.concordia.ca | Montreal, CANADA H3G 1M8 | | ------------------------------------------------------------------ From tadavis@lbl.gov Fri, 11 Jun 1999 14:24:11 -0400 Date: Fri, 11 Jun 1999 14:24:11 -0400 From: Thomas Davis tadavis@lbl.gov Subject: nfs performance probs Peter de Groot wrote: > > This is probably the wrong list, but this seems > to be where all the gurus are ;-) > > I use nfs A LOT, mainly in a sort of COW configuration. > > Problem. How do I stop the kernel(?) from renicing the > nfsd daemons to a lower priority. When the machine > acting as a nfs server is also doing a CPU intensive > job. nfs practically stops. > > This is a major hassle. Our nfs server daemons easily > clock up 10 mins CPU, and from what I have been told, > the kernel automatically drops the priority. > What version of the Linux kernel, and is it the user space NFSD or the kernel NFSD? From what I can tell, the kNFSD doesn't have this problem. (I've got 7 machines running it here, all serving up 62gb partitions.. and they easily do a sustained 5-6mb per second transfer rates..) -- ------------------------+-------------------------------------------------- Thomas Davis | PDSF Project Leader tadavis@lbl.gov | (510) 486-4524 | "Only a petabyte of data this year?" From Lechner@drs-esg.com Fri, 11 Jun 1999 14:41:40 -0400 Date: Fri, 11 Jun 1999 14:41:40 -0400 From: Lechner, David Lechner@drs-esg.com Subject: Extreme Linux material from USENIX-99 Here is a general question from an attendee that was unable to get registered into the "CLOSED" EL workshop - Will the presentations or URLs for the presentation material from the workshop be posted on the Usenix or EL or Beowulf web sites?
Disappointed attendees and distant non-attendees might like to know what went on (even in the limited form of the presentations). W/Thanks/ Dave Lechner From rgb@phy.duke.edu Fri, 11 Jun 1999 16:01:00 -0400 Date: Fri, 11 Jun 1999 16:01:00 -0400 From: Robert G. Brown rgb@phy.duke.edu Subject: Journaling File Systems > what is a journaling file system, and why is it so > sought after? In case everybody hasn't realized it, my previous answer to this question was wrong. It isn't a filesystem with an audit trail/versioning system, it is a filesystem that does writes in two steps in such a way that the filesystem is never in an inconsistent state. Consequently, fsck's are basically unnecessary, which saves a lot of time. As my correctors instructed me, they write faster and it is pleasant to see the initial filesystem check after a poweroff reboot take only 2 seconds (even with cone-head quantities of disk). rgb Robert G. Brown http://www.phy.duke.edu/~rgb/ Duke University Dept. of Physics, Box 90305 Durham, N.C. 27708-0305 Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb@phy.duke.edu From josip@icase.edu Fri, 11 Jun 1999 16:43:41 -0400 Date: Fri, 11 Jun 1999 16:43:41 -0400 From: Josip Loncaric josip@icase.edu Subject: 2-machine cluster setup dan stanescu wrote: > > Hi, > > we're wondering if anyone can give us an advice re. the SAN > for a 2-node beowulf. Since no switch/hub is necessary, > I thought only one cross-over Ethernet cable would be > enough. I saw however that some vendors use two cross-over cables, > with 2 NICs per node only for the SAN. I wonder what the benefits > of that would be? A single crossover cable would work nicely -- you'll get a direct full duplex connection. Double connection (and using Ethernet bonding) would give you almost twice the bandwidth but still the same latency. More bandwidth is nice, but there is little benefit in doubling the bandwidth if your limitation is latency. Know your code's bottlenecks, and act accordingly.
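[Editor's illustration] The bandwidth-versus-latency point can be put into rough numbers with a simple linear cost model: time per message = latency + size / bandwidth. The sketch below is only that, a sketch. The 100 microsecond latency and the idealised doubling of bandwidth under bonding are assumptions chosen for illustration, not measurements from this thread, and the function names are made up for the example.

```python
# Linear cost model for one message: fixed latency plus serialization time.
# All constants are illustrative assumptions, not measured values.

LATENCY_S = 100e-6             # assumed per-message latency (100 microseconds)
SINGLE_BPS = 100e6             # one Fast Ethernet link, 100 Mbit/s
BONDED_BPS = 2 * SINGLE_BPS    # idealised bonding: twice the bandwidth, same latency


def transfer_time(size_bytes, latency_s, bandwidth_bps):
    """Estimated time in seconds to deliver one message of size_bytes."""
    return latency_s + (size_bytes * 8) / bandwidth_bps


def bonding_speedup(size_bytes):
    """Ratio of single-link time to bonded-link time for one message."""
    single = transfer_time(size_bytes, LATENCY_S, SINGLE_BPS)
    bonded = transfer_time(size_bytes, LATENCY_S, BONDED_BPS)
    return single / bonded


if __name__ == "__main__":
    for size in (64, 1024, 64 * 1024, 1024 * 1024):
        print(f"{size:>8} bytes: speedup from bonding = {bonding_speedup(size):.2f}x")
```

Under these assumptions small messages gain almost nothing from the second link, while large transfers approach the full factor of two, which is why it pays to measure the code's message sizes before buying the extra cards.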
If you are uncertain, start with one connection, measure, then add the second connection if needed. Then again, Fast Ethernet cards are cheap (e.g. NetGear FA310TX was $11.95 each at buy.com recently). Adding two FA310TX cards and a crossover cable should cost under $40 including shipping (in the USA). Not bad, although you would need a driver update to get full performance. Sincerely, Josip -- Dr. Josip Loncaric, Senior Staff Scientist mailto:josip@icase.edu ICASE, Mail Stop 132C http://www.icase.edu/~josip/ NASA Langley Research Center mailto:j.loncaric@larc.nasa.gov Hampton, VA 23681-2199, USA Tel. +1 757 864-2192 Fax +1 757 864-6134 From wro@usc.edu Fri, 11 Jun 1999 17:32:35 -0400 Date: Fri, 11 Jun 1999 17:32:35 -0400 From: Wonwoo Ro wro@usc.edu Subject: question about 3COM-3c905B Dear all, We have installed Redhat 6.0 on 9 nodes with Pentium processors. When we test TCP, the result is extremely bad. We are using 3COM's 3C905B Ethernet (10/100) Card. The UDP test is quite good. If anybody has some knowledge about this, please send me e-mail. The benchmark test software is netperf and netpipe. We use 3c59x in Redhat 6.0 as device driver. Thank you. From jteneyck@xyos.net Fri, 11 Jun 1999 17:59:19 -0400 Date: Fri, 11 Jun 1999 17:59:19 -0400 From: John M. TenEyck jteneyck@xyos.net Subject: 2-machine cluster setup I believe this would probably be used in conjunction with channel bonding for better throughput. John TenEyck _________________________________________________________________________ John TenEyck jteneyck@xyos.net http://jteneyck.xyos.net 409.229.8954 .-. __ _____ ____ ___ __ /v\ / / / _/ | / / / / / |/ / / \ / / / // |/ / / / /| / /( )\ / /____/ // /| / /_/ // | ^^-^^ /_____/___/_/ |_/_____//_/|_| >Phear The Penguin< If you refuse to accept anything but the best you very often get it.
_________________________________________________________________________ On Fri, 11 Jun 1999, dan stanescu wrote: > > Hi, > > we're wondering if anyone can give us an advice re. the SAN > for a 2-node beowulf. Since no switch/hub is necessary, > I thought only one cross-over Ethernet cable would be > enough. I saw however that some vendors use two cross-over cables, > with 2 NICs per node only for the SAN. I wonder what the benefits > of that would be? > > Thanks for any advice, > > ------------------------------------------------------------------ > Dan Stanescu, PhD | | > CFD Laboratory, ER-301 | Tel.: (514)848-3138 | > Concordia University | FAX : (514)848-8601 | > 1455 de Maisonneuve Blvd. West | E-mail: dan@cfdlab.concordia.ca | > Montreal, CANADA H3G 1M8 | | > ------------------------------------------------------------------ > From evt@texelsoft.com Fri, 11 Jun 1999 18:37:40 -0400 Date: Fri, 11 Jun 1999 18:37:40 -0400 From: texelsoft evt@texelsoft.com Subject: atx mobo question I recently got a bunch of asus atx mobos. Is there a convenient way to short/hack the atx switch leads so the machines will autoboot after a power failure without requiring someone to wallop the switch? -evt Eric van Tassell Texel Software, Inc. - System Software Engineering for Windows/NT 277 Cochran Hill Rd. Voice : 603-487-5006 New Boston, NH 03070 USA Fax : 603-487-5166 email : evt@texelsoft.com http://www.texelsoft.com > -----Original Message----- > From: owner-beowulf@beowulf.gsfc.nasa.gov > [mailto:owner-beowulf@beowulf.gsfc.nasa.gov]On Behalf Of > Fred_deBros@etherdome.mgh.harvard.edu > Sent: Tuesday, June 08, 1999 9:47 AM > To: Kat kirk > Cc: Dave Hart; beowulf@beowulf.gsfc.nasa.gov; extreme-linux@acl.lanl.gov > Subject: Re: 2.2.9 > > > > > > > Anyone upgrade to 2.2.9 ? If so how goes? 
> > Well, I still try to: > > make make dep make bzImage, make modules make modules_install (I > wanted the > 4-cdrom changer to run) and I get on bootup: > > after sending BOOTP and RARP requests......unable to handle kernel paging > request at virtual address 00d0030.....lots of computerese ....Aiee, > killing interrupt handler. > > I dont think that is 2.2.9 per se. 2.2.5 runs fine. Am I doin sumpin > wrong in make menuconfig? > > I am not knowledgeable enough to fix that. Somebody? > > fred > > > From joelja@darkwing.uoregon.edu Fri, 11 Jun 1999 19:36:19 -0400 Date: Fri, 11 Jun 1999 19:36:19 -0400 From: Joel Jaeggli joelja@darkwing.uoregon.edu Subject: atx mobo question actually I think you can do it in the bios on asus boards... you might check the manual... joelja On Fri, 11 Jun 1999, texelsoft wrote: > I recently got a bunch of asus atx mobos. Is there a convenient way to > short/hack the atx switch leads so the machines will autoboot after a power > failure without requiring someone to wallop the switch? > > -evt > > Eric van Tassell > Texel Software, Inc. - System Software Engineering for Windows/NT > 277 Cochran Hill Rd. Voice : 603-487-5006 > New Boston, NH 03070 USA Fax : 603-487-5166 > email : evt@texelsoft.com > http://www.texelsoft.com > > > > -----Original Message----- > > From: owner-beowulf@beowulf.gsfc.nasa.gov > > [mailto:owner-beowulf@beowulf.gsfc.nasa.gov]On Behalf Of > > Fred_deBros@etherdome.mgh.harvard.edu > > Sent: Tuesday, June 08, 1999 9:47 AM > > To: Kat kirk > > Cc: Dave Hart; beowulf@beowulf.gsfc.nasa.gov; extreme-linux@acl.lanl.gov > > Subject: Re: 2.2.9 > > > > > > > > > > > > > Anyone upgrade to 2.2.9 ? If so how goes? 
> > > > Well, I still try to: > > > > make make dep make bzImage, make modules make modules_install (I > > wanted the > > 4-cdrom changer to run) and I get on bootup: > > > > after sending BOOTP and RARP requests......unable to handle kernel paging > > request at virtual address 00d0030.....lots of computerese ....Aiee, > > killing interrupt handler. > > > > I dont think that is 2.2.9 per se. 2.2.5 runs fine. Am I doin sumpin > > wrong in make menuconfig? > > > > I am not knowledgeable enough to fix that. Somebody? > > > > fred > > > > > > > -------------------------------------------------------------------------- Joel Jaeggli joelja@darkwing.uoregon.edu Academic User Services consult@gladstone.uoregon.edu PGP Key Fingerprint: 1DE9 8FCA 51FB 4195 B42A 9C32 A30D 121E -------------------------------------------------------------------------- It is clear that the arm of criticism cannot replace the criticism of arms. Karl Marx -- Introduction to the critique of Hegel's Philosophy of the right, 1843. From wcs@nersc.gov Fri, 11 Jun 1999 19:48:26 -0400 Date: Fri, 11 Jun 1999 19:48:26 -0400 From: Bill Saphir wcs@nersc.gov Subject: Extreme Linux material from USENIX-99 On Fri, 11 Jun 1999, Lechner, David wrote: > Here is a general question from an attendee that was unable to get registered > into the "CLOSED" EL workshop - Will the presentations or URLs for the > presentation material from the workshop be posted on the Usenix or EL or > Beowulf web sites? Disappointed attendees and distant non-attendees might > like to know what went on (even in the limited form of the presentations). > W/Thanks/ We were told that slides from most of the talks should be available at the Extreme Linux web site (www.extremelinux.org) within a few days. There was also a small bound proceedings (short papers) distributed at the workshop. Pete - would it be possible to put these up as well?
Bill From jav@blazenet.net Fri, 11 Jun 1999 21:57:09 -0400 Date: Fri, 11 Jun 1999 21:57:09 -0400 From: jav jav@blazenet.net Subject: 2-machine cluster setup yes and no, as the previous reply to this message from Josip has stated, you will get increased bandwidth, but the latency is enough to negate any "real" positive effects when only dealing with 2 nodes. For simplicity's sake channel bonding really isn't that necessary. Instead, buy the 4 NIC's and be prepared when Murphy's Law takes effect. John > -----Original Message----- > From: John M. TenEyck [SMTP:jteneyck@xyos.net] > Sent: Friday, 11 June, 1999 17:57 > To: dan stanescu > Cc: beowulf@beowulf.gsfc.nasa.gov > Subject: Re: 2-machine cluster setup > > I believe this would probably be used in conjunction with channel > bonding > for better throughput. > > John TenEyck > > ______________________________________________________________________ > ___ > > John TenEyck > jteneyck@xyos.net > http://jteneyck.xyos.net > 409.229.8954 > .-. __ _____ ____ ___ __ > /v\ / / / _/ | / / / / / |/ / > / \ / / / // |/ / / / /| / > /( )\ / /____/ // /| / /_/ // | > ^^-^^ /_____/___/_/ |_/_____//_/|_| > >Phear The Penguin< > > If you refuse to accept anything but the best you very often get it. > ______________________________________________________________________ > ___ > > > On Fri, 11 Jun 1999, dan stanescu wrote: > > > > > Hi, > > > > we're wondering if anyone can give us an advice re. the SAN > > for a 2-node beowulf. Since no switch/hub is necessary, > > I thought only one cross-over Ethernet cable would be > > enough. I saw however that some vendors use two cross-over cables, > > with 2 NICs per node only for the SAN. I wonder what the benefits > > of that would be? 
> > > > Thanks for any advice, > > > > ------------------------------------------------------------------ > > Dan Stanescu, PhD | | > > CFD Laboratory, ER-301 | Tel.: (514)848-3138 | > > Concordia University | FAX : (514)848-8601 | > > 1455 de Maisonneuve Blvd. West | E-mail: dan@cfdlab.concordia.ca | > > Montreal, CANADA H3G 1M8 | | > > ------------------------------------------------------------------ > > From nick_boulter@hotmail.com Sat, 12 Jun 1999 08:31:08 -0400 Date: Sat, 12 Jun 1999 08:31:08 -0400 From: Nick Boulter nick_boulter@hotmail.com Subject: Fwd: Linux security threat -fwd- I have been reading the list for some time, and have had personal contact with some other members. This came through my email from the university (Bristol, England) and I thought some of you might be interested. Nick. >From: Oliver Miles >Reply-To: oliver.miles@bristol.ac.uk >To: cs-dept@compsci.bristol.ac.uk >Subject: Linux security threat >Date: Fri, 11 Jun 1999 15:32:22 +0100 > >--- Begin Forwarded Message --- >Date: Wed, 9 Jun 1999 10:30:16 +0100 (BST) >From: John Murphy >Subject: Linux security threat >Sender: owner-bris-ss@bristol.ac.uk >To: bris-ss@bristol.ac.uk > >Reply-To: John Murphy >Message-ID: > > > >RE: LINUX SECURITY THREAT > >FAO: ANYONE LOOKING AFTER LINUX BASED SYSTEMS > > >We've received notification of a current security threat to Linux based >systems. Indeed a number of machines here at Bristol have already been >compromised. > >For details of this threat and how to protect your systems against it, >please look at: > > http://www.cse.bris.ac.uk/pcs/linux/mountd-vul.html > >IMPORTANT: > > * DON'T PANIC! > * Read the above web page as soon as you can > * Act on it at a convenient (and planned) time in the near future > * Call us if you need help with this > > >Regards, > >John Murphy, J.Murphy@bris.ac.uk, x7863 >Martin Radford, Martin.Radford@bristol.ac.uk, x3048 > >Computing Service. 
> > > > >--- End Forwarded Message --- > > >---------------------- >Oliver Miles, PC + Mac Sys Admin >CompSci, ElecEng, FacEng AUT rep. >Dept Computer Science, University of Bristol >Merchant Venturers Building, Bristol BS8 1UB >tel:0117-954 5158, fax:0117-954 5208 > > ______________________________________________________ Get Your Private, Free Email at http://www.hotmail.com From pdgtech@wantree.com.au Sat, 12 Jun 1999 13:12:54 -0400 Date: Sat, 12 Jun 1999 13:12:54 -0400 From: Peter de Groot pdgtech@wantree.com.au Subject: nfs performance probs > > > > Problem. How do I stop the kernel(?) from renicing the > > nfsd daemons to a lower priority. When the machine > > acting as a nfs server is also doing a CPU intensive > > job. nfs practically stops. > > > > What version of the Linux kernel, and is it the user space NFSD or the > kernel NFSD? Apologies. I have not had a linux box in such a situation. SGs, and ULTRIX definitely. I have had a yarn to the support departments of both of these fine organisations, and they have led me to believe that this behaviour was a "feature" of unix itself ... from day 1.... ....a fundamental part of the original design...... Hence my posting to a list with a lot of guys who get down and dirty :-) To be honest, I am also hoping to get a reply from the bloke from SGI who keeps an eye on this list. BTW. Does not DEC (sorry Compaq) already have that journaling file system happening? I think that I saw it on one of the early alphas running Digital Unix?? From what you are saying, it appears that linux does not exhibit this behaviour. MMmmmmmmm. I know of a few sites that are muttering about file servers..... I love the users dearly, but they get as close to the nfsds as I would to their email ;-) nfsd started at boot time. Usually about 4 daemons. The nfs server usually gets hammered by only one client at a time. > > From what I can tell, the kNFSD doesn't have this problem.
(I've got 7 > machines running it here, all serving up 62gb partitions.. and they > easily do a sustained 5-6mb per second transfer rates..) Tasty. kNFSD. I do not know of that one ?? Regards Peter ______________________________________________________________ PDG Technical Services Pty Ltd Peter de Groot P.O. Box 10349 08) 90916817 Kalgoorlie 6430 Western Australia ______________________________________________________________ From omri@NMR.MGH.Harvard.EDU Sat, 12 Jun 1999 17:39:45 -0400 Date: Sat, 12 Jun 1999 17:39:45 -0400 From: Omri Schearz omri@NMR.MGH.Harvard.EDU Subject: extremelinux.org in limbo right now. (Cox report fallout?) Does anyone know what is going on with extremelinux? Presumably it was moved off LANL's network, whereto I know not. Omri Schwarz --- omri@nmr.mgh.harvard.edu Timeless wisdom of biomedical engineering: "Noise is principally due to the presence of the patient." -- R.F. Farr From bob@drzyzgula.org Sat, 12 Jun 1999 22:01:59 -0400 Date: Sat, 12 Jun 1999 22:01:59 -0400 From: Bob Drzyzgula bob@drzyzgula.org Subject: extremelinux.org in limbo right now. (Cox report fallout?) Huh? I got through just fine; the page comes up in my browser, and a traceroute places it inside LANL. Perhaps you hit a routing glitch? 
--Bob
% traceroute -i ppp0 www.extremelinux.org
traceroute to www.extremelinux.org (204.121.3.20), 30 hops max, 40 byte packets
 1  lvdc4.popsite.net (209.100.18.5)  119.983 ms  118.704 ms  119.740 ms
 2  popsitedc-gw (209.100.18.1)  119.673 ms  118.938 ms  119.726 ms
 3  s11-0-0-16.vienna1-cr2.bbnplanet.net (4.1.3.189)  119.688 ms  118.840 ms  119.762 ms
 4  f0-0-0.vienna1-br1.bbnplanet.net (4.1.0.1)  129.628 ms  119.438 ms  119.150 ms
 5  p2-0.vienna1-nbr2.bbnplanet.net (4.0.5.49)  119.687 ms  118.758 ms  129.782 ms
 6  p0-0-0.maeeast.bbnplanet.net (4.0.1.94)  189.695 ms  208.761 ms  119.752 ms
 7  mae-east-rt1.es.net (192.41.177.251)  129.601 ms  128.607 ms  149.755 ms
 8  lanl3-atms.es.net (134.55.24.5)  199.680 ms  199.009 ms  199.752 ms
 9  lanl-gw.lanl.gov (192.16.1.1)  199.587 ms  208.219 ms  199.732 ms
10  server.acl.lanl.gov (204.121.3.20)  199.590 ms  198.875 ms  199.751 ms
> -----Original Message----- > From: owner-beowulf@beowulf.gsfc.nasa.gov > [mailto:owner-beowulf@beowulf.gsfc.nasa.gov]On Behalf Of Omri Schearz > Sent: Saturday, June 12, 1999 5:40 PM > To: beowulf@beowulf.gsfc.nasa.gov > Subject: extremelinux.org in limbo right now. (Cox report fallout?) > > > Does anyone know what is going on with > extremelinux? > > Presumably it was moved off LANL's network, > whereto I know not. > > Omri Schwarz --- omri@nmr.mgh.harvard.edu > Timeless wisdom of biomedical engineering: > "Noise is principally due to the presence of the > patient." -- R.F. Farr > > From uthayopa@mcs.anl.gov Sun, 13 Jun 1999 12:22:30 -0400 Date: Sun, 13 Jun 1999 12:22:30 -0400 From: Putchong Uthayopas uthayopa@mcs.anl.gov Subject: question about 3COM-3c905B Hi, We used to use 3COM card for a while. This is what we learn. 1. 3C905BTXA work perfectly on all kernel. 2. We use Kernel 2.0.36 3c905TXB driver have some problem of packet loss once in a while. But I guess that it got fixed already in RH6.0 Kernel+new driver. (Don Becker can answer this better). 3.
We also found that when using the Netpipe benchmark and the like, you need to turn off the Nagle algorithm to get high performance. We used to get about 40-60 Mbps; after turning it off we get about 90 Mbps. Turning it off is the same as using the TCP_NODELAY option at the socket level. How to turn it off? Easy, it is one parameter under tcp when you recompile the kernel. Select it to turn Nagle off, recompile the kernel and it will work. Don't forget to do that for every machine. Hope this helps. If it works, please post your results. Putchong. On Fri, 11 Jun 1999, Wonwoo Ro wrote: > > Dear all, > > We have installed Redhat 6.0 on 9 nodes with Pentium processors. > When we test TCP, the result is extremely bad. We are using 3COM's 3C905B > Ethernet (10/100) Card. The UDP test is quite good. If anybody has some > knowledge about this, please send me e-mail. The benchmark test software > is netperf and netpipe. We use 3c59x in Redhat 6.0 as device driver. > > Thank you. > > From hogue@mshri.on.ca Sun, 13 Jun 1999 17:34:36 -0400 Date: Sun, 13 Jun 1999 17:34:36 -0400 From: Christopher Hogue hogue@mshri.on.ca Subject: 2U & 1U Rack systems Hi folks Went looking for 2U and 1U rackmount cases and found these... http://www.tesys.com/servers/rackmount.shtml The 2U systems take either the ASUS P2B-D2 or Intel N44BX boards, which have the NIC, video, scsi on board (i.e. no cards needed.) Anyone know of any other 2U cases like this? Chris. --------------------------------------- Christopher W.V. Hogue, Ph.D. Samuel Lunenfeld Research Institute Mt. Sinai Hospital 600 University Ave. Toronto Ontario Canada M5G 1X5 (416) 586-4800 xt2866 From hogue@mshri.on.ca Sun, 13 Jun 1999 17:46:23 -0400 Date: Sun, 13 Jun 1999 17:46:23 -0400 From: Christopher Hogue hogue@mshri.on.ca Subject: YAC (Yet Another Cluster) Hi all In light of all the "yet anothers.." going around. we named our cluster YAC (Yet Another Cluster) http://bioinfo.mshri.on.ca/yac/ Pictures & Specs on the site.
Thanks for all the good information being passed around here, my hat is off to all of you who shared your experience and time contributing. Chris. --------------------------------------- Christopher W.V. Hogue, Ph.D. Samuel Lunenfeld Research Institute Mt. Sinai Hospital 600 University Ave. Toronto Ontario Canada M5G 1X5 (416) 586-4800 xt2866 fax (416) 586-8857 hogue@mshri.on.ca http://bioinfo.mshri.on.ca From lzupm@hotmail.com Sun, 13 Jun 1999 21:26:45 -0400 Date: Sun, 13 Jun 1999 21:26:45 -0400 From: Lang Zhi lzupm@hotmail.com Subject: question about 3COM-3c905B >Hi, > >We used to use 3COM card for a while. This is what we learn. >1. 3C905BTXA work perfectly on all kernel. >2. We use Kernel 2.0.36 3c905TXB driver have some problem of packet loss >once in a while. But I guess that it got fixed already in RH6.0 Kernel+new >driver. (Don Becker can answer this better). >3. We also found that using Netpipe benchmark and stuff , you need to turn >off NAGLE algorithm to get high performance. We used to get about >40-60 Mbps, after turning it off we can get about 90Mbps. Turning it off is the >same as using the TCP_NODELAY option on socket level. How to turn it off? >Easy, it is one parameter under tcp when you recompile kernel. Select it >to turn off, recompile kernel and it will work. Don't forget to do that >for every machine > >Hope this helps. If it works , please post your result. > >Putchong. We are using 3Com905B TX (Cyclone) and very happy with it. Netperf give around 95 Mbs between any host. (with the command : netperf -H node1 ) We are using kernel 2.0.36 and 2.2.9 without any patch and 3Com906B work great with it . BTW,we use 3 Com Switch 3300 (3C16980).
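[Editor's illustration] The TCP_NODELAY socket option mentioned in the quoted message can also be set per socket from user space, which avoids rebuilding the kernel on every node. The Python sketch below shows only the generic sockets call; it is not how netperf or NetPIPE set the option internally, and the helper names are made up for the example.

```python
import socket


def make_nodelay_socket():
    """Create a TCP socket with the Nagle algorithm disabled (TCP_NODELAY set)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # A non-zero value asks the stack to send small segments immediately
    # instead of coalescing them while earlier data is unacknowledged.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


def nagle_disabled(sock):
    """Return True if TCP_NODELAY is currently set on the socket."""
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0


if __name__ == "__main__":
    s = make_nodelay_socket()
    print("Nagle disabled:", nagle_disabled(s))
    s.close()
```

Setting the option per socket affects only the application that asks for it, so other traffic on the same machine keeps the default behaviour.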
-lz ______________________________________________________ Get Your Private, Free Email at http://www.hotmail.com From pesch@ibm.net Mon, 14 Jun 1999 02:31:54 -0400 Date: Mon, 14 Jun 1999 02:31:54 -0400 From: Paul Eduard Schenker pesch@ibm.net Subject: question about 3COM-3c905B How compatible are different brands of NICs (different chip sets) with the different brands of switches? Paul > >We are using 3Com905B TX (Cyclone) and very happy with it. >Netperf give around 95 Mbs between any host. >(with the command : netperf -H node1 ) > >We are using kernel 2.0.36 and 2.2.9 without any patch and 3Com906B work >great with it . > >BTW,we use 3 Com Switch 3300 (3C16980). > >-lz > > >______________________________________________________ >Get Your Private, Free Email at http://www.hotmail.com > > Paul Eduard Schenker Phone: +65 - 476 2245 1 Peirce Hill Fax: +65 - 472 6480 Singapore 248558 email: pesch@ibm.net From briddonj@us.ibm.com Mon, 14 Jun 1999 03:26:42 -0400 Date: Mon, 14 Jun 1999 03:26:42 -0400 From: briddonj@us.ibm.com briddonj@us.ibm.com Subject: Computer Science research done on Beowulf class systems >> Probably - but what about IBMs DB2 ? It's been out in beta for a while >> (I got a CD in the mail a couple of weeks ago), and is now shipping with >> the TurboLinux distribution. >> (see http://www.ibm.com ) >TurboLinux ships with DB2 configured in its upper end package, right? I see an >ad in the June LJ for a variety of TurboLinux setups, but they include Oracle 8 >when shipping with SQL...seems like an old ad. Yes, IBM has announced plans to ship an optimized version of DB2 for TurboLinux. I also believe that TurboLinux is planning on support for clustering as well. We tried to arrange a demo of it on the cluster at LinuxExpo, but there was not enough time to prepare.
Continued Success, Julie Briddon IBM Netfinity ServerProven tel: 919.543.7876 (t/l: 441) fax: 919.486.1430 Internet address: briddonj@us.ibm.com From huffine@wopr.nrl.navy.mil Mon, 14 Jun 1999 07:36:17 -0400 Date: Mon, 14 Jun 1999 07:36:17 -0400 From: Christopher Huffine huffine@wopr.nrl.navy.mil Subject: NFS Problems under Redhat 6.0 Anyone else having any NFS problems under RedHat 6.0? My cluster gets through NFS-Root mounting about half of the machines (7-8) -- each subsequent machine fails with a mount error -111. The Linux-Kernel list indicates that there have been major overhauls to the NFS code in the later 2.2.x kernels. I am wondering if it was a mistake to try to make 2.2.x work at this point in time. Thanks for any insights, Chris From asabigue@fing.edu.uy Mon, 14 Jun 1999 08:20:02 -0400 Date: Mon, 14 Jun 1999 08:20:02 -0400 From: Ariel Sabiguero Yawelak asabigue@fing.edu.uy Subject: atx mobo question I was offered some TECRAM motherboards and they have an option somewhere in the setup that allows them to auto-boot after a power failure. As far as I remember, the option was not 100% obvious from the setup menu, but it works. Take a look around in the setup options, and maybe you can ask asus for an upgrade of the BIOS that handles this. Regards. Ariel On Fri, 11 Jun 1999, texelsoft wrote: > I recently got a bunch of asus atx mobos. Is there a convenient way to > short/hack the atx switch leads so the machines will autoboot after a power > failure without requiring someone to wallop the switch? > > -evt > > Eric van Tassell > Texel Software, Inc. - System Software Engineering for Windows/NT > 277 Cochran Hill Rd.
Voice : 603-487-5006 > New Boston, NH 03070 USA Fax : 603-487-5166 > email : evt@texelsoft.com > http://www.texelsoft.com > > > > -----Original Message----- > > From: owner-beowulf@beowulf.gsfc.nasa.gov > > [mailto:owner-beowulf@beowulf.gsfc.nasa.gov]On Behalf Of > > Fred_deBros@etherdome.mgh.harvard.edu > > Sent: Tuesday, June 08, 1999 9:47 AM > > To: Kat kirk > > Cc: Dave Hart; beowulf@beowulf.gsfc.nasa.gov; extreme-linux@acl.lanl.gov > > Subject: Re: 2.2.9 > > > > > > > > > > > > > Anyone upgrade to 2.2.9 ? If so how goes? > > > > Well, I still try to: > > > > make make dep make bzImage, make modules make modules_install (I > > wanted the > > 4-cdrom changer to run) and I get on bootup: > > > > after sending BOOTP and RARP requests......unable to handle kernel paging > > request at virtual address 00d0030.....lots of computerese ....Aiee, > > killing interrupt handler. > > > > I dont think that is 2.2.9 per se. 2.2.5 runs fine. Am I doin sumpin > > wrong in make menuconfig? > > > > I am not knowledgeable enough to fix that. Somebody? > > > > fred > > > > > > > > ==================================== =============================== Ariel Sabiguero Yawelak Centro de Calculo - CeCal mail:asabigue@fing.edu.uy Facultad de Ingenieria University de