How do you keep clusters running....
Jim Fraser fraser5 at cox.net
Wed Apr 3 13:37:56 PST 2002
Sounds to me like you have a heat problem. Dual ultra-thins generally run pretty hot. Good luck with it.

There is just no room for any serious air to move through that case. The fan diameter is so small that they require ridiculous RPMs to move the needed volume, making them noisy and prone to fail. Add to that the high heat and you shorten the MTBF to tomorrow. Most fans fail quickly in high-heat conditions.

I think the basic rack design concept, while rugged and strong, is fundamentally flawed and overpriced. I would invest in a serious rack fan that moves major air out of that case somehow.

Good luck with it.

jim

-----Original Message-----
From: beowulf-admin at beowulf.org [mailto:beowulf-admin at beowulf.org] On Behalf Of Cris Rhea
Sent: Wednesday, April 03, 2002 4:04 PM
To: beowulf at beowulf.org
Subject: How do you keep clusters running....

What are folks doing about keeping hardware running on large clusters?

Right now, I'm running 10 Racksaver RS-1200's (for a total of 20 nodes)...

Sure seems like every week or two, I notice dead fans (each RS-1200 has 6 case fans in addition to the 2 CPU fans and 2 power supply fans).

My last fan failure was a CPU fan that toasted the CPU and motherboard.

How are folks with significantly more nodes than mine dealing with constant maintenance on their nodes? Do you have whole spare nodes sitting around- ready to be installed if something fails, or do you have a pile of spare parts? Did you get the vendor (if you purchased prebuilt systems) to supply a stockpile of warranty parts?

One of the problems I'm facing is that every time something croaks, Racksaver is very good about replacing it under warranty, but getting the new parts delivered usually takes several days.

For some things like fans, they sent extras for me to keep on-hand.

For my last fan/CPU/motherboard failure, the node pair will be down ~5 days waiting for parts.

Comments? Thoughts? Ideas?

Thanks-

--- Cris

----
Cristopher J. Rhea
Mayo Foundation
Research Computing Facility
Pavilion 2-25
crhea at Mayo.EDU
Rochester, MN 55905
Fax: (507) 266-4486
(507) 284-0587

_______________________________________________
Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
