How do you keep clusters running....
Steve Gaudet SGaudet at turbotekcomputer.com
Wed Apr 3 13:26:28 PST 2002
- Previous message: What could be the performance of my cluster
- Next message: GbE Channel Bonding
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Hello Chris,

> What are folks doing about keeping hardware running on large clusters?
>
> Right now, I'm running 10 Racksaver RS-1200's (for a total of
> 20 nodes)...
>
> Sure seems like every week or two, I notice dead fans (each RS-1200
> has 6 case fans in addition to the 2 CPU fans and 2 power
> supply fans).
>
> My last fan failure was a CPU fan that toasted the CPU and
> motherboard.
>
> How are folks with significantly more nodes than mine dealing
> with constant maintenance on their nodes? Do you have whole spare
> nodes sitting around, ready to be installed if something fails, or
> do you have a pile of spare parts? Did you get the vendor (if you
> purchased prebuilt systems) to supply a stockpile of warranty parts?
>
> One of the problems I'm facing is that every time something croaks,
> Racksaver is very good about replacing it under warranty, but getting
> the new parts delivered usually takes several days.
>
> For some things like fans, they sent extras for me to keep on-hand.
>
> For my last fan/CPU/motherboard failure, the node pair will be
> down ~5 days waiting for parts.
>
> Comments? Thoughts? Ideas?

------------------------------------------

The vendor of choice should be using quality parts. We don't see these issues here.

Steve Gaudet
Linux Solutions Engineer
..... <(©¿©)>

===================================================================
| Turbotek Computer Corp.  tel:603-666-3062 ext. 21              |
| 8025 South Willow St.    fax:603-666-4519                      |
| Building 2, Unit 105     toll free:800-573-5393                |
| Manchester, NH 03103     e-mail:sgaudet at turbotekcomputer.com|
| web: http://www.turbotekcomputer.com                           |
===================================================================
More information about the Beowulf mailing list
