[Beowulf] High quality hardware

Daniel Fernandez daniel at labtie.mmt.upc.es
Tue May 25 12:40:25 PDT 2004


On Tue, 2004-05-25 at 18:14, Mark Hahn wrote:
> > continuous "lamd" daemon hangups. But the next day all nodes ran
> > almost fine with identical test suites. Moreover, now it's giving very
> 
> but do you have monitoring of, for instance, temperature?  lack of
> repeatability will make your life very difficult.

I'm running continuous tests and saving the results. Only one node is
out of range, with a CPU temperature of 60 ºC (failed power supply fan),
but it's *not* in the group of nodes that causes most of the problems.

> > 	-Continuous run 24h a day of common hardware not prepared to.
> 
> which generally equates to temperature.
> 
> > 	-Defective mainboard/memory 
> 
> but what you describe is a degradation, no?  that is, it used to work fine,
> but now, sometimes intermittently fails?
> 
> > 	-External interference/noise
> 
> are these systems based on bare boards?  or multiple boards per chassis?
> 
> > The last argument seemed to gain our attention, in that case what would
> > be the best case material for shielding ?
> 
> mu-metal, I suppose.  but that's rather extreme!  are you proposing some 
> kind of EMF interference *through* the case?  or some kind of exotic noise
> 
> > Also, what kind of mainboard manufacturers do you trust most ? I'm
> 
> I go with "A-list" vendors: recognizable vendors, preferably not entirely
> focused on either low-end or gamer markets.  asus, tyan, supermicro, msi,
> celestica, hp, ibm, apple, dell, etc.
> 
> > referring mainly to Socket A platform,
> 
> ah.  I wonder if that's your problem, then.  socket A has always had a rep
> for being somewhat fiddly to run stably, and to keep cool.  the latter is 
> presumably just because the chips dissipate a fair amount of power, and need
> rather good contact with a rather good heatsink.  unlike intel or recent AMD
> systems, which have builtin heatspreaders.  still, if you have properly
> mounted, fan-working, copper heatsinks with good through-case airflow,
> I'd think you could expect stable behavior.

Our CPUs (XP 2600+ Thoroughbred/Barton) run at 44-48 ºC, and our
ambient temperature is about 21-23 ºC. I don't consider that too high,
but maybe I am wrong and this temperature range doesn't guarantee 100%
failure-free operation 24 hours a day, 365 days a year.


> > we're currently using Asus and
> > MSI but also Tyan seems a good option.
> 
> those work for me.  oddly, I'm getting feedback from OEM channels that Tyan
> is having trouble stocking/delivering products.  that's kind of worrisome,
> since I tend to like their products...
> 
> regards, mark hahn.
> 
> 

Cheers.

-- 
Daniel Fernandez <daniel at labtie.mmt.upc.es>
Heat And Mass Transfer Center - CTTC
www.cttc.upc.edu
c/ Colom nº11
UPC Campus Industrials Terrassa , Edifici TR4



