thermal kill switch

Timothy H. Keitt tkeitt at mail.utexas.edu
Thu Oct 24 13:34:23 PDT 2002


Anyone have experience with the APC UPS + sensor card approach? I was
thinking of going that route.

T.

On Wed, 2002-10-23 at 02:03, Robert G. Brown wrote:
> On Tue, 22 Oct 2002, Andre Lehovich wrote:
> 
> > We had the air-conditioning fail yesterday.  Caught it in
> > time to shut down by hand, but we won't be so lucky next
> > time.  RGB's book recommends a thermal kill switch, but
> > doesn't give details on implementation.  One obvious idea is
> > to have a daemon monitor lm-sensors and shut down each node
> > as it gets too hot.  This is easy and cheap.
> > 
> > But, is there anything better?  We have not yet had the
> > electric and cooling contractors refit our server room.  Is
> > there anything we should have them install during the
> > rewiring?  What are the pros/cons of a room-wide kill switch
> > vs. the lm-sensors approach?
> 
> We have a room-wide kill switch set to be a "last resort".  They are
> remarkably difficult to find in e.g. a web search, but our architect and
> electrical contractors came up with one, so they must be in electrical
> component catalogs somewhere if you know where to look.
> 
> A second option is to get an electronically readable thermometer (with
> one or more sensors) for the ambient room air.  NetBotz (netbotz.com)
> sells moderately expensive (on the order of $1K) monitoring devices that
> sample room air temperature, humidity, and switch state (so you can get an
> alarm or take pictures when a door is opened or a motion detector detects
> motion) and have a built-in camera and both a web and an SNMP interface for
> remote monitoring.  Each device generates "alarm" mail if e.g. temperature
> or sound levels exceed a given threshold.  It is a straightforward matter to hook
> a script into one that either polls the device and sends nodes a
> poweroff command on an alarm or responds to alarm mail ditto.
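
The poll-and-poweroff script described above might look roughly like the following Python sketch. Everything named here is an assumption rather than anything from the thread: `read_temp_c` stands in for however you query your device (SNMP, HTTP, serial), the threshold and node names are placeholders, and powering off over `ssh` assumes root keys are already distributed. Requiring several consecutive hot readings is one simple way to avoid tripping on a single bad sample.

```python
import subprocess
import time
from typing import Callable, Iterable, List

THRESHOLD_C = 35.0   # hypothetical trip point; tune for your room
CONSECUTIVE = 3      # hot readings in a row needed before acting


def should_trip(readings: Iterable[float],
                threshold_c: float = THRESHOLD_C,
                consecutive: int = CONSECUTIVE) -> bool:
    """Return True once `consecutive` readings in a row exceed the threshold."""
    run = 0
    for temp in readings:
        run = run + 1 if temp > threshold_c else 0
        if run >= consecutive:
            return True
    return False


def poweroff_commands(nodes: Iterable[str]) -> List[List[str]]:
    """Build one ssh command per node (assumes passwordless root ssh)."""
    return [["ssh", f"root@{node}", "poweroff"] for node in nodes]


def watch(read_temp_c: Callable[[], float], nodes: List[str],
          interval_s: float = 30.0) -> None:
    """Poll the sensor until the trip condition fires, then power off nodes."""
    recent: List[float] = []
    while True:
        # Keep only the most recent CONSECUTIVE readings.
        recent = (recent + [read_temp_c()])[-CONSECUTIVE:]
        if should_trip(recent):
            for cmd in poweroff_commands(nodes):
                subprocess.run(cmd, check=False)  # keep going if a node is down
            return
        time.sleep(interval_s)
```

A call such as `watch(my_reader, ["n01", "n02"])` would block until the room overheats; `my_reader` is whatever function returns the current ambient temperature in Celsius for your hardware.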
> 
> If you are a DIY sort of person and don't want to pay for a netbot, you
> can build the functional equivalent of a netbot out of component parts
> and scripts.  A PC-TV card (bttv driver) and an X10 camera will let you
> watch real-time video of your cluster room in an xawtv window or serve
> you images updated every second or five on a web page -- I have the
> scripts and html for the latter already set up, as I have one at home.
> To do temperature, you can invest in an ibutton thermochron:
> 
> http://www.ibutton.com/ibuttons/thermochron.html
> 
> or (perhaps more reasonably) in a sensorsoft thermometer, readable from
> an RS232 interface for around $100.  Or build your own serial port
> readable thermometer for around $35 if you are a real DIY fanatic and
> have a 5V power supply handy.  Again, scripts to read and act are
> necessary; some are already posted on the web.  I imagine that one could
> set up sound alarms with an ordinary microphone and sound card although
> I've never tried it.  In our server room we'd be checking to make sure
> that the sound level stays HIGH, as the AC is in the room so ambient
> noise is like working right behind a jet engine during takeoff.  We'd
> want an alarm to be triggered if that lovely sound ever went OFF.
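
The "alarm when the noise stops" idea above can be sketched in Python as a check on the RMS level of a short audio capture. This is only an illustration under stated assumptions: the capture itself is left out (the bytes are assumed to be signed 16-bit PCM in native byte order from whatever recording tool you use), and `QUIET_FLOOR` is a made-up value that would need calibrating against the real room.

```python
import math
from array import array

QUIET_FLOOR = 500.0  # hypothetical RMS floor; calibrate with the AC running


def rms_level(pcm: bytes) -> float:
    """Root-mean-square amplitude of signed 16-bit native-endian PCM."""
    samples = array("h")
    # Drop a trailing odd byte so the buffer is a whole number of samples.
    samples.frombytes(pcm[: len(pcm) - len(pcm) % 2])
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def ac_seems_off(pcm: bytes, floor: float = QUIET_FLOOR) -> bool:
    """True when the room is quieter than the expected AC noise floor."""
    return rms_level(pcm) < floor
```

In practice one would probably require several quiet captures in a row before raising an alarm, since a single short sample can be misleading.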
> 
> lmsensors is the final option, but it has some flaws.  For one thing, it
> monitors temperatures inside individual systems, not ambient room
> temperatures.  Not all systems/chips are well supported.  The lmsensors
> kernel module was designed by individuals who have never heard of the
> term "API" (as in, you'll need custom code to glean results for EACH
> CHIP AND CONFIGURATION as they don't digest raw output at all -- you
> might as well plan to become expert in the particular chip(s) your
> systems have to monitor them).  Some silly motherboards (the pile of
> Tyan dual AMD's we own coming to mind) have insane BIOSn that require(d)
> one to hand-enable onboard sensors at the beginning of EACH BOOT in
> order to have them functioning and accessible to lmsensors.
> 
> In summary, lmsensors is great if it works for you, primarily for
> protecting individual systems, but not so great for protecting the
> entire room.
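
For the per-node case, a minimal Python sketch of a temperature check is given below. It assumes a modern Linux kernel that exposes sensors through the hwmon sysfs tree (`/sys/class/hwmon/hwmon*/temp*_input`, values in millidegrees Celsius), which is not the interface that existed when this thread was written; the 70 C limit is a placeholder, and whether any sensors appear at all still depends on the motherboard and driver support discussed above.

```python
from pathlib import Path
from typing import Optional

HWMON_ROOT = "/sys/class/hwmon"  # standard hwmon location on modern kernels


def hottest_temp_c(root: str = HWMON_ROOT) -> Optional[float]:
    """Return the highest temperature found in Celsius, or None if no sensor."""
    temps = []
    for path in Path(root).glob("hwmon*/temp*_input"):
        try:
            # Files hold an integer in millidegrees Celsius.
            temps.append(int(path.read_text().strip()) / 1000.0)
        except (OSError, ValueError):
            continue  # unreadable or malformed sensor file; skip it
    return max(temps) if temps else None


def node_too_hot(limit_c: float = 70.0, root: str = HWMON_ROOT) -> bool:
    """True when any sensor on this node reads above limit_c."""
    hottest = hottest_temp_c(root)
    return hottest is not None and hottest > limit_c
```

A small daemon or cron job could call `node_too_hot()` and run the local shutdown command when it returns True. Note that a node with no readable sensors reports "not too hot", so this protects only machines where the sensors actually work.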
> 
> This gives you a pretty wide range of ways to protect and monitor your
> cluster/server room, at a wide range of prices -- "free" (if it works)
> for lmsensors, a few $100 for DIY or over-the-counter thermal sensors
> and video, order of $1000 to get serious integrated monitors that are
> almost plug-n-play with a minimal amount of your time and effort
> (netbotz are network appliances so they literally plug in, snap onto
> your network, get IP from DHCP and can be configured and monitored from
> a serial interface or over the network -- a bit windows-centric in
> supplied configuration tools as usual, but one CAN get by with minicom).
> 
> HTH
> 
>    rgb
> 
> -- 
> Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
> Duke University Dept. of Physics, Box 90305
> Durham, N.C. 27708-0305
> Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu
> 
> 
> 
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
-- 
Timothy H. Keitt
The University of Texas at Austin
Section of Integrative Biology
1 University Station C0930
Austin, Texas 78712-0253 USA