thermal kill switch

Timothy H. Keitt tkeitt at mail.utexas.edu
Thu Oct 24 13:34:23 PDT 2002


Anyone have experience with the APC UPS + sensor card approach? I was
thinking of going that route.

T.

On Wed, 2002-10-23 at 02:03, Robert G. Brown wrote:
> On Tue, 22 Oct 2002, Andre Lehovich wrote:
> 
> > We had the air-conditioning fail yesterday.  Caught it in
> > time to shut down by hand, but we won't be so lucky next
> > time.  RGB's book recommends a thermal kill switch, but
> > doesn't give details on implementation.  One obvious idea is
> > to have a daemon monitor lm-sensors and shut down each node
> > as it gets too hot.  This is easy and cheap.
> > 
> > But, is there anything better?  We have not yet had the
> > electric and cooling contractors refit our server room.  Is
> > there anything we should have them install during the
> > rewiring?  What are the pros/cons of a room-wide kill switch
> > vs. the lm-sensors approach?
> 
> We have a room-wide kill switch set to be a "last resort".  They are
> remarkably difficult to find in e.g. a web search, but our architect and
> electrical contractors came up with one, so they must be in electrical
> component catalogs somewhere if you know where to look.
> 
> A second option is to get an electronically readable thermometer (with
> one or more sensors) for the ambient room air.  netbotz (netbotz.com)
> sells moderately expensive (on the order of $1K) monitoring devices that
> sample room air temperature, humidity, and switch state (so you can get
> an alarm or take pictures when a door is opened or a motion detector
> detects motion), and that have a built-in camera and both a web and an
> SNMP interface for remote monitoring.  They generate "alarm" mail if
> e.g. temperature or sound levels exceed a given threshold.  It is a
> straightforward matter to hook a script into one, either by polling the
> device and sending the nodes a poweroff command on an alarm, or by
> reacting to the alarm mail in the same way.
> 
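For the polling variant described above, a minimal sketch in Python might
look like the following.  Everything specific in it is an assumption made
for illustration: the read_temp_c callable stands in for however you
actually query your sensor (SNMP, serial line, scraping the web page), the
node names and temperature limit are invented, and "ssh root@node poweroff"
assumes passwordless root ssh keys are already set up on the nodes.

```python
import subprocess
from typing import Callable, List, Sequence


def poweroff_command(node: str) -> List[str]:
    """Build the remote shutdown command for one node (assumes root ssh keys)."""
    return ["ssh", "-o", "ConnectTimeout=5", f"root@{node}", "poweroff"]


def check_once(
    read_temp_c: Callable[[], float],
    limit_c: float,
    nodes: Sequence[str],
    run: Callable[[List[str]], int] = subprocess.call,
) -> List[str]:
    """Poll the room sensor once; send a poweroff to every node if it is too hot.

    Returns the nodes a poweroff command was issued to (empty if all is well).
    """
    temp_c = read_temp_c()
    if temp_c < limit_c:
        return []
    sent = []
    for node in nodes:
        # Keep going if one command cannot be started; the rest still
        # need to go down.
        try:
            run(poweroff_command(node))
        except OSError:
            continue
        sent.append(node)
    return sent
```

Run from cron once a minute (or from a loop with a sleep), a call such as
check_once(my_reader, 35.0, ["node01", "node02"]) would do the job, where
my_reader is a hypothetical function returning the room temperature in
degrees C.  Passing run as a parameter makes it easy to dry-run the logic
without actually powering anything off.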
> If you are a DIY sort of person and don't want to pay for a netbot, you
> can build the functional equivalent of a netbot out of component parts
> and scripts.  A PC-TV card (bttv driver) and an X10 camera will let you
> watch real-time video of your cluster room in an xawtv window or serve
> you images updated every second or five on a web page -- I have the
> scripts and html for the latter already set up, as I have one at home.
> To do temperature, you can invest in an ibutton thermochron:
> 
> http://www.ibutton.com/ibuttons/thermochron.html
> 
> or (perhaps more reasonably) in a sensorsoft thermometer, readable from
> an RS232 interface for around $100.  Or build your own serial port
> readable thermometer for around $35 if you are a real DIY fanatic and
> have a 5V power supply handy.  Again, scripts to read and act are
> necessary; some are already posted on the web.  I imagine that one could
> set up sound alarms with an ordinary microphone and sound card although
> I've never tried it.  In our server room we'd be checking to make sure
> that the sound level stays HIGH, as the AC is in the room so ambient
> noise is like working right behind a jet engine during takeoff.  We'd
> want an alarm to be triggered if that lovely sound ever went OFF.
> 
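The "alarm when the AC noise goes OFF" idea can be sketched roughly as
below.  This is only an illustration under stated assumptions: the audio is
taken to be raw little-endian signed 16-bit mono PCM captured from the
sound card by whatever tool you prefer, and the floor_rms threshold is a
made-up number you would have to calibrate against your own room.

```python
import math
import struct
from typing import Sequence, Tuple


def decode_pcm16(raw: bytes) -> Tuple[int, ...]:
    """Unpack little-endian signed 16-bit mono PCM bytes into integers."""
    count = len(raw) // 2
    return struct.unpack("<%dh" % count, raw[: count * 2])


def rms(samples: Sequence[int]) -> float:
    """Root-mean-square level of the samples (0.0 for an empty capture)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def ac_sounds_off(raw: bytes, floor_rms: float) -> bool:
    """True if the room is quieter than the calibrated floor, i.e. the
    air conditioner has probably stopped."""
    return rms(decode_pcm16(raw)) < floor_rms
```

Note that the test is inverted relative to the usual noise alarm: silence
is the failure condition, so an empty or failed capture also counts as
"off" here.  Whether that is the right default depends on how much you
trust the microphone.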
> lm-sensors is the final option, but it has some flaws.  For one thing, it
> monitors temperatures inside individual systems, not ambient room
> temperatures.  Not all systems/chips are well supported.  The lm-sensors
> kernel module was designed by individuals who have never heard of the
> term "API" (as in, you'll need custom code to glean results for EACH
> CHIP AND CONFIGURATION, as they don't digest raw output at all -- you
> might as well plan to become expert in the particular chip(s) your
> systems have in order to monitor them).  Some silly motherboards (the
> pile of Tyan dual AMDs we own comes to mind) have insane BIOSes that
> require(d) one to hand-enable onboard sensors at the beginning of EACH
> BOOT in order to have them functioning and accessible to lm-sensors.
> 
> In summary, lm-sensors is great if it works for you, primarily for
> protecting individual systems, but not so great for protecting the
> entire room.
> 
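For the per-node approach, a hedged sketch of the monitoring half of such a
daemon follows.  It assumes a current Linux kernel that exposes sensor
readings as millidegrees C in /sys/class/hwmon/hwmon*/temp*_input, which is
not the interface being complained about above, and it will only see
sensors your kernel actually has drivers for.  The OverheatGuard class, the
70 C limit and the three-strikes rule are all invented for illustration.

```python
import glob
from typing import Iterable, List


def read_hwmon_temps_c(
    pattern: str = "/sys/class/hwmon/hwmon*/temp*_input",
) -> List[float]:
    """Read every temperature the kernel exposes, converted to degrees C."""
    temps = []
    for path in glob.glob(pattern):
        try:
            with open(path) as f:
                temps.append(int(f.read().strip()) / 1000.0)
        except (OSError, ValueError):
            # Some sensors refuse reads or return junk; skip them.
            continue
    return temps


class OverheatGuard:
    """Trip only after several consecutive hot readings, so that a single
    glitchy sample does not take the node down."""

    def __init__(self, limit_c: float, strikes_needed: int = 3) -> None:
        self.limit_c = limit_c
        self.strikes_needed = strikes_needed
        self.strikes = 0

    def update(self, temps_c: Iterable[float]) -> bool:
        """Feed one round of readings; return True when it is time to shut down."""
        hottest = max(temps_c, default=None)
        if hottest is not None and hottest >= self.limit_c:
            self.strikes += 1
        else:
            self.strikes = 0
        return self.strikes >= self.strikes_needed
```

A daemon would call guard.update(read_hwmon_temps_c()) every few seconds
and invoke the local shutdown command when it returns True.  A round with
no readings at all resets the counter here; you may prefer to treat a
vanished sensor as an alarm in its own right.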
> This gives you a pretty wide range of ways to protect and monitor your
> cluster/server room, at a wide range of prices -- "free" (if it works)
> for lm-sensors, a few hundred dollars for DIY or over-the-counter thermal
> sensors and video, and on the order of $1000 for serious integrated
> monitors that are almost plug-n-play with a minimal amount of your time
> and effort (netbotz are network appliances, so they literally plug in,
> snap onto your network, get an IP address from DHCP, and can be
> configured and monitored from a serial interface or over the network --
> a bit Windows-centric in the supplied configuration tools, as usual, but
> one CAN get by with minicom).
> 
> HTH
> 
>    rgb
> 
> -- 
> Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
> Duke University Dept. of Physics, Box 90305
> Durham, N.C. 27708-0305
> Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu
-- 
Timothy H. Keitt
The University of Texas at Austin
Section of Integrative Biology
1 University Station C0930
Austin, Texas 78712-0253 USA


