[Beowulf] Power Cycling Question
sassy-work at sassy.formativ.net
Fri Jul 16 23:43:27 UTC 2021
Interesting topic, and quite apt when I look at the flooding in Germany,
Belgium and the Netherlands.
I guess there are a number of reasons why people are not doing it. Leaving
aside the usual "we have never done that" one, I guess the main problem is:
when do you want to turn a node off? After 5 minutes of being idle? Maybe 10
minutes? One hour? How often do you then need to boot the nodes up again, and
how much energy does that cost? From chatting to a few people who tried it in
the past, it transpired that you do not save as much energy as you would hope.
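To make the trade-off concrete, here is a rough back-of-the-envelope sketch in Python. It assumes a node draws a constant power when idle, a smaller constant power when switched off (e.g. for the BMC), and that one shutdown-plus-boot cycle costs a fixed amount of extra energy. The function names and all the numbers are made up for illustration; they are not measurements from any real cluster.

```python
def break_even_idle_seconds(idle_watts, off_watts, cycle_energy_joules):
    """Idle duration beyond which powering a node off saves energy.

    idle_watts: power drawn while the node sits idle (assumed constant)
    off_watts: residual power drawn while the node is off (assumed constant)
    cycle_energy_joules: extra energy spent on one shutdown + boot cycle
    """
    saved_per_second = idle_watts - off_watts
    if saved_per_second <= 0:
        raise ValueError("an off node must draw less power than an idle one")
    return cycle_energy_joules / saved_per_second


def energy_saved_joules(idle_seconds, idle_watts, off_watts, cycle_energy_joules):
    """Net energy saved by powering off for one idle period (negative = loss)."""
    return (idle_watts - off_watts) * idle_seconds - cycle_energy_joules


if __name__ == "__main__":
    # Hypothetical figures: 100 W idle, 5 W off, and a shutdown + boot cycle
    # costing 60 kJ (roughly 5 minutes at 200 W).
    t = break_even_idle_seconds(100.0, 5.0, 60_000.0)
    print(f"break-even idle time: {t / 60:.1f} minutes")
    print(f"saved over a 1 h idle period: "
          f"{energy_saved_joules(3600, 100.0, 5.0, 60_000.0) / 3.6e6:.3f} kWh")
```

With these made-up figures, any idle period shorter than about ten minutes costs more energy than it saves, which may be part of why the savings people saw were smaller than expected.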
However, one thing came to my mind: is it possible to simply suspend the node
to disk and let it sleep? That way, you wake the node up quicker and it
probably needs less power while it is suspended. Think of laptops.
The other way around would simply be: we know that in, say, the summer there
is less demand, so we simply turn X nodes off and might do some maintenance
on them. So you are running the whole cluster for, say, 6 weeks with limited
capacity. That might mean a few jobs are queuing, but it also gives us a
window to do things. Once people come back, the maintenance is done and
the cluster can run at full capacity again.
Just some (crazy?) ideas.
All the best
On Friday, 16 July 2021, 20:35:11 BST, Douglas Eadline wrote:
> Hi everyone:
> Reducing power use has become an important topic. One
> of the questions I always wondered about is
> why more clusters do not turn off unused nodes. Slurm
> has hooks to turn nodes off when not in use and
> turn them on when resources are needed.
> My understanding is that power cycling creates
> temperature cycling, that then leads to premature node
> failure. That makes sense, but has anyone ever studied/tested
> this?
> The only other reasons I can think of are that the delay
> in server boot time makes job starts slow, or that there
> are power surge issues.
> I'm curious about other ideas or experiences.