[Beowulf] What services do you run on your cluster nodes?
Perry E. Metzger perry at piermont.com
Wed Sep 24 07:00:55 PDT 2008
Patrick Geoffray <patrick at myri.com> writes:
> Perry E. Metzger wrote:
>>> You realize that most big HPC systems are using interconnects that
>>> don't generate many or any interrupts, right?
>>
>> Of course. Usually one even uses interrupt pacing/mitigation even in
>> gig ethernet on a modern machine -- otherwise you're not going to get
>> reasonable performance. (For 10Gig, you have to do even uglier
>> tricks.)
>
> What Greg is trying to say is that high-speed interconnects used in
> HPC do not raise interrupts at all. Data is delivered directly in
> user-space, and the app (or the communication library) busy polls on

See the message I sent to Larry Stewart a few minutes ago -- no need
for me to repeat myself...

> However, it is only important for large machines with tightly coupled
> codes. For the majority of the cases, it's just being anal.

Even in large machines with very tight coupling, unless you've done
very special things to the kernel, you have no random incoming
interrupts (many devices on modern hardware will demand attention at
intervals a lot more frequently than every few hours even if you
aren't touching them), you've turned off SMM, you're doing no disk
i/o, etc., you have to be a *little* tolerant of timing not being what
you want, because things will get in the way. Not too often, but a lot
more often than every few hours, so if a problem every few hours on
one node in the cluster is an issue, you're going to have trouble on
stock PC hardware.

A Postfix daemon going off at 2am to send out a grep of the logs is
down in the noise compared to that sort of thing. Not that I think
this is the right way to manage a machine -- you want machines sending
each other machine generated and parsed status information -- but I'm
just pointing out an extra daemon doing nothing isn't your biggest
worry.

Perry
--
Perry E. Metzger		perry at piermont.com
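One rough way to see the kind of timing disturbance described above is to spin on a monotonic clock and look at how large the gaps between consecutive reads get. The following is only a sketch under stated assumptions: the function name `measure_jitter` and the sample count are made up for illustration, and the Python interpreter adds noise of its own, so the numbers are only indicative of interruptions on an otherwise idle node, not a precise measurement.

```python
import time


def measure_jitter(samples=200_000):
    """Spin reading the monotonic clock and summarize the gaps between reads.

    A large gap means the loop was delayed by something (an interrupt,
    the scheduler, SMM, or the interpreter itself).
    """
    gaps = []
    prev = time.perf_counter_ns()
    for _ in range(samples):
        now = time.perf_counter_ns()
        gaps.append(now - prev)
        prev = now
    gaps.sort()
    return {
        "median_ns": gaps[len(gaps) // 2],
        "p999_ns": gaps[int(len(gaps) * 0.999)],
        "max_ns": gaps[-1],
    }


if __name__ == "__main__":
    # Hypothetical usage: run once on a quiet node, once with extra daemons on.
    print(measure_jitter())
```

Comparing the median against the maximum gives a crude idea of how far the worst interruptions stand out from the normal loop time, and whether an idle daemon changes that picture at all.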
