[Beowulf] scheduler recommendations for a HPC cluster
Rahul Nabar rpnabar at gmail.com
Tue Oct 6 12:22:14 PDT 2009
Any strong / weak recommendations for / against schedulers? For a long time we have worked happily with a Torque + Maui system. It isn't perfect, but it works (and is free!). But a chance rarely presents itself to go for something "newer and better" on an in-production system, since people hate changes and outages. This time, as we shop for a new cluster, I have the opportunity to change if something better exists.

Any comments? What are other users using out there? Any horror stories? Or any super good finds?

I shy away from LSF etc. since those cost a lot of money, especially as they and similar systems are mostly licensed per server per year, so the costs do add up. I have been a user on an LSF system for a long time and I think it is an awesome scheduler, but I have never been at the admin end of LSF.

One area where the Torque+Maui option is not the best is that it is not monolithic. Oftentimes it is hard to know which component to blame for a problem or, more relevantly, which config file to use to fix it: Torque's or Maui's. On the other hand, we can't get rid of Maui, since Fairshare policies etc. are important to us and those seem to be in the Maui domain. (All our jobs are MPI jobs, in case that is relevant. We haven't been doing checkpointing yet.)

Of course, there is MOAB these days, but I am not sure if that is worth the money since I have not used it.

I appreciate any comments or words of wisdom you guys might have!

--
Rahul
