[Beowulf] Strange SGE scheduling problem
Schoenefeld, Keith schoenk at utulsa.edu
Tue Jul 22 14:54:53 PDT 2008
- Previous message: [Beowulf] Re: Religious wars
- Next message: [Beowulf] Strange SGE scheduling problem
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
My cluster has 8 slots (cores) per node, in the form of two quad-core processors. Only recently have we started running jobs on it that require 12 slots. We noticed significant speed problems when running multiple 12-slot jobs, and quickly discovered that a node running 4 slots of one job and 4 slots of another was running both jobs on the same processor cores (i.e., both job1 and job2 were running on CPUs #0-#3, while CPUs #4-#7 were left idle). The result is that the jobs were competing for time on half of the available cores.

In addition, a 4-slot job started well after a 12-slot job has ramped up results in the same problem (both the 12-slot job and the 4-slot job get assigned to the same slots on a given node).

Any insight as to what is occurring here and how I could prevent it from happening? We are using SGE + mvapich 1.0 and a PE that has the $fill_up allocation rule.

I have also posted this question to the hpc_training-l at georgetown.edu mailing list, so my apologies to people who get this email multiple times.

Any help is appreciated.

-- KS
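One way to confirm the overlap described above is to compare the CPU affinity masks of each job's processes on a node. The following is only a minimal sketch, assuming Linux nodes with Python 3 available. The function names `find_core_overlaps` and `affinity_of` and the job labels are hypothetical and not part of SGE or mvapich, and collecting the PIDs that belong to each job is left to the reader.

```python
import os
from itertools import combinations


def find_core_overlaps(affinities):
    """Return {(job_a, job_b): shared_cores} for each pair of jobs sharing cores.

    `affinities` maps a job label to the set of core IDs that the job's
    processes are allowed to run on.
    """
    overlaps = {}
    # Sort the labels so the pair ordering in the result is deterministic.
    for job_a, job_b in combinations(sorted(affinities), 2):
        shared = set(affinities[job_a]) & set(affinities[job_b])
        if shared:
            overlaps[(job_a, job_b)] = shared
    return overlaps


def affinity_of(pids):
    """Union of the cores the given PIDs may run on (Linux only)."""
    cores = set()
    for pid in pids:
        # os.sched_getaffinity returns the set of CPUs the process is bound to.
        cores |= os.sched_getaffinity(pid)
    return cores


if __name__ == "__main__":
    # Hypothetical layout matching the report: both jobs pinned to cores 0-3.
    example = {"job1": {0, 1, 2, 3}, "job2": {0, 1, 2, 3}}
    print(find_core_overlaps(example))
```

On a live node, the sets passed to `find_core_overlaps` would come from calling `affinity_of` with each job's PIDs. An empty result would mean the jobs are not restricted to the same cores, in which case the slowdown would have to come from somewhere else.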
