Image Processing on a BeoWulf
Greg Lindahl glindahl at hpti.com
Sun Aug 20 17:48:52 PDT 2000
- Previous message: Image Processing on a BeoWulf
- Next message: NFS 100:1 performance loss
> Our desire is to allow many jobs to be initiated ad-hoc by the
> production operators or by web based clients and take advantage of
> the parallelism and scaling offered by the cluster.

At this point you need a scheduler. You said you were using PVM. PVM isn't that smart at handing out nodes when you ask for a lot more than you have.

In addition, it depends on the details when you have several jobs running on a node and talking. Sometimes (I know this is true for mpich) a job doing the wrong thing will basically spin in a busy-wait loop if the people it's trying to talk to happen to not be running. This results in such awful performance that you'd think that the job was stuck. I don't know about PVM; I'd think that the "non direct route" option should be OK, but I've never tried it.

> We currently have a work
> around in place - we implemented a queue that lines up the submitted
> jobs sequentially. This certainly is not optimal as small jobs have
> to wait behind large ones.

That's a scheduler. Another scheduler you could use would be a queue system like PBS. However, what you really want is a queue system which provides "gang scheduling". With gang scheduling, only one program at a time is awake on a node, so you don't have any spin-wait problems. The RWCP guys have this for their SCORE operating system. But SCORE is pretty big, and I don't think it's easy to extract just that one feature. Now the T3E has a lovely gang scheduler...

An alternate which might do better for you would be to have all your jobs use the same set of worker processes, one per cpu. Then multiple jobs would just send extra work and have to wait until the workers got around to handling them. Since PVM has full dynamic process creation, this is fairly easy to write -- N slaves, create a new master to hand out work each time you have a new job to do.

-- greg
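The shared-worker layout described in the reply can be sketched roughly as follows. This is only an illustration under stated assumptions, not PVM code: Python threads and in-process queues stand in for PVM tasks and messages, and every name here (`start_workers`, `run_job`, `stop_workers`) is hypothetical. A fixed set of workers is started once, and each new job gets its own short-lived "master" that hands out work units and collects the replies.

```python
import queue
import threading


def worker(tasks):
    """Long-lived worker: serve work units from any job until told to stop."""
    while True:
        item = tasks.get()
        if item is None:  # sentinel: shut this worker down
            break
        func, index, arg, reply = item
        try:
            reply.put((index, True, func(arg)))
        except Exception as exc:  # report the failure instead of dying
            reply.put((index, False, exc))


def start_workers(n):
    """Start n workers (one per CPU in the original description)."""
    tasks = queue.Queue()  # shared FIFO of work units from all jobs
    threads = [threading.Thread(target=worker, args=(tasks,)) for _ in range(n)]
    for t in threads:
        t.start()
    return tasks, threads


def run_job(tasks, func, args):
    """Per-job master: hand out this job's work and gather results in order."""
    args = list(args)
    reply = queue.Queue()  # private reply channel for this job only
    for index, arg in enumerate(args):
        tasks.put((func, index, arg, reply))
    results = [None] * len(args)
    for _ in args:
        index, ok, value = reply.get()
        if not ok:
            raise value
        results[index] = value
    return results


def stop_workers(tasks, threads):
    """Send one sentinel per worker and wait for all of them to exit."""
    for _ in threads:
        tasks.put(None)
    for t in threads:
        t.join()


if __name__ == "__main__":
    tasks, threads = start_workers(4)
    # Two jobs of different sizes share the same four workers.
    print(run_job(tasks, lambda x: x * x, range(10)))
    print(run_job(tasks, lambda x: x + 1, [1, 2, 3]))
    stop_workers(tasks, threads)
```

Because every job feeds the same queue, a node never runs more workers than it was started with, and extra jobs simply wait for the workers to reach their units. The shared queue in this sketch is strictly first-in first-out; any fairness between small and large jobs would have to be added on top.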
More information about the Beowulf mailing list
