data transfer and condor
Thomas R Boehme mail at thomas-boehme.de
Fri Aug 3 08:51:42 PDT 2001
- Previous message: data transfer and condor
- Next message: Math help: Calculate pi using Gregory's Series on Beowulf?
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Hi,

To suggest a reasonable solution, you need to give a little more detail, e.g.:

- How many nodes do you have?
- How large are the files read by each job?
- How long does each job take?
- Are the jobs mainly IO limited, or do they also require high computational effort?
- What network interconnects are you using on the nodes / NFS server?
- Do you have a budget for hardware improvements?

In general, I don't think scheduling the data transfer really helps, as that would basically mean all the jobs wait for other jobs to finish their IO. It doesn't give you more throughput. The solution is probably to provide enough bandwidth to cope with the traffic in a reasonable fashion.

I would suggest looking into PVFS and distributing the data across the nodes. The other solution would be to upgrade the file server to provide as much throughput as possible (as I don't know what you have now, I can't really suggest anything specific).

Do you use USE_NFS = True for condor? I would test both true and false to see which gives you the better throughput. I think the condor internal transfer might be faster, but I can't tell, as we only have 100 MBit networks and both NFS and condor achieve almost the maximum possible throughput.

Bye,
Thomas

> -----Original Message-----
> From: Steven Berukoff [mailto:steveb at aei-potsdam.mpg.de]
> Sent: Friday, August 03, 2001 8:36 AM
> To: beowulf at beowulf.org
> Subject: data transfer and condor
>
> Hi all,
>
> We're looking at using Condor on our cluster for its checkpointing and
> job-handling abilities, as the routines we're running don't require much
> in the way of internode communication. We have an NFS file server which
> contains our entire fileset (something on the order of 100s of GB), a
> master node for the cluster, and several nodes. Outside of Condor, our
> algorithm requires that each of the nodes get some subset of the data (on
> the order of perhaps 100MB) and runs the analysis code on this data
> segment. Obviously, each node must gather its share of the data from the
> NFS file server; this of course requires a large amount of network
> traffic.
>
> Does anyone have a clever idea about scheduling the data
> transfers so that it is accomplished in a reasonable fashion? We were
> hoping Condor provides this functionality to some degree, but it doesn't
> seem to.
>
> Thanks
> Steve
>
> =====
> Steve Berukoff tel: 49-331-5677233
> Albert-Einstein-Institute fax: 49-331-5677298
> Am Muehlenberg 1, D14477 Golm, Germany email:steveb at aei.mpg.de
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
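
The reply's point that scheduling the transfers does not add throughput can be illustrated with a rough back-of-the-envelope sketch. It assumes the file server's network link is the only bottleneck, that concurrent transfers share it evenly, and that protocol overhead is negligible. The node count of 16 is purely hypothetical (the thread only says "several nodes"); the 100 MB per node and 100 MBit link figures come from the messages above. The function name `transfer_finish_times` is made up for this illustration.

```python
def transfer_finish_times(num_nodes, mb_per_node, server_mbit_per_s):
    """Estimate when each node finishes fetching its data, in seconds.

    Returns two lists: finish times if transfers are scheduled one after
    another, and finish times if all nodes fetch at the same time and
    share the server link evenly. Idealized: no protocol overhead.
    """
    # Time for one node to fetch its data with the whole link to itself.
    per_node_s = mb_per_node * 8 / server_mbit_per_s

    # One at a time: node i finishes after (i + 1) full-speed transfers.
    serialized = [per_node_s * (i + 1) for i in range(num_nodes)]

    # All at once: each node gets 1/num_nodes of the link, so all finish together.
    concurrent = [per_node_s * num_nodes] * num_nodes

    return serialized, concurrent


# Hypothetical cluster: 16 nodes, 100 MB each, 100 MBit/s server link.
serialized, concurrent = transfer_finish_times(16, 100, 100)
print("last node done (serialized):", serialized[-1], "s")
print("last node done (concurrent):", concurrent[-1], "s")
```

Under these assumptions the last node finishes at the same time either way, because the total amount of data crossing the server link is unchanged. Staggering the transfers only changes which nodes can start early; getting all nodes done sooner requires more bandwidth, for example by spreading the data across nodes as suggested above.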
