Question: Task Farm and Private Networks.
Robert G. Brown rgb at phy.duke.edu
Thu May 31 07:51:12 PDT 2001
On Wed, 30 May 2001, Hoeffel, Thomas wrote:

> Hi,
>
> I currently have a small cluster in which the slave nodes are on a private
> network. It is used primarily as a task farm and not as a true parallel
> machine. Only the master node sees our other systems (which are on their
> own switch). This causes problems with certain remote job submissions via
> some commercial packages since they write both local temp files and scratch
> temp files.
>
> Question: What is the drawback to giving each slave its own true IP address
> and allowing them to NFS mount the same file systems as the master node?

In a real "compute farm" (where the tasks are embarrassingly parallel and don't communicate) none that I can think of. Indeed, it is the only sane way to go.

There are many kinds of clusters, only a few of which are true "beowulfs" in the narrow sense of the definition of the architecture. For the task mix you describe (lots of embarrassingly parallel work run as separate jobs on the various "nodes") there is very little benefit to using a true beowulf architecture, and plenty of additional cost in the form of scripting workarounds for the problems that arise from the lack of a shared filesystem and so forth. Yes, recent list discussion has shown that you "can" use a Scyld beowulf as a compute farm; it has also shown that it is a bit clumsy and difficult to do so, so why bother?

It should be very easy to flatten your network -- either connect the inner switch to the outer switch (rationalizing e.g. the IP space and routing and all that) or arrange for the master node to act as a router and pass the NFS mounts through it. In most cases I think the former makes more sense; in a few (mostly when the master is idle enough that the overhead of its acting as a router isn't "expensive" in terms of time to complete work) the latter might.
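As a rough illustration of what "rationalizing the IP space" involves when the two switches are joined, here is a minimal Python sketch that checks a planned address assignment: every node should fall inside the shared subnet and no address should be used twice. The function name, host names, and addresses are all hypothetical (the addresses come from the RFC 5737 documentation range); nothing here is taken from the original poster's setup.

```python
import ipaddress

def check_flat_network(subnet, master, nodes):
    """Return a list of problems found when placing nodes on the shared subnet.

    subnet: the outer network in CIDR form (hypothetical value below)
    master: address already held by the master node
    nodes:  mapping of node name -> planned address
    """
    net = ipaddress.ip_network(subnet)
    problems = []
    # Track which name already owns each address so duplicates can be reported.
    seen = {ipaddress.ip_address(master): "master"}
    for name, addr in nodes.items():
        ip = ipaddress.ip_address(addr)
        if ip not in net:
            problems.append(f"{name}: {ip} is outside {net}")
        if ip in seen:
            problems.append(f"{name}: {ip} duplicates {seen[ip]}")
        else:
            seen[ip] = name
    return problems

# Hypothetical plan: node3 still carries its old private address.
planned = {"node1": "192.0.2.11", "node2": "192.0.2.12", "node3": "10.0.0.3"}
for problem in check_flat_network("192.0.2.0/24", "192.0.2.10", planned):
    print(problem)
```

With the sample plan above, the only problem reported is node3, which is still on a private address outside the shared subnet.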
Pop a more or less standard Linux on each node (remembering that the nodes are now openly accessible and hence should probably be configured with only sshd open as a means of access, to minimize security hassles). You can strip the node configuration a bit -- if they are headless they probably don't need X servers, for example, and can likely live without games, KDE and/or Gnome desktops and tools, mail, news, web browsers, and the like. If they have big disks, though, there isn't much point in stripping the configuration a lot -- heterogeneity in a network costs more money in time than extra space costs in disk.

Users can then log in to each node and run jobs, or a remote job submission package can do it for them, or you can install MOSIX on the nodes, have users log in to a single node to run jobs, and let MOSIX migrate the jobs around to balance load. You may still want a tool like procstatd to monitor load on the cluster, especially if users are logging into nodes to run their jobs -- it can easily reveal which nodes are idle and ready for more work.

   rgb

-- 
Robert G. Brown                        http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu
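To make the "which nodes are idle" idea above concrete, here is a minimal Python sketch. It assumes the 1-minute load average of each node has already been collected by some monitoring tool; it does not use or imitate procstatd's actual interface, and the function name, node names, threshold, and load values are all hypothetical.

```python
def idle_nodes(loads, threshold=0.5):
    """Return names of nodes whose load is below threshold, least loaded first.

    loads:     mapping of node name -> 1-minute load average
    threshold: load below which a node is treated as idle (an assumed cutoff)
    """
    idle = [(load, name) for name, load in loads.items() if load < threshold]
    # Sorting the (load, name) pairs orders by load, then by name on ties.
    return [name for load, name in sorted(idle)]

# Hypothetical snapshot of a four-node farm.
sample = {"node1": 1.98, "node2": 0.03, "node3": 0.41, "node4": 1.02}
print(idle_nodes(sample))
```

For the sample snapshot this prints ['node2', 'node3'], which is the order in which a user (or a submission script) might pick nodes for new work.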
More information about the Beowulf mailing list
