Naming etc. (Was: DHCP Help)
Josip Loncaric josip at icase.edu
Thu Apr 11 09:00:04 PDT 2002
"Robert G. Brown" wrote: > > [...] my hosts are all on a private > internal network anyway and not in nameservice. Good policy! Private hostnames/addresses should remain private because they are not guaranteed to be unique across the entire Internet. The DNS server should contain only registered hostnames/addresses. The head node of a cluster is typically multi-homed and its public interface should be DNS registered, but the internal private interface (and the client nodes on the internal private network) are best resolved via /etc/hosts, where internal domain name is determined from the FQDN form of the name. If /etc/hosts on client1 contains: 192.168.1.1 client1.internal.domain client1 then 'dnsdomainname' on client1 returns 'internal.domain' (clearly not found in any Internet registry). This would work fine internally, but NOT outside the cluster (e.g. sendmail may have problems, etc.). The /etc/hosts tables should be consistent across the cluster, even if there are reasons to play tricks. For example, one typically has all machines on a fast ethernet (FE) subnet (say 192.168.1.x) but a few may also have gigabit ethernet (GE) interfaces (say 192.168.2.x). Using IP level routing can result in complicated routing tables, because only specific FE hosts can also be reached via the GE interface. What about name level "routing"? While /etc/hosts can be used to make hostnames of GE machines resolve to GE addresses on GE machines but to their FE addresses on the FE-only machines, this can lead to problems with software packages which assume globally consistent hostname/address mapping. For example, grid software (Globus) needs a globally consistent FQDN/IP mapping. The grid machine name is the fully-qualified domain name or Internet name of a grid machine. It should be the name returned by the "gethostbyname()" function (from libc) and the primary name retrieved from DNS via nslookup. 

The primary name should correspond to the host's primary interface (if there is more than one) and be fully accessible across the grid. The grid could involve private addresses, but those are visible only WITHIN an organization because private addresses must not be routable outside an organization. This is a serious limitation -- so it is probably best to limit grids to publicly registered hosts only. Proxy processes on the head nodes may be needed to access internal machines.

Most clusters are built around a private subnet, sometimes with IP masquerading enabled on the head node so that the internal clients can 'call out'. This still means that internal clients are not visible externally, i.e. one cannot 'call in' from the outside.

As a consequence, parallel jobs which assume global TCP connectivity of all participating machines (e.g. MPICH-G2) will have problems in using two clusters (each with its own private internal subnet). At the moment, every node that you wish to use in an MPICH-G2 job must have a public IP address and must be fully accessible. To run jobs across several clusters with internal private networks, the MPI programmer would need to provide a proxy process on the head node to overcome this difficulty.

In summary, naming is a simple concept but just under the surface is a can of worms created by established programming practices based on diverse assumptions. Multiply connected machines and/or public/private network mixtures need to be set up with great care. Tricky setups are fragile; simplicity and transparency work better.

Sincerely,
Josip

--
Dr. Josip Loncaric, Research Fellow    mailto:josip at icase.edu
ICASE, Mail Stop 132C                  PGP key at http://www.icase.edu./~josip/
NASA Langley Research Center           mailto:j.loncaric at larc.nasa.gov
Hampton, VA 23681-2199, USA            Tel. +1 757 864-2192  Fax +1 757 864-6134
More information about the Beowulf mailing list
