[Beowulf] problem of mpich-1.2.7p1

Gus Correa gus at ldeo.columbia.edu
Thu Feb 4 12:30:48 PST 2010


Hi David

Sorry to hear that OpenMPI is a troublemaker on your
Solaris machines.
Have you asked the OpenMPI list about it?

I installed OpenMPI on Linux (Fedora, CentOS) without problems,
on clusters with InfiniBand, Gigabit Ethernet, and old 100 Mbit
Ethernet, and on standalone machines, using GNU, PGI, and Intel
compilers.
In my experience it installs easily and works well on Linux.
Among other reasons, I recommended it to the person who
asked for help because he said his cluster runs Linux (Ubuntu).
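
For the archives, a typical OpenMPI source build on Linux is
roughly this (the prefix and compiler choices are just examples,
adjust to taste):

  ./configure --prefix=/usr/local/openmpi-1.4.1 \
      CC=gcc CXX=g++ F77=gfortran FC=gfortran
  make -j4
  make install

Then put the prefix's bin directory on your PATH and its lib
directory on LD_LIBRARY_PATH before testing with mpirun.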

We also have and use MPICH2 and MVAPICH2 here, though.

Gus Correa

David Mathog wrote:
> Gus Correa <gus at ldeo.columbia.edu> wrote
>> If you already set up passwordless ssh across the nodes
>> OpenMPI will probably get you up and running faster than MPICH2.
>>
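>> Roughly, with OpenSSH (untested here; the user and node names
>> are made up):
>>
>>   ssh-keygen -t rsa          # accept the empty passphrase
>>   ssh-copy-id user@node01    # repeat for each node
>>   ssh user@node01 hostname   # should log in with no password prompt
>>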
>> OpenMPI is very easy to install, say, with gcc, g++, and
>> gfortran (make sure you have them installed on your main machine,
>> use Ubuntu apt-get, if you don't have them).
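>>
>> On Ubuntu, something like this should pull them in (package
>> names as on recent releases):
>>
>>   sudo apt-get install build-essential gfortran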
> 
> Well, on Linux maybe, but OpenMPI has been soundly kicking my butt
> for the last day while I try to get it installed and working on a
> Solaris 8 (SunOS 5.8) SPARC system, so I can't let that slide as a
> general statement.
> 
> OpenMPI 1.4.1 needed a few minor code mods to build at all with gcc
> on this system (it expects some defines that aren't present; this is
> with the sunfreeware gcc builds).  The mods were only in code that
> counts CPUs, which wasn't an issue here because this is a single-CPU
> system.  The same issues were reported by another fellow for 1.3.1
> on a Solaris 8 system:
> 
>  http://www.open-mpi.org/community/lists/users/2009/02/7994.php
>  
> The gcc build works so long as mpirun only sends jobs to itself.
> Sadly, try to send ANYTHING to a remote machine (Linux/Intel, in
> case that matters) and it treats one to:
> 
>  mca_oob_tcp_msg_send_handler:  writev failed:  Bad file descriptor
> 
> This on a build with no warnings or errors.  It is definitely a
> problem on the Solaris side, since any of the Linux machines can
> initiate an mpirun to another node, or to all other nodes, and that
> works with the example programs.  So with gcc, OpenMPI is not too
> useful for the front end of an MPI cluster.
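>
> (For anyone trying to reproduce this: the test amounts to building
> examples/hello_c.c with mpicc and launching it locally, then across
> the wire.  Host names below are made up, and I haven't checked how
> chatty the verbose knob is on 1.4.1:
>
>   mpicc examples/hello_c.c -o hello_c
>   mpirun -np 2 ./hello_c                            # fine locally
>   mpirun -np 2 -host sparcbox,linuxnode ./hello_c   # writev failure
>   mpirun --mca oob_base_verbose 10 -np 2 \
>       -host sparcbox,linuxnode ./hello_c
>
> ompi_info --param oob tcp lists the related parameters.)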
> 
> Today I'm trying again using Sun's Forte 7 tools, which require a
> fairly complex configure line:
> 
> ./configure --with-sge --prefix=/opt/ompi141 \
>     CFLAGS="-xarch=v8plusa" CXXFLAGS="-xarch=v8plusa" \
>     FFLAGS="-xarch=v8plusa" FCFLAGS="-xarch=v8plusa" \
>     CC=/opt/SUNWspro/bin/cc CXX=/opt/SUNWspro/bin/CC \
>     F77=/opt/SUNWspro/bin/f77 FC=/opt/SUNWspro/bin/f95 \
>     CCAS=/opt/SUNWspro/bin/cc CCASFLAGS="-xarch=v8plusa" \
>     >configure_4.log 2>&1 &
> 
> Not sure yet if that is sufficient, as none of the preceding configure
> variants resulted in a set of Makefiles which would actually run to
> completion, and this one is still building.
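>
> If this one does complete, the follow-up is the usual make and
> install, plus a sanity check that the Sun compilers were actually
> picked up (prefix as in the configure line above):
>
>   make >make.log 2>&1
>   make install
>   /opt/ompi141/bin/ompi_info | grep -i compiler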
> 
> Regards,
> 
> 
> David Mathog
> mathog at caltech.edu
> Manager, Sequence Analysis Facility, Biology Division, Caltech



