[Beowulf] WRF model on linux cluster: Mpi problem

Vincent Diepeveen diep at xs4all.nl
Fri Jul 1 04:36:08 PDT 2005


Federico,

You should just remove the entire thing and reinstall some nice
distribution with optimized MPI drivers for your cards.

What I did here was three things.

a) remove PCI cards from the same PCI bus that would clock the cards down
   from 66 MHz PCI to 33 MHz. In general that means removing them all except
   the NIC.
b) patch the kernel with the patches from the manufacturer of
   the NICs, which allow the NIC to communicate faster
   (only specific kernel versions work for that).
   Plain TCP/IP will ALWAYS run at latencies a factor of 10 worse than
   the native MPI implementation.
c) install pdsh

That's the procedure i followed.
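As a rough illustration of step c): pdsh simply fans a command out to every node and collects the output. A minimal sketch of what it automates (hypothetical node names; plain ssh fan-out, not pdsh's actual implementation):

```python
import subprocess

def run_everywhere(nodes, command, ssh_cmd=("ssh",)):
    """Run `command` on each node and collect stdout, roughly what
    pdsh automates (pdsh also parallelizes and handles host lists)."""
    results = {}
    for node in nodes:
        proc = subprocess.run(
            [*ssh_cmd, node, command],
            capture_output=True, text=True, timeout=30,
        )
        results[node] = proc.stdout.strip()
    return results

# e.g. run_everywhere(["node01", "node02"], "uptime")
```

With pdsh itself the equivalent would be a one-liner like `pdsh -w node[01-16] uptime`.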

By the way, what type of machines are the nodes built from? Dual Opterons?

Vincent

At 09:38 AM 7/1/2005 +0200, Federico Ceccarelli wrote:
>
>yes, 
>
>I will remove openmosix. 
>I patched the kernel with openmosix because I used the cluster also for
>other smaller applications, so the load balance was useful to me.
>
>I already tried to switch off openmosix with
>
>> service openmosix stop
>
>but nothing seems to change...
>
>Do you think it would make a difference to remove it completely, replacing
>the kernel with a new one without the openMosix patch?
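One quick way to tell whether the openMosix kernel is actually still active (as opposed to merely having its userspace service stopped) is to look for its /proc interface. A small sketch, assuming openMosix exposes its control files under /proc/hpc as the patched kernels did:

```python
import os

def openmosix_active():
    """Return True if an openMosix-patched kernel appears to be running.

    Assumption: openMosix exposes its control files under /proc/hpc;
    a vanilla kernel has no such directory."""
    return os.path.isdir("/proc/hpc")
```

If that directory is present after `service openmosix stop`, the patched kernel is still in charge, and booting an unpatched kernel is the sure fix.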
>
>thanks...
>
>federico
>
>
>Il giorno gio, 30-06-2005 alle 12:10 -0700, Michael Will ha scritto:
>> Vincent is on target here:
>> 
>> If your application already uses MPI as middleware assuming
>> distributed memory, then you should definitely use a Beowulf-style
>> setup rather than openMosix with its pseudo-shared-memory model.
>> 
>> Look at rocks 4.0.0 http://www.rocksclusters.org/Rocks/ which
>> is free and based on CentOS 4 which again is a free version of RHEL4.
>> 
>> Michael
>> 
>> Vincent Diepeveen wrote:
>> 
>> >At 02:34 PM 6/30/2005 +0200, Federico Ceccarelli wrote:
>> >  
>> >
>> >>Thanks for you answer Vincent,
>> >>
>> >>my network cards are Intel Pro 1000, Gigabit.
>> >>
>>Yes, I did a 72h (real-time) simulation that lasted 20h on 4 CPUs... same
>>behaviour...
>> >>
>>I'm thinking about a bandwidth problem...
>>
>>...maybe due to a hardware failure of some network card, or of the switch
>>(3Com Baseline Switch 2824).
>>
>>Or the PCI risers for the network cards (I have a 2U rack, so I
>>cannot mount the network cards directly in the PCI slot)...
>> >>    
>> >>
>> >
>> >Because the gigabit cards have such horrible one-way ping-pong latencies
>> >compared to the high-end cards (Myrinet, Dolphin, Quadrics and, relatively
>> >speaking, Infiniband), the PCI bus is not your biggest problem here.
>> >
>> >The specifications of the card are so restricted that the PCI bus is not
>> >the problem at all.
>> >
>> >There are many benchmarks out there. You should try a one-way
>> >ping-pong test.
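To illustrate what a one-way ping-pong test measures: one side sends a small message, the other echoes it back, and the one-way latency is taken as half the averaged round trip. A loopback sketch in Python (a real test runs between two cluster nodes over the actual NIC; `pingpong_latency` is a hypothetical helper, not a tool from this thread):

```python
import socket
import threading
import time

def pingpong_latency(n_iters=1000, msg=b"x"):
    """Estimate one-way latency (half the round trip) over loopback TCP.
    A real cluster test would put client and server on different nodes."""
    srv = socket.socket()
    srv.bind(("127.0.0.1", 0))
    srv.listen(1)
    port = srv.getsockname()[1]

    def echo_server():
        conn, _ = srv.accept()
        with conn:
            # Disable Nagle so small echoes are not delayed.
            conn.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
            for _ in range(n_iters):
                conn.sendall(conn.recv(len(msg)))

    t = threading.Thread(target=echo_server)
    t.start()
    cli = socket.create_connection(("127.0.0.1", port))
    cli.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    start = time.perf_counter()
    for _ in range(n_iters):
        cli.sendall(msg)
        cli.recv(len(msg))
    elapsed = time.perf_counter() - start
    cli.close()
    t.join()
    srv.close()
    return elapsed / n_iters / 2  # seconds, one-way
```

Gigabit Ethernet over TCP typically shows tens of microseconds here, versus single-digit microseconds for the high-end interconnects mentioned above.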
>> >
>> >By the way, the reason I do not run openMosix or similar single-system-image
>> >software is that it has such an ugly effect on latencies, and
>> >the way it pages shared-memory communication between nodes is really
>> >slow and bad for this type of software. There is also something called
>> >OpenSSI, which is under pretty active development. It has the same problem.
>> >
>> >Vincent
>> >
>> >  
>> >
>> >>Did you experience problems with PCI risers?
>> >>
>> >>Can you suggest a bandwidth benchmark?
>> >>
>> >>thanks again...
>> >>
>> >>federico
>> >>
>> >>Il giorno gio, 30-06-2005 alle 12:44 +0200, Vincent Diepeveen ha
>> >>scritto:
>> >>    
>> >>
>> >>>Hello Federico,
>> >>>
>> >>>Hope you can find contacts to colleagues.
>> >>>
>> >>>A few questions.
>> >>>  a) what kind of interconnects does the cluster have (network cards and
>> >>>     which type?)
>> >>>  b) if you run a simulation that takes a few hours instead of a few
>> >>>     seconds, do you get the same speed difference?
>> >>>
>> >>>I see the program is pretty big for open-source calculation software,
>> >>>about 1.9MB of Fortran code, so it's a bit time-consuming to figure out
>> >>>for someone who isn't a meteorological expert.
>> >>>
>> >>>E:\wrf>dir *.f* /s /p
>> >>>..
>> >>>     Total Files Listed:
>> >>>             141 File(s)      1,972,938 bytes
>> >>>
>> >>>Best regards,
>> >>>Vincent
>> >>>
>> >>>At 06:56 PM 6/29/2005 +0200, federico.ceccarelli wrote:
>> >>>      
>> >>>
>> >>>>Hi!
>> >>>>
>> >>>>I would like to get in touch with people running numerical meteorological
>> >>>>models on a Linux cluster (16 CPUs), distributed memory (1GB per node),
>> >>>>diskless nodes, gigabit LAN, MPICH and openMosix.
>> >>>>
>> >>>>I'm trying to run the WRF model, but the MPI version parallelized over
>> >>>>4, 8, or 16 nodes runs slower than the single-node one! It runs correctly
>> >>>>but so slow...
>> >>>>When I run wrf.exe on a single processor, the CPU time for every
>> >>>>timestep is about 10s for my configuration.
>> >>>>
>> >>>>When I switch to np=4, 8 or 16, the CPU time for a single step is
>> >>>>sometimes shorter (as it should always be, for example 3s on 4 CPUs),
>> >>>>but often it is much longer (60s and more!). The overall time of the
>> >>>>simulation is longer than for the single-node run...
>> >>>>
>> >>>>Has anyone experienced the same problem?
>> >>>>
>> >>>>thanks in advance to everybody...
>> >>>>
>> >>>>federico
>> >>>>
>> >>>>
>> >>>>
>> >>>>Dr. Federico Ceccarelli (PhD)
>> >>>>-----------------------------
>> >>>>    TechCom snc
>> >>>>Via di Sottoripa 1-18
>> >>>>16124 Genova - Italia
>> >>>>Tel: +39 010 860 5664
>> >>>>Fax: +39 010 860 5691
>> >>>>http://www.techcom.it
>> >>>>
>> >>>>_______________________________________________
>> >>>>Beowulf mailing list, Beowulf at beowulf.org
>> >>>>To change your subscription (digest mode or unsubscribe) visit
>> >>>>http://www.beowulf.org/mailman/listinfo/beowulf


