[Beowulf] PVM on wireless...

Gerry Creager gerry.creager at tamu.edu
Thu Feb 7 11:26:47 PST 2008


FWIW, we saw this with ROCKS and MPICH a couple of years ago.  It took a 
lot of firewall tweaking to get things working, and it's been too many 
beers to recall the details.  It is odd.

gerry

Robert G. Brown wrote:
> On Thu, 7 Feb 2008, Bill Rankin wrote:
> 
>> I think that I managed to replicate your problem, Rob.
>>
>> Laptop running CentOS5, pvm 3.4.5-7(rpm), wireless ethernet.
>> Server running FC6, pvm 3.4.5-7(rpm)
>>
>> Ssh working fine in both directions, PVM_ROOT and PVM_RSH set 
>> accordingly.
> 
> Try it with the firewalls completely down and I'll bet it works.
> 
> However, it is REALLY strange that it works with them UP for some
> combinations.  Or not so strange -- that's what was fooling me, after
> all.  Perhaps the port ranges being used are varying with version or
> chance.
> 
>    rgb
> 
>>
>> Running "pvm" from the shell on the server and doing an "add <laptop>" 
>> at the prompt.
>> Prompted for password.
>> PVM then hangs waiting to add remote host.
>> On the remote host, we see the pvmd running with a "ps".
>>
>> If I do nothing: the remote pvmd eventually dies and after that the 
>> command prompt on the server returns with a "1 successful" message, 
>> but a "conf" command shows that no hosts were added.
>>
>> Here is the weird part: if after I issue the "add <laptop>" command, I 
>> then go over to the laptop and run "pvm" from a shell, the connection 
>> is made and the hosts are successfully added.
>>
>> So you may want to try this and see if you get similar behavior.
>>
>>
>> Last datapoint: if from my laptop I attempt to add a host that has PVM 
>> 3.4.4 (CentOS4 rpm) installed, it starts up fine.  So I think that 
>> it's a bug in 3.4.5-7.  I haven't tried it over a wired connection yet.
>>
>> So you may want to try dropping back to version 3.4.4 on all machines 
>> and see if that helps.
>>
>>
>> Jim Kohl at ORNL seems to have several patches to 3.4.5, and I'm 
>> wondering if this issue has already been addressed.
>>
>>
>> -bill
> 

-- 
Gerry Creager -- gerry.creager at tamu.edu
Texas Mesonet -- AATLT, Texas A&M University
Cell: 979.229.5301 Office: 979.862.3982 FAX: 979.862.3983
Office: 1700 Research Parkway Ste 160, TAMU, College Station, TX 77843
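
A rough way to test the firewall theory discussed above without taking the firewalls down entirely is to check whether UDP datagrams make a round trip between the two machines in both directions. The sketch below rests on an assumption: that the pvmds talk to each other over UDP on ports picked at startup, which is my understanding and is not confirmed anywhere in this thread. The function names and the choice of port are hypothetical, and this is not part of PVM.

```python
import socket
import threading  # only needed if you run the echo side in a background thread


def udp_echo_once(sock):
    """Answer a single datagram on an already-bound UDP socket, then return."""
    data, addr = sock.recvfrom(1024)
    sock.sendto(data, addr)


def udp_probe(host, port, timeout=2.0):
    """Return True if a datagram sent to (host, port) is echoed back in time.

    False means the datagram or its reply was lost or blocked somewhere;
    it does not say which side dropped it.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        try:
            s.sendto(b"ping", (host, port))
            data, _ = s.recvfrom(1024)
        except OSError:  # includes timeouts and ICMP "port unreachable"
            return False
        return data == b"ping"
```

To use it, bind a UDP socket to some unprivileged port on the laptop, call `udp_echo_once` on it, and run `udp_probe` against that port from the server; then swap roles. If one direction fails with the firewalls up and succeeds with them down, that points at the firewall rules and not at the PVM version.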



