[Beowulf] MPI shared memory errors

Rusty Lusk lusk at mcs.anl.gov
Fri Jul 9 11:53:30 PDT 2004


You might want to begin using MPICH2, available from
http://www.mcs.anl.gov/mpi/mpich .  It is more robust than the old MPICH1.  

Regards,
Rusty Lusk

From: "Brent M. Clements" <bclem at rice.edu>
Subject: [Beowulf] MPI shared memory errors
Date: Thu, 8 Jul 2004 14:16:12 -0500 (CDT)

> Does anyone know how to fix the problem below? We have an idea or two but
> want to get other admins' opinions.
> 
> Thanks,
> Brent
> 
> Brent Clements
> Linux Technology Specialist
> Information Technology
> Rice University
> 
> Linux at Rice news and information
> available only at http://linuxsupport.rice.edu
> 
> 
> ---------- Forwarded message ----------
> Date: Thu, 08 Jul 2004 13:15:18 -0500
> From: Randy Crawford <rand at rice.edu>
> To: Brent M. Clements <bclem at rice.edu>
> Subject: Re: can you send me that error again?
> 
> When running two MPI processes over Ethernet, the original error was:
> 
> "
> p2_15517: (38.889341) xx_shmalloc: returning NULL; requested 65584
> p2_15517: (38.889341) p4_shmalloc returning NULL; request = 65584 bytes
> You can increase the amount of memory by setting the environment variable
> P4_GLOBMEMSIZE (in bytes); the current size is 4194304
> p2_15517:  p4_error: alloc_p4_msg failed: 0
> CHARMDEBUG> Processor 3 has PID 15518
> CHARMDEBUG> Processor 1 has PID 13334
> bm_list_13335: (39.139197) net_send: could not write to fd=5, errno =32
> "
> 
> I then reset shmmax on all the nodes to be much higher, and I think
> the failure then occurred at 128 KB.
> 
> Then I set P4_GLOBMEMSIZE to something like 2 GB (instead of 4 MB), and I got a
> different error:
> 
> p0_6444:  p4_error: exceeding max num of P4_MAX_SYSV_SHMIDS: 256
> 
> 
> 
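For reference, the error text above says P4_GLOBMEMSIZE is given in bytes. The following is a minimal Python sketch of computing that value and checking the kernel's shmmax limit on a Linux node. The helper names and the 64 MB figure are illustrative assumptions, not taken from the messages above, and whether the variable reaches remote processes depends on how the job is launched.

```python
import os

def globmemsize_bytes(megabytes):
    """Convert a size in MB to the byte count P4_GLOBMEMSIZE expects."""
    return megabytes * 1024 * 1024

def p4_env(megabytes, base_env=None):
    """Return a copy of an environment with P4_GLOBMEMSIZE set.

    Hypothetical helper: base_env defaults to the current process environment.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["P4_GLOBMEMSIZE"] = str(globmemsize_bytes(megabytes))
    return env

def read_shmmax(path="/proc/sys/kernel/shmmax"):
    """Return the SysV shared memory segment size limit in bytes, or None if unreadable."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None

if __name__ == "__main__":
    # 4 MB matches the "current size is 4194304" line in the log.
    print(globmemsize_bytes(4))
    # 64 MB is an arbitrary illustrative value, not a recommendation.
    env = p4_env(64)
    print(env["P4_GLOBMEMSIZE"])
    print(read_shmmax())
```

The returned dictionary could be passed as the `env` argument of `subprocess.run` when starting the job from a script.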
