<div>This reminds me of a similar issue I had.  What approaches do you take for large dense matrix multiplication in MPI when the matrices are too large to fit into the cluster's memory?  When I hack up something to cache intermediate results to disk, the I/O grinds everything to a halt, so I'm looking for a better solution.  I'd like to use a library like PETSc, but how would you work around memory limitations like this (short of building a bigger cluster)?
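<br><br>One common way to keep the disk traffic down in this situation is to block the multiplication so that partial sums never touch the disk: each tile of the result is accumulated in RAM and written out exactly once. Below is a minimal single-node sketch in Python/NumPy, assuming square-ish row-major matrices stored as flat binary files and a tile size chosen so that three tiles fit in memory. The names `tiled_matmul_on_disk` and `demo` and the file names are hypothetical, and this is neither PETSc nor MPI code; it only illustrates the access pattern.

```python
import os
import tempfile

import numpy as np


def tiled_matmul_on_disk(a, b, c, tile):
    """Compute c = a @ b tile by tile.

    a, b, c are 2-D array-likes (for example np.memmap objects backed by
    files). Only one tile of each matrix needs to be in RAM at a time.
    """
    n, k = a.shape
    k2, m = b.shape
    if k != k2 or c.shape != (n, m):
        raise ValueError("shape mismatch")
    for i in range(0, n, tile):
        for j in range(0, m, tile):
            # Partial sums for this tile of C stay in memory the whole time.
            acc = np.zeros((min(tile, n - i), min(tile, m - j)), dtype=c.dtype)
            for p in range(0, k, tile):
                acc += a[i:i + tile, p:p + tile] @ b[p:p + tile, j:j + tile]
            # The finished tile is written to disk once, never re-read.
            c[i:i + tile, j:j + tile] = acc


def demo(n=64, tile=16):
    """Tiny self-check: returns the max abs error vs. the in-memory product."""
    rng = np.random.default_rng(0)
    with tempfile.TemporaryDirectory() as d:

        def on_disk(name):
            path = os.path.join(d, name)
            return np.memmap(path, dtype=np.float64, mode="w+", shape=(n, n))

        a, b, c = on_disk("a.bin"), on_disk("b.bin"), on_disk("c.bin")
        a[:] = rng.standard_normal((n, n))
        b[:] = rng.standard_normal((n, n))
        tiled_matmul_on_disk(a, b, c, tile)
        c.flush()
        expected = np.asarray(a) @ np.asarray(b)
        err = float(np.max(np.abs(np.asarray(c) - expected)))
        del a, b, c  # release the mappings before the directory is removed
    return err
```

With n x n matrices and tile size t, this pattern reads roughly 2n^3/t elements and writes n^2, so the largest tile that fits in RAM gives the least I/O. In an MPI setting the same idea would presumably apply per rank, with each rank owning a set of result tiles, but that part is not shown here.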

<br><br><br><br> </div><blockquote style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;" class="gmail_quote">>> I don't speak fortran natively, but isn't that array

>> approximately 3.6 TB in size?<br>><br>> Oops, forgot to put the decimal in the right place.<br>><br>> 9915^3 * 8 bits/integer / 1024^3 bytes/GB = 907 GB.<br>><br>> It could be done with a 64 bit kernel. Too big for PAE.

<br><br>Yeah, if you had a box with several hundred memory slots....<br><br>Which I say only semi-sarcastically.  They sound like they're coming,<br>they're coming.  Who knows, maybe they're here and I'm just out of

<br>touch.<br><br>If it is a sparse matrix, then just maybe one can do something on this<br>scale, but otherwise, well, it's like telling Mathematica to go and<br>compute umpty-something factorial -- it will go out, make a heroic

<br>effort, use all the free memory in the universe, and die valiantly<br>(perhaps taking down your computer with it if the kernel happens to need<br>some memory at a critical time when there isn't any).  Large scale<br>

computation as a DoS attack...<br></blockquote><br><br clear="all"><br>-- <br>Peter N. Skomoroch<br><a href="mailto:peter.skomoroch@gmail.com">peter.skomoroch@gmail.com</a><br><a href="http://www.datawrangling.com">http://www.datawrangling.com

</a>