[Beowulf] Moores Law is dying

Thu Apr 23 20:12:13 PDT 2009

On Tue, Apr 14, 2009 at 02:33:51PM -0400, Robert G. Brown wrote:
> On Tue, 14 Apr 2009, Jon Forrest wrote:
>
>> The reason for this is that it's simply too hard to write
>> a program whose instructions require even close to the
....
>
> I'm not sure what you mean here.  Surely the instructions are in all
> cases built into the final list by a compiler (linker/loader).  Are you
> suggesting that e.g. a long-running program with fully unrolled loops
> cannot exceed 4 GB in size and still be "simple"? 

One important bound on loop unrolling is the size of the primary (L1)
instruction cache.  A loop whose instructions all fit in cache will
likely be a lot faster than one unrolled far enough to overflow the I-cache.
Staying in cache also lets the bus interface move data to and from main
memory with less contention from instruction fetches.  So unrolled loops
are the less interesting case here.
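
To make the trade-off concrete, here is a minimal sketch of the same
reduction written rolled and hand-unrolled by four.  The function names
are made up for illustration, and a real compiler may unroll (or not)
on its own at higher optimization levels.

```c
#include <stddef.h>

/* Rolled loop: a handful of instructions that stay resident in the I-cache. */
long sum_rolled(const int *a, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Same loop unrolled by hand four times: fewer branches per element,
 * but roughly four times the instruction bytes in the loop body. */
long sum_unrolled4(const int *a, size_t n)
{
    long s = 0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        s += a[i];
        s += a[i + 1];
        s += a[i + 2];
        s += a[i + 3];
    }
    for (; i < n; i++)      /* leftover elements when n is not a multiple of 4 */
        s += a[i];
    return s;
}
```

Both return the same sum; only the code footprint differs.  Push the
unroll factor high enough and the loop body no longer fits in the
primary instruction cache, which is the bound described above.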

Still, many compiler optimizations grow code, and so will inlining
code that yesterday was a function call.  Inlined code may get you
closer to 4GB more quickly than loop unrolling and still qualify as "simple".
Compilers can do this quietly in some cases (see the intrinsic magic in gcc).
Some compiler users would also love to have some library functions extracted
and linked inline.  Depending on which bits are pulled inline and replicated
N times, the 4GB limit might be reached quickly.
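
A small sketch of how inlining replicates code, assuming GCC or Clang
(the always_inline and noinline attributes are compiler extensions, and
the helper names here are hypothetical):

```c
/* GCC/Clang extension: force a copy of the body at every call site. */
static inline __attribute__((always_inline))
int clamp_inline(int x, int lo, int hi)
{
    return x < lo ? lo : (x > hi ? hi : x);
}

/* GCC/Clang extension: keep a single out-of-line copy that callers jump to. */
static __attribute__((noinline))
int clamp_call(int x, int lo, int hi)
{
    return x < lo ? lo : (x > hi ? hi : x);
}

/* Three call sites: the clamp body is emitted three times in this function. */
int three_sites_inline(int a, int b, int c)
{
    return clamp_inline(a, 0, 255) + clamp_inline(b, 0, 255)
         + clamp_inline(c, 0, 255);
}

/* Three call sites: three call instructions, one shared copy of the body. */
int three_sites_call(int a, int b, int c)
{
    return clamp_call(a, 0, 255) + clamp_call(b, 0, 255)
         + clamp_call(c, 0, 255);
}
```

The results are identical; what changes is that the inlined version
carries N copies of the body for N call sites.  With a body much larger
than this toy clamp, and thousands of call sites, the growth adds up.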

Another way to explode the code space is generated interfaces (library wrappers)
for all N data types in a library, for N different languages.
If I recall correctly, the MPI folk already prune some of their generated wrappers
so the linker does not hurl: C to/from Fortran, C to/from C++, etc., for
all the various data types.  I forget which...
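
As a rough illustration of how generated wrappers multiply, here is a
sketch using a C macro list.  This is not how any MPI implementation
actually generates its bindings; the type list and function names are
invented for the example.

```c
#include <stddef.h>

/* Hypothetical list of element types the library supports. */
#define FOR_EACH_TYPE(X) \
    X(int,    int)       \
    X(float,  float)     \
    X(double, double)

/* Each expansion emits one complete wrapper function for one type. */
#define DEFINE_SUM(suffix, type)                    \
    type sum_##suffix(const type *buf, size_t n)    \
    {                                               \
        type s = 0;                                 \
        for (size_t i = 0; i < n; i++)              \
            s += buf[i];                            \
        return s;                                   \
    }

/* Generates sum_int, sum_float and sum_double: three full function bodies. */
FOR_EACH_TYPE(DEFINE_SUM)
```

One line added to the type list means one more full copy of every
generated function.  Multiply that by the number of library entry points
and the number of language bindings and the object code grows as the
product of all three.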

-- 
	T o m  M i t c h e l l 
	Found me a new hat, now what?