<html><body>

<DIV> </DIV>

<BLOCKQUOTE style="PADDING-LEFT: 5px; MARGIN-LEFT: 5px; BORDER-LEFT: #1010ff 2px solid">-------------- Original message -------------- <BR>From: "Peter St. John" &lt;peter.st.john@gmail.com&gt; <BR>

<DIV>DLP? Wiki has entries for Instruction Level Parallelism and Thread LP (also Memory LP) but </DIV>

<DIV>not DLP?</DIV>

<DIV> </DIV>

<DIV>Hey Peter,</DIV>

<DIV> </DIV>

<DIV>That would be "data level parallelism".  ILP is very low-level parallelism: nearby</DIV>

<DIV>instructions that are independent of one another are issued side by side, superscalar</DIV>

<DIV>fashion, on a "wide" processor.  TLP refers to a slightly higher level of parallelism</DIV>

<DIV><SPAN class=q>that is instruction dominated/oriented and is associated with independent program blocks</SPAN></DIV>

<DIV><SPAN class=q>or loop iterations (can be within or between programs or subroutines), and DLP refers</SPAN></DIV>

<DIV><SPAN class=q>to data dominant parallelism usually associated with looping structures that could vectorize</SPAN></DIV>

<DIV><SPAN class=q>or, in other words, allow a compiler to drive, with a small number of instructions, a large pipeline (and/or parallel stream [think GPUs here]) of independent data operations in the</SPAN></DIV>

<DIV><SPAN class=q>CPU for which the total instruction latency is trivially small in theory (i.e. when you have a</SPAN></DIV>

<DIV><SPAN class=q>vector instruction set).</SPAN></DIV>
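
<DIV><SPAN class=q>To make the three flavors concrete, here is a rough sketch of the dependency pattern each one relies on. It is in Python purely for readability; the interpreter itself will not issue these in parallel the way a wide or vector CPU would, and the function names and the worker count are made up for illustration.</SPAN></DIV>

```python
from concurrent.futures import ThreadPoolExecutor


def ilp_example(a, b, c, d):
    # ILP: the two products do not depend on each other, so a wide
    # (superscalar) core could issue them in the same cycle.
    p = a * b
    q = c * d
    # The add depends on both p and q, so it has to wait for them.
    return p + q


def dlp_example(alpha, x, y):
    # DLP: the same operation on every element, with no dependence
    # between iterations. This is the loop shape that can vectorize,
    # i.e. a few instructions driving many independent data operations.
    return [alpha * xi + yi for xi, yi in zip(x, y)]


def tlp_example(blocks):
    # TLP: independent program blocks (here, hypothetical per-block sums)
    # handed to separate threads. The worker count of 4 is arbitrary.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(sum, blocks))
```

<DIV><SPAN class=q>The point is only the shape of the dependencies: independent neighboring instructions, independent loop iterations over data, and independent blocks of work.</SPAN></DIV>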

<DIV><SPAN class=q></SPAN> </DIV>

<DIV><SPAN class=q>In one sense, the low level parallelism (and structural hazard performance limitations) of any program can be defined by a sort of aspect </SPAN><SPAN class=q>ratio that is ILP x DLP x TLP and every code/kernel has its own dimensionality and volume. </SPAN></DIV>
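
<DIV><SPAN class=q>One way to picture that "aspect ratio" is as a box with three sides. The sketch below is my own simplification, not a standard model: it assumes each axis can be described by a single integer and that, along each axis, a kernel can use no more parallelism than the smaller of what the code offers and what the hardware provides. The Shape type, the exploitable function and any numbers you feed in are hypothetical.</SPAN></DIV>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shape:
    ilp: int  # independent instructions available (or issuable) at once
    dlp: int  # independent data elements per vector-style operation
    tlp: int  # independent threads or program blocks

    def volume(self) -> int:
        # The "volume" of the ILP x DLP x TLP box.
        return self.ilp * self.dlp * self.tlp


def exploitable(kernel: Shape, machine: Shape) -> Shape:
    # Assumption: each axis is capped by whichever side is smaller,
    # the parallelism in the code or the width of the hardware.
    return Shape(
        ilp=min(kernel.ilp, machine.ilp),
        dlp=min(kernel.dlp, machine.dlp),
        tlp=min(kernel.tlp, machine.tlp),
    )
```

<DIV><SPAN class=q>For example, a kernel described as Shape(2, 1024, 1) on a machine described as Shape(4, 8, 16) would come out as Shape(2, 8, 1) under this toy model: lots of data parallelism on offer, but only a thin slice of the machine's volume actually in use.</SPAN></DIV>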

<DIV><SPAN class=q></SPAN> </DIV>

<DIV><SPAN class=q>Another major question for performance and processor design is which kind of latency dominates</SPAN></DIV>

<DIV><SPAN class=q>(instruction latency or data latency) in limiting a code's performance to something less</SPAN></DIV>

<DIV><SPAN class=q>than register-to-register optimal.  Generally, data latency dominates in HPC, while instruction latency dominates in the Enterprise world.</SPAN></DIV>

<DIV><SPAN class=q></SPAN> </DIV>

<DIV><SPAN class=q>But now I am probably rambling, and telling you something that you already know.</SPAN></DIV>

<DIV><SPAN class=q></SPAN> </DIV>

<DIV><SPAN class=q>Regards,</SPAN></DIV>

<DIV><SPAN class=q></SPAN> </DIV>

<DIV><SPAN class=q>rbw</SPAN></DIV>

<DIV><SPAN class=q><BR>-- <BR><BR>"Making predictions is hard, especially about the future." <BR><BR>Niels Bohr <BR><BR>-- <BR><BR>Richard Walsh <BR>Thrashing River Consulting <BR>5605 Alameda St. <BR>Shoreview, MN 55126 <BR><BR>Phone #: 612-382-4620</SPAN></DIV></BLOCKQUOTE></body></html>