interesting Athlon/P4 discussion from FreeBSD-Q-l

Jeremiah Gowdy jgowdy at home.com
Fri May 11 08:34:14 PDT 2001


> the P4 has an awesome combination of hardware prefetcher,
> fast FSB, and dram that keeps up with it.  for code that 
> needs bandwidth, this is very attractive.  and it's dramatically
> faster than anything else in the ia32 world: 1.6 GB/s versus
> at most around .8 GB/s for even PC2100 DDR systems (at least 
> so far - I'm hopeful that DDR can manage around 1.2 GB/s when
> tuned, and if the next-gen Athlon contains hardware prefetch.)

According to InQuest's article:
http://www.inqst.com/articles/p4bandwidth/p4bandwidthmain.htm

The Culprit - Longer Burst Length

Since the P4 is not getting any more work done than the P3 in this
application, then its excess bandwidth demand is probably just
extraneous, meaningless bus noise.  If so, this is a poor marketing
justification for higher bandwidth.

The P4 uses a 128-byte sectored cache line.  This means that most
external burst accesses will be 128-bytes long, though some can be
abbreviated to 64-bytes long (perhaps code fetches, some write backs or
cache misses to the second sector).  By the way, this type of long
sectored cache design can negatively impact cache-hit rates.  If 40% of
external bus accesses are 64-bytes, then perhaps 40% of the cache lines
are only using 64 of the 128-bytes available per line.  This would mean
that up to 20% of cache memory is empty (unused, invalid or
unallocated). This would negatively impact P4 cache hit rates and thus,
performance.
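
To spell out where that "up to 20%" figure comes from, here is a minimal
sketch of the arithmetic in Python.  The function and parameter names are
my own, not InQuest's, and it takes the article's guess at face value:
that the share of 64-byte bus accesses equals the share of cache lines
holding only one valid 64-byte sector.

```python
def unused_cache_fraction(half_filled_share, line_bytes=128, used_bytes=64):
    """Fraction of total cache capacity left empty.

    half_filled_share: share of cache lines holding only `used_bytes`
    of valid data (assumed equal to the share of 64-byte bus accesses,
    which is the article's guess, not a measured value).
    """
    # Portion of each partially filled line that holds no valid data.
    wasted_per_line = (line_bytes - used_bytes) / line_bytes
    # Scale by how many lines are partially filled.
    return half_filled_share * wasted_per_line


# 40% of lines half empty -> 0.40 * (64 / 128) of the cache unused.
print(unused_cache_fraction(0.40))
```

So the 20% is an upper bound that only holds if every abbreviated burst
really leaves the second sector of its line invalid.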


And as for the faster DRAM, you are of course referring to RDRAM.
I've yet to see a benchmark that shows RDRAM actually delivering the
bandwidth it claims in real-world applications (or even in most
synthetic benchmarks).

While the memory bandwidth of the P4 with RDRAM is, on paper, higher
than anything else in the IA-32 world, in almost every benchmark I've
seen in which the benchmarking program wasn't specially optimized
for SSE2, the Athlon 1.3 GHz has kicked the absolute crap out of the P4
1.5 GHz.  What good is all that memory bandwidth if the processor can't
stand up to real-world applications?  I could make a CPU/memory/chipset
combo and say "if you use it this way, it's the fastest computer ever
created", but people would say "But our applications don't do it that
way!"  Rambus proponents (mostly stockholders and people who've bought
a P4 for an outrageous price) always make claims about how great it
would be if only applications were optimized for it.  "It's ahead of
its time," they say; over time, applications will be optimized for it.
Fine.  Someday, when things actually run FASTER on your 400 MHz bus and
your 1.7 GHz CPU, maybe the rest of us will consider buying one, IF it
doesn't cost the price of a small car.  But at this point, show me
something, anything, that makes it worth spending twice as much on a
P4 with RDRAM as on an Athlon with DDR SDRAM.

However, I don't think that day will ever actually come.  Rambus is
going down the toilet now that they're losing these lawsuits.  Intel is
disgusted with the whole Rambus affair.  DDR SDRAM is FAR cheaper, and
is sold by more than one vendor.  The 64-bit CPU jihad is coming soon,
so the Athlon's and the P4's days are numbered anyway.  Sure, Intel has
another P4 core on the books, and AMD *had* the Mustang on the books,
but really those processors are meaningless to the high-end market once
the 64-bit CPUs come out.  Optimizing current 32-bit applications
specifically for the P4 and its RDRAM is nonsensical.  All of the Rambus
coulda/shoulda/woulda/if-only-this/if-only-that means nothing.  And
perhaps in the right place and at the right time, RDRAM is a superior
product.  But the market doesn't always favor a superior product.  Think
of Rambus as Sony with Betamax.  Think of the DDR SDRAM vendors as the
VHS vendors.  Sony had A LOT more respect, a lot more money, a lot more
everything, and yet they couldn't beat the cheaper VHS.  It's the
classic example of the proprietary, expensive, best-quality model vs the
standardized, cheaper, not-as-good model (but did I mention cheaper?),
only in this case, it's not even proven that RDRAM IS the best.  So if
companies that have a demonstrably better product can't win that fight,
how can Rambus and the P4, when they AREN'T demonstrably better?  I
won't even begin to get into the Macintosh/Motorola vs Windows/Intel
battle.

Short and simple:  The superior product doesn't always win, and the
P4+RDRAM combination has yet to prove beyond a doubt that it is superior.

Just think about it.

_______________________________
Jeremiah Gowdy - IT Manager

Sherline Products Inc
3235 Executive Ridge 
Vista CA 92083-8527

Sales: 1-800-541-0735
International: (760) 727-5857
Fax: (760) 727-7857
_______________________________







