
[Beowulf] Re: OT: informatics software for linux clusters


Mark Hahn hahn at physics.mcmaster.ca
Tue May 16 08:47:41 PDT 2006


> > That is an issue with this code.  The Athlon has a 256k L2 last I
> > remember, and a 128k L1.  Rather hard to keep lots of stuff in cache.

for their time (now well past), 384 KB was a decent cache capacity.
(remember that AMD has traditionally used an exclusive cache design,
so nothing held in L1 is duplicated in L2, unlike Intel.)

> Barton cores had 512k L2 as well as a faster front side bus.

I speculate that AMD will follow Intel to 2M/core caches as soon
as they start producing 65 nm chips.  hopefully, they'll also add
better _compute_ units, at least matching Intel's Core2 FP
capabilities.

> > Right now the big issue we are running into for another aspect of this
> > project is the lack of a vector max/min function in SSE*.  (If anyone

I'm a complete SSE virgin (almost), but isn't this largely just
a matter of doing a packed comparison, then using the resulting
per-element mask to select and merge the two inputs?
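
to make the compare-then-merge idea concrete, here is a minimal sketch
using SSE2 intrinsics on packed 32-bit signed integers. the element type
is an assumption (the thread doesn't say which one the original code
needs), and the function name select_max_epi32 is made up for
illustration. the same pattern works for other widths by swapping the
compare intrinsic.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Element-wise max of two vectors of four 32-bit signed ints,
 * built from a packed compare followed by a mask-and-merge.
 * (Hypothetical helper name; not taken from the original code.) */
static __m128i select_max_epi32(__m128i a, __m128i b)
{
    /* Each lane becomes all-ones where a > b, all-zeros otherwise. */
    __m128i mask = _mm_cmpgt_epi32(a, b);

    /* Keep a's lanes where the mask is set, b's lanes where it is not. */
    __m128i from_a = _mm_and_si128(mask, a);
    __m128i from_b = _mm_andnot_si128(mask, b);  /* (~mask) & b */

    return _mm_or_si128(from_a, from_b);
}
```

swapping the operands of the compare gives the element-wise min. this
costs one extra register for the mask on top of the two inputs.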

> > from AMD/Intel is listening, this is a *big* issue, and I even have a
> > rough idea how to do it "quickly" in SSE at the expense of many SSE
> > registers.

I'd think you'd need one reg to hold the current max, one to load 
candidates into, and probably another to do the flag-vector-merge thing.
at the end you do a "horizontal" min/max to get the final result.



