[Beowulf] scalability

amjad ali amjad11 at gmail.com
Wed Dec 9 19:14:23 PST 2009


Hi Gus,

Thank you for your nice reply, as usual.

I ran my code on a single-socket Xeon node having two cores; it scaled
nearly linearly, at 97+% efficiency.

Then I ran my code on a single-socket Xeon node having four cores (a Xeon
3220, which is really not a good quad core) and got an efficiency of
around 85%.

But when I ran 4 processes on four single-socket nodes (1 process on each
node), I got an efficiency of around 62%.
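(Assuming the usual definition of parallel efficiency, E(p) = T(1) /
(p * T(p)), these figures correspond to speedups of roughly 1.94 on 2
cores and 3.4 on 4 cores within a node, but only about 2.5 across 4
nodes, so the loss across nodes is far larger than the loss within a
node.)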

Yes, CFD codes are usually memory-bandwidth bound.
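In case it is useful to others, timings like the ones above can be
collected with a minimal MPI harness along the following lines. This is
only a sketch, not my actual code: solver() here is a dummy placeholder
standing in for the real CFD time-stepping loop.

    #include <mpi.h>
    #include <stdio.h>

    /* Dummy kernel: replace with the real CFD time-stepping loop. */
    static void solver(void)
    {
        volatile double x = 0.0;
        long i;
        for (i = 0; i < 100000000L; ++i)
            x += 1.0 / (double)(i + 1);
    }

    int main(int argc, char **argv)
    {
        int rank, nprocs;
        double t0, t1;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

        MPI_Barrier(MPI_COMM_WORLD);   /* start all ranks together */
        t0 = MPI_Wtime();

        solver();                      /* the work being timed */

        MPI_Barrier(MPI_COMM_WORLD);   /* wait for the slowest rank */
        t1 = MPI_Wtime();

        if (rank == 0)
            printf("np = %d, wall time = %.3f s\n", nprocs, t1 - t0);

        MPI_Finalize();
        return 0;
    }

Compiled with mpicc, the same binary covers both placements: with Open
MPI, for example, "mpirun -np 4 ./timer" times the intra-node case, while
"mpirun -np 4 -hostfile hosts ./timer" with one slot per node in the
hostfile times the across-node case. Efficiency then follows from
E(p) = T(1) / (p * T(p)).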

Thank you very much.

On Wed, Dec 9, 2009 at 9:11 PM, Gus Correa <gus at ldeo.columbia.edu> wrote:

> Hi Amjad
>
> There is relatively inexpensive InfiniBand SDR:
>
> http://www.colfaxdirect.com/store/pc/showsearchresults.asp?customfield=5&SearchValues=65
> http://www.colfaxdirect.com/store/pc/viewPrd.asp?idproduct=12
>
> http://www.colfaxdirect.com/store/pc/viewCategories.asp?SFID=12&SFNAME=Brand&SFVID=50&SFVALUE=Mellanox&SFCount=0&page=0&pageStyle=m&idcategory=2&VS12=0&VS9=0&VS10=0&VS4=0&VS3=0&VS11=0
> Not the latest greatest, but faster than Gigabit Ethernet.
> A better Gigabit Ethernet switch may help also,
> but I wonder if the impact will be as big as expected.
>
> However, are you sure the scalability problems you see are
> due to a poor network connection?
> Could it perhaps be related to the code itself,
> or maybe to the processors' memory bandwidth?
>
> You could test whether it is the network by running the program
> inside a node (say on 4 cores) and across 4 nodes with
> one core in use on each node, or other combinations
> (2 cores on each of 2 nodes).
>
> You could get an indication of the processors' scalability
> by timing program runs inside a single node using 1, 2, 3, and 4 cores.
>
> My experience with dual-socket dual-core Xeons vs.
> dual-socket dual-core Opterons,
> with the type of code we run here (ocean, atmosphere, and climate
> models, which are not totally far from your CFD), is that Opterons
> scale close to linearly, but Xeons nearly stop scaling
> when more than 2 processes (3 or 4) run in a single node.
>
> My two cents.
> Gus Correa
> ---------------------------------------------------------------------
> Gustavo Correa
> Lamont-Doherty Earth Observatory - Columbia University
> Palisades, NY, 10964-8000 - USA
> ---------------------------------------------------------------------
>
>
> amjad ali wrote:
>
>> Hi all,
>>
>> I have, with my group, a small cluster of about 16 nodes (each one with
>> a single-socket Xeon 3085 or 3110), and I face a problem of poor
>> scalability. Its network is quite ordinary GigE (perhaps a D-Link
>> DGS-1024D 24-port 10/100/1000), a store-and-forward switch priced at
>> only about $250.
>> ftp://ftp10.dlink.com/pdfs/products/DGS-1024D/DGS-1024D_ds.pdf
>>
>> How should I work on that for better scalability?
>>
>> What could be better affordable options for fast switches? (Myrinet
>> and InfiniBand are quite costly.)
>>
>> When buying a switch, what should we look for? What latency?
>>
>>
>> Thank you very much.