From samuel at unimelb.edu.au Wed Sep 1 00:34:21 2010 From: samuel at unimelb.edu.au (Christopher Samuel) Date: Wed, 01 Sep 2010 17:34:21 +1000 Subject: [Beowulf] When is compute-node load-average "high" in the HPC context? Setting correct thresholds on a warning script. In-Reply-To: <29E4598B-4AFA-43ED-A5A8-B241CACCF217@staff.uni-marburg.de> References: <29E4598B-4AFA-43ED-A5A8-B241CACCF217@staff.uni-marburg.de> Message-ID: <4C7E01FD.10104@unimelb.edu.au> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 01/09/10 01:58, Reuti wrote: > With recent kernels also (kernel) processes in D state > count as running. I wouldn't say recent, that goes back as far as I can remember. For instance I've seen RHEL3 (2.4.x - sort of) NFS servers with load averages in the 80's where they were run with a lot of nfsd's that were blocked waiting for I/O due to ext3. cheers! Chris - -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences Computational Initiative Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545 http://www.vlsci.unimelb.edu.au/ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkx+AfwACgkQO2KABBYQAh+QhgCfUUgmyUUGYtQ00Xd8/N/TOXN1 47gAn0DYzhSrZV1pY489HpMVhjGNVXPl =70PC -----END PGP SIGNATURE----- From reuti at staff.uni-marburg.de Wed Sep 1 01:47:29 2010 From: reuti at staff.uni-marburg.de (Reuti) Date: Wed, 1 Sep 2010 10:47:29 +0200 Subject: [Beowulf] When is compute-node load-average "high" in the HPC context? Setting correct thresholds on a warning script. In-Reply-To: <4C7E01FD.10104@unimelb.edu.au> References: <29E4598B-4AFA-43ED-A5A8-B241CACCF217@staff.uni-marburg.de> <4C7E01FD.10104@unimelb.edu.au> Message-ID: Am 01.09.2010 um 09:34 schrieb Christopher Samuel: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On 01/09/10 01:58, Reuti wrote: > >> With recent kernels also (kernel) processes in D state >> count as running. 
> > I wouldn't say recent, that goes back as far as I can > remember. > > For instance I've seen RHEL3 (2.4.x - sort of) NFS servers > with load averages in the 80's where they were run with a lot > of nfsd's that were blocked waiting for I/O due to ext3. My impression was always (as there is a similar setting for the load_threshold in OGE) that it should limit the number of jobs on a big SMP machine when you oversubscribe intentionally, as not all parallel jobs really use all the CPU power over their lifetime (maybe such a machine was even operated w/o any NFS). Then allowing e.g. 72 slots for jobs on a 60-core machine might get the most out of it with a load near 100%. Well, now that newer CPUs have 12 cores and are assembled into 24- or 48-core machines, such a setting would be useful again. Maybe the load sensor should honor only the scheduled jobs' load. -- Reuti > cheers! > Chris > - -- > Christopher Samuel - Senior Systems Administrator > VLSCI - Victorian Life Sciences Computational Initiative > Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545 > http://www.vlsci.unimelb.edu.au/ > > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.4.10 (GNU/Linux) > Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ > > iEYEARECAAYFAkx+AfwACgkQO2KABBYQAh+QhgCfUUgmyUUGYtQ00Xd8/N/TOXN1 > 47gAn0DYzhSrZV1pY489HpMVhjGNVXPl > =70PC > -----END PGP SIGNATURE----- > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From mm at yuhu.biz Wed Sep 1 03:15:55 2010 From: mm at yuhu.biz (Marian Marinov) Date: Wed, 1 Sep 2010 13:15:55 +0300 Subject: [Beowulf] When is compute-node load-average "high" in the HPC context? Setting correct thresholds on a warning script.
In-Reply-To: References: <4C7E01FD.10104@unimelb.edu.au> Message-ID: <201009011315.58866.mm@yuhu.biz> On Wednesday 01 September 2010 11:47:29 Reuti wrote: > Am 01.09.2010 um 09:34 schrieb Christopher Samuel: > > -----BEGIN PGP SIGNED MESSAGE----- > > Hash: SHA1 > > > > On 01/09/10 01:58, Reuti wrote: > >> With recent kernels also (kernel) processes in D state > >> count as running. > > > > I wouldn't say recent, that goes back as far as I can > > remember. > > > > For instance I've seen RHEL3 (2.4.x - sort of) NFS servers > > with load averages in the 80's where they were run with a lot > > of nfsd's that were blocked waiting for I/O due to ext3. > > My impression was always (as there is a similar setting for the > load_threshold in OGE), that it should limit the number of jobs on a big > SMP machine when you oversubscribe by intention, as not all parallel jobs > are really using all the CPU power over their lifetime (maybe such a > machine was even operated w/o any NFS). Then allowing e.g. 72 slots for > jobs on a 60 core maschine might get most out of it with a load near 100%. > > Well, getting now 12 cores in newer CPUs and assemble them to 24 or 48 core > machines would make such a setting useful again. Maybe the load sensor > should honor only the scheduled jobs' load. > > -- Reuti > > > cheers! > > Chris I believe that the load threshold should be set depending on the type of jobs you run on your compute nodes. In some cases the load is not linked only to disk/network I/O and CPU; sometimes the jobs do a lot of in-memory changes, which bring more weight than the actual CPU or disk/network I/O. So, for example, a load average of 15 can also be considered normal, as long as the system is still responsive and job run times don't degrade. -- Best regards, Marian Marinov -------------- next part -------------- A non-text attachment was scrubbed...
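Marian's point is easier to act on if the warning script looks at what the load average actually counts. On Linux the figure tracks tasks that are runnable (R state) plus tasks in uninterruptible sleep (D state), so a high load with many D tasks means blocked I/O, not CPU pressure. A minimal Linux-only sketch (reads /proc; the actual threshold policy is left to the reader):

```python
import glob

def loadavg():
    """Parse the 1/5/15-minute load averages from /proc/loadavg."""
    with open("/proc/loadavg") as f:
        one, five, fifteen = (float(x) for x in f.read().split()[:3])
    return one, five, fifteen

def count_r_and_d_tasks():
    """Count tasks in R (runnable) or D (uninterruptible sleep) state --
    the two states that contribute to the Linux load average."""
    counts = {"R": 0, "D": 0}
    for stat in glob.glob("/proc/[0-9]*/stat"):
        try:
            with open(stat) as f:
                # /proc/<pid>/stat is "pid (comm) STATE ..."; split from the
                # right so a ')' inside the command name cannot confuse us.
                state = f.read().rsplit(")", 1)[1].split()[0]
        except OSError:
            continue  # process exited while we were scanning
        if state in counts:
            counts[state] += 1
    return counts

if __name__ == "__main__":
    one, five, fifteen = loadavg()
    counts = count_r_and_d_tasks()
    print(f"load averages: {one} {five} {fifteen}")
    print(f"runnable (R): {counts['R']}, uninterruptible (D): {counts['D']}")
```

A warning script built on this can distinguish "load 15 because 15 ranks are computing" (fine, per Marian) from "load 15 because 15 nfsd's are stuck in D state" (worth an alert, per Chris's RHEL3 anecdote).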
Name: signature.asc Type: application/pgp-signature Size: 198 bytes Desc: This is a digitally signed message part. URL: From reuti at staff.uni-marburg.de Wed Sep 1 04:27:44 2010 From: reuti at staff.uni-marburg.de (Reuti) Date: Wed, 1 Sep 2010 13:27:44 +0200 Subject: [Beowulf] When is compute-node load-average "high" in the HPC context? Setting correct thresholds on a warning script. In-Reply-To: <201009011315.58866.mm@yuhu.biz> References: <4C7E01FD.10104@unimelb.edu.au> <201009011315.58866.mm@yuhu.biz> Message-ID: <78D212A2-76B3-4DC7-8A2A-3BA118FF4318@staff.uni-marburg.de> Am 01.09.2010 um 12:15 schrieb Marian Marinov: > On Wednesday 01 September 2010 11:47:29 Reuti wrote: >> Am 01.09.2010 um 09:34 schrieb Christopher Samuel: >>> -----BEGIN PGP SIGNED MESSAGE----- >>> Hash: SHA1 >>> >>> On 01/09/10 01:58, Reuti wrote: >>>> With recent kernels also (kernel) processes in D state >>>> count as running. >>> >>> I wouldn't say recent, that goes back as far as I can >>> remember. >>> >>> For instance I've seen RHEL3 (2.4.x - sort of) NFS servers >>> with load averages in the 80's where they were run with a lot >>> of nfsd's that were blocked waiting for I/O due to ext3. >> >> My impression was always (as there is a similar setting for the >> load_threshold in OGE), that it should limit the number of jobs on a big >> SMP machine when you oversubscribe by intention, as not all parallel jobs >> are really using all the CPU power over their lifetime (maybe such a >> machine was even operated w/o any NFS). Then allowing e.g. 72 slots for >> jobs on a 60 core maschine might get most out of it with a load near 100%. >> >> Well, getting now 12 cores in newer CPUs and assemble them to 24 or 48 core >> machines would make such a setting useful again. Maybe the load sensor >> should honor only the scheduled jobs' load. >> >> -- Reuti >> >>> cheers! >>> Chris > > I believe that the load threshold should be set depending on the type of jobs > you run on your compute nodes. 
> > In some cases the load is not linked only to disk/network I/O and CPU, > sometimes the jobs do a lot of in memory changes which bring more weight I thought the load is just the number of processes that are eligible to run, plus, these days, those in D state. But a single serial process w/o threads or forks shouldn't get the load over 1 by writing a lot to memory. -- Reuti > then > the actual CPU or disk/network I/O. So for example a load average of 15 can > also be considered for normal load, as far as the system is still responsive > and the jobs time don't degrade. > > -- > Best regards, > Marian Marinov > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From fumie.costen at manchester.ac.uk Wed Sep 1 05:18:19 2010 From: fumie.costen at manchester.ac.uk (Fumie Costen) Date: Wed, 01 Sep 2010 13:18:19 +0100 Subject: [Beowulf] GPU In-Reply-To: <201009011315.58866.mm@yuhu.biz> References: <4C7E01FD.10104@unimelb.edu.au> <201009011315.58866.mm@yuhu.biz> Message-ID: <4C7E448B.805@manchester.ac.uk> Dear All, I believe some of you have come across the phrase "GPU computation". I have remote access to a GPU cluster, but the speed of data transfer between the GPUs seems to be pretty slow judging from the specification, and I feel this is going to be a serious bottleneck for large-scale computation with GPUs. Even if we tackle this particular problem by overlapping communication and computation, which would take a significant amount of time, a new-spec GPU cluster could be available by the time we try to publish even a conference paper, and all of our effort would be wasted without leading to a publication. Furthermore, just porting our current code to GPU won't give us journal papers, I guess.
I would be grateful if anybody who has experience in GPU computation could share that experience with me. I am in the middle of preparing the next PhD topics, and the topic has to be productive from the perspective of journal publication. Thank you very much, Fumie From michf at post.tau.ac.il Wed Sep 1 08:17:15 2010 From: michf at post.tau.ac.il (Micha) Date: Wed, 01 Sep 2010 18:17:15 +0300 Subject: [Beowulf] GPU In-Reply-To: <4C7E448B.805@manchester.ac.uk> References: <4C7E01FD.10104@unimelb.edu.au> <201009011315.58866.mm@yuhu.biz> <4C7E448B.805@manchester.ac.uk> Message-ID: <4C7E6E7B.9090300@post.tau.ac.il> On 01/09/2010 15:18, Fumie Costen wrote: > Dear All, I believe some of you have come across the phrase "GPU > computation". > I have an access to GPU cluster remotely but the speed of data transfer > between the GPUs seems to be pretty slow from the specification and I > feel this > is going to be > the serious bottle neck of the large scale computation with GPUs. > Even when we do tackle this particular problem > using the overlap of communication and computation somehow > spending some significant amount of time, > by the time when we try to publish even a conference paper, > the new-spec GPU cluster could be available and all of our > effort would be wasted and can not lead to the publication. > Furthermore, just porting our current code to GPU won't give > us journal papers, I guess. > I would be grateful if there are anybody who > have any experience in GPU computation and can share > the experience with me. > I am in the middle of production of the next PhD topics > and the topic has to be productive from the perspective of > journal publication. > GPUs have to communicate with CPU memory over PCIe, which is indeed a bottleneck. You have two modes of operation, standard (pageable) and pinned memory (which allows DMA).
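Micha's two transfer modes and the small-buffer penalty can be captured in a toy latency/bandwidth model. This is pure Python, not GPU code; the peak bandwidths are round numbers of the kind quoted in this thread, and the 10-microsecond per-transfer overhead is an assumption, not a measured value. The second function sketches why overlapping copies with compute (Fumie's concern above) at best hides the cheaper of the two phases:

```python
def transfer_time(nbytes, peak_bw, overhead=10e-6):
    """Latency + size/bandwidth model of a single PCIe transfer."""
    return overhead + nbytes / peak_bw

def effective_bw(nbytes, peak_bw, overhead=10e-6):
    """Achieved bytes/s once the fixed per-transfer overhead is included."""
    return nbytes / transfer_time(nbytes, peak_bw, overhead)

PAGEABLE = 2.5e9  # assumed ~2.5 GB/s for pageable ("standard") memory
PINNED = 5.5e9    # assumed ~5.5 GB/s for pinned memory

# Small transfers are dominated by the fixed overhead; large ones
# approach the peak rate -- the effect Micha describes below.
for size in (4 << 10, 1 << 20, 64 << 20):
    print(f"{size:>9} B pinned: {effective_bw(size, PINNED) / 1e9:5.2f} GB/s")

def pipelined_time(t_copy, t_comp, chunks):
    """Two-stage pipeline: copy chunk i+1 while computing on chunk i.
    With many chunks the total tends toward max(t_copy, t_comp)."""
    per = (t_copy / chunks, t_comp / chunks)
    return min(per) + chunks * max(per)

serial = 1.0 + 3.0                       # copy 1 s, compute 3 s, no overlap
overlapped = pipelined_time(1.0, 3.0, 16)
print(f"serial {serial:.2f} s vs overlapped {overlapped:.2f} s")
```

The model also shows the limit of the overlap trick: if copy and compute cost about the same, overlap can at best halve the total, and if copies dominate outright, no amount of pipelining helps.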
On a core-2 my experience is that max upload download speed is around 2-3 GB/s (depending on mode) for large data sets (you reach top speed around 2 MB or s). For small buffers it can go as low as 0.3GB/s. I know of people who reached around 5.5GB/s but I believe that is on a Nehalem machine with pinned memory. You can do concurrent copy and execute to hide copy time, but you need a long enough running kernel for that to work. Another thing to note is that starting with Cuda 3.1 NVIDIA did some work to allow direct transfer of data to infiniband which can reduce the CPU memory middleman As for research, I don't have enough experience with HPC papers, but you should always remember to compare results in the papers for comparable hardware. As for GPU related papers, you need to remember that this is a different architecture with different assumptions, so in terms of papers, possible research areas are: 1. Mapping existing algorithms to GPU architectures 2. Scalability of GPU architecture. due to the extra redirection with GPUs and opportunities for concurrent copy and execute it has an extra level of challenge. You can always compare standard approach to adaptive approaches. 3. Development of new techniques that are more appropriate to GPUs. 4. Hybrid algorithms that make good use of both the GPU and the CPU > Thank you very much, > Fumie > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rpnabar at gmail.com Wed Sep 1 08:18:06 2010 From: rpnabar at gmail.com (Rahul Nabar) Date: Wed, 1 Sep 2010 10:18:06 -0500 Subject: [Beowulf] When is compute-node load-average "high" in the HPC context? Setting correct thresholds on a warning script. 
In-Reply-To: References: <29E4598B-4AFA-43ED-A5A8-B241CACCF217@staff.uni-marburg.de> <4C7E01FD.10104@unimelb.edu.au> Message-ID: On Wed, Sep 1, 2010 at 3:47 AM, Reuti wrote: > My impression was always (as there is a similar setting for the load_threshold in OGE), that it should limit the number of jobs on a big SMP machine when you oversubscribe by intention, as not all parallel jobs are really using all the CPU power over their lifetime (maybe such a machine was even operated w/o any NFS). Then allowing e.g. 72 slots for jobs on a 60 core maschine might get most out of it with a load near 100%. Our scheduler is currently set as to never allow over-subscription. Also, we don't allocate shared nodes. Users get resources in 8-core increments. -- Rahul From glykos at mbg.duth.gr Wed Sep 1 12:42:23 2010 From: glykos at mbg.duth.gr (Nicholas M Glykos) Date: Wed, 1 Sep 2010 22:42:23 +0300 (EEST) Subject: [Beowulf] GPU In-Reply-To: <4C7E448B.805@manchester.ac.uk> References: <4C7E01FD.10104@unimelb.edu.au> <201009011315.58866.mm@yuhu.biz> <4C7E448B.805@manchester.ac.uk> Message-ID: Hi Fumie, > I am in the middle of production of the next PhD topics and the topic > has to be productive from the perspective of journal publication. I'll play devil's advocate here, but my understanding is that you can't stop a creative and passionate PhD student from being productive, no matter what the assigned topic is. Unfortunately (and as usually happens with all aphorisms), the inverse statement is also true .-) My twocents, Nicholas -- Dr Nicholas M. Glykos, Department of Molecular Biology and Genetics, Democritus University of Thrace, University Campus, Dragana, 68100 Alexandroupolis, Greece, Tel/Fax (office) +302551030620, Ext.77620, Tel (lab) +302551030615, http://utopia.duth.gr/~glykos/ From lindahl at pbm.com Wed Sep 1 18:17:37 2010 From: lindahl at pbm.com (Greg Lindahl) Date: Wed, 1 Sep 2010 18:17:37 -0700 Subject: [Beowulf] 48-port 10gig switches? 
Message-ID: <20100902011737.GB13598@bx9.net> I'm in the market for 48-port 10gig switches (preferably not a chassis), and was wondering if anyone other than Arista and (soon) Voltaire makes them? Force10 seems to only have a chassis that big? Cisco isn't my favorite vendor anyway. One would think that the availability of a single-chip 48-port 10gig chip would lead to more than just 2 vendors selling 'em. -- greg From david.ritch.lists at gmail.com Wed Sep 1 19:02:59 2010 From: david.ritch.lists at gmail.com (David B. Ritch) Date: Wed, 01 Sep 2010 22:02:59 -0400 Subject: [Beowulf] 48-port 10gig switches? In-Reply-To: <20100902011737.GB13598@bx9.net> References: <20100902011737.GB13598@bx9.net> Message-ID: <4C7F05D3.7060404@gmail.com> Greg, I know you don't like Cisco, but have you looked at the Nexus 5020? It has up to 52 10gig ports. David On 9/1/2010 9:17 PM, Greg Lindahl wrote: > I'm in the market for 48-port 10gig switches (preferably not a > chassis), and was wondering if anyone other than Arista and (soon) > Voltaire makes them? Force10 seems to only have a chassis that big? > Cisco isn't my favorite vendor anyway. One would think that the > availability of a single-chip 48-port 10gig chip would lead to more > than just 2 vendors selling 'em. > > -- greg > > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > From samuel at unimelb.edu.au Wed Sep 1 20:08:04 2010 From: samuel at unimelb.edu.au (Christopher Samuel) Date: Thu, 02 Sep 2010 13:08:04 +1000 Subject: [Beowulf] When is compute-node load-average "high" in the HPC context? Setting correct thresholds on a warning script. 
In-Reply-To: References: <29E4598B-4AFA-43ED-A5A8-B241CACCF217@staff.uni-marburg.de> <4C7E01FD.10104@unimelb.edu.au> Message-ID: <4C7F1514.5040300@unimelb.edu.au> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 01/09/10 18:47, Reuti wrote: > My impression was always (as there is a similar setting > for the load_threshold in OGE), that it should limit the > number of jobs on a big SMP machine when you oversubscribe > by intention Ah, I was purely talking about how the kernel counts tasks in I/O wait as being part of the run queue (and thus showing up in the load average). cheers! Chris - -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences Computational Initiative Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545 http://www.vlsci.unimelb.edu.au/ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkx/FRQACgkQO2KABBYQAh+D4ACglhT5LWFpcxW02OzBaLoROXd0 +dEAn2rOqsj1wRF31cbVNI7XvbDcebIh =Ztw5 -----END PGP SIGNATURE----- From tom.ammon at utah.edu Wed Sep 1 21:15:25 2010 From: tom.ammon at utah.edu (Tom Ammon) Date: Wed, 01 Sep 2010 22:15:25 -0600 Subject: [Beowulf] 48-port 10gig switches? In-Reply-To: <20100902011737.GB13598@bx9.net> References: <20100902011737.GB13598@bx9.net> Message-ID: <4C7F24DD.7020209@utah.edu> I hadn't heard about any 48-port 10GbE switch chips. Fulcrum and Dune don't show anything like that on their websites. Where did you hear about 48-port 10G asics? 24-port chips are pretty easy to find, but I hadn't heard about 48-port'ers. Tom On 09/01/2010 07:17 PM, Greg Lindahl wrote: > I'm in the market for 48-port 10gig switches (preferably not a > chassis), and was wondering if anyone other than Arista and (soon) > Voltaire makes them? Force10 seems to only have a chassis that big? > Cisco isn't my favorite vendor anyway. 
One would think that the > availability of a single-chip 48-port 10gig chip would lead to more > than just 2 vendors selling 'em. > > -- greg > > > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf > -- -------------------------------------------------------------------- Tom Ammon Network Engineer Office: 801.587.0976 Mobile: 801.674.9273 Center for High Performance Computing University of Utah http://www.chpc.utah.edu From lindahl at pbm.com Wed Sep 1 23:00:05 2010 From: lindahl at pbm.com (Greg Lindahl) Date: Wed, 1 Sep 2010 23:00:05 -0700 Subject: [Beowulf] 48-port 10gig switches? In-Reply-To: <4C7F24DD.7020209@utah.edu> References: <20100902011737.GB13598@bx9.net> <4C7F24DD.7020209@utah.edu> Message-ID: <20100902060005.GJ27021@bx9.net> Press about the new Voltaire 6048 48p 10g switch indicates that it's a single switch chip: http://www.theregister.co.uk/2010/08/30/voltaire_vantage_6048/ Arista seems to have a similar product at a similarish list price, and that list price is a lot less than chassis switches using 24p silicon. Fujitsu isn't selling a 48p switch, and I'm not up enough on silicon vendors to tell you if Fulcrum is still the only other vendor. I used to know this stuff, then I left HPC to build a search engine :-) -- greg On Wed, Sep 01, 2010 at 10:15:25PM -0600, Tom Ammon wrote: > I hadn't heard about any 48-port 10GbE switch chips. Fulcrum and Dune > don't show anything like that on their websites. Where did you hear > about 48-port 10G asics? 24-port chips are pretty easy to find, but I > hadn't heard about 48-port'ers. > > Tom > > On 09/01/2010 07:17 PM, Greg Lindahl wrote: >> I'm in the market for 48-port 10gig switches (preferably not a >> chassis), and was wondering if anyone other than Arista and (soon) >> Voltaire makes them? 
Force10 seems to only have a chassis that big? >> Cisco isn't my favorite vendor anyway. One would think that the >> availability of a single-chip 48-port 10gig chip would lead to more >> than just 2 vendors selling 'em. >> >> -- greg >> >> >> >> _______________________________________________ >> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing >> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf >> > > -- > -------------------------------------------------------------------- > Tom Ammon > Network Engineer > Office: 801.587.0976 > Mobile: 801.674.9273 > > Center for High Performance Computing > University of Utah > http://www.chpc.utah.edu > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From tom.ammon at utah.edu Wed Sep 1 23:26:23 2010 From: tom.ammon at utah.edu (Tom Ammon) Date: Thu, 02 Sep 2010 00:26:23 -0600 Subject: [Beowulf] 48-port 10gig switches? In-Reply-To: <20100902060005.GJ27021@bx9.net> References: <20100902011737.GB13598@bx9.net> <4C7F24DD.7020209@utah.edu> <20100902060005.GJ27021@bx9.net> Message-ID: <4C7F438F.1070903@utah.edu> Interesting. Although, I'm still not convinced it's a single switching asic. The switch chip is, of course, not the only "chip" in the switch. This article says the "networking protocols" run on a single chip. The official Voltaire press release at http://www.voltaire.com/NewsAndEvents/Press_Releases/press2010/Voltaire_Announces_High_Density_10_GbE_Switch_for_Efficient_Scaling_of_Cloud_Networks doesn't say anything about a single switching asic - perhaps the author made an assumption about the product? You'd think they would really tout the fact if they had a single chip that dense. 
Last time I talked with the Arista people, their nonblocking 48 port switch (one of two options for a 48-port switch, IIRC) was not a single chip - it was a non-blocking 6-chip CLOS design. And, I agree, the price was compelling. So I still think there's not a 48 port 10GbE switch chip, at least not in merchant silicon. I don't know much about what cisco is cooking up on 10GbE. I know Juniper was rebranding BNT (which was fulcrum-based). I also heard about Extreme's top of rack 10GbE but it was only 24 ports - you have to stack two of them together to get 48 ports. So my answer to your original question is that since there's not single-chip 48p, you still have to chain together 24-port chips to get line-rate 10GbE performance. I'm happy to be corrected, of course - but a seemingly misguided statement in an article in the trade press doesn't seem like a very good product announcement for an innovation like that. Tom On 09/02/2010 12:00 AM, Greg Lindahl wrote: > Press about the new Voltaire 6048 48p 10g switch indicates that it's a > single switch chip: > > http://www.theregister.co.uk/2010/08/30/voltaire_vantage_6048/ > > Arista seems to have a similar product at a similarish list price, and > that list price is a lot less than chassis switches using 24p silicon. > > Fujitsu isn't selling a 48p switch, and I'm not up enough on silicon > vendors to tell you if Fulcrum is still the only other vendor. I > used to know this stuff, then I left HPC to build a search engine :-) > > -- greg > > On Wed, Sep 01, 2010 at 10:15:25PM -0600, Tom Ammon wrote: > >> I hadn't heard about any 48-port 10GbE switch chips. Fulcrum and Dune >> don't show anything like that on their websites. Where did you hear >> about 48-port 10G asics? 24-port chips are pretty easy to find, but I >> hadn't heard about 48-port'ers. 
>> >> Tom >> >> On 09/01/2010 07:17 PM, Greg Lindahl wrote: >> >>> I'm in the market for 48-port 10gig switches (preferably not a >>> chassis), and was wondering if anyone other than Arista and (soon) >>> Voltaire makes them? Force10 seems to only have a chassis that big? >>> Cisco isn't my favorite vendor anyway. One would think that the >>> availability of a single-chip 48-port 10gig chip would lead to more >>> than just 2 vendors selling 'em. >>> >>> -- greg >>> >>> >>> >>> _______________________________________________ >>> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing >>> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf >>> >>> >> -- >> -------------------------------------------------------------------- >> Tom Ammon >> Network Engineer >> Office: 801.587.0976 >> Mobile: 801.674.9273 >> >> Center for High Performance Computing >> University of Utah >> http://www.chpc.utah.edu >> >> _______________________________________________ >> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing >> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf >> -- -------------------------------------------------------------------- Tom Ammon Network Engineer Office: 801.587.0976 Mobile: 801.674.9273 Center for High Performance Computing University of Utah http://www.chpc.utah.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: From mathog at caltech.edu Thu Sep 2 09:58:19 2010 From: mathog at caltech.edu (David Mathog) Date: Thu, 02 Sep 2010 09:58:19 -0700 Subject: [Beowulf] 48-port 10gig switches? Message-ID: A lot of 1 GbE switches use around 15W/port so I thought 10 GbE switches would be real fire breathers. 
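Tom's "non-blocking 6-chip CLOS design" from 24-port silicon reduces to simple arithmetic; a sketch (the 12-down/12-up leaf split is the standard non-blocking folded-Clos assumption, not something spelled out in the thread):

```python
def folded_clos(chip_ports, leaves):
    """Size a non-blocking two-tier (leaf/spine) Clos fabric built from
    identical switch chips. Each leaf devotes half its ports to hosts and
    half to spine uplinks, which is what makes the fabric non-blocking."""
    down = chip_ports // 2                  # host-facing ports per leaf
    up = chip_ports - down                  # uplinks per leaf
    spines = -(-leaves * up // chip_ports)  # ceil(total uplinks / chip ports)
    return leaves * down, leaves + spines   # (external ports, total chips)

ports, chips = folded_clos(chip_ports=24, leaves=4)
print(f"{ports} non-blocking ports from {chips} chips")  # 48 ports, 6 chips
```

Four leaves and two spines is exactly the six-chip count Tom reports for the Arista box, which is why a genuine single-chip 48-port ASIC would indeed be worth touting.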
It doesn't look that way though, the power consumption cited here: http://www.voltaire.com/NewsAndEvents/Press_Releases/press2010/Voltaire_Announces_High_Density_10_GbE_Switch_for_Efficient_Scaling_of_Cloud_Networks is "the industry?s lowest power consumption of 6.3 watts/port" or 302W. I wonder what tricks they used to increase the speed to 10 GbE and drop the power consumption. In any case, in a 1U chassis that's still enough power to cook something if the ventilation shuts down, so it is a good thing it has redundant fans. I could not find a price for the Voltaire, but it isn't going to be cheap. Somewhere above $20K maybe? Regards, David Mathog mathog at caltech.edu Manager, Sequence Analysis Facility, Biology Division, Caltech From atchley at myri.com Thu Sep 2 10:54:10 2010 From: atchley at myri.com (Scott Atchley) Date: Thu, 2 Sep 2010 13:54:10 -0400 Subject: [Beowulf] 48-port 10gig switches? In-Reply-To: References: Message-ID: On Sep 2, 2010, at 12:58 PM, David Mathog wrote: > A lot of 1 GbE switches use around 15W/port so I thought 10 GbE switches > would be real fire breathers. It doesn't look that way though, the > power consumption cited here: > > http://www.voltaire.com/NewsAndEvents/Press_Releases/press2010/Voltaire_Announces_High_Density_10_GbE_Switch_for_Efficient_Scaling_of_Cloud_Networks > > is "the industry?s lowest power consumption of 6.3 watts/port" or 302W. > I wonder what tricks they used to increase the speed to 10 GbE and drop > the power consumption. Not using 10GBase-T? It uses SFP+ which do not use much power. Scott From fumie.costen at manchester.ac.uk Mon Sep 6 02:33:21 2010 From: fumie.costen at manchester.ac.uk (Fumie Costen) Date: Mon, 06 Sep 2010 10:33:21 +0100 Subject: [Beowulf] a 6month job starting in this October Message-ID: <4C84B561.4070205@manchester.ac.uk> Dear All, I am planning to apply for a little grant for our software project. 
But to apply for this funding, I need to nominate a person to go with the application form. If there is anybody who meets the following criteria, please come back to me as soon as possible, as the deadline is very soon: 1. Have a PhD 2. UK or EU nationality, 2.5 currently living in Europe (or able to pay your own relocation costs to Manchester, U.K.) 3. good command of shell scripting in Unix and good command of English 4. deep insight into the finite-difference time-domain method 5. good experience in Fortran programming 6. strong in mathematics 7. good knowledge of mathematical modelling of frequency-dependent materials 8. good experience coding with the Message Passing Interface (MPI) in Fortran Thank you very much Fumie From nick.c.evans at gmail.com Thu Sep 2 16:40:03 2010 From: nick.c.evans at gmail.com (Nick Evans) Date: Fri, 3 Sep 2010 09:40:03 +1000 Subject: [Beowulf] 48-port 10gig switches? In-Reply-To: References: Message-ID: > > I could not find a price for the Voltaire, > but it isn't going to be cheap. Somewhere above $20K maybe? > > I saw in one of their PDFs that it is $500 a port, which works out to $24K, so your guess was close. From cheny at ornl.gov Sat Sep 4 06:07:42 2010 From: cheny at ornl.gov (Chen, Yong) Date: Sat, 04 Sep 2010 09:07:42 -0400 Subject: [Beowulf] [hpc-announce] P2S2 2010 Call for Participation In-Reply-To: References: Message-ID: [We apologize if you receive multiple copies of this call.] Dear Colleagues, The final program for the Third International Workshop on Parallel Programming Models and Systems Software for High-end Computing (P2S2) is now available on the workshop website: http://www.mcs.anl.gov/events/workshops/p2s2 This year the workshop features 10 highly relevant technical talks describing improvements to parallel programming models and systems software in the past year.
The workshop also features a technical panel titled: "Is Hybrid Programming a Bad Idea Whose Time has Come?" where a wide range of high profile panelists in this area will argue on programming issues in the hybrid/heterogeneous computing era. We would like to welcome you all to attend this year's P2S2 workshop and look forward to seeing you on September 13th, in San Diego, California. P2S2-2010 PROGRAM ----------------- Opening Remarks, Time: 08:45am - 09:00am Session 1: Communication, Time: 9:00am - 10:30am Session Chair: Vinod Tipparaju, Oak Ridge National Laboratory - "Efficient Zero-Copy Noncontiguous I/O for Globus on InfiniBand", Weikuan Yu and Jeffrey Vetter - "Scaling Linear Algebra Kernels using Remote Memory Access", Manojkumar Krishnan, Robert Lewis and Abhinav Vishnu - "High Performance Design and Implementation of Nemesis Communication Layer for Two-sided and One-Sided MPI Semantics in MVAPICH2", Miao Luo, Sreeram Potluri, Ping Lai, Emilio P. Mancini, Hari Subramoni, Krishna Kandalla, Sayantan Sur and Dhabaleswar K. Panda. Session 2: Panel: Is Hybrid Programming a Bad Idea Whose Time has Come? 
Time: 11:00am - 12:30pm Panel Moderator: Pavan Balaji, Argonne National Laboratory Panelists: Bronis de Supinski, Lawrence Livermore National Laboratory Wu-chun Feng, Virginia Tech Allen Maloney, University of Oregon Taisuke Boku, Tsukuba University, Japan Vijay Saraswat, IBM Research Session 3: Programming Models and Performance Evaluation, Time: 1:30pm - 3:30pm Session Chair: Hui Jin, Illinois Institute of Technology - "Performance Modeling for AMD GPUs", Ryan Taylor and Xiaoming Li - "A Hybrid Programming Model for Compressible Gas Dynamics using OpenCL", Ben Bergen, Marcus Daniels and Paul Weber - "Message Driven Programming with S-Net: Methodology and Performance", Frank Penczek, Sven-Bodo Scholz, Alex Shafarenko, Chun-Yi Chen, Nader Bagherzadeh, Clemens Grelck and JungSook Yang - "Implementation and Performance Evaluation of XcalableMP: A Parallel Programming Language for Distributed Memory Systems", Jinpil Lee and Mitsuhisa Sato Session 4: Scheduling and Cache Management, Time: 4:00pm - 5:30pm Session Chair: Weikuan Yu, Auburn University - "Scheduling a ~100,000 core Supercomputer for maximum utilization and capability", Phil Andrews, Patricia Kovatch, Victor Hazlewood and Troy Baer. - "Improving the Effectiveness of Context-based Prefetching with Multi-order Analysis", Yong Chen, Huaiyu Zhu, Hui Jin and Xian-He Sun. - "Hierarchical Load Balancing for Large Scale Supercomputers", Gengbin Zheng, Esteban Meneses, Abhinav Bhatele and Laxmikant V. Kale. PROGRAM CHAIRS -------------- * Pavan Balaji, Argonne National Laboratory * Abhinav Vishnu, Pacific Northwest National Laboratory PUBLICITY CHAIR --------------- * Yong Chen, Oak Ridge National Laboratory STEERING COMMITTEE ------------------ * William D. Gropp, University of Illinois Urbana-Champaign * Dhabaleswar K. 
Panda, Ohio State University * Vijay Saraswat, IBM Research PROGRAM COMMITTEE ----------------- * Ahmad Afsahi, Queen's University * George Almasi, IBM Research * Taisuke Boku, Tsukuba University * Ron Brightwell, Sandia National Laboratory * Franck Cappello, INRIA, France * Yong Chen, Oak Ridge National Laboratory * Ada Gavrilovska, Georgia Tech * Torsten Hoefler, Indiana University * Zhiyi Huang, University of Otago, New Zealand * Hyun-Wook Jin, Konkuk University, Korea * Zhiling Lan, Illinois Institute of Technology * Doug Lea, State University of New York at Oswego * Jiuxing Liu, IBM Research * Heshan Lin, Virginia Tech * Guillaume Mercier, INRIA, France * Scott Pakin, Los Alamos National Laboratory * Fabrizio Petrini, IBM Research * Bronis de Supinski, Lawrence Livermore National Laboratory * Sayantan Sur, Ohio State University * Rajeev Thakur, Argonne National Laboratory * Vinod Tipparaju, Oak Ridge National Laboratory * Jesper Traff, NEC, Europe * Weikuan Yu, Auburn University If you have any questions, please contact us at p2s2-chairs at mcs.anl.gov From xingqiuyuan at gmail.com Fri Sep 10 19:46:20 2010 From: xingqiuyuan at gmail.com (xingqiu yuan) Date: Sat, 11 Sep 2010 10:46:20 +0800 Subject: [Beowulf] wall clock time for mpi_allreduce? Message-ID: Hi I found that use of mpi_allreduce to calculate the global maximum and minimum takes very long time, any better alternatives to calculate the global maximum/minimum values? From charliep at cs.earlham.edu Sun Sep 12 05:02:01 2010 From: charliep at cs.earlham.edu (Charlie Peck) Date: Sun, 12 Sep 2010 08:02:01 -0400 Subject: [Beowulf] wall clock time for mpi_allreduce? In-Reply-To: References: Message-ID: <02656782-2350-4C84-9971-A5A4097DEB9C@cs.earlham.edu> On Sep 10, 2010, at 10:46 PM, xingqiu yuan wrote: > Hi > > I found that use of mpi_allreduce to calculate the global maximum and > minimum takes very long time, any better alternatives to calculate the > global maximum/minimum values? 
If only the rank 0 process needs to know the global max and min you can use MPI_Reduce() rather than MPI_Allreduce() which will substantially reduce the communication time. The difference is that with MPI_Reduce() the result of the reduction is not communicated back to the other ranks. charlie From jcownie at cantab.net Sun Sep 12 05:37:16 2010 From: jcownie at cantab.net (James Cownie) Date: Sun, 12 Sep 2010 13:37:16 +0100 Subject: [Beowulf] wall clock time for mpi_allreduce? In-Reply-To: References: Message-ID: <53A0D73E-0565-4355-B8BD-EBD3551A875E@cantab.net> On 11 Sep 2010, at 03:46, xingqiu yuan wrote: > Hi > > I found that use of mpi_allreduce to calculate the global maximum and > minimum takes very long time, any better alternatives to calculate the > global maximum/minimum values? Before pinning the blame on allreduce, are you sure that you're not seeing the effects of load imbalance? How are you measuring the time for the reduction? Are you measuring the time at a single node, or at every node? (The reduction can't complete until all the nodes "check in"...) Have you looked at the allreduce time if you insert a barrier before the reduction? (That won't help your overall performance, but may make it clear where the problem really is...) -- -- Jim -- James Cownie -------------- next part -------------- An HTML attachment was scrubbed... URL: From landman at scalableinformatics.com Sun Sep 12 05:52:53 2010 From: landman at scalableinformatics.com (Joe Landman) Date: Sun, 12 Sep 2010 08:52:53 -0400 Subject: [Beowulf] wall clock time for mpi_allreduce? In-Reply-To: References: Message-ID: <4C8CCD25.9050704@scalableinformatics.com> On 09/10/2010 10:46 PM, xingqiu yuan wrote: > Hi > > I found that use of mpi_allreduce to calculate the global maximum and > minimum takes very long time, any better alternatives to calculate the > global maximum/minimum values? There are several variations on this theme you can try, and some might work better than others. 
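None of the posts in this thread include code, so here is a hedged sketch of the reduction being discussed, plus one point worth adding: the global max and min can come out of a *single* collective by reducing the pair (local_max, -local_min) with MAX. With real MPI this would be one MPI_Allreduce with MPI_MAX (or MPI_Reduce if, as Charlie notes, only rank 0 needs the answer); the code below merely emulates the reduction serially so the idea is runnable anywhere, and all the data and rank counts are made up for illustration.

```python
# Hypothetical sketch: emulate an MPI-style elementwise MAX reduction in
# plain Python. With mpi4py the whole thing would be roughly one call:
#   gmax, neg_gmin = comm.allreduce((max(v), -min(v)), op=MPI.MAX)
def allreduce_max(per_rank_values):
    """Elementwise MAX over every 'rank's contribution; all ranks see the result."""
    return [max(column) for column in zip(*per_rank_values)]

def global_max_min(local_values_per_rank):
    """Global max AND min from a single MAX reduction, by negating the mins."""
    contribs = [(max(v), -min(v)) for v in local_values_per_rank]
    gmax, neg_gmin = allreduce_max(contribs)
    return gmax, -neg_gmin

if __name__ == "__main__":
    ranks = [[3.5, -2.0], [7.25, 0.5], [1.0, 4.0]]  # made-up per-rank data
    print(global_max_min(ranks))  # -> (7.25, -2.0)
```

The payoff is that one reduction of two values replaces two separate reductions, which also halves the number of synchronization points Joe is warning about below.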
All will be more verbose than the allreduce repeated vector reductions. 1) Take M-vectors of length N so your vector you are reducing (index as 1:N in F90, or 0:N-1 in C/C++) and do a maximum and minimum reduction. 2) Take vectors of length 2, and use pair reductions. Every iteration you have 1/2 of the previous generation. Would require something on the order of log_2(Vector_length) iterations. This said, while allreduce is a collective and something of a heavyweight operation, you might be dealing with slowness due to something else. I'd suggest some careful measurements of the time between some timing calipers to help you determine where things are spending time. Allreduce and other collectives do require synchronization, so if something is delaying the synchronization, then it will appear slower. -- Joseph Landman, Ph.D Founder and CEO Scalable Informatics, Inc. email: landman at scalableinformatics.com web : http://scalableinformatics.com http://scalableinformatics.com/jackrabbit phone: +1 734 786 8423 x121 fax : +1 866 888 3112 cell : +1 734 612 4615 From eugen at leitl.org Wed Sep 15 02:05:44 2010 From: eugen at leitl.org (Eugen Leitl) Date: Wed, 15 Sep 2010 11:05:44 +0200 Subject: [Beowulf] How to make a BeagleBoard Elastic R Beowulf Cluster in a Briefcase Message-ID: <20100915090543.GB14773@leitl.org> http://antipastohw.blogspot.com/2010/09/how-to-make-beagleboard-elastic-r.html -- Eugen* Leitl leitl http://leitl.org ______________________________________________________________ ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org 8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE From stuart at cyberdelix.net Thu Sep 16 08:52:02 2010 From: stuart at cyberdelix.net (lsi) Date: Thu, 16 Sep 2010 16:52:02 +0100 Subject: [Beowulf] How to make a BeagleBoard Elastic R Beowulf Cluster in a Briefcase In-Reply-To: <20100915090543.GB14773@leitl.org> References: <20100915090543.GB14773@leitl.org> Message-ID: 
<4C923D22.18367.C1525B@stuart.cyberdelix.net> Cute, but my question is, what use is one of these homegrown platforms? Certainly if it was commercialised that would be a beasty compute appliance... but that's not my question - I'm asking, what is the role of the home hacker in the HPC world? I mean, it's fine to go and make one of these things, but once you've made it, what do you use it for? I ask as I presently have a "grid engine in a briefcase" sitting idle in my cupboard, fun to make but as I have no datasets to crunch, it's not even particularly good-looking eye candy! I joined this list to get the answer to this question... Stu On 15 Sep 2010 at 11:05, Eugen Leitl wrote: Date sent: Wed, 15 Sep 2010 11:05:44 +0200 From: Eugen Leitl To: Beowulf at beowulf.org Copies to: Subject: [Beowulf] How to make a BeagleBoard Elastic R Beowulf Cluster in a Briefcase > > http://antipastohw.blogspot.com/2010/09/how-to-make-beagleboard-elastic-r.html > > -- > Eugen* Leitl leitl http://leitl.org > ______________________________________________________________ > ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org > 8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf --- Stuart Udall stuart at at cyberdelix.dot net - http://www.cyberdelix.net/ --- * Origin: lsi: revolution through evolution (192:168/0.2) From james.p.lux at jpl.nasa.gov Thu Sep 16 10:21:06 2010 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Thu, 16 Sep 2010 10:21:06 -0700 Subject: [Beowulf] How to make a BeagleBoard Elastic R Beowulf Cluster in a Briefcase In-Reply-To: <4C923D22.18367.C1525B@stuart.cyberdelix.net> References: <20100915090543.GB14773@leitl.org> <4C923D22.18367.C1525B@stuart.cyberdelix.net> Message-ID: Why, having a Beowulf 
Cluster at home is the modern equivalent of "want to come and see my etchings" of the 50s and 60s..You haven't noticed the human eye candy at the local bistro following you around when you mention you have a grid engine in your briefcase? But more realistically.. you do it to say you've done it. There *is* a certain amount of coolness to it. And, there are some easily partitionable problems to run on such a thing. I've used a small toy cluster to run multiple runs of antenna models using NEC (although, I confess that when I upgraded my desktop computer to be faster, it actually got to be easier to let the multiple cores of the PC grind on it)... There is a weird sort of problem when you look at low power/compact computing which tends to result in clusters with small numbers of nodes... the market moves fast enough that the single processor can beat the cluster pretty quickly (much more so than in big clusters... a 4x faster portable computer beats the 8 node portable cluster, but the same does not apply to a 1000 node cluster.. ) So what you need are problems that are computationally complex enough to need many nodes AND also need a low power cluster solution (for packaging reasons). I did have such a problem about 5-6 years ago that I was funded to work on (distributed processing in a phased array antenna to calibrate out variations in shape/performance) for a couple years, but ultimately, nobody needed an antenna with that performance, so, while it was cool, it didn't have a customer. I have thought that one good application for this sort of thing would be field processing of seismic or other geophysical data (conductivity, soil Electromagnetic properties), however, there you are competing against the other system design of "create high bandwidth data link and send data to somewhere to be processed". 
Cheap communications can change a lot of things: Who would have thought that it would be cheaper/easier/better to fly remote controlled airplanes from halfway around the world than from somewhere local. Turns out, once you have a radio that can send the data back and forth, say, 10km, it's no more difficult to do it via a satellite and then send it anywhere you want. (Latency *is* an issue... viz the discussion on Slashdot this morning about pigeons carrying microSD cards vs rural broadband in the UK... the "station wagon full of tapes" popped up quickly, along with the inevitable discussions about why weren't they using swallows, etc.) There are a lot of Embarrassingly Parallel (EP) tasks that could be useful to impecunious field scientists: What about taking 3D scans of pot sherds and figuring out how to put them together? And maybe the "stone souper computer" approach of using heterogeneous surplus computers provides a non-capital-intensive way to get it done (since sending lots of data via satellite is expensive, and sometimes personally risky, in some places in the world) Jim Lux > -----Original Message----- > From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of lsi > Sent: Thursday, September 16, 2010 8:52 AM > To: Eugen Leitl; Beowulf at beowulf.org > Subject: Re: [Beowulf] How to make a BeagleBoard Elastic R Beowulf Cluster in a Briefcase > > Cute, but my question is, what use is one of these homegrown > platforms? > > Certainly if it was commercialised that would be a beasty compute > appliance... but that's not my question - I'm asking, what is the > role of the home hacker in the HPC world? > > I mean, it's fine to go and make one of these things, but once you've > made it, what do you use it for? > > I ask as I presently have a "grid engine in a briefcase" sitting idle > in my cupboard, fun to make but as I have no datasets to crunch, it's > not even particularly good-looking eye candy! 
> > I joined this list to get the answer to this question... > > Stu > > On 15 Sep 2010 at 11:05, Eugen Leitl wrote: > > Date sent: Wed, 15 Sep 2010 11:05:44 +0200 > From: Eugen Leitl > To: Beowulf at beowulf.org > Copies to: Subject: [Beowulf] How to make a BeagleBoard > Elastic R Beowulf Cluster in a > Briefcase > > > > > http://antipastohw.blogspot.com/2010/09/how-to-make-beagleboard-elastic-r.html > > > > -- > > Eugen* Leitl leitl http://leitl.org > > ______________________________________________________________ > > ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org > > 8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE > > _______________________________________________ > > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > > > > --- > Stuart Udall > stuart at at cyberdelix.dot net - http://www.cyberdelix.net/ > > --- > * Origin: lsi: revolution through evolution (192:168/0.2) > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf From peter.st.john at gmail.com Thu Sep 16 10:21:46 2010 From: peter.st.john at gmail.com (Peter St. John) Date: Thu, 16 Sep 2010 13:21:46 -0400 Subject: [Beowulf] How to make a BeagleBoard Elastic R Beowulf Cluster in a Briefcase In-Reply-To: <4C923D22.18367.C1525B@stuart.cyberdelix.net> References: <20100915090543.GB14773@leitl.org> <4C923D22.18367.C1525B@stuart.cyberdelix.net> Message-ID: Stuart, There are two inter-related but distinguishable issues here; why build a system at home? and what can you do with a (HPC) system at home? I'm going to take you literally and just address the 2nd question. "Home Hacking" can be used by Home Science or Home Mathematics. 
Examples of research projects that are organized by professional researchers, but can be joined by anyone with a home computer who wishes to contribute whatever idle CPU time he's got, is at this list: http://en.wikipedia.org/wiki/List_of_distributed_computing_projects The most familiar one nowadays is "Folding at Home" but there were mathematicians doing arithmetic algebraic geometry this way in the early 90's. If you like building computers but don't have any use for them, send them to me :-) Peter (Ersatz home computational mathematician) On Thu, Sep 16, 2010 at 11:52 AM, lsi wrote: > Cute, but my question is, what use is one of these homegrown > platforms? > > Certainly if it was commercialised that would be a beasty compute > appliance... but that's not my question - I'm asking, what is the > role of the home hacker in the HPC world? > > I mean, it's fine to go and make one of these things, but once you've > made it, what do you use it for? > > I ask as I presently have a "grid engine in a briefcase" sitting idle > in my cupboard, fun to make but as I have no datasets to crunch, it's > not even particularly good-looking eye candy! > > I joined this list to get the answer to this question... 
> > Stu > > On 15 Sep 2010 at 11:05, Eugen Leitl wrote: > > Date sent: Wed, 15 Sep 2010 11:05:44 +0200 > From: Eugen Leitl > To: Beowulf at beowulf.org > Copies to: Subject: [Beowulf] How to make a > BeagleBoard > Elastic R Beowulf Cluster in a > Briefcase > > > > > > http://antipastohw.blogspot.com/2010/09/how-to-make-beagleboard-elastic-r.html > > > > -- > > Eugen* Leitl leitl http://leitl.org > > ______________________________________________________________ > > ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org > > 8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE > > _______________________________________________ > > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > > > > --- > Stuart Udall > stuart at at cyberdelix.dot net - http://www.cyberdelix.net/ > > --- > * Origin: lsi: revolution through evolution (192:168/0.2) > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From deadline at eadline.org Thu Sep 16 13:25:13 2010 From: deadline at eadline.org (Douglas Eadline) Date: Thu, 16 Sep 2010 16:25:13 -0400 (EDT) Subject: [Beowulf] How to make a BeagleBoard Elastic R Beowulf Cluster in a Briefcase In-Reply-To: <4C923D22.18367.C1525B@stuart.cyberdelix.net> References: <20100915090543.GB14773@leitl.org> <4C923D22.18367.C1525B@stuart.cyberdelix.net> Message-ID: <59655.192.168.93.213.1284668713.squirrel@mail.eadline.org> As a builder of some cheapo home clusters I would say that software development (owning the reset switch is nice), problem development (staging a small version of a problem before you scale it up), and running real codes (most HPC apps don't scale that well in any case). Notice that it makes sense if you are in HPC already, if you are not, you might be hard pressed to find day-to-day uses for a cluster, though playing with parallel cellular automata and genetic algorithms can be fun. BTW, my next home cluster is going to be 18 cores (AMD 2.6GHz) in a single PS and tower case. Best cluster in the neighborhood! Of course I'm still trying to build my HAL 9000 clone. -- Doug > Cute, but my question is, what use is one of these homegrown > platforms? > > Certainly if it was commercialised that would be a beasty compute > appliance... but that's not my question - I'm asking, what is the > role of the home hacker in the HPC world? > > I mean, it's fine to go and make one of these things, but once you've > made it, what do you use it for? > > I ask as I presently have a "grid engine in a briefcase" sitting idle > in my cupboard, fun to make but as I have no datasets to crunch, it's > not even particularly good-looking eye candy! > > I joined this list to get the answer to this question... 
> > Stu > > On 15 Sep 2010 at 11:05, Eugen Leitl wrote: > > Date sent: Wed, 15 Sep 2010 11:05:44 +0200 > From: Eugen Leitl > To: Beowulf at beowulf.org > Copies to: Subject: [Beowulf] How to make a BeagleBoard > Elastic R Beowulf Cluster in a > Briefcase > >> >> http://antipastohw.blogspot.com/2010/09/how-to-make-beagleboard-elastic-r.html >> >> -- >> Eugen* Leitl leitl http://leitl.org >> ______________________________________________________________ >> ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org >> 8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE >> _______________________________________________ >> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing >> To change your subscription (digest mode or unsubscribe) visit >> http://www.beowulf.org/mailman/listinfo/beowulf > > > > --- > Stuart Udall > stuart at at cyberdelix.dot net - http://www.cyberdelix.net/ > > --- > * Origin: lsi: revolution through evolution (192:168/0.2) > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > > -- > This message has been scanned for viruses and > dangerous content by MailScanner, and is > believed to be clean. > -- Doug -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. 
From james.p.lux at jpl.nasa.gov Thu Sep 16 15:40:06 2010 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Thu, 16 Sep 2010 15:40:06 -0700 Subject: [Beowulf] How to make a BeagleBoard Elastic R Beowulf Cluster in a Briefcase In-Reply-To: <59655.192.168.93.213.1284668713.squirrel@mail.eadline.org> References: <20100915090543.GB14773@leitl.org> <4C923D22.18367.C1525B@stuart.cyberdelix.net> <59655.192.168.93.213.1284668713.squirrel@mail.eadline.org> Message-ID: Jim Lux +1(818)354-2075 > -----Original Message----- > From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Douglas Eadline > Sent: Thursday, September 16, 2010 1:25 PM > To: stuart at cyberdelix.net > Cc: beowulf at beowulf.org > Subject: Re: [Beowulf] How to make a BeagleBoard Elastic R Beowulf Cluster in a Briefcase > > > As a builder of some cheapo home clusters I would say that > software development (owning the reset switch is nice), > problem development (staging a small version of a problem > before you scale it up), and running real codes (most > HPC apps don't scale that well in any case). If you were writing proposals to scale up to hundreds of nodes, especially if you are self-funding the proposal work, then having demonstrated it on a cluster at all might lend credibility to your proposal, especially if the proposal evaluators are not cluster-aficionados (so they question the applicability of clusters in general, and are ignorant of the scaling issues) > > > Of course I'm still trying to build my HAL 9000 > clone. > > -- > Doug > I don't think you want to do that, Doug... 
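Doug's aside about parallel cellular automata being fun is in fact a classic first workload for exactly this kind of home cluster: split the row (or grid) across the nodes and swap one-cell "halos" with each neighbour every step. Here is a minimal serial sketch of that decomposition (hypothetical code, names invented for illustration; on a real cluster the two halo reads per chunk become the MPI messages):

```python
# Rule-110 cellular automaton with the row split across "nodes". Each node
# only needs one boundary ("halo") cell from each neighbour per time step.
# Periodic boundaries; purely serial emulation of the parallel layout.
RULE = 110

def step_chunk(chunk, left_halo, right_halo):
    ext = [left_halo] + chunk + [right_halo]
    return [(RULE >> ((ext[i - 1] << 2) | (ext[i] << 1) | ext[i + 1])) & 1
            for i in range(1, len(ext) - 1)]

def parallel_step(chunks):
    """One time step, updating every chunk from its neighbours' halo cells."""
    n = len(chunks)
    return [step_chunk(chunk,
                       chunks[(r - 1) % n][-1],   # halo from left neighbour
                       chunks[(r + 1) % n][0])    # halo from right neighbour
            for r, chunk in enumerate(chunks)]

def serial_step(row):
    """Reference: the same update with no decomposition at all."""
    n = len(row)
    return [(RULE >> ((row[i - 1] << 2) | (row[i] << 1) | row[(i + 1) % n])) & 1
            for i in range(n)]

if __name__ == "__main__":
    row = [0, 0, 0, 0, 0, 0, 0, 1]
    chunks = parallel_step([row[:4], row[4:]])
    assert chunks[0] + chunks[1] == serial_step(row)  # decomposition agrees
```

The assert at the end is the whole point: the decomposed update must agree cell-for-cell with the undecomposed one, which is exactly the property a halo exchange has to preserve once the chunks live on different nodes.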
From stuart at cyberdelix.net Thu Sep 16 17:57:20 2010 From: stuart at cyberdelix.net (lsi) Date: Fri, 17 Sep 2010 01:57:20 +0100 Subject: [Beowulf] How to make a BeagleBoard Elastic R Beowulf Cluster in a Briefcase In-Reply-To: References: <20100915090543.GB14773@leitl.org>, <59655.192.168.93.213.1284668713.squirrel@mail.eadline.org> , Message-ID: <4C92BCF0.1666.2B4D480@stuart.cyberdelix.net> Thanks all for your responses, I built my engine because I imagined it, and wanted to see if it was possible. Problem, I don't know much about HPC apps, and tasks need to be coded up for my platform (I tried to keep it as portable as possible, but it has a some required hooks, etc). So it was as much as I could do to come up with a test job... the engine does work though, pic (this is a memory dump, not a render): http://www.cyberdelix.net/media/retro_fractal_by_lsi.gif Each row of numbers was calculated by a node on the grid (although, because this is home HPC, the grid only had two nodes). Ah yes, so it does have a little eye candy I guess but fractals are a bit gratuitous, I was hoping for a more serious application. I considered hunting for primes or somesuch... but I'm not a mathematician or scientist and I don't understand what, for example, the sieve of Eratosthenes is meant to be outputting, so writing code to run it is very difficult! This is especially problematic because, as is noted below, a demo is an important part of a proposal. I think I need to get a computational science degree, or engage a mathematician or scientist, to progress my project. I was hoping this would be my HAL 9000 but it turns out to need more than an engine, it needs apps too... Thanks for the bistro tip Jim, nice one.. 
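On the sieve of Eratosthenes question above: its output is simply the list of primes up to a chosen limit, produced by crossing off the multiples of each prime as it is found. A minimal serial version (hypothetical code, just to show what the algorithm emits):

```python
def sieve(limit):
    """Primes <= limit, by crossing off multiples of each prime found."""
    is_prime = [True] * (limit + 1)
    is_prime[0:2] = [False] * min(2, limit + 1)   # 0 and 1 are not prime
    for p in range(2, int(limit ** 0.5) + 1):
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
    return [n for n, flag in enumerate(is_prime) if flag]

print(sieve(30))  # -> [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

A grid version typically hands each worker its own sub-range of candidates to cross off, which is why prime hunting turns up so often as a first distributed job.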
:) Stu On 16 Sep 2010 at 15:40, Lux, Jim (337C) wrote: > > From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Douglas Eadline > > As a builder of some cheapo home clusters I would say that > > software development (owning the reset switch is nice), > > problem development (staging a small version of a problem > > before you scale it up), and running real codes (most > > HPC apps don't scale that well in any case). > > If you were writing proposals to scale up to hundreds of nodes, > especially if you are self-funding the proposal work, then having > demonstrated it on a cluster at all might lend credibility to your > proposal, especially if the proposal evaluators are not > cluster-aficionados (so they question the applicability of clusters > in general, and are ignorant of the scaling issues) > > Of course I'm still trying to build my HAL 9000 clone. > I don't think you want to do that, Doug... --- Stuart Udall stuart at at cyberdelix.dot net - http://www.cyberdelix.net/ --- * Origin: lsi: revolution through evolution (192:168/0.2) From john.hearns at mclaren.com Fri Sep 17 04:39:10 2010 From: john.hearns at mclaren.com (Hearns, John) Date: Fri, 17 Sep 2010 12:39:10 +0100 Subject: [Beowulf] A sea of wimpy cores Message-ID: <68A57CCFD4005646957BD2D18E60667B11D96088@milexchmb1.mil.tagmclarengroup.com> http://www.theregister.co.uk/2010/09/17/hotzle_on_brawny_and_wimpy_cores/ I think our own Doug Eadline has been beating the drum (or the alarm bell) about multicore for some time also. "may not be preferable to chips with faster but power-hungry cores" Yeah - bring it on. Specialist computer rooms, Fluorinert cooling, unusual word sizes, and the HPC expert kept in a windowless room full of used coffee mugs. 
John Hearns | CFD Hardware Specialist | McLaren Racing Limited McLaren Technology Centre, Chertsey Road, Woking, Surrey GU21 4YH, UK T: +44 (0) 1483 261000 D: +44 (0) 1483 262352 F: +44 (0) 1483 261010 E: john.hearns at mclaren.com W: www.mclaren.com The contents of this email are confidential and for the exclusive use of the intended recipient. If you receive this email in error you should not copy it, retransmit it, use it or disclose its contents but should return it to the sender immediately and delete your copy. From greg.matthews at diamond.ac.uk Fri Sep 17 05:05:50 2010 From: greg.matthews at diamond.ac.uk (Gregory Matthews) Date: Fri, 17 Sep 2010 13:05:50 +0100 Subject: [Beowulf] How to make a BeagleBoard Elastic R Beowulf Cluster in a Briefcase In-Reply-To: <4C923D22.18367.C1525B@stuart.cyberdelix.net> References: <20100915090543.GB14773@leitl.org> <4C923D22.18367.C1525B@stuart.cyberdelix.net> Message-ID: <4C93599E.4080109@diamond.ac.uk> lsi wrote: > Cute, but my question is, what use is one of these homegrown > platforms? whatever happened to RGB anyway, I miss him... -- Greg Matthews 01235 778658 Senior Computer Systems Administrator Diamond Light Source, Oxfordshire, UK From Bill.Rankin at sas.com Fri Sep 17 06:11:52 2010 From: Bill.Rankin at sas.com (Bill Rankin) Date: Fri, 17 Sep 2010 13:11:52 +0000 Subject: [Beowulf] A sea of wimpy cores In-Reply-To: <68A57CCFD4005646957BD2D18E60667B11D96088@milexchmb1.mil.tagmclarengroup.com> References: <68A57CCFD4005646957BD2D18E60667B11D96088@milexchmb1.mil.tagmclarengroup.com> Message-ID: On Sep 17, 2010, at 7:39 AM, Hearns, John wrote: http://www.theregister.co.uk/2010/09/17/hotzle_on_brawny_and_wimpy_cores Interesting article (more of a letter really) - to be honest when I first scanned it I was not sure of what Holzle's actual argument was. 
To me, he omitted a lot of the details that make this discussion much less black-and-white (and much more interesting) than he would contend: 1) He cites Amdahl's law, but leaves out Gustafson's. 2) Yeah, parallel programming is hard. Deal with it. It helps me pay my bills. 3) While I am not a die hard FLOPS/Watt evangelist, he seems to completely ignore the power and cooling cost when discussing infrastructure costs. The whole letter just seems like it's full of non-sequiturs and just a general gripe about processor architectures. I think our own Doug Eadline has been beating the drum (or the alarm bell) about multicore for some time also. "may not be preferable to chips with faster but power-hungry cores" Yeah - bring it on. Specialist computer rooms, Fluorinert cooling, unusual word sizes, and the HPC expert kept in a windowless room full of used coffee mugs. Wow, now *that's* an image. On the other hand I can't help but wonder if you have just shown us the future of GPU-based computing. :-) Have a good weekend all, -bill From james.p.lux at jpl.nasa.gov Fri Sep 17 06:56:13 2010 From: james.p.lux at jpl.nasa.gov (Lux, Jim (337C)) Date: Fri, 17 Sep 2010 06:56:13 -0700 Subject: [Beowulf] A sea of wimpy cores In-Reply-To: Message-ID: On 9/17/10 6:11 AM, "Bill Rankin" wrote: >> >> >> "may not be preferable to chips with faster but power-hungry cores" >> Yeah - bring it on. Specialist computer rooms, Fluorinert cooling, >> unusual word sizes, and the HPC expert kept in a windowless room full of used >> coffee mugs. > > Wow, now *that's* an image. On the other hand I can't help but wonder if you > have just shown us the future of GPU-based computing. > > :-) --- If you will recall the novel "The First Deadly Sin".. The bad-guy is an IT manager with the computer in a glass temple, with the acolytes wearing clean-room garb. 
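Bill's first point (Amdahl versus Gustafson) is easy to make quantitative. With a serial fraction s on n cores, Amdahl's fixed-size speedup is 1/(s + (1-s)/n), while Gustafson's scaled speedup is n - s(n-1). A toy sketch with hypothetical numbers:

```python
def amdahl_speedup(s, n):
    """Fixed problem size: serial fraction s caps speedup at 1/s."""
    return 1.0 / (s + (1.0 - s) / n)

def gustafson_speedup(s, n):
    """Scaled problem size: the parallel part grows with n (scaled speedup)."""
    return n - s * (n - 1)

if __name__ == "__main__":
    s, n = 0.10, 48          # hypothetical: 10% serial code, 48 wimpy cores
    print(round(amdahl_speedup(s, n), 1))     # -> 8.4
    print(round(gustafson_speedup(s, n), 1))  # -> 43.3
```

Same 10% serial code: a hard ceiling of 10x under Amdahl's fixed-size assumption, but near-linear speedup once the problem grows with the machine — which is why leaving Gustafson out tilts the wimpy-cores argument.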
From stewart at serissa.com Fri Sep 17 07:53:11 2010 From: stewart at serissa.com (Lawrence Stewart) Date: Fri, 17 Sep 2010 10:53:11 -0400 Subject: [Beowulf] A sea of wimpy cores In-Reply-To: References: <68A57CCFD4005646957BD2D18E60667B11D96088@milexchmb1.mil.tagmclarengroup.com> Message-ID: On Sep 17, 2010, at 9:11 AM, Bill Rankin wrote: > > On Sep 17, 2010, at 7:39 AM, Hearns, John wrote: > >> http://www.theregister.co.uk/2010/09/17/hotzle_on_brawny_and_wimpy_cores > > Interesting article (more of a letter really) - to be honest when I first scanned it I was not sure of what Holzle's actual argument was. To me, he omitted a lot of the details that makes this discussion much less black-and-white (and much more interesting) than he would contend: > > 1) He cites Amdahls' law, but leaves out Gustafson's. > > 2) Yeah, parallel programming is hard. Deal with it. It's helps me pay my bills. > > 3) While I am not a die hard FLOPS/Watt evangelist, he seems to completely ignore the power and cooling cost when discussing infrastructure costs. > > The whole letter just seems like it's full of non-sequiturs and just a general gripe about processor architectures. This letter of Holzle's is consistent with our experience at SiCortex. The cores we had were far more power efficient than the x86's, but they were slower. Because the interconnect was so fast, generally you could scale up farther than with commodity clusters so that you got better absolute performance and better price-performance, but it was tiring to argue these points over and over again. Especially to customers who weren't paying for their power or infrastructure and didn't really value the low power aspect. Holzle's letter doesn't go into enough detail however. One of the other ideas at SiCortex was that a slow core wouldn't affect application performance of codes that were actually limited by the memory system. 
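One back-of-envelope way to state that memory-system argument (all numbers below are hypothetical; this is just the min(peak, bandwidth x arithmetic intensity) bound sometimes called a roofline):

```python
def attainable_gflops(peak_gflops, mem_bw_gbs, flops_per_byte):
    """Roofline-style bound: min(compute peak, bandwidth * arithmetic intensity)."""
    return min(peak_gflops, mem_bw_gbs * flops_per_byte)

if __name__ == "__main__":
    # Hypothetical numbers: a "brawny" and a "wimpy" core behind the same DRAM,
    # running a streaming kernel that does 0.1 flop per byte moved.
    bw = 10.0                                  # GB/s, same memory system
    print(attainable_gflops(50.0, bw, 0.1))    # brawny: -> 1.0  (2% of peak)
    print(attainable_gflops(2.0, bw, 0.1))     # wimpy:  -> 1.0  (50% of peak)
```

With the same DRAM behind both, a bandwidth-bound kernel runs at the same absolute rate on the brawny and the wimpy part; only the percent-of-peak differs, which is exactly the efficiency effect being described.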
We noticed many codes running at 1 - 5% of peak performance, spending the rest of their time burning a lot of power waiting for the memory. I think this argument has yet to be tested, because the first generation SC machines didn't actually have a very good memory system. The cores were limited to a single outstanding miss. I think there is a fairly good case to be made that systems with slower, low power cores can get higher average efficiencies (% of peak) than fast cores -- provided that the memory systems are close to equivalent. Everyone is using the same DRAMs. Of course this argument doesn't work well if the application is compute bound, or fits in the cache. There are lots of alternative ideas in this space. Hyperthreading switches to a different thread when one blocks on the memory, turboboost runs when the power envelope permits. I recall a paper or two about slowing down cores in a parallel application until the app itself started to run more slowly, etc. -L -------------- next part -------------- An HTML attachment was scrubbed... URL: From hahn at mcmaster.ca Fri Sep 17 08:25:22 2010 From: hahn at mcmaster.ca (Mark Hahn) Date: Fri, 17 Sep 2010 11:25:22 -0400 (EDT) Subject: [Beowulf] How to make a BeagleBoard Elastic R Beowulf Cluster in a Briefcase In-Reply-To: <4C923D22.18367.C1525B@stuart.cyberdelix.net> References: <20100915090543.GB14773@leitl.org> <4C923D22.18367.C1525B@stuart.cyberdelix.net> Message-ID: > Cute, but my question is, what use is one of these homegrown > platforms? it's a PR stunt. nothing wrong with that - the world needs more great demos! beagleboard wouldn't be my first choice for this kind of thing, since it's actually poorly suited and relatively expensive. for instance, if you wait for a sale, you can lay your hands on quite cheap wifi routers that have a similar ARM chip/etc, but with builtin ethernet support. in fact, with builtin 5-8-port gigabit, which would be great for making a FNN. 
otoh, if the demo made effective use of the beagleboard's onboard DSP and audio IO, that _would_ be a cool hack. either for some sort of interaction with the environment (Jim Lux mentioned distributed phased arrays already), or even as the network (gang the audio together and assign each node a separate frequency, etc). > appliance... but that's not my question - I'm asking, what is the > role of the home hacker in the HPC world? maybe it's just me, but I don't tend to do HPC stuff at home simply because it's so much easier to do using dayjob resources. and I'm cheap. but in general, I think the hacker community tends to stay more towards the fringe than HPC (which has got to be considered quite mainstream). > I mean, it's fine to go and make one of these things, but once you've > made it, what do you use it for? good projects are driven by first having a use to drive them... -mark From charliep at cs.earlham.edu Fri Sep 17 08:53:00 2010 From: charliep at cs.earlham.edu (Charlie Peck) Date: Fri, 17 Sep 2010 11:53:00 -0400 Subject: [Beowulf] How to make a BeagleBoard Elastic R Beowulf Cluster in a Briefcase In-Reply-To: <4C93599E.4080109@diamond.ac.uk> References: <20100915090543.GB14773@leitl.org> <4C923D22.18367.C1525B@stuart.cyberdelix.net> <4C93599E.4080109@diamond.ac.uk> Message-ID: > lsi wrote: >> Cute, but my question is, what use is one of these homegrown platforms? How about education, outreach and training? There are at least a couple of projects [1] that use small, home-built clusters in e.g. for undergraduate CS education, faculty education/re-training for parallel programming and cluster computing, and the like. Microsoft [2] and others have also used platforms like this to explore low-power, on-demand compute platforms. 
charlie [1] MicroWulf, LittleFe [2] http://www.greenm3.com/2009/02/microsoft-research-builds-intel-atom-servers.html From stuart at cyberdelix.net Fri Sep 17 09:52:57 2010 From: stuart at cyberdelix.net (lsi) Date: Fri, 17 Sep 2010 17:52:57 +0100 Subject: [Beowulf] How to make a BeagleBoard Elastic R Beowulf Cluster in a Briefcase In-Reply-To: References: <20100915090543.GB14773@leitl.org>, <4C93599E.4080109@diamond.ac.uk> , Message-ID: <4C939CE9.21626.6201FE1@stuart.cyberdelix.net> Yes Charlie, But my question was relating to the personal use of homegrown systems. There is certainly a use for the same tech in an institutional environment. But what of homegrown systems that cannot be taken to work, or made part of a commercial product, that were just made because it could be done? And I did get some ideas, but the general response seems to be "apart from R&D, unless you're a mathematician or scientist, not much"... I think this is why it needs the institutional environment - because it needs at least two skillsets to be useful. One to build the box and another one to build the apps. And probably another skillset again to use the apps, interpret the output etc. Stu On 17 Sep 2010 at 11:53, Charlie Peck wrote: > >> Cute, but my question is, what use is one of these homegrown platforms? > > How about education, outreach and training? There are at least a couple of projects [1] that use small, home-built clusters in e.g. for undergraduate CS education, faculty education/re-training for parallel programming and cluster computing, and the like. Microsoft [2] and others have also used platforms like this to explore low-power, on-demand compute platforms. 
--- Stuart Udall stuart at at cyberdelix.dot net - http://www.cyberdelix.net/ --- * Origin: lsi: revolution through evolution (192:168/0.2) From john.hearns at mclaren.com Fri Sep 17 10:19:46 2010 From: john.hearns at mclaren.com (Hearns, John) Date: Fri, 17 Sep 2010 18:19:46 +0100 Subject: [Beowulf] How to make a BeagleBoard Elastic R Beowulf Cluster in a Briefcase In-Reply-To: <4C939CE9.21626.6201FE1@stuart.cyberdelix.net> References: <20100915090543.GB14773@leitl.org>, <4C93599E.4080109@diamond.ac.uk> , <4C939CE9.21626.6201FE1@stuart.cyberdelix.net> Message-ID: <68A57CCFD4005646957BD2D18E60667B11D965CC@milexchmb1.mil.tagmclarengroup.com> > > Yes Charlie, > > But my question was relating to the personal use of homegrown > systems. There is certainly a use for the same tech in an > institutional environment. > > But what of homegrown systems that cannot be taken to work, or made > part of a commercial product, that were just made because it could be > done? Hmmmm. I will chime in here, if I may. In the early days of home computers (the TRS-80, Apple, BBC Micro) people did these homebrew projects. People were very keen on actually learning BASIC programming, and assembler. There were plans in magazines for hooking up sensors to serial and parallel ports. In recent years this has gone away - and let's not get OS-centric here. People will download or buy applications. But we now see the rise of the iPhone and Android, and again people are writing simple programs for amusement, or simple programs to do one thing well. Where does this leave parallel computing? I for one do not see much domestic use being made of GPUs. People have extremely powerful graphics cards in home systems - yet there are no programs being run on them (other than graphics-oriented games). How about, as a suggestion, motion detection and face recognition? Your home recognises you, and phones you if someone else is inside the home.
Before anyone says it, I know there are motion detection cameras, and software which acts on motion detection, which will SMS you. I'm just blueskying. The contents of this email are confidential and for the exclusive use of the intended recipient. If you receive this email in error you should not copy it, retransmit it, use it or disclose its contents but should return it to the sender immediately and delete your copy. From lindahl at pbm.com Fri Sep 17 10:25:04 2010 From: lindahl at pbm.com (Greg Lindahl) Date: Fri, 17 Sep 2010 10:25:04 -0700 Subject: [Beowulf] A sea of wimpy cores In-Reply-To: References: <68A57CCFD4005646957BD2D18E60667B11D96088@milexchmb1.mil.tagmclarengroup.com> Message-ID: <20100917172504.GA24976@bx9.net> On Fri, Sep 17, 2010 at 10:53:11AM -0400, Lawrence Stewart wrote: > One of the other ideas at SiCortex was that a slow core wouldn't > affect application performance of codes that were actually limited by > the memory system. We noticed many codes running at 1 - 5% of peak > performance, spending the rest of their time burning a lot of power > waiting for the memory. I think this argument has yet to be tested, > because the first generation SC machines didn't actually have a very > good memory system. This is a somewhat well-studied thing. * Blue Gene has a node with a slow cpu, and a better memory system than you guys had. There are probably some published studies. * On x86, it's not hard to slow down the memory system by reducing the # of channels, putting in slow memory, or adding more devices such that the bus slows down. And sometimes it's possible to get the same cpu with ddr2 and ddr3. * There have been a ton of academic papers over the years exploring how memory bound various codes are. 
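Greg's list of ways to slow the memory system down suggests a complementary software-side check: before varying DIMM configurations, measure what the machine's effective copy bandwidth actually is. The sketch below is a rough, hypothetical probe in Python, not a substitute for a real STREAM run; the buffer copy executes in C inside the interpreter, so the timing mostly reflects memory bandwidth rather than Python overhead, and the names and sizes are illustrative only:

```python
import time

def copy_bandwidth_gbps(size_mb=64, repeats=5):
    """Crude STREAM-style 'copy' probe using bytearray copies."""
    n = size_mb * 1024 * 1024
    src = bytearray(n)              # zero-filled source buffer
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        dst = bytes(src)            # one full copy: read n bytes, write n bytes
        t1 = time.perf_counter()
        best = min(best, t1 - t0)   # keep the fastest run (least noise)
        del dst
    return (2 * n) / best / 1e9     # bytes moved (read + write) per second, in GB/s

if __name__ == "__main__":
    print("approx copy bandwidth: %.1f GB/s" % copy_bandwidth_gbps())
```

Comparing this number across single- vs. dual-channel DIMM layouts, or DDR2 vs. DDR3 parts, gives a quick sanity check on how much bandwidth a given configuration change actually took away.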
-- greg From charliep at cs.earlham.edu Fri Sep 17 10:39:08 2010 From: charliep at cs.earlham.edu (Charlie Peck) Date: Fri, 17 Sep 2010 13:39:08 -0400 Subject: [Beowulf] How to make a BeagleBoard Elastic R Beowulf Cluster in a Briefcase In-Reply-To: <4C939CE9.21626.6201FE1@stuart.cyberdelix.net> References: <20100915090543.GB14773@leitl.org>, <4C93599E.4080109@diamond.ac.uk> , <4C939CE9.21626.6201FE1@stuart.cyberdelix.net> Message-ID: <4E3EE790-FF48-4BBA-8DE3-49235B2553D0@cs.earlham.edu> On Sep 17, 2010, at 12:52 PM, lsi wrote: > But what of homegrown systems that cannot be taken to work, or made > part of a commercial product, that were just made because it could be > done? The Maker community has a lot to say on this point, probably way better than I can, http://makerfaire.com/ > And I did get some ideas, but the general response seems to be "apart > from R&D, unless you're a mathematician or scientist, not much"... I > think this is why it needs the institutional environment - because it > needs at least two skillsets to be useful. One to build the box and > another one to build the apps. And probably another skillset again > to use the apps, interpret the output etc. Most backyard rocketeers aren't going to Mars either, but they learn a lot in the process of getting a small payload a couple of meters off the ground. 
charlie From deadline at eadline.org Fri Sep 17 16:13:08 2010 From: deadline at eadline.org (Douglas Eadline) Date: Fri, 17 Sep 2010 19:13:08 -0400 (EDT) Subject: [Beowulf] How to make a BeagleBoard Elastic R Beowulf Cluster in a Briefcase In-Reply-To: <4C92BCF0.1666.2B4D480@stuart.cyberdelix.net> References: <20100915090543.GB14773@leitl.org>, <59655.192.168.93.213.1284668713.squirrel@mail.eadline.org> , <4C92BCF0.1666.2B4D480@stuart.cyberdelix.net> Message-ID: <32872.192.168.93.213.1284765188.squirrel@mail.eadline.org> BTW, there is a list of freely available cluster applications here (Cluster Tweaks): http://tweaks.clustermonkey.net/index.php/Open/Freely_Available_Cluster_Applications If anyone knows of other applications, please add them. This is a community wiki, so if you want to contribute, please create an account and have at it. -- Doug > Thanks all for your responses, > > I built my engine because I imagined it, and wanted to see if it was > possible. Problem is, I don't know much about HPC apps, and tasks need > to be coded up for my platform (I tried to keep it as portable as > possible, but it has some required hooks, etc). So it was as much > as I could do to come up with a test job... the engine does work > though, pic (this is a memory dump, not a render): > http://www.cyberdelix.net/media/retro_fractal_by_lsi.gif > > Each row of numbers was calculated by a node on the grid (although, > because this is home HPC, the grid only had two nodes). > > Ah yes, so it does have a little eye candy I guess, but fractals are a > bit gratuitous; I was hoping for a more serious application. I > considered hunting for primes or somesuch... but I'm not a > mathematician or scientist and I don't understand what, for example, > the sieve of Eratosthenes is meant to be outputting, so writing code > to run it is very difficult! > > This is especially problematic because, as is noted below, a demo is > an important part of a proposal.
> > I think I need to get a computational science degree, or engage a > mathematician or scientist, to progress my project. I was hoping > this would be my HAL 9000 but it turns out to need more than an > engine, it needs apps too... > > Thanks for the bistro tip Jim, nice one.. :) > > Stu > > On 16 Sep 2010 at 15:40, Lux, Jim (337C) wrote: > >> > From: beowulf-bounces at beowulf.org [mailto:beowulf- > bounces at beowulf.org] On Behalf Of Douglas Eadline > >> > As a builder of some cheapo home clusters I would say that >> > software development (owning the reset switch is nice), >> > problem development (staging a small version of a problem >> > before you scale it up), and running real codes (most >> > HPC apps don't scale that well in any case). >> >> If you were writing proposals to scale up to hundreds of nodes, >> especially if you are self-funding the proposal work, then having >> demonstrated it on a cluster at all might lend credibility to your >> proposal, especially if the proposal evaluators are not >> cluster-afficionados (so they question the applicability of clusters >> in general, and are ignorant of the scaling issues) > >> > Of course I'm still trying to build my HAL 9000 clone. > >> I don't think you want to do that, Doug... > > > --- > Stuart Udall > stuart at at cyberdelix.dot net - http://www.cyberdelix.net/ > > --- > * Origin: lsi: revolution through evolution (192:168/0.2) > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf > > -- > This message has been scanned for viruses and > dangerous content by MailScanner, and is > believed to be clean. > -- Doug -- This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean. 
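Since the question of what the sieve of Eratosthenes "is meant to be outputting" came up above: it simply outputs the primes up to a chosen bound, by crossing out the multiples of each prime in turn. A minimal illustrative sketch in Python:

```python
def sieve(limit):
    """Sieve of Eratosthenes: return all primes <= limit."""
    is_prime = [True] * (limit + 1)
    is_prime[0:2] = [False, False]              # 0 and 1 are not prime
    for p in range(2, int(limit ** 0.5) + 1):
        if is_prime[p]:
            # cross out every multiple of p, starting at p*p
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
    return [n for n, flag in enumerate(is_prime) if flag]

print(sieve(30))   # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

It also splits across a grid naturally: sieve up to sqrt(N) once, then let each node cross out multiples within its own sub-range (a segmented sieve), which would suit a small two-node setup like the fractal engine described above.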
From deadline at eadline.org Fri Sep 17 16:24:25 2010 From: deadline at eadline.org (Douglas Eadline) Date: Fri, 17 Sep 2010 19:24:25 -0400 (EDT) Subject: [Beowulf] How to make a BeagleBoard Elastic R Beowulf Cluster in a Briefcase In-Reply-To: <4C939CE9.21626.6201FE1@stuart.cyberdelix.net> References: <20100915090543.GB14773@leitl.org>, <4C93599E.4080109@diamond.ac.uk> , <4C939CE9.21626.6201FE1@stuart.cyberdelix.net> Message-ID: <57652.192.168.93.213.1284765865.squirrel@mail.eadline.org> Well, it does not all have to be rocket science. If you are a geek and you like doing geeky things, then taking an application like Critterding (Artificial Life) http://critterding.sourceforge.net/ and parallelizing it on a cluster would be a fun project (If only I had the time) Of course, if you created a big enough A-life universe on your home cluster you could have a really kick ass "fish tank" to show your friends. -- Doug > Yes Charlie, > > But my question was relating to the personal use of homegrown > systems. There is certainly a use for the same tech in an > institutional environment. > > But what of homegrown systems that cannot be taken to work, or made > part of a commercial product, that were just made because it could be > done? > > And I did get some ideas, but the general response seems to be "apart > from R&D, unless you're a mathematician or scientist, not much"... I > think this is why it needs the institutional environment - because it > needs at least two skillsets to be useful. One to build the box and > another one to build the apps. And probably another skillset again > to use the apps, interpret the output etc. > > Stu > > On 17 Sep 2010 at 11:53, Charlie Peck wrote: > >> >> Cute, but my question is, what use is one of these homegrown >> platforms? >> >> How about education, outreach and training? There are at least a couple >> of projects [1] that use small, home-built clusters in e.g. 
for >> undergraduate CS education, faculty education/re-training for parallel >> programming and cluster computing, and the like. Microsoft [2] and >> others have also used platforms like this to explore low-power, >> on-demand compute platforms. > > > --- > Stuart Udall > stuart at at cyberdelix.dot net - http://www.cyberdelix.net/ > > --- > * Origin: lsi: revolution through evolution (192:168/0.2) > > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit > http://www.beowulf.org/mailman/listinfo/beowulf -- Doug From eagles051387 at gmail.com Sat Sep 18 23:06:39 2010 From: eagles051387 at gmail.com (Jonathan Aquilina) Date: Sun, 19 Sep 2010 08:06:39 +0200 Subject: [Beowulf] A sea of wimpy cores In-Reply-To: <20100917172504.GA24976@bx9.net> References: <68A57CCFD4005646957BD2D18E60667B11D96088@milexchmb1.mil.tagmclarengroup.com> <20100917172504.GA24976@bx9.net> Message-ID: Greg, correct me if I'm wrong, but can't you put in memory which is compatible with the system and slow the memory bus down via the BIOS? > * On x86, it's not hard to slow down the memory system by reducing the > # of channels, putting in slow memory, or adding more devices such > that the bus slows down. And sometimes it's possible to get the same > cpu with ddr2 and ddr3. > > -- Jonathan Aquilina -------------- next part -------------- An HTML attachment was scrubbed...
URL: From lindahl at pbm.com Mon Sep 20 10:47:38 2010 From: lindahl at pbm.com (Greg Lindahl) Date: Mon, 20 Sep 2010 10:47:38 -0700 Subject: [Beowulf] A sea of wimpy cores In-Reply-To: References: <68A57CCFD4005646957BD2D18E60667B11D96088@milexchmb1.mil.tagmclarengroup.com> <20100917172504.GA24976@bx9.net> Message-ID: <20100920174738.GF17436@bx9.net> I'm sure that some BIOSes have that kind of feature, but none of the ones that I'm currently using do. On Sun, Sep 19, 2010 at 08:06:39AM +0200, Jonathan Aquilina wrote: > Greg correct me if im wrong but cant you put in the memory which is > compatible with the system and slow the memory bus down via the bios? > > > > * On x86, it's not hard to slow down the memory system by reducing the > > # of channels, putting in slow memory, or adding more devices such > > that the bus slows down. And sometimes it's possible to get the same > > cpu with ddr2 and ddr3. > > > > > > -- > Jonathan Aquilina From eagles051387 at gmail.com Mon Sep 20 10:52:25 2010 From: eagles051387 at gmail.com (Jonathan Aquilina) Date: Mon, 20 Sep 2010 19:52:25 +0200 Subject: [Beowulf] A sea of wimpy cores In-Reply-To: <20100920174738.GF17436@bx9.net> References: <68A57CCFD4005646957BD2D18E60667B11D96088@milexchmb1.mil.tagmclarengroup.com> <20100917172504.GA24976@bx9.net> <20100920174738.GF17436@bx9.net> Message-ID: Why would you want to run RAM that is slower than the motherboard supports anyway? I don't see any advantages of doing that. Isn't the whole point to try and speed up the calculation process? On Mon, Sep 20, 2010 at 7:47 PM, Greg Lindahl wrote: > I'm sure that some BIOSes have that kind of feature, but none of the > ones that I'm currently using do. > > On Sun, Sep 19, 2010 at 08:06:39AM +0200, Jonathan Aquilina wrote: > > Greg correct me if im wrong but cant you put in the memory which is > > compatible with the system and slow the memory bus down via the bios?
> > > > > > > * On x86, it's not hard to slow down the memory system by reducing the > > > # of channels, putting in slow memory, or adding more devices such > > > that the bus slows down. And sometimes it's possible to get the same > > > cpu with ddr2 and ddr3. > > > > > > > > > > -- > > Jonathan Aquilina > -- Jonathan Aquilina -------------- next part -------------- An HTML attachment was scrubbed... URL: From lindahl at pbm.com Mon Sep 20 10:58:14 2010 From: lindahl at pbm.com (Greg Lindahl) Date: Mon, 20 Sep 2010 10:58:14 -0700 Subject: [Beowulf] A sea of wimpy cores In-Reply-To: References: <68A57CCFD4005646957BD2D18E60667B11D96088@milexchmb1.mil.tagmclarengroup.com> <20100917172504.GA24976@bx9.net> <20100920174738.GF17436@bx9.net> Message-ID: <20100920175814.GH17436@bx9.net> Because you're trying to figure out how your application scales with different memory speeds. On Mon, Sep 20, 2010 at 07:52:25PM +0200, Jonathan Aquilina wrote: > why would you want to run ram that is slower then the motherboard supports > anyway? i dont see any advantages of doing that. isnt the whole point to try > and speed up the calculation process? > > On Mon, Sep 20, 2010 at 7:47 PM, Greg Lindahl wrote: > > > I'm sure that some BIOSes have that kind of feature, but none of the > > ones that I'm currently using do. > > > > On Sun, Sep 19, 2010 at 08:06:39AM +0200, Jonathan Aquilina wrote: > > > Greg correct me if im wrong but cant you put in the memory which is > > > compatible with the system and slow the memory bus down via the bios? > > > > > > > > > > * On x86, it's not hard to slow down the memory system by reducing the > > > > # of channels, putting in slow memory, or adding more devices such > > > > that the bus slows down. And sometimes it's possible to get the same > > > > cpu with ddr2 and ddr3. 
> > > > > > > > -- > > > Jonathan Aquilina > > > > > > -- > Jonathan Aquilina From hearnsj at googlemail.com Mon Sep 20 13:53:18 2010 From: hearnsj at googlemail.com (John Hearns) Date: Mon, 20 Sep 2010 21:53:18 +0100 Subject: [Beowulf] Oracle spins own Linux for mega hardware Message-ID: http://www.theregister.co.uk/2010/09/20/oracle_own_linux/ Two points I would like to explore here: "The eight OLTP servers, meanwhile, will feature 2TB of DRAM and up to 4,096 CPUs, 4PB of cluster volumes and what Ellison claimed will be "advanced" NUMA support." Are these designs something that Sun's HPC division was working on behind the scenes? Is anyone here able to comment? Seems strange that these suddenly appear after the Sun acquisition. My take on it - a Sun HPC box which is being repurposed as a high end database server. Secondly, is this talk of Oracle enhancements marketing speak? To be honest, I think it is. Anyone know what they are offering here? John Hearns From samuel at unimelb.edu.au Mon Sep 20 20:09:47 2010 From: samuel at unimelb.edu.au (Christopher Samuel) Date: Tue, 21 Sep 2010 13:09:47 +1000 Subject: [Beowulf] Oracle spins own Linux for mega hardware In-Reply-To: References: Message-ID: <4C9821FB.90406@unimelb.edu.au> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 21/09/10 06:53, John Hearns wrote: > Seems strange that these suddenly appear after the Sun > acquisition. My take on it - a Sun HPC box which is being > repurposed as a high end database server. Actually I suspect it's the other way around, I'm guessing they're taking the Sun Exadata2 (pre-dates the purchase) and are working to make that a more general purpose "cloud" system. They could be using something like ScaleMP to build larger SMP's over its internal IB fabric. Their 2.6.32 based kernel might also be interesting, a little bird tells me they've added some performance patches to it.. cheers!
Chris - -- Christopher Samuel - Senior Systems Administrator VLSCI - Victorian Life Sciences Computational Initiative Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545 http://www.vlsci.unimelb.edu.au/ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkyYIfsACgkQO2KABBYQAh8LxwCgl+aqsInoHFrwHGFSrPAO05WO XxQAnjGU9BaFFVEyXuUXN/xZNzBwj2Mx =alAp -----END PGP SIGNATURE----- From hahn at mcmaster.ca Mon Sep 20 20:50:16 2010 From: hahn at mcmaster.ca (Mark Hahn) Date: Mon, 20 Sep 2010 23:50:16 -0400 (EDT) Subject: [Beowulf] Oracle spins own Linux for mega hardware In-Reply-To: <4C9821FB.90406@unimelb.edu.au> References: <4C9821FB.90406@unimelb.edu.au> Message-ID: >> Seems strange that these suddenly appear after the Sun >> acquisition. My take on it - a Sun HPC box which is being >> repurposed as a high end database server. > > Actually I suspect it's the other way around, I'm guessing > they're taking the Sun Exadata2 (pre-dates the purchase) and > are working to make that a more general purpose "cloud" system. > > They could be using something like ScaleMP to build larger > SMP's over its internal IB fabric. AFAICT, the product "Oracle Coherence" is sort of NoSQL - get/put access, presented as a peer-to-peer caching layer. I don't think it's a VM consistency middleware product like ScaleMP. From tim_smith_666 at yahoo.com Mon Sep 20 09:40:44 2010 From: tim_smith_666 at yahoo.com (Tim Smith) Date: Mon, 20 Sep 2010 09:40:44 -0700 (PDT) Subject: [Beowulf] dsh basic question on node/task ID Message-ID: <675548.62156.qm@web57501.mail.re1.yahoo.com> Hi, I am new to parallel computing, so please forgive the naive question. I am using the dsh command on an OSX cluster and would like to read in the node number from within the executed program.
For example, with the Sun Grid Engine, I would run the following command: /share/bin/R "--args $SEGMENTS $SGE_TASK_ID" < testR.r and in the testR.r file, I was able to read the Task ID/node number and use it to parallelize my code. Which variable in the dsh command will let me read the node number? thanks! -------------- next part -------------- An HTML attachment was scrubbed... URL: From jeff.johnson at aeoncomputing.com Mon Sep 20 11:17:07 2010 From: jeff.johnson at aeoncomputing.com (Jeff Johnson) Date: Mon, 20 Sep 2010 11:17:07 -0700 Subject: [Beowulf] A sea of wimpy cores In-Reply-To: <20100920174738.GF17436@bx9.net> References: <68A57CCFD4005646957BD2D18E60667B11D96088@milexchmb1.mil.tagmclarengroup.com> <20100917172504.GA24976@bx9.net> <20100920174738.GF17436@bx9.net> Message-ID: <4C97A523.60801@aeoncomputing.com> In most AMD platforms you can down-clock the memory speed in the BIOS under: BIOS_Setup->Advanced->Chipset->Northbridge->DRAM Timing Set "Memclock Mode" to "Manual" instead of "Auto" and then set the "Memclock Value" to the desired speed in MHz. On 9/20/10 10:47 AM, Greg Lindahl wrote: > I'm sure that some BIOSes have that kind of feature, but none of the > ones that I'm currently using do. > > On Sun, Sep 19, 2010 at 08:06:39AM +0200, Jonathan Aquilina wrote: >> Greg correct me if im wrong but cant you put in the memory which is >> compatible with the system and slow the memory bus down via the bios? >> >> >>> * On x86, it's not hard to slow down the memory system by reducing the >>> # of channels, putting in slow memory, or adding more devices such >>> that the bus slows down. And sometimes it's possible to get the same >>> cpu with ddr2 and ddr3.
>>> >>> >> -- >> Jonathan Aquilina > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf -- ------------------------------ Jeff Johnson Manager Aeon Computing jeff.johnson at aeoncomputing.com www.aeoncomputing.com t: 858-412-3810 x101 f: 858-412-3845 m: 619-204-9061 4905 Morena Boulevard, Suite 1313 - San Diego, CA 92117 From reuti at Staff.Uni-Marburg.DE Tue Sep 21 10:57:47 2010 From: reuti at Staff.Uni-Marburg.DE (Reuti) Date: Tue, 21 Sep 2010 19:57:47 +0200 Subject: [Beowulf] dsh basic question on node/task ID In-Reply-To: <675548.62156.qm@web57501.mail.re1.yahoo.com> References: <675548.62156.qm@web57501.mail.re1.yahoo.com> Message-ID: <4BE21F0A-DF34-488C-9008-35D1EF483E80@staff.uni-marburg.de> Hi, Am 20.09.2010 um 18:40 schrieb Tim Smith: > I am new to parallel computing, so please forgive the naive question. > > I am using the dsh command on an OSX cluster and would like to read in the node number from within the executed program. For example, with the Sun Grid Engine, I would run the following command: > > /share/bin/R "--args $SEGMENTS $SGE_TASK_ID" < testR.r the $SGE_TASK_ID will only be set for an array job. So I wonder, how you will get the node number/name out of it. Any task of an array job can be scheduled to any node and is independent from any other task of the array job. -- Reuti > and in the testR.r file, I was able to read the Task ID/node number and use it to parallelize my code. Which variable in the dsh command will let me read the node number? > > thanks! 
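On the original dsh question: as far as I know, plain dsh does not export a task or node number the way SGE exports $SGE_TASK_ID, so a common workaround is to have the executed program derive an index from the node's own hostname. A hedged Python sketch, assuming a hypothetical "nameNN" naming scheme (node01, node02, ...) where numbering starts at 1; adjust to your cluster's actual convention:

```python
import re
import socket

def node_index(hostname=None):
    """Map a hostname with a trailing number to a zero-based index,
    e.g. 'node07' -> 6 or 'compute-12.cluster.local' -> 11.
    Assumes numbering starts at 1 (a hypothetical convention)."""
    name = hostname or socket.gethostname()
    # look at the short name only, and grab the trailing digits
    match = re.search(r"(\d+)$", name.split(".")[0])
    if match is None:
        raise ValueError("no trailing number in hostname: %r" % name)
    return int(match.group(1)) - 1

if __name__ == "__main__":
    try:
        print(node_index())
    except ValueError as err:
        print("could not infer a node index:", err)
```

Note this yields a stable node identity, not a per-task ID; unlike an SGE array job, nothing prevents two invocations on the same node from getting the same index.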
> > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From eagles051387 at gmail.com Tue Sep 21 22:23:20 2010 From: eagles051387 at gmail.com (Jonathan Aquilina) Date: Wed, 22 Sep 2010 07:23:20 +0200 Subject: [Beowulf] A sea of wimpy cores In-Reply-To: <4C97A523.60801@aeoncomputing.com> References: <68A57CCFD4005646957BD2D18E60667B11D96088@milexchmb1.mil.tagmclarengroup.com> <20100917172504.GA24976@bx9.net> <20100920174738.GF17436@bx9.net> <4C97A523.60801@aeoncomputing.com> Message-ID: Jeff, not sure if you read my previous post on this. I suggested the same thing, but his current BIOSes don't seem to support that feature. -------------- next part -------------- An HTML attachment was scrubbed... URL: From prentice at ias.edu Wed Sep 22 07:16:16 2010 From: prentice at ias.edu (Prentice Bisbal) Date: Wed, 22 Sep 2010 10:16:16 -0400 Subject: [Beowulf] NY Times: Oracle Growth Plans Worry Rivals and Customers Message-ID: <4C9A0FB0.6020408@ias.edu> http://www.nytimes.com/2010/09/22/technology/22oracle.html -- Prentice From john.hearns at mclaren.com Wed Sep 22 08:06:55 2010 From: john.hearns at mclaren.com (Hearns, John) Date: Wed, 22 Sep 2010 16:06:55 +0100 Subject: [Beowulf] NY Times: Oracle Growth Plans Worry Rivals and Customers In-Reply-To: <4C9A0FB0.6020408@ias.edu> References: <4C9A0FB0.6020408@ias.edu> Message-ID: <68A57CCFD4005646957BD2D18E60667B11E6A166@milexchmb1.mil.tagmclarengroup.com> > http://www.nytimes.com/2010/09/22/technology/22oracle.html "But the main party will take place on Wednesday night, when Oracle will bus people to a series of concerts held on Treasure Island, which sits between San Francisco and Oakland. The headlining acts include the Black Eyed Peas, Don Henley and the Steve Miller Band.
Six acts will perform on two stages surrounded by amusement rides, four laser systems, 150,000 cocktail napkins, mounds of food and 12 searchlights beaming into the sky. Typically, a few brave female souls will dance near the music stages, while hundreds of male database gurus sip free drinks and ogle." When does he skydive into Ellisonfest dressed in an iron suit? From prentice at ias.edu Wed Sep 22 08:24:14 2010 From: prentice at ias.edu (Prentice Bisbal) Date: Wed, 22 Sep 2010 11:24:14 -0400 Subject: [Beowulf] NY Times: Oracle Growth Plans Worry Rivals and Customers In-Reply-To: <68A57CCFD4005646957BD2D18E60667B11E6A166@milexchmb1.mil.tagmclarengroup.com> References: <4C9A0FB0.6020408@ias.edu> <68A57CCFD4005646957BD2D18E60667B11E6A166@milexchmb1.mil.tagmclarengroup.com> Message-ID: <4C9A1F9E.1090905@ias.edu> "Typically, a few brave female souls will dance near the music stages, while hundreds of male database gurus sip free drinks and ogle." The similarities between Oracle's Open World and SC conferences are startling. Hearns, John wrote: >> http://www.nytimes.com/2010/09/22/technology/22oracle.html > > "But the main party will take place on Wednesday night, when Oracle will > bus people to a series of concerts held on Treasure Island, which sits > between San Francisco and Oakland. The headlining acts include the Black > Eyed Peas, Don Henley and the Steve Miller Band. > Six acts will perform on two stages surrounded by amusement rides, four > laser systems, 150,000 cocktail napkins, mounds of food and 12 > searchlights beaming into the sky. Typically, a few brave female souls > will dance near the music stages, while hundreds of male database gurus > sip free drinks and ogle."
> > > When does he skydive into Ellisonfest dressed in an iron suit? > -- Prentice From deadline at eadline.org Wed Sep 22 09:17:27 2010 From: deadline at eadline.org (Douglas Eadline) Date: Wed, 22 Sep 2010 12:17:27 -0400 (EDT) Subject: [Beowulf] NY Times: Oracle Growth Plans Worry Rivals and Customers In-Reply-To: <4C9A1F9E.1090905@ias.edu> References: <4C9A0FB0.6020408@ias.edu> <68A57CCFD4005646957BD2D18E60667B11E6A166@milexchmb1.mil.tagmclarengroup.com> <4C9A1F9E.1090905@ias.edu> Message-ID: <44957.192.168.93.213.1285172247.squirrel@mail.eadline.org> > "Typically, a few brave female souls will dance near the music stages, > while hundreds of male database gurus sip free drinks and ogle." > > The similarities between Oracle's Open World and SC conferences is > startling. Except for the females dancing near the music stages pretty much the same -- Doug > > > > Hearns, John wrote: >>> http://www.nytimes.com/2010/09/22/technology/22oracle.html >> >> "But the main party will take place on Wednesday night, when Oracle will >> bus people to a series of concerts held on Treasure Island, which sits >> between San Francisco and Oakland. The headlining acts include the Black >> Eyed Peas, Don Henley and the Steve Miller Band. >> Six acts will perform on two stages surrounded by amusement rides, four >> laser systems, 150,000 cocktail napkins, mounds of food and 12 >> searchlights beaming into the sky. Typically, a few brave female souls >> will dance near the music stages, while hundreds of male database gurus >> sip free drinks and ogle." >> >> >> When does he skydive into Ellisonfest dressed in an iron suit? 
>>
>
> --
> Prentice
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
>
> --
> This message has been scanned for viruses and
> dangerous content by MailScanner, and is
> believed to be clean.

--
Doug

From john.hearns at mclaren.com Fri Sep 24 02:38:04 2010
From: john.hearns at mclaren.com (Hearns, John)
Date: Fri, 24 Sep 2010 10:38:04 +0100
Subject: [Beowulf] Seawater cooling
Message-ID: <68A57CCFD4005646957BD2D18E60667B11EC7AB6@milexchmb1.mil.tagmclarengroup.com>

http://www.theregister.co.uk/2010/09/23/google_finland_data_centers_cooled_solely_with_sea_water/

So the human race, not content with burning all that fuel to cause global warming, is now going to boil the seas because we just can't help looking at those YouTube videos of cats playing the piano.

Still, having all those nice Baltic herring delivered pre-cooked might be a plus.

John Hearns

From cbergstrom at pathscale.com Sun Sep 26 20:06:34 2010
From: cbergstrom at pathscale.com ("C. Bergström")
Date: Mon, 27 Sep 2010 10:06:34 +0700
Subject: [Beowulf] OT: Beta testing parallel debugger
Message-ID: <4CA00A3A.70605@pathscale.com>

Anyone interested in helping beta test a new parallel debugger backend? The source will be available, but we'll also offer it as a commercial product with support. Right now we're in the planning phase and interested to know what people will find most useful.
The basic idea is that the backend will expose a library API which will make it easier both to do remote debugging and to build an interface on top.

./Christopher

From prentice at ias.edu Mon Sep 27 12:09:16 2010
From: prentice at ias.edu (Prentice Bisbal)
Date: Mon, 27 Sep 2010 15:09:16 -0400
Subject: [Beowulf] Problems with Microway Navion/SuperMicro server VGA display
Message-ID: <4CA0EBDC.2060300@ias.edu>

Beowulfers,

Are any of you having problems with the VGA console on your Microway Navion or Supermicro servers?

About 2 months ago, I received four new Microway Navion servers with Fermi GPUs. These servers are just rebranded SuperMicro servers with the H8DGG-QF motherboard.

I had this problem when I first started working with these systems, but then it disappeared. Now that I'm trying to reinstall the OS on a couple of systems, I can no longer get a VGA console.

Here are the symptoms: When I plug in the monitor on my crash cart, it recognizes that there's a computer connected. Otherwise it would display the self-test message, indicating that NOTHING is connected. However, a split second after detecting it's connected to a computer, it goes right into power saving mode - the LED in the Dell monitor's power button goes from green to orange.

The keyboard works, because the numlock LED goes on and off as expected.

The GPU cards are Tesla cards that don't even have a display port on them, so I don't think I need to specify the on-board display in the BIOS, and if this needed to be done, I would assume that Microway would have done this before shipping the system.

Am I making an ass out of myself by assuming this? I was able to get a console on these systems at one point without tinkering with the BIOS.
-- Prentice Bisbal Linux Software Support Specialist/System Administrator School of Natural Sciences Institute for Advanced Study Princeton, NJ From prentice at ias.edu Mon Sep 27 13:03:11 2010 From: prentice at ias.edu (Prentice Bisbal) Date: Mon, 27 Sep 2010 16:03:11 -0400 Subject: [Beowulf] Problems with Microway Navion/SuperMicro server VGA display In-Reply-To: <4CA0EBDC.2060300@ias.edu> References: <4CA0EBDC.2060300@ias.edu> Message-ID: <4CA0F87F.80109@ias.edu> Nevermind. User-error. Prentice Prentice Bisbal wrote: > Beowulfers, > > Are any of you having problems with the VGA console on your Microway > Navion or Supermicro servers? > > About 2 months ago, I received for new Microway Navion Servers with > Fermi GPUs. These servers are just rebranded SuperMicro servers with the > H8DGG-QF motherboard. > > I had this problem when I first started working with these systems, but > then it disappeared. Now that I'm trying to reinstall the OS on a couple > of systems, I can no longer get a VGA console. > > Here's the symptoms: When I plug in the monitor on my crash, cart, it > recognizes that there's a computer connected. Otherwise it would display > the self-test message, indicating that NOTHING is connected. However, a > split second after detecting it's connected to a computer, it goes right > into power saving mode - the LED in Dell monitor's powerbutton goes > from green to orange. > > The keyboard works, because the numlock LED goes on and off as expected. > > The GPU cards are Tesla cards that don't even have a display port on > them, so I don't think I need to specify the on-board display in the > BIOS, and if this needed to be done, I would assume that Microway would > have done this before shipping the system. > > Am I making an ass out of myself by assuming this? I was able to get a > console on these systems at one point without tinkering with the bios. 
> > --
Prentice Bisbal
Linux Software Support Specialist/System Administrator
School of Natural Sciences
Institute for Advanced Study
Princeton, NJ

From prentice at ias.edu Mon Sep 27 13:14:24 2010
From: prentice at ias.edu (Prentice Bisbal)
Date: Mon, 27 Sep 2010 16:14:24 -0400
Subject: [Beowulf] Problems with Microway Navion/SuperMicro server VGA display
In-Reply-To: <4CA0F87F.80109@ias.edu>
References: <4CA0EBDC.2060300@ias.edu> <4CA0F87F.80109@ias.edu>
Message-ID: <4CA0FB20.2000104@ias.edu>

Actually, I take that back. Only partially user error.

My Teslas do have a DVI output, but I was trying to connect to the VGA output. At first, I thought I was just being stupid (see the last e-mail), then I remembered something that may have partially redeemed me:

Due to the location of the DVI output on the Tesla card, the chassis itself interferes with the DVI connector, so it's impossible to connect the DVI cable to the Tesla's DVI output. Not without customizing the chassis with a Sawzall or Dremel tool, at least.

The fix is simple: configure the BIOS to always default to the onboard VGA. Unfortunately, it's not obvious what the correct selection is. Here's the section of the manual on this, in its entirety:

Primary Video Controller
Use this setting to specify the primary video controller boot order. Options include PCIE-GPP1-GPP2-GPP3a-PCI, PCIE-GPP2-GPP1-GPP3a-PCI, PCIE-GPP3a-GPP1-GPP2-PCI or PCI-PCIE-GPP1-GPP2-GPP3a.

That makes it pretty clear, doesn't it?

Prentice

Prentice Bisbal wrote:
> Nevermind. User-error.
>
> Prentice
>
>
> Prentice Bisbal wrote:
>> Beowulfers,
>>
>> Are any of you having problems with the VGA console on your Microway
>> Navion or Supermicro servers?
>>
>> About 2 months ago, I received for new Microway Navion Servers with
>> Fermi GPUs. These servers are just rebranded SuperMicro servers with the
>> H8DGG-QF motherboard.
>>
>> I had this problem when I first started working with these systems, but
>> then it disappeared.
Now that I'm trying to reinstall the OS on a couple >> of systems, I can no longer get a VGA console. >> >> Here's the symptoms: When I plug in the monitor on my crash, cart, it >> recognizes that there's a computer connected. Otherwise it would display >> the self-test message, indicating that NOTHING is connected. However, a >> split second after detecting it's connected to a computer, it goes right >> into power saving mode - the LED in Dell monitor's powerbutton goes >> from green to orange. >> >> The keyboard works, because the numlock LED goes on and off as expected. >> >> The GPU cards are Tesla cards that don't even have a display port on >> them, so I don't think I need to specify the on-board display in the >> BIOS, and if this needed to be done, I would assume that Microway would >> have done this before shipping the system. >> >> Am I making an ass out of myself by assuming this? I was able to get a >> console on these systems at one point without tinkering with the bios. >> >> > -- Prentice Bisbal Linux Software Support Specialist/System Administrator School of Natural Sciences Institute for Advanced Study Princeton, NJ From robh at dongle.org.uk Wed Sep 29 09:24:13 2010 From: robh at dongle.org.uk (Robert Horton) Date: Wed, 29 Sep 2010 17:24:13 +0100 Subject: [Beowulf] MPI-IO + nfs - alternatives? Message-ID: <1285777453.1665.170.camel@moelwyn> Hi, I've been running some benchmarks on a new fileserver which we are intending to use to serve scratch space via nfs. In order to support MPI-IO I need to mount with the "noac" option. Unfortunately this takes the write block performance from around 100 to 20MB/s which is a bit annoying given that most of the workload isn't MPI-IO. 1) Does anyone have any hints for improving the nfs performance under these circumstances? I've tried using jumbo frames, different filesystems, having the log device on an SSD and increasing the nfs block size to 1MB, none of which have any significant effect. 
2) Are there any reasonable alternatives to nfs in this situation? The main possibilities seem to be:

- PVFS or similar with a single IO server. Not sure what performance I should expect from this though, and it's a lot more complex than nfs.

- Sharing a block device via iSCSI and using GFS, although this is also going to be somewhat complex and I can't find any evidence that MPI-IO will even work with GFS.

Otherwise it looks as though the best bet would be to export two volumes via nfs, only one of which is mounted with noac.

Any other suggestions?

Rob

From samuel at unimelb.edu.au Wed Sep 29 22:53:14 2010
From: samuel at unimelb.edu.au (Christopher Samuel)
Date: Thu, 30 Sep 2010 15:53:14 +1000
Subject: [Beowulf] Homebrew Cray-1A, 1/10 scale, binary compatible, built with FPGA
Message-ID: <4CA425CA.8040405@unimelb.edu.au>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

After the recent Cray-1 PC case here's someone who has gone one step further and built what he claims to be a binary compatible 1/10th scale model of a Cray-1A.

http://chrisfenton.com/homebrew-cray-1a/

The only problem is that he can't find any Cray-1A software to run on it (even after FOI requests and contacting the Computer History Museum) so he's put out a call for help.

cheers!
Chris
- --
Christopher Samuel - Senior Systems Administrator
VLSCI - Victorian Life Sciences Computational Initiative
Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545
http://www.vlsci.unimelb.edu.au/

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkykJcoACgkQO2KABBYQAh/v5QCfddQO9iFRxLZlFjWt5NEjN7q7
aj8AoIZlVrqjPUVa0kYIb8/yBuA0XNei
=U4yp
-----END PGP SIGNATURE-----

From matthurd at acm.org Fri Sep 24 03:21:55 2010
From: matthurd at acm.org (Matt Hurd)
Date: Fri, 24 Sep 2010 20:21:55 +1000
Subject: [Beowulf] Broadcast - not for HPC - or is it?
Message-ID:

I'm associated with a somewhat stealthy start-up.
The only teaser product with some details out so far is a type of packet replicator. We designed 24-port ones, but settled on 16- and 48-port 1RU designs as this seemed to reflect users' needs better. This was not designed for HPC but for low-latency trading, as it beats a switch in terms of speed. It is primarily focused on low-latency distribution of market data to multiple users, as the port-to-port latency is in the range of 5-7 nanoseconds - it is a pretty passive device with optical foo at the core. No rocket science here, just convenient opto-electrical foo.

One user has suggested using them for their cluster but, as they are secretive about what they do, I don't understand their use case. They suggested interest in bigger port counts and mentioned >1000 ports. Hmmm, we could build such a thing at about 8-9 ns latency, but I don't quite get the point, being used to embarrassingly parallel stuff myself. I would have thought this opticast thing doesn't replace an existing switch framework and would just be an additional cost rather than helping too much. If it has a use, maybe we should build one with a lot of ports, though 1024 ports seems a bit too big.

Any ideas on the list about the use of low-latency broadcast for specific applications in HPC? Are there codes that would benefit?

Regards,

Matt.
_________________
www.zeptonics.com

From macglobalus at yahoo.com Sat Sep 25 10:07:40 2010
From: macglobalus at yahoo.com (gabriel lorenzo)
Date: Sat, 25 Sep 2010 10:07:40 -0700 (PDT)
Subject: [Beowulf] Beginners question # 1
Message-ID: <864070.13807.qm@web51103.mail.re2.yahoo.com>

In cluster computing, is it the number of cores that counts? If I build a cluster with 8 motherboards with 1 single core each, would it be the same as using just one motherboard but with two quad-core processors?
I wanna build one of these but wanna save money and space, and if what counts is the number of cores to process info, I think fewer motherboards with dual six-core processors is definitely cheaper, just because I won't be needing that many motherboards, power supplies, etc.

thanks

From jeff.johnson at aeoncomputing.com Mon Sep 27 13:05:03 2010
From: jeff.johnson at aeoncomputing.com (Jeff Johnson)
Date: Mon, 27 Sep 2010 16:05:03 -0400
Subject: [Beowulf] Problems with Microway Navion/SuperMicro server VGA display
In-Reply-To: <4CA0EBDC.2060300@ias.edu>
References: <4CA0EBDC.2060300@ias.edu>
Message-ID: <8D5CF7C2-08C5-4799-BC06-185F8916FE95@aeoncomputing.com>

Prentice,

The BIOS is deferring to the GPUs. I've built lots of these and it's annoying. You can, in order of ease:

1. Use ipmitool to access the system's serial over LAN console redirection. See the 'sol' function of ipmitool. Once there, set VGA to onboard, save and quit.

2. Open the box and, using a metal flathead screwdriver, clear the system's nvram. It's also a good idea to remove and install the CMOS battery upside down for a few seconds, then remove and install it correctly. The reversed polarity ensures obliterating the nvram. Shorting the clearing pads doesn't always work.

3. Open the system, remove the GPUs and then bring it up to get to the BIOS and set it for onboard VGA.

Good luck!

--Jeff

---mobile signature---
Jeff Johnson - Aeon Computing
jeff.johnson at aeoncomputing.com

On Sep 27, 2010, at 15:09, Prentice Bisbal wrote:

> Beowulfers,
>
> Are any of you having problems with the VGA console on your Microway
> Navion or Supermicro servers?
>
> About 2 months ago, I received for new Microway Navion Servers with
> Fermi GPUs. These servers are just rebranded SuperMicro servers with the
> H8DGG-QF motherboard.
>
> I had this problem when I first started working with these systems, but
> then it disappeared. Now that I'm trying to reinstall the OS on a couple
> of systems, I can no longer get a VGA console.
> > Here's the symptoms: When I plug in the monitor on my crash, cart, it > recognizes that there's a computer connected. Otherwise it would display > the self-test message, indicating that NOTHING is connected. However, a > split second after detecting it's connected to a computer, it goes right > into power saving mode - the LED in Dell monitor's powerbutton goes > from green to orange. > > The keyboard works, because the numlock LED goes on and off as expected. > > The GPU cards are Tesla cards that don't even have a display port on > them, so I don't think I need to specify the on-board display in the > BIOS, and if this needed to be done, I would assume that Microway would > have done this before shipping the system. > > Am I making an ass out of myself by assuming this? I was able to get a > console on these systems at one point without tinkering with the bios. > > > -- > Prentice Bisbal > Linux Software Support Specialist/System Administrator > School of Natural Sciences > Institute for Advanced Study > Princeton, NJ > _______________________________________________ > Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing > To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf From rolfmcc at mcclellanconsulting.com Thu Sep 30 12:20:49 2010 From: rolfmcc at mcclellanconsulting.com (Rolf McClellan) Date: Thu, 30 Sep 2010 19:20:49 +0000 (UTC) Subject: [Beowulf] Re: 48-port 10gig switches? References: <20100902011737.GB13598@bx9.net> <4C7F24DD.7020209@utah.edu> <20100902060005.GJ27021@bx9.net> <4C7F438F.1070903@utah.edu> Message-ID: Tom Ammon utah.edu> writes: > > > Interesting. Although, I'm still not convinced it's a single switching > asic. The switch chip is, of course, not the only "chip" in the switch. > This article says the "networking protocols" run on a single chip. 
The > official Voltaire press release at > http://www.voltaire.com/NewsAndEvents/Press_Releases/press2010/Voltaire_Announces_High_Density_10_GbE_Switch_for_Efficient_Scaling_of_Cloud_Networks > doesn't say anything about a single switching asic - perhaps the author > made an assumption about the product? You'd think they would really > tout the fact if they had a single chip that dense. > Last time I talked with the Arista people, their nonblocking 48 port > switch (one of two options for a 48-port switch, IIRC) was not a single > chip - it was a non-blocking 6-chip CLOS design. And, I agree, the > price was compelling. > So I still think there's not a 48 port 10GbE switch chip, at least not > in merchant silicon. I don't know much about what cisco is cooking up > on 10GbE. I know Juniper was rebranding BNT (which was fulcrum-based). > I also heard about Extreme's top of rack 10GbE but it was only 24 ports > - you have to stack two of them together to get 48 ports. > So my answer to your original question is that since there's not > single-chip 48p, you still have to chain together 24-port chips to get > line-rate 10GbE performance. I'm happy to be corrected, of course - but > a seemingly misguided statement in an article in the trade press > doesn't seem like a very good product announcement for an innovation > like that. > Tom > On 09/02/2010 12:00 AM, Greg Lindahl wrote: > > Press about the new Voltaire 6048 48p 10g switch indicates that it's a > single switch chip: > > http://www.theregister.co.uk/2010/08/30/voltaire_vantage_6048/ > > Arista seems to have a similar product at a similarish list price, and > that list price is a lot less than chassis switches using 24p silicon. > > Fujitsu isn't selling a 48p switch, and I'm not up enough on silicon > vendors to tell you if Fulcrum is still the only other vendor.
> I used to know this stuff, then I left HPC to build a search engine

This is from a Marvell press release:

The Prestera CX family is the first commercially available solution to offer up to 48 ports of 10 Gigabit Ethernet (GbE) on a single chip, and the first with multiple ports of 40GbE with line rate throughput. These packet processors provide 480Gbps full duplex throughput for Carrier Ethernet, including Mac-in-Mac, Multiprotocol Label Switching (MPLS), Virtual Local Area Network (VLAN) Translation and IP Routing and can support up to 128K subscribers with a single device to meet the exponential growth of Internet traffic and massive deployments of high speed optical broadband networks. The Prestera CX family's reduced XAUI (RXAUI) interface enables doubling of the switching capacity for current broadband access platforms while maintaining compatibility for legacy line cards, enabling in-field upgrade of access platforms as networks migrate to GPON and 10G EPON.

For Converged Enhanced Ethernet (CEE) networks in next generation datacenter deployments, the Prestera CX family offers a highly differentiated feature set including the industry's first commercially available 40GbE ports, Priority Flow Control and Fiber Channel over Ethernet (FCoE) capabilities including Fiber Channel Awareness and Fiber Channel Forwarding. Multiple reference designs based on Prestera CX are available from Marvell, pre-loaded with software for enterprise and datacenter switching. These include a single rack unit 48 port 10G SFP+ solution as well as a modular 40 port 10G SFP+ solution with two ports of 40GbE uplinks.
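Tom's point earlier in the thread - that without 48-port silicon you have to chain 24-port chips, as in Arista's "non-blocking 6-chip CLOS design" - can be sanity-checked with a little arithmetic. Below is a minimal back-of-the-envelope sketch (the helper name is illustrative, not any vendor's tool) that sizes a two-stage non-blocking folded Clos (leaf/spine) fabric from a given single-chip radix:

```python
# Size a non-blocking two-stage folded Clos built from identical
# radix-port switch chips. Each leaf splits its radix evenly:
# half the ports face hosts, half face the spine, preserving
# full bisection bandwidth.

def clos_elements(ports: int, radix: int) -> tuple:
    """Return (leaf_chips, spine_chips) for `ports` external
    non-blocking ports built from `radix`-port chips."""
    down = radix // 2                # host-facing ports per leaf chip
    leaves = -(-ports // down)       # ceiling division
    uplinks = leaves * down          # uplinks that must land on spines
    spines = -(-uplinks // radix)    # each spine chip terminates `radix` uplinks
    return leaves, spines

if __name__ == "__main__":
    # 48 non-blocking ports from 24-port silicon: 4 leaves + 2 spines,
    # i.e. the 6-chip Clos described in the thread.
    leaves, spines = clos_elements(48, 24)
    print(leaves, spines, leaves + spines)  # 4 2 6
```

For the >1000-port broadcast musings elsewhere in this digest, the same arithmetic shows why port counts that large push you to more stages or bigger-radix silicon rather than a single flat fabric.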