[Beowulf] Interesting

Sarma Tangirala tvssarma.omega9 at gmail.com
Wed Oct 27 10:03:32 PDT 2010


The recent digests that I am getting are quite interesting (bad google) and I have a question.

What I'd like to know is: is it possible to have our history captured in its entirety, so that no future generation has to run around (like Hari Seldon) because information from waaaay back is corrupt and was not taken care of?

Do you guys know of any existing sources that you can point me to?

Is this under distributed systems or under compression algorithms?

Any other two cents on this is welcome!

-----Original Message-----
From: beowulf-request at beowulf.org
Sender: beowulf-bounces at beowulf.org
Date: Wed, 27 Oct 2010 09:36:13 
To: <beowulf at beowulf.org>
Reply-To: beowulf at beowulf.org
Subject: Beowulf Digest, Vol 80, Issue 22



Today's Topics:

   1. RE: how Google warps your brain (Bill Rankin)
   2. RE: how Google warps your brain (Douglas Eadline)
   3. RE: Anybody using Redhat HPC Solution in their Beowulf
      (Hearns, John)
   4. Re: Anybody using Redhat HPC Solution in their Beowulf
      (Ellis H. Wilson III)
   5. Re: Anybody using Redhat HPC Solution in their Beowulf
      (Kilian CAVALOTTI)
   6. RE: Anybody using Redhat HPC Solution in their Beowulf
      (Lux, Jim (337C))


----------------------------------------------------------------------

Message: 1
Date: Tue, 26 Oct 2010 14:54:43 +0000
From: Bill Rankin <Bill.Rankin at sas.com>
Subject: RE: [Beowulf] how Google warps your brain
To: Beowulf Mailing List <beowulf at beowulf.org>
Cc: "Robert G. Brown" <rgb at phy.duke.edu>
Message-ID:
	<76097BB0C025054786EFAB631C4A2E3C0948F542 at MERCMBX03D.na.SAS.com>
Content-Type: text/plain; charset="us-ascii"

Heading completely off-topic now, but the area of digital media and long-term archival/retrieval is something that I find very interesting.  I'll leave it to Rob to somehow eventually tie this back into a discussion of COTS technology and HPC.


> > It's interesting: I just got an iPad a few weeks ago, mostly as a
> > reader/web-browser device, and I've been reading a variety of
> > out-of-copyright works: H. Rider Haggard, Joseph Conrad, Mark Twain.
> > Thank you Gutenberg Project!
> 
> It is awesome, isn't it?

Amazon also carries many of the out-of-copyright works in their Kindle store for $0 (and gives credit to Gutenberg to a small extent).  It was nice to be able to go pick up things like the Sherlock Holmes series, Homer's Iliad and some of Einstein's works (which I don't pretend to understand) and have them downloaded via 3G on Amazon's dime.

I will say that because of this I tend to overlook their rather high (IMHO) price on current digital content and have probably purchased more e-books overall as a result.
 

> > And, since I am sitting/lying here with a very sore back from moving boxes
> > of books around this weekend looking for that book that I *know* is in there
> > somewhere, the prospect of some magic box that would scan all my books into
> > a format usable into eternity would be quite nice.  I might even think that
> > a personal "print on demand" would be nice that could generate a cheap/quick
> > copy for reading in bed(yes, the iPad and Kindle, etc., are nice, but
> > there's affordances provided by the paper edition that is nice.. But I don't
> > need hardcover or, even, any cover..)

There is just *something* about paper, isn't there?  And while I don't have a library to the extent of RGB's or others', I do like having some books around (glancing at the two bookshelves in my office).  On the other hand, I still have boxes of books sitting around unopened since we moved house 4-5 years ago.  I certainly need a purge, lest I end up on one of those "hoarding" shows that seem to be popular as of late.

At some point, I have to ask myself if I really *need* to have an old, beat-up, falling-apart copy of "Voyage of the Space Beagle" lying around.


> > (or, even better, a service that has scanned all the books for me, e.g.
> > Google, and that upon receiving some proof of ownership of the physical
> > book, lets me have an electronic copy of the same...  I'd gladly pay some
> > nominal fee for such a thing, providing it wasn't for some horrible locked,
> > time limited format which depends on the original vendor being in business
> > 20 years from now.  I also recognize the concern about how "once in digital
> > form, copying becomes very cheap" which I think is valid.

A scanning service would be wonderful for a lot of the books I have, mainly those I view as reference-type material.  For current reference material, Safari Books Online has a reasonable usage model that allows for making hardcopy of their online content.  Now if there was only a simple way to transcribe the same content for download to my Kindle I would be set (something beyond the OCR+PDF approach, which is awkward and inconsistent).


> What a killer idea.  Acceptable use, doggone it!  I'd ship them books
> by the boxful in exchange for a movable (even DRM controlled) image, a la
> Ipod music.  I just don't want to rebuy them, like I've now bought most
> of my music collection TWICE (vinyl and CD).

[let's not get started about vinyl collections - that's a whole 'nother set of unopened boxes]

The problem is that many of the media houses are still waging an underground war on Fair Use, despite the legal decisions handed down by the courts.  As an example, I recently had an email exchange with one of the customer service people at a major network.  I was trying to locate additional interview footage from when my brother-in-law was on a certain hour-long Sunday evening news show.  This person informed me that I did not have their "permission" to record the over-the-air broadcast of the show and burn it to a DVD to give to my sister, so what I was doing was not legal.

This was news to me, since this usage model was clearly defined as permissible by the Supreme Court many years ago in the Sony v. Universal "Betamax Case".  

While the market for online music, video and written works has forced the various publishers to acknowledge the need to provide content in digital form, to a great extent they had to be dragged kicking and screaming into the 21st century.  A lot of progress has been made, but there is still a lot of resistance towards efforts to open up availability and access even further.


I would like to see a service where I could take bins of old books to a used book store and somehow get credits towards the purchase of e-books online.  I think that could break me of my paperback hoarding habit pretty quickly.


-bill




------------------------------

Message: 2
Date: Tue, 26 Oct 2010 10:59:25 -0400 (EDT)
From: "Douglas Eadline" <deadline at eadline.org>
Subject: RE: [Beowulf] how Google warps your brain
To: "Hearns, John" <john.hearns at mclaren.com>
Cc: beowulf at beowulf.org, "Robert G. Brown" <rgb at phy.duke.edu>
Message-ID:
	<49886.192.168.93.213.1288105165.squirrel at mail.eadline.org>
Content-Type: text/plain;charset=iso-8859-1

<Seinfeld>
Not that there is anything wrong with that.
</Seinfeld>

>
> As usual, a highly insightful post from RGB.
>
>
>
>>  a) Multiple copies.  Passenger pigeons may be robust, but once the
>> number of copies drops below a critical point, they are gone.  E. coli
>> we will always have with us (possibly in a constantly changing form)
>> because there are so very many copies, so very widely spread.
>
> I probably shouldn't mention Wikileaks here...
>
>>
>> At the moment, the internet has if anything VASTLY INCREASED a, b and c
>> for every single document in the public domain that has been ported to,
>> e.g. Project Gutenberg.
>>
>> Right now, I'm sitting on a cache of "Saint" books, by Leslie Charteris
>> (who was a great favorite of mine growing up and still is).
>>
>> Nobody is going to reprint the Saint stories.  They are a gay fantasy
>> from another time,
>
> Simon Templar? Gay? Cough.
>
> Next you will be telling me that there are gay undertones in Top Gun,
> the film with the sexiest astrophysicist ever.
>
>
>> might well last to the end of civilization.  Replicate them a few
>> million times, PERPETUATE them from generation to generation by
>> renewing the copies, and backing them up, and recopying them in formats
>> where they are still useful.
>
> The cloud backup providers will be keeping copies of data on
> geographically spread sites.
> However, we should at this stage be asking what mechanisms cloud
> storage companies have for:
> *) living wills - what happens when the company goes bust
>
> *) migrating the data onto new storage formats
>
>
>>
>> Or, to put it differently, suppose every single human on the planet had
>> access to the modern equivalent of Diophantus's Arithmetica on their
>> computer, their Kindle, their Ipad
> I believe that was the original intent for the Web. Still under
> development!
>
>
> The contents of this email are confidential and for the exclusive use of
> the intended recipient.  If you receive this email in error you should not
> copy it, retransmit it, use it or disclose its contents but should return
> it to the sender immediately and delete your copy.
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
>
> --
> This message has been scanned for viruses and
> dangerous content by MailScanner, and is
> believed to be clean.
>


-- 
Doug




------------------------------

Message: 3
Date: Tue, 26 Oct 2010 09:16:47 +0100
From: "Hearns, John" <john.hearns at mclaren.com>
Subject: RE: [Beowulf] Anybody using Redhat HPC Solution in their
	Beowulf
To: "Ellis H. Wilson III" <ellis at runnersroll.com>,
	<beowulf at beowulf.org>
Message-ID:
	<68A57CCFD4005646957BD2D18E60667B12154E23 at milexchmb1.mil.tagmclarengroup.com>
	
Content-Type: text/plain; charset="us-ascii"

> I don't think you could find a statement more orthogonal to the spirit
> of the Beowulf list than, "Please, please don't "roll your own"
> system..."  Isn't Beowulfery about the drawing together of inexpensive
> components in an intelligent fashion suited just for your particular
> application while using standardized (and thereby cheap by the law of
> scale) hardware?  I'm not suggesting Richard build his own NIC - but
> there is nothing wrong with using even a distribution of Linux not
> intended for HPC (so long as you're smart about it) and picking and
> choosing the software (queuing managers, tracers, etc) he finds works
> best.
> 
> Also, I would argue if a company is selling you an HPC solution, it's
> either:
> 1. A true Beowulf in terms of using COTS hardware, in which case you
> are
> likely getting less than your money is worth or


Ellis, I am going to politely disagree with you - now there's a
surprise!

I have worked as an engineer for two HPC companies - Clustervision and
Streamline.
My slogan phrase on this issue is "Any fool can go down PC World and buy
a bunch of PCs".
By that I mean that CPU is cheap these days, but all you will get is a
bunch of boxes on your loading bay. As you say, and you are right, you
then have the option of installing Linux plus a cluster management stack
and getting a cluster up and running.

However, as regards price, I would say that actually you will be paying
very, very little premium for getting a supported, tested and
pre-assembled cluster from a vendor.
Academic margins are razor thin - the companies are not growing fat over
academic deals.
They also can get special pricing from Intel/AMD if the project can be
justified - probably ending up at a price per box near to what you pay
at PC World.

Or take (say) rack top switches. Do you want to have a situation where
the company which supports your cluster has switches sitting on a shelf,
so when a switch fails someone (me!) is sent out the next morning to
deliver a new switch in a box, cable it in and get you running?
Or do you want to deal direct with the returns department at $switch
vendor, or even (shudder) take the route of using the same switches as
the campus network - so you don't get to choose on the basis of
performance or suitability, but just depend on the warm and fuzzies your
campus IT people have?


We then come to support - say you buy that heap of boxes from a Tier 1 -
say it is the same company your campus IT folks have a campus-wide deal
with. You'll get the same type of support you get for general servers
running Windows - and you'll deal with first-line support staff on the
phone every time.
Me, I've been there, seen it, done it with Tier 1 support like that.
As a for instance, HPC workloads tend to stress the RAM in a system, and
you get frequent ECC errors on a young system as it is bedding in. Try
phoning support every time a light comes on and getting talked through
the "have you run XXX diagnostic" routine; it soon gets wearing.
Before Tier 1 companies cry foul, of course both the above companies and
all other cluster companies integrate Tier 1 servers - but that is a
different scenario from getting boxes delivered through your campus
agreement with $Tier1.















------------------------------

Message: 4
Date: Tue, 26 Oct 2010 12:09:12 -0400
From: "Ellis H. Wilson III" <ellis at runnersroll.com>
Subject: Re: [Beowulf] Anybody using Redhat HPC Solution in their
	Beowulf
To: "Hearns, John" <john.hearns at mclaren.com>
Cc: beowulf at beowulf.org
Message-ID: <4CC6FD28.1050303 at runnersroll.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

On 10/26/10 04:16, Hearns, John wrote:
> I have worked as an engineer for two HPC companies - Clustervision and
> Streamline.
> My slogan phrase on this issue is "Any fool can go down PC World and buy
> a bunch of PCs"

Well if you are buying PCs in bulk at retail pricing, you are a fool 
anyway.  Plus most PC World PCs won't have ECC RAM so I wasn't really 
referring to those as few of us tolerate random bit flips.

> However, as regards price, I would say that actually you will be paying
> very, very little premium
> for getting a supported, tested and pre-assembled cluster from a vendor.
> Academic margins are razor thin - the companies are not growing fat over
> academic deals.
> They also can get special pricing from Intel/AMD if the project can be
> justified - probably ending
> up at a price per box near to what you pay at PC World.

Again, not comparing PC World to Tier 1 bulk purchases.  I'm comparing
Tier 1 bulk purchases w/o an OS (so you can DIY) with specialized HPC
vendor purchases where you don't have to DIY.  Even then, perhaps it
breaks even the first year if you get a very, very good deal from the
HPC vendor.  However, to get the deal you are probably contracted into
four or five years of support, and when considering HPC, involving more
humans is the fastest way to get a really inefficient and expensive
cluster.  After the first year and for the rest of the cluster's
lifetime, annual human support will add a large cost overhead that you
have to account for at the beginning (and you will probably buy less
hardware because of it).

> Or take (say) rack top switches. Do you want to have a situation where
> the company which supports your cluster
> has switches sitting on a shelf, so when a switch fails someone (me!) is
> sent out the next morning to deliver
> a new switch in a box, cable it in and get you running?

That's probably a hell of a lot faster than waiting on a vendor to get 
you a new switch through some RMA process.  Plus you know the cabling is 
done right :).

Optimally IMHO, in university setups physical scientists create the need 
for HPC.  These types shouldn't (as Kilian mentions) need to inherit all 
of the responsibilities and overheads of cluster management to use one 
(or pay cluster vendors annually for support).  They should simply walk 
over to the CS department, find system guys (who would probably drool 
over the potential of administering a reasonably sized cluster) and work 
out an agreement where the physical science types can "just use it" and 
the systems/CS guys administer it and can once in a while trace 
workloads, test new load balancing mechanisms, try different kernel 
settings for performance, etc.  This way the physical scientists get 
their work done on a well supported HPC system for no extra cash and 
computer scientists get great, non-toy traces and workloads to further 
their own research.  Both parties win.

Now in organizations that don't have a CS department I agree that HPC 
vendors are the way to go.

ellis


------------------------------

Message: 5
Date: Tue, 26 Oct 2010 11:18:56 +0200
From: Kilian CAVALOTTI <kilian.cavalotti.work at gmail.com>
Subject: Re: [Beowulf] Anybody using Redhat HPC Solution in their
	Beowulf
To: "Ellis H. Wilson III" <ellis at runnersroll.com>
Cc: beowulf at beowulf.org
Message-ID:
	<AANLkTimORzXXM=69Lq3KLNuanO30v6k2+qHy7Vs-6e-_ at mail.gmail.com>
Content-Type: text/plain; charset=UTF-8

Hi,

On Tue, Oct 26, 2010 at 1:00 AM, Ellis H. Wilson III
<ellis at runnersroll.com> wrote:
> Also, I would argue if a company is selling you an HPC solution, it's
> either:
> 1. A true Beowulf in terms of using COTS hardware, in which case you are
> likely getting less than your money is worth or

Well, depends on how you value your time and the required expertise to
put all those COTS and OSS pieces together to make them run smoothly
and efficiently.
Most scientists and HPC systems users are not professional sysadmins
(which is good, they have a job to do), and the value of trained,
experienced, skilled individuals who can put together a reliable and
useful HPC system is sometimes overlooked (i.e. undervalued).

I agree with your later statement, though:

> I personally don't think the "market for cluster vendors" is [...]
> the Beowulf list.

Cheers,
-- 
Kilian


------------------------------

Message: 6
Date: Wed, 27 Oct 2010 09:32:43 -0700
From: "Lux, Jim (337C)" <james.p.lux at jpl.nasa.gov>
Subject: RE: [Beowulf] Anybody using Redhat HPC Solution in their
	Beowulf
To: "Ellis H. Wilson III" <ellis at runnersroll.com>, "Hearns, John"
	<john.hearns at mclaren.com>
Cc: "beowulf at beowulf.org" <beowulf at beowulf.org>
Message-ID:
	<ECE7A93BD093E1439C20020FBE87C47FEDD29F961A at ALTPHYEMBEVSP20.RES.AD.JPL>
	
Content-Type: text/plain; charset="us-ascii"


> 
> Optimally IMHO, in university setups physical scientists create the need
> for HPC.  These types shouldn't (as Kilian mentions) need to inherit all
> of the responsibilities and overheads of cluster management to use one
> (or pay cluster vendors annually for support).  They should simply walk
> over to the CS department, find system guys (who would probably drool
> over the potential of administering a reasonably sized cluster) and work
> out an agreement where the physical science types can "just use it" and
> the systems/CS guys administer it and can once in a while trace
> workloads, test new load balancing mechanisms, try different kernel
> settings for performance, etc.  This way the physical scientists get
> their work done on a well supported HPC system for no extra cash and
> computer scientists get great, non-toy traces and workloads to further
> their own research.  Both parties win.
> 


I don't know about this model.
This is like developing software on prototype hardware.  The hardware guys and gals keep wanting to change the hardware, and the software developers complain that their software keeps breaking, or that the hardware is buggy (and it is).

The computational physics and computational biology guys get to work on cool, nifty stuff to push their dissertation forward by using a hopefully stable computational platform.
But I don't think the CS guys would drool over the possibility of administering a cluster. The CS guys get to be sysadmin/maintenance types...not very fun for them, and not the kind of work that would work for their dissertation.  

Now, if the two groups were doing research on new computational methods (what's the best way to simulate X) perhaps you'd get a collaboration.




------------------------------

_______________________________________________
Beowulf mailing list
Beowulf at beowulf.org
http://www.beowulf.org/mailman/listinfo/beowulf


End of Beowulf Digest, Vol 80, Issue 22
***************************************



