[Beowulf] O'Reilly Clusters Book Review

Glen Otero gotero at linuxprophet.com
Fri Feb 25 00:23:31 PST 2005


My review of O'Reilly's latest clusters book, as published at HPCwire
(http://www.tgc.com/hpcwire.html):

> 'Crazy Talk' Clutters New Cluster Book
>  Glen Otero, Linux Prophet

>   When my colleagues and I heard that O'Reilly was releasing another
>   cluster book ("High Performance Linux Clusters with OSCAR, Rocks,
>   openMosix & MPI"), we knew it would not turn out well. One of my
>   colleagues even said, "It's going to be written by some guy that
>   doesn't know anything and [gets all excited] over clusters."
>
>   Why such a pessimistic prediction?
>
>   For one, it was uttered by the same cluster expert that O'Reilly
>   ignored while producing their first cluster book debacle several
>   years ago. When told that their first book ("Building Linux Clusters" by
>   David Spector) should be scrapped and rewritten, O'Reilly ignored
>   their reviewers. The advice only came from the knowledgeable folks at
>   VA Linux, *the* cluster company at that time. But what does VA Linux
>   know? It's O'Reilly, they obviously know better.
>
>   The first O'Reilly cluster book was a complete disaster. I wrote a
>   scathing review of it for Linux Journal in 2000. Completely void of
>   anything useful, the book and included software were simply not
>   finished. It was like reading a rough draft. Totally embarrassed, and
>   suddenly void of hubris, O'Reilly apologized to its audience and
>   pulled the book from print.
>
>   Not satisfied to sit around pointing fingers and complaining, I told
>   O'Reilly I would help them with their next cluster book attempt, if
>   there even was one. Before long, I signed a contract to write a
>   clusters book for O'Reilly. But in their infinite wisdom, they didn't
>   like the first few chapters that I submitted. Although I had gotten
>   other cluster experts to review what I had written, O'Reilly didn't
>   bother to get any experts to review what I was writing. They just
>   didn't like it, so they dismissed it out of hand. Needless to say, the
>   "we know better" attitude was back, and that ended the contract.
>
>   Which brings us to present day. This latest cluster book suffers from
>   the same brain-damaged, hubris-driven process at O'Reilly. Just like
>   the first book, it's written by a virtual unknown in the cluster
>   community (Joseph D. Sloan) and comes across as having been written
>   in a vacuum.
>
>   Let's start with the book's title, "High Performance Linux Clusters
>   with OSCAR, Rocks, openMosix & MPI." There's nothing high-performance
>   about this book because there's no discussion of using any high
>   performance networks like Myrinet, Infiniband, or Quadrics outside of
>   four paragraphs on page 40. There are so many ill-informed sweeping
>   generalizations made about cluster networks on that page that I threw
>   the book against the wall when I read them. For example, Quadrics and
>   Infiniband are clearly established networking technologies, not
>   merely "emerging," as the author believes. Sloan obviously hasn't
>   attended a Supercomputing conference in the last several years.
>   Unfortunately, the rest of the book is rife with inaccurate
>   oversimplifications about clusters and incorrect definitions of terms
>   like single system image (SSI) and virtual machine interface (VMI).
>   The "beginner's guide" design of the book is no excuse for
>   inaccuracies and oversimplifications.
>
>   In my eyes, this book was doomed for the trash after page 8. Sloan
>   states that the term "Beowulf" is a politically charged term that
>   would be avoided in the book.  That is the most ridiculous thing I
>   have ever heard. It's impossible to take that comment seriously,
>   especially since the author doesn't even take the time to properly
>   define a Beowulf.  For these reasons alone, I can't take this book
>   seriously. I've thrown back my share of adult beverages with Don
>   Becker, and trust me when I say that the political nature of Beowulf
>   has never come up. Adding to the confusion, the phrase "more
>   traditional Beowulf-style cluster" is then used on page 63. I hope
>   now you'll understand why I think this book is schizophrenic at best.
>
>   Defining a Beowulf shouldn't have been too difficult for Sloan. He
>   could have used a term that he introduced on page 10, "asymmetric
>   cluster."  But I guess it's too much to ask that the Beowulf project,
>   Tom Sterling and Don Becker's brainchild that started the high
>   performance cluster phenomenon, be properly described and defined in
>   a clusters book.  By the way, I've never heard the term "asymmetric
>   architecture" used when describing clusters. And, outside this book,
>   you won't either.
>
>   After page 8, it's apparent that the author has nothing original to
>   offer and is going to regurgitate what has already been written about
>   clusters. There is absolutely no value in this because the online
>   documentation for all of the cluster projects covered by the author
>   is far more informative than what is included in the book. For example,
>   while screenshots of a cluster install are included in the online
>   Rocks documentation, they are omitted in the book. Furthermore, after
>   regurgitating much of the online Rocks documentation, the author
>   doesn't offer any additional helpful hints or troubleshooting advice.
>   As someone who runs a company that provides and supports cluster
>   software based on Rocks, I can tell you that there are plenty of
>   pitfalls that should have been mentioned.
>
>   This underscores my major complaint with this book. There's nothing
>   new, nothing novel and no real help offered. Everything is just laid
>   out superficially in front of the reader for them to make the right
>   cluster decision. The book should guide the cluster decision-making
>   process, but it only offers a bunch of questions -- with no
>   substantial answers.
>
>   Sloan even admits on page 91 that there is a very detailed set of
>   installation instructions for OSCAR, including screen shots,
>   available online. So why is this book necessary again? Oh yeah, the
>   author is
>   supposed to help the reader decide if OSCAR, or any cluster toolkit
>   for that matter, is right for the reader. Unfortunately, no help of
>   any kind is offered.
>
>   The typos and omissions weren't rampant this time, but the errors I
>   found on pages 76, 123, 127, 130, and 136 provided nasty flashbacks
>   of the first O'Reilly book. Good thing I resigned myself to do a shot of
>   tequila after every typo I found. It dulled the pain this book
>   inflicted.
>
>   OK. "Part I -- An Introduction to Clusters" is just inaccurate and
>   infuriating. "Part II -- Getting Started Quickly" contains recycled
>   and reformatted content easily found for free online. "Part III --
>   Building Custom Clusters" isn't really about building custom
>   clusters, but looks more closely at some software that was glossed
>   over in Parts I & II. While I don't agree with the inclusion of the
>   parallel virtual file system (PVFS) and the omission of Sun Grid
>   Engine in Part III, I'm sure this can be chalked up to one of the
>   tough decisions the author had to make, like the omission of PVM and
>   Condor from the book.
>   "Part IV -- Cluster Programming" is actually a very good introduction
>   to programming, debugging, and profiling MPI programs.
>
>   It's obvious that this book has no clear identity. It's like a 5th
>   grader's book report: a lifeless facsimile of what's been read,
>   totally void of originality, wisdom or topic advancement. But it's a
>   quick read because it uses small words.
>
>   Should I be this harsh? After all, cluster computing is a complex
>   subject where the answer to most questions is "it depends."  However,
>   I believe that O'Reilly owed us an excellent book after their first
>   cluster gaffe, so I'm disappointed that O'Reilly took the easy way
>   out by reorganizing and watering down documentation that is available
>   elsewhere. Even the content in the exemplary Part IV can be found in
>   several other places. It's just a lot less technical and intimidating
>   here. 
>
>   There are better ways to write a clusters book. I know because I've
>   read several cluster book outlines by members of the cluster
>   intelligentsia that would have been better than this offering. So I'm
>   not going easy on O'Reilly, no matter how good their intentions. The
>   cluster community has a difficult enough time assisting people with
>   clusters without books like this dynamiting the proverbial cluster
>   well. The statement on page 28, "...benchmarking is probably a
>   meaningless activity and waste of time," is just plain wrong and
>   demonstrates a glaring lack of cluster understanding.
>
>   If you really want to learn about clusters, pick up a copy of
>   Sterling's "Beowulf Cluster Computing with Linux," 2nd edition, or
>   check out Warewulf, Rocks, OSCAR, openMosix, and ClusterWorld online.
>   You could join a mailing list, like the Beowulf mailing list, and
>   subscribe to ClusterWorld Magazine. This is where the creators and
>   maintainers of all that is clustering hang out, announce, debate,
>   rant, create, lurk, help, and publish. If you want to be part of
>   clustering's future, then you'll check out the community's Cluster
>   Agenda and attend this year's ClusterWorld conference.
>
>   =================================================
>   Glen Otero received his Ph.D. in Microbiology and Immunology from
>   UCLA in 1995 and immediately escaped to the more temperate climes and
>   better surf in San Diego. After some research on the molecular and
>   cellular biology of HIV and Herpes viruses at the Salk Institute for
>   Biological Studies, Glen left the wet lab research bench in 1999.
>   Although he left the research bench, he didn't leave science
>   altogether, traveling all the way across the street to the San Diego
>   Supercomputer Center (SDSC) for a stint at the Protein Data Bank. It
>   was while at SDSC that Glen had his Linux clusters and bioinformatics
>   epiphany. Soon after that illuminating event, Glen founded Linux
>   Prophet, a bioinformatics consultancy specializing in the
>   implementation, design, and deployment of Linux Beowulf clusters in
>   the life sciences. Late in 2002 Linux Prophet evolved into Callident,
>   a Linux cluster software and high performance computing company.
>
Glen Otero Ph.D.
Linux Prophet
