[Beowulf] O'Reilly Clusters Book Review

Glen Otero gotero at linuxprophet.com
Fri Feb 25 00:23:31 PST 2005


My review of O'Reilly's latest clusters book, published at HPCwire
(http://www.tgc.com/hpcwire.html):

> 'Crazy Talk' Clutters New Cluster Book
>  Glen Otero, Linux Prophet

>   When my colleagues and I heard that O'Reilly was releasing another
>   cluster book ("High Performance Linux Clusters with OSCAR, Rocks,
>   openMosix & MPI"), we knew it would not turn out well. One of my
>   colleagues even said, "It's going to be written by some guy that
>   doesn't know anything and [gets all excited] over clusters."
>
>   Why such a pessimistic prediction?
>
>   For one, it was uttered by the same cluster expert that O'Reilly
>   ignored while producing their first cluster book debacle several
>   years ago. When told that their first book ("Building Linux Clusters" by
>   David Spector) should be scrapped and rewritten, O'Reilly ignored
>   their reviewers. The advice only came from the knowledgeable folks at
>   VA Linux, *the* cluster company at that time. But what does VA Linux
>   know? It's O'Reilly, they obviously know better.
>
>   The first O'Reilly cluster book was a complete disaster. I wrote a
>   scathing review of it for Linux Journal in 2000. Completely void of
>   anything useful, the book and included software were simply not
>   finished. It was like reading a rough draft. Totally embarrassed, and
>   suddenly void of hubris, O'Reilly apologized to its audience and
>   pulled the book from print.
>
>   Not satisfied to sit around pointing fingers and complaining, I told
>   O'Reilly I would help them with their next cluster book attempt, if
>   there even was one. Before long, I signed a contract to write a
>   clusters book for O'Reilly. But in their infinite wisdom, they didn't
>   like the first few chapters that I submitted. Although I had gotten
>   other cluster experts to review what I had written, O'Reilly didn't
>   bother to get any experts to review what I was writing. They just
>   didn't like it, so they dismissed it out of hand. Needless to say, the
>   "we know better" attitude was back, and that ended the contract.
>
>   Which brings us to present day. This latest cluster book suffers from
>   the same brain-damaged, hubris-driven process at O'Reilly. Just like
>   the first book, it's written by a virtual unknown in the cluster
>   community (Joseph D. Sloan) and comes across as having been written
>   in a vacuum.
>
>   Let's start with the book's title, "High Performance Linux Clusters
>   with OSCAR, Rocks, openMosix & MPI." There's nothing high-performance
>   about this book because there's no discussion of using any high
>   performance networks like Myrinet, Infiniband, or Quadrics outside of
>   four paragraphs on page 40. There are so many ill-informed sweeping
>   generalizations made about cluster networks on that page that I threw
>   the book against the wall when I read them. For example, Quadrics and
>   Infiniband are clearly established networking technologies, not
>   merely "emerging," as the author believes. Sloan obviously hasn't attended a
>   Supercomputing conference in the last several years. Unfortunately,
>   the rest of the book is rife with inaccurate oversimplifications
>   about clusters and incorrect definitions of terms like single
>   system image (SSI) and virtual machine interface (VMI). The
>   "beginner's guide" design of the book is no excuse for inaccuracies
>   and oversimplifications.
>
>   In my eyes, this book was doomed for the trash after page 8. Sloan
>   states that the term "Beowulf" is a politically charged term that
>   would be avoided in the book.  That is the most ridiculous thing I
>   have ever heard. It's impossible to take that comment seriously,
>   especially since the author doesn't even take the time to properly
>   define a Beowulf.  For these reasons alone, I can't take this book
>   seriously. I've thrown back my share of adult beverages with Don
>   Becker, and trust me when I say that the political nature of Beowulf
>   has never come up. Adding to the confusion, the phrase "more
>   traditional Beowulf-style cluster" is then used on page 63. I hope
>   now you'll understand why I think this book is schizophrenic at best.
>
>   Defining a Beowulf shouldn't have been too difficult for Sloan. He
>   could have used a term that he introduced on page 10, "asymmetric
>   cluster."  But I guess it's too much to ask that the Beowulf project,
>   Tom Sterling and Don Becker's brainchild that started the high
>   performance cluster phenomenon, be properly described and defined in
>   a clusters book.  By the way, I've never heard the term "asymmetric
>   architecture" used when describing clusters. And, outside this book,
>   you won't either.
>
>   After page 8, it's apparent that the author has nothing original to
>   offer and is going to regurgitate what has already been written about
>   clusters. There is absolutely no value in this because the online
>   documentation for all of the cluster projects covered by the author
>   is far more informative than what is included in the book. For example,
>   while screenshots of a cluster install are included in the online
>   Rocks documentation, they are omitted in the book. Furthermore, after
>   regurgitating much of the online Rocks documentation, the author
>   doesn't offer any additional helpful hints or troubleshooting advice.
>   As someone who runs a company that provides and supports cluster
>   software based on Rocks, I can tell you that there are plenty of
>   pitfalls that should have been mentioned.
>
>   This underscores my major complaint with this book. There's nothing
>   new, nothing novel and no real help offered. Everything is just laid
>   out superficially in front of the reader for them to make the right
>   cluster decision. The book should guide the cluster decision-making
>   process, but it only offers a bunch of questions -- with no
>   substantial answers.
>
>   Sloan even admits on page 91 that there is a very detailed set of
>   installation instructions for OSCAR, including screen shots,
>   available online. So why is this book necessary again? Oh yeah, the author is
>   supposed to help the reader decide if OSCAR, or any cluster toolkit
>   for that matter, is right for the reader. Unfortunately, no help of
>   any kind is offered.
>
>   The typos and omissions weren't rampant this time, but the errors I
>   found on pages 76, 123, 127, 130, and 136 provided nasty flashbacks
>   of the first O'Reilly book. Good thing I resigned myself to do a shot of
>   tequila after every typo I found. It dulled the pain this book
>   inflicted.
>
>   OK. "Part I -- An Introduction to Clusters" is just inaccurate and
>   infuriating. "Part II -- Getting Started Quickly" contains recycled
>   and reformatted content easily found for free online. "Part III --
>   Building Custom Clusters" isn't really about building custom
>   clusters, but looks more closely at some software that was glossed
>   over in Parts I & II. While I don't agree with the inclusion of the
>   parallel virtual file system (PVFS) and the omission of Sun Grid
>   Engine in Part III, I'm sure this can be chalked up to one of the
>   tough decisions the author had to make, like the omission of PVM and
>   Condor from the book.
>   "Part IV -- Cluster Programming" is actually a very good introduction
>   to programming, debugging, and profiling MPI programs.
>
>   It's obvious that this book has no clear identity. It's like a 5th
>   grader's book report: a lifeless facsimile of what's been read,
>   totally void of originality, wisdom or topic advancement. But it's a
>   quick read because it uses small words.
>
>   Should I be this harsh? After all, cluster computing is a complex
>   subject where the answer to most questions is "it depends."  However,
>   I believe that O'Reilly owed us an excellent book after their first
>   cluster gaffe, so I'm disappointed that O'Reilly took the easy way
>   out by reorganizing and watering down documentation that is available
>   elsewhere. Even the content in the exemplary Part IV can be found in
>   several other places. It's just a lot less technical and intimidating
>   here. 
>
>   There are better ways to write a clusters book. I know because I've
>   read several cluster book outlines by members of the cluster
>   intelligentsia that would have been better than this offering. So I'm
>   not going easy on O'Reilly, no matter how good their intentions. The
>   cluster community has a difficult enough time assisting people with
>   clusters without books like this dynamiting the proverbial cluster
>   well. The statement on page 28, "...benchmarking is probably a
>   meaningless activity and waste of time," is just plain wrong and
>   demonstrates a glaring lack of cluster understanding.
>
>   If you really want to learn about clusters, pick up a copy of
>   Sterling's "Beowulf Cluster Computing with Linux," 2nd edition, or
>   check out Warewulf, Rocks, OSCAR, OpenMosix, and ClusterWorld online.
>   You could join a mailing list, like the Beowulf mailing list, and
>   subscribe to ClusterWorld Magazine. This is where the creators and
>   maintainers of all that is clustering hang out, announce, debate,
>   rant, create, lurk, help, and publish. If you want to be part of
>   clustering's future, then you'll check out the community's Cluster
>   Agenda and attend this year's ClusterWorld conference.
>
>   =================================================
>   Glen Otero received his Ph.D. in Microbiology and Immunology from
>   UCLA in 1995 and immediately escaped to the more temperate climes and
>   better surf in San Diego. After some research on the molecular and
>   cellular biology of HIV and Herpes viruses at the Salk Institute for
>   Biological Sciences, Glen left the wet lab research bench in 1999.
>   Although he left the research bench, he didn't leave science
>   altogether, traveling all the way across the street to the San Diego
>   Supercomputer Center (SDSC) for a stint at the Protein Data Bank. It
>   was while at SDSC that Glen had his Linux clusters and bioinformatics
>   epiphany. Soon after that illuminating event, Glen founded Linux
>   Prophet, a bioinformatics consultancy specializing in the
>   implementation, design, and deployment of Linux Beowulf clusters in
>   the life sciences. Late in 2002 Linux Prophet evolved into Callident,
>   a Linux cluster software and high performance computing company.
>
Glen Otero Ph.D.
Linux Prophet
