quick question
Robert Pratte pratte at lincweb.com
Tue Jun 20 17:45:26 PDT 2000
You have probably gone through this list already, but sometimes it is helpful to check off the basics at least.

1) Examine the hardware you are using. I wouldn't be surprised if the biggest bottleneck you are facing is disk access. What type of file system are you using, what OS, etc.? I would guess that you are dealing with disk-intensive processes (unless you are stuffing 1.5 gig of data into your free memory... :) ), so increasing processor throughput by threading the application, running it in parallel, etc. may not gain much. I have seen HUGE differences with processes like this, though, by upgrading drive arrays. If you aren't using them, look at EMCs or similar products.

2) How is the data set up? Are you processing one giant log for a small group of processes, or is this some concatenated/conglomerated log (or logs) that can easily be divided? In the latter case, distributing the logs (not necessarily the process) may be a quick answer.

3) Examine the script running the process. You probably have some regular expression matching going on if this is a shell script/perl/python/etc. Read the O'Reilly Regular Expression book if you haven't already... quite elucidating. If you are using a binary, check the source code (if available); there are lots of performance tweaks for C/C++/etc. that may be useful. Recompiling with different flags may also help.

4) Examine the processes running on the box. Unnecessary daemons, etc. just drag down performance... and create security hazards. Is the kernel optimized?

5) See #1... I am really suspicious that disk may be chewing up a lot of your time.

Kurt Brust wrote:

> Hello, I am sure you are busy, so I will not take up much of your time.
>
> In regards to clustering, is it possible to set up a Beowulf cluster to
> help process a log file (txt based) over multiple processors to help
> distribute the load? Right now it's at 1.5 gigs a day, takes 12 hours to
> process, I am looking to cut that down as much as possible.
>
> Thanks for your time!!!
>
> _______________________________________________
> Beowulf mailing list
> Beowulf at beowulf.org
> http://www.beowulf.org/mailman/listinfo/beowulf
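
To make points 2 and 3 a bit more concrete, here is a rough Python sketch of dividing one large log into line-aligned pieces and running the same precompiled regular expression over each piece in a separate worker process. Nothing in it comes from the original script: the pattern (STATUS_RE, which counts HTTP-style status codes), the function names, and the worker count are all made up for illustration, and it only pays off if the job is CPU-bound rather than disk-bound, as point 1 warns.

```python
import os
import re
from collections import Counter
from multiprocessing import Pool

# Hypothetical pattern: count HTTP-style status codes. Compiled once, reused per line.
STATUS_RE = re.compile(rb'" (\d{3}) ')


def chunk_offsets(path, n_chunks):
    """Return (start, end) byte ranges that each begin and end on a line boundary."""
    size = os.path.getsize(path)
    bounds = [0]
    with open(path, "rb") as f:
        for i in range(1, n_chunks):
            f.seek(size * i // n_chunks)
            f.readline()  # move forward to the start of the next line
            pos = f.tell()
            if bounds[-1] < pos < size:
                bounds.append(pos)
    bounds.append(size)
    return list(zip(bounds[:-1], bounds[1:]))


def count_chunk(task):
    """Scan one byte range of the log and tally the matches found in it."""
    path, start, end = task
    counts = Counter()
    with open(path, "rb") as f:
        f.seek(start)
        pos = start
        while pos < end:
            line = f.readline()
            if not line:  # unexpected end of file
                break
            pos += len(line)
            match = STATUS_RE.search(line)
            if match:
                counts[match.group(1).decode()] += 1
    return counts


def count_log(path, workers=4):
    """Split the log into line-aligned ranges and merge the per-range tallies."""
    tasks = [(path, start, end) for start, end in chunk_offsets(path, workers)]
    total = Counter()
    with Pool(workers) as pool:
        for partial in pool.imap_unordered(count_chunk, tasks):
            total.update(partial)
    return total
```

Calling count_log("access.log") (file name hypothetical) returns a single merged Counter. On platforms that start worker processes by spawning rather than forking, that call should sit under an `if __name__ == "__main__":` guard. The same byte ranges could just as well be handed to separate machines, which is the "distribute the logs, not the process" idea; if every worker reads from the same slow disk, expect little improvement.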
