pratte at lincweb.com
Tue Jun 20 17:45:26 PDT 2000
You have probably gone through this list already, but sometimes it is
useful to check off the basics at least.
1) Examine the hardware you are using. I wouldn't be surprised if the
bottleneck you are facing is disk access. What type of file system are you
using, what OS, etc.? I would guess that you are dealing with disk-intensive
processes (you are stuffing 1.5 gig of data into your free memory... :) ), so
improving throughput via threading the application/running it in parallel/etc. may not
help much. I have seen HUGE differences with processes like this, though, by upgrading
drive arrays. If you aren't using them, look at EMCs, or similar products.
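One quick way to test the disk suspicion in #1 is to time a raw sequential read of the log and compare it against your 12-hour processing run. A rough sketch (the file path is whatever your log lives at; numbers will vary with caching):

```python
import time

def read_throughput(path, chunk_size=1 << 20):
    """Read the file sequentially in 1 MB chunks and return MB/s.
    If raw reads already take a big slice of the 12 hours, the job
    is I/O-bound and adding CPUs won't buy much."""
    start = time.time()
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.time() - start
    return (total / (1024 * 1024)) / elapsed if elapsed > 0 else float("inf")
```

If the read alone is slow, look at the drive hardware first; if it's fast, the time is going into the processing itself (see #3).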
2) How is the data set up? Are you processing one giant log for a small
number of processes, or is this some concatenated/conglomerated log(s) that can
easily be divided? In the latter case, distributing the logs (not
the process) may be a quick answer.
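If the log can be divided, the split has to land on line boundaries or you corrupt records at the seams. A minimal sketch (the chunk-file naming is made up; in practice you'd ship each piece to a different node):

```python
import os

def split_log(path, n_parts, out_prefix="chunk"):
    """Split a text log into n_parts files on line boundaries.
    Each part is roughly size/n_parts bytes; the last part takes
    whatever remains. Returns the list of output filenames."""
    target = os.path.getsize(path) // n_parts + 1
    outputs = []
    with open(path, "rb") as f:
        for i in range(n_parts):
            out_name = "%s.%d" % (out_prefix, i)
            written = 0
            with open(out_name, "wb") as out:
                for line in f:  # resumes where the last part stopped
                    out.write(line)
                    written += len(line)
                    if written >= target and i < n_parts - 1:
                        break
            outputs.append(out_name)
    return outputs
```

The same effect is available from the shell with `split -l` if you'd rather not script it.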
3) Examine the script running the process. You probably have some regular
expression matching going on if this is a shell script/perl/python/etc. Read the
O'Reilly Regular Expression book, if you haven't already....quite elucidating.
If you are using a binary, check the source code (if available); there are lots
of performance tweaks that may be useful. Recompiling with different flags may
also help.
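One of the cheapest regex wins is hoisting pattern compilation out of the per-line loop; over millions of log lines, re-compiling (or a pattern that backtracks badly) can dominate runtime. A sketch, assuming a hypothetical web-log-ish line format:

```python
import re

# Compiled once, not once per line. The pattern itself is an
# assumption -- adapt it to your actual log format.
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "([^"]*)"')

def parse(lines):
    """Return (host, timestamp, request) tuples for matching lines."""
    hits = []
    for line in lines:
        m = LINE_RE.match(line)
        if m:
            hits.append(m.groups())
    return hits
```

Prefer anchored patterns and character classes like `[^"]*` over `.*`; unanchored wildcard patterns are a classic source of quadratic-time matching on long lines.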
4) Examine the processes running on the box. Unnecessary daemons, etc. just drag
down performance....and create security hazards. Is the kernel optimized?
5) See #1.....I am really suspicious that disk may be chewing up a lot of
your time.
Kurt Brust wrote:
> Hello, I am sure you are busy, so I will not take up much of your time.
> In regards to clustering, is it possible to set up a Beowulf cluster to
> help process a log file (txt based) over multiple processors to help
> distribute the load? Right now it's at 1.5 gigs a day, takes 12 hours to
> process, and I am looking to cut that down as much as possible.
> Thanks for your time!!!
> Beowulf mailing list
> Beowulf at beowulf.org