DBases in very large RAMDisks

Thomas Lovie tlovie at pokey.mine.nu
Fri Mar 9 18:31:49 PST 2001

Many of the commercial databases already effectively do this.  The database
engine knows what pages to read and write to disk, and it will cache
information in RAM up to the total resources it is allowed to use.  Initial
reads will be from disk, but subsequent ones will use the information in
RAM.  In addition, the OS will provide another level of cache, but it will
generally not be as good as the one inside the database.  I can't comment on
any of the low cost databases available for Linux, but commercial solutions
like Sybase and Oracle would definitely have this level of sophistication
built in.
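To make that concrete, here is a toy sketch of the caching behaviour in Python: an LRU page cache where the first read of a page goes to disk and repeat reads are served from RAM.  The capacity, page naming, and eviction policy are made up for illustration; real engines (and the OS page cache underneath them) are far more sophisticated.

```python
from collections import OrderedDict

class BufferPool:
    """Toy LRU page cache: first read hits 'disk', repeats come from RAM."""

    def __init__(self, capacity_pages, read_page_from_disk):
        self.capacity = capacity_pages
        self.read_page_from_disk = read_page_from_disk  # fallback for misses
        self.pages = OrderedDict()  # page_id -> bytes, kept in LRU order
        self.hits = self.misses = 0

    def get(self, page_id):
        if page_id in self.pages:
            self.hits += 1
            self.pages.move_to_end(page_id)   # mark as most recently used
            return self.pages[page_id]
        self.misses += 1
        data = self.read_page_from_disk(page_id)
        self.pages[page_id] = data
        if len(self.pages) > self.capacity:   # evict least recently used
            self.pages.popitem(last=False)
        return data

# Second read of the same page never touches "disk".
pool = BufferPool(capacity_pages=2,
                  read_page_from_disk=lambda pid: b"page-%d" % pid)
pool.get(1); pool.get(1)
print(pool.hits, pool.misses)  # -> 1 1
```

Once the working set is resident, the RAM disk buys you nothing extra; that is why the win depends so heavily on access pattern.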

As usual, the performance gains you would see depend on the application.  If
your database is a few large tables that can't all fit into RAM, you might
see significant gains just by making more resources available.  However, if
your database is many small tables that don't all fit into RAM either, but
your queries only operate on a subset of tables that does fit, then the gains
may not be as significant, since the engine's own cache already keeps that
working set in memory.

One other point: if your database has full transactional support, the
transaction log can become a bottleneck.  The log is a write-ahead log; that
is, the database writes what it intends to do all the way to disk, then
performs that operation on the database (usually in RAM), then writes to the
log that the operation succeeded, and finally syncs the database itself to
disk when it has free time.  So if your application has a *lot* of small
transactions, there may be a performance issue here.
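That write-ahead sequence looks something like the following minimal sketch; the record format and the placement of the syncs are my own illustration, not any vendor's actual layout.

```python
import os
import tempfile

def apply_transaction(log, db, key, value):
    """Write-ahead: record the intent durably, apply in RAM, record the commit.
    The in-memory db dict is synced to disk later, when the engine is idle."""
    log.write("INTENT %s=%s\n" % (key, value))
    log.flush()
    os.fsync(log.fileno())   # intent reaches disk before the data is touched
    db[key] = value          # the operation itself happens on the RAM copy
    log.write("COMMIT %s\n" % key)
    log.flush()
    os.fsync(log.fileno())   # second forced write per transaction

db = {}
with tempfile.NamedTemporaryFile("w+", suffix=".log", delete=False) as log:
    apply_transaction(log, db, "balance", "100")
print(db["balance"])  # -> 100
```

The two fsync calls per transaction are the point: with many small transactions the log disk is hit twice each time, and that stays a bottleneck no matter how much of the database sits in RAM.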

RAM is cheap, why don't you try it?

Tom Lovie.

-----Original Message-----
From: beowulf-admin at beowulf.org [mailto:beowulf-admin at beowulf.org]On
Behalf Of Eugene Leitl
Sent: Friday, March 09, 2001 10:51 AM
To: beowulf at beowulf.org
Subject: DBases in very large RAMDisks

In my current application, I have a purely static ~700 MBytes dbase,
indices and all. It appeared to me, that even without partitioning across
machines, this would fit into a 1 GByte machine's RAMDisk (much cheaper
and noticeably faster than a solid-state disk, I would imagine), and offer
much better response times without changing a single line of code. A single
machine could thus serve one or two orders of magnitude more queries, or
far more complex (and hence prohibitively expensive) queries.

I'm sure somebody here has experiences with such a setup, are there any
gotchas? What end-user speedup should one expect? And what further speedup is
typical if one bypasses the filesystem entirely and (the logical next step)
operates on data loaded directly into memory?

-- Eugene

Beowulf mailing list, Beowulf at beowulf.org
To change your subscription (digest mode or unsubscribe) visit
