[Beowulf] Clearing out scratch space

John Hearns hearnsj at googlemail.com
Tue Jun 12 01:06:06 PDT 2018


In the topic on avoiding fragmentation Chris Samuel wrote:

>Our trick in Slurm is to use the slurmdprolog script to set an XFS project
>quota for that job ID on the per-job directory (created by a plugin which
>also makes subdirectories there that it maps to /tmp and /var/tmp for the
>job) on the XFS partition used for local scratch on the node.

I had never thought of that, and it is a very neat thing to do.
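
For anyone who has not met XFS project quotas, here is a rough sketch of the two xfs_quota calls such a prolog might issue. It is not Chris's actual script: the function name, the paths, the 100g limit and the reuse of the job ID as the project ID are all my assumptions, and the filesystem would need to be mounted with project quotas enabled (prjquota).

```python
import shlex


def xfs_project_quota_cmds(job_id, job_dir, mount_point, hard_limit="100g"):
    """Return the xfs_quota invocations a prolog could run for one job.

    Assumption for this sketch: the job ID doubles as the XFS project ID,
    and job_dir contains no whitespace.
    """
    project_id = str(int(job_id))
    return [
        # Tag the per-job directory tree with the project ID.
        ["xfs_quota", "-x", "-c",
         f"project -s -p {job_dir} {project_id}", mount_point],
        # Put a hard block limit on that project.
        ["xfs_quota", "-x", "-c",
         f"limit -p bhard={hard_limit} {project_id}", mount_point],
    ]


if __name__ == "__main__":
    # Hypothetical job ID and paths; only prints the commands, runs nothing.
    for cmd in xfs_project_quota_cmds(12345, "/local/scratch/12345",
                                      "/local/scratch"):
        print(shlex.join(cmd))
```

A real prolog would pass each list to subprocess.run as root; the sketch only builds the commands so they can be inspected first.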
What I would like to discuss is the more general topic of clearing files
from 'fast' storage.
Many sites I have seen have dedicated fast/parallel storage which is
referred to as scratch space.
The intention is to use this scratch space for the duration of a project,
as it is expensive.
However, I have often seen scratch space used as permanent
storage, contrary to the intentions of whoever sized it, paid for it and
installed it.

I feel that the simplistic 'run a cron job and delete files older than N
days' approach is outdated.
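
For reference, the simplistic approach amounts to something like the sketch below. The function names and the choice to look at both mtime and atime are my assumptions, not any particular site's policy, and it defaults to a dry run.

```python
import os
import time


def stale_files(root, max_age_days, now=None):
    """Yield files under root not modified or accessed in max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except FileNotFoundError:
                # File vanished between listing and stat; skip it.
                continue
            # Treat a file as stale only if both timestamps are old.
            if max(st.st_mtime, st.st_atime) < cutoff:
                yield path


def purge(root, max_age_days, dry_run=True):
    """Remove stale files (or just list them when dry_run is True)."""
    removed = []
    for path in stale_files(root, max_age_days):
        if not dry_run:
            os.remove(path)
        removed.append(path)
    return removed
```

Calling purge("/scratch", 30) would only report candidates; passing dry_run=False actually deletes them. Note that it knows nothing about which project or job the files belong to, which is exactly the weakness.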

My personal take is that hierarchical storage is the answer, automatically
pushing files to slower and cheaper tiers.

But a thought struck me: in the Slurm prolog script, create a file called
THESE-FILES-WILL-SELF-DESTRUCT-IN-14-DAYS
Then run a cron job to decrement the figure 14 each day.
I guess that doesn't cope with running multiple jobs on the same data set,
but then again running a job marks that data as 'hot' and you reset the
timer to 14 days.
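
To make the countdown idea concrete, here is a minimal sketch. Everything beyond the marker file name is an assumption of mine: the helper names, one marker per data directory, and removing the whole directory when the counter reaches zero. reset_marker would be called from the prolog and tick from the daily cron job.

```python
import os
import re
import shutil

PREFIX = "THESE-FILES-WILL-SELF-DESTRUCT-IN-"
MARKER_RE = re.compile(r"^THESE-FILES-WILL-SELF-DESTRUCT-IN-(\d+)-DAYS$")


def find_marker(directory):
    """Return (marker_name, days_left) or None if the directory is unmarked."""
    for name in os.listdir(directory):
        match = MARKER_RE.match(name)
        if match:
            return name, int(match.group(1))
    return None


def reset_marker(directory, days=14):
    """Prolog side: create the marker, or reset an existing one to `days`."""
    target = os.path.join(directory, f"{PREFIX}{days}-DAYS")
    found = find_marker(directory)
    if found:
        os.rename(os.path.join(directory, found[0]), target)
    else:
        open(target, "w").close()


def tick(directory):
    """Cron side: decrement the marker by one day.

    Returns the days remaining, 0 if the directory was removed, or None
    if there was no marker (unmarked directories are left alone).
    """
    found = find_marker(directory)
    if found is None:
        return None
    name, days = found
    remaining = days - 1
    if remaining <= 0:
        shutil.rmtree(directory)
        return 0
    os.rename(os.path.join(directory, name),
              os.path.join(directory, f"{PREFIX}{remaining}-DAYS"))
    return remaining
```

Because every job start resets the marker, several jobs sharing one data set simply keep it alive; the data only goes once nothing has touched it for the full 14 days.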

What do most sites do for scratch space?