[Beowulf] Clearing out scratch space

Matt Wallis mattw at madmonks.org
Wed Jun 13 01:45:14 PDT 2018



On 12/06/2018 6:06 PM, John Hearns via Beowulf wrote:
> My personal take is that hierarchical storage is the answer, 
> automatically pushing files to slower and cheaper tiers.

This is my preference as well: if manual intervention is required, it 
won't get done. You do need to tune it a fair bit, though, to ensure 
files are not being pushed between tiers inappropriately; in 
particular, you don't want data ending up on tape unless it can 
essentially be considered archived.
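
Just to make the tuning point concrete: the policy usually boils down 
to "anything not touched in N days gets pushed down a tier". A rough 
sketch of that shape in Python (the real work is normally done by the 
HSM's own policy engine; the paths and the 30-day cutoff below are 
made-up values you'd tune per site):

import os
import shutil
import time

FAST_TIER = "/scratch"          # assumed fast tier (made-up path)
SLOW_TIER = "/archive/scratch"  # assumed slower/cheaper tier (made-up path)
MAX_IDLE_DAYS = 30              # assumed policy knob, tune per site

cutoff = time.time() - MAX_IDLE_DAYS * 86400

for dirpath, _, filenames in os.walk(FAST_TIER):
    for name in filenames:
        src = os.path.join(dirpath, name)
        try:
            if os.lstat(src).st_atime < cutoff:   # last access older than cutoff
                dst = os.path.join(SLOW_TIER, os.path.relpath(src, FAST_TIER))
                os.makedirs(os.path.dirname(dst), exist_ok=True)
                shutil.move(src, dst)             # push the file down a tier
        except OSError:
            pass                                  # file vanished mid-scan, skip it

The tuning is mostly about picking that cutoff (and whether you key off 
atime or mtime) so that files in active use never get pushed down.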

> What do most sites do for scratch space?

I haven't used this one myself yet, but there's a lot of interest 
around BeeGFS and BeeOND.
BeeGFS is a parallel file system from the Fraunhofer Institute in 
Germany (it was originally FhGFS). Very fast, very simple, easy to manage.

BeeOND, or BeeGFS On Demand, allows you to create a temporary file 
system per job, typically backed by node-local SSD/NVMe devices. I 
believe this is being done on TSUBAME 3.0, and one of my customers in 
Queensland is running it as well.
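
For what it's worth, the mechanics are pretty simple: a job prolog 
starts a throwaway BeeGFS instance across the job's nodes and an epilog 
tears it down again. A minimal sketch of that in Python, wrapping the 
beeond helper script; the node file, paths and flags here are 
assumptions (flag names vary between releases, so check 'beeond --help' 
on your installation):

import subprocess

NODEFILE = "/tmp/job_nodes"     # assumed file listing the job's nodes
LOCAL_DATA = "/nvme/beeond"     # assumed node-local SSD/NVMe directory
MOUNTPOINT = "/mnt/beeond"      # assumed per-job mount point

def start_scratch():
    # spin up a per-job BeeGFS instance across the nodes in NODEFILE
    subprocess.run(["beeond", "start", "-n", NODEFILE,
                    "-d", LOCAL_DATA, "-c", MOUNTPOINT], check=True)

def stop_scratch():
    # unmount and clean up; the scratch space disappears with the job
    subprocess.run(["beeond", "stop", "-n", NODEFILE, "-L", "-d"],
                   check=True)

In practice you'd hang those off your scheduler's prolog/epilog (or 
burst-buffer) hooks rather than call them from the job script itself.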

BeeGFS by itself is a pretty interesting PFS, and BeeOND sounds great 
as a concept. My concern would be expectation management around yet 
another resource: users getting upset because the number of nodes in a 
job now also determines how much scratch space they can write to, and 
at what speed. Then add to that the need to stage data in and out of 
the space.

That said, my Queensland customer is absolutely stoked with the 
performance he's getting, and it does eliminate the whole question of 
cleaning up scratch space: when the job is over, the scratch space is gone.

I have another system based on BeeGFS coming online in the second half 
of the year that I can't talk about right now, but I will be looking 
for new adjectives for speed when it goes live.

Matt.
