[Beowulf] Mature open source hierarchical storage management
Craig.Tierney at noaa.gov
Sat Oct 24 08:00:38 PDT 2009
Carl Thomas wrote:
> Hi all,
> We are currently in the midst of planning a major refresh of our
> existing HPC cluster.
> It is expected that our storage will consist of a combination of fast
> fibre channel and SATA based disk and we would like to implement a
> system whereby user files are automatically migrated to and from slow
> storage depending on frequency of usage. Initial investigations seem
> to indicate that larger commercial hierarchical storage management
> systems vastly exceed our budget.
> Are there any mature open source alternatives out there? How are other
> organisations dealing with transparently presenting different tiers of
> storage to non-technical scientists?
Sun open-sourced SamFS last year:
I don't know what the state of the project is, but it is a place to start.
The way we do it at our NOAA site is to let the users migrate their data
to the HSM system manually. There are several reasons for this. With the
exception of GPFS, there are no filesystems that are really good for HPC
clusters and also allow for automatic migration. I wouldn't want to use
CXFS, StorNext,
as the HPC filesystem across dozens or hundreds of nodes.
There is another practical reason I wouldn't want to do it, even with
GPFS: I want to prevent users from doing stupid things. Having the HSM
try to archive a source code directory (not tarred) would be one of them.
I know that many HSM systems have policies for implementing containers or
controlling if/when files are archived, but for a general user community
like HPC typically has, I think it is better to educate them on the
proper way to archive files when the user decides data should be
archived. That also reduces unneeded archives (tapes do get expensive).
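
As a rough illustration of the manual approach, a user could bundle a
directory into a single tar file before copying it to the archive, so the
HSM sees one large file instead of thousands of small ones. This is only
a sketch, not what any particular site runs: the function name
bundle_for_archive and the paths in the usage note are hypothetical, and
the actual copy to the HSM is left out because it depends on the site.

```python
import tarfile
from pathlib import Path


def bundle_for_archive(src_dir, dest_dir):
    """Pack src_dir into one gzip-compressed tar file inside dest_dir.

    Hypothetical helper: returns the path of the tar file, which the user
    would then copy to the HSM by whatever means the site provides.
    """
    src = Path(src_dir).resolve()
    if not src.is_dir():
        raise NotADirectoryError(f"{src} is not a directory")

    dest = Path(dest_dir).resolve()
    # Refuse to write the tar file inside the tree being packed,
    # otherwise the archive would try to include itself.
    if dest == src or src in dest.parents:
        raise ValueError("dest_dir must be outside src_dir")
    dest.mkdir(parents=True, exist_ok=True)

    tar_path = dest / f"{src.name}.tar.gz"
    with tarfile.open(tar_path, "w:gz") as tar:
        # arcname keeps member paths relative to the directory name
        # instead of storing the absolute path.
        tar.add(src, arcname=src.name)
    return tar_path
```

A call such as bundle_for_archive("my_model_src", "/scratch/me/staging")
(both paths made up) would leave my_model_src.tar.gz in the staging
directory, ready to be moved to the archive by hand.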