[Beowulf] Small files

Reuti reuti at staff.uni-marburg.de
Thu Jun 12 03:43:53 PDT 2014


Hi,

On 11.06.2014 at 21:03, Tom Harvill wrote:

> This is my first time posting to this list, thanks in advance for any time you spend
> replying.
> 
> We've found that a large majority of our files (~40 million of ~50 million) are smaller than 10KB.
> We believe our filesystem (Lustre) is bottlenecked on IOPS and locking related to
> jobs running against these files.  We have ~700TB of usable storage with ~500TB consumed;
> almost all of that consumption is by a relatively small number of very large files.

What kind of data is in these 10KB files: binary or ASCII? Would it work to put it into a database instead of keeping all these individual files? And how do you access the files: by some kind of index, by name, by directory...?
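
To illustrate the database idea, here is a minimal sketch in Python using the standard sqlite3 module. It is only an assumption about how such a setup could look, not something tested on Lustre: the table name `files` and the helpers `pack_directory` and `read_file` are made up for this example. Each small file becomes one row, keyed by its path relative to the directory it came from, so the filesystem sees a single large file instead of many tiny ones.

```python
import sqlite3
from pathlib import Path


def pack_directory(db_path, root):
    """Store every regular file under `root` as one row: relative path -> bytes."""
    root = Path(root)
    con = sqlite3.connect(db_path)
    with con:  # a single transaction for the whole import
        con.execute(
            "CREATE TABLE IF NOT EXISTS files "
            "(name TEXT PRIMARY KEY, data BLOB NOT NULL)"
        )
        rows = (
            (p.relative_to(root).as_posix(), p.read_bytes())
            for p in sorted(root.rglob("*"))
            if p.is_file()
        )
        con.executemany(
            "INSERT OR REPLACE INTO files (name, data) VALUES (?, ?)", rows
        )
    con.close()


def read_file(db_path, name):
    """Return the bytes stored under `name`, or None if there is no such row."""
    con = sqlite3.connect(db_path)
    try:
        row = con.execute(
            "SELECT data FROM files WHERE name = ?", (name,)
        ).fetchone()
    finally:
        con.close()
    return row[0] if row else None
```

Something like this probably fits read-mostly data best. SQLite relies on file locking, which is known to be unreliable on some network filesystems, so many jobs writing to the same database file at once would need a different design (for example one database per job, or a database server).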

-- Reuti


> I want to ask this general question: how does your shop deal with the general problem of
> small files in filesystems on (beowulf) compute clusters? Specifically, files that users expect
> to actively use for read and write operations for their research.
> 
> Do you distinguish and segregate them (and/or the people that use them) on special
> hardware/filesystems?
> 
> Thanks!
> Tom
> 
> Tom Harvill
> Holland Computing Center
> University of Nebraska



