[Beowulf] Putting /home on Lustre or GPFS

Prentice Bisbal prentice.bisbal at rutgers.edu
Wed Dec 24 07:48:19 PST 2014


Ryan,

Thanks for that tidbit. I never thought of that.
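
For anyone who wants to check their own mounts against the locking issue
Ryan mentions below, here is a rough sketch of a test (Python; untested as
written, and the path is just a placeholder). On Lustre, flock() support
generally depends on the client mount options (-o flock or -o localflock),
so a failure here usually points at the mount rather than the application:

#!/usr/bin/env python3
# Rough check: does this mount accept both flock()- and fcntl()-style locks?
# The default path below is only a placeholder -- point it at the filesystem
# you actually care about.
import fcntl
import sys

path = sys.argv[1] if len(sys.argv) > 1 else "/mnt/lustre/locktest.tmp"

try:
    with open(path, "w") as f:
        fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)   # BSD-style flock()
        fcntl.flock(f, fcntl.LOCK_UN)
        fcntl.lockf(f, fcntl.LOCK_EX | fcntl.LOCK_NB)   # POSIX fcntl()-style lock
        fcntl.lockf(f, fcntl.LOCK_UN)
    print("flock() and fcntl() locks both succeeded on %s" % path)
except OSError as exc:
    # On Lustre clients mounted without flock support, the flock() call is
    # typically the one that fails.
    print("locking failed on %s: %s" % (path, exc))
    sys.exit(1)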

On 12/23/2014 09:14 PM, Novosielski, Ryan wrote:
> I run an old Lustre (1.8.9), and it doesn't support some forms of file 
> locking, which have even been required for compiling some software. It 
> doesn't happen often, but often enough to give me pause.
>
> ____ *Note: UMDNJ is now Rutgers-Biomedical and Health Sciences*
> || \\UTGERS      |---------------------*O*---------------------
> ||_// Biomedical | Ryan Novosielski - Senior Technologist
> || \\ and Health | novosirj at rutgers.edu - 973/972.0922 (2x0922)
> ||  \\  Sciences | OIRT/High Perf & Res Comp - MSB C630, Newark
>     `'
>
> On Dec 23, 2014, at 12:11, Prentice Bisbal 
> <prentice.bisbal at rutgers.edu> wrote:
>
>> Beowulfers,
>>
>> I have limited experience managing parallel filesystems like GPFS or
>> Lustre. I was discussing putting /home and /usr/local for my cluster on
>> a GPFS or Lustre filesystem, in addition to using it for /scratch.
>> I've never done this before, but it doesn't seem like all that bad an
>> idea. My reasoning is the following:
>>
>> 1. Users often try to run programs from /home, which leads to errors,
>> no matter how many times I tell them not to do that. Putting /home on the
>> parallel filesystem would make the system more user-friendly, and I could
>> use quotas/policies to steer users toward other filesystems if needed
>> (a rough quota sketch follows this list).
>>
>> 2. Having one storage system to manage is much better than having three.
>>
>> 3. Profit?
>>
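
(Expanding on the quota idea in point 1 above: what I have in mind is a
small per-user block quota on /home so that bulk data has to live on
/scratch. Below is a rough, untested sketch that assumes a Lustre-backed
/home and the stock 'lfs setquota' tool; GPFS has its own mm* quota
commands for the same job. The mount point, limits, and usernames are
placeholders.)

#!/usr/bin/env python3
# Rough sketch: cap per-user block usage on /home so bulk data lands on
# /scratch instead. Assumes a Lustre-backed /home and the standard
# 'lfs setquota' tool; mount point, limits, and usernames are placeholders.
import subprocess

HOME_FS = "/home"              # Lustre mount point (assumption)
SOFT_KB = 20 * 1024 * 1024     # ~20 GB soft limit, in KB as lfs expects
HARD_KB = 25 * 1024 * 1024     # ~25 GB hard limit

def set_home_quota(user):
    # Equivalent to: lfs setquota -u USER -b SOFT_KB -B HARD_KB /home
    subprocess.check_call([
        "lfs", "setquota", "-u", user,
        "-b", str(SOFT_KB), "-B", str(HARD_KB),
        HOME_FS,
    ])

for user in ("alice", "bob"):  # placeholder usernames
    set_home_quota(user)
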
>> Anyway, another person in the conversation felt that this would be bad,
>> because if someone were running a job that hammered the filesystem, it
>> would make the filesystem unresponsive and keep other people from
>> logging in and doing work. I'm not buying this concern, for the following
>> reason:
>>
>> If a job can hammer your parallel filesystem so hard that the login nodes
>> become unresponsive, you've got bigger problems: that means other jobs
>> can't run on the cluster, and the job hitting the filesystem hard has
>> probably slowed to a crawl, too.
>>
>> I know there are some concerns with the stability of parallel
>> filesystems, so if someone wants to comment on the dangers of that, too,
>> I'm all ears. I think that the relative instability of parallel
>> filesystems compared to NFS would be the biggest concern, not 
>> performance.
>>
>> -- 
>> Prentice Bisbal
>> Manager of Information Technology
>> Rutgers Discovery Informatics Institute (RDI2)
>> Rutgers University
>> http://rdi2.rutgers.edu
>>
>> _______________________________________________
>> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
>> To change your subscription (digest mode or unsubscribe) visit 
>> http://www.beowulf.org/mailman/listinfo/beowulf
