[Beowulf] typical protocol for cleanup of /tmp: on reboot? cron job? tmpfs?
reuti at staff.uni-marburg.de
Fri Aug 20 14:40:35 PDT 2010
On 20.08.2010 at 22:40, Rahul Nabar wrote:
> What's the typical protocol for the cleanup of /tmp folders? Do
> people clean them on each reboot or at intervals with a cron job (sounds
> like a bad idea)? I was always under the impression that a reboot cleans
> them, but apparently not on my CentOS distro, by default.
> I was burnt earlier today when ompi-ps acted erratically and I
> diagnosed it to be caused by stale state information in the /tmp
> folder. The remnant of some old dead jobs that had somehow crashed.
> One other option that I've seen mentioned is mounting /tmp on a tmpfs.
> Is that a good idea? The risk of using up too much RAM if a program
> gets out of hand writing to /tmp.
> On the other hand compute-nodes can go a long time without any
> reboots; so a more frequent cleanup cycle on /tmp might be desirable?
> I suppose most programs ought to clean up after themselves in /tmp, but
> then again there are bound to be bad apples.
Are you using a queuing system? I try to set up all applications so that they write all their files to $TMPDIR. In [OS]GE, and I think for some time now also in Torque, this directory is created automatically (as a job-specific directory on the node) and removed after the job.
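
The $TMPDIR convention described above can be sketched from the application side as follows. This is only a minimal illustration, assuming the queuing system exports TMPDIR for the job; the helper name `job_scratch` and the `job-` prefix are hypothetical, and the fallback to /tmp is just for runs outside a job.

```python
import os
import shutil
import tempfile
from contextlib import contextmanager


@contextmanager
def job_scratch(prefix="job-"):
    """Yield a private scratch directory under $TMPDIR and remove it afterwards."""
    # Assumption: the queuing system exports TMPDIR for the job.
    # Outside a job we fall back to /tmp.
    base = os.environ.get("TMPDIR", "/tmp")
    path = tempfile.mkdtemp(prefix=prefix, dir=base)
    try:
        yield path
    finally:
        # Remove the directory even if the job body raises,
        # so no stale state is left behind for later jobs.
        shutil.rmtree(path, ignore_errors=True)


if __name__ == "__main__":
    with job_scratch() as scratch:
        with open(os.path.join(scratch, "state.dat"), "w") as fh:
            fh.write("intermediate results\n")
        print("working in", scratch)
```

If the scheduler already removes $TMPDIR after the job, the cleanup here is redundant but harmless; it mainly helps when the same program is run by hand.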
A load sensor that checks the free space in /scratch on a node, and puts the queue instance into alarm state when it falls below a certain value, can additionally prevent a black hole in the cluster, where one job after another crashes for lack of scratch space.
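
A rough sketch of such a load sensor, assuming the Grid Engine convention in which the sensor waits for a line on stdin, answers with a report framed by `begin` and `end`, and exits on `quit`. The complex name `scratch_free` is hypothetical and would have to be defined by the administrator; reporting whole megabytes as a plain integer is also an assumption.

```python
import shutil
import socket
import sys

SCRATCH = "/scratch"       # path named in the post
COMPLEX = "scratch_free"   # hypothetical complex name, defined by the admin


def free_megabytes(path):
    """Return the free space on the filesystem holding `path`, in whole MB."""
    return shutil.disk_usage(path).free // (1024 * 1024)


def format_report(host, complex_name, value):
    """Build one load report: begin, host:name:value, end."""
    return "begin\n{}:{}:{}\nend\n".format(host, complex_name, value)


def main():
    # Assumption: the hostname reported here matches the name the
    # queuing system uses for this node.
    host = socket.gethostname()
    # Assumption: the execution daemon sends one line per load interval
    # and the word "quit" when the sensor should terminate.
    for line in sys.stdin:
        if line.strip() == "quit":
            break
        sys.stdout.write(format_report(host, COMPLEX, free_megabytes(SCRATCH)))
        sys.stdout.flush()


if __name__ == "__main__":
    main()
```

The sensor only reports a number; the value below which the queue instance goes into alarm state would be set in the queue configuration, not in the script.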
> Any comments?