Stale pid file problem, and a proposed solution
rsync at raf.org
Mon Jan 20 22:55:11 UTC 2020
Joseph C. Sible via rsync wrote:
> Today, rsyncd manages its pid file by open()ing it with O_CREAT|O_EXCL
> at startup, and then unlink()ing it at shutdown. If the open() fails
> at startup because the file already exists, then rsyncd will assume
> another instance of itself is already running and not start.
> However, there's a problem with this approach: if rsyncd is terminated
> without being able to clean up (e.g., kill -9, or the server losing
> power), then the stale pid file will prevent rsyncd from ever
> restarting until an administrator manually intervenes.
> I propose a solution to this problem: open the file without O_EXCL,
> then try to take an exclusive lock on the whole file (we already use
> file locks to limit max connections, so this change wouldn't add any
> new requirements to rsyncd). If we can't get the lock, then abort, and
> if we can, then truncate the file and write our PID into it. Since
> locks never outlive the process that took them, this fixes the stale
> pid file problem.
> Does this seem like a reasonable idea? If so, I'll write and submit a
> patch that implements it.
> Joseph C. Sible
I think that's very sensible. It's what my daemon program does
(libslack.org/daemon) to ensure a single instance of a daemon.
It probably means that the pidfile shouldn't be on an NFS-mounted
file system, since file locking over NFS can be unreliable, but
hopefully that won't be a problem for anyone. Or could it be?
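
For reference, here is a minimal sketch of the locking scheme proposed
above. It is not rsync's code or my daemon's code, just an illustration.
The function name acquire_pidfile() is made up, and I'm assuming POSIX
fcntl() record locks are the kind of lock in question:

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of rsync): open the pid file without
 * O_EXCL, take an exclusive lock on the whole file, then truncate it
 * and write our PID. Returns the locked descriptor on success, or -1
 * if another process holds the lock or any step fails. The caller
 * must keep the descriptor open for the lifetime of the daemon.
 */
int acquire_pidfile(const char *path)
{
    struct flock fl;
    char buf[32];
    int fd, len;

    /* No O_EXCL: a stale file left by a crashed daemon is not an error. */
    fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return -1;

    memset(&fl, 0, sizeof fl);
    fl.l_type = F_WRLCK;    /* exclusive (write) lock */
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;           /* 0 means "through end of file" */

    /* F_SETLK does not block: it fails if another process holds the lock. */
    if (fcntl(fd, F_SETLK, &fl) < 0) {
        close(fd);
        return -1;
    }

    /* We own the lock, so it is safe to replace the old contents. */
    len = snprintf(buf, sizeof buf, "%ld\n", (long)getpid());
    if (ftruncate(fd, 0) < 0 || write(fd, buf, (size_t)len) != len) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Two caveats if fcntl() locks are used: they are not inherited across
fork(), so the lock has to be taken after the daemon has gone into the
background, and the process drops the lock as soon as it closes any
descriptor that refers to the pid file.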