Stale pid file problem, and a proposed solution

raf rsync at raf.org
Mon Jan 20 22:55:11 UTC 2020


Joseph C. Sible via rsync wrote:

> Today, rsyncd manages its pid file by open()ing it with O_CREAT|O_EXCL
> at startup, and then unlink()ing it at shutdown. If the open() fails
> at startup because the file already exists, then rsyncd will assume
> another instance of itself is already running and not start.
> 
> However, there's a problem with this approach: if rsyncd is terminated
> without being able to clean up (e.g., kill -9, or the server losing
> power), then the stale pid file will prevent rsyncd from ever
> restarting until an administrator manually intervenes.
> 
> I propose a solution to this problem: open the file without O_EXCL,
> then try to take an exclusive lock on the whole file (we already use
> file locks to limit max connections, so this change wouldn't add any
> new requirements to rsyncd). If we can't get the lock, then abort, and
> if we can, then truncate the file and write our PID into it. Since
> locks never outlive the process that took them, this fixes the stale
> pid file problem.
> 
> Does this seem like a reasonable idea? If so, I'll write and submit a
> patch that implements it.
> 
> Joseph C. Sible

I think that's very sensible. It's what my daemon program
(libslack.org/daemon) does to ensure that only a single instance
of a daemon runs. It probably means that the pidfile shouldn't be
on an NFS-mounted file system, since file locking over NFS isn't
always reliable, but hopefully that won't be a problem for anyone.
Or could it be?

cheers,
raf



