Stale pid file problem, and a proposed solution

Joseph C. Sible josephcsible at gmail.com
Mon Jan 20 16:38:43 UTC 2020


Today, rsyncd manages its pid file by open()ing it with O_CREAT|O_EXCL
at startup, and then unlink()ing it at shutdown. If the open() fails
at startup because the file already exists, then rsyncd will assume
another instance of itself is already running and not start.
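
For reference, the current behavior amounts to something like the
following. This is a minimal sketch, not rsync's actual code; the
function name create_pid_file_excl and the 0644 mode are made up for
illustration.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper (not taken from the rsync source): create the
 * pid file exclusively and write our PID into it.
 * Returns 0 on success, -1 on failure. If the file already exists,
 * open() fails and errno is EEXIST. */
int create_pid_file_excl(const char *path)
{
    char buf[32];
    int len, fd;

    assert(path != NULL);

    /* O_EXCL makes open() fail if the file exists, whether or not
     * the process that created it is still alive. */
    fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fd < 0)
        return -1;

    len = snprintf(buf, sizeof buf, "%ld\n", (long)getpid());
    if (write(fd, buf, len) != len) {
        close(fd);
        unlink(path);
        return -1;
    }
    close(fd);
    return 0;
}
```

Nothing in the EEXIST case distinguishes a live daemon from a file
left behind by a dead one.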

However, there's a problem with this approach: if rsyncd is terminated
without being able to clean up (e.g., kill -9, or the server losing
power), then the stale pid file prevents rsyncd from restarting until
an administrator removes the file by hand.

I propose a solution to this problem: open the file without O_EXCL,
then try to take an exclusive lock on the whole file (we already use
file locks to limit max connections, so this change wouldn't add any
new requirements to rsyncd). If we can't get the lock, then another
instance is running, so abort. If we can, then truncate the file and
write our PID into it. Since the kernel releases a lock when the
process that took it exits, however it exits, a leftover pid file can
no longer block startup, which fixes the stale pid file problem.
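
As a rough illustration of the idea, here is a minimal sketch using
fcntl() record locks. The function name acquire_pid_file, the 0644
mode, and the choice of fcntl() over other locking calls are my
assumptions for the example, not a description of what a patch would
have to look like.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: open the pid file, take an exclusive lock on
 * the whole file, then truncate it and write our PID.
 * Returns the open, locked descriptor on success; the caller keeps it
 * open for the life of the daemon. Returns -1 on failure, including
 * the case where another process holds the lock. */
int acquire_pid_file(const char *path)
{
    struct flock lock;
    char buf[32];
    int len, fd;

    assert(path != NULL);

    /* No O_EXCL: an existing (possibly stale) file is fine. */
    fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return -1;

    memset(&lock, 0, sizeof lock);
    lock.l_type = F_WRLCK;    /* exclusive (write) lock */
    lock.l_whence = SEEK_SET;
    lock.l_start = 0;
    lock.l_len = 0;           /* 0 means "to end of file" */

    /* F_SETLK does not block; it fails if another process holds a
     * conflicting lock. */
    if (fcntl(fd, F_SETLK, &lock) < 0) {
        close(fd);
        return -1;
    }

    /* We own the lock, so any old contents are stale. */
    len = snprintf(buf, sizeof buf, "%ld\n", (long)getpid());
    if (ftruncate(fd, 0) < 0 || write(fd, buf, len) != len) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Two caveats with fcntl() locks in particular: they are not inherited
across fork(), so the lock would have to be taken after the daemon
has detached, and closing any descriptor for the file releases the
lock, so the returned descriptor must stay open until shutdown.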

Does this seem like a reasonable idea? If so, I'll write and submit a
patch that implements it.

Joseph C. Sible


