bad hardlinks with rsync

John Van Essen vanes002 at umn.edu
Fri Sep 19 21:43:19 EST 2003


On Thu, 18 Sep 2003, Chris Tarnutzer <tarnutzer at ethlife.ethz.ch> wrote:

> Hi List
> 
> I've run into some problems with rsync. I'm backing up a complete
> machine's root directory. After completion I see in the output log
> that rsync links some files which are surely *not* the same on the
> source system. Or rather, it says that it makes links, using the
> filename1 => filename2 notation. On the source system these files
> are not the same, and some files are missing on the target system
> after the sync. Well, the log says "partial transfer". But why the
> link sign, and why does rsync think that these files are the same?
...
> stats of the files with "=>" on the source host
> <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
> ls -li var/cache/httpd-accel/a/f4/4d0d0d7f891a691ad5c739448f37f
>  898149 -rw-r-----    1 www-data www-data     4094 17. Sep 14:37
> var/cache/httpd-accel/a/f4/4d0d0d7f891a691ad5c739448f37f
> ls -li etc/sv/qmail-smtp/log/supervise/status
>  890631 -rw-r--r--    1 root     root           18 17. Sep 14:26
> etc/sv/qmail-smtp/log/supervise/status
> 
> ls -li var/cache/httpd-accel/e/b8/fc770c62d513c4032b02da7b526c2
>  681656 -rw-r-----    1 www-data www-data     6068 17. Sep 15:41
> var/cache/httpd-accel/e/b8/fc770c62d513c4032b02da7b526c2
> ls -li etc/sv/qmail/log/supervise/status
>  888784 -rw-r--r--    1 root     root           18 17. Sep 15:33
> etc/sv/qmail/log/supervise/status
> 
> ls -li var/www/ethlife/articles/news/.xmlstyle_cache/a5/ba/f217e3bdffeccb7d16c0e277fabf.type
>  886182 -rw-rw-r--    1 elcms_ht elcms          29 17. Sep 15:40 
> var/www/ethlife/articles/news/.xmlstyle_cache/a5/ba/f217e3bdffeccb7d16c0e277fabf.type
> ls -li var/cache/httpd-accel/0/0f/1ecf467cbcf662efc18681c7f6fe7
>  701948 -rw-r-----    1 www-data www-data    18649 17. Sep 15:40
> var/cache/httpd-accel/0/0f/1ecf467cbcf662efc18681c7f6fe7
> <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

The timestamps in each pair are recent and pretty close together, which makes
me think the following is happening for a given pair of files, FileA and FileB:

- rsync begins building the filelist on the source and adds FileA
- FileA is a dynamic file which gets deleted (and possibly replaced later),
  so its inode goes back into the free inode pool.
- FileB is created, using that recycled inode.
- rsync slogs along, encounters FileB, and adds it to its filelist.

Now FileA and FileB have the same inode number in rsync's filelist.  FileA may
no longer even exist (that's a risk you take when rsyncing live systems).

Since inode numbers are what's used to figure out hardlinks, FileA and
FileB end up hardlinked together on the target system.
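
To make the sequence concrete, here is a small Python sketch.  It is not
rsync's actual code: the Entry record, the group_by_inode function, the paths,
and the device/inode numbers are all made up for illustration.  It only
assumes that hard links are detected by matching the device and inode numbers
recorded when each file was scanned.

```python
from collections import defaultdict
from typing import NamedTuple


class Entry(NamedTuple):
    """One file-list record: a path plus the (device, inode) pair seen at scan time."""
    path: str
    dev: int
    ino: int


def group_by_inode(filelist):
    """Return groups of paths that share a (dev, ino) key, i.e. apparent hard links."""
    groups = defaultdict(list)
    for entry in filelist:
        groups[(entry.dev, entry.ino)].append(entry.path)
    return [paths for paths in groups.values() if len(paths) > 1]


# Simulated scan of a live system (all values are hypothetical).
filelist = [
    # Scanned first, then deleted on the source while the scan continues.
    Entry("cache/FileA", dev=2049, ino=898149),
    Entry("etc/unrelated", dev=2049, ino=123456),
    # Created after FileA was deleted; the filesystem reused the freed inode.
    Entry("supervise/FileB", dev=2049, ino=898149),
]

false_links = group_by_inode(filelist)
print(false_links)  # [['cache/FileA', 'supervise/FileB']]
```

The two files were never linked at any single moment, but because the scan
recorded them at different times, the list alone cannot tell them apart from
a genuine hard-linked pair.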

I have no suggestion for avoiding this pitfall while still using hard links.
-- 
        John Van Essen  Univ of MN Alumnus  <vanes002 at umn.edu>
        3DGamers  Systems Software Support  <jve at 3dgamers.com>



