rsync -H option yields corrupt replicas (due to non-unique inode ids)

Kevin Korb kmk at sanitarium.net
Thu Sep 5 19:21:44 CEST 2013


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Rsync identifies hard links by inode number.  That is the only way
to determine that two paths actually refer to the same file.
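
To illustrate what "via inode numbers" means in practice, here is a
minimal Python sketch of the general idea.  It is not rsync's actual
code; the function name group_hard_links is made up for this example,
and pairing the inode number with the device number (st_dev) is an
assumption of the sketch.  It walks a tree and groups regular files
that report the same identity:

```python
import os
import stat
from collections import defaultdict


def group_hard_links(root):
    """Group regular files under root by (st_dev, st_ino).

    Hypothetical helper for illustration only. Paths that share a key are
    treated as hard links to one file, which is only correct if the
    filesystem reports unique inode numbers.
    """
    groups = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # lstat: do not follow symlinks
            if stat.S_ISREG(st.st_mode):
                groups[(st.st_dev, st.st_ino)].append(path)
    # Keep only identities that are reachable through more than one path.
    return {key: sorted(paths) for key, paths in groups.items() if len(paths) > 1}
```

If the server hands out the same inode number for two unrelated files,
a grouping like this puts them in one group, and there is nothing
else in the stat data that proves they are different files.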

On 09/05/13 12:08, Andrew J. Romero wrote:
> Hi,
> 
> Our organization hosts a specialized Linux distribution.
> 
> As is typical with Linux distributions, the set of files that make
> up our Linux distro contains a very complex web of self-referential
> hard links.
> 
> Several other sites use our Linux distro and maintain either
> partial or full internal mirror copies of it.
> 
> The standard method used by Linux mirror sites to pull/replicate a
> subset of a Linux distribution (or a complete Linux distribution)
> from a master repository is to use rsync with options that produce
> the following behavior:
> 
> the first time a unique file is encountered, its content is
> replicated; however, when subsequent hard links to the file are
> detected, only the hard links are replicated.
> 
> The primary copy of our Linux distro is stored on our BlueArc Titan
> NAS (NFS server). Relative to the mirror-sites, our rsync server
> "sits in front of" the NAS.
> 
> Internally the BlueArc Titan has a unique object id for files;
> however, the inode ID presented to clients by the BlueArc Titan is
> not unique. As a result, rsync (with the -H option) erroneously
> identifies unique files as hard links to different files, leaving
> mirror repositories essentially corrupt and unusable.
> 
> It is my understanding that the NFS v3 spec does not require NFS
> servers to present unique inode ids to clients. I believe the
> reasoning is that large-scale NAS appliances need to use very wide
> object ids internally but must present (when asked) inode ids
> that any client can deal with.
> 
> Are there options to rsync that will allow me to reliably replicate
> my hard-link-rich Linux distro from my NAS?
> 
> Thanks
> 
> Andy
> 
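
One way to see how widespread the duplicate inode ids are on the
source side is to look for groups of paths that share an inode number
but cannot really be hard links to each other.  The Python sketch
below is only a rough diagnostic under stated assumptions, not a fix:
the names scan and suspect_groups are made up for this example, and
the two heuristics (differing sizes, or more paths than the reported
link count allows) will miss collisions between files that happen to
have the same size and a large enough link count.  Files that change
during the scan can also cause false reports.

```python
import os
import stat
from collections import defaultdict


def scan(root):
    """Yield (path, dev, ino, nlink, size) for regular files under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if stat.S_ISREG(st.st_mode):
                yield path, st.st_dev, st.st_ino, st.st_nlink, st.st_size


def suspect_groups(records):
    """Return {(dev, ino): [paths]} for groups that look like false hard links.

    A group is suspect when its members report different sizes, or when
    more paths share the inode than the smallest reported link count allows.
    """
    groups = defaultdict(list)
    for path, dev, ino, nlink, size in records:
        groups[(dev, ino)].append((path, nlink, size))

    suspects = {}
    for key, members in groups.items():
        if len(members) < 2:
            continue
        sizes = {size for _path, _nlink, size in members}
        min_nlink = min(nlink for _path, nlink, _size in members)
        if len(sizes) > 1 or len(members) > min_nlink:
            suspects[key] = sorted(path for path, _nlink, _size in members)
    return suspects
```

Running suspect_groups(scan(top)) on the NFS mount, where top is the
top of the distro tree, should return an empty dict on a filesystem
with unique inode numbers.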

- -- 
~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~
	Kevin Korb			Phone:    (407) 252-6853
	Systems Administrator		Internet:
	FutureQuest, Inc.		Kevin at FutureQuest.net  (work)
	Orlando, Florida		kmk at sanitarium.net (personal)
	Web page:			http://www.sanitarium.net/
	PGP public key available on web site.
~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.20 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iEYEARECAAYFAlIovagACgkQVKC1jlbQAQfHdwCeLTR/n+bzzDauqxLmpKz61pkR
3+YAoM+UAsCG4RhcbVXeY0hSQ4BZzmm+
=1vPo
-----END PGP SIGNATURE-----

