rsync and many files

Kevin Korb kmk at sanitarium.net
Mon Jun 6 20:51:46 MDT 2011


A lot of this has to do with the filesystems and operating systems involved.

Since you didn't specify, I will guess Linux with ext3.  If that is the
case, run, don't walk, to ext4.  Also, mount the filesystems with the
noatime and nodiratime options.  This prevents reads, including the
directory reads rsync does while scanning, from also writing access-time
updates to the filesystem, which can be a huge performance benefit.
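
To see which filesystems would benefit, something like the following
sketch may help.  It assumes a Linux box where /proc/mounts exists; the
function names has_noatime and check_mounts are made up for this
example, and /backup is only a placeholder mount point.

```shell
#!/bin/sh
# has_noatime OPTS: succeed if the comma-separated mount option string
# contains noatime.  (Hypothetical helper name, not part of any tool.)
has_noatime() {
    case ",$1," in
        *,noatime,*) return 0 ;;
        *) return 1 ;;
    esac
}

# check_mounts [FILE]: list ext3/ext4 mounts that still update access times.
# FILE defaults to /proc/mounts, whose fields are:
#   device mountpoint fstype options dump pass
check_mounts() {
    while read -r dev mnt fstype opts rest; do
        case "$fstype" in
            ext3|ext4)
                has_noatime "$opts" || echo "$mnt ($fstype) lacks noatime: $opts"
                ;;
        esac
    done < "${1:-/proc/mounts}"
}

# To change a live filesystem (as root; /backup is a placeholder):
#   mount -o remount,noatime,nodiratime /backup
# To make it permanent, add the options to the fourth field in /etc/fstab.
```

Running check_mounts with no argument on both the source and the
destination server should show whether either side is still paying for
atime updates.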

On 06/06/11 22:48, Greg Siekas wrote:
> 40 files a second seems very slow.  Are you sure the majority of the time is spent generating the file list and determining what's changed?  How many of the millions of files have changed?
> 
> On modern hardware I see thousands of files per second when scanning for changed files.
> 
> On Jun 6, 2011, at 12:39 PM, Steven Levine <steve53 at earthlink.net> wrote:
> 
>> In <F992406D6E81B54DBB33210217FE7AFD07F80C84 at exchange1.mtb.netclusive.de>,
>> on 06/06/11
>>   at 12:04 PM, Cliff Simon <cliff.simon at netclusive.com> said:
>>
>> Hi,
>>
>>> We are using rsync via rsnapshot, but that is not essential here. It is
>>> used to back up many (more than 100) servers and works very well. Now
>>> there is one server with many (several million) files. The files are not
>>> very big, so the complete backup is about 500 GB.
>>
>>> Now my problem is that the backup needs about 14 hours; most of the time
>>> is spent generating the file list and checking whether the files are new
>>> or changed.
>>
>>> My rsync-command is:
>>> /usr/bin/rsync -a --bwlimit=9000 --delete --numeric-ids --relative
>>> --delete-excluded --exclude=/some/pathes/ --rsh=/usr/bin/ssh
>>> --link-dest=/dest.path/daily.1/ root@192.x.x.x:/path.to.backup/
>>
>>> Do you have any ideas for reducing the backup time?
>>
>> A bit of math says 2*10^6 / 14 hours is about 40 files/second.  How fast
>> do you think rsync should be and how does this compare to backups on your
>> other servers?
>>
>> Are you sure it is not the hardware that is limiting rsync's
>> performance?
>>
>> Based on my knowledge of the rsync sources, I believe the file list
>> generation algorithms are pretty efficient.  There is quite a bit of code
>> in the code path, but it's hard to avoid this given the number of options
>> available to control the sync process.
>>
>> Steven
>>
>> -- 
>> ----------------------------------------------------------------------
>> "Steven Levine" <steve53 at earthlink.net>  eCS/Warp/DIY etc.
>> www.scoug.com www.ecomstation.com
>> ----------------------------------------------------------------------
>>
>> -- 
>> Please use reply-all for most replies to avoid omitting the mailing list.
>> To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
>> Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
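
The 40 files/second estimate quoted above is easy to reproduce, and a
similar calculation can give a rough idea of how fast the disks can walk
a tree on their own.  This is only a sketch: files_per_second and
scan_rate are names invented for this example, the 2 million file count
is the assumption used in the quoted estimate, and a plain find is
lighter than what rsync does per file, so treat the result as an upper
bound on scan speed rather than a prediction.

```shell
#!/bin/sh
# files_per_second COUNT SECONDS: integer throughput.
# (Hypothetical helper name.)  Sub-second runs are treated as one second.
files_per_second() {
    if [ "$2" -le 0 ]; then
        echo "$1"
    else
        echo $(( $1 / $2 ))
    fi
}

# The estimate from the thread: 2 million files (assumed) in 14 hours.
# Integer division of 2000000 by 50400 prints 39, i.e. about 40 files/second.
files_per_second 2000000 $(( 14 * 3600 ))

# scan_rate DIR: time a walk of DIR that only lists regular files and
# print files per second.  Uses whole-second timestamps, so it is only
# meaningful on trees large enough to take several seconds.
scan_rate() {
    start=$(date +%s)
    count=$(( $(find "$1" -type f | wc -l) ))
    end=$(date +%s)
    files_per_second "$count" $(( end - start ))
}
```

Calling scan_rate on the directory being backed up, on the server
itself, gives a number to compare against the 40 files/second figure.
If the bare walk is already that slow, the filesystem or hardware is the
likely bottleneck rather than rsync.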

-- 
~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~
	Kevin Korb			Phone:    (407) 252-6853
	Systems Administrator		Internet:
	FutureQuest, Inc.		Kevin at FutureQuest.net  (work)
	Orlando, Florida		kmk at sanitarium.net (personal)
	Web page:			http://www.sanitarium.net/
	PGP public key available on web site.
~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~