Re: rsync and many files

Cliff Simon cliff.simon at netclusive.com
Tue Jun 7 01:50:58 MDT 2011


Hello everyone,

First of all, thank you for your answers!

Destination: Linux / ext3 / rsync 3.0.3
Source: Linux / ext3 / rsync 2.6.9

@Paul:
> Both ends have to be at least 3.0.0 to enable the incremental recursion.
The incremental recursion looks interesting. But because only a few files are new or updated, I don't think it would shorten the run time much.
In addition, I am unfortunately not able to update both ends to 3.x.

@Steven:
> Are you sure it is not the hardware that is limiting rsync's performance?
I'm not sure. But 40 files per second doesn't sound fast. I'm also not sure whether there is any chance of making it run faster at all.
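
For reference, the 40 files per second figure comes from Steven's estimate quoted further down. A minimal sketch of that arithmetic, assuming his round number of 2 million files (the original post only says "several millions") and the 14-hour run time:

```shell
# Rough throughput estimate: files handled per second.
# files=2000000 is the assumed count from Steven's estimate, not a measured value.
files=2000000
hours=14
rate=$(( files / (hours * 3600) ))   # shell arithmetic is integer-only, so 39.68 truncates
echo "$rate files/second"            # prints "39 files/second"
```

If the real file count is higher than 2 million, the effective rate is proportionally higher as well.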

@Greg:
> Are you sure the majority of the time is generating the file list and determine what's changed?
How can I check that?
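
One possible way to check, offered as a sketch rather than something suggested in the thread: a dry run with --stats makes rsync walk and compare the trees without copying anything, and the statistics it prints include file list generation and transfer times. The temporary directories and file names below are placeholders for illustration only.

```shell
# Hypothetical local demo: src and dst are throwaway directories, not the real backup paths.
src=$(mktemp -d)
dst=$(mktemp -d)
touch "$src/a" "$src/b" "$src/c"

if command -v rsync >/dev/null 2>&1; then
  # --dry-run: scan and compare only, transfer nothing.
  # --stats:   print a summary that includes the file list timings.
  rsync -a --dry-run --stats "$src/" "$dst/" | grep -i 'file list' \
    || echo "no file list lines found in the --stats output"
else
  echo "rsync is not installed on this machine"
fi
# Remove "$src" and "$dst" by hand when finished.
```

The same two options could be added to the real rsync command from the original post; if the dry run alone takes most of the 14 hours, the time is going into scanning and comparing rather than into copying data.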
> On modern hardware I see 1000's of files per second when scanning for changed files.
Destination is a quad-core machine with 12 GB of RAM.
Source is a bit older and only has a dual-core CPU with 4 GB.

@Kevin:
> Since you didn't specify I will guess Linux with ext3.
Correct
> If that is the case run don't walk to ext4. 
OK (was not planned)
> Also, mount the filesystems with the noatime and nodiratime options. This will prevent every stat() call from also writing to the filesystem which can be a huge performance benefit.
noatime was already active; I have now added nodiratime as well.
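
A small sketch for confirming which mount options are actually in effect, assuming a Linux system where /proc/mounts is readable. The helper name has_opt is made up for this example.

```shell
# has_opt OPTIONS NAME: succeed if the comma-separated OPTIONS string contains NAME exactly.
has_opt() {
  case ",$1," in
    *",$2,"*) return 0 ;;
    *)        return 1 ;;
  esac
}

# Walk the kernel's mount table and report the atime setting of ext3/ext4 mounts.
if [ -r /proc/mounts ]; then
  while read -r dev mnt fstype opts _; do
    case "$fstype" in
      ext3|ext4)
        if has_opt "$opts" noatime; then
          echo "$mnt: noatime is set"
        else
          echo "$mnt: noatime is NOT set"
        fi
        ;;
    esac
  done < /proc/mounts
fi
```

The surrounding commas keep the match exact, so an option string containing only nodiratime is not mistaken for noatime.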

Best regards
Cliff

> -----Original Message-----
> From: Kevin Korb [mailto:kmk at sanitarium.net]
> Sent: Tuesday, June 7, 2011 04:52
> To: rsync at lists.samba.org
> Subject: Re: rsync and many files
> 
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> A lot of this has to do with the filesystems and operating systems involved.
> 
> Since you didn't specify I will guess Linux with ext3.  If that is the case run
> don't walk to ext4.  Also, mount the filesystems with the noatime and
> nodiratime options.  This will prevent every stat() call from also writing to the
> filesystem which can be a huge performance benefit.
> 
> On 06/06/11 22:48, Greg Siekas wrote:
> > 40 files a second seems very slow.  Are you sure the majority of the time is
> > generating the file list and determine what's changed?  How many of the
> > millions of files are changed?
> >
> > On modern hardware I see 1000's of files per second when scanning for
> > changed files.
> >
> > On Jun 6, 2011, at 12:39 PM, Steven Levine <steve53 at earthlink.net> wrote:
> >
> >> In <F992406D6E81B54DBB33210217FE7AFD07F80C84 at exchange1.mtb.netclusive.de>,
> >> on 06/06/11 at 12:04 PM, Cliff Simon <cliff.simon at netclusive.com> said:
> >>
> >> Hi,
> >>
> >>> We are using rsync via rsnapshot, but this is not elementary. It is
> >>> used to backup many (above 100 servers) and works very well. Now
> >>> there is one server with many (several millions) files. The files
> >>> are not very big, so the complete backup is about 500 GB.
> >>
> >>> Now my problem is, that the backup needs about 14 hours - the most
> >>> time is to generate the filelist and check whether the files are
> >>> new/changed or not.
> >>
> >>> My rsync-command is:
> >>> /usr/bin/rsync -a --bwlimit=9000 --delete --numeric-ids --relative
> >>> --delete-excluded --exclude=/some/pathes/ --rsh=/usr/bin/ssh
> >>> --link-dest=/dest.path/daily.1/ root at 192.x.x.x:/path.to.backup/
> >>
> >>> Do you have an idea to reduce the backup time?
> >>
> >> A bit of math says 2*10^6 / 14 hours is about 40 files/second.  How
> >> fast do you think rsync should be and how does this compare to
> >> backups on your other servers?
> >>
> >> Are you sure it is not the hardware that is limiting rsync's
> >> performance?
> >>
> >> Based on my knowledge of the rsync sources, I believe the file list
> >> generation algorithms are pretty efficient.  There is quite a bit of
> >> code in the code path, but it's hard to avoid this given the number
> >> of options available to control the sync process.
> >>
> >> Steven
> >>
> >> --
> >> ----------------------------------------------------------------------
> >> "Steven Levine" <steve53 at earthlink.net>  eCS/Warp/DIY etc.
> >> www.scoug.com www.ecomstation.com
> >> ----------------------------------------------------------------------
> >>
> >> --
> >> Please use reply-all for most replies to avoid omitting the mailing list.
> >> To unsubscribe or change options:
> >> https://lists.samba.org/mailman/listinfo/rsync
> >> Before posting, read:
> >> http://www.catb.org/~esr/faqs/smart-questions.html
> 
> - --
> ~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~
> 	Kevin Korb			Phone:    (407) 252-6853
> 	Systems Administrator		Internet:
> 	FutureQuest, Inc.		Kevin at FutureQuest.net  (work)
> 	Orlando, Florida		kmk at sanitarium.net (personal)
> 	Web page:			http://www.sanitarium.net/
> 	PGP public key available on web site.
> ~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v2.0.17 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
> 
> iEYEARECAAYFAk3tkkIACgkQVKC1jlbQAQdFZwCgxycYzAP98QFZX/2GMUllYcu
> g
> skwAoKO5VIOhq/ttIYAua7lHHD24LXIM
> =2fSh
> -----END PGP SIGNATURE-----

