Best organizing hundreds of thousands files for rsync and find
Matthew.Stier at us.fujitsu.com
Thu Mar 28 05:30:14 MDT 2013
Another issue is that rsync and find are single-threaded applications. No matter how many processors/cores/threads the system has, each invocation of find or rsync will use only one thread.
You can gain some parallelization by stepping up a level in the directory tree and running separate find or rsync processes at the first subdirectory level. I do this when transferring files between systems over a modern gigabit LAN.
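As a rough illustration of that approach, here is a minimal Python sketch that starts one process per first-level subdirectory and caps how many run at once. The helper names (first_level_dirs, rsync_command, run_parallel) and the default of 4 workers are assumptions made for this example, not something rsync provides. It also assumes the parent directory already exists on the destination, and that any files sitting directly in the top-level directory are copied in a separate pass.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def first_level_dirs(root):
    """Return the immediate subdirectories of root, sorted by name."""
    return sorted(p for p in Path(root).iterdir() if p.is_dir())


def rsync_command(subdir, dest):
    """Build one rsync invocation for a single first-level subdirectory.

    The trailing slash on the source copies the directory's contents
    into dest/<name>/ instead of nesting an extra level.
    """
    return ["rsync", "-a", f"{subdir}/", f"{dest}/{subdir.name}/"]


def run_parallel(root, make_command, workers=4):
    """Run make_command(subdir) for every first-level subdirectory,
    with at most `workers` processes running at the same time.

    Returns a dict mapping subdirectory name to process exit code.
    """
    dirs = first_level_dirs(root)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each worker thread just waits on its own child process.
        codes = pool.map(
            lambda d: subprocess.run(make_command(d)).returncode, dirs
        )
        return {d.name: code for d, code in zip(dirs, codes)}
```

A call might look like run_parallel("/data", lambda d: rsync_command(d, "otherhost:/data")), where the host name and paths are placeholders. The same runner works for find by passing a function that returns a find command for each subdirectory. The worker count is worth tuning: too many simultaneous processes can saturate the disks or the network rather than speed things up.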
From: rsync-bounces at lists.samba.org [mailto:rsync-bounces at lists.samba.org] On Behalf Of Cristian Bichis
Sent: Thursday, March 28, 2013 1:31 AM
To: rsync at lists.samba.org
Subject: Best organizing hundreds of thousands files for rsync and find
I need to organize about 100 million small files (and the number keeps growing) on a server, which should be copied to another server.
I am wondering how many files it is recommended to keep in a folder for optimal performance. Also, if I have a folder containing only subfolders (no files), how many subfolders is it recommended to have?
The same question applies to the "find" command, not just rsync, as I am doing some cleanups using find (or for - find).
I made a mistake earlier and greatly increased the number of subfolders (each holding just a few files), and rsync performance decreased considerably. That was a mistake which I will try to correct.
So now, as the number of files is constantly increasing, I need to find a long-term solution to correct the current issues.