Speed up rsync with many excluded files

Georg Limbach georf at georf.de
Wed Nov 13 13:35:58 UTC 2019


Hello raf,

thank you for your advice. I am doing it this way now:

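# get all remote files; strip the leading "./" so the paths look like the
# exclude entries (orders/scan/...), and sort because comm needs sorted input
$SSH "cd /srv/$PROJECT/shared/uploads/; find . -type f ! -path './tmp/*' \
    ! -name '*pdf_before_qpdf'" | sed 's|^\./||' | sort > "$ALL_REMOTE_FILES"

# remove exclude entries; writing straight back to "$ALL_REMOTE_FILES" would
# truncate it before comm reads it, so each pass goes through a temp file
for EXCLUDE_FILE in "$EXCLUDE_FILE1" "$EXCLUDE_FILE2" "$EXCLUDE_FILE3"; do
    sort "$EXCLUDE_FILE" | comm -23 "$ALL_REMOTE_FILES" - > "$ALL_REMOTE_FILES.tmp"
    mv "$ALL_REMOTE_FILES.tmp" "$ALL_REMOTE_FILES"
done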

# use --files-from without --delete
rsync --files-from "$ALL_REMOTE_FILES" -aAXz [...]

# now only delete files
rsync -r --delete --existing --ignore-existing [...]

This works! Thanks!
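
The filtering step can be sketched on its own. This is only a minimal
illustration with made-up paths in a throwaway directory, not the real
project data. It assumes the exclude lists hold one path per line, written
the same way as the candidate list. comm needs both inputs sorted in the
same locale, and its output has to go to a different file than the one it
reads.

```shell
#!/bin/sh
# Sketch only: every path and file name below is hypothetical.
workdir=$(mktemp -d)

# Candidate list, in the arbitrary order find might print it.
printf '%s\n' \
    'orders/scan/file/3/scan.pdf' \
    'orders/scan/file/1/scan.pdf' \
    'orders/scan/file/2/scan.pdf' > "$workdir/all"

# Two exclude lists; the second names a file that is not a candidate.
printf '%s\n' 'orders/scan/file/2/scan.pdf' > "$workdir/exclude1"
printf '%s\n' \
    'orders/scan/file/3/scan.pdf' \
    'orders/scan/file/9/scan.pdf' > "$workdir/exclude2"

# Sort both sides in the same locale; merge the exclude lists in one go.
LC_ALL=C sort -u "$workdir/all" > "$workdir/all.sorted"
LC_ALL=C sort -u "$workdir/exclude1" "$workdir/exclude2" > "$workdir/exclude.sorted"

# -23 suppresses columns 2 and 3, leaving lines found only in the first file.
LC_ALL=C comm -23 "$workdir/all.sorted" "$workdir/exclude.sorted" \
    > "$workdir/files-from"

cat "$workdir/files-from"
```

The resulting file is what would be passed to rsync with --files-from.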

And to everyone else who works on rsync: thanks for this nice tool!

Georg



On 13.11.19 at 01:09, raf via rsync wrote:
> raf via rsync wrote:
> 
>> Georg Limbach via rsync wrote:
>>
>>> Hello,
>>>
>>> I have a problem while rsyncing a directory with perhaps 1,000,000
>>> files. Before syncing, I generate some files locally and list them in
>>> temp files, so these files should be excluded from the sync.
>>>
>>> My rsync call looks like this:
>>>
>>> rsync --exclude='tmp/' \
>>> --exclude-from=/tmp/tmp.GF7SsFPnS3 \
>>> --exclude-from=/tmp/tmp.8SjJNCHyaI \
>>> --exclude-from=/tmp/tmp.CxZXEoPjgV \
>>> --exclude-from=/tmp/tmp.G3g2iMo4bs \
>>> --exclude-from=/tmp/tmp.H9KJYPMfMS \
>>> --exclude-from=/tmp/tmp.PNi7cJaREP \
>>> --exclude-from=/tmp/tmp.S4N9H4lsU7 \
>>> --exclude-from=/tmp/tmp.a5Zlgh6pUK \
>>> --exclude-from=/tmp/tmp.eiUlMluAe8 \
>>> --exclude-from=/tmp/tmp.ma0S1YSewc \
>>> --exclude-from=/tmp/tmp.sLR95oVbVD \
>>> --exclude-from=/tmp/tmp.zbfeLpezMX \
>>> -ax --info=progress2 \
>>> -e 'ssh -x -T -o Compression=no' \
>>> '/srv/project/shared/uploads/' \
>>> 'project@host:/srv/project/shared/uploads/'
>>>
>>> Every temp file contains thousands of lines with file paths like this:
>>>
>>> orders/scan/file/2234480/scan.pdf
>>>
>>> When I start rsync, it should only transfer perhaps 1000 files and
>>> delete about 20. But that takes 15 minutes. I think the large number
>>> of excluded files slows down the file comparison.
>>>
>>> What can I do to speed up this process?
>>>
>>> Thanks for your advice!
>>>
>>> Georg
>>
>> I found a big speedup when I started providing a list of
>> candidate files to rsync with the --files-from option.
>> If you have some way of identifying which files might need
>> to be rsynced that's quicker than rsync itself checking
>> everything, it can make a big difference. That's not why
>> I did it, but it was a nice bonus.
> 
> But it might not help if doing --delete at the same
> time. Can the --delete be done occasionally instead of
> every time? Unless the files to be deleted can be added
> to the list of files handed to rsync. I don't know if
> that'll do what I think but it might.
> 
> cheers,
> raf
> 
> 


