Speed up rsync with many excluded files
Georg Limbach
georf at georf.de
Wed Nov 13 13:35:58 UTC 2019
Hello raf,
thank you for your advice. I am now doing it this way:
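# get all remote files (sorted, because comm needs sorted input)
$SSH "cd /srv/$PROJECT/shared/uploads/; find . -type f ! -path './tmp/*' \
    ! -name '*pdf_before_qpdf'" | LC_ALL=C sort > "$ALL_REMOTE_FILES"
# remove exclude entries (the exclude files must be sorted the same way);
# write to a temp file first, because "comm A B > A" truncates A before
# comm reads it
for EXCLUDE_FILE in "$EXCLUDE_FILE1" "$EXCLUDE_FILE2" "$EXCLUDE_FILE3"; do
    comm -23 "$ALL_REMOTE_FILES" "$EXCLUDE_FILE" > "$ALL_REMOTE_FILES.new"
    mv "$ALL_REMOTE_FILES.new" "$ALL_REMOTE_FILES"
done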
# use --files-from without --delete
rsync --files-from "$ALL_REMOTE_FILES" -aAXz [...]
# now only delete files
rsync -r --delete --existing --ignore-existing [...]
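A minimal sketch of the exclusion step in isolation, assuming a POSIX shell with comm, sort, sed and mktemp. The directory and file names are made up for illustration and are not from the real project. It shows two details that are easy to miss: comm needs both inputs sorted with the same collation, and the result has to go to a separate file. The leading "./" that find prints is stripped here so that entries compare equal to exclude lines in the form orders/scan/file/2234480/scan.pdf; whether that is needed depends on how the exclude files were written.

```shell
#!/bin/sh
# Hypothetical demo data; none of these paths come from the real project.
WORK=$(mktemp -d)
ALL_FILES="$WORK/all"
EXCLUDE_FILE="$WORK/exclude"
FILTERED="$WORK/filtered"

# Stand-in for the find output: "./"-prefixed paths in no particular order.
printf '%s\n' './orders/scan/file/3/scan.pdf' \
              './orders/scan/file/1/scan.pdf' \
              './orders/scan/file/2/scan.pdf' > "$WORK/raw"

# Exclude list without the "./" prefix.
printf '%s\n' 'orders/scan/file/2/scan.pdf' > "$WORK/exclude.raw"

# Strip "./" and sort both lists with the same collation.
sed 's|^\./||' "$WORK/raw" | LC_ALL=C sort > "$ALL_FILES"
LC_ALL=C sort "$WORK/exclude.raw" > "$EXCLUDE_FILE"

# Keep only lines unique to the first file; write to a new file, not in place.
LC_ALL=C comm -23 "$ALL_FILES" "$EXCLUDE_FILE" > "$FILTERED"

RESULT=$(cat "$FILTERED")
printf '%s\n' "$RESULT"
rm -rf "$WORK"
```

The filtered list is what would be handed to rsync with --files-from, as in the call above.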
This works! Thanks!
And to everyone else who works on rsync: Thanks for this nice tool!
Georg
On 13.11.19 at 01:09, raf via rsync wrote:
> raf via rsync wrote:
>
>> Georg Limbach via rsync wrote:
>>
>>> Hello,
>>>
>>> I have a problem while rsyncing a directory with perhaps 1,000,000
>>> files. Before this I generate some files locally and add them to temp
>>> files. So these files should be excluded from syncing.
>>>
>>> My rsync call looks like this:
>>>
>>> rsync --exclude='tmp/' \
>>> --exclude-from=/tmp/tmp.GF7SsFPnS3 \
>>> --exclude-from=/tmp/tmp.8SjJNCHyaI \
>>> --exclude-from=/tmp/tmp.CxZXEoPjgV \
>>> --exclude-from=/tmp/tmp.G3g2iMo4bs \
>>> --exclude-from=/tmp/tmp.H9KJYPMfMS \
>>> --exclude-from=/tmp/tmp.PNi7cJaREP \
>>> --exclude-from=/tmp/tmp.S4N9H4lsU7 \
>>> --exclude-from=/tmp/tmp.a5Zlgh6pUK \
>>> --exclude-from=/tmp/tmp.eiUlMluAe8 \
>>> --exclude-from=/tmp/tmp.ma0S1YSewc \
>>> --exclude-from=/tmp/tmp.sLR95oVbVD \
>>> --exclude-from=/tmp/tmp.zbfeLpezMX \
>>> -ax --info=progress2 \
>>> -e 'ssh -x -T -o Compression=no' \
>>> '/srv/project/shared/uploads/' \
>>> 'project@host:/srv/project/shared/uploads/'
>>>
>>> Every temp file contains thousands of lines with file paths like this:
>>>
>>> orders/scan/file/2234480/scan.pdf
>>>
>>> When I start rsync, it should only transfer perhaps 1000 files and
>>> delete 20. But it takes 15 minutes to do that. I think the large
>>> number of excluded files slows down the file comparison.
>>>
>>> What can I do to speed up this process?
>>>
>>> Thanks for your advice!
>>>
>>> Georg
>>
>> I found a big speedup when I started providing a list of
>> candidate files to rsync with the --files-from option.
>> If you have some way of identifying which files might need
>> to be rsynced that's quicker than rsync itself checking
>> everything, it can make a big difference. That's not why
>> I did it, but it was a nice bonus.
>
> But it might not help if doing --delete at the same
> time. Can the --delete be done occasionally instead of
> every time? Unless the files to be deleted can be added
> to the list of files handed to rsync. I don't know if
> that'll do what I think but it might.
>
> cheers,
> raf
>
>