rsync slowness

Gregory Heytings gregory at heytings.org
Mon Feb 6 17:14:27 UTC 2023


I was hit by that bug again, and this time I took the time to create a 
minimal recipe.  It turns out that the culprit was the -y (--fuzzy) flag, 
which I had been using to save some bandwidth, but which does not work 
well when there are "too many" files in a directory.

A simple reproducer (which does not exactly reproduce the issue described 
below, but seems close enough):

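#!/bin/bash
# Create 100000 small files with long, similar names in src/.
mkdir -p src
declare -i i=0 j=0
while :
do
   echo $RANDOM$RANDOM$RANDOM > src/$(printf %09d $j).$RANDOM$RANDOM$RANDOM.foobarbaz.$RANDOM.$RANDOM.$RANDOM
   let i++
   let j+=100
   (($i == 100000)) && break
done
# Copy everything to dst/, then delete the 30000 files whose names
# start with 007, 008 or 009.
cp -a src dst
rm -f dst/00[7-9]*
# Sync with -y (--fuzzy); this is the slow case.
rsync -avy src/ dst/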

After copying about 900 files, rsync hangs for a couple of minutes 
before copying another batch of about 900 files.  In later iterations 
the batches become smaller: about 200 files.  After each batch, rsync 
waits for a couple of minutes.  I stopped the process after two hours, 
and only 17500 files had been copied.  By comparison,

rsync -av src/ dst/

only takes 2 seconds to copy the 30000 files.
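
One way to track progress during the stalls is to count the files that 
exist in src/ but not yet in dst/.  A minimal sketch; count_missing is a 
hypothetical helper written for this recipe, not an rsync feature:

```shell
# count_missing SRC DST: print how many entries of SRC have no entry of
# the same name in DST.  Hypothetical helper, not part of rsync.
count_missing() {
    n=0
    for f in "$1"/*; do
        [ -e "$f" ] || continue              # SRC is empty: the glob did not expand
        [ -e "$2/${f##*/}" ] || n=$((n + 1)) # no file of that name in DST
    done
    echo "$n"
}
```

Calling "count_missing src dst" from another terminal while rsync is 
running shows how many files remain to be copied; with the recipe above 
it should start at 30000.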

I removed the -y option from my scripts, but perhaps how -y works in 
directories with many files could be improved in one way or another?
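
To give an idea of the scale involved, here is a back-of-the-envelope 
estimate.  It rests on an assumption that I have not verified: that for 
each missing file the fuzzy search compares its name against every name 
already present in the destination directory.  fuzzy_comparisons is a 
hypothetical helper, used only for the arithmetic:

```shell
# fuzzy_comparisons MISSING PRESENT: number of name-to-name comparisons,
# ASSUMING one comparison per (missing file, file present in dst) pair.
# This is an assumption about how -y behaves, not a measurement.
fuzzy_comparisons() {
    echo $(( $1 * $2 ))
}

# Recipe above: 30000 files removed from dst, 70000 files left in dst.
fuzzy_comparisons 30000 70000
```

Under that assumption the recipe needs on the order of two billion name 
comparisons, and more as dst fills up, whereas without -y no such 
comparison is needed at all.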

>
> I've definitely not seen that.  If you can produce a working example and 
> tar it up for us to look at, that might be interesting/useful.
>
> Just to check, though: you do not have --checksum/-c on, right?
>
>> I am finally taking the time to report an rsync slowness pattern that 
>> I've been seeing for years.
>>
>> Assuming:
>>
>> a directory with many (> 20K) files, for example a maildir, on the 
>> sending side, and
>>
>> a partial copy of that directory on the receiving side, with "enough" 
>> missing files (say 5K),
>>
>> then the receiving side will do the following: it will take about one 
>> minute to start transferring the missing files, and will apparently 
>> hang for about one minute every 200 files or so.  After a few (about 10 
>> or 20) iterations, the receiving side apparently hangs.
>>
>> On the receiving side rsync is using 100% CPU, on the sending side no 
>> more than a few percent.  In case it matters, both sides are using 
>> ext4 filesystems and Debian GNU/Linux.
>>
>> I tried --msgs2stderr -M--msgs2stderr, but that does not print any 
>> error message.  I also tried disabling compression, using 
>> --whole-file, using --no-inc-recursive, ..., to no avail.
>>
>> Any hints?



