Rsync compression problem - sometimes ineffective?
Bodle, Donald E
donald_bodle at reyrey.com
Thu Jun 12 17:35:44 GMT 2008
Running rsync 2.6.9-1.el4.rf on CentOS 4.4 client and remote server.
Backing up user data from 2 different clients using the following:
su - $HOSTID -c 'rsync -azr --timeout=600 --log-file=$DEBUGFILE
--log-file-format="%o %f %b %l %i" --stats --delete --bwlimit=$BANDWDT
--rsh="ssh -P ____" $STAGE $TARGET:$TARGETDIR'
Using "bytes sent"/"literal data" from the statistics as a rough estimate
(I know there is overhead in the bytes sent) of the effectiveness of
compression, most days I see reasonable compression, such as these from
our summary (X MBytes compressed = bytes sent; X MBytes uncompressed =
literal data):
rsync $HOSTID transferred 46.20 MBytes compressed (210.45 MBytes
52 minutes and 6 seconds
6,896 files changed out of 81,720 total files (8.44%)
rsync $HOSTID transferred 543.53 MBytes compressed (3.66 GBytes
2 hours, 16 minutes and 38 seconds
7,343 files changed out of 79,944 total files (9.19%)
Some days, I see no evidence of compression, such as this:
rsync $HOSTID transferred 52.10 MBytes compressed (50.06 MBytes
59 minutes and 48 seconds
5,350 files changed out of 80,257 total files (6.67%)
or similarly this:
rsync $HOSTID transferred 1007.55 MBytes compressed (1004.59 MBytes
3 hours, 38 minutes and 47 seconds
9,888 files changed out of 79,306 total files (12.47%)
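
The "bytes sent"/"literal data" estimate can be worked through for the four
runs quoted above. This is a minimal sketch, not part of the original
backup script: the helper name and run labels are hypothetical, and it
assumes the summary reports binary units (1 GByte = 1024 MBytes).

```python
# Rough compression estimate: bytes sent divided by literal data.
# The figures are copied from the four summaries quoted above.
# The helper name and the run labels are illustrative only.

MB_PER_GB = 1024  # assumption: the summary uses binary units


def sent_to_literal(sent_mb: float, literal_mb: float) -> float:
    """Return bytes sent as a fraction of literal data (lower is better)."""
    return sent_mb / literal_mb


runs = [
    ("run A", 46.20, 210.45),
    ("run B", 543.53, 3.66 * MB_PER_GB),
    ("run C", 52.10, 50.06),
    ("run D", 1007.55, 1004.59),
]

for label, sent_mb, literal_mb in runs:
    ratio = sent_to_literal(sent_mb, literal_mb)
    # A ratio at or above 1.0 means the wire traffic was no smaller
    # than the literal data, i.e. no visible gain from -z.
    verdict = "compressed" if ratio < 1.0 else "no visible gain"
    print(f"{label}: sent/literal = {ratio:.3f} ({verdict})")
```

Runs A and B come out well below 1.0, while runs C and D land slightly
above 1.0, which is what protocol overhead on top of essentially
incompressible data would look like.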
My initial thought was that the days of no apparent compression were days
when the majority of the changed files were small files (as when gzipping
a small ASCII file doubles its size) or already compressed files. But so
far I haven't been able to confirm this. I'm also not sure this logic
applies, since rsync compresses data blocks (at least as I understand
it), and those blocks would be fairly consistent in size (I think). Is
this general understanding of rsync's compression correct?
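
The small-file and already-compressed hypothesis can be checked in
isolation. rsync's -z option is built on zlib, so the sketch below uses
Python's zlib module as a stand-in. It is not rsync's own code path, and
the sample inputs are assumptions chosen for illustration: repetitive
text, a three-byte file, and random bytes standing in for data that has
already been compressed.

```python
import os
import zlib


def deflate_ratio(data: bytes, level: int = 6) -> float:
    """Return compressed size divided by original size for one buffer."""
    return len(zlib.compress(data, level)) / len(data)


# Illustrative inputs (assumptions, not taken from the backup set).
samples = {
    "repetitive text": b"user=alice status=ok bytes=1024\n" * 4096,
    "tiny file": b"ok\n",
    # Random bytes approximate already compressed content such as
    # .gz or .jpg files: there is no redundancy left to remove.
    "random bytes": os.urandom(128 * 1024),
}

for name, data in samples.items():
    ratio = deflate_ratio(data)
    print(f"{name}: {len(data)} bytes in, ratio {ratio:.3f}")
```

Only the repetitive text shrinks. The tiny input grows because the zlib
header and checksum outweigh the payload, and the random input stays at
about its original size, so a day dominated by either kind of data would
show little or no gain.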
I searched the samba.org local archives first, and then Internet-wide,
using +rsync +compression +problem, but didn't find any similar posts.
Less restrictive searches didn't help any either. I also didn't see
anything in the FAQ or current issues and debugging areas.
Has anyone seen this sort of behaviour before? Can you suggest
additional diagnostics to try? What additional information might be
useful to support my contention that this is related to the data being
changed on those "uncompressed" days?
Donald E. Bodle, Jr.
Sr. Systems Developer
The Reynolds and Reynolds Co.
Are you okay with today, if tomorrow is the end?
- Superchick (So Bright)