rsync splits filenames, creates special characters where none are, weird permissions

Lenz Weber mail at lenzw.de
Wed Jan 7 15:26:12 MST 2015


Hi,

On 07.01.2015 at 18:25, Paul Slootman wrote:
> On Wed 07 Jan 2015, Lenz Weber wrote:
> 
>> Where the local destination /data/snapshots is an NFS volume mounted with the flags
>>     (rw,noatime,addr=192.168.1.XX)
>> and the source is a symlink to a zfs snapshot - that looks like this:
>>     /var/backups/mail -> /tank/mail/.zfs/snapshot/zfs-auto-snap_hourly-2015-01-07-1417
> 
> Why not skip the NFS part and run rsync to the destination over the
> network? Rsync is written to minimize network traffic at the cost of
> local IO, and if you're doing NFS then that "local IO" is really also
> network traffic.  You also eliminate one potential source of problems in
> that case.

If I were setting up a new backup host, I would consider this, but this is a "grown"
platform - and as you know, those are not always easy (or quick) to change, so I'll have
to stick with this setup for now.
As for the "potential source of problems" part: this exact data set (source) previously resided
on another server (without the zfs setup), where it backed up just fine. So I think
the problem is most likely on the source side, not on the target side.

> 
>> as far as I can tell, both systems work with UTF8 just fine (source is Ubuntu 14.04 and target is Debian Lenny)
>>
>> Now there seems to be a problem while gathering or transferring the file list,
>> as rsync tries to create files/folders that share a part with real files on the source,
>> but with additional characters, sometimes cut off, without the preceding parent folder et cetera.
> 
> How often? Every file? 10% 1%? ...
We're talking about roughly a million files and about a dozen errors per transfer.
But that is only what I can see - the transfer usually aborts at one point or another with varying error messages,
so I can't say whether there would be more errors if the transfer completed.

Some of these are (going back through my logs):

#case 1: (most of the time, I guess)
rsync: connection unexpectedly closed (147733412 bytes received so far) [receiver]
rsync error: error in rsync protocol data stream (code 12) at io.c(635) [receiver=3.0.3]
rsync: connection unexpectedly closed (55 bytes received so far) [generator]
rsync error: error in rsync protocol data stream (code 12) at io.c(635) [generator=3.0.3]

#case 2:
rsync: writefd_unbuffered failed to write 4 bytes [generator]: Broken pipe (32)
rsync error: error in rsync protocol data stream (code 12) at io.c(1544) [generator=3.0.3]

#case 3:
unknown message 31:5178099 [generator]
rsync error: error in rsync protocol data stream (code 12) at io.c(475) [generator=3.0.3]
rsync error: received SIGUSR1 (code 19) at main.c(1304) [receiver=3.0.3]
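
To put numbers on "most of the time", the error lines could be tallied from the logs with a small Python sketch like the one below. The regular expression is only my guess at the line format, based on the three cases above, and the function name tally_errors is made up for illustration. It only looks at the "rsync error:" lines and ignores everything else.

```python
import re
from collections import Counter

# Assumed format, taken from the log excerpts above, e.g.:
#   rsync error: error in rsync protocol data stream (code 12) at io.c(635) [receiver=3.0.3]
ERROR_RE = re.compile(
    r"^rsync error: (?P<text>.*) \(code (?P<code>\d+)\) "
    r"at (?P<file>[\w.]+)\((?P<line>\d+)\) \[(?P<role>\w+)=(?P<version>[\d.]+)\]$"
)

def tally_errors(lines):
    """Count (exit code, source location, role) triples in rsync log lines."""
    counts = Counter()
    for line in lines:
        m = ERROR_RE.match(line.strip())
        if m:
            location = f'{m["file"]}({m["line"]})'
            counts[(int(m["code"]), location, m["role"])] += 1
    return counts

# The nine log lines quoted above, used as sample input.
sample = [
    "rsync: connection unexpectedly closed (147733412 bytes received so far) [receiver]",
    "rsync error: error in rsync protocol data stream (code 12) at io.c(635) [receiver=3.0.3]",
    "rsync: connection unexpectedly closed (55 bytes received so far) [generator]",
    "rsync error: error in rsync protocol data stream (code 12) at io.c(635) [generator=3.0.3]",
    "rsync: writefd_unbuffered failed to write 4 bytes [generator]: Broken pipe (32)",
    "rsync error: error in rsync protocol data stream (code 12) at io.c(1544) [generator=3.0.3]",
    "unknown message 31:5178099 [generator]",
    "rsync error: error in rsync protocol data stream (code 12) at io.c(475) [generator=3.0.3]",
    "rsync error: received SIGUSR1 (code 19) at main.c(1304) [receiver=3.0.3]",
]

for key, count in tally_errors(sample).most_common():
    print(key, count)
```

Grouping by source location (io.c(635), io.c(1544), io.c(475)) rather than by exit code keeps the three cases apart, since all of them report code 12.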



> 
>> The source file names in this case look like this:
>>
>> /var/backups/mail/redacted-domain/catchall/Maildir/.Sent/cur/1313508314.M654736P32713V0000000000000801I000B03CC_6.redacted-hostname,S=42352:2,S
>> /var/backups/mail/redacted-domain/info/Maildir/.Sent/cur/1313508314.M654736P32713V0000000000000801I000B03CC_6.redacted-hostname,S=42352:2,S
>>
>> but rsync fails on files like this, that clearly do not exist:
>>
>> skipping non-regular file "2713V0000000000000801I000B03CC_6.redacted-hostname,S=423\#001\#305\#001O\#233\#240é"
>> skipping non-regular file "2713V0000000000000801I000B03CC_6.redacted-hostname,S=42352:2,S"
>> skipping non-regular file "83E13498714.M297793P23544V000"
>> skipping non-regular file "Ø \#201"
>> skipping non-regular file "redacted-domain/catchall/Maildir/.Sent/cur/1301490998.M622842P6671V0000000000000801I00280BD9_0.redacted-hostname\#004"
>> skipping non-regular file "2713V0000000000000801I000B03CC_6.redacted-hostname,S=pedition/courierimapkeywords/:list"
> 
> Is this reproducible, i.e. a second run (after cleaning up the mess it
> left behind) creates these same files again, or others?
Most of the time there is some kind of pattern within a single run, but the patterns differ between runs.
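
To look at what is actually in those bogus names, the \#ooo sequences can be turned back into raw bytes. As far as I know, rsync prints unprintable bytes as a backslash, a hash and three octal digits, so the sketch below reverses that. The function names are made up, and I am assuming the remaining characters in the log are UTF-8.

```python
import re

# A backslash, a hash and three octal digits (000-377), as seen in the log.
ESCAPE_RE = re.compile(rb"\\#([0-3][0-7]{2})")

def unescape_rsync_name(name: str) -> bytes:
    """Turn the \\#ooo escapes of a logged file name back into raw bytes."""
    raw = name.encode("utf-8")  # assumption: the log itself is UTF-8
    return ESCAPE_RE.sub(lambda m: bytes([int(m.group(1), 8)]), raw)

def is_valid_utf8(data: bytes) -> bool:
    """Report whether the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

def has_control_bytes(data: bytes) -> bool:
    """Report whether the name contains ASCII control characters."""
    return any(b < 0x20 or b == 0x7F for b in data)

# Two of the names from the log above.
logged = [
    r"2713V0000000000000801I000B03CC_6.redacted-hostname,S=423\#001\#305\#001O\#233\#240é",
    r"2713V0000000000000801I000B03CC_6.redacted-hostname,S=42352:2,S",
]

for name in logged:
    raw = unescape_rsync_name(name)
    print(raw, is_valid_utf8(raw), has_control_bytes(raw))
```

A name that contains control bytes or does not decode as UTF-8 is very unlikely to be a real Maildir file name, which would support the idea that the file list is being damaged in transit rather than on disk.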

> 
> My first thought is that this combination of factors is triggering some
> sort of memory problems which is corrupting the filenames. It may also
> be useful to do a run with --checksum to catch any data corruption (or
> to see if it finds mismatches where there shouldn't).

I will try this, though I will try disabling rssh first (Wayne Davison
suggested in another mail that an enforced command could be the cause - I hadn't thought of that!).
I will send more information tomorrow - let's see how it works out.
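
Independently of rsync's own --checksum, the file contents on both sides could be cross-checked with a small script once a transfer has completed. The following is only a sketch under my own assumptions: the function names are invented, SHA-256 is an arbitrary choice, and only regular files are compared (symlinks and special files are skipped).

```python
import hashlib
import os

def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def tree_digests(root):
    """Map each regular file below root (relative path) to its digest."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.isfile(full) and not os.path.islink(full):
                digests[os.path.relpath(full, root)] = file_digest(full)
    return digests

def compare_trees(src_root, dst_root):
    """Return (missing on target, extra on target, differing content)."""
    src = tree_digests(src_root)
    dst = tree_digests(dst_root)
    missing = sorted(set(src) - set(dst))
    extra = sorted(set(dst) - set(src))
    differing = sorted(p for p in set(src) & set(dst) if src[p] != dst[p])
    return missing, extra, differing

# Example call; the target directory name here is hypothetical:
# missing, extra, differing = compare_trees("/var/backups/mail", "/data/snapshots/mail")
```

The "extra" list would be the interesting one here, since it should contain any bogus names that a run left behind on the target. Note that this reads every file once on each side, so over NFS it will take a while with a million files.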

> 
> If this can be narrowed down to a fairly small transfer which goes wrong
> reproducibly, then using strace -f on rsync (with -o strace-output.txt)
> then perhaps you can see whether the errors already occur when reading
> the files or not.
> 
I have tried it with significantly smaller datasets and could not reproduce the problem :(
> 
> I have not heard of rsync performing this way, so I strongly suspect
> some hardware problem.
> 
> 
> Paul
> 

Thank you very much so far, at least I'm not alone with this intimidating mess :)

Regards,
Lenz

