rsync splits filenames, creates special characters where none are, weird permissions
mail at lenzw.de
Thu Jan 8 01:33:00 MST 2015
The backup has run twice without problems since - it seems like rssh as
shell on the host was the culprit.
Thank you again, you have really helped me with this!
On 07.01.2015 23:26, Lenz Weber wrote:
> On 07.01.2015 at 18:25, Paul Slootman wrote:
>> On Wed 07 Jan 2015, Lenz Weber wrote:
>>> Where the local destination /data/snapshots is an NFS volume mounted with the flags
>>> and the source is a symlink to a zfs snapshot - that looks like this:
>>> /var/backups/mail -> /tank/mail/.zfs/snapshot/zfs-auto-snap_hourly-2015-01-07-1417
>> Why not skip the NFS part and run rsync to the destination over the
>> network? Rsync is written to minimize network traffic at the cost of
>> local IO, and if you're doing NFS then that "local IO" is really also
>> network traffic. You also eliminate one potential source of problems in
>> that case.
> If I were setting up a new backup host, I would consider this, but this is a "grown"
> platform - and as you know, those are not always easy (or quick) to change, so I'll have
> to stick to that solution for now.
> As for the "potential source of problems" part: This exact data set (source) was residing
> on another server (without the zfs setup) before, where it backed up just fine. So I think
> the problem is most likely to be found on the source part, not on the target part.
>>> As far as I can tell, both systems work with UTF-8 just fine (source is Ubuntu 14.04 and target is Debian Lenny)
>>> Now there seems to be a problem while gathering or transferring the file list,
>>> as rsync tries to create files/folders that share a part with real files on the source,
>>> but with additional characters, sometimes cut off, without the preceding parent folder et cetera.
>> How often? Every file? 10%? 1%? ...
> We're speaking of about a million files and a dozen errors on each transfer.
> But that's only what I can see - usually, the transfer aborts at one point or another with different error messages,
> so I can't say whether there would be more errors if the transfer completed.
> Some of these are (going back through my logs)
> #case 1: (most of the time, I guess)
> rsync: connection unexpectedly closed (147733412 bytes received so far) [receiver]
> rsync error: error in rsync protocol data stream (code 12) at io.c(635) [receiver=3.0.3]
> rsync: connection unexpectedly closed (55 bytes received so far) [generator]
> rsync error: error in rsync protocol data stream (code 12) at io.c(635) [generator=3.0.3]
> #case 2:
> rsync: writefd_unbuffered failed to write 4 bytes [generator]: Broken pipe (32)
> rsync error: error in rsync protocol data stream (code 12) at io.c(1544) [generator=3.0.3]
> #case 3:
> unknown message 31:5178099 [generator]
> rsync error: error in rsync protocol data stream (code 12) at io.c(475) [generator=3.0.3]
> rsync error: received SIGUSR1 (code 19) at main.c(1304) [receiver=3.0.3]
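The "rsync error:" lines quoted above all share one shape: message, exit code, source location, and the reporting process with its version. As a minimal sketch for triaging a long log, assuming the format stays exactly as in the quoted lines, the following Python tallies them by code and process. The names ERROR_RE and summarize_errors are hypothetical helpers, not part of rsync.

```python
import re
from collections import Counter

# Assumed line shape, taken from the log excerpts quoted in this thread:
#   rsync error: <message> (code N) at <file>(<line>) [<role>=<version>]
ERROR_RE = re.compile(
    r"rsync error: (?P<text>.*?) \(code (?P<code>\d+)\) "
    r"at (?P<file>[\w.]+)\((?P<line>\d+)\) "
    r"\[(?P<role>\w+)=(?P<version>[^\]]+)\]"
)


def summarize_errors(log_lines):
    """Count 'rsync error:' lines by (exit code, reporting process role)."""
    counts = Counter()
    for raw in log_lines:
        match = ERROR_RE.search(raw)  # search() tolerates leading '> ' quoting
        if match:
            counts[(int(match.group("code")), match.group("role"))] += 1
    return counts
```

Feeding it the lines of a saved log (for example via open(path) with a hypothetical path) gives a quick view of whether one failure mode dominates across runs.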
>>> The source file names in this case look like this:
>>> but rsync fails on files like this, that clearly do not exist:
>>> skipping non-regular file "2713V0000000000000801I000B03CC_6.redacted-hostname,S=423\#001\#305\#001O\#233\#240é"
>>> skipping non-regular file "2713V0000000000000801I000B03CC_6.redacted-hostname,S=42352:2,S"
>>> skipping non-regular file "83E13498714.M297793P23544V000"
>>> skipping non-regular file "Ø \#201"
>>> skipping non-regular file "redacted-domain/catchall/Maildir/.Sent/cur/1301490998.M622842P6671V0000000000000801I00280BD9_0.redacted-hostname\#004"
>>> skipping non-regular file "2713V0000000000000801I000B03CC_6.redacted-hostname,S=pedition/courierimapkeywords/:list"
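In its messages rsync prints unprintable bytes of a filename as \#ooo, a backslash, a hash and three octal digits (unless 8-bit output is requested), which is where sequences like \#001 in the names above come from. A minimal Python sketch, assuming log lines shaped like the "skipping non-regular file" messages quoted here, that pulls out the quoted name and lists the escaped byte values. The function names are hypothetical helpers.

```python
import re

# rsync's escape for an unprintable byte: backslash, '#', three octal digits.
ESCAPE_RE = re.compile(r"\\#([0-7]{3})")
# Assumed message shape, copied from the log lines quoted in this thread.
SKIP_RE = re.compile(r'skipping non-regular file "(?P<name>.*)"')


def escaped_bytes(name):
    """Return the byte values that appear as \\#ooo escapes in a logged name."""
    return [int(octal, 8) for octal in ESCAPE_RE.findall(name)]


def suspicious_names(log_lines):
    """Yield (name, escaped byte values) for skipped names containing escapes."""
    for raw in log_lines:
        match = SKIP_RE.search(raw)
        if match:
            values = escaped_bytes(match.group("name"))
            if values:
                yield match.group("name"), values
```

Names without escapes (such as the truncated "83E13498714.M297793P23544V000") are not flagged by this check; it only catches the entries where control or high bytes ended up in the name.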
>> Is this reproducible, i.e. a second run (after cleaning up the mess it
>> left behind) creates these same files again, or others?
> Most of the time it is some kind of pattern within the same run, but different patterns between different runs.
>> My first thought is that this combination of factors is triggering some
>> sort of memory problems which is corrupting the filenames. It may also
>> be useful to do a run with --checksum to catch any data corruption (or
>> to see if it finds mismatches where there shouldn't be any).
> I will try this. Though I will try disabling rssh first (Wayne Davison
> suggested in another mail that an enforced command could be the reason for that, I didn't think of that!).
> Will send more information tomorrow - let's see how it works out.
>> If this can be narrowed down to a fairly small transfer which goes wrong
>> reproducibly, then by using strace -f on rsync (with -o strace-output.txt)
>> you can perhaps see whether the errors already occur when reading
>> the files or not.
> I have tried it with significantly smaller datasets and could not reproduce the problem :(
>> I have not heard of rsync performing this way, so I strongly suspect
>> some hardware problem.
> Thank you very much so far, at least I'm not alone with this intimidating mess :)