Trying to diagnose incomplete file transfer

Albert Croft acroft at cyber-wizard.com
Sat Mar 4 06:39:52 UTC 2023


At $work I have an odd situation involving incomplete file transfers, 
but I am unsure where the issue may be occurring. Here is the scenario.

Problem:
Sometimes the file transfer seems to have completed, but the file size 
does not match that on the remote system.


Details:
I transfer a number of large (>1GB) Tar-Gzipped (.tgz) files via SSH 
tunnels from $customer. Because of some previous issues, sometimes the 
SSH tunnels may be terminated externally. As a result, I am currently 
using the 'split' command to break the files into 1-GB "chunks" (ex.: 
foo.tgz.aa, foo.tgz.ab, ...).
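
For illustration, the chunking step can be sketched roughly as below.
This is only a sketch: the function name 'split_and_verify' is
hypothetical, and the chunk size is passed as a parameter so the idea
can be tried on a small file first (in production it would be 1G). It
reassembles the chunks and compares them to the original, which at
least confirms the chunks are intact before any transfer starts.

```shell
# Hypothetical helper: split a file into fixed-size chunks and confirm
# that the chunks, concatenated in name order, reproduce the original.
split_and_verify() {
    src=$1          # file to split, e.g. foo.tgz
    chunk_size=$2   # e.g. 1G for real use; something small for a trial

    # Produces foo.tgz.aa, foo.tgz.ab, ... next to the source file.
    split -b "$chunk_size" "$src" "$src."

    # The glob expands in sorted order, which matches split's suffix order.
    cat "$src".?? | cmp -s - "$src"
}
```

A zero exit status means the chunks match the source byte for byte.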

For the rsync transfer, I am using the following options:
     rsync -az \
         -e "ssh ..." \
         --link-dest=/local/path1 \
         --link-dest=/local/path2 \
         --remove-source-files \
         user@remote:/path/to/files \
         /local/path1/

where
	'-e "ssh ..."' is the set of SSH options (for tunneling, etc.).
	'--link-dest=/local/path1' and '--link-dest=/local/path2' each
refer to a local directory that might already contain a copy of the
file.

I frequently find that a file appears to have been transferred but is
incomplete. (Example: foo.tgz.ab now exists on the local system and has
been removed from the remote, but the local copy is incomplete.)
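
Because '--remove-source-files' deletes the remote copy, there is
nothing left to compare against once a chunk turns out to be short. One
possible way to detect this, sketched below, is to record checksums on
the sending side before the transfer and check them on the receiving
side. This assumes GNU 'sha256sum' is available on both systems; the
function names and the 'foo.tgz.sha256' manifest name are hypothetical.

```shell
# Sender side (hypothetical helper): record a checksum for every chunk
# of the given base name in the given directory.
make_manifest() {
    ( cd "$1" && sha256sum "$2".?? > "$2.sha256" )
}

# Receiver side (hypothetical helper): exits non-zero if any chunk
# listed in the manifest is missing or does not match its checksum.
check_manifest() {
    ( cd "$1" && sha256sum --quiet -c "$2.sha256" )
}
```

Usage would be along the lines of 'make_manifest /path/to/files
foo.tgz' on the remote and 'check_manifest /local/path1 foo.tgz'
locally, with the manifest transferred alongside the chunks.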


Additional notes:
I do not know whether the 'gzip' '--rsyncable' option is being used. I
do not think so; I suspect the file is created using a command similar
to 'tar czf foo.tgz ...'.

The rsync commands may be launched from the command line or from cron,
but use the same format and options in either case. As a result, there
may be multiple rsync processes pulling files from the same remote path
to the same local path at the same time.
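
To test whether overlapping runs are a factor, the runs could be
serialised with a lock. The following is only a sketch, assuming
'flock' from util-linux is installed; the function name 'run_exclusive'
and the lock file path are hypothetical.

```shell
# Hypothetical wrapper: run a command only if no other holder has the
# lock. With -n, flock fails immediately instead of waiting.
run_exclusive() {
    lock=$1
    shift
    flock -n "$lock" "$@"
}
```

The existing rsync command line would be passed unchanged after the
lock path, for example 'run_exclusive /var/lock/pull.lock rsync -az
...', from both cron and the command line.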

I know that while rsync is transferring a file (e.g. foo.tgz.ab), the
file is named '.foo.tgz.ab.??????' (where '.??????' is a 6-character
unique extension), and that upon completion it is renamed to
'foo.tgz.ab'. (So I may see .foo.tgz.ab.4e67d0 and .foo.tgz.ab.fa7325
in the directory at the same time while two transfers of that chunk are
in progress.)
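
Temporary files of that form can be listed with something like the
sketch below, which may help show when two processes are working on the
same chunk. The function name is hypothetical, and '-maxdepth' assumes
GNU or BSD 'find'.

```shell
# Hypothetical helper: list hidden regular files in a directory whose
# names end in a dot followed by exactly six characters.
list_rsync_temps() {
    find "$1" -maxdepth 1 -type f -name '.*.??????'
}
```

Two entries sharing the same base name (as in the example above) would
indicate concurrent transfers of one chunk.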


I am unsure if this is a result of the combination of options I am 
using, or where to begin troubleshooting. Any guidance or direction 
would be appreciated.

-Albert C.




