rsync file corruption when destination is a SAN LUN (Solaris 9 & 10)

Terry Countryman terry.countryman at oit.gatech.edu
Thu Oct 1 13:45:40 MDT 2009


I have run into a problem using 'rsync' to copy files from local disk
to a SAN-mounted LUN / file system. The 'rsync' appears to run fine and
reports no errors, but some files are corrupted: their checksums don't
match the originals and the file data has changed.

So far, I have found this problem on both Solaris 9 and Solaris 10, on
several different models of SPARC systems, and with different versions
of 'rsync' (2.6.8, 3.0.2, and 3.0.6). All of these systems use QLogic
HBAs connected to QLogic FC switches, and the SAN storage is on
Sun/StorageTek arrays.

My quick example of the problem:

	SAN-mounted LUN / file system == /apps
	local disk with OS & system files == /

		mkdir /apps/junk

		rsync -avcHS /sbin/. /apps/junk/.
			<no errors reported>
			<no errors reported in system logs>

	then immediately do the same 'rsync' again

		rsync -avcHS /sbin/. /apps/junk/.

	it finds 2-3 files whose checksums don't match and re-copies them.
	If I run 'rsync' a third time, it re-copies the same 2-3 files.
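
To pin down exactly which files differ, a byte-for-byte comparison that
does not rely on rsync's own checksums might help. The following is only
a minimal sketch, assuming a POSIX shell; list_mismatches is a made-up
helper name, not part of rsync:

```shell
#!/bin/sh
# list_mismatches SRC DST   (hypothetical helper, not an rsync feature)
# Prints the relative path of every regular file under SRC whose copy
# under DST is missing or differs byte-for-byte.
list_mismatches() {
    src=$1
    dst=$2
    # Enumerate files relative to SRC so the same path can be used in DST.
    ( cd "$src" && find . -type f -print ) | while IFS= read -r f; do
        # cmp -s is silent and returns non-zero on any difference
        # (or if the destination file cannot be read).
        if ! cmp -s "$src/$f" "$dst/$f"; then
            printf '%s\n' "$f"
        fi
    done
}
```

With the example above this would be run as
"list_mismatches /sbin /apps/junk". For each file it prints, running
"cmp -l" on the two copies lists the offsets of the differing bytes,
which may show whether the damage lines up with zero-filled regions.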

If I don't use the sparse-files option, "-S", the copies succeed and
the data in the 'rsync'-ed copies matches the original files. But I
need sparse-file handling for the files that I have to copy.
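
Since "-S" asks rsync to handle sparse files efficiently so they take
less space on the destination, it may be worth checking whether the
affected copies actually ended up with holes. A rough sketch, assuming
a POSIX shell with wc, du and awk available; size_vs_alloc is a made-up
helper name:

```shell
#!/bin/sh
# size_vs_alloc FILE   (hypothetical helper, not an rsync feature)
# Prints "<apparent size in bytes> <allocated KB> <path>" for FILE.
# Allocated space well below the apparent size suggests the file has holes.
size_vs_alloc() {
    # Apparent size: number of bytes a reader would see.
    bytes=$(wc -c < "$1" | tr -d ' ')
    # Allocated size: disk space actually used, in 1024-byte units.
    kb=$(du -k "$1" | awk '{ print $1 }')
    printf '%s %s %s\n' "$bytes" "$kb" "$1"
}
```

Running this on both the original in /sbin and the copy in /apps/junk,
for one of the files that keeps being re-copied, would show whether the
copy on the SAN file system was written with holes while the original
was not.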

I do not see this problem when the 'rsync' is from:
	-	local-disk to local-disk
	-	local-disk to NFS file-system
	-	NFS file-system to local-disk

What other data would be useful to debug this problem?


=====
Terry Countryman
terry.countryman at oit.gatech.edu

