Solution For Rsync and Cygwin Daylight Savings Timezone Problems

jw schultz jw at pegasys.ws
Wed Apr 2 05:16:20 EST 2003


On Wed, Apr 02, 2003 at 09:55:10PM +1030, Wayne Piekarski wrote:
> > Thanks for the reminder.  Unfortunately your email was
> > rambling so that it was unclear what can actually be done to
> > avoid the problem.  Here in the US Daylight savings time
> > will take effect this coming Sunday.
> 
> Sorry about the rambling :) I wanted to dump out everything I'd learned
> because it was quite confusing initially and there was nothing available on
> how to fix it.

It can't be fixed.  At best it might be avoidable at a cost.
Otherwise all that can be done is to ameliorate the
problems.

> The hack I made corrects the problem as such, but I'm not sure if it has
> other effects that will break other applications. By disabling the daylight
> savings tick box Windows will not automatically adjust its time, but who
> knows what else? I wouldn't want to say that this is the correct solution
> just yet. I've emailed the Cygwin mailing list with the same thing in case
> someone spots something there and maybe a possible fix to make the DLLs deal
> with this.
> 
> > With the transition between standard time and daylight
> > savings time MS-Windows systems are known to appear to
> > change the modification time of existing files.  The effect
> > of this is to give the appearance that every file has
> > changed.  This will cause affected transfers to take
> > inordinate amounts of time.  This affects FAT filesystems
> > which store times in localtime and not NTFS which uses UTC.
> 
> I have not looked into NTFS but I do know it has weird issues with UTC as
> well, probably not to the extent of FAT but I don't think it is as clean as
> Unix-ish implementations.

If things behaved logically... Oh, yeah, this is Windows.
	NTFS stores dates in UTC.
	FAT stores dates in localtime.
In converting UTC->local and local->UTC, Windows ignores the
date being converted.  DST is handled by faking the TZ offset.

stat() returns dates in UTC, so it should use a Windows internal
function which returns UTC times.  That would mean that FAT
filesystems have the wrong offset in UTC->local and NTFS files
get no conversion.  If stat() instead uses a Windows function that
returns localtime, then FAT should be OK because it would use
libc_local2utc(localtime), but NTFS would be broken because
it would use libc_local2utc(ms_utc2local(utc)).
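
To put numbers on the FAT case, here is a rough sketch in plain
shell arithmetic.  The zone (UTC-5 standard, UTC-4 daylight) and
the timestamp are assumptions picked for illustration, and the
script only models the "ignore the date, use the current offset"
behaviour described above; it does not call any Windows or
Cygwin function.

```shell
#!/bin/sh
# Hypothetical zone: standard time UTC-5, daylight time UTC-4 (seconds).
std_offset=-18000
dst_offset=-14400

# True modification time of a file written in winter:
# 2003-01-01 12:00:00 UTC as seconds since the epoch.
true_utc=1041422400

# FAT keeps the local wall-clock time, so the stored value carries the
# standard-time offset that was in force when the file was written.
stored_local=$((true_utc + std_offset))

# Read back in summer: the conversion applies the current (DST) offset
# and ignores the date of the timestamp itself.
reported_utc=$((stored_local - dst_offset))

# Difference between what is reported and what was really written.
skew=$((reported_utc - true_utc))
echo "apparent mtime shift: $skew seconds"
```

Under those assumptions every file written before the switch
appears to have moved by exactly one hour, which is why rsync
sees all of them as changed.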


> > The impact of this may be minimized by running rsync with
> > the --modify-window=3601 command-line option.  This will
> > cause rsync to ignore modification time differences of one
> > hour and will allow rsync jobs to complete in the usual time
> > period with a minimal impact on backup integrity.  To get
> > back to normal it will be necessary to run rsync with the
> > usual modify-window on all files.  This can be done in
> > stages.
> 
> If you run with this flag, does it just skip the updates on those files, or
> does it update the time stamp mtime as well? I'm thinking it just skips them
> and you will have to run with --modify-window until DST ends a few months
> later.

Yes, until you run rsync with the normal --modify-window on them.
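
For anyone unsure what the window does, here is a minimal sketch
of the comparison as the rsync man page describes it: two
timestamps count as equal if they differ by no more than the
window.  same_mtime is a made-up helper for illustration, not
rsync's own code, and the timestamps are arbitrary.

```shell
#!/bin/sh
# same_mtime A B WINDOW: succeed when two mtimes (epoch seconds)
# differ by no more than WINDOW seconds.
same_mtime() {
	diff=$(( $1 - $2 ))
	[ "$diff" -lt 0 ] && diff=$(( -diff ))
	[ "$diff" -le "$3" ]
}

# A source file and a backup copy whose mtime appears one hour off.
src=1041422400
dst=$((src - 3600))

for window in 1 3601
do
	if same_mtime "$src" "$dst" "$window"; then
		echo "window=$window: treated as unchanged, file skipped"
	else
		echo "window=$window: treated as changed, file transferred"
	fi
done
```

With a window of 1 the one-hour shift makes the file look
changed; with 3601 it is skipped, and nothing on the destination
is touched until a later run with the normal window.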

In the fall someone got badly bitten by this.  On Tuesday or
Wednesday he noticed his system running very slowly and
discovered that the Sunday backup was still running, along
with all the scheduled ones since then.  Getting the files
synced required more than 24 hours of rsync time.  For most
sites it will be a case of rsync taking a couple of hours
instead of less than five minutes, but for some sites it
could take days.  Having new jobs kick in just slows the
whole thing down because they will catch up with the first
one and then run in lockstep with it, all of them processing
each file in parallel.

Such sites can either make sure that only one job
is running and just wait for completion, or they can use
--modify-window=3601 on the main job and, when resources are
available, bring it up to date in pieces.
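
One way to make sure only one job runs at a time is a lock
directory.  This is just a sketch: the lock path and the function
names are invented for the example, and the backup commands
themselves are left out.

```shell
#!/bin/sh
# Hypothetical lock location; pick a path that suits the site.
lockdir="${TMPDIR:-/tmp}/rsync-backup.lock"

# mkdir fails if the directory already exists, so only one job
# can hold the lock at a time.
acquire_lock() {
	mkdir "$1" 2>/dev/null
}

release_lock() {
	rmdir "$1"
}

if acquire_lock "$lockdir"; then
	# The real backup commands would go here.
	echo "lock acquired, running backup"
	release_lock "$lockdir"
else
	echo "previous backup still running, skipping this run" >&2
fi
```

A job that dies without releasing the lock will block later runs
until the directory is removed by hand, so a trap handler or a
staleness check may be worth adding.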

To illustrate, assume you have a big system with a bunch of
drives (d,e,f,g,h,i,j,k,l).  Your normal script backs up each
drive in sequence.

	rsyncopts="-rt --modify-window=1"
	for drive in d e f g h i j k l
	do
		# $rsyncopts is left unquoted so it splits into separate options
		rsync $rsyncopts "/$drive" "bserve::$drive"
	done

I know, it's a dumb script but it encapsulates the core logic.
If your script were that simple you could simply change it.

When this hits, each drive takes 6-7 hours to sync.  To
avoid the problem you modify the script:

	rsyncopts="-rt --modify-window=3601"
	
To get out of avoidance mode, each night you run one of these
drives with --modify-window=1.  So on Monday you schedule
	rsync -rt --modify-window=1 /d bserve::d

on Tuesday
	rsync -rt --modify-window=1 /e bserve::e

f is oversized so on Wednesday
	rsync -rt --modify-window=1 /f/bigarea1 bserve::f/bigarea1
	rsync -rt --modify-window=1 /f/bigarea2 bserve::f/bigarea2

and Thursday you get the rest of f
	rsync -rt --modify-window=1 /f bserve::f

and so on.  Once all of the drives have been done you
switch the script back to use --modify-window=1.
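
If you would rather keep the bookkeeping in the script, a
variation on the above is to record which drives have already
been brought up to date and choose the window per drive.  This
is only a sketch: the caught_up list and the window_for helper
are invented for the example, and it prints the rsync commands
instead of running them so the plan can be checked first.

```shell
#!/bin/sh
# Drives already resynced with the normal window (hypothetical state;
# extend this list each night as another drive is brought up to date).
caught_up="d e"

# window_for DRIVE: print the --modify-window value to use for DRIVE.
window_for() {
	for done_drive in $caught_up
	do
		if [ "$done_drive" = "$1" ]; then
			echo 1
			return
		fi
	done
	echo 3601
}

# Dry run: show the command that would be used for each drive.
for drive in d e f g h i j k l
do
	echo "rsync -rt --modify-window=$(window_for "$drive") /$drive bserve::$drive"
done
```

Drives in the list get the normal window of 1 and the rest stay
on 3601.  Once every drive is listed the helper is no longer
needed and the script can go back to a fixed --modify-window=1.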


-- 
________________________________________________________________
	J.W. Schultz            Pegasystems Technologies
	email address:		jw at pegasys.ws

		Remember Cernan and Schmitt

