What is it doing?

Perry Smith pedzsan at gmail.com
Tue Jan 14 08:09:28 MST 2014


Thank you to all for helping.

Just to explain and justify myself: the reason I thought rsync might help
is that tar is too simple.  If I can't even get through all of the stat calls
in a day, then I won't be able to get all of the files transferred in a day
either.  I thought rsync would help by "picking up where it left off".  tar
won't do this for sure.

From rsync's perspective, "picking up where it left off" is only for the
copy and not for the stat calls.  I thought perhaps rsync would save some
state somewhere to tell it how far it had gotten on the previous connection.

If you think of the list produced by a glob pattern as a circular list, then
a single call to rsync will try to traverse the circle a single time if it is
able to.  If it is not able to, then it could note how far in the list it got and
pick up at that point in the circle on the next call.  The state could be
saved in a local file that is explicitly given on the command line.
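
To make the circular-list idea concrete, here is a minimal Python sketch.
It is not something rsync does today.  The JSON state file, the function
names, and the copy_one callback are all hypothetical, made up only to
illustrate saving a position and resuming from it on the next call.

```python
import json
from pathlib import Path


def rotate(entries, start):
    """Return entries reordered so traversal begins at index start and wraps."""
    if not entries:
        return []
    start %= len(entries)
    return entries[start:] + entries[:start]


def load_start(state_file):
    """Read the saved starting index; default to 0 when no state exists yet."""
    path = Path(state_file)
    if not path.exists():
        return 0
    return json.loads(path.read_text())["next_index"]


def save_start(state_file, next_index):
    """Record where the next call should begin."""
    Path(state_file).write_text(json.dumps({"next_index": next_index}))


def run_once(entries, state_file, copy_one):
    """Try to walk the circle once, saving progress after each finished entry.

    copy_one(entry) is a hypothetical callback that returns True when the
    entry was fully handled and False when the run has to stop early
    (for example, at the end of the work day).
    """
    start = load_start(state_file)
    done = []
    for offset, entry in enumerate(rotate(entries, start)):
        if not copy_one(entry):
            break
        done.append(entry)
        save_start(state_file, (start + offset + 1) % len(entries))
    return done
```

With four entries and a first call that stops after two of them, the
second call would begin at the third entry and wrap around to the first two.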

This is non-trivial because the glob pattern is matched on the server so
even the count of the matches is not known by the client.  But the client
and server could chit-chat with each other with the server telling the
client how far it had gotten and the client saving it.  On reconnect, the
client would inform the server "Oh, by the way, you already did these
starting points".  Glob results, by convention, are alphabetized, so the old
list could be matched up with the new list and the new starting point
deduced.
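
As a rough sketch of the matching step, assuming the only state the client
saved is the name of the last entry that completed, the new starting point
can be found with a binary search over the new sorted list.  The function
names are hypothetical, and Python's default string ordering stands in for
whatever ordering the server's glob expansion really uses.

```python
from bisect import bisect_right


def next_start(new_entries, last_done):
    """Return the index in the sorted list new_entries to resume from.

    last_done is the name of the last entry that completed on the previous
    call, or None if nothing has completed yet.
    """
    if last_done is None or not new_entries:
        return 0
    # First position strictly after last_done in sort order.  This still
    # works if last_done itself has since disappeared from the list.
    index = bisect_right(new_entries, last_done)
    return index % len(new_entries)  # wrap around the circle


def resume_order(new_entries, last_done):
    """Return the new list reordered to start just after last_done."""
    entries = sorted(new_entries)
    start = next_start(entries, last_done)
    return entries[start:] + entries[:start]
```

Saving a name rather than a numeric index means that entries added or
removed between calls do not shift the resume point.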

It's almost like the glob results denote a metafile that is copied, one
line at a time, to the client, as the traversal of each of the trees
completes.

Perry

On Jan 14, 2014, at 5:25 AM, Christian Huldt <christian at solvare.se> wrote:

> The original question "What is it doing?" is usually
> best answered by rsync itself with -vvv
> 
> Perry Smith skrev 2014-01-14 00:19:
> > Yea.  Shouldn't be hard to split up.  The hard part is some type of
> > dependable rotation.
> >
> > You mention "pause"... I have to disconnect so I assume that would
> > "abort" the transfer.  But that triggered another question: would
> > daemon mode help in this situation? (I assume not.  The daemon
> > probably forks and the child does all the work and dies when the
> > connection is lost.)
> >
> > Perry
> >
> > On Jan 13, 2014, at 4:55 PM, Kevin Korb <kmk at sanitarium.net>
> > wrote:
> >
> >> If you have to abort it then I suppose that makes
> >> sense.  Otherwise you could throttle or pause it.
> >>
> >> If you do have to split it up then it shouldn't be difficult.
> >> Your original command was specifying multiple sources using a
> >> glob of some kind so you would just need to alter that.
> >>
> >> On 01/13/2014 05:51 PM, Perry Smith wrote:
> >>> The NFS server is off somewhere else, locked down, secure,
> >>> blah blah.
> >>>
> >>> Doing it via a script that rotates is the same number of stat
> >>> calls but it would start at a different place each day.
> >>>
> >>> If I start it day 1 and it gets 25% through the stat calls, on
> >>> day 2, will rsync start where it left off or start back at the
> >>> beginning?  I figure since it does not save context, I would
> >>> start back at the beginning.
> >>>
> >>> So if I rotate, it would start at a different point.
> >>>
> >>> On Jan 13, 2014, at 4:44 PM, Kevin Korb <kmk at sanitarium.net>
> >>> wrote:
> >>>
> >>>> It is still the same number of stat calls.
> >>>> Doesn't really matter if you split them up.
> >>>>
> >>>> Can you rsync to the NFS server directly?
> >>>>
> >>>> On 01/13/2014 05:34 PM, Perry Smith wrote:
> >>>>> Ok.  I can get the Mac up to version 3 but I'm wondering if
> >>>>> I need to rethink my whole strategy.  Since the source is
> >>>>> on NFS, doing a stat on all the files each run may cost me
> >>>>> too much time.
> >>>>>
> >>>>> I might need to split it into smaller pieces and then
> >>>>> rotate through the pieces via a script.  Do you have any
> >>>>> suggestions for this type of situation?
> >>>>>
> >>>>> Perry
> >>>>>
> >>>>> On Jan 13, 2014, at 4:08 PM, Kevin Korb
> >>>>> <kmk at sanitarium.net> wrote:
> >>>>>
> >>>>>> On 01/13/2014 05:05 PM, Perry Smith wrote:
> >>>>>>> A friend and I noticed the --times or --archive flag.
> >>>>>>> I have not stopped it yet but I'll add that flag
> >>>>>>> (probably --times).
> >>>>>>>
> >>>>>>> This is the first time so it must be #2.
> >>>>>>>
> >>>>>>> The side issuing the command is a Mac using rsync
> >>>>>>> version 2.6.9 protocol version 29.  The other side is
> >>>>>>> AIX using rsync version 3.1.0 protocol version 31 (that
> >>>>>>> I built myself).
> >>>>>>
> >>>>>> Yes, if either end is version 2 then rsync will have to
> >>>>>> index the entire tree on both systems before it starts
> >>>>>> copying anything.
> >>>>>>
> >>>>>>> I don't mind recompiling rsync on the Mac side if you
> >>>>>>> think that would improve things.
> >>>>>>
> >>>>>> I have no Mac experience but that is the way it is
> >>>>>> everywhere else.
> >>>>>>
> >>>>>>> I was trying to find some type of scratch file or
> >>>>>>> something but could not.  I'm curious, where is the
> >>>>>>> index kept?
> >>>>>>
> >>>>>> There is no index kept.  Rsync has no memory between
> >>>>>> runs which is why copying the timestamps is important.
> >>>>>>
> >>>>>> When I say indexing files I really mean it is going
> >>>>>> through the tree and doing a stat() on everything so it
> >>>>>> will have a list of existing files and timestamps to
> >>>>>> compare with the other end. Rsync v3 does this too but it
> >>>>>> does it incrementally while it is also copying stuff.
> >>>>>>
> >>>>>>> Thank you for your help
> >>>>>>> Perry
> >>>>>>>
> >>>>>>> On Jan 13, 2014, at 2:49 PM, Kevin Korb
> >>>>>>> <kmk at sanitarium.net> wrote:
> >>>>>>>
> >>>>>>>> First, don't run rsync without either --times or
> >>>>>>>> --archive.  Without that rsync won't copy timestamps
> >>>>>>>> and it won't be able to tell what is changed when
> >>>>>>>> you run it again.
> >>>>>>>>
> >>>>>>>> Second, if rsync isn't copying anything then there
> >>>>>>>> are 2 possible reasons:
> >>>>>>>> 1. You already have most of the files copied and it
> >>>>>>>>    is going through them looking for a file that
> >>>>>>>>    needs updating.
> >>>>>>>> 2. You are using rsync version 2, where all files
> >>>>>>>>    had to be indexed before it copied anything.
> >>>>>>>>
> >>>>>>>> On 01/13/2014 03:06 PM, Perry Smith wrote:
> >>>>>>>>> This is my first time to really use rsync.  I did
> >>>>>>>>> small tests to get the arguments like I wanted and
> >>>>>>>>> then kicked off the big rsync about 2 and a half
> >>>>>>>>> hours ago. So far, it has not copied over any
> >>>>>>>>> files.
> >>>>>>>>>
> >>>>>>>>> The command I used is:
> >>>>>>>>>
> >>>>>>>>> rsync \
> >>>>>>>>>     --relative \
> >>>>>>>>>     --recursive \
> >>>>>>>>>     --copy-links \
> >>>>>>>>>     host:/glob/that/matches/about/eighty/./directories \
> >>>>>>>>>     /local/target/dir
> >>>>>>>>>
> >>>>>>>>> The directories in the list are all full of symbolic
> >>>>>>>>> links that point off to NFS mounted file systems.
> >>>>>>>>> I don't expect it to complete today but I do have
> >>>>>>>>> to stop it each day at the end of the work day. But
> >>>>>>>>> it worries me that it has yet to copy over any
> >>>>>>>>> files.
> >>>>>>>>>
> >>>>>>>>> Is it really making progress?  Or will it take
> >>>>>>>>> this long to really start copying files over each
> >>>>>>>>> day I start it?
> >>>>>>>>>
> >>>>>>>>> I expect the total amount copied to be about 400G
> >>>>>>>>> and about 4 million files.
> >>>>>>>>>
> >>>>>>>>> It is possible to break this up into pieces if
> >>>>>>>>> that would help.
> >>>>>>>>>
> >>>>>>>>> Thank you for your help and advice,
> >>>>>>>>> Perry
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> ~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~
> >>>>>>>> Kevin Korb                  Phone:    (407) 252-6853
> >>>>>>>> Systems Administrator       Internet:
> >>>>>>>> FutureQuest, Inc.           Kevin at FutureQuest.net  (work)
> >>>>>>>> Orlando, Florida            kmk at sanitarium.net (personal)
> >>>>>>>> Web page:                   http://www.sanitarium.net/
> >>>>>>>> PGP public key available on web site.
> >>>>>>>> ~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~'`^`'~*-,._.,-*~
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> Please use reply-all for most replies to avoid
> >>>>>>>> omitting the mailing list. To unsubscribe or change
> >>>>>>>> options:
> >>>>>>>> https://lists.samba.org/mailman/listinfo/rsync
> >>>>>>>> Before posting, read:
> >>>>>>>> http://www.catb.org/~esr/faqs/smart-questions.html
