Feature Request - Recursive Rsync Parameter - Example Script

Matt Olson molson at kavi.com
Wed Oct 22 11:25:51 EST 2003


I wanted to flag a problem and offer a possible solution.

The problem:

Large rsync operations fail on machines with modest amounts of memory.

Proposal:

Add a parameter to rsync's recursive mode that specifies a recursion 
level (see the example bash wrapper below).  It applies only to recursive 
file system rsyncs, i.e. -a or -r.  The logic goes:

if recursion switch is true and recursion_level > 0

  - rsync this directory only
  - call rsync for each subdirectory with a decremented recursion_level,
    passing the same switches along

else (recursion_level is 0)

  - perform the full rsync (from this level)
  
This breaks the job up into smaller pieces.  Otherwise rsync can consume 
hundreds of megabytes of memory attempting to perform a single operation.  
In this scenario you'll see one rsync process for each level of recursion.
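
The logic above can be sketched as a small bash function that only prints the rsync commands it would issue. This is a minimal sketch under stated assumptions, not the posted script: plan_sync is a hypothetical name, it looks at local directories only, and the -a and -lptgoD option strings stand in for whatever switches the caller would really pass. Like a plain shell glob, it does not see hidden directories.

```shell
#!/bin/bash
# plan_sync <level> <src_dir> <dest_dir>
# Hypothetical helper (not part of the posted script): prints the rsync
# commands a level-limited sync of a LOCAL tree would run, without running them.
plan_sync() {
  local level="$1" src="${2%/}" dest="${3%/}"
  local sub have_subdirs=0

  # Check whether this directory has any subdirectories left to split on.
  for sub in "$src"/*/; do
    if [ -d "$sub" ]; then
      have_subdirs=1
      break
    fi
  done

  if [ "$level" -le 0 ] || [ "$have_subdirs" -eq 0 ]; then
    # Recursion level used up, or nothing to split: one full recursive rsync.
    echo "rsync -a $src/ $dest/"
    return 0
  fi

  # Copy only this directory's own files (-a minus -r, so no descent).
  echo "rsync -lptgoD $src/* $dest/."

  # Hand each subdirectory to a separate call with a decremented level.
  for sub in "$src"/*/; do
    [ -d "$sub" ] || continue
    sub="${sub%/}"
    plan_sync "$((level - 1))" "$sub" "$dest/${sub##*/}"
  done
}
```

Called as plan_sync 1 /data /backup on a tree whose top level contains subdirectories, this prints one non-recursive rsync for /data itself, followed by one recursive rsync per immediate subdirectory.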

Here's an example bash script that attempts this idea (it supports the 
-n option so you can see the calls it makes):

My bash scripting skills need some work, but you get the idea.  If 
someone wants to develop this script further, feel free.

Cheers.


#!/bin/bash
# copyright 2003 Matt Olson, Kavi Corporation (molson at kavi.com)
# Licence: General Public License
# Install this script as "lrsync" on your PATH: it calls itself by that name.

# set our environment

IFS=$'\n' # This keeps bash from breaking up file names with space in them.

if [ ! -n "$1" ]
then
  echo "Usage: `basename $0` <recursion_level> \"quoted_rsync_parameters\" <source path> <destination_path>"
  echo "Note: argument parsing is order dependent."
  exit 1
fi  

# Some debugging help: a fifth parameter of "args" echoes the arguments.
if [ "$5" = "args" ]

then

  index=1
  for arg in "$@"
  do
    echo "Arg #$index = $arg"
    index=$((index + 1))
  done

fi

# Assign parameters to some variables.

r_level=$1
rsync_options=$2
source_path_parm=$3
dest_path_parm=$4

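# Build the non-recursive variant: drop r and expand a (which implies r) to
# lptgoD.  The first r and the first a anywhere in the string are rewritten,
# so pass a single short-option cluster such as -av.
# printf is used because echo would swallow a bare -n.
rsync_no_r_options=`printf '%s\n' "$rsync_options" | sed -e "s/r//" | sed -e "s/a/lptgoD/"`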

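# Let's support the rsync test mode (-n or --dry-run).
# Match n only inside a short-option cluster, so that long options
# containing an n (e.g. --links) do not trigger test mode.

test_run=`printf '%s\n' "$rsync_options" | grep -E -e '(^| )-[^- ]*n' -e '--dry-run'`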

# We need to decide if the source is a remote host
# Parse out the <source path> and if it is remote, capture the hostname

if [ `echo $source_path_parm | grep ":"` ]

then

  remote_source_host=${source_path_parm%:*}
  remote_source_path=${source_path_parm#*:}

fi

# We also need to decide if the destination is a remote host.
# Parse out the <destination_path> and if it is remote, capture the hostname

if [ `echo $dest_path_parm | grep ":"` ]

then

  remote_dest_host=${dest_path_parm%:*}
  remote_dest_path=${dest_path_parm#*:}

fi

# At this point we need to see if there are additional directories to 
# call with lrsync, as long as our recursion level is > 0.
                                                                                                                                                                                                      
# To build a list of targets, we need to determine if the host is remote.
if [ $remote_source_host ]
                                                                                                                                                                                                      
then
                                                                                                                                                                                                      
  # If host is remote, get a file list via rsh
  directory_object=`rsh $remote_source_host ls -1p $remote_source_path | grep /`
                                                                                                                                                                                                      
else
                                                                                                                                                                                                      
  # If host is local, get a file list
  directory_object=`ls -1p $source_path_parm | grep /`
                                                                                                                                                                                                      
fi

if [ $test_run ]
then

  echo "lrsync:  directory_object: $directory_object"

fi
                                                                                                                                                                                                      
# With these results walk through list returned and call rsync/lrsync

# Testing the recursion level.
# At this point if we are at recursion level 0 then do some rsyncs.
# If not at r_level 0 then call lrsync with a decremented r_level.
# If no additional directory objects to recurse, do the rsync.

if [ $1 = 0 ] || [ -z "$directory_object" ] 
then

  # Do rsync(s)

  #  If this is a test run, echo some extra info.
  if [ $test_run ]
  then

    echo "lrsync:  rsync $rsync_options $source_path_parm $dest_path_parm"
    rsync $rsync_options $source_path_parm $dest_path_parm

  else

    rsync $rsync_options $source_path_parm $dest_path_parm

  fi

else

  # Do lrsync(s)

  # Set some variables

  next_r_level=$(($r_level - 1))

  # Next we have to rsync the top level files in the directory we are going to recurse.

  if [ $test_run ]
  then

    echo "lrsync: rsync $rsync_no_r_options $source_path_parm/* $dest_path_parm/."
    rsync $rsync_no_r_options $source_path_parm/* $dest_path_parm/.

  else

    rsync $rsync_no_r_options $source_path_parm/* $dest_path_parm/.

  fi

  # Walk through the directories at this level.

  for file_or_dir in $directory_object
  do
  
    if [ $remote_dest_host ]
    then

      if [ $test_run ]
      then

        echo "lrsync: rsh $remote_dest_host mkdir -p $remote_dest_path/$file_or_dir"

      else

        # If host is remote, make the directory on the remote host via rsh.
        # Use the path part only; $dest_path_parm still carries the host: prefix.
        rsh $remote_dest_host mkdir -p $remote_dest_path/$file_or_dir

      fi

    else

      if [ $test_run ]
      then

        echo "lrsync: mkdir -p $dest_path_parm/$file_or_dir"

      else

        # If host is local, make the directory (-p: no error if it already exists)
        mkdir -p $dest_path_parm/$file_or_dir

      fi

    fi

    #  If this is a test run, echo some extra info.

    if [ $test_run ]
    then

      echo "lrsync: lrsync $next_r_level $rsync_options '$source_path_parm/$file_or_dir' '$dest_path_parm/$file_or_dir'"
      lrsync $next_r_level $rsync_options $source_path_parm/$file_or_dir $dest_path_parm/$file_or_dir $5

    else

      lrsync $next_r_level $rsync_options $source_path_parm/$file_or_dir $dest_path_parm/$file_or_dir $5

    fi

  done

  unset IFS # Just doing the right thing.

fi
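
The script above takes the host with ${...%:*} (cut at the last colon) and the path with ${...#*:} (cut at the first colon), so the two would disagree on an argument containing more than one colon. A possible tightening is sketched below. The helper names are hypothetical, and the rule that an argument is remote only when a colon comes before the first slash is an assumption of this sketch. rsync:// URLs and host::module daemon syntax are deliberately left out.

```shell
#!/bin/bash
# Hypothetical helpers (not part of the posted script) for host:path arguments.
# Assumption: an argument is remote only when a colon appears before the first
# slash, as in host:/dir or user@host:dir.

is_remote() {
  local head="${1%%:*}"   # text before the first colon (whole string if none)
  [ "$head" != "$1" ] || return 1   # no colon at all
  [ -n "$head" ] || return 1        # nothing before the colon
  [ "${head#*/}" = "$head" ]        # remote only if no slash precedes the colon
}

# Both helpers split at the FIRST colon so host and path always agree.
remote_host() { printf '%s\n' "${1%%:*}"; }
remote_path() { printf '%s\n' "${1#*:}"; }
```

With these, is_remote "host:/a/b" succeeds while is_remote "/a/b:c" fails, and a local file whose name contains a colon can still be passed as ./name:with:colon.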

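
The option rewriting is another spot that could be made more robust. The sed calls in the script act on the first r and the first a anywhere in the option string, so an input such as "-v --progress" would come out as "-v --pogress". A minimal sketch of a word-by-word alternative follows. The function name strip_recursion is hypothetical, and it assumes that short options taking a value (such as -e) receive that value as a separate word.

```shell
#!/bin/bash
# strip_recursion <rsync option words...>
# Hypothetical helper (not part of the posted script): prints the options one
# per line with recursion removed.
strip_recursion() {
  local word
  for word in "$@"; do
    case $word in
      --recursive) continue ;;      # drop the long form entirely
      --archive) word=-lptgoD ;;    # long form of -a, minus the implied -r
      --*) ;;                       # other long options pass through untouched
      -?*)
        # Short-option cluster: expand a to lptgoD, then delete every r.
        word=$(printf '%s\n' "$word" | sed -e 's/a/lptgoD/g' -e 's/r//g')
        [ "$word" = "-" ] && continue   # the cluster was only -r
        ;;
    esac
    printf '%s\n' "$word"
  done
}
```

For example, strip_recursion -avz --progress prints -lptgoDvz and --progress on separate lines, leaving the long option intact.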






-- 
Matt Olson
Platform Engineer
Kavi Corporation
                                                                                                                                                                                                      
Phone 503.813.9383
e-mail molson at kavi.com




