[PATCH] Reduce memory usage

Jindřich Makovička makovick at gmail.com
Tue Sep 28 08:20:38 UTC 2021


On Mon, 27 Sep 2021 22:03:01 +0200
Johannes Altmanninger <aclopte at gmail.com> wrote:

> > On Sun, Sep 26, 2021 at 01:54:13PM +0200, Jindřich Makovička via
> > rsync wrote:
> > 
> > Applying the attached patch, which reduces the default allocation
> > to 32 pointers, and preallocates 32K pointers only for the main
> > file lists in send_file_list and recv_file_list, reduces the peak
> > memory usage in my case from 142MB to 12MB.  
> 
> The patch looks very reasonable from what I can tell.
> Out of curiosity, how did you measure peak memory?
> I used "/bin/time -v rsync", and already get (very small)
> improvements in maximum RSS when copying the rsync source tree :)

Thanks for feedback. I profiled the memory usage using valgrind, simply
by running

valgrind --tool=massif ./rsync -anx /usr /tmp/

This issue affects only large trees with lots of small subdirectories,
you will not see much improvement with just a couple of directories the
rsync tree has. The outputs below are generated from the sender process
output by ms_print .

Before the fix:

--------------------------------------------------------------------------------
Command:            ./rsync -anx /usr /tmp/
Massif arguments:   (none)
ms_print arguments: massif.out.199013
--------------------------------------------------------------------------------


    MB
127.6^                                         #                              
     |                                         #                              
     |                                         #                              
     |                                         #                              
     |                           @             #                              
     |                           @             #                              
     |                           @             #                              
     |                           @             #                              
     |                           @             #                         :    
     |                           @             #                         :    
     |                           @             #                        ::    
     |     :           :::       @     @       #                        ::    
     |     :           ::        @     @       #                        ::    
     |     :     :     ::        @     @  :    #           :            :: :  
     |     : :   :     :: : :    @    :@:::    #         : :            :: :  
     |     :::  :: ::  :: : :    @    :@: :    #        :: :            ::::  
     |  :  ::: ::::::  :: : :    @    :@: :    #  :    ::: :            ::::  
     |  :  ::::::::::  :: :::    @    :@: :    #  ::@  ::: :        :   ::::::
     |  : ::::::::::::::: :::  : @::  :@: :    #  ::@  :::@:        :   ::::@:
     |  : ::::::::::::::: :::::: @:::::@: @::::#::::@:::::@:::::@:::::@:::::@:
   0 +----------------------------------------------------------------------->Gi
     0                                                                   4.574

--------------------------------------------------------------------------------
  n        time(i)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
 47  2,853,846,986      133,806,904      133,270,491       536,413            0
99.60% (133,270,491B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
->99.12% (132,629,814B) 0x1329CC: my_alloc (in /home/henry/build/rsync/rsync)
| ->95.80% (128,188,416B) 0x11383E: flist_expand (in /home/henry/build/rsync/rsync)
| | ->95.80% (128,188,416B) 0x117AAB: send_file_name (in /home/henry/build/rsync/rsync)
| | | ->95.80% (128,188,416B) 0x1184B4: send_directory (in /home/henry/build/rsync/rsync)
| | | | ->95.80% (128,188,416B) 0x118CD1: send1extra (in /home/henry/build/rsync/rsync)
| | | |   ->95.80% (128,188,416B) 0x1190F4: send_extra_file_list (in /home/henry/build/rsync/rsync)
| | | |     ->95.80% (128,188,416B) 0x12A17A: send_files (in /home/henry/build/rsync/rsync)
| | | |     | ->95.80% (128,188,416B) 0x1362C5: client_run (in /home/henry/build/rsync/rsync)
| | | |     |   ->95.80% (128,188,416B) 0x136F2D: start_client (in /home/henry/build/rsync/rsync)
| | | |     |     ->95.80% (128,188,416B) 0x137604: main (in /home/henry/build/rsync/rsync)
| | | |     |       
| | | |     ->00.00% (0B) in 1+ places, all below ms_print's threshold (01.00%)
| | | |     
| | | ->00.00% (0B) in 1+ places, all below ms_print's threshold (01.00%)
| | | 
| | ->00.00% (0B) in 1+ places, all below ms_print's threshold (01.00%)
| | 
| ->02.16% (2,883,584B) 0x16E977: pool_alloc (in /home/henry/build/rsync/rsync)
| | ->02.16% (2,883,584B) 0x116E12: make_file (in /home/henry/build/rsync/rsync)
| |   ->02.16% (2,883,584B) 0x11747F: send_file_name (in /home/henry/build/rsync/rsync)
| |     ->02.06% (2,752,512B) 0x1184B4: send_directory (in /home/henry/build/rsync/rsync)
| |     | ->02.06% (2,752,512B) 0x118CD1: send1extra (in /home/henry/build/rsync/rsync)
| |     |   ->02.06% (2,752,512B) 0x1190F4: send_extra_file_list (in /home/henry/build/rsync/rsync)
| |     |     ->02.06% (2,752,512B) 0x12A17A: send_files (in /home/henry/build/rsync/rsync)
| |     |     | ->02.06% (2,752,512B) 0x1362C5: client_run (in /home/henry/build/rsync/rsync)
| |     |     |   ->02.06% (2,752,512B) 0x136F2D: start_client (in /home/henry/build/rsync/rsync)
| |     |     |     ->02.06% (2,752,512B) 0x137604: main (in /home/henry/build/rsync/rsync)
| |     |     |       
| |     |     ->00.00% (0B) in 1+ places, all below ms_print's threshold (01.00%)
| |     |     
| |     ->00.10% (131,072B) in 1+ places, all below ms_print's threshold (01.00%)
| |     
| ->01.07% (1,425,928B) 0x116C90: make_file (in /home/henry/build/rsync/rsync)
| | ->01.07% (1,425,928B) 0x11747F: send_file_name (in /home/henry/build/rsync/rsync)
| |   ->01.07% (1,425,928B) 0x1184B4: send_directory (in /home/henry/build/rsync/rsync)
| |     ->01.07% (1,425,928B) 0x118CD1: send1extra (in /home/henry/build/rsync/rsync)
| |       ->01.07% (1,425,928B) 0x1190F4: send_extra_file_list (in /home/henry/build/rsync/rsync)
| |         ->01.06% (1,422,960B) 0x12A17A: send_files (in /home/henry/build/rsync/rsync)
| |         | ->01.06% (1,422,960B) 0x1362C5: client_run (in /home/henry/build/rsync/rsync)
| |         |   ->01.06% (1,422,960B) 0x136F2D: start_client (in /home/henry/build/rsync/rsync)
| |         |     ->01.06% (1,422,960B) 0x137604: main (in /home/henry/build/rsync/rsync)
| |         |       
| |         ->00.00% (2,968B) in 1+ places, all below ms_print's threshold (01.00%)
| |         
| ->00.10% (131,886B) in 1+ places, all below ms_print's threshold (01.00%)
| 
->00.48% (640,677B) in 1+ places, all below ms_print's threshold
(01.00%)

After the fix, the flist_expand allocs drop from 95% to 11%.

--------------------------------------------------------------------------------
Command:            ./rsync -anx /usr /tmp/
Massif arguments:   (none)
ms_print arguments: massif.out.198082
--------------------------------------------------------------------------------


    MB
7.345^                                               #                        
     |                                               #                        
     |                                               #                     :::
     |                                               #:      ::          ::: :
     |                                               #:      :       @@::: : :
     |                                            @::#:  ::::: ::::::@ ::: : :
     |                                          ::@: #:::: ::: : :: :@ ::: : :
     |                                    :     : @: #:: : ::: : :: :@ ::: : :
     |                                  :::::@::: @: #:: : ::: : :: :@ ::: : :
     |                                :@: :: @: : @: #:: : ::: : :: :@ ::: : :
     |                         ::  :: :@: :: @: : @: #:: : ::: : :: :@ ::: : :
     |                      @@ : @@: ::@: :: @: : @: #:: : ::: : :: :@ ::: : :
     |                      @ :: @ : ::@: :: @: : @: #:: : ::: : :: :@ ::: : :
     |                 @@:::@ :: @ : ::@: :: @: : @: #:: : ::: : :: :@ ::: : :
     |              :::@ :: @ :: @ : ::@: :: @: : @: #:: : ::: : :: :@ ::: : :
     |          @@::: :@ :: @ :: @ : ::@: :: @: : @: #:: : ::: : :: :@ ::: : :
     |        ::@ ::: :@ :: @ :: @ : ::@: :: @: : @: #:: : ::: : :: :@ ::: : :
     |     @@@: @ ::: :@ :: @ :: @ : ::@: :: @: : @: #:: : ::: : :: :@ ::: : :
     |::  :@@@: @ ::: :@ :: @ :: @ : ::@: :: @: : @: #:: : ::: : :: :@ ::: : :
     |: :::@@@: @ ::: :@ :: @ :: @ : ::@: :: @: : @: #:: : ::: : :: :@ ::: : :
   0 +----------------------------------------------------------------------->Gi
     0                                                                   4.582

--------------------------------------------------------------------------------
  n        time(i)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
--------------------------------------------------------------------------------
 33  3,246,882,750        7,702,128        7,138,545       563,583            0
92.68% (7,138,545B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
->80.61% (6,208,772B) 0x1329EB: my_alloc (in /home/henry/build/rsync/rsync)
| ->59.56% (4,587,520B) 0x16E997: pool_alloc (in /home/henry/build/rsync/rsync)
| | ->59.56% (4,587,520B) 0x116E0F: make_file (in /home/henry/build/rsync/rsync)
| |   ->59.56% (4,587,520B) 0x11747C: send_file_name (in /home/henry/build/rsync/rsync)
| |     ->57.86% (4,456,448B) 0x1184B1: send_directory (in /home/henry/build/rsync/rsync)
| |     | ->57.86% (4,456,448B) 0x118CCE: send1extra (in /home/henry/build/rsync/rsync)
| |     |   ->57.86% (4,456,448B) 0x1190F1: send_extra_file_list (in /home/henry/build/rsync/rsync)
| |     |     ->57.86% (4,456,448B) 0x12A199: send_files (in /home/henry/build/rsync/rsync)
| |     |     | ->57.86% (4,456,448B) 0x1362E4: client_run (in /home/henry/build/rsync/rsync)
| |     |     |   ->57.86% (4,456,448B) 0x136F4C: start_client (in /home/henry/build/rsync/rsync)
| |     |     |     ->57.86% (4,456,448B) 0x137623: main (in /home/henry/build/rsync/rsync)
| |     |     |       
| |     |     ->00.00% (0B) in 1+ places, all below ms_print's threshold (01.00%)
| |     |     
| |     ->01.70% (131,072B) 0x11A220: send_file_list (in /home/henry/build/rsync/rsync)
| |       ->01.70% (131,072B) 0x136275: client_run (in /home/henry/build/rsync/rsync)
| |         ->01.70% (131,072B) 0x136F4C: start_client (in /home/henry/build/rsync/rsync)
| |           ->01.70% (131,072B) 0x137623: main (in /home/henry/build/rsync/rsync)
| |             
| ->19.33% (1,489,110B) 0x116C8D: make_file (in /home/henry/build/rsync/rsync)
| | ->19.33% (1,489,110B) 0x11747C: send_file_name (in /home/henry/build/rsync/rsync)
| |   ->19.33% (1,489,110B) 0x1184B1: send_directory (in /home/henry/build/rsync/rsync)
| |     ->19.33% (1,489,110B) 0x118CCE: send1extra (in /home/henry/build/rsync/rsync)
| |       ->19.33% (1,489,110B) 0x1190F1: send_extra_file_list (in /home/henry/build/rsync/rsync)
| |         ->19.30% (1,486,142B) 0x12A199: send_files (in /home/henry/build/rsync/rsync)
| |         | ->19.30% (1,486,142B) 0x1362E4: client_run (in /home/henry/build/rsync/rsync)
| |         |   ->19.30% (1,486,142B) 0x136F4C: start_client (in /home/henry/build/rsync/rsync)
| |         |     ->19.30% (1,486,142B) 0x137623: main (in /home/henry/build/rsync/rsync)
| |         |       
| |         ->00.04% (2,968B) in 1+ places, all below ms_print's threshold (01.00%)
| |         
| ->01.70% (131,072B) 0x145EEC: alloc_xbuf (in /home/henry/build/rsync/rsync)
| | ->01.70% (131,072B) in 3 places, all below massif's threshold (1.00%)
| |   
| ->00.01% (1,070B) in 1+ places, all below ms_print's threshold (01.00%)
| 
->11.11% (856,064B) 0x132A32: my_alloc (in /home/henry/build/rsync/rsync)
| ->11.11% (856,064B) 0x11383C: flist_expand (in /home/henry/build/rsync/rsync)
|   ->06.81% (524,288B) 0x117F7A: add_dirs_to_tree (in /home/henry/build/rsync/rsync)
|   | ->06.81% (524,288B) 0x119370: send_extra_file_list (in /home/henry/build/rsync/rsync)
|   |   ->06.81% (524,288B) 0x12A199: send_files (in /home/henry/build/rsync/rsync)
|   |   | ->06.81% (524,288B) 0x1362E4: client_run (in /home/henry/build/rsync/rsync)
|   |   |   ->06.81% (524,288B) 0x136F4C: start_client (in /home/henry/build/rsync/rsync)
|   |   |     ->06.81% (524,288B) 0x137623: main (in /home/henry/build/rsync/rsync)
|   |   |       
|   |   ->00.00% (0B) in 1+ places, all below ms_print's threshold (01.00%)
|   |   
|   ->04.31% (331,776B) 0x117AA8: send_file_name (in /home/henry/build/rsync/rsync)
|     ->04.31% (331,776B) 0x1184B1: send_directory (in /home/henry/build/rsync/rsync)
|       ->04.31% (331,776B) 0x118CCE: send1extra (in /home/henry/build/rsync/rsync)
|         ->04.31% (331,776B) 0x1190F1: send_extra_file_list (in /home/henry/build/rsync/rsync)
|           ->04.31% (331,776B) 0x12A199: send_files (in /home/henry/build/rsync/rsync)
|           | ->04.31% (331,776B) 0x1362E4: client_run (in /home/henry/build/rsync/rsync)
|           |   ->04.31% (331,776B) 0x136F4C: start_client (in /home/henry/build/rsync/rsync)
|           |     ->04.31% (331,776B) 0x137623: main (in /home/henry/build/rsync/rsync)
|           |       
|           ->00.00% (0B) in 1+ places, all below ms_print's threshold (01.00%)
|           
->00.96% (73,709B) in 1+ places, all below ms_print's threshold (01.00%)

> On Mon, Sep 27, 2021 at 04:42:25PM +0200, Jindřich Makovička via
> rsync wrote:
> 
> > Reduce memory usage
> >  
> > Start only with 32 entries for the partial file lists, instead of
> > 32k. 
> 
> The log message could be a bit more detailed. You already mentioned
> that the big 32k allocation predates the recursive version, that's
> very useful information to add here.
> Maybe even explain why we change the first check from
> "flist->malloced != FLIST_START" to "flist->files".
> (Also I'd use "git send-email" to send patches inline but I'm not
> sure what's the convention here)
> 
> > diff --git a/flist.c b/flist.c
> > index 3442d868..0f7a64e6 100644
> > --- a/flist.c
> > +++ b/flist.c
> > @@ -303,11 +303,11 @@ static void flist_expand(struct file_list
> > *flist, int extra) if (flist->malloced < flist->used + extra)
> >  		flist->malloced = flist->used + extra;
> >  
> >  	new_ptr = realloc_array(flist->files, struct file_struct
> > *, flist->malloced); 
> > -	if (DEBUG_GTE(FLIST, 1) && flist->malloced != FLIST_START)
> > {
> > +	if (DEBUG_GTE(FLIST, 1) && flist->files) {  
> 
> Yep, the new check makes more sense because now it's more obvious
> that the debug message is only printed when flist->files is
> realloc'd, and not when it's allocated for the first time.
> 
> >  		rprintf(FCLIENT, "[%s] expand file_list pointer
> > array to %s bytes, did%s move\n", who_am_i(),
> >  		    big_num(sizeof flist->files[0] *
> > flist->malloced), (new_ptr == flist->files) ? " not" : "");
> >  	}
> > @@ -2184,10 +2184,11 @@ struct file_list *send_file_list(int f, int
> > argc, char *argv[]) if (preserve_hard_links && protocol_version >=
> > 30 && !cur_flist) init_hard_links();
> >  #endif
> >  
> >  	flist = cur_flist = flist_new(0, "send_file_list");
> > +	flist_expand(flist, FLIST_START_LARGE);
> >  	if (inc_recurse) {
> >  		dir_flist = flist_new(FLIST_TEMP,
> > "send_file_list");  
> 
> This probably wants an flist_expand(dir_flist, FLIST_START_LARGE),
> because dir_flist is a global.  I think the idea is that all
> flist_new() inside loops/recursive calls should start small, but
> lists that are only ever allocated once should stay at 32k.
> 
> >  		flags |= FLAG_DIVERT_DIRS;
> >  	} else
> >  		dir_flist = cur_flist;
> > @@ -2539,10 +2540,11 @@ struct file_list *recv_file_list(int f, int
> > dir_ndx) if (preserve_hard_links && !first_flist)
> >  		init_hard_links();
> >  #endif
> >  
> >  	flist = flist_new(0, "recv_file_list");
> > +	flist_expand(flist, FLIST_START_LARGE);
> >  
> >  	if (inc_recurse) {
> >  		if (flist->ndx_start == 1)
> >  			dir_flist = flist_new(FLIST_TEMP,
> > "recv_file_list");  
> 
> Same here I guess. Maybe we should add an "initial size" parameter to
> flist_new(), so it can call flist_expand() automatically?

Thanks for the hints. I will add an explicit flist_expand for the time
being.

> 
> >  		dstart = dir_flist->used;
> > diff --git a/rsync.h b/rsync.h
> > index 88319732..17f8700e 100644
> > --- a/rsync.h
> > +++ b/rsync.h
> > @@ -916,12 +916,13 @@ extern int xattrs_ndx;
> >  
> >  /*
> >   * Start the flist array at FLIST_START entries and grow it
> >   * by doubling until FLIST_LINEAR then grow by FLIST_LINEAR
> >   */
> > -#define FLIST_START	(32 * 1024)
> > -#define FLIST_LINEAR	(FLIST_START * 512)
> > +#define FLIST_START	(32)
> > +#define FLIST_START_LARGE	(32 * 1024)
> > +#define FLIST_LINEAR	(FLIST_START_LARGE * 512)  
> 
> Probably these should remain aligned (I'm assuming tab has width 8)

Point taken.

Regards,
-- 
Jindrich Makovicka



More information about the rsync mailing list