performance with >50GB files

René Rebe rene at exactcode.de
Tue Jan 10 18:46:19 GMT 2006


Hi,

following up on my previous post: I can reproduce the issue locally.

I produced a 50146750688-byte /home/test.dat by cat'ing a lot of data
files together (I needed some input data ...). The initial rsync took over an hour,
saturating the 100 Mbit Ethernet. I then ran shred on the first GB of the file,
cat'ed some more data onto the end and reran rsync:

rsync -arvPe ssh 192.168.2.45:/home/test.dat test.dat

The sending Athlon XP 2500+ is saturated, while the receiver gets:

    619708416   1%    1.03MB/s   13:02:58

Of course, the dual-CPU ppc64 receiver is idle, waiting for data to arrive.
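
As a rough sanity check on the figures above, here is a small Python sketch that
compares the initial full-speed transfer with the rate seen in the progress line.
The file size and byte count come from this mail. The rest are assumptions: the
100 Mbit link is taken as 12.5 million bytes per second with no protocol
overhead, and rsync's "MB" is taken as 1024*1024 bytes.

```python
FILE_SIZE = 50146750688         # bytes, size of /home/test.dat
DONE = 619708416                # bytes already transferred in the progress line
LINK_RATE = 100e6 / 8           # assumption: 100 Mbit/s, no protocol overhead
OBSERVED_RATE = 1.03 * 1024**2  # assumption: rsync's "MB" means 1024*1024 bytes


def transfer_seconds(num_bytes, bytes_per_second):
    """Time needed to move num_bytes at a constant rate."""
    return num_bytes / bytes_per_second


# Initial copy: the whole file at (assumed) wire speed.
initial = transfer_seconds(FILE_SIZE, LINK_RATE)

# Second run: what is left of the file at the observed rate.
remaining = transfer_seconds(FILE_SIZE - DONE, OBSERVED_RATE)

# How much slower the second run is than the saturated link.
slowdown = LINK_RATE / OBSERVED_RATE

print(f"initial transfer: {initial / 60:.0f} minutes")
print(f"remaining at observed rate: {remaining / 3600:.1f} hours")
print(f"slowdown versus wire speed: {slowdown:.1f}x")
```

Under these assumptions the initial copy fits in a bit over an hour, while the
second run needs on the order of 13 hours, roughly in line with the ETA rsync
printed.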

The sender:

22952 root      18   0 12988  11m  624 D 95.0  2.2   5:24.75 rsync              

Oprofile shows:

samples  %        symbol name
9273739  87.1946  match_sums
633910    5.9602  map_ptr
459817    4.3233  mdfour64
217974    2.0495  copy64
32467     0.3053  mdfour_update
11310     0.1063  get_checksum2
2206      0.0207  writefd_unbuffered
871       0.0082  sum_update
638       0.0060  writefd
522       0.0049  mplex_write
306       0.0029  compare_targets
256       0.0024  io_flush
253       0.0024  matched
241       0.0023  send_token
229       0.0022  msg_list_push
198       0.0019  mdfour_tail
128       0.0012  write_int
122       0.0011  readfd_unbuffered
105      9.9e-04  readfd
94       8.8e-04  send_files
69       6.5e-04  mdfour_begin
63       5.9e-04  mdfour_result
37       3.5e-04  write_buf
33       3.1e-04  .plt
25       2.4e-04  copy4
25       2.4e-04  read_int
19       1.8e-04  read_buf
11       1.0e-04  read_timeout
10       9.4e-05  get_checksum1
1        9.4e-06  _fini
1        9.4e-06  clean_flist
1        9.4e-06  deflate_fast
1        9.4e-06  parse_arguments
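
To put the profile in proportion, the top rows can be summarised with a few lines
of Python. This is only a sketch over the numbers copied from the report above;
the helper names are made up for illustration and are not part of oprofile or
rsync.

```python
# Top rows of the oprofile report above: samples, percent, symbol name.
PROFILE = """\
9273739  87.1946  match_sums
633910    5.9602  map_ptr
459817    4.3233  mdfour64
217974    2.0495  copy64
"""


def parse_profile(text):
    """Return a list of (symbol, samples, percent) tuples (hypothetical helper)."""
    rows = []
    for line in text.splitlines():
        samples, percent, symbol = line.split()
        rows.append((symbol, int(samples), float(percent)))
    return rows


rows = parse_profile(PROFILE)

# Share of all samples covered by the four hottest symbols.
top_share = sum(percent for _, _, percent in rows)

# How many times more samples the hottest symbol has than the runner-up.
ratio = rows[0][1] / rows[1][1]

print(f"top {len(rows)} symbols: {top_share:.2f}% of samples")
print(f"{rows[0][0]} has {ratio:.1f}x the samples of {rows[1][0]}")
```

The four hottest symbols account for nearly all samples, and match_sums alone
has more than ten times the samples of the next entry.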

That is all so far. I will continue to analyze the issue; maybe an rsync developer
will already come to a conclusion while I start reading through the source.

Best regards,

On Monday 09 January 2006 23:38, René Rebe wrote:

> today we had a performance issue transferring a large amount of data where
> one file was over 50GB. Rsync was tunneled over SSH and we expected the data 
> to be synced within hours. However after over 10 hours the data is still not 
> synced ... The sending box has rsync running with 60-80 % CPU load (2GHz
> Pentium 4) while the receiver is nearly idle.
> 
> So far I had no access to the problematic setup but I will have to analyze this
> soon. I would like to ask beforehand if there are known performance hits
> syncing such huge files?

-- 
René Rebe - Rubensstr. 64 - 12157 Berlin (Europe / Germany)
            http://www.exactcode.de | http://www.t2-project.org
            +49 (0)30  255 897 45

