file statistics collection using stat(2) data obtained by rsync

Hugo Connery hmc at ER.DTU.DK
Sun Sep 16 09:13:18 GMT 2007


Hi Matt and list,

Yes, uid/access-time-based statistics gathering is quite orthogonal to
rsync's motivation.  But as rsync backs up my data, it has access to all
the statistics I need, so why not piggyback the stats gathering on rsync
as a matter of efficiency?

But perhaps orthogonal extensions break one of the fundamental rules:
do one thing well.

Regards,

  Hugo
________________________________________
From: hashproduct at gmail.com [hashproduct at gmail.com] On Behalf Of Matt McCutchen [hashproduct+rsync at gmail.com]
Sent: Saturday, September 15, 2007 17:32
To: Hugo Connery
Cc: rsync
Subject: Re: file statistics collection using stat(2) data obtained by rsync

Note: In the future, please Cc the rsync list in your responses so
that others can help you if I become unavailable and so that future
users can refer to your message.

On 9/15/07, Hugo Connery <hmc at er.dtu.dk> wrote:
> I want to obtain summary statistics grouped by file owner and access time, i.e., at the
> end of the operation, report for each user the number of bytes that the user has stored
> that have been accessed within each of a group of time periods (last 3 months, 3-6 months,
> 6-12 months, etc.).  This basically forms a table of data sizes.

The calculation of these statistics appears to be completely
orthogonal to what rsync is doing (copying files).  Unless keeping the
number of stat(2) calls low is critical in your scenario, I think it
would be much easier and more appropriate to write a separate script
to calculate the statistics.
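
As an illustration, a separate script of this kind might look like the
following minimal Python sketch.  It walks a directory tree, calls
lstat on each regular file, and sums file sizes per owner uid and
access-time bucket.  The bucket boundaries (90, 180 and 365 days), the
bucket labels, and the function names bucket_for, collect and
format_table are assumptions made for this example; none of them come
from rsync.

```python
import os
import stat
import time
from collections import defaultdict

# Assumed age buckets (upper bound in days, label), mirroring the
# periods mentioned in the request. Adjust to taste.
BUCKETS = [(90, "0-3 months"), (180, "3-6 months"), (365, "6-12 months")]
OLDEST = "12+ months"
LABELS = [label for _, label in BUCKETS] + [OLDEST]


def bucket_for(age_days):
    """Return the label of the first bucket whose upper bound exceeds age_days."""
    for limit, label in BUCKETS:
        if age_days < limit:
            return label
    return OLDEST


def collect(root, now=None):
    """Walk root and sum file sizes per (owner uid, access-time bucket)."""
    now = time.time() if now is None else now
    totals = defaultdict(lambda: defaultdict(int))
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if not stat.S_ISREG(st.st_mode):
                continue  # count regular files only
            age_days = (now - st.st_atime) / 86400
            totals[st.st_uid][bucket_for(age_days)] += st.st_size
    return totals


def format_table(totals):
    """Return one text line per uid: the uid followed by bytes per bucket."""
    lines = ["uid " + " ".join(label.replace(" ", "_") for label in LABELS)]
    for uid in sorted(totals):
        sizes = " ".join(str(totals[uid][label]) for label in LABELS)
        lines.append(f"{uid} {sizes}")
    return lines
```

For example, print("\n".join(format_table(collect("/home")))) would
print one row per uid with the byte totals for each period.  This costs
a second pass of stat calls over the tree, which is the trade-off
mentioned above.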

Matt

