creation date and OSX [performance]

Vitorio Machado v.machado at permanence-informatique.fr
Sat Feb 2 08:24:16 GMT 2008


Hi,

I think it's OK to call getattrlist only once and assume that a
creation date exists. My arguments:

1) First of all, most Macs run only on HFS+. The exceptions are those
running UFS (I have only seen one person mention this, for a server,
and I think it's very rare) or those with FAT/NTFS volumes for
compatibility with Windows.

2) From the getattrlist manpage:
     The getattrlist() function is only supported by certain volume format
     implementations.  For maximum compatibility, client programs should use
     high-level APIs (such as the Carbon File Manager) to access file system
     attributes.  These high-level APIs include logic to emulate file system
     attributes on volumes that don't support getattrlist().

In other words, if we really care about compatibility, we should use a
Carbon API that tests this for us and does the dirty work for us. That
may be a good idea.

3) Also from the getattrlist manpage:
     Not all volumes support all attributes.  See the discussion of
     ATTR_VOL_ATTRIBUTES for a discussion of how to determine whether a
     particular volume supports a particular attribute.

I don't really know which discussion it refers to, but I suppose it is
on the Apple developer site, http://developer.apple.com. I haven't had
time to look yet.
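
Going only by the manpage's attribute list, the per-volume check it mentions could be sketched roughly as below. This is untested: volume_supports_crtime() is a hypothetical helper name, the code assumes that ATTR_VOL_ATTRIBUTES has to be requested together with ATTR_VOL_INFO on the volume's mount point, and the non-Mac branch is only a stub so the file compiles elsewhere.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#ifdef __APPLE__
#include <sys/param.h>
#include <sys/mount.h>
#include <sys/attr.h>
#include <unistd.h>

/* Reply layout: a length word followed by the requested attribute. */
struct vol_attr_buf {
	u_int32_t length;
	vol_attributes_attr_t attrs;
} __attribute__((aligned(4), packed));
#endif

/* Hypothetical helper (not part of the patch).
 * Returns 1 if the volume holding `path` lists ATTR_CMN_CRTIME as valid,
 * 0 if it does not or getattrlist() is unsupported, -1 on other errors. */
int volume_supports_crtime(const char *path)
{
	assert(path != NULL);
#ifdef __APPLE__
	struct statfs sfs;
	struct attrlist al;
	struct vol_attr_buf buf;

	/* Find the mount point of the volume that contains `path`. */
	if (statfs(path, &sfs) < 0)
		return -1;

	memset(&al, 0, sizeof al);
	al.bitmapcount = ATTR_BIT_MAP_COUNT;
	al.volattr = ATTR_VOL_INFO | ATTR_VOL_ATTRIBUTES;

	if (getattrlist(sfs.f_mntonname, &al, &buf, sizeof buf, 0) < 0)
		return errno == ENOTSUP ? 0 : -1;

	return (buf.attrs.validattr.commonattr & ATTR_CMN_CRTIME) != 0;
#else
	/* Stub for other systems: getattrlist() does not exist here. */
	(void)path;
	errno = ENOTSUP;
	return 0;
#endif
}
```

A caller could in principle run this once per source volume instead of probing every file, if the reported attribute set turns out to be reliable.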

4) Again from the getattrlist manpage:

COMPATIBILITY
     Not all volumes support getattrlist().  The best way to test whether a
     volume supports this function is to simply call it and check the error
     result.  getattrlist() will return ENOTSUP if it is not supported on a
     particular volume.

I suppose that getattrlist on an unsupported volume returns this error.
If it works as I expect, we only need to catch it and that's it.
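
To make the idea concrete, here is a rough sketch of a single call that both answers whether a creation date is available and returns it, treating ENOTSUP as "no creation date" rather than as a failure. fetch_crtime() is a hypothetical name, not something in the patch; the attribute request is what I assume getCreationTime() does, and the non-Mac branch is only a stub.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <time.h>

#ifdef __APPLE__
#include <sys/attr.h>
#include <unistd.h>

/* Reply layout: a length word followed by the creation time. */
struct crtime_buf {
	u_int32_t length;
	struct timespec crtime;
} __attribute__((aligned(4), packed));
#endif

/* Hypothetical helper (not part of the patch).
 * Returns 1 and stores the creation time (seconds) in *out on success,
 * 0 if the volume does not support getattrlist() (ENOTSUP),
 * -1 on any other error, with errno left as set by the call. */
int fetch_crtime(const char *path, time_t *out)
{
	assert(path != NULL && out != NULL);
#ifdef __APPLE__
	struct attrlist al;
	struct crtime_buf buf;

	memset(&al, 0, sizeof al);
	al.bitmapcount = ATTR_BIT_MAP_COUNT;
	al.commonattr = ATTR_CMN_CRTIME;

	if (getattrlist(path, &al, &buf, sizeof buf, FSOPT_NOFOLLOW) < 0)
		return errno == ENOTSUP ? 0 : -1;

	*out = buf.crtime.tv_sec;
	return 1;
#else
	/* Stub for other systems: behave like an unsupported volume. */
	(void)path;
	(void)out;
	errno = ENOTSUP;
	return 0;
#endif
}
```

With a return value like this, the caller can keep the result of the one call and decide from it whether to add the creation-time entry, instead of asking a second time.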

Also note that (again from the getattrlist manpage):

     The getattrlist() function has been undocumented for more than two
     years.  In that time a number of volume format implementations have
     been created without a proper specification for the behaviour of this
     routine.  You may encounter volume format implementations with
     slightly different behaviour than what is described here.  Your
     program is expected to be tolerant of this variant behaviour.

So, there are some clues to check. I will probably look into it when I
have some time, but I have already committed to the 10.3 compatibility
work and I have had unexpected personal problems, so I can't say when I
will be able to give time to these projects.

Best regards,

Vitorio

On 2 Feb 2008, at 07:18, Mike Bombich wrote:

> Looking at this patch from a performance perspective, it appears  
> that getattrlist is called twice for every file:
>
> 23:57:24.341  lstat        00-basic-permissions/owned-by-root  0.000011  rsync
> 23:57:24.341  listxattr    00-basic-permissions/owned-by-root  0.000006  rsync
> 23:57:24.341  getattrlist  00-basic-permissions/owned-by-root  0.000006  rsync
> 23:57:24.341  getattrlist  00-basic-permissions/owned-by-root  0.000005  rsync
> 23:57:24.341  lstat        00-basic-permissions/owned-by-www   0.000008  rsync
> 23:57:24.341  listxattr    00-basic-permissions/owned-by-www   0.000005  rsync
> 23:57:24.341  getattrlist  00-basic-permissions/owned-by-www   0.000006  rsync
> 23:57:24.341  getattrlist  00-basic-permissions/owned-by-www   0.000005  rsync
>
>
> The first time it is called by
>
> sys_llistxattr()
> 	getCreationTime()
>
>  -- basically we determine if the file has a creation date.  If it  
> does, then add the CRTIME_XATTR string to the xattr list.  The  
> creation date isn't actually cached here, though.  To get the  
> actual creation date, getattrlist is called again via xattrs.c
>
> get_xattr_data()
> 	sys_lgetxattr()
> 		get_crtime_xattr()
> 			getCreationTime()
>
>
> The performance hit is significant, and I'm wondering how safe it
> is to simply assume that every file has a creation date (given that
> this section is wrapped in #if HAVE_OSX_XATTRS) and therefore drop
> the first getCreationTime call and add the CRTIME_XATTR string to the
> xattr list by default.  For example:
>
> // sysxattrs.c:150
> ssize_t sys_llistxattr(const char *path, char *list, size_t size)
> {
> 	ssize_t ret = listxattr(path, list, size, XATTR_NOFOLLOW);
> 	if (ret < 0)
> 		return ret;
> //	if (getCreationTime(path) != NULL) {
> 		ret += sizeof CRTIME_XATTR;
> 		if (list) {
> 			if ((size_t)ret > size) {
> 				errno = ERANGE;
> 				return -1;
> 			}
> 			memcpy(list + ret - sizeof CRTIME_XATTR,
> 			       CRTIME_XATTR, sizeof CRTIME_XATTR);
> 		}
> //	}
> 	return ret;
> }
>
>
> Or would this bomb out running on MOSX with a non-HFS volume as the  
> source?  Or is there a better way to avoid this call (e.g.  
> determine the underlying filesystem)?
>
> Mike
>
> On Dec 1, 2007, at 10:45 PM, Robert DuToit wrote:
>
>> Hi,
>>  I've been using rsync (OSX Tiger, now Leopard) to back up my home
>> folder daily using -a -H -A -X --link-dest=dir to make incremental
>> backups. There was a problem though, since many files, especially
>> images, movies etc., would be recopied each time instead of creating
>> hard links. I have been testing the pre5 release and found that it
>> seems to make hard links correctly for all files. I am hoping
>> rsync 3.0 can replace the Apple version which has been so flawed.
>>
>> I tried the osx-create-time.diff patch too and it works but it  
>> took twice as long to copy my home folder as without and ground to  
>> a halt the last time. I know the creation date issue is somewhat  
>> "fringe" for rsync but it does matter to a lot of OSX folk so I  
>> don't know if there is any way to speed it up. I've made some  
>> small backup wrapper applications for people and they always want  
>> the creation date.....
>>
>> I noticed the rsync version that is used now in Carbon Copy Cloner
>> is pretty "clean" with metadata and saves the creation date and is
>> very fast.....  Just some thoughts-I don't have any experience
>> with this code so can't help in that way.  Thanks, Rob
>> --
>> To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
>> Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
>>
