Rsync 3.0.0pre8 and Mac OS X

Matt McCutchen matt at mattmccutchen.net
Thu Jan 24 04:28:04 GMT 2008


On Wed, 2008-01-23 at 16:01 +0100, Rudolf E. Reiber wrote:
> I tried Rsync 3.0.0pre8 on my Mac running OS X 10.5.
> 
> I was very pleased about the --iconv feature, as I have to sync some
> Linux machines and I had real trouble with some filenames.
> But I found one strange thing in connection with the Mac.
> 
> First of all, the translation between the Linux ISO-8859-15 and the
> Mac UTF-8 works (nearly) perfectly.
> 
> As I live in Germany, we often have filenames containing special
> characters (umlauts like äöüÄÖÜ).
> And all the filenames look perfect on my Mac.
> 
> But whenever I run rsync again, all the files containing one of these
> special characters in the name are deleted and copied again.
> And there are quite a lot of them.
> 
> I found the reason for this behaviour.
> Let me explain it with the example of the letter ä (&auml; in HTML).
> On the Linux machines running UTF-8 the ä is coded as $C3A4, which is
> the UTF-8 encoding of the character $E4 (U+00E4). The ä thus occupies
> 2 bytes.
> 
> I was very astonished when I copied a Mac filename, pasted it into a
> text editor and looked at the file:
> 
> In the Mac filename the letter ä is coded as $61CC88, which in UTF-8
> means the letter "a" followed by U+0308 (a combining diacritical mark).
> So the Mac combines the letter a with the two dots above it instead
> of using the precomposed letter $E4.
> Now things are clear: the filenames are different, despite looking
> identical.

Yup.  The Mac HFS+ filesystem automatically decomposes Unicode
characters in the stored versions of filenames, which confuses a number
of programs, including rsync and git.  A flamewar about whether to blame
the problem on HFS+ or the application has been running on the git list
for a week now.
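
To make the mismatch concrete, here is a small Python sketch (purely an
illustration, not anything rsync itself does; the variable names are made
up). It builds both spellings of ä with the standard unicodedata module and
shows that they differ byte for byte until both are brought to the same
normalization form.

```python
import unicodedata

# Precomposed form (NFC), as typically stored on the Linux side: U+00E4.
composed = unicodedata.normalize("NFC", "\u00e4")
# Decomposed form (NFD), as in the Mac filename: "a" followed by U+0308.
decomposed = unicodedata.normalize("NFD", "\u00e4")

composed_bytes = composed.encode("utf-8")      # the $C3A4 from the report
decomposed_bytes = decomposed.encode("utf-8")  # the $61CC88 from the report

print(composed_bytes.hex(), decomposed_bytes.hex())

# A byte-wise comparison treats the two spellings as different names...
print(composed_bytes == decomposed_bytes)

# ...but they compare equal once both are in one normalization form.
print(unicodedata.normalize("NFC", decomposed) == composed)
```

Any tool that compares filenames as raw bytes will see two different
names here, which is why the files get deleted and copied again.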

> A question to the developers: do you see any solution to this problem?  
> Perhaps a --iconv=utf8mac,iso885915 ?

Precisely.  We need an iconv encoding name for "the form of UTF-8 that
the Mac likes", and none of the existing encodings in the iconv on my
computer fit the bill.  Another option is to store the umlaut-named files
on a filesystem other than HFS+ on the Mac.
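
As a rough sketch of what such an encoding would have to do, the Python
below converts a filename from ISO-8859-15 to decomposed UTF-8 and back.
The helper names linux_to_mac and mac_to_linux are hypothetical, and plain
NFD is only an approximation here: as far as I know, HFS+ uses its own
variant of Unicode decomposition that leaves some character ranges alone.

```python
import unicodedata


def linux_to_mac(name_bytes: bytes) -> bytes:
    """ISO-8859-15 filename -> decomposed (NFD) UTF-8. Hypothetical helper."""
    text = name_bytes.decode("iso-8859-15")
    return unicodedata.normalize("NFD", text).encode("utf-8")


def mac_to_linux(name_bytes: bytes) -> bytes:
    """Decomposed UTF-8 filename -> ISO-8859-15. Hypothetical helper."""
    text = name_bytes.decode("utf-8")
    # Recompose first: ISO-8859-15 has no combining diaeresis of its own.
    return unicodedata.normalize("NFC", text).encode("iso-8859-15")


latin_name = b"B\xe4r.txt"          # "Bär.txt" in ISO-8859-15
mac_name = linux_to_mac(latin_name)

print(mac_name)
print(mac_to_linux(mac_name) == latin_name)
```

The point of the round trip is that the name sent back from the Mac side
matches the original byte for byte, so a second run would have nothing to
delete or re-copy.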

Matt


