[linux-cifs-client] iocharset iso8859-1 vs cp1252

bugzilla KevinH bugzilla.kevinh at gmail.com
Tue Oct 17 00:55:21 GMT 2006


On 10/16/06, simo <idra at samba.org> wrote:
> On Mon, 2006-10-16 at 16:56 -0500, bugzilla KevinH wrote:
> > It is my understanding that although very similar, there are subtle
> > differences
> > ( http://mail.python.org/pipermail/xml-sig/2001-May/005509.html )
> > between iso8859-1 and Windows-1252, especially in the area of the C1
> > Control Code set
> > ( http://en.wikipedia.org/wiki/C0_and_C1_control_codes and
> > http://en.wikipedia.org/wiki/Windows-1252 ).
> >
> > I have a file on a Windows 2003 server which I believe is using the
> > 0x0096 EN-DASH character in the filename, and I am finding this file
> > impossible to read from a linux box using mount -t cifs -o
> > iocharset=iso8859-1.  I don't seem to be able to assign
> > iocharset=cp1252, which is what I want.  It appears to me that a cp1252
> > NLS table doesn't even exist ( http://lxr.linux.no/source/fs/nls/?v=2.6.18 ).
> > Is there some reason for this?  Am I completely mistaken in all of
> > this, and there is really some other silly reason I cannot read the
> > file?
>
> Why don't you just use utf8 ?
> Windows uses utf16 so utf8 is the best match.
>

Well, I had a reason, but it turns out it wasn't a very good one.

I wasn't using utf8 because I was reading a file, generated by a
program I have no control over, that lists the files on the Windows
server for me to copy.  That file is encoded in CP1252.

But you are absolutely right: rather than depending on the filesystem
driver to handle the conversion for me, I can just as easily convert
the CP1252 strings I read out of the file to UTF-8 myself.
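
For the archives, here is a minimal sketch of that conversion. It
assumes Python 3 and a list file with one name per line. The helper
name read_cp1252_paths and the sample bytes are made up for
illustration and are not taken from the real file.

```python
import io


def read_cp1252_paths(stream):
    """Yield each non-empty line of a CP1252-encoded byte stream as a str."""
    for raw in stream:
        line = raw.rstrip(b"\r\n")
        if line:
            # Byte 0x96 becomes U+2013 (EN DASH) under CP1252.
            yield line.decode("cp1252")


# Hypothetical sample standing in for the generated list file.
sample = io.BytesIO(b"report 2005\x962006.doc\r\nnotes.txt\r\n")

names = list(read_cp1252_paths(sample))

# The same names as UTF-8 bytes, which is what a share mounted with
# iocharset=utf8 should present on the Linux side.
utf8_names = [name.encode("utf-8") for name in names]
```

Python's cp1252 codec raises UnicodeDecodeError on the five bytes that
Windows-1252 leaves unassigned (0x81, 0x8D, 0x8F, 0x90, 0x9D), so a
real list file may need an explicit errors= policy.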

I still think there is a gap in coverage of iocharsets...but my
problem is fixed, so I have no good reason to ask for a CP1252
iocharset anymore.
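
To make the gap concrete, the following sketch lists the byte values
that decode differently under the two charsets. It relies on Python 3's
built-in latin-1 and cp1252 codecs as stand-ins for the kernel NLS
tables, which is an assumption: the kernel tables are not consulted
here.

```python
differing = []
for value in range(256):
    raw = bytes([value])
    latin1 = raw.decode("latin-1")
    try:
        cp1252 = raw.decode("cp1252")
    except UnicodeDecodeError:
        # Unassigned in Python's cp1252 codec (e.g. 0x81).
        cp1252 = None
    if latin1 != cp1252:
        differing.append(value)

# Byte 0x96 is a C1 control character in ISO 8859-1 but EN DASH in CP1252.
en_dash_latin1 = bytes([0x96]).decode("latin-1")
en_dash_cp1252 = bytes([0x96]).decode("cp1252")
```

Only the range 0x80 through 0x9F differs, which is exactly the C1
control block mentioned earlier in the thread.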

Thanks Simo.

-Kevin

