[clug] A plain text file that isn't, how do I convert it?

Brad Hards bradh at frogmouth.net
Fri Oct 31 06:03:20 GMT 2008


On Friday 31 October 2008 03:42:55 pm Alex Satrapa wrote:
> On 31/10/2008, at 14:26 , Paul Warren wrote:
> > <97> instead of - (but should be an em dash I believe)
> > <93> and <94> instead of "
>
> 0x97 or 0227 representing — (em-dash)
> 0x93 representing “ (left double quote)
> 0x94 representing ” (right double quote)
>
> That is Windows Latin 1, to the best of my knowledge (as opposed to
> ISO Latin 1). I shudder to think what everyone's mail readers have
> done with those symbols I've inserted ;)  I expect that my Apple Mail
> client is using UTF-8.

They look fine using KMail :-) Presumably Qt is converting from the declared 
charset into something sane, using either iconv or some internal codec.

Speaking of iconv, if you know which encoding the file uses (or are willing 
to guess) and you know which encoding you actually want, then you might find 
the iconv(1) command line utility of use. For example:
iconv --from-code WINDOWS-1252 --to-code UTF-8 < input.txt > output.txt
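
If iconv isn't handy, roughly the same conversion can be sketched in Python.
This is only an illustration of the idea, assuming the input really is
Windows-1252; the function names (cp1252_to_utf8, convert_file) and the
sample bytes are made up for this example and are not part of any tool
mentioned in this thread.

```python
# A minimal sketch, assuming the input bytes are Windows-1252 (cp1252).
# Function names and sample data are hypothetical, for illustration only.

def cp1252_to_utf8(data: bytes) -> bytes:
    """Decode bytes as Windows-1252 and re-encode them as UTF-8."""
    return data.decode("cp1252").encode("utf-8")


def convert_file(src_path: str, dst_path: str) -> None:
    """Read src_path as raw bytes, convert, and write dst_path as UTF-8."""
    with open(src_path, "rb") as src:
        raw = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(cp1252_to_utf8(raw))


# The three bytes discussed above: 0x93 and 0x94 are the curly double
# quotes, 0x97 is the em dash.
sample = b"\x93quoted\x94 \x97 dash"
converted = cp1252_to_utf8(sample)
text = converted.decode("utf-8")
print(text)
```

Note that Python's cp1252 codec raises UnicodeDecodeError on the few byte
values that Windows-1252 leaves undefined (0x81, 0x8D, 0x8F, 0x90, 0x9D), so
a failure there is a hint that the encoding guess was wrong.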

Brad



