[clug] A plain text file that isn't, how do I convert it?

Peter Barker pbarker at barker.dropbear.id.au
Fri Oct 31 03:41:21 GMT 2008


On Fri, 31 Oct 2008, Paul Warren wrote:

> \222 for -
> \223 and \224 for "

> What I'd like to do is convert all these weird characters to the
> standard, normal, plain old easy to recognise characters, so that it
> looks like normal text in less, emacs, and other text viewers.

If you can enumerate the problems, then:
tr '\222' '-' < file > file.new
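
All three bytes you listed can be handled in one pass. This is only a
rough sketch: fix_punct is a made-up name, and the mapping follows your
list. In windows-1252, \222 is actually the curly apostrophe, so ' may
be a better target than - for that byte. The dash goes last in the
second set so that tr does not read it as an option or a range.

```shell
# Hypothetical helper: map the three bytes from the original message to
# plain ASCII. LC_ALL=C makes tr work on raw bytes whatever the locale.
#   \223, \224 -> "
#   \222       -> -   (as reported; an apostrophe may fit better)
fix_punct() {
    LC_ALL=C tr '\223\224\222' '""-'
}

# Real use would be: fix_punct < file > file.new
# Quick check on a sample line containing the three bytes:
printf 'He said \223hello\224 \222 twice\n' | fix_punct
# prints: He said "hello" - twice
```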

> So, can anyone tell me just what is going on here? Is it some mismatch
> between character encoding and fonts or what? I'm a bit confused.

I think you've been Microsoftified.  I've seen the " thing before: those
are "smart quotes", which turn "This is a quote" into ``This is a quote''
(but using a single character for each quote mark, obviously).

Try converting from windows-1252.  My guess is it won't work, but you 
could try :)

recode CP1252..ascii < file > file.new
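
If recode isn't installed, iconv may be worth a try instead. The sketch
below assumes your iconv knows the name CP1252, and cp1252_to_utf8 is
a made-up helper name. It converts to UTF-8 rather than ASCII, so
nothing is lost and a UTF-8 terminal should show the quotes properly.
GNU iconv also accepts -t ASCII//TRANSLIT, which tries to substitute
ASCII lookalikes, but what you get depends on the implementation and
locale.

```shell
# Hypothetical helper: reinterpret windows-1252 bytes as UTF-8 text.
cp1252_to_utf8() {
    iconv -f CP1252 -t UTF-8
}

# Real use would be: cp1252_to_utf8 < file > file.new
# Quick check on a sample line with the two smart-quote bytes:
printf '\223quoted\224\n' | cp1252_to_utf8
```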

Here's a URL for you: 
http://www.oreillynet.com/pub/a/oreilly/news/oram_0100.html

> Paul Warren

Yours,
-- 
Peter Barker                          |   Programmer,Sysadmin,Geek.
pbarker at barker.dropbear.id.au	      |   You need a bigger hammer.
:: It's a hack! Expect underscores! - Nigel Williams

