Thoughts on dos cr/lf conversion
David Collier-Brown
davecb at Canada.Sun.COM
Wed Nov 11 17:46:30 GMT 1998
Andrew Tridgell wrote:
> yep, others have thought about it, and there is even a version of
> Samba-1.9.02 around that has it fully implemented!
[snip]
> I've put a copy of that implementation in ~samba-bugs/samba-1.9.02aTI/
> on samba.anu.edu.au. The work was done by Dan Lydick at Texas
> Instruments (lydick at cvpsun01.csc.ti.com).
(http://samba.anu.edu.au/~samba-bugs drew a "Not Found"
diagnostic, and there isn't a pub/samba-bugs on the ftp
server.)
Pending reading the code, I'd like to warn about
the "read first block" heuristic. If one just looks
for 8-bit characters, one will occasionally mistake a
binary file for text. A stronger test is still to look
for the eighth bit, but also to note whether at least
one \n was found in the block. If you don't find at least
one \n in a reasonable blocksize, you're looking at something
with **remarkably** long lines, and you'd best look farther.
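A rough sketch of that stronger test, in Python rather than
Samba's C, purely for illustration. The names classify_block
and classify_file, the 4096-byte block size, the 16-block
limit, and the choice to fall back to "binary" when nothing is
conclusive are all assumptions of mine, not taken from the TI
implementation or from any Samba code.

```python
from typing import Optional

BLOCK_SIZE = 4096  # assumed "reasonable blocksize"; no figure is given above


def classify_block(block: bytes) -> Optional[bool]:
    """Return False for binary, True for text, None for 'look farther'."""
    # Any byte with the eighth bit set is treated as evidence of binary data.
    if any(b & 0x80 for b in block):
        return False
    # 7-bit clean and at least one newline: tentatively text.
    if b"\n" in block:
        return True
    # 7-bit clean but no newline: remarkably long lines, so stay undecided.
    return None


def classify_file(path: str, max_blocks: int = 16) -> bool:
    """Read blocks until one is conclusive (hypothetical helper)."""
    with open(path, "rb") as f:
        for _ in range(max_blocks):
            block = f.read(BLOCK_SIZE)
            if not block:
                break
            verdict = classify_block(block)
            if verdict is not None:
                return verdict
    # Nothing conclusive (or an empty file): assume binary, so that
    # no translation is applied by mistake.
    return False
```

Returning None instead of guessing is what lets the caller
"look farther" when a block has no \n in it.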
One approach that works is to use a heuristic to make
a tentative decision based on the first few KB, but
arrange to check the whole file as part of getting ready
to provide it in translated form. This pulled an old DMC
program I wrote from ``often wrong'' up to ``never seemed
to fail''. The heuristic actually guessed wrong a LOT, but
the whole-file check caught it, so the user couldn't tell
unless he read the logs...
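The two-stage idea might look roughly like the following
Python sketch. It is not the old program mentioned above, and
the names is_text and prepare_for_dos, the 4096-byte head
size, and the logger name are all made up for the example. It
assumes the whole file fits in memory and that "translated
form" means LF to CR/LF.

```python
import logging

log = logging.getLogger("crlf-sketch")  # hypothetical logger name
HEAD_SIZE = 4096  # assumed size for "the first few KB"


def is_text(data: bytes) -> bool:
    """Heuristic: at least one newline and no byte with the eighth bit set."""
    return b"\n" in data and not any(b & 0x80 for b in data)


def prepare_for_dos(data: bytes) -> bytes:
    """Return data translated to CR/LF, or unchanged if it is not text."""
    # Stage 1: tentative decision from the head of the file only.
    if not is_text(data[:HEAD_SIZE]):
        return data
    # Stage 2: verify the whole file before handing out a translation.
    if any(b & 0x80 for b in data):
        # The first guess was wrong; only the log records it.
        log.warning("tentative 'text' guess was wrong; serving untranslated")
        return data
    # Normalise any existing CR/LF first so those lines are not doubled.
    return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")
```

A wrong first guess costs one log line and an untranslated
file, which is why the user never notices it.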
--dave
--
David Collier-Brown, | Always do right. This will gratify some people
185 Ellerslie Ave., | and astonish the rest. -- Mark Twain
Willowdale, Ontario | http://java.science.yorku.ca/~davecb
Home: (416) 223-8968 Work: (905) 477-0437 Email: davecb at canada.sun.com