Thoughts on dos cr/lf conversion

David Collier-Brown davecb at Canada.Sun.COM
Wed Nov 11 17:46:30 GMT 1998


Andrew Tridgell wrote: 
> yep, others have thought about it, and there is even a version of
> Samba-1.9.02 around that has it fully implemented! 
[snip]
> I've put a copy of that implementation in ~samba-bugs/samba-1.9.02aTI/
> on samba.anu.edu.au. The work was done by Dan Lydick at Texas
> Instruments (lydick at cvpsun01.csc.ti.com).

	(http://samba.anu.edu.au/~samba-bugs drew a "Not Found"
	diagnostic, and there isn't a pub/samba-bugs on the ftp
	server)

	Pending reading the code, I'd like to warn about
	the "read first block" heuristic.  If you just look
	for 8-bit characters, you will occasionally mistake a
	binary file for text.  A stronger test still looks
	for the eighth bit, but also notes whether at least
	one \n was found in the block.  If you don't find at least
	one \n in a reasonable blocksize, you're looking at something
	with **remarkably** long lines, and you'd best look further.
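
	A minimal sketch of that stronger test, in Python for
	brevity.  The function name and the three result strings
	are made up for illustration; they are not taken from
	Samba or from the TI patch.

```python
def classify_block(block: bytes) -> str:
    """Classify one block as 'binary', 'text', or 'undecided' (hypothetical names)."""
    # Any byte with the eighth bit set counts as evidence of binary data.
    if any(b & 0x80 for b in block):
        return "binary"
    # Pure 7-bit data with at least one newline is tentatively text.
    if b"\n" in block:
        return "text"
    # 7-bit data with no newline at all: remarkably long lines,
    # so the caller should read another block before deciding.
    return "undecided"
```

	A block of 7-bit bytes with no \n comes back "undecided"
	rather than "text", which is the case the plain eighth-bit
	test gets wrong.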

	One scenario that works is to use a heuristic to make
	a tentative decision based on the first few KB, but
	arrange to check the whole file as part of getting ready
	to provide it in translated form.  This pulled an old DMC
	program I wrote from ``often wrong'' up to ``never seemed
	to fail''.  The tentative guess was actually wrong a LOT,
	but the user couldn't tell unless he read the logs...
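
	The two-stage scheme might look roughly like the sketch
	below.  Everything in it is an assumption made for
	illustration: the name serve(), the 4 KB sniff size, holding
	the whole file in memory, translating LF to CR/LF, and
	falling back to the untranslated bytes when the full check
	disagrees.  It is not the old program's code.

```python
import logging

log = logging.getLogger("crlf_sketch")  # hypothetical logger name

SNIFF_BYTES = 4096  # assumed size of "the first few KB"


def has_high_bit(data: bytes) -> bool:
    """True if any byte has the eighth bit set."""
    return any(b & 0x80 for b in data)


def serve(raw: bytes) -> bytes:
    """Return the bytes to hand to the client (hypothetical helper)."""
    # Stage 1: tentative decision from the first few KB only.
    head = raw[:SNIFF_BYTES]
    tentative_text = not has_high_bit(head) and b"\n" in head
    if not tentative_text:
        return raw  # treated as binary, passed through untouched

    # Stage 2: check the whole file while preparing the translation.
    if has_high_bit(raw):
        # The guess was wrong; only the log shows it.
        log.info("tentative 'text' guess was wrong, serving untranslated")
        return raw

    # Normalise existing CR/LF first so it is not doubled, then expand.
    return raw.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")
```

	A file whose first 4 KB look like text but which has 8-bit
	data further on is caught by the second stage and served
	unchanged, so the wrong guess shows up only in the log.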

--dave
-- 
David Collier-Brown,  | Always do right. This will gratify some people
185 Ellerslie Ave.,   | and astonish the rest.        -- Mark Twain
Willowdale, Ontario   | http://java.science.yorku.ca/~davecb
Home: (416) 223-8968  Work: (905) 477-0437 Email: davecb at canada.sun.com


More information about the samba-technical mailing list