[PATCH] fix util/rfc1738.c

Douglas Bagnall douglas.bagnall at catalyst.net.nz
Thu Feb 22 00:46:47 UTC 2018


On 22/02/18 11:22, Kamen Mazdrashki via samba-technical wrote:
> Hi Douglas,
> 
> Not related to your patches, but just to thank you for your
> mails -> I am having tons of fun reading them :)
> 
> Keep up your awesome work!

Thanks Kamen,

This whole rfc1738 thing is the result of me avoiding doing what I am
supposed to be doing, so I'm hoping you'll now notice relative silence
from me.

cheers,
Douglas


> Cheers,
> Kamen
> 
> 
> On 02/20/2018 10:59 PM, Douglas Bagnall via samba-technical wrote:
>> We have a file called rfc1738.c that implements a few functions for
>> handling URL-style percent-encoding. It has a few unfortunate issues
>> and shows little sign of having read the eponymous RFC, which in any
>> case was obsoleted by RFC 3986 in 2004.
>>
>> This set of patches introduces tests and fixes many problems,
>> simplifying code along the way. Unused functions are removed, bizarre
>> encodings are avoided, and error detection is introduced.
>>
>> Percent-encoding involves escaping bytes as a percent symbol followed
>> by two hex digits; for example "%" can be encoded as "%25", and "A" as
>> "%41". RFC 1738 and RFC 3986 agree on this but differ on the canonical
>> form of an escaped string: while "%" is necessarily encoded and "A" is
>> normally not, "!" ("%21") is normally encoded in RFC 3986 but not in
>> RFC 1738. The proposed patch follows the newer convention; our current
>> behaviour is idiosyncratic. More serious is an apparent confusion with
>> printf syntax, where we treat "%%" as a valid way of escaping "%". It
>> is not. For decoding we are using sscanf(), which accepts strings like
>> "% a" and "%-5" as escapes.
>>
>> Errors in an encoded string have been silently ignored by the decoding
>> routine. The proposed solution provides an API that may look odd:
>> `char * rfc1738_unescape(char *s)` returns NULL on error, otherwise a
>> pointer to the end of the string (like strtok, but for other reasons).
>> Consider this kind of thing:
>>
>>   char *s = "innocent message%00secret coda";
>>   char *e = rfc1738_unescape(s);
>>   if (e == NULL) {
>>       //error
>>   }
>>   /* s is now "innocent message\00secret coda" */  printf("%s", s);
>>   /* "innocent message" */  so if this might matter, you can go
>>
>>   if (e == NULL || e - s != strlen(s)) {
>>       //error
>>   }
>>
>> There are currently a number of encoding functions, but only one of
>> them is used (or indeed usable), so the patch removes the others.
>>
>> How did we get in this mess? At the top of rfc1738.c:
>>
>>> * This file imported from the Squid project.  [...]
>>> * - Andrew Bartlett Oct-2009
>> So, what were Squid thinking? Well, they were in a bit of flux. Some
>> commits following our October 2009 fork:
>>
>> 2009-11-04 Improve %nn parser
>> 2009-11-04 regression in %nn change, failed to handle trailing %
>> 2009-11-05 Another regression in %nn cleanup
>> 2009-11-10 Polish rfc1738 library code. Add cppunit tests.
>> 2010-03-20 Better fix for fromhex()
>> 2010-03-20 Cleanup rfc1738.c
>> 2010-03-20 Correct buffer calculation for snprintf
>> 2010-04-09 Bug 2899: Restore lost rfc1738_unescape() data type
>> 2011-03-05 RFC 1738 encoder upgraded action flags
>> 2011-08-28 Bug 3295: broken escaping in rfc1738_do_escape
>>
>> Which is *another* good argument for keeping your git master in a good
>> state: someone could fork any bit of your most terrible code at any
>> moment.
>>
>> As it happens (and this why I was looking here), this is not the worst
>> fork-and-forget percent-encoding library I have come across. Possibly
>> because the format is so obviously simple and foolproof, nobody ever
>> bothers writing tests or reading the spec or even writing their own
>> routines, preferring to copy them from the last person who refused to
>> think about it.
>>
>> As a side-effect or preliminary step, these patches move some routines
>> for fast safe accurate hex-digit parsing out of librpc/ndr/uuid.c into
>> lib/util. I hope to use them in other places too.
>>
>> Douglas
> 
> 




More information about the samba-technical mailing list