[jcifs] Re: SMB URL encoding/decoding

Allen, Michael B (RSCH) Michael_B_Allen at ml.com
Mon Feb 25 09:59:51 EST 2002


> -----Original Message-----
> From:	rjw [SMTP:rob at wygand.com]
> 
> Guys,
> 
> All of this talk about the url decoding, encoding got me to thinking. 
> (This is practice now, not theory).
> 
> We're creating an AWFUL lot of String objects in Java that get tossed 
> away. First, SmbFile takes an SmbURL String as a parameter. I've built 
> up this SmbURL from all the little Strings I have lying around: 
> username, password, server, path, etc. And the first thing that SmbFile 
> does? Breaks that SmbURL right back down into its components.
> 
	Not exactly. There's no doubt that the use of Strings is abominable (that's
	somewhat inherent in Java programming because there's no pointer
	arithmetic), but if you look at SmbFile you'll see there is one constructor that
	takes an SmbFile and derives a new one largely through simple
	assignment rather than creating additional copies of String objects or running
	it through the SmbURL parser again. It's just not public. It's used internally to
	create sub-SmbFiles from a directory listing. This is done for exactly the
	reasons you described. What's more time consuming: creating one SmbFile
	manually, or 100 from a directory listing? This is one of those Ant
	vs. Elephant cases. We squashed that Elephant with listFiles() a while back.

> Then, add in encoding. I do the following in my code:
> 
>    StringBuffer encodedPath = new StringBuffer();
>    StringTokenizer st = new StringTokenizer (smburl, "/");
>    while (st.hasMoreTokens()) {
>      encodedPath.append (URLEncoder.encode (st.nextToken()));
>    }
>    smburl = encodedPath.toString();
> 
	This will go away. It will not be necessary to encode '@', '+', or any other
	character except for '%' because it's the escape sequence identifier. In the
	password field it will only be absolutely necessary to encode '/' and '%'. That's
	it. You will be able to simply convert the smburl to char[] and scan for these few
	characters and then bail out and decode it properly if you run into one.
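
	A minimal sketch of that scan-then-decode idea follows. The class and
	method names are made up for illustration and are not part of jCIFS, and
	it assumes each %XX escape stands for a single character code (no
	multi-byte charset handling). Malformed escapes are copied through as-is.

```java
class PercentScan {
    // Hypothetical helper, not jCIFS API: returns the input untouched when
    // it contains no '%', otherwise decodes %XX escapes in one pass.
    static String decodeIfNeeded(String s) {
        char[] in = s.toCharArray();
        int i = 0;
        while (i < in.length && in[i] != '%') {
            i++;
        }
        if (i == in.length) {
            return s; // fast path: nothing to decode, hand back the same String
        }
        // Slow path: copy the clean prefix, then decode the remainder.
        char[] out = new char[in.length];
        System.arraycopy(in, 0, out, 0, i);
        int o = i;
        while (i < in.length) {
            char c = in[i];
            if (c == '%' && i + 2 < in.length) {
                int hi = Character.digit(in[i + 1], 16);
                int lo = Character.digit(in[i + 2], 16);
                if (hi >= 0 && lo >= 0) {
                    // Assumption: one escape maps to one char code.
                    out[o++] = (char) ((hi << 4) | lo);
                    i += 3;
                    continue;
                }
            }
            out[o++] = c; // ordinary character or malformed escape
            i++;
        }
        return new String(out, 0, o);
    }

    public static void main(String[] args) {
        // prints smb://server/share/a%b.txt
        System.out.println(decodeIfNeeded("smb://server/share/a%25b.txt"));
    }
}
```

	In the common case (no '%' anywhere) the only allocation is the char[]
	itself, and the original String is returned.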

	Incidentally, your code does not consider that '/', '+', and '%' might appear in
	the user info (password) field.
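
	For the password field, the encoding side could be as small as the sketch
	below. The helper name is hypothetical, and it escapes only '/' and '%',
	on the assumption that those really are the only two characters that need
	it once the new handling is in place.

```java
class UserInfoEscape {
    // Hypothetical helper, not jCIFS API: escapes '%' as %25 and '/' as %2F
    // and leaves every other character alone.
    static String escape(String field) {
        char[] in = field.toCharArray();
        StringBuilder out = null; // allocated only if something needs escaping
        for (int i = 0; i < in.length; i++) {
            char c = in[i];
            if (c == '%' || c == '/') {
                if (out == null) {
                    out = new StringBuilder(in.length + 8);
                    out.append(in, 0, i); // copy the clean prefix once
                }
                out.append(c == '%' ? "%25" : "%2F");
            } else if (out != null) {
                out.append(c);
            }
        }
        return out == null ? field : out.toString();
    }

    public static void main(String[] args) {
        // prints p%2Fss%251
        System.out.println(escape("p/ss%1"));
    }
}
```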

> That's a lot of String creates... would it be possible to add a second 
> (third? another. =) constructor to SmbFile that takes the required 
> components:
> 
>    public SmbFile (String server, String user, String pass,
>                    String domain, String path);
> 
	For the reasons described above, I don't see why this is really necessary. I
	would rather remove the cause of the problem than add something to mask it.

> Or whatever all of the required fields are (I'm not thinking entirely 
> straight today.. too much bourbon last night). That would then allow you 
> to directly assign them, Mike, instead of having to do all that parsing, 
> thereby saving lots of cycles for object allocation and gc.
> 
	There's definitely some improvement that can be made here. I don't know if
	you've been following the other posts in this thread but the conclusion was
	that the current SmbURL handling is just wrong. The correct method is
	actually much easier. As a result I think the code will be more streamlined. It
	should probably manipulate char[] rather than using charAt and substring
	which will prevent a few unnecessary String objects from being created for
	each URL parse.

	We might also consider the intern() method of String. The intern() method is
	like a little String cache. If you call intern() on a String, it will be placed in a
	"pool" if it was not already there (all literal strings and string-valued constant
	expressions are interned), and the call returns the *same* String object for
	every String with equal contents. This means that if you have Strings that you
	know will appear over and over again, you can intern() them and test for
	equality with the == operator rather than using equals(). I don't know if we're
	going to be comparing certain fields often enough, or if the strings would be
	long enough, to be worth it, but it's something to consider.
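
	A quick illustration of the == behaviour with intern(). The class name and
	the "WORKGROUP" value are just examples; nothing here is jCIFS code.

```java
class InternDemo {
    // Hypothetical wrapper so callers always get the pooled instance.
    static String canonical(String s) {
        return s.intern();
    }

    public static void main(String[] args) {
        String literal = "WORKGROUP"; // literals are interned automatically
        // Built at run time, so this is a distinct String object.
        String built = new StringBuilder("WORK").append("GROUP").toString();

        System.out.println(built == literal);            // false
        System.out.println(built.equals(literal));       // true
        System.out.println(canonical(built) == literal); // true
    }
}
```

	The == test is only safe when *both* sides are known to be interned;
	otherwise equals() is still required.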

> I'm not saying we should do away with the SmbURL constructor -- it's 
> very handy -- but maybe we should support both.
> 
	There is no SmbURL constructor. SmbURL.parseSmbURL is static at the
	moment. But it might just disappear entirely.

	Mike




