[jcifs] Equivalent for java.io.File.getCanonicalPath()

Sat Jan 31 05:03:05 GMT 2004

What version of jCIFS are you using?

Julian Reschke said:
> Michael B Allen wrote:
>
>> Julian Reschke said:
>>
>>>>>(new File("A")).getCanonicalPath()
>>>>>(new File("a")).getCanonicalPath()
>>>>
>>>>
>>>>It changes the case?!
>>>
>>>Yes.
>>
>>
>> Well I don't think I could justify such a thing. I wonder what their
>> reasoning was for that.
>
> To allow clients to determine whether two File objects refer to the same
> storage object, and to obtain its canonical name.

I'm not sure I believe that, but I don't subscribe to it regardless. The
client should not change the case of a path, and I never knew that
canonicalizing a path meant it should flatten case. If the user wants to
determine whether two paths refer to the same object, they can do a caseless
compare of the two canonicalized paths. JCIFS provides the equals() method
just for this purpose. It does a caseless comparison and compares the server
component by IP rather than name.
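
To illustrate the caseless comparison described above, here is a minimal
sketch in plain Java. It is not jCIFS code: the class and method names are
hypothetical, it assumes both paths have already been canonicalized, and it
does not attempt the by-IP comparison of the server component.

```java
final class CaselessPathCompare {
    // Hypothetical helper: true if two already-canonicalized paths
    // differ at most in upper/lowercase.
    static boolean samePath(String canonicalA, String canonicalB) {
        return canonicalA.equalsIgnoreCase(canonicalB);
    }

    public static void main(String[] args) {
        System.out.println(samePath("/share/Docs/Report.txt", "/share/docs/report.txt"));
        System.out.println(samePath("/share/docs/a.txt", "/share/docs/b.txt"));
    }
}
```

The case of each path is left untouched; only the comparison ignores it.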

>>>No no no, *don't* let the client do it. It is absolutely important that
>>>what is returned here is what *the server* says is the canonical name.
>>>Missing some kind of normalization (trailing dots? Unicode
>>>normalization?) may cause the client to think that two things are
>>>different, leading to potential security issues.
>>
>> I don't understand this. Canonicalizing a path *is* normalizing.
>> Trailing dots may be removed during canonicalization although JCIFS does
>> not perform any Unicode normalization. Please explain further if you
>> think this behavior is incorrect.
>
> The main point here is that it's the server's job to specify what the
> canonical name is. If a client takes over that job, and doesn't do
> *exactly* the same thing as the server, there is a problem. (Because
> then the client will claim that there are two distinct objects when
> there aren't).

I'm not certain what you're explaining here in general but the jCIFS
client should perform canonicalization like any other client or server.
Canonicalization is just factoring out '..'. That's it. Any other
components shouldn't have any effect on how paths are resolved on the
server. There should be no discrepancy between how the client and server
resolve paths. If there is, that is a bug in the client or server and in
either case jCIFS will be coded to emulate Windows 2000. I would not be
surprised to learn MS servers handle some strange cases in unexpected ways
(read -- has bugs) but I can only deal with that when it happens. Anyway,
what's the alternative?
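
As a rough illustration of "factoring out '..'", the following is a minimal
sketch in plain Java. It is not the jCIFS implementation: the class and
method names are hypothetical, it assumes '/'-separated absolute paths, and
it assumes that a '..' at the root is simply dropped.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;

final class DotDotCanonicalizer {
    // Hypothetical helper: removes "." and ".." components from an
    // absolute '/'-separated path. Case and all other characters are kept.
    static String canonicalize(String path) {
        Deque<String> parts = new ArrayDeque<>();
        for (String part : path.split("/")) {
            if (part.isEmpty() || part.equals(".")) {
                continue;                    // skip empty and "." components
            }
            if (part.equals("..")) {
                if (!parts.isEmpty()) {
                    parts.removeLast();      // step back one component
                }                            // assumption: ".." at the root is ignored
            } else {
                parts.addLast(part);
            }
        }
        StringBuilder sb = new StringBuilder("/");
        Iterator<String> it = parts.iterator();
        while (it.hasNext()) {
            sb.append(it.next());
            if (it.hasNext()) {
                sb.append('/');
            }
        }
        // Keep a trailing '/' so a directory path stays a directory path.
        if (path.endsWith("/") && sb.length() > 1) {
            sb.append('/');
        }
        return sb.toString();
    }
}
```

For example, canonicalize("/share/a/../b/file.txt") yields
"/share/b/file.txt", with the case of every remaining component unchanged.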

>>>It doesn't.
>>
>>
>> Can you give me an example of two URLs that refer to the same file but
>> do not equate equals() == true?
>
> I think this already happens when two names differ only in
> upper/lowercase, such as "smb://foo/bar" and "smb://foo/Bar". You may
> also want to try "smb://foo/bar " (on NTFS).

This is not true. Please do not post falsehoods here. If you want to
report a bug then please take the time to check that you can reproduce it.

> Speaking of which,
>
> 	(new SmbFile("smb://foo/bar")).renameTo(new SmbFile("smb://foo/Bar"))
>
> currently seems to cause a failure when the remote system is NTFS
> (exception: file not found).

1) A directory must end with a '/' (although in this case it wouldn't
matter).
2) You cannot rename a share with jCIFS. That would require RPC which we
currently do not support.
3) Is your NTFS filesystem case sensitive? If not, I'm not surprised you
got an error -- the two paths refer to the same object.

>
>>>Speaking of which, I did some more testing and have a few more
>>> questions:
>>>
>>>1) Caching of lastmodified(): after new SmbFile(), new
>>>SmbFileOutputStream(), .close, lastmodified() on the SmbFile still
>>>returns 0.
>>
>>
>> I don't understand this. What does SmbFileOutputStream and close() have
>> to do with lastModified? Do you have a small code example that
>> illustrates the problem?

This could definitely be a possibility. It sounds like you are using
0.8.0b. The setLastModified method as well as a lot of setXxx methods were
just added in that version and I don't recall if I considered the
attribute cache completely. Just from glancing at the code it does indeed
set attrExpiration = 0 so I would appreciate it if you would post a
concrete example that illustrates the problem because it's not obvious to
me how that condition can occur.

Also, beware that I believe lastModified will return 0 for directories on
Win95/98/ME under certain conditions. If you are not using these operating
systems, lastModified should never return 0.

> I'll try. In the meantime, just consider the following:
>
> - create new file, set lastmodified to something in the past
> - write to the file
> - check lastmodified
>
> In my tests lastmodified did not seem to change immediately (setting
> jcifs.smb.client.attrExpirationPeriod to 0 seemed to fix that).

Mmm, I seem to remember something like this happening before but I thought
I fixed it. I'll look into this.
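
For readers following along, here is a minimal sketch of how a time-based
attribute cache can return a stale lastModified value after a write. It is
a hypothetical model in plain Java, not the jCIFS source: the class, field,
and method names are assumptions, and the server is simulated by a
supplier.

```java
import java.util.function.LongSupplier;

final class AttrCacheSketch {
    private final long expirationPeriodMs;        // plays the role of attrExpirationPeriod
    private final LongSupplier serverLastModified; // stands in for a query to the server
    private long cachedLastModified;
    private long expiresAt = 0;                    // 0 means "nothing cached"

    AttrCacheSketch(long expirationPeriodMs, LongSupplier serverLastModified) {
        this.expirationPeriodMs = expirationPeriodMs;
        this.serverLastModified = serverLastModified;
    }

    // Returns the cached value until it expires, then asks the server again.
    long lastModified(long nowMs) {
        if (nowMs >= expiresAt) {
            cachedLastModified = serverLastModified.getAsLong();
            expiresAt = nowMs + expirationPeriodMs;
        }
        return cachedLastModified;
    }

    // A write path has to call something like this, otherwise the next
    // lastModified() call inside the expiration period sees the old value.
    void invalidate() {
        expiresAt = 0;
    }
}
```

In this model an expiration period of 0 makes every call go back to the
server, which would match the observation that setting
jcifs.smb.client.attrExpirationPeriod to 0 hides the problem.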

>>>3) URL handling: according to RFC2396, URLs never ever contain blanks or
>>>non-ASCII characters. However, jCifs accepts those, returns them in this
>>>format and the documentation even uses names with blanks in examples.
>>>This probably should be cleaned up. I assume that the support can't be
>>>removed due to backward compatibility reasons?
>>
>>
>> No. JCIFS must support all characters supported by SMB paths or it just
>> won't work. There are few problems such as '#' in a path will be
>
> Of course, however the question is *how* to do that. URLs by definition
> contain only ASCII characters, so the SMB URL draft should specify how
> non-ASCII characters (and reserved ASCII characters) should be encoded
> in SMB URLs (even if the code continues to accept non-ASCII characters).

The SMB URL draft will follow the W3C IRI initiative that specifies how to
escape non-ascii characters in a URL:

  http://www.w3.org/TR/charmod/#sec-URIs

JCIFS currently does not support this because we use the java.net.URL
class for almost all URL handling. We have to in order to support the Java
URL protocol handler. In a future version of jCIFS the java.net.URI class
will be used, which automatically decodes URL escape sequences. However we
will always support using Unicode URLs without escape sequences even if it
means violating RFC2396, the IRI spec, and the SMB URL draft if that odd
scenario should arise.

>> interpreted by the java.net.URL parser as a reference. Otherwise, we try
>
> That seems correct. If you need a "#" in a URL path segment, you need to
> escape it.

But we don't decode escapes, so '#' (and '%' I believe) is the cause of
some grief.
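
The difference between an escaped and an unescaped '#' can be seen with the
stock java.net.URI class. This is a minimal sketch of JDK behavior only,
not of anything jCIFS does today; the class and method names are
hypothetical and "foo" is just the placeholder server name used earlier in
this thread.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class SmbUriEscapes {
    // Path component with %XX escape sequences decoded.
    static String decodedPath(String uri) {
        return URI.create(uri).getPath();
    }

    // Whatever follows an unescaped '#', or null if there is none.
    static String fragment(String uri) {
        return URI.create(uri).getFragment();
    }

    // Builds an escaped smb URI from a raw path. The multi-argument
    // constructor quotes characters that are illegal in a path.
    static String escape(String host, String rawPath) {
        try {
            return new URI("smb", host, rawPath, null).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```

With these helpers, decodedPath("smb://foo/a%23b") gives "/a#b", while
"smb://foo/a#b" is split into the path "/a" and the fragment "b".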

>> to comply with RFC2396 wherever possible but there are no plans to be
>> completely compatible with it.
>
> Well, if the SMB draft is supposed to be accepted by the IETF, it simply
> has to be compliant to the base spec.

Well I don't know what Chris is going to do. It's a tough position because
2396 just doesn't mate well with CIFS. CIFS is a WAN network filesystem
which means you are going to need to support Unicode and not just the
locale-dependent encoding. JCIFS is not going to require hex escapes to
encode non-ascii characters.

Mike