[cifs-protocol] [REG:119070521001876] SMB3 LZ77 decompression issues
Edgar Olougouna
edgaro at microsoft.com
Fri Jul 5 19:02:01 UTC 2019
Aurélien,
I will take a look at this and follow up. If you apply the change Metze suggested to the pseudo-code, does it allow you to decompress the payload?
Thanks,
Edgar
-----Original Message-----
From: Bryan Burgin <bburgin at microsoft.com>
Sent: Friday, July 5, 2019 1:23 PM
To: Aurélien Aptel <aaptel at suse.com>; Interoperability Documentation Help <dochelp at microsoft.com>; cifs-protocol at lists.samba.org
Cc: support <support at mail.support.microsoft.com>
Subject: [REG:119070521001876] SMB3 LZ77 decompression issues
Hi Aurélien,
Thank you for your question. We created SR 119070521001876 to track your issue. An engineer will contact you soon.
Bryan
-----Original Message-----
From: Aurélien Aptel <aaptel at suse.com>
Sent: Friday, July 5, 2019 8:03 AM
To: Interoperability Documentation Help <dochelp at microsoft.com>; cifs-protocol at lists.samba.org
Subject: SMB3 LZ77 decompression issues
Hello,
I'm posting again with dochelp in CC.
I've been able to trigger an LZ77-compressed SMB3 Read response against the latest Windows Server 2019, but I am unable to decompress it.
Request
=======
SMB2 (Server Message Block Protocol version 2)
[....]
Read Request (0x08)
StructureSize: 0x0031
0000 0000 0011 000. = Fixed Part Length: 24
.... .... .... ...1 = Dynamic Part: True
Padding: 0x00
Flags: 0x02, Compressed
.... ...0 = Unbuffered: Client is NOT asking for UNBUFFERED read
.... ..1. = Compressed: Client is asking for COMPRESSED data
Read Length: 131072
File Offset: 0
GUID handle File: a
File Id: 00000012-0004-0000-0100-000004000000
[Frame handle opened: 52]
Min Count: 0
Channel: None (0x00000000)
Remaining Bytes: 0
Blob Offset: 0x00000000
Blob Length: 0
Channel Info Blob: NO DATA
Response
========
0000 fc 53 4d 42 00 00 02 00 02 00 00 00 50 00 00 00 .SMB.... ....P...
0010 fe 53 4d 42 40 00 02 00 00 00 00 00 08 00 0a 00 .SMB@... ........
0020 01 00 00 00 00 00 00 00 07 00 00 00 00 00 00 00 ........ ........
0030 ff fe 00 00 01 00 00 00 35 00 00 00 00 10 00 00 ........ 5.......
0040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ........ ........
0050 11 00 50 00 00 00 02 00 00 00 00 00 00 00 00 00 ..P..... ........
0060 ff ff ff 7f ff 07 00 0f ff 00 00 fc ff 01 00 ........ .......
NetBIOS Session Service
Message Type: Session message (0x00)
Length: 111
SMB2 (Server Message Block Protocol version 2)
SMB2 Compression Transform Header
ProtocolId: fc534d42
OriginalSize: 131072
CompressionAlgorithm: LZ77 (0x0002)
Reserved: 0000
Offset: 0x00000050
Let's look again and annotate...
0000 fc 53 4d 42 00 00 02 00 02 00 00 00 50 00 00 00 .SMB.... ....P...
     ^^^^^^^^^^^                         ^^^^^^^^^^^
     compression transform header        compressed data offset = 0x50
     SMB2 header follows                 READ
     vvvvvvvvvvv                         vvvvv
0010 fe 53 4d 42 40 00 02 00 00 00 00 00 08 00 0a 00 .SMB@... ........
0020 01 00 00 00 00 00 00 00 07 00 00 00 00 00 00 00 ........ ........
0030 ff fe 00 00 01 00 00 00 35 00 00 00 00 10 00 00 ........ 5.......
0040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ........ ........
0050 11 00 50 00 00 00 02 00 00 00 00 00 00 00 00 00 ..P..... ........
           ^^
           read data offset from SMB2 header is 0x50 again
0060 ff ff ff 7f ff 07 00 0f ff 00 00 fc ff 01 00 ........ .......
     ^^
     compressed data starts here (0x10 + 0x50 = 0x60)
So the LZ77 compressed data is
ff ff ff 7f ff 07 00 0f ff 00 00 fc ff 01 00
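The offset arithmetic above can be checked with a short Python sketch. It assumes the 16-byte transform header layout shown in the dissection (ProtocolId, OriginalSize, CompressionAlgorithm, Reserved, Offset, all little-endian) and that Offset counts from the end of that header; the variable names are illustrative only.

```python
import struct

# The 111-byte response from the hexdump above, one row per line.
RESPONSE = bytes.fromhex(
    "fc 53 4d 42 00 00 02 00 02 00 00 00 50 00 00 00 "
    "fe 53 4d 42 40 00 02 00 00 00 00 00 08 00 0a 00 "
    "01 00 00 00 00 00 00 00 07 00 00 00 00 00 00 00 "
    "ff fe 00 00 01 00 00 00 35 00 00 00 00 10 00 00 "
    "00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "
    "11 00 50 00 00 00 02 00 00 00 00 00 00 00 00 00 "
    "ff ff ff 7f ff 07 00 0f ff 00 00 fc ff 01 00"
)

HEADER_SIZE = 16  # size of the compression transform header

# ProtocolId (4 bytes), OriginalSize (u32), CompressionAlgorithm (u16),
# Reserved (u16), Offset (u32)
protocol_id, original_size, algorithm, reserved, offset = struct.unpack_from(
    "<4sIHHI", RESPONSE, 0
)

# Bytes between the header and the compressed segment are sent uncompressed
# (here: the SMB2 header and the fixed part of the Read response).
uncompressed_part = RESPONSE[HEADER_SIZE:HEADER_SIZE + offset]
compressed_part = RESPONSE[HEADER_SIZE + offset:]

print(protocol_id, original_size, hex(algorithm), hex(offset))
print(compressed_part.hex(" "))
```

Under those assumptions the compressed segment starts at 0x60 and is 15 bytes long.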
I've tried to decode it using [MS-XCA] 2.4.4 "Plain LZ77 Decompression" [1], which has pseudo-code that is easily runnable in Python. I can decode the examples on that page fine:
>>> decode(bytes.fromhex(" ff ff ff 1f 61 62 63 17 00 0f ff 26 01"))
bytearray(b'abcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabc'+
b'abcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabc'+
b'abcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabc'+
b'abcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabc'+
b'abcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabc')
But if I try to decode my compressed payload it is invalid:
>>> decode(bytes.fromhex(" ff ff ff 7f ff 07 00 0f ff 00 00 fc ff 01 00"))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "lz.py", line 54, in decode
    raise Exception("error")
This corresponds to this line in the pseudo-code:
If MatchLength < 15 + 7
    Return error.
And it fails at the very beginning, after outputting only 1 byte (ff). The uncompressed payload should be all 0xFF.
Stefan Metzmacher found that there is a bug in the pseudo-code when dealing with long matches:
"Stefan Metzmacher" <metze at samba.org> writes:
> It seems the compression algorithm has a bug regarding matches longer
> than UINT16_MAX + 3.
>
> In your example we have an original payload of 131072 bytes of 0xff.
>
> 1. The first byte is encoded directly.
>
> 2. We find a match with offset 1 and length 131071
>
> 3. We do offset -= 1 and length -= 3 (we have offset=0, length =
> 131068)
>
> 4. Length is >= 7, we do length -= 7 and encode it (=> length =
> 131061)
>
> 5. length is >= 15, we do length -=15 and encode it (=> length =
> 131046)
>
> 6. length is >= 255, we do length += (15 + 7)
> (=> length = 131068 (0x1FFFC) again)
> Encoding this into just 2 bytes doesn't work.
>
> Ah! It seems the 0x0000 length means the length is encoded in the
> following 3 bytes! fc ff 01 is just 131068
It is actually the following 4 bytes.
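The arithmetic in the quoted steps can be reproduced with a small Python sketch. The function name `encode_match_length` is hypothetical, and the 4-byte extension after a 0x0000 length is only what the observed payload suggests, not something the specification states.

```python
import struct


def encode_match_length(length):
    """Return (3-bit length field, length nibble, extra length bytes)
    for a match of `length` bytes, following the quoted steps."""
    length -= 3                      # step 3: minimum match length is 3
    if length < 7:
        return length, None, b""
    length -= 7                      # step 4: 3-bit field saturates at 7
    if length < 15:
        return 7, length, b""
    length -= 15                     # step 5: nibble saturates at 15
    if length < 255:
        return 7, 15, bytes([length])
    length += 15 + 7                 # step 6: back to (match length - 3)
    if length <= 0xFFFF:
        return 7, 15, b"\xff" + struct.pack("<H", length)
    # ASSUMPTION inferred from the payload: a 16-bit value of 0 means the
    # real value follows as a 32-bit little-endian integer.
    return 7, 15, b"\xff\x00\x00" + struct.pack("<I", length)


# Match of 131071 bytes from the Read response discussed here
print(encode_match_length(131071)[2].hex(" "))
# Match of 297 bytes from the [MS-XCA] "abc" example
print(encode_match_length(297)[2].hex(" "))
```

For the 131071-byte match this yields the trailing bytes ff 00 00 fc ff 01 00 seen in the payload, and for the 297-byte match of the [MS-XCA] example it yields ff 26 01.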
So this change was needed in the pseudo-code from MS-XCA:
--- lz77decompress-example1a.py 2019-07-05 15:08:16.145761364 +0200
+++ lz77decompress-example1b.py 2019-07-05 15:40:20.824646872 +0200
@@ -81,6 +81,10 @@ def decode(ibuf):
 # read 2 bytes from InputPosition
 MatchLength = struct.unpack_from('<H', ibuf, InputPosition)[0]
 InputPosition += 2
+if MatchLength == 0:
+    # read 4 bytes from InputPosition
+    MatchLength = struct.unpack_from('<I', ibuf, InputPosition)[0]
+    InputPosition += 4
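For reference, here is a minimal, self-contained sketch of a decoder with that change applied. It is a transcription of the [MS-XCA] 2.4.4 pseudo-code as understood in this thread, not the original lz.py; the `extended_length` switch and the use of `ValueError` are illustrative assumptions, and the 4-byte length is the unconfirmed behaviour inferred above.

```python
import struct


def decode(ibuf, extended_length=True):
    """Plain LZ77 decompression sketch after the [MS-XCA] 2.4.4 pseudo-code.

    extended_length (hypothetical switch) enables the 4-byte match length
    that follows a 16-bit length of 0.
    """
    out = bytearray()
    pos = 0               # InputPosition
    flags = 0             # BufferedFlags
    flag_count = 0        # BufferedFlagCount
    last_half_byte = 0    # LastLengthHalfByte, 0 means "no nibble pending"

    while True:
        if flag_count == 0:
            flags = struct.unpack_from('<I', ibuf, pos)[0]
            pos += 4
            flag_count = 32
        flag_count -= 1

        if (flags & (1 << flag_count)) == 0:
            out.append(ibuf[pos])            # literal byte
            pos += 1
            continue

        if pos == len(ibuf):
            return out                       # end of input: done

        match_bytes = struct.unpack_from('<H', ibuf, pos)[0]
        pos += 2
        length = match_bytes % 8
        offset = match_bytes // 8 + 1

        if length == 7:
            if last_half_byte == 0:
                length = ibuf[pos] % 16      # low nibble of a new byte
                last_half_byte = pos
                pos += 1
            else:
                length = ibuf[last_half_byte] // 16   # pending high nibble
                last_half_byte = 0
            if length == 15:
                length = ibuf[pos]
                pos += 1
                if length == 255:
                    length = struct.unpack_from('<H', ibuf, pos)[0]
                    pos += 2
                    if length == 0 and extended_length:
                        # ASSUMPTION: 0 means a 32-bit length follows
                        length = struct.unpack_from('<I', ibuf, pos)[0]
                        pos += 4
                    if length < 15 + 7:
                        raise ValueError("match length too small")
                    length -= 15 + 7
                length += 15
            length += 7
        length += 3

        # Byte-by-byte copy so that overlapping matches (offset < length) work.
        for _ in range(length):
            out.append(out[-offset])
```

With `extended_length=True` this sketch decodes both the "abc" example from [MS-XCA] and the 15-byte payload above (to 131072 bytes of 0xff); with `extended_length=False` it fails on the payload with the same "MatchLength < 15 + 7" check.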
Can Microsoft confirm the pseudo-code is now complete?
Cheers,
--
Aurélien Aptel / SUSE Labs Samba Team
GPG: 1839 CB5F 9F5B FB9B AA97 8C99 03C8 A49B 521B D5D3
SUSE Linux GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah HRB 21284 (AG Nürnberg)