[cifs-protocol] Cannot uncompress SMB3 LZ77 payload

Stefan Metzmacher metze at samba.org
Fri Jul 5 13:44:42 UTC 2019


On 05.07.19 at 11:44, Stefan Metzmacher via cifs-protocol wrote:
> On 04.07.19 at 21:52, Matthieu Suiche via cifs-protocol wrote:
>> Isn't it using the LZXPRESS algorithm instead?
>>
>> On Thu, Jul 4, 2019, 8:14 PM Aurélien Aptel via cifs-protocol <
>> cifs-protocol at lists.samba.org> wrote:
>>
>>>
>>> Hello,
>>>
>>> I've been able to trigger an LZ77-compressed Read response against the
>>> latest Windows Server 2019, but I am unable to decompress it.
>>>
>>> Request
>>> =======
>>>
>>> SMB2 (Server Message Block Protocol version 2)
>>>     [....]
>>>     Read Request (0x08)
>>>         StructureSize: 0x0031
>>>             0000 0000 0011 000. = Fixed Part Length: 24
>>>             .... .... .... ...1 = Dynamic Part: True
>>>         Padding: 0x00
>>>         Flags: 0x02, Compressed
>>>             .... ...0 = Unbuffered: Client is NOT asking for UNBUFFERED read
>>>             .... ..1. = Compressed: Client is asking for COMPRESSED data
>>>         Read Length: 131072
>>>         File Offset: 0
>>>         GUID handle File: a
>>>             File Id: 00000012-0004-0000-0100-000004000000
>>>             [Frame handle opened: 52]
>>>         Min Count: 0
>>>         Channel: None (0x00000000)
>>>         Remaining Bytes: 0
>>>         Blob Offset: 0x00000000
>>>         Blob Length: 0
>>>         Channel Info Blob: NO DATA
>>>
>>>
>>> Response
>>> ========
>>>
>>> 0000  fc 53 4d 42 00 00 02 00  02 00 00 00 50 00 00 00   .SMB.... ....P...
>>> 0010  fe 53 4d 42 40 00 02 00  00 00 00 00 08 00 0a 00   .SMB@... ........
>>> 0020  01 00 00 00 00 00 00 00  07 00 00 00 00 00 00 00   ........ ........
>>> 0030  ff fe 00 00 01 00 00 00  35 00 00 00 00 10 00 00   ........ 5.......
>>> 0040  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ........ ........
>>> 0050  11 00 50 00 00 00 02 00  00 00 00 00 00 00 00 00   ..P..... ........
>>> 0060  ff ff ff 7f ff 07 00 0f  ff 00 00 fc ff 01 00      ........ .......
>>>
>>> NetBIOS Session Service
>>>     Message Type: Session message (0x00)
>>>     Length: 111
>>> SMB2 (Server Message Block Protocol version 2)
>>>     SMB2 Compression Transform Header
>>>     ProtocolId: fc534d42
>>>     OriginalSize: 131072
>>>     CompressionAlgorithm: LZ77 (0x0002)
>>>     Reserved: 0000
>>>     Offset: 0x00000050
>>>
>>>
>>> Let's look again and annotate...
>>>
>>>
>>> 0000  fc 53 4d 42 00 00 02 00  02 00 00 00 50 00 00 00   .SMB.... ....P...
>>>       ^^^^^^^^^^^                          ^^^^^^^^^^^
>>>  compression transform header            compressed data offset = 0x50
>>>
>>>
>>>    SMB2 header follows                     READ
>>>       vvvvvvvvvvv                          vvvvv
>>> 0010  fe 53 4d 42 40 00 02 00  00 00 00 00 08 00 0a 00   .SMB@... ........
>>> 0020  01 00 00 00 00 00 00 00  07 00 00 00 00 00 00 00   ........ ........
>>> 0030  ff fe 00 00 01 00 00 00  35 00 00 00 00 10 00 00   ........ 5.......
>>> 0040  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00   ........ ........
>>> 0050  11 00 50 00 00 00 02 00  00 00 00 00 00 00 00 00   ..P..... ........
>>>             ^^
>>>   read data offset from SMB2 header is 0x50 again
>>>
>>>
>>> 0060  ff ff ff 7f ff 07 00 0f  ff 00 00 fc ff 01 00      ........ .......
>>>       ^^
>>>     compressed data starts here (0x10 + 0x50 = 0x60)
>>>
>>> So the LZ77 compressed data is
>>>
>>>     ff ff ff 7f ff 07 00 0f ff 00 00 fc ff 01 00
>>>
>>> I've tried to decode it using [MS-XCA] 2.4.4 "Plain LZ77 Decompression"
>>> [1] which has pseudo code that is easily runnable in python. I can
>>> decode the examples on that page fine:
>>>
>>>   >>> decode(bytes.fromhex(" ff ff ff 1f 61 62 63 17 00 0f ff 26 01"))
>>>   bytearray(b'abcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabc'+
>>>             b'abcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabc'+
>>>             b'abcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabc'+
>>>             b'abcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabc'+
>>>             b'abcabcabcabcabcabcabcabcabcabcabcabcabcabcabcabc')
>>>
>>> But if I try to decode my compressed payload it is invalid:
>>>
>>>   >>> decode(bytes.fromhex(" ff ff ff 7f ff 07 00 0f  ff 00 00 fc ff 01 00"))
>>>   Traceback (most recent call last):
>>>     File "<stdin>", line 1, in <module>
>>>     File "lz.py", line 54, in decode
>>>       raise Exception("error")
>>>
>>> This corresponds to this line in the pseudo-code:
>>>
>>>                      If MatchLength < 15 + 7
>>>                         Return error.
>>>
>>> And it fails at the very beginning, after outputting only 1 byte
>>> (ff). The uncompressed payload should be all 0xFF.
>>>
>>> You can see and run the script online here [2].
>>>
>>> So, any ideas on what I'm missing? Is the LZ77 encoding used in the
>>> packet different? Am I misinterpreting some fields?
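
For reference, the offset arithmetic in the annotated dump above can be
reproduced with a short Python sketch. It assumes the 16-byte transform
header layout shown in the dissection (ProtocolId, OriginalSize,
CompressionAlgorithm, Reserved, Offset, all little-endian); the names
PACKET and parse_transform_header are made up for this illustration.

```python
import struct

# The 111-byte response from the hex dump above.
PACKET = bytes.fromhex(
    "fc 53 4d 42 00 00 02 00 02 00 00 00 50 00 00 00"
    "fe 53 4d 42 40 00 02 00 00 00 00 00 08 00 0a 00"
    "01 00 00 00 00 00 00 00 07 00 00 00 00 00 00 00"
    "ff fe 00 00 01 00 00 00 35 00 00 00 00 10 00 00"
    "00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00"
    "11 00 50 00 00 00 02 00 00 00 00 00 00 00 00 00"
    "ff ff ff 7f ff 07 00 0f ff 00 00 fc ff 01 00"
)

HEADER_SIZE = 16  # size of the compression transform header


def parse_transform_header(pkt):
    """Split a packet into (header fields, uncompressed part, compressed part)."""
    proto, orig_size, algo, reserved, offset = struct.unpack_from("<4sIHHI", pkt, 0)
    if proto != b"\xfcSMB":
        raise ValueError("not a compression transform header")
    header = {
        "OriginalSize": orig_size,
        "CompressionAlgorithm": algo,
        "Reserved": reserved,
        "Offset": offset,
    }
    # Offset counts the uncompressed bytes that follow the header;
    # the compressed data starts right after them.
    start = HEADER_SIZE + offset
    return header, pkt[HEADER_SIZE:start], pkt[start:]


header, plain, compressed = parse_transform_header(PACKET)
print(header)
print(compressed.hex(" "))
```

The compressed part starts at 0x10 + 0x50 = 0x60, which gives the same
15 bytes quoted in the message.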
> 
> It seems the compression algorithm has a bug regarding matches longer
> than UINT16_MAX + 3.
> 
> In your example we have an original payload of 131072 bytes, all 0xff.
> 
> 1. The first byte is encoded directly.
> 
> 2. We find a match with offset 1 and length 131071
> 
> 3. We do offset -= 1 and length -= 3 (we have offset=0, length = 131068)
> 
> 4. length is >= 7, we do length -= 7 and encode it (=> length = 131061)
> 
> 5. length is >= 15, we do length -= 15 and encode it (=> length = 131046)
> 
> 6. length is >= 255, we do length += (15 + 7)
>    (=> length = 131068 (0x1FFFC) again)
>    Encoding this into just 2 bytes doesn't work.
> 
>    Ah! It seems the 0x0000 length means the length is encoded in the
>    following 3 bytes! fc ff 01 is just 131068

Actually I missed the last 00 byte, so the length field is 4 bytes, not 3.
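
The encoder steps above can be sketched in Python as follows. This is
inferred only from those steps and the bytes seen on the wire, not from
any Microsoft source; encode_match is a hypothetical helper, and it
assumes the match is the first one in the stream, so the 4-bit length
field lands in the low nibble of a fresh byte.

```python
import struct


def encode_match(offset, length):
    """Encode one match (1 <= offset <= 8192, length >= 3) as raw bytes."""
    offset -= 1
    length -= 3
    if length < 7:
        # Short match: everything fits into the 16-bit token.
        return struct.pack("<H", (offset << 3) | length)
    out = bytearray(struct.pack("<H", (offset << 3) | 7))
    length -= 7
    if length < 15:
        out.append(length)          # 4-bit field, low nibble
        return bytes(out)
    out.append(15)
    length -= 15
    if length < 255:
        out.append(length)          # 8-bit field
        return bytes(out)
    out.append(255)
    length += 15 + 7                # back to (original length - 3)
    if length <= 0xFFFF:
        out += struct.pack("<H", length)
    else:
        # Too large for 16 bits: a zero marker, then a 32-bit length.
        out += struct.pack("<H", 0)
        out += struct.pack("<I", length)
    return bytes(out)


print(encode_match(1, 131071).hex(" "))
print(encode_match(3, 297).hex(" "))
```

For offset 1 and length 131071 this yields 07 00 0f ff 00 00 fc ff 01 00,
the bytes that follow the flags and the literal ff in the captured
payload. For offset 3 and length 297 it yields 17 00 0f ff 26 01, as in
the [MS-XCA] example.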

This patch fixes the decompression:

--- lz77decompress-example1a.py 2019-07-05 15:08:16.145761364 +0200
+++ lz77decompress-example1b.py 2019-07-05 15:40:20.824646872 +0200
@@ -81,6 +81,10 @@ def decode(ibuf):
                         # read 2 bytes from InputPosition
                         MatchLength = struct.unpack_from('<H', ibuf, InputPosition)[0]
                         InputPosition += 2
+                        if MatchLength == 0:
+                            # read 4 bytes from InputPosition
+                            MatchLength = struct.unpack_from('<I', ibuf, InputPosition)[0]
+                            InputPosition += 4
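
For readers without the script at hand, here is a self-contained sketch
of a decoder following the [MS-XCA] 2.4.4 pseudo-code with the 4-byte
length extension from the patch applied. It is not the lz.py from this
thread; variable names are illustrative, and truncated input simply
surfaces as struct.error or IndexError.

```python
import struct


def decode(ibuf):
    """Decompress a plain LZ77 buffer, accepting 32-bit match lengths."""
    out = bytearray()
    pos = 0
    flags = 0
    flag_count = 0
    half_byte_pos = None  # byte whose high nibble is still unused
    while True:
        if flag_count == 0:
            flags = struct.unpack_from("<I", ibuf, pos)[0]
            pos += 4
            flag_count = 32
        flag_count -= 1
        if not flags & (1 << flag_count):
            out.append(ibuf[pos])  # literal byte
            pos += 1
            continue
        if pos == len(ibuf):
            return bytes(out)  # a match flag at end of input means done
        token = struct.unpack_from("<H", ibuf, pos)[0]
        pos += 2
        length = token & 7
        offset = (token >> 3) + 1
        if length == 7:
            if half_byte_pos is None:
                length = ibuf[pos] & 0x0F
                half_byte_pos = pos
                pos += 1
            else:
                length = ibuf[half_byte_pos] >> 4
                half_byte_pos = None
            if length == 15:
                length = ibuf[pos]
                pos += 1
                if length == 255:
                    length = struct.unpack_from("<H", ibuf, pos)[0]
                    pos += 2
                    if length == 0:
                        # 0x0000 marker: the real length is in the next 4 bytes
                        length = struct.unpack_from("<I", ibuf, pos)[0]
                        pos += 4
                    if length < 15 + 7:
                        raise ValueError("invalid match length")
                    length -= 15 + 7
                length += 15
            length += 7
        length += 3
        if offset > len(out):
            raise ValueError("match offset before start of output")
        for _ in range(length):
            out.append(out[-offset])  # byte-wise copy, may overlap itself


data = decode(bytes.fromhex("ff ff ff 7f ff 07 00 0f ff 00 00 fc ff 01 00"))
print(len(data), set(data))
```

With the extension in place the captured payload decodes to 131072 bytes
of 0xff, and the [MS-XCA] example still decodes to 100 repetitions of
"abc".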

Can you extend this thread to dochelp at microsoft.com
(and still cc: cifs-protocol at lists.samba.org)

Thanks!
metze

