Samba + exFAT : how to avoid pre-allocating when copying big files?

Joseph j at gget.it
Mon Dec 7 20:07:55 UTC 2020


Thank you for your response Jeremy.
Good news: if I still write *from Windows*, but from a Python script like
this:

    import os
    with open(r'\\RASPBERRYPI\public\test\hello.txt', 'wb') as f:
        for i in range(100):
            f.write(os.urandom(10*1000*1000))  # 10 MB blocks

then the problem does not happen: each 10MB block is appended one after
another, and there is no "preallocation".
This is what I expected, but I am happy to see it confirmed: writing from
Windows to a remote Linux + Samba + exFAT machine works fine in itself
(as long as the client is *not* Windows Explorer but another file-copying
process, such as the Python script here).
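
For copying an existing file instead of random data, the same idea can be
sketched as below. This is only a minimal sketch: the helper name
copy_in_blocks and its block_size default are hypothetical, and it assumes
(as observed with the script above) that plain sequential writes do not
trigger the preallocation.

```python
def copy_in_blocks(src_path, dst_path, block_size=10 * 1000 * 1000):
    """Copy src_path to dst_path using only sequential writes.

    Hypothetical helper: the destination is never seeked ahead,
    so each block is simply appended after the previous one.
    """
    copied = 0
    with open(src_path, 'rb') as src, open(dst_path, 'wb') as dst:
        while True:
            block = src.read(block_size)  # read up to block_size bytes
            if not block:                 # empty read means end of source
                break
            dst.write(block)              # append the block to the destination
            copied += len(block)
    return copied                         # total number of bytes copied
```

It could be called from Windows with a destination such as
r'\\RASPBERRYPI\public\test\hello.txt', the same share path as above.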

So the only remaining problem is the Windows Explorer file copy, which
probably seeks to EOF first to make sure there will be no ENOSPC error.

Is there a full-verbosity logging mode in the Samba server that would let
me see exactly which open(), write() and seek() calls the Windows Explorer
SMB client sends to the Samba server? Can all I/O calls be logged that
precisely? I'm curious to see exactly what Windows Explorer is sending.

Since millions of people use exFAT in the NAS context (especially in the
RaspPi world, and also people whose media players / TVs don't support
ext4 but only NTFS or exFAT), it would be great to see whether a fix is
possible :)
I've literally seen dozens of forum posts about nearly exactly this issue
(NAS-related / RaspPi / media-player-related forums, etc.).

I would be happy to analyze precisely what the Explorer does to see if a
trick could solve this.

PS: perhaps just *not* doing a flush() after the "seek EOF" would be enough.
Indeed I noticed that, on the Linux computer, with Python:

    with open('/mnt/exfat/test.bin', 'wb') as f:
        f.seek(1000*1000*1000)  # move 1 GB forward, no delay!
        f.write(b'END')         # no delay!
        f.seek(0)               # go to the beginning to actually write the
                                # file content, no delay!
        f.write(...)            # write the actual file content

All of these lines run without any delay. There is a delay if and only
if we flush() after the seek() or the write(b'END'), but that could easily
be avoided.
With this method, the 1 GB is written to disk only once, not twice.
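
To check where the delay actually lands on a given mount, each step can be
timed separately. The following is only a rough sketch: the function name
timed_sparse_write and its parameters are hypothetical, and the timings will
depend on the filesystem and on Python's own buffering.

```python
import time

def timed_sparse_write(path, size, payload, flush_after_marker=False):
    """Write b'END' at offset `size`, then `payload` at offset 0.

    Hypothetical helper: returns a dict mapping each step to the
    number of seconds it took, so the slow step can be identified.
    """
    timings = {}

    def timed(label, func, *args):
        start = time.perf_counter()
        func(*args)
        timings[label] = time.perf_counter() - start

    f = open(path, 'wb')
    try:
        timed('seek_forward', f.seek, size)      # jump ahead by `size` bytes
        timed('write_marker', f.write, b'END')   # marker at the far end
        if flush_after_marker:
            timed('flush', f.flush)              # the step suspected of being slow
        timed('seek_start', f.seek, 0)           # back to the beginning
        timed('write_payload', f.write, payload) # the actual file content
    finally:
        timed('close', f.close)                  # close flushes whatever is left
    return timings
```

Running it once with flush_after_marker=False and once with True on
'/mnt/exfat/test.bin' should show which call absorbs the delay.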

All the best.

