[Samba] Need help with file corruption issue

David Coppit david at coppit.org
Fri May 31 10:51:40 MDT 2013


Hey Volker, thanks for the reply.

> Can you explain for really stupid people what this does and where the problem is?

Here's what the Perl code is doing:

1) In a loop...
1.1) Write a file to the local disk, using a random filename and 5
random floats followed by a newline as the content.
1.2) chown the file so that the samba mount user can read it
1.3) Read that file from a cifs mount of that very same local disk
location, hosted by samba
1.4) Compare the written content versus the read content, exiting if
they are different.
1.5) Delete the temp file
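
For anyone who would rather not read Perl, the steps above can be
sketched roughly as follows. This is a Python approximation, not the
original script: the function names, the 10-character filename, the
iteration count and the optional uid/gid arguments are all
assumptions made for illustration. The file is deliberately left in
place on a mismatch so it can be inspected afterwards.

```python
import os
import random
import string


def random_name(length=10):
    """Random alphanumeric filename (the length is an assumption)."""
    alphabet = string.ascii_letters + string.digits
    return "".join(random.choice(alphabet) for _ in range(length))


def run_once(local_dir, mount_dir, uid=None, gid=None):
    """One pass through steps 1.1 to 1.5; returns (name, written, read_back)."""
    name = random_name()
    # 1.1) five random floats followed by a newline
    written = "".join(str(random.random()) for _ in range(5)) + "\n"
    local_path = os.path.join(local_dir, name)
    with open(local_path, "w") as fh:
        fh.write(written)
    # 1.2) chown so the mount user can read it (skipped if no ids are given)
    if uid is not None and gid is not None:
        os.chown(local_path, uid, gid)
    # 1.3) read the same file back through the cifs mount
    with open(os.path.join(mount_dir, name)) as fh:
        read_back = fh.read()
    # 1.5) delete the temp file, but only if the contents matched
    if read_back == written:
        os.remove(local_path)
    return name, written, read_back


def stress(local_dir, mount_dir, iterations=1000, uid=None, gid=None):
    """Repeat run_once; return details of the first mismatch, or None."""
    for i in range(iterations):
        name, written, read_back = run_once(local_dir, mount_dir, uid, gid)
        # 1.4) compare written versus read content, stopping on a difference
        if written != read_back:
            return i, name, written, read_back
    return None
```

Here local_dir would be the directory on the local disk and mount_dir
the cifs mount of that same directory.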

What I see is that most of the time the Samba-provided version of the
file is identical, but sometimes it's not. When it differs, the content
appears to be that of the previously read (and now deleted!) temp file,
and in some failure cases a truncated version of it. It's definitely
not a Perl issue: after the script croaks, I can "cat" the file on
both the local disk and the Samba share and the results are different:

# cat /grid/samba_stress_test/85fsYXTNhJ
0.9504576548397450.9504576548397450.9504576548397450.9504576548397450.950457654839745

# cat /root/grid/samba_stress_test/85fsYXTNhJ
0.5406506065286610.5406506065286610.5406506065286610.5406506065286610.540650606528661
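
To tell the failure modes described above apart automatically, a
check along these lines could be added to the test. It is only a
sketch: the function name and the category labels are made up here,
and it assumes the test keeps the previous iteration's content
around for comparison.

```python
def classify_read(expected, actual, previous=None):
    """Label what came back from the share relative to what was written.

    expected: content written to the local disk
    actual:   content read back through the share
    previous: content of the previously read file, if known
    """
    if actual == expected:
        return "match"
    if actual == "":
        return "empty"
    if previous:
        if actual == previous:
            return "stale"            # whole content of the previous file
        if previous.startswith(actual):
            return "stale-truncated"  # leading part of the previous file
    if expected.startswith(actual):
        return "truncated"            # right file, but cut short
    return "other"
```

With the two strings from the "cat" output above passed as expected
and actual, and the second one also passed as previous, this would
report "stale".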

> Also, I am a little confused about the scenario

In this test, I did a self-mount to rule out Windows. If you want, I
can try to write this test using Windows instead of the cifs module.
But I'm pretty sure it's not the cifs module, since in our real
system we're obviously not doing a self-mount like this. Instead
we're mounting the CentOS Samba share on a Windows machine. In that
case what we're seeing is a failure to unzip a file because it's
truncated, which is the same symptom. What we perhaps didn't notice at
the time is that the truncated content we do get may also be wrong;
we didn't check this.

> It might help if you could send us an strace of that script producing
> the error together with a network trace.

I did the following:

# tshark -p -w wireshark.out port 445 or port 139
# strace perl samba_stress_test.pl > strace.txt 2>&1

Let me know if that's wrong. I'll attach the gzip'd files. Skip past
all the successes to see the failure at the very end.

On Fri, May 31, 2013 at 2:32 AM, Volker Lendecke
<Volker.Lendecke at sernet.de> wrote:
> On Thu, May 30, 2013 at 11:20:24AM -0400, David Coppit wrote:
>> Hi all,
>>
>> I've run into an issue and am wondering if folks can give some advice
>> on how to resolve it.
>>
>> Basically Samba appears to be getting confused, providing some other
>> file's contents.
>>
>> Initially I saw this on a Windows host that has mounted a share from
>> CentOs, but I've been able to repro it on the CentOs host using a
>> self-mount.
>
> Sorry, I don't know perl enough to actually see the sequence
> of events exactly enough. Can you explain for really stupid
> people what this does and where the problem is? It might
> help if you could send us an strace of that script producing
> the error together with a network trace.
>
> Also, I am a little confused about the scenario: You are
> saying that you saw this on a Windows host that has mounted
> a CentOs share? This means that the cifs kernel module is
> not involved at all here?
>
> With best regards,
>
> Volker Lendecke
>
> --
> SerNet GmbH, Bahnhofsallee 1b, 37081 Göttingen
> phone: +49-551-370000-0, fax: +49-551-370000-9
> AG Göttingen, HRB 2816, GF: Dr. Johannes Loxen
> http://www.sernet.de, mailto:kontakt at sernet.de

