[linux-cifs-client] CIFS Umount apparently causes loss of data
Jeff Layton
jlayton at redhat.com
Tue Jun 24 01:13:57 GMT 2008
On Mon, 23 Jun 2008 12:30:02 +0100
"Richard Walters" <richard.walters at tdbnetworks.com> wrote:
> Dear linux-cifs-client list,
>
> As root user, I am mounting a cifs mount on a RHEL4 server, where the
> filestore is located on a Windows 2000 server on the same network
> segment. There are no firewalls in between the two physical machines.
> CIFS version is 1.48a.RH.
>
> I am using the following to mount the share, create a 150 MB sparsefile,
> associate the sparsefile with a loop device, create an ext3 filesystem
> on the sparsefile, and then mount the subsequently created filesystem.
>
> mount -t cifs --verbose -o forcedirectio,username=<DOMAIN>\<USERNAME>,password=<PASSWORD> //<WINDOWS MACHINE>/<MOUNT LOCATION> /mnt/mountpoint
>
> dd if=/dev/zero of="/mnt/mountpoint/SPARSEFILE" bs=1M count=1 seek="150"
>
> losetup /dev/loop20 /mnt/mountpoint/SPARSEFILE
>
> mkfs -t ext3 /dev/loop20
>
> mount -t ext3 /dev/loop20 /mnt/backup
>
> All of the above occurs without error.
>
> I then write files and directory structures to /mnt/backup and confirm
> that they are there with all the correct permissions etc etc. I can
> manipulate the data in /mnt/backup - change permissions etc etc.
>
> Then I issue:
>
> umount /mnt/backup
> losetup -d /dev/loop20
> umount /mnt/mountpoint
>
> At this point I expect the files and directory structures written to
> /mnt/backup to have been populated to the file system created in the
> sparsefile, which is no longer mounted.
>
> However, if I remount using the following:
>
> mount -t cifs --verbose -o username=<DOMAIN>\<USERNAME>,password=<PASSWORD> //<WINDOWS MACHINE>/<MOUNT LOCATION> /mnt/mountpoint
>
> losetup /dev/loop20 /mnt/mountpoint/SPARSEFILE
>
> mount -t ext3 /dev/loop20 /mnt/backup
>
> I find that the data on /mnt/backup is not as I expect. Generally,
> files in the root (/mnt/backup) are as expected, but any directory
> structures have disappeared. Changed permissions on files in the root
> remain changed. Deleted files in the root partition have reappeared.
> I am even getting inconsistent results on the above - depending on how
> quickly I unmount, and on how many files/directories I copy to
> /mnt/backup.
>
> It appears from investigation that the files remain until the umount of
> the cifs filesystem.
>
> For example, if I just umount the loop device (and complete the
> losetup -d), and then run:
>
> losetup /dev/loop20 /mnt/mountpoint/SPARSEFILE
>
> mount -t ext3 /dev/loop20 /mnt/backup
>
> Everything is just as I would expect - there are no surprises at all.
>
> Initially I suspected cifs data caching, so I employed the directio
> option on the cifs mount command - there has been no discernible
> difference. dmesg shows that this option is NOT rejected, but looking
> at the /proc/mounts the option does not show up, although this could be
> a red herring:
>
> //WINDOWS MACHINE/MOUNT LOCATION /mnt/mountpoint cifs rw,mand,noatime,nodiratime,unc=\\WINDOWS MACHINE\MOUNT LOCATION,username=<username>,domain=<windows domain>,rsize=16384,wsize=57344 0 0
>
> It really does appear that the data loss occurs on the cifs umount - I
> suspected inode caching, but the directio option should have ensured
> that this was OK.
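
On the /proc/mounts point: one way to check whether a given option is listed for a mount is to parse the options field rather than eyeball it. This is only a minimal sketch, assuming a POSIX shell and awk; has_mount_opt is a hypothetical helper name, not an existing tool. An option missing from /proc/mounts does not by itself prove the kernel ignored it, since the filesystem may simply not print every option it accepted.

```shell
# has_mount_opt MOUNTPOINT OPTION [MOUNTS_FILE]
# Hypothetical helper: succeeds if OPTION (bare, or as OPTION=value) is
# listed in the options field of the entry for MOUNTPOINT.
# MOUNTS_FILE defaults to /proc/mounts.
has_mount_opt() {
    awk -v mp="$1" -v opt="$2" '
        $2 == mp {
            # Field 4 is the comma-separated option list.
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++) {
                # Compare only the name part of name=value options.
                split(opts[i], kv, "=")
                if (kv[1] == opt) found = 1
            }
        }
        END { exit found ? 0 : 1 }
    ' "${3:-/proc/mounts}"
}
```

For example, `has_mount_opt /mnt/mountpoint directio && echo listed || echo "not listed"` reports whether the option shows up for the CIFS mount.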
>
> Out of interest, I have tried this on RHEL5 to a different Windows file
> server, but end up with a similar result. I have also tried using ext2
> rather than ext3 filesystems - there is no change to the overall
> behaviour.
>
> I have enabled debug logging on cifs, but dmesg does not provide any
> particular clue as to an underlying cause - every routine returns rc = 0.
>
> Has anyone come across something similar, or can point me in the right
> direction to resolve this?
>
A bit of a strange use-case, but in principle, it should work. This
particular problem isn't ringing any bells for me though...

1.48aRH is pretty old by now. My first suggestion would be to test a
RHEL4.7-ish or 5.2-ish kernel. Those have updated CIFS code. Even
better might be to test the kernels on my people page:
http://people.redhat.com/jlayton
...they have some patches I'm considering for later updates that are
not yet in the current RHEL releases. If the problem isn't resolved in
there, then I'd suggest opening a support case and having the RH
support people escalate this up to a BZ and we can start working on it
there.

It sounds like there's a nice, well-defined reproducer, so we should be
able to figure out the problem. I'll be on vacation until next week
though, so I'll plan to have a closer look at this after then...
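
To make the reproducer easier to hand over, it may help to record exactly what goes missing: take a manifest of /mnt/backup before the umount sequence and compare it with one taken after the remount. The following is only a sketch, assuming GNU find and md5sum are available; snapshot_tree and the manifest paths are hypothetical names, not part of the original report.

```shell
# snapshot_tree DIR
# Hypothetical helper: prints one "dir  ./path" line per directory and one
# "<md5>  ./path" line per regular file under DIR, sorted by path, so two
# snapshots of the same tree can be compared with diff.
snapshot_tree() {
    (
        cd "$1" || exit 1
        find . -type d | sed 's/^/dir  /'
        find . -type f -exec md5sum {} +
    ) | sort -k2
}

# Intended use around the reproducer (paths are placeholders):
#   snapshot_tree /mnt/backup > /tmp/before.manifest
#   ... umount /mnt/backup, losetup -d, umount and remount the cifs share ...
#   snapshot_tree /mnt/backup > /tmp/after.manifest
#   diff /tmp/before.manifest /tmp/after.manifest
```

Keep the manifests somewhere outside both /mnt/backup and the CIFS share so they are not affected by the problem being tested.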
Cheers,
--
Jeff Layton <jlayton at redhat.com>