[Samba] Possible Filesystem Corruption with Samba 3.0.25a (with XFS and LVM)

Andri aoeuid at gmail.com
Tue Jun 26 15:29:01 GMT 2007


Adam Tauno Williams wrote:
>> This is a major issue, but due to the lack of helpful info, I'm forced
>> to ask in various places.
>> Perhaps Deluge Torrent's allocation routines got Samba confused?
> 
> Most likely something in the Kernel got them mucked up.  Or your
> hardware is junk.

I've run memtest for a few days straight on several occasions, and every
run completed without errors. Unless this was one of those
one-in-a-quintillion chances where the sun flipped exactly the wrong bits
in memory, I'm betting on software bugs.

>> There aren't many suspects -- either Samba, XFS (which probably is
>> more common than Samba, so less likely) 
> 
> No, XFS would be my first suspect with LVM a close second, and hardware
> a third.  I'd eat my hat if Samba had anything to do with this other
> than dispatching a write request (which it is the kernels job to deal
> with sanely).  Samba or other applications do not deal with disk
> geometry.

I'm sure someone will eat their hat if this problem's origin is ever found.
My goal is not to suggest the sauce, though, but to stop this from ever
happening again, to me or to anyone else :)

> Why?
> 
> Why?

Okay, I admit that those were only my guesses.

> 
>> Syslog-and-friends don't even care about files,
> 
> What does this mean?  Of course they "care about files"

I just can't see how syslog and such small (code-wise) and stable
services can all of a sudden take input from some file listing and
output it to a raw device.

>> The peculiar thing is, that the info that was written on top of
>> /dev/hdb3 contains the filepaths of /storage, so I'm betting it had
> 
> Ah, IDE hardware.  So that puts it solidly on the suspect list.

Yes, most disks nowadays are IDE. If you meant to say PATA, then root was on
PATA and the LVM disks were on SATA. What part do you suspect exactly? The
controller on the motherboard? The disk itself has no bad blocks, and it was
monitored minutely and tested every few days with the SMART self-tests.

>> something to do with Samba
> 
> EXTREMELY doubtful.
> 
>> , which at the time was actively dealing with /storage.

That "EXTREMELY doubtful" will probably be the answer from the kernel
mailing list, from the XFS developers, from the LVM developers and from
the hardware makers, but unfortunately the "wasn't me" way of handling
possible bugs is useless, I feel.
I'd appreciate it if someone would take a look at the output I pasted
instead. I'm including it again, because I accidentally left out a few lines
from the beginning.

00000200  00 42 42 42 18 01 00 00  00 00 00 00 00 01 00 00  |.BBB............|
00000210  10 00 00 00 e9 00 00 00  69 8a 17 9a 99 19 01 26  |........i......&|
00000220  00 fd 00 00 00 00 00 00  24 08 00 00 00 00 00 00  |........$.......|
00000230  01 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000240  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000260  00 00 00 00 00 00 00 00  d4 3e 00 00 00 00 01 00  |.........>......|
00000270  9f 01 12 00 07 00 00 00  40 00 00 00 99 41 7c 46  |........@....A|F|
00000280  71 7a 09 00 00 fd 00 00  00 00 00 00 24 08 00 00  |qz..........$...|
00000290  00 00 00 00 86 01 00 00  f1 03 00 00 00 00 00 00  |................|
000002a0  2f 73 74 6f 72 61 67 65  00 53 6f 66 74 77 61 72  |/storage.Softwar|
000002b0  65 2f 57 69 6e 64 6f 77  73 2f 47 61 6d 65 73 2f  |e/Windows/Games/|
000002c0  54 69 74 61 6e 20 51 75  65 73 74 20 2d 2d 20 49  |Titan Quest -- I|
000002d0  6d 6d 6f 72 74 61 6c 20  54 68 72 6f 6e 65 2f 54  |mmortal Throne/T|
000002e0  69 74 61 6e 2e 51 75 65  73 74 2e 49 6d 6d 6f 72  |itan.Quest.Immor|
000002f0  74 61 6c 2e 54 68 72 6f  6e 65 2d 55 6e 6c 65 61  |tal.Throne-Unlea|
00000300  73 68 65 64 2f 75 6e 6c  2d 74 71 69 74 2e 70 61  |shed/unl-tqit.pa|
00000310  72 74 31 35 2e 72 61 72  00 42 42 42 18 01 00 00  |rt15.rar.BBB....|
00000320  00 00 00 00 00 01 00 00  10 00 00 00 e9 00 00 00  |................|
00000330  69 8a 82 e8 ad de e1 fe  00 fd 00 00 00 00 00 00  |i...............|
00000340  25 08 00 00 00 00 00 00  01 00 00 00 00 00 00 00  |%...............|
00000350  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|

My idea is that if I could find out what created the above data structure, or
what wanted to write it, I could start tracing the steps backwards and, with
luck, find out why it ended up in and near the superblock.
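
As a starting point for that tracing, here is a minimal Python sketch that
scans a copy of the damaged area for the repeating "BBB" marker and lists the
readable strings around it. It makes assumptions that are not established
facts: the image file name (hdb3.img) is hypothetical, and reading the four
bytes after the marker as a little-endian integer is only a guess. In the dump
above the two markers sit 0x118 bytes apart and those four bytes also read
0x118, which is why the guess seemed worth printing.

```python
import struct
import sys

# Byte pattern that precedes both records in the dump above (a NUL, then "BBB").
MARKER = b"\x00BBB"


def find_records(raw, base=0):
    """Yield (offset, value) for every marker found in raw.

    value is the 4 bytes after the marker read as a little-endian unsigned
    integer, or None if the buffer ends too early. That it is a length field
    is an assumption, not something known about the format.
    """
    pos = raw.find(MARKER)
    while pos != -1:
        field = raw[pos + 4:pos + 8]
        value = struct.unpack("<I", field)[0] if len(field) == 4 else None
        yield base + pos, value
        pos = raw.find(MARKER, pos + 1)


def printable_strings(raw, min_len=4):
    """Return runs of printable ASCII of at least min_len bytes."""
    found, run = [], bytearray()
    for byte in raw:
        if 0x20 <= byte < 0x7F:
            run.append(byte)
            continue
        if len(run) >= min_len:
            found.append(run.decode("ascii"))
        run = bytearray()
    if len(run) >= min_len:
        found.append(run.decode("ascii"))
    return found


if __name__ == "__main__":
    # Work on an image copy, never on the live device. The default file name
    # is a placeholder for wherever the copy was saved.
    path = sys.argv[1] if len(sys.argv) > 1 else "hdb3.img"
    with open(path, "rb") as image:
        data = image.read(4096)
    for offset, value in find_records(data):
        shown = "n/a" if value is None else hex(value)
        print(f"marker at {offset:#010x}, next 4 bytes as LE int: {shown}")
    for text in printable_strings(data):
        print(text)
```

If the offsets and the integer keep agreeing across more records, that would
at least suggest a fixed record layout to search for in the source of the
suspected programs.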

You, Adam, did not even mention the only physical evidence I have to help find
the source of this problem -- do you just lack experience with Samba's internal
structures and source, or did you simply not have any ideas as to what might
have conjured this data up?

