[clug] logical volume becomes unstable when nearly full

Sun Apr 23 21:27:14 GMT 2006

Hello Leo

Although you do not appear to have had many fill/expire cycles on the 
storage, file fragmentation may be the underlying cause of your 
problems.  Especially with multimedia data, where there is a continuous 
high I/O bandwidth requirement, even small levels of fragmentation can 
cause unsustainable head seeking.  The physical disk mechanics cannot 
seek and access data all over the platter for very long before 
under/over runs occur.  The more small purges you do (holes in the 
physical data, such as those left by editing), the quicker the problem 
shows up.

With normal data there are asynchronous bursts that can be absorbed in 
cache but with media information the data transfer rate has to be 
maintained for a long period.  While the physically written 
sectors/clusters are contiguous there is no problem.  Once they are not 
contiguous then all hell breaks loose.
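
To put rough numbers on this, here is a small sketch of the effect.  It 
assumes every fragment costs one seek and then streams at the disk's 
sequential rate.  The 12 ms seek time and 60 MB/s transfer rate are 
illustrative guesses for a desktop SATA drive, not measurements from 
your system, and the function names are made up for the example.

```python
# Rough model: every fragment costs one seek, then streams at the
# disk's sequential rate.  All figures are illustrative assumptions.
SEEK_S = 0.012                  # assumed seek + rotational latency, seconds
SEQ_BYTES_S = 60e6              # assumed sequential transfer rate, bytes/s
REQUIRED_BYTES_S = 270e6 / 8    # a 270 Mb/s stream is 33.75 MB/s


def effective_rate(fragment_bytes):
    """Sustained bytes/s when a file is split into fragments of this size."""
    time_per_fragment = SEEK_S + fragment_bytes / SEQ_BYTES_S
    return fragment_bytes / time_per_fragment


def report(sizes=(4 * 1024, 64 * 1024, 1024 ** 2, 64 * 1024 ** 2)):
    """Print the sustained rate for a few fragment sizes."""
    for size in sizes:
        rate = effective_rate(size)
        verdict = "ok" if rate >= REQUIRED_BYTES_S else "under-run"
        print(f"{size:>10} byte fragments: {rate / 1e6:6.2f} MB/s  {verdict}")


if __name__ == "__main__":
    report()
```

Under these assumptions, 4 KiB fragments sustain well under 1 MB/s, 
while fragments of a megabyte or more get close to what the stream needs.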

The solution is to define really big cluster sizes.  Even better, define 
the cluster size to be just slightly bigger than the typical media file 
size.  This way files cannot be fragmented.
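
As a rough guide to how big "really big" needs to be, the sketch below 
estimates the smallest contiguous run that still sustains a given stream 
rate.  It assumes one seek per run followed by sequential transfer.  The 
helper name and the example figures (60 MB/s disk, 12 ms per seek) are 
assumptions for illustration only.

```python
def min_contiguous_bytes(required_bytes_s, seq_bytes_s, seek_s):
    """Smallest contiguous run that still sustains the required rate.

    Solves run / (seek_s + run / seq_bytes_s) >= required_bytes_s for run.
    """
    if required_bytes_s >= seq_bytes_s:
        raise ValueError("the disk cannot sustain this rate even without seeks")
    return required_bytes_s * seek_s / (1 - required_bytes_s / seq_bytes_s)


if __name__ == "__main__":
    # Illustrative assumptions: 270 Mb/s stream, 60 MB/s disk, 12 ms per seek.
    run = min_contiguous_bytes(270e6 / 8, 60e6, 0.012)
    print(f"need contiguous runs of at least {run / 1024:.0f} KiB")
```

The closer the required rate gets to the disk's sequential rate, the 
faster the minimum run size grows, which is why a cluster sized to hold a 
whole file is the safest choice.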

What I am saying here is totally independent of which file system or 
volume manager you use.  I have seen some fairly spectacular system 
failures due to this problem.  Normal IT people do not understand real 
time media requirements (e.g. Broadcast TV [270 Mb/s]).  I have even seen 
sales droids from some of the really big storage providers make these 
types of mistakes.

Another way to give yourself some more seek bandwidth is to move to 
multiple RAID 1 sets or RAID 10 (RAID 1+0), which is a combination of 
mirroring and striping.

Hope this might give you some clues.

Neil Pickford
Broadcast and Digital Media Projects
Parliament House

Leo Kliger wrote:
> Hi All,
> 
> Has anyone experienced disk array instability using a combination of
> software RAID, LVM and ext3 file systems when nearly full? The full
> story is as follows:-
> 
> I've been running a MythTV PVR on an Ubuntu Breezy Badger system for
> several months with periodic trouble on my storage volume.
> 
> This symptom shows up seemingly as hard errors on the last disk in the
> array when it's nearly full:-
> 
> Apr 22 19:33:04 <hostname> kernel: [5429952.423000] SCSI error : <2 0 0
> 0> return code = 0x8000002
> Apr 22 19:33:04 <hostname> kernel: [5429952.423000] end_request: I/O
> error, dev sdc, sector 781417535
> 
> The original configuration for the storage disk was x2 400GB sata drives
> striped with LVM and formatted with ext3.
> The above errors resulted in the volume becoming read only. Rebooting
> the system would buy a little more time but it would crash again.
> I tried disabling journaling and ran the volume as ext2. This made no
> difference.
> 
> I decided to bite the bullet and buy two more disks and with some data
> juggling rebuilt everything as a software RAID5 and once again followed
> the text book of using LVM across my RAID5 with an ext3 on top. 
> 
> I also used my new found redundancy to swap out the disk that was
> crashing on me.
> 
> Sadly my victory was short-lived and as my disk filled up (as they do
> on PVRs) the last disk in the array would drop out of my RAID5 array.
> Given that there are now x4 disks the last disk is also on a different
> sata port. Even so, given that the errors seemed hard again I had the
> disk replaced but stability just doesn't seem to hold unless I keep
> above 5% free space. 
> 
> The expected behaviour is for oldest recordings to evaporate and for
> space to become free for new recordings - but my disk array seems to
> crash before this can happen, last night my RAID5 dropped altogether.
> 
> Okay - big deal - I lost a whole lot of recordings.... I'll get over
> that. I have rebuilt the array using software RAID only and formatted
> using ext3 again. But this time I have not used LVM in between (I'm
> probably never going to want to grow the volume anyway).
> 
> As asked above - does anybody have any idea where I'm going wrong?
> 
> Some extra machine info is as follows:-
> 
> Linux <hostname> 2.6.12-10-k7 #1 Sat Mar 11 16:59:38 UTC 2006 i686
> GNU/Linux
> 
> 0000:00:00.0 Memory controller: nVidia Corporation: Unknown device 005e
> (rev a3)
> 0000:00:01.0 ISA bridge: nVidia Corporation: Unknown device 0050 (rev
> a3)
> 0000:00:01.1 SMBus: nVidia Corporation: Unknown device 0052 (rev a2)
> 0000:00:02.0 USB Controller: nVidia Corporation: Unknown device 005a
> (rev a2)
> 0000:00:02.1 USB Controller: nVidia Corporation: Unknown device 005b
> (rev a3)
> 0000:00:04.0 Multimedia audio controller: nVidia Corporation: Unknown
> device 0059 (rev a2)
> 0000:00:06.0 IDE interface: nVidia Corporation: Unknown device 0053 (rev
> f2)
> 0000:00:07.0 IDE interface: nVidia Corporation: Unknown device 0054 (rev
> f3)
> 0000:00:08.0 IDE interface: nVidia Corporation: Unknown device 0055 (rev
> f3)
> 0000:00:09.0 PCI bridge: nVidia Corporation: Unknown device 005c (rev
> a2)
> 0000:00:0a.0 Bridge: nVidia Corporation: Unknown device 0057 (rev a3)
> 0000:00:0b.0 PCI bridge: nVidia Corporation: Unknown device 005d (rev
> a3)
> 0000:00:0c.0 PCI bridge: nVidia Corporation: Unknown device 005d (rev
> a3)
> 0000:00:0d.0 PCI bridge: nVidia Corporation: Unknown device 005d (rev
> a3)
> 0000:00:0e.0 PCI bridge: nVidia Corporation: Unknown device 005d (rev
> a3)
> 0000:00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 NorthBridge
> 0000:00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 NorthBridge
> 0000:00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 NorthBridge
> 0000:00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 NorthBridge
> 0000:01:06.0 Multimedia video controller: Brooktree Corporation Bt878
> Video Capture (rev 11)
> 0000:05:00.0 VGA compatible controller: nVidia Corporation: Unknown
> device 0161 (rev a1)
> 
> Will any extra info help?
> 
> Regards,
> 
> Leo