[clug] EXT4 Reliability

Paul Wayper paulway at mabula.net
Wed Sep 30 05:56:16 MDT 2009


On 29/09/09 21:58, Ben Nizette wrote:
> On Tue, 2009-09-29 at 20:38 +1000, Francis Whittle wrote:
>
>> That said, I'll tend to go for ext2 for backup media, for portability.
>
> If you get a (non-disk) failure during a backup operation at least with
> ext[34] you're not likely to munge existing data, just whatever was
> changing at the time and hopefully not even that.  Using ext2 you're
> likely to lose data and the data you lose could be anywhere.  If
> portability is an issue and data security isn't then, moral issues
> aside, IMO you may as well use FAT.
>
> The situation is somewhat different with flash media where you're likely
> to lose data by the erase block regardless; no (current) fs can save
> you from munging adjacent data.
>
> ISO9660 isn't a terrible plan actually as it can perform error
> correction, not just detection.  Where ext* can only lose as little
> unrelated data as possible, ISO9660 can potentially recreate the missing
> data completely.

Which is why using par2 to create a bunch of files which provide extra 
'parity' information for the entire file set is probably a good way to go:

[paulway@tachyon ~]$ ll vmlinux.*
-rw-rw-r--. 1 paulway paulway 16429655 2009-08-26 23:46 vmlinux.lzma
[paulway@tachyon ~]$ par2create -u -n8 -r50 vmlinux.par vmlinux.lzma
par2cmdline version 0.4, Copyright (C) 2003 Peter Brian Clements.

par2cmdline comes with ABSOLUTELY NO WARRANTY.

This is free software, and you are welcome to redistribute it and/or modify
it under the terms of the GNU General Public License as published by the
Free Software Foundation; either version 2 of the License, or (at your
option) any later version. See COPYING for details.

Block size: 8216
Source file count: 1
Source block count: 2000
Redundancy: 50%
Recovery block count: 1000
Recovery file count: 8

Opening: vmlinux.lzma
Computing Reed Solomon matrix.
Constructing: done.
Wrote 8216000 bytes to disk
Writing recovery packets
Writing verification packets
Done
[paulway@tachyon ~]$ ll vmlinux.*
-rw-rw-r--. 1 paulway paulway 16429655 2009-08-26 23:46 vmlinux.lzma
-rw-rw-r--. 1 paulway paulway    40404 2009-09-30 21:38 vmlinux.par.par2
-rw-rw-r--. 1 paulway paulway  1317728 2009-09-30 21:38 vmlinux.par.vol0000+125.par2
-rw-rw-r--. 1 paulway paulway  1317728 2009-09-30 21:38 vmlinux.par.vol0125+125.par2
-rw-rw-r--. 1 paulway paulway  1317728 2009-09-30 21:38 vmlinux.par.vol0250+125.par2
-rw-rw-r--. 1 paulway paulway  1317728 2009-09-30 21:38 vmlinux.par.vol0375+125.par2
-rw-rw-r--. 1 paulway paulway  1317728 2009-09-30 21:38 vmlinux.par.vol0500+125.par2
-rw-rw-r--. 1 paulway paulway  1317728 2009-09-30 21:38 vmlinux.par.vol0625+125.par2
-rw-rw-r--. 1 paulway paulway  1317728 2009-09-30 21:38 vmlinux.par.vol0750+125.par2
-rw-rw-r--. 1 paulway paulway  1317728 2009-09-30 21:38 vmlinux.par.vol0875+125.par2
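
The figures par2create reports can be checked by hand. The following is only a
sketch of that arithmetic, not par2's actual code: the function names are made
up for illustration, and it assumes the block size is the file size divided by
the block count, rounded up to a multiple of 4 bytes (the PAR2 format requires
block sizes to be multiples of 4).

```shell
#!/bin/sh
# Sketch only: reproduce the block arithmetic reported by par2create.
# block_size and recovery_blocks are hypothetical helper names.

# block_size FILE_BYTES BLOCK_COUNT
# Smallest multiple of 4 that lets BLOCK_COUNT blocks cover the file.
block_size() {
    per_block=$(( ($1 + $2 - 1) / $2 ))   # ceiling division
    echo $(( (per_block + 3) / 4 * 4 ))   # round up to a multiple of 4
}

# recovery_blocks SOURCE_BLOCKS REDUNDANCY_PERCENT
recovery_blocks() {
    echo $(( $1 * $2 / 100 ))
}

bs=$(block_size 16429655 2000)
rb=$(recovery_blocks 2000 50)
echo "block size:      $bs"
echo "recovery blocks: $rb"
echo "recovery data:   $(( bs * rb )) bytes"
```

With the values from the run above this gives a block size of 8216, 1000
recovery blocks and 8216000 bytes of recovery data, matching the par2create
output. The eight volume files on disk are somewhat larger than 125 blocks
each because they also carry packet headers and verification data.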

So I've written 50% redundancy there: 1000 recovery blocks for 2000 source
blocks. In theory I can destroy up to half of the original file and, as long
as the recovery files are intact, get it back using par2repair:

[paulway@tachyon ~]$ cp vmlinux.lzma{,-orig}
[paulway@tachyon ~]$ dd if=/dev/urandom of=vmlinux.lzma bs=1k count=1 conv=nocreat,notrunc
1+0 records in
1+0 records out
1024 bytes (1.0 kB) copied, 0.00136205 s, 752 kB/s
[paulway@tachyon ~]$ diff vmlinux.lzma-orig vmlinux.lzma
Binary files vmlinux.lzma-orig and vmlinux.lzma differ
[paulway@tachyon ~]$ dd if=/dev/urandom of=vmlinux.lzma bs=1k count=1k conv=nocreat,notrunc
1024+0 records in
1024+0 records out
1048576 bytes (1.0 MB) copied, 0.363479 s, 2.9 MB/s
[paulway@tachyon ~]$ par2verify vmlinux.par.par2
par2cmdline version 0.4, Copyright (C) 2003 Peter Brian Clements.

par2cmdline comes with ABSOLUTELY NO WARRANTY.

This is free software, and you are welcome to redistribute it and/or modify
it under the terms of the GNU General Public License as published by the
Free Software Foundation; either version 2 of the License, or (at your
option) any later version. See COPYING for details.

Loading "vmlinux.par.par2".
Loaded 4 new packets
Loading "vmlinux.par.vol0250+125.par2".
Loaded 125 new packets including 125 recovery blocks
Loading "vmlinux.par.vol0000+125.par2".
Loaded 125 new packets including 125 recovery blocks
Loading "vmlinux.par.vol0500+125.par2".
Loaded 125 new packets including 125 recovery blocks
Loading "vmlinux.par.vol0375+125.par2".
Loaded 125 new packets including 125 recovery blocks
Loading "vmlinux.par.vol0125+125.par2".
Loaded 125 new packets including 125 recovery blocks
Loading "vmlinux.par.vol0750+125.par2".
Loaded 125 new packets including 125 recovery blocks
Loading "vmlinux.par.vol0875+125.par2".
Loaded 125 new packets including 125 recovery blocks
Loading "vmlinux.par.vol0625+125.par2".
Loaded 125 new packets including 125 recovery blocks

There are 1 recoverable files and 0 other files.
The block size used was 8216 bytes.
There are a total of 2000 data blocks.
The total size of the data files is 16429655 bytes.

Verifying source files:

Target: "vmlinux.lzma" - damaged. Found 1872 of 2000 data blocks.

Scanning extra files:


Repair is required.
1 file(s) exist but are damaged.
You have 1872 out of 2000 data blocks available.
You have 1000 recovery blocks available.
Repair is possible.
You have an excess of 872 recovery blocks.
128 recovery blocks will be used to repair.
[paulway@tachyon ~]$ par2repair vmlinux.par.par2
par2cmdline version 0.4, Copyright (C) 2003 Peter Brian Clements.

par2cmdline comes with ABSOLUTELY NO WARRANTY.

This is free software, and you are welcome to redistribute it and/or modify
it under the terms of the GNU General Public License as published by the
Free Software Foundation; either version 2 of the License, or (at your
option) any later version. See COPYING for details.

Loading "vmlinux.par.par2".
Loaded 4 new packets
Loading "vmlinux.par.vol0250+125.par2".
Loaded 125 new packets including 125 recovery blocks
Loading "vmlinux.par.vol0000+125.par2".
Loaded 125 new packets including 125 recovery blocks
Loading "vmlinux.par.vol0500+125.par2".
Loaded 125 new packets including 125 recovery blocks
Loading "vmlinux.par.vol0375+125.par2".
Loaded 125 new packets including 125 recovery blocks
Loading "vmlinux.par.vol0125+125.par2".
Loaded 125 new packets including 125 recovery blocks
Loading "vmlinux.par.vol0750+125.par2".
Loaded 125 new packets including 125 recovery blocks
Loading "vmlinux.par.vol0875+125.par2".
Loaded 125 new packets including 125 recovery blocks
Loading "vmlinux.par.vol0625+125.par2".
Loaded 125 new packets including 125 recovery blocks

There are 1 recoverable files and 0 other files.
The block size used was 8216 bytes.
There are a total of 2000 data blocks.
The total size of the data files is 16429655 bytes.

Verifying source files:

Target: "vmlinux.lzma" - damaged. Found 1872 of 2000 data blocks.

Scanning extra files:


Repair is required.
1 file(s) exist but are damaged.
You have 1872 out of 2000 data blocks available.
You have 1000 recovery blocks available.
Repair is possible.
You have an excess of 872 recovery blocks.
128 recovery blocks will be used to repair.

Computing Reed Solomon matrix.
Constructing: done.
Solving: done.

Wrote 16429655 bytes to disk

Verifying repaired files:

Target: "vmlinux.lzma" - found.

Repair complete.
[paulway@tachyon ~]$ diff vmlinux.lzma-orig vmlinux.lzma
[paulway@tachyon ~]$
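
The "Found 1872 of 2000 data blocks" line follows from where the overwritten
megabyte falls relative to the 8216-byte blocks. A minimal sketch of that
calculation, with blocks_touched as a hypothetical helper name and assuming
any block that is even partly overwritten counts as damaged:

```shell
#!/bin/sh
# Sketch only: count how many fixed-size blocks a damaged byte range touches.
# blocks_touched is a hypothetical helper name.

# blocks_touched OFFSET LENGTH BLOCK_SIZE
blocks_touched() {
    first=$(( $1 / $3 ))                  # block holding the first bad byte
    last=$(( ($1 + $2 - 1) / $3 ))        # block holding the last bad byte
    echo $(( last - first + 1 ))
}

# 1 MiB of random data written at offset 0, 8216-byte blocks, 2000 in total.
damaged=$(blocks_touched 0 1048576 8216)
echo "damaged blocks: $damaged"
echo "intact blocks:  $(( 2000 - damaged ))"
```

This gives 128 damaged and 1872 intact blocks, the same numbers par2verify
reports, so 128 of the 1000 recovery blocks are enough for the repair.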

My guess as to why it generates several of these recovery files is so that
you can store them on separate media - burn them to different CDs, for
example - and then use whichever ones survive to recover the files as
necessary.
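
To put rough numbers on spreading the recovery files across media, here is a
sketch using the figures from the run above (8 volumes of 125 recovery blocks
each, 128 damaged blocks). The helper names are invented for illustration, and
it assumes every surviving volume file is itself undamaged.

```shell
#!/bin/sh
# Sketch only: how many recovery volumes a repair needs, and how many
# could be lost. Helper names are hypothetical.

# volumes_needed DAMAGED_BLOCKS BLOCKS_PER_VOLUME
volumes_needed() {
    echo $(( ($1 + $2 - 1) / $2 ))        # ceiling division
}

# volumes_losable TOTAL_RECOVERY_BLOCKS DAMAGED_BLOCKS BLOCKS_PER_VOLUME
# Whole volumes that can go missing while enough recovery blocks remain.
volumes_losable() {
    echo $(( ($1 - $2) / $3 ))
}

echo "volumes needed:  $(volumes_needed 128 125)"
echo "volumes losable: $(volumes_losable 1000 128 125)"
```

For the damage in this example, any 2 of the 8 volume files would be enough,
so up to 6 of them could be lost along with their media.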

This should be proof against entire sector errors (as we saw, it can survive 
an entire linear megabyte being hosed completely) as well as more insidious 
bit errors.
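
For anyone wanting to try the small-error case as well, the following sketch
overwrites a single byte of a scratch file, using the same dd conv=notrunc
approach as above. corrupt_byte is a hypothetical helper name, and the script
works only on a temporary file it creates itself.

```shell
#!/bin/sh
# Sketch only: simulate a small error by overwriting one byte in place.
# corrupt_byte is a hypothetical helper name.

# corrupt_byte FILE OFFSET
# Overwrites the byte at OFFSET with 0xFF without truncating the file.
corrupt_byte() {
    printf '\377' | dd of="$1" bs=1 seek="$2" conv=notrunc 2>/dev/null
}

scratch=$(mktemp -d)
dd if=/dev/zero of="$scratch/data" bs=1024 count=16 2>/dev/null
cp "$scratch/data" "$scratch/data-orig"

corrupt_byte "$scratch/data" 5000

if cmp -s "$scratch/data-orig" "$scratch/data"; then
    echo "files are identical"
else
    echo "files differ"
fi

rm -rf "$scratch"
```

As far as par2 is concerned, a block with one bad byte is as damaged as a
block that has been completely overwritten, so each such error costs one
whole recovery block.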

Hope this helps,

Paul

P.S. for those trying this out at home, note that some of the commands extend 
beyond the 80 characters of my email program...  Don't practice the dd on 
anything you don't want to lose.

