[clug] Keeping Data Alive

Alan Vidler avidler at iinet.net.au
Sun Feb 21 22:46:17 MST 2010

Touches on a pet obsession of mine:

The 'answer' is that you *cannot* archive-and-forget data and
expect it to be usable in the distant future - and 'distant
future' in the electronic era is now measured in years, not
decades or centuries.

The problem is not new - there are still ancient writings no one
has been able to decipher despite being well preserved on clay
tablets etc, and vast stacks readable only because someone back
then produced the Rosetta stone.

Despite talk of old copies of the original Gutenberg Bible (or
worse, the Domesday Book) still existing and still human readable,
I suspect that, correct to the nearest whole number, there are
*zero* clug members who could read them - but most could, if
determined, read modern versions of them.

If you want data to be readable it needs to be *maintained*, both
recording media and content. There is (almost) always a
transition period where people/machines can handle both the old
and the new stuff and if that window of opportunity is not taken
to transfer, things get ever more difficult or impossible...
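One way to make such a transfer less of a leap of faith is to
checksum everything before and after the copy. What follows is
only a minimal sketch, not anything used in the history below: the
function names (sha256_of, build_manifest, compare) are
hypothetical, and it assumes both the old and the new media are
mounted as ordinary directory trees.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its digest."""
    manifest = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = sha256_of(full)
    return manifest


def compare(old_root, new_root):
    """Return (missing, changed): files lost or altered by the copy."""
    old = build_manifest(old_root)
    new = build_manifest(new_root)
    missing = sorted(set(old) - set(new))
    changed = sorted(p for p in old if p in new and old[p] != new[p])
    return missing, changed
```

Calling compare("/mnt/old", "/mnt/new") (both paths are
placeholders) after a migration returns two lists; if both are
empty, every file on the old media arrived intact. This only
proves the bits survived, not that anything can still interpret
them.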

To illustrate (years are approx/guesses)

1966: I wrote a Fortran II routine, stored on punched cards.
1967: Transferred to 7 track 2400 foot 800bpi mag tape
1968: Upgraded to Fortran 66.
1970: Translated from CDC 3000 Series to CDC Cyber character set.
1971: Stored on 9 track 1600bpi tape
1973: Moved to 6250bpi tape
1975: Moved to 4MB hard disk (washing machine size!)
1980: Moved to IBM compatible mainframe
1980: Translated to EBCDIC
1982: Back to tape
1983: Translated to ASCII.
1983: Copied to 5.25" floppy
1983: Minor changes to use on a CP/M-80 system.
1985: Moved to DOS.
1985: Variously stored on hard disk and 3.5" floppies
1995: Moved to linux and also Sun/Solaris boxes via QIC tape
1996: Translated into C (though both versions still used)
1999: [last run to date]
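The character set translations in that list (the 1980 move to
EBCDIC and the 1983 move to ASCII) are the sort of step that is
trivial while both encodings are in common use and awkward
afterwards. As a rough sketch only: Python still ships EBCDIC
codecs, so a conversion can look like the code below. The choice
of code page cp037 is an assumption (the post does not say which
EBCDIC variant the mainframe used), and the function name
ebcdic_to_ascii is hypothetical.

```python
def ebcdic_to_ascii(data: bytes, codec: str = "cp037") -> bytes:
    """Convert EBCDIC bytes to ASCII bytes.

    codec is an assumed EBCDIC code page; cp037 is one common
    variant, but real tapes may need a different one.
    Characters with no ASCII equivalent are replaced with '?'.
    """
    text = data.decode(codec)
    return text.encode("ascii", errors="replace")


if __name__ == "__main__":
    # Round trip a short sample through the assumed code page.
    sample = "      PROGRAM DEMO".encode("cp037")
    print(ebcdic_to_ascii(sample).decode("ascii"))
```

Note this handles only the character codes. Record layout (for
example fixed 80-column card images with no line terminators) is a
separate problem and is not addressed here.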

The point is that the code/algorithm survived for 33 years (and
counting), but only because it was maintained.
I doubt anyone could handle any of the pre-1980 versions now, let
alone the 1966 version.

The electronic age has made things worse, and it is getting more so:
- In theory one could still read data from 1966 punched cards, if
available
- I had fun in the 1980s trying to read tapes written in the
1960s, watching tape go one way and oxide another as it entered
the read heads!
- I have a stack of old 3.5" floppies - I get about a 50% failure
rate when I try to read them.

Alan Vidler

- - - - -

On 22/02/2010 3:10 PM jm sent:
> These are the sort of things that may lead to virtualisation being big,
> not just hypervisor/paravirtualisation, but rather emulation. Faking the
> old hardware so that the old OSes can be used to view old data stuck in
> legacy file formats only readable by the old applications which only run
> on the old OS on the old H/W. The really interesting bit (read
> difficult) is not faking the running platform but faking the I/O. I
> heard a while ago that Tidbinbilla radio telescope replaced its PDP-11
> with an industrial computer which faked the PDP hardware to run the
> required apps (written in Forth?), and went so far as to have an ISA
> board to mate to the telescope to control it. Unfortunately, I don't
> have any links/references to verify how correct this is. Anyway, you can
> imagine that with the phasing out of ISA buses in favour of various forms
> of PCI buses, unless they have changed again, they must have
> been stockpiling "old" PCs just in case something failed.
> Jeff.
> On 22/02/10 2:37 PM, Andrew Janke wrote:
>>> I have a fruit box of 8" floppy disks, various boxes of old obsolete
>>> magnetic tapes etc
>>> All of which are not much good without the media reader but more
>>> importantly the O/S, application software and user knowledge to
>>> read them.
>> I use spinning disks (and open formats) as much as possible.  I keep
>> buying a bigger spinning disk and copy everything old over and where
>> possible converting all old files to text.
>> Mind you I still have a bunch of old Macintosh Word 1? format files
>> that are proving difficult to read with anything. :(  I found these a
>> few months back in a zip archive and am yet to find something to read
>> them.
>> I may have to find someone with a Mac SE/30 or something...
>> ark!
>> -- 
>> Andrew Janke
>> (a.janke at gmail.com || http://a.janke.googlepages.com/)
>> Canberra->Australia    +61 (402) 700 883
